RFC: Proposed changes to GA release process #15586

Open
deepthi opened this issue Mar 27, 2024 · 15 comments
Labels
Type: RFC Request For Comment

Comments

@deepthi
Member

deepthi commented Mar 27, 2024

Summary

We currently have a process where we publish one or more release candidates before doing a GA release. We do a code freeze before cutting the release branch, and again before doing the GA release.
We have had a couple of GA releases where we had to do patch releases almost immediately because of critical bugs that broke one or more pieces of major functionality. The most recent example of this is #15419. At that time, @L3o-pold pointed out that the GA release was not identical to a previous release candidate.
This is an artifact of the current release process. We publish an RC1, and then continue to merge bug fixes onto the release branch whether or not they were reported specifically against RC1. Unless RC1 is significantly broken, we don't typically do an RC2; we go straight to GA. What goes into the GA is just the latest state of the release branch; it is not expected to be identical to a previously published release candidate.

Shortcomings

  • Bugs can be introduced into the release branch after an RC which are not caught until GA
  • The volume of bug fixes on the release branch between RC and GA is quite high
  • There is a scramble in the last few days to get bug fixes in "under the wire"

Proposal

Let's assume we have planned a GA release for date T.

  • T minus 3 weeks: Feature freeze. No new enhancements for this release cycle can be merged after this date, only bug fixes. We will cut a release branch at this date. This is the same as what we do today.
  • T minus 2 weeks: Publish RC1. This gives ~1 week to push bug fixes to the release branch before doing an RC.
  • Until T minus 1 week: We accept bug reports against RC1 and evaluate whether they should be fixed. We'll accumulate bug fixes before publishing another release candidate.
  • T minus 1 week: Publish RC2 if necessary
  • Until T: Only critical bug reports will be evaluated to determine whether the release should be pushed back.
  • T: GA Release which will be exactly the same code as the latest RC.

Exceptions:
If there is a critical bug report after RC2, we MAY need to push the GA release out by 1 week and do an RC3 instead.
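
As a rough illustration of the timeline above, the milestone dates can be derived from T. This is only a sketch: the function name `release_schedule`, the dictionary keys, and the example date are hypothetical, and placing RC3 on the original date T is an assumption (the proposal only says the GA moves out by 1 week and an RC3 is done instead).

```python
from datetime import date, timedelta


def release_schedule(ga_date: date, rc3_needed: bool = False) -> dict[str, date]:
    """Derive milestone dates from the planned GA date T (hypothetical helper)."""
    week = timedelta(weeks=1)
    schedule = {
        "feature_freeze": ga_date - 3 * week,  # cut the release branch
        "rc1": ga_date - 2 * week,
        "rc2_if_necessary": ga_date - 1 * week,
        "ga": ga_date,
    }
    if rc3_needed:
        # Exception path: a critical bug was reported after RC2.
        # Assumption: RC3 is published on the original date T,
        # and the GA moves out by one week.
        schedule["rc3"] = ga_date
        schedule["ga"] = ga_date + week
    return schedule


if __name__ == "__main__":
    # Hypothetical GA date, chosen only for illustration.
    for milestone, day in release_schedule(date(2024, 6, 25)).items():
        print(f"{milestone}: {day.isoformat()}")
```

Keeping every milestone as an offset from a single date T means that pushing the GA back only requires changing one input.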

Notes:

  • Absolutely no enhancements should be added to the release branch. This includes performance improvements; the only exception is fixes for regressions from the previous release that are found using arewefastyet.

References

Current release process
Release schedule, bug fix releases, support lifecycle etc.

EDIT Apr 17, 2024: Incorporate feedback from comments.
EDIT2 Apr 18, 2024: Clarified GA Release relationship to RC.

@deepthi deepthi added the Type: RFC Request For Comment label Mar 27, 2024
@L3o-pold
Collaborator

L3o-pold commented Mar 27, 2024

Thanks for this @deepthi, you perfectly summarise the "issue" and your proposal matches what I was thinking.

Bugs can be introduced into the release branch after an RC which are not caught until GA

it's the main issue IMO

If we fix a bug, we publish another release candidate

exactly 👍

@frouioui
Member

The latest release candidate commit is used to make the GA release.

Just to be clear here, the exact same commit cannot be used as we have to push more commits during the release process of the GA. However, the release PR of GA will be based on an RC commit.

@frouioui
Member

frouioui commented Mar 27, 2024

This proposal sounds good to me; however, I have concerns about the 2-week period before the RC-1 release. If we want to be able to do bug fixes, I think we might as well release an RC-1 early (at the beginning of the 2-week period) and leave enough time for everyone in the community to test the RC-1. Unless people are running their systems on Vitess' main, we won't get many bug reports from the community.

That would also allow us to not block development on main.

@systay
Collaborator

systay commented Mar 28, 2024

Once RC1 is published, we accept bug reports against that and evaluate whether they should be fixed. If we fix a bug, we publish another release candidate.

Would we do this immediately after every bug fix, or should we accumulate bug fixes for some time before doing the next RC?

@frouioui
Member

Would we do this immediately after every bug fix, or should we accumulate bug fixes for some time before doing the next RC?

I agree with @systay. However, doing an RC takes some time and planning; it would be time consuming for the release team to do a new RC after each bug fix, or even every other day. I think we need some sort of cadence or schedule, for example one or two RCs per week (only if needed, i.e. when there are new bug fixes). That way the release team knows what to expect during the period between the first RC and the GA release, and a day before every scheduled RC they can evaluate whether a new RC is really needed.

@deepthi deepthi changed the title RFC: Proposed changes to release process RFC: Proposed changes to GA release process Mar 29, 2024
@systay
Collaborator

systay commented Apr 15, 2024

I would suggest this process:

  • Three weeks out: Branch off the release branch from main and switch it to bug-fix-only mode. Normal development continues on the main branch.
  • Two weeks out: Cut RC1 and publish it.
  • Bug fixes: If bugs serious enough for a new RC are identified, allocate 3-5 days for the team to fix, merge, and cut a new RC.
  • Stability check: Continue cutting new RCs until no critical bugs are found for a full week.
  • Release GA: Once stable, release the GA using the same SHA as the last RC.
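
The stability check in this suggestion could be expressed roughly as follows. This is a minimal sketch, not project tooling: the function `ready_for_ga` and its parameters are hypothetical, and it assumes that any critical bug reported before the last RC was already fixed in that RC.

```python
from datetime import date, timedelta


def ready_for_ga(
    last_rc: date,
    critical_bug_reports: list[date],
    today: date,
    quiet_period: timedelta = timedelta(weeks=1),
) -> bool:
    """Return True if the last RC has gone a full quiet period with no critical bugs."""
    # A critical bug reported on or after the last RC means another RC is needed.
    # Assumption: reports dated before the last RC were fixed in that RC.
    if any(reported >= last_rc for reported in critical_bug_reports):
        return False
    # Otherwise the RC must have been out for at least the quiet period.
    return today - last_rc >= quiet_period
```

For example, with an RC cut on a given Tuesday and no critical reports since, the check first passes on the following Tuesday; a single critical report in between resets the clock once a new RC is cut.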

@harshit-gangal
Member

We should define the minimum time gap between the last RC and the GA release. This might risk postponing the release, but it would keep the GA release more stable.

@frouioui
Member

Continue cutting new RCs until no critical bugs are found for a full week.

@harshit-gangal I think given what @systay said, we would want to wait a full week before proceeding with the GA. I think it should be fine to have flexible release dates, but that will mean more "work" to remember to notify the different parties to cross-post our blog post.

@deepthi
Member Author

deepthi commented Apr 17, 2024

Agree with @frouioui. We coordinate the release blog post with two other parties (CNCF and PlanetScale), so we need to have a planned date for the release. It's also important to have that for the community to make plans around releases.

@systay
Collaborator

systay commented Apr 17, 2024

Agree with @frouioui. We coordinate the release blog post with two other parties (CNCF and PlanetScale), so we need to have a planned date for the release. It's also important to have that for the community to make plans around releases.

I don't quite follow what this means for the suggested process. Are you saying we will do the release even if we find bugs?

I think everyone agrees we want a planned release date, the question is how to handle situations where this is hard to achieve. How do we achieve both no known bad bugs, and hit the release date?

@deepthi
Member Author

deepthi commented Apr 17, 2024

I think everyone agrees we want a planned release date, the question is how to handle situations where this is hard to achieve. How do we achieve both no known bad bugs, and hit the release date?

We can't. The new description addresses this question, and it's consistent with your suggestion.

@L3o-pold
Collaborator

T: GA Release which is essentially the same as either RC1 or RC2.

For me it SHOULD be the same as the latest RC

@deepthi
Member Author

deepthi commented Apr 18, 2024

T: GA Release which is essentially the same as either RC1 or RC2.

For me it SHOULD be the same as the latest RC

Thank you. That is what I was trying to convey, so I've edited that line to make it clearer. The only reason I initially said "essentially" is because we do a release commit changing the displayed version name from something like 20.0.0-rc1 to just 20.0.0.
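
To make the version-name change concrete, here is a small sketch of the mapping from an RC version string to its GA version. The helper `ga_version` is hypothetical and only assumes the `X.Y.Z-rcN` shape shown in the example above; it does not reflect how the actual release tooling performs the change.

```python
import re


def ga_version(rc_version: str) -> str:
    """Strip the release-candidate suffix, e.g. '20.0.0-rc1' -> '20.0.0'."""
    # Assumption: RC versions look like MAJOR.MINOR.PATCH-rcN.
    match = re.fullmatch(r"(\d+\.\d+\.\d+)-rc\d+", rc_version)
    if match is None:
        raise ValueError(f"not a release-candidate version: {rc_version!r}")
    return match.group(1)
```

Under this scheme the only expected difference between the last RC and the GA is the displayed version name.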

@frouioui
Member

frouioui commented Apr 18, 2024

  • T minus 3 weeks: Feature freeze. No new enhancements can be merged after this date, only bug fixes. We will cut a release branch at this date. This is the same as what we do today.

It is unclear to me if the feature freeze applies to both branches (main and the new release branch) or only to the release branch.

@deepthi
Member Author

deepthi commented May 2, 2024

  • T minus 3 weeks: Feature freeze. No new enhancements can be merged after this date, only bug fixes. We will cut a release branch at this date. This is the same as what we do today.

It is unclear to me if the feature freeze applies to both branches (main and the new release branch) or only to the release branch.

Added text to make it clearer.
