On Mon, Feb 27, 2023 at 10:59:27PM +0000, Matthew Wilcox wrote:
On Mon, Feb 27, 2023 at 05:35:30PM -0500, Sasha Levin wrote:
On Mon, Feb 27, 2023 at 09:38:46PM +0000, Eric Biggers wrote:
Just because you can't be 100% certain whether a commit is a fix doesn't mean you should be rushing to backport random commits that have no indications they are fixing anything.
The difference in opinion here is that I don't think it's rushing: the stable kernel rules say a commit must be in a released kernel, while the AUTOSEL timelines mean a commit will have been in two released kernels.
Patches in -rc1 have been in _no_ released kernels. I'd feel a lot better about AUTOSEL if it didn't pick up changes until, say, -rc4, unless they were cc'd to stable.
This happened before my time, but -rc kernels are considered releases.
The counterpoint to your argument/ask is that if you run the numbers on regressions between -rc releases, it's the later ones that tend to introduce (way) more issues.
I actually wrote about this a few years back on ksummit-discuss (here: https://lwn.net/Articles/753329/) because the numbers I saw indicated that later -rc releases are 3x as likely to introduce a regression.
Linus pushed back on it, saying that this is "by design" because those commits are way more complex than the ones that land during the early -rc cycles.
So yes, I don't mind modifying the release workflow to decrease the regressions we introduce, but I think that there's a difference between what folks see as "helpful" and the outcome it would have.
Nothing has changed, but that doesn't mean that your process is actually working. 7 days might be appropriate for something that looks like a security fix, but not for a random commit with no indications it is fixing anything.
How do we know whether this is working or not, though? How do you quantify the number of useful commits?
Sasha, 7 days is too short. People have to be allowed to take holiday.
That's true, and I don't have strong objections to making it longer. How often did it happen, though? We don't end up getting too many replies past the 7-day window.
I'll bump it to 14 days for a few months and see if it changes anything.
I'd love to improve the process, but for that we need to figure out criteria for what we consider good or bad, collect data, and make decisions based on that data.
What I'm getting from this thread is a few anecdotal examples and statements that the process isn't working at all.
I took Jon's stablefixes script, which he used for his previous articles on stable kernel regressions (here: https://lwn.net/Articles/812231/), and ran it on the 5.15 stable tree (just a random pick). Following Jon's definition in his article, I ignored the non-user-visible regressions (basically issues that were introduced and fixed in the same release) and ended up with 604 commits that caused a user-visible regression.
Out of those 604 commits:
- 170 had an explicit stable tag.
- 434 did not have a stable tag.
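For reference, the stable-tag breakdown above can be reproduced with something like the sketch below. This is not the stablefixes script itself, and regressing_commits.txt (one hash per line, the commits the script flagged as causing a user-visible regression) is a hypothetical input; it assumes a local checkout of the 5.15 stable tree.

# Rough sketch only, not the stablefixes script: count how many of the
# regression-causing commits carried an explicit stable tag.
# regressing_commits.txt is a hypothetical input file, one hash per line.
import re
import subprocess

STABLE_TAG = re.compile(r"^\s*Cc:.*\bstable@(vger\.)?kernel\.org", re.I | re.M)

def commit_message(sha):
    return subprocess.run(["git", "log", "-1", "--format=%B", sha],
                          capture_output=True, text=True, check=True).stdout

with open("regressing_commits.txt") as f:
    shas = [line.strip() for line in f if line.strip()]

tagged = sum(1 for sha in shas if STABLE_TAG.search(commit_message(sha)))
print(f"{tagged} of {len(shas)} had an explicit stable tag, "
      f"{len(shas) - tagged} did not")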
I think a lot of people don't realise they have to _both_ put a Fixes tag _and_ add a Cc: stable. How many of those 604 commits had a Fixes tag?
What do you mean? Just a cc: stable tag is enough to land it in stable; you don't have to do both. The numbers above reflect that.
Running the numbers, there are 9422 commits with a Fixes tag in the 5.15 tree, out of which 360 caused a regression: 360 / 9422 = 3.82%.
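If anyone wants to check the math, something along these lines recomputes the rate. The revision range and regressing_fixes.txt (the Fixes-tagged commits that themselves caused a regression) are placeholders, not the exact inputs I used:

# Rough sketch of the rate computation; the revision range and the
# regressing_fixes.txt input file are hypothetical placeholders.
import subprocess

def count_fixes(rev_range):
    out = subprocess.run(["git", "rev-list", "--count", "--grep=^Fixes:",
                          rev_range],
                         capture_output=True, text=True, check=True).stdout
    return int(out.strip())

fixes_total = count_fixes("v5.15..linux-5.15.y")
with open("regressing_fixes.txt") as f:
    regressed = sum(1 for line in f if line.strip())

print(f"{regressed} / {fixes_total} = {regressed / fixes_total:.2%}")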