On Mon, Feb 27, 2023 at 09:38:46PM +0000, Eric Biggers wrote:
On Mon, Feb 27, 2023 at 03:39:14PM -0500, Sasha Levin wrote:
So to summarize, that buggy commit was backported even though:
- There were no indications that it was a bug fix (and thus potentially suitable for stable) in the first place.
- On the AUTOSEL thread, someone told you the commit is broken.
- There was already a thread that reported a regression caused by the commit. Easily findable via lore search.
- There was also already a pending patch that Fixes the commit. Again easily findable via lore search.
So it seems a *lot* of things went wrong, no? Why? If so many things can go wrong, it's not just a "mistake"; rather, the process itself is the problem...
BTW, another cause of this is that the commit (66f99628eb24) was AUTOSEL'd after only being in mainline for 4 days, and *released* in all LTS kernels after only being in mainline for 12 days. Surely that's a timeline befitting a critical security vulnerability, not some random neural-network-selected commit that wasn't even fixing anything?
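For what it's worth, both of those checks take seconds: the lore side is just a query like https://lore.kernel.org/all/?q=66f99628eb24, and the git side is a Fixes-tag grep. A rough sketch of the latter, with "origin" standing in for mainline and "next" for linux-next (purely example remote names):

$ # any commit in mainline or linux-next that declares it fixes 66f99628eb24
$ git log --oneline --grep 'Fixes: 66f99628' origin/master next/master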
I would love to have a mechanism that tells me with 100% confidence whether a given commit fixes a bug or not; could you provide me with one?
Just because you can't be 100% certain whether a commit is a fix doesn't mean you should be rushing to backport random commits that have no indications they are fixing anything.
The difference in opinion here is that I don't think it's rushing: the stable kernel rules require a commit to be in a released kernel, and the AUTOSEL timelines effectively mean it will have been in two released kernels.
Should we extend it? Maybe; I just don't think we have enough data to make a decision either way.
W.r.t. timelines, this is something that was discussed on the mailing list a few years ago, where we decided that giving AUTOSEL commits 7 days of soaking time is sufficient; if anything has changed we can have this discussion again.
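If we do want that data, the raw timeline for any given backport is easy to pull out of git. A rough sketch using the commit from this thread, with stable/linux-5.15.y standing in for whichever LTS branch received the backport, and <backport-sha>/<tag> as placeholders to be filled in from the previous commands' output:

$ # date the commit landed in mainline
$ git show -s --format=%cs 66f99628eb24
$ # find the 5.15 backport (stable commits reference the upstream SHA in their message)
$ git log --oneline --grep '66f99628eb24' stable/linux-5.15.y
$ # first stable tag containing that backport, and the date it was released
$ git describe --contains <backport-sha>
$ git log -1 --format=%cs <tag>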
Nothing has changed, but that doesn't mean that your process is actually working. 7 days might be appropriate for something that looks like a security fix, but not for a random commit with no indications it is fixing anything.
How do we know if this is working or not, though? How do you quantify the number of useful commits?
How do you know if a certain fix has security implications? Or even if it actually fixes anything? For every "security" commit tagged for stable I could probably list a "security" commit with no tags whatsoever.
BTW, based on that example it's not even 7 days between AUTOSEL and the patch being applied, but actually 7 days from AUTOSEL to *release*. So e.g. if someone takes just a one-week vacation, in that time a commit they would have NAK'ed can be AUTOSEL'ed and pushed out across all LTS kernels...
Right, and same as above: what's "enough"?
Note, however, that pointing at a tiny set of examples isn't enough to conclude that the entire process is broken. How many AUTOSEL commits introduced a regression? How many -stable tagged ones did? How many bugs did AUTOSEL commits fix?
So basically you don't accept feedback from individual people, as individual people don't have enough data?
I'd love to improve the process, but for that we need to figure out criteria for what we consider good or bad, collect data, and make decisions based on that data.
What I'm getting from this thread is a few anecdotal examples and statements that the process isn't working at all.
I took Jon's stablefixes script, which he used for his previous articles on stable kernel regressions (https://lwn.net/Articles/812231/), and ran it on the 5.15 stable tree (just a random pick). I ignored the non-user-visible regressions as Jon defined them in his article (basically issues that were introduced and fixed in the same release) and ended up with 604 commits that caused a user-visible regression.
Out of those 604 commits:
- 170 had an explicit stable tag.
- 434 did not have a stable tag.
Looking at the commits in the 5.15 tree:
With stable tag:
$ git log --oneline -i --grep "cc.*stable" v5.15..stable/linux-5.15.y | wc -l
3676
Without a stable tag (minus the 96 commits which are just version bumps):
$ git log --oneline --invert-grep -i --grep "cc.*stable" v5.15..stable/linux-5.15.y | wc -l
10649
Regression rate for commits with a stable tag: 170 / 3676 = 4.62%
Regression rate for commits without a stable tag: 434 / 10553 = 4.11%
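The stable-tag split above is essentially a grep over each regression-causing commit's message. A minimal sketch of that kind of pass, where regressions.txt is a hypothetical file with one SHA per line (not the script's actual output format):

$ while read -r sha; do
>   if git show -s --format=%B "$sha" | grep -qiE 'cc.*stable'; then
>     echo stable-tagged
>   else
>     echo no-stable-tag
>   fi
> done < regressions.txt | sort | uniq -c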
Is the analysis flawed somehow? Probably, and I'd happily take feedback on how/what I can do better, but this type of analysis is what I look for to know whether the process is working well or not.