On Wed, Jul 9, 2025 at 6:35 PM Mario Limonciello mario.limonciello@amd.com wrote:
On 7/9/2025 12:23 PM, Eric W. Biederman wrote:
Sasha Levin sashal@kernel.org writes:
On Tue, Jul 08, 2025 at 04:46:19PM -0500, Eric W. Biederman wrote:
Sasha Levin sashal@kernel.org writes:
On Tue, Jul 08, 2025 at 02:32:02PM -0500, Eric W. Biederman wrote:
Wow!
Sasha, I think an impersonator has gotten into your account, and is just making nonsense up.
It is nice it is giving explanations for its backporting decisions.
It would be nicer if those explanations were clearly marked as coming from a non-human agent, and did not read like a human being impatient for a patch to be backported.
That's a fair point. I'll add "LLM Analysis:" before the explanation to future patches.
Further, the machine-given explanations were clearly wrong. Do you have plans to do anything about that? Using very incorrect justifications for backporting patches is scary.
Just like in the past 8 years where AUTOSEL ran without any explanation whatsoever, the patches are manually reviewed and tested prior to being included in the stable tree.
I believe there is some testing done. However, for a lot of what I see go by, I would be strongly surprised if there is actually much manual review.
I expect a lot of the changes are simply ignored after a quick glance because people don't know what is going on, or they are of too little consequence to spend time on.
I don't make a point to go back and correct the justification, it's there more to give some idea as to why this patch was marked for review and may be completely bogus (in which case I'll drop the patch).
For that matter, I'd often look at the explanation only if I don't fully understand why a certain patch was selected. Most often I just use it as a "Yes/No" signal.
In this instance I honestly haven't read the LLM explanation. I agree with you that the explanation is flawed, but the patch clearly fixes a problem:
"On AMD dGPUs this can lead to failed suspends under memory pressure situations as all VRAM must be evicted to system memory or swap."
So it was included in the AUTOSEL patchset.
Do you have an objection to this patch being included in -stable? So far your concerns were about the LLM explanation rather than the actual patch.
Several objections.
- The explanation was clearly bogus.
- The maintainer takes alarm.
- The patch, while small, is not simple and not obviously correct.
- The patch has not been thoroughly tested.
I object because the code does not appear to have been well tested outside of the realm of fixing the issue.
There is no indication that the kexec code path has ever been exercised.
So this appears to be one of those changes that was merged under the banner of "Let's see if this causes a regression".
To the original authors: I would have appreciated it being a little more clearly called out in the change description that this came in under "Let's see if this causes a regression".
As the original author of this patch I don't feel this patch is any different than any other patch in that regard. I don't write in a commit message the expected risk of a patch.
There are always people that find interesting ways to exercise it and they could find problems that I didn't envision.
Such changes should not be backported automatically. They should be backported with care after they have seen much more usage/testing of the kernel they were merged into. Probably after a kernel release or so. This is something that can take some actual judgment to decide, when a backport is reasonable.
TBH - I didn't include stable in the commit message with the intent that, after this baked a cycle or so, we could bring it back later if AUTOSEL hadn't picked it up by then.
I actually see an issue in this patch that I have overlooked previously, so Sasha and "stable" folks - please drop this one.
Namely, the change in dpm_resume_end() is going too far.
It's a real issue people have complained about for years, and its root cause is non-obvious.
Once we're all confident on this I'd love to discuss bringing it back even further to LTS kernels if it's viable.
Sure.