On Tue, Jan 23, 2024 at 04:36:48PM -0600, Bjorn Helgaas wrote:
> On Tue, Jan 23, 2024 at 06:25:52PM +0100, Johan Hovold wrote:
> > On Mon, Jan 22, 2024 at 12:26:15PM -0600, Bjorn Helgaas wrote:
> > > On Mon, Jan 22, 2024 at 11:53:35AM +0100, Johan Hovold wrote:
> > > 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") was a start at fixing other problems and also improving the ASPM style, so I hope somebody steps up to fix both it and the lockdep issue. I haven't looked at it enough to have a preference for *how* to fix it.
> > Ok, but since you were the one introducing the locking regression in 6.7-final shouldn't you look into fixing it?
> > Especially if there were alternatives to restoring the offending commit which would solve the underlying issue for the resume failure without breaking other platforms.
> Did somebody propose an alternate patch? If so, I missed it, but we could look at it now.
I've only skimmed the discussion leading up to the revert, but I got the impression that other alternatives were looked at as it was still not clear what the underlying issue actually was.
As Michael and Thorsten pointed out before the revert, it may have been better not to do a last-minute revert of a 16-month-old commit, which risks introducing regressions (and brought back another sysfs issue IIUC), before fully understanding what is really going on here.
> > I don't want to spend more time on this if the offending commit could simply be reverted.
> I don't quite follow. By simply reverting, do you mean to revert f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")? IIUC that would break Michael's machine again.
Right, at least until that issue is fully understood and alternative fixes have been considered.
If that's not an option, we need to rework core to pass a flag through more than one layer to indicate whether pcie_aspm_pm_state_change() should take the bus semaphore or not. I'd rather not do that if it can be avoided.
Johan