Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting for once, to make this easily accessible to everyone.
Hans, from your point of view, how fast should we try to mainline this revert? I got the impression that you want it merged sooner rather than later -- and that sounds appropriate to me. So should we maybe ask Linus on Friday to pick this up from here? Ideally, of course, with an ACK from Pavel or Lee.
Ciao, Thorsten
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
#regzbot poke
On 07.06.24 17:26, Hans de Goede wrote:
On 6/7/24 2:03 PM, Andrew Lunn wrote:
On Fri, Jun 07, 2024 at 12:18:47PM +0200, Hans de Goede wrote:
Commit 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") causes ledtrig-netdev to get set as default trigger on various network LEDs.
This causes users to hit a pre-existing AB-BA deadlock issue in ledtrig-netdev between the LED-trigger locks and the rtnl mutex, resulting in hung tasks in kernels >= 6.9.
Solving the deadlock is non-trivial, so for now revert the change to set the hw_control trigger as default trigger, so that ledtrig-netdev no longer gets activated automatically for various network LEDs.
The netdev trigger is not needed because the network LEDs are usually under hw-control and the netdev trigger tries to leave things that way, so setting it as the active trigger for the LED class device is a no-op.
Fixes: 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger")
Reported-by: Genes Lists lists@sapience.com
Closes: https://lore.kernel.org/all/9d189ec329cfe68ed68699f314e191a10d4b5eda.camel@s...
Reported-by: "Johannes Wüller" johanneswueller@gmail.com
Closes: https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@gmail.com/
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede hdegoede@redhat.com
I'm not sure I agree with the Closes: tags. All this does is make it less likely to deadlock. The deadlock is still there.
I agree that the deadlock which is the root-cause is still there. But with this revert ledtrig-netdev will no longer get activated by default.
So now the only way to actually get the code-paths which may deadlock to run is by the user or some script explicitly activating the netdev trigger by writing "netdev" to the trigger sysfs file for a LED classdev. So most users will now no longer hit this, including the reporters of these bugs.
The auto-activating of the netdev trigger is what is causing these reports when users are running kernels >= 6.9.
But:
Reviewed-by: Andrew Lunn andrew@lunn.ch
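For readers less familiar with the term, the AB-BA pattern mentioned in the commit message can be illustrated with a small userspace sketch. This is not kernel code and does not model the actual call paths in ledtrig-netdev: `led_trigger_lock` and `rtnl_lock` are hypothetical stand-ins for the LED-trigger locks and the rtnl mutex, and `path_1`/`path_2` are made-up names for two code paths that take the locks in opposite order. The sketch uses a non-blocking acquire so that it terminates; with blocking acquires both threads would wait on each other forever.

```python
import threading

# Hypothetical stand-ins, NOT the kernel's locks.
led_trigger_lock = threading.Lock()  # plays the role of an LED-trigger lock
rtnl_lock = threading.Lock()         # plays the role of the rtnl mutex


def worker(first, second, barrier, results, name):
    """Take `first`, wait until the peer holds its own first lock, then try `second`."""
    with first:
        barrier.wait()  # both threads now hold one lock each
        # A blocking acquire here is where a real AB-BA deadlock would hang.
        got_second = second.acquire(blocking=False)
        results[name] = got_second
        if got_second:
            second.release()
        barrier.wait()  # keep holding `first` until the peer has tried too


def demo():
    """Run two paths that take the same two locks in opposite order."""
    results = {}
    barrier = threading.Barrier(2)
    threads = [
        # path_1: LED-trigger lock first, then rtnl (order A -> B)
        threading.Thread(target=worker,
                         args=(led_trigger_lock, rtnl_lock, barrier, results, "path_1")),
        # path_2: rtnl first, then LED-trigger lock (order B -> A)
        threading.Thread(target=worker,
                         args=(rtnl_lock, led_trigger_lock, barrier, results, "path_2")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Neither path can get its second lock while the other holds it.
    print(demo())
```

The revert discussed above does not change the lock ordering itself; it only stops the netdev trigger from being activated by default, so the conflicting paths are reached far less often.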