Commit 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") causes ledtrig-netdev to get set as default trigger on various network LEDs.
This causes users to hit a pre-existing AB-BA deadlock issue in ledtrig-netdev between the LED-trigger locks and the rtnl mutex, resulting in hung tasks in kernels >= 6.9.
Solving the deadlock is non-trivial, so for now revert the change to set the hw_control trigger as default trigger, so that ledtrig-netdev no longer gets activated automatically for various network LEDs.
The netdev trigger is not needed because the network LEDs are usually under hw-control, and the netdev trigger tries to leave things that way, so setting it as the active trigger for the LED class device is a no-op.
Fixes: 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger")
Reported-by: Genes Lists lists@sapience.com
Closes: https://lore.kernel.org/all/9d189ec329cfe68ed68699f314e191a10d4b5eda.camel@s...
Reported-by: "Johannes Wüller" johanneswueller@gmail.com
Closes: https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@gmail.com/
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede hdegoede@redhat.com
---
 drivers/leds/led-class.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c
index 24fcff682b24..ba1be15cfd8e 100644
--- a/drivers/leds/led-class.c
+++ b/drivers/leds/led-class.c
@@ -552,12 +552,6 @@ int led_classdev_register_ext(struct device *parent,
 	led_init_core(led_cdev);

 #ifdef CONFIG_LEDS_TRIGGERS
-	/*
-	 * If no default trigger was given and hw_control_trigger is set,
-	 * make it the default trigger.
-	 */
-	if (!led_cdev->default_trigger && led_cdev->hw_control_trigger)
-		led_cdev->default_trigger = led_cdev->hw_control_trigger;
 	led_trigger_set_default(led_cdev);
 #endif
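For readers less familiar with the LED class code, the fallback removed by this patch can be sketched in user-space C. This is a minimal sketch, not the kernel code: `struct led_sketch` and both `pick_default_*` helpers are hypothetical names, and only the two fields named in the diff are modelled.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the two led_classdev fields in the diff. */
struct led_sketch {
	const char *default_trigger;    /* trigger requested for this LED, or NULL */
	const char *hw_control_trigger; /* trigger that can drive the LED in hardware, or NULL */
};

/* Behaviour added by 66601a29bb23: fall back to the hw_control trigger. */
static const char *pick_default_before_revert(const struct led_sketch *led)
{
	if (!led->default_trigger && led->hw_control_trigger)
		return led->hw_control_trigger;
	return led->default_trigger;
}

/* Behaviour after the revert: only an explicitly given default is used. */
static const char *pick_default_after_revert(const struct led_sketch *led)
{
	return led->default_trigger;
}
```

With `default_trigger` unset and `hw_control_trigger` pointing at "netdev", the first helper returns "netdev" and the second returns NULL, which is why the netdev trigger no longer gets activated automatically once the fallback is gone.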
On Fri, Jun 07, 2024 at 12:18:47PM +0200, Hans de Goede wrote:
Commit 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") causes ledtrig-netdev to get set as default trigger on various network LEDs.
This causes users to hit a pre-existing AB-BA deadlock issue in ledtrig-netdev between the LED-trigger locks and the rtnl mutex, resulting in hung tasks in kernels >= 6.9.
Solving the deadlock is non trivial, so for now revert the change to set the hw_control trigger as default trigger, so that ledtrig-netdev no longer gets activated automatically for various network LEDs.
The netdev trigger is not needed because the network LEDs are usually under hw-control and the netdev trigger tries to leave things that way so setting it as the active trigger for the LED class device is a no-op.
Fixes: 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger")
Reported-by: Genes Lists lists@sapience.com
Closes: https://lore.kernel.org/all/9d189ec329cfe68ed68699f314e191a10d4b5eda.camel@s...
Reported-by: "Johannes Wüller" johanneswueller@gmail.com
Closes: https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@gmail.com/
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede hdegoede@redhat.com
I'm not sure I agree with the Closes: tags. All this does is make it less likely to deadlock. The deadlock is still there. But:
Reviewed-by: Andrew Lunn andrew@lunn.ch
Andrew
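The AB-BA pattern under discussion can be illustrated without real locks. The following is a minimal user-space sketch, assuming just two locks; the lock ids, `struct order_tracker`, and all helper names are hypothetical, and it only records pairwise acquisition order (it is not lockdep and does not model the actual ledtrig-netdev call paths). It shows why the hazard remains as long as both orders exist in the code, regardless of how often either path runs.

```c
#include <stdbool.h>

/* Hypothetical lock ids standing in for the two locks named in the thread. */
enum { LOCK_LED_TRIGGER, LOCK_RTNL, LOCK_COUNT };

struct order_tracker {
	bool held[LOCK_COUNT];                    /* locks held by the current path */
	bool seen_before[LOCK_COUNT][LOCK_COUNT]; /* [a][b]: b was taken while a was held */
};

/*
 * Record that the current path takes `lock`.
 * Returns true if this acquisition inverts an order recorded earlier.
 */
static bool tracker_acquire(struct order_tracker *t, int lock)
{
	bool inversion = false;

	for (int held = 0; held < LOCK_COUNT; held++) {
		if (!t->held[held] || held == lock)
			continue;
		if (t->seen_before[lock][held])
			inversion = true;              /* opposite order was seen before */
		t->seen_before[held][lock] = true; /* remember: held -> lock */
	}
	t->held[lock] = true;
	return inversion;
}

static void tracker_release(struct order_tracker *t, int lock)
{
	t->held[lock] = false;
}

/* Replay two hypothetical paths that take the same locks in opposite order. */
static bool replay_abba(void)
{
	struct order_tracker t = {0};
	bool inversion = false;

	/* Path 1: A then B. */
	inversion |= tracker_acquire(&t, LOCK_LED_TRIGGER);
	inversion |= tracker_acquire(&t, LOCK_RTNL);
	tracker_release(&t, LOCK_RTNL);
	tracker_release(&t, LOCK_LED_TRIGGER);

	/* Path 2: B then A. */
	inversion |= tracker_acquire(&t, LOCK_RTNL);
	inversion |= tracker_acquire(&t, LOCK_LED_TRIGGER);
	tracker_release(&t, LOCK_LED_TRIGGER);
	tracker_release(&t, LOCK_RTNL);

	return inversion;
}
```

In this model the inversion is flagged even though the two paths never run concurrently, which matches the point above: running the second path less often lowers the odds of a hang, but does not remove the ordering problem.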
Hi Andrew,
On 6/7/24 2:03 PM, Andrew Lunn wrote:
On Fri, Jun 07, 2024 at 12:18:47PM +0200, Hans de Goede wrote:
Commit 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") causes ledtrig-netdev to get set as default trigger on various network LEDs.
This causes users to hit a pre-existing AB-BA deadlock issue in ledtrig-netdev between the LED-trigger locks and the rtnl mutex, resulting in hung tasks in kernels >= 6.9.
Solving the deadlock is non trivial, so for now revert the change to set the hw_control trigger as default trigger, so that ledtrig-netdev no longer gets activated automatically for various network LEDs.
The netdev trigger is not needed because the network LEDs are usually under hw-control and the netdev trigger tries to leave things that way so setting it as the active trigger for the LED class device is a no-op.
Fixes: 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger")
Reported-by: Genes Lists lists@sapience.com
Closes: https://lore.kernel.org/all/9d189ec329cfe68ed68699f314e191a10d4b5eda.camel@s...
Reported-by: "Johannes Wüller" johanneswueller@gmail.com
Closes: https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@gmail.com/
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede hdegoede@redhat.com
I'm not sure I agree with the Closes: tags. All this does is make it less likely to deadlock. The deadlock is still there.
I agree that the deadlock which is the root-cause is still there. But with this revert ledtrig-netdev will no longer get activated by default.
So now the only way to actually get the code-paths which may deadlock to run is by the user or some script explicitly activating the netdev trigger by writing "netdev" to the trigger sysfs file for a LED classdev. So most users will now no longer hit this, including the reporters of these bugs.
The auto-activating of the netdev trigger is what is causing these reports when users are running kernels >= 6.9.
But:
Reviewed-by: Andrew Lunn andrew@lunn.ch
Thank you.
Regards,
Hans
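Explicit activation as described above amounts to a plain write to the LED's `trigger` attribute. A minimal sketch, assuming the usual `/sys/class/leds/<led>/trigger` layout; `led_set_trigger` is a hypothetical helper name and the LED name is left as a placeholder. On an affected kernel, doing this can exercise the code paths that may deadlock, so treat it as an illustration rather than a recommendation.

```c
#include <stdio.h>

/*
 * Write `trigger` (e.g. "netdev") to the trigger file at `path`.
 * Returns 0 on success, -1 on failure.
 */
static int led_set_trigger(const char *path, const char *trigger)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;

	int ok = fputs(trigger, f) >= 0;

	/* stdio buffers the write, so a failed store can surface at fclose(). */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A call would look like `led_set_trigger("/sys/class/leds/<led>/trigger", "netdev")`, normally run as root, with `<led>` replaced by an actual LED class device name.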
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting for once, to make this easily accessible to everyone.
Hans, from your point of view, how fast should we try to mainline this revert? I got the impression that you want it merged there rather sooner than later -- and that sounds appropriate to me. So should we maybe ask Linus on Friday to pick this up from here? Ideally of course with an ACK from Pavel or Lee.
Ciao, Thorsten
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
#regzbot poke
Hi,
On 6/12/24 4:58 PM, Linux regression tracking (Thorsten Leemhuis) wrote:
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting for once, to make this easily accessible to everyone.
Hans, from your point of view, how fast should we try to mainline this revert? I got the impression that you want it merged there rather sooner than later -- and that sounds appropriate to me.
There are at least 2 separate bug reports from 6.9 users who are getting stuck tasks, which should be fixed by this, so yes, this should go upstream soon.
So should we maybe ask Linus on Friday to pick this up from here? Ideally of course with an ACK from Pavel or Lee.
Indeed having an ack from Lee or Pavel here would be great!
Regards,
Hans
On Wed, 12 Jun 2024, Hans de Goede wrote:
Hi,
On 6/12/24 4:58 PM, Linux regression tracking (Thorsten Leemhuis) wrote:
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting for once, to make this easily accessible to everyone.
Hans, from your point of view, how fast should we try to mainline this revert? I got the impression that you want it merged there rather sooner than later -- and that sounds appropriate to me.
There are at least 2 separate bug reports from 6.9 users who are getting stuck tasks, which should be fixed by this, so yes, this should go upstream soon.
So should we maybe ask Linus on Friday to pick this up from here? Ideally of course with an ACK from Pavel or Lee.
Indeed having an ack from Lee or Pavel here would be great!
Acked-by: Lee Jones lee@kernel.org
On 12.06.24 17:26, Lee Jones wrote:
On Wed, 12 Jun 2024, Hans de Goede wrote:
On 6/12/24 4:58 PM, Linux regression tracking (Thorsten Leemhuis) wrote:
Hans, from your point of view, how fast should we try to mainline this revert? I got the impression that you want it merged there rather sooner than later -- and that sounds appropriate to me.
There are at least 2 separate bug reports from 6.9 users who are getting stuck tasks, which should be fixed by this, so yes, this should go upstream soon.
So should we maybe ask Linus on Friday to pick this up from here? Ideally of course with an ACK from Pavel or Lee.
Indeed having an ack from Lee or Pavel here would be great!
Acked-by: Lee Jones lee@kernel.org
Thx everyone. In that case: why wait till Friday. :-D
Linus, could you please pick up the revert this thread is about? You can find it at the start of this thread:
https://lore.kernel.org/all/20240607101847.23037-1-hdegoede@redhat.com/
You can see the Ack from Lee above and there is a Reviewed-by: from Andrew in the first reply as well.
Tia! Ciao, Thorsten
[resending this request, I fear my earlier mail triggered a spam filter]
Linus, could you please merge the revert at this thread's start, i.e.: https://lore.kernel.org/all/20240607101847.23037-1-hdegoede@redhat.com/
It fixes a regression that causes some trouble.
You can see the Ack from Lee (who merged and mainlined the patch that is reverted) below; there is a Reviewed-by: from Andrew in the first reply as well. Heiner, who authored the culprit, did not reply afaics.
Ciao, Thorsten
On 13.06.24 08:01, Linux regression tracking (Thorsten Leemhuis) wrote:
On 12.06.24 17:26, Lee Jones wrote:
On Wed, 12 Jun 2024, Hans de Goede wrote:
On 6/12/24 4:58 PM, Linux regression tracking (Thorsten Leemhuis) wrote:
Hans, from your point of view, how fast should we try to mainline this revert? I got the impression that you want it merged there rather sooner than later -- and that sounds appropriate to me.
There are at least 2 separate bug reports from 6.9 users who are getting stuck tasks, which should be fixed by this, so yes, this should go upstream soon.
So should we maybe ask Linus on Friday to pick this up from here? Ideally of course with an ACK from Pavel or Lee.
Indeed having an ack from Lee or Pavel here would be great!
Acked-by: Lee Jones lee@kernel.org
Thx everyone. In that case: why wait till Friday. :-D
Linus, could you please pick up the revert this thread is about? You can find it at the start of this thread:
https://lore.kernel.org/all/20240607101847.23037-1-hdegoede@redhat.com/
You can see the Ack from Lee above and there is a Reviewed-by: from Andrew in the first reply as well.
Tia! Ciao, Thorsten
On Fri, 07 Jun 2024 12:18:47 +0200, Hans de Goede wrote:
Commit 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") causes ledtrig-netdev to get set as default trigger on various network LEDs.
This causes users to hit a pre-existing AB-BA deadlock issue in ledtrig-netdev between the LED-trigger locks and the rtnl mutex, resulting in hung tasks in kernels >= 6.9.
[...]
Applied, thanks!
[1/1] leds: class: Revert: "If no default trigger is given, make hw_control trigger the default trigger"
      commit: 3acc45f2ceb0609812522e45aec4cb9516e1c586
-- Lee Jones [李琼斯]
On Thu, 20 Jun 2024, Lee Jones wrote:
On Fri, 07 Jun 2024 12:18:47 +0200, Hans de Goede wrote:
Commit 66601a29bb23 ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") causes ledtrig-netdev to get set as default trigger on various network LEDs.
This causes users to hit a pre-existing AB-BA deadlock issue in ledtrig-netdev between the LED-trigger locks and the rtnl mutex, resulting in hung tasks in kernels >= 6.9.
[...]
Applied, thanks!
[1/1] leds: class: Revert: "If no default trigger is given, make hw_control trigger the default trigger"
      commit: 3acc45f2ceb0609812522e45aec4cb9516e1c586
Cancel.
It looks as though Linus did end up picking this up, just silently.
linux-stable-mirror@lists.linaro.org