 
On Tue 2025-10-14 10:49:53, Li,Rongqing wrote:
On Tue 2025-10-14 13:23:58, Lance Yang wrote:
Thanks for the patch!
I noticed the implementation panics only when N tasks are detected within a single scan, because total_hung_task is reset for each check_hung_uninterruptible_tasks() run.
Great catch!
Does it make sense? Is it the intended behavior, please?
Yes, this is intended behavior
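To make the per-scan semantics concrete, here is a minimal user-space sketch, not the kernel code. Only total_hung_task is a name from the discussion; scan_would_panic(), PANIC_THRESHOLD, and the ">=" comparison are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold; stands in for the N discussed above. */
#define PANIC_THRESHOLD 3

/*
 * Model of a single scan. hung[i] is true when task i is considered
 * hung during this scan. Returns true if this scan alone would panic.
 * Assumption: the comparison is ">= N"; the real patch may differ.
 */
static bool scan_would_panic(const bool *hung, size_t ntasks)
{
	unsigned int total_hung_task = 0; /* starts from zero on every scan */

	for (size_t i = 0; i < ntasks; i++) {
		if (hung[i])
			total_hung_task++;
	}

	return total_hung_task >= PANIC_THRESHOLD;
}
```

With this model, two hung tasks in one scan and two different hung tasks in the next scan never reach a threshold of 3, because nothing is carried over between scans.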
So some suggestions to align the documentation with the code's behavior below :)
On 2025/10/12 19:50, lirongqing wrote:
From: Li RongQing lirongqing@baidu.com
Currently, when 'hung_task_panic' is enabled, the kernel panics immediately upon detecting the first hung task. However, some hung tasks are transient and the system can recover, while others are persistent and may accumulate progressively.
My understanding is that this patch wanted to do:
- report even temporary stalls
- panic only when the stall was much longer and likely persistent
Which might make some sense. But the code does something else.
A single task hanging for an extended period may not be a critical issue, as users might still log into the system to investigate. However, if multiple tasks hang simultaneously, such as in cases of I/O hangs caused by disk failures, it could prevent users from logging in and become a serious problem, and a panic is expected.
I see. This is another approach and it makes sense as well. And this is a much clearer description than the original text.
I would also update the subject to something like:
hung_task: Panic when there are more than N hung tasks at the same time
That said, I think that both approaches make sense.
Your approach would trigger the panic when many processes are stuck. Note that it still might be a transient state. But I agree that the more stuck processes exist, the more serious the problem likely is for the health of the system.
My approach would trigger a panic when a single process hangs for a long time. It would more likely trigger only when the problem is persistent. The seriousness depends on which particular process gets stuck.
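The difference between the two policies can be sketched in user space as follows. This is only an illustration under stated assumptions: struct task_sample, is_hung(), panic_on_count(), panic_on_duration(), and all parameters are hypothetical names, not taken from the patch or from kernel/hung_task.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task sample: how long the task has been blocked. */
struct task_sample {
	unsigned long blocked_secs;
};

/* A task is reported as hung once it is blocked for report_secs or more. */
static bool is_hung(const struct task_sample *t, unsigned long report_secs)
{
	return t->blocked_secs >= report_secs;
}

/* Count-based policy: panic when at least max_hung tasks are hung at once. */
static bool panic_on_count(const struct task_sample *tasks, size_t n,
			   unsigned long report_secs, unsigned int max_hung)
{
	unsigned int hung = 0;

	for (size_t i = 0; i < n; i++) {
		if (is_hung(&tasks[i], report_secs))
			hung++;
	}

	return hung >= max_hung;
}

/* Duration-based policy: panic when any one task is blocked for panic_secs or more. */
static bool panic_on_duration(const struct task_sample *tasks, size_t n,
			      unsigned long panic_secs)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].blocked_secs >= panic_secs)
			return true;
	}

	return false;
}
```

In this model, three tasks blocked for a bit over the report timeout trip the count-based policy but not the duration-based one, while a single task blocked for a very long time does the opposite.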
I am fine with your approach. Just please make it clearer that the number means the number of hung tasks at the same time. And mention the problems with logging in, ...
Best Regards, Petr