On Tue, Aug 29, 2023 at 12:08 AM Huacai Chen <chenhuacai@kernel.org> wrote:
Hi, Joel,
On Tue, Aug 29, 2023 at 4:47 AM Joel Fernandes <joel@joelfernandes.org> wrote:
Hi Huacai,
On Mon, Aug 28, 2023 at 11:13 AM Huacai Chen <chenhuacai@kernel.org> wrote:
[...]
[Huacai] I also think the original patch should be OK, but I have another question: what will happen if the current GP ends before nr_fqs_jiffies_stall reaches zero?
Nothing should happen. Stall detection only happens when a GP is in progress. If a new GP starts, it resets nr_fqs_jiffies_stall.
Or can you elaborate your concern more?
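To make the answer above concrete, here is a minimal user-space sketch of the behaviour being described. It is a toy model and not the kernel code: the struct, every toy_* helper, the timeout constant, and the initial counter value of 3 are assumptions made up for illustration. Only the name nr_fqs_jiffies_stall and the use of ULONG_MAX come from this thread. The comparison is also deliberately naive and not wraparound-safe.

```c
#include <limits.h>
#include <stdbool.h>

/* Toy stand-in for the fields under discussion; NOT the kernel's rcu_state. */
struct toy_rcu_state {
	bool gp_in_progress;          /* is a grace period (GP) running? */
	unsigned long jiffies_stall;  /* time at which a stall would be reported */
	int nr_fqs_jiffies_stall;     /* FQS passes left before re-arming */
};

/* Hypothetical stall timeout in jiffies, for illustration only. */
#define TOY_STALL_TIMEOUT 21000UL

/* Stall reset: push the deadline out and defer re-arming by a few FQS passes. */
static void toy_stall_reset(struct toy_rcu_state *s)
{
	s->nr_fqs_jiffies_stall = 3;  /* assumed value */
	s->jiffies_stall = ULONG_MAX;
}

/* One force-quiescent-state pass: re-arm the deadline on the last deferred pass. */
static void toy_fqs(struct toy_rcu_state *s, unsigned long now)
{
	if (s->nr_fqs_jiffies_stall > 0) {
		if (s->nr_fqs_jiffies_stall == 1)
			s->jiffies_stall = now + TOY_STALL_TIMEOUT;
		s->nr_fqs_jiffies_stall--;
	}
}

/* A new GP starts with a fresh deadline and clears any pending countdown. */
static void toy_gp_start(struct toy_rcu_state *s, unsigned long now)
{
	s->gp_in_progress = true;
	s->nr_fqs_jiffies_stall = 0;
	s->jiffies_stall = now + TOY_STALL_TIMEOUT;
}

static void toy_gp_end(struct toy_rcu_state *s)
{
	s->gp_in_progress = false;
}

/* Stalls are only considered while a GP is in progress. */
static bool toy_stall_detected(const struct toy_rcu_state *s, unsigned long now)
{
	return s->gp_in_progress && now >= s->jiffies_stall;
}
```

In this model, if the GP ends while the counter is still nonzero, toy_stall_detected() returns false regardless of the stale counter, and the next toy_gp_start() clears it.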
OK, I will test your patch over the next few days. Maybe putting nr_fqs_jiffies_stall before jiffies_force_qs is better, because I think putting an 'int' between two 'long' fields wastes space. :)
That's a good point and I'll look into that.
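For readers unfamiliar with the padding concern, here is a small sketch of the effect. The two structs and all field names are hypothetical and do not reflect the actual layout of the kernel structure being patched. They only show how an int placed between two longs can cost alignment padding, while placing it next to another int-sized field may not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the new int sits between two longs. */
struct interleaved {
	int existing_int;
	unsigned long first_deadline;
	int counter;                   /* new field, followed by padding on LP64 */
	unsigned long second_deadline;
};

/* Hypothetical layout: the new int shares a slot with the existing int. */
struct grouped {
	int existing_int;
	int counter;                   /* new field fills what would be padding */
	unsigned long first_deadline;
	unsigned long second_deadline;
};

/* Grouping never makes the struct larger; on LP64 it is typically smaller. */
static_assert(sizeof(struct grouped) <= sizeof(struct interleaved),
	      "grouping the ints should not grow the struct");

/* Bytes saved by the grouped layout on the current target (0 if long is int-sized). */
static size_t bytes_saved(void)
{
	return sizeof(struct interleaved) - sizeof(struct grouped);
}
```

On a typical LP64 target such as x86-64 Linux, the interleaved layout takes 32 bytes and the grouped one 24. On targets where long and int have the same size, there is no difference.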
Another point, is it better to replace ULONG_MAX with ULONG_MAX/4 as Paul suggested?
I could do that but I don't feel too strongly about it. I will keep it at ULONG_MAX if it's OK with everyone.
Meanwhile I pushed the patch out to my 6.4 stable tree for testing on my fleet.
Ideally, I'd like to change the stall detection test in rcutorture so that it actually fails if stalls don't happen in time. But at least I verified this manually using rcutorture.
I should also add a documentation patch for stallwarn.rst to document the understandable sensitivity of RCU stall detection to jiffies updates (or lack thereof). Or if you have time, I'd appreciate support on such a patch (not mandatory but I thought it would not hurt to ask).
Looking forward to how your testing goes as well!
I have tested it, and it works for KGDB.
Thanks! If you don't mind, I will add your Tested-by tag to the patch and send it out soon. My tests also look good!
- Joel