On Thu, Aug 11, 2022 at 12:13 AM Ingo Molnar <mingo@kernel.org> wrote:
> By using a WARN_ON() we at least give the user a chance to report any bugs triggered here - instead of getting silent hangs.
> None of these WARN_ON()s are supposed to trigger, ever - so we ignore cases where a NULL check is done via a BUG_ON() and we let a NULL pointer through after a WARN_ON().
May I suggest going one step further, and making these WARN_ON_ONCE() instead?
From personal experience, once some scheduler bug (or task struct corruption) happens, it often *keeps* happening, and the logs just fill up with more and more data, to the point where you lose sight of the original report (and the machine can even get unusable just from the logging).
WARN_ON_ONCE() can help that situation.
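To make the difference concrete, here is a rough userspace sketch, not the kernel's actual macros. SKETCH_WARN_ON(), SKETCH_WARN_ON_ONCE() and emit_report() are made-up names for illustration, and unlike the real kernel macros these do not evaluate to the condition. It only shows how a per-call-site "already warned" flag turns a stream of identical reports into one.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel implementation. */
static int reports_emitted;

static void emit_report(const char *what)
{
	reports_emitted++;
	fprintf(stderr, "WARNING: %s\n", what);
}

/* Reports every time the condition is true. */
#define SKETCH_WARN_ON(cond)                      \
	do {                                      \
		if (cond)                         \
			emit_report(#cond);       \
	} while (0)

/* Reports only the first time the condition is true at this call site. */
#define SKETCH_WARN_ON_ONCE(cond)                 \
	do {                                      \
		static bool warned_already;       \
		if ((cond) && !warned_already) {  \
			warned_already = true;    \
			emit_report(#cond);       \
		}                                 \
	} while (0)

/* Simulate a bug that keeps firing: the same bad pointer seen 'events' times. */
static int run_with_warn_on(const void *task, int events)
{
	reports_emitted = 0;
	for (int i = 0; i < events; i++)
		SKETCH_WARN_ON(!task);
	return reports_emitted;
}

static int run_with_warn_on_once(const void *task, int events)
{
	reports_emitted = 0;
	for (int i = 0; i < events; i++)
		SKETCH_WARN_ON_ONCE(!task);
	return reports_emitted;
}
```

With a NULL task and five events, run_with_warn_on() emits five reports and run_with_warn_on_once() emits one. Note the flag is per call site and never resets, so a second call to run_with_warn_on_once() would emit nothing, which is exactly the "less information" trade-off.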
Now, obviously
(a) WARN_ON_ONCE *can* also result in less information, and maybe there are situations where having more - possibly different - cases of the same thing triggering could be useful.
(b) WARN_ON_ONCE historically generated a bit bigger code than WARN_ON simply due to the extra "did this already trigger" check.
I *think* (b) is no longer true, and it's just a flag these days, but I didn't actually check.
So it's not like there aren't potential downsides, but in general I think the sanest and most natural thing is to have BUG_ON() translate to WARN_ON_ONCE().
For the "reboot-on-warn" people, it ends up being the same thing. And for the rest of us, the "give me *one* warning" can end up making the reporting a lot easier.
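A toy model of that claim, under stated assumptions: simulate() and its halt_on_warn parameter are hypothetical, with halt_on_warn standing in for whatever reboot-on-warn policy is configured, and "the machine goes down" modeled as simply leaving the loop. It is only meant to show that the once/many distinction disappears when the first warning is fatal.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model for illustration; not kernel code. */
struct sim_result {
	int reports;        /* warnings that reached the log */
	int iterations_run; /* failing events seen before the machine "went down" */
};

static struct sim_result simulate(bool warn_once, bool halt_on_warn, int failures)
{
	struct sim_result r = { 0, 0 };
	bool warned = false;

	for (int i = 0; i < failures; i++) {
		r.iterations_run++;

		/* The "once" variant stays silent after its first report. */
		if (warn_once && warned)
			continue;

		warned = true;
		r.reports++;
		fprintf(stderr, "WARNING: failure %d\n", i);

		/* Reboot-on-warn: the first report is also the last. */
		if (halt_on_warn)
			break;
	}
	return r;
}
```

In this model, with halt_on_warn set, both variants produce exactly one report and stop after the first event. Without it, the every-time variant produces one report per failure, while the once variant still produces one.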
Obviously, with the "this never actually happens", the whole "once or many times" is kind of moot. But if it never happens at all, to the point where it doesn't even add a chance of helping debugging, maybe the whole test should be removed entirely...
Linus