From: Jason Gunthorpe
Sent: 09 August 2022 20:08
> On Tue, Aug 09, 2022 at 11:59:45AM -0700, Linus Torvalds wrote:
> > But as a very good approximation, the rule is "absolutely no new BUG_ON() calls _ever_". Because I really cannot see a single case where "proper error handling and WARN_ON_ONCE()" isn't the right thing.
>
> Parallel to this discussion I've had ones where people more or less say:
>
>   Since BUG_ON crashes the machine and Linus says that crashing the machine is bad, WARN_ON will also crash the machine if you set the panic_on_warn parameter, so it is also bad, thus we shouldn't use anything.
>
> I've generally maintained that people who set the panic_on_warn *want* these crashes, because that is the entire point of it. So we should use WARN_ON with an error recovery for "can't happen" assertions like these. I think it is what you are saying here.

They don't necessarily want the crashes; it is more that the people who built the distribution think they want the crashes.
I have had issues with a customer system (with our drivers) randomly locking up. Someone had decided that 'PANIC_ON_OOPS' was a good idea but hadn't enabled anything to actually take the dump.
So instead of a diagnosable problem (and a 'doh' moment) you get several weeks of head scratching and a very annoyed user.
David
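
For readers following the thread, the 'WARN_ON with an error recovery' pattern discussed above might look roughly like the sketch below. It is a userspace approximation, not kernel code: the WARN_ON_ONCE() macro here is a simplified stand-in written for this example (it relies on GNU C statement expressions, so gcc or clang is assumed), and struct widget and widget_put() are hypothetical names invented for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Simplified userspace stand-in for the kernel's WARN_ON_ONCE():
 * report the first time the condition is true at this call site, and
 * evaluate to the condition so it can be used inside an if ().
 * Assumption: GNU C statement expressions are available.
 */
#define WARN_ON_ONCE(cond) ({                                   \
	static bool warned_;                                    \
	bool c_ = !!(cond);                                     \
	if (c_ && !warned_) {                                   \
		warned_ = true;                                 \
		fprintf(stderr, "WARNING: %s:%d: %s\n",         \
			__FILE__, __LINE__, #cond);             \
	}                                                       \
	c_;                                                     \
})

/* Hypothetical object, for illustration only. */
struct widget {
	int refcount;
};

/* Hypothetical helper: drop one reference, or fail cleanly. */
static int widget_put(struct widget *w)
{
	/* "Can't happen" case: warn and back out instead of BUG_ON(). */
	if (WARN_ON_ONCE(w == NULL || w->refcount <= 0))
		return -EINVAL;

	w->refcount--;
	return 0;
}
```

The caller gets an error code and the system keeps running, while the warning still leaves a trace to diagnose. In the real kernel, a system booted with panic_on_warn set would panic at the warning instead, which is the trade-off being argued about in this thread.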