On Fri, May 23, 2025, at 16:08, Kent Overstreet wrote:
On Fri, May 23, 2025 at 03:49:54PM +0200, Arnd Bergmann wrote:
On Fri, May 23, 2025, at 15:19, Naresh Kamboju wrote:
I reproduced the problem locally and found this to go down to 1440 bytes after I turn off KASAN_STACK. next-20250523 has some changes that take the number down further, to 1136 without KASAN_STACK or 1552 with KASAN_STACK.
I've turned bcachefs with kasan-stack on for my randconfig builds again to see if there are any remaining corner cases.
Thanks for the numbers - that does still seem high, I'll have to have a look with pahole.
I agree it's still larger than it should be: having more than a few hundred bytes of stack in a single function usually means that there is both a risk of actual overflow and general inefficiency if all the stack data gets accessed as well.
It's probably not actually structure data though, but a combination of effects:
- KASAN_STACK adds extra redzones for each variable
- KASAN_STACK further prevents stack slots from getting reused inside one function, in order to better pinpoint which instance caused problems like out-of-scope access
- passing structures by value causes them to be put on the stack on some architectures, even when the structure size is only one or two registers
- sanitizers turn off optimizations that lead to better stack usage
- in some cases, the missed optimization ends up causing local variables to get spilled to the stack many times because of a combination of all the above.
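To make two of these effects concrete, here is a minimal userspace sketch, not taken from bcachefs. The names struct pos, pos_sum_by_value(), pos_sum_by_ptr() and scoped_buffers() are made up for illustration. Building it with gcc -fstack-usage (which writes per-function frame sizes to a .su file), once with and once without a stack sanitizer, is one way to see how the per-function numbers change; the actual values depend on compiler, architecture and options.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical two-word structure: small enough to fit a register pair. */
struct pos {
	uint64_t inode;
	uint64_t offset;
};

/*
 * Passed by value: depending on the ABI and on which optimizations
 * the sanitizer leaves enabled, a copy of 'p' may be placed in a
 * stack slot instead of staying in registers.
 */
uint64_t pos_sum_by_value(struct pos p)
{
	return p.inode + p.offset;
}

/* Passed by pointer: only the address is handed to the callee. */
uint64_t pos_sum_by_ptr(const struct pos *p)
{
	return p->inode + p->offset;
}

/*
 * 'a' and 'b' live in disjoint scopes, so the compiler is allowed to
 * give them the same 256-byte slot. With stack slot reuse prevented,
 * as described for KASAN_STACK above, each array gets its own slot
 * plus redzones, so the frame can grow to more than twice that size.
 */
uint64_t scoped_buffers(int which)
{
	uint64_t sum = 0;

	if (which) {
		unsigned char a[256];

		memset(a, 1, sizeof(a));
		sum += a[0] + a[255];
	} else {
		unsigned char b[256];

		memset(b, 2, sizeof(b));
		sum += b[0] + b[255];
	}

	return sum;
}
```

Both pos_sum_*() variants compute the same result; the only difference is how the argument reaches the callee, which is the part that can cost extra stack on some architectures.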
The good news is that so far my randconfig builds have not shown any more stack frame warnings on next-20250523 with bcachefs force-enabled, now 55 builds into the change, across arm32/arm64/x86 using gcc-15.1.
Arnd