On Mon, 11 May 2020, Linus Torvalds wrote:
On Mon, May 11, 2020 at 2:57 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
GCC 10 appears to have changed -O2 in order to make compilation time faster when using -flto, seemingly at the expense of performance, in particular with regard to how the inliner works. Since -O3 these days shouldn't have the same set of bugs as 10 years ago, this commit defaults new kernel compiles to -O3 when using gcc >= 10.
I'm not convinced this is sensible.
Note that the real change in GCC 10 is that -O2 now includes -finline-functions, which means GCC considers inlining functions not marked with 'inline' at -O2. To counter the code-size growth and tune it back to previous levels, the inlining limits in effect at -O2 have been lowered.
Note that this was done based on analyzing larger C++ code, and obviously not because the kernel would benefit (IIRC kernel folks like 'inline' to behave as written and so may rather dislike the change to enable -finline-functions by default).
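To make the distinction concrete, here is a minimal sketch in C. The function names are hypothetical and this is not kernel code; it only shows which functions -finline-functions makes eligible for inlining and how the GCC attributes noinline and always_inline override the heuristics in either direction.

```c
/* Hypothetical helpers for illustration only, not kernel code. */

/* Not marked 'inline'. With -finline-functions in effect, GCC may still
 * consider inlining this into its callers, subject to the inlining limits. */
static int scale(int x)
{
	return 3 * x + 1;
}

/* Explicit opt-out: the inliner leaves this as a real call. */
static __attribute__((noinline)) int scale_out_of_line(int x)
{
	return 3 * x + 1;
}

/* Explicit opt-in: inlined regardless of the inlining limits. */
static inline __attribute__((always_inline)) int scale_always(int x)
{
	return 3 * x + 1;
}

/* Caller that uses all three variants on every element. */
int sum_scaled(const int *v, int n)
{
	int total = 0;

	for (int i = 0; i < n; i++)
		total += scale(v[i]) + scale_out_of_line(v[i]) + scale_always(v[i]);
	return total;
}
```

Comparing the output of `gcc -O2 -S` across compiler versions for a file like this is one way to see what the inliner actually decided for `scale()`; the other two variants should not change.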
-O3 historically does bad things with gcc. Including bad things for performance. It traditionally makes code larger and often SLOWER.
And I don't mean slower to compile (although that's an issue). I mean actually generating slower code.
Things like trying to unroll loops etc. make very little sense in the kernel, where we very seldom have high loop counts for pretty much anything.
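As a rough illustration of the loop-count point, the sketch below writes out by hand approximately what a 4x unrolled loop looks like. The function names are made up, and the real compiler output will differ; the only thing it is meant to show is that for short trip counts the unrolled body never executes, so the extra code is pure size overhead.

```c
#include <stddef.h>

/* The loop as written in the source. */
unsigned int sum_simple(const unsigned int *v, size_t n)
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		total += v[i];
	return total;
}

/* Hand-written approximation of 4x unrolling: a wide main loop plus a
 * remainder loop. For n < 4 only the remainder loop runs. */
unsigned int sum_unrolled4(const unsigned int *v, size_t n)
{
	unsigned int total = 0;
	size_t i = 0;

	for (; i + 4 <= n; i += 4)
		total += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
	for (; i < n; i++)
		total += v[i];
	return total;
}
```

Both functions return the same result for every n; the unrolled one only pays off when n is large enough for the main loop to dominate.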
There's a reason -O3 isn't even offered as an option.
And I think that's completely sensible. I would not recommend using -O3 for the kernel. Feeding back profile data somehow might help, though getting such data at all, and with enough coverage, is probably hard.
As you said in the follow-up, I wouldn't recommend tweaking GCC's defaults for the various --param values affecting inlining. Their behavior is not consistent across releases.
Richard.
Maybe things have changed, and maybe they've improved. But I'd like to see actual numbers for something like this.
Not inlining as aggressively is not necessarily a bad thing. It can be, of course. But I've actually also done gcc bug reports about gcc inlining too much, and generating _worse_ code as a result (i.e. inlining things that were behind an "if (unlikely())" test, and causing the likely path to grow a stack frame and stack spills as a result).
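The pattern described here can be sketched as follows. This is a made-up example rather than the code from any actual bug report: `handle_rare()` and `process()` are hypothetical, and `unlikely()` is defined locally on top of `__builtin_expect`. If the rare path were inlined, its 256-byte buffer would become part of the caller's stack frame; keeping it out of line with the GCC noinline and cold attributes leaves the likely path small.

```c
#include <stdio.h>

/* Local definition for this sketch, built on GCC's __builtin_expect. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Rare path that needs a sizeable stack buffer. Kept out of line so the
 * buffer lives in this function's frame, not in the caller's. */
static __attribute__((noinline, cold)) int handle_rare(int value)
{
	char scratch[256];
	int len = snprintf(scratch, sizeof(scratch), "bad value %d", value);

	return -len;
}

/* Hot path: a compare, a rarely taken branch, and one addition. */
int process(int value)
{
	if (unlikely(value < 0))
		return handle_rare(value);
	return value + 1;
}
```

Whether a given compiler version would have inlined `handle_rare()` without the attributes depends on its heuristics, which is exactly why the generated code is worth checking.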
So just "O3 inlines more" is not a valid argument.
Linus