On Mon, Aug 03, 2020 at 09:45:32PM -0700, Andi Kleen wrote:
Why is that? Both .text and .text.hot have alignment of 2^4 (default function alignment on x86) by default, so it doesn't seem like it should matter for packing density. Avoiding interspersing cold text among
You may lose part of a cache line on each unit boundary. Linux has a lot of units, some of them small. All these bytes add up.
Separating out .text.unlikely, which isn't aligned, slightly _reduces_ this loss, but not by much -- just over 1K on a defconfig. More importantly, it moves cold code out of line (~320k on a defconfig), giving better code density for the hot code.
For .text and .text.hot, you lose the alignment padding on every function boundary, not unit boundary, because of the 16-byte alignment. Whether .text.hot and .text are arranged by translation unit or not makes no difference.
With *(.text.hot) *(.text) you get HHTT, with *(.text.hot .text) you get HTHT, but in both cases the individual chunks are already aligned to 16 bytes. If .text.hot _had_ different alignment requirements to .text, the HHTT should actually give better packing in general, I think.
It's bad for TLB locality too. Sadly with all the fine grained protection changes the 2MB coverage is eroding anyways, but this makes it even worse.
Yes, that could be true for .text.hot, depending on whether the hot functions are called from all over the kernel (in which case putting them together ought to be better) or mostly from regular text within the unit in which they appeared (in which case it would be better together with that code).