On Tue, Aug 04, 2020 at 12:06:49PM -0400, Arvind Sankar wrote:
On Mon, Aug 03, 2020 at 09:45:32PM -0700, Andi Kleen wrote:
Why is that? Both .text and .text.hot have an alignment of 2^4 bytes (the default function alignment on x86), so it doesn't seem like it should matter for packing density. Avoiding interspersing cold text among
You may lose part of a cache line on each unit boundary. Linux has a lot of units, some of them small. All these bytes add up.
Separating out .text.unlikely, which isn't aligned, slightly _reduces_ this loss, but not by much -- just over 1K on a defconfig. More importantly, it moves cold code out of line (~320k on a defconfig), giving better code density for the hot code.
For .text and .text.hot, you lose the alignment padding on every function boundary, not unit boundary, because of the 16-byte alignment. Whether .text.hot and .text are arranged by translation unit or not makes no difference.
With *(.text.hot) *(.text) you get HHTT, with *(.text.hot .text) you get HTHT (H being a .text.hot chunk and T a .text chunk), but in both cases the individual chunks are already aligned to 16 bytes. If .text.hot _had_ different alignment requirements from .text, HHTT should actually give better packing in general, I think.
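To make the packing argument above concrete, here is a rough sketch that lays out chunks one after another and counts the alignment padding for the HHTT and HTHT orderings. The chunk sizes, the two-unit setup and the function names are invented for illustration. Each chunk is treated as a single blob, so the per-function padding inside a chunk (which is the same in either ordering) is not modelled.

```python
def layout(chunks):
    """Place (size, alignment) chunks in order; return (end_address, total_padding)."""
    addr = 0
    padding = 0
    for size, align in chunks:
        start = (addr + align - 1) // align * align  # round up to the alignment
        padding += start - addr
        addr = start + size
    return addr, padding


def compare(hot_align, text_align=16):
    # Hypothetical chunk sizes in bytes for two translation units.
    h1, h2 = (40, hot_align), (20, hot_align)
    t1, t2 = (36, text_align), (24, text_align)
    # First result is the HHTT ordering, second is HTHT.
    return layout([h1, h2, t1, t2]), layout([h1, t1, h2, t2])


if __name__ == "__main__":
    print("same alignment (16/16):", compare(16))
    print("hot aligned to 64:     ", compare(64))
```

With these made-up sizes, both orderings end at the same address when everything is 16-byte aligned, while giving the hot chunks a hypothetical 64-byte alignment makes HTHT pay for the larger alignment in the middle of the layout. Other sizes can make the two orderings tie even with different alignments, so this only illustrates the "in general" part of the claim.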
Okay, so at the end of the conversation, I think this patch is correct: it collects the hot, unlikely, etc. sections into their own areas (e.g. HHTTUU is more correct than HTUHTU), so it stands as-is.