On Thu, Jun 13, 2024 at 11:33:22AM -0700, Nathan Chancellor wrote:
commit aba091547ef6159d52471f42a3ef531b7b660ed8 upstream.
There is an issue in clang's ThinLTO caching (enabled for the kernel via '--thinlto-cache-dir') with .incbin, which the kernel occasionally uses to include data within the kernel, such as the .config file for /proc/config.gz. For example, when changing the .config and rebuilding vmlinux, the copy of .config in vmlinux does not match the copy of .config in the build folder:
$ echo 'CONFIG_LTO_NONE=n CONFIG_LTO_CLANG_THIN=y CONFIG_IKCONFIG=y CONFIG_HEADERS_INSTALL=y' >kernel/configs/repro.config
$ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 clean defconfig repro.config vmlinux ...
$ grep CONFIG_HEADERS_INSTALL .config CONFIG_HEADERS_INSTALL=y
$ scripts/extract-ikconfig vmlinux | grep CONFIG_HEADERS_INSTALL CONFIG_HEADERS_INSTALL=y
$ scripts/config -d HEADERS_INSTALL
$ make -kj"$(nproc)" ARCH=x86_64 LLVM=1 vmlinux ... UPD kernel/config_data GZIP kernel/config_data.gz CC kernel/configs.o ... LD vmlinux ...
$ grep CONFIG_HEADERS_INSTALL .config # CONFIG_HEADERS_INSTALL is not set
$ scripts/extract-ikconfig vmlinux | grep CONFIG_HEADERS_INSTALL CONFIG_HEADERS_INSTALL=y
Without '--thinlto-cache-dir' or when using full LTO, this issue does not occur.
Benchmarking incremental builds on a few different machines with and without the cache shows a 20% increase in incremental build time without the cache when measured by touching init/main.c and running 'make all'.
ARCH=arm64 defconfig + CONFIG_LTO_CLANG_THIN=y on an arm64 host:
Benchmark 1: With ThinLTO cache Time (mean ± σ): 56.347 s ± 0.163 s [User: 83.768 s, System: 24.661 s] Range (min … max): 56.109 s … 56.594 s 10 runs
Benchmark 2: Without ThinLTO cache Time (mean ± σ): 67.740 s ± 0.479 s [User: 718.458 s, System: 31.797 s] Range (min … max): 67.059 s … 68.556 s 10 runs
Summary With ThinLTO cache ran 1.20 ± 0.01 times faster than Without ThinLTO cache
ARCH=x86_64 defconfig + CONFIG_LTO_CLANG_THIN=y on an x86_64 host:
Benchmark 1: With ThinLTO cache Time (mean ± σ): 85.772 s ± 0.252 s [User: 91.505 s, System: 8.408 s] Range (min … max): 85.447 s … 86.244 s 10 runs
Benchmark 2: Without ThinLTO cache Time (mean ± σ): 103.833 s ± 0.288 s [User: 232.058 s, System: 8.569 s] Range (min … max): 103.286 s … 104.124 s 10 runs
Summary With ThinLTO cache ran 1.21 ± 0.00 times faster than Without ThinLTO cache
While it is unfortunate to take this performance improvement off the table, correctness is more important. If/when this is fixed in LLVM, it can potentially be brought back in a conditional manner. Alternatively, a developer can just disable LTO if doing incremental compiles quickly is important, as a full compile cycle can still take over a minute even with the cache and it is unlikely that LTO will result in functional differences for a kernel change.
Cc: stable@vger.kernel.org Fixes: dc5723b02e52 ("kbuild: add support for Clang LTO") Reported-by: Yifan Hong elsk@google.com Closes: https://github.com/ClangBuiltLinux/linux/issues/2021 Reported-by: Masami Hiramatsu mhiramat@kernel.org Closes: https://lore.kernel.org/r/20220327115526.cc4b0ff55fc53c97683c3e4d@kernel.org... Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Masahiro Yamada masahiroy@kernel.org [nathan: Address conflict in Makefile] Signed-off-by: Nathan Chancellor nathan@kernel.org
Makefile | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
This applied to 5.15.y, not 6.1.y :(
Can you rebase and resend a fix for 6.1.y?
thanks,
greg k-h