Hi Dmitriy,
Linaro Benchmarking CI has flagged several interesting code-speed and code-size regressions for your patch -- see [1].
In particular, could you check whether the regressions below can be avoided:
- grew in size by 21% - 473.astar:[.] _ZN7way2obj12releasepointEii
- slowed down by 61% - 505.mcf_r:[.] price_out_impl
Both of these are for 32-bit ARM, but AArch64 also has code-speed and code-size regressions.
Let me know if you need any assistance in reproducing these problems.
[1] https://linaro.atlassian.net/browse/LLVM-1001
Kind regards,
--
Maxim Kuvyrkov
https://www.linaro.org
Begin forwarded message:
From: ci_notify@linaro.org
Subject: [Linaro-TCWG-CI] llvmorg-18-init-7933-ge13bed4c5f35: slowed down by 6% - 464.h264ref on aarch64 O2
Date: October 8, 2023 at 04:26:39 GMT+4
To: maxim.kuvyrkov@linaro.org
Reply-To: linaro-toolchain@lists.linaro.org
Dear contributor, our automatic CI has detected problems related to your patch(es). Please find some details below. If you have any questions, please follow up on the linaro-toolchain@lists.linaro.org mailing list, Libera's #linaro-tcwg channel, or ping your favourite Linaro toolchain developer on the usual project channel.
In CI config tcwg_bmk-code_speed-spec2k6/llvm-aarch64-master-O2 after:
| commit llvmorg-18-init-7933-ge13bed4c5f35
| Author: Dmitriy Smirnov dmitriy.smirnov@arm.com
| Date: Fri Oct 6 11:15:00 2023 +0100
|
| [PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP
|
| This patch tries to canonicalise add + gep to gep + gep.
|
| Co-authored-by: Paul Walker paul.walker@arm.com
|
| Reviewed By: nikic
| ... 2 lines of the commit log omitted.
the following benchmarks slowed down by more than 3%:
- slowed down by 6% - 464.h264ref - from 11126 to 11766 perf samples
the following hot functions slowed down by more than 15% (but their benchmarks slowed down by less than 3%):
- slowed down by 44% - 464.h264ref:[.] FastFullPelBlockMotionSearch - from 1531 to 2206 perf samples
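As a quick sanity check on the figures above, the percentages can be recomputed from the quoted perf sample counts. This is a minimal sketch, assuming the reported slowdown is the relative increase in samples over the baseline, rounded to the nearest whole percent; the helper name `slowdown_percent` is hypothetical and not part of the CI scripts.

```python
def slowdown_percent(before: int, after: int) -> float:
    """Relative increase in perf samples, as a percentage of the baseline."""
    if before <= 0:
        raise ValueError("baseline sample count must be positive")
    return (after - before) / before * 100.0


# Sample counts quoted in the report above.
h264ref = slowdown_percent(11126, 11766)
motion_search = slowdown_percent(1531, 2206)

print(f"464.h264ref: {h264ref:.1f}%")
print(f"FastFullPelBlockMotionSearch: {motion_search:.1f}%")
```

Under that assumption, both values round to the percentages the report states (6% and 44%).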
The reproducer instructions below can be used to rebuild both the "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggering benchmarking jobs if you don't have access to Linaro TCWG CI.
The configuration of this build is:
- Benchmark:
- Toolchain: Clang + Glibc + LLVM Linker
- Version: all components were built from their tip of trunk
- Target: aarch64-linux-gnu
- Compiler flags: O2
- Hardware: NVidia TX1 4x Cortex-A57
This benchmarking CI is a work in progress, and we welcome feedback and suggestions at linaro-toolchain@lists.linaro.org . Our improvement plans include adding support for SPEC CPU2017 benchmarks and providing "perf report/annotate" data behind these reports.
-----------------8<--------------------------8<--------------------------8<--------------------------
The information below can be used to reproduce a debug environment:
Current build : https://ci.linaro.org/job/tcwg_bmk-code_speed-spec2k6--llvm-aarch64-master-O...
Reference build : https://ci.linaro.org/job/tcwg_bmk-code_speed-spec2k6--llvm-aarch64-master-O...
Reproduce last good and first bad builds: https://git-us.linaro.org/toolchain/ci/interesting-commits.git/plain/llvm/sh...
Full commit : https://github.com/llvm/llvm-project/commit/e13bed4c5f3544c076ce57e36d9a11ee...
Latest bug report status : https://linaro.atlassian.net/browse/LLVM-1001
List of configurations that regressed due to this commit :
- tcwg_bmk-code_speed-spec2k6
** llvm-aarch64-master-O2
*** slowed down by 6% - 464.h264ref
*** https://git-us.linaro.org/toolchain/ci/interesting-commits.git/plain/llvm/sh...
*** https://ci.linaro.org/job/tcwg_bmk-code_speed-spec2k6--llvm-aarch64-master-O...