Thanks, Stanislav,
FWIW, it will probably be easier for you to just rebuild the compiler; it is an x86_64-linux-gnu -> arm-linux-gnueabihf cross. The build log is at [1].
cmake -G Ninja ../llvm/llvm '-DLLVM_ENABLE_PROJECTS=clang;lld' -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=True -DCMAKE_INSTALL_PREFIX=../llvm-install -DLLVM_TARGETS_TO_BUILD=ARM
Then compile the pre-processed source with plain -O2 or -O3 optimisation settings.
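A minimal sketch of that last step, assuming the compiler was installed into ../llvm-install (e.g. with "ninja install"). The file name fast_algorithms.i is a guess based on the fast_algorithms.s mentioned elsewhere in this thread, and the benchmark run may pass extra CPU/FPU flags that are not shown here. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: the target triple comes from the thread; the source file name
# and the absence of extra CPU/FPU flags are assumptions.
CLANG=${CLANG:-../llvm-install/bin/clang}
SRC=${SRC:-fast_algorithms.i}   # hypothetical name for the pre-processed source
BASE=${SRC%.i}

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

for opt in -O2 -O3; do
    # Assembly, for diffing against the -save-temps output.
    run "$CLANG" --target=arm-linux-gnueabihf "$opt" -S "$SRC" -o "$BASE$opt.s"
    # Optimized LLVM IR, which can be fed to llc.
    run "$CLANG" --target=arm-linux-gnueabihf "$opt" -S -emit-llvm "$SRC" -o "$BASE$opt.ll"
done
```

The -emit-llvm variant is there because the request further down the thread was for IR that can be fed to llc.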
[1] https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-a...
Regards,
-- Maxim Kuvyrkov https://www.linaro.org
On 24 Sep 2021, at 20:30, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com wrote:
[AMD Official Use Only]
I have reverted the whole change. There was yet another perf regression report.
Stas
From: Mekhanoshin, Stanislav Sent: Thursday, September 23, 2021 11:48 To: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org Cc: linaro-toolchain linaro-toolchain@lists.linaro.org Subject: RE: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
Thanks. I see the reload. There should not be extra pressure, since the whole idea is to reduce pressure. However, I see more spills in that specific file, fast_algorithms.s, if I get it right. Can I get the IR for it? Something to feed llc.
Stas
From: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org Sent: Thursday, September 23, 2021 2:31 To: Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com Cc: linaro-toolchain linaro-toolchain@lists.linaro.org Subject: Re: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
[CAUTION: External Email]
Thanks, Stanislav.
I’ve looked into the profile dumps, and 456.hmmer’s hot loop gets several additional reloads. E.g., "ldr r1, [sp, #84]" generates 203 additional samples, which translates into 20 seconds of time just for that one instruction.
See the attached profile dumps and the screenshot with the hot loop highlighted.
Maybe your patch increases register pressure too much?
Regards,
-- Maxim Kuvyrkov https://www.linaro.org
On 22 Sep 2021, at 22:35, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com wrote:
There are actually a couple of things worth trying, if that is easy:
https://reviews.llvm.org/D109077 https://reviews.llvm.org/differential/diff/374324/
Both may slightly change spill weights and thus the spilling pattern.
Stas
-----Original Message----- From: Mekhanoshin, Stanislav Sent: Wednesday, September 22, 2021 12:09 To: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org Cc: linaro-toolchain linaro-toolchain@lists.linaro.org Subject: RE: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
I assume some of the newly rematerialized instructions caused perf drops. Probably some very specific ones. I would appreciate it if you could point them out to me. In addition, I believe I would need a linked or optimized bitcode to feed into llc.
Stas
-----Original Message----- From: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org Sent: Wednesday, September 22, 2021 12:06 To: Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com Cc: linaro-toolchain linaro-toolchain@lists.linaro.org Subject: Re: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
Hi Stanislav,
That's fair; I or someone from Linaro will try to analyze this and follow up here.
On a more general note, what info would you like to see in these benchmarking regression reports?
Thanks,
-- Maxim Kuvyrkov https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.linaro...
On Sep 22, 2021, at 9:40 PM, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com wrote:
Hm... I'd really like to help, but I do not think I can do anything with megabytes of asm code that I do not understand and tons of differences in 48 asm files. What I can see there is less spilling code overall, which was the intent in the first place: hmmer has 4 fewer spill opcodes overall and sphinx has 27 fewer. I doubt I could say much more without someone pointing to the actual root cause.
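For reference, a rough way to reproduce such static spill counts from a directory of .s files is sketched below. It assumes the assembly carries clang's default verbose comments (e.g. "@ 4-byte Spill", "@ 4-byte Folded Reload"); the helper name count_spills is made up for illustration and is not necessarily how the numbers above were obtained.

```shell
#!/bin/sh
# Count spill/reload annotations across all .s files under a directory.
# Assumes LLVM's verbose-asm comments such as "4-byte Spill" or
# "4-byte Folded Reload"; count_spills is a hypothetical helper name.
count_spills() {
    # grep -h drops file names, so wc -l counts matching lines over all files.
    find "$1" -name '*.s' -exec grep -hE -- '-byte (Folded )?(Spill|Reload)' {} + 2>/dev/null \
        | wc -l | tr -d ' '
}
```

Usage would be along the lines of running count_spills once on the last-good output directory and once on the first-bad one. Note that this is a static count: it does not weight a spill or reload by how often it executes, so a lower total can still hide an extra reload in a hot loop.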
Stas
-----Original Message----- From: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org Sent: Wednesday, September 22, 2021 5:16 To: Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com Cc: linaro-toolchain linaro-toolchain@lists.linaro.org Subject: Re: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
Hi Stanislav,
Attached is a tarball with -save-temps output (pre-processed source and generated assembly) for the first-bad run (your commit) and the last-good run (the immediate parent of your commit).
-- Maxim Kuvyrkov https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.linaro...
On 20 Sep 2021, at 23:15, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com wrote:
Thanks for letting me know. Some regressions are inevitable; however, do you happen to have any analysis or dumps? I myself do not understand the ARM ISA well...
Stas
-----Original Message----- From: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org Sent: Wednesday, September 15, 2021 5:52 To: Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com Cc: linaro-toolchain linaro-toolchain@lists.linaro.org Subject: Re: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
Hi Stanislav,
FYI, your patch seems to be slowing down two of the SPEC CPU2006 tests on 32-bit ARM at the -O2 and -O3 optimization levels.
-- Maxim Kuvyrkov https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.linaro...