Thanks, Stanislav.
I’ve looked into the profile dumps, and 456.hmmer’s hot loop gets several additional reloads. For example, "ldr r1, [sp, #84]" accounts for 203 additional samples, which translates into 20 seconds of run time for that one instruction alone.
See the attached profile dumps and the screenshot with the hot loop highlighted.
Maybe your patch increases register pressure too much?
Regards,
-- Maxim Kuvyrkov https://www.linaro.org
On 22 Sep 2021, at 22:35, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com wrote:
[AMD Official Use Only]
There are actually a couple of things worth trying, if that is easy:
https://reviews.llvm.org/D109077 https://reviews.llvm.org/differential/diff/374324/
Both may slightly change the spill weights and, in turn, the spilling pattern.
Stas
-----Original Message-----
From: Mekhanoshin, Stanislav
Sent: Wednesday, September 22, 2021 12:09
To: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org
Cc: linaro-toolchain linaro-toolchain@lists.linaro.org
Subject: RE: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
I assume some of the newly rematerialized instructions caused the perf drops, probably some very specific ones. I would appreciate it if you could point them out to me. In addition, I believe I would need linked or optimized bitcode to feed into llc.
Stas
-----Original Message-----
From: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org
Sent: Wednesday, September 22, 2021 12:06
To: Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com
Cc: linaro-toolchain linaro-toolchain@lists.linaro.org
Subject: Re: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
[CAUTION: External Email]
Hi Stanislav,
That's fair; I or someone from Linaro will try to analyze this and follow up here.
On a more general note, what info would you like to see in these benchmarking regression reports?
Thanks,
-- Maxim Kuvyrkov https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.linaro...
On Sep 22, 2021, at 9:40 PM, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com wrote:
[AMD Official Use Only]
Hm... I'd really like to help, but I do not think I can do anything with megabytes of assembly that I do not understand and tons of differences across 48 asm files. What I can see there is less spilling code overall, which was the intent in the first place: hmmer has 4 fewer spill opcodes overall and sphinx has 27 fewer. I doubt I could say much more without someone pointing to the actual root cause.
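For reference, a spill-count comparison like the one above can be approximated with a short script. This is only a minimal sketch: it assumes the saved .s files carry LLVM's verbose-asm annotations such as "4-byte Spill" and "4-byte Folded Reload", and the directory names last_good and first_bad are hypothetical placeholders for wherever the two -save-temps trees are unpacked. It counts annotated instructions only, so it says nothing about how hot each spill or reload is.

```python
import re
from collections import Counter
from pathlib import Path

# Matches LLVM verbose-asm annotations like "4-byte Spill",
# "4-byte Reload", "4-byte Folded Spill", "16-byte Folded Reload".
SPILL_RE = re.compile(r"\b\d+-byte (?:Folded )?(Spill|Reload)\b")


def count_spill_comments(asm_text):
    """Count spill and reload annotations in one assembly listing."""
    counts = Counter()
    for line in asm_text.splitlines():
        match = SPILL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_in_tree(root):
    """Sum the annotation counts over every .s file below root."""
    total = Counter()
    for path in sorted(Path(root).rglob("*.s")):
        total.update(count_spill_comments(path.read_text(errors="replace")))
    return total


def diff_counts(good, bad):
    """Return bad minus good for each annotation kind."""
    return {kind: bad.get(kind, 0) - good.get(kind, 0)
            for kind in ("Spill", "Reload")}


if __name__ == "__main__":
    # Hypothetical directory names; adjust to the unpacked tarball layout.
    good = count_in_tree("last_good")
    bad = count_in_tree("first_bad")
    print("last-good:", dict(good))
    print("first-bad:", dict(bad))
    print("delta (first-bad minus last-good):", diff_counts(good, bad))
```

A static count like this can go down while run time goes up, if one of the remaining reloads lands in a hot loop.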
Stas
-----Original Message-----
From: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org
Sent: Wednesday, September 22, 2021 5:16
To: Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com
Cc: linaro-toolchain linaro-toolchain@lists.linaro.org
Subject: Re: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
[CAUTION: External Email]
Hi Stanislav,
Attached is a tarball with -save-temps output (pre-processed source and generated assembly) for the first-bad run (your commit) and the last-good run (the immediate parent of your commit).
-- Maxim Kuvyrkov https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.linaro...
On 20 Sep 2021, at 23:15, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com wrote:
[AMD Official Use Only]
Thanks for letting me know. Some regressions are inevitable; however, do you happen to have any analysis and dumps? I myself do not understand the ARM ISA well...
Stas
-----Original Message-----
From: Maxim Kuvyrkov maxim.kuvyrkov@linaro.org
Sent: Wednesday, September 15, 2021 5:52
To: Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com
Cc: linaro-toolchain linaro-toolchain@lists.linaro.org
Subject: Re: [TCWG CI] 456.hmmer slowed down by 6% after llvm: Allow rematerialization of virtual reg uses
[CAUTION: External Email]
Hi Stanislav,
FYI, your patch seems to be slowing down two of the SPEC CPU2006 tests on 32-bit ARM at the -O2 and -O3 optimization levels.
-- Maxim Kuvyrkov https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.linaro...