Hi Richard,
This is a report from Linaro’s automatic benchmarking system, which tracks upstream components and automatically bisects performance regressions down to a single commit. We have recently enabled these emails to be sent to patch authors automatically.
It appears that after your patch SPEC CPU2006’s 483.xalancbmk slowed down by 3% on arm-linux-gnueabihf when built with -O2 -flto. Any theories on what could’ve caused it?
This is one of the first reports, and we welcome feedback on improving reporting template.
Regards,
-- Maxim Kuvyrkov https://www.linaro.org
On 26 Jun 2021, at 16:24, ci_notify@linaro.org wrote:
Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tk1/gnu-master-arm-spec2k6-O2_LTO. So far, this commit has regressed CI configurations:
- tcwg_bmk_gnu_tk1/gnu-master-arm-spec2k6-O2_LTO
Culprit:
<cut> commit 32955416d8040b1fa1ba21cd4179b3264e6c5bd6 Author: Richard Biener <rguenther@suse.de> Date: Mon May 3 12:07:58 2021 +0200
Improve PHI handling in DSE
This improves handling of PHI defs when walking uses in dse_classify_store to track two PHI defs. This happens when there are CFG merges and one PHI feeds into another. If we decide to want more then using a sbitmap for this might be the way to go.
2021-05-03 Richard Biener rguenther@suse.de
* tree-ssa-dse.c (dse_classify_store): Track two PHI defs. * gcc.dg/tree-ssa/ssa-dse-42.c: New testcase. * gcc.dg/pr81192.c: Disable DSE.
</cut>
Results regressed to (for first_bad == 32955416d8040b1fa1ba21cd4179b3264e6c5bd6) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -O2_LTO_marm -- artifacts/build-32955416d8040b1fa1ba21cd4179b3264e6c5bd6/results_id: 1 # 483.xalancbmk,Xalan_base.default regressed by 103
from (for last_good == ed3c43224cc4e378dbab066122bc63536ccb1276) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -O2_LTO_marm -- artifacts/build-ed3c43224cc4e378dbab066122bc63536ccb1276/results_id: 1
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm... Results ID of last_good: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-master-arm-spec2k6-O2_LTO/458 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm... Results ID of first_bad: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-master-arm-spec2k6-O2_LTO/480 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm...
Configuration details:
Reproduce builds:
<cut> mkdir investigate-gcc-32955416d8040b1fa1ba21cd4179b3264e6c5bd6 cd investigate-gcc-32955416d8040b1fa1ba21cd4179b3264e6c5bd6
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm... --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm... --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm... --fail chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
cd gcc
# Reproduce first_bad build git checkout --detach 32955416d8040b1fa1ba21cd4179b3264e6c5bd6 ../artifacts/test.sh
# Reproduce last_good build git checkout --detach ed3c43224cc4e378dbab066122bc63536ccb1276 ../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/c...
Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm... Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-master-arm...
Full commit (up to 1000 lines):
<cut> commit 32955416d8040b1fa1ba21cd4179b3264e6c5bd6 Author: Richard Biener <rguenther@suse.de> Date: Mon May 3 12:07:58 2021 +0200
Improve PHI handling in DSE
This improves handling of PHI defs when walking uses in dse_classify_store to track two PHI defs. This happens when there are CFG merges and one PHI feeds into another. If we decide to want more then using a sbitmap for this might be the way to go.
2021-05-03 Richard Biener rguenther@suse.de
* tree-ssa-dse.c (dse_classify_store): Track two PHI defs. * gcc.dg/tree-ssa/ssa-dse-42.c: New testcase. * gcc.dg/pr81192.c: Disable DSE.
gcc/testsuite/gcc.dg/pr81192.c | 4 +++- gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-42.c | 20 ++++++++++++++++++++ gcc/tree-ssa-dse.c | 23 ++++++++++++++--------- 3 files changed, 37 insertions(+), 10 deletions(-)
diff --git a/gcc/testsuite/gcc.dg/pr81192.c b/gcc/testsuite/gcc.dg/pr81192.c index 71bbc13a0e9..6cab6056558 100644 --- a/gcc/testsuite/gcc.dg/pr81192.c +++ b/gcc/testsuite/gcc.dg/pr81192.c @@ -1,4 +1,4 @@ -/* { dg-options "-Os -fdump-tree-pre-details -fdisable-tree-evrp" } */ +/* { dg-options "-Os -fdump-tree-pre-details -fdisable-tree-evrp -fno-tree-dse" } */
/* Disable tree-evrp because the new version of evrp sees <bb 3> : @@ -16,6 +16,8 @@ produces # iftmp.2_12 = PHI <2147483647(3), iftmp.2_11(4)> which causes the situation being tested to dissapear before we get to PRE. */
+/* Likewise disable DSE which also elides the tail merging "opportunity". */
#if __SIZEOF_INT__ == 2 #define unsigned __UINT32_TYPE__ #define int __INT32_TYPE__ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-42.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-42.c new file mode 100644 index 00000000000..34108c83828 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-42.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-dse1" } */
+int a[2]; +void foo(int i, int k, int j) +{
- a[0] = i;
- if (k)
- a[0] = a[i] + k;
- else
- {
if (j)
a[1] = 1;
a[0] = a[i] + 3;
- }
- a[0] = 0;
+}
+/* The last stores to a[0] and a[1] remain. */ +/* { dg-final { scan-tree-dump-times " = " 2 "dse1" } } */ diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c index e0a944c704a..dfa6d314727 100644 --- a/gcc/tree-ssa-dse.c +++ b/gcc/tree-ssa-dse.c @@ -799,7 +799,8 @@ dse_classify_store (ao_ref *ref, gimple *stmt, return DSE_STORE_LIVE;
auto_vec<gimple *, 10> defs;
gimple *phi_def = NULL;
gimple *first_phi_def = NULL;
{ /* Limit stmt walking. */gimple *last_phi_def = NULL; FOR_EACH_IMM_USE_STMT (use_stmt, ui, defvar)
@@ -825,7 +826,9 @@ dse_classify_store (ao_ref *ref, gimple *stmt, SSA_NAME_VERSION (PHI_RESULT (use_stmt)))) { defs.safe_push (use_stmt);
phi_def = use_stmt;
if (!first_phi_def)
first_phi_def = use_stmt;
} } /* If the statement is a use the store is not dead. */last_phi_def = use_stmt;
@@ -889,6 +892,8 @@ dse_classify_store (ao_ref *ref, gimple *stmt, gimple *def = defs[i]; gimple *use_stmt; use_operand_p use_p;
tree vdef = (gimple_code (def) == GIMPLE_PHI
/* If the path to check starts with a kill we do not need to process it further. ??? With byte tracking we need only kill the bytes currently? gimple_phi_result (def) : gimple_vdef (def));
@@ -901,8 +906,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt, } /* If the path ends here we do not need to process it further. This for example happens with calls to noreturn functions. */
else if (gimple_code (def) != GIMPLE_PHI
&& has_zero_uses (gimple_vdef (def)))
not dead. */else if (has_zero_uses (vdef)) { /* But if the store is to global memory it is definitely
@@ -912,12 +916,13 @@ dse_classify_store (ao_ref *ref, gimple *stmt, } /* In addition to kills we can remove defs whose only use is another def in defs. That can only ever be PHIs of which
we track a single for simplicity reasons (we fail for multiple
PHIs anyways). We can also ignore defs that feed only into
we track two for simplicity reasons, the first and last in
{first,last}_phi_def (we fail for multiple PHIs anyways).
We can also ignore defs that feed only into already visited PHIs. */
else if (gimple_code (def) != GIMPLE_PHI
&& single_imm_use (gimple_vdef (def), &use_p, &use_stmt)
&& (use_stmt == phi_def
else if (single_imm_use (vdef, &use_p, &use_stmt)
&& (use_stmt == first_phi_def
|| use_stmt == last_phi_def || (gimple_code (use_stmt) == GIMPLE_PHI && bitmap_bit_p (visited, SSA_NAME_VERSION
</cut>