Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tx1/gnu-master-aarch64-spec2k6-O3_LTO.

So far, this commit has regressed CI configurations:
 - tcwg_bmk_gnu_tx1/gnu-master-aarch64-spec2k6-O3_LTO
Culprit:
<cut>
commit fedcf3c476aff7533741a1c61071200f0a38cf83
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Jul 8 09:52:49 2021 +0200

    tree-optimization/101373 - avoid PRE across externally throwing call

    PRE already tries to avoid hoisting possibly trapping expressions
    across calls that might not return normally but fails to consider
    const calls that throw externally.  The following fixes that
    and also plugs the hole of trapping references not pruned in case
    they are not catched by the actuall call clobbering it.

    At -Os we hit the same issue in RTL PRE and postreload-gcse has
    even more incomplete checks so the patch adjusts both of those
    as well.

    2021-07-08  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/101373
            * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping
            references when the BB may not return.
            (compute_avail): Pass in the function we're working on and
            replace cfun references with it.  Externally throwing
            const calls also possibly terminate the function.
            (pass_pre::execute): Pass down the function we're working on.
            * gcse.c (compute_hash_table_work): Externally throwing
            const/pure calls also need record_last_mem_set_info.
            * postreload-gcse.c (record_opr_changes): Looping or externally
            throwing const/pure calls also need record_last_mem_set_info.

            * g++.dg/torture/pr101373.C: New testcase, XFAILed.
            * gnat.dg/opt95.adb: Likewise.
</cut>
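The hazard described in the commit message can be modelled in portable C++. The following is only a sketch under stated assumptions: `Trap`, `checked_load`, `bar_source_order`, `bar_hoisted`, `Outcome` and `run` are hypothetical names that do not appear in GCC or in the testcase, and the trap on a null dereference is simulated with an exception so that the behaviour stays well defined. `foo` has the same shape as the function in pr101373.C. The "hoisted" variant assumes the load ends up before the call, which is the kind of motion the patch is meant to prevent.

```cpp
#include <cassert>

// Hypothetical stand-in for a hardware trap on a null dereference.
struct Trap {};

// Models a possibly trapping memory reference.
int checked_load(const int* p) {
  if (p == nullptr) throw Trap{};
  return *p;
}

// Same shape as foo in pr101373.C: touches no memory, but throws for j != 0.
int foo(int j) {
  if (j != 0) throw 1;
  return 0;
}

// Source order: the call runs before either load.
int bar_source_order(const int* p, int n) {
  int ret = 0;
  if (n) {
    foo(n);
    ret = checked_load(p);
  }
  ret += checked_load(p);
  return ret;
}

// Assumed effect of the unwanted motion: the load is placed above the call.
int bar_hoisted(const int* p, int n) {
  int loaded = checked_load(p);
  int ret = 0;
  if (n) {
    foo(n);
    ret = loaded;
  }
  ret += loaded;
  return ret;
}

enum class Outcome { Returned, ThrewInt, Trapped };

// Runs f(p, n) and reports how it ended.
template <typename F>
Outcome run(F f, const int* p, int n) {
  try {
    f(p, n);
    return Outcome::Returned;
  } catch (int) {
    return Outcome::ThrewInt;
  } catch (const Trap&) {
    return Outcome::Trapped;
  }
}

// Small self-check of the model.
void check_model() {
  // With a null pointer and n != 0 the source order exits via the exception
  // from foo, while the hoisted order hits the simulated trap first.
  assert(run(bar_source_order, nullptr, 1) == Outcome::ThrewInt);
  assert(run(bar_hoisted, nullptr, 1) == Outcome::Trapped);
}
```

In this model the two variants agree whenever the pointer is valid, and differ only when the call exits before a load that would trap, which is the situation the new testcases exercise.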
Results regressed to (for first_bad == fedcf3c476aff7533741a1c61071200f0a38cf83)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO -- artifacts/build-fedcf3c476aff7533741a1c61071200f0a38cf83/results_id: 1
# 429.mcf,mcf_base.default regressed by 106

from (for last_good == fe610051a803131822bd02a8842a67b573b8e46a)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO -- artifacts/build-fe610051a803131822bd02a8842a67b573b8e46a/results_id: 1
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar...
Results ID of last_good: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-master-aarch64-spec2k6-O3_LTO/1618
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar...
Results ID of first_bad: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-master-aarch64-spec2k6-O3_LTO/1613
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar...
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-gcc-fedcf3c476aff7533741a1c61071200f0a38cf83
cd investigate-gcc-fedcf3c476aff7533741a1c61071200f0a38cf83

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar... --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar... --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar... --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/

cd gcc

# Reproduce first_bad build
git checkout --detach fedcf3c476aff7533741a1c61071200f0a38cf83
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach fe610051a803131822bd02a8842a67b573b8e46a
../artifacts/test.sh

cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/c...
Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar...
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aar...
Full commit (up to 1000 lines):
<cut>
commit fedcf3c476aff7533741a1c61071200f0a38cf83
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Jul 8 09:52:49 2021 +0200

    tree-optimization/101373 - avoid PRE across externally throwing call

    PRE already tries to avoid hoisting possibly trapping expressions
    across calls that might not return normally but fails to consider
    const calls that throw externally.  The following fixes that
    and also plugs the hole of trapping references not pruned in case
    they are not catched by the actuall call clobbering it.

    At -Os we hit the same issue in RTL PRE and postreload-gcse has
    even more incomplete checks so the patch adjusts both of those
    as well.

    2021-07-08  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/101373
            * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping
            references when the BB may not return.
            (compute_avail): Pass in the function we're working on and
            replace cfun references with it.  Externally throwing
            const calls also possibly terminate the function.
            (pass_pre::execute): Pass down the function we're working on.
            * gcse.c (compute_hash_table_work): Externally throwing
            const/pure calls also need record_last_mem_set_info.
            * postreload-gcse.c (record_opr_changes): Looping or externally
            throwing const/pure calls also need record_last_mem_set_info.

            * g++.dg/torture/pr101373.C: New testcase, XFAILed.
            * gnat.dg/opt95.adb: Likewise.
---
 gcc/gcse.c                              |  3 ++-
 gcc/postreload-gcse.c                   |  4 +++-
 gcc/testsuite/g++.dg/torture/pr101373.C | 33 +++++++++++++++++++++++++++
 gcc/testsuite/gnat.dg/opt95.adb         | 40 +++++++++++++++++++++++++++++++++
 gcc/tree-ssa-pre.c                      | 34 +++++++++++++++++-----------
 5 files changed, 99 insertions(+), 15 deletions(-)
diff --git a/gcc/gcse.c b/gcc/gcse.c
index ecf7e51aac5..ccd33664af5 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -1537,7 +1537,8 @@ compute_hash_table_work (struct gcse_hash_table_d *table)
                 record_last_reg_set_info (insn, regno);

               if (! RTL_CONST_OR_PURE_CALL_P (insn)
-                  || RTL_LOOPING_CONST_OR_PURE_CALL_P (insn))
+                  || RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
+                  || can_throw_external (insn))
                 record_last_mem_set_info (insn);
             }

diff --git a/gcc/postreload-gcse.c b/gcc/postreload-gcse.c
index 0b28247e299..6c95d09a1e5 100644
--- a/gcc/postreload-gcse.c
+++ b/gcc/postreload-gcse.c
@@ -779,7 +779,9 @@ record_opr_changes (rtx_insn *insn)
       EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
         record_last_reg_set_info_regno (insn, regno);

-      if (! RTL_CONST_OR_PURE_CALL_P (insn))
+      if (! RTL_CONST_OR_PURE_CALL_P (insn)
+          || RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
+          || can_throw_external (insn))
         record_last_mem_set_info (insn);
     }
 }
diff --git a/gcc/testsuite/g++.dg/torture/pr101373.C b/gcc/testsuite/g++.dg/torture/pr101373.C
new file mode 100644
index 00000000000..f8c809739e2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr101373.C
@@ -0,0 +1,33 @@
+// { dg-do run }
+// { dg-xfail-run-if "PR100409" { *-*-* } }
+
+int __attribute__((const,noipa)) foo (int j)
+{
+  if (j != 0)
+    throw 1;
+  return 0;
+}
+
+int __attribute__((noipa)) bar (int *p, int n)
+{
+  int ret = 0;
+  if (n)
+    {
+      foo (n);
+      ret = *p;
+    }
+  ret += *p;
+  return ret;
+}
+
+int main()
+{
+  try
+    {
+      return bar (nullptr, 1);
+    }
+  catch (...)
+    {
+      return 0;
+    }
+}
diff --git a/gcc/testsuite/gnat.dg/opt95.adb b/gcc/testsuite/gnat.dg/opt95.adb
new file mode 100644
index 00000000000..2c72582b3f1
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/opt95.adb
@@ -0,0 +1,40 @@
+-- { dg-do run }
+-- { dg-options "-O2 -gnatp" }
+
+procedure Opt95 is
+
+   function Foo (J : Integer) return Integer;
+   pragma Pure_Function (Foo);
+   pragma Machine_Attribute (Foo, "noipa");
+
+   function Foo (J : Integer) return Integer is
+   begin
+      if J /= 0 then
+         raise Constraint_Error;
+      end if;
+      return 0;
+   end;
+
+   function Bar (A : access Integer; N : Integer) return Integer;
+   pragma Machine_Attribute (Bar, "noipa");
+
+   function Bar (A : access Integer; N : Integer) return Integer is
+      Ret : Integer := 0;
+      Ret2 : Integer := 0;
+   begin
+      if N /= 0 then
+         Ret2 := Foo (N);
+         Ret := A.all;
+      end if;
+      Ret := Ret + A.all;
+      return Ret + Ret2;
+   end;
+
+   V : Integer;
+   pragma Volatile (V);
+
+begin
+   V := Bar (null, 1);
+exception
+   when Constraint_Error => null;
+end;
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 69141c2f0c9..aa5244e678c 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -2071,6 +2071,13 @@ prune_clobbered_mems (bitmap_set_t set, basic_block block)
                       && value_dies_in_block_x (expr, block))))
                 to_remove = i;
             }
+          /* If the REFERENCE may trap make sure the block does not contain
+             a possible exit point.
+             ???  This is overly conservative if we translate AVAIL_OUT
+             as the available expression might be after the exit point.  */
+          if (BB_MAY_NOTRETURN (block)
+              && vn_reference_may_trap (ref))
+            to_remove = i;
         }
       else if (expr->kind == NARY)
         {
@@ -3860,7 +3867,7 @@ insert (void)
    AVAIL_OUT[BLOCK] = AVAIL_IN[BLOCK] U PHI_GEN[BLOCK] U TMP_GEN[BLOCK].  */

 static void
-compute_avail (void)
+compute_avail (function *fun)
 {

   basic_block block, son;
@@ -3871,7 +3878,7 @@ compute_avail (void)

   /* We pretend that default definitions are defined in the entry block.
      This includes function arguments and the static chain decl.  */
-  FOR_EACH_SSA_NAME (i, name, cfun)
+  FOR_EACH_SSA_NAME (i, name, fun)
     {
       pre_expr e;
       if (!SSA_NAME_IS_DEFAULT_DEF (name)
@@ -3881,31 +3888,31 @@ compute_avail (void)

       e = get_or_alloc_expr_for_name (name);
       add_to_value (get_expr_value_id (e), e);
-      bitmap_insert_into_set (TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (cfun)), e);
-      bitmap_value_insert_into_set (AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+      bitmap_insert_into_set (TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (fun)), e);
+      bitmap_value_insert_into_set (AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (fun)),
                                     e);
     }

   if (dump_file && (dump_flags & TDF_DETAILS))
     {
-      print_bitmap_set (dump_file, TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+      print_bitmap_set (dump_file, TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (fun)),
                         "tmp_gen", ENTRY_BLOCK);
-      print_bitmap_set (dump_file, AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+      print_bitmap_set (dump_file, AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (fun)),
                         "avail_out", ENTRY_BLOCK);
     }

   /* Allocate the worklist.  */
-  worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
+  worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (fun));

   /* Seed the algorithm by putting the dominator children of the entry
      block on the worklist.  */
-  for (son = first_dom_son (CDI_DOMINATORS, ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  for (son = first_dom_son (CDI_DOMINATORS, ENTRY_BLOCK_PTR_FOR_FN (fun));
        son;
        son = next_dom_son (CDI_DOMINATORS, son))
     worklist[sp++] = son;

-  BB_LIVE_VOP_ON_EXIT (ENTRY_BLOCK_PTR_FOR_FN (cfun))
-    = ssa_default_def (cfun, gimple_vop (cfun));
+  BB_LIVE_VOP_ON_EXIT (ENTRY_BLOCK_PTR_FOR_FN (fun))
+    = ssa_default_def (fun, gimple_vop (fun));

   /* Loop until the worklist is empty.  */
   while (sp)
@@ -3970,7 +3977,8 @@ compute_avail (void)
                      before it.  */
                   int flags = gimple_call_flags (stmt);
                   if (!(flags & ECF_CONST)
-                      || (flags & ECF_LOOPING_CONST_OR_PURE))
+                      || (flags & ECF_LOOPING_CONST_OR_PURE)
+                      || stmt_can_throw_external (fun, stmt))
                     BB_MAY_NOTRETURN (block) = 1;
                 }

@@ -3987,7 +3995,7 @@ compute_avail (void)
             BB_LIVE_VOP_ON_EXIT (block) = gimple_vdef (stmt);

           if (gimple_has_side_effects (stmt)
-              || stmt_could_throw_p (cfun, stmt)
+              || stmt_could_throw_p (fun, stmt)
               || is_gimple_debug (stmt))
             continue;

@@ -4384,7 +4392,7 @@ pass_pre::execute (function *fun)
      we require AVAIL.  */
   if (n_basic_blocks_for_fn (fun) < 4000)
     {
-      compute_avail ();
+      compute_avail (fun);
       compute_antic ();
       insert ();
     }
</cut>
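The change to the BB_MAY_NOTRETURN condition in compute_avail can be restated as a standalone predicate. This is a simplified sketch, not GCC code: `kEcfConst`, `kEcfLoopingConstOrPure`, `CallSummary`, `may_not_return_old` and `may_not_return_new` are hypothetical names, the flag values are arbitrary stand-ins for ECF_CONST and ECF_LOOPING_CONST_OR_PURE, and the `can_throw_external` field merely models the result of stmt_can_throw_external (fun, stmt).

```cpp
#include <cassert>

// Hypothetical flag bits, not the real ECF_* values.
constexpr unsigned kEcfConst = 1u << 0;
constexpr unsigned kEcfLoopingConstOrPure = 1u << 1;

// Hypothetical summary of one call statement.
struct CallSummary {
  unsigned flags;
  bool can_throw_external;  // models stmt_can_throw_external (fun, stmt)
};

// Condition before the patch: only non-const or looping calls count.
bool may_not_return_old(const CallSummary& c) {
  return !(c.flags & kEcfConst) || (c.flags & kEcfLoopingConstOrPure);
}

// Condition after the patch: an externally throwing call counts as well.
bool may_not_return_new(const CallSummary& c) {
  return may_not_return_old(c) || c.can_throw_external;
}

// Small self-check of the two predicates.
void check_predicates() {
  const CallSummary throwing_const{kEcfConst, true};
  assert(!may_not_return_old(throwing_const));  // missed before the patch
  assert(may_not_return_new(throwing_const));   // caught after the patch
}
```

Under these assumptions the two predicates differ only for a const, non-looping call that can throw externally, which matches the case the commit message says was previously not considered.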