linaro-toolchain July 2021

linaro-toolchain@lists.linaro.org

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O3 - Build # 5 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3

Culprit:
<cut>
commit 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
Author: Roman Lebedev <lebedev.ri(a)gmail.com>
Date:   Mon Oct 12 22:19:17 2020 +0300

    Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"

    This relands commit 1c021c64caef83cccb719c9bf0a2554faa6563af
    which was reverted in commit 17cec6a11a12f815052d56a17ef738cf246a2d9a
    because an assertion was being triggered, since `BuildConstantFromSCEV()`
    wasn't updated to handle the case where the constant we want to truncate
    is actually a pointer.

    I was unsuccessful in coming up with a test case where we'd end there
    with constant zext/sext of a pointer, so i didn't handle those cases
    there until there is a test case.

    Original commit message:

    While we indeed can't treat them as no-ops, i believe we can/should
    do better than just modelling them as `unknown`. `inttoptr` story
    is complicated, but for `ptrtoint`, it seems straight-forward
    to model it just as a zext-or-trunc of unknown.

    This may be important now that we track towards
    making inttoptr/ptrtoint casts not no-op,
    and towards preventing folding them into loads/etc
    (see D88979/D88789/D88788)

    Reviewed By: mkazantsev

    Differential Revision: https://reviews.llvm.org/D88806
</cut>

Results regressed to (for first_bad == 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# build_llvm true: -3
# true: 0
# benchmark -O3 -- artifacts/build-1fb610429308a7c29c5065f5cc35dcc3fd69c8b1/results_id: 1
# 400.perlbench,perlbench_base.default regressed by 104

from (for last_good == 73818f450e3a90fc89eca143ee30777ed7e660e9)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# build_llvm true: -3
# true: 0
# benchmark -O3 -- artifacts/build-73818f450e3a90fc89eca143ee30777ed7e660e9/results_id: 1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3/1768
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3/1778
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-llvm-1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
cd investigate-llvm-1fb610429308a7c29c5065f5cc35dcc3fd69c8b1

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 73818f450e3a90fc89eca143ee30777ed7e660e9
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Full commit (up to 1000 lines):
<cut>
commit 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
Author: Roman Lebedev <lebedev.ri(a)gmail.com>
Date:   Mon Oct 12 22:19:17 2020 +0300

    Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"

    This relands commit 1c021c64caef83cccb719c9bf0a2554faa6563af
    which was reverted in commit 17cec6a11a12f815052d56a17ef738cf246a2d9a
    because an assertion was being triggered, since `BuildConstantFromSCEV()`
    wasn't updated to handle the case where the constant we want to truncate
    is actually a pointer.

    I was unsuccessful in coming up with a test case where we'd end there
    with constant zext/sext of a pointer, so i didn't handle those cases
    there until there is a test case.

    Original commit message:

    While we indeed can't treat them as no-ops, i believe we can/should
    do better than just modelling them as `unknown`. `inttoptr` story
    is complicated, but for `ptrtoint`, it seems straight-forward
    to model it just as a zext-or-trunc of unknown.

    This may be important now that we track towards
    making inttoptr/ptrtoint casts not no-op,
    and towards preventing folding them into loads/etc
    (see D88979/D88789/D88788)

    Reviewed By: mkazantsev

    Differential Revision: https://reviews.llvm.org/D88806
---
 llvm/lib/Analysis/ScalarEvolution.cpp              |  50 ++++++--
 llvm/lib/Transforms/Utils/SimplifyIndVar.cpp       |   2 +-
 .../add-expr-pointer-operand-sorting.ll            |   4 +-
 .../Analysis/ScalarEvolution/no-wrap-add-exprs.ll  |   4 +-
 .../ScalarEvolution/ptrtoint-constantexpr-loop.ll  | 130 ++++++++-------------
 llvm/test/Analysis/ScalarEvolution/ptrtoint.ll     |  60 +++++-----
 llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll        |   4 +-
 llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll      |   4 +-
 .../IndVarSimplify/2011-11-01-lftrptr.ll           |  16 +--
 .../Isl/CodeGen/scev_looking_through_bitcasts.ll   |   3 +-
 10 files changed, 140 insertions(+), 137 deletions(-)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 1d3e26b93cb6..74bffc0facdb 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -3505,15 +3505,15 @@ const SCEV *ScalarEvolution::getUMinExpr(SmallVectorImpl<const SCEV *> &Ops) {
 }
 
 const SCEV *ScalarEvolution::getSizeOfExpr(Type *IntTy, Type *AllocTy) {
-  // We can bypass creating a target-independent
-  // constant expression and then folding it back into a ConstantInt.
-  // This is just a compile-time optimization.
   if (isa<ScalableVectorType>(AllocTy)) {
     Constant *NullPtr = Constant::getNullValue(AllocTy->getPointerTo());
     Constant *One = ConstantInt::get(IntTy, 1);
     Constant *GEP = ConstantExpr::getGetElementPtr(AllocTy, NullPtr, One);
-    return getSCEV(ConstantExpr::getPtrToInt(GEP, IntTy));
+    return getUnknown(ConstantExpr::getPtrToInt(GEP, IntTy));
   }
+  // We can bypass creating a target-independent
+  // constant expression and then folding it back into a ConstantInt.
+  // This is just a compile-time optimization.
   return getConstant(IntTy, getDataLayout().getTypeAllocSize(AllocTy));
 }
 
@@ -6301,6 +6301,36 @@ const SCEV *ScalarEvolution::createSCEV(Value *V) {
       return getSCEV(U->getOperand(0));
     break;
 
+  case Instruction::PtrToInt: {
+    // It's tempting to handle inttoptr and ptrtoint as no-ops,
+    // however this can lead to pointer expressions which cannot safely be
+    // expanded to GEPs because ScalarEvolution doesn't respect
+    // the GEP aliasing rules when simplifying integer expressions.
+    //
+    // However, given
+    //   %x = ???
+    //   %y = ptrtoint %x
+    //   %z = ptrtoint %x
+    // it is safe to say that %y and %z are the same thing.
+    //
+    // So instead of modelling the cast itself as unknown,
+    // since the casts are transparent within SCEV,
+    // we can at least model the casts original value as unknow instead.
+
+    // BUT, there's caveat. If we simply model %x as unknown, unrelated uses
+    // of %x will also see it as unknown, which is obviously bad.
+    // So we can only do this iff %x would be modelled as unknown anyways.
+    auto *OpSCEV = getSCEV(U->getOperand(0));
+    if (isa<SCEVUnknown>(OpSCEV))
+      return getTruncateOrZeroExtend(OpSCEV, U->getType());
+    // If we can model the operand, however, we must fallback to modelling
+    // the whole cast as unknown instead.
+    LLVM_FALLTHROUGH;
+  }
+  case Instruction::IntToPtr:
+    // We can't do this for inttoptr at all, however.
+    return getUnknown(V);
+
   case Instruction::SDiv:
     // If both operands are non-negative, this is just an udiv.
     if (isKnownNonNegative(getSCEV(U->getOperand(0))) &&
@@ -6315,11 +6345,6 @@ const SCEV *ScalarEvolution::createSCEV(Value *V) {
       return getURemExpr(getSCEV(U->getOperand(0)), getSCEV(U->getOperand(1)));
     break;
 
-  // It's tempting to handle inttoptr and ptrtoint as no-ops, however this can
-  // lead to pointer expressions which cannot safely be expanded to GEPs,
-  // because ScalarEvolution doesn't respect the GEP aliasing rules when
-  // simplifying integer expressions.
-
   case Instruction::GetElementPtr:
     return createNodeForGEP(cast<GEPOperator>(U));
 
@@ -7974,8 +7999,11 @@ static Constant *BuildConstantFromSCEV(const SCEV *V) {
   }
   case scTruncate: {
     const SCEVTruncateExpr *ST = cast<SCEVTruncateExpr>(V);
-    if (Constant *CastOp = BuildConstantFromSCEV(ST->getOperand()))
-      return ConstantExpr::getTrunc(CastOp, ST->getType());
+    if (Constant *CastOp = BuildConstantFromSCEV(ST->getOperand())) {
+      if (!CastOp->getType()->isPointerTy())
+        return ConstantExpr::getTrunc(CastOp, ST->getType());
+      return ConstantExpr::getPtrToInt(CastOp, ST->getType());
+    }
     break;
   }
   case scAddExpr: {
diff --git a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
index 2d71b0fff889..3e280a66175c 100644
--- a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
@@ -427,7 +427,7 @@ static bool willNotOverflow(ScalarEvolution *SE, Instruction::BinaryOps BinOp,
                            : &ScalarEvolution::getZeroExtendExpr;
 
   // Check ext(LHS op RHS) == ext(LHS) op ext(RHS)
-  auto *NarrowTy = cast<IntegerType>(LHS->getType());
+  auto *NarrowTy = cast<IntegerType>(SE->getEffectiveSCEVType(LHS->getType()));
   auto *WideTy =
     IntegerType::get(NarrowTy->getContext(), NarrowTy->getBitWidth() * 2);
 
diff --git a/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll b/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll
index 93a3bf4d4c37..e798e2715ba1 100644
---
a/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll +++ b/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll @@ -33,9 +33,9 @@ define i32 @d(i32 %base) { ; CHECK-NEXT: %1 = load i32*, i32** @c, align 8 ; CHECK-NEXT: --> %1 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %sub.ptr.lhs.cast = ptrtoint i32* %1 to i64 -; CHECK-NEXT: --> %sub.ptr.lhs.cast U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } +; CHECK-NEXT: --> %1 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, ptrtoint ([1 x i32]* @b to i64) -; CHECK-NEXT: --> ((-1 * ptrtoint ([1 x i32]* @b to i64)) + %sub.ptr.lhs.cast) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } +; CHECK-NEXT: --> ((-1 * @b) + %1) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %sub.ptr.div = sdiv exact i64 %sub.ptr.sub, 4 ; CHECK-NEXT: --> %sub.ptr.div U: full-set S: [-2305843009213693952,2305843009213693952) Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %arrayidx1 = getelementptr inbounds [1 x i8], [1 x i8]* %arrayidx, i64 0, i64 %sub.ptr.div diff --git a/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll b/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll index 5a7bb3c9e5cd..eb669cab0c79 100644 --- a/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll +++ b/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll @@ -170,14 +170,14 @@ define void @f3(i8* %x_addr, i8* %y_addr, i32* %tmp_addr) { %int5 = add i32 %int0, 5 %int.zext = zext i32 %int5 to i64 ; CHECK: %int.zext = zext i32 %int5 to i64 -; CHECK-NEXT: --> (1 + (zext i32 (4 + %int0) to i64))<nuw><nsw> U: [1,4294967294) S: [1,4294967297) +; CHECK-NEXT: --> (1 + (zext i32 (4 + (trunc [16 x i8]* @z_addr to i32)) to i64))<nuw><nsw> U: 
[1,4294967294) S: [1,4294967297) %ptr_noalign = bitcast [16 x i8]* @z_addr_noalign to i8* %int0_na = ptrtoint i8* %ptr_noalign to i32 %int5_na = add i32 %int0_na, 5 %int.zext_na = zext i32 %int5_na to i64 ; CHECK: %int.zext_na = zext i32 %int5_na to i64 -; CHECK-NEXT: --> (zext i32 (5 + %int0_na) to i64) U: [0,4294967296) S: [0,4294967296) +; CHECK-NEXT: --> (zext i32 (5 + (trunc [16 x i8]* @z_addr_noalign to i32)) to i64) U: [0,4294967296) S: [0,4294967296) %tmp = load i32, i32* %tmp_addr %mul = and i32 %tmp, -4 diff --git a/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll b/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll index 8cfa041e7552..d0ead6028071 100644 --- a/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll +++ b/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll @@ -11,48 +11,31 @@ @global = external hidden global [0 x i8] define hidden i32* @i64(i8* %arg, i32* %arg10) { -; PTR64_IDX64-LABEL: 'i64' -; PTR64_IDX64-NEXT: Classifying expressions for: @i64 -; PTR64_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR64_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 -; 
PTR64_IDX64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: Determining loop execution counts for: @i64 -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. -; -; PTR64_IDX32-LABEL: 'i64' -; PTR64_IDX32-NEXT: Classifying expressions for: @i64 -; PTR64_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR64_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 -; PTR64_IDX32-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: Determining loop execution counts for: @i64 -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
+; X64-LABEL: 'i64' +; X64-NEXT: Classifying expressions for: @i64 +; X64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] +; X64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 +; X64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } +; X64-NEXT: %tmp18 = add i32 %tmp, 2 +; X64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: Determining loop execution counts for: @i64 +; X64-NEXT: Loop %bb11: Unpredictable backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
; ; PTR32_IDX32-LABEL: 'i64' ; PTR32_IDX32-NEXT: Classifying expressions for: @i64 ; PTR32_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR32_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -67,9 +50,9 @@ define hidden i32* @i64(i8* %arg, i32* %arg10) { ; PTR32_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR32_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: [0,8589934591) S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: 
(@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: [0,8589934591) S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 @@ -103,9 +86,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR64_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR64_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR64_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR64_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX64-NEXT: %tmp14 = load i32, i32* 
%tmp13, align 4 ; PTR64_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR64_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 @@ -120,9 +103,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR64_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR64_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR64_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR64_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR64_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR64_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -137,9 +120,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR32_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR32_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: 
Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -154,9 +137,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR32_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR32_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: 
((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 @@ -185,48 +168,31 @@ bb17: ; preds = %bb11 br label %bb11 } define hidden i32* @i64_to_i128(i8* %arg, i32* %arg10) { -; PTR64_IDX64-LABEL: 'i64_to_i128' -; PTR64_IDX64-NEXT: Classifying expressions for: @i64_to_i128 -; PTR64_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR64_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 -; PTR64_IDX64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: Determining loop execution counts for: @i64_to_i128 -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
-; -; PTR64_IDX32-LABEL: 'i64_to_i128' -; PTR64_IDX32-NEXT: Classifying expressions for: @i64_to_i128 -; PTR64_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR64_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 -; PTR64_IDX32-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: Determining loop execution counts for: @i64_to_i128 -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
+; X64-LABEL: 'i64_to_i128' +; X64-NEXT: Classifying expressions for: @i64_to_i128 +; X64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] +; X64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 +; X64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } +; X64-NEXT: %tmp18 = add i32 %tmp, 2 +; X64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: Determining loop execution counts for: @i64_to_i128 +; X64-NEXT: Loop %bb11: Unpredictable backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
; ; PTR32_IDX32-LABEL: 'i64_to_i128' ; PTR32_IDX32-NEXT: Classifying expressions for: @i64_to_i128 ; PTR32_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR32_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -241,9 +207,9 @@ define hidden i32* @i64_to_i128(i8* %arg, i32* %arg10) { ; PTR32_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR32_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: [0,8589934591) S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; 
PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: [0,8589934591) S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 diff --git a/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll b/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll index e3e9330e241f..ac08fb24775e 100644 --- a/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll +++ b/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll @@ -16,25 +16,25 @@ define void @ptrtoint(i8* %in, i64* %out0, i32* %out1, i16* %out2, i128* %out3) ; X64-LABEL: 'ptrtoint' ; X64-NEXT: Classifying expressions for: @ptrtoint ; X64-NEXT: %p0 = ptrtoint i8* %in to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: %p1 = ptrtoint i8* %in to i32 -; X64-NEXT: --> %p1 U: full-set S: full-set +; X64-NEXT: --> (trunc i8* %in to i32) U: full-set S: full-set ; X64-NEXT: %p2 = ptrtoint i8* %in to i16 -; X64-NEXT: --> %p2 U: full-set S: full-set +; X64-NEXT: --> (trunc i8* %in to i16) U: full-set S: full-set ; X64-NEXT: %p3 = ptrtoint i8* %in to i128 -; X64-NEXT: --> %p3 U: [0,18446744073709551616) S: [-18446744073709551616,18446744073709551616) +; X64-NEXT: --> (zext i8* %in to i128) U: [0,18446744073709551616) S: [0,18446744073709551616) ; X64-NEXT: Determining loop execution counts for: @ptrtoint ; ; X32-LABEL: 'ptrtoint' ; X32-NEXT: Classifying expressions for: 
@ptrtoint ; X32-NEXT: %p0 = ptrtoint i8* %in to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: %p1 = ptrtoint i8* %in to i32 -; X32-NEXT: --> %p1 U: full-set S: full-set +; X32-NEXT: --> %in U: full-set S: full-set ; X32-NEXT: %p2 = ptrtoint i8* %in to i16 -; X32-NEXT: --> %p2 U: full-set S: full-set +; X32-NEXT: --> (trunc i8* %in to i16) U: full-set S: full-set ; X32-NEXT: %p3 = ptrtoint i8* %in to i128 -; X32-NEXT: --> %p3 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in to i128) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint ; %p0 = ptrtoint i8* %in to i64 @@ -53,25 +53,25 @@ define void @ptrtoint_as1(i8 addrspace(1)* %in, i64* %out0, i32* %out1, i16* %ou ; X64-LABEL: 'ptrtoint_as1' ; X64-NEXT: Classifying expressions for: @ptrtoint_as1 ; X64-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: %p1 = ptrtoint i8 addrspace(1)* %in to i32 -; X64-NEXT: --> %p1 U: full-set S: full-set +; X64-NEXT: --> (trunc i8 addrspace(1)* %in to i32) U: full-set S: full-set ; X64-NEXT: %p2 = ptrtoint i8 addrspace(1)* %in to i16 -; X64-NEXT: --> %p2 U: full-set S: full-set +; X64-NEXT: --> (trunc i8 addrspace(1)* %in to i16) U: full-set S: full-set ; X64-NEXT: %p3 = ptrtoint i8 addrspace(1)* %in to i128 -; X64-NEXT: --> %p3 U: [0,18446744073709551616) S: [-18446744073709551616,18446744073709551616) +; X64-NEXT: --> (zext i8 addrspace(1)* %in to i128) U: [0,18446744073709551616) S: [0,18446744073709551616) ; X64-NEXT: Determining loop execution counts for: @ptrtoint_as1 ; ; X32-LABEL: 'ptrtoint_as1' ; X32-NEXT: Classifying expressions for: @ptrtoint_as1 ; X32-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8 
addrspace(1)* %in to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: %p1 = ptrtoint i8 addrspace(1)* %in to i32 -; X32-NEXT: --> %p1 U: full-set S: full-set +; X32-NEXT: --> %in U: full-set S: full-set ; X32-NEXT: %p2 = ptrtoint i8 addrspace(1)* %in to i16 -; X32-NEXT: --> %p2 U: full-set S: full-set +; X32-NEXT: --> (trunc i8 addrspace(1)* %in to i16) U: full-set S: full-set ; X32-NEXT: %p3 = ptrtoint i8 addrspace(1)* %in to i128 -; X32-NEXT: --> %p3 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8 addrspace(1)* %in to i128) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_as1 ; %p0 = ptrtoint i8 addrspace(1)* %in to i64 @@ -92,7 +92,7 @@ define void @ptrtoint_of_bitcast(i8* %in, i64* %out0) { ; X64-NEXT: %in_casted = bitcast i8* %in to float* ; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: %p0 = ptrtoint float* %in_casted to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: Determining loop execution counts for: @ptrtoint_of_bitcast ; ; X32-LABEL: 'ptrtoint_of_bitcast' @@ -100,7 +100,7 @@ define void @ptrtoint_of_bitcast(i8* %in, i64* %out0) { ; X32-NEXT: %in_casted = bitcast i8* %in to float* ; X32-NEXT: --> %in U: full-set S: full-set ; X32-NEXT: %p0 = ptrtoint float* %in_casted to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_bitcast ; %in_casted = bitcast i8* %in to float* @@ -116,7 +116,7 @@ define void @ptrtoint_of_addrspacecast(i8* %in, i64* %out0) { ; X64-NEXT: %in_casted = addrspacecast i8* %in to i8 addrspace(1)* ; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in_casted to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: Determining loop 
execution counts for: @ptrtoint_of_addrspacecast ; ; X32-LABEL: 'ptrtoint_of_addrspacecast' @@ -124,7 +124,7 @@ define void @ptrtoint_of_addrspacecast(i8* %in, i64* %out0) { ; X32-NEXT: %in_casted = addrspacecast i8* %in to i8 addrspace(1)* ; X32-NEXT: --> %in_casted U: full-set S: full-set ; X32-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in_casted to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8 addrspace(1)* %in_casted to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_addrspacecast ; %in_casted = addrspacecast i8* %in to i8 addrspace(1)* @@ -140,7 +140,7 @@ define void @ptrtoint_of_inttoptr(i64 %in, i64* %out0) { ; X64-NEXT: %in_casted = inttoptr i64 %in to i8* ; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: %p0 = ptrtoint i8* %in_casted to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: Determining loop execution counts for: @ptrtoint_of_inttoptr ; ; X32-LABEL: 'ptrtoint_of_inttoptr' @@ -148,7 +148,7 @@ define void @ptrtoint_of_inttoptr(i64 %in, i64* %out0) { ; X32-NEXT: %in_casted = inttoptr i64 %in to i8* ; X32-NEXT: --> %in_casted U: full-set S: full-set ; X32-NEXT: %p0 = ptrtoint i8* %in_casted to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in_casted to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_inttoptr ; %in_casted = inttoptr i64 %in to i8* @@ -197,11 +197,17 @@ define void @ptrtoint_of_nullptr(i64* %out0) { ; A constant inttoptr argument of an ptrtoint is still bad. 
define void @ptrtoint_of_constantexpr_inttoptr(i64* %out0) { -; ALL-LABEL: 'ptrtoint_of_constantexpr_inttoptr' -; ALL-NEXT: Classifying expressions for: @ptrtoint_of_constantexpr_inttoptr -; ALL-NEXT: %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 -; ALL-NEXT: --> %p0 U: [42,43) S: [-64,64) -; ALL-NEXT: Determining loop execution counts for: @ptrtoint_of_constantexpr_inttoptr +; X64-LABEL: 'ptrtoint_of_constantexpr_inttoptr' +; X64-NEXT: Classifying expressions for: @ptrtoint_of_constantexpr_inttoptr +; X64-NEXT: %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 +; X64-NEXT: --> inttoptr (i64 42 to i8*) U: [42,43) S: [-64,64) +; X64-NEXT: Determining loop execution counts for: @ptrtoint_of_constantexpr_inttoptr +; +; X32-LABEL: 'ptrtoint_of_constantexpr_inttoptr' +; X32-NEXT: Classifying expressions for: @ptrtoint_of_constantexpr_inttoptr +; X32-NEXT: %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 +; X32-NEXT: --> (zext i8* inttoptr (i64 42 to i8*) to i64) U: [42,43) S: [0,4294967296) +; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_constantexpr_inttoptr ; %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 store i64 %p0, i64* %out0 diff --git a/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll b/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll index 564328d99998..e73397214475 100644 --- a/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll +++ b/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll @@ -186,7 +186,9 @@ define linkonce_odr i32 @vector_insert(%"class.std::__1::vector.182"*, [1 x i32] br i1 %114, label %124, label %115 ; CHECK-LABEL: .preheader: -; CHECK-NEXT: sub i32 [[OLD_CAST]], [[NEW_CAST]] +; CHECK-NEXT: [[NEG_NEW:%[0-9]+]] = sub i32 0, [[NEW_CAST]] +; CHECK-NEXT: getelementptr i8, i8* %97, i32 [[NEG_NEW]] + ; <label>:115: ; preds = %111, %115 %116 = phi i8* [ %118, %115 ], [ %97, %111 ] %117 = phi i8* [ %119, %115 ], [ %11, %111 ] diff --git a/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll b/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll index 
670477c4c285..d4dd7352aa52 100644 --- a/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll +++ b/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll @@ -268,9 +268,9 @@ define i8* @SyFgets(i8* %line, i64 %length, i64 %fid) { ; CHECK-NEXT: LBB0_48: ## %if.then1477 ; CHECK-NEXT: movl $1, %edx ; CHECK-NEXT: callq _write -; CHECK-NEXT: subq %rbx, %r14 ; CHECK-NEXT: movq _syHistory@{{.*}}(%rip), %rax -; CHECK-NEXT: leaq 8189(%r14,%rax), %rax +; CHECK-NEXT: subq %rbx, %rax +; CHECK-NEXT: leaq 8189(%rax,%r14), %rax ; CHECK-NEXT: .p2align 4, 0x90 ; CHECK-NEXT: LBB0_49: ## %for.body1723 ; CHECK-NEXT: ## =>This Inner Loop Header: Depth=1 diff --git a/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll b/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll index e1ef6bd6635d..bc756c666bde 100644 --- a/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll +++ b/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll @@ -166,21 +166,23 @@ define i8 @testnullptrint(i8* %buf, i8* %end) nounwind { ; PTR64-NEXT: ret i8 [[RET]] ; ; PTR32-LABEL: @testnullptrint( +; PTR32-NEXT: [[BUF1:%.*]] = ptrtoint i8* [[BUF:%.*]] to i32 ; PTR32-NEXT: br label [[LOOPGUARD:%.*]] ; PTR32: loopguard: -; PTR32-NEXT: [[BI:%.*]] = ptrtoint i8* [[BUF:%.*]] to i32 +; PTR32-NEXT: [[BI:%.*]] = ptrtoint i8* [[BUF]] to i32 ; PTR32-NEXT: [[EI:%.*]] = ptrtoint i8* [[END:%.*]] to i32 ; PTR32-NEXT: [[CNT:%.*]] = sub i32 [[EI]], [[BI]] -; PTR32-NEXT: [[CNT1:%.*]] = inttoptr i32 [[CNT]] to i8* ; PTR32-NEXT: [[GUARD:%.*]] = icmp ult i32 0, [[CNT]] ; PTR32-NEXT: br i1 [[GUARD]], label [[PREHEADER:%.*]], label [[EXIT:%.*]] ; PTR32: preheader: +; PTR32-NEXT: [[TMP1:%.*]] = sub i32 0, [[BUF1]] +; PTR32-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, i8* [[END]], i32 [[TMP1]] ; PTR32-NEXT: br label [[LOOP:%.*]] ; PTR32: loop: ; PTR32-NEXT: [[P_01_US_US:%.*]] = phi i8* [ null, [[PREHEADER]] ], [ [[GEP:%.*]], [[LOOP]] ] ; PTR32-NEXT: [[GEP]] = getelementptr inbounds i8, i8* [[P_01_US_US]], i64 1 -; PTR32-NEXT: 
[[SNEXT:%.*]] = load i8, i8* [[GEP]] -; PTR32-NEXT: [[EXITCOND:%.*]] = icmp ne i8* [[GEP]], [[CNT1]] +; PTR32-NEXT: [[SNEXT:%.*]] = load i8, i8* [[GEP]], align 1 +; PTR32-NEXT: [[EXITCOND:%.*]] = icmp ne i8* [[GEP]], [[SCEVGEP]] ; PTR32-NEXT: br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]] ; PTR32: exit.loopexit: ; PTR32-NEXT: [[SNEXT_LCSSA:%.*]] = phi i8 [ [[SNEXT]], [[LOOP]] ] @@ -256,10 +258,10 @@ define i8 @testptrint(i8* %buf, i8* %end) nounwind { ; PTR32-NEXT: [[P_01_US_US:%.*]] = phi i8* [ [[BUF]], [[PREHEADER]] ], [ [[GEP:%.*]], [[LOOP]] ] ; PTR32-NEXT: [[IV:%.*]] = phi i32 [ [[BI]], [[PREHEADER]] ], [ [[IVNEXT:%.*]], [[LOOP]] ] ; PTR32-NEXT: [[GEP]] = getelementptr inbounds i8, i8* [[P_01_US_US]], i64 1 -; PTR32-NEXT: [[SNEXT:%.*]] = load i8, i8* [[GEP]] +; PTR32-NEXT: [[SNEXT:%.*]] = load i8, i8* [[GEP]], align 1 ; PTR32-NEXT: [[IVNEXT]] = add nuw i32 [[IV]], 1 -; PTR32-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[IVNEXT]], [[CNT]] -; PTR32-NEXT: br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]] +; PTR32-NEXT: [[CMP:%.*]] = icmp ult i32 [[IVNEXT]], [[CNT]] +; PTR32-NEXT: br i1 [[CMP]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]] ; PTR32: exit.loopexit: ; PTR32-NEXT: [[SNEXT_LCSSA:%.*]] = phi i8 [ [[SNEXT]], [[LOOP]] ] ; PTR32-NEXT: br label [[EXIT]] diff --git a/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll b/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll index 1012e23cd3a2..321e98ab6772 100644 --- a/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll +++ b/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll @@ -32,6 +32,5 @@ bitmap_element_allocate.exit: ; CHECK: polly.stmt.cond.end73.i: -; CHECK-NEXT: %0 = bitcast %structty** %b.s2a to i8** -; CHECK-NEXT: store i8* undef, i8** %0 +; CHECK-NEXT: store %structty* undef, %structty** %b.s2a ; CHECK-NEXT: br label %polly.exiting </cut>
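
For readers skimming the updated ptrtoint.ll expectations in the commit above, the pattern they follow can be sketched in a few lines of Python. This is only a toy model of what the test output shows, not LLVM's implementation: the function name model_ptrtoint and its return shape are made up for illustration, and it assumes a plain (non-constant) pointer operand such as %in.

```python
def model_ptrtoint(ptr_bits, int_bits):
    """Toy model (hypothetical helper, not an LLVM API) of how the updated
    test expectations describe `ptrtoint` of a ptr_bits-wide pointer to an
    int_bits-wide integer.

    Returns (kind, rng) where rng is a half-open [lo, hi) pair, or None
    for "full-set".
    """
    if int_bits > ptr_bits:
        # Widening: shown as (zext ptr to iN); both the unsigned and the
        # signed range become [0, 2**ptr_bits) in the expectations.
        return ("zext", (0, 1 << ptr_bits))
    if int_bits < ptr_bits:
        # Narrowing: shown as (trunc ptr to iN) with full-set ranges.
        return ("trunc", None)
    # Same width: the expression is just the pointer itself, full-set.
    return ("self", None)


if __name__ == "__main__":
    # X64 (64-bit pointers), ptrtoint to i128
    print(model_ptrtoint(64, 128))
    # X32 (32-bit pointers), ptrtoint to i64
    print(model_ptrtoint(32, 64))
```

The two printed ranges correspond to the [0,18446744073709551616) and [0,4294967296) bounds in the X64 and X32 check lines; the previous expectations listed a negative lower bound for the signed range because the cast was treated as an opaque value.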

3 years, 11 months

New IRC Channel for Linaro Toolchain Working Group

by Matthew Gretton-Dann

All,

During Connect the suggestion was made that each working group should have its own IRC Channel for discussions and topics relating to the group in particular (as opposed to #linaro which is 'generic' Linaro conversations).

Therefore I have just set up #linaro-tcwg on Freenode for the Toolchain Working Group. This channel is public and open to anyone who wants to talk with the TCWG group about anything toolchain related.

Thanks,

Matt

--
Matthew Gretton-Dann
Toolchain Working Group, Linaro

3 years, 11 months

[ACTIVITY] report week ending 16 Jul

by Peter Maydell

Progress:
 * UM-2 [QEMU upstream maintainership]
   + Code review:
     - ITS patchset v6
     - RTH's series to allow usermode emulation users to set default vector length
   + Arm pullreq; shepherding stuff in for softfreeze
 * QEMU-406 [QEMU support for MVE (M-profile Vector Extension; Helium)]
   + Sent out patchset with the 3rd slice of MVE insns; I now have all the non-floating-point insns done, I think. New insns: scatter-gather loads/stores, interleaving loads/stores, VCTP, realized I already did VMOVL as it is "VSHLL-by-0"
   + Implemented most of the floating point insns: implemented VADD fp, VSUB fp, VABD fp, VMUL fp, VMAXNM, VMINNM, VCADD fp, VFMA, VFMS, VCMUL, VCMLA, VMAXNMA, VMINNMA, VADD fp scalar, VSUB fp scalar, VMUL fp scalar, VFMA fp scalar, VFMAS fp scalar, VMAXNMV, VMAXNMAV, VMINNMV, VMINNMAV, VCMP and VPT fp vector and scalar, VCVT fixed-point
   + Just 4 insns to go: three flavours of VCVT, plus VRINT (and then all the other stuff that wasn't included in this simplistic measure of progress :-))
   + Progress: 206/210 (98%)

-- PMM

3 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-Os - Build # 18 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-Os. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-Os Culprit: <cut> commit ee2f721c2f7ac5574456833447a492ed1b24b5c2 Author: Jonathan Wakely <jwakely(a)redhat.com> Date: Thu Apr 25 23:43:15 2019 +0100 PR libstdc++/90239 use uses_allocator_construction_args in <scoped_allocator> PR libstdc++/90239 * doc/xml/manual/status_cxx2020.xml: Amend P0591R4 status. * include/std/scoped_allocator [__cplusplus > 201703L] (scoped_allocator_adaptor::construct): Define in terms of uses_allocator_construction_args, as per P0591R4. * testsuite/20_util/scoped_allocator/construct_pair_c++2a.cc: New test. * testsuite/util/testsuite_allocator.h: Remove name of unused parameter. From-SVN: r270588 </cut> Results regressed to (for first_bad == ee2f721c2f7ac5574456833447a492ed1b24b5c2) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -Os_mthumb -- artifacts/build-ee2f721c2f7ac5574456833447a492ed1b24b5c2/results_id: 1 # 429.mcf,mcf_base.default regressed by 104 # 447.dealII,dealII_base.default regressed by 57393 # 470.lbm,lbm_base.default regressed by 103 from (for last_good == b6bf4d8a773cde07e751542f2911307d78b717fd) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark 
-Os_mthumb -- artifacts/build-baseline/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/baseline-llvm-release-arm-spec2k6-Os/1687 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-Os/1742 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-ee2f721c2f7ac5574456833447a492ed1b24b5c2 cd investigate-gcc-ee2f721c2f7ac5574456833447a492ed1b24b5c2 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach ee2f721c2f7ac5574456833447a492ed1b24b5c2 ../artifacts/test.sh # Reproduce last_good build git checkout --detach b6bf4d8a773cde07e751542f2911307d78b717fd ../artifacts/test.sh cd .. 
</cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit ee2f721c2f7ac5574456833447a492ed1b24b5c2 Author: Jonathan Wakely <jwakely(a)redhat.com> Date: Thu Apr 25 23:43:15 2019 +0100 PR libstdc++/90239 use uses_allocator_construction_args in <scoped_allocator> PR libstdc++/90239 * doc/xml/manual/status_cxx2020.xml: Amend P0591R4 status. * include/std/scoped_allocator [__cplusplus > 201703L] (scoped_allocator_adaptor::construct): Define in terms of uses_allocator_construction_args, as per P0591R4. * testsuite/20_util/scoped_allocator/construct_pair_c++2a.cc: New test. * testsuite/util/testsuite_allocator.h: Remove name of unused parameter. From-SVN: r270588 --- libstdc++-v3/ChangeLog | 11 +++ libstdc++-v3/doc/xml/manual/status_cxx2020.xml | 4 +- libstdc++-v3/include/std/scoped_allocator | 21 +++++ .../scoped_allocator/construct_pair_c++2a.cc | 97 ++++++++++++++++++++++ libstdc++-v3/testsuite/util/testsuite_allocator.h | 2 +- 5 files changed, 133 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog index b616df758a0..f15125ba5a9 100644 --- a/libstdc++-v3/ChangeLog +++ b/libstdc++-v3/ChangeLog @@ -1,3 +1,14 @@ +2019-04-25 Jonathan Wakely <jwakely(a)redhat.com> + + PR libstdc++/90239 + * doc/xml/manual/status_cxx2020.xml: Amend P0591R4 status. + * include/std/scoped_allocator [__cplusplus > 201703L] + (scoped_allocator_adaptor::construct): Define in terms of + uses_allocator_construction_args, as per P0591R4. + * testsuite/20_util/scoped_allocator/construct_pair_c++2a.cc: New test. + * testsuite/util/testsuite_allocator.h: Remove name of unused + parameter. 
+ 2019-04-24 Jonathan Wakely <jwakely(a)redhat.com> * doc/xml/manual/status_cxx2017.xml: Document P0024R2 status. diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml index cedb3d03066..a075103ea4a 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml @@ -674,7 +674,9 @@ Feature-testing recommendations for C++</link>. </link> </entry> <entry align="center"> 9.1 </entry> - <entry /> + <entry> + <code>std::scoped_allocator_adaptor</code> changes missing in 9.1.0 + </entry> </row> <row> diff --git a/libstdc++-v3/include/std/scoped_allocator b/libstdc++-v3/include/std/scoped_allocator index 335df483f69..2c7ad8e94d7 100644 --- a/libstdc++-v3/include/std/scoped_allocator +++ b/libstdc++-v3/include/std/scoped_allocator @@ -35,6 +35,7 @@ # include <bits/c++0x_warning.h> #else +#include <memory> #include <utility> #include <tuple> #include <bits/alloc_traits.h> @@ -187,6 +188,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using __outermost_alloc_traits = allocator_traits<typename __outermost_type<_Alloc>::type>; +#if __cplusplus <= 201703 template<typename _Tp, typename... _Args> void _M_construct(__uses_alloc0, _Tp* __p, _Args&&... __args) @@ -218,6 +220,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION std::forward<_Args>(__args)..., inner_allocator()); } +#endif // C++17 template<typename _Alloc> static _Alloc @@ -355,6 +358,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION size_type max_size() const { return __traits::max_size(outer_allocator()); } +#if __cplusplus <= 201703 template<typename _Tp, typename... _Args> typename __not_pair<_Tp>::type construct(_Tp* __p, _Args&&... __args) @@ -417,6 +421,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION std::forward_as_tuple(std::forward<_Up>(__x.first)), std::forward_as_tuple(std::forward<_Vp>(__x.second))); } +#else // C++2a + template<typename _Tp, typename... _Args> + __attribute__((__nonnull__)) + void + construct(_Tp* __p, _Args&&... 
__args) + { + typedef __outermost_alloc_traits<scoped_allocator_adaptor> _O_traits; + std::apply([__p, this](auto&&... __newargs) { + _O_traits::construct(__outermost(*this), __p, + std::forward<decltype(__newargs)>(__newargs)...); + }, + uses_allocator_construction_args<_Tp>(inner_allocator(), + std::forward<_Args>(__args)...)); + } +#endif // C++2a template<typename _Tp> void destroy(_Tp* __p) @@ -439,6 +458,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION const scoped_allocator_adaptor<_OutA2, _InA...>& __b) noexcept; private: +#if __cplusplus <= 201703L template<typename _Ind, typename... _Args> tuple<_Args&&...> _M_construct_p(__uses_alloc0, _Ind, tuple<_Args...>& __t) @@ -461,6 +481,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return { std::get<_Ind>(std::move(__t))..., inner_allocator() }; } +#endif // C++17 }; template <typename _OutA1, typename _OutA2, typename... _InA> diff --git a/libstdc++-v3/testsuite/20_util/scoped_allocator/construct_pair_c++2a.cc b/libstdc++-v3/testsuite/20_util/scoped_allocator/construct_pair_c++2a.cc new file mode 100644 index 00000000000..1630f2a4d09 --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/scoped_allocator/construct_pair_c++2a.cc @@ -0,0 +1,97 @@ +// Copyright (C) 2019 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. 
+ +// { dg-options "-std=gnu++2a" } +// { dg-do run { target c++2a } } + +#include <scoped_allocator> +#include <vector> +#include <testsuite_hooks.h> +#include <testsuite_allocator.h> + +struct X +{ + using allocator_type = __gnu_test::uneq_allocator<int>; + + X(int personality) : a(personality) { } + X(std::allocator_arg_t, allocator_type a) : a(a) { } + X(std::allocator_arg_t, allocator_type a, const X&) : a(a) { } + + allocator_type a; +}; + +void +test01() +{ + using value_type = std::pair<std::pair<X, int>, std::pair<int, X>>; + using scoped_alloc + = std::scoped_allocator_adaptor<__gnu_test::uneq_allocator<value_type>>; + + const scoped_alloc a(10); + std::vector<value_type, scoped_alloc> v(a); + VERIFY( v.get_allocator().get_personality() == a.get_personality() ); + + value_type val( { X(1), 2 }, { 3, X(4) } ); + v.push_back(val); + X& x1 = v.back().first.first; + VERIFY( x1.a.get_personality() != val.first.first.a.get_personality() ); + VERIFY( x1.a.get_personality() == a.get_personality() ); + + X& x2 = v.back().second.second; + VERIFY( x2.a.get_personality() != val.second.second.a.get_personality() ); + VERIFY( x2.a.get_personality() == a.get_personality() ); + + // Check other members of the pairs are correctly initialized too: + VERIFY( v.back().first.second == val.first.second ); + VERIFY( v.back().second.first == val.second.first ); +} + +void +test02() +{ + using value_type = std::pair<std::pair<X, int>, std::pair<int, X>>; + using scoped_alloc + = std::scoped_allocator_adaptor<__gnu_test::uneq_allocator<value_type>, + X::allocator_type>; + + const scoped_alloc a(10, 20); + std::vector<value_type, scoped_alloc> v(a); + VERIFY( v.get_allocator().get_personality() == a.get_personality() ); + + value_type val( { X(1), 2 }, { 3, X(4) } ); + v.push_back(val); + X& x1 = v.back().first.first; + VERIFY( x1.a.get_personality() != val.first.first.a.get_personality() ); + VERIFY( x1.a.get_personality() != a.get_personality() ); + VERIFY( x1.a.get_personality() 
== a.inner_allocator().get_personality() ); + + X& x2 = v.back().second.second; + VERIFY( x2.a.get_personality() != val.second.second.a.get_personality() ); + VERIFY( x2.a.get_personality() != a.get_personality() ); + VERIFY( x2.a.get_personality() == a.inner_allocator().get_personality() ); + + // Check other members of the pairs are correctly initialized too: + VERIFY( v.back().first.second == val.first.second ); + VERIFY( v.back().second.first == val.second.first ); +} + +int +main() +{ + test01(); + test02(); +} diff --git a/libstdc++-v3/testsuite/util/testsuite_allocator.h b/libstdc++-v3/testsuite/util/testsuite_allocator.h index 627749299d2..0392421ca04 100644 --- a/libstdc++-v3/testsuite/util/testsuite_allocator.h +++ b/libstdc++-v3/testsuite/util/testsuite_allocator.h @@ -334,7 +334,7 @@ namespace __gnu_test int get_personality() const { return personality; } pointer - allocate(size_type n, const void* hint = 0) + allocate(size_type n, const void* = 0) { pointer p = AllocTraits::allocate(*this, n); </cut>

3 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/gnu-release-arm-spec2k6-O2_LTO - Build # 21 - Fixed!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O2_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O2_LTO Culprit: <cut> commit 268d509d67efac45f01b356602036e1dc7c6935e Author: Andrew Stubbs <ams(a)codesourcery.com> Date: Thu Jun 6 15:11:59 2019 +0000 Add -march=gfx906 for AMD GCN. 2019-06-06 Andrew Stubbs <ams(a)codesourcery.com> gcc/ * config.gcc (amdgcn-*-*): Allow --with-arch=gfx906. * config/gcn/gcn.opt (gpu_type): Add gfx906. * config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Add gfx906 multilib. (MULTILIB_DIRNAMES): Rename gcn5 to gfx900. Add gfx906. From-SVN: r272007 </cut> Results regressed to (for first_bad == 268d509d67efac45f01b356602036e1dc7c6935e) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -O2_LTO_marm -- artifacts/build-268d509d67efac45f01b356602036e1dc7c6935e/results_id: 1 # 459.GemsFDTD,GemsFDTD_base.default regressed by 103 from (for last_good == 41dab855dce20d5d7042c9330dd8124d0ece19c0) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -O2_LTO_marm -- artifacts/build-41dab855dce20d5d7042c9330dd8124d0ece19c0/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… Results ID of last_good: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O2_LTO/1715 Artifacts of 
first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… Results ID of first_bad: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O2_LTO/1700 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-268d509d67efac45f01b356602036e1dc7c6935e cd investigate-gcc-268d509d67efac45f01b356602036e1dc7c6935e git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach 268d509d67efac45f01b356602036e1dc7c6935e ../artifacts/test.sh # Reproduce last_good build git checkout --detach 41dab855dce20d5d7042c9330dd8124d0ece19c0 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… Full commit (up to 1000 lines): <cut> commit 268d509d67efac45f01b356602036e1dc7c6935e Author: Andrew Stubbs <ams(a)codesourcery.com> Date: Thu Jun 6 15:11:59 2019 +0000 Add -march=gfx906 for AMD GCN. 
    2019-06-06 Andrew Stubbs <ams(a)codesourcery.com>

        gcc/
        * config.gcc (amdgcn-*-*): Allow --with-arch=gfx906.
        * config/gcn/gcn.opt (gpu_type): Add gfx906.
        * config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Add gfx906 multilib.
        (MULTILIB_DIRNAMES): Rename gcn5 to gfx900. Add gfx906.

    From-SVN: r272007
---
 gcc/ChangeLog            | 8 ++++++++
 gcc/config.gcc           | 2 +-
 gcc/config/gcn/gcn.opt   | 3 +++
 gcc/config/gcn/t-gcn-hsa | 4 ++--
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ae15b05c65f..3c587a17da0 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2019-06-06 Andrew Stubbs <ams(a)codesourcery.com>
+
+    * config.gcc (amdgcn-*-*): Allow --with-arch=gfx906.
+    * config/gcn/gcn.opt (gpu_type): Add gfx906.
+    * config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Add gfx906 multilib.
+    (MULTILIB_DIRNAMES): Rename gcn5 to gfx900.
+    Add gfx906.
+
 2019-06-06 Kyrylo Tkachov <kyrylo.tkachov(a)arm.com>

     PR tree-optimization/90332
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 67c3c2c7a42..6b00c387247 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4127,7 +4127,7 @@ case "${target}" in
         for which in arch tune; do
             eval "val=\$with_$which"
             case ${val} in
-            "" | carrizo | fiji | gfx900 )
+            "" | carrizo | fiji | gfx900 | gfx906 )
                 # OK
                 ;;
             *)
diff --git a/gcc/config/gcn/gcn.opt b/gcc/config/gcn/gcn.opt
index 2fd3996edba..bdc878f35ad 100644
--- a/gcc/config/gcn/gcn.opt
+++ b/gcc/config/gcn/gcn.opt
@@ -34,6 +34,9 @@ Enum(gpu_type) String(fiji) Value(PROCESSOR_FIJI)
 EnumValue
 Enum(gpu_type) String(gfx900) Value(PROCESSOR_VEGA)

+EnumValue
+Enum(gpu_type) String(gfx906) Value(PROCESSOR_VEGA)
+
 march=
 Target RejectNegative Joined ToLower Enum(gpu_type) Var(gcn_arch) Init(PROCESSOR_CARRIZO)
 Specify the name of the target GPU.
diff --git a/gcc/config/gcn/t-gcn-hsa b/gcc/config/gcn/t-gcn-hsa
index 085ba429c9d..1600a586ac4 100644
--- a/gcc/config/gcn/t-gcn-hsa
+++ b/gcc/config/gcn/t-gcn-hsa
@@ -42,8 +42,8 @@ ALL_HOST_OBJS += gcn-run.o
 gcn-run$(exeext): gcn-run.o
     +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ $< -ldl

-MULTILIB_OPTIONS = march=gfx900
-MULTILIB_DIRNAMES = gcn5
+MULTILIB_OPTIONS = march=gfx900 march=gfx906
+MULTILIB_DIRNAMES = gfx900 gfx906

 PASSES_EXTRA += $(srcdir)/config/gcn/gcn-passes.def
 gcn-tree.o: $(srcdir)/config/gcn/gcn-tree.c
</cut>
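
For readers following the config.gcc hunk in the commit above: the change adds gfx906 to a shell `case` pattern that lists the accepted values for `--with-arch` and `--with-tune`. The sketch below mirrors only that pattern, in isolation. The function name `check_gcn_arch`, the "ok" output and the error text are assumptions made for illustration and are not part of GCC; gfx908 is used here only as an example of a name that is not in the list.

```shell
#!/bin/sh
# Hypothetical helper, not part of GCC: mirrors the case pattern from the
# config.gcc hunk, which accepts gfx906 alongside the existing GCN names.
check_gcn_arch() {
    val=$1
    case ${val} in
        "" | carrizo | fiji | gfx900 | gfx906 )
            # OK: empty (option not given) or a listed GPU name.
            echo "ok"
            ;;
        *)
            # Illustrative error text; the real script's message may differ.
            echo "unknown GCN arch: ${val}" 1>&2
            return 1
            ;;
    esac
}

# Try an empty value, two listed names, and one name that is not listed.
for arch in "" gfx900 gfx906 gfx908; do
    if check_gcn_arch "$arch" >/dev/null 2>&1; then
        echo "'$arch' accepted"
    else
        echo "'$arch' rejected"
    fi
done
```

Under these assumptions the loop reports the first three values as accepted and gfx908 as rejected, which is the behaviour the one-line pattern change is meant to produce for gfx906.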

3 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-Oz - Build # 10 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-Oz. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-Oz

Culprit:
<cut>
commit f645cea8f63e76f4d1ed291da3f61768cbd6abf4
Author: Chen Zheng <czhengsz(a)cn.ibm.com>
Date: Mon Sep 21 20:33:05 2020 -0400

    [MachineSink] add more profitable pattern.

    Add more profitable sinking patterns if the target bb register pressure
    is not too high.

    Reviewed By: qcolombet

    Differential Revision: https://reviews.llvm.org/D88126
</cut>

Results regressed to (for first_bad == f645cea8f63e76f4d1ed291da3f61768cbd6abf4)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# build_llvm true: -3
# true: 0
# benchmark -Oz -- artifacts/build-f645cea8f63e76f4d1ed291da3f61768cbd6abf4/results_id: 1
# 470.lbm,lbm_base.default regressed by 103

from (for last_good == 8dc98897c4af20aeb52f1f19f538c08e55793283)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# build_llvm true: -3
# true: 0
# benchmark -Oz -- artifacts/build-8dc98897c4af20aeb52f1f19f538c08e55793283/results_id: 1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-Oz/1697
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-Oz/1706
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-llvm-f645cea8f63e76f4d1ed291da3f61768cbd6abf4
cd investigate-llvm-f645cea8f63e76f4d1ed291da3f61768cbd6abf4

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach f645cea8f63e76f4d1ed291da3f61768cbd6abf4
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 8dc98897c4af20aeb52f1f19f538c08e55793283
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Full commit (up to 1000 lines):
<cut>
commit f645cea8f63e76f4d1ed291da3f61768cbd6abf4
Author: Chen Zheng <czhengsz(a)cn.ibm.com>
Date: Mon Sep 21 20:33:05 2020 -0400

    [MachineSink] add more profitable pattern.

    Add more profitable sinking patterns if the target bb register pressure
    is not too high.

    Reviewed By: qcolombet

    Differential Revision: https://reviews.llvm.org/D88126
---
 llvm/lib/CodeGen/MachineSink.cpp                   |  79 +-
 .../PowerPC/sink-down-more-instructions-1.mir      |  18 +-
 ...ink-down-more-instructions-regpressure-high.mir | 804 +++++++++++++++++++++
 llvm/test/CodeGen/X86/2007-01-13-StackPtrIndex.ll  |  22 +-
 4 files changed, 894 insertions(+), 29 deletions(-)

diff --git a/llvm/lib/CodeGen/MachineSink.cpp b/llvm/lib/CodeGen/MachineSink.cpp
index 0c7c1cb67723..0abdf897b319 100644
--- a/llvm/lib/CodeGen/MachineSink.cpp
+++ b/llvm/lib/CodeGen/MachineSink.cpp
@@ -34,6 +34,8 @@
 #include "llvm/CodeGen/MachineOperand.h"
 #include "llvm/CodeGen/MachinePostDominators.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/RegisterClassInfo.h"
+#include "llvm/CodeGen/RegisterPressure.h"
 #include "llvm/CodeGen/TargetInstrInfo.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
@@ -94,6 +96,7 @@ namespace {
     MachineBlockFrequencyInfo *MBFI;
     const MachineBranchProbabilityInfo *MBPI;
     AliasAnalysis *AA;
+    RegisterClassInfo RegClassInfo;

     // Remember which edges have been considered for breaking.
     SmallSet<std::pair<MachineBasicBlock*, MachineBasicBlock*>, 8>
@@ -133,6 +136,9 @@ namespace {
                        std::vector<MachineInstr *>>
         StoreInstrCache;

+    /// Cached BB's register pressure.
+    std::map<MachineBasicBlock *, std::vector<unsigned>> CachedRegisterPressure;
+
   public:
     static char ID; // Pass identification

@@ -209,6 +215,8 @@ namespace {
     SmallVector<MachineBasicBlock *, 4> &
     GetAllSortedSuccessors(MachineInstr &MI, MachineBasicBlock *MBB,
                            AllSuccsCache &AllSuccessors) const;
+
+    std::vector<unsigned> &getBBRegisterPressure(MachineBasicBlock &MBB);
   };

 } // end anonymous namespace
@@ -335,6 +343,7 @@ bool MachineSinking::runOnMachineFunction(MachineFunction &MF) {
   MBFI = UseBlockFreqInfo ? &getAnalysis<MachineBlockFrequencyInfo>() : nullptr;
   MBPI = &getAnalysis<MachineBranchProbabilityInfo>();
   AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
+  RegClassInfo.runOnMachineFunction(MF);

   bool EverMadeChange = false;

@@ -428,6 +437,8 @@ bool MachineSinking::ProcessBlock(MachineBasicBlock &MBB) {
   SeenDbgUsers.clear();
   SeenDbgVars.clear();
+  // recalculate the bb register pressure after sinking one BB.
+  CachedRegisterPressure.clear();

   return MadeChange;
 }
@@ -570,6 +581,42 @@ bool MachineSinking::PostponeSplitCriticalEdge(MachineInstr &MI,
   return true;
 }

+std::vector<unsigned> &
+MachineSinking::getBBRegisterPressure(MachineBasicBlock &MBB) {
+  // Currently to save compiling time, MBB's register pressure will not change
+  // in one ProcessBlock iteration because of CachedRegisterPressure. but MBB's
+  // register pressure is changed after sinking any instructions into it.
+  // FIXME: need a accurate and cheap register pressure estiminate model here.
+  auto RP = CachedRegisterPressure.find(&MBB);
+  if (RP != CachedRegisterPressure.end())
+    return RP->second;
+
+  RegionPressure Pressure;
+  RegPressureTracker RPTracker(Pressure);
+
+  // Initialize the register pressure tracker.
+  RPTracker.init(MBB.getParent(), &RegClassInfo, nullptr, &MBB, MBB.end(),
+                 /*TrackLaneMasks*/ false, /*TrackUntiedDefs=*/true);
+
+  for (MachineBasicBlock::iterator MII = MBB.instr_end(),
+                                   MIE = MBB.instr_begin();
+       MII != MIE; --MII) {
+    MachineInstr &MI = *std::prev(MII);
+    if (MI.isDebugValue() || MI.isDebugLabel())
+      continue;
+    RegisterOperands RegOpers;
+    RegOpers.collect(MI, *TRI, *MRI, false, false);
+    RPTracker.recedeSkipDebugValues();
+    assert(&*RPTracker.getPos() == &MI && "RPTracker sync error!");
+    RPTracker.recede(RegOpers);
+  }
+
+  RPTracker.closeRegion();
+  auto It = CachedRegisterPressure.insert(
+      std::make_pair(&MBB, RPTracker.getPressure().MaxSetPressure));
+  return It.first->second;
+}
+
 /// isProfitableToSinkTo - Return true if it is profitable to sink MI.
 bool MachineSinking::isProfitableToSinkTo(Register Reg, MachineInstr &MI,
                                           MachineBasicBlock *MBB,
@@ -614,6 +661,21 @@ bool MachineSinking::isProfitableToSinkTo(Register Reg, MachineInstr &MI,
   if (!ML)
     return false;

+  auto isRegisterPressureSetExceedLimit = [&](const TargetRegisterClass *RC) {
+    unsigned Weight = TRI->getRegClassWeight(RC).RegWeight;
+    const int *PS = TRI->getRegClassPressureSets(RC);
+    // Get register pressure for block SuccToSinkTo.
+    std::vector<unsigned> BBRegisterPressure =
+        getBBRegisterPressure(*SuccToSinkTo);
+    for (; *PS != -1; PS++)
+      // check if any register pressure set exceeds limit in block SuccToSinkTo
+      // after sinking.
+      if (Weight + BBRegisterPressure[*PS] >=
+          TRI->getRegPressureSetLimit(*MBB->getParent(), *PS))
+        return true;
+    return false;
+  };
+
   // If this instruction is inside a loop and sinking this instruction can make
   // more registers live range shorten, it is still prifitable.
   for (unsigned i = 0, e = MI.getNumOperands(); i != e; ++i) {
@@ -645,16 +707,19 @@ bool MachineSinking::isProfitableToSinkTo(Register Reg, MachineInstr &MI,
       if (LI->getLoopFor(DefMI->getParent()) != ML ||
           (DefMI->isPHI() && LI->isLoopHeader(DefMI->getParent())))
         continue;
-      // DefMI is inside the loop. Mark it as not profitable as sinking MI will
-      // enlarge DefMI live range.
-      // FIXME: check the register pressure in block SuccToSinkTo, if it is
-      // smaller than the limit after sinking, it is still profitable to sink.
-      return false;
+      // The DefMI is defined inside the loop.
+      // If sinking this operand makes some register pressure set exceed limit,
+      // it is not profitable.
+      if (isRegisterPressureSetExceedLimit(MRI->getRegClass(Reg))) {
+        LLVM_DEBUG(dbgs() << "register pressure exceed limit, not profitable.");
+        return false;
+      }
     }
   }

-  // If MI is in loop and all its operands are alive across the whole loop, it
-  // is profitable to sink MI.
+  // If MI is in loop and all its operands are alive across the whole loop or if
+  // no operand sinking make register pressure set exceed limit, it is
+  // profitable to sink MI.
return true; } diff --git a/llvm/test/CodeGen/PowerPC/sink-down-more-instructions-1.mir b/llvm/test/CodeGen/PowerPC/sink-down-more-instructions-1.mir index 94cd5877e47d..e44f096db645 100644 --- a/llvm/test/CodeGen/PowerPC/sink-down-more-instructions-1.mir +++ b/llvm/test/CodeGen/PowerPC/sink-down-more-instructions-1.mir @@ -370,14 +370,12 @@ body: | ; CHECK: [[PHI5:%[0-9]+]]:gprc = PHI [[LI2]], %bb.2, %27, %bb.17 ; CHECK: [[PHI6:%[0-9]+]]:g8rc_and_g8rc_nox0 = PHI [[ADDI8_]], %bb.2, %55, %bb.17 ; CHECK: [[PHI7:%[0-9]+]]:g8rc_and_g8rc_nox0 = PHI [[ADDI8_1]], %bb.2, %15, %bb.17 - ; CHECK: [[LWZU:%[0-9]+]]:gprc, [[LWZU1:%[0-9]+]]:g8rc_and_g8rc_nox0 = LWZU 8, [[PHI6]] :: (load 4 from %ir.46, !tbaa !2) ; CHECK: [[COPY10:%[0-9]+]]:gprc_and_gprc_nor0 = COPY [[PHI4]].sub_32 ; CHECK: [[MULHWU1:%[0-9]+]]:gprc = MULHWU [[COPY10]], [[ORI]] ; CHECK: [[RLWINM2:%[0-9]+]]:gprc = RLWINM [[MULHWU1]], 28, 4, 31 ; CHECK: [[MULLI1:%[0-9]+]]:gprc = nsw MULLI killed [[RLWINM2]], -30 ; CHECK: [[INSERT_SUBREG1:%[0-9]+]]:g8rc = INSERT_SUBREG [[DEF1]], killed [[MULLI1]], %subreg.sub_32 ; CHECK: [[RLDICL2:%[0-9]+]]:g8rc = RLDICL killed [[INSERT_SUBREG1]], 0, 32 - ; CHECK: [[ADD4_2:%[0-9]+]]:gprc = nsw ADD4 killed [[LWZU]], [[PHI5]] ; CHECK: BCC 76, [[CMPLWI1]], %bb.11 ; CHECK: B %bb.10 ; CHECK: bb.10 (%ir-block.38): @@ -396,11 +394,11 @@ body: | ; CHECK: successors: %bb.15(0x2aaaaaab), %bb.13(0x55555555) ; CHECK: [[PHI8:%[0-9]+]]:gprc = PHI [[ADDI2]], %bb.11, [[ISEL1]], %bb.10 ; CHECK: [[ADDI8_4:%[0-9]+]]:g8rc_and_g8rc_nox0 = ADDI8 [[PHI7]], 8 - ; CHECK: [[COPY13:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY [[ADDI8_4]] + ; CHECK: [[LWZU:%[0-9]+]]:gprc, [[LWZU1:%[0-9]+]]:g8rc_and_g8rc_nox0 = LWZU 8, [[PHI6]] :: (load 4 from %ir.46, !tbaa !2) + ; CHECK: [[ADD4_2:%[0-9]+]]:gprc = nsw ADD4 [[LWZU]], [[PHI5]] ; CHECK: [[ADD4_3:%[0-9]+]]:gprc = nsw ADD4 [[PHI8]], [[ADD4_2]] ; CHECK: STW killed [[ADD4_3]], 0, [[ADDI8_4]] :: (store 4 into %ir.44, !tbaa !2) ; CHECK: [[LWZ:%[0-9]+]]:gprc = LWZ 4, [[LWZU1]] :: 
(load 4 from %ir.uglygep1112.cast, !tbaa !2) - ; CHECK: [[ADD4_4:%[0-9]+]]:gprc = nsw ADD4 killed [[LWZ]], [[ADD4_2]] ; CHECK: BCC 76, [[CMPLWI2]], %bb.15 ; CHECK: B %bb.13 ; CHECK: bb.13 (%ir-block.60): @@ -414,19 +412,21 @@ body: | ; CHECK: bb.15 (%ir-block.69): ; CHECK: successors: %bb.17(0x80000000) ; CHECK: [[ORI8_:%[0-9]+]]:g8rc = ORI8 [[PHI4]], 1 - ; CHECK: [[COPY14:%[0-9]+]]:gprc = COPY [[ORI8_]].sub_32 - ; CHECK: [[RLWINM4:%[0-9]+]]:gprc = RLWINM [[COPY14]], 1, 0, 30 + ; CHECK: [[COPY13:%[0-9]+]]:gprc = COPY [[ORI8_]].sub_32 + ; CHECK: [[RLWINM4:%[0-9]+]]:gprc = RLWINM [[COPY13]], 1, 0, 30 ; CHECK: B %bb.17 ; CHECK: bb.16 (%ir-block.72): ; CHECK: successors: %bb.17(0x80000000) ; CHECK: [[ORI8_1:%[0-9]+]]:g8rc = ORI8 [[RLDICL2]], 1 ; CHECK: [[ADD8_1:%[0-9]+]]:g8rc = ADD8 [[PHI4]], [[ORI8_1]] - ; CHECK: [[COPY15:%[0-9]+]]:gprc = COPY [[ADD8_1]].sub_32 + ; CHECK: [[COPY14:%[0-9]+]]:gprc = COPY [[ADD8_1]].sub_32 ; CHECK: bb.17 (%ir-block.74): ; CHECK: successors: %bb.9(0x7c000000), %bb.3(0x04000000) - ; CHECK: [[PHI9:%[0-9]+]]:gprc = PHI [[ADDI3]], %bb.14, [[RLWINM4]], %bb.15, [[COPY15]], %bb.16 + ; CHECK: [[PHI9:%[0-9]+]]:gprc = PHI [[ADDI3]], %bb.14, [[RLWINM4]], %bb.15, [[COPY14]], %bb.16 + ; CHECK: [[COPY15:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY [[ADDI8_4]] + ; CHECK: [[ADD4_4:%[0-9]+]]:gprc = nsw ADD4 [[LWZ]], [[ADD4_2]] ; CHECK: [[ADD4_5:%[0-9]+]]:gprc = nsw ADD4 [[PHI9]], [[ADD4_4]] - ; CHECK: STW killed [[ADD4_5]], 4, [[COPY13]] :: (store 4 into %ir.uglygep78.cast, !tbaa !2) + ; CHECK: STW killed [[ADD4_5]], 4, [[COPY15]] :: (store 4 into %ir.uglygep78.cast, !tbaa !2) ; CHECK: [[ADDI8_5:%[0-9]+]]:g8rc = nuw nsw ADDI8 [[PHI4]], 2 ; CHECK: BDNZ8 %bb.9, implicit-def dead $ctr8, implicit $ctr8 ; CHECK: B %bb.3 diff --git a/llvm/test/CodeGen/PowerPC/sink-down-more-instructions-regpressure-high.mir b/llvm/test/CodeGen/PowerPC/sink-down-more-instructions-regpressure-high.mir new file mode 100644 index 000000000000..c16de1744383 --- /dev/null +++ 
b/llvm/test/CodeGen/PowerPC/sink-down-more-instructions-regpressure-high.mir @@ -0,0 +1,804 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -o - %s -verify-machineinstrs \ +# RUN: -run-pass=machine-sink | FileCheck %s + +--- | + ; ModuleID = 'sink-down-more-instructions-regpressure-high.ll' + source_filename = "sink-down-more-instructions-regpressure-high.c" + target datalayout = "e-m:e-i64:64-n32:64" + target triple = "powerpc64le-unknown-linux-gnu" + + ; This file check that %16:gprc in MIR can not be sunk down because of high + ; register pressure in destination block. + + ; Function Attrs: nofree norecurse nounwind + define dso_local signext i32 @foo(i32 signext %0, i32 signext %1, i32* nocapture readonly %2, i32* nocapture %3, i32 signext %4, i32* nocapture readonly %5, i32* nocapture readonly %6, i32* nocapture readonly %7, i32* nocapture readonly %8, i32* nocapture readonly %9, i32* nocapture readonly %10, i32* nocapture readonly %11, i32* nocapture readonly %12, i32* nocapture readonly %13, i32* nocapture readonly %14, i32* nocapture readonly %15, i32* nocapture readonly %16, i32* nocapture readonly %17, i32* nocapture readonly %18, i32* nocapture readonly %19, i32* nocapture readonly %20, i32* nocapture readonly %21, i32* nocapture readonly %22, i32* nocapture readonly %23, i32* nocapture readonly %24, i32* nocapture readonly %25, i32* nocapture readonly %26, i32* nocapture readonly %27, i32* nocapture readonly %28, i32* nocapture readonly %29, i32* nocapture readonly %30, i32* nocapture readonly %31, i32* nocapture readonly %32, i32* nocapture readonly %33, i32* nocapture readonly %34, i32* nocapture readonly %35, i32* nocapture readonly %36) local_unnamed_addr #0 { + %38 = icmp sgt i32 %4, 0 + br i1 %38, label %39, label %41 + + 39: ; preds = %37 + %40 = zext i32 %4 to i64 + %scevgep = getelementptr i32, i32* %2, i64 -1 + %scevgep69 = bitcast i32* %scevgep to i8* + 
%scevgep70 = getelementptr i32, i32* %5, i64 -1 + %scevgep7071 = bitcast i32* %scevgep70 to i8* + %scevgep72 = getelementptr i32, i32* %6, i64 -1 + %scevgep7273 = bitcast i32* %scevgep72 to i8* + call void @llvm.set.loop.iterations.i64(i64 %40) + br label %42 + + 41: ; preds = %65, %37 + ret i32 undef + + 42: ; preds = %65, %39 + %lsr.iv = phi i64 [ %lsr.iv.next, %65 ], [ 0, %39 ] + %43 = phi i64 [ 0, %39 ], [ %163, %65 ] + %44 = phi i32 [ 0, %39 ], [ %58, %65 ] + %45 = phi i8* [ %scevgep69, %39 ], [ %52, %65 ] + %46 = phi i8* [ %scevgep7071, %39 ], [ %50, %65 ] + %47 = phi i8* [ %scevgep7273, %39 ], [ %48, %65 ] + %48 = getelementptr i8, i8* %47, i64 4 + %49 = bitcast i8* %48 to i32* + %50 = getelementptr i8, i8* %46, i64 4 + %51 = bitcast i8* %50 to i32* + %52 = getelementptr i8, i8* %45, i64 4 + %53 = bitcast i8* %52 to i32* + %lsr68 = trunc i64 %43 to i32 + %54 = udiv i32 %lsr68, 30 + %55 = mul nuw nsw i32 %54, 30 + %56 = sub i32 %lsr68, %55 + %57 = load i32, i32* %53, align 4, !tbaa !2 + %58 = add nsw i32 %57, %44 + switch i32 %0, label %64 [ + i32 1, label %59 + i32 3, label %62 + ] + + 59: ; preds = %42 + %60 = trunc i64 %43 to i32 + %61 = shl i32 %60, 1 + br label %65 + + 62: ; preds = %42 + %63 = add nuw nsw i32 %lsr68, 100 + br label %65 + + 64: ; preds = %42 + br label %65 + + 65: ; preds = %64, %62, %59 + %66 = phi i32 [ %56, %64 ], [ %63, %62 ], [ %61, %59 ] + %67 = bitcast i32* %7 to i8* + %68 = bitcast i32* %8 to i8* + %69 = bitcast i32* %9 to i8* + %70 = bitcast i32* %10 to i8* + %71 = bitcast i32* %11 to i8* + %72 = bitcast i32* %12 to i8* + %73 = bitcast i32* %13 to i8* + %74 = bitcast i32* %14 to i8* + %75 = bitcast i32* %15 to i8* + %76 = bitcast i32* %16 to i8* + %77 = bitcast i32* %17 to i8* + %78 = bitcast i32* %18 to i8* + %79 = bitcast i32* %19 to i8* + %80 = bitcast i32* %20 to i8* + %81 = bitcast i32* %21 to i8* + %82 = bitcast i32* %22 to i8* + %83 = bitcast i32* %23 to i8* + %84 = bitcast i32* %24 to i8* + %85 = bitcast i32* %25 to i8* 
+ %86 = bitcast i32* %26 to i8* + %87 = bitcast i32* %27 to i8* + %88 = bitcast i32* %28 to i8* + %89 = bitcast i32* %29 to i8* + %90 = bitcast i32* %30 to i8* + %91 = bitcast i32* %31 to i8* + %92 = bitcast i32* %32 to i8* + %93 = bitcast i32* %33 to i8* + %94 = bitcast i32* %34 to i8* + %95 = bitcast i32* %35 to i8* + %96 = bitcast i32* %36 to i8* + %97 = bitcast i32* %3 to i8* + %98 = add nsw i32 %66, %58 + %99 = load i32, i32* %51, align 4, !tbaa !2 + %100 = add nsw i32 %98, %99 + %101 = load i32, i32* %49, align 4, !tbaa !2 + %102 = add nsw i32 %100, %101 + %uglygep60 = getelementptr i8, i8* %67, i64 %lsr.iv + %uglygep6061 = bitcast i8* %uglygep60 to i32* + %103 = load i32, i32* %uglygep6061, align 4, !tbaa !2 + %104 = add nsw i32 %102, %103 + %uglygep58 = getelementptr i8, i8* %68, i64 %lsr.iv + %uglygep5859 = bitcast i8* %uglygep58 to i32* + %105 = load i32, i32* %uglygep5859, align 4, !tbaa !2 + %106 = add nsw i32 %104, %105 + %uglygep56 = getelementptr i8, i8* %69, i64 %lsr.iv + %uglygep5657 = bitcast i8* %uglygep56 to i32* + %107 = load i32, i32* %uglygep5657, align 4, !tbaa !2 + %108 = add nsw i32 %106, %107 + %uglygep54 = getelementptr i8, i8* %70, i64 %lsr.iv + %uglygep5455 = bitcast i8* %uglygep54 to i32* + %109 = load i32, i32* %uglygep5455, align 4, !tbaa !2 + %110 = add nsw i32 %108, %109 + %uglygep52 = getelementptr i8, i8* %71, i64 %lsr.iv + %uglygep5253 = bitcast i8* %uglygep52 to i32* + %111 = load i32, i32* %uglygep5253, align 4, !tbaa !2 + %112 = add nsw i32 %110, %111 + %uglygep50 = getelementptr i8, i8* %72, i64 %lsr.iv + %uglygep5051 = bitcast i8* %uglygep50 to i32* + %113 = load i32, i32* %uglygep5051, align 4, !tbaa !2 + %114 = add nsw i32 %112, %113 + %uglygep48 = getelementptr i8, i8* %73, i64 %lsr.iv + %uglygep4849 = bitcast i8* %uglygep48 to i32* + %115 = load i32, i32* %uglygep4849, align 4, !tbaa !2 + %116 = add nsw i32 %114, %115 + %uglygep46 = getelementptr i8, i8* %74, i64 %lsr.iv + %uglygep4647 = bitcast i8* %uglygep46 to i32* 
+ %117 = load i32, i32* %uglygep4647, align 4, !tbaa !2 + %118 = add nsw i32 %116, %117 + %uglygep44 = getelementptr i8, i8* %75, i64 %lsr.iv + %uglygep4445 = bitcast i8* %uglygep44 to i32* + %119 = load i32, i32* %uglygep4445, align 4, !tbaa !2 + %120 = add nsw i32 %118, %119 + %uglygep42 = getelementptr i8, i8* %76, i64 %lsr.iv + %uglygep4243 = bitcast i8* %uglygep42 to i32* + %121 = load i32, i32* %uglygep4243, align 4, !tbaa !2 + %122 = add nsw i32 %120, %121 + %uglygep40 = getelementptr i8, i8* %77, i64 %lsr.iv + %uglygep4041 = bitcast i8* %uglygep40 to i32* + %123 = load i32, i32* %uglygep4041, align 4, !tbaa !2 + %124 = add nsw i32 %122, %123 + %uglygep38 = getelementptr i8, i8* %78, i64 %lsr.iv + %uglygep3839 = bitcast i8* %uglygep38 to i32* + %125 = load i32, i32* %uglygep3839, align 4, !tbaa !2 + %126 = add nsw i32 %124, %125 + %uglygep36 = getelementptr i8, i8* %79, i64 %lsr.iv + %uglygep3637 = bitcast i8* %uglygep36 to i32* + %127 = load i32, i32* %uglygep3637, align 4, !tbaa !2 + %128 = add nsw i32 %126, %127 + %uglygep34 = getelementptr i8, i8* %80, i64 %lsr.iv + %uglygep3435 = bitcast i8* %uglygep34 to i32* + %129 = load i32, i32* %uglygep3435, align 4, !tbaa !2 + %130 = add nsw i32 %128, %129 + %uglygep32 = getelementptr i8, i8* %81, i64 %lsr.iv + %uglygep3233 = bitcast i8* %uglygep32 to i32* + %131 = load i32, i32* %uglygep3233, align 4, !tbaa !2 + %132 = add nsw i32 %130, %131 + %uglygep30 = getelementptr i8, i8* %82, i64 %lsr.iv + %uglygep3031 = bitcast i8* %uglygep30 to i32* + %133 = load i32, i32* %uglygep3031, align 4, !tbaa !2 + %134 = add nsw i32 %132, %133 + %uglygep28 = getelementptr i8, i8* %83, i64 %lsr.iv + %uglygep2829 = bitcast i8* %uglygep28 to i32* + %135 = load i32, i32* %uglygep2829, align 4, !tbaa !2 + %136 = add nsw i32 %134, %135 + %uglygep26 = getelementptr i8, i8* %84, i64 %lsr.iv + %uglygep2627 = bitcast i8* %uglygep26 to i32* + %137 = load i32, i32* %uglygep2627, align 4, !tbaa !2 + %138 = add nsw i32 %136, %137 + 
%uglygep24 = getelementptr i8, i8* %85, i64 %lsr.iv + %uglygep2425 = bitcast i8* %uglygep24 to i32* + %139 = load i32, i32* %uglygep2425, align 4, !tbaa !2 + %140 = add nsw i32 %138, %139 + %uglygep22 = getelementptr i8, i8* %86, i64 %lsr.iv + %uglygep2223 = bitcast i8* %uglygep22 to i32* + %141 = load i32, i32* %uglygep2223, align 4, !tbaa !2 + %142 = add nsw i32 %140, %141 + %uglygep20 = getelementptr i8, i8* %87, i64 %lsr.iv + %uglygep2021 = bitcast i8* %uglygep20 to i32* + %143 = load i32, i32* %uglygep2021, align 4, !tbaa !2 + %144 = add nsw i32 %142, %143 + %uglygep18 = getelementptr i8, i8* %88, i64 %lsr.iv + %uglygep1819 = bitcast i8* %uglygep18 to i32* + %145 = load i32, i32* %uglygep1819, align 4, !tbaa !2 + %146 = add nsw i32 %144, %145 + %uglygep16 = getelementptr i8, i8* %89, i64 %lsr.iv + %uglygep1617 = bitcast i8* %uglygep16 to i32* + %147 = load i32, i32* %uglygep1617, align 4, !tbaa !2 + %148 = add nsw i32 %146, %147 + %uglygep14 = getelementptr i8, i8* %90, i64 %lsr.iv + %uglygep1415 = bitcast i8* %uglygep14 to i32* + %149 = load i32, i32* %uglygep1415, align 4, !tbaa !2 + %150 = add nsw i32 %148, %149 + %uglygep12 = getelementptr i8, i8* %91, i64 %lsr.iv + %uglygep1213 = bitcast i8* %uglygep12 to i32* + %151 = load i32, i32* %uglygep1213, align 4, !tbaa !2 + %152 = add nsw i32 %150, %151 + %uglygep10 = getelementptr i8, i8* %92, i64 %lsr.iv + %uglygep1011 = bitcast i8* %uglygep10 to i32* + %153 = load i32, i32* %uglygep1011, align 4, !tbaa !2 + %154 = add nsw i32 %152, %153 + %uglygep8 = getelementptr i8, i8* %93, i64 %lsr.iv + %uglygep89 = bitcast i8* %uglygep8 to i32* + %155 = load i32, i32* %uglygep89, align 4, !tbaa !2 + %156 = add nsw i32 %154, %155 + %uglygep6 = getelementptr i8, i8* %94, i64 %lsr.iv + %uglygep67 = bitcast i8* %uglygep6 to i32* + %157 = load i32, i32* %uglygep67, align 4, !tbaa !2 + %158 = add nsw i32 %156, %157 + %uglygep4 = getelementptr i8, i8* %95, i64 %lsr.iv + %uglygep45 = bitcast i8* %uglygep4 to i32* + %159 = load 
i32, i32* %uglygep45, align 4, !tbaa !2 + %160 = add nsw i32 %158, %159 + %uglygep2 = getelementptr i8, i8* %96, i64 %lsr.iv + %uglygep23 = bitcast i8* %uglygep2 to i32* + %161 = load i32, i32* %uglygep23, align 4, !tbaa !2 + %162 = add nsw i32 %160, %161 + %uglygep = getelementptr i8, i8* %97, i64 %lsr.iv + %uglygep1 = bitcast i8* %uglygep to i32* + store i32 %162, i32* %uglygep1, align 4, !tbaa !2 + %163 = add nuw nsw i64 %43, 1 + %lsr.iv.next = add nuw nsw i64 %lsr.iv, 4 + %164 = call i1 @llvm.loop.decrement.i64(i64 1) + br i1 %164, label %42, label %41 + } + + ; Function Attrs: noduplicate nounwind + declare void @llvm.set.loop.iterations.i64(i64) #1 + + ; Function Attrs: noduplicate nounwind + declare i1 @llvm.loop.decrement.i64(i64) #1 + + attributes #0 = { nofree norecurse nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="ppc64le" "target-features"="+altivec,+bpermd,+crypto,+direct-move,+extdiv,+htm,+power8-vector,+vsx,-power9-vector,-spe" "unsafe-fp-math"="false" "use-soft-float"="false" } + attributes #1 = { noduplicate nounwind } + + !llvm.module.flags = !{!0} + !llvm.ident = !{!1} + + !0 = !{i32 1, !"wchar_size", i32 4} + !1 = !{!"clang version 12.0.0"} + !2 = !{!3, !3, i64 0} + !3 = !{!"int", !4, i64 0} + !4 = !{!"omnipotent char", !5, i64 0} + !5 = !{!"Simple C/C++ TBAA"} + +... 
+--- +name: foo +alignment: 16 +tracksRegLiveness: true +registers: + - { id: 0, class: g8rc } + - { id: 1, class: g8rc } + - { id: 2, class: g8rc } + - { id: 3, class: g8rc_and_g8rc_nox0 } + - { id: 4, class: g8rc_and_g8rc_nox0 } + - { id: 5, class: gprc } + - { id: 6, class: g8rc_and_g8rc_nox0 } + - { id: 7, class: g8rc_and_g8rc_nox0 } + - { id: 8, class: g8rc_and_g8rc_nox0 } + - { id: 9, class: g8rc } + - { id: 10, class: g8rc_and_g8rc_nox0 } + - { id: 11, class: g8rc } + - { id: 12, class: g8rc_and_g8rc_nox0 } + - { id: 13, class: g8rc } + - { id: 14, class: gprc_and_gprc_nor0 } + - { id: 15, class: gprc } + - { id: 16, class: gprc } + - { id: 17, class: gprc } + - { id: 18, class: gprc } + - { id: 19, class: gprc } + - { id: 20, class: g8rc } + - { id: 21, class: g8rc } + - { id: 22, class: g8rc } + - { id: 23, class: g8rc } + - { id: 24, class: g8rc_and_g8rc_nox0 } + - { id: 25, class: g8rc_and_g8rc_nox0 } + - { id: 26, class: g8rc } + - { id: 27, class: g8rc_and_g8rc_nox0 } + - { id: 28, class: g8rc_and_g8rc_nox0 } + - { id: 29, class: g8rc_and_g8rc_nox0 } + - { id: 30, class: gprc } + - { id: 31, class: gprc } + - { id: 32, class: g8rc_and_g8rc_nox0 } + - { id: 33, class: g8rc_and_g8rc_nox0 } + - { id: 34, class: g8rc_and_g8rc_nox0 } + - { id: 35, class: g8rc_and_g8rc_nox0 } + - { id: 36, class: g8rc_and_g8rc_nox0 } + - { id: 37, class: g8rc_and_g8rc_nox0 } + - { id: 38, class: g8rc_and_g8rc_nox0 } + - { id: 39, class: g8rc_and_g8rc_nox0 } + - { id: 40, class: g8rc_and_g8rc_nox0 } + - { id: 41, class: g8rc_and_g8rc_nox0 } + - { id: 42, class: g8rc_and_g8rc_nox0 } + - { id: 43, class: g8rc_and_g8rc_nox0 } + - { id: 44, class: g8rc_and_g8rc_nox0 } + - { id: 45, class: g8rc_and_g8rc_nox0 } + - { id: 46, class: g8rc_and_g8rc_nox0 } + - { id: 47, class: g8rc_and_g8rc_nox0 } + - { id: 48, class: g8rc_and_g8rc_nox0 } + - { id: 49, class: g8rc_and_g8rc_nox0 } + - { id: 50, class: g8rc_and_g8rc_nox0 } + - { id: 51, class: g8rc_and_g8rc_nox0 } + - { id: 52, class: 
g8rc_and_g8rc_nox0 } + - { id: 53, class: g8rc_and_g8rc_nox0 } + - { id: 54, class: g8rc_and_g8rc_nox0 } + - { id: 55, class: g8rc_and_g8rc_nox0 } + - { id: 56, class: g8rc_and_g8rc_nox0 } + - { id: 57, class: g8rc_and_g8rc_nox0 } + - { id: 58, class: g8rc_and_g8rc_nox0 } + - { id: 59, class: g8rc_and_g8rc_nox0 } + - { id: 60, class: g8rc_and_g8rc_nox0 } + - { id: 61, class: crrc } + - { id: 62, class: g8rc } + - { id: 63, class: gprc } + - { id: 64, class: g8rc } + - { id: 65, class: g8rc } + - { id: 66, class: g8rc } + - { id: 67, class: gprc } + - { id: 68, class: g8rc_and_g8rc_nox0 } + - { id: 69, class: gprc } + - { id: 70, class: gprc } + - { id: 71, class: gprc } + - { id: 72, class: gprc } + - { id: 73, class: gprc } + - { id: 74, class: crrc } + - { id: 75, class: crrc } + - { id: 76, class: gprc } + - { id: 77, class: gprc } + - { id: 78, class: gprc } + - { id: 79, class: gprc } + - { id: 80, class: gprc } + - { id: 81, class: gprc } + - { id: 82, class: gprc } + - { id: 83, class: gprc } + - { id: 84, class: gprc } + - { id: 85, class: gprc } + - { id: 86, class: gprc } + - { id: 87, class: gprc } + - { id: 88, class: gprc } + - { id: 89, class: gprc } + - { id: 90, class: gprc } + - { id: 91, class: gprc } + - { id: 92, class: gprc } + - { id: 93, class: gprc } + - { id: 94, class: gprc } + - { id: 95, class: gprc } + - { id: 96, class: gprc } + - { id: 97, class: gprc } + - { id: 98, class: gprc } + - { id: 99, class: gprc } + - { id: 100, class: gprc } + - { id: 101, class: gprc } + - { id: 102, class: gprc } + - { id: 103, class: gprc } + - { id: 104, class: gprc } + - { id: 105, class: gprc } + - { id: 106, class: gprc } + - { id: 107, class: gprc } + - { id: 108, class: gprc } + - { id: 109, class: gprc } + - { id: 110, class: gprc } + - { id: 111, class: gprc } + - { id: 112, class: gprc } + - { id: 113, class: gprc } + - { id: 114, class: gprc } + - { id: 115, class: gprc } + - { id: 116, class: gprc } + - { id: 117, class: gprc } + - { id: 118, 
class: gprc } + - { id: 119, class: gprc } + - { id: 120, class: gprc } + - { id: 121, class: gprc } + - { id: 122, class: gprc } + - { id: 123, class: gprc } + - { id: 124, class: gprc } + - { id: 125, class: gprc } + - { id: 126, class: gprc } + - { id: 127, class: gprc } + - { id: 128, class: gprc } + - { id: 129, class: gprc } + - { id: 130, class: gprc } + - { id: 131, class: gprc } + - { id: 132, class: gprc } + - { id: 133, class: gprc } + - { id: 134, class: gprc } + - { id: 135, class: gprc } + - { id: 136, class: gprc } + - { id: 137, class: gprc } + - { id: 138, class: gprc } + - { id: 139, class: gprc } + - { id: 140, class: gprc } + - { id: 141, class: gprc } + - { id: 142, class: g8rc } +liveins: + - { reg: '$x3', virtual-reg: '%22' } + - { reg: '$x5', virtual-reg: '%24' } + - { reg: '$x6', virtual-reg: '%25' } + - { reg: '$x7', virtual-reg: '%26' } + - { reg: '$x8', virtual-reg: '%27' } + - { reg: '$x9', virtual-reg: '%28' } + - { reg: '$x10', virtual-reg: '%29' } +frameInfo: + maxAlignment: 1 +fixedStack: + - { id: 0, offset: 320, size: 8, alignment: 16, isImmutable: true } + - { id: 1, offset: 312, size: 8, alignment: 8, isImmutable: true } + - { id: 2, offset: 304, size: 8, alignment: 16, isImmutable: true } + - { id: 3, offset: 296, size: 8, alignment: 8, isImmutable: true } + - { id: 4, offset: 288, size: 8, alignment: 16, isImmutable: true } + - { id: 5, offset: 280, size: 8, alignment: 8, isImmutable: true } + - { id: 6, offset: 272, size: 8, alignment: 16, isImmutable: true } + - { id: 7, offset: 264, size: 8, alignment: 8, isImmutable: true } + - { id: 8, offset: 256, size: 8, alignment: 16, isImmutable: true } + - { id: 9, offset: 248, size: 8, alignment: 8, isImmutable: true } + - { id: 10, offset: 240, size: 8, alignment: 16, isImmutable: true } + - { id: 11, offset: 232, size: 8, alignment: 8, isImmutable: true } + - { id: 12, offset: 224, size: 8, alignment: 16, isImmutable: true } + - { id: 13, offset: 216, size: 8, alignment: 8, 
isImmutable: true } + - { id: 14, offset: 208, size: 8, alignment: 16, isImmutable: true } + - { id: 15, offset: 200, size: 8, alignment: 8, isImmutable: true } + - { id: 16, offset: 192, size: 8, alignment: 16, isImmutable: true } + - { id: 17, offset: 184, size: 8, alignment: 8, isImmutable: true } + - { id: 18, offset: 176, size: 8, alignment: 16, isImmutable: true } + - { id: 19, offset: 168, size: 8, alignment: 8, isImmutable: true } + - { id: 20, offset: 160, size: 8, alignment: 16, isImmutable: true } + - { id: 21, offset: 152, size: 8, alignment: 8, isImmutable: true } + - { id: 22, offset: 144, size: 8, alignment: 16, isImmutable: true } + - { id: 23, offset: 136, size: 8, alignment: 8, isImmutable: true } + - { id: 24, offset: 128, size: 8, alignment: 16, isImmutable: true } + - { id: 25, offset: 120, size: 8, alignment: 8, isImmutable: true } + - { id: 26, offset: 112, size: 8, alignment: 16, isImmutable: true } + - { id: 27, offset: 104, size: 8, alignment: 8, isImmutable: true } + - { id: 28, offset: 96, size: 8, alignment: 16, isImmutable: true } +machineFunctionInfo: {} +body: | + ; CHECK-LABEL: name: foo + ; CHECK: bb.0 (%ir-block.37): + ; CHECK: successors: %bb.1(0x50000000), %bb.2(0x30000000) + ; CHECK: liveins: $x3, $x5, $x6, $x7, $x8, $x9, $x10 + ; CHECK: [[COPY:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY $x10 + ; CHECK: [[COPY1:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY $x9 + ; CHECK: [[COPY2:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY $x8 + ; CHECK: [[COPY3:%[0-9]+]]:g8rc = COPY $x7 + ; CHECK: [[COPY4:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY $x6 + ; CHECK: [[COPY5:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY $x5 + ; CHECK: [[COPY6:%[0-9]+]]:g8rc = COPY $x3 + ; CHECK: [[COPY7:%[0-9]+]]:gprc = COPY [[COPY3]].sub_32 + ; CHECK: [[CMPWI:%[0-9]+]]:crrc = CMPWI [[COPY7]], 1 + ; CHECK: BCC 12, killed [[CMPWI]], %bb.2 + ; CHECK: B %bb.1 + ; CHECK: bb.1 (%ir-block.39): + ; CHECK: successors: %bb.3(0x80000000) + ; CHECK: [[COPY8:%[0-9]+]]:gprc = COPY [[COPY6]].sub_32 + ; CHECK: 
[[LD:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.28 :: (load 8 from %fixed-stack.28, align 16) + ; CHECK: [[LD1:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.27 :: (load 8 from %fixed-stack.27) + ; CHECK: [[LD2:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.26 :: (load 8 from %fixed-stack.26, align 16) + ; CHECK: [[LD3:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.25 :: (load 8 from %fixed-stack.25) + ; CHECK: [[LD4:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.24 :: (load 8 from %fixed-stack.24, align 16) + ; CHECK: [[LD5:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.23 :: (load 8 from %fixed-stack.23) + ; CHECK: [[LD6:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.22 :: (load 8 from %fixed-stack.22, align 16) + ; CHECK: [[LD7:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.21 :: (load 8 from %fixed-stack.21) + ; CHECK: [[LD8:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.20 :: (load 8 from %fixed-stack.20, align 16) + ; CHECK: [[LD9:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.19 :: (load 8 from %fixed-stack.19) + ; CHECK: [[LD10:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.18 :: (load 8 from %fixed-stack.18, align 16) + ; CHECK: [[LD11:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.17 :: (load 8 from %fixed-stack.17) + ; CHECK: [[LD12:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.16 :: (load 8 from %fixed-stack.16, align 16) + ; CHECK: [[LD13:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.15 :: (load 8 from %fixed-stack.15) + ; CHECK: [[LD14:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.14 :: (load 8 from %fixed-stack.14, align 16) + ; CHECK: [[LD15:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.13 :: (load 8 from %fixed-stack.13) + ; CHECK: [[LD16:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.12 :: (load 8 from %fixed-stack.12, align 16) + ; CHECK: [[LD17:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.11 :: (load 8 from %fixed-stack.11) + ; CHECK: [[LD18:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, 
%fixed-stack.10 :: (load 8 from %fixed-stack.10, align 16) + ; CHECK: [[LD19:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.9 :: (load 8 from %fixed-stack.9) + ; CHECK: [[LD20:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.8 :: (load 8 from %fixed-stack.8, align 16) + ; CHECK: [[LD21:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.7 :: (load 8 from %fixed-stack.7) + ; CHECK: [[LD22:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.6 :: (load 8 from %fixed-stack.6, align 16) + ; CHECK: [[LD23:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.5 :: (load 8 from %fixed-stack.5) + ; CHECK: [[LD24:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.4 :: (load 8 from %fixed-stack.4, align 16) + ; CHECK: [[LD25:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.3 :: (load 8 from %fixed-stack.3) + ; CHECK: [[LD26:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.2 :: (load 8 from %fixed-stack.2, align 16) + ; CHECK: [[LD27:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.1 :: (load 8 from %fixed-stack.1) + ; CHECK: [[LD28:%[0-9]+]]:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.0 :: (load 8 from %fixed-stack.0, align 16) + ; CHECK: [[DEF:%[0-9]+]]:g8rc = IMPLICIT_DEF + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:g8rc = INSERT_SUBREG [[DEF]], [[COPY7]], %subreg.sub_32 + ; CHECK: [[RLDICL:%[0-9]+]]:g8rc = RLDICL killed [[INSERT_SUBREG]], 0, 32 + ; CHECK: [[ADDI8_:%[0-9]+]]:g8rc = ADDI8 [[COPY5]], -4 + ; CHECK: [[ADDI8_1:%[0-9]+]]:g8rc = ADDI8 [[COPY2]], -4 + ; CHECK: [[ADDI8_2:%[0-9]+]]:g8rc = ADDI8 [[COPY1]], -4 + ; CHECK: MTCTR8loop killed [[RLDICL]], implicit-def dead $ctr8 + ; CHECK: [[LI:%[0-9]+]]:gprc = LI 0 + ; CHECK: [[LI8_:%[0-9]+]]:g8rc = LI8 0 + ; CHECK: [[LIS:%[0-9]+]]:gprc = LIS 34952 + ; CHECK: [[ORI:%[0-9]+]]:gprc = ORI [[LIS]], 34953 + ; CHECK: [[CMPLWI:%[0-9]+]]:crrc = CMPLWI [[COPY8]], 3 + ; CHECK: [[CMPLWI1:%[0-9]+]]:crrc = CMPLWI [[COPY8]], 1 + ; CHECK: B %bb.3 + ; CHECK: bb.2 (%ir-block.41): + ; CHECK: [[LI8_1:%[0-9]+]]:g8rc = LI8 0 + ; CHECK: $x3 = COPY [[LI8_1]] 
+ ; CHECK: BLR8 implicit $lr8, implicit $rm, implicit $x3 + ; CHECK: bb.3 (%ir-block.42): + ; CHECK: successors: %bb.6(0x2aaaaaab), %bb.4(0x55555555) + ; CHECK: [[PHI:%[0-9]+]]:g8rc_and_g8rc_nox0 = PHI [[LI8_]], %bb.1, %21, %bb.8 + ; CHECK: [[PHI1:%[0-9]+]]:g8rc_and_g8rc_nox0 = PHI [[LI8_]], %bb.1, %20, %bb.8 + ; CHECK: [[PHI2:%[0-9]+]]:gprc = PHI [[LI]], %bb.1, %16, %bb.8 + ; CHECK: [[PHI3:%[0-9]+]]:g8rc_and_g8rc_nox0 = PHI [[ADDI8_]], %bb.1, %13, %bb.8 + ; CHECK: [[PHI4:%[0-9]+]]:g8rc_and_g8rc_nox0 = PHI [[ADDI8_1]], %bb.1, %11, %bb.8 + ; CHECK: [[PHI5:%[0-9]+]]:g8rc_and_g8rc_nox0 = PHI [[ADDI8_2]], %bb.1, %9, %bb.8 + ; CHECK: [[LWZU:%[0-9]+]]:gprc, [[LWZU1:%[0-9]+]]:g8rc_and_g8rc_nox0 = LWZU 4, [[PHI3]] :: (load 4 from %ir.53, !tbaa !2) + ; CHECK: [[COPY9:%[0-9]+]]:gprc_and_gprc_nor0 = COPY [[PHI1]].sub_32 + ; CHECK: [[ADD4_:%[0-9]+]]:gprc = nsw ADD4 killed [[LWZU]], [[PHI2]] + ; CHECK: BCC 76, [[CMPLWI]], %bb.6 + ; CHECK: B %bb.4 + ; CHECK: bb.4 (%ir-block.42): + ; CHECK: successors: %bb.5(0x40000001), %bb.7(0x3fffffff) + ; CHECK: BCC 68, [[CMPLWI1]], %bb.7 + ; CHECK: B %bb.5 + ; CHECK: bb.5 (%ir-block.59): + ; CHECK: successors: %bb.8(0x80000000) + ; CHECK: [[COPY10:%[0-9]+]]:gprc = COPY [[PHI1]].sub_32 + ; CHECK: [[RLWINM:%[0-9]+]]:gprc = RLWINM [[COPY10]], 1, 0, 30 + ; CHECK: B %bb.8 + ; CHECK: bb.6 (%ir-block.62): + ; CHECK: successors: %bb.8(0x80000000) + ; CHECK: [[ADDI:%[0-9]+]]:gprc = nuw nsw ADDI [[COPY9]], 100 + ; CHECK: B %bb.8 + ; CHECK: bb.7 (%ir-block.64): + ; CHECK: successors: %bb.8(0x80000000) + ; CHECK: [[MULHWU:%[0-9]+]]:gprc = MULHWU [[COPY9]], [[ORI]] + ; CHECK: [[RLWINM1:%[0-9]+]]:gprc = RLWINM [[MULHWU]], 28, 4, 31 + ; CHECK: [[MULLI:%[0-9]+]]:gprc = nuw nsw MULLI [[RLWINM1]], 30 + ; CHECK: [[SUBF:%[0-9]+]]:gprc = SUBF [[MULLI]], [[COPY9]] + ; CHECK: bb.8 (%ir-block.65): + ; CHECK: successors: %bb.3(0x7c000000), %bb.2(0x04000000) + ; CHECK: [[PHI6:%[0-9]+]]:gprc = PHI [[ADDI]], %bb.6, [[RLWINM]], %bb.5, [[SUBF]], %bb.7 + ; CHECK: 
[[ADDI8_3:%[0-9]+]]:g8rc_and_g8rc_nox0 = ADDI8 [[PHI5]], 4 + ; CHECK: [[COPY11:%[0-9]+]]:g8rc = COPY [[ADDI8_3]] + ; CHECK: [[ADDI8_4:%[0-9]+]]:g8rc_and_g8rc_nox0 = ADDI8 [[PHI4]], 4 + ; CHECK: [[COPY12:%[0-9]+]]:g8rc = COPY [[ADDI8_4]] + ; CHECK: [[COPY13:%[0-9]+]]:g8rc = COPY [[LWZU1]] + ; CHECK: [[ADD4_1:%[0-9]+]]:gprc = nsw ADD4 [[PHI6]], [[ADD4_]] + ; CHECK: [[LWZ:%[0-9]+]]:gprc = LWZ 0, [[ADDI8_4]] :: (load 4 from %ir.51, !tbaa !2) + ; CHECK: [[ADD4_2:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_1]], killed [[LWZ]] + ; CHECK: [[LWZ1:%[0-9]+]]:gprc = LWZ 0, [[ADDI8_3]] :: (load 4 from %ir.49, !tbaa !2) + ; CHECK: [[ADD4_3:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_2]], killed [[LWZ1]] + ; CHECK: [[LWZX:%[0-9]+]]:gprc = LWZX [[COPY]], [[PHI]] :: (load 4 from %ir.uglygep6061, !tbaa !2) + ; CHECK: [[ADD4_4:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_3]], killed [[LWZX]] + ; CHECK: [[LWZX1:%[0-9]+]]:gprc = LWZX [[LD28]], [[PHI]] :: (load 4 from %ir.uglygep5859, !tbaa !2) + ; CHECK: [[ADD4_5:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_4]], killed [[LWZX1]] + ; CHECK: [[LWZX2:%[0-9]+]]:gprc = LWZX [[LD27]], [[PHI]] :: (load 4 from %ir.uglygep5657, !tbaa !2) + ; CHECK: [[ADD4_6:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_5]], killed [[LWZX2]] + ; CHECK: [[LWZX3:%[0-9]+]]:gprc = LWZX [[LD26]], [[PHI]] :: (load 4 from %ir.uglygep5455, !tbaa !2) + ; CHECK: [[ADD4_7:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_6]], killed [[LWZX3]] + ; CHECK: [[LWZX4:%[0-9]+]]:gprc = LWZX [[LD25]], [[PHI]] :: (load 4 from %ir.uglygep5253, !tbaa !2) + ; CHECK: [[ADD4_8:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_7]], killed [[LWZX4]] + ; CHECK: [[LWZX5:%[0-9]+]]:gprc = LWZX [[LD24]], [[PHI]] :: (load 4 from %ir.uglygep5051, !tbaa !2) + ; CHECK: [[ADD4_9:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_8]], killed [[LWZX5]] + ; CHECK: [[LWZX6:%[0-9]+]]:gprc = LWZX [[LD23]], [[PHI]] :: (load 4 from %ir.uglygep4849, !tbaa !2) + ; CHECK: [[ADD4_10:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_9]], killed [[LWZX6]] + ; CHECK: 
[[LWZX7:%[0-9]+]]:gprc = LWZX [[LD22]], [[PHI]] :: (load 4 from %ir.uglygep4647, !tbaa !2) + ; CHECK: [[ADD4_11:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_10]], killed [[LWZX7]] + ; CHECK: [[LWZX8:%[0-9]+]]:gprc = LWZX [[LD21]], [[PHI]] :: (load 4 from %ir.uglygep4445, !tbaa !2) + ; CHECK: [[ADD4_12:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_11]], killed [[LWZX8]] + ; CHECK: [[LWZX9:%[0-9]+]]:gprc = LWZX [[LD20]], [[PHI]] :: (load 4 from %ir.uglygep4243, !tbaa !2) + ; CHECK: [[ADD4_13:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_12]], killed [[LWZX9]] + ; CHECK: [[LWZX10:%[0-9]+]]:gprc = LWZX [[LD19]], [[PHI]] :: (load 4 from %ir.uglygep4041, !tbaa !2) + ; CHECK: [[ADD4_14:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_13]], killed [[LWZX10]] + ; CHECK: [[LWZX11:%[0-9]+]]:gprc = LWZX [[LD18]], [[PHI]] :: (load 4 from %ir.uglygep3839, !tbaa !2) + ; CHECK: [[ADD4_15:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_14]], killed [[LWZX11]] + ; CHECK: [[LWZX12:%[0-9]+]]:gprc = LWZX [[LD17]], [[PHI]] :: (load 4 from %ir.uglygep3637, !tbaa !2) + ; CHECK: [[ADD4_16:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_15]], killed [[LWZX12]] + ; CHECK: [[LWZX13:%[0-9]+]]:gprc = LWZX [[LD16]], [[PHI]] :: (load 4 from %ir.uglygep3435, !tbaa !2) + ; CHECK: [[ADD4_17:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_16]], killed [[LWZX13]] + ; CHECK: [[LWZX14:%[0-9]+]]:gprc = LWZX [[LD15]], [[PHI]] :: (load 4 from %ir.uglygep3233, !tbaa !2) + ; CHECK: [[ADD4_18:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_17]], killed [[LWZX14]] + ; CHECK: [[LWZX15:%[0-9]+]]:gprc = LWZX [[LD14]], [[PHI]] :: (load 4 from %ir.uglygep3031, !tbaa !2) + ; CHECK: [[ADD4_19:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_18]], killed [[LWZX15]] + ; CHECK: [[LWZX16:%[0-9]+]]:gprc = LWZX [[LD13]], [[PHI]] :: (load 4 from %ir.uglygep2829, !tbaa !2) + ; CHECK: [[ADD4_20:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_19]], killed [[LWZX16]] + ; CHECK: [[LWZX17:%[0-9]+]]:gprc = LWZX [[LD12]], [[PHI]] :: (load 4 from %ir.uglygep2627, !tbaa !2) + ; CHECK: 
[[ADD4_21:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_20]], killed [[LWZX17]] + ; CHECK: [[LWZX18:%[0-9]+]]:gprc = LWZX [[LD11]], [[PHI]] :: (load 4 from %ir.uglygep2425, !tbaa !2) + ; CHECK: [[ADD4_22:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_21]], killed [[LWZX18]] + ; CHECK: [[LWZX19:%[0-9]+]]:gprc = LWZX [[LD10]], [[PHI]] :: (load 4 from %ir.uglygep2223, !tbaa !2) + ; CHECK: [[ADD4_23:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_22]], killed [[LWZX19]] + ; CHECK: [[LWZX20:%[0-9]+]]:gprc = LWZX [[LD9]], [[PHI]] :: (load 4 from %ir.uglygep2021, !tbaa !2) + ; CHECK: [[ADD4_24:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_23]], killed [[LWZX20]] + ; CHECK: [[LWZX21:%[0-9]+]]:gprc = LWZX [[LD8]], [[PHI]] :: (load 4 from %ir.uglygep1819, !tbaa !2) + ; CHECK: [[ADD4_25:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_24]], killed [[LWZX21]] + ; CHECK: [[LWZX22:%[0-9]+]]:gprc = LWZX [[LD7]], [[PHI]] :: (load 4 from %ir.uglygep1617, !tbaa !2) + ; CHECK: [[ADD4_26:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_25]], killed [[LWZX22]] + ; CHECK: [[LWZX23:%[0-9]+]]:gprc = LWZX [[LD6]], [[PHI]] :: (load 4 from %ir.uglygep1415, !tbaa !2) + ; CHECK: [[ADD4_27:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_26]], killed [[LWZX23]] + ; CHECK: [[LWZX24:%[0-9]+]]:gprc = LWZX [[LD5]], [[PHI]] :: (load 4 from %ir.uglygep1213, !tbaa !2) + ; CHECK: [[ADD4_28:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_27]], killed [[LWZX24]] + ; CHECK: [[LWZX25:%[0-9]+]]:gprc = LWZX [[LD4]], [[PHI]] :: (load 4 from %ir.uglygep1011, !tbaa !2) + ; CHECK: [[ADD4_29:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_28]], killed [[LWZX25]] + ; CHECK: [[LWZX26:%[0-9]+]]:gprc = LWZX [[LD3]], [[PHI]] :: (load 4 from %ir.uglygep89, !tbaa !2) + ; CHECK: [[ADD4_30:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_29]], killed [[LWZX26]] + ; CHECK: [[LWZX27:%[0-9]+]]:gprc = LWZX [[LD2]], [[PHI]] :: (load 4 from %ir.uglygep67, !tbaa !2) + ; CHECK: [[ADD4_31:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_30]], killed [[LWZX27]] + ; CHECK: [[LWZX28:%[0-9]+]]:gprc = LWZX [[LD1]], 
[[PHI]] :: (load 4 from %ir.uglygep45, !tbaa !2) + ; CHECK: [[ADD4_32:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_31]], killed [[LWZX28]] + ; CHECK: [[LWZX29:%[0-9]+]]:gprc = LWZX [[LD]], [[PHI]] :: (load 4 from %ir.uglygep23, !tbaa !2) + ; CHECK: [[ADD4_33:%[0-9]+]]:gprc = nsw ADD4 killed [[ADD4_32]], killed [[LWZX29]] + ; CHECK: STWX killed [[ADD4_33]], [[COPY4]], [[PHI]] :: (store 4 into %ir.uglygep1, !tbaa !2) + ; CHECK: [[ADDI8_5:%[0-9]+]]:g8rc = nuw nsw ADDI8 [[PHI1]], 1 + ; CHECK: [[ADDI8_6:%[0-9]+]]:g8rc = nuw nsw ADDI8 [[PHI]], 4 + ; CHECK: BDNZ8 %bb.3, implicit-def dead $ctr8, implicit $ctr8 + ; CHECK: B %bb.2 + bb.0 (%ir-block.37): + successors: %bb.1(0x50000000), %bb.2(0x30000000) + liveins: $x3, $x5, $x6, $x7, $x8, $x9, $x10 + + %29:g8rc_and_g8rc_nox0 = COPY $x10 + %28:g8rc_and_g8rc_nox0 = COPY $x9 + %27:g8rc_and_g8rc_nox0 = COPY $x8 + %26:g8rc = COPY $x7 + %25:g8rc_and_g8rc_nox0 = COPY $x6 + %24:g8rc_and_g8rc_nox0 = COPY $x5 + %22:g8rc = COPY $x3 + %30:gprc = COPY %22.sub_32 + %31:gprc = COPY %26.sub_32 + %60:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.0 :: (load 8 from %fixed-stack.0, align 16) + %59:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.1 :: (load 8 from %fixed-stack.1) + %58:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.2 :: (load 8 from %fixed-stack.2, align 16) + %57:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.3 :: (load 8 from %fixed-stack.3) + %56:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.4 :: (load 8 from %fixed-stack.4, align 16) + %55:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.5 :: (load 8 from %fixed-stack.5) + %54:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.6 :: (load 8 from %fixed-stack.6, align 16) + %53:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.7 :: (load 8 from %fixed-stack.7) + %52:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.8 :: (load 8 from %fixed-stack.8, align 16) + %51:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.9 :: (load 8 from %fixed-stack.9) + %50:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.10 :: (load 8 from %fixed-stack.10, align 16) + %49:g8rc_and_g8rc_nox0 = 
LD 0, %fixed-stack.11 :: (load 8 from %fixed-stack.11) + %48:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.12 :: (load 8 from %fixed-stack.12, align 16) + %47:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.13 :: (load 8 from %fixed-stack.13) + %46:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.14 :: (load 8 from %fixed-stack.14, align 16) + %45:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.15 :: (load 8 from %fixed-stack.15) + %44:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.16 :: (load 8 from %fixed-stack.16, align 16) + %43:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.17 :: (load 8 from %fixed-stack.17) + %42:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.18 :: (load 8 from %fixed-stack.18, align 16) + %41:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.19 :: (load 8 from %fixed-stack.19) + %40:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.20 :: (load 8 from %fixed-stack.20, align 16) + %39:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.21 :: (load 8 from %fixed-stack.21) + %38:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.22 :: (load 8 from %fixed-stack.22, align 16) + %37:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.23 :: (load 8 from %fixed-stack.23) + %36:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.24 :: (load 8 from %fixed-stack.24, align 16) + %35:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.25 :: (load 8 from %fixed-stack.25) + %34:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.26 :: (load 8 from %fixed-stack.26, align 16) + %33:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.27 :: (load 8 from %fixed-stack.27) + %32:g8rc_and_g8rc_nox0 = LD 0, %fixed-stack.28 :: (load 8 from %fixed-stack.28, align 16) + %61:crrc = CMPWI %31, 1 + BCC 12, killed %61, %bb.2 + B %bb.1 + + bb.1 (%ir-block.39): + %65:g8rc = IMPLICIT_DEF + %64:g8rc = INSERT_SUBREG %65, %31, %subreg.sub_32 + %66:g8rc = RLDICL killed %64, 0, 32 + %0:g8rc = ADDI8 %24, -4 + %1:g8rc = ADDI8 %27, -4 + %2:g8rc = ADDI8 %28, -4 + MTCTR8loop killed %66, implicit-def dead $ctr8 + %63:gprc = LI 0 + %62:g8rc = LI8 0 + %69:gprc = LIS 34952 + %70:gprc = ORI %69, 34953 + %74:crrc = CMPLWI %30, 3 + %75:crrc = 
CMPLWI %30, 1 + B %bb.3 + + bb.2 (%ir-block.41): + %142:g8rc = LI8 0 + $x3 = COPY %142 + BLR8 implicit $lr8, implicit $rm, implicit $x3 + + bb.3 (%ir-block.42): + successors: %bb.5(0x2aaaaaab), %bb.8(0x55555555) + + %3:g8rc_and_g8rc_nox0 = PHI %62, %bb.1, %21, %bb.7 + %4:g8rc_and_g8rc_nox0 = PHI %62, %bb.1, %20, %bb.7 + %5:gprc = PHI %63, %bb.1, %16, %bb.7 + %6:g8rc_and_g8rc_nox0 = PHI %0, %bb.1, %13, %bb.7 + %7:g8rc_and_g8rc_nox0 = PHI %1, %bb.1, %11, %bb.7 + %8:g8rc_and_g8rc_nox0 = PHI %2, %bb.1, %9, %bb.7 + %10:g8rc_and_g8rc_nox0 = ADDI8 %8, 4 + %9:g8rc = COPY %10 + %12:g8rc_and_g8rc_nox0 = ADDI8 %7, 4 + %11:g8rc = COPY %12 + %67:gprc, %68:g8rc_and_g8rc_nox0 = LWZU 4, %6 :: (load 4 from %ir.53, !tbaa !2) + %13:g8rc = COPY %68 + %14:gprc_and_gprc_nor0 = COPY %4.sub_32 + %71:gprc = MULHWU %14, %70 + %72:gprc = RLWINM %71, 28, 4, 31 + %73:gprc = nuw nsw MULLI killed %72, 30 + %15:gprc = SUBF killed %73, %14 + %16:gprc = nsw ADD4 killed %67, %5 + BCC 76, %74, %bb.5 + B %bb.8 + + bb.8 (%ir-block.42): + successors: %bb.4(0x40000001), %bb.6(0x3fffffff) + + BCC 68, %75, %bb.6 + B %bb.4 + + bb.4 (%ir-block.59): + %76:gprc = COPY %4.sub_32 + %17:gprc = RLWINM %76, 1, 0, 30 + B %bb.7 + + bb.5 (%ir-block.62): + %18:gprc = nuw nsw ADDI %14, 100 + B %bb.7 + + bb.6 (%ir-block.64): + + bb.7 (%ir-block.65): + successors: %bb.3(0x7c000000), %bb.2(0x04000000) + + %19:gprc = PHI %18, %bb.5, %17, %bb.4, %15, %bb.6 + %77:gprc = nsw ADD4 %19, %16 + %78:gprc = LWZ 0, %12 :: (load 4 from %ir.51, !tbaa !2) + %79:gprc = nsw ADD4 killed %77, killed %78 + %80:gprc = LWZ 0, %10 :: (load 4 from %ir.49, !tbaa !2) + %81:gprc = nsw ADD4 killed %79, killed %80 + %82:gprc = LWZX %29, %3 :: (load 4 from %ir.uglygep6061, !tbaa !2) + %83:gprc = nsw ADD4 killed %81, killed %82 + %84:gprc = LWZX %32, %3 :: (load 4 from %ir.uglygep5859, !tbaa !2) + %85:gprc = nsw ADD4 killed %83, killed %84 + %86:gprc = LWZX %33, %3 :: (load 4 from %ir.uglygep5657, !tbaa !2) + %87:gprc = nsw ADD4 killed %85, killed %86 
+ %88:gprc = LWZX %34, %3 :: (load 4 from %ir.uglygep5455, !tbaa !2) + %89:gprc = nsw ADD4 killed %87, killed %88 + %90:gprc = LWZX %35, %3 :: (load 4 from %ir.uglygep5253, !tbaa !2) + %91:gprc = nsw ADD4 killed %89, killed %90 + %92:gprc = LWZX %36, %3 :: (load 4 from %ir.uglygep5051, !tbaa !2) + %93:gprc = nsw ADD4 killed %91, killed %92 + %94:gprc = LWZX %37, %3 :: (load 4 from %ir.uglygep4849, !tbaa !2) + %95:gprc = nsw ADD4 killed %93, killed %94 + %96:gprc = LWZX %38, %3 :: (load 4 from %ir.uglygep4647, !tbaa !2) + %97:gprc = nsw ADD4 killed %95, killed %96 + %98:gprc = LWZX %39, %3 :: (load 4 from %ir.uglygep4445, !tbaa !2) + %99:gprc = nsw ADD4 killed %97, killed %98 + %100:gprc = LWZX %40, %3 :: (load 4 from %ir.uglygep4243, !tbaa !2) + %101:gprc = nsw ADD4 killed %99, killed %100 + %102:gprc = LWZX %41, %3 :: (load 4 from %ir.uglygep4041, !tbaa !2) + %103:gprc = nsw ADD4 killed %101, killed %102 + %104:gprc = LWZX %42, %3 :: (load 4 from %ir.uglygep3839, !tbaa !2) + %105:gprc = nsw ADD4 killed %103, killed %104 + %106:gprc = LWZX %43, %3 :: (load 4 from %ir.uglygep3637, !tbaa !2) + %107:gprc = nsw ADD4 killed %105, killed %106 + %108:gprc = LWZX %44, %3 :: (load 4 from %ir.uglygep3435, !tbaa !2) + %109:gprc = nsw ADD4 killed %107, killed %108 + %110:gprc = LWZX %45, %3 :: (load 4 from %ir.uglygep3233, !tbaa !2) </cut>
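The `bb.7` block in the CHECK lines above (`MULHWU`, `RLWINM ... 28, 4, 31`, `MULLI ... 30`, `SUBF`, fed by the `LIS 34952` / `ORI 34953` constant) reads like the usual multiply-by-reciprocal expansion of an unsigned 32-bit remainder by 30. The sketch below models those instructions in Python to check that reading. The helper names are made up for illustration, and the `rlwinm` model is an assumption-level simplification that only covers non-wrapping masks.

```python
MASK32 = 0xFFFFFFFF


def lis_ori(high16, low16):
    """LIS loads high16 << 16; ORI then ORs in low16 (32-bit view)."""
    return ((high16 << 16) | low16) & MASK32


def mulhwu(a, b):
    """High 32 bits of the unsigned 32x32 -> 64-bit product."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def rlwinm(value, shift, mask_begin, mask_end):
    """Rotate left by `shift`, then keep bits mask_begin..mask_end.

    Bits use PowerPC numbering (bit 0 is the most significant bit).
    Only the non-wrapping case mask_begin <= mask_end is modelled here.
    """
    value &= MASK32
    rotated = ((value << shift) | (value >> (32 - shift))) & MASK32
    mask = (MASK32 >> mask_begin) & (MASK32 << (31 - mask_end)) & MASK32
    return rotated & mask


def urem30_like_bb7(x):
    """Mirror the bb.7 sequence for a 32-bit unsigned input x."""
    magic = lis_ori(34952, 34953)                    # 0x88888889
    quotient = rlwinm(mulhwu(x, magic), 28, 4, 31)   # high word >> 4
    return (x - 30 * quotient) & MASK32              # MULLI 30, then SUBF


# Spot-check small values and the edges of the 32-bit range.
samples = list(range(1000)) + [2**31 - 1, 2**31, 2**32 - 30, 2**32 - 1]
assert all(urem30_like_bb7(x) == x % 30 for x in samples)
```

The constant equals ceil(2**36 / 30), which is why the high word of the product is shifted right by a further 4 bits before the multiply-and-subtract step.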

3 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/gnu-release-arm-spec2k6-O3_LTO - Build # 27 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O3_LTO. So far, this commit has regressed CI configurations:
 - tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O3_LTO

Culprit:
<cut>
commit c7207339a7dbce5b68f872064e624dcf1639ba46
Author: Wilco Dijkstra <wdijkstr(a)arm.com>
Date: Mon Oct 14 12:21:14 2019 +0000

    [ARM] Switch to default sched pressure algorithm

    Currently the Arm backend selects the alternative sched pressure algorithm. The issue is that this doesn't take register pressure into account, and so it causes significant additional spilling on Arm where there are only 14 allocatable registers. Building SPEC2006 showed significant codesize gains with the default pressure algorithm, so switch back to that. PR77308 shows ~800 fewer instructions. SPECINT2006 is ~0.6% faster on Cortex-A57 together with the other DImode patches. Overall SPEC codesize is 1.1% smaller.

    gcc/
    * config/arm/arm.c (arm_option_override): Don't override sched pressure algorithm.

    From-SVN: r276960
</cut>

Results regressed to (for first_bad == c7207339a7dbce5b68f872064e624dcf1639ba46)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO_marm -- artifacts/build-c7207339a7dbce5b68f872064e624dcf1639ba46/results_id: 1
# 410.bwaves,bwaves_base.default regressed by 108
# 454.calculix,calculix_base.default regressed by 105
# 482.sphinx3,sphinx_livepretend_base.default regressed by 104
# 436.cactusADM,cactusADM_base.default regressed by 116
# 444.namd,namd_base.default regressed by 103
# 435.gromacs,gromacs_base.default regressed by 106

from (for last_good == 7bd8bec53f0e43c7a7852c54650746e65324514b)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO_marm -- artifacts/build-7bd8bec53f0e43c7a7852c54650746e65324514b/results_id: 1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Results ID of last_good: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O3_LTO/1468
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Results ID of first_bad: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O3_LTO/1469
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-gcc-c7207339a7dbce5b68f872064e624dcf1639ba46
cd investigate-gcc-c7207339a7dbce5b68f872064e624dcf1639ba46

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

cd gcc

# Reproduce first_bad build
git checkout --detach c7207339a7dbce5b68f872064e624dcf1639ba46
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 7bd8bec53f0e43c7a7852c54650746e65324514b
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…

Full commit (up to 1000 lines):
<cut>
commit c7207339a7dbce5b68f872064e624dcf1639ba46
Author: Wilco Dijkstra <wdijkstr(a)arm.com>
Date: Mon Oct 14 12:21:14 2019 +0000

    [ARM] Switch to default sched pressure algorithm

    Currently the Arm backend selects the alternative sched pressure algorithm. The issue is that this doesn't take register pressure into account, and so it causes significant additional spilling on Arm where there are only 14 allocatable registers. Building SPEC2006 showed significant codesize gains with the default pressure algorithm, so switch back to that. PR77308 shows ~800 fewer instructions. SPECINT2006 is ~0.6% faster on Cortex-A57 together with the other DImode patches. Overall SPEC codesize is 1.1% smaller.

    gcc/
    * config/arm/arm.c (arm_option_override): Don't override sched pressure algorithm.

    From-SVN: r276960
---
 gcc/ChangeLog        | 5 +++++
 gcc/config/arm/arm.c | 5 -----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c2cbd4274ca..f07a0e61e6b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2019-10-14 Wilco Dijkstra <wdijkstr(a)arm.com>
+
+ * config/arm/arm.c (arm_option_override): Don't override sched
+ pressure algorithm.
+
 2019-10-14 Richard Biener <rguenther(a)suse.de>
 
  PR tree-optimization/92069
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e1a1ef9a2..394b1dd1902 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3555,11 +3555,6 @@ arm_option_override (void)
     global_options.x_param_values,
     global_options_set.x_param_values);
 
-  /* Use the alternative scheduling-pressure algorithm by default. */
-  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
-    global_options.x_param_values,
-    global_options_set.x_param_values);
-
   /* Look through ready list and all of queue for instructions
      relevant for L2 auto-prefetcher. */
   int param_sched_autopref_queue_depth;
</cut>
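When comparing several of these reports it can help to pull the per-benchmark entries out of the results block. The following is a minimal sketch, not part of the TCWG scripts: the function name and regular expression are hypothetical, and it assumes that the reported number is the new measurement as a percentage of the baseline (so 108 would be read as roughly 8% worse). The report itself does not state the unit, so treat that reading as an assumption.

```python
import re

# Matches entries such as "410.bwaves,bwaves_base.default regressed by 108".
# The symbol part may contain spaces, e.g. "[.] _ZN16ConstraintMatrix8add_lineEj".
ENTRY = re.compile(r"(\d+\.\w+),(.+?)\s+regressed by (\d+)")


def regressions(report_text):
    """Return (benchmark, symbol, percent_over_baseline) tuples.

    Assumption: the number after "regressed by" is a percentage of the
    baseline, so 100 would mean "unchanged".
    """
    return [
        (bench, symbol, int(value) - 100)
        for bench, symbol, value in ENTRY.findall(report_text)
    ]


report = (
    "# 410.bwaves,bwaves_base.default regressed by 108 "
    "# 454.calculix,calculix_base.default regressed by 105 "
    "# 436.cactusADM,cactusADM_base.default regressed by 116"
)

# List the largest regressions first.
for bench, symbol, delta in sorted(regressions(report), key=lambda r: -r[2]):
    print(f"{bench:16} {symbol:28} +{delta}%")
```

The pattern works on both the one-entry-per-line layout and the flattened single-line form in which these reports sometimes appear in the archive.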

3 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O2 - Build # 9 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2

Culprit:
<cut>
commit 2f69b78a578dad55f0fde3c184a3dc0ea615fd43
Author: Florian Hahn <flo(a)fhahn.com>
Date: Sun May 16 11:12:55 2021 +0100

    [VectorCombine] Add tests with and & urem guaranteeing idx is valid.
</cut>

Results regressed to (for first_bad == 2f69b78a578dad55f0fde3c184a3dc0ea615fd43)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# build_llvm true: -3
# true: 0
# benchmark -O2 -- artifacts/build-2f69b78a578dad55f0fde3c184a3dc0ea615fd43/results_id: 1
# 447.dealII,dealII_base.default regressed by 103
# 447.dealII,[.] _ZN16ConstraintMatrix8add_lineEj regressed by 113

from (for last_good == a39f85d118cc4c7045e710302115da034bb3cb22)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# build_llvm true: -3
# true: 0
# benchmark -O2 -- artifacts/build-a39f85d118cc4c7045e710302115da034bb3cb22/results_id: 1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2/1645
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2/1644
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-llvm-2f69b78a578dad55f0fde3c184a3dc0ea615fd43
cd investigate-llvm-2f69b78a578dad55f0fde3c184a3dc0ea615fd43

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach 2f69b78a578dad55f0fde3c184a3dc0ea615fd43
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach a39f85d118cc4c7045e710302115da034bb3cb22
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…

Full commit (up to 1000 lines):
<cut>
commit 2f69b78a578dad55f0fde3c184a3dc0ea615fd43
Author: Florian Hahn <flo(a)fhahn.com>
Date: Sun May 16 11:12:55 2021 +0100

    [VectorCombine] Add tests with and & urem guaranteeing idx is valid.
---
 .../AArch64/load-extractelement-scalarization.ll | 60 +++++++++++++++++++
 .../Transforms/VectorCombine/load-insert-store.ll | 68 ++++++++++++++++++++++
 2 files changed, 128 insertions(+)

diff --git a/llvm/test/Transforms/VectorCombine/AArch64/load-extractelement-scalarization.ll b/llvm/test/Transforms/VectorCombine/AArch64/load-extractelement-scalarization.ll
index 3f8e276f06ca..5e105031ec78 100644
--- a/llvm/test/Transforms/VectorCombine/AArch64/load-extractelement-scalarization.ll
+++ b/llvm/test/Transforms/VectorCombine/AArch64/load-extractelement-scalarization.ll
@@ -113,6 +113,66 @@ entry:
 
 declare void @llvm.assume(i1)
 
+define i32 @load_extract_idx_var_i64_known_valid_by_and(<4 x i32>* %x, i64 %idx) {
+; CHECK-LABEL: @load_extract_idx_var_i64_known_valid_by_and(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = and i64 [[IDX:%.*]], 3
+; CHECK-NEXT: [[LV:%.*]] = load <4 x i32>, <4 x i32>* [[X:%.*]], align 16
+; CHECK-NEXT: [[R:%.*]] = extractelement <4 x i32> [[LV]], i64 [[IDX_CLAMPED]]
+; CHECK-NEXT: ret i32 [[R]]
+;
+entry:
+  %idx.clamped = and i64 %idx, 3
+  %lv = load <4 x i32>, <4 x i32>* %x
+  %r = extractelement <4 x i32> %lv, i64 %idx.clamped
+  ret i32 %r
+}
+
+define i32 @load_extract_idx_var_i64_not_known_valid_by_and(<4 x i32>* %x, i64 %idx) {
+; CHECK-LABEL: @load_extract_idx_var_i64_not_known_valid_by_and(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = and i64 [[IDX:%.*]], 4
+; CHECK-NEXT: [[LV:%.*]] = load <4 x i32>, <4 x i32>* [[X:%.*]], align 16
+; CHECK-NEXT: [[R:%.*]] = extractelement <4 x i32> [[LV]], i64 [[IDX_CLAMPED]]
+; CHECK-NEXT: ret i32 [[R]]
+;
+entry:
+  %idx.clamped = and i64 %idx, 4
+  %lv = load <4 x i32>, <4 x i32>* %x
+  %r = extractelement <4 x i32> %lv, i64 %idx.clamped
+  ret i32 %r
+}
+
+define i32 @load_extract_idx_var_i64_known_valid_by_urem(<4 x i32>* %x, i64 %idx) {
+; CHECK-LABEL: @load_extract_idx_var_i64_known_valid_by_urem(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = urem i64 [[IDX:%.*]], 4
+; CHECK-NEXT: [[LV:%.*]] = load <4 x i32>, <4 x i32>* [[X:%.*]], align 16
+; CHECK-NEXT: [[R:%.*]] = extractelement <4 x i32> [[LV]], i64 [[IDX_CLAMPED]]
+; CHECK-NEXT: ret i32 [[R]]
+;
+entry:
+  %idx.clamped = urem i64 %idx, 4
+  %lv = load <4 x i32>, <4 x i32>* %x
+  %r = extractelement <4 x i32> %lv, i64 %idx.clamped
+  ret i32 %r
+}
+
+define i32 @load_extract_idx_var_i64_not_known_valid_by_urem(<4 x i32>* %x, i64 %idx) {
+; CHECK-LABEL: @load_extract_idx_var_i64_not_known_valid_by_urem(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = urem i64 [[IDX:%.*]], 5
+; CHECK-NEXT: [[LV:%.*]] = load <4 x i32>, <4 x i32>* [[X:%.*]], align 16
+; CHECK-NEXT: [[R:%.*]] = extractelement <4 x i32> [[LV]], i64 [[IDX_CLAMPED]]
+; CHECK-NEXT: ret i32 [[R]]
+;
+entry:
+  %idx.clamped = urem i64 %idx, 5
+  %lv = load <4 x i32>, <4 x i32>* %x
+  %r = extractelement <4 x i32> %lv, i64 %idx.clamped
+  ret i32 %r
+}
+
 define i32 @load_extract_idx_var_i32(<4 x i32>* %x, i32 %idx) {
 ; CHECK-LABEL: @load_extract_idx_var_i32(
 ; CHECK-NEXT: [[LV:%.*]] = load <4 x i32>, <4 x i32>* [[X:%.*]], align 16
diff --git a/llvm/test/Transforms/VectorCombine/load-insert-store.ll b/llvm/test/Transforms/VectorCombine/load-insert-store.ll
index e565bda0a08f..611d66978019 100644
--- a/llvm/test/Transforms/VectorCombine/load-insert-store.ll
+++ b/llvm/test/Transforms/VectorCombine/load-insert-store.ll
@@ -188,6 +188,74 @@ entry:
 
 declare void @llvm.assume(i1)
 
+define void @insert_store_nonconst_index_known_valid_by_and(<16 x i8>* %q, i8 zeroext %s, i32 %idx) {
+; CHECK-LABEL: @insert_store_nonconst_index_known_valid_by_and(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = load <16 x i8>, <16 x i8>* [[Q:%.*]], align 16
+; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; CHECK-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; CHECK-NEXT: store <16 x i8> [[VECINS]], <16 x i8>* [[Q]], align 16
+; CHECK-NEXT: ret void
+;
+entry:
+  %0 =
load <16 x i8>, <16 x i8>* %q + %idx.clamped = and i32 %idx, 7 + %vecins = insertelement <16 x i8> %0, i8 %s, i32 %idx.clamped + store <16 x i8> %vecins, <16 x i8>* %q + ret void +} + +define void @insert_store_nonconst_index_not_known_valid_by_and(<16 x i8>* %q, i8 zeroext %s, i32 %idx) { +; CHECK-LABEL: @insert_store_nonconst_index_not_known_valid_by_and( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[TMP0:%.*]] = load <16 x i8>, <16 x i8>* [[Q:%.*]], align 16 +; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 16 +; CHECK-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]] +; CHECK-NEXT: store <16 x i8> [[VECINS]], <16 x i8>* [[Q]], align 16 +; CHECK-NEXT: ret void +; +entry: + %0 = load <16 x i8>, <16 x i8>* %q + %idx.clamped = and i32 %idx, 16 + %vecins = insertelement <16 x i8> %0, i8 %s, i32 %idx.clamped + store <16 x i8> %vecins, <16 x i8>* %q + ret void +} + +define void @insert_store_nonconst_index_known_valid_by_urem(<16 x i8>* %q, i8 zeroext %s, i32 %idx) { +; CHECK-LABEL: @insert_store_nonconst_index_known_valid_by_urem( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[TMP0:%.*]] = load <16 x i8>, <16 x i8>* [[Q:%.*]], align 16 +; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = urem i32 [[IDX:%.*]], 16 +; CHECK-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]] +; CHECK-NEXT: store <16 x i8> [[VECINS]], <16 x i8>* [[Q]], align 16 +; CHECK-NEXT: ret void +; +entry: + %0 = load <16 x i8>, <16 x i8>* %q + %idx.clamped = urem i32 %idx, 16 + %vecins = insertelement <16 x i8> %0, i8 %s, i32 %idx.clamped + store <16 x i8> %vecins, <16 x i8>* %q + ret void +} + +define void @insert_store_nonconst_index_not_known_valid_by_urem(<16 x i8>* %q, i8 zeroext %s, i32 %idx) { +; CHECK-LABEL: @insert_store_nonconst_index_not_known_valid_by_urem( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[TMP0:%.*]] = load <16 x i8>, <16 x i8>* [[Q:%.*]], align 16 +; CHECK-NEXT: [[IDX_CLAMPED:%.*]] = urem i32 [[IDX:%.*]], 17 +; CHECK-NEXT: 
[[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]] +; CHECK-NEXT: store <16 x i8> [[VECINS]], <16 x i8>* [[Q]], align 16 +; CHECK-NEXT: ret void +; +entry: + %0 = load <16 x i8>, <16 x i8>* %q + %idx.clamped = urem i32 %idx, 17 + %vecins = insertelement <16 x i8> %0, i8 %s, i32 %idx.clamped + store <16 x i8> %vecins, <16 x i8>* %q + ret void +} + define void @insert_store_ptr_strip(<16 x i8>* %q, i8 zeroext %s) { ; CHECK-LABEL: @insert_store_ptr_strip( ; CHECK-NEXT: entry: </cut>
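
The tests in this commit pair each clamping operation (`and`, `urem`) with one constant that keeps the index inside the vector and one that does not. The following is a minimal Python sketch of that arithmetic only, not of the VectorCombine implementation; the helper names are hypothetical, and the constants are the ones used in the tests.

```python
def max_after_and(mask: int) -> int:
    # For unsigned x, (x & mask) never exceeds mask, and x == mask reaches it.
    return mask


def max_after_urem(divisor: int) -> int:
    # For unsigned x, (x % divisor) always lies in [0, divisor - 1].
    return divisor - 1


def index_known_valid(op: str, constant: int, num_elements: int) -> bool:
    """Return True if `idx <op> constant` is always below num_elements."""
    if op == "and":
        bound = max_after_and(constant)
    elif op == "urem":
        bound = max_after_urem(constant)
    else:
        raise ValueError(f"unsupported op: {op}")
    return bound < num_elements


# (operation, constant, vector length) taken from the eight new tests.
cases = [
    ("and", 3, 4), ("and", 4, 4), ("urem", 4, 4), ("urem", 5, 4),
    ("and", 7, 16), ("and", 16, 16), ("urem", 16, 16), ("urem", 17, 16),
]
results = [index_known_valid(op, c, n) for op, c, n in cases]
```

Under this model each `known_valid` test yields True and each `not_known_valid` test yields False: for example, `and i64 %idx, 4` can produce 4, which is out of range for a 4-element vector.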

3 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/gnu-release-arm-spec2k6-O3_LTO - Build # 28 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O3_LTO. So far, this commit has regressed CI configurations:
 - tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O3_LTO

Culprit:
<cut>
commit c7207339a7dbce5b68f872064e624dcf1639ba46
Author: Wilco Dijkstra <wdijkstr(a)arm.com>
Date:   Mon Oct 14 12:21:14 2019 +0000

    [ARM] Switch to default sched pressure algorithm

    Currently the Arm backend selects the alternative sched pressure algorithm.
    The issue is that this doesn't take register pressure into account, and so
    it causes significant additional spilling on Arm where there are only 14
    allocatable registers. Building SPEC2006 showed significant codesize gains
    with the default pressure algorithm, so switch back to that. PR77308 shows
    ~800 fewer instructions.

    SPECINT2006 is ~0.6% faster on Cortex-A57 together with the other DImode
    patches. Overall SPEC codesize is 1.1% smaller.

        gcc/
        * config/arm/arm.c (arm_option_override): Don't override sched
        pressure algorithm.

    From-SVN: r276960
</cut>

Results regressed to (for first_bad == c7207339a7dbce5b68f872064e624dcf1639ba46)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO_marm -- artifacts/build-c7207339a7dbce5b68f872064e624dcf1639ba46/results_id: 1
# 435.gromacs,gromacs_base.default regressed by 106
# 459.GemsFDTD,GemsFDTD_base.default regressed by 103
# 436.cactusADM,cactusADM_base.default regressed by 115
# 444.namd,namd_base.default regressed by 103
# 482.sphinx3,sphinx_livepretend_base.default regressed by 103
# 410.bwaves,bwaves_base.default regressed by 107
# 454.calculix,calculix_base.default regressed by 105

from (for last_good == 7bd8bec53f0e43c7a7852c54650746e65324514b)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO_marm -- artifacts/build-7bd8bec53f0e43c7a7852c54650746e65324514b/results_id: 1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Results ID of last_good: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O3_LTO/1638
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Results ID of first_bad: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O3_LTO/1643
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-gcc-c7207339a7dbce5b68f872064e624dcf1639ba46
cd investigate-gcc-c7207339a7dbce5b68f872064e624dcf1639ba46

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/

cd gcc

# Reproduce first_bad build
git checkout --detach c7207339a7dbce5b68f872064e624dcf1639ba46
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 7bd8bec53f0e43c7a7852c54650746e65324514b
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…

Full commit (up to 1000 lines):
<cut>
commit c7207339a7dbce5b68f872064e624dcf1639ba46
Author: Wilco Dijkstra <wdijkstr(a)arm.com>
Date:   Mon Oct 14 12:21:14 2019 +0000

    [ARM] Switch to default sched pressure algorithm

    Currently the Arm backend selects the alternative sched pressure algorithm.
    The issue is that this doesn't take register pressure into account, and so
    it causes significant additional spilling on Arm where there are only 14
    allocatable registers. Building SPEC2006 showed significant codesize gains
    with the default pressure algorithm, so switch back to that. PR77308 shows
    ~800 fewer instructions.

    SPECINT2006 is ~0.6% faster on Cortex-A57 together with the other DImode
    patches. Overall SPEC codesize is 1.1% smaller.

        gcc/
        * config/arm/arm.c (arm_option_override): Don't override sched
        pressure algorithm.

    From-SVN: r276960
---
 gcc/ChangeLog        | 5 +++++
 gcc/config/arm/arm.c | 5 -----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c2cbd4274ca..f07a0e61e6b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2019-10-14 Wilco Dijkstra <wdijkstr(a)arm.com>
+
+        * config/arm/arm.c (arm_option_override): Don't override sched
+        pressure algorithm.
+
 2019-10-14 Richard Biener <rguenther(a)suse.de>

         PR tree-optimization/92069
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e1a1ef9a2..394b1dd1902 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3555,11 +3555,6 @@ arm_option_override (void)
                          global_options.x_param_values,
                          global_options_set.x_param_values);

-  /* Use the alternative scheduling-pressure algorithm by default. */
-  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
-                         global_options.x_param_values,
-                         global_options_set.x_param_values);
-
   /* Look through ready list and all of queue for instructions
      relevant for L2 auto-prefetcher. */
   int param_sched_autopref_queue_depth;
</cut>
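
For triage it can help to pull the `regressed by` entries out of a notification like the one above. The following is a minimal Python sketch, assuming entries have the shape `# <benchmark>,<symbol> regressed by <N>` as seen in this message; the function name and regular expression are hypothetical and are not part of the TCWG scripts. The message does not state the unit of `<N>`, so the code only extracts the number and does not interpret it.

```python
import re

# Matches entries such as "# 435.gromacs,gromacs_base.default regressed by 106".
# Works on both line-per-entry and flattened single-line text.
REGRESSION_RE = re.compile(r"#\s*(\S+?),(\S+)\s+regressed by\s+(\d+)")


def parse_regressions(report: str) -> list[tuple[str, str, int]]:
    """Return (benchmark, symbol, value) triples in order of appearance."""
    return [(bench, sym, int(val)) for bench, sym, val in REGRESSION_RE.findall(report)]


# Sample copied from the notification above.
sample = """
# true: 0
# benchmark -O3_LTO_marm -- artifacts/build-c7207339a7dbce5b68f872064e624dcf1639ba46/results_id: 1
# 435.gromacs,gromacs_base.default regressed by 106
# 459.GemsFDTD,GemsFDTD_base.default regressed by 103
# 436.cactusADM,cactusADM_base.default regressed by 115
# 444.namd,namd_base.default regressed by 103
# 482.sphinx3,sphinx_livepretend_base.default regressed by 103
# 410.bwaves,bwaves_base.default regressed by 107
# 454.calculix,calculix_base.default regressed by 105
"""

regressions = parse_regressions(sample)
# Largest reported value first, so the most affected benchmark is at the top.
worst_first = sorted(regressions, key=lambda entry: entry[2], reverse=True)
```

Build-step lines such as `# true: 0` contain no comma-separated benchmark name and are skipped by the pattern.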

3 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/gnu-master-aarch64-spec2k6-O3_LTO - Build # 20 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tx1/gnu-master-aarch64-spec2k6-O3_LTO. So far, this commit has regressed CI configurations:
 - tcwg_bmk_gnu_tx1/gnu-master-aarch64-spec2k6-O3_LTO

Culprit:
<cut>
commit fedcf3c476aff7533741a1c61071200f0a38cf83
Author: Richard Biener <rguenther(a)suse.de>
Date:   Thu Jul 8 09:52:49 2021 +0200

    tree-optimization/101373 - avoid PRE across externally throwing call

    PRE already tries to avoid hoisting possibly trapping expressions
    across calls that might not return normally but fails to consider
    const calls that throw externally. The following fixes that and
    also plugs the hole of trapping references not pruned in case they
    are not catched by the actuall call clobbering it.

    At -Os we hit the same issue in RTL PRE and postreload-gcse has
    even more incomplete checks so the patch adjusts both of those
    as well.

    2021-07-08 Richard Biener <rguenther(a)suse.de>

            PR tree-optimization/101373
            * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping
            references when the BB may not return.
            (compute_avail): Pass in the function we're working on and
            replace cfun references with it. Externally throwing
            const calls also possibly terminate the function.
            (pass_pre::execute): Pass down the function we're working on.
            * gcse.c (compute_hash_table_work): Externally throwing
            const/pure calls also need record_last_mem_set_info.
            * postreload-gcse.c (record_opr_changes): Looping or externally
            throwing const/pure calls also need record_last_mem_set_info.

            * g++.dg/torture/pr101373.C: New testcase, XFAILed.
            * gnat.dg/opt95.adb: Likewise.
</cut>

Results regressed to (for first_bad == fedcf3c476aff7533741a1c61071200f0a38cf83)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO -- artifacts/build-fedcf3c476aff7533741a1c61071200f0a38cf83/results_id: 1
# 429.mcf,mcf_base.default regressed by 106

from (for last_good == fe610051a803131822bd02a8842a67b573b8e46a)
# reset_artifacts: -10
# build_abe binutils: -9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8
# build_abe linux: -7
# build_abe glibc: -6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5
# true: 0
# benchmark -O3_LTO -- artifacts/build-fe610051a803131822bd02a8842a67b573b8e46a/results_id: 1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa…
Results ID of last_good: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-master-aarch64-spec2k6-O3_LTO/1618
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa…
Results ID of first_bad: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-master-aarch64-spec2k6-O3_LTO/1613
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-gcc-fedcf3c476aff7533741a1c61071200f0a38cf83
cd investigate-gcc-fedcf3c476aff7533741a1c61071200f0a38cf83

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/

cd gcc

# Reproduce first_bad build
git checkout --detach fedcf3c476aff7533741a1c61071200f0a38cf83
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach fe610051a803131822bd02a8842a67b573b8e46a
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-master-aa…

Full commit (up to 1000 lines):
<cut>
commit fedcf3c476aff7533741a1c61071200f0a38cf83
Author: Richard Biener <rguenther(a)suse.de>
Date:   Thu Jul 8 09:52:49 2021 +0200

    tree-optimization/101373 - avoid PRE across externally throwing call

    PRE already tries to avoid hoisting possibly trapping expressions
    across calls that might not return normally but fails to consider
    const calls that throw externally. The following fixes that and
    also plugs the hole of trapping references not pruned in case they
    are not catched by the actuall call clobbering it.

    At -Os we hit the same issue in RTL PRE and postreload-gcse has
    even more incomplete checks so the patch adjusts both of those
    as well.

    2021-07-08 Richard Biener <rguenther(a)suse.de>

            PR tree-optimization/101373
            * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping
            references when the BB may not return.
            (compute_avail): Pass in the function we're working on and
            replace cfun references with it. Externally throwing
            const calls also possibly terminate the function.
            (pass_pre::execute): Pass down the function we're working on.
            * gcse.c (compute_hash_table_work): Externally throwing
            const/pure calls also need record_last_mem_set_info.
            * postreload-gcse.c (record_opr_changes): Looping or externally
            throwing const/pure calls also need record_last_mem_set_info.

            * g++.dg/torture/pr101373.C: New testcase, XFAILed.
            * gnat.dg/opt95.adb: Likewise.
---
 gcc/gcse.c                              |  3 ++-
 gcc/postreload-gcse.c                   |  4 +++-
 gcc/testsuite/g++.dg/torture/pr101373.C | 33 +++++++++++++++++++++++++++
 gcc/testsuite/gnat.dg/opt95.adb         | 40 +++++++++++++++++++++++++++++++++
 gcc/tree-ssa-pre.c                      | 34 +++++++++++++++++-----------
 5 files changed, 99 insertions(+), 15 deletions(-)

diff --git a/gcc/gcse.c b/gcc/gcse.c
index ecf7e51aac5..ccd33664af5 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -1537,7 +1537,8 @@ compute_hash_table_work (struct gcse_hash_table_d *table)
                 record_last_reg_set_info (insn, regno);

               if (! RTL_CONST_OR_PURE_CALL_P (insn)
-                  || RTL_LOOPING_CONST_OR_PURE_CALL_P (insn))
+                  || RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
+                  || can_throw_external (insn))
                 record_last_mem_set_info (insn);
             }

diff --git a/gcc/postreload-gcse.c b/gcc/postreload-gcse.c
index 0b28247e299..6c95d09a1e5 100644
--- a/gcc/postreload-gcse.c
+++ b/gcc/postreload-gcse.c
@@ -779,7 +779,9 @@ record_opr_changes (rtx_insn *insn)
       EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
         record_last_reg_set_info_regno (insn, regno);

-      if (! RTL_CONST_OR_PURE_CALL_P (insn))
+      if (! RTL_CONST_OR_PURE_CALL_P (insn)
+          || RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
+          || can_throw_external (insn))
         record_last_mem_set_info (insn);
     }
 }
diff --git a/gcc/testsuite/g++.dg/torture/pr101373.C b/gcc/testsuite/g++.dg/torture/pr101373.C
new file mode 100644
index 00000000000..f8c809739e2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr101373.C
@@ -0,0 +1,33 @@
+// { dg-do run }
+// { dg-xfail-run-if "PR100409" { *-*-* } }
+
+int __attribute__((const,noipa)) foo (int j)
+{
+  if (j != 0)
+    throw 1;
+  return 0;
+}
+
+int __attribute__((noipa)) bar (int *p, int n)
+{
+  int ret = 0;
+  if (n)
+    {
+      foo (n);
+      ret = *p;
+    }
+  ret += *p;
+  return ret;
+}
+
+int main()
+{
+  try
+    {
+      return bar (nullptr, 1);
+    }
+  catch (...)
+    {
+      return 0;
+    }
+}
diff --git a/gcc/testsuite/gnat.dg/opt95.adb b/gcc/testsuite/gnat.dg/opt95.adb
new file mode 100644
index 00000000000..2c72582b3f1
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/opt95.adb
@@ -0,0 +1,40 @@
+-- { dg-do run }
+-- { dg-options "-O2 -gnatp" }
+
+procedure Opt95 is
+
+   function Foo (J : Integer) return Integer;
+   pragma Pure_Function (Foo);
+   pragma Machine_Attribute (Foo, "noipa");
+
+   function Foo (J : Integer) return Integer is
+   begin
+      if J /= 0 then
+         raise Constraint_Error;
+      end if;
+      return 0;
+   end;
+
+   function Bar (A : access Integer; N : Integer) return Integer;
+   pragma Machine_Attribute (Bar, "noipa");
+
+   function Bar (A : access Integer; N : Integer) return Integer is
+      Ret : Integer := 0;
+      Ret2 : Integer := 0;
+   begin
+      if N /= 0 then
+         Ret2 := Foo (N);
+         Ret := A.all;
+      end if;
+      Ret := Ret + A.all;
+      return Ret + Ret2;
+   end;
+
+   V : Integer;
+   pragma Volatile (V);
+
+begin
+   V := Bar (null, 1);
+exception
+   when Constraint_Error => null;
+end;
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 69141c2f0c9..aa5244e678c 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -2071,6 +2071,13 @@ prune_clobbered_mems (bitmap_set_t set, basic_block block)
                       && value_dies_in_block_x (expr, block))))
                 to_remove = i;
             }
+          /* If the REFERENCE may trap make sure the block does not contain
+             a possible exit point.
+             ??? This is overly conservative if we translate AVAIL_OUT
+             as the available expression might be after the exit point. */
+          if (BB_MAY_NOTRETURN (block)
+              && vn_reference_may_trap (ref))
+            to_remove = i;
         }
       else if (expr->kind == NARY)
         {
@@ -3860,7 +3867,7 @@ insert (void)
    AVAIL_OUT[BLOCK] = AVAIL_IN[BLOCK] U PHI_GEN[BLOCK] U TMP_GEN[BLOCK]. */

 static void
-compute_avail (void)
+compute_avail (function *fun)
 {
   basic_block block, son;

@@ -3871,7 +3878,7 @@ compute_avail (void)

   /* We pretend that default definitions are defined in the entry block.
      This includes function arguments and the static chain decl. */
-  FOR_EACH_SSA_NAME (i, name, cfun)
+  FOR_EACH_SSA_NAME (i, name, fun)
     {
       pre_expr e;
       if (!SSA_NAME_IS_DEFAULT_DEF (name)
@@ -3881,31 +3888,31 @@ compute_avail (void)

       e = get_or_alloc_expr_for_name (name);
       add_to_value (get_expr_value_id (e), e);
-      bitmap_insert_into_set (TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (cfun)), e);
-      bitmap_value_insert_into_set (AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+      bitmap_insert_into_set (TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (fun)), e);
+      bitmap_value_insert_into_set (AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (fun)),
                                     e);
     }

   if (dump_file && (dump_flags & TDF_DETAILS))
     {
-      print_bitmap_set (dump_file, TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+      print_bitmap_set (dump_file, TMP_GEN (ENTRY_BLOCK_PTR_FOR_FN (fun)),
                         "tmp_gen", ENTRY_BLOCK);
-      print_bitmap_set (dump_file, AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
+      print_bitmap_set (dump_file, AVAIL_OUT (ENTRY_BLOCK_PTR_FOR_FN (fun)),
                         "avail_out", ENTRY_BLOCK);
     }

   /* Allocate the worklist. */
-  worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
+  worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (fun));

   /* Seed the algorithm by putting the dominator children of the entry
      block on the worklist. */
-  for (son = first_dom_son (CDI_DOMINATORS, ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  for (son = first_dom_son (CDI_DOMINATORS, ENTRY_BLOCK_PTR_FOR_FN (fun));
        son;
        son = next_dom_son (CDI_DOMINATORS, son))
     worklist[sp++] = son;

-  BB_LIVE_VOP_ON_EXIT (ENTRY_BLOCK_PTR_FOR_FN (cfun))
-    = ssa_default_def (cfun, gimple_vop (cfun));
+  BB_LIVE_VOP_ON_EXIT (ENTRY_BLOCK_PTR_FOR_FN (fun))
+    = ssa_default_def (fun, gimple_vop (fun));

   /* Loop until the worklist is empty. */
   while (sp)
@@ -3970,7 +3977,8 @@ compute_avail (void)
                      before it. */
                   int flags = gimple_call_flags (stmt);
                   if (!(flags & ECF_CONST)
-                      || (flags & ECF_LOOPING_CONST_OR_PURE))
+                      || (flags & ECF_LOOPING_CONST_OR_PURE)
+                      || stmt_can_throw_external (fun, stmt))
                     BB_MAY_NOTRETURN (block) = 1;
                 }

@@ -3987,7 +3995,7 @@ compute_avail (void)
                 BB_LIVE_VOP_ON_EXIT (block) = gimple_vdef (stmt);

               if (gimple_has_side_effects (stmt)
-                  || stmt_could_throw_p (cfun, stmt)
+                  || stmt_could_throw_p (fun, stmt)
                   || is_gimple_debug (stmt))
                 continue;

@@ -4384,7 +4392,7 @@ pass_pre::execute (function *fun)
      we require AVAIL. */
   if (n_basic_blocks_for_fn (fun) < 4000)
     {
-      compute_avail ();
+      compute_avail (fun);
       compute_antic ();
       insert ();
     }
</cut>
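
The hazard described in the commit message can be modelled outside the compiler. The following is a minimal Python sketch that mirrors the shape of `pr101373.C`: a null-pointer load is simulated with one exception type and the C++ `throw` with another. The function `bar_with_hoisted_load` is a hypothetical illustration of a load moved above the throwing call, not actual GCC output, and all class and helper names are assumptions made for this example.

```python
class NullDereference(Exception):
    """Stands in for the trap caused by loading through a null pointer."""


class Thrown(Exception):
    """Stands in for the C++ exception thrown by foo."""


def load(p):
    # Simulated *p: traps when p is null (None).
    if p is None:
        raise NullDereference()
    return p[0]


def foo(j):
    # Marked const in the test case, yet it throws for non-zero input.
    if j != 0:
        raise Thrown()
    return 0


def bar_as_written(p, n):
    ret = 0
    if n:
        foo(n)          # throws before *p is ever loaded
        ret = load(p)
    ret += load(p)
    return ret


def bar_with_hoisted_load(p, n):
    # Hypothetical: the load is treated as safe to move above the call to foo.
    tmp = load(p)       # now runs before foo can throw
    ret = 0
    if n:
        foo(n)
        ret = tmp
    ret += tmp
    return ret


def run(bar):
    # Mirrors main(): call bar (nullptr, 1) inside a try block.
    try:
        bar(None, 1)
        return "returned"
    except Thrown:
        return "caught exception"
    except NullDereference:
        return "trapped"


outcome_original = run(bar_as_written)
outcome_hoisted = run(bar_with_hoisted_load)
```

In this model the original ordering ends in the exception handler, while the hoisted ordering traps first, which is the behaviour change the patch is meant to prevent by treating externally throwing const calls as possible exit points.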

3 years, 11 months
