linaro-toolchain July 2021

linaro-toolchain@lists.linaro.org

19 participants
99 discussions

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-master-arm-spec2k6-O3_LTO - Build # 14 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O3_LTO. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O3_LTO

Culprit:
<cut>
commit c77230856eac2d28eb7bf10985846885c3c8727b
Author: Iain Buclaw <ibuclaw(a)gdcproject.org>
Date:   Sat Jul 3 00:13:29 2021 +0200

    d: RHS value lost when a target_expr modifies LHS in a cond_expr

    To prevent the RHS of an assignment modifying the LHS before the assignment
    proper, a target_expr is forced so that function calls that return with
    slot optimization modify the temporary instead.  This did not work for
    conditional expressions however, to give one example.  So now the RHS is
    always forced to a temporary.

            PR d/101282

    gcc/d/ChangeLog:

            * d-codegen.cc (build_assign): Force target_expr on RHS for non-POD
            assignment expressions.

    gcc/testsuite/ChangeLog:

            * gdc.dg/torture/pr101282.d: New test.
</cut>

Results regressed to (for first_bad == c77230856eac2d28eb7bf10985846885c3c8727b)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O3_LTO_marm -- artifacts/build-c77230856eac2d28eb7bf10985846885c3c8727b/results_id:
1
# 447.dealII,dealII_base.default regressed by 103

from (for last_good == 6feb628a706e86eb3f303aff388c74bdb29e7381)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O3_LTO_marm -- artifacts/build-6feb628a706e86eb3f303aff388c74bdb29e7381/results_id:
1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…
Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O3_LTO/1951
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…
Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O3_LTO/1938
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-gcc-c77230856eac2d28eb7bf10985846885c3c8727b
cd investigate-gcc-c77230856eac2d28eb7bf10985846885c3c8727b

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/

cd gcc

# Reproduce first_bad build
git checkout --detach c77230856eac2d28eb7bf10985846885c3c8727b
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 6feb628a706e86eb3f303aff388c74bdb29e7381
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…

Full commit (up to 1000 lines):
<cut>
commit c77230856eac2d28eb7bf10985846885c3c8727b
Author: Iain Buclaw <ibuclaw(a)gdcproject.org>
Date:   Sat Jul 3 00:13:29 2021 +0200

    d: RHS value lost when a target_expr modifies LHS in a cond_expr

    To prevent the RHS of an assignment modifying the LHS before the assignment
    proper, a target_expr is forced so that function calls that return with
    slot optimization modify the temporary instead.  This did not work for
    conditional expressions however, to give one example.  So now the RHS is
    always forced to a temporary.

            PR d/101282

    gcc/d/ChangeLog:

            * d-codegen.cc (build_assign): Force target_expr on RHS for non-POD
            assignment expressions.

    gcc/testsuite/ChangeLog:

            * gdc.dg/torture/pr101282.d: New test.
---
 gcc/d/d-codegen.cc                      |  7 +++++++
 gcc/testsuite/gdc.dg/torture/pr101282.d | 23 +++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
index 9a9447371aa..ce7c17baaaf 100644
--- a/gcc/d/d-codegen.cc
+++ b/gcc/d/d-codegen.cc
@@ -1344,6 +1344,13 @@ build_assign (tree_code code, tree lhs, tree rhs)
           d_mark_addressable (lhs);
           CALL_EXPR_RETURN_SLOT_OPT (rhs) = true;
         }
+      /* If modifying an LHS whose type is marked TREE_ADDRESSABLE.  */
+      else if (code == MODIFY_EXPR && TREE_ADDRESSABLE (TREE_TYPE (lhs))
+               && TREE_SIDE_EFFECTS (rhs) && TREE_CODE (rhs) != TARGET_EXPR)
+        {
+          /* LHS may be referenced by the RHS expression, so force a temporary.  */
+          rhs = force_target_expr (rhs);
+        }

       /* The LHS assignment replaces the temporary in TARGET_EXPR_SLOT.  */
       if (TREE_CODE (rhs) == TARGET_EXPR)
diff --git a/gcc/testsuite/gdc.dg/torture/pr101282.d b/gcc/testsuite/gdc.dg/torture/pr101282.d
new file mode 100644
index 00000000000..b75d5fc678f
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/torture/pr101282.d
@@ -0,0 +1,23 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101282
+// { dg-do run }
+
+void main()
+{
+    struct S101282
+    {
+        int impl;
+        S101282 opUnary(string op : "-")()
+        {
+            return S101282(-impl);
+        }
+        int opCmp(int i)
+        {
+            return (impl < i) ? -1 : (impl > i) ? 1 : 0;
+        }
+    }
+    auto a = S101282(120);
+    a = -a;
+    assert(a.impl == -120);
+    a = a >= 0 ? a : -a;
+    assert(a.impl == 120);
+}
</cut>
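
Editorial illustration (not part of the CI notification): the culprit commit message explains that the RHS of an assignment must not modify the LHS before the assignment proper, and that a temporary is forced to guarantee this. The sketch below is only a rough C++ analogy of that hazard, not the GDC front-end code. The names S, negate_into, assign_in_place, assign_via_temporary and run_examples are hypothetical, and the explicit slot pointer merely imitates a callee that builds its result directly in caller-provided storage.

```cpp
#include <cassert>

// Hypothetical stand-in for the struct in pr101282.d; not GCC code.
struct S {
    int impl;
};

// Builds the negated value of `src` directly in `*slot`, imitating a callee
// that constructs its return value in storage supplied by the caller.
// The slot is initialised before `src` is read, which is the hazardous order.
void negate_into(S *slot, const S &src) {
    slot->impl = 0;          // result object is initialised first
    slot->impl = -src.impl;  // the source is only read afterwards
}

// Hazard: the result slot is the assignment target itself, so the value the
// RHS still needs has already been overwritten when it is read.
int assign_in_place(int start) {
    S a{start};
    negate_into(&a, a);  // slot aliases src
    return a.impl;
}

// Remedy sketched by the commit message: evaluate the RHS into a temporary,
// then perform the assignment proper.
int assign_via_temporary(int start) {
    S a{start};
    S tmp{0};
    negate_into(&tmp, a);  // RHS only ever writes the temporary
    a = tmp;               // assignment proper
    return a.impl;
}

// Small self-check that can be called from any main().
void run_examples() {
    assert(assign_in_place(120) == 0);          // original value was lost
    assert(assign_via_temporary(120) == -120);  // value preserved
}
```

Calling run_examples() from a main() exercises both paths. In this analogy the in-place variant loses the original value, while the temporary variant keeps it, which is the behaviour the new torture test asserts for `a = a >= 0 ? a : -a;`.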


[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O3 - Build # 6 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3

Culprit:
<cut>
commit 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
Author: Roman Lebedev <lebedev.ri(a)gmail.com>
Date:   Mon Oct 12 22:19:17 2020 +0300

    Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"

    This relands commit 1c021c64caef83cccb719c9bf0a2554faa6563af
    which was reverted in commit 17cec6a11a12f815052d56a17ef738cf246a2d9a
    because an assertion was being triggered, since `BuildConstantFromSCEV()`
    wasn't updated to handle the case where the constant we want to truncate
    is actually a pointer.

    I was unsuccessful in coming up with a test case where we'd end there
    with constant zext/sext of a pointer, so i didn't handle those cases
    there until there is a test case.

    Original commit message:

    While we indeed can't treat them as no-ops, i believe we can/should
    do better than just modelling them as `unknown`. `inttoptr` story
    is complicated, but for `ptrtoint`, it seems straight-forward
    to model it just as a zext-or-trunc of unknown.

    This may be important now that we track towards
    making inttoptr/ptrtoint casts not no-op,
    and towards preventing folding them into loads/etc
    (see D88979/D88789/D88788)

    Reviewed By: mkazantsev

    Differential Revision: https://reviews.llvm.org/D88806
</cut>

Results regressed to (for first_bad == 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O3 -- artifacts/build-1fb610429308a7c29c5065f5cc35dcc3fd69c8b1/results_id:
1
# 400.perlbench,perlbench_base.default regressed by 104

from (for last_good == 73818f450e3a90fc89eca143ee30777ed7e660e9)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O3 -- artifacts/build-73818f450e3a90fc89eca143ee30777ed7e660e9/results_id:
1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3/1930
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3/1947
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-llvm-1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
cd investigate-llvm-1fb610429308a7c29c5065f5cc35dcc3fd69c8b1

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 73818f450e3a90fc89eca143ee30777ed7e660e9
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Full commit (up to 1000 lines):
<cut>
commit 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1
Author: Roman Lebedev <lebedev.ri(a)gmail.com>
Date:   Mon Oct 12 22:19:17 2020 +0300

    Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"

    This relands commit 1c021c64caef83cccb719c9bf0a2554faa6563af
    which was reverted in commit 17cec6a11a12f815052d56a17ef738cf246a2d9a
    because an assertion was being triggered, since `BuildConstantFromSCEV()`
    wasn't updated to handle the case where the constant we want to truncate
    is actually a pointer.

    I was unsuccessful in coming up with a test case where we'd end there
    with constant zext/sext of a pointer, so i didn't handle those cases
    there until there is a test case.

    Original commit message:

    While we indeed can't treat them as no-ops, i believe we can/should
    do better than just modelling them as `unknown`. `inttoptr` story
    is complicated, but for `ptrtoint`, it seems straight-forward
    to model it just as a zext-or-trunc of unknown.

    This may be important now that we track towards
    making inttoptr/ptrtoint casts not no-op,
    and towards preventing folding them into loads/etc
    (see D88979/D88789/D88788)

    Reviewed By: mkazantsev

    Differential Revision: https://reviews.llvm.org/D88806
---
 llvm/lib/Analysis/ScalarEvolution.cpp              |  50 ++++++--
 llvm/lib/Transforms/Utils/SimplifyIndVar.cpp       |   2 +-
 .../add-expr-pointer-operand-sorting.ll            |   4 +-
 .../Analysis/ScalarEvolution/no-wrap-add-exprs.ll  |   4 +-
 .../ScalarEvolution/ptrtoint-constantexpr-loop.ll  | 130 ++++++++-------------
 llvm/test/Analysis/ScalarEvolution/ptrtoint.ll     |  60 +++++-----
 llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll        |   4 +-
 llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll      |   4 +-
 .../IndVarSimplify/2011-11-01-lftrptr.ll           |  16 +--
 .../Isl/CodeGen/scev_looking_through_bitcasts.ll   |   3 +-
 10 files changed, 140 insertions(+), 137 deletions(-)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 1d3e26b93cb6..74bffc0facdb 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -3505,15 +3505,15 @@ const SCEV *ScalarEvolution::getUMinExpr(SmallVectorImpl<const SCEV *> &Ops) {
 }

 const SCEV *ScalarEvolution::getSizeOfExpr(Type *IntTy, Type *AllocTy) {
-  // We can bypass creating a target-independent
-  // constant expression and then folding it back into a ConstantInt.
-  // This is just a compile-time optimization.
   if (isa<ScalableVectorType>(AllocTy)) {
     Constant *NullPtr = Constant::getNullValue(AllocTy->getPointerTo());
     Constant *One = ConstantInt::get(IntTy, 1);
     Constant *GEP = ConstantExpr::getGetElementPtr(AllocTy, NullPtr, One);
-    return getSCEV(ConstantExpr::getPtrToInt(GEP, IntTy));
+    return getUnknown(ConstantExpr::getPtrToInt(GEP, IntTy));
   }
+  // We can bypass creating a target-independent
+  // constant expression and then folding it back into a ConstantInt.
+  // This is just a compile-time optimization.
   return getConstant(IntTy, getDataLayout().getTypeAllocSize(AllocTy));
 }

@@ -6301,6 +6301,36 @@ const SCEV *ScalarEvolution::createSCEV(Value *V) {
       return getSCEV(U->getOperand(0));
     break;

+  case Instruction::PtrToInt: {
+    // It's tempting to handle inttoptr and ptrtoint as no-ops,
+    // however this can lead to pointer expressions which cannot safely be
+    // expanded to GEPs because ScalarEvolution doesn't respect
+    // the GEP aliasing rules when simplifying integer expressions.
+    //
+    // However, given
+    //   %x = ???
+    //   %y = ptrtoint %x
+    //   %z = ptrtoint %x
+    // it is safe to say that %y and %z are the same thing.
+    //
+    // So instead of modelling the cast itself as unknown,
+    // since the casts are transparent within SCEV,
+    // we can at least model the casts original value as unknow instead.
+
+    // BUT, there's caveat. If we simply model %x as unknown, unrelated uses
+    // of %x will also see it as unknown, which is obviously bad.
+    // So we can only do this iff %x would be modelled as unknown anyways.
+    auto *OpSCEV = getSCEV(U->getOperand(0));
+    if (isa<SCEVUnknown>(OpSCEV))
+      return getTruncateOrZeroExtend(OpSCEV, U->getType());
+    // If we can model the operand, however, we must fallback to modelling
+    // the whole cast as unknown instead.
+    LLVM_FALLTHROUGH;
+  }
+  case Instruction::IntToPtr:
+    // We can't do this for inttoptr at all, however.
+    return getUnknown(V);
+
   case Instruction::SDiv:
     // If both operands are non-negative, this is just an udiv.
     if (isKnownNonNegative(getSCEV(U->getOperand(0))) &&
@@ -6315,11 +6345,6 @@ const SCEV *ScalarEvolution::createSCEV(Value *V) {
       return getURemExpr(getSCEV(U->getOperand(0)), getSCEV(U->getOperand(1)));
     break;

-  // It's tempting to handle inttoptr and ptrtoint as no-ops, however this can
-  // lead to pointer expressions which cannot safely be expanded to GEPs,
-  // because ScalarEvolution doesn't respect the GEP aliasing rules when
-  // simplifying integer expressions.
-
   case Instruction::GetElementPtr:
     return createNodeForGEP(cast<GEPOperator>(U));

@@ -7974,8 +7999,11 @@ static Constant *BuildConstantFromSCEV(const SCEV *V) {
   }
   case scTruncate: {
     const SCEVTruncateExpr *ST = cast<SCEVTruncateExpr>(V);
-    if (Constant *CastOp = BuildConstantFromSCEV(ST->getOperand()))
-      return ConstantExpr::getTrunc(CastOp, ST->getType());
+    if (Constant *CastOp = BuildConstantFromSCEV(ST->getOperand())) {
+      if (!CastOp->getType()->isPointerTy())
+        return ConstantExpr::getTrunc(CastOp, ST->getType());
+      return ConstantExpr::getPtrToInt(CastOp, ST->getType());
+    }
     break;
   }
   case scAddExpr: {
diff --git a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
index 2d71b0fff889..3e280a66175c 100644
--- a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
@@ -427,7 +427,7 @@ static bool willNotOverflow(ScalarEvolution *SE, Instruction::BinaryOps BinOp,
                           : &ScalarEvolution::getZeroExtendExpr;

   // Check ext(LHS op RHS) == ext(LHS) op ext(RHS)
-  auto *NarrowTy = cast<IntegerType>(LHS->getType());
+  auto *NarrowTy = cast<IntegerType>(SE->getEffectiveSCEVType(LHS->getType()));
   auto *WideTy =
     IntegerType::get(NarrowTy->getContext(), NarrowTy->getBitWidth() * 2);

diff --git a/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll b/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll
index 93a3bf4d4c37..e798e2715ba1 100644
---
a/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll +++ b/llvm/test/Analysis/ScalarEvolution/add-expr-pointer-operand-sorting.ll @@ -33,9 +33,9 @@ define i32 @d(i32 %base) { ; CHECK-NEXT: %1 = load i32*, i32** @c, align 8 ; CHECK-NEXT: --> %1 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %sub.ptr.lhs.cast = ptrtoint i32* %1 to i64 -; CHECK-NEXT: --> %sub.ptr.lhs.cast U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } +; CHECK-NEXT: --> %1 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, ptrtoint ([1 x i32]* @b to i64) -; CHECK-NEXT: --> ((-1 * ptrtoint ([1 x i32]* @b to i64)) + %sub.ptr.lhs.cast) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } +; CHECK-NEXT: --> ((-1 * @b) + %1) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %sub.ptr.div = sdiv exact i64 %sub.ptr.sub, 4 ; CHECK-NEXT: --> %sub.ptr.div U: full-set S: [-2305843009213693952,2305843009213693952) Exits: <<Unknown>> LoopDispositions: { %for.cond: Variant } ; CHECK-NEXT: %arrayidx1 = getelementptr inbounds [1 x i8], [1 x i8]* %arrayidx, i64 0, i64 %sub.ptr.div diff --git a/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll b/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll index 5a7bb3c9e5cd..eb669cab0c79 100644 --- a/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll +++ b/llvm/test/Analysis/ScalarEvolution/no-wrap-add-exprs.ll @@ -170,14 +170,14 @@ define void @f3(i8* %x_addr, i8* %y_addr, i32* %tmp_addr) { %int5 = add i32 %int0, 5 %int.zext = zext i32 %int5 to i64 ; CHECK: %int.zext = zext i32 %int5 to i64 -; CHECK-NEXT: --> (1 + (zext i32 (4 + %int0) to i64))<nuw><nsw> U: [1,4294967294) S: [1,4294967297) +; CHECK-NEXT: --> (1 + (zext i32 (4 + (trunc [16 x i8]* @z_addr to i32)) to i64))<nuw><nsw> U: 
[1,4294967294) S: [1,4294967297) %ptr_noalign = bitcast [16 x i8]* @z_addr_noalign to i8* %int0_na = ptrtoint i8* %ptr_noalign to i32 %int5_na = add i32 %int0_na, 5 %int.zext_na = zext i32 %int5_na to i64 ; CHECK: %int.zext_na = zext i32 %int5_na to i64 -; CHECK-NEXT: --> (zext i32 (5 + %int0_na) to i64) U: [0,4294967296) S: [0,4294967296) +; CHECK-NEXT: --> (zext i32 (5 + (trunc [16 x i8]* @z_addr_noalign to i32)) to i64) U: [0,4294967296) S: [0,4294967296) %tmp = load i32, i32* %tmp_addr %mul = and i32 %tmp, -4 diff --git a/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll b/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll index 8cfa041e7552..d0ead6028071 100644 --- a/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll +++ b/llvm/test/Analysis/ScalarEvolution/ptrtoint-constantexpr-loop.ll @@ -11,48 +11,31 @@ @global = external hidden global [0 x i8] define hidden i32* @i64(i8* %arg, i32* %arg10) { -; PTR64_IDX64-LABEL: 'i64' -; PTR64_IDX64-NEXT: Classifying expressions for: @i64 -; PTR64_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR64_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 -; 
PTR64_IDX64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: Determining loop execution counts for: @i64 -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. -; -; PTR64_IDX32-LABEL: 'i64' -; PTR64_IDX32-NEXT: Classifying expressions for: @i64 -; PTR64_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR64_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 -; PTR64_IDX32-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: Determining loop execution counts for: @i64 -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
+; X64-LABEL: 'i64' +; X64-NEXT: Classifying expressions for: @i64 +; X64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] +; X64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 +; X64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } +; X64-NEXT: %tmp18 = add i32 %tmp, 2 +; X64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: Determining loop execution counts for: @i64 +; X64-NEXT: Loop %bb11: Unpredictable backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
; ; PTR32_IDX32-LABEL: 'i64' ; PTR32_IDX32-NEXT: Classifying expressions for: @i64 ; PTR32_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR32_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX32-NEXT: --> ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i64 ptrtoint ([0 x i8]* @global to i64) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -67,9 +50,9 @@ define hidden i32* @i64(i8* %arg, i32* %arg10) { ; PTR32_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i64 ptrtoint ([0 x i8]* @global to i64) -; PTR32_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: [0,8589934591) S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: 
(@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX64-NEXT: --> (ptrtoint ([0 x i8]* @global to i64) + %arg) U: [0,8589934591) S: full-set Exits: (ptrtoint ([0 x i8]* @global to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 @@ -103,9 +86,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR64_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR64_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR64_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR64_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: full-set S: full-set Exits: ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX64-NEXT: %tmp14 = load i32, i32* 
%tmp13, align 4 ; PTR64_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR64_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 @@ -120,9 +103,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR64_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR64_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR64_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR64_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR64_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR64_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR64_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR64_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -137,9 +120,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR32_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR32_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: 
Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX32-NEXT: --> (ptrtoint ([0 x i8]* @global to i32) + %arg) U: full-set S: full-set Exits: (ptrtoint ([0 x i8]* @global to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -154,9 +137,9 @@ define hidden i32* @i64_to_i32(i8* %arg, i32* %arg10) { ; PTR32_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i32 ptrtoint ([0 x i8]* @global to i32) -; PTR32_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX64-NEXT: --> ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: ((sext i32 ptrtoint ([0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> ((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) U: [-2147483648,6442450943) S: full-set Exits: 
((sext i32 (trunc [0 x i8]* @global to i32) to i64) + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 @@ -185,48 +168,31 @@ bb17: ; preds = %bb11 br label %bb11 } define hidden i32* @i64_to_i128(i8* %arg, i32* %arg10) { -; PTR64_IDX64-LABEL: 'i64_to_i128' -; PTR64_IDX64-NEXT: Classifying expressions for: @i64_to_i128 -; PTR64_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR64_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 -; PTR64_IDX64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX64-NEXT: Determining loop execution counts for: @i64_to_i128 -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
-; -; PTR64_IDX32-LABEL: 'i64_to_i128' -; PTR64_IDX32-NEXT: Classifying expressions for: @i64_to_i128 -; PTR64_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] -; PTR64_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR64_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR64_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } -; PTR64_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 -; PTR64_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } -; PTR64_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 -; PTR64_IDX32-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } -; PTR64_IDX32-NEXT: Determining loop execution counts for: @i64_to_i128 -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable max backedge-taken count. -; PTR64_IDX32-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
+; X64-LABEL: 'i64_to_i128' +; X64-NEXT: Classifying expressions for: @i64_to_i128 +; X64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] +; X64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* +; X64-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } +; X64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 +; X64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } +; X64-NEXT: %tmp18 = add i32 %tmp, 2 +; X64-NEXT: --> {2,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } +; X64-NEXT: Determining loop execution counts for: @i64_to_i128 +; X64-NEXT: Loop %bb11: Unpredictable backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable max backedge-taken count. +; X64-NEXT: Loop %bb11: Unpredictable predicated backedge-taken count. 
; ; PTR32_IDX32-LABEL: 'i64_to_i128' ; PTR32_IDX32-NEXT: Classifying expressions for: @i64_to_i128 ; PTR32_IDX32-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX32-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX32-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR32_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX32-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) U: full-set S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i32) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX32-NEXT: --> (@global + %arg) U: full-set S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX32-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX32-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX32-NEXT: %tmp18 = add i32 %tmp, 2 @@ -241,9 +207,9 @@ define hidden i32* @i64_to_i128(i8* %arg, i32* %arg10) { ; PTR32_IDX64-NEXT: %tmp = phi i32 [ 0, %bb ], [ %tmp18, %bb17 ] ; PTR32_IDX64-NEXT: --> {0,+,2}<%bb11> U: [0,-1) S: [-2147483648,2147483647) Exits: <<Unknown>> LoopDispositions: { %bb11: Computable } ; PTR32_IDX64-NEXT: %tmp12 = getelementptr i8, i8* %arg, i128 ptrtoint ([0 x i8]* @global to i128) -; PTR32_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: [0,8589934591) S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; 
PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp13 = bitcast i8* %tmp12 to i32* -; PTR32_IDX64-NEXT: --> ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) U: [0,8589934591) S: full-set Exits: ((trunc i128 ptrtoint ([0 x i8]* @global to i128) to i64) + %arg) LoopDispositions: { %bb11: Invariant } +; PTR32_IDX64-NEXT: --> (@global + %arg) U: [0,8589934591) S: full-set Exits: (@global + %arg) LoopDispositions: { %bb11: Invariant } ; PTR32_IDX64-NEXT: %tmp14 = load i32, i32* %tmp13, align 4 ; PTR32_IDX64-NEXT: --> %tmp14 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb11: Variant } ; PTR32_IDX64-NEXT: %tmp18 = add i32 %tmp, 2 diff --git a/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll b/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll index e3e9330e241f..ac08fb24775e 100644 --- a/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll +++ b/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll @@ -16,25 +16,25 @@ define void @ptrtoint(i8* %in, i64* %out0, i32* %out1, i16* %out2, i128* %out3) ; X64-LABEL: 'ptrtoint' ; X64-NEXT: Classifying expressions for: @ptrtoint ; X64-NEXT: %p0 = ptrtoint i8* %in to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: %p1 = ptrtoint i8* %in to i32 -; X64-NEXT: --> %p1 U: full-set S: full-set +; X64-NEXT: --> (trunc i8* %in to i32) U: full-set S: full-set ; X64-NEXT: %p2 = ptrtoint i8* %in to i16 -; X64-NEXT: --> %p2 U: full-set S: full-set +; X64-NEXT: --> (trunc i8* %in to i16) U: full-set S: full-set ; X64-NEXT: %p3 = ptrtoint i8* %in to i128 -; X64-NEXT: --> %p3 U: [0,18446744073709551616) S: [-18446744073709551616,18446744073709551616) +; X64-NEXT: --> (zext i8* %in to i128) U: [0,18446744073709551616) S: [0,18446744073709551616) ; X64-NEXT: Determining loop execution counts for: @ptrtoint ; ; X32-LABEL: 'ptrtoint' ; X32-NEXT: Classifying expressions for: 
@ptrtoint ; X32-NEXT: %p0 = ptrtoint i8* %in to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: %p1 = ptrtoint i8* %in to i32 -; X32-NEXT: --> %p1 U: full-set S: full-set +; X32-NEXT: --> %in U: full-set S: full-set ; X32-NEXT: %p2 = ptrtoint i8* %in to i16 -; X32-NEXT: --> %p2 U: full-set S: full-set +; X32-NEXT: --> (trunc i8* %in to i16) U: full-set S: full-set ; X32-NEXT: %p3 = ptrtoint i8* %in to i128 -; X32-NEXT: --> %p3 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in to i128) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint ; %p0 = ptrtoint i8* %in to i64 @@ -53,25 +53,25 @@ define void @ptrtoint_as1(i8 addrspace(1)* %in, i64* %out0, i32* %out1, i16* %ou ; X64-LABEL: 'ptrtoint_as1' ; X64-NEXT: Classifying expressions for: @ptrtoint_as1 ; X64-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: %p1 = ptrtoint i8 addrspace(1)* %in to i32 -; X64-NEXT: --> %p1 U: full-set S: full-set +; X64-NEXT: --> (trunc i8 addrspace(1)* %in to i32) U: full-set S: full-set ; X64-NEXT: %p2 = ptrtoint i8 addrspace(1)* %in to i16 -; X64-NEXT: --> %p2 U: full-set S: full-set +; X64-NEXT: --> (trunc i8 addrspace(1)* %in to i16) U: full-set S: full-set ; X64-NEXT: %p3 = ptrtoint i8 addrspace(1)* %in to i128 -; X64-NEXT: --> %p3 U: [0,18446744073709551616) S: [-18446744073709551616,18446744073709551616) +; X64-NEXT: --> (zext i8 addrspace(1)* %in to i128) U: [0,18446744073709551616) S: [0,18446744073709551616) ; X64-NEXT: Determining loop execution counts for: @ptrtoint_as1 ; ; X32-LABEL: 'ptrtoint_as1' ; X32-NEXT: Classifying expressions for: @ptrtoint_as1 ; X32-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8 
addrspace(1)* %in to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: %p1 = ptrtoint i8 addrspace(1)* %in to i32 -; X32-NEXT: --> %p1 U: full-set S: full-set +; X32-NEXT: --> %in U: full-set S: full-set ; X32-NEXT: %p2 = ptrtoint i8 addrspace(1)* %in to i16 -; X32-NEXT: --> %p2 U: full-set S: full-set +; X32-NEXT: --> (trunc i8 addrspace(1)* %in to i16) U: full-set S: full-set ; X32-NEXT: %p3 = ptrtoint i8 addrspace(1)* %in to i128 -; X32-NEXT: --> %p3 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8 addrspace(1)* %in to i128) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_as1 ; %p0 = ptrtoint i8 addrspace(1)* %in to i64 @@ -92,7 +92,7 @@ define void @ptrtoint_of_bitcast(i8* %in, i64* %out0) { ; X64-NEXT: %in_casted = bitcast i8* %in to float* ; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: %p0 = ptrtoint float* %in_casted to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in U: full-set S: full-set ; X64-NEXT: Determining loop execution counts for: @ptrtoint_of_bitcast ; ; X32-LABEL: 'ptrtoint_of_bitcast' @@ -100,7 +100,7 @@ define void @ptrtoint_of_bitcast(i8* %in, i64* %out0) { ; X32-NEXT: %in_casted = bitcast i8* %in to float* ; X32-NEXT: --> %in U: full-set S: full-set ; X32-NEXT: %p0 = ptrtoint float* %in_casted to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_bitcast ; %in_casted = bitcast i8* %in to float* @@ -116,7 +116,7 @@ define void @ptrtoint_of_addrspacecast(i8* %in, i64* %out0) { ; X64-NEXT: %in_casted = addrspacecast i8* %in to i8 addrspace(1)* ; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in_casted to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: Determining loop 
execution counts for: @ptrtoint_of_addrspacecast ; ; X32-LABEL: 'ptrtoint_of_addrspacecast' @@ -124,7 +124,7 @@ define void @ptrtoint_of_addrspacecast(i8* %in, i64* %out0) { ; X32-NEXT: %in_casted = addrspacecast i8* %in to i8 addrspace(1)* ; X32-NEXT: --> %in_casted U: full-set S: full-set ; X32-NEXT: %p0 = ptrtoint i8 addrspace(1)* %in_casted to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8 addrspace(1)* %in_casted to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_addrspacecast ; %in_casted = addrspacecast i8* %in to i8 addrspace(1)* @@ -140,7 +140,7 @@ define void @ptrtoint_of_inttoptr(i64 %in, i64* %out0) { ; X64-NEXT: %in_casted = inttoptr i64 %in to i8* ; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: %p0 = ptrtoint i8* %in_casted to i64 -; X64-NEXT: --> %p0 U: full-set S: full-set +; X64-NEXT: --> %in_casted U: full-set S: full-set ; X64-NEXT: Determining loop execution counts for: @ptrtoint_of_inttoptr ; ; X32-LABEL: 'ptrtoint_of_inttoptr' @@ -148,7 +148,7 @@ define void @ptrtoint_of_inttoptr(i64 %in, i64* %out0) { ; X32-NEXT: %in_casted = inttoptr i64 %in to i8* ; X32-NEXT: --> %in_casted U: full-set S: full-set ; X32-NEXT: %p0 = ptrtoint i8* %in_casted to i64 -; X32-NEXT: --> %p0 U: [0,4294967296) S: [-4294967296,4294967296) +; X32-NEXT: --> (zext i8* %in_casted to i64) U: [0,4294967296) S: [0,4294967296) ; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_inttoptr ; %in_casted = inttoptr i64 %in to i8* @@ -197,11 +197,17 @@ define void @ptrtoint_of_nullptr(i64* %out0) { ; A constant inttoptr argument of an ptrtoint is still bad. 
define void @ptrtoint_of_constantexpr_inttoptr(i64* %out0) { -; ALL-LABEL: 'ptrtoint_of_constantexpr_inttoptr' -; ALL-NEXT: Classifying expressions for: @ptrtoint_of_constantexpr_inttoptr -; ALL-NEXT: %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 -; ALL-NEXT: --> %p0 U: [42,43) S: [-64,64) -; ALL-NEXT: Determining loop execution counts for: @ptrtoint_of_constantexpr_inttoptr +; X64-LABEL: 'ptrtoint_of_constantexpr_inttoptr' +; X64-NEXT: Classifying expressions for: @ptrtoint_of_constantexpr_inttoptr +; X64-NEXT: %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 +; X64-NEXT: --> inttoptr (i64 42 to i8*) U: [42,43) S: [-64,64) +; X64-NEXT: Determining loop execution counts for: @ptrtoint_of_constantexpr_inttoptr +; +; X32-LABEL: 'ptrtoint_of_constantexpr_inttoptr' +; X32-NEXT: Classifying expressions for: @ptrtoint_of_constantexpr_inttoptr +; X32-NEXT: %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 +; X32-NEXT: --> (zext i8* inttoptr (i64 42 to i8*) to i64) U: [42,43) S: [0,4294967296) +; X32-NEXT: Determining loop execution counts for: @ptrtoint_of_constantexpr_inttoptr ; %p0 = ptrtoint i8* inttoptr (i64 42 to i8*) to i64 store i64 %p0, i64* %out0 diff --git a/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll b/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll index 564328d99998..e73397214475 100644 --- a/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll +++ b/llvm/test/CodeGen/ARM/lsr-undef-in-binop.ll @@ -186,7 +186,9 @@ define linkonce_odr i32 @vector_insert(%"class.std::__1::vector.182"*, [1 x i32] br i1 %114, label %124, label %115 ; CHECK-LABEL: .preheader: -; CHECK-NEXT: sub i32 [[OLD_CAST]], [[NEW_CAST]] +; CHECK-NEXT: [[NEG_NEW:%[0-9]+]] = sub i32 0, [[NEW_CAST]] +; CHECK-NEXT: getelementptr i8, i8* %97, i32 [[NEG_NEW]] + ; <label>:115: ; preds = %111, %115 %116 = phi i8* [ %118, %115 ], [ %97, %111 ] %117 = phi i8* [ %119, %115 ], [ %11, %111 ] diff --git a/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll b/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll index 
670477c4c285..d4dd7352aa52 100644 --- a/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll +++ b/llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll @@ -268,9 +268,9 @@ define i8* @SyFgets(i8* %line, i64 %length, i64 %fid) { ; CHECK-NEXT: LBB0_48: ## %if.then1477 ; CHECK-NEXT: movl $1, %edx ; CHECK-NEXT: callq _write -; CHECK-NEXT: subq %rbx, %r14 ; CHECK-NEXT: movq _syHistory@{{.*}}(%rip), %rax -; CHECK-NEXT: leaq 8189(%r14,%rax), %rax +; CHECK-NEXT: subq %rbx, %rax +; CHECK-NEXT: leaq 8189(%rax,%r14), %rax ; CHECK-NEXT: .p2align 4, 0x90 ; CHECK-NEXT: LBB0_49: ## %for.body1723 ; CHECK-NEXT: ## =>This Inner Loop Header: Depth=1 diff --git a/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll b/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll index e1ef6bd6635d..bc756c666bde 100644 --- a/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll +++ b/llvm/test/Transforms/IndVarSimplify/2011-11-01-lftrptr.ll @@ -166,21 +166,23 @@ define i8 @testnullptrint(i8* %buf, i8* %end) nounwind { ; PTR64-NEXT: ret i8 [[RET]] ; ; PTR32-LABEL: @testnullptrint( +; PTR32-NEXT: [[BUF1:%.*]] = ptrtoint i8* [[BUF:%.*]] to i32 ; PTR32-NEXT: br label [[LOOPGUARD:%.*]] ; PTR32: loopguard: -; PTR32-NEXT: [[BI:%.*]] = ptrtoint i8* [[BUF:%.*]] to i32 +; PTR32-NEXT: [[BI:%.*]] = ptrtoint i8* [[BUF]] to i32 ; PTR32-NEXT: [[EI:%.*]] = ptrtoint i8* [[END:%.*]] to i32 ; PTR32-NEXT: [[CNT:%.*]] = sub i32 [[EI]], [[BI]] -; PTR32-NEXT: [[CNT1:%.*]] = inttoptr i32 [[CNT]] to i8* ; PTR32-NEXT: [[GUARD:%.*]] = icmp ult i32 0, [[CNT]] ; PTR32-NEXT: br i1 [[GUARD]], label [[PREHEADER:%.*]], label [[EXIT:%.*]] ; PTR32: preheader: +; PTR32-NEXT: [[TMP1:%.*]] = sub i32 0, [[BUF1]] +; PTR32-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, i8* [[END]], i32 [[TMP1]] ; PTR32-NEXT: br label [[LOOP:%.*]] ; PTR32: loop: ; PTR32-NEXT: [[P_01_US_US:%.*]] = phi i8* [ null, [[PREHEADER]] ], [ [[GEP:%.*]], [[LOOP]] ] ; PTR32-NEXT: [[GEP]] = getelementptr inbounds i8, i8* [[P_01_US_US]], i64 1 -; PTR32-NEXT: 
[[SNEXT:%.*]] = load i8, i8* [[GEP]] -; PTR32-NEXT: [[EXITCOND:%.*]] = icmp ne i8* [[GEP]], [[CNT1]] +; PTR32-NEXT: [[SNEXT:%.*]] = load i8, i8* [[GEP]], align 1 +; PTR32-NEXT: [[EXITCOND:%.*]] = icmp ne i8* [[GEP]], [[SCEVGEP]] ; PTR32-NEXT: br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]] ; PTR32: exit.loopexit: ; PTR32-NEXT: [[SNEXT_LCSSA:%.*]] = phi i8 [ [[SNEXT]], [[LOOP]] ] @@ -256,10 +258,10 @@ define i8 @testptrint(i8* %buf, i8* %end) nounwind { ; PTR32-NEXT: [[P_01_US_US:%.*]] = phi i8* [ [[BUF]], [[PREHEADER]] ], [ [[GEP:%.*]], [[LOOP]] ] ; PTR32-NEXT: [[IV:%.*]] = phi i32 [ [[BI]], [[PREHEADER]] ], [ [[IVNEXT:%.*]], [[LOOP]] ] ; PTR32-NEXT: [[GEP]] = getelementptr inbounds i8, i8* [[P_01_US_US]], i64 1 -; PTR32-NEXT: [[SNEXT:%.*]] = load i8, i8* [[GEP]] +; PTR32-NEXT: [[SNEXT:%.*]] = load i8, i8* [[GEP]], align 1 ; PTR32-NEXT: [[IVNEXT]] = add nuw i32 [[IV]], 1 -; PTR32-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[IVNEXT]], [[CNT]] -; PTR32-NEXT: br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]] +; PTR32-NEXT: [[CMP:%.*]] = icmp ult i32 [[IVNEXT]], [[CNT]] +; PTR32-NEXT: br i1 [[CMP]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]] ; PTR32: exit.loopexit: ; PTR32-NEXT: [[SNEXT_LCSSA:%.*]] = phi i8 [ [[SNEXT]], [[LOOP]] ] ; PTR32-NEXT: br label [[EXIT]] diff --git a/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll b/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll index 1012e23cd3a2..321e98ab6772 100644 --- a/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll +++ b/polly/test/Isl/CodeGen/scev_looking_through_bitcasts.ll @@ -32,6 +32,5 @@ bitmap_element_allocate.exit: ; CHECK: polly.stmt.cond.end73.i: -; CHECK-NEXT: %0 = bitcast %structty** %b.s2a to i8** -; CHECK-NEXT: store i8* undef, i8** %0 +; CHECK-NEXT: store %structty* undef, %structty** %b.s2a ; CHECK-NEXT: br label %polly.exiting </cut>
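
The ScalarEvolution test updates in the diff above follow a regular pattern: a `ptrtoint` to a wider integer is now printed as a `zext` with a bounded unsigned range, a `ptrtoint` to a narrower integer as a `trunc`, and a same-width `ptrtoint` as the pointer itself. The sketch below only restates that pattern as read from the expected test output. The function name `model_ptrtoint` and its return convention are hypothetical and are not part of LLVM.

```python
# Illustrative sketch only: restates the pattern visible in the updated
# ptrtoint.ll expectations. model_ptrtoint is a hypothetical helper, not LLVM API.

def model_ptrtoint(ptr_bits, int_bits):
    """Return (cast kind, unsigned range); None stands for 'full-set'."""
    if int_bits > ptr_bits:
        # Widening: the pointer is zero-extended, so the value stays below 2**ptr_bits.
        return ("zext", (0, 2 ** ptr_bits))
    if int_bits < ptr_bits:
        # Narrowing: the test output shows a trunc with a full-set range.
        return ("trunc", None)
    # Same width: the expression is the pointer itself, with a full-set range.
    return ("noop", None)


for ptr_bits, int_bits in [(32, 64), (32, 32), (32, 16), (64, 128)]:
    print(ptr_bits, int_bits, model_ptrtoint(ptr_bits, int_bits))
```

For a 32-bit pointer converted to `i64` this gives the range `[0, 4294967296)`, which matches the `X32` lines in the test.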

4 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-Os - Build # 19 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-Os. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-Os

Culprit:
<cut>
commit ab97c9bdb747c873cd35a18229e2694156a7607d
Author: David Green <david.green(a)arm.com>
Date: Sat Dec 12 14:21:40 2020 +0000

    [LV] Fix scalar cost for tail predicated loops

    When it comes to the scalar cost of any predicated block, the loop
    vectorizer by default regards this predication as a sign that it is
    looking at an if-conversion and divides the scalar cost of the block by
    2, assuming it would only be executed half the time. This however makes
    no sense if the predication has been introduced to tail predicate the
    loop.

    Original patch by Anna Welker

    Differential Revision: https://reviews.llvm.org/D86452
</cut>

Results regressed to (for first_bad == ab97c9bdb747c873cd35a18229e2694156a7607d)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -Os_mthumb -- artifacts/build-ab97c9bdb747c873cd35a18229e2694156a7607d/results_id:
1
# 401.bzip2,bzip2_base.default regressed by 103
# 401.bzip2,[.] BZ2_compressBlock regressed by 113
# 473.astar,astar_base.default regressed by 103

from (for last_good == d716eab197abec0b9aab4a76cd1a52b248b8c3b1)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -Os_mthumb -- artifacts/build-d716eab197abec0b9aab4a76cd1a52b248b8c3b1/results_id:
1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-Os/1878
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-Os/1876
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…

Configuration details:


Reproduce builds:
<cut>
mkdir investigate-llvm-ab97c9bdb747c873cd35a18229e2694156a7607d
cd investigate-llvm-ab97c9bdb747c873cd35a18229e2694156a7607d

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach ab97c9bdb747c873cd35a18229e2694156a7607d
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach d716eab197abec0b9aab4a76cd1a52b248b8c3b1
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…

Full commit (up to 1000 lines):
<cut>
commit ab97c9bdb747c873cd35a18229e2694156a7607d
Author: David Green <david.green(a)arm.com>
Date: Sat Dec 12 14:21:40 2020 +0000

    [LV] Fix scalar cost for tail predicated loops

    When it comes to the scalar cost of any predicated block, the loop
    vectorizer by default regards this predication as a sign that it is
    looking at an if-conversion and divides the scalar cost of the block by
    2, assuming it would only be executed half the time. This however makes
    no sense if the predication has been introduced to tail predicate the
    loop.

    Original patch by Anna Welker

    Differential Revision: https://reviews.llvm.org/D86452
---
 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | 7 ++++---
 llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index c381377b67c9..663ea50c4c02 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -6483,9 +6483,10 @@ LoopVectorizationCostModel::expectedCost(ElementCount VF) {
     // if-converted. This means that the block's instructions (aside from
     // stores and instructions that may divide by zero) will now be
     // unconditionally executed. For the scalar case, we may not always execute
-    // the predicated block. Thus, scale the block's cost by the probability of
-    // executing it.
-    if (VF.isScalar() && blockNeedsPredication(BB))
+    // the predicated block, if it is an if-else block. Thus, scale the block's
+    // cost by the probability of executing it. blockNeedsPredication from
+    // Legal is used so as to not include all blocks in tail folded loops.
+    if (VF.isScalar() && Legal->blockNeedsPredication(BB))
       BlockCost.first /= getReciprocalPredBlockProb();
 
     Cost.first += BlockCost.first;
diff --git a/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll b/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll
index 959fbe676e67..fc8ea4fc938c 100644
--- a/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll
@@ -15,7 +15,7 @@ define void @pred_loop(i32* %off, i32* %data, i32* %dst, i32 %n) #0 {
 ; CHECK-COST-NEXT: LV: Found an estimated cost of 1 for VF 1 For instruction: store i32 %add1, i32* %arrayidx2, align 4
 ; CHECK-COST-NEXT: LV: Found an estimated cost of 1 for VF 1 For instruction: %exitcond.not = icmp eq i32 %add, %n
 ; CHECK-COST-NEXT: LV: Found an estimated cost of 0 for VF 1 For instruction: br i1 %exitcond.not, label %exit.loopexit, label %for.body
-; CHECK-COST-NEXT: LV: Scalar loop costs: 2.
+; CHECK-COST-NEXT: LV: Scalar loop costs: 5.
 
 entry:
   %cmp8 = icmp sgt i32 %n, 0
</cut>
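
The effect of the one-line change in `expectedCost` can be illustrated with a small sketch. It is not the LLVM implementation: the function `scalar_loop_cost`, the block cost of 5 and the flag names are hypothetical, chosen only so that the result mirrors the `Scalar loop costs: 2.` to `5.` change in the test. The divisor of 2 follows the commit message, and integer division is an assumption.

```python
# Illustrative sketch only, not LLVM code. All names and numbers are assumptions.

RECIPROCAL_PRED_BLOCK_PROB = 2  # per the commit message: predicated blocks run half the time


def scalar_loop_cost(blocks, fold_tail, use_legal_query):
    """Sum scalar block costs, halving blocks that are treated as predicated.

    blocks: list of (cost, in_conditional) pairs.
    fold_tail: True when the loop is tail-predicated (tail folded).
    use_legal_query: False models the old check, which also counted every
    block of a tail-folded loop as predicated; True models the new check,
    which only counts genuinely conditional (if-converted) blocks.
    """
    total = 0
    for cost, in_conditional in blocks:
        if use_legal_query:
            predicated = in_conditional
        else:
            predicated = in_conditional or fold_tail
        if predicated:
            cost //= RECIPROCAL_PRED_BLOCK_PROB
        total += cost
    return total


# One unconditional block of cost 5 in a tail-folded loop.
blocks = [(5, False)]
before = scalar_loop_cost(blocks, fold_tail=True, use_legal_query=False)
after = scalar_loop_cost(blocks, fold_tail=True, use_legal_query=True)
print(before, after)
```

With the old check the scalar loop looks cheaper than it is, which biases the comparison against the vector variants. That is consistent with a code-generation change large enough to show up in the benchmark results reported above.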

4 years, 11 months

Re: [CI-NOTIFY]: TCWG Bisect tcwg_gnu/gnu-master-aarch64-check_bootstrap - Build # 80 - Successful!

by Maxim Kuvyrkov

Reported to upstream at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101506 .

--
Maxim Kuvyrkov
https://www.linaro.org

> On 19 Jul 2021, at 02:30, ci_notify(a)linaro.org wrote:
>
> Successfully identified regression in *gcc* in CI configuration tcwg_gnu/gnu-master-aarch64-check_bootstrap. So far, this commit has regressed CI configurations:
> - tcwg_gnu/gnu-master-aarch64-check_bootstrap
>
> Culprit:
> <cut>
> commit 1dd3f21095858fbfd3e28a149578d5fb67e75f95
> Author: Richard Biener <rguenther(a)suse.de>
> Date: Tue Jul 13 13:59:15 2021 +0200
>
> Support reduction def re-use for epilogue with different vector size
>
> The following adds support for re-using the vector reduction def
> from the main loop in vectorized epilogue loops on architectures
> which use different vector sizes for the epilogue. That's only
> x86 as far as I am aware.
>
> 2021-07-13 Richard Biener <rguenther(a)suse.de>
>
> * tree-vect-loop.c (vect_find_reusable_accumulator): Handle
> vector types where the old vector type has a multiple of
> the new vector type elements.
> (vect_create_partial_epilog): New function, split out from...
> (vect_create_epilog_for_reduction): ... here.
> (vect_transform_cycle_phi): Reduce the re-used accumulator
> to the new vector type.
>
> * gcc.target/i386/vect-reduc-1.c: New testcase.
> </cut>
>
> Results regressed to (for first_bad == 1dd3f21095858fbfd3e28a149578d5fb67e75f95)
> # reset_artifacts:
> -10
> # build_abe binutils:
> -2
> # build_abe bootstrap:
> -1
> # build_abe dejagnu:
> 0
> # build_abe check_bootstrap -- --set runtestflags=g++.dg/dg.exp --set runtestflags=gcc.target/aarch64/aarch64.exp:
> 1
> # Getting actual results from build directory /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles
> # /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles/libstdc++.sum
> # /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles/gfortran.sum
> # /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles/libitm.sum
> # /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles/libgomp.sum
> # /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles/libatomic.sum
> # /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles/g++.sum
> # /home/tcwg-buildslave/workspace/tcwg_gnu_3/artifacts/build-1dd3f21095858fbfd3e28a149578d5fb67e75f95/sumfiles/gcc.sum
> # Manifest: gcc-compare-results/contrib/testsuite-management/flaky/gnu-master-aarch64-check_bootstrap.xfail
> # Getting actual results from build directory base-artifacts/sumfiles
> # base-artifacts/sumfiles/libstdc++.sum
> # base-artifacts/sumfiles/gfortran.sum
> # base-artifacts/sumfiles/libitm.sum
> # base-artifacts/sumfiles/libgomp.sum
> # base-artifacts/sumfiles/libatomic.sum
> # base-artifacts/sumfiles/g++.sum
> # base-artifacts/sumfiles/gcc.sum
> #
> #
> # Unexpected results in this build (new failures)
> # === gcc tests ===
> #
> # Running gcc.target/aarch64/aarch64.exp ...
> # FAIL: gcc.target/aarch64/vect-fmaxv-fminv-compile.c scan-assembler fminnmv
> # FAIL: gcc.target/aarch64/vect-fmaxv-fminv-compile.c scan-assembler fmaxnmv
> #
> # === Results Summary ===
>
> from (for last_good == a7098d6ef4e4e799dab8ef925c62b199d707694b)
> # reset_artifacts:
> -10
> # build_abe binutils:
> -2
> # build_abe bootstrap:
> -1
> # build_abe dejagnu:
> 0
> # build_abe check_bootstrap -- --set runtestflags=g++.dg/dg.exp --set runtestflags=gcc.target/aarch64/aarch64.exp:
> 1
>
> Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra…
> Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra…
> Build top page/logs: https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra…
>
> Configuration details:
>
>
> Reproduce builds:
> <cut>
> mkdir investigate-gcc-1dd3f21095858fbfd3e28a149578d5fb67e75f95
> cd investigate-gcc-1dd3f21095858fbfd3e28a149578d5fb67e75f95
>
> git clone https://git.linaro.org/toolchain/jenkins-scripts
>
> mkdir -p artifacts/manifests
> curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra… --fail
> curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra… --fail
> curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra… --fail
> chmod +x artifacts/test.sh
>
> # Reproduce the baseline build (build all pre-requisites)
> ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh
>
> # Save baseline build state (which is then restored in artifacts/test.sh)
> rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/
>
> cd gcc
>
> # Reproduce first_bad build
> git checkout --detach 1dd3f21095858fbfd3e28a149578d5fb67e75f95
> ../artifacts/test.sh
>
> # Reproduce last_good build
> git checkout --detach a7098d6ef4e4e799dab8ef925c62b199d707694b
> ../artifacts/test.sh
>
> cd ..
> </cut>
>
> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
>
> Artifacts: https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra…
> Build log: https://ci.linaro.org/job/tcwg_gcc-bisect-gnu-master-aarch64-check_bootstra…
>
> Full commit (up to 1000 lines):
> <cut>
> commit 1dd3f21095858fbfd3e28a149578d5fb67e75f95
> Author: Richard Biener <rguenther(a)suse.de>
> Date: Tue Jul 13 13:59:15 2021 +0200
>
> Support reduction def re-use for epilogue with different vector size
>
> The following adds support for re-using the vector reduction def
> from the main loop in vectorized epilogue loops on architectures
> which use different vector sizes for the epilogue. That's only
> x86 as far as I am aware.
>
> 2021-07-13 Richard Biener <rguenther(a)suse.de>
>
> * tree-vect-loop.c (vect_find_reusable_accumulator): Handle
> vector types where the old vector type has a multiple of
> the new vector type elements.
> (vect_create_partial_epilog): New function, split out from...
> (vect_create_epilog_for_reduction): ... here.
> (vect_transform_cycle_phi): Reduce the re-used accumulator
> to the new vector type.
>
> * gcc.target/i386/vect-reduc-1.c: New testcase.
> ---
> gcc/testsuite/gcc.target/i386/vect-reduc-1.c | 17 ++
> gcc/tree-vect-loop.c | 227 ++++++++++++++++-----------
> 2 files changed, 156 insertions(+), 88 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/i386/vect-reduc-1.c b/gcc/testsuite/gcc.target/i386/vect-reduc-1.c
> new file mode 100644
> index 00000000000..9ee9ba4e736
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/vect-reduc-1.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -mavx2 -mno-avx512f -fdump-tree-vect-details" } */
> +
> +#define N 32
> +int foo (int *a, int n)
> +{
> + int sum = 1;
> + for (int i = 0; i < 8*N + 4; ++i)
> + sum += a[i];
> + return sum;
> +}
> +
> +/* The reduction epilog should be vectorized and the accumulator
> + re-used. */
> +/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" } } */
> +/* { dg-final { scan-assembler-times "psrl" 2 } } */
> +/* { dg-final { scan-assembler-times "padd" 5 } } */
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 8c27d75f889..e9780158a51 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -4896,12 +4896,11 @@ vect_find_reusable_accumulator (loop_vec_info loop_vinfo,
> accumulator->reduc_info->reduc_scalar_results.begin ()))
> return false;
>
> - /* For now, only handle the case in which both loops are operating on the
> - same vector types. In future we could reduce wider vectors to narrower
> - ones as well. */
> + /* Handle the case where we can reduce wider vectors to narrower ones.
*/ > tree vectype = STMT_VINFO_VECTYPE (reduc_info); > tree old_vectype = TREE_TYPE (accumulator->reduc_input); > - if (!useless_type_conversion_p (old_vectype, vectype)) > + if (!constant_multiple_p (TYPE_VECTOR_SUBPARTS (old_vectype), > + TYPE_VECTOR_SUBPARTS (vectype))) > return false; > > /* Non-SLP reductions might apply an adjustment after the reduction > @@ -4935,6 +4934,101 @@ vect_find_reusable_accumulator (loop_vec_info loop_vinfo, > return true; > } > > +/* Reduce the vector VEC_DEF down to VECTYPE with reduction operation > + CODE emitting stmts before GSI. Returns a vector def of VECTYPE. */ > + > +static tree > +vect_create_partial_epilog (tree vec_def, tree vectype, enum tree_code code, > + gimple_seq *seq) > +{ > + unsigned nunits = TYPE_VECTOR_SUBPARTS (TREE_TYPE (vec_def)).to_constant (); > + unsigned nunits1 = TYPE_VECTOR_SUBPARTS (vectype).to_constant (); > + tree stype = TREE_TYPE (vectype); > + tree new_temp = vec_def; > + while (nunits > nunits1) > + { > + nunits /= 2; > + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), > + stype, nunits); > + unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1)); > + > + /* The target has to make sure we support lowpart/highpart > + extraction, either via direct vector extract or through > + an integer mode punning. */ > + tree dst1, dst2; > + gimple *epilog_stmt; > + if (convert_optab_handler (vec_extract_optab, > + TYPE_MODE (TREE_TYPE (new_temp)), > + TYPE_MODE (vectype1)) > + != CODE_FOR_nothing) > + { > + /* Extract sub-vectors directly once vec_extract becomes > + a conversion optab. 
*/ > + dst1 = make_ssa_name (vectype1); > + epilog_stmt > + = gimple_build_assign (dst1, BIT_FIELD_REF, > + build3 (BIT_FIELD_REF, vectype1, > + new_temp, TYPE_SIZE (vectype1), > + bitsize_int (0))); > + gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + dst2 = make_ssa_name (vectype1); > + epilog_stmt > + = gimple_build_assign (dst2, BIT_FIELD_REF, > + build3 (BIT_FIELD_REF, vectype1, > + new_temp, TYPE_SIZE (vectype1), > + bitsize_int (bitsize))); > + gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + } > + else > + { > + /* Extract via punning to appropriately sized integer mode > + vector. */ > + tree eltype = build_nonstandard_integer_type (bitsize, 1); > + tree etype = build_vector_type (eltype, 2); > + gcc_assert (convert_optab_handler (vec_extract_optab, > + TYPE_MODE (etype), > + TYPE_MODE (eltype)) > + != CODE_FOR_nothing); > + tree tem = make_ssa_name (etype); > + epilog_stmt = gimple_build_assign (tem, VIEW_CONVERT_EXPR, > + build1 (VIEW_CONVERT_EXPR, > + etype, new_temp)); > + gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + new_temp = tem; > + tem = make_ssa_name (eltype); > + epilog_stmt > + = gimple_build_assign (tem, BIT_FIELD_REF, > + build3 (BIT_FIELD_REF, eltype, > + new_temp, TYPE_SIZE (eltype), > + bitsize_int (0))); > + gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + dst1 = make_ssa_name (vectype1); > + epilog_stmt = gimple_build_assign (dst1, VIEW_CONVERT_EXPR, > + build1 (VIEW_CONVERT_EXPR, > + vectype1, tem)); > + gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + tem = make_ssa_name (eltype); > + epilog_stmt > + = gimple_build_assign (tem, BIT_FIELD_REF, > + build3 (BIT_FIELD_REF, eltype, > + new_temp, TYPE_SIZE (eltype), > + bitsize_int (bitsize))); > + gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + dst2 = make_ssa_name (vectype1); > + epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR, > + build1 (VIEW_CONVERT_EXPR, > + vectype1, tem)); > + 
gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + } > + > + new_temp = make_ssa_name (vectype1); > + epilog_stmt = gimple_build_assign (new_temp, code, dst1, dst2); > + gimple_seq_add_stmt_without_update (seq, epilog_stmt); > + } > + > + return new_temp; > +} > + > /* Function vect_create_epilog_for_reduction > > Create code at the loop-epilog to finalize the result of a reduction > @@ -5684,87 +5778,11 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo, > > /* First reduce the vector to the desired vector size we should > do shift reduction on by combining upper and lower halves. */ > - new_temp = reduc_inputs[0]; > - while (nunits > nunits1) > - { > - nunits /= 2; > - vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), > - stype, nunits); > - unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1)); > - > - /* The target has to make sure we support lowpart/highpart > - extraction, either via direct vector extract or through > - an integer mode punning. */ > - tree dst1, dst2; > - if (convert_optab_handler (vec_extract_optab, > - TYPE_MODE (TREE_TYPE (new_temp)), > - TYPE_MODE (vectype1)) > - != CODE_FOR_nothing) > - { > - /* Extract sub-vectors directly once vec_extract becomes > - a conversion optab. */ > - dst1 = make_ssa_name (vectype1); > - epilog_stmt > - = gimple_build_assign (dst1, BIT_FIELD_REF, > - build3 (BIT_FIELD_REF, vectype1, > - new_temp, TYPE_SIZE (vectype1), > - bitsize_int (0))); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - dst2 = make_ssa_name (vectype1); > - epilog_stmt > - = gimple_build_assign (dst2, BIT_FIELD_REF, > - build3 (BIT_FIELD_REF, vectype1, > - new_temp, TYPE_SIZE (vectype1), > - bitsize_int (bitsize))); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - } > - else > - { > - /* Extract via punning to appropriately sized integer mode > - vector. 
*/ > - tree eltype = build_nonstandard_integer_type (bitsize, 1); > - tree etype = build_vector_type (eltype, 2); > - gcc_assert (convert_optab_handler (vec_extract_optab, > - TYPE_MODE (etype), > - TYPE_MODE (eltype)) > - != CODE_FOR_nothing); > - tree tem = make_ssa_name (etype); > - epilog_stmt = gimple_build_assign (tem, VIEW_CONVERT_EXPR, > - build1 (VIEW_CONVERT_EXPR, > - etype, new_temp)); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - new_temp = tem; > - tem = make_ssa_name (eltype); > - epilog_stmt > - = gimple_build_assign (tem, BIT_FIELD_REF, > - build3 (BIT_FIELD_REF, eltype, > - new_temp, TYPE_SIZE (eltype), > - bitsize_int (0))); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - dst1 = make_ssa_name (vectype1); > - epilog_stmt = gimple_build_assign (dst1, VIEW_CONVERT_EXPR, > - build1 (VIEW_CONVERT_EXPR, > - vectype1, tem)); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - tem = make_ssa_name (eltype); > - epilog_stmt > - = gimple_build_assign (tem, BIT_FIELD_REF, > - build3 (BIT_FIELD_REF, eltype, > - new_temp, TYPE_SIZE (eltype), > - bitsize_int (bitsize))); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - dst2 = make_ssa_name (vectype1); > - epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR, > - build1 (VIEW_CONVERT_EXPR, > - vectype1, tem)); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - } > - > - new_temp = make_ssa_name (vectype1); > - epilog_stmt = gimple_build_assign (new_temp, code, dst1, dst2); > - gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > - reduc_inputs[0] = new_temp; > - } > + gimple_seq stmts = NULL; > + new_temp = vect_create_partial_epilog (reduc_inputs[0], vectype1, > + code, &stmts); > + gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT); > + reduc_inputs[0] = new_temp; > > if (reduce_with_shift && !slp_reduc) > { > @@ -7681,13 +7699,46 @@ vect_transform_cycle_phi (loop_vec_info loop_vinfo, > > if (auto 
*accumulator = reduc_info->reused_accumulator) > { > + tree def = accumulator->reduc_input; > + unsigned int nreduc; > + bool res = constant_multiple_p (TYPE_VECTOR_SUBPARTS (TREE_TYPE (def)), > + TYPE_VECTOR_SUBPARTS (vectype_out), > + &nreduc); > + gcc_assert (res); > + if (nreduc != 1) > + { > + /* Reduce the single vector to a smaller one. */ > + gimple_seq stmts = NULL; > + def = vect_create_partial_epilog (def, vectype_out, > + STMT_VINFO_REDUC_CODE (reduc_info), > + &stmts); > + /* Adjust the input so we pick up the partially reduced value > + for the skip edge in vect_create_epilog_for_reduction. */ > + accumulator->reduc_input = def; > + if (loop_vinfo->main_loop_edge) > + { > + /* While we'd like to insert on the edge this will split > + blocks and disturb bookkeeping, we also will eventually > + need this on the skip edge. Rely on sinking to > + fixup optimal placement and insert in the pred. */ > + gimple_stmt_iterator gsi > + = gsi_last_bb (loop_vinfo->main_loop_edge->src); > + /* Insert before a cond that eventually skips the > + epilogue. */ > + if (!gsi_end_p (gsi) && stmt_ends_bb_p (gsi_stmt (gsi))) > + gsi_prev (&gsi); > + gsi_insert_seq_after (&gsi, stmts, GSI_CONTINUE_LINKING); > + } > + else > + gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop), > + stmts); > + } > if (loop_vinfo->main_loop_edge) > vec_initial_defs[0] > - = vect_get_main_loop_result (loop_vinfo, accumulator->reduc_input, > + = vect_get_main_loop_result (loop_vinfo, def, > vec_initial_defs[0]); > else > - vec_initial_defs.safe_push (accumulator->reduc_input); > - gcc_assert (vec_initial_defs.length () == 1); > + vec_initial_defs.safe_push (def); > } > > /* Generate the reduction PHIs upfront. */ > </cut>
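Note on the refactoring above: the loop moved into vect_create_partial_epilog halves the vector repeatedly, combining the low and high halves with the reduction operation until the narrower vector type is reached. Below is a minimal scalar model of that halving scheme, offered as a sketch only. The function name partial_reduce, the use of std::vector, and the power-of-two precondition are assumptions made for illustration; none of it is GCC code.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical scalar model of the halving loop; not GCC code.
// Assumed precondition: vec.size() equals target_lanes times a power of two.
std::vector<int> partial_reduce(std::vector<int> vec, std::size_t target_lanes,
                                const std::function<int(int, int)> &op) {
  while (vec.size() > target_lanes) {
    std::size_t half = vec.size() / 2;
    std::vector<int> next(half);
    // Combine lane i of the low half with lane i of the high half.
    for (std::size_t i = 0; i < half; ++i)
      next[i] = op(vec[i], vec[i + half]);
    vec = std::move(next);
  }
  return vec;
}
```

With eight lanes and a target of four, one halving step runs; with a target of two, two steps run. This mirrors how a wider main-loop accumulator could be narrowed for a smaller epilogue vector size.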

4 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O2 - Build # 11 - Fixed!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O2. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O2

Culprit:
<cut>
commit 3d31adaec443daee75c62823082fa2912bbd267e
Author: Evgeniy Brevnov <ybrevnov(a)azul.com>
Date: Thu Oct 29 14:27:54 2020 +0700

    [DSE] Improve partial overlap detection

    Currently isOverwrite returns OW_MaybePartial even for accesss known not to overlap. This is not a big problem for legacy implementation (since isPartialOverwrite follows isOverwrite and clarifies the result). Contrary SSA based version does a lot of work to later find out that accesses don't overlap. Besides negative impact on compile time we quickly reach MemorySSAPartialStoreLimit and miss optimization opportunities.

    Note: In fact, I think it would be cleaner implementation if isOverwrite returned fully clarified result in the first place whithout need to call isPartialOverwrite. This can be done as a follow up. What do you think?

    Reviewed By: fhahn, asbirlea

    Differential Revision: https://reviews.llvm.org/D90371
</cut>

Results regressed to (for first_bad == 3d31adaec443daee75c62823082fa2912bbd267e)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O2 -- artifacts/build-3d31adaec443daee75c62823082fa2912bbd267e/results_id:
1
# 464.h264ref,h264ref_base.default regressed by 106
# 464.h264ref,[.] FastFullPelBlockMotionSearch regressed by 142

from (for last_good == 1eeae4310771d8a6896fe09effe88883998f34e8)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O2 -- artifacts/build-1eeae4310771d8a6896fe09effe88883998f34e8/results_id:
1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O2/1862
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O2/1864
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Configuration details:


Reproduce builds:
<cut>
mkdir investigate-llvm-3d31adaec443daee75c62823082fa2912bbd267e
cd investigate-llvm-3d31adaec443daee75c62823082fa2912bbd267e

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach 3d31adaec443daee75c62823082fa2912bbd267e
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 1eeae4310771d8a6896fe09effe88883998f34e8
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release…

Full commit (up to 1000 lines):
<cut>
commit 3d31adaec443daee75c62823082fa2912bbd267e
Author: Evgeniy Brevnov <ybrevnov(a)azul.com>
Date: Thu Oct 29 14:27:54 2020 +0700

    [DSE] Improve partial overlap detection

    Currently isOverwrite returns OW_MaybePartial even for accesss known not to overlap. This is not a big problem for legacy implementation (since isPartialOverwrite follows isOverwrite and clarifies the result). Contrary SSA based version does a lot of work to later find out that accesses don't overlap. Besides negative impact on compile time we quickly reach MemorySSAPartialStoreLimit and miss optimization opportunities.

    Note: In fact, I think it would be cleaner implementation if isOverwrite returned fully clarified result in the first place whithout need to call isPartialOverwrite. This can be done as a follow up. What do you think?
Reviewed By: fhahn, asbirlea Differential Revision: https://reviews.llvm.org/D90371 --- .../lib/Transforms/Scalar/DeadStoreElimination.cpp | 50 ++++++++++------ .../MSSA/combined-partial-overwrites.ll | 53 +++++----------- .../MSSA/multiblock-overlap.ll | 70 +++++++--------------- 3 files changed, 69 insertions(+), 104 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp index acdb1c4fa8c3..e578d15dfc50 100644 --- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp +++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp @@ -501,28 +501,40 @@ isOverwrite(const Instruction *LaterI, const Instruction *EarlierI, if (BP1 != BP2) return OW_Unknown; - // The later store completely overlaps the earlier store if: - // - // 1. Both start at the same offset and the later one's size is greater than - // or equal to the earlier one's, or - // - // |--earlier--| - // |-- later --| - // - // 2. The earlier store has an offset greater than the later offset, but which - // still lies completely within the later store. - // - // |--earlier--| - // |----- later ------| + // The later access completely overlaps the earlier store if and only if + // both start and end of the earlier one is "inside" the later one: + // |<->|--earlier--|<->| + // |-------later-------| + // Accesses may overlap if and only if start of one of them is "inside" + // another one: + // |<->|--earlier--|<----->| + // |-------later-------| + // OR + // |----- earlier -----| + // |<->|---later---|<----->| // // We have to be careful here as *Off is signed while *.Size is unsigned. - if (EarlierOff >= LaterOff && - LaterSize >= EarlierSize && - uint64_t(EarlierOff - LaterOff) + EarlierSize <= LaterSize) - return OW_Complete; - // Later may overwrite earlier completely with other partial writes. - return OW_MaybePartial; + // Check if the earlier access starts "not before" the later one. 
+ if (EarlierOff >= LaterOff) { + // If the earlier access ends "not after" the later access then the earlier + // one is completely overwritten by the later one. + if (uint64_t(EarlierOff - LaterOff) + EarlierSize <= LaterSize) + return OW_Complete; + // If start of the earlier access is "before" end of the later access then + // accesses overlap. + else if ((uint64_t)(EarlierOff - LaterOff) < LaterSize) + return OW_MaybePartial; + } + // If start of the later access is "before" end of the earlier access then + // accesses overlap. + else if ((uint64_t)(LaterOff - EarlierOff) < EarlierSize) { + return OW_MaybePartial; + } + + // Can reach here only if accesses are known not to overlap. There is no + // dedicated code to indicate no overlap so signal "unknown". + return OW_Unknown; } /// Return 'OW_Complete' if a store to the 'Later' location completely diff --git a/llvm/test/Transforms/DeadStoreElimination/MSSA/combined-partial-overwrites.ll b/llvm/test/Transforms/DeadStoreElimination/MSSA/combined-partial-overwrites.ll index ec1b9a5ee514..ab957e0c3cf0 100644 --- a/llvm/test/Transforms/DeadStoreElimination/MSSA/combined-partial-overwrites.ll +++ b/llvm/test/Transforms/DeadStoreElimination/MSSA/combined-partial-overwrites.ll @@ -1,6 +1,5 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -S -dse -enable-dse-partial-store-merging=false < %s | FileCheck --check-prefixes=CHECK,DEFAULT-LIMIT %s -; RUN: opt -S -dse -enable-dse-partial-store-merging=false -dse-memoryssa-partial-store-limit=10 < %s | FileCheck --check-prefixes=CHECK,LARGER-LIMIT %s +; RUN: opt -S -dse -enable-dse-partial-store-merging=false < %s | FileCheck --check-prefixes=CHECK %s target datalayout = "E-m:e-i64:64-n32:64" target triple = "powerpc64le-unknown-linux" @@ -213,41 +212,21 @@ declare i32 @fa(i8*, i8**, i32, i8, i8*) ; We miss this case, because of an aggressive limit of partial overlap analysis. ; With a larger partial store limit, we remove the memset. 
define void @test4() { -; DEFAULT-LIMIT-LABEL: @test4( -; DEFAULT-LIMIT-NEXT: entry: -; DEFAULT-LIMIT-NEXT: [[BANG:%.*]] = alloca [[STRUCT_FOOSTRUCT:%.*]], align 8 -; DEFAULT-LIMIT-NEXT: [[V1:%.*]] = bitcast %struct.foostruct* [[BANG]] to i8* -; DEFAULT-LIMIT-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, i8* [[V1]], i64 32 -; DEFAULT-LIMIT-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[TMP0]], i8 0, i64 8, i1 false) -; DEFAULT-LIMIT-NEXT: [[V2:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 0 -; DEFAULT-LIMIT-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V2]], align 8 -; DEFAULT-LIMIT-NEXT: [[V3:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 1 -; DEFAULT-LIMIT-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V3]], align 8 -; DEFAULT-LIMIT-NEXT: [[V4:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 2 -; DEFAULT-LIMIT-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V4]], align 8 -; DEFAULT-LIMIT-NEXT: [[V5:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 3 -; DEFAULT-LIMIT-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V5]], align 8 -; DEFAULT-LIMIT-NEXT: [[V6:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 4 -; DEFAULT-LIMIT-NEXT: store void (i8*, i32, i32)* null, void (i8*, i32, i32)** [[V6]], align 8 -; DEFAULT-LIMIT-NEXT: call void @goFunc(%struct.foostruct* [[BANG]]) -; DEFAULT-LIMIT-NEXT: ret void -; -; LARGER-LIMIT-LABEL: @test4( -; LARGER-LIMIT-NEXT: entry: -; LARGER-LIMIT-NEXT: [[BANG:%.*]] = alloca [[STRUCT_FOOSTRUCT:%.*]], align 8 -; LARGER-LIMIT-NEXT: [[V2:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 0 -; LARGER-LIMIT-NEXT: store i32 (i8*, i8**, 
i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V2]], align 8 -; LARGER-LIMIT-NEXT: [[V3:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 1 -; LARGER-LIMIT-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V3]], align 8 -; LARGER-LIMIT-NEXT: [[V4:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 2 -; LARGER-LIMIT-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V4]], align 8 -; LARGER-LIMIT-NEXT: [[V5:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 3 -; LARGER-LIMIT-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V5]], align 8 -; LARGER-LIMIT-NEXT: [[V6:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 4 -; LARGER-LIMIT-NEXT: store void (i8*, i32, i32)* null, void (i8*, i32, i32)** [[V6]], align 8 -; LARGER-LIMIT-NEXT: call void @goFunc(%struct.foostruct* [[BANG]]) -; LARGER-LIMIT-NEXT: ret void -; +; CHECK-LABEL: @test4( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[BANG:%.*]] = alloca [[STRUCT_FOOSTRUCT:%.*]], align 8 +; CHECK-NEXT: [[V2:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 0 +; CHECK-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V2]], align 8 +; CHECK-NEXT: [[V3:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 1 +; CHECK-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V3]], align 8 +; CHECK-NEXT: [[V4:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 2 +; CHECK-NEXT: store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V4]], align 8 +; CHECK-NEXT: [[V5:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 3 +; CHECK-NEXT: 
store i32 (i8*, i8**, i32, i8, i8*)* @fa, i32 (i8*, i8**, i32, i8, i8*)** [[V5]], align 8 +; CHECK-NEXT: [[V6:%.*]] = getelementptr inbounds [[STRUCT_FOOSTRUCT]], %struct.foostruct* [[BANG]], i64 0, i32 4 +; CHECK-NEXT: store void (i8*, i32, i32)* null, void (i8*, i32, i32)** [[V6]], align 8 +; CHECK-NEXT: call void @goFunc(%struct.foostruct* [[BANG]]) +; CHECK-NEXT: ret void entry: %bang = alloca %struct.foostruct, align 8 diff --git a/llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-overlap.ll b/llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-overlap.ll index 8a71c7397917..2ed717343a8a 100644 --- a/llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-overlap.ll +++ b/llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-overlap.ll @@ -1,6 +1,5 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -dse %s -S | FileCheck --check-prefixes=CHECK,DEFAULT-LIMIT %s -; RUN: opt -dse -dse-memoryssa-partial-store-limit=10 %s -S | FileCheck --check-prefixes=CHECK,LARGER-LIMIT %s +; RUN: opt -dse %s -S | FileCheck --check-prefixes=CHECK %s %struct.ham = type { [3 x double], [3 x double]} @@ -11,52 +10,27 @@ declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i1 immarg) ; We miss this case, because of an aggressive limit of partial overlap analysis. ; With a larger partial store limit, we remove the memset. 
define void @overlap1(%struct.ham* %arg, i1 %cond) { -; DEFAULT-LIMIT-LABEL: @overlap1( -; DEFAULT-LIMIT-NEXT: bb: -; DEFAULT-LIMIT-NEXT: [[TMP:%.*]] = getelementptr inbounds [[STRUCT_HAM:%.*]], %struct.ham* [[ARG:%.*]], i64 0, i32 0, i64 2 -; DEFAULT-LIMIT-NEXT: [[TMP1:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 0, i64 1 -; DEFAULT-LIMIT-NEXT: [[TMP2:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 0, i64 0 -; DEFAULT-LIMIT-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i64 2 -; DEFAULT-LIMIT-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i64 1 -; DEFAULT-LIMIT-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i32 0 -; DEFAULT-LIMIT-NEXT: [[TMP6:%.*]] = bitcast double* [[TMP2]] to i8* -; DEFAULT-LIMIT-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, i8* [[TMP6]], i64 32 -; DEFAULT-LIMIT-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 8 dereferenceable(48) [[TMP0]], i8 0, i64 16, i1 false) -; DEFAULT-LIMIT-NEXT: br i1 [[COND:%.*]], label [[BB7:%.*]], label [[BB8:%.*]] -; DEFAULT-LIMIT: bb7: -; DEFAULT-LIMIT-NEXT: br label [[BB9:%.*]] -; DEFAULT-LIMIT: bb8: -; DEFAULT-LIMIT-NEXT: br label [[BB9]] -; DEFAULT-LIMIT: bb9: -; DEFAULT-LIMIT-NEXT: store double 1.000000e+00, double* [[TMP2]], align 8 -; DEFAULT-LIMIT-NEXT: store double 2.000000e+00, double* [[TMP1]], align 8 -; DEFAULT-LIMIT-NEXT: store double 3.000000e+00, double* [[TMP]], align 8 -; DEFAULT-LIMIT-NEXT: store double 4.000000e+00, double* [[TMP5]], align 8 -; DEFAULT-LIMIT-NEXT: store double 5.000000e+00, double* [[TMP4]], align 8 -; DEFAULT-LIMIT-NEXT: store double 6.000000e+00, double* [[TMP3]], align 8 -; DEFAULT-LIMIT-NEXT: ret void -; -; LARGER-LIMIT-LABEL: @overlap1( -; LARGER-LIMIT-NEXT: bb: -; LARGER-LIMIT-NEXT: [[TMP:%.*]] = getelementptr inbounds [[STRUCT_HAM:%.*]], %struct.ham* 
[[ARG:%.*]], i64 0, i32 0, i64 2 -; LARGER-LIMIT-NEXT: [[TMP1:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 0, i64 1 -; LARGER-LIMIT-NEXT: [[TMP2:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 0, i64 0 -; LARGER-LIMIT-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i64 2 -; LARGER-LIMIT-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i64 1 -; LARGER-LIMIT-NEXT: [[TMP5:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i32 0 -; LARGER-LIMIT-NEXT: br i1 [[COND:%.*]], label [[BB7:%.*]], label [[BB8:%.*]] -; LARGER-LIMIT: bb7: -; LARGER-LIMIT-NEXT: br label [[BB9:%.*]] -; LARGER-LIMIT: bb8: -; LARGER-LIMIT-NEXT: br label [[BB9]] -; LARGER-LIMIT: bb9: -; LARGER-LIMIT-NEXT: store double 1.000000e+00, double* [[TMP2]], align 8 -; LARGER-LIMIT-NEXT: store double 2.000000e+00, double* [[TMP1]], align 8 -; LARGER-LIMIT-NEXT: store double 3.000000e+00, double* [[TMP]], align 8 -; LARGER-LIMIT-NEXT: store double 4.000000e+00, double* [[TMP5]], align 8 -; LARGER-LIMIT-NEXT: store double 5.000000e+00, double* [[TMP4]], align 8 -; LARGER-LIMIT-NEXT: store double 6.000000e+00, double* [[TMP3]], align 8 -; LARGER-LIMIT-NEXT: ret void +; CHECK-LABEL: @overlap1( +; CHECK-NEXT: bb: +; CHECK-NEXT: [[TMP:%.*]] = getelementptr inbounds [[STRUCT_HAM:%.*]], %struct.ham* [[ARG:%.*]], i64 0, i32 0, i64 2 +; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 0, i64 1 +; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 0, i64 0 +; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i64 2 +; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i64 1 +; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds 
[[STRUCT_HAM]], %struct.ham* [[ARG]], i64 0, i32 1, i32 0 +; CHECK-NEXT: br i1 [[COND:%.*]], label [[BB7:%.*]], label [[BB8:%.*]] +; CHECK: bb7: +; CHECK-NEXT: br label [[BB9:%.*]] +; CHECK: bb8: +; CHECK-NEXT: br label [[BB9]] +; CHECK: bb9: +; CHECK-NEXT: store double 1.000000e+00, double* [[TMP2]], align 8 +; CHECK-NEXT: store double 2.000000e+00, double* [[TMP1]], align 8 +; CHECK-NEXT: store double 3.000000e+00, double* [[TMP]], align 8 +; CHECK-NEXT: store double 4.000000e+00, double* [[TMP5]], align 8 +; CHECK-NEXT: store double 5.000000e+00, double* [[TMP4]], align 8 +; CHECK-NEXT: store double 6.000000e+00, double* [[TMP3]], align 8 +; CHECK-NEXT: ret void ; bb: %tmp = getelementptr inbounds %struct.ham, %struct.ham* %arg, i64 0, i32 0, i64 2 </cut>
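Note on the isOverwrite hunk quoted in this message: the new offset and size comparison can be modelled outside LLVM as a small standalone function. The sketch below follows the three outcomes shown in the diff (complete overwrite, possible partial overlap, and known non-overlap reported as "unknown"). The enum Overlap and the function classify are hypothetical names introduced here, and the sketch assumes both accesses already share the same base pointer; it is not the LLVM implementation.

```cpp
#include <cstdint>

// Hypothetical stand-in for the OW_* results named in the diff.
enum class Overlap { Complete, MaybePartial, Unknown };

// Assumes both accesses share one base pointer. Offsets are signed,
// sizes are unsigned, matching the caution noted in the diff comment.
Overlap classify(std::int64_t earlier_off, std::uint64_t earlier_size,
                 std::int64_t later_off, std::uint64_t later_size) {
  if (earlier_off >= later_off) {
    // Non-negative distance from the later start to the earlier start.
    std::uint64_t delta = static_cast<std::uint64_t>(earlier_off - later_off);
    // The earlier access ends no later than the later access: fully covered.
    if (delta + earlier_size <= later_size)
      return Overlap::Complete;
    // The earlier access starts before the later access ends: overlap.
    if (delta < later_size)
      return Overlap::MaybePartial;
  } else if (static_cast<std::uint64_t>(later_off - earlier_off) < earlier_size) {
    // The later access starts before the earlier access ends: overlap.
    return Overlap::MaybePartial;
  }
  // The accesses are known not to overlap; reported as "unknown".
  return Overlap::Unknown;
}
```

For instance, an earlier 4-byte access at offset 16 and a later 8-byte access at offset 0 fall through to Unknown, whereas the code removed by the diff would have returned OW_MaybePartial for the same inputs.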

4 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-master-arm-spec2k6-O2 - Build # 10 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2 Culprit: <cut> commit d181fd918d18cbd99768f025e14a69d35d275f14 Author: Simon Pilgrim <llvm-dev(a)redking.me.uk> Date: Fri Jul 2 14:27:27 2021 +0100 [CostModel][X86] Drop some hard coded fp<->int scalarization costs Scalarization costs handling is a lot better now, and the hard coded costs were higher than the worse case numbers from the script in D103695 </cut> Results regressed to (for first_bad == d181fd918d18cbd99768f025e14a69d35d275f14) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -O2_marm -- artifacts/build-d181fd918d18cbd99768f025e14a69d35d275f14/results_id: 1 # 400.perlbench,libc-2.33.9000.so regressed by 113 from (for last_good == 5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -O2_marm -- artifacts/build-5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O2/1840 Artifacts of first_bad build: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…
Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O2/1837
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-llvm-d181fd918d18cbd99768f025e14a69d35d275f14
cd investigate-llvm-d181fd918d18cbd99768f025e14a69d35d275f14

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach d181fd918d18cbd99768f025e14a69d35d275f14
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-…

Full commit (up to 1000 lines):
<cut>
commit d181fd918d18cbd99768f025e14a69d35d275f14
Author: Simon Pilgrim <llvm-dev(a)redking.me.uk>
Date: Fri Jul 2 14:27:27 2021 +0100

    [CostModel][X86] Drop some hard coded fp<->int scalarization costs

    Scalarization costs handling is a lot better now, and the hard coded costs were higher than the worse case numbers from the script in D103695
---
 llvm/lib/Target/X86/X86TargetTransformInfo.cpp | 13 -------------
 llvm/test/Analysis/CostModel/X86/sitofp.ll | 6 +++---
 2 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index d55cd8a8c7a8..9eb5abe4dd9b 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -1977,13 +1977,6 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
 { ISD::UINT_TO_FP, MVT::v8f64, MVT::v8i32, 10 },
 { ISD::UINT_TO_FP, MVT::v2f64, MVT::v2i64, 5 },
 { ISD::UINT_TO_FP, MVT::v4f64, MVT::v4i64, 6 },
- // The generic code to compute the scalar overhead is currently broken.
- // Workaround this limitation by estimating the scalarization overhead
- // here. We have roughly 10 instructions per scalar element.
- // Multiply that by the vector width.
- // FIXME: remove that when PR19268 is fixed.
- { ISD::SINT_TO_FP, MVT::v4f64, MVT::v4i64, 13 },
- { ISD::SINT_TO_FP, MVT::v4f64, MVT::v4i64, 13 },

 { ISD::FP_TO_SINT, MVT::v8i8, MVT::v8f32, 4 },
 { ISD::FP_TO_SINT, MVT::v4i8, MVT::v4f64, 3 },
@@ -2003,12 +1996,6 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
 { ISD::FP_TO_UINT, MVT::v8i16, MVT::v8f32, 3 },
 { ISD::FP_TO_UINT, MVT::v8i32, MVT::v8f32, 9 },
 { ISD::FP_TO_UINT, MVT::v8i32, MVT::v8f64, 19 },
- // This node is expanded into scalarized operations but BasicTTI is overly
- // optimistic estimating its cost. It computes 3 per element (one
- // vector-extract, one scalar conversion and one vector-insert). The
- // problem is that the inserts form a read-modify-write chain so latency
- // should be factored in too. Inflating the cost per element by 1.
- { ISD::FP_TO_UINT, MVT::v4i32, MVT::v4f64, 4*4 },

 { ISD::FP_EXTEND, MVT::v4f64, MVT::v4f32, 1 },
 { ISD::FP_ROUND, MVT::v4f32, MVT::v4f64, 1 },
diff --git a/llvm/test/Analysis/CostModel/X86/sitofp.ll b/llvm/test/Analysis/CostModel/X86/sitofp.ll
index b3c400c93b9f..b327454c1d09 100644
--- a/llvm/test/Analysis/CostModel/X86/sitofp.ll
+++ b/llvm/test/Analysis/CostModel/X86/sitofp.ll
@@ -122,14 +122,14 @@ define i32 @sitofp_i64_double() {
 ; AVX-LABEL: 'sitofp_i64_double'
 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %cvt_i64_f64 = sitofp i64 undef to double
 ; AVX-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %cvt_v2i64_v2f64 = sitofp <2 x i64> undef to <2 x double>
-; AVX-NEXT: Cost Model: Found an estimated cost of 13 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
-; AVX-NEXT: Cost Model: Found an estimated cost of 26 for instruction: %cvt_v8i64_v8f64 = sitofp <8 x i64> undef to <8 x double>
+; AVX-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
+; AVX-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %cvt_v8i64_v8f64 = sitofp <8 x i64> undef to <8 x double>
 ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; AVX512F-LABEL: 'sitofp_i64_double'
 ; AVX512F-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %cvt_i64_f64 = sitofp i64 undef to double
 ; AVX512F-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %cvt_v2i64_v2f64 = sitofp <2 x i64> undef to <2 x double>
-; AVX512F-NEXT: Cost Model: Found an estimated cost of 13 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
+; AVX512F-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
 ; AVX512F-NEXT: Cost Model: Found an estimated cost of 25 for instruction: %cvt_v8i64_v8f64 = sitofp <8 x i64> undef to <8 x double>
 ; AVX512F-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
</cut>
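
The removed FP_TO_UINT table entry wrote its cost as 4*4: four elements, each charged the three scalarization steps named in the deleted comment plus one extra unit for the latency of the insert chain. A minimal sketch of that arithmetic follows, in Python purely for illustration. The function name scalarized_cast_cost and its parameters are hypothetical and are not part of LLVM's cost model API.

```python
def scalarized_cast_cost(num_elements, per_element_ops=3, latency_penalty=0):
    """Toy cost estimate for a vector cast that is expanded element by element.

    per_element_ops=3 mirrors the removed comment: one vector-extract,
    one scalar conversion and one vector-insert per element.
    This is an illustrative model, not LLVM's actual cost computation.
    """
    if num_elements <= 0:
        raise ValueError("num_elements must be positive")
    return num_elements * (per_element_ops + latency_penalty)


# v4f64 -> v4i32 FP_TO_UINT with the hard-coded inflation of 1 per element.
hard_coded = scalarized_cast_cost(4, latency_penalty=1)
# The same cast with the plain 3-per-element estimate described in the comment.
plain = scalarized_cast_cost(4)
print(hard_coded, plain)
```

The sitofp.ll updates in the diff move in the same direction: once the hard-coded SINT_TO_FP entries are gone, the v4i64 to v4f64 estimate drops from 13 to 11 on both AVX and AVX512F.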

4 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O2_LTO - Build # 18 - Fixed!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2_LTO. So far, this commit has regressed CI configurations:
 - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2_LTO

Culprit:
<cut>
commit 428a62f65f16f1640b1bfe033d20e6a4f545dd3e
Author: thomasraoux <thomasraoux(a)google.com>
Date: Wed Jun 9 09:42:32 2021 -0700

    [mlir][gpu] Add op to create MMA constant matrix

    This allow creating a matrix with all elements set to a given value. This is needed to be able to implement a simple dot op.

    Differential Revision: https://reviews.llvm.org/D103870
</cut>

Results regressed to (for first_bad == 428a62f65f16f1640b1bfe033d20e6a4f545dd3e)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O2_LTO -- artifacts/build-428a62f65f16f1640b1bfe033d20e6a4f545dd3e/results_id:
1
# 400.perlbench,perlbench_base.default regressed by 103

from (for last_good == 3b46283c1539f89619f2b40ab7732f434d7c68ff)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O2_LTO -- artifacts/build-3b46283c1539f89619f2b40ab7732f434d7c68ff/results_id:
1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2_LTO/1827
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2_LTO/1831
Build top page/logs:
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-llvm-428a62f65f16f1640b1bfe033d20e6a4f545dd3e
cd investigate-llvm-428a62f65f16f1640b1bfe033d20e6a4f545dd3e

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach 428a62f65f16f1640b1bfe033d20e6a4f545dd3e
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 3b46283c1539f89619f2b40ab7732f434d7c68ff
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…

Full commit (up to 1000 lines):
<cut>
commit 428a62f65f16f1640b1bfe033d20e6a4f545dd3e
Author: thomasraoux <thomasraoux(a)google.com>
Date: Wed Jun 9 09:42:32 2021 -0700

    [mlir][gpu] Add op to create MMA constant matrix

    This allow creating a matrix with all elements set to a given value. This is needed to be able to implement a simple dot op.

Differential Revision: https://reviews.llvm.org/D103870 --- mlir/include/mlir/Dialect/GPU/GPUOps.td | 45 ++++++++++++++++++++++ mlir/lib/Conversion/GPUToNVVM/WmmaOpsToNvvm.cpp | 42 +++++++++++++++++++- .../Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir | 25 ++++++++++++ mlir/test/Dialect/GPU/ops.mlir | 4 ++ 4 files changed, 115 insertions(+), 1 deletion(-) diff --git a/mlir/include/mlir/Dialect/GPU/GPUOps.td b/mlir/include/mlir/Dialect/GPU/GPUOps.td index 8e2520b675ae..1e78e4af4d51 100644 --- a/mlir/include/mlir/Dialect/GPU/GPUOps.td +++ b/mlir/include/mlir/Dialect/GPU/GPUOps.td @@ -1022,4 +1022,49 @@ def GPU_SubgroupMmaComputeOp : GPU_Op<"subgroup_mma_compute", let verifier = [{ return ::verify(*this); }]; } +def GPU_SubgroupMmaConstantMatrixOp : GPU_Op<"subgroup_mma_constant_matrix", + [NoSideEffect, + TypesMatchWith<"value type matches element type of mma_matrix", + "res", "value", + "$_self.cast<gpu::MMAMatrixType>().getElementType()">]>{ + + let summary = "GPU warp synchronous constant matrix"; + + let description = [{ + The `gpu.subgroup_mma_constant_matrix` creates a `!gpu.mma_matrix` with + constant elements. + + The operation takes a scalar input and return a `!gpu.mma_matrix` where each + element of is equal to the operand constant. The destination mma_matrix type + must have elememt type equal to the constant type. Since the layout of + `!gpu.mma_matrix` is opaque this only support setting all the elements to + the same value. + + This op is meant to be used along with `gpu.subgroup_mma_compute`. 
+ + Example: + + ```mlir + %0 = gpu.subgroup_mma_constant_matrix %a : + !gpu.mma_matrix<16x16xf16, "AOp"> + %1 = gpu.subgroup_mma_constant_matrix %b : + !gpu.mma_matrix<16x16xf32, "COp"> + ``` + }]; + + let arguments = (ins AnyTypeOf<[F16, F32]>:$value); + + let results = (outs GPU_MMAMatrix:$res); + + let extraClassDeclaration = [{ + gpu::MMAMatrixType getType() { + return res().getType().cast<gpu::MMAMatrixType>(); + } + }]; + + let assemblyFormat = [{ + $value attr-dict `:` type($res) + }]; +} + #endif // GPU_OPS diff --git a/mlir/lib/Conversion/GPUToNVVM/WmmaOpsToNvvm.cpp b/mlir/lib/Conversion/GPUToNVVM/WmmaOpsToNvvm.cpp index d72c8c217f86..d46a185dec22 100644 --- a/mlir/lib/Conversion/GPUToNVVM/WmmaOpsToNvvm.cpp +++ b/mlir/lib/Conversion/GPUToNVVM/WmmaOpsToNvvm.cpp @@ -348,12 +348,52 @@ struct WmmaMmaOpToNVVMLowering } }; +/// Convert GPU MMA ConstantMatrixOp to a chain of InsertValueOp. +struct WmmaConstantOpToNVVMLowering + : public ConvertOpToLLVMPattern<gpu::SubgroupMmaConstantMatrixOp> { + using ConvertOpToLLVMPattern< + gpu::SubgroupMmaConstantMatrixOp>::ConvertOpToLLVMPattern; + + LogicalResult + matchAndRewrite(gpu::SubgroupMmaConstantMatrixOp subgroupMmaConstantOp, + ArrayRef<Value> operands, + ConversionPatternRewriter &rewriter) const override { + if (failed(areAllLLVMTypes(subgroupMmaConstantOp.getOperation(), operands, + rewriter))) + return failure(); + Location loc = subgroupMmaConstantOp.getLoc(); + Value cst = operands[0]; + LLVM::LLVMStructType type = convertMMAToLLVMType( + subgroupMmaConstantOp.getType().cast<gpu::MMAMatrixType>()); + // If the element type is a vector create a vector from the operand. 
+ if (auto vecType = type.getBody()[0].dyn_cast<VectorType>()) { + Value vecCst = rewriter.create<LLVM::UndefOp>(loc, vecType); + for (int64_t vecEl = 0; vecEl < vecType.getNumElements(); vecEl++) { + Value idx = rewriter.create<LLVM::ConstantOp>( + loc, typeConverter->convertType(rewriter.getIntegerType(32)), + rewriter.getI32ArrayAttr(vecEl)); + vecCst = rewriter.create<LLVM::InsertElementOp>(loc, vecType, vecCst, + cst, idx); + } + cst = vecCst; + } + Value matrixStruct = rewriter.create<LLVM::UndefOp>(loc, type); + for (size_t i : llvm::seq(size_t(0), type.getBody().size())) { + matrixStruct = rewriter.create<LLVM::InsertValueOp>( + loc, matrixStruct, cst, rewriter.getI32ArrayAttr(i)); + } + rewriter.replaceOp(subgroupMmaConstantOp, matrixStruct); + return success(); + } +}; + } // anonymous namespace namespace mlir { void populateGpuWMMAToNVVMConversionPatterns(LLVMTypeConverter &converter, RewritePatternSet &patterns) { patterns.insert<WmmaLoadOpToNVVMLowering, WmmaMmaOpToNVVMLowering, - WmmaStoreOpToNVVMLowering>(converter); + WmmaStoreOpToNVVMLowering, WmmaConstantOpToNVVMLowering>( + converter); } } // namespace mlir diff --git a/mlir/test/Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir b/mlir/test/Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir index de5d0d3fcf1c..f692dffdfcba 100644 --- a/mlir/test/Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir +++ b/mlir/test/Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir @@ -151,3 +151,28 @@ gpu.module @test_module { return } } + + +// ----- + +gpu.module @test_module { + +// CHECK-LABEL: func @gpu_wmma_constant_op +// CHECK: %[[CST:.+]] = llvm.mlir.constant(1.000000e+00 : f16) : f16 +// CHECK: %[[V0:.+]] = llvm.mlir.undef : vector<2xf16> +// CHECK: %[[C0:.+]] = llvm.mlir.constant([0 : i32]) : i32 +// CHECK: %[[V1:.+]] = llvm.insertelement %[[CST]], %[[V0]][%[[C0]] : i32] : vector<2xf16> +// CHECK: %[[C1:.+]] = llvm.mlir.constant([1 : i32]) : i32 +// CHECK: %[[V2:.+]] = llvm.insertelement %[[CST]], %[[V1]][%[[C1]] : i32] : vector<2xf16> 
+// CHECK: %[[M0:.+]] = llvm.mlir.undef : !llvm.struct<(vector<2xf16>, vector<2xf16>, vector<2xf16>, vector<2xf16>)> +// CHECK: %[[M1:.+]] = llvm.insertvalue %[[V2]], %[[M0]][0 : i32] : !llvm.struct<(vector<2xf16>, vector<2xf16>, vector<2xf16>, vector<2xf16>)> +// CHECK: %[[M2:.+]] = llvm.insertvalue %[[V2]], %[[M1]][1 : i32] : !llvm.struct<(vector<2xf16>, vector<2xf16>, vector<2xf16>, vector<2xf16>)> +// CHECK: %[[M3:.+]] = llvm.insertvalue %[[V2]], %[[M2]][2 : i32] : !llvm.struct<(vector<2xf16>, vector<2xf16>, vector<2xf16>, vector<2xf16>)> +// CHECK: %[[M4:.+]] = llvm.insertvalue %[[V2]], %[[M3]][3 : i32] : !llvm.struct<(vector<2xf16>, vector<2xf16>, vector<2xf16>, vector<2xf16>)> +// CHECK: llvm.return %[[M4]] : !llvm.struct<(vector<2xf16>, vector<2xf16>, vector<2xf16>, vector<2xf16>)> + func @gpu_wmma_constant_op() ->(!gpu.mma_matrix<16x16xf16, "COp">) { + %cst = constant 1.0 : f16 + %C = gpu.subgroup_mma_constant_matrix %cst : !gpu.mma_matrix<16x16xf16, "COp"> + return %C : !gpu.mma_matrix<16x16xf16, "COp"> + } +} diff --git a/mlir/test/Dialect/GPU/ops.mlir b/mlir/test/Dialect/GPU/ops.mlir index a98fe1c49683..1bed13c4b21a 100644 --- a/mlir/test/Dialect/GPU/ops.mlir +++ b/mlir/test/Dialect/GPU/ops.mlir @@ -201,8 +201,12 @@ module attributes {gpu.container_module} { // CHECK: %[[wg:.*]] = memref.alloca() %i = constant 16 : index // CHECK: %[[i:.*]] = constant 16 : index + %cst = constant 1.000000e+00 : f32 + // CHECK: %[[cst:.*]] = constant 1.000000e+00 : f32 %0 = gpu.subgroup_mma_load_matrix %wg[%i, %i] {leadDimension = 32 : index} : memref<32x32xf16, 3> -> !gpu.mma_matrix<16x16xf16, "AOp"> // CHECK: gpu.subgroup_mma_load_matrix %[[wg]][%[[i]], %[[i]]] {leadDimension = 32 : index} : memref<32x32xf16, 3> -> !gpu.mma_matrix<16x16xf16, "AOp"> + %1 = gpu.subgroup_mma_constant_matrix %cst : !gpu.mma_matrix<16x16xf32, "COp"> + // CHECK: gpu.subgroup_mma_constant_matrix %[[cst]] : !gpu.mma_matrix<16x16xf32, "COp"> return } } </cut>
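
The lowering added in WmmaOpsToNvvm.cpp can be read as two loops: when the struct fields are vectors, splat the scalar into a vector lane by lane, then insert the resulting value into every field of the result struct. The sketch below models this with plain Python lists. The names lower_constant_matrix, num_fields and vector_width are hypothetical and chosen only for illustration; they do not correspond to any MLIR API.

```python
def lower_constant_matrix(value, num_fields, vector_width=None):
    """Model of the constant-matrix lowering using Python lists.

    None stands in for an undef lane or field. Each loop iteration
    corresponds to one insertelement / insertvalue in the real lowering.
    """
    element = value
    if vector_width is not None:
        vec = [None] * vector_width          # llvm.mlir.undef : vector<...>
        for lane in range(vector_width):     # one insertelement per lane
            vec = vec[:lane] + [value] + vec[lane + 1:]
        element = vec

    matrix = [None] * num_fields             # llvm.mlir.undef : !llvm.struct<...>
    for field in range(num_fields):          # one insertvalue per field
        matrix = matrix[:field] + [element] + matrix[field + 1:]
    return matrix


# Shape used by the new test: a struct of four vector<2xf16> fields.
print(lower_constant_matrix(1.0, num_fields=4, vector_width=2))
```

Each step builds a new list rather than mutating in place, which mirrors how the C++ code threads vecCst and matrixStruct through successive SSA values.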

4 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/gnu-release-arm-spec2k6-O2 - Build # 22 - Fixed!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O2. So far, this commit has regressed CI configurations:
 - tcwg_bmk_gnu_tk1/gnu-release-arm-spec2k6-O2

Culprit:
<cut>
commit bf09e559b22b44e74a91ccc00507a1885ec3d578
Author: Thomas Koenig <tkoenig(a)gcc.gnu.org>
Date: Sun May 19 10:21:06 2019 +0000

    re PR fortran/88821 (Inline packing of non-contiguous arguments)

    2019-05-19 Thomas Koenig <tkoenig(a)gcc.gnu.org>

    PR fortran/88821
    * expr.c (gfc_is_simply_contiguous): Return true for an EXPR_ARRAY.
    * trans-array.c (is_pointer): New function.
    (gfc_conv_array_parameter): Call gfc_conv_subref_array_arg when not optimizing and not optimizing for size if the formal arg is passed by reference.
    * trans-expr.c (gfc_conv_subref_array_arg): Add arguments fsym, proc_name and sym. Add run-time warning for temporary array creation. Wrap argument if passing on an optional argument to an optional argument.
    * trans.h (gfc_conv_subref_array_arg): Add optional arguments fsym, proc_name and sym to prototype.

    2019-05-19 Thomas Koenig <tkoenig(a)gcc.gnu.org>

    PR fortran/88821
    * gfortran.dg/alloc_comp_auto_array_3.f90: Add -O0 to dg-options to make sure the test for internal_pack is retained.
    * gfortran.dg/assumed_type_2.f90: Split compile and run time tests into this and
    * gfortran.dg/assumed_type_2a.f90: New file.
    * gfortran.dg/c_loc_test_22.f90: Likewise.
    * gfortran.dg/contiguous_3.f90: Likewise.
    * gfortran.dg/internal_pack_11.f90: Likewise.
    * gfortran.dg/internal_pack_12.f90: Likewise.
    * gfortran.dg/internal_pack_16.f90: Likewise.
    * gfortran.dg/internal_pack_17.f90: Likewise.
    * gfortran.dg/internal_pack_18.f90: Likewise.
    * gfortran.dg/internal_pack_4.f90: Likewise.
    * gfortran.dg/internal_pack_5.f90: Add -O0 to dg-options to make sure the test for internal_pack is retained.
    * gfortran.dg/internal_pack_6.f90: Split compile and run time tests into this and
    * gfortran.dg/internal_pack_6a.f90: New file.
    * gfortran.dg/internal_pack_8.f90: Likewise.
    * gfortran.dg/missing_optional_dummy_6: Split compile and run time tests into this and
    * gfortran.dg/missing_optional_dummy_6a.f90: New file.
    * gfortran.dg/no_arg_check_2.f90: Split compile and run time tests into this and
    * gfortran.dg/no_arg_check_2a.f90: New file.
    * gfortran.dg/typebound_assignment_5.f90: Split compile and run time tests into this and
    * gfortran.dg/typebound_assignment_5a.f90: New file.
    * gfortran.dg/typebound_assignment_6.f90: Split compile and run time tests into this and
    * gfortran.dg/typebound_assignment_6a.f90: New file.
    * gfortran.dg/internal_pack_19.f90: New file.
    * gfortran.dg/internal_pack_20.f90: New file.
    * gfortran.dg/internal_pack_21.f90: New file.

    From-SVN: r271377
</cut>

Results regressed to (for first_bad == bf09e559b22b44e74a91ccc00507a1885ec3d578)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# true:
0
# benchmark -O2_marm -- artifacts/build-bf09e559b22b44e74a91ccc00507a1885ec3d578/results_id:
1
# 481.wrf,wrf_base.default regressed by 107
# 481.wrf,[.] __module_small_step_em_MOD_advance_w regressed by 112

from (for last_good == 14688b8de389740f07079a945edf887a682fc9d1)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# true:
0
# benchmark -O2_marm -- artifacts/build-14688b8de389740f07079a945edf887a682fc9d1/results_id:
1

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Results ID of last_good: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O2/1812
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Results ID of first_bad: tk1_32/tcwg_bmk_gnu_tk1/bisect-gnu-release-arm-spec2k6-O2/1816
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…

Configuration details:

Reproduce builds:
<cut>
mkdir investigate-gcc-bf09e559b22b44e74a91ccc00507a1885ec3d578
cd investigate-gcc-bf09e559b22b44e74a91ccc00507a1885ec3d578

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude gcc/ ./ ./bisect/baseline/

cd gcc

# Reproduce first_bad build
git checkout --detach bf09e559b22b44e74a91ccc00507a1885ec3d578
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 14688b8de389740f07079a945edf887a682fc9d1
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tk1-gnu-release-a…

Full commit (up to 1000 lines):
<cut>
commit bf09e559b22b44e74a91ccc00507a1885ec3d578
Author: Thomas Koenig <tkoenig(a)gcc.gnu.org>
Date: Sun May 19 10:21:06 2019 +0000

    re PR fortran/88821 (Inline packing of non-contiguous arguments)

    2019-05-19 Thomas Koenig <tkoenig(a)gcc.gnu.org>

    PR fortran/88821
    * expr.c (gfc_is_simply_contiguous): Return true for an EXPR_ARRAY.
    * trans-array.c (is_pointer): New function.
    (gfc_conv_array_parameter): Call gfc_conv_subref_array_arg when not optimizing and not optimizing for size if the formal arg is passed by reference.
    * trans-expr.c (gfc_conv_subref_array_arg): Add arguments fsym, proc_name and sym. Add run-time warning for temporary array creation. Wrap argument if passing on an optional argument to an optional argument.
    * trans.h (gfc_conv_subref_array_arg): Add optional arguments fsym, proc_name and sym to prototype.

    2019-05-19 Thomas Koenig <tkoenig(a)gcc.gnu.org>

    PR fortran/88821
    * gfortran.dg/alloc_comp_auto_array_3.f90: Add -O0 to dg-options to make sure the test for internal_pack is retained.
    * gfortran.dg/assumed_type_2.f90: Split compile and run time tests into this and
    * gfortran.dg/assumed_type_2a.f90: New file.
    * gfortran.dg/c_loc_test_22.f90: Likewise.
    * gfortran.dg/contiguous_3.f90: Likewise.
    * gfortran.dg/internal_pack_11.f90: Likewise.
    * gfortran.dg/internal_pack_12.f90: Likewise.
    * gfortran.dg/internal_pack_16.f90: Likewise.
* gfortran.dg/internal_pack_17.f90: Likewise. * gfortran.dg/internal_pack_18.f90: Likewise. * gfortran.dg/internal_pack_4.f90: Likewise. * gfortran.dg/internal_pack_5.f90: Add -O0 to dg-options to make sure the test for internal_pack is retained. * gfortran.dg/internal_pack_6.f90: Split compile and run time tests into this and * gfortran.dg/internal_pack_6a.f90: New file. * gfortran.dg/internal_pack_8.f90: Likewise. * gfortran.dg/missing_optional_dummy_6: Split compile and run time tests into this and * gfortran.dg/missing_optional_dummy_6a.f90: New file. * gfortran.dg/no_arg_check_2.f90: Split compile and run time tests into this and * gfortran.dg/no_arg_check_2a.f90: New file. * gfortran.dg/typebound_assignment_5.f90: Split compile and run time tests into this and * gfortran.dg/typebound_assignment_5a.f90: New file. * gfortran.dg/typebound_assignment_6.f90: Split compile and run time tests into this and * gfortran.dg/typebound_assignment_6a.f90: New file. * gfortran.dg/internal_pack_19.f90: New file. * gfortran.dg/internal_pack_20.f90: New file. * gfortran.dg/internal_pack_21.f90: New file. 
From-SVN: r271377 --- gcc/fortran/expr.c | 3 + gcc/fortran/trans-array.c | 31 +++++ gcc/fortran/trans-expr.c | 83 +++++++++++- gcc/fortran/trans.h | 5 +- .../gfortran.dg/alloc_comp_auto_array_3.f90 | 2 +- gcc/testsuite/gfortran.dg/assumed_type_2.f90 | 4 +- gcc/testsuite/gfortran.dg/assumed_type_2a.f90 | 139 +++++++++++++++++++++ gcc/testsuite/gfortran.dg/c_loc_test_22.f90 | 2 +- gcc/testsuite/gfortran.dg/contiguous_3.f90 | 2 +- gcc/testsuite/gfortran.dg/internal_pack_11.f90 | 2 +- gcc/testsuite/gfortran.dg/internal_pack_12.f90 | 2 +- gcc/testsuite/gfortran.dg/internal_pack_16.f90 | 2 +- gcc/testsuite/gfortran.dg/internal_pack_17.f90 | 2 +- gcc/testsuite/gfortran.dg/internal_pack_18.f90 | 2 +- gcc/testsuite/gfortran.dg/internal_pack_19.f90 | 23 ++++ gcc/testsuite/gfortran.dg/internal_pack_20.f90 | 23 ++++ gcc/testsuite/gfortran.dg/internal_pack_21.f90 | 24 ++++ gcc/testsuite/gfortran.dg/internal_pack_4.f90 | 4 - gcc/testsuite/gfortran.dg/internal_pack_5.f90 | 2 +- gcc/testsuite/gfortran.dg/internal_pack_6.f90 | 4 +- gcc/testsuite/gfortran.dg/internal_pack_6a.f90 | 56 +++++++++ gcc/testsuite/gfortran.dg/internal_pack_9.f90 | 2 +- .../gfortran.dg/missing_optional_dummy_6.f90 | 11 -- .../gfortran.dg/missing_optional_dummy_6a.f90 | 59 +++++++++ gcc/testsuite/gfortran.dg/no_arg_check_2.f90 | 4 +- gcc/testsuite/gfortran.dg/no_arg_check_2a.f90 | 121 ++++++++++++++++++ .../gfortran.dg/typebound_assignment_5.f03 | 4 +- .../gfortran.dg/typebound_assignment_5a.f03 | 39 ++++++ .../gfortran.dg/typebound_assignment_6.f03 | 4 - .../gfortran.dg/typebound_assignment_6a.f03 | 42 +++++++ 30 files changed, 663 insertions(+), 40 deletions(-) diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index 474e9ecc401..949eff19cdd 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -5713,6 +5713,9 @@ gfc_is_simply_contiguous (gfc_expr *expr, bool strict, bool permit_element) gfc_ref *ref, *part_ref = NULL; gfc_symbol *sym; + if (expr->expr_type == EXPR_ARRAY) + return true; + if 
(expr->expr_type == EXPR_FUNCTION) { if (expr->value.function.esym) diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 8a0de6140ed..9c96d897f41 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -7866,6 +7866,23 @@ array_parameter_size (tree desc, gfc_expr *expr, tree *size) *size, fold_convert (gfc_array_index_type, elem)); } +/* Helper function - return true if the argument is a pointer. */ + +static bool +is_pointer (gfc_expr *e) +{ + gfc_symbol *sym; + + if (e->expr_type != EXPR_VARIABLE || e->symtree == NULL) + return false; + + sym = e->symtree->n.sym; + if (sym == NULL) + return false; + + return sym->attr.pointer || sym->attr.proc_pointer; +} + /* Convert an array for passing as an actual parameter. */ void @@ -8117,6 +8134,20 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77, "Creating array temporary at %L", &expr->where); } + /* When optmizing, we can use gfc_conv_subref_array_arg for + making the packing and unpacking operation visible to the + optimizers. */ + + if (g77 && optimize && !optimize_size && expr->expr_type == EXPR_VARIABLE + && !is_pointer (expr) && (fsym == NULL + || fsym->ts.type != BT_ASSUMED)) + { + gfc_conv_subref_array_arg (se, expr, g77, + fsym ? fsym->attr.intent : INTENT_INOUT, + false, fsym, proc_name, sym); + return; + } + ptr = build_call_expr_loc (input_location, gfor_fndecl_in_pack, 1, desc); diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 3711c38b2f2..b7a8456c021 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -4576,8 +4576,10 @@ gfc_apply_interface_mapping (gfc_interface_mapping * mapping, an actual argument derived type array is copied and then returned after the function call. 
*/ void -gfc_conv_subref_array_arg (gfc_se * parmse, gfc_expr * expr, int g77, - sym_intent intent, bool formal_ptr) +gfc_conv_subref_array_arg (gfc_se *se, gfc_expr * expr, int g77, + sym_intent intent, bool formal_ptr, + const gfc_symbol *fsym, const char *proc_name, + gfc_symbol *sym) { gfc_se lse; gfc_se rse; @@ -4594,6 +4596,36 @@ gfc_conv_subref_array_arg (gfc_se * parmse, gfc_expr * expr, int g77, stmtblock_t body; int n; int dimen; + gfc_se work_se; + gfc_se *parmse; + bool pass_optional; + + pass_optional = fsym && fsym->attr.optional && sym && sym->attr.optional; + + if (pass_optional) + { + gfc_init_se (&work_se, NULL); + parmse = &work_se; + } + else + parmse = se; + + if (gfc_option.rtcheck & GFC_RTCHECK_ARRAY_TEMPS) + { + /* We will create a temporary array, so let us warn. */ + char * msg; + + if (fsym && proc_name) + msg = xasprintf ("An array temporary was created for argument " + "'%s' of procedure '%s'", fsym->name, proc_name); + else + msg = xasprintf ("An array temporary was created"); + + tmp = build_int_cst (logical_type_node, 1); + gfc_trans_runtime_check (false, true, tmp, &parmse->pre, + &expr->where, msg); + free (msg); + } gfc_init_se (&lse, NULL); gfc_init_se (&rse, NULL); @@ -4848,6 +4880,53 @@ class_array_fcn: else parmse->expr = gfc_build_addr_expr (NULL_TREE, parmse->expr); + if (pass_optional) + { + tree present; + tree type; + stmtblock_t else_block; + tree pre_stmts, post_stmts; + tree pointer; + tree else_stmt; + + /* Make this into + + if (present (a)) + { + parmse->pre; + optional = parse->expr; + } + else + optional = NULL; + call foo (optional); + if (present (a)) + parmse->post; + + */ + + type = TREE_TYPE (parmse->expr); + pointer = gfc_create_var (type, "optional"); + tmp = gfc_conv_expr_present (sym); + present = gfc_evaluate_now (tmp, &se->pre); + gfc_add_modify (&parmse->pre, pointer, parmse->expr); + pre_stmts = gfc_finish_block (&parmse->pre); + + gfc_init_block (&else_block); + gfc_add_modify (&else_block, pointer, 
build_int_cst (type, 0)); + else_stmt = gfc_finish_block (&else_block); + + tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, present, + pre_stmts, else_stmt); + gfc_add_expr_to_block (&se->pre, tmp); + + post_stmts = gfc_finish_block (&parmse->post); + tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, present, + post_stmts, build_empty_stmt (input_location)); + gfc_add_expr_to_block (&se->post, tmp); + + se->expr = pointer; + } + return; } diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h index 273c75a422c..e0118abaf18 100644 --- a/gcc/fortran/trans.h +++ b/gcc/fortran/trans.h @@ -532,7 +532,10 @@ int gfc_is_intrinsic_libcall (gfc_expr *); int gfc_conv_procedure_call (gfc_se *, gfc_symbol *, gfc_actual_arglist *, gfc_expr *, vec<tree, va_gc> *); -void gfc_conv_subref_array_arg (gfc_se *, gfc_expr *, int, sym_intent, bool); +void gfc_conv_subref_array_arg (gfc_se *, gfc_expr *, int, sym_intent, bool, + const gfc_symbol *fsym = NULL, + const char *proc_name = NULL, + gfc_symbol *sym = NULL); /* Generate code for a scalar assignment. */ tree gfc_trans_scalar_assign (gfc_se *, gfc_se *, gfc_typespec, bool, bool, diff --git a/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90 b/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90 index 15f9ecb74de..2af089e84e8 100644 --- a/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90 +++ b/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-fdump-tree-original" } +! { dg-options "-O0 -fdump-tree-original" } ! ! Test the fix for PR66082. The original problem was with the first ! call foo_1d. diff --git a/gcc/testsuite/gfortran.dg/assumed_type_2.f90 b/gcc/testsuite/gfortran.dg/assumed_type_2.f90 index dce5ac6839c..5d3cd7eaece 100644 --- a/gcc/testsuite/gfortran.dg/assumed_type_2.f90 +++ b/gcc/testsuite/gfortran.dg/assumed_type_2.f90 @@ -1,5 +1,5 @@ -! { dg-do run } -! { dg-options "-fdump-tree-original" } +! { dg-do compile } +! 
{ dg-options "-O0 -fdump-tree-original" } ! ! PR fortran/48820 ! diff --git a/gcc/testsuite/gfortran.dg/assumed_type_2a.f90 b/gcc/testsuite/gfortran.dg/assumed_type_2a.f90 new file mode 100644 index 00000000000..125bfcbe839 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/assumed_type_2a.f90 @@ -0,0 +1,139 @@ +! { dg-do run } +! +! PR fortran/48820 +! +! Test TYPE(*) +! + +module mod + use iso_c_binding, only: c_loc, c_ptr, c_bool + implicit none + interface my_c_loc + function my_c_loc1(x) bind(C) + import c_ptr + type(*) :: x + type(c_ptr) :: my_c_loc1 + end function + function my_c_loc2(x) bind(C) + import c_ptr + type(*) :: x(*) + type(c_ptr) :: my_c_loc2 + end function + end interface my_c_loc +contains + subroutine sub_scalar (arg1, presnt) + type(*), target, optional :: arg1 + logical :: presnt + type(c_ptr) :: cpt + if (presnt .neqv. present (arg1)) STOP 1 + cpt = c_loc (arg1) + end subroutine sub_scalar + + subroutine sub_array_shape (arg2, lbounds, ubounds) + type(*), target :: arg2(:,:) + type(c_ptr) :: cpt + integer :: lbounds(2), ubounds(2) + if (any (lbound(arg2) /= lbounds)) STOP 2 + if (any (ubound(arg2) /= ubounds)) STOP 3 + if (any (shape(arg2) /= ubounds-lbounds+1)) STOP 4 + if (size(arg2) /= product (ubounds-lbounds+1)) STOP 5 + if (rank (arg2) /= 2) STOP 6 +! if (.not. is_continuous (arg2)) STOP 7 !<< Not yet implemented +! cpt = c_loc (arg2) ! 
<< FIXME: Valid since TS29113 + call sub_array_assumed (arg2) + end subroutine sub_array_shape + + subroutine sub_array_assumed (arg3) + type(*), target :: arg3(*) + type(c_ptr) :: cpt + cpt = c_loc (arg3) + end subroutine sub_array_assumed +end module + +use mod +use iso_c_binding, only: c_int, c_null_ptr +implicit none +type t1 + integer :: a +end type t1 +type :: t2 + sequence + integer :: b +end type t2 +type, bind(C) :: t3 + integer(c_int) :: c +end type t3 + +integer :: scalar_int +real, allocatable :: scalar_real_alloc +character, pointer :: scalar_char_ptr + +integer :: array_int(3) +real, allocatable :: array_real_alloc(:,:) +character, pointer :: array_char_ptr(:,:) + +type(t1) :: scalar_t1 +type(t2), allocatable :: scalar_t2_alloc +type(t3), pointer :: scalar_t3_ptr + +type(t1) :: array_t1(4) +type(t2), allocatable :: array_t2_alloc(:,:) +type(t3), pointer :: array_t3_ptr(:,:) + +class(t1), allocatable :: scalar_class_t1_alloc +class(t1), pointer :: scalar_class_t1_ptr + +class(t1), allocatable :: array_class_t1_alloc(:,:) +class(t1), pointer :: array_class_t1_ptr(:,:) + +scalar_char_ptr => null() +scalar_t3_ptr => null() + +call sub_scalar (presnt=.false.) +call sub_scalar (scalar_real_alloc, .false.) +call sub_scalar (scalar_char_ptr, .false.) +call sub_scalar (null (), .false.) +call sub_scalar (scalar_t2_alloc, .false.) +call sub_scalar (scalar_t3_ptr, .false.) + +allocate (scalar_real_alloc, scalar_char_ptr, scalar_t3_ptr) +allocate (scalar_class_t1_alloc, scalar_class_t1_ptr, scalar_t2_alloc) +allocate (array_real_alloc(3:5,2:4), array_char_ptr(-2:2,2)) +allocate (array_t2_alloc(3:5,2:4), array_t3_ptr(-2:2,2)) +allocate (array_class_t1_alloc(3,3), array_class_t1_ptr(4,4)) + +call sub_scalar (scalar_int, .true.) +call sub_scalar (scalar_real_alloc, .true.) +call sub_scalar (scalar_char_ptr, .true.) +call sub_scalar (array_int(2), .true.) +call sub_scalar (array_real_alloc(3,2), .true.) +call sub_scalar (array_char_ptr(0,1), .true.) 
+call sub_scalar (scalar_t1, .true.) +call sub_scalar (scalar_t2_alloc, .true.) +call sub_scalar (scalar_t3_ptr, .true.) +call sub_scalar (array_t1(2), .true.) +call sub_scalar (array_t2_alloc(3,2), .true.) +call sub_scalar (array_t3_ptr(0,1), .true.) +call sub_scalar (array_class_t1_alloc(2,1), .true.) +call sub_scalar (array_class_t1_ptr(3,3), .true.) + +call sub_array_assumed (array_int) +call sub_array_assumed (array_real_alloc) +call sub_array_assumed (array_char_ptr) +call sub_array_assumed (array_t1) +call sub_array_assumed (array_t2_alloc) +call sub_array_assumed (array_t3_ptr) +call sub_array_assumed (array_class_t1_alloc) +call sub_array_assumed (array_class_t1_ptr) + +call sub_array_shape (array_real_alloc, [1,1], shape(array_real_alloc)) +call sub_array_shape (array_char_ptr, [1,1], shape(array_char_ptr)) +call sub_array_shape (array_t2_alloc, [1,1], shape(array_t2_alloc)) +call sub_array_shape (array_t3_ptr, [1,1], shape(array_t3_ptr)) +call sub_array_shape (array_class_t1_alloc, [1,1], shape(array_class_t1_alloc)) +call sub_array_shape (array_class_t1_ptr, [1,1], shape(array_class_t1_ptr)) + +deallocate (scalar_char_ptr, scalar_class_t1_ptr, array_char_ptr) +deallocate (array_class_t1_ptr, array_t3_ptr) + +end diff --git a/gcc/testsuite/gfortran.dg/c_loc_test_22.f90 b/gcc/testsuite/gfortran.dg/c_loc_test_22.f90 index 5f4f9775b4a..9c40b26d830 100644 --- a/gcc/testsuite/gfortran.dg/c_loc_test_22.f90 +++ b/gcc/testsuite/gfortran.dg/c_loc_test_22.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-fdump-tree-original" } +! { dg-options "-O0 -fdump-tree-original" } ! ! PR fortran/56907 ! diff --git a/gcc/testsuite/gfortran.dg/contiguous_3.f90 b/gcc/testsuite/gfortran.dg/contiguous_3.f90 index 724ec83ed10..ba0ccce8f9e 100644 --- a/gcc/testsuite/gfortran.dg/contiguous_3.f90 +++ b/gcc/testsuite/gfortran.dg/contiguous_3.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-fdump-tree-original" } +! { dg-options "-O0 -fdump-tree-original" } ! ! 
PR fortran/40632 ! diff --git a/gcc/testsuite/gfortran.dg/internal_pack_11.f90 b/gcc/testsuite/gfortran.dg/internal_pack_11.f90 index a1d357cee73..c341a1bbc5f 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_11.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_11.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-fdump-tree-original" } +! { dg-options "-O0 -fdump-tree-original" } ! ! Test the fix for PR43173, where unnecessary calls to internal_pack/unpack ! were being produced below. These references are contiguous and so do not diff --git a/gcc/testsuite/gfortran.dg/internal_pack_12.f90 b/gcc/testsuite/gfortran.dg/internal_pack_12.f90 index 55631c80e6e..da507322cbb 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_12.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_12.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-fdump-tree-original" } +! { dg-options "-O0 -fdump-tree-original" } ! ! Test the fix for PR43243, where unnecessary calls to internal_pack/unpack ! were being produced below. These references are contiguous and so do not diff --git a/gcc/testsuite/gfortran.dg/internal_pack_16.f90 b/gcc/testsuite/gfortran.dg/internal_pack_16.f90 index 7e34c2bf733..92c4b150db8 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_16.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_16.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-O0 -fdump-tree-original" } ! PR 59345 - pack/unpack was not needed here. SUBROUTINE S1(A) REAL :: A(3) diff --git a/gcc/testsuite/gfortran.dg/internal_pack_17.f90 b/gcc/testsuite/gfortran.dg/internal_pack_17.f90 index c1b813b0c91..176ad879ba2 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_17.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_17.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-O0 -fdump-tree-original" } ! PR 59345 - pack/unpack was not needed here. ! 
Original test case by Joost VandeVondele SUBROUTINE S1(A) diff --git a/gcc/testsuite/gfortran.dg/internal_pack_18.f90 b/gcc/testsuite/gfortran.dg/internal_pack_18.f90 index ede0691bb9f..b4404726d12 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_18.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_18.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-O0 -fdump-tree-original" } ! PR 57992 - this was packed/unpacked unnecessarily. ! Original case by Tobias Burnus. subroutine test diff --git a/gcc/testsuite/gfortran.dg/internal_pack_19.f90 b/gcc/testsuite/gfortran.dg/internal_pack_19.f90 new file mode 100644 index 00000000000..06b916b7d8e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/internal_pack_19.f90 @@ -0,0 +1,23 @@ +! { dg-do compile } +! { dg-options "-Os -fdump-tree-original" } +! Check that internal_pack is called with -Os. +module x + implicit none +contains + subroutine bar(a, n) + integer, intent(in) :: n + integer, intent(in), dimension(n) :: a + print *,a + end subroutine bar +end module x + +program main + use x + implicit none + integer, parameter :: n = 10 + integer, dimension(n) :: a + integer :: i + a = [(i,i=1,n)] + call bar(a(n:1:-1),n) +end program main +! { dg-final { scan-tree-dump-times "_gfortran_internal_pack" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/internal_pack_20.f90 b/gcc/testsuite/gfortran.dg/internal_pack_20.f90 new file mode 100644 index 00000000000..f93f06bf272 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/internal_pack_20.f90 @@ -0,0 +1,23 @@ +! { dg-do compile } +! { dg-options "-O -fdump-tree-original" } +! Check that internal_pack is not called with -O. 
+module x + implicit none +contains + subroutine bar(a, n) + integer, intent(in) :: n + integer, intent(in), dimension(n) :: a + print *,a + end subroutine bar +end module x + +program main + use x + implicit none + integer, parameter :: n = 10 + integer, dimension(n) :: a + integer :: i + a = [(i,i=1,n)] + call bar(a(n:1:-1),n) +end program main +! { dg-final { scan-tree-dump-not "_gfortran_internal_pack" "original" } } diff --git a/gcc/testsuite/gfortran.dg/internal_pack_21.f90 b/gcc/testsuite/gfortran.dg/internal_pack_21.f90 new file mode 100644 index 00000000000..d0ce942a9f8 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/internal_pack_21.f90 @@ -0,0 +1,24 @@ +! { dg-do run } +! { dg-options "-O -fdump-tree-original" } +! Test handling of the optional argument. + +MODULE M1 + INTEGER, PARAMETER :: dp=KIND(0.0D0) +CONTAINS + SUBROUTINE S1(a) + REAL(dp), DIMENSION(45), INTENT(OUT), & + OPTIONAL :: a + if (present(a)) STOP 1 + END SUBROUTINE S1 + SUBROUTINE S2(a) + REAL(dp), DIMENSION(:, :), INTENT(OUT), & + OPTIONAL :: a + CALL S1(a) + END SUBROUTINE +END MODULE M1 + +USE M1 +CALL S2() +END +! { dg-final { scan-tree-dump-times "optional" 4 "original" } } +! { dg-final { scan-tree-dump-not "_gfortran_internal_unpack" "original" } } diff --git a/gcc/testsuite/gfortran.dg/internal_pack_4.f90 b/gcc/testsuite/gfortran.dg/internal_pack_4.f90 index 00f316414bc..9de09ab072b 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_4.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_4.f90 @@ -1,5 +1,4 @@ ! { dg-do run } -! { dg-options "-fdump-tree-original" } ! ! PR fortran/36132 ! @@ -25,6 +24,3 @@ END MODULE M1 USE M1 CALL S2() END - -! { dg-final { scan-tree-dump-times "a != 0B \\? \\$.*\\$ _gfortran_internal_pack" 1 "original" } } -! 
{ dg-final { scan-tree-dump-times "if \\(a != 0B &&" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/internal_pack_5.f90 b/gcc/testsuite/gfortran.dg/internal_pack_5.f90 index 3c5868f9efc..360ade491b5 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_5.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_5.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-fdump-tree-original" } +! { dg-options "-O0 -fdump-tree-original" } ! ! PR fortran/36909 ! diff --git a/gcc/testsuite/gfortran.dg/internal_pack_6.f90 b/gcc/testsuite/gfortran.dg/internal_pack_6.f90 index d6102761904..6d52a8c98c4 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_6.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_6.f90 @@ -1,5 +1,5 @@ -! { dg-do run } -! { dg-options "-fdump-tree-original" } +! { dg-do compile } +! { dg-options "-O0 -fdump-tree-original" } ! ! Test the fix for PR41113 and PR41117, in which unnecessary calls ! to internal_pack and internal_unpack were being generated. diff --git a/gcc/testsuite/gfortran.dg/internal_pack_6a.f90 b/gcc/testsuite/gfortran.dg/internal_pack_6a.f90 new file mode 100644 index 00000000000..a9fb2b52d97 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/internal_pack_6a.f90 @@ -0,0 +1,56 @@ +! { dg-do run } +! +! Test the fix for PR41113 and PR41117, in which unnecessary calls +! to internal_pack and internal_unpack were being generated. +! +! Contributed by Joost VandeVondele <jv244(a)cam.ac.uk> +! +MODULE M1 + TYPE T1 + REAL :: data(10) = [(i, i = 1, 10)] + END TYPE T1 +CONTAINS + SUBROUTINE S1(data, i, chksum) + REAL, DIMENSION(*) :: data + integer :: i, j + real :: subsum, chksum + subsum = 0 + do j = 1, i + subsum = subsum + data(j) + end do + if (abs(subsum - chksum) > 1e-6) STOP 1 + END SUBROUTINE S1 +END MODULE + +SUBROUTINE S2 + use m1 + TYPE(T1) :: d + + real :: data1(10) = [(i, i = 1, 10)] + REAL :: data(-4:5,-4:5) = reshape ([(real(i), i = 1, 100)], [10,10]) + +! 
PR41113 + CALL S1(d%data, 10, sum (d%data)) + CALL S1(data1, 10, sum (data1)) + +! PR41117 + DO i=-4,5 + CALL S1(data(:,i), 10, sum (data(:,i))) + ENDDO + +! With the fix for PR41113/7 this is the only time that _internal_pack +! was called. The final part of the fix for PR43072 put paid to it too. + DO i=-4,5 + CALL S1(data(-2:,i), 8, sum (data(-2:,i))) + ENDDO + DO i=-4,4 + CALL S1(data(:,i:i+1), 20, sum (reshape (data(:,i:i+1), [20]))) + ENDDO + DO i=-4,5 + CALL S1(data(2,i), 1, data(2,i)) + ENDDO +END SUBROUTINE S2 + + call s2 +end + diff --git a/gcc/testsuite/gfortran.dg/internal_pack_9.f90 b/gcc/testsuite/gfortran.dg/internal_pack_9.f90 index 9ce53f44354..2b44db5a805 100644 --- a/gcc/testsuite/gfortran.dg/internal_pack_9.f90 +++ b/gcc/testsuite/gfortran.dg/internal_pack_9.f90 @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-fdump-tree-original" } +! { dg-options "-O0 -fdump-tree-original" } ! ! During the discussion of the fix for PR43072, in which unnecessary ! calls to internal PACK/UNPACK were being generated, the following, diff --git a/gcc/testsuite/gfortran.dg/missing_optional_dummy_6.f90 b/gcc/testsuite/gfortran.dg/missing_optional_dummy_6.f90 index 4468ff159b9..cb6de2ebf61 100644 --- a/gcc/testsuite/gfortran.dg/missing_optional_dummy_6.f90 +++ b/gcc/testsuite/gfortran.dg/missing_optional_dummy_6.f90 @@ -46,14 +46,3 @@ contains end subroutine scalar2 end program test - -! { dg-final { scan-tree-dump-times "scalar2 \\(slr1" 1 "original" } } - -! { dg-final { scan-tree-dump-times "= es1 != 0B" 1 "original" } } -! { dg-final { scan-tree-dump-times "assumed_shape2 \\(es1" 0 "original" } } -! { dg-final { scan-tree-dump-times "explicit_shape2 \\(es1" 1 "original" } } - -! { dg-final { scan-tree-dump-times "= as1 != 0B" 2 "original" } } -! { dg-final { scan-tree-dump-times "assumed_shape2 \\(as1" 0 "original" } } -! 
{ dg-final { scan-tree-dump-times "explicit_shape2 \\(as1" 0 "original" } } - diff --git a/gcc/testsuite/gfortran.dg/missing_optional_dummy_6a.f90 b/gcc/testsuite/gfortran.dg/missing_optional_dummy_6a.f90 new file mode 100644 index 00000000000..0e08ed3aa0c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/missing_optional_dummy_6a.f90 @@ -0,0 +1,59 @@ +! { dg-do compile } +! { dg-options "-O0 -fdump-tree-original" } +! +! PR fortran/41907 +! +program test + implicit none + call scalar1 () + call assumed_shape1 () + call explicit_shape1 () +contains + + ! Calling functions + subroutine scalar1 (slr1) + integer, optional :: slr1 + call scalar2 (slr1) + end subroutine scalar1 + + subroutine assumed_shape1 (as1) + integer, dimension(:), optional :: as1 + call assumed_shape2 (as1) + call explicit_shape2 (as1) + end subroutine assumed_shape1 + + subroutine explicit_shape1 (es1) + integer, dimension(5), optional :: es1 + call assumed_shape2 (es1) + call explicit_shape2 (es1) + end subroutine explicit_shape1 + + + ! Called functions + subroutine assumed_shape2 (as2) + integer, dimension(:),optional :: as2 + if (present (as2)) STOP 1 + end subroutine assumed_shape2 + + subroutine explicit_shape2 (es2) + integer, dimension(5),optional :: es2 + if (present (es2)) STOP 2 + end subroutine explicit_shape2 + + subroutine scalar2 (slr2) + integer, optional :: slr2 + if (present (slr2)) STOP 3 + end subroutine scalar2 + +end program test + +! { dg-final { scan-tree-dump-times "scalar2 \\(slr1" 1 "original" } } + +! { dg-final { scan-tree-dump-times "= es1 != 0B" 1 "original" } } +! { dg-final { scan-tree-dump-times "assumed_shape2 \\(es1" 0 "original" } } +! { dg-final { scan-tree-dump-times "explicit_shape2 \\(es1" 1 "original" } } + +! { dg-final { scan-tree-dump-times "= as1 != 0B" 2 "original" } } +! { dg-final { scan-tree-dump-times "assumed_shape2 \\(as1" 0 "original" } } +! 
{ dg-final { scan-tree-dump-times "explicit_shape2 \\(as1" 0 "original" } } + diff --git a/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 b/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 index fe334883a3e..3570b9719eb 100644 --- a/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 +++ b/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 @@ -1,5 +1,5 @@ -! { dg-do run } -! { dg-options "-fdump-tree-original" } +! { dg-do compile } +! { dg-options "-O0 -fdump-tree-original" } ! ! PR fortran/39505 ! diff --git a/gcc/testsuite/gfortran.dg/no_arg_check_2a.f90 b/gcc/testsuite/gfortran.dg/no_arg_check_2a.f90 new file mode 100644 index 00000000000..dc4adcb5619 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/no_arg_check_2a.f90 @@ -0,0 +1,121 @@ +! { dg-do run } +! +! PR fortran/39505 +! +! Test NO_ARG_CHECK +! Copied from assumed_type_2.f90 +! + +module mod + use iso_c_binding, only: c_loc, c_ptr, c_bool + implicit none + interface my_c_loc + function my_c_loc1(x) bind(C) + import c_ptr +!GCC$ attributes NO_ARG_CHECK :: x + type(*) :: x + type(c_ptr) :: my_c_loc1 + end function + end interface my_c_loc +contains + subroutine sub_scalar (arg1, presnt) + integer(8), target, optional :: arg1 + logical :: presnt + type(c_ptr) :: cpt +!GCC$ attributes NO_ARG_CHECK :: arg1 + if (presnt .neqv. 
present (arg1)) STOP 1 + cpt = c_loc (arg1) + end subroutine sub_scalar + + subroutine sub_array_assumed (arg3) +!GCC$ attributes NO_ARG_CHECK :: arg3 + logical(1), target :: arg3(*) + type(c_ptr) :: cpt + cpt = c_loc (arg3) + end subroutine sub_array_assumed +end module + +use mod +use iso_c_binding, only: c_int, c_null_ptr +implicit none +type t1 + integer :: a +end type t1 +type :: t2 + sequence + integer :: b +end type t2 +type, bind(C) :: t3 + integer(c_int) :: c +end type t3 + +integer :: scalar_int +real, allocatable :: scalar_real_alloc +character, pointer :: scalar_char_ptr + +integer :: array_int(3) +real, allocatable :: array_real_alloc(:,:) +character, pointer :: array_char_ptr(:,:) + +type(t1) :: scalar_t1 +type(t2), allocatable :: scalar_t2_alloc +type(t3), pointer :: scalar_t3_ptr + +type(t1) :: array_t1(4) +type(t2), allocatable :: array_t2_alloc(:,:) +type(t3), pointer :: array_t3_ptr(:,:) + +class(t1), allocatable :: scalar_class_t1_alloc +class(t1), pointer :: scalar_class_t1_ptr + +class(t1), allocatable :: array_class_t1_alloc(:,:) +class(t1), pointer :: array_class_t1_ptr(:,:) + +scalar_char_ptr => null() +scalar_t3_ptr => null() + +call sub_scalar (presnt=.false.) +call sub_scalar (scalar_real_alloc, .false.) +call sub_scalar (scalar_char_ptr, .false.) +call sub_scalar (null (), .false.) +call sub_scalar (scalar_t2_alloc, .false.) +call sub_scalar (scalar_t3_ptr, .false.) + +allocate (scalar_real_alloc, scalar_char_ptr, scalar_t3_ptr) +allocate (scalar_class_t1_alloc, scalar_class_t1_ptr, scalar_t2_alloc) +allocate (array_real_alloc(3:5,2:4), array_char_ptr(-2:2,2)) +allocate (array_t2_alloc(3:5,2:4), array_t3_ptr(-2:2,2)) +allocate (array_class_t1_alloc(3,3), array_class_t1_ptr(4,4)) + +call sub_scalar (scalar_int, .true.) +call sub_scalar (scalar_real_alloc, .true.) +call sub_scalar (scalar_char_ptr, .true.) +call sub_scalar (array_int(2), .true.) +call sub_scalar (array_real_alloc(3,2), .true.) 
+call sub_scalar (array_char_ptr(0,1), .true.) +call sub_scalar (scalar_t1, .true.) +call sub_scalar (scalar_t2_alloc, .true.) +call sub_scalar (scalar_t3_ptr, .true.) +call sub_scalar (array_t1(2), .true.) +call sub_scalar (array_t2_alloc(3,2), .true.) +call sub_scalar (array_t3_ptr(0,1), .true.) +call sub_scalar (array_class_t1_alloc(2,1), .true.) +call sub_scalar (array_class_t1_ptr(3,3), .true.) + +call sub_array_assumed (array_int) +call sub_array_assumed (array_real_alloc) +call sub_array_assumed (array_char_ptr) +call sub_array_assumed (array_t1) +call sub_array_assumed (array_t2_alloc) +call sub_array_assumed (array_t3_ptr) +call sub_array_assumed (array_class_t1_alloc) +call sub_array_assumed (array_class_t1_ptr) + +deallocate (scalar_char_ptr, scalar_class_t1_ptr, array_char_ptr) +deallocate (array_class_t1_ptr, array_t3_ptr) +contains + subroutine sub(x) + integer :: x(:) + call sub_array_assumed (x) + end subroutine sub +end diff --git a/gcc/testsuite/gfortran.dg/typebound_assignment_5.f03 b/gcc/testsuite/gfortran.dg/typebound_assignment_5.f03 index f176b841fc0..e7c9126b35c 100644 --- a/gcc/testsuite/gfortran.dg/typebound_assignment_5.f03 +++ b/gcc/testsuite/gfortran.dg/typebound_assignment_5.f03 @@ -1,5 +1,5 @@ -! { dg-do run } -! { dg-options "-fdump-tree-original" } +! { dg-do compile } +! { dg-options "-O0 -fdump-tree-original" } ! ! PR fortran/49074 ! ICE on defined assignment with class arrays. diff --git a/gcc/testsuite/gfortran.dg/typebound_assignment_5a.f03 b/gcc/testsuite/gfortran.dg/typebound_assignment_5a.f03 new file mode 100644 index 00000000000..b55b42b589c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/typebound_assignment_5a.f03 @@ -0,0 +1,39 @@ +! { dg-do run } +! +! PR fortran/49074 +! ICE on defined assignment with class arrays. 
+ + module foo + type bar + integer :: i + + contains + + generic :: assignment (=) => assgn_bar + procedure, private :: assgn_bar + end type bar + + contains + + elemental subroutine assgn_bar (a, b) + class (bar), intent (inout) :: a + class (bar), intent (in) :: b + + select type (b) + type is (bar) + a%i = b%i + end select + + return + end subroutine assgn_bar + end module foo + + program main + use foo </cut>
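The comment block added to gfc_conv_subref_array_arg in the diff above describes the shape of the generated code for an optional argument: the temporary is set up only when the argument is present, a null pointer is passed otherwise, and the cleanup is guarded by the same test. A rough plain-C analogy of that shape follows. It is only a sketch, not GCC code: forward_optional and callee_saw_argument are hypothetical names, and malloc/memcpy/free merely stand in for the "pre" and "post" statement blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callee: reports whether its optional argument is present. */
static int callee_saw_argument(const double *optional)
{
    return optional != NULL;
}

/* Hypothetical caller that forwards an optional array `a` of `n` elements
   through a temporary.  NULL plays the role of an absent argument. */
static int forward_optional(const double *a, size_t n)
{
    const int present = (a != NULL);   /* evaluated once, before the call */
    double *optional;
    int result;

    /* Assumed precondition for this sketch: an absent array has no elements. */
    assert(present || n == 0);

    if (present) {
        /* "pre" block: build the temporary only when the argument exists. */
        optional = malloc((n ? n : 1) * sizeof *optional);
        if (optional == NULL)
            abort();
        memcpy(optional, a, n * sizeof *optional);
    } else {
        optional = NULL;               /* an absent argument stays absent */
    }

    result = callee_saw_argument(optional);

    if (present)
        free(optional);                /* "post" block: cleanup only if present */

    return result;
}
```

Evaluating the presence test once and reusing it for both guards matches the way the patch stores the result of gfc_conv_expr_present in a variable before emitting the two conditionals.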

4 years, 11 months

Re: [CI-NOTIFY]: TCWG Bisect tcwg_kernel/llvm-release-arm-next-allyesconfig - Build # 13 - Successful!

by Maxim Kuvyrkov

Hi Alexey,

Your patch appears to break build on aarch64 and arm in allmodconfig and allyesconfig configurations. Would you please investigate? Please let us know if it doesn't easily reproduce for you.

Thanks!

--
Maxim Kuvyrkov
https://www.linaro.org

> On Jul 17, 2021, at 8:08 PM, ci_notify(a)linaro.org wrote:
>
> Successfully identified regression in *linux* in CI configuration tcwg_kernel/llvm-release-arm-next-allyesconfig. So far, this commit has regressed CI configurations:
> - tcwg_kernel/gnu-master-aarch64-next-allmodconfig
> - tcwg_kernel/llvm-master-aarch64-next-allyesconfig
> - tcwg_kernel/llvm-release-arm-next-allmodconfig
> - tcwg_kernel/llvm-release-arm-next-allyesconfig
>
> Culprit:
> <cut>
> commit 6f4266a78a4e090b452a0335c1414f3240684743
> Author: Alexey Dobriyan <adobriyan(a)gmail.com>
>
>     kbuild: decouple build from userspace headers
> </cut>
>
> Results regressed to (for first_bad == 6f4266a78a4e090b452a0335c1414f3240684743)
> # reset_artifacts:
> -10
> # build_abe binutils:
> -9
> # build_llvm:
> -5
> # build_abe qemu:
> -2
> # linux_n_obj:
> 19710
> # First few build errors in logs:
> # 00:04:48 crypto/aegis128-neon-inner.c:11:10: fatal error: 'arm_neon.h' file not found
> # 00:04:48 make[1]: *** [crypto/aegis128-neon-inner.o] Error 1
> # 00:05:37 lib/raid6/recov_neon_inner.c:7:10: fatal error: 'arm_neon.h' file not found
> # 00:05:37 make[2]: *** [lib/raid6/recov_neon_inner.o] Error 1
> # 00:05:53 lib/raid6/neon1.c:27:10: fatal error: 'arm_neon.h' file not found
> # 00:05:53 make[2]: *** [lib/raid6/neon1.o] Error 1
> # 00:05:53 lib/raid6/neon2.c:27:10: fatal error: 'arm_neon.h' file not found
> # 00:05:53 make[2]: *** [lib/raid6/neon2.o] Error 1
> # 00:05:53 lib/raid6/neon4.c:27:10: fatal error: 'arm_neon.h' file not found
> # 00:05:53 make[2]: *** [lib/raid6/neon4.o] Error 1
>
> from (for last_good == d06391f28276b0ad28a59073bdb8eb10dc4ea495)
> # reset_artifacts:
> -10
> # build_abe binutils:
> -9
> # build_llvm:
> -5
> # build_abe qemu:
> -2
> # linux_n_obj:
> 19802
> # linux build successful:
> all
>
> Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
> Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
> Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
>
> Configuration details:
> rr[linux_url]="https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git"
> rr[linux_branch]="e9338abf0e186336022293d2e454c106761f262b"
>
> Reproduce builds:
> <cut>
> mkdir investigate-linux-6f4266a78a4e090b452a0335c1414f3240684743
> cd investigate-linux-6f4266a78a4e090b452a0335c1414f3240684743
>
> git clone https://git.linaro.org/toolchain/jenkins-scripts
>
> mkdir -p artifacts/manifests
> curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all… --fail
> curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all… --fail
> curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all… --fail
> chmod +x artifacts/test.sh
>
> # Reproduce the baseline build (build all pre-requisites)
> ./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh
>
> # Save baseline build state (which is then restored in artifacts/test.sh)
> rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude linux/ ./ ./bisect/baseline/
>
> cd linux
>
> # Reproduce first_bad build
> git checkout --detach 6f4266a78a4e090b452a0335c1414f3240684743
> ../artifacts/test.sh
>
> # Reproduce last_good build
> git checkout --detach d06391f28276b0ad28a59073bdb8eb10dc4ea495
> ../artifacts/test.sh
>
> cd ..
> </cut>
>
> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
>
> Artifacts: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
> Build log: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
>
> Full commit (up to 1000 lines):
> <cut>
> commit 6f4266a78a4e090b452a0335c1414f3240684743
> Author: Alexey Dobriyan <adobriyan(a)gmail.com>
> Date: Fri Jul 16 13:44:03 2021 +1000
>
>     kbuild: decouple build from userspace headers
>
>     First, userspace headers can be under incompatible license.
>
>     Second, kernel doesn't require userspace to operate and should not
>     require anything from userspace to be built other than compiler.
>     We would use -ffreestanding too if not builtin function shenanigans.
>
>     To decouple:
>
>     * ship minimal stdarg.h as <linux/stdarg.h>,
>       1 type, 4 macros
>
>       GPL 2 version of <stdarg.h> can be extracted from
>       http://archive.debian.org/debian/pool/main/g/gcc-4.2/gcc-4.2_4.2.4.orig.tar…
>
>     * delete "-isystem" from command line arguments,
>       this is what enables header leakage
>
>     * fixup/delete include directives where necessary.
> > Link: https://lkml.kernel.org/r/YO8ioz4sHwcUAkdt@localhost.localdomain > Signed-off-by: Alexey Dobriyan <adobriyan(a)gmail.com> > Cc: Arnd Bergmann <arnd(a)arndb.de> > Cc: Masahiro Yamada <masahiroy(a)kernel.org> > Cc: Christoph Hellwig <hch(a)infradead.org> > Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> > Signed-off-by: Stephen Rothwell <sfr(a)canb.auug.org.au> > --- > Makefile | 2 +- > arch/arm/kernel/process.c | 2 -- > arch/arm/mach-bcm/bcm_kona_smc.c | 2 -- > arch/arm64/kernel/process.c | 3 --- > arch/openrisc/kernel/process.c | 2 -- > arch/parisc/kernel/firmware.c | 2 +- > arch/parisc/kernel/process.c | 3 --- > arch/powerpc/kernel/prom.c | 1 - > arch/powerpc/kernel/prom_init.c | 2 +- > arch/powerpc/kernel/rtas.c | 2 +- > arch/powerpc/kernel/udbg.c | 2 +- > arch/s390/boot/pgm_check_info.c | 2 +- > arch/sparc/kernel/process_32.c | 3 --- > arch/sparc/kernel/process_64.c | 3 --- > arch/um/include/shared/irq_user.h | 1 - > arch/um/include/shared/os.h | 1 - > arch/um/os-Linux/signal.c | 2 +- > arch/um/os-Linux/util.c | 1 + > arch/x86/boot/boot.h | 2 +- > crypto/aegis128-neon-inner.c | 2 -- > drivers/block/xen-blkback/xenbus.c | 1 - > drivers/firmware/efi/libstub/efi-stub-helper.c | 2 +- > drivers/firmware/efi/libstub/vsprintf.c | 2 +- > drivers/gpu/drm/amd/display/dc/dc_helper.c | 2 +- > drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 1 - > drivers/gpu/drm/drm_print.c | 2 +- > drivers/gpu/drm/msm/disp/msm_disp_snapshot.h | 1 - > drivers/isdn/capi/capiutil.c | 2 +- > drivers/macintosh/macio-adb.c | 1 - > drivers/macintosh/via-cuda.c | 2 +- > drivers/macintosh/via-macii.c | 2 -- > drivers/macintosh/via-pmu.c | 2 +- > drivers/net/wireless/intersil/orinoco/hermes.c | 1 - > drivers/net/wwan/iosm/iosm_ipc_imem.h | 1 - > drivers/pinctrl/aspeed/pinmux-aspeed.h | 1 - > drivers/scsi/elx/efct/efct_driver.h | 1 - > .../media/atomisp/pci/hive_isp_css_common/host/isp_local.h | 2 -- > .../media/atomisp/pci/hive_isp_css_include/print_support.h | 2 +- > 
drivers/staging/media/atomisp/pci/ia_css_env.h | 2 +-
>  .../media/atomisp/pci/runtime/debug/interface/ia_css_debug.h | 2 +-
>  drivers/staging/media/atomisp/pci/sh_css_internal.h | 2 +-
>  drivers/xen/xen-scsiback.c | 2 --
>  fs/befs/debug.c | 2 +-
>  fs/reiserfs/prints.c | 2 +-
>  fs/ufs/super.c | 2 +-
>  include/acpi/platform/acgcc.h | 2 +-
>  include/linux/filter.h | 2 --
>  include/linux/kernel.h | 2 +-
>  include/linux/mISDNif.h | 1 -
>  include/linux/printk.h | 2 +-
>  include/linux/stdarg.h | 11 +++++++++++
>  include/linux/string.h | 2 +-
>  kernel/debug/kdb/kdb_support.c | 1 -
>  lib/debug_info.c | 3 +--
>  lib/kasprintf.c | 2 +-
>  lib/kunit/string-stream.h | 2 +-
>  lib/vsprintf.c | 2 +-
>  mm/kfence/report.c | 2 +-
>  net/batman-adv/log.c | 2 +-
>  sound/aoa/codecs/onyx.h | 1 -
>  sound/aoa/codecs/tas.c | 1 -
>  sound/core/info.c | 1 -
>  62 files changed, 44 insertions(+), 77 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index eaa692976851..02591a333d86 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -972,7 +972,7 @@ KBUILD_CFLAGS += -falign-functions=64
>  endif
>
>  # arch Makefile may override CC so keep this after arch Makefile is included
> -NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
> +NOSTDINC_FLAGS += -nostdinc
>
>  # warn about C99 declaration after statement
>  KBUILD_CFLAGS += -Wdeclaration-after-statement
> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> index fc9e8b37eaa8..bb5ad8a6a4c3 100644
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -5,8 +5,6 @@
>   * Copyright (C) 1996-2000 Russell King - Converted to ARM.
>   * Original Copyright (C) 1995 Linus Torvalds
>   */
> -#include <stdarg.h>
> -
>  #include <linux/export.h>
>  #include <linux/sched.h>
>  #include <linux/sched/debug.h>
> diff --git a/arch/arm/mach-bcm/bcm_kona_smc.c b/arch/arm/mach-bcm/bcm_kona_smc.c
> index 43a16f922b53..43829e49ad93 100644
> --- a/arch/arm/mach-bcm/bcm_kona_smc.c
> +++ b/arch/arm/mach-bcm/bcm_kona_smc.c
> @@ -10,8 +10,6 @@
>   * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>   * GNU General Public License for more details.
>   */
> -
> -#include <stdarg.h>
>  #include <linux/smp.h>
>  #include <linux/io.h>
>  #include <linux/ioport.h>
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index c8989b999250..5f7ac9a0f9a3 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -6,9 +6,6 @@
>   * Copyright (C) 1996-2000 Russell King - Converted to ARM.
>   * Copyright (C) 2012 ARM Ltd.
>   */
> -
> -#include <stdarg.h>
> -
>  #include <linux/compat.h>
>  #include <linux/efi.h>
>  #include <linux/elf.h>
> diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c
> index eb62429681fc..b0698d9ce14f 100644
> --- a/arch/openrisc/kernel/process.c
> +++ b/arch/openrisc/kernel/process.c
> @@ -14,8 +14,6 @@
>   */
>
>  #define __KERNEL_SYSCALLS__
> -#include <stdarg.h>
> -
>  #include <linux/errno.h>
>  #include <linux/sched.h>
>  #include <linux/sched/debug.h>
> diff --git a/arch/parisc/kernel/firmware.c b/arch/parisc/kernel/firmware.c
> index 665b70086685..7034227dbdf3 100644
> --- a/arch/parisc/kernel/firmware.c
> +++ b/arch/parisc/kernel/firmware.c
> @@ -51,7 +51,7 @@
>   * prumpf 991016
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #include <linux/delay.h>
>  #include <linux/init.h>
> diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
> index 184ec3c1eae4..38ec4ae81239 100644
> --- a/arch/parisc/kernel/process.c
> +++ b/arch/parisc/kernel/process.c
> @@ -17,9 +17,6 @@
>   * Copyright (C) 2001-2014 Helge Deller <deller(a)gmx.de>
>   * Copyright (C) 2002 Randolph Chung <tausq with parisc-linux.org>
>   */
> -
> -#include <stdarg.h>
> -
>  #include <linux/elf.h>
>  #include <linux/errno.h>
>  #include <linux/kernel.h>
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index f620e04dc9bf..a1e7ba0fad09 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -11,7 +11,6 @@
>
>  #undef DEBUG
>
> -#include <stdarg.h>
>  #include <linux/kernel.h>
>  #include <linux/string.h>
>  #include <linux/init.h>
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index a5bf355ce1d6..10664633f7e3 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -14,7 +14,7 @@
>  /* we cannot use FORTIFY as it brings in new symbols */
>  #define __NO_FORTIFY
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/kernel.h>
>  #include <linux/string.h>
>  #include <linux/init.h>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 99f2cce635fb..ff80bbad22a5 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -7,7 +7,7 @@
>   * Copyright (C) 2001 IBM.
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/kernel.h>
>  #include <linux/types.h>
>  #include <linux/spinlock.h>
> diff --git a/arch/powerpc/kernel/udbg.c b/arch/powerpc/kernel/udbg.c
> index 01595e8cafe7..b1544b2f6321 100644
> --- a/arch/powerpc/kernel/udbg.c
> +++ b/arch/powerpc/kernel/udbg.c
> @@ -5,7 +5,7 @@
>   * c 2001 PPC 64 Team, IBM Corp
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/types.h>
>  #include <linux/sched.h>
>  #include <linux/console.h>
> diff --git a/arch/s390/boot/pgm_check_info.c b/arch/s390/boot/pgm_check_info.c
> index 3a46abed2549..b7d8dd88bbf2 100644
> --- a/arch/s390/boot/pgm_check_info.c
> +++ b/arch/s390/boot/pgm_check_info.c
> @@ -1,5 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include <linux/kernel.h>
> +#include <linux/stdarg.h>
>  #include <linux/string.h>
>  #include <linux/ctype.h>
>  #include <asm/stacktrace.h>
> @@ -8,7 +9,6 @@
>  #include <asm/setup.h>
>  #include <asm/sclp.h>
>  #include <asm/uv.h>
> -#include <stdarg.h>
>  #include "boot.h"
>
>  const char hex_asc[] = "0123456789abcdef";
> diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
> index 93983d6d431d..bbbe0cfef746 100644
> --- a/arch/sparc/kernel/process_32.c
> +++ b/arch/sparc/kernel/process_32.c
> @@ -8,9 +8,6 @@
>  /*
>   * This file handles the architecture-dependent parts of process handling..
>   */
> -
> -#include <stdarg.h>
> -
>  #include <linux/elfcore.h>
>  #include <linux/errno.h>
>  #include <linux/module.h>
> diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
> index d33c58a58d4f..0cabcdfb23fd 100644
> --- a/arch/sparc/kernel/process_64.c
> +++ b/arch/sparc/kernel/process_64.c
> @@ -9,9 +9,6 @@
>  /*
>   * This file handles the architecture-dependent parts of process handling..
>   */
> -
> -#include <stdarg.h>
> -
>  #include <linux/errno.h>
>  #include <linux/export.h>
>  #include <linux/sched.h>
> diff --git a/arch/um/include/shared/irq_user.h b/arch/um/include/shared/irq_user.h
> index 065829f443ae..86a8a573b65c 100644
> --- a/arch/um/include/shared/irq_user.h
> +++ b/arch/um/include/shared/irq_user.h
> @@ -7,7 +7,6 @@
>  #define __IRQ_USER_H__
>
>  #include <sysdep/ptrace.h>
> -#include <stdbool.h>
>
>  enum um_irq_type {
>  	IRQ_READ,
> diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
> index 60b84edc8a68..96d400387c93 100644
> --- a/arch/um/include/shared/os.h
> +++ b/arch/um/include/shared/os.h
> @@ -8,7 +8,6 @@
>  #ifndef __OS_H__
>  #define __OS_H__
>
> -#include <stdarg.h>
>  #include <irq_user.h>
>  #include <longjmp.h>
>  #include <mm_id.h>
> diff --git a/arch/um/os-Linux/signal.c b/arch/um/os-Linux/signal.c
> index 6de99bb16113..6cf098c23a39 100644
> --- a/arch/um/os-Linux/signal.c
> +++ b/arch/um/os-Linux/signal.c
> @@ -67,7 +67,7 @@ int signals_enabled;
>  #ifdef UML_CONFIG_UML_TIME_TRAVEL_SUPPORT
>  static int signals_blocked;
>  #else
> -#define signals_blocked false
> +#define signals_blocked 0
>  #endif
>  static unsigned int signals_pending;
>  static unsigned int signals_active = 0;
> diff --git a/arch/um/os-Linux/util.c b/arch/um/os-Linux/util.c
> index 07327425d06e..41297ec404bf 100644
> --- a/arch/um/os-Linux/util.c
> +++ b/arch/um/os-Linux/util.c
> @@ -3,6 +3,7 @@
>   * Copyright (C) 2000 - 2007 Jeff Dike (jdike(a){addtoit,linux.intel}.com)
>   */
>
> +#include <stdarg.h>
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <unistd.h>
> diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
> index ca866f1cca2e..34c9dbb6a47d 100644
> --- a/arch/x86/boot/boot.h
> +++ b/arch/x86/boot/boot.h
> @@ -18,7 +18,7 @@
>
>  #ifndef __ASSEMBLY__
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/types.h>
>  #include <linux/edd.h>
>  #include <asm/setup.h>
> diff --git a/crypto/aegis128-neon-inner.c b/crypto/aegis128-neon-inner.c
> index 7de485907d81..6b37fbb79884 100644
> --- a/crypto/aegis128-neon-inner.c
> +++ b/crypto/aegis128-neon-inner.c
> @@ -15,8 +15,6 @@
>
>  #define AEGIS_BLOCK_SIZE 16
>
> -#include <stddef.h>
> -
>  extern int aegis128_have_aes_insn;
>
>  void *memcpy(void *dest, const void *src, size_t n);
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index 125b22205d38..33eba3df4dd9 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -8,7 +8,6 @@
>
>  #define pr_fmt(fmt) "xen-blkback: " fmt
>
> -#include <stdarg.h>
>  #include <linux/module.h>
>  #include <linux/kthread.h>
>  #include <xen/events.h>
> diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
> index aa8da0a49829..4921580e1725 100644
> --- a/drivers/firmware/efi/libstub/efi-stub-helper.c
> +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
> @@ -7,7 +7,7 @@
>   * Copyright 2011 Intel Corporation; author Matt Fleming
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #include <linux/ctype.h>
>  #include <linux/efi.h>
> diff --git a/drivers/firmware/efi/libstub/vsprintf.c b/drivers/firmware/efi/libstub/vsprintf.c
> index 1088e288c04d..71c71c222346 100644
> --- a/drivers/firmware/efi/libstub/vsprintf.c
> +++ b/drivers/firmware/efi/libstub/vsprintf.c
> @@ -10,7 +10,7 @@
>   * Oh, it's a waste of space, but oh-so-yummy for debugging.
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #include <linux/compiler.h>
>  #include <linux/ctype.h>
> diff --git a/drivers/gpu/drm/amd/display/dc/dc_helper.c b/drivers/gpu/drm/amd/display/dc/dc_helper.c
> index a612ba6dc389..ab6bc5d79012 100644
> --- a/drivers/gpu/drm/amd/display/dc/dc_helper.c
> +++ b/drivers/gpu/drm/amd/display/dc/dc_helper.c
> @@ -28,9 +28,9 @@
>   */
>
>  #include <linux/delay.h>
> +#include <linux/stdarg.h>
>
>  #include "dm_services.h"
> -#include <stdarg.h>
>
>  #include "dc.h"
>  #include "dc_dmub_srv.h"
> diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
> index 7c4734f905d9..68fd451aca23 100644
> --- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
> +++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
> @@ -39,7 +39,6 @@
>  #include <linux/types.h>
>  #include <linux/string.h>
>  #include <linux/delay.h>
> -#include <stdarg.h>
>
>  #include "atomfirmware.h"
>
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 111b932cf2a9..f783d4963d4b 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -25,7 +25,7 @@
>
>  #define DEBUG /* for pr_debug() */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #include <linux/io.h>
>  #include <linux/moduleparam.h>
> diff --git a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h
> index c92a9508c8d3..0f9a5364cd86 100644
> --- a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h
> +++ b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h
> @@ -25,7 +25,6 @@
>  #include <linux/pm_runtime.h>
>  #include <linux/kthread.h>
>  #include <linux/devcoredump.h>
> -#include <stdarg.h>
>  #include "msm_kms.h"
>
>  #define MSM_DISP_SNAPSHOT_MAX_BLKS 10
> diff --git a/drivers/isdn/capi/capiutil.c b/drivers/isdn/capi/capiutil.c
> index f26bf3c66d7e..d7ae42edc4a8 100644
> --- a/drivers/isdn/capi/capiutil.c
> +++ b/drivers/isdn/capi/capiutil.c
> @@ -379,7 +379,7 @@ static char *pnames[] =
>  	/*2f */ "Useruserdata"
>  };
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  /*-------------------------------------------------------*/
>  static _cdebbuf *bufprint(_cdebbuf *cdb, char *fmt, ...)
> diff --git a/drivers/macintosh/macio-adb.c b/drivers/macintosh/macio-adb.c
> index d4759db002c6..dc634c2932fd 100644
> --- a/drivers/macintosh/macio-adb.c
> +++ b/drivers/macintosh/macio-adb.c
> @@ -2,7 +2,6 @@
>  /*
>   * Driver for the ADB controller in the Mac I/O (Hydra) chip.
>   */
> -#include <stdarg.h>
>  #include <linux/types.h>
>  #include <linux/errno.h>
>  #include <linux/kernel.h>
> diff --git a/drivers/macintosh/via-cuda.c b/drivers/macintosh/via-cuda.c
> index 3581abfb0c6a..cd267392289c 100644
> --- a/drivers/macintosh/via-cuda.c
> +++ b/drivers/macintosh/via-cuda.c
> @@ -9,7 +9,7 @@
>   *
>   * Copyright (C) 1996 Paul Mackerras.
>   */
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/types.h>
>  #include <linux/errno.h>
>  #include <linux/kernel.h>
> diff --git a/drivers/macintosh/via-macii.c b/drivers/macintosh/via-macii.c
> index 060e03f2264b..db9270da5b8e 100644
> --- a/drivers/macintosh/via-macii.c
> +++ b/drivers/macintosh/via-macii.c
> @@ -23,8 +23,6 @@
>   * Apple's "ADB Analyzer" bus sniffer is invaluable:
>   * ftp://ftp.apple.com/developer/Tool_Chest/Devices_-_Hardware/Apple_Desktop_B…
>   */
> -
> -#include <stdarg.h>
>  #include <linux/types.h>
>  #include <linux/errno.h>
>  #include <linux/kernel.h>
> diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
> index 4bdd4c45e7a7..4b98bc26a94b 100644
> --- a/drivers/macintosh/via-pmu.c
> +++ b/drivers/macintosh/via-pmu.c
> @@ -18,7 +18,7 @@
>   * a sleep or a freq. switch
>   *
>   */
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/mutex.h>
>  #include <linux/types.h>
>  #include <linux/errno.h>
> diff --git a/drivers/net/wireless/intersil/orinoco/hermes.c b/drivers/net/wireless/intersil/orinoco/hermes.c
> index 6d4b7f64efcf..256946552742 100644
> --- a/drivers/net/wireless/intersil/orinoco/hermes.c
> +++ b/drivers/net/wireless/intersil/orinoco/hermes.c
> @@ -79,7 +79,6 @@
>
>  #undef HERMES_DEBUG
>  #ifdef HERMES_DEBUG
> -#include <stdarg.h>
>
>  #define DEBUG(lvl, stuff...) if ((lvl) <= HERMES_DEBUG) DMSG(stuff)
>
> diff --git a/drivers/net/wwan/iosm/iosm_ipc_imem.h b/drivers/net/wwan/iosm/iosm_ipc_imem.h
> index 0d2f10e4cbc8..dc65b0712261 100644
> --- a/drivers/net/wwan/iosm/iosm_ipc_imem.h
> +++ b/drivers/net/wwan/iosm/iosm_ipc_imem.h
> @@ -7,7 +7,6 @@
>  #define IOSM_IPC_IMEM_H
>
>  #include <linux/skbuff.h>
> -#include <stdbool.h>
>
>  #include "iosm_ipc_mmio.h"
>  #include "iosm_ipc_pcie.h"
> diff --git a/drivers/pinctrl/aspeed/pinmux-aspeed.h b/drivers/pinctrl/aspeed/pinmux-aspeed.h
> index b69ba6b360a2..4d7548686f39 100644
> --- a/drivers/pinctrl/aspeed/pinmux-aspeed.h
> +++ b/drivers/pinctrl/aspeed/pinmux-aspeed.h
> @@ -5,7 +5,6 @@
>  #define ASPEED_PINMUX_H
>
>  #include <linux/regmap.h>
> -#include <stdbool.h>
>
>  /*
>   * The ASPEED SoCs provide typically more than 200 pins for GPIO and other
> diff --git a/drivers/scsi/elx/efct/efct_driver.h b/drivers/scsi/elx/efct/efct_driver.h
> index dab8eac4f243..0e3c931db7c2 100644
> --- a/drivers/scsi/elx/efct/efct_driver.h
> +++ b/drivers/scsi/elx/efct/efct_driver.h
> @@ -10,7 +10,6 @@
>  /***************************************************************************
>   * OS specific includes
>   */
> -#include <stdarg.h>
>  #include <linux/module.h>
>  #include <linux/debugfs.h>
>  #include <linux/firmware.h>
> diff --git a/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h b/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h
> index eceeb5d160ad..4dbec4063b3d 100644
> --- a/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h
> +++ b/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h
> @@ -16,8 +16,6 @@
>  #ifndef __ISP_LOCAL_H_INCLUDED__
>  #define __ISP_LOCAL_H_INCLUDED__
>
> -#include <stdbool.h>
> -
>  #include "isp_global.h"
>
>  #include <isp2400_support.h>
> diff --git a/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h b/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h
> index 540b405cc0f7..a3c7f3de6d17 100644
> --- a/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h
> +++ b/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h
> @@ -16,7 +16,7 @@
>  #ifndef __PRINT_SUPPORT_H_INCLUDED__
>  #define __PRINT_SUPPORT_H_INCLUDED__
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  extern int (*sh_css_printf)(const char *fmt, va_list args);
>  /* depends on host supplied print function in ia_css_init() */
> diff --git a/drivers/staging/media/atomisp/pci/ia_css_env.h b/drivers/staging/media/atomisp/pci/ia_css_env.h
> index 6b38723b27cd..3b89bbd837a0 100644
> --- a/drivers/staging/media/atomisp/pci/ia_css_env.h
> +++ b/drivers/staging/media/atomisp/pci/ia_css_env.h
> @@ -17,7 +17,7 @@
>  #define __IA_CSS_ENV_H
>
>  #include <type_support.h>
> -#include <stdarg.h> /* va_list */
> +#include <linux/stdarg.h> /* va_list */
>  #include "ia_css_types.h"
>  #include "ia_css_acc_types.h"
>
> diff --git a/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h b/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h
> index 5e6e7447ae00..e37ef4232c55 100644
> --- a/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h
> +++ b/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h
> @@ -19,7 +19,7 @@
>  /*! \file */
>
>  #include <type_support.h>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include "ia_css_types.h"
>  #include "ia_css_binary.h"
>  #include "ia_css_frame_public.h"
> diff --git a/drivers/staging/media/atomisp/pci/sh_css_internal.h b/drivers/staging/media/atomisp/pci/sh_css_internal.h
> index 3c669ec79b68..496faa7297a5 100644
> --- a/drivers/staging/media/atomisp/pci/sh_css_internal.h
> +++ b/drivers/staging/media/atomisp/pci/sh_css_internal.h
> @@ -20,7 +20,7 @@
>  #include <math_support.h>
>  #include <type_support.h>
>  #include <platform_support.h>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #if !defined(ISP2401)
>  #include "input_formatter.h"
> diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
> index 61ce0d142eea..0c5e565aa8cf 100644
> --- a/drivers/xen/xen-scsiback.c
> +++ b/drivers/xen/xen-scsiback.c
> @@ -33,8 +33,6 @@
>
>  #define pr_fmt(fmt) "xen-pvscsi: " fmt
>
> -#include <stdarg.h>
> -
>  #include <linux/module.h>
>  #include <linux/utsname.h>
>  #include <linux/interrupt.h>
> diff --git a/fs/befs/debug.c b/fs/befs/debug.c
> index eb7bd6c692c7..02fa66fb82c2 100644
> --- a/fs/befs/debug.c
> +++ b/fs/befs/debug.c
> @@ -14,7 +14,7 @@
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>  #ifdef __KERNEL__
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/string.h>
>  #include <linux/spinlock.h>
>  #include <linux/kernel.h>
> diff --git a/fs/reiserfs/prints.c b/fs/reiserfs/prints.c
> index 500f2000eb41..30319dc33c18 100644
> --- a/fs/reiserfs/prints.c
> +++ b/fs/reiserfs/prints.c
> @@ -8,7 +8,7 @@
>  #include <linux/string.h>
>  #include <linux/buffer_head.h>
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  static char error_buf[1024];
>  static char fmt_buf[1024];
> diff --git a/fs/ufs/super.c b/fs/ufs/super.c
> index 74028b5a7b0a..00a01471ea05 100644
> --- a/fs/ufs/super.c
> +++ b/fs/ufs/super.c
> @@ -70,7 +70,7 @@
>  #include <linux/module.h>
>  #include <linux/bitops.h>
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #include <linux/uaccess.h>
>
> diff --git a/include/acpi/platform/acgcc.h b/include/acpi/platform/acgcc.h
> index f6656be81760..fb172a03a753 100644
> --- a/include/acpi/platform/acgcc.h
> +++ b/include/acpi/platform/acgcc.h
> @@ -22,7 +22,7 @@ typedef __builtin_va_list va_list;
>  #define va_arg(v, l) __builtin_va_arg(v, l)
>  #define va_copy(d, s) __builtin_va_copy(d, s)
>  #else
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #endif
>  #endif
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index 472f97074da0..45785fc231a8 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -5,8 +5,6 @@
>  #ifndef __LINUX_FILTER_H__
>  #define __LINUX_FILTER_H__
>
> -#include <stdarg.h>
> -
>  #include <linux/atomic.h>
>  #include <linux/refcount.h>
>  #include <linux/compat.h>
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 1b2f0a7e00d6..2776423a587e 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -2,7 +2,7 @@
>  #ifndef _LINUX_KERNEL_H
>  #define _LINUX_KERNEL_H
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/align.h>
>  #include <linux/limits.h>
>  #include <linux/linkage.h>
> diff --git a/include/linux/mISDNif.h b/include/linux/mISDNif.h
> index a7330eb3ec64..7dd1f01ec4f9 100644
> --- a/include/linux/mISDNif.h
> +++ b/include/linux/mISDNif.h
> @@ -18,7 +18,6 @@
>  #ifndef mISDNIF_H
>  #define mISDNIF_H
>
> -#include <stdarg.h>
>  #include <linux/types.h>
>  #include <linux/errno.h>
>  #include <linux/socket.h>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index e834d78f0478..9f3f29ea348e 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -2,7 +2,7 @@
>  #ifndef __KERNEL_PRINTK__
>  #define __KERNEL_PRINTK__
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/init.h>
>  #include <linux/kern_levels.h>
>  #include <linux/linkage.h>
> diff --git a/include/linux/stdarg.h b/include/linux/stdarg.h
> new file mode 100644
> index 000000000000..c8dc7f4f390c
> --- /dev/null
> +++ b/include/linux/stdarg.h
> @@ -0,0 +1,11 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#ifndef _LINUX_STDARG_H
> +#define _LINUX_STDARG_H
> +
> +typedef __builtin_va_list va_list;
> +#define va_start(v, l) __builtin_va_start(v, l)
> +#define va_end(v) __builtin_va_end(v)
> +#define va_arg(v, T) __builtin_va_arg(v, T)
> +#define va_copy(d, s) __builtin_va_copy(d, s)
> +
> +#endif
> diff --git a/include/linux/string.h b/include/linux/string.h
> index b48d2d28e0b1..5e96d656be7a 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -6,7 +6,7 @@
>  #include <linux/types.h> /* for size_t */
>  #include <linux/stddef.h> /* for NULL */
>  #include <linux/errno.h> /* for E2BIG */
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <uapi/linux/string.h>
>
>  extern char *strndup_user(const char __user *, long);
> diff --git a/kernel/debug/kdb/kdb_support.c b/kernel/debug/kdb/kdb_support.c
> index 9f50d22d68e6..4f9950678e7b 100644
> --- a/kernel/debug/kdb/kdb_support.c
> +++ b/kernel/debug/kdb/kdb_support.c
> @@ -10,7 +10,6 @@
>   * 03/02/13 added new 2.5 kallsyms <xavier.bru(a)bull.net>
>   */
>
> -#include <stdarg.h>
>  #include <linux/types.h>
>  #include <linux/sched.h>
>  #include <linux/mm.h>
> diff --git a/lib/debug_info.c b/lib/debug_info.c
> index 36daf753293c..cc4723c74af5 100644
> --- a/lib/debug_info.c
> +++ b/lib/debug_info.c
> @@ -5,8 +5,6 @@
>   * CONFIG_DEBUG_INFO_REDUCED. Please do not add actual code. However,
>   * adding appropriate #includes is fine.
>   */
> -#include <stdarg.h>
> -
>  #include <linux/cred.h>
>  #include <linux/crypto.h>
>  #include <linux/dcache.h>
> @@ -22,6 +20,7 @@
>  #include <linux/net.h>
>  #include <linux/sched.h>
>  #include <linux/slab.h>
> +#include <linux/stdarg.h>
>  #include <linux/types.h>
>  #include <net/addrconf.h>
>  #include <net/sock.h>
> diff --git a/lib/kasprintf.c b/lib/kasprintf.c
> index bacf7b83ccf0..cd2f5974ed98 100644
> --- a/lib/kasprintf.c
> +++ b/lib/kasprintf.c
> @@ -5,7 +5,7 @@
>   * Copyright (C) 1991, 1992 Linus Torvalds
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/export.h>
>  #include <linux/slab.h>
>  #include <linux/types.h>
> diff --git a/lib/kunit/string-stream.h b/lib/kunit/string-stream.h
> index 5e94b623454f..43f9508a55b4 100644
> --- a/lib/kunit/string-stream.h
> +++ b/lib/kunit/string-stream.h
> @@ -11,7 +11,7 @@
>
>  #include <linux/spinlock.h>
>  #include <linux/types.h>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  struct string_stream_fragment {
>  	struct kunit *test;
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 26c83943748a..3bcb7be03f93 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -17,7 +17,7 @@
>   * - scnprintf and vscnprintf
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>  #include <linux/build_bug.h>
>  #include <linux/clk.h>
>  #include <linux/clk-provider.h>
> diff --git a/mm/kfence/report.c b/mm/kfence/report.c
> index 2a319c21c939..4b891dd75650 100644
> --- a/mm/kfence/report.c
> +++ b/mm/kfence/report.c
> @@ -5,7 +5,7 @@
>   * Copyright (C) 2020, Google LLC.
>   */
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #include <linux/kernel.h>
>  #include <linux/lockdep.h>
> diff --git a/net/batman-adv/log.c b/net/batman-adv/log.c
> index f0e5d1429662..7a93a1e94c40 100644
> --- a/net/batman-adv/log.c
> +++ b/net/batman-adv/log.c
> @@ -7,7 +7,7 @@
>  #include "log.h"
>  #include "main.h"
>
> -#include <stdarg.h>
> +#include <linux/stdarg.h>
>
>  #include "trace.h"
>
> diff --git a/sound/aoa/codecs/onyx.h b/sound/aoa/codecs/onyx.h
> index 8a32c3c3d716..6c31b7373b78 100644
> --- a/sound/aoa/codecs/onyx.h
> +++ b/sound/aoa/codecs/onyx.h
> @@ -6,7 +6,6 @@
>   */
>  #ifndef __SND_AOA_CODEC_ONYX_H
>  #define __SND_AOA_CODEC_ONYX_H
> -#include <stddef.h>
>  #include <linux/i2c.h>
>  #include <asm/pmac_low_i2c.h>
>  #include <asm/prom.h>
> diff --git a/sound/aoa/codecs/tas.c b/sound/aoa/codecs/tas.c
> index ac246dd3ab49..ab19a37e2a68 100644
> --- a/sound/aoa/codecs/tas.c
> +++ b/sound/aoa/codecs/tas.c
> @@ -58,7 +58,6 @@
>   * and up to the hardware designer to not wire
>   * them up in some weird unusable way.
>   */
> -#include <stddef.h>
>  #include <linux/i2c.h>
>  #include <asm/pmac_low_i2c.h>
>  #include <asm/prom.h>
> diff --git a/sound/core/info.c b/sound/core/info.c
> index 9fec3070f8ba..a451b24199c3 100644
> --- a/sound/core/info.c
> +++ b/sound/core/info.c
> @@ -16,7 +16,6 @@
>  #include <linux/utsname.h>
>  #include <linux/proc_fs.h>
>  #include <linux/mutex.h>
> -#include <stdarg.h>
>
>  int snd_info_check_reserved_words(const char *str)
>  {
> </cut>
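
For reference, the new include/linux/stdarg.h in the quoted diff is one typedef and four macros over compiler builtins. Below is a minimal userspace sketch, assuming GCC or Clang, of how those five definitions behave on their own with no <stdarg.h> in sight. The helpers vsum_ints(), sum_ints() and sum_twice() are hypothetical and are not part of the patch.

```c
/* The same one typedef and four macros that the new include/linux/stdarg.h adds. */
typedef __builtin_va_list va_list;
#define va_start(v, l)	__builtin_va_start(v, l)
#define va_end(v)	__builtin_va_end(v)
#define va_arg(v, T)	__builtin_va_arg(v, T)
#define va_copy(d, s)	__builtin_va_copy(d, s)

/* Hypothetical helper (not in the patch): add up `count` int arguments. */
static int vsum_ints(int count, va_list ap)
{
	int total = 0;

	for (int i = 0; i < count; i++)
		total += va_arg(ap, int);
	return total;
}

/* Hypothetical helper: variadic front end for vsum_ints(). */
int sum_ints(int count, ...)
{
	va_list ap;
	int total;

	va_start(ap, count);
	total = vsum_ints(count, ap);
	va_end(ap);
	return total;
}

/* Hypothetical helper: walk the same arguments twice via va_copy(). */
int sum_twice(int count, ...)
{
	va_list ap, again;
	int total;

	va_start(ap, count);
	va_copy(again, ap);	/* independent cursor over the same arguments */
	total = vsum_ints(count, ap);
	total += vsum_ints(count, again);
	va_end(again);
	va_end(ap);
	return total;
}
```

With these definitions, sum_ints(3, 1, 2, 3) returns 6 and sum_twice(2, 10, 20) returns 60. Everything expands to compiler builtins, so nothing here depends on a header found through -isystem.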

4 years, 11 months

[CI-NOTIFY]: TCWG Bisect tcwg_kernel/llvm-release-arm-next-allmodconfig - Build # 29 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *linux* in CI configuration tcwg_kernel/llvm-release-arm-next-allmodconfig. So far, this commit has regressed CI configurations:
 - tcwg_kernel/llvm-release-arm-next-allmodconfig

Culprit:
<cut>
commit 6f4266a78a4e090b452a0335c1414f3240684743
Author: Alexey Dobriyan <adobriyan(a)gmail.com>
Date:   Fri Jul 16 13:44:03 2021 +1000

    kbuild: decouple build from userspace headers

    First, userspace headers can be under incompatible license.

    Second, kernel doesn't require userspace to operate and should not
    require anything from userspace to be built other than compiler.
    We would use -ffreestanding too if not builtin function shenanigans.

    To decouple:

    * ship minimal stdarg.h as <linux/stdarg.h>, 1 type, 4 macros

      GPL 2 version of <stdarg.h> can be extracted from
      http://archive.debian.org/debian/pool/main/g/gcc-4.2/gcc-4.2_4.2.4.orig.tar…

    * delete "-isystem" from command line arguments, this is what enables
      header leakage

    * fixup/delete include directives where necessary.

    Link: https://lkml.kernel.org/r/YO8ioz4sHwcUAkdt@localhost.localdomain
    Signed-off-by: Alexey Dobriyan <adobriyan(a)gmail.com>
    Cc: Arnd Bergmann <arnd(a)arndb.de>
    Cc: Masahiro Yamada <masahiroy(a)kernel.org>
    Cc: Christoph Hellwig <hch(a)infradead.org>
    Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
    Signed-off-by: Stephen Rothwell <sfr(a)canb.auug.org.au>
</cut>

Results regressed to (for first_bad == 6f4266a78a4e090b452a0335c1414f3240684743)
# reset_artifacts: -10
# build_abe binutils: -9
# build_llvm: -5
# build_abe qemu: -2
# linux_n_obj: 21684
# First few build errors in logs:
# 00:01:11 crypto/aegis128-neon-inner.c:11:10: fatal error: 'arm_neon.h' file not found
# 00:01:11 make[1]: *** [crypto/aegis128-neon-inner.o] Error 1
# 00:01:14 lib/raid6/recov_neon_inner.c:7:10: fatal error: 'arm_neon.h' file not found
# 00:01:14 make[2]: *** [lib/raid6/recov_neon_inner.o] Error 1
# 00:01:14 lib/raid6/neon1.c:27:10: fatal error: 'arm_neon.h' file not found
# 00:01:14 make[2]: *** [lib/raid6/neon1.o] Error 1
# 00:01:15 lib/raid6/neon2.c:27:10: fatal error: 'arm_neon.h' file not found
# 00:01:15 make[2]: *** [lib/raid6/neon2.o] Error 1
# 00:01:15 lib/raid6/neon4.c:27:10: fatal error: 'arm_neon.h' file not found
# 00:01:15 make[2]: *** [lib/raid6/neon4.o] Error 1

from (for last_good == d06391f28276b0ad28a59073bdb8eb10dc4ea495)
# reset_artifacts: -10
# build_abe binutils: -9
# build_llvm: -5
# build_abe qemu: -2
# linux_n_obj: 29753
# linux build successful: all

Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…

Configuration details:
rr[linux_url]="https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git"
rr[linux_branch]="e9338abf0e186336022293d2e454c106761f262b"

Reproduce builds:
<cut>
mkdir investigate-linux-6f4266a78a4e090b452a0335c1414f3240684743
cd investigate-linux-6f4266a78a4e090b452a0335c1414f3240684743

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all… --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude linux/ ./ ./bisect/baseline/

cd linux

# Reproduce first_bad build
git checkout --detach 6f4266a78a4e090b452a0335c1414f3240684743
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach d06391f28276b0ad28a59073bdb8eb10dc4ea495
../artifacts/test.sh

cd ..
</cut>

History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…

Artifacts: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…
Build log: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-release-arm-next-all…

Full commit (up to 1000 lines):
<cut>
commit 6f4266a78a4e090b452a0335c1414f3240684743
Author: Alexey Dobriyan <adobriyan(a)gmail.com>
Date:   Fri Jul 16 13:44:03 2021 +1000

    kbuild: decouple build from userspace headers

    First, userspace headers can be under incompatible license.

    Second, kernel doesn't require userspace to operate and should not
    require anything from userspace to be built other than compiler.
    We would use -ffreestanding too if not builtin function shenanigans.

    To decouple:

    * ship minimal stdarg.h as <linux/stdarg.h>, 1 type, 4 macros

      GPL 2 version of <stdarg.h> can be extracted from
      http://archive.debian.org/debian/pool/main/g/gcc-4.2/gcc-4.2_4.2.4.orig.tar…

    * delete "-isystem" from command line arguments, this is what enables
      header leakage

    * fixup/delete include directives where necessary.

    Link: https://lkml.kernel.org/r/YO8ioz4sHwcUAkdt@localhost.localdomain
    Signed-off-by: Alexey Dobriyan <adobriyan(a)gmail.com>
    Cc: Arnd Bergmann <arnd(a)arndb.de>
    Cc: Masahiro Yamada <masahiroy(a)kernel.org>
    Cc: Christoph Hellwig <hch(a)infradead.org>
    Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
    Signed-off-by: Stephen Rothwell <sfr(a)canb.auug.org.au>
---
 Makefile | 2 +-
 arch/arm/kernel/process.c | 2 --
 arch/arm/mach-bcm/bcm_kona_smc.c | 2 --
 arch/arm64/kernel/process.c | 3 ---
 arch/openrisc/kernel/process.c | 2 --
 arch/parisc/kernel/firmware.c | 2 +-
 arch/parisc/kernel/process.c | 3 ---
 arch/powerpc/kernel/prom.c | 1 -
 arch/powerpc/kernel/prom_init.c | 2 +-
 arch/powerpc/kernel/rtas.c | 2 +-
 arch/powerpc/kernel/udbg.c | 2 +-
 arch/s390/boot/pgm_check_info.c | 2 +-
 arch/sparc/kernel/process_32.c | 3 ---
 arch/sparc/kernel/process_64.c | 3 ---
 arch/um/include/shared/irq_user.h | 1 -
 arch/um/include/shared/os.h | 1 -
 arch/um/os-Linux/signal.c | 2 +-
 arch/um/os-Linux/util.c | 1 +
 arch/x86/boot/boot.h | 2 +-
 crypto/aegis128-neon-inner.c | 2 --
 drivers/block/xen-blkback/xenbus.c | 1 -
 drivers/firmware/efi/libstub/efi-stub-helper.c | 2 +-
 drivers/firmware/efi/libstub/vsprintf.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dc_helper.c | 2 +-
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 1 -
 drivers/gpu/drm/drm_print.c | 2 +-
 drivers/gpu/drm/msm/disp/msm_disp_snapshot.h | 1 -
 drivers/isdn/capi/capiutil.c | 2 +-
 drivers/macintosh/macio-adb.c | 1 -
 drivers/macintosh/via-cuda.c | 2 +-
 drivers/macintosh/via-macii.c | 2 --
 drivers/macintosh/via-pmu.c | 2 +-
 drivers/net/wireless/intersil/orinoco/hermes.c | 1 -
 drivers/net/wwan/iosm/iosm_ipc_imem.h | 1 -
 drivers/pinctrl/aspeed/pinmux-aspeed.h | 1 -
 drivers/scsi/elx/efct/efct_driver.h | 1 -
 .../media/atomisp/pci/hive_isp_css_common/host/isp_local.h | 2 --
 .../media/atomisp/pci/hive_isp_css_include/print_support.h | 2 +-
 drivers/staging/media/atomisp/pci/ia_css_env.h | 2 +-
 .../media/atomisp/pci/runtime/debug/interface/ia_css_debug.h | 2 +-
 drivers/staging/media/atomisp/pci/sh_css_internal.h | 2 +-
 drivers/xen/xen-scsiback.c | 2 --
 fs/befs/debug.c | 2 +-
 fs/reiserfs/prints.c | 2 +-
 fs/ufs/super.c | 2 +-
 include/acpi/platform/acgcc.h | 2 +-
 include/linux/filter.h | 2 --
 include/linux/kernel.h | 2 +-
 include/linux/mISDNif.h | 1 -
 include/linux/printk.h | 2 +-
 include/linux/stdarg.h | 11 +++++++++++
 include/linux/string.h | 2 +-
 kernel/debug/kdb/kdb_support.c | 1 -
 lib/debug_info.c | 3 +--
 lib/kasprintf.c | 2 +-
 lib/kunit/string-stream.h | 2 +-
 lib/vsprintf.c | 2 +-
 mm/kfence/report.c | 2 +-
 net/batman-adv/log.c | 2 +-
 sound/aoa/codecs/onyx.h | 1 -
 sound/aoa/codecs/tas.c | 1 -
 sound/core/info.c | 1 -
 62 files changed, 44 insertions(+), 77 deletions(-)

diff --git a/Makefile b/Makefile
index eaa692976851..02591a333d86 100644
--- a/Makefile
+++ b/Makefile
@@ -972,7 +972,7 @@ KBUILD_CFLAGS += -falign-functions=64
 endif

 # arch Makefile may override CC so keep this after arch Makefile is included
-NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
+NOSTDINC_FLAGS += -nostdinc

 # warn about C99 declaration after statement
 KBUILD_CFLAGS += -Wdeclaration-after-statement
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index fc9e8b37eaa8..bb5ad8a6a4c3 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -5,8 +5,6 @@
  * Copyright (C) 1996-2000 Russell King - Converted to ARM.
  * Original Copyright (C) 1995 Linus Torvalds
  */
-#include <stdarg.h>
-
 #include <linux/export.h>
 #include <linux/sched.h>
 #include <linux/sched/debug.h>
diff --git a/arch/arm/mach-bcm/bcm_kona_smc.c b/arch/arm/mach-bcm/bcm_kona_smc.c
index 43a16f922b53..43829e49ad93 100644
--- a/arch/arm/mach-bcm/bcm_kona_smc.c
+++ b/arch/arm/mach-bcm/bcm_kona_smc.c
@@ -10,8 +10,6 @@
  * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  * GNU General Public License for more details.
  */
-
-#include <stdarg.h>
 #include <linux/smp.h>
 #include <linux/io.h>
 #include <linux/ioport.h>
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index c8989b999250..5f7ac9a0f9a3 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -6,9 +6,6 @@
  * Copyright (C) 1996-2000 Russell King - Converted to ARM.
  * Copyright (C) 2012 ARM Ltd.
  */
-
-#include <stdarg.h>
-
 #include <linux/compat.h>
 #include <linux/efi.h>
 #include <linux/elf.h>
diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c
index eb62429681fc..b0698d9ce14f 100644
--- a/arch/openrisc/kernel/process.c
+++ b/arch/openrisc/kernel/process.c
@@ -14,8 +14,6 @@
  */

 #define __KERNEL_SYSCALLS__
-#include <stdarg.h>
-
 #include <linux/errno.h>
 #include <linux/sched.h>
 #include <linux/sched/debug.h>
diff --git a/arch/parisc/kernel/firmware.c b/arch/parisc/kernel/firmware.c
index 665b70086685..7034227dbdf3 100644
--- a/arch/parisc/kernel/firmware.c
+++ b/arch/parisc/kernel/firmware.c
@@ -51,7 +51,7 @@
  * prumpf 991016
  */

-#include <stdarg.h>
+#include <linux/stdarg.h>

 #include <linux/delay.h>
 #include <linux/init.h>
diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index 184ec3c1eae4..38ec4ae81239 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -17,9 +17,6 @@
  * Copyright (C) 2001-2014 Helge Deller <deller(a)gmx.de>
  * Copyright (C) 2002 Randolph Chung <tausq with parisc-linux.org>
  */
-
-#include <stdarg.h>
-
 #include <linux/elf.h>
 #include <linux/errno.h>
 #include <linux/kernel.h>
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index f620e04dc9bf..a1e7ba0fad09 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -11,7 +11,6 @@

 #undef DEBUG

-#include <stdarg.h>
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/init.h>
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index a5bf355ce1d6..10664633f7e3 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -14,7 +14,7 @@
 /* we cannot use FORTIFY as it brings in new symbols */
 #define __NO_FORTIFY

-#include <stdarg.h>
+#include <linux/stdarg.h>
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/init.h>
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 99f2cce635fb..ff80bbad22a5 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -7,7 +7,7 @@
  * Copyright (C) 2001 IBM.
  */

-#include <stdarg.h>
+#include <linux/stdarg.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/spinlock.h>
diff --git a/arch/powerpc/kernel/udbg.c b/arch/powerpc/kernel/udbg.c
index 01595e8cafe7..b1544b2f6321 100644
--- a/arch/powerpc/kernel/udbg.c
+++ b/arch/powerpc/kernel/udbg.c
@@ -5,7 +5,7 @@
  * c 2001 PPC 64 Team, IBM Corp
  */

-#include <stdarg.h>
+#include <linux/stdarg.h>
 #include <linux/types.h>
 #include <linux/sched.h>
 #include <linux/console.h>
diff --git a/arch/s390/boot/pgm_check_info.c b/arch/s390/boot/pgm_check_info.c
index 3a46abed2549..b7d8dd88bbf2 100644
--- a/arch/s390/boot/pgm_check_info.c
+++ b/arch/s390/boot/pgm_check_info.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/kernel.h>
+#include <linux/stdarg.h>
 #include <linux/string.h>
 #include <linux/ctype.h>
 #include <asm/stacktrace.h>
@@ -8,7 +9,6 @@
 #include <asm/setup.h>
 #include <asm/sclp.h>
 #include <asm/uv.h>
-#include <stdarg.h>
 #include "boot.h"

 const char hex_asc[]
= "0123456789abcdef"; diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c index 93983d6d431d..bbbe0cfef746 100644 --- a/arch/sparc/kernel/process_32.c +++ b/arch/sparc/kernel/process_32.c @@ -8,9 +8,6 @@ /* * This file handles the architecture-dependent parts of process handling.. */ - -#include <stdarg.h> - #include <linux/elfcore.h> #include <linux/errno.h> #include <linux/module.h> diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c index d33c58a58d4f..0cabcdfb23fd 100644 --- a/arch/sparc/kernel/process_64.c +++ b/arch/sparc/kernel/process_64.c @@ -9,9 +9,6 @@ /* * This file handles the architecture-dependent parts of process handling.. */ - -#include <stdarg.h> - #include <linux/errno.h> #include <linux/export.h> #include <linux/sched.h> diff --git a/arch/um/include/shared/irq_user.h b/arch/um/include/shared/irq_user.h index 065829f443ae..86a8a573b65c 100644 --- a/arch/um/include/shared/irq_user.h +++ b/arch/um/include/shared/irq_user.h @@ -7,7 +7,6 @@ #define __IRQ_USER_H__ #include <sysdep/ptrace.h> -#include <stdbool.h> enum um_irq_type { IRQ_READ, diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h index 60b84edc8a68..96d400387c93 100644 --- a/arch/um/include/shared/os.h +++ b/arch/um/include/shared/os.h @@ -8,7 +8,6 @@ #ifndef __OS_H__ #define __OS_H__ -#include <stdarg.h> #include <irq_user.h> #include <longjmp.h> #include <mm_id.h> diff --git a/arch/um/os-Linux/signal.c b/arch/um/os-Linux/signal.c index 6de99bb16113..6cf098c23a39 100644 --- a/arch/um/os-Linux/signal.c +++ b/arch/um/os-Linux/signal.c @@ -67,7 +67,7 @@ int signals_enabled; #ifdef UML_CONFIG_UML_TIME_TRAVEL_SUPPORT static int signals_blocked; #else -#define signals_blocked false +#define signals_blocked 0 #endif static unsigned int signals_pending; static unsigned int signals_active = 0; diff --git a/arch/um/os-Linux/util.c b/arch/um/os-Linux/util.c index 07327425d06e..41297ec404bf 100644 --- a/arch/um/os-Linux/util.c +++ 
b/arch/um/os-Linux/util.c @@ -3,6 +3,7 @@ * Copyright (C) 2000 - 2007 Jeff Dike (jdike(a){addtoit,linux.intel}.com) */ +#include <stdarg.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h index ca866f1cca2e..34c9dbb6a47d 100644 --- a/arch/x86/boot/boot.h +++ b/arch/x86/boot/boot.h @@ -18,7 +18,7 @@ #ifndef __ASSEMBLY__ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/types.h> #include <linux/edd.h> #include <asm/setup.h> diff --git a/crypto/aegis128-neon-inner.c b/crypto/aegis128-neon-inner.c index 7de485907d81..6b37fbb79884 100644 --- a/crypto/aegis128-neon-inner.c +++ b/crypto/aegis128-neon-inner.c @@ -15,8 +15,6 @@ #define AEGIS_BLOCK_SIZE 16 -#include <stddef.h> - extern int aegis128_have_aes_insn; void *memcpy(void *dest, const void *src, size_t n); diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 125b22205d38..33eba3df4dd9 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -8,7 +8,6 @@ #define pr_fmt(fmt) "xen-blkback: " fmt -#include <stdarg.h> #include <linux/module.h> #include <linux/kthread.h> #include <xen/events.h> diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c index aa8da0a49829..4921580e1725 100644 --- a/drivers/firmware/efi/libstub/efi-stub-helper.c +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c @@ -7,7 +7,7 @@ * Copyright 2011 Intel Corporation; author Matt Fleming */ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/ctype.h> #include <linux/efi.h> diff --git a/drivers/firmware/efi/libstub/vsprintf.c b/drivers/firmware/efi/libstub/vsprintf.c index 1088e288c04d..71c71c222346 100644 --- a/drivers/firmware/efi/libstub/vsprintf.c +++ b/drivers/firmware/efi/libstub/vsprintf.c @@ -10,7 +10,7 @@ * Oh, it's a waste of space, but oh-so-yummy for debugging. 
*/ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/compiler.h> #include <linux/ctype.h> diff --git a/drivers/gpu/drm/amd/display/dc/dc_helper.c b/drivers/gpu/drm/amd/display/dc/dc_helper.c index a612ba6dc389..ab6bc5d79012 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_helper.c +++ b/drivers/gpu/drm/amd/display/dc/dc_helper.c @@ -28,9 +28,9 @@ */ #include <linux/delay.h> +#include <linux/stdarg.h> #include "dm_services.h" -#include <stdarg.h> #include "dc.h" #include "dc_dmub_srv.h" diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h index 7c4734f905d9..68fd451aca23 100644 --- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h +++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h @@ -39,7 +39,6 @@ #include <linux/types.h> #include <linux/string.h> #include <linux/delay.h> -#include <stdarg.h> #include "atomfirmware.h" diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c index 111b932cf2a9..f783d4963d4b 100644 --- a/drivers/gpu/drm/drm_print.c +++ b/drivers/gpu/drm/drm_print.c @@ -25,7 +25,7 @@ #define DEBUG /* for pr_debug() */ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/io.h> #include <linux/moduleparam.h> diff --git a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h index c92a9508c8d3..0f9a5364cd86 100644 --- a/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h +++ b/drivers/gpu/drm/msm/disp/msm_disp_snapshot.h @@ -25,7 +25,6 @@ #include <linux/pm_runtime.h> #include <linux/kthread.h> #include <linux/devcoredump.h> -#include <stdarg.h> #include "msm_kms.h" #define MSM_DISP_SNAPSHOT_MAX_BLKS 10 diff --git a/drivers/isdn/capi/capiutil.c b/drivers/isdn/capi/capiutil.c index f26bf3c66d7e..d7ae42edc4a8 100644 --- a/drivers/isdn/capi/capiutil.c +++ b/drivers/isdn/capi/capiutil.c @@ -379,7 +379,7 @@ static char *pnames[] = /*2f */ "Useruserdata" }; -#include <stdarg.h> +#include <linux/stdarg.h> 
/*-------------------------------------------------------*/ static _cdebbuf *bufprint(_cdebbuf *cdb, char *fmt, ...) diff --git a/drivers/macintosh/macio-adb.c b/drivers/macintosh/macio-adb.c index d4759db002c6..dc634c2932fd 100644 --- a/drivers/macintosh/macio-adb.c +++ b/drivers/macintosh/macio-adb.c @@ -2,7 +2,6 @@ /* * Driver for the ADB controller in the Mac I/O (Hydra) chip. */ -#include <stdarg.h> #include <linux/types.h> #include <linux/errno.h> #include <linux/kernel.h> diff --git a/drivers/macintosh/via-cuda.c b/drivers/macintosh/via-cuda.c index 3581abfb0c6a..cd267392289c 100644 --- a/drivers/macintosh/via-cuda.c +++ b/drivers/macintosh/via-cuda.c @@ -9,7 +9,7 @@ * * Copyright (C) 1996 Paul Mackerras. */ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/types.h> #include <linux/errno.h> #include <linux/kernel.h> diff --git a/drivers/macintosh/via-macii.c b/drivers/macintosh/via-macii.c index 060e03f2264b..db9270da5b8e 100644 --- a/drivers/macintosh/via-macii.c +++ b/drivers/macintosh/via-macii.c @@ -23,8 +23,6 @@ * Apple's "ADB Analyzer" bus sniffer is invaluable: * ftp://ftp.apple.com/developer/Tool_Chest/Devices_-_Hardware/Apple_Desktop_B… */ - -#include <stdarg.h> #include <linux/types.h> #include <linux/errno.h> #include <linux/kernel.h> diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c index 4bdd4c45e7a7..4b98bc26a94b 100644 --- a/drivers/macintosh/via-pmu.c +++ b/drivers/macintosh/via-pmu.c @@ -18,7 +18,7 @@ * a sleep or a freq. 
switch * */ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/mutex.h> #include <linux/types.h> #include <linux/errno.h> diff --git a/drivers/net/wireless/intersil/orinoco/hermes.c b/drivers/net/wireless/intersil/orinoco/hermes.c index 6d4b7f64efcf..256946552742 100644 --- a/drivers/net/wireless/intersil/orinoco/hermes.c +++ b/drivers/net/wireless/intersil/orinoco/hermes.c @@ -79,7 +79,6 @@ #undef HERMES_DEBUG #ifdef HERMES_DEBUG -#include <stdarg.h> #define DEBUG(lvl, stuff...) if ((lvl) <= HERMES_DEBUG) DMSG(stuff) diff --git a/drivers/net/wwan/iosm/iosm_ipc_imem.h b/drivers/net/wwan/iosm/iosm_ipc_imem.h index 0d2f10e4cbc8..dc65b0712261 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_imem.h +++ b/drivers/net/wwan/iosm/iosm_ipc_imem.h @@ -7,7 +7,6 @@ #define IOSM_IPC_IMEM_H #include <linux/skbuff.h> -#include <stdbool.h> #include "iosm_ipc_mmio.h" #include "iosm_ipc_pcie.h" diff --git a/drivers/pinctrl/aspeed/pinmux-aspeed.h b/drivers/pinctrl/aspeed/pinmux-aspeed.h index b69ba6b360a2..4d7548686f39 100644 --- a/drivers/pinctrl/aspeed/pinmux-aspeed.h +++ b/drivers/pinctrl/aspeed/pinmux-aspeed.h @@ -5,7 +5,6 @@ #define ASPEED_PINMUX_H #include <linux/regmap.h> -#include <stdbool.h> /* * The ASPEED SoCs provide typically more than 200 pins for GPIO and other diff --git a/drivers/scsi/elx/efct/efct_driver.h b/drivers/scsi/elx/efct/efct_driver.h index dab8eac4f243..0e3c931db7c2 100644 --- a/drivers/scsi/elx/efct/efct_driver.h +++ b/drivers/scsi/elx/efct/efct_driver.h @@ -10,7 +10,6 @@ /*************************************************************************** * OS specific includes */ -#include <stdarg.h> #include <linux/module.h> #include <linux/debugfs.h> #include <linux/firmware.h> diff --git a/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h b/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h index eceeb5d160ad..4dbec4063b3d 100644 --- a/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h +++ 
b/drivers/staging/media/atomisp/pci/hive_isp_css_common/host/isp_local.h @@ -16,8 +16,6 @@ #ifndef __ISP_LOCAL_H_INCLUDED__ #define __ISP_LOCAL_H_INCLUDED__ -#include <stdbool.h> - #include "isp_global.h" #include <isp2400_support.h> diff --git a/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h b/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h index 540b405cc0f7..a3c7f3de6d17 100644 --- a/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h +++ b/drivers/staging/media/atomisp/pci/hive_isp_css_include/print_support.h @@ -16,7 +16,7 @@ #ifndef __PRINT_SUPPORT_H_INCLUDED__ #define __PRINT_SUPPORT_H_INCLUDED__ -#include <stdarg.h> +#include <linux/stdarg.h> extern int (*sh_css_printf)(const char *fmt, va_list args); /* depends on host supplied print function in ia_css_init() */ diff --git a/drivers/staging/media/atomisp/pci/ia_css_env.h b/drivers/staging/media/atomisp/pci/ia_css_env.h index 6b38723b27cd..3b89bbd837a0 100644 --- a/drivers/staging/media/atomisp/pci/ia_css_env.h +++ b/drivers/staging/media/atomisp/pci/ia_css_env.h @@ -17,7 +17,7 @@ #define __IA_CSS_ENV_H #include <type_support.h> -#include <stdarg.h> /* va_list */ +#include <linux/stdarg.h> /* va_list */ #include "ia_css_types.h" #include "ia_css_acc_types.h" diff --git a/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h b/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h index 5e6e7447ae00..e37ef4232c55 100644 --- a/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h +++ b/drivers/staging/media/atomisp/pci/runtime/debug/interface/ia_css_debug.h @@ -19,7 +19,7 @@ /*! 
\file */ #include <type_support.h> -#include <stdarg.h> +#include <linux/stdarg.h> #include "ia_css_types.h" #include "ia_css_binary.h" #include "ia_css_frame_public.h" diff --git a/drivers/staging/media/atomisp/pci/sh_css_internal.h b/drivers/staging/media/atomisp/pci/sh_css_internal.h index 3c669ec79b68..496faa7297a5 100644 --- a/drivers/staging/media/atomisp/pci/sh_css_internal.h +++ b/drivers/staging/media/atomisp/pci/sh_css_internal.h @@ -20,7 +20,7 @@ #include <math_support.h> #include <type_support.h> #include <platform_support.h> -#include <stdarg.h> +#include <linux/stdarg.h> #if !defined(ISP2401) #include "input_formatter.h" diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c index 61ce0d142eea..0c5e565aa8cf 100644 --- a/drivers/xen/xen-scsiback.c +++ b/drivers/xen/xen-scsiback.c @@ -33,8 +33,6 @@ #define pr_fmt(fmt) "xen-pvscsi: " fmt -#include <stdarg.h> - #include <linux/module.h> #include <linux/utsname.h> #include <linux/interrupt.h> diff --git a/fs/befs/debug.c b/fs/befs/debug.c index eb7bd6c692c7..02fa66fb82c2 100644 --- a/fs/befs/debug.c +++ b/fs/befs/debug.c @@ -14,7 +14,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #ifdef __KERNEL__ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/string.h> #include <linux/spinlock.h> #include <linux/kernel.h> diff --git a/fs/reiserfs/prints.c b/fs/reiserfs/prints.c index 500f2000eb41..30319dc33c18 100644 --- a/fs/reiserfs/prints.c +++ b/fs/reiserfs/prints.c @@ -8,7 +8,7 @@ #include <linux/string.h> #include <linux/buffer_head.h> -#include <stdarg.h> +#include <linux/stdarg.h> static char error_buf[1024]; static char fmt_buf[1024]; diff --git a/fs/ufs/super.c b/fs/ufs/super.c index 74028b5a7b0a..00a01471ea05 100644 --- a/fs/ufs/super.c +++ b/fs/ufs/super.c @@ -70,7 +70,7 @@ #include <linux/module.h> #include <linux/bitops.h> -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/uaccess.h> diff --git a/include/acpi/platform/acgcc.h b/include/acpi/platform/acgcc.h index 
f6656be81760..fb172a03a753 100644 --- a/include/acpi/platform/acgcc.h +++ b/include/acpi/platform/acgcc.h @@ -22,7 +22,7 @@ typedef __builtin_va_list va_list; #define va_arg(v, l) __builtin_va_arg(v, l) #define va_copy(d, s) __builtin_va_copy(d, s) #else -#include <stdarg.h> +#include <linux/stdarg.h> #endif #endif diff --git a/include/linux/filter.h b/include/linux/filter.h index 472f97074da0..45785fc231a8 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -5,8 +5,6 @@ #ifndef __LINUX_FILTER_H__ #define __LINUX_FILTER_H__ -#include <stdarg.h> - #include <linux/atomic.h> #include <linux/refcount.h> #include <linux/compat.h> diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 1b2f0a7e00d6..2776423a587e 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -2,7 +2,7 @@ #ifndef _LINUX_KERNEL_H #define _LINUX_KERNEL_H -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/align.h> #include <linux/limits.h> #include <linux/linkage.h> diff --git a/include/linux/mISDNif.h b/include/linux/mISDNif.h index a7330eb3ec64..7dd1f01ec4f9 100644 --- a/include/linux/mISDNif.h +++ b/include/linux/mISDNif.h @@ -18,7 +18,6 @@ #ifndef mISDNIF_H #define mISDNIF_H -#include <stdarg.h> #include <linux/types.h> #include <linux/errno.h> #include <linux/socket.h> diff --git a/include/linux/printk.h b/include/linux/printk.h index e834d78f0478..9f3f29ea348e 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -2,7 +2,7 @@ #ifndef __KERNEL_PRINTK__ #define __KERNEL_PRINTK__ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/init.h> #include <linux/kern_levels.h> #include <linux/linkage.h> diff --git a/include/linux/stdarg.h b/include/linux/stdarg.h new file mode 100644 index 000000000000..c8dc7f4f390c --- /dev/null +++ b/include/linux/stdarg.h @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +#ifndef _LINUX_STDARG_H +#define _LINUX_STDARG_H + +typedef __builtin_va_list va_list; +#define va_start(v, 
l) __builtin_va_start(v, l) +#define va_end(v) __builtin_va_end(v) +#define va_arg(v, T) __builtin_va_arg(v, T) +#define va_copy(d, s) __builtin_va_copy(d, s) + +#endif diff --git a/include/linux/string.h b/include/linux/string.h index b48d2d28e0b1..5e96d656be7a 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -6,7 +6,7 @@ #include <linux/types.h> /* for size_t */ #include <linux/stddef.h> /* for NULL */ #include <linux/errno.h> /* for E2BIG */ -#include <stdarg.h> +#include <linux/stdarg.h> #include <uapi/linux/string.h> extern char *strndup_user(const char __user *, long); diff --git a/kernel/debug/kdb/kdb_support.c b/kernel/debug/kdb/kdb_support.c index 9f50d22d68e6..4f9950678e7b 100644 --- a/kernel/debug/kdb/kdb_support.c +++ b/kernel/debug/kdb/kdb_support.c @@ -10,7 +10,6 @@ * 03/02/13 added new 2.5 kallsyms <xavier.bru(a)bull.net> */ -#include <stdarg.h> #include <linux/types.h> #include <linux/sched.h> #include <linux/mm.h> diff --git a/lib/debug_info.c b/lib/debug_info.c index 36daf753293c..cc4723c74af5 100644 --- a/lib/debug_info.c +++ b/lib/debug_info.c @@ -5,8 +5,6 @@ * CONFIG_DEBUG_INFO_REDUCED. Please do not add actual code. However, * adding appropriate #includes is fine. 
*/ -#include <stdarg.h> - #include <linux/cred.h> #include <linux/crypto.h> #include <linux/dcache.h> @@ -22,6 +20,7 @@ #include <linux/net.h> #include <linux/sched.h> #include <linux/slab.h> +#include <linux/stdarg.h> #include <linux/types.h> #include <net/addrconf.h> #include <net/sock.h> diff --git a/lib/kasprintf.c b/lib/kasprintf.c index bacf7b83ccf0..cd2f5974ed98 100644 --- a/lib/kasprintf.c +++ b/lib/kasprintf.c @@ -5,7 +5,7 @@ * Copyright (C) 1991, 1992 Linus Torvalds */ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/export.h> #include <linux/slab.h> #include <linux/types.h> diff --git a/lib/kunit/string-stream.h b/lib/kunit/string-stream.h index 5e94b623454f..43f9508a55b4 100644 --- a/lib/kunit/string-stream.h +++ b/lib/kunit/string-stream.h @@ -11,7 +11,7 @@ #include <linux/spinlock.h> #include <linux/types.h> -#include <stdarg.h> +#include <linux/stdarg.h> struct string_stream_fragment { struct kunit *test; diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 26c83943748a..3bcb7be03f93 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -17,7 +17,7 @@ * - scnprintf and vscnprintf */ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/build_bug.h> #include <linux/clk.h> #include <linux/clk-provider.h> diff --git a/mm/kfence/report.c b/mm/kfence/report.c index 2a319c21c939..4b891dd75650 100644 --- a/mm/kfence/report.c +++ b/mm/kfence/report.c @@ -5,7 +5,7 @@ * Copyright (C) 2020, Google LLC. 
*/ -#include <stdarg.h> +#include <linux/stdarg.h> #include <linux/kernel.h> #include <linux/lockdep.h> diff --git a/net/batman-adv/log.c b/net/batman-adv/log.c index f0e5d1429662..7a93a1e94c40 100644 --- a/net/batman-adv/log.c +++ b/net/batman-adv/log.c @@ -7,7 +7,7 @@ #include "log.h" #include "main.h" -#include <stdarg.h> +#include <linux/stdarg.h> #include "trace.h" diff --git a/sound/aoa/codecs/onyx.h b/sound/aoa/codecs/onyx.h index 8a32c3c3d716..6c31b7373b78 100644 --- a/sound/aoa/codecs/onyx.h +++ b/sound/aoa/codecs/onyx.h @@ -6,7 +6,6 @@ */ #ifndef __SND_AOA_CODEC_ONYX_H #define __SND_AOA_CODEC_ONYX_H -#include <stddef.h> #include <linux/i2c.h> #include <asm/pmac_low_i2c.h> #include <asm/prom.h> diff --git a/sound/aoa/codecs/tas.c b/sound/aoa/codecs/tas.c index ac246dd3ab49..ab19a37e2a68 100644 --- a/sound/aoa/codecs/tas.c +++ b/sound/aoa/codecs/tas.c @@ -58,7 +58,6 @@ * and up to the hardware designer to not wire * them up in some weird unusable way. */ -#include <stddef.h> #include <linux/i2c.h> #include <asm/pmac_low_i2c.h> #include <asm/prom.h> diff --git a/sound/core/info.c b/sound/core/info.c index 9fec3070f8ba..a451b24199c3 100644 --- a/sound/core/info.c +++ b/sound/core/info.c @@ -16,7 +16,6 @@ #include <linux/utsname.h> #include <linux/proc_fs.h> #include <linux/mutex.h> -#include <stdarg.h> int snd_info_check_reserved_words(const char *str) { </cut>
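For readers skimming the diff: the new include/linux/stdarg.h hunk defines va_list and its helper macros directly on top of compiler builtins, which appears to be what allows the Makefile hunk to drop the `-isystem` compiler include path. The following is a minimal userspace sketch of that idea, not part of the patch. The macro definitions are copied from the hunk above, while `DEMO_STDARG_H` and `sum_ints()` are hypothetical names introduced only for illustration. It assumes a GCC- or Clang-compatible compiler that provides the `__builtin_va_*` builtins.

```c
/*
 * Sketch only: the five definitions below mirror the new
 * include/linux/stdarg.h from the diff. DEMO_STDARG_H and sum_ints()
 * are hypothetical names, not part of the patch.
 */
#ifndef DEMO_STDARG_H
#define DEMO_STDARG_H

typedef __builtin_va_list va_list;
#define va_start(v, l)	__builtin_va_start(v, l)
#define va_end(v)	__builtin_va_end(v)
#define va_arg(v, T)	__builtin_va_arg(v, T)
#define va_copy(d, s)	__builtin_va_copy(d, s)

#endif

/* Sum `count` int arguments without including the compiler's <stdarg.h>. */
static int sum_ints(int count, ...)
{
	va_list args;
	int total = 0;
	int i;

	va_start(args, count);
	for (i = 0; i < count; i++)
		total += va_arg(args, int);	/* fetch the next int argument */
	va_end(args);

	return total;
}
```

Calling `sum_ints(3, 1, 2, 3)` walks the three trailing arguments using only the builtins, so no header from the compiler's own include directory is needed.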

4 years, 11 months
