After llvm commit de2fed61528a5584dc54c47f6754408597be24de Author: Philip Reames listmail@philipreames.com
[unroll] Keep unrolled iterations with initial iteration
the following benchmarks slowed down by more than 2%: - 464.h264ref slowed down by 6% from 10902 to 11518 perf samples - 464.h264ref:[.] FastFullPelBlockMotionSearch slowed down by 43% from 1494 to 2141 perf samples
Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI.
For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a...
Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: aarch64-linux-gnu - Compiler flags: -O3 - Hardware: NVidia TX1 4x Cortex-A57
This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain@lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports.
THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
This commit has regressed these CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3
First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a...
Reproduce builds: <cut> mkdir investigate-llvm-de2fed61528a5584dc54c47f6754408597be24de cd investigate-llvm-de2fed61528a5584dc54c47f6754408597be24de
# Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts
# Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-a... --fail chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/
cd llvm
# Reproduce first_bad build git checkout --detach de2fed61528a5584dc54c47f6754408597be24de ../artifacts/test.sh
# Reproduce last_good build git checkout --detach da25f968a90ad4560fc920a6d18fc2a0221d2750 ../artifacts/test.sh
cd .. </cut>
Full commit (up to 1000 lines): <cut> commit de2fed61528a5584dc54c47f6754408597be24de Author: Philip Reames listmail@philipreames.com Date: Fri Nov 12 11:35:28 2021 -0800
[unroll] Keep unrolled iterations with initial iteration
The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is that the new iterations are placed far from the initial iteration.
With unrolling, the general assumption is that the a) the loop is reasonable hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting. As such, placing Count-1 copies out of line is a fairly poor code placement choice. We'd much rather fall through into the hot (non-exiting) path. For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.
However, the real motivation for this change isn't performance. Its readability and human understanding. Having to jump around long distances in an IR file to trace an unrolled loop structure is error prone and tedious. --- llvm/lib/Transforms/Utils/LoopUnroll.cpp | 6 +- llvm/test/DebugInfo/unrolled-loop-remainder.ll | 86 +- .../Transforms/LoopUnroll/2011-08-08-PhiUpdate.ll | 66 +- .../Transforms/LoopUnroll/2011-08-09-PhiUpdate.ll | 24 +- .../LoopUnroll/AArch64/runtime-unroll-generic.ll | 4 +- .../LoopUnroll/AArch64/thresholdO3-cost-model.ll | 8 +- .../LoopUnroll/AArch64/unroll-upperbound.ll | 4 +- .../Transforms/LoopUnroll/ARM/loop-unrolling.ll | 4 +- .../test/Transforms/LoopUnroll/ARM/multi-blocks.ll | 230 +- llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll | 10 +- .../LoopUnroll/full-unroll-keep-first-exit.ll | 16 +- .../full-unroll-one-unpredictable-exit.ll | 16 +- llvm/test/Transforms/LoopUnroll/multiple-exits.ll | 8 +- llvm/test/Transforms/LoopUnroll/nonlatchcondbr.ll | 20 +- .../LoopUnroll/partial-unroll-non-latch-exit.ll | 14 +- .../partially-unroll-unconditional-latch.ll | 4 +- .../LoopUnroll/runtime-loop-at-most-two-exits.ll | 120 +- .../runtime-loop-multiexit-dom-verify.ll | 206 +- .../LoopUnroll/runtime-loop-multiple-exits.ll | 2560 ++++++++++---------- llvm/test/Transforms/LoopUnroll/runtime-loop5.ll | 34 +- .../LoopUnroll/runtime-multiexit-heuristic.ll | 122 +- .../LoopUnroll/runtime-small-upperbound.ll | 8 +- .../LoopUnroll/runtime-unroll-remainder.ll | 62 +- llvm/test/Transforms/LoopUnroll/scevunroll.ll | 48 +- .../Transforms/LoopUnroll/shifted-tripcount.ll | 4 +- ...er-exiting-with-phis-multiple-exiting-blocks.ll | 20 +- .../LoopUnroll/unroll-unconditional-latch.ll | 12 +- .../Transforms/LoopUnrollAndJam/unroll-and-jam.ll | 68 +- .../PhaseOrdering/AArch64/matrix-extract-insert.ll | 4 +- 29 files changed, 1896 insertions(+), 1892 deletions(-)
diff --git a/llvm/lib/Transforms/Utils/LoopUnroll.cpp b/llvm/lib/Transforms/Utils/LoopUnroll.cpp index ce463927fd50..b0c622b98d5e 100644 --- a/llvm/lib/Transforms/Utils/LoopUnroll.cpp +++ b/llvm/lib/Transforms/Utils/LoopUnroll.cpp @@ -514,6 +514,10 @@ LoopUnrollResult llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI, SmallVector<MDNode *, 6> LoopLocalNoAliasDeclScopes; identifyNoAliasScopesToClone(L->getBlocks(), LoopLocalNoAliasDeclScopes);
+ // We place the unrolled iterations immediately after the original loop + // latch. This is a reasonable default placement if we don't have block + // frequencies, and if we do, well the layout will be adjusted later. + auto BlockInsertPt = std::next(LatchBlock->getIterator()); for (unsigned It = 1; It != ULO.Count; ++It) { SmallVector<BasicBlock *, 8> NewBlocks; SmallDenseMap<const Loop *, Loop *, 4> NewLoops; @@ -522,7 +526,7 @@ LoopUnrollResult llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI, for (LoopBlocksDFS::RPOIterator BB = BlockBegin; BB != BlockEnd; ++BB) { ValueToValueMapTy VMap; BasicBlock *New = CloneBasicBlock(*BB, VMap, "." + Twine(It)); - Header->getParent()->getBasicBlockList().push_back(New); + Header->getParent()->getBasicBlockList().insert(BlockInsertPt, New);
assert((*BB != Header || LI->getLoopFor(*BB) == L) && "Header should not be in a sub-loop"); diff --git a/llvm/test/DebugInfo/unrolled-loop-remainder.ll b/llvm/test/DebugInfo/unrolled-loop-remainder.ll index 83c30dec780d..ba4ce1f409f6 100644 --- a/llvm/test/DebugInfo/unrolled-loop-remainder.ll +++ b/llvm/test/DebugInfo/unrolled-loop-remainder.ll @@ -38,71 +38,71 @@ define i32 @func_c() local_unnamed_addr #0 !dbg !14 { ; CHECK-NEXT: [[PROL_ITER_SUB:%.*]] = sub i32 [[XTRAITER]], 1, !dbg [[DBG24]] ; CHECK-NEXT: [[PROL_ITER_CMP:%.*]] = icmp ne i32 [[PROL_ITER_SUB]], 0, !dbg [[DBG24]] ; CHECK-NEXT: br i1 [[PROL_ITER_CMP]], label [[FOR_BODY_PROL_1:%.*]], label [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA:%.*]], !dbg [[DBG24]] +; CHECK: for.body.prol.1: +; CHECK-NEXT: [[ARRAYIDX_PROL_1:%.*]] = getelementptr inbounds i32, i32* [[TMP6]], i64 1, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[ARRAYIDX_PROL_1]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] +; CHECK-NEXT: [[CONV_PROL_1:%.*]] = sext i32 [[TMP7]] to i64, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP8:%.*]] = inttoptr i64 [[CONV_PROL_1]] to i32*, !dbg [[DBG28]] +; CHECK-NEXT: [[ADD_PROL_1:%.*]] = add nsw i32 [[ADD_PROL]], 2, !dbg [[DBG29]] +; CHECK-NEXT: [[PROL_ITER_SUB_1:%.*]] = sub i32 [[PROL_ITER_SUB]], 1, !dbg [[DBG24]] +; CHECK-NEXT: [[PROL_ITER_CMP_1:%.*]] = icmp ne i32 [[PROL_ITER_SUB_1]], 0, !dbg [[DBG24]] +; CHECK-NEXT: br i1 [[PROL_ITER_CMP_1]], label [[FOR_BODY_PROL_2:%.*]], label [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA]], !dbg [[DBG24]] +; CHECK: for.body.prol.2: +; CHECK-NEXT: [[ARRAYIDX_PROL_2:%.*]] = getelementptr inbounds i32, i32* [[TMP8]], i64 1, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[ARRAYIDX_PROL_2]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] +; CHECK-NEXT: [[CONV_PROL_2:%.*]] = sext i32 [[TMP9]] to i64, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP10:%.*]] = inttoptr i64 [[CONV_PROL_2]] to i32*, !dbg [[DBG28]] +; CHECK-NEXT: [[ADD_PROL_2:%.*]] = add nsw i32 [[ADD_PROL_1]], 2, !dbg [[DBG29]] +; CHECK-NEXT: br label [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA]] ; CHECK: for.body.prol.loopexit.unr-lcssa: -; CHECK-NEXT: [[DOTLCSSA_UNR_PH:%.*]] = phi i32* [ [[TMP6]], [[FOR_BODY_PROL]] ], [ [[TMP20:%.*]], [[FOR_BODY_PROL_1]] ], [ [[TMP22:%.*]], [[FOR_BODY_PROL_2:%.*]] ] -; CHECK-NEXT: [[DOTUNR_PH:%.*]] = phi i32* [ [[TMP6]], [[FOR_BODY_PROL]] ], [ [[TMP20]], [[FOR_BODY_PROL_1]] ], [ [[TMP22]], [[FOR_BODY_PROL_2]] ] -; CHECK-NEXT: [[DOTUNR1_PH:%.*]] = phi i32 [ [[ADD_PROL]], [[FOR_BODY_PROL]] ], [ [[ADD_PROL_1:%.*]], [[FOR_BODY_PROL_1]] ], [ [[ADD_PROL_2:%.*]], [[FOR_BODY_PROL_2]] ] +; CHECK-NEXT: [[DOTLCSSA_UNR_PH:%.*]] = phi i32* [ [[TMP6]], [[FOR_BODY_PROL]] ], [ [[TMP8]], [[FOR_BODY_PROL_1]] ], [ [[TMP10]], [[FOR_BODY_PROL_2]] ] +; CHECK-NEXT: [[DOTUNR_PH:%.*]] = phi i32* [ [[TMP6]], [[FOR_BODY_PROL]] ], [ [[TMP8]], [[FOR_BODY_PROL_1]] ], [ [[TMP10]], [[FOR_BODY_PROL_2]] ] +; CHECK-NEXT: [[DOTUNR1_PH:%.*]] = phi i32 [ [[ADD_PROL]], [[FOR_BODY_PROL]] ], [ [[ADD_PROL_1]], [[FOR_BODY_PROL_1]] ], [ [[ADD_PROL_2]], [[FOR_BODY_PROL_2]] ] ; CHECK-NEXT: br label [[FOR_BODY_PROL_LOOPEXIT]], !dbg [[DBG24]] ; CHECK: for.body.prol.loopexit: ; CHECK-NEXT: [[DOTLCSSA_UNR:%.*]] = phi i32* [ undef, [[FOR_BODY_LR_PH]] ], [ [[DOTLCSSA_UNR_PH]], [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA]] ] ; CHECK-NEXT: [[DOTUNR:%.*]] = phi i32* [ [[A_PROMOTED]], [[FOR_BODY_LR_PH]] ], [ [[DOTUNR_PH]], [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA]] ] ; CHECK-NEXT: [[DOTUNR1:%.*]] = phi i32 [ [[DOTPR]], [[FOR_BODY_LR_PH]] ], [ [[DOTUNR1_PH]], [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA]] ] -; CHECK-NEXT: [[TMP7:%.*]] = icmp ult i32 [[TMP3]], 3, !dbg [[DBG24]] -; CHECK-NEXT: br i1 [[TMP7]], label [[FOR_COND_FOR_END_CRIT_EDGE:%.*]], label [[FOR_BODY_LR_PH_NEW:%.*]], !dbg [[DBG24]] +; CHECK-NEXT: [[TMP11:%.*]] = icmp ult i32 [[TMP3]], 3, !dbg [[DBG24]] +; CHECK-NEXT: br i1 [[TMP11]], label [[FOR_COND_FOR_END_CRIT_EDGE:%.*]], label [[FOR_BODY_LR_PH_NEW:%.*]], !dbg [[DBG24]] ; CHECK: for.body.lr.ph.new: ; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg [[DBG24]] ; CHECK: for.body: -; CHECK-NEXT: [[TMP8:%.*]] = phi i32* [ [[DOTUNR]], [[FOR_BODY_LR_PH_NEW]] ], [ [[TMP17:%.*]], [[FOR_BODY]] ], !dbg [[DBG28]] -; CHECK-NEXT: [[TMP9:%.*]] = phi i32 [ [[DOTUNR1]], [[FOR_BODY_LR_PH_NEW]] ], [ [[ADD_3:%.*]], [[FOR_BODY]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TMP8]], i64 1, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[ARRAYIDX]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] -; CHECK-NEXT: [[CONV:%.*]] = sext i32 [[TMP10]] to i64, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP11:%.*]] = inttoptr i64 [[CONV]] to i32*, !dbg [[DBG28]] -; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP9]], 2, !dbg [[DBG29]] -; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds i32, i32* [[TMP11]], i64 1, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP12:%.*]] = load i32, i32* [[ARRAYIDX_1]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] -; CHECK-NEXT: [[CONV_1:%.*]] = sext i32 [[TMP12]] to i64, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP13:%.*]] = inttoptr i64 [[CONV_1]] to i32*, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP12:%.*]] = phi i32* [ [[DOTUNR]], [[FOR_BODY_LR_PH_NEW]] ], [ [[TMP21:%.*]], [[FOR_BODY]] ], !dbg [[DBG28]] +; CHECK-NEXT: [[TMP13:%.*]] = phi i32 [ [[DOTUNR1]], [[FOR_BODY_LR_PH_NEW]] ], [ [[ADD_3:%.*]], [[FOR_BODY]] ] +; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TMP12]], i64 1, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP14:%.*]] = load i32, i32* [[ARRAYIDX]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] +; CHECK-NEXT: [[CONV:%.*]] = sext i32 [[TMP14]] to i64, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP15:%.*]] = inttoptr i64 [[CONV]] to i32*, !dbg [[DBG28]] +; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP13]], 2, !dbg [[DBG29]] +; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds i32, i32* [[TMP15]], i64 1, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP16:%.*]] = load i32, i32* [[ARRAYIDX_1]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] +; CHECK-NEXT: [[CONV_1:%.*]] = sext i32 [[TMP16]] to i64, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP17:%.*]] = inttoptr i64 [[CONV_1]] to i32*, !dbg [[DBG28]] ; CHECK-NEXT: [[ADD_1:%.*]] = add nsw i32 [[ADD]], 2, !dbg [[DBG29]] -; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds i32, i32* [[TMP13]], i64 1, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP14:%.*]] = load i32, i32* [[ARRAYIDX_2]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] -; CHECK-NEXT: [[CONV_2:%.*]] = sext i32 [[TMP14]] to i64, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP15:%.*]] = inttoptr i64 [[CONV_2]] to i32*, !dbg [[DBG28]] +; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i64 1, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP18:%.*]] = load i32, i32* [[ARRAYIDX_2]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] +; CHECK-NEXT: [[CONV_2:%.*]] = sext i32 [[TMP18]] to i64, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP19:%.*]] = inttoptr i64 [[CONV_2]] to i32*, !dbg [[DBG28]] ; CHECK-NEXT: [[ADD_2:%.*]] = add nsw i32 [[ADD_1]], 2, !dbg [[DBG29]] -; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds i32, i32* [[TMP15]], i64 1, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP16:%.*]] = load i32, i32* [[ARRAYIDX_3]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] -; CHECK-NEXT: [[CONV_3:%.*]] = sext i32 [[TMP16]] to i64, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP17]] = inttoptr i64 [[CONV_3]] to i32*, !dbg [[DBG28]] +; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds i32, i32* [[TMP19]], i64 1, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP20:%.*]] = load i32, i32* [[ARRAYIDX_3]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] +; CHECK-NEXT: [[CONV_3:%.*]] = sext i32 [[TMP20]] to i64, !dbg [[DBG28]] +; CHECK-NEXT: [[TMP21]] = inttoptr i64 [[CONV_3]] to i32*, !dbg [[DBG28]] ; CHECK-NEXT: [[ADD_3]] = add nsw i32 [[ADD_2]], 2, !dbg [[DBG29]] ; CHECK-NEXT: [[TOBOOL_3:%.*]] = icmp eq i32 [[ADD_3]], 0, !dbg [[DBG24]] ; CHECK-NEXT: br i1 [[TOBOOL_3]], label [[FOR_COND_FOR_END_CRIT_EDGE_UNR_LCSSA:%.*]], label [[FOR_BODY]], !dbg [[DBG24]], !llvm.loop [[LOOP30:![0-9]+]] ; CHECK: for.cond.for.end_crit_edge.unr-lcssa: -; CHECK-NEXT: [[DOTLCSSA_PH:%.*]] = phi i32* [ [[TMP17]], [[FOR_BODY]] ] +; CHECK-NEXT: [[DOTLCSSA_PH:%.*]] = phi i32* [ [[TMP21]], [[FOR_BODY]] ] ; CHECK-NEXT: br label [[FOR_COND_FOR_END_CRIT_EDGE]], !dbg [[DBG24]] ; CHECK: for.cond.for.end_crit_edge: ; CHECK-NEXT: [[DOTLCSSA:%.*]] = phi i32* [ [[DOTLCSSA_UNR]], [[FOR_BODY_PROL_LOOPEXIT]] ], [ [[DOTLCSSA_PH]], [[FOR_COND_FOR_END_CRIT_EDGE_UNR_LCSSA]] ], !dbg [[DBG28]] -; CHECK-NEXT: [[TMP18:%.*]] = add i32 [[TMP2]], 2, !dbg [[DBG24]] +; CHECK-NEXT: [[TMP22:%.*]] = add i32 [[TMP2]], 2, !dbg [[DBG24]] ; CHECK-NEXT: store i32* [[DOTLCSSA]], i32** @a, align 8, !dbg [[DBG25]], !tbaa [[TBAA26]] -; CHECK-NEXT: store i32 [[TMP18]], i32* @b, align 4, !dbg [[DBG33:![0-9]+]], !tbaa [[TBAA20]] +; CHECK-NEXT: store i32 [[TMP22]], i32* @b, align 4, !dbg [[DBG33:![0-9]+]], !tbaa [[TBAA20]] ; CHECK-NEXT: br label [[FOR_END]], !dbg [[DBG24]] ; CHECK: for.end: ; CHECK-NEXT: ret i32 undef, !dbg [[DBG34:![0-9]+]] -; CHECK: for.body.prol.1: -; CHECK-NEXT: [[ARRAYIDX_PROL_1:%.*]] = getelementptr inbounds i32, i32* [[TMP6]], i64 1, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP19:%.*]] = load i32, i32* [[ARRAYIDX_PROL_1]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] -; CHECK-NEXT: [[CONV_PROL_1:%.*]] = sext i32 [[TMP19]] to i64, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP20]] = inttoptr i64 [[CONV_PROL_1]] to i32*, !dbg [[DBG28]] -; CHECK-NEXT: [[ADD_PROL_1]] = add nsw i32 [[ADD_PROL]], 2, !dbg [[DBG29]] -; CHECK-NEXT: [[PROL_ITER_SUB_1:%.*]] = sub i32 [[PROL_ITER_SUB]], 1, !dbg [[DBG24]] -; CHECK-NEXT: [[PROL_ITER_CMP_1:%.*]] = icmp ne i32 [[PROL_ITER_SUB_1]], 0, !dbg [[DBG24]] -; CHECK-NEXT: br i1 [[PROL_ITER_CMP_1]], label [[FOR_BODY_PROL_2]], label [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA]], !dbg [[DBG24]] -; CHECK: for.body.prol.2: -; CHECK-NEXT: [[ARRAYIDX_PROL_2:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i64 1, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP21:%.*]] = load i32, i32* [[ARRAYIDX_PROL_2]], align 4, !dbg [[DBG28]], !tbaa [[TBAA20]] -; CHECK-NEXT: [[CONV_PROL_2:%.*]] = sext i32 [[TMP21]] to i64, !dbg [[DBG28]] -; CHECK-NEXT: [[TMP22]] = inttoptr i64 [[CONV_PROL_2]] to i32*, !dbg [[DBG28]] -; CHECK-NEXT: [[ADD_PROL_2]] = add nsw i32 [[ADD_PROL_1]], 2, !dbg [[DBG29]] -; CHECK-NEXT: br label [[FOR_BODY_PROL_LOOPEXIT_UNR_LCSSA]] ; entry: %.pr = load i32, i32* @b, align 4, !dbg !17, !tbaa !20 diff --git a/llvm/test/Transforms/LoopUnroll/2011-08-08-PhiUpdate.ll b/llvm/test/Transforms/LoopUnroll/2011-08-08-PhiUpdate.ll index 3e611430d69e..7bb2d732195a 100644 --- a/llvm/test/Transforms/LoopUnroll/2011-08-08-PhiUpdate.ll +++ b/llvm/test/Transforms/LoopUnroll/2011-08-08-PhiUpdate.ll @@ -17,24 +17,24 @@ define void @test1(i32 %i, i32 %j) nounwind uwtable ssp { ; CHECK-NEXT: [[SUB5:%.*]] = sub i32 [[SUB]], [[J:%.*]] ; CHECK-NEXT: [[COND2:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND2]], label [[IF_THEN_LOOPEXIT:%.*]], label [[IF_ELSE_1:%.*]] -; CHECK: if.then.loopexit: -; CHECK-NEXT: [[SUB5_LCSSA:%.*]] = phi i32 [ [[SUB5]], [[IF_ELSE]] ], [ [[SUB5_1:%.*]], [[IF_ELSE_1]] ], [ [[SUB5_2:%.*]], [[IF_ELSE_2:%.*]] ], [ [[SUB5_3]], [[IF_ELSE_3]] ] -; CHECK-NEXT: br label [[IF_THEN]] -; CHECK: if.then: -; CHECK-NEXT: [[I_TR:%.*]] = phi i32 [ [[I]], [[ENTRY:%.*]] ], [ [[SUB5_LCSSA]], [[IF_THEN_LOOPEXIT]] ] -; CHECK-NEXT: ret void ; CHECK: if.else.1: -; CHECK-NEXT: [[SUB5_1]] = sub i32 [[SUB5]], [[J]] +; CHECK-NEXT: [[SUB5_1:%.*]] = sub i32 [[SUB5]], [[J]] ; CHECK-NEXT: [[COND2_1:%.*]] = call zeroext i1 @check() -; CHECK-NEXT: br i1 [[COND2_1]], label [[IF_THEN_LOOPEXIT]], label [[IF_ELSE_2]] +; CHECK-NEXT: br i1 [[COND2_1]], label [[IF_THEN_LOOPEXIT]], label [[IF_ELSE_2:%.*]] ; CHECK: if.else.2: -; CHECK-NEXT: [[SUB5_2]] = sub i32 [[SUB5_1]], [[J]] +; CHECK-NEXT: [[SUB5_2:%.*]] = sub i32 [[SUB5_1]], [[J]] ; CHECK-NEXT: [[COND2_2:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND2_2]], label [[IF_THEN_LOOPEXIT]], label [[IF_ELSE_3]] ; CHECK: if.else.3: ; CHECK-NEXT: [[SUB5_3]] = sub i32 [[SUB5_2]], [[J]] ; CHECK-NEXT: [[COND2_3:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND2_3]], label [[IF_THEN_LOOPEXIT]], label [[IF_ELSE]], !llvm.loop [[LOOP0:![0-9]+]] +; CHECK: if.then.loopexit: +; CHECK-NEXT: [[SUB5_LCSSA:%.*]] = phi i32 [ [[SUB5]], [[IF_ELSE]] ], [ [[SUB5_1]], [[IF_ELSE_1]] ], [ [[SUB5_2]], [[IF_ELSE_2]] ], [ [[SUB5_3]], [[IF_ELSE_3]] ] +; CHECK-NEXT: br label [[IF_THEN]] +; CHECK: if.then: +; CHECK-NEXT: [[I_TR:%.*]] = phi i32 [ [[I]], [[ENTRY:%.*]] ], [ [[SUB5_LCSSA]], [[IF_THEN_LOOPEXIT]] ] +; CHECK-NEXT: ret void ; entry: %cond1 = call zeroext i1 @check() @@ -77,17 +77,11 @@ define i32 @test2(i32* nocapture %p, i32 %n) nounwind readonly { ; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw nsw i64 [[INDVAR]], 1 ; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i64 [[INDVAR_NEXT]], [[TMP]] ; CHECK-NEXT: br i1 [[EXITCOND]], label [[BB_1:%.*]], label [[BB1_BB2_CRIT_EDGE:%.*]] -; CHECK: bb1.bb2_crit_edge: -; CHECK-NEXT: [[DOTLCSSA:%.*]] = phi i32 [ [[TMP2]], [[BB1]] ], [ [[TMP4:%.*]], [[BB1_1:%.*]] ], [ [[TMP6:%.*]], [[BB1_2:%.*]] ], [ [[TMP8]], [[BB1_3]] ] -; CHECK-NEXT: br label [[BB2]] -; CHECK: bb2: -; CHECK-NEXT: [[S_0_LCSSA:%.*]] = phi i32 [ [[DOTLCSSA]], [[BB1_BB2_CRIT_EDGE]] ], [ 0, [[ENTRY:%.*]] ] -; CHECK-NEXT: ret i32 [[S_0_LCSSA]] ; CHECK: bb.1: ; CHECK-NEXT: [[SCEVGEP_1:%.*]] = getelementptr i32, i32* [[P]], i64 [[INDVAR_NEXT]] ; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[SCEVGEP_1]], align 1 -; CHECK-NEXT: [[TMP4]] = add nsw i32 [[TMP3]], [[TMP2]] -; CHECK-NEXT: br label [[BB1_1]] +; CHECK-NEXT: [[TMP4:%.*]] = add nsw i32 [[TMP3]], [[TMP2]] +; CHECK-NEXT: br label [[BB1_1:%.*]] ; CHECK: bb1.1: ; CHECK-NEXT: [[INDVAR_NEXT_1:%.*]] = add nuw nsw i64 [[INDVAR_NEXT]], 1 ; CHECK-NEXT: [[EXITCOND_1:%.*]] = icmp ne i64 [[INDVAR_NEXT_1]], [[TMP]] @@ -95,8 +89,8 @@ define i32 @test2(i32* nocapture %p, i32 %n) nounwind readonly { ; CHECK: bb.2: ; CHECK-NEXT: [[SCEVGEP_2:%.*]] = getelementptr i32, i32* [[P]], i64 [[INDVAR_NEXT_1]] ; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[SCEVGEP_2]], align 1 -; CHECK-NEXT: [[TMP6]] = add nsw i32 [[TMP5]], [[TMP4]] -; CHECK-NEXT: br label [[BB1_2]] +; CHECK-NEXT: [[TMP6:%.*]] = add nsw i32 [[TMP5]], [[TMP4]] +; CHECK-NEXT: br label [[BB1_2:%.*]] ; CHECK: bb1.2: ; CHECK-NEXT: [[INDVAR_NEXT_2:%.*]] = add nuw nsw i64 [[INDVAR_NEXT_1]], 1 ; CHECK-NEXT: [[EXITCOND_2:%.*]] = icmp ne i64 [[INDVAR_NEXT_2]], [[TMP]] @@ -110,6 +104,12 @@ define i32 @test2(i32* nocapture %p, i32 %n) nounwind readonly { ; CHECK-NEXT: [[INDVAR_NEXT_3]] = add i64 [[INDVAR_NEXT_2]], 1 ; CHECK-NEXT: [[EXITCOND_3:%.*]] = icmp ne i64 [[INDVAR_NEXT_3]], [[TMP]] ; CHECK-NEXT: br i1 [[EXITCOND_3]], label [[BB]], label [[BB1_BB2_CRIT_EDGE]], !llvm.loop [[LOOP2:![0-9]+]] +; CHECK: bb1.bb2_crit_edge: +; CHECK-NEXT: [[DOTLCSSA:%.*]] = phi i32 [ [[TMP2]], [[BB1]] ], [ [[TMP4]], [[BB1_1]] ], [ [[TMP6]], [[BB1_2]] ], [ [[TMP8]], [[BB1_3]] ] +; CHECK-NEXT: br label [[BB2]] +; CHECK: bb2: +; CHECK-NEXT: [[S_0_LCSSA:%.*]] = phi i32 [ [[DOTLCSSA]], [[BB1_BB2_CRIT_EDGE]] ], [ 0, [[ENTRY:%.*]] ] +; CHECK-NEXT: ret i32 [[S_0_LCSSA]] ; entry: %0 = icmp sgt i32 %n, 0 ; <i1> [#uses=1] @@ -162,20 +162,12 @@ define i32 @test3() nounwind uwtable ssp align 2 { ; CHECK: do.cond: ; CHECK-NEXT: [[COND3:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND3]], label [[DO_END:%.*]], label [[DO_BODY_1:%.*]] -; CHECK: do.end: -; CHECK-NEXT: br label [[RETURN]] -; CHECK: return.loopexit: -; CHECK-NEXT: [[TMP7_I_LCSSA:%.*]] = phi i32 [ [[TMP7_I]], [[LAND_LHS_TRUE]] ], [ [[TMP7_I_1:%.*]], [[LAND_LHS_TRUE_1:%.*]] ], [ [[TMP7_I_2:%.*]], [[LAND_LHS_TRUE_2:%.*]] ], [ [[TMP7_I_3:%.*]], [[LAND_LHS_TRUE_3:%.*]] ] -; CHECK-NEXT: br label [[RETURN]] -; CHECK: return: -; CHECK-NEXT: [[RETVAL_0:%.*]] = phi i32 [ 0, [[DO_END]] ], [ 0, [[ENTRY:%.*]] ], [ [[TMP7_I_LCSSA]], [[RETURN_LOOPEXIT]] ] -; CHECK-NEXT: ret i32 [[RETVAL_0]] ; CHECK: do.body.1: ; CHECK-NEXT: [[COND2_1:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND2_1]], label [[EXIT_1:%.*]], label [[DO_COND_1:%.*]] ; CHECK: exit.1: -; CHECK-NEXT: [[TMP7_I_1]] = load i32, i32* undef, align 8 -; CHECK-NEXT: br i1 undef, label [[DO_COND_1]], label [[LAND_LHS_TRUE_1]] +; CHECK-NEXT: [[TMP7_I_1:%.*]] = load i32, i32* undef, align 8 +; CHECK-NEXT: br i1 undef, label [[DO_COND_1]], label [[LAND_LHS_TRUE_1:%.*]] ; CHECK: land.lhs.true.1: ; CHECK-NEXT: br i1 true, label [[RETURN_LOOPEXIT]], label [[DO_COND_1]] ; CHECK: do.cond.1: @@ -185,8 +177,8 @@ define i32 @test3() nounwind uwtable ssp align 2 { ; CHECK-NEXT: [[COND2_2:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND2_2]], label [[EXIT_2:%.*]], label [[DO_COND_2:%.*]] ; CHECK: exit.2: -; CHECK-NEXT: [[TMP7_I_2]] = load i32, i32* undef, align 8 -; CHECK-NEXT: br i1 undef, label [[DO_COND_2]], label [[LAND_LHS_TRUE_2]] +; CHECK-NEXT: [[TMP7_I_2:%.*]] = load i32, i32* undef, align 8 +; CHECK-NEXT: br i1 undef, label [[DO_COND_2]], label [[LAND_LHS_TRUE_2:%.*]] ; CHECK: land.lhs.true.2: ; CHECK-NEXT: br i1 true, label [[RETURN_LOOPEXIT]], label [[DO_COND_2]] ; CHECK: do.cond.2: @@ -196,13 +188,21 @@ define i32 @test3() nounwind uwtable ssp align 2 { ; CHECK-NEXT: [[COND2_3:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND2_3]], label [[EXIT_3:%.*]], label [[DO_COND_3:%.*]] ; CHECK: exit.3: -; CHECK-NEXT: [[TMP7_I_3]] = load i32, i32* undef, align 8 -; CHECK-NEXT: br i1 undef, label [[DO_COND_3]], label [[LAND_LHS_TRUE_3]] +; CHECK-NEXT: [[TMP7_I_3:%.*]] = load i32, i32* undef, align 8 +; CHECK-NEXT: br i1 undef, label [[DO_COND_3]], label [[LAND_LHS_TRUE_3:%.*]] ; CHECK: land.lhs.true.3: ; CHECK-NEXT: br i1 true, label [[RETURN_LOOPEXIT]], label [[DO_COND_3]] ; CHECK: do.cond.3: ; CHECK-NEXT: [[COND3_3:%.*]] = call zeroext i1 @check() ; CHECK-NEXT: br i1 [[COND3_3]], label [[DO_END]], label [[DO_BODY]], !llvm.loop [[LOOP3:![0-9]+]] +; CHECK: do.end: +; CHECK-NEXT: br label [[RETURN]] +; CHECK: return.loopexit: +; CHECK-NEXT: [[TMP7_I_LCSSA:%.*]] = phi i32 [ [[TMP7_I]], [[LAND_LHS_TRUE]] ], [ [[TMP7_I_1]], [[LAND_LHS_TRUE_1]] ], [ [[TMP7_I_2]], [[LAND_LHS_TRUE_2]] ], [ [[TMP7_I_3]], [[LAND_LHS_TRUE_3]] ] +; CHECK-NEXT: br label [[RETURN]] +; CHECK: return: +; CHECK-NEXT: [[RETVAL_0:%.*]] = phi i32 [ 0, [[DO_END]] ], [ 0, [[ENTRY:%.*]] ], [ [[TMP7_I_LCSSA]], [[RETURN_LOOPEXIT]] ] +; CHECK-NEXT: ret i32 [[RETVAL_0]] ; entry: %cond1 = call zeroext i1 @check() diff --git a/llvm/test/Transforms/LoopUnroll/2011-08-09-PhiUpdate.ll b/llvm/test/Transforms/LoopUnroll/2011-08-09-PhiUpdate.ll index be4b6ff64fdd..af648bae8642 100644 --- a/llvm/test/Transforms/LoopUnroll/2011-08-09-PhiUpdate.ll +++ b/llvm/test/Transforms/LoopUnroll/2011-08-09-PhiUpdate.ll @@ -33,16 +33,13 @@ define i32 @foo() uwtable ssp align 2 { ; CHECK: do.cond: ; CHECK-NEXT: [[CMP18:%.*]] = icmp sgt i32 [[CALL2]], -1 ; CHECK-NEXT: br i1 [[CMP18]], label [[LAND_LHS_TRUE_I_1:%.*]], label [[RETURN]] -; CHECK: return: -; CHECK-NEXT: [[RETVAL_0:%.*]] = phi i32 [ [[TMP7_I]], [[LAND_LHS_TRUE]] ], [ 0, [[DO_COND]] ], [ [[TMP7_I_1:%.*]], [[LAND_LHS_TRUE_1:%.*]] ], [ 0, [[DO_COND_1:%.*]] ], [ [[TMP7_I_2:%.*]], [[LAND_LHS_TRUE_2:%.*]] ], [ 0, [[DO_COND_2:%.*]] ], [ [[TMP7_I_3:%.*]], [[LAND_LHS_TRUE_3:%.*]] ], [ 0, [[DO_COND_3:%.*]] ] -; CHECK-NEXT: ret i32 [[RETVAL_0]] ; CHECK: land.lhs.true.i.1: ; CHECK-NEXT: [[CMP4_I_1:%.*]] = call zeroext i1 @check() #[[ATTR0]] -; CHECK-NEXT: br i1 [[CMP4_I_1]], label [[BAR_EXIT_1:%.*]], label [[DO_COND_1]] +; CHECK-NEXT: br i1 [[CMP4_I_1]], label [[BAR_EXIT_1:%.*]], label [[DO_COND_1:%.*]] ; CHECK: bar.exit.1: -; CHECK-NEXT: [[TMP7_I_1]] = call i32 @getval() #[[ATTR0]] +; CHECK-NEXT: [[TMP7_I_1:%.*]] = call i32 @getval() #[[ATTR0]] ; CHECK-NEXT: [[CMP_NOT_1:%.*]] = icmp eq i32 [[TMP7_I_1]], 0 -; CHECK-NEXT: br i1 [[CMP_NOT_1]], label [[DO_COND_1]], label [[LAND_LHS_TRUE_1]] +; CHECK-NEXT: br i1 [[CMP_NOT_1]], label [[DO_COND_1]], label [[LAND_LHS_TRUE_1:%.*]] ; CHECK: land.lhs.true.1: ; CHECK-NEXT: [[CALL10_1:%.*]] = call i32 @getval() ; CHECK-NEXT: [[CMP11_1:%.*]] = icmp eq i32 [[CALL10_1]], 0 @@ -52,11 +49,11 @@ define i32 @foo() uwtable ssp align 2 { ; CHECK-NEXT: br i1 [[CMP18_1]], label [[LAND_LHS_TRUE_I_2:%.*]], label [[RETURN]] ; CHECK: land.lhs.true.i.2: ; CHECK-NEXT: [[CMP4_I_2:%.*]] = call zeroext i1 @check() #[[ATTR0]] -; CHECK-NEXT: br i1 [[CMP4_I_2]], label [[BAR_EXIT_2:%.*]], label [[DO_COND_2]] +; CHECK-NEXT: br i1 [[CMP4_I_2]], label [[BAR_EXIT_2:%.*]], label [[DO_COND_2:%.*]] ; CHECK: bar.exit.2: -; CHECK-NEXT: [[TMP7_I_2]] = call i32 @getval() #[[ATTR0]] +; CHECK-NEXT: [[TMP7_I_2:%.*]] = call i32 @getval() #[[ATTR0]] ; CHECK-NEXT: [[CMP_NOT_2:%.*]] = icmp eq i32 [[TMP7_I_2]], 0 -; CHECK-NEXT: br i1 [[CMP_NOT_2]], label [[DO_COND_2]], label [[LAND_LHS_TRUE_2]] +; CHECK-NEXT: br i1 [[CMP_NOT_2]], label [[DO_COND_2]], label [[LAND_LHS_TRUE_2:%.*]] ; CHECK: land.lhs.true.2: ; CHECK-NEXT: [[CALL10_2:%.*]] = call i32 @getval() ; CHECK-NEXT: [[CMP11_2:%.*]] = icmp eq i32 [[CALL10_2]], 0 @@ -66,11 +63,11 @@ define i32 @foo() uwtable ssp align 2 { ; CHECK-NEXT: br i1 [[CMP18_2]], label [[LAND_LHS_TRUE_I_3:%.*]], label [[RETURN]] ; CHECK: land.lhs.true.i.3: ; CHECK-NEXT: [[CMP4_I_3:%.*]] = call zeroext i1 @check() #[[ATTR0]] -; CHECK-NEXT: br i1 [[CMP4_I_3]], label [[BAR_EXIT_3:%.*]], label [[DO_COND_3]] +; CHECK-NEXT: br i1 [[CMP4_I_3]], label [[BAR_EXIT_3:%.*]], label [[DO_COND_3:%.*]] ; CHECK: bar.exit.3: -; CHECK-NEXT: [[TMP7_I_3]] = call i32 @getval() #[[ATTR0]] +; CHECK-NEXT: [[TMP7_I_3:%.*]] = call i32 @getval() #[[ATTR0]] ; CHECK-NEXT: [[CMP_NOT_3:%.*]] = icmp eq i32 [[TMP7_I_3]], 0 -; CHECK-NEXT: br i1 [[CMP_NOT_3]], label [[DO_COND_3]], label [[LAND_LHS_TRUE_3]] +; CHECK-NEXT: br i1 [[CMP_NOT_3]], label [[DO_COND_3]], label [[LAND_LHS_TRUE_3:%.*]] ; CHECK: land.lhs.true.3: ; CHECK-NEXT: [[CALL10_3:%.*]] = call i32 @getval() ; CHECK-NEXT: [[CMP11_3:%.*]] = icmp eq i32 [[CALL10_3]], 0 @@ -78,6 +75,9 @@ define i32 @foo() uwtable ssp align 2 { ; CHECK: do.cond.3: ; CHECK-NEXT: [[CMP18_3:%.*]] = icmp sgt i32 [[CALL2]], -1 ; CHECK-NEXT: br i1 [[CMP18_3]], label [[LAND_LHS_TRUE_I]], label [[RETURN]], !llvm.loop [[LOOP0:![0-9]+]] +; CHECK: return: +; CHECK-NEXT: [[RETVAL_0:%.*]] = phi i32 [ [[TMP7_I]], [[LAND_LHS_TRUE]] ], [ 0, [[DO_COND]] ], [ [[TMP7_I_1]], [[LAND_LHS_TRUE_1]] ], [ 0, [[DO_COND_1]] ], [ [[TMP7_I_2]], [[LAND_LHS_TRUE_2]] ], [ 0, [[DO_COND_2]] ], [ [[TMP7_I_3]], [[LAND_LHS_TRUE_3]] ], [ 0, [[DO_COND_3]] ] +; CHECK-NEXT: ret i32 [[RETVAL_0]] ; entry: br i1 undef, label %return, label %if.end diff --git a/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll b/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll index 5bbab929c936..5c8f9ca01679 100644 --- a/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll +++ b/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll @@ -67,8 +67,6 @@ define void @runtime_unroll_generic(i32 %arg_0, i32* %arg_1, i16* %arg_2, i16* % ; CHECK-A55-NEXT: store i32 [[ADD21_EPIL]], i32* [[ARRAYIDX20]], align 4 ; CHECK-A55-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i32 [[XTRAITER]], 1 ; CHECK-A55-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[FOR_END]], label [[FOR_BODY6_EPIL_1:%.*]] -; CHECK-A55: for.end: -; CHECK-A55-NEXT: ret void ; CHECK-A55: for.body6.epil.1: ; CHECK-A55-NEXT: [[TMP14:%.*]] = load i16, i16* [[ARRAYIDX10]], align 2 ; CHECK-A55-NEXT: [[CONV_EPIL_1:%.*]] = sext i16 [[TMP14]] to i32 @@ -90,6 +88,8 @@ define void @runtime_unroll_generic(i32 %arg_0, i32* %arg_1, i16* %arg_2, i16* % ; CHECK-A55-NEXT: [[ADD21_EPIL_2:%.*]] = add nsw i32 [[MUL16_EPIL_2]], [[TMP19]] ; CHECK-A55-NEXT: store i32 [[ADD21_EPIL_2]], i32* [[ARRAYIDX20]], align 4 ; CHECK-A55-NEXT: br label [[FOR_END]] +; CHECK-A55: for.end: +; CHECK-A55-NEXT: ret void ; ; CHECK-GENERIC-LABEL: @runtime_unroll_generic( ; CHECK-GENERIC-NEXT: entry: diff --git a/llvm/test/Transforms/LoopUnroll/AArch64/thresholdO3-cost-model.ll b/llvm/test/Transforms/LoopUnroll/AArch64/thresholdO3-cost-model.ll index ee07518f8cac..5c6ac690c0ca 100644 --- a/llvm/test/Transforms/LoopUnroll/AArch64/thresholdO3-cost-model.ll +++ b/llvm/test/Transforms/LoopUnroll/AArch64/thresholdO3-cost-model.ll @@ -21,10 +21,6 @@ define i32 @tripcount_11() { ; CHECK-NEXT: br label [[DO_BODY6:%.*]] ; CHECK: for.cond: ; CHECK-NEXT: br i1 true, label [[FOR_COND_1:%.*]], label [[IF_THEN11:%.*]] -; CHECK: do.body6: -; CHECK-NEXT: br i1 true, label [[FOR_COND:%.*]], label [[IF_THEN11]] -; CHECK: if.then11: -; CHECK-NEXT: unreachable ; CHECK: for.cond.1: ; CHECK-NEXT: br i1 true, label [[FOR_COND_2:%.*]], label [[IF_THEN11]] ; CHECK: for.cond.2: @@ -45,6 +41,10 @@ define i32 @tripcount_11() { ; CHECK-NEXT: br i1 true, label [[FOR_COND_10:%.*]], label [[IF_THEN11]] ; CHECK: for.cond.10: ; CHECK-NEXT: ret i32 0 +; CHECK: do.body6: +; CHECK-NEXT: br i1 true, label [[FOR_COND:%.*]], label [[IF_THEN11]] +; CHECK: if.then11: +; CHECK-NEXT: unreachable ; do.body6.preheader: br label %do.body6 diff --git a/llvm/test/Transforms/LoopUnroll/AArch64/unroll-upperbound.ll b/llvm/test/Transforms/LoopUnroll/AArch64/unroll-upperbound.ll index 3b82365d1a6e..ee905e5b10fe 100644 --- a/llvm/test/Transforms/LoopUnroll/AArch64/unroll-upperbound.ll +++ b/llvm/test/Transforms/LoopUnroll/AArch64/unroll-upperbound.ll @@ -18,8 +18,6 @@ define void @test(i1 %cond) { ; CHECK-NEXT: br label [[LATCH]] ; CHECK: latch: ; CHECK-NEXT: br i1 false, label [[FOR_END:%.*]], label [[FOR_BODY_1:%.*]] -; CHECK: for.end: -; CHECK-NEXT: ret void ; CHECK: for.body.1: ; CHECK-NEXT: switch i32 1, label [[SW_DEFAULT_1:%.*]] [ ; CHECK-NEXT: i32 2, label [[LATCH_1:%.*]] @@ -38,6 +36,8 @@ define void @test(i1 %cond) { ; CHECK-NEXT: br label [[LATCH_2]] ; CHECK: latch.2: ; CHECK-NEXT: br label [[FOR_END]] +; CHECK: for.end: +; CHECK-NEXT: ret void ; entry: %0 = select i1 %cond, i32 2, i32 3 diff --git a/llvm/test/Transforms/LoopUnroll/ARM/loop-unrolling.ll b/llvm/test/Transforms/LoopUnroll/ARM/loop-unrolling.ll index f2e748ade0a2..e12dbf031b3b 100644 --- a/llvm/test/Transforms/LoopUnroll/ARM/loop-unrolling.ll +++ b/llvm/test/Transforms/LoopUnroll/ARM/loop-unrolling.ll @@ -121,14 +121,14 @@ for.body4: ; CHECK-NOUNROLL: br
; CHECK-UNROLL: for.body4.epil: +; CHECK-UNROLL: for.body4.epil.1: +; CHECK-UNROLL: for.body4.epil.2: ; CHECK-UNROLL: [[IV0:%[a-z.0-9]+]] = phi i32 [ 0, [[PRE:%[a-z0-9.]+]] ], [ [[IV4:%[a-z.0-9]+]], %for.body4 ] ; CHECK-UNROLL: [[IV1:%[a-z.0-9]+]] = add nuw nsw i32 [[IV0]], 1 ; CHECK-UNROLL: [[IV2:%[a-z.0-9]+]] = add nuw nsw i32 [[IV1]], 1 ; CHECK-UNROLL: [[IV3:%[a-z.0-9]+]] = add nuw nsw i32 [[IV2]], 1 ; CHECK-UNROLL: [[IV4]] = add nuw i32 [[IV3]], 1 ; CHECK-UNROLL: br -; CHECK-UNROLL: for.body4.epil.1: -; CHECK-UNROLL: for.body4.epil.2:
%w.024 = phi i32 [ 0, %for.body4.lr.ph ], [ %inc, %for.body4 ] %add = add i32 %w.024, %mul diff --git a/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll b/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll index 156c0ab10658..8c4257698ab7 100644 --- a/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll +++ b/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll @@ -45,8 +45,37 @@ define void @test_three_blocks(i32* nocapture %Output, ; CHECK-NEXT: [[EPIL_ITER_SUB:%.*]] = sub i32 [[XTRAITER]], 1 ; CHECK-NEXT: [[EPIL_ITER_CMP:%.*]] = icmp ne i32 [[EPIL_ITER_SUB]], 0 ; CHECK-NEXT: br i1 [[EPIL_ITER_CMP]], label [[FOR_BODY_EPIL_1:%.*]], label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA:%.*]] +; CHECK: for.body.epil.1: +; CHECK-NEXT: [[ARRAYIDX_EPIL_1:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC_EPIL]] +; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_1]], align 4 +; CHECK-NEXT: [[TOBOOL_EPIL_1:%.*]] = icmp eq i32 [[TMP4]], 0 +; CHECK-NEXT: br i1 [[TOBOOL_EPIL_1]], label [[FOR_INC_EPIL_1:%.*]], label [[IF_THEN_EPIL_1:%.*]] +; CHECK: if.then.epil.1: +; CHECK-NEXT: [[ARRAYIDX1_EPIL_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_EPIL]] +; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ARRAYIDX1_EPIL_1]], align 4 +; CHECK-NEXT: [[ADD_EPIL_1:%.*]] = add i32 [[TMP5]], [[TEMP_1_EPIL]] +; CHECK-NEXT: br label [[FOR_INC_EPIL_1]] +; CHECK: for.inc.epil.1: +; CHECK-NEXT: [[TEMP_1_EPIL_1:%.*]] = phi i32 [ [[ADD_EPIL_1]], [[IF_THEN_EPIL_1]] ], [ [[TEMP_1_EPIL]], [[FOR_BODY_EPIL_1]] ] +; CHECK-NEXT: [[INC_EPIL_1:%.*]] = add nuw i32 [[INC_EPIL]], 1 +; CHECK-NEXT: [[EPIL_ITER_SUB_1:%.*]] = sub i32 [[EPIL_ITER_SUB]], 1 +; CHECK-NEXT: [[EPIL_ITER_CMP_1:%.*]] = icmp ne i32 [[EPIL_ITER_SUB_1]], 0 +; CHECK-NEXT: br i1 [[EPIL_ITER_CMP_1]], label [[FOR_BODY_EPIL_2:%.*]], label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] +; CHECK: for.body.epil.2: +; CHECK-NEXT: [[ARRAYIDX_EPIL_2:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC_EPIL_1]] +; CHECK-NEXT: [[TMP6:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_2]], align 4 +; CHECK-NEXT: [[TOBOOL_EPIL_2:%.*]] = icmp eq i32 [[TMP6]], 0 +; CHECK-NEXT: br i1 [[TOBOOL_EPIL_2]], label [[FOR_INC_EPIL_2:%.*]], label [[IF_THEN_EPIL_2:%.*]] +; CHECK: if.then.epil.2: +; CHECK-NEXT: [[ARRAYIDX1_EPIL_2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_EPIL_1]] +; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[ARRAYIDX1_EPIL_2]], align 4 +; CHECK-NEXT: [[ADD_EPIL_2:%.*]] = add i32 [[TMP7]], [[TEMP_1_EPIL_1]] +; CHECK-NEXT: br label [[FOR_INC_EPIL_2]] +; CHECK: for.inc.epil.2: +; CHECK-NEXT: [[TEMP_1_EPIL_2:%.*]] = phi i32 [ [[ADD_EPIL_2]], [[IF_THEN_EPIL_2]] ], [ [[TEMP_1_EPIL_1]], [[FOR_BODY_EPIL_2]] ] +; CHECK-NEXT: br label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] ; CHECK: for.cond.cleanup.loopexit.epilog-lcssa: -; CHECK-NEXT: [[TEMP_1_LCSSA_PH1:%.*]] = phi i32 [ [[TEMP_1_EPIL]], [[FOR_INC_EPIL]] ], [ [[TEMP_1_EPIL_1:%.*]], [[FOR_INC_EPIL_1:%.*]] ], [ [[TEMP_1_EPIL_2:%.*]], [[FOR_INC_EPIL_2:%.*]] ] +; CHECK-NEXT: [[TEMP_1_LCSSA_PH1:%.*]] = phi i32 [ [[TEMP_1_EPIL]], [[FOR_INC_EPIL]] ], [ [[TEMP_1_EPIL_1]], [[FOR_INC_EPIL_1]] ], [ [[TEMP_1_EPIL_2]], [[FOR_INC_EPIL_2]] ] ; CHECK-NEXT: br label [[FOR_COND_CLEANUP_LOOPEXIT]] ; CHECK: for.cond.cleanup.loopexit: ; CHECK-NEXT: [[TEMP_1_LCSSA:%.*]] = phi i32 [ [[TEMP_1_LCSSA_PH]], [[FOR_COND_CLEANUP_LOOPEXIT_UNR_LCSSA]] ], [ [[TEMP_1_LCSSA_PH1]], [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] ] @@ -60,51 +89,22 @@ define void @test_three_blocks(i32* nocapture %Output, ; CHECK-NEXT: [[TEMP_09:%.*]] = phi i32 [ 0, [[FOR_BODY_PREHEADER_NEW]] ], [ [[TEMP_1_3]], [[FOR_INC_3]] ] ; CHECK-NEXT: [[NITER:%.*]] = phi i32 [ [[UNROLL_ITER]], [[FOR_BODY_PREHEADER_NEW]] ], [ [[NITER_NSUB_3:%.*]], [[FOR_INC_3]] ] ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[J_010]] -; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ARRAYIDX]], align 4 -; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[TMP4]], 0 +; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[ARRAYIDX]], align 4 +; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[TMP8]], 0 ; CHECK-NEXT: br i1 [[TOBOOL]], label [[FOR_INC:%.*]], label [[IF_THEN:%.*]] ; CHECK: if.then: ; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[J_010]] -; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ARRAYIDX1]], align 4 -; CHECK-NEXT: [[ADD:%.*]] = add i32 [[TMP5]], [[TEMP_09]] +; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[ARRAYIDX1]], align 4 +; CHECK-NEXT: [[ADD:%.*]] = add i32 [[TMP9]], [[TEMP_09]] ; CHECK-NEXT: br label [[FOR_INC]] ; CHECK: for.inc: ; CHECK-NEXT: [[TEMP_1:%.*]] = phi i32 [ [[ADD]], [[IF_THEN]] ], [ [[TEMP_09]], [[FOR_BODY]] ] ; CHECK-NEXT: [[INC:%.*]] = add nuw nsw i32 [[J_010]], 1 ; CHECK-NEXT: [[NITER_NSUB:%.*]] = sub i32 [[NITER]], 1 ; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC]] -; CHECK-NEXT: [[TMP6:%.*]] = load i32, i32* [[ARRAYIDX_1]], align 4 -; CHECK-NEXT: [[TOBOOL_1:%.*]] = icmp eq i32 [[TMP6]], 0 +; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[ARRAYIDX_1]], align 4 +; CHECK-NEXT: [[TOBOOL_1:%.*]] = icmp eq i32 [[TMP10]], 0 ; CHECK-NEXT: br i1 [[TOBOOL_1]], label [[FOR_INC_1:%.*]], label [[IF_THEN_1:%.*]] -; CHECK: for.body.epil.1: -; CHECK-NEXT: [[ARRAYIDX_EPIL_1:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC_EPIL]] -; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_1]], align 4 -; CHECK-NEXT: [[TOBOOL_EPIL_1:%.*]] = icmp eq i32 [[TMP7]], 0 -; CHECK-NEXT: br i1 [[TOBOOL_EPIL_1]], label [[FOR_INC_EPIL_1]], label [[IF_THEN_EPIL_1:%.*]] -; CHECK: if.then.epil.1: -; CHECK-NEXT: [[ARRAYIDX1_EPIL_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_EPIL]] -; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[ARRAYIDX1_EPIL_1]], align 4 -; CHECK-NEXT: [[ADD_EPIL_1:%.*]] = add i32 [[TMP8]], [[TEMP_1_EPIL]] -; CHECK-NEXT: br label [[FOR_INC_EPIL_1]] -; CHECK: for.inc.epil.1: -; CHECK-NEXT: [[TEMP_1_EPIL_1]] = phi i32 [ [[ADD_EPIL_1]], [[IF_THEN_EPIL_1]] ], [ [[TEMP_1_EPIL]], [[FOR_BODY_EPIL_1]] ] -; CHECK-NEXT: [[INC_EPIL_1:%.*]] = add nuw i32 [[INC_EPIL]], 1 -; CHECK-NEXT: [[EPIL_ITER_SUB_1:%.*]] = sub i32 [[EPIL_ITER_SUB]], 1 -; CHECK-NEXT: [[EPIL_ITER_CMP_1:%.*]] = icmp ne i32 [[EPIL_ITER_SUB_1]], 0 -; CHECK-NEXT: br i1 [[EPIL_ITER_CMP_1]], label [[FOR_BODY_EPIL_2:%.*]], label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] -; CHECK: for.body.epil.2: -; CHECK-NEXT: [[ARRAYIDX_EPIL_2:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC_EPIL_1]] -; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_2]], align 4 -; CHECK-NEXT: [[TOBOOL_EPIL_2:%.*]] = icmp eq i32 [[TMP9]], 0 -; CHECK-NEXT: br i1 [[TOBOOL_EPIL_2]], label [[FOR_INC_EPIL_2]], label [[IF_THEN_EPIL_2:%.*]] -; CHECK: if.then.epil.2: -; CHECK-NEXT: [[ARRAYIDX1_EPIL_2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_EPIL_1]] -; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[ARRAYIDX1_EPIL_2]], align 4 -; CHECK-NEXT: [[ADD_EPIL_2:%.*]] = add i32 [[TMP10]], [[TEMP_1_EPIL_1]] -; CHECK-NEXT: br label [[FOR_INC_EPIL_2]] -; CHECK: for.inc.epil.2: -; CHECK-NEXT: [[TEMP_1_EPIL_2]] = phi i32 [ [[ADD_EPIL_2]], [[IF_THEN_EPIL_2]] ], [ [[TEMP_1_EPIL_1]], [[FOR_BODY_EPIL_2]] ] -; CHECK-NEXT: br label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] ; CHECK: if.then.1: ; CHECK-NEXT: [[ARRAYIDX1_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC]] ; CHECK-NEXT: [[TMP11:%.*]] = load i32, i32* [[ARRAYIDX1_1]], align 4 @@ -203,41 +203,34 @@ define void @test_two_exits(i32* nocapture %Output, ; CHECK-NEXT: [[INC:%.*]] = add nuw nsw i32 [[J_016]], 1 ; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[INC]], [[MAXJ]] ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY_1:%.*]], label [[CLEANUP_LOOPEXIT]] -; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[TEMP_0_LCSSA_PH:%.*]] = phi i32 [ [[TEMP_0_ADD]], [[IF_END]] ], [ [[TEMP_015]], [[FOR_BODY]] ], [ [[TEMP_0_ADD]], [[FOR_BODY_1]] ], [ [[TEMP_0_ADD_1:%.*]], [[IF_END_1:%.*]] ], [ [[TEMP_0_ADD_1]], [[FOR_BODY_2:%.*]] ], [ [[TEMP_0_ADD_2:%.*]], [[IF_END_2:%.*]] ], [ [[TEMP_0_ADD_2]], [[FOR_BODY_3:%.*]] ], [ [[TEMP_0_ADD_3]], [[IF_END_3]] ] -; CHECK-NEXT: br label [[CLEANUP]] -; CHECK: cleanup: -; CHECK-NEXT: [[TEMP_0_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[TEMP_0_LCSSA_PH]], [[CLEANUP_LOOPEXIT]] ] -; CHECK-NEXT: store i32 [[TEMP_0_LCSSA]], i32* [[OUTPUT:%.*]], align 4 -; CHECK-NEXT: ret void ; CHECK: for.body.1: ; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC]] ; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[ARRAYIDX_1]], align 4 ; CHECK-NEXT: [[CMP1_1:%.*]] = icmp ugt i32 [[TMP2]], 65535 -; CHECK-NEXT: br i1 [[CMP1_1]], label [[CLEANUP_LOOPEXIT]], label [[IF_END_1]] +; CHECK-NEXT: br i1 [[CMP1_1]], label [[CLEANUP_LOOPEXIT]], label [[IF_END_1:%.*]] ; CHECK: if.end.1: ; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC]] ; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[ARRAYIDX2_1]], align 4 ; CHECK-NEXT: [[TOBOOL_1:%.*]] = icmp eq i32 [[TMP3]], 0 ; CHECK-NEXT: [[ADD_1:%.*]] = select i1 [[TOBOOL_1]], i32 0, i32 [[TMP2]] -; CHECK-NEXT: [[TEMP_0_ADD_1]] = add i32 [[ADD_1]], [[TEMP_0_ADD]] +; CHECK-NEXT: [[TEMP_0_ADD_1:%.*]] = add i32 [[ADD_1]], [[TEMP_0_ADD]] ; CHECK-NEXT: [[INC_1:%.*]] = add nuw nsw i32 [[INC]], 1 ; CHECK-NEXT: [[CMP_1:%.*]] = icmp ult i32 [[INC_1]], [[MAXJ]] -; CHECK-NEXT: br i1 [[CMP_1]], label [[FOR_BODY_2]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: br i1 [[CMP_1]], label [[FOR_BODY_2:%.*]], label [[CLEANUP_LOOPEXIT]] ; CHECK: for.body.2: ; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_1]] ; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ARRAYIDX_2]], align 4 ; CHECK-NEXT: [[CMP1_2:%.*]] = icmp ugt i32 [[TMP4]], 65535 -; CHECK-NEXT: br i1 [[CMP1_2]], label [[CLEANUP_LOOPEXIT]], label [[IF_END_2]] +; CHECK-NEXT: br i1 [[CMP1_2]], label [[CLEANUP_LOOPEXIT]], label [[IF_END_2:%.*]] ; CHECK: if.end.2: ; CHECK-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC_1]] ; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ARRAYIDX2_2]], align 4 ; CHECK-NEXT: [[TOBOOL_2:%.*]] = icmp eq i32 [[TMP5]], 0 ; CHECK-NEXT: [[ADD_2:%.*]] = select i1 [[TOBOOL_2]], i32 0, i32 [[TMP4]] -; CHECK-NEXT: [[TEMP_0_ADD_2]] = add i32 [[ADD_2]], [[TEMP_0_ADD_1]] +; CHECK-NEXT: [[TEMP_0_ADD_2:%.*]] = add i32 [[ADD_2]], [[TEMP_0_ADD_1]] ; CHECK-NEXT: [[INC_2:%.*]] = add nuw nsw i32 [[INC_1]], 1 ; CHECK-NEXT: [[CMP_2:%.*]] = icmp ult i32 [[INC_2]], [[MAXJ]] -; CHECK-NEXT: br i1 [[CMP_2]], label [[FOR_BODY_3]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: br i1 [[CMP_2]], label [[FOR_BODY_3:%.*]], label [[CLEANUP_LOOPEXIT]] ; CHECK: for.body.3: ; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_2]] ; CHECK-NEXT: [[TMP6:%.*]] = load i32, i32* [[ARRAYIDX_3]], align 4 @@ -252,6 +245,13 @@ define void @test_two_exits(i32* nocapture %Output, ; CHECK-NEXT: [[INC_3]] = add nuw i32 [[INC_2]], 1 ; CHECK-NEXT: [[CMP_3:%.*]] = icmp ult i32 [[INC_3]], [[MAXJ]] ; CHECK-NEXT: br i1 [[CMP_3]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT]] +; CHECK: cleanup.loopexit: +; CHECK-NEXT: [[TEMP_0_LCSSA_PH:%.*]] = phi i32 [ [[TEMP_0_ADD]], [[IF_END]] ], [ [[TEMP_015]], [[FOR_BODY]] ], [ [[TEMP_0_ADD]], [[FOR_BODY_1]] ], [ [[TEMP_0_ADD_1]], [[IF_END_1]] ], [ [[TEMP_0_ADD_1]], [[FOR_BODY_2]] ], [ [[TEMP_0_ADD_2]], [[IF_END_2]] ], [ [[TEMP_0_ADD_2]], [[FOR_BODY_3]] ], [ [[TEMP_0_ADD_3]], [[IF_END_3]] ] +; CHECK-NEXT: br label [[CLEANUP]] +; CHECK: cleanup: +; CHECK-NEXT: [[TEMP_0_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[TEMP_0_LCSSA_PH]], [[CLEANUP_LOOPEXIT]] ] +; CHECK-NEXT: store i32 [[TEMP_0_LCSSA]], i32* [[OUTPUT:%.*]], align 4 +; CHECK-NEXT: ret void ; i32* nocapture readonly %Condition, i32* nocapture readonly %Input, @@ -417,100 +417,100 @@ define void @test_four_blocks(i32* nocapture %Output, ; CHECK-NEXT: [[EPIL_ITER_SUB:%.*]] = sub i32 [[XTRAITER]], 1 ; CHECK-NEXT: [[EPIL_ITER_CMP:%.*]] = icmp ne i32 [[EPIL_ITER_SUB]], 0 ; CHECK-NEXT: br i1 [[EPIL_ITER_CMP]], label [[FOR_BODY_EPIL_1:%.*]], label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA:%.*]] -; CHECK: for.cond.cleanup.loopexit.epilog-lcssa: -; CHECK-NEXT: [[TEMP_1_LCSSA_PH1:%.*]] = phi i32 [ [[TEMP_1_EPIL]], [[FOR_INC_EPIL]] ], [ [[TEMP_1_EPIL_1:%.*]], [[FOR_INC_EPIL_1:%.*]] ], [ [[TEMP_1_EPIL_2:%.*]], [[FOR_INC_EPIL_2:%.*]] ] -; CHECK-NEXT: br label [[FOR_COND_CLEANUP_LOOPEXIT]] -; CHECK: for.cond.cleanup.loopexit: -; CHECK-NEXT: [[TEMP_1_LCSSA:%.*]] = phi i32 [ [[TEMP_1_LCSSA_PH]], [[FOR_COND_CLEANUP_LOOPEXIT_UNR_LCSSA]] ], [ [[TEMP_1_LCSSA_PH1]], [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] ] -; CHECK-NEXT: br label [[FOR_COND_CLEANUP]] -; CHECK: for.cond.cleanup: -; CHECK-NEXT: [[TEMP_0_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[TEMP_1_LCSSA]], [[FOR_COND_CLEANUP_LOOPEXIT]] ] -; CHECK-NEXT: store i32 [[TEMP_0_LCSSA]], i32* [[OUTPUT:%.*]], align 4 -; CHECK-NEXT: ret void -; CHECK: for.body: -; CHECK-NEXT: [[TMP6:%.*]] = phi i32 [ [[DOTPRE]], [[FOR_BODY_LR_PH_NEW]] ], [ [[TMP23]], [[FOR_INC_3]] ] -; CHECK-NEXT: [[J_027:%.*]] = phi i32 [ 1, [[FOR_BODY_LR_PH_NEW]] ], [ [[INC_3]], [[FOR_INC_3]] ] -; CHECK-NEXT: [[TEMP_026:%.*]] = phi i32 [ 0, [[FOR_BODY_LR_PH_NEW]] ], [ [[TEMP_1_3]], [[FOR_INC_3]] ] -; CHECK-NEXT: [[NITER:%.*]] = phi i32 [ [[UNROLL_ITER]], [[FOR_BODY_LR_PH_NEW]] ], [ [[NITER_NSUB_3:%.*]], [[FOR_INC_3]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[J_027]] -; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[ARRAYIDX]], align 4 -; CHECK-NEXT: [[CMP1:%.*]] = icmp ugt i32 [[TMP7]], 65535 -; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[J_027]] -; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[ARRAYIDX2]], align 4 -; CHECK-NEXT: [[CMP4:%.*]] = icmp ugt i32 [[TMP8]], [[TMP6]] -; CHECK-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]] -; CHECK: if.then: -; CHECK-NEXT: [[COND:%.*]] = zext i1 [[CMP4]] to i32 -; CHECK-NEXT: [[ADD:%.*]] = add i32 [[TEMP_026]], [[COND]] -; CHECK-NEXT: br label [[FOR_INC:%.*]] -; CHECK: if.else: -; CHECK-NEXT: [[NOT_CMP4:%.*]] = xor i1 [[CMP4]], true -; CHECK-NEXT: [[SUB:%.*]] = sext i1 [[NOT_CMP4]] to i32 -; CHECK-NEXT: [[SUB10_SINK:%.*]] = add i32 [[J_027]], [[SUB]] -; CHECK-NEXT: [[ARRAYIDX11:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[SUB10_SINK]] -; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[ARRAYIDX11]], align 4 -; CHECK-NEXT: [[SUB13:%.*]] = sub i32 [[TEMP_026]], [[TMP9]] -; CHECK-NEXT: br label [[FOR_INC]] -; CHECK: for.inc: -; CHECK-NEXT: [[TEMP_1:%.*]] = phi i32 [ [[ADD]], [[IF_THEN]] ], [ [[SUB13]], [[IF_ELSE]] ] -; CHECK-NEXT: [[INC:%.*]] = add nuw nsw i32 [[J_027]], 1 -; CHECK-NEXT: [[NITER_NSUB:%.*]] = sub i32 [[NITER]], 1 -; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC]] -; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[ARRAYIDX_1]], align 4 -; CHECK-NEXT: [[CMP1_1:%.*]] = icmp ugt i32 [[TMP10]], 65535 -; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC]] -; CHECK-NEXT: [[TMP11:%.*]] = load i32, i32* [[ARRAYIDX2_1]], align 4 -; CHECK-NEXT: [[CMP4_1:%.*]] = icmp ugt i32 [[TMP11]], [[TMP8]] -; CHECK-NEXT: br i1 [[CMP1_1]], label [[IF_THEN_1:%.*]], label [[IF_ELSE_1:%.*]] ; CHECK: for.body.epil.1: ; CHECK-NEXT: [[ARRAYIDX_EPIL_1:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC_EPIL]] -; CHECK-NEXT: [[TMP12:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_1]], align 4 -; CHECK-NEXT: [[CMP1_EPIL_1:%.*]] = icmp ugt i32 [[TMP12]], 65535 +; CHECK-NEXT: [[TMP6:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_1]], align 4 +; CHECK-NEXT: [[CMP1_EPIL_1:%.*]] = icmp ugt i32 [[TMP6]], 65535 ; CHECK-NEXT: [[ARRAYIDX2_EPIL_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_EPIL]] -; CHECK-NEXT: [[TMP13:%.*]] = load i32, i32* [[ARRAYIDX2_EPIL_1]], align 4 -; CHECK-NEXT: [[CMP4_EPIL_1:%.*]] = icmp ugt i32 [[TMP13]], [[TMP4]] +; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[ARRAYIDX2_EPIL_1]], align 4 +; CHECK-NEXT: [[CMP4_EPIL_1:%.*]] = icmp ugt i32 [[TMP7]], [[TMP4]] ; CHECK-NEXT: br i1 [[CMP1_EPIL_1]], label [[IF_THEN_EPIL_1:%.*]], label [[IF_ELSE_EPIL_1:%.*]] ; CHECK: if.else.epil.1: ; CHECK-NEXT: [[NOT_CMP4_EPIL_1:%.*]] = xor i1 [[CMP4_EPIL_1]], true ; CHECK-NEXT: [[SUB_EPIL_1:%.*]] = sext i1 [[NOT_CMP4_EPIL_1]] to i32 ; CHECK-NEXT: [[SUB10_SINK_EPIL_1:%.*]] = add i32 [[INC_EPIL]], [[SUB_EPIL_1]] ; CHECK-NEXT: [[ARRAYIDX11_EPIL_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[SUB10_SINK_EPIL_1]] -; CHECK-NEXT: [[TMP14:%.*]] = load i32, i32* [[ARRAYIDX11_EPIL_1]], align 4 -; CHECK-NEXT: [[SUB13_EPIL_1:%.*]] = sub i32 [[TEMP_1_EPIL]], [[TMP14]] -; CHECK-NEXT: br label [[FOR_INC_EPIL_1]] +; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[ARRAYIDX11_EPIL_1]], align 4 +; CHECK-NEXT: [[SUB13_EPIL_1:%.*]] = sub i32 [[TEMP_1_EPIL]], [[TMP8]] +; CHECK-NEXT: br label [[FOR_INC_EPIL_1:%.*]] ; CHECK: if.then.epil.1: ; CHECK-NEXT: [[COND_EPIL_1:%.*]] = zext i1 [[CMP4_EPIL_1]] to i32 ; CHECK-NEXT: [[ADD_EPIL_1:%.*]] = add i32 [[TEMP_1_EPIL]], [[COND_EPIL_1]] ; CHECK-NEXT: br label [[FOR_INC_EPIL_1]] ; CHECK: for.inc.epil.1: -; CHECK-NEXT: [[TEMP_1_EPIL_1]] = phi i32 [ [[ADD_EPIL_1]], [[IF_THEN_EPIL_1]] ], [ [[SUB13_EPIL_1]], [[IF_ELSE_EPIL_1]] ] +; CHECK-NEXT: [[TEMP_1_EPIL_1:%.*]] = phi i32 [ [[ADD_EPIL_1]], [[IF_THEN_EPIL_1]] ], [ [[SUB13_EPIL_1]], [[IF_ELSE_EPIL_1]] ] ; CHECK-NEXT: [[INC_EPIL_1:%.*]] = add nuw i32 [[INC_EPIL]], 1 ; CHECK-NEXT: [[EPIL_ITER_SUB_1:%.*]] = sub i32 [[EPIL_ITER_SUB]], 1 ; CHECK-NEXT: [[EPIL_ITER_CMP_1:%.*]] = icmp ne i32 [[EPIL_ITER_SUB_1]], 0 ; CHECK-NEXT: br i1 [[EPIL_ITER_CMP_1]], label [[FOR_BODY_EPIL_2:%.*]], label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] ; CHECK: for.body.epil.2: ; CHECK-NEXT: [[ARRAYIDX_EPIL_2:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC_EPIL_1]] -; CHECK-NEXT: [[TMP15:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_2]], align 4 -; CHECK-NEXT: [[CMP1_EPIL_2:%.*]] = icmp ugt i32 [[TMP15]], 65535 +; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[ARRAYIDX_EPIL_2]], align 4 +; CHECK-NEXT: [[CMP1_EPIL_2:%.*]] = icmp ugt i32 [[TMP9]], 65535 ; CHECK-NEXT: [[ARRAYIDX2_EPIL_2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_EPIL_1]] -; CHECK-NEXT: [[TMP16:%.*]] = load i32, i32* [[ARRAYIDX2_EPIL_2]], align 4 -; CHECK-NEXT: [[CMP4_EPIL_2:%.*]] = icmp ugt i32 [[TMP16]], [[TMP13]] +; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[ARRAYIDX2_EPIL_2]], align 4 +; CHECK-NEXT: [[CMP4_EPIL_2:%.*]] = icmp ugt i32 [[TMP10]], [[TMP7]] ; CHECK-NEXT: br i1 [[CMP1_EPIL_2]], label [[IF_THEN_EPIL_2:%.*]], label [[IF_ELSE_EPIL_2:%.*]] ; CHECK: if.else.epil.2: ; CHECK-NEXT: [[NOT_CMP4_EPIL_2:%.*]] = xor i1 [[CMP4_EPIL_2]], true ; CHECK-NEXT: [[SUB_EPIL_2:%.*]] = sext i1 [[NOT_CMP4_EPIL_2]] to i32 ; CHECK-NEXT: [[SUB10_SINK_EPIL_2:%.*]] = add i32 [[INC_EPIL_1]], [[SUB_EPIL_2]] ; CHECK-NEXT: [[ARRAYIDX11_EPIL_2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[SUB10_SINK_EPIL_2]] -; CHECK-NEXT: [[TMP17:%.*]] = load i32, i32* [[ARRAYIDX11_EPIL_2]], align 4 -; CHECK-NEXT: [[SUB13_EPIL_2:%.*]] = sub i32 [[TEMP_1_EPIL_1]], [[TMP17]] -; CHECK-NEXT: br label [[FOR_INC_EPIL_2]] +; CHECK-NEXT: [[TMP11:%.*]] = load i32, i32* [[ARRAYIDX11_EPIL_2]], align 4 +; CHECK-NEXT: [[SUB13_EPIL_2:%.*]] = sub i32 [[TEMP_1_EPIL_1]], [[TMP11]] +; CHECK-NEXT: br label [[FOR_INC_EPIL_2:%.*]] ; CHECK: if.then.epil.2: ; CHECK-NEXT: [[COND_EPIL_2:%.*]] = zext i1 [[CMP4_EPIL_2]] to i32 ; CHECK-NEXT: [[ADD_EPIL_2:%.*]] = add i32 [[TEMP_1_EPIL_1]], [[COND_EPIL_2]] ; CHECK-NEXT: br label [[FOR_INC_EPIL_2]] ; CHECK: for.inc.epil.2: -; CHECK-NEXT: [[TEMP_1_EPIL_2]] = phi i32 [ [[ADD_EPIL_2]], [[IF_THEN_EPIL_2]] ], [ [[SUB13_EPIL_2]], [[IF_ELSE_EPIL_2]] ] +; CHECK-NEXT: [[TEMP_1_EPIL_2:%.*]] = phi i32 [ [[ADD_EPIL_2]], [[IF_THEN_EPIL_2]] ], [ [[SUB13_EPIL_2]], [[IF_ELSE_EPIL_2]] ] ; CHECK-NEXT: br label [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] +; CHECK: for.cond.cleanup.loopexit.epilog-lcssa: +; CHECK-NEXT: [[TEMP_1_LCSSA_PH1:%.*]] = phi i32 [ [[TEMP_1_EPIL]], [[FOR_INC_EPIL]] ], [ [[TEMP_1_EPIL_1]], [[FOR_INC_EPIL_1]] ], [ [[TEMP_1_EPIL_2]], [[FOR_INC_EPIL_2]] ] +; CHECK-NEXT: br label [[FOR_COND_CLEANUP_LOOPEXIT]] +; CHECK: for.cond.cleanup.loopexit: +; CHECK-NEXT: [[TEMP_1_LCSSA:%.*]] = phi i32 [ [[TEMP_1_LCSSA_PH]], [[FOR_COND_CLEANUP_LOOPEXIT_UNR_LCSSA]] ], [ [[TEMP_1_LCSSA_PH1]], [[FOR_COND_CLEANUP_LOOPEXIT_EPILOG_LCSSA]] ] +; CHECK-NEXT: br label [[FOR_COND_CLEANUP]] +; CHECK: for.cond.cleanup: +; CHECK-NEXT: [[TEMP_0_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[TEMP_1_LCSSA]], [[FOR_COND_CLEANUP_LOOPEXIT]] ] +; CHECK-NEXT: store i32 [[TEMP_0_LCSSA]], i32* [[OUTPUT:%.*]], align 4 +; CHECK-NEXT: ret void +; CHECK: for.body: +; CHECK-NEXT: [[TMP12:%.*]] = phi i32 [ [[DOTPRE]], [[FOR_BODY_LR_PH_NEW]] ], [ [[TMP23]], [[FOR_INC_3]] ] +; CHECK-NEXT: [[J_027:%.*]] = phi i32 [ 1, [[FOR_BODY_LR_PH_NEW]] ], [ [[INC_3]], [[FOR_INC_3]] ] +; CHECK-NEXT: [[TEMP_026:%.*]] = phi i32 [ 0, [[FOR_BODY_LR_PH_NEW]] ], [ [[TEMP_1_3]], [[FOR_INC_3]] ] +; CHECK-NEXT: [[NITER:%.*]] = phi i32 [ [[UNROLL_ITER]], [[FOR_BODY_LR_PH_NEW]] ], [ [[NITER_NSUB_3:%.*]], [[FOR_INC_3]] ] +; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[J_027]] +; CHECK-NEXT: [[TMP13:%.*]] = load i32, i32* [[ARRAYIDX]], align 4 +; CHECK-NEXT: [[CMP1:%.*]] = icmp ugt i32 [[TMP13]], 65535 +; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[J_027]] +; CHECK-NEXT: [[TMP14:%.*]] = load i32, i32* [[ARRAYIDX2]], align 4 +; CHECK-NEXT: [[CMP4:%.*]] = icmp ugt i32 [[TMP14]], [[TMP12]] +; CHECK-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]] +; CHECK: if.then: +; CHECK-NEXT: [[COND:%.*]] = zext i1 [[CMP4]] to i32 +; CHECK-NEXT: [[ADD:%.*]] = add i32 [[TEMP_026]], [[COND]] +; CHECK-NEXT: br label [[FOR_INC:%.*]] +; CHECK: if.else: +; CHECK-NEXT: [[NOT_CMP4:%.*]] = xor i1 [[CMP4]], true +; CHECK-NEXT: [[SUB:%.*]] = sext i1 [[NOT_CMP4]] to i32 +; CHECK-NEXT: [[SUB10_SINK:%.*]] = add i32 [[J_027]], [[SUB]] +; CHECK-NEXT: [[ARRAYIDX11:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[SUB10_SINK]] +; CHECK-NEXT: [[TMP15:%.*]] = load i32, i32* [[ARRAYIDX11]], align 4 +; CHECK-NEXT: [[SUB13:%.*]] = sub i32 [[TEMP_026]], [[TMP15]] +; CHECK-NEXT: br label [[FOR_INC]] +; CHECK: for.inc: +; CHECK-NEXT: [[TEMP_1:%.*]] = phi i32 [ [[ADD]], [[IF_THEN]] ], [ [[SUB13]], [[IF_ELSE]] ] +; CHECK-NEXT: [[INC:%.*]] = add nuw nsw i32 [[J_027]], 1 +; CHECK-NEXT: [[NITER_NSUB:%.*]] = sub i32 [[NITER]], 1 +; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds i32, i32* [[CONDITION]], i32 [[INC]] +; CHECK-NEXT: [[TMP16:%.*]] = load i32, i32* [[ARRAYIDX_1]], align 4 +; CHECK-NEXT: [[CMP1_1:%.*]] = icmp ugt i32 [[TMP16]], 65535 +; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC]] +; CHECK-NEXT: [[TMP17:%.*]] = load i32, i32* [[ARRAYIDX2_1]], align 4 +; CHECK-NEXT: [[CMP4_1:%.*]] = icmp ugt i32 [[TMP17]], [[TMP14]] +; CHECK-NEXT: br i1 [[CMP1_1]], label [[IF_THEN_1:%.*]], label [[IF_ELSE_1:%.*]] ; CHECK: if.else.1: ; CHECK-NEXT: [[NOT_CMP4_1:%.*]] = xor i1 [[CMP4_1]], true ; CHECK-NEXT: [[SUB_1:%.*]] = sext i1 [[NOT_CMP4_1]] to i32 @@ -532,7 +532,7 @@ define void @test_four_blocks(i32* nocapture %Output, ; CHECK-NEXT: [[CMP1_2:%.*]] = icmp ugt i32 [[TMP19]], 65535 ; CHECK-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds i32, i32* [[INPUT]], i32 [[INC_1]] ; CHECK-NEXT: [[TMP20:%.*]] = load i32, i32* [[ARRAYIDX2_2]], align 4 -; CHECK-NEXT: [[CMP4_2:%.*]] = icmp ugt i32 [[TMP20]], [[TMP11]] +; CHECK-NEXT: [[CMP4_2:%.*]] = icmp ugt i32 [[TMP20]], [[TMP17]] ; CHECK-NEXT: br i1 [[CMP1_2]], label [[IF_THEN_2:%.*]], label [[IF_ELSE_2:%.*]] ; CHECK: if.else.2: ; CHECK-NEXT: [[NOT_CMP4_2:%.*]] = xor i1 [[CMP4_2]], true @@ -742,10 +742,6 @@ define void @iterate_inc(%struct.Node* %n, i32 %limit) { ; CHECK-NEXT: [[TMP2:%.*]] = load %struct.Node*, %struct.Node** [[TMP1]], align 4 ; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq %struct.Node* [[TMP2]], null ; CHECK-NEXT: br i1 [[TOBOOL]], label [[WHILE_END_LOOPEXIT]], label [[LAND_RHS_1:%.*]] -; CHECK: while.end.loopexit: -; CHECK-NEXT: br label [[WHILE_END]] -; CHECK: while.end: -; CHECK-NEXT: ret void ; CHECK: land.rhs.1: ; CHECK-NEXT: [[VAL_1:%.*]] = getelementptr inbounds [[STRUCT_NODE]], %struct.Node* [[TMP2]], i32 0, i32 1 ; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[VAL_1]], align 4 @@ -782,6 +778,10 @@ define void @iterate_inc(%struct.Node* %n, i32 %limit) { ; CHECK-NEXT: [[TMP11]] = load %struct.Node*, %struct.Node** [[TMP10]], align 4 ; CHECK-NEXT: [[TOBOOL_3:%.*]] = icmp eq %struct.Node* [[TMP11]], null ; CHECK-NEXT: br i1 [[TOBOOL_3]], label [[WHILE_END_LOOPEXIT]], label [[LAND_RHS]] +; CHECK: while.end.loopexit: +; CHECK-NEXT: br label [[WHILE_END]] +; CHECK: while.end: +; CHECK-NEXT: ret void ; entry: %tobool5 = icmp eq %struct.Node* %n, null diff --git a/llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll b/llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll index ea18d3aa1054..33151c68b319 100644 --- a/llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll +++ b/llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll @@ -20,8 +20,6 @@ define void @test(i32* %x, i32 %n) { ; CHECK-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds i32, i32* [[X]], i64 1 ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[REM]], 1 ; CHECK-NEXT: br i1 [[CMP]], label [[WHILE_BODY_1:%.*]], label [[WHILE_END]] -; CHECK: while.end: -; CHECK-NEXT: ret void ; CHECK: while.body.1: ; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[INCDEC_PTR]], align 4 ; CHECK-NEXT: [[CMP1_1:%.*]] = icmp slt i32 [[TMP1]], 10 @@ -40,6 +38,8 @@ define void @test(i32* %x, i32 %n) { ; CHECK: if.then.2: ; CHECK-NEXT: store i32 0, i32* [[INCDEC_PTR_1]], align 4 ; CHECK-NEXT: br label [[WHILE_END]] +; CHECK: while.end: +; CHECK-NEXT: ret void ; entry: %sub = add nsw i32 %n, -1 @@ -76,9 +76,9 @@ define i32 @test2(i32 %l86) { ; CHECK-NEXT: [[L86_OFF:%.*]] = add i32 [[L86:%.*]], -1 ; CHECK-NEXT: [[SWITCH:%.*]] = icmp ult i32 [[L86_OFF]], 24 ; CHECK-NEXT: [[DOTNOT30:%.*]] = icmp ne i32 [[L86]], 25 -; CHECK-NEXT: [[SPEC_SELECT24:%.*]] = zext i1 [[DOTNOT30]] to i32 -; CHECK-NEXT: [[COMMON_RET31_OP:%.*]] = select i1 [[SWITCH]], i32 0, i32 [[SPEC_SELECT24]] -; CHECK-NEXT: ret i32 [[COMMON_RET31_OP]] +; CHECK-NEXT: [[SPEC_SELECT:%.*]] = zext i1 [[DOTNOT30]] to i32 +; CHECK-NEXT: [[COMMON_RET_OP:%.*]] = select i1 [[SWITCH]], i32 0, i32 [[SPEC_SELECT]] +; CHECK-NEXT: ret i32 [[COMMON_RET_OP]] ; entry: br label %for.body.i.i diff --git a/llvm/test/Transforms/LoopUnroll/full-unroll-keep-first-exit.ll b/llvm/test/Transforms/LoopUnroll/full-unroll-keep-first-exit.ll index 316051715584..cdc8e944715e 100644 --- a/llvm/test/Transforms/LoopUnroll/full-unroll-keep-first-exit.ll +++ b/llvm/test/Transforms/LoopUnroll/full-unroll-keep-first-exit.ll @@ -15,12 +15,12 @@ define void @s32_max1(i32 %n, i32* %p) { ; CHECK-NEXT: [[INC:%.*]] = add i32 [[N]], 1 ; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[N]], [[ADD]] ; CHECK-NEXT: br i1 [[CMP]], label [[DO_BODY_1:%.*]], label [[DO_END:%.*]] -; CHECK: do.end: -; CHECK-NEXT: ret void ; CHECK: do.body.1: ; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr i32, i32* [[P]], i32 [[INC]] ; CHECK-NEXT: store i32 [[INC]], i32* [[ARRAYIDX_1]], align 4 ; CHECK-NEXT: br label [[DO_END]] +; CHECK: do.end: +; CHECK-NEXT: ret void ; entry: %add = add i32 %n, 1 @@ -51,8 +51,6 @@ define void @s32_max2(i32 %n, i32* %p) { ; CHECK-NEXT: [[INC:%.*]] = add i32 [[N]], 1 ; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[N]], [[ADD]] ; CHECK-NEXT: br i1 [[CMP]], label [[DO_BODY_1:%.*]], label [[DO_END:%.*]] -; CHECK: do.end: -; CHECK-NEXT: ret void ; CHECK: do.body.1: ; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr i32, i32* [[P]], i32 [[INC]] ; CHECK-NEXT: store i32 [[INC]], i32* [[ARRAYIDX_1]], align 4 @@ -60,6 +58,8 @@ define void @s32_max2(i32 %n, i32* %p) { ; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr i32, i32* [[P]], i32 [[INC_1]] ; CHECK-NEXT: store i32 [[INC_1]], i32* [[ARRAYIDX_2]], align 4 ; CHECK-NEXT: br label [[DO_END]] +; CHECK: do.end: +; CHECK-NEXT: ret void ; entry: %add = add i32 %n, 2 @@ -163,12 +163,12 @@ define void @u32_max1(i32 %n, i32* %p) { ; CHECK-NEXT: [[INC:%.*]] = add i32 [[N]], 1 ; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[N]], [[ADD]] ; CHECK-NEXT: br i1 [[CMP]], label [[DO_BODY_1:%.*]], label [[DO_END:%.*]] -; CHECK: do.end: -; CHECK-NEXT: ret void ; CHECK: do.body.1: ; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr i32, i32* [[P]], i32 [[INC]] ; CHECK-NEXT: store i32 [[INC]], i32* [[ARRAYIDX_1]], align 4 ; CHECK-NEXT: br label [[DO_END]] +; CHECK: do.end: +; CHECK-NEXT: ret void ; entry: %add = add i32 %n, 1 @@ -199,8 +199,6 @@ define void @u32_max2(i32 %n, i32* %p) { ; CHECK-NEXT: [[INC:%.*]] = add i32 [[N]], 1 ; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[N]], [[ADD]] ; CHECK-NEXT: br i1 [[CMP]], label [[DO_BODY_1:%.*]], label [[DO_END:%.*]] -; CHECK: do.end: -; CHECK-NEXT: ret void ; CHECK: do.body.1: ; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr i32, i32* [[P]], i32 [[INC]] ; CHECK-NEXT: store i32 [[INC]], i32* [[ARRAYIDX_1]], align 4 @@ -208,6 +206,8 @@ define void @u32_max2(i32 %n, i32* %p) { ; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr i32, i32* [[P]], i32 [[INC_1]] ; CHECK-NEXT: store i32 [[INC_1]], i32* [[ARRAYIDX_2]], align 4 ; CHECK-NEXT: br label [[DO_END]] +; CHECK: do.end: +; CHECK-NEXT: ret void ; entry: %add = add i32 %n, 2 diff --git a/llvm/test/Transforms/LoopUnroll/full-unroll-one-unpredictable-exit.ll b/llvm/test/Transforms/LoopUnroll/full-unroll-one-unpredictable-exit.ll index 095a7c1e1dd1..b7d7e00fa0c9 100644 --- a/llvm/test/Transforms/LoopUnroll/full-unroll-one-unpredictable-exit.ll +++ b/llvm/test/Transforms/LoopUnroll/full-unroll-one-unpredictable-exit.ll @@ -34,11 +34,11 @@ define i1 @test_latch() { ; CHECK-NEXT: [[LOAD2_1:%.*]] = load i64, i64* [[GEP2_1]], align 8 ; CHECK-NEXT: [[EXITCOND2_1:%.*]] = icmp eq i64 [[LOAD1_1]], [[LOAD2_1]] ; CHECK-NEXT: br i1 [[EXITCOND2_1]], label [[LATCH_1:%.*]], label [[EXIT]] +; CHECK: latch.1: +; CHECK-NEXT: br label [[EXIT]] ; CHECK: exit: ; CHECK-NEXT: [[EXIT_VAL:%.*]] = phi i1 [ false, [[LOOP]] ], [ false, [[LATCH]] ], [ true, [[LATCH_1]] ] ; CHECK-NEXT: ret i1 [[EXIT_VAL]] -; CHECK: latch.1: -; CHECK-NEXT: br label [[EXIT]] ; start: %a1 = alloca [2 x i64], align 8 @@ -95,22 +95,22 @@ define i1 @test_non_latch() { ; CHECK-NEXT: [[LOAD2:%.*]] = load i64, i64* [[GEP2]], align 8 ; CHECK-NEXT: [[EXITCOND2:%.*]] = icmp eq i64 [[LOAD1]], [[LOAD2]] ; CHECK-NEXT: br i1 [[EXITCOND2]], label [[LOOP_1:%.*]], label [[EXIT:%.*]] -; CHECK: exit: -; CHECK-NEXT: [[EXIT_VAL:%.*]] = phi i1 [ false, [[LATCH]] ], [ false, [[LATCH_1:%.*]] ], [ true, [[LOOP_2:%.*]] ], [ false, [[LATCH_2:%.*]] ] -; CHECK-NEXT: ret i1 [[EXIT_VAL]] ; CHECK: loop.1: -; CHECK-NEXT: br label [[LATCH_1]] +; CHECK-NEXT: br label [[LATCH_1:%.*]] ; CHECK: latch.1: ; CHECK-NEXT: [[GEP1_1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A1]], i64 0, i64 1 ; CHECK-NEXT: [[GEP2_1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A2]], i64 0, i64 1 ; CHECK-NEXT: [[LOAD1_1:%.*]] = load i64, i64* [[GEP1_1]], align 8 ; CHECK-NEXT: [[LOAD2_1:%.*]] = load i64, i64* [[GEP2_1]], align 8 ; CHECK-NEXT: [[EXITCOND2_1:%.*]] = icmp eq i64 [[LOAD1_1]], [[LOAD2_1]] -; CHECK-NEXT: br i1 [[EXITCOND2_1]], label [[LOOP_2]], label [[EXIT]] +; CHECK-NEXT: br i1 [[EXITCOND2_1]], label [[LOOP_2:%.*]], label [[EXIT]] ; CHECK: loop.2: -; CHECK-NEXT: br i1 true, label [[EXIT]], label [[LATCH_2]] +; CHECK-NEXT: br i1 true, label [[EXIT]], label [[LATCH_2:%.*]] ; CHECK: latch.2: ; CHECK-NEXT: br label [[EXIT]] +; CHECK: exit: +; CHECK-NEXT: [[EXIT_VAL:%.*]] = phi i1 [ false, [[LATCH]] ], [ false, [[LATCH_1]] ], [ true, [[LOOP_2]] ], [ false, [[LATCH_2]] ] +; CHECK-NEXT: ret i1 [[EXIT_VAL]] ; start: %a1 = alloca [2 x i64], align 8 diff --git a/llvm/test/Transforms/LoopUnroll/multiple-exits.ll b/llvm/test/Transforms/LoopUnroll/multiple-exits.ll index 0bea86350b99..9f40f51c10e6 100644 --- a/llvm/test/Transforms/LoopUnroll/multiple-exits.ll +++ b/llvm/test/Transforms/LoopUnroll/multiple-exits.ll @@ -14,8 +14,6 @@ define void @test1() { ; CHECK-NEXT: call void @bar() ; CHECK-NEXT: call void @bar() ; CHECK-NEXT: br label [[LATCH_1:%.*]] -; CHECK: exit: -; CHECK-NEXT: ret void ; CHECK: latch.1: </cut>
linaro-toolchain@lists.linaro.org