LoopUnrollPass.cpp - OpenGrok history log for /llvm-project/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-4.0.0-rc3
# c2af82b4	22-Feb-2017	Michael Kuperstein <mkuper@google.com>	[LoopUnroll] Enable PGO-based loop peeling by default. This enables peeling of loops with low dynamic iteration count by default, when profile information is available. Differential Revision: https://reviews.llvm.org/D27734 llvm-svn: 295796
# 7d230325	18-Feb-2017	Dehao Chen <dehao@google.com>	Increases full-unroll threshold. Summary: The default threshold for full unrolling is too conservative. This patch doubles the full-unroll threshold. This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge): Performance: 403 0.11%, 433 0.51%, 445 0.48%, 447 3.50%, 453 1.49%, 464 0.75%. Code size: 403 0.56%, 433 0.96%, 445 2.16%, 447 2.96%, 453 0.94%, 464 8.02%. The compile-time overhead is similar to the code-size increase. Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc Reviewed By: hfinkel, chandlerc Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D28368 llvm-svn: 295538
Revision tags: llvmorg-4.0.0-rc2
# eab3b90a	26-Jan-2017	Chandler Carruth <chandlerc@gmail.com>	[PM] Simplify the new PM interface to the loop unroller and expose two factory functions for the two modes the loop unroller is actually used in in-tree: simplified full-unrolling and the entire thing including partial unrolling. I've also wired these up to nice names so you can express both of these being in a pipeline easily. This is a precursor to actually enabling these parts of the O2 pipeline. Differential Revision: https://reviews.llvm.org/D28897 llvm-svn: 293136
# 5dd55e84	26-Jan-2017	Michael Kuperstein <mkuper@google.com>	[LoopUnroll] Properly update loopinfo for runtime unrolling by 2 Even when we don't create a remainder loop (that is, when we unroll by 2), we may duplicate nested loops into the remainder. This is complicated by the fact the remainder may itself be either inserted into an outer loop, or at the top level. In the latter case, we may need to create new top-level loops. Differential Revision: https://reviews.llvm.org/D29156 llvm-svn: 293124
# ce40fa13	25-Jan-2017	Chandler Carruth <chandlerc@gmail.com>	[PM] Teach LoopUnroll to update the LPM infrastructure as it unrolls loops. We do this by reconstructing the newly added loops after the unroll completes to avoid threading pass manager details through all the mess of the unrolling infrastructure. I've enabled some extra assertions in the LPM to try and catch issues here and enabled a bunch of unroller tests to try and make sure this is sane. Currently, I'm manually running loop-simplify when needed. That should go away once it is folded into the LPM infrastructure. Differential Revision: https://reviews.llvm.org/D28848 llvm-svn: 293011
Revision tags: llvmorg-4.0.0-rc1
# c3f87f02	17-Jan-2017	Dehao Chen <dehao@google.com>	Introduce -unroll-partial-threshold to separate PartialThreshold from Threshold in loop unroller. Summary: Partial unrolling should have a separate threshold from full unrolling. Reviewers: efriedma, mzolotukhin Reviewed By: efriedma, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28831 llvm-svn: 292293
# ca68a3ec	15-Jan-2017	Chandler Carruth <chandlerc@gmail.com>	[PM] Introduce an analysis set used to preserve all analyses over a function's CFG when that CFG is unchanged. This allows transformation passes to simply claim they preserve the CFG and analysis passes to check for the CFG being preserved to remove the fanout of all analyses being listed in all passes. I've gone through and removed or cleaned up as many of the comments reminding us to do this as I could. Differential Revision: https://reviews.llvm.org/D28627 llvm-svn: 292054
# 3bab7e1a	11-Jan-2017	Chandler Carruth <chandlerc@gmail.com>	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move the latter to the Transforms library. While the loop PM uses an analysis to form the IR units, the current plan is to have the PM itself establish and enforce both loop simplified form and LCSSA. This would be a layering violation in the analysis library. Fundamentally, the idea behind the loop PM is to transform loops in addition to running passes over them, so it really seemed like the most natural place to sink this was into the transforms library. We can't just move everything because we also have loop analyses that rely on a subset of the invariants. So this patch splits the loop infrastructure into the analysis management that has to be part of the analysis library, and the transform-aware pass manager. This also required splitting the loop analyses' printer passes out to the transforms library, which makes sense to me as running these will transform the code into LCSSA in theory. I haven't split the unittest though because testing one component without the other seems nearly intractable. Differential Revision: https://reviews.llvm.org/D28452 llvm-svn: 291662
# 410eaeb0	11-Jan-2017	Chandler Carruth <chandlerc@gmail.com>	[PM] Rewrite the loop pass manager to use a worklist and augmented run arguments much like the CGSCC pass manager. This is a major redesign following the pattern established for the CGSCC layer to support updates to the set of loops during the traversal of the loop nest and to support invalidation of analyses. An additional significant burden in the loop PM is that so many passes require access to a large number of function analyses. Manually ensuring these are cached, available, and preserved has been a long-standing burden in LLVM even with the help of the automatic scheduling in the old pass manager. And it made the new pass manager extremely unwieldy. With this design, we can package the common analyses up while in a function pass and make them immediately available to all the loop passes. While in some cases this is unnecessary, I think the simplicity afforded is worth it. This does not (yet) address loop simplified form or LCSSA form, but those are the next things on my radar and I have a clear plan for them. While the patch is very large, most of it is either mechanically updating loop passes to the new API or the new testing for the loop PM. The code for it is reasonably compact. I have not yet updated all of the loop passes to correctly leverage the update mechanisms demonstrated in the unittests. I'll do that in follow-up patches along with improved FileCheck tests for those passes that ensure things work in more realistic scenarios. In many cases, there isn't much we can do with these until the loop simplified form and LCSSA form are in place. Differential Revision: https://reviews.llvm.org/D28292 llvm-svn: 291651
# cc76344e	30-Dec-2016	Dehao Chen <dehao@google.com>	Use continuous boosting factor for complete unroll. Summary: The current loop complete unroll algorithm checks if unrolling completely will reduce the runtime by a certain percentage. If yes, it will apply a fixed boosting factor to the threshold (by discounting cost). The problem with this approach is that the threshold changes abruptly. This patch makes the boosting factor a function of runtime reduction percentage, capped by a fixed threshold. In this way, the threshold changes continuously. The patch also simplified the code by reducing one parameter in UP. The patch only affects code-gen of two speccpu2006 benchmarks: 445.gobmk binary size decreases 0.08%, no performance change. 464.h264ref binary size increases 0.24%, no performance change. Reviewers: mzolotukhin, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26989 llvm-svn: 290737
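The idea of a continuous, capped boosting factor can be sketched roughly as follows. This is an illustration only, not the code from the patch: the function names, the use of a rolled-to-unrolled cost ratio as the measure of runtime reduction, and the lower clamp at 100% are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper (not LLVM's API): returns a boosting factor in percent.
// Assumption: the expected runtime reduction is modelled as the ratio of the
// rolled loop's dynamic cost to the fully unrolled cost.
unsigned boostingFactorPercent(std::uint64_t rolledCost,
                               std::uint64_t unrolledCost,
                               unsigned maxBoostPercent) {
  // An unrolled cost of zero means everything folded away: use the cap.
  if (unrolledCost == 0)
    return maxBoostPercent;
  // Guard against overflow in 100 * rolledCost.
  if (rolledCost > std::numeric_limits<std::uint64_t>::max() / 100)
    return maxBoostPercent;
  std::uint64_t ratio = 100 * rolledCost / unrolledCost;
  // The factor grows continuously with the ratio, is capped from above by
  // maxBoostPercent, and never drops below 100% (no boost).
  std::uint64_t capped = std::min<std::uint64_t>(ratio, maxBoostPercent);
  return static_cast<unsigned>(std::max<std::uint64_t>(capped, 100));
}

// Hypothetical helper: applies the percentage factor to a base threshold.
unsigned boostedThreshold(unsigned threshold, unsigned boostPercent) {
  return static_cast<unsigned>(
      static_cast<std::uint64_t>(threshold) * boostPercent / 100);
}
```

With a fixed factor the effective threshold jumps as soon as the reduction crosses a cutoff; in this sketch it rises smoothly until it reaches the cap.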
# aec2fa35	19-Dec-2016	Daniel Jasper <djasper@google.com>	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
# 3ca4a6bc	15-Dec-2016	Hal Finkel <hfinkel@anl.gov>	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3
# c3be2258	02-Dec-2016	Dehao Chen <dehao@google.com>	Change LoopUnrollPass cost from int to unsigned to make it consistent. (NFC) llvm-svn: 288463
Revision tags: llvmorg-3.9.1-rc2
# b151a641	30-Nov-2016	Michael Kuperstein <mkuper@google.com>	[LoopUnroll] Implement profile-based loop peeling This implements PGO-driven loop peeling. The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we can expect a performance win by peeling off the first several iterations of that loop. Unlike unrolling based on a known trip count, or a trip count multiple, this doesn't save us the conditional check and branch on each iteration. However, it does allow us to simplify the straight-line code we get (constant-folding, etc.). This is important given that we know that we will usually only hit this code, and not the actual loop. This is currently disabled by default. Differential Revision: https://reviews.llvm.org/D25963 llvm-svn: 288274
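As a source-level illustration of what peeling means, the sketch below peels the first two iterations of a simple summation loop by hand. The function names and the peel count of 2 are assumptions chosen for the example; the actual pass works on LLVM IR and derives the peel count from profile data.

```cpp
#include <cstddef>
#include <vector>

// Original (rolled) loop.
int sumRolled(const std::vector<int> &v) {
  int sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    sum += v[i];
  return sum;
}

// Hand-peeled equivalent with a hypothetical peel count of 2.
// Each peeled iteration keeps its own exit check, so no branch is saved,
// but the peeled bodies are straight-line code that later passes can
// simplify. If the trip count is usually <= 2, the loop is rarely entered.
int sumPeeled2(const std::vector<int> &v) {
  const std::size_t n = v.size();
  int sum = 0;
  std::size_t i = 0;
  if (i < n) {            // peeled iteration 0
    sum += v[i];
    ++i;
    if (i < n) {          // peeled iteration 1
      sum += v[i];
      ++i;
      for (; i < n; ++i)  // remaining iterations stay in the loop
        sum += v[i];
    }
  }
  return sum;
}
```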
Revision tags: llvmorg-3.9.1-rc1
# 731b04ca	23-Nov-2016	Haicheng Wu <haicheng@codeaurora.org>	[LoopUnroll] Move code to exit early. NFC. Just to save some compilation time. Differential Revision: https://reviews.llvm.org/D26784 llvm-svn: 287800
# 41d72a86	17-Nov-2016	Dehao Chen <dehao@google.com>	Use profile info to adjust loop unroll threshold. Summary: For a flat loop, even if it is hot, it is not a good idea to unroll at runtime, so we set a lower partial unroll threshold. For a hot loop, we set a higher unroll threshold and allow expensive trip count computation to enable more aggressive unrolling. Reviewers: davidxl, mzolotukhin Subscribers: sanjoy, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D26527 llvm-svn: 287186
# c2698cd9	09-Nov-2016	Evgeny Stupachenko <evstupac@gmail.com>	Minor unroll pass refactoring. Summary: Unrolled Loop Size calculations moved to a function. Constant representing number of optimized instructions when "back edge" becomes "fall through" replaced with variable. Some comments added. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D21719 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 286389
# 430b3e48	27-Oct-2016	Haicheng Wu <haicheng@codeaurora.org>	[LoopUnroll] Check partial unrolling is enabled before initialization. NFC. Differential Revision: https://reviews.llvm.org/D23891 llvm-svn: 285330
# cffedc4a	25-Oct-2016	Michael Kuperstein <mkuper@google.com>	Fix 80-char violations. NFC. llvm-svn: 285092
# 84b21835	21-Oct-2016	John Brawn <john.brawn@arm.com>	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818
# 1ef17e90	12-Oct-2016	Haicheng Wu <haicheng@codeaurora.org>	Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop" Reapply r284044 after revert in r284051. Krzysztof fixed the error in r284049. The original summary: This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. llvm-svn: 284053
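To make the effect on the loop quoted in the commit message concrete, the sketch below shows it next to a hand-written, source-level equivalent of full unrolling to the upper bound of 8. The function names and the use of std::array are assumptions made for the example; the real transformation happens on LLVM IR.

```cpp
#include <array>

// Rolled form: the early break means the exact trip count is unknown,
// but it can never exceed 8.
bool containsRolled(const std::array<int, 8> &a, int value) {
  bool found = false;
  for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
      found = true;
      break;
    }
  }
  return found;
}

// Fully unrolled to the upper bound: the induction variable and the
// i < 8 test disappear, and each copy keeps only the early-exit check.
bool containsUnrolled(const std::array<int, 8> &a, int value) {
  if (a[0] == value) return true;
  if (a[1] == value) return true;
  if (a[2] == value) return true;
  if (a[3] == value) return true;
  if (a[4] == value) return true;
  if (a[5] == value) return true;
  if (a[6] == value) return true;
  if (a[7] == value) return true;
  return false;
}
```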
# 45e4ef73	12-Oct-2016	Haicheng Wu <haicheng@codeaurora.org>	Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop" This reverts commit r284044. llvm-svn: 284051
# 6cac34fd	12-Oct-2016	Haicheng Wu <haicheng@codeaurora.org>	[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. Differential Revision: https://reviews.llvm.org/D24790 llvm-svn: 284044
# 977853b7	30-Sep-2016	Dehao Chen <dehao@google.com>	Update loop unroller cost model to make sure debug info does not affect optimization decisions. Summary: Debug info should not affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info. Reviewers: davidxl, mzolotukhin Subscribers: haicheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25098 llvm-svn: 282894
# f57cc62a	30-Sep-2016	Adam Nemet <anemet@apple.com>	[LoopUnroll] Port to the new streaming interface for opt remarks. llvm-svn: 282834