History log of /llvm-project/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (Results 151 – 175 of 369)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-4.0.0-rc3
# c2af82b4 22-Feb-2017 Michael Kuperstein <mkuper@google.com>

[LoopUnroll] Enable PGO-based loop peeling by default.

This enables peeling of loops with low dynamic iteration count by default,
when profile information is available.

Differential Revision: https

[LoopUnroll] Enable PGO-based loop peeling by default.

This enables peeling of loops with low dynamic iteration count by default,
when profile information is available.

Differential Revision: https://reviews.llvm.org/D27734

llvm-svn: 295796

show more ...


# 7d230325 18-Feb-2017 Dehao Chen <dehao@google.com>

Increases full-unroll threshold.

Summary:
The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold

This change will affect the following speccpu2006

Increases full-unroll threshold.

Summary:
The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold

This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge):

Performance:

403 0.11%
433 0.51%
445 0.48%
447 3.50%
453 1.49%
464 0.75%

Code size:

403 0.56%
433 0.96%
445 2.16%
447 2.96%
453 0.94%
464 8.02%

The compiler time overhead is similar with code size.

Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc

Reviewed By: hfinkel, chandlerc

Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D28368

llvm-svn: 295538

show more ...


Revision tags: llvmorg-4.0.0-rc2
# eab3b90a 26-Jan-2017 Chandler Carruth <chandlerc@gmail.com>

[PM] Simplify the new PM interface to the loop unroller and expose two
factory functions for the two modes the loop unroller is actually used
in in-tree: simplified full-unrolling and the entire thin

[PM] Simplify the new PM interface to the loop unroller and expose two
factory functions for the two modes the loop unroller is actually used
in in-tree: simplified full-unrolling and the entire thing including
partial unrolling.

I've also wired these up to nice names so you can express both of these
being in a pipeline easily. This is a precursor to actually enabling
these parts of the O2 pipeline.

Differential Revision: https://reviews.llvm.org/D28897

llvm-svn: 293136

show more ...


# 5dd55e84 26-Jan-2017 Michael Kuperstein <mkuper@google.com>

[LoopUnroll] Properly update loopinfo for runtime unrolling by 2

Even when we don't create a remainder loop (that is, when we unroll by 2), we
may duplicate nested loops into the remainder. This is

[LoopUnroll] Properly update loopinfo for runtime unrolling by 2

Even when we don't create a remainder loop (that is, when we unroll by 2), we
may duplicate nested loops into the remainder. This is complicated by the fact
the remainder may itself be either inserted into an outer loop, or at the top
level. In the latter case, we may need to create new top-level loops.

Differential Revision: https://reviews.llvm.org/D29156

llvm-svn: 293124

show more ...


# ce40fa13 25-Jan-2017 Chandler Carruth <chandlerc@gmail.com>

[PM] Teach LoopUnroll to update the LPM infrastructure as it unrolls
loops.

We do this by reconstructing the newly added loops after the unroll
completes to avoid threading pass manager details thro

[PM] Teach LoopUnroll to update the LPM infrastructure as it unrolls
loops.

We do this by reconstructing the newly added loops after the unroll
completes to avoid threading pass manager details through all the mess
of the unrolling infrastructure.

I've enabled some extra assertions in the LPM to try and catch issues
here and enabled a bunch of unroller tests to try and make sure this is
sane.

Currently, I'm manually running loop-simplify when needed. That should
go away once it is folded into the LPM infrastructure.

Differential Revision: https://reviews.llvm.org/D28848

llvm-svn: 293011

show more ...


Revision tags: llvmorg-4.0.0-rc1
# c3f87f02 17-Jan-2017 Dehao Chen <dehao@google.com>

Introduce -unroll-partial-threshold to separate PartialThreshold from Threshold in loop unorller.

Summary: Partial unrolling should have separate threshold with full unrolling.

Reviewers: efriedma,

Introduce -unroll-partial-threshold to separate PartialThreshold from Threshold in loop unorller.

Summary: Partial unrolling should have separate threshold with full unrolling.

Reviewers: efriedma, mzolotukhin

Reviewed By: efriedma, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28831

llvm-svn: 292293

show more ...


# ca68a3ec 15-Jan-2017 Chandler Carruth <chandlerc@gmail.com>

[PM] Introduce an analysis set used to preserve all analyses over
a function's CFG when that CFG is unchanged.

This allows transformation passes to simply claim they preserve the CFG
and analysis pa

[PM] Introduce an analysis set used to preserve all analyses over
a function's CFG when that CFG is unchanged.

This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.

I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.

Differential Revision: https://reviews.llvm.org/D28627

llvm-svn: 292054

show more ...


# 3bab7e1a 11-Jan-2017 Chandler Carruth <chandlerc@gmail.com>

[PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the

[PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.

Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.

We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.

This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.

I haven't split the unittest though because testing one component
without the other seems nearly intractable.

Differential Revision: https://reviews.llvm.org/D28452

llvm-svn: 291662

show more ...


# 410eaeb0 11-Jan-2017 Chandler Carruth <chandlerc@gmail.com>

[PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
sup

[PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

Differential Revision: https://reviews.llvm.org/D28292

llvm-svn: 291651

show more ...


# cc76344e 30-Dec-2016 Dehao Chen <dehao@google.com>

Use continuous boosting factor for complete unroll.

Summary:
The current loop complete unroll algorithm checks if unrolling complete will reduce the runtime by a certain percentage. If yes, it will

Use continuous boosting factor for complete unroll.

Summary:
The current loop complete unroll algorithm checks if unrolling complete will reduce the runtime by a certain percentage. If yes, it will apply a fixed boosting factor to the threshold (by discounting cost). The problem for this approach is that the threshold abruptly. This patch makes the boosting factor a function of runtime reduction percentage, capped by a fixed threshold. In this way, the threshold changes continuously.

The patch also simplified the code by reducing one parameter in UP.

The patch only affects code-gen of two speccpu2006 benchmark:

445.gobmk binary size decreases 0.08%, no performance change.
464.h264ref binary size increases 0.24%, no performance change.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26989

llvm-svn: 290737

show more ...


# aec2fa35 19-Dec-2016 Daniel Jasper <djasper@google.com>

Revert @llvm.assume with operator bundles (r289755-r289757)

This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086


# 3ca4a6bc 15-Dec-2016 Hal Finkel <hfinkel@anl.gov>

Remove the AssumptionCache

After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is mo

Remove the AssumptionCache

After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756

show more ...


Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3
# c3be2258 02-Dec-2016 Dehao Chen <dehao@google.com>

Change LoopUnrollPass cost from int to unsigned to make it consistent. (NFC)

llvm-svn: 288463


Revision tags: llvmorg-3.9.1-rc2
# b151a641 30-Nov-2016 Michael Kuperstein <mkuper@google.com>

[LoopUnroll] Implement profile-based loop peeling

This implements PGO-driven loop peeling.

The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we

[LoopUnroll] Implement profile-based loop peeling

This implements PGO-driven loop peeling.

The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we can expect a performance win by peeling off the
first several iterations of that loop.
Unlike unrolling based on a known trip count, or a trip count multiple, this
doesn't save us the conditional check and branch on each iteration. However,
it does allow us to simplify the straight-line code we get (constant-folding,
etc.). This is important given that we know that we will usually only hit this
code, and not the actual loop.

This is currently disabled by default.

Differential Revision: https://reviews.llvm.org/D25963

llvm-svn: 288274

show more ...


Revision tags: llvmorg-3.9.1-rc1
# 731b04ca 23-Nov-2016 Haicheng Wu <haicheng@codeaurora.org>

[LoopUnroll] Move code to exit early. NFC.

Just to save some compilation time.

Differential Revision: https://reviews.llvm.org/D26784

llvm-svn: 287800


# 41d72a86 17-Nov-2016 Dehao Chen <dehao@google.com>

Use profile info to adjust loop unroll threshold.

Summary:
For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold.
For hot loop, w

Use profile info to adjust loop unroll threshold.

Summary:
For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold.
For hot loop, we set a higher unroll threshold and allows expensive tripcount computation to allow more aggressive unrolling.

Reviewers: davidxl, mzolotukhin

Subscribers: sanjoy, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D26527

llvm-svn: 287186

show more ...


# c2698cd9 09-Nov-2016 Evgeny Stupachenko <evstupac@gmail.com>

Minor unroll pass refacoring.

Summary:
Unrolled Loop Size calculations moved to a function.
Constant representing number of optimized instructions
when "back edge" becomes "fall through" replaced w

Minor unroll pass refacoring.

Summary:
Unrolled Loop Size calculations moved to a function.
Constant representing number of optimized instructions
when "back edge" becomes "fall through" replaced with
variable.
Some comments added.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D21719

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 286389

show more ...


# 430b3e48 27-Oct-2016 Haicheng Wu <haicheng@codeaurora.org>

[LoopUnroll] Check partial unrolling is enabled before initialization. NFC.

Differential Revision: https://reviews.llvm.org/D23891

llvm-svn: 285330


# cffedc4a 25-Oct-2016 Michael Kuperstein <mkuper@google.com>

Fix 80-char violations. NFC.

llvm-svn: 285092


# 84b21835 21-Oct-2016 John Brawn <john.brawn@arm.com>

[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops

When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number

[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops

When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682

llvm-svn: 284818

show more ...


# 1ef17e90 12-Oct-2016 Haicheng Wu <haicheng@codeaurora.org>

Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"

Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This p

Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"

Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
if (a[i] == value) {
found = true;
break;
}
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

llvm-svn: 284053

show more ...


# 45e4ef73 12-Oct-2016 Haicheng Wu <haicheng@codeaurora.org>

Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"

This reverts commit r284044.

llvm-svn: 284051


# 6cac34fd 12-Oct-2016 Haicheng Wu <haicheng@codeaurora.org>

[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
if (a[i] ==

[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
if (a[i] == value) {
found = true;
break;
}
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

Differential Revision: https://reviews.llvm.org/D24790

llvm-svn: 284044

show more ...


# 977853b7 30-Sep-2016 Dehao Chen <dehao@google.com>

Update loop unroller cost model to make sure debug info does not affect optimization decisions.

Summary: Debug info should *not* affect optimization decisions. This patch updates loop unroller cost

Update loop unroller cost model to make sure debug info does not affect optimization decisions.

Summary: Debug info should *not* affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info.

Reviewers: davidxl, mzolotukhin

Subscribers: haicheng, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D25098

llvm-svn: 282894

show more ...


# f57cc62a 30-Sep-2016 Adam Nemet <anemet@apple.com>

[LoopUnroll] Port to the new streaming interface for opt remarks.

llvm-svn: 282834


12345678910>>...15