History log of /llvm-project/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (Results 176 – 200 of 369)
Revision Date Author Comments
# 58c5a7f5 28-Sep-2016 Jonas Paulsson <paulsson@linux.vnet.ibm.com>

[SystemZ] Implementation of getUnrollingPreferences().

This commit enables more unrolling for SystemZ by implementing the
SystemZTargetTransformInfo::getUnrollingPreferences() method.

It has been found that it is better to only unroll moderately, so the
DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order
to set this to a lower value for SystemZ (4).

Reviewers: Evgeny Stupachenko, Ulrich Weigand.
https://reviews.llvm.org/D24451

llvm-svn: 282570
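
The per-target override described above can be pictured with a minimal sketch. All type and function names here are hypothetical stand-ins rather than LLVM's real classes, and the generic default of 8 is an assumption; only the SystemZ value of 4 comes from the commit message.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for LLVM's UnrollingPreferences.
struct UnrollPrefsSketch {
  // Assumed generic default; the log does not state the real value.
  unsigned DefaultUnrollRuntimeCount = 8;
};

// Hypothetical base target: leaves the preferences untouched.
struct GenericTargetSketch {
  virtual ~GenericTargetSketch() = default;
  virtual void getUnrollingPreferences(UnrollPrefsSketch &) const {}
};

// Hypothetical SystemZ-like target: unrolls only moderately.
struct SystemZLikeTargetSketch : GenericTargetSketch {
  void getUnrollingPreferences(UnrollPrefsSketch &UP) const override {
    UP.DefaultUnrollRuntimeCount = 4; // value named in the commit message
  }
};

// Start from the defaults, then let the target adjust them.
UnrollPrefsSketch gatherUnrollPrefs(const GenericTargetSketch &Target) {
  UnrollPrefsSketch UP;
  Target.getUnrollingPreferences(UP);
  assert(UP.DefaultUnrollRuntimeCount > 0 && "count must stay positive");
  return UP;
}
```

Keeping the count inside the preferences structure, rather than in a file-local constant, is what lets a single target lower it without affecting the others.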


# 109f4f35 07-Sep-2016 Haicheng Wu <haicheng@codeaurora.org>

[LoopUnroll] Correct a debug message. NFC.

Differential Revision: https://reviews.llvm.org/D24299

llvm-svn: 280865


# 4f155b6e 26-Aug-2016 Adam Nemet <anemet@apple.com>

[LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass

We can't mark ORE (a function pass) preserved as required by the loop
passes because that is how we ensure that the required passes like
LazyBFI are all available any time ORE is used. See the new comments in
the patch.

Instead we use it directly just like the inliner does in D22694.

As expected there is some additional overhead after removing the caching
provided by analysis passes. The worst case I measured was
LNT/CINT2006_ref/401.bzip2 which regresses by 12%. As before, this only
affects -Rpass-with-hotness and not default compilation.

llvm-svn: 279829


Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3
# bd63d436 23-Aug-2016 Michael Zolotukhin <mzolotukhin@apple.com>

[LoopUnroll] By default disable unrolling when optimizing for size.

Summary:
In clang commit r268509 we started to invoke loop-unroll pass from the
driver even under -Os. However, we happen to not initialize optsize
thresholds properly, which is fixed with this change.

r268509 led to some big compile time regressions, because we started to
unroll some loops that we didn't unroll before. With this change I hope
to recover most of the regressions. We still are slightly slower than
before, because we do some checks here and there in loop-unrolling
before we bail out, but at least the slowdown is not that huge now.

Reviewers: hfinkel, chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23388

llvm-svn: 279585


Revision tags: llvmorg-3.9.0-rc2
# e7877632 17-Aug-2016 Haicheng Wu <haicheng@codeaurora.org>

[LoopUnroll] Move a simple check earlier. NFC.

Move the check of CallInst earlier to skip expensive recursive operations.

Differential Revision: https://reviews.llvm.org/D23611

llvm-svn: 278998


# 0746f3bf 09-Aug-2016 Sean Silva <chisophugis@gmail.com>

Consistently use LoopAnalysisManager

One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).

Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.

Besides a general consistency benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278079


Revision tags: llvmorg-3.9.0-rc1
# 12937c36 29-Jul-2016 Adam Nemet <anemet@apple.com>

[LoopUnroll] Include hotness of region in opt remark

LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.

The BFI and, indirectly, BPI used in this pass are computed lazily, so no
overhead should be observed unless -pass-remarks-with-hotness is used.

This is how the patch affects the O3 pipeline:

Dominator Tree Construction
Natural Loop Information
Canonicalize natural loops
Loop-Closed SSA Form Pass
Basic Alias Analysis (stateless AA impl)
Function Alias Analysis Results
Scalar Evolution Analysis
+ Lazy Branch Probability Analysis
+ Lazy Block Frequency Analysis
+ Optimization Remark Emitter
Loop Pass Manager
Rotate Loops
Loop Invariant Code Motion
Unswitch loops
Simplify the CFG
Dominator Tree Construction
Basic Alias Analysis (stateless AA impl)
Function Alias Analysis Results
Combine redundant instructions
Natural Loop Information
Canonicalize natural loops
Loop-Closed SSA Form Pass
Scalar Evolution Analysis
+ Lazy Branch Probability Analysis
+ Lazy Block Frequency Analysis
+ Optimization Remark Emitter
Loop Pass Manager
Induction Variable Simplification
Recognize loop idioms
Delete dead loops
Unroll loops
...

llvm-svn: 277203


# e3c18a5a 19-Jul-2016 Sean Silva <chisophugis@gmail.com>

[PM] Port LoopUnroll.

We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063


# 4a697c31 15-Jun-2016 David Majnemer <david.majnemer@gmail.com>

[LoopUnroll] Don't crash trying to unroll loop with EH pad exit

We do not support splitting cleanuppad or catchswitches. This is
problematic for passes which assume that a loop is in loop simplify
form (the loop would have a dedicated exit block instead of sharing it).

While it isn't great that we don't support this for cleanups, we still
cannot make loop-simplify form an assertable precondition because
indirectbr will also disable these sorts of CFG cleanups.

This fixes PR28132.

llvm-svn: 272739


# 3e2f389a 08-Jun-2016 Evgeny Stupachenko <evstupac@gmail.com>

The patch sets the unroll disable pragma when unrolling
with a user-specified count has been applied.

Summary:
Previously SetLoopAlreadyUnrolled() set the disable pragma only if
there was some loop metadata.
Now it sets the pragma in all cases. This helps to prevent repeated
unrolling when -unroll-count=N is given.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D20765

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 272195
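
A minimal sketch of why marking every unrolled loop matters, assuming a hypothetical `LoopSketch` type that models loop metadata as plain strings. Only the metadata name `llvm.loop.unroll.disable` is a real LLVM name; the rest is illustrative and is not the actual SetLoopAlreadyUnrolled() code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a loop: metadata strings plus a body-copy counter.
struct LoopSketch {
  std::vector<std::string> Metadata; // may be empty, as in the fixed case
  unsigned BodyCopies = 1;
};

bool hasUnrollDisable(const LoopSketch &L) {
  for (const std::string &MD : L.Metadata)
    if (MD == "llvm.loop.unroll.disable")
      return true;
  return false;
}

// Marks the loop in all cases, even when it had no metadata before.
void setLoopAlreadyUnrolledSketch(LoopSketch &L) {
  if (!hasUnrollDisable(L))
    L.Metadata.push_back("llvm.loop.unroll.disable");
}

// Models one run of the pass with a user-specified count.
bool unrollByUserCount(LoopSketch &L, unsigned Count) {
  assert(Count > 1 && "a count of 0 or 1 would not unroll anything");
  if (hasUnrollDisable(L))
    return false; // a later run of the pass leaves the loop alone
  L.BodyCopies *= Count;
  setLoopAlreadyUnrolledSketch(L);
  return true;
}
```

With a count of 4, a second call on the same loop returns false, so the body is copied 4 times rather than 16.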


Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1
# 58564989 03-Jun-2016 Michael Zolotukhin <mzolotukhin@apple.com>

[LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic.

In r270478, where I enabled the new heuristic, I posted testing results
that I got by explicitly passing the threshold values via CL options.
However, setting the init-values of the CL options is not enough to change
the default values of the thresholds, so I'm changing them in another place now.

llvm-svn: 271615


# b787522d 28-May-2016 Evgeny Stupachenko <evstupac@gmail.com>

The patch fixes r271071
Summary:
unused variables in Release mode:
BasicBlock *Header
unsigned OrigCount
put under DEBUG

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 271076


# ea2aef4a 27-May-2016 Evgeny Stupachenko <evstupac@gmail.com>

The patch refactors unroll pass.
Summary:
Unroll factor (Count) calculations moved to a new function.
Early exits on pragma and "-unroll-count" defined factor added.
New type of unrolling "Force" introduced (previously used implicitly).
New unroll preference "AllowRemainder" introduced and set "true" by default.
(should be set to false for architectures that suffer from it).

Reviewers: hfinkel, mzolotukhin, zzheng

Differential Revision: http://reviews.llvm.org/D19553

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 271071


# 82de7d32 27-May-2016 Benjamin Kramer <benny.kra@googlemail.com>

Apply clang-tidy's misc-move-constructor-init throughout LLVM.

No functionality change intended, maybe a tiny performance improvement.

llvm-svn: 270997


# 1ecdedad 26-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

[LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.

Condition might be simplified to a Constant, but it doesn't have to be
ConstantInt, so we should dyn_cast, instead of cast.

This fixes PR27886.

llvm-svn: 270924
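
The distinction behind this fix can be sketched in self-contained C++. The class names are hypothetical, and standard `dynamic_cast` stands in for LLVM's `dyn_cast`, which uses its own RTTI scheme. A checked downcast returns null on a type mismatch, whereas an unconditional cast assumes the type and asserts when the assumption is wrong.

```cpp
#include <cassert>

// Hypothetical miniature constant hierarchy, not LLVM's real classes.
struct ConstantSketch {
  virtual ~ConstantSketch() = default;
};
struct ConstantIntSketch : ConstantSketch {
  long long Value;
  explicit ConstantIntSketch(long long V) : Value(V) {}
};
// Some other kind of constant, e.g. an expression that did not fold further.
struct ConstantExprSketch : ConstantSketch {};

enum class BranchOutcome { Unknown, AlwaysTaken, NeverTaken };

// Classifies a simplified branch condition without assuming it is an integer.
BranchOutcome classifyCondition(const ConstantSketch *Cond) {
  assert(Cond && "expected a simplified condition");
  // Checked downcast: yields nullptr instead of crashing on other constants.
  if (const auto *CI = dynamic_cast<const ConstantIntSketch *>(Cond))
    return CI->Value != 0 ? BranchOutcome::AlwaysTaken
                          : BranchOutcome::NeverTaken;
  return BranchOutcome::Unknown; // conservatively treat both successors as live
}
```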


# 8f7a242c 24-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time.

This reverts commit r270577.

llvm-svn: 270630


# b64e4390 24-May-2016 Hans Wennborg <hans@hanshq.net>

Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling analysis by default."

Chromium builds are still hitting the assert in PR27874.

llvm-svn: 270577


# 96c150d1 24-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default.""

This reverts commit r270512 and reapplies r270478. Originally it caused
PR27847, but it was fixed in r270517.

llvm-svn: 270518


# 6951028b 23-May-2016 Hans Wennborg <hans@hanshq.net>

Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."

This caused PR27847.

llvm-svn: 270512


# be080fc5 23-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

[LoopUnroll] Enable advanced unrolling analysis by default.

Summary:
This patch turns on LoopUnrollAnalyzer by default. To mitigate compile
time regressions, I chose very conservative thresholds for now. Later we
can make them more aggressive, but it might require being smarter in
which loops we're optimizing. E.g. currently the biggest issue is that
with more aggressive thresholds we unroll many cold loops, which
increases compile time for no performance benefit (performance of those
loops is improved, but it doesn't matter since they are cold).

Test results for compile time (using 4 samples to reduce noise):
```
MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19%
SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect 4.19%
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow 3.39%
MultiSource/Applications/JM/lencod/lencod 1.47%
MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06%
```

I didn't see any performance changes in the testsuite, but it improves
some internal tests.

Reviewers: hfinkel, chandlerc

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20482

llvm-svn: 270478


# d2268a73 18-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

[LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands.

Previously, we didn't add their cost or the cost of their operands, which
could've resulted in unrolling loops for no actual benefit.

llvm-svn: 269985


# 963a6d9c 13-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...""

This reverts commit r269395.

Try to reapply with a fix from chapuni.

llvm-svn: 269486


# 9be3b8b9 13-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."

This reverts commit r269388.

It caused some bots to fail, I'm reverting it until I investigate the
issue.

llvm-svn: 269395


# b7b80529 13-May-2016 Michael Zolotukhin <mzolotukhin@apple.com>

[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...

Summary:
...loop after the last iteration.

This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.

This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
keep this from becoming an excessively slow analysis.

The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up; I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look like.

Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrences around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc, would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.

Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when, had we continued, the cost would have gone back
below the threshold. These kinds of bugs can cause incredibly
hard-to-track-down random changes to behavior.

We could use a similar (but much simpler) technique within the inliner
as well to avoid considering speculated code in the inline cost.

Reviewers: chandlerc

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11758

llvm-svn: 269388
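
The cost accounting described in this commit can be pictured with a much-reduced sketch. It handles a single iteration only, with none of the cross-iteration PHI modelling, and every name in it is hypothetical. Each instruction carries two bits of state, one for whether it simplified away for free and one for whether it has been counted, and cost is added only when a sink makes an instruction live.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction record for one unrolled iteration.
struct InstSketch {
  unsigned Cost;                  // cost if the instruction must be emitted
  bool IsFree;                    // bit 1: simplified away by forward analysis
  std::vector<unsigned> Operands; // indices of the instructions it uses
};

class CostTrackerSketch {
  const std::vector<InstSketch> &Insts;
  std::vector<bool> IsCounted; // bit 2: already counted as live
  unsigned UnrolledCost = 0;   // only ever grows: a conservatively low estimate

public:
  explicit CostTrackerSketch(const std::vector<InstSketch> &I)
      : Insts(I), IsCounted(I.size(), false) {}

  // Called at each sink (side effect or value used after the loop).
  void markLive(unsigned Root) {
    std::vector<unsigned> Worklist;
    Worklist.push_back(Root);
    while (!Worklist.empty()) {
      unsigned Idx = Worklist.back();
      Worklist.pop_back();
      assert(Idx < Insts.size() && "operand index out of range");
      if (IsCounted[Idx])
        continue; // each instruction is counted at most once
      IsCounted[Idx] = true;
      if (Insts[Idx].IsFree)
        continue; // folded away: costs nothing and keeps no operands alive
      UnrolledCost += Insts[Idx].Cost;
      for (unsigned Op : Insts[Idx].Operands)
        Worklist.push_back(Op);
    }
  }

  unsigned cost() const { return UnrolledCost; }
};
```

Because `UnrolledCost` never decreases, a caller that stops as soon as it exceeds a threshold cannot be wrong about the final total being above that threshold, which is the monotonicity property the message emphasizes.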


# 719b26ba 10-May-2016 Hans Wennborg <hans@hanshq.net>

Loop unroller: set thresholds for optsize and minsize functions to zero

Before r268509, Clang would disable the loop unroll pass when optimizing
for size. That commit enabled it to be able to support unroll pragmas
in -Os builds. However, this regressed binary size in one of Chromium's
DLLs by ~100 KB.

This restores the original behaviour of no unrolling at -Os, but doing it
in LLVM instead of Clang makes more sense, and also allows the pragmas to
keep working.

Differential revision: http://reviews.llvm.org/D20115

llvm-svn: 269124
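
A minimal sketch of the behaviour this commit describes, with hypothetical names and an assumed default threshold of 150 (the log does not state the real value): size-optimized functions get a zero threshold, so heuristic unrolling never fires, while an explicit pragma bypasses the threshold and keeps working.

```cpp
#include <cassert>

// Hypothetical, reduced threshold record.
struct UnrollThresholdsSketch {
  unsigned Threshold;
};

UnrollThresholdsSketch pickThresholds(bool OptForSize) {
  const unsigned AssumedDefaultThreshold = 150; // assumption for illustration
  return {OptForSize ? 0u : AssumedDefaultThreshold};
}

// Decides whether to unroll a loop of LoopSize instructions Count times.
bool shouldUnrollSketch(unsigned LoopSize, unsigned Count,
                        bool HasUnrollPragma, bool OptForSize) {
  assert(LoopSize > 0 && Count > 0 && "sizes and counts must be positive");
  if (HasUnrollPragma)
    return true; // user request: not subject to the size threshold
  return LoopSize * Count <= pickThresholds(OptForSize).Threshold;
}
```

Doing the check inside the pass, instead of not scheduling the pass at all, is what allows the pragma path to stay reachable in -Os builds.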

