History log of /llvm-project/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (Results 126 – 150 of 369)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-5.0.2-rc1
# fc97b617 15-Mar-2018 Florian Hahn <florian.hahn@arm.com>

[LoopUnroll] Peel off iterations if it makes conditions true/false.

If the loop body contains conditions of the form IndVar < #constant, we
can remove the checks by peeling off #constant iterations.

[LoopUnroll] Peel off iterations if it makes conditions true/false.

If the loop body contains conditions of the form IndVar < #constant, we
can remove the checks by peeling off #constant iterations.

This improves codegen for PR34364.

Reviewers: mkuper, mkazantsev, efriedma

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D43876

llvm-svn: 327671

show more ...


# f9b8035f 15-Mar-2018 Andrei Elovikov <andrei.elovikov@intel.com>

[LoopUnroll] Ignore ephemeral values when checking full unroll profitability.

Summary:
Before this patch call graph is like this in the LoopUnrollPass:

tryToUnrollLoop
ApproximateLoopSize

[LoopUnroll] Ignore ephemeral values when checking full unroll profitability.

Summary:
Before this patch call graph is like this in the LoopUnrollPass:

tryToUnrollLoop
ApproximateLoopSize
collectEphemeralValues
/* Use collected ephemeral values */
computeUnrollCount
analyzeLoopUnrollCost
/* Bail out from the analysis if loop contains CallInst */

This patch moves collection of the ephemeral values to the tryToUnrollLoop
function and passes the collected values into both ApproximateLoopsize (as
before) and additionally starts using them in analyzeLoopUnrollCost:

tryToUnrollLoop
collectEphemeralValues
ApproximateLoopSize(EphValues)
/* Use EphValues */
computeUnrollCount(EphValues)
analyzeLoopUnrollCost(EphValues)
/* Ignore ephemeral values - they don't contribute to the final cost */
/* Bail out from the analysis if loop contains CallInst */

Reviewers: mzolotukhin, evstupac, sanjoy

Reviewed By: evstupac

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43931

llvm-svn: 327617

show more ...


# 3c42f1c3 02-Mar-2018 Yaxun Liu <Yaxun.Liu@amd.com>

LoopUnroll: respect pragma unroll when AllowRemainder is disabled

Currently when AllowRemainder is disabled, pragma unroll count is not
respected even though there is no remainder. This bug causes a

LoopUnroll: respect pragma unroll when AllowRemainder is disabled

Currently when AllowRemainder is disabled, pragma unroll count is not
respected even though there is no remainder. This bug causes a loop
fully unrolled in many cases even though the user specifies a unroll
count. Especially it affects OpenCL/CUDA since in many cases a loop
contains convergent instructions and currently AllowRemainder is
disabled for such loops.

Differential Revision: https://reviews.llvm.org/D43826

llvm-svn: 326585

show more ...


Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1
# a17f2205 22-Dec-2017 Easwaran Raman <eraman@google.com>

Add hasProfileData() to check if a function has profile data. NFC.

Summary:
This replaces calls to getEntryCount().hasValue() with hasProfileData
that does the same thing. This refactoring is useful

Add hasProfileData() to check if a function has profile data. NFC.

Summary:
This replaces calls to getEntryCount().hasValue() with hasProfileData
that does the same thing. This refactoring is useful to do before adding
synthetic function entry counts but also a useful cleanup IMO even
otherwise. I have used hasProfileData instead of hasRealProfileData as
David had earlier suggested since I think profile implies "real" and I
use the phrase "synthetic entry count" and not "synthetic profile count"
but I am fine calling it hasRealProfileData if you prefer.

Reviewers: davidxl, silvas

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41461

llvm-svn: 321331

show more ...


Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1
# 0444e4fc 19-Oct-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

Fix MSVC signed/unsigned comparison warning

llvm-svn: 316161


# 306d2997 18-Oct-2017 Eugene Zelenko <eugene.zelenko@gmail.com>

[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

llvm-svn: 316128


# 73f65043 15-Oct-2017 Hongbin Zheng <etherzhhb@gmail.com>

[LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC.

This avoid code duplication and allow us to add the disable unroll metadata elsewhere.

Differential Revisio

[LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC.

This avoid code duplication and allow us to add the disable unroll metadata elsewhere.

Differential Revision: https://reviews.llvm.org/D38928

llvm-svn: 315850

show more ...


# 9590658f 11-Oct-2017 Vivek Pandya <vivekvpandya@gmail.com>

[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure
parameterized emit() calls

Summary: This is not functional change to adopt new emit() API added in r313691.

Reviewed By: anem

[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure
parameterized emit() calls

Summary: This is not functional change to adopt new emit() API added in r313691.

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38285

llvm-svn: 315476

show more ...


# 0965da20 09-Oct-2017 Adam Nemet <anemet@apple.com>

Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*

Sync it up with the name of the class actually defined here. This has been
bothering me for a while...

llvm-svn: 315249


# c965b30e 28-Sep-2017 Benjamin Kramer <benny.kra@googlemail.com>

[LoopUnroll] Fix use after poison.

llvm-svn: 314418


# def1729d 28-Sep-2017 Sanjoy Das <sanjoy@playingwithpointers.com>

Use a BumpPtrAllocator for Loop objects

Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::c

Use a BumpPtrAllocator for Loop objects

Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::clear() was
doing.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38201

llvm-svn: 314375

show more ...


# 0dbb0f10 27-Sep-2017 Rui Ueyama <ruiu@google.com>

Fix -Wunused-variable for Release build.

llvm-svn: 314353


# 4f3ebd53 27-Sep-2017 Sanjoy Das <sanjoy@playingwithpointers.com>

Return the LoopUnrollResult from tryToUnrollLoop; NFC

I will use this in a later change.

llvm-svn: 314352


# 3567d3d2 27-Sep-2017 Sanjoy Das <sanjoy@playingwithpointers.com>

Rename LoopUnrollStatus to LoopUnrollResult; NFC

A "Result" suffix is more appropriate here

llvm-svn: 314350


# 09613b12 20-Sep-2017 Sanjoy Das <sanjoy@playingwithpointers.com>

Tighten the invariants around LoopBase::invalidate

Summary:
With this change:
- Methods in LoopBase trip an assert if the receiver has been invalidated
- LoopBase::clear frees up the memory held t

Tighten the invariants around LoopBase::invalidate

Summary:
With this change:
- Methods in LoopBase trip an assert if the receiver has been invalidated
- LoopBase::clear frees up the memory held the LoopBase instance

This change also shuffles things around as necessary to work with this stricter invariant.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38055

llvm-svn: 313708

show more ...


Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4
# 9a09ae44 28-Aug-2017 Davide Italiano <davide@freebsd.org>

[LoopUnroll] Add a cl::opt to force peeling, for testing purposes.

Will be used to test the patch proposed in D37153.

llvm-svn: 311915


Revision tags: llvmorg-5.0.0-rc3
# 718c8a6a 14-Aug-2017 Sam Parker <sam.parker@arm.com>

[LoopUnroll] Enable option to peel remainder loop

On some targets, the penalty of executing runtime unrolling checks
and then not the unrolled loop can be significantly detrimental to
performance. T

[LoopUnroll] Enable option to peel remainder loop

On some targets, the penalty of executing runtime unrolling checks
and then not the unrolled loop can be significantly detrimental to
performance. This results in the need to be more conservative with
the unroll count, keeping a trip count of 2 reduces the overhead as
well as increasing the chance of the unrolled body being executed. But
being conservative leaves performance gains on the table.

This patch enables the unrolling of the remainder loop introduced by
runtime unrolling. This can help reduce the overhead of misunrolled
loops because the cost of non-taken branches is much less than the
cost of the backedge that would normally be executed in the remainder
loop. This allows larger unroll factors to be used without suffering
performance loses with smaller iteration counts.

Differential Revision: https://reviews.llvm.org/D36309

llvm-svn: 310824

show more ...


Revision tags: llvmorg-5.0.0-rc2
# 7c888dca 08-Aug-2017 Chandler Carruth <chandlerc@gmail.com>

[PM] Fix new LoopUnroll function pass by invalidating loop analysis
results when a loop is completely removed.

This is very hard to manifest as a visible bug. You need to arrange for
there to be a s

[PM] Fix new LoopUnroll function pass by invalidating loop analysis
results when a loop is completely removed.

This is very hard to manifest as a visible bug. You need to arrange for
there to be a subsequent allocation of a 'Loop' object which gets the
exact same address as the one which the unroll deleted, and you need the
LoopAccessAnalysis results to be significant in the way that they're
stale. And you need a million other things to align.

But when it does, you get a deeply mysterious crash due to actually
finding a stale analysis result. This fixes the issue and tests for it
by directly checking we successfully invalidate things. I have not been
able to get *any* test case to reliably trigger this. Changes to LLVM
itself caused the only test case I ever had to cease to crash.

I've looked pretty extensively at less brittle ways of fixing this and
they are actually very, very hard to do. This is a somewhat strange and
unusual case as we have a pass which is deleting an IR unit, but is not
running within that IR unit's pass framework (which is what handles this
cleanly for the normal loop unroll). And where there isn't a definitive
way to clear *all* of the stale cache entries. And where the pass *is*
updating the core analysis that provides the IR units!

For example, we don't have any of these problems with Function analyses
because it is easy to clear out function analyses when the functions
themselves may have been deleted -- we clear an entire module's worth!
But that is too heavy of a hammer down here in the LoopAnalysisManager
layer.

A better long-term solution IMO is to require that AnalysisManager's
make their keys durable to this kind of thing. Specifically, when
caching an analysis for one IR unit that is conceptually "owned" by
a higher level IR unit, the AnalysisManager should incorporate this into
its data structures so that we can reliably clear these results without
having to teach each and every pass to do so manually as we do here. But
that is a change for another day as it will be a fairly invasive change
to the AnalysisManager infrastructure. Until then, this fortunately
seems to be quite rare.

llvm-svn: 310333

show more ...


# 8482e569 03-Aug-2017 Teresa Johnson <tejohnson@google.com>

Use profile summary to disable peeling for huge working sets

Summary:
Detect when the working set size of a profiled application is huge,
by comparing the number of counts required to reach the hot

Use profile summary to disable peeling for huge working sets

Summary:
Detect when the working set size of a profiled application is huge,
by comparing the number of counts required to reach the hot percentile
in the profile summary to a large threshold*.

When the working set size is determined to be huge, disable peeling
to avoid bloating the working set further.

*Note that the selected threshold (15K) is significantly larger than the
largest working set value in SPEC cpu2006 (which is gcc at around 11K).

Reviewers: davidxl

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36288

llvm-svn: 310005

show more ...


# 9a18a6f0 03-Aug-2017 Teresa Johnson <tejohnson@google.com>

Disable loop peeling during full unrolling pass.

Summary:
Peeling should not occur during the full unrolling invocation early
in the pipeline, but rather later with partial and runtime loop
unrollin

Disable loop peeling during full unrolling pass.

Summary:
Peeling should not occur during the full unrolling invocation early
in the pipeline, but rather later with partial and runtime loop
unrolling. The later loop unrolling invocation will also eventually
utilize profile summary and branch frequency information, which
we would like to use to control peeling. And for ThinLTO we want
to delay peeling until the backend (post thin link) phase, just as
we do for most types of unrolling.

Ensure peeling doesn't occur during the full unrolling invocation
by adding a parameter to the shared implementation function, similar
to the way partial and runtime loop unrolling are disabled.

Performance results for ThinLTO suggest this has a neutral to positive
effect on some internal benchmarks.

Reviewers: chandlerc, davidxl

Subscribers: mzolotukhin, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36258

llvm-svn: 309966

show more ...


# ecd90131 02-Aug-2017 Teresa Johnson <tejohnson@google.com>

[PM] Split LoopUnrollPass and make partial unroller a function pass

Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I a

[PM] Split LoopUnrollPass and make partial unroller a function pass

Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I am only doing the
splitting for the New PM, but I can do the same for the legacy PM as
a follow-on if this looks good.

*Not NFC since for partial unrolling we lose the updates done to the
loop traversal (adding new sibling and child loops) - according to
Chandler this is not very useful for partial unrolling, but it also
means that the debugging flag -unroll-revisit-child-loops no longer
works for partial unrolling.

Reviewers: chandlerc

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36157

llvm-svn: 309886

show more ...


Revision tags: llvmorg-5.0.0-rc1
# b0573547 28-Jun-2017 Geoff Berry <gberry@codeaurora.org>

[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount

Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: mcrosier, llvm-commits, mzolotukhin

Diffe

[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount

Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D34532

llvm-svn: 306564

show more ...


# 66d9bdbc 28-Jun-2017 Geoff Berry <gberry@codeaurora.org>

[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.

Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, j

[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.

Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D34531

llvm-svn: 306554

show more ...


Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1
# 927d8e61 12-Apr-2017 Chandler Carruth <chandlerc@gmail.com>

[IR] Redesign the case iterator in SwitchInst to actually be an iterator
and to expose a handle to represent the actual case rather than having
the iterator return a reference to itself.

All of this

[IR] Redesign the case iterator in SwitchInst to actually be an iterator
and to expose a handle to represent the actual case rather than having
the iterator return a reference to itself.

All of this allows the iterator to be used with common STL facilities,
standard algorithms, etc.

Doing this exposed some missing facilities in the iterator facade that
I've fixed and required some work to the actual iterator to fully
support the necessary API.

Differential Revision: https://reviews.llvm.org/D31548

llvm-svn: 300032

show more ...


Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4
# eed71b9e 03-Mar-2017 Sanjoy Das <sanjoy@playingwithpointers.com>

[LoopUnrolling] Re-prioritize Peeling and Partial unrolling

Summary:
In current implementation the loop peeling happens after trip-count based partial unrolling and may
sometimes not happen at all d

[LoopUnrolling] Re-prioritize Peeling and Partial unrolling

Summary:
In current implementation the loop peeling happens after trip-count based partial unrolling and may
sometimes not happen at all due to it (for example, if trip count is known, but UP.Partial = false). This
is generally bad, the more than there are some situations where peeling is profitable even if the partial
unrolling is disabled.

This patch is a NFC which reorders peeling and partial unrolling application and prepares the code for
implementation of the said optimizations.

Patch by Max Kazantsev!

Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Reviewed By: mkuper

Subscribers: mkuper, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D30243

llvm-svn: 296897

show more ...


12345678910>>...15