Revision tags: llvmorg-5.0.2-rc1 |
|
#
fc97b617 |
| 15-Mar-2018 |
Florian Hahn <florian.hahn@arm.com> |
[LoopUnroll] Peel off iterations if it makes conditions true/false.
If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations.
This improves codegen for PR34364.
Reviewers: mkuper, mkazantsev, efriedma
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D43876
llvm-svn: 327671
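The transformation described in this entry can be illustrated by hand in plain C++. This is only a sketch of the idea, not the pass itself: the function names and the constant 2 are made up for the illustration, and the compiler performs the equivalent rewrite on IR rather than on source.

```cpp
#include <cstddef>
#include <vector>

// Original form: the check `i < 2` is evaluated on every iteration.
long sumWithCheck(const std::vector<int> &v) {
  long sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (i < 2)
      sum += 10L * v[i]; // special handling for the first two iterations
    else
      sum += v[i];
  }
  return sum;
}

// Hand-peeled form: two iterations are peeled off in front of the loop.
long sumPeeled(const std::vector<int> &v) {
  long sum = 0;
  std::size_t i = 0;
  // Peeled iterations 0 and 1: `i < 2` is known to be true here.
  if (i < v.size()) { sum += 10L * v[i]; ++i; }
  if (i < v.size()) { sum += 10L * v[i]; ++i; }
  // Remaining loop: `i < 2` is known to be false, so the check disappears.
  for (; i < v.size(); ++i)
    sum += v[i];
  return sum;
}
```

Both functions compute the same result for any input length, including inputs shorter than the peel count; the peeled version simply has no condition left inside the loop body.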
|
#
f9b8035f |
| 15-Mar-2018 |
Andrei Elovikov <andrei.elovikov@intel.com> |
[LoopUnroll] Ignore ephemeral values when checking full unroll profitability.
Summary: Before this patch, the call graph in the LoopUnrollPass looks like this:
  tryToUnrollLoop
    ApproximateLoopSize
      collectEphemeralValues
      /* Use collected ephemeral values */
    computeUnrollCount
      analyzeLoopUnrollCost
        /* Bail out from the analysis if loop contains CallInst */
This patch moves collection of the ephemeral values to the tryToUnrollLoop function and passes the collected values into both ApproximateLoopSize (as before) and additionally starts using them in analyzeLoopUnrollCost:
  tryToUnrollLoop
    collectEphemeralValues
    ApproximateLoopSize(EphValues)
      /* Use EphValues */
    computeUnrollCount(EphValues)
      analyzeLoopUnrollCost(EphValues)
        /* Ignore ephemeral values - they don't contribute to the final cost */
        /* Bail out from the analysis if loop contains CallInst */
Reviewers: mzolotukhin, evstupac, sanjoy
Reviewed By: evstupac
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43931
llvm-svn: 327617
|
#
3c42f1c3 |
| 02-Mar-2018 |
Yaxun Liu <Yaxun.Liu@amd.com> |
LoopUnroll: respect pragma unroll when AllowRemainder is disabled
Currently, when AllowRemainder is disabled, the pragma unroll count is not respected even when there is no remainder. This bug causes a loop to be fully unrolled in many cases even though the user specified an unroll count. It especially affects OpenCL/CUDA, since in many cases a loop contains convergent instructions and AllowRemainder is currently disabled for such loops.
Differential Revision: https://reviews.llvm.org/D43826
llvm-svn: 326585
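The rule this entry describes can be sketched as a small decision function. This is a hypothetical, simplified model and not the LLVM code: `UnrollChoice`, `choosePragmaCount`, and the use of a trip multiple to detect "no remainder" are assumptions made for the illustration.

```cpp
// Hypothetical, simplified model; not the LLVM implementation.
struct UnrollChoice {
  bool usePragmaCount; // true when the user-requested count is honored
  unsigned count;      // unroll count to use when usePragmaCount is true
};

UnrollChoice choosePragmaCount(unsigned tripMultiple, unsigned pragmaCount,
                               bool allowRemainder) {
  if (pragmaCount == 0)
    return {false, 0}; // no pragma count was given
  // An exact division of the trip multiple leaves no remainder iterations,
  // so the requested count is usable even when remainders are disallowed.
  bool noRemainder = tripMultiple % pragmaCount == 0;
  if (allowRemainder || noRemainder)
    return {true, pragmaCount};
  return {false, 0};
}
```

Under this model, a loop whose trip count is a multiple of 16 with a requested count of 4 keeps the count of 4 even when remainders are disallowed, while a requested count of 3 is only honored when a remainder loop is permitted.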
|
Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1 |
|
#
a17f2205 |
| 22-Dec-2017 |
Easwaran Raman <eraman@google.com> |
Add hasProfileData() to check if a function has profile data. NFC.
Summary: This replaces calls to getEntryCount().hasValue() with hasProfileData that does the same thing. This refactoring is useful to do before adding synthetic function entry counts but also a useful cleanup IMO even otherwise. I have used hasProfileData instead of hasRealProfileData as David had earlier suggested since I think profile implies "real" and I use the phrase "synthetic entry count" and not "synthetic profile count" but I am fine calling it hasRealProfileData if you prefer.
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41461
llvm-svn: 321331
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
#
0444e4fc |
| 19-Oct-2017 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
Fix MSVC signed/unsigned comparison warning
llvm-svn: 316161
|
#
306d2997 |
| 18-Oct-2017 |
Eugene Zelenko <eugene.zelenko@gmail.com> |
[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316128
|
#
73f65043 |
| 15-Oct-2017 |
Hongbin Zheng <etherzhhb@gmail.com> |
[LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC.
This avoids code duplication and allows us to add the disable unroll metadata elsewhere.
Differential Revision: https://reviews.llvm.org/D38928
llvm-svn: 315850
|
#
9590658f |
| 11-Oct-2017 |
Vivek Pandya <vivekvpandya@gmail.com> |
[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure parameterized emit() calls
Summary: This is a non-functional change to adopt the new emit() API added in r313691.
Reviewed By: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38285
llvm-svn: 315476
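The general pattern behind a closure-parameterized emit call can be sketched as follows. This is a stand-alone model and not the OptimizationRemarkEmitter interface: `Remark`, `RemarkSink`, `emitEager`, and `emitLazy` are hypothetical names. The point it illustrates is that a closure lets the emitter skip building the remark when remarks are disabled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a diagnostic remark.
struct Remark {
  std::string message;
};

// Hypothetical sink modelling the two calling styles.
class RemarkSink {
public:
  explicit RemarkSink(bool enabled) : enabled_(enabled) {}

  // Old style: the caller has already paid for building the remark.
  void emitEager(const Remark &r) {
    if (enabled_)
      emitted_.push_back(r.message);
  }

  // New style: the closure runs only when remarks are enabled.
  template <typename BuilderT> void emitLazy(BuilderT build) {
    if (!enabled_)
      return;
    Remark r = build();
    emitted_.push_back(r.message);
  }

  std::size_t count() const { return emitted_.size(); }

private:
  bool enabled_;
  std::vector<std::string> emitted_;
};
```

A call such as `sink.emitLazy([&] { return Remark{"unrolled loop"}; })` never runs the lambda body on a disabled sink, which is the saving the closure form is meant to provide.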
|
#
0965da20 |
| 09-Oct-2017 |
Adam Nemet <anemet@apple.com> |
Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*
Sync it up with the name of the class actually defined here. This has been bothering me for a while...
llvm-svn: 315249
|
#
c965b30e |
| 28-Sep-2017 |
Benjamin Kramer <benny.kra@googlemail.com> |
[LoopUnroll] Fix use after poison.
llvm-svn: 314418
|
#
def1729d |
| 28-Sep-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
Use a BumpPtrAllocator for Loop objects
Summary: And now that we no longer have to explicitly free() the Loop instances, we can (with more ease) use the destructor of LoopBase to do what LoopBase::clear() was doing.
Reviewers: chandlerc
Subscribers: mehdi_amini, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D38201
llvm-svn: 314375
|
#
0dbb0f10 |
| 27-Sep-2017 |
Rui Ueyama <ruiu@google.com> |
Fix -Wunused-variable for Release build.
llvm-svn: 314353
|
#
4f3ebd53 |
| 27-Sep-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
Return the LoopUnrollResult from tryToUnrollLoop; NFC
I will use this in a later change.
llvm-svn: 314352
|
#
3567d3d2 |
| 27-Sep-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
Rename LoopUnrollStatus to LoopUnrollResult; NFC
A "Result" suffix is more appropriate here
llvm-svn: 314350
|
#
09613b12 |
| 20-Sep-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
Tighten the invariants around LoopBase::invalidate
Summary: With this change:
- Methods in LoopBase trip an assert if the receiver has been invalidated
- LoopBase::clear frees up the memory held by the LoopBase instance
This change also shuffles things around as necessary to work with this stricter invariant.
Reviewers: chandlerc
Subscribers: mehdi_amini, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D38055
llvm-svn: 313708
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4 |
|
#
9a09ae44 |
| 28-Aug-2017 |
Davide Italiano <davide@freebsd.org> |
[LoopUnroll] Add a cl::opt to force peeling, for testing purposes.
Will be used to test the patch proposed in D37153.
llvm-svn: 311915
|
Revision tags: llvmorg-5.0.0-rc3 |
|
#
718c8a6a |
| 14-Aug-2017 |
Sam Parker <sam.parker@arm.com> |
[LoopUnroll] Enable option to peel remainder loop
On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count; keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table.
This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance losses with smaller iteration counts.
Differential Revision: https://reviews.llvm.org/D36309
llvm-svn: 310824
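The effect on the remainder loop can be shown by hand in C++. This is a source-level sketch only: the function names and the unroll factor of 4 are assumptions for the illustration, and the actual pass works on IR and may place the remainder differently.

```cpp
#include <cstddef>

// Unrolled by 4 with a conventional remainder loop (executes a back edge).
long sumRemainderLoop(const int *a, std::size_t n) {
  long s = 0;
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4)
    s += static_cast<long>(a[i]) + a[i + 1] + a[i + 2] + a[i + 3];
  for (; i < n; ++i) // remainder loop: at most 3 iterations
    s += a[i];
  return s;
}

// Same main loop, but the remainder is unrolled into forward branches only.
long sumRemainderUnrolled(const int *a, std::size_t n) {
  long s = 0;
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4)
    s += static_cast<long>(a[i]) + a[i + 1] + a[i + 2] + a[i + 3];
  if (i < n) {
    s += a[i++];
    if (i < n) {
      s += a[i++];
      if (i < n)
        s += a[i++];
    }
  }
  return s;
}
```

For short inputs that never reach the unrolled body, the second form executes a few forward branches instead of looping, which is the overhead reduction the entry refers to.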
|
Revision tags: llvmorg-5.0.0-rc2 |
|
#
7c888dca |
| 08-Aug-2017 |
Chandler Carruth <chandlerc@gmail.com> |
[PM] Fix new LoopUnroll function pass by invalidating loop analysis results when a loop is completely removed.
This is very hard to manifest as a visible bug. You need to arrange for there to be a subsequent allocation of a 'Loop' object which gets the exact same address as the one which the unroll deleted, and you need the LoopAccessAnalysis results to be significant in the way that they're stale. And you need a million other things to align.
But when it does, you get a deeply mysterious crash due to actually finding a stale analysis result. This fixes the issue and tests for it by directly checking we successfully invalidate things. I have not been able to get *any* test case to reliably trigger this. Changes to LLVM itself caused the only test case I ever had to cease to crash.
I've looked pretty extensively at less brittle ways of fixing this and they are actually very, very hard to do. This is a somewhat strange and unusual case as we have a pass which is deleting an IR unit, but is not running within that IR unit's pass framework (which is what handles this cleanly for the normal loop unroll). And where there isn't a definitive way to clear *all* of the stale cache entries. And where the pass *is* updating the core analysis that provides the IR units!
For example, we don't have any of these problems with Function analyses because it is easy to clear out function analyses when the functions themselves may have been deleted -- we clear an entire module's worth! But that is too heavy of a hammer down here in the LoopAnalysisManager layer.
A better long-term solution IMO is to require that AnalysisManager's make their keys durable to this kind of thing. Specifically, when caching an analysis for one IR unit that is conceptually "owned" by a higher level IR unit, the AnalysisManager should incorporate this into its data structures so that we can reliably clear these results without having to teach each and every pass to do so manually as we do here. But that is a change for another day as it will be a fairly invasive change to the AnalysisManager infrastructure. Until then, this fortunately seems to be quite rare.
llvm-svn: 310333
|
#
8482e569 |
| 03-Aug-2017 |
Teresa Johnson <tejohnson@google.com> |
Use profile summary to disable peeling for huge working sets
Summary: Detect when the working set size of a profiled application is huge, by comparing the number of counts required to reach the hot percentile in the profile summary to a large threshold*.
When the working set size is determined to be huge, disable peeling to avoid bloating the working set further.
*Note that the selected threshold (15K) is significantly larger than the largest working set value in SPEC cpu2006 (which is gcc at around 11K).
Reviewers: davidxl
Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36288
llvm-svn: 310005
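The check described in this entry amounts to a threshold comparison. The sketch below is a hypothetical model: the function and constant names are invented, the strict `>` comparison is an assumption, and only the 15K value is taken from the entry above.

```cpp
#include <cstdint>

// Threshold from the entry above ("15K"); the name is hypothetical.
constexpr std::uint64_t kHugeWorkingSetThreshold = 15000;

// Assumption: "huge" means strictly more counts than the threshold are
// needed to reach the hot percentile of the profile summary.
bool hasHugeWorkingSet(std::uint64_t numCountsToReachHotPercentile) {
  return numCountsToReachHotPercentile > kHugeWorkingSetThreshold;
}

// Peeling stays enabled unless a profile summary exists and reports a
// huge working set.
bool allowPeeling(bool hasProfileSummary,
                  std::uint64_t numCountsToReachHotPercentile) {
  return !(hasProfileSummary &&
           hasHugeWorkingSet(numCountsToReachHotPercentile));
}
```

With these assumptions, a profile needing around 11K counts (the gcc figure quoted above) keeps peeling enabled, while one needing 20K counts disables it.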
|
#
9a18a6f0 |
| 03-Aug-2017 |
Teresa Johnson <tejohnson@google.com> |
Disable loop peeling during full unrolling pass.
Summary: Peeling should not occur during the full unrolling invocation early in the pipeline, but rather later with partial and runtime loop unrolling. The later loop unrolling invocation will also eventually utilize profile summary and branch frequency information, which we would like to use to control peeling. And for ThinLTO we want to delay peeling until the backend (post thin link) phase, just as we do for most types of unrolling.
Ensure peeling doesn't occur during the full unrolling invocation by adding a parameter to the shared implementation function, similar to the way partial and runtime loop unrolling are disabled.
Performance results for ThinLTO suggest this has a neutral to positive effect on some internal benchmarks.
Reviewers: chandlerc, davidxl
Subscribers: mzolotukhin, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D36258
llvm-svn: 309966
|
#
ecd90131 |
| 02-Aug-2017 |
Teresa Johnson <tejohnson@google.com> |
[PM] Split LoopUnrollPass and make partial unroller a function pass
Summary: This is largely NFC*, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good.
*Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling.
Reviewers: chandlerc
Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36157
llvm-svn: 309886
|
Revision tags: llvmorg-5.0.0-rc1 |
|
#
b0573547 |
| 28-Jun-2017 |
Geoff Berry <gberry@codeaurora.org> |
[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper
Subscribers: mcrosier, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D34532
llvm-svn: 306564
|
#
66d9bdbc |
| 28-Jun-2017 |
Geoff Berry <gberry@codeaurora.org> |
[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper
Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D34531
llvm-svn: 306554
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1 |
|
#
927d8e61 |
| 12-Apr-2017 |
Chandler Carruth <chandlerc@gmail.com> |
[IR] Redesign the case iterator in SwitchInst to actually be an iterator and to expose a handle to represent the actual case rather than having the iterator return a reference to itself.
All of this allows the iterator to be used with common STL facilities, standard algorithms, etc.
Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API.
Differential Revision: https://reviews.llvm.org/D31548
llvm-svn: 300032
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4 |
|
#
eed71b9e |
| 03-Mar-2017 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
[LoopUnrolling] Re-prioritize Peeling and Partial unrolling
Summary: In the current implementation, loop peeling happens after trip-count based partial unrolling and may sometimes not happen at all because of it (for example, if the trip count is known but UP.Partial = false). This is generally bad, all the more so because there are situations where peeling is profitable even if partial unrolling is disabled.
This patch is an NFC that reorders the application of peeling and partial unrolling and prepares the code for implementing the said optimizations.
Patch by Max Kazantsev!
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper
Reviewed By: mkuper
Subscribers: mkuper, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D30243
llvm-svn: 296897
|