Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3 |
|
#
fcdb1c14 |
| 20-Aug-2015 |
Benjamin Kramer <benny.kra@googlemail.com> |
Make helper functions static. NFC.
llvm-svn: 245549
|
#
2f1fd165 |
| 17-Aug-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings.
I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses.
But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never *actually* invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see.
To make matters worse, there *is* actually a key that we don't update with value handles -- there is a map keyed off of Loop*s. Because LoopInfo *does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch.
To make matters still worse, there are plenty of updates that *don't* trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in *just* such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug.
With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best.
Differential Revision: http://reviews.llvm.org/D12063
llvm-svn: 245193
|
Revision tags: studio-1.4 |
|
#
8939154a |
| 10-Aug-2015 |
Mark Heffernan <meheff@google.com> |
Add new llvm.loop.unroll.enable metadata.
This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time.
The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll".
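As a hedged sketch (the function and label names are illustrative), the new metadata attaches to the loop's latch branch in 3.7-era IR like this:

```llvm
define void @f(i32 %n) {
entry:
  br label %loop
loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %i.next = add i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop, !llvm.loop !0
exit:
  ret void
}

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.unroll.enable"}
```

With a compile-time-known trip count the loop is fully unrolled; otherwise it is partially unrolled, unlike "llvm.loop.unroll.full".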
llvm-svn: 244466
|
#
df005cbe |
| 08-Aug-2015 |
Benjamin Kramer <benny.kra@googlemail.com> |
Fix some comment typos.
llvm-svn: 244402
|
#
b2fda0d9 |
| 05-Aug-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Switch to using 'int' cost types in preparation for a somewhat more involved change to the cost computation pattern.
llvm-svn: 244095
|
#
924879ad |
| 04-Aug-2015 |
Sanjay Patel <spatel@rotateright.com> |
wrap OptSize and MinSize attributes for easier and consistent access (NFCI)
Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os).
Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests.
This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call.
Differential Revision: http://reviews.llvm.org/D11734
llvm-svn: 243994
|
#
87adb7a2 |
| 03-Aug-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Improve the brute force loop unroll estimate by propagating through PHI nodes across iterations.
This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrences that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE).
This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate *cost* between loop iterations through the PHI nodes, and it occurred to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form.
Differential Revision: http://reviews.llvm.org/D11706
llvm-svn: 243900
|
Revision tags: llvmorg-3.7.0-rc2 |
|
#
9f06ef76 |
| 29-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Handle SwitchInst properly.
Previously successor selection was simply wrong.
llvm-svn: 243545
|
#
3a7d55b6 |
| 29-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Don't crash when simplified branch condition is undef.
llvm-svn: 243544
|
#
80d13bac |
| 28-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Add debug dumps to loop-unroll analyzer.
llvm-svn: 243471
|
#
a425c9d0 |
| 28-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Don't analyze blocks outside the loop.
llvm-svn: 243466
|
#
57776b81 |
| 24-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
Handle resolvable branches in complete loop unroll heuristic.
Summary: Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate. This patch is intended to be applied after D10205 (though it could be applied independently).
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10206
llvm-svn: 243084
|
Revision tags: llvmorg-3.7.0-rc1 |
|
#
31b3eaaf |
| 15-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[LoopUnrolling] Handle cast instructions.
During estimation of unrolling effect we should be able to propagate constants through casts.
Differential Revision: http://reviews.llvm.org/D10207
llvm-svn: 242257
|
#
d7ebc241 |
| 13-Jul-2015 |
Mark Heffernan <meheff@google.com> |
Enable runtime unrolling with unroll pragma metadata
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N") and a runtime trip count. Also, do not unroll loops with unroll full metadata if the loop has a runtime loop count. Previously, such loops would be unrolled with a very large threshold (pragma-unroll-threshold) if runtime unrolling happened to be enabled, resulting in a very large (and likely unwise) unroll factor.
llvm-svn: 242047
|
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
#
f00654e3 |
| 23-Jun-2015 |
Alexander Kornienko <alexfh@google.com> |
Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.
llvm-svn: 240390
|
#
70bc5f13 |
| 19-Jun-2015 |
Alexander Kornienko <alexfh@google.com> |
Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
|
#
c4e4f33e |
| 11-Jun-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
Update stale comment before analyzeLoopUnrollCost. NFC.
llvm-svn: 239565
|
#
a60bdb56 |
| 08-Jun-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
Remove SCEVCache and FindConstantPointers from complete loop unrolling heuristic.
Summary: Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor. Also, this makes the code more universal - I'll take advantage of it in the next patches where I start handling additional types of instructions.
Test Plan: Tests would be submitted in subsequent patches.
Reviewers: atrick, chandlerc
Reviewed By: atrick, chandlerc
Subscribers: atrick, llvm-commits
Differential Revision: http://reviews.llvm.org/D10205
llvm-svn: 239282
|
#
ad714b1a |
| 06-Jun-2015 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
[LoopUnroll] Fix truncation bug in canUnrollCompletely.
Summary: canUnrollCompletely takes `unsigned` values for `UnrolledCost` and `RolledDynamicCost` but is passed in `uint64_t`s that are silently truncated. Because of this, when `UnrolledSize` is a large integer that has a small remainder with UINT32_MAX, LLVM tries to completely unroll loops with high trip counts.
Reviewers: mzolotukhin, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10293
llvm-svn: 239218
|
#
9dabd14d |
| 05-Jun-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Rework the naming and structure of the new unroll heuristics.
The new naming is (to me) much easier to understand. Here is a summary of the new state of the world:
- '*Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high.
When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration.
While we're still working to build up the infrastructure for making these estimates, to me it is much more clear *how* to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the *dynamic* cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information.
This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heuristics! I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold.
The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placeholder numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a two-tier threshold model. We need to tune all these values once the logic is ready to be enabled.
Differential Revision: http://reviews.llvm.org/D9966
llvm-svn: 239164
|
#
04cc665c |
| 25-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Switch from an eagerly populated SCEV cache to one that is lazily built.
Also, make it a much more generic SCEV cache, which today exposes only a reduced GEP model description but could be extended in the future to do other profitable caching of SCEV information.
llvm-svn: 238124
|
#
0215608b |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Separate the logic for testing each iteration of the loop, accumulating estimated cost, and other loop-centric logic from the logic used to analyze instructions in a particular iteration.
This makes the visitor very narrow in scope -- all it does is visit instructions, update a map of simplified values, and return whether it is able to optimize away a particular instruction.
The two cost metrics are now returned as an optional struct. When the optional is left unengaged, there is no information about the unrolled cost of the loop, when it is engaged the cost metrics are available to run against the thresholds.
No functionality changed.
llvm-svn: 238033
|
#
51895599 |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Replace a hand-wavy FIXME with a FIXME that explains the actual problem instead of suggesting doing something that is trivial to do but incorrect given the current design of the libraries.
llvm-svn: 237994
|
#
e1a0462d |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Extract the logic for caching SCEV-modeled GEPs with their simplified model for use simulating each iteration into a separate helper function that just returns the cache.
Building this cache had nothing to do with the rest of the unroll analysis and so this removes an unnecessary coupling, etc. It should also make it easier to think about the concept of providing fast cached access to basic SCEV models as an orthogonal concept to the overall unroll simulation.
I'd really like to see this kind of caching logic folded into SCEV itself, it seems weird for us to provide it at this layer rather than making repeated queries into SCEV fast all on their own.
No functionality changed.
llvm-svn: 237993
|
#
f174a156 |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Refactor the accumulation of optimized instruction costs into a single location.
This reduces code duplication a bit and will also pave the way for a better separation between the visitation algorithm and the unroll analysis.
No functionality changed.
llvm-svn: 237990
|