Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3 |
|
#
fcdb1c14 |
| 20-Aug-2015 |
Benjamin Kramer <benny.kra@googlemail.com> |
Make helper functions static. NFC.
llvm-svn: 245549
|
#
2f1fd165 |
| 17-Aug-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings.
I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses.
But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never *actually* invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see.
To make matters worse, there *is* actually a key that we don't update with value handles -- there is a map keyed off of Loop*s. Because LoopInfo *does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch.
To make matters still worse, there are plenty of updates that *don't* trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in *just* such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug.
With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best.
Differential Revision: http://reviews.llvm.org/D12063
llvm-svn: 245193
|
Revision tags: studio-1.4 |
|
#
8939154a |
| 10-Aug-2015 |
Mark Heffernan <meheff@google.com> |
Add new llvm.loop.unroll.enable metadata.
This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time.
The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll".
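As a hedged sketch (the function and label names are illustrative), the new metadata attaches to the loop's latch branch in 3.7-era IR like this:

```llvm
define void @f(i32 %n) {
entry:
  br label %loop
loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %i.next = add i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop, !llvm.loop !0
exit:
  ret void
}

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.unroll.enable"}
```

With a compile-time-known trip count the loop is fully unrolled; otherwise it is partially unrolled, unlike "llvm.loop.unroll.full".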
llvm-svn: 244466
|
#
df005cbe |
| 08-Aug-2015 |
Benjamin Kramer <benny.kra@googlemail.com> |
Fix some comment typos.
llvm-svn: 244402
|
#
b2fda0d9 |
| 05-Aug-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Switch to using 'int' cost types in preparation for a somewhat more involved change to the cost computation pattern.
llvm-svn: 244095
|
#
924879ad |
| 04-Aug-2015 |
Sanjay Patel <spatel@rotateright.com> |
wrap OptSize and MinSize attributes for easier and consistent access (NFCI)
Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os).
Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests.
This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call.
Differential Revision: http://reviews.llvm.org/D11734
llvm-svn: 243994
|
#
87adb7a2 |
| 03-Aug-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Improve the brute force loop unroll estimate by propagating through PHI nodes across iterations.
This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrences that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE).
This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate *cost* between loop iterations through the PHI nodes, and it occurred to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form.
Differential Revision: http://reviews.llvm.org/D11706
llvm-svn: 243900
|
Revision tags: llvmorg-3.7.0-rc2 |
|
#
9f06ef76 |
| 29-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Handle SwitchInst properly.
Previously successor selection was simply wrong.
llvm-svn: 243545
|
#
3a7d55b6 |
| 29-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Don't crash when simplified branch condition is undef.
llvm-svn: 243544
|
#
80d13bac |
| 28-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Add debug dumps to loop-unroll analyzer.
llvm-svn: 243471
|
#
a425c9d0 |
| 28-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[Unroll] Don't analyze blocks outside the loop.
llvm-svn: 243466
|
#
57776b81 |
| 24-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
Handle resolvable branches in complete loop unroll heuristic.
Summary: Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate. This patch is intended to be applied after D10205 (though it could be applied independently).
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10206
llvm-svn: 243084
|
Revision tags: llvmorg-3.7.0-rc1 |
|
#
31b3eaaf |
| 15-Jul-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
[LoopUnrolling] Handle cast instructions.
During estimation of unrolling effect we should be able to propagate constants through casts.
Differential Revision: http://reviews.llvm.org/D10207
llvm-svn: 242257
|
#
d7ebc241 |
| 13-Jul-2015 |
Mark Heffernan <meheff@google.com> |
Enable runtime unrolling with unroll pragma metadata
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N") and a runtime trip count. Also, do not unroll loops with unroll full metadata if the loop has a runtime loop count. Previously, such loops would be unrolled with a very large threshold (pragma-unroll-threshold) if runtime unrolling happened to be enabled, resulting in a very large (and likely unwise) unroll factor.
llvm-svn: 242047
|
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
#
f00654e3 |
| 23-Jun-2015 |
Alexander Kornienko <alexfh@google.com> |
Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.
llvm-svn: 240390
|
#
70bc5f13 |
| 19-Jun-2015 |
Alexander Kornienko <alexfh@google.com> |
Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
|
#
c4e4f33e |
| 11-Jun-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
Update stale comment before analyzeLoopUnrollCost. NFC.
llvm-svn: 239565
|
#
a60bdb56 |
| 08-Jun-2015 |
Michael Zolotukhin <mzolotukhin@apple.com> |
Remove SCEVCache and FindConstantPointers from complete loop unrolling heuristic.
Summary: Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor. Also, this makes the code more universal - I'll take advantage of it in the next patches where I start handling additional types of instructions.
Test Plan: Tests would be submitted in subsequent patches.
Reviewers: atrick, chandlerc
Reviewed By: atrick, chandlerc
Subscribers: atrick, llvm-commits
Differential Revision: http://reviews.llvm.org/D10205
llvm-svn: 239282
|
#
ad714b1a |
| 06-Jun-2015 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
[LoopUnroll] Fix truncation bug in canUnrollCompletely.
Summary: canUnrollCompletely takes `unsigned` values for `UnrolledCost` and `RolledDynamicCost` but is passed in `uint64_t`s that are silently truncated. Because of this, when `UnrolledSize` is a large integer that has a small remainder with UINT32_MAX, LLVM tries to completely unroll loops with high trip counts.
Reviewers: mzolotukhin, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10293
llvm-svn: 239218
|
#
9dabd14d |
| 05-Jun-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Rework the naming and structure of the new unroll heuristics.
The new naming is (to me) much easier to understand. Here is a summary of the new state of the world:
- '*Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high.
When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration.
While we're still working to build up the infrastructure for making these estimates, to me it is much more clear *how* to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the *dynamic* cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information.
This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heuristics! I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold.
The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placeholder numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a two-tier threshold model. We need to tune all these values once the logic is ready to be enabled.
Differential Revision: http://reviews.llvm.org/D9966
llvm-svn: 239164
|
#
04cc665c |
| 25-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Switch from an eagerly populated SCEV cache to one that is lazily built.
Also, make it a much more generic SCEV cache, which today exposes only a reduced GEP model description but could be extended in the future to do other profitable caching of SCEV information.
llvm-svn: 238124
|
#
0215608b |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Separate the logic for testing each iteration of the loop, accumulating estimated cost, and other loop-centric logic from the logic used to analyze instructions in a particular iteration.
This makes the visitor very narrow in scope -- all it does is visit instructions, update a map of simplified values, and return whether it is able to optimize away a particular instruction.
The two cost metrics are now returned as an optional struct. When the optional is left unengaged, there is no information about the unrolled cost of the loop, when it is engaged the cost metrics are available to run against the thresholds.
No functionality changed.
llvm-svn: 238033
|
#
51895599 |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Replace a hand-wavy FIXME with a FIXME that explains the actual problem instead of suggesting doing something that is trivial to do but incorrect given the current design of the libraries.
llvm-svn: 237994
|
#
e1a0462d |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Extract the logic for caching SCEV-modeled GEPs with their simplified model for use simulating each iteration into a separate helper function that just returns the cache.
Building this cache had nothing to do with the rest of the unroll analysis and so this removes an unnecessary coupling, etc. It should also make it easier to think about the concept of providing fast cached access to basic SCEV models as an orthogonal concept to the overall unroll simulation.
I'd really like to see this kind of caching logic folded into SCEV itself, it seems weird for us to provide it at this layer rather than making repeated queries into SCEV fast all on their own.
No functionality changed.
llvm-svn: 237993
|
#
f174a156 |
| 22-May-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[Unroll] Refactor the accumulation of optimized instruction costs into a single location.
This reduces code duplication a bit and will also pave the way for a better separation between the visitation algorithm and the unroll analysis.
No functionality changed.
llvm-svn: 237990
|