LoopUnrollPass.cpp - OpenGrok history log for /llvm-project/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-3.6.1
# a6ae877a	12-May-2015	Chandler Carruth <chandlerc@gmail.com>	[Unrolling] Refactor the start and step offsets to simplify overflow checking and make the cache faster and smaller. I had thought that using an APInt here would be useful, but I think I was just wr [Unrolling] Refactor the start and step offsets to simplify overflow checking and make the cache faster and smaller. I had thought that using an APInt here would be useful, but I think I was just wrong. Notably, we don't have to do any fancy overflow checking, we can just bound the values as quite small and do the math in a higher precision integer. I've switched to a signed integer so that UBSan will even point out if we ever have integer overflow. I've added various asserts to try to catch things as well and hoisted the overflow checks so that we just leave the too-large offsets out of the SCEV-GEP cache. This makes the value in the cache quite a bit smaller which is probably worthwhile. No functionality changed here (for trip counts under 1 billion). llvm-svn: 237209 show more ...
# 8c68171f	12-May-2015	Michael Zolotukhin <mzolotukhin@apple.com>	Reimplement heuristic for estimating complete-unroll optimization effects. Summary: This patch reimplements heuristic that tries to estimate optimization beneftis from complete loop unrolling. In t Reimplement heuristic for estimating complete-unroll optimization effects. Summary: This patch reimplements heuristic that tries to estimate optimization beneftis from complete loop unrolling. In this patch I kept the minimal changes - e.g. I removed code handling branches and folding compares. That's a promising area, but now there are too many questions to discuss before we can enable it. Test Plan: Tests are included in the patch. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8816 llvm-svn: 237156 show more ...
Revision tags: llvmorg-3.6.1-rc1
# e178f469	14-Apr-2015	Sanjoy Das <sanjoy@playingwithpointers.com>	[LoopUnrollRuntime] Avoid high-cost trip count computation. Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this [LoopUnrollRuntime] Avoid high-cost trip count computation. Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive. Depends on D8993. Reviewers: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8994 llvm-svn: 234846 show more ...
# 799003bf	23-Mar-2015	Benjamin Kramer <benny.kra@googlemail.com>	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998
# 51f6096c	23-Mar-2015	Benjamin Kramer <benny.kra@googlemail.com>	Move private classes into anonymous namespaces NFC. llvm-svn: 232944
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1
# a28d91d8	10-Mar-2015	Mehdi Amini <mehdi.amini@apple.com>	DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first at DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointer to DataLayout to references, this helped figuring out all the places where a nullptr could come up. I had initially a local version of this patch broken into over 30 independant, commits but some later commit were cleaning the API and touching part of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740 show more ...
# 715b01e9	09-Mar-2015	Kevin Qin <Kevin.Qin@arm.com>	Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization. Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization. Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and iteration number is relatively large enough. For some loops, we know they are not worth to be runtime unrolled. The scalar loop from vectorization is one of the cases. llvm-svn: 231631 show more ...
Revision tags: llvmorg-3.6.0, llvmorg-3.6.0-rc4
# 2c79ad97	14-Feb-2015	Duncan P. N. Exon Smith <dexonsmith@apple.com>	Transforms: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => g Transforms: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229202 show more ...
# 1fbc3165	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Concede defeat and disable the unroll analyzer for now. The issues with the new unroll analyzer are more fundamental than code cleanup, algorithm, or data structure changes. I've sent an em [unroll] Concede defeat and disable the unroll analyzer for now. The issues with the new unroll analyzer are more fundamental than code cleanup, algorithm, or data structure changes. I've sent an email to the original commit thread with details and a proposal for how to redesign things. I'm disabling this for now so that we don't spend time debugging issues with it in its current state. llvm-svn: 229064 show more ...
# 6c03dff7	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Merge the simplification and DCE estimation methods on the UnrollAnalyzer. Now they share a single worklist and have less implicit state between them. There was no real benefit to separatin [unroll] Merge the simplification and DCE estimation methods on the UnrollAnalyzer. Now they share a single worklist and have less implicit state between them. There was no real benefit to separating these two things out. I'm going to subsequently refactor things to share even more code. llvm-svn: 229062 show more ...
# d9591d89	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Remove pointless dyn_cast<>s to Instruction - the users of an instruction must by definition be instructions. llvm-svn: 229061
# 5457e20d	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Don't check the loop set for whether an instruction is contained in it each time we try to add it to the worklist, just check this when pulling it off the worklist. That way we do it at most [unroll] Don't check the loop set for whether an instruction is contained in it each time we try to add it to the worklist, just check this when pulling it off the worklist. That way we do it at most once per instruction with the cost of the worklist set we would need to pay anyways. llvm-svn: 229060 show more ...
# e5c30e4e	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Change the other worklist in the unroll analyzer to be a set vector. In addition to dramatically reducing the work required for contrived example loops, this also has to correct some seriou [unroll] Change the other worklist in the unroll analyzer to be a set vector. In addition to dramatically reducing the work required for contrived example loops, this also has to correct some serious latent bugs in the cost computation. Previously, we might add an instruction onto the worklist once for every load which it used and was simplified. Then we would visit it many times and accumulate "savings" each time. I mean, fortunately this couldn't matter for things like calls with 100s of operands, but even for binary operators this code seems like it must be double counting the savings. I just noticed this by inspection and due to the runtime problems it can introduce, I don't have any test cases for cases where the cost produced by this routine is unacceptable. llvm-svn: 229059 show more ...
# 7824bc92	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Replace a boolean, for loop, condition, and break with std::all_of and a lambda. Much cleaner, no functionality changed. llvm-svn: 229058
# 06d537cd	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Directly query for dead instructions. In the unroll analyzer, it is checking each user to see if that user will become dead. However, it first checked if that user was missing from the simp [unroll] Directly query for dead instructions. In the unroll analyzer, it is checking each user to see if that user will become dead. However, it first checked if that user was missing from the simplified values map, and then if was also missing from the dead instructions set. We add everything from the simplified values map to the dead instructions set, so the first step is completely subsumed by the second. Moreover, the first step requires inserting something into the simplified value map which isn't what we want at all. This also replaces a dyn_cast with a cast as an instruction cannot be used by a non-instruction. llvm-svn: 229057 show more ...
# 82cb30f1	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Replace a linear time check for no uses with a constant time check. Also hoist this into the enqueue process as it is faster even than testing the worklist set, we should just directly filt [unroll] Replace a linear time check for no uses with a constant time check. Also hoist this into the enqueue process as it is faster even than testing the worklist set, we should just directly filter these out much like we filter out constants and such. llvm-svn: 229056 show more ...
# 3b057b32	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Rather than an operand set, use a setvector for the worklist. We don't just want to handle duplicate operands within an instruction, but also duplicates across operands of different instruc [unroll] Rather than an operand set, use a setvector for the worklist. We don't just want to handle duplicate operands within an instruction, but also duplicates across operands of different instructions. I should have gone straight to this, but I had convinced myself that it wasn't going to be necessary briefly. I've come to my senses after chatting more with Nick, and am now happier here. llvm-svn: 229054 show more ...
# 17a0496b	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Extract the code to enqueue operansd for the worklist in the unroll analysis into a lambda and call it. That's much simpler than duplicating all the code. llvm-svn: 229053
# 8c86375a	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Use a small set to de-duplicate operands prior to putting them into the worklist. This avoids allocating lots of worklist memory for them when there are large numbers of repeated operands. [unroll] Use a small set to de-duplicate operands prior to putting them into the worklist. This avoids allocating lots of worklist memory for them when there are large numbers of repeated operands. llvm-svn: 229052 show more ...
# 93063e61	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Make the unroll cost analysis terminate deterministically and reasonably quickly. I don't have a reduced test case, but for a version of FFMPEG, this makes the loop unroller start finishing [unroll] Make the unroll cost analysis terminate deterministically and reasonably quickly. I don't have a reduced test case, but for a version of FFMPEG, this makes the loop unroller start finishing at all (after over 15 minutes of running, it hadn't terminated for me, no idea if it was a true infloop or just exponential work). The key thing here is to check the DeadInstructions set when pulling things off the worklist. Without this, we would re-walk the user list of already dead instructions again and again and again. Consider phi nodes with many, many operands and other patterns. The other important aspect of this is that because we would keep re-visiting instructions that were already known dead, we kept adding their cost savings to this! This would cause our cost savings to be insanely inflated from this. While I was here, I also rotated the operand walk out of the worklist loop to make the code easier to read. There is still work to be done to minimize worklist traffic because we don't de-duplicate operands. This means we may add the same instruction onto the worklist 1000s of times if it shows up in 1000s of operansd to a PHI node for example. Still, with this patch, the ffmpeg testcase I have finishes quickly and I can't measure the runtime impact of the unroll analysis any more. I'll probably try to do a few more cleanups to this code, but not sure how much cleanup I can justify right now. llvm-svn: 229038 show more ...
Revision tags: llvmorg-3.6.0-rc3
# dd6029fc	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Make range based for loops a bit more explicit and more readable. The biggest thing that was causing me problems is recognizing the references vs. poniters here. I also found that for maps [unroll] Make range based for loops a bit more explicit and more readable. The biggest thing that was causing me problems is recognizing the references vs. poniters here. I also found that for maps naming the loop variable as KeyValue helps make it obvious why you don't actually use it directly. Finally, using 'auto' instead of 'User *' doesn't seem like a good tradeoff. Much like with the other cases, I like to know its a pointer, and 'User' is just as long and tells the reader a lot more. llvm-svn: 229033 show more ...
# 415f4125	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Avoid the "Insn" abbreviation of Instruction. This is quite hard to type and read for me, and is inconsistent with the other abbreviation in the base class "Inst". For most of these (where t [unroll] Avoid the "Insn" abbreviation of Instruction. This is quite hard to type and read for me, and is inconsistent with the other abbreviation in the base class "Inst". For most of these (where they are used widely) I prefer just spelling it out as Instruction. I've changed two of the short-lived variables to use "Inst" to match the base class. llvm-svn: 229028 show more ...
# 302a133b	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Tidy up the integer we use to accumululate the number of instructions optimized. NFC, just separating this out from the functionality changing commit. llvm-svn: 229026
# 10a9926a	13-Feb-2015	Chandler Carruth <chandlerc@gmail.com>	[unroll] Don't use a map from pointer to bool. Use a set. This is much more efficient. In particular, the query with the user instruction has to insert a false for every missing instruction into the [unroll] Don't use a map from pointer to bool. Use a set. This is much more efficient. In particular, the query with the user instruction has to insert a false for every missing instruction into the set. This is just a cleanup a long the way to fixing the underlying algorithm problems here. llvm-svn: 228994 show more ...
# 1b480197	13-Feb-2015	Michael Zolotukhin <mzolotukhin@apple.com>	Prevent division by 0. When we try to estimate number of potentially removed instructions in loop unroller, we analyze first N iterations and then scale the computed number by TripCount/N. We should Prevent division by 0. When we try to estimate number of potentially removed instructions in loop unroller, we analyze first N iterations and then scale the computed number by TripCount/N. We should bail out early if N is 0. llvm-svn: 228988 show more ...
1...<<1112 13 14 15