History log of /llvm-project/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (Results 251 – 275 of 369)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-3.6.1
# a6ae877a 12-May-2015 Chandler Carruth <chandlerc@gmail.com>

[Unrolling] Refactor the start and step offsets to simplify overflow
checking and make the cache faster and smaller.

I had thought that using an APInt here would be useful, but I think
I was just wr

[Unrolling] Refactor the start and step offsets to simplify overflow
checking and make the cache faster and smaller.

I had thought that using an APInt here would be useful, but I think
I was just wrong. Notably, we don't have to do any fancy overflow
checking, we can just bound the values as quite small and do the math in
a higher precision integer. I've switched to a signed integer so that
UBSan will even point out if we ever have integer overflow. I've added
various asserts to try to catch things as well and hoisted the overflow
checks so that we just leave the too-large offsets out of the SCEV-GEP
cache. This makes the value in the cache quite a bit smaller which is
probably worthwhile.

No functionality changed here (for trip counts under 1 billion).

llvm-svn: 237209

show more ...


# 8c68171f 12-May-2015 Michael Zolotukhin <mzolotukhin@apple.com>

Reimplement heuristic for estimating complete-unroll optimization effects.

Summary:
This patch reimplements heuristic that tries to estimate optimization beneftis
from complete loop unrolling.

In t

Reimplement heuristic for estimating complete-unroll optimization effects.

Summary:
This patch reimplements heuristic that tries to estimate optimization beneftis
from complete loop unrolling.

In this patch I kept the minimal changes - e.g. I removed code handling
branches and folding compares. That's a promising area, but now there
are too many questions to discuss before we can enable it.

Test Plan: Tests are included in the patch.

Reviewers: hfinkel, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8816

llvm-svn: 237156

show more ...


Revision tags: llvmorg-3.6.1-rc1
# e178f469 14-Apr-2015 Sanjoy Das <sanjoy@playingwithpointers.com>

[LoopUnrollRuntime] Avoid high-cost trip count computation.

Summary:
Runtime unrolling of loops needs to emit an expression to compute the
loop's runtime trip-count. Avoid runtime unrolling if this

[LoopUnrollRuntime] Avoid high-cost trip count computation.

Summary:
Runtime unrolling of loops needs to emit an expression to compute the
loop's runtime trip-count. Avoid runtime unrolling if this computation
will be expensive.

Depends on D8993.

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8994

llvm-svn: 234846

show more ...


# 799003bf 23-Mar-2015 Benjamin Kramer <benny.kra@googlemail.com>

Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.

llvm-svn: 232998


# 51f6096c 23-Mar-2015 Benjamin Kramer <benny.kra@googlemail.com>

Move private classes into anonymous namespaces

NFC.

llvm-svn: 232944


Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1
# a28d91d8 10-Mar-2015 Mehdi Amini <mehdi.amini@apple.com>

DataLayout is mandatory, update the API to reflect it with references.

Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first at

DataLayout is mandatory, update the API to reflect it with references.

Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.

This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.

I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.

I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.

Test Plan:

Reviewers: echristo

Subscribers: llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740

show more ...


# 715b01e9 09-Mar-2015 Kevin Qin <Kevin.Qin@arm.com>

Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization.

Runtime unrolling is an expensive optimization which can bring benefit
only if the loop is hot and

Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization.

Runtime unrolling is an expensive optimization which can bring benefit
only if the loop is hot and iteration number is relatively large enough.
For some loops, we know they are not worth to be runtime unrolled.
The scalar loop from vectorization is one of the cases.

llvm-svn: 231631

show more ...


Revision tags: llvmorg-3.6.0, llvmorg-3.6.0-rc4
# 2c79ad97 14-Feb-2015 Duncan P. N. Exon Smith <dexonsmith@apple.com>

Transforms: Canonicalize access to function attributes, NFC

Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
=> g

Transforms: Canonicalize access to function attributes, NFC

Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
=> getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
=> hasFnAttribute(Kind)

llvm-svn: 229202

show more ...


# 1fbc3165 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Concede defeat and disable the unroll analyzer for now.

The issues with the new unroll analyzer are more fundamental than code
cleanup, algorithm, or data structure changes. I've sent an em

[unroll] Concede defeat and disable the unroll analyzer for now.

The issues with the new unroll analyzer are more fundamental than code
cleanup, algorithm, or data structure changes. I've sent an email to the
original commit thread with details and a proposal for how to redesign
things. I'm disabling this for now so that we don't spend time
debugging issues with it in its current state.

llvm-svn: 229064

show more ...


# 6c03dff7 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Merge the simplification and DCE estimation methods on the
UnrollAnalyzer.

Now they share a single worklist and have less implicit state between
them. There was no real benefit to separatin

[unroll] Merge the simplification and DCE estimation methods on the
UnrollAnalyzer.

Now they share a single worklist and have less implicit state between
them. There was no real benefit to separating these two things out.

I'm going to subsequently refactor things to share even more code.

llvm-svn: 229062

show more ...


# d9591d89 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Remove pointless dyn_cast<>s to Instruction - the users of an
instruction must by definition be instructions.

llvm-svn: 229061


# 5457e20d 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Don't check the loop set for whether an instruction is
contained in it each time we try to add it to the worklist, just check
this when pulling it off the worklist. That way we do it at most

[unroll] Don't check the loop set for whether an instruction is
contained in it each time we try to add it to the worklist, just check
this when pulling it off the worklist. That way we do it at most once
per instruction with the cost of the worklist set we would need to pay
anyways.

llvm-svn: 229060

show more ...


# e5c30e4e 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Change the other worklist in the unroll analyzer to be a set
vector.

In addition to dramatically reducing the work required for contrived
example loops, this also has to correct some seriou

[unroll] Change the other worklist in the unroll analyzer to be a set
vector.

In addition to dramatically reducing the work required for contrived
example loops, this also has to correct some serious latent bugs in the
cost computation. Previously, we might add an instruction onto the
worklist once for every load which it used and was simplified. Then we
would visit it many times and accumulate "savings" each time.

I mean, fortunately this couldn't matter for things like calls with 100s
of operands, but even for binary operators this code seems like it must
be double counting the savings.

I just noticed this by inspection and due to the runtime problems it can
introduce, I don't have any test cases for cases where the cost produced
by this routine is unacceptable.

llvm-svn: 229059

show more ...


# 7824bc92 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Replace a boolean, for loop, condition, and break with
std::all_of and a lambda. Much cleaner, no functionality
changed.

llvm-svn: 229058


# 06d537cd 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Directly query for dead instructions.

In the unroll analyzer, it is checking each user to see if that user
will become dead. However, it first checked if that user was missing
from the simp

[unroll] Directly query for dead instructions.

In the unroll analyzer, it is checking each user to see if that user
will become dead. However, it first checked if that user was missing
from the simplified values map, and then if was also missing from the
dead instructions set. We add everything from the simplified values map
to the dead instructions set, so the first step is completely subsumed
by the second. Moreover, the first step requires *inserting* something
into the simplified value map which isn't what we want at all.

This also replaces a dyn_cast with a cast as an instruction cannot be
used by a non-instruction.

llvm-svn: 229057

show more ...


# 82cb30f1 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Replace a linear time check for no uses with a constant time
check.

Also hoist this into the enqueue process as it is faster even than
testing the worklist set, we should just directly filt

[unroll] Replace a linear time check for no uses with a constant time
check.

Also hoist this into the enqueue process as it is faster even than
testing the worklist set, we should just directly filter these out much
like we filter out constants and such.

llvm-svn: 229056

show more ...


# 3b057b32 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Rather than an operand set, use a setvector for the worklist.

We don't just want to handle duplicate operands within an instruction,
but also duplicates across operands of different instruc

[unroll] Rather than an operand set, use a setvector for the worklist.

We don't just want to handle duplicate operands within an instruction,
but also duplicates across operands of different instructions. I should
have gone straight to this, but I had convinced myself that it wasn't
going to be necessary briefly. I've come to my senses after chatting
more with Nick, and am now happier here.

llvm-svn: 229054

show more ...


# 17a0496b 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Extract the code to enqueue operansd for the worklist in the
unroll analysis into a lambda and call it. That's much simpler than
duplicating all the code.

llvm-svn: 229053


# 8c86375a 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Use a small set to de-duplicate operands prior to putting them
into the worklist. This avoids allocating lots of worklist memory for
them when there are large numbers of repeated operands.

[unroll] Use a small set to de-duplicate operands prior to putting them
into the worklist. This avoids allocating lots of worklist memory for
them when there are large numbers of repeated operands.

llvm-svn: 229052

show more ...


# 93063e61 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Make the unroll cost analysis terminate deterministically and
reasonably quickly.

I don't have a reduced test case, but for a version of FFMPEG, this
makes the loop unroller start finishing

[unroll] Make the unroll cost analysis terminate deterministically and
reasonably quickly.

I don't have a reduced test case, but for a version of FFMPEG, this
makes the loop unroller start finishing at all (after over 15 minutes of
running, it hadn't terminated for me, no idea if it was a true infloop
or just exponential work).

The key thing here is to check the DeadInstructions set when pulling
things off the worklist. Without this, we would re-walk the user list of
already dead instructions again and again and again. Consider phi nodes
with many, many operands and other patterns.

The other important aspect of this is that because we would keep
re-visiting instructions that were already known dead, we kept adding
their cost savings to this! This would cause our cost savings to be
*insanely* inflated from this.

While I was here, I also rotated the operand walk out of the worklist
loop to make the code easier to read. There is still work to be done to
minimize worklist traffic because we don't de-duplicate operands. This
means we may add the same instruction onto the worklist 1000s of times
if it shows up in 1000s of operansd to a PHI node for example.

Still, with this patch, the ffmpeg testcase I have finishes quickly and
I can't measure the runtime impact of the unroll analysis any more. I'll
probably try to do a few more cleanups to this code, but not sure how
much cleanup I can justify right now.

llvm-svn: 229038

show more ...


Revision tags: llvmorg-3.6.0-rc3
# dd6029fc 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Make range based for loops a bit more explicit and more
readable.

The biggest thing that was causing me problems is recognizing the
references vs. poniters here. I also found that for maps

[unroll] Make range based for loops a bit more explicit and more
readable.

The biggest thing that was causing me problems is recognizing the
references vs. poniters here. I also found that for maps naming the loop
variable as KeyValue helps make it obvious why you don't actually use it
directly. Finally, using 'auto' instead of 'User *' doesn't seem like
a good tradeoff. Much like with the other cases, I like to know its
a pointer, and 'User' is just as long and tells the reader a lot more.

llvm-svn: 229033

show more ...


# 415f4125 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Avoid the "Insn" abbreviation of Instruction. This is quite
hard to type and read for me, and is inconsistent with the other
abbreviation in the base class "Inst". For most of these (where t

[unroll] Avoid the "Insn" abbreviation of Instruction. This is quite
hard to type and read for me, and is inconsistent with the other
abbreviation in the base class "Inst". For most of these (where they are
used widely) I prefer just spelling it out as Instruction. I've changed
two of the short-lived variables to use "Inst" to match the base class.

llvm-svn: 229028

show more ...


# 302a133b 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Tidy up the integer we use to accumululate the number of
instructions optimized. NFC, just separating this out from the
functionality changing commit.

llvm-svn: 229026


# 10a9926a 13-Feb-2015 Chandler Carruth <chandlerc@gmail.com>

[unroll] Don't use a map from pointer to bool. Use a set.

This is much more efficient. In particular, the query with the user
instruction has to insert a false for every missing instruction into the

[unroll] Don't use a map from pointer to bool. Use a set.

This is much more efficient. In particular, the query with the user
instruction has to insert a false for every missing instruction into the
set. This is just a cleanup a long the way to fixing the underlying
algorithm problems here.

llvm-svn: 228994

show more ...


# 1b480197 13-Feb-2015 Michael Zolotukhin <mzolotukhin@apple.com>

Prevent division by 0.

When we try to estimate number of potentially removed instructions in
loop unroller, we analyze first N iterations and then scale the
computed number by TripCount/N. We should

Prevent division by 0.

When we try to estimate number of potentially removed instructions in
loop unroller, we analyze first N iterations and then scale the
computed number by TripCount/N. We should bail out early if N is 0.

llvm-svn: 228988

show more ...


1...<<1112131415