History log of /llvm-project/llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll (Results 1 – 25 of 44)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5
# abb9f9fa 21-Nov-2024 Lee Wei <lee10202013@gmail.com>

[llvm] Remove `br i1 undef` from some regression tests [NFC] (#117112)

This PR removes tests with `br i1 undef` under
`llvm/tests/Transforms/Loop*, Lower*`.


Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# 2f7ccaf4 28-Sep-2024 Florian Hahn <flo@fhahn.com>

[SCEV] Add predicate in SolveLinEq to ensure B is a multiple of A. (#108777)

This can help in cases where pointer alignment info is missing, e.g.
https://github.com/llvm/llvm-project/pull/108210

[SCEV] Add predicate in SolveLinEq to ensure B is a multiple of A. (#108777)

This can help in cases where pointer alignment info is missing, e.g.
https://github.com/llvm/llvm-project/pull/108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.


PR: https://github.com/llvm/llvm-project/pull/108777

show more ...


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 37c736e0 25-Jun-2024 Nikita Popov <npopov@redhat.com>

[LoopUnroll] Use poison instead of undef for another preheader value


# eeb0884e 25-Jun-2024 Nikita Popov <npopov@redhat.com>

[LoopUnroll] Use poison instead of undef for preheader value


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# b9808e56 14-Jun-2023 Nikita Popov <npopov@redhat.com>

[LoopUnroll] Fold add chains during unrolling

Loop unrolling tends to produce chains of
`%x1 = add %x0, 1; %x2 = add %x1, 1; ...` with one add per unrolled
iteration. This patch simplifies these add

[LoopUnroll] Fold add chains during unrolling

Loop unrolling tends to produce chains of
`%x1 = add %x0, 1; %x2 = add %x1, 1; ...` with one add per unrolled
iteration. This patch simplifies these adds to `%xN = add %x0, N`
directly during unrolling, rather than waiting for InstCombine to do so.

The motivation for this is that having a single add (rather than
an add chain) on the induction variable makes it a simple recurrence,
which we specially recognize in a number of places. This allows
InstCombine to directly perform folds with that knowledge, instead
of first folding the add chains, and then doing other folds in another
InstCombine iteration.

Due to the reduced number of InstCombine iterations, this also
results in a small compile-time improvement.

Differential Revision: https://reviews.llvm.org/D153540

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# ef992b60 23-Dec-2022 Nikita Popov <npopov@redhat.com>

[LoopUnroll] Convert some tests to opaque pointers (NFC)


# 3a8e009f 20-Dec-2022 Roman Lebedev <lebedev.ri@gmail.com>

Revert "Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block""

One of these two changes is exposing (or causing) some more miscompiles.
A re

Revert "Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block""

One of these two changes is exposing (or causing) some more miscompiles.
A reproducer is in progress, so reverting until resolved.

This reverts commit 428f36401b1b695fd501ebfdc8773bed8ced8d4e.

show more ...


# 428f3640 17-Dec-2022 Roman Lebedev <lebedev.ri@gmail.com>

Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"

This reverts commit 37b8f09a4b61bf9bf9d0b9017d790c8b82be2e17,
and returns commit 1bd0b

Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"

This reverts commit 37b8f09a4b61bf9bf9d0b9017d790c8b82be2e17,
and returns commit 1bd0b82e508d049efdb07f4f8a342f35818df341.
The miscompile was in InstCombine, and it has been addressed.

This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify()`:
https://godbolt.org/z/388zqdE37

Just because the PHI in the common successor happens to have different
incoming values for these two blocks, doesn't mean we have to give up.
It's quite easy to deal with this, we just need to produce a select:
https://alive2.llvm.org/ce/z/000srb

Now, the cost model for this transform is rather overly strict,
so this will basically never fire. We tally all (over all preds)
the selects needed to the NumBonusInsts

Differential Revision: https://reviews.llvm.org/D139275

show more ...


# 37b8f09a 16-Dec-2022 Alexander Kornienko <alexfh@google.com>

Revert "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"

This reverts commit 1bd0b82e508d049efdb07f4f8a342f35818df341, since it leads to
miscom

Revert "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"

This reverts commit 1bd0b82e508d049efdb07f4f8a342f35818df341, since it leads to
miscompiles. See https://reviews.llvm.org/D139275#3993229 and
https://reviews.llvm.org/D139275#4001580.

show more ...


# 1bd0b82e 12-Dec-2022 Roman Lebedev <lebedev.ri@gmail.com>

[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block

This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify

[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block

This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify()`:
https://godbolt.org/z/388zqdE37

Just because the PHI in the common successor happens to have different
incoming values for these two blocks, doesn't mean we have to give up.
It's quite easy to deal with this, we just need to produce a select:
https://alive2.llvm.org/ce/z/000srb

Now, the cost model for this transform is rather overly strict,
so this will basically never fire. We tally all (over all preds)
the selects needed to the NumBonusInsts

Differential Revision: https://reviews.llvm.org/D139275

show more ...


# 5103ef64 07-Dec-2022 Roman Lebedev <lebedev.ri@gmail.com>

[NFC] Port all (but one) LoopUnroll tests to `-passes=` syntax


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3
# ebabd6bf 16-Aug-2022 Max Kazantsev <mkazantsev@azul.com>

Return "[SCEV] Use context to strengthen flags of BinOps"

This reverts commit 354fa0b48008eca701a110badd6974bf449df257.

Returning as is. The patch was reverted due to a miscompile, but
this patch i

Return "[SCEV] Use context to strengthen flags of BinOps"

This reverts commit 354fa0b48008eca701a110badd6974bf449df257.

Returning as is. The patch was reverted due to a miscompile, but
this patch is not causing it. This patch made it possible to infer
some nuw flags in code guarded by `false` condition, and then someone
else to managed to propagate the flag from dead code outside.

Returning the patch to be able to reproduce the issue.

show more ...


# 354fa0b4 15-Aug-2022 Max Kazantsev <mkazantsev@azul.com>

Revert "[SCEV] Use context to strengthen flags of BinOps"

This reverts commit 34ae308c73e4d76dbdab25a6206d3fbc5ebafdf5.

Our internal testing found a miscompile. Not sure if it's caused by
this patc

Revert "[SCEV] Use context to strengthen flags of BinOps"

This reverts commit 34ae308c73e4d76dbdab25a6206d3fbc5ebafdf5.

Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.

show more ...


Revision tags: llvmorg-15.0.0-rc2
# 34ae308c 03-Aug-2022 Max Kazantsev <mkazantsev@azul.com>

[SCEV] Use context to strengthen flags of BinOps

Sometimes SCEV cannot infer nuw/nsw from something as simple as
```
len in [0, MAX_INT]
...
iv = phi(0, iv.next)
guard(iv <s len)
guard(iv <u

[SCEV] Use context to strengthen flags of BinOps

Sometimes SCEV cannot infer nuw/nsw from something as simple as
```
len in [0, MAX_INT]
...
iv = phi(0, iv.next)
guard(iv <s len)
guard(iv <u len)
iv.next = iv + 1
```
just because flag strenthening only relies on definition and does not use local facts.
This patch adds support for the simplest case: inference of flags of `add(x, constant)`
if we can contextually prove that `x <= max_int - constant`.

In case if it has negative CT impact, we can add an option to switch it off. I woudln't
expect that though.

Differential Revision: https://reviews.llvm.org/D129643
Reviewed By: apilipenko

show more ...


Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4
# 81c648a3 18-May-2022 Nikita Popov <npopov@redhat.com>

[LoopUnroll] Freeze tripcount rather than condition

This is a followup to D125754. We introduce two branches, one
before the unrolled loop and one before the epilogue (and similar
for the prologue c

[LoopUnroll] Freeze tripcount rather than condition

This is a followup to D125754. We introduce two branches, one
before the unrolled loop and one before the epilogue (and similar
for the prologue case). The previous patch only froze the
condition on the first branch.

Rather than independently freezing the second condition, this patch
instead freezes TripCount and bases BECount on it. These are the
two quantities involved in the conditions, and this ensures that
both work on a consistent, non-poisonous trip count.

Differential Revision: https://reviews.llvm.org/D125896

show more ...


# 323514de 17-May-2022 Nikita Popov <npopov@redhat.com>

[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits

When performing runtime unrolling with multiple exits, one of the
earlier (non-latch) exits may exit the loop on the first

[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits

When performing runtime unrolling with multiple exits, one of the
earlier (non-latch) exits may exit the loop on the first iteration,
such that we never branch on the latch exit condition. As such, we
need to freeze the condition of the new branch that is introduced
before the loop, as it now executes unconditionally.

Differential Revision: https://reviews.llvm.org/D125754

show more ...


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 72031407 03-Jan-2022 Philip Reames <listmail@philipreames.com>

Revert "[unroll] Prune all but first copy of invariant exit"

This reverts commit 9bd22595bad36cd19f5e7ae18ccd9f41cba29dc5.

Seeing some bot failures which look plausibly connected. Revert while inv

Revert "[unroll] Prune all but first copy of invariant exit"

This reverts commit 9bd22595bad36cd19f5e7ae18ccd9f41cba29dc5.

Seeing some bot failures which look plausibly connected. Revert while investigating/waiting for bots to stablize.

e.g. https://lab.llvm.org/buildbot#builders/36/builds/15933

show more ...


# 9bd22595 03-Jan-2022 Philip Reames <listmail@philipreames.com>

[unroll] Prune all but first copy of invariant exit

If we have an exit which is controlled by a loop invariant condition and which dominates the latch, we know only the copy in the first unrolled it

[unroll] Prune all but first copy of invariant exit

If we have an exit which is controlled by a loop invariant condition and which dominates the latch, we know only the copy in the first unrolled iteration can be taken. All other copies are dead.

The change itself is pretty straight forward, but let me add two points of context:
* I'd have expected other transform passes to catch this after unrolling, but I'm seeing multiple examples where we get to the end of O2/O3 without simplifying.
* I'd like to do a stronger change which did CSE during unroll and accounted for invariant expressions (as defined by SCEV instead of trivial ones from LoopInfo), but that doesn't fit cleanly into the current code structure.

Differential Revision: https://reviews.llvm.org/D116496

show more ...


# 8906a0fe 29-Nov-2021 Philip Reames <listmail@philipreames.com>

[SCEVExpander] Drop poison generating flags when reusing instructions

The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple

[SCEVExpander] Drop poison generating flags when reusing instructions

The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple such instructions (potentially with different flags), this is analogous to our need to drop flags when performing CSE. A trivial implementation would simply drop flags on any instruction we decided to reuse, and that would be correct.

This patch is almost that trivial patch except that we preserve flags on the reused instruction when existing users would imply UB on overflow already. Adding new users can, at most, refine this program to one which doesn't execute UB which is valid.

In practice, this fixes two conceptual problems with the previous code: 1) a binop could have been canonicalized into a form with different opcode or operands, or 2) the inbounds GEP case which was simply unhandled.

On the test changes, most are pretty straight forward. We loose some flags (in some cases, they'd have been dropped on the next CSE pass anyways). The one that took me the longest to understand was the ashr-expansion test. What's happening there is that we're considering reuse of the mul, previously we disallowed it entirely, now we allow it with no flags. The surrounding diffs are all effects of generating the same mul with a different operand order, and then doing simple DCE.

The loss of the inbounds is unfortunate, but even there, we can recover most of those once we actually treat branch-on-poison as immediate UB.

Differential Revision: https://reviews.llvm.org/D112734

show more ...


Revision tags: llvmorg-13.0.1-rc1
# da327e72 15-Nov-2021 Philip Reames <listmail@philipreames.com>

Fix a misleading FIXME in an unroll test


# 37ead201 12-Nov-2021 Philip Reames <listmail@philipreames.com>

[runtime-unroll] Use incrementing IVs instead of decrementing ones

This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the

[runtime-unroll] Use incrementing IVs instead of decrementing ones

This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing.

Why does this matter? A couple of reasons:
* SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B and drops any flags invalidated by such. As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.)
* Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this looking at nearby phis in the test cases.)

Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.

* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simple use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.

show more ...


# de2fed61 12-Nov-2021 Philip Reames <listmail@philipreames.com>

[unroll] Keep unrolled iterations with initial iteration

The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is

[unroll] Keep unrolled iterations with initial iteration

The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is that the new iterations are placed far from the initial iteration.

With unrolling, the general assumption is that the a) the loop is reasonable hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting. As such, placing Count-1 copies out of line is a fairly poor code placement choice. We'd much rather fall through into the hot (non-exiting) path. For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.

However, the real motivation for this change isn't performance. Its readability and human understanding. Having to jump around long distances in an IR file to trace an unrolled loop structure is error prone and tedious.

show more ...


# e01c91f2 12-Nov-2021 Philip Reames <listmail@philipreames.com>

[tests] Add coverage for cases we can prune exits when runtlme unrolling


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# fa82a3d0 02-Sep-2021 Philip Reames <listmail@philipreames.com>

[runtimeunroll] Support epilogue unrolling with a parent loop

This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the

[runtimeunroll] Support epilogue unrolling with a parent loop

This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the inner loop could be outside the outer loop. When we clone the inner loop and split the latch exit, the cloned blocks need to be in the outer loop.

Differential Revision: https://reviews.llvm.org/D108476

show more ...


# b604fcb7 31-Aug-2021 Philip Reames <listmail@philipreames.com>

[runtime] Move prolog/epilog block to a post-simplify strategy

The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one

[runtime] Move prolog/epilog block to a post-simplify strategy

The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one iteration. The old implementation did this by avoiding loop construction entirely. This patches instead constructs the trivial loop and then explicitly breaks the backedge and simplifies. This does result in some additional code churn when triggered, but a) results in better quality code and b) removes a codepath which didn't work properly for multiple exit epilogs.

One oddity that I want to draw to reviewer attention is that this somehow changes revisit order. The new order looks equivalent to me, but I don't understand how creating and erasing an extra loop here creates this effect.

Differential Revision: https://reviews.llvm.org/D108521

show more ...


12