Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
abb9f9fa |
| 21-Nov-2024 |
Lee Wei <lee10202013@gmail.com> |
[llvm] Remove `br i1 undef` from some regression tests [NFC] (#117112)
This PR removes tests with `br i1 undef` under
`llvm/tests/Transforms/Loop*, Lower*`.
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
2f7ccaf4 |
| 28-Sep-2024 |
Florian Hahn <flo@fhahn.com> |
[SCEV] Add predicate in SolveLinEq to ensure B is a multiple of A. (#108777)
This can help in cases where pointer alignment info is missing, e.g.
https://github.com/llvm/llvm-project/pull/108210
[SCEV] Add predicate in SolveLinEq to ensure B is a multiple of A. (#108777)
This can help in cases where pointer alignment info is missing, e.g.
https://github.com/llvm/llvm-project/pull/108210
The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.
PR: https://github.com/llvm/llvm-project/pull/108777
show more ...
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
37c736e0 |
| 25-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[LoopUnroll] Use poison instead of undef for another preheader value
|
#
eeb0884e |
| 25-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[LoopUnroll] Use poison instead of undef for preheader value
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
b9808e56 |
| 14-Jun-2023 |
Nikita Popov <npopov@redhat.com> |
[LoopUnroll] Fold add chains during unrolling
Loop unrolling tends to produce chains of `%x1 = add %x0, 1; %x2 = add %x1, 1; ...` with one add per unrolled iteration. This patch simplifies these add
[LoopUnroll] Fold add chains during unrolling
Loop unrolling tends to produce chains of `%x1 = add %x0, 1; %x2 = add %x1, 1; ...` with one add per unrolled iteration. This patch simplifies these adds to `%xN = add %x0, N` directly during unrolling, rather than waiting for InstCombine to do so.
The motivation for this is that having a single add (rather than an add chain) on the induction variable makes it a simple recurrence, which we specially recognize in a number of places. This allows InstCombine to directly perform folds with that knowledge, instead of first folding the add chains, and then doing other folds in another InstCombine iteration.
Due to the reduced number of InstCombine iterations, this also results in a small compile-time improvement.
Differential Revision: https://reviews.llvm.org/D153540
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
ef992b60 |
| 23-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[LoopUnroll] Convert some tests to opaque pointers (NFC)
|
#
3a8e009f |
| 20-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
Revert "Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block""
One of these two changes is exposing (or causing) some more miscompiles. A re
Revert "Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block""
One of these two changes is exposing (or causing) some more miscompiles. A reproducer is in progress, so reverting until resolved.
This reverts commit 428f36401b1b695fd501ebfdc8773bed8ced8d4e.
show more ...
|
#
428f3640 |
| 17-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"
This reverts commit 37b8f09a4b61bf9bf9d0b9017d790c8b82be2e17, and returns commit 1bd0b
Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"
This reverts commit 37b8f09a4b61bf9bf9d0b9017d790c8b82be2e17, and returns commit 1bd0b82e508d049efdb07f4f8a342f35818df341. The miscompile was in InstCombine, and it has been addressed.
This tries to approach the problem noted by @arsenm: terrible codegen for `__builtin_fpclassify()`: https://godbolt.org/z/388zqdE37
Just because the PHI in the common successor happens to have different incoming values for these two blocks, doesn't mean we have to give up. It's quite easy to deal with this, we just need to produce a select: https://alive2.llvm.org/ce/z/000srb
Now, the cost model for this transform is rather overly strict, so this will basically never fire. We tally all (over all preds) the selects needed to the NumBonusInsts
Differential Revision: https://reviews.llvm.org/D139275
show more ...
|
#
37b8f09a |
| 16-Dec-2022 |
Alexander Kornienko <alexfh@google.com> |
Revert "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"
This reverts commit 1bd0b82e508d049efdb07f4f8a342f35818df341, since it leads to miscom
Revert "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"
This reverts commit 1bd0b82e508d049efdb07f4f8a342f35818df341, since it leads to miscompiles. See https://reviews.llvm.org/D139275#3993229 and https://reviews.llvm.org/D139275#4001580.
show more ...
|
#
1bd0b82e |
| 12-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block
This tries to approach the problem noted by @arsenm: terrible codegen for `__builtin_fpclassify
[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block
This tries to approach the problem noted by @arsenm: terrible codegen for `__builtin_fpclassify()`: https://godbolt.org/z/388zqdE37
Just because the PHI in the common successor happens to have different incoming values for these two blocks, doesn't mean we have to give up. It's quite easy to deal with this, we just need to produce a select: https://alive2.llvm.org/ce/z/000srb
Now, the cost model for this transform is rather overly strict, so this will basically never fire. We tally all (over all preds) the selects needed to the NumBonusInsts
Differential Revision: https://reviews.llvm.org/D139275
show more ...
|
#
5103ef64 |
| 07-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC] Port all (but one) LoopUnroll tests to `-passes=` syntax
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3 |
|
#
ebabd6bf |
| 16-Aug-2022 |
Max Kazantsev <mkazantsev@azul.com> |
Return "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 354fa0b48008eca701a110badd6974bf449df257.
Returning as is. The patch was reverted due to a miscompile, but this patch i
Return "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 354fa0b48008eca701a110badd6974bf449df257.
Returning as is. The patch was reverted due to a miscompile, but this patch is not causing it. This patch made it possible to infer some nuw flags in code guarded by `false` condition, and then someone else to managed to propagate the flag from dead code outside.
Returning the patch to be able to reproduce the issue.
show more ...
|
#
354fa0b4 |
| 15-Aug-2022 |
Max Kazantsev <mkazantsev@azul.com> |
Revert "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 34ae308c73e4d76dbdab25a6206d3fbc5ebafdf5.
Our internal testing found a miscompile. Not sure if it's caused by this patc
Revert "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 34ae308c73e4d76dbdab25a6206d3fbc5ebafdf5.
Our internal testing found a miscompile. Not sure if it's caused by this patch or it revealed something else. Reverting while investigating.
show more ...
|
Revision tags: llvmorg-15.0.0-rc2 |
|
#
34ae308c |
| 03-Aug-2022 |
Max Kazantsev <mkazantsev@azul.com> |
[SCEV] Use context to strengthen flags of BinOps
Sometimes SCEV cannot infer nuw/nsw from something as simple as ``` len in [0, MAX_INT] ... iv = phi(0, iv.next) guard(iv <s len) guard(iv <u
[SCEV] Use context to strengthen flags of BinOps
Sometimes SCEV cannot infer nuw/nsw from something as simple as ``` len in [0, MAX_INT] ... iv = phi(0, iv.next) guard(iv <s len) guard(iv <u len) iv.next = iv + 1 ``` just because flag strenthening only relies on definition and does not use local facts. This patch adds support for the simplest case: inference of flags of `add(x, constant)` if we can contextually prove that `x <= max_int - constant`.
In case if it has negative CT impact, we can add an option to switch it off. I woudln't expect that though.
Differential Revision: https://reviews.llvm.org/D129643 Reviewed By: apilipenko
show more ...
|
Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
81c648a3 |
| 18-May-2022 |
Nikita Popov <npopov@redhat.com> |
[LoopUnroll] Freeze tripcount rather than condition
This is a followup to D125754. We introduce two branches, one before the unrolled loop and one before the epilogue (and similar for the prologue c
[LoopUnroll] Freeze tripcount rather than condition
This is a followup to D125754. We introduce two branches, one before the unrolled loop and one before the epilogue (and similar for the prologue case). The previous patch only froze the condition on the first branch.
Rather than independently freezing the second condition, this patch instead freezes TripCount and bases BECount on it. These are the two quantities involved in the conditions, and this ensures that both work on a consistent, non-poisonous trip count.
Differential Revision: https://reviews.llvm.org/D125896
show more ...
|
#
323514de |
| 17-May-2022 |
Nikita Popov <npopov@redhat.com> |
[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits
When performing runtime unrolling with multiple exits, one of the earlier (non-latch) exits may exit the loop on the first
[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits
When performing runtime unrolling with multiple exits, one of the earlier (non-latch) exits may exit the loop on the first iteration, such that we never branch on the latch exit condition. As such, we need to freeze the condition of the new branch that is introduced before the loop, as it now executes unconditionally.
Differential Revision: https://reviews.llvm.org/D125754
show more ...
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
72031407 |
| 03-Jan-2022 |
Philip Reames <listmail@philipreames.com> |
Revert "[unroll] Prune all but first copy of invariant exit"
This reverts commit 9bd22595bad36cd19f5e7ae18ccd9f41cba29dc5.
Seeing some bot failures which look plausibly connected. Revert while inv
Revert "[unroll] Prune all but first copy of invariant exit"
This reverts commit 9bd22595bad36cd19f5e7ae18ccd9f41cba29dc5.
Seeing some bot failures which look plausibly connected. Revert while investigating/waiting for bots to stablize.
e.g. https://lab.llvm.org/buildbot#builders/36/builds/15933
show more ...
|
#
9bd22595 |
| 03-Jan-2022 |
Philip Reames <listmail@philipreames.com> |
[unroll] Prune all but first copy of invariant exit
If we have an exit which is controlled by a loop invariant condition and which dominates the latch, we know only the copy in the first unrolled it
[unroll] Prune all but first copy of invariant exit
If we have an exit which is controlled by a loop invariant condition and which dominates the latch, we know only the copy in the first unrolled iteration can be taken. All other copies are dead.
The change itself is pretty straight forward, but let me add two points of context: * I'd have expected other transform passes to catch this after unrolling, but I'm seeing multiple examples where we get to the end of O2/O3 without simplifying. * I'd like to do a stronger change which did CSE during unroll and accounted for invariant expressions (as defined by SCEV instead of trivial ones from LoopInfo), but that doesn't fit cleanly into the current code structure.
Differential Revision: https://reviews.llvm.org/D116496
show more ...
|
#
8906a0fe |
| 29-Nov-2021 |
Philip Reames <listmail@philipreames.com> |
[SCEVExpander] Drop poison generating flags when reusing instructions
The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple
[SCEVExpander] Drop poison generating flags when reusing instructions
The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple such instructions (potentially with different flags), this is analogous to our need to drop flags when performing CSE. A trivial implementation would simply drop flags on any instruction we decided to reuse, and that would be correct.
This patch is almost that trivial patch except that we preserve flags on the reused instruction when existing users would imply UB on overflow already. Adding new users can, at most, refine this program to one which doesn't execute UB which is valid.
In practice, this fixes two conceptual problems with the previous code: 1) a binop could have been canonicalized into a form with different opcode or operands, or 2) the inbounds GEP case which was simply unhandled.
On the test changes, most are pretty straight forward. We loose some flags (in some cases, they'd have been dropped on the next CSE pass anyways). The one that took me the longest to understand was the ashr-expansion test. What's happening there is that we're considering reuse of the mul, previously we disallowed it entirely, now we allow it with no flags. The surrounding diffs are all effects of generating the same mul with a different operand order, and then doing simple DCE.
The loss of the inbounds is unfortunate, but even there, we can recover most of those once we actually treat branch-on-poison as immediate UB.
Differential Revision: https://reviews.llvm.org/D112734
show more ...
|
Revision tags: llvmorg-13.0.1-rc1 |
|
#
da327e72 |
| 15-Nov-2021 |
Philip Reames <listmail@philipreames.com> |
Fix a misleading FIXME in an unroll test
|
#
37ead201 |
| 12-Nov-2021 |
Philip Reames <listmail@philipreames.com> |
[runtime-unroll] Use incrementing IVs instead of decrementing ones
This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the
[runtime-unroll] Use incrementing IVs instead of decrementing ones
This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing.
Why does this matter? A couple of reasons: * SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B and drops any flags invalidated by such. As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.) * Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this looking at nearby phis in the test cases.)
Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.
* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simple use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.
show more ...
|
#
de2fed61 |
| 12-Nov-2021 |
Philip Reames <listmail@philipreames.com> |
[unroll] Keep unrolled iterations with initial iteration
The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is
[unroll] Keep unrolled iterations with initial iteration
The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is that the new iterations are placed far from the initial iteration.
With unrolling, the general assumption is that the a) the loop is reasonable hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting. As such, placing Count-1 copies out of line is a fairly poor code placement choice. We'd much rather fall through into the hot (non-exiting) path. For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.
However, the real motivation for this change isn't performance. Its readability and human understanding. Having to jump around long distances in an IR file to trace an unrolled loop structure is error prone and tedious.
show more ...
|
#
e01c91f2 |
| 12-Nov-2021 |
Philip Reames <listmail@philipreames.com> |
[tests] Add coverage for cases we can prune exits when runtlme unrolling
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
#
fa82a3d0 |
| 02-Sep-2021 |
Philip Reames <listmail@philipreames.com> |
[runtimeunroll] Support epilogue unrolling with a parent loop
This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the
[runtimeunroll] Support epilogue unrolling with a parent loop
This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the inner loop could be outside the outer loop. When we clone the inner loop and split the latch exit, the cloned blocks need to be in the outer loop.
Differential Revision: https://reviews.llvm.org/D108476
show more ...
|
#
b604fcb7 |
| 31-Aug-2021 |
Philip Reames <listmail@philipreames.com> |
[runtime] Move prolog/epilog block to a post-simplify strategy
The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one
[runtime] Move prolog/epilog block to a post-simplify strategy
The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one iteration. The old implementation did this by avoiding loop construction entirely. This patches instead constructs the trivial loop and then explicitly breaks the backedge and simplifies. This does result in some additional code churn when triggered, but a) results in better quality code and b) removes a codepath which didn't work properly for multiple exit epilogs.
One oddity that I want to draw to reviewer attention is that this somehow changes revisit order. The new order looks equivalent to me, but I don't understand how creating and erasing an extra loop here creates this effect.
Differential Revision: https://reviews.llvm.org/D108521
show more ...
|