Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
f7685af4 |
| 05-Dec-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Move gep of phi fold into separate function
This makes sure that an early return during this fold doesn't end up skipping later gep folds.
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
38fffa63 |
| 06-Nov-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4 |
|
#
a1058776 |
| 21-Aug-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Remove some of the complexity-based canonicalization (#91185)
The idea behind this canonicalization is that it allows us to handle less
patterns, because we know that some will be can
[InstCombine] Remove some of the complexity-based canonicalization (#91185)
The idea behind this canonicalization is that it allows us to handle less
patterns, because we know that some will be canonicalized away. This is
indeed very useful to e.g. know that constants are always on the right.
However, this is only useful if the canonicalization is actually
reliable. This is the case for constants, but not for arguments: Moving
these to the right makes it look like the "more complex" expression is
guaranteed to be on the left, but this is not actually the case in
practice. It fails as soon as you replace the argument with another
instruction.
The end result is that it looks like things correctly work in tests,
while they actually don't. We use the "thwart complexity-based
canonicalization" trick to handle this in tests, but it's often a
challenge for new contributors to get this right, and based on the
regressions this PR originally exposed, we clearly don't get this right
in many cases.
For this reason, I think that it's better to remove this complexity
canonicalization. It will make it much easier to write tests for
commuted cases and make sure that they are handled.
show more ...
|
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1 |
|
#
90ba3309 |
| 24-Jan-2024 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes
[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes it easier for
optimizations to recognize that two GEPs are identical, because they
don't need to see past many different ways to express the same offset.
This is a first step towards
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699.
This is limited to constant GEPs only for now, as they have a clear
canonical form, while we're not yet sure how exactly to deal with
variable indices.
The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives
two representative examples of the kind of optimization improvement we
expect from this change. In the first test SimplifyCFG can now realize
that all switch branches are actually the same. In the second test it
can convert it into simple arithmetic. These are representative of
common optimization failures we see in Rust.
Fixes https://github.com/llvm/llvm-project/issues/69841.
show more ...
|
Revision tags: llvmorg-19-init |
|
#
f18536d6 |
| 01-Jan-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model address separately. (#72164)
Move vector pointer generation to a separate VPVectorPointerRecipe.
This untangles address computation from the memory recipes future
and is also needed
[VPlan] Model address separately. (#72164)
Move vector pointer generation to a separate VPVectorPointerRecipe.
This untangles address computation from the memory recipes future
and is also needed to enable explicit unrolling in VPlan.
https://github.com/llvm/llvm-project/pull/72164
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2 |
|
#
707359ec |
| 01-Aug-2023 |
Florian Hahn <flo@fhahn.com> |
Recommit "[LV] Re-use existing broadcast value for live-ins."
This reverts commit 245ec675a4e41f7ec24dfc998720bffdc46a6c53.
Recommits eea9258648ce with a fix to only erase the instruction from the
Recommit "[LV] Re-use existing broadcast value for live-ins."
This reverts commit 245ec675a4e41f7ec24dfc998720bffdc46a6c53.
Recommits eea9258648ce with a fix to only erase the instruction from the first part if it is defined outside the loop. This fixes a use-after-free error reported.
show more ...
|
#
d01aec4c |
| 01-Aug-2023 |
Nikita Popov <npopov@redhat.com> |
[InstCombine] Set dead phi inputs to poison in more cases
Set phi inputs to poison whenever we find a dead edge (either during initial worklist population or the main InstCombine run), instead of on
[InstCombine] Set dead phi inputs to poison in more cases
Set phi inputs to poison whenever we find a dead edge (either during initial worklist population or the main InstCombine run), instead of only doing this for successors of dead blocks.
This means that the phi operand is set to poison even if for critical edges without an intermediate block.
There are quite a few test changes, because the pattern is fairly common in vectorizer output, for cases where we know the vectorized loop will be entered.
show more ...
|
#
7c64449e |
| 01-Aug-2023 |
Nikita Popov <npopov@redhat.com> |
[LoopVectorize] Regenerate test checks (NFC)
To reduce spurious diffs in future changes.
|
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
245ec675 |
| 25-Jul-2023 |
Martin Storsjö <martin@martin.st> |
Revert "[LV] Re-use existing broadcast value for live-ins."
This reverts commit eea9258648ce73507f6f85c395de978af659d498.
That commit triggered crashes in the following testcase:
$ cat reduced.c t
Revert "[LV] Re-use existing broadcast value for live-ins."
This reverts commit eea9258648ce73507f6f85c395de978af659d498.
That commit triggered crashes in the following testcase:
$ cat reduced.c typedef struct { int a[8] } b; typedef struct { b *c; short d } e; void f() { int g; char *h; e *i = f; short j = i->d; int a = i->c->a[0]; for (;;) for (; g < a; g++) { *h = j * i->d >> 8; h++; } } $ clang -target aarch64-linux-gnu -w -c -O2 reduced.c
show more ...
|
#
eea92586 |
| 24-Jul-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Re-use existing broadcast value for live-ins.
When requesting a vector value for a live-in, we can re-use the broadcast of the live-in of part 0 for parts > 0.
|
#
bb3763e4 |
| 30-Jun-2023 |
Nikita Popov <nikita.ppv@gmail.com> |
Revert "[SimplifyCFG] Allow dropping block that only contains ephemeral values"
This reverts commit 20f0c68fd83a0147a8ec1722bd2e848180610288.
https://reviews.llvm.org/D153966#4464594 reports an opt
Revert "[SimplifyCFG] Allow dropping block that only contains ephemeral values"
This reverts commit 20f0c68fd83a0147a8ec1722bd2e848180610288.
https://reviews.llvm.org/D153966#4464594 reports an optimization regression in Rust.
Additionally this change has caused an unexpected 0.3% compile-time regression.
show more ...
|
#
20f0c68f |
| 28-Jun-2023 |
Nikita Popov <npopov@redhat.com> |
[SimplifyCFG] Allow dropping block that only contains ephemeral values
Perform the TryToSimplifyUncondBranchFromEmptyBlock() transform if the block is empty except for ephemeral values. The ephemera
[SimplifyCFG] Allow dropping block that only contains ephemeral values
Perform the TryToSimplifyUncondBranchFromEmptyBlock() transform if the block is empty except for ephemeral values. The ephemeral values will be dropped in that case.
This makes sure that assumes don't block this transforms, as reported in https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609.
Differential Revision: https://reviews.llvm.org/D153966
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
68469a80 |
| 06-Jan-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Disable runtime unrolling for vectorized loops.
This patch adds metadata to disable runtime unrolling to the vectorized loop. If runtime unrolling/interleaving is considered profitable, LV will
[LV] Disable runtime unrolling for vectorized loops.
This patch adds metadata to disable runtime unrolling to the vectorized loop. If runtime unrolling/interleaving is considered profitable, LV will interleave the loop directly. There should be no need to perform runtime unrolling at a later stage.
Note that we already add metadata to disable runtime unrolling to the scalar loop after vectorization.
The additional unrolling unnecessarily increases code size and compile time. In addition to that we have several bug reports of unncessary runtime unrolling for vectorized loops, e.g. PR40961
Compile-time improvements:
NewPM-O3: -1.04% NewPM-ReleaseThinLTO: -0.59% NewPM-ReleaseLTO-g: -0.97%
https://llvm-compile-time-tracker.com/compare.php?from=ce1be13a868d0f8afa367975558c1a6175cce33a&to=78bc2e67f22e9e10e61cdb6cdac4bb857d95eb1b&stat=instructions:u
Fixes #40306.
Reviewed By: lebedev.ri, nikic
Differential Revision: https://reviews.llvm.org/D115261
show more ...
|
#
2fab9275 |
| 04-Jan-2023 |
Nikita Popov <npopov@redhat.com> |
[LoopVectorize] Convert some tests to opaque pointers (NFC)
Check lines for some of these tests were regenerated. The difference is that with opaque pointers SCEVExpander always emits i8 GEPs, makin
[LoopVectorize] Convert some tests to opaque pointers (NFC)
Check lines for some of these tests were regenerated. The difference is that with opaque pointers SCEVExpander always emits i8 GEPs, making the address calculation explicit. This is a known problem that will be solved long term by making all address calculations explicit.
show more ...
|
#
05dbdb00 |
| 08-Dec-2022 |
Sanjay Patel <spatel@rotateright.com> |
Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)"
This reverts commit e71b81cab09bf33e3b08ed600418b72cc4117461.
As discussed in the planned follow-on to this
Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)"
This reverts commit e71b81cab09bf33e3b08ed600418b72cc4117461.
As discussed in the planned follow-on to this patch (D138874), this and the subsequent patches in this set can cause trouble for the backend, and there's probably no quick fix. We may even want to canonicalize in the opposite direction (towards insertelt).
show more ...
|
#
be51fa45 |
| 05-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC] Port all runlines for LoopVectorize pass tests to -passes syntax
|
#
e71b81ca |
| 30-Nov-2022 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)
The first attempt was reverted because a clang test changed unexpectedly - the file is already marked with a FIXME, s
[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)
The first attempt was reverted because a clang test changed unexpectedly - the file is already marked with a FIXME, so I just updated it this time to pass.
Original commit message: This is the main patch for converting a truncated scalar that is inserted into a vector to bitcast+shuffle. We could go either way on patterns like this, but this direction will allow collapsing a pair of these sequences on the motivating example from issue
The patch is split into 3 parts to make it easier to see the progression of tests diffs. We allow inserting/shuffling into a different size vector for flexibility, so there are several test variations. The length-changing is handled by shortening/padding the shuffle mask with undef elements.
In part 1, handle the basic pattern: inselt undef, (trunc T), IndexC --> shuffle (bitcast T), IdentityMask
Proof for the endian-dependency behaving as expected: https://alive2.llvm.org/ce/z/BsA7yC
The TODO items for handling shifts and insert into an arbitrary base vector value are implemented as follow-ups.
Differential Revision: https://reviews.llvm.org/D138872
show more ...
|
#
5eacdcff |
| 30-Nov-2022 |
Sanjay Patel <spatel@rotateright.com> |
Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1"
This reverts commit a4c466766db77cd1fb42d7f98f32bb87a3d38829. This broke clang tests that are wrongly dependent on the
Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1"
This reverts commit a4c466766db77cd1fb42d7f98f32bb87a3d38829. This broke clang tests that are wrongly dependent on the optimizer.
show more ...
|
#
a4c46676 |
| 30-Nov-2022 |
Sanjay Patel <spatel@rotateright.com> |
[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1
This is the main patch for converting a truncated scalar that is inserted into a vector to bitcast+shuffle. We could go either
[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1
This is the main patch for converting a truncated scalar that is inserted into a vector to bitcast+shuffle. We could go either way on patterns like this, but this direction will allow collapsing a pair of these sequences on the motivating example from issue
The patch is split into 3 parts to make it easier to see the progression of tests diffs. We allow inserting/shuffling into a different size vector for flexibility, so there are several test variations. The length-changing is handled by shortening/padding the shuffle mask with undef elements.
In part 1, handle the basic pattern: inselt undef, (trunc T), IndexC --> shuffle (bitcast T), IdentityMask
Proof for the endian-dependency behaving as expected: https://alive2.llvm.org/ce/z/BsA7yC
The TODO items for handling shifts and insert into an arbitrary base vector value are implemented as follow-ups.
Differential Revision: https://reviews.llvm.org/D138872
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
#
bbef90ac |
| 15-Sep-2022 |
Vitaly Buka <vitalybuka@google.com> |
[IRBuilder] Use PoisonValue in CreateMasked*
Followup to 72b776168c7c80d2035c7226488462dcffc97e75
Reviewed By: nlopes
Differential Revision: https://reviews.llvm.org/D133967
|
Revision tags: llvmorg-15.0.0 |
|
#
4c106463 |
| 29-Aug-2022 |
Philip Reames <preames@rivosinc.com> |
[LV] Refresh autogen tests to reflect naming changes [nfc]
Purely so that these can be easily autogened without spurious diffs
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
438ffdb8 |
| 07-Jul-2022 |
David Green <david.green@arm.com> |
[ARM] Switch the costs of mve1beat and mve4beat
These three subtarget features are meant to control where MVE instructions take 1 vs 2 vs 4 architectural beats. The mve1beat feature is described as
[ARM] Switch the costs of mve1beat and mve4beat
These three subtarget features are meant to control where MVE instructions take 1 vs 2 vs 4 architectural beats. The mve1beat feature is described as "Model MVE instructions as a 1 beat per tick architecture", meaning MVE instruction will execute over 4 cycles. mve4beat is the opposite where the entire 4 beats of the MVE instruction execute in a single cycle. The costs for the two were backwards though, not matching the cycle counts like they should. This patch switches the costs on the two to bring them in-line with expectations.
Differential Revision: https://reviews.llvm.org/D129141
show more ...
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
65c4d619 |
| 05-Jan-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Add VPCanonicalIVPHIRecipe, partly retire createInductionVariable.
At the moment, the primary induction variable for the vector loop is created as part of the skeleton creation. This is tied
[VPlan] Add VPCanonicalIVPHIRecipe, partly retire createInductionVariable.
At the moment, the primary induction variable for the vector loop is created as part of the skeleton creation. This is tied to creating the vector loop latch outside of VPlan. This prevents from modeling the *whole* vector loop in VPlan, which in turn is required to model preheader and exit blocks in VPlan as well.
This patch introduces a new recipe VPCanonicalIVPHIRecipe to represent the primary IV in VPlan and CanonicalIVIncrement{NUW} opcodes for VPInstruction to model the increment.
This allows us to partly retire createInductionVariable. At the moment, a bit of patching up is done after executing all blocks in the plan.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D113223
show more ...
|
#
e6ad9ef4 |
| 14-Dec-2021 |
Philip Reames <listmail@philipreames.com> |
[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement
The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transform
[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement
The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transforms are inconsistent about which types we end up with based on visit order.
I'm restricting this to constants as for non-constants, we'd have to decide whether the simplicity was worth extra instructions. For constants, there are no extra instructions.
We chose the canonical type as i64 arbitrarily. We might consider changing this to something else in the future if we have cause.
Differential Revision: https://reviews.llvm.org/D115387
show more ...
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
#
23c2f2e6 |
| 07-Jun-2021 |
Florian Hahn <flo@fhahn.com> |
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If th
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW.
This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling.
At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything.
Note that this could probably be further improved by using information from the original IV.
Attempt of modeling of the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g
Part of a set of fixes required for PR50412.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D103255
show more ...
|