Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
56c091ea |
| 21-Nov-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][IR] Use splat syntax when printing ConstantExpr based splats. (#116856)
This brings the printing of scalable vector constant splats inline with
their fixed length counterparts.
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
99d6c6d9 |
| 05-Jul-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.
Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.
After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.
This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.
Depends on https://github.com/llvm/llvm-project/pull/92525
PR: https://github.com/llvm/llvm-project/pull/92651
show more ...
|
#
3808ba78 |
| 20-Jun-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle bloc
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.
Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and in between the compare
to used by the terminator). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.
PR: https://github.com/llvm/llvm-project/pull/95816
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
5ea6a3fc |
| 08-Dec-2023 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Compute scalable VF in preheader for induction increment. (#74762)
UF * VF is loop invariant and can be computed directly in the preheader.
This prepares the code for #74761 and reduces the
[VPlan] Compute scalable VF in preheader for induction increment. (#74762)
UF * VF is loop invariant and can be computed directly in the preheader.
This prepares the code for #74761 and reduces the test changes.
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4 |
|
#
2ec7bba7 |
| 20-Oct-2023 |
Florian Hahn <flo@fhahn.com> |
Recommit "[VPlan] Insert Trunc/Exts for reductions directly in VPlan."
This reverts commit e4ea0997486000b460c4875a00301b73b3c0d6a7.
The recommit fixes a reported crash by adding a missing check to
Recommit "[VPlan] Insert Trunc/Exts for reductions directly in VPlan."
This reverts commit e4ea0997486000b460c4875a00301b73b3c0d6a7.
The recommit fixes a reported crash by adding a missing check to make sure the cast recipes are only introduced when vectorizing.
Test coverage added in 3cac608fbd0811b2f5c59c6e13148162ccd8543e. Original commit message: Update the code to create Trunc/Ext recipes directly in adjustRecipesForReductions instead of fixing it up later in fixReductions.
This explicitly models the required conversions and also makes sure they are generated at the right place (instead of after the exit condition), hence the changes in a few tests.
show more ...
|
#
e4ea0997 |
| 19-Oct-2023 |
Fangrui Song <i@maskray.me> |
Revert "[VPlan] Insert Trunc/Exts for reductions directly in VPlan."
This reverts commit fd311126349b8fe1684d62154a9fa5a7bbb0b713.
There are two different crash reports on https://github.com/llvm/l
Revert "[VPlan] Insert Trunc/Exts for reductions directly in VPlan."
This reverts commit fd311126349b8fe1684d62154a9fa5a7bbb0b713.
There are two different crash reports on https://github.com/llvm/llvm-project/commit/fd311126349b8fe1684d62154a9fa5a7bbb0b713
show more ...
|
#
fd311126 |
| 17-Oct-2023 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Insert Trunc/Exts for reductions directly in VPlan.
Update the code to create Trunc/Ext recipes directly in adjustRecipesForReductions instead of fixing it up later in fixReductions.
This e
[VPlan] Insert Trunc/Exts for reductions directly in VPlan.
Update the code to create Trunc/Ext recipes directly in adjustRecipesForReductions instead of fixing it up later in fixReductions.
This explicitly models the required conversions and also makes sure they are generated at the right place (instead of after the exit condition), hence the changes in a few tests.
show more ...
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
eae26b66 |
| 04-Jan-2023 |
Paul Walker <paul.walker@arm.com> |
[IRBuilder] Use canonical i64 type for insertelement index used by vector splats.
Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as a
[IRBuilder] Use canonical i64 type for insertelement index used by vector splats.
Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as an integer so we may as well use the prefered form from creation.
NOTE: All test changes are mechanical with nothing else expected beyond a change of index type from i32 to i64.
Differential Revision: https://reviews.llvm.org/D140983
show more ...
|
#
7d757725 |
| 14-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[LoopVectorize] Convert some tests to opaque pointers (NFC)
|
#
be51fa45 |
| 05-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC] Port all runlines for LoopVectorize pass tests to -passes syntax
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
872f7000 |
| 03-Apr-2022 |
Dávid Bolvanský <david.bolvansky@gmail.com> |
Revert "[NFCI] Regenerate SROA/LoopVectorize test checks"
This reverts commit 14e3450fb57305aa9ff3e9e60687b458e43835c9.
|
#
a113a582 |
| 03-Apr-2022 |
Dávid Bolvanský <david.bolvansky@gmail.com> |
[NFCI] Regenerate LoopVectorize test checks
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
2a48b699 |
| 08-Nov-2021 |
David Sherwood <david.sherwood@arm.com> |
[IR] In ConstantFoldShuffleVectorInstruction use zeroinitializer for splats of 0
When creating a splat of 0 for scalable vectors we tend to create them with using a combination of shufflevector and
[IR] In ConstantFoldShuffleVectorInstruction use zeroinitializer for splats of 0
When creating a splat of 0 for scalable vectors we tend to create them with using a combination of shufflevector and insertelement, i.e.
shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 0, i32 0), <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
However, for the case of a zero splat we can actually just replace the above with zeroinitializer instead. This makes the IR a lot simpler and easier to read. I have changed ConstantFoldShuffleVectorInstruction to use zeroinitializer when creating a splat of integer 0 or FP +0.0 values.
Differential Revision: https://reviews.llvm.org/D113394
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
#
26b7d9d6 |
| 04-Aug-2021 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve selecting an integer value based on a prev
[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve selecting an integer value based on a previous comparison. Consider the following C++ loop:
int r = a; for (int i = 0; i < n; i++) { if (src[i] > 3) { r = b; } src[i] += 2; }
We should be able to vectorise this loop because all we are doing is selecting between two states - 'a' and 'b' - both of which are loop invariant. This just involves building a vector of values that contain either 'a' or 'b', where the final reduced value will be 'b' if any lane contains 'b'.
The IR generated by clang typically looks like this:
%phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ] ... %pred = icmp ugt i32 %val, i32 3 %phi.update = select i1 %pred, i32 %b, i32 %phi
We already detect min/max patterns, which also involve a select + cmp. However, with the min/max patterns we are selecting loaded values (and hence loop variant) in the loop. In addition we only support certain cmp predicates. This patch adds a new pattern matching function (isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp. We only support selecting values that are integer and loop invariant, however we can support any kind of compare - integer or float.
Tests have been added here:
Transforms/LoopVectorize/AArch64/sve-select-cmp.ll Transforms/LoopVectorize/select-cmp-predicated.ll Transforms/LoopVectorize/select-cmp.ll
Differential Revision: https://reviews.llvm.org/D108136
show more ...
|
#
685f1bfd |
| 01-Oct-2021 |
Krasimir Georgiev <krasimir@google.com> |
Revert "[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns"
It appears to cause stage2 clang build failures, e.g., https://lab.llvm.org/buildbot/#/builders/74/builds
Revert "[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns"
It appears to cause stage2 clang build failures, e.g., https://lab.llvm.org/buildbot/#/builders/74/builds/7145.
This reverts commit 1fb37334bdb3cdb028977382fbd84cebde64ebb2.
show more ...
|
#
1fb37334 |
| 04-Aug-2021 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve selecting an integer value based on a prev
[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve selecting an integer value based on a previous comparison. Consider the following C++ loop:
int r = a; for (int i = 0; i < n; i++) { if (src[i] > 3) { r = b; } src[i] += 2; }
We should be able to vectorise this loop because all we are doing is selecting between two states - 'a' and 'b' - both of which are loop invariant. This just involves building a vector of values that contain either 'a' or 'b', where the final reduced value will be 'b' if any lane contains 'b'.
The IR generated by clang typically looks like this:
%phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ] ... %pred = icmp ugt i32 %val, i32 3 %phi.update = select i1 %pred, i32 %b, i32 %phi
We already detect min/max patterns, which also involve a select + cmp. However, with the min/max patterns we are selecting loaded values (and hence loop variant) in the loop. In addition we only support certain cmp predicates. This patch adds a new pattern matching function (isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp. We only support selecting values that are integer and loop invariant, however we can support any kind of compare - integer or float.
Tests have been added here:
Transforms/LoopVectorize/AArch64/sve-select-cmp.ll Transforms/LoopVectorize/select-cmp-predicated.ll Transforms/LoopVectorize/select-cmp.ll
Differential Revision: https://reviews.llvm.org/D108136
show more ...
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
#
23c2f2e6 |
| 07-Jun-2021 |
Florian Hahn <flo@fhahn.com> |
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If th
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW.
This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling.
At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything.
Note that this could probably be further improved by using information from the original IV.
Attempt of modeling of the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g
Part of a set of fixes required for PR50412.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D103255
show more ...
|
Revision tags: llvmorg-12.0.1-rc1 |
|
#
4f86aa65 |
| 08-Apr-2021 |
Sander de Smalen <sander.desmalen@arm.com> |
[LV] Add -scalable-vectorization=<option> flag.
This patch adds a new option to the LoopVectorizer to control how scalable vectors can be used.
Initially, this suggests three levels to control scal
[LV] Add -scalable-vectorization=<option> flag.
This patch adds a new option to the LoopVectorizer to control how scalable vectors can be used.
Initially, this suggests three levels to control scalable vectorization, although other more aggressive options can be added in the future.
The possible options are: - Disabled: Disables vectorization with scalable vectors. - Enabled: Vectorize loops using scalable vectors or fixed-width vectors, but favors fixed-width vectors when the cost is a tie. - Preferred: Like 'Enabled', but favoring scalable vectors when the cost-model is inconclusive.
Reviewed By: paulwalker-arm, vkmr
Differential Revision: https://reviews.llvm.org/D101945
show more ...
|
#
cf06c8ee |
| 22-Apr-2021 |
Caroline Concatto <caroline.concatto@arm.com> |
[LoopVectorize][SVE] Remove assert for scalable vector in InnerLoopVectorizer::fixReduction
The function fixReduction used to assert/crash for scalable vector when a vector reduce could be done with
[LoopVectorize][SVE] Remove assert for scalable vector in InnerLoopVectorizer::fixReduction
The function fixReduction used to assert/crash for scalable vector when a vector reduce could be done with a smaller vector. This patch removes this assertion as it is safe to use scalable vector for vector reduce and truncate.
Differential Revision: https://reviews.llvm.org/D101260
show more ...
|