#
2b55ef18 |
| 29-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Add helper to run VPlan passes, verify after run (NFC). (#123640)
Add new runPass helpers to run a VPlan transformation. This makes it
easier to add additional checks/functionality for each
[VPlan] Add helper to run VPlan passes, verify after run (NFC). (#123640)
Add new runPass helpers to run a VPlan transformation. This makes it
easier to add additional checks/functionality for each transform run. In
this patch, an option is added to run the verifier after each VPlan
transform.
Follow-ups will use the same helper to also support printing VPlans
after each transform.
Note that the verifier at the moment requires there to be a canonical IV
and vector loop region, so the final lowering transforms aren't run via
runPass yet.
PR: https://github.com/llvm/llvm-project/pull/123640
show more ...
|
Revision tags: llvmorg-21-init |
|
#
09a29fcc |
| 27-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Don't collect live-ins in collectUsersInExitBlocks. (NFC) (#123819)
Live-ins don't need to be handled, other than adding to the exit phi
recipe. Do that early and assert that otherwise the
[VPlan] Don't collect live-ins in collectUsersInExitBlocks. (NFC) (#123819)
Live-ins don't need to be handled, other than adding to the exit phi
recipe. Do that early and assert that otherwise the exit value is
defined in the vector loop region.
This should enable simply skipping other exit values that do not need
further fixing, e.g. if handling the exit value from the early exit
directly in handleUncountableEarlyExit.
PR: https://github.com/llvm/llvm-project/pull/123819
show more ...
|
#
2c87133c |
| 19-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
Reapply "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts the revert commit 58326f1d5b5b379590af92dd129b2f3b3e96af46.
The build failure in sanitizer stage2 builds has been fixe
Reapply "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts the revert commit 58326f1d5b5b379590af92dd129b2f3b3e96af46.
The build failure in sanitizer stage2 builds has been fixed with 0d39fe6f5bb3edf0bddec09a8c6417377390aeac.
Original commit message: Model updating IV users directly in VPlan, replace fixupIVUsers.
Now simple extracts are created for all phis in the exit block during initial VPlan construction. A later VPlan transform (optimizeInductionExitUsers) replaces extracts of inductions with their pre-computed values if possible.
This completes the transition towards modeling all live-outs directly in VPlan.
There are a few follow-ups: * emit extracts initially also for resume phis, and optimize them tougher with IV exit users * support for VPlans with multiple exits in optimizeInductionExitUsers.
Depends on https://github.com/llvm/llvm-project/pull/110004, https://github.com/llvm/llvm-project/pull/109975 and https://github.com/llvm/llvm-project/pull/112145.
show more ...
|
#
58326f1d |
| 18-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
Revert "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts commit c2d15ac4d4432788557e77c15ce572ac655a8fec.
Causes build failures on PPC stage2 & fuchsia bots https://lab.llv
Revert "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts commit c2d15ac4d4432788557e77c15ce572ac655a8fec.
Causes build failures on PPC stage2 & fuchsia bots https://lab.llvm.org/buildbot/#/builders/168/builds/7650 https://lab.llvm.org/buildbot/#/builders/11/builds/11248
show more ...
|
#
c2d15ac4 |
| 18-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Update final IV exit value via VPlan. (#112147)
Model updating IV users directly in VPlan, replace fixupIVUsers.
Now simple extracts are created for all phis in the exit block during
ini
[VPlan] Update final IV exit value via VPlan. (#112147)
Model updating IV users directly in VPlan, replace fixupIVUsers.
Now simple extracts are created for all phis in the exit block during
initial VPlan construction. A later VPlan transform
(optimizeInductionExitUsers) replaces extracts of inductions with
their pre-computed values if possible.
This completes the transition towards modeling all live-outs directly in
VPlan.
There are a few follow-ups:
* emit extracts initially also for resume phis, and optimize them
tougher with IV exit users
* support for VPlans with multiple exits in optimizeInductionExitUsers.
Depends on https://github.com/llvm/llvm-project/pull/110004,
https://github.com/llvm/llvm-project/pull/109975 and
https://github.com/llvm/llvm-project/pull/112145.
show more ...
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
5fae408d |
| 11-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138)
A more lightweight variant of
https://github.com/llvm/llvm-project/pull/109193,
which dispatches to multiple exit blocks via t
[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138)
A more lightweight variant of
https://github.com/llvm/llvm-project/pull/109193,
which dispatches to multiple exit blocks via the middle blocks.
The patch also introduces a bit of required scaffolding to enable
early-exit vectorization, including an option. At the moment, early-exit
vectorization doesn't come with legality checks, and is only used if the
option is provided and the loop has metadata forcing vectorization. This
is only intended to be used for testing during bring-up, with @david-arm
enabling auto early-exit vectorization plugging in the changes from
https://github.com/llvm/llvm-project/pull/88385.
PR: https://github.com/llvm/llvm-project/pull/112138
show more ...
|
#
afef545e |
| 08-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Address post-commit for #114305.
Apply suggested renaming and adjust placement as suggested in https://github.com/llvm/llvm-project/pull/114305. Also drop unneeded RPOT creation.
|
#
a7fda0e1 |
| 03-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Introduce VPScalarPHIRecipe, use for can & EVL IV codegen (NFC). (#114305)
Introduce a general recipe to generate a scalar phi. Lower
VPCanonicalIVPHIRecipe and VPEVLBasedIVRecipe to VPScal
[VPlan] Introduce VPScalarPHIRecipe, use for can & EVL IV codegen (NFC). (#114305)
Introduce a general recipe to generate a scalar phi. Lower
VPCanonicalIVPHIRecipe and VPEVLBasedIVRecipe to VPScalarIVPHIrecipe
before plan execution, avoiding the need for duplicated ::execute
implementations. There are other cases that could benefit, including
in-loop reduction phis and pointer induction phis.
Builds on a similar idea as
https://github.com/llvm/llvm-project/pull/82270.
PR: https://github.com/llvm/llvm-project/pull/114305
show more ...
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3 |
|
#
2dfb1c66 |
| 23-Oct-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Try to hoist Previous (and operands), if sinking fails for FORs. (#108945)
In some cases, Previous (and its operands) can be hoisted. This allows
supporting additional cases where sinking o
[VPlan] Try to hoist Previous (and operands), if sinking fails for FORs. (#108945)
In some cases, Previous (and its operands) can be hoisted. This allows
supporting additional cases where sinking of all users of to FOR fails,
e.g. due having to sink recipes with side-effects.
This fixes a crash where we fail to create a scalar VPlan for a
first-order recurrence, but can create a vector VPlan, because the trunc
instruction of an IV which generates the previous value of the
recurrence has been optimized to a truncated induction recipe, thus
hoisting it to the beginning.
Fixes https://github.com/llvm/llvm-project/issues/106523.
PR: https://github.com/llvm/llvm-project/pull/108945
show more ...
|
#
f148d579 |
| 18-Oct-2024 |
Alexey Bataev <a.bataev@outlook.com> |
[LV]Initial support for safe distance in predicated DataWithEVL vectorization mode.
Enabled initial support for max safe distance in DataWithEVL mode. If max safe distance is required, need to emit
[LV]Initial support for safe distance in predicated DataWithEVL vectorization mode.
Enabled initial support for max safe distance in DataWithEVL mode. If max safe distance is required, need to emit special code: CMP = icmp ult AVL, MAX_SAFE_DISTANCE SAFE_AVL = select CMP, AVL, MAX_SAFE_DISTANCE EVL = call i32 @llvm.experimental.get.vector.length(i64 SAFE_AVL)
while vectorize the loop in DataWithEVL tail folding mode.
Reviewers: fhahn
Reviewed By: fhahn
Pull Request: https://github.com/llvm/llvm-project/pull/102897
show more ...
|
Revision tags: llvmorg-19.1.2 |
|
#
7f746518 |
| 06-Oct-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Use pointer to member 0 as VPInterleaveRecipe's pointer arg. (#106431)
Update VPInterleaveRecipe to always use the pointer to member 0 as
pointer argument. This in many cases helps to remov
[VPlan] Use pointer to member 0 as VPInterleaveRecipe's pointer arg. (#106431)
Update VPInterleaveRecipe to always use the pointer to member 0 as
pointer argument. This in many cases helps to remove unneeded index
adjustments and simplifies VPInterleaveRecipe::execute.
In some rare cases, the address of member 0 does not dominate the insert
position of the interleave group. In those cases a PtrAdd VPInstruction
is emitted to compute the address of member 0 based on the address of
the insert position. Alternatively we could hoist the recipe computing
the address of member 0.
show more ...
|
Revision tags: llvmorg-19.1.1 |
|
#
53266f73 |
| 22-Sep-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Run DCE after unrolling.
This cleans up a number of dead recipes after unrolling if only their first or last parts are used. This simplifies a number of tests.
Fixes https://github.com/llvm
[VPlan] Run DCE after unrolling.
This cleans up a number of dead recipes after unrolling if only their first or last parts are used. This simplifies a number of tests.
Fixes https://github.com/llvm/llvm-project/issues/109581.
show more ...
|
#
8ec40675 |
| 21-Sep-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842)
This patch implements explicit unrolling by UF as VPlan transform. In
follow up patches this will allow simplifying VPTransform st
[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842)
This patch implements explicit unrolling by UF as VPlan transform. In
follow up patches this will allow simplifying VPTransform state (no need
to store unrolled parts) as well as recipe execution (no need to
generate code for multiple parts in an each recipe). It also allows for
more general optimziations (e.g. avoid generating code for recipes that
are uniform-across parts).
It also unifies the logic dealing with unrolled parts in a single place,
rather than spreading it out across multiple places (e.g. VPlan post
processing for header-phi recipes previously.)
In the initial implementation, a number of recipes still take the
unrolled part as additional, optional argument, if their execution
depends on the unrolled part.
The computation for start/step values for scalable inductions changed
slightly. Previously the step would be computed as scalar and then
splatted, now vscale gets splatted and multiplied by the step in a
vector mul.
This has been split off https://github.com/llvm/llvm-project/pull/94339
which also includes changes to simplify VPTransfomState and recipes'
::execute.
The current version mostly leaves existing ::execute untouched and
instead sets VPTransfomState::UF to 1.
A follow-up patch will clean up all references to VPTransformState::UF.
Another follow-up patch will simplify VPTransformState to only store a
single vector value per VPValue.
PR: https://github.com/llvm/llvm-project/pull/95842
show more ...
|
Revision tags: llvmorg-19.1.0 |
|
#
f3029b33 |
| 13-Sep-2024 |
David Sherwood <david.sherwood@arm.com> |
[NFC][LoopVectorize] Avoid passing ScalarEvolution to VPlanTransforms::optimize (#108380)
Whilst trying to write some VPlan unit tests I realised
that we don't need to pass a ScalarEvolution object
[NFC][LoopVectorize] Avoid passing ScalarEvolution to VPlanTransforms::optimize (#108380)
Whilst trying to write some VPlan unit tests I realised
that we don't need to pass a ScalarEvolution object into
VPlanTransforms::optimize because the only thing we
actually need is a LLVMContext.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
16910a21 |
| 28-Aug-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Move logic to create interleave groups to VPlanTransforms (NFC).
This is a step towards further breaking up the rather large tryToBuildVPlanWithVPRecipes. It moves logic create interleave gr
[VPlan] Move logic to create interleave groups to VPlanTransforms (NFC).
This is a step towards further breaking up the rather large tryToBuildVPlanWithVPRecipes. It moves logic create interleave groups to VPlanTransforms.cpp, where similar replacements for other recipes are defined as well (e.g. EVL-based ones)
show more ...
|
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
0338c55e |
| 25-May-2024 |
Shih-Po Hung <shihpo.hung@sifive.com> |
[LV, VPlan] Check if plan is compatible to EVL transform (#92092)
The transform updates all users of inductions to work based on EVL, instead of the VF directly. At the moment, widened inductions ca
[LV, VPlan] Check if plan is compatible to EVL transform (#92092)
The transform updates all users of inductions to work based on EVL, instead of the VF directly. At the moment, widened inductions cannot be updated, so bail out if the plan contains any. This patch introduces a check before applying EVL transform. If any recipes in loop rely on RuntimeVF, the plan is discarded.
show more ...
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
413a66f3 |
| 04-Apr-2024 |
Alexey Bataev <a.bataev@outlook.com> |
[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172)
This patch introduces generating VP intrinsics in the Loop Vectorizer.
Currently the Loop
[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172)
This patch introduces generating VP intrinsics in the Loop Vectorizer.
Currently the Loop Vectorizer supports vector predication in a very
limited capacity via tail-folding and masked load/store/gather/scatter
intrinsics. However, this does not let architectures with active vector
length predication support take advantage of their capabilities.
Architectures with general masked predication support also can only take
advantage of predication on memory operations. By having a way for the
Loop Vectorizer to generate Vector Predication intrinsics, which (will)
provide a target-independent way to model predicated vector
instructions. These architectures can make better use of their
predication capabilities.
Our first approach (implemented in this patch) builds on top of the
existing tail-folding mechanism in the LV (just adds a new tail-folding
mode using EVL), but instead of generating masked intrinsics for memory
operations it generates VP intrinsics for loads/stores instructions. The
patch adds a new VPlanTransforms to replace the wide header predicate
compare with EVL and updates codegen for load/stores to use VP
store/load with EVL.
Other important part of this approach is how the Explicit Vector Length
is computed. (VP intrinsics define this vector length parameter as
Explicit Vector Length (EVL)). We use an experimental intrinsic
`get_vector_length`, that can be lowered to architecture specific
instruction(s) to compute EVL.
Also, added a new recipe to emit instructions for computing EVL. Using
VPlan in this way will eventually help build and compare VPlans
corresponding to different strategies and alternatives.
Differential Revision: https://reviews.llvm.org/D99750
show more ...
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
20177c45 |
| 17-Feb-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Turn private members of VPlanTransforms to static funcs (NFC)
Private members of VPlanTransforms are only used inside VPlanTransforms.cpp, just make them static.
|
#
debca7ee |
| 14-Feb-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Move dropping of poison flags to VPlanTransforms. (NFC)
Move collectPoisonGeneratingFlags from InnerLoopVectorizer to VPlanTransforms and also update its name. collectPoisonGeneratingFlags a
[VPlan] Move dropping of poison flags to VPlanTransforms. (NFC)
Move collectPoisonGeneratingFlags from InnerLoopVectorizer to VPlanTransforms and also update its name. collectPoisonGeneratingFlags already directly drops poison-generating flags, not only collecting it. This means it is more appropriate to integerate it directly into the VPlan transform pipeline.
The current implementation still calls back to legal to check if a block needs predication, which should be improved in the future.
show more ...
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
8b118113 |
| 10-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[Transforms] Remove unused forward declarations (NFC)
|
#
70535f5e |
| 02-Dec-2023 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Replace IR based truncateToMinimalBitwidths with VPlan version.
This patch replaces the IR based truncateToMinimalBitwidths with a VPlan version. This has 3 benefits: 1) the VPlan-based vers
[VPlan] Replace IR based truncateToMinimalBitwidths with VPlan version.
This patch replaces the IR based truncateToMinimalBitwidths with a VPlan version. This has 3 benefits: 1) the VPlan-based version is simpler; we don't need to implement special codegen for each supported instruction type like the IR based one. 2) Removes a dependency on the cost-model after VPlan execution and 3) Removes a use of getVPValue that uses underlying values after VPlan execution (See removed FIXME).
Depends on D149081.
Depends on D149079.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D149903
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2 |
|
#
97687b7a |
| 25-Sep-2023 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Add active-lane-mask as VPlan-to-VPlan transformation.
This patch updates the mask creation code to always create compares of the form (ICMP_ULE, wide canonical IV, backedge-taken-count) up
[VPlan] Add active-lane-mask as VPlan-to-VPlan transformation.
This patch updates the mask creation code to always create compares of the form (ICMP_ULE, wide canonical IV, backedge-taken-count) up front when tail folding and introduce active-lane-mask as later transformation.
This effectively makes (ICMP_ULE, wide canonical IV, backedge-taken-count) the canonical form for tail-folding early on. Introducing more specific active-lane-mask recipes is treated as a VPlan-to-VPlan optimization.
This has the advantage of keeping the logic (and complexity) of introducing active-lane-mask recipes in a single place, instead of spreading the logic out across multiple functions. It also simplifies initial VPlan construction and enables treating introducing EVL as similar optimization.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D158779
show more ...
|
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2 |
|
#
a6d67307 |
| 04-Aug-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Split off code to optimize initial VPlan (NFC).
Split up tryToBuildVPlanWithVPRecipes into intial plan creation and optimizations, by introducing a VPLanTransform::optimize helper.
Depends on
[LV] Split off code to optimize initial VPlan (NFC).
Split up tryToBuildVPlanWithVPRecipes into intial plan creation and optimizations, by introducing a VPLanTransform::optimize helper.
Depends on D154640.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D154644
show more ...
|
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
9259f41e |
| 09-Jul-2023 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Clear reduction flags directly as VPlanTransform.
After D150027, all relevant recipes should model their IR flags directly. Instead of removing the flags after codegen as part of fixReductio
[VPlan] Clear reduction flags directly as VPlanTransform.
After D150027, all relevant recipes should model their IR flags directly. Instead of removing the flags after codegen as part of fixReductions, drop poison generating flags directly from the recipes.
Depends on D150027.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D150028
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
6303fa36 |
| 01-May-2023 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove DeadInsts arg from VPInstructionsToVPRecipes (NFC)
The argument isn't used. VPlan-based dead recipe removal can be used instead.
|