#
32f1c553 |
| 16-Nov-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Update VPValue::getDef to return VPRecipeBase, adjust name(NFC)
The return value of getDef is guaranteed to be a VPRecipeBase and all users can also accept a VPRecipeBase *. Most users actually cast to VPRecipeBase or a specific recipe before using it, so this change removes a number of redundant casts.
Also rename it to getDefiningRecipe to make the name a bit clearer.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D136068
|
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2 |
|
#
2c692d89 |
| 23-Sep-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Update handling of scalable pointer inductions after b73d2c8.
The dependent code has been changed quite a lot since 151c144 which b73d2c8 effectively reverts. Now we run into a case where lowering didn't expect/support the behavior pre 151c144 any longer.
Update the code dealing with scalable pointer inductions to also check for uniformity in combination with isScalarAfterVectorization. This should ensure scalable pointer inductions are handled properly during epilogue vectorization.
Fixes #57912.
|
Revision tags: llvmorg-15.0.1 |
|
#
582f8ef1 |
| 19-Sep-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Keep track of cost-based ScalarAfterVec in VPWidenPointerInd.
Epilogue vectorization uses isScalarAfterVectorization to check if widened versions for inductions need to be generated and bails out in those cases.
At the moment, there are scenarios where isScalarAfterVectorization returns true but VPWidenPointerInduction::onlyScalarsGenerated would return false, causing widening.
This can lead to widened phis with incorrect start values being created in the epilogue vector body.
This patch addresses the issue by storing the cost-model decision in VPWidenPointerInductionRecipe and restoring the behavior before 151c144. This effectively reverts 151c144, but the long-term fix is to properly support widened inductions during epilogue vectorization.
Fixes #57712.
|
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3 |
|
#
50724716 |
| 14-Aug-2022 |
Kazu Hirata <kazu@google.com> |
[Transforms] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
|
Revision tags: llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
03fee671 |
| 10-May-2022 |
David Sherwood <david.sherwood@arm.com> |
[LoopVectorize] Add option to use active lane mask for loop control flow
Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operations, such as masked loads and stores, etc. The loop itself is still controlled by comparing the canonical induction variable with the trip count. However, for some targets this is inefficient when it's cheap to use the mask itself to control the loop.
This patch adds support for using the active lane mask for control flow by:
1. Generating the active lane mask for the next iteration of the vector loop, rather than the current one. If there are still any remaining iterations then at least the first bit of the mask will be set.
2. Extracting the first bit of this mask and using this bit for the conditional branch.
I did this by creating a new VPActiveLaneMaskPHIRecipe that sets up the initial PHI values in the vector loop pre-header. I've also made use of the new BranchOnCond VPInstruction for the final instruction in the loop region.
Differential Revision: https://reviews.llvm.org/D125301
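The two steps in this commit message can be illustrated with a scalar simulation. This is only a sketch, not the code generated by LoopVectorize: the fixed `VF` of 4, the helper names `activeLaneMask` and `incrementAll`, and the zero-trip guard before the loop are all assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Assumed fixed vector width for this sketch.
constexpr std::size_t VF = 4;

// Hypothetical model of get.active.lane.mask(Base, TripCount):
// lane I is active iff Base + I < TripCount.
std::array<bool, VF> activeLaneMask(std::size_t Base, std::size_t TripCount) {
  std::array<bool, VF> Mask{};
  for (std::size_t I = 0; I < VF; ++I)
    Mask[I] = Base + I < TripCount;
  return Mask;
}

// Adds 1 to every element. The mask predicates the "vector" operation and
// also decides whether the loop continues. Returns the number of vector
// iterations that were executed.
std::size_t incrementAll(std::vector<int> &Data) {
  const std::size_t TripCount = Data.size();
  std::size_t Index = 0;
  std::size_t Iterations = 0;

  // Initial mask value, computed before entering the loop (the pre-header).
  std::array<bool, VF> Mask = activeLaneMask(Index, TripCount);
  if (!Mask[0])
    return 0; // Assumed guard: nothing to do for an empty input.

  bool Continue = true;
  while (Continue) {
    // Predicated body: only active lanes touch memory.
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      if (Mask[Lane])
        Data[Index + Lane] += 1;
    ++Iterations;

    // Step 1: compute the mask for the NEXT iteration.
    Index += VF;
    Mask = activeLaneMask(Index, TripCount);

    // Step 2: the first lane of that mask is the branch condition.
    Continue = Mask[0];
  }
  return Iterations;
}
```

With 10 elements and a width of 4, the loop runs three times; the last iteration only has its first two lanes active, and the mask computed afterwards has a clear first lane, which ends the loop without a separate compare against the trip count.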
|
#
b4694229 |
| 04-Jul-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Simplify setDebugLocFromInst by using early exit (NFC).
Suggested as separate improvement in D128657.
|
#
b0da3c6f |
| 02-Jul-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Move setDebugLocFromInst to VPTransformState (NFC).
The moved helpers are only used for codegen. It will allow moving the remaining ::execute implementations out of LoopVectorize.cpp.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D128657
|
#
0dddf04c |
| 01-Jul-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Don't optimize exit cond during epilogue vectorization.
At the moment, the same VPlan can be used for code generation of both the main vector and epilogue vector loop. This can lead to wrong results if the plan is optimized based on the VF of the main vector loop and then re-used for the epilogue loop.
One example where this is problematic is if the scalar loops need to execute at least one iteration, e.g. due to interleave groups.
To prevent mis-compiles in the short-term, disable optimizing exit conditions for VPlans when using epilogue vectorization. The proper fix is to avoid re-using the same plan for both loops, which will require support for cloning plans first.
Fixes #56319.
|
#
583abd0e |
| 01-Jul-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Move addMetadata to VPTransformState (NFC).
The moved helpers are only used for codegen. It will allow moving the remaining ::execute implementations out of LoopVectorize.cpp.
Depends on D127966. Depends on D127965.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D127968
|
#
03975b7f |
| 28-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Move recipe implementations to separate file (NFC).
This patch moves the code for recipe implementations to a separate file.
The benefits are:
* Keep VPlan.cpp smaller => faster compile-time during parallel builds.
* Keep code for logical units together.
As a follow-up I am also planning on moving all ::execute implementations from LoopVectorize.cpp over to the new file, which should help to reduce the size of the file a bit.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D127965
|
#
569d84fe |
| 23-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove dead recipes across whole plan.
This extends removeDeadRecipe to remove recipes across the whole plan.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D127580
|
#
949c1364 |
| 16-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Remove widenPHIInstruction dependence on underlying instr (NFC).
Instead of using the underlying instruction and VF to get the type, use the type of the incoming value. This removes an unnecessary dependence on the underlying instruction and enables using the recipe without an underlying instruction.
|
#
b0c9a71b |
| 07-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Handle VPInst without underlying instr in VPInterleavedAccess.
This violation is hidden while `cast` is missing an isa assertion after D123901.
|
#
eaf48dd9 |
| 06-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Replace BranchOnCount with BranchOnCond if TC <= UF * VF.
Try to simplify BranchOnCount to `BranchOnCond true` if TC <= UF * VF.
This is an alternative to D121899 which simplifies the VPlan directly instead of doing so late in code-gen.
The potential benefit of doing this in VPlan is that this may help cost-modeling in the future. The reason this is done in prepareToExecute at the moment is that a single plan may be used for multiple VFs/UFs.
There are further simplifications that can be applied as follow ups:
1. Replace inductions with constants.
2. Replace vector region with regular block.
Fixes #55354.
Depends on D126679.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D126680
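The condition behind this simplification can be sketched as a small standalone check. This is not the VPlan implementation: the enum `LatchTerminator` and the function `chooseLatchTerminator` are hypothetical names, and the sketch assumes the trip count is a known constant and that `VF * UF` does not overflow.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the two terminator kinds named in the commit.
enum class LatchTerminator { BranchOnCount, BranchOnCondTrue };

// One vector iteration covers VF * UF scalar iterations. If the whole trip
// count fits into that, the vector loop body runs once and the latch can
// branch unconditionally (modeled here as BranchOnCondTrue).
LatchTerminator chooseLatchTerminator(std::uint64_t TripCount,
                                      std::uint64_t VF, std::uint64_t UF) {
  return TripCount <= VF * UF ? LatchTerminator::BranchOnCondTrue
                              : LatchTerminator::BranchOnCount;
}
```

The function takes concrete `VF` and `UF` values, which mirrors the point made in the message: a single plan may cover several VFs and UFs, so the check can only be made once both have been chosen.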
|
#
416a5080 |
| 04-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Update vector latch terminator edge to exit block after execution.
Instead of setting the successor to the exit using CFG.ExitBB, set it to nullptr initially. The successor to the exit block is later set either through createEmptyBasicBlock or after VPlan execution (because at the moment, no block is created by VPlan for the exit block, the existing one is reused).
This also enables BranchOnCond to be used as terminator for the exiting block of the topmost vector region.
Depends on D126618.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D126679
|
#
a8d2a381 |
| 03-Jun-2022 |
Benjamin Kramer <benny.kra@googlemail.com> |
[VPlan] Silence another unused variable warning in release builds
|
#
a5bb4a3b |
| 03-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Replace CondBit with BranchOnCond VPInstruction.
This patch removes CondBit and Predicate from VPBasicBlock. To do so, the patch introduces a new branch-on-cond VPInstruction opcode to model a branch on a condition explicitly.
This addresses a long-standing TODO/FIXME that blocks shouldn't be users of VPValues. Those extra users can cause issues for VPValue-based analyses that don't expect blocks. Addressing this fixme should allow us to re-introduce 266ea446ab7476.
The generic branch opcode can also be used in follow-up patches.
Depends on D123005.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D126618
|
#
4f1c86e3 |
| 02-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove dead VPlan-native special case from BranchOnCount (NFC).
After 05776122b682684ad this special case doesn't exist any longer.
|
#
05776122 |
| 01-Jun-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Use region for each loop in native path.
This patch updates the VPlan native path to use VPRegionBlocks for all loops in a loop nest. Up to now, only the outermost loop used a region.
This is a step towards unifying both paths and keeping things consistent between them. It also prepares various code-gen parts for modeling the pre-header in the inner loop vectorizer (D121624).
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D123005
|
#
6abce17f |
| 28-May-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Use Exiting-block instead of Exit-block terminology (NFC).
In LLVM's common loop terminology, an exit block is a block outside a loop with a predecessor inside the loop. An exiting block is a block inside the loop which branches to an exit block outside the loop.
This patch updates a few places where VPlan was using ExitBlock for a block exiting a region. Those instances have been updated to use ExitingBlock.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D126173
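The distinction between exit and exiting blocks can be made concrete with a toy control-flow graph. The sketch below does not use LLVM's LoopInfo or VPlan classes: the `CFG` alias, the `LoopExits` struct and `classifyExits` are hypothetical, and blocks are identified by plain strings.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy CFG: block name -> names of its successor blocks.
using CFG = std::map<std::string, std::vector<std::string>>;

struct LoopExits {
  std::set<std::string> Exiting; // blocks inside the loop that branch out of it
  std::set<std::string> Exit;    // blocks outside the loop with a predecessor inside
};

// Classifies blocks for the loop whose body is given by LoopBlocks.
LoopExits classifyExits(const CFG &Successors,
                        const std::set<std::string> &LoopBlocks) {
  LoopExits Result;
  for (const std::string &BB : LoopBlocks) {
    auto It = Successors.find(BB);
    if (It == Successors.end())
      continue; // Block without recorded successors.
    for (const std::string &Succ : It->second) {
      if (LoopBlocks.count(Succ) == 0) {
        Result.Exiting.insert(BB); // Source of the edge is inside the loop.
        Result.Exit.insert(Succ);  // Target of the edge is outside the loop.
      }
    }
  }
  return Result;
}
```

For a loop made of `header`, `body` and `latch`, where `latch` branches either back to `header` or on to a block named `exit`, the exiting block is `latch` and the exit block is `exit`.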
|
#
97590bae |
| 22-May-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Widen ptr-inductions with scalar uses for scalable VFs.
Current codegen only supports scalarization of pointer inductions for scalable VFs if they are uniform. After 3bebec659 we now may enter the scalarization code path in VPWidenPointerInductionRecipe::execute for scalable vectors.
Fall back to widening for scalable vectors if necessary.
This should fix a build failure when bootstrapping LLVM with SVE, e.g. https://lab.llvm.org/buildbot/#/builders/176/builds/1723
|
#
3bebec65 |
| 21-May-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model first exit values using VPLiveOut.
This patch introduces a new VPLiveOut subclass of VPUser to model exit values explicitly. The initial version handles exit values that are neither part of induction or reduction chains nor first order recurrence phis.
Fixes #51366, #54867, #55167, #55459
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D123537
|
#
df56fb44 |
| 19-May-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Update VPWidenMemoryInstruction to not inherit from VPValue.
VPWidenMemoryInstruction also models stores, which may not produce a value. This can trip up analyses. Improve the modeling by only adding VPValues for VPWidenMemoryInstructionRecipes modeling loads.
|
#
c1a9d149 |
| 17-May-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Move usesScalars/onlyFirstLaneUsed to VPUser.
Those helpers model properties of a user and they should also be available to non-recipe users. This will be used in D123537 for a new exit value user.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D124936
|
#
39552964 |
| 15-May-2022 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Improve printing of VPReplicateRecipe with calls.
Suggested as part of D124718.
|