Revision tags: llvmorg-21-init |
|
#
713482fc |
| 27-Jan-2025 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Use State.get to extract lane mask for BranchOnMask.
Simplifies the code slightly and avoids redundant extracts/broadcasts if the operand is live-in or already scalar.
|
Revision tags: llvmorg-19.1.7 |
|
#
4ad0fdd1 |
| 17-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove reverse() of predecessors from VPInstruction::generate.
This was originally done to reduce the diff for the change. Remove it and update the remaining tests. NFC modulo reordering of
[VPlan] Remove reverse() of predecessors from VPInstruction::generate.
This was originally done to reduce the diff for the change. Remove it and update the remaining tests. NFC modulo reordering of incoming values.
Clean up after https://github.com/llvm/llvm-project/pull/114292.
show more ...
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
38fffa63 |
| 06-Nov-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
710dab6e |
| 20-Jul-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove VPPredInstPHIRecipes without users after region merging.
After merging replicate regions, VPPredInstPHIRecipes may become unused. Remove them directly instead of moving them to the me
[VPlan] Remove VPPredInstPHIRecipes without users after region merging.
After merging replicate regions, VPPredInstPHIRecipes may become unused. Remove them directly instead of moving them to the merged region.
show more ...
|
#
99d6c6d9 |
| 05-Jul-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.
Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.
After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.
This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.
Depends on https://github.com/llvm/llvm-project/pull/92525
PR: https://github.com/llvm/llvm-project/pull/92651
show more ...
|
#
3808ba78 |
| 20-Jun-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle bloc
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.
Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and in between the compare
to used by the terminator). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.
PR: https://github.com/llvm/llvm-project/pull/95816
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
c8369836 |
| 09-Apr-2024 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Remove unused first mask op from VPBlendRecipe. (#87770)
VPBlendRecipe does not use the first mask operand. Removing it allows
VPlan-based DCE to remove unused mask computations.
This al
[VPlan] Remove unused first mask op from VPBlendRecipe. (#87770)
VPBlendRecipe does not use the first mask operand. Removing it allows
VPlan-based DCE to remove unused mask computations.
This also fixes #87410, where unused Not VPInstructions are considered
having only their first lane demanded, but some of their operands
providing a vector value due to other users.
Fixes https://github.com/llvm/llvm-project/issues/87410
PR: https://github.com/llvm/llvm-project/pull/87770
show more ...
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
51afb101 |
| 09-Jan-2024 |
Florian Hahn <flo@fhahn.com> |
[LV] Create block in mask up-front if needed. (#76635)
At the moment, block and edge masks are created on demand, which means
that they are inserted at the point where they are demanded and then
c
[LV] Create block in mask up-front if needed. (#76635)
At the moment, block and edge masks are created on demand, which means
that they are inserted at the point where they are demanded and then
cached. It is possible that the mask for a block is looked up later at a
point that's not dominated by the point where the mask has been
inserted.
To avoid this, create masks up front on entry to the corresponding basic
block and leave it to VPlan simplification to remove unneeded masks.
Note that we need to create masks for all blocks, if any of the blocks
in the loop needs predication, as computing the mask of a block depends
on the masks of its predecessor.
Needed for #76090.
https://github.com/llvm/llvm-project/pull/76635
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4 |
|
#
96e83d37 |
| 29-Aug-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Use IRBuilder to create and optimize middle-block compare.
Split off from D150398 to avoid builder-related diff changes there. Using IRBuilder to create ICmps simplifies the result if both oper
[LV] Use IRBuilder to create and optimize middle-block compare.
Split off from D150398 to avoid builder-related diff changes there. Using IRBuilder to create ICmps simplifies the result if both operands are constants.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D158332
show more ...
|
Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
#
83ab5708 |
| 17-Apr-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Don't sink scalar instructions that may read from memory.
The current sinking code doesn't prevent us from sinking a load past an aliasing store. Skip sinking instructions that may read from me
[LV] Don't sink scalar instructions that may read from memory.
The current sinking code doesn't prevent us from sinking a load past an aliasing store. Skip sinking instructions that may read from memory to avoid a mis-compile.
See @minimal_bit_widths_with_aliasing_store for an example where 2 loads are sunk past aliasing stores before this fix.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147259
show more ...
|
Revision tags: llvmorg-16.0.1 |
|
#
b060ca70 |
| 30-Mar-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Regenerate check lines for test to reduce diff in follow-up patch.
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
eae26b66 |
| 04-Jan-2023 |
Paul Walker <paul.walker@arm.com> |
[IRBuilder] Use canonical i64 type for insertelement index used by vector splats.
Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as a
[IRBuilder] Use canonical i64 type for insertelement index used by vector splats.
Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as an integer so we may as well use the prefered form from creation.
NOTE: All test changes are mechanical with nothing else expected beyond a change of index type from i32 to i64.
Differential Revision: https://reviews.llvm.org/D140983
show more ...
|
#
68469a80 |
| 06-Jan-2023 |
Florian Hahn <flo@fhahn.com> |
[LV] Disable runtime unrolling for vectorized loops.
This patch adds metadata to disable runtime unrolling to the vectorized loop. If runtime unrolling/interleaving is considered profitable, LV will
[LV] Disable runtime unrolling for vectorized loops.
This patch adds metadata to disable runtime unrolling to the vectorized loop. If runtime unrolling/interleaving is considered profitable, LV will interleave the loop directly. There should be no need to perform runtime unrolling at a later stage.
Note that we already add metadata to disable runtime unrolling to the scalar loop after vectorization.
The additional unrolling unnecessarily increases code size and compile time. In addition to that we have several bug reports of unncessary runtime unrolling for vectorized loops, e.g. PR40961
Compile-time improvements:
NewPM-O3: -1.04% NewPM-ReleaseThinLTO: -0.59% NewPM-ReleaseLTO-g: -0.97%
https://llvm-compile-time-tracker.com/compare.php?from=ce1be13a868d0f8afa367975558c1a6175cce33a&to=78bc2e67f22e9e10e61cdb6cdac4bb857d95eb1b&stat=instructions:u
Fixes #40306.
Reviewed By: lebedev.ri, nikic
Differential Revision: https://reviews.llvm.org/D115261
show more ...
|
#
5b400150 |
| 14-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[LoopVectorize] Convert some tests to opaque pointers (NFC)
For these tests update_test_checks.py had to be rerun.
|
#
be51fa45 |
| 05-Dec-2022 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC] Port all runlines for LoopVectorize pass tests to -passes syntax
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4 |
|
#
e25ed058 |
| 20-Oct-2022 |
Florian Hahn <flo@fhahn.com> |
[LV] Use buildScalarSteps to also handle VF = 1. (NFCI)
The code in buildScalarSteps already properly handles creating the scalar induction values with VF = 1. Use it directly instead of using extra
[LV] Use buildScalarSteps to also handle VF = 1. (NFCI)
The code in buildScalarSteps already properly handles creating the scalar induction values with VF = 1. Use it directly instead of using extra code to handle that case.
Suggested by @Ayal in D133760.
show more ...
|
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
#
4c4c0d2c |
| 08-Sep-2022 |
Philip Reames <preames@rivosinc.com> |
[LV] Use safe-divisor lowering for fixed vectors if profitable
This extends the safe-divisor widening scheme recently added for scalable vectors to handle fixed vectors as well.
Differential Revisi
[LV] Use safe-divisor lowering for fixed vectors if profitable
This extends the safe-divisor widening scheme recently added for scalable vectors to handle fixed vectors as well.
Differential Revision: https://reviews.llvm.org/D132591
show more ...
|
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
1a73ef75 |
| 20-Jul-2022 |
Philip Reames <preames@rivosinc.com> |
[LV] Autogen a test for ease of update
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
5b362e4c |
| 20-Dec-2021 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Add Debugloc to VPInstruction.
Upcoming changes require attaching debug locations to VPInstructions, e.g. adding induction increment recipes in D113223.
Reviewed By: Ayal
Differential Revi
[VPlan] Add Debugloc to VPInstruction.
Upcoming changes require attaching debug locations to VPInstructions, e.g. adding induction increment recipes in D113223.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D115123
show more ...
|
#
42263e7d |
| 13-Dec-2021 |
Florian Hahn <flo@fhahn.com> |
[LV] Add test with debug locations on branches that get scalarized.
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4 |
|
#
80aa7e14 |
| 28-Jun-2021 |
Florian Hahn <flo@fhahn.com> |
[VPlan] Merge predicated-triangle regions, after sinking.
Sinking scalar operands into predicated-triangle regions may allow merging regions. This patch adds a VPlan-to-VPlan transform that tries to
[VPlan] Merge predicated-triangle regions, after sinking.
Sinking scalar operands into predicated-triangle regions may allow merging regions. This patch adds a VPlan-to-VPlan transform that tries to merge predicate-triangle regions after sinking.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D100260
show more ...
|
Revision tags: llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
#
ed253ef7 |
| 09-Feb-2021 |
Juneyoung Lee <aqjune@gmail.com> |
[LoopVectorize] Fix VPRecipeBuilder::createEdgeMask to correctly generate the mask
This patch fixes pr48832 by correctly generating the mask when a poison value is involved.
Consider this CFG (whic
[LoopVectorize] Fix VPRecipeBuilder::createEdgeMask to correctly generate the mask
This patch fixes pr48832 by correctly generating the mask when a poison value is involved.
Consider this CFG (which is a part of the input):
``` for.body: ; preds = %for.cond br i1 true, label %cond.false, label %land.rhs
land.rhs: ; preds = %for.body br i1 poison, label %cond.end, label %cond.false
cond.false: ; preds = %for.body, %land.rhs br label %cond.end
cond.end: ; preds = %land.rhs, %cond.false %cond = phi i32 [ 0, %cond.false ], [ 1, %land.rhs ]
```
The path for.body -> land.rhs -> cond.end should be taken when 'select i1 false, i1 poison, i1 false' holds (which means it's never taken); but VPRecipeBuilder::createEdgeMask was emitting 'and i1 false, poison' instead. The former one successfully blocks poison propagation whereas the latter one doesn't, making the condition poison and thus causing the miscompilation.
SimplifyCFG has a similar bug (which didn't expose a real-world bug yet), and a patch for this is also ongoing (see https://reviews.llvm.org/D95026).
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D95217
show more ...
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
#
4a8e6ed2 |
| 05-Jan-2021 |
Juneyoung Lee <aqjune@gmail.com> |
[SLP,LV] Use poison constant vector for shufflevector/initial insertelement
This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef. This is a part o
[SLP,LV] Use poison constant vector for shufflevector/initial insertelement
This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef. This is a part of efforts for using poison vector instead of undef to represent "doesn't care" vector. The goal is to make nice shufflevector optimizations valid that is currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185 ).
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D94061
show more ...
|
#
c043f505 |
| 19-Dec-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[SimplifyCFG] Teach FoldBranchToCommonDest() to preserve DomTree, part 1
... for conditional branch case
|
#
b43b77ff |
| 19-Dec-2020 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFCI][SimlifyCFG] simplifyOnce(): also perform DomTree validation
And that exposes that a number of tests don't *actually* manage to maintain DomTree validity, which is inline with my observations.
[NFCI][SimlifyCFG] simplifyOnce(): also perform DomTree validation
And that exposes that a number of tests don't *actually* manage to maintain DomTree validity, which is inline with my observations.
Once again, SimlifyCFG pass currently does not require/preserve DomTree by default, so this is effectively NFC.
show more ...
|