Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
d942f5e1 |
| 13-Nov-2024 |
hanbeom <kese111@gmail.com> |
[VectorCombine] Combine extract/insert from vector into a shuffle (#115213)
insert (DstVec, (extract SrcVec, ExtIdx), InsIdx) --> shuffle (DstVec, SrcVec, Mask)
This commit combines an extract from one vector followed by an insert into another vector into a single shuffle of the two vectors.
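As a rough illustration of the fold above, the following Python sketch models vectors as lists and builds a shuffle mask equivalent to the insert-of-extract. It assumes DstVec and SrcVec have the same number of elements. The helper names (insert_of_extract, build_mask, shuffle) are hypothetical and not part of LLVM, and any cost-model check the pass performs is not modelled here.

```python
def insert_of_extract(dst, src, ext_idx, ins_idx):
    """Reference semantics: insert (dst, (extract src, ext_idx), ins_idx)."""
    out = list(dst)
    out[ins_idx] = src[ext_idx]
    return out


def build_mask(num_elts, ext_idx, ins_idx):
    """Identity mask over dst, except lane ins_idx selects src[ext_idx].

    Uses the two-operand shuffle convention: indices 0..N-1 pick from the
    first operand, indices N..2N-1 pick from the second operand.
    """
    mask = list(range(num_elts))
    mask[ins_idx] = num_elts + ext_idx
    return mask


def shuffle(a, b, mask):
    """Model of a two-operand shuffle over the concatenation of a and b."""
    joined = list(a) + list(b)
    return [joined[i] for i in mask]


dst = [10, 11, 12, 13]
src = [20, 21, 22, 23]
mask = build_mask(len(dst), ext_idx=2, ins_idx=1)
# Both forms must produce the same vector.
assert shuffle(dst, src, mask) == insert_of_extract(dst, src, 2, 1)
```

Only the lane at InsIdx differs from the identity mask, which is why a single shuffle can stand in for the extract/insert pair in this model.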
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
8b56da5e |
| 26-Apr-2023 |
ManuelJBrito <manuel.brito@tecnico.ulisboa.pt> |
[IR] Change shufflevector undef mask to poison
With this patch an undefined mask in a shufflevector will be printed as poison. This change is done to support the new shufflevector semantics for undefined mask elements.
Differential Revision: https://reviews.llvm.org/D149210
|
Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
5e4dbd7a |
| 20-Oct-2021 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
[NewPM][test] Use -passes syntax in VectorCombine lit tests
The legacy PM is deprecated, so use the new PM syntax in lit tests running the vector-combine pass.
|
#
0dcd2b40 |
| 06-Oct-2021 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[TTI] Remove default condition type and predicate arguments from getCmpSelInstrCost
We need to be better at exposing the comparison predicate to getCmpSelInstrCost calls as some targets (e.g. X86 SSE) have very different costs for different comparisons (PR48337), and we can't always rely on the optional Instruction argument.
This initial commit requires explicit condition type and predicate arguments. The next step will be to review a lot of the existing getCmpSelInstrCost calls which have used BAD_ICMP_PREDICATE even when the predicate is known.
Differential Revision: https://reviews.llvm.org/D111024
|
#
0776924a |
| 05-Oct-2021 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[CostModel][X86] getCmpSelInstrCost - treat BAD_PREDICATEs the same as the worst case cost predicates for ICMP/FCMP instructions
As suggested on D111024, we should treat getCmpSelInstrCost calls without a specific predicate as matching the worst case predicate cost.
These regressions will be addressed with a mixture of D111024 and fixing other specific getCmpSelInstrCost calls to have realistic predicates.
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
#
278aa65c |
| 24-Dec-2020 |
Juneyoung Lee <aqjune@gmail.com> |
[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93793
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
cce625f7 |
| 22-Jun-2020 |
Sanjay Patel <spatel@rotateright.com> |
[VectorCombine] improve IR debugging by providing/salvaging value names
The tests are regenerated to show the diffs, but there should be no functional change from this patch.
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
ce97ce3a |
| 03-Apr-2020 |
Sanjay Patel <spatel@rotateright.com> |
[VectorCombine] try to form a better extractelement
Extracting to the same index that we are going to insert back into allows forming select ("blend") shuffles and enables further transforms.
Admittedly, this is a quick-fix for a more general problem that I'm hoping to solve by adding transforms for patterns that start with an insertelement.
But this might resolve some regressions known to be caused by the extract-extract transform (although I have not gotten more details on those yet).
In the motivating case from PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724
The combination of subsequent instcombine and codegen transforms gets us this improvement:
  vmovshdup %xmm0, %xmm2                 ## xmm2 = xmm0[1,1,3,3]
  vhaddps   %xmm1, %xmm1, %xmm4
  vmovshdup %xmm1, %xmm3                 ## xmm3 = xmm1[1,1,3,3]
  vaddps    %xmm0, %xmm2, %xmm0
  vaddps    %xmm1, %xmm3, %xmm1
  vshufps   $200, %xmm4, %xmm0, %xmm0    ## xmm0 = xmm0[0,2],xmm4[0,3]
  vinsertps $177, %xmm1, %xmm0, %xmm0    ## xmm0 = zero,xmm0[1,2],xmm1[2]
-->
  vmovshdup %xmm0, %xmm2                 ## xmm2 = xmm0[1,1,3,3]
  vhaddps   %xmm1, %xmm1, %xmm1
  vaddps    %xmm0, %xmm2, %xmm0
  vshufps   $200, %xmm1, %xmm0, %xmm0    ## xmm0 = xmm0[0,2],xmm1[0,3]
Differential Revision: https://reviews.llvm.org/D76623
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6 |
|
#
5eeea337 |
| 23-Mar-2020 |
Sanjay Patel <spatel@rotateright.com> |
[VectorCombine] add more tests for extract-extract patterns; NFC
|
Revision tags: llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4 |
|
#
a69158c1 |
| 08-Mar-2020 |
Sanjay Patel <spatel@rotateright.com> |
[VectorCombine] fold extract-extract-op with different extraction indexes
opcode (extelt V0, Ext0), (extelt V1, Ext1) --> extelt (opcode (splat V0, Ext0), V1), Ext1
The first part of this patch generalizes the cost calculation to accept different extraction indexes. The second part creates a shuffle+extract before feeding into the existing code to create a vector op+extract.
The patch conservatively uses "TargetTransformInfo::SK_PermuteSingleSrc" rather than "TargetTransformInfo::SK_Broadcast" (splat specifically from element 0) because we do not have a more general "SK_Splat" currently. That does not affect any of the current regression tests, but we might be able to find some cost model target specialization where that comes into play.
I suspect that we can expose some missing x86 horizontal op codegen with this transform, so I'm speculatively adding a debug flag to disable the binop variant of this transform to allow easier testing.
The test changes show that we're sensitive to cost model diffs (as we should be), so that means that patches like D74976 should have better coverage.
Differential Revision: https://reviews.llvm.org/D75689
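The fold stated at the top of this commit message can be checked with a small Python model. This is a sketch under assumptions: vectors are plain lists of equal length, the operation is applied lane by lane, and the helper names (splat, scalar_form, vector_form) are hypothetical. It models neither the TTI cost comparison nor poison or undefined lanes.

```python
import operator


def splat(vec, idx):
    """Broadcast element idx of vec to every lane."""
    return [vec[idx]] * len(vec)


def scalar_form(op, v0, ext0, v1, ext1):
    """opcode (extelt V0, Ext0), (extelt V1, Ext1)"""
    return op(v0[ext0], v1[ext1])


def vector_form(op, v0, ext0, v1, ext1):
    """extelt (opcode (splat V0, Ext0), V1), Ext1"""
    wide = [op(a, b) for a, b in zip(splat(v0, ext0), v1)]
    return wide[ext1]


v0 = [1, 2, 3, 4]
v1 = [10, 20, 30, 40]
# Subtraction is not commutative, so operand order is exercised too.
assert scalar_form(operator.sub, v0, 3, v1, 1) == vector_form(operator.sub, v0, 3, v1, 1)
```

After the splat, lane Ext1 of the first operand holds V0[Ext0], so the lane-wise operation at Ext1 computes the same value as the scalar form.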
|
Revision tags: llvmorg-10.0.0-rc3 |
|
#
e0568ef2 |
| 24-Feb-2020 |
Sanjay Patel <spatel@rotateright.com> |
[VectorCombine] add tests for possible extract->shuffle; NFC
|
#
62dd44d7 |
| 16-Feb-2020 |
Sanjay Patel <spatel@rotateright.com> |
[VectorCombine] fix cost calc for extract-cmp
getOperationCost() is not the cost we wanted; that's not the throughput value that the rest of the calculation uses.
We may want to switch everything in this code to use the getInstructionThroughput() wrapper to avoid these kinds of problems, but I'll look at that as a follow-up because that can create other logical diffs via using optional parameters (we'd need to speculatively create the vector instruction to make a fair(er) comparison).
|
Revision tags: llvmorg-10.0.0-rc2 |
|
#
a17f03bd |
| 09-Feb-2020 |
Sanjay Patel <spatel@rotateright.com> |
[VectorCombine] new IR transform pass for partial vector ops
We have several bug reports that could be characterized as "reducing scalarization", and this topic was also raised on llvm-dev recently:
http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html
...so I'm proposing that we deal with these patterns in a new, lightweight IR vector pass that runs before/after other vectorization passes.
There are 4 alternate options that I can think of to deal with this kind of problem (and we've seen various attempts at all of these), but they all have flaws:
InstCombine - can't happen without TTI, but we don't want target-specific folds there.
SDAG - too late to assist other vectorization passes; TLI is not equipped for these kinds of cost queries; limited to a single basic block.
CGP - too late to assist other vectorization passes; would need to re-implement basic cleanups like CSE/instcombine.
SLP - doesn't fit with existing transforms; limited to a single basic block.
This initial patch/transform is based on existing code in AggressiveInstCombine: we walk backwards through the function looking for a pattern match. But we diverge from that cost-independent IR canonicalization pass by using TTI to decide if the vector alternative is profitable.
We probably have at least 10 similar bug reports/patterns (binops, constants, inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements. It's possible that we could iterate on a worklist to fix-point like InstCombine does, but it's safer to start with a most basic case and evolve from there, so I didn't try to do anything fancy with this initial implementation.
Differential Revision: https://reviews.llvm.org/D73480
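The idea of gating a transform on a cost comparison can be sketched in Python. Everything here is an assumption made for illustration: the Costs fields, the numbers, the formula (two extracts plus a scalar op versus one vector op plus one extract) and the strict less-than tie rule are hypothetical, and do not reproduce the pass's actual TTI queries.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Hypothetical per-instruction costs, standing in for TTI answers."""
    scalar_op: int
    vector_op: int
    extract: int


def vector_form_is_profitable(c: Costs) -> bool:
    # Scalar form: extract both operands, then one scalar operation.
    old_cost = 2 * c.extract + c.scalar_op
    # Vector form: one vector operation, then a single extract.
    new_cost = c.vector_op + c.extract
    # Assumed rule: transform only when the vector form is strictly cheaper.
    return new_cost < old_cost


cheap_vector = Costs(scalar_op=1, vector_op=1, extract=1)
costly_vector = Costs(scalar_op=1, vector_op=4, extract=1)
assert vector_form_is_profitable(cheap_vector)
assert not vector_form_is_profitable(costly_vector)
```

A cost-independent canonicalization would rewrite the pattern unconditionally; in this sketch the same pattern is left alone whenever the target's costs make the vector form no cheaper.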
|