Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
38fffa63 |
| 06-Nov-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)
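Illustrative sketch (not taken from the commit; it assumes the existing LLVM IR splat constant syntax): a Constant[Data]Vector whose lanes are all identical is now printed in the compact splat form instead of listing every element.

    ; Previously printed with every lane spelled out:
    @c_old = constant <4 x i32> <i32 7, i32 7, i32 7, i32 7>
    ; With this change the same constant prints as:
    @c_new = constant <4 x i32> splat (i32 7)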
|
Revision tags: llvmorg-19.1.3 |
|
#
12bcea32 |
| 18-Oct-2024 |
Han-Kuan Chen <hankuan.chen@sifive.com> |
[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#111459)
reference: https://github.com/llvm/llvm-project/pull/110457
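Illustrative sketch (hypothetical IR, not from the patch): a shufflevector whose mask simply appends all lanes of its two operands is a concatenation, and can be costed as CONCAT_VECTORS rather than as a generic shuffle.

    ; Mask <0..7> takes all four lanes of %a followed by all four lanes of %b,
    ; i.e. the two <4 x i32> operands are concatenated into one <8 x i32>.
    define <8 x i32> @concat(<4 x i32> %a, <4 x i32> %b) {
      %c = shufflevector <4 x i32> %a, <4 x i32> %b,
                         <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
      ret <8 x i32> %c
    }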
|
Revision tags: llvmorg-19.1.2 |
|
#
a65a5feb |
| 08-Oct-2024 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Improve masked loads vectorization, attempting gathered loads
If a vector of loads can be vectorized as a masked gather and there are several other masked gather nodes, the compiler can try to check whether it is possible to combine such nodes into one big consecutive/strided loads node, which provides better performance.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/110151
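Much-simplified illustrative sketch (hypothetical IR, not from the patch): loads whose addresses turn out to be consecutive can be covered by one wide load instead of an @llvm.masked.gather node.

    ; Four float loads from consecutive addresses. Instead of vectorizing them
    ; as a masked gather of four pointers, SLP can emit a single consecutive
    ; <4 x float> load from %base.
    define <4 x float> @consecutive(ptr %base) {
      %p1 = getelementptr inbounds float, ptr %base, i64 1
      %p2 = getelementptr inbounds float, ptr %base, i64 2
      %p3 = getelementptr inbounds float, ptr %base, i64 3
      %a = load float, ptr %base
      %b = load float, ptr %p1
      %c = load float, ptr %p2
      %d = load float, ptr %p3
      %v0 = insertelement <4 x float> poison, float %a, i32 0
      %v1 = insertelement <4 x float> %v0, float %b, i32 1
      %v2 = insertelement <4 x float> %v1, float %c, i32 2
      %v3 = insertelement <4 x float> %v2, float %d, i32 3
      ret <4 x float> %v3
    }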
|
#
f11568bc |
| 07-Oct-2024 |
Philip Reames <preames@rivosinc.com> |
Revert "[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)"
This reverts commit 554eaec63908ed20c35c8cc85304a3d44a63c634. Change was not approved
Revert "[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)"
This reverts commit 554eaec63908ed20c35c8cc85304a3d44a63c634. Change was not approved when landed.
|
#
554eaec6 |
| 05-Oct-2024 |
Han-Kuan Chen <hankuan.chen@sifive.com> |
[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)
|
Revision tags: llvmorg-19.1.1 |
|
#
7f6bbb3c |
| 20-Sep-2024 |
Philip Reames <preames@rivosinc.com> |
[RISCV][TTI] Reduce cost of a build_vector pattern (#108419)
This change is actually two related changes, but they're very hard to
meaningfully separate as the second balances the first, and yet doesn't
do much good on its own.
First, we can reduce the cost of a build_vector pattern. Our current
costing for this defers to generic insertelement costing which isn't
unreasonable, but also isn't correct. While inserting N elements
requires N-1 slides and N vmv.s.x, doing the full build_vector only
requires N vslide1down. (Note there are other cases that our build
vector lowering can do more cheaply, this is simply the easiest upper
bound which appears to be "good enough" for SLP costing purposes.)
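Illustrative sketch of the first point (generic IR, not from the patch): the whole insertelement chain below is a build_vector that RVV can lower with roughly N vslide1down.vx instructions, so costing each insert in isolation (N-1 slides plus N vmv.s.x) overestimates it.

    ; A 4-element build_vector expressed as an insertelement chain; on RVV the
    ; whole pattern lowers to ~4 vslide1down.vx instructions.
    define <4 x i32> @build(i32 %a, i32 %b, i32 %c, i32 %d) {
      %v0 = insertelement <4 x i32> poison, i32 %a, i32 0
      %v1 = insertelement <4 x i32> %v0, i32 %b, i32 1
      %v2 = insertelement <4 x i32> %v1, i32 %c, i32 2
      %v3 = insertelement <4 x i32> %v2, i32 %d, i32 3
      ret <4 x i32> %v3
    }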
Second, we need to tell SLP that calls don't preserve vector registers.
Without this, SLP will vectorize scalar code which performs e.g. 4 x
float @exp calls as two <2 x float> @exp intrinsic calls. Oddly, the
costing works out that this is in fact the optimal choice - except that
we don't actually have a <2 x float> @exp, and it gets unrolled during DAG lowering. This
would be fine (or at least cost neutral) except that the libcall for the
scalar @exp blows all vector registers. So the net effect is we added a
bunch of spills that SLP had no idea about. Thankfully, AArch64 has a
similar problem, and has taught SLP how to reason about spill cost once
the right TTI hook is implemented.
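Hypothetical shape of the problematic input (names such as @expf are illustrative, not from the patch): four scalar exp calls that SLP could pair into two <2 x float> @llvm.exp.v2f32 calls; since no such vector libcall exists, they unroll back into scalar libcalls whose calling convention clobbers the vector registers, adding spills SLP never accounted for.

    declare float @expf(float)

    define void @four_exps(ptr %src, ptr %dst) {
      %p1 = getelementptr inbounds float, ptr %src, i64 1
      %p2 = getelementptr inbounds float, ptr %src, i64 2
      %p3 = getelementptr inbounds float, ptr %src, i64 3
      %a = load float, ptr %src
      %b = load float, ptr %p1
      %c = load float, ptr %p2
      %d = load float, ptr %p3
      ; Each libcall clobbers all vector registers, so any live vector values
      ; must be spilled and reloaded around it.
      %ea = call float @expf(float %a)
      %eb = call float @expf(float %b)
      %ec = call float @expf(float %c)
      %ed = call float @expf(float %d)
      %q1 = getelementptr inbounds float, ptr %dst, i64 1
      %q2 = getelementptr inbounds float, ptr %dst, i64 2
      %q3 = getelementptr inbounds float, ptr %dst, i64 3
      store float %ea, ptr %dst
      store float %eb, ptr %q1
      store float %ec, ptr %q2
      store float %ed, ptr %q3
      ret void
    }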
Now, for some implications...
The SLP solution for spill costing has some inaccuracies. In particular,
it basically just guesses whether an intrinsic will be lowered to a call
or not, and can be wrong in both directions. It also has no mechanism to
differentiate on calling convention.
This has the effect of making partial vectorization (i.e. starting in
scalar) more profitable. In practice, the major effect of this is to
make it more likely that SLP will vectorize part of a tree in an intersecting
forest, and then vectorize the remaining tree once those uses have been
removed.
This has the effect of biasing us slightly away from strided or indexed loads during vectorization, because the scalar cost is more accurately modeled and these instructions look relatively less profitable.
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
73ce13d7 |
| 10-Jan-2024 |
Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com> |
[SLP][TTI]Improve detection of the insert-subvector pattern for SLP. (#74749)
The SLP vectorizer passes the type of the subvector and the mask, whose size determines the size of the resulting vector. TTI should support this pattern to improve cost estimation of the insert_subvector shuffle pattern.
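Illustrative sketch (hypothetical IR; the change itself is in the cost model, not in IR): inserting a <2 x i32> subvector at offset 2 of a <4 x i32> vector shows up as a shufflevector whose mask takes the leading lanes from the base vector and the trailing lanes from the (widened) subvector.

    define <4 x i32> @insert_sub(<4 x i32> %base, <2 x i32> %sub) {
      ; Widen the subvector to the destination length.
      %wide = shufflevector <2 x i32> %sub, <2 x i32> poison,
                            <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
      ; Lanes 0-1 come from %base, lanes 2-3 from the first two lanes of
      ; %wide: an insert_subvector at index 2.
      %ins = shufflevector <4 x i32> %base, <4 x i32> %wide,
                           <4 x i32> <i32 0, i32 1, i32 4, i32 5>
      ret <4 x i32> %ins
    }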
|
#
dd0e38eb |
| 07-Dec-2023 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Add a test for missed insert_subvector pattern detection, NFC.
|