Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 38fffa63 06-Nov-2024 Paul Walker <paul.walker@arm.com>

[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548)


Revision tags: llvmorg-19.1.3
# 12bcea32 18-Oct-2024 Han-Kuan Chen <hankuan.chen@sifive.com>

[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#111459)

reference: https://github.com/llvm/llvm-project/pull/110457


Revision tags: llvmorg-19.1.2
# a65a5feb 08-Oct-2024 Alexey Bataev <a.bataev@outlook.com>

[SLP]Improve masked loads vectorization, attempting gathered loads

If the vector of loads can be vectorized as masked gather and there are
several other masked gather nodes, compiler can try to atte

[SLP]Improve masked loads vectorization, attempting gathered loads

If the vector of loads can be vectorized as masked gather and there are
several other masked gather nodes, compiler can try to attempt to check,
if it possible to gather such nodes into big consecutive/strided loads
node, which provide better performance.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/110151

show more ...


# f11568bc 07-Oct-2024 Philip Reames <preames@rivosinc.com>

Revert "[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)"

This reverts commit 554eaec63908ed20c35c8cc85304a3d44a63c634. Change was not approved

Revert "[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)"

This reverts commit 554eaec63908ed20c35c8cc85304a3d44a63c634. Change was not approved when landed.

show more ...


# 554eaec6 05-Oct-2024 Han-Kuan Chen <hankuan.chen@sifive.com>

[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457)


Revision tags: llvmorg-19.1.1
# 7f6bbb3c 20-Sep-2024 Philip Reames <preames@rivosinc.com>

[RISCV][TTI] Reduce cost of a build_vector pattern (#108419)

This change is actually two related changes, but they're very hard to
meaningfully separate as the second balances the first, and yet do

[RISCV][TTI] Reduce cost of a build_vector pattern (#108419)

This change is actually two related changes, but they're very hard to
meaningfully separate as the second balances the first, and yet doesn't
do much good on it's own.

First, we can reduce the cost of a build_vector pattern. Our current
costing for this defers to generic insertelement costing which isn't
unreasonable, but also isn't correct. While inserting N elements
requires N-1 slides and N vmv.s.x, doing the full build_vector only
requires N vslide1down. (Note there are other cases that our build
vector lowering can do more cheaply, this is simply the easiest upper
bound which appears to be "good enough" for SLP costing purposes.)

Second, we need to tell SLP that calls don't preserve vector registers.
Without this, SLP will vectorize scalar code which performs e.g. 4 x
float @exp calls as two <2 x float> @exp intrinsic calls. Oddly, the
costing works out that this is in fact the optimal choice - except that
we don't actually have a <2 x float> @exp, and unroll during DAG. This
would be fine (or at least cost neutral) except that the libcall for the
scalar @exp blows all vector registers. So the net effect is we added a
bunch of spills that SLP had no idea about. Thankfully, AArch64 has a
similiar problem, and has taught SLP how to reason about spill cost once
the right TTI hook is implemented.

Now, for some implications...

The SLP solution for spill costing has some inaccuracies. In particular,
it basically just guesses whether a intrinsic will be lowered to a call
or not, and can be wrong in both directions. It also has no mechanism to
differentiate on calling convention.

This has the effect of making partial vectorization (i.e. starting in
scalar) more profitable. In practice, the major effect of this is to
make it more like SLP will vectorize part of a tree in an intersecting
forrest, and then vectorize the remaining tree once those uses have been
removed.

This has the effect of biasing us slightly away from strided, or indexed
loads during vectorization - because the scalar cost is more accurately
modeled, and these instructions look relevatively less profitable.

show more ...


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 73ce13d7 10-Jan-2024 Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>

[SLP][TTI]Improve detection of the insert-subvector pattern for SLP. (#74749)

SLP vectorizer passes the type of the subvector and the mask, which size
determines the size of the resulting vector. TT

[SLP][TTI]Improve detection of the insert-subvector pattern for SLP. (#74749)

SLP vectorizer passes the type of the subvector and the mask, which size
determines the size of the resulting vector. TTI should support this
pattern to improve cost estimation of the insert_subvector shuffle
pattern.

show more ...


# dd0e38eb 07-Dec-2023 Alexey Bataev <a.bataev@outlook.com>

[SLP]Add a test for missed insert_subvector pattern detection, NFC.