History log of /llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-splat.ll
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 97982a8c 05-Nov-2024 dlav-sc <daniil.avdeev@syntacore.com>

[RISCV][CFI] add function epilogue cfi information (#110810)

This patch adds CFI instructions in the function epilogue.

Before patch:
addi sp, s0, -32
ld ra, 24(sp) # 8-byte Folded Reload
ld s0, 16(sp) # 8-byte Folded Reload
ld s1, 8(sp) # 8-byte Folded Reload
addi sp, sp, 32
ret

After patch:
addi sp, s0, -32
.cfi_def_cfa sp, 32
ld ra, 24(sp) # 8-byte Folded Reload
ld s0, 16(sp) # 8-byte Folded Reload
ld s1, 8(sp) # 8-byte Folded Reload
.cfi_restore ra
.cfi_restore s0
.cfi_restore s1
addi sp, sp, 32
.cfi_def_cfa_offset 0
ret

This functionality is already present in `riscv-gcc`, but it’s not in
`clang` and this slightly impairs the `lldb` debugging experience, e.g.
backtrace.



Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 2967e5f8 11-Oct-2024 Alex Bradbury <asb@igalia.com>

[RISCV] Enable store clustering by default (#73796)

Builds on #73789, enabling store clustering by default using the same
heuristic.


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4
# 26a8a857 22-Aug-2024 Philip Reames <preames@rivosinc.com>

[RISCV] Introduce local peephole to reduce VLs based on demanded VL (#104689)

This is a fairly narrow transform (at the moment) to reduce the VLs of
instructions feeding a store with a smaller VL. Note that the goal of
this transform isn't really to reduce VL - it's to reduce VL *toggles*.
To our knowledge, small reductions in VL without also changing LMUL are
generally not profitable on existing hardware.

For a single use instruction without side effects, fp exceptions, or a
result dependency on VL, reducing VL is legal if only a subset of
elements are legal. We'd already implemented this logic for vmv.v.v, and
this patch simply applies it to stores as an alternate root.

Longer term, I plan to extend this to other root instructions (e.g.
different kinds of stores, reductions, etc.), and add a more general
recursive walkback through operands.

One risk with the dataflow based approach is that we could be reducing
VL of an instruction scheduled in a region with the wider VL (i.e. mixed
mode computations) forcing an additional VL toggle. An example of this
is the @insert_subvector_dag_loop test case, but it doesn't appear to
happen widely. I think this is a risk we should accept.
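
As a rough sketch (not taken from the patch; the function and values are hypothetical), the kind of case this targets is a narrower store fed by a wider computation, where shrinking the feeding instruction's VL to the store's VL removes a toggle:

```llvm
; Hypothetical IR: the add is lowered at VL=4, but the store only demands the
; low 2 elements, so the peephole can reduce the add's VL to 2 and drop the
; vsetvli toggle between the two instructions.
define void @store_low_half(<4 x i32> %a, <4 x i32> %b, ptr %p) {
  %sum = add <4 x i32> %a, %b
  %lo = shufflevector <4 x i32> %sum, <4 x i32> poison, <2 x i32> <i32 0, i32 1>
  store <2 x i32> %lo, ptr %p
  ret void
}
```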



Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4
# d8d131df 09-Apr-2024 Luke Lau <luke@igalia.com>

[RISCV] Convert more constant splats in tests to splat shorthand. NFC (#87616)

A handy shorthand for specifying the shufflevector(insertelement(poison,
foo, 0), poison, zeroinitializer) splat pattern was introduced in
#74620.

Some of the RISC-V tests were converted over to use this new form in
dbb65dd330cc1696d7ca3dedc7aa9fa12c55a075, this patch handles the rest
which didn't have any codegen diffs.

This not only converts some constant expressions to the new form, but
also converts instruction sequences that weren't previously constant
expressions into constant expressions. In some cases this affects
codegen, but those cases have been omitted here and will be handled in a
separate PR.
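
For reference (these functions are illustrative, not taken from the patch), the two equivalent forms look like this:

```llvm
; Old constant-expression splat pattern:
define <4 x i32> @add_splat_old(<4 x i32> %x) {
  %r = add <4 x i32> %x, shufflevector (<4 x i32> insertelement (<4 x i32> poison, i32 7, i64 0), <4 x i32> poison, <4 x i32> zeroinitializer)
  ret <4 x i32> %r
}

; Shorthand introduced in #74620:
define <4 x i32> @add_splat_new(<4 x i32> %x) {
  %r = add <4 x i32> %x, splat (i32 7)
  ret <4 x i32> %r
}
```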



Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# 0fee2115 14-Feb-2024 Luke Lau <luke@igalia.com>

[RISCV] Remove -riscv-v-fixed-length-vector-lmul-max from tests. NFC (#78299)

Some fixed vector tests in test/CodeGen/RISCV/rvv have multiple run lines that
check various configurations of -riscv-v-fixed-length-vector-lmul-max. From
what I understand this flag was introduced in the early days of fixed length
vector support, but now that fixed vector codegen has matured I'm not sure if
it's as relevant today.

This patch proposes to remove the various lmul-max run lines from the tests to
make them more readable, and any changes to fixed vector codegen easier to
review.

We have removed them before for the same reason, so this would take care of the
remaining test cases: https://reviews.llvm.org/D157973#4593268

(I don't have any strong motivation to remove the actual flag itself, my own
personal motivation is just to clean up the tests)



Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0
# 74f985b7 06-Sep-2023 Luke Lau <luke@igalia.com>

[RISCV] Remove -riscv-v-vector-bits-min in tests. NFC (#65404)

V implies Zvl128b, but a lot of the fixed vector tests also redundantly
specify -riscv-v-vector-bits-min=128. This patch removes them where
there isn't another minimum vlen being tested for, and for cases where
Zve* is being used Zvl128b was added to maintain the old test diff (and
because an awkward vlen probably isn't interesting to test for). Other
places where -riscv-v-vector-bits-min was being used were replaced with
Zvl.
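
An illustrative RUN-line change of the kind described (representative, not an exact diff from the patch):

```llvm
; Before: the minimum VLEN is spelled out even though +v already implies Zvl128b.
; RUN: llc -mtriple=riscv64 -mattr=+v -riscv-v-vector-bits-min=128 -verify-machineinstrs < %s | FileCheck %s
; After:
; RUN: llc -mtriple=riscv64 -mattr=+v -verify-machineinstrs < %s | FileCheck %s
```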



Revision tags: llvmorg-17.0.0-rc4
# 079c968e 30-Aug-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Form vmv.s.f/x from single element splats via DAG combine

This re-implements the special casing we had in lowerScalarSplat as a DAG combine. As can be seen in the tests, this ends up triggering in a bunch more cases.

The semantically interesting bit of this change is the use of the implicit truncate semantics for when XLEN > SEW. We'd already been doing this for vmv.v.x, but this change extends e.g. the constant matching to make the same assumption about vmv.s.x. Per my reading of the specification, this should be fine, and if anything, is more obviously true of vmv.s.x than vmv.v.x.

Differential Revision: https://reviews.llvm.org/D158874



Revision tags: llvmorg-17.0.0-rc3
# a63bd7e9 14-Aug-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands

In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295), I greatly increased the number of IMPLICIT_DEF operands to our vector instructions. This has turned out to have an unexpected negative impact because MachineCSE does not CSE IMPLICIT_DEFs, and thus does not CSE any instruction with an IMPLICIT_DEF operand. SelectionDAG *does* CSE the same case, but that only covers the same block case, not the cross block case. This led to the performance regression reported in https://github.com/llvm/llvm-project/issues/64282.

This change is a slightly ugly hack to side step the issue. Instead of fixing the root cause (lack of CSE for IMPLICIT_DEF) or undoing the operand changes, we leave the extra operand in place, and use NoReg in place of IMPLICIT_DEF. I then convert back to IMPLICIT_DEF just before register allocation so that ProcessImplicitDefs and TwoAddressInstructions can do the normal transforms to Undef tied registers.

We may end up backporting this into the 17.x release branch. Given how late in the release cycle this is landing, that's much less likely now, but still a possibility.

Differential Revision: https://reviews.llvm.org/D156909



Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 92b5a340 29-Jun-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Remove legacy TA/TU pseudo distinction for unary instructions

This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. In D153155, we started removing the legacy distinction between unsuffixed (TA) and _TU pseudos. This patch continues that effort for the unary instruction families.

The change consists of a few interacting pieces:
* Adding a vector policy operand to VPseudoUnaryNoMaskTU.
* Then using VPseudoUnaryNoMaskTU for all cases where VPseudoUnaryNoMask was previously used and deleting the unsuffixed form.
* Then renaming VPseudoUnaryNoMaskTU to VPseudoUnaryNoMask, and adjusting the RISCVMaskedPseudo table to use the combined pseudo.
* Fixing up two places in C++ code which manually construct VMV_V_* instructions.

Normally, I'd try to factor this into a couple of changes, but in this case, the table structure is tied to naming and thus we can't really separate the otherwise NFC bits.

As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.

Differential Revision: https://reviews.llvm.org/D153899



# aae155c5 21-Jun-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Use a build_vector instead of a chain insert_vector_elts for vXi1 build_vector lowering.

A build_vector is the canonical representation rather than multiple
insert_vector_elts.

Unfortunately, this regresses quite a few tests now primarily due to not
having a vmv.s.x special case, but I hope we can improve this with future
patches.

Stress testing in our downstream found an infinite loop in DAG combine.
This patch breaks the infinite loop.

The insert_vector_element chain starts with a fixed vector undef.
Fixed vector undef is currently expanded to a build_vector of 0s
which gets lowered to a vmv.v.i. The insert chain overwrites all
elements so SimplifyDemandedVectorElts turns the vmv.v.i back into
undef and the cycle repeats.

We probably should custom lower fixed vector undef to scalable
vector undef. I think that would also fix the infinite loop, but
I didn't test that.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D153399



Revision tags: llvmorg-16.0.6
# 2a1716de 06-Jun-2023 Luke Lau <luke@igalia.com>

[LegalizeTypes][VP] Widen load/store of fixed length vectors to VP ops

If we have a load/store with an illegal fixed length vector result type that
needs to be widened, e.g. `x:v6i32 = load p`, then instead of just widening it
to `x:v8i32 = load p` we can widen it to the equivalent VP operation and set
the EVL to the exact number of elements needed: `x:v8i32 = vp_load a, b, mask=true, evl=6`,
provided that the target supports vp_load/vp_store on the widened type.

Scalable vectors are already widened this way where possible, so this
largely reuses the same logic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D148713



Revision tags: llvmorg-16.0.5
# badf11de 31-May-2023 Luke Lau <luke@igalia.com>

[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block

vmv.s.x/vfmv.s.f instructions that only write to the first destination
element can use any SEW greater than or equal to its original SEW,
provided that it's writing to an implicit_def operand where we can
clobber the other lanes.

We were already handling this in needVSETVLI, which meant that when
scanning the instructions from top to bottom we could detect this and
avoid the toggle:

vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0

->
vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vmv.s.x v0, a0

The issue that this patch aims to solve arises when the vmv.s.x is
the first vector instruction in the block and doesn't have any prior
predecessor info:

entry_bb:
li a0, 11
; No previous state here: forced to set VL/VTYPE
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0
vsetivli zero, 4, e16, mf2, ta, ma
vmerge.vvm v8, v9, v8, v0

doLocalPostpass can work backwards from bottom to top and work out if
an earlier vsetvli can be mutated to avoid a toggle. It uses
DemandedFields and getDemanded for this, which previously didn't take
into account the possibility of going to a larger SEW.

A previous patch consolidated the vmv.s.x logic from needVSETVLI
into getDemanded, and this patch removes the gate around it so that
doLocalPostpass can now delete vsetvlis like in the scenario below:

entry_bb:
li a0, 11
; Previous vsetivli mutated: second one deleted
vsetivli zero, 4, e16, mf2, ta, ma
vmv.s.x v0, a0
vmerge.vvm v8, v9, v8, v0

Differential Revision: https://reviews.llvm.org/D151561



# 319adf5d 31-May-2023 Luke Lau <luke@igalia.com>

Revert "[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block"

This reverts commit 0ba41dd3806e658e67acb63353fd5540f2bf333c.


# 0ba41dd3 31-May-2023 Luke Lau <luke@igalia.com>

[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block

vmv.s.x and friends that only write to the first destination element can
use any SEW greater than or equal to its original SEW, provided that
it's writing to an implicit_def operand where we can clobber the other
lanes.

We were already handling this in needVSETVLI, which meant that when
scanning the instructions from top to bottom we could detect this and
avoid the toggle:

```
vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0

->
vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vmv.s.x v0, a0

```
The issue that this patch aims to solve arises when the vmv.s.x is
the first vector instruction in the block and doesn't have any prior
predecessor info:

```
entry_bb:
li a0, 11
; No previous state here: forced to set VL/VTYPE
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0
vsetivli zero, 4, e16, mf2, ta, ma
vmerge.vvm v8, v9, v8, v0
```

doLocalPostpass can work backwards from bottom to top and work out if
an earlier vsetvli can be mutated to avoid a toggle. It uses
DemandedFields and getDemanded for this, which previously didn't take
into account the possibility of going to a larger SEW.

This patch adds a third option for SEW in DemandedFields, that's weaker
than demanded but stronger than not demanded, which states that the
new SEW must be greater than or equal to the current SEW.

We can then use this option to move that vmv.s.x specific logic from
needVSETVLI into getDemanded, making it available for both phase 2 and
3, i.e. we can now mutate the earlier vsetivli going from bottom to top:

```
entry_bb:
li a0, 11
; Previous vsetivli mutated: second one deleted
vsetivli zero, 4, e16, mf2, ta, ma
vmv.s.x v0, a0
vmerge.vvm v8, v9, v8, v0
```

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D151561



# 4dc9a2c5 17-May-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Use scalar stores for splats of zero to memory up to XLen

The direct motivation here is to undo an unprofitable vectorization performed by SLP, but the transform seems generally useful as well. If we are storing a zero to memory, we can use a single scalar store (from X0) for all power of two sizes up to XLen.
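
As an illustration (assumed, not quoted from the patch), an 8-byte all-zero store on RV64 no longer needs a vector splat and vector store; IR like the following can be lowered to a single `sd zero, 0(a0)`:

```llvm
; Hypothetical example: an 8-byte store of zeros fits in XLEN on RV64, so it
; can be emitted as one scalar store from x0 instead of a vsetivli + vmv.v.i +
; vse32.v sequence.
define void @store_zero(ptr %p) {
  store <2 x i32> zeroinitializer, ptr %p, align 8
  ret void
}
```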

Differential Revision: https://reviews.llvm.org/D150717



Revision tags: llvmorg-16.0.4
# c5c6ea8e 16-May-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Precommit coverage for an upcoming dag combine change


Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 1456b686 19-Dec-2022 Nikita Popov <npopov@redhat.com>

[RISCV] Convert some tests to opaque pointers (NFC)


Revision tags: llvmorg-15.0.6
# a2b5b584 25-Nov-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Use register allocation hints to improve use of compressed instructions.

Compressed instructions usually require one of the source registers
to also be the destination register. The register allocator doesn't have
that bias on its own.

This patch adds register allocation hints to introduce this bias.
I've started with ADDI, ADDIW, and SLLI. These all have a 5-bit
field for the register. If the source and dest register are the
same they are guaranteed to compress as long as the immediate is
also 6 bits.

This code was inspired by similar code from the SystemZ target.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D138242



Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# d89d45ca 06-Oct-2022 Philip Reames <preames@rivosinc.com>

[RISCV][InsertVSETVLI] Default to MA not MU

This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but since this is only going to apply to instructions which don't use the mask policy bit, this is functionally mostly a nop. The main value is to make future changes to using MA when legal for masked instructions easier to review by reducing test churn.

The prior code was motivated by a desire to minimize state transitions between masked and unmasked code. This patch achieves the same effect using the demanded field logic (landed in afb45ff), and there are no regressions I spotted in the test diffs. (Given the size, I have only been able to skim.) I do want to call out that regressions are possible here; the demanded analysis only works on a block local scope right now, so e.g. a tight loop mixing masked and unmasked computation might see an extra vsetvli or two.

Differential Revision: https://reviews.llvm.org/D133803



Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# c06d0b4d 19-Jun-2022 luxufan <luxufan@iscas.ac.cn>

[RISCV] Add ADDI instr for computing FrameIndex address

RVV doesn't have an immediate field for memory addressing. Currently
we build MachineInstructions in PEI to compute stack offsets for
RVV load/store instructions. These instructions are added too late to
be optimized by CSE, LICM, and similar passes.

This patch stops FrameIndex SDNodes from being matched in RVV load/store
instruction selection patterns, so that the FrameIndex SDNodes are
selected as `ADDI GPR, targetframeindex` instead.

There are 2 advantages to this change:
1. Stack object address computation can be optimized by machine function
passes.
2. Since the ADDI instruction's destination register can be used as a
temp register, we can save an emergency spill slot.

Differential Revision: https://reviews.llvm.org/D128187



# 89a11ebd 16-Jun-2022 Philip Reames <preames@rivosinc.com>

[RISCV] Avoid reducing etype just to initialize lane 0 of an undef vector

If we're writing to an undef vector (i.e. implicit_def), we can change the value of bits outside the requested write without consequence. This allows us to avoid a VSETVLI just for narrowing the value written.

Differential Revision: https://reviews.llvm.org/D127880



# 4a3e4611 16-Jun-2022 Philip Reames <preames@rivosinc.com>

[RISCV] Extend demanded field transform in InsertVSETVLI to VTYPE subfields

The motivating case, and the only one actually enabled by this patch, is a load or store followed by another op with the same SEW/LMUL ratio.

As an example, consider:

define void @test1(ptr %in, ptr %out) {
entry:
%0 = load <8 x i16>, ptr %in, align 2
%1 = sext <8 x i16> %0 to <8 x i32>
store <8 x i32> %1, ptr %out, align 4
ret void
}

Without this patch, we get:

vsetivli zero, 8, e16, mf4, ta, mu
vle16.v v8, (a0)
vsetvli zero, zero, e32, mf2, ta, mu
vsext.vf2 v9, v8
vse32.v v9, (a1)
ret

Whereas with the patch we get:

vsetivli zero, 8, e32, mf2, ta, mu
vle16.v v8, (a0)
vsext.vf2 v9, v8
vse32.v v9, (a1)
ret

We have rewritten the first vsetvli and thus removed the second one.

As is strongly hinted by the code structure and todos, I am planning on merging this with all (or most) of the cases from isCompatible used in the forward data flow. This will be done in a series of follow-up changes - some NFC reworks, and some reviewed optimization extensions.

Differential Revision: https://reviews.llvm.org/D127780



Revision tags: llvmorg-14.0.5, llvmorg-14.0.4
# cc0283a6 11-May-2022 Philip Reames <preames@rivosinc.com>

[riscv] Prefer to use previous VL for scalar move instructions

This patch is an alternative to a piece of D125270. Its direct motivation is to fix a wrong code bug (described below), but somewhat unexpectedly, it also results in a significant code quality improvement for idiomatic fixed length vector patterns.

The existing transform is simply wrong in its current location. We are correct about the fact that the scalar move itself can use the previous vsetvli, but we lose track of the fact that later instructions might depend on the state change represented. That is, the actual value of VL in the register is different than the abstract state thinks it is. Not simply due to precision of modeling, but e.g. the VL register could contain 3 when the abstract state says it is 1. This is annoyingly hard to demonstrate in practice due to differences in policy flags on the intrinsics, but this is at least a latent wrong code bug.

The code quality benefit comes from the fact we don't need to tie this to explicit vsetvli instructions at all. We can propagate the abstract state, and reduce a) the number of transitions, or b) the cost of those transitions. It turns out we have a bunch of cases - in tests at least - where fixed length AVLs are known non-zero, and we can leave VL unchanged while changing VTYPE.

Differential Revision: https://reviews.llvm.org/D125337



Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init
# 8d1169cf 01-Feb-2022 Fraser Cormack <fraser@codeplay.com>

[RISCV][2/3] Switch undef -> poison in fixed-vector RVV tests


# 3cf15af2 21-Jan-2022 eopXD <eop.chen@sifive.com>

[RISCV] Remove experimental prefix from rvv-related extensions.

Extensions affected: +v, +zve*, +zvl*

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117860

