Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
b6c0f1bf |
| 05-Dec-2024 |
Luke Lau <luke@igalia.com> |
[RISCV] Clear vill for whole vector register moves in vsetvli insertion (#118283)
This is an alternative to #117866 that works by demanding a valid vtype
instead of using a separate pass.
The ma
[RISCV] Clear vill for whole vector register moves in vsetvli insertion (#118283)
This is an alternative to #117866 that works by demanding a valid vtype
instead of using a separate pass.
The main advantage of this is that it allows coalesceVSETVLIs to just
reuse an existing vsetvli later in the block.
To do this we need to first transfer the vsetvli info to some arbitrary
valid state in transferBefore when we encounter a vector copy. Then we
add a new vill demanded field that will happily accept any other known
vtype, which allows us to coalesce these where possible.
Note we also need to check for vector copies in computeVLVTYPEChanges,
otherwise the pass will completely skip over functions that only have
vector copies and nothing else.
This is one part of a fix for #114518. We still need to check if there's
other cases where vector copies/whole register moves that are inserted
after vsetvli insertion.
show more ...
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
9122c523 |
| 15-Nov-2024 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)
This is based on other targets like PPC/AArch64 and some experiments.
This PR will only enable bidirectional schedu
[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)
This is based on other targets like PPC/AArch64 and some experiments.
This PR will only enable bidirectional scheduling and tracking register pressure.
Disclaimer: I haven't tested it on many cores, maybe we should make some options being features. I believe downstreams must have tried this before, so feedbacks are welcome.
show more ...
|
#
1cb59983 |
| 30-Oct-2024 |
Luke Lau <luke@igalia.com> |
[RISCV] Remove redundant +zfh from +zvfh[min] tests. NFC
In the vast majority of f16 tests we don't end up emitting any scalar code that needs +zfh, so remove it.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
c74ba57e |
| 11-Jul-2024 |
Luke Lau <luke@igalia.com> |
[RISCV] Convert AVLs with vlenb to VLMAX where possible (#97800)
Given an AVL that's computed from vlenb, if it's equal to VLMAX then we
can replace it with the VLMAX sentinel value.
The main mo
[RISCV] Convert AVLs with vlenb to VLMAX where possible (#97800)
Given an AVL that's computed from vlenb, if it's equal to VLMAX then we
can replace it with the VLMAX sentinel value.
The main motiviation is to be able to express an EVL of VLMAX in VP
intrinsics whilst emitting vsetvli a0, zero, so that we can replace
llvm.riscv.masked.strided.{load,store} with their VP counterparts.
This is done in RISCVVectorPeephole (previously RISCVFoldMasks, renamed
to account for the fact that it no longer just folds masks) instead of
SelectionDAG since there are multiple places places where VP nodes are
lowered that would have need to have been handled.
This also avoids doing it in RISCVInsertVSETVLI as it's much harder to
lookup the value of the AVL, and in RISCVVectorPeephole we can take
advantage of DeadMachineInstrElim to remove any leftover
PseudoReadVLENBs.
show more ...
|
#
32597685 |
| 09-Jul-2024 |
Jianjian Guan <jacquesguan@me.com> |
[RISCV] Remove experimental for bf16 extensions (#97996)
They are already ratified now.
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6 |
|
#
37fcb323 |
| 07-May-2024 |
Jianjian Guan <jacquesguan@me.com> |
[RISCV] Add codegen support for Zvfbfmin (#87911)
This patch adds basic codegen support for Zvfbfmin extension.
|
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
9617da88 |
| 28-Feb-2024 |
Luke Lau <luke@igalia.com> |
[RISCV] Use a ta vslideup if inserting over end of InterSubVT (#83230)
The description in #83146 is slightly inaccurate: it relaxes a tail
undisturbed vslideup to tail agnostic if we are inserting
[RISCV] Use a ta vslideup if inserting over end of InterSubVT (#83230)
The description in #83146 is slightly inaccurate: it relaxes a tail
undisturbed vslideup to tail agnostic if we are inserting over the
entire tail of the vector **and** we didn't shrink the LMUL of the
vector being inserted into.
This handles the case where we did shrink down the LMUL via InterSubVT
by checking if we inserted over the entire tail of InterSubVT, the
actual type that we're performing the vslideup on, not VecVT.
show more ...
|
#
91d23370 |
| 28-Feb-2024 |
Luke Lau <luke@igalia.com> |
[RISCV] Use a tail agnostic vslideup if possible for scalable insert_subvector (#83146)
If we know that an insert_subvector inserting a fixed subvector will
overwrite the entire tail of the vector,
[RISCV] Use a tail agnostic vslideup if possible for scalable insert_subvector (#83146)
If we know that an insert_subvector inserting a fixed subvector will
overwrite the entire tail of the vector, we use a tail agnostic
vslideup. This was added in https://reviews.llvm.org/D147347, but we can
do the same thing for scalable vectors too.
The `Policy` variable is defined in a slightly weird place but this is
to mirror the fixed length subvector code path as closely as possible. I
think we may be able to deduplicate them in future.
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2 |
|
#
ff9af4c4 |
| 05-Feb-2024 |
Nikita Popov <npopov@redhat.com> |
[CodeGen] Convert tests to opaque pointers (NFC)
|
Revision tags: llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6 |
|
#
93e15683 |
| 28-Nov-2023 |
Philip Reames <preames@rivosinc.com> |
[DAG] Fix a miscompile in insert_subvector undef (insert_subvector undef, ..), idx combine (#73587)
The combine was implicitly assuming that the index on the outer
insert_subvector meant the same t
[DAG] Fix a miscompile in insert_subvector undef (insert_subvector undef, ..), idx combine (#73587)
The combine was implicitly assuming that the index on the outer
insert_subvector meant the same thing when the source was switched to be
the index of the inner insert_subvector. This is not true if the
innermost sub-vector is fixed, and the outer subvector is scalable.
I could do a less restrictive fix here - i.e. allow the case where the
scalability of the subvectors are the same - but there's no test
coverage which shows this transform actually has profit. Given that, go
for the simplest fix.
show more ...
|
Revision tags: llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
a63bd7e9 |
| 14-Aug-2023 |
Philip Reames <preames@rivosinc.com> |
[RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands
In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structur
[RISCV] Use NoReg in place of IMPLICIT_DEF for undefined passthru operands
In a recent series of refactorings (described here: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295), I greatly increased the number of IMPLICIT_DEF operands to our vector instructions. This has turned out to have an unexpected negative impact because MachineCSE does not CSE IMPLICIT_DEFs, and thus does not CSE any instruction with an IMPLICIT_DEF operand. SelectionDAG *does* CSE the same case, but that only covers the same block case, not the cross block case. This lead to the performance regression reported in https://github.com/llvm/llvm-project/issues/64282.
This change is a slightly ugly hack to side step the issue. Instead of fixing the root cause (lack of CSE for IMPLICIT_DEF) or undoing the operand changes, we leave the extra operand in place, and use NoReg in place of IMPLICIT_DEF. I then convert back to IMPLICIT_DEF just before register allocation so that ProcessImplicitDefs and TwoAddressInstructions can do the normal transforms to Undef tied registers.
We may end up backporting this into the 17.x release branch. Given how late in the release cycle this is landing, that's much less likely now, but still a possibility.
Differential Revision: https://reviews.llvm.org/D156909
show more ...
|
#
6238b8ea |
| 09-Aug-2023 |
Luke Lau <luke@igalia.com> |
[LegalizeTypes] Factor in vscale_range when widening insert_subvector
Currently when widening operands for insert_subvector nodes, we check first that the indices are valid by seeing if the subvecto
[LegalizeTypes] Factor in vscale_range when widening insert_subvector
Currently when widening operands for insert_subvector nodes, we check first that the indices are valid by seeing if the subvector is statically known to be smaller than or equal to the in-place vector.
However if we're inserting a fixed subvector into a scalable vector we rely on the minimum vector length of the latter. This patch extends the widening logic to also take into account the minimum vscale from the vscale_range attribute, so we can handle more scenarios where we know the scalable vector is large enough to contain the subvector.
Fixes https://github.com/llvm/llvm-project/issues/63437
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153519
show more ...
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
3055c581 |
| 19-Jul-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Upgrade Zvfh version to 1.0 and move out of experimental state.
This has been ratified according to https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions
Differential Revision: h
[RISCV] Upgrade Zvfh version to 1.0 and move out of experimental state.
This has been ratified according to https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions
Differential Revision: https://reviews.llvm.org/D155668
show more ...
|
#
d8562e27 |
| 13-Jun-2023 |
Philip Reames <preames@rivosinc.com> |
[RISCV] Canonicalize towards vmerge w/passthrough representation
This is the first patch in a series to change how we represent tail agnostic, tail undefined, and tail undisturbed operations. In cur
[RISCV] Canonicalize towards vmerge w/passthrough representation
This is the first patch in a series to change how we represent tail agnostic, tail undefined, and tail undisturbed operations. In current code, we tend to use an unsuffixed pseudo for undefined (despite calling it TA most places in code), and the _TU form for both agnostic and undisturbed (via the policy operand).
The key observation behind this patch is that we can represent tail undefined via a pseudo with a passthrough operand if that operand is IMPLICIT_DEF (aka undef). We already have a few instances of this in tree - see vmv.s.x and vslide* - but we can do this more universally. Once complete, we will be able to delete roughly ~1/3 of our vector pseudo classes.
A bit more information on the overall goal can be found in this discourse post: https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.
This patch doesn't actually remove the legacy unsuffixed pseudo as there's still some path from intrinsic lowering which uses it. (I have not yet located it.) This also means we don't have to modify any of the lookup tables which makes the migration simpler. We can defer deleting the tables and pseudos until one final change once all the instructions have been migrated.
There are a couple of regressions in the tests. At first, these concerned me, but it turns out that all of them are differences in expansion of a single source level instruction. I think we can safely ignore this for the moment. I did explore changing the handling of IMPLICIT_DEF in ScheduleDAG, but that causes an absolutely *massive* test diff with minimal profit. I really don't think it's worth doing.
Differential Revision: https://reviews.llvm.org/D152380
show more ...
|
#
17e2d07a |
| 12-Jun-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use tail undisturbed vmv.v.v instead of vslideup.vi vN, vM, 0 for subvector insertion
vslideup has a vector overlap constraint that vmv.v.v doesn't. vmv.v.v is also a simpler instruction so
[RISCV] Use tail undisturbed vmv.v.v instead of vslideup.vi vN, vM, 0 for subvector insertion
vslideup has a vector overlap constraint that vmv.v.v doesn't. vmv.v.v is also a simpler instruction so may have better throughput and/or latency in some CPUs.
This is an alternative to D152298, D152368, and D152496.
Reviewed By: luke, reames
Differential Revision: https://reviews.llvm.org/D152565
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
|
#
365f8403 |
| 09-Mar-2023 |
Piyou Chen <piyou.chen@sifive.com> |
[RISCV] Enable subregister liveness by default
This commit enable the subregister liveness by default in RISC-V.
It was previously disabled in https://reviews.llvm.org/D129646 after a previous atte
[RISCV] Enable subregister liveness by default
This commit enable the subregister liveness by default in RISC-V.
It was previously disabled in https://reviews.llvm.org/D129646 after a previous attempt to enabled it https://reviews.llvm.org/D128016.
We believe that https://reviews.llvm.org/D129735 fixes the issue that caused it to be disabled.
Reviewed By: craig.topper, kito-cheng
Differential Revision: https://reviews.llvm.org/D145546
show more ...
|
Revision tags: llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3 |
|
#
1bdf21d5 |
| 11-Oct-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use mask/tail agnostic if tied source is IMPLICIT_DEF regardless of the policy operand.
If the source is implicit_def, the register allocator won't have any constraint on what register it pi
[RISCV] Use mask/tail agnostic if tied source is IMPLICIT_DEF regardless of the policy operand.
If the source is implicit_def, the register allocator won't have any constraint on what register it picks for the destination. This doesn't give the user much control of what register is being used.
So in my mind that means the only reason to honor the policy operand is to control what policy is used in vsetvli to maybe avoid a vtype change. Given the other optimizations we do on the policy field, I don't think allowing the user this control is reliable.
Therefore, I think we should use agnostic policies if the source is undef.
This should give better performance on some CPUs for VP intrinsics where there is no merge operand and the backend adds IMPLICIT_DEF to the instruction.
Differential Revision: https://reviews.llvm.org/D135396
show more ...
|
#
d89d45ca |
| 06-Oct-2022 |
Philip Reames <preames@rivosinc.com> |
[RISCV][InsertVSETVLI] Default to MA not MU
This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but si
[RISCV][InsertVSETVLI] Default to MA not MU
This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but since this is only going to apply to instructions which don't use the mask policy bit, this is functionally mostly a nop. The main value is to make future changes to using MA when legal for masked instructions easier to review by reducing test churn.
The prior code was motivated by a desire to minimize state transitions between masked and unmasked code. This patch achieves the same effect using the demanded field logic (landed in afb45ff), and there are no regressions I spotted in the test diffs. (Given the size, I have only been able to skim.) I do want to call out that regressions are possible here; the demanded analysis only works on a block local scope right now, so e.g. a tight loop mixing masked and unmasked computation might see an extra vsetvli or two.
Differential Revision: https://reviews.llvm.org/D133803
show more ...
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
d1a5669f |
| 13-Jul-2022 |
Fraser Cormack <fraser@codeplay.com> |
[RISCV] Disable subregister liveness by default
We previously enabled subregister liveness by default when compiling with RVV. This has been shown to cause miscompilations where RVV register operand
[RISCV] Disable subregister liveness by default
We previously enabled subregister liveness by default when compiling with RVV. This has been shown to cause miscompilations where RVV register operand constraints are not met. A test was added for this in D129639 which explains the issue in more detail.
Until this issue is fixed in some way, we should not be enabling subregister liveness unless the user asks for it.
Reviewed By: craig.topper, rogfer01, kito-cheng
Differential Revision: https://reviews.llvm.org/D129646
show more ...
|
Revision tags: llvmorg-14.0.6 |
|
#
a83aa33d |
| 16-Jun-2022 |
Bradley Smith <bradley.smith@arm.com> |
[IR] Move vector.insert/vector.extract out of experimental namespace
These intrinsics are now fundemental for SVE code generation and have been present for a year and a half, hence move them out of
[IR] Move vector.insert/vector.extract out of experimental namespace
These intrinsics are now fundemental for SVE code generation and have been present for a year and a half, hence move them out of the experimental namespace.
Differential Revision: https://reviews.llvm.org/D127976
show more ...
|
#
59cde213 |
| 21-Jun-2022 |
Craig Topper <craig.topper@sifive.com> |
Recommit "[RISCV] Enable subregister liveness tracking for RVV."
The failure that caused the previous revert has been fixed by https://reviews.llvm.org/D126048
Original commit message:
RVV makes h
Recommit "[RISCV] Enable subregister liveness tracking for RVV."
The failure that caused the previous revert has been fixed by https://reviews.llvm.org/D126048
Original commit message:
RVV makes heavy use of subregisters due to LMUL>1 and segment load/store tuples. Enabling subregister liveness tracking improves the quality of the register allocation.
I've added a command line that can be used to turn it off if it causes compile time or functional issues. I used the command line to keep the old behavior for one interesting test case that was testing register allocation.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D128016
show more ...
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
d40b7f0d |
| 17-May-2022 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses
If we're using shift pairs to mask, then relax the one use limit if the shift amounts are equal - we'll only be generating a si
[DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses
If we're using shift pairs to mask, then relax the one use limit if the shift amounts are equal - we'll only be generating a single AND node.
AArch64 has a couple of regressions due to this, so I've enforced the existing one use limit inside a AArch64TargetLowering::shouldFoldConstantShiftPairToMask callback.
Part of the work to fix the regressions in D77804
Differential Revision: https://reviews.llvm.org/D125607
show more ...
|
#
1878f240 |
| 16-May-2022 |
Zakk Chen <zakk.chen@sifive.com> |
[RISCV] Fix incorrect use of tail agnostic vslideup.
We need to use tail undisturbed for vslideup to implement vector insert operation correctly.
Ideally, we cound use the tail agnostic when insert
[RISCV] Fix incorrect use of tail agnostic vslideup.
We need to use tail undisturbed for vslideup to implement vector insert operation correctly.
Ideally, we cound use the tail agnostic when insert subvector or element at the end of the vector. This will be in follow-up patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125545
show more ...
|
#
a2918976 |
| 13-May-2022 |
Craig Topper <craig.topper@sifive.com> |
Revert "[RISCV] Enable subregister liveness tracking for RVV."
This reverts most of ed242b54c9c2aa84a47f66af5b8497d93646b68d
I'm seeing failures in our intrinsic testing on qemu that seem related t
Revert "[RISCV] Enable subregister liveness tracking for RVV."
This reverts most of ed242b54c9c2aa84a47f66af5b8497d93646b68d
I'm seeing failures in our intrinsic testing on qemu that seem related to this. Reverting while I investigate.
I've left the command line option in place for directed testing. It defaults to off.
show more ...
|
#
ed242b54 |
| 11-May-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Enable subregister liveness tracking for RVV.
RVV makes heavy use of subregisters due to LMUL>1 and segment load/store tuples. Enabling subregister liveness tracking improves the quality of
[RISCV] Enable subregister liveness tracking for RVV.
RVV makes heavy use of subregisters due to LMUL>1 and segment load/store tuples. Enabling subregister liveness tracking improves the quality of the register allocation.
I've added a command line that can be used to turn it off if it causes compile time or functional issues. I used the command line to keep the old behavior for one interesting test case that was testing register allocation.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125108
show more ...
|