Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
6f53886a |
| 10-Jan-2025 |
Raphael Moreira Zinsly <rzinsly@ventanamicro.com> |
[RISCV] Add stack clash vector support (#119458)
Use the probe loop structure to allocate vector code in the stack as
well. We add the pseudo instruction RISCV::PROBED_STACKALLOC_RVV to
differenti
[RISCV] Add stack clash vector support (#119458)
Use the probe loop structure to allocate vector code in the stack as
well. We add the pseudo instruction RISCV::PROBED_STACKALLOC_RVV to
differentiate from the normal loop.
show more ...
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
97982a8c |
| 05-Nov-2024 |
dlav-sc <daniil.avdeev@syntacore.com> |
[RISCV][CFI] add function epilogue cfi information (#110810)
This patch adds CFI instructions in the function epilogue.
Before patch:
addi sp, s0, -32
ld ra, 24(sp) # 8-byte Folded Reload
ld s
[RISCV][CFI] add function epilogue cfi information (#110810)
This patch adds CFI instructions in the function epilogue.
Before patch:
addi sp, s0, -32
ld ra, 24(sp) # 8-byte Folded Reload
ld s0, 16(sp) # 8-byte Folded Reload
ld s1, 8(sp) # 8-byte Folded Reload
addi sp, sp, 32
ret
After patch:
addi sp, s0, -32
.cfi_def_cfa sp, 32
ld ra, 24(sp) # 8-byte Folded Reload
ld s0, 16(sp) # 8-byte Folded Reload
ld s1, 8(sp) # 8-byte Folded Reload
.cfi_restore ra
.cfi_restore s0
.cfi_restore s1
addi sp, sp, 32
.cfi_def_cfa_offset 0
ret
This functionality is already present in `riscv-gcc`, but it’s not in
`clang` and this slightly impairs the `lldb` debugging experience, e.g.
backtrace.
show more ...
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
ab393cee |
| 30-Sep-2024 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Take known minimum vlen into account when calculating alignment padding in assignRVVStackObjectOffsets. (#110312)
If we know vlen is a multiple of 16, we don't need any alignment
padding.
[RISCV] Take known minimum vlen into account when calculating alignment padding in assignRVVStackObjectOffsets. (#110312)
If we know vlen is a multiple of 16, we don't need any alignment
padding.
I wrote the code so that it would generate the minimum amount of padding
if the stack align was 32 or larger or if RVVBitsPerBlock was smaller
than half the stack alignment.
show more ...
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
675e7bd1 |
| 21-May-2024 |
Piyou Chen <piyou.chen@sifive.com> |
[RISCV] Support postRA vsetvl insertion pass (#70549)
This patch try to get rid of vsetvl implict vl/vtype def-use chain and
improve the register allocation quality by moving the vsetvl insertion
[RISCV] Support postRA vsetvl insertion pass (#70549)
This patch try to get rid of vsetvl implict vl/vtype def-use chain and
improve the register allocation quality by moving the vsetvl insertion
pass after RVV register allocation
It will gain the benefit for the following optimization from
1. unblock scheduler's constraints by removing vl/vtype def-use chain
2. Support RVV re-materialization
3. Support partial spill
This patch add a new option `-riscv-vsetvl-after-rvv-regalloc=<1|0>` to
control this feature and default set as disable.
show more ...
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2 |
|
#
ff9af4c4 |
| 05-Feb-2024 |
Nikita Popov <npopov@redhat.com> |
[CodeGen] Convert tests to opaque pointers (NFC)
|
Revision tags: llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
1456b686 |
| 19-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[RISCV] Convert some tests to opaque pointers (NFC)
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4 |
|
#
132dc442 |
| 19-Oct-2022 |
Sergey Kachkov <sergey.kachkov@syntacore.com> |
[RISCV] Generate .cfi_def_cfa_expression for RVV stack adjustment
Cannonical frame address after RVV stack adjustment is sp + StackSize + RVVStackSize * vlenb, and since vlenb is unknown at compile-
[RISCV] Generate .cfi_def_cfa_expression for RVV stack adjustment
Cannonical frame address after RVV stack adjustment is sp + StackSize + RVVStackSize * vlenb, and since vlenb is unknown at compile-time (but it is a constant for particular HW implementation), emit .cfi_def_cfa_expression so libunwind can read VLENB CSR register at run-time and obtain correct frame address.
Fixes https://github.com/llvm/llvm-project/issues/58356 (but additional run-time support for reading CSR may be required)
Differential Revision: https://reviews.llvm.org/D136263
show more ...
|
Revision tags: llvmorg-15.0.3 |
|
#
d89d45ca |
| 06-Oct-2022 |
Philip Reames <preames@rivosinc.com> |
[RISCV][InsertVSETVLI] Default to MA not MU
This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but si
[RISCV][InsertVSETVLI] Default to MA not MU
This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but since this is only going to apply to instructions which don't use the mask policy bit, this is functionally mostly a nop. The main value is to make future changes to using MA when legal for masked instructions easier to review by reducing test churn.
The prior code was motivated by a desire to minimize state transitions between masked and unmasked code. This patch achieves the same effect using the demanded field logic (landed in afb45ff), and there are no regressions I spotted in the test diffs. (Given the size, I have only been able to skim.) I do want to call out that regressions are possible here; the demanded analysis only works on a block local scope right now, so e.g. a tight loop mixing masked and unmasked computation might see an extra vsetvli or two.
Differential Revision: https://reviews.llvm.org/D133803
show more ...
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
#
c06d0b4d |
| 19-Jun-2022 |
luxufan <luxufan@iscas.ac.cn> |
[RISCV] Add ADDI instr for computing FrameIndex address
RVV doesn't have immediate field for memory addressing. Currently we build MachineInstructions in PEI to computing stack offset for RVV load s
[RISCV] Add ADDI instr for computing FrameIndex address
RVV doesn't have immediate field for memory addressing. Currently we build MachineInstructions in PEI to computing stack offset for RVV load store instructions. These instructions were added too late to can be optimized by CSE, LICM... passes.
This patch makes FrameIndex SDNodes can't be matched in RVV Load Store instruction selection patterns. So that the FrameIndex SDNodes would be selected as `ADDI GPR, targetframeindex`.
There are 2 advantages for such change: 1. Stack objects address computing can be optimized by machine function passes. 2. Since the ADDI instruction's destination register can be used as a temp register, we can save an emergency spill slot.
Differential Revision: https://reviews.llvm.org/D128187
show more ...
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
cb8681a2 |
| 16-May-2022 |
Fraser Cormack <fraser@codeplay.com> |
[RISCV] Fix RVV stack frame alignment bugs
This patch addresses several alignment issues in the stack frame when RVV objects are taken into account.
One bug is that the RVV stack was never guarante
[RISCV] Fix RVV stack frame alignment bugs
This patch addresses several alignment issues in the stack frame when RVV objects are taken into account.
One bug is that the RVV stack was never guaranteed to keep the alignment of the stack *as a whole*. We must maintain a 16-byte aligned stack at all times, especially when calling other functions. With the standard V extension, this is conveniently happening since VLEN is at least 128 and always 16-byte aligned. However, we support Zvl64b which does not guarantee this. To fix this, the RVV stack size is rounded up to be aligned to 16 bytes. This in practice generally makes us allocate a stack sized at least 2*VLEN in size, and a multiple of 2.
|------------------------------| -- <-- FP | 8-byte callee-save | | | |------------------------------| | | | one VLENB-sized RVV object | | | |------------------------------| | | | 8-byte local variable | | | |------------------------------| -- <-- SP (must be aligned to 16)
In the example above, with Zvl64b we are decrementing SP by 12 bytes which does not leave SP correctly aligned. We therefore introduce an extra VLENB-sized amount used for alignment. This would therefore ensure the total stack size was 16 bytes (48 for Zvl128b, 80 for Zvl256b, etc):
|------------------------------| -- <-- FP | 8-byte callee-save | | | |------------------------------| | | | one VLENB-sized padding obj | | | | one VLENB-sized RVV object | | | |------------------------------| | | | 8-byte local variable | | | |------------------------------| -- <-- SP
A new RVV invariant has been introduced in this patch, which is that the base of the RVV stack itself is now always aligned to 16 bytes, not 8 as before. This keeps us more in line with the scalar stack and should be easier to reason about. The calculation of the RVV padding has thus changed to be the amount required to align the scalar local variable section to the RVV section's alignment. This amount is further rounded up when setting up the initial stack to keep everything aligned:
|------------------------------| -- <-- FP | 8-byte callee-save | |------------------------------| | | | RVV objects | | (aligned to at least 16) | | | |------------------------------| | RVV padding of 8 bytes | |------------------------------| | 8-byte local variable | |------------------------------| -- <-- SP
In the example above, it's clear that we need 8 bytes of padding to keep the RVV section aligned to 16 when using SP. But to keep SP *itself* aligned to 16 we can't decrement the initial stack pointer by 24 - we have to round up to 32.
With the RVV section correctly aligned, the second bug fixed by this patch is that RVV objects themselves are now correctly aligned. We were previously only guaranteeing an alignment of 8 bytes, even if they required a higher alignment. This is relatively simple and in practice we see more rounding up of VLEN amounts to account for alignment in between objects:
|------------------------------| | RVV object (aligned to 16) | |------------------------------| | no padding necessary | |------------------------------| | 2*VLENB RVV object (align 16)| |------------------------------| | VLENB alignment padding | |------------------------------| | RVV object (align 32) | |------------------------------| | 3*VLENB alignment padding | |------------------------------| | VLENB RVV object (align 32) | |------------------------------| -- <-- base of RVV section
Note that a lot of the regressions in codegen owing to the new alignment rules are correct but actually only strictly necessary for Zvl64b (and Zvl32b but that's not really supported). I plan a follow-up patch to take the known VLEN into account when padding for alignment.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D125787
show more ...
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
#
b7847199 |
| 14-Feb-2022 |
Zakk Chen <zakk.chen@sifive.com> |
[RISCV] Add the passthru operand for RVV nomask binary intrinsics.
The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail
[RISCV] Add the passthru operand for RVV nomask binary intrinsics.
The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed.
Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support i64 scalar in rv32.
The masked VSLIDE1 would only emit mask undisturbed policy regardless of giving mask agnostic policy until InsertVSETVLI supports mask agnostic.
Reviewed by: craig.topper, rogfer01
Differential Revision: https://reviews.llvm.org/D117989
show more ...
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
#
3cf15af2 |
| 21-Jan-2022 |
eopXD <eop.chen@sifive.com> |
[RISCV] Remove experimental prefix from rvv-related extensions.
Extensions affected: +v, +zve*, +zvl*
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117860
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
facff468 |
| 07-Oct-2021 |
Hsiangkai Wang <kai.wang@sifive.com> |
[RISCV] Reorder the vector register allocation order.
GPR uses argument registers as the first group of registers to allocate. This patch uses vector argument registers, v8 to v23, as the first grou
[RISCV] Reorder the vector register allocation order.
GPR uses argument registers as the first group of registers to allocate. This patch uses vector argument registers, v8 to v23, as the first group to allocate.
Differential Revision: https://reviews.llvm.org/D111304
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
#
242ddd50 |
| 08-Jun-2021 |
Jim Lin <jim@andestech.com> |
[RISCV][NFC] Add a single space after comma for VType
In most of cases, it has a single space after comma in assembly operands.
Reviewed By: craig.topper
Differential Revision: https://reviews.llv
[RISCV][NFC] Add a single space after comma for VType
In most of cases, it has a single space after comma in assembly operands.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103790
show more ...
|
#
fdf10e61 |
| 26-May-2021 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use X0 as destination of inserted vsetvli when possible.
We aren't going to connect the result to anything so we might as well avoid allocating a register.
Reviewed By: frasercrmck, HsiangK
[RISCV] Use X0 as destination of inserted vsetvli when possible.
We aren't going to connect the result to anything so we might as well avoid allocating a register.
Reviewed By: frasercrmck, HsiangKai
Differential Revision: https://reviews.llvm.org/D102031
show more ...
|
Revision tags: llvmorg-12.0.1-rc1 |
|
#
b4a358a7 |
| 15-Apr-2021 |
Fraser Cormack <fraser@codeplay.com> |
[RISCV] Fix missing emergency slots for scalable stack offsets
This patch adds an additional emergency spill slot to RVV code. This is required as RVV stack offsets may require an additional registe
[RISCV] Fix missing emergency slots for scalable stack offsets
This patch adds an additional emergency spill slot to RVV code. This is required as RVV stack offsets may require an additional register to compute.
This patch includes an optimization by @HsiangKai <kai.wang@sifive.com> to reduce the number of registers required for the computation of stack offsets from 3 to 2. Otherwise we'd need two additional emergency spill slots.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D100574
show more ...
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
#
a9b9c64f |
| 20-Feb-2021 |
luxufan <932494295@qq.com> |
change rvv frame layout
This patch change the rvv frame layout that proposed in D94465. In patch D94465, In the eliminateFrameIndex function, to eliminate the rvv frame index, create temp virtual re
change rvv frame layout
This patch change the rvv frame layout that proposed in D94465. In patch D94465, In the eliminateFrameIndex function, to eliminate the rvv frame index, create temp virtual register is needed. This virtual register should be scavenged by class RegsiterScavenger. If the machine function has other unused registers, there is no problem. But if there isn't unused registers, we need a emergency spill slot. Because of the emergency spill slot belongs to the scalar local variables field, to access emergency spill slot, we need a temp virtual register again. This makes the compiler report the "Incomplete scavenging after 2nd pass" error. So I change the rvv frame layout as follows:
``` |--------------------------------------| | arguments passed on the stack | |--------------------------------------|<--- fp | callee saved registers | |--------------------------------------| | rvv vector objects(local variables | | and outgoing arguments | |--------------------------------------| | realignment field | |--------------------------------------| | scalar local variable(also contains| | emergency spill slot) | |--------------------------------------|<--- bp | variable-sized local variables | |--------------------------------------|<--- sp ```
Differential Revision: https://reviews.llvm.org/D97111
show more ...
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
#
5a31a673 |
| 08-Jan-2021 |
Hsiangkai Wang <kai.wang@sifive.com> |
[RISCV] Frame handling for RISC-V V extension.
This patch proposes how to deal with RISC-V vector frame objects. The layout of RISC-V vector frame will look like
|---------------------------------|
[RISCV] Frame handling for RISC-V V extension.
This patch proposes how to deal with RISC-V vector frame objects. The layout of RISC-V vector frame will look like
|---------------------------------| | scalar callee-saved registers | |---------------------------------| | scalar local variables | |---------------------------------| | scalar outgoing arguments | |---------------------------------| | RVV local variables && | | RVV outgoing arguments | |---------------------------------| <- end of frame (sp)
If there is realignment or variable length array in the stack, we will use frame pointer to access fixed objects and stack pointer to access non-fixed objects.
|---------------------------------| <- frame pointer (fp) | scalar callee-saved registers | |---------------------------------| | scalar local variables | |---------------------------------| | ///// realignment ///// | |---------------------------------| | scalar outgoing arguments | |---------------------------------| | RVV local variables && | | RVV outgoing arguments | |---------------------------------| <- end of frame (sp)
If there are both realignment and variable length array in the stack, we will use frame pointer to access fixed objects and base pointer to access non-fixed objects.
|---------------------------------| <- frame pointer (fp) | scalar callee-saved registers | |---------------------------------| | scalar local variables | |---------------------------------| | ///// realignment ///// | |---------------------------------| <- base pointer (bp) | RVV local variables && | | RVV outgoing arguments | |---------------------------------| | /////////////////////////////// | | variable length array | | /////////////////////////////// | |---------------------------------| <- end of frame (sp) | scalar outgoing arguments | |---------------------------------|
In this version, we do not save the addresses of RVV objects in the stack. We access them directly through the polynomial expression (a x VLENB + b). We do not reserve frame pointer when there is any RVV object in the stack. So, we also access the scalar frame objects through the polynomial expression (a x VLENB + b) if the access across RVV stack area.
Differential Revision: https://reviews.llvm.org/D94465
show more ...
|