Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
9122c523 |
| 15-Nov-2024 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)
This is based on other targets like PPC/AArch64 and some experiments.
This PR will only enable bidirectional schedu
[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)
This is based on other targets like PPC/AArch64 and some experiments.
This PR will only enable bidirectional scheduling and tracking register pressure.
Disclaimer: I haven't tested it on many cores, maybe we should make some options being features. I believe downstreams must have tried this before, so feedbacks are welcome.
show more ...
|
#
a25d91a1 |
| 08-Nov-2024 |
Gergely Futo <gergely.futo@hightec-rt.com> |
[RISCV] Skip DAG combine for bitcast fabs/fneg (#115325)
Disable the DAG combine for bitcast fabs/fneg in case of the zdinx
extension.
The combine folds the fabs/fneg nodes in some cases. This m
[RISCV] Skip DAG combine for bitcast fabs/fneg (#115325)
Disable the DAG combine for bitcast fabs/fneg in case of the zdinx
extension.
The combine folds the fabs/fneg nodes in some cases. This might result
in suboptimal code if compiled with the zdinx extension. In case of the
zdinx extension, there is no need to load the double value from an x
register to an f register, so the combine can be skipped.
show more ...
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
49660e55 |
| 06-Sep-2024 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Pass f32/f64 directly without a bitcast for Zfinx/Zdinx. (#107464)
With Zfinx/Zdinx, f32/f64 are legal types for a GPR, we don't need a
bitcast.
This avoids turning fneg/fabs into bitwis
[RISCV] Pass f32/f64 directly without a bitcast for Zfinx/Zdinx. (#107464)
With Zfinx/Zdinx, f32/f64 are legal types for a GPR, we don't need a
bitcast.
This avoids turning fneg/fabs into bitwise operations purely because of
these bitcasts. If the bitwise operations are faster for some reason on
a Zfinx CPU, then that seems like it should be done for all fneg/fabs,
not just ones near function arguments/returns.
I don't have much interest in Zfinx, this just makes the code more
similar to what I proposed for Zhinx in #107446.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
13996378 |
| 27-Jul-2024 |
Yingwei Zheng <dtcxzyw2333@gmail.com> |
[RISCV][ISel] Fold FSGNJX idioms (#100718)
This patch folds `fmul X, (fcopysign 1.0, Y)` into `fsgnjx X, Y`. This
pattern exists in some graphics applications/math libraries.
Alive2: https://alive
[RISCV][ISel] Fold FSGNJX idioms (#100718)
This patch folds `fmul X, (fcopysign 1.0, Y)` into `fsgnjx X, Y`. This
pattern exists in some graphics applications/math libraries.
Alive2: https://alive2.llvm.org/ce/z/epyL33
Since fpimm +1.0 is lowered to a load from constant pool after
OpLegalization, I have to introduce a new RISCVISD node FSGNJX and fold
this pattern in DAGCombine.
Closes https://github.com/dtcxzyw/llvm-opt-benchmark/issues/1072.
show more ...
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
db9252b1 |
| 01-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Call SimplifyDemandedBits on fcopysign sign value (#97151)
Math library code has quite a few places with complex bit logic that are ultimately fed into a copysign. This helps avoid some regress
DAG: Call SimplifyDemandedBits on fcopysign sign value (#97151)
Math library code has quite a few places with complex bit logic that are ultimately fed into a copysign. This helps avoid some regressions in a future patch.
This assumes the position in the float type, which should at least be valid for IEEE types. Not sure if we need to guard against ppc_fp128 or anything else weird.
There appears to be some value in simplifying the value operand as well, but I'll address that separately.
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
576d81ba |
| 20-Mar-2024 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use REG_SEQUENCE/EXTRACT_SUBREG to move between individual GPRs and GPRPair. (#85887)
Previously we used memory like we do to move between GPRs and FPR64 with
the D extension on RV32.
We
[RISCV] Use REG_SEQUENCE/EXTRACT_SUBREG to move between individual GPRs and GPRPair. (#85887)
Previously we used memory like we do to move between GPRs and FPR64 with
the D extension on RV32.
We can instead use REG_SEQUENCE/EXTRACT_SUBREG to inform register
allocation how to do the copy without memory.
show more ...
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
eabaee0c |
| 07-Jan-2024 |
Fangrui Song <i@maskray.me> |
[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)
R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530
[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)
R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530 `call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not useful and can be removed now (matching AArch64 and PowerPC).
GNU assembler assembles `call foo` to RISCV_CALL_PLT since 2022-09 (70f35d72ef04cd23771875c1661c9975044a749c).
Without this patch, unconditionally changing MO_CALL to MO_PLT could create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler and GNU assembler.
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
8b90f8e0 |
| 25-May-2023 |
Shao-Ce SUN <sunshaoce@iscas.ac.cn> |
[RISCV][CodeGen] Support Zdinx on RV32 codegen
This patch was split from D122918 .
Co-Author: @StephenFan @liaolucy @realqhc
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.
[RISCV][CodeGen] Support Zdinx on RV32 codegen
This patch was split from D122918 .
Co-Author: @StephenFan @liaolucy @realqhc
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D149743
show more ...
|
Revision tags: llvmorg-16.0.4 |
|
#
2dc0fa05 |
| 03-May-2023 |
Shao-Ce SUN <sunshaoce@iscas.ac.cn> |
[RISCV][CodeGen] Support Zdinx on RV64 codegen
This patch was split from D122918 . Co-Author: @liaolucy @realqhc
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D149665
|
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1 |
|
#
7b0c4184 |
| 28-Mar-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Move compressible registers to the beginning of the FP allocation order.
We don't have very many compressible FP instructions, just load and store. These instruction require the FP register
[RISCV] Move compressible registers to the beginning of the FP allocation order.
We don't have very many compressible FP instructions, just load and store. These instruction require the FP register to be f8-f15.
This patch changes the FP allocation order to prioritize f10-f15 first. These are also the FP argument registers. So I allocated them in reverse order starting at f15 to avoid taking the first argument registers. This appears to match gcc allocation order.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D146488
show more ...
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
e00e20a0 |
| 01-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.
These instructions requires both register operands to be compressible so I've only applied the hint if we already have a GPRC physical regis
[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.
These instructions requires both register operands to be compressible so I've only applied the hint if we already have a GPRC physical register assigned for the other register operand.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D139079
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
2ce0a5c8 |
| 13-Jul-2022 |
Alex Bradbury <asb@igalia.com> |
[RISCV][test][NFC] Regenerate RISC-V tests with update_llc_test_checks.py -u
If a change alters more than a couple of tests it's really handy to be able to regenerate any that were created by update
[RISCV][test][NFC] Regenerate RISC-V tests with update_llc_test_checks.py -u
If a change alters more than a couple of tests it's really handy to be able to regenerate any that were created by update_llc_test_checks.py with something like `update_llc_test_checks.py -u llvm/test/CodeGen/RISCV`. I noticed this causes some extraneous changes (perhaps due to hand editing). This commit addresses that by updating any fails that are modified by update_llc_test_checks.py -u.
show more ...
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5 |
|
#
84bacb18 |
| 03-Jun-2022 |
Shao-Ce SUN <sunshaoce@iscas.ac.cn> |
[RISCV] Use check-prefixes to reduce check lines
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125083
|
#
f14d18c7 |
| 02-Jun-2022 |
LiaoChunyu <chunyu@iscas.ac.cn> |
[RISCV] Add more patterns for FNMADD
D54205 handles fnmadd: -rs1 * rs2 - rs3 This patch add fnmadd: -(rs1 * rs2 + rs3) (the nsz flag on the FMA)
Reviewed By: craig.topper
Differential Revision: ht
[RISCV] Add more patterns for FNMADD
D54205 handles fnmadd: -rs1 * rs2 - rs3 This patch add fnmadd: -(rs1 * rs2 + rs3) (the nsz flag on the FMA)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D126852
show more ...
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
6a54776f |
| 16-Mar-2022 |
Haocong.Lu <Haocong.Lu@streamcomputing.com> |
[RISCV] Select SRLI+SLLI for AND with leading ones mask
Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask. It's useful in RV64 when the mask exceeds simm32 (cannot be generated
[RISCV] Select SRLI+SLLI for AND with leading ones mask
Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask. It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI).
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121598
show more ...
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
#
8def89b5 |
| 21-Jan-2022 |
wangpc <pc.wang@linux.alibaba.com> |
[RISCV] Set CostPerUse to 1 iff RVC is enabled
After D86836, we can define multiple cost values for different cost models. So here we set CostPerUse to 1 iff RVC is enabled to avoid potential impact
[RISCV] Set CostPerUse to 1 iff RVC is enabled
After D86836, we can define multiple cost values for different cost models. So here we set CostPerUse to 1 iff RVC is enabled to avoid potential impact on RA.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117741
show more ...
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
bd653f64 |
| 11-Jan-2022 |
Haocong.Lu <Haocong.Lu@streamcomputing.com> |
[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled
Now AND is used for zero extension when both Zbb and Zbp are not enabled. It may be better to use shift operation if the trailin
[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled
Now AND is used for zero extension when both Zbb and Zbp are not enabled. It may be better to use shift operation if the trailing ones mask exceeds simm12.
This patch optimzes LUI+ADDI+AND to SLLI+SRLI.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116720
show more ...
|
#
b271184f |
| 10-Jan-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use FP ABI on some of the FP tests to reduce the number of CHECK lines. NFC
These tests are interested in the FP instructions being used, not the conversions needed to pass the arguments/ret
[RISCV] Use FP ABI on some of the FP tests to reduce the number of CHECK lines. NFC
These tests are interested in the FP instructions being used, not the conversions needed to pass the arguments/returns in GPRs.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D116869
show more ...
|
#
0ec5f1e6 |
| 09-Dec-2021 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Reduce duplicate FP test cases.
-Remove feq, fle, flt tests from *-arith.ll in favor of *-fcmp.ll which tests all predicates.
Reviewed By: asb
Differential Revision: https://reviews.llvm.o
[RISCV] Reduce duplicate FP test cases.
-Remove feq, fle, flt tests from *-arith.ll in favor of *-fcmp.ll which tests all predicates.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D113703
show more ...
|
Revision tags: llvmorg-13.0.1-rc1 |
|
#
137d3474 |
| 16-Nov-2021 |
Hsiangkai Wang <kai.wang@sifive.com> |
[RISCV] Reverse the order of loading/storing callee-saved registers.
Currently, we restore the return address register as the last restoring instruction in the epilog. The next instruction is `ret`
[RISCV] Reverse the order of loading/storing callee-saved registers.
Currently, we restore the return address register as the last restoring instruction in the epilog. The next instruction is `ret` usually. It is a use of return address register. In some microarchitectures, there is load-to-use data hazard. To avoid the load-to-use data hazard, we could separate the load instruction from its use as far as possible. In this patch, we reverse the order of restoring callee-saved registers to increase the distance of `load ra` and `ret` in the epilog.
Differential Revision: https://reviews.llvm.org/D113967
show more ...
|
#
af0ecfcc |
| 22-Nov-2021 |
wangpc <pc.wang@linux.alibaba.com> |
[RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo instruction li, which makes assembly mush more readable. For existed tests, users can update them by r
[RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo instruction li, which makes assembly mush more readable. For existed tests, users can update them by running script `llvm/utils/update_llc_test_checks.py`.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D112692
show more ...
|
#
eb44f3fc |
| 11-Nov-2021 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add rv32i/rv64i command lines to some floating point tests. NFC
This improves our coverage of soft float libcalls lowering.
Remove most of the test cases from rv64i-single-softfloat.ll. The
[RISCV] Add rv32i/rv64i command lines to some floating point tests. NFC
This improves our coverage of soft float libcalls lowering.
Remove most of the test cases from rv64i-single-softfloat.ll. They were duplicated in the test files that now test softflow. Only a couple test cases for constrained FP remain. Those should be removed when we start supporting constrained FP.
This is follow up from D113528.
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
#
ed95cafb |
| 25-Nov-2020 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add an implementation of isFMAFasterThanFMulAndFAdd
Start with an assumption that FMA is faster than Fmul+FAdd. If thats not true on some particular implementation we can add a tuning parame
[RISCV] Add an implementation of isFMAFasterThanFMulAndFAdd
Start with an assumption that FMA is faster than Fmul+FAdd. If thats not true on some particular implementation we can add a tuning parameter in the future.
I've update the fmuladd test cases and added new test cases for fast math flag based contraction.
Differential Revision: https://reviews.llvm.org/D91987
show more ...
|
#
4252f777 |
| 23-Nov-2020 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG][ARM][AArch64][Hexagon][RISCV][X86] Add SDNPCommutative to fma and fmad nodes in tablegen. Remove explicit commuted patterns from targets.
X86 was already specially marking fma as comm
[SelectionDAG][ARM][AArch64][Hexagon][RISCV][X86] Add SDNPCommutative to fma and fmad nodes in tablegen. Remove explicit commuted patterns from targets.
X86 was already specially marking fma as commutable which allowed tablegen to autogenerate commuted patterns. This moves it to the target independent definition and fix up the targets to remove now unneeded patterns.
Unfortunately, the tests change because the commuted version of the patterns are generating operands in a different than the explicit patterns.
Differential Revision: https://reviews.llvm.org/D91842
show more ...
|
#
defe1186 |
| 05-Nov-2020 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add isel patterns for fnmadd/fnmsub with an fneg on the second operand instead of the first.
The multiply part of FMA is commutable, but TargetSelectionDAG.td doesn't have it marked as commu
[RISCV] Add isel patterns for fnmadd/fnmsub with an fneg on the second operand instead of the first.
The multiply part of FMA is commutable, but TargetSelectionDAG.td doesn't have it marked as commutable so tablegen won't automatically create the additional patterns.
So manually add commuted patterns.
show more ...
|