Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
9122c523 |
| 15-Nov-2024 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)
This is based on other targets like PPC/AArch64 and some experiments.
This PR will only enable bidirectional schedu
[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)
This is based on other targets like PPC/AArch64 and some experiments.
This PR will only enable bidirectional scheduling and tracking register pressure.
Disclaimer: I haven't tested it on many cores, maybe we should make some options being features. I believe downstreams must have tried this before, so feedbacks are welcome.
show more ...
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3 |
|
#
86240751 |
| 06-Oct-2023 |
Philip Reames <preames@rivosinc.com> |
[RISCV] Strip W suffix from ADDIW (#68425)
The motivation of this change is simply to reduce test duplication. As
can be seen in the (massive) test delta, we have many tests whose output
differ on
[RISCV] Strip W suffix from ADDIW (#68425)
The motivation of this change is simply to reduce test duplication. As
can be seen in the (massive) test delta, we have many tests whose output
differ only due to the use of addi on rv32 vs addiw on rv64 when the
high bits are don't care.
As an aside, we don't need to worry about the non-zero immediate
restriction on the compressed variants because we're not directly
forming the compressed variants. If we happen to get a zero immediate
for the ADDI, then either a later optimization will strip the useless
instruction or the encoder is responsible for not compressing the
instruction.
show more ...
|
Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6 |
|
#
38f7c7eb |
| 07-Jun-2023 |
Florian Mayer <fmayer@google.com> |
Revert "Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).""
Revert broke even more stuff.
This reverts commit d5fbec30939f2c9f82475cf42c638
Revert "Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).""
Revert broke even more stuff.
This reverts commit d5fbec30939f2c9f82475cf42c638619514b5c67.
show more ...
|
#
d5fbec30 |
| 07-Jun-2023 |
Florian Mayer <fmayer@google.com> |
Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C)."
Triggers UBSan error.
This reverts commit 58b2d652af49ee9d9ff2af6edd7f67f23b26bfee.
|
#
58b2d652 |
| 06-Jun-2023 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).
Where C is a simm32.
This costs an extra temporary register, but avoids a constant pool.
Reviewe
[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).
Where C is a simm32.
This costs an extra temporary register, but avoids a constant pool.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D152236
show more ...
|
Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
86eff6be |
| 20-Jan-2023 |
Philip Reames <preames@rivosinc.com> |
[MachineCombiner] Use default latency model when no detailed model available
This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat
[MachineCombiner] Use default latency model when no detailed model available
This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.
Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.)
The motivation here is two fold.
First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.
Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.
Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.
For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.
Differential Revision: https://reviews.llvm.org/D141017
show more ...
|
Revision tags: llvmorg-15.0.7 |
|
#
002005e6 |
| 22-Dec-2022 |
Hsiangkai Wang <hsiangkai@google.com> |
[RISCV] Add integer scalar instructions to isAssociativeAndCommutative
Inspired by D138107.
We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative to increase instruction-l
[RISCV] Add integer scalar instructions to isAssociativeAndCommutative
Inspired by D138107.
We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative to increase instruction-level parallelism by the existing MachineCombiner pass.
Differential Revision: https://reviews.llvm.org/D140530
show more ...
|
#
d64d3c5a |
| 22-Dec-2022 |
Nitin John Raj <nitin.raj@sifive.com> |
[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility
SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW h
[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility
SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources.
We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes.
Differential Revision: https://reviews.llvm.org/D139948
show more ...
|
#
d741a31a |
| 14-Dec-2022 |
Nitin John Raj <nitin.raj@sifive.com> |
[RISCV][CodeGen][SelectionDAG] Recursively check hasAllNBitUsers for logical machine opcodes
We don’t have W versions of AND/OR/XOR/ANDN/ORN/XNOR so we should recursively check their users. We shoul
[RISCV][CodeGen][SelectionDAG] Recursively check hasAllNBitUsers for logical machine opcodes
We don’t have W versions of AND/OR/XOR/ANDN/ORN/XNOR so we should recursively check their users. We should limit the recursion to SelectionDAG::MaxRecursionDepth levels.
We need to add a Depth argument, all existing callers should pass 0 to the Depth. The new recursive calls should increment it by 1. At the top of the function we should give up and return false if Depth >= SelectionDAG::MaxRecursionDepth.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D139462
show more ...
|
#
e00e20a0 |
| 01-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.
These instructions requires both register operands to be compressible so I've only applied the hint if we already have a GPRC physical regis
[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.
These instructions requires both register operands to be compressible so I've only applied the hint if we already have a GPRC physical register assigned for the other register operand.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D139079
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5 |
|
#
f387918d |
| 15-Nov-2022 |
Craig Topper <craig.topper@sifive.com> |
[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion.
We can reuse constants if we use SRL followed by AND and AND followed by SHL. Similar was
[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion.
We can reuse constants if we use SRL followed by AND and AND followed by SHL. Similar was done to bitreverse previously.
Differential Revision: https://reviews.llvm.org/D138045
show more ...
|
Revision tags: llvmorg-15.0.4 |
|
#
78739fdb |
| 29-Oct-2022 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[DAG] Enable combineShiftOfShiftedLogic folds after type legalization
This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which
[DAG] Enable combineShiftOfShiftedLogic folds after type legalization
This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides.
Fixes #57872
Differential Revision: https://reviews.llvm.org/D136042
show more ...
|
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2 |
|
#
182aa0cb |
| 22-Sep-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Remove support for the unratified Zbp extension.
This extension does not appear to be on its way to ratification.
Still need some follow up to simplify the RISCVISD nodes.
|
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0 |
|
#
893f5e95 |
| 30-Aug-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Improve isel of AND with shiftedMask containing 32 leading zeros and some trailing zeros.
We can use srliw to shift out the trailing bits and slli to shift back in zeros. The sign extend of
[RISCV] Improve isel of AND with shiftedMask containing 32 leading zeros and some trailing zeros.
We can use srliw to shift out the trailing bits and slli to shift back in zeros. The sign extend of srliw will 0 the upper 32 bits since we will be shifting a 0 into bit 31.
show more ...
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1 |
|
#
69d5a038 |
| 28-Jul-2022 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL
[DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.
This is another step towards removing SelectionDAG::GetDemandedBits and just using TargetLowering::SimplifyMultipleUseDemandedBits.
There a few cases where we end up with extra register moves which I think we can accept in exchange for the increased ILP.
Differential Revision: https://reviews.llvm.org/D77804
show more ...
|
Revision tags: llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
46eef768 |
| 18-May-2022 |
Craig Topper <craig.topper@sifive.com> |
[DAGCombiner] Fix bug in MatchBSwapHWordLow.
This function tries to match (a >> 8) | (a << 8) as (bswap a) >> 16.
If the SRL isn't masked and the high bits aren't demanded, we still need to ensure
[DAGCombiner] Fix bug in MatchBSwapHWordLow.
This function tries to match (a >> 8) | (a << 8) as (bswap a) >> 16.
If the SRL isn't masked and the high bits aren't demanded, we still need to ensure that bits 23:16 are zero. After the right shift they will be in bits 15:8 which is where the important bits from the SHL end up. It's only a bswap if the OR on bits 15:8 only takes the bits from the SHL.
Fixes PR55484.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D125641
show more ...
|
#
74f6ded4 |
| 16-May-2022 |
Craig Topper <craig.topper@sifive.com> |
[AArch64][ARM][RISCV][X86] Add test cases for PR55484. NFC
This bug is in generic DAG combine and easily reproducible on many targets.
Reviewed By: david-arm
Differential Revision: https://reviews
[AArch64][ARM][RISCV][X86] Add test cases for PR55484. NFC
This bug is in generic DAG combine and easily reproducible on many targets.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D125640
show more ...
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0 |
|
#
fd4d584d |
| 13-Mar-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb.
If the type is less than XLenVT, type legalization will turn this into (srl (bitreverse (bswap (srl (bswap X), C))), C). We
[RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb.
If the type is less than XLenVT, type legalization will turn this into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't completely recover from these shifts. They introduce zeros into the upper bits of the result and we can't easily tell if they are needed. By doing a DAG combine early, we avoid introducing these shifts.
show more ...
|
#
b55a77d2 |
| 13-Mar-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add Zbp command lines to bswap-bitreverse.ll. NFC
|
Revision tags: llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
#
d8f929a5 |
| 29-Jan-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Custom legalize BITREVERSE with Zbkb.
With Zbkb, a bitreverse can be split into a rev8 and a brev8.
Reviewed By: VincentWu
Differential Revision: https://reviews.llvm.org/D118430
|
#
3e98ce45 |
| 28-Jan-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add Zbkb RUN lines to bswap-bitreverse.ll. NFC
|
#
dcd751b2 |
| 28-Jan-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Split bswap-bitreverse-ctlz-cttz-ctpop.ll into two files bswap/bitreverse and ctlz/cttz/ctpop. NFC
Add Zbkb command lines to the bswap/bitreverse test.
|