History log of /llvm-project/llvm/test/CodeGen/RISCV/div-by-constant.ll (Results 1 – 23 of 23)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9122c523 15-Nov-2024 Pengcheng Wang <wangpengcheng.pp@bytedance.com>

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional scheduling and tracking register
pressure.

Disclaimer: I haven't tested it on many cores; maybe we should turn some
of these options into features. I believe downstreams must have tried
this before, so feedback is welcome.



# aef0e77c 05-Nov-2024 Simon Pilgrim <llvm-dev@redking.me.uk>

[DAG] visitAND - Fold (and (srl X, C), 1) -> (srl X, BW-1) for signbit extraction (#114992)

If we're masking the LSB of a SRL node result and that is shifting down an extended sign bit, see if we can change the SRL to shift down the MSB directly.

These patterns can occur during legalisation when we've sign extended to a wider type but the SRL is still shifting from the subreg.

Alternative to #114967

Fixes the remaining regression in #112588

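As a rough illustration of the equivalence this fold relies on (a C sketch, not the DAGCombiner code; the shift amount 40 and the variable names are made up for the example): when x is a 64-bit value produced by sign-extending a 32-bit value, every bit from 31 upward is a copy of the sign bit, so masking out bit C for any C in that range gives the same result as logically shifting the MSB down.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        int32_t narrow = -123;                   /* works for any i32 value  */
        uint64_t x = (uint64_t)(int64_t)narrow;  /* sign-extended to 64 bits */

        /* (and (srl X, C), 1) with C = 40, inside the extended sign bits */
        uint64_t masked_lsb = (x >> 40) & 1;

        /* (srl X, BW-1): shift the sign bit (MSB) down directly */
        uint64_t msb = x >> 63;

        assert(masked_lsb == msb);               /* holds for any C in [31, 63] */
        return 0;
    }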


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# eabaee0c 07-Jan-2024 Fangrui Song <i@maskray.me>

[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)

R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530
`call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not
useful and can be removed now (matching AArch64 and PowerPC).

The GNU assembler has assembled `call foo` to R_RISCV_CALL_PLT since 2022-09
(70f35d72ef04cd23771875c1661c9975044a749c).

Without this patch, unconditionally changing MO_CALL to MO_PLT could
create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler
and GNU assembler.



Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3
# 86240751 06-Oct-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Strip W suffix from ADDIW (#68425)

The motivation of this change is simply to reduce test duplication. As
can be seen in the (massive) test delta, we have many tests whose output
differs only due to the use of addi on rv32 vs addiw on rv64 when the
high bits are don't-cares.

As an aside, we don't need to worry about the non-zero immediate
restriction on the compressed variants because we're not directly
forming the compressed variants. If we happen to get a zero immediate
for the ADDI, then either a later optimization will strip the useless
instruction or the encoder is responsible for not compressing the
instruction.



Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6
# 38f7c7eb 07-Jun-2023 Florian Mayer <fmayer@google.com>

Revert "Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).""

Revert broke even more stuff.

This reverts commit d5fbec30939f2c9f82475cf42c638

Revert "Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).""

Revert broke even more stuff.

This reverts commit d5fbec30939f2c9f82475cf42c638619514b5c67.



# d5fbec30 07-Jun-2023 Florian Mayer <fmayer@google.com>

Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C)."

Triggers UBSan error.

This reverts commit 58b2d652af49ee9d9ff2af6edd7f67f23b26bfee.


# 58b2d652 06-Jun-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).

Where C is a simm32.

This costs an extra temporary register, but avoids a constant pool.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D152236

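A rough sketch of the constant shape this special case targets (illustrative C; the particular value below is made up for the example): for a non-negative simm32 C, (C << 32) + C rebuilds the 64-bit constant whose high and low halves are both C, using one extra temporary instead of a constant-pool load.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        int64_t c = 0x12345678;            /* a simm32 */
        /* Materialize the wide constant as (ADD (SLLI C, 32), C). */
        int64_t wide = (c << 32) + c;
        assert(wide == 0x1234567812345678LL);
        return 0;
    }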


Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# cbbcb10e 28-Jan-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Refine the (mul (zext.w X), C) -> mulhu isel heuristic.

We try to shift both X and C left by 32 to replace the zext.w
with a SLLI and use mulhu. If C is already a simm32, this likely
makes a constant that is more expensive to materialize.

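A rough sketch of the arithmetic behind the (mul (zext.w X), C) -> mulhu rewrite that this heuristic governs (illustrative C, not the isel code; the mulhu helper, the sample values, and the use of the __uint128_t extension are assumptions made for the example): shifting both operands left by 32 moves the product into the high 64 bits of the 128-bit product, so a high-multiply returns zext32(X) * C whenever C fits in 32 bits, and the SLLI on X also replaces the zext.w.

    #include <assert.h>
    #include <stdint.h>

    /* High 64 bits of the 128-bit unsigned product, like RISC-V mulhu. */
    static uint64_t mulhu(uint64_t a, uint64_t b) {
        return (uint64_t)(((__uint128_t)a * b) >> 64);
    }

    int main(void) {
        uint64_t x = 0x123456789abcdef0ULL;
        uint64_t c = 0x9abcdef1ULL;                 /* fits in 32 bits   */
        uint64_t ref = (x & 0xffffffffULL) * c;     /* mul (zext.w X), C */
        assert(mulhu(x << 32, c << 32) == ref);     /* slli + slli + mulhu */
        return 0;
    }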


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be 20-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.

Previously, we unconditionally applied the first legal pattern, without any consideration of profitability. As a result, this change both prevents some patterns from being applied and changes which patterns are exercised (i.e., previously the first pattern was applied; afterwards, maybe the second one is, because the first wasn't profitable).

The motivation here is twofold.

First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.

Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain, simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.

Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.

For the test diffs, I don't see any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.

Differential Revision: https://reviews.llvm.org/D141017

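A small sketch of the kind of reassociation for ILP this enables by default (illustrative C, not the MachineCombiner code; the function and variable names are made up): a left-leaning add chain is serial because each add waits on the previous result, while the reassociated form exposes two independent adds.

    #include <stdint.h>

    /* Serial chain: dependence height 3, each add waits on the previous one. */
    int64_t chain(int64_t a, int64_t b, int64_t c, int64_t d) {
        return ((a + b) + c) + d;
    }

    /* Reassociated: (a + b) and (c + d) are independent, so the
       dependence height drops to 2 and the adds can overlap. */
    int64_t reassoc(int64_t a, int64_t b, int64_t c, int64_t d) {
        return (a + b) + (c + d);
    }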


Revision tags: llvmorg-15.0.7
# 002005e6 22-Dec-2022 Hsiangkai Wang <hsiangkai@google.com>

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-level parallelism by the existing MachineCombiner pass.

Differential Revision: https://reviews.llvm.org/D140530



# d64d3c5a 22-Dec-2022 Nitin John Raj <nitin.raj@sifive.com>

[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources.

We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes.

Differential Revision: https://reviews.llvm.org/D139948



# e00e20a0 01-Dec-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.

These instructions require both register operands to be compressible,
so I've only applied the hint if we already have a GPRC physical register
assigned for the other register operand.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D139079



Revision tags: llvmorg-15.0.6
# 64612f5d 25-Nov-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Add ADD to getRegAllocationHints to improve use of c.add.

add can always be compressed to c.add if one of the sources is the
same as the destination.

The same is not true for c.addw where the registers need to be x8-x15.



Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1
# 38ffa2bb 12-Sep-2022 Craig Topper <craig.topper@sifive.com>

[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.

For remainder:
If (1 << (BitWidth / 2)) % Divisor == 1, we can add the high and low halves
together and use a (BitWidth / 2) urem. If (BitWidth / 2) is a legal integer
type, this urem will be expanded by DAGCombiner using a multiply by magic
constant. We do have to take into account that adding the high and low halves
together can produce a carry, making it a (BitWidth / 2) + 1 bit number,
so we need to also add back in the carry from the first addition.

For division:
We can use the above trick to compute the remainder, subtract that
remainder from the dividend, then multiply by the multiplicative
inverse of the Divisor modulo (1 << BitWidth).

This is based on the section "Remainder by Summing Digits" in
Hacker's Delight.

The remainder trick is similar to a trick you may have learned for
determining if a decimal number is divisible by 3. You can add all the
digits together and see if the sum is divisible by 3. If you're not sure
if the sum is divisible by 3, you can add its digits together. This
can be repeated until you have a single decimal digit. If that digit
is 3, 6, or 9, then the original number is divisible by 3. This works
because 10 % 3 == 1.

gcc already does this same trick. There are additional tricks gcc does for
urem as well as srem, udiv, and sdiv that I plan to add in
future patches.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130862

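A worked sketch of both tricks on a concrete case (illustrative C, not the LegalizeTypes code; the divisor 5, its inverse constant, and the variable names are chosen for the example): since (1 << 32) % 5 == 1, the remainder of a 64-bit value mod 5 equals the remainder of the sum of its 32-bit halves, and the quotient can then be recovered by multiplying (x - r) by the multiplicative inverse of 5 modulo 2^64.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        uint64_t x = 0x123456789abcdef1ULL;

        /* Remainder: (1 << 32) % 5 == 1, so summing the halves preserves the
           remainder mod 5. The sum fits in 33 bits; the real lowering reduces
           it at the narrower width and adds back the carry. */
        uint64_t sum = (x >> 32) + (x & 0xffffffffULL);
        uint64_t r = sum % 5;
        assert(r == x % 5);

        /* Division: x - r is an exact multiple of 5, so multiplying by the
           multiplicative inverse of 5 modulo 2^64 (0xCCCCCCCCCCCCCCCD)
           recovers the quotient. */
        uint64_t q = (x - r) * 0xCCCCCCCCCCCCCCCDULL;
        assert(q == x / 5);
        return 0;
    }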


Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 79016f6e 15-Jul-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Refine the heuristics for our custom (mul (and X, C2), C1) isel.

Prefer to use SLLI instead of zext.w/zext.h in more cases. SLLI
might be better for compression.


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 1903b991 07-Apr-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Always select (and (srl X, C), Mask) as (srli (slli X, C2), C3).

SLLI is always compressible to C.SLLI as long as the source and dest
registers are the same.

ANDI and SRLI are only compressible if the register is x8-x15. By
using SLLI we have a better chance of generating shorter code.

I had to add one exclusion for the BEXTI case so that its
pattern match could still fire.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123336



Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 33d008b1 12-Jan-2022 Alex Bradbury <asb@lowrisc.org>

[RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental

The agreed policy is that RISC-V extensions that have not yet been ratified
should be marked as experimental, and enabling them with clang requires the
-menable-experimental-extensions flag alongside an explicit version number.
version number. These extensions have now been ratified, so this is no
longer necessary, and the target feature names can be renamed to no
longer be prefixed with "experimental-".

Differential Revision: https://reviews.llvm.org/D117131



# b645bcd9 10-Jan-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, XLen-16), (XLen-16) + C) optimization.

This can be generalized to (srl (and X, C2), C) ->
(srli (slli X, XLen-C3), (XLen-C3) + C), where C2 is a mask with
C3 trailing ones.

This can avoid constant materialization for C2. This is beneficial
even when C2 can be selected to ANDI because the SLLI can become
C.SLLI, but C.ANDI cannot cover all the immediates of ANDI.

This also enables CSE in some cases of i8 sdiv by constant codegen.

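A quick sketch of the identity behind the rewrite (illustrative C for XLen = 64, not the isel code; the concrete mask and shift amounts are made up): for a mask C2 with C3 trailing ones, shifting x left by XLen - C3 discards the bits above the mask, so a following right shift by (XLen - C3) + C produces the same value as (x & C2) >> C without materializing C2.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        uint64_t x = 0xdeadbeefcafef00dULL;
        const unsigned XLen = 64;
        const unsigned C3 = 16;                      /* C2 has 16 trailing ones */
        const uint64_t C2 = (1ULL << C3) - 1;        /* 0xffff                  */
        const unsigned C = 3;

        uint64_t and_srl   = (x & C2) >> C;          /* srl (and X, C2), C */
        uint64_t slli_srli = (x << (XLen - C3)) >> ((XLen - C3) + C);
        assert(and_srl == slli_srli);
        return 0;
    }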


# 296e8cae 10-Jan-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, XLen-16), (XLen-16) + C).

Similar for (sra (sext_inreg X, i8), C).

With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h.
This transform makes the Zbb codegen the same as without Zbb. The
shifts are more compressible. This also exposes an opportunity for
CSE with another slli in the i16 sdiv by constant codegen.

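A similar sketch for the sign-extended variant (illustrative C for XLen = 64; the sample value and shift amount are made up, and >> on a negative int64_t is assumed to be an arithmetic shift, as it is on the usual compilers): when only the low 16 bits of x are meaningful, sign-extending and then shifting right by C matches shifting left by XLen - 16 and arithmetically shifting right by (XLen - 16) + C.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        int64_t x = 0x1234fedcLL;   /* low 16 bits hold a negative i16 */
        const unsigned C = 5;

        /* sra (sext_inreg X, i16), C */
        int64_t ref = (int64_t)(int16_t)x >> C;

        /* srai (slli X, 48), 48 + C; the left shift is done on an unsigned
           value to avoid signed-overflow UB in C. */
        int64_t slli_srai = (int64_t)((uint64_t)x << 48) >> (48 + C);

        assert(ref == slli_srai);
        return 0;
    }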


# 2dd52f84 10-Jan-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Fold (srl (and X, 0xffff), C) -> (srli (slli X, XLen-16), (XLen-16) + C) even with Zbb/Zbp.

We can use zext.h with Zbb, but srli/slli may offer more opportunities
for compression.


# 41454ab2 31-Dec-2021 wangpc <pc.wang@linux.alibaba.com>

[RISCV] Use constant pool for large integers

For large integers (for example, magic numbers generated by
TargetLowering::BuildSDIV when dividing by a constant), we may
need about 4~8 instructions to build them.
At the same time, it takes just two instructions to load
constants (with extra cycles to access memory), so it may be
profitable to put these integers into the constant pool.

Reviewed By: asb, craig.topper

Differential Revision: https://reviews.llvm.org/D114950



# 015ff729 29-Dec-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Add a few more instructions to hasAllNBitUsers.


# f7b096d7 29-Dec-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Add more div by constant test cases.

Some constants require more instructions than others. This adds
additional tests for each variation. UDIV has 2 variations, SDIV has
4 variations.

Some of these sequences may have gotten worse on RV32 when we started
doing the div by constant optimization before type legalization. We
materialized a smaller constant, but we require more instructions
to emulate 8- or 16-bit right shifts. This was hidden by the lack
of test coverage.

I've also added Zba and Zbb test cases to show the effect of sext.b,
sext.h, zext.h, and zext.w on some of the shifts. In some cases
we end up generating more code after the multiply because we use
a zext.h+srli and sext.h+srai where without Zbb we share a slli
between a srli and srai.
