History log of /llvm-project/llvm/test/CodeGen/RISCV/div-by-constant.ll (Results 1 – 23 of 23)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9122c523 15-Nov-2024 Pengcheng Wang <wangpengcheng.pp@bytedance.com>

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional scheduling and tracking register
pressure.

Disclaimer: I haven't tested it on many cores; maybe we should turn some
of these options into features. I believe downstreams must have tried
this before, so feedback is welcome.



# aef0e77c 05-Nov-2024 Simon Pilgrim <llvm-dev@redking.me.uk>

[DAG] visitAND - Fold (and (srl X, C), 1) -> (srl X, BW-1) for signbit extraction (#114992)

If we're masking the LSB of a SRL node result and that is shifting down an extended sign bit, see if we can change the SRL to shift down the MSB directly.

These patterns can occur during legalisation when we've sign extended to a wider type but the SRL is still shifting from the subreg.

Alternative to #114967

Fixes the remaining regression in #112588

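As a rough illustration of the equivalence this fold relies on (a C sketch, not the DAGCombiner code; the shift amount 40 and the variable names are made up for the example): when x is a 64-bit value produced by sign-extending a 32-bit value, every bit from 31 upward is a copy of the sign bit, so masking out bit C for any C in that range gives the same result as logically shifting the MSB down.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        int32_t narrow = -123;                   /* works for any i32 value  */
        uint64_t x = (uint64_t)(int64_t)narrow;  /* sign-extended to 64 bits */

        /* (and (srl X, C), 1) with C = 40, inside the extended sign bits */
        uint64_t masked_lsb = (x >> 40) & 1;

        /* (srl X, BW-1): shift the sign bit (MSB) down directly */
        uint64_t msb = x >> 63;

        assert(masked_lsb == msb);               /* holds for any C in [31, 63] */
        return 0;
    }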


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# eabaee0c 07-Jan-2024 Fangrui Song <i@maskray.me>

[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)

R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530
`call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not
useful and can be removed now (matching AArch64 and PowerPC).

The GNU assembler has assembled `call foo` to R_RISCV_CALL_PLT since 2022-09
(70f35d72ef04cd23771875c1661c9975044a749c).

Without this patch, unconditionally changing MO_CALL to MO_PLT could
create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler
and GNU assembler.



Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3
# 86240751 06-Oct-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Strip W suffix from ADDIW (#68425)

The motivation of this change is simply to reduce test duplication. As
can be seen in the (massive) test delta, we have many tests whose output
differs only due to the use of addi on rv32 vs addiw on rv64 when the
high bits are don't-cares.

As an aside, we don't need to worry about the non-zero immediate
restriction on the compressed variants because we're not directly
forming the compressed variants. If we happen to get a zero immediate
for the ADDI, then either a later optimization will strip the useless
instruction or the encoder is responsible for not compressing the
instruction.



Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6
# 38f7c7eb 07-Jun-2023 Florian Mayer <fmayer@google.com>

Revert "Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).""

Revert broke even more stuff.

This reverts commit d5fbec30939f2c9f82475cf42c638

Revert "Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).""

Revert broke even more stuff.

This reverts commit d5fbec30939f2c9f82475cf42c638619514b5c67.



# d5fbec30 07-Jun-2023 Florian Mayer <fmayer@google.com>

Revert "[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C)."

Triggers UBSan error.

This reverts commit 58b2d652af49ee9d9ff2af6edd7f67f23b26bfee.


# 58b2d652 06-Jun-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Add special case to selectImm for constants that can be created with (ADD (SLLI C, 32), C).

Where C is a simm32.

This costs an extra temporary register, but avoids a constant pool.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D152236

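A rough sketch of the constant shape this special case targets (illustrative C; the particular value below is made up for the example): for a non-negative simm32 C, (C << 32) + C rebuilds the 64-bit constant whose high and low halves are both C, using one extra temporary instead of a constant-pool load.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        int64_t c = 0x12345678;            /* a simm32 */
        /* Materialize the wide constant as (ADD (SLLI C, 32), C). */
        int64_t wide = (c << 32) + c;
        assert(wide == 0x1234567812345678LL);
        return 0;
    }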


Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# cbbcb10e 28-Jan-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Refine the (mul (zext.w X), C) -> mulhu isel heuristic.

We try to shift both X and C left by 32 to replace the zext.w
with a SLLI and use mulhu. If C is already a simm32, this likely
makes a constant that is more expensive to materialize.

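A rough sketch of the arithmetic behind the (mul (zext.w X), C) -> mulhu rewrite that this heuristic governs (illustrative C, not the isel code; the mulhu helper, the sample values, and the use of the __uint128_t extension are assumptions made for the example): shifting both operands left by 32 moves the product into the high 64 bits of the 128-bit product, so a high-multiply returns zext32(X) * C whenever C fits in 32 bits, and the SLLI on X also replaces the zext.w.

    #include <assert.h>
    #include <stdint.h>

    /* High 64 bits of the 128-bit unsigned product, like RISC-V mulhu. */
    static uint64_t mulhu(uint64_t a, uint64_t b) {
        return (uint64_t)(((__uint128_t)a * b) >> 64);
    }

    int main(void) {
        uint64_t x = 0x123456789abcdef0ULL;
        uint64_t c = 0x9abcdef1ULL;                 /* fits in 32 bits   */
        uint64_t ref = (x & 0xffffffffULL) * c;     /* mul (zext.w X), C */
        assert(mulhu(x << 32, c << 32) == ref);     /* slli + slli + mulhu */
        return 0;
    }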


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be 20-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.

Previously, we unconditionally applied the first legal pattern, without any consideration of profitability. As a result, this change both prevents some patterns from being applied and changes which patterns are exercised (i.e., previously the first pattern was applied; afterwards, maybe the second one is, because the first wasn't profitable).

The motivation here is twofold.

First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.

Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain, simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.

Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.

For the test diffs, I don't see any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.

Differential Revision: https://reviews.llvm.org/D141017

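A small sketch of the kind of reassociation for ILP this enables by default (illustrative C, not the MachineCombiner code; the function and variable names are made up): a left-leaning add chain is serial because each add waits on the previous result, while the reassociated form exposes two independent adds.

    #include <stdint.h>

    /* Serial chain: dependence height 3, each add waits on the previous one. */
    int64_t chain(int64_t a, int64_t b, int64_t c, int64_t d) {
        return ((a + b) + c) + d;
    }

    /* Reassociated: (a + b) and (c + d) are independent, so the
       dependence height drops to 2 and the adds can overlap. */
    int64_t reassoc(int64_t a, int64_t b, int64_t c, int64_t d) {
        return (a + b) + (c + d);
    }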


Revision tags: llvmorg-15.0.7
# 002005e6 22-Dec-2022 Hsiangkai Wang <hsiangkai@google.com>

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-level parallelism by the existing MachineCombiner pass.

Differential Revision: https://reviews.llvm.org/D140530



# d64d3c5a 22-Dec-2022 Nitin John Raj <nitin.raj@sifive.com>

[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources.

We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes.

Differential Revision: https://reviews.llvm.org/D139948



# e00e20a0 01-Dec-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.

These instructions require both register operands to be compressible,
so I've only applied the hint if we already have a GPRC physical register
assigned for the other register operand.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D139079



Revision tags: llvmorg-15.0.6
# 64612f5d 25-Nov-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Add ADD to getRegAllocationHints to improve use of c.add.

add can always be compressed to c.add if one of the sources is the
same as the destination.

The same is not true for c.addw where the registers need to be x8-x15.



Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1
# 38ffa2bb 12-Sep-2022 Craig Topper <craig.topper@sifive.com>

[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.

For remainder:
If (1 << (BitWidth / 2)) % Divisor == 1, we can add the high and low halves
together and use a (BitWidth / 2) urem. If (BitWidth / 2) is a legal integer
type, this urem will be expanded by DAGCombiner using a multiply by magic
constant. We do have to take into account that adding the high and low halves
together can produce a carry, making it a (BitWidth / 2) + 1 bit number,
so we need to also add back in the carry from the first addition.

For division:
We can use the above trick to compute the remainder, subtract that
remainder from the dividend, then multiply by the multiplicative
inverse of the Divisor modulo (1 << BitWidth).

This is based on the section "Remainder by Summing Digits" in
Hacker's Delight.

The remainder trick is similar to a trick you may have learned for
determining if a decimal number is divisible by 3. You can add all the
digits together and see if the sum is divisible by 3. If you're not sure
if the sum is divisible by 3, you can add its digits together. This
can be repeated until you have a single decimal digit. If that digit
is 3, 6, or 9, then the original number is divisible by 3. This works
because 10 % 3 == 1.

gcc already does this same trick. There are additional tricks gcc does for
urem as well as srem, udiv, and sdiv that I plan to add in
future patches.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130862

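A worked sketch of both tricks on a concrete case (illustrative C, not the LegalizeTypes code; the divisor 5, its inverse constant, and the variable names are chosen for the example): since (1 << 32) % 5 == 1, the remainder of a 64-bit value mod 5 equals the remainder of the sum of its 32-bit halves, and the quotient can then be recovered by multiplying (x - r) by the multiplicative inverse of 5 modulo 2^64.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        uint64_t x = 0x123456789abcdef1ULL;

        /* Remainder: (1 << 32) % 5 == 1, so summing the halves preserves the
           remainder mod 5. The sum fits in 33 bits; the real lowering reduces
           it at the narrower width and adds back the carry. */
        uint64_t sum = (x >> 32) + (x & 0xffffffffULL);
        uint64_t r = sum % 5;
        assert(r == x % 5);

        /* Division: x - r is an exact multiple of 5, so multiplying by the
           multiplicative inverse of 5 modulo 2^64 (0xCCCCCCCCCCCCCCCD)
           recovers the quotient. */
        uint64_t q = (x - r) * 0xCCCCCCCCCCCCCCCDULL;
        assert(q == x / 5);
        return 0;
    }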


Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 79016f6e 15-Jul-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Refine the heuristics for our custom (mul (and X, C2), C1) isel.

Prefer to use SLLI instead of zext.w/zext.h in more cases. SLLI
might be better for compression.


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 1903b991 07-Apr-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Always select (and (srl X, C), Mask) as (srli (slli X, C2), C3).

SLLI is always compressible to C.SLLI as long as the source and dest
registers are the same.

ANDI and SRLI are only compressible if the register is x8-x15. By
using SLLI we have a better chance of generating shorter code.

I had to add one exclusion for the BEXTI case so that its
pattern match could still fire.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123336



Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 33d008b1 12-Jan-2022 Alex Bradbury <asb@lowrisc.org>

[RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental

The agreed policy is that RISC-V extensions that have not yet been ratified
should be marked as experimental, and enabling them with clang requires the
-menable-experimental-extensions flag alongside an explicit version number.
version number. These extensions have now been ratified, so this is no
longer necessary, and the target feature names can be renamed to no
longer be prefixed with "experimental-".

Differential Revision: https://reviews.llvm.org/D117131



# b645bcd9 10-Jan-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, XLen-16), (XLen-16) + C) optimization.

This can be generalized to (srl (and X, C2), C) ->
(srli (slli X, XLen-C3), (XLen-C3) + C), where C2 is a mask with
C3 trailing ones.

This can avoid constant materialization for C2. This is beneficial
even when C2 can be selected to ANDI because the SLLI can become
C.SLLI, but C.ANDI cannot cover all the immediates of ANDI.

This also enables CSE in some cases of i8 sdiv by constant codegen.

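A quick sketch of the identity behind the rewrite (illustrative C for XLen = 64, not the isel code; the concrete mask and shift amounts are made up): for a mask C2 with C3 trailing ones, shifting x left by XLen - C3 discards the bits above the mask, so a following right shift by (XLen - C3) + C produces the same value as (x & C2) >> C without materializing C2.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        uint64_t x = 0xdeadbeefcafef00dULL;
        const unsigned XLen = 64;
        const unsigned C3 = 16;                      /* C2 has 16 trailing ones */
        const uint64_t C2 = (1ULL << C3) - 1;        /* 0xffff                  */
        const unsigned C = 3;

        uint64_t and_srl   = (x & C2) >> C;          /* srl (and X, C2), C */
        uint64_t slli_srli = (x << (XLen - C3)) >> ((XLen - C3) + C);
        assert(and_srl == slli_srli);
        return 0;
    }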


# 296e8cae 10-Jan-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, XLen-16), (XLen-16) + C).

Similar for (sra (sext_inreg X, i8), C).

With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h.
This transform makes the Zbb codegen the same as without Zbb. The
shifts are more compressible. This also exposes an opportunity for
CSE with another slli in the i16 sdiv by constant codegen.

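A similar sketch for the sign-extended variant (illustrative C for XLen = 64; the sample value and shift amount are made up, and >> on a negative int64_t is assumed to be an arithmetic shift, as it is on the usual compilers): when only the low 16 bits of x are meaningful, sign-extending and then shifting right by C matches shifting left by XLen - 16 and arithmetically shifting right by (XLen - 16) + C.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        int64_t x = 0x1234fedcLL;   /* low 16 bits hold a negative i16 */
        const unsigned C = 5;

        /* sra (sext_inreg X, i16), C */
        int64_t ref = (int64_t)(int16_t)x >> C;

        /* srai (slli X, 48), 48 + C; the left shift is done on an unsigned
           value to avoid signed-overflow UB in C. */
        int64_t slli_srai = (int64_t)((uint64_t)x << 48) >> (48 + C);

        assert(ref == slli_srai);
        return 0;
    }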


# 2dd52f84 10-Jan-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Fold (srl (and X, 0xffff), C) -> (srli (slli X, XLen-16), (XLen-16) + C) even with Zbb/Zbp.

We can use zext.h with Zbb, but srli/slli may offer more opportunities
for compression.


# 41454ab2 31-Dec-2021 wangpc <pc.wang@linux.alibaba.com>

[RISCV] Use constant pool for large integers

For large integers (for example, magic numbers generated by
TargetLowering::BuildSDIV when dividing by a constant), we may
need about 4~8 instructions to build them.
At the same time, it takes just two instructions to load
constants (with extra cycles to access memory), so it may be
profitable to put these integers into the constant pool.

Reviewed By: asb, craig.topper

Differential Revision: https://reviews.llvm.org/D114950



# 015ff729 29-Dec-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Add a few more instructions to hasAllNBitUsers.


# f7b096d7 29-Dec-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Add more div by constant test cases.

Some constants require more instructions than others. This adds
additional tests for each variation. UDIV has 2 variations, SDIV has
4 variations.

Some of these sequences may have gotten worse on RV32 when we started
doing the div by constant optimization before type legalization. We
materialized a smaller constant, but we require more instructions
to emulate 8- or 16-bit right shifts. This was hidden by the lack
of test coverage.

I've also added Zba and Zbb test cases to show the effect of sext.b,
sext.h, zext.h, and zext.w on some of the shifts. In some cases
we end up generating more code after the multiply because we use
a zext.h+srli and sext.h+srai where without Zbb we share a slli
between a srli and srai.
