#
83d56fb1 |
| 19-Jan-2023 |
Kazu Hirata <kazu@google.com> |
Drop the ZeroBehavior parameter from countLeadingZeros and the like (NFC)
This patch drops the ZeroBehavior parameter from bit counting functions like countLeadingZeros. ZeroBehavior specifies the behavior when the input to count{Leading,Trailing}Zeros is zero and when the input to count{Leading,Trailing}Ones is all ones.
ZeroBehavior was first introduced on May 24, 2013 in commit eb91eac9fb866ab1243366d2e238b9961895612d. While that patch did not state the intention, I would guess ZeroBehavior was for performance reasons. The x86 machines around that time required a conditional branch to implement a countLeadingZeros<uint32_t> that returns 32 on zero:
    test edi, edi
    je .LBB0_2
    bsr eax, edi
    xor eax, 31
  .LBB0_2:
    mov eax, 32
That is, we can remove the conditional branch if we don't care about the behavior on zero.
IIUC, Intel's Haswell architecture, launched on June 4, 2013, introduced several bit manipulation instructions, including lzcnt and tzcnt, which eliminated the need for the conditional branch.
I think it's time to retire ZeroBehavior as its utility is very limited. If you care about compilation speed, you should build LLVM with an appropriate -march= to take advantage of lzcnt and tzcnt. Even if not, modern host compilers should be able to optimize away quite a few conditional branches because the input is often known to be nonzero from dominating conditional branches.
Differential Revision: https://reviews.llvm.org/D141798
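The trade-off described in this message can be sketched in a few lines of C++. This is a minimal illustration, not LLVM's implementation: the helper names clz32Defined and clz32NonZero are hypothetical, and the code assumes a GCC- or Clang-compatible host compiler that provides __builtin_clz (whose result is undefined for a zero argument).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM's code): count leading zeros of a 32-bit
// value, returning a defined result of 32 when the input is zero.
inline unsigned clz32Defined(uint32_t X) {
  if (X == 0) // the zero check that older x86 needed a branch for
    return 32;
  // __builtin_clz is a GCC/Clang builtin; its result is undefined for 0.
  return static_cast<unsigned>(__builtin_clz(X));
}

// Hypothetical variant for callers that already know X is nonzero, which
// is the situation the ZB_Undefined mode was meant to exploit.
inline unsigned clz32NonZero(uint32_t X) {
  assert(X != 0 && "caller must guarantee a nonzero input");
  return static_cast<unsigned>(__builtin_clz(X));
}
```

When a caller has already tested X against zero, a modern compiler can often fold the check in clz32Defined away, which is the argument the message makes for dropping the parameter.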
|
Revision tags: llvmorg-15.0.7 |
|
#
564e09c7 |
| 08-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use bseti for 2048 in RISCVMatInt when Zbs is enabled.
2048 requires an LUI and ADDI instruction due to ADDI using a signed immediate. It can also be done with C.LI+C.SLLI for better code size.
With Zbs we can use a single BSETI to save an instruction.
Reorder the checks so that BSETI is checked first, with an extra qualification to prefer a single LUI or ADDI when possible. I'm continuing to think about other ways to structure this code, but this works for now.
Fixes PR59362.
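The arithmetic behind the 2048 case can be checked with a short C++ sketch. The helper names below (isSImm12, signExtend12, splitLuiAddi, isSingleBit) are hypothetical and are not taken from RISCVMatInt; the sketch only assumes the standard RISC-V encoding in which ADDI takes a sign-extended 12-bit immediate and LUI supplies the upper 20 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the LUI immediate and the ADDI immediate.
struct HiLo {
  int64_t Hi20;
  int64_t Lo12;
};

// ADDI accepts a signed 12-bit immediate, so 2047 fits and 2048 does not.
inline bool isSImm12(int64_t V) { return V >= -2048 && V <= 2047; }

// Sign-extend the low 12 bits of V.
inline int64_t signExtend12(uint64_t V) {
  return static_cast<int64_t>((V & 0xFFF) ^ 0x800) - 0x800;
}

// Split a simm32 constant into an LUI immediate plus an ADDI immediate.
// Because Lo12 is sign-extended, the LUI part absorbs the borrow.
inline HiLo splitLuiAddi(int64_t Val) {
  assert(Val >= INT32_MIN && Val <= INT32_MAX && "sketch covers simm32 only");
  int64_t Lo12 = signExtend12(static_cast<uint64_t>(Val));
  uint64_t Upper = static_cast<uint64_t>(Val) - static_cast<uint64_t>(Lo12);
  int64_t Hi20 = static_cast<int64_t>((Upper >> 12) & 0xFFFFF);
  return {Hi20, Lo12};
}

// A constant with exactly one bit set is a candidate for a single BSETI.
inline bool isSingleBit(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }
```

For 2048 the split yields LUI 1 followed by ADDI -2048, two instructions, while isSingleBit(2048) shows that a single bit-set of bit 11 would also produce the value.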
|
#
f2ffdbeb |
| 08-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add accessors to RISCVMatInt::Inst.
Make fields private. This helps hide that the Imm field doesn't store a full int64_t.
|
#
2c52d516 |
| 07-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
Revert "[RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC"
This reverts commit d24915207c631b7cf637081f333b41bc5159c700.
Thinking about this more, this probably chewed up 100+ bytes of stack for each recursive call, so it needs more thought. The code simplification wasn't that much.
|
#
938d0d6d |
| 07-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Replace uses of hasStdExtC with COrZca.
Except MakeCompressible which will need more work.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D139504
|
#
d6cfdf04 |
| 06-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Pass ZB_Undefined to countTrailingZeros/countLeadingZeros. NFC
We know the input is not zero so we can simplify the generated code.
|
#
d2491520 |
| 06-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC
We should be able to rely on RVO here.
|
#
1806ce90 |
| 06-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Teach RISCVMatInt to prefer li+slli over lui+addi(w) for compressibility.
With the C extension, li with a 6-bit immediate followed by slli is 4 bytes. The lui+addi(w) sequence is at least 6 bytes.
The two sequences probably have similar execution latency, except on targets that support lui+addi(w) macrofusion.
Since the execution latency is probably the same, I didn't restrict this to the C extension.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D139135
|
#
ce66f4d0 |
| 06-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Restrict when RISCVMatInt will retry SLLI as a last step. NFC
The main algorithm will already end with a SLLI when there are 12 or more trailing zeros. We only need to retry when there are fewer than 12 trailing zeros, since the main algorithm will pick an ADDI or ADDIW at the end for those cases.
|
#
dd3fe524 |
| 06-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Remove some RISCVMatInt early exits.
These were early exiting if we replaced a sequence with a 2-instruction sequence, since that is the best we could do. All the later optimizations only occur if the sequence is more than 2 instructions, so this wasn't a functional check.
At best it helps the compiler generate better code, but I don't think that was analyzed when it was added. Remove it to simplify the code.
|
#
47ff3042 |
| 05-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use findFirstSet instead of countTrailingZeros. NFC
findFirstSet is a wrapper around countTrailingZeros so they are equivalent here, but I think findFirstSet more clearly describes the algorithm here.
|
#
c8c1d7af |
| 05-Dec-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Use emplace_back to shorten lines in RISCVMatInt. NFC
A few other minor improvements.
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3 |
|
#
0fe5f03e |
| 12-Aug-2022 |
jacquesguan <Jianjian.Guan@streamcomputing.com> |
[RISCV][NFC] Use nested namespace definitions.
Since we use C++17 now, we could use nested namespace definitions to simplify code.
Differential Revision: https://reviews.llvm.org/D131751
|
Revision tags: llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
d2ee2c9c |
| 24-May-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add an operand kind to the opcode/imm returned from RISCVMatInt.
Instead of matching opcodes to know the format to emit, use an enum value that we can get from the RISCVMatInt::Inst class.
Change the consumers to use fully covered switches so that we get a compiler warning if a new kind is added. With the opcode checks it was easier to forget to update one of the 3 consumers.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126317
|
#
5c383731 |
| 29-Apr-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Improve constant materialization for cases that can use LUI+ADDI instead of LUI+ADDIW.
It's possible that we have a constant that isn't simm32 so we can't use LUI+ADDIW, but we can use LUI+ADDI. Because ADDI uses a sign extended constant, it's possible that after subtracting it out, we end up with a simm32 that maps to LUI.
This patch detects this case after removing Lo12 and before shifting the value for SLLI.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D124222
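The case this patch describes can be reproduced with a small C++ sketch. The function fitsLuiThenAddi and its helpers are hypothetical names used for illustration only; the sketch assumes nothing beyond the text above, namely that Lo12 is the sign-extended low 12 bits and that the remainder after subtracting it must be a simm32 for a lone LUI to produce it.

```cpp
#include <cassert>
#include <cstdint>

inline bool isSImm32(int64_t V) { return V >= INT32_MIN && V <= INT32_MAX; }

// Sign-extend the low 12 bits of V.
inline int64_t signExtend12(uint64_t V) {
  return static_cast<int64_t>((V & 0xFFF) ^ 0x800) - 0x800;
}

// Hypothetical check: Val itself is not a simm32 (so LUI+ADDIW cannot
// produce it), but after removing the sign-extended low 12 bits the
// remainder is a simm32 with zero low bits, i.e. a value LUI can produce.
inline bool fitsLuiThenAddi(int64_t Val) {
  if (isSImm32(Val))
    return false; // the ordinary LUI+ADDIW path already covers this
  int64_t Lo12 = signExtend12(static_cast<uint64_t>(Val));
  int64_t Rest = static_cast<int64_t>(static_cast<uint64_t>(Val) -
                                      static_cast<uint64_t>(Lo12));
  assert((Rest & 0xFFF) == 0 && "low 12 bits were subtracted out");
  return isSImm32(Rest);
}
```

For example, -2147483649 (one below the smallest simm32) has Lo12 equal to -1, leaving -2147483648 for the LUI; 64-bit ADDI then adds -1 without the 32-bit wraparound that ADDIW would introduce.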
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2 |
|
#
9534811a |
| 21-Apr-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Teach generateInstSeqImpl to generate BSETI for single bit cases.
If the immediate has one bit set, but isn't a simm32 we can try the BSETI instruction from Zbs.
|
#
98b86689 |
| 21-Apr-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Add special case to constant materialization to remove trailing zeros first.
If there are fewer than 12 trailing zeros, we'll try to use an ADDI at the end of the sequence. If we strip trailing zeros and end the sequence with a SLLI we might find a shorter sequence.
Differential Revision: https://reviews.llvm.org/D124148
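The stripping step can be sketched as follows. The names countTrailingZerosNonZero, ShiftedImm and stripTrailingZeros are hypothetical and this is not the code from the patch; it only shows how a constant is separated into a smaller base value plus the shift amount for a final SLLI.

```cpp
#include <cassert>
#include <cstdint>

// Portable trailing-zero count for a nonzero value.
inline unsigned countTrailingZerosNonZero(uint64_t V) {
  assert(V != 0 && "zero has no lowest set bit");
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Hypothetical result type: materialize Base first, then emit SLLI by ShAmt.
struct ShiftedImm {
  int64_t Base;
  unsigned ShAmt;
};

// Remove the trailing zeros so the remaining value may need fewer
// instructions. The right shift of a negative value is arithmetic on
// mainstream compilers and is guaranteed to be since C++20.
inline ShiftedImm stripTrailingZeros(int64_t Val) {
  unsigned ShAmt = countTrailingZerosNonZero(static_cast<uint64_t>(Val));
  return {Val >> ShAmt, ShAmt};
}
```

For 0x1800 this gives a base of 3 and a shift of 11. Whether the stripped form is actually shorter depends on the constant, so a real implementation would compare the candidate sequences and keep the better one.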
|
#
186d5c8a |
| 21-Apr-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Make getInstSeqCost handle other Zb* instructions.
We haven't been updating this as Zb* instructions have been used for immediate materialization. They will hit the default case and trigger an llvm_unreachable. Instead of trying to list them all, assume instructions that aren't explicitly listed aren't compressible.
Spotted while looking at integer materialization for other reasons. I haven't seen a crash from this yet.
|
Revision tags: llvmorg-14.0.1 |
|
#
70046438 |
| 09-Apr-2022 |
Craig Topper <craig.topper@sifive.com> |
[RISCV] Only try LUI+SH*ADD+ADDI for int materialization if LUI+ADDI+SH*ADD failed.
There's an assert in LUI+SH*ADD+ADDI materialization that makes sure the lower 12 bits aren't zero since that case should have been handled as LUI+ADDI+SH*ADD. But nothing prevented the LUI+SH*ADD+ADDI checks from running after the earlier code handled it.
The sequence would be the same length or longer so it wouldn't replace the earlier sequence, but the assert happened before that was checked.
The vector holding the sequence also wasn't reset before the second check so that guaranteed the sequence would never be found to be shorter.
This patch fixes this by only trying the second expansion when the earlier fails.
Fixes PR54812.
Reviewed By: benshi001
Differential Revision: https://reviews.llvm.org/D123406
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
#
588f121a |
| 28-Jan-2022 |
Alex Bradbury <asb@lowrisc.org> |
[RISCV][NFC] Make Zb* instruction naming match the convention used elsewhere in the RISC-V backend
Where the instruction mnemonic contains a dot, we name the corresponding instruction in the .td file using a _ in the place of the dot. e.g. LR_W rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that convention.
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
af931a51 |
| 07-Jan-2022 |
Baoshan Pang <pangbw@gmail.com> |
[RISCV] Materializing constants with 'rori'
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116574
|
Revision tags: llvmorg-13.0.1-rc1 |
|
#
4c3d916c |
| 11-Nov-2021 |
Ben Shi <powerman1st@163.com> |
[RISCV] Optimize immediate materialisation with SH*ADD
Use LUI+SH*ADD+ADDI to compose specific immediates.
Reviewed By: craig.topper, luismarques
Differential Revision: https://reviews.llvm.org/D113568
|
#
97e52e1c |
| 17-Oct-2021 |
Ben Shi <powerman1st@163.com> |
[RISCV] Optimize immediate materialisation with SLLI.UW in the Zba extension
Simplify "LUI+SLLI+ADDI+SLLI" and "LUI+ADDIW+SLLI+ADDI+SLLI" to "LUI+ADDIW+SLLIUW" to reduce the instruction count.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D111933
|
#
4fe5ab4b |
| 15-Oct-2021 |
Ben Shi <powerman1st@163.com> |
[RISCV] Optimize immediate materialisation with SH*ADD
Use SH1ADD/SH2ADD/SH3ADD along with LUI+ADDI to compose int32*3, int32*5 and int32*9.
Reviewed By: craig.topper, luismarques
Differential Revision: https://reviews.llvm.org/D111484
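The int32*3, int32*5 and int32*9 idea rests on the fact that SHnADD with both source operands equal to x computes (x << n) + x, that is, x * (2^n + 1). A minimal C++ sketch of the divisibility check follows; pickShXAdd and shXAdd are hypothetical names, and the real pass may order or restrict the checks differently.

```cpp
#include <cassert>
#include <cstdint>

inline bool isSImm32(int64_t V) { return V >= INT32_MIN && V <= INT32_MAX; }

// Models SHnADD rd, rs, rs: (X << N) + X, i.e. X * (2^N + 1).
inline int64_t shXAdd(int64_t X, unsigned N) {
  assert(N >= 1 && N <= 3 && "only SH1ADD, SH2ADD and SH3ADD exist");
  return X * ((int64_t{1} << N) + 1);
}

struct ShOption {
  int64_t Div;
  unsigned Sh;
};

// Hypothetical selector: returns the shift amount (1, 2 or 3) if Val is a
// simm32 multiplied by 3, 5 or 9, and 0 if no such factorization applies.
inline unsigned pickShXAdd(int64_t Val) {
  if (isSImm32(Val))
    return 0; // LUI+ADDI already handles simm32 values
  const ShOption Options[] = {{3, 1}, {5, 2}, {9, 3}};
  for (const ShOption &O : Options)
    if (Val % O.Div == 0 && isSImm32(Val / O.Div))
      return O.Sh;
  return 0;
}
```

With a nonzero result n, the quotient Val / (2^n + 1) would be built with LUI+ADDI and then expanded by the matching SHnADD.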
|
#
7e815261 |
| 14-Oct-2021 |
Ben Shi <powerman1st@163.com> |
[RISCV] Optimize immediate materialisation with BSETI/BCLRI
Optimize immediate materialisation in the following way if profitable:
1. Use BCLRI for the upper 32 bits if the lower 32 bits are a negative int32.
2. Use BSETI for the upper 32 bits if the lower 32 bits are a positive int32.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D111508
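The reasoning behind the two cases: materializing the lower 32 bits as a sign-extended int32 leaves the upper 32 bits all ones when that int32 is negative and all zeros otherwise, so the upper half is fixed by clearing or setting individual bits. The C++ sketch below counts how many such fixups a constant would need. The names BitFixups and upperBitFixups are hypothetical, and the actual profitability threshold used by the patch is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count for a 32-bit value.
inline unsigned popcount32(uint32_t V) {
  unsigned N = 0;
  while (V != 0) {
    V &= V - 1; // clear the lowest set bit
    ++N;
  }
  return N;
}

// Hypothetical summary: which instruction fixes the upper half, and how
// many times it would have to be emitted.
struct BitFixups {
  bool UseBclri;
  unsigned Count;
};

inline BitFixups upperBitFixups(uint64_t Val) {
  uint32_t Lo = static_cast<uint32_t>(Val);
  uint32_t Hi = static_cast<uint32_t>(Val >> 32);
  bool LoNegative = (Lo & 0x80000000u) != 0;
  // Negative low half: upper bits start as ones, so clear the zero bits.
  // Non-negative low half: upper bits start as zeros, so set the one bits.
  unsigned Count = LoNegative ? popcount32(~Hi) : popcount32(Hi);
  assert(Count <= 32);
  return {LoNegative, Count};
}
```

A small Count suggests the BSETI/BCLRI route may beat the generic shift-and-add sequence; a large one suggests it will not.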
|