History log of /llvm-project/llvm/test/CodeGen/RISCV/vararg.ll (Results 1 – 25 of 66)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9122c523 15-Nov-2024 Pengcheng Wang <wangpengcheng.pp@bytedance.com>

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional schedu

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional scheduling and tracking register
pressure.

Disclaimer: I haven't tested it on many cores, maybe we should make
some options being features. I believe downstreams must have tried
this before, so feedbacks are welcome.

show more ...


# 97982a8c 05-Nov-2024 dlav-sc <daniil.avdeev@syntacore.com>

[RISCV][CFI] add function epilogue cfi information (#110810)

This patch adds CFI instructions in the function epilogue.

Before patch:
addi sp, s0, -32
ld ra, 24(sp) # 8-byte Folded Reload
ld s

[RISCV][CFI] add function epilogue cfi information (#110810)

This patch adds CFI instructions in the function epilogue.

Before patch:
addi sp, s0, -32
ld ra, 24(sp) # 8-byte Folded Reload
ld s0, 16(sp) # 8-byte Folded Reload
ld s1, 8(sp) # 8-byte Folded Reload
addi sp, sp, 32
ret

After patch:
addi sp, s0, -32
.cfi_def_cfa sp, 32
ld ra, 24(sp) # 8-byte Folded Reload
ld s0, 16(sp) # 8-byte Folded Reload
ld s1, 8(sp) # 8-byte Folded Reload
.cfi_restore ra
.cfi_restore s0
.cfi_restore s1
addi sp, sp, 32
.cfi_def_cfa_offset 0
ret

This functionality is already present in `riscv-gcc`, but it’s not in
`clang` and this slightly impairs the `lldb` debugging experience, e.g.
backtrace.

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 2967e5f8 11-Oct-2024 Alex Bradbury <asb@igalia.com>

[RISCV] Enable store clustering by default (#73796)

Builds on #73789, enabling store clustering by default using the same
heuristic.


# 14c4f28e 01-Oct-2024 Alex Bradbury <asb@igalia.com>

[RISCV] Enable load clustering by default (#73789)

We believe this is neutral or slightly better in the majority of cases.


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# c08b90c5 10-Feb-2024 Craig Topper <craig.topper@sifive.com>

[RISCV] Lower the TransientStackAlignment to the ABI alignment for rv32e/rv64e.

I don't think the transient alignment needs to be larger than the
ABI alignment.


Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 3ac9fe69 16-Jan-2024 Wang Pengcheng <wangpengcheng.pp@bytedance.com>

[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs (#76777)

This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.

The differences between

[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs (#76777)

This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.

The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16(x0-x16).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.

`RVE` can be combined with all current standard extensions.

The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A Stack Alignment of 32bits (rather than 128bits).
* ilp32e isn't compatible with D ISA extension.

If `ilp32e` or `lp64` is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.

To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment\
to 4-bytes for types with length of 2*XLEN.

FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.

Differential Revision: https://reviews.llvm.org/D70401

show more ...


# eabaee0c 07-Jan-2024 Fangrui Song <i@maskray.me>

[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)

R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530

[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)

R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530
`call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not
useful and can be removed now (matching AArch64 and PowerPC).

GNU assembler assembles `call foo` to RISCV_CALL_PLT since 2022-09
(70f35d72ef04cd23771875c1661c9975044a749c).

Without this patch, unconditionally changing MO_CALL to MO_PLT could
create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler
and GNU assembler.

show more ...


# d6fbd96e 05-Dec-2023 Alex Bradbury <asb@igalia.com>

[RISCV] Support FrameIndex operands in getMemOperandsWithOffsetWidth / getMemOperandWithOffsetWidth (#73802)

I noted AArch64 happily accepts a FrameIndex operand as well as a
register. This doesn't

[RISCV] Support FrameIndex operands in getMemOperandsWithOffsetWidth / getMemOperandWithOffsetWidth (#73802)

I noted AArch64 happily accepts a FrameIndex operand as well as a
register. This doesn't cause any changes outside of my C++ unit test for
the current state of in-tree, but this will cause additional test
changes if #73789 is rebased on top of it.

Note that the returned Offset doesn't seem at all as meaningful if you
have a FrameIndex base, though the approach taken here follows AArch64
(see D54847). This change won't harm the approach taken in
shouldClusterMemOps because memOpsHaveSameBasePtr will only return true
if the FrameIndex operand is the same for both operations.

show more ...


# 3c5b42ac 05-Dec-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Allocate the varargs GPR save area as a single object. (#74354)

Previously we allocated one object for each GPR. We also allocated the
same offset twice, once to save for VASTART and then a

[RISCV] Allocate the varargs GPR save area as a single object. (#74354)

Previously we allocated one object for each GPR. We also allocated the
same offset twice, once to save for VASTART and then again for the first
register in the save loop.

This patch uses a single object for all the registers and shares this
with VASTART. This is more consistent with other targets like AArch64
and ARM.

I've removed the setValue(nullptr) from the memory operand now. Having a
single object makes me a lot more comfortable about alias analysis being
able to see what is going on. This led to the scheduling changes in
push-pop-popret.ll and vararg.ll.

show more ...


# 83dabd05 05-Dec-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Use iXLen for ptr<->int casts in vararg.ll. NFC

Fix another test I missed in 9e4210faf20014bf8637040b2231cbcd83c38ddd


# 9e4210fa 05-Dec-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Use iXLen for ptr<->int casts in vararg.ll (#74426)

Also use ABI alignment for ptr sized objects.

This makes the code more sane and avoids only loading part of what was
stored by vastart

[RISCV] Use iXLen for ptr<->int casts in vararg.ll (#74426)

Also use ABI alignment for ptr sized objects.

This makes the code more sane and avoids only loading part of what was
stored by vastart on RV64.

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3
# 86240751 06-Oct-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Strip W suffix from ADDIW (#68425)

The motivation of this change is simply to reduce test duplication. As
can be seen in the (massive) test delta, we have many tests whose output
differ on

[RISCV] Strip W suffix from ADDIW (#68425)

The motivation of this change is simply to reduce test duplication. As
can be seen in the (massive) test delta, we have many tests whose output
differ only due to the use of addi on rv32 vs addiw on rv64 when the
high bits are don't care.

As an aside, we don't need to worry about the non-zero immediate
restriction on the compressed variants because we're not directly
forming the compressed variants. If we happen to get a zero immediate
for the ADDI, then either a later optimization will strip the useless
instruction or the encoder is responsible for not compressing the
instruction.

show more ...


Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2
# 5f73d2b7 08-Aug-2023 Yunze Zhu <yunzezhu@linux.alibaba.com>

[RISCV] Enable alias analysis by default

In llvm alias analysis is off by default now.
This patch enable alias analysis on RISCV target during code generation by default,
and this makes more chances

[RISCV] Enable alias analysis by default

In llvm alias analysis is off by default now.
This patch enable alias analysis on RISCV target during code generation by default,
and this makes more chances for improving performance.
Modified related test cases.

Differential Revision: https://reviews.llvm.org/D157250

show more ...


Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 7b0c4184 28-Mar-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Move compressible registers to the beginning of the FP allocation order.

We don't have very many compressible FP instructions, just load and store.
These instruction require the FP register

[RISCV] Move compressible registers to the beginning of the FP allocation order.

We don't have very many compressible FP instructions, just load and store.
These instruction require the FP register to be f8-f15.

This patch changes the FP allocation order to prioritize f10-f15 first.
These are also the FP argument registers. So I allocated them in reverse
order starting at f15 to avoid taking the first argument registers.
This appears to match gcc allocation order.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D146488

show more ...


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3
# c65b4d64 09-Feb-2023 Andrew Savonichev <andrew.savonichev@gmail.com>

[SelectionDAG] Do not second-guess alignment for alloca

Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignmen

[SelectionDAG] Do not second-guess alignment for alloca

Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignment as the minimum alignment.

The patch changes this behavior to always use the specified
alignment. If alignment is not set explicitly in LLVM IR, it is set to
DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign.

Tests are changed as well: explicit alignment is increased to match
the preferred alignment if it changes output, or omitted when it is
hard to determine the right value (e.g. for pointers, some structs, or
weird types).

Differential Revision: https://reviews.llvm.org/D135462

show more ...


Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be 20-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.

Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.)

The motivation here is two fold.

First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.

Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.

Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.

For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.

Differential Revision: https://reviews.llvm.org/D141017

show more ...


Revision tags: llvmorg-15.0.7
# 002005e6 22-Dec-2022 Hsiangkai Wang <hsiangkai@google.com>

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-l

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-level parallelism by the existing MachineCombiner pass.

Differential Revision: https://reviews.llvm.org/D140530

show more ...


# 79d6e9c7 29-Dec-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Prefer ADDI over ORI if the known bits are disjoint.

There is no compressed form of ORI but there is a compressed form
for ADDI.

This also works for XORI since DAGCombine will turn Xor with

[RISCV] Prefer ADDI over ORI if the known bits are disjoint.

There is no compressed form of ORI but there is a compressed form
for ADDI.

This also works for XORI since DAGCombine will turn Xor with disjoint
bits in Or.

Note: The compressed forms require a simm6 immediate, but I'm doing
this for the full simm12 range.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D140674

show more ...


# d64d3c5a 22-Dec-2022 Nitin John Raj <nitin.raj@sifive.com>

[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW h

[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources.

We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes.

Differential Revision: https://reviews.llvm.org/D139948

show more ...


# 1456b686 19-Dec-2022 Nikita Popov <npopov@redhat.com>

[RISCV] Convert some tests to opaque pointers (NFC)


# 38f1abef 15-Dec-2022 Ron Lieberman <ron.lieberman@amd.com>

Revert "[SelectionDAG] Do not second-guess alignment for alloca"

Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193
23491

This reverts commit ffedf47d8b793e07317f82f9c2a5f5425ebb7

Revert "[SelectionDAG] Do not second-guess alignment for alloca"

Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193
23491

This reverts commit ffedf47d8b793e07317f82f9c2a5f5425ebb71ad.

show more ...


# ffedf47d 15-Dec-2022 Andrew Savonichev <andrew.savonichev@gmail.com>

[SelectionDAG] Do not second-guess alignment for alloca

Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignmen

[SelectionDAG] Do not second-guess alignment for alloca

Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignment as the minimum alignment.

The patch changes this behavior to always use the specified
alignment. If alignment is not set explicitly in LLVM IR, it is set to
DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign.

Tests are changed as well: explicit alignment is increased to match
the preferred alignment if it changes output, or omitted when it is
hard to determine the right value (e.g. for pointers, some structs, or
weird types).

Differential Revision: https://reviews.llvm.org/D135462

show more ...


# b7753330 02-Dec-2022 Philip Reames <preames@rivosinc.com>

[RISCV] Fold low 12 bits into instruction during frame index elimination

Fold the low 12 bits of an immediate offset into the offset field of the using instruction. That using instruction will be a

[RISCV] Fold low 12 bits into instruction during frame index elimination

Fold the low 12 bits of an immediate offset into the offset field of the using instruction. That using instruction will be a load, store, or addi which performs an add of a signed 12-bit immediate as part of it's operation. Splitting out the low bits allows the high bits to be generated via a single LUI instead of needing an LUI/ADDI pair.

The codegen effect of this is mostly converting cases where "split addi" kicks in to using LUI + a folded offset. There are a couple of straight dynamic instruction count wins, and using a canonical LUI is probably better than a chain of SP adds if the dynamic instruction count is equal.

Differential Revision: https://reviews.llvm.org/D139037

show more ...


# ac1ec9e2 30-Nov-2022 Philip Reames <preames@rivosinc.com>

[RISCV] Share code for fixed offsets adjustRegs (thus materializing fewer constants)

This reuses the existing optimized implementation of adjustReg, and commons up code. This has the effect of enabl

[RISCV] Share code for fixed offsets adjustRegs (thus materializing fewer constants)

This reuses the existing optimized implementation of adjustReg, and commons up code. This has the effect of enabling two code changes for the new caller. First, we enable the "split andi" lowering (with no alignment requirement), and second we use a sub with smaller constant in register instead of a add with negative constant in register.

Differential Revision: https://reviews.llvm.org/D132839

show more ...


Revision tags: llvmorg-15.0.6
# a2b5b584 25-Nov-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Use register allocation hints to improve use of compressed instructions.

Compressed instructions usually require one of the source registers
to also be the source register. The register allo

[RISCV] Use register allocation hints to improve use of compressed instructions.

Compressed instructions usually require one of the source registers
to also be the source register. The register allocator doesn't have
that bias on its own.

This patch adds register allocation hints to introduce this bias.
I've started with ADDI, ADDIW, and SLLI. These all have a 5-bit
field for the register. If the source and dest register are the
same they are guaranteed to compress as long as the immediate is
also 6 bits.

This code was inspired by similar code from the SystemZ target.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D138242

show more ...


123