History log of /llvm-project/llvm/test/CodeGen/RISCV/unaligned-load-store.ll (Results 1 – 24 of 24)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9122c523 15-Nov-2024 Pengcheng Wang <wangpengcheng.pp@bytedance.com>

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional schedu

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional scheduling and tracking register
pressure.

Disclaimer: I haven't tested it on many cores, maybe we should make
some options being features. I believe downstreams must have tried
this before, so feedbacks are welcome.

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 2967e5f8 11-Oct-2024 Alex Bradbury <asb@igalia.com>

[RISCV] Enable store clustering by default (#73796)

Builds on #73789, enabling store clustering by default using the same
heuristic.


# e45b44c6 01-Oct-2024 Alex Bradbury <asb@igalia.com>

[RISCV] Add pattern for PACK/PACKH in common misaligned load case (#110644)

PACKH is currently only selected for assembling the first two bytes of a
misligned load. A fairly complex RV32-only patte

[RISCV] Add pattern for PACK/PACKH in common misaligned load case (#110644)

PACKH is currently only selected for assembling the first two bytes of a
misligned load. A fairly complex RV32-only pattern is added for
producing PACKH+PACKH+PACK to assemble the result of a misaligned 32-bit
load.

Another pattern was added that just covers PACKH for shifted offsets 16
and 24, producing a packh and shift to replace two shifts and an 'or'.
This slightly improves RV64IZKBK for a 64-bit load, but fails to match
for the misaligned 32-bit load because the load of the upper byte is
anyext in the SelectionDAG.

I wrote the patch this way because it was quick and easy and has at
least some benefit, but the "right" approach probably merits further
discussion. Introducing target-specific SDNodes for PACK* and having
custom lowering for unaligned load/stores that introduces those nodes
them seems like it might be attractive. However, adding these patterns does provide benefit - so that's what this patch does for now.

show more ...


# 14c4f28e 01-Oct-2024 Alex Bradbury <asb@igalia.com>

[RISCV] Enable load clustering by default (#73789)

We believe this is neutral or slightly better in the majority of cases.


Revision tags: llvmorg-19.1.1
# 39b2e35f 01-Oct-2024 Alex Bradbury <asb@igalia.com>

[RISCV][test] Precommit tests showing codegen for unaligned load/store with zbkb

We have missed opportunities for selecting pack* instructions, that will
be addressed in future patches.


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6
# 90109d44 14-May-2024 Alex Bradbury <asb@igalia.com>

[RISCV] Improve constant materialisation for stores of i8 negative constants (#92131)

This follows the same pattern as 20e62658735a1b03ecadc. Although we
can't reduce the number of instructions use

[RISCV] Improve constant materialisation for stores of i8 negative constants (#92131)

This follows the same pattern as 20e62658735a1b03ecadc. Although we
can't reduce the number of instructions used, if we are able to use a
sign-extended 6-bit immediate then the 16-bit c.li instruction can be
selected (thus saving code size). Although this _could_ be gated so it
only happens if C is enabled, I've opted not to because at worst it's
neutral and it doesn't seem helpful to add unnecessary divergence
between the RVC and non-RVC paths.

show more ...


Revision tags: llvmorg-18.1.5, llvmorg-18.1.4
# 9067070d 16-Apr-2024 Craig Topper <craig.topper@sifive.com>

[RISCV] Re-separate unaligned scalar and vector memory features in the backend. (#88954)

This is largely a revert of commit
e81796671890b59c110f8e41adc7ca26f8484d20.

As #88029 shows, there exist

[RISCV] Re-separate unaligned scalar and vector memory features in the backend. (#88954)

This is largely a revert of commit
e81796671890b59c110f8e41adc7ca26f8484d20.

As #88029 shows, there exists hardware that only supports unaligned
scalar.

I'm leaving how this gets exposed to the clang interface to a future
patch.

show more ...


Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# e8179667 01-Dec-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Collapse fast unaligned access into a single feature [nfc-ish] (#73971)

When we'd originally added unaligned-scalar-mem and
unaligned-vector-mem, they were separated into two parts under th

[RISCV] Collapse fast unaligned access into a single feature [nfc-ish] (#73971)

When we'd originally added unaligned-scalar-mem and
unaligned-vector-mem, they were separated into two parts under the
theory that some processor might implement one, but not the other. At
the moment, we don't have evidence of such a processor. The C/C++ level
interface, and the clang driver command lines have settled on a single
unaligned flag which indicates both scalar and vector support unaligned.
Given that, let's remove the test matrix complexity for a set of
configurations which don't appear useful.

Given these are internal feature names, I don't think we need to provide
any forward compatibility. Anyone disagree?

Note: The immediate trigger for this patch was finding another case
where the unaligned-vector-mem wasn't being properly serialized to IR
from clang which resulted in problems reproducing assembly from clang's
-emit-llvm feature. Instead of fixing this, I decided getting rid of the
complexity was the better approach.

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0
# 8f04d81e 16-Sep-2023 Craig Topper <craig.topper@sifive.com>

[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore.

If the SRL for Hi constant folds, but we don't remoe those bits from
the Lo, we can end up with strange c

[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore.

If the SRL for Hi constant folds, but we don't remoe those bits from
the Lo, we can end up with strange constant folding through DAGCombine later.
I've only seen this with constants being lowered to constant pools
during lowering on RISC-V.

show more ...


# 17a12a27 16-Sep-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Add test case to show bad codegen for unaligned i64 store of a large constant.

On the first split we create two i32 trunc stores and a srl to shift
the high part down. The srl gets constant

[RISCV] Add test case to show bad codegen for unaligned i64 store of a large constant.

On the first split we create two i32 trunc stores and a srl to shift
the high part down. The srl gets constant folded, but to produce
a new i32 constant. But the truncstore for the low store still uses
the original constant.

This original constant then gets converted to a constant pool
before we revisit the stores to further split them. The constant
pool prevents further constant folding of the additional srls.

After legalization is done, we run DAGCombiner and get some constant
folding of srl via computeKnownBits which can peek through the constant
pool load. This can create new constants that also need a constant pool.

show more ...


Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 5b95bba6 24-Jul-2023 Luke Lau <luke@igalia.com>

[RISCV] Set Fast flag for unaligned memory accesses

The +unaligned-scalar-mem and +unaligned-vector-mem features were added in
D126085 and D149375 respectively to allow subtargets to indicate that
t

[RISCV] Set Fast flag for unaligned memory accesses

The +unaligned-scalar-mem and +unaligned-vector-mem features were added in
D126085 and D149375 respectively to allow subtargets to indicate that
they supported misaligned loads/stores with "sufficient" performance.
This is separate from whether or not the target actually supports
misaligned accesses, which could be determined from Zicclsm.

This patch enables the Fast flag under the assumption that any subtarget
that declares support for +unaligned-*-mem will want to opt into
optimisations that take advantage of misaligned scalar accesses, such as
store merging.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D150771

show more ...


# eb3f2fe4 20-Jul-2023 Philip Reames <preames@rivosinc.com>

[RISCV] Revise check names for unaligned memory op tests [nfc]

This has come up a few times in review; the current ones seem to be universally confusing. Even I as the original author of most of th

[RISCV] Revise check names for unaligned memory op tests [nfc]

This has come up a few times in review; the current ones seem to be universally confusing. Even I as the original author of most of these get confused. Switch to using the SLOW/FAST naming used by x86, hopefully that's a bit clearer.

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5
# 5ab13332 17-May-2023 Luke Lau <luke@igalia.com>

[RISCV] Add tests for store merging with unaligned scalar access

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D150770


Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 8e43c22d 22-Mar-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Use LBU for extloadi8.

The Zcb extension has c.lbu, but not c.lb. This patch makes us
prefer LBU over LB if we have a choice which will enable more
compression opportunities.

Reviewed By: a

[RISCV] Use LBU for extloadi8.

The Zcb extension has c.lbu, but not c.lb. This patch makes us
prefer LBU over LB if we have a choice which will enable more
compression opportunities.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D146270

show more ...


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be 20-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.

Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.)

The motivation here is two fold.

First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.

Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.

Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.

For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.

Differential Revision: https://reviews.llvm.org/D141017

show more ...


Revision tags: llvmorg-15.0.7
# 002005e6 22-Dec-2022 Hsiangkai Wang <hsiangkai@google.com>

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-l

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-level parallelism by the existing MachineCombiner pass.

Differential Revision: https://reviews.llvm.org/D140530

show more ...


# d64d3c5a 22-Dec-2022 Nitin John Raj <nitin.raj@sifive.com>

[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW h

[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources.

We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes.

Differential Revision: https://reviews.llvm.org/D139948

show more ...


# 1456b686 19-Dec-2022 Nikita Popov <npopov@redhat.com>

[RISCV] Convert some tests to opaque pointers (NFC)


# d741a31a 14-Dec-2022 Nitin John Raj <nitin.raj@sifive.com>

[RISCV][CodeGen][SelectionDAG] Recursively check hasAllNBitUsers for logical machine opcodes

We don’t have W versions of AND/OR/XOR/ANDN/ORN/XNOR so we should recursively check their users. We shoul

[RISCV][CodeGen][SelectionDAG] Recursively check hasAllNBitUsers for logical machine opcodes

We don’t have W versions of AND/OR/XOR/ANDN/ORN/XNOR so we should recursively check their users. We should limit the recursion to SelectionDAG::MaxRecursionDepth levels.

We need to add a Depth argument, all existing callers should pass 0 to the Depth. The new recursive calls should increment it by 1. At the top of the function we should give up and return false if Depth >= SelectionDAG::MaxRecursionDepth.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139462

show more ...


# e00e20a0 01-Dec-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.

These instructions requires both register operands to be compressible
so I've only applied the hint if we already have a GPRC physical regis

[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.

These instructions requires both register operands to be compressible
so I've only applied the hint if we already have a GPRC physical register
assigned for the other register operand.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D139079

show more ...


Revision tags: llvmorg-15.0.6
# a2b5b584 25-Nov-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Use register allocation hints to improve use of compressed instructions.

Compressed instructions usually require one of the source registers
to also be the source register. The register allo

[RISCV] Use register allocation hints to improve use of compressed instructions.

Compressed instructions usually require one of the source registers
to also be the source register. The register allocator doesn't have
that bias on its own.

This patch adds register allocation hints to introduce this bias.
I've started with ADDI, ADDIW, and SLLI. These all have a 5-bit
field for the register. If the source and dest register are the
same they are guaranteed to compress as long as the immediate is
also 6 bits.

This code was inspired by similar code from the SystemZ target.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D138242

show more ...


Revision tags: llvmorg-15.0.5, llvmorg-15.0.4
# 78739fdb 29-Oct-2022 Simon Pilgrim <llvm-dev@redking.me.uk>

[DAG] Enable combineShiftOfShiftedLogic folds after type legalization

This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which

[DAG] Enable combineShiftOfShiftedLogic folds after type legalization

This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides.

Fixes #57872

Differential Revision: https://reviews.llvm.org/D136042

show more ...


Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5
# 8a3b6ba7 26-May-2022 Philip Reames <preames@rivosinc.com>

[RISCV] Add a subtarget feature to enable unaligned scalar loads and stores

A RISCV implementation can choose to implement unaligned load/store support. We currently don't have a way for such a proc

[RISCV] Add a subtarget feature to enable unaligned scalar loads and stores

A RISCV implementation can choose to implement unaligned load/store support. We currently don't have a way for such a processor to indicate a preference for unaligned load/stores, so add a subtarget feature.

There doesn't appear to be a formal extension for unaligned support. The RISCV Profiles (https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva20u64-profile) docs use the name Zicclsm, but a) that doesn't appear to actually been standardized, and b) isn't quite what we want here anyway due to the perf comment.

Instead, we can follow precedent from other backends and have a feature flag for the existence of misaligned load/stores with sufficient performance that user code should actually use them.

Differential Revision: https://reviews.llvm.org/D126085

show more ...


Revision tags: llvmorg-14.0.4
# 3e5b1e9c 20-May-2022 Philip Reames <preames@rivosinc.com>

[RISCV] Add test showing codegen for unaligned loads and stores of scalar types