unaligned-load-store.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/RISCV/unaligned-load-store.ll

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9122c523	15-Nov-2024	Pengcheng Wang <wangpengcheng.pp@bytedance.com>	[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445) This is based on other targets like PPC/AArch64 and some experiments. This PR will only enable bidirectional scheduling and tracking register pressure. Disclaimer: I haven't tested it on many cores; maybe we should make some options features. I believe downstreams must have tried this before, so feedback is welcome.
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 2967e5f8	11-Oct-2024	Alex Bradbury <asb@igalia.com>	[RISCV] Enable store clustering by default (#73796) Builds on #73789, enabling store clustering by default using the same heuristic.
# e45b44c6	01-Oct-2024	Alex Bradbury <asb@igalia.com>	[RISCV] Add pattern for PACK/PACKH in common misaligned load case (#110644) PACKH is currently only selected for assembling the first two bytes of a misaligned load. A fairly complex RV32-only pattern is added for producing PACKH+PACKH+PACK to assemble the result of a misaligned 32-bit load. Another pattern was added that just covers PACKH for shifted offsets 16 and 24, producing a packh and shift to replace two shifts and an 'or'. This slightly improves RV64IZKBK for a 64-bit load, but fails to match for the misaligned 32-bit load because the load of the upper byte is anyext in the SelectionDAG. I wrote the patch this way because it was quick and easy and has at least some benefit, but the "right" approach probably merits further discussion. Introducing target-specific SDNodes for PACK* and having custom lowering for unaligned load/stores that introduces those nodes seems like it might be attractive. However, adding these patterns does provide benefit - so that's what this patch does for now.
# 14c4f28e	01-Oct-2024	Alex Bradbury <asb@igalia.com>	[RISCV] Enable load clustering by default (#73789) We believe this is neutral or slightly better in the majority of cases.
Revision tags: llvmorg-19.1.1
# 39b2e35f	01-Oct-2024	Alex Bradbury <asb@igalia.com>	[RISCV][test] Precommit tests showing codegen for unaligned load/store with zbkb We have missed opportunities for selecting pack* instructions, that will be addressed in future patches.
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6
# 90109d44	14-May-2024	Alex Bradbury <asb@igalia.com>	[RISCV] Improve constant materialisation for stores of i8 negative constants (#92131) This follows the same pattern as 20e62658735a1b03ecadc. Although we can't reduce the number of instructions used, if we are able to use a sign-extended 6-bit immediate then the 16-bit c.li instruction can be selected (thus saving code size). Although this _could_ be gated so it only happens if C is enabled, I've opted not to because at worst it's neutral and it doesn't seem helpful to add unnecessary divergence between the RVC and non-RVC paths.
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4
# 9067070d	16-Apr-2024	Craig Topper <craig.topper@sifive.com>	[RISCV] Re-separate unaligned scalar and vector memory features in the backend. (#88954) This is largely a revert of commit e81796671890b59c110f8e41adc7ca26f8484d20. As #88029 shows, there exists hardware that only supports unaligned scalar. I'm leaving how this gets exposed to the clang interface to a future patch.
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# e8179667	01-Dec-2023	Philip Reames <preames@rivosinc.com>	[RISCV] Collapse fast unaligned access into a single feature [nfc-ish] (#73971) When we'd originally added unaligned-scalar-mem and unaligned-vector-mem, they were separated into two parts under the theory that some processor might implement one, but not the other. At the moment, we don't have evidence of such a processor. The C/C++ level interface, and the clang driver command lines have settled on a single unaligned flag which indicates both scalar and vector support unaligned. Given that, let's remove the test matrix complexity for a set of configurations which don't appear useful. Given these are internal feature names, I don't think we need to provide any forward compatibility. Anyone disagree? Note: The immediate trigger for this patch was finding another case where the unaligned-vector-mem wasn't being properly serialized to IR from clang which resulted in problems reproducing assembly from clang's -emit-llvm feature. Instead of fixing this, I decided getting rid of the complexity was the better approach.
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0
# 8f04d81e	16-Sep-2023	Craig Topper <craig.topper@sifive.com>	[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore. If the SRL for Hi constant folds, but we don't remove those bits from the Lo, we can end up with strange constant folding through DAGCombine later. I've only seen this with constants being lowered to constant pools during lowering on RISC-V.
# 17a12a27	16-Sep-2023	Craig Topper <craig.topper@sifive.com>	[RISCV] Add test case to show bad codegen for unaligned i64 store of a large constant. On the first split we create two i32 trunc stores and a srl to shift the high part down. The srl gets constant folded, but to produce a new i32 constant. But the truncstore for the low store still uses the original constant. This original constant then gets converted to a constant pool before we revisit the stores to further split them. The constant pool prevents further constant folding of the additional srls. After legalization is done, we run DAGCombiner and get some constant folding of srl via computeKnownBits which can peek through the constant pool load. This can create new constants that also need a constant pool.
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 5b95bba6	24-Jul-2023	Luke Lau <luke@igalia.com>	[RISCV] Set Fast flag for unaligned memory accesses The +unaligned-scalar-mem and +unaligned-vector-mem features were added in D126085 and D149375 respectively to allow subtargets to indicate that they supported misaligned loads/stores with "sufficient" performance. This is separate from whether or not the target actually supports misaligned accesses, which could be determined from Zicclsm. This patch enables the Fast flag under the assumption that any subtarget that declares support for +unaligned-*-mem will want to opt into optimisations that take advantage of misaligned scalar accesses, such as store merging. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D150771
# eb3f2fe4	20-Jul-2023	Philip Reames <preames@rivosinc.com>	[RISCV] Revise check names for unaligned memory op tests [nfc] This has come up a few times in review; the current ones seem to be universally confusing. Even I as the original author of most of these get confused. Switch to using the SLOW/FAST naming used by x86, hopefully that's a bit clearer.
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5
# 5ab13332	17-May-2023	Luke Lau <luke@igalia.com>	[RISCV] Add tests for store merging with unaligned scalar access Reviewed By: reames Differential Revision: https://reviews.llvm.org/D150770
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 8e43c22d	22-Mar-2023	Craig Topper <craig.topper@sifive.com>	[RISCV] Use LBU for extloadi8. The Zcb extension has c.lbu, but not c.lb. This patch makes us prefer LBU over LB if we have a choice which will enable more compression opportunities. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D146270
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be	20-Jan-2023	Philip Reames <preames@rivosinc.com>	[MachineCombiner] Use default latency model when no detailed model available This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost. Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.) The motivation here is twofold. First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler. Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain, simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree. Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch. For the test diffs, I don't see any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes. Differential Revision: https://reviews.llvm.org/D141017
Revision tags: llvmorg-15.0.7
# 002005e6	22-Dec-2022	Hsiangkai Wang <hsiangkai@google.com>	[RISCV] Add integer scalar instructions to isAssociativeAndCommutative Inspired by D138107. We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative to increase instruction-level parallelism by the existing MachineCombiner pass. Differential Revision: https://reviews.llvm.org/D140530
# d64d3c5a	22-Dec-2022	Nitin John Raj <nitin.raj@sifive.com>	[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources. We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes. Differential Revision: https://reviews.llvm.org/D139948
# 1456b686	19-Dec-2022	Nikita Popov <npopov@redhat.com>	[RISCV] Convert some tests to opaque pointers (NFC)
# d741a31a	14-Dec-2022	Nitin John Raj <nitin.raj@sifive.com>	[RISCV][CodeGen][SelectionDAG] Recursively check hasAllNBitUsers for logical machine opcodes We don’t have W versions of AND/OR/XOR/ANDN/ORN/XNOR so we should recursively check their users. We should limit the recursion to SelectionDAG::MaxRecursionDepth levels. We need to add a Depth argument, all existing callers should pass 0 to the Depth. The new recursive calls should increment it by 1. At the top of the function we should give up and return false if Depth >= SelectionDAG::MaxRecursionDepth. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D139462
# e00e20a0	01-Dec-2022	Craig Topper <craig.topper@sifive.com>	[RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints. These instructions require both register operands to be compressible so I've only applied the hint if we already have a GPRC physical register assigned for the other register operand. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D139079
Revision tags: llvmorg-15.0.6
# a2b5b584	25-Nov-2022	Craig Topper <craig.topper@sifive.com>	[RISCV] Use register allocation hints to improve use of compressed instructions. Compressed instructions usually require one of the source registers to also be the destination register. The register allocator doesn't have that bias on its own. This patch adds register allocation hints to introduce this bias. I've started with ADDI, ADDIW, and SLLI. These all have a 5-bit field for the register. If the source and dest register are the same they are guaranteed to compress as long as the immediate is also 6 bits. This code was inspired by similar code from the SystemZ target. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D138242
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4
# 78739fdb	29-Oct-2022	Simon Pilgrim <llvm-dev@redking.me.uk>	[DAG] Enable combineShiftOfShiftedLogic folds after type legalization This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides. Fixes #57872 Differential Revision: https://reviews.llvm.org/D136042
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5
# 8a3b6ba7	26-May-2022	Philip Reames <preames@rivosinc.com>	[RISCV] Add a subtarget feature to enable unaligned scalar loads and stores A RISCV implementation can choose to implement unaligned load/store support. We currently don't have a way for such a processor to indicate a preference for unaligned load/stores, so add a subtarget feature. There doesn't appear to be a formal extension for unaligned support. The RISCV Profiles (https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva20u64-profile) docs use the name Zicclsm, but a) that doesn't appear to have actually been standardized, and b) isn't quite what we want here anyway due to the perf comment. Instead, we can follow precedent from other backends and have a feature flag for the existence of misaligned load/stores with sufficient performance that user code should actually use them. Differential Revision: https://reviews.llvm.org/D126085
Revision tags: llvmorg-14.0.4
# 3e5b1e9c	20-May-2022	Philip Reames <preames@rivosinc.com>	[RISCV] Add test showing codegen for unaligned loads and stores of scalar types
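
Illustrative note (not part of the revision log): the PACKH+PACKH+PACK sequence described in e45b44c6 can be modelled arithmetically. The Python sketch below is an assumption-laden illustration, not LLVM code: the function names are hypothetical, memory is a plain bytes object, and only the RV32 little-endian case is covered. It shows that assembling a misaligned 32-bit value from four byte loads gives the same result whether the bytes are combined with shifts and ORs or with two packh operations followed by a pack.

```python
# Sketch only: models the arithmetic of the instructions, not LLVM's
# instruction selection. All function names here are hypothetical.

MASK32 = 0xFFFFFFFF


def packh(rs1: int, rs2: int) -> int:
    """Zbkb PACKH: low byte of rs1 in bits 7:0, low byte of rs2 in bits 15:8."""
    return ((rs2 & 0xFF) << 8) | (rs1 & 0xFF)


def pack_rv32(rs1: int, rs2: int) -> int:
    """Zbkb PACK on RV32: low half of rs1 in bits 15:0, low half of rs2 in bits 31:16."""
    return (((rs2 & 0xFFFF) << 16) | (rs1 & 0xFFFF)) & MASK32


def load_shift_or(mem: bytes, addr: int) -> int:
    """Four byte loads combined with shifts and ORs (little-endian)."""
    b0, b1, b2, b3 = mem[addr:addr + 4]
    return (b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)) & MASK32


def load_packh_pack(mem: bytes, addr: int) -> int:
    """The same four byte loads combined as PACKH + PACKH + PACK."""
    b0, b1, b2, b3 = mem[addr:addr + 4]
    lo = packh(b0, b1)  # bytes 0 and 1 -> low halfword
    hi = packh(b2, b3)  # bytes 2 and 3 -> high halfword
    return pack_rv32(lo, hi)


mem = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55])
addr = 1  # deliberately not 4-byte aligned
assert load_shift_or(mem, addr) == load_packh_pack(mem, addr) == 0x44332211
```

The pack-based form uses three combining operations where the shift/or form needs three shifts and three ORs, which is the kind of saving the commit message refers to.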