Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 229e1185 | 23-Jul-2024 | Christudasan Devadasan <christudasan.devadasan@amd.com>

[AMDGPU] Codegen support for constrained multi-dword sloads (#96163)

For targets that support the xnack replay feature (gfx8+), the multi-dword scalar loads shouldn't clobber any register that holds the src address. The constrained versions of the scalar loads have the early-clobber flag attached to the dst operand to restrict RA from re-allocating any of the src regs as the dst.
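As a hedged illustration (the `_ec` opcode spelling and operand details here are assumptions, not quoted from the patch), the constrained form in MIR looks roughly like:

```
; early-clobber forbids RA from assigning the 64-bit result any register
; that overlaps the address pair in %0, so an xnack replay can re-execute
; the load with the source address intact.
early-clobber %1:sreg_64_xexec = S_LOAD_DWORDX2_IMM_ec %0:sreg_64, 0, 0 :: (load (s64))
```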

Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 9e9907f1 | 17-Jan-2024 | Fangrui Song <i@maskray.me>

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full target triple, while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign, as we recognize $unknown-apple-darwin as ELF instead of rejecting it outright.

This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed:

```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```
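For illustration, the difference at the RUN-line level (a hedged sketch; the -mcpu choice is arbitrary):

```
; Old, error-prone: only the architecture is set, so OS/environment
; components leak in from the host's default triple:
; RUN: llc -march=amdgcn -mcpu=gfx900 < %s | FileCheck %s
; New: the full target triple is pinned, independent of the host:
; RUN: llc -mtriple=amdgcn -mcpu=gfx900 < %s | FileCheck %s
```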

Revision tags: llvmorg-17.0.6
# 01c1c7a1 | 14-Nov-2023 | Acim Maravic <119684637+Acim-Maravic@users.noreply.github.com>

[AMDGPU][CodeGen] Update support (soffset + offset) s_buffer_load's (#68302)

getBaseWithConstantOffset() is used for scalar and non-scalar buffer loads. The difference between the s_load and load instructions is that s_load extends its 32-bit offset to 64 bits, so a 32-bit (address + offset) sum must not cause unsigned 32-bit integer wraparound, because the hardware performs the addition in 64 bits.
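A minimal IR sketch of the pattern in question, assuming the llvm.amdgcn.s.buffer.load signature; the fold of the add into the (soffset + offset) form is only sound when the 32-bit add cannot wrap:

```
define amdgpu_ps i32 @soffset_plus_imm(<4 x i32> inreg %rsrc, i32 inreg %soffset) {
  ; nuw guarantees the 32-bit sum does not wrap, so folding it into the
  ; instruction's 64-bit address computation preserves the semantics.
  %off = add nuw i32 %soffset, 64
  %val = call i32 @llvm.amdgcn.s.buffer.load.i32(<4 x i32> %rsrc, i32 %off, i32 0)
  ret i32 %val
}

declare i32 @llvm.amdgcn.s.buffer.load.i32(<4 x i32>, i32, i32)
```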

Revision tags: llvmorg-17.0.5
# 86f2e092 | 01-Nov-2023 | Jay Foad <jay.foad@amd.com>

[AMDGPU] Tweak handling of GlobalAddress operands in SI_PC_ADD_REL_OFFSET (#70960)

When SI_PC_ADD_REL_OFFSET is expanded to S_GETPC/S_ADD/S_ADDC, the GlobalAddress operands have to be adjusted by 4 or 12 bytes to account for the offset from the end of the S_GETPC instruction to the literal operands. Do this all in SIInstrInfo::expandPostRAPseudo instead of duplicating the adjustment code in both AMDGPULegalizerInfo and SITargetLowering. NFCI.
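The expanded sequence looks roughly like this (a hedged sketch; register choice and symbol name are arbitrary):

```
s_getpc_b64 s[4:5]                  ; s[4:5] = PC at the *end* of this instruction
s_add_u32   s4, s4, sym@rel32@lo+4  ; lo literal sits 4 bytes past that point
s_addc_u32  s5, s5, sym@rel32@hi+12 ; hi literal sits 12 bytes past it
```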

Revision tags: llvmorg-17.0.4
# d96529af | 29-Oct-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[DAG] Attempt shl narrowing in SimplifyDemandedBits (REAPPLIED)

If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext.

Followup to D146121.

Reapplied: moved after the ShrinkDemandedOp call; reuse the existing KnownBits result; ensure that we only attempt this if all the upper bits are demanded; 547dc461225ba should address the remaining regressions that were noticed in the previous commit.

Differential Revision: https://reviews.llvm.org/D155472
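The identity behind the fold, sketched at the IR level for clarity (the combine itself runs on SelectionDAG nodes):

```
; Only the low 32 bits of the 64-bit shift are demanded...
define i32 @wide(i64 %x) {
  %s = shl i64 %x, 3
  %t = trunc i64 %s to i32
  ret i32 %t
}

; ...so the shift can be performed at half width, with the trunc free on
; most targets:
define i32 @narrow(i64 %x) {
  %lo = trunc i64 %x to i32
  %s = shl i32 %lo, 3
  ret i32 %s
}
```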

Revision tags: llvmorg-17.0.3
# 0a776996 | 04-Oct-2023 | Kirill Stoimenov <kstoimenov@google.com>

Revert "[DAG] Attempt shl narrowing in SimplifyDemandedBits"
This reverts commit 7a8c04ef84ecdab4390b451d4c2fe17bc45a7b63.

# 7a8c04ef | 04-Oct-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[DAG] Attempt shl narrowing in SimplifyDemandedBits

If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext.

Followup to D146121.

Differential Revision: https://reviews.llvm.org/D155472

Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# faa2c678 | 04-Apr-2023 | Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[AMDGPU] Add buffer intrinsics that take resources as pointers

In order to enable the LLVM frontend to better analyze buffer operations (and to potentially enable more precise analyses on the backend), define versions of the raw and structured buffer intrinsics that use `ptr addrspace(8)` instead of `<4 x i32>` to represent their rsrc arguments.

The new intrinsics are named by replacing `buffer.` with `buffer.ptr`.

One advantage of these intrinsic definitions is that, instead of specifying that a buffer load/store will read/write some memory, we can indicate that the memory read or written will be based on the pointer argument. This means that, for example, a read from a `noalias` buffer can be pulled out of a loop that is modifying a distinct buffer.

In the future, we will define custom PseudoSourceValues that will allow us to package up the (buffer, index, offset) triples that buffer intrinsics contain and allow for more precise backend analysis.

This work also enables creating address space 7, which represents manipulation of raw buffers using native LLVM load and store instructions.

Where tests simply used a buffer intrinsic while testing some other code path (such as the tests for VGPR spills), they have been updated to use the new intrinsic form. Tests that are "about" buffer intrinsics (for instance, those that ensure that they codegen as expected) have been duplicated, either within existing files or into new ones.

Depends on D145441

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D147547
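A hedged IR sketch of the two forms, assuming the raw.ptr.buffer.load spelling that landed in-tree:

```
; Legacy form: the resource is an opaque <4 x i32>, invisible to AA.
declare float @llvm.amdgcn.raw.buffer.load.f32(<4 x i32>, i32, i32, i32)
; Pointer form: the resource is a ptr addrspace(8), so noalias and other
; pointer-based analyses apply to the memory the load touches.
declare float @llvm.amdgcn.raw.ptr.buffer.load.f32(ptr addrspace(8), i32, i32, i32)

define float @load_elem(ptr addrspace(8) noalias %rsrc, i32 %voffset) {
  %v = call float @llvm.amdgcn.raw.ptr.buffer.load.f32(ptr addrspace(8) %rsrc, i32 %voffset, i32 0, i32 0)
  ret float %v
}
```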

Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4
# 2be31896 | 10-Mar-2023 | Jay Foad <jay.foad@amd.com>

[AMDGPU] Don't select _SGPR forms of SMEM instructions on GFX9+

On GFX9+, SMEM instructions have an _SGPR_IMM form which is strictly more powerful than the _SGPR form. It simplifies codegen if we always select the _SGPR_IMM form with an immediate offset of 0 instead of the _SGPR form.

Note that this patch just makes minimal changes to the selection patterns to prove the concept. Further simplifications are possible to reduce the number of selection patterns.

On GFX9 the _SGPR form of the Real instruction is still required for assembly/disassembly, but on GFX10+ it can be removed completely.

Differential Revision: https://reviews.llvm.org/D147334
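In assembly terms (a hedged sketch of the GFX9+ SMEM syntax):

```
s_load_dword s0, s[2:3], s4            ; _SGPR form: SGPR offset only
s_load_dword s0, s[2:3], s4 offset:0x0 ; _SGPR_IMM form subsumes it with imm 0
```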

Revision tags: llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# d85e849f | 02-Dec-2022 | Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some assorted tests to opaque pointers

Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# bb70b5d4 | 18-Jul-2022 | Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer

Previously this was assuming pointsToConstantMemory implies dereferenceable.

# 5db8d6fd | 05-Sep-2022 | Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Support (base | offset) SMEM loads.

Prevents generation of unnecessary s_or_b32 instructions.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D132552
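A hedged IR sketch of the shape this matches; the names are illustrative:

```
define amdgpu_ps i32 @or_as_offset(i64 inreg %base) {
  ; Low 4 bits of %aligned are known zero, so the `or` acts as an add of 4
  ; and can fold into the s_load's immediate offset rather than being
  ; materialized as an s_or_b32.
  %aligned = and i64 %base, -16
  %addr = or i64 %aligned, 4
  %p = inttoptr i64 %addr to ptr addrspace(4)
  %v = load i32, ptr addrspace(4) %p
  ret i32 %v
}
```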

# 1f550d86 | 05-Sep-2022 | Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Pre-commit a test on (base | offset) SMEM loads for D132552.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D133021

# f3364530 | 05-Sep-2022 | Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Support (soffset + offset) s_buffer_load's.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D130263

# 8d0383eb | 24-Jun-2022 | Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Remove AliasAnalysis from regalloc

This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for whether a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implies dereferenceable for now, but maybe this should be changed.
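A hedged MIR sketch of what remat now keys off; the opcode and operands are illustrative:

```
; Both facts live on the memory operand itself, so no AA query is needed
; at remat time:
%1:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0:sreg_64, 0, 0 :: (dereferenceable invariant load (s32) from %ir.ptr, addrspace 4)
```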

# 432cbd78 | 18-Jul-2022 | Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Support (register + immediate) SMRD offsets.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129381

# 9c66c02e | 18-Jul-2022 | Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][CodeGen] Match SMRDs with constant bases and register offsets.

Saves some add instructions on a couple of Rage 2 shaders and is also a prerequisite for a coming-soon change matching (register + immediate) offsets.

Reviewed By: foad, arsenm

Differential Revision: https://reviews.llvm.org/D129095

# 8cd79bc1 | 05-Jul-2022 | Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][GlobalISel] Support register offsets for SMRDs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D128836

Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# fae05692 | 20-May-2021 | Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Print/parse LLTs in MachineMemOperands

This will currently accept the old number-of-bytes syntax and convert it to a scalar. This should be removed in the near future (I think I converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue printing the number of bytes for non-generic instructions to avoid test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing the existing integers and implicitly convert them to scalars. The parentheses are a bit ugly, but the parser logic seems unable to deal without either parentheses or some keyword to indicate the start of a type.
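The change in the printed memory operand, as a before/after sketch:

```
:: (load 4 from %ir.p)      ; old syntax: a size in bytes
:: (load (s32) from %ir.p)  ; new syntax: an LLT, printed in parentheses
```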

Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# 3bffb1cd | 09-Feb-2021 | Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Use single cache policy operand

Replace the individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and, hopefully, the amount of code. These operands are mostly 0 anyway.

An additional advantage is that the parser will accept these flags in any order, unlike now.

Differential Revision: https://reviews.llvm.org/D96469
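A hedged sketch of the operand change; the bit values are assumptions based on the usual CPol encoding (GLC=1, SLC=2, DLC=4), not quoted from the patch:

```
; Before: three separate immediate operands
;   BUFFER_LOAD_DWORD_OFFSET ..., 1, 1, 0   ; glc, slc, dlc
; After: one cache_policy bitmask
;   BUFFER_LOAD_DWORD_OFFSET ..., 3         ; cache_policy = glc|slc
```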

Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2
# b8c8d1b3 | 30-Jul-2020 | Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some tests to use new buffer intrinsics

Uses of the legacy (non-struct, non-raw) buffer intrinsics should now all be consolidated into the tests specifically for those intrinsics.

Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 77ce2e21 | 30-Mar-2020 | Jakub Kuderski <kubak@google.com>

[AMDGPU] Add Relocation Constant Support

Summary: This change adds the amdgcn.reloc.constant intrinsic to the amdgpu backend, which will compile into a relocation entry in the resulting ELF.

The intrinsic takes a MetadataNode (String) as its only argument, which specifies the symbol name of the relocation entry.

`SelectionDAGBuilder::getValueImpl` is changed to allow metadata operands to be passed through to ISel.

Author: csyonghe <yonghe@google.com>

Reviewers: tpr, nhaehnle

Reviewed By: nhaehnle

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76440
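A minimal IR sketch of the intrinsic's use; the symbol name here is arbitrary:

```
define amdgpu_ps i32 @get_reloc_value() {
  %v = call i32 @llvm.amdgcn.reloc.constant(metadata !0)
  ret i32 %v
}

declare i32 @llvm.amdgcn.reloc.constant(metadata)

; The string becomes the symbol of the relocation entry in the emitted ELF.
!0 = !{!"my_reloc_symbol"}
```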