#
8a52fef1 |
| 27-Jan-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] SILoadStoreOptimizer: tweak API of CombineInfo::setMI. NFC.
Change CombineInfo::setMI to take a reference to the SILoadStoreOptimizer instance, for easy access to common fields like TII and STM.
Differential Revision: https://reviews.llvm.org/D118366
|
#
185cb8e8 |
| 26-Jan-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] SILoadStoreOptimizer: Allow merging across a swizzled access
Swizzled accesses are not merged, but there is no particular reason not to merge two instructions if any of the intervening instructions happens to be a swizzled access.
This moves the check for swizzled accesses out of checkAndPrepareMerge into collectMergeableInsts where I think it makes more sense.
Differential Revision: https://reviews.llvm.org/D118267
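The restructuring described in this message can be pictured with a minimal sketch, assuming a much simplified instruction model. The `Inst` struct, its fields, and `collectMergeable` are hypothetical stand-ins for illustration, not the pass's actual types. Swizzled accesses are filtered out while candidates are collected, so the pairing step never sees them as candidates, yet the instructions on either side of one remain in the same candidate list.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction.
struct Inst {
  int Id;          // position of the instruction in its block
  bool IsMemory;   // a load/store the optimizer could consider
  bool IsSwizzled; // swizzled accesses are never merge candidates
};

// Collect merge candidates from one block. A swizzled access is skipped
// as a candidate, but it does not end the scan, so its neighbours can
// still be paired with each other later.
std::vector<int> collectMergeable(const std::vector<Inst> &Block) {
  std::vector<int> Candidates;
  for (const Inst &I : Block) {
    if (!I.IsMemory)
      continue;
    if (I.IsSwizzled)
      continue; // filtered at collection time, not during pairing
    Candidates.push_back(I.Id);
  }
  return Candidates;
}
```

With a block of three memory instructions where only the middle one is swizzled, the first and third both come back as candidates.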
|
#
95857a70 |
| 26-Jan-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] SILoadStoreOptimizer: Remove redundant check for volatile
SILoadStoreOptimizer::collectMergeableInsts already ends the current block if it sees a volatile (or ordered) memory access, so there is no need to check for them again when scanning the instructions between two pairing candidates in a block.
Differential Revision: https://reviews.llvm.org/D118266
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
#
63eea41d |
| 19-Jan-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Simplify SILoadStoreOptimizer::getSubRegIdxs. NFC.
|
Revision tags: llvmorg-13.0.1-rc2 |
|
#
5a667c0e |
| 28-Dec-2021 |
Kazu Hirata <kazu@google.com> |
[llvm] Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
#
654c89d8 |
| 06-Sep-2021 |
Christudasan Devadasan <Christudasan.Devadasan@amd.com> |
[AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to enable copy between VGPR and AGPR registers during regalloc.
Also, added the missing AV register classes from 192b to 1024b.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D109300
|
#
d1f45ed5 |
| 11-Nov-2021 |
Neubauer, Sebastian <Sebastian.Neubauer@amd.com> |
[AMDGPU][NFC] Fix typos
Differential Revision: https://reviews.llvm.org/D113672
|
#
c5029023 |
| 02-Nov-2021 |
Martin Liska <mliska@suse.cz> |
Fix building with GCC 12:
Fixes: https://bugs.llvm.org/show_bug.cgi?id=52380
Differential Revision: https://reviews.llvm.org/D112990
|
#
30d6c39b |
| 30-Aug-2021 |
Piotr Sobczak <Piotr.Sobczak@amd.com> |
[AMDGPU] Add merging into S_BUFFER_LOAD_DWORDX8_IMM
Extend SILoadStoreOptimizer to merge into DWORDX8 variant of S_BUFFER_LOAD.
Merging into DWORDX2 and DWORDX4 variants is handled already.
Differential Revision: https://reviews.llvm.org/D108909
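One way to picture the extension is a width-to-opcode mapping, sketched below under stated assumptions. The helper `mergedSBufferLoadName` is hypothetical, widths are counted in dwords, and the real pass works on opcodes and checks far more than the combined width.

```cpp
#include <string>

// Hypothetical helper: given the widths (in dwords) of two adjacent loads,
// return the name of the wider variant that could replace them, or an
// empty string if no such variant exists.
std::string mergedSBufferLoadName(unsigned Width0, unsigned Width1) {
  switch (Width0 + Width1) {
  case 2:
    return "S_BUFFER_LOAD_DWORDX2_IMM";
  case 4:
    return "S_BUFFER_LOAD_DWORDX4_IMM";
  case 8:
    return "S_BUFFER_LOAD_DWORDX8_IMM"; // the variant this change adds
  default:
    return ""; // no matching wider variant in this sketch
  }
}
```

In this sketch two DWORDX4 loads map to the DWORDX8 variant, while a combined width of 6 dwords has no match and is left unmerged.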
|
Revision tags: llvmorg-13.0.0-rc2 |
|
#
99c790dc |
| 17-Aug-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Make BVH isel consistent with other MIMG opcodes
Suffix opcodes with _gfx10. Remove direct references to architecture specific opcodes. Add a BVH flag and apply this to disassembly. Fix a number of disassembly errors on gfx90a target caused by previous incorrect BVH detection code.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D108117
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
#
cc79aace |
| 10-May-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix SILoadStoreOptimizer for gfx90a
This was hardcoding the register class to use for the newly created pointer registers, violating the aligned VGPR requirement.
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
c297709e |
| 15-Mar-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fixed msan failure with uninitialized value
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
#
3bffb1cd |
| 09-Feb-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and I hope the amount of code. These operands are mostly 0 anyway.
An additional advantage is that the parser will accept these flags in any order, unlike now.
Differential Revision: https://reviews.llvm.org/D96469
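The idea of folding three flag operands into one bitmask can be sketched as follows. The enum name, the bit positions, and both helper functions are assumptions made for illustration and should not be read as the backend's actual definitions.

```cpp
#include <cstdint>

// Hypothetical bit assignments, for illustration only.
enum CachePolicy : uint32_t {
  GLC = 1u << 0,
  SLC = 1u << 1,
  DLC = 1u << 2,
};

// Before: three separate 0/1 operands. After: a single bitmask operand
// that is 0 in the common case where no flag is set.
constexpr uint32_t makeCachePolicy(bool Glc, bool Slc, bool Dlc) {
  return (Glc ? GLC : 0u) | (Slc ? SLC : 0u) | (Dlc ? DLC : 0u);
}

// Test a single flag in the combined operand.
constexpr bool hasFlag(uint32_t CPol, CachePolicy Flag) {
  return (CPol & Flag) != 0;
}
```

Because the flags are OR-ed into one value, the order in which they are written does not affect the result, which matches the parser behaviour the message describes.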
|
#
78b6d73a |
| 19-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add even aligned VGPR/AGPR register classes
gfx90a operations require even aligned registers, but this was previously achieved by reserving registers inside the full class.
Ideally this would be captured in the static instruction definitions for the operands, and we would have different instructions per subtarget. The hackiest part of this is we need to manually reassign AGPR register classes after instruction selection (we get away without this for VGPRs since those types are actually registered for legal types).
|
#
75997e84 |
| 18-Feb-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fixed msan build
LoadStoreOptimizer was using uninitialized SCC value for instructions where it is unsupported.
|
#
a8d9d507 |
| 17-Feb-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
|
#
23db2d36 |
| 09-Feb-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Better selection of base offset when merging DS reads/writes
When merging a pair of DS reads or writes needs to materialize the base offset in a vgpr, choose a value that is aligned to as high a power of two as possible. This maximises the chance that different pairs can use the same base offset, in which case the base offset registers can be commoned up by MachineCSE.
Differential Revision: https://reviews.llvm.org/D96421
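A minimal sketch of the "aligned to as high a power of two as possible" selection, assuming the set of usable base offsets has already been narrowed to an inclusive range `[Lo, Hi]`. How that range follows from the instruction's offset encoding is not modelled here, and the helper below is an illustrative implementation rather than the code in the pass.

```cpp
#include <cassert>
#include <cstdint>

// Return the value in the inclusive range [Lo, Hi] that is a multiple of
// the largest possible power of two.
uint32_t mostAlignedValueInRange(uint32_t Lo, uint32_t Hi) {
  assert(Lo <= Hi && "empty range");
  if (Lo == 0)
    return 0; // zero is a multiple of every power of two
  for (int K = 31; K >= 0; --K) {
    // Round Hi down to a multiple of 2^K and check it is still in range.
    uint32_t Candidate = Hi & ~((uint32_t(1) << K) - 1);
    if (Candidate >= Lo)
      return Candidate;
  }
  return Hi; // not reached: K == 0 always yields Hi itself
}
```

For example, the ranges `[5, 9]` and `[6, 12]` both yield 8, so two different pairs would materialize the same base value, which is the situation in which a later common-subexpression pass can share the register.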
|
#
2114b458 |
| 09-Feb-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Fix comments in SILoadStoreOptimizer::offsetsCanBeCombined
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
Revision tags: llvmorg-11.1.0-rc1 |
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
91f503c3 |
| 16-Sep-2020 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] gfx1030 RT support
Differential Revision: https://reviews.llvm.org/D87782
|
#
34978602 |
| 20-Aug-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister
... in favour of the isPhysical/isVirtual methods.
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init |
|
#
79f67cae |
| 14-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Rename add/sub with carry out instructions
The hardware has created a real mess in the naming for add/sub, which have been renamed basically every generation. Switch the carry out pseudos to have the gfx9/gfx10 names. We were using the original SI/CI v_add_i32/v_sub_i32 names. Later targets reintroduced these names as carryless instructions with a saturating clamp bit, which we do not define. Do this rename so we can unambiguously add these missing instructions.
The carry-in versions should also be renamed, but at least those had a consistent _u32 name to begin with. The 16-bit instructions were also renamed, but aren't ambiguous.
This does regress assembler error message quality in some cases. In mismatched wave32/wave64 situations, this will switch from "unsupported instruction" to "invalid operand", with the error pointing at the wrong position. I couldn't quite follow how the assembler selects these, but the previous behavior seemed accidental to me. It looked like there was a partial attempt to handle this which was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it isn't used for anything).
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
47788b97 |
| 11-Jun-2020 |
Jay Foad <jay.foad@amd.com> |
SILoadStoreOptimizer: add support for GFX10 image instructions
GFX10 image instructions use one or more address operands starting at vaddr0, instead of a single vaddr operand, to allow for NSA forms.
Differential Revision: https://reviews.llvm.org/D81675
|
#
d0b0b252 |
| 28-Jun-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use IsSSA property check instead of asserting on isSSA
Also fix an SSA violation in a test the MIRParser/verifier fails to catch. It's illegal to define a subregister in SSA. For the purpose of the test, it just needs to define the super-register to use the subregister in the use operand.
|