History log of /llvm-project/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp (Results 51 – 75 of 167)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 8a52fef1 27-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: tweak API of CombineInfo::setMI. NFC.

Change CombineInfo::setMI to take a reference to the
SILoadStoreOptimizer instance, for easy access to common fields like
TII and

[AMDGPU] SILoadStoreOptimizer: tweak API of CombineInfo::setMI. NFC.

Change CombineInfo::setMI to take a reference to the
SILoadStoreOptimizer instance, for easy access to common fields like
TII and STM.

Differential Revision: https://reviews.llvm.org/D118366

show more ...


# 185cb8e8 26-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: Allow merging across a swizzled access

Swizzled accesses are not merged, but there is no particular reason not
to merge two instructions if any of the intervening inst

[AMDGPU] SILoadStoreOptimizer: Allow merging across a swizzled access

Swizzled accesses are not merged, but there is no particular reason not
to merge two instructions if any of the intervening instructions happens
to be a swizzled access.

This moves the check for swizzled accesses out of checkAndPrepareMerge
into collectMergeableInsts where I think it makes more sense.

Differential Revision: https://reviews.llvm.org/D118267

show more ...


# 95857a70 26-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: Remove redundant check for volatile

SILoadStoreOptimizer::collectMergeableInsts already ends the current
block if it sees a volatile (or ordered) memory access, so the

[AMDGPU] SILoadStoreOptimizer: Remove redundant check for volatile

SILoadStoreOptimizer::collectMergeableInsts already ends the current
block if it sees a volatile (or ordered) memory access, so there is no
need to check for them again when scanning the instructions between two
pairing candidates in a block.

Differential Revision: https://reviews.llvm.org/D118266

show more ...


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3
# 63eea41d 19-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Simplify SILoadStoreOptimizer::getSubRegIdxs. NFC.


Revision tags: llvmorg-13.0.1-rc2
# 5a667c0e 28-Dec-2021 Kazu Hirata <kazu@google.com>

[llvm] Use nullptr instead of 0 (NFC)

Identified with modernize-use-nullptr.


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 654c89d8 06-Sep-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Make vector superclasses allocatable

The combined vector register classes with both
VGPRs and AGPRs are currently unallocatable.
This patch turns them into allocatable as a
prerequisite to

[AMDGPU] Make vector superclasses allocatable

The combined vector register classes with both
VGPRs and AGPRs are currently unallocatable.
This patch turns them into allocatable as a
prerequisite to enable copy between VGPR and
AGPR registers during regalloc.

Also, added the missing AV register classes from
192b to 1024b.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D109300

show more ...


# d1f45ed5 11-Nov-2021 Neubauer, Sebastian <Sebastian.Neubauer@amd.com>

[AMDGPU][NFC] Fix typos

Differential Revision: https://reviews.llvm.org/D113672


# c5029023 02-Nov-2021 Martin Liska <mliska@suse.cz>

Fix building with GCC 12:

Fixes: https://bugs.llvm.org/show_bug.cgi?id=52380

Differential Revision: https://reviews.llvm.org/D112990


# 30d6c39b 30-Aug-2021 Piotr Sobczak <Piotr.Sobczak@amd.com>

[AMDGPU] Add merging into S_BUFFER_LOAD_DWORDX8_IMM

Extend SILoadStoreOptimizer to merge into DWORDX8 variant of S_BUFFER_LOAD.

Merging into DWORDX2 and DWORDX4 variants is handled already.

Differ

[AMDGPU] Add merging into S_BUFFER_LOAD_DWORDX8_IMM

Extend SILoadStoreOptimizer to merge into DWORDX8 variant of S_BUFFER_LOAD.

Merging into DWORDX2 and DWORDX4 variants is handled already.

Differential Revision: https://reviews.llvm.org/D108909

show more ...


Revision tags: llvmorg-13.0.0-rc2
# 99c790dc 17-Aug-2021 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Make BVH isel consistent with other MIMG opcodes

Suffix opcodes with _gfx10.
Remove direct references to architecture specific opcodes.
Add a BVH flag and apply this to diassembly.
Fix a nu

[AMDGPU] Make BVH isel consistent with other MIMG opcodes

Suffix opcodes with _gfx10.
Remove direct references to architecture specific opcodes.
Add a BVH flag and apply this to diassembly.
Fix a number of disassembly errors on gfx90a target caused by
previous incorrect BVH detection code.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D108117

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# cc79aace 10-May-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix SILoadStoreOptimizer for gfx90a

This was hardcoding the register class to use for the newly created
pointer registers, violating the aligned VGPR requirement.


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# c297709e 15-Mar-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fixed msan failure with uninitialized value


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# 3bffb1cd 09-Feb-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Use single cache policy operand

Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amoun

[AMDGPU] Use single cache policy operand

Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

Additional advantage that parser will accept these flags in any order unlike
now.

Differential Revision: https://reviews.llvm.org/D96469

show more ...


# 78b6d73a 19-Feb-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add even aligned VGPR/AGPR register classes

gfx90a operations require even aligned registers, but this was
previously achieved by reserving registers inside the full class.

Ideally this wou

AMDGPU: Add even aligned VGPR/AGPR register classes

gfx90a operations require even aligned registers, but this was
previously achieved by reserving registers inside the full class.

Ideally this would be captured in the static instruction definitions
for the operands, and we would have different instructions per
subtarget. The hackiest part of this is we need to manually reassign
AGPR register classes after instruction selection (we get away without
this for VGPRs since those types are actually registered for legal
types).

show more ...


# 75997e84 18-Feb-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fixed msan build

LoadStoreOptimizer was using uninitialized SCC value for
instructions where it is unsupported.


# a8d9d507 17-Feb-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx90a support

Differential Revision: https://reviews.llvm.org/D96906


# 23db2d36 09-Feb-2021 Jay Foad <jay.foad@amd.com>

[AMDGPU] Better selection of base offset when merging DS reads/writes

When merging a pair of DS reads or writes needs to materialize the base
offset in a vgpr, choose a value that is aligned to as h

[AMDGPU] Better selection of base offset when merging DS reads/writes

When merging a pair of DS reads or writes needs to materialize the base
offset in a vgpr, choose a value that is aligned to as high a power of
two as possible. This maximises the chance that different pairs can use
the same base offset, in which case the base offset registers can be
commoned up by MachineCSE.

Differential Revision: https://reviews.llvm.org/D96421

show more ...


# 2114b458 09-Feb-2021 Jay Foad <jay.foad@amd.com>

[AMDGPU] Fix comments in SILoadStoreOptimizer::offsetsCanBeCombined


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04 20-Jan-2021 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets

... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036


Revision tags: llvmorg-11.1.0-rc1
# 6a87e9b0 25-Dec-2020 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Reduce include files dependency.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 91f503c3 16-Sep-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx1030 RT support

Differential Revision: https://reviews.llvm.org/D87782


# 34978602 20-Aug-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister

... in favour of the isPhysical/isVirtual methods.


Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init
# 79f67cae 14-Jul-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Rename add/sub with carry out instructions

The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to

AMDGPU: Rename add/sub with carry out instructions

The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to have the gfx9/gfx10 names. We were using the original SI/CI
v_add_i32/v_sub_i32 names. Later targets reintroduced these names as
carryless instructions with a saturating clamp bit, which we do not
define. Do this rename so we can unambiguously add these missing
instructions.

The carry-in versions should also be renamed, but at least those had a
consistent _u32 name to begin with. The 16-bit instructions were also
renamed, but aren't ambiguous.

This does regress assembler error message quality in some cases. In
mismatched wave32/wave64 situations, this will switch from
"unsupported instruction" to "invalid operand", with the error
pointing at the wrong position. I couldn't quite follow how the
assembler selects these, but the previous behavior seemed accidental
to me. It looked like there was a partial attempt to handle this which
was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it
isn't used for anything).

show more ...


Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 47788b97 11-Jun-2020 Jay Foad <jay.foad@amd.com>

SILoadStoreOptimizer: add support for GFX10 image instructions

GFX10 image instructions use one or more address operands starting at
vaddr0, instead of a single vaddr operand, to allow for NSA forms

SILoadStoreOptimizer: add support for GFX10 image instructions

GFX10 image instructions use one or more address operands starting at
vaddr0, instead of a single vaddr operand, to allow for NSA forms.

Differential Revision: https://reviews.llvm.org/D81675

show more ...


# d0b0b252 28-Jun-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use IsSSA property check instead of asserting on isSSA

Also fix an SSA violation in a test the MIRParser/verifier fails to
catch. It's illegal to define a subregister in SSA. For the purpose

AMDGPU: Use IsSSA property check instead of asserting on isSSA

Also fix an SSA violation in a test the MIRParser/verifier fails to
catch. It's illegal to define a subregister in SSA. For the purpose of
the test, it just needs to define the super-register to use the
subregister in the use operand.

show more ...


1234567