History log of /llvm-project/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp (Results 1 – 25 of 167)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2
# 8d13e7b8 03-Oct-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Qualify auto. NFC. (#110878)

Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# 7a30b9c0 11-Sep-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Make more use of getWaveMaskRegClass. NFC. (#108186)


Revision tags: llvmorg-19.1.0-rc4
# cd3667d1 02-Sep-2024 Craig Topper <craig.topper@sifive.com>

[CodeGen] Update a few places that were passing Register to raw_ostream::operator<< (#106877)

These would implicitly cast the register to `unsigned`. Switch most of
them to use printReg, which gives more readable output. Change some
others to use Register::id() so we can eventually remove the implicit
cast to `unsigned`.


# da137541 02-Sep-2024 Akshat Oke <76596238+Akshat-Oke@users.noreply.github.com>

AMDGPU/NewPM Port SILoadStoreOptimizer to NPM (#106362)


Revision tags: llvmorg-19.1.0-rc3
# 273e0a4c 12-Aug-2024 Tim Gymnich <tgymnich@icloud.com>

[AMDGPU] add missing checks in processBaseWithConstOffset (#102310)

fixes https://github.com/llvm/llvm-project/issues/102231 by inserting
missing checks.


# 37d7b06d 06-Aug-2024 Christudasan Devadasan <christudasan.devadasan@amd.com>

[AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (#101619)

Use the constrained buffer load opcodes while combining under-aligned
loads for XNACK enabled subtargets.


Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# a1d7da05 23-Jul-2024 Christudasan Devadasan <christudasan.devadasan@amd.com>

[AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (#96162)

Consider the constrained multi-dword loads while merging
individual loads to a single multi-dword load.


Revision tags: llvmorg-18.1.8
# c771b670 06-Jun-2024 Stanislav Mekhanoshin <rampitec@users.noreply.github.com>

[AMDGPU] Promote immediate offset to atomics (#94043)


Revision tags: llvmorg-18.1.7
# fc21387b 31-May-2024 Stanislav Mekhanoshin <rampitec@users.noreply.github.com>

[AMDGPU] Enable constant offset promotion to immediate FLAT (#93884)

Currently it is only supported for FLAT Global.


# 215f92b9 30-May-2024 Stanislav Mekhanoshin <rampitec@users.noreply.github.com>

[AMDGPU] Fix crash in the SILoadStoreOptimizer (#93862)

It does not properly handle the situation where the address calculation
uses V_ADDC_U32 0, 0, carry-in (i.e. with both src0 and src1 immediates).


Revision tags: llvmorg-18.1.6
# 11f76b85 02-May-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Use some merging/unmerging helpers in SILoadStoreOptimizer (#90866)

Factor out copyToDestRegs and copyFromSrcRegs for merging store sources
and unmerging load results. NFC.


# e020e287 02-May-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Modernize some syntax in SILoadStoreOptimizer. NFC.

Use structured bindings and similar.


Revision tags: llvmorg-18.1.5
# 0606747c 01-May-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove some pointless fallthrough annotations


Revision tags: llvmorg-18.1.4, llvmorg-18.1.3
# 06cfbe3c 25-Mar-2024 David Stuttard <david.stuttard@amd.com>

[AMDGPU] Add support for idxen and bothen buffer load/store merging in SILoadStoreOptimizer (#86285)

Added more buffer instruction merging support


Revision tags: llvmorg-18.1.2
# 601e102b 17-Mar-2024 David Green <david.green@arm.com>

[CodeGen] Use LocationSize for MMO getSize (#84751)

This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.

This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.

Global ISel is still expected to use the underlying LLT as it needs, and
is not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
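
The guarded-access pattern described above can be illustrated with a small
standalone sketch. SizeSketch is a hypothetical stand-in written for this
example and is not llvm::LocationSize; only the names hasValue(), getValue()
and beforeOrAfter() are borrowed from the commit message, and precise(),
fromLegacy() and combinedWidthOrZero() are invented here for illustration.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for illustration only; NOT llvm::LocationSize.
class SizeSketch {
  std::optional<std::uint64_t> Value; // empty means "size unknown"

  constexpr explicit SizeSketch(std::optional<std::uint64_t> V) : Value(V) {}

public:
  // Assumed helper name: a size known exactly, in bytes.
  static constexpr SizeSketch precise(std::uint64_t Bytes) {
    return SizeSketch(Bytes);
  }

  // Unknown size, named after the state mentioned in the commit message.
  static constexpr SizeSketch beforeOrAfter() {
    return SizeSketch(std::nullopt);
  }

  // Assumed helper: maps the old all-ones sentinel to the unknown state.
  static constexpr SizeSketch fromLegacy(std::uint64_t Raw) {
    return Raw == ~std::uint64_t{0} ? beforeOrAfter() : precise(Raw);
  }

  constexpr bool hasValue() const { return Value.has_value(); }

  // Only meaningful when hasValue() is true.
  constexpr std::uint64_t getValue() const { return *Value; }
};

// Callers check hasValue() before calling getValue(), so an unknown size is
// never treated as a very large unsigned number.
constexpr std::uint64_t combinedWidthOrZero(SizeSketch A, SizeSketch B) {
  if (!A.hasValue() || !B.hasValue())
    return 0; // assumption for this sketch: report 0 when either is unknown
  return A.getValue() + B.getValue();
}
```

With a raw uint64_t, the sentinel ~UINT64_C(0) could silently take part in
arithmetic; wrapping it forces each call site to decide what an unknown size
means.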


Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 5879162f 15-Dec-2023 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] CodeGen for GFX12 VBUFFER instructions (#75492)


# 26b14aed 15-Dec-2023 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] CodeGen for GFX12 VIMAGE and VSAMPLE instructions (#75488)


# a278ac57 15-Dec-2023 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] CodeGen for SMEM instructions (#75579)


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 4fa8a548 11-Aug-2023 Konrad Kusiak <konrad.kusiak@codeplay>

[AMDGPU] Add sanity check that fixes bad shift operation in AMD backend

There is a problem with the
SILoadStoreOptimizer::dmasksCanBeCombined() function that can lead to
UB.

This boolean function decides if two masks can be combined into 1. The
idea here is that the bits which are "on" in one mask, don't overlap
with the "on" bits of the other. Consider an example (10 bits for
simplicity):

Mask 1: 0101101000
Mask 2: 0000000110

Those can be combined into a single mask: 0101101110.

To check if such an operation is possible, the code takes the mask
which is greater and counts how many 0s there are, starting from the
LSB and stopping at the first 1. Then, it shifts 1u by this number and
compares it with the smaller mask. The problem is that when both masks
are 0, the counter will find 32 zeroes in the first mask and will try
to do a shift by 32 positions which leads to UB.

The fix is a simple sanity check on whether the bigger mask is 0.

https://reviews.llvm.org/D155051
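
The check described in this commit message can be sketched as a standalone
function. This is a minimal illustration and not the code from
SILoadStoreOptimizer.cpp: the function name and parameter names are
hypothetical, a hand-written loop stands in for whatever bit-counting helper
the real code uses, and returning false for a zero mask is an assumption
(the message only says that a check was added).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical standalone helper; not the LLVM member function.
constexpr bool dmasksCanBeCombinedSketch(std::uint32_t MaskA,
                                         std::uint32_t MaskB) {
  const std::uint32_t MaxMask = std::max(MaskA, MaskB);
  const std::uint32_t MinMask = std::min(MaskA, MaskB);

  // Sanity check: with MaxMask == 0 the trailing-zero count would be 32,
  // and shifting a 32-bit value by 32 is undefined behaviour.
  // Assumption: such a pair is simply rejected.
  if (MaxMask == 0)
    return false;

  // Count the zero bits below the lowest set bit of the larger mask.
  unsigned AllowedBitsForMin = 0;
  for (std::uint32_t M = MaxMask; (M & 1u) == 0; M >>= 1)
    ++AllowedBitsForMin;

  // The smaller mask must fit entirely below that lowest set bit.
  return (std::uint32_t{1} << AllowedBitsForMin) > MinMask;
}
```

For the masks in the example, the larger mask 0101101000 has three trailing
zeros, so the smaller mask must be below 1 << 3 = 8; 0000000110 is 6, so the
pair is accepted. Note that this test is stricter than "no overlapping bits":
every set bit of the smaller mask has to sit below the lowest set bit of the
larger one.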


Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# c68c6c56 21-Jun-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Minor refactoring in SILoadStoreOptimizer::offsetsCanBeCombined


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3
# 0c13e0b7 25-Apr-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Do not handle _SGPR SMEM instructions in SILoadStoreOptimizer

After D147334 we never select _SGPR forms of SMEM instructions on
subtargets that also support the _SGPR_IMM form, so there is no need to
handle them here.

Differential Revision: https://reviews.llvm.org/D149139


Revision tags: llvmorg-16.0.2
# f6e70ed1 07-Apr-2023 mmarjano <mmarjano@amd.com>

[AMDGPU] Extend tbuffer_load_format merge

Add support for merging _IDXEN and _BOTHEN variants of
TBUFFER_LOAD_FORMAT instruction.


Revision tags: llvmorg-16.0.1, llvmorg-16.0.0
# 7ada7bbe 15-Mar-2023 Kazu Hirata <kazu@google.com>

[Target] Use *{Set,Map}::contains (NFC)


Revision tags: llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# e0782018 28-Jan-2023 Kazu Hirata <kazu@google.com>

[Target] Use llvm::count{l,r}_{zero,one} (NFC)


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# caa99a01 22-Jan-2023 Kazu Hirata <kazu@google.com>

Use llvm::popcount instead of llvm::countPopulation(NFC)

