Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
8d13e7b8 |
| 03-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Qualify auto. NFC. (#110878)
Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
7a30b9c0 |
| 11-Sep-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Make more use of getWaveMaskRegClass. NFC. (#108186)
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
cd3667d1 |
| 02-Sep-2024 |
Craig Topper <craig.topper@sifive.com> |
[CodeGen] Update a few places that were passing Register to raw_ostream::operator<< (#106877)
These would implicitly cast the register to `unsigned`. Switch most of
them to use printReg, which gives more readable output. Change some
others to use Register::id() so we can eventually remove the implicit
cast to `unsigned`.
|
#
da137541 |
| 02-Sep-2024 |
Akshat Oke <76596238+Akshat-Oke@users.noreply.github.com> |
AMDGPU/NewPM Port SILoadStoreOptimizer to NPM (#106362)
|
Revision tags: llvmorg-19.1.0-rc3 |
|
#
273e0a4c |
| 12-Aug-2024 |
Tim Gymnich <tgymnich@icloud.com> |
[AMDGPU] add missing checks in processBaseWithConstOffset (#102310)
fixes https://github.com/llvm/llvm-project/issues/102231 by inserting
missing checks.
|
#
37d7b06d |
| 06-Aug-2024 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (#101619)
Use the constrained buffer load opcodes while combining under-aligned loads for XNACK enabled subtargets.
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
a1d7da05 |
| 23-Jul-2024 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (#96162)
Consider the constrained multi-dword loads while merging individual loads to a single multi-dword load.
|
Revision tags: llvmorg-18.1.8 |
|
#
c771b670 |
| 06-Jun-2024 |
Stanislav Mekhanoshin <rampitec@users.noreply.github.com> |
[AMDGPU] Promote immediate offset to atomics (#94043)
|
Revision tags: llvmorg-18.1.7 |
|
#
fc21387b |
| 31-May-2024 |
Stanislav Mekhanoshin <rampitec@users.noreply.github.com> |
[AMDGPU] Enable constant offset promotion to immediate FLAT (#93884)
Currently it is only supported for FLAT Global.
|
#
215f92b9 |
| 30-May-2024 |
Stanislav Mekhanoshin <rampitec@users.noreply.github.com> |
[AMDGPU] Fix crash in the SILoadStoreOptimizer (#93862)
The optimizer did not properly handle the case where the address calculation uses
V_ADDC_U32 0, 0, carry-in (i.e. with both src0 and src1 immediates).
|
Revision tags: llvmorg-18.1.6 |
|
#
11f76b85 |
| 02-May-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Use some merging/unmerging helpers in SILoadStoreOptimizer (#90866)
Factor out copyToDestRegs and copyFromSrcRegs for merging store sources
and unmerging load results. NFC.
|
#
e020e287 |
| 02-May-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Modernize some syntax in SILoadStoreOptimizer. NFC.
Use structured bindings and similar.
|
Revision tags: llvmorg-18.1.5 |
|
#
0606747c |
| 01-May-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove some pointless fallthrough annotations
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
06cfbe3c |
| 25-Mar-2024 |
David Stuttard <david.stuttard@amd.com> |
[AMDGPU] Add support for idxen and bothen buffer load/store merging in SILoadStoreOptimizer (#86285)
Added more buffer instruction merging support
|
Revision tags: llvmorg-18.1.2 |
|
#
601e102b |
| 17-Mar-2024 |
David Green <david.green@arm.com> |
[CodeGen] Use LocationSize for MMO getSize (#84751)
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
is not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
|
Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
5879162f |
| 15-Dec-2023 |
Mirko Brkušanin <Mirko.Brkusanin@amd.com> |
[AMDGPU] CodeGen for GFX12 VBUFFER instructions (#75492)
|
#
26b14aed |
| 15-Dec-2023 |
Mirko Brkušanin <Mirko.Brkusanin@amd.com> |
[AMDGPU] CodeGen for GFX12 VIMAGE and VSAMPLE instructions (#75488)
|
#
a278ac57 |
| 15-Dec-2023 |
Mirko Brkušanin <Mirko.Brkusanin@amd.com> |
[AMDGPU] CodeGen for SMEM instructions (#75579)
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
4fa8a548 |
| 11-Aug-2023 |
Konrad Kusiak <konrad.kusiak@codeplay> |
[AMDGPU] Add sanity check that fixes bad shift operation in AMD backend
There is a problem with the SILoadStoreOptimizer::dmasksCanBeCombined() function that can lead to UB.
This boolean function decides whether two masks can be combined into one. The idea is that the bits that are "on" in one mask don't overlap with the "on" bits of the other. Consider an example (10 bits for simplicity):
Mask 1: 0101101000
Mask 2: 0000000110
Those can be combined into a single mask: 0101101110.
To check whether such an operation is possible, the code takes the greater mask and counts its zeros, starting from the LSB and stopping at the first 1. Then it shifts 1u by this number and compares the result with the smaller mask. The problem is that when both masks are 0, the counter finds 32 zeros in the first mask and the code tries to shift by 32 positions, which is UB.
The fix is a simple sanity check that tests whether the bigger mask is 0.
https://reviews.llvm.org/D155051
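The check described in this commit message can be sketched as standalone C++. This is a minimal illustration under stated assumptions, not the code from SILoadStoreOptimizer: the names countTrailingZeros32 and masksCanBeCombinedSketch are hypothetical, and the trailing-zero count is hand-written so the example needs no LLVM headers.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: counts zero bits starting from the LSB and stops at
// the first set bit. Returns 32 when Value is 0.
inline unsigned countTrailingZeros32(std::uint32_t Value) {
  unsigned Count = 0;
  while (Count < 32 && ((Value >> Count) & 1u) == 0)
    ++Count;
  return Count;
}

// Hypothetical sketch of the check: the smaller mask must fit entirely
// below the lowest set bit of the larger mask.
inline bool masksCanBeCombinedSketch(std::uint32_t MaskA,
                                     std::uint32_t MaskB) {
  const std::uint32_t MaxMask = std::max(MaskA, MaskB);
  const std::uint32_t MinMask = std::min(MaskA, MaskB);

  // The sanity check: if the larger mask is 0, the trailing-zero count is
  // 32, and shifting a 32-bit value by 32 is undefined behavior.
  if (MaxMask == 0)
    return false;

  const unsigned AllowedBitsForMin = countTrailingZeros32(MaxMask);
  return (std::uint32_t{1} << AllowedBitsForMin) > MinMask;
}
```

With the masks from the example, 0101101000 (0x168) has three trailing zeros, so the smaller mask must be below 1u << 3 = 8; 0000000110 (0x6) qualifies. When both masks are 0, the early return avoids the shift entirely.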
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
c68c6c56 |
| 21-Jun-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Minor refactoring in SILoadStoreOptimizer::offsetsCanBeCombined
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
0c13e0b7 |
| 25-Apr-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Do not handle _SGPR SMEM instructions in SILoadStoreOptimizer
After D147334 we never select _SGPR forms of SMEM instructions on subtargets that also support the _SGPR_IMM form, so there is no need to handle them here.
Differential Revision: https://reviews.llvm.org/D149139
|
Revision tags: llvmorg-16.0.2 |
|
#
f6e70ed1 |
| 07-Apr-2023 |
mmarjano <mmarjano@amd.com> |
[AMDGPU] Extend tbuffer_load_format merge
Add support for merging _IDXEN and _BOTHEN variants of TBUFFER_LOAD_FORMAT instruction.
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0 |
|
#
7ada7bbe |
| 15-Mar-2023 |
Kazu Hirata <kazu@google.com> |
[Target] Use *{Set,Map}::contains (NFC)
|
Revision tags: llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
e0782018 |
| 28-Jan-2023 |
Kazu Hirata <kazu@google.com> |
[Target] Use llvm::count{l,r}_{zero,one} (NFC)
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
caa99a01 |
| 22-Jan-2023 |
Kazu Hirata <kazu@google.com> |
Use llvm::popcount instead of llvm::countPopulation (NFC)
|