SILoadStoreOptimizer.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2
# 8d13e7b8	03-Oct-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Qualify auto. NFC. (#110878) Generated automatically with: $ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# 7a30b9c0	11-Sep-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Make more use of getWaveMaskRegClass. NFC. (#108186)
Revision tags: llvmorg-19.1.0-rc4
# cd3667d1	02-Sep-2024	Craig Topper <craig.topper@sifive.com>	[CodeGen] Update a few places that were passing Register to raw_ostream::operator<< (#106877) These would implicitly cast the register to `unsigned`. Switch most of them to use printReg will give a more readable output. Change some others to use Register::id() so we can eventually remove the implicit cast to `unsigned`.
# da137541	02-Sep-2024	Akshat Oke <76596238+Akshat-Oke@users.noreply.github.com>	AMDGPU/NewPM Port SILoadStoreOptimizer to NPM (#106362)
Revision tags: llvmorg-19.1.0-rc3
# 273e0a4c	12-Aug-2024	Tim Gymnich <tgymnich@icloud.com>	[AMDGPU] add missing checks in processBaseWithConstOffset (#102310) fixes https://github.com/llvm/llvm-project/issues/102231 by inserting missing checks.
# 37d7b06d	06-Aug-2024	Christudasan Devadasan <christudasan.devadasan@amd.com>	[AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (#101619) Use the constrained buffer load opcodes while combining under-aligned loads for XNACK enabled subtargets.
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# a1d7da05	23-Jul-2024	Christudasan Devadasan <christudasan.devadasan@amd.com>	[AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (#96162) Consider the constrained multi-dword loads while merging individual loads to a single multi-dword load.
Revision tags: llvmorg-18.1.8
# c771b670	06-Jun-2024	Stanislav Mekhanoshin <rampitec@users.noreply.github.com>	[AMDGPU] Promote immediate offset to atomics (#94043)
Revision tags: llvmorg-18.1.7
# fc21387b	31-May-2024	Stanislav Mekhanoshin <rampitec@users.noreply.github.com>	[AMDGPU] Enable constant offset promotion to immediate FLAT (#93884) Currently it is only supported for FLAT Global.
# 215f92b9	30-May-2024	Stanislav Mekhanoshin <rampitec@users.noreply.github.com>	[AMDGPU] Fix crash in the SILoadStoreOptimizer (#93862) It does not properly handle situation when address calculation uses V_ADDC_U32 0, 0, carry-in (i.e. with both src0 and src1 immediates).
Revision tags: llvmorg-18.1.6
# 11f76b85	02-May-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Use some merging/unmerging helpers in SILoadStoreOptimizer (#90866) Factor out copyToDestRegs and copyFromSrcRegs for merging store sources and unmerging load results. NFC.
# e020e287	02-May-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Modernize some syntax in SILoadStoreOptimizer. NFC. Use structured bindings and similar.
Revision tags: llvmorg-18.1.5
# 0606747c	01-May-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Remove some pointless fallthrough annotations
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3
# 06cfbe3c	25-Mar-2024	David Stuttard <david.stuttard@amd.com>	[AMDPU] Add support for idxen and bothen buffer load/store merging in SILoadStoreOptimizer (#86285) Added more buffer instruction merging support
Revision tags: llvmorg-18.1.2
# 601e102b	17-Mar-2024	David Green <david.green@arm.com>	[CodeGen] Use LocationSize for MMO getSize (#84751) This is part of #70452 that changes the type used for the external interface of MMO to LocationSize as opposed to uint64_t. This means the constructors take LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The getSize methods return a LocationSize. This allows us to be more precise with unknown sizes, not accidentally treating them as unsigned values, and in the future should allow us to add proper scalable vector support but none of that is included in this patch. It should mostly be an NFC. Global ISel is still expected to use the underlying LLT as it needs, and are not expected to see unknown sizes for generic operations. Most of the changes are hopefully fairly mechanical, adding a lot of getValue() calls and protecting them with hasValue() where needed.
Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 5879162f	15-Dec-2023	Mirko Brkušanin <Mirko.Brkusanin@amd.com>	[AMDGPU] CodeGen for GFX12 VBUFFER instructions (#75492)
# 26b14aed	15-Dec-2023	Mirko Brkušanin <Mirko.Brkusanin@amd.com>	[AMDGPU] CodeGen for GFX12 VIMAGE and VSAMPLE instructions (#75488)
# a278ac57	15-Dec-2023	Mirko Brkušanin <Mirko.Brkusanin@amd.com>	[AMDGPU] CodeGen for SMEM instructions (#75579)
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 4fa8a548	11-Aug-2023	Konrad Kusiak <konrad.kusiak@codeplay>	[AMDGPU] Add sanity check that fixes bad shift operation in AMD backend There is a problem with the SILoadStoreOptimizer::dmasksCanBeCombined() function that can lead to UB. This boolean function decides if two masks can be combined into 1. The idea here is that the bits which are "on" in one mask, don't overlap with the "on" bits of the other. Consider an example (10 bits for simplicity): Mask 1: 0101101000 Mask 2: 0000000110 Those can be combined into a single mask: 0101101110. To check if such an operation is possible, the code takes the mask which is greater and counts how many 0s there are, starting from the LSB and stopping at the first 1. Then, it shifts 1u by this number and compares it with the smaller mask. The problem is that when both masks are 0, the counter will find 32 zeroes in the first mask and will try to do a shift by 32 positions which leads to UB. The fix is a simple sanity check, if the bigger mask is 0 or not. https://reviews.llvm.org/D155051
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# c68c6c56	21-Jun-2023	Jay Foad <jay.foad@amd.com>	[AMDGPU] Minor refactoring in SILoadStoreOptimizer::offsetsCanBeCombined
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3
# 0c13e0b7	25-Apr-2023	Jay Foad <jay.foad@amd.com>	[AMDGPU] Do not handle _SGPR SMEM instructions in SILoadStoreOptimizer After D147334 we never select _SGPR forms of SMEM instructions on subtargets that also support the _SGPR_IMM form, so there is no need to handle them here. Differential Revision: https://reviews.llvm.org/D149139
Revision tags: llvmorg-16.0.2
# f6e70ed1	07-Apr-2023	mmarjano <mmarjano@amd.com>	[AMDGPU] Extend tbuffer_load_format merge Add support for merging _IDXEN and _BOTHEN variants of TBUFFER_LOAD_FORMAT instruction.
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0
# 7ada7bbe	15-Mar-2023	Kazu Hirata <kazu@google.com>	[Target] Use *{Set,Map}::contains (NFC)
Revision tags: llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# e0782018	28-Jan-2023	Kazu Hirata <kazu@google.com>	[Target] Use llvm::count{l,r}_{zero,one} (NFC)
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# caa99a01	22-Jan-2023	Kazu Hirata <kazu@google.com>	Use llvm::popcount instead of llvm::countPopulation(NFC)
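
The mask check described in the 4fa8a548 entry above can be sketched roughly as follows. This is a stand-alone illustration, not the code from SILoadStoreOptimizer.cpp: the names `masksCanBeCombined` and `countTrailingZeros32` are hypothetical, and both the direction of the final comparison and the choice to return false when the larger mask is zero are assumptions inferred from the commit description.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: number of zero bits below the lowest set bit.
// Precondition: V != 0 (the loop would not terminate otherwise).
static unsigned countTrailingZeros32(uint32_t V) {
  unsigned N = 0;
  while ((V & 1u) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Hypothetical sketch of the check: the two masks are treated as
// combinable when every set bit of the smaller mask lies below the
// lowest set bit of the larger mask.
static bool masksCanBeCombined(uint32_t A, uint32_t B) {
  uint32_t MaxMask = std::max(A, B);
  uint32_t MinMask = std::min(A, B);

  // Sanity check from the commit description: if the larger mask is 0,
  // the trailing-zero count would be 32 and `1u << 32` is undefined
  // behaviour. Rejecting this case is an assumption of this sketch.
  if (MaxMask == 0)
    return false;

  unsigned AllowedBits = countTrailingZeros32(MaxMask); // at most 31 here
  return (1u << AllowedBits) > MinMask;
}
```

With the 10-bit example from the commit message, `masksCanBeCombined(0b0101101000, 0b0000000110)` counts three trailing zeros in the larger mask and compares `1u << 3` with 6, so the masks are accepted; passing two zero masks is rejected by the guard before any shift happens.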