|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
b7f44f7c |
| 29-Nov-2022 |
Nicolai Hähnle <nicolai.haehnle@amd.com> |
AMDGPU: Remove ImagePSV and move images to addrspace 7
Following up on the removal of BufferPSV in commit 43b86bf992 ("AMDGPU: Remove BufferPseudoSourceValue").
It is unclear what exactly the right address space for images should be. They seem morally closest to buffers, so that's what I went with. In practical terms, address space 7 is better than address space 0 because it can't alias with LDS.
Differential Revision: https://reviews.llvm.org/D138949
|
|
Revision tags: llvmorg-15.0.6 |
|
| #
43b86bf9 |
| 25-Nov-2022 |
Nicolai Hähnle <nicolai.haehnle@amd.com> |
AMDGPU: Remove BufferPseudoSourceValue
The use of a PSV for buffer intrinsics is misleading because it may be misinterpreted as all buffer intrinsics accessing the same address in memory, which is clearly not true.
Instead, build MachineMemOperands without a pointer value but with an address space, so that address space-based alias analysis can still work.
There is a lot of test churn because previously address space 4 (constant address space) was used as an address space for buffer intrinsics. This doesn't make much sense and seems to have been an accident -- see the change in AMDGPUTargetMachine::getAddressSpaceForPseudoSourceKind.
Differential Revision: https://reviews.llvm.org/D138711
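The idea of a memory operand that carries an address space but no pointer value can be illustrated with a small standalone sketch. This is not LLVM code: `MemOperandSketch`, `SketchAddrSpace` and `mayAlias` are hypothetical names, the numeric value used for LDS is an assumption, and the single disjointness rule (LDS versus buffer) is a toy stand-in for the target's real alias rules.

```cpp
#include <optional>
#include <string>

// Hypothetical address space numbers, for illustration only. The commit
// messages mention address spaces 0 and 7 and LDS; the value chosen for
// LDS here is an assumption.
enum SketchAddrSpace : unsigned {
  Flat = 0,   // generic address space, may point anywhere
  LDS = 3,    // assumed number for LDS
  Buffer = 7, // address space used for buffer (and later image) accesses
};

// A simplified memory operand: the pointer value is optional, the address
// space is always present.
struct MemOperandSketch {
  std::optional<std::string> PointerValue; // empty: no pointer value attached
  unsigned AS;                             // address space of the access
};

// Conservative alias query based only on address spaces. It answers
// "no alias" for the one pair this sketch treats as disjoint and
// "may alias" for everything else.
bool mayAlias(const MemOperandSketch &A, const MemOperandSketch &B) {
  auto IsPair = [&](unsigned X, unsigned Y) {
    return (A.AS == X && B.AS == Y) || (A.AS == Y && B.AS == X);
  };
  if (IsPair(LDS, Buffer))
    return false; // disjoint by address space, no pointer value needed
  return true;    // unknown relationship: stay conservative
}
```

Even without a pointer value, an access tagged with the buffer address space can be separated from an LDS access, whereas an access tagged with address space 0 cannot.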
|
|
Revision tags: llvmorg-15.0.5 |
|
| #
1b560e6a |
| 14-Nov-2022 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU][MC] Support TFE modifiers in MUBUF loads and stores.
Reviewed By: dp, arsenm
Differential Revision: https://reviews.llvm.org/D137783
|
|
Revision tags: llvmorg-15.0.4 |
|
| #
6a748100 |
| 31-Oct-2022 |
Valery Pykhtin <valery.pykhtin@gmail.com> |
[AMDGPU] Fix RP tracker's live registers after processing a memory clause.
It's incorrect to reuse live registers left from the first instruction in a clause after the clause as they don't contain in-clause defs.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D137081
|
|
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
8871c3c5 |
| 27-Jun-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Regenerate MIR checks. NFC.
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
fae05692 |
| 20-May-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert it to a scalar. This should be removed in the near future (I think I converted all of the tests already, but likely missed a few).
Not sure what the exact syntax and policy should be. We can continue printing the number of bytes for non-generic instructions to avoid test churn and only allow non-scalar types for generic instructions.
This will currently print the LLT in parentheses, but accept parsing the existing integers and implicitly converting to scalar. The parentheses are a bit ugly, but the parser logic seems unable to deal without either parentheses or some keyword to indicate the start of a type.
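The dual syntax described above (a parenthesized scalar type, or the older plain byte count converted to a scalar) can be sketched as a small standalone parser. This is a sketch under assumptions, not the MIR parser: `parseMemTypeBits` and `allDigits` are hypothetical helpers, only scalar types of the form `(sN)` are handled, and a byte count is assumed to map to a scalar of eight times that many bits.

```cpp
#include <cctype>
#include <optional>
#include <string>

// Returns true if S is a non-empty, reasonably short string of decimal digits.
static bool allDigits(const std::string &S) {
  if (S.empty() || S.size() > 6)
    return false;
  for (unsigned char C : S)
    if (!std::isdigit(C))
      return false;
  return true;
}

// Hypothetical helper: returns the scalar width in bits for a memory
// operand size token, or std::nullopt if the token is not understood.
//   "(s32)" -> 32   (parenthesized scalar type)
//   "4"     -> 32   (old byte-count syntax, assumed to mean an 8*N-bit scalar)
std::optional<unsigned> parseMemTypeBits(const std::string &Tok) {
  // New syntax: '(' 's' digits ')'
  if (Tok.size() >= 4 && Tok.front() == '(' && Tok.back() == ')' &&
      Tok[1] == 's') {
    std::string Digits = Tok.substr(2, Tok.size() - 3);
    if (!allDigits(Digits))
      return std::nullopt;
    return static_cast<unsigned>(std::stoul(Digits));
  }
  // Old syntax: a bare number of bytes.
  if (allDigits(Tok))
    return static_cast<unsigned>(std::stoul(Tok)) * 8;
  return std::nullopt;
}
```

The parentheses give the parser an unambiguous marker for the start of a type, which is the role the commit message assigns to them.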
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
3bffb1cd |
| 09-Feb-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and, I hope, the amount of code. These operands are mostly 0 anyway.
An additional advantage is that the parser will accept these flags in any order, unlike now.
Differential Revision: https://reviews.llvm.org/D96469
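A minimal standalone sketch of folding separate GLC, SLC and DLC flags into one bitmask, with order-independent parsing, might look as follows. The bit assignments, the lowercase spellings and the `parseCachePolicy` helper are assumptions made for illustration; they are not taken from the AMDGPU backend.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical bit assignments for the three cache policy flags.
enum CachePolicyBit : uint32_t {
  GLC = 1u << 0,
  SLC = 1u << 1,
  DLC = 1u << 2,
};

// Parses a whitespace-separated list of flags in any order into a single
// bitmask. Returns std::nullopt for an unknown or repeated flag.
std::optional<uint32_t> parseCachePolicy(const std::string &Text) {
  std::istringstream In(Text);
  std::string Tok;
  uint32_t Mask = 0;
  while (In >> Tok) {
    uint32_t Bit = 0;
    if (Tok == "glc")
      Bit = GLC;
    else if (Tok == "slc")
      Bit = SLC;
    else if (Tok == "dlc")
      Bit = DLC;
    else
      return std::nullopt; // unknown flag
    if (Mask & Bit)
      return std::nullopt; // flag given twice
    Mask |= Bit;
  }
  return Mask; // 0 when no flags are set, which is the common case
}
```

Because each flag only sets its own bit, `"glc slc"` and `"slc glc"` produce the same mask, and an instruction needs one operand instead of three.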
|
| #
54c0f520 |
| 03-Mar-2021 |
Baptiste Saleil <baptiste.saleil@ibm.com> |
[VirtRegRewriter] Insert missing killed flags when tracking subregister liveness
VirtRegRewriter may sometimes fail to correctly apply the kill flag where necessary, which causes unnecessary code generation on PowerPC. This patch fixes the way masks for defined lanes are computed and the way the mask for used lanes is computed.
Contact albion.fung@ibm.com instead of author for problems related to this commit.
Differential Revision: https://reviews.llvm.org/D92405
|
| #
81b2c23b |
| 12-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use kill instruction to hint soft clause live ranges
Previously we would use a bundle to hint the register allocator to not overwrite the pointers in a sequence of loads to avoid breaking soft clauses. This bundling was based on a fuzzy register pressure heuristic, so we could not guarantee that a bundle would not use more registers than are really available. This would result in the register allocator failing on unsatisfiable bundles. Use a kill to artificially extend the live ranges, so we can always succeed at register allocation even if it means extra spills in the worst case.
This seems to capture most of the benefit of the bundle while avoiding most of the risk presented by the bundle. However, the lit tests do show a handful of regressions. In some cases with sequences of volatile loads, unused load components end up getting reallocated to the next load, which forces a wait between them. There are also a few small scheduling regressions where a hazard used to be avoided, and one spill torture test which for some reason nearly doubles the stack usage. There is also a bit of noise from leftover kills (it may make sense for post-RA pseudos to strip all of these out).
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3 |
|
| #
27093f1a |
| 03-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add regression testcase for bundle pressure issue
This is a somewhat reduced testcase that regressed, causing the revert in 477e3fe4f874b1c4d5896f3bfaf7b3b8a6d38103.
This was producing a bundle that could not be allocated. This is a tricky one to reduce/reproduce, but I do like having some sanity check for this.
|