Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
457f3024 |
| 08-Jan-2025 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Disallow null for more resource operands (#121941)
Following on from #115200, disallow the null sgpr as a resource operand
in some instructions that were missed.
|
#
b2adeae8 |
| 03-Jan-2025 |
Jun Wang <jwang86@yahoo.com> |
[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)
For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s
[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)
For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s_load_dwordx4).
This patch fixes this problem while ensuring null cannot be used as S#,
T#, or V#.
show more ...
|
Revision tags: llvmorg-19.1.6 |
|
#
6b223260 |
| 11-Dec-2024 |
Sergei Barannikov <barannikov88@gmail.com> |
[TableGen] Replace WantRoot/WantParent SDNode properties with flags (#119599)
These properties are only valid on ComplexPatterns. Having them as flags is more convenient because one can now use "let
[TableGen] Replace WantRoot/WantParent SDNode properties with flags (#119599)
These properties are only valid on ComplexPatterns. Having them as flags is more convenient because one can now use "let = ... in" syntax to set these flags on several patterns at a time. This is also less error-prone as it makes it impossible to specify these properties on records derived from SDPatternOperator.
Pull Request: https://github.com/llvm/llvm-project/pull/119599
show more ...
|
Revision tags: llvmorg-19.1.5 |
|
#
7fc71f79 |
| 26-Nov-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Support buffer_atomic_pk_add_bf16 for gfx950 (#117599)
Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>
|
#
1b792252 |
| 20-Nov-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove hasPostISelHook for atomics. NFC. (#116791)
This is not required since 2147b6c89d44 changed that way that no-ret
atomic ops are selected.
|
Revision tags: llvmorg-19.1.4 |
|
#
92703280 |
| 19-Nov-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Handle gfx950 96/128-bit buffer_load_lds (#116681)
Enforcing this limit in the clang builtin will come later.
|
#
550501f2 |
| 01-Nov-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Simplify GFX12 VBUFFER definitions. NFC. (#114403)
For GFX12 hasTFE is always true because it does not have the buffer load
to LDS instructions.
|
#
12409024 |
| 31-Oct-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU/GlobalISel: Handle atomic sextload and zextload (#111721)
Atomic loads are handled differently from the DAG, and have separate opcodes and explicit control over the extensions, like ordinary
AMDGPU/GlobalISel: Handle atomic sextload and zextload (#111721)
Atomic loads are handled differently from the DAG, and have separate opcodes and explicit control over the extensions, like ordinary loads. Add new patterns for these.
There's room for cleanup and improvement. d16 cases aren't handled.
Fixes #111645
show more ...
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
5927c674 |
| 23-Sep-2024 |
Jun Wang <jwang86@yahoo.com> |
[AMDGPU][MC] Instructions not to be supported in GFX940 (#109225)
Buffer_store_lds_dword, buffer_wbinvl1, and buffer_wbinvl1_vol are
obsolete in GFX940 and should not be supported.
|
Revision tags: llvmorg-19.1.0 |
|
#
935b9f62 |
| 11-Sep-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Make use of multiclass inheritance. NFC.
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1 |
|
#
9398cc2e |
| 25-Jul-2024 |
Acim Maravic <Acim.Maravic@Syrmia.com> |
[LLVM][AMDGPU] Copy isConvergent from Pseudo to Real instructions (#99658)
This patch copies the flag isConvergent from pseudo instructions to the
corresponding real instructions, so that isConverg
[LLVM][AMDGPU] Copy isConvergent from Pseudo to Real instructions (#99658)
This patch copies the flag isConvergent from pseudo instructions to the
corresponding real instructions, so that isConvergent flag is also
defined for real instructions.
Flags are not required by the compiler, but for consistency it would be
nice to have them.
Co-authored-by: Acim Maravic <Acim.Maravic@amd.com>
show more ...
|
Revision tags: llvmorg-20-init |
|
#
2ef4f863 |
| 10-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add subtarget feature for memory atomic fadd f64 (#96444)
|
#
889f3c57 |
| 25-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Handle legal v2bf16 atomicrmw fadd for gfx12 (#95930)
Annoyingly gfx90a/940 support this for global/flat but not buffer.
|
#
a440a96e |
| 23-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (#95592)
Define subtarget features for atomic fmin/fmax support.
The flat/global support is a real messe. We had float/double support at the
AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (#95592)
Define subtarget features for atomic fmin/fmax support.
The flat/global support is a real messe. We had float/double support at the beginning in gfx6 and gfx7. gfx8 removed these. gfx10 reintroduced them. gfx11 removed the f64 versions again.
gfx9 partially reintroduced them, in gfx90a and gfx940 but only for f64.
show more ...
|
#
b9c7d60a |
| 21-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Start fixing inconsistencies in usage of SubtargetPredicate (#96337)
SubtargetPredicate should be the primary "does this instruction exist" predicate, with OtherPredicates used for other sid
AMDGPU: Start fixing inconsistencies in usage of SubtargetPredicate (#96337)
SubtargetPredicate should be the primary "does this instruction exist" predicate, with OtherPredicates used for other side pieces of information.
Changes like 856d1c4410 were backwards. The problematic usage is how GFX12 is using HasRestrictedOffset. The multiclasses for buffers should probably be split up instead of hiding OtherPredicates inside the buffer atomic multiclasses. The two cases are mutually exclusive and really need a negated predicate for the not-gfx12 case.
It's pretty terrible we have to manage this in the first place. TableGen should be able to figure out the required predicates from any instructions that appear in the pattern output.
show more ...
|
#
5d6d2fc0 |
| 21-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix overriding SubtargetPredicate in MUBUF_Real_gfx90a (#96351)
|
#
9f8e7c3a |
| 18-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Create pseudo to real mapping for flat/buffer atomic fmin/fmax (#95591)
The global/flat/buffer atomic fmin/fmax situation is a mess. These instructions have been renamed 3 times. We currentl
AMDGPU: Create pseudo to real mapping for flat/buffer atomic fmin/fmax (#95591)
The global/flat/buffer atomic fmin/fmax situation is a mess. These instructions have been renamed 3 times. We currently have separate pseudos defined for the same opcodes with the different names (e.g. GLOBAL_ATOMIC_MIN_F64 from gfx90a and GLOBAL_ATOMIC_FMIN_X2 from gfx10).
Use the _FMIN versions as the canonical name for the f32 versions. Use the _MIN_F64 style as the canonical name for the f64 case. This is because gfx90a has the most sensible names, but does not have the f32 versions.t sho
Wire through the pseudo to use for the instruction properties vs. the assembly name like in other cases. This will simplify handling of direct atomicrmw selection.
This will simplify directly selecting these from atomicrmw.
show more ...
|
#
8930ac1b |
| 17-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Cleanup selection patterns for buffer loads (#95378)
We should just support these for all register types.
|
#
3b997294 |
| 17-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove .v2bf16 buffer atomic fadd intrinsics (#95783)
These are redundant with the unsuffixed versions, and have a name collision with surprising behavior when the base intrinsic is used wit
AMDGPU: Remove .v2bf16 buffer atomic fadd intrinsics (#95783)
These are redundant with the unsuffixed versions, and have a name collision with surprising behavior when the base intrinsic is used with v2bf16.
The global and flat variants should be removed too, but those are complicated due to using v2i16 in place of the natural v2bf16. Those cases can soon be completely deleted in favor of atomicrmw.
The GlobalISel codegen change is broken and substitutes handling as bf16 for handling as f16, but it's a bug that this passed the IRTranslator in the first place.
show more ...
|
Revision tags: llvmorg-18.1.8 |
|
#
7e3e9d43 |
| 14-Jun-2024 |
Joe Nash <joseph.nash@amd.com> |
[AMDGPU] Change getLdStRegisterOperand to !cond for better diagnostic (#95475)
If you would hit the unexpected case in these !if trees, you'd get an
error message like "error: Not a known RegisterC
[AMDGPU] Change getLdStRegisterOperand to !cond for better diagnostic (#95475)
If you would hit the unexpected case in these !if trees, you'd get an
error message like "error: Not a known RegisterClass! def VReg_1..."
This can happen when changing code quite indirectly related to these
class definitions. We can use !cond here, which has a builtin facility
to throw an error if no case in the !cond statement is hit.
NFC.
show more ...
|
#
c0ff36ea |
| 13-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix buffer intrinsic handling for various 16-bit elements. (#95376)
Mostly fixes handling of bfloat vectors, but also some missing i16 cases.
|
#
5c9352eb |
| 13-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Replace bitwidth with type in suffix in atomic tablegen ops (#94845)
|
#
935d3773 |
| 13-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix using wrong memory type for non-image resource intrinsics (#94911)
An 8 x i16 raw load was incorrectly using a 64-bit memory type, which
would assert in the MachineMemOperand constructo
AMDGPU: Fix using wrong memory type for non-image resource intrinsics (#94911)
An 8 x i16 raw load was incorrectly using a 64-bit memory type, which
would assert in the MachineMemOperand constructor.
This is preparation for a cleanup which will make the buffer intrinsics
work for all legal types.
show more ...
|
#
9890f943 |
| 13-Jun-2024 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU][GFX12] Support disassembling MUBUF instructions with arbitrary FORMAT values. (#95243)
Some tools generate such instructions with the FORMAT field set to 0,
which corresponds to buf_fmt_in
[AMDGPU][GFX12] Support disassembling MUBUF instructions with arbitrary FORMAT values. (#95243)
Some tools generate such instructions with the FORMAT field set to 0,
which corresponds to buf_fmt_invalid, but that should not prevent them
from being recognised on decoding.
show more ...
|
#
dd7540f3 |
| 12-Jun-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Handle buffer load/store for 64-bit element types
Note pointers still don't work correctly.
|