History log of /llvm-project/llvm/lib/Target/AMDGPU/BUFInstructions.td (Results 1 – 25 of 232)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 457f3024 08-Jan-2025 Jay Foad <jay.foad@amd.com>

[AMDGPU] Disallow null for more resource operands (#121941)

Following on from #115200, disallow the null sgpr as a resource operand
in some instructions that were missed.


# b2adeae8 03-Jan-2025 Jun Wang <jwang86@yahoo.com>

[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)

For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s

[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)

For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s_load_dwordx4).
This patch fixes this problem while ensuring null cannot be used as S#,
T#, or V#.

show more ...


Revision tags: llvmorg-19.1.6
# 6b223260 11-Dec-2024 Sergei Barannikov <barannikov88@gmail.com>

[TableGen] Replace WantRoot/WantParent SDNode properties with flags (#119599)

These properties are only valid on ComplexPatterns. Having them as flags
is more convenient because one can now use "let

[TableGen] Replace WantRoot/WantParent SDNode properties with flags (#119599)

These properties are only valid on ComplexPatterns. Having them as flags
is more convenient because one can now use "let = ... in" syntax to set
these flags on several patterns at a time. This is also less error-prone
as it makes it impossible to specify these properties on records derived
from SDPatternOperator.

Pull Request: https://github.com/llvm/llvm-project/pull/119599

show more ...


Revision tags: llvmorg-19.1.5
# 7fc71f79 26-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Support buffer_atomic_pk_add_bf16 for gfx950 (#117599)

Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>


# 1b792252 20-Nov-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove hasPostISelHook for atomics. NFC. (#116791)

This is not required since 2147b6c89d44 changed that way that no-ret
atomic ops are selected.


Revision tags: llvmorg-19.1.4
# 92703280 19-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Handle gfx950 96/128-bit buffer_load_lds (#116681)

Enforcing this limit in the clang builtin will come later.


# 550501f2 01-Nov-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Simplify GFX12 VBUFFER definitions. NFC. (#114403)

For GFX12 hasTFE is always true because it does not have the buffer load
to LDS instructions.


# 12409024 31-Oct-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU/GlobalISel: Handle atomic sextload and zextload (#111721)

Atomic loads are handled differently from the DAG, and have separate opcodes
and explicit control over the extensions, like ordinary

AMDGPU/GlobalISel: Handle atomic sextload and zextload (#111721)

Atomic loads are handled differently from the DAG, and have separate opcodes
and explicit control over the extensions, like ordinary loads. Add
new patterns for these.

There's room for cleanup and improvement. d16 cases aren't handled.

Fixes #111645

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# 5927c674 23-Sep-2024 Jun Wang <jwang86@yahoo.com>

[AMDGPU][MC] Instructions not to be supported in GFX940 (#109225)

Buffer_store_lds_dword, buffer_wbinvl1, and buffer_wbinvl1_vol are
obsolete in GFX940 and should not be supported.


Revision tags: llvmorg-19.1.0
# 935b9f62 11-Sep-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Make use of multiclass inheritance. NFC.


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# 9398cc2e 25-Jul-2024 Acim Maravic <Acim.Maravic@Syrmia.com>

[LLVM][AMDGPU] Copy isConvergent from Pseudo to Real instructions (#99658)

This patch copies the flag isConvergent from pseudo instructions to the
corresponding real instructions, so that isConverg

[LLVM][AMDGPU] Copy isConvergent from Pseudo to Real instructions (#99658)

This patch copies the flag isConvergent from pseudo instructions to the
corresponding real instructions, so that isConvergent flag is also
defined for real instructions.

Flags are not required by the compiler, but for consistency it would be
nice to have them.

Co-authored-by: Acim Maravic <Acim.Maravic@amd.com>

show more ...


Revision tags: llvmorg-20-init
# 2ef4f863 10-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add subtarget feature for memory atomic fadd f64 (#96444)


# 889f3c57 25-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Handle legal v2bf16 atomicrmw fadd for gfx12 (#95930)

Annoyingly gfx90a/940 support this for global/flat but not buffer.


# a440a96e 23-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (#95592)

Define subtarget features for atomic fmin/fmax support.

The flat/global support is a real messe. We had float/double support at
the

AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (#95592)

Define subtarget features for atomic fmin/fmax support.

The flat/global support is a real messe. We had float/double support at
the beginning in gfx6 and gfx7. gfx8 removed these. gfx10 reintroduced them.
gfx11 removed the f64 versions again.

gfx9 partially reintroduced them, in gfx90a and gfx940 but only for f64.

show more ...


# b9c7d60a 21-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Start fixing inconsistencies in usage of SubtargetPredicate (#96337)

SubtargetPredicate should be the primary "does this instruction exist"
predicate, with OtherPredicates used for other sid

AMDGPU: Start fixing inconsistencies in usage of SubtargetPredicate (#96337)

SubtargetPredicate should be the primary "does this instruction exist"
predicate, with OtherPredicates used for other side pieces of information.

Changes like 856d1c4410 were backwards. The problematic usage is how
GFX12 is using HasRestrictedOffset. The multiclasses for buffers
should probably be split up instead of hiding OtherPredicates inside
the buffer atomic multiclasses. The two cases are mutually exclusive
and really need a negated predicate for the not-gfx12 case.

It's pretty terrible we have to manage this in the first place.
TableGen should be able to figure out the required predicates
from any instructions that appear in the pattern output.

show more ...


# 5d6d2fc0 21-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix overriding SubtargetPredicate in MUBUF_Real_gfx90a (#96351)


# 9f8e7c3a 18-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Create pseudo to real mapping for flat/buffer atomic fmin/fmax (#95591)

The global/flat/buffer atomic fmin/fmax situation is a mess. These
instructions have been renamed 3 times. We currentl

AMDGPU: Create pseudo to real mapping for flat/buffer atomic fmin/fmax (#95591)

The global/flat/buffer atomic fmin/fmax situation is a mess. These
instructions have been renamed 3 times. We currently have
separate pseudos defined for the same opcodes with the different names
(e.g. GLOBAL_ATOMIC_MIN_F64 from gfx90a and GLOBAL_ATOMIC_FMIN_X2 from gfx10).

Use the _FMIN versions as the canonical name for the f32 versions. Use the
_MIN_F64 style as the canonical name for the f64 case. This is because
gfx90a has the most sensible names, but does not have the f32 versions.t sho

Wire through the pseudo to use for the instruction properties vs. the assembly
name like in other cases. This will simplify handling of direct atomicrmw selection.

This will simplify directly selecting these from atomicrmw.

show more ...


# 8930ac1b 17-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Cleanup selection patterns for buffer loads (#95378)

We should just support these for all register types.


# 3b997294 17-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove .v2bf16 buffer atomic fadd intrinsics (#95783)

These are redundant with the unsuffixed versions, and have a name
collision with surprising behavior when the base intrinsic is used wit

AMDGPU: Remove .v2bf16 buffer atomic fadd intrinsics (#95783)

These are redundant with the unsuffixed versions, and have a name
collision with surprising behavior when the base intrinsic is used with
v2bf16.

The global and flat variants should be removed too, but those are complicated
due to using v2i16 in place of the natural v2bf16. Those cases can soon be
completely deleted in favor of atomicrmw.

The GlobalISel codegen change is broken and substitutes handling as bf16
for handling as f16, but it's a bug that this passed the IRTranslator in the first
place.

show more ...


Revision tags: llvmorg-18.1.8
# 7e3e9d43 14-Jun-2024 Joe Nash <joseph.nash@amd.com>

[AMDGPU] Change getLdStRegisterOperand to !cond for better diagnostic (#95475)

If you would hit the unexpected case in these !if trees, you'd get an
error message like "error: Not a known RegisterC

[AMDGPU] Change getLdStRegisterOperand to !cond for better diagnostic (#95475)

If you would hit the unexpected case in these !if trees, you'd get an
error message like "error: Not a known RegisterClass! def VReg_1..."
This can happen when changing code quite indirectly related to these
class definitions. We can use !cond here, which has a builtin facility
to throw an error if no case in the !cond statement is hit.

NFC.

show more ...


# c0ff36ea 13-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix buffer intrinsic handling for various 16-bit elements. (#95376)

Mostly fixes handling of bfloat vectors, but also some missing
i16 cases.


# 5c9352eb 13-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

DAG: Replace bitwidth with type in suffix in atomic tablegen ops (#94845)


# 935d3773 13-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix using wrong memory type for non-image resource intrinsics (#94911)

An 8 x i16 raw load was incorrectly using a 64-bit memory type, which
would assert in the MachineMemOperand constructo

AMDGPU: Fix using wrong memory type for non-image resource intrinsics (#94911)

An 8 x i16 raw load was incorrectly using a 64-bit memory type, which
would assert in the MachineMemOperand constructor.

This is preparation for a cleanup which will make the buffer intrinsics
work for all legal types.

show more ...


# 9890f943 13-Jun-2024 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][GFX12] Support disassembling MUBUF instructions with arbitrary FORMAT values. (#95243)

Some tools generate such instructions with the FORMAT field set to 0,
which corresponds to buf_fmt_in

[AMDGPU][GFX12] Support disassembling MUBUF instructions with arbitrary FORMAT values. (#95243)

Some tools generate such instructions with the FORMAT field set to 0,
which corresponds to buf_fmt_invalid, but that should not prevent them
from being recognised on decoding.

show more ...


# dd7540f3 12-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Handle buffer load/store for 64-bit element types

Note pointers still don't work correctly.


12345678910