History log of /llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td (Results 1 – 25 of 166)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5
# 01c9a14c 21-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Define v_mfma_f32_{16x16x128|32x32x64}_f8f6f4 instructions (#116723)

These use a new VOP3PX encoding for the v_mfma_scale_* instructions,
which bundles the pre-scale v_mfma_ld_scale_b32. Non

AMDGPU: Define v_mfma_f32_{16x16x128|32x32x64}_f8f6f4 instructions (#116723)

These use a new VOP3PX encoding for the v_mfma_scale_* instructions,
which bundles the pre-scale v_mfma_ld_scale_b32. None of the modifiers
are supported yet (op_sel, neg or clamp).

I'm not sure the intrinsic should really expose op_sel (or any of the
others). If I'm reading the documentation correctly, we should be able
to just have the raw scale operands and auto-match op_sel to byte
extract patterns.

The op_sel syntax also seems extra horrible in this usage, especially with the
usual assumed op_sel_hi=-1 behavior.

show more ...


Revision tags: llvmorg-19.1.4
# 12409024 31-Oct-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU/GlobalISel: Handle atomic sextload and zextload (#111721)

Atomic loads are handled differently from the DAG, and have separate opcodes
and explicit control over the extensions, like ordinary

AMDGPU/GlobalISel: Handle atomic sextload and zextload (#111721)

Atomic loads are handled differently from the DAG, and have separate opcodes
and explicit control over the extensions, like ordinary loads. Add
new patterns for these.

There's room for cleanup and improvement. d16 cases aren't handled.

Fixes #111645

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# c198f775 09-Oct-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove flat/global fmin/fmax intrinsics (#105642)

These have been replaced with atomicrmw


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4
# ee08d9cb 22-Aug-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove global/flat atomic fadd intrinics (#97051)

These have been replaced with atomicrmw.


# 9d364286 21-Aug-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (#97050)

These are now fully covered by atomicrmw.


Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 70c8b9c2 23-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove ds atomic fadd intrinsics (#95396)

These have been replaced with atomicrmw fadd


# 85200612 18-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Support local atomicrmw fmin/fmax for float/double (#95590)

This has always been supported. Somehow, we ended up with 2
copies of clang builtins for this case, and the newer one
erroneously

AMDGPU: Support local atomicrmw fmin/fmax for float/double (#95590)

This has always been supported. Somehow, we ended up with 2
copies of clang builtins for this case, and the newer one
erroneously requires gfx8-insts.

show more ...


# 16238669 18-Jun-2024 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][MC] Support UC_VERSION_* constants. (#95618)

Our other tools support them, so we want them in LLVM
assembler/disassembler too.


# eda9ff89 18-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Flat instructions do not have signed offsets gfx7-gfx11 (#95852)

Fixes some atomicrmw fadd and intrinsic cases


Revision tags: llvmorg-18.1.8
# 5c9352eb 13-Jun-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

DAG: Replace bitwidth with type in suffix in atomic tablegen ops (#94845)


Revision tags: llvmorg-18.1.7
# 4ae896fe 21-May-2024 jofrn <jofernau@amd.com>

[AMDGPU] HasOneUse uses (#92534)

#91578 implements `HasOneUse` predicate on `PatFrag`, so this commit
uses it within AMDGPU.


Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2
# ceb744eb 13-Mar-2024 Harald van Dijk <harald@gigawatt.nl>

[AMDGPU] Fix canonicalization of truncated values. (#83054)

We were relying on roundings to implicitly canonicalize, which is
generally safe, except with roundings that may be optimized away.

Fi

[AMDGPU] Fix canonicalization of truncated values. (#83054)

We were relying on roundings to implicitly canonicalize, which is
generally safe, except with roundings that may be optimized away.

Fixes #82937.

show more ...


Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# f122268c 20-Feb-2024 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][NFC] Extend PredicateControl to support True16 predicates. (#82245)

Using OtherPredicates for True16 predicates is often problematic due to
interference with other kinds of predicates, par

[AMDGPU][NFC] Extend PredicateControl to support True16 predicates. (#82245)

Using OtherPredicates for True16 predicates is often problematic due to
interference with other kinds of predicates, particularly when this
overrides predicates inherited from pseudo instructions.

show more ...


Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# c99da46f 17-Jan-2024 Mariusz Sikora <mariusz.sikora@amd.com>

[AMDGPU][GFX12] Add Atomic cond_sub_u32 (#76224)

Co-authored-by: Vang Thao <Vang.Thao@amd.com>


# cf025c76 02-Jan-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] GFX12 global_atomic_ordered_add_b64 instruction and intrinsic (#76149)


# 07a6d736 15-Dec-2023 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] CodeGen for GFX12 VFLAT, VSCRATCH and VGLOBAL instructions (#75493)


# 6eec8013 13-Dec-2023 Piotr Sobczak <piotr.sobczak@amd.com>

[AMDGPU] Min/max changes for GFX12 (#75214)

Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5
# c9cdaffe 10-Nov-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU] Fix operand definitions for atomic scalar memory instructions. (#71799)

CPol and CPol_GLC1 operand classes have identical predicates, which
means AsmParser cannot differentiate between the

[AMDGPU] Fix operand definitions for atomic scalar memory instructions. (#71799)

CPol and CPol_GLC1 operand classes have identical predicates, which
means AsmParser cannot differentiate between the RTN and non-RTN
variants of the instructions. When it currently selects the wrong
instruction, a hack in cvtSMEMAtomic() corrects the op-code. Using the
new predicated-value operands makes this hack and the whole conversion
function not needed.

Other uses of CPol_GLC1 operands are to be addressed separately.

Resolves about half of the remaining ~1000 pairs of ambiguous
instructions.

Part of <https://github.com/llvm/llvm-project/issues/69256>.

show more ...


Revision tags: llvmorg-17.0.4
# 509b5708 17-Oct-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][AsmParser] Eliminate custom predicates for named-bit operands. (#69243)

isGDS() and isTFE() need special treatment, because they may be both
named-bit and token operands.

Part of #62629.


Revision tags: llvmorg-17.0.3
# 720be6c5 11-Oct-2023 Stephen Thomas <104134586+stepthomas@users.noreply.github.com>

[AMDGPU] Add encoding/decoding support for non-result-returning ATOMIC_CSUB instructions (#68684)

The BUFFER_ATOMIC_CSUB and GLOBAL_ATOMIC_CSUB instructions have
encodings for
non-value-returning

[AMDGPU] Add encoding/decoding support for non-result-returning ATOMIC_CSUB instructions (#68684)

The BUFFER_ATOMIC_CSUB and GLOBAL_ATOMIC_CSUB instructions have
encodings for
non-value-returning forms, although actually using them isn't supported
by
hardware. However, these encodings aren't supported by the backend,
meaning
that they can't even be assembled or disassembled.

Add support for the non-returning encodings, but gate actually using
them
in instruction selection behind a new feature
FeatureAtomicCSubNoRtnInsts,
which no target uses. This does allow the non-returning instructions to
be
tested manually and llvm.amdgcn.atomic.csub.ll is extended to cover
them.
The feature does not gate assembling or disassembling them, this is now
not an error, and encoding and decoding tests have been adapted
accordingly.

show more ...


Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# ad9d13d5 04-Apr-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

SelectionDAG: Swap operands of atomic_store

Irritatingly, atomic_store had operands in the opposite order from
regular store. This made it difficult to share patterns between
regular and atomic stor

SelectionDAG: Swap operands of atomic_store

Irritatingly, atomic_store had operands in the opposite order from
regular store. This made it difficult to share patterns between
regular and atomic stores.

There was a previous incomplete attempt to move atomic_store into the
regular StoreSDNode which would be better.

I think it was a mistake for all atomicrmw to swap the operand order,
so maybe it's better to take this one step further.

https://reviews.llvm.org/D123143

show more ...


# fbe4ff81 14-Jun-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Partially fix not respecting dynamic denormal mode

The most notable issue was producing v_mad_f32 in functions with the
dynamic mode, since it just ignores the mode. fdiv lowering is still
s

AMDGPU: Partially fix not respecting dynamic denormal mode

The most notable issue was producing v_mad_f32 in functions with the
dynamic mode, since it just ignores the mode. fdiv lowering is still
somewhat broken because it involves a mode switch and we need to query
the original mode.

show more ...


# 12460cf9 05-Jul-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][AsmParser] Simplify the implementation of SWZ operands.

Those are implicit helper operands and therefore don't need any parsers
or printers.

Part of <https://github.com/llvm/llvm-project/i

[AMDGPU][AsmParser] Simplify the implementation of SWZ operands.

Those are implicit helper operands and therefore don't need any parsers
or printers.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: piotr, foad

Differential Revision: https://reviews.llvm.org/D154432

show more ...


# e95457d2 15-Jun-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][AsmParser][NFC] Simplify v_interp-related operand definitions.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm

[AMDGPU][AsmParser][NFC] Simplify v_interp-related operand definitions.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D152897

show more ...


# 024e5408 14-Jun-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][AsmParser][NFC] Get rid of custom default operand handlers.

Removes the need to add and remove them manually depending on whether
they are used in cvt*() functions. Also removes the compile

[AMDGPU][AsmParser][NFC] Get rid of custom default operand handlers.

Removes the need to add and remove them manually depending on whether
they are used in cvt*() functions. Also removes the compiler warnings
about unused handlers when it happens to be the case.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D151688

show more ...


1234567