History log of /llvm-project/llvm/lib/Target/AMDGPU/VOP2Instructions.td (Results 1 – 25 of 226)
Revision Date Author Comments
Revision tags: llvmorg-21-init
# 8a0c2e75 16-Jan-2025 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC][CodeGen] true16 for v_cndmask_b16 (#119736)

Support true16 format for v_cndmask_b16 in MC and CodeGen in true16 and
fake16 flow.

Since we are replacing `v_cndmask_b16` with `v_cndmask_b16_t16/fake16`, we
have to at least update the fake16 CodeGen to get the CodeGen tests passing.
In this case, we have to update true16 and fake16 together;
otherwise some of the true16 tests will fail.


Revision tags: llvmorg-19.1.7
# 0f3aeca1 13-Jan-2025 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][CodeGen] Update and/or/xor codegen pattern for i16 (#121835)

In true16 flow, remove and/or/xor 32bit patterns for i16


# c3241a9a 18-Dec-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] test update for v_subrev_f16 in true16 (#119315)

This is an NFC change. Update the MC test for v_subrev_f16 in true16 format.

The MC source change was made by a previous patch and is automatically
enabled by the t16 pseudo.


# 5270e63c 18-Dec-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] test update for v_ldexp_f16 in true16 (#119313)

This is an NFC change. Update the MC test for v_ldexp_f16 in true16 format.

The MC source change was made by a previous patch and is automatically
enabled by the t16 pseudo.


# f9a9173b 17-Dec-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] test update for v_mul_f16 in true16 (#119314)

This is an NFC change. Update the MC test for v_mul_f16 in true16 format.

The MC source change was made by a previous patch and is automatically
enabled by the t16 pseudo.


# 8bbbcadd 17-Dec-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] test update for v_max_f16/v_min_f16 in true16 (#119291)

This is an NFC change. Update the MC test for v_max/min_f16 in true16 format.

The MC source change was made by a previous patch and is automatically
enabled by the t16 pseudo.


Revision tags: llvmorg-19.1.6
# cbed714f 09-Dec-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] test update for v_add/sub_f16 in true16 (#118926)

This is an NFC change. Update the MC test for v_add/sub_f16 in true16 format.

The MC source change was made by a previous patch and is automatically
enabled by the t16 pseudo.


# f9d6d46a 09-Dec-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add assembler/disassembler support for v_dual_dot2acc_f32_bf16 (#118984)

There is still no codegen support because the corresponding
v_dot2c_f32_bf16 instruction is not supported on GFX11.


Revision tags: llvmorg-19.1.5
# 716364eb 26-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add support for v_dot2c_f32_bf16 instruction for gfx950 (#117598)

The encoding of the v_dot2c_f32_bf16 opcode is the same as v_mac_f32 in gfx90a;
both are from the gfx9 series. This required a new decoderNameSpace, GFX950_DOT.

Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>


# 9fb01fcd 20-Nov-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][MC][True16] Support VOP2 instructions with true16 format (#115233)

Support true16 format for VOP2 instructions in MC

This patch updates the true16 and fake16 vop_profile for the following
instructions and updates the asm/dasm tests:
v_fmac_f16
v_fmamk_f16
v_fmaak_f16

It seems the vop2_t16_promote.s files were not updated with the true16 flag
in the previous batch update. They will be updated separately.


Revision tags: llvmorg-19.1.4
# b7d635ed 18-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Copy correct predicates for SDWA reals (#116288)

There are a lot of messes in the special case
predicate handling. Currently broad let blocks
override specific predicates with more general
cases. For instructions with SDWA, the HasSDWA
predicate was overriding the SubtargetPredicate
for the instruction.

This fixes enough to properly disallow new instructions
that support SDWA on older targets.


# e8644e3b 05-Nov-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] VOP2 update instructions with fake16 format (#114436)

Some old "t16" VOP2 instructions are actually in fake16 format. Correct
and update test file


Revision tags: llvmorg-19.1.3
# b3acb257 22-Oct-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Don't rely on !eq comparing int with bits<5>. NFC. (#113279)

Tweak VOP2eInst_Base so that it does not rely on !eq comparing an int
value (-1) with a bits<5> value. This is to avoid a change in behaviour
when #112904 lands, which is a bug fix which has the side effect of
implicitly casting template arguments to the declared template parameter
type.


# 7b4c8b35 16-Oct-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] VOP3 profile in True16 format (#109031)

Modify the VOP3 profile and pseudo, and add encoding info for VOP3 True16,
including DPP and DPP8 in true16 and fake16 format.

This patch applies true16/fake16 changes and asm/dasm changes to
V_ADD_NC_U16
V_ADD_NC_I16
V_SUB_NC_U16
V_SUB_NC_I16


Revision tags: llvmorg-19.1.2
# 3b88805c 04-Oct-2024 Yaxun (Sam) Liu <yaxun.liu@amd.com>

[AMDGPU] Fix SDWA commuting (#106920)

SDWA insts are missing a reverse opcode, which causes them to be treated as
commutable with the default reverse opcode, i.e. their own opcode. As a
result, SDWA F16 sub A, B and sub B, A are merged by machine CSE. The
correct behavior is to merge sub A, B and subrev B, A instead of sub B,
A. This issue caused failures in rocFFT tests.

Another issue is that src0_sel and src1_sel are not swapped when SDWA
insts are commuted.

Verified that this fixes the rocFFT test failures.
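
The commuting rule described in this commit message can be illustrated with a small model. This is only a sketch, not LLVM code: the `SdwaInst` class, the `sub`/`subrev` opcode strings, the `WORD_0`/`WORD_1` select names and the helper functions are all hypothetical, and 16-bit integer subtraction stands in for the F16 arithmetic. It shows that a commute preserves the result only when the operands, their selects and the opcode are swapped together.

```python
from dataclasses import dataclass

# Illustrative reverse-opcode table; placeholder names, not LLVM opcode enums.
REVERSE = {"sub": "subrev", "subrev": "sub"}


@dataclass(frozen=True)
class SdwaInst:
    opcode: str
    src0: str
    src1: str
    src0_sel: str  # "WORD_0" (low 16 bits) or "WORD_1" (high 16 bits)
    src1_sel: str


def select(value: int, sel: str) -> int:
    """Pick the low or high 16 bits of a 32-bit register value."""
    return (value >> 16) & 0xFFFF if sel == "WORD_1" else value & 0xFFFF


def evaluate(inst: SdwaInst, regs: dict) -> int:
    a = select(regs[inst.src0], inst.src0_sel)
    b = select(regs[inst.src1], inst.src1_sel)
    # sub computes src0 - src1; subrev computes src1 - src0.
    return (a - b) & 0xFFFF if inst.opcode == "sub" else (b - a) & 0xFFFF


def commute(inst: SdwaInst) -> SdwaInst:
    """Swap the operands, their selects, and the opcode together."""
    return SdwaInst(REVERSE[inst.opcode], inst.src1, inst.src0,
                    inst.src1_sel, inst.src0_sel)


def commute_buggy(inst: SdwaInst) -> SdwaInst:
    """Model of the described bug: own opcode reused, selects left in place."""
    return SdwaInst(inst.opcode, inst.src1, inst.src0,
                    inst.src0_sel, inst.src1_sel)


regs = {"A": 0x0005_0009, "B": 0x0002_0003}
inst = SdwaInst("sub", "A", "B", "WORD_1", "WORD_0")

print(evaluate(inst, regs))                 # original value
print(evaluate(commute(inst), regs))        # same value after a correct commute
print(evaluate(commute_buggy(inst), regs))  # different value: not equivalent
```

In this model, a CSE pass that treated `commute_buggy(inst)` as equivalent to `inst` would merge two instructions that compute different values, which is the kind of miscompile the commit message describes.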


# 2672037e 01-Oct-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] Support VOP3 only instructions with true16 and fake16 (#109891)

Update VOP3 only instructions with true16 and fake16 formats.

This patch includes instructions:
V_MUL_LO_U16
V_MAX_U16
V_MAX_I16
V_MIN_U16
V_MIN_I16
V_LSHLREV_B16
V_LSHRREV_B16
V_ASHRREV_I16


Revision tags: llvmorg-19.1.1
# 661666d4 26-Sep-2024 Corbin Robeck <corbin.robeck@amd.com>

[AMDGPU] Move renamedInGFX9 from TableGen to SIInstrInfo helper function/macro to free up a bit slot (#82787)

Follow on to #81525 and #81901 in the series of consolidating bits in
TSFlags.

Remove renamedInGFX9 from SIInstrFormats.td and move to helper
function/macro in SIInstrInfo. renamedInGFX9 points to V_{add, sub,
subrev, addc, subb, subbrev}_U32 and V_{div_fixup_F16, fma_F16,
interp_p2_F16, mad_F16, mad_U16, mad_I16}.


# 396f6775 24-Sep-2024 Scott Egerton <9487234+ScottEgerton@users.noreply.github.com>

[AMDGPU] Remove unused VGPRSingleUseHintInsts feature (#109769)


Revision tags: llvmorg-19.1.0
# 35e27c0e 11-Sep-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] 16bit vsrc and vdst support in MC (#104510)

This is a large patch that includes the MC level support for V_CVT_F16_F32,
V_CVT_F32_F16 and V_LDEXP_F16 in true16 format.

This patch includes the asm/disasm changes to encode/decode the 16bit
vsrc, vdst and src modifiers for vop and dpp format. This patch is a
dependency for many 16 bit instructions, while only three instructions
are updated to make it easier to review.

There will be another patch to support these three instructions at the
codeGen level; this patch just replaces these two instructions with their
fake16 format.


# 935b9f62 11-Sep-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Make use of multiclass inheritance. NFC.


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# afd42fb3 13-Aug-2024 Brox Chen <broxigarchen@outlook.com>

[AMDGPU][True16][CodeGen] Support AND/OR/XOR and LDEXP True16 format (#102620)

Support AND/OR/XOR true16 and LDEXP true/fake16 format.

These instructions were previously implemented with the fake16 profile.
This fixes the implementation.

Added an RA hint so that when using a 16bit register in a 32bit
instruction, the register is used directly without an extra 16bit
move where possible.

---------

Co-authored-by: guochen2 <guochen2@amd.com>


# 0a62980a 08-Aug-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Support VALU add instructions in localstackalloc (#101692)

Pre-enable this optimization before allowing folds of frame
indexes into add instructions. Disables this fold when using
scratch instructions for now. I see some code size improvements
with it, but the optimization needs to be smarter about the
uses depending on the register classes.


Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# 9398cc2e 25-Jul-2024 Acim Maravic <Acim.Maravic@Syrmia.com>

[LLVM][AMDGPU] Copy isConvergent from Pseudo to Real instructions (#99658)

This patch copies the isConvergent flag from pseudo instructions to the
corresponding real instructions, so that the isConvergent flag is also
defined for real instructions.

Flags are not required by the compiler, but for consistency it would be
nice to have them.

Co-authored-by: Acim Maravic <Acim.Maravic@amd.com>


Revision tags: llvmorg-20-init
# 5feb32ba 25-Jun-2024 Vikram Hegde <115221833+vikramRH@users.noreply.github.com>

[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217)

This patch is intended to be the first of a series with the end goal of
adapting the atomic optimizer pass to support i64 and f64 operations (along
with removing all unnecessary bitcasts). This legalizes 64 bit readlane,
writelane and readfirstlane ops pre-ISel.

---------

Co-authored-by: vikramRH <vikhegde@amd.com>


Revision tags: llvmorg-18.1.8
# 4a305d40 12-Jun-2024 Scott Egerton <9487234+ScottEgerton@users.noreply.github.com>

[AMDGPU] Exclude certain opcodes from being marked as single use (#91802)

The s_singleuse_vdst instruction is used to mark regions of instructions
that produce values that have only one use.
Certain instructions take more than one cycle to execute, resulting in
regions being incorrectly marked.
This patch excludes these multi-cycle instructions from being marked as
either producing single use values or consuming single use values
or both depending on the instruction.
