VOP2Instructions.td - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/VOP2Instructions.td

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init
# 8a0c2e75	16-Jan-2025	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC][CodeGen] true16 for v_cndmask_b16 (#119736) Support true16 format for v_cndmask_b16 in MC and CodeGen in true16 and fake16 flow. Since we are replacing `v_cndmask_b16` with `v_cndmask_b16_t16/fake16`, we have to at least update the fake16 codeGen to get the codeGen tests passing. For this case, we have to update true16 and fake16 together, otherwise some of the true16 tests will fail.
Revision tags: llvmorg-19.1.7
# 0f3aeca1	13-Jan-2025	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][CodeGen] Update and/or/xor codegen pattern for i16 (#121835) In true16 flow, remove and/or/xor 32bit patterns for i16
# c3241a9a	18-Dec-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] test update for v_subrev_f16 in true16 (#119315) This is an NFC change. Update mc test for v_subrev_f16 in true16 format. MC source change was done by previous patch and automatically enabled by t16 pseudo.
# 5270e63c	18-Dec-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] test update for v_ldexp_f16 in true16 (#119313) This is an NFC change. Update mc test for v_ldexp_f16 in true16 format. MC source change was done by previous patch and automatically enabled by t16 pseudo.
# f9a9173b	17-Dec-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] test update for v_mul_f16 in true16 (#119314) This is an NFC change. Update mc test for v_mul_f16 in true16 format. MC source change was done by previous patch and automatically enabled by t16 pseudo.
# 8bbbcadd	17-Dec-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] test update for v_max_f16/v_min_f16 in true16 (#119291) This is an NFC change. Update mc test for v_max/min_f16 in true16 format. MC source change was done by previous patch and automatically enabled by t16 pseudo.
Revision tags: llvmorg-19.1.6
# cbed714f	09-Dec-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] test update for v_add/sub_f16 in true16 (#118926) This is an NFC change. Update mc test for v_add/sub_f16 in true16 format. MC source change was done by previous patch and automatically enabled by t16 pseudo.
# f9d6d46a	09-Dec-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Add assembler/disassembler support for v_dual_dot2acc_f32_bf16 (#118984) There is still no codegen support because the corresponding v_dot2c_f32_bf16 instruction is not supported on GFX11.
Revision tags: llvmorg-19.1.5
# 716364eb	26-Nov-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Add support for v_dot2c_f32_bf16 instruction for gfx950 (#117598) The encoding of the v_dot2c_f32_bf16 opcode is the same as v_mac_f32 in gfx90a, both from the gfx9 series. This required a new decoderNameSpace GFX950_DOT. Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>
# 9fb01fcd	20-Nov-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][MC][True16] Support VOP2 instructions with true16 format (#115233) Support true16 format for VOP2 instructions in MC. This patch updates the true16 and fake16 vop_profile for the following instructions and updates the asm/dasm tests: v_fmac_f16, v_fmamk_f16, v_fmaak_f16. It seems the vop2_t16_promote.s files were not yet updated with the true16 flag in the previous batch update. They will be updated separately.
Revision tags: llvmorg-19.1.4
# b7d635ed	18-Nov-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Copy correct predicates for SDWA reals (#116288) There are a lot of messes in the special case predicate handling. Currently broad let blocks override specific predicates with more general cases. For instructions with SDWA, the HasSDWA predicate was overriding the SubtargetPredicate for the instruction. This fixes enough to properly disallow new instructions that support SDWA on older targets.
# e8644e3b	05-Nov-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] VOP2 update instructions with fake16 format (#114436) Some old "t16" VOP2 instructions are actually in fake16 format. Correct and update test file
Revision tags: llvmorg-19.1.3
# b3acb257	22-Oct-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Don't rely on !eq comparing int with bits<5>. NFC. (#113279) Tweak VOP2eInst_Base so that it does not rely on !eq comparing an int value (-1) with a bits<5> value. This is to avoid a change in behaviour when #112904 lands, which is a bug fix which has the side effect of implicitly casting template arguments to the declared template parameter type.
# 7b4c8b35	16-Oct-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] VOP3 profile in True16 format (#109031) Modify VOP3 profile and pseudo, and add encoding info for VOP3 True16 including DPP and DPP8 in true16 and fake16 format. This patch applies true16/fake16 changes and asm/dasm changes to V_ADD_NC_U16, V_ADD_NC_I16, V_SUB_NC_U16, V_SUB_NC_I16.
Revision tags: llvmorg-19.1.2
# 3b88805c	04-Oct-2024	Yaxun (Sam) Liu <yaxun.liu@amd.com>	[AMDGPU] Fix SDWA commuting (#106920) SDWA insts miss reverse opcode, which causes them to be treated as commutable with default reverse opcode, i.e. their own opcode. As a result, SDWA F16 sub A, B and sub B, A are merged by machine CSE. The correct behavior is to merge sub A, B and subrev B, A instead of sub B, A. This issue caused failures in rocFFT tests. Another issue is that src0_sel and src1_sel are not swapped when SDWA insts are commuted. Verified that this fixes the rocFFT test failures.
# 2672037e	01-Oct-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] Support VOP3 only instructions with true16 and fake16 (#109891) Update VOP3 only instructions with true16 and fake16 formats. This patch includes instructions: V_MUL_LO_U16, V_MAX_U16, V_MAX_I16, V_MIN_U16, V_MIN_I16, V_LSHLREV_B16, V_LSHRREV_B16, V_ASHRREV_I16.
Revision tags: llvmorg-19.1.1
# 661666d4	26-Sep-2024	Corbin Robeck <corbin.robeck@amd.com>	[AMDGPU] Move renamedInGFX9 from TableGen to SIInstrInfo helper function/macro to free up a bit slot (#82787) Follow on to #81525 and #81901 in the series of consolidating bits in TSFlags. Remove renamedInGFX9 from SIInstrFormats.td and move to helper function/macro in SIInstrInfo. renamedInGFX9 points to V_{add, sub, subrev, addc, subb, subbrev}_U32 and V_{div_fixup_F16, fma_F16, interp_p2_F16, mad_F16, mad_U16, mad_I16}.
# 396f6775	24-Sep-2024	Scott Egerton <9487234+ScottEgerton@users.noreply.github.com>	[AMDGPU] Remove unused VGPRSingleUseHintInsts feature (#109769)
Revision tags: llvmorg-19.1.0
# 35e27c0e	11-Sep-2024	Brox Chen <guochen2@amd.com>	[AMDGPU][True16][MC] 16bit vsrc and vdst support in MC (#104510) This is a large patch that includes the MC level support for V_CVT_F16_F32, V_CVT_F32_F16 and V_LDEXP_F16 in true16 format. This patch includes the asm/disasm changes to encode/decode the 16bit vsrc, vdst and src modifiers for vop and dpp format. This patch is a dependency for many 16 bit instructions while only three instructions are updated to make it easier to review. There will be another patch to support these three instructions in the codeGen level, this patch just replaces these two instructions with its fake16 format.
# 935b9f62	11-Sep-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Make use of multiclass inheritance. NFC.
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# afd42fb3	13-Aug-2024	Brox Chen <broxigarchen@outlook.com>	[AMDGPU][True16][CodeGen] Support AND/OR/XOR and LDEXP True16 format (#102620) Support AND/OR/XOR true16 and LDEXP true/fake16 format. These instructions were previously implemented with a fake16 profile; this fixes the implementation. Added an RA hint so that when using a 16bit register in a 32bit instruction, the register is used directly without an extra 16bit move. Co-authored-by: guochen2 <guochen2@amd.com>
# 0a62980a	08-Aug-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Support VALU add instructions in localstackalloc (#101692) Pre-enable this optimization before allowing folds of frame indexes into add instructions. Disables this fold when using scratch instructions for now. I see some code size improvements with it, but the optimization needs to be smarter about the uses depending on the register classes.
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# 9398cc2e	25-Jul-2024	Acim Maravic <Acim.Maravic@Syrmia.com>	[LLVM][AMDGPU] Copy isConvergent from Pseudo to Real instructions (#99658) This patch copies the flag isConvergent from pseudo instructions to the corresponding real instructions, so that isConvergent flag is also defined for real instructions. Flags are not required by the compiler, but for consistency it would be nice to have them. Co-authored-by: Acim Maravic <Acim.Maravic@amd.com>
Revision tags: llvmorg-20-init
# 5feb32ba	25-Jun-2024	Vikram Hegde <115221833+vikramRH@users.noreply.github.com>	[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217) This patch is intended to be the first of a series with end goal to adapt atomic optimizer pass to support i64 and f64 operations (along with removing all unnecessary bitcasts). This legalizes 64 bit readlane, writelane and readfirstlane ops pre-ISel. Co-authored-by: vikramRH <vikhegde@amd.com>
Revision tags: llvmorg-18.1.8
# 4a305d40	12-Jun-2024	Scott Egerton <9487234+ScottEgerton@users.noreply.github.com>	[AMDGPU] Exclude certain opcodes from being marked as single use (#91802) The s_singleuse_vdst instruction is used to mark regions of instructions that produce values that have only one use. Certain instructions take more than one cycle to execute, resulting in regions being incorrectly marked. This patch excludes these multi-cycle instructions from being marked as either producing single use values or consuming single use values or both depending on the instruction.