History log of /llvm-project/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (Results 1 – 25 of 248)
Revision Date Author Comments
Revision tags: llvmorg-21-init
# 5e26ff35 14-Jan-2025 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] true16 for v_cmp_lt_f16 (#122499)

True16 format for v_cmp_lt_f16. Update VOPC t16 and fake16 pseudo.


Revision tags: llvmorg-19.1.7
# b2adeae8 03-Jan-2025 Jun Wang <jwang86@yahoo.com>

[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)

For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s_load_dwordx4).
This patch fixes this problem while ensuring null cannot be used as S#,
T#, or V#.


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5
# 716364eb 26-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add support for v_dot2c_f32_bf16 instruction for gfx950 (#117598)

The encoding of v_dot2c_f32_bf16 opcode is same as v_mac_f32 in gfx90a,
both from gfx9 series. This required a new decoderNameSpace GFX950_DOT.

Co-authored-by: Sirish Pande <Sirish.Pande@amd.com>


# 22503a9d 26-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Support v_cvt_scalef32_pk32_{bf|f}6_{bf|fp}16 for gfx950 (#117592)

Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>


# 5dd48c49 26-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: MC support for v_cvt_scalef32_pk32_f32_[fp|bf]6 of gfx950 (#117590)

Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>


# cd20fc07 23-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove wavefrontsize64 feature from dummy target (#117410)

This is a refinement for the existing hack. With this,
the default target will have neither wavefrontsize feature
present, unless it was explicitly specified. That is,
getWavefrontSize() == 64 no longer implies +wavefrontsize64.
getWavefrontSize() == 32 does imply +wavefrontsize32.

Continue to assume the value is 64 with no wavesize feature.
This maintains the codegenable property without any code
that directly cares about the wavesize needing to worry about it.

Introduce an isWaveSizeKnown helper to check if we know the
wavesize is accurate based on having one of the features explicitly
set, or a known target-cpu.

I'm not sure what's going on in wave_any.s. It's testing what
happens when both wavesizes are enabled, but this is treated
as an error in codegen. We now treat wave32 as the winning
case, so some cases that were previously printed as vcc are now
vcc_lo.
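
The wavesize rules described in this message can be sketched as follows. This is a minimal illustration under stated assumptions, not the LLVM implementation: the WaveFeatures struct and its field names are hypothetical stand-ins for the subtarget feature bits, and only the names getWavefrontSize and isWaveSizeKnown come from the message itself.

```cpp
// Hypothetical model of the wavesize feature bits; not LLVM's real data structure.
struct WaveFeatures {
  bool Wave32;   // +wavefrontsize32 was explicitly set
  bool Wave64;   // +wavefrontsize64 was explicitly set
  bool KnownCPU; // a known target-cpu was given
};

// wave32 wins whenever it is set (including when both features are set);
// with no wavesize feature at all, the value is assumed to be 64.
constexpr unsigned getWavefrontSize(const WaveFeatures &F) {
  return F.Wave32 ? 32u : 64u;
}

// The size counts as known only if one of the features was set explicitly
// or the target-cpu is known.
constexpr bool isWaveSizeKnown(const WaveFeatures &F) {
  return F.Wave32 || F.Wave64 || F.KnownCPU;
}
```

In this model a dummy target with no features reports a size of 64 while isWaveSizeKnown returns false, which matches the statement that getWavefrontSize() == 64 no longer implies +wavefrontsize64.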


# 8b087d64 23-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move default wavesize hack for disassembler (#117422)

You cannot adjust the disassembler's subtarget. llvm-mc passes
the originally constructed MCSubtargetInfo around, rather than
querying the pointer in the disassembler instance.


# 01c9a14c 21-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Define v_mfma_f32_{16x16x128|32x32x64}_f8f6f4 instructions (#116723)

These use a new VOP3PX encoding for the v_mfma_scale_* instructions,
which bundles the pre-scale v_mfma_ld_scale_b32. None of the modifiers
are supported yet (op_sel, neg or clamp).

I'm not sure the intrinsic should really expose op_sel (or any of the
others). If I'm reading the documentation correctly, we should be able
to just have the raw scale operands and auto-match op_sel to byte
extract patterns.

The op_sel syntax also seems extra horrible in this usage, especially with the
usual assumed op_sel_hi=-1 behavior.


# 9fb01fcd 20-Nov-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][MC][True16] Support VOP2 instructions with true16 format (#115233)

Support true16 format for VOP2 instructions in MC

This patch updates the true16 and fake16 vop_profile for the following
instructions and updates the asm/dasm tests:
v_fmac_f16
v_fmamk_f16
v_fmaak_f16

It seems the vop2_t16_promote.s files were not updated with the true16 flag
in the previous batch update. They will be updated separately.


Revision tags: llvmorg-19.1.4
# abff8fe2 14-Nov-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] VINTERP instructions supporting true16/fake16 (#113634)

Update VInterp instructions with true16 and fake16 formats.

This patch includes instructions:
v_interp_p10_f16_f32
v_interp_p2_f16_f32
v_interp_p10_rtz_f16_f32
v_interp_p2_rtz_f16_f32

The dasm test vinterp-fake16.txt is removed and its test lines are merged into
vinterp.txt, which handles both the true16 and fake16 cases.


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 8d13e7b8 03-Oct-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Qualify auto. NFC. (#110878)

Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)


Revision tags: llvmorg-19.1.1
# fd50cdfb 28-Sep-2024 Craig Topper <craig.topper@sifive.com>

[AMDGPU] Use MCRegister. NFC


# f6a8eb98 24-Sep-2024 Jun Wang <jwang86@yahoo.com>

[AMDGPU][MC] Disallow null as saddr in flat instructions (#101730)

Some flat instructions have an saddr operand. When 'null' is provided as
saddr, it may have the same encoding as another instruction. For
example, the instructions 'global_atomic_add v1, v2, null' and
'global_atomic_add v[1:2], v2, off' have the same encoding. This patch
disallows having null as saddr.


# 73b8074e 20-Sep-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Do not use APInt for simple 64-bit arithmetic. NFC. (#109414)


Revision tags: llvmorg-19.1.0
# 35e27c0e 11-Sep-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] 16bit vsrc and vdst support in MC (#104510)

This is a large patch that includes the MC level support for V_CVT_F16_F32,
V_CVT_F32_F16 and V_LDEXP_F16 in true16 format.

This patch includes the asm/disasm changes to encode/decode the 16bit
vsrc, vdst and src modifiers for the vop and dpp formats. This patch is a
dependency for many 16 bit instructions, but only three instructions
are updated to make it easier to review.

There will be another patch to support these three instructions at the
codeGen level; this patch just replaces these two instructions with their
fake16 format.


Revision tags: llvmorg-19.1.0-rc4
# c1b3ebba 26-Aug-2024 Craig Topper <craig.topper@sifive.com>

[MC] Update MCOperand::getReg/setReg/createReg and MCInstBuilder::addReg to use MCRegister. (#106015)

Replace unsigned with MCRegister.

Update some ternary operators that started giving errors.


Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 63fae3ed 17-Jul-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] clang-tidy: no else after return etc. NFC. (#99298)


# b132dd41 16-Jul-2024 Stanislav Mekhanoshin <rampitec@users.noreply.github.com>

[AMDGPU] Remove wavefrontsize feature from GFX10+ (#98400)

A processor definition should not include a default feature that may be
switched off by a different wave size. This makes it unnecessary to write
-mattr=-wavefrontsize32,+wavefrontsize64 in tests.


# e83e53b7 07-Jul-2024 Carl Ritson <carl.ritson@amd.com>

[AMDGPU][MC] Allow UC_VERSION_* constant reuse (#96461)

If more than one disassembler is created for a context then allow reuse
of existing constants.
Warn if constant values do not match.
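
The reuse-or-warn behaviour described in this message can be sketched as follows. This is a minimal illustration, not the code from the patch: ConstantTable, DefineResult and defineOrReuse are hypothetical names, and a plain std::map stands in for whatever per-context symbol storage LLVM actually uses.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical outcome of trying to define a named constant.
enum class DefineResult { Created, Reused, Mismatch };

// Hypothetical stand-in for a context's table of named constants.
using ConstantTable = std::map<std::string, std::int64_t>;

// Define Name in Table if it is absent; if it already exists, reuse it and
// report whether the stored value agrees with Value. The table is never
// overwritten, so a caller can emit a warning on Mismatch and carry on.
inline DefineResult defineOrReuse(ConstantTable &Table, const std::string &Name,
                                  std::int64_t Value) {
  auto It = Table.find(Name);
  if (It == Table.end()) {
    Table.emplace(Name, Value);
    return DefineResult::Created;
  }
  return It->second == Value ? DefineResult::Reused : DefineResult::Mismatch;
}
```

A second disassembler created for the same context would hit the Reused path for every constant the first one defined, and only a Mismatch result would need to be reported as a warning.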


# bb973785 27-Jun-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Only reinitialize disassembler Bytes array when needed. NFC. (#96666)


# 16238669 18-Jun-2024 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][MC] Support UC_VERSION_* constants. (#95618)

Our other tools support them, so we want them in LLVM
assembler/disassembler too.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6
# a98a6e95 04-May-2024 luolent <56246516+luolent@users.noreply.github.com>

Add clarifying parenthesis around non-trivial conditions in ternary expressions. (#90391)

Fixes [#85868](https://github.com/llvm/llvm-project/issues/85868)

Parentheses are added as requested on ternary operators with non-trivial conditions.

I used this [precedence table](https://en.cppreference.com/w/cpp/language/operator_precedence) for reference, to make sure we get the expected behavior on each change.


Revision tags: llvmorg-18.1.5
# 6e722bbe 26-Apr-2024 Stanislav Mekhanoshin <rampitec@users.noreply.github.com>

[AMDGPU] Support byte_sel modifier on v_cvt_sr_fp8_f32 and v_cvt_sr_bf8_f32 (#90244)


# 68e814d9 18-Apr-2024 Emma Pilkington <emma.pilkington95@gmail.com>

[AMDGPU] Add disassembler diagnostics for invalid kernel descriptors (#87400)

These mostly are checking for various reserved bits being set. The diagnostics
for gpu-dependent reserved bits have a bit more context since they seem like the
most likely ones to be observed in practice.

This commit also improves the error handling mechanism for
MCDisassembler::onSymbolStart(). Previously it had a comment stream parameter
that was just being ignored by llvm-objdump, now it returns errors using
Expected<T>.


Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4
# 60e7ae3f 26-Feb-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Only try DecoderTables for the current subtarget. NFCI. (#82992)

Speed up disassembly by only calling tryDecodeInst for DecoderTables
that make sense for the current subtarget.

This gives a 1.3x speed-up on check-llvm-mc-disassembler-amdgpu in my
Release+Asserts build.

