AMDGPUDisassembler.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp

Revision	Date	Author	Comments
# 42f6f95e	23-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Simplify AMDGPUDisassembler::getInstruction by removing Res. (#82775) Remove all the code that set and tested Res. Change all convert* functions to return void since none of them can fail. getInstruction only has one main point of failure, after all calls to tryDecodeInst have failed.
# 3b7d4330	22-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Remove DPP DecoderNamespaces. NFC. (#82491) Now that there is no special checking for valid DPP encodings, these instructions can use the same DecoderNamespace as other 64- or 96-bit instructions. Also clean up setting DecoderNamespace: in most cases it should be set as a pair with AssemblerPredicate.
# b9ce2379	22-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Clean up conversion of DPP instructions in AMDGPUDisassembler (#82480) Convert DPP instructions after all calls to tryDecodeInst, just like we do for all other instruction types. NFCI.
# bcbffd99	22-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Split Dpp8FI and Dpp16FI operands (#82379) Split Dpp8FI and Dpp16FI into two different operands sharing an AsmOperandClass. They are parsed and rendered identically as fi:1 but the encoding is different: for DPP16 FI is a single bit, but for DPP8 it uses two different special values in the src0 field. Having a dedicated decoder for Dpp8FI allows it to reject other (non-special) src0 values so that AMDGPUDisassembler::getInstruction no longer needs to call isValidDPP8 to do post hoc validation of decoded DPP8 instructions.
Revision tags: llvmorg-18.1.0-rc3
# ddba6b27	20-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Stop using SDWA DecoderNamespaces. NFCI. (#82233) 64-bit SDWA encodings have to be checked first because their first 32 bits are a special case of the corresponding 32-bit non-SDWA encoding of the same instruction. But all 64-bit encodings are checked first, so we don't need special handling for SDWA.
# a4d46157	20-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Try decoding instructions longest first. NFCI. (#82014) AMDGPUDisassembler::getInstruction tries decoding instructions using different DecoderTables in a confusing order: first 96-bit instructions, then some 64-bit, then 32-bit, then some more 64-bit. This patch changes it to always try longer encodings first. The motivation is to make getInstruction easier to understand, and to pave the way for combining some 64-bit tables that do not need to be separate.
# 13e64958	19-Feb-2024	Stanislav Mekhanoshin <rampitec@users.noreply.github.com>	[AMDGPU] Fix decoder for BF16 inline constants (#82276) Fix #82039.
# ded3ca22	17-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Set predicates more consistently for BUF instructions (#81865) Set DecoderNamespace and AssemblerPredicate in the base class for Real instructions for each subtarget. This avoids some ad hoc "let" around groups of instructions definitions, and fixes some missed cases like BUFFER_GL0_INV_gfx10 which was missing DecoderNamespace.
# d3b825f8	15-Feb-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Use consistent DecoderNamespace for wave64 instructions. NFC. (#81863) For wave64 WMMA instructions, putting W64 in the DecoderNamespace is more descriptive than WMMA, and matches other uses for GFX12 GLOBAL_LOAD_TR instructions.
# 4c931091	13-Feb-2024	Ivan Kosarev <ivan.kosarev@amd.com>	[AMDGPU][NFC] Get rid of some operand decoders defined using macros. (#81482) Use templates instead. Part of <https://github.com/llvm/llvm-project/issues/62629>.
# 7d19dc50	08-Feb-2024	Ivan Kosarev <ivan.kosarev@amd.com>	[AMDGPU][True16] Support VOP3 source DPP operands. (#80892)
Revision tags: llvmorg-18.1.0-rc2
# 4eb08109	01-Feb-2024	Emma Pilkington <emma.pilkington95@gmail.com>	[llvm-objdump][AMDGPU] Pass ELF ABIVersion through disassembler (#78907) Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart but I'm not sure what a better place for it would be.
Revision tags: llvmorg-18.1.0-rc1
# 70fbcdb4	26-Jan-2024	Simon Pilgrim <llvm-dev@redking.me.uk>	Fix MSVC "signed/unsigned mismatch" warning. NFC.
# 2aa8945d	25-Jan-2024	Ivan Kosarev <ivan.kosarev@amd.com>	[AMDGPU][NFC] Use templates to decode AV operands. (#79313) Eliminates the need to define them manually. Part of <https://github.com/llvm/llvm-project/issues/62629>.
# 2e81ac25	24-Jan-2024	Ivan Kosarev <ivan.kosarev@amd.com>	[AMDGPU][NFC] Simplify AGPR/VGPR load/store operand definitions. (#79289) Part of <https://github.com/llvm/llvm-project/issues/62629>.
# 7fdf608c	24-Jan-2024	Mirko Brkušanin <Mirko.Brkusanin@amd.com>	[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795) Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com> Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>
# cfddb59b	24-Jan-2024	Mariusz Sikora <mariusz.sikora@amd.com>	[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (#78414) …bf8 instructions Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16 instructions that were supported on GFX940 (MI300): - V_CVT_F32_FP8 - V_CVT_F32_BF8 - V_CVT_PK_F32_FP8 - V_CVT_PK_F32_BF8 - V_CVT_PK_FP8_F32 - V_CVT_PK_BF8_F32 - V_CVT_SR_FP8_F32 - V_CVT_SR_BF8_F32 --------- Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com> Co-authored-by: Mirko Brkušanin <Mirko.Brkusanin@amd.com>
Revision tags: llvmorg-19-init
# bc82cfb3	21-Jan-2024	Emma Pilkington <emma.pilkington95@gmail.com>	[AMDGPU] Add an asm directive to track code_object_version (#76267) Named '.amdhsa_code_object_version'. This directive sets the e_ident[ABIVERSION] in the ELF header, and should be used as the assumed COV for the rest of the asm file. This commit also weakens the --amdhsa-code-object-version CL flag. Previously, the CL flag took precedence over the IR flag. Now the IR flag/asm directive take precedence over the CL flag. This is implemented by merging a few COV-checking functions in AMDGPUBaseInfo.h.
# 57f6a3f7	18-Jan-2024	Piotr Sobczak <piotr.sobczak@amd.com>	[AMDGPU] Add global_load_tr for GFX12 (#77772) Support new amdgcn_global_load_tr instructions for load with transpose. * MC layer support for GLOBAL_LOAD_TR_B64/GLOBAL_LOAD_TR_B128 * Intrinsic int_amdgcn_global_load_tr * Clang builtins amdgcn_global_load_tr*
# 49b49204	03-Jan-2024	Nicolai Hähnle <nicolai.haehnle@amd.com>	AMDGPU: Fix packed 16-bit inline constants (#76522) Consistently treat packed 16-bit operands as 32-bit values, because that's really what they are. The attempt to treat them differently was ultimately incorrect and lead to miscompiles, e.g. when using non-splat constants such as (1, 0) as operands. Recognize 32-bit float constants for i/u16 instructions. This is a bit odd conceptually, but it matches HW behavior and SP3. Remove isFoldableLiteralV216; there was too much magic in the dependency between it and its use in SIFoldOperands. Instead, we now simply rely on checking whether a constant is an inline constant, and trying a bunch of permutations of the low and high halves. This is more obviously correct and leads to some new cases where inline constants are used as shown by tests. Move the logic for switching packed add vs. sub into SIFoldOperands. This has two benefits: all logic that optimizes for inline constants in packed math is now in one place; and it applies to both SelectionDAG and GISel paths. Disable the use of opsel with v_dot* instructions on gfx11. They are documented to ignore opsel on src0 and src1. It may be interesting to re-enable to use of opsel on src2 as a future optimization. A similar "proper" fix of what inline constants mean could potentially be applied to unpacked 16-bit ops. However, it's less clear what the benefit would be, and there are surely places where we'd have to carefully audit whether values are properly sign- or zero-extended. It is best to keep such a change separate. Fixes: Corruption in FSR 2.0 (latent bug exposed by an LLVM change)
# c01e844a	02-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Update compute program resource registers for GFX12 (#75911) Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>
# 8c6172b0	28-Dec-2023	Ivan Kosarev <ivan.kosarev@amd.com>	[AMDGPU][True16] Don't use the VGPR_LO/HI16 register classes. (#76440) Removing the classes requires updating tests and so is planned to be done with a separate change.
# 8fdfd34c	21-Dec-2023	Jay Foad <jay.foad@amd.com>	[AMDGPU] Remove GDS and GWS for GFX12 (#76148)
# 569ef8dd	15-Dec-2023	Mirko Brkušanin <Mirko.Brkusanin@amd.com>	[AMDGPU] Add pseudo scalar trans instructions for GFX12 (#75204)
# c1a6974d	15-Dec-2023	Mirko Brkušanin <Mirko.Brkusanin@amd.com>	[AMDGPU][MC] Add GFX12 SMEM encoding (#75215)