GCNHazardRecognizer.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# 149ed9d2	24-Jan-2024	Petar Avramovic <Petar.Avramovic@amd.com>	AMDGPU: update GFX11 wmma hazards (#76143) One V_NOP or unrelated VALU instruction in between is required for correctness when matrix A or B of current WMMA instruction overlaps with matrix D of p AMDGPU: update GFX11 wmma hazards (#76143) One V_NOP or unrelated VALU instruction in between is required for correctness when matrix A or B of current WMMA instruction overlaps with matrix D of previous WMMA instruction. Remaining cases of WMMA operand overlaps are handled by the hardware and do not require handling in hazard recognizer. Hardware may stall in cases where: - matrix C of current WMMA instruction overlaps with matrix D of previous WMMA instruction - VALU instruction reads matrix D of previous WMMA instruction - matrix A,B or C of WMMA instruction reads result of previous VALU instruction show more ...
Revision tags: llvmorg-19-init
# 42b08842	23-Jan-2024	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[AMDGPU] Handle V_PERMLANE64_B32 in fixVcmpxPermlaneHazards (#79125) Fixes #78856
# 97747467	19-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Update hazard recognition for new GFX12 wait counters (#78722) In most cases the hazards no longer apply, so just assert that we are not on GFX12.
# ba52f06f	18-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438) Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into sep [AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438) Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into separate LOADCNT, SAMPLECNT and BVHCNT counters. show more ...
# b120dae9	11-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Support GFX12 VDSDIR instructions WAITVMSRC operand in GCNHazardRecognizer (#77628) Modify GCNHazardRecognizer::fixLdsDirectVMEMHazard() so the waitvsrc operand in gfx12 DS_PARAM_LOAD or [AMDGPU] Support GFX12 VDSDIR instructions WAITVMSRC operand in GCNHazardRecognizer (#77628) Modify GCNHazardRecognizer::fixLdsDirectVMEMHazard() so the waitvsrc operand in gfx12 DS_PARAM_LOAD or DS_DIRECT_LOAD instructions is set appropriately depending on whether a hazard is found or not, rather than inserting an S_WAITCNT_DEPCTR instruction if a hazard needs to be mitigated. Co-authored-by: Stephen Thomas <Stephen.Thomas@amd.com> show more ...
# 966416b9	15-Dec-2023	Mariusz Sikora <mariusz.sikora@amd.com>	[AMDGPU][GFX12] Add new v_permlane16 variants (#75475)
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4
# 85e3875a	23-Aug-2023	Michael Maitland <michaeltmaitland@gmail.com>	[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` r [TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the fact that resource allocation is now represented as an interval, relatively to the issue cycle of the instruction. This patch implements that TODO. This naming clarifies how to use these fields in the scheduler. In addition it was confusing that `StartAtCycle` was singular but `Cycles` was plural. This renaming fixes this inconsistency. This commit as previously reverted since it missed renaming that came down after rebasing. This version of the commit fixes those problems. Differential Revision: https://reviews.llvm.org/D158568 show more ...
# 71bfec76	24-Aug-2023	Michael Maitland <michaeltmaitland@gmail.com>	Revert "[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics" This reverts commit 5b854f2c23ea1b000cb4cac4c0fea77326c03d43. Build still failing.
# 5b854f2c	23-Aug-2023	Michael Maitland <michaeltmaitland@gmail.com>	[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` r [TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the fact that resource allocation is now represented as an interval, relatively to the issue cycle of the instruction. This patch implements that TODO. This naming clarifies how to use these fields in the scheduler. In addition it was confusing that `StartAtCycle` was singular but `Cycles` was plural. This renaming fixes this inconsistency. This commit as previously reverted since it missed renaming that came down after rebasing. This version of the commit fixes those problems. Differential Revision: https://reviews.llvm.org/D158568 show more ...
Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 2dfb4b56	04-Jul-2023	Stephen Thomas <Stephen.Thomas@amd.com>	[AMDGPU] Fix incorrect hazard mitigation GCNHazardRecognizer::fixVcmpxExecWARHazard() mitigates a specific hazard by inserting a wait on sa_sdst==0 if such a wait isn't already present. Unfortunatel [AMDGPU] Fix incorrect hazard mitigation GCNHazardRecognizer::fixVcmpxExecWARHazard() mitigates a specific hazard by inserting a wait on sa_sdst==0 if such a wait isn't already present. Unfortunately, the check for an existing wait incorrectly checks for one that doesn't actually care about sa_sdst itself, but requires that no other counters are waited for. Once the check is performed correctly, a lit test needs to be updated, since it is currently testing for the incorrect behaviour. Differential Revision: https://reviews.llvm.org/D154438 show more ...
# 8aedad0f	04-Jul-2023	Stephen Thomas <Stephen.Thomas@amd.com>	[AMDGPU] Add functions for composing and decomposing S_WAIT_DEPCTR operands Add functions AMDGPU::DepCtr::encodeField() and AMDGPU::DepCtr::decodeField() for each of vm_vsrc, va_vdst and sa_sdst. [AMDGPU] Add functions for composing and decomposing S_WAIT_DEPCTR operands Add functions AMDGPU::DepCtr::encodeField() and AMDGPU::DepCtr::decodeField() for each of vm_vsrc, va_vdst and sa_sdst. These are now used in AMDGPUInsertDelayAlu and GCNHazardRecognizer so as to make working with S_WAITCNT_DEPCTR operands easier and more readable. Differential Revision: https://reviews.llvm.org/D154424 show more ...
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5
# aa2d0fbc	21-May-2023	Sergei Barannikov <barannikov88@gmail.com>	[MC] Add MCRegisterInfo::regunits for iteration over register units Reviewed By: foad Differential Revision: https://reviews.llvm.org/D152098
# 890c76a9	19-May-2023	Jay Foad <jay.foad@amd.com>	[AMDGPU] Fix odd implicit operand handling in clause breaking By inspection. Because of the strange behaviour of MI.uses(), this was adding implicit defs to the clause uses set, and then wrongly d [AMDGPU] Fix odd implicit operand handling in clause breaking By inspection. Because of the strange behaviour of MI.uses(), this was adding implicit defs to the clause uses set, and then wrongly detecting a conflict between explicit defs and implicit defs. For example it would detect a conflict on this pair of instructions: $vgpr0 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, 0, 4088, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1 :: (load (s32) from %stack.1, addrspace 5) $vgpr1 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, 0, 4092, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1 :: (load (s32) from %stack.1 + 4, addrspace 5) Differential Revision: https://reviews.llvm.org/D150947 show more ...
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# 4241d890	15-Apr-2023	Kazu Hirata <kazu@google.com>	[Target] Use range-based for loops (NFC)
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# a07584d5	03-Feb-2023	Jay Foad <jay.foad@amd.com>	[CodeGen] Make more use of MachineOperand::getOperandNo. NFC. Differential Revision: https://reviews.llvm.org/D143252
# 8e3d7cf5	07-Feb-2023	Archibald Elliott <archibald.elliott@arm.com>	[NFC][TargetParser] Remove llvm/Support/TargetParser.h
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 768aed13	13-Jan-2023	Jay Foad <jay.foad@amd.com>	[MC] Make more use of MCInstrDesc::operands. NFC. Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A futu [MC] Make more use of MCInstrDesc::operands. NFC. Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A future patch will remove opInfo_begin and opInfo_end. Also use it instead of raw access to the OpInfo pointer. A future patch will remove this pointer. Differential Revision: https://reviews.llvm.org/D142213 show more ...
Revision tags: llvmorg-15.0.7
# 5bc703f7	20-Dec-2022	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Replace getPhysRegClass with getPhysRegBaseClass Accelerate finding the base class for a physical register by building a statically mapping table from physical registers to base classes usi [AMDGPU] Replace getPhysRegClass with getPhysRegBaseClass Accelerate finding the base class for a physical register by building a statically mapping table from physical registers to base classes using TableGen. Replace uses of SIRegisterInfo::getPhysRegClass with TargetRegisterInfo::getPhysRegBaseClass in order to use the computed table. Reviewed By: arsenm, foad Differential Revision: https://reviews.llvm.org/D139422 show more ...
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 49762162	08-Mar-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Remove isLiteralConstant and isLiteralConstantLike isLiteralConstant and isLiteralConstantLike were similar to !isInlineConstant with slight differences like handling isReg operands. To av [AMDGPU] Remove isLiteralConstant and isLiteralConstantLike isLiteralConstant and isLiteralConstantLike were similar to !isInlineConstant with slight differences like handling isReg operands. To avoid a profusion of similar functions with undocumented differences, this patch removes all the isLiteralConstant* variants. Callers are responsible for handling the isReg case. Differential Revision: https://reviews.llvm.org/D125759 show more ...
# 7425077e	07-Nov-2022	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[AMDGPU] Add & use `hasNamedOperand`, NFC In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1. This is fine in itself, but it's verbose and doesn' [AMDGPU] Add & use `hasNamedOperand`, NFC In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1. This is fine in itself, but it's verbose and doesn't make the intention clear, IMHO. I added a `hasNamedOperand` and replaced all cases I could find with regexes and manually. Reviewed By: arsenm, foad Differential Revision: https://reviews.llvm.org/D137540 show more ...
# c8a90316	28-Oct-2022	Stephen Thomas <Stephen.Thomas@amd.com>	[AMDGPU] Small cleanups in wait counter code A small number of cleanups and refactors intended to enhance readability in two passes that deal with s_waitcnt instructions. Differential Revision: htt [AMDGPU] Small cleanups in wait counter code A small number of cleanups and refactors intended to enhance readability in two passes that deal with s_waitcnt instructions. Differential Revision: https://reviews.llvm.org/D136677 show more ...
# 9bb1e21f	27-Oct-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Clean up calls to MachineOperand::setIsDead and friends. NFC.
# 575eed3d	11-Oct-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix hazard with v_accvgpr_write_b32 and inline asm VGPR defs If inline asm has a VGPR def, it must have come from a VGPR write somewhere inside the asm. This should be further extended to al AMDGPU: Fix hazard with v_accvgpr_write_b32 and inline asm VGPR defs If inline asm has a VGPR def, it must have come from a VGPR write somewhere inside the asm. This should be further extended to all read after write hazards. show more ...
# a35013be	01-Oct-2022	Carl Ritson <carl.ritson@amd.com>	[AMDGPU][GFX11] Mitigate VALU mask write hazard VALU use of an SGPR (pair) as mask followed by SALU write to the same SGPR can cause incorrect execution of subsequent SALU reads of the SGPR. Review [AMDGPU][GFX11] Mitigate VALU mask write hazard VALU use of an SGPR (pair) as mask followed by SALU write to the same SGPR can cause incorrect execution of subsequent SALU reads of the SGPR. Reviewed By: foad, rampitec Differential Revision: https://reviews.llvm.org/D134151 show more ...
# f19cc793	20-Sep-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Disable fp atomic to s_denorm_mode hazard for GFX11 This hazard only exists on GFX10. Differential Revision: https://reviews.llvm.org/D134276
123 4 5 6 7