History log of /llvm-project/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (Results 26 – 50 of 153)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 149ed9d2 24-Jan-2024 Petar Avramovic <Petar.Avramovic@amd.com>

AMDGPU: update GFX11 wmma hazards (#76143)

One V_NOP or unrelated VALU instruction in between is required for
correctness when matrix A or B of current WMMA instruction overlaps with
matrix D of p

AMDGPU: update GFX11 wmma hazards (#76143)

One V_NOP or unrelated VALU instruction in between is required for
correctness when matrix A or B of current WMMA instruction overlaps with
matrix D of previous WMMA instruction.
Remaining cases of WMMA operand overlaps are handled by the hardware and
do not require handling in hazard recognizer.

Hardware may stall in cases where:
- matrix C of current WMMA instruction overlaps with matrix D of
previous WMMA instruction
- VALU instruction reads matrix D of previous WMMA instruction
- matrix A,B or C of WMMA instruction reads result of previous VALU
instruction

show more ...


Revision tags: llvmorg-19-init
# 42b08842 23-Jan-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Handle V_PERMLANE64_B32 in fixVcmpxPermlaneHazards (#79125)

Fixes #78856


# 97747467 19-Jan-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Update hazard recognition for new GFX12 wait counters (#78722)

In most cases the hazards no longer apply, so just assert that we are
not on GFX12.


# ba52f06f 18-Jan-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)

Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
sep

[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)

Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
separate LOADCNT, SAMPLECNT and BVHCNT counters.

show more ...


# b120dae9 11-Jan-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Support GFX12 VDSDIR instructions WAITVMSRC operand in GCNHazardRecognizer (#77628)

Modify GCNHazardRecognizer::fixLdsDirectVMEMHazard() so the waitvsrc
operand
in gfx12 DS_PARAM_LOAD or

[AMDGPU] Support GFX12 VDSDIR instructions WAITVMSRC operand in GCNHazardRecognizer (#77628)

Modify GCNHazardRecognizer::fixLdsDirectVMEMHazard() so the waitvsrc
operand
in gfx12 DS_PARAM_LOAD or DS_DIRECT_LOAD instructions is set
appropriately
depending on whether a hazard is found or not, rather than inserting an
S_WAITCNT_DEPCTR instruction if a hazard needs to be mitigated.

Co-authored-by: Stephen Thomas <Stephen.Thomas@amd.com>

show more ...


# 966416b9 15-Dec-2023 Mariusz Sikora <mariusz.sikora@amd.com>

[AMDGPU][GFX12] Add new v_permlane16 variants (#75475)


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4
# 85e3875a 23-Aug-2023 Michael Maitland <michaeltmaitland@gmail.com>

[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics

D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` r

[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics

D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.

This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.

This commit as previously reverted since it missed renaming that came
down after rebasing. This version of the commit fixes those problems.

Differential Revision: https://reviews.llvm.org/D158568

show more ...


# 71bfec76 24-Aug-2023 Michael Maitland <michaeltmaitland@gmail.com>

Revert "[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics"

This reverts commit 5b854f2c23ea1b000cb4cac4c0fea77326c03d43.

Build still failing.


# 5b854f2c 23-Aug-2023 Michael Maitland <michaeltmaitland@gmail.com>

[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics

D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` r

[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics

D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.

This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.

This commit as previously reverted since it missed renaming that came
down after rebasing. This version of the commit fixes those problems.

Differential Revision: https://reviews.llvm.org/D158568

show more ...


Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 2dfb4b56 04-Jul-2023 Stephen Thomas <Stephen.Thomas@amd.com>

[AMDGPU] Fix incorrect hazard mitigation

GCNHazardRecognizer::fixVcmpxExecWARHazard() mitigates a specific hazard
by inserting a wait on sa_sdst==0 if such a wait isn't already present.
Unfortunatel

[AMDGPU] Fix incorrect hazard mitigation

GCNHazardRecognizer::fixVcmpxExecWARHazard() mitigates a specific hazard
by inserting a wait on sa_sdst==0 if such a wait isn't already present.
Unfortunately, the check for an existing wait incorrectly checks for one
that doesn't actually care about sa_sdst itself, but requires that no
other counters are waited for.

Once the check is performed correctly, a lit test needs to be updated,
since it is currently testing for the incorrect behaviour.

Differential Revision: https://reviews.llvm.org/D154438

show more ...


# 8aedad0f 04-Jul-2023 Stephen Thomas <Stephen.Thomas@amd.com>

[AMDGPU] Add functions for composing and decomposing S_WAIT_DEPCTR operands

Add functions AMDGPU::DepCtr::encodeField*() and AMDGPU::DepCtr::decodeField*()
for each of vm_vsrc, va_vdst and sa_sdst.

[AMDGPU] Add functions for composing and decomposing S_WAIT_DEPCTR operands

Add functions AMDGPU::DepCtr::encodeField*() and AMDGPU::DepCtr::decodeField*()
for each of vm_vsrc, va_vdst and sa_sdst. These are now used in
AMDGPUInsertDelayAlu and GCNHazardRecognizer so as to make working with
S_WAITCNT_DEPCTR operands easier and more readable.

Differential Revision: https://reviews.llvm.org/D154424

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5
# aa2d0fbc 21-May-2023 Sergei Barannikov <barannikov88@gmail.com>

[MC] Add MCRegisterInfo::regunits for iteration over register units

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D152098


# 890c76a9 19-May-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] Fix odd implicit operand handling in clause breaking

By inspection. Because of the strange behaviour of MI.uses(), this was
adding implicit defs to the clause *uses* set, and then wrongly
d

[AMDGPU] Fix odd implicit operand handling in clause breaking

By inspection. Because of the strange behaviour of MI.uses(), this was
adding implicit defs to the clause *uses* set, and then wrongly
detecting a conflict between explicit defs and implicit defs.

For example it would detect a conflict on this pair of instructions:

$vgpr0 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, 0, 4088, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1 :: (load (s32) from %stack.1, addrspace 5)
$vgpr1 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, 0, 4092, 0, 0, implicit $exec, implicit-def $vgpr0_vgpr1 :: (load (s32) from %stack.1 + 4, addrspace 5)

Differential Revision: https://reviews.llvm.org/D150947

show more ...


Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# 4241d890 15-Apr-2023 Kazu Hirata <kazu@google.com>

[Target] Use range-based for loops (NFC)


Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# a07584d5 03-Feb-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Make more use of MachineOperand::getOperandNo. NFC.

Differential Revision: https://reviews.llvm.org/D143252


# 8e3d7cf5 07-Feb-2023 Archibald Elliott <archibald.elliott@arm.com>

[NFC][TargetParser] Remove llvm/Support/TargetParser.h


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 768aed13 13-Jan-2023 Jay Foad <jay.foad@amd.com>

[MC] Make more use of MCInstrDesc::operands. NFC.

Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A futu

[MC] Make more use of MCInstrDesc::operands. NFC.

Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A future patch will remove opInfo_begin and opInfo_end.

Also use it instead of raw access to the OpInfo pointer. A future patch
will remove this pointer.

Differential Revision: https://reviews.llvm.org/D142213

show more ...


Revision tags: llvmorg-15.0.7
# 5bc703f7 20-Dec-2022 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Replace getPhysRegClass with getPhysRegBaseClass

Accelerate finding the base class for a physical register by
building a statically mapping table from physical registers
to base classes usi

[AMDGPU] Replace getPhysRegClass with getPhysRegBaseClass

Accelerate finding the base class for a physical register by
building a statically mapping table from physical registers
to base classes using TableGen.

Replace uses of SIRegisterInfo::getPhysRegClass with
TargetRegisterInfo::getPhysRegBaseClass in order to use
the computed table.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D139422

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 49762162 08-Mar-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove isLiteralConstant and isLiteralConstantLike

isLiteralConstant and isLiteralConstantLike were similar to
!isInlineConstant with slight differences like handling isReg operands.

To av

[AMDGPU] Remove isLiteralConstant and isLiteralConstantLike

isLiteralConstant and isLiteralConstantLike were similar to
!isInlineConstant with slight differences like handling isReg operands.

To avoid a profusion of similar functions with undocumented differences,
this patch removes all the isLiteralConstant* variants. Callers are responsible
for handling the isReg case.

Differential Revision: https://reviews.llvm.org/D125759

show more ...


# 7425077e 07-Nov-2022 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Add & use `hasNamedOperand`, NFC

In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1.
This is fine in itself, but it's verbose and doesn'

[AMDGPU] Add & use `hasNamedOperand`, NFC

In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1.
This is fine in itself, but it's verbose and doesn't make the intention clear, IMHO. I added a `hasNamedOperand` and replaced all cases I could find with regexes and manually.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D137540

show more ...


# c8a90316 28-Oct-2022 Stephen Thomas <Stephen.Thomas@amd.com>

[AMDGPU] Small cleanups in wait counter code

A small number of cleanups and refactors intended to enhance readability in
two passes that deal with s_waitcnt instructions.

Differential Revision: htt

[AMDGPU] Small cleanups in wait counter code

A small number of cleanups and refactors intended to enhance readability in
two passes that deal with s_waitcnt instructions.

Differential Revision: https://reviews.llvm.org/D136677

show more ...


# 9bb1e21f 27-Oct-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Clean up calls to MachineOperand::setIsDead and friends. NFC.


# 575eed3d 11-Oct-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix hazard with v_accvgpr_write_b32 and inline asm VGPR defs

If inline asm has a VGPR def, it must have come from a VGPR write
somewhere inside the asm. This should be further extended to al

AMDGPU: Fix hazard with v_accvgpr_write_b32 and inline asm VGPR defs

If inline asm has a VGPR def, it must have come from a VGPR write
somewhere inside the asm. This should be further extended to all
read after write hazards.

show more ...


# a35013be 01-Oct-2022 Carl Ritson <carl.ritson@amd.com>

[AMDGPU][GFX11] Mitigate VALU mask write hazard

VALU use of an SGPR (pair) as mask followed by SALU write to the
same SGPR can cause incorrect execution of subsequent SALU reads
of the SGPR.

Review

[AMDGPU][GFX11] Mitigate VALU mask write hazard

VALU use of an SGPR (pair) as mask followed by SALU write to the
same SGPR can cause incorrect execution of subsequent SALU reads
of the SGPR.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D134151

show more ...


# f19cc793 20-Sep-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Disable fp atomic to s_denorm_mode hazard for GFX11

This hazard only exists on GFX10.

Differential Revision: https://reviews.llvm.org/D134276


1234567