History log of /llvm-project/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h (Results 1 – 25 of 42)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5
# c3fe5ad6 25-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Handle vcmpx+permalane gfx950 hazard (#117286)

Confusingly, this is a different hazard to the one on gfx10
with a subtarget feature.


Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2
# d6173713 02-Oct-2024 Juan Manuel Martinez Caamaño <jmartinezcaamao@gmail.com>

[AMDGPU] Use the SchedModel available in SIInstrInfo (#110859)

Instead of allocating an initializing a new instance in
`GCNHazardRecognizer` and `AMDGPUInsertDelayAlu`.


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# 86627149 04-Sep-2024 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Mitigate GFX12 VALU read SGPR hazard (#100067)

Any SGPR read by a VALU can potentially obscure SALU writes to the same
register.
Insert s_wait_alu instructions to mitigate the hazard on a

[AMDGPU] Mitigate GFX12 VALU read SGPR hazard (#100067)

Any SGPR read by a VALU can potentially obscure SALU writes to the same
register.
Insert s_wait_alu instructions to mitigate the hazard on affected paths.

Compute a global cache of SGPRs with any VALU reads and use this to
avoid inserting mitigation for SGPRs never accessed by VALUs.

To avoid excessive search when compile time is priority implement
secondary mode where all SALU writes are mitigated.

Co-authored-by: Shilei Tian <shilei.tian@amd.com>

show more ...


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 939a6624 23-Jul-2024 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Implement workaround for GFX11.5 export priority (#99273)

On GFX11.5 shaders having completed exports need to execute/wait at a
lower priority than shaders still executing exports.
Add co

[AMDGPU] Implement workaround for GFX11.5 export priority (#99273)

On GFX11.5 shaders having completed exports need to execute/wait at a
lower priority than shaders still executing exports.
Add code to maintain normal priority of 2 for shaders that export and
drop to priority 0 after exports.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2
# a35013be 01-Oct-2022 Carl Ritson <carl.ritson@amd.com>

[AMDGPU][GFX11] Mitigate VALU mask write hazard

VALU use of an SGPR (pair) as mask followed by SALU write to the
same SGPR can cause incorrect execution of subsequent SALU reads
of the SGPR.

Review

[AMDGPU][GFX11] Mitigate VALU mask write hazard

VALU use of an SGPR (pair) as mask followed by SALU write to the
same SGPR can cause incorrect execution of subsequent SALU reads
of the SGPR.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D134151

show more ...


Revision tags: llvmorg-15.0.1, llvmorg-15.0.0
# 95d497ff 31-Aug-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] W/a hazard if 64 bit shift amount is a highest allocated VGPR

In this case gfx90a uses v0 instead of the correct register. Swap
the value temporarily with a lower register and then swap it

[AMDGPU] W/a hazard if 64 bit shift amount is a highest allocated VGPR

In this case gfx90a uses v0 instead of the correct register. Swap
the value temporarily with a lower register and then swap it back.

Unfortunately hazard recognizer works after wait count insertion,
so we cannot simply reuse an arbitrary register, hence w/a also
includes a full waitcount. This can be avoided if we run it from
expandPostRAPseudo, but that is a complete misplacement.

Differential Revision: https://reviews.llvm.org/D133067

show more ...


Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 4874838a 28-Jun-2022 Piotr Sobczak <piotr.sobczak@amd.com>

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D1287

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D128756

show more ...


Revision tags: llvmorg-14.0.6
# 13107c27 16-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add support for GFX11 LDSDIR hazards

Detect LDS direct WAR/WAW hazards and compute values for
wait_vdst (va_vdst) parameter. Where appropriate this
raises wait_vdst from the default 0 to a

[AMDGPU] Add support for GFX11 LDSDIR hazards

Detect LDS direct WAR/WAW hazards and compute values for
wait_vdst (va_vdst) parameter. Where appropriate this
raises wait_vdst from the default 0 to allow concurrent
issue of LDS direct with VALU execution.

Also detect LDS direct versus VMEM source VGPR hazards
and insert vm_vsrc=0 waits using s_waitcnt_depctr.

Differential Revision: https://reviews.llvm.org/D127963

show more ...


# 9dff14be 15-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add support for GFX11 hazards

Add support for partial stall over EXEC hazard and trans use hazard.

Differential Revision: https://reviews.llvm.org/D127872


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 1e15adba 04-Mar-2022 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Add s_nop WaitStates between neighboring mfma

In some cases padding bubbles between sequential MFMA instructions may
lead to increased inter-wave performance. Add option to request to pad
s

[AMDGPU] Add s_nop WaitStates between neighboring mfma

In some cases padding bubbles between sequential MFMA instructions may
lead to increased inter-wave performance. Add option to request to pad
some portion of these stall cycles with s_nops.

Fixes: SWDEV-326925

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D121437

show more ...


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 41bfac6a 02-Jan-2022 Kazu Hirata <kazu@google.com>

[Target] Remove unused forward declarations (NFC)


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# e0c382a9 14-Jun-2021 Piotr Sobczak <Piotr.Sobczak@amd.com>

[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard

The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds
access followed by a branch, followed by an lds/vmem access.

The handling of

[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard

The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds
access followed by a branch, followed by an lds/vmem access.

The handling of the hazard requires an arbitrary number of instructions
to process. In the worst case where a function has a vmem access, but no lds
accesses, all instructions are examined only to conclude that the hazard
cannot occur.

Add the pre-processing stage which detects if there is both lds and vmem
present in the function and only then does the more costly search.

This patch significantly improves compilation time in the cases the hazard
cannot happen. In one pathological case I looked at IsHazardInst is needlesly
called 88.6 milions times.

The numbers could also be improved by introducing a map around the
inner calls to ::getWaitStatesSince in fixLdsBranchVmemWARHazard, but
nothing will beat not running fixLdsBranchVmemWARHazard at all in the cases
detected by shouldRunLdsBranchVmemWARHazardFixup().

Differential Revision: https://reviews.llvm.org/D104219

show more ...


Revision tags: llvmorg-12.0.1-rc1
# 424f1f6f 30-Apr-2021 Carl Ritson <carl.ritson@amd.com>

[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn

Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the i

[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn

Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the instruction can never be null.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D101430

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# a8d9d507 17-Feb-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx90a support

Differential Revision: https://reviews.llvm.org/D96906


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# de518673 28-Oct-2020 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Add Reset function to GCNHazardRecognizer

Reset the tracked emitted instructions when starting scheduling on a new
region.

Reviewed By: rampitec

Differential Revision: https://reviews.llv

[AMDGPU] Add Reset function to GCNHazardRecognizer

Reset the tracked emitted instructions when starting scheduling on a new
region.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D90347

show more ...


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2
# 13b63be4 29-Jul-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] prefer non-mfma in post-RA schedule

MFMA instructions shall not be scheduled back to back
to avoid MAI SIMD stall. Tell post-RA schedule we would
prefer some other instruction instead.

Dif

[AMDGPU] prefer non-mfma in post-RA schedule

MFMA instructions shall not be scheduled back to back
to avoid MAI SIMD stall. Tell post-RA schedule we would
prefer some other instruction instead.

Differential Revision: https://reviews.llvm.org/D84883

show more ...


Revision tags: llvmorg-11.0.0-rc1
# 2e87acac 17-Jul-2020 Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

[AMDGPU] Removed s_mov_regrd and mov_fed opcodes

These opcodes are not intended for public use.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D81659


Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 29067aac 06-May-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Don't implement GCNHazardRecognizer::PreEmitNoops(SUnit *)

When called from the post-RA scheduler, hazards have already been
handled by getHazardType returning NoopHazard, so PreEmitNoops a

[AMDGPU] Don't implement GCNHazardRecognizer::PreEmitNoops(SUnit *)

When called from the post-RA scheduler, hazards have already been
handled by getHazardType returning NoopHazard, so PreEmitNoops always
returns zero. Remove it. NFC.

Historical note: PreEmitNoops was added to the hazard recognizer
interface as an optional feature to support dispatch group formation on
the POWER target:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20131202/197470.html
So it seems right that we shouldn't need to implement it.

We do still implement the other overload PreEmitNoops(MachineInstr *)
because that is used by the PostRAHazardRecognizer pass.

Differential Revision: https://reviews.llvm.org/D79476

show more ...


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init
# 7d2019bb 11-Jul-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx908 hazard recognizer

Differential Revision: https://reviews.llvm.org/D64593

llvm-svn: 365829


Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# bdf7f81b 21-Jun-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] hazard recognizer for fp atomic to s_denorm_mode

This requires 3 wait states unless there is a wait or VALU in
between.

Differential Revision: https://reviews.llvm.org/D63619

llvm-svn: 36

[AMDGPU] hazard recognizer for fp atomic to s_denorm_mode

This requires 3 wait states unless there is a wait or VALU in
between.

Differential Revision: https://reviews.llvm.org/D63619

llvm-svn: 364074

show more ...


# 5f581c9f 12-Jun-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx1010 premlane instructions

Differential Revision: https://reviews.llvm.org/D63202

llvm-svn: 363185


Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 8a3d3a9a 07-May-2019 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Check MI bundles for hazards

Summary: GCNHazardRecognizer fails to identify hazards that are in and around bundles. This patch allows the hazard recognizer to consider bundled instructions

[AMDGPU] Check MI bundles for hazards

Summary: GCNHazardRecognizer fails to identify hazards that are in and around bundles. This patch allows the hazard recognizer to consider bundled instructions in both scheduler and hazard recognizer mode. We ignore “bundledness” for the purpose of detecting hazards and examine the instructions individually.

Reviewers: arsenm, msearles, rampitec

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61564

llvm-svn: 360199

show more ...


# 51d1415a 04-May-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

AMDGPU] gfx1010 hazard recognizer

Differential Revision: https://reviews.llvm.org/D61536

llvm-svn: 359961


Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# f92ed696 21-Jan-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fixed hazard recognizer to walk predecessors

Fixes two problems with GCNHazardRecognizer:
1. It only scans up to 5 instructions emitted earlier.
2. It does not take control flow into accoun

[AMDGPU] Fixed hazard recognizer to walk predecessors

Fixes two problems with GCNHazardRecognizer:
1. It only scans up to 5 instructions emitted earlier.
2. It does not take control flow into account. An earlier instruction
from the previous basic block is not necessarily a predecessor.
At the same time a real predecessor block is not scanned.

The patch provides a way to distinguish between scheduler and
hazard recognizer mode. It is OK to work with emitted instructions
in the scheduler because we do not really know what will be emitted
later and its order. However, when pass works as a hazard recognizer
the schedule is already finalized, and we have full access to the
instructions for the whole function, so we can properly traverse
predecessors and their instructions.

Differential Revision: https://reviews.llvm.org/D56923

llvm-svn: 351759

show more ...


# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the ne

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636

show more ...


12