History log of /llvm-project/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (Results 51 – 75 of 153)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# fb28bf3f 07-Sep-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fix liveness verifier error in hazard recognizer

After D133067 we are inserting swaps to use a new physical
register. I have noticed verifier errors about undefined
physical register uses i

[AMDGPU] Fix liveness verifier error in hazard recognizer

After D133067 we are inserting swaps to use a new physical
register. I have noticed verifier errors about undefined
physical register uses if we are tracking liveness post RA.

We have no access to LIS at this point, so mark new register
uses as undef to calm down the verifier. Liveness should not
matter at this point anyway.

Note the description of the RegState::Undef: "Value of the
register doesn't matter." I.e. it does not say it is strictly
undefined. In fact that is what we really need: this value
does not matter.

I also had to modify the test a bit since with tracking enabled
it does not pass verification even before the recognizer.

Differential Revision: https://reviews.llvm.org/D133459

show more ...


# 95d497ff 31-Aug-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] W/a hazard if 64 bit shift amount is a highest allocated VGPR

In this case gfx90a uses v0 instead of the correct register. Swap
the value temporarily with a lower register and then swap it

[AMDGPU] W/a hazard if 64 bit shift amount is a highest allocated VGPR

In this case gfx90a uses v0 instead of the correct register. Swap
the value temporarily with a lower register and then swap it back.

Unfortunately hazard recognizer works after wait count insertion,
so we cannot simply reuse an arbitrary register, hence w/a also
includes a full waitcount. This can be avoided if we run it from
expandPostRAPseudo, but that is a complete misplacement.

Differential Revision: https://reviews.llvm.org/D133067

show more ...


# de9d80c1 08-Aug-2022 Fangrui Song <i@maskray.me>

[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC

With C++17 there is no Clang pedantic warning or MSVC C5051.


# 7fc52d7c 27-Jul-2022 Vang Thao <Vang.Thao@amd.com>

[AMDGPU] Fix DGEMM hazard for GFX90a

For VALU write and memory (VM, L/DS, FLAT) instructions, SQ would insert
wait-states to avoid data hazard. However when there is a DGEMM instruction
in-between t

[AMDGPU] Fix DGEMM hazard for GFX90a

For VALU write and memory (VM, L/DS, FLAT) instructions, SQ would insert
wait-states to avoid data hazard. However when there is a DGEMM instruction
in-between them, SQ incorrectly disables the wait-states thus the data hazard
needs to be handled with this workaround.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130677

show more ...


# f29a19b0 01-Aug-2022 Piotr Sobczak <Piotr.Sobczak@amd.com>

[AMDGPU] Extend cases for ReadM0MovRelInterpHazard

Extend hazard recognizer of ReadM0MovRelInterpHazard with
DS_READ_ADDTID and DS_WRITE_ADDTID, as they also
require a manually inserted S_NOP after

[AMDGPU] Extend cases for ReadM0MovRelInterpHazard

Extend hazard recognizer of ReadM0MovRelInterpHazard with
DS_READ_ADDTID and DS_WRITE_ADDTID, as they also
require a manually inserted S_NOP after SALU writing m0.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130783

show more ...


# 4874838a 28-Jun-2022 Piotr Sobczak <piotr.sobczak@amd.com>

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D1287

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D128756

show more ...


# 13107c27 16-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add support for GFX11 LDSDIR hazards

Detect LDS direct WAR/WAW hazards and compute values for
wait_vdst (va_vdst) parameter. Where appropriate this
raises wait_vdst from the default 0 to a

[AMDGPU] Add support for GFX11 LDSDIR hazards

Detect LDS direct WAR/WAW hazards and compute values for
wait_vdst (va_vdst) parameter. Where appropriate this
raises wait_vdst from the default 0 to allow concurrent
issue of LDS direct with VALU execution.

Also detect LDS direct versus VMEM source VGPR hazards
and insert vm_vsrc=0 waits using s_waitcnt_depctr.

Differential Revision: https://reviews.llvm.org/D127963

show more ...


# 9dff14be 15-Jun-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add support for GFX11 hazards

Add support for partial stall over EXEC hazard and trans use hazard.

Differential Revision: https://reviews.llvm.org/D127872


# bd9eed3a 22-May-2022 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Add isMFMA helper function. NFC

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D127124


# 5c974d08 08-Jun-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fix hazard handling of v_cmpx to permlane

- VOP3 and SDWA forms of V_CMPX were not handled
- Hazard only exists if the compare defines EXEC (i.e. V_CMPX)
forwarded to the permlane.

Diffe

[AMDGPU] Fix hazard handling of v_cmpx to permlane

- VOP3 and SDWA forms of V_CMPX were not handled
- Hazard only exists if the compare defines EXEC (i.e. V_CMPX)
forwarded to the permlane.

Differential Revision: https://reviews.llvm.org/D127344

show more ...


# 63f21f4c 27-Apr-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Handle LDS DMA and LDS_DIRECT hazards

There shall be 1 wait state between M0 write and LDS DMA/LDS_DIRECT use.

Differential Revision: https://reviews.llvm.org/D124550


# d951d937 13-Apr-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Increate hazard for store dwordx3/4 to 2 waitstates on gfx940

Fixes: SWDEV-327053

Differential Revision: https://reviews.llvm.org/D123687


# f311f934 23-Mar-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx940 VALU hazard recognizer

Differntial Revision: https://reviews.llvm.org/D122339


# 64838ba3 23-Mar-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Use GenericTable to classify DGEMM

Since there is a table introduced for MAI instructions extend it
to use for DGEMM classification.

Differential Revision: https://reviews.llvm.org/D122337


# cad9de71 23-Mar-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx940 MAI hazard recognizer

Differential Revision: https://reviews.llvm.org/D122263


# 1e15adba 04-Mar-2022 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Add s_nop WaitStates between neighboring mfma

In some cases padding bubbles between sequential MFMA instructions may
lead to increased inter-wave performance. Add option to request to pad
s

[AMDGPU] Add s_nop WaitStates between neighboring mfma

In some cases padding bubbles between sequential MFMA instructions may
lead to increased inter-wave performance. Add option to request to pad
some portion of these stall cycles with s_nops.

Fixes: SWDEV-326925

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D121437

show more ...


# e9a49c64 17-Mar-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx940 basic speed model

This is incomplete and will handle more instructions as they are added.

Differential Revision: https://reviews.llvm.org/D121966


Revision tags: llvmorg-14.0.0-rc2
# 380ff31d 22-Feb-2022 Thomas Symalla <5754458+tsymalla@users.noreply.github.com>

[AMDGPU] Fix typo in comment [NFC]

This replaces "V_MOB_B32" with "V_MOV_B32" in some comment.


# 6527b2a4 18-Feb-2022 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU][NFC] Fix typos

Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# dbf278b9 21-Jan-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Prevent aliasing of SrcC and Dst in MAI

Form the MAI spec: It’s ok that Src_C and vDst are the exact same VGPRs
or Src_C and vDst are completely separated. The case that Src_C and vDst
are

[AMDGPU] Prevent aliasing of SrcC and Dst in MAI

Form the MAI spec: It’s ok that Src_C and vDst are the exact same VGPRs
or Src_C and vDst are completely separated. The case that Src_C and vDst
are overlapping should be avoid as new value could be written to accumulator
input before it gets read.

Note that this inevitably increases register pressure to the point where
some programs will become uncompilable.

This patch separates MAC and FMA versions of MFMA instructions using either
tied dst and src2 or earlyclobber dst.

Fixes: SWDEV-318900

Differential Revision: https://reviews.llvm.org/D117844

show more ...


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3
# d6b07348 19-Jan-2022 Jim Lin <jim@andestech.com>

[NFC] Use Register instead of unsigned


Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 661a232e 19-Nov-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Remove a no-op check in the gfx90a hazard recognizer

Also rename helper function accordingly.

Differential Revision: https://reviews.llvm.org/D114289


# d1f45ed5 11-Nov-2021 Neubauer, Sebastian <Sebastian.Neubauer@amd.com>

[AMDGPU][NFC] Fix typos

Differential Revision: https://reviews.llvm.org/D113672


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 4f5ba46e 17-Aug-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Set wait state for meta instructions to zero

It looked more reasonable to set the wait state to
zero for all non-instructions. With that we can avoid
the special handling for them in `getWa

[AMDGPU] Set wait state for meta instructions to zero

It looked more reasonable to set the wait state to
zero for all non-instructions. With that we can avoid
the special handling for them in `getWaitStatesSince`
and `AdvanceCycle`. This NFC patch makes the handling
more generic.

show more ...


# 68660767 13-Aug-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Skip pseudo MIs in hazard recognizer

Instructions like WAVE_BARRIER and SI_MASKED_UNREACHABLE
are only placeholders to prevent certain unwanted
transformations and will get discarded during

[AMDGPU] Skip pseudo MIs in hazard recognizer

Instructions like WAVE_BARRIER and SI_MASKED_UNREACHABLE
are only placeholders to prevent certain unwanted
transformations and will get discarded during assembly
emission. They should not be counted during nop insertion.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D108022

show more ...


1234567