History log of /llvm-project/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (Results 76 – 100 of 153)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# e0c382a9 14-Jun-2021 Piotr Sobczak <Piotr.Sobczak@amd.com>

[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard

The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds
access followed by a branch, followed by an lds/vmem access.

The handling of

[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard

The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds
access followed by a branch, followed by an lds/vmem access.

The handling of the hazard requires an arbitrary number of instructions
to process. In the worst case where a function has a vmem access, but no lds
accesses, all instructions are examined only to conclude that the hazard
cannot occur.

Add the pre-processing stage which detects if there is both lds and vmem
present in the function and only then does the more costly search.

This patch significantly improves compilation time in the cases the hazard
cannot happen. In one pathological case I looked at IsHazardInst is needlesly
called 88.6 milions times.

The numbers could also be improved by introducing a map around the
inner calls to ::getWaitStatesSince in fixLdsBranchVmemWARHazard, but
nothing will beat not running fixLdsBranchVmemWARHazard at all in the cases
detected by shouldRunLdsBranchVmemWARHazardFixup().

Differential Revision: https://reviews.llvm.org/D104219

show more ...


Revision tags: llvmorg-12.0.1-rc1
# f251379a 30-Apr-2021 Jay Foad <jay.foad@amd.com>

[AMDGPU] Simplify getWaitStatesSince. NFC.


# 424f1f6f 30-Apr-2021 Carl Ritson <carl.ritson@amd.com>

[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn

Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the i

[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn

Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the instruction can never be null.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D101430

show more ...


# 749702fc 29-Apr-2021 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Remove dead early-out in GCNHazardRecognizer

Remove an early-out in wait state counting which can never be
taken.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.o

[AMDGPU] Remove dead early-out in GCNHazardRecognizer

Remove an early-out in wait state counting which can never be
taken.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D101520

show more ...


# 12011b52 27-Apr-2021 Jay Foad <jay.foad@amd.com>

[AMDGPU] GCNHazardRecognizer: ignore all meta instructions

This is hopefully NFC, but should be more robust in ignoring all
instructions that should be ignored, instead of just some of them.

Differ

[AMDGPU] GCNHazardRecognizer: ignore all meta instructions

This is hopefully NFC, but should be more robust in ignoring all
instructions that should be ignored, instead of just some of them.

Differential Revision: https://reviews.llvm.org/D101372

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3
# ed745839 03-Mar-2021 Jay Foad <jay.foad@amd.com>

[AMDGPU] Don't check for VMEM hazards on GFX10

The hazard where a VMEM reads an SGPR written by a VALU counts as a data
dependency hazard, so no nops are required on GFX10. Tested with Vulkan
CTS on

[AMDGPU] Don't check for VMEM hazards on GFX10

The hazard where a VMEM reads an SGPR written by a VALU counts as a data
dependency hazard, so no nops are required on GFX10. Tested with Vulkan
CTS on GFX10.1 and GFX10.3.

Differential Revision: https://reviews.llvm.org/D97926

show more ...


Revision tags: llvmorg-12.0.0-rc2
# a8d9d507 17-Feb-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx90a support

Differential Revision: https://reviews.llvm.org/D96906


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04 20-Jan-2021 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets

... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036


Revision tags: llvmorg-11.1.0-rc1
# 314e29ed 07-Jan-2021 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] Add _e64 suffix to VOP3 Insts

Previously, instructions which could be
expressed as VOP3 in addition to another
encoding had a _e64 suffix on the tablegen
record name, while those
only avail

[AMDGPU] Add _e64 suffix to VOP3 Insts

Previously, instructions which could be
expressed as VOP3 in addition to another
encoding had a _e64 suffix on the tablegen
record name, while those
only available as VOP3 did not. With this
patch, all VOP3s will have the _e64 suffix.
The assembly does not change, only the mir.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D94341

Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423

show more ...


# 6a87e9b0 25-Dec-2020 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Reduce include files dependency.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 9f69c1bc 18-Nov-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Rename pseudo S_WAITCNT_IDLE to S_WAIT_IDLE. NFC.


# 5dc47541 03-Nov-2020 Mircea Trofin <mtrofin@google.com>

[NFC] Use Register/MCRegister

Differential Revision: https://reviews.llvm.org/D90724


# 58de4b20 29-Oct-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Use pseudo instructions for readlane/writelane

This reverts r227987 "R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2".

All the codegen changes are caused by

[AMDGPU] Use pseudo instructions for readlane/writelane

This reverts r227987 "R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2".

All the codegen changes are caused by the post-RA scheduler no longer
treating readlane/writelane as scheduling barriers due to having
unmodelled side effects. (The pseudos are hasSideEffects = 0, but the
real instructions are hasSideEffects = ? which TableGen conservatively
treats as 1.)

Differential Revision: https://reviews.llvm.org/D90401

show more ...


# 69f5105f 29-Oct-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Simplify insertNoops functions. NFC.


# de518673 28-Oct-2020 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Add Reset function to GCNHazardRecognizer

Reset the tracked emitted instructions when starting scheduling on a new
region.

Reviewed By: rampitec

Differential Revision: https://reviews.llv

[AMDGPU] Add Reset function to GCNHazardRecognizer

Reset the tracked emitted instructions when starting scheduling on a new
region.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D90347

show more ...


# 8b127a86 28-Oct-2020 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Fix inserting combined s_nop in bundles

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D90334


# ebdcef20 19-Oct-2020 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Avoid inserting noops during scheduling

Passes that are run after the post-RA scheduler may insert instructions like
waitcnt which eliminate the need for certain noops. After this patch the

[AMDGPU] Avoid inserting noops during scheduling

Passes that are run after the post-RA scheduler may insert instructions like
waitcnt which eliminate the need for certain noops. After this patch the
scheduler is still aware of possible latency from hazards but noops will
not be inserted until the dedicated hazard recognizer pass is run.

Depends on D89753.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D89754

show more ...


# a4f35ab2 08-Oct-2020 Austin Kerbow <Austin.Kerbow@amd.com>

[AMDGPU] Fix mai hazard VALU to LD/ST

Fixes: SWDEV-251863

Differential Revision: https://reviews.llvm.org/D89079


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 90777e29 09-Sep-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Enable scheduling around FP MODE-setting instructions

Pre-gfx10 all MODE-setting instructions were S_SETREG_B32 which is
marked as having unmodeled side effects, which makes the machine
sch

[AMDGPU] Enable scheduling around FP MODE-setting instructions

Pre-gfx10 all MODE-setting instructions were S_SETREG_B32 which is
marked as having unmodeled side effects, which makes the machine
scheduler treat it as a barrier. Now that we have proper implicit $mode
operands we can use a no-side-effects S_SETREG_B32_mode pseudo instead
for setregs that only touch the FP MODE bits, to give the scheduler more
freedom.

Differential Revision: https://reviews.llvm.org/D87446

show more ...


# 85490874 09-Sep-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Skip all meta instructions in hazard recognizer

This was not adding a necessary nop due to thinking the kill counted.


Revision tags: llvmorg-11.0.0-rc2
# 43a38dc2 14-Aug-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fix MAI ld/st hazard handling

It did not process hazard for ds_permute because it does not
load or store even though it is DS.

Differential Revision: https://reviews.llvm.org/D86003


# decfdb8c 29-Jul-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fixed formatting in GCNHazardRecognizer.cpp. NFC.


# 13b63be4 29-Jul-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] prefer non-mfma in post-RA schedule

MFMA instructions shall not be scheduled back to back
to avoid MAI SIMD stall. Tell post-RA schedule we would
prefer some other instruction instead.

Dif

[AMDGPU] prefer non-mfma in post-RA schedule

MFMA instructions shall not be scheduled back to back
to avoid MAI SIMD stall. Tell post-RA schedule we would
prefer some other instruction instead.

Differential Revision: https://reviews.llvm.org/D84883

show more ...


Revision tags: llvmorg-11.0.0-rc1
# 2e87acac 17-Jul-2020 Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

[AMDGPU] Removed s_mov_regrd and mov_fed opcodes

These opcodes are not intended for public use.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D81659


# 5bf2a9dd 16-Jul-2020 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Update VMEM scalar write hazard mitigation sequence

Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.

Reviewed By: rampitec, foad

Differential Revision: https://reviews.llvm

[AMDGPU] Update VMEM scalar write hazard mitigation sequence

Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.

Reviewed By: rampitec, foad

Differential Revision: https://reviews.llvm.org/D83872

show more ...


1234567