Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
#
e0c382a9 |
| 14-Jun-2021 |
Piotr Sobczak <Piotr.Sobczak@amd.com> |
[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard
The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds access followed by a branch, followed by an lds/vmem access.
The handling of
[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard
The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds access followed by a branch, followed by an lds/vmem access.
The handling of the hazard requires an arbitrary number of instructions to process. In the worst case where a function has a vmem access, but no lds accesses, all instructions are examined only to conclude that the hazard cannot occur.
Add the pre-processing stage which detects if there is both lds and vmem present in the function and only then does the more costly search.
This patch significantly improves compilation time in the cases the hazard cannot happen. In one pathological case I looked at IsHazardInst is needlesly called 88.6 milions times.
The numbers could also be improved by introducing a map around the inner calls to ::getWaitStatesSince in fixLdsBranchVmemWARHazard, but nothing will beat not running fixLdsBranchVmemWARHazard at all in the cases detected by shouldRunLdsBranchVmemWARHazardFixup().
Differential Revision: https://reviews.llvm.org/D104219
show more ...
|
Revision tags: llvmorg-12.0.1-rc1 |
|
#
f251379a |
| 30-Apr-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Simplify getWaitStatesSince. NFC.
|
#
424f1f6f |
| 30-Apr-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn
Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the i
[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn
Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the instruction can never be null.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D101430
show more ...
|
#
749702fc |
| 29-Apr-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Remove dead early-out in GCNHazardRecognizer
Remove an early-out in wait state counting which can never be taken.
Reviewed By: foad, rampitec
Differential Revision: https://reviews.llvm.o
[AMDGPU] Remove dead early-out in GCNHazardRecognizer
Remove an early-out in wait state counting which can never be taken.
Reviewed By: foad, rampitec
Differential Revision: https://reviews.llvm.org/D101520
show more ...
|
#
12011b52 |
| 27-Apr-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] GCNHazardRecognizer: ignore all meta instructions
This is hopefully NFC, but should be more robust in ignoring all instructions that should be ignored, instead of just some of them.
Differ
[AMDGPU] GCNHazardRecognizer: ignore all meta instructions
This is hopefully NFC, but should be more robust in ignoring all instructions that should be ignored, instead of just some of them.
Differential Revision: https://reviews.llvm.org/D101372
show more ...
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3 |
|
#
ed745839 |
| 03-Mar-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Don't check for VMEM hazards on GFX10
The hazard where a VMEM reads an SGPR written by a VALU counts as a data dependency hazard, so no nops are required on GFX10. Tested with Vulkan CTS on
[AMDGPU] Don't check for VMEM hazards on GFX10
The hazard where a VMEM reads an SGPR written by a VALU counts as a data dependency hazard, so no nops are required on GFX10. Tested with Vulkan CTS on GFX10.1 and GFX10.3.
Differential Revision: https://reviews.llvm.org/D97926
show more ...
|
Revision tags: llvmorg-12.0.0-rc2 |
|
#
a8d9d507 |
| 17-Feb-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
Revision tags: llvmorg-11.1.0-rc1 |
|
#
314e29ed |
| 07-Jan-2021 |
Joe Nash <Joseph.Nash@amd.com> |
[AMDGPU] Add _e64 suffix to VOP3 Insts
Previously, instructions which could be expressed as VOP3 in addition to another encoding had a _e64 suffix on the tablegen record name, while those only avail
[AMDGPU] Add _e64 suffix to VOP3 Insts
Previously, instructions which could be expressed as VOP3 in addition to another encoding had a _e64 suffix on the tablegen record name, while those only available as VOP3 did not. With this patch, all VOP3s will have the _e64 suffix. The assembly does not change, only the mir.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D94341
Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423
show more ...
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
#
9f69c1bc |
| 18-Nov-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Rename pseudo S_WAITCNT_IDLE to S_WAIT_IDLE. NFC.
|
#
5dc47541 |
| 03-Nov-2020 |
Mircea Trofin <mtrofin@google.com> |
[NFC] Use Register/MCRegister
Differential Revision: https://reviews.llvm.org/D90724
|
#
58de4b20 |
| 29-Oct-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Use pseudo instructions for readlane/writelane
This reverts r227987 "R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2".
All the codegen changes are caused by
[AMDGPU] Use pseudo instructions for readlane/writelane
This reverts r227987 "R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2".
All the codegen changes are caused by the post-RA scheduler no longer treating readlane/writelane as scheduling barriers due to having unmodelled side effects. (The pseudos are hasSideEffects = 0, but the real instructions are hasSideEffects = ? which TableGen conservatively treats as 1.)
Differential Revision: https://reviews.llvm.org/D90401
show more ...
|
#
69f5105f |
| 29-Oct-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Simplify insertNoops functions. NFC.
|
#
de518673 |
| 28-Oct-2020 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Add Reset function to GCNHazardRecognizer
Reset the tracked emitted instructions when starting scheduling on a new region.
Reviewed By: rampitec
Differential Revision: https://reviews.llv
[AMDGPU] Add Reset function to GCNHazardRecognizer
Reset the tracked emitted instructions when starting scheduling on a new region.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D90347
show more ...
|
#
8b127a86 |
| 28-Oct-2020 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Fix inserting combined s_nop in bundles
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D90334
|
#
ebdcef20 |
| 19-Oct-2020 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Avoid inserting noops during scheduling
Passes that are run after the post-RA scheduler may insert instructions like waitcnt which eliminate the need for certain noops. After this patch the
[AMDGPU] Avoid inserting noops during scheduling
Passes that are run after the post-RA scheduler may insert instructions like waitcnt which eliminate the need for certain noops. After this patch the scheduler is still aware of possible latency from hazards but noops will not be inserted until the dedicated hazard recognizer pass is run.
Depends on D89753.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D89754
show more ...
|
#
a4f35ab2 |
| 08-Oct-2020 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Fix mai hazard VALU to LD/ST
Fixes: SWDEV-251863
Differential Revision: https://reviews.llvm.org/D89079
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
90777e29 |
| 09-Sep-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Enable scheduling around FP MODE-setting instructions
Pre-gfx10 all MODE-setting instructions were S_SETREG_B32 which is marked as having unmodeled side effects, which makes the machine sch
[AMDGPU] Enable scheduling around FP MODE-setting instructions
Pre-gfx10 all MODE-setting instructions were S_SETREG_B32 which is marked as having unmodeled side effects, which makes the machine scheduler treat it as a barrier. Now that we have proper implicit $mode operands we can use a no-side-effects S_SETREG_B32_mode pseudo instead for setregs that only touch the FP MODE bits, to give the scheduler more freedom.
Differential Revision: https://reviews.llvm.org/D87446
show more ...
|
#
85490874 |
| 09-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Skip all meta instructions in hazard recognizer
This was not adding a necessary nop due to thinking the kill counted.
|
Revision tags: llvmorg-11.0.0-rc2 |
|
#
43a38dc2 |
| 14-Aug-2020 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fix MAI ld/st hazard handling
It did not process hazard for ds_permute because it does not load or store even though it is DS.
Differential Revision: https://reviews.llvm.org/D86003
|
#
decfdb8c |
| 29-Jul-2020 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fixed formatting in GCNHazardRecognizer.cpp. NFC.
|
#
13b63be4 |
| 29-Jul-2020 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] prefer non-mfma in post-RA schedule
MFMA instructions shall not be scheduled back to back to avoid MAI SIMD stall. Tell post-RA schedule we would prefer some other instruction instead.
Dif
[AMDGPU] prefer non-mfma in post-RA schedule
MFMA instructions shall not be scheduled back to back to avoid MAI SIMD stall. Tell post-RA schedule we would prefer some other instruction instead.
Differential Revision: https://reviews.llvm.org/D84883
show more ...
|
Revision tags: llvmorg-11.0.0-rc1 |
|
#
2e87acac |
| 17-Jul-2020 |
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com> |
[AMDGPU] Removed s_mov_regrd and mov_fed opcodes
These opcodes are not intended for public use.
Reviewers: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D81659
|
#
5bf2a9dd |
| 16-Jul-2020 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Update VMEM scalar write hazard mitigation sequence
Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.
Reviewed By: rampitec, foad
Differential Revision: https://reviews.llvm
[AMDGPU] Update VMEM scalar write hazard mitigation sequence
Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.
Reviewed By: rampitec, foad
Differential Revision: https://reviews.llvm.org/D83872
show more ...
|