|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1 |
|
| #
b434051d |
| 28-Mar-2023 |
skc7 <Krishna.Sankisa@amd.com> |
[AMDGPU] Introduce SIInstrWorklist to process instructions in moveToVALU
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D147168
|
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
| #
2e29b013 |
| 21-Jun-2022 |
Alexander Timofeev <alexander.timofeev@amd.com> |
[AMDGPU] Lowering VGPR to SGPR copies to v_readfirstlane_b32 if profitable.
Since the divergence-driven instruction selection has been enabled for AMDGPU, all the uniform instructions are expected
[AMDGPU] Lowering VGPR to SGPR copies to v_readfirstlane_b32 if profitable.
Since the divergence-driven instruction selection has been enabled for AMDGPU, all the uniform instructions are expected to be selected to SALU form, except those not having one. VGPR to SGPR copies appear in MIR to connect values producers and consumers. This change implements an algorithm that evolves a reasonable tradeoff between the profit achieved from keeping the uniform instructions in SALU form and overhead introduced by the data transfer between the VGPRs and SGPRs.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D128252
show more ...
|
| #
5cae8816 |
| 06-Jul-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Add GFX11 test coverage
Add GFX11 test coverage to a bunch of tests where it was easy to do so, mostly because the checks are autogenerated and/or GFX11 can share the same checks as GFX10.
[AMDGPU] Add GFX11 test coverage
Add GFX11 test coverage to a bunch of tests where it was easy to do so, mostly because the checks are autogenerated and/or GFX11 can share the same checks as GFX10.
Differential Revision: https://reviews.llvm.org/D129295
show more ...
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
3ce1b963 |
| 08-Sep-2021 |
Joe Nash <Joseph.Nash@amd.com> |
[AMDGPU] Switch PostRA sched to MachineSched
Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D1095
[AMDGPU] Switch PostRA sched to MachineSched
Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D109536
Change-Id: Ia86ba2ae168f12fb34b4d8efdab491f84d936cde
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
d28624a2 |
| 01-Dec-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Stop adding an implicit def of vcc_hi for wave32
This doesn't seem to be needed for anything.
Differential Revision: https://reviews.llvm.org/D92400
|
|
Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
| #
43830790 |
| 07-Oct-2019 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove dubious logic in bidirectional list scheduler
Summary: pickNodeBidirectional tried to compare the best top candidate and the best bottom candidate by examining TopCand.Reason and Bot
[AMDGPU] Remove dubious logic in bidirectional list scheduler
Summary: pickNodeBidirectional tried to compare the best top candidate and the best bottom candidate by examining TopCand.Reason and BotCand.Reason. This is unsound because, after calling pickNodeFromQueue, Cand.Reason does not reflect the most important reason why Cand was chosen. Rather it reflects the most recent reason why it beat some other potential candidate, which could have been for some low priority tie breaker reason.
I have seen this cause problems where TopCand is a good candidate, but because TopCand.Reason is ORDER (which is very low priority) it is repeatedly ignored in favour of a mediocre BotCand. This is not how bidirectional scheduling is supposed to work.
To fix this I changed the code to always compare TopCand and BotCand directly, like the generic implementation of pickNodeBidirectional does. This removes some uncommented AMDGPU-specific logic; if this logic turns out to be important then perhaps it could be moved into an override of tryCandidate instead.
Graphics shader benchmarking on gfx10 shows a lot more positive than negative effects from this change.
Reviewers: arsenm, tstellar, rampitec, kzhuravl, vpykhtin, dstuttard, tpr, atrick, MatzeB
Subscribers: jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68338
show more ...
|
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3 |
|
| #
0846c125 |
| 20-Jun-2019 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] gfx1010 core wave32 changes
Differential Revision: https://reviews.llvm.org/D63204
llvm-svn: 363934
|
|
Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
64196850 |
| 10-May-2019 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Pattern for v_xor3_b32
This also allows three op patterns to use increased constant bus limit of GFX10.
Differential Revision: https://reviews.llvm.org/D61763
llvm-svn: 360395
|
| #
971cb8b6 |
| 06-May-2019 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32
GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for all targets.
Differential Revision: https://reviews.llvm.org/D61525
llvm-s
[AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32
GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for all targets.
Differential Revision: https://reviews.llvm.org/D61525
llvm-svn: 360094
show more ...
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3 |
|
| #
871821f7 |
| 14-Feb-2019 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Ressociate 'add (add x, y), z' to use SALU
Reassociate adds to collect scalar operands in a single instruction when possible. That will result in a scalar add followed by vector instead of
[AMDGPU] Ressociate 'add (add x, y), z' to use SALU
Reassociate adds to collect scalar operands in a single instruction when possible. That will result in a scalar add followed by vector instead of two vector adds, thus better utilizing SALU.
Differential Revision: https://reviews.llvm.org/D58220
llvm-svn: 354066
show more ...
|
|
Revision tags: llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
20fe3d2f |
| 15-Jan-2019 |
Changpeng Fang <changpeng.fang@gmail.com> |
AMDGPU: Raise the priority of MAD24 in instruction selection.
Summary: We have seen performance regression when v_add3 is generated. The major reason is that the v_mad pattern is broken when v_add
AMDGPU: Raise the priority of MAD24 in instruction selection.
Summary: We have seen performance regression when v_add3 is generated. The major reason is that the v_mad pattern is broken when v_add3 is generated. We also see the register pressure increased. While we could not properly estimate register pressure during instruction selection, we can give mad a higher priority.
In this work, we raise the priority for mad24 in selection and resolve the performance regression.
Reviewers: rampitec
Differential Revision: https://reviews.llvm.org/D56745
llvm-svn: 351273
show more ...
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3 |
|
| #
ca4a3294 |
| 06-Dec-2018 |
Nicolai Haehnle <nhaehnle@gmail.com> |
AMDGPU: Generate VALU ThreeOp Integer instructions
Summary: Original patch by: Fabian Wahlster <razor@singul4rity.com>
Change-Id: I148f692a88432541fad468963f58da9ddf79fac5
Reviewers: arsenm, rampi
AMDGPU: Generate VALU ThreeOp Integer instructions
Summary: Original patch by: Fabian Wahlster <razor@singul4rity.com>
Change-Id: I148f692a88432541fad468963f58da9ddf79fac5
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, b-sumner, llvm-commits
Differential Revision: https://reviews.llvm.org/D51995
llvm-svn: 348488
show more ...
|