Revision tags: llvmorg-21-init |
|
#
bfd9bc27 |
| 24-Jan-2025 |
Frederik Harwath <frederik.harwath@amd.com> |
[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#124131)
This PR reapplies the changes from PR #123942 which had to be reverted because of a test failure. The test has been adjusted.
|
#
99d450e9 |
| 23-Jan-2025 |
Nico Weber <thakis@chromium.org> |
Revert "[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)"
This reverts commit 6fdaaafd89d7cbc15dafe3ebf1aa3235d148aaab. Breaks check-llvm, see https://github.com/llvm/llvm-pr
Revert "[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)"
This reverts commit 6fdaaafd89d7cbc15dafe3ebf1aa3235d148aaab. Breaks check-llvm, see https://github.com/llvm/llvm-project/pull/123942#issuecomment-2609861953
show more ...
|
#
6fdaaafd |
| 23-Jan-2025 |
Frederik Harwath <frederik.harwath@amd.com> |
[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)
This is meant as a short-term workaround for an invalid conversion in
this pass that occurs because existing SDWA selections
[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)
This is meant as a short-term workaround for an invalid conversion in
this pass that occurs because existing SDWA selections are not correctly
taken into account during the conversion.
See the draft PR #123221 for an attempt to fix the actual issue.
---------
Co-authored-by: Frederik Harwath <fharwath@amd.com>
show more ...
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
8d13e7b8 |
| 03-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Qualify auto. NFC. (#110878)
Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
e1ee07d0 |
| 11-Sep-2024 |
Akshat Oke <76596238+Akshat-Oke@users.noreply.github.com> |
[AMDGPU][NewPM] Port SIPeepholeSDWA pass to NPM (#107049)
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
63fae3ed |
| 17-Jul-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] clang-tidy: no else after return etc. NFC. (#99298)
|
#
f903e3ec |
| 29-Jun-2024 |
Jeffrey Byrnes <Jeffrey.Byrnes@amd.com> |
[AMDGPU] Reset kill flags for multiple uses of SDWAInst Ops
Change-Id: I8b56d86a55c397623567945a87ad2f55749680bc
|
Revision tags: llvmorg-18.1.8 |
|
#
e7e90dd1 |
| 14-Jun-2024 |
Brian Favela <brianfavela@microsoft.com> |
[AMDGPU] Adding multiple use analysis to SIPeepholeSDWA (#94800)
Allow for multiple uses of an operand where each instruction can be
promoted to SDWA.
For instance:
; v_and_b32 v2, lit(0x0000
[AMDGPU] Adding multiple use analysis to SIPeepholeSDWA (#94800)
Allow for multiple uses of an operand where each instruction can be
promoted to SDWA.
For instance:
; v_and_b32 v2, lit(0x0000ffff), v2
; v_and_b32 v3, 6, v2
; v_and_b32 v2, 1, v2
Can be folded to:
; v_and_b32 v3, 6, sel_lo(v2)
; v_and_b32 v2, 1, sel_lo(v2)
show more ...
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
52d5b8e0 |
| 06-Mar-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Don't form sext/abs/neg fp8 cvt (#83843)
gfx940 does not allow abs/sext/neg on v_cvt_fp8/bf8 & pk variants.
Fixes SWDEV-447468
|
#
a845ea38 |
| 28-Feb-2024 |
Valery Pykhtin <valery.pykhtin@gmail.com> |
[AMDGPU] Fix SDWA 'preserve' transformation for instructions in different basic blocks. (#82406)
This fixes crash when operand sources for V_OR instruction reside in
different basic blocks.
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4 |
|
#
d2edff83 |
| 19-Oct-2023 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] PeepholeSDWA: Don't assume inst srcs are registers (#69576)
To fix that ticket we only needed to address the V_LSHLREV_B16 case, but
I did it for all insts just in case.
Fixes #66899
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
2802739d |
| 11-Jun-2023 |
David Green <david.green@arm.com> |
[NFC] Replace ;; with ;
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
a07584d5 |
| 03-Feb-2023 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Make more use of MachineOperand::getOperandNo. NFC.
Differential Revision: https://reviews.llvm.org/D143252
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
768aed13 |
| 13-Jan-2023 |
Jay Foad <jay.foad@amd.com> |
[MC] Make more use of MCInstrDesc::operands. NFC.
Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A futu
[MC] Make more use of MCInstrDesc::operands. NFC.
Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A future patch will remove opInfo_begin and opInfo_end.
Also use it instead of raw access to the OpInfo pointer. A future patch will remove this pointer.
Differential Revision: https://reviews.llvm.org/D142213
show more ...
|
Revision tags: llvmorg-15.0.7 |
|
#
6443c0ee |
| 12-Dec-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Stop using make_pair and make_tuple. NFC.
C++17 allows us to call constructors pair and tuple instead of helper functions make_pair and make_tuple.
Differential Revision: https://reviews.l
[AMDGPU] Stop using make_pair and make_tuple. NFC.
C++17 allows us to call constructors pair and tuple instead of helper functions make_pair and make_tuple.
Differential Revision: https://reviews.llvm.org/D139828
show more ...
|
#
67819a72 |
| 13-Dec-2022 |
Fangrui Song <i@maskray.me> |
[CodeGen] llvm::Optional => std::optional
|
#
20cde154 |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of
[Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
Revision tags: llvmorg-15.0.6 |
|
#
09e0aeaa |
| 26-Nov-2022 |
Kazu Hirata <kazu@google.com> |
[AMDGPU] Use std::optional in SIPeepholeSDWA.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-g
[AMDGPU] Use std::optional in SIPeepholeSDWA.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
#
2652db4d |
| 17-Nov-2022 |
Yashwant Singh <Yashwant.Singh@amd.com> |
Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form
This patch fixes some of the V_ADD/SUB_U64_PSEUDO not getting converted to their sdwa form. We still get below patterns in ge
Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form
This patch fixes some of the V_ADD/SUB_U64_PSEUDO not getting converted to their sdwa form. We still get below patterns in generated code: v_and_b32_e32 v0, 0xff, v0 v_add_co_u32_e32 v0, vcc, v1, v0 v_addc_co_u32_e64 v1, s[0:1], 0, 0, vcc
and, v_and_b32_e32 v2, 0xff, v2 v_add_co_u32_e32 v0, vcc, v0, v2 v_addc_co_u32_e32 v1, vcc, 0, v1, vcc
1st and 2nd instructions of both above examples should have been folded into sdwa add with BYTE_0 src operand.
The reason being the pseudo instruction is broken down into VOP3 instruction pair of V_ADD_CO_U32_e64 and V_ADDC_U32_e64. The sdwa pass attempts lowering them to their VOP2 form before converting them into sdwa instructions. However V_ADDC_U32_e64 cannot be shrunk to it's VOP2 form if it has non-reg src1 operand. This change attempts to fix that problem by only shrinking V_ADD_CO_U32_e64 instruction.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D136663
show more ...
|
Revision tags: llvmorg-15.0.5 |
|
#
7425077e |
| 07-Nov-2022 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Add & use `hasNamedOperand`, NFC
In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1. This is fine in itself, but it's verbose and doesn'
[AMDGPU] Add & use `hasNamedOperand`, NFC
In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1. This is fine in itself, but it's verbose and doesn't make the intention clear, IMHO. I added a `hasNamedOperand` and replaced all cases I could find with regexes and manually.
Reviewed By: arsenm, foad
Differential Revision: https://reviews.llvm.org/D137540
show more ...
|
Revision tags: llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
#
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <Sebastian.Neubauer@amd.com> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
#
399b7de0 |
| 18-Sep-2021 |
Christudasan Devadasan <Christudasan.Devadasan@amd.com> |
[AMDGPU] Add a regclass flag for scalar registers
Along with vector RC flags, this scalar flag will make various regclass queries like `isVGPR` more accurate.
Regclasses other than vectors are curr
[AMDGPU] Add a regclass flag for scalar registers
Along with vector RC flags, this scalar flag will make various regclass queries like `isVGPR` more accurate.
Regclasses other than vectors are currently set with the new flag even though certain unallocatable classes aren't truly scalars. It would be ok as long as they remain unallocatable.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D110053
show more ...
|
Revision tags: llvmorg-13.0.0-rc3 |
|
#
654c89d8 |
| 06-Sep-2021 |
Christudasan Devadasan <Christudasan.Devadasan@amd.com> |
[AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to
[AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to enable copy between VGPR and AGPR registers during regalloc.
Also, added the missing AV register classes from 192b to 1024b.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D109300
show more ...
|
#
d1f45ed5 |
| 11-Nov-2021 |
Neubauer, Sebastian <Sebastian.Neubauer@amd.com> |
[AMDGPU][NFC] Fix typos
Differential Revision: https://reviews.llvm.org/D113672
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|