History log of /llvm-project/llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp (Results 1 – 25 of 71)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init
# bfd9bc27 24-Jan-2025 Frederik Harwath <frederik.harwath@amd.com>

[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#124131)

This PR reapplies the changes from PR #123942 which had to be reverted
because of a test failure. The test has been adjusted.


# 99d450e9 23-Jan-2025 Nico Weber <thakis@chromium.org>

Revert "[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)"

This reverts commit 6fdaaafd89d7cbc15dafe3ebf1aa3235d148aaab.
Breaks check-llvm, see
https://github.com/llvm/llvm-pr

Revert "[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)"

This reverts commit 6fdaaafd89d7cbc15dafe3ebf1aa3235d148aaab.
Breaks check-llvm, see
https://github.com/llvm/llvm-project/pull/123942#issuecomment-2609861953

show more ...


# 6fdaaafd 23-Jan-2025 Frederik Harwath <frederik.harwath@amd.com>

[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)

This is meant as a short-term workaround for an invalid conversion in
this pass that occurs because existing SDWA selections

[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)

This is meant as a short-term workaround for an invalid conversion in
this pass that occurs because existing SDWA selections are not correctly
taken into account during the conversion.

See the draft PR #123221 for an attempt to fix the actual issue.

---------

Co-authored-by: Frederik Harwath <fharwath@amd.com>

show more ...


Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2
# 8d13e7b8 03-Oct-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Qualify auto. NFC. (#110878)

Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# e1ee07d0 11-Sep-2024 Akshat Oke <76596238+Akshat-Oke@users.noreply.github.com>

[AMDGPU][NewPM] Port SIPeepholeSDWA pass to NPM (#107049)


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 63fae3ed 17-Jul-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] clang-tidy: no else after return etc. NFC. (#99298)


# f903e3ec 29-Jun-2024 Jeffrey Byrnes <Jeffrey.Byrnes@amd.com>

[AMDGPU] Reset kill flags for multiple uses of SDWAInst Ops

Change-Id: I8b56d86a55c397623567945a87ad2f55749680bc


Revision tags: llvmorg-18.1.8
# e7e90dd1 14-Jun-2024 Brian Favela <brianfavela@microsoft.com>

[AMDGPU] Adding multiple use analysis to SIPeepholeSDWA (#94800)

Allow for multiple uses of an operand where each instruction can be
promoted to SDWA.

For instance:

; v_and_b32 v2, lit(0x0000

[AMDGPU] Adding multiple use analysis to SIPeepholeSDWA (#94800)

Allow for multiple uses of an operand where each instruction can be
promoted to SDWA.

For instance:

; v_and_b32 v2, lit(0x0000ffff), v2
; v_and_b32 v3, 6, v2
; v_and_b32 v2, 1, v2

Can be folded to:
; v_and_b32 v3, 6, sel_lo(v2)
; v_and_b32 v2, 1, sel_lo(v2)

show more ...


Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 52d5b8e0 06-Mar-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Don't form sext/abs/neg fp8 cvt (#83843)

gfx940 does not allow abs/sext/neg on v_cvt_fp8/bf8 & pk variants.

Fixes SWDEV-447468


# a845ea38 28-Feb-2024 Valery Pykhtin <valery.pykhtin@gmail.com>

[AMDGPU] Fix SDWA 'preserve' transformation for instructions in different basic blocks. (#82406)

This fixes crash when operand sources for V_OR instruction reside in
different basic blocks.


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4
# d2edff83 19-Oct-2023 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] PeepholeSDWA: Don't assume inst srcs are registers (#69576)

To fix that ticket we only needed to address the V_LSHLREV_B16 case, but
I did it for all insts just in case.

Fixes #66899


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 2802739d 11-Jun-2023 David Green <david.green@arm.com>

[NFC] Replace ;; with ;


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# a07584d5 03-Feb-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Make more use of MachineOperand::getOperandNo. NFC.

Differential Revision: https://reviews.llvm.org/D143252


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 768aed13 13-Jan-2023 Jay Foad <jay.foad@amd.com>

[MC] Make more use of MCInstrDesc::operands. NFC.

Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A futu

[MC] Make more use of MCInstrDesc::operands. NFC.

Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A future patch will remove opInfo_begin and opInfo_end.

Also use it instead of raw access to the OpInfo pointer. A future patch
will remove this pointer.

Differential Revision: https://reviews.llvm.org/D142213

show more ...


Revision tags: llvmorg-15.0.7
# 6443c0ee 12-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Stop using make_pair and make_tuple. NFC.

C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.l

[AMDGPU] Stop using make_pair and make_tuple. NFC.

C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.llvm.org/D139828

show more ...


# 67819a72 13-Dec-2022 Fangrui Song <i@maskray.me>

[CodeGen] llvm::Optional => std::optional


# 20cde154 03-Dec-2022 Kazu Hirata <kazu@google.com>

[Target] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of

[Target] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


Revision tags: llvmorg-15.0.6
# 09e0aeaa 26-Nov-2022 Kazu Hirata <kazu@google.com>

[AMDGPU] Use std::optional in SIPeepholeSDWA.cpp (NFC)

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-g

[AMDGPU] Use std::optional in SIPeepholeSDWA.cpp (NFC)

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


# 2652db4d 17-Nov-2022 Yashwant Singh <Yashwant.Singh@amd.com>

Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form

This patch fixes some of the V_ADD/SUB_U64_PSEUDO not getting converted to their sdwa form.
We still get below patterns in ge

Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form

This patch fixes some of the V_ADD/SUB_U64_PSEUDO not getting converted to their sdwa form.
We still get below patterns in generated code:
v_and_b32_e32 v0, 0xff, v0
v_add_co_u32_e32 v0, vcc, v1, v0
v_addc_co_u32_e64 v1, s[0:1], 0, 0, vcc

and,
v_and_b32_e32 v2, 0xff, v2
v_add_co_u32_e32 v0, vcc, v0, v2
v_addc_co_u32_e32 v1, vcc, 0, v1, vcc

1st and 2nd instructions of both above examples should have been folded into sdwa add with BYTE_0 src operand.

The reason being the pseudo instruction is broken down into VOP3 instruction pair of V_ADD_CO_U32_e64 and V_ADDC_U32_e64.
The sdwa pass attempts lowering them to their VOP2 form before converting them into sdwa instructions. However V_ADDC_U32_e64
cannot be shrunk to it's VOP2 form if it has non-reg src1 operand.
This change attempts to fix that problem by only shrinking V_ADD_CO_U32_e64 instruction.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D136663

show more ...


Revision tags: llvmorg-15.0.5
# 7425077e 07-Nov-2022 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Add & use `hasNamedOperand`, NFC

In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1.
This is fine in itself, but it's verbose and doesn'

[AMDGPU] Add & use `hasNamedOperand`, NFC

In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1.
This is fine in itself, but it's verbose and doesn't make the intention clear, IMHO. I added a `hasNamedOperand` and replaced all cases I could find with regexes and manually.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D137540

show more ...


Revision tags: llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 6527b2a4 18-Feb-2022 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU][NFC] Fix typos

Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 399b7de0 18-Sep-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Add a regclass flag for scalar registers

Along with vector RC flags, this scalar flag will
make various regclass queries like `isVGPR` more
accurate.

Regclasses other than vectors are curr

[AMDGPU] Add a regclass flag for scalar registers

Along with vector RC flags, this scalar flag will
make various regclass queries like `isVGPR` more
accurate.

Regclasses other than vectors are currently set
with the new flag even though certain unallocatable
classes aren't truly scalars. It would be ok as long
as they remain unallocatable.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D110053

show more ...


Revision tags: llvmorg-13.0.0-rc3
# 654c89d8 06-Sep-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Make vector superclasses allocatable

The combined vector register classes with both
VGPRs and AGPRs are currently unallocatable.
This patch turns them into allocatable as a
prerequisite to

[AMDGPU] Make vector superclasses allocatable

The combined vector register classes with both
VGPRs and AGPRs are currently unallocatable.
This patch turns them into allocatable as a
prerequisite to enable copy between VGPR and
AGPR registers during regalloc.

Also, added the missing AV register classes from
192b to 1024b.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D109300

show more ...


# d1f45ed5 11-Nov-2021 Neubauer, Sebastian <Sebastian.Neubauer@amd.com>

[AMDGPU][NFC] Fix typos

Differential Revision: https://reviews.llvm.org/D113672


Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04 20-Jan-2021 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets

... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036


123