SIPeepholeSDWA.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init
# bfd9bc27	24-Jan-2025	Frederik Harwath <frederik.harwath@amd.com>	[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#124131) This PR reapplies the changes from PR #123942 which had to be reverted because of a test failure. The test has been adjusted.
# 99d450e9	23-Jan-2025	Nico Weber <thakis@chromium.org>	Revert "[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942)" This reverts commit 6fdaaafd89d7cbc15dafe3ebf1aa3235d148aaab. Breaks check-llvm, see https://github.com/llvm/llvm-project/pull/123942#issuecomment-2609861953
# 6fdaaafd	23-Jan-2025	Frederik Harwath <frederik.harwath@amd.com>	[AMDGPU] SIPeepholeSDWA: Disable on existing SDWA instructions (#123942) This is meant as a short-term workaround for an invalid conversion in this pass that occurs because existing SDWA selections are not correctly taken into account during the conversion. See the draft PR #123221 for an attempt to fix the actual issue. --------- Co-authored-by: Frederik Harwath <fharwath@amd.com>
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2
# 8d13e7b8	03-Oct-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Qualify auto. NFC. (#110878) Generated automatically with: $ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# e1ee07d0	11-Sep-2024	Akshat Oke <76596238+Akshat-Oke@users.noreply.github.com>	[AMDGPU][NewPM] Port SIPeepholeSDWA pass to NPM (#107049)
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 63fae3ed	17-Jul-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] clang-tidy: no else after return etc. NFC. (#99298)
# f903e3ec	29-Jun-2024	Jeffrey Byrnes <Jeffrey.Byrnes@amd.com>	[AMDGPU] Reset kill flags for multiple uses of SDWAInst Ops Change-Id: I8b56d86a55c397623567945a87ad2f55749680bc
Revision tags: llvmorg-18.1.8
# e7e90dd1	14-Jun-2024	Brian Favela <brianfavela@microsoft.com>	[AMDGPU] Adding multiple use analysis to SIPeepholeSDWA (#94800) Allow for multiple uses of an operand where each instruction can be promoted to SDWA. For instance: ; v_and_b32 v2, lit(0x0000ffff), v2 ; v_and_b32 v3, 6, v2 ; v_and_b32 v2, 1, v2 Can be folded to: ; v_and_b32 v3, 6, sel_lo(v2) ; v_and_b32 v2, 1, sel_lo(v2)
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 52d5b8e0	06-Mar-2024	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[AMDGPU] Don't form sext/abs/neg fp8 cvt (#83843) gfx940 does not allow abs/sext/neg on v_cvt_fp8/bf8 & pk variants. Fixes SWDEV-447468
# a845ea38	28-Feb-2024	Valery Pykhtin <valery.pykhtin@gmail.com>	[AMDGPU] Fix SDWA 'preserve' transformation for instructions in different basic blocks. (#82406) This fixes crash when operand sources for V_OR instruction reside in different basic blocks.
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4
# d2edff83	19-Oct-2023	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[AMDGPU] PeepholeSDWA: Don't assume inst srcs are registers (#69576) To fix that ticket we only needed to address the V_LSHLREV_B16 case, but I did it for all insts just in case. Fixes #66899
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 2802739d	11-Jun-2023	David Green <david.green@arm.com>	[NFC] Replace ;; with ;
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# a07584d5	03-Feb-2023	Jay Foad <jay.foad@amd.com>	[CodeGen] Make more use of MachineOperand::getOperandNo. NFC. Differential Revision: https://reviews.llvm.org/D143252
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# 768aed13	13-Jan-2023	Jay Foad <jay.foad@amd.com>	[MC] Make more use of MCInstrDesc::operands. NFC. Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A future patch will remove opInfo_begin and opInfo_end. Also use it instead of raw access to the OpInfo pointer. A future patch will remove this pointer. Differential Revision: https://reviews.llvm.org/D142213
Revision tags: llvmorg-15.0.7
# 6443c0ee	12-Dec-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Stop using make_pair and make_tuple. NFC. C++17 allows us to call constructors pair and tuple instead of helper functions make_pair and make_tuple. Differential Revision: https://reviews.llvm.org/D139828
# 67819a72	13-Dec-2022	Fangrui Song <i@maskray.me>	[CodeGen] llvm::Optional => std::optional
# 20cde154	03-Dec-2022	Kazu Hirata <kazu@google.com>	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Revision tags: llvmorg-15.0.6
# 09e0aeaa	26-Nov-2022	Kazu Hirata <kazu@google.com>	[AMDGPU] Use std::optional in SIPeepholeSDWA.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
# 2652db4d	17-Nov-2022	Yashwant Singh <Yashwant.Singh@amd.com>	Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form This patch fixes some of the V_ADD/SUB_U64_PSEUDO not getting converted to their sdwa form. We still get below patterns in generated code: v_and_b32_e32 v0, 0xff, v0 v_add_co_u32_e32 v0, vcc, v1, v0 v_addc_co_u32_e64 v1, s[0:1], 0, 0, vcc and, v_and_b32_e32 v2, 0xff, v2 v_add_co_u32_e32 v0, vcc, v0, v2 v_addc_co_u32_e32 v1, vcc, 0, v1, vcc 1st and 2nd instructions of both above examples should have been folded into sdwa add with BYTE_0 src operand. The reason being the pseudo instruction is broken down into VOP3 instruction pair of V_ADD_CO_U32_e64 and V_ADDC_U32_e64. The sdwa pass attempts lowering them to their VOP2 form before converting them into sdwa instructions. However V_ADDC_U32_e64 cannot be shrunk to it's VOP2 form if it has non-reg src1 operand. This change attempts to fix that problem by only shrinking V_ADD_CO_U32_e64 instruction. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136663
Revision tags: llvmorg-15.0.5
# 7425077e	07-Nov-2022	Pierre van Houtryve <pierre.vanhoutryve@amd.com>	[AMDGPU] Add & use `hasNamedOperand`, NFC In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1. This is fine in itself, but it's verbose and doesn't make the intention clear, IMHO. I added a `hasNamedOperand` and replaced all cases I could find with regexes and manually. Reviewed By: arsenm, foad Differential Revision: https://reviews.llvm.org/D137540
Revision tags: llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 6527b2a4	18-Feb-2022	Sebastian Neubauer <Sebastian.Neubauer@amd.com>	[AMDGPU][NFC] Fix typos Fix some typos in the amdgpu backend. Differential Revision: https://reviews.llvm.org/D119235
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 399b7de0	18-Sep-2021	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Add a regclass flag for scalar registers Along with vector RC flags, this scalar flag will make various regclass queries like `isVGPR` more accurate. Regclasses other than vectors are currently set with the new flag even though certain unallocatable classes aren't truly scalars. It would be ok as long as they remain unallocatable. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D110053
Revision tags: llvmorg-13.0.0-rc3
# 654c89d8	06-Sep-2021	Christudasan Devadasan <Christudasan.Devadasan@amd.com>	[AMDGPU] Make vector superclasses allocatable The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to enable copy between VGPR and AGPR registers during regalloc. Also, added the missing AV register classes from 192b to 1024b. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D109300
# d1f45ed5	11-Nov-2021	Neubauer, Sebastian <Sebastian.Neubauer@amd.com>	[AMDGPU][NFC] Fix typos Differential Revision: https://reviews.llvm.org/D113672
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04	20-Jan-2021	dfukalov <daniil.fukalov@amd.com>	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036