AMDGPUCallLowering.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6
# b1d42465	08-Dec-2024	Austin Kerbow <Austin.Kerbow@amd.com>	[AMDGPU] Fix hidden kernarg preload count inconsistency (#116759) It is possible that the number of hidden arguments that are selected to be preloaded in AMDGPULowerKernel arguments and isel can differ. This isn't an issue with explicit arguments since isel can lower the argument correctly either way, but with hidden arguments we may have alignment issues if we try to load these hidden arguments that were added to the kernel signature. The reason for the mismatch is that isel reserves an extra synthetic user SGPR for module LDS. Instead of teaching lowerFormalArguments how to handle these properly it makes more sense and is less expensive to fix the mismatch and assert if we ever run into this issue again. We should never be trying to lower these in the normal way. In a future change we probably want to revise how we track "synthetic" user SGPRs and unify the handling in GCNUserSGPRUsageInfo. Sometimes synthetic SGPRSs are considered user SGPRs and sometimes they are not. Until then this patch resolves the inconsistency, fixes the bug, and is otherwise a NFC.
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4
# be187369	14-Nov-2024	Kazu Hirata <kazu@google.com>	[AMDGPU] Remove unused includes (NFC) (#116154) Identified with misc-include-cleaner.
# 6548b635	09-Nov-2024	Shilei Tian <i@tianshilei.me>	Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
# ca33649a	08-Nov-2024	Shilei Tian <i@tianshilei.me>	Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.
# e215a1e2	08-Nov-2024	Shilei Tian <i@tianshilei.me>	[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)
# 8ee5e19c	30-Oct-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Fix @llvm.amdgcn.cs.chain with SGPR args not provably uniform (#114232) The correct behaviour is to insert a readfirstlane. SelectionDAG was already doing this in some cases, but not in the general case for chain calls. GlobalISel was already doing this for return values but not for arguments.
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 428ae0f1	03-Oct-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Do not tail call if an inreg argument requires waterfalling (#111002) If we have a divergent value passed to an outgoing inreg argument, the call needs to be executed in a waterfall loop and thus cannot be tail called. The waterfall handling of arbitrary calls is broken on the selectiondag path, so some of these cases still hit an error later. I also noticed the argument evaluation code in isEligibleForTailCallOptimization is not correctly accounting for implicit argument assignments. It also seems inreg codegen is generally broken; we are assigning arguments to the reserved private resource descriptor.
# 8d13e7b8	03-Oct-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Qualify auto. NFC. (#110878) Generated automatically with: $ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# 3b9f1839	13-Aug-2024	Kazu Hirata <kazu@google.com>	[AMDGPU] Use llvm::any_of, llvm::all_of, and llvm::none_of (NFC) (#103007)
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 74b87b02	16-Jul-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Fix and add namespace closing comments. NFC.
# 9df71d76	28-Jun-2024	Nikita Popov <npopov@redhat.com>	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919) Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# 256343a0	26-Mar-2024	Thomas Symalla <5754458+tsymalla@users.noreply.github.com>	Revert "Update amdgpu_gfx functions to use s0-s3 for inreg SGPR arguments on targets using scratch instructions for stack #78226" (#86273) Reverts llvm/llvm-project#81394 This reverts commit 3ac243bc0d7922d083af2cf025247b5698556062. It is not handling RSrc registers s0-s3 correctly. This leads to a broken test, where it expects s0-s3 as function argument and uses it as RSrc register as well. We need to re-visit the patch, but apparently we only want to have s0-s3 as argument registers if we don't need them as RSrc registers.
# 3ac243bc	21-Mar-2024	SahilPatidar <patidarsahil2001@gmail.com>	Update amdgpu_gfx functions to use s0-s3 for inreg SGPR arguments on targets using scratch instructions for stack #78226 (#81394) Resolve #78226
Revision tags: llvmorg-18.1.2
# ec34699f	18-Mar-2024	Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>	[GlobalISel] convergence control tokens and intrinsics (#67006) [GlobalISel] Implement convergence control tokens and intrinsics in GMIR In the IR translator, convert the LLVM token type to LLT::token(), which is an alias for the s0 type. These show up as implicit uses on convergent operations. Differential Revision: https://reviews.llvm.org/D158147
Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# bc82cfb3	21-Jan-2024	Emma Pilkington <emma.pilkington95@gmail.com>	[AMDGPU] Add an asm directive to track code_object_version (#76267) Named '.amdhsa_code_object_version'. This directive sets the e_ident[ABIVERSION] in the ELF header, and should be used as the assumed COV for the rest of the asm file. This commit also weakens the --amdhsa-code-object-version CL flag. Previously, the CL flag took precedence over the IR flag. Now the IR flag/asm directive take precedence over the CL flag. This is implemented by merging a few COV-checking functions in AMDGPUBaseInfo.h.
# af4f1766	17-Jan-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Allocate special SGPRs before user SGPR arguments (#78234)
# 480cc413	16-Jan-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU/GlobalISel: Handle inreg arguments as SGPRs (#78123) This is the missing GISel part of 54470176afe20b16e6b026ab989591d1d19ad2b7
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5
# 7f5d59b3	06-Nov-2023	Diana <Diana-Magda.Picus@amd.com>	[AMDGPU] ISel for @llvm.amdgcn.cs.chain intrinsic (#68186) The @llvm.amdgcn.cs.chain intrinsic is essentially a call. The call parameters are bundled up into 2 intrinsic arguments, one for those that should go in the SGPRs (the 3rd intrinsic argument), and one for those that should go in the VGPRs (the 4th intrinsic argument). Both will often be some kind of aggregate. Both instruction selection frameworks have some internal representation for intrinsics (G_INTRINSIC[_WITH_SIDE_EFFECTS] for GlobalISel, ISD::INTRINSIC_[VOID|WITH_CHAIN] for DAGISel), but we can't use those because aggregates are dissolved very early on during ISel and we'd lose the inreg information. Therefore, this patch shortcircuits both the IRTranslator and SelectionDAGBuilder to lower this intrinsic as a call from the very start. It tries to use the existing infrastructure as much as possible, by calling into the code for lowering tail calls. This has already gone through a few rounds of review in Phab: Differential Revision: https://reviews.llvm.org/D153761
Revision tags: llvmorg-17.0.4
# 2f4328e6	24-Oct-2023	Craig Topper <craig.topper@sifive.com>	[GISel] Make assignValueToReg take CCValAssign by const reference. (#70086) This was previously passed by value. It used to be passed by non-const reference, but it was changed to value in D110610. I'm not sure why.
# 9f592cbc	24-Oct-2023	Craig Topper <craig.topper@sifive.com>	[GISel] Pass MPO and VA to assignValueToAddress by const reference. NFC (#69810) Previously they were passed by non-const reference. No in tree target modifies the values. This makes it possible to call assignValueToAddress from assignCustomValue without a const_cast. For example in this patch https://github.com/llvm/llvm-project/pull/69138.
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 343be513	19-Aug-2023	Austin Kerbow <Austin.Kerbow@amd.com>	[AMDGPU] Add utilities to track number of user SGPRs. NFC. Factor out and unify some common code that calculates and tracks the number of user SGRPs. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D159439
# ef38e6d9	18-Aug-2023	Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>	[GlobalISel] introduce MIFlag::NoConvergent Some opcodes in MIR are defined to be convergent by the target by setting IsConvergent in the corresponding TD file. For example, in AMDGPU, the opcodes G_SI_CALL and G_INTRINSIC* are marked as convergent. But this is too conservative, since calls to functions that do not execute convergent operations should not be marked convergent. This information is available in LLVM IR. The new flag MIFlag::NoConvergent now allows the IR translator to mark an instruction as not performing any convergent operations. It is relevant only on occurrences of opcodes that are marked isConvergent in the target. Differential Revision: https://reviews.llvm.org/D157475
Revision tags: llvmorg-17.0.0-rc2
# d9847cde	31-Jul-2023	Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>	[GlobalISel] convergent intrinsics Introduced the convergent equivalent of the existing G_INTRINSIC opcodes: - G_INTRINSIC_CONVERGENT - G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS Out of the targets that currently have some support for GlobalISel, the patch assumes that the convergent intrinsics only relevant to SPIRV and AMDGPU. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154766
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init
# 74e928a0	13-Jul-2023	Jon Chesterfield <jonathanchesterfield@gmail.com>	[amdgpu][lds] Remove recalculation of LDS frame from backend Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend. Prior to this patch: The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol metadata so that the assembler can build lookup tables out of it. There is a fragile association between kernel functions and named structs which is used to recompute the frame layout in the backend, with fatal_errors catching inconsistencies in the second calculation. After this patch: The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same absolute_symbol metadata that the assembler uses to place objects within that frame size. Deleted the now dead allocation code from the backend. Left for a later cleanup: - enabling lowering for anonymous functions - removing the elide-module-lds attribute (test churn, it's not used by llc any more) - adjusting the dynamic alignment check to not use symbol names Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D155190
Revision tags: llvmorg-16.0.6
# 3d0350b7	07-Jun-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Add MF independent version of getImplicitParameterOffset