History log of /llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp (Results 1 – 25 of 63)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# be187369 14-Nov-2024 Kazu Hirata <kazu@google.com>

[AMDGPU] Remove unused includes (NFC) (#116154)

Identified with misc-include-cleaner.


# 4a6d13bf 06-Nov-2024 Thurston Dang <thurston@google.com>

Remove unused variable to fix '[AMDGPU] modify named barrier builtins and intrinsics (#114550)'

https://github.com/llvm/llvm-project/pull/114550 caused a buildbot breakage (https://lab.llvm.org/buil

Remove unused variable to fix '[AMDGPU] modify named barrier builtins and intrinsics (#114550)'

https://github.com/llvm/llvm-project/pull/114550 caused a buildbot breakage (https://lab.llvm.org/buildbot/#/builders/66/builds/5853) because of an unused variable. This patch attempts to fix forward:

/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:106:24: error: variable 'TTy' set but not used [-Werror,-Wunused-but-set-variable]
106 | if (TargetExtType *TTy = AMDGPU::isNamedBarrier(GV)) {
| ^

show more ...


# 8c752900 06-Nov-2024 Gang Chen <gangc@amd.com>

[AMDGPU] modify named barrier builtins and intrinsics (#114550)

Use a local pointer type to represent the named barrier in builtin and
intrinsic. This makes the definitions more user friendly
baca

[AMDGPU] modify named barrier builtins and intrinsics (#114550)

Use a local pointer type to represent the named barrier in builtin and
intrinsic. This makes the definitions more user friendly
bacause they do not need to worry about the hardware ID assignment. Also
this approach is more like the other popular GPU programming language.
Named barriers should be represented as global variables of addrspace(3)
in LLVM-IR. Compiler assigns the special LDS offsets for those variables
during AMDGPULowerModuleLDS pass. Those addresses are converted to hw
barrier ID during instruction selection. The rest of the
instruction-selection changes are primarily due to the
intrinsic-definition changes.

show more ...


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 8d13e7b8 03-Oct-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Qualify auto. NFC. (#110878)

Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 6bba44e8 16-Jul-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Use member initializers. NFC.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 9803de0e 04-Jan-2024 Chaitanya <Krishna.Sankisa@amd.com>

[AMDGPU] Add dynamic LDS size implicit kernel argument to CO-v5 (#65273)

"hidden_dynamic_lds_size" argument will be added in the reserved section
at offset 120 of the implicit argument layout.
Add

[AMDGPU] Add dynamic LDS size implicit kernel argument to CO-v5 (#65273)

"hidden_dynamic_lds_size" argument will be added in the reserved section
at offset 120 of the implicit argument layout.
Add "isDynamicLDSUsed" flag to AMDGPUMachineFunction to identify if a
function uses dynamic LDS.

hidden argument will be added in below cases:

- LDS global is used in the kernel.
- Kernel calls a function which uses LDS global.
- LDS pointer is passed as argument to kernel itself.

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 5272ae66 27-Jul-2023 Diana Picus <Diana-Magda.Picus@amd.com>

[AMDGPU] Add IsChainFunction to the MachineFunctionInfo

This will represent functions with the amdgpu_cs_chain or
amdgpu_cs_chain_preserve calling conventions.

Differential Revision: https://review

[AMDGPU] Add IsChainFunction to the MachineFunctionInfo

This will represent functions with the amdgpu_cs_chain or
amdgpu_cs_chain_preserve calling conventions.

Differential Revision: https://reviews.llvm.org/D156410

show more ...


Revision tags: llvmorg-18-init
# 6043d4df 15-Jul-2023 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Accept an optional max to amdgpu-lds-size attribute for use in PromoteAlloca


# 74e928a0 13-Jul-2023 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu][lds] Remove recalculation of LDS frame from backend

Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pa

[amdgpu][lds] Remove recalculation of LDS frame from backend

Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.

After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.

Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155190

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 0507448d 04-Apr-2023 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Implement dynamic LDS accesses from non-kernel functions

The premise here is to allow non-kernel functions to locate external LDS variables without using LDS or extra magic SGPRs to do so.

[amdgpu] Implement dynamic LDS accesses from non-kernel functions

The premise here is to allow non-kernel functions to locate external LDS variables without using LDS or extra magic SGPRs to do so.

1/ First it crawls the callgraph to work out which external LDS variables are reachable from a given kernel
2/ Then it creates a new `extern char[0]` variable for each kernel, which will alias all the other extern LDS variables because that's the documented behaviour of these variables
3/ The address of that variable is written to a lookup table. The global variable is tagged with metadata to track what address it was allocated at by codegen
4/ The assembler builds the lookup table using the metadata
5/ Any non-kernel functions use the same magic intrinsic used by table lookups of non-dynamic LDS variables to find the address to use

Heavy overlap with the code paths taken for other lowering, in particular the same intrinsic is used to pass the dynamic scope information through the same sgpr as for table lookups of static LDS.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D144233

show more ...


# 75c7019b 30-Mar-2023 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Fix broken error detection in LDS lowering

std::optional<uint32_t> can be compared to uint32_t without warning, but does
not compare to the value within the optional. It needs to be prefixe

[amdgpu] Fix broken error detection in LDS lowering

std::optional<uint32_t> can be compared to uint32_t without warning, but does
not compare to the value within the optional. It needs to be prefixed *.
Wconversion does not warn about this.
```
bool bug(uint32_t Offset, std::optional<uint32_t> Expect)
{
return (Offset != Expect);
}
bool deref(uint32_t Offset, std::optional<uint32_t> Expect)
{
return (Offset != *Expect);
}
```
Both compile without warnings. Wrote the former, intended the latter.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D146775

show more ...


Revision tags: llvmorg-16.0.0
# d3dda422 12-Mar-2023 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu][nfc] Replace ad hoc LDS frame recalculation with absolute_symbol MD

Post ISel, LDS variables are absolute values. Representing them as
such is simpler than the frame recalculation currently

[amdgpu][nfc] Replace ad hoc LDS frame recalculation with absolute_symbol MD

Post ISel, LDS variables are absolute values. Representing them as
such is simpler than the frame recalculation currently used to build assembler
tables from their addresses.

This is a precursor to lowering dynamic/external LDS accesses from non-kernel
functions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D144221

show more ...


Revision tags: llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 69e75ae6 18-Jun-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Don't lazily construct MachineFunctionInfo

This fixes what I consider to be an API flaw I've tripped over
multiple times. The point this is constructed isn't well defined, so
depending on w

CodeGen: Don't lazily construct MachineFunctionInfo

This fixes what I consider to be an API flaw I've tripped over
multiple times. The point this is constructed isn't well defined, so
depending on where this is first called, you can conclude different
information based on the MachineFunction. For example, the AMDGPU
implementation inspected the MachineFrameInfo on construction for the
stack objects and if the frame has calls. This kind of worked in
SelectionDAG which visited all allocas up front, but broke in
GlobalISel which hasn't visited any of the IR when arguments are
lowered.

I've run into similar problems before with the MIR parser and trying
to make use of other MachineFunction fields, so I think it's best to
just categorically disallow dependency on the MachineFunction state in
the constructor and to always construct this at the same time as the
MachineFunction itself.

A missing feature I still could use is a way to access an custom
analysis pass on the IR here.

show more ...


# 6443c0ee 12-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Stop using make_pair and make_tuple. NFC.

C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.l

[AMDGPU] Stop using make_pair and make_tuple. NFC.

C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.llvm.org/D139828

show more ...


# 67819a72 13-Dec-2022 Fangrui Song <i@maskray.me>

[CodeGen] llvm::Optional => std::optional


# d77ae7f2 07-Dec-2022 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Reimplement LDS lowering

Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variabl

[amdgpu] Reimplement LDS lowering

Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.

Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D139433

show more ...


# a862d09a 06-Dec-2022 Nico Weber <thakis@chromium.org>

Revert "[amdgpu] Reimplement LDS lowering"

This reverts commit 982017240d7f25a8a6969b8b73dc51f9ac5b93ed.
Breaks check-llvm, see https://reviews.llvm.org/D139433#3974862


# 98201724 06-Dec-2022 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Reimplement LDS lowering

Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variabl

[amdgpu] Reimplement LDS lowering

Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.

Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D139433

show more ...


# 5a3fe9a0 27-Sep-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Move SIModeRegisterDefaults to SI MFI

It does not belong to a general AMDGPU MFI.

Differential Revision: https://reviews.llvm.org/D134666


# 80ba4328 28-Sep-2022 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu][nfc] Allocate kernel-specific LDS struct deterministically

A kernel may have an associated struct for laying out LDS variables.
This patch puts that instance, if present, at a deterministic

[amdgpu][nfc] Allocate kernel-specific LDS struct deterministically

A kernel may have an associated struct for laying out LDS variables.
This patch puts that instance, if present, at a deterministic address by
allocating it at the same time as the module scope instance.

This is relatively likely to be where the instance was allocated anyway (~NFC)
but will allow later patches to calculate where a given field can be found,
which means a function which is only reachable from a single kernel will be
able to access a LDS variable with zero overhead. That will be particularly
helpful for applications that instantiate a function template containing LDS
variables once per kernel.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D127052

show more ...


# 20a80d60 27-Sep-2022 Vitaly Buka <vitalybuka@google.com>

Revert "[AMDGPU] Move SIModeRegisterDefaults to SI MFI"

Break msan bots. Details in D134666.

This reverts commit 0ce96e06ee0226938e723bd0c8e16e3d2d51f203.


# 0ce96e06 26-Sep-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Move SIModeRegisterDefaults to SI MFI

It does not belong to a general AMDGPU MFI.

Differential Revision: https://reviews.llvm.org/D134666


# 3a205977 19-Jul-2022 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS varia

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments accessed by intrinsic already
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060

show more ...


# bc78c099 04-May-2022 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Elide module lds allocation in kernels with no callees

Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block

[amdgpu] Elide module lds allocation in kernels with no callees

Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block as before
if the attribute is missing or has its default value of true.

Patch uses the new attribute to detect the simplest possible instance of this,
where a kernel makes no calls and thus cannot call any functions that use LDS.

Tests updated to match, coverage was already good. Interesting cases is in
lower-module-lds-offsets where annotating the kernel allows the backend to pick
a different (in this case better) variable ordering than previously. A later
patch will avoid moving kernel variables into module.lds when the kernel can
have this attribute, allowing optimal ordering and locally unused variable
elimination.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122091

show more ...


# 70306542 03-May-2022 serge-sans-paille <sguelton@redhat.com>

[iwyu] Handle regressions in libLLVM header include

Running iwyu-diff on LLVM codebase since fa5a4e1b95c8f37796 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.

[iwyu] Handle regressions in libLLVM header include

Running iwyu-diff on LLVM codebase since fa5a4e1b95c8f37796 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D124847

show more ...


123