History log of /llvm-project/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp (Results 76 – 99 of 99)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 86caf517 11-Dec-2021 Jon Chesterfield <jonathanchesterfield@gmail.com>

Revert "[amdgpu][nfc] Delete dead code in LowerModuleLDS"

This reverts commit 7b9ab06d10a6a989f76e6c5ecf89d906f838fe7d.
Said code is better removed as part of a larger change.


# 7b9ab06d 10-Dec-2021 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu][nfc] Delete dead code in LowerModuleLDS


# 04b2f6ea 09-Dec-2021 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu][nfc] Drop dead PtrSet, fix a comment


# f0e3b39a 08-Dec-2021 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu][nfc] Move non-shared code out of LDSUtils


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# cbdf624b 19-Sep-2021 Brendon Cahoon <brendon.cahoon@amd.com>

[AMDGPU] Correctly merge alias.scope and noalias metadata for memops

When adding alias.scope and noalias metadata to a memcpy function,
the alias.scope and noalias metadata from the operands are mer

[AMDGPU] Correctly merge alias.scope and noalias metadata for memops

When adding alias.scope and noalias metadata to a memcpy function,
the alias.scope and noalias metadata from the operands are merged.
The rule for merging alias.scope is to take the intersection of
the domains and the union of the scopes within those domains.
The rule for merging noalias is to take the intersection.

The bug is that AMDGPULowerModuleLDS was using concatenation for
both alias.scope and noalias. For example, when f1 and f2 are added
to the LDS structure and there is a memcpy(f2, f1, sizeof(f1)).
Then, concatenation creates noalias metadata for the memcpy that
includes both {f1, f2}. That means that the memcpy is assumed
not to alias a prior load of f2, which enables the optimizer to
remove a load of f2 that occurs after mempcy.

The function MDNode::getmostGenericAliasScope defines the semantics
for alias.scope. There is a function, combineMetadata in Local.cpp,
that uses intersect for noalias.

Differential Revision: https://reviews.llvm.org/D110049

show more ...


# dc6e8dfd 20-Sep-2021 Jacob Lambert <jacob.lambert@amd.com>

[AMDGPU][NFC] Correct typos in lib/Target/AMDGPU/AMDGPU*.cpp files. Test commit for new contributor.


Revision tags: llvmorg-13.0.0-rc3
# ce51c5d4 26-Aug-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix crashing on kernel declarations when lowering LDS

This was trying to insert the used marker into a declaration.


Revision tags: llvmorg-13.0.0-rc2
# 8d7d89b0 17-Aug-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Add alias.scope metadata to lowered LDS struct

Alias analysis is unable to disambiguate accesses to the structure
fields without it unlike distinct variables. As a result we cannot
combine

[AMDGPU] Add alias.scope metadata to lowered LDS struct

Alias analysis is unable to disambiguate accesses to the structure
fields without it unlike distinct variables. As a result we cannot
combine ds_read and ds_write operations in a case of any store in
between which always considered clobbering.

Differential Revision: https://reviews.llvm.org/D108315

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init
# 9dc26366 16-Jul-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Disable LDS lowering for GFX shaders

Apparently these need external LDS symbols to remain.

Fixes: SC1-3279

Differential Revision: https://reviews.llvm.org/D106288


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3
# d274d64e 23-Jun-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Check for pointer operand while refining LDS align

Also skips the propagation if alignment is 1.

Differential Revision: https://reviews.llvm.org/D104796


Revision tags: llvmorg-12.0.1-rc2
# 2b43209e 15-Jun-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Propagate LDS align into to instructions

Differential Revision: https://reviews.llvm.org/D104316


# d797a7f8 15-Jun-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Use performOptimizedStructLayout for LDS sort

This gives better packing.

Differential Revision: https://reviews.llvm.org/D104331


# 80fd5fa5 21-Jun-2021 hsmahesha <mahesha.comp@gmail.com>

[AMDGPU] Replace non-kernel function uses of LDS globals by pointers.

The main motivation behind pointer replacement of LDS use within non-kernel
functions is - to *avoid* subsequent LDS lowering pa

[AMDGPU] Replace non-kernel function uses of LDS globals by pointers.

The main motivation behind pointer replacement of LDS use within non-kernel
functions is - to *avoid* subsequent LDS lowering pass from directly packing
LDS (assume large LDS) into a struct type which would otherwise cause allocating
huge memory for struct instance within every kernel.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103225

show more ...


# f6632f11 10-Jun-2021 hsmahesha <mahesha.comp@gmail.com>

[AMDGPU] Fix missing lowering of LDS used in global scope.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103431


# 05289dfb 07-Jun-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Handle constant LDS uses from different kernels

This allows to lower an LDS variable into a kernel structure
even if there is a constant expression used from different
kernels.

Differentia

[AMDGPU] Handle constant LDS uses from different kernels

This allows to lower an LDS variable into a kernel structure
even if there is a constant expression used from different
kernels.

Differential Revision: https://reviews.llvm.org/D103655

show more ...


# 713ca2f3 07-Jun-2021 hsmahesha <mahesha.comp@gmail.com>

[AMDGPU] Introduce command line switch to control super aligning of LDS.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103817


# 52ffbfdf 07-Jun-2021 hsmahesha <mahesha.comp@gmail.com>

[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering.

Before packing LDS globals into a sorted structure, make sure that
their alignment is properly updated based on their siz

[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering.

Before packing LDS globals into a sorted structure, make sure that
their alignment is properly updated based on their size. This will make
sure that the members of sorted structure are properly aligned, and
hence it will further reduce the probability of unaligned LDS access.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103261

show more ...


# 753437fc 04-Jun-2021 hsmahesha <mahesha.comp@gmail.com>

Revert "[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering."

This reverts commit d71ff907ef23eaef86ad66ba2d711e4986cd6cb2.


# d71ff907 04-Jun-2021 hsmahesha <mahesha.comp@gmail.com>

[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering.

Before packing LDS globals into a sorted structure, make sure that
their alignment is properly updated based on their siz

[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering.

Before packing LDS globals into a sorted structure, make sure that
their alignment is properly updated based on their size. This will make
sure that the members of sorted structure are properly aligned, and
hence it will further reduce the probability of unaligned LDS access.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103261

show more ...


Revision tags: llvmorg-12.0.1-rc1
# 8de4db69 19-May-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Lower kernel LDS into a sorted structure

Differential Revision: https://reviews.llvm.org/D102954


# 49028858 20-May-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Request module used variables from LDS lowering as internal

I do not see any practical difference but technically
used.* variables are internal and a call to getGlobalVariable
misses true a

[AMDGPU] Request module used variables from LDS lowering as internal

I do not see any practical difference but technically
used.* variables are internal and a call to getGlobalVariable
misses true as a second argument. NFC as far as I can tell.

Differential Revision: https://reviews.llvm.org/D102884

show more ...


# 748db5bf 20-May-2021 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fix module LDS selection

Accesses to global module LDS variable start from null,
but kernel also thinks its variables start address is
null. Fixed by not using a null as an address.

Differ

[AMDGPU] Fix module LDS selection

Accesses to global module LDS variable start from null,
but kernel also thinks its variables start address is
null. Fixed by not using a null as an address.

Differential Revision: https://reviews.llvm.org/D102882

show more ...


# 82787eb2 15-Apr-2021 hsmahesha <mahesha.comp@gmail.com>

[AMDGPU] Move LDS lowering related utility functions to a separate utils file.

Move some utility functions which are used within LDS lowering pass to a separate utils
file so that other LDS related

[AMDGPU] Move LDS lowering related utility functions to a separate utils file.

Move some utility functions which are used within LDS lowering pass to a separate utils
file so that other LDS related passes can make use of them when required.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D100526

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 13e49dce 15-Mar-2021 Jon Chesterfield <jonathanchesterfield@gmail.com>

[amdgpu] Implement lower function LDS pass

[amdgpu] Implement lower function LDS pass

Local variables are allocated at kernel launch. This pass collects global
variables that are used from non-kern

[amdgpu] Implement lower function LDS pass

[amdgpu] Implement lower function LDS pass

Local variables are allocated at kernel launch. This pass collects global
variables that are used from non-kernel functions, moves them into a new struct
type, and allocates an instance of that type in every kernel. Uses are then
replaced with a constantexpr offset.

Prior to this pass, accesses from a function are compiled to trap. With this
pass, most such accesses are removed before reaching codegen. The trap logic
is left unchanged by this pass. It is still reachable for the cases this pass
misses, notably the extern shared construct from hip and variables marked
constant which survive the optimizer.

This is of interest to the openmp project because the deviceRTL runtime library
uses cuda shared variables from functions that cannot be inlined. Trunk llvm
therefore cannot compile some openmp kernels for amdgpu. In addition to the
unit tests attached, this patch applied to ROCm llvm with fixed-abi enabled
and the function pointer hashing scheme deleted passes the openmp suite.

This lowering will use more LDS than strictly necessary. It is intended to be
a functionally correct fallback for cases that are difficult to target from
future optimisation passes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D94648

show more ...


1234