#
1900b6c7 |
| 20-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add assert for GDS globals
|
#
b5ec1312 |
| 16-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix allocating GDS globals to LDS offsets
These don't seem to be very well used or tested, but try to make the behavior a bit more consistent with LDS globals.
I'm not sure what the definit
AMDGPU: Fix allocating GDS globals to LDS offsets
These don't seem to be very well used or tested, but try to make the behavior a bit more consistent with LDS globals.
I'm not sure what the definition for amdgpu-gds-size is supposed to mean. For now I assumed it's allocating a static size at the beginning of the allocation, and any known globals are allocated after it.
show more ...
|
#
bcbd4cf1 |
| 20-Mar-2022 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
Revert "[amdgpu][nfc] Pass function instead of module to allocateModuleLDSGlobal" Reconsidered, better to handle per-function state in the constructor as before. This reverts commit 98e474c1b3210d90e
Revert "[amdgpu][nfc] Pass function instead of module to allocateModuleLDSGlobal" Reconsidered, better to handle per-function state in the constructor as before. This reverts commit 98e474c1b3210d90e313457bf6a6e39a7edb4d2b.
show more ...
|
#
98e474c1 |
| 19-Mar-2022 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu][nfc] Pass function instead of module to allocateModuleLDSGlobal
|
#
f3a344d2 |
| 07-Jan-2022 |
Kazu Hirata <kazu@google.com> |
[Target] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
|
#
748db5bf |
| 20-May-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fix module LDS selection
Accesses to global module LDS variable start from null, but kernel also thinks its variables start address is null. Fixed by not using a null as an address.
Differ
[AMDGPU] Fix module LDS selection
Accesses to global module LDS variable start from null, but kernel also thinks its variables start address is null. Fixed by not using a null as an address.
Differential Revision: https://reviews.llvm.org/D102882
show more ...
|
#
d6de1e1a |
| 24-Mar-2021 |
Serge Guelton <sguelton@redhat.com> |
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
to
if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
Introduce a getValueAsBool that normalize the check, with the following behavior:
no attributes or attribute set to "false" => return false attribute set to "true" => return true
Differential Revision: https://reviews.llvm.org/D99299
show more ...
|
#
13e49dce |
| 15-Mar-2021 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu] Implement lower function LDS pass
[amdgpu] Implement lower function LDS pass
Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kern
[amdgpu] Implement lower function LDS pass
[amdgpu] Implement lower function LDS pass
Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kernel functions, moves them into a new struct type, and allocates an instance of that type in every kernel. Uses are then replaced with a constantexpr offset.
Prior to this pass, accesses from a function are compiled to trap. With this pass, most such accesses are removed before reaching codegen. The trap logic is left unchanged by this pass. It is still reachable for the cases this pass misses, notably the extern shared construct from hip and variables marked constant which survive the optimizer.
This is of interest to the openmp project because the deviceRTL runtime library uses cuda shared variables from functions that cannot be inlined. Trunk llvm therefore cannot compile some openmp kernels for amdgpu. In addition to the unit tests attached, this patch applied to ROCm llvm with fixed-abi enabled and the function pointer hashing scheme deleted passes the openmp suite.
This lowering will use more LDS than strictly necessary. It is intended to be a functionally correct fallback for cases that are difficult to target from future optimisation passes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D94648
show more ...
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
#
5733167f |
| 09-Dec-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Mark amdgpu_gfx functions as module entry function
- Allows lds allocations - Writes resource usage into COMPUTE_PGM_RSRC1 registers in PAL metadata
Differential Revision: https://reviews.
[AMDGPU] Mark amdgpu_gfx functions as module entry function
- Allows lds allocations - Writes resource usage into COMPUTE_PGM_RSRC1 registers in PAL metadata
Differential Revision: https://reviews.llvm.org/D92946
show more ...
|
#
5257a60e |
| 24-Jun-2020 |
Michael Liao <michael.hliao@gmail.com> |
[amdgpu] Add codegen support for HIP dynamic shared memory.
Summary: - HIP uses an unsized extern array `extern __shared__ T s[]` to declare the dynamic shared memory, which size is not known at t
[amdgpu] Add codegen support for HIP dynamic shared memory.
Summary: - HIP uses an unsized extern array `extern __shared__ T s[]` to declare the dynamic shared memory, which size is not known at the compile time.
Reviewers: arsenm, yaxunl, kpyzhov, b-sumner
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82496
show more ...
|
#
52911428 |
| 29-Jun-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Migrate AMDGPU backend to Align
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851
[Alignment][NFC] Migrate AMDGPU backend to Align
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Differential Revision: https://reviews.llvm.org/D82743
show more ...
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
a2caa3b6 |
| 19-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
Remove GlobalValue::getAlignment().
This function is deceptive at best: it doesn't return what you'd expect. If you have an arbitrary GlobalValue and you want to determine the alignment of that poin
Remove GlobalValue::getAlignment().
This function is deceptive at best: it doesn't return what you'd expect. If you have an arbitrary GlobalValue and you want to determine the alignment of that pointer, Value::getPointerAlignment() returns the correct value. If you want the actual declared alignment of a function or variable, GlobalObject::getAlignment() returns that.
This patch switches all the users of GlobalValue::getAlignment to an appropriate alternative.
Differential Revision: https://reviews.llvm.org/D80368
show more ...
|
#
61813b80 |
| 19-May-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use member initializers in MFI
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
5660bb6b |
| 18-Nov-2019 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove denormal subtarget features
Switch to using the denormal-fp-math/denormal-fp-math-f32 attributes.
|
#
db0ed3e4 |
| 01-Nov-2019 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Refactor treatment of denormal mode
Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the sub
AMDGPU: Refactor treatment of denormal mode
Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the subtarget, and be moved into a separate function attribute.
This patch is still NFC. The denormal mode remains as a subtarget feature for now, but make the necessary changes to switch to using an attribute.
show more ...
|
#
b65fa483 |
| 15-Oct-2019 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment] Migrate Attribute::getWith(Stack)Alignment
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-de
[Alignment] Migrate Attribute::getWith(Stack)Alignment
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, jdoerfert
Reviewed By: courbet
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68792
llvm-svn: 374884
show more ...
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
#
e7e23e3e |
| 05-Jul-2019 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Make AMDGPUPerfHintAnalysis an SCC pass
Add a string attribute instead of directly setting MachineFunctionInfo. This avoids trying to get the analysis in the MachineFunctionInfo in a way tha
AMDGPU: Make AMDGPUPerfHintAnalysis an SCC pass
Add a string attribute instead of directly setting MachineFunctionInfo. This avoids trying to get the analysis in the MachineFunctionInfo in a way that doesn't work with the new pass manager.
This will also avoid re-visiting the call graph for every single function.
llvm-svn: 365241
show more ...
|
Revision tags: llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1 |
|
#
4bec7d42 |
| 20-Jul-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
Reverts r337079 with fix for msan error.
llvm-svn: 337535
|
#
1971ba09 |
| 14-Jul-2018 |
Evgeniy Stepanov <eugeni.stepanov@gmail.com> |
Revert "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
This reverts commit r337021.
WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x1415cd65 in void write_signed<
Revert "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
This reverts commit r337021.
WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7 #1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3 #2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3 #3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18 #4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23 #5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5 #6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const*, llvm::raw_ostream&, char const*) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5 #7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3 #8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26
[...]
Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv' #0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192
llvm-svn: 337079
show more ...
|
#
de950777 |
| 13-Jul-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix handling of alignment padding in DAG argument lowering
This was completely broken if there was ever a struct argument, as this information is thrown away during the argument analysis.
T
AMDGPU: Fix handling of alignment padding in DAG argument lowering
This was completely broken if there was ever a struct argument, as this information is thrown away during the argument analysis.
The offsets as passed in to LowerFormalArguments are not useful, as they partially depend on the legalized result register type, and they don't consider the alignment in the first place.
Ignore the Ins array, and instead figure out from the raw IR type what we need to do. This seems to fix the padding computation if the DAG lowering is forced (and stops breaking arguments following padded arguments if the arguments were only partially lowered in the IR)
llvm-svn: 337021
show more ...
|
#
75e7192b |
| 28-Jun-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove MFI::ABIArgOffset
We have too many mechanisms for tracking the various offsets used for kernel arguments, so remove one. There's still a lot of confusion with these because there are
AMDGPU: Remove MFI::ABIArgOffset
We have too many mechanisms for tracking the various offsets used for kernel arguments, so remove one. There's still a lot of confusion with these because there are two different "implicit" argument areas located at the beginning and end of the kernarg segment.
Additionally, the offset was determined based on the memory size of the split element types. This would break in a future commit where v3i32 is decomposed into separate i32 pieces.
llvm-svn: 335830
show more ...
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
#
1c538423 |
| 25-May-2018 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Add perf hints to functions
This is adoption of HSAIL perfhint pass. Two types of hints are produced:
1. Function is memory bound. 2. Kernel can use wave limiter.
Currently these hints ar
[AMDGPU] Add perf hints to functions
This is adoption of HSAIL perfhint pass. Two types of hints are produced:
1. Function is memory bound. 2. Kernel can use wave limiter.
Currently these hints are used in the scheduler. If a function is suspected to be memory bound we allow occupancy to decrease to 4 waves in the course of scheduling.
Differential Revision: https://reviews.llvm.org/D46992
llvm-svn: 333289
show more ...
|
Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1 |
|
#
f1caa283 |
| 15-Dec-2017 |
Matthias Braun <matze@braunis.de> |
MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.
llvm-svn: 320884
|