|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4 |
|
| #
1f520600 |
| 02-Sep-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use poison instead of undef in module lds pass
|
|
Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
| #
d3316bc1 |
| 13-Jul-2023 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu] Delete elide-module-lds attribute
Requires D155190
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D155238
|
| #
74e928a0 |
| 13-Jul-2023 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu][lds] Remove recalculation of LDS frame from backend
Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.
Prior to this patch: The IR lowering pa
[amdgpu][lds] Remove recalculation of LDS frame from backend
Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.
Prior to this patch: The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol metadata so that the assembler can build lookup tables out of it. There is a fragile association between kernel functions and named structs which is used to recompute the frame layout in the backend, with fatal_errors catching inconsistencies in the second calculation.
After this patch: The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same absolute_symbol metadata that the assembler uses to place objects within that frame size.
Deleted the now dead allocation code from the backend. Left for a later cleanup: - enabling lowering for anonymous functions - removing the elide-module-lds attribute (test churn, it's not used by llc any more) - adjusting the dynamic alignment check to not use symbol names
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D155190
show more ...
|
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1 |
|
| #
5a4a8eb2 |
| 25-Jan-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Convert some tests to opaque pointers
|
|
Revision tags: llvmorg-17-init, llvmorg-15.0.7 |
|
| #
9ed2f14c |
| 14-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[AsmParser] Remove typed pointer auto-detection
IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymo
[AsmParser] Remove typed pointer auto-detection
IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymore.
The -opaque-pointers=0 option is added to any remaining IR tests that haven't been migrated yet.
Differential Revision: https://reviews.llvm.org/D141912
show more ...
|
| #
d77ae7f2 |
| 07-Dec-2022 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu] Reimplement LDS lowering
Renames the current lowering scheme to "module" and introduces two new ones, "kernel" and "table", plus a "hybrid" that chooses between those three on a per-variabl
[amdgpu] Reimplement LDS lowering
Renames the current lowering scheme to "module" and introduces two new ones, "kernel" and "table", plus a "hybrid" that chooses between those three on a per-variable basis.
Unit tests are set up to pass with the default lowering of "module" or "hybrid" with this patch defaulting to "module", which will be a less dramatic codegen change relative to the current. This reflects the sparsity of test coverage for the table lowering method. Hybrid is better than module in every respect and will be default in a subsequent patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139433
show more ...
|
| #
a862d09a |
| 06-Dec-2022 |
Nico Weber <thakis@chromium.org> |
Revert "[amdgpu] Reimplement LDS lowering"
This reverts commit 982017240d7f25a8a6969b8b73dc51f9ac5b93ed. Breaks check-llvm, see https://reviews.llvm.org/D139433#3974862
|
| #
98201724 |
| 06-Dec-2022 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu] Reimplement LDS lowering
Renames the current lowering scheme to "module" and introduces two new ones, "kernel" and "table", plus a "hybrid" that chooses between those three on a per-variabl
[amdgpu] Reimplement LDS lowering
Renames the current lowering scheme to "module" and introduces two new ones, "kernel" and "table", plus a "hybrid" that chooses between those three on a per-variable basis.
Unit tests are set up to pass with the default lowering of "module" or "hybrid" with this patch defaulting to "module", which will be a less dramatic codegen change relative to the current. This reflects the sparsity of test coverage for the table lowering method. Hybrid is better than module in every respect and will be default in a subsequent patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139433
show more ...
|
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5 |
|
| #
e9a2aa63 |
| 09-Nov-2022 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu][lds] Use a consistent order of fields in generated structs
Avoids spurious and confusing test failures on changing implementation.
Reviewed By: arsenm
Differential Revision: https://revie
[amdgpu][lds] Use a consistent order of fields in generated structs
Avoids spurious and confusing test failures on changing implementation.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D136598
show more ...
|
|
Revision tags: llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
| #
cdb97389 |
| 14-Sep-2022 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu] Expand all ConstantExpr users of LDS variables in instructions
Bug noted in D112717 can be sidestepped with this change.
Expanding all ConstantExpr involved with LDS up front makes the var
[amdgpu] Expand all ConstantExpr users of LDS variables in instructions
Bug noted in D112717 can be sidestepped with this change.
Expanding all ConstantExpr involved with LDS up front makes the variable specialisation simpler. Excludes ConstantExpr that don't access LDS to avoid disturbing codegen elsewhere.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D133422
show more ...
|
|
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
bc78c099 |
| 04-May-2022 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu] Elide module lds allocation in kernels with no callees
Introduces a string attribute, amdgpu-requires-module-lds, to allow eliding the module.lds block from kernels. Will allocate the block
[amdgpu] Elide module lds allocation in kernels with no callees
Introduces a string attribute, amdgpu-requires-module-lds, to allow eliding the module.lds block from kernels. Will allocate the block as before if the attribute is missing or has its default value of true.
Patch uses the new attribute to detect the simplest possible instance of this, where a kernel makes no calls and thus cannot call any functions that use LDS.
Tests updated to match, coverage was already good. Interesting cases is in lower-module-lds-offsets where annotating the kernel allows the backend to pick a different (in this case better) variable ordering than previously. A later patch will avoid moving kernel variables into module.lds when the kernel can have this attribute, allowing optimal ordering and locally unused variable elimination.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122091
show more ...
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
d797a7f8 |
| 15-Jun-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Use performOptimizedStructLayout for LDS sort
This gives better packing.
Differential Revision: https://reviews.llvm.org/D104331
|
| #
05289dfb |
| 07-Jun-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Handle constant LDS uses from different kernels
This allows to lower an LDS variable into a kernel structure even if there is a constant expression used from different kernels.
Differentia
[AMDGPU] Handle constant LDS uses from different kernels
This allows to lower an LDS variable into a kernel structure even if there is a constant expression used from different kernels.
Differential Revision: https://reviews.llvm.org/D103655
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
8de4db69 |
| 19-May-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Lower kernel LDS into a sorted structure
Differential Revision: https://reviews.llvm.org/D102954
|
| #
748db5bf |
| 20-May-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fix module LDS selection
Accesses to global module LDS variable start from null, but kernel also thinks its variables start address is null. Fixed by not using a null as an address.
Differ
[AMDGPU] Fix module LDS selection
Accesses to global module LDS variable start from null, but kernel also thinks its variables start address is null. Fixed by not using a null as an address.
Differential Revision: https://reviews.llvm.org/D102882
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
13e49dce |
| 15-Mar-2021 |
Jon Chesterfield <jonathanchesterfield@gmail.com> |
[amdgpu] Implement lower function LDS pass
[amdgpu] Implement lower function LDS pass
Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kern
[amdgpu] Implement lower function LDS pass
[amdgpu] Implement lower function LDS pass
Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kernel functions, moves them into a new struct type, and allocates an instance of that type in every kernel. Uses are then replaced with a constantexpr offset.
Prior to this pass, accesses from a function are compiled to trap. With this pass, most such accesses are removed before reaching codegen. The trap logic is left unchanged by this pass. It is still reachable for the cases this pass misses, notably the extern shared construct from hip and variables marked constant which survive the optimizer.
This is of interest to the openmp project because the deviceRTL runtime library uses cuda shared variables from functions that cannot be inlined. Trunk llvm therefore cannot compile some openmp kernels for amdgpu. In addition to the unit tests attached, this patch applied to ROCm llvm with fixed-abi enabled and the function pointer hashing scheme deleted passes the openmp suite.
This lowering will use more LDS than strictly necessary. It is intended to be a functionally correct fallback for cases that are difficult to target from future optimisation passes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D94648
show more ...
|