Revision tags: llvmorg-21-init |
|
#
07ed8187 |
| 25-Jan-2025 |
Alex MacLean <amaclean@nvidia.com> |
[OpenMP] Replace nvvm.annotation usage with kernel calling conventions (#122320)
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling convention is a more idiomatic and compile-time
[OpenMP] Replace nvvm.annotation usage with kernel calling conventions (#122320)
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling convention is a more idiomatic and compile-time performant than using the `nvvm.annoation !"kernel"` metadata.
Transition OMPIRBuilder to use calling conventions for PTX kernels and no longer emit `nvvm.annoation`. Update OpenMPOpt to work with kernels specified via calling convention as well as metadata. Update OpenMP tests to use the calling conventions.
show more ...
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
84bf0da3 |
| 05-Sep-2024 |
Johannes Doerfert <johannes@jdoerfert.de> |
[Attributor][FIX] Ensure to always translate call site arguments (#107323)
When we propagate call site arguments we always need to translate them,
this is important as we ended up picking the funct
[Attributor][FIX] Ensure to always translate call site arguments (#107323)
When we propagate call site arguments we always need to translate them,
this is important as we ended up picking the function argument for a
recurisve call not the call site argument. `@recBad` and `@recGood` in
`returned.ll` show the problem as they used to transform them the same
way. The restructuring cleans the code up and helps derive more
"returned" arguments and better information in the presence of recursive
calls. The "dropped" attributes are simply dropped because we do not
query them anymore, not because we cannot derive them.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6 |
|
#
cd3a4c31 |
| 03-May-2024 |
Johannes Doerfert <johannes@jdoerfert.de> |
[Attributor][NFC] update tests (#91011)
|
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
3de645ef |
| 03-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Split the reduction buffer size into two components
Before we tracked the size of the teams reduction buffer in order to allocate it at runtime per kernel launch. This patch splits the
[OpenMP][NFC] Split the reduction buffer size into two components
Before we tracked the size of the teams reduction buffer in order to allocate it at runtime per kernel launch. This patch splits the number into two parts, the size of the reduction data (=all reduction variables) and the (maximal) length of the buffer. This will allow us to allocate less if we need less, e.g., if we have less teams than the maximal length. It also allows us to move code from clangs codegen into the runtime as we now know how large the reduction data is.
show more ...
|
#
b8cbc5c0 |
| 01-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
show more ...
|
Revision tags: llvmorg-17.0.4 |
|
#
f390a76b |
| 27-Oct-2023 |
Mehdi Amini <joker.eph@gmail.com> |
Revert "Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)""
This reverts commit ddbaa11e9f43a38d50d62a9b9b07c3653b6bf8ab.
Reapply the original commit, the br
Revert "Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)""
This reverts commit ddbaa11e9f43a38d50d62a9b9b07c3653b6bf8ab.
Reapply the original commit, the broken test was repaired in 5e51363f38d083ab326736c0d4d1b5f9fe0de080 in the meantime.
show more ...
|
#
ddbaa11e |
| 27-Oct-2023 |
Mehdi Amini <joker.eph@gmail.com> |
Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)"
This reverts commit c2a1249a8257ed033a98e32e425539c6da6700ec.
The MLIR bots are broken with an omp test fa
Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)"
This reverts commit c2a1249a8257ed033a98e32e425539c6da6700ec.
The MLIR bots are broken with an omp test failure.
show more ...
|
#
c2a1249a |
| 26-Oct-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)
The runtime needs to know about the acceptable launch bounds, especially
if the compiler (middle- or backend) assum
[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)
The runtime needs to know about the acceptable launch bounds, especially
if the compiler (middle- or backend) assumed those bounds. While this
patch does not yet inform the runtime, it stores the bounds in a place
that can/will be accessed and is associated with the kernel.
show more ...
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
10068cd6 |
| 26-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depend on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
show more ...
|
Revision tags: llvmorg-18-init |
|
#
05b181d8 |
| 24-Jul-2023 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Make the nested parallelism global hidden
Summary: These will probably be removed with the kernel environment, but they should have hidden visibliity so they can be optimized out.
|
#
6bd74fd6 |
| 24-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
Revert commits for kernel environment
This reverts commits for kernel environments as they causes issues in AMD BB.
|
#
c5c80403 |
| 23-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depend on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
show more ...
|
#
77dbd1d7 |
| 03-Jul-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[Attributor][NFCI] Manifest assumption attributes explicitly
We had some custom manifest for assumption attributes but we use the generic manifest logic. If we later decide to curb duplication (of a
[Attributor][NFCI] Manifest assumption attributes explicitly
We had some custom manifest for assumption attributes but we use the generic manifest logic. If we later decide to curb duplication (of attributes on the call site and callee), we can do that at a single location and for all attributes.
The test changes basically add known `llvm.assume` callee information to the call sites.
show more ...
|
#
b672c602 |
| 30-Jun-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[Attributor][NFCI] Merge MemoryEffects explicitly
We had some custom handling for existing MemoryEffects but we now move it to the place we check other existing attributes before we manifest new one
[Attributor][NFCI] Merge MemoryEffects explicitly
We had some custom handling for existing MemoryEffects but we now move it to the place we check other existing attributes before we manifest new ones. If we later decide to curb duplication (of attributes on the call site and callee), we can do that at a single location and for all attributes.
The test changes basically add known `memory` callee information to the call sites.
show more ...
|
#
339a1f3c |
| 24-Jun-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[Attributor] Avoid more AAs through IR implication
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
d4ecd124 |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.
It makes a couple of buildbots unhappy because of the following test failures: - `Transf
Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.
It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll'` - `mapping/declare_mapper_target_data.cpp` on AMDGPU
show more ...
|
#
35cfadfb |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
show more ...
|
Revision tags: llvmorg-16.0.2 |
|
#
46ee1021 |
| 06-Apr-2023 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Replace HeapToShared's initial value with `poison`
There's a desire to move away from `undef` in LLVM. Currently we want to have the `addressspace(3)` variables use `poison` instead.
Revie
[OpenMP] Replace HeapToShared's initial value with `poison`
There's a desire to move away from `undef` in LLVM. Currently we want to have the `addressspace(3)` variables use `poison` instead.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D147719
show more ...
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
43c1c59f |
| 14-Dec-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Merge barrier elimination into AAExecutionDomain
With this patch we track aligned barriers in AAExecutionDomain and also delete unnecessary barriers there. This allows us to eliminate barri
[OpenMP] Merge barrier elimination into AAExecutionDomain
With this patch we track aligned barriers in AAExecutionDomain and also delete unnecessary barriers there. This allows us to eliminate barriers across blocks, across functions, and in the presence of complex accesses that do not force a barrier. Further, we can use the collected information to enable store-load forwarding in a threaded environment (follow up patch).
Differential Revision: https://reviews.llvm.org/D140463
show more ...
|
#
aa8e9fac |
| 03-Jan-2023 |
Nikita Popov <npopov@redhat.com> |
[OpenMP] Convert some tests to opaque pointers (NFC)
|
#
90609fb6 |
| 13-Dec-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFCI] Remove effectively dead code in clang and the runtime
Differential Revision: https://reviews.llvm.org/D136903
|
#
f9c29878 |
| 13-Dec-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime"
This reverts commit c1c8cbbf5f29257d084a23a2f6c4236c40b7afb9. One of the tests seems to be flaky/non-deterministic.
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4 |
|
#
c1c8cbbf |
| 27-Oct-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFCI] Remove effectively dead code in clang and the runtime
|
#
304f1d59 |
| 02-Nov-2022 |
Nikita Popov <npopov@redhat.com> |
[IR] Switch everything to use memory attribute
This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemo
[IR] Switch everything to use memory attribute
This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only.
The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions.
High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead.
Differential Revision: https://reviews.llvm.org/D135780
show more ...
|
#
41a278f5 |
| 26-Oct-2022 |
Johannes Rudolf Doerfert <doerfert@lassen735.coral.llnl.gov> |
[OpenMP][FIX] Do not add custom state machine eagerly in LTO runs
If we run LTO optimization we migth end up introducing a custom state machine and later transforming the region into SPMD. This is a
[OpenMP][FIX] Do not add custom state machine eagerly in LTO runs
If we run LTO optimization we migth end up introducing a custom state machine and later transforming the region into SPMD. This is a problem. While a follow up will introduce a check for the SPMD conversion, this already prevents the eager custom state machine generation. Only if the kernel init function is defined, rather then declared, we will emit a custom state machine. SPMD-zation can happen eagerly though. Tests are adjusted via a weak definition. The LTO test was added to verify this works as expected.
Differential Revision: https://reviews.llvm.org/D136740
show more ...
|