Revision tags: llvmorg-21-init |
|
#
07ed8187 |
| 25-Jan-2025 |
Alex MacLean <amaclean@nvidia.com> |
[OpenMP] Replace nvvm.annotation usage with kernel calling conventions (#122320)
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling convention is more idiomatic and more compile-time performant than using the `nvvm.annotation !"kernel"` metadata.
Transition OMPIRBuilder to use calling conventions for PTX kernels and no longer emit `nvvm.annotation`. Update OpenMPOpt to work with kernels specified via calling convention as well as metadata. Update OpenMP tests to use the calling conventions.
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3 |
|
#
907c7eb3 |
| 16-Aug-2024 |
Shilei Tian <i@tianshilei.me> |
[Attributor] Enable `AAAddressSpace` in `OpenMPOpt` (#104363)
This reverts commit e592c2dcf5b7d2da6c2564f5d9990aa34079bad4.
We can finally reland the PR since the issue that caused the PR to be
reverted has been resolved in
https://github.com/llvm/llvm-project/pull/104051.
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6 |
|
#
cd3a4c31 |
| 03-May-2024 |
Johannes Doerfert <johannes@jdoerfert.de> |
[Attributor][NFC] update tests (#91011)
|
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
3de645ef |
| 03-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Split the reduction buffer size into two components
Previously, we tracked the size of the teams reduction buffer in order to allocate it at runtime per kernel launch. This patch splits the number into two parts: the size of the reduction data (i.e., all reduction variables) and the (maximal) length of the buffer. This will allow us to allocate less if we need less, e.g., if we have fewer teams than the maximal length. It also allows us to move code from Clang's codegen into the runtime, as we now know how large the reduction data is.
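The effect of the split can be illustrated with a minimal sketch. It is not the runtime's actual code: the function and parameter names are hypothetical, and it assumes the buffer holds one copy of the reduction data per team, capped at the maximal buffer length.

```python
def teams_reduction_buffer_bytes(reduction_data_size: int,
                                 max_buffer_length: int,
                                 num_teams: int) -> int:
    """Bytes to allocate for one kernel launch (illustrative only).

    reduction_data_size: size in bytes of all reduction variables (hypothetical name).
    max_buffer_length:   maximal number of slots in the buffer (hypothetical name).
    num_teams:           number of teams actually launched.
    """
    # Assumption: one slot per team, never more than the maximal length.
    slots = min(num_teams, max_buffer_length)
    return reduction_data_size * slots


# With 8 teams, only 8 slots are needed even though the cap is 1024.
small_launch = teams_reduction_buffer_bytes(24, 1024, 8)
# With more teams than the cap, the allocation is bounded by the cap.
large_launch = teams_reduction_buffer_bytes(24, 1024, 4096)
```

Tracking only the product of the two numbers would force the worst-case allocation on every launch; keeping them separate is what makes the smaller allocation possible.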
|
#
b8cbc5c0 |
| 01-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the runtime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More use cases will
follow, e.g., per launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
|
Revision tags: llvmorg-17.0.4 |
|
#
f390a76b |
| 27-Oct-2023 |
Mehdi Amini <joker.eph@gmail.com> |
Revert "Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)""
This reverts commit ddbaa11e9f43a38d50d62a9b9b07c3653b6bf8ab.
Reapply the original commit, the broken test was repaired in 5e51363f38d083ab326736c0d4d1b5f9fe0de080 in the meantime.
|
#
ddbaa11e |
| 27-Oct-2023 |
Mehdi Amini <joker.eph@gmail.com> |
Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)"
This reverts commit c2a1249a8257ed033a98e32e425539c6da6700ec.
The MLIR bots are broken with an omp test failure.
|
#
c2a1249a |
| 26-Oct-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)
The runtime needs to know about the acceptable launch bounds, especially
if the compiler (middle- or backend) assumed those bounds. While this
patch does not yet inform the runtime, it stores the bounds in a place
that can/will be accessed and is associated with the kernel.
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0 |
|
#
499f691b |
| 08-Sep-2023 |
Shilei Tian <i@tianshilei.me> |
Revert "Reapply "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)"""
This reverts commit c5525a6e8fb7f7c2ce7126ac5b17aaff01ac407f. AMD BB is not happy again.
|
#
c5525a6e |
| 08-Sep-2023 |
Shilei Tian <i@tianshilei.me> |
Reapply "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)""
This reverts commit e592c2dcf5b7d2da6c2564f5d9990aa34079bad4 that reverts e91e3cf.
|
#
e592c2dc |
| 07-Sep-2023 |
Shilei Tian <i@tianshilei.me> |
Revert "[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)"
This reverts commit e91e3cf0748a80e1d7219c13fa6a7622321f4936 because AMD BB is not happy with it.
|
#
e91e3cf0 |
| 07-Sep-2023 |
Shilei Tian <i@tianshilei.me> |
[Attributor] Enable AAAddressSpace for OpenMPOpt (#65544)
|
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
10068cd6 |
| 26-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depends on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
|
Revision tags: llvmorg-18-init |
|
#
05b181d8 |
| 24-Jul-2023 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Make the nested parallelism global hidden
Summary: These will probably be removed with the kernel environment, but they should have hidden visibility so they can be optimized out.
|
#
6bd74fd6 |
| 24-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
Revert commits for kernel environment
This reverts the commits for the kernel environment as they cause issues on the AMD BB.
|
#
c5c80403 |
| 23-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depends on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
|
Revision tags: llvmorg-16.0.6 |
|
#
8f4fadd1 |
| 03-Jun-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Use "kernel" attribute consistently
|
Revision tags: llvmorg-16.0.5, llvmorg-16.0.4 |
|
#
787d6bb5 |
| 15-May-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[Attributor][OpenMP-Opt][NFC] Run the update test checks script
|
Revision tags: llvmorg-16.0.3 |
|
#
d4ecd124 |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.
It makes a couple of buildbots unhappy because of the following test failures:
- `Transforms/OpenMP/add_attributes.ll`
- `mapping/declare_mapper_target_data.cpp` on AMDGPU
|
#
35cfadfb |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
|
Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
13b909ef |
| 09-Jan-2023 |
Rafael A Herrera Guaitero <randres2011@gmail.com> |
OpenMPOpt: Check nested parallelism in target region
Analysis that determines if a parallel region can reach another parallel region in any target region of the TU. A new global var is emitted with the name of the kernel + "_nested_parallelism", which is either 0 or 1 depending on the result.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D141010
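The idea behind this analysis can be sketched as a reachability check on a call graph. This is a toy model only, not the OpenMPOpt implementation: the dictionary-based call graph, the function name, and the region names are all hypothetical, and the real pass works on LLVM IR.

```python
from collections import deque


def may_have_nested_parallelism(call_graph, parallel_regions):
    """Return 1 if any parallel region can reach a parallel region, else 0.

    call_graph:       dict mapping a function name to the names it may call
                      (hypothetical stand-in for the IR call graph).
    parallel_regions: set of names that are outlined parallel region bodies.
    """
    for region in parallel_regions:
        seen = set()
        # Start from the callees of the region body, not the region itself.
        worklist = deque(call_graph.get(region, ()))
        while worklist:
            fn = worklist.popleft()
            if fn in seen:
                continue
            seen.add(fn)
            if fn in parallel_regions:
                return 1  # a parallel region is reachable from inside another
            worklist.extend(call_graph.get(fn, ()))
    return 0


# par_a calls helper, which starts par_b: nested parallelism is possible.
nested = {"kernel": ["par_a"], "par_a": ["helper"],
          "helper": ["par_b"], "par_b": []}
# Both regions are started directly by the kernel: no nesting.
flat = {"kernel": ["par_a", "par_b"], "par_a": ["helper"],
        "helper": [], "par_b": []}
```

The 0/1 result mirrors the value the commit stores in the global named after the kernel with the `_nested_parallelism` suffix.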
|