|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
| #
c44fa3e8 |
| 22-May-2024 |
Sirraide <aeternalmail@gmail.com> |
[Clang] Refactor `__attribute__((assume))` (#84934)
This is a followup to #81014 and #84582: Before this patch, Clang
would accept `__attribute__((assume))` and `[[clang::assume]]` as
nonstandar
[Clang] Refactor `__attribute__((assume))` (#84934)
This is a followup to #81014 and #84582: Before this patch, Clang
would accept `__attribute__((assume))` and `[[clang::assume]]` as
nonstandard spellings for the `[[omp::assume]]` attribute; this
resulted in a potentially very confusing name clash with C++23’s
`[[assume]]` attribute (and GCC’s `assume` attribute with the same
semantics).
This pr replaces every usage of `__attribute__((assume))` with
`[[omp::assume]]` and makes `__attribute__((assume))` and
`[[clang::assume]]` alternative spellings for C++23’s `[[assume]]`;
this shouldn’t cause any problems due to differences in appertainment
and because almost no-one was using this variant spelling to begin
with (a use in libclc has already been changed to use a different
attribute).
show more ...
|
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5 |
|
| #
b8cbc5c0 |
| 01-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
show more ...
|
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
| #
10068cd6 |
| 26-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depend on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
show more ...
|
|
Revision tags: llvmorg-18-init |
|
| #
6bd74fd6 |
| 24-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
Revert commits for kernel environment
This reverts commits for kernel environments as they causes issues in AMD BB.
|
| #
c5c80403 |
| 23-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depend on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
show more ...
|
| #
63ca93c7 |
| 06-Jul-2023 |
Sergio Afonso <safonsof@amd.com> |
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whe
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes `IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to `-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed to `omp.is_target_device`. Getters and setters of all these renamed properties are also updated accordingly. Many unit tests have been updated to use the new names, but an alias for the `-fopenmp-is-device` option is created so that external programs do not stop working after the name change.
`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the `-fopenmp-is-target-device` compiler frontend option, which is only added to the OpenMP device invocation for offloading-enabled programs.
Differential Revision: https://reviews.llvm.org/D154591
show more ...
|
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
| #
d4ecd124 |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.
It makes a couple of buildbots unhappy because of the following test failures: - `Transf
Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.
It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll'` - `mapping/declare_mapper_target_data.cpp` on AMDGPU
show more ...
|
| #
35cfadfb |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`
[OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
show more ...
|
|
Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
90609fb6 |
| 13-Dec-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFCI] Remove effectively dead code in clang and the runtime
Differential Revision: https://reviews.llvm.org/D136903
|
| #
f9c29878 |
| 13-Dec-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime"
This reverts commit c1c8cbbf5f29257d084a23a2f6c4236c40b7afb9. One of the tests seems to be flaky/non-deterministic.
|
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4 |
|
| #
c1c8cbbf |
| 27-Oct-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFCI] Remove effectively dead code in clang and the runtime
|
| #
126db467 |
| 26-Oct-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][FIX] Adjust to clang tests after D136740
|
|
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
| #
839ac62c |
| 15-Sep-2022 |
Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com> |
Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 7539e9cf811e590d9f12ae39673ca789e26386b4.
|
| #
7539e9cf |
| 15-Sep-2022 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument lis
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6, ABataev
Differential Revision: https://reviews.llvm.org/D102107
show more ...
|
|
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
61d418f9 |
| 11-Apr-2022 |
Arthur Eubanks <aeubanks@google.com> |
[test] Remove references to -fexperimental-new-pass-manager in tests
This has been the default for a while and we're in the process of removing the legacy PM optimization pipeline.
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
ac90dfc4 |
| 21-Sep-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 1d66649adf28d48ae1731516d87fb899426e3349.
Revert to fix AMG GPU issue.
|
| #
1d66649a |
| 21-Sep-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument lis
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D102107
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
fb0cf017 |
| 19-Jul-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit e9c7291cb25f071f1a1dfa4049ed9f5a8a217b3e.
Fix failing tests
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
e9c7291c |
| 15-Jun-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument lis
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
show more ...
|
| #
2c31d5eb |
| 13-Jul-2021 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Add IDs to OpenMP remarks
This patch adds unique idenfitiers to the existing OpenMP remarks. This makes it easier to identify the corresponding documentation for each remark that will be ho
[OpenMP] Add IDs to OpenMP remarks
This patch adds unique idenfitiers to the existing OpenMP remarks. This makes it easier to identify the corresponding documentation for each remark that will be hosted in the OpenMP webpage.
Depends on D105898
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105939
show more ...
|
| #
eef6601b |
| 13-Jul-2021 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Rework OpenMP remarks
This patch rewrites and reworks a few of the existing remarks to make the mmore concise and consistent prior to writing the documentation for them.
Reviewed By: jdoer
[OpenMP] Rework OpenMP remarks
This patch rewrites and reworks a few of the existing remarks to make the mmore concise and consistent prior to writing the documentation for them.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105898
show more ...
|
| #
514c033d |
| 23-Jun-2021 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks if it can be executed in SPMD-mode. If so, we flip the argumen
[OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks if it can be executed in SPMD-mode. If so, we flip the arguments of the __kmpc_target_init and deinit call to enable the mode. We also update the `<kernel>_exec_mode` flag to indicate to the runtime we changed the mode to SPMD.
The code analysis is done interprocedurally by extending the AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
show more ...
|
| #
a706b94e |
| 10-Jul-2021 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFCI] Re-enable two remarks tests after D101977 landed
|
| #
e2cfbfcc |
| 17-Jun-2021 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
03d7e61c |
| 20-May-2021 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Internalize functions in OpenMPOpt to improve IPO passes
Summary: Currently the attributor needs to give up if a function has external linkage. This means that the optimization introduced i
[OpenMP] Internalize functions in OpenMPOpt to improve IPO passes
Summary: Currently the attributor needs to give up if a function has external linkage. This means that the optimization introduced in D97818 will only apply to static functions. This change uses the Attributor to internalize OpenMP device routines by making a copy of each function with private linkage and replacing the uses in the module with it. This allows for the optimization to be applied to any regular function.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102824
show more ...
|