Revision tags: llvmorg-21-init |
|
#
07ed8187 |
| 25-Jan-2025 |
Alex MacLean <amaclean@nvidia.com> |
[OpenMP] Replace nvvm.annotation usage with kernel calling conventions (#122320)
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling convention is a more idiomatic and compile-time
[OpenMP] Replace nvvm.annotation usage with kernel calling conventions (#122320)
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling convention is a more idiomatic and compile-time performant than using the `nvvm.annoation !"kernel"` metadata.
Transition OMPIRBuilder to use calling conventions for PTX kernels and no longer emit `nvvm.annoation`. Update OpenMPOpt to work with kernels specified via calling convention as well as metadata. Update OpenMP tests to use the calling conventions.
show more ...
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
b8cbc5c0 |
| 01-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
show more ...
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
63ca93c7 |
| 06-Jul-2023 |
Sergio Afonso <safonsof@amd.com> |
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whe
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes `IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to `-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed to `omp.is_target_device`. Getters and setters of all these renamed properties are also updated accordingly. Many unit tests have been updated to use the new names, but an alias for the `-fopenmp-is-device` option is created so that external programs do not stop working after the name change.
`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the `-fopenmp-is-target-device` compiler frontend option, which is only added to the OpenMP device invocation for offloading-enabled programs.
Differential Revision: https://reviews.llvm.org/D154591
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4 |
|
#
8c1449a8 |
| 18-Oct-2022 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Make kernels have protected visibility
This patch changes the kernels generated by OpenMP to have protected visibility. This is unlikely to change anything functionally. However, protected
[OpenMP] Make kernels have protected visibility
This patch changes the kernels generated by OpenMP to have protected visibility. This is unlikely to change anything functionally. However, protected visibility better matches the behaviour of these GPU kernels. We do not expect any pending shared library load to preempt these kernels so we can specify a more restrictive visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136198
show more ...
|
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
b9f67d44 |
| 24-Mar-2022 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Replace device kernel linkage with weak_odr
Currently the device kernels all have weak linkage to prevent linkage errors on multiple defintions. However, this prevents some optimizations fr
[OpenMP] Replace device kernel linkage with weak_odr
Currently the device kernels all have weak linkage to prevent linkage errors on multiple defintions. However, this prevents some optimizations from adequately analyzing them because of the nature of weak linkage. This patch replaces the weak linkage with weak_odr linkage so we can statically assert that multiple declarations of the same kernel will have the same definition.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122443
show more ...
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
#
1b1c8d83 |
| 16-Jan-2022 |
hyeongyu kim <gusrb406@snu.ac.kr> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
show more ...
|
Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
4b5c3e59 |
| 08-Oct-2021 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Remove doing assumption propagation in the front end.
This patch removes the assumption propagation that was added in D110655 primarily to get assumption informatino on opaque call sites fo
[OpenMP] Remove doing assumption propagation in the front end.
This patch removes the assumption propagation that was added in D110655 primarily to get assumption informatino on opaque call sites for optimizations. The analysis done in D111445 allows us to do this more intelligently in the back-end.
Depends on D111445
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111463
show more ...
|
#
fd9b0999 |
| 08-Nov-2021 |
hyeongyu kim <gusrb406@snu.ac.kr> |
Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit aacfbb953eb705af2ecfeb95a6262818fa85dd92.
Revert "Fix lit test failu
Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit aacfbb953eb705af2ecfeb95a6262818fa85dd92.
Revert "Fix lit test failures in CodeGenCoroutines"
This reverts commit 63fff0f5bffe20fa2c84a45a41161afa0043cb34.
show more ...
|
#
aacfbb95 |
| 15-Oct-2021 |
hyeongyukim <gusrb406@snu.ac.kr> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169. Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
Resolve lit failures in clang after 8ca4b3e's land
Fix lit test failures in clang-ppc* and clang-x64-windows-msvc
Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc
Fix internal_clone(aarch64) inline assembly
show more ...
|
#
89ad2822 |
| 06-Nov-2021 |
Juneyoung Lee <aqjune@gmail.com> |
Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit 7584ef766a7219b6ee5a400637206d26e0fa98ac.
|
#
7584ef76 |
| 06-Nov-2021 |
Juneyoung Lee <aqjune@gmail.com> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
show more ...
|
#
f193bcc7 |
| 18-Oct-2021 |
Juneyoung Lee <aqjune@gmail.com> |
Revert D105169 due to the two-stage failure in ASAN
This reverts the following commits: 37ca7a795b277c20c02a218bf44052278c03344b 9aa6c72b92b6c89cc6d23b693257df9af7de2d15 705387c5074bcca36d626882462e
Revert D105169 due to the two-stage failure in ASAN
This reverts the following commits: 37ca7a795b277c20c02a218bf44052278c03344b 9aa6c72b92b6c89cc6d23b693257df9af7de2d15 705387c5074bcca36d626882462ebbc2bcc3bed4 8ca4b3ef19fe82d7ad6a6e1515317dcc01b41515 80dba72a669b5416e97a42fd2c2a7bc5a6d3f44a
show more ...
|
#
8ca4b3ef |
| 15-Oct-2021 |
Juneyoung Lee <aqjune@gmail.com> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169. Autogenerated test codes are changed by `utils/up
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169. Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
show more ...
|
#
d12502a3 |
| 28-Sep-2021 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Apply OpenMP assumptions to applicable call sites
This patch adds OpenMP assumption attributes to call sites in applicable regions. Currently this applies the caller's assumption attributes
[OpenMP] Apply OpenMP assumptions to applicable call sites
This patch adds OpenMP assumption attributes to call sites in applicable regions. Currently this applies the caller's assumption attributes to any calls contained within it. So, if a call occurs inside an OpenMP assumes region to a function outside that region, we will assume that call respects the assumptions. This is primarily useful for inline assembly calls used heavily in the OpenMP GPU device runtime, which allows us to then make judgements about what the ASM will do.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110655
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
#
45e8e084 |
| 13-Jul-2021 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Encode `omp [...] assume[...]` assumptions with `omp[x]` prefix
Since these assumptions are coming from OpenMP it makes sense to mark them as such in the generic IR encoding. Standardized a
[OpenMP] Encode `omp [...] assume[...]` assumptions with `omp[x]` prefix
Since these assumptions are coming from OpenMP it makes sense to mark them as such in the generic IR encoding. Standardized assumptions will be named omp_ASSUMPTION_NAME and extensions will be named ompx_ASSUMPTION_NAME which is the OpenMP 5.2 syntax for "extensions" of any kind.
This also matches what the OpenMP-Opt pass expects.
Summarized, #pragma omp [...] assume[s] no_parallelism now generates the same IR assumption annotation as __attribute__((assume("omp_no_parallelism")))
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D105937
show more ...
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
#
e2cfbfcc |
| 17-Jun-2021 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
show more ...
|
Revision tags: llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
68d133a3 |
| 22-Mar-2021 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Simplify GPU memory globalization
Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share da
[OpenMP] Simplify GPU memory globalization
Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share data between its threads so must allocate global or shared memory to store the data in. Currently this is implemented fully in the frontend using the `__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard CPU stack sharing. The front-end scans the target region for variables that escape the region and must be shared between the threads. Each variable then has a field created for it in a global record type.
This patch replaces this functinality with a single allocation command, effectively mimicing an alloca instruction for the variables that must be shared between the threads. This will be much slower than the current solution, but makes it much easier to optimize as we can analyze each variable independently and determine if it is not captured. In the future, we can replace these calls with an `alloca` and small allocations can be pushed to shared memory.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D97680
show more ...
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
#
2e6e4e6a |
| 23-Nov-2020 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Add initial support for `omp [begin/end] assumes`
The `assumes` directive is an OpenMP 5.1 feature that allows the user to provide assumptions to the optimizer. Assumptions can refer to dir
[OpenMP] Add initial support for `omp [begin/end] assumes`
The `assumes` directive is an OpenMP 5.1 feature that allows the user to provide assumptions to the optimizer. Assumptions can refer to directives (`absent` and `contains` clauses), expressions (`holds` clause), or generic properties (`no_openmp_routines`, `ext_ABCD`, ...).
The `assumes` spelling is used for assumptions in the global scope while `assume` is used for executable contexts with an associated structured block.
This patch only implements the global spellings. While clauses with arguments are "accepted" by the parser, they will simply be ignored for now. The implementation lowers the assumptions directly to the `AssumptionAttr`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D91980
show more ...
|
#
a5a14cbe |
| 23-Nov-2020 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Add initial support for `omp [begin/end] assumes`
The `assumes` directive is an OpenMP 5.1 feature that allows the user to provide assumptions to the optimizer. Assumptions can refer to dir
[OpenMP] Add initial support for `omp [begin/end] assumes`
The `assumes` directive is an OpenMP 5.1 feature that allows the user to provide assumptions to the optimizer. Assumptions can refer to directives (`absent` and `contains` clauses), expressions (`holds` clause), or generic properties (`no_openmp_routines`, `ext_ABCD`, ...).
The `assumes` spelling is used for assumptions in the global scope while `assume` is used for executable contexts with an associated structured block.
This patch only implements the global spellings. While clauses with arguments are "accepted" by the parser, they will simply be ignored for now. The implementation lowers the assumptions directly to the `AssumptionAttr`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D91980
show more ...
|