Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
7eca38ce |
| 05-Sep-2024 |
Hari Limaye <hari.limaye@arm.com> |
Reland "[clang] Add nuw attribute to GEPs (#105496)" (#107257)
Add nuw attribute to inbounds GEPs where the expression used to form the
GEP is an addition of unsigned indices.
Relands #105496, which was reverted because it exposed a miscompilation
arising from #98608. This is now fixed by #106512.
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
69437a39 |
| 28-Aug-2024 |
Vitaly Buka <vitalybuka@google.com> |
Revert "[clang] Add nuw attribute to GEPs" (#106343)
Reverts llvm/llvm-project#105496
This patch breaks:
https://lab.llvm.org/buildbot/#/builders/25/builds/1952
https://lab.llvm.org/buildbot/#/builders/52/builds/1775
Somehow output is different with sanitizers.
Maybe non-determinism in the code?
|
#
3d2fd31c |
| 27-Aug-2024 |
Hari Limaye <hari.limaye@arm.com> |
[clang] Add nuw attribute to GEPs (#105496)
Add nuw attribute to inbounds GEPs where the expression used to form the
GEP is an addition of unsigned indices.
|
Revision tags: llvmorg-19.1.0-rc3 |
|
#
94473f4d |
| 09-Aug-2024 |
Hari Limaye <hari.limaye@arm.com> |
[IRBuilder] Generate nuw GEPs for struct member accesses (#99538)
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.
Regression tests are updated using update scripts where possible, and by
find + replace where not.
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
6b1c51bc |
| 26-Jun-2024 |
Akash Banerjee <akash.banerjee@amd.com> |
[OpenMP] Migrate GPU Reductions CodeGen from Clang to OMPIRBuilder (#80343)
This patch migrates CGOpenMPRuntimeGPU::emitReduction and related functions to the OpenMPIRBuilder. In future patches, the MLIR OpenMP translation will make use of these functions.
Co-authored-by: Jan Leyonberg <jan.leyonberg@amd.com>
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
dfc6a193 |
| 22-May-2024 |
Aaron Ballman <aaron@aaronballman.com> |
Reword OpenMP diagnostics for style; NFC
Three different OpenMP diagnostics were starting with a capital letter, so this makes them all lowercase and updates the tests accordingly.
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
b5d02bbd |
| 19-Mar-2024 |
dhruvachak <Dhruva.Chakrabarti@amd.com> |
[OpenMP] Increment kernel args version, used by runtime for detecting dyn_ptr. (#85363)
A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.
If approved, this patch should be backported to release 18.
|
Revision tags: llvmorg-18.1.2 |
|
#
4e3310a8 |
| 12-Mar-2024 |
mikaoP <raul.penacoba@bsc.es> |
[clang] Fix OMPT ident flag in combined distribute parallel for pragma (#80987)
Authored-by: Raúl Peñacoba Veigas <rpenacob@bsc.es>
|
Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4 |
|
#
cc374d80 |
| 21-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
[OpenMP] Remove `register_requires` global constructor (#80460)
Summary: Currently, OpenMP handles the `omp requires` clause by emitting a global constructor into the runtime for every translation unit that requires it. However, this is not a great solution because it prevents us from having a defined order in which the runtime is accessed and used.
This patch changes the approach to no longer use global constructors, but to instead group the flag with the other offloading entries that we already handle. This has the effect of still registering each flag per requires TU, but now we have a single constructor that handles everything.
This patch removes support for the old `__tgt_register_requires` and replaces it with a warning message. We just had a recent release, and the OpenMP policy for the past four releases since we switched to LLVM is that we do not provide strict backwards compatibility between major LLVM releases now that the library is versioned. This means that a user will need to recompile if they have an old binary that relied on `register_requires` having the old behavior. It is important that we actively deprecate this, as otherwise it would not solve the problem of having no defined init and shutdown order for `libomptarget`. The problem of `libomptarget` not having a defined init and shutdown order cascades into a lot of other issues, so I have a strong incentive to be rid of it.
It is worth noting that the current `__tgt_offload_entry` only has space for a 32-bit integer here. I am planning to overhaul these at some point as well.
|
Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
3de645ef |
| 03-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Split the reduction buffer size into two components
Previously, we tracked the size of the teams reduction buffer in order to allocate it at runtime per kernel launch. This patch splits the number into two parts: the size of the reduction data (= all reduction variables) and the (maximal) length of the buffer. This will allow us to allocate less if we need less, e.g., if we have fewer teams than the maximal length. It also allows us to move code from Clang's codegen into the runtime, as we now know how large the reduction data is.
|
#
921bd299 |
| 03-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Remove alignment for global <-> local reduction functions
The alignment likely did not help much but increased the memory requirement. Note that half of the affected accesses are all performed by a single thread in each block. The reads are by consecutive threads in a single block.
|
#
d3e7a48c |
| 03-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Remove a no-op function
|
#
b8cbc5c0 |
| 01-Nov-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the runtime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More use cases will
follow, e.g., per-launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
c5488c8d |
| 19-Aug-2023 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Properly set static thread limit (w/o analysis)
We used to have two separate implementations to derive the number of threads used in a target region. This led us to sometimes miss user-provided thread bounds (num_threads or thread_limit) when we looked for "constant default values". If we might miss the presence of those bounds, we cannot set the thread_limit statically, since the runtime will try to honor user input rather than cap it at the "preferred default". This patch replaces the secondary implementation with the primary one in a mode that will not emit code but just look for the presence, and potentially upper bounds, of thread-limiting clauses.
The runtime test would not pass without this rewrite as we missed some clauses, set the static limit on the device to the preferred value, but then violated that value at runtime.
Fixes: https://github.com/llvm/llvm-project/issues/64845
Differential Revision: https://reviews.llvm.org/D158381
|
Revision tags: llvmorg-17.0.0-rc2 |
|
#
25bc999d |
| 29-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Intrinsics: Add type overload to stacksave and stackstore
This allows use with non-0 address space stacks. llvm_ptr_ty should never be used. This could use some more percolation up through mlir, but this is enough to fix existing tests.
https://reviews.llvm.org/D156666
|
Revision tags: llvmorg-17.0.0-rc1 |
|
#
10068cd6 |
| 26-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depend on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
|
Revision tags: llvmorg-18-init |
|
#
6bd74fd6 |
| 24-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
Revert commits for kernel environment
This reverts commits for kernel environments as they cause issues in AMD BB.
|
#
c5c80403 |
| 23-Jul-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Depend on D155886.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
|
#
63ca93c7 |
| 06-Jul-2023 |
Sergio Afonso <safonsof@amd.com> |
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes `IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to `-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed to `omp.is_target_device`. Getters and setters of all these renamed properties are also updated accordingly. Many unit tests have been updated to use the new names, but an alias for the `-fopenmp-is-device` option is created so that external programs do not stop working after the name change.
`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the `-fopenmp-is-target-device` compiler frontend option, which is only added to the OpenMP device invocation for offloading-enabled programs.
Differential Revision: https://reviews.llvm.org/D154591
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1 |
|
#
196c144d |
| 29-Mar-2023 |
David Tenty <daltenty@ibm.com> |
[clang][CodeGenCXX] Improve handling of itanium ABI member function alignment requirements
The Itanium ABI for certain platforms requires a minimum alignment for member function pointers to reserve certain bits for distinguishing virtual and non-virtual functions.
Our implementation of this, however, depends on the alignment of the function involved, which may not reflect the true alignment of function pointers on certain targets for which the alignment is independent of the function (e.g. AIX). Worse, the 2-byte alignment we use may be less than the ABI minimum for the target, and in the case where we are using explicit sections it will result in invalid codegen.
This patch attempts to correct this situation by considering the target alignment of function pointers as part of making the decision about whether we need to adjust the function alignment to conform to the ABI. Targets which do not provide the function ptr alignment information will return a value of 1 when queried and will conservatively retain the old alignment.
Differential Revision: https://reviews.llvm.org/D147184
|
#
dd029845 |
| 12-May-2023 |
Joseph Huber <jhuber6@vols.utk.edu> |
[OpenMP] Naturally align internal global variables in the OpenMPIRBuilder
We use this helper to make several internal global variables during codegen. Currently we do not specify any alignment, which allows the alignment to be set incorrectly after some changes in how alignment was handled. This patch explicitly aligns these variables to the natural alignment as specified by the data layout.
Fixes https://github.com/llvm/llvm-project/issues/62668
Reviewed By: tianshilei1992, gchatelet
Differential Revision: https://reviews.llvm.org/D150461
|
#
d4ecd124 |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.
It makes a couple of buildbots unhappy because of the following test failures:
- `Transforms/OpenMP/add_attributes.ll`
- `mapping/declare_mapper_target_data.cpp` on AMDGPU
|
#
35cfadfb |
| 23-Apr-2023 |
Shilei Tian <i@tianshilei.me> |
[OpenMP] Introduce kernel environment
This patch introduces a per-kernel environment. Previously, flags such as the execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime.
This is a combination and refinement of patch series D116908, D116909, and D116910.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142569
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
782c59a4 |
| 23-Dec-2022 |
Itay Bookstein <itay.bookstein@nextsilicon.com> |
[OpenMP] Prefix outlined and reduction func names with original func's name
This patch prefixes omp outlined helpers and reduction funcs with the original function's name.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D140722
|
#
6fdd13e0 |
| 19-Apr-2023 |
Itay Bookstein <itay.bookstein@nextsilicon.com> |
Revert "[OpenMP] Prefix outlined and reduction func names with original func's name"
This reverts commit 029bfc311d4d7d3cd90be81bb08c046848796d02.
|