History log of /llvm-project/clang/test/OpenMP/target_teams_distribute_codegen.cpp (Results 1 – 25 of 86)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# 94473f4d 09-Aug-2024 Hari Limaye <hari.limaye@arm.com>

[IRBuilder] Generate nuw GEPs for struct member accesses (#99538)

Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.

Regression tests are updated using update

[IRBuilder] Generate nuw GEPs for struct member accesses (#99538)

Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.

Regression tests are updated using update scripts where possible, and by
find + replace where not.

show more ...


Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# b5d02bbd 19-Mar-2024 dhruvachak <Dhruva.Chakrabarti@amd.com>

[OpenMP] Increment kernel args version, used by runtime for detecting dyn_ptr. (#85363)

A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args ve

[OpenMP] Increment kernel args version, used by runtime for detecting dyn_ptr. (#85363)

A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.

If approved, this patch should be backported to release 18.

show more ...


Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4
# cc374d80 21-Feb-2024 Joseph Huber <huberjn@outlook.com>

[OpenMP] Remove `register_requires` global constructor (#80460)

Summary:
Currently, OpenMP handles the `omp requires` clause by emitting a global
constructor into the runtime for every translation u

[OpenMP] Remove `register_requires` global constructor (#80460)

Summary:
Currently, OpenMP handles the `omp requires` clause by emitting a global
constructor into the runtime for every translation unit that requires
it. However, this is not a great solution because it prevents us from
having a defined order in which the runtime is accessed and used.

This patch changes the approach to no longer use global constructors,
but to instead group the flag with the other offloading entires that we
already handle. This has the effect of still registering each flag per
requires TU, but now we have a single constructor that handles
everything.

This function removes support for the old `__tgt_register_requires` and
replaces it with a warning message. We just had a recent release, and
the OpenMP policy for the past four releases since we switched to LLVM
is that we do not provide strict backwards compatibility between major
LLVM releases now that the library is versioned. This means that a user
will need to recompile if they have an old binary that relied on
`register_requires` having the old behavior. It is important that we
actively deprecate this, as otherwise it would not solve the problem of
having no defined init and shutdown order for `libomptarget`. The
problem of `libomptarget` not having a define init and shutdown order
cascades into a lot of other issues so I have a strong incentive to be
rid of it.

It is worth noting that the current `__tgt_offload_entry` only has space
for a 32-bit integer here. I am planning to overhaul these at some point
as well.

show more ...


Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5
# b8cbc5c0 01-Nov-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)

The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the

[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)

The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.

Fixes: https://github.com/llvm/llvm-project/issues/70249

show more ...


Revision tags: llvmorg-17.0.4
# 84a3aadf 20-Oct-2023 Aaron Ballman <aaron@aaronballman.com>

Diagnose use of VLAs in C++ by default

Reapplication of 7339c0f782d5c70e0928f8991b0c05338a90c84c with a fix
for a crash involving arrays without a size expression.

Clang supports VLAs in C++ as an

Diagnose use of VLAs in C++ by default

Reapplication of 7339c0f782d5c70e0928f8991b0c05338a90c84c with a fix
for a crash involving arrays without a size expression.

Clang supports VLAs in C++ as an extension, but we currently only warn
on their use when you pass -Wvla, -Wvla-extension, or -pedantic.
However, VLAs as they're expressed in C have been considered by WG21
and rejected, are easy to use accidentally to the surprise of users
(e.g., https://ddanilov.me/default-non-standard-features/), and they
have potential security implications beyond constant-size arrays
(https://wiki.sei.cmu.edu/confluence/display/c/ARR32-C.+Ensure+size+arguments+for+variable+length+arrays+are+in+a+valid+range).
C++ users should strongly consider using other functionality such as
std::vector instead.

This seems like sufficiently compelling evidence to warn users about
VLA use by default in C++ modes. This patch enables the -Wvla-extension
diagnostic group in C++ language modes by default, and adds the warning
group to -Wall in GNU++ language modes. The warning is still opt-in in
C language modes, where support for VLAs is somewhat less surprising to
users.

RFC: https://discourse.llvm.org/t/rfc-diagnosing-use-of-vlas-in-c/73109
Fixes https://github.com/llvm/llvm-project/issues/62836
Differential Revision: https://reviews.llvm.org/D156565

show more ...


# f5043f46 20-Oct-2023 Aaron Ballman <aaron@aaronballman.com>

Revert "Diagnose use of VLAs in C++ by default"

This reverts commit 7339c0f782d5c70e0928f8991b0c05338a90c84c.

Breaks bots:
https://lab.llvm.org/buildbot/#/builders/139/builds/51875
https://lab.llvm

Revert "Diagnose use of VLAs in C++ by default"

This reverts commit 7339c0f782d5c70e0928f8991b0c05338a90c84c.

Breaks bots:
https://lab.llvm.org/buildbot/#/builders/139/builds/51875
https://lab.llvm.org/buildbot/#/builders/164/builds/45262

show more ...


# 7339c0f7 20-Oct-2023 Aaron Ballman <aaron@aaronballman.com>

Diagnose use of VLAs in C++ by default

Clang supports VLAs in C++ as an extension, but we currently only warn
on their use when you pass -Wvla, -Wvla-extension, or -pedantic.
However, VLAs as they'r

Diagnose use of VLAs in C++ by default

Clang supports VLAs in C++ as an extension, but we currently only warn
on their use when you pass -Wvla, -Wvla-extension, or -pedantic.
However, VLAs as they're expressed in C have been considered by WG21
and rejected, are easy to use accidentally to the surprise of users
(e.g., https://ddanilov.me/default-non-standard-features/), and they
have potential security implications beyond constant-size arrays
(https://wiki.sei.cmu.edu/confluence/display/c/ARR32-C.+Ensure+size+arguments+for+variable+length+arrays+are+in+a+valid+range).
C++ users should strongly consider using other functionality such as
std::vector instead.

This seems like sufficiently compelling evidence to warn users about
VLA use by default in C++ modes. This patch enables the -Wvla-extension
diagnostic group in C++ language modes by default, and adds the warning
group to -Wall in GNU++ language modes. The warning is still opt-in in
C language modes, where support for VLAs is somewhat less surprising to
users.

RFC: https://discourse.llvm.org/t/rfc-diagnosing-use-of-vlas-in-c/73109
Fixes https://github.com/llvm/llvm-project/issues/62836
Differential Revision: https://reviews.llvm.org/D156565

show more ...


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# f3958ce0 21-Aug-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Add NVIDIA annotations for static grid thread limit

We already add AMD GPU annotations, the NVIDIA ones are just a little
more convoluted to add/update but otherwise the same.
We see again

[OpenMP] Add NVIDIA annotations for static grid thread limit

We already add AMD GPU annotations, the NVIDIA ones are just a little
more convoluted to add/update but otherwise the same.
We see again that the interplay of ompx_attribute and deduced value
needs to be improved, see the TODO.

Differential Revision: https://reviews.llvm.org/D158383

show more ...


# c5488c8d 19-Aug-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Properly set static thread limit (w/o analysis)

We used to have two separate implementations to derive the number of
threads used in a target region. This lead us to sometimes miss out on
u

[OpenMP] Properly set static thread limit (w/o analysis)

We used to have two separate implementations to derive the number of
threads used in a target region. This lead us to sometimes miss out on
user provided thread bounds (num_threads, or thread_limit) when we
looked for "constant default values". If we might miss out on the
presence of those bounds, we cannot set the thread_limit statically
since the runtime will try to honor user input rather than cap it at the
"preferred default". This patch replaces the secondary implementation
with the primary in a mode that will not emit code but just look for the
presence, and potentially upper bounds, of thread limiting clauses.

The runtime test would not pass without this rewrite as we missed some
clauses, set the static limit on the device to the preferred value, but
then violated that value at runtime.

Fixes: https://github.com/llvm/llvm-project/issues/64845

Differential Revision: https://reviews.llvm.org/D158381

show more ...


# 5a64ae75 20-Aug-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP][NFC] Update clang OpenMP tests

Just re-running the script to make future updates easier


Revision tags: llvmorg-17.0.0-rc2
# 25bc999d 29-Jul-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

Intrinsics: Add type overload to stacksave and stackstore

This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but

Intrinsics: Add type overload to stacksave and stackstore

This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.

https://reviews.llvm.org/D156666

show more ...


Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init
# a709c49d 08-Jul-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

clang: Regenerate OpenMP tests

Avoid diffs from no longer hardcoding metadata checks


# 63ca93c7 06-Jul-2023 Sergio Afonso <safonsof@amd.com>

[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags

This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whe

[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags

This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.

`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.

Differential Revision: https://reviews.llvm.org/D154591

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 196c144d 29-Mar-2023 David Tenty <daltenty@ibm.com>

[clang][CodeGenCXX] Improve handling of itanium ABI member function alignment requirements

The itanium ABI for certain platforms requires a minimum alignments for
member function pointers to reserve

[clang][CodeGenCXX] Improve handling of itanium ABI member function alignment requirements

The itanium ABI for certain platforms requires a minimum alignments for
member function pointers to reserve certain bits for distinguishing
virtual and non-virtual functions.

Our implementation of this however depends on the alignment of the
function involved, which may however not reflect the true alignment of
function pointers on certain targets for which the alignment is
independent of the function (e.g. AIX). Worse, the 2-byte alignment
we use may be less than the ABI minimum for the target, and in the case
we are using explicit sections will result in invalid codegen.

This patch attempts to correct this situation by considering the target
alignment of function pointers as part of making the decision about
whether we need to adjust the function alignment to conform to the ABI.
Targets which do not provide the function ptr alignment information
will return a value of 1 when queried and will conservatively retain
the old alignment.

Differential Revision: https://reviews.llvm.org/D147184

show more ...


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 782c59a4 23-Dec-2022 Itay Bookstein <itay.bookstein@nextsilicon.com>

[OpenMP] Prefix outlined and reduction func names with original func's name

This patch prefixes omp outlined helpers and reduction funcs
with the original function's name.

Reviewed By: jdoerfert

D

[OpenMP] Prefix outlined and reduction func names with original func's name

This patch prefixes omp outlined helpers and reduction funcs
with the original function's name.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D140722

show more ...


# 6fdd13e0 19-Apr-2023 Itay Bookstein <itay.bookstein@nextsilicon.com>

Revert "[OpenMP] Prefix outlined and reduction func names with original func's name"

This reverts commit 029bfc311d4d7d3cd90be81bb08c046848796d02.


# 029bfc31 23-Dec-2022 Itay Bookstein <itay.bookstein@nextsilicon.com>

[OpenMP] Prefix outlined and reduction func names with original func's name

This patch attempts to prefix omp outlined helpers and reduction funcs
with the original function's name.

Reviewed By: jd

[OpenMP] Prefix outlined and reduction func names with original func's name

This patch attempts to prefix omp outlined helpers and reduction funcs
with the original function's name.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D140722

show more ...


# 1c9ec74e 17-Mar-2023 Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>

[Clang][OpenMP] Insert alloca for kernel args at function entry block instead of the launch point.

If an inlined kernel is called in a loop, the launch point alloca would
lead to increasing stack us

[Clang][OpenMP] Insert alloca for kernel args at function entry block instead of the launch point.

If an inlined kernel is called in a loop, the launch point alloca would
lead to increasing stack usage every time the kernel is invoked. This
could make the application run out of stack space and crash. This problem
is fixed by using the alloca insertion point while creating the alloca instruction.

Fixes https://github.com/llvm/llvm-project/issues/60602

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D145820

show more ...


# 16a385ba 19-Jan-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Modernize the kernel launching interface and APIs

We already created a versioned `__tgt_kernel_arguments` struct but it
was only briefly used and its content was passed in isolation anyway.

[OpenMP] Modernize the kernel launching interface and APIs

We already created a versioned `__tgt_kernel_arguments` struct but it
was only briefly used and its content was passed in isolation anyway.
This makes it hard to add more information in the future. With this
patch we fully embrace the struct as means to pass information from the
compiler to the plugin as part of a kernel launch.

The patch also extends and renames the struct, bumping the version
number to 2. Version 1 entries are auto-upgraded. This is in preparation
for "bare" kernel launches, per kernel dynamic shared memory, CUDA/HIP
lowering, etc.

The `__tgt_target_kernel_nowait` interface was deprecated as it was
unused. Once we actually implement support for something like that, we
can add an appropriate API.

Note: Only plugins with the `launch_kernel` interface are now supported.
That means that a new clang won't be able to use an old runtime.
An old clang can still use the new runtime since the libomptarget
interface did not change.

Differential Revision: https://reviews.llvm.org/D141232

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# 40e353d0 07-Oct-2022 Nikita Popov <npopov@redhat.com>

[OpenMP] Convert more tests to opaque pointers (NFC)

These were converted using the script at
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
followed by a re-run of update_cc_test_ch

[OpenMP] Convert more tests to opaque pointers (NFC)

These were converted using the script at
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
followed by a re-run of update_cc_test_checks.py.

show more ...


Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1
# 839ac62c 15-Sep-2022 Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>

Revert "[OpenMP] Codegen aggregate for outlined function captures"

This reverts commit 7539e9cf811e590d9f12ae39673ca789e26386b4.


# 7539e9cf 15-Sep-2022 Giorgis Georgakoudis <georgakoudis1@llnl.gov>

[OpenMP] Codegen aggregate for outlined function captures

Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument lis

[OpenMP] Codegen aggregate for outlined function captures

Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6, ABataev

Differential Revision: https://reviews.llvm.org/D102107

show more ...


Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 1ddc51d8 15-Jul-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

Inliner: don't mark call sites as 'nounwind' if that would be redundant

When F calls G calls H, G is nounwind, and G is inlined into F, then the
inlined call-site to H should be effectively nounwind

Inliner: don't mark call sites as 'nounwind' if that would be redundant

When F calls G calls H, G is nounwind, and G is inlined into F, then the
inlined call-site to H should be effectively nounwind so as not to lose
information during inlining.

If H itself is nounwind (which often happens when H is an intrinsic), we
no longer mark the callsite explicitly as nounwind. Previously, there
were cases where the inlined call-site of H differs from a pre-existing
call-site of H in F *only* in the explicitly added nounwind attribute,
thus preventing common subexpression elimination.

v2:
- just check CI->doesNotThrow

v3 (resubmit after revert at 344378808778c61d5599f4e0ac783ef7e6f8ed05):
- update Clang tests

Differential Revision: https://reviews.llvm.org/D129860

show more ...


# 1586075a 18-Jul-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

Rerun ./utils/update_cc_test.py on a bunch of tests

Due to update script changes; this reduces the size of a later "real"
diff.


# 5300263c 27-Jun-2022 Joseph Huber <jhuber6@vols.utk.edu>

[OpenMP] Add loop tripcount argument to kernel launch and remove push function

Previously we added the `push_target_tripcount` function to send the
loop tripcount to the device runtime so we knew ho

[OpenMP] Add loop tripcount argument to kernel launch and remove push function

Previously we added the `push_target_tripcount` function to send the
loop tripcount to the device runtime so we knew how to configure the
teams / threads for execute the loop for a teams distribute construct.
This was implemented as a separate function mostly to avoid changing the
interface for backwards compatbility. Now that we've changed it anyway
and the new interface can take an arbitrary number of arguments via the
struct without changing the ABI, we can move this to the new interface.
This will simplify the runtime by removing unnecessary state between
calls.

Depends on D128550

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D128816

show more ...


1234