History log of /llvm-project/clang/test/OpenMP/ompx_attributes_codegen.cpp (Results 1 – 6 of 6)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# 3c8efd79 23-Jul-2024 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Ensure the actual kernel is annotated with launch bounds (#99927)

In debug mode there is a wrapper (the kernel) around the function in
which we generate the kernel code. We worked around t

[OpenMP] Ensure the actual kernel is annotated with launch bounds (#99927)

In debug mode there is a wrapper (the kernel) around the function in
which we generate the kernel code. We worked around this before to get
the correct kernel name, but now we really distinguish both to attach
the launch bounds to the kernel, not the inner function.

show more ...


Revision tags: llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4
# 0287a5cc 15-Apr-2024 Joseph Huber <huberjn@outlook.com>

[OpenMP] Remove 'minncta' attributes from NVPTX kernels (#88398)

Summary:
Currently we treat this attribute as a minimum number for the amount of
blocks scheduled on the kernel. However, the doucm

[OpenMP] Remove 'minncta' attributes from NVPTX kernels (#88398)

Summary:
Currently we treat this attribute as a minimum number for the amount of
blocks scheduled on the kernel. However, the doucmentation states that
this applies to CTA's mapped onto a *single* SM. Currently we just set
it to the total number of blocks, which will almost always result in a
warning that the value is out of range and will be ignored. We don't
have a good way to automatically know how many CTAs can be put on a
single SM nor if we should do this, so we should probably leave this up
to users manually adding it.


https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#performance-tuning-directives-minnctapersm

show more ...


Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5
# b8cbc5c0 01-Nov-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)

The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the

[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)

The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.

Fixes: https://github.com/llvm/llvm-project/issues/70249

show more ...


Revision tags: llvmorg-17.0.4
# 0ba57c8b 26-Oct-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Pass min/max thread and team count to the OMPIRBuilder (#70247)

We now provide the information about the min/max thread and team count
from to the OMPIRBuilder, no matter what the source w

[OpenMP] Pass min/max thread and team count to the OMPIRBuilder (#70247)

We now provide the information about the min/max thread and team count
from to the OMPIRBuilder, no matter what the source was. That means we
unify `thread_limit`, `num_teams`, `num_threads` handling with the
target specific attriutes (`__launch_bounds__` and
`amdgpu_flat_work_group_size`). This is in preparation to pass the
values to the runtime, and to allow the middle-end (OpenMP-opt) to
tighten the values if it seems appropriate. There is no "real" change
after this commit.

show more ...


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 08a22076 25-Jul-2023 Johannes Doerfert <johannes@jdoerfert.de>

Reapply "[OpenMP] Add the `ompx_attribute` clause for target directives"

This reverts commit 0d12683046ca75fb08e285f4622f2af5c82609dc and
reapplies ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 with an e

Reapply "[OpenMP] Add the `ompx_attribute` clause for target directives"

This reverts commit 0d12683046ca75fb08e285f4622f2af5c82609dc and
reapplies ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 with an extension to
fix the Flang build.

Differential Revision: https://reviews.llvm.org/D156184

show more ...


Revision tags: llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4
# ef9ec4bb 03-Mar-2023 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] Add the `ompx_attribute` clause for target directives

CUDA and HIP have kernel attributes to tune the code generation (in the
backend). To reuse this functionality for OpenMP target regions

[OpenMP] Add the `ompx_attribute` clause for target directives

CUDA and HIP have kernel attributes to tune the code generation (in the
backend). To reuse this functionality for OpenMP target regions we
introduce the `ompx_attribute` clause that takes these kernel
attributes and emits code as if they had been attached to the kernel
fuction (which is implicitly generated).

To limit the impact, we only support three kernel attributes:
`amdgpu_waves_per_eu`, for AMDGPU
`amdgpu_flat_work_group_size`, for AMDGPU
`launch_bounds`, for NVPTX

The existing implementations of those attributes are used for error
checking and code generation. `ompx_attribute` can be attached to any
executable target region and it can hold more than one kernel attribute.

Differential Revision: https://reviews.llvm.org/D156184

show more ...