__clang_openmp_device_functions.h - OpenGrok history log for /llvm-project/clang/lib/Headers/openmp_wrappers/__clang_openmp_device

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# 80525dfc	13-Aug-2024	Johannes Doerfert <johannes@jdoerfert.de>	[Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (#94549) Through the new `-foffload-via-llvm` flag, CUDA kernels can now be lowered to the LLVM/Offload API. On the Clang side, this is simply [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (#94549) Through the new `-foffload-via-llvm` flag, CUDA kernels can now be lowered to the LLVM/Offload API. On the Clang side, this is simply done by using the OpenMP offload toolchain and emitting calls to `llvm` functions to orchestrate the kernel launch rather than `cuda` functions. These `llvm` functions are implemented on top of the existing LLVM/Offload API. As we are about to redefine the Offload API, this wil help us in the design process as a second offload language. We do not support any CUDA APIs yet, however, we could: https://www.osti.gov/servlets/purl/1892137 For proper host execution we need to resurrect/rebase https://tianshilei.me/wp-content/uploads/2021/12/llpp-2021.pdf (which was designed for debugging). ``` ❯❯❯ cat test.cu extern "C" { void llvm_omp_target_alloc_shared(size_t Size, int DeviceNum); void llvm_omp_target_free_shared(void DevicePtr, int DeviceNum); } __global__ void square(int A) { A = 42; } int main(int argc, char argv) { int DevNo = 0; int Ptr = reinterpret_cast<int >(llvm_omp_target_alloc_shared(4, DevNo)); Ptr = 7; printf("Ptr %p, Ptr %i\n", Ptr, Ptr); square<<<1, 1>>>(Ptr); printf("Ptr %p, Ptr %i\n", Ptr, Ptr); llvm_omp_target_free_shared(Ptr, DevNo); } ❯❯❯ clang++ test.cu -O3 -o test123 -foffload-via-llvm --offload-arch=native ❯❯❯ llvm-objdump --offloading test123 test123: file format elf64-x86-64 OFFLOADING IMAGE [0]: kind elf arch gfx90a triple amdgcn-amd-amdhsa producer openmp ❯❯❯ LIBOMPTARGET_INFO=16 ./test123 Ptr 0x155448ac8000, Ptr 7 Ptr 0x155448ac8000, Ptr 42 ``` show more ...
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# dee1f5b3	18-Jun-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	OpenMP: Don't include stdbool.h in builtin headers Pre-C99 didn't include bool, and C99 allows you to redefine true/false apparently.
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# f3eb5f90	04-Aug-2021	Pushpinder Singh <Pushpinder.Singh@amd.com>	[AMDGPU][OpenMP] Wrap amdgcn declare variant inside ifdef This fixes the issue https://bugs.llvm.org/show_bug.cgi?id=51337 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm. [AMDGPU][OpenMP] Wrap amdgcn declare variant inside ifdef This fixes the issue https://bugs.llvm.org/show_bug.cgi?id=51337 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D107468 show more ...
Revision tags: llvmorg-13.0.0-rc1
# 713a5d12	02-Aug-2021	Pushpinder Singh <Pushpinder.Singh@amd.com>	[OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next pa [OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next patch. Reviewed By: JonChesterfield, jdoerfert, scchan Differential Revision: https://reviews.llvm.org/D104904 show more ...
# 7f97ddaf	30-Jul-2021	Jon Chesterfield <jonathanchesterfield@gmail.com>	Revert "[OpenMP][AMDGCN] Initial math headers support" Broke nvptx compilation on files including <complex> This reverts commit 12da97ea10a941f0123340831300d09a2121e173.
# 12da97ea	30-Jul-2021	Pushpinder Singh <Pushpinder.Singh@amd.com>	[OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next pa [OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next patch. Reviewed By: JonChesterfield, jdoerfert, scchan Differential Revision: https://reviews.llvm.org/D104904 show more ...
Revision tags: llvmorg-14-init
# d71062fb	21-Jul-2021	Jon Chesterfield <jonathanchesterfield@gmail.com>	Revert "[OpenMP][AMDGCN] Initial math headers support" This reverts commit 968899ad9cf17579f9867dafb35c4d97bad0863f.
# 968899ad	21-Jul-2021	Pushpinder Singh <Pushpinder.Singh@amd.com>	[OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next pa [OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next patch. Reviewed By: JonChesterfield, jdoerfert, scchan Differential Revision: https://reviews.llvm.org/D104904 show more ...
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 5d8d994d	16-Apr-2021	Johannes Doerfert <johannes@jdoerfert.de>	[OpenMP] Make sure classes work on the device as they do on the host We do provide `operator delete(void)` in `<new>` but it should be available by default. This is mostly boilerplate to test it an [OpenMP] Make sure classes work on the device as they do on the host We do provide `operator delete(void)` in `<new>` but it should be available by default. This is mostly boilerplate to test it and the unconditional include of `<new>` in the header we always in include on the device. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D100620 show more ...
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init
# 7f1e6fcf	10-Jul-2020	Johannes Doerfert <johannes@jdoerfert.de>	[OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in wrapper headers Due to recent changes we cannot use OpenMP in CUDA files anymore (PR45533) as the math handling of CUDA is different when _OPENMP [OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in wrapper headers Due to recent changes we cannot use OpenMP in CUDA files anymore (PR45533) as the math handling of CUDA is different when _OPENMP is defined. We actually want this different behavior only if we are offloading with OpenMP to NVIDIA, thus generating NVPTX. With this patch we do not interfere with the CUDA math handling except if we are in NVPTX offloading mode, as indicated by the presence of __OPENMP_NVPTX__. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D78155 show more ...
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# f85ae058	28-Mar-2020	Johannes Doerfert <johannes@jdoerfert.de>	[OpenMP] Provide math functions in OpenMP device code via OpenMP variants For OpenMP target regions to piggy back on the CUDA/AMDGPU/... implementation of math functions, we include the appropriate [OpenMP] Provide math functions in OpenMP device code via OpenMP variants For OpenMP target regions to piggy back on the CUDA/AMDGPU/... implementation of math functions, we include the appropriate definitions inside of an `omp begin/end declare variant match(device={arch(nvptx)})` scope. This way, the vendor specific math functions will become specialized versions of the system math functions. When a system math function is called and specialized version is available the selection logic introduced in D75779 instead call the specialized version. In contrast to the code path we used so far, the system header is actually included. This means functions without specialized versions are available and so are macro definitions. This should address PR42061, PR42798, and PR42799. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D75788 show more ...