Please do not hesitate to reach out to us on the `Discourse forums (Runtimes - OpenMP) <https://discourse.llvm.org/c/runtimes/openmp/35>`_ or join
-----
- Development updates on OpenMP (and OpenACC) in the LLVM Project, including Clang, optimization, and runtime work.
- Join `OpenMP in LLVM Technical Call <https://bluejeans.com/544112769//webrtc>`__.
- Time: Weekly call every Wednesday at 7:00 AM Pacific time.
- Meeting minutes are `here <https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit>`__.
- Status tracking `page <https://openmp.llvm.org/docs>`__.
- Development updates on OpenMP and OpenACC in the Flang Project.
- Join `OpenMP in Flang Technical Call <https://bit.ly/39eQW3o>`_
- Time: Weekly call every Thursday at 8:00 AM Pacific time.
- Meeting minutes are `here <https://docs.google.com/document/d/1yA-MeJf6RYY-ZXpdol0t7YoDoqtwAyBhFLr5thu5pFI>`__.
- Status tracking `page <https://docs.google.com/spreadsheets/d/1FvHPuSkGbl4mQZRAwCIndvQx9dQboffiD-xD0oqxgU0/edit#gid=0>`__.
---
additions. Please post on the `Discourse forums (Runtimes - OpenMP) <https://discourse.llvm.org/c/runtimes/openmp/35>`__.
<https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.
Q: How to build an OpenMP GPU offload capable compiler?
.. code-block:: sh

   $> cd llvm-project # The llvm-project checkout
   $> mkdir build
   $> cd build
   $> cmake ../llvm -G Ninja \
      -C ../offload/cmake/caches/Offload.cmake \ # The preset cache file
      -DCMAKE_BUILD_TYPE=<Debug|Release> \       # Select build type
      -DCMAKE_INSTALL_PREFIX=<PATH> \            # Where the libraries will live
To manually build an *effective* OpenMP offload capable compiler, only one extra CMake
<https://llvm.org/docs/GettingStarted.html>`__.). Make sure all backends that
.. _advanced_builds: https://llvm.org/docs/AdvancedBuilds.html
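As a sketch, such a manual configuration might look as follows. This is an illustrative assumption, not a fixed recipe: the project list, runtimes list, and GPU backends shown should be adapted to your system.

.. code-block:: sh

   $> cd llvm-project
   $> mkdir build && cd build
   $> # Illustrative settings; adjust projects, runtimes, and backends as needed.
   $> cmake ../llvm -G Ninja \
        -DLLVM_ENABLE_PROJECTS="clang;lld" \
        -DLLVM_ENABLE_RUNTIMES="openmp;offload" \
        -DLLVM_TARGETS_TO_BUILD="host;AMDGPU;NVPTX" \
        -DCMAKE_BUILD_TYPE=Release
   $> ninja install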
Q: How to build an OpenMP Nvidia offload capable compiler?
If your build machine is not the target machine or automatic detection of the
- ``LIBOMPTARGET_DEVICE_ARCHITECTURES='sm_<xy>;...'`` where ``<xy>`` is the numeric
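For example, when cross-compiling for specific NVIDIA GPUs, the architecture list can be passed at configure time. A minimal sketch, assuming ``sm_70`` and ``sm_80`` as the desired architectures:

.. code-block:: sh

   $> cmake ../llvm -G Ninja \
        -C ../offload/cmake/caches/Offload.cmake \
        -DLIBOMPTARGET_DEVICE_ARCHITECTURES='sm_70;sm_80'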
Q: How to build an OpenMP AMDGPU offload capable compiler?
required to build the LLVM toolchain and to execute the OpenMP application.
build the required subcomponents ROCt and ROCr from source.
The two components used are the ROCT-Thunk-Interface (roct) and the ROCR-Runtime (rocr).
.. code-block:: text

   SOURCE_DIR=same-as-llvm-source # e.g. the checkout of llvm-project, next to openmp
   INSTALL_PREFIX=same-as-llvm-install

   git clone git@github.com:RadeonOpenCompute/ROCT-Thunk-Interface.git -b roc-4.2.x \
     --single-branch
   git clone git@github.com:RadeonOpenCompute/ROCR-Runtime.git -b rocm-4.2.x \
     --single-branch

   cmake $SOURCE_DIR/ROCT-Thunk-Interface/ -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX \
     -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF

   cmake $SOURCE_DIR/ROCR-Runtime/src -DIMAGE_SUPPORT=OFF \
     -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release \
     -DBUILD_SHARED_LIBS=ON
Provided CMake's ``find_package`` can find the ROCR-Runtime package, LLVM will
build a tool ``bin/amdgpu-arch`` which will print a string like ``gfx906`` when
run if it recognises a GPU on the local system. LLVM will also build a shared
With those libraries installed, and LLVM built and installed, try:
.. code-block:: shell

   clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa example.c -o example && ./example
If your build machine is not the target machine or automatic detection of the
- ``LIBOMPTARGET_DEVICE_ARCHITECTURES='gfx<xyz>;...'`` where ``<xyz>`` is the
of the ROCm device library, which will be searched for if linking with ``-lm``.
- ``libomp.so`` (or similar), the host OpenMP runtime
- ``libomptarget.so``, the target-agnostic target offloading OpenMP runtime
- plugins loaded by ``libomptarget.so``:

  - ``libomptarget.rtl.amdgpu.so``
  - ``libomptarget.rtl.cuda.so``
  - ``libomptarget.rtl.x86_64.so``
  - ``libomptarget.rtl.ve.so``
  - and others

- dependencies of those plugins, e.g. cuda/rocr for nvptx/amdgpu
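To check which plugins and devices the runtime actually finds at execution time, libomptarget's informational environment variables can be used. A hedged example: ``example`` stands for any OpenMP offload binary, and the exact output format varies between versions.

.. code-block:: shell

   LIBOMPTARGET_INFO=-1 ./example   # print all runtime information, including detected devices
   LIBOMPTARGET_DEBUG=1 ./example   # verbose debug output; requires a debug-enabled runtime build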
file <https://clang.llvm.org/docs/UsersManual.html#configuration-files>`__ to
.. code-block:: text

   -L '<CFGDIR>/../lib'
   -Wl,-rpath='<CFGDIR>/../lib'
The plugins will try to find their dependencies in a plugin-dependent fashion.
compiler build time. Otherwise it will attempt to dlopen ``libcuda.so``. It does
The amdgpu plugin is linked against ROCr if CMake found it at compiler build
time. Otherwise it will attempt to dlopen ``libhsa-runtime64.so``. It has rpath
set to ``$ORIGIN``, so installing ``libhsa-runtime64.so`` in the same directory is a
bitcode library, e.g. ``libomptarget-nvptx-sm_70.bc``.
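If the device bitcode library lives in a non-standard location, its path can be supplied explicitly on the compile line. A minimal sketch, in which the installation path is purely hypothetical:

.. code-block:: shell

   clang -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
     --libomptarget-nvptx-bc-path=/opt/llvm/lib/libomptarget-nvptx-sm_70.bc example.c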
flag, ``--libomptarget-nvptx-bc-path`` or ``--libomptarget-amdgcn-bc-path``. That
Q: Does OpenMP offloading support work in pre-packaged LLVM releases?
<https://clang.llvm.org/docs/AttributeReference.html#pragma-omp-declare-variant>`__
By using ``libomptarget.rtl.rpc.so`` and ``openmp-offloading-server``, it is
Q: How to build an OpenMP offload capable compiler with an outdated host compiler?
Enabling the OpenMP runtime will perform a two-stage build for you.
If your host compiler is different from your system-wide compiler, you may need
to pass ``--gcc-install-dir=/usr/lib/gcc/x86_64-linux-gnu/12`` so that clang will be
able to find the correct GCC toolchain in the second stage of the build.
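The same flag also works when invoking the freshly built clang directly. A minimal sketch, assuming a hypothetical GCC 12 installation path:

.. code-block:: shell

   clang++ --gcc-install-dir=/usr/lib/gcc/x86_64-linux-gnu/12 -fopenmp example.cpp -o example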
For example, if your system-wide GCC installation is too old to build LLVM and
you would like to use a newer GCC, set ``--gcc-install-dir=``
.. code-block:: cmake
multiple sub-architectures for the same target. Additionally, static libraries
The architecture can be specified manually using ``--offload-arch=``. If
``--offload-arch=`` is present but no ``-fopenmp-targets=`` flag is given, then the
targets will be inferred from the architectures. Conversely, if
``-fopenmp-targets=`` is present with no ``--offload-arch``, then the target
given that the necessary build tools are installed for both.
.. code-block:: shell

   clang example.c -fopenmp --offload-arch=gfx90a --offload-arch=sm_80
.. code-block:: shell

   clang example.c -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
     -Xopenmp-target=amdgcn-amd-amdhsa --offload-arch=gfx90a \
     -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_80
.. code-block:: shell

   clang example.c -fopenmp --offload-arch=gfx90a,sm_70,sm_80 -c
   llvm-ar rcs libexample.a example.o
   clang app.c -fopenmp --offload-arch=gfx90a -o app
The supported device images can be viewed using the ``--offloading`` option with
``llvm-objdump``.
.. code-block:: shell

   clang example.c -fopenmp --offload-arch=gfx90a --offload-arch=sm_80 -o example
   llvm-objdump --offloading example

   example: file format elf64-x86-64

   triple       amdgcn-amd-amdhsa

   triple       nvptx64-nvidia-cuda
files. This will allow OpenMP to call a CUDA device function or vice versa.
for CUDA / HIP with ``--offload-new-driver`` and to link using
``--offload-link``. Additionally, ``-fgpu-rdc`` must be used to create a
.. code-block:: shell

   clang++ openmp.cpp -fopenmp --offload-arch=sm_80 -c
   clang++ cuda.cu --offload-new-driver --offload-arch=sm_80 -fgpu-rdc -c
   clang++ openmp.o cuda.o --offload-link -o app
Clang compiler and runtime libraries from the same build. Nevertheless, in order
to better support third-party libraries and toolchains that depend on existing
<https://libc.llvm.org/gpu/using.html#building-the-gpu-library>`_. Once built,
.. code-block:: shell

   clang++ openmp.cpp -fopenmp --offload-arch=gfx90a -lcgpu
Q: Can I build the offloading runtimes without CUDA or HSA?
Q: Why is my build taking a long time?
When installing OpenMP and other LLVM components, the build time on multicore
systems can be significantly reduced with parallel build jobs. As suggested in
*LLVM Techniques, Tips, and Best Practices*, one could consider using Ninja as the
generator. This can be done by passing ``-G Ninja`` to CMake. Afterward,
use ``ninja install`` and specify the number of parallel jobs with ``-j``. The build
time can also be reduced by setting the build type to ``Release`` with the