.. _libc_gpu_testing:


=========================
Testing the GPU C library
=========================

.. note::
   Running GPU tests with high parallelism is likely to cause spurious failures,
   out-of-resource errors, or indefinite hangs. Limiting the number of threads
   used while testing with ``LIBC_GPU_TEST_JOBS=<N>`` is highly recommended.

.. contents:: Table of Contents
  :depth: 4
  :local:

Testing infrastructure
======================

The LLVM C library supports different kinds of :ref:`tests <build_and_test>`
depending on the build configuration. The GPU target is considered a full build
and therefore provides all of its own utilities to build and run the generated
tests. Currently, the GPU supports two kinds of tests.

#. **Hermetic tests** - These are unit tests built with a test suite similar to
   Google's ``gtest`` infrastructure. They use the same infrastructure as unit
   tests except that the entire environment is self-hosted, which allows us to
   run them on the GPU using our custom utilities. These are used to test the
   majority of functional implementations.

#. **Integration tests** - These are lightweight tests that simply call a
   ``main`` function and check whether it returns non-zero. These are primarily
   used to test interfaces that are sensitive to threading.
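
An integration test is, at its core, just a program whose ``main`` returns zero
on success. As a rough, host-compilable sketch of that shape (the checks below
are placeholders; real integration tests call ``LIBC_NAMESPACE`` functions and
are launched on the GPU through the loader):

.. code-block:: c++

  #include <cstring>

  // Sketch of an integration-test body. The checks here are placeholders;
  // a real test exercises LIBC_NAMESPACE functions on the GPU.
  static int run_checks() {
    char buffer[16];
    std::strcpy(buffer, "hello");
    if (std::strcmp(buffer, "hello") != 0)
      return 1; // Any non-zero value is reported as a failure.
    return 0;
  }

  // The loader launches `main` via the `_start` kernel and inspects the
  // aggregated return value; zero means the test passed.
  int main(int argc, char **argv) { return run_checks(); }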

The GPU uses the same testing infrastructure as the other supported ``libc``
targets. We do this by treating the GPU as a standard hosted environment capable
of launching a ``main`` function. Effectively, this means building our own
startup libraries and loader.

Testing utilities
=================

We provide two utilities to execute arbitrary programs on the GPU: the
``loader`` and the ``start`` object.

Startup object
--------------

This object mimics the standard object used by existing C library
implementations. Its job is to perform the necessary setup prior to calling the
``main`` function. In the GPU case, this means exporting GPU kernels that will
perform the necessary operations. Here we use ``_begin`` and ``_end`` to handle
calling global constructors and destructors, while ``_start`` begins the
standard execution. The following code block shows the implementation for
AMDGPU architectures.

.. code-block:: c++

  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _begin(int argc, char **argv, char **env) {
    LIBC_NAMESPACE::atexit(&LIBC_NAMESPACE::call_fini_array_callbacks);
    LIBC_NAMESPACE::call_init_array_callbacks(argc, argv, env);
  }

  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _start(int argc, char **argv, char **envp, int *ret) {
    __atomic_fetch_or(ret, main(argc, argv, envp), __ATOMIC_RELAXED);
  }

  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _end(int retval) {
    LIBC_NAMESPACE::exit(retval);
  }
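
The ``__atomic_fetch_or`` in ``_start`` is what lets many GPU threads share a
single exit code: each thread ORs its ``main`` return value into ``*ret``, so
the combined result is non-zero exactly when at least one thread failed. A
host-side sketch of the same pattern using ``std::thread`` (the GPU version
differs only in how the threads are launched):

.. code-block:: c++

  #include <atomic>
  #include <thread>
  #include <vector>

  // Each worker ORs its return value into a shared result, mirroring the
  // __atomic_fetch_or in `_start`. The combined value is non-zero if and
  // only if at least one worker reported failure.
  int combined_result(const std::vector<int> &per_thread_returns) {
    std::atomic<int> ret{0};
    std::vector<std::thread> workers;
    for (int value : per_thread_returns)
      workers.emplace_back(
          [&ret, value] { ret.fetch_or(value, std::memory_order_relaxed); });
    for (auto &w : workers)
      w.join();
    return ret.load();
  }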

Loader runtime
--------------

The startup object provides a GPU executable with callable kernels for the
respective runtime. We can then define a minimal runtime that launches these
kernels on the given device. Currently, we provide the ``amdhsa-loader`` and
``nvptx-loader``, targeting the AMD HSA runtime and the CUDA driver runtime
respectively. By default these will launch with a single thread on the GPU.

.. code-block:: sh

   $> clang++ crt1.o test.cpp --target=amdgcn-amd-amdhsa -mcpu=native -flto
   $> amdhsa-loader --threads 1 --blocks 1 ./a.out
   Test Passed!

The loader utility will forward any arguments passed after the executable image
to the program on the GPU, as well as any environment variables that are set.
The number of threads and blocks can be controlled with ``--threads`` and
``--blocks``. These also accept additional ``x``, ``y``, ``z`` variants for
multidimensional grids.
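
For illustration, a multidimensional launch that also forwards program
arguments and an environment variable might look like the following. The
``--threads-x``/``--blocks-x`` spellings of the variants are an assumption
here; check your loader build's ``--help`` output.

.. code-block:: sh

   # Hypothetical invocation (flag spellings assumed, see --help): launch
   # ./a.out with 64x2 threads per block on a 4x4 grid, forwarding
   # "--input data.txt" to the program's argv and FOO=1 via its environment.
   $> FOO=1 amdhsa-loader --threads-x 64 --threads-y 2 \
        --blocks-x 4 --blocks-y 4 ./a.out --input data.txt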

Running tests
=============

Tests will only be built and run if a GPU target architecture is set and the
corresponding loader utility was built. These can be overridden with the
``LIBC_GPU_TEST_ARCHITECTURE`` and ``LIBC_GPU_LOADER_EXECUTABLE`` :ref:`CMake
options <gpu_cmake_options>`. Once built, they can be run like any other tests.
The CMake target depends on how the library was built.
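
As a sketch of how those options might be set when configuring a runtimes
build (the source path and target architecture below are placeholders, not
recommendations):

.. code-block:: sh

   # Hypothetical configuration: force the test architecture and point at a
   # prebuilt loader. Replace the placeholder values for your setup.
   $> cmake ../runtimes -G Ninja \
        -DLIBC_GPU_TEST_ARCHITECTURE=gfx90a \
        -DLIBC_GPU_LOADER_EXECUTABLE=/path/to/amdhsa-loader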

#. **Cross build** - If the C library was built using ``LLVM_ENABLE_PROJECTS``
   or a runtimes cross build, then the standard targets will be present in the
   base CMake build directory.

   #. All tests - You can run all supported tests with the command:

      .. code-block:: sh

        $> ninja check-libc

   #. Hermetic tests - You can run hermetic tests with the command:

      .. code-block:: sh

        $> ninja libc-hermetic-tests

   #. Integration tests - You can run integration tests with the command:

      .. code-block:: sh

        $> ninja libc-integration-tests

#. **Runtimes build** - If the library was built using ``LLVM_ENABLE_RUNTIMES``
   then the actual ``libc`` build will be in a separate directory.

   #. All tests - You can run all supported tests with the command:

      .. code-block:: sh

        $> ninja check-libc-amdgcn-amd-amdhsa
        $> ninja check-libc-nvptx64-nvidia-cuda

   #. Specific tests - You can use the same targets as above from the runtimes
      build directory.

      .. code-block:: sh

        $> ninja -C runtimes/runtimes-amdgcn-amd-amdhsa-bins check-libc
        $> ninja -C runtimes/runtimes-nvptx64-nvidia-cuda-bins check-libc
        $> cd runtimes/runtimes-amdgcn-amd-amdhsa-bins && ninja check-libc
        $> cd runtimes/runtimes-nvptx64-nvidia-cuda-bins && ninja check-libc

Tests can also be built and run manually using the respective loader utility.