.. _libc_gpu_testing:

=========================
Testing the GPU C library
=========================

.. note::
   Running GPU tests with high parallelism is likely to cause spurious failures,
   out-of-resource errors, or indefinite hangs. Limiting the number of threads
   used while testing with ``LIBC_GPU_TEST_JOBS=<N>`` is highly recommended.

.. contents:: Table of Contents
  :depth: 4
  :local:

Testing infrastructure
======================

The LLVM C library supports different kinds of :ref:`tests <build_and_test>`
depending on the build configuration. The GPU target is considered a full build
and therefore provides all of its own utilities to build and run the generated
tests. Currently the GPU supports two kinds of tests.

#. **Hermetic tests** - These are unit tests built with a test suite similar to
   Google's ``gtest`` infrastructure. They use the same infrastructure as unit
   tests except that the entire environment is self-hosted, which allows us to
   run them on the GPU using our custom utilities. These are used to test the
   majority of functional implementations.

#. **Integration tests** - These are lightweight tests that simply call a
   ``main`` function and check if it returns non-zero.
   These are primarily used
   to test interfaces that are sensitive to threading.

The GPU uses the same testing infrastructure as the other supported ``libc``
targets. We do this by treating the GPU as a standard hosted environment capable
of launching a ``main`` function. Effectively, this means building our own
startup libraries and loader.

Testing utilities
=================

We provide two utilities to execute arbitrary programs on the GPU: the
``loader`` and the startup object.

Startup object
--------------

This object mimics the standard startup object used by existing C library
implementations. Its job is to perform the necessary setup prior to calling the
``main`` function. In the GPU case, this means exporting GPU kernels that will
perform the necessary operations. Here we use ``_begin`` and ``_end`` to handle
calling the global constructors and destructors, while ``_start`` begins the
standard execution. The following code block shows the implementation for AMDGPU
architectures.

.. code-block:: c++

  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _begin(int argc, char **argv, char **env) {
    LIBC_NAMESPACE::atexit(&LIBC_NAMESPACE::call_fini_array_callbacks);
    LIBC_NAMESPACE::call_init_array_callbacks(argc, argv, env);
  }

  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _start(int argc, char **argv, char **envp, int *ret) {
    __atomic_fetch_or(ret, main(argc, argv, envp), __ATOMIC_RELAXED);
  }

  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _end(int retval) {
    LIBC_NAMESPACE::exit(retval);
  }

Loader runtime
--------------

The startup object provides a GPU executable with callable kernels for the
respective runtime. We can then define a minimal runtime that will launch these
kernels on the given device. Currently we provide the ``amdhsa-loader`` and
``nvptx-loader``, targeting the AMD HSA runtime and the CUDA driver runtime
respectively. By default these will launch with a single thread on the GPU.

.. code-block:: sh

  $> clang++ crt1.o test.cpp --target=amdgcn-amd-amdhsa -mcpu=native -flto
  $> amdhsa_loader --threads 1 --blocks 1 ./a.out
  Test Passed!
The loader utility will forward any arguments passed after the executable image
to the program on the GPU, as well as any environment variables that are set.
The number of threads and blocks can be controlled with ``--threads`` and
``--blocks``. These also accept additional ``x``, ``y``, and ``z`` variants for
multidimensional grids.

Running tests
=============

Tests will only be built and run if a GPU target architecture is set and the
corresponding loader utility was built. These can be overridden with the
``LIBC_GPU_TEST_ARCHITECTURE`` and ``LIBC_GPU_LOADER_EXECUTABLE`` :ref:`CMake
options <gpu_cmake_options>`. Once built, they can be run like any other tests.
The CMake target depends on how the library was built.

#. **Cross build** - If the C library was built using ``LLVM_ENABLE_PROJECTS``
   or a runtimes cross build, then the standard targets will be present in the
   base CMake build directory.

   #. All tests - You can run all supported tests with the command:

      .. code-block:: sh

        $> ninja check-libc

   #. Hermetic tests - You can run hermetic tests with the command:

      .. code-block:: sh

        $> ninja libc-hermetic-tests

   #. Integration tests - You can run integration tests with the command:

      .. code-block:: sh

        $> ninja libc-integration-tests

#. **Runtimes build** - If the library was built using ``LLVM_ENABLE_RUNTIMES``
   then the actual ``libc`` build will be in a separate directory.

   #. All tests - You can run all supported tests with the command:

      .. code-block:: sh

        $> ninja check-libc-amdgcn-amd-amdhsa
        $> ninja check-libc-nvptx64-nvidia-cuda

   #. Specific tests - You can use the same targets as above by entering the
      runtimes build directory.

      .. code-block:: sh

        $> ninja -C runtimes/runtimes-amdgcn-amd-amdhsa-bins check-libc
        $> ninja -C runtimes/runtimes-nvptx64-nvidia-cuda-bins check-libc
        $> cd runtimes/runtimes-amdgcn-amd-amdhsa-bins && ninja check-libc
        $> cd runtimes/runtimes-nvptx64-nvidia-cuda-bins && ninja check-libc

Tests can also be built and run manually using the respective loader utility.