Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3

# d1b29402 | 05-Aug-2024 | Joseph Huber <huberjn@outlook.com>
[libc] Add loader option to force serial execution of GPU region (#101601)
Summary:
The loader is used as a test utility to run traditionally CPU based unit
tests on the GPU. This has issues when used with something like
`llvm-lit` because the GPU runtimes have a nasty habit of either running
out of resources or hanging when they are overloaded. To combat this, I
added this option to force each process to perform the GPU part
serially.
This is done right now with a simple file lock on the executing file. I
was originally thinking about using more complex IPC to allow N
processes to share execution, but that seemed overly complicated given
the incredibly large number of failure modes it introduces. File locks
are nice here because if the process crashes or is killed it will
release the lock automatically (at least on Linux). This is in contrast
to something like POSIX shared memory, which sticks around until it is explicitly unlinked; if someone sent `SIGKILL` to the program, the shared memory would never get cleaned up and other processes could end up waiting on a mutex that is never released.
Restricting this to one process at a time isn't ideal, given that the runtime
can likely handle at least a *few* separate processes, but this was easy and
it works, so it's a reasonable place to start. This will hopefully unblock me
on running the `libcxx` tests, as those previously ran with so much
parallelism that spurious failures were very common.
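As a rough illustration of the file-lock approach described above (a sketch, not the loader's actual code; the helper name is hypothetical), the serialization can be done with an exclusive `flock` on the executable being run:

```cpp
// Hypothetical sketch: take an exclusive advisory lock on the executable so
// only one process runs its GPU region at a time. The kernel drops the lock
// automatically if the process crashes or is killed, unlike POSIX shared
// memory, which lingers until it is explicitly unlinked.
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <cstdio>

int run_gpu_region_serialized(const char *image_path, int (*launch)()) {
  int fd = open(image_path, O_RDONLY);
  if (fd < 0) {
    std::perror("open");
    return -1;
  }
  // Blocks until any other process holding the lock finishes (or dies).
  if (flock(fd, LOCK_EX) != 0) {
    std::perror("flock");
    close(fd);
    return -1;
  }
  int result = launch(); // Run the GPU portion while holding the lock.
  flock(fd, LOCK_UN);
  close(fd);
  return result;
}
```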
Revision tags: llvmorg-19.1.0-rc2

# 5e326983 | 01-Aug-2024 | Joseph Huber <huberjn@outlook.com>
[libc] Use LLVM CommandLine for loader tool (#101501)
Summary:
This patch removes the ad-hoc parsing that I used previously and replaces it with the LLVM CommandLine interface. This doesn't change any functionality, but it makes the tool easier to maintain.
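For reference, a minimal sketch of what the LLVM CommandLine interface looks like for a loader-style tool; the option names below are illustrative rather than the tool's exact flags:

```cpp
// Minimal llvm::cl sketch; option names are illustrative, not the loader's
// exact interface.
#include "llvm/Support/CommandLine.h"

#include <string>

static llvm::cl::opt<std::string>
    Image(llvm::cl::Positional, llvm::cl::desc("<gpu executable>"),
          llvm::cl::Required);

static llvm::cl::opt<unsigned>
    Threads("threads", llvm::cl::desc("Number of threads in the x dimension"),
            llvm::cl::init(1));

int main(int argc, char **argv) {
  llvm::cl::ParseCommandLineOptions(argc, argv, "GPU loader\n");
  // ... load `Image` and launch its `_start` kernel with `Threads` threads ...
  return 0;
}
```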
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init

# 1ecffdaf | 17-Jul-2024 | jameshu15869 <55058507+jameshu15869@users.noreply.github.com>
[libc] Add Kernel Resource Usage to nvptx-loader (#97503)
This PR allows `nvptx-loader` to read the resource usage of `_start`,
`_begin`, and `_end` when executing CUDA binaries.
Example output:
```
$ nvptx-loader --print-resource-usage libc/benchmarks/gpu/src/ctype/libc.benchmarks.gpu.src.ctype.isalnum_benchmark.__build__
[ RUN ] LlvmLibcIsAlNumGpuBenchmark.IsAlnumWrapper
[ OK ] LlvmLibcIsAlNumGpuBenchmark.IsAlnumWrapper: 93 cycles, 76 min, 470 max, 23 iterations, 78000 ns, 80 stddev
_begin registers: 25
_start registers: 80
_end registers: 62
```
---------
Co-authored-by: Joseph Huber <huberjn@outlook.com>
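A hedged sketch of how register usage could be queried through the CUDA driver API (the actual `nvptx-loader` implementation may differ):

```cpp
// Sketch only: query the register count of a named kernel with the CUDA
// driver API and print it in the style of the output above.
#include <cuda.h>

#include <cstdio>

static void print_kernel_registers(CUmodule module, const char *name) {
  CUfunction func;
  if (cuModuleGetFunction(&func, module, name) != CUDA_SUCCESS)
    return;

  int regs = 0;
  if (cuFuncGetAttribute(&regs, CU_FUNC_ATTRIBUTE_NUM_REGS, func) ==
      CUDA_SUCCESS)
    std::printf("%s registers: %d\n", name, regs);
}
```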
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2

# bc11bb3e | 17-Apr-2023 | Joseph Huber <jhuber6@vols.utk.edu>
[libc] Add the '--threads' and '--blocks' option to the GPU loaders
We will want to test the GPU `libc` with multiple threads in the future. This patch adds the `--threads` and `--blocks` options to set the `x` dimension of the kernel launch, using CUDA terminology rather than OpenCL terminology for familiarity.
Depends on D148288 D148342
Reviewed By: jdoerfert, sivachandra, tra
Differential Revision: https://reviews.llvm.org/D148485
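As a sketch of how these options map onto a launch (using the CUDA driver API purely for illustration; the loaders' real AMDHSA and NVPTX code paths are more involved):

```cpp
// Illustrative only: apply --blocks and --threads as the x dimensions of the
// grid and block when launching the `_start` kernel.
#include <cuda.h>

static CUresult launch_start(CUfunction start, unsigned blocks_x,
                             unsigned threads_x, void **kernel_args) {
  return cuLaunchKernel(start,
                        /*gridDimX=*/blocks_x, /*gridDimY=*/1, /*gridDimZ=*/1,
                        /*blockDimX=*/threads_x, /*blockDimY=*/1,
                        /*blockDimZ=*/1,
                        /*sharedMemBytes=*/0, /*hStream=*/nullptr, kernel_args,
                        /*extra=*/nullptr);
}
```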
Revision tags: llvmorg-16.0.1

# 6bd4d717 | 17-Mar-2023 | Joseph Huber <jhuber6@vols.utk.edu>
[libc] Add environment variables to GPU libc test for AMDGPU
This patch applies the same operation used to copy over the `argv` array to the `envp` array. This allows the GPU tests to use environment variables.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D146322
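Conceptually, the copy looks something like the sketch below; `dev_alloc` is a hypothetical stand-in for allocating host-accessible GPU memory through the HSA runtime, which is what the real loader uses:

```cpp
// Conceptual sketch of flattening a null-terminated string array (argv or
// envp) into device-visible memory. `dev_alloc` is a hypothetical stand-in
// for a host-accessible HSA allocation; std::malloc is used here only so the
// sketch compiles.
#include <cstdlib>
#include <cstring>

static void *dev_alloc(std::size_t size) { return std::malloc(size); }

static char **copy_string_array(char **array, int count) {
  // One extra slot for the terminating null pointer, matching argv/envp.
  char **dev_array =
      static_cast<char **>(dev_alloc((count + 1) * sizeof(char *)));
  for (int i = 0; i < count; ++i) {
    std::size_t len = std::strlen(array[i]) + 1;
    char *dev_str = static_cast<char *>(dev_alloc(len));
    std::memcpy(dev_str, array[i], len);
    dev_array[i] = dev_str;
  }
  dev_array[count] = nullptr;
  return dev_array;
}
```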
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2

# 67d78e3c | 06-Feb-2023 | Joseph Huber <jhuber6@vols.utk.edu>
[libc] Add a loader utility for AMDHSA architectures for testing
This is the first attempt to get some testing support for GPUs in LLVM's libc. We want to be able to compile for and call generic code while on the device. This is difficult as most GPU applications also require the support of large runtimes that may contain their own bugs (e.g. CUDA / HIP / OpenMP / OpenCL / SYCL). The proposed solution is to provide a "loader" utility that allows us to execute a "main" function on the GPU.
This patch implements a simple loader utility targeting the AMDHSA runtime called `amdhsa_loader` that takes a GPU program as its first argument. It will then attempt to load a predetermined `_start` kernel inside that image and launch execution. The `_start` symbol is provided by a `start` utility function that will be linked alongside the application. Thus, this should allow us to run arbitrary code on the user's GPU with the following steps for testing.
```
clang++ Start.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -ffreestanding -nogpulib -nostdinc -nostdlib -c
clang++ Main.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -nogpulib -nostdinc -nostdlib -c
clang++ Start.o Main.o --target=amdgcn-amd-amdhsa -o image
amdhsa_loader image <args, ...>
```
We determine the `-mcpu` value using the `amdgpu-arch` utility provided either by `clang` or `rocm`. If `amdgpu-arch` isn't found or returns an error, we shouldn't run the tests, as the machine does not have a valid HSA-compatible GPU. Alternatively, we could make this utility in-source to avoid the external dependency.
This patch provides a single test for this utility that simply checks whether we can compile an application containing a simple `main` function and execute it.
The proposed solution in the future is to create an alternate implementation of the LibcTest.cpp source that can be compiled and launched using this utility. This approach should allow us to use the same test sources as the other applications.
This is primarily a prototype; suggestions for how to better integrate this with the existing LibC infrastructure would be greatly appreciated. The loader code should also be cleaned up somewhat. An implementation for NVPTX will need to be written as well.
Reviewed By: sivachandra, JonChesterfield
Differential Revision: https://reviews.llvm.org/D139839
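A rough sketch of what the `start` utility linked alongside each test might provide on the device side (illustrative only; the real `_start` also handles argument unpacking and return-value plumbing):

```cpp
// Illustrative sketch of a device-side `_start` entry point that forwards to
// the test's `main` and stores the exit code where the loader can read it.
extern "C" int main(int argc, char **argv, char **envp);

extern "C" [[clang::amdgpu_kernel]] void _start(int argc, char **argv,
                                                char **envp, int *ret) {
  *ret = main(argc, argv, envp);
}
```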