Revision tags: llvmorg-21-init |
|
#
13dcc95d |
| 28-Jan-2025 |
Joseph Huber <huberjn@outlook.com> |
[Offload] Rework offloading entry type to be more generic (#124018)
Summary:
The previous offloading entry type did not fit the current use-cases
very well. This widens it and adds a version to prevent further
annoyances. It also includes the kind to better sort who's using it.
The first 64 bits are reserved as zero so the OpenMP runtime can detect
the old format for binary compatibility.
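The detection idea described above can be sketched as follows. This is a minimal illustration, not the LLVM definition: the struct names `LegacyEntry` and `VersionedEntry`, their field lists, and the helper `isVersionedEntry` are all hypothetical. It assumes the legacy record begins with a non-null address, so a zero leading 64-bit word can only belong to the newer layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical legacy layout: the first field is a non-null host address.
struct LegacyEntry {
  void *Address;
  char *Name;
  std::uint64_t Size;
  std::int32_t Flags;
  std::int32_t Data;
};

// Hypothetical widened layout: a reserved zero word, then version and kind.
struct VersionedEntry {
  std::uint64_t Reserved; // always zero, marks the newer format
  std::uint16_t Version;  // bumped when the layout changes
  std::uint16_t Kind;     // which consumer (e.g. OpenMP, CUDA) owns the entry
  std::uint32_t Flags;
  void *Address;
  char *Name;
  std::uint64_t Size;
  std::uint64_t Data;
};

// Returns true when the record starts with a zero 64-bit word, which in this
// sketch means it uses the versioned layout rather than the legacy one.
inline bool isVersionedEntry(const void *Record) {
  assert(Record && "record must not be null");
  std::uint64_t First = 0;
  std::memcpy(&First, Record, sizeof(First)); // avoids aliasing problems
  return First == 0;
}
```

A runtime built along these lines could branch on `isVersionedEntry` and then read `Version` and `Kind` only for records that pass the check.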
|
#
70a16b90 |
| 22-Jan-2025 |
Joseph Huber <huberjn@outlook.com> |
[HIP] Support managed variables using the new driver (#123437)
Summary: Previously, managed variables didn't work in rdc mode using the new driver because we just didn't register them. This was previously ignored because we didn't have enough space in the current struct format. This patch amends that by just emitting a struct pair for the two variables and using the single pointer.
In the future, a more extensible entry format would be nice, but that can be done later.
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
36ada1b9 |
| 20-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[Frontend] Remove unused includes (NFC) (#116927)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3 |
|
#
42eb54b7 |
| 28-Oct-2024 |
Joseph Huber <huberjn@outlook.com> |
[Clang] Put offloading globals in the `.llvm.rodata.offloading` section (#111890)
Summary: For our offloading entries, we currently store all the string names of kernels that the runtime will need to load from the target executable. These are available via pointer in the `__tgt_offload_entry` struct; however, this makes them difficult to obtain from the object itself. This patch simply puts the strings in a named section so they can be easily queried.
The motivation behind this is that when the linker wrapper is doing linking, it wants to know which kernels the host executable is calling. We *could* get this already via the `.relaomp_offloading_entries` section and trawling through the string table, but that's quite annoying and not portable. The follow-up to this should be to make the linker wrapper get a list of all used symbols the device link job should count as "needed" so we can handle static linking more directly.
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4 |
|
#
cfc76b64 |
| 20-Aug-2024 |
Fabian Mora <fmora.dev@gmail.com> |
[llvm][offload] Move AMDGPU offload utilities to LLVM (#102487)
This patch moves utilities from
`offload/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h` to
`llvm/Frontend/Offloading/Utility.h` to be reused by
other projects.
Concretely the following changes were made:
- Rename `KernelMetaDataTy` to `AMDGPUKernelMetaData`.
- Remove unused fields `KernelObject`, `KernelSegmentSize`,
`ExplicitArgumentCount` and `ImplicitArgumentCount` from
`AMDGPUKernelMetaData`.
- Return the produced error if `ELFObj.sections()` failed instead of
using `cantFail`.
- Added `AGPRCount` field to `AMDGPUKernelMetaData`.
- Added a default invalid value to all the fields in
`AMDGPUKernelMetaData`.
|
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
470aefb2 |
| 09-Apr-2024 |
Joseph Huber <huberjn@outlook.com> |
[Offload][NFC] Remove `omp_` prefix from offloading entries (#88071)
Summary:
These entries are generic for offloading with the new driver now. Having
the `omp` prefix was a historical artifact and is confusing when used
for CUDA. This patch just renames them for now; future patches will
rework the binary format to make it more common.
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4 |
|
#
3ee8c937 |
| 21-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
[Offload] Fix NVPTX global entry names
Summary: This was missed; NVPTX globals cannot use a `.` in their names.
|
Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2 |
|
#
3bf88163 |
| 05-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
[Offload] Fix entry global names on NVPTX target
Summary: The PTX language rejects globals with `.` in the name. We need to change the global name if we are targeting NVPTX to prevent the toolchain from complaining.
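The renaming described above can be sketched as a small helper. This is an illustration only: the function `entryGlobalName`, its boolean parameter, and the choice of `$` as the replacement character are assumptions made for the example, not taken from the patch itself (`$` is used here because PTX identifiers accept it).

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns the name to give an entry global. When the
// target is NVPTX, every `.` is replaced with `$` so that the PTX toolchain
// accepts the identifier; other targets keep the name unchanged.
inline std::string entryGlobalName(std::string Name, bool TargetsNVPTX) {
  if (TargetsNVPTX)
    std::replace(Name.begin(), Name.end(), '.', '$');
  return Name;
}
```

For example, under these assumptions a name such as `.offloading.entry_name` would stay as-is for a host target and become `$offloading$entry_name` when targeting NVPTX.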
|
Revision tags: llvmorg-18.1.0-rc1 |
|
#
a551703c |
| 24-Jan-2024 |
Joseph Huber <huberjn@outlook.com> |
[Offload] Fix the offloading wrapper when merged multiple times. (#79231)
Summary: The offloading wrapper is an object file that contains the code necessary to register offloading entries for the given runtime. Currently, we expect only one of these to be present when we make the final executable. However, in the case of relocatable linking with `-r`, we can end up with several of these being generated before finally creating the executable.
This patch simply changes the definitions of these globals to be mergeable. This allows multiple wrappers to participate in a single link job. For ELF, we just make the dummy variable internal and used so it sets up the section as expected. For COFF, we make the entries weak_odr so they merge to a single symbol.
|
Revision tags: llvmorg-19-init |
|
#
9fa9d9a7 |
| 15-Jan-2024 |
Fabian Mora <fmora.dev@gmail.com> |
[llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (#78057)
This patch moves `clang/tools/clang-linker-wrapper/OffloadWrapper.*` to
`llvm/Frontend/Offloading` allowing them to be re-utilized by other
projects.
Additionally, it makes minor modifications to the API to make it more
flexible.
Concretely:
- The `wrap*` methods now have additional arguments `EntryArray`,
`Suffix` and `EmitSurfacesAndTextures` to specify some additional options.
- The `EntryArray` is now constructed by the caller. This change is needed to
enable JIT compilation, as ORC doesn't fully support `__start_` and `__stop_`
symbols. Thus, to JIT the code, the `EntryArray` has to be constructed explicitly in the IR.
- The `Suffix` field is used when emitting the descriptor, registration
methods, etc, to make them more readable. It is empty by default.
- The `EmitSurfacesAndTextures` field controls whether to emit surface
and texture registration code, as those functions were removed from `CUDART`
in CUDA 12. It is true by default.
- The function `getOffloadingEntryInitializer` was added to help create
the `EntryArray`, as it returns the constant initializer and not a global
variable.
|
#
97f3be2c |
| 07-Dec-2023 |
Joseph Huber <huberjn@outlook.com> |
[CUDA][HIP] Improve variable registration with the new driver (#73177)
Summary: This patch adds support for registering texture / surface variables from CUDA / HIP. Additionally, we now properly track the `extern` and `const` flags that are also used in these runtime functions.
This does not implement the `managed` variables yet as those seem to require some extra handling I'm not familiar with. The issue is that the current offload entry isn't large enough to carry size and alignment information along with an extra global.
|
Revision tags: llvmorg-17.0.6 |
|
#
52204a29 |
| 21-Nov-2023 |
Joseph Huber <huberjn@outlook.com> |
[Offload] Initial support for registering offloading entries on COFF targets (#72697)
Summary: This patch provides the initial support to allow handling the new driver's offloading entries. Normally, the ELF target can emit variables at C-identifier named sections and the linker will provide a pointer to the section. For COFF targets, the linker instead merges sections containing a `$` in alphabetical order. We can thus emit these variables at sections and then emit two variables that are guaranteed to be sorted before and after the others to traverse it. Previous patches consolidated the handling of offloading entries so that this patch can more easily handle mapping them to the appropriate section.
Ideally, the only remaining step to allow the new driver to run on Windows targets is to accurately map the following `ld.lld` arguments to their `llvm-link` equivalents. These are used inside the linker-wrapper, so we should simply need to remap the arguments to the same functionality if possible.
```
-o, -output
-l, --library
-L, --library-path
-v, --version
-rpath
-whole-archive, -no-whole-archive
```
I have not tested this at runtime as I do not have access to a windows machine.
This patch was adapted from some initial efforts in https://reviews.llvm.org/D137470.
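The COFF section-ordering trick described in this commit can be illustrated with a small simulation. It is a sketch under stated assumptions, not linker or LLVM code: the section names (`entries$A`, `entries$M`, `entries$Z`), the symbol names, and the functions `layoutGroupedSections` and `exampleLayout` are all hypothetical. The sort stands in for the linker's alphabetical merge of `$`-suffixed sections.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// One contribution to the image: (section name, symbol placed in it).
using Contribution = std::pair<std::string, std::string>;

// Simulates the merge: contributions are ordered alphabetically by section
// name (input order is kept for equal names) and the symbols are returned in
// their final layout order.
inline std::vector<std::string>
layoutGroupedSections(std::vector<Contribution> Inputs) {
  std::stable_sort(Inputs.begin(), Inputs.end(),
                   [](const Contribution &A, const Contribution &B) {
                     return A.first < B.first;
                   });
  std::vector<std::string> Symbols;
  for (const Contribution &C : Inputs)
    Symbols.push_back(C.second);
  return Symbols;
}

// The start and stop markers use suffixes that sort before and after the
// suffix shared by the real entries, so they end up bracketing them.
inline std::vector<std::string> exampleLayout() {
  return layoutGroupedSections({
      {"entries$M", "entry_foo"},
      {"entries$Z", "entries_stop"},
      {"entries$M", "entry_bar"},
      {"entries$A", "entries_start"},
  });
}
```

In this model a runtime would walk from the address of the start marker to the address of the stop marker to visit every entry, regardless of the order in which the object files were supplied.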
|
#
9c0e6499 |
| 17-Nov-2023 |
Joseph Huber <huberjn@outlook.com> |
[Offloading][NFC] Refactor handling of offloading entries (#72544)
Summary: This patch is a simple refactoring of code out of the linker wrapper into a common location. The main motivation behind this change is to make it easier to change the handling in the future to accept a triple to be used to emit entries that function on that target.
|
Revision tags: llvmorg-17.0.5 |
|
#
7b9d73c2 |
| 07-Nov-2023 |
Paulo Matos <pmatos@igalia.com> |
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
|
Revision tags: llvmorg-17.0.4 |
|
#
078ae8cd |
| 25-Oct-2023 |
Joseph Huber <35342157+jhuber6@users.noreply.github.com> |
[Offloading][NFC] Move creation of offloading entries from OpenMP (#70116)
Summary: This patch is a first step to remove dependencies on the OpenMPIRBuilder for creating generic offloading entries. This patch changes no functionality and merely moves the code around. In the future the interface will be changed to allow for more code re-use in the registration and creation of offloading entries as well as a more generic interface for CUDA, HIP, OpenMP, and SYCL(?). Doing this as a first step to reduce the noise involved in the functional changes.
|