0aa831e0 | 09-Jan-2025 | Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>
[mlir][GPU] Implement ValueBoundsOpInterface for GPU ID operations (#122190)
The GPU ID operations already implement InferIntRangeInterface, which
gives constant lower and upper bounds on those IDs when appropriate
metadata is present on the operations or in the surrounding context.
This commit uses that existing code to implement the
ValueBoundsOpInterface, which is used when analyzing affine operations
(unlike the integer range interface, which is used for arithmetic
optimization).
It also implements the interface for gpu.launch, where we can use it to
express the constraint that block/grid sizes are equal to their value
from outside the launch op and that the corresponding IDs are bounded
above by that size.
As a consequence, the test pass for this inference is updated to work on
a FunctionOpInterface and not a func.func, creating minor churn in other
tests.
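As a minimal sketch (illustrative IR and sizes, not taken from the patch), the constraint looks like this: inside the launch region, the value-bounds machinery can now prove that a thread ID is strictly less than the block size bound to the launch operands.

```mlir
%c1  = arith.constant 1 : index
%c64 = arith.constant 64 : index
gpu.launch blocks(%bx, %by, %bz) in (%gx = %c1, %gy = %c1, %gz = %c1)
           threads(%tx, %ty, %tz) in (%bdx = %c64, %bdy = %c1, %bdz = %c1) {
  // Value bounds derivable here: 0 <= %tx < %bdx, and %bdx equals the
  // %c64 passed to the launch from outside.
  gpu.terminator
}
```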
bc29fc93 | 13-Dec-2024 | Petr Kurapov <petr.a.kurapov@intel.com>
[MLIR] Create GPU utils library & move distribution utils (#119264)
Continue the move of `warp_execute_on_lane_0` op to the gpu dialect
(#116994). This patch creates a utils library in GPU and moves generic
helper functions there.
8b47711e | 30-Sep-2024 | Mehdi Amini <joker.eph@gmail.com>
Revert "CMake: Remove unnecessary dependencies on LLVM/MLIR" (#110594)
Reverts llvm/llvm-project#110362
Multiple bots are broken.
4980f217 | 30-Sep-2024 | BARRET <41060790+Adnios@users.noreply.github.com>
CMake: Remove unnecessary dependencies on LLVM/MLIR (#110362)
There are some spurious libraries which can be removed.
I'm trying to bundle MLIR/LLVM library dependencies for our own
libraries. We're using a CMake function to recursively collect
MLIR/LLVM related dependencies. However, we identified certain library
dependencies as redundant and safe for removal.
863a2ed4 | 07-Aug-2024 | Angel Zhang <angel.zhang@amd.com>
[mlir][memref] Rename `MemRef` directories and files. NFC. (#102337)
This PR renames the `MemRef` integration test directory and the
`DecomposeMemrefs.cpp` so that they can be found when doing a
case-sensitive search on file paths.
9ddf3b83 | 20-Jun-2024 | Fabian Mora <fmora.dev@gmail.com>
[mlir][gpu] Remove old GPU serialization passes (#94998)
This patch removes the last vestiges of the old gpu serialization
pipeline. To compile GPU code use target attributes instead.
See [Compilation overview | 'gpu' Dialect - MLIR
docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for
additional information on the target attributes compilation pipeline
that replaced the old serialization pipeline.
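As a rough sketch of the replacement flow (the chip and kernel below are illustrative, not from the patch): a target attribute is attached to the `gpu.module`, and the `gpu-module-to-binary` pass then serializes it into a `gpu.binary`.

```mlir
// Compiled by running, e.g., `mlir-opt --gpu-module-to-binary`
// instead of the removed serialize-to-* passes.
gpu.module @kernels [#nvvm.target<chip = "sm_80">] {
  gpu.func @noop() kernel {
    gpu.return
  }
}
```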
6244d87f | 19-Jun-2024 | Michael Kruse <llvm-project@meinersbur.de>
Avoid object libraries in the VS IDE (#93519)
As discussed in #89743, when using the Visual Studio solution
generators, object library projects are displayed as a collection of
non-editable *.obj files. To look for the corresponding source files,
one has to browse (or search) to the library's obj.libname project. This
patch tries to avoid this as much as possible.
For Clang, there is already an exception for XCode. We handle MSVC_IDE
the same way.
For MLIR, this is more complicated. There are explicit references to the
obj.libname target that only work when there is an object library. This
patch cleans up the reasons for why an object library is needed:
1. The obj.libname is modified in the calling CMakeLists.txt. Note that
with use-only references, `add_library(<name> ALIAS <target>)` could
have been used.
2. A `libMLIR.so` (mlir-shlib) is also created. This works by linking
the object libraries' object files into the libMLIR.so (in
addition to the library's own .so/.a). XCode is handled using the
`-force_load` linker option instead. Windows is not supported. This
mechanism is different from LLVM's llvm-shlib that is created by linking
static libraries with `-Wl,--whole-archive` (and `-Wl,-all_load` on
MacOS).
3. The library might be added to an aggregate library. In-tree, this
seems to be only `libMLIR-C.so` and the standalone example. In XCode, it
uses the object library and `-force_load` mechanism as above. Again,
this is different from `libLLVM-C.so`.
4. An object library is built whenever it was before this patch, except
when generating a Visual Studio solution. This condition could be
removed, but I am trying to avoid build breakages of whatever
configurations others use.
This seems to never have worked with XCode because of the explicit
references to obj.libname (reason 1.). I don't have access to XCode, but
I tried to preserve the current behavior. IMHO there should be a common
mechanism to build aggregate libraries for all LLVM projects instead of
the 4 that we have now.
As far as I can see, this means for LLVM there are the following changes
on whether object libraries are created:
1. An object library is created even in XCode if FORCE_OBJECT_LIBRARY is
set. I do not know how XCode handles it, but I also know CMake will
abort otherwise.
2. An object library is created even for explicitly SHARED libraries for
building `libMLIR.so`. Again, mlir-shlib does not work otherwise.
`libMLIR.so` itself is created using SHARED so this patch is marking it
as EXCLUDE_FROM_LIBMLIR.
3. For the second condition, it is now sensitive to whether the
mlir-shlib is built at all (LLVM_BUILD_LLVM_DYLIB). However, an object
library is still built using the fourth condition unless using the MSVC
solution generator. That is, except with MSVC_IDE, when an object
library was built before, it will also be an object library now.
3a2f7d8a | 17-Jun-2024 | Fabian Mora <fmora.dev@gmail.com>
Revert "Reland [mlir][Target] Improve ROCDL gpu serialization API" (#95847)
Reverts llvm/llvm-project#95813
dcb6c0d7 | 17-Jun-2024 | Fabian Mora <fmora.dev@gmail.com>
Reland [mlir][Target] Improve ROCDL gpu serialization API (#95813)
Reland: https://github.com/llvm/llvm-project/pull/95456
This patch improves the ROCDL gpu serialization API by:
- Introducing the enum `AMDGCNLibraries` for specifying the AMD GCN
device code libraries to use during linking.
- Removing `getCommonBitcodeLibs` in favor of `AMDGCNLibraries`.
Previously `getCommonBitcodeLibs` would try to load all AMD GCN bitcode
libraries; now it will only load the requested libraries.
- Exposing the `compileToBinary` method and making it virtual, allowing
downstream users to re-use this method.
- Exposing `moduleToObjectImpl`, this method provides a prototype flow
for compiling to binary, allowing downstream users to re-use this
method.
- It also avoids constructing the control variables if no device
libraries are being used.
- Changes the style of the error messages to be composable, i.e., no full
stops.
- Adds an error message for when the ROCm toolkit can't be found but it
was required.
57b8be46 | 17-Jun-2024 | Fabian Mora <fmora.dev@gmail.com>
Revert [mlir][Target] Improve ROCDL gpu serialization API (#95790)
Reverts llvm/llvm-project#95456
954cb5f9 | 17-Jun-2024 | Fabian Mora <fmora.dev@gmail.com>
[mlir][Target] Improve ROCDL gpu serialization API (#95456)
This patch improves the ROCDL gpu serialization API by:
- Introducing the enum `AMDGCNLibraries` for specifying the AMD GCN
device code libraries to use during linking.
- Removing `getCommonBitcodeLibs` in favor of `AMDGCNLibraries`.
Previously `getCommonBitcodeLibs` would try to load all AMD GCN bitcode
libraries; now it will only load the requested libraries.
- Exposing the `compileToBinary` method and making it virtual, allowing
downstream users to re-use this method.
- Exposing `moduleToObjectImpl`, this method provides a prototype flow
for compiling to binary, allowing downstream users to re-use this
method.
- It also avoids constructing the control variables if no device
libraries are being used.
This patch also changes the behavior of the CMake flag
`DEFAULT_ROCM_PATH`. Before it would fall back to a default value of
`/opt/rocm` if not specified. However, that default value causes fragile
builds in environments with ROCm. Now, the flag falls back to the empty
string, making it clear that **the user must provide a value at LLVM
build time**.
6e27dd47 | 06-Mar-2024 | Ingo Müller <ingomueller@google.com>
[mlir][gpu] Replace MLIR_GPU_TO_HSACO_PASS_ENABLE by more generic one. (#84001)
This is another follow-up of #83004. The PR replaces the macro
`MLIR_GPU_TO_HSACO_PASS_ENABLE` with the more generic macro
`MLIR_ENABLE_ROCM_CONVERSIONS`. Until now, the former has been defined
if and only if the latter evaluated to true in CMake. However, the
former was not defined when the latter evaluated to false, in which case
a warning was raised if compiled with `-Wundef`. Using a single macro
relies on the `#cmakedefine01` mechanism that ensures the macro is
always set to either 0 or 1.
f204aee1 | 22-Feb-2024 | Fabian Mora <fmora.dev@gmail.com>
[mlir][GPU] Remove the SerializeToCubin pass (#82486)
The `SerializeToCubin` pass was deprecated in September 2023 in favor of
GPU compilation attributes; see the [GPU
compilation](https://mlir.llvm.org/docs/Dialects/GPU/#gpu-compilation)
section in the `gpu` dialect MLIR docs.
This patch removes `SerializeToCubin` from the repo.
a1eaed7a | 15-Jan-2024 | Fabian Mora <fmora.dev@gmail.com>
[mlir][gpu] Fix GPU YieldOP format and traits (#78006)
This patch adds an assembly format to `gpu::YieldOp`. It also adds the
return-like trait, to make it compatible with `RegionBranchOpInterface`.
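For illustration (a hypothetical snippet, not from the patch), `gpu.yield` terminates body regions such as the reduction region of `gpu.all_reduce`, and with the new format it prints in custom rather than generic form:

```mlir
// gpu.yield returns the combined value from the reduction region.
%sum = gpu.all_reduce %x {
^bb0(%lhs : f32, %rhs : f32):
  %0 = arith.addf %lhs, %rhs : f32
  gpu.yield %0 : f32
} : (f32) -> (f32)
```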
c0345b46 | 02-Jan-2024 | Jakub Kuderski <jakub@nod-labs.com>
[mlir][gpu] Add subgroup_reduce to shuffle lowering (#76530)
This supports both the scalar and the vector multi-reduction cases.
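A rough sketch of the expansion (offsets, subgroup width, and names are illustrative): each step exchanges values with a lane a power of two apart using `gpu.shuffle xor` and accumulates.

```mlir
// Before: a single subgroup-wide reduction.
%r = gpu.subgroup_reduce add %x : (f32) -> (f32)

// After, one of the log2(width) expansion steps for a 32-lane subgroup:
%c16 = arith.constant 16 : i32
%c32 = arith.constant 32 : i32
%y, %valid = gpu.shuffle xor %x, %c16, %c32 : f32
%acc = arith.addf %x, %y : f32
// Repeating with offsets 8, 4, 2, 1 completes the reduction.
```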
2af186f9 | 28-Dec-2023 | Jakub Kuderski <jakub@nod-labs.com>
[mlir][gpu] Add patterns to break down subgroup reduce (#76271)
The new patterns break down subgroup reduce ops with vector values into
a sequence of subgroup reductions that fit the native shuffle size. The
maximum/native shuffle size is parametrized.
The overall goal is to be able to perform multi-element reductions with
a sequence of `gpu.shuffle` ops.
5caae72d | 19-Dec-2023 | Guray Ozen <guray.ozen@gmail.com>
[mlir][gpu] Productize `test-lower-to-nvvm` as `gpu-lower-to-nvvm` (#75775)
The `test-lower-to-nvvm` pipeline serves as the common and proper
pipeline for nvvm+host compilation, and it's used across our CUDA
integration tests.
This PR renames the `test-lower-to-nvvm` pipeline to `gpu-lower-to-nvvm`
and moves it into `InitAllPasses.h`. The aim is to be able to call it from
Python and to have a standardized compilation process for NVVM.
6402706a | 01-Dec-2023 | Mehdi Amini <joker.eph@gmail.com>
[mlir] Fix the link of libcuda.so in MLIRGPUTransforms to not use fully qualified path (#74018)
At the moment we find libcuda.so in a path like:
/usr/local/cuda/targets/x86_64-linux/lib/stubs/libcuda.so
and directly add this to `target_link_libraries`. The problem is that
our installed MLIR package will include the full path to the library,
and a user downstream when including our cmake installed package will
inherit this full path.
We're changing this to instead use:
-L /usr/local/cuda/targets/x86_64-linux/lib/stubs/ -lcuda
45f66925 | 11-Nov-2023 | spaceotter <zaimzet@gmail.com>
[mlir][gpu] Fix build error after barrier elimination code moved (#72019)
Should fix
https://lab.llvm.org/buildbot/#/builders/61/builds/51692/steps/5/logs/stdio
00c3c731 | 11-Nov-2023 | spaceotter <zaimzet@gmail.com>
[mlir][gpu] Separate the barrier elimination code from transform ops (#71762)
Allows the barrier elimination code to be run from C++ as well. The code
from the transform dialect is copied as-is; the pass and populate functions
have been added at the end.
Co-authored-by: Eric Eaton <eric@nod-labs.com>
2dace045 | 05-Nov-2023 | Sang Ik Lee <sang.ik.lee@intel.com>
[mlir][spirv] Implement gpu::TargetAttrInterface (#69949)
This commit implements gpu::TargetAttrInterface for SPIR-V target
attribute. The plan is to use this to enable GPU compilation pipeline
for OpenCL kernels later.
The changes do not impact Vulkan shaders using mlir-vulkan-runner.
A new GPU dialect transform pass, `spirv-attach-target`, is implemented for
attaching the attribute from the CLI.
The `gpu-module-to-binary` pass now works with a GPU module that has a
SPIR-V module with OpenCL kernel functions inside.
d9dadfda | 03-Nov-2023 | Mehdi Amini <joker.eph@gmail.com>
Refactor ModuleToObject to offer more flexibility to subclass (NFC)
Some specific implementations of the offload may want more customization, and may even avoid using in-tree LLVM by dispatching the ISA translation to a custom solution. This refactoring makes it possible for such implementations to work without even configuring the target backend in LLVM.
Reviewers: fabianmcg
Reviewed By: fabianmcg
Pull Request: https://github.com/llvm/llvm-project/pull/71165
522c1d0e | 20-Sep-2023 | Martin Erhart <merhart@google.com>
[mlir][gpu][bufferization] Implement BufferDeallocationOpInterface for gpu.terminator (#66880)
This is necessary to support deallocation of IR with gpu.launch
operations because it does not implement the RegionBranchOpInterface.
Implementing the interface would require it to support regions with
unstructured control flow and produced arguments/results.
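A small sketch of the situation the interface handles (illustrative IR; the actual pass materializes its own deallocation ops rather than the plain `memref.dealloc` shown here): a buffer allocated inside the launch region has to be freed before the region's terminator, `gpu.terminator`.

```mlir
%c1 = arith.constant 1 : index
gpu.launch blocks(%bx, %by, %bz) in (%gx = %c1, %gy = %c1, %gz = %c1)
           threads(%tx, %ty, %tz) in (%bdx = %c1, %bdy = %c1, %bdz = %c1) {
  %buf = memref.alloc() : memref<16xf32>
  // ... kernel body using %buf ...
  memref.dealloc %buf : memref<16xf32>  // must come before the terminator
  gpu.terminator
}
```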
34a35a8b | 31-Aug-2023 | Martin Erhart <merhart@google.com>
[mlir] Move FunctionInterfaces to Interfaces directory and inherit from CallableOpInterface
Functions are always callable operations and thus every operation implementing the `FunctionOpInterface` also implements the `CallableOpInterface`. The only exception was the FuncOp in the toy example. To make implementation of the `FunctionOpInterface` easier, this commit lets `FunctionOpInterface` inherit from `CallableOpInterface` and merges some of their methods.

More precisely, the `CallableOpInterface` has methods to get the argument and result attributes and a method to get the result types of the callable region. These methods are always implemented the same way as their analogues in `FunctionOpInterface`, and thus this commit moves all the argument and result attribute handling methods to the callable interface, as well as the methods to get the argument and result types. The `FunctionOpInterface` then does not have to declare them as well, but just inherits them from the `CallableOpInterface`.

Adding the inheritance relation also required moving the `FunctionOpInterface` from the IR directory to the Interfaces directory, since IR should not depend on Interfaces.
Reviewed By: jpienaar, springerm
Differential Revision: https://reviews.llvm.org/D157988
fbbb8ade | 12-Aug-2023 | Fabian Mora <fmora.dev@gmail.com>
[mlir][gpu] Add passes to attach (NVVM|ROCDL) target attributes to GPU Modules
Adds the passes `nvvm-attach-target` & `rocdl-attach-target` for attaching `nvvm.target` & `rocdl.target` attributes to GPU modules.
These passes search GPU modules in the immediate region of the op being acted on, attaching the target attribute to the module. Modules can be selected using a regex string, allowing fine-grained attachment of targets; see the test `attach-target.mlir` for an example.
Depends on D154153
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D157351
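A hedged sketch of the effect (module name and chip are illustrative): after running `rocdl-attach-target` (or `nvvm-attach-target`), matching GPU modules carry a target attribute that later compilation passes consume.

```mlir
// Before: a plain GPU module.
gpu.module @kernels {
}

// After rocdl-attach-target with, e.g., chip=gfx90a:
gpu.module @kernels [#rocdl.target<chip = "gfx90a">] {
}
```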