History log of /llvm-project/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp (Results 26 – 50 of 109)
Revision Date Author Comments
# 8c3a8d17 15-May-2023 Manupa Karunaratne <manupa.karunaratne@amd.com>

[MLIR][ROCDL] add gpu to rocdl erf support

This commit adds a lowering to a lib func call to support erf in ROCDL.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D150355


# 5550c821 08-May-2023 Tres Popp <tpopp@google.com>

[mlir] Move casting calls from methods to function calls

The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the methods to the
corresponding function calls, which have been decided on as more consistent.
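
For a hypothetical before/after illustration (a minimal sketch, not code from this patch; the helper below is made up):

```
// Sketch of the migration; assumes the usual MLIR headers and an illustrative
// helper. Only the cast style changes, not the behavior.
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

static bool isWideInteger(Type type) {
  // Before: auto intTy = type.dyn_cast<IntegerType>();
  // After (free-function form recommended going forward):
  auto intTy = dyn_cast<IntegerType>(type);
  return intTy && intTy.getWidth() >= 64;
}
```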

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this change does not currently include work to
support a functional cast/isa call for those.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is clearer as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
4. Some changes have been deleted for the following reasons:
- Some files had a variable also named cast
- Some files had not included a header file that defines the cast
functions
- Some files are definitions of the classes that have the casting
methods, so the code still refers to the method instead of the
function without adding a prefix or removing the method declaration
at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
mlir/lib/**/IR/\
mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
mlir/test/lib/Dialect/Test/TestTypes.cpp\
mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
mlir/test/lib/Dialect/Test/TestAttributes.cpp\
mlir/unittests/TableGen/EnumsGenTest.cpp\
mlir/test/python/lib/PythonTestCAPI.cpp\
mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123


# 0e5aeae6 21-Feb-2023 Markus Böck <markus.boeck02@gmail.com>

[mlir][GPUToLLVM] Add support for emitting opaque pointers

Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179

This patch adds the new pass option `use-opaque-pointers` to the GPU to LLVM lowerings (including ROCDL and NVVM) and adapts the code to support using opaque pointers in addition to typed pointers.
The required changes mostly boil down to avoiding `getElementType` and specifying base types in GEP and Alloca.
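
As a rough sketch of why that is (illustrative only, not the patch's code; it assumes the typed-pointer factory that still coexisted with the opaque one at the time):

```
// With typed pointers the pointee type is part of the pointer type and could
// be recovered via getElementType(); with opaque pointers it is not, so the
// element type must be passed explicitly to ops such as GEP and Alloca.
#include "mlir/Dialect/LLVMIR/LLVMTypes.h"

using namespace mlir;

static LLVM::LLVMPointerType getPointerType(MLIRContext *ctx, Type elemTy,
                                            bool useOpaquePointers) {
  if (useOpaquePointers)
    return LLVM::LLVMPointerType::get(ctx);  // opaque !llvm.ptr, no pointee
  return LLVM::LLVMPointerType::get(elemTy); // typed !llvm.ptr<elemTy>
}
```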

In the future opaque pointers will be the only supported model, hence tests have been ported to use opaque pointers by default. Additional regression tests for typed pointers have been added to avoid breaking existing clients.

Note: This does not yet port the `GpuToVulkan` passes.

Differential Revision: https://reviews.llvm.org/D144448


# 499abb24 19-Jan-2023 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

Add generic type attribute mapping infrastructure, use it in GpuToX

Remapping memory spaces is a function often needed in type
conversions, most often when going to LLVM or to/from SPIR-V (a future
commit), and it is possible that such remappings may become more
common in the future as dialects take advantage of the more generic
memory space infrastructure.

Currently, memory space remappings are handled by running a
special-purpose conversion pass before the main conversion that
changes the address space attributes. In this commit, this approach is
replaced by adding a notion of type attribute conversions to
TypeConverter, which is then used to convert memory space attributes.

Then, we use this infrastructure throughout the *ToLLVM conversions.
This has the advantage of loosening the requirements on the inputs to
those passes from "all address spaces must be integers" to "all
memory spaces must be convertible to integer spaces", a looser
requirement that reduces the coupling between portions of MLIR.

On top of that, this change leads to the removal of most of the calls
to getMemorySpaceAsInt(), bringing us closer to removing it.

(A rework of the SPIR-V conversions to use this new system will be in
a followup commit.)

As a note, one long-term motivation for this change is that I would
eventually like to add an allocaMemorySpace key to MLIR data layouts
and then call getMemRefAddressSpace(allocaMemorySpace) in the
relevant *ToLLVM conversions in order to ensure all alloca()s, whether
incoming or produced during the LLVM lowering, have the correct address space for
a given target.

I expect that the type attribute conversion system may be useful in
other contexts.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D142159


# cb4ccd38 24-Jan-2023 Quentin Colombet <quentin.colombet@gmail.com>

[mlir][Conversion] Rename the MemRefToLLVM pass

Since the recent MemRef refactoring that centralizes the lowering of
complex MemRef operations outside of the conversion framework, the
MemRefToLLVM pass doesn't directly convert these complex operations.

Instead, to fully convert the whole MemRef dialect space, MemRefToLLVM
needs to run after `expand-strided-metadata`.

Make this more obvious by changing the name of the pass and the option
associated with it from `convert-memref-to-llvm` to
`finalize-memref-to-llvm`.
The word "finalize" conveys that this pass needs to run after something
else and that something else is documented in its tablegen description.

This is a follow-up patch related to the conversation at:
https://discourse.llvm.org/t/psa-you-need-to-run-expand-strided-metadata-before-memref-to-llvm-now/66956/14

Differential Revision: https://reviews.llvm.org/D142463


# 4f1e244e 22-Jan-2023 Mehdi Amini <joker.eph@gmail.com>

Add missing dependent dialects to "convert-gpu-to-rocdl"

Fixes #60198


# 60dd937d 13-Jan-2023 Christopher Bate <cbate@nvidia.com>

[mlir][gpu] Fix build failure / silence windows build warnings

Fixes Windows build failure (C4715) caused by
6ca1a09f03e8e940f306bea73efa935e4ee38173.


# 6ca1a09f 24-Dec-2022 Christopher Bate <cbate@nvidia.com>

[mlir][gpu] Migrate hard-coded address space integers to an enum attribute (gpu::AddressSpaceAttr)

This is a purely mechanical change that introduces an enum attribute in the GPU
dialect to represent the various memref memory spaces as opposed to the
hard-coded integer attributes that are currently used.

The following steps were taken to make the transition across the codebase:

1. Introduce a pass "gpu-lower-memory-space-attributes":

The pass updates all memref types that have a memory space attribute that is a
`gpu::AddressSpaceAttr`. These attributes are changed to `IntegerAttr`s using a
mapping that is given by the caller (see the sketch after these steps). This
pass is based on the "map-memref-spirv-storage-class" pass, and the common
functions can probably be refactored into a set of utilities under the MemRef
dialect.

2. Update the verifiers of GPU/NVGPU dialect operations.

If a verifier currently checks the address space of an operand using
e.g. `getWorkspaceAddressSpace`, then it can continue to do so. However, the
checks are changed to only fail if the memory space is either missing or a wrong
value of type `gpu::AddressSpaceAttr`. Otherwise, it just assumes the address
space is correct because it was specifically lowered to something other than a
`gpu::AddressSpaceAttr`.

3. Update existing gpu-to-llvm conversion infrastructure.

In the existing gpu-to-X passes, we add a full conversion equivalent to
`gpu-lower-memory-space-attributes` just before doing the conversion to the
LLVMDialect. This is done because currently both the gpu-to-llvm passes
(rocdl, nvvm) run gpu-to-gpu rewrites within the pass, which introduce
`AddressSpaceAttr` memory space annotations. Therefore, I inserted the
memory space conversion between the gpu-to-gpu rewrites and the LLVM
conversion.
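
A minimal sketch of the kind of caller-provided mapping step 1 refers to, using the AMDGPU integer address spaces targeted by the ROCDL path (illustrative code, not taken from the patch):

```
// Map the GPU dialect's symbolic address spaces to AMDGPU integer address
// spaces (1 = global, 3 = workgroup/LDS, 5 = private).
#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "llvm/Support/ErrorHandling.h"

static unsigned mapGpuToROCDLAddressSpace(mlir::gpu::AddressSpace space) {
  switch (space) {
  case mlir::gpu::AddressSpace::Global:
    return 1;
  case mlir::gpu::AddressSpace::Workgroup:
    return 3;
  case mlir::gpu::AddressSpace::Private:
    return 5;
  }
  llvm_unreachable("unknown gpu::AddressSpace");
}
```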

For more context see the below discourse discussion:
https://discourse.llvm.org/t/gpu-workgroup-shared-memory-address-space-is-hard-coded/

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D140644


# b7fe0d34 11-Jan-2023 Goran Flegar <gflegar@google.com>

Lower math.tan to __nv_tan[f] / __ocml_tan_f{32|64}

At present math.tan fails to lower for NVVM and ROCDL.

Differential Revision: https://reviews.llvm.org/D141505


# 059cf735 09-Jan-2023 Johannes Reifferscheid <jreiffers@google.com>

Lower math.cbrt to NVVM/ROCDL.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D141270


# f6076bd8 05-Dec-2022 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[mlir][ROCDL] Translate known block size attributes to ROCDL

1. When converting from the GPU dialect to the ROCDL dialect, if the
function that contains a gpu.thread_id or gpu.block_id op is annotated
with gpu.known_{block,grid}_size, use that size to set a "range"
attribute on the corresponding rocdl intrinsic so that the LLVM
frontend can optimize based on that range information.
1b. When translating from the rocdl dialect to LLVM IR, use the
"range" attribute, if present, to set !range metadata on the relevant
function call.
2. Deprecate the old rocdl.max_flat_work_group_size attribute, which
was used in a tensorflow backend. Instead, use
rocdl.flat_work_group_size going forward to allow kernel generators to
specify the minimum and maximum work group sizes a kernel may be
launched with in one attribute, thus more closely matching the backend.
3. When translating from gpu.func to llvm.func within gpu-to-rocdl,
copy the known_block_size attribute as rocdl.reqd_work_group_size to
enable further translations to set the corresponding metadata on the
LLVM IR function. Also, set the rocdl.flat_work_group_size attribute
to ensure that the reqd_work_group_size metadata and the
amdgpu-flat-work-group-size metadata are consistent.
3b. Extend the ROCDL to LLVM IR translation to set the
!reqd_work_group_size metadata on LLVM functions.

Also update tests and add functions to the ROCDL dialect to ensure
attribute names are used consistently.

Depends on D139865

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D139866


# b251b608 27-Oct-2022 Christian Sigg <csigg@google.com>

[mlir][gpu] Unroll ops on vectors which map to intrinsic calls

Unroll ops that map to intrinsics when lowering to LLVM, because intrinsics don't support vector operands/results.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D136345


# abc362a1 29-Sep-2022 Jakub Kuderski <kubak@google.com>

[mlir][arith] Change dialect name from Arithmetic to Arith

Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22.

Tested with:
`ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples`

and `bazel build --config=generic_clang @llvm-project//mlir:all`.

Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini

Differential Revision: https://reviews.llvm.org/D134762


# 67d0d7ac 31-Aug-2022 Michele Scuttari <michele.scuttari@outlook.com>

[MLIR] Update pass declarations to new autogenerated files

The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838


# 039b969b 30-Aug-2022 Michele Scuttari <michele.scuttari@outlook.com>

Revert "[MLIR] Update pass declarations to new autogenerated files"

This reverts commit 2be8af8f0e0780901213b6fd3013a5268ddc3359.


# 2be8af8f 30-Aug-2022 Michele Scuttari <michele.scuttari@outlook.com>

[MLIR] Update pass declarations to new autogenerated files

The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838


# 00f7096d 06-Aug-2022 Jeff Niu <jeff@modular.com>

[mlir][math] Rename math.abs -> math.absf

To make room for introducing `math.absi`.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D131325


# c2fc8d9b 11-Jul-2022 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[mlir][GPU] Allow bare pointer memrefs when calling GPU kernels

In the ROCm runtime (and probably CUDA as well), all kernel arguments
are aligned. Therefore, enable using bare pointers for memref
arguments to kernels when these memrefs have static shape and a
trivial layout.

This is a substantial optimization to launching kernels that use
memrefs with known, static sizes, since it causes the kernel launch
packet to no longer include information already known to the kernel,
which can enable packing the kernel launch arguments into launch
packets instead of having to allocate an entire separate structure to
hold unneeded memref information.
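
As a rough illustration of the saving (a hypothetical C++ view of the lowered argument layouts, not code from this patch):

```
// For a rank-2 f32 memref: the default convention passes the whole descriptor
// (3 + 2 * rank fields), the bare-pointer convention passes a single pointer.
#include <cstdint>

struct MemRefDescriptor2D {
  float *allocated;   // allocated pointer
  float *aligned;     // aligned pointer
  int64_t offset;     // offset into the allocation
  int64_t sizes[2];   // static sizes, already known to the kernel
  int64_t strides[2]; // static strides, already known to the kernel
};

using BareMemRef2D = float *; // statically shaped, trivial layout: aligned ptr
```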

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D130716


# d6ef3d20 07-Jul-2022 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[mlir] Remove VectorToROCDL

Between issues such as
https://github.com/llvm/llvm-project/issues/56323, the fact that this
lowering (unlike the code in amdgpu-to-rocdl) does not correctly set
up bounds checks (and thus will cause page faults on reads that might
need to be padded instead), and that fixing these problems would,
essentially, involve replicating amdgpu-to-rocdl, remove
--vector-to-rocdl for being broken. In addition, the lowering does not
support many aspects of transfer_{read,write}, like supervectors, and
may not work correctly in their presence.

We (the MLIR-based convolution generator at AMD) do not use this
conversion pass, nor are we aware of any other clients.

Migration strategies:
- Use VectorToLLVM
- If buffer ops are particularly needed in your application, use
amdgpu.raw_buffer_{load,store}

A VectorToAMDGPU pass may be introduced in the future.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D129308


# cab44c51 06-Jul-2022 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[mlir][AMDGPU] Add --chipset option to AMDGPUToROCDL

Because the buffer descriptor structure (the V#) has no backwards-compatibility
guarantees, and since said guarantees have been violated in practice
(see https://github.com/llvm/llvm-project/issues/56323 ), and since
the `targetIsRDNA` attribute isn't something that higher-level clients can set
in general, make the lowering of the amdgpu dialect to rocdl take a --chipset
option.

Note that this option is a string because adding a parser for the Chipset
struct to llvm::cl wasn't working out.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D129228


# 610139d2 16-Jun-2022 Alex Zinenko <zinenko@google.com>

[mlir] replace 'emit_c_wrappers' func->llvm conversion option with a pass

The 'emit_c_wrappers' option in the FuncToLLVM conversion requests C interface
wrappers to be emitted for every builtin function in the module. While this has
been useful to bootstrap the interface, it is problematic in the longer term as
it may unintentionally affect the functions that should retain their existing
interface, e.g., libm functions obtained by lowering math operations (see
D126964 for an example). Since D77314, we have a finer-grain control over
interface generation via an attribute that avoids the problem entirely. Remove
the 'emit_c_wrappers' option. Introduce the '-llvm-request-c-wrappers' pass
that can be run in any pipeline that needs blanket emission of functions to
annotate all builtin functions with the attribute before performing the usual
lowering that accounts for the attribute.

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D127952


# d7ef488b 09-Jun-2022 Mogball <jeffniu22@gmail.com>

[mlir][gpu] Move GPU headers into IR/ and Transforms/

Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352


# 814b6050 10-May-2022 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[mlir][AMDGPU] Add AMDGPU conversion patterns to ConvertGPUToROCDL

This ensures that attributes such as the index bitwidth propagate
correctly to the AMDGPUToROCDL patterns.

Differential Revision: https://reviews.llvm.org/D125320


# f1f05a91 30-Mar-2022 Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>

[MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics

By analogy with the NVGPU dialect, introduce an AMDGPU dialect for
AMD-specific intrinsic wrappers.

The dialect initially includes wrappers around the raw buffer intrinsics.

On AMD GPUs, a memref can be converted to a "buffer descriptor" that
allows more precise control of memory access, such as allowing
out-of-bounds loads/stores to be replaced by 0/ignored without adding
additional conditional logic, which is important for performance.

The repository currently contains a limited conversion from
transfer_read/transfer_write to Mubuf intrinsics, which are older,
deprecated intrinsics for the same functionality.

The new amdgpu.raw_buffer_* ops allow these operations to be used
explicitly and include metadata such as whether the target
chipset is an RDNA chip or not (which impacts the interpretation of
some bits in the buffer descriptor), while still maintaining an
MLIR-like interface.

(This change also exposes the floating-point atomic add intrinsic.)

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D122765


# 58ceae95 18-Apr-2022 River Riddle <riddleriver@gmail.com>

[mlir:NFC] Remove the forward declaration of FuncOp in the mlir namespace

FuncOp has been moved to the `func` namespace for a little over a month; the
using directive can be dropped now.

