NVPTXUtilities.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/NVPTX/NVPTXUtilities.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 4583f6d3	08-Jan-2025	Alex MacLean <amaclean@nvidia.com>	[NVPTX] Switch front-ends and tests to ptx_kernel cc (#120806) The `ptx_kernel` calling convention is a more idiomatic and standard way of specifying an NVPTX kernel than using the metadata which is not supposed to change the meaning of the program. Further, checking the calling convention is significantly faster than traversing the metadata, improving compile time. This change updates the clang and mlir frontends as well as the NVPTXCtorDtorLowering pass to emit kernels using the calling convention. In addition, this updates all NVPTX unit tests to use the calling convention as well.
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# ed8019d9	18-Nov-2024	Kazu Hirata <kazu@google.com>	[Target] Remove unused includes (NFC) (#116577) Identified with misc-include-cleaner.
Revision tags: llvmorg-19.1.3
# 020fa868	22-Oct-2024	Artem Belevich <tra@google.com>	[NVPTX] mangle symbols in debug info to conform to PTX restrictions. (#113216) Until now debug info was printing the symbols names as-is and that resulted in invalid PTX when the symbols contained characters that are invalid for PTX. E.g. `__PRETTY_FUNCTION.something` Debug info is somewhat disconnected from the symbols themselves, so the regular "NVPTXAssignValidGlobalNames" pass can't easily fix them. As the "plan B" this patch catches printout of debug symbols and fixes them, as needed. One gotcha is that the same code path is used to print the names of debug info sections. Those section names do start with a '.debug'. The dot in those names is nominally illegal in PTX, but the debug section names with a dot are accepted as a special case. The downside of this change is that if someone ever has a `.debug*` symbol that needs to be referred to from the debug info, that label will be passed through as-is, and will still produce broken PTX output. If/when we run into a case where we need it to work, we could consider only passing through specific debug section names, or add a mechanism allowing us to tell section names apart from regular symbols. Fixes #58491
Revision tags: llvmorg-19.1.2
# 9220f645	13-Oct-2024	Kazu Hirata <kazu@google.com>	[NVPTX] Avoid repeated hash lookups (NFC) (#112117)
Revision tags: llvmorg-19.1.1
# 9bc26e9e	25-Sep-2024	Alex MacLean <amaclean@nvidia.com>	[NVPTX] Support !"cluster_dim_{x,y,z}" metadata (#109548) Add support for !"cluster_dim_{x,y,z}" metadata to allow specifying cluster dimensions on a kernel function in llvm. If any of these metadata entries are present, the `.explicitcluster` PTX directive is used and the specified dimensions are lowered with the `.reqnctapercluster` directive. For more details see: [PTX ISA: 11.7. Cluster Dimension Directives] (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives)
# 489acb24	25-Sep-2024	Alex MacLean <amaclean@nvidia.com>	[NVPTX][NFC] Refactor utilities to use std::optional (#109883)
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 9fa7c05a	30-Jun-2024	Akshay Deodhar <adeodhar@nvidia.com>	[NVPTX] Improved support for grid_constant (#97112) - Supports escaped grid_constant pointers less conservatively. Casts uses inside Calls, PtrToInts, Stores where the pointer is a _value operand_ to generic address space, immediately before the escape, while keeping other uses in the param address space - Related to: https://github.com/llvm/llvm-project/pull/96125
# 687d6fbf	24-Jun-2024	Akshay Deodhar <adeodhar@nvidia.com>	[NVPTX] Basic support for "grid_constant" (#96125) - Adds a helper function for checking whether an argument is a [grid_constant](https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#supported-properties). - Adds support for cvta.param using changes from https://github.com/llvm/llvm-project/pull/95289 - Supports escaped grid_constant pointers conservatively, by casting all uses to the generic address space with cvta.param.
Revision tags: llvmorg-18.1.8
# 435addbf	06-Jun-2024	Alex MacLean <amaclean@nvidia.com>	[NVPTX] Revamp NVVMIntrRange pass (#94422) Revamp the NVVMIntrRange pass making the following updates: - Use range attributes over range metadata. This is what instcombine has moved to for ranges on intrinsics in https://github.com/llvm/llvm-project/pull/88776 and it seems a bit cleaner. - Consider the `!"maxntid{x,y,z}"` and `!"reqntid{x,y,z}"` function metadata when adding ranges for `tid` sreg intrinsics. This can allow for smaller ranges and more optimization. - When range attributes are already present, use the intersection of the old and new range. This complements the metadata change by allowing ranges to be shrunk when an intrinsic is in a function which is inlined into a kernel with metadata. While we don't call this more than once yet, we should consider adding a second call after inlining, once this has had a chance to soak for a while and no issues have arisen. I've also re-enabled this pass in the TM, it was disabled years ago due to "numerical discrepancies" https://reviews.llvm.org/D96166. In our testing we haven't seen any issues with adding ranges to intrinsics, and I cannot find any further info about what issues were encountered.
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6
# 8da3a8f5	17-May-2024	Alex MacLean <amaclean@nvidia.com>	[NVPTX] fixup support for over-aligned parameters (#92457) This extends the NVPTX support for over-aligned parameters and return values in a few related ways: - Support for `alignstack` attribute, as an alternative to legacy nvvm `!"align"` metadata entries. While we still maintain the legacy support, long term it might be nice to auto-upgrade to `alignstack`. - Check the alignment info when emitting the parameter list to prevent a mismatch between alignment of caller and callee, which would previously cause a fatal error for `ptxas`. - Check the alignment info when emitting loads for parameters, potentially enabling better vectorization.
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2
# 3f8d4a8e	29-Sep-2023	Jakub Chlanda <jakub@codeplay.com>	Reland [NVPTX] Add support for maxclusterrank in launch_bounds (#66496) (#67667) This reverts commit 0afbcb20fd908f8bf9073697423da097be7db592.
# 0afbcb20	27-Sep-2023	Sam McCall <sam.mccall@gmail.com>	Revert "[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)" This reverts commit dfab31b41b4988b6dc8129840eba68f0c36c0f13. SemaDeclAttr.cpp cannot depend on Basic's private headers (lib/Basic/Targets/NVPTX.h)
# dfab31b4	27-Sep-2023	Jakub Chlanda <jakub@codeplay.com>	[NVPTX] Add support for maxclusterrank in launch_bounds (#66496) Since SM_90 CUDA supports specifying additional argument to the launch_bounds attribute: maxBlocksPerCluster, to express the maximum number of CTAs that can be part of the cluster. See: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank and https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds for details.
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0
# 0a7a9260	08-Sep-2023	Thomas <thomas.raoux@openai.com>	[NVPTX] Make i16x2 a native type and add supported vec instructions (#65799) recommit https://github.com/llvm/llvm-project/pull/65432 with minor bug fix for bitcasts
# b3a14cac	08-Sep-2023	Dmitri Gribenko <gribozavr@gmail.com>	Revert "[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432)" This reverts commit db5d845c73ee2d64f1a5bab3fc72edece9e3a7ba. As per PR discussion "Looks like we've missed lowering of bitcasts between v2f16 and v2i16 and it breaks XLA."
# db5d845c	07-Sep-2023	Thomas <thomas.raoux@openai.com>	[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432) On sm_90 some instructions now support i16x2 which allows hardware to execute more efficiently add, min and max instructions. In order to support that we need to make i16x2 a native type in the backend. This does the necessary changes to make i16x2 a native type and adds support for the instructions natively supporting i16x2. This caused a negative test in nvptx slp to start passing. Changed the test to a positive one as the IR is correctly vectorized.
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# fa023e0f	28-Dec-2022	Pavel Kopyl <pavelkopyl@gmail.com>	[NVPTX] Emit .noreturn directive Differential Revision: https://reviews.llvm.org/D140238
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2
# 940fa35e	23-Sep-2022	Luke Drummond <luke.drummond@codeplay.com>	[NVPTX] Fix a segfault for bitcasted calls with byval params `getFunctionParamOptimizedAlign` was being passed a null function argument when getting the callee of a bitcasted function symbol. This is because `CallBase::getCalledFunction` does not look through bitcasts. There is already code to handle this case in `NVPTXTargetLowering::getArgumentAlignment`, which is now hoisted into an NVPTX util. The alignment computation now gracefully handles computing alignment of virtual functions with a check for null.
Revision tags: llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# ede60037	29-Jun-2022	Nicolai Hähnle <nicolai.haehnle@amd.com>	ManagedStatic: remove many straightforward uses in llvm (Reapply after revert in e9ce1a588030d8d4004f5d7e443afe46245e9a92 due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.) Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120
# e9ce1a58	10-Jul-2022	Nicolai Hähnle <nicolai.haehnle@amd.com>	Revert "ManagedStatic: remove many straightforward uses in llvm" This reverts commit e6f1f062457c928c18a88c612f39d9e168f65a85. Reverting due to a failure on the fuchsia-x86_64-linux buildbot.
# e6f1f062	29-Jun-2022	Nicolai Hähnle <nicolai.haehnle@amd.com>	ManagedStatic: remove many straightforward uses in llvm Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 9d745828	08-Jan-2022	Kazu Hirata <kazu@google.com>	[Target] use range-based for loops (NFC)
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1
# adcd0268	28-Jan-2020	Benjamin Kramer <benny.kra@googlemail.com>	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2
# 3d5360a4	07-Aug-2019	Benjamin Kramer <benny.kra@googlemail.com>	Replace llvm::MutexGuard/UniqueLock with their standard equivalents All supported platforms have <mutex> now, so we don't need our own copies any longer. No functionality change intended. llvm-svn: 368149
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init
# 7ba838d2	12-Jul-2019	Bryant Wong <llvm-commits@xorshift.org>	Test commit. NFC. Formatting fix. llvm-svn: 365878