History log of /llvm-project/llvm/lib/Target/NVPTX/NVPTXUtilities.cpp (Results 1 – 25 of 49)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 4583f6d3 08-Jan-2025 Alex MacLean <amaclean@nvidia.com>

[NVPTX] Switch front-ends and tests to ptx_kernel cc (#120806)

the `ptx_kernel` calling convention is a more idiomatic and standard way
of specifying a NVPTX kernel than using the metadata which is

[NVPTX] Switch front-ends and tests to ptx_kernel cc (#120806)

the `ptx_kernel` calling convention is a more idiomatic and standard way
of specifying a NVPTX kernel than using the metadata which is not
supposed to change the meaning of the program. Further, checking the
calling convention is significantly faster than traversing the metadata,
improving compile time.

This change updates the clang and mlir frontends as well as the
NVPTXCtorDtorLowering pass to emit kernels using the calling convention.
In addition, this updates all NVPTX unit tests to use the calling
convention as well.

show more ...


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# ed8019d9 18-Nov-2024 Kazu Hirata <kazu@google.com>

[Target] Remove unused includes (NFC) (#116577)

Identified with misc-include-cleaner.


Revision tags: llvmorg-19.1.3
# 020fa868 22-Oct-2024 Artem Belevich <tra@google.com>

[NVPTX] mangle symbols in debug info to conform to PTX restrictions. (#113216)

Until now debug info was printing the symbols names as-is and that
resulted in invalid PTX when the symbols contained

[NVPTX] mangle symbols in debug info to conform to PTX restrictions. (#113216)

Until now debug info was printing the symbols names as-is and that
resulted in invalid PTX when the symbols contained characters that are
invalid for PTX. E.g. `__PRETTY_FUNCTION.something`

Debug info is somewhat disconnected from the symbols themselves, so the
regular "NVPTXAssignValidGlobalNames" pass can't easily fix them.

As the "plan B" this patch catches printout of debug symbols and fixes
them, as needed. One gotcha is that the same code path is used to print
the names of debug info sections. Those section names do start with a
'.debug'. The dot in those names is nominally illegal in PTX, but the
debug section names with a dot are accepted as a special case. The
downside of this change is that if someone ever has a `.debug*` symbol
that needs to be referred to from the debug info, that label will be
passed through as-is, and will still produce broken PTX output. If/when
we run into a case where we need it to work, we could consider only
passing through specific debug section names, or add a mechanism
allowing us to tell section names apart from regular symbols.

Fixes #58491

show more ...


Revision tags: llvmorg-19.1.2
# 9220f645 13-Oct-2024 Kazu Hirata <kazu@google.com>

[NVPTX] Avoid repeated hash lookups (NFC) (#112117)


Revision tags: llvmorg-19.1.1
# 9bc26e9e 25-Sep-2024 Alex MacLean <amaclean@nvidia.com>

[NVPTX] Support !"cluster_dim_{x,y,z}" metadata (#109548)

Add support for !"cluster_dim_{x,y,z}" metadata to allow specifying
cluster dimensions on a kernel function in llvm.

If any of these met

[NVPTX] Support !"cluster_dim_{x,y,z}" metadata (#109548)

Add support for !"cluster_dim_{x,y,z}" metadata to allow specifying
cluster dimensions on a kernel function in llvm.

If any of these metadata entries are present, the `.explicitcluster` PTX
directive is used and the specified dimensions are lowered with the
`.reqnctapercluster` directive. For more details see:
[PTX ISA: 11.7. Cluster Dimension Directives]
(https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives)

show more ...


# 489acb24 25-Sep-2024 Alex MacLean <amaclean@nvidia.com>

[NVPTX][NFC] Refactor utilities to use std::optional (#109883)


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 9fa7c05a 30-Jun-2024 Akshay Deodhar <adeodhar@nvidia.com>

[NVPTX] Improved support for grid_constant (#97112)

- Supports escaped grid_constant pointers less conservatively. Casts
uses inside Calls, PtrToInts, Stores where the pointer is a _value
operand_

[NVPTX] Improved support for grid_constant (#97112)

- Supports escaped grid_constant pointers less conservatively. Casts
uses inside Calls, PtrToInts, Stores where the pointer is a _value
operand_ to generic address space, immediately before the escape, while
keeping other uses in the param address space

- Related to: https://github.com/llvm/llvm-project/pull/96125

show more ...


# 687d6fbf 24-Jun-2024 Akshay Deodhar <adeodhar@nvidia.com>

[NVPTX] Basic support for "grid_constant" (#96125)

- Adds a helper function for checking whether an argument is a
[grid_constant](https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#supported-prop

[NVPTX] Basic support for "grid_constant" (#96125)

- Adds a helper function for checking whether an argument is a
[grid_constant](https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#supported-properties).
- Adds support for cvta.param using changes from
https://github.com/llvm/llvm-project/pull/95289
- Supports escaped grid_constant pointers conservatively, by casting all
uses to the generic address space with cvta.param.

show more ...


Revision tags: llvmorg-18.1.8
# 435addbf 06-Jun-2024 Alex MacLean <amaclean@nvidia.com>

[NVPTX] Revamp NVVMIntrRange pass (#94422)

Revamp the NVVMIntrRange pass making the following updates:
- Use range attributes over range metadata. This is what instcombine has
move to for ranges o

[NVPTX] Revamp NVVMIntrRange pass (#94422)

Revamp the NVVMIntrRange pass making the following updates:
- Use range attributes over range metadata. This is what instcombine has
move to for ranges on intrinsics in
https://github.com/llvm/llvm-project/pull/88776 and it seems a bit
cleaner.
- Consider the `!"maxntid{x,y,z}"` and `!"reqntid{x,y,z}"` function
metadata when adding ranges for `tid` srge instrinsics. This can allow
for smaller ranges and more optimization.
- When range attributes are already present, use the intersection of the
old and new range. This complements the metadata change by allowing
ranges to be shrunk when an intrinsic is in a function which is inlined
into a kernel with metadata. While we don't call this more then once
yet, we should consider adding a second call after inlining, once this
has had a chance to soak for a while and no issues have arisen.

I've also re-enabled this pass in the TM, it was disabled years ago due
to "numerical discrepancies" https://reviews.llvm.org/D96166. In our
testing we haven't seen any issues with adding ranges to intrinsics, and
I cannot find any further info about what issues were encountered.

show more ...


Revision tags: llvmorg-18.1.7, llvmorg-18.1.6
# 8da3a8f5 17-May-2024 Alex MacLean <amaclean@nvidia.com>

[NVPTX] fixup support for over-aligned parameters (#92457)

This extends the NVPTX support for over-aligned parameters and return
values in a few related ways:

- Support for `alignstack` attribut

[NVPTX] fixup support for over-aligned parameters (#92457)

This extends the NVPTX support for over-aligned parameters and return
values in a few related ways:

- Support for `alignstack` attribute, as an alternative to legacy nvvm
`!"align"` metadata entries. While we still maintain the legacy support,
long term it might be nice to auto-upgrade to `alignstack`.
- Check the alignment info when emitting the parameter list to prevent a
mismatch between alignment of caller and callee, which would previously
cause a fatal error for `ptxas`.
- Check the alignment info when emitting loads for parameters,
potentially enabling better vectorization.

show more ...


Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2
# 3f8d4a8e 29-Sep-2023 Jakub Chlanda <jakub@codeplay.com>

Reland [NVPTX] Add support for maxclusterrank in launch_bounds (#66496) (#67667)

This reverts commit 0afbcb20fd908f8bf9073697423da097be7db592.


# 0afbcb20 27-Sep-2023 Sam McCall <sam.mccall@gmail.com>

Revert "[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)"

This reverts commit dfab31b41b4988b6dc8129840eba68f0c36c0f13.

SemaDeclAttr.cpp cannot depend on Basic's private headers
(li

Revert "[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)"

This reverts commit dfab31b41b4988b6dc8129840eba68f0c36c0f13.

SemaDeclAttr.cpp cannot depend on Basic's private headers
(lib/Basic/Targets/NVPTX.h)

show more ...


# dfab31b4 27-Sep-2023 Jakub Chlanda <jakub@codeplay.com>

[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)

Since SM_90 CUDA supports specifying additional argument to the
launch_bounds attribute: maxBlocksPerCluster, to express the maximum

[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)

Since SM_90 CUDA supports specifying additional argument to the
launch_bounds attribute: maxBlocksPerCluster, to express the maximum
number of CTAs that can be part of the cluster. See:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank
and

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds
for details.

show more ...


Revision tags: llvmorg-17.0.1, llvmorg-17.0.0
# 0a7a9260 08-Sep-2023 Thomas <thomas.raoux@openai.com>

[NVPTX] Make i16x2 a native type and add supported vec instructions (#65799)

recommit https://github.com/llvm/llvm-project/pull/65432 with minor bug
fix for bitcasts


# b3a14cac 08-Sep-2023 Dmitri Gribenko <gribozavr@gmail.com>

Revert "[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432)"

This reverts commit db5d845c73ee2d64f1a5bab3fc72edece9e3a7ba.

As per PR discussion "Looks like we've missed low

Revert "[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432)"

This reverts commit db5d845c73ee2d64f1a5bab3fc72edece9e3a7ba.

As per PR discussion "Looks like we've missed lowering of bitcasts
between v2f16 and v2i16 and it breaks XLA."

show more ...


# db5d845c 07-Sep-2023 Thomas <thomas.raoux@openai.com>

[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432)

On sm_90 some instructions now support i16x2 which allows hardware to
execute more efficiently add, min and max instruct

[NVPTX] Make i16x2 a native type and add supported vec instructions (#65432)

On sm_90 some instructions now support i16x2 which allows hardware to
execute more efficiently add, min and max instructions.

In order to support that we need to make i16x2 a native type in the
backend. This does the necessary changes to make i16x2 a native type and
adds support for the instructions natively supporting i16x2.

This caused a negative test in nvptx slp to start passing. Changed the
test to a positive one as the IR is correctly vectorized.

show more ...


Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# fa023e0f 28-Dec-2022 Pavel Kopyl <pavelkopyl@gmail.com>

[NVPTX] Emit .noreturn directive

Differential Revision: https://reviews.llvm.org/D140238


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2
# 940fa35e 23-Sep-2022 Luke Drummond <luke.drummond@codeplay.com>

[NVPTX] Fix a segfault for bitcasted calls with byval params

`getFunctionParamOptimizedAlign` was being passed a null function
argument when getting the callee of a bitcasted function symbol. This i

[NVPTX] Fix a segfault for bitcasted calls with byval params

`getFunctionParamOptimizedAlign` was being passed a null function
argument when getting the callee of a bitcasted function symbol. This is
because `CallBase::getCalledFunction` does not look through bitcasts.

There is already code to handle this case in
`NVPTXTargetLowering::getArgumentAlignment`, which is now hoisted into
an NVPTX util.

The alignment computation now gracefully handles computing alignment of
virtual functions with a check for null.

show more ...


Revision tags: llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# ede60037 29-Jun-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

ManagedStatic: remove many straightforward uses in llvm

(Reapply after revert in e9ce1a588030d8d4004f5d7e443afe46245e9a92 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other

ManagedStatic: remove many straightforward uses in llvm

(Reapply after revert in e9ce1a588030d8d4004f5d7e443afe46245e9a92 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)

Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.

Differential Revision: https://reviews.llvm.org/D129120

show more ...


# e9ce1a58 10-Jul-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

Revert "ManagedStatic: remove many straightforward uses in llvm"

This reverts commit e6f1f062457c928c18a88c612f39d9e168f65a85.

Reverting due to a failure on the fuchsia-x86_64-linux buildbot.


# e6f1f062 29-Jun-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

ManagedStatic: remove many straightforward uses in llvm

Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,

ManagedStatic: remove many straightforward uses in llvm

Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.

Differential Revision: https://reviews.llvm.org/D129120

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 9d745828 08-Jan-2022 Kazu Hirata <kazu@google.com>

[Target] use range-based for loops (NFC)


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1
# adcd0268 28-Jan-2020 Benjamin Kramer <benny.kra@googlemail.com>

Make llvm::StringRef to std::string conversions explicit.

This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly m

Make llvm::StringRef to std::string conversions explicit.

This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.

show more ...


Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2
# 3d5360a4 07-Aug-2019 Benjamin Kramer <benny.kra@googlemail.com>

Replace llvm::MutexGuard/UniqueLock with their standard equivalents

All supported platforms have <mutex> now, so we don't need our own
copies any longer. No functionality change intended.

llvm-svn:

Replace llvm::MutexGuard/UniqueLock with their standard equivalents

All supported platforms have <mutex> now, so we don't need our own
copies any longer. No functionality change intended.

llvm-svn: 368149

show more ...


Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init
# 7ba838d2 12-Jul-2019 Bryant Wong <llvm-commits@xorshift.org>

Test commit. NFC.

Formatting fix.

llvm-svn: 365878


12