History log of /llvm-project/llvm/lib/Target/NVPTX/NVVMReflect.cpp (Results 1 – 25 of 44)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init
# 01045b75 21-Jan-2025 Joseph Huber <huberjn@outlook.com>

[CUDA] Add missing zero initializer for reflect


# cdb4da32 21-Jan-2025 Joseph Huber <huberjn@outlook.com>

[NVPTX] Fix failing test and incorrect `mcpu` reading in reflect

Summary:
Test uses nvptx in 32-bit mode and calling `mcpu` is broken and caused
asan failures.


Revision tags: llvmorg-19.1.7
# e7a83fc7 07-Jan-2025 Kazu Hirata <kazu@google.com>

[NVPTX] Fix a warning

This patch fixes:

llvm/lib/Target/NVPTX/NVVMReflect.cpp:225:18: error: object backing
the pointer will be destroyed at the end of the full-expression
[-Werror,-Wdangling

[NVPTX] Fix a warning

This patch fixes:

llvm/lib/Target/NVPTX/NVVMReflect.cpp:225:18: error: object backing
the pointer will be destroyed at the end of the full-expression
[-Werror,-Wdangling-gsl]

show more ...


# 29b5c18e 07-Jan-2025 Joseph Huber <huberjn@outlook.com>

[NVPTX] Do not run the NVVMReflect pass as part of the normal pipeline (#121834)

Summary:
This pass lowers the `__nvvm_reflect` builtin in the IR. However, this
currently runs in the standard optimi

[NVPTX] Do not run the NVVMReflect pass as part of the normal pipeline (#121834)

Summary:
This pass lowers the `__nvvm_reflect` builtin in the IR. However, this
currently runs in the standard optimization pipeline, not just the
backend pipeline. This means that if the user creates LLVM-IR without an
architecture set, it will always delete the reflect code even if it is
intended to be used later.

Pushing this into the backend pipeline will ensure that this works as
intended, allowing users to conditionally include code depending on
which target architecture the user ended up using. This fixes a bug in
OpenMP and missing code in `libc`.

show more ...


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# ed8019d9 18-Nov-2024 Kazu Hirata <kazu@google.com>

[Target] Remove unused includes (NFC) (#116577)

Identified with misc-include-cleaner.


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 9df71d76 28-Jun-2024 Nikita Popov <npopov@redhat.com>

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, re

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.

show more ...


# fef144ce 25-Jun-2024 Kazu Hirata <kazu@google.com>

Revert "[llvm] Use llvm::sort (NFC) (#96434)"

This reverts commit 05d167fc201b4f2e96108be0d682f6800a70c23d.

Reverting the patch fixes the following under EXPENSIVE_CHECKS:

LLVM :: CodeGen/AMDGPU

Revert "[llvm] Use llvm::sort (NFC) (#96434)"

This reverts commit 05d167fc201b4f2e96108be0d682f6800a70c23d.

Reverting the patch fixes the following under EXPENSIVE_CHECKS:

LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
LLVM :: CodeGen/AMDGPU/sched-group-barrier-pre-RA.mir
LLVM :: CodeGen/PowerPC/aix-xcoff-used-with-stringpool.ll
LLVM :: CodeGen/PowerPC/merge-string-used-by-metadata.mir
LLVM :: CodeGen/PowerPC/mergeable-string-pool-large.ll
LLVM :: CodeGen/PowerPC/mergeable-string-pool-pass-only.mir
LLVM :: CodeGen/PowerPC/mergeable-string-pool.ll

show more ...


# 05d167fc 23-Jun-2024 Kazu Hirata <kazu@google.com>

[llvm] Use llvm::sort (NFC) (#96434)


Revision tags: llvmorg-18.1.8
# 7c6d0d26 15-Jun-2024 Kazu Hirata <kazu@google.com>

[llvm] Use llvm::unique (NFC) (#95628)


Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# 45260bf2 12-Feb-2024 Petr <piter.zh@gmail.com>

Fix use after free error in NVVMReflect (#81471)

I have a Triton kernel, which triggered a heap-use-after-free error in
LLVM.

The problem was that the same instruction may be added to the
`ToSi

Fix use after free error in NVVMReflect (#81471)

I have a Triton kernel, which triggered a heap-use-after-free error in
LLVM.

The problem was that the same instruction may be added to the
`ToSimplify` array multiple times. If this duplicate instruction is
trivially dead, it gets deleted on the first pass. Then, on the second
pass, the freed instruction is passed.

To fix this, I'm adding the instructions to the `ToRemove` array and
filter it out for duplicates to avoid possible double frees.

show more ...


# 07dc85ba 09-Feb-2024 Joseph Huber <huberjn@outlook.com>

[NVVMReflect] Improve folding inside of the NVVMReflect pass (#81253)

Summary:
The previous patch did very simple folding that only worked for driectly
used branches. This patch improves this by tra

[NVVMReflect] Improve folding inside of the NVVMReflect pass (#81253)

Summary:
The previous patch did very simple folding that only worked for driectly
used branches. This patch improves this by traversing the use-def chain
to sipmlify every constant subexpression until it reaches a terminator
we can delete. The support should work for all expected cases now.

show more ...


# ffabcbcf 08-Feb-2024 Joseph Huber <huberjn@outlook.com>

[NVVMReflect][Reland] Force dead branch elimination in NVVMReflect (#81189)

Summary:
The `__nvvm_reflect` function is used to guard invalid code that varies
between architectures. One problem with t

[NVVMReflect][Reland] Force dead branch elimination in NVVMReflect (#81189)

Summary:
The `__nvvm_reflect` function is used to guard invalid code that varies
between architectures. One problem with this feature is that if it is
used without optimizations, it will leave invalid code in the module
that will then make it to the backend. The `__nvvm_reflect` pass is
already mandatory, so it should do some trivial branch removal to ensure
that constants are handled correctly. This dead branch elimination only
works in the trivial case of a compare on a branch and does not touch
any conditionals that were not realted to the `__nvvm_reflect` call in
order to preserve `O0` semantics as much as possible. This should allow
the following to work on NVPTX targets

```c
int foo() {
if (__nvvm_reflect("__CUDA_ARCH") >= 700)
asm("valid;\n");
}
```

Relanding after fixing a bug.

show more ...


# 0800a360 08-Feb-2024 Joseph Huber <huberjn@outlook.com>

Revert "[NVVMReflect] Force dead branch elimination in NVVMReflect (#81189)"

This reverts commit 9211e67da36782db44a46ccb9ac06734ccf2570f.

Summary:
This seemed to crash one one of the CUDA math tes

Revert "[NVVMReflect] Force dead branch elimination in NVVMReflect (#81189)"

This reverts commit 9211e67da36782db44a46ccb9ac06734ccf2570f.

Summary:
This seemed to crash one one of the CUDA math tests. Revert until it can
be fixed.

show more ...


# 9211e67d 08-Feb-2024 Joseph Huber <huberjn@outlook.com>

[NVVMReflect] Force dead branch elimination in NVVMReflect (#81189)

Summary:
The `__nvvm_reflect` function is used to guard invalid code that varies
between architectures. One problem with this feat

[NVVMReflect] Force dead branch elimination in NVVMReflect (#81189)

Summary:
The `__nvvm_reflect` function is used to guard invalid code that varies
between architectures. One problem with this feature is that if it is
used without optimizations, it will leave invalid code in the module
that will then make it to the backend. The `__nvvm_reflect` pass is
already mandatory, so it should do some trivial branch removal to ensure
that constants are handled correctly. This dead branch elimination only
works in the trivial case of a compare on a branch and does not touch
any conditionals that were not realted to the `__nvvm_reflect` call in
order to preserve `O0` semantics as much as possible. This should allow
the following to work on NVPTX targets

```c
int foo() {
if (__nvvm_reflect("__CUDA_ARCH") >= 700)
asm("valid;\n");
}
```

show more ...


Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4
# 935d8e12 22-Oct-2023 Kazu Hirata <kazu@google.com>

[llvm] Stop including llvm/ADT/StringMap.h (NFC)

Identified misc-include-cleaner.


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# ce43e2f0 04-Jan-2023 Hugh Delaney <hugh.delaney@codeplay.com>

[llvm][CUDA] Allow NVVMREflect to process OpenCL-specific __nvvm_reflect_ocl()

OpenCL requires constant string arguments to be in a particular address space,
so OpenCL sources can't use the regular

[llvm][CUDA] Allow NVVMREflect to process OpenCL-specific __nvvm_reflect_ocl()

OpenCL requires constant string arguments to be in a particular address space,
so OpenCL sources can't use the regular `__nvvm_reflect()`.

Allow NVVMReflect pass to accept an Open_CL specific variant with a constant
string in a non-default address space.

Differential Revision: https://reviews.llvm.org/D139213

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 0f070bee 11-Apr-2022 Johannes Doerfert <johannes@jdoerfert.de>

[NVPTX][FIX] Allow __nvvm_reflect in the presence of opaque pointers

Differential Revision: https://reviews.llvm.org/D123522


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 9ccf13c3 30-Dec-2020 Arthur Eubanks <aeubanks@google.com>

[NewPM][NVPTX] Port NVPTX opt passes

There are only two used in the IR optimization pipeline.
Port these and add them to the default pipeline.

Similar to https://reviews.llvm.org/D93863.

I added -

[NewPM][NVPTX] Port NVPTX opt passes

There are only two used in the IR optimization pipeline.
Port these and add them to the default pipeline.

Similar to https://reviews.llvm.org/D93863.

I added -mtriple to some tests since under the new PM, the passes are
only available when the TargetMachine is specified.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D93930

show more ...


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3
# 5d986953 11-Dec-2019 Reid Kleckner <rnk@google.com>

[IR] Split out target specific intrinsic enums into separate headers

This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
Windows optimized+debug build

[IR] Split out target specific intrinsic enums into separate headers

This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
object file size.
- Incremental step towards decoupling target intrinsics.

The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.

Part of PR34259

Reviewers: efriedma, echristo, MaskRay

Reviewed By: echristo, MaskRay

Differential Revision: https://reviews.llvm.org/D71320

show more ...


Revision tags: llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the ne

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636

show more ...


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2
# 0a11b636 03-Aug-2018 Artem Belevich <tra@google.com>

[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").

Summary:
libdevice in recent CUDA versions relies on __nvvm_reflect() to select
GPU-specific bitcode. This patch addresses the requirement.

Reviewers:

[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").

Summary:
libdevice in recent CUDA versions relies on __nvvm_reflect() to select
GPU-specific bitcode. This patch addresses the requirement.

Reviewers: jlebar

Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D50207

llvm-svn: 338908

show more ...


Revision tags: llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2
# d34e60ca 14-May-2018 Nicola Zaghen <nicola.zaghen@imgtec.com>

Rename DEBUG macro to LLVM_DEBUG.

The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/

Rename DEBUG macro to LLVM_DEBUG.

The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240

show more ...


Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1
# 38746d97 15-Jan-2017 Justin Lebar <jlebar@google.com>

[NVPTX] Let there be One True Way to set NVVMReflect params.

Summary:
Previously there were three ways to inform the NVVMReflect pass whether
you wanted to flush denormals to zero:

* An LLVM comm

[NVPTX] Let there be One True Way to set NVVMReflect params.

Summary:
Previously there were three ways to inform the NVVMReflect pass whether
you wanted to flush denormals to zero:

* An LLVM command-line option
* Parameters to the NVVMReflect constructor
* Metadata on the module itself.

This change removes the first two, leaving only the third.

The motivation for this change, aside from simplifying things, is that
we want LLVM to be aware of whether it's operating in FTZ mode, so other
passes can use this information. Ideally we'd have a target-generic
piece of metadata on the module. This change moves us in that
direction.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D28700

llvm-svn: 292068

show more ...


# a54f4d70 14-Dec-2016 Justin Lebar <jlebar@google.com>

[NVPTX] Remove dead code.

I've chosen to remove NVPTXInstrInfo::CanTailMerge but not
NVPTXInstrInfo::isLoadInstr and isStoreInstr (which are also dead)
because while the latter two are reasonably us

[NVPTX] Remove dead code.

I've chosen to remove NVPTXInstrInfo::CanTailMerge but not
NVPTXInstrInfo::isLoadInstr and isStoreInstr (which are also dead)
because while the latter two are reasonably useful utilities, the former
cannot be used safely: It relies on successful address space inference
to identify writes to shared memory, but addrspace inference is a
best-effort thing.

llvm-svn: 289740

show more ...


Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1
# fea8a8d7 25-May-2016 Justin Lebar <jlebar@google.com>

[NVPTX] Don't (incorrectly) say that the NVVMReflect pass preserves all analyses.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D20585

llvm-

[NVPTX] Don't (incorrectly) say that the NVVMReflect pass preserves all analyses.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D20585

llvm-svn: 270790

show more ...


12