Revision tags: llvmorg-21-init |
|
#
01045b75 |
| 21-Jan-2025 |
Joseph Huber <huberjn@outlook.com> |
[CUDA] Add missing zero initializer for reflect
|
#
cdb4da32 |
| 21-Jan-2025 |
Joseph Huber <huberjn@outlook.com> |
[NVPTX] Fix failing test and incorrect `mcpu` reading in reflect
Summary: Test uses nvptx in 32-bit mode and calling `mcpu` is broken and caused asan failures.
|
Revision tags: llvmorg-19.1.7 |
|
#
e7a83fc7 |
| 07-Jan-2025 |
Kazu Hirata <kazu@google.com> |
[NVPTX] Fix a warning
This patch fixes:
llvm/lib/Target/NVPTX/NVVMReflect.cpp:225:18: error: object backing the pointer will be destroyed at the end of the full-expression [-Werror,-Wdangling-gsl]
|
#
29b5c18e |
| 07-Jan-2025 |
Joseph Huber <huberjn@outlook.com> |
[NVPTX] Do not run the NVVMReflect pass as part of the normal pipeline (#121834)
Summary: This pass lowers the `__nvvm_reflect` builtin in the IR. However, this currently runs in the standard optimization pipeline, not just the backend pipeline. This means that if the user creates LLVM-IR without an architecture set, it will always delete the reflect code even if it is intended to be used later.
Pushing this into the backend pipeline will ensure that this works as intended, allowing users to conditionally include code depending on which target architecture the user ended up using. This fixes a bug in OpenMP and missing code in `libc`.
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
ed8019d9 |
| 18-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[Target] Remove unused includes (NFC) (#116577)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
9df71d76 |
| 28-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
|
#
fef144ce |
| 25-Jun-2024 |
Kazu Hirata <kazu@google.com> |
Revert "[llvm] Use llvm::sort (NFC) (#96434)"
This reverts commit 05d167fc201b4f2e96108be0d682f6800a70c23d.
Reverting the patch fixes the following under EXPENSIVE_CHECKS:
LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
LLVM :: CodeGen/AMDGPU/sched-group-barrier-pre-RA.mir
LLVM :: CodeGen/PowerPC/aix-xcoff-used-with-stringpool.ll
LLVM :: CodeGen/PowerPC/merge-string-used-by-metadata.mir
LLVM :: CodeGen/PowerPC/mergeable-string-pool-large.ll
LLVM :: CodeGen/PowerPC/mergeable-string-pool-pass-only.mir
LLVM :: CodeGen/PowerPC/mergeable-string-pool.ll
|
#
05d167fc |
| 23-Jun-2024 |
Kazu Hirata <kazu@google.com> |
[llvm] Use llvm::sort (NFC) (#96434)
|
Revision tags: llvmorg-18.1.8 |
|
#
7c6d0d26 |
| 15-Jun-2024 |
Kazu Hirata <kazu@google.com> |
[llvm] Use llvm::unique (NFC) (#95628)
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
45260bf2 |
| 12-Feb-2024 |
Petr <piter.zh@gmail.com> |
Fix use after free error in NVVMReflect (#81471)
I have a Triton kernel, which triggered a heap-use-after-free error in
LLVM.
The problem was that the same instruction may be added to the
`ToSimplify` array multiple times. If this duplicate instruction is
trivially dead, it gets deleted on the first pass. Then, on the second
pass, the freed instruction is passed.
To fix this, I'm adding the instructions to the `ToRemove` array and
filtering out duplicates to avoid possible double frees.
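The deduplication idea described above can be sketched in isolation. This is a minimal, self-contained model and not the code from the patch: `Inst`, `dedupe` and `eraseAll` are hypothetical names, and plain `new`/`delete` stands in for LLVM's instruction erasure.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for an IR instruction; not llvm::Instruction.
struct Inst {
  int Id;
};

// Drop duplicate pointers so that each instruction is handled only once.
// std::less gives a total order over pointers, which is all we need here
// because only identity matters.
inline void dedupe(std::vector<Inst *> &ToRemove) {
  std::sort(ToRemove.begin(), ToRemove.end(), std::less<Inst *>());
  ToRemove.erase(std::unique(ToRemove.begin(), ToRemove.end()),
                 ToRemove.end());
}

// Free every queued instruction exactly once; returns how many were freed.
inline int eraseAll(std::vector<Inst *> &ToRemove) {
  dedupe(ToRemove);
  int Freed = 0;
  for (Inst *I : ToRemove) {
    assert(I && "queued instruction must not be null");
    delete I;
    ++Freed;
  }
  ToRemove.clear();
  return Freed;
}
```

Without the `dedupe` call, queueing the same pointer twice would free it twice, which is the kind of failure the commit message describes.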
|
#
07dc85ba |
| 09-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
[NVVMReflect] Improve folding inside of the NVVMReflect pass (#81253)
Summary: The previous patch did very simple folding that only worked for directly used branches. This patch improves this by traversing the use-def chain to simplify every constant subexpression until it reaches a terminator we can delete. The support should work for all expected cases now.
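As a rough illustration of the traversal described above, the sketch below uses a toy node list rather than LLVM IR. Everything in it (`Node`, `Kind`, `BranchFate`, `foldFromReflect`) is hypothetical and only models the idea: replace the reflect call with a constant, fold each user that becomes constant, and keep following users until a branch is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature IR, not LLVM's classes. Operands are indices into
// the node list; -1 means "no operand".
enum class Kind { Reflect, Const, CmpGE, Branch };

struct Node {
  Kind K;
  int Value; // payload for Const nodes
  int Lhs;   // first operand (the condition for Branch)
  int Rhs;   // second operand
};

inline Node makeReflect() { return Node{Kind::Reflect, 0, -1, -1}; }
inline Node makeConst(int V) { return Node{Kind::Const, V, -1, -1}; }
inline Node makeCmpGE(int L, int R) { return Node{Kind::CmpGE, 0, L, R}; }
inline Node makeBranch(int Cond) { return Node{Kind::Branch, 0, Cond, -1}; }

enum class BranchFate { Unknown, AlwaysTaken, NeverTaken };

// Replace the reflect call with a constant, then follow its users: each
// compare whose operands are now constant is folded and queued in turn, and
// a branch on a folded condition has its outcome recorded.
inline BranchFate foldFromReflect(std::vector<Node> &Nodes, int Reflect,
                                  int ReflectValue) {
  assert(Nodes[Reflect].K == Kind::Reflect && "expected a reflect call");
  Nodes[Reflect] = makeConst(ReflectValue);
  std::vector<int> Worklist(1, Reflect);
  BranchFate Fate = BranchFate::Unknown;
  while (!Worklist.empty()) {
    const int Def = Worklist.back();
    Worklist.pop_back();
    for (std::size_t U = 0; U < Nodes.size(); ++U) {
      Node &User = Nodes[U];
      if (User.Lhs != Def && User.Rhs != Def)
        continue; // not a user of the value we just folded
      if (User.K == Kind::CmpGE) {
        const Node L = Nodes[User.Lhs];
        const Node R = Nodes[User.Rhs];
        if (L.K == Kind::Const && R.K == Kind::Const) {
          User = makeConst(L.Value >= R.Value ? 1 : 0);
          Worklist.push_back(static_cast<int>(U));
        }
      } else if (User.K == Kind::Branch) {
        Fate = Nodes[User.Lhs].Value != 0 ? BranchFate::AlwaysTaken
                                          : BranchFate::NeverTaken;
      }
    }
  }
  return Fate;
}
```

A branch whose fate is known could then be rewritten as an unconditional jump and the dead side removed; that step is left out to keep the sketch short.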
|
#
ffabcbcf |
| 08-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
[NVVMReflect][Reland] Force dead branch elimination in NVVMReflect (#81189)
Summary: The `__nvvm_reflect` function is used to guard invalid code that varies between architectures. One problem with this feature is that if it is used without optimizations, it will leave invalid code in the module that will then make it to the backend. The `__nvvm_reflect` pass is already mandatory, so it should do some trivial branch removal to ensure that constants are handled correctly. This dead branch elimination only works in the trivial case of a compare on a branch and does not touch any conditionals that were not related to the `__nvvm_reflect` call in order to preserve `O0` semantics as much as possible. This should allow the following to work on NVPTX targets:
```c
int foo() {
  if (__nvvm_reflect("__CUDA_ARCH") >= 700)
    asm("valid;\n");
}
```
Relanding after fixing a bug.
|
#
0800a360 |
| 08-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
Revert "[NVVMReflect] Force dead branch elimination in NVVMReflect (#81189)"
This reverts commit 9211e67da36782db44a46ccb9ac06734ccf2570f.
Summary: This seemed to crash one of the CUDA math tests. Revert until it can be fixed.
|
#
9211e67d |
| 08-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
[NVVMReflect] Force dead branch elimination in NVVMReflect (#81189)
Summary: The `__nvvm_reflect` function is used to guard invalid code that varies between architectures. One problem with this feature is that if it is used without optimizations, it will leave invalid code in the module that will then make it to the backend. The `__nvvm_reflect` pass is already mandatory, so it should do some trivial branch removal to ensure that constants are handled correctly. This dead branch elimination only works in the trivial case of a compare on a branch and does not touch any conditionals that were not related to the `__nvvm_reflect` call in order to preserve `O0` semantics as much as possible. This should allow the following to work on NVPTX targets:
```c
int foo() {
  if (__nvvm_reflect("__CUDA_ARCH") >= 700)
    asm("valid;\n");
}
```
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4 |
|
#
935d8e12 |
| 22-Oct-2023 |
Kazu Hirata <kazu@google.com> |
[llvm] Stop including llvm/ADT/StringMap.h (NFC)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
ce43e2f0 |
| 04-Jan-2023 |
Hugh Delaney <hugh.delaney@codeplay.com> |
[llvm][CUDA] Allow NVVMReflect to process OpenCL-specific __nvvm_reflect_ocl()
OpenCL requires constant string arguments to be in a particular address space, so OpenCL sources can't use the regular `__nvvm_reflect()`.
Allow the NVVMReflect pass to accept an OpenCL-specific variant with a constant string in a non-default address space.
Differential Revision: https://reviews.llvm.org/D139213
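A minimal sketch of what accepting both spellings could look like, assuming the pass identifies reflect calls by callee name. The helper `isReflectCallee` is hypothetical and is not taken from the patch; only the two function names come from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when a callee name is one of the two reflect
// entry points mentioned in the commit message above.
inline bool isReflectCallee(const std::string &Name) {
  assert(!Name.empty() && "callee name must not be empty");
  return Name == "__nvvm_reflect" || Name == "__nvvm_reflect_ocl";
}
```

The address space of the string argument is deliberately not modelled here; the sketch only shows the name-based dispatch.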
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
0f070bee |
| 11-Apr-2022 |
Johannes Doerfert <johannes@jdoerfert.de> |
[NVPTX][FIX] Allow __nvvm_reflect in the presence of opaque pointers
Differential Revision: https://reviews.llvm.org/D123522
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
#
9ccf13c3 |
| 30-Dec-2020 |
Arthur Eubanks <aeubanks@google.com> |
[NewPM][NVPTX] Port NVPTX opt passes
There are only two used in the IR optimization pipeline. Port these and add them to the default pipeline.
Similar to https://reviews.llvm.org/D93863.
I added -mtriple to some tests since under the new PM, the passes are only available when the TargetMachine is specified.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D93930
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3 |
|
#
5d986953 |
| 11-Dec-2019 |
Reid Kleckner <rnk@google.com> |
[IR] Split out target specific intrinsic enums into separate headers
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size.
- Incremental step towards decoupling target intrinsics.
The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work.
Part of PR34259
Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
|
Revision tags: llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2 |
|
#
0a11b636 |
| 03-Aug-2018 |
Artem Belevich <tra@google.com> |
[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").
Summary: libdevice in recent CUDA versions relies on __nvvm_reflect() to select GPU-specific bitcode. This patch addresses the requirement.
Reviewers: jlebar
Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D50207
llvm-svn: 338908
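To make the `__CUDA_ARCH` query concrete, here is a small sketch of deriving its value from a target CPU name. It assumes the usual CUDA convention that `sm_70` corresponds to 700; the helper `cudaArchFromCpu` and its "return 0 on anything unrecognised" behaviour are assumptions of this sketch, not the pass's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: turn a CPU name such as "sm_70" into the value a
// `__CUDA_ARCH` query would observe (SM version times ten). Returns 0 when
// the name is not exactly "sm_<digits>"; suffixed names are not handled.
inline int cudaArchFromCpu(const std::string &Cpu) {
  const std::string Prefix = "sm_";
  if (Cpu.compare(0, Prefix.size(), Prefix) != 0 ||
      Cpu.size() == Prefix.size())
    return 0;
  int Sm = 0;
  for (std::size_t I = Prefix.size(); I < Cpu.size(); ++I) {
    if (Cpu[I] < '0' || Cpu[I] > '9')
      return 0;
    Sm = Sm * 10 + (Cpu[I] - '0');
  }
  assert(Sm >= 0 && "SM version is built from digits only");
  return Sm * 10;
}
```

With such a value available, a guard like `__nvvm_reflect("__CUDA_ARCH") >= 700` reduces to a comparison between two constants.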
|
Revision tags: llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
#
d34e60ca |
| 14-May-2018 |
Nicola Zaghen <nicola.zaghen@imgtec.com> |
Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually change DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
|
Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1 |
|
#
38746d97 |
| 15-Jan-2017 |
Justin Lebar <jlebar@google.com> |
[NVPTX] Let there be One True Way to set NVVMReflect params.
Summary: Previously there were three ways to inform the NVVMReflect pass whether you wanted to flush denormals to zero:
* An LLVM command-line option
* Parameters to the NVVMReflect constructor
* Metadata on the module itself.
This change removes the first two, leaving only the third.
The motivation for this change, aside from simplifying things, is that we want LLVM to be aware of whether it's operating in FTZ mode, so other passes can use this information. Ideally we'd have a target-generic piece of metadata on the module. This change moves us in that direction.
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D28700
llvm-svn: 292068
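The "metadata only" approach can be modelled as a single lookup table that the pass consults. In this sketch a `std::map` stands in for the module's flags; the flag name `nvvm-reflect-ftz` and the query string `__CUDA_FTZ` do not appear in the commit message and should be read as assumptions, as should the helper `reflectValue`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of module-level flags: name -> integer value.
typedef std::map<std::string, int> ModuleFlags;

// Resolve a reflect query from module flags alone. Unknown queries and
// missing flags both yield 0 in this sketch.
inline int reflectValue(const ModuleFlags &Flags, const std::string &Query) {
  assert(!Query.empty() && "reflect query must not be empty");
  if (Query == "__CUDA_FTZ") {
    ModuleFlags::const_iterator It = Flags.find("nvvm-reflect-ftz");
    return It == Flags.end() ? 0 : It->second;
  }
  return 0;
}
```

Because the value lives on the module, any other consumer can read the same flag instead of depending on a command-line option or a constructor argument.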
|
#
a54f4d70 |
| 14-Dec-2016 |
Justin Lebar <jlebar@google.com> |
[NVPTX] Remove dead code.
I've chosen to remove NVPTXInstrInfo::CanTailMerge but not NVPTXInstrInfo::isLoadInstr and isStoreInstr (which are also dead) because while the latter two are reasonably useful utilities, the former cannot be used safely: It relies on successful address space inference to identify writes to shared memory, but addrspace inference is a best-effort thing.
llvm-svn: 289740
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
#
fea8a8d7 |
| 25-May-2016 |
Justin Lebar <jlebar@google.com> |
[NVPTX] Don't (incorrectly) say that the NVVMReflect pass preserves all analyses.
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D20585
llvm-svn: 270790
|