|
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
| #
b279f6b0 |
| 15-Dec-2024 |
Fangrui Song <i@maskray.me> |
[NVPTX,test] Change llc -march= to -mtriple=
Similar to 806761a7629df268c8aed49657aeccffa6bca449
-mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple (e.g. Windows, macOS), leaving a target triple which may not make sense.
Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize nvptx{,64}-apple-darwin as ELF instead of rejecting it outright.
|
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6 |
|
| #
ef8655ad |
| 05-Jun-2023 |
Artem Belevich <tra@google.com> |
[NVPTX] Adapt tests to make them usable with CUDA-12.x
CUDA-12 no longer supports 32-bit compilation.
Tests agnostic to 32/64 compilation mode are switched to use nvptx64. Tests that do care about it have 32-bit ptxas compilation disabled with cuda-12+.
Differential Revision: https://reviews.llvm.org/D152199
|
|
Revision tags: llvmorg-16.0.5 |
|
| #
6963c61f |
| 17-May-2023 |
Artem Belevich <tra@google.com> |
[NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*
The optional argument is needed for CUDA-11+ headers when we're compiling for sm_80+ GPUs.
Differential Revision: https://reviews.llvm.org/D150820
|
| #
0e43eb24 |
| 18-May-2023 |
Artem Belevich <tra@google.com> |
Revert "[NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*"
Breaks MLIR which happens to be using the intrinsics.
This reverts commit e7b9c2f00fa04ef8d9b69ee0e36d7775823dbe6b.
|
| #
e7b9c2f0 |
| 17-May-2023 |
Artem Belevich <tra@google.com> |
[NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*
The optional argument is needed for CUDA-11+ headers when we're compiling for sm_80+ GPUs.
Differential Revision: https://reviews.llvm.org/D150820
|
|
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
9b81548a |
| 19-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[NVPTX] Convert some tests to opaque pointers (NFC)
|
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3 |
|
| #
0f1b5f11 |
| 27-Apr-2022 |
Andrew Savonichev <andrew.savonichev@gmail.com> |
[NVPTX] Integrate ptxas to LIT tests
ptxas is a proprietary compiler from Nvidia that can compile PTX to machine code (SASS). It has a lot of diagnostics to catch errors in PTX, which can be used to verify PTX output from llc.
Set the -DPTXAS_EXECUTABLE=/path/to/ptxas CMake option to enable it. If this option is not set, ptxas is substituted with `true`, which effectively disables all ptxas RUN lines.
The LLVM_PTXAS_EXECUTABLE environment variable takes precedence over the CMake option and allows overriding the ptxas executable used for LIT without a complete reconfiguration.
Differential Revision: https://reviews.llvm.org/D121727
|
|
Revision tags: llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
02c24688 |
| 17-May-2021 |
Stuart Adams <stuart.adams@codeplay.com> |
[Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions
Adds NVPTX builtins and intrinsics for the CUDA PTX `cp.async` instructions for `sm_80` architecture or newer.
PTX ISA description of `cp.async`:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-asynchronous-copy
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive
Authored-by: Stuart Adams <stuart.adams@codeplay.com> Co-Authored-by: Alexander Johnston <alexander@codeplay.com>
Differential Revision: https://reviews.llvm.org/D100394
|