History log of /llvm-project/llvm/test/CodeGen/NVPTX/local-stack-frame.ll (Results 1 – 13 of 13)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 4583f6d3 08-Jan-2025 Alex MacLean <amaclean@nvidia.com>

[NVPTX] Switch front-ends and tests to ptx_kernel cc (#120806)

the `ptx_kernel` calling convention is a more idiomatic and standard way
of specifying a NVPTX kernel than using the metadata which is

[NVPTX] Switch front-ends and tests to ptx_kernel cc (#120806)

the `ptx_kernel` calling convention is a more idiomatic and standard way
of specifying a NVPTX kernel than using the metadata which is not
supposed to change the meaning of the program. Further, checking the
calling convention is significantly faster than traversing the metadata,
improving compile time.

This change updates the clang and mlir frontends as well as the
NVPTXCtorDtorLowering pass to emit kernels using the calling convention.
In addition, this updates all NVPTX unit tests to use the calling
convention as well.

show more ...


Revision tags: llvmorg-19.1.6
# b279f6b0 15-Dec-2024 Fangrui Song <i@maskray.me>

[NVPTX,test] Change llc -march= to -mtriple=

Similar to 806761a7629df268c8aed49657aeccffa6bca449

-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the de

[NVPTX,test] Change llc -march= to -mtriple=

Similar to 806761a7629df268c8aed49657aeccffa6bca449

-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple (e.g. Windows, macOS),
leaving a target triple which may not make sense.

Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
nvptx{,64}-apple-darwin as ELF instead of rejecting it outrightly.

show more ...


Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3
# 0f0a96b8 19-Oct-2024 Youngsuk Kim <youngsuk.kim@hpe.com>

[llvm][NVPTX] Strip unneeded '+0' in PTX load/store (#113017)

Remove the extraneous '+0' immediate offset part in PTX load/stores, to
improve readability of output PTX code.


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6
# ef8655ad 05-Jun-2023 Artem Belevich <tra@google.com>

[NVPTX] Adapt tests to make them usable with CUDA-12.x

CUDA-12 no longer supports 32-bit compilation.

Tests agnostic to 32/64 compilation mode are switched to use nvptx64.
Tests that do care about

[NVPTX] Adapt tests to make them usable with CUDA-12.x

CUDA-12 no longer supports 32-bit compilation.

Tests agnostic to 32/64 compilation mode are switched to use nvptx64.
Tests that do care about it have 32-bit ptxas compilation disabled with cuda-12+.

Differential Revision: https://reviews.llvm.org/D152199

show more ...


Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 9b81548a 19-Dec-2022 Nikita Popov <npopov@redhat.com>

[NVPTX] Convert some tests to opaque pointers (NFC)


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3
# 0f1b5f11 27-Apr-2022 Andrew Savonichev <andrew.savonichev@gmail.com>

[NVPTX] Integrate ptxas to LIT tests

ptxas is a proprietary compiler from Nvidia that can compile PTX to
machine code (SASS). It has a lot of diagnostics to catch errors
in PTX, which can be used to

[NVPTX] Integrate ptxas to LIT tests

ptxas is a proprietary compiler from Nvidia that can compile PTX to
machine code (SASS). It has a lot of diagnostics to catch errors
in PTX, which can be used to verify PTX output from llc.

Set -DPXTAS_EXECUTABLE=/path/to/ptxas CMake option to enable it.
If this option is not set, then ptxas is substituted to true which
effectively disables all ptxas RUN lines.

LLVM_PTXAS_EXECUTABLE environment variable takes precedence over
the CMake option, and allows to override ptxas executable that is used for LIT
without complete re-configuration.

Differential Revision: https://reviews.llvm.org/D121727

show more ...


Revision tags: llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 51eefa81 14-Oct-2021 Andrew Savonichev <andrew.savonichev@gmail.com>

[NVPTX] Add VRFrame and VRFrameLocal to integer register classes

These registers are used as operands for instructions that expect an
integer register, so they should be added to Int32Regs or Int64R

[NVPTX] Add VRFrame and VRFrameLocal to integer register classes

These registers are used as operands for instructions that expect an
integer register, so they should be added to Int32Regs or Int64Regs
register classes. Otherwise the machine verifier emits an error for
the following LIT tests when LLVM_ENABLE_MACHINE_VERIFIER=1
environment variable is set:

*** Bad machine code: Illegal physical register for instruction ***
- function: kernel_func
- basic block: %bb.0 entry (0x55c8903d5438)
- instruction: %3:int64regs = LEA_ADDRi64 $vrframelocal, 0
- operand 1: $vrframelocal
$vrframelocal is not a Int64Regs register.

CodeGen/NVPTX/call-with-alloca-buffer.ll
CodeGen/NVPTX/disable-opt.ll
CodeGen/NVPTX/lower-alloca.ll
CodeGen/NVPTX/lower-args.ll
CodeGen/NVPTX/param-align.ll
CodeGen/NVPTX/reg-types.ll
DebugInfo/NVPTX/dbg-declare-alloca.ll
DebugInfo/NVPTX/dbg-value-const-byref.ll

Differential Revision: https://reviews.llvm.org/D110164

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1, llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1
# 77b5b385 01-Jul-2015 Jingyue Wu <jingyue@google.com>

[NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass

Summary:
Offset of frame index is calculated by NVPTXPrologEpilogPass. Before
that the correct offset of stack objects cannot be obtained, whic

[NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass

Summary:
Offset of frame index is calculated by NVPTXPrologEpilogPass. Before
that the correct offset of stack objects cannot be obtained, which
leads to wrong offset if there are more than 2 frame objects. This patch
move NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index
is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check
VRFrame register instead, and try to remove the VRFrame if there
is no usage after NVPTXPeephole pass.

Patched by Xuetian Weng.

Test Plan:
Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the
offset calculation based on SP and SPL.

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10853

llvm-svn: 241185

show more ...


# 9fe08c4b 30-Jun-2015 Jingyue Wu <jingyue@google.com>

[NVPTX] Fix issue introduced in D10321

Summary:
Really check if %SP is not used in other places, instead of checking only exact
one non-dbg use.

Patched by Xuetian Weng.

Test Plan:
@foo4 in test/

[NVPTX] Fix issue introduced in D10321

Summary:
Really check if %SP is not used in other places, instead of checking only exact
one non-dbg use.

Patched by Xuetian Weng.

Test Plan:
@foo4 in test/CodeGen/NVPTX/local-stack-frame.ll, create a case that
SP will appear twice.

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, sfantao, jholewinski

Differential Revision: http://reviews.llvm.org/D10844

llvm-svn: 241099

show more ...


# 9c71150b 24-Jun-2015 Jingyue Wu <jingyue@google.com>

Add NVPTXPeephole pass to reduce unnecessary address cast

Summary:
This patch first change the register that holds local address for stack
frame to %SPL. Then the new NVPTXPeephole pass will try to

Add NVPTXPeephole pass to reduce unnecessary address cast

Summary:
This patch first change the register that holds local address for stack
frame to %SPL. Then the new NVPTXPeephole pass will try to scan the
following pattern

%vreg0<def> = LEA_ADDRi64 <fi#0>, 4
%vreg1<def> = cvta_to_local %vreg0

and transform it into

%vreg1<def> = LEA_ADDRi64 %VRFrameLocal, 4

Patched by Xuetian Weng

Test Plan: test/CodeGen/NVPTX/local-stack-frame.ll

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: eliben, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10549

llvm-svn: 240587

show more ...


Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1, llvmorg-3.6.1, llvmorg-3.6.1-rc1, llvmorg-3.5.2, llvmorg-3.5.2-rc1, llvmorg-3.6.0, llvmorg-3.6.0-rc4, llvmorg-3.6.0-rc3, llvmorg-3.6.0-rc2, llvmorg-3.6.0-rc1, llvmorg-3.5.1, llvmorg-3.5.1-rc2, llvmorg-3.5.1-rc1, llvmorg-3.5.0, llvmorg-3.5.0-rc4, llvmorg-3.5.0-rc3, llvmorg-3.5.0-rc2, llvmorg-3.5.0-rc1
# 3e037d98 16-Jul-2014 Justin Holewinski <jholewinski@nvidia.com>

[NVPTX] Rename registers %fl -> %fd and %rl -> %rd

This matches the internal behavior of NVIDIA tools like libnvvm.

llvm-svn: 213168


Revision tags: llvmorg-3.4.2, llvmorg-3.4.2-rc1, llvmorg-3.4.1, llvmorg-3.4.1-rc2
# 7cd70df7 21-Apr-2014 Eli Bendersky <eliben@google.com>

Fix the test: DCE optimized away everything.

Use volatile store to protect the generated PTX from DCE.

Patch by Jingyue Wu.

llvm-svn: 206763


Revision tags: llvmorg-3.4.1-rc1, llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1
# 871ec939 06-Aug-2013 Justin Holewinski <jholewinski@nvidia.com>

[NVPTX] Fix bug in stack code generation causes by MC conversion

We do use a very small set of physical registers, so account for
them in the virtual register encoding between MachineInstr and MC

l

[NVPTX] Fix bug in stack code generation causes by MC conversion

We do use a very small set of physical registers, so account for
them in the virtual register encoding between MachineInstr and MC

llvm-svn: 187799

show more ...