|
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
| #
b279f6b0 |
| 15-Dec-2024 |
Fangrui Song <i@maskray.me> |
[NVPTX,test] Change llc -march= to -mtriple=
Similar to 806761a7629df268c8aed49657aeccffa6bca449
-mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple (e.g. Windows, macOS), leaving a target triple which may not make sense.
Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize nvptx{,64}-apple-darwin as ELF instead of rejecting it outright.
|
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3 |
|
| #
0f0a96b8 |
| 19-Oct-2024 |
Youngsuk Kim <youngsuk.kim@hpe.com> |
[llvm][NVPTX] Strip unneeded '+0' in PTX load/store (#113017)
Remove the extraneous '+0' immediate offset part in PTX load/stores, to
improve readability of output PTX code.
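
The effect of this change can be illustrated with a small sketch. `formatAddress` below is a hypothetical helper written for this note, not the function used by the NVPTX backend; it only shows the idea of omitting the immediate offset when it is zero. How negative offsets are printed here is simply a result of string concatenation and is not claimed to match LLVM's output.

```cpp
#include <cstdint>
#include <string>

// Hypothetical formatter (not LLVM's implementation): builds a PTX-style
// address operand from a base name and an immediate byte offset.
std::string formatAddress(const std::string &base, std::int32_t immOff) {
  // A zero offset adds no information, so print only the base.
  if (immOff == 0)
    return "[" + base + "]";
  // Otherwise append the offset after a '+' separator.
  return "[" + base + "+" + std::to_string(immOff) + "]";
}
```

Under these assumptions, `formatAddress("%rd1", 0)` yields `[%rd1]` rather than `[%rd1+0]`, while `formatAddress("%rd1", 4)` still yields `[%rd1+4]`.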
|
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
| #
4d8e42ea |
| 20-Jul-2024 |
Alex MacLean <amaclean@nvidia.com> |
[NVPTX] enforce signed 32 bit type for immediate offset (#99682)
The NVPTX ISA states that an immOff must fit in a signed 32-bit integer
(https://docs.nvidia.com/cuda/parallel-thread-execution/#addresses-as-operands):
> `[reg+immOff]`
>
> a sum of register `reg` containing a byte address plus a constant
> integer byte offset (signed, 32-bit).
>
> `[var+immOff]`
>
> a sum of address of addressable variable `var` containing a byte
> address plus a constant integer byte offset (signed, 32-bit).
Currently we do not consider this constraint, meaning that in some edge
cases we generate invalid PTX when a value is offset by a very large
immediate.
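
The constraint quoted above amounts to a range check on the byte offset. The following is a minimal sketch of such a check; `fitsInSigned32` is a hypothetical name chosen for illustration and is not taken from the patch itself. A backend that applied this kind of check could fall back to computing the address in a register when the offset does not fit.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not the code from the patch): returns true when a
// 64-bit byte offset can be represented as a signed 32-bit immediate.
bool fitsInSigned32(std::int64_t offset) {
  return offset >= std::numeric_limits<std::int32_t>::min() &&
         offset <= std::numeric_limits<std::int32_t>::max();
}
```

For example, an offset of 2147483647 passes the check, while 2147483648 does not and so could not be encoded as an `immOff`.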
|