History log of /llvm-project/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-preserve-cc.ll (Results 1 – 12 of 12)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init
# 5a81a559 27-Jan-2025 David Green <david.green@arm.com>

[GISel] Explicitly disable BF16 tablegen patterns. (#124113)

We currently have an issue where bf16 patters can be used to match fp16
types, as GISel does not know about the difference between the t

[GISel] Explicitly disable BF16 tablegen patterns. (#124113)

We currently have an issue where bf16 patters can be used to match fp16
types, as GISel does not know about the difference between the two. This
patch explicitly disables them to make sure that they are never used.

The opposite can also happen too, where fp16 patterns are used for
operators that should be bf16. So this also changes any operations with
bf16 types to now cause a fallback to SDAG.

The pass setup for GISel has been slightly adjusted to make sure that a
verify pass does not get added between AMD-SDAG and SIFixSGPRCopiesPass,
which otherwise can cause verifier issues when falling back.

show more ...


Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3
# 7b4c8b35 16-Oct-2024 Brox Chen <guochen2@amd.com>

[AMDGPU][True16][MC] VOP3 profile in True16 format (#109031)

Modify VOP3 profile and pesudo, and add encoding info for VOP3 True16
including DPP and DPP8 in true16 and fake16 format.

This patch

[AMDGPU][True16][MC] VOP3 profile in True16 format (#109031)

Modify VOP3 profile and pesudo, and add encoding info for VOP3 True16
including DPP and DPP8 in true16 and fake16 format.

This patch applies true16/fake16 changes and asm/dasm changes to
V_ADD_NC_U16
V_ADD_NC_I16
V_SUB_NC_U16
V_SUB_NC_I16

show more ...


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# a8203291 26-Jul-2024 Changpeng Fang <changpeng.fang@amd.com>

[AMDGPU] Remove -wavefrontsize32 and -wavefrontsize64 from GFX10+ tests (NFC) (#100711)

They are no longer needed after the patch: [AMDGPU] Remove wavefrontsize
feature from GFX10: https://github.c

[AMDGPU] Remove -wavefrontsize32 and -wavefrontsize64 from GFX10+ tests (NFC) (#100711)

They are no longer needed after the patch: [AMDGPU] Remove wavefrontsize
feature from GFX10: https://github.com/llvm/llvm-project/pull/98400
The exception is when "target-features" are set to "+wavefrontsize32" or
"+wavefrontsize64", we still need to remove a wavefrontsize feature
before add a different one to make sure only one of them are present.

show more ...


Revision tags: llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4
# be36812f 21-Feb-2024 David Majnemer <david.majnemer@gmail.com>

[TargetLowering] Be more efficient in fp -> bf16 NaN conversions

We can avoid masking completely as it is OK (and probably preferable) to
bring over some of the existant NaN payload.


# 9eff001d 21-Feb-2024 David Majnemer <david.majnemer@gmail.com>

[TargetLowering] Correctly yield NaN from FP_TO_BF16

We didn't set the exponent field, resulting in tiny numbers instead of
NaNs.


# cc13f3ba 21-Feb-2024 David Majnemer <david.majnemer@gmail.com>

Correctly round FP -> BF16 when SDAG expands such nodes (#82399)

We did something pretty naive:
- round FP64 -> BF16 by first rounding to FP32
- skip FP32 -> BF16 rounding entirely
- taking the t

Correctly round FP -> BF16 when SDAG expands such nodes (#82399)

We did something pretty naive:
- round FP64 -> BF16 by first rounding to FP32
- skip FP32 -> BF16 rounding entirely
- taking the top 16 bits of a FP32 which will turn some NaNs into
infinities

Let's do this in a more principled way by rounding types with more
precision than FP32 to FP32 using round-inexact-to-odd which will negate
double rounding issues.

show more ...


Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 9e9907f1 17-Jan-2024 Fangrui Song <i@maskray.me>

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while

[AMDGPU,test] Change llc -march= to -mtriple= (#75982)

Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.

This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:

```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```

show more ...


# 460ffcdd 04-Jan-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Make bf16/v2bf16 legal types (#76215)

There are some intrinsics are using i16 vectors in place of bfloat
vectors.
Move towards making bf16 vectors legal so these can migrate. Leave the
la

AMDGPU: Make bf16/v2bf16 legal types (#76215)

There are some intrinsics are using i16 vectors in place of bfloat
vectors.
Move towards making bf16 vectors legal so these can migrate. Leave the
larger vectors for a later change.

Depends #76213 #76214

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2
# fab28e0e 22-Sep-2023 Ivan Kosarev <ivan.kosarev@amd.com>

Reapply "[AMDGPU] Introduce real and keep fake True16 instructions."

Reverts 6cb3866b1ce9d835402e414049478cea82427cf1.

Analysis of failures on buildbots with expensive checks enabled showed
that th

Reapply "[AMDGPU] Introduce real and keep fake True16 instructions."

Reverts 6cb3866b1ce9d835402e414049478cea82427cf1.

Analysis of failures on buildbots with expensive checks enabled showed
that the problem was triggered by changes in another commit,
469b3bfad20550968ac428738eb1f8bb8ce3e96d, and was caused by the bug
addressed in #67245.

show more ...


# 6cb3866b 22-Sep-2023 Ivan Kosarev <ivan.kosarev@amd.com>

Revert "[AMDGPU] Introduce real and keep fake True16 instructions."

This reverts commit 0f864c7b8bc9323293ec3d85f4bd5322f8f61b16 due to
failures on expensive checks.


# 0f864c7b 22-Sep-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU] Introduce real and keep fake True16 instructions.

The existing fake True16 instructions using 32-bit VGPRs are supposed to
co-exist with real ones until all the necessary True16 functionali

[AMDGPU] Introduce real and keep fake True16 instructions.

The existing fake True16 instructions using 32-bit VGPRs are supposed to
co-exist with real ones until all the necessary True16 functionality is
implemented and relevant tests are updated.

Reviewed By: arsenm, Joe_Nash

Differential Revision: https://reviews.llvm.org/D156101

show more ...


Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5
# 26dc2844 30-May-2023 Diana Picus <Diana-Magda.Picus@amd.com>

[AMDGPU] ISel for amdgpu_cs_chain[_preserve] functions

Lower formal arguments and returns for functions with the
`amdgpu_cs_chain` and `amdgpu_cs_chain_preserve` calling conventions:

* Put `inreg`

[AMDGPU] ISel for amdgpu_cs_chain[_preserve] functions

Lower formal arguments and returns for functions with the
`amdgpu_cs_chain` and `amdgpu_cs_chain_preserve` calling conventions:

* Put `inreg` arguments into SGPRs, starting at s0, and other arguments
into VGPRs, starting at v8. No arguments should end up on the stack, if
we don't have enough registers we should error out.

* Lower the return (which is always void) as an S_ENDPGM.

* Set the ScratchRSrc register to s48:51, as described in the docs.

* Set the SP to s32, matching amdgpu_gfx. This might be revisited in a
future patch.

Differential Revision: https://reviews.llvm.org/D153517

show more ...