History log of /llvm-project/llvm/test/CodeGen/AMDGPU/inline-asm.i128.ll (Results 1 – 25 of 29)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# b2adeae8 03-Jan-2025 Jun Wang <jwang86@yahoo.com>

[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)

For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s_load_dwordx4).
This patch fixes this problem while ensuring null cannot be used as S#,
T#, or V#.


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 6548b635 09-Nov-2024 Shilei Tian <i@tianshilei.me>

Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.


# ca33649a 08-Nov-2024 Shilei Tian <i@tianshilei.me>

Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
hip and openmp buildbots.


# e215a1e2 08-Nov-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# f6a8eb98 24-Sep-2024 Jun Wang <jwang86@yahoo.com>

[AMDGPU][MC] Disallow null as saddr in flat instructions (#101730)

Some flat instructions have an saddr operand. When 'null' is provided as
saddr, it may have the same encoding as another instruction. For
example, the instructions 'global_atomic_add v1, v2, null' and
'global_atomic_add v[1:2], v2, off' have the same encoding. This patch
disallows having null as saddr.


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# b1bcb7ca 15-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.


# adaff46d 15-Jul-2024 dyung <douglas.yung@sony.com>

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614

These bots have been broken for a day, so reverting to get everything
back to green.


# 78bc1b64 14-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# b6daac02 29-Dec-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][True16] Remove the VGPR_LO/HI16 register classes. (#76500)


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2
# 469b3bfa 21-Sep-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU] Add True16 register classes.

Reviewed By: rampitec, Joe_Nash

Differential Revision: https://reviews.llvm.org/D156099


Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5
# e4284a7c 25-May-2023 Jay Foad <jay.foad@amd.com>

[AMDGPU] 4-align SGPR triples

Previously SGPR triples like s[3:5] were aligned on a 3-SGPR boundary
which has no basis in hardware.

Aligning them on a 4-SGPR boundary is at least justified by the
architecture reference guide which says: "Quad-alignment of SGPRs is
required for operation on more than 64-bits".

Currently there are no instructions that take SGPR triples as operands
so the issue is latent.

Differential Revision: https://reviews.llvm.org/D151463


Revision tags: llvmorg-16.0.4, llvmorg-16.0.3
# 1ab8b9ae 27-Apr-2023 Changpeng Fang <changpeng.fang@amd.com>

AMDGPU: Define sub-class of SGPR_64 for tail call return

Summary:
Registers for tail call return should not be clobbered by callee.
So we need a sub-class of SGPR_64 (excluding callee saved registers (CSR)) to hold
the tail call return address.

Because GFX and C calling conventions have different CSR, we need to define
the sub-class separately. This work is an extension of D147096 with the
consideration of GFX calling convention.

Based on the calling conventions, different instructions will be selected with
different sub-class of SGPR_64 as the input.

Reviewers: arsenm, cdevadas and sebastian-ne

Differential Revision: https://reviews.llvm.org/D148824


Revision tags: llvmorg-16.0.2
# 3bc1e084 10-Apr-2023 Changpeng Fang <changpeng.fang@amd.com>

AMDGPU: Created a subclass for the return address operand in the tail call return instruction

Summary:
This is to avoid using the callee saved registers for the return address
of the tail call return instruction.

Reviewers:
arsenm, cdevadas

Differential Revision:
https://reviews.llvm.org/D147096


Revision tags: llvmorg-16.0.1
# 282b8ac1 04-Apr-2023 Changpeng Fang <changpeng.fang@amd.com>

Revert "AMDGPU: Created a subclass for the return address operand in the tail call return instruction"

This reverts commit 461a559bc9bd755436ba8f12f8b74757e03f9b9f.


# 461a559b 04-Apr-2023 Changpeng Fang <changpeng.fang@amd.com>

AMDGPU: Created a subclass for the return address operand in the tail call return instruction

Summary:
This is to avoid using the callee saved registers for the return address of the tail call return instruction.

Reviewers:
arsenm, cdevadas

Differential Revision:
https://reviews.llvm.org/D147096


# ec1f5b94 01-Apr-2023 Aaron Ballman <aaron@aaronballman.com>

Revert "AMDGPU: Created a subclass for the return address operand in the tail call return instruction"

This reverts commit 7a98934fadc3581ff024a77dc696b62f1a538ad5.

This appears to have broken several bots, including:
https://lab.llvm.org/buildbot/#/builders/42/builds/9472


# 7a98934f 30-Mar-2023 Changpeng Fang <changpeng.fang@amd.com>

AMDGPU: Created a subclass for the return address operand in the tail call return instruction

Summary:
This is to avoid using the callee saved registers for the return address of the tail call return instruction.

Reviewers:
arsenm, cdevadas

Differential Revision:
https://reviews.llvm.org/D147096


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 113aafbf 14-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Clean up SReg classes

Remove unused LO16 classes SReg_LO16_XM0_XEXEC, SReg_LO16_XEXEC_HI and
SReg_LO16_XM0.

Simplify the definition of SReg_32.

Add SReg_32_XEXEC and use it to improve SReg_1_XEXEC which previously
excluded M0 for no good reason.

Improve SReg_1 which previously excluded EXEC_HI for no good reason.

Differential Revision: https://reviews.llvm.org/D140012


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# b982ba2a 13-Jul-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C

Due to the encoding changes in GFX11, we had a hack in place that
disables the use of VGPRs above 128. This patch removes the need for
that hack.

We introduce a new register class VGPR_32_Lo128 which is used for 16-bit
operands of VOP1, VOP2, and VOPC instructions. This register class only has the
low 128 VGPRs, but is otherwise identical to VGPR_32. Therefore, 16-bit VOP1,
VOP2, and VOPC instructions are correctly limited to use the first 128
VGPRs, while the other instructions can freely use all 256.

We introduce new pseudo-instructions used on GFX11 which have the suffix
t16 (True 16) to use the VGPR_32_Lo128 register class.

Reviewed By: foad, rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D133723


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 04fff547 07-Mar-2022 Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu@amd.com>

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added as a live-in on the function entry to
preserve their value when we have calls so that they get saved and restored
around the calls.

But the DWARF unwind information (CFI) needs to track where the return address
resides in a frame and the above approach makes it difficult to track the
return address when the CFI information is emitted during the frame lowering,
due to the involvement of understanding the control flow.

This patch moves the return address ABI registers s[30:31] into callee saved
registers range and stops adding live-in for return address registers, so that
the CFI machinery will know where the return address resides when CSR
save/restore happen during the frame lowering.

And doing the above poses an issue that now the return instruction uses undefined
register `sgpr30_sgpr31`. This is resolved by hiding the return address register
use by the return instruction through the `SI_RETURN` pseudo instruction, which
doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the
`S_SETPC_B64_return` during the `expandPostRAPseudo()`.

As an added benefit, this patch simplifies overall return instruction handling.

Note: The AMDGPU CFI changes are there only in the downstream code and another
version of this patch will be posted for review for the downstream code.

Reviewed By: arsenm, ronlieb

Differential Revision: https://reviews.llvm.org/D114652


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 09b53296 22-Dec-2021 Ron Lieberman <Ron.Lieberman@amd.com>

Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range"

This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3.

Failed amdgpu runtime buildbot # 3514


# 9075009d 22-Dec-2021 RamNalamothu <VenkataRamanaiah.Nalamothu@amd.com>

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added as a live-in on the function entry to
preserve their value when we have calls so that they get saved and restored
around the calls.

But the DWARF unwind information (CFI) needs to track where the return address
resides in a frame and the above approach makes it difficult to track the
return address when the CFI information is emitted during the frame lowering,
due to the involvement of understanding the control flow.

This patch moves the return address ABI registers s[30:31] into callee saved
registers range and stops adding live-in for return address registers, so that
the CFI machinery will know where the return address resides when CSR
save/restore happen during the frame lowering.

And doing the above poses an issue that now the return instruction uses undefined
register `sgpr30_sgpr31`. This is resolved by hiding the return address register
use by the return instruction through the `SI_RETURN` pseudo instruction, which
doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the
`S_SETPC_B64_return` during the `expandPostRAPseudo()`.

As an added benefit, this patch simplifies overall return instruction handling.

Note: The AMDGPU CFI changes are there only in the downstream code and another
version of this patch will be posted for review for the downstream code.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D114652


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 654c89d8 06-Sep-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Make vector superclasses allocatable

The combined vector register classes with both
VGPRs and AGPRs are currently unallocatable.
This patch turns them into allocatable as a
prerequisite to enable copy between VGPR and
AGPR registers during regalloc.

Also, added the missing AV register classes from
192b to 1024b.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D109300


# 76cbe622 25-Oct-2021 Thomas Symalla <thomas.symalla@amd.com>

[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not going to change the argument values.

This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-save and leaves some SGPRs usable for callers. The intention is to avoid unnecessary s_mov instructions for arguments the caller would otherwise save and restore in these registers.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D111637


# f0331100 25-Oct-2021 Thomas Symalla <thomas.symalla@amd.com>

[AMDGPU] Regenerate some tests with the current version of update_mir_test_checks.py

