|
Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
| #
b2adeae8 |
| 03-Jan-2025 |
Jun Wang <jwang86@yahoo.com> |
[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)
For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s
[AMDGPU][MC] Allow null where 128b or larger dst reg is expected (#115200)
For GFX10+, currently null cannot be used as dst reg in instructions
that expect the dst reg to be 128b or larger (e.g., s_load_dwordx4).
This patch fixes this problem while ensuring null cannot be used as S#,
T#, or V#.
show more ...
|
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
| #
6548b635 |
| 09-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
|
| #
ca33649a |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.
|
| #
e215a1e2 |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)
|
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1 |
|
| #
f6a8eb98 |
| 24-Sep-2024 |
Jun Wang <jwang86@yahoo.com> |
[AMDGPU][MC] Disallow null as saddr in flat instructions (#101730)
Some flat instructions have an saddr operand. When 'null' is provided as
saddr, it may have the same encoding as another instructi
[AMDGPU][MC] Disallow null as saddr in flat instructions (#101730)
Some flat instructions have an saddr operand. When 'null' is provided as
saddr, it may have the same encoding as another instruction. For
example, the instructions 'global_atomic_add v1, v2, null' and
'global_atomic_add v[1:2], v2, off' have the same encoding. This patch
disallows having null as saddr.
show more ...
|
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
| #
b1bcb7ca |
| 15-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.
Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.
show more ...
|
| #
adaff46d |
| 15-Jul-2024 |
dyung <douglas.yung@sony.com> |
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.
The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614
These bots have been broken for a day, so reverting to get everything
back to green.
show more ...
|
| #
78bc1b64 |
| 14-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.
show more ...
|
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
| #
b6daac02 |
| 29-Dec-2023 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU][True16] Remove the VGPR_LO/HI16 register classes. (#76500)
|
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2 |
|
| #
469b3bfa |
| 21-Sep-2023 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[AMDGPU] Add True16 register classes.
Reviewed By: rampitec, Joe_Nash
Differential Revision: https://reviews.llvm.org/D156099
|
|
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
| #
e4284a7c |
| 25-May-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] 4-align SGPR triples
Previously SGPR triples like s[3:5] were aligned on a 3-SGPR boundary which has no basis in hardware.
Aligning them on a 4-SGPR boundary is at least justified by the a
[AMDGPU] 4-align SGPR triples
Previously SGPR triples like s[3:5] were aligned on a 3-SGPR boundary which has no basis in hardware.
Aligning them on a 4-SGPR boundary is at least justified by the architecture reference guide which says: "Quad-alignment of SGPRs is required for operation on more than 64-bits".
Currently there are no instructions that take SGPR triples as operands so the issue is latent.
Differential Revision: https://reviews.llvm.org/D151463
show more ...
|
|
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3 |
|
| #
1ab8b9ae |
| 27-Apr-2023 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Define sub-class of SGPR_64 for tail call return
Summary: Registers for tail call return should not be clobbered by callee. So we need a sub-class of SGPR_64 (excluding callee saved regist
AMDGPU: Define sub-class of SGPR_64 for tail call return
Summary: Registers for tail call return should not be clobbered by callee. So we need a sub-class of SGPR_64 (excluding callee saved registers (CSR)) to hold the tail call return address.
Because GFX and C calling conventions have different CSR, we need to define the sub-class separately. This work is an extension of D147096 with the consideration of GFX calling convention.
Based on the calling conventions, different instructions will be selected with different sub-class of SGPR_64 as the input.
Reviewers: arsenm, cdevadas and sebastian-ne
Differential Revision: https://reviews.llvm.org/D148824
show more ...
|
|
Revision tags: llvmorg-16.0.2 |
|
| #
3bc1e084 |
| 10-Apr-2023 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Created a subclass for the return address operand in the tail call return instruction
Summary: This is to avoid using the callee saved registers for the return address of the tail call ret
AMDGPU: Created a subclass for the return address operand in the tail call return instruction
Summary: This is to avoid using the callee saved registers for the return address of the tail call return instruction.
Reviewers: arsenm, cdevadas
Differential Revision: https://reviews.llvm.org/D147096
show more ...
|
|
Revision tags: llvmorg-16.0.1 |
|
| #
282b8ac1 |
| 04-Apr-2023 |
Changpeng Fang <changpeng.fang@amd.com> |
Revert "AMDGPU: Created a subclass for the return address operand in the tail call return instruction"
This reverts commit 461a559bc9bd755436ba8f12f8b74757e03f9b9f.
|
| #
461a559b |
| 04-Apr-2023 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Created a subclass for the return address operand in the tail call return instruction
Summary: This is to avoid using the callee saved registers for the return address of the tail call ret
AMDGPU: Created a subclass for the return address operand in the tail call return instruction
Summary: This is to avoid using the callee saved registers for the return address of the tail call return instruction.
Reviewers: arsenm, cdevadas
Differential Revision: https://reviews.llvm.org/D147096
show more ...
|
| #
ec1f5b94 |
| 01-Apr-2023 |
Aaron Ballman <aaron@aaronballman.com> |
Revert "AMDGPU: Created a subclass for the return address operand in the tail call return instruction"
This reverts commit 7a98934fadc3581ff024a77dc696b62f1a538ad5.
This appears to have broken seve
Revert "AMDGPU: Created a subclass for the return address operand in the tail call return instruction"
This reverts commit 7a98934fadc3581ff024a77dc696b62f1a538ad5.
This appears to have broken several bots, including: https://lab.llvm.org/buildbot/#/builders/42/builds/9472
show more ...
|
| #
7a98934f |
| 30-Mar-2023 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Created a subclass for the return address operand in the tail call return instruction
Summary: This is to avoid using the callee saved registers for the return address of the tail call ret
AMDGPU: Created a subclass for the return address operand in the tail call return instruction
Summary: This is to avoid using the callee saved registers for the return address of the tail call return instruction.
Reviewers: arsenm, cdevadas
Differential Revision: https://reviews.llvm.org/D147096
show more ...
|
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
113aafbf |
| 14-Dec-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Clean up SReg classes
Remove unused LO16 classes SReg_LO16_XM0_XEXEC, SReg_LO16_XEXEC_HI and SReg_LO16_XM0.
Simplify the definition of SReg_32.
Add SReg_32_XEXEC and use it to improve SRe
[AMDGPU] Clean up SReg classes
Remove unused LO16 classes SReg_LO16_XM0_XEXEC, SReg_LO16_XEXEC_HI and SReg_LO16_XM0.
Simplify the definition of SReg_32.
Add SReg_32_XEXEC and use it to improve SReg_1_XEXEC which previously excluded M0 for no good reason.
Improve SReg_1 which previously excluded EXEC_HI for no good reason.
Differential Revision: https://reviews.llvm.org/D140012
show more ...
|
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
b982ba2a |
| 13-Jul-2022 |
Joe Nash <Joseph.Nash@amd.com> |
[AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C
Due to the encoding changes in GFX11, we had a hack in place that disables the use of VGPRs above 128. This patch removes the need for that
[AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C
Due to the encoding changes in GFX11, we had a hack in place that disables the use of VGPRs above 128. This patch removes the need for that hack.
We introduce a new register class VGPR_32_Lo128 which is used for 16-bit operands of VOP1, VOP2, and VOPC instructions. This register class only has the low 128 VGPRs, but is otherwise identical to VGPR_32. Therefore, 16-bit VOP1, VOP2, and VOPC instructions are correctly limited to use the first 128 VGPRs, while the other instructions can freely use all 256.
We introduce new pseduo-instructions used on GFX11 which have the suffix t16 (True 16) to use the VGPR_32_Lo128 register class.
Reviewed By: foad, rampitec, #amdgpu
Differential Revision: https://reviews.llvm.org/D133723
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
04fff547 |
| 07-Mar-2022 |
Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu@amd.com> |
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added a
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls.
But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow.
This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering.
And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`.
As an added benefit, this patch simplifies overall return instruction handling.
Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code.
Reviewed By: arsenm, ronlieb
Differential Revision: https://reviews.llvm.org/D114652
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
09b53296 |
| 22-Dec-2021 |
Ron Lieberman <Ron.Lieberman@amd.com> |
Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range"
This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3.
Failed amdgpu runtime buildbot # 3514
|
| #
9075009d |
| 22-Dec-2021 |
RamNalamothu <VenkataRamanaiah.Nalamothu@amd.com> |
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added a
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls.
But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow.
This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering.
And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`.
As an added benefit, this patch simplifies overall return instruction handling.
Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D114652
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
654c89d8 |
| 06-Sep-2021 |
Christudasan Devadasan <Christudasan.Devadasan@amd.com> |
[AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to
[AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to enable copy between VGPR and AGPR registers during regalloc.
Also, added the missing AV register classes from 192b to 1024b.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D109300
show more ...
|
| #
76cbe622 |
| 25-Oct-2021 |
Thomas Symalla <thomas.symalla@amd.com> |
[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not goin
[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not going to change the argument values.
This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-save and leaves some SGPRs usable for callers. The intention is to avoid unneccessary s_mov instructions for arguments the caller would otherwise save and restore in these registers.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D111637
show more ...
|
| #
f0331100 |
| 25-Oct-2021 |
Thomas Symalla <thomas.symalla@amd.com> |
[AMDGPU] Regenerate some tests with the current version of update_mir_test_checks.py
|