History log of /llvm-project/llvm/test/CodeGen/AMDGPU/spill-vector-superclass.ll (Results 1 – 20 of 20)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 6548b635 09-Nov-2024 Shilei Tian <i@tianshilei.me>

Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.


# ca33649a 08-Nov-2024 Shilei Tian <i@tianshilei.me>

Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
hip and openmp buildbots.


# e215a1e2 08-Nov-2024 Shilei Tian <i@tianshilei.me>

[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1
# ac0f64f0 30-Sep-2024 Christudasan Devadasan <christudasan.devadasan@amd.com>

[AMDGPU] Split vgpr regalloc pipeline (#93526)

Allocating wwm-registers and per-thread VGPR operands
together imposes many challenges in the way the
registers are reused during allocation. There a

[AMDGPU] Split vgpr regalloc pipeline (#93526)

Allocating wwm-registers and per-thread VGPR operands
together imposes many challenges in the way the
registers are reused during allocation. There are
times when regalloc reuses the registers of regular
VGPRs operations for wwm-operations in a small range
leading to unwantedly clobbering their inactive lanes
causing correctness issues that are hard to trace.

This patch splits the VGPR allocation pipeline further
to allocate wwm-registers first and the regular VGPR
operands in a separate pipeline. The splitting would
ensure that the physical registers used for wwm
allocations won't take part in the next allocation
pipeline to avoid any such clobbering.

show more ...


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# b1bcb7ca 15-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799

Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.

show more ...


# adaff46d 15-Jul-2024 dyung <douglas.yung@sony.com>

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0

Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614

These bots have been broken for a day, so reverting to get everything
back to green.

show more ...


# 78bc1b64 14-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

AMDGPU: Move attributor into optimization pipeline (#83131)

Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.

Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# b6daac02 29-Dec-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][True16] Remove the VGPR_LO/HI16 register classes. (#76500)


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2
# 469b3bfa 21-Sep-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU] Add True16 register classes.

Reviewed By: rampitec, Joe_Nash

Differential Revision: https://reviews.llvm.org/D156099


Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 7dcb9c0f 20-Mar-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

InlineSpiller: Consider copy bundles when looking for snippet copies

This was looking for full copies produced by SplitKit, but SplitKit
introduces copy bundles if not all lanes are live. The scan f

InlineSpiller: Consider copy bundles when looking for snippet copies

This was looking for full copies produced by SplitKit, but SplitKit
introduces copy bundles if not all lanes are live. The scan for uses
needs to look at bundles, not individual instructions.

This is a prerequisite to avoiding some redundant spills due to
subregisters which will help avoid an allocation failure in a future
patch.

show more ...


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# e7f080b3 18-Jan-2023 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Introduce separate register limit bias in scheduler

Current implementation abuses ErrorMargin to apply an additional
bias to VGPR and SGPR limits under a high register pressure. The
ErrorMa

[AMDGPU] Introduce separate register limit bias in scheduler

Current implementation abuses ErrorMargin to apply an additional
bias to VGPR and SGPR limits under a high register pressure. The
ErrorMargin exists to account for inaccuracies of the RP tracker
and not to tackle an excess pressure. Introduce separate bias for
this purpose and also make it different for SGPRs and VGPRs as we
may want to use different values in the future.

This is supposed to be NFC, however there is a subtle difference
when subtracting a margin overflows the limit. Doing two subtractions
makes it less probable, although manifests only in mir tests with
an artificially small register budget.

Differential Revision: https://reviews.llvm.org/D142051

show more ...


Revision tags: llvmorg-15.0.7
# ce096b22 19-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some tests to opaque pointers

These required update_mir_test_checks.


# 113aafbf 14-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Clean up SReg classes

Remove unused LO16 classes SReg_LO16_XM0_XEXEC, SReg_LO16_XEXEC_HI and
SReg_LO16_XM0.

Simplify the definition of SReg_32.

Add SReg_32_XEXEC and use it to improve SRe

[AMDGPU] Clean up SReg classes

Remove unused LO16 classes SReg_LO16_XM0_XEXEC, SReg_LO16_XEXEC_HI and
SReg_LO16_XM0.

Simplify the definition of SReg_32.

Add SReg_32_XEXEC and use it to improve SReg_1_XEXEC which previously
excluded M0 for no good reason.

Improve SReg_1 which previously excluded EXEC_HI for no good reason.

Differential Revision: https://reviews.llvm.org/D140012

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# b982ba2a 13-Jul-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C

Due to the encoding changes in GFX11, we had a hack in place that
disables the use of VGPRs above 128. This patch removes the need for
that

[AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C

Due to the encoding changes in GFX11, we had a hack in place that
disables the use of VGPRs above 128. This patch removes the need for
that hack.

We introduce a new register class VGPR_32_Lo128 which is used for 16-bit
operands of VOP1, VOP2, and VOPC instructions. This register class only has the
low 128 VGPRs, but is otherwise identical to VGPR_32. Therefore, 16-bit VOP1,
VOP2, and VOPC instructions are correctly limited to use the first 128
VGPRs, while the other instructions can freely use all 256.

We introduce new pseduo-instructions used on GFX11 which have the suffix
t16 (True 16) to use the VGPR_32_Lo128 register class.

Reviewed By: foad, rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D133723

show more ...


# b92161f9 11-Aug-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] Autogenerate spill-vector-superclass. NFC

This test is already a subset of the autogenerated test lines, so truly
auto-generate it to make it easier to update.


# a321d95b 01-Aug-2022 Alexander Timofeev <alexander.timofeev@amd.com>

[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs

In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the c

[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs

In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues.
At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130367

show more ...


# d7ae1a90 29-Jul-2022 Alexander Timofeev <alexander.timofeev@amd.com>

Revert "[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs"

This reverts commit 76d9ae924cc361578ecbb5688559f7cebc78ab87.
because it causes several VK CTS tests to fail


# 76d9ae92 26-Jul-2022 Alexander Timofeev <alexander.timofeev@amd.com>

[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs

In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the c

[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs

In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues.
At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130367

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# cf58b9ce 09-Dec-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Add AV class spill pseudo instructions

While enabling vector superclasses with D109301,
the AV spills are converted into VGPR spills by
introducing appropriate copies. The whole thing
ended

[AMDGPU] Add AV class spill pseudo instructions

While enabling vector superclasses with D109301,
the AV spills are converted into VGPR spills by
introducing appropriate copies. The whole thing
ended up adding two instructions per spill (a copy
+ vgpr spill pseudo) and caused an incorrect
liverange update during inline spiller.

This patch adds the pseudo instructions for all
AV spills from 32b to 1024b and handles them in
the way all other spills are lowered.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D115439

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 5297cbf0 06-Sep-2021 Christudasan Devadasan <Christudasan.Devadasan@amd.com>

[AMDGPU] Enable copy between VGPR and AGPR classes during regalloc

Greedy register allocator prefers to move a constrained
live range into a larger allocatable class over spilling
them. This patch d

[AMDGPU] Enable copy between VGPR and AGPR classes during regalloc

Greedy register allocator prefers to move a constrained
live range into a larger allocatable class over spilling
them. This patch defines the necessary superclasses for
vector registers. For subtargets that support copy between
VGPRs and AGPRs, the vector register spills during regalloc
now become just copies.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D109301

show more ...