Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
44be5a7f |
| 06-Mar-2024 |
David Green <david.green@arm.com> |
[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875)
This is another part of #70452 which makes getMemOperandsWithOffsetWidth
use a LocationSize for Width, as opposed to t
[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875)
This is another part of #70452 which makes getMemOperandsWithOffsetWidth
use a LocationSize for Width, as opposed to the unsigned it currently
uses. The advantages on it's own are not super high if
getMemOperandsWithOffsetWidth usually uses known sizes, but if the
values can come from an MMO it can help be more accurate in case they
are Unknown (and in the future, scalable).
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
b4972341 |
| 15-Feb-2024 |
Krzysztof Drewniak <Krzysztof.Drewniak@amd.com> |
[AMDGPU] Make maximum hard clause size a subtarget feature (#81287)
gfx11 chips may, in some conditions, behave incorrectly with S_CLAUSE
instructions (hard clauses) containing more than 32 operati
[AMDGPU] Make maximum hard clause size a subtarget feature (#81287)
gfx11 chips may, in some conditions, behave incorrectly with S_CLAUSE
instructions (hard clauses) containing more than 32 operations (that is,
whose arguments exceed 0x1f). However, gfx10 targets will work
successfully with clauses of up to length 63.
Therefore, define the MaxHardClauseLength property on GCNSubtarget and
make it a subtarget feature via tablegen, thus allowing us to specify,
both now and in the future, the maximum viable size of clauses on
various hardware from the tablegen definition. If MaxHardClauseLength is
0, which is the default, the hardware does not support hard clauses.
show more ...
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
b7173652 |
| 06-Dec-2023 |
Alex Bradbury <asb@igalia.com> |
[MachineScheduler][NFCI] Add Offset and OffsetIsScalable args to shouldClusterMemOps (#73778)
These are picked up from getMemOperandsWithOffsetWidth but weren't then
being passed through to shouldC
[MachineScheduler][NFCI] Add Offset and OffsetIsScalable args to shouldClusterMemOps (#73778)
These are picked up from getMemOperandsWithOffsetWidth but weren't then
being passed through to shouldClusterMemOps, which forces backends to
collect the information again if they want to use the kind of heuristics
typically used for the similar shouldScheduleLoadsNear function (e.g.
checking the offset is within 1 cache line).
This patch just adds the parameters, but doesn't attempt to use them.
There is potential to use them in the current PPC and AArch64
shouldClusterMemOps implementation, and I intend to use the offset in
the heuristic for RISC-V. I've left these for future patches in the
interest of being as incremental as possible.
As noted in the review and in an inline FIXME, an ElementCount-style abstraction may later be used to condense these two parameters to one argument. ElementCount isn't quite suitable as it doesn't support negative offsets.
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
#
ffe86e3b |
| 09-Jun-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Update SIInsertHardClauses for GFX11
Changes for GFX11: - Clauses may not mix instructions of different types, and there are more types. For example image instructions with and without a
[AMDGPU] Update SIInsertHardClauses for GFX11
Changes for GFX11: - Clauses may not mix instructions of different types, and there are more types. For example image instructions with and without a sampler are now different types. - The max size of a clause is explicitly documented as 63 instructions. Previously it was implicitly assumed to be 64. This is such a tiny difference that it does not seem worth making it conditional on the subtarget. - It can be beneficial to clause stores as well as loads.
Differential Revision: https://reviews.llvm.org/D127391
show more ...
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
show more ...
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
#
a278250b |
| 10-Mar-2022 |
Nico Weber <thakis@chromium.org> |
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https:/
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
show more ...
|
#
7f230fee |
| 07-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup codegen includes
after: 1061034926 before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
#
bf980930 |
| 16-Jul-2021 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Ignore KILLs when forming clauses
KILL instructions are sometimes present and prevented hard clauses from being formed.
Fix this by ignoring all meta instructions in clauses.
Differential
[AMDGPU] Ignore KILLs when forming clauses
KILL instructions are sometimes present and prevented hard clauses from being formed.
Fix this by ignoring all meta instructions in clauses.
Differential Revision: https://reviews.llvm.org/D106042
show more ...
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
#
9cf6ff7a |
| 14-May-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Do not clause NSA instructions
To ensure correct behaviour NSA instructions should not be claused.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D102211
|
#
9e026273 |
| 06-May-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] SIInsertHardClauses: move more stuff into the class. NFC.
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
Revision tags: llvmorg-11.1.0-rc1 |
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
0ed2c046 |
| 01-Jun-2020 |
hsmahesha <mahesha.comp@gmail.com> |
[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes
Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on
[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes
Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on max number of mem ops that can be clustered. This patch adds support to pass number of clustered bytes to target mem ops clustering logic.
Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar
Reviewed By: foad
Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80545
show more ...
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
10c10f24 |
| 15-May-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Fix assertion failure in SIInsertHardClauses
This new pass failed an assertion whenever there were s_nops after the end of clause.
Differential Revision: https://reviews.llvm.org/D80007
|
#
42a55605 |
| 01-May-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] New SIInsertHardClauses pass
Enable clausing of memory loads on gfx10 by adding a new pass to insert the s_clause instructions that mark the start of each hard clause.
Differential Revisio
[AMDGPU] New SIInsertHardClauses pass
Enable clausing of memory loads on gfx10 by adding a new pass to insert the s_clause instructions that mark the start of each hard clause.
Differential Revision: https://reviews.llvm.org/D79792
show more ...
|