|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0 |
|
| #
5da7179c |
| 14-Sep-2023 |
Jeffrey Byrnes <Jeffrey.Byrnes@amd.com> |
[AMDGPU] Reland: Add IR LiveReg type-based optimization
|
| #
3e53c97d |
| 29-Jun-2024 |
Vitaly Buka <vitalybuka@google.com> |
Revert "[AMDGPU] Add IR LiveReg type-based optimization" (#97138)
Part of #66838.
https://lab.llvm.org/buildbot/#/builders/52/builds/404
https://lab.llvm.org/buildbot/#/builders/55/builds/358
https://lab.llvm.org/buildbot/#/builders/164/builds/518
This reverts commit ded956440739ae326a99cbaef18ce4362e972679.
|
| #
ded95644 |
| 14-Sep-2023 |
Jeffrey Byrnes <Jeffrey.Byrnes@amd.com> |
[AMDGPU] Add IR LiveReg type-based optimization
Change-Id: Ia0d11b79b8302e79247fe193ccabc0dad2d359a0
|
| #
4a026b50 |
| 20-Mar-2024 |
Peter Rong <peterrong96@gmail.com> |
[AMDGCN] Use ZExt when handling indices in insertment element (#85718)
When i1 true is used as an index, SExt extends it to i32 -1. This would
cause BitVector to overflow.
The language reference specifies that the index shall be treated as an
unsigned number; this patch makes the index handling match.
(https://llvm.org/docs/LangRef.html#insertelement-instruction)
This patch fixes #85717
---------
Signed-off-by: Peter Rong <PeterRong96@gmail.com>
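
The difference between the two extensions can be sketched without any LLVM dependency. This is a minimal illustration, not the code from the patch: `signExtend`, `zeroExtend`, and `isValidIndex` are hypothetical helpers that only model what sext and zext do to a 1-bit `true` value and why the sign-extended result cannot be used as an element index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: interpret the low `bits` bits of `value` as a signed
// number and widen it to 32 bits (models the effect of sext).
constexpr std::int32_t signExtend(std::uint32_t value, unsigned bits) {
  const std::uint32_t mask = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  const std::uint32_t signBit = 1u << (bits - 1);
  const std::uint32_t v = value & mask;
  // Flip the sign bit, then subtract it: the top bit now propagates upward.
  return static_cast<std::int32_t>((v ^ signBit) - signBit);
}

// Hypothetical helper: interpret the low `bits` bits of `value` as an
// unsigned number and widen it to 32 bits (models the effect of zext).
constexpr std::uint32_t zeroExtend(std::uint32_t value, unsigned bits) {
  const std::uint32_t mask = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  return value & mask;
}

// Hypothetical bounds check for an element index into a vector of
// `numElts` elements.
constexpr bool isValidIndex(std::int64_t index, std::size_t numElts) {
  return index >= 0 && static_cast<std::uint64_t>(index) < numElts;
}
```

With these helpers, an `i1 true` index becomes -1 under sign extension and 1 under zero extension, so only the zero-extended value is a usable index into a two-element vector.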
|
|
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
| #
fa87dd52 |
| 22-May-2023 |
pvanhout <pierre.vanhoutryve@amd.com> |
[AMDGPU] Handle multiple occurences of an incoming value in break large PHIs
We naively broke all incoming values, assuming they'd be unique. However, it's not illegal to have multiple occurrences of, e.g., `[BB0, V0]` in a PHI node. What is illegal is having the same basic block appear multiple times with different values, and that is exactly what the transform caused. This broke some rare applications where the pattern arose.
Now we cache the `BasicBlock, Value` pairs we're breaking so we can reuse the values and preserve this invariant.
Solves SWDEV-399460
Reviewed By: #amdgpu, rovka
Differential Revision: https://reviews.llvm.org/D151069
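
The caching idea described above can be sketched as follows. This is a simplified model under stated assumptions, not the pass's actual code: blocks and values are represented by plain strings, and `Slicer`, `breakIncoming`, and the `.slice<N>` naming are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One PHI incoming entry, modelled as (basic block name, value name).
using Incoming = std::pair<std::string, std::string>;

// Hypothetical stand-in for the code that creates a broken-up value.
// The cache guarantees that a repeated (block, value) pair reuses the
// value created the first time instead of producing a fresh one.
struct Slicer {
  int nextId = 0;
  std::map<Incoming, std::string> cache;

  std::string slice(const Incoming &in) {
    auto it = cache.find(in);
    if (it != cache.end())
      return it->second; // Same pair seen before: reuse the earlier value.
    std::string name = in.second + ".slice" + std::to_string(nextId++);
    cache.emplace(in, name);
    return name;
  }
};

// Rewrite every incoming entry so that it refers to its broken-up value.
std::vector<Incoming> breakIncoming(const std::vector<Incoming> &incoming) {
  Slicer slicer;
  std::vector<Incoming> result;
  for (const Incoming &in : incoming)
    result.emplace_back(in.first, slicer.slice(in));
  return result;
}
```

Without the cache, the two `(BB0, V0)` entries of a PHI would each receive a distinct new value, leaving the same block paired with two different values. With it, both entries map to the same new value and the invariant holds.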
|
|
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
| #
b3b3cb2d |
| 07-Apr-2023 |
pvanhout <pierre.vanhoutryve@amd.com> |
[AMDGPU] Less aggressively break large PHIs
In some cases, breaking large PHIs can very negatively affect performance (3x more instructions observed in a particular test case).
This patch adds some basic profitability heuristics to help with some of these issues without affecting the "good" cases. e.g. avoid breaking PHIs if it causes back-and-forth between vector/scalar form for no good reason.
Fixes SWDEV-392803 Fixes SWDEV-393781 Fixes SWDEV-394228
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D147786
|
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3 |
|
| #
d8925210 |
| 13-Feb-2023 |
pvanhout <pierre.vanhoutryve@amd.com> |
[AMDGPU] Break-up large PHIs for DAGISel
DAGISel uses CopyToReg/CopyFromReg to lower PHI nodes. With large PHIs, this can result in poor codegen. This is because it introduces a need to have a build_vector before copying the PHI value, and that build_vector may have many undef elements. This can cause very high register pressure and abnormal stack usage in some cases.
This scalarization/PHI "break-up" can be easily tuned or disabled through CL options in case it's not beneficial for some users. It's also only enabled for DAGISel, as GlobalISel handles PHIs much better (it works on the whole function).
This can both scalarize (break a vector into its elements) and simplify (break a vector into smaller, more manageable subvectors) PHIs.
Fixes SWDEV-321581
Reviewed By: kzhuravl
Differential Revision: https://reviews.llvm.org/D143731
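
The two modes mentioned above, scalarizing and simplifying, differ only in how wide each piece of the original vector is. The sketch below illustrates that split in isolation; `Slice` and `splitIntoSlices` are hypothetical names, and the slice width is taken as a plain parameter because the message does not describe how the pass chooses it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A contiguous run of elements taken from the original vector.
struct Slice {
  unsigned start;  // Index of the first element covered by the slice.
  unsigned length; // Number of elements in the slice.
};

// Hypothetical helper: cover a vector of `numElts` elements with slices of
// at most `sliceWidth` elements. A width of 1 corresponds to scalarizing;
// a larger width corresponds to breaking into smaller subvectors.
std::vector<Slice> splitIntoSlices(unsigned numElts, unsigned sliceWidth) {
  std::vector<Slice> slices;
  if (sliceWidth == 0)
    return slices; // Nothing sensible to do with zero-width slices.
  for (unsigned start = 0; start < numElts; start += sliceWidth)
    slices.push_back({start, std::min(sliceWidth, numElts - start)});
  return slices;
}
```

For an 8-element vector, a width of 1 yields eight single-element slices, while a width of 2 yields four two-element subvectors. When the width does not divide the element count, the final slice is shorter.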
|