Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0
# 5da7179c 14-Sep-2023 Jeffrey Byrnes <Jeffrey.Byrnes@amd.com>

[AMDGPU] Reland: Add IR LiveReg type-based optimization


# 3e53c97d 29-Jun-2024 Vitaly Buka <vitalybuka@google.com>

Revert "[AMDGPU] Add IR LiveReg type-based optimization" (#97138)

Part of #66838.

https://lab.llvm.org/buildbot/#/builders/52/builds/404
https://lab.llvm.org/buildbot/#/builders/55/builds/358
h

Revert "[AMDGPU] Add IR LiveReg type-based optimization" (#97138)

Part of #66838.

https://lab.llvm.org/buildbot/#/builders/52/builds/404
https://lab.llvm.org/buildbot/#/builders/55/builds/358
https://lab.llvm.org/buildbot/#/builders/164/builds/518

This reverts commit ded956440739ae326a99cbaef18ce4362e972679.

show more ...


# ded95644 14-Sep-2023 Jeffrey Byrnes <Jeffrey.Byrnes@amd.com>

[AMDGPU] Add IR LiveReg type-based optimization

Change-Id: Ia0d11b79b8302e79247fe193ccabc0dad2d359a0


# 4a026b50 20-Mar-2024 Peter Rong <peterrong96@gmail.com>

[AMDGCN] Use ZExt when handling indices in insertment element (#85718)

When i1 true is used as an index, SExt extends it to i32 -1. This would
cause BitVector to overflow.
The language manual hav

[AMDGCN] Use ZExt when handling indices in insertment element (#85718)

When i1 true is used as an index, SExt extends it to i32 -1. This would
cause BitVector to overflow.
The language manual have specified that the index shall be treated as an
unsigned number, this patch fixes that.
(https://llvm.org/docs/LangRef.html#insertelement-instruction)

This patch fixes #85717

---------

Signed-off-by: Peter Rong <PeterRong96@gmail.com>

show more ...


Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5
# fa87dd52 22-May-2023 pvanhout <pierre.vanhoutryve@amd.com>

[AMDGPU] Handle multiple occurences of an incoming value in break large PHIs

We naively broke all incoming values, assuming they'd be unique.
However it's not illegal to have multiple occurences of,

[AMDGPU] Handle multiple occurences of an incoming value in break large PHIs

We naively broke all incoming values, assuming they'd be unique.
However it's not illegal to have multiple occurences of, e.g. `[BB0, V0]`
in a PHI node. What's illegal though is having the same basic block
multiple times but with different values, and it's exactly what the
transform caused. This broke in some rare applications where the pattern
arised.

Now we cache the `BasicBlock, Value` pairs we're breaking so we can reuse the values and preserve this invariant.

Solves SWDEV-399460

Reviewed By: #amdgpu, rovka

Differential Revision: https://reviews.llvm.org/D151069

show more ...


Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2
# b3b3cb2d 07-Apr-2023 pvanhout <pierre.vanhoutryve@amd.com>

[AMDGPU] Less aggressively break large PHIs

In some cases, breaking large PHIs can very negatively affect
performance (3x more instructions observed in a particular test case).

This patch adds some

[AMDGPU] Less aggressively break large PHIs

In some cases, breaking large PHIs can very negatively affect
performance (3x more instructions observed in a particular test case).

This patch adds some basic profitability heuristics to help with some of these issues without affecting the "good" cases.
e.g. avoid breaking PHIs if it causes back-and-forth between vector/scalar form for no good reason.

Fixes SWDEV-392803
Fixes SWDEV-393781
Fixes SWDEV-394228

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D147786

show more ...


Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3
# d8925210 13-Feb-2023 pvanhout <pierre.vanhoutryve@amd.com>

[AMDGPU] Break-up large PHIs for DAGISel

DAGISel uses CopyToReg/CopyFromReg to lower PHI nodes. With large PHIs, this can result in poor codegen.
This is because it introduces a need to have a build

[AMDGPU] Break-up large PHIs for DAGISel

DAGISel uses CopyToReg/CopyFromReg to lower PHI nodes. With large PHIs, this can result in poor codegen.
This is because it introduces a need to have a build_vector before copying the PHI value, and that build_vector may have many undef elements. This can cause very high register pressure and abnormal stack usage in some cases.

This scalarization/phi "break-up" can be easily tuned/disabled through CL options in case it's not beneficial for some users.
It's also only enabled for DAGIsel and GlobalISel handles PHIs much better (as it works on the whole function).

This can both scalarize (break a vector into its elements) and simplify (break a vector into smaller, more manageable subvectors) PHIs.

Fixes SWDEV-321581

Reviewed By: kzhuravl

Differential Revision: https://reviews.llvm.org/D143731

show more ...