History log of /llvm-project/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp (Results 51 – 75 of 142)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 2bfcacf0 03-Jul-2020 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Insert PS early exit at end of control flow

Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to e

[AMDGPU] Insert PS early exit at end of control flow

Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to exits during the insert skips pass.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D82737

show more ...


Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 12a32439 07-Apr-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Limit endcf-collapase to simple if

We can only collapse adjacent SI_END_CF if outer statement
belongs to a simple SI_IF, otherwise correct mask is not in the
register we expect, but is an a

[AMDGPU] Limit endcf-collapase to simple if

We can only collapse adjacent SI_END_CF if outer statement
belongs to a simple SI_IF, otherwise correct mask is not in the
register we expect, but is an argument of an S_XOR instruction.

Even if SI_IF is simple it might be lowered using S_XOR because
lowering is dependent on a basic block layout. It is not
considered simple if instruction consuming its output is
not an SI_END_CF. Since that SI_END_CF might have already been
lowered to an S_OR isSimpleIf() check may return false.

This situation is an opportunity for a further optimization
of SI_IF lowering, but that is a separate optimization. In the
meanwhile move SI_END_CF post the lowering when we already know
how the rest of the CFG was lowered since a non-simple SI_IF
case still needs to be handled.

Differential Revision: https://reviews.llvm.org/D77610

show more ...


# ddd2f4b9 06-Apr-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Fix inaccurate comments


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5
# c262b69d 13-Mar-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fix endcf collapse

Only collapse inner endcf if the outer one belongs to SI_IF.
If it does belong to SI_ELSE then mask being restored in fact
a partial inverse of what we need.

Differentia

[AMDGPU] Fix endcf collapse

Only collapse inner endcf if the outer one belongs to SI_IF.
If it does belong to SI_ELSE then mask being restored in fact
a partial inverse of what we need.

Differential Revision: https://reviews.llvm.org/D76154

show more ...


# 32e90cbc 13-Mar-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Disable endcf collapse

There are some functional regressions and I suspect our
scopes are not as perfectly enclosed as I expected.
Disable it for now.

Differential Revision: https://review

[AMDGPU] Disable endcf collapse

There are some functional regressions and I suspect our
scopes are not as perfectly enclosed as I expected.
Disable it for now.

Differential Revision: https://reviews.llvm.org/D76148

show more ...


Revision tags: llvmorg-10.0.0-rc4
# 360aff04 11-Mar-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Simplify nested SI_END_CF

This is to replace the optimization from the SIOptimizeExecMaskingPreRA.
We have less opportunities in the control flow lowering because many
VGPR copies are still

[AMDGPU] Simplify nested SI_END_CF

This is to replace the optimization from the SIOptimizeExecMaskingPreRA.
We have less opportunities in the control flow lowering because many
VGPR copies are still in place and will be removed later, but we know
for sure an instruction is SI_END_CF and not just an arbitrary S_OR_B64
with EXEC.

The subsequent change needs to convert s_and_saveexec into s_and and
address new TODO lines in tests, then code block guarded by the
-amdgpu-remove-redundant-endcf option in the pre-RA exec mask optimizer
will be removed.

Differential Revision: https://reviews.llvm.org/D76033

show more ...


Revision tags: llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init
# 6e177082 27-Dec-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses

Reverts part of 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f. Since that
commit, the expansion was ignoring the actual save exec reg

AMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses

Reverts part of 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f. Since that
commit, the expansion was ignoring the actual save exec register
produced by the instruction, and looking at other instructions. I do
not understand why it was looking at other instructions, but relying
on this scan was wrong.

Fixes verifier errors after SI_IF is tail duplicated, which should be
correct to do. The results were fed into a phi, which was lowered to
the S_MOV_B64_term instructions.

show more ...


# e53a9d96 22-Jan-2020 cdevadas <cdevadas@amd.com>

Resubmit: [AMDGPU] Invert the handling of skip insertion.

The current implementation of skip insertion (SIInsertSkip) makes it a
mandatory pass required for correctness. Initially, the idea was to
h

Resubmit: [AMDGPU] Invert the handling of skip insertion.

The current implementation of skip insertion (SIInsertSkip) makes it a
mandatory pass required for correctness. Initially, the idea was to
have an optional pass. This patch inserts the s_cbranch_execz upfront
during SILowerControlFlow to skip over the sections of code when no
lanes are active. Later, SIRemoveShortExecBranches removes the skips
for short branches, unless there is a sideeffect and the skip branch is
really necessary.

This new pass will replace the handling of skip insertion in the
existing SIInsertSkip Pass.

Differential revision: https://reviews.llvm.org/D68092

show more ...


# a80291ce 21-Jan-2020 Nicolai Hähnle <nicolai.haehnle@amd.com>

Revert "[AMDGPU] Invert the handling of skip insertion."

This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd.

The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for
Me

Revert "[AMDGPU] Invert the handling of skip insertion."

This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd.

The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for
Mesa.

show more ...


# 0dc6c249 10-Jan-2020 cdevadas <cdevadas@amd.com>

[AMDGPU] Invert the handling of skip insertion.

The current implementation of skip insertion (SIInsertSkip) makes it a
mandatory pass required for correctness. Initially, the idea was to
have an opt

[AMDGPU] Invert the handling of skip insertion.

The current implementation of skip insertion (SIInsertSkip) makes it a
mandatory pass required for correctness. Initially, the idea was to
have an optional pass. This patch inserts the s_cbranch_execz upfront
during SILowerControlFlow to skip over the sections of code when no
lanes are active. Later, SIRemoveShortExecBranches removes the skips
for short branches, unless there is a sideeffect and the skip branch is
really necessary.

This new pass will replace the handling of skip insertion in the
existing SIInsertSkip Pass.

Differential revision: https://reviews.llvm.org/D68092

show more ...


Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 008e65a7 18-Nov-2019 vpykhtin <valery.pykhtin@gmail.com>

[AMDGPU] Fix emitIfBreak CF lowering: use temp reg to make register coalescer life easier.

Differential revision: https://reviews.llvm.org/D70405


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6
# 6524a7a2 17-Sep-2019 Alexander Timofeev <Alexander.Timofeev@amd.com>

[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed

Defferential Revision: https://reviews.llvm.org/D67101

Reviewers: rampitec, vpykhtin
llvm-svn: 372086


# 9ff70132 13-Sep-2019 Alexander Timofeev <Alexander.Timofeev@amd.com>

Revert for: [AMDGPU]: PHI Elimination hooks added for custom COPY insertion.

llvm-svn: 371873


Revision tags: llvmorg-9.0.0-rc5
# c2d292f8 10-Sep-2019 Alexander Timofeev <Alexander.Timofeev@amd.com>

[AMDGPU]: PHI Elimination hooks added for custom COPY insertion.

Reviewers: rampitec, vpykhtin

Differential Revision: https://reviews.llvm.org/D67101

llvm-svn: 371508


Revision tags: llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3
# 4b7fc85c 20-Aug-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

Revert "AMDGPU: Fix iterator error when lowering SI_END_CF"

This reverts r367500 and r369203. This is causing various test
failures.

llvm-svn: 369417


# 479f3bdb 18-Aug-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix iterator error when lowering SI_END_CF

If the instruction is the last in the block, there is no next
instruction but the iteration still needs to look at the new block.

llvm-svn: 369203


# 0c476111 15-Aug-2019 Daniel Sanders <daniel_l_sanders@apple.com>

Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM

Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Re

Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM

Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041

show more ...


Revision tags: llvmorg-9.0.0-rc2
# 2bea69bf 01-Aug-2019 Daniel Sanders <daniel_l_sanders@apple.com>

Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC

llvm-svn: 367633


# d48324ff 01-Aug-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Split block for si_end_cf"

This reverts commit r359363, reapplying r357634

llvm-svn: 367500


Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# e3a676e9 24-Jun-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Introduce a class for registers

Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set neede

CodeGen: Introduce a class for registers

Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

llvm-svn: 364191

show more ...


# 52500216 16-Jun-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx10 conditional registers handling

This is cpp source part of wave32 support, excluding overriden
getRegClass().

Differential Revision: https://reviews.llvm.org/D63351

llvm-svn: 363513


Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 76c5b629 27-Apr-2019 Mark Searles <m.c.searles@gmail.com>

Revert "AMDGPU: Split block for si_end_cf"

This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4.

We discovered some internal test failures, so reverting for now.

Differential Revision: htt

Revert "AMDGPU: Split block for si_end_cf"

This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4.

We discovered some internal test failures, so reverting for now.

Differential Revision: https://reviews.llvm.org/D61213

llvm-svn: 359363

show more ...


# 396653f8 03-Apr-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Split block for si_end_cf

Relying on no spill or other code being inserted before this was
precarious. It relied on code diligently checking isBasicBlockPrologue
which is likely to be forgot

AMDGPU: Split block for si_end_cf

Relying on no spill or other code being inserted before this was
precarious. It relied on code diligently checking isBasicBlockPrologue
which is likely to be forgotten.

Ideally this could be done earlier, but this doesn't work because of
phis. Any other instruction can't be placed before them, so we have to
accept the position being incorrect during SSA.

This avoids regressions in the fast register allocator rewrite from
inverting the direction.

llvm-svn: 357634

show more ...


Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4
# 87039773 05-Mar-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Preserve undef flag when expanding SI_IF

Fixes undefined value verifier error.

llvm-svn: 355426


Revision tags: llvmorg-8.0.0-rc3
# 476e26b5 22-Feb-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use removeAllRegUnitsForPhysReg

llvm-svn: 354686


123456