History log of /llvm-project/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (Results 201 – 225 of 229)
Revision Date Author Comments
Revision tags: llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1
# 69e3001b 11-Jan-2017 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix folding immediates into mac src2

Legality needs to be checked against the instruction
it will be replaced with.

llvm-svn: 291711


# 51818c14 10-Jan-2017 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Constant fold when immediate is materialized

In future commits these patterns will appear after moveToVALU changes.

llvm-svn: 291615


# 4bd72361 10-Dec-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix handling of 16-bit immediates

Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.

Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.

The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.

llvm-svn: 289306


# 8485fa09 07-Dec-2016 Tom Stellard <thomas.stellard@amd.com>

AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.

Patch By: Wei Ding

Summary: This patch fixes the fdiv precision issues.

Reviewers: b-sumner, cfang, wdng, arsenm

Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D26424

llvm-svn: 288879


Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2
# ff8bb49b 29-Nov-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Refactor immediate folding logic

Change the logic for when to fold immediates to
consider the destination operand rather than the
source of the materializing mov instruction.

No change yet, but this will allow for correctly handling
i16/f16 operands. Since 32-bit moves are used to materialize
constants for these, the same bitvalue will not be in the
register.

llvm-svn: 288184


Revision tags: llvmorg-3.9.1-rc1
# a24d84be 23-Nov-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Cleanup immediate folding code

Move code down to use, reorder to avoid hard to follow
immediate folding logic.

llvm-svn: 287818


# 391c3ea9 23-Nov-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix debug printing

The uint8_t was printed as a char which didn't really work.

llvm-svn: 287817
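
The problem this commit describes can be reproduced outside LLVM. The following is a minimal sketch, not the actual patch: it uses `std::ostringstream` instead of LLVM's own stream classes, and the helper names `printRaw` and `printWidened` are hypothetical. Streaming a `uint8_t` selects the character overload, so the byte is emitted as a character; widening it first selects the numeric overload.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: streams the byte directly. On mainstream platforms
// uint8_t is an alias for unsigned char, so the character overload of
// operator<< is chosen and the value comes out as a single character.
std::string printRaw(std::uint8_t V) {
  std::ostringstream OS;
  OS << V;
  return OS.str();
}

// Hypothetical helper: widens to unsigned first, so the numeric overload
// is chosen and the value comes out as decimal digits.
std::string printWidened(std::uint8_t V) {
  std::ostringstream OS;
  OS << static_cast<unsigned>(V);
  return OS.str();
}
```

On an ASCII platform, `printRaw(65)` yields the one-character string `A`, while `printWidened(65)` yields `65`.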


# f86e4b72 13-Nov-2016 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

[AMDGPU] Add f16 support (VI+)

Differential Revision: https://reviews.llvm.org/D25975

llvm-svn: 286753


# 5e63a04e 06-Oct-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Don't fold undef uses or copies with implicit uses

llvm-svn: 283476


# c2ee42cd 06-Oct-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove leftover implicit operands when folding immediates

When constant folding an operation to a copy or an immediate
mov, the implicit uses/defs of the old instruction were left behind,
e.g. replacing v_or_b32 left the implicit exec use on the new copy.

llvm-svn: 283471


# 117296c0 01-Oct-2016 Mehdi Amini <mehdi.amini@apple.com>

Use StringRef in Pass/PassManager APIs (NFC)

llvm-svn: 283004


# 2bc198a3 14-Sep-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Support folding FrameIndex operands

This avoids test regressions in a future commit.

llvm-svn: 281491


# fa5f767a 14-Sep-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Improve splitting 64-bit bit ops by constants

This addresses a TODO to handle operations besides and. This
also starts eliminating no-op operations with a constant that
can emerge later.

llvm-svn: 281488
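
The idea behind splitting can be sketched in plain C++. This is an illustration under assumptions, not the pass's actual code: the `BitOp` enum and the functions `apply32`, `isNoOp32`, and `applySplit64` are hypothetical names. A 64-bit and/or/xor with a constant is computed as two independent 32-bit operations on the low and high halves, and a half whose constant is the identity for that operation leaves its input unchanged, which is what makes it a candidate for elimination.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enumeration of the bitwise operations considered here.
enum class BitOp { And, Or, Xor };

// Apply one 32-bit bitwise operation with a constant operand.
std::uint32_t apply32(BitOp Op, std::uint32_t X, std::uint32_t C) {
  switch (Op) {
  case BitOp::And: return X & C;
  case BitOp::Or:  return X | C;
  case BitOp::Xor: return X ^ C;
  }
  return X; // Not reached for a valid BitOp.
}

// True when `X op C` equals X for every X, i.e. C is the identity
// element of the operation: all ones for and, zero for or and xor.
bool isNoOp32(BitOp Op, std::uint32_t C) {
  switch (Op) {
  case BitOp::And: return C == 0xFFFFFFFFu;
  case BitOp::Or:
  case BitOp::Xor: return C == 0;
  }
  return false;
}

// Compute a 64-bit bitwise operation as two 32-bit halves and
// reassemble the result.
std::uint64_t applySplit64(BitOp Op, std::uint64_t X, std::uint64_t C) {
  std::uint32_t Lo = apply32(Op, static_cast<std::uint32_t>(X),
                             static_cast<std::uint32_t>(C));
  std::uint32_t Hi = apply32(Op, static_cast<std::uint32_t>(X >> 32),
                             static_cast<std::uint32_t>(C >> 32));
  return (static_cast<std::uint64_t>(Hi) << 32) | Lo;
}
```

For example, with the constant `0xFFFFFFFF0000FFFF` and `BitOp::And`, the high half satisfies `isNoOp32`, so only the low half needs a real instruction in this model.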


Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2
# 3661e90e 15-Aug-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Don't fold subregister extracts into tied operands

llvm-svn: 278676


Revision tags: llvmorg-3.9.0-rc1
# 9cfc75c2 30-Jun-2016 Duncan P. N. Exon Smith <dexonsmith@apple.com>

CodeGen: Use MachineInstr& in TargetInstrInfo, NFC

This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr. This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other. Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators. The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy. I couldn't run tests
for AVR since llc doesn't link with it turned on.

llvm-svn: 274189


# 43e92fe3 24-Jun-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Cleanup subtarget handling.

Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

llvm-svn: 273652


Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1
# 7de74af9 25-Apr-2016 Andrew Kaylor <andrew.kaylor@intel.com>

Add optimization bisect opt-in calls for AMDGPU passes

Differential Revision: http://reviews.llvm.org/D19450

llvm-svn: 267485


Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3
# 427c5489 11-Feb-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix passes depending on dominator tree for no reason

llvm-svn: 260494


Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1
# 926c56f5 13-Jan-2016 Marek Olsak <marek.olsak@amd.com>

AMDGPU/SI: Fix a bug in SIFoldOperands

Summary: ret.ll will contain a test for this

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D16029

llvm-svn: 257590


# 82fc962c 07-Jan-2016 Nicolai Haehnle <nhaehnle@gmail.com>

AMDGPU/SI: Fold operands with sub-registers

Summary:
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.

Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.

Some tests are updated, note that the fsub.ll test explicitly checks that
the move is elided.

With the IR generated by current Mesa, the changes are obviously relatively
minor:

7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)

Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)

Reviewers: tstellarAMD, arsenm, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15875

llvm-svn: 257074


Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1
# e8c0891e 21-Oct-2015 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix verifier error in SIFoldOperands

There may be other use operands that also need their kill flags cleared.

This happens in a few tests when SIFoldOperands is moved after
PeepholeOptimizer.

PeepholeOptimizer rewrites cases that look like:
%vreg0 = ...
%vreg1 = COPY %vreg0
use %vreg1<kill>
%vreg2 = COPY %vreg0
use %vreg2<kill>

to use the earlier source:
%vreg0 = ...
use %vreg0
use %vreg0

Currently SIFoldOperands sees the copied registers, so there is
only one use. So far I haven't managed to come up with a test
that currently has multiple uses of a foldable VGPR -> VGPR copy.

llvm-svn: 250960


# 16c4da03 28-Sep-2015 Andrew Kaylor <andrew.kaylor@intel.com>

Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing.

Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)

Differential Revision: http://reviews.llvm.org/D11370

llvm-svn: 248735


# 0cb8517d 25-Sep-2015 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix recomputing dominator tree unnecessarily

SIFixSGPRCopies does not modify the CFG, but this was
being recomputed before running SIFoldOperands.

llvm-svn: 248587


# ad46e0c1 10-Sep-2015 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU/SI: Fix creating v_mov_b32s without exec uses

This will be caught by existing tests with a
verifier check to be added in a future commit.

llvm-svn: 247229


# 9a197676 09-Sep-2015 Tom Stellard <thomas.stellard@amd.com>

AMDGPU/SI: Fold operands through REG_SEQUENCE instructions

Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12256

llvm-svn: 247157

