History log of /llvm-project/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp (Results 151 – 167 of 167)
Revision Date Author Comments
Revision tags: llvmorg-4.0.0-rc1
# 116bbab4 13-Jan-2017 Diana Picus <diana.picus@linaro.org>

[CodeGen] Rename MachineInstrBuilder::addOperand. NFC

Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

llvm-svn: 291891


Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1
# f867a40b 03-Nov-2016 Alexander Timofeev <Alexander.Timofeev@amd.com>

[AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads.

This change exploits the fact that LDS reads may be reordered even if they
access the same location.

Prior to this change, the algorithm stopped immediately as soon as any memory
access was encountered between loads that were expected to be merged,
even though a read-after-read conflict cannot affect execution
correctness.

Improves the performance of manually loop-unrolled hcBLAS CGEMM kernels by 44%.
An improvement is also expected on any long sequence of reads from LDS.

Differential Revision: https://reviews.llvm.org/D25944

llvm-svn: 285919


# 7b0e25b7 27-Oct-2016 Nicolai Haehnle <nhaehnle@gmail.com>

AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due to register dependencies

Summary:
When finding a match for a merge and collecting the instructions that must
be moved, keep in mind that the instruction we merge might actually use one
of the defs that are being moved.

Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect].

The fact that the ds_read in the test case is not eliminated suggests that
there might be another problem related to alias analysis, but that's a
separate problem: this pass should still work correctly even when earlier
optimization passes missed something or were disabled.

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25829

llvm-svn: 285273


# 117296c0 01-Oct-2016 Mehdi Amini <mehdi.amini@apple.com>

Use StringRef in Pass/PassManager APIs (NFC)

llvm-svn: 283004


# 9720f57a 30-Aug-2016 NAKAMURA Takumi <geek4civic@gmail.com>

SILoadStoreOptimizer.cpp: Fix a warning in r279991. [-Wunused-variable]

llvm-svn: 280075


# c2ff0eb6 29-Aug-2016 Tom Stellard <thomas.stellard@amd.com>

AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler

Summary:
The SILoadStoreOptimizer can now look ahead more than one instruction when
looking for instructions to merge, which greatly improves the number of
loads/stores that we are able to merge.

Moving the pass before scheduling avoids increasing register pressure after
the scheduler, so that the scheduler's register pressure estimates will be
more accurate. It also gives more consistent results, since it is no longer
affected by minor scheduling changes.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23814

llvm-svn: 279991


# e175d8ab 26-Aug-2016 Tom Stellard <thomas.stellard@amd.com>

AMDGPU/SI: Canonicalize offset order for merged DS instructions

Summary:
If the scheduler clusters the loads, then the offsets will be sorted,
but it is possible for the scheduler to schedule loads together
without explicitly clustering them, which would give us non-sorted
offsets.

Also, we will want to do this if we move the load/store optimizer before
the scheduler.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23776

llvm-svn: 279870


Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3
# 90799ce8 23-Aug-2016 Matthias Braun <matze@braunis.de>

MachineFunction: Introduce NoPHIs property

I want to compute the SSA property of .mir files automatically in
upcoming patches. The problem with this is that some inputs will be
reported as static single assignment with some passes claiming not to
support SSA form. In reality, though, those passes do not support PHI
instructions, so track the presence of PHI instructions separately from the
SSA property.

Differential Revision: https://reviews.llvm.org/D22719

llvm-svn: 279573


Revision tags: llvmorg-3.9.0-rc2
# 0dd9ed1d 13-Aug-2016 Hans Wennborg <hans@hanshq.net>

Fix more dereferenced end() iterators after r278532

llvm-svn: 278587


Revision tags: llvmorg-3.9.0-rc1
# 03d85845 27-Jun-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Move subtarget feature checks into passes

llvm-svn: 273937


# 43e92fe3 24-Jun-2016 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Cleanup subtarget handling.

Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

llvm-svn: 273652


# 48975881 21-Jun-2016 Rafael Espindola <rafael.espindola@gmail.com>

Delete some dead code.

Found by gcc 6.

llvm-svn: 273303


Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1
# 7de74af9 25-Apr-2016 Andrew Kaylor <andrew.kaylor@intel.com>

Add optimization bisect opt-in calls for AMDGPU passes

Differential Revision: http://reviews.llvm.org/D19450

llvm-svn: 267485


# ecc7cbf6 29-Mar-2016 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Test commit access

llvm-svn: 264736


Revision tags: llvmorg-3.8.0
# 3ac9cc61 27-Feb-2016 Duncan P. N. Exon Smith <dexonsmith@apple.com>

CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC

Take MachineInstr by reference instead of by pointer in SlotIndexes and
the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are
never null, so this cleans up the API a bit. It also incidentally
removes a few implicit conversions from MachineInstrBundleIterator to
MachineInstr* (see PR26753).

At a couple of call sites it was convenient to convert to a range-based
for loop over MachineBasicBlock::instr_begin/instr_end, so I added
MachineBasicBlock::instrs.

llvm-svn: 262115


Revision tags: llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1
# 84db5d97 14-Jul-2015 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU/SI: Fix read2 merging into a super register.

If the read2 produced was supposed to be writing into a
super register, it would use the wrong subregister indices.
Fix this by inserting copies, so we only ever write to a vreg_64.
Run the register coalescer again to clean this up, although this
isn't ideal and often does result in an extra move.

Also remove the assert that offset1 > offset0.

There isn't a real reason to not allow this other than a minor
convenience in the compiler, and it doesn't seem worth the effort
of avoiding it.

llvm-svn: 242174


Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1
# 45bb48ea 13-Jun-2015 Tom Stellard <thomas.stellard@amd.com>

R600 -> AMDGPU rename

llvm-svn: 239657

