SILoadStoreOptimizer.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-4.0.0-rc1
# 116bbab4	13-Jan-2017	Diana Picus <diana.picus@linaro.org>	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1
# f867a40b	03-Nov-2016	Alexander Timofeev <Alexander.Timofeev@amd.com>	[AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads. This change exploits the fact that LDS reads may be reordered even if they access the same location. Prior to the change, the algorithm stopped as soon as any memory access was encountered between loads that are expected to be merged together, although a read-after-read conflict cannot affect execution correctness. Improves the performance of hcBLAS CGEMM manually loop-unrolled kernels by 44%. An improvement is also expected on any massive sequence of reads from LDS. Differential Revision: https://reviews.llvm.org/D25944 llvm-svn: 285919
# 7b0e25b7	27-Oct-2016	Nicolai Haehnle <nhaehnle@gmail.com>	AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due to register dependencies Summary: When finding a match for a merge and collecting the instructions that must be moved, keep in mind that the instruction we merge might actually use one of the defs that are being moved. Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect]. The fact that the ds_read in the test case is not eliminated suggests that there might be another problem related to alias analysis, but that's a separate problem: this pass should still work correctly even when earlier optimization passes missed something or were disabled. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25829 llvm-svn: 285273
# 117296c0	01-Oct-2016	Mehdi Amini <mehdi.amini@apple.com>	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004
# 9720f57a	30-Aug-2016	NAKAMURA Takumi <geek4civic@gmail.com>	SILoadStoreOptimizer.cpp: Fix a warning in r279991. [-Wunused-variable] llvm-svn: 280075
# c2ff0eb6	29-Aug-2016	Tom Stellard <thomas.stellard@amd.com>	AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler Summary: The SILoadStoreOptimizer can now look ahead more than one instruction when looking for instructions to merge, which greatly improves the number of loads/stores that we are able to merge. Moving the pass before scheduling avoids increasing register pressure after the scheduler, so that the scheduler's register pressure estimates will be more accurate. It also gives more consistent results, since it is no longer affected by minor scheduling changes. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23814 llvm-svn: 279991
# e175d8ab	26-Aug-2016	Tom Stellard <thomas.stellard@amd.com>	AMDGPU/SI: Canonicalize offset order for merged DS instructions Summary: If the scheduler clusters the loads, then the offsets will be sorted, but it is possible for the scheduler to schedule loads together without explicitly clustering them, which would give us non-sorted offsets. Also, we will want to do this if we move the load/store optimizer before the scheduler. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23776 llvm-svn: 279870
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3
# 90799ce8	23-Aug-2016	Matthias Braun <matze@braunis.de>	MachineFunction: Introduce NoPHIs property I want to compute the SSA property of .mir files automatically in upcoming patches. The problem with this is that some inputs will be reported as static single assignment with some passes claiming not to support SSA form. In reality though those passes do not support PHI instructions => Track the presence of PHI instructions separate from the SSA property. Differential Revision: https://reviews.llvm.org/D22719 llvm-svn: 279573
Revision tags: llvmorg-3.9.0-rc2
# 0dd9ed1d	13-Aug-2016	Hans Wennborg <hans@hanshq.net>	Fix more dereferenced end() iterators after r278532 llvm-svn: 278587
Revision tags: llvmorg-3.9.0-rc1
# 03d85845	27-Jun-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Move subtarget feature checks into passes llvm-svn: 273937
# 43e92fe3	24-Jun-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
# 48975881	21-Jun-2016	Rafael Espindola <rafael.espindola@gmail.com>	Delete some dead code. Found by gcc 6. llvm-svn: 273303
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1
# 7de74af9	25-Apr-2016	Andrew Kaylor <andrew.kaylor@intel.com>	Add optimization bisect opt-in calls for AMDGPU passes Differential Revision: http://reviews.llvm.org/D19450 llvm-svn: 267485
# ecc7cbf6	29-Mar-2016	Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>	Test commit access llvm-svn: 264736
Revision tags: llvmorg-3.8.0
# 3ac9cc61	27-Feb-2016	Duncan P. N. Exon Smith <dexonsmith@apple.com>	CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC Take MachineInstr by reference instead of by pointer in SlotIndexes and the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are never null, so this cleans up the API a bit. It also incidentally removes a few implicit conversions from MachineInstrBundleIterator to MachineInstr* (see PR26753). At a couple of call sites it was convenient to convert to a range-based for loop over MachineBasicBlock::instr_begin/instr_end, so I added MachineBasicBlock::instrs. llvm-svn: 262115
Revision tags: llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1
# 84db5d97	14-Jul-2015	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU/SI: Fix read2 merging into a super register. If the read2 produced was supposed to be writing into a super register, it would use the wrong subregister indices. Fix this by inserting copies, so we only ever write to a vreg_64. Run the register coalescer again to clean this up, although this isn't ideal and often does result in an extra move. Also remove the assert that offset1 > offset0. There isn't a real reason to not allow this other than a minor convenience in the compiler, and it doesn't seem worth the effort of avoiding it. llvm-svn: 242174
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1
# 45bb48ea	13-Jun-2015	Tom Stellard <thomas.stellard@amd.com>	R600 -> AMDGPU rename llvm-svn: 239657