#
70573dcd |
| 19-Nov-2014 |
David Blaikie <dblaikie@gmail.com> |
Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard library's associative container inse
Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard library's associative container insert function.
This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions...
llvm-svn: 222334
show more ...
|
#
2954280f |
| 15-Oct-2014 |
Jingyue Wu <jingyue@google.com> |
[MachineSink] Use the real post dominator tree
Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and Machin
[MachineSink] Use the real post dominator tree
Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure.
This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source.
Thanks Alexey Volkov for reporting PR21115!
Test Plan: Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking.
Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115.
Updated an affected test in X86.
Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread.
Reviewers: Jiangning, resistor, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski
Differential Revision: http://reviews.llvm.org/D5633
llvm-svn: 219773
show more ...
|
#
eb9e87f6 |
| 14-Oct-2014 |
Eric Christopher <echristo@gmail.com> |
Access subtarget specific variables off of the MachineFunction's cached subtarget and not the TargetMachine.
llvm-svn: 219668
|
#
fd47fb99 |
| 01-Oct-2014 |
Jingyue Wu <jingyue@google.com> |
Revert r216862 due to a performance regression
Reported by Alexey Volkov in PR21115
llvm-svn: 218771
|
#
d04f7596 |
| 25-Sep-2014 |
Bruno Cardoso Lopes <bruno.cardoso@gmail.com> |
[MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo
Machine Sink uses loop depth information to select between successors BBs to sink machine instructions into, where BBs within smaller lo
[MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo
Machine Sink uses loop depth information to select between successors BBs to sink machine instructions into, where BBs within smaller loop depths are preferable. This patch adds support for choosing between successors by using profile information from BlockFrequencyInfo instead, whenever the information is available.
Tested it under SPEC2006 train (average of 30 runs for each program); ~1.5% execution speedup in average on x86-64 darwin.
<rdar://problem/18021659>
llvm-svn: 218472
show more ...
|
#
57d315b7 |
| 09-Sep-2014 |
Patrik Hagglund <patrik.h.hagglund@ericsson.com> |
[MachineSinking] Conservatively clear kill flags after coalescing.
This solves the problem of having a kill flag inside a loop with a definition of the register prior to the loop:
%vreg368<def> ...
[MachineSinking] Conservatively clear kill flags after coalescing.
This solves the problem of having a kill flag inside a loop with a definition of the register prior to the loop:
%vreg368<def> ...
Inside loop:
%vreg520<def> = COPY %vreg368 %vreg568<def,tied1> = add %vreg341<tied0>, %vreg520<kill>
=> was coalesced into =>
%vreg568<def,tied1> = add %vreg341<tied0>, %vreg368<kill>
MachineVerifier then complained: *** Bad machine code: Virtual register killed in block, but needed live out. ***
The kill flag for %vreg368 is incorrect, and is cleared by this patch.
This is similar to the clearing done at the end of MachineSinking::SinkInstruction().
Patch provided by Jonas Paulsson.
Reviewed by Quentin Colombet and Juergen Ributzka.
llvm-svn: 217427
show more ...
|
#
4bea4945 |
| 04-Sep-2014 |
Juergen Ributzka <juergen@apple.com> |
Revert r216803 "[MachineSinking] Clear kill flag of all operands at all their uses."
This reverts commit r216803, because it might have broken the buildbot. The issue is tracked in PR20842.
llvm-sv
Revert r216803 "[MachineSinking] Clear kill flag of all operands at all their uses."
This reverts commit r216803, because it might have broken the buildbot. The issue is tracked in PR20842.
llvm-svn: 217120
show more ...
|
Revision tags: llvmorg-3.5.0 |
|
#
5208cc5d |
| 01-Sep-2014 |
Jingyue Wu <jingyue@google.com> |
[MachineSink] Use the real post dominator tree
Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree. The old h
[MachineSink] Use the real post dominator tree
Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree. The old heuristics caused instructions to sink unnecessarily, and might create register pressure.
Test Plan: Added a NVPTX codegen test to verify that our change is in effect. It also shows the unnecessary register pressure caused by over-sinking. Updated affected tests in AArch64 and X86.
Reviewers: eliben, meheff, Jiangning
Reviewed By: Jiangning
Subscribers: jholewinski, aemerson, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D4814
llvm-svn: 216862
show more ...
|
#
00d78221 |
| 29-Aug-2014 |
Juergen Ributzka <juergen@apple.com> |
[MachineSinking] Clear kill flag of all operands at all their uses.
When sinking an instruction it might be moved past the original last use of one of its operands. This last use has the kill flag s
[MachineSinking] Clear kill flag of all operands at all their uses.
When sinking an instruction it might be moved past the original last use of one of its operands. This last use has the kill flag set and the verifier will obviously complain about this.
Before Machine Sinking (AArch64): %vreg3<def> = ASRVXr %vreg1, %vreg2<kill> %XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def> ...
After Machine Sinking: %XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def> ... %vreg3<def> = ASRVXr %vreg1, %vreg2<kill>
This fix clears all the kill flags in all instruction that use the same operands as the instruction that is being sunk.
This fixes rdar://problem/18180996.
llvm-svn: 216803
show more ...
|
Revision tags: llvmorg-3.5.0-rc4, llvmorg-3.5.0-rc3 |
|
#
5cded89d |
| 11-Aug-2014 |
Quentin Colombet <qcolombet@apple.com> |
[MachineSink] Improve the compile time by preserving the dominance information as long as possible.
** Context **
Each time the dominance information is modified, the dominator tree analysis switch
[MachineSink] Improve the compile time by preserving the dominance information as long as possible.
** Context **
Each time the dominance information is modified, the dominator tree analysis switches in a slow query mode. After a few queries without any modification on the dominator tree, it performs an expensive update of its internal structure to provide fast queries again.
** Problem **
Prior to this patch, the MachineSink pass was splitting the critical edges on demand while relying heavy on the dominator tree information. In some cases, this leads to pathological behavior where: - We end up in the slow query mode right after splitting an edge. - We update the dominance information. - We break the dominance information again, thus ending up in the slow query mode and so on.
** Proposed Solution **
To mitigate this effect, this patch postpones all the splitting of the edges at the end of each iteration of the main loop. The benefits are: - The dominance information is valid for the life time of an iteration. - This simplifies the code as we do not have to special treat instructions that are sunk on critical edges. Indeed, the related block will be available through the next iteration.
The downside is that when edges splitting is required, this incurs an additional iteration of the main loop compared to the previous scheme.
** Performance **
Thanks to this patch, the motivating example compiles in 6+ minutes instead of 10+ minutes. No test case added as the motivating example as nothing special but being huge!
I have measured only noise for both the compile time and the runtime on the llvm test-suite + SPECs with Os and O3.
Note: The current implementation of MachineBasicBlock::SplitCriticalEdge also uses the dominance information and therefore, hits this problem. A subsequent patch will address that.
<rdar://problem/17894619>
llvm-svn: 215410
show more ...
|
Revision tags: llvmorg-3.5.0-rc2 |
|
#
d913448b |
| 04-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change.
llvm-svn: 214781
|
#
c3053129 |
| 29-Jul-2014 |
Jiangning Liu <jiangning.liu@arm.com> |
Add TargetInstrInfo interface isAsCheapAsAMove.
llvm-svn: 214158
|
Revision tags: llvmorg-3.5.0-rc1, llvmorg-3.4.2, llvmorg-3.4.2-rc1, llvmorg-3.4.1, llvmorg-3.4.1-rc2 |
|
#
1b9dde08 |
| 22-Apr-2014 |
Chandler Carruth <chandlerc@gmail.com> |
[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind
[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers.
Other sub-trees will follow.
llvm-svn: 206837
show more ...
|
#
c0196b1b |
| 14-Apr-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206142
|
Revision tags: llvmorg-3.4.1-rc1 |
|
#
7c99ec5b |
| 31-Mar-2014 |
Paul Robinson <paul_robinson@playstation.sony.com> |
Disable each MachineFunctionPass for 'optnone' functions, unless that pass normally runs at optimization level None, or is part of the register allocation pipeline.
llvm-svn: 205228
|
#
b36376ef |
| 17-Mar-2014 |
Owen Anderson <resistor@mac.com> |
Switch a number of loops in lib/CodeGen over to range-based for-loops, now that the MachineRegisterInfo iterators are compatible with it.
llvm-svn: 204075
|
#
16c6bf49 |
| 13-Mar-2014 |
Owen Anderson <resistor@mac.com> |
Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost
Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators!
Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future.
llvm-svn: 203865
show more ...
|
#
4584cd54 |
| 07-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 203220
|
#
3a377bce |
| 01-Mar-2014 |
Benjamin Kramer <benny.kra@googlemail.com> |
Now that we have C++11, turn simple functors into lambdas and remove a ton of boilerplate.
No intended functionality change.
llvm-svn: 202588
|
Revision tags: llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1 |
|
#
5cb7f4e3 |
| 14-Oct-2013 |
Will Dietz <wdietz2@illinois.edu> |
MachineSink: Fix and tweak critical-edge breaking heuristic.
Per original comment, the intention of this loop is to go ahead and break the critical edge (in order to sink this instruction) if there'
MachineSink: Fix and tweak critical-edge breaking heuristic.
Per original comment, the intention of this loop is to go ahead and break the critical edge (in order to sink this instruction) if there's reason to believe doing so might "unblock" the sinking of additional instructions that define registers used by this one. The idea is that if we have a few instructions to sink "together" breaking the edge might be worthwhile.
This commit makes a few small changes to help better realize this goal:
First, modify the loop to ignore registers defined by this instruction. We don't sink definitions of physical registers, and sinking an SSA definition isn't going to unblock an upstream instruction.
Second, ignore uses of physical registers. Instructions that define physical registers are rejected for sinking, and so moving this one won't enable moving any defining instructions. As an added bonus, while virtual register use-def chains are generally small due to SSA goodness, iteration over the uses and definitions (used by hasOneNonDBGUse) for physical registers like EFLAGS can be rather expensive in practice. (This is the original reason for looking at this)
Finally, to keep things simple continue to only consider this trick for registers that have a single use (via hasOneNonDBGUse), but to avoid spuriously breaking critical edges only do so if the definition resides in the same MBB and therefore this one directly blocks it from being sunk as well. If sinking them together is meant to be, let the iterative nature of this pass sink the definition into this block first.
Update tests to accomodate this change, add new testcase where sinking avoids pipeline stalls.
llvm-svn: 192608
show more ...
|
#
b94011fd |
| 14-Jul-2013 |
Craig Topper <craig.topper@gmail.com> |
Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
|
Revision tags: llvmorg-3.3.1-rc1 |
|
#
e1c1d363 |
| 03-Jul-2013 |
Craig Topper <craig.topper@gmail.com> |
Use SmallVectorImpl instead of SmallVector for iterators and references to avoid specifying the vector size unnecessarily.
llvm-svn: 185512
|
Revision tags: llvmorg-3.3.0, llvmorg-3.3.0-rc3, llvmorg-3.3.0-rc2, llvmorg-3.3.0-rc1, llvmorg-3.2.0, llvmorg-3.2.0-rc3 |
|
#
ed0881b2 |
| 03-Dec-2012 |
Chandler Carruth <chandlerc@gmail.com> |
Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module
Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented.
Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =]
llvm-svn: 169131
show more ...
|
Revision tags: llvmorg-3.2.0-rc2, llvmorg-3.2.0-rc1 |
|
#
244beb42 |
| 16-Oct-2012 |
Jakob Stoklund Olesen <stoklund@2pi.dk> |
Remove unused BitVectors from getAllocatableSet().
llvm-svn: 165999
|
#
f288d2f1 |
| 31-Jul-2012 |
Manman Ren <mren@apple.com> |
MachineSink: Sort the successors before trying to find SuccToSinkTo.
Use stable_sort instead of sort. Follow-up to r161062.
rdar://11980766
llvm-svn: 161075
|