Revision tags: llvmorg-3.6.0, llvmorg-3.6.0-rc4 |
|
#
ed9eb720 |
| 18-Feb-2015 |
Daniel Jasper <djasper@google.com> |
NFC: Use range-based for loops and more consistent naming.
No functional changes intended.
(I plan on doing some modifications to this function and would like to have as few unrelated changes as po
NFC: Use range-based for loops and more consistent naming.
No functional changes intended.
(I plan on doing some modifications to this function and would like to have as few unrelated changes as possible in the patch)
llvm-svn: 229649
show more ...
|
#
4d7b0438 |
| 18-Feb-2015 |
Daniel Jasper <djasper@google.com> |
Remove experimental options to control machine block placement.
This reverts r226034. Benchmarking with those flags has not revealed anything interesting.
llvm-svn: 229648
|
#
70eb9c5a |
| 14-Feb-2015 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
CodeGen: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.
getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getF
CodeGen: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.
getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind)
getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind)
Also, add `Function::getFnStackAlignment()`, and canonicalize:
getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment()
llvm-svn: 229208
show more ...
|
Revision tags: llvmorg-3.6.0-rc3, llvmorg-3.6.0-rc2, llvmorg-3.6.0-rc1 |
|
#
e3288147 |
| 14-Jan-2015 |
Chandler Carruth <chandlerc@gmail.com> |
[MBP] Add flags to disable the BadCFGConflict check in MachineBlockPlacement.
Some benchmarks have shown that this could lead to a potential performance benefit, and so adding some flags to try to h
[MBP] Add flags to disable the BadCFGConflict check in MachineBlockPlacement.
Some benchmarks have shown that this could lead to a potential performance benefit, and so adding some flags to try to help measure the difference.
A possible explanation. In diamond-shaped CFGs (A followed by either B or C both followed by D), putting B and C both in between A and D leads to the code being less dense than it could be. Always either B or C have to be skipped increasing the chance of cache misses etc. Moving either B or C to after D might be beneficial on average.
In the long run, but we should probably do a better job of analyzing the basic block and branch probabilities to move the correct one of B or C to after D. But even if we don't use this in the long run, it is a good baseline for benchmarking.
Original patch authored by Daniel Jasper with test tweaks and a second flag added by me.
Differential Revision: http://reviews.llvm.org/D6969
llvm-svn: 226034
show more ...
|
#
5772566e |
| 03-Jan-2015 |
Hal Finkel <hfinkel@anl.gov> |
[PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference
The existing code provided for specifying a global loop alignment preference. However, the preferred loop alignment m
[PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference
The existing code provided for specifying a global loop alignment preference. However, the preferred loop alignment might depend on the loop itself. For recent POWER cores, loops between 5 and 8 instructions should have 32-byte alignment (while the others are better with 16-byte alignment) so that the entire loop will fit in one i-cache line.
To support this, getPrefLoopAlignment has been made virtual, and can be provided with an optional MachineLoop* so the target can inspect the loop before answering the query. The default behavior, as before, is to return the value set with setPrefLoopAlignment. MachineBlockPlacement now queries the target for each loop instead of only once per function. There should be no functional change for other targets.
llvm-svn: 225117
show more ...
|
Revision tags: llvmorg-3.5.1, llvmorg-3.5.1-rc2, llvmorg-3.5.1-rc1 |
|
#
70573dcd |
| 19-Nov-2014 |
David Blaikie <dblaikie@gmail.com> |
Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard library's associative container inse
Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard library's associative container insert function.
This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions...
llvm-svn: 222334
show more ...
|
Revision tags: llvmorg-3.5.0, llvmorg-3.5.0-rc4, llvmorg-3.5.0-rc3, llvmorg-3.5.0-rc2 |
|
#
fc6de428 |
| 05-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Have MachineFunction cache a pointer to the subtarget to make lookups shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lo
Have MachineFunction cache a pointer to the subtarget to make lookups shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily.
Update the MIPS subtarget switching machinery to update this pointer at the same time it runs.
llvm-svn: 214838
show more ...
|
#
d913448b |
| 04-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change.
llvm-svn: 214781
|
Revision tags: llvmorg-3.5.0-rc1 |
|
#
e69170a1 |
| 26-Jun-2014 |
Alp Toker <alp@nuanti.com> |
Revert "Introduce a string_ostream string builder facilty"
Temporarily back out commits r211749, r211752 and r211754.
llvm-svn: 211814
|
#
61471738 |
| 26-Jun-2014 |
Alp Toker <alp@nuanti.com> |
Introduce a string_ostream string builder facilty
string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface.
small_string_ostream<by
Introduce a string_ostream string builder facilty
string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap.
This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface.
llvm-svn: 211749
show more ...
|
Revision tags: llvmorg-3.4.2, llvmorg-3.4.2-rc1, llvmorg-3.4.1, llvmorg-3.4.1-rc2 |
|
#
1b9dde08 |
| 22-Apr-2014 |
Chandler Carruth <chandlerc@gmail.com> |
[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind
[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers.
Other sub-trees will follow.
llvm-svn: 206837
show more ...
|
#
c0196b1b |
| 14-Apr-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206142
|
Revision tags: llvmorg-3.4.1-rc1 |
|
#
7c99ec5b |
| 31-Mar-2014 |
Paul Robinson <paul_robinson@playstation.sony.com> |
Disable each MachineFunctionPass for 'optnone' functions, unless that pass normally runs at optimization level None, or is part of the register allocation pipeline.
llvm-svn: 205228
|
#
4584cd54 |
| 07-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 203220
|
#
b6d0bd48 |
| 02-Mar-2014 |
Benjamin Kramer <benny.kra@googlemail.com> |
[C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.
llvm-svn: 202636
|
#
3a377bce |
| 01-Mar-2014 |
Benjamin Kramer <benny.kra@googlemail.com> |
Now that we have C++11, turn simple functors into lambdas and remove a ton of boilerplate.
No intended functionality change.
llvm-svn: 202588
|
#
7408c706 |
| 03-Jan-2014 |
Nico Weber <nicolasweber@gmx.de> |
Add a LLVM_DUMP_METHOD macro.
The motivation is to mark dump methods as used in debug builds so that they can be called from lldb, but to not do so in release builds so that they can be dead-strippe
Add a LLVM_DUMP_METHOD macro.
The motivation is to mark dump methods as used in debug builds so that they can be called from lldb, but to not do so in release builds so that they can be dead-stripped.
There's lots of potential follow-up work suggested in the thread "Should dump methods be LLVM_ATTRIBUTE_USED only in debug builds?" on cfe-dev, but everyone seems to agreen on this subset.
Macro name chosen by fair coin toss.
llvm-svn: 198456
show more ...
|
Revision tags: llvmorg-3.4.0, llvmorg-3.4.0-rc3 |
|
#
b78dec8f |
| 14-Dec-2013 |
Michael Gottesman <mgottesman@apple.com> |
[block-freq] Update MachineBlockPlacement and RegAllocGreedy to use the new MachineBlockFrequencyInfo methods.
llvm-svn: 197290
|
#
0f5f015b |
| 10-Dec-2013 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Fix gcc warnings.
Unused variable and unused typedef in release build.
llvm-svn: 196947
|
#
79d55f5c |
| 05-Dec-2013 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Revert part of GCC warning fix to fix debug build.
The typedef is used inside the DEBUG(), and apparently can't be moved inside of it.
llvm-svn: 196528
|
#
c44a3ff6 |
| 05-Dec-2013 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Fix minor GCC warnings.
Unused typedefs and unused variables.
llvm-svn: 196526
|
Revision tags: llvmorg-3.4.0-rc2 |
|
#
260258b9 |
| 25-Nov-2013 |
Chandler Carruth <chandlerc@gmail.com> |
Output a bit more information in the debug printing for MBP. This was useful when analyzing parts of zlib's behavior here.
llvm-svn: 195588
|
#
c8160d65 |
| 20-Nov-2013 |
Benjamin Kramer <benny.kra@googlemail.com> |
MachineBlockPlacement: Strengthen the source order bias when picking an exit block.
We now only allow breaking source order if the exit block frequency is significantly higher than the other exit bl
MachineBlockPlacement: Strengthen the source order bias when picking an exit block.
We now only allow breaking source order if the exit block frequency is significantly higher than the other exit block. The actual bias is currently under a flag so the best cut-off can be found; the flag defaults to the old behavior. The idea is to get some benchmark coverage over different values for the flag and pick the best one.
When we require the new frequency to be at least 20% higher than the old frequency I see a 5% speedup on zlib's deflate when compressing a random file on x86_64/westmere. Hal reported a small speedup on Fhourstones on a BG/Q and no regressions in the test suite.
The test case is the full long_match function from zlib's deflate. I was reluctant to add it for previous tweaks to branch probabilities because it's large and potentially fragile, but changed my mind since it's an important use case and more likely to break with all the current work going into the PGO infrastructure.
Differential Revision: http://llvm-reviews.chandlerc.com/D2202
llvm-svn: 195265
show more ...
|
Revision tags: llvmorg-3.4.0-rc1, llvmorg-3.3.1-rc1, llvmorg-3.3.0, llvmorg-3.3.0-rc3 |
|
#
8b8fd217 |
| 04-Jun-2013 |
Shuxin Yang <shuxin.llvm@gmail.com> |
Fix a defect in code-layout pass, improving Benchmarks/Olden/em3d/em3d by about 30% (4.58s vs 3.2s on an oldish Mac Tower).
The corresponding src is excerpted bellow. The lopp accounts for about
Fix a defect in code-layout pass, improving Benchmarks/Olden/em3d/em3d by about 30% (4.58s vs 3.2s on an oldish Mac Tower).
The corresponding src is excerpted bellow. The lopp accounts for about 90% of execution time. -------------------- cat -n test-suite/MultiSource/Benchmarks/Olden/em3d/make_graph.c 90 91 for (k=0; k<j; k++) 92 if (other_node == cur_node->to_nodes[k]) break;
The defective layout is sketched bellow, where the two branches need to swap. ------------------------------------------------------------------------ L: ... if (cond) goto out-of-loop goto L
While this code sequence is defective, I don't understand why it incurs 1/3 of execution time. CPU-event-profiling indicates the poor laoyout dose not increase in br-misprediction; it dosen't increase stall cycle at all, and it dosen't prevent the CPU detect the loop (i.e. Loop-Stream-Detector seems to be working fine as well)...
The root cause of the problem is that the layout pass calls AnalyzeBranch() with basic-block which is not updated to reflect its current layout.
rdar://13966341
llvm-svn: 183174
show more ...
|
Revision tags: llvmorg-3.3.0-rc2, llvmorg-3.3.0-rc1 |
|
#
c0adc9fd |
| 12-Apr-2013 |
Nadav Rotem <nrotem@apple.com> |
Don't disable block layout when forcing block alignment.
llvm-svn: 179355
|