#
e1a2e90f |
| 31-Mar-2016 |
Hans Wennborg <hans@hanshq.net> |
Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction.
It also greatly simplifies the calls to this function from PrologEpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator.
Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment.
Differential Revision: http://reviews.llvm.org/D18627
llvm-svn: 265036
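The "continue with the returned iterator" pattern described above can be sketched outside LLVM. This is a minimal illustration only: `InstrList`, `eliminatePseudo`, and `lowerBlock` are hypothetical names, and a `std::list` of strings stands in for a basic block; none of this is the actual PEI code.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for an instruction list; not LLVM's MachineBasicBlock.
using InstrList = std::list<std::string>;

// Erases the pseudo instruction at I and returns the position to resume from.
// The callee knows what it erased, so the caller never holds a stale iterator.
InstrList::iterator eliminatePseudo(InstrList &Block, InstrList::iterator I) {
  // std::list::erase returns the iterator following the erased element.
  return Block.erase(I);
}

// Removes every "PSEUDO" entry, continuing from whatever the callee returns.
void lowerBlock(InstrList &Block) {
  for (auto I = Block.begin(); I != Block.end();) {
    if (*I == "PSEUDO")
      I = eliminatePseudo(Block, I); // no manual bookkeeping to resume iteration
    else
      ++I;
  }
}
```

With this shape, the callee is free to erase neighbouring elements as well, as long as the iterator it returns is still valid.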
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3 |
|
#
8374c1f7 |
| 23-Feb-2016 |
Aaron Ballman <aaron@aaronballman.com> |
Silencing a signed vs unsigned mismatch.
llvm-svn: 261640
|
#
a8ef3c9b |
| 22-Feb-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
Fix for PR26690 take 2
This is what was meant to be in the initial commit to fix this bug. The parens were missing. This commit also adds a test case for the bug and has undergone full testing on PPC and X86.
llvm-svn: 261546
|
#
33618674 |
| 22-Feb-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
Revert bad fix for PR26690.
llvm-svn: 261527
|
#
d58b976b |
| 22-Feb-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
Fix for PR26690
I mistook BitVector::empty() to mean BitVector::count() == 0 and it does not. Corrected the issue with the fix for PR26500.
llvm-svn: 261525
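The distinction behind this fix can be sketched with a small stand-in type: a container-style `empty()` asks whether the vector has zero length, while `count() == 0` asks whether no bits are set. `TinyBitVector` below is a hypothetical illustration built on `std::vector<bool>`, not LLVM's `BitVector`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature bit vector, for illustration only.
struct TinyBitVector {
  std::vector<bool> Bits;

  explicit TinyBitVector(std::size_t N) : Bits(N, false) {}

  // True only when the vector holds zero bits (size == 0).
  bool empty() const { return Bits.empty(); }

  // Number of bits that are set to true.
  std::size_t count() const {
    return static_cast<std::size_t>(
        std::count(Bits.begin(), Bits.end(), true));
  }

  // True when no bit is set, regardless of the vector's size.
  bool none() const { return count() == 0; }
};
```

A vector of eight cleared bits is not `empty()` even though its `count()` is zero, which is the mix-up the commit message describes.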
|
#
daf0ca23 |
| 20-Feb-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
Fix the build bot break caused by rL261441.
The patch has a necessary call to a function inside an assert. Which is fine when you have asserts turned on. Not so much when they're off. Sorry about the regression.
llvm-svn: 261447
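The class of bug described above can be sketched as follows. The names `reserveScratch`, `buggy`, and `fixed` are hypothetical and are not taken from the patch; the point is only that an expression inside `assert` is not evaluated when `NDEBUG` is defined.

```cpp
#include <cassert>

// Counts how often the side-effecting helper actually runs.
static int Calls = 0;

// Hypothetical helper with a required side effect.
bool reserveScratch() {
  ++Calls;
  return true;
}

// Problematic: with NDEBUG defined, assert expands to nothing and the
// call to reserveScratch() disappears together with the check.
void buggy() { assert(reserveScratch() && "no scratch register available"); }

// Safer: perform the call unconditionally, then assert on the result.
void fixed() {
  bool Ok = reserveScratch();
  assert(Ok && "no scratch register available");
  (void)Ok; // silences the unused-variable warning in non-asserts builds
}
```

In `fixed()`, the helper runs in both asserts and non-asserts builds; only the check itself is compiled out.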
|
#
ae22101c |
| 20-Feb-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
Fix for PR 26500
This patch corresponds to review: http://reviews.llvm.org/D17294
It ensures that whatever block we are emitting the prologue/epilogue into, we have the necessary scratch registers. It removes the hard-coded scratch register numbers, because registers that are guaranteed to be available in the function prologue/epilogue are not guaranteed to be available within the function body. Since we shrink-wrap, the prologue/epilogue may end up in the function body.
llvm-svn: 261441
|
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1 |
|
#
e5e035a3 |
| 05-Dec-2015 |
Craig Topper <craig.topper@gmail.com> |
Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef.
llvm-svn: 254843
|
#
39b7d65d |
| 02-Dec-2015 |
Alexey Samsonov <vonosmas@gmail.com> |
[PowerPC] Remove wild call to RegScavenger::initRegState().
This call should in fact be made by RegScavenger::enterBasicBlock() called below. The first call does nothing except for triggering UB, indicated by UBSan (passing nullptr to memset()).
llvm-svn: 254548
|
Revision tags: llvmorg-3.7.1 |
|
#
f4ce2f3a |
| 30-Nov-2015 |
Kit Barton <kbarton@ca.ibm.com> |
Enable shrink wrapping for PPC64
Re-enable shrink wrapping for PPC64 Little Endian.
One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-thru blocks (blocks with no terminator) correctly.
Tested with all LLVM tests, clang tests, and the self-hosting build, with no problems found.
Phabricator: http://reviews.llvm.org/D14778
llvm-svn: 254314
|
Revision tags: llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
#
9c432ae1 |
| 16-Nov-2015 |
Kit Barton <kbarton@ca.ibm.com> |
Find available scratch register to use in function prologue and epilogue as part of shrink wrapping.
Phabricator: http://reviews.llvm.org/D13955
llvm-svn: 253247
|
#
4c457758 |
| 30-Sep-2015 |
Hal Finkel <hfinkel@anl.gov> |
[PowerPC] Disable shrink wrapping
Shrink wrapping is causing a self-hosting failure on PPC64/Linux. Disable for now until the problem can be fixed.
llvm-svn: 248924
|
#
c2d4befb |
| 25-Sep-2015 |
Matthias Braun <matze@braunis.de> |
MachineBasicBlock: Factor out common code into isReturnBlock()
llvm-svn: 248617
|
#
8061e864 |
| 11-Sep-2015 |
NAKAMURA Takumi <geek4civic@gmail.com> |
PPCFrameLowering::emitEpilogue(): Avoid manipulating MBBI on iterator end.
It caused crash in MachineInstr::hasPropertyInBundle() since r247237.
llvm-svn: 247395
|
#
d3b904d4 |
| 10-Sep-2015 |
Kit Barton <kbarton@ca.ibm.com> |
Enable the shrink wrapping optimization for PPC64.
The changes in this patch are as follows:
1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function.
2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function.
3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run: shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line.
A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64.
Phabricator review: http://reviews.llvm.org/D11817
llvm-svn: 247237
|
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3 |
|
#
3e190cb0 |
| 14-Aug-2015 |
Saleem Abdulrasool <compnerd@compnerd.org> |
PowerPC: remove dead initialization (NFC)
Identified by the clang static analyzer. No functional change intended.
llvm-svn: 245022
|
Revision tags: studio-1.4, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1 |
|
#
9912bb81 |
| 14-Jul-2015 |
Matthias Braun <matze@braunis.de> |
MachineRegisterInfo: Remove UsedPhysReg infrastructure
We have detailed def/use lists for every physical register in MachineRegisterInfo anyway, so there is little use in maintaining an additional bitset of which ones are used.
Removing it frees us from extra bookkeeping. This simplifies VirtRegMap.
Differential Revision: http://reviews.llvm.org/D10911
llvm-svn: 242173
|
#
02564865 |
| 14-Jul-2015 |
Matthias Braun <matze@braunis.de> |
PrologEpilogInserter: Rewrite API to determine callee save registers.
This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():
- Rename the function to determineCalleeSaves()
- Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo.
- Without PhysRegUsed the implementation is fine tuned to not save physical registers which are only read but never modified.
Related to rdar://21539507
Differential Revision: http://reviews.llvm.org/D10909
llvm-svn: 242165
|
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1, llvmorg-3.6.1, llvmorg-3.6.1-rc1 |
|
#
61b305ed |
| 05-May-2015 |
Quentin Colombet <qcolombet@apple.com> |
[ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exit blocks.
As an example and to avoid introducing regressions, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64.
** Context **
Currently we insert the prologue and epilogue of the method/function in the entry and exit blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places.
** Motivating example **
Let us consider the following function that performs a call only in one branch of an if:
  define i32 @f(i32 %a, i32 %b) {
    %tmp = alloca i32, align 4
    %tmp2 = icmp slt i32 %a, %b
    br i1 %tmp2, label %true, label %false

  true:
    store i32 %a, i32* %tmp, align 4
    %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
    br label %false

  false:
    %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
    ret i32 %tmp.0
  }
On AArch64 this code generates (removing the cfi directives to ease readability):
  _f:                                     ; @f
  ; BB#0:
    stp x29, x30, [sp, #-16]!
    mov x29, sp
    sub sp, sp, #16             ; =16
    cmp w0, w1
    b.ge LBB0_2
  ; BB#1:                                 ; %true
    stur w0, [x29, #-4]
    sub x1, x29, #4             ; =4
    mov w0, wzr
    bl _doSomething
  LBB0_2:                                 ; %false
    mov sp, x29
    ldp x29, x30, [sp], #16
    ret
With shrink-wrapping we could generate:
  _f:                                     ; @f
  ; BB#0:
    cmp w0, w1
    b.ge LBB0_2
  ; BB#1:                                 ; %true
    stp x29, x30, [sp, #-16]!
    mov x29, sp
    sub sp, sp, #16             ; =16
    stur w0, [x29, #-4]
    sub x1, x29, #4             ; =4
    mov w0, wzr
    bl _doSomething
    add sp, x29, #16            ; =16
    ldp x29, x30, [sp], #16
  LBB0_2:                                 ; %false
    ret
Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.
** Proposed Solution **
This patch introduces a new machine pass that performs the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore points into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI.
Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need a hack to properly avoid frequently executed points. Instead, it relies on dominance and loop properties.
The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overridden on the command line by using -enable-shrink-wrap.
Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX methods to cope with basic blocks that are not necessarily the entry block.
** Design Decisions **
1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some point we can expand this to several save points and restore points; the impacted components would then be:
   - The pass itself: New algorithm needed.
   - MachineFrameInfo: Hold a list or set of Save/Restore points instead of one pointer.
   - PEI: Should loop over the save points and restore points.
Anyhow, at least for this first iteration, I do not believe it is interesting to support the complex cases. We should revisit that when we have motivating examples.
Differential Revision: http://reviews.llvm.org/D9210
<rdar://problem/3201744>
llvm-svn: 236507
|
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1, llvmorg-3.6.0 |
|
#
c93a9a2c |
| 25-Feb-2015 |
Hal Finkel <hfinkel@anl.gov> |
[PowerPC] Add support for the QPX vector instruction set
This adds support for the QPX vector instruction set, which is used by the enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bits wide, holding 4 double-precision floating-point values. Boolean values, modeled here as <4 x i1>, are actually also represented as floating-point values (essentially { -1, 1 } for { false, true }). QPX shares many features with Altivec and VSX, but is distinct from both of them. One major difference is that, instead of adding completely-separate vector registers, QPX vector registers are extensions of the scalar floating-point registers (lane 0 is the corresponding scalar floating-point value). The operations supported on QPX vectors mirror those supported on the scalar floating-point values (with some additional ones for permutations and logical/comparison operations).
I've been maintaining this support out-of-tree, as part of the bgclang project, for several years. This is not the entire bgclang patch set, but is most of the subset that can be cleanly integrated into LLVM proper at this time. Adding this to the LLVM backend is part of my efforts to rebase bgclang to the current LLVM trunk, but is independently useful (especially for codes that use LLVM as a JIT in library form).
The assembler/disassembler test coverage is complete. The CodeGen test coverage is not, but I've included some tests, and more will be added as follow-up work.
llvm-svn: 230413
|
Revision tags: llvmorg-3.6.0-rc4 |
|
#
003ed332 |
| 14-Feb-2015 |
Chandler Carruth <chandlerc@gmail.com> |
Remove a variable only used in an assert and sink its initializer into the assert. Fixes -Wunused-variable on non-asserts builds.
llvm-svn: 229250
|
#
5bedaf93 |
| 14-Feb-2015 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
PowerPC: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.
getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind)
getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind)
llvm-svn: 229224
|
#
fcd3d87a |
| 13-Feb-2015 |
Eric Christopher <echristo@gmail.com> |
The base pointer save offset can be computed at initialization time, do so and fix up the calls.
llvm-svn: 229169
|
#
a4ae2131 |
| 13-Feb-2015 |
Eric Christopher <echristo@gmail.com> |
PPC LinkageSize can be computed at initialization time, do so.
llvm-svn: 229163
|
Revision tags: llvmorg-3.6.0-rc3 |
|
#
dc3a8a4a |
| 13-Feb-2015 |
Eric Christopher <echristo@gmail.com> |
PPCFrameLowering's FramePointerOffset can be computed at initialization time. Do so.
llvm-svn: 228998
|