Revision tags: llvmorg-3.6.1, llvmorg-3.6.1-rc1 |
|
#
61b305ed |
| 05-May-2015 |
Quentin Colombet <qcolombet@apple.com> |
[ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exit blocks.
As an example, and to avoid regressions being introduced, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64.
** Context **
Currently we insert the prologue and epilogue of the method/function in the entry and exit blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places.
** Motivating example **
Let us consider the following function that performs a call in only one branch of an if:
  define i32 @f(i32 %a, i32 %b) {
    %tmp = alloca i32, align 4
    %tmp2 = icmp slt i32 %a, %b
    br i1 %tmp2, label %true, label %false

  true:
    store i32 %a, i32* %tmp, align 4
    %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
    br label %false

  false:
    %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
    ret i32 %tmp.0
  }
On AArch64 this code generates (cfi directives removed for readability):
  _f:                              ; @f
  ; BB#0:
    stp x29, x30, [sp, #-16]!
    mov x29, sp
    sub sp, sp, #16                ; =16
    cmp w0, w1
    b.ge LBB0_2
  ; BB#1:                          ; %true
    stur w0, [x29, #-4]
    sub x1, x29, #4                ; =4
    mov w0, wzr
    bl _doSomething
  LBB0_2:                          ; %false
    mov sp, x29
    ldp x29, x30, [sp], #16
    ret
With shrink-wrapping we could generate:
  _f:                              ; @f
  ; BB#0:
    cmp w0, w1
    b.ge LBB0_2
  ; BB#1:                          ; %true
    stp x29, x30, [sp, #-16]!
    mov x29, sp
    sub sp, sp, #16                ; =16
    stur w0, [x29, #-4]
    sub x1, x29, #4                ; =4
    mov w0, wzr
    bl _doSomething
    add sp, x29, #16               ; =16
    ldp x29, x30, [sp], #16
  LBB0_2:                          ; %false
    ret
Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.
** Proposed Solution **
This patch introduces a new machine pass that performs the shrink-wrapping analysis (see the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore points into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI.
Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need a hack to properly avoid frequently executed points. Instead, it relies on dominance and loop properties.
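As a rough illustration of the dominance-based idea (this is not the code in ShrinkWrap.cpp), the sketch below picks a candidate save point as the nearest common dominator of every block that needs the frame. ToyDomTree, findSavePoint and the integer block numbering are hypothetical names introduced only for this example, and the loop handling is left out entirely.

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree: IDom[B] is the immediate dominator of block B.
// The entry block (index 0) is its own immediate dominator.
// Hypothetical stand-in for a real dominator-tree analysis.
struct ToyDomTree {
  std::vector<int> IDom;

  // Depth of block B in the dominator tree (the entry block has depth 0).
  int depth(int B) const {
    int D = 0;
    while (B != IDom[B]) {
      B = IDom[B];
      ++D;
    }
    return D;
  }

  // Nearest common dominator: lift the deeper block until both are at the
  // same depth, then lift both until they meet.
  int nearestCommonDominator(int A, int B) const {
    int DA = depth(A), DB = depth(B);
    while (DA > DB) { A = IDom[A]; --DA; }
    while (DB > DA) { B = IDom[B]; --DB; }
    while (A != B) { A = IDom[A]; B = IDom[B]; }
    return A;
  }
};

// Candidate save point: the deepest block that dominates every block
// that needs the frame to be set up.
int findSavePoint(const ToyDomTree &DT, const std::vector<int> &FrameUsers) {
  assert(!FrameUsers.empty() && "no block needs a frame");
  int Save = FrameUsers.front();
  for (int B : FrameUsers)
    Save = DT.nearestCommonDominator(Save, B);
  return Save;
}
```

For the motivating example, number the blocks 0 (entry), 1 (%true) and 2 (%false), so IDom is {0, 0, 0}. If only block 1 needs the frame, the candidate save point is block 1, which matches where the shrink-wrapped assembly places the prologue. A restore point could presumably be found the same way on a post-dominator tree.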
The pass is off by default and each target can opt in by setting the EnableShrinkWrap boolean to true in its derived class of TargetPassConfig. This setting can also be overridden on the command line by using -enable-shrink-wrap.
Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX methods to cope with basic blocks that are not necessarily the entry block.
** Design Decisions **
1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI, but for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some point we can expand this to several save points and restore points; the impacted components would then be:
   - The pass itself: New algorithm needed.
   - MachineFrameInfo: Hold a list or set of Save/Restore points instead of one pointer.
   - PEI: Should loop over the save points and restore points.
Anyhow, at least for this first iteration, I do not believe it is interesting to support the complex cases. We should revisit that when we have motivating examples.
Differential Revision: http://reviews.llvm.org/D9210
<rdar://problem/3201744>
llvm-svn: 236507
|
#
75e0c4b0 |
| 27-Mar-2015 |
Yaron Keren <yaron.keren@gmail.com> |
Remove superfluous .str() and replace std::string concatenation with Twine.
llvm-svn: 233392
|
#
11470c48 |
| 24-Mar-2015 |
Reid Kleckner <reid@kleckner.net> |
X86: Fix frameescape when not using an FP
We can't use TargetFrameLowering::getFrameIndexOffset directly, because Win64 really wants the offset from the stack pointer at the end of the prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(), which is a pretty close approximation of that. It fails to handle cases with interestingly large stack alignments, which is pretty uncommon on Win64 and is TODO.
llvm-svn: 233137
|
#
c5a85af3 |
| 21-Mar-2015 |
Eric Christopher <echristo@gmail.com> |
Cache the Function dependent subtarget on the MachineFunction.
As preparation for removing the getSubtargetImpl() call from TargetMachine go ahead and flip the switch on caching the function dependent subtarget and remove the bare getSubtargetImpl call from the X86 port. As part of this add a few tests that show we can generate code and assemble on X86 based on features/cpu on the Function.
llvm-svn: 232879
|
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1 |
|
#
a28d91d8 |
| 10-Mar-2015 |
Mehdi Amini <mehdi.amini@apple.com> |
DataLayout is mandatory, update the API to reflect it with references.
Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that.
This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation.
I turned as many pointers to DataLayout into references as possible; this helped in figuring out all the places where a nullptr could come up.
I initially had a local version of this patch broken into over 30 independent commits, but some later commits were cleaning the API and touching parts of the code modified in the previous commits, so it seemed cleaner without the intermediate state.
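To illustrate why switching from pointers to references exposes the nullptr cases, here is a minimal sketch. ToyDataLayout and both functions are hypothetical and do not mirror the real DataLayout API: the pointer version with a default argument lets a caller silently omit the layout and forces a fallback in the callee, whereas the reference version cannot be called without one.

```cpp
// Hypothetical stand-in for a data layout; only the pointer size is modelled.
struct ToyDataLayout {
  unsigned PointerSizeInBytes;
};

// Old style: an optional pointer with a default value. The callee has to
// invent an answer when the caller did not supply a layout.
unsigned pointerSizeOld(const ToyDataLayout *DL = nullptr) {
  if (!DL)
    return 8; // Fallback guess, assumed here purely for illustration.
  return DL->PointerSizeInBytes;
}

// New style: a reference. Forgetting the layout is now a compile error,
// so every call site that relied on the default has to be revisited.
unsigned pointerSizeNew(const ToyDataLayout &DL) {
  return DL.PointerSizeInBytes;
}
```

With a 32-bit layout, pointerSizeOld() with no argument still answers 8, while pointerSizeNew(DL) can only answer from the layout it is given.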
Test Plan:
Reviewers: echristo
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
|
Revision tags: llvmorg-3.6.0, llvmorg-3.6.0-rc4 |
|
#
70eb9c5a |
| 14-Feb-2015 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
CodeGen: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.
getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind)
getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind)
Also, add `Function::getFnStackAlignment()`, and canonicalize:
getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment()
llvm-svn: 229208
|
Revision tags: llvmorg-3.6.0-rc3, llvmorg-3.6.0-rc2 |
|
#
33804cac |
| 29-Jan-2015 |
Rafael Espindola <rafael.espindola@gmail.com> |
Remove MergeableConst.
Only the specific ones (MergeableConst4, MergeableConst8, MergeableConst16) are handled specially.
llvm-svn: 227440
|
#
e2d4b2df |
| 29-Jan-2015 |
Rafael Espindola <rafael.espindola@gmail.com> |
Use enum values. NFC.
llvm-svn: 227435
|
#
33726206 |
| 27-Jan-2015 |
Eric Christopher <echristo@gmail.com> |
Replace some uses of getSubtargetImpl with the cached version off of the MachineFunction or with the version that takes a Function reference as an argument.
llvm-svn: 227185
|
#
8b770651 |
| 26-Jan-2015 |
Eric Christopher <echristo@gmail.com> |
Move DataLayout back to the TargetMachine from TargetSubtargetInfo derived classes.
Since global data alignment, layout, and mangling are often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be laid out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets(*) have had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features.
llvm-svn: 227113
|
Revision tags: llvmorg-3.6.0-rc1 |
|
#
e9b89318 |
| 13-Jan-2015 |
Reid Kleckner <reid@kleckner.net> |
Add the llvm.frameallocate and llvm.recoverframeallocation intrinsics
These intrinsics allow multiple functions to share a single stack allocation from one function's call frame. The function with the allocation may only perform one allocation, and it must be in the entry block.
Functions accessing the allocation call llvm.recoverframeallocation with the function whose frame they are accessing and a frame pointer from an active call frame of that function.
These intrinsics are very difficult to inline correctly, so the intention is that they be introduced rarely, or at least very late during EH preparation.
Reviewers: echristo, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D6493
llvm-svn: 225746
|
Revision tags: llvmorg-3.5.1, llvmorg-3.5.1-rc2, llvmorg-3.5.1-rc1 |
|
#
76936ebc |
| 14-Oct-2014 |
Rafael Espindola <rafael.espindola@gmail.com> |
Remove unused member variable.
Fixes pr20904.
llvm-svn: 219706
|
#
000ef037 |
| 08-Oct-2014 |
Eric Christopher <echristo@gmail.com> |
Replace calls to get the subtarget and TargetFrameLowering with cached variables and a single call in the constructor.
llvm-svn: 219287
|
#
51bedaf2 |
| 08-Oct-2014 |
Eric Christopher <echristo@gmail.com> |
Use cached subtarget rather than looking it up on the TargetMachine again.
llvm-svn: 219285
|
#
2e52f028 |
| 04-Oct-2014 |
Benjamin Kramer <benny.kra@googlemail.com> |
Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it.
llvm-svn: 219068
|
Revision tags: llvmorg-3.5.0, llvmorg-3.5.0-rc4, llvmorg-3.5.0-rc3 |
|
#
0815a05f |
| 16-Aug-2014 |
Hal Finkel <hfinkel@anl.gov> |
Make isAliased property for fixed-offset stack objects adjustable
We used to assume that any fixed-offset stack object was not aliased. This meant that no IR value could point to the memory contained in such an object. This is a reasonable default, but is not a universally-correct target-independent fact. For example, on PowerPC (both Darwin and non-Darwin), some byval arguments are allocated at fixed offsets by the ABI. These, however, certainly can be pointed to by IR values. This change moves the 'isAliased' logic out of FixedStackPseudoSourceValue and into MFI, and allows the isAliased property to be overridden for fixed-offset objects.
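A toy model of the idea, assuming a simplified frame-info table (ToyFrameInfo and its members are illustrative names, not the MachineFrameInfo interface): the aliasing flag lives on each stack object, fixed-offset objects default to "not aliased", and a target can override that default for objects such as ABI-placed byval arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a per-function table of stack objects.
class ToyFrameInfo {
  struct StackObject {
    long Offset;    // Offset from the incoming stack pointer.
    bool IsFixed;   // True for objects placed at an ABI-mandated offset.
    bool IsAliased; // True if an IR value may point into this object.
  };
  std::vector<StackObject> Objects;

public:
  // Fixed-offset objects default to "not aliased", but the caller may
  // override that when the ABI makes the object addressable from IR.
  std::size_t createFixedObject(long Offset, bool IsAliased = false) {
    Objects.push_back({Offset, /*IsFixed=*/true, IsAliased});
    return Objects.size() - 1;
  }

  bool isAliased(std::size_t Idx) const {
    assert(Idx < Objects.size() && "invalid object index");
    return Objects[Idx].IsAliased;
  }

  void setIsAliased(std::size_t Idx, bool IsAliased) {
    assert(Idx < Objects.size() && "invalid object index");
    Objects[Idx].IsAliased = IsAliased;
  }
};
```

In this model a plain fixed slot keeps the conservative default, while a byval-style slot is created with IsAliased set to true, or flipped later through setIsAliased.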
This will be used by an upcoming commit to the PowerPC backend to fix PR20280.
No functionality change intended (the behavior of FixedStackPseudoSourceValue::isAliased has been made more conservative for callers that don't pass an MFI object, but I don't see any in-tree callers that do that).
llvm-svn: 215794
|
#
ce40dbcb |
| 12-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Have MachineRegisterInfo take and store the MachineFunction it was created for rather than the TargetMachine since we only needed the TM for the subtarget and we can get that from the MF.
llvm-svn: 215432
|
Revision tags: llvmorg-3.5.0-rc2 |
|
#
fc6de428 |
| 05-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Have MachineFunction cache a pointer to the subtarget to make lookups shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily.
Update the MIPS subtarget switching machinery to update this pointer at the same time it runs.
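The caching pattern can be sketched as follows. This is a simplified model with hypothetical names (ToySubtarget, ToyTargetMachine, ToyMachineFunction), not the actual LLVM classes: the function object resolves its subtarget once in the constructor and later queries go through the cached pointer instead of repeating the lookup.

```cpp
#include <map>
#include <string>

// Hypothetical subtarget description; only the CPU name is modelled.
struct ToySubtarget {
  std::string CPU;
};

// Hypothetical target machine that owns the subtargets and counts lookups.
class ToyTargetMachine {
  std::map<std::string, ToySubtarget> Subtargets;

public:
  int Lookups = 0;

  // Returns the subtarget for CPU, creating it on first use.
  // References into a std::map stay valid across later insertions.
  const ToySubtarget &getSubtargetFor(const std::string &CPU) {
    ++Lookups;
    return Subtargets.emplace(CPU, ToySubtarget{CPU}).first->second;
  }
};

// Hypothetical function object that caches its subtarget at construction.
class ToyMachineFunction {
  const ToySubtarget *STI; // Looked up once, reused afterwards.

public:
  ToyMachineFunction(ToyTargetMachine &TM, const std::string &CPU)
      : STI(&TM.getSubtargetFor(CPU)) {}

  const ToySubtarget &getSubtarget() const { return *STI; }
};
```

Calling getSubtarget() any number of times leaves the lookup counter at one. Code that switches subtargets at run time, as the MIPS machinery mentioned above does, would also have to refresh the cached pointer.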
llvm-svn: 214838
|
#
d913448b |
| 04-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change.
llvm-svn: 214781
|
#
cc39b675 |
| 24-Jul-2014 |
Hal Finkel <hfinkel@anl.gov> |
AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode*).
This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode* in AliasAnalysis::Location, etc.
No functionality change intended.
llvm-svn: 213859
|
Revision tags: llvmorg-3.5.0-rc1 |
|
#
5a1c4b82 |
| 14-Jul-2014 |
David Majnemer <david.majnemer@gmail.com> |
CodeGen: Add a getSectionKind method to MachineConstantPoolEntry
This is just a helper routine, no functionality has changed.
llvm-svn: 212993
|
#
e69170a1 |
| 26-Jun-2014 |
Alp Toker <alp@nuanti.com> |
Revert "Introduce a string_ostream string builder facility"
Temporarily back out commits r211749, r211752 and r211754.
llvm-svn: 211814
|
#
61471738 |
| 26-Jun-2014 |
Alp Toker <alp@nuanti.com> |
Introduce a string_ostream string builder facility
string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap.
This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface.
llvm-svn: 211749
|
#
1db5995d |
| 25-Jun-2014 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter.
-- This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment.
Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer.
Patch by Vadim Chugunov!
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D4081
llvm-svn: 211691
|
#
c403be19 |
| 25-Jun-2014 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Reformat.
llvm-svn: 211689
|