Revision tags: llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1 |
|
#
f381afc9 |
| 20-Aug-2013 |
Bill Schmidt <wschmidt@linux.vnet.ibm.com> |
[PowerPC] More refactoring prior to real PPC emitPrologue/Epilogue changes.
(Patch committed on behalf of Mark Minich, whose log entry follows.)
This is a continuation of the refactorings performed in svn rev 188573 (see that rev's comments for more detail).
This is my stage 2 refactoring: I combined the emitPrologue() & emitEpilogue() PPC32 & PPC64 code into a single flow, simplifying a lot of the code since in essence the PPC32 & PPC64 code generation logic is the same, only the instruction forms are different (in most cases). This simplification is necessary because my functional changes (yet to come) add significant complexity, and without the simplification of my stage 2 refactoring, the overall complexity of both emitPrologue() & emitEpilogue() would have become almost intractable for most mortal programmers (like me).
This submission was intended to be a pure refactoring (no functional changes whatsoever). However, in the process of combining the PPC32 & PPC64 flows, I spotted a difference that I believe is a bug (see svn rev 186478 line 863, or svn rev 188573 line 888): This line appears to be restoring the BP with the original FP content, not the original BP content. When I merged the 32-bit and 64-bit code, I used the corresponding code from the 64-bit flow, which I believe uses the correct offset (BPOffset) for this operation.
llvm-svn: 188741
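A minimal sketch of the stage 2 parameterization described above, assuming LLVM's usual PPC opcode and register names (PPC::X1/PPC::R1, PPC::STD/PPC::STW, and so on); the real selection in PPCFrameLowering.cpp covers many more instruction forms:

    // Sketch: pick the 32- vs. 64-bit forms once, then drive a single shared
    // prologue/epilogue flow from these constants.
    bool isPPC64 = Subtarget.isPPC64();                   // assumed in scope
    unsigned SPReg = isPPC64 ? PPC::X1  : PPC::R1;
    unsigned FPReg = isPPC64 ? PPC::X31 : PPC::R31;
    const MCInstrDesc &StoreInst = TII.get(isPPC64 ? PPC::STD  : PPC::STW);
    const MCInstrDesc &StoreUpdt = TII.get(isPPC64 ? PPC::STDU : PPC::STWU);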
|
#
8893a3d1 |
| 16-Aug-2013 |
Bill Schmidt <wschmidt@linux.vnet.ibm.com> |
[PowerPC] Preparatory refactoring for making prologue and epilogue safe on PPC32 SVR4 ABI
[Patch and following text by Mark Minich; committing on his behalf.]
There are FIXME's in PowerPC/PPCFrameLowering.cpp, method PPCFrameLowering::emitPrologue() related to "negative offsets of R1" on PPC32 SVR4. They're true, but the real issue is that on PPC32 SVR4 (and any ABI without a Red Zone), no spills may be made until after the stackframe is claimed, which also includes the LR spill which is at a positive offset. The same problem exists in emitEpilogue(), though there's no FIXME for it. I intend to fix this issue, making LLVM-compiled code finally safe for use on SVR4/EABI/e500 32-bit platforms (including in particular, OS-free embedded systems & kernel code, where interrupts may share the same stack as user code).
In preparation for making these changes, to make the diffs for the functional changes less cluttered, I am providing the non-functional refactorings in two stages:
Stage 1 does some minor fluffy refactorings to pull multiple method calls up into a single bool, creating named bools for repeated uses of obscure logic, moving some code up earlier because either stage 2 or my final version will require it earlier, and rewording/adding some comments. My stage 1 changes can be characterized as primarily fluffy cleanup, the purpose of which may be unclear until the stage 2 or final changes are made.
My stage 2 refactorings combine the separate PPC32 & PPC64 logic, which is currently performed by largely duplicate code, into a single flow, with the differences handled by a group of constants initialized early in the methods.
This submission is for my stage 1 changes. There should be no functional changes whatsoever; this is a pure refactoring.
llvm-svn: 188573
|
#
1860763c |
| 18-Jul-2013 |
Hal Finkel <hfinkel@anl.gov> |
PPC: Support dynamic allocas with large alignment
Support for dynamic stack alignments in the PPC backend has been unfinished, in part because it depends on dynamic stack realignment (which I only just recently implemented fully). Now we can also support dynamic allocas with higher than the default target stack alignment (16 bytes).
In order to round-up the requested size to the maximum requested alignment, we need an additional register to hold the rounded-up size. We're already using one scavenged register to hold the previous stack-pointer value (which needs to be stored with the signal-safe stdux update), and so when we have dynamic allocas and a large alignment, we allocate two emergency spill slots for the scavenger.
llvm-svn: 186562
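To illustrate the round-up mentioned above (not the literal backend code), the requested allocation size is padded to the maximum requested alignment before the stack pointer is decremented; a minimal sketch with an illustrative helper name:

    // Round a dynamic alloca's size up to the maximum requested alignment.
    // MaxAlign must be a power of two (16 is the default target stack alignment).
    uint64_t roundUpToMaxAlign(uint64_t Size, uint64_t MaxAlign) {
      return (Size + MaxAlign - 1) & ~(MaxAlign - 1);
    }
    // e.g. roundUpToMaxAlign(40, 64) == 64; this rounded-up value is what needs
    // the second scavenged register, since the first one already holds the old
    // stack-pointer value for the stdux update.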
|
#
f05d6c78 |
| 17-Jul-2013 |
Hal Finkel <hfinkel@anl.gov> |
PPC: Add base-pointer support to builtin setjmp/longjmp
First, this changes the base-pointer implementation to remove an unnecessary complication (and one that is incompatible with how builtin SjLj is implemented): instead of using r31 as the base pointer when it is not needed as a frame pointer, now the base pointer will always be r30 when needed.
Second, we introduce another pseudo register, BP, which is used just like the FP pseudo register to refer to the base register before we know for certain what register it will be.
Third, we now save BP into the jmp_buf, and restore r30 from that slot in longjmp. If the function that called setjmp did not use a base pointer, then r30 will be overwritten by the setjmp-calling-function's restore code. FP restoration (which is restored into r31) works the same way.
llvm-svn: 186545
|
#
a7c54e8c |
| 17-Jul-2013 |
Hal Finkel <hfinkel@anl.gov> |
PPC: Implement base pointer and stack realignment
This builds on some frame-lowering code that has existed since 2005 (r24224) but was disabled in 2008 (r48188) because it needed base pointer support to function correctly. This implementation follows the strategy suggested by Dale Johannesen in r48188 where the following comment was added:
This does not currently work, because the delta between old and new stack pointers is added to offsets that reference incoming parameters after the prolog is generated, and the code that does that doesn't handle a variable delta. You don't want to do that anyway; a better approach is to reserve another register that retains to the incoming stack pointer, and reference parameters relative to that.
And now we do exactly that. If we don't need a frame pointer, then we use r31 as a base pointer. If we do need a frame pointer, then we use r30 as a base pointer. The base pointer retains the value of the stack pointer before it was decremented in the prologue. We then use the base pointer to resolve all negative frame indices. The basic scheme follows that for base pointers in the X86 backend.
We use a base pointer when we need to dynamically realign the incoming stack pointer. This currently applies only to static objects (dynamic allocas with large alignments, and base-pointer support in SjLj lowering will come in future commits).
llvm-svn: 186478
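A hedged sketch of the register choice described above as of this commit (a later commit in this log switches the base pointer to always be r30 when needed); the helper name is illustrative, not the backend's actual API:

    // Illustrative only: which GPR holds the pre-decrement stack-pointer value.
    unsigned getBasePointerReg(bool HasFP) {
      // No frame pointer needed: r31 is free, so it doubles as the base pointer.
      // Frame pointer needed (r31 taken): fall back to r30.
      return HasFP ? PPC::R30 : PPC::R31;
    }
    // Negative frame indices are then resolved against this register rather than
    // the adjusted r1, mirroring the X86 base-pointer scheme.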
|
#
b94011fd |
| 14-Jul-2013 |
Craig Topper <craig.topper@gmail.com> |
Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
|
Revision tags: llvmorg-3.3.1-rc1 |
|
#
49f487e6 |
| 03-Jul-2013 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
[PowerPC] Use mtocrf when available
Just as with mfocrf, it is also preferable to use mtocrf instead of mtcrf when only a single CR register is to be written.
Current code however always emits mtcrf. This probably does not matter when using an external assembler, since the GNU assembler will in fact automatically replace mtcrf with mtocrf when possible. It does create inefficient code with the integrated assembler, however.
To fix this, this patch adds MTOCRF/MTOCRF8 instruction patterns and uses those instead of MTCRF/MTCRF8 everywhere. Just as done in the MFOCRF patch committed as r185556, these patterns will be converted back to MTCRF if MTOCRF is not available on the machine.
As a side effect, this allows modifying the MTCRF pattern to accept the full range of mask operands for the benefit of the asm parser.
llvm-svn: 185561
|
#
ac1a24b5 |
| 28-Jun-2013 |
Hal Finkel <hfinkel@anl.gov> |
PPC: Ignore spill/restore requests for VRSAVE (except on Darwin)
This fixes PR16418, which reports that a function calling __builtin_unwind_init() asserts. The cause is that this generates a spill/restore for VRSAVE, and we support that only on Darwin (because VRSAVE is only really used on Darwin).
The test case checks only that we don't crash. We can add correctness checks once someone verifies what behavior the function is supposed to have.
llvm-svn: 185235
|
#
bc07a890 |
| 18-Jun-2013 |
Bill Wendling <isanbard@gmail.com> |
Use pointers to the MCAsmInfo and MCRegInfo.
Someone may want to do something crazy, like replace these objects if they change or something.
No functionality change intended.
llvm-svn: 184175
|
Revision tags: llvmorg-3.3.0, llvmorg-3.3.0-rc3, llvmorg-3.3.0-rc2 |
|
#
b08d2c2d |
| 16-May-2013 |
Rafael Espindola <rafael.espindola@gmail.com> |
Remove addFrameMove.
Now that we have good testing, remove addFrameMove and create cfi instructions directly.
llvm-svn: 182052
|
#
9d980cbd |
| 16-May-2013 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
[PowerPC] Use true offset value in "memrix" machine operands
This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands.
This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4.
The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value).
Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way.
This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4.
This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands.
llvm-svn: 182032
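A worked example of the encoding change, using an illustrative helper rather than the actual encoder: a 16-byte offset used to be carried as 4 (16 divided by 4) in the machine operand; after this patch the operand carries 16, and the DS-form constraints are checked and the division by 4 applied only when the instruction is emitted:

    // Illustrative DS-form immediate handling for LD/STD-style instructions.
    // The operand now holds the true byte offset; the hardware field stores it
    // divided by 4, so it must be a multiple of 4 and fit in a signed 16-bit value.
    int encodeMemrixOffset(int64_t Offset) {
      assert((Offset & 3) == 0 && "DS-form offset must be a multiple of 4");
      assert(Offset >= -32768 && Offset <= 32767 && "offset must fit in 16 bits");
      return static_cast<int>(Offset / 4);   // 14-bit field; low two bits implicit
    }
    // Old scheme: operand == 4 for a 16-byte offset.  New scheme: operand == 16,
    // and encodeMemrixOffset(16) == 4 at emission time.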
|
#
6e8c0d94 |
| 16-May-2013 |
Rafael Espindola <rafael.espindola@gmail.com> |
Removed dead code.
llvm-svn: 181975
|
#
ef3d1a24 |
| 14-May-2013 |
Bill Schmidt <wschmidt@linux.vnet.ibm.com> |
PPC32: Fix stack collision between FP and CR save areas.
The changes to CR spill handling missed a case for 32-bit PowerPC. The code in PPCFrameLowering::processFunctionBeforeFrameFinalized() checks whether a CR spill has occurred using a flag in the function info. This flag is only set by storeRegToStackSlot and loadRegFromStackSlot. spillCalleeSavedRegisters does not call storeRegToStackSlot, but instead produces MIs directly. Thus we don't see that the CR is spilled when assigning frame offsets, and the CR spill ends up colliding with some other location (generally the FP slot).
This patch sets the flag in spillCalleeSavedRegisters for PPC32 so that the CR spill is properly detected and gets its own slot in the stack frame.
llvm-svn: 181800
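A minimal sketch of the fix, assuming the PPCFunctionInfo flag is spelled roughly as in the backend (setSpillsCR); the loop shape and register-class check are illustrative:

    // In spillCalleeSavedRegisters (PPC32): record that a CR field is being
    // spilled so processFunctionBeforeFrameFinalized() later assigns it a
    // dedicated slot instead of letting it collide with the FP save area.
    for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
      unsigned Reg = CSI[i].getReg();
      if (PPC::CRRCRegClass.contains(Reg))
        FuncInfo->setSpillsCR();   // previously set only by
                                   // storeRegToStackSlot/loadRegFromStackSlot
    }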
|
#
1b09836b |
| 11-May-2013 |
Rafael Espindola <rafael.espindola@gmail.com> |
Change getFrameMoves to return a const reference.
To add a frame move, there is now a dedicated addFrameMove, which also takes care of constructing the move itself.
llvm-svn: 181657
|
Revision tags: llvmorg-3.3.0-rc1 |
|
#
6736988a |
| 15-Apr-2013 |
Hal Finkel <hfinkel@anl.gov> |
Fix PPC64 CR spill location for callee-saved registers
This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition registers, the spill location is specified relative to the stack pointer (SP + 8). However, this is not relative to the SP after the new stack frame is established, but instead relative to the caller's stack pointer (it is stored into the linkage area of the parent's stack frame).
So, like with the link register, we don't directly spill the CRs with other callee-saved registers, but just mark them to be spilled during prologue generation.
In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32).
llvm-svn: 179500
|
#
2f293915 |
| 13-Apr-2013 |
Hal Finkel <hfinkel@anl.gov> |
Mark all PPC CR registers to be spilled as live-in and tag MFCR appropriately
Leaving MFCR as having unmodeled side effects is not enough to prevent unwanted instruction reordering post-RA. We could probably apply a stronger barrier attribute, but there is a better way: add all (not just the first) CRs to be spilled as live-in to the entry block, and add all CRs to the MFCR instruction as implicitly killed.
Unfortunately, I don't have a small test case.
llvm-svn: 179465
|
#
d85a04b3 |
| 13-Apr-2013 |
Hal Finkel <hfinkel@anl.gov> |
Spill and restore PPC CR registers using the FP when we have one
For functions that need to spill CRs, and have dynamic stack allocations, the value of the SP during the restore is not what it was during the save, and so we need to use the FP in these cases (as for all of the other spills and restores, but the CR restore has a special code path because its reserved slot, like the link register, is specified directly relative to the adjusted SP).
llvm-svn: 179457
|
#
035b4825 |
| 28-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
Cleanup PPC CR-spill kill flags and 32- vs. 64-bit instructions
There were a few places where kill flags were not being set correctly, and where 32-bit instruction variants were being used with 64-bit registers. After r178180, this code was being triggered causing llc to assert.
llvm-svn: 178220
|
#
feea6539 |
| 26-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
PPC: Use HWEncoding and TRI->getEncodingValue
As pointed out by Jakob, we don't need to maintain a separate register-numbering table. Instead we should let TableGen generate the table for us from the information (already present) in PPCRegisterInfo.td. TRI->getEncodingValue is now used to access register-encoding values.
No functionality change intended.
llvm-svn: 178067
|
#
0dfbb05a |
| 26-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
Use multiple virtual registers in PPC CR spilling
Now that the register scavenger can support multiple spill slots, and PEI can use virtual-register-based scavenging for multiple simultaneous registers, we can use a virtual register for the transfer register in the CR spilling code.
This should eliminate the last place (outside of the prologue/epilogue) where we depend on the unconditional availability of the r0 register. We will soon be able to allocate it (in a somewhat restricted sense) as a GPR.
llvm-svn: 178060
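A hedged sketch of the change in the CR spill path: rather than assuming r0 is available, the code asks for a virtual GPR and leaves it to PEI's scavenging (backed by the extra emergency spill slots) to assign a physical register; names other than createVirtualRegister and the PPC register classes are assumptions:

    // Before: the transfer register was effectively hard-coded (r0).
    // After: request a virtual register of the appropriate class.
    MachineRegisterInfo &MRI = MF.getRegInfo();
    unsigned VReg = MRI.createVirtualRegister(Is64Bit ? &PPC::G8RCRegClass
                                                      : &PPC::GPRCRegClass);
    // VReg then serves as the destination of MFCR and the source of the store.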
|
#
cc1eeda1 |
| 23-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
Note in PPCFunctionInfo VRSAVE spills
In preparation for using the new register scavenger capability for providing more than one register simultaneously, specifically note functions that have spilled VRSAVE (currently, this can happen only in functions that use the setjmp intrinsic). As with CR spilling, such functions will need to provide two emergency spill slots to the scavenger.
No functionality change intended.
llvm-svn: 177832
|
#
9e331c2f |
| 22-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
Allow the register scavenger to spill multiple registers
This patch lets the register scavenger make use of multiple spill slots in order to guarantee that it will be able to provide multiple registers simultaneously.
To support this, the RS's API has changed slightly: setScavengingFrameIndex / getScavengingFrameIndex have been replaced by addScavengingFrameIndex / isScavengingFrameIndex / getScavengingFrameIndices.
In forthcoming commits, the PowerPC backend will use this capability in order to implement the spilling of condition registers, and some special-purpose registers, without relying on r0 being reserved. In some cases, spilling these registers requires two GPRs: one for addressing and one to hold the value being transferred.
llvm-svn: 177774
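A sketch of how a backend can use the new API named above (addScavengingFrameIndex) to reserve two emergency slots; the register class and slot parameters are just examples:

    // Reserve two emergency spill slots so the scavenger can hand out two GPRs
    // at once (e.g. one for addressing and one for the value being transferred).
    const TargetRegisterClass *RC = &PPC::GPRCRegClass;   // illustrative class
    RS->addScavengingFrameIndex(
        MFI->CreateStackObject(RC->getSize(), RC->getAlignment(), false));
    RS->addScavengingFrameIndex(
        MFI->CreateStackObject(RC->getSize(), RC->getAlignment(), false));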
|
#
aa03c03a |
| 21-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
Correct PPC FRAMEADDR lowering using a pseudo-register
The old code used to lower FRAMEADDR tried to replicate the logic in the real frame-lowering code that determines whether or not the frame pointer (r31) will be used. When it seemed as though the frame pointer would not be used, the stack pointer (r1) was used instead. Unfortunately, because the stack size is not yet known, this does not work. Instead, this change introduces new always-reserved pseudo-registers (FP and FP8) that are replaced during prologue insertion with the real frame-pointer register (either r1 or r31).
It is important that this intrinsic always return a valid frame address because it is used by Clang to store the frame address as part of code generation for __builtin_setjmp.
llvm-svn: 177653
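A hedged sketch of the lowering described above, assuming the usual SelectionDAG helpers; FP and FP8 are the always-reserved pseudo-registers this commit introduces, rewritten to r1 or r31 during prologue insertion:

    // Lower @llvm.frameaddress via the frame-pointer pseudo-register, so the
    // result is correct even though the final stack size is not yet known.
    EVT PtrVT = isPPC64 ? MVT::i64 : MVT::i32;
    unsigned FrameReg = isPPC64 ? PPC::FP8 : PPC::FP;   // resolved later
    SDValue FrameAddr =
        DAG.getCopyFromReg(DAG.getEntryNode(), dl, FrameReg, PtrVT);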
|
#
fcc51d4f |
| 17-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
Improve PPC VR (Altivec) register spilling
This change cleans up two issues with Altivec register spilling:
1. The spilling code was inefficient (using two instructions, an add and a load, when just one would do)
2. The code assumed that r0 would always be available (true for now, but this will change)
The new code handles VR spilling just like GPR spills but forced into r+r mode. As a result, when any VR spills are present, we must now always allocate the register-scavenger spill slot.
llvm-svn: 177231
|
#
bb420f10 |
| 15-Mar-2013 |
Hal Finkel <hfinkel@anl.gov> |
Allocate the RS spill slot for any PPC function with spills and a large stack frame
For spills into a large stack frame, the FI-elimination code uses the register scavenger to obtain a free GPR for use with an r+r-addressed load or store. When there are no available GPRs, the scavenger gets one by using its spill slot. Previously, we were not always allocating that spill slot and the RS would assert when the spill slot was needed.
I don't currently have a small test that triggered the assert, but I've created a small regression test that verifies that the spill slot is now added when the stack frame is sufficiently large.
llvm-svn: 177140
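An illustrative distillation of the condition (the exact predicate in the backend differs, and the helper name is hypothetical): frame-index elimination needs a scavenged GPR precisely when the function spills registers and its frame is too large for 16-bit D-form displacements, so the emergency slot must exist in that case:

    // Sketch: allocate the scavenger's emergency spill slot whenever the
    // function both spills registers and has a stack frame too large for
    // 16-bit offsets, so r+r addressing (and a scratch GPR) may be needed.
    bool needsScavengerSlot(bool HasSpills, int64_t StackSize) {
      return HasSpills && !isInt<16>(StackSize);
    }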
|