#
b4733ca8 |
| 15-Jun-2017 |
Lei Huang <lei@ca.ibm.com> |
[MachineLICM] Hoist TOC-based address instructions
Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved.
On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables.
A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant.
Differential Revision: https://reviews.llvm.org/D33562
llvm-svn: 305490
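The hoisting condition described above can be sketched with a toy model. This is only an illustration, not the MachineLICM code: `ToyInstr`, `ToyTarget`, `CallerPreservedRegs` and `canHoist` are hypothetical names, and register 2 merely stands in for a TOC-like reserved register.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model only: none of these types are LLVM's.
struct ToyInstr {
  std::vector<unsigned> UsedPhysRegs; // physical registers the instruction reads
};

struct ToyTarget {
  std::set<unsigned> ConstantRegs;        // registers whose value never changes
  std::set<unsigned> CallerPreservedRegs; // reserved registers that are restored
                                          // around every call (assumption)

  bool isInvariantUse(unsigned Reg) const {
    return ConstantRegs.count(Reg) != 0 || CallerPreservedRegs.count(Reg) != 0;
  }
};

// An instruction is a hoisting candidate only if every physical register it
// reads is known to hold the same value on every loop iteration.
bool canHoist(const ToyInstr &MI, const ToyTarget &T) {
  for (unsigned Reg : MI.UsedPhysRegs) {
    assert(Reg != 0 && "0 means 'no register' in this toy model");
    if (!T.isInvariantUse(Reg))
      return false;
  }
  return true;
}
```

With register 2 listed in `CallerPreservedRegs`, an instruction that reads only register 2 is reported as hoistable, while one that also reads an ordinary register is not.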
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2 |
|
#
aeb8e339 |
| 25-Jan-2017 |
Matthias Braun <matze@braunis.de> |
PowerPC: Slight cleanup of getReservedRegs(); NFC
Change getReservedRegs() to not mark a register as reserved and then revert that decision in some cases. Motivated by the discussion in https://reviews.llvm.org/D29056
llvm-svn: 293073
|
#
1d77599b |
| 24-Jan-2017 |
Matthias Braun <matze@braunis.de> |
PowerPC: Mark super regs of reserved regs reserved.
When a register like R1 is reserved, X1 should be reserved as well. This was already done "manually" when 64-bit code was enabled; however, using the markSuperRegs() function on the base register is more convenient and makes it possible to use the checkAllSuperRegsMarked() function even in 32-bit mode to avoid accidental breakage in the future.
This is also necessary to allow https://reviews.llvm.org/D28881
Differential Revision: https://reviews.llvm.org/D29056
llvm-svn: 292870
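A minimal sketch of the idea, assuming a toy register file: reserving a register also reserves its super-registers, and a checker verifies that nothing was missed. The names `toyMarkSuperRegs` and `toyCheckAllSuperRegsMarked`, the register numbers and the `SuperRegs` table are hypothetical and are not the LLVM implementation.

```cpp
#include <bitset>
#include <cassert>
#include <map>
#include <vector>

constexpr unsigned NumToyRegs = 8;
using RegSet = std::bitset<NumToyRegs>;

// Hypothetical numbering: ToyR1 is a 32-bit register, ToyX1 its 64-bit super-register.
constexpr unsigned ToyR1 = 1, ToyX1 = 2, ToyR2 = 3, ToyX2 = 4;

// Maps each register to the registers that contain it.
const std::map<unsigned, std::vector<unsigned>> SuperRegs = {
    {ToyR1, {ToyX1}}, {ToyR2, {ToyX2}}};

// Reserve Reg together with all of its super-registers.
void toyMarkSuperRegs(RegSet &Reserved, unsigned Reg) {
  assert(Reg < NumToyRegs && "register out of range for this toy model");
  Reserved.set(Reg);
  auto It = SuperRegs.find(Reg);
  if (It == SuperRegs.end())
    return;
  for (unsigned Super : It->second)
    Reserved.set(Super);
}

// Returns true if every reserved register also has all super-registers reserved.
bool toyCheckAllSuperRegsMarked(const RegSet &Reserved) {
  for (const auto &Entry : SuperRegs) {
    if (!Reserved.test(Entry.first))
      continue;
    for (unsigned Super : Entry.second)
      if (!Reserved.test(Super))
        return false;
  }
  return true;
}
```

Setting only `ToyR1` by hand fails the check, which is the kind of accidental breakage the checker is meant to catch; going through `toyMarkSuperRegs` passes it.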
|
Revision tags: llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
#
6354d235 |
| 04-Oct-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set
This patch corresponds to review:
The newly added VSX D-Form (register + offset) memory ops target the upper half of the VSX register set. The existing ones target the lower half. In order to unify these and have the ability to target all the VSX registers using D-Form operations, this patch defines pseudo-ops for the loads/stores which are expanded post-RA. The expansion then chooses the correct opcode based on the register that was allocated for the operation.
llvm-svn: 283212
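The post-RA expansion step can be illustrated roughly as follows. This is a sketch under stated assumptions: the opcode names are placeholders rather than the real instruction names, and the model assumes 64 VSX registers numbered 0-63 with 0-31 as the lower half.

```cpp
#include <cassert>

// Placeholder opcodes; the real instruction names are not taken from the commit.
enum class ToyOpcode { PseudoLoad, LoadLowerHalf, LoadUpperHalf };

struct ToyLoad {
  ToyOpcode Opc;
  unsigned VSXReg; // register chosen by the register allocator
};

constexpr unsigned NumToyVSXRegs = 64; // assumption: 0-31 lower half, 32-63 upper half

// After register allocation, replace the pseudo with the opcode that can
// encode the register that was actually assigned.
void expandPseudoLoad(ToyLoad &MI) {
  assert(MI.Opc == ToyOpcode::PseudoLoad && "only pseudos are expanded");
  assert(MI.VSXReg < NumToyVSXRegs && "not a VSX register in this model");
  MI.Opc = MI.VSXReg < 32 ? ToyOpcode::LoadLowerHalf : ToyOpcode::LoadUpperHalf;
}
```

The point of the design is that the allocator sees one pseudo that may use any VSX register, and the choice between the two real encodings is deferred until the register is known.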
|
#
11049f8f |
| 04-Oct-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[Power9] Part-word VSX integer scalar loads/stores and sign extend instructions
This patch corresponds to review: https://reviews.llvm.org/D23155
This patch removes the VSHRC register class (based on D20310) and adds exploitation of the Power9 sub-word integer loads into VSX registers as well as vector sign extensions. The new instructions are useful for a few purposes:
- Int to Fp conversions of 1 or 2-byte values loaded from memory
- Building vectors of 1 or 2-byte integers with values loaded from memory
- Storing individual 1 or 2-byte elements from integer vectors
This patch implements all of those uses.
llvm-svn: 283190
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
#
941a705b |
| 28-Jul-2016 |
Matthias Braun <matze@braunis.de> |
MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference instead of a pointer.
llvm-svn: 277017
|
#
b1556c42 |
| 28-Jun-2016 |
Rafael Espindola <rafael.espindola@gmail.com> |
Use isPositionIndependent in a few more places.
I think this converts all the simple cases that really just care about the generated code being position independent or not. The remaining uses are a bit more complicated and are checking things like "is this a library or executable" or "can this symbol be preempted".
llvm-svn: 274055
|
#
8bba5600 |
| 27-Jun-2016 |
Rafael Espindola <rafael.espindola@gmail.com> |
Refactor duplicated condition.
llvm-svn: 273900
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
#
98c18947 |
| 08-Apr-2016 |
Chuang-Yu Cheng <cycheng@multicorewareinc.com> |
CXX_FAST_TLS calling convention: performance improvement for PPC64
This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message.
The access function has a short entry and a short exit; the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit.
We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList.
Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by the register allocator, which will try to spill and reload them in the initialization block.
We add CSRsViaCopy, which will be explicitly handled during lowering.
1> We first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization.
2> We call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at the beginning of the entry block and copies from virtual registers to CSRsViaCopy at the beginning of the exit blocks.
3> We also need to make sure the explicit copies will not be eliminated.
Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng
http://reviews.llvm.org/D17533
llvm-svn: 265781
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1 |
|
#
d7dbb66e |
| 01-Dec-2015 |
Yury Gribov <y.gribov@samsung.com> |
Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics.
The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from the native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intended for use in combination with @llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines.
Patch by Max Ostapenko.
Differential Revision: http://reviews.llvm.org/D14983
llvm-svn: 254404
|
Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2 |
|
#
b64f0a5a |
| 17-Nov-2015 |
Jay Foad <jay.foad@gmail.com> |
Fix typos in comments.
llvm-svn: 253324
|
Revision tags: llvmorg-3.7.1-rc1 |
|
#
10c80e79 |
| 22-Sep-2015 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Prune trailing whitespaces.
llvm-svn: 248265
|
#
0a7d0ad9 |
| 22-Sep-2015 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Untabify.
llvm-svn: 248264
|
#
a9cb538a |
| 22-Sep-2015 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Reformat blank lines.
llvm-svn: 248263
|
#
84965031 |
| 22-Sep-2015 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Reformat comment lines.
llvm-svn: 248262
|
#
70ad98ac |
| 22-Sep-2015 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Reformat.
llvm-svn: 248261
|
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2 |
|
#
e4d22d59 |
| 20-Jul-2015 |
JF Bastien <jfb@google.com> |
Targets: commonize some stack realignment code
This patch does the following:
* Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`.
* Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`; by default it only looks for the `no-realign-stack` function attribute.
Multiple targets duplicated the same `needsStackRealignment` code:
- Aarch64.
- ARM.
- Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has.
- PowerPC.
- WebAssembly.
- x86 almost: has an extra `-force-align-stack` option, which the default implementation now has.
The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects:
- AMDGPU
- BPF
- CppBackend
- MSP430
- NVPTX
- Sparc
- SystemZ
- XCore
- Out-of-tree targets
This is a breaking change! `make check` passes.
The only implementation of the `virtual` function (besides the slight difference in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation.
`needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone.
Reviewers: sunfish
Subscribers: aemerson, llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D11160
llvm-svn: 242727
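The split between a shared, non-virtual `needsStackRealignment` and a `virtual` `canRealignStack` hook can be sketched as below. This is a toy model, not the `TargetRegisterInfo` code: `ToyFunctionInfo`, `ToyRegisterInfo`, `ToyStrictTarget` and the exact condition are assumptions made for illustration.

```cpp
#include <cassert>

// Toy stand-in for the per-function facts the check consults.
struct ToyFunctionInfo {
  unsigned MaxObjectAlignment; // largest alignment any stack object needs
  bool ForceRealign;           // models a force-align option or attribute
  bool NoRealignStack;         // models the `no-realign-stack` attribute
};

class ToyRegisterInfo {
public:
  explicit ToyRegisterInfo(unsigned Align) : StackAlignment(Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 &&
           "alignment must be a power of two");
  }
  virtual ~ToyRegisterInfo() = default;

  // Target hook: the default only honours the opt-out attribute.
  virtual bool canRealignStack(const ToyFunctionInfo &F) const {
    return !F.NoRealignStack;
  }

  // Shared, non-virtual policy: realign only if it is wanted and possible.
  bool needsStackRealignment(const ToyFunctionInfo &F) const {
    bool Wanted = F.MaxObjectAlignment > StackAlignment || F.ForceRealign;
    return Wanted && canRealignStack(F);
  }

private:
  unsigned StackAlignment;
};

// A hypothetical target that can never realign its stack.
class ToyStrictTarget : public ToyRegisterInfo {
public:
  using ToyRegisterInfo::ToyRegisterInfo;
  bool canRealignStack(const ToyFunctionInfo &) const override { return false; }
};
```

Keeping the policy non-virtual means a target can only customize the "is it possible" part, which is also why an out-of-tree target that still declares its own `needsStackRealignment` with `override` stops compiling.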
|
Revision tags: llvmorg-3.7.0-rc1 |
|
#
b73a2ed2 |
| 10-Jul-2015 |
JF Bastien <jfb@google.com> |
Target RegisterInfo: devirtualize TargetFrameLowering
Summary: The target frame lowering's concrete type is always known in RegisterInfo, yet it's only sometimes devirtualized through a static_cast. This change adds an auto-generated static function <Target>GenRegisterInfo::getFrameLowering(const MachineFunction &MF) which does this devirtualization, and uses this function in all targets which can.
This change was suggested by sunfish in D11070 for WebAssembly; I figure that I may as well improve the other targets while I'm here.
Subscribers: sunfish, ted, llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D11093
llvm-svn: 241921
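As an illustration of the devirtualization described above, here is a small sketch in which a helper returns the frame lowering object with its concrete type. All names (`ToyFrameLowering`, `ToyTargetFrameLowering`, `ToySubtarget`, `redZoneSize`) and the numeric values are hypothetical; only the use of `static_cast` mirrors the commit message.

```cpp
#include <cassert>

struct ToyFrameLowering {
  virtual ~ToyFrameLowering() = default;
  virtual unsigned stackAlignment() const = 0;
};

// The concrete type a given target always uses; `final` lets the compiler
// resolve calls made through this type without a virtual dispatch.
struct ToyTargetFrameLowering final : ToyFrameLowering {
  unsigned stackAlignment() const override { return 16; } // arbitrary value
  unsigned redZoneSize() const { return 64; }             // target-only API, arbitrary value
};

struct ToySubtarget {
  const ToyFrameLowering *FrameLowering; // stored through the base type
};

// Hypothetical stand-in for the generated helper: the target knows the
// concrete type, so the downcast is done once, in one place.
inline const ToyTargetFrameLowering *getFrameLowering(const ToySubtarget &ST) {
  assert(ST.FrameLowering && "subtarget must provide a frame lowering object");
  return static_cast<const ToyTargetFrameLowering *>(ST.FrameLowering);
}
```

Callers then write `getFrameLowering(ST)->redZoneSize()` instead of repeating the cast at every use site.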
|
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1, llvmorg-3.6.1, llvmorg-3.6.1-rc1 |
|
#
f3c94b1e |
| 07-May-2015 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
Add VSX Scalar loads and stores to the PPC back end
This patch corresponds to review: http://reviews.llvm.org/D9440
It adds a new register class to the PPC back end to contain single precision values in VSX registers. Additionally, it adds scalar loads and stores for VSX registers.
llvm-svn: 236755
|
#
535e69de |
| 25-Mar-2015 |
Kit Barton <kbarton@ca.ibm.com> |
Add Hardware Transactional Memory (HTM) Support
This patch adds Hardware Transactional Memory (HTM) support as provided by ISA 2.07 (POWER8). The intrinsic support is based on the GCC one [1], but currently only the 'PowerPC HTM Low Level Built-in Functions' are implemented.
The HTM instructions follow the RC ones, and the transaction initiation result is set in CR0 (with the exception of tcheck). The current approach is to create a register copy from CR0 to a GPR and compare it. Although this is suboptimal, since the branch could be taken directly by comparing the CR0 value, it generates correct code both for test-and-branch and for just returning the value. A possible future optimization would be to eliminate the MFCR instruction and branch directly.
HTM usage requires a recent kernel with PPC HTM enabled. Tested on powerpc64 and powerpc64le.
This is sent along with a clang patch to enable the builtins and the option switch.
[1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html
Phabricator Review: http://reviews.llvm.org/D8247
llvm-svn: 233204
|
#
1f26a476 |
| 20-Mar-2015 |
John Brawn <john.brawn@arm.com> |
[ARM] Fix handling of thumb1 out-of-range frame offsets
LocalStackSlotPass assumes that isFrameOffsetLegal doesn't change its answer when the base register changes. Unfortunately this isn't true in thumb1, where SP-based loads allow a larger offset than non-SP-based loads, and this causes the base register reuse code to generate instructions that are unencodable, causing an assertion failure.
Solve this by adding a BaseReg parameter to isFrameOffsetLegal, which ARMBaseRegisterInfo can then make use of to give the correct answer.
Differential Revision: http://reviews.llvm.org/D8419
llvm-svn: 232825
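Why the answer depends on the base register can be seen in a small sketch. It assumes word-sized loads in a Thumb1-like encoding where SP-based accesses take an 8-bit immediate scaled by 4 and other bases a 5-bit immediate scaled by 4; the function name, the register numbering and these ranges are assumptions for illustration, not the ARMBaseRegisterInfo code.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned ToySP = 13; // hypothetical number for the stack pointer

// Assumed ranges for word loads: 0..1020 from SP, 0..124 from any other
// base register, both in multiples of 4.
bool toyIsFrameOffsetLegal(unsigned BaseReg, int64_t Offset) {
  assert(BaseReg < 16 && "toy model has 16 registers");
  if (Offset < 0 || Offset % 4 != 0)
    return false;
  const int64_t MaxOffset = (BaseReg == ToySP) ? 255 * 4 : 31 * 4;
  return Offset <= MaxOffset;
}
```

An offset of 512 is accepted for the SP base but rejected for another base register, so a pass that checks legality once and then switches base registers can end up with an unencodable instruction.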
|
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1 |
|
#
ea178cf4 |
| 12-Mar-2015 |
Eric Christopher <echristo@gmail.com> |
Remove the need to cache the subtarget in the PowerPC TargetRegisterInfo classes. Replace it with a cache to the TargetMachine and use that where applicable at the moment.
llvm-svn: 232002
|
#
9deb75d1 |
| 11-Mar-2015 |
Eric Christopher <echristo@gmail.com> |
Have getCallPreservedMask and getThisCallPreservedMask take a MachineFunction argument so that we can grab subtarget specific features off of it.
llvm-svn: 231979
|
#
433c432b |
| 10-Mar-2015 |
Eric Christopher <echristo@gmail.com> |
Have TargetRegisterInfo::getLargestLegalSuperClass take a MachineFunction argument so that it can look up the subtarget rather than using a cached one in some Targets.
llvm-svn: 231888
|
#
867bfc53 |
| 07-Mar-2015 |
Benjamin Kramer <benny.kra@googlemail.com> |
Make constant arrays that are passed to functions as const.
In theory this allows the compiler to skip materializing the array on the stack. In practice clang often fails to do that, but that's a different story. NFC.
llvm-svn: 231571
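A small sketch of the pattern, assuming a table that is only read by the callee. The function names and the values in the table are made up for illustration, and whether the compiler actually avoids the stack copy depends on the compiler, as the commit message notes.

```cpp
#include <cassert>
#include <cstddef>

// The callee only reads the table, so it takes a pointer to const.
unsigned sumValues(const unsigned *Values, std::size_t Count) {
  assert((Values != nullptr || Count == 0) && "non-empty range needs a pointer");
  unsigned Total = 0;
  for (std::size_t I = 0; I != Count; ++I)
    Total += Values[I];
  return Total;
}

unsigned sumOfToyTable() {
  // Declared const (and static here), so the data can live in read-only
  // storage instead of being rebuilt on the stack on each call.
  static const unsigned ToyTable[] = {14, 15, 16, 17}; // arbitrary values
  return sumValues(ToyTable, sizeof(ToyTable) / sizeof(ToyTable[0]));
}
```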
|