|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
| #
db158c7c |
| 28-Jul-2023 |
Harvin Iriawan <harvin.iriawan@arm.com> |
[AArch64] Update generic sched model to A510
Refresh of the generic scheduling model to use A510 instead of A55. Main benefits are to the little core, and introducing SVE scheduling information.
[AArch64] Update generic sched model to A510
Refresh of the generic scheduling model to use A510 instead of A55. Main benefits are to the little core, and introducing SVE scheduling information. Changes tested on various OoO cores, no performance degradation is seen.
Differential Revision: https://reviews.llvm.org/D156799
show more ...
|
|
Revision tags: llvmorg-18-init |
|
| #
6e54fcce |
| 01-Jul-2023 |
Igor Kudrin <ikudrin@accesssoftek.com> |
[AArch64] Emit fewer CFI instructions for synchronous unwind tables
The instruction-precise, or asynchronous, unwind tables usually take up much more space than the synchronous ones. If a user is co
[AArch64] Emit fewer CFI instructions for synchronous unwind tables
The instruction-precise, or asynchronous, unwind tables usually take up much more space than the synchronous ones. If a user is concerned about the load size of the program and does not need the features provided with the asynchronous tables, the compiler should be able to generate the more compact variant.
This patch changes the generation of CFI instructions for these cases so that they all come in one chunk in the prolog; it emits only one `.cfi_def_cfa*` instruction followed by `.cfi_offset` ones after all stack adjustments and register spills, and avoids generating CFI instructions in the epilog(s) as well as any other exceeding CFI instructions like `.cfi_remember_state` and `.cfi_restore_state`. Effectively, it reverses the effects of D111411 and D114545 on functions with the `uwtable(sync)` attribute. As a side effect, it also restores the behavior on functions that have neither `uwtable` nor `nounwind` attributes.
Differential Revision: https://reviews.llvm.org/D153098
show more ...
|
| #
8de9c1ab |
| 28-Jun-2023 |
Nikita Popov <npopov@redhat.com> |
[AArch64] Make tests more robust (NFC)
|
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
5ddce70e |
| 19-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[AArch64] Convert some tests to opaque pointers (NFC)
|
|
Revision tags: llvmorg-15.0.6 |
|
| #
6a353c77 |
| 27-Nov-2022 |
David Green <david.green@arm.com> |
[AArch64] Add GPR rr instructions to isAssociativeAndCommutative
This adds some more scalar instructions that are both associative and commutative to isAssociativeAndCommutative, allowing the machin
[AArch64] Add GPR rr instructions to isAssociativeAndCommutative
This adds some more scalar instructions that are both associative and commutative to isAssociativeAndCommutative, allowing the machine combiner to reassociate them to reduce critical path length.
Differential Revision: https://reviews.llvm.org/D134260
show more ...
|
| #
8f104b80 |
| 22-Nov-2022 |
Vitaly Buka <vitalybuka@google.com> |
Revert "[AArch64] Add GPR rr instructions to isAssociativeAndCommutative"
Breaks msan on aarch64.
This reverts commit 5f7f484ee54ebbf702ee4c5fe9852502dc237121.
|
| #
5f7f484e |
| 16-Nov-2022 |
David Green <david.green@arm.com> |
[AArch64] Add GPR rr instructions to isAssociativeAndCommutative
This adds some more scalar instructions that are both associative and commutative to isAssociativeAndCommutative, allowing the machin
[AArch64] Add GPR rr instructions to isAssociativeAndCommutative
This adds some more scalar instructions that are both associative and commutative to isAssociativeAndCommutative, allowing the machine combiner to reassociate them to reduce critical path length.
Differential Revision: https://reviews.llvm.org/D134260
show more ...
|
|
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
| #
7a817825 |
| 09-Sep-2022 |
zhongyunde <zhongyunde@huawei.com> |
[AArch64][CodeGen]Fold the mov and lsl into ubfiz
Fix the issue exposed by D132322, depand on D132939 Reviewed By: efriedma, paulwalker-arm Differential Revision: https://reviews.llvm.org/D132325
|
|
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
50a97aac |
| 24-Mar-2022 |
Momchil Velikov <momchil.velikov@arm.com> |
[AArch64] Async unwind - function prologues
Re-commit of 32e8b550e5439c7e4aafa73894faffd5f25d0d05
This patch rearranges emission of CFI instructions, so the resulting DWARF and `.eh_frame` informat
[AArch64] Async unwind - function prologues
Re-commit of 32e8b550e5439c7e4aafa73894faffd5f25d0d05
This patch rearranges emission of CFI instructions, so the resulting DWARF and `.eh_frame` information is precise at every instruction.
The current state is that the unwind info is emitted only after the function prologue. This is fine for synchronous (e.g. C++) exceptions, but the information is generally incorrect when the program counter is at an instruction in the prologue or the epilogue, for example:
``` stp x29, x30, [sp, #-16]! // 16-byte Folded Spill mov x29, sp .cfi_def_cfa w29, 16 ... ```
after the `stp` is executed the (initial) rule for the CFA still says the CFA is in the `sp`, even though it's already offset by 16 bytes
A correct unwind info could look like: ``` stp x29, x30, [sp, #-16]! // 16-byte Folded Spill .cfi_def_cfa_offset 16 mov x29, sp .cfi_def_cfa w29, 16 ... ```
Having this information precise up to an instruction is useful for sampling profilers that would like to get a stack backtrace. The end goal (towards this patch is just a step) is to have fully working `-fasynchronous-unwind-tables`.
Reviewed By: danielkiss, MaskRay
Differential Revision: https://reviews.llvm.org/D111411
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
85c53c70 |
| 04-Mar-2022 |
Hans Wennborg <hans@chromium.org> |
Revert "[AArch64] Async unwind - function prologues"
It caused builds to assert with:
(StackSize == 0 && "We already have the CFA offset!"), function generateCompactUnwindEncoding, file AArch64
Revert "[AArch64] Async unwind - function prologues"
It caused builds to assert with:
(StackSize == 0 && "We already have the CFA offset!"), function generateCompactUnwindEncoding, file AArch64AsmBackend.cpp, line 624.
when targeting iOS. See comment on the code review for reproducer.
> This patch rearranges emission of CFI instructions, so the resulting > DWARF and `.eh_frame` information is precise at every instruction. > > The current state is that the unwind info is emitted only after the > function prologue. This is fine for synchronous (e.g. C++) exceptions, > but the information is generally incorrect when the program counter is > at an instruction in the prologue or the epilogue, for example: > > ``` > stp x29, x30, [sp, #-16]! // 16-byte Folded Spill > mov x29, sp > .cfi_def_cfa w29, 16 > ... > ``` > > after the `stp` is executed the (initial) rule for the CFA still says > the CFA is in the `sp`, even though it's already offset by 16 bytes > > A correct unwind info could look like: > ``` > stp x29, x30, [sp, #-16]! // 16-byte Folded Spill > .cfi_def_cfa_offset 16 > mov x29, sp > .cfi_def_cfa w29, 16 > ... > ``` > > Having this information precise up to an instruction is useful for > sampling profilers that would like to get a stack backtrace. The end > goal (towards this patch is just a step) is to have fully working > `-fasynchronous-unwind-tables`. > > Reviewed By: danielkiss, MaskRay > > Differential Revision: https://reviews.llvm.org/D111411
This reverts commit 32e8b550e5439c7e4aafa73894faffd5f25d0d05.
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
32e8b550 |
| 28-Feb-2022 |
Momchil Velikov <momchil.velikov@arm.com> |
[AArch64] Async unwind - function prologues
This patch rearranges emission of CFI instructions, so the resulting DWARF and `.eh_frame` information is precise at every instruction.
The current state
[AArch64] Async unwind - function prologues
This patch rearranges emission of CFI instructions, so the resulting DWARF and `.eh_frame` information is precise at every instruction.
The current state is that the unwind info is emitted only after the function prologue. This is fine for synchronous (e.g. C++) exceptions, but the information is generally incorrect when the program counter is at an instruction in the prologue or the epilogue, for example:
``` stp x29, x30, [sp, #-16]! // 16-byte Folded Spill mov x29, sp .cfi_def_cfa w29, 16 ... ```
after the `stp` is executed the (initial) rule for the CFA still says the CFA is in the `sp`, even though it's already offset by 16 bytes
A correct unwind info could look like: ``` stp x29, x30, [sp, #-16]! // 16-byte Folded Spill .cfi_def_cfa_offset 16 mov x29, sp .cfi_def_cfa w29, 16 ... ```
Having this information precise up to an instruction is useful for sampling profilers that would like to get a stack backtrace. The end goal (towards this patch is just a step) is to have fully working `-fasynchronous-unwind-tables`.
Reviewed By: danielkiss, MaskRay
Differential Revision: https://reviews.llvm.org/D111411
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
0d8cd4e2 |
| 03-Aug-2021 |
Jason Molenda <jason@molenda.com> |
[AArch64InstPrinter] Change printAddSubImm to comment imm value when shifted
Add a comment when there is a shifted value, add x9, x0, #291, lsl #12 ; =1191936 but not when the immediate value is
[AArch64InstPrinter] Change printAddSubImm to comment imm value when shifted
Add a comment when there is a shifted value, add x9, x0, #291, lsl #12 ; =1191936 but not when the immediate value is unshifted, subs x9, x0, #256 ; =256 when the comment adds nothing additional to the reader.
Differential Revision: https://reviews.llvm.org/D107196
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
4ab3041a |
| 24-May-2021 |
serge-sans-paille <sguelton@redhat.com> |
Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee04c75b1f1332b4fd1ac4e8ef6c3c247.
See https://lab.llvm.org/buildbot/#/builders/109/builds
Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee04c75b1f1332b4fd1ac4e8ef6c3c247.
See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
show more ...
|
| #
bda6e5be |
| 23-May-2021 |
serge-sans-paille <sguelton@redhat.com> |
[NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attributes is quivalent to setting attribute to false.
This is a preliminar
[NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attributes is quivalent to setting attribute to false.
This is a preliminary commit for https://reviews.llvm.org/D99080
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3 |
|
| #
f28e1128 |
| 16-Aug-2019 |
Sander de Smalen <sander.desmalen@arm.com> |
Relanding r368987 [AArch64] Change location of frame-record within callee-save area.
Changes: There was a condition for `!NeedsFrameRecord` missing in the assert. The assert in question has changed
Relanding r368987 [AArch64] Change location of frame-record within callee-save area.
Changes: There was a condition for `!NeedsFrameRecord` missing in the assert. The assert in question has changed to:
+ assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg2 != AArch64::FP || + RPI.Reg1 == AArch64::LR) && + "FrameRecord must be allocated together with LR");
This addresses PR43016.
llvm-svn: 369122
show more ...
|
| #
ee96499a |
| 16-Aug-2019 |
Nico Weber <nicolasweber@gmx.de> |
Revert r368987, it caused PR43016.
llvm-svn: 369080
|
| #
643adb55 |
| 15-Aug-2019 |
Sander de Smalen <sander.desmalen@arm.com> |
[AArch64] Change location of frame-record within callee-save area.
This patch changes the location of the frame-record (FP, LR) to the bottom of the callee-saved area. According to the AAPCS the lo
[AArch64] Change location of frame-record within callee-save area.
This patch changes the location of the frame-record (FP, LR) to the bottom of the callee-saved area. According to the AAPCS the location of the frame-record within the stackframe is unspecified (section 5.2.3 The Frame Pointer), so the compiler should be free to choose a different location.
The reason for changing the location of the frame-record is to prepare the frame for allocating an SVE area below the callee-saves. This way the compiler can use the VL-scaled addressing modes to directly access SVE objects from the frame-pointer.
: : | stack | | stack | | args | | args | +-------+ +-------+ | x30 | | x19 | | x29 | | x20 | FP -> |- - - -| | x21 | | x19 | ==> | x22 | | x20 | |- - - -| | x21 | | x30 | | x22 | | x29 | +-------+ +-------+ <- FP |///////| |///////| // realignment gap |- - - -| |- - - -| |spills/| |spills/| | locals| | locals| SP -> +-------+ +-------+ <- SP
Things to point out: - The algorithm to find a paired register should be prevented from accidentally pairing some callee-saved register with LR that is not FP, since they should always be paired together when the frame has a frame-record. - For Darwin platforms the location of the frame-record is unchanged, since the unwind encoding does not allow for encoding this position dynamically and other tools currently depend on the former layout.
Reviewers: efriedma, rovka, rengolin, thegameg, greened, t.p.northover
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D65653
llvm-svn: 368987
show more ...
|
|
Revision tags: llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
b7cef81f |
| 14-Jan-2019 |
Francis Visoiu Mistrih <francisvm@yahoo.com> |
Replace "no-frame-pointer-*" function attributes with "frame-pointer"
Part of the effort to refactoring frame pointer code generation. We used to use two function attributes "no-frame-pointer-elim"
Replace "no-frame-pointer-*" function attributes with "frame-pointer"
Part of the effort to refactoring frame pointer code generation. We used to use two function attributes "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" to represent three kinds of frame pointer usage: (all) frames use frame pointer, (non-leaf) frames use frame pointer, (none) frame use frame pointer. This CL makes the idea explicit by using only one enum function attribute "frame-pointer"
Option "-frame-pointer=" replaces "-disable-fp-elim" for tools such as llc.
"no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" are still supported for easy migration to "frame-pointer".
tests are mostly updated with
// replace command line args ‘-disable-fp-elim=false’ with ‘-frame-pointer=none’ grep -iIrnl '\-disable-fp-elim=false' * | xargs sed -i '' -e "s/-disable-fp-elim=false/-frame-pointer=none/g"
// replace command line args ‘-disable-fp-elim’ with ‘-frame-pointer=all’ grep -iIrnl '\-disable-fp-elim' * | xargs sed -i '' -e "s/-disable-fp-elim/-frame-pointer=all/g"
Patch by Yuanfang Chen (tabloid.adroit)!
Differential Revision: https://reviews.llvm.org/D56351
llvm-svn: 351049
show more ...
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
| #
66f6b65f |
| 02-Jun-2016 |
Geoff Berry <gberry@codeaurora.org> |
[PEI, AArch64] Use empty spaces in stack area for local stack slot allocation.
Summary: If the target requests it, use emptry spaces in the fixed and callee-save stack area to allocate local stack o
[PEI, AArch64] Use empty spaces in stack area for local stack slot allocation.
Summary: If the target requests it, use emptry spaces in the fixed and callee-save stack area to allocate local stack objects.
AArch64: Change last callee-save reg stack object alignment instead of size to leave a gap to take advantage of above change.
Reviewers: t.p.northover, qcolombet, MatzeB
Subscribers: rengolin, mcrosier, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D20220
llvm-svn: 271527
show more ...
|
| #
a5335647 |
| 06-May-2016 |
Geoff Berry <gberry@codeaurora.org> |
[AArch64] Combine callee-save and local stack SP adjustment instructions.
Summary: If a function needs to allocate both callee-save stack memory and local stack memory, we currently decrement/increm
[AArch64] Combine callee-save and local stack SP adjustment instructions.
Summary: If a function needs to allocate both callee-save stack memory and local stack memory, we currently decrement/increment the SP in two steps: first for the callee-save area, and then for the local stack area. This changes the code to allocate them both at once at the very beginning/end of the function. This has two benefits:
1) there is one fewer sub/add micro-op in the prologue/epilogue
2) the stack adjustment instructions act as a scheduling barrier, so moving them to the very beginning/end of the function increases post-RA scheduler's ability to move instructions (that only depend on argument registers) before any of the callee-save stores
This change can cause an increase in instructions if the original local stack SP decrement could be folded into the first store to the stack. This occurs when the first local stack store is to stack offset 0. In this case we are trading off one more sub instruction for one fewer sub micro-op (along with benefits (2) and (3) above).
Reviewers: t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18619
llvm-svn: 268746
show more ...
|
|
Revision tags: llvmorg-3.8.0 |
|
| #
62c1a1e7 |
| 02-Mar-2016 |
Geoff Berry <gberry@codeaurora.org> |
[AArch64] Enable non-leaf frame pointer elimination.
Summary: This change enables frame pointer elimination in non-leaf functions. The -fomit-frame-pointer option still needs to be used when compili
[AArch64] Enable non-leaf frame pointer elimination.
Summary: This change enables frame pointer elimination in non-leaf functions. The -fomit-frame-pointer option still needs to be used when compiling via clang (or an equivalent method of not setting the 'no-frame-pointer-elim*' function attributes if generating llvm IR via some other method) to take advantage of this optimization.
This change should be NFC when compiling via clang without -fomit-frame-pointer.
Reviewers: t.p.northover
Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines
Differential Revision: http://reviews.llvm.org/D17730
llvm-svn: 262495
show more ...
|
|
Revision tags: llvmorg-3.8.0-rc3 |
|
| #
c25d3bd2 |
| 12-Feb-2016 |
Geoff Berry <gberry@codeaurora.org> |
[AArch64] Reduce number of callee-save save/restores.
Summary: Before this change, callee-save registers would be rounded up to even pairs of GPRs and FPRs. This change eliminates these extra paddi
[AArch64] Reduce number of callee-save save/restores.
Summary: Before this change, callee-save registers would be rounded up to even pairs of GPRs and FPRs. This change eliminates these extra padding load/stores, though it does keep the stack allocation the same size unless both the GPR and FPR sets have an odd size, in which case one full pair stack slot (16 bytes) is saved.
This optimization cannot currently be done for MachO targets since they rely on a fast-path .debug_frame equivalent that can only encode callee-save registers as pairs.
Reviewers: t.p.northover, rengolin, mcrosier, jmolloy
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D17000
llvm-svn: 260689
show more ...
|
|
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1 |
|
| #
d016574d |
| 21-Dec-2015 |
Chad Rosier <mcrosier@codeaurora.org> |
[AArch64] Enable PostRAScheduler for AArch64 generic build.
Disable post-ra scheduler for perturbed tests to appease the bots and to preserve the history of the tests.
http://reviews.llvm.org/D1565
[AArch64] Enable PostRAScheduler for AArch64 generic build.
Disable post-ra scheduler for perturbed tests to appease the bots and to preserve the history of the tests.
http://reviews.llvm.org/D15652
llvm-svn: 256158
show more ...
|
|
Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2 |
|
| #
f6645cce |
| 18-Nov-2015 |
Quentin Colombet <qcolombet@apple.com> |
[AArch64] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14360
rdar://problem/20820748
llvm-svn: 253520
|
|
Revision tags: llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2 |
|
| #
2a9d801f |
| 29-Jul-2015 |
Tim Northover <tnorthover@apple.com> |
AArch64: use 32-bit MOV rather than UBFX to truncate registers.
It's potentially more efficient on Cyclone, and from the optimization guides & schedulers looks like it has no effect on Cortex-A53 or
AArch64: use 32-bit MOV rather than UBFX to truncate registers.
It's potentially more efficient on Cyclone, and from the optimization guides & schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd expect a MOV to be about the most efficient instruction with its semantics, even though the official "UXTW" alias is really a UBFX.
llvm-svn: 243576
show more ...
|