#
58de1e2c |
| 27-Mar-2024 |
Wesley Wiser <wwiser@gmail.com> |
Fix stack layout for frames larger than 2gb (#84114)
For very large stack frames, the offset from the stack pointer to a local can be more than 2^31 which overflows various `int` offsets in the fram
Fix stack layout for frames larger than 2gb (#84114)
For very large stack frames, the offset from the stack pointer to a local can be more than 2^31 which overflows various `int` offsets in the frame lowering code.
This patch updates the frame lowering code to calculate the offsets as 64-bit values and resolves the overflows, resulting in the correct codegen for very large frames.
Fixes #48911
show more ...
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
b2c16e7f |
| 05-Mar-2024 |
James Westwood <james.westwood@arm.com> |
Revert "[ARM] R11 not pushed adjacent to link register with PAC-M and… (#84019)
… AAPCS frame chain fix (#82801)"
This reverts commit 00e4a4197137410129d4725ffb82bae9ce44bdde. This patch
was fou
Revert "[ARM] R11 not pushed adjacent to link register with PAC-M and… (#84019)
… AAPCS frame chain fix (#82801)"
This reverts commit 00e4a4197137410129d4725ffb82bae9ce44bdde. This patch
was found to cause miscompilations and compilation failures.
show more ...
|
#
00e4a419 |
| 04-Mar-2024 |
James Westwood <james.westwood@arm.com> |
[ARM] R11 not pushed adjacent to link register with PAC-M and AAPCS frame chain fix (#82801)
When code for M class architecture was compiled with AAPCS and PAC
enabled, the frame pointer, r11, was
[ARM] R11 not pushed adjacent to link register with PAC-M and AAPCS frame chain fix (#82801)
When code for M class architecture was compiled with AAPCS and PAC
enabled, the frame pointer, r11, was not pushed to the stack adjacent to
the link register. Due to PAC being enabled, r12 was placed between r11
and lr. This patch fixes this by adding an extra case to the already
existing code that splits the GPR push in two when R11 is the frame
pointer and certain paremeters are met. The differential revision for
this previous change can be found here:
https://reviews.llvm.org/D125649. This now ensures that r11 and lr are
pushed in a separate push instruction to the other GPRs when PAC and
AAPCS are enabled, meaning the frame pointer and link register are now
pushed onto the stack adjacent to each other.
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4 |
|
#
749384c0 |
| 26-Feb-2024 |
ostannard <oliver.stannard@arm.com> |
[ARM] Update IsRestored for LR based on all returns (#82745)
PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based
on all of the return instructions in the function, not just one
[ARM] Update IsRestored for LR based on all returns (#82745)
PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based
on all of the return instructions in the function, not just one.
However, there is also code in ARMLoadStoreOptimizer which changes
return instructions, but it set IsRestored based on the one instruction
it changed, not the whole function.
The fix is to factor out the code added in #75527, and also call it from
ARMLoadStoreOptimizer if it made a change to return instructions.
Fixes #80287.
show more ...
|
Revision tags: llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
af8d0502 |
| 25-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[Target] Use range-based for loops (NFC)
|
#
b1a5ee1f |
| 20-Dec-2023 |
Florian Hahn <flo@fhahn.com> |
[ARM] Check all terms in emitPopInst when clearing Restored for LR. (#75527)
emitPopInst checks a single function exit MBB. If other paths also exit
the function and any of there terminators uses L
[ARM] Check all terms in emitPopInst when clearing Restored for LR. (#75527)
emitPopInst checks a single function exit MBB. If other paths also exit
the function and any of there terminators uses LR implicitly, it is not
save to clear the Restored bit.
Check all terminators for the function before clearing Restored.
This fixes a mis-compile in outlined-fn-may-clobber-lr-in-caller.ll
where the machine-outliner previously introduced BLs that clobbered LR
which in turn is used by the tail call return.
Alternative to #73553
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
fae3f9ec |
| 11-Aug-2023 |
John Brawn <john.brawn@arm.com> |
[ARM] Fix prologue/epilogue for pacbti-m leaf functions
R12 is callee-saved in functions with pacbti-m enabled, but this is done in assignCalleeSavedSpillSlots, meaning that in determineCalleeSaves
[ARM] Fix prologue/epilogue for pacbti-m leaf functions
R12 is callee-saved in functions with pacbti-m enabled, but this is done in assignCalleeSavedSpillSlots, meaning that in determineCalleeSaves we have to manually set CanEliminateFrame.
This fixes a bug where in leaf functions with no other callee-saved registers the aut instruction wouldn't be emitted and stack offsets of arguments passed on the stack would be incorrect.
Differential Revision: https://reviews.llvm.org/D157865
show more ...
|
#
40614e1c |
| 24-Aug-2023 |
Oliver Stannard <oliver.stannard@arm.com> |
[ARM] Save and restore CPSR around tMOVimm32
When resolving a frame index with a large offset for v6M execute-only, we emit a tMOVimm32 pseudo-instruction, which later gets lowered to a sequence of
[ARM] Save and restore CPSR around tMOVimm32
When resolving a frame index with a large offset for v6M execute-only, we emit a tMOVimm32 pseudo-instruction, which later gets lowered to a sequence of instructions, all of which are flag-setting. However, a frame index may be generated for a register spill or reload instruction, which can be inserted at a point where CPSR is live. This patch inserts MRS and MSR instructions around the tMOVimm32 to save and restore the value of CPSR, if CPSR is live at that point.
This may need up to two virtual registers (one to build the immediate value, one to save CPSR) during frame index lowering, which happens after register allocation, so we need to ensure two spill slots are avilable to the register scavenger to ensure it can free up enough registers for this.
There is no test for the emission (or not) of the MRS/MSR pair, because it requires a spill or reload to be inserted at a point where CPSR is live, which requires a large, complex function and is fragile enough that any optimisation changes will break the test. This bug was easily found by csmith with -verify-machineinstrs, which I now run regularly on v6M execute-only (and many other combinations).
Patch by John Brawn and myself.
Reviewed By: stuij
Differential Revision: https://reviews.llvm.org/D158404
show more ...
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
8336d38b |
| 25-Jul-2023 |
John Brawn <john.brawn@arm.com> |
[ARM] Correctly handle combining segmented stacks with execute-only
Using segmented stacks with execute-only mostly works, but we need to use the correct movi32 opcode in 6-M, and there's one place
[ARM] Correctly handle combining segmented stacks with execute-only
Using segmented stacks with execute-only mostly works, but we need to use the correct movi32 opcode in 6-M, and there's one place where for thumb1 (i.e. 6-M and 8-M.base) a constant pool was unconditionally used which needed to be fixed.
Differential Revision: https://reviews.llvm.org/D156339
show more ...
|
Revision tags: llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
1d0ccebc |
| 02-May-2023 |
Zhiyao Ma <zhiyao.ma.98@gmail.com> |
[ARM] Don't allocate memory if free space in segmented stack is just enough
Assuming that the stack grows downwards, it is fine if the stack pointer is exactly at the stacklet boundary. We should us
[ARM] Don't allocate memory if free space in segmented stack is just enough
Assuming that the stack grows downwards, it is fine if the stack pointer is exactly at the stacklet boundary. We should use less-or-equal condition when deciding whether to skip new memory allocation.
Differential Revision: https://reviews.llvm.org/D149315
show more ...
|
Revision tags: llvmorg-16.0.2 |
|
#
4241d890 |
| 15-Apr-2023 |
Kazu Hirata <kazu@google.com> |
[Target] Use range-based for loops (NFC)
|
Revision tags: llvmorg-16.0.1 |
|
#
b2061453 |
| 31-Mar-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
ARMFrameLowering.cpp - fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFC.
|
#
c5383536 |
| 30-Mar-2023 |
Martin Storsjö <martin@martin.st> |
[ARM] Handle generating SEH unwind info for t2STR_PRE/t2LDR_POST
This fixes compiling some uncommon cases.
Differential Revision: https://reviews.llvm.org/D147212
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
c16a58b3 |
| 08-Dec-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Attributes: Add function getter to parse integer string attributes
The most common case for string attributes parses them as integers. We don't have a convenient way to do this, and as a result we h
Attributes: Add function getter to parse integer string attributes
The most common case for string attributes parses them as integers. We don't have a convenient way to do this, and as a result we have inconsistent missing attribute and invalid attribute handling scattered around. We also have inconsistent radix usage to getAsInteger; some places use the default 0 and others use base 10.
Update a few of the uses, but there are quite a lot of these.
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
#
5e96cea1 |
| 07-Sep-2022 |
Joe Loser <joeloser@fastmail.com> |
[llvm] Use std::size instead of llvm::array_lengthof
LLVM contains a helpful function for getting the size of a C-style array: `llvm::array_lengthof`. This is useful prior to C++17, but not as helpf
[llvm] Use std::size instead of llvm::array_lengthof
LLVM contains a helpful function for getting the size of a C-style array: `llvm::array_lengthof`. This is useful prior to C++17, but not as helpful for C++17 or later: `std::size` already has support for C-style arrays.
Change call sites to use `std::size` instead.
Differential Revision: https://reviews.llvm.org/D133429
show more ...
|
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2 |
|
#
de9d80c1 |
| 08-Aug-2022 |
Fangrui Song <i@maskray.me> |
[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
|
Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
70a5c525 |
| 06-May-2022 |
Lucas Prates <lucas.prates@arm.com> |
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
show more ...
|
#
8f2ba363 |
| 15-Jun-2022 |
Krasimir Georgiev <krasimir@google.com> |
Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m"
This reverts commit 7625e01d661644a560884057755d48a
Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m"
This reverts commit 7625e01d661644a560884057755d48a0da8b77b4 and dependent cbcce82ef6b512d97e92a319a75a03e997c844e1.
Commit 7625e01d661644a560884057755d48a0da8b77b4 causes some new codegen test failures under asan, e.g., CodeGen/ARM/execute-only.ll: https://lab.llvm.org/buildbot/#/builders/5/builds/24659/steps/15/logs/stdio.
show more ...
|
#
7625e01d |
| 06-May-2022 |
Lucas Prates <lucas.prates@arm.com> |
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
show more ...
|
#
33b9ad64 |
| 13-Jun-2022 |
Lucas Prates <lucas.prates@arm.com> |
Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records"
Reverting change due to test failure.
This reverts commit 6119053dab67129eb1700dbf36db3524dd3e421f.
|
#
6119053d |
| 06-May-2022 |
Lucas Prates <lucas.prates@arm.com> |
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
show more ...
|
#
485432f3 |
| 01-Jun-2022 |
Martin Storsjö <martin@martin.st> |
[ARM] Make a narrow tMOVi8 where possible in SEH prologues
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions
[ARM] Make a narrow tMOVi8 where possible in SEH prologues
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering.
But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering.
Differential Revision: https://reviews.llvm.org/D126949
show more ...
|
#
bd52506d |
| 01-Jun-2022 |
Martin Storsjö <martin@martin.st> |
[ARM] Make narrow push/pop in SEH prologues/epilogues where applicable
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the i
[ARM] Make narrow push/pop in SEH prologues/epilogues where applicable
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering.
But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering.
Differential Revision: https://reviews.llvm.org/D126948
show more ...
|
#
40c937cb |
| 02-Jun-2022 |
Martin Storsjö <martin@martin.st> |
[ARM] Fix restoring stack for varargs with SEH split frame pointer push
Previously, the "add sp, #12" ended up inserted after "bx lr".
Differential Revision: https://reviews.llvm.org/D126872
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
2ab19bfa |
| 26-Nov-2021 |
Martin Storsjö <martin@martin.st> |
[ARM] Adjust the frame pointer when it's needed for SEH unwinding
For functions that require restoring SP from FP (e.g. that need to align the stack, or that have variable sized allocations), the pr
[ARM] Adjust the frame pointer when it's needed for SEH unwinding
For functions that require restoring SP from FP (e.g. that need to align the stack, or that have variable sized allocations), the prologue and epilogue previously used to look like this:
push {r4-r5, r11, lr} add r11, sp, #8 ... sub r4, r11, #8 mov sp, r4 pop {r4-r5, r11, pc}
This is problematic, because this unwinding operation (restoring sp from r11 - offset) can't be expressed with the SEH unwind opcodes (probably because this unwind procedure doesn't map exactly to individual instructions; note the detour via r4 in the epilogue too).
To make unwinding work, the GPR push is split into two; the first one pushing all other registers, and the second one pushing r11+lr, so that r11 can be set pointing at this spot on the stack:
push {r4-r5} push {r11, lr} mov r11, sp ... mov sp, r11 pop {r11, lr} pop {r4-r5} bx lr
For the same setup, MSVC generates code that uses two registers; r11 still pointing at the {r11,lr} pair, but a separate register used for restoring the stack at the end:
push {r4-r5, r7, r11, lr} add r11, sp, #12 mov r7, sp ... mov sp, r7 pop {r4-r5, r7, r11, pc}
For cases with clobbered float/vector registers, they are pushed after the GPRs, before the {r11,lr} pair.
Differential Revision: https://reviews.llvm.org/D125649
show more ...
|