Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
#
be179b99 |
| 11-Jan-2021 |
Paul Robinson <paul.robinson@sony.com> |
[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option
This option is not used for anything after #c161665 (D91737). This commit reapplies #a474657.
|
#
c161775d |
| 11-Jan-2021 |
Paul Robinson <Paul.Robinson@sony.com> |
[FastISel] Flush local value map on every instruction
Local values are constants or addresses that can't be folded into the instruction that uses them. FastISel materializes these in a "local value"
[FastISel] Flush local value map on every instruction
Local values are constants or addresses that can't be folded into the instruction that uses them. FastISel materializes these in a "local value" area that always dominates the current insertion point, to try to avoid materializing these values more than once (per block).
https://reviews.llvm.org/D43093 added code to sink these local value instructions to their first use, which has two beneficial effects. One, it is likely to avoid some unnecessary spills and reloads; two, it allows us to attach the debug location of the user to the local value instruction. The latter effect can improve the debugging experience for debuggers with a "set next statement" feature, such as the Visual Studio debugger and PS4 debugger, because instructions to set up constants for a given statement will be associated with the appropriate source line.
There are also some constants (primarily addresses) that could be produced by no-op casts or GEP instructions; the main difference from "local value" instructions is that these are values from separate IR instructions, and therefore could have multiple users across multiple basic blocks. D43093 avoided sinking these, even though they were emitted to the same "local value" area as the other instructions. The patch comment for D43093 states:
Local values may also be used by no-op casts, which adds the register to the RegFixups table. Without reversing the RegFixups map direction, we don't have enough information to sink these instructions.
This patch undoes most of D43093, and instead flushes the local value map after(*) every IR instruction, using that instruction's debug location. This avoids sometimes incorrect locations used previously, and emits instructions in a more natural order.
In addition, constants materialized due to PHI instructions are not assigned a debug location immediately; instead, when the local value map is flushed, if the first local value instruction has no debug location, it is given the same location as the first non-local-value-map instruction. This prevents PHIs from introducing unattributed instructions, which would either be implicitly attributed to the location for the preceding IR instruction, or given line 0 if they are at the beginning of a machine basic block. Neither of those consequences is good for debugging.
This does mean materialized values are not re-used across IR instruction boundaries; however, only about 5% of those values were reused in an experimental self-build of clang.
(*) Actually, just prior to the next instruction. It seems like it would be cleaner the other way, but I was having trouble getting that to work.
This reapplies commits cf1c774d and dc35368c, and adds the modification to PHI handling, which should avoid problems with debugging under gdb.
Differential Revision: https://reviews.llvm.org/D91734
show more ...
|
#
e5eb5c8a |
| 11-Jan-2021 |
Paul Robinson <paul.robinson@sony.com> |
NFC: Use -LABEL more
There were a number of tests needing updates for D91734, and I added a bunch of LABEL directives to help track down where those had to go. These directives are an improvement in
NFC: Use -LABEL more
There were a number of tests needing updates for D91734, and I added a bunch of LABEL directives to help track down where those had to go. These directives are an improvement independent of the functional patch, so I'm committing them as their own separate patch.
show more ...
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
615f63e1 |
| 01-Dec-2020 |
David Blaikie <dblaikie@gmail.com> |
Revert "[FastISel] Flush local value map on ever instruction" and dependent patches
This reverts commit cf1c774d6ace59c5adc9ab71b31e762c1be695b1.
This change caused several regressions in the gdb t
Revert "[FastISel] Flush local value map on ever instruction" and dependent patches
This reverts commit cf1c774d6ace59c5adc9ab71b31e762c1be695b1.
This change caused several regressions in the gdb test suite - at least a sample of which was due to line zero instructions making breakpoints un-lined. I think they're worth investigating/understanding more (& possibly addressing) before moving forward with this change.
Revert "[FastISel] NFC: Clean up unnecessary bookkeeping" This reverts commit 3fd39d3694d32efa44242c099e923a7f4d982095.
Revert "[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option" This reverts commit a474657e30edccd9e175d92bddeefcfa544751b2.
Revert "Remove static function unused after cf1c774." This reverts commit dc35368ccf17a7dca0874ace7490cc3836fb063f.
Revert "[lldb] Fix TestThreadStepOut.py after "Flush local value map on every instruction"" This reverts commit 53a14a47ee89dadb8798ca8ed19848f33f4551d5.
show more ...
|
#
a474657e |
| 30-Nov-2020 |
Paul Robinson <paul.robinson@sony.com> |
[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option
This option is not used for anything after #dc35368 (D91734).
|
Revision tags: llvmorg-11.0.1-rc1 |
|
#
cf1c774d |
| 18-Nov-2020 |
Paul Robinson <Paul.Robinson@sony.com> |
[FastISel] Flush local value map on ever instruction
Local values are constants or addresses that can't be folded into the instruction that uses them. FastISel materializes these in a "local value"
[FastISel] Flush local value map on ever instruction
Local values are constants or addresses that can't be folded into the instruction that uses them. FastISel materializes these in a "local value" area that always dominates the current insertion point, to try to avoid materializing these values more than once (per block).
https://reviews.llvm.org/D43093 added code to sink these local value instructions to their first use, which has two beneficial effects. One, it is likely to avoid some unnecessary spills and reloads; two, it allows us to attach the debug location of the user to the local value instruction. The latter effect can improve the debugging experience for debuggers with a "set next statement" feature, such as the Visual Studio debugger and PS4 debugger, because instructions to set up constants for a given statement will be associated with the appropriate source line.
There are also some constants (primarily addresses) that could be produced by no-op casts or GEP instructions; the main difference from "local value" instructions is that these are values from separate IR instructions, and therefore could have multiple users across multiple basic blocks. D43093 avoided sinking these, even though they were emitted to the same "local value" area as the other instructions. The patch comment for D43093 states:
Local values may also be used by no-op casts, which adds the register to the RegFixups table. Without reversing the RegFixups map direction, we don't have enough information to sink these instructions.
This patch undoes most of D43093, and instead flushes the local value map after(*) every IR instruction, using that instruction's debug location. This avoids sometimes incorrect locations used previously, and emits instructions in a more natural order.
This does mean materialized values are not re-used across IR instruction boundaries; however, only about 5% of those values were reused in an experimental self-build of clang.
(*) Actually, just prior to the next instruction. It seems like it would be cleaner the other way, but I was having trouble getting that to work.
Differential Revision: https://reviews.llvm.org/D91734
show more ...
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4 |
|
#
89baeaef |
| 22-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "RegAllocFast: Rewrite and improve"
This reverts commit 73a6a164b84a8195defbb8f5eeb6faecfc478ad4.
|
Revision tags: llvmorg-11.0.0-rc3 |
|
#
73a6a164 |
| 22-Sep-2020 |
Muhammad Omair Javaid <omair.javaid@linaro.org> |
Revert "Reapply Revert "RegAllocFast: Rewrite and improve""
This reverts commit 55f9f87da2c2ad791b9e62cccb1c035e037444fa.
Breaks following buildbots: http://lab.llvm.org:8011/builders/lldb-arm-ubun
Revert "Reapply Revert "RegAllocFast: Rewrite and improve""
This reverts commit 55f9f87da2c2ad791b9e62cccb1c035e037444fa.
Breaks following buildbots: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4306 http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/9154
show more ...
|
#
55f9f87d |
| 21-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply Revert "RegAllocFast: Rewrite and improve"
This reverts commit dbd53a1f0c939a55e7719c39d08179468f9ad3dc.
Needed lldb test updates
|
#
dbd53a1f |
| 19-Sep-2020 |
Eric Christopher <echristo@gmail.com> |
Temporarily Revert "RegAllocFast: Rewrite and improve" as it's breaking a few tests in the lldb test suite.
Bot: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4226/steps/test/logs/stdio
Temporarily Revert "RegAllocFast: Rewrite and improve" as it's breaking a few tests in the lldb test suite.
Bot: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4226/steps/test/logs/stdio
This reverts commit c8757ff3aa7dd7a25a6343f6ef74a70c7be04325.
show more ...
|
#
c8757ff3 |
| 14-Sep-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
RegAllocFast: Rewrite and improve
This rewrites big parts of the fast register allocator. The basic strategy of doing block-local allocation hasn't changed but I tweaked several details:
Track regi
RegAllocFast: Rewrite and improve
This rewrites big parts of the fast register allocator. The basic strategy of doing block-local allocation hasn't changed but I tweaked several details:
Track register state on register units instead of physical registers. This simplifies and speeds up handling of register aliases. Process basic blocks in reverse order: Definitions are known to end register livetimes when walking backwards (contrary when walking forward then uses may or may not be a kill so we need heuristics).
Check register mask operands (calls) instead of conservatively assuming everything is clobbered. Enhance heuristics to detect killing uses: In case of a small number of defs/uses check if they are all in the same basic block and if so the last one is a killing use. Enhance heuristic for copy-coalescing through hinting: We check the first k defs of a register for COPYs rather than relying on there just being a single definition. When testing this on the full llvm test-suite including SPEC externals I measured:
average 5.1% reduction in code size for X86, 4.9% reduction in code on aarch64. (ranging between 0% and 20% depending on the test) 0.5% faster compiletime (some analysis suggests the pass is slightly slower than before, but we more than make up for it because later passes are faster with the reduced instruction count)
Also adds a few testcases that were broken without this patch, in particular bug 47278.
Patch mostly by Matthias Braun
show more ...
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1 |
|
#
08286994 |
| 11-Apr-2018 |
Reid Kleckner <rnk@google.com> |
[FastISel] Disable local value sinking by default
This is causing compilation timeouts on code with long sequences of local values and calls (i.e. foo(1); foo(2); foo(3); ...). It turns out that co
[FastISel] Disable local value sinking by default
This is causing compilation timeouts on code with long sequences of local values and calls (i.e. foo(1); foo(2); foo(3); ...). It turns out that code coverage instrumentation is a great way to create sequences like this, which how our users ran into the issue in practice.
Intel has a tool that detects these kinds of non-linear compile time issues, and Andy Kaylor reported it as PR37010.
The current sinking code scans the whole basic block once per local value sink, which happens before emitting each call. In theory, local values should only be introduced to be used by instructions between the current flush point and the last flush point, so we should only need to scan those instructions.
llvm-svn: 329822
show more ...
|
Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1 |
|
#
3a7a2e4a |
| 14-Mar-2018 |
Reid Kleckner <rnk@google.com> |
[FastISel] Sink local value materializations to first use
Summary: Local values are constants, global addresses, and stack addresses that can't be folded into the instruction that uses them. For exa
[FastISel] Sink local value materializations to first use
Summary: Local values are constants, global addresses, and stack addresses that can't be folded into the instruction that uses them. For example, when storing the address of a global variable into memory, we need to materialize that address into a register.
FastISel doesn't want to materialize any given local value more than once, so it generates all local value materialization code at EmitStartPt, which always dominates the current insertion point. This allows it to maintain a map of local value registers, and it knows that the local value area will always dominate the current insertion point.
The downside is that local value instructions are always emitted without a source location. This is done to prevent jumpy line tables, but it means that the local value area will be considered part of the previous statement. Consider this C code: call1(); // line 1 ++global; // line 2 ++global; // line 3 call2(&global, &local); // line 4
Today we end up with assembly and line tables like this: .loc 1 1 callq call1 leaq global(%rip), %rdi leaq local(%rsp), %rsi .loc 1 2 addq $1, global(%rip) .loc 1 3 addq $1, global(%rip) .loc 1 4 callq call2
The LEA instructions in the local value area have no source location and are treated as being on line 1. Stepping through the code in a debugger and correlating it with the assembly won't make much sense, because these materializations are only required for line 4.
This is actually problematic for the VS debugger "set next statement" feature, which effectively assumes that there are no registers live across statement boundaries. By sinking the local value code into the statement and fixing up the source location, we can make that feature work. This was filed as https://bugs.llvm.org/show_bug.cgi?id=35975 and https://crbug.com/793819.
This change is obviously not enough to make this feature work reliably in all cases, but I felt that it was worth doing anyway because it usually generates smaller, more comprehensible -O0 code. I measured a 0.12% regression in code generation time with LLC on the sqlite3 amalgamation, so I think this is worth doing.
There are some special cases worth calling out in the commit message: 1. local values materialized for phis 2. local values used by no-op casts 3. dead local value code
Local values can be materialized for phis, and this does not show up as a vreg use in MachineRegisterInfo. In this case, if there are no other uses, this patch sinks the value to the first terminator, EH label, or the end of the BB if nothing else exists.
Local values may also be used by no-op casts, which adds the register to the RegFixups table. Without reversing the RegFixups map direction, we don't have enough information to sink these instructions.
Lastly, if the local value register has no other uses, we can delete it. This comes up when fastisel tries two instruction selection approaches and the first materializes the value but fails and the second succeeds without using the local value.
Reviewers: aprantl, dblaikie, qcolombet, MatzeB, vsk, echristo
Subscribers: dotdash, chandlerc, hans, sdardis, amccarth, javed.absar, zturner, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D43093
llvm-svn: 327581
show more ...
|
Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1, llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1, llvmorg-3.6.1, llvmorg-3.6.1-rc1 |
|
#
e8d0c4cc |
| 06-May-2015 |
Ahmed Bougacha <ahmed.bougacha@gmail.com> |
[ARM][FastISel] Use TST #1 instead of CMP #0 for select.
Since r234249, i1 are sext instead of zext; because of that, doing "CMP rN, #0; IT EQ/NE" isn't correct anymore.
"TST #1" is the conservativ
[ARM][FastISel] Use TST #1 instead of CMP #0 for select.
Since r234249, i1 are sext instead of zext; because of that, doing "CMP rN, #0; IT EQ/NE" isn't correct anymore.
"TST #1" is the conservatively correct alternative - the tradeoff being that it doesn't have a 16-bit encoding -, so use that instead.
llvm-svn: 236569
show more ...
|
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1 |
|
#
945a660c |
| 27-Feb-2015 |
Mehdi Amini <mehdi.amini@apple.com> |
Change the fast-isel-abort option from bool to int to enable "levels"
Summary: Currently fast-isel-abort will only abort for regular instructions, and just warn for function calls, terminators, func
Change the fast-isel-abort option from bool to int to enable "levels"
Summary: Currently fast-isel-abort will only abort for regular instructions, and just warn for function calls, terminators, function arguments. There is already fast-isel-abort-args but nothing for calls and terminators.
This change turns the fast-isel-abort options into an integer option, so that multiple levels of strictness can be defined. This will help no being surprised when the "abort" option indeed does not abort, and enables the possibility to write test that verifies that no intrinsics are forgotten by fast-isel.
Reviewers: resistor, echristo
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D7941
From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 230775
show more ...
|
Revision tags: llvmorg-3.6.0, llvmorg-3.6.0-rc4, llvmorg-3.6.0-rc3, llvmorg-3.6.0-rc2, llvmorg-3.6.0-rc1, llvmorg-3.5.1, llvmorg-3.5.1-rc2, llvmorg-3.5.1-rc1, llvmorg-3.5.0, llvmorg-3.5.0-rc4, llvmorg-3.5.0-rc3 |
|
#
4bf6c01c |
| 19-Aug-2014 |
Juergen Ributzka <juergen@apple.com> |
Reapply [FastISel] Let the target decide first if it wants to materialize a constant (215588).
Note: This was originally reverted to track down a buildbot error. This commit exposed a latent bug tha
Reapply [FastISel] Let the target decide first if it wants to materialize a constant (215588).
Note: This was originally reverted to track down a buildbot error. This commit exposed a latent bug that was fixed in r215753. Therefore it is reapplied without any modifications.
I run it through SPEC2k and SPEC2k6 for AArch64 and it didn't introduce any new regeressions.
Original commit message: This changes the order in which FastISel tries to materialize a constant. Originally it would try to use a simple target-independent approach, which can lead to the generation of inefficient code.
On X86 this would result in the use of movabsq to materialize any 64bit integer constant - even for simple and small values such as 0 and 1. Also some very funny floating-point materialization could be observed too.
On AArch64 it would materialize the constant 0 in a register even the architecture has an actual "zero" register.
On ARM it would generate unnecessary mov instructions or not use mvn.
This change simply changes the order and always asks the target first if it likes to materialize the constant. This doesn't fix all the issues mentioned above, but it enables the targets to implement such optimizations.
Related to <rdar://problem/17420988>.
llvm-svn: 216006
show more ...
|
#
790bacf2 |
| 14-Aug-2014 |
Juergen Ributzka <juergen@apple.com> |
Revert several FastISel commits to track down a buildbot error.
This reverts: r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants." r215594 "[FastISel][X
Revert several FastISel commits to track down a buildbot error.
This reverts: r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants." r215594 "[FastISel][X86] Use XOR to materialize the "0" value." r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization." r215591 "[FastISel][AArch64] Make use of the zero register when possible." r215588 "[FastISel] Let the target decide first if it wants to materialize a constant." r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."
llvm-svn: 215673
show more ...
|
#
7cee768e |
| 13-Aug-2014 |
Juergen Ributzka <juergen@apple.com> |
[FastISel] Let the target decide first if it wants to materialize a constant.
This changes the order in which FastISel tries to materialize a constant. Originally it would try to use a simple target
[FastISel] Let the target decide first if it wants to materialize a constant.
This changes the order in which FastISel tries to materialize a constant. Originally it would try to use a simple target-independent approach, which can lead to the generation of inefficient code.
On X86 this would result in the use of movabsq to materialize any 64bit integer constant - even for simple and small values such as 0 and 1. Also some very funny floating-point materialization could be observed too.
On AArch64 it would materialize the constant 0 in a register even the architecture has an actual "zero" register.
On ARM it would generate unnecessary mov instructions or not use mvn.
This change simply changes the order and always asks the target first if it likes to materialize the constant. This doesn't fix all the issues mentioned above, but it enables the targets to implement such optimizations.
Related to <rdar://problem/17420988>.
llvm-svn: 215588
show more ...
|
Revision tags: llvmorg-3.5.0-rc2, llvmorg-3.5.0-rc1, llvmorg-3.4.2, llvmorg-3.4.2-rc1, llvmorg-3.4.1, llvmorg-3.4.1-rc2, llvmorg-3.4.1-rc1, llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1 |
|
#
a5153cb0 |
| 09-Sep-2013 |
Joey Gouly <joey.gouly@arm.com> |
[ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode. IT blocks can only be one instruction lonf, and can only contain a subset of the 16 instructions.
Patch by Artyom Skrobov!
[ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode. IT blocks can only be one instruction lonf, and can only contain a subset of the 16 instructions.
Patch by Artyom Skrobov!
llvm-svn: 190309
show more ...
|
#
71a78f96 |
| 20-Aug-2013 |
Jim Grosbach <grosbach@apple.com> |
ARM: Fix fast-isel copy/paste-o.
Update testcase to be more careful about checking register values. While regexes are general goodness for these sorts of testcases, in this example, the registers ar
ARM: Fix fast-isel copy/paste-o.
Update testcase to be more careful about checking register values. While regexes are general goodness for these sorts of testcases, in this example, the registers are constrained by the calling convention, so we can and should check their explicit values.
rdar://14779513
llvm-svn: 188819
show more ...
|
#
d7866790 |
| 16-Aug-2013 |
Jim Grosbach <grosbach@apple.com> |
ARM: Properly constrain comparison fastisel register classes.
Ongoing 'make the verifier happy' improvements to ARM fast-isel.
rdar://12594152
llvm-svn: 188595
|
Revision tags: llvmorg-3.3.1-rc1 |
|
#
18db1f2f |
| 14-Jun-2013 |
JF Bastien <jfb@google.com> |
Enable FastISel on ARM for Linux and NaCl, not MCJIT
This is a resubmit of r182877, which was reverted because it broken MCJIT tests on ARM. The patch leaves MCJIT on ARM as it was before: only enab
Enable FastISel on ARM for Linux and NaCl, not MCJIT
This is a resubmit of r182877, which was reverted because it broken MCJIT tests on ARM. The patch leaves MCJIT on ARM as it was before: only enabled for iOS. I've CC'ed people from the original review and revert.
FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl, but not MCJIT.
Thumb2 support needs a bit more work, mainly around register class restrictions.
The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch.
The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back.
The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change.
I ran all of lnt test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass. All the tests also pass on x86 make check-all. I also re-ran the check-all tests that failed on ARM, and they all seem to pass.
llvm-svn: 183966
show more ...
|
Revision tags: llvmorg-3.3.0, llvmorg-3.3.0-rc3 |
|
#
99bd2ae4 |
| 30-May-2013 |
Rafael Espindola <rafael.espindola@gmail.com> |
Revert r182937 and r182877.
r182877 broke MCJIT tests on ARM and r182937 was working around another failure by r182877.
This should make the ARM bots green.
llvm-svn: 182960
|
#
f60e0e44 |
| 29-May-2013 |
JF Bastien <jfb@google.com> |
Enable FastISel on ARM for Linux and NaCl
FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl.
Thumb2 support needs a bit more work, mainl
Enable FastISel on ARM for Linux and NaCl
FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl.
Thumb2 support needs a bit more work, mainly around register class restrictions.
The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch.
The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back.
The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change.
I ran all of test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass.
llvm-svn: 182877
show more ...
|
Revision tags: llvmorg-3.3.0-rc2 |
|
#
bd7c6e50 |
| 14-May-2013 |
Derek Schuff <dschuff@google.com> |
Fix ARM FastISel tests, as a first step to enabling ARM FastISel
ARM FastISel is currently only enabled for iOS non-Thumb1, and I'm working on enabling it for other targets. As a first step I've fix
Fix ARM FastISel tests, as a first step to enabling ARM FastISel
ARM FastISel is currently only enabled for iOS non-Thumb1, and I'm working on enabling it for other targets. As a first step I've fixed some of the tests. Changes to ARM FastISel tests: - Different triples don't generate the same relocations (especially movw/movt versus constant pool loads). Use a regex to allow either. - Mangling is different. Use a regex to allow either. - The reserved registers are sometimes different, so registers get allocated in a different order. Capture the names only where this occurs. - Add -verify-machineinstrs to some tests where it works. It doesn't work everywhere it should yet. - Add -fast-isel-abort to many tests that didn't have it before. - Split out the VarArg test from fast-isel-call.ll into its own test. This simplifies test setup because of --check-prefix.
Patch by JF Bastien
llvm-svn: 181801
show more ...
|