#
a2b90112 |
| 27-Feb-2018 |
Geoff Berry <gberry@codeaurora.org> |
Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Re-enable commit r323991 now that r325931 has been committed to make MachineOperand::isRenamable() check more conservati
Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Re-enable commit r323991 now that r325931 has been committed to make MachineOperand::isRenamable() check more conservative w.r.t. code changes and opt-in on a per-target basis.
llvm-svn: 326208
show more ...
|
#
9386bde1 |
| 24-Feb-2018 |
Heejin Ahn <aheejin@gmail.com> |
[WebAssembly] Add exception handling option and feature
Summary: Add a llc command line option and WebAssembly architecture feature for exception handling.
Reviewers: dschuff
Subscribers: jfb, sbc
[WebAssembly] Add exception handling option and feature
Summary: Add a llc command line option and WebAssembly architecture feature for exception handling.
Reviewers: dschuff
Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D43683
llvm-svn: 326004
show more ...
|
Revision tags: llvmorg-6.0.0-rc3 |
|
#
48abac82 |
| 17-Feb-2018 |
Quentin Colombet <qcolombet@apple.com> |
Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r323991.
This commit breaks target that don't model all the register constraints in TableGen. So far t
Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r323991.
This commit breaks target that don't model all the register constraints in TableGen. So far the workaround was to set the hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the cases. For instance, when mutating an instruction (like in the lowering of COPYs) the isRenamable flag is not properly updated. The same problem will happen when attaching machine operand from one instruction to another.
Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.
llvm-svn: 325421
show more ...
|
Revision tags: llvmorg-6.0.0-rc2 |
|
#
10003e31 |
| 07-Feb-2018 |
Clement Courbet <courbet@google.com> |
[MergeICmps] Re-commit rL324317 "Enable the MergeICmps Pass by default."
With fixes from rL324341.
Original commit message:
[MergeICmps] Enable the MergeICmps Pass by default.
Summary: Now that P
[MergeICmps] Re-commit rL324317 "Enable the MergeICmps Pass by default."
With fixes from rL324341.
Original commit message:
[MergeICmps] Enable the MergeICmps Pass by default.
Summary: Now that PR33325 is fixed, this should always improve the generated code.
Reviewers: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42793
llvm-svn: 324465
show more ...
|
#
333be329 |
| 06-Feb-2018 |
Clement Courbet <courbet@google.com> |
Revert "[MergeICmps] Enable the MergeICmps Pass by default."
Breaks clang-ppc64be-linux-multistage buildbot.
This reverts commit 515bab711f308c2e8299c49dd8c84ea6a2e0b60e.
llvm-svn: 324319
|
#
7d09780f |
| 06-Feb-2018 |
Clement Courbet <courbet@google.com> |
[MergeICmps] Enable the MergeICmps Pass by default.
Summary: Now that PR33325 is fixed, this should always improve the generated code.
Reviewers: spatel
Subscribers: llvm-commits
Differential Rev
[MergeICmps] Enable the MergeICmps Pass by default.
Summary: Now that PR33325 is fixed, this should always improve the generated code.
Reviewers: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42793
llvm-svn: 324317
show more ...
|
#
94503c7b |
| 01-Feb-2018 |
Geoff Berry <gberry@codeaurora.org> |
[MachineCopyPropagation] Extend pass to do COPY source forwarding
Summary: This change extends MachineCopyPropagation to do COPY source forwarding and adds an additional run of the pass to the defau
[MachineCopyPropagation] Extend pass to do COPY source forwarding
Summary: This change extends MachineCopyPropagation to do COPY source forwarding and adds an additional run of the pass to the default pass pipeline just after register allocation.
This version of this patch uses the newly added MachineOperand::isRenamable bit to avoid forwarding registers is such a way as to violate constraints that aren't captured in the Machine IR (e.g. ABI or ISA constraints).
This change is a continuation of the work started in D30751.
Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar
Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits
Differential Revision: https://reviews.llvm.org/D41835
llvm-svn: 323991
show more ...
|
#
f386e2b0 |
| 24-Jan-2018 |
Amara Emerson <aemerson@apple.com> |
[GlobalISel] Don't fall back to FastISel.
Apparently checking the pass structure isn't enough to ensure that we don't fall back to FastISel, as it's set up as part of the SelectionDAGISel.
llvm-svn
[GlobalISel] Don't fall back to FastISel.
Apparently checking the pass structure isn't enough to ensure that we don't fall back to FastISel, as it's set up as part of the SelectionDAGISel.
llvm-svn: 323369
show more ...
|
#
c58f2166 |
| 22-Jan-2018 |
Chandler Carruth <chandlerc@gmail.com> |
Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", an
Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..
Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction.
There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel.
When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline.
We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
llvm-svn: 323155
show more ...
|
#
4a7c8e7a |
| 19-Jan-2018 |
Matthias Braun <matze@braunis.de> |
Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC
This avoids playing games with pseudo pass IDs and avoids using an unreliable MRI::isSSA() check to determine whether register allocation
Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC
This avoids playing games with pseudo pass IDs and avoids using an unreliable MRI::isSSA() check to determine whether register allocation has happened.
Note that this renames: - MachineLICMID -> EarlyMachineLICM - PostRAMachineLICMID -> MachineLICMID to be consistent with the EarlyTailDuplicate/TailDuplicate naming.
llvm-svn: 322927
show more ...
|
#
3ab9fcb9 |
| 19-Jan-2018 |
Matthias Braun <matze@braunis.de> |
Split TailDuplicatePass into pre- and post-RA variant; NFC
Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to d
Split TailDuplicatePass into pre- and post-RA variant; NFC
Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to determine pre-/post-RA state.
llvm-svn: 322926
show more ...
|
#
4aa73a64 |
| 18-Jan-2018 |
Volkan Keles <vkeles@apple.com> |
Fix the failure caused by r322773
Do not run GlobalISel if `-fast-isel=0 -global-isel=false`.
llvm-svn: 322800
|
#
a79b0620 |
| 17-Jan-2018 |
Volkan Keles <vkeles@apple.com> |
Add a TargetOption to enable/disable GlobalISel
Summary: This patch adds a new target option in order to control GlobalISel. This will allow the users to enable/disable GlobalISel prior to the backe
Add a TargetOption to enable/disable GlobalISel
Summary: This patch adds a new target option in order to control GlobalISel. This will allow the users to enable/disable GlobalISel prior to the backend by calling `TargetMachine::setGlobalISel(bool Enable)`.
No test case as there is already a test to check GlobalISel command line options. See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll.
Reviewers: qcolombet, aemerson, ab, dsanders
Reviewed By: qcolombet
Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42137
llvm-svn: 322773
show more ...
|
Revision tags: llvmorg-6.0.0-rc1 |
|
#
854d10d1 |
| 02-Jan-2018 |
Amara Emerson <aemerson@apple.com> |
[AArch64][GlobalISel] Enable GlobalISel at -O0 by default
Tests updated to explicitly use fast-isel at -O0 instead of implicitly.
This change also allows an explicit -fast-isel option to override a
[AArch64][GlobalISel] Enable GlobalISel at -O0 by default
Tests updated to explicitly use fast-isel at -O0 instead of implicitly.
This change also allows an explicit -fast-isel option to override an implicitly enabled global-isel. Otherwise -fast-isel would have no effect at -O0.
Differential Revision: https://reviews.llvm.org/D41362
llvm-svn: 321655
show more ...
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3 |
|
#
8065f0b9 |
| 01-Dec-2017 |
Zachary Turner <zturner@google.com> |
Mark all library options as hidden.
These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are alre
Mark all library options as hidden.
These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing.
This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on.
Differential Revision: https://reviews.llvm.org/D40674
llvm-svn: 319505
show more ...
|
Revision tags: llvmorg-5.0.1-rc2 |
|
#
e1ecd61b |
| 14-Nov-2017 |
Hans Wennborg <hans@hanshq.net> |
Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __
Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.
This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.)
LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes.
Differential Revision: https://reviews.llvm.org/D39287
llvm-svn: 318195
show more ...
|
#
063bed9b |
| 03-Nov-2017 |
Clement Courbet <courbet@google.com> |
re-land [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/.
llvm-svn: 317318
|
#
82bade61 |
| 02-Nov-2017 |
Clement Courbet <courbet@google.com> |
Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage
This reverts commit eea333c33fa73ad2
Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage
This reverts commit eea333c33fa73ad225ef28607795984829f65688.
llvm-svn: 317213
show more ...
|
#
1dc37b9c |
| 02-Nov-2017 |
Clement Courbet <courbet@google.com> |
[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) a
[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around.
See the discussion in PR33325 for more details.
Reviewers: spatel
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D39456
llvm-svn: 317211
show more ...
|
Revision tags: llvmorg-5.0.1-rc1 |
|
#
bb8507e6 |
| 12-Oct-2017 |
Matthias Braun <matze@braunis.de> |
Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the lldb bots.
This reverts commit r315633.
llvm-svn: 315637
show more ...
|
#
3a9c114b |
| 12-Oct-2017 |
Matthias Braun <matze@braunis.de> |
TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.
- There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMach
TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.
- There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMachine. - It should still be possible to stub out all the various functions in case a target does not want to use lib/CodeGen - This simplifies the code and avoids methods ending up in the wrong interface.
Differential Revision: https://reviews.llvm.org/D38489
llvm-svn: 315633
show more ...
|
#
13593843 |
| 07-Oct-2017 |
Jessica Paquette <jpaquette@apple.com> |
[MachineOutliner] Disable outlining from LinkOnceODRs by default
Say you have two identical linkonceodr functions, one in M1 and one in M2. Say that the outliner outlines A,B,C from one function, an
[MachineOutliner] Disable outlining from LinkOnceODRs by default
Say you have two identical linkonceodr functions, one in M1 and one in M2. Say that the outliner outlines A,B,C from one function, and D,E,F from another function (where letters are instructions). Now those functions are not identical, and cannot be deduped. Locally to M1 and M2, these outlining choices would be good-- to the whole program, however, this might not be true!
To mitigate this, this commit makes it so that the outliner sees linkonceodr functions as unsafe to outline from. It also adds a flag, -enable-linkonceodr-outlining, which allows the user to specify that they want to outline from such functions when they know what they're doing.
Changing this handles most code size regressions in the test suite caused by competing with linker dedupe. It also doesn't have a huge impact on the code size improvements from the outliner. There are 6 tests that regress > 5% from outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most tests either improve or are not impacted.
Not outlined vs outlined without linkonceodrs: https://hastebin.com/raw/qeguxavuda
Not outlined vs outlined with linkonceodrs: https://hastebin.com/raw/edepoqoqic
Outlined with linkonceodrs vs outlined without linkonceodrs: https://hastebin.com/raw/awiqifiheb
Numbers generated using compare.py with -m size.__text. Tests run for AArch64 with -Oz -mllvm -enable-machine-outliner -mno-red-zone.
llvm-svn: 315136
show more ...
|
#
fabedbad |
| 03-Oct-2017 |
Geoff Berry <gberry@codeaurora.org> |
Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This reverts commit r314729.
Another bug has been encountered in an out-of-tree target reported by Quentin.
l
Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This reverts commit r314729.
Another bug has been encountered in an out-of-tree target reported by Quentin.
llvm-svn: 314814
show more ...
|
#
bfc5fb45 |
| 02-Oct-2017 |
Geoff Berry <gberry@codeaurora.org> |
Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review: - Avoid bug in regalloc greedy/machine verifier when forwarding to use in an i
Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review: - Avoid bug in regalloc greedy/machine verifier when forwarding to use in an instruction that re-defines the same virtual register. - Fixed bug when forwarding to use in EarlyClobber instruction slot. - Fixed incorrect forwarding to register definitions that showed up in explicit_uses() iterator (e.g. in INLINEASM). - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use).
[MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses.
llvm-svn: 314729
show more ...
|
#
34e66217 |
| 12-Sep-2017 |
Lei Huang <lei@ca.ibm.com> |
Update branch coalescing to be a PowerPC specific pass
Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any im
Update branch coalescing to be a PowerPC specific pass
Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets.
Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce.
Differential Revision : https: // reviews.llvm.org/D32776
llvm-svn: 313061
show more ...
|