#
259a0cf3 |
| 24-Feb-2017 |
Justin Bogner <mail@justinbogner.com> |
Add missing initialization for MachineOptimizationRemarkEmitter
This was missed in r293110.
llvm-svn: 296096
|
#
43130592 |
| 18-Feb-2017 |
Matthias Braun <matze@braunis.de> |
MachineRegionInfo: Fix pass initialization
- Adapt MachineBasicBlock::getName() to have the same behavior as the IR BasicBlock (Value::getName()). - Add it to lib/CodeGen/CodeGen.cpp::initializeCo
MachineRegionInfo: Fix pass initialization
- Adapt MachineBasicBlock::getName() to have the same behavior as the IR BasicBlock (Value::getName()). - Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in the CodeGen library. - MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region"). - MachineRegionInfo should depend on MachineDominatorTree, MachinePostDominatorTree and MachineDominanceFrontier instead of their respective IR versions. - Since there were no tests for this, add a X86 MIR test.
Patch by Francis Visoiu Mistrih<fvisoiumistrih@apple.com>
llvm-svn: 295518
show more ...
|
Revision tags: llvmorg-4.0.0-rc2 |
|
#
a7c041d1 |
| 31-Jan-2017 |
Nirav Dave <niravd@google.com> |
[X86] Implement -mfentry
Summary: Insert calls to __fentry__ at function entry.
Reviewers: hfinkel, craig.topper
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/
[X86] Implement -mfentry
Summary: Insert calls to __fentry__ at function entry.
Reviewers: hfinkel, craig.topper
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D28000
llvm-svn: 293648
show more ...
|
Revision tags: llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
#
11e60ff7 |
| 14-Nov-2016 |
Tom Stellard <thomas.stellard@amd.com> |
RegAllocGreedy: Properly initialize this pass, so that -run-pass will work
Reviewers: qcolombet, MatzeB
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D26572
llvm
RegAllocGreedy: Properly initialize this pass, so that -run-pass will work
Reviewers: qcolombet, MatzeB
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D26572
llvm-svn: 286895
show more ...
|
#
36919a4f |
| 06-Oct-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Move AArch64BranchRelaxation to generic code
llvm-svn: 283459
|
#
40d7f5c2 |
| 01-Sep-2016 |
Hal Finkel <hfinkel@anl.gov> |
Add a counter-function insertion pass
As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is
Add a counter-function insertion pass
As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong.
Because we don't want the presence of these functions to affect optimizaton, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so we're passing it here as a function attribute. Clang will be updated to use this mechanism.
Differential Revision: https://reviews.llvm.org/D22825
llvm-svn: 280347
show more ...
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
#
254f889d |
| 29-Jul-2016 |
Brendon Cahoon <bcahoon@codeaurora.org> |
MachinePipeliner pass that implements Swing Modulo Scheduling
Software pipelining is an optimization for improving ILP by overlapping loop iterations. Swing Modulo Scheduling (SMS) is an implementat
MachinePipeliner pass that implements Swing Modulo Scheduling
Software pipelining is an optimization for improving ILP by overlapping loop iterations. Swing Modulo Scheduling (SMS) is an implementation of software pipelining that attempts to reduce register pressure and generate efficient pipelines with a low compile-time cost.
This implementaion of SMS is a target-independent back-end pass. When enabled, the pass should run just prior to the register allocation pass, while the machine IR is in SSA form. If the pass is successful, then the original loop is replaced by the optimized loop. The optimized loop contains one or more prolog blocks, the pipelined kernel, and one or more epilog blocks.
This pass is enabled for Hexagon only. To enable for other targets, a couple of target specific hooks must be implemented, and the pass needs to be called from the target's TargetMachine implementation.
Differential Review: http://reviews.llvm.org/D16829
llvm-svn: 277169
show more ...
|
#
52735fc4 |
| 14-Jul-2016 |
Dean Michael Berris <dberris@google.com> |
XRay: Add entry and exit sleds
Summary: In this patch we implement the following parts of XRay:
- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-alw
XRay: Add entry and exit sleds
Summary: In this patch we implement the following parts of XRay:
- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches. - Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts). - X86-specific nop sleds as described in the white paper. - A machine function pass that adds the different instrumentation marker instructions at a very late stage. - A way of identifying which return opcode is considered "normal" for each architecture.
There are some caveats here:
1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet.
2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.
Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk
Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D19904
llvm-svn: 275367
show more ...
|
#
90d195a5 |
| 08-Jul-2016 |
Wei Mi <wmi@google.com> |
[PM] Port UnreachableBlockElim to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D22124
llvm-svn: 274824
|
#
82d5da5a |
| 24-Jun-2016 |
Michael Kuperstein <mkuper@google.com> |
[PM] Port PreISelIntrinsicLowering to the new PM
llvm-svn: 273713
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
#
f9acacaa |
| 31-May-2016 |
Matthias Braun <matze@braunis.de> |
CodeGen: Refactor renameDisconnectedComponents() as a pass
Refactor LiveIntervals::renameDisconnectedComponents() to be a pass. Also change the name to "RenameIndependentSubregs":
- renameDisconnec
CodeGen: Refactor renameDisconnectedComponents() as a pass
Refactor LiveIntervals::renameDisconnectedComponents() to be a pass. Also change the name to "RenameIndependentSubregs":
- renameDisconnectedComponents() worked on a MachineFunction at a time so it is a natural candidate for a machine function pass.
- The algorithm is testable with a .mir test now.
- This also fixes a problem where the lazy renaming as part of the MachineScheduler introduced IMPLICIT_DEF instructions after the number of a nodes in a region were counted leading to a mismatch.
Differential Revision: http://reviews.llvm.org/D20507
llvm-svn: 271345
show more ...
|
#
330a1255 |
| 19-May-2016 |
Matthew Simpson <mssimpso@codeaurora.org> |
[ARM, AArch64] Properly initialize InterleavedAccessPass
InterleavedAccessPass is an IR-level pass, so this change will enable testing it with opt. This is part of D20250.
llvm-svn: 270101
|
#
fbe85ae1 |
| 28-Apr-2016 |
Matthias Braun <matze@braunis.de> |
CodeGen: Add DetectDeadLanes pass.
The DetectDeadLanes pass performs a dataflow analysis of used/defined subregister lanes across COPY instructions and instructions that will get lowered to copies.
CodeGen: Add DetectDeadLanes pass.
The DetectDeadLanes pass performs a dataflow analysis of used/defined subregister lanes across COPY instructions and instructions that will get lowered to copies. It detects dead definitions and uses reading undefined values which are obscured by COPY and subregister usage.
These dead definitions cause trouble in the register coalescer which cannot deal with definitions suddenly becoming dead after coalescing COPY instructions.
For now the pass only adds dead and undef flags to machine operands. It should be possible to extend it in the future to remove the dead instructions and redo the analysis for the affected virtual registers.
Differential Revision: http://reviews.llvm.org/D18427
llvm-svn: 267851
show more ...
|
#
7dd8dbf4 |
| 22-Apr-2016 |
Peter Collingbourne <peter@pcc.me.uk> |
Introduce llvm.load.relative intrinsic.
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and ret
Introduce llvm.load.relative intrinsic.
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant.
Differential Revision: http://reviews.llvm.org/D18367
llvm-svn: 267223
show more ...
|
#
ee34680b |
| 22-Apr-2016 |
Tom Stellard <thomas.stellard@amd.com> |
CodeGen: Add a stand-alone hazard recognizer pass
Summary: This new pass allows targets to use the hazard recognizer without having to also run one of the schedulers. This is useful when compiling
CodeGen: Add a stand-alone hazard recognizer pass
Summary: This new pass allows targets to use the hazard recognizer without having to also run one of the schedulers. This is useful when compiling with optimizations disabled for targets that still need noop hazards to be handled correctly.
Reviewers: hfinkel, atrick
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18594
llvm-svn: 267156
show more ...
|
#
c0441c29 |
| 19-Apr-2016 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
Introduce a "patchable-function" function attribute
Summary: The `"patchable-function"` attribute can be used by an LLVM client to influence LLVM's code generation in ways that makes the generated c
Introduce a "patchable-function" function attribute
Summary: The `"patchable-function"` attribute can be used by an LLVM client to influence LLVM's code generation in ways that makes the generated code easily patchable at runtime (for instance, to redirect control). Right now only one patchability scheme is supported, `"prologue-short-redirect"`, but this can be expanded in the future.
Reviewers: joker.eph, rnk, echristo, dberris
Subscribers: joker.eph, echristo, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19046
llvm-svn: 266715
show more ...
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2 |
|
#
390c33cd |
| 27-Jan-2016 |
Benjamin Kramer <benny.kra@googlemail.com> |
Move SafeStack to CodeGen.
It depends on the target machinery, that's not available for instrumentation passes.
llvm-svn: 258942
|
Revision tags: llvmorg-3.8.0-rc1 |
|
#
859ad29b |
| 16-Dec-2015 |
Vikram TV <vikram.tarikere@gmail.com> |
Recommit LiveDebugValues pass after fixing a couple of minor issues.
llvm-svn: 255759
|
#
ceca9715 |
| 09-Dec-2015 |
Mehdi Amini <mehdi.amini@apple.com> |
Revert "Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933"
This reverts c
Revert "Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933"
This reverts commit r255096.
Break the bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/16378/
From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255101
show more ...
|
#
0876d2d5 |
| 09-Dec-2015 |
Vikram TV <vikram.tarikere@gmail.com> |
Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933
llvm-svn: 255096
|
Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
#
97890230 |
| 17-Sep-2015 |
David Majnemer <david.majnemer@gmail.com> |
[WinEH] Add a funclet layout pass
Windows EH funclets need to be contiguous. The FuncletLayout pass will ensure that the funclets are together and begin with a funclet entry MBB.
Differential Revi
[WinEH] Add a funclet layout pass
Windows EH funclets need to be contiguous. The FuncletLayout pass will ensure that the funclets are together and begin with a funclet entry MBB.
Differential Revision: http://reviews.llvm.org/D12943
llvm-svn: 247937
show more ...
|
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
#
69fad079 |
| 15-Jun-2015 |
Sanjoy Das <sanjoy@playingwithpointers.com> |
[CodeGen] Add a pass to fold null checks into nearby memory operations.
Summary: This change adds an "ImplicitNullChecks" target dependent pass. This pass folds null checks into memory operation us
[CodeGen] Add a pass to fold null checks into nearby memory operations.
Summary: This change adds an "ImplicitNullChecks" target dependent pass. This pass folds null checks into memory operation using the FAULTING_LOAD pseudo-op introduced in previous patches.
Depends on D10197 Depends on D10199 Depends on D10200
Reviewers: reames, rnk, pgavlin, JosephTremoulet, atrick
Reviewed By: atrick
Subscribers: ab, JosephTremoulet, llvm-commits
Differential Revision: http://reviews.llvm.org/D10201
llvm-svn: 239743
show more ...
|
Revision tags: llvmorg-3.6.1, llvmorg-3.6.1-rc1 |
|
#
61b305ed |
| 05-May-2015 |
Quentin Colombet <qcolombet@apple.com> |
[ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find
[ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exits blocks.
As an example and to avoid regressions to be introduce, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64.
** Context **
Currently we insert the prologue and epilogue of the method/function in the entry and exits blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places.
** Motivating example **
Let us consider the following function that perform a call only in one branch of a if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false
true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false
false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 }
On AArch64 this code generates (removing the cfi directives to ease readabilities): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret
With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret
Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.
** Proposed Solution **
This patch introduces a new machine pass that perform the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI.
Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hack to properly avoid frequently executed point. Instead, it relies on dominance and loop properties.
The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap.
Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block.
** Design Decisions **
1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save point and restore point, the impacted component would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore point instead of one pointer. - PEI: Should loop over the save point and restore point. Anyhow, at least for this first iteration, I do not believe this is interesting to support the complex cases. We should revisit that when we motivating examples.
Differential Revision: http://reviews.llvm.org/D9210
<rdar://problem/3201744>
llvm-svn: 236507
show more ...
|
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1 |
|
#
be0a0506 |
| 09-Mar-2015 |
Reid Kleckner <reid@kleckner.net> |
Reland r229944: EH: Prune unreachable resume instructions during Dwarf EH preparation
Fix the double-deletion of AnalysisResolver when delegating through to Dwarf EH preparation by creating one from
Reland r229944: EH: Prune unreachable resume instructions during Dwarf EH preparation
Fix the double-deletion of AnalysisResolver when delegating through to Dwarf EH preparation by creating one from scratch. Hopefully the new pass manager simplifies this.
This reverts commit r229952.
llvm-svn: 231719
show more ...
|
Revision tags: llvmorg-3.6.0 |
|
#
301ed0c3 |
| 20-Feb-2015 |
Chandler Carruth <chandlerc@gmail.com> |
Revert r229944: EH: Prune unreachable resume instructions during Dwarf EH preparation
This doesn't pass 'ninja check-llvm' for me. Lots of tests, including the ones updated, fail with crashes and ot
Revert r229944: EH: Prune unreachable resume instructions during Dwarf EH preparation
This doesn't pass 'ninja check-llvm' for me. Lots of tests, including the ones updated, fail with crashes and other explosions.
llvm-svn: 229952
show more ...
|