#
c5490e5a |
| 15-May-2017 |
Ayman Musa <ayman.musa@intel.com> |
[X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option.
Currently, when masked load, store, gather or scatter intrinsics are used, the CodeGenPrepare pass checks whether the subtarget supports them; if it does not, they are replaced with scalar code. This is a functional transformation, not an optimization, so it is not optional.
The CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).
Functional transformations should run at all optimization levels, so this patch creates a new pass which runs at every optimization level and does nothing more than this transformation.
Differential Revision: https://reviews.llvm.org/D32487
llvm-svn: 303050
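As a rough illustration of what "replace them with scalar code" means for a masked load, the following sketch expands the operation lane by lane. It is not the pass's implementation: the function name `scalarizedMaskedLoad` and the use of `std::array` for the mask and pass-through vector are assumptions made for this example only.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scalar expansion of a masked load over N lanes.
// A lane is read from memory only when its mask bit is set; otherwise
// the pass-through value is kept, so disabled lanes never touch memory.
template <typename T, std::size_t N>
std::array<T, N> scalarizedMaskedLoad(const T *Ptr,
                                      const std::array<bool, N> &Mask,
                                      const std::array<T, N> &PassThru) {
  std::array<T, N> Result = PassThru;
  for (std::size_t I = 0; I < N; ++I)
    if (Mask[I])
      Result[I] = Ptr[I];
  return Result;
}
```

Because the result depends only on the mask and memory contents, this kind of expansion has to happen regardless of optimization level on a subtarget without native masked loads.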
|
#
65dd23e2 |
| 12-May-2017 |
Dehao Chen <dehao@google.com> |
Add LiveRangeShrink pass to shrink live range within BB.
Summary: The LiveRangeShrink pass moves an instruction right after the definition in the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
llvm-svn: 302938
|
#
836b0f48 |
| 10-May-2017 |
Amara Emerson <amara.emerson@arm.com> |
Add a late IR expansion pass for the experimental reduction intrinsics.
This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shufflevector sequence.
Differential Revision: https://reviews.llvm.org/D32245
llvm-svn: 302631
|
Revision tags: llvmorg-4.0.1-rc1 |
|
#
7b0d9474 |
| 04-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Allow targets to opt-in to codegen in SCC order
Decouple this setting from EnableIRPA.
To support function calls on AMDGPU, it is necessary to report the global register usage throughout the kernel's call graph, so callees need to be handled first.
llvm-svn: 299487
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4 |
|
#
596f483a |
| 06-Mar-2017 |
Jessica Paquette <jpaquette@apple.com> |
[Outliner] Fixed Asan bot failure in r296418
Fixed the asan bot failure which led to the last commit of the outliner being reverted. The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector is no longer initialized using reserve but just a standard constructor.
llvm-svn: 297081
|
Revision tags: llvmorg-4.0.0-rc3 |
|
#
120ae22d |
| 01-Mar-2017 |
Ahmed Bougacha <ahmed.bougacha@gmail.com> |
[GlobalISel] Add a way for targets to enable GISel.
Until now, we've had to use -global-isel to enable GISel. But using that on other targets that don't support it will result in an abort, as we can't build a full pipeline. Additionally, we want to experiment with enabling GISel by default for some targets: we can't just enable GISel by default, even among those targets that do have some support, because the level of support varies.
This first step adds an override for the target to explicitly define its level of support. For AArch64, do that using a new command-line option (I know..): -aarch64-enable-global-isel-at-O=<N> Where N is the opt-level below which GISel should be used.
Default that to -1, so that we still don't enable GISel anywhere. We're not there yet!
While there, remove a couple LLVM_UNLIKELYs. Building the pipeline is such a cold path that in practice that shouldn't matter at all.
llvm-svn: 296710
|
#
b223cfab |
| 01-Mar-2017 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
Improve scheduling with branch coalescing
This patch adds a MachineSSA pass that coalesces blocks that branch on the same condition.
Committing on behalf of Lei Huang.
Differential Revision: https://reviews.llvm.org/D28249
llvm-svn: 296670
|
#
81f68ec3 |
| 28-Feb-2017 |
Matthias Braun <matze@braunis.de> |
Revert "Add MIR-level outlining pass"
Revert Machine Outliner for now, as it breaks the asan bot.
This reverts commit r296418.
llvm-svn: 296426
|
#
d3641094 |
| 28-Feb-2017 |
Matthias Braun <matze@braunis.de> |
Add MIR-level outlining pass
This is a patch for the outliner described in the RFC at: http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html
The outliner is a code-size reduction pass which works by finding repeated sequences of instructions in a program, and replacing them with calls to functions. This is useful to people working in low-memory environments, where sacrificing performance for space is acceptable.
This adds an interprocedural outliner directly before printing assembly. For reference on how this would work, this patch also includes X86 target hooks and an X86 test.
The outliner is run like so:
clang -mno-red-zone -mllvm -enable-machine-outliner file.c
Patch by Jessica Paquette <jpaquette@apple.com>!
rdar://29166825
Differential Revision: https://reviews.llvm.org/D26872
llvm-svn: 296418
|
Revision tags: llvmorg-4.0.0-rc2 |
|
#
5d2bd8dd |
| 05-Feb-2017 |
Kamil Rytarowski <n54@gmx.com> |
Revamp llvm::once_flag to be closer to std::once_flag
Summary: Make this interface reusable similarly to std::call_once and std::once_flag interface.
This makes porting LLDB to NetBSD easier, as the original approach had no portable way to specify a non-static once_flag. With this change, translating std::once_flag to llvm::once_flag is mechanical.
Sponsored by <The NetBSD Foundation>
Reviewers: mehdi_amini, labath, joerg
Reviewed By: mehdi_amini
Subscribers: emaste, clayborg
Differential Revision: https://reviews.llvm.org/D29566
llvm-svn: 294143
|
#
a7c041d1 |
| 31-Jan-2017 |
Nirav Dave <niravd@google.com> |
[X86] Implement -mfentry
Summary: Insert calls to __fentry__ at function entry.
Reviewers: hfinkel, craig.topper
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D28000
llvm-svn: 293648
|
Revision tags: llvmorg-4.0.0-rc1 |
|
#
e2d2ead6 |
| 08-Dec-2016 |
Matthias Braun <matze@braunis.de> |
TargetPassConfig: Rename DisablePostRA -> DisablePostRASched; NFC
llvm-svn: 289003
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
#
35a024fe |
| 28-Oct-2016 |
Matthias Braun <matze@braunis.de> |
TargetPassConfig: Move addPass of IPRA RegUsageInfoProp down.
TargetPassConfig::addMachinePasses() does some housekeeping first: Handling the -print-machineinstrs flag and doing an initial printing "After Instruction Selection". There is no reason for RegUsageInfoProp to run before those two steps.
llvm-svn: 285422
|
#
732afdd0 |
| 08-Oct-2016 |
Mehdi Amini <mehdi.amini@apple.com> |
Turn cl::values() (for enum) from a vararg function to using C++ variadic template
The core of the change is supposed to be NFC; however, it also fixes what I believe was undefined behavior when calling:
va_start(ValueArgs, Desc);
with Desc being a StringRef.
Differential Revision: https://reviews.llvm.org/D25342
llvm-svn: 283671
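The general shape of replacing a C-style vararg function with a variadic template can be sketched as follows. This is an illustration only and not the actual `cl::values()` code: `EnumOption` and `makeValues` are hypothetical names invented for the example. Every argument keeps its static type, so `va_start` and `va_arg` are not needed at all.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one enum option: name, value, description.
struct EnumOption {
  std::string Name;
  int Value;
  std::string Desc;
};

// Variadic-template collector: the parameter pack is expanded into the
// vector's initializer list, so no va_list machinery is involved.
template <typename... Opts>
std::vector<EnumOption> makeValues(Opts &&...Options) {
  return {EnumOption(std::forward<Opts>(Options))...};
}
```

A call such as `makeValues(EnumOption{"fast", 0, "fast mode"}, EnumOption{"slow", 1, "slow mode"})` is type-checked at compile time, which a `...` parameter cannot offer.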
|
#
729c9890 |
| 23-Sep-2016 |
Matthias Braun <matze@braunis.de> |
llc: Add -start-before/-stop-before options
Differential Revision: https://reviews.llvm.org/D23089
llvm-svn: 282302
|
#
40d7f5c2 |
| 01-Sep-2016 |
Hal Finkel <hfinkel@anl.gov> |
Add a counter-function insertion pass
As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong.
Because we don't want the presence of these functions to affect optimization, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so we're passing it here as a function attribute. Clang will be updated to use this mechanism.
Differential Revision: https://reviews.llvm.org/D22825
llvm-svn: 280347
|
#
1c06a73a |
| 31-Aug-2016 |
Quentin Colombet <qcolombet@apple.com> |
[TargetPassConfig] Add a hook to tell whether GlobalISel should warn on fallback.
Thanks to this patch, we now have a way to easily see if GlobalISel failed.
llvm-svn: 280273
|
#
0de43b22 |
| 26-Aug-2016 |
Quentin Colombet <qcolombet@apple.com> |
[TargetPassConfig] Add a target hook to know what GlobalISel should do on error.
By default, this hook tells GlobalISel to abort (report a fatal error) when it encounters an error. The alternative will be to fall back on SDISel. This fall back will be removed when the bring-up of GlobalISel is over.
llvm-svn: 279879
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2 |
|
#
3447ca3f |
| 18-Aug-2016 |
Alex Bradbury <asb@lowrisc.org> |
(Trivial) TargetPassConfig: assert when TargetMachine has no MCAsmInfo
Summary: This is pretty trivial, but I thought it was worth just checking that nobody feels it's completely the wrong thing to be doing.
The motivation is that when starting a new backend, you often start with a minimal stub, pretty much just FooTargetMachine and FooTargetInfo. Once that's built, you might naturally try `llc -march=foo myinput.ll` and it seems more developer-friendly if this ends up asserting due to the lack of MCAsmInfo with an informative message rather than just segfaulting.
Reviewers: MatzeB, chandlerc
Subscribers: bogner, llvm-commits
Differential Revision: https://reviews.llvm.org/D23443
llvm-svn: 279061
|
#
b03fd12c |
| 17-Aug-2016 |
Justin Bogner <mail@justinbogner.com> |
Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.
llvm-svn: 278902
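For readers unfamiliar with the pattern, the sketch below shows an annotated fall-through in a switch. It uses the standard C++17 `[[fallthrough]]` attribute rather than the LLVM_FALLTHROUGH macro so that it compiles without LLVM headers; treating the two as interchangeable here is an assumption of the example, and `costOf` is a hypothetical function.

```cpp
// Hypothetical example: level 2 pays its own cost and then
// deliberately continues into the level 1 case.
int costOf(int Level) {
  int Cost = 0;
  switch (Level) {
  case 2:
    Cost += 10;
    [[fallthrough]]; // replaces a "// fall through" comment
  case 1:
    Cost += 1;
    break;
  default:
    break;
  }
  return Cost;
}
```

Unlike a comment, the annotation is visible to the compiler, so implicit fall-through warnings can distinguish intended cases from missing `break` statements.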
|
Revision tags: llvmorg-3.9.0-rc1 |
|
#
52735fc4 |
| 14-Jul-2016 |
Dean Michael Berris <dberris@google.com> |
XRay: Add entry and exit sleds
Summary: In this patch we implement the following parts of XRay:
- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.
There are some caveats here:
1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time, so that by default it is unpacked for platforms where XRay is not available yet.
2) The generated section for X86 is different from what is described in the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper in this respect to allow us to get richer information from the runtime library.
Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk
Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D19904
llvm-svn: 275367
|
#
cfed2564 |
| 13-Jul-2016 |
Mehdi Amini <mehdi.amini@apple.com> |
Add EnableIPRA to TargetOptions, and move the cl::opt -enable-ipra to TargetMachine.cpp
Avoid exposing a cl::opt in a public header and instead promote this option in the API. Alternatively, we could land the cl::opt in CommandFlags.h so that it is available to every tool, but we would still have to find an option for clang.
llvm-svn: 275348
|
#
4beea662 |
| 13-Jul-2016 |
Mehdi Amini <mehdi.amini@apple.com> |
[IPRA] Set callee saved registers to none for local function when IPRA is enabled.
IPRA tries to optimize caller-saved registers by propagating register usage information from callee to caller, so it is beneficial to have caller-saved registers rather than callee-saved registers when IPRA is enabled. A more detailed explanation is available at https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/tjAJqb0eEgAJ.
This change makes local functions have no callee-preserved registers when IPRA is enabled. A simple test case is also added to verify this change.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21561
llvm-svn: 275347
|
#
d9d02d82 |
| 08-Jul-2016 |
David Majnemer <david.majnemer@gmail.com> |
[CodeGen, TargetPassConfig] Remove a race from createRegAllocPass
The createRegAllocPass reads and writes to a global variable 'Registry' via calls to getDefault and setDefault. Run this under a call_once to avoid races.
llvm-svn: 274875
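The pattern of guarding a read-then-write of shared global state with a call_once can be sketched with the standard library as below. This is a minimal sketch, not the code from this commit: `DefaultAllocator`, `initDefaultOnce`, `getDefaultAllocator`, and the `"greedy"` value are all hypothetical stand-ins, and `InitCount` exists only to make the single initialization observable.

```cpp
#include <mutex>
#include <string>

namespace {
std::once_flag InitDefaultFlag;
std::string DefaultAllocator; // hypothetical stand-in for the shared default
int InitCount = 0;            // counts how often the initializer ran

// Performs the check-and-set of the shared default exactly once.
void initDefaultOnce() {
  ++InitCount;
  if (DefaultAllocator.empty())
    DefaultAllocator = "greedy";
}
} // namespace

// Safe to call from several threads: std::call_once ensures that
// initDefaultOnce runs once and that its effects are visible to all callers.
const std::string &getDefaultAllocator() {
  std::call_once(InitDefaultFlag, initDefaultOnce);
  return DefaultAllocator;
}
```

Without the `once_flag`, two threads could both observe the empty default and both write it, which is the kind of race the commit message describes.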
|
#
bfa401e5 |
| 06-Jul-2016 |
George Burgess IV <george.burgess.iv@gmail.com> |
[CFLAA] Split into Anders+Steens analysis.
StratifiedSets (as implemented) is very fast, but its accuracy is also limited. If we take a more aggressive andersens-like approach, we can be way more accurate, but we'll also end up being slower.
So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA.
Long-term, we want to end up in a place where CFLSteens is queried first; if it can provide an answer, great (since queries are basically map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc.
This patch splits everything out so we can try to do something like that when we get a reasonable CFLAnders implementation.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21910
llvm-svn: 274589
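The long-term query order described in this message (ask the cheapest analysis first, fall back to the next one when it has no answer) can be sketched as a simple chain. This is a hypothetical illustration and not LLVM's alias-analysis API: `AliasResult`, `Query`, and `queryChain` are names made up for the example, and pointers are modeled as plain integers.

```cpp
#include <functional>
#include <optional>
#include <vector>

enum class AliasResult { NoAlias, MayAlias, MustAlias };

// Each analysis returns an answer, or std::nullopt when it cannot decide.
using Query = std::function<std::optional<AliasResult>(int, int)>;

// Ask each analysis in order, cheapest first; the first definite answer
// wins. If none can decide, fall back to the conservative MayAlias.
AliasResult queryChain(const std::vector<Query> &Analyses, int A, int B) {
  for (const Query &Q : Analyses)
    if (std::optional<AliasResult> R = Q(A, B))
      return *R;
  return AliasResult::MayAlias;
}
```

Ordering the chain from fastest to most precise keeps the common case cheap while still letting slower analyses refine the answer.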
|