#
5bf8fef5 |
| 09-Dec-2014 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
IR: Split Metadata from Value
Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API.
I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin and I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself.
This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems.
Here's a quick guide for updating your code:
- `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do *not* have a `Type`.
- `MDNode`'s operands are all `Metadata *` (instead of `Value *`).
- `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.
If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode*`.
- `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`.
As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved, RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.)
If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive.
- An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).
As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt` (see the sketch after this list).
The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s).
In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was:
    MDNode *N = foo();
    bar(isa <ConstantInt>(N->getOperand(0)));
    baz(cast <ConstantInt>(N->getOperand(1)));
    bak(cast_or_null <ConstantInt>(N->getOperand(2)));
    bat(dyn_cast <ConstantInt>(N->getOperand(3)));
    bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));
you can trivially match its semantics with:
    MDNode *N = foo();
    bar(mdconst::hasa <ConstantInt>(N->getOperand(0)));
    baz(mdconst::extract <ConstantInt>(N->getOperand(1)));
    bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2)));
    bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3)));
    bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));
and when you transition your metadata schema to `MDInt`:
    MDNode *N = foo();
    bar(isa <MDInt>(N->getOperand(0)));
    baz(cast <MDInt>(N->getOperand(1)));
    bak(cast_or_null <MDInt>(N->getOperand(2)));
    bat(dyn_cast <MDInt>(N->getOperand(3)));
    bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));
- A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`.
`MetadataAsValue` is the *only* class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass.
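A rough sketch of crossing the two hierarchies under the rules above -- treating the exact helper names (`ConstantAsMetadata::get`, `MetadataAsValue::get`, `getValue`) as assumptions to be checked against `Metadata.h` -- might look like:
```
// Sketch only: bridging Value <-> Metadata after this split.
#include "llvm/IR/Constants.h"
#include "llvm/IR/Metadata.h"
using namespace llvm;

// A Constant can only appear as an MDNode operand via ConstantAsMetadata.
static MDNode *wrapConstant(LLVMContext &Ctx, ConstantInt *CI) {
  Metadata *Ops[] = {ConstantAsMetadata::get(CI)};
  return MDNode::get(Ctx, Ops);
}

// The three-step access described above: Metadata -> ConstantAsMetadata
// -> Constant -> ConstantInt (or just use the mdconst helpers instead).
static ConstantInt *unwrapConstantInt(const MDNode *N) {
  if (auto *CAM = dyn_cast_or_null<ConstantAsMetadata>(N->getOperand(0)))
    return dyn_cast<ConstantInt>(CAM->getValue());
  return nullptr;
}

// Intrinsic call sites take Values, so Metadata goes back through
// MetadataAsValue (the only legal holder of a LocalAsMetadata).
static Value *wrapForIntrinsic(LLVMContext &Ctx, Metadata *MD) {
  return MetadataAsValue::get(Ctx, MD);
}
```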
(I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.)
llvm-svn: 223802
|
Revision tags: llvmorg-3.5.1-rc1 |
|
#
61ba2e39 |
| 08-Dec-2014 |
Justin Bogner <mail@justinbogner.com> |
InstrProf: An intrinsic and lowering for instrumentation based profiling
Introduce the ``llvm.instrprof_increment`` intrinsic and the ``-instrprof`` pass. These provide the infrastructure for writing counters for profiling, as in clang's ``-fprofile-instr-generate``.
The implementation of the instrprof pass is ported directly out of the CodeGenPGO classes in clang, and with the follow-up in clang that rips that code out to use these new intrinsics, this ends up being NFC.
Doing the instrumentation this way opens some doors in terms of improving the counter performance. For example, this will make it simple to experiment with alternate lowering strategies, and allows us to try handling profiling specially in some optimizations if we want to.
Finally, this drastically simplifies the frontend and puts all of the lowering logic in one place.
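As a hedged illustration of what emitting the intrinsic from C++ could look like with `IRBuilder` -- the operand list (name pointer, function hash, number of counters, counter index) and the `Intrinsic::instrprof_increment` enum name are assumptions, not quoted from this commit:
```
// Hypothetical sketch: emit one llvm.instrprof_increment call.
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"
using namespace llvm;

static void emitCounterIncrement(IRBuilder<> &Builder, Module &M,
                                 GlobalVariable *FuncNameVar, uint64_t Hash,
                                 uint32_t NumCounters, uint32_t Index) {
  Value *Args[] = {
      Builder.CreateBitCast(FuncNameVar, Builder.getInt8PtrTy()),
      Builder.getInt64(Hash),        // hash identifying this function body
      Builder.getInt32(NumCounters), // counters allocated for the function
      Builder.getInt32(Index)};      // which counter this site bumps
  Builder.CreateCall(
      Intrinsic::getDeclaration(&M, Intrinsic::instrprof_increment), Args);
}
```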
llvm-svn: 223672
|
#
08de833c |
| 06-Dec-2014 |
Hans Wennborg <hans@hanshq.net> |
SelectionDAG switch lowering: Replace unreachable default with most popular case.
This can significantly reduce the size of the switch, allowing for more efficient lowering.
I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-negligible binary size increase. It might be worth looking into further.
SimplifyCFG currently does this transformation, but I'm working towards changing that so we can optimize harder based on unreachable defaults.
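For context, a minimal C++ illustration of the pattern this targets (the lowering heuristics themselves are not shown): a switch whose default branch is provably unreachable, where the most frequently targeted case body can safely be reused as the default.
```
// Illustration only: all values 0..7 are covered, so the default is dead.
// The lowering described above can then treat the most popular case as the
// new default instead of keeping an unreachable one.
int classify(unsigned Kind) {
  switch (Kind & 7) {
  case 0:
    return -1;
  case 1:
  case 2:
  case 3:
    return 0;
  case 4:
  case 5:
  case 6:
  case 7:
    return 1;
  default:
    __builtin_unreachable(); // Clang/GCC builtin: this path never executes
  }
}
```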
Differential Revision: http://reviews.llvm.org/D6510
llvm-svn: 223566
|
#
f1de34b8 |
| 04-Dec-2014 |
Elena Demikhovsky <elena.demikhovsky@intel.com> |
Masked Load / Store Intrinsics - the CodeGen part. I'm recommitting the codegen part of the patch. The vectorizer part will be sent for review again.
Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
    <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
    declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)
Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.
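As a rough sketch (not taken from the patch) of the kind of loop this enables vectorizing: the guarded accesses below need a per-element mask when vectorized, which these intrinsics provide on targets that report support.
```
// Sketch: the accesses to B[i] and A[i] only happen when the predicate
// holds, so a vectorized version needs llvm.masked.load/llvm.masked.store
// (or speculation) rather than plain wide loads and stores.
void saturate(int *A, const int *B, const int *Trigger, int N) {
  for (int i = 0; i < N; ++i)
    if (Trigger[i] > 0)   // per-element condition becomes the mask
      A[i] = B[i] + 42;   // conditional load of B[i], conditional store to A[i]
}
```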
http://reviews.llvm.org/D6191
llvm-svn: 223348
|
#
1a1bdb22 |
| 02-Dec-2014 |
Philip Reames <listmail@philipreames.com> |
[Statepoints 3/4] Statepoint infrastructure for garbage collection: SelectionDAGBuilder
This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them.
With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now.
I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into its own files, and anyone not working on the statepoint support itself should be able to ignore it.
During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straightforward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases.
In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principle, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stopgap measure.
Reviewed by: atrick, ributzka
llvm-svn: 223137
|
#
5bef5b52 |
| 01-Dec-2014 |
Hans Wennborg <hans@hanshq.net> |
Revert r223049, r223050 and r223051 while investigating test failures.
I didn't foresee this affecting the Clang test suite :/
llvm-svn: 223054
|
#
1571336f |
| 01-Dec-2014 |
Hans Wennborg <hans@hanshq.net> |
SelectionDAG switch lowering: Replace unreachable default with most popular case.
This can significantly reduce the size of the switch, allowing for more efficient lowering.
I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-negligible binary size increase. It might be worth looking into further.
llvm-svn: 223049
|
#
b9991a26 |
| 01-Dec-2014 |
Akira Hatanaka <ahatanaka@apple.com> |
[stack protector] Set edge weights for newly created basic blocks.
This commit fixes a bug in stack protector pass where edge weights were not set when new basic blocks were added to lists of successor basic blocks.
Differential Revision: http://reviews.llvm.org/D5766
llvm-svn: 222987
|
#
6dfb041f |
| 29-Nov-2014 |
Hans Wennborg <hans@hanshq.net> |
Switch lowering: reformat some for loops etc. NFC
llvm-svn: 222962
|
#
6c42d1a5 |
| 29-Nov-2014 |
Hans Wennborg <hans@hanshq.net> |
Switch lowering: Fix broken 'Figure out which block is next' code
This doesn't seem to have worked in a long time, but other optimizations would clean it up.
llvm-svn: 222961
|
#
9bc81fbe |
| 28-Nov-2014 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
Revert "Masked Vector Load and Store Intrinsics."
This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures.
Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp
llvm-svn: 222936
|
#
9e5089a9 |
| 23-Nov-2014 |
Elena Demikhovsky <elena.demikhovsky@intel.com> |
Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
    <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
    declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)
Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.
http://reviews.llvm.org/D6191
llvm-svn: 222632
|
#
70573dcd |
| 19-Nov-2014 |
David Blaikie <dblaikie@gmail.com> |
Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard library's associative container insert function.
This led to updating SmallSet::insert to return pair<iterator, bool>, and then to updating SmallPtrSet::insert to return pair<iterator, bool>, and then to updating all the existing users of those functions...
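For out-of-tree callers the update is mechanical; a small sketch of the new convention (mirroring `std::set::insert`):
```
// After this change, insert() returns std::pair<iterator, bool>; the bool is
// true only when the element was newly inserted, so callers test .second.
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallPtrSet.h"
using namespace llvm;

template <typename T> static unsigned countUnique(ArrayRef<T *> Ptrs) {
  SmallPtrSet<T *, 8> Seen;
  unsigned Unique = 0;
  for (T *P : Ptrs)
    if (Seen.insert(P).second) // previously: if (Seen.insert(P))
      ++Unique;
  return Unique;
}
```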
llvm-svn: 222334
|
#
283bc2ed |
| 14-Nov-2014 |
Reid Kleckner <reid@kleckner.net> |
Allow the use of functions as typeinfo in landingpad clauses
This is one step towards supporting SEH filter functions in LLVM.
llvm-svn: 221954
|
#
de36e804 |
| 11-Nov-2014 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
Revert "IR: MDNode => Value"
Instead, we're going to separate metadata from the Value hierarchy. See PR21532.
This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994.
llvm-svn: 221711
|
#
3872d008 |
| 01-Nov-2014 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
IR: MDNode => Value: Instruction::getMetadata()
Change `Instruction::getMetadata()` to return `Value` as part of PR21433.
Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`.
llvm-svn: 221024
|
#
4b7bd2d5 |
| 24-Oct-2014 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Fix copy paste comment
llvm-svn: 220581
|
#
7c93690b |
| 21-Oct-2014 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Add minnum / maxnum codegen
llvm-svn: 220342
|
#
5a3f5f75 |
| 21-Oct-2014 |
Philip Reames <listmail@philipreames.com> |
Introduce enum values for previously defined metadata types. (NFC)
Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well. Using a preassigned ID is helpful since we get compile-time type checking, and avoid some (minimal) string construction and comparison. This change adds enum values for three existing metadata types:
    + MD_nontemporal = 9,                  // "nontemporal"
    + MD_mem_parallel_loop_access = 10,    // "llvm.mem.parallel_loop_access"
    + MD_nonnull = 11                      // "nonnull"
I went through and updated various uses as well. I made no attempt to get all uses; I focused on the ones which were easily greppable and easy to translate. For example, there were several items in LoopInfo.cpp I chose not to update.
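A small hedged example of the difference at a call site (using the `MD_nonnull` value quoted above; the string form is the lazily assigned path):
```
// A preassigned kind ID gives compile-time checking and skips the runtime
// string construction/comparison mentioned above.
#include "llvm/IR/Instruction.h"
#include "llvm/IR/LLVMContext.h"
using namespace llvm;

static bool isNonNullAnnotated(const Instruction &I) {
  // Before: I.getMetadata("nonnull") -- string-keyed, ID assigned lazily.
  // After:  enum-keyed lookup via the value added in this change.
  return I.getMetadata(LLVMContext::MD_nonnull) != nullptr;
}
```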
llvm-svn: 220248
|
#
230332f4 |
| 17-Oct-2014 |
Pete Cooper <peter_cooper@apple.com> |
Check for dynamic allocas when selecting lifetime intrinsics.
TL;DR: Indexing maps with [] creates missing entries.
The long version:
When selecting lifetime intrinsics, we index the *static* alloca map with the AllocaInst we find for that lifetime. Trouble is, we don't first check to see if this is a dynamic alloca.
On the attached example, this causes a dynamic alloca to create an entry in the static map, and returns 0 (the default) as the frame index for that lifetime. 0 was used for the frame index of the stack protector, which, given that it now has a lifetime, is coloured and merged with other stack slots.
PEI would later trigger an assert because it expects the stack protector to not be dead.
This fix ensures that we only get frame indices for static allocas, i.e., those in the map. Dynamic ones are effectively dropped, which is suboptimal, but at least isn't completely broken.
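The "indexing maps with [] creates missing entries" pitfall generalizes to any `DenseMap`-style container; a hypothetical helper mirroring the fix (the actual container in FunctionLoweringInfo may differ):
```
// Only static allocas have an entry, so use find() rather than operator[],
// which would silently create a zero-initialized entry (frame index 0) for
// a dynamic alloca -- exactly the bug described above.
#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

static int getStaticFrameIndex(
    const DenseMap<const AllocaInst *, int> &StaticAllocaMap,
    const AllocaInst *AI) {
  auto It = StaticAllocaMap.find(AI);
  return It == StaticAllocaMap.end() ? -1 /* dynamic: skip */ : It->second;
}
```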
rdar://problem/18672951
llvm-svn: 220099
|
#
ad2363f9 |
| 17-Oct-2014 |
Juergen Ributzka <juergen@apple.com> |
[Stackmaps] Enable invoking the patchpoint intrinsic.
Patch by Kevin Modzelewski
Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits, reames
Differential Revision: http://reviews.llvm.org/D5634
llvm-svn: 220055
|
#
fd4633e1 |
| 16-Oct-2014 |
Juergen Ributzka <juergen@apple.com> |
Reduce code duplication between patchpoint and non-patchpoint lowering. NFC.
This is in preparation for another patch that makes patchpoints invokable.
Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5657
llvm-svn: 219967
|
#
e2de06be |
| 16-Oct-2014 |
Robin Morisset <morisset@google.com> |
Erase fence insertion from SelectionDAGBuilder.cpp (NFC)
Summary:
Backends can use setInsertFencesForAtomic to signal to the middle-end that monotonic is the only memory ordering they can accept for stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger ordering to fences + monotonic accesses is currently living in SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it for several reasons:
- There is lots of redundancy to avoid: extremely similar logic already exists in AtomicExpand.
- The current code in SelectionDAGBuilder does not use any target hooks; it does the same transformation for every backend that requires it.
- As a result it is plain *unsound*, as it was apparently designed for ARM. It happens to mostly work for the other targets because they are extremely conservative, but Power for example had to switch to AtomicExpand to be able to use lwsync safely (see r218331).
- Because it produces IR-level fences, it cannot be made sound! This is noted in the C++11 standard (section 29.3, page 1140):
```
Fences cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering semantics.
```
It can also be seen by the following example (called IRIW in the literature):
```
atomic<int> x = y = 0;
int r1, r2, r3, r4;
Thread 0: x.store(1);
Thread 1: y.store(1);
Thread 2: r1 = x.load(); r2 = y.load();
Thread 3: r3 = y.load(); r4 = x.load();
```
r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst. But if they are lowered to monotonic accesses, no amount of fences can prevent it.
This patch does three things (I could cut it into parts, but then some of them would not be tested/testable, please tell me if you would prefer that):
- It provides a default implementation for emitLeadingFence/emitTrailingFence in terms of IR-level fences that mimics the original logic of SelectionDAGBuilder. As we saw above, this is unsound, but the best that can be done without knowing the targets well (and there is a comment warning about this risk).
- It then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default implementation (which exactly replicates the logic of SelectionDAGBuilder, so no functional change).
- It finally erases this logic from SelectionDAGBuilder, as it is dead code.
Ideally, each target would define its own override for emitLeading/TrailingFence using target-specific fences, but I do not know the Sparc/Mips/XCore memory models well enough to do this, and they appear to be dealing fine with the ARM-inspired default expansion for now (probably because they are overly conservative, as Power was). If anyone wants to compile fences more aggressively on these platforms, the long comment should make it clear why they should first override emitLeading/TrailingFence.
Test Plan: make check-all, no functional change
Reviewers: jfb, t.p.northover
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D5474
llvm-svn: 219957
|
#
df82a33d |
| 13-Oct-2014 |
Chad Rosier <mcrosier@codeaurora.org> |
Refactor debug statement and remove dead argument. NFC.
llvm-svn: 219626
|
#
85de8f98 |
| 09-Oct-2014 |
Eric Christopher <echristo@gmail.com> |
Use the subtarget on the dag to get TargetFrameLowering rather than off the target machine.
llvm-svn: 219378
|