80c33de2 | 09-Dec-2020 | Joe Ellis <joe.ellis@arm.com>
[SelectionDAG] Add llvm.vector.{extract,insert} intrinsics
This commit adds two new intrinsics.
- llvm.experimental.vector.insert: used to insert a vector into another vector starting at a given index.
- llvm.experimental.vector.extract: used to extract a subvector from a larger vector starting from a given index.
The codegen work for these intrinsics has already been completed; this commit is simply exposing the existing ISD nodes to LLVM IR.
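A rough sketch of how the two intrinsics read in IR (the vector types and the index are illustrative, not taken from the patch):

```
; Insert a fixed-width <4 x i32> subvector into a scalable vector at element 0.
%v = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec, i64 0)

; Extract a fixed-width <4 x i32> subvector starting at element 0.
%s = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 0)
```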
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D91362

3ffbc793 | 09-Dec-2020 | Simon Moll <simon.moll@emea.nec.com>
[VP] Build VP SDNodes
Translate VP intrinsics to VP_* SDNodes. The tests check whether a matching vp_* SDNode is emitted.
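For reference, a minimal example of a VP intrinsic call of the kind being translated (types chosen for illustration):

```
; Predicated add over the first %evl lanes under %mask; this should now
; build a VP_ADD SDNode.
%r = call <vscale x 2 x i64> @llvm.vp.add.nxv2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b, <vscale x 2 x i1> %mask, i32 %evl)
```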
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D91441

c5978f42 | 21-Oct-2020 | Tim Northover <t.p.northover@gmail.com>
UBSAN: emit distinctive traps
Sometimes people get minimal crash reports after a UBSAN incident. This change tags each trap with an integer representing the kind of failure encountered, which can aid in tracking down the root cause of the problem.
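A sketch of what this looks like at the IR level (the check code 2 is an arbitrary value for illustration):

```
; Trap tagged with an immediate identifying which UBSAN check failed,
; so the kind of failure survives into a minimal crash report.
call void @llvm.ubsantrap(i8 2)
unreachable
```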

f6dd32fd | 07-Dec-2020 | Kerry McLaughlin <kerry.mclaughlin@arm.com>
[SVE][CodeGen] Lower scalable masked gathers
Lowers the llvm.masked.gather intrinsics (scalar plus vector addressing mode only)
Changes in this patch:
- Add custom lowering for MGATHER, using getGatherVecOpcode() to choose the appropriate gather load opcode to use.
- Improve codegen with refineIndexType/refineUniformBase, added in D90942
- Tests added for gather loads with 32 & 64-bit scaled & unscaled offsets.
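A minimal sketch of the scalar-plus-vector addressing form this lowers (types and names are illustrative):

```
; A scalar base plus a vector of indices yields a vector of addresses,
; which llvm.masked.gather loads under a predicate.
%ptrs = getelementptr i32, i32* %base, <vscale x 4 x i32> %indices
%vals = call <vscale x 4 x i32> @llvm.masked.gather.nxv4i32.nxv4p0i32(<vscale x 4 x i32*> %ptrs, i32 4, <vscale x 4 x i1> %mask, <vscale x 4 x i32> %passthru)
```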
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D91092

a553ac97 | 05-Dec-2020 | Kazu Hirata <kazu@google.com>
[CodeGen] llvm::erase_if (NFC)

f6150aa4 | 30-Nov-2020 | Francesco Petrogalli <francesco.petrogalli@arm.com>
[SelectionDAGBuilder] Update signature of `getRegsAndSizes()`.
The mapping between registers and relative size has been updated to use TypeSize to account for the size of scalable EVTs.
The patch is an NFCI except for the fact that, with this change, the function `getUnderlyingArgRegs` no longer raises a warning for implicit conversion of `TypeSize` to `unsigned` when generating machine code from the test added to the patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D92096

4df8efce | 17-Nov-2020 | Nikita Popov <nikita.ppv@gmail.com>
[AA] Split up LocationSize::unknown()
Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses.
The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649

2d604293 | 25-Nov-2020 | Craig Topper <craig.topper@sifive.com>
[SelectionDAGBuilder] Add SPF_NABS support to visitSelect
We currently don't match this which limits the effectiveness of D91120 until InstCombine starts canonicalizing to llvm.abs. This should be easy to remove if/when we remove the SPF_ABS handling.
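The negated-abs pattern now matched looks roughly like this (a sketch, not taken from the patch's tests):

```
; nabs(x): select x when negative, -x otherwise.
%neg = sub i32 0, %x
%cmp = icmp slt i32 %x, 0
%res = select i1 %cmp, i32 %x, i32 %neg
```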
Differential Revision: https://reviews.llvm.org/D92118

d0e42037 | 10-Sep-2020 | Hongtao Yu <hoy@fb.com>
[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic
This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.
An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR,
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
  br i1 %cmp, label %bb1, label %bb2
bb1:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
  br label %bb3
bb2:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
  br label %bb3
bb3:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
  ret void
}
```
the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`.
```
bb.0.bb0:
  frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp
  TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags
  PSEUDO_PROBE 837061429793323041, 1, 0
  $edi = MOV32ri 1, debug-location !13; test.c:0
  JCC_1 %bb.1, 4, implicit $eflags

bb.2.bb2:
  PSEUDO_PROBE 837061429793323041, 3, 0
  PSEUDO_PROBE 837061429793323041, 4, 0
  $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp
  RETQ

bb.1.bb1:
  PSEUDO_PROBE 837061429793323041, 2, 0
  PSEUDO_PROBE 837061429793323041, 4, 0
  $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp
  RETQ
```
The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D86495

a7eae62a | 20-Nov-2020 | Craig Topper <craig.topper@sifive.com>
[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version.
The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs, even when that result is not result 0 of the node.
Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results.
Differential Revision: https://reviews.llvm.org/D91846

393b9e9d | 19-Nov-2020 | Nikita Popov <nikita.ppv@gmail.com>
[MemLoc] Require LocationSize argument (NFC)
When constructing a MemoryLocation by hand, require that a LocationSize is explicitly specified. D91649 will split up LocationSize::unknown() into two different states, and callers should make an explicit choice regarding the kind of MemoryLocation they want to have.

a97f6283 | 01-Apr-2020 | Leonard Chan <leonardchan@google.com>
[llvm][IR] Add dso_local_equivalent Constant
The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target.
When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit.
In this patch:
- Addition of `dso_local_equivalent` and handling it
- Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references
This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done.
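A sketch of that relative-vtable-style use (the names and single-entry layout are hypothetical):

```
declare void @extern_vfunc()

; The offset from the table to the function can be computed with a static
; relocation because dso_local_equivalent resolves within the same binary.
@rel_vtable = constant i32 trunc (i64 sub (i64 ptrtoint (void ()* dso_local_equivalent @extern_vfunc to i64), i64 ptrtoint (i32* @rel_vtable to i64)) to i32)
```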
See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details.
Differential Revision: https://reviews.llvm.org/D77248

1983acce | 19-Nov-2020 | Florian Hahn <flo@fhahn.com>
[SelDAGBuilder] Do not require simple VTs for constraints.
In some cases, the values passed to `asm sideeffect` calls cannot be mapped directly to simple MVTs. Currently, we crash in the backend if that happens. An example can be found in the @test_vector_too_large_r_m test case, where we pass <9 x float> vectors. In practice, this can happen in cases like the simple C example below.
```
using vec = float __attribute__((ext_vector_type(9)));
void f1(vec m) {
  asm volatile("" : "+r,m"(m) : : "memory");
}
```
One case that uses "+r,m" constraints for arbitrary data types in practice is google-benchmark's DoNotOptimize.
This patch updates visitInlineAsm so that it uses MVT::Other for constraints with complex VTs. It looks like the rest of the backend correctly deals with that and properly legalizes the type.
And we still report an error if there are no registers to satisfy the constraint.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D91710

170947a5 | 11-Nov-2020 | Kerry McLaughlin <kerry.mclaughlin@arm.com>
[SVE][CodeGen] Lower scalable masked scatters
Lowers the llvm.masked.scatter intrinsics (scalar plus vector addressing mode only)
Changes included in this patch:
- Custom lowering for MSCATTER, which chooses the appropriate scatter store opcode to use. Floating-point scatters are cast to integer, with patterns added to match FP reinterpret_casts.
- Added the getCanonicalIndexType function to convert redundant addressing modes (e.g. scaling is redundant when accessing bytes)
- Tests with 32 & 64-bit scaled & unscaled offsets
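A minimal sketch of the scalar-plus-vector scatter being lowered (types and names are illustrative):

```
; Store each active lane of %vals to base + index.
%ptrs = getelementptr i32, i32* %base, <vscale x 4 x i32> %indices
call void @llvm.masked.scatter.nxv4i32.nxv4p0i32(<vscale x 4 x i32> %vals, <vscale x 4 x i32*> %ptrs, i32 4, <vscale x 4 x i1> %mask)
```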
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D90941

ffbbfc76 | 11-Nov-2020 | Kerry McLaughlin <kerry.mclaughlin@arm.com>
[SVE][CodeGen] Add the isTruncatingStore flag to MSCATTER
This patch adds the IsTruncatingStore flag to MaskedScatterSDNode, set by getMaskedScatter(). Updated SelectionDAGDumper::print_details for MaskedScatterSDNode to print the details of masked scatters (is truncating, signed or scaled).
This is the first in a series of patches which add support for scalable masked scatters.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D90939

4634ad6c | 21-Oct-2020 | Gaurav Jain <gjn@google.com>
[NFC] Set return type of getStackPointerRegisterToSaveRestore to Register
Differential Revision: https://reviews.llvm.org/D89858

35a531fb | 29-Sep-2020 | David Sherwood <david.sherwood@arm.com>
[SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents
In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. I've changed some of these places to use the equivalent scalar operator.
Differential Revision: https://reviews.llvm.org/D88482

f693f915 | 29-Sep-2020 | David Sherwood <david.sherwood@arm.com>
[SVE][CodeGen] Replace uses of TypeSize comparison operators
In certain places in the code we can never end up in a situation where we're mixing fixed width and scalable vector types. For example, we can't have truncations and extends that change the lane count. Also, in other places such as GenWidenVectorStores and GenWidenVectorLoads we know from the behaviour of FindMemType that we can never choose a vector type with a different scalable property.
In various places I have used EVT::bitsXY functions instead of TypeSize::isKnownXY, where it probably makes sense to keep an assert that scalable properties match.
Differential Revision: https://reviews.llvm.org/D88654

e72cfd93 | 04-Oct-2020 | Amara Emerson <amara@apple.com>
Rename the VECREDUCE_STRICT_{FADD,FMUL} SDNodes to VECREDUCE_SEQ_{FADD,FMUL}.
The STRICT was causing unnecessary confusion. I think SEQ is a more accurate name for what they actually do, and the other obvious option of "ORDERED" has the issue of already having a meaning in FP contexts.
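For context, a sketch of the operation these nodes model (assuming the post-rename llvm.vector.reduce.fadd name):

```
; Without reassociation flags this reduction must accumulate strictly in
; lane order, starting from %acc; that is what VECREDUCE_SEQ_FADD models.
%sum = call float @llvm.vector.reduce.fadd.v4f32(float %acc, <4 x float> %v)
```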
Differential Revision: https://reviews.llvm.org/D88791

322d0afd | 03-Oct-2020 | Amara Emerson <amara@apple.com>
[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.
The autoupgrader will handle legacy intrinsics.
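A sketch of the rename as seen in IR:

```
; Before: %r = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> %v)
; After promotion to a first-class intrinsic:
%r = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %v)
```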
Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html
Differential Revision: https://reviews.llvm.org/D88787

1127662c | 05-Oct-2020 | Craig Topper <craig.topper@intel.com>
[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS.
getNode handling for ISD::SETCC calls FoldSETCC, which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter we can ensure any calls to getNode/getSetcc during canonicalization will also get the flags.
Differential Revision: https://reviews.llvm.org/D88063

179e15d5 | 25-Sep-2020 | Dávid Bolvanský <david.bolvansky@gmail.com>
[SystemZ] Optimize bcmp calls (PR47420)
Solves https://bugs.llvm.org/show_bug.cgi?id=47420
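The pattern in question is, roughly, a bcmp-based equality check (a sketch; the SystemZ-specific codegen is in the patch itself):

```
; bcmp only distinguishes equal from non-equal, so only the comparison
; of its result against zero is meaningful.
%res = call i32 @bcmp(i8* %a, i8* %b, i64 %n)
%eq = icmp eq i32 %res, 0
```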
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D87988

0c0c57f7 | 24-Sep-2020 | Bill Wendling <isanbard@gmail.com>
Revert "[CodeGen] Postprocess PHI nodes for callbr"
Accidental commit.
This reverts commit 7f4c940bd0b526f25e11c51bb4d58a85024330ae.

7f4c940b | 20-Aug-2020 | Bill Wendling <isanbard@gmail.com>
[CodeGen] Postprocess PHI nodes for callbr
When processing PHI nodes after a callbr, we need to make sure that the PHI nodes on the default branch are resolved after the callbr (inserted after INLINEASM_BR). The PHI node values on the indirect branches are processed before the INLINEASM_BR.
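For reference, a minimal callbr sketch in the blockaddress-based constraint syntax of the time (@foo and the labels are hypothetical):

```
; PHIs in %fallthrough must be resolved after the INLINEASM_BR, while
; PHIs in %indirect are processed before it.
callbr void asm sideeffect "", "r,X"(i32 %x, i8* blockaddress(@foo, %indirect))
    to label %fallthrough [label %indirect]
```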
Differential Revision: https://reviews.llvm.org/D86260

53d238a9 | 17-Sep-2020 | Lucas Prates <lucas.prates@arm.com>
[CodeGen] Fixing inconsistent ABI mangling of values in SelectionDAGBuilder
SelectionDAGBuilder was inconsistently mangling values based on ABI Calling Conventions when getting them through copyFromRegs in SelectionDAGBuilder, causing duplicate value type conversions for function arguments. The checking for the mangling requirement was based on the value's originating instruction and was performed outside of, and in spite of, the regular Calling Convention Lowering.
The issue could be observed in a scenario such as:
```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper conversion had taken place when lowering the return
; value of the first call.
```
This patch fixes the issue by disregarding the Calling Convention information for such copyFromRegs, making sure the ABI mangling is properly contained in the Calling Convention Lowering.
This fixes Bugzilla #47454.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87844