Revision tags: llvmorg-11.0.1-rc1 |
|
#
abbf4802 |
| 24-Nov-2020 |
Hongtao Yu <hoy@fb.com> |
[SelectionDAG] Add PseudoProbeSDNode to LargestSDNode to fix 32-bit build break.
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
d0e42037 |
| 10-Sep-2020 |
Hongtao Yu <hoy@fb.com> |
[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic
This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.
An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR,
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
  br i1 %cmp, label %bb1, label %bb2

bb1:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
  br label %bb3

bb2:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
  br label %bb3

bb3:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
  ret void
}
```
the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2`, where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`.
```
bb.0.bb0:
  frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp
  TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags
  PSEUDO_PROBE 837061429793323041, 1, 0
  $edi = MOV32ri 1, debug-location !13; test.c:0
  JCC_1 %bb.1, 4, implicit $eflags

bb.2.bb2:
  PSEUDO_PROBE 837061429793323041, 3, 0
  PSEUDO_PROBE 837061429793323041, 4, 0
  $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp
  RETQ

bb.1.bb1:
  PSEUDO_PROBE 837061429793323041, 2, 0
  PSEUDO_PROBE 837061429793323041, 4, 0
  $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp
  RETQ
```
The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D86495
|
#
170947a5 |
| 11-Nov-2020 |
Kerry McLaughlin <kerry.mclaughlin@arm.com> |
[SVE][CodeGen] Lower scalable masked scatters
Lowers the llvm.masked.scatter intrinsics (scalar plus vector addressing mode only)
Changes included in this patch:
- Custom lowering for MSCATTER, which chooses the appropriate scatter store opcode to use. Floating-point scatters are cast to integer, with patterns added to match FP reinterpret_casts.
- Added the getCanonicalIndexType function to convert redundant addressing modes (e.g. scaling is redundant when accessing bytes; see the sketch after this list).
- Tests with 32 & 64-bit scaled & unscaled offsets.
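A minimal sketch of the byte-sized case from the second bullet, written as a free-standing helper; the in-tree signature of getCanonicalIndexType may differ:
```
#include "llvm/CodeGen/ISDOpcodes.h"
#include "llvm/CodeGen/ValueTypes.h"
using namespace llvm;

// When the memory element type is a single byte, a scaled index addresses
// the same locations as an unscaled one, so canonicalize the *_SCALED
// addressing modes away.
static ISD::MemIndexType canonicalIndexType(ISD::MemIndexType IndexType,
                                            EVT MemVT) {
  if (MemVT.getScalarType() != MVT::i8)
    return IndexType; // scaling is meaningful for wider elements
  if (IndexType == ISD::SIGNED_SCALED)
    return ISD::SIGNED_UNSCALED;
  if (IndexType == ISD::UNSIGNED_SCALED)
    return ISD::UNSIGNED_UNSCALED;
  return IndexType;
}
```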
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D90941
show more ...
|
#
ffbbfc76 |
| 11-Nov-2020 |
Kerry McLaughlin <kerry.mclaughlin@arm.com> |
[SVE][CodeGen] Add the isTruncatingStore flag to MSCATTER
This patch adds the IsTruncatingStore flag to MaskedScatterSDNode, set by getMaskedScatter(). Updated SelectionDAGDumper::print_details for MaskedScatterSDNode to print the details of masked scatters (is truncating, signed or scaled).
This is the first in a series of patches which adds support for scalable masked scatters.
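A hedged usage sketch of the new flag; the accessor name follows the commit text, while the surrounding predicate is illustrative only:
```
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// True when the scatter narrows the stored value to the memory element
// type, mirroring truncating stores on MaskedScatterSDNode.
static bool isTruncatingScatter(const SDNode *N) {
  const auto *MSC = cast<MaskedScatterSDNode>(N);
  return MSC->isTruncatingStore();
}
```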
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D90939
|
#
ce356e15 |
| 24-Oct-2020 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[DAG] Add BuildVectorSDNode::getRepeatedSequence helper to recognise multi-element splat patterns
Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method.
I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns.
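A hedged usage sketch; the parameter list is an assumption modeled on the other BuildVectorSDNode splat helpers:
```
#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// Detect a build_vector that repeats a short sequence, e.g.
// <a,b,a,b,a,b,a,b> yields Sequence = {a, b}.
static bool isMultiElementSplat(SDNode *N) {
  auto *BV = dyn_cast<BuildVectorSDNode>(N);
  if (!BV)
    return false;
  SmallVector<SDValue, 16> Sequence;
  BitVector Undefs;
  return BV->getRepeatedSequence(Sequence, &Undefs) && Sequence.size() > 1;
}
```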
Differential Revision: https://reviews.llvm.org/D87930
|
#
b8ce6a67 |
| 01-Oct-2020 |
David Sherwood <david.sherwood@arm.com> |
[SVE][CodeGen] Add new EVT/MVT getFixedSizeInBits() functions
When we know that a particular type is always going to be fixed width we have so far been writing code like this:
```
getSizeInBits().getFixedSize()
```
Since we are doing this in quite a few places now it seems to make sense to add a new helper function that allows us to replace these calls with a single getFixedSizeInBits() call.
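A before/after sketch taken directly from the commit text:
```
#include "llvm/CodeGen/ValueTypes.h"
using namespace llvm;

static uint64_t widthOfFixedType(EVT VT) {
  // Before: two chained calls at every site.
  //   return VT.getSizeInBits().getFixedSize();
  // After: a single helper for types known to be fixed width.
  return VT.getFixedSizeInBits();
}
```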
Differential Revision: https://reviews.llvm.org/D88649
|
#
e077367a |
| 18-Sep-2020 |
David Sherwood <david.sherwood@arm.com> |
[SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits
An existing function Type::getScalarSizeInBits returns a uint64_t instead of a TypeSize class because the caller is requesting a scalar size, which cannot be scalable. This patch makes other similar functions requesting a scalar size consistent with that, thereby eliminating more than 1000 implicit TypeSize -> uint64_t casts.
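A hedged sketch of the resulting shape; the body is an assumption consistent with the commit's rationale:
```
#include "llvm/CodeGen/ValueTypes.h"
using namespace llvm;

// A scalar size can never be scalable, so a plain integer is returned
// instead of a TypeSize, matching Type::getScalarSizeInBits.
static uint64_t scalarSizeInBits(EVT VT) {
  return VT.getScalarType().getSizeInBits().getFixedSize();
}
```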
Differential revision: https://reviews.llvm.org/D87889
|
#
b1e68f88 |
| 08-Sep-2020 |
Craig Topper <craig.topper@intel.com> |
[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact.
This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130.
This required adding a SDNodeFlags to SelectionDAG::getSetCC.
Now we manage to constant fold some undef cases during the initial getNode call that we don't catch in later DAG combines.
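A hedged sketch of the new pattern in SelectionDAGBuilder; `copyFMF` is assumed to mirror the IR instruction's fast-math flags onto SDNodeFlags:
```
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/IR/Operator.h"
using namespace llvm;

// Supply the flags when the node is created, instead of mutating the node
// after the fact (when CSE may already have returned a shared node).
static SDValue buildFAdd(SelectionDAG &DAG, const SDLoc &DL, EVT VT,
                         SDValue LHS, SDValue RHS,
                         const FPMathOperator &FPOp) {
  SDNodeFlags Flags;
  Flags.copyFMF(FPOp); // assumption: copies nnan/ninf/reassoc/... from IR
  return DAG.getNode(ISD::FADD, DL, VT, LHS, RHS, Flags);
}
```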
Differential Revision: https://reviews.llvm.org/D87200
|
#
714ceefa |
| 31-Aug-2020 |
Jonas Paulsson <paulsson@linux.vnet.ibm.com> |
[SelectionDAG] Always intersect SDNode flags during getNode() node memoization.
Previously, SDNodeFlags::intersectWith(Flags) would do nothing if Flags was in an undefined state, which is very bad given that this is the default when getNode() is called without passing an explicit SDNodeFlags argument.
This meant that if an already existing and reused node had a flag which the second caller to getNode() did not set, that flag would remain uncleared.
This was exposed by https://bugs.llvm.org/show_bug.cgi?id=47092, where an NSW flag was incorrectly set on an add instruction (which did in fact overflow in one of the two original contexts), so when SystemZElimCompare removed the compare with 0 trusting that flag, wrong-code resulted.
There is more that needs to be done in this area, as discussed in the review.
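A minimal sketch of the fixed CSE path; the surrounding getNode() plumbing is elided:
```
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// When getNode() finds an equivalent pre-existing node, intersect flags
// unconditionally: a caller that did not assert a flag (e.g. nsw) must
// clear it on the shared node.
static SDValue reuseNode(SDNode *Existing, SDNodeFlags NewFlags) {
  Existing->intersectFlagsWith(NewFlags); // no longer skipped for default flags
  return SDValue(Existing, 0);
}
```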
Differential Revision: https://reviews.llvm.org/D86871
Review: Ulrich Weigand, Sanjay Patel
|
#
514d6e9a |
| 24-Aug-2020 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
[SDAG] Improve MemSDNode::getBasePtr
It returned getOperand(1), except for STORE for which it returned getOperand(2). Handle MSTORE, MGATHER, and MSCATTER as well.
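A hedged sketch of the improved accessor; the operand indices for the masked nodes are assumptions based on their operand layouts:
```
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

static const SDValue &basePtrOf(const MemSDNode *N) {
  switch (N->getOpcode()) {
  case ISD::STORE:
  case ISD::MSTORE:
    return N->getOperand(2); // (chain, value, ptr, ...)
  case ISD::MGATHER:
  case ISD::MSCATTER:
    return N->getOperand(3); // (chain, passthru/value, mask, ptr, index, ...)
  default:
    return N->getOperand(1); // loads and most other memory nodes
  }
}
```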
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3 |
|
#
87e2751c |
| 03-Jul-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Differential Revision: https://reviews.llvm.org/D83082
|
Revision tags: llvmorg-10.0.1-rc2 |
|
#
b1360caa |
| 25-May-2020 |
Michael Liao <michael.hliao@gmail.com> |
[SDAG] Add new AssertAlign ISD node.
Summary:
- The AssertAlign node records the guaranteed alignment of its source node, where these alignments are retrieved from alignment attributes in LLVM IR. These tracked alignments help DAG combining and lowering generate efficient code.
- This patch adds basic support for the AssertAlign node. So far, we only generate AssertAlign nodes on return values from intrinsic calls (see the sketch after this list).
- Addressing selection in AMDGPU is revised accordingly to capture the new (base + offset) patterns.
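A hedged sketch of attaching the node via the getAssertAlign helper this patch introduces; the alignment value is illustrative:
```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Record that an intrinsic's returned pointer is 16-byte aligned so that
// later combines and addressing-mode selection may rely on it.
static SDValue assertAligned(SelectionDAG &DAG, const SDLoc &DL, SDValue Ptr) {
  return DAG.getAssertAlign(DL, Ptr, Align(16));
}
```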
Reviewers: arsenm, bogner
Subscribers: jvesely, wdng, nhaehnle, tpr, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81711
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
bebdc62c |
| 08-May-2020 |
Craig Topper <craig.topper@gmail.com> |
[SelectionDAG] Remove ConstantPoolSDNode::getAlignment.
Use getAlign instead.
Differential Revision: https://reviews.llvm.org/D79459
|
#
d1119980 |
| 08-May-2020 |
Craig Topper <craig.topper@gmail.com> |
[SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode.
This patch stores the alignment for ConstantPoolSDNode as an Align and updates the getConstantPool interface to take a MaybeAlign.
Removing getAlignment() will be done as a follow-up.
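A hedged sketch of an updated call site; the exact default-argument shape of getConstantPool is an assumption:
```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Request a constant-pool entry with an explicit alignment; passing
// MaybeAlign() instead would let the DAG derive one from the data layout.
static Align constantPoolAlign(SelectionDAG &DAG, const Constant *C,
                               EVT PtrVT) {
  SDValue CP = DAG.getConstantPool(C, PtrVT, MaybeAlign(4));
  return cast<ConstantPoolSDNode>(CP)->getAlign();
}
```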
Differential Revision: https://reviews.llvm.org/D79436
|
#
586769cc |
| 08-Apr-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Use Register
|
#
b91535f6 |
| 27-Mar-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Return Align for SelectionDAGNodes::getOriginalAlignment/getAlignment
Summary: Also deprecate getOriginalAlignment; getAlignment will take much more time, as it is pervasive throughout the codebase (including TableGen'd files).
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76933
|
#
74eac903 |
| 27-Mar-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign
Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76925
|
#
9f7d4150 |
| 26-Mar-2020 |
Craig Topper <craig.topper@gmail.com> |
[X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelectionDAG.
These transforms rely on a vector reduction flag on the SDNode set by SelectionDAGBuilder. This flag exists because SelectionDAG can't see across basic blocks so SelectionDAGBuilder is looking across and saving the info. X86 is the only target that uses this flag currently. By removing the X86 code we can remove the flag and the SelectionDAGBuilder code.
This patch adds a dedicated IR pass for X86 that looks across the blocks and transforms the IR into a form that the X86 SelectionDAG can finish.
An advantage of this new approach is that we can enhance it to shrink the phi nodes and final reduction tree based on the zeroes that we need to concatenate to bring the partially reduced reduction back up to the original width.
Differential Revision: https://reviews.llvm.org/D76649
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
#
fd7c2e24 |
| 26-Feb-2020 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
[SDAG] Add SDNode::values() = make_range(values_begin(), values_end())
Also use it in a few places to simplify code a little bit. NFC
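A usage sketch grounded in the commit text: values() ranges over a node's result types, replacing manual index loops:
```
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// Previously: for (unsigned i = 0; i < N->getNumValues(); ++i) ...
static bool hasFPResult(const SDNode *N) {
  for (EVT VT : N->values())
    if (VT.isFloatingPoint())
      return true;
  return false;
}
```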
|
#
0ad6fc99 |
| 21-Feb-2020 |
Sanjay Patel <spatel@rotateright.com> |
[SelectionDAG] remove unused isFast() helper function; NFC
We want flag users to check individual fast-math flags, not that all of them are set. This was also probably not working as intended because NoFPExcept isn't always set on non-strict nodes.
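A hedged sketch of the preferred pattern; which flags a given transform checks is illustrative:
```
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// Query only the flags this transform actually depends on, rather than a
// catch-all "every fast-math flag is set" predicate like isFast().
static bool canReassociate(const SDNode *N) {
  SDNodeFlags Flags = N->getFlags();
  return Flags.hasAllowReassociation() && Flags.hasNoSignedZeros();
}
```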
|
#
9bbf271f |
| 20-Feb-2020 |
Craig Topper <craig.topper@intel.com> |
[AArch64] Move isOverflowIntrOpRes helper function to the ISD namespace in SelectionDAG.h. NFC
Enables sharing with an upcoming X86 change.
|
Revision tags: llvmorg-10.0.0-rc2 |
|
#
0daf9b8e |
| 12-Feb-2020 |
Craig Topper <craig.topper@gmail.com> |
[X86][LegalizeTypes] Add SoftPromoteHalf support for STRICT_FP_EXTEND and STRICT_FP_ROUND
This adds a strict version of FP16_TO_FP and FP_TO_FP16 and uses them to implement soft promotion for the half type. This is enough to provide basic support for __fp16 with strictfp.
Add the necessary X86 support to use VCVTPS2PH/VCVTPH2PS when F16C is enabled.
|
Revision tags: llvmorg-10.0.0-rc1 |
|
#
17b8f96d |
| 17-Jan-2020 |
Wang, Pengfei <pengfei.wang@intel.com> |
[FPEnv] Divide macro INSTRUCTION into INSTRUCTION and DAG_INSTRUCTION, and macro FUNCTION likewise. NFCI.
Some functions like fmuladd don't really have a node; we should separate the declarations of those that have nodes from those that don't, to avoid introducing fake nodes.
Differential Revision: https://reviews.llvm.org/D72871
|
Revision tags: llvmorg-11-init |
|
#
63336795 |
| 02-Jan-2020 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
[FPEnv] Default NoFPExcept SDNodeFlag to false
The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that transforms SDNodes without copying flags can introduce a correctness bug, not just a missed optimization.
This patch changes the default to false. This makes it necessary to move setting the (No)FPExcept flag for constrained intrinsics from the visitConstrainedIntrinsic routine to the generic visit routine at the place where the other flags are set, or else the intersectFlagsWith call would erase the NoFPExcept flag again.
In order to avoid making non-strict FP code worse, whenever SelectionDAGISel::SelectCodeCommon matches on a set of original nodes none of which can raise FP exceptions, it will preserve this property on all result nodes generated, by setting the NoFPExcept flag on those result nodes that would otherwise be considered as raising an FP exception.
To check whether or not an SD node should be considered as raising an FP exception, the following logic applies:
- For machine nodes, check the mayRaiseFPException property of the underlying MI instruction
- For regular nodes, check isStrictFPOpcode
- For target nodes, check a newly introduced isTargetStrictFPOpcode
The latter is implemented by reserving a range of target opcodes, similarly to how memory opcodes are identified. (Note that there is a bit of a quirk in identifying target nodes that are both memory nodes and strict FP nodes. To simplify the logic, right now all target memory nodes are automatically also considered strict FP nodes; this could be fixed by adding one more range.) A sketch of this classification follows.
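A hedged sketch of the three-way classification; gathering all three checks behind one helper is an assumption, and the SDNode predicate names follow the commit text:
```
#include "llvm/CodeGen/SelectionDAGNodes.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
using namespace llvm;

static bool mayRaiseFPExcept(SDNode *N, const TargetInstrInfo &TII) {
  // Machine nodes: consult the MI-level instruction property.
  if (N->isMachineOpcode())
    return TII.get(N->getMachineOpcode()).mayRaiseFPException();
  // Regular nodes use isStrictFPOpcode; target nodes are identified by a
  // reserved opcode range behind isTargetStrictFPOpcode.
  return N->isStrictFPOpcode() || N->isTargetStrictFPOpcode();
}
```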
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D71841
|
Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
b5315ae8 |
| 21-Nov-2019 |
David Green <david.green@arm.com> |
[Codegen][ARM] Add addressing modes from masked loads and stores
MVE has a basic symmetry between its normal load/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do.
To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores addressing modes in the same way as normal loads/stores.
This involves:
- Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase.
- Extending the IndexedModeActions from 8 bits to 16 bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large.
- Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads (see the sketch after this list).
- The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way.
- The ARM backend is then adjusted to make use of these indexed masked loads/stores.
- The X86 backend is adjusted so that there are, hopefully, no functional changes.
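A hedged, excerpt-style sketch of the new hooks from the third bullet, as they might be called inside a target's ISelLowering constructor; the value types are illustrative:
```
// Excerpt: inside a TargetLowering subclass constructor, where the
// protected setIndexedMasked{Load,Store}Action hooks are accessible.
for (MVT VT : {MVT::v16i8, MVT::v8i16, MVT::v4i32}) {
  setIndexedMaskedLoadAction(ISD::PRE_INC, VT, Legal);
  setIndexedMaskedLoadAction(ISD::POST_INC, VT, Legal);
  setIndexedMaskedStoreAction(ISD::PRE_INC, VT, Legal);
  setIndexedMaskedStoreAction(ISD::POST_INC, VT, Legal);
}
```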
Differential Revision: https://reviews.llvm.org/D70176
|