#
60afa49a |
| 09-Jul-2019 |
Tim Northover <tnorthover@apple.com> |
OpaquePtr: add Type parameter to Loads analysis API.
This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure oth
OpaquePtr: add Type parameter to Loads analysis API.
This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job.
Most callers had an obvious memory operation handy to provide this type, but a SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering.
llvm-svn: 365468
show more ...
|
#
83bbe2f4 |
| 03-Jul-2019 |
Francis Visoiu Mistrih <francisvm@yahoo.com> |
[CodeGen] Make branch funnels pass the machine verifier
We previously marked all the tests with branch funnels as `-verify-machineinstrs=0`.
This is an attempt to fix it.
1) `ICALL_BRANCH_FUNNEL`
[CodeGen] Make branch funnels pass the machine verifier
We previously marked all the tests with branch funnels as `-verify-machineinstrs=0`.
This is an attempt to fix it.
1) `ICALL_BRANCH_FUNNEL` has no defs. Mark it as `let OutOperandList = (outs)`
2) After that we hit an assert: ``` Assertion failed: (Op.getValueType() != MVT::Other && Op.getValueType() != MVT::Glue && "Chain and glue operands should occur at end of operand list!"), function AddOperand, file /Users/francisvm/llvm/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp, line 461. ```
The chain operand was added at the beginning of the operand list. Move that to the end.
3) After that we hit another verifier issue in the pseudo expansion where the registers used in the cmps and jmps are not added to the livein lists. Add the `EFLAGS` to all the new MBBs that we create.
PR39436
Differential Review: https://reviews.llvm.org/D54155
llvm-svn: 365058
show more ...
|
#
fa4aac73 |
| 03-Jul-2019 |
James Molloy <jmolloy@google.com> |
[SelectionDAG] Propagate alias metadata to target intrinsic nodes
When a target intrinsic has been determined to touch memory, we construct a MachineMemOperand during SDAG construction. In this case
[SelectionDAG] Propagate alias metadata to target intrinsic nodes
When a target intrinsic has been determined to touch memory, we construct a MachineMemOperand during SDAG construction. In this case, we should propagate AAMDNodes metadata to the MachineMemOperand where available.
Differential revision: https://reviews.llvm.org/D64131
llvm-svn: 365043
show more ...
|
#
ed13fef4 |
| 01-Jul-2019 |
Benjamin Kramer <benny.kra@googlemail.com> |
[SelectionDAG] Do minnum->minimum at legalization time instead of building time
The SDAGBuilder behavior stems from the days when we didn't have fast math flags available in SDAG. We do now and doin
[SelectionDAG] Do minnum->minimum at legalization time instead of building time
The SDAGBuilder behavior stems from the days when we didn't have fast math flags available in SDAG. We do now and doing the transformation in the legalizer has the advantage that it also works for vector types.
llvm-svn: 364743
show more ...
|
#
e3a676e9 |
| 24-Jun-2019 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set neede
CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg().
llvm-svn: 364191
show more ...
|
#
9cac4e6d |
| 19-Jun-2019 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Rename ExpandISelPseudo->FinalizeISel, delay register reservation
This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are call
Rename ExpandISelPseudo->FinalizeISel, delay register reservation
This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization.
Patch by Matthias Braun
llvm-svn: 363757
show more ...
|
#
cbeb563c |
| 11-Jun-2019 |
Sander de Smalen <sander.desmalen@arm.com> |
Change semantics of fadd/fmul vector reductions.
This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags
Change semantics of fadd/fmul vector reductions.
This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones:
llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul
and adds functionality to auto-upgrade existing LLVM IR and bitcode.
Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D60261
llvm-svn: 363035
show more ...
|
#
a438432a |
| 10-Jun-2019 |
Francis Visoiu Mistrih <francisvm@yahoo.com> |
[FastISel] Skip creating unnecessary vregs for arguments
This behavior was added in r130928 for both FastISel and SD, and then disabled in r131156 for FastISel.
This re-enables it for FastISel with
[FastISel] Skip creating unnecessary vregs for arguments
This behavior was added in r130928 for both FastISel and SD, and then disabled in r131156 for FastISel.
This re-enables it for FastISel with the corresponding fix.
This is triggered only when FastISel can't lower the arguments and falls back to SelectionDAG for it.
FastISel contains a map of "register fixups" where at the end of the selection phase it replaces all uses of a register with another register that FastISel sometimes pre-assigned. Code at the end of SelectionDAGISel::runOnMachineFunction is doing the replacement at the very end of the function, while other pieces that come in before that look through the MachineFunction and assume everything is done. In this case, the real issue is that the code emitting COPY instructions for the liveins (physreg to vreg) (EmitLiveInCopies) is checking if the vreg assigned to the physreg is used, and if it's not, it will skip the COPY. If a register wasn't replaced with its assigned fixup yet, the copy will be skipped and we'll end up with uses of undefined registers.
This fix moves the replacement of registers before the emission of copies for the live-ins.
The initial motivation for this fix is to enable tail calls for swiftself functions, which were blocked because we couldn't prove that the swiftself argument (which is callee-save) comes from a function argument (live-in), because there was an extra copy (vreg to vreg).
A few tests are affected by this:
* llvm/test/CodeGen/AArch64/swifterror.ll: we used to spill x21 (callee-save) but never reload it because it's attached to the return. We now don't even spill it anymore. * llvm/test/CodeGen/*/swiftself.ll: we tail-call now. * llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll: I believe this test was not really testing the right thing, but it worked because the same registers were re-used. * llvm/test/CodeGen/ARM/cmpxchg-O0.ll: regalloc changes * llvm/test/CodeGen/ARM/swifterror.ll: get rid of a copy * llvm/test/CodeGen/Mips/*: get rid of spills and copies * llvm/test/CodeGen/SystemZ/swift-return.ll: smaller stack * llvm/test/CodeGen/X86/atomic-unordered.ll: smaller stack * llvm/test/CodeGen/X86/swifterror.ll: same as AArch64 * llvm/test/DebugInfo/X86/dbg-declare-arg.ll: stack size changed
Differential Revision: https://reviews.llvm.org/D62361
llvm-svn: 362963
show more ...
|
#
829037a9 |
| 08-Jun-2019 |
Amara Emerson <aemerson@apple.com> |
Factor out SelectionDAG's switch analysis and lowering into a separate component.
In order for GlobalISel to re-use the significant amount of analysis and optimization code in SDAG's switch lowering
Factor out SelectionDAG's switch analysis and lowering into a separate component.
In order for GlobalISel to re-use the significant amount of analysis and optimization code in SDAG's switch lowering, we first have to extract it and create an interface to be used by both frameworks.
No test changes as it's NFC.
Differential Revision: https://reviews.llvm.org/D62745
llvm-svn: 362857
show more ...
|
#
6c5d5ce5 |
| 05-Jun-2019 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes i
Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping).
This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules.
To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts:
- A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic.
Any MI instruction that is *both* marked as mayRaiseFPException *and* FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling).
Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files.
The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today.
This patch includes both common code changes required to implement the new features, and the SystemZ implementation.
Reviewed By: andrew.w.kaylor
Differential Revision: https://reviews.llvm.org/D55506
llvm-svn: 362663
show more ...
|
#
607c8a9d |
| 05-Jun-2019 |
Tim Northover <tnorthover@apple.com> |
IR: make getParamByValType Just Work. NFC.
Most parts of LLVM don't care whether the byval type is derived from an explicit Attribute or from the parameter's pointee type, so it makes sense for the
IR: make getParamByValType Just Work. NFC.
Most parts of LLVM don't care whether the byval type is derived from an explicit Attribute or from the parameter's pointee type, so it makes sense for the main access function to just return the right value.
The very few users who do care (only BitcodeReader so far) can find out how it's specified by accessing the Attribute directly.
llvm-svn: 362642
show more ...
|
#
6b432dca |
| 04-Jun-2019 |
Johannes Doerfert <jdoerfert@anl.gov> |
[SelectionDAG][FIX] Allow "returned" arguments to be bit-casted
Summary: An argument that is return by a function but bit-casted before can still be annotated as "returned". Make sure we do not cras
[SelectionDAG][FIX] Allow "returned" arguments to be bit-casted
Summary: An argument that is return by a function but bit-casted before can still be annotated as "returned". Make sure we do not crash for this case.
Reviewers: sunfish, stephenwlin, niravd, arsenm
Subscribers: wdng, hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59917
llvm-svn: 362546
show more ...
|
#
b7141207 |
| 30-May-2019 |
Tim Northover <tnorthover@apple.com> |
Reapply: IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack.
Reapply: IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter.
If present, the type must match the pointee type of the argument.
The original commit did not remap byval types when linking modules, which broke LTO. This version fixes that.
Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change.
llvm-svn: 362128
show more ...
|
#
71ee3d02 |
| 29-May-2019 |
Tim Northover <tnorthover@apple.com> |
Revert "IR: add optional type to 'byval' function parameters"
The IRLinker doesn't delve into the new byval attribute when mapping types, and this breaks LTO.
llvm-svn: 362029
|
#
6e07f16f |
| 29-May-2019 |
Tim Northover <tnorthover@apple.com> |
IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds
IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter.
If present, the type must match the pointee type of the argument.
Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change.
llvm-svn: 362012
show more ...
|
#
6d7bf5e8 |
| 28-May-2019 |
Adhemerval Zanella <adhemerval.zanella@linaro.org> |
[CodeGen] Add lrint/llrint builtins
This patch add the ISD::LRINT and ISD::LLRINT along with new intrinsics. The changes are straightforward as for other floating-point rounding functions, with jus
[CodeGen] Add lrint/llrint builtins
This patch add the ISD::LRINT and ISD::LLRINT along with new intrinsics. The changes are straightforward as for other floating-point rounding functions, with just some adjustments required to handle the return value being an interger.
The idea is to optimize lrint/llrint generation for AArch64 in a subsequent patch. Current semantic is just route it to libm symbol.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D62017
llvm-svn: 361875
show more ...
|
#
ba447bae |
| 26-May-2019 |
Alexander Timofeev <Alexander.Timofeev@amd.com> |
[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assi
[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence.
Reviewers: rampitec, nhaehnle
Differential Revision: https://reviews.llvm.org/D59990
This commit was reverted because of the build failure. The reason was mlformed patch. Build failure fixed.
llvm-svn: 361741
show more ...
|
#
3b937374 |
| 25-May-2019 |
Peter Collingbourne <peter@pcc.me.uk> |
Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."
Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-
Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."
Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio
llvm-svn: 361688
show more ...
|
#
dffedea0 |
| 24-May-2019 |
Alexander Timofeev <Alexander.Timofeev@amd.com> |
[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence.
Reviewers: rampitec, nhaehnle
Differential Revision: https://reviews.llvm.org/D59990
llvm-svn: 361644
show more ...
|
#
3d7a057b |
| 24-May-2019 |
Tim Northover <tnorthover@apple.com> |
CodeGen: factor out swifterror value tracking.
llvm-svn: 361607
|
#
0bada7ce |
| 21-May-2019 |
Leonard Chan <leonardchan@google.com> |
[Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multip
[Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands.
This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D55720
llvm-svn: 361289
show more ...
|
#
51dc59d0 |
| 21-May-2019 |
Sanjay Patel <spatel@rotateright.com> |
[SelectionDAG] remove redundant code; NFCI
getNode() squashes concatenation of undefs via FoldCONCAT_VECTORS(): // Concat of UNDEFs is UNDEF. if (llvm::all_of(Ops, [](SDValue Op) { return Op.isU
[SelectionDAG] remove redundant code; NFCI
getNode() squashes concatenation of undefs via FoldCONCAT_VECTORS(): // Concat of UNDEFs is UNDEF. if (llvm::all_of(Ops, [](SDValue Op) { return Op.isUndef(); })) return DAG.getUNDEF(VT);
llvm-svn: 361284
show more ...
|
#
97d4f7c1 |
| 20-May-2019 |
Craig Topper <craig.topper@intel.com> |
[SelectionDAGBuilder] Flush PendingExports before creating INLINEASM_BR node for asm goto.
Since INLINEASM_BR is a terminator we need to flush the pending exports before emitting it. If we don't do
[SelectionDAGBuilder] Flush PendingExports before creating INLINEASM_BR node for asm goto.
Since INLINEASM_BR is a terminator we need to flush the pending exports before emitting it. If we don't do this, a TokenFactor can be inserted between it and the BR instruction emitted to finish the callbr lowering.
It looks like nodes are glued to the INLINEASM_BR so I had to make sure we emit the TokenFactor before that.
Differential Revision: https://reviews.llvm.org/D59981
llvm-svn: 361177
show more ...
|
#
af7a1884 |
| 20-May-2019 |
Craig Topper <craig.topper@intel.com> |
[Intrinsics] Merge lround.i32 and lround.i64 into a single intrinsic with overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64
We shouldn't really make assum
[Intrinsics] Merge lround.i32 and lround.i64 into a single intrinsic with overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64
We shouldn't really make assumptions about possible sizes for long and long long. And longer term we should probably support vectorizing these intrinsics. By making the result types not fixed we can support vectors as well.
Differential Revision: https://reviews.llvm.org/D62026
llvm-svn: 361169
show more ...
|
#
e386a01e |
| 20-May-2019 |
Guillaume Chatelet <gchatelet@google.com> |
[NFC] Refactor visitIntrinsicCall so it doesn't return a const char*
Summary: API simplification
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https:
[NFC] Refactor visitIntrinsicCall so it doesn't return a const char*
Summary: API simplification
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61306
llvm-svn: 361140
show more ...
|