# 7802be4a | 23-Mar-2020 | Juneyoung Lee <aqjune@gmail.com>
[SelDag] Add FREEZE
Summary:
- Add FREEZE node to SelDag
- Lower FreezeInst (in IR) to FREEZE node
- Add Legalization for FREEZE node
Reviewers: qcolombet, bogner, efriedma, lebedev.ri, nlopes, craig.topper, arsenm
Reviewed By: lebedev.ri
Subscribers: wdng, xbolva00, Petar.Avramovic, liuz, lkail, dylanmckay, hiraditya, Jim, arsenm, craig.topper, RKSimon, spatel, lebedev.ri, regehr, trentxintong, nlopes, mkuper, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D29014
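
A minimal sketch of how DAG code can create the new node; `freezeOperand` is an illustrative helper under assumed context, not code from the patch:

```cpp
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Wrap a value in ISD::FREEZE so every later use observes one fixed value,
// even if the input was undef or poison.
static SDValue freezeOperand(SelectionDAG &DAG, const SDLoc &DL, SDValue Op) {
  return DAG.getNode(ISD::FREEZE, DL, Op.getValueType(), Op);
}
```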

# 4167645d | 02-Mar-2020 | Volkan Keles <vkeles@apple.com>
GlobalISel: Move Localizer::shouldLocalize(..) to TargetLowering
Add a new target hook for shouldLocalize so that targets can customize the logic.
https://reviews.llvm.org/D75207
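
A sketch of what the hook enables; the subclass, its name, and the exact signature are assumptions based on the description above:

```cpp
// Hypothetical target override: keep the default localization heuristic,
// but additionally localize constant materializations near their uses.
bool MyTargetLowering::shouldLocalize(const MachineInstr &MI,
                                      const TargetTransformInfo *TTI) const {
  if (MI.getOpcode() == TargetOpcode::G_CONSTANT)
    return true; // cheap to rematerialize in each using block
  return TargetLoweringBase::shouldLocalize(MI, TTI);
}
```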

# 6e561d1c | 16-Dec-2019 | Bevin Hansson <bevin.hansson@ericsson.com>
[Intrinsic] Add fixed point saturating division intrinsics.
Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division:
```
llvm.sdiv.fix.sat.*
llvm.udiv.fix.sat.*
```
These intrinsics perform scaled, saturating division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang.
Reviewers: bjope, leonardchan, craig.topper
Subscribers: hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71550
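
As a reference model of the semantics (my reading of the description, not code from the patch; the intrinsic's exact rounding behavior is defined in the LangRef, while C++ `/` truncates):

```cpp
#include <cstdint>
#include <limits>

// Model of llvm.sdiv.fix.sat.i32 with scale N: divide two fixed-point
// values carrying N fractional bits and clamp the quotient to i32 range.
// The non-saturating llvm.sdiv.fix.* variants omit the clamp.
int32_t sdiv_fix_sat_i32(int32_t a, int32_t b, unsigned scale) {
  int64_t wide = (int64_t)a << scale; // pre-scale in a wider type
  int64_t q = wide / b;               // b != 0 assumed, as with IR sdiv
  if (q > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (q < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return (int32_t)q;
}
```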

# 466f8843 | 18-Feb-2020 | Jim Lin <tclin914@gmail.com>
[NFC] Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}

# 943b5561 | 01-Feb-2020 | Craig Topper <craig.topper@gmail.com>
[LegalizeTypes][X86] Add a new strategy for type legalizing f16 type that softens it to i16, but promotes to f32 around arithmetic ops.
This is based on this llvm-dev thread http://lists.llvm.org/pipermail/llvm-dev/2019-December/137521.html
The current strategy for f16 is to promote the type to float everywhere except where the specific width is required, such as loads, stores, and bitcasts. This results in rounding occurring in odd places rather than immediately after arithmetic operations. It interacts in weird ways with the __fp16 type in clang, which is a storage-only type where arithmetic is always promoted to float. InstCombine can remove some of the fpext/fptruncs around such arithmetic and turn it into arithmetic on half. This wouldn't be so bad if SelectionDAG were able to put those fpext/fprounds back in when it promotes.
It is also not obvious how to make the existing strategy work with STRICT FP. We need to use STRICT versions of the conversions, which require chain operands. But if the conversions are created for a bitcast, there is no place to get an appropriate chain from.
This patch implements a different strategy where conversions are emitted directly around arithmetic operations, and the value is otherwise passed around as an i16, including in arguments and return values. This can result in more conversions between arithmetic operations, but is closer to matching the IR the frontend generates for __fp16. It will also allow us to use the chain from constrained arithmetic nodes to link the STRICT_FP_TO_FP16/STRICT_FP16_TO_FP that will need to be added. I've set it up so that each target can opt into the new behavior; converting all the targets myself was more than I was able to handle.
Differential Revision: https://reviews.llvm.org/D73749
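
One way to picture the new strategy (illustrative only; the conversion helpers stand in for FP16_TO_FP/FP_TO_FP16 and are assumed here):

```cpp
#include <cstdint>

uint16_t f32_to_f16(float f);    // assumed helper, e.g. a __gnu_f2h_ieee call
float f16_to_f32(uint16_t bits); // assumed helper, e.g. a __gnu_h2f_ieee call

// The f16 value lives as an i16 bit pattern and is widened to f32 only
// around each arithmetic op, so rounding happens immediately after the
// operation, matching clang's storage-only __fp16 semantics.
uint16_t fadd_f16(uint16_t a, uint16_t b) {
  return f32_to_f16(f16_to_f32(a) + f16_to_f32(b));
}
```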

# 17b8f96d | 17-Jan-2020 | Wang, Pengfei <pengfei.wang@intel.com>
[FPEnv] Divide macro INSTRUCTION into INSTRUCTION and DAG_INSTRUCTION, and macro FUNCTION likewise. NFCI.
Some functions, like fmuladd, don't really have a node; we should divide their declarations from those that have nodes, to avoid introducing fake nodes.
Differential Revision: https://reviews.llvm.org/D72871

# d0943537 | 12-Jan-2020 | Matt Arsenault <Matthew.Arsenault@amd.com>
GlobalISel: Apply target MMO flags to atomics
Unify MMO flag handling with SelectionDAG like with loads and stores.

# 0d0fce42 | 12-Jan-2020 | Matt Arsenault <Matthew.Arsenault@amd.com>
GlobalISel: Preserve load/store metadata in IRTranslator
This was dropping the invariant metadata on dead argument loads, so they weren't deleted.
Atomics still need to be fixed the same way. Also, apparently stores were never preserving dereferenceable metadata, which should also be fixed.

# bb255317 | 11-Jan-2020 | Craig Topper <craig.topper@gmail.com>
[TargetLowering][ARM][Mips][WebAssembly] Remove the ordered FP compare from RuntimeLibcalls.def and all associated usages
Summary: This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that, they were hardcoded into the type legalizer. At that time we never inverted predicates, and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with an ogt/olt/oge/ole check. So only ordered compares needed an inverted predicate. Later, ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare.
This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code.
Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn
Reviewed By: efriedma
Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72536
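
To make the inverting logic concrete, a sketch using the libgcc/compiler-rt convention that `__ledf2(a, b) <= 0` iff `a <= b`, with an unordered (NaN) input producing a positive result:

```cpp
extern "C" int __ledf2(double, double); // provided by libgcc/compiler-rt

// ole(a, b) is __ledf2(a, b) <= 0. The unordered predicate ugt is simply
// its inversion, so no separate "ordered" libcall entry is needed.
bool fcmp_ugt(double a, double b) {
  return !(__ledf2(a, b) <= 0); // NaN => positive result => ugt is true
}
```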

# 8e2b44f7 | 08-Jan-2020 | Bevin Hansson <bevin.hansson@ericsson.com>
[Intrinsic] Add fixed point division intrinsics.
Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division:
```
llvm.sdiv.fix.*
llvm.udiv.fix.*
```
These intrinsics perform scaled division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang.
Patch by: ebevhan
Reviewers: bjope, leonardchan, efriedma, craig.topper
Reviewed By: craig.topper
Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70007

# 2133d3c5 | 03-Jan-2020 | QingShan Zhang <qshanz@cn.ibm.com>
[DAGCombine] Initialize the default operation action for SIGN_EXTEND_INREG for vector type as 'expand' instead of 'legal'
Previously we did not set the default operation action of SIGN_EXTEND_INREG for vector types, so it was 0, i.e. Legal, by default. However, most targets do not have native instructions to support this opcode. It should be set to Expand by default, as we did for ANY_EXTEND_VECTOR_INREG.
Differential Revision: https://reviews.llvm.org/D70000
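
The change amounts to a default of roughly the following shape in TargetLoweringBase (a sketch; the exact iteration helper is an assumption):

```cpp
// Constructor-time default: every vector type expands SIGN_EXTEND_INREG
// unless a target explicitly marks it Legal or Custom.
for (MVT VT : MVT::vector_valuetypes())
  setOperationAction(ISD::SIGN_EXTEND_INREG, VT, Expand);
```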

# 7a733466 | 27-Dec-2019 | Fangrui Song <maskray@google.com>
Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821
Intrinsic has incorrect argument type! i32 (i32*)* @llvm.setjmp
*wipes tear*

# b5315ae8 | 21-Nov-2019 | David Green <david.green@arm.com>
[Codegen][ARM] Add addressing modes from masked loads and stores
MVE has a basic symmetry between its normal load/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do.
To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores addressing modes in the same way as normal loads/stores.
This involves:
- Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase.
- Extending the IndexedModeActions from 8 bits to 16 bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads.
- The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way.
- The ARM backend is then adjusted to make use of these indexed masked loads/stores.
- The X86 backend is adjusted so that there are, hopefully, no functional changes.
Differential Revision: https://reviews.llvm.org/D70176
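
With that infrastructure, a target opts in per addressing mode and type, e.g. (a sketch of ARM-style usage; the vector type and the store-side setter are assumed analogues of setIndexedMaskedLoadAction):

```cpp
// In the target's TargetLowering constructor, mirroring
// setIndexedLoadAction for normal loads:
setIndexedMaskedLoadAction(ISD::PRE_INC, MVT::v8i16, Legal);
setIndexedMaskedLoadAction(ISD::POST_INC, MVT::v8i16, Legal);
setIndexedMaskedStoreAction(ISD::PRE_INC, MVT::v8i16, Legal);
setIndexedMaskedStoreAction(ISD::POST_INC, MVT::v8i16, Legal);
```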

# 22a0edd0 | 22-Nov-2019 | Pengfei Wang <pengfei.wang@intel.com>
[FPEnv] Add an option to disable strict float node mutating to a normal float node
This patch adds an option, 'disable-strictnode-mutation', to prevent strict nodes from mutating to normal nodes, so we can make sure that a patch which marks strict nodes as legal works correctly.
Patch by Chen Liu(LiuChen3)
Differential Revision: https://reviews.llvm.org/D70226

# 0c50c0b0 | 05-Nov-2019 | Serge Pavlov <sepavloff@gmail.com>
[FEnv] File with properties of constrained intrinsics
Summary: In several places we need to enumerate all constrained intrinsics, or the IR nodes that should be represented by them, and it is easy to miss some of the cases. To make working with these intrinsics more convenient and robust, this change introduces a file containing definitions of all constrained intrinsics and some of their properties. This file can be included to generate constrained-intrinsic processing code.
Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69887
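
Such a .def file is typically consumed with the X-macro pattern; a sketch (the macro name and its columns here are illustrative, not the exact ones in the file):

```cpp
// Classify an intrinsic ID by expanding one case per .def line.
bool isConstrainedIntrinsic(Intrinsic::ID ID) {
  switch (ID) {
#define INSTRUCTION(NAME, NARG, ROUND, EXCEPT, INTRINSIC, DAGN)               \
  case Intrinsic::INTRINSIC:                                                  \
    return true;
#include "llvm/IR/ConstrainedOps.def"
  default:
    return false;
  }
}
```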

# 84e83b54 | 13-Nov-2019 | Craig Topper <craig.topper@intel.com>
[TargetLowering] Increase the storage size of NumRegistersForVT to allow the type break down for v256i1 and other types to be stored correctly
v256i1 on X86 without AVX-512 breaks down to 256 i8 values when passed between basic blocks. But NumRegistersForVT only allotted a byte for each VT, so 256 was stored as 0.
This patch enlarges the type to 16 bits and adds an assert to ensure that no information is lost when the entry is stored.
Differential Revision: https://reviews.llvm.org/D70138
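
The underlying bug is plain unsigned narrowing; a self-contained illustration (not LLVM code):

```cpp
#include <cstdint>
#include <iostream>

int main() {
  unsigned NeededRegs = 256;     // v256i1 -> 256 i8 values without AVX-512
  uint8_t Stored = NeededRegs;   // old table entry: one byte per VT
  uint16_t Widened = NeededRegs; // entry size after the patch
  std::cout << unsigned(Stored) << "\n"; // prints 0: 256 wraps to 0
  std::cout << Widened << "\n";          // prints 256: fits in 16 bits
}
```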

# 58acbce3 | 05-Nov-2019 | aqjune <aqjune@gmail.com>
[IR] Add Freeze instruction
Summary:
- Define Instruction::Freeze, let it be UnaryOperator
- Add support for freeze to LLLexer/LLParser/BitcodeReader/BitcodeWriter. The format is `%x = freeze <ty> %v`
- Add support for the freeze instruction to the llvm-c interface.
- Add m_Freeze in PatternMatch.
- Erase freeze when lowering IR to SelDag.
Reviewers: deadalnix, hfinkel, efriedma, lebedev.ri, nlopes, jdoerfert, regehr, filcab, delcypher, whitequark
Reviewed By: lebedev.ri, jdoerfert
Subscribers: jfb, kristof.beyls, hiraditya, lebedev.ri, steven_wu, dexonsmith, xbolva00, delcypher, spatel, regehr, trentxintong, vsk, filcab, nlopes, mehdi_amini, deadalnix, llvm-commits
Differential Revision: https://reviews.llvm.org/D29011
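
For illustration, creating and matching a freeze from C++ (m_Freeze comes from this patch; CreateFreeze as the IRBuilder entry point is my assumption):

```cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace llvm::PatternMatch;

// Emits `%fr = freeze i32 %v` and shows the matcher round-trip.
Value *freezeAndMatch(IRBuilder<> &B, Value *V) {
  Value *Fr = B.CreateFreeze(V, "fr");
  Value *X = nullptr;
  if (match(Fr, m_Freeze(m_Value(X))))
    return X; // X is bound back to V
  return nullptr;
}
```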

# 44d0c3d9 | 31-Oct-2019 | Fangrui Song <maskray@google.com>
[PGO][PGSO] Fix -DBUILD_SHARED_LIBS=on builds after D69580/llvmorg-10-init-8797-g0d987e411ac
Move TargetLoweringBase::isSuitableForJumpTable from llvm/CodeGen/TargetLowering.h to .cpp, to avoid the undefined reference from all LLVM${Target}ISelLowering.cpp.
An alternative fix would be to add a dependency on TransformUtils to every lib/Target/$Target/LLVMBuild.txt, but that would be too disruptive.

# 84da2596 | 18-Oct-2019 | Graham Hunter <graham.hunter@arm.com>
[AArch64][SVE] Add SPLAT_VECTOR ISD Node
Adds a new ISD node to replicate a scalar value across all elements of a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot be used.
Fixes up default type legalization for scalable vectors after the new MVT type ranges were introduced.
At present I only use this node for scalable vectors. A DAGCombine has been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all elements are the same, but only if the default operation action of Expand has been overridden by the target.
I've only added result promotion legalization for scalable vector i8/i16/i32/i64 types in AArch64 for now.
Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy
Reviewed By: jmolloy
Differential Revision: https://reviews.llvm.org/D47775
llvm-svn: 375222
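
A usage sketch (the value type is illustrative): replicating a scalar across a scalable vector, where BUILD_VECTOR cannot enumerate the lanes:

```cpp
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Build a <vscale x 4 x i32> whose lanes all equal Scalar.
static SDValue lowerSplat(SelectionDAG &DAG, const SDLoc &DL, SDValue Scalar) {
  return DAG.getNode(ISD::SPLAT_VECTOR, DL, MVT::nxv4i32, Scalar);
}
```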

# 1c3d19c8 | 07-Oct-2019 | Kevin P. Neal <kevin.neal@sas.com>
[FPEnv] Add constrained intrinsics for lrint and lround
Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here.
Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally
Approved by: craig.topper
Differential Revision: https://reviews.llvm.org/D64746
llvm-svn: 373900

# 3740ae3b | 27-Sep-2019 | Hans Wennborg <hans@hanshq.net>
Revert r372893 "[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets"
This caused severe compile-time regressions, see PR43455.
> Modern processors predict the targets of an indirect branch regardless of
> the size of any jump table used to glean its target address. Moreover,
> branch predictors typically use resources limited by the number of actual
> targets that occur at run time.
>
> This patch changes the semantics of the option `-max-jump-table-size` to limit
> the number of different targets instead of the number of entries in a jump
> table. Thus, it is now renamed to `-max-jump-table-targets`.
>
> Before, when `-max-jump-table-size` was specified, it could happen that
> cluster jump tables could have targets used repeatedly, but each one was
> counted and typically resulted in tables with the same number of entries.
> With this patch, when specifying `-max-jump-table-targets`, tables may have
> different lengths, since the number of unique targets is counted towards the
> limit, but the number of unique targets in tables is the same, but for the
> last one containing the balance of targets.
>
> Differential revision: https://reviews.llvm.org/D60295
llvm-svn: 373060

# 3c8c6672 | 26-Sep-2019 | Thomas Raoux <thomas.raoux@gmail.com>
[TargetLowering] Make the allowsMemoryAccess method virtual.
Rename the old function to explicitly show that it only cares about alignment. The new allowsMemoryAccess calls the alignment-related function by default, and can be overridden by a target to indicate whether the memory access is legal.
Differential Revision: https://reviews.llvm.org/D67121
llvm-svn: 372935

# 3bd8ba15 | 25-Sep-2019 | Evandro Menezes <e.menezes@samsung.com>
[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets
Modern processors predict the targets of an indirect branch regardless of the size of any jump table used to glean its target address. Moreover, branch predictors typically use resources limited by the number of actual targets that occur at run time.
This patch changes the semantics of the option `-max-jump-table-size` to limit the number of different targets instead of the number of entries in a jump table. Thus, it is now renamed to `-max-jump-table-targets`.
Before, when `-max-jump-table-size` was specified, cluster jump tables could use the same target repeatedly, yet each occurrence was counted, which typically resulted in tables with the same number of entries. With this patch, when specifying `-max-jump-table-targets`, tables may have different lengths, since the number of unique targets is counted towards the limit; each table holds the same number of unique targets, except for the last one, which contains the balance of the targets.
Differential revision: https://reviews.llvm.org/D60295
llvm-svn: 372893

# 1a9195d8 | 17-Sep-2019 | Graham Hunter <graham.hunter@arm.com>
[SVE][MVT] Fixed-length vector MVT ranges
* Reordered MVT simple types to group scalable vector types together.
* New range functions in MachineValueType.h to only iterate over the fixed-length int/fp vector types.
* Stopped backends which don't support scalable vector types from iterating over scalable types.
Reviewers: sdesmalen, greened
Reviewed By: greened
Differential Revision: https://reviews.llvm.org/D66339
llvm-svn: 372099

# f1c28929 | 12-Sep-2019 | Tim Northover <tnorthover@apple.com>
AArch64: support arm64_32, an ILP32 slice for watchOS.
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32.
llvm-svn: 371722