#
87fb204e |
| 30-Dec-2019 |
Fangrui Song <maskray@google.com> |
[SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsm
|
#
63336795 |
| 02-Jan-2020 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
[FPEnv] Default NoFPExcept SDNodeFlag to false
The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that tra
[FPEnv] Default NoFPExcept SDNodeFlag to false
The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that transforms SDNodes without copying flags can introduce a correctness bug, not just a missed optimization.
This patch changes the default to false. This makes it necessary to move setting the (No)FPExcept flag for constrained intrinsics from the visitConstrainedIntrinsic routine to the generic visit routine at the place where the other flags are set, or else the intersectFlagsWith call would erase the NoFPExcept flag again.
In order to avoid making non-strict FP code worse, whenever SelectionDAGISel::SelectCodeCommon matches on a set of orignal nodes none of which can raise FP exceptions, it will preserve this property on all results nodes generated, by setting the NoFPExcept flag on those result nodes that would otherwise be considered as raising an FP exception.
To check whether or not an SD node should be considered as raising an FP exception, the following logic applies:
- For machine nodes, check the mayRaiseFPException property of the underlying MI instruction - For regular nodes, check isStrictFPOpcode - For target nodes, check a newly introduced isTargetStrictFPOpcode
The latter is implemented by reserving a range of target opcodes, similarly to how memory opcodes are identified. (Note that there a bit of a quirk in identifying target nodes that are both memory nodes and strict FP nodes. To simplify the logic, right now all target memory nodes are automatically also considered strict FP nodes -- this could be fixed by adding one more range.)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D71841
show more ...
|
#
8dc7b982 |
| 01-Jan-2020 |
Mark de Wever <koraq@xs4all.nl> |
[NFC] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.
Differential Revision: https://reviews.llvm.org/D71857
|
#
6f9b4c68 |
| 30-Dec-2019 |
Fangrui Song <maskray@google.com> |
[SelectionDAT] Simplify SelectionDAGBuilder::visitInlineAsm
Indirect C_Immediate or C_Other constraints have been excluded.
Also simplify an unneeded change to indirect 'X' by D60942.
|
#
044cc919 |
| 28-Dec-2019 |
Fangrui Song <maskray@google.com> |
Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removed
|
#
7a733466 |
| 27-Dec-2019 |
Fangrui Song <maskray@google.com> |
Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821
Intrinsic has incorrect argument type! i32 (i32*)* @llvm.setjmp
*wipes tear*
|
#
e0d855b3 |
| 24-Dec-2019 |
Fangrui Song <maskray@google.com> |
[SelectionDAG] Change SelectionDAGISel::{funcInfo,SDB} to use unique_ptr
CurDAG is referenced more than 2000 times and used in many gerated .cpp files. Don't touch it for now.
|
#
fb0ccff6 |
| 22-Dec-2019 |
Valentin Churavy <v.churavy@gmail.com> |
[SelectionDAG] Copy FP flags when visiting a binary instruction.
Summary: We noticed in Julia that the sequence below no longer turned into a sequence of FMA instructions in LLVM 7+, but it did in L
[SelectionDAG] Copy FP flags when visiting a binary instruction.
Summary: We noticed in Julia that the sequence below no longer turned into a sequence of FMA instructions in LLVM 7+, but it did in LLVM 6.
``` %29 = fmul contract <4 x double> %wide.load, %wide.load16 %30 = fmul contract <4 x double> %wide.load13, %wide.load17 %31 = fmul contract <4 x double> %wide.load14, %wide.load18 %32 = fmul contract <4 x double> %wide.load15, %wide.load19 %33 = fadd fast <4 x double> %vec.phi, %29 %34 = fadd fast <4 x double> %vec.phi10, %30 %35 = fadd fast <4 x double> %vec.phi11, %31 %36 = fadd fast <4 x double> %vec.phi12, %32 ```
Unlike Clang, Julia doesn't set the `unsafe-fp-math=true` function attribute, but rather emits more local instruction flags.
This partially undoes https://reviews.llvm.org/D46854 and if required I can try to minimize the test further.
Reviewers: spatel, mcberg2017
Reviewed By: spatel
Subscribers: chriselrod, merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71495
show more ...
|
#
cfe31600 |
| 18-Dec-2019 |
Craig Topper <craig.topper@intel.com> |
[SelectionDAGBuilder] Use getConstant instead of getTargetConstant to build the offset for struct types in getUniformBase.
getTargetConstant prevents any optimizations from operating on the value an
[SelectionDAGBuilder] Use getConstant instead of getTargetConstant to build the offset for struct types in getUniformBase.
getTargetConstant prevents any optimizations from operating on the value and basically says its already been iseled. But since we want the index to be in a register, this isn't true.
Prior to this we were generating a vbroadcast with an immediate argument which is illegal and was flagged by the expensive checks bot.
show more ...
|
#
89d19d60 |
| 18-Dec-2019 |
stozer <stephen.tozer@sony.com> |
Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
This reverts commit 1f3dd83cc1f2b8f72b9d59e2b4221b12fb7f9a95, reapplying commit bb1b0bc4e57428ce364d3d6c075ff03cb8973
Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
This reverts commit 1f3dd83cc1f2b8f72b9d59e2b4221b12fb7f9a95, reapplying commit bb1b0bc4e57428ce364d3d6c075ff03cb8973462.
The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.
show more ...
|
#
1f3dd83c |
| 18-Dec-2019 |
stozer <stephen.tozer@sony.com> |
Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel"
Reverted due to build failure on windows bots.
This reverts commit bb1b0bc4e57428ce364d3d6c075ff03cb8973462.
|
#
bb1b0bc4 |
| 17-Dec-2019 |
stozer <stephen.tozer@sony.com> |
[DebugInfo] Correctly handle salvaged casts and split fragments at ISel
Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions
[DebugInfo] Correctly handle salvaged casts and split fragments at ISel
Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions.
There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.
show more ...
|
#
8cc0b586 |
| 13-Dec-2019 |
Wang, Pengfei <pengfei.wang@intel.com> |
[X86] Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic.
Summary: Add calculation for elements in structures in getting uniform base for the Gather/
[X86] Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic.
Summary: Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic.
Reviewers: craig.topper, c-rhodes, RKSimon
Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71442
show more ...
|
#
11448eeb |
| 13-Dec-2019 |
Alex Richardson <Alexander.Richardson@cl.cam.ac.uk> |
[NFC] Use SelectionDAG::getMemBasePlusOffset() instead of getNode(ISD::ADD)
Summary: To find potential opportunities to use getMemBasePlusOffset() I looked at all ISD::ADD uses found with the regex
[NFC] Use SelectionDAG::getMemBasePlusOffset() instead of getNode(ISD::ADD)
Summary: To find potential opportunities to use getMemBasePlusOffset() I looked at all ISD::ADD uses found with the regex getNode\(ISD::ADD,.+,.+Ptr in lib/CodeGen/SelectionDAG. If this patch is accepted I will convert the files in the individual backends too.
The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type.
We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges.
Reviewers: spatel
Reviewed By: spatel
Subscribers: merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71207
show more ...
|
#
e39e2b4a |
| 10-Dec-2019 |
stozer <stephen.tozer@sony.com> |
[DebugInfo] Prevent invalid fragments at ISel from dropping debug info
During SelectionDAG, if a value which is associated with a DBG_VALUE needs to be split across multiple registers, the DBG_VALUE
[DebugInfo] Prevent invalid fragments at ISel from dropping debug info
During SelectionDAG, if a value which is associated with a DBG_VALUE needs to be split across multiple registers, the DBG_VALUE will be split into a set of fragment expressions to recreate the original value.
If one or more of these fragments cannot be created, they would previously be silently dropped, causing the old debug value to live past its expiry date. This patch fixes this issue by keeping invalid fragments while setting their value as Undef.
Differential revision: https://reviews.llvm.org/D70248
show more ...
|
#
5d986953 |
| 11-Dec-2019 |
Reid Kleckner <rnk@google.com> |
[IR] Split out target specific intrinsic enums into separate headers
This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build
[IR] Split out target specific intrinsic enums into separate headers
This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics.
The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work.
Part of PR34259
Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
show more ...
|
#
9db13b5a |
| 06-Dec-2019 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
[FPEnv] Constrained FCmp intrinsics
This adds support for constrained floating-point comparison intrinsics.
Specifically, we add:
declare <ty2> @llvm.experimental.constrained.fcmp(<typ
[FPEnv] Constrained FCmp intrinsics
This adds support for constrained floating-point comparison intrinsics.
Specifically, we add:
declare <ty2> @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) declare <ty2> @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>)
The first variant implements an IEEE "quiet" comparison (i.e. we only get an invalid FP exception if either argument is a SNaN), while the second variant implements an IEEE "signaling" comparison (i.e. we get an invalid FP exception if either argument is any NaN).
The condition code is implemented as a metadata string. The same set of predicates as for the fcmp instruction is supported (except for the "true" and "false" predicates).
These new intrinsics are mapped by SelectionDAG codegen onto two new ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again representing quiet vs. signaling comparison operations. Otherwise those nodes look like SETCC nodes, with an additional chain argument and result as usual for strict FP nodes. The patch includes support for the common legalization operations for those nodes.
The patch also includes full SystemZ back-end support for the new ISD nodes, mapping them to all available SystemZ instruction to fully implement strict semantics (scalar and vector).
Differential Revision: https://reviews.llvm.org/D69281
show more ...
|
#
daee549b |
| 06-Dec-2019 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
[FPEnv][SelectionDAG] Relax chain requirements
This patch implements the following changes:
1) SelectionDAGBuilder::visitConstrainedFPIntrinsic currently treats each constrained intrinsic like a gl
[FPEnv][SelectionDAG] Relax chain requirements
This patch implements the following changes:
1) SelectionDAGBuilder::visitConstrainedFPIntrinsic currently treats each constrained intrinsic like a global barrier (e.g. a function call) and fully serializes all pending chains. This is actually not required; it is allowed for constrained intrinsics to be reordered w.r.t one another or (nonvolatile) memory accesses. The MI-level scheduler already allows for that flexibility, so it makes sense to allow it at the DAG level as well.
This patch therefore changes the way chains for constrained intrisincs are created, and handles them basically like load operations are handled. This has the effect that constrained intrinsics are no longer serialized against one another or (nonvolatile) loads. They are still serialized against stores, but that seems hard to change with the current DAG chain setup, and it also doesn't seem to be a big problem preventing DAG
2) The OPC_CheckFoldableChainNode check requires that each of the intermediate nodes in a multi-node pattern match only has a single use. This check tends to fail if those intermediate nodes are strict operations as those have a chain output that typically indeed has another use. However, we don't really need to consider chains here at all, since they will all be rewritten anyway by UpdateChains later. Other parts of the matcher therefore already ignore chains, but this hasOneUse check doesn't.
This patch replaces hasOneUse by a custom test that verifies there is no more than one use of any non-chain output value.
In theory, this change could affect code unrelated to strict FP nodes, but at least on SystemZ I could not find any single instance of that happening
3) The SystemZ back-end currently does not allow matching multiply-and- extend operations (32x32 -> 64bit or 64x64 -> 128bit FP multiply) for strict FP operations. This was not possible in the past due to the problems described under 1) and 2) above.
With those issues fixed, it is now possible to fully support those instructions in strict mode as well, and this patch does so.
Differential Revision: https://reviews.llvm.org/D70913
show more ...
|
#
b5315ae8 |
| 21-Nov-2019 |
David Green <david.green@arm.com> |
[Codegen][ARM] Add addressing modes from masked loads and stores
MVE has a basic symmetry between it's normal loads/store operations and the masked variants. This means that masked loads and stores
[Codegen][ARM] Add addressing modes from masked loads and stores
MVE has a basic symmetry between it's normal loads/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do.
To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores addressing modes in the same way as normal loads/stores.
This involves: - Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase. - Extending the IndexedModeActions from 8bits to 16bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads. - The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way. - The ARM backend is then adjusted to make use of these indexed masked loads/stores. - The X86 backend is adjusted to hopefully be no functional changes.
Differential Revision: https://reviews.llvm.org/D70176
show more ...
|
#
52e37749 |
| 11-Nov-2019 |
Hiroshi Yamauchi <yamauchi@google.com> |
[PGO][PGSO] DAG.shouldOptForSize part.
Summary: (Split of off D67120)
SelectionDAG::shouldOptForSize changes for profile guided size optimization.
Reviewers: davidxl
Subscribers: hiraditya, llvm-
[PGO][PGSO] DAG.shouldOptForSize part.
Summary: (Split of off D67120)
SelectionDAG::shouldOptForSize changes for profile guided size optimization.
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70095
show more ...
|
#
ea8678d1 |
| 29-Aug-2019 |
Serge Pavlov <sepavloff@gmail.com> |
Move floating point related entities to namespace level
This is recommit of commit e6584b2b7b2d, which was reverted in 30e7ee3c4bac together with af57dbf12e54. Original message is below.
Enumeratio
Move floating point related entities to namespace level
This is recommit of commit e6584b2b7b2d, which was reverted in 30e7ee3c4bac together with af57dbf12e54. Original message is below.
Enumerations that describe rounding mode and exception behavior were defined inside ConstrainedFPIntrinsic. It makes sense to use the same definitions to represent the same properties in other cases, not only in constrained intrinsics. It was however inconvenient as required to include constrained intrinsics definitions even if they were not needed. Also using long scope prefix reduced readability.
This change moves these definitioins to the namespace llvm::fp. No functional changes.
Differential Revision: https://reviews.llvm.org/D69552
show more ...
|
#
0c50c0b0 |
| 05-Nov-2019 |
Serge Pavlov <sepavloff@gmail.com> |
[FEnv] File with properties of constrained intrinsics
Summary In several places we need to enumerate all constrained intrinsics or IR nodes that should be represented by them. It is easy to miss som
[FEnv] File with properties of constrained intrinsics
Summary In several places we need to enumerate all constrained intrinsics or IR nodes that should be represented by them. It is easy to miss some of the cases. To make working with these intrinsics more convenient and robust, this change introduces file containing definitions of all constrained intrinsics and some of their properties. This file can be included to generate constrained intrinsics processing code.
Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69887
show more ...
|
#
b696b9db |
| 29-Oct-2019 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Add function context to isFMAFasterThanFMulAndFAdd
AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget.
AArch64 had to make this
DAG: Add function context to isFMAFasterThanFMulAndFAdd
AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget.
AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.
show more ...
|
#
30e7ee3c |
| 18-Nov-2019 |
Eric Christopher <echristo@gmail.com> |
Temporarily Revert "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=" and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is
Temporarily Revert "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=" and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.
This reverts commits af57dbf12e54f3a8ff48534bf1078f4de104c1cd and e6584b2b7b2de06f1e59aac41971760cac1e1b79
show more ...
|
#
3f08ad61 |
| 14-Aug-2019 |
Graham Hunter <graham.hunter@arm.com> |
[SVE][CodeGen] Scalable vector MVT size queries
* Implements scalable size queries for MVTs, split out from D53137.
* Contains a fix for FindMemType to avoid using scalable vector type to contain
[SVE][CodeGen] Scalable vector MVT size queries
* Implements scalable size queries for MVTs, split out from D53137.
* Contains a fix for FindMemType to avoid using scalable vector type to contain non-scalable types.
* Explicit casts for several places where implicit integer sign changes or promotion from 32 to 64 bits caused problems.
* CodeGenDAGPatterns will treat scalable and non-scalable vector types as different.
Reviewers: greened, cameron.mcinally, sdesmalen, rovka
Reviewed By: rovka
Differential Revision: https://reviews.llvm.org/D66871
show more ...
|