#
85eae455 |
| 30-Mar-2022 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG] Move extension type for ConstantSDNode from getCopyToRegs to HandlePHINodesInSuccessorBlocks.
D122053 set the ExtendType for ConstantSDNodes in getCopyToRegs to ZERO_EXTEND to match assumptions in ComputePHILiveOutRegInfo. PHIs are probably not the only way ConstantSDNodes can get to getCopyToRegs.
This patch adds an ExtendType parameter to CopyValueToVirtualRegister and has HandlePHINodesInSuccessorBlocks pass ISD::ZERO_EXTEND for ConstantInts. This way we only affect ConstantSDNodes for PHIs.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122171
|
#
02c28970 |
| 21-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup include: codegen second round
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122180
|
#
4eb59f01 |
| 19-Mar-2022 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG][RISCV] Make RegsForValue::getCopyToRegs explicitly zero_extend constants.
ComputePHILiveOutRegInfo assumes that constant incoming values to Phis will be zero extended if they aren't a legal type. To guarantee that we should zero_extend rather than any_extend constants.
This fixes a bug for RISCV where any_extend of constants can be treated as a sign_extend.
Differential Revision: https://reviews.llvm.org/D122053
|
#
ed98c1b3 |
| 09-Mar-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
|
#
28cfa764 |
| 10-Mar-2022 |
Lorenzo Albano <loralb@posteo.net> |
[VP] Strided loads/stores
This patch introduces two new experimental IR intrinsics and SDAG nodes to represent vector strided loads and stores.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D114884
|
#
17310f3d |
| 08-Mar-2022 |
Fraser Cormack <fraser@codeplay.com> |
[SelectionDAG][NFC] Address a few clang-tidy warnings
Fix a couple of else-after-return warnings and some unnecessary parentheses.
|
#
7e570308 |
| 03-Mar-2022 |
Maksim Panchenko <maks@fb.com> |
[NFC] Fix typos
Reviewed By: yota9, Amir
Differential Revision: https://reviews.llvm.org/D120859
|
#
7b85f0f3 |
| 02-Mar-2022 |
Paul Robinson <Paul.Robinson@sony.com> |
[PS4] isPS4 and isPS4CPU are not meaningfully different
|
#
87ebd9a3 |
| 25-Feb-2022 |
Nikita Popov <npopov@redhat.com> |
[IR] Use CallBase::getParamElementType() (NFC)
As this method now exists on CallBase, use it rather than the one on AttributeList.
|
#
24bfa243 |
| 19-Feb-2022 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAGBuilder] Simplify visitShift. NFC
This code was detecting whether the value returned by getShiftAmountTy can represent all shift amounts. If not, it would use MVT::i32 as a placeholder. getShiftAmountTy was updated last year to return i32 if the type returned by the target couldn't represent all values.
This means the MVT::i32 case here is dead and the logic can be simplified.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D120164
|
#
04f815c2 |
| 18-Feb-2022 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAGBuilder] Remove LegalTypes=false from a call to getShiftAmountConstant.
getShiftAmountTy will return MVT::i32 if the shift amount coming from the target's getScalarShiftAmountTy can't represent all possible values. That should eliminate the need to use the pointer type which is what we do when LegalTypes is false.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D120165
|
#
dcb2da13 |
| 11-Feb-2022 |
Julien Pages <Julien.Pages@amd.com> |
[AMDGPU] Add a new intrinsic to control fp_trunc rounding mode
Add a new llvm.fptrunc.round intrinsic to precisely control the rounding mode when converting from f32 to f16.
Differential Revision: https://reviews.llvm.org/D110579
|
#
002b944d |
| 31-Jan-2022 |
Kerry McLaughlin <kerry.mclaughlin@arm.com> |
[SVE] Fix TypeSize->uint64_t implicit conversion in visitAlloca()
Fixes a crash ('Invalid size request on a scalable vector') in visitAlloca() when we call this function for a scalable alloca instruction, caused by the implicit conversion of TySize to uint64_t. This patch changes TySize to a TypeSize as returned by getTypeAllocSize() and ensures the allocation size is multiplied by vscale for scalable vectors.
Reviewed By: sdesmalen, david-arm
Differential Revision: https://reviews.llvm.org/D118372
|
#
11d30742 |
| 27-Jan-2022 |
Ellis Hoag <ellis.sparky.hoag@gmail.com> |
[InstrProf] Add single byte coverage mode
Use the llvm flag `-pgo-function-entry-coverage` to create single byte "counters" to track function coverage. This mode has significantly less size overhead in both code and data because
* We mark a function as "covered" with a store instead of an increment, which generally requires fewer assembly instructions
* We use a single byte per function rather than 8 bytes per block
The trade off of course is that this mode only tells you if a function has been covered. This is useful, for example, to detect dead code.
When combined with debug info correlation [0] we are able to create an instrumented Clang binary that is only 150M (the vanilla Clang binary is 143M). That is an overhead of 7M (4.9%) compared to the default instrumentation (without value profiling) which has an overhead of 31M (21.7%).
[0] https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4
Reviewed By: kyulee
Differential Revision: https://reviews.llvm.org/D116180
|
#
c8e33978 |
| 19-Nov-2021 |
Fraser Cormack <fraser@codeplay.com> |
[VP] Propagate align parameter attr on VP gather/scatter to ISel
This patch fixes a case where the 'align' parameter attribute on the pointer operands to llvm.vp.gather and llvm.vp.scatter was being dropped during the conversion to the SelectionDAG. The default alignment equal to the ABI type alignment of the vector type was kept. It also updates the documentation to reflect the fact that the parameter attribute is now properly supported.
The default alignment of these intrinsics was previously documented as being equal to the ABI alignment of the *scalar* type, when in fact that wasn't the case: the ABI alignment of the vector type was used instead. This has also been fixed in this patch.
Reviewed By: simoll, craig.topper
Differential Revision: https://reviews.llvm.org/D114423
|
#
877d1b3d |
| 13-Jan-2022 |
Fraser Cormack <fraser@codeplay.com> |
[SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE
Original patch by @hussainjk.
This patch was split off from D109377 to keep vector legalization (widening/splitting) separate from vector element legalization (promoting).
While the original patch added a third overload of SelectionDAG::getVPStore, this patch takes the liberty of collapsing those all down to 1, as three overloads seems excessive for a little-used node.
The original patch also used ModifyToType in places, but that method still crashes on scalable vector types. Seeing as the other VP legalization methods only work when all operands need identical widening, this patch follows in that vein.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117235
|
#
e0841f69 |
| 14-Jan-2022 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAGBuilder] Remove unneeded vector bitcast from visitTargetIntrinsic.
This seems to be a leftover from a long time ago when there was an ISD::VBIT_CONVERT and a MVT::Vector. It looks like in those days the vector type was carried in a VTSDNode.
As far as I know, these days ComputeValueTypes would have already assigned "Result" the same type we're getting from TLI.getValueType here. Thus the BITCAST is always a NOP. Verified by adding an assert and running check-llvm.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D117335
|
#
4edb9983 |
| 11-Jan-2022 |
Nick Desaulniers <ndesaulniers@google.com> |
[SelectionDAG] treat X constrained labels as i for asm
Completely rework how we handle X constrained labels for inline asm.
X should really be treated as i. Then existing tests can be moved to use i (D115410) and clang can just emit i (D115311). (D115410 and D115311 are callbr, but this can be done for label inputs, too).
Coincidentally, this simplification solves an ICE uncovered by D87279 based on assumptions made during D69868.
This is the third approach considered. See also discussions v1 (D114895) and v2 (D115409).
Reported-by: kernel test robot <lkp@intel.com>
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1512
Reviewed By: void, jyknight
Differential Revision: https://reviews.llvm.org/D115688
|
#
51497dc0 |
| 17-Dec-2021 |
David Sherwood <david.sherwood@arm.com> |
[IR] Change vector.splice intrinsic to reject out-of-bounds indices
I've changed the definition of the experimental.vector.splice intrinsic to reject indices that are known to be, or could be, out-of-bounds. In practice, this means changing the definition so that the index is now only valid in the range [-VL, VL-1] where VL is the known minimum vector length. We use the vscale_range attribute to take the minimum vscale value into account so that we can permit more indices when the attribute is present.
The splice intrinsic is currently only ever generated by the vectoriser, which will never attempt to splice vectors with out-of-bounds values. Changing the definition also makes things simpler for codegen since we can always assume that the index is valid.
This patch was created in response to review comments on D115863.
Differential Revision: https://reviews.llvm.org/D115933
|
#
0312fe29 |
| 07-Jan-2022 |
Nikita Popov <npopov@redhat.com> |
[CodeGen] Support opaque pointers for inline asm
This is the last part of D116531. Fetch the type of the indirect inline asm operand from the elementtype attribute, rather than the pointer element type.
Fixes https://github.com/llvm/llvm-project/issues/52928.
|
#
e4d17799 |
| 07-Jan-2022 |
Nikita Popov <npopov@redhat.com> |
[IR] Add ConstraintInfo::hasArg() helper (NFC)
Checking whether a constraint corresponds to an argument is a recurring pattern.
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
#
5dc8aaac |
| 10-Aug-2021 |
Sami Tolvanen <samitolvanen@google.com> |
[llvm][IR] Add no_cfi constant
With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces function references with CFI jump table references, which is a problem for low-level code that needs the address of the actual function body.
For example, in the Linux kernel, the code that sets up interrupt handlers needs to take the address of the interrupt handler function instead of the CFI jump table, as the jump table may not even be mapped into memory when an interrupt is triggered.
This change adds the no_cfi constant type, which wraps function references in a value that LowerTypeTestsModule::replaceCfiUses does not replace.
Link: https://github.com/ClangBuiltLinux/linux/issues/1353
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D108478
|
#
921e89c5 |
| 08-Dec-2021 |
Peter Waller <peter.waller@arm.com> |
[SVE] Only combine (fneg (fma)) => FNMLA with nsz
-(Za + Zm * Zn) != (-Za + Zm * (-Zn)) when the FMA produces a zero output (e.g. all zero inputs can produce -0 output)
Add a PatFrag to check presence of nsz on the fneg, and add tests which ensure the combine does not fire in the absence of nsz.
See https://reviews.llvm.org/D90901 for a similar discussion on X86.
Differential Revision: https://reviews.llvm.org/D109525
|
#
b0319ab7 |
| 30-Nov-2021 |
Fraser Cormack <fraser@codeplay.com> |
[PR52475] Ensure a correct chain in copies to/from hidden sret parameter
This patch fixes an issue during SelectionDAG construction. When the target is unable to lower the function's return value, a hidden sret parameter is created. It is initialized and copied to a stored variable (DemoteRegister) with CopyToReg and is later fetched with CopyFromReg. The bug is that the chains used for each copy are inconsistent, and thus in rare cases the scheduler may issue them out of order.
The fix is to ensure that the CopyFromReg uses the DAG root which is set as the chain corresponding to the initial CopyToReg.
Fixes https://llvm.org/PR52475
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D114795
|
#
652faed3 |
| 07-Dec-2021 |
David Sherwood <david.sherwood@arm.com> |
[CodeGen] Improve SelectionDAGBuilder lowering code for get.active.lane.mask intrinsic
Previously we were using UADDO to generate a two-result value with the unsigned addition and the overflow mask. We then combined the overflow mask with the trip count comparison to get a result. However, we don't need to do this - we can simply use a UADDSAT saturating add node to add the vector index splat and the stepvector together. Then we can just compare this to a splat of the trip count. This results in overall better code quality for both Thumb2 and AArch64.
Differential Revision: https://reviews.llvm.org/D115354
|