History log of /llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (Results 351 – 375 of 2094)
Revision Date Author Comments
# 85eae455 30-Mar-2022 Craig Topper <craig.topper@sifive.com>

[SelectionDAG] Move extension type for ConstantSDNode from getCopyToRegs to HandlePHINodesInSuccessorBlocks.

D122053 set the ExtendType for ConstantSDNodes in getCopyToRegs to
ZERO_EXTEND to match assumptions in ComputePHILiveOutRegInfo. PHIs
are probably not the only way ConstantSDNodes can get to
getCopyToRegs.

This patch adds an ExtendType parameter to CopyValueToVirtualRegister and
has HandlePHINodesInSuccessorBlocks pass ISD::ZERO_EXTEND for ConstantInts.
This way we only affect ConstantSDNodes for PHIs.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122171


# 02c28970 21-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup include: codegen second round

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122180


# 4eb59f01 19-Mar-2022 Craig Topper <craig.topper@sifive.com>

[SelectionDAG][RISCV] Make RegsForValue::getCopyToRegs explicitly zero_extend constants.

ComputePHILiveOutRegInfo assumes that constant incoming values to
Phis will be zero extended if they aren't a legal type. To guarantee
that we should zero_extend rather than any_extend constants.

This fixes a bug for RISCV where any_extend of constants can be
treated as a sign_extend.

Differential Revision: https://reviews.llvm.org/D122053
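
To illustrate why the choice of extension matters, here is a minimal standalone sketch. It is not code from the patch; the helper names zeroExtend8To32, signExtend8To32 and highBitsKnownZero are hypothetical. It widens an 8-bit constant both ways and checks whether the upper bits are zero, which is the property an analysis would rely on if it assumes zero extension.

```cpp
#include <cstdint>

// Widen an 8-bit constant to 32 bits by filling the new high bits with zeros.
constexpr std::uint32_t zeroExtend8To32(std::uint8_t v) {
  return static_cast<std::uint32_t>(v);
}

// Widen an 8-bit constant to 32 bits by replicating its sign bit.
constexpr std::uint32_t signExtend8To32(std::uint8_t v) {
  return static_cast<std::uint32_t>(
      static_cast<std::int32_t>(static_cast<std::int8_t>(v)));
}

// True when the upper 24 bits are all zero. An analysis that assumes zero
// extension would treat these bits as known zero.
constexpr bool highBitsKnownZero(std::uint32_t wide) {
  return (wide >> 8) == 0;
}
```

For the constant 0x80, zero extension yields 0x00000080 while sign extension yields 0xFFFFFF80, so an any_extend that happens to be lowered as a sign extension would break the "high bits are zero" assumption.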


# ed98c1b3 09-Mar-2022 serge-sans-paille <sguelton@redhat.com>

Cleanup includes: DebugInfo & CodeGen

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332


# 28cfa764 10-Mar-2022 Lorenzo Albano <loralb@posteo.net>

[VP] Strided loads/stores

This patch introduces two new experimental IR intrinsics and SDAG nodes
to represent vector strided loads and stores.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D114884


# 17310f3d 08-Mar-2022 Fraser Cormack <fraser@codeplay.com>

[SelectionDAG][NFC] Address a few clang-tidy warnings

Fix a couple of else-after-return warnings and some unnecessary
parentheses.


# 7e570308 03-Mar-2022 Maksim Panchenko <maks@fb.com>

[NFC] Fix typos

Reviewed By: yota9, Amir

Differential Revision: https://reviews.llvm.org/D120859


# 7b85f0f3 02-Mar-2022 Paul Robinson <Paul.Robinson@sony.com>

[PS4] isPS4 and isPS4CPU are not meaningfully different


# 87ebd9a3 25-Feb-2022 Nikita Popov <npopov@redhat.com>

[IR] Use CallBase::getParamElementType() (NFC)

As this method now exists on CallBase, use it rather than the
one on AttributeList.


# 24bfa243 19-Feb-2022 Craig Topper <craig.topper@sifive.com>

[SelectionDAGBuilder] Simplify visitShift. NFC

This code was detecting whether the value returned by getShiftAmountTy
can represent all shift amounts. If not, it would use MVT::i32 as a
placeholder. getShiftAmountTy was updated last year to return i32
if the type returned by the target couldn't represent all values.

This means the MVT::i32 case here is dead and the logic can
be simplified.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120164


# 04f815c2 18-Feb-2022 Craig Topper <craig.topper@sifive.com>

[SelectionDAGBuilder] Remove LegalTypes=false from a call to getShiftAmountConstant.

getShiftAmountTy will return MVT::i32 if the shift amount
coming from the target's getScalarShiftAmountTy can't represent
all possible values. That should eliminate the need to use the
pointer type which is what we do when LegalTypes is false.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D120165


# dcb2da13 11-Feb-2022 Julien Pages <Julien.Pages@amd.com>

[AMDGPU] Add a new intrinsic to control fp_trunc rounding mode

Add a new llvm.fptrunc.round intrinsic to precisely control
the rounding mode when converting from f32 to f16.

Differential Revision: https://reviews.llvm.org/D110579


# 002b944d 31-Jan-2022 Kerry McLaughlin <kerry.mclaughlin@arm.com>

[SVE] Fix TypeSize->uint64_t implicit conversion in visitAlloca()

Fixes a crash ('Invalid size request on a scalable vector') in visitAlloca()
when we call this function for a scalable alloca instruction, caused
by the implicit conversion of TySize to uint64_t.
This patch changes TySize to a TypeSize as returned by getTypeAllocSize()
and ensures the allocation size is multiplied by vscale for scalable vectors.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D118372


# 11d30742 27-Jan-2022 Ellis Hoag <ellis.sparky.hoag@gmail.com>

[InstrProf] Add single byte coverage mode

Use the llvm flag `-pgo-function-entry-coverage` to create single byte "counters" to track function coverage. This mode has significantly less size overhead in both code and data because
* We mark a function as "covered" with a store instead of an increment which generally requires fewer assembly instructions
* We use a single byte per function rather than 8 bytes per block

The trade-off, of course, is that this mode only tells you if a function has been covered. This is useful, for example, to detect dead code.

When combined with debug info correlation [0] we are able to create an instrumented Clang binary that is only 150M (the vanilla Clang binary is 143M). That is an overhead of 7M (4.9%) compared to the default instrumentation (without value profiling) which has an overhead of 31M (21.7%).

[0] https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D116180


# c8e33978 19-Nov-2021 Fraser Cormack <fraser@codeplay.com>

[VP] Propagate align parameter attr on VP gather/scatter to ISel

This patch fixes a case where the 'align' parameter attribute on the
pointer operands to llvm.vp.gather and llvm.vp.scatter was being dropped
during the conversion to the SelectionDAG. The default alignment equal
to the ABI type alignment of the vector type was kept. It also updates
the documentation to reflect the fact that the parameter attribute is
now properly supported.

The default alignment of these intrinsics was previously documented as
being equal to the ABI alignment of the *scalar* type, when in fact that
wasn't the case: the ABI alignment of the vector type was used instead.
This has also been fixed in this patch.

Reviewed By: simoll, craig.topper

Differential Revision: https://reviews.llvm.org/D114423


# 877d1b3d 13-Jan-2022 Fraser Cormack <fraser@codeplay.com>

[SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE

Original patch by @hussainjk.

This patch was split off from D109377 to keep vector legalization
(widening/splitting) separate from vector element legalization
(promoting).

While the original patch added a third overload of
SelectionDAG::getVPStore, this patch takes the liberty of collapsing
those all down to 1, as three overloads seems excessive for a
little-used node.

The original patch also used ModifyToType in places, but that method
still crashes on scalable vector types. Seeing as the other VP
legalization methods only work when all operands need identical
widening, this patch follows in that vein.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117235


# e0841f69 14-Jan-2022 Craig Topper <craig.topper@sifive.com>

[SelectionDAGBuilder] Remove unneeded vector bitcast from visitTargetIntrinsic.

This seems to be a leftover from a long time ago when there was
an ISD::VBIT_CONVERT and a MVT::Vector. It looks like in those days
the vector type was carried in a VTSDNode.

As far as I know, these days ComputeValueTypes would have already
assigned "Result" the same type we're getting from TLI.getValueType
here. Thus the BITCAST is always a NOP. Verified by adding an assert
and running check-llvm.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D117335


# 4edb9983 11-Jan-2022 Nick Desaulniers <ndesaulniers@google.com>

[SelectionDAG] treat X constrained labels as i for asm

Completely rework how we handle X constrained labels for inline asm.

X should really be treated as i. Then existing tests can be moved to use
i (D115410) and clang can just emit i (D115311). (D115410 and D115311 are
callbr, but this can be done for label inputs, too).

Coincidentally, this simplification solves an ICE uncovered by D87279
based on assumptions made during D69868.

This is the third approach considered. See also discussions v1 (D114895)
and v2 (D115409).

Reported-by: kernel test robot <lkp@intel.com>
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1512

Reviewed By: void, jyknight

Differential Revision: https://reviews.llvm.org/D115688


# 51497dc0 17-Dec-2021 David Sherwood <david.sherwood@arm.com>

[IR] Change vector.splice intrinsic to reject out-of-bounds indices

I've changed the definition of the experimental.vector.splice
intrinsic to reject indices that are known to be, or are possibly,
out-of-bounds. In practice, this means changing the definition so that
the index is now only valid in the range [-VL, VL-1] where VL is the
known minimum vector length. We use the vscale_range attribute to
take the minimum vscale value into account so that we can permit
more indices when the attribute is present.

The splice intrinsic is currently only ever generated by the vectoriser,
which will never attempt to splice vectors with out-of-bounds values.
Changing the definition also makes things simpler for codegen since we
can always assume that the index is valid.

This patch was created in response to review comments on D115863

Differential Revision: https://reviews.llvm.org/D115933
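
The index rule described above can be sketched as a small standalone check. This is only an illustration under stated assumptions, not the verifier code from the patch: isValidSpliceIndex, minElts and vscaleMin are hypothetical names, and the known minimum vector length VL is modelled as the minimum element count multiplied by the minimum vscale (1 when no vscale_range attribute is present).

```cpp
#include <cstdint>

// Hypothetical helper: returns true when idx lies in [-VL, VL-1], where
// VL = minElts * vscaleMin is the known minimum vector length.
constexpr bool isValidSpliceIndex(std::int64_t idx, std::int64_t minElts,
                                  std::int64_t vscaleMin = 1) {
  const std::int64_t vl = minElts * vscaleMin;
  return idx >= -vl && idx <= vl - 1;
}
```

With a minimum of 4 elements, indices -4 through 3 are accepted; a minimum vscale of 2 widens the accepted range to -8 through 7.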


# 0312fe29 07-Jan-2022 Nikita Popov <npopov@redhat.com>

[CodeGen] Support opaque pointers for inline asm

This is the last part of D116531. Fetch the type of the indirect
inline asm operand from the elementtype attribute, rather than
the pointer element type.

Fixes https://github.com/llvm/llvm-project/issues/52928.


# e4d17799 07-Jan-2022 Nikita Popov <npopov@redhat.com>

[IR] Add ConstraintInfo::hasArg() helper (NFC)

Checking whether a constraint corresponds to an argument is a
recurring pattern.


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 5dc8aaac 10-Aug-2021 Sami Tolvanen <samitolvanen@google.com>

[llvm][IR] Add no_cfi constant

With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces
function references with CFI jump table references, which is a problem
for low-level code that needs the address of the actual function body.

For example, in the Linux kernel, the code that sets up interrupt
handlers needs to take the address of the interrupt handler function
instead of the CFI jump table, as the jump table may not even be mapped
into memory when an interrupt is triggered.

This change adds the no_cfi constant type, which wraps function
references in a value that LowerTypeTestsModule::replaceCfiUses does not
replace.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Reviewed By: nickdesaulniers, pcc

Differential Revision: https://reviews.llvm.org/D108478


# 921e89c5 08-Dec-2021 Peter Waller <peter.waller@arm.com>

[SVE] Only combine (fneg (fma)) => FNMLA with nsz

-(Za + Zm * Zn) != (-Za + Zm * (-Zn))
when the FMA produces a zero output (e.g. all zero inputs can produce -0
output)

Add a PatFrag to check presence of nsz on the fneg, add tests which
ensure the combine does not fire in the absence of nsz.

See https://reviews.llvm.org/D90901 for a similar discussion on X86.

Differential Revision: https://reviews.llvm.org/D109525
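
The signed-zero difference can be reproduced on scalars with std::fma. The sketch below is a scalar illustration only, not the SVE pattern from the patch; negatedFma and fusedNegate are hypothetical names, and the default round-to-nearest mode with no fast-math flags is assumed.

```cpp
#include <cmath>

// Computes -(a + m * n): a fused multiply-add followed by a negation.
inline double negatedFma(double a, double m, double n) {
  return -std::fma(m, n, a);
}

// Computes (-a) + m * (-n): the negation folded into the operands.
inline double fusedNegate(double a, double m, double n) {
  return std::fma(m, -n, -a);
}
```

For a = -0.0 and m = n = 0.0, the first form yields -0.0 and the second yields +0.0, so the two are interchangeable only when the sign of zero may be ignored (nsz). For inputs whose result is non-zero, such as a = 1, m = 2, n = 3, both forms agree.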


# b0319ab7 30-Nov-2021 Fraser Cormack <fraser@codeplay.com>

[PR52475] Ensure a correct chain in copies to/from hidden sret parameter

This patch fixes an issue during SelectionDAG construction. When the
target is unable to lower the function's return value, a hidden sret
parameter is created. It is initialized and copied to a stored variable
(DemoteRegister) with CopyToReg and is later fetched with
CopyFromReg. The bug is that the chains used for each copy are
inconsistent, and thus in rare cases the scheduler may issue them out of
order.

The fix is to ensure that the CopyFromReg uses the DAG root which is set
as the chain corresponding to the initial CopyToReg.

Fixes https://llvm.org/PR52475

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D114795


# 652faed3 07-Dec-2021 David Sherwood <david.sherwood@arm.com>

[CodeGen] Improve SelectionDAGBuilder lowering code for get.active.lane.mask intrinsic

Previously we were using UADDO to generate a two-result value with
the unsigned addition and the overflow mask. We then combined the
overflow mask with the trip count comparison to get a result.
However, we don't need to do this - we can simply use a UADDSAT
saturating add node to add the vector index splat and the stepvector
together. Then we can just compare this to a splat of the trip count.
This results in overall better code quality for both Thumb2 and AArch64.

Differential Revision: https://reviews.llvm.org/D115354
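
A scalar model of the saturating-add approach might look like the sketch below. It is an illustration under assumptions, not the SelectionDAGBuilder code: uaddSat and activeLaneMask are hypothetical helpers, and 8-bit values are used so the overflow case is easy to see. Lane i is active when base + i, computed without wrapping, is below the trip count.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Unsigned saturating add on 8-bit values: clamps to 255 instead of wrapping.
inline std::uint8_t uaddSat(std::uint8_t a, std::uint8_t b) {
  const unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  const unsigned maxV = std::numeric_limits<std::uint8_t>::max();
  return static_cast<std::uint8_t>(sum > maxV ? maxV : sum);
}

// Lane i is active when uaddSat(base, i) < tripCount. A saturated sum equals
// the maximum value, which is never below any trip count, so lanes whose
// index would overflow are correctly reported as inactive.
inline std::vector<bool> activeLaneMask(std::uint8_t base,
                                        std::uint8_t tripCount,
                                        std::uint8_t lanes) {
  std::vector<bool> mask(lanes);
  for (std::uint8_t i = 0; i < lanes; ++i)
    mask[i] = uaddSat(base, i) < tripCount;
  return mask;
}
```

With base = 250, tripCount = 255 and 8 lanes, lanes 0 to 4 are active and lanes 5 to 7 are not. A wrapping add would compute 250 + 6 = 0 for lane 6 and wrongly mark it active, which is the case the overflow mask used to handle.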

