Revision tags: llvmorg-18.1.1 |
|
#
60822637 |
| 06-Mar-2024 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
Restore "Implement convergence control in MIR using SelectionDAG (#71785)"
This restores commit c7fdd8c11e54585dc9d15d63de9742067e0506b9. Previously reverted in f010b1bef4dda2c7082cbb41dbabf1f149cce
Restore "Implement convergence control in MIR using SelectionDAG (#71785)"
This restores commit c7fdd8c11e54585dc9d15d63de9742067e0506b9. Previously reverted in f010b1bef4dda2c7082cbb41dbabf1f149cce306.
LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows:
1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR.
The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR.
The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.
show more ...
|
#
d95a0d7c |
| 05-Mar-2024 |
Yeting Kuo <46629943+yetingk@users.noreply.github.com> |
[DAG] Teach SelectionDAGBuilder to read parameter alignment of compressstore/expandload. (#83763)
Previously SelectionDAGBuilder used ABI alignment for
compressstore/expandload. This patch allows S
[DAG] Teach SelectionDAGBuilder to read parameter alignment of compressstore/expandload. (#83763)
Previously SelectionDAGBuilder used ABI alignment for
compressstore/expandload. This patch allows SelectionDAGBuilder to use
parameter alignment like vp intrinsics. This does not follow the
original code to default use vector type alignment, since it is possible
implemented to unaligned vector alignment.
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2 |
|
#
a4951eca |
| 29-Sep-2023 |
Noah Goldstein <goldstein.w.n@gmail.com> |
Recommit "[X86] Don't always separate conditions in `(br (and/or cond0, cond1))` into separate branches" (2nd Try)
Changes in Recommit: 1) Fix non-determanism by using `SmallMapVector` instead o
Recommit "[X86] Don't always separate conditions in `(br (and/or cond0, cond1))` into separate branches" (2nd Try)
Changes in Recommit: 1) Fix non-determanism by using `SmallMapVector` instead of `SmallPtrSet`. 2) Fix bug in dependency pruning where we discounted the actual `and/or` combining the two conditions. This lead to over pruning.
Closes #81689
show more ...
|
#
f010b1be |
| 04-Mar-2024 |
Mitch Phillips <mitchp@google.com> |
Revert "Restore "Implement convergence control in MIR using SelectionDAG (#71785)""
This reverts commit c7fdd8c11e54585dc9d15d63de9742067e0506b9.
Reason: Broke the sanitizer buildbots. See the comm
Revert "Restore "Implement convergence control in MIR using SelectionDAG (#71785)""
This reverts commit c7fdd8c11e54585dc9d15d63de9742067e0506b9.
Reason: Broke the sanitizer buildbots. See the comments at https://github.com/llvm/llvm-project/pull/71785 for more information.
show more ...
|
#
c7fdd8c1 |
| 04-Mar-2024 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
Restore "Implement convergence control in MIR using SelectionDAG (#71785)"
Original commit 79889734b940356ab3381423c93ae06f22e772c9. Perviously reverted in commit a2afcd5721869d1d03c8146bae3885b3385
Restore "Implement convergence control in MIR using SelectionDAG (#71785)"
Original commit 79889734b940356ab3381423c93ae06f22e772c9. Perviously reverted in commit a2afcd5721869d1d03c8146bae3885b3385ba15e.
LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows:
1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR.
The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR.
The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.
show more ...
|
#
5b4759f9 |
| 03-Mar-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Revert "[X86] Don't always separate conditions in `(br (and/or cond0, cond1))` into separate branches"
This has been buggy for a while.
Reverts #81689 This reverts commit ae76dfb74701e05e5ab4be194e
Revert "[X86] Don't always separate conditions in `(br (and/or cond0, cond1))` into separate branches"
This has been buggy for a while.
Reverts #81689 This reverts commit ae76dfb74701e05e5ab4be194e20e49f10768e46.
show more ...
|
#
ae76dfb7 |
| 29-Sep-2023 |
Noah Goldstein <goldstein.w.n@gmail.com> |
[X86] Don't always separate conditions in `(br (and/or cond0, cond1))` into separate branches
It makes sense to split if the cost of computing `cond1` is high (proportionally to how likely `cond0` i
[X86] Don't always separate conditions in `(br (and/or cond0, cond1))` into separate branches
It makes sense to split if the cost of computing `cond1` is high (proportionally to how likely `cond0` is), but it doesn't really make sense to introduce a second branch if its only a few instructions.
Splitting can also get in the way of potentially folding patterns.
This patch introduces some logic to try to check if the cost of computing `cond1` is relatively low, and if so don't split the branches.
Modest improvement on clang bootstrap build: https://llvm-compile-time-tracker.com/compare.php?from=79ce933114e46c891a5632f7ad4a004b93a5b808&to=978278eabc0bafe2f390ca8fcdad24154f954020&stat=cycles Average stage2-O3: 0.59% Improvement (cycles) Average stage2-O0-g: 1.20% Improvement (cycles)
Likewise on llvm-test-suite on SKX saw a net 0.84% improvement (cycles)
There is also a modest compile time improvement with this patch: https://llvm-compile-time-tracker.com/compare.php?from=79ce933114e46c891a5632f7ad4a004b93a5b808&to=978278eabc0bafe2f390ca8fcdad24154f954020&stat=instructions%3Au
Note that the stage2 instruction count increases is expected, this patch trades instructions for decreasing branch-misses (which is proportionately lower): https://llvm-compile-time-tracker.com/compare.php?from=79ce933114e46c891a5632f7ad4a004b93a5b808&to=978278eabc0bafe2f390ca8fcdad24154f954020&stat=branch-misses
NB: This will also likely help for APX targets with the new `CCMP` and `CTEST` instructions.
Closes #81689
show more ...
|
#
62d0c01c |
| 27-Feb-2024 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG] Remove pointer from MMO for VP strided load/store. (#82667)
MachineIR alias analysis assumes that only bytes after the pointer will
be accessed. This is incorrect if the stride is ne
[SelectionDAG] Remove pointer from MMO for VP strided load/store. (#82667)
MachineIR alias analysis assumes that only bytes after the pointer will
be accessed. This is incorrect if the stride is negative.
This is causing miscompiles in our downstream after SLP started making
strided loads.
Fixes #82657
show more ...
|
#
8a164220 |
| 23-Feb-2024 |
Orlando Cazalet-Hyams <orlando.hyams@sony.com> |
[RemoveDIs] Add DPLabels support [3a/3] (#82633)
Patch 2 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds the DPLabel class, which is the RemoveDIs llvm.dbg.label
[RemoveDIs] Add DPLabels support [3a/3] (#82633)
Patch 2 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds the DPLabel class, which is the RemoveDIs llvm.dbg.label
equivalent.
1. Add DbgRecord base class for DPValue and the not-yet-added
DPLabel class.
2. Add the DPLabel class.
-> 3. Add support to passes.
The next patch, #82639, will enable conversion between dbg.labels and DPLabels.
AssignemntTrackingAnalysis support could have gone two ways:
1. Have the analysis store a DPLabel representation in its results -
SelectionDAGBuilder reads the analysis results and ignores all DbgRecord
kinds.
2. Ignore DPLabels in the analysis - SelectionDAGBuilder reads the analysis
results but still needs to iterate over DPLabels from the IR.
I went with option 2 because it's less work and is no less correct than 1. It's
worth noting that causes labels to sink to the bottom of packs of debug records.
e.g., [value, label, value] becomes [value, value, label]. This shouldn't be a
problem because labels and variable locations don't have an ordering requirement.
The ordering between variable locations is maintained and the label movement is
deterministic
show more ...
|
#
28fb2b33 |
| 21-Feb-2024 |
Paul Walker <paul.walker@arm.com> |
[LLVM][SelectionDAG] Reduce number of ComputeValueVTs variants. (#75614)
This is another step in the direction of fixing the `Fixed(0) !=
Scalable(0)` bugbear, although whilst weird I don't believe
[LLVM][SelectionDAG] Reduce number of ComputeValueVTs variants. (#75614)
This is another step in the direction of fixing the `Fixed(0) !=
Scalable(0)` bugbear, although whilst weird I don't believe it's causing
us any real issues.
show more ...
|
#
a2afcd57 |
| 21-Feb-2024 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
Revert "Implement convergence control in MIR using SelectionDAG (#71785)"
This reverts commit 79889734b940356ab3381423c93ae06f22e772c9.
Encountered multiple buildbot failures.
|
#
79889734 |
| 21-Feb-2024 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
Implement convergence control in MIR using SelectionDAG (#71785)
LLVM function calls carry convergence control tokens as operand bundles, where
the tokens themselves are produced by convergence con
Implement convergence control in MIR using SelectionDAG (#71785)
LLVM function calls carry convergence control tokens as operand bundles, where
the tokens themselves are produced by convergence control intrinsics. This patch
implements convergence control tokens in MIR as follows:
1. Introduce target-independent ISD opcodes and MIR opcodes for convergence
control intrinsics.
2. Model token values as untyped virtual registers in MIR.
The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a
corresponding machine opcode with the same spelling. This glues the convergence
control token to SDNodes that represent calls to intrinsics. The glued token is
later translated to an implicit argument in the MIR.
The lowering of calls to user-defined functions is target-specific. On AMDGPU,
the convergence control operand bundle at a non-intrinsic call is translated to
an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment
converts this explicit argument to an implicit argument on the SI_CALL
instruction.
show more ...
|
#
ababa964 |
| 20-Feb-2024 |
Orlando Cazalet-Hyams <orlando.hyams@sony.com> |
[RemoveDIs][NFC] Introduce DbgRecord base class [1/3] (#78252)
Patch 1 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds a new base class
-> 1. Add DbgRecord
[RemoveDIs][NFC] Introduce DbgRecord base class [1/3] (#78252)
Patch 1 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds a new base class
-> 1. Add DbgRecord base class for DPValue and the not-yet-added
DPLabel class.
2. Add the DPLabel class.
3. Enable dbg.label conversion and add support to passes.
Patches 1 and 2 are NFC.
In the near future we also will rename DPValue to DbgVariableRecord and
DPLabel to DbgLabelRecord, at which point we'll overhaul the function
names too. The name DPLabel keeps things consistent for now.
show more ...
|
#
0215d2c5 |
| 08-Jan-2024 |
Tim Northover <tnorthover@apple.com> |
arm64_32: extend @llvm.stackguard call to in-DAG 64-bits before handing off
Pointers are 64-bits in the DAG, so we need to extend the result of loading the cookie when building the DAG.
|
#
11fcae69 |
| 13-Feb-2024 |
Joseph Huber <huberjn@outlook.com> |
[LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (#81331)
Summary: This patch adds a new intrinsic and builtin function mirroring the existing `__builtin_readcyclec
[LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (#81331)
Summary: This patch adds a new intrinsic and builtin function mirroring the existing `__builtin_readcyclecounter`. The difference is that this implementation targets a separate counter that some targets have which returns a fixed frequency clock that can be used to determine elapsed time, this is different compared to the cycle counter which often has variable frequency.
This patch only adds support for the NVPTX and AMDGPU targets.
This is done as a new and separate builtin rather than an argument to `readcyclecounter` to avoid needing to change existing code and to make the separation more explicit.
show more ...
|
#
30845e8a |
| 23-Jan-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[RemoveDIs][DebugInfo] Handle DPVAssigns in Assignment Tracking excluding lowering (#78982)
This patch adds support for DPVAssigns across all of
AssignmentTrackingAnalysis except for AssignmentTrac
[RemoveDIs][DebugInfo] Handle DPVAssigns in Assignment Tracking excluding lowering (#78982)
This patch adds support for DPVAssigns across all of
AssignmentTrackingAnalysis except for AssignmentTrackingLowering, which
is implemented in a separate patch. This patch includes handling
DPValues in MemLocFragFill, the removal of redundant DPValues as part of
AssignmentTrackingAnalysis (which is different to the version in
`BasicBlockUtils.cpp`), and preventing the DPVAssigns from being
directly emitted in SelectionDAG (just as we don't emit llvm.dbg.assigns
directly, but receive a set of locations from
AssignmentTrackingAnalysis' output).
show more ...
|
#
8c1b7fba |
| 22-Jan-2024 |
Jeremy Morse <jeremy.morse@sony.com> |
[SelectionDAG][DebugInfo][RemoveDIs] Handle entry value variables in DPValues too (#78726)
This patch abstracts visitEntryValueDbgValue to deal with the substance
of variable locations (Value, Var,
[SelectionDAG][DebugInfo][RemoveDIs] Handle entry value variables in DPValues too (#78726)
This patch abstracts visitEntryValueDbgValue to deal with the substance
of variable locations (Value, Var, Expr, DebugLoc) rather than how
they're stored. That allows us to call it from handleDebugValue, which
is similarly abstracted. This allows the entry-value behaviour (see the
test) to be supported with non-instruction debug-info too!.
show more ...
|
#
11bf02e0 |
| 18-Jan-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Fix ABI lowering with FP promote in strictfp functions (#74405)
This was emitting non-strict casts in ABI contexts for illegal
types.
|
#
197214e3 |
| 09-Jan-2024 |
Alex Bradbury <asb@igalia.com> |
[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710)
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZextVal();`
[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710)
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZextVal();`
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
show more ...
|
#
535d8e8b |
| 07-Jan-2024 |
Amara Emerson <amara@apple.com> |
NFC: Extract switch lowering binary tree splitting code from DAG into SwitchLoweringUtils.
This will help re-use this code with the upcoming GlobalISel implementation of this optimization.
|
#
7954c571 |
| 04-Jan-2024 |
Jannik Silvanus <37809848+jasilvanus@users.noreply.github.com> |
[IR] Fix GEP offset computations for vector GEPs (#75448)
Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of
[IR] Fix GEP offset computations for vector GEPs (#75448)
Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of
vector GEPs need to be computed differently than offsets of array GEPs.
This PR fixes many places that rely on an incorrect pattern
that always relies on `DL.getTypeAllocSize(GTI.getIndexedType())`.
We replace these by usages of `GTI.getSequentialElementStride(DL)`,
which is a new helper function added in this PR.
This changes behavior for GEPs into vectors with element types for which
the (bit) size and alloc size is different. This includes two cases:
* Types with a bit size that is not a multiple of a byte, e.g. i1.
GEPs into such vectors are questionable to begin with, as some elements
are not even addressable.
* Overaligned types, e.g. i16 with 32-bit alignment.
Existing tests are unaffected, but a miscompilation of a new test is fixed.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
show more ...
|
#
bbd57e18 |
| 03-Jan-2024 |
Craig Topper <craig.topper@sifive.com> |
[SelectionDAG] Add initial plumbing for the disjoint flag. (#76751)
This copies the flag from IR to the SDNode in SelectionDAGBuilder, clears
the flag in SimplifyDemandedBits, and adds it to canCre
[SelectionDAG] Add initial plumbing for the disjoint flag. (#76751)
This copies the flag from IR to the SDNode in SelectionDAGBuilder, clears
the flag in SimplifyDemandedBits, and adds it to canCreateUndefOrPoison.
Uses of the flag will come in later patches.
show more ...
|
#
41cb686d |
| 25-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Use range-based for loops (NFC)
|
#
5ee08813 |
| 12-Dec-2023 |
Orlando Cazalet-Hyams <orlando.hyams@sony.com> |
[DebugInfo][RemoveDIs] Handle dbg.declares in SelectionDAGISel (#73496)
This is a boring mechanical update to support DPValues that look like
dbg.declares in SelectionDAG.
The tests will become
[DebugInfo][RemoveDIs] Handle dbg.declares in SelectionDAGISel (#73496)
This is a boring mechanical update to support DPValues that look like
dbg.declares in SelectionDAG.
The tests will become "live" once #74090 lands (see for more info).
show more ...
|
#
b3000ecb |
| 09-Dec-2023 |
Jay Foad <jay.foad@amd.com> |
[SelectionDAG] Fix typo in comment
|