Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
1941f341 |
| 18-Dec-2024 |
Sergei Barannikov <barannikov88@gmail.com> |
[TableGen][GISel] Import more "multi-level" patterns (#120332)
Previously, if the destination DAG has an untyped leaf, we would import the pattern only if that leaf is defined by the *top-level* sou
[TableGen][GISel] Import more "multi-level" patterns (#120332)
Previously, if the destination DAG has an untyped leaf, we would import the pattern only if that leaf is defined by the *top-level* source DAG. This is an unnecessary restriction.
Here is an example of such pattern: ``` def : Pat<(add (mul v8i16:$vA, v8i16:$vB), v8i16:$vC), (VMLADDUHM $vA, $vB, $vC)>; ```
Previously, it failed to import because `add` doesn't define neither `$vA` nor `$vB`.
This change reduces the number of skipped patterns as follows:
``` AArch64: 8695 -> 8548 (-147) AMDGPU: 11333 -> 11240 (-93) ARM: 4297 -> 4278 (-1) PowerPC: 3955 -> 3010 (-945) ```
Other GISel-enabled targets are unaffected.
show more ...
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
ba4bcce5 |
| 11-Sep-2024 |
Thorsten Schütt <schuett@gmail.com> |
[GlobalIsel] Combine trunc of binop (#107721)
trunc (binop X, C) --> binop (trunc X, trunc C) --> binop (trunc X, C`)
Try to narrow the width of math or bitwise logic instructions by pulling
a
[GlobalIsel] Combine trunc of binop (#107721)
trunc (binop X, C) --> binop (trunc X, trunc C) --> binop (trunc X, C`)
Try to narrow the width of math or bitwise logic instructions by pulling
a truncate ahead of binary operators.
Vx and Nx cores consider 32-bit and 64-bit basic arithmetic equal in
costs.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
9e9907f1 |
| 17-Jan-2024 |
Fangrui Song <i@maskray.me> |
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.
Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.
This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:
```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4 |
|
#
fe8335ba |
| 30-Oct-2023 |
Stanislav Mekhanoshin <rampitec@users.noreply.github.com> |
[AMDGPU] Select 64-bit imm moves if can be encoded as 32 bit operand (#70395)
This allows folding of 64-bit operands if fit into 32-bit. Fixes
https://github.com/llvm/llvm-project/issues/67781
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
3f4055de |
| 09-Jan-2023 |
Chen Zheng <czhengsz@cn.ibm.com> |
[GlobalISelEmitter] handle operand without MVT/class
There are some patterns in td files without MVT/class set for some operands in target pattern that are from the source pattern. This prevents Glo
[GlobalISelEmitter] handle operand without MVT/class
There are some patterns in td files without MVT/class set for some operands in target pattern that are from the source pattern. This prevents GlobalISelEmitter from adding them as a valid rule, because the target child operand is an unsupported kind operand. For now, for a leaf child, only IntInit and DefInit are handled in GlobalISelEmitter.
This issue can be workaround by adding MVT/class to the patterns in the td files, like the workarounds for patterns anyext and setcc in PPCInstrInfo.td in D140878.
To avoid adding the same workarounds for other patterns in td files, this patch tries to handle the UnsetInit case in GlobalISelEmitter.
Adding the new handling allows us to remove the workarounds in the td files and also generates many selection rules for PPC target.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D141247
show more ...
|
#
d9a6fc82 |
| 24-Jan-2023 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Run unmerge combines post regbankselect
RegBankSelect can insert G_UNMERGE_VALUES in a lot of places which left us with a lot of unmerge/merge pairs that could be simplified. These often go
[AMDGPU] Run unmerge combines post regbankselect
RegBankSelect can insert G_UNMERGE_VALUES in a lot of places which left us with a lot of unmerge/merge pairs that could be simplified. These often got in the way of pattern matching and made codegen worse.
This patch: - Makes the necessary changes to the merge/unmerge combines so they can run post RegBankSelect - Adds relevant unmerge combines to the list of RegBankSelect combines for AMDGPU - Updates some tablegen patterns that were missing explicit cross-regbank copies (V_BFI patterns were causing constant bus violations with this change).
This seems to be mostly beneficial for code quality.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D142192
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
1416744f |
| 11-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Implement computeKnownBits for overflow bool results
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
078da26b |
| 08-Nov-2021 |
Abinav Puthan Purayil <abinav.puthanpurayil@amd.com> |
[AMDGPU] Check for unneeded shift mask in shift PatFrags.
The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnnee
[AMDGPU] Check for unneeded shift mask in shift PatFrags.
The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in the shift PatFrag predicates.
Differential Revision: https://reviews.llvm.org/D113448
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
#
61e3b9fe |
| 22-Sep-2021 |
Abinav Puthan Purayil <abinav.puthanpurayil@amd.com> |
[AMDGPU] Add constrained shift pattern matches.
The motivation for this is due to clang's conformance to https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#operators-shift
[AMDGPU] Add constrained shift pattern matches.
The motivation for this is due to clang's conformance to https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#operators-shift which makes clang emit (<shift> a, (and b, <width> - 1)) for `a <shift> b` in OpenCL where a is an int of bit width <width>.
Differential revision: https://reviews.llvm.org/D110231
show more ...
|