Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
6548b635 |
| 09-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
|
#
ca33649a |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.
|
#
17f3e009 |
| 08-Nov-2024 |
Craig Topper <craig.topper@sifive.com> |
Recommit "[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)"
The increase in fallbacks that was previously reported were not caused by this change.
Original descripti
Recommit "[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)"
The increase in fallbacks that was previously reported were not caused by this change.
Original description:
This matches InstCombine and DAGCombine.
RISC-V only has an ADDI instruction so without this we need additional patterns to do the conversion.
Some of the AMDGPU tests look like possible regressions. Maybe some patterns from isel aren't imported.
show more ...
|
#
e215a1e2 |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)
|
#
cff2199e |
| 06-Nov-2024 |
Craig Topper <craig.topper@sifive.com> |
Revert "[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)"
This reverts commit 999dfb2067eb75609b735944af876279025ac171.
I received a report that his may have increas
Revert "[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)"
This reverts commit 999dfb2067eb75609b735944af876279025ac171.
I received a report that his may have increased fallbacks on AArch64.
show more ...
|
#
999dfb20 |
| 05-Nov-2024 |
Craig Topper <craig.topper@sifive.com> |
[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)
This matches InstCombine and DAGCombine.
RISC-V only has an ADDI instruction so without this we need additional
p
[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)
This matches InstCombine and DAGCombine.
RISC-V only has an ADDI instruction so without this we need additional
patterns to do the conversion.
Some of the AMDGPU tests look like possible regressions. Maybe some
patterns from isel aren't imported.
show more ...
|
Revision tags: llvmorg-19.1.3 |
|
#
3277c7cd |
| 21-Oct-2024 |
Stanislav Mekhanoshin <rampitec@users.noreply.github.com> |
[AMDGPU] Skip VGPR deallocation for waveslot limited kernels (#112765)
MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's
been identified this message is only really needed for
[AMDGPU] Skip VGPR deallocation for waveslot limited kernels (#112765)
MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's
been identified this message is only really needed for VGPR limited
kernels. A kernel becomes VGPR limited if a total number of VGPRs per
SIMD / number of used VGPRs is more than a number of wave slots.
show more ...
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4 |
|
#
7b7b0b95 |
| 29-Aug-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Check if is_fpclass is custom, instead of isLegalOrCustom (#105577)
For some reason, isOperationLegalOrCustom is not the same as isOperationLegal || isOperationCustom. Unfortunately, it checks
DAG: Check if is_fpclass is custom, instead of isLegalOrCustom (#105577)
For some reason, isOperationLegalOrCustom is not the same as isOperationLegal || isOperationCustom. Unfortunately, it checks if the type is legal which makes it uesless for custom lowering on non-legal types (which is always ppcf128).
Really the DAG builder shouldn't be going to expand this in the builder, it makes it difficult to work with. It's only here to work around the DAG requiring legal integer types the same size as the FP type after type legalization.
show more ...
|
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
b1bcb7ca |
| 15-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.
Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.
show more ...
|
#
adaff46d |
| 15-Jul-2024 |
dyung <douglas.yung@sony.com> |
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.
The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614
These bots have been broken for a day, so reverting to get everything
back to green.
show more ...
|
#
78bc1b64 |
| 14-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
f4f772ce |
| 16-Apr-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Stop reserving $vcc_hi in wave32 mode (#87783)
This gives us one extra SGPR to play with. The comment suggested that it
could cause bugs, but I have tested it with Vulkan CTS with the defa
[AMDGPU] Stop reserving $vcc_hi in wave32 mode (#87783)
This gives us one extra SGPR to play with. The comment suggested that it
could cause bugs, but I have tested it with Vulkan CTS with the default
wave size for compute shaders set to 32 and did not find any problems.
show more ...
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2 |
|
#
364f7813 |
| 06-Feb-2024 |
Thorsten Schütt <schuett@gmail.com> |
[GlobalIsel] Combine logic of icmps (#77855)
Inspired by InstCombinerImpl::foldAndOrOfICmpsUsingRanges with some
adaptations to MIR.
|
Revision tags: llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
11bf02e0 |
| 18-Jan-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Fix ABI lowering with FP promote in strictfp functions (#74405)
This was emitting non-strict casts in ABI contexts for illegal
types.
|
#
9e9907f1 |
| 17-Jan-2024 |
Fangrui Song <i@maskray.me> |
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.
Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.
This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:
```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5 |
|
#
a4196666 |
| 13-Nov-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Revert "Preliminary patch for divergence driven instruction selection. Operands Folding 1." (#71710)
This reverts commit 201f892b3b597f24287ab6a712a286e25a45a7d9.
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
d86a7d63 |
| 28-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Add constant fold combine for zext/sext/anyext
Could use more work for vectors.
https://reviews.llvm.org/D156534
|
#
76c22b18 |
| 25-Jul-2023 |
Kevin P. Neal <kevin.neal@sas.com> |
[FPEnv][AMDGPU] Correct strictfp tests.
Correct AMDGPU strictfp tests to follow the rules documented in the LangRef: https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics
Mostly
[FPEnv][AMDGPU] Correct strictfp tests.
Correct AMDGPU strictfp tests to follow the rules documented in the LangRef: https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics
Mostly these tests just needed the strictfp attribute on function definitions. I've also removed the strictfp attribute from uses of the constrained intrinsics because it comes by default since D154991, but I only did this in tests I was changing anyway.
I also removed attributes added to declare lines of intrinsics. The attributes of intrinsics cannot be changed in a test so I eliminated attempts to do so.
Test changes verified with D146845.
show more ...
|
Revision tags: llvmorg-18-init |
|
#
7fa7a08f |
| 19-Jul-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Insert s_nop before s_sendmsg sendmsg(MSG_DEALLOC_VGPRS)
Differential Revision: https://reviews.llvm.org/D155681
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
b59022b4 |
| 06-Feb-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Handle lowering of unordered fcZero|fcSubnormal to fcmp
|
#
64df9573 |
| 02-Feb-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG: Handle inversion of fcSubnormal | fcZero
There are a number of more test combinations here that can be done together and reduce the number of instructions.
https://reviews.llvm.org/D143191
|
#
61820f8b |
| 02-Feb-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal
Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel cu
CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal
Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization.
https://reviews.llvm.org/D143182
show more ...
|
#
f2c164c8 |
| 21-Jun-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Do not wait for vscnt on function entry and return
SIInsertWaitcnts inserts waitcnt instructions to resolve data dependencies. The GFX10+ vscnt (VMEM store count) counter is never used in t
[AMDGPU] Do not wait for vscnt on function entry and return
SIInsertWaitcnts inserts waitcnt instructions to resolve data dependencies. The GFX10+ vscnt (VMEM store count) counter is never used in this way. It is only used to resolve memory dependencies, and that is handled by SIMemoryLegalizer. Hence there is no need to conservatively wait for vscnt to be 0 on function entry and before returns.
Differential Revision: https://reviews.llvm.org/D153537
show more ...
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
3f4055de |
| 09-Jan-2023 |
Chen Zheng <czhengsz@cn.ibm.com> |
[GlobalISelEmitter] handle operand without MVT/class
There are some patterns in td files without MVT/class set for some operands in target pattern that are from the source pattern. This prevents Glo
[GlobalISelEmitter] handle operand without MVT/class
There are some patterns in td files without MVT/class set for some operands in target pattern that are from the source pattern. This prevents GlobalISelEmitter from adding them as a valid rule, because the target child operand is an unsupported kind operand. For now, for a leaf child, only IntInit and DefInit are handled in GlobalISelEmitter.
This issue can be workaround by adding MVT/class to the patterns in the td files, like the workarounds for patterns anyext and setcc in PPCInstrInfo.td in D140878.
To avoid adding the same workarounds for other patterns in td files, this patch tries to handle the UnsetInit case in GlobalISelEmitter.
Adding the new handling allows us to remove the workarounds in the td files and also generates many selection rules for PPC target.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D141247
show more ...
|
#
9356ec15 |
| 02-Feb-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Reorder case handling for is.fpclass legalization
Subnormal and zero checks can be combined into one, so move the code closer to reduce the diff in a future change.
|