|
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
| #
6548b635 |
| 09-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
|
| #
ca33649a |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"
This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.
|
| #
e215a1e2 |
| 08-Nov-2024 |
Shilei Tian <i@tianshilei.me> |
[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)
|
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
| #
b1bcb7ca |
| 15-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.
Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.
show more ...
|
| #
adaff46d |
| 15-Jul-2024 |
dyung <douglas.yung@sony.com> |
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.
The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614
These bots have been broken for a day, so reverting to get everything
back to green.
show more ...
|
| #
78bc1b64 |
| 14-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.
show more ...
|
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
| #
9e9907f1 |
| 17-Jan-2024 |
Fangrui Song <i@maskray.me> |
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.
Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.
This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:
```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```
show more ...
|
| #
7b1e4239 |
| 18-Dec-2023 |
Simon Pilgrim <RKSimon@users.noreply.github.com> |
[DAG] Fold (vt trunc (extload (vt x))) -> (vt load x) (#75229)
We were only folding cases which remained extloads, but DAG.getExtLoad can also handle the cases which don't need to extend at all (we
[DAG] Fold (vt trunc (extload (vt x))) -> (vt load x) (#75229)
We were only folding cases which remained extloads, but DAG.getExtLoad can also handle the cases which don't need to extend at all (we just can't do truncloads).
reduceLoadWidth can handle this for scalar loads, but not for vectors.
Noticed while triaging D152928
show more ...
|
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0 |
|
| #
806761a7 |
| 11-Sep-2023 |
Fangrui Song <i@maskray.me> |
[test] Change llc -march= to -mtriple=
The issue is uncovered by #47698: for IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture p
[test] Change llc -march= to -mtriple=
The issue is uncovered by #47698: for IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. riscv64-apple-darwin.
Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly.
show more ...
|
|
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
e70ae0f4 |
| 07-Jan-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
DAG/GlobalISel: Fix broken/redundant setting of MODereferenceable
This was incorrectly setting dereferenceable on unaligned operands. getLoadMemOperandFlags does the alignment dereferenceabilty chec
DAG/GlobalISel: Fix broken/redundant setting of MODereferenceable
This was incorrectly setting dereferenceable on unaligned operands. getLoadMemOperandFlags does the alignment dereferenceabilty check without alignment, and then both paths went on to check isDereferenceableAndAlignedPointer. Make getLoadMemOperandFlags check isDereferenceableAndAlignedPointer, and remove the second call.
show more ...
|
| #
d85e849f |
| 02-Dec-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Convert some assorted tests to opaque pointers
|
|
Revision tags: llvmorg-15.0.6 |
|
| #
595a0884 |
| 17-Nov-2022 |
Mateja Marjanovic <mateja.marjanovic@amd.com> |
[AMDGPU] Add support for new LLVM vector types
Add VReg, AReg and SReg on AMDGPU for bit widths: 288, 320, 352 and 384.
Differential Revision: https://reviews.llvm.org/D138205
|
|
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1 |
|
| #
4c4db816 |
| 30-Jul-2022 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Extend SILoadStoreOptimizer to s_load instructions
Apply merging to s_load as is done for s_buffer_load.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D130742
|
|
Revision tags: llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
bb1fe369 |
| 19-Jan-2022 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Make v8i16/v8f16 legal
Differential Revision: https://reviews.llvm.org/D117721
|
|
Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
da067ed5 |
| 10-Nov-2021 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Set most sched model resource's BufferSize to one
Using a BufferSize of one for memory ProcResources will result in better ILP since it more accurately models the dependencies between memor
[AMDGPU] Set most sched model resource's BufferSize to one
Using a BufferSize of one for memory ProcResources will result in better ILP since it more accurately models the dependencies between memory ops and their consumers on an in-order processor. After this change, the scheduler will treat the data edges from loads as blocking so that stalls are guaranteed when waiting for data to be retreaved from memory. Since we don't actually track waitcnt here, this should do a better job at modeling their behavior.
Practically, this means that the scheduler will trigger the 'STALL' heuristic more often.
This type of change needs to be evaluated experimentally. Preliminary results are positive.
Fixes: SWDEV-282962
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D114777
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
3ce1b963 |
| 08-Sep-2021 |
Joe Nash <Joseph.Nash@amd.com> |
[AMDGPU] Switch PostRA sched to MachineSched
Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D1095
[AMDGPU] Switch PostRA sched to MachineSched
Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D109536
Change-Id: Ia86ba2ae168f12fb34b4d8efdab491f84d936cde
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1 |
|
| #
6e712fdf |
| 29-Jul-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Autogenerate checks in kernel-args.ll
Differential Revision: https://reviews.llvm.org/D107052
|
|
Revision tags: llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
d2e52eec |
| 10-Nov-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Select global saddr mode from SGPR pointer
Use the 64-bit SGPR base with a 0 offset, since it's 1 fewer instruction to materialize the 0 vs. the 64-bit copy.
|
| #
3fdf3b15 |
| 14-Oct-2020 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Update AMDHSA code object version handling
Differential Revision: https://reviews.llvm.org/D89076
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2 |
|
| #
33fd4a18 |
| 30-Jul-2020 |
hsmahesha <mahesha.comp@gmail.com> |
[AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic
Get rid of all fixmes and base heuristic on `num-clustered-dwords`. The main intuition behind this is as follows. The existing
[AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic
Get rid of all fixmes and base heuristic on `num-clustered-dwords`. The main intuition behind this is as follows. The existing heuristic roughly summarizes as below:
* Assume, all the mem ops instructions participating in the clustering process, loads/stores same num bytes * If num bytes loaded by each mem op is 4 bytes, then cluster at max 5 mem ops, that is at max 20 bytes * If num bytes loaded by each mem op is 8 bytes, then cluster at max 3 mem ops, that is at max 24 bytes * If num bytes loaded by each mem op is 16 bytes, then cluster at max 2 mem ops, that is at max 32 bytes
So, we need to make sure that the new heuristic do not completey deviate away from the above one, and it properly handles both the sub-word loads and the wide loads.
Reviewed By: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D84354
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
1168119c |
| 07-May-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Start interpreting byref on kernel arguments
These are treated identically to value aggregates placed in the kernel argument list. A %struct.foo or %struct.foo addrspace(4)* byref(sizeof(%st
AMDGPU: Start interpreting byref on kernel arguments
These are treated identically to value aggregates placed in the kernel argument list. A %struct.foo or %struct.foo addrspace(4)* byref(sizeof(%struct.foo)) align(alignof(%struct.foo)) argument should produce the same offsets and argument metadata.
This handles all 3 kernel ABI implementations, and the two HSA metadata emission paths.
show more ...
|
| #
49055360 |
| 17-Jul-2020 |
hsmahesha <mahesha.comp@gmail.com> |
Revert "[AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size"
This reverts commit cc9d69385659be32178506a38b4f2e112ed01ad4.
|
| #
cc9d6938 |
| 23-Jun-2020 |
Your Name <mahesha.comp@gmail.com> |
[AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size
Summary: Make use of both the - (1) clustered bytes and (2) cluster length, to decide on the max number of mem o
[AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size
Summary: Make use of both the - (1) clustered bytes and (2) cluster length, to decide on the max number of mem ops that can be clustered. On an average, when loads are dword or smaller, consider `5` as max threshold, otherwise `4`. This heuristic is purely based on different experimentation conducted, and there is no analytical logic here.
Reviewers: foad, rampitec, arsenm, vpykhtin
Reviewed By: rampitec
Subscribers: llvm-commits, kerbowa, hiraditya, t-tye, Anastasia, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, thakis
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82393
show more ...
|
| #
7410571c |
| 09-Jun-2020 |
hsmahesha <mahesha.comp@gmail.com> |
Revert "[AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size"
This reverts commit 40a632a335119fe3e8d5d500a5d2641998314ecb.
|
| #
40a632a3 |
| 09-Jun-2020 |
hsmahesha <mahesha.comp@gmail.com> |
[AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size
Summary: Make use of both the - (1) clustered bytes and (2) cluster length, to decide on the max number of mem o
[AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size
Summary: Make use of both the - (1) clustered bytes and (2) cluster length, to decide on the max number of mem ops that can be clustered. On an average, when loads are dword or smaller, consider `5` as max threshold, otherwise `4`. This heuristic is purely based on different experimentation conducted, and there is no analytical logic here.
Reviewers: foad, rampitec, arsenm, vpykhtin
Reviewed By: foad, rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, Anastasia, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81085
show more ...
|