| #
b1bcb7ca |
| 15-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799
Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.
Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.
show more ...
|
| #
adaff46d |
| 15-Jul-2024 |
dyung <douglas.yung@sony.com> |
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0
Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)
This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and
78bc1b64a6dc3fb6191355a5e1b502be8b3668e7.
The test CodeGenHIP/default-attributes.hip is failing on multiple bots
even after the attempted fix including the following:
- https://lab.llvm.org/buildbot/#/builders/3/builds/1473
- https://lab.llvm.org/buildbot/#/builders/65/builds/1380
- https://lab.llvm.org/buildbot/#/builders/161/builds/595
- https://lab.llvm.org/buildbot/#/builders/154/builds/1372
- https://lab.llvm.org/buildbot/#/builders/133/builds/1547
- https://lab.llvm.org/buildbot/#/builders/81/builds/755
- https://lab.llvm.org/buildbot/#/builders/40/builds/570
- https://lab.llvm.org/buildbot/#/builders/13/builds/748
- https://lab.llvm.org/buildbot/#/builders/12/builds/1845
- https://lab.llvm.org/buildbot/#/builders/11/builds/1695
- https://lab.llvm.org/buildbot/#/builders/190/builds/1829
- https://lab.llvm.org/buildbot/#/builders/193/builds/962
- https://lab.llvm.org/buildbot/#/builders/23/builds/991
- https://lab.llvm.org/buildbot/#/builders/144/builds/2256
- https://lab.llvm.org/buildbot/#/builders/46/builds/1614
These bots have been broken for a day, so reverting to get everything
back to green.
show more ...
|
| #
78bc1b64 |
| 14-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
AMDGPU: Move attributor into optimization pipeline (#83131)
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.
show more ...
|
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
fac34c01 |
| 29-Nov-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Bulk update some call tests to use opaque pointers
|
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
203a1e36 |
| 17-Dec-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Remove AMDGPUFixFunctionBitcasts pass"
This reverts commit 8a85be807bd453eb9c88d0126c75fd5ea393f60d.
The unrelated failure this exposed was fixed.
|
| #
8a85be80 |
| 16-Dec-2021 |
Ron Lieberman <ron.lieberman@amd.com> |
Revert "AMDGPU: Remove AMDGPUFixFunctionBitcasts pass"
Offload abort in Nekbone
This reverts commit 2b4876157562bc76e86f193d371348993905bc61.
|
| #
2b487615 |
| 14-Dec-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove AMDGPUFixFunctionBitcasts pass
This was a workaround for not supporting indirect calls when instcombine didn't eliminate constant expression casts of the callee at -O0. Indirect calls
AMDGPU: Remove AMDGPUFixFunctionBitcasts pass
This was a workaround for not supporting indirect calls when instcombine didn't eliminate constant expression casts of the callee at -O0. Indirect calls are supposed to work now, so drop the hack.
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
729bf9b2 |
| 14-Aug-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Enable fixed function ABI by default
Code using indirect calls is broken without this, and there isn't really much value in supporting the old attempt to vary the argument placement based on
AMDGPU: Enable fixed function ABI by default
Code using indirect calls is broken without this, and there isn't really much value in supporting the old attempt to vary the argument placement based on uses. This resulted in more argument shuffling code anyway.
Also have the option stop implying all inputs need to be passed. This will no rely on the amdgpu-no-* attributes to avoid passing unnecessary values.
show more ...
|
| #
722b8e0e |
| 14-Aug-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Invert ABI attribute handling
Previously we assumed all callable functions did not need any implicitly passed inputs, and added attributes to functions to indicate when they were necessary.
AMDGPU: Invert ABI attribute handling
Previously we assumed all callable functions did not need any implicitly passed inputs, and added attributes to functions to indicate when they were necessary. Requiring attributes for correctness is pretty ugly, and it makes supporting indirect and external calls more complicated.
This inverts the direction of the attributes, so an undecorated function is assumed to need all implicit imputs. This enables AMDGPUAttributor by default to mark when functions are proven to not need a given input. This strips the equivalent functionality from the legacy AMDGPUAnnotateKernelFeatures pass.
However, AMDGPUAnnotateKernelFeatures is not fully removed at this point although it should be in the future. It is still necessary for the two hacky amdgpu-calls and amdgpu-stack-objects attributes, which would be better served by a trivial analysis on the IR during selection. Additionally, AMDGPUAnnotateKernelFeatures still redundantly handles the uniform-work-group-size attribute to be removed in a future commit.
At this point when not using -amdgpu-fixed-function-abi, we are still modifying the ABI based on these newly negated attributes. In the future, this option will be removed and the locations for implicit inputs will always be fixed. We will then use the new attributes to avoid passing the values when unnecessary.
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
5682ae2f |
| 25-Mar-2021 |
madhur13490 <Madhur.Amilkanthwar@amd.com> |
[AMDGPU] Set implicit arg attributes for indirect calls
This patch adds attributes corresponding to implicits to functions/kernels if 1. it has an indirect call OR 2. it's address is taken.
Once su
[AMDGPU] Set implicit arg attributes for indirect calls
This patch adds attributes corresponding to implicits to functions/kernels if 1. it has an indirect call OR 2. it's address is taken.
Once such attributes are set, rest of the codegen would work out-of-box for indirect calls. This patch eliminates the potential overhead -fixed-abi imposes even though indirect functions calls are not used.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D99347
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
3c297a25 |
| 10-Feb-2021 |
madhur13490 <Madhur.Amilkanthwar@amd.com> |
Make fixed-abi default for AMD HSA OS
fixed-abi uses pre-defined and predictable SGPR/VGPRs for passing arguments. This patch makes this scheme default when HSA OS is specified in triple.
Reviewed
Make fixed-abi default for AMD HSA OS
fixed-abi uses pre-defined and predictable SGPR/VGPRs for passing arguments. This patch makes this scheme default when HSA OS is specified in triple.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96340
show more ...
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
2a814cd9 |
| 18-Dec-2020 |
Whitney Tsang <whitneyt@ca.ibm.com> |
Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside
Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’.
Below is an example of splitting the block bb1 at its first instruction.
/// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2:
Differential Revision: https://reviews.llvm.org/D92200
show more ...
|
| #
511cfe94 |
| 17-Dec-2020 |
Bangtian Liu <bangtian@cs.toronto.edu> |
Revert "Ensure SplitEdge to return the new block between the two given blocks"
This reverts commit d20e0c3444ad9ada550d9d6d1d56fd72948ae444.
|
| #
d20e0c34 |
| 17-Dec-2020 |
Bangtian Liu <bangtian@cs.toronto.edu> |
Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside
Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’.
Below is an example of splitting the block bb1 at its first instruction.
/// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2:
Differential Revision: https://reviews.llvm.org/D92200
show more ...
|
| #
c1075720 |
| 16-Dec-2020 |
Bangtian Liu <bangtian@cs.toronto.edu> |
Revert "Ensure SplitEdge to return the new block between the two given blocks"
This reverts commit cf638d793c489632bbcf0ee0fbf9d0f8c76e1f48.
|
| #
cf638d79 |
| 15-Dec-2020 |
Bangtian Liu <bangtian@cs.toronto.edu> |
Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside
Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’.
Below is an example of splitting the block bb1 at its first instruction.
/// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2:
Differential Revision: https://reviews.llvm.org/D92200
show more ...
|
|
Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
| #
4bdab2e8 |
| 01-Sep-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Fix offset for REL32_HI relocs
The addend in a REL32 reloc needs to be adjusted to account for the offset from the PC value returned by the s_getpc instruction to the point where the reloc
[AMDGPU] Fix offset for REL32_HI relocs
The addend in a REL32 reloc needs to be adjusted to account for the offset from the PC value returned by the s_getpc instruction to the point where the reloc is applied. This was being done correctly for (GOTPC)REL32_LO but not for (GOTPC)REL32_HI. This will only make a difference if the target symbol happens to get loaded almost exactly a multiple of 4G away from the relocated instructions.
Differential Revision: https://reviews.llvm.org/D86938
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
| #
07fd88d7 |
| 28-Jun-2019 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Packed thread ids in function call ABI
Differential Revision: https://reviews.llvm.org/D63851
llvm-svn: 364619
|
|
Revision tags: llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2 |
|
| #
afc24ed2 |
| 01-Feb-2019 |
Scott Linder <scott@scottlinder.com> |
[AMDGPU] Mark test functions with hidden visibility
Prepare for future patch which affects codegen for calls to preemptible functions.
Differential Revision: https://reviews.llvm.org/D57605
llvm-s
[AMDGPU] Mark test functions with hidden visibility
Prepare for future patch which affects codegen for calls to preemptible functions.
Differential Revision: https://reviews.llvm.org/D57605
llvm-svn: 352920
show more ...
|
|
Revision tags: llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
| #
11ef7984 |
| 26-Oct-2018 |
Scott Linder <scott@scottlinder.com> |
[AMDGPU] Add a pass to promote bitcast calls
AMDGPU currently only supports direct calls, but at lower optimisation levels it fails to lower statically direct calls which appear indirect due to a bi
[AMDGPU] Add a pass to promote bitcast calls
AMDGPU currently only supports direct calls, but at lower optimisation levels it fails to lower statically direct calls which appear indirect due to a bitcast.
Add a pass to visit all CallSites and use CallPromotionUtils to "devirtualize" calls.
Differential Revision: https://reviews.llvm.org/D52741
llvm-svn: 345382
show more ...
|