#
6fc8a1cd |
| 11-Nov-2016 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
Revert "[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies"
This reverts commit r286171, it breaks piglit test fs-discard-exit-2
llvm-svn: 286530
|
#
92e01ee9 |
| 07-Nov-2016 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user is we have only one register for conditions. For AMDGPU we have
[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user is we have only one register for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions. Changed BE to report we have many condition registers. That way IR LICM pass would hoist an invariant comparison out of a loop and codegen prepare will not sink it.
With that done a condition is calculated in one block and used in another. Current behavior is to store workitem's condition in a VGPR using v_cndmask and then restore it with yet another v_cmp instruction from that v_cndmask's result. To mitigate the issue a forward propagation of a v_cmp 64 bit result to an user is implemented. Additional side effect of this is that we may consume less VGPRs in a cost of more SGPRs in case if holding of multiple conditions is needed, and that is a clear win in most cases.
llvm-svn: 286171
show more ...
|
#
117296c0 |
| 01-Oct-2016 |
Mehdi Amini <mehdi.amini@apple.com> |
Use StringRef in Pass/PassManager APIs (NFC)
llvm-svn: 283004
|
#
5d8eb25e |
| 30-Sep-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use unsigned compare for eq/ne
For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for
AMDGPU: Use unsigned compare for eq/ne
For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for the one bit inputs we don't use), but for consistency always using the unsigned one.
llvm-svn: 282832
show more ...
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
#
43e92fe3 |
| 24-Jun-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict
AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target.
llvm-svn: 273652
show more ...
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1, llvmorg-3.8.0, llvmorg-3.8.0-rc3 |
|
#
427c5489 |
| 11-Feb-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix passes depending on dominator tree for no reason
llvm-svn: 260494
|
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
#
0cb8517d |
| 25-Sep-2015 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix recomputing dominator tree unnecessarily
SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands.
llvm-svn: 248587
|
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
#
45bb48ea |
| 13-Jun-2015 |
Tom Stellard <thomas.stellard@amd.com> |
R600 -> AMDGPU rename
llvm-svn: 239657
|