Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
236fda55 |
| 06-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[Analysis] Remove unused includes (NFC) (#114936)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
b14e30f1 |
| 27-Jul-2023 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[LLVM] refactor GenericSSAContext and its specializations
Fix the GenericSSAContext template so that it actually declares all the necessary typenames and the methods that must be implemented by its specializations SSAContext and MachineSSAContext.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D156288
|
Revision tags: llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
53fb907d |
| 24-May-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Special case uniformity info for single lane workgroups
Constructors/destructors and OpenMP make use of single lane groups in some cases.
|
#
d61cba6d |
| 02-Jun-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
UniformityAnalysis: Skip computation with no branch divergence
Check TTI before bothering to run the computation. Everything will be assumed uniform by default.
|
Revision tags: llvmorg-16.0.4 |
|
#
0a170eb7 |
| 17-May-2023 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[Uniformity] Propagate divergence only along divergent outputs.
When an instruction is determined to be divergent, not all its outputs are divergent. The users of only divergent outputs should now be examined for divergence.
Also, replaced a repeating pattern of "if new divergent instruction, then add to worklist" by combining it into a single function. This does not cause any change in functionality.
Reviewed By: foad, arsenm
Differential Revision: https://reviews.llvm.org/D150636
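The two changes described in this commit can be illustrated with a minimal, self-contained sketch. It is not the LLVM implementation: `Instr`, `DivergencePropagator`, `markDivergent`, and the `divergentOutputs` callback are hypothetical names chosen for illustration. The sketch shows a single helper that combines "mark as divergent" with "add to worklist", and a propagation loop that visits only the users of outputs reported as divergent.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction with several outputs.
struct Instr {
  std::string name;
  // users[i] lists the instructions that consume output i.
  std::vector<std::vector<Instr *>> users;
};

struct DivergencePropagator {
  std::set<const Instr *> divergent;
  std::deque<Instr *> worklist;

  // Single entry point for the "if newly divergent, then add to worklist"
  // pattern. Returns true only if I was not already marked divergent.
  bool markDivergent(Instr &I) {
    if (!divergent.insert(&I).second)
      return false;
    worklist.push_back(&I);
    return true;
  }

  // divergentOutputs(I) returns the indices of the outputs of I that are
  // divergent; only the users of those outputs are examined.
  template <typename F> void run(F divergentOutputs) {
    while (!worklist.empty()) {
      Instr *I = worklist.front();
      worklist.pop_front();
      for (unsigned Idx : divergentOutputs(*I)) {
        assert(Idx < I->users.size() && "output index out of range");
        for (Instr *U : I->users[Idx])
          markDivergent(*U);
      }
    }
  }
};
```

With an instruction whose second output alone is divergent, only the user of that second output ends up in `divergent`; the user of the first output is left untouched.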
|
#
fbe1c061 |
| 16-May-2023 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[LLVM][Uniformity] Improve detection of uniform registers
The MachineUA now queries the target to determine if a given register holds a uniform value. This is determined using the corresponding register bank if available, or by a combination of the register class and value type. This assumes that the target is optimizing for performance by choosing registers, and the target is responsible for any mismatch with the inferred uniformity.
For example, on AMDGPU, an SGPR is now treated as uniform, except if the register bank is VCC (i.e., the register holds a wave-wide vector of 1-bit values) or equivalently if it has a value type of s1.
- This does not always work with inline asm, where the register bank or the value type might not be present. We assume that the SGPR is uniform, because it is not expected to be s1 in the vast majority of cases.
- The pseudo branch instruction SI_LOOP is now hard-coded to be always divergent, although its condition is an SGPR.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D150438
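The AMDGPU rule described in this commit can be sketched as a small decision function. This is an illustration under stated assumptions, not the target hook itself: the `RegClass`, `RegBank`, and `RegInfo` types and the `isUniformReg` name are hypothetical, and treating a VGPR bank as non-uniform is an assumption of the sketch rather than something the commit message states.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, heavily simplified model of the register properties involved.
enum class RegClass { SGPR, VGPR };
enum class RegBank { SGPR, VGPR, VCC };

struct RegInfo {
  RegClass regClass;
  std::optional<RegBank> bank;      // may be absent, e.g. with inline asm
  std::optional<unsigned> typeBits; // value type width; 1 models s1
};

bool isUniformReg(const RegInfo &R) {
  assert((!R.typeBits || *R.typeBits > 0) && "a value type has nonzero width");
  // Prefer the register bank when it is available. VCC holds a wave-wide
  // vector of 1-bit values, so it is not uniform.
  if (R.bank)
    return *R.bank == RegBank::SGPR;
  // Otherwise fall back to register class plus value type.
  if (R.regClass != RegClass::SGPR)
    return false;
  if (R.typeBits && *R.typeBits == 1)
    return false; // s1 in an SGPR is equivalent to the VCC case
  // Neither bank nor s1 type known (the inline asm case): assume uniform.
  return true;
}
```

The final `return true` corresponds to the inline asm caveat in the commit message: with no bank and no value type, the SGPR is assumed uniform.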
|
#
b0f0dd25 |
| 15-May-2023 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[LLVM][Uniformity] Propagate temporal divergence explicitly
At a cycle C with divergent exits, UA was using a naive traversal of the exiting edges to locate blocks that may use values defined inside C. But this traversal fails when it encounters a cycle. This is now replaced with a much simpler propagation that iterates over every instruction in C and checks any uses that are outside C. But such an iteration can be expensive when C is very large; the original strategy may need to be reconsidered if there is a regression in compilation times.
Also fixed lit tests that should have originally caught the missed propagation of temporal divergence.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D149646
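The simpler propagation described in this commit (iterate over every instruction in C and check any uses outside C) might look roughly like the sketch below. The `Inst` type, the block-id model of a cycle, and the `findTemporalDivergentUsers` name are hypothetical, and the sketch assumes the caller has already established that the cycle has divergent exits. The actual analysis applies further checks that are omitted here.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for an instruction.
struct Inst {
  int block;                 // id of the containing basic block
  std::vector<Inst *> users; // instructions that use the value defined here
};

// Collect every user outside the cycle of a value defined inside the cycle.
// The cycle is modelled as a set of block ids. The cost is linear in the
// number of instructions in the cycle plus the number of their uses, which
// is why the commit message warns about very large cycles.
std::set<const Inst *>
findTemporalDivergentUsers(const std::vector<Inst *> &cycleInsts,
                           const std::set<int> &cycleBlocks) {
  std::set<const Inst *> result;
  for (const Inst *def : cycleInsts) {
    assert(cycleBlocks.count(def->block) && "definition must be in the cycle");
    for (const Inst *user : def->users)
      if (!cycleBlocks.count(user->block))
        result.insert(user);
  }
  return result;
}
```

Unlike a traversal along exiting edges, this loop never walks the control flow graph, so it cannot get stuck on another cycle reached from an exit.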
|
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0 |
|
#
f90849df |
| 14-Mar-2023 |
pvanhout <pierre.vanhoutryve@amd.com> |
[AMDGPU] Use UniformityAnalysis in AtomicOptimizer
Adds & uses a new `isDivergentUse` API in UA. UniformityAnalysis now also requires CycleInfo, because the new temporal divergence API can query it.
-----
Original patch that adds `isDivergentUse` by @sameerds
The user of a temporally divergent value is marked as divergent in the uniformity analysis. But the same user may also have been marked divergent for other reasons, thus losing this information about temporal divergence. Some clients need to specifically check for temporal divergence. This change restores such an API, which already existed in DivergenceAnalysis.
Reviewed By: sameerds, foad
Differential Revision: https://reviews.llvm.org/D146018
|
Revision tags: llvmorg-16.0.0-rc4 |
|
#
fd98416d |
| 10-Mar-2023 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
[llvm][Uniformity] consistently handle always-uniform instructions
An instruction that is "always uniform" is so even if it occurs in an irreducible cycle. The output produced by such an instruction may depend on the implementation defined cycle hierarchy, but that does not affect the uniformity of the output. In other words, an "always uniform" instruction is uniform even if it is not m-converged.
Reviewed By: ruiling, ronlieb
Differential Revision: https://reviews.llvm.org/D145572
|
#
dbebebf6 |
| 06-Mar-2023 |
pvanhout <pierre.vanhoutryve@amd.com> |
[AMDGPU] Use UniformityAnalysis in CodeGenPrepare
A little extra change was needed in UA because it didn't consider InvokeInst and it made call-constexpr.ll assert.
Reviewed By: sameerds, arsenm
Differential Revision: https://reviews.llvm.org/D145358
|
#
5230f6c1 |
| 02-Mar-2023 |
Yashwant Singh <Yashwant.Singh@amd.com> |
[llvm][GenericUniformity] Prevent assert while calculating temporal divergence
analyzeTemporalDivergence() was missing the check for always-uniform before evaluating whether an instruction depends on a value defined in the cycle. Fix for #60638 https://github.com/llvm/llvm-project/issues/60638
Reviewed By: sameerds, foad, #amdgpu
Differential Revision: https://reviews.llvm.org/D144070
|
Revision tags: llvmorg-16.0.0-rc3 |
|
#
c76acb9d |
| 16-Feb-2023 |
Jay Foad <jay.foad@amd.com> |
[UniformityAnalysis] Fix some file headers and pass names
Differential Revision: https://reviews.llvm.org/D144167
|
Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
5d98dc71 |
| 16-Jan-2023 |
Krzysztof Drewniak <Krzysztof.Drewniak@amd.com> |
[llvm][GenericUniformity] Hack around strict is_invocable() checks
With recent (> 15, as far as I can tell, possibly > 16) clang, c++17, and GNU's libstdc++ (versions 9 and 10 and maybe others), LLVM fails to compile due to an is_invocable() check in unique_ptr::reset().
To resolve this issue, add a template argument to ImplDeleter to make things work.
Differential Revision: https://reviews.llvm.org/D141865
|
Revision tags: llvmorg-15.0.7 |
|
#
475ce4c2 |
| 20-Dec-2022 |
Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> |
RFC: Uniformity Analysis for Irreducible Control Flow
Uniformity analysis is a generalization of divergence analysis to include irreducible control flow:
1. The proposed spec presents a notion of "maximal convergence" that captures the existing convention of converging threads at the headers of natural loops.
2. Maximal convergence is then extended to irreducible cycles. The identity of irreducible cycles is determined by the choices made in a depth-first traversal of the control flow graph. Uniformity analysis uses criteria that depend only on closed paths and not cycles, to determine maximal convergence. This makes it a conservative analysis that is independent of the effect of DFS on CycleInfo.
3. The analysis is implemented as a template that can be instantiated for both LLVM IR and Machine IR.
Validation:
- passes existing tests for divergence analysis
- passes new tests with irreducible control flow
- passes equivalent tests in MIR and GMIR
Based on concepts originally outlined by Nicolai Haehnle <nicolai.haehnle@amd.com>
With contributions from Ruiling Song <ruiling.song@amd.com> and Jay Foad <jay.foad@amd.com>.
Support for GMIR and lit tests for GMIR/MIR added by Yashwant Singh <yashwant.singh@amd.com>.
Differential Revision: https://reviews.llvm.org/D130746
|