History log of /llvm-project/llvm/unittests/Target/AMDGPU/DwarfRegMappings.cpp (Results 1 – 7 of 7)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# dc0e258f 06-Jan-2025 Emma Pilkington <emma.pilkington95@gmail.com>

[AMDGPU] Remove Dwarf encodings for subregisters (#117891)

Previously, registers and subregisters mapped to the same Dwarf
encoding. We don't really have any way to refer to subregisters directly

[AMDGPU] Remove Dwarf encodings for subregisters (#117891)

Previously, registers and subregisters mapped to the same Dwarf
encoding. We don't really have any way to refer to subregisters directly
from Dwarf, the expression emitter should instead use DW_OPs to stencil
out the subregister from the whole register. This was also confusing
tools that need to map back to the llvm reg (e.g. dwarfdump), since
getLLVMRegNum() would arbitrarily return the _LO16 register.

show more ...


Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2
# 538aeb18 11-Mar-2024 Emma Pilkington <emma.pilkington95@gmail.com>

[AMDGPU] Use a consistent DwarfEH register flavour (#84513)

Previously, we always used the wave64 encodings for EH registers
regardless of whether we were compiling for wave32, which seems wrong.

[AMDGPU] Use a consistent DwarfEH register flavour (#84513)

Previously, we always used the wave64 encodings for EH registers
regardless of whether we were compiling for wave32, which seems wrong.
We don't seem to use the EH registers, so this commit is mostly just
about papering over code that converts from non-EH dwarf registers to
LLVM registers while claiming they are EH dwarf registers. That kind of
code should be okay on any non-darwin target (since darwin is the only
target that uses a different encoding for EH registers).

show more ...


Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6
# d09d834b 21-Nov-2022 Valery Pykhtin <valery.pykhtin@gmail.com>

[AMDGPU] Fix GCNSubtarget::getMinNumVGPRs, add unit test to check consistency between GCNSubtarget's getMinNumVGPRs, getMaxNumVGPRs and getOccupancyWithNumVGPRs.

```
/// \returns Minimum number of

[AMDGPU] Fix GCNSubtarget::getMinNumVGPRs, add unit test to check consistency between GCNSubtarget's getMinNumVGPRs, getMaxNumVGPRs and getOccupancyWithNumVGPRs.

```
/// \returns Minimum number of VGPRs that meets given number of waves per
/// execution unit requirement supported by the subtarget.
unsigned getMinNumVGPRs(unsigned WavesPerEU) const;

/// \returns Maximum number of VGPRs that meets given number of waves per
/// execution unit requirement supported by the subtarget.
unsigned getMaxNumVGPRs(unsigned WavesPerEU) const;

/// Return the maximum number of waves per SIMD for kernels using \p VGPRs
/// VGPRs
unsigned getOccupancyWithNumVGPRs(unsigned VGPRs) const;
```

While working on RP tracking issues I noticed that getMinNumVGPRs return incorrect
values: the problem is large VGPR granule sizes on GFX10+ architectures. Some of the
occupancies aren't reachable because require the same amount of VGPR granules as others.
For example 19 waves occupancy on gfx1010 require the same amount of granules as 20 waves
so the resultng occupancy would be 20.

SGPRs have the same issue and even have inconsistency between getMaxNumSGPRs and getOccupancyWithNumSGPRs.
It will be addressed in the next patch.

Legend:
# MinVGPR and MaxVGPR are values returned by getMinNumVGPRs and getMaxNumVGPRs for a given Occ.
# (ONumber) is the value returned by getOccupancyWithNumVGPRs for a given MinVGPR or MaxVGPR.
# R means range problem: MinVGPR should be less than MaxVGPR and both should refer to the same occupancy.

Unit test output without the fix:
```
./build/unittests/Target/AMDGPU/AMDGPUTests --gtest_filter=AMDGPU.TestVGPRLimitsPerOccupancy --print-cpu-reg-limits

gfx90a gfx940:
Occ MinVGPR MaxVGPR
8 0 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 257 (O1) 512 (O1)

gfx600 gfx600 gfx601 gfx601 gfx601 gfx602 gfx602 gfx602 gfx700 gfx700 gfx701 gfx701 gfx702 gfx703 gfx703 gfx703 gfx704 gfx704 gfx705 gfx801 gfx801 gfx802 gfx802 gfx802 gfx803 gfx803 gfx803 gfx803 gfx805 gfx805 gfx810 gfx810 gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90c:
Occ MinVGPR MaxVGPR
10 0 (O10) 24 (O10)
9 25 (O9) 28 (O9)
8 29 (O8) 32 (O8)
7 33 (O7) 36 (O7)
6 37 (O6) 40 (O6)
5 41 (O5) 48 (O5)
4 49 (O4) 64 (O4)
3 65 (O3) 84 (O3)
2 85 (O2) 128 (O2)
1 129 (O1) 256 (O1)

gfx1030w64 gfx1031w64 gfx1032w64 gfx1033w64 gfx1034w64 gfx1035w64 gfx1036w64 gfx1102w64 gfx1103w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 32 (O16)
15 33 (O12) R 32 (O16)
14 33 (O12) R 32 (O16)
13 33 (O12) R 32 (O16)
12 33 (O12) 40 (O12)
11 41 (O10) R 40 (O12)
10 41 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 256 (O2) R 256 (O2)

gfx1100w64 gfx1101w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 48 (O16)
15 49 (O12) R 48 (O16)
14 49 (O12) R 48 (O16)
13 49 (O12) R 48 (O16)
12 49 (O12) 60 (O12)
11 61 (O10) R 60 (O12)
10 61 (O10) 72 (O10)
9 73 (O9) 84 (O9)
8 85 (O8) 96 (O8)
7 97 (O7) 108 (O7)
6 109 (O6) 120 (O6)
5 121 (O5) 144 (O5)
4 145 (O4) 192 (O4)
3 193 (O3) 252 (O3)
2 253 (O2) 256 (O2)
1 256 (O2) R 256 (O2)

gfx1030w32 gfx1031w32 gfx1032w32 gfx1033w32 gfx1034w32 gfx1035w32 gfx1036w32 gfx1102w32 gfx1103w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 64 (O16)
15 65 (O12) R 64 (O16)
14 65 (O12) R 64 (O16)
13 65 (O12) R 64 (O16)
12 65 (O12) 80 (O12)
11 81 (O10) R 80 (O12)
10 81 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 160 (O6)
5 161 (O5) 192 (O5)
4 193 (O4) 256 (O4)
3 256 (O4) R 256 (O4)
2 256 (O4) R 256 (O4)
1 256 (O4) R 256 (O4)

gfx1100w32 gfx1101w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 96 (O16)
15 97 (O12) R 96 (O16)
14 97 (O12) R 96 (O16)
13 97 (O12) R 96 (O16)
12 97 (O12) 120 (O12)
11 121 (O10) R 120 (O12)
10 121 (O10) 144 (O10)
9 145 (O9) 168 (O9)
8 169 (O8) 192 (O8)
7 193 (O7) 216 (O7)
6 217 (O6) 240 (O6)
5 241 (O5) 256 (O5)
4 256 (O5) R 256 (O5)
3 256 (O5) R 256 (O5)
2 256 (O5) R 256 (O5)
1 256 (O5) R 256 (O5)

gfx1010w64 gfx1011w64 gfx1012w64 gfx1013w64:
Occ MinVGPR MaxVGPR
20 0 (O20) 24 (O20)
19 25 (O18) R 24 (O20)
18 25 (O18) 28 (O18)
17 29 (O16) R 28 (O18)
16 29 (O16) 32 (O16)
15 33 (O14) R 32 (O16)
14 33 (O14) 36 (O14)
13 37 (O12) R 36 (O14)
12 37 (O12) 40 (O12)
11 41 (O11) 44 (O11)
10 45 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 84 (O6)
5 85 (O5) 100 (O5)
4 101 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 256 (O2) R 256 (O2)

gfx1010w32 gfx1011w32 gfx1012w32 gfx1013w32:
Occ MinVGPR MaxVGPR
20 0 (O20) 48 (O20)
19 49 (O18) R 48 (O20)
18 49 (O18) 56 (O18)
17 57 (O16) R 56 (O18)
16 57 (O16) 64 (O16)
15 65 (O14) R 64 (O16)
14 65 (O14) 72 (O14)
13 73 (O12) R 72 (O14)
12 73 (O12) 80 (O12)
11 81 (O11) 88 (O11)
10 89 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 168 (O6)
5 169 (O5) 200 (O5)
4 201 (O4) 256 (O4)
3 256 (O4) R 256 (O4)
2 256 (O4) R 256 (O4)
1 256 (O4) R 256 (O4)
```

After the fix:
```
gfx90a gfx940:
Occ MinVGPR MaxVGPR
8 0 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 257 (O1) 512 (O1)

gfx600 gfx600 gfx601 gfx601 gfx601 gfx602 gfx602 gfx602 gfx700 gfx700 gfx701 gfx701 gfx702 gfx703 gfx703 gfx703 gfx704 gfx704 gfx705 gfx801 gfx801 gfx802 gfx802 gfx802 gfx803 gfx803 gfx803 gfx803 gfx805 gfx805 gfx810 gfx810 gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90c:
Occ MinVGPR MaxVGPR
10 0 (O10) 24 (O10)
9 25 (O9) 28 (O9)
8 29 (O8) 32 (O8)
7 33 (O7) 36 (O7)
6 37 (O6) 40 (O6)
5 41 (O5) 48 (O5)
4 49 (O4) 64 (O4)
3 65 (O3) 84 (O3)
2 85 (O2) 128 (O2)
1 129 (O1) 256 (O1)

gfx1030w64 gfx1031w64 gfx1032w64 gfx1033w64 gfx1034w64 gfx1035w64 gfx1036w64 gfx1102w64 gfx1103w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 32 (O16)
15 0 (O16) 32 (O16)
14 0 (O16) 32 (O16)
13 0 (O16) 32 (O16)
12 33 (O12) 40 (O12)
11 33 (O12) 40 (O12)
10 41 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 169 (O2) 256 (O2)

gfx1100w64 gfx1101w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 48 (O16)
15 0 (O16) 48 (O16)
14 0 (O16) 48 (O16)
13 0 (O16) 48 (O16)
12 49 (O12) 60 (O12)
11 49 (O12) 60 (O12)
10 61 (O10) 72 (O10)
9 73 (O9) 84 (O9)
8 85 (O8) 96 (O8)
7 97 (O7) 108 (O7)
6 109 (O6) 120 (O6)
5 121 (O5) 144 (O5)
4 145 (O4) 192 (O4)
3 193 (O3) 252 (O3)
2 253 (O2) 256 (O2)
1 253 (O2) 256 (O2)

gfx1030w32 gfx1031w32 gfx1032w32 gfx1033w32 gfx1034w32 gfx1035w32 gfx1036w32 gfx1102w32 gfx1103w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 64 (O16)
15 0 (O16) 64 (O16)
14 0 (O16) 64 (O16)
13 0 (O16) 64 (O16)
12 65 (O12) 80 (O12)
11 65 (O12) 80 (O12)
10 81 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 160 (O6)
5 161 (O5) 192 (O5)
4 193 (O4) 256 (O4)
3 193 (O4) 256 (O4)
2 193 (O4) 256 (O4)
1 193 (O4) 256 (O4)

gfx1100w32 gfx1101w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 96 (O16)
15 0 (O16) 96 (O16)
14 0 (O16) 96 (O16)
13 0 (O16) 96 (O16)
12 97 (O12) 120 (O12)
11 97 (O12) 120 (O12)
10 121 (O10) 144 (O10)
9 145 (O9) 168 (O9)
8 169 (O8) 192 (O8)
7 193 (O7) 216 (O7)
6 217 (O6) 240 (O6)
5 241 (O5) 256 (O5)
4 241 (O5) 256 (O5)
3 241 (O5) 256 (O5)
2 241 (O5) 256 (O5)
1 241 (O5) 256 (O5)

gfx1010w64 gfx1011w64 gfx1012w64 gfx1013w64:
Occ MinVGPR MaxVGPR
20 0 (O20) 24 (O20)
19 0 (O20) 24 (O20)
18 25 (O18) 28 (O18)
17 25 (O18) 28 (O18)
16 29 (O16) 32 (O16)
15 29 (O16) 32 (O16)
14 33 (O14) 36 (O14)
13 33 (O14) 36 (O14)
12 37 (O12) 40 (O12)
11 41 (O11) 44 (O11)
10 45 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 84 (O6)
5 85 (O5) 100 (O5)
4 101 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 169 (O2) 256 (O2)

gfx1010w32 gfx1011w32 gfx1012w32 gfx1013w32:
Occ MinVGPR MaxVGPR
20 0 (O20) 48 (O20)
19 0 (O20) 48 (O20)
18 49 (O18) 56 (O18)
17 49 (O18) 56 (O18)
16 57 (O16) 64 (O16)
15 57 (O16) 64 (O16)
14 65 (O14) 72 (O14)
13 65 (O14) 72 (O14)
12 73 (O12) 80 (O12)
11 81 (O11) 88 (O11)
10 89 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 168 (O6)
5 169 (O5) 200 (O5)
4 201 (O4) 256 (O4)
3 201 (O4) 256 (O4)
2 201 (O4) 256 (O4)
1 201 (O4) 256 (O4)
```

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D138443

show more ...


# b6a01caa 03-Dec-2022 Kazu Hirata <kazu@google.com>

[llvm/unittests] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the am

[llvm/unittests] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 89b57061 08-Oct-2021 Reid Kleckner <rnk@google.com>

Move TargetRegistry.(h|cpp) from Support to MC

This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually us

Move TargetRegistry.(h|cpp) from Support to MC

This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.

This allows us to ensure that Support doesn't have includes from MC/*.

Differential Revision: https://reviews.llvm.org/D111454

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# bd12ecb8 24-Mar-2020 Scott Linder <Scott.Linder@amd.com>

[AMDGPU] Fix PC register mapping in wave32 mode

Summary:
The PC_32 DWARF register is for a 32-bit process address space which we
don't implement in AMDGCN; another way of putting this is that the si

[AMDGPU] Fix PC register mapping in wave32 mode

Summary:
The PC_32 DWARF register is for a 32-bit process address space which we
don't implement in AMDGCN; another way of putting this is that the size
of the PC register is not a function of the wavefront size. If we ever
implement a 32-bit process address space we will need to add two more
DwarfFlavours i.e. we will need to represent the product of (wave32,
wave64) x (64-bit address space, 32-bit address space).

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76732

show more ...


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6
# 24698e52 23-Mar-2020 Ram Nalamothu <VenkataRamanaiah.Nalamothu@amd.com>

Implement wave32 DWARF register mapping

Implement the DWARF register mapping described in llvm/docs/AMDGPUUsage.rst.

This enables generating appropriate DWARF register numbers for wave64 and
wave32

Implement wave32 DWARF register mapping

Implement the DWARF register mapping described in llvm/docs/AMDGPUUsage.rst.

This enables generating appropriate DWARF register numbers for wave64 and
wave32 modes.

show more ...