History log of /llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (Results 101 – 125 of 364)
# 64dad4ba 14-Feb-2023 Kazu Hirata <kazu@google.com>

Use llvm::bit_cast (NFC)


# e3515ba3 10-Feb-2023 Janek van Oirschot <janek.vanoirschot@amd.com>

Reapply "[AMDGPU] Modify adjustInliningThreshold to also consider the cost of passing function arguments through the stack"

Reapplies 142c28ffa1323e9a8d53200a22c80d5d778e0d0f as part of D140242 which got reverted due to amdgpu openmp test failures.

This diff fixes said failures by eliding most of `adjustInliningThresholdUsingCallee` for indirect calls, where the callee function is unavailable.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D143498


# 7ca3444f 10-Feb-2023 Changpeng Fang <changpeng.fang@amd.com>

AMDGPU: Use module flag to get code object version at IR level follow-up

Summary:
This is part of the leftover work for https://reviews.llvm.org/D143138.
In this work, we pass code object version as an argument to initialize target ID
and use it for targetID dump.

Reviewers: arsenm

Differential Revision: https://reviews.llvm.org/D143293


Revision tags: llvmorg-16.0.0-rc2
# 8e3d7cf5 07-Feb-2023 Archibald Elliott <archibald.elliott@arm.com>

[NFC][TargetParser] Remove llvm/Support/TargetParser.h


# 54cf69c9 03-Feb-2023 Changpeng Fang <changpeng.fang@amd.com>

AMDGPU: Use module flag to get code object version at IR level

Summary:
This patch introduces a mechanism to check the code object version from the module flag, which avoids checking the command line.
If the module flag is missing, we use the current default code object version supported by the compiler.
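
A minimal sketch of the lookup-with-fallback behavior described above. It does not use the LLVM API: `ModuleFlags`, `VersionFlagName`, `DefaultCodeObjectVersion`, `lookupFlag` and `getCodeObjectVersion` are hypothetical names chosen for illustration, and the default value 4 is an assumption.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a module: a bag of named integer flags.
using ModuleFlags = std::map<std::string, unsigned>;

// Both values below are assumptions made for this sketch.
constexpr unsigned DefaultCodeObjectVersion = 4;
const char *const VersionFlagName = "code_object_version";

// Returns the flag value if present, std::nullopt otherwise.
std::optional<unsigned> lookupFlag(const ModuleFlags &Flags,
                                   const std::string &Name) {
  auto It = Flags.find(Name);
  if (It == Flags.end())
    return std::nullopt;
  return It->second;
}

// Prefer the module flag; fall back to the compiler default when missing.
unsigned getCodeObjectVersion(const ModuleFlags &Flags) {
  return lookupFlag(Flags, VersionFlagName)
      .value_or(DefaultCodeObjectVersion);
}
```

With this shape, callers never consult a command-line option; a module without the flag simply gets the default.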

For tools whose inputs are not IR, we may need another approach (a directive, for example) to check the code
object version. That will be in a separate patch later.

For the LIT test updates, we directly add the module flag if only a single code object version is associated with all checks in a file.
In case of multiple code object versions in one file, we use the "sed" method to "clone" the checks.

Reviewer: arsenm

Differential Revision: https://reviews.llvm.org/D14313


# 422d379d 31-Jan-2023 Yashwant Singh <Yashwant.Singh@amd.com>

[AMDGPU] Use tablegen to list uniform intrinsics

Right now we do opcode-wise matching to identify uniform/non-divergent
AMDGPU intrinsics. It is duplicated in two places: once in the IR-level uniformity analysis
and once at the MIR level. Move them to a single tablegen table for consistency and add
an API wrapper to access them.
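
As a rough illustration of the "single table plus wrapper" idea, the sketch below keeps one sorted table and exposes one query function that both the IR-level and MIR-level callers could share. The IDs in `UniformIntrinsics` and the name `isIntrinsicAlwaysUniform` are hypothetical; in the patch the table is generated by tablegen, not written by hand.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical intrinsic IDs, kept sorted so the table can be binary-searched.
constexpr std::array<std::uint32_t, 4> UniformIntrinsics = {12, 57, 203, 998};

// Single query point: callers no longer duplicate an opcode switch.
bool isIntrinsicAlwaysUniform(std::uint32_t ID) {
  return std::binary_search(UniformIntrinsics.begin(),
                            UniformIntrinsics.end(), ID);
}
```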

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D142961


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 10cef708 01-Dec-2022 Nicolai Hähnle <nicolai.haehnle@amd.com>

AMDGPU: Clean up LDS-related occupancy calculations

Occupancy is expressed as waves per SIMD. This means that we need to
take into account the number of SIMDs per "CU" or, to be more precise,
the number of SIMDs over which a workgroup may be distributed.

getOccupancyWithLocalMemSize was wrong because it didn't take SIMDs
into account at all.

At the same time, we need to take into account that WGP mode offers
access to a larger total amount of LDS, since this can affect how
non-power-of-two LDS allocations are rounded. To make this work
consistently, we distinguish between (available) local memory size and
addressable local memory size (which is always limited by 64kB on
gfx10+, even with WGP mode).

This change results in a massive amount of test churn. A lot of it is
caused by the fact that the default work group size is 1024, which means
that (due to rounding effects) the default occupancy on older hardware
is 8 instead of 10, which affects scheduling via register pressure
estimates. I've adjusted most tests by just running the UTC tools, but
in some cases I manually changed the work group size to 32 or 64 to make
sure that work group size chunkiness has no effect.
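
The rounding effect mentioned above can be reproduced with a simplified model. This is a sketch under stated assumptions, not the code from the patch: `CuModel` and `occupancyWithLds` are hypothetical names, the defaults (wave size 64, 4 SIMDs per CU, 10 waves per SIMD, 64 kB of LDS) are illustrative values for older hardware, and hardware caps on the number of work groups are ignored.

```cpp
#include <algorithm>

// Illustrative hardware parameters (assumptions, not queried from a subtarget).
struct CuModel {
  unsigned WaveSize = 64;
  unsigned SimdsPerCu = 4;
  unsigned MaxWavesPerSimd = 10;
  unsigned LdsBytes = 64 * 1024;
};

constexpr unsigned divideCeil(unsigned A, unsigned B) {
  return (A + B - 1) / B;
}

// Occupancy in waves per SIMD for a kernel whose work groups have GroupSize
// work items and use LdsPerGroup bytes of LDS each.
unsigned occupancyWithLds(const CuModel &M, unsigned LdsPerGroup,
                          unsigned GroupSize) {
  unsigned WavesPerGroup = divideCeil(GroupSize, M.WaveSize);
  // Whole work groups only: this is where the rounding loss comes from.
  unsigned MaxGroupsByWaves =
      std::max(1u, M.SimdsPerCu * M.MaxWavesPerSimd / WavesPerGroup);
  unsigned GroupsByLds =
      LdsPerGroup ? M.LdsBytes / LdsPerGroup : MaxGroupsByWaves;
  unsigned Groups = std::min(GroupsByLds, MaxGroupsByWaves);
  unsigned WavesPerCu = Groups * WavesPerGroup;
  // Spread the waves of the CU over its SIMDs.
  return std::min(divideCeil(WavesPerCu, M.SimdsPerCu), M.MaxWavesPerSimd);
}
```

In this model a work group size of 1024 with no LDS use gives 16 waves per group, so only 2 groups (32 waves) fit in the 40 wave slots and the occupancy is 8. A work group size of 64 fills all 40 slots and gives 10.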

Differential Revision: https://reviews.llvm.org/D139468


# 768aed13 13-Jan-2023 Jay Foad <jay.foad@amd.com>

[MC] Make more use of MCInstrDesc::operands. NFC.

Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A future patch will remove opInfo_begin and opInfo_end.

Also use it instead of raw access to the OpInfo pointer. A future patch
will remove this pointer.

Differential Revision: https://reviews.llvm.org/D142213


# 4d4894ab 08-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

Partially reapply "AMDGPU: Invert handling of enqueued block detection"

This mostly reverts commit 270e96f435596449002fc89962595497481c8770.

Keep the attributor related changes around, but functionally restore
the old behavior as a workaround. Device enqueue goes back to not
working at -O0 with this version.


# f460c665 22-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Simplify getNumFlatOffsetBits. NFC.

Previously we considered this field to be either N-bit unsigned or
N+1-bit signed, depending on the instruction. I think it's conceptually
simpler to say that the field is always N+1-bit signed, but some
instructions do not allow negative values.
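
The "always signed, but some instructions reject negatives" view can be sketched as a single range check plus a sign check. The names `fitsSigned` and `isLegalOffset` are hypothetical, and the field width is passed in rather than derived from a subtarget.

```cpp
#include <cstdint>

// True if Value fits in a two's-complement signed field of Bits bits
// (valid for 1 <= Bits <= 63).
constexpr bool fitsSigned(std::int64_t Value, unsigned Bits) {
  return Value >= -(std::int64_t(1) << (Bits - 1)) &&
         Value < (std::int64_t(1) << (Bits - 1));
}

// The field is always treated as SignedBits wide and signed; instructions
// that do not allow negative offsets add one extra condition.
constexpr bool isLegalOffset(std::int64_t Offset, unsigned SignedBits,
                             bool AllowNegative) {
  return fitsSigned(Offset, SignedBits) && (AllowNegative || Offset >= 0);
}
```

For a 13-bit field (an illustrative width), an instruction that disallows negatives accepts 0 to 4095, which is the same range as the old "12-bit unsigned" description.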

Differential Revision: https://reviews.llvm.org/D140883


# 2d945ef8 09-Jan-2023 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][NFC] Rename GFX10A16 operands.

They do not seem to be GFX10-specific anymore. Also renames the
corresponding feature.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D141069


# 270e96f4 08-Jan-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

Revert "AMDGPU: Invert handling of enqueued block detection"

This reverts commit 47288cc977fa31c44cc92b4e65044a5b75c2597e.

The runtime is having trouble with this at -O0 when the inputs are
always enabled.


# 47288cc9 23-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Invert handling of enqueued block detection

Invert the sense of the attribute and let the attributor figure this
out like everything else. If needed we can have the not-OpenCL
languages set amdgpu-no-default-queue and amdgpu-no-completion-action
up front so they never have to pay the cost.

There are also so many of these now, the offset use API should
probably consider all of them at once. Maybe they should merge into
one attribute with used fields. Having separate functions for each
field in AMDGPUBaseInfo is also not the greatest API (might as well
fix this when the patch to get the object version from the module
lands).


# 4463badf 06-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Use DenormalMode type in FP mode tracking

This simplifies a future patch. The MIR handling should be fixed. We're
still printing these in custom MachineFunctionInfo as bools (plus the
inverted meaning is hard to follow).


# c16a58b3 08-Dec-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

Attributes: Add function getter to parse integer string attributes

The most common case for string attributes parses them as integers. We
don't have a convenient way to do this, and as a result we have
inconsistent missing attribute and invalid attribute handling
scattered around. We also have inconsistent radix usage to
getAsInteger; some places use the default 0 and others use base 10.
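
A sketch of what a single shared getter buys: one policy for missing attributes, one for malformed values, and one radix. This is not the LLVM interface; `AttrMap` and `getAttrAsInt` are hypothetical, attributes are modeled as a plain string map, and base 10 is an assumption made for the example.

```cpp
#include <charconv>
#include <cstdint>
#include <map>
#include <string>
#include <system_error>

// Hypothetical stand-in for a function's string attributes.
using AttrMap = std::map<std::string, std::string>;

// Returns the attribute parsed as a base-10 integer. A missing attribute and
// a malformed value are handled the same way: the caller's default is used.
std::uint64_t getAttrAsInt(const AttrMap &Attrs, const std::string &Name,
                           std::uint64_t Default) {
  auto It = Attrs.find(Name);
  if (It == Attrs.end())
    return Default;
  const std::string &S = It->second;
  std::uint64_t Value = 0;
  auto [Ptr, Ec] = std::from_chars(S.data(), S.data() + S.size(), Value, 10);
  // Reject parse errors and trailing characters such as the "x10" in "0x10".
  if (Ec != std::errc() || Ptr != S.data() + S.size())
    return Default;
  return Value;
}
```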

Update a few of the uses, but there are quite a lot of these.


# 67819a72 13-Dec-2022 Fangrui Song <i@maskray.me>

[CodeGen] llvm::Optional => std::optional


# cc6b10d1 09-Dec-2022 Petar Avramovic <Petar.Avramovic@amd.com>

AMDGPU: Check if operand RC contains register used when printing

The disassembler can successfully decode an SGPR even when only VGPRs
are valid for the operand (e.g. VReg_* and VISrc_* operands).
In InstPrinter, detect when the operand's register class does not contain
the register being printed. This does not result in an error;
the intended use is for disassembler tests.

Differential Revision: https://reviews.llvm.org/D139646


Revision tags: llvmorg-15.0.6
# d09d834b 21-Nov-2022 Valery Pykhtin <valery.pykhtin@gmail.com>

[AMDGPU] Fix GCNSubtarget::getMinNumVGPRs, add unit test to check consistency between GCNSubtarget's getMinNumVGPRs, getMaxNumVGPRs and getOccupancyWithNumVGPRs.

```
/// \returns Minimum number of VGPRs that meets given number of waves per
/// execution unit requirement supported by the subtarget.
unsigned getMinNumVGPRs(unsigned WavesPerEU) const;

/// \returns Maximum number of VGPRs that meets given number of waves per
/// execution unit requirement supported by the subtarget.
unsigned getMaxNumVGPRs(unsigned WavesPerEU) const;

/// Return the maximum number of waves per SIMD for kernels using \p VGPRs
/// VGPRs
unsigned getOccupancyWithNumVGPRs(unsigned VGPRs) const;
```

While working on RP tracking issues I noticed that getMinNumVGPRs returns incorrect
values. The problem is the large VGPR granule size on GFX10+ architectures: some
occupancies are not reachable because they require the same number of VGPR granules as others.
For example, an occupancy of 19 waves on gfx1010 requires the same number of granules as 20 waves,
so the resulting occupancy would be 20.

SGPRs have the same issue and even have inconsistency between getMaxNumSGPRs and getOccupancyWithNumSGPRs.
It will be addressed in the next patch.
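
The granule effect can be sketched with a small model. This is an illustration, not the code of the fix: `VgprModel`, `occupancyWithVgprs`, `maxVgprs` and `minVgprs` are hypothetical names, and the parameters (512 VGPRs, granule of 4, 20 waves, 256 addressable VGPRs) are assumptions chosen because they reproduce the gfx1010 wave64 rows in the tables of this message.

```cpp
#include <algorithm>

// Assumed parameters; picked to match the gfx1010 wave64 rows.
struct VgprModel {
  unsigned Total = 512;       // VGPRs shared by all waves on a SIMD
  unsigned Granule = 4;       // allocation granule
  unsigned MaxWaves = 20;     // hardware limit on waves per SIMD
  unsigned Addressable = 256; // most VGPRs a single wave can use
};

constexpr unsigned alignDown(unsigned V, unsigned A) { return V / A * A; }
constexpr unsigned alignUp(unsigned V, unsigned A) {
  return (V + A - 1) / A * A;
}

// Waves per SIMD achievable when each wave uses Vgprs registers.
unsigned occupancyWithVgprs(const VgprModel &M, unsigned Vgprs) {
  unsigned Rounded = std::max(alignUp(Vgprs, M.Granule), M.Granule);
  return std::min(M.Total / Rounded, M.MaxWaves);
}

// Largest VGPR count that still allows Waves waves.
unsigned maxVgprs(const VgprModel &M, unsigned Waves) {
  return std::min(alignDown(M.Total / Waves, M.Granule), M.Addressable);
}

// Smallest VGPR count in the same occupancy bucket as maxVgprs(Waves).
// First find the occupancy that is actually reached, then step past the
// budget of the next higher occupancy.
unsigned minVgprs(const VgprModel &M, unsigned Waves) {
  unsigned Reached = occupancyWithVgprs(M, maxVgprs(M, Waves));
  if (Reached >= M.MaxWaves)
    return 0;
  return maxVgprs(M, Reached + 1) + 1;
}
```

In this model a request for 19 waves has a budget of 24 VGPRs, which already reaches 20 waves, so the consistent range is 0 to 24 rather than starting at 25.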

Legend:
# MinVGPR and MaxVGPR are values returned by getMinNumVGPRs and getMaxNumVGPRs for a given Occ.
# (ONumber) is the value returned by getOccupancyWithNumVGPRs for a given MinVGPR or MaxVGPR.
# R means range problem: MinVGPR should be less than MaxVGPR and both should refer to the same occupancy.

Unit test output without the fix:
```
./build/unittests/Target/AMDGPU/AMDGPUTests --gtest_filter=AMDGPU.TestVGPRLimitsPerOccupancy --print-cpu-reg-limits

gfx90a gfx940:
Occ MinVGPR MaxVGPR
8 0 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 257 (O1) 512 (O1)

gfx600 gfx600 gfx601 gfx601 gfx601 gfx602 gfx602 gfx602 gfx700 gfx700 gfx701 gfx701 gfx702 gfx703 gfx703 gfx703 gfx704 gfx704 gfx705 gfx801 gfx801 gfx802 gfx802 gfx802 gfx803 gfx803 gfx803 gfx803 gfx805 gfx805 gfx810 gfx810 gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90c:
Occ MinVGPR MaxVGPR
10 0 (O10) 24 (O10)
9 25 (O9) 28 (O9)
8 29 (O8) 32 (O8)
7 33 (O7) 36 (O7)
6 37 (O6) 40 (O6)
5 41 (O5) 48 (O5)
4 49 (O4) 64 (O4)
3 65 (O3) 84 (O3)
2 85 (O2) 128 (O2)
1 129 (O1) 256 (O1)

gfx1030w64 gfx1031w64 gfx1032w64 gfx1033w64 gfx1034w64 gfx1035w64 gfx1036w64 gfx1102w64 gfx1103w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 32 (O16)
15 33 (O12) R 32 (O16)
14 33 (O12) R 32 (O16)
13 33 (O12) R 32 (O16)
12 33 (O12) 40 (O12)
11 41 (O10) R 40 (O12)
10 41 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 256 (O2) R 256 (O2)

gfx1100w64 gfx1101w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 48 (O16)
15 49 (O12) R 48 (O16)
14 49 (O12) R 48 (O16)
13 49 (O12) R 48 (O16)
12 49 (O12) 60 (O12)
11 61 (O10) R 60 (O12)
10 61 (O10) 72 (O10)
9 73 (O9) 84 (O9)
8 85 (O8) 96 (O8)
7 97 (O7) 108 (O7)
6 109 (O6) 120 (O6)
5 121 (O5) 144 (O5)
4 145 (O4) 192 (O4)
3 193 (O3) 252 (O3)
2 253 (O2) 256 (O2)
1 256 (O2) R 256 (O2)

gfx1030w32 gfx1031w32 gfx1032w32 gfx1033w32 gfx1034w32 gfx1035w32 gfx1036w32 gfx1102w32 gfx1103w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 64 (O16)
15 65 (O12) R 64 (O16)
14 65 (O12) R 64 (O16)
13 65 (O12) R 64 (O16)
12 65 (O12) 80 (O12)
11 81 (O10) R 80 (O12)
10 81 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 160 (O6)
5 161 (O5) 192 (O5)
4 193 (O4) 256 (O4)
3 256 (O4) R 256 (O4)
2 256 (O4) R 256 (O4)
1 256 (O4) R 256 (O4)

gfx1100w32 gfx1101w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 96 (O16)
15 97 (O12) R 96 (O16)
14 97 (O12) R 96 (O16)
13 97 (O12) R 96 (O16)
12 97 (O12) 120 (O12)
11 121 (O10) R 120 (O12)
10 121 (O10) 144 (O10)
9 145 (O9) 168 (O9)
8 169 (O8) 192 (O8)
7 193 (O7) 216 (O7)
6 217 (O6) 240 (O6)
5 241 (O5) 256 (O5)
4 256 (O5) R 256 (O5)
3 256 (O5) R 256 (O5)
2 256 (O5) R 256 (O5)
1 256 (O5) R 256 (O5)

gfx1010w64 gfx1011w64 gfx1012w64 gfx1013w64:
Occ MinVGPR MaxVGPR
20 0 (O20) 24 (O20)
19 25 (O18) R 24 (O20)
18 25 (O18) 28 (O18)
17 29 (O16) R 28 (O18)
16 29 (O16) 32 (O16)
15 33 (O14) R 32 (O16)
14 33 (O14) 36 (O14)
13 37 (O12) R 36 (O14)
12 37 (O12) 40 (O12)
11 41 (O11) 44 (O11)
10 45 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 84 (O6)
5 85 (O5) 100 (O5)
4 101 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 256 (O2) R 256 (O2)

gfx1010w32 gfx1011w32 gfx1012w32 gfx1013w32:
Occ MinVGPR MaxVGPR
20 0 (O20) 48 (O20)
19 49 (O18) R 48 (O20)
18 49 (O18) 56 (O18)
17 57 (O16) R 56 (O18)
16 57 (O16) 64 (O16)
15 65 (O14) R 64 (O16)
14 65 (O14) 72 (O14)
13 73 (O12) R 72 (O14)
12 73 (O12) 80 (O12)
11 81 (O11) 88 (O11)
10 89 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 168 (O6)
5 169 (O5) 200 (O5)
4 201 (O4) 256 (O4)
3 256 (O4) R 256 (O4)
2 256 (O4) R 256 (O4)
1 256 (O4) R 256 (O4)
```

After the fix:
```
gfx90a gfx940:
Occ MinVGPR MaxVGPR
8 0 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 257 (O1) 512 (O1)

gfx600 gfx600 gfx601 gfx601 gfx601 gfx602 gfx602 gfx602 gfx700 gfx700 gfx701 gfx701 gfx702 gfx703 gfx703 gfx703 gfx704 gfx704 gfx705 gfx801 gfx801 gfx802 gfx802 gfx802 gfx803 gfx803 gfx803 gfx803 gfx805 gfx805 gfx810 gfx810 gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90c:
Occ MinVGPR MaxVGPR
10 0 (O10) 24 (O10)
9 25 (O9) 28 (O9)
8 29 (O8) 32 (O8)
7 33 (O7) 36 (O7)
6 37 (O6) 40 (O6)
5 41 (O5) 48 (O5)
4 49 (O4) 64 (O4)
3 65 (O3) 84 (O3)
2 85 (O2) 128 (O2)
1 129 (O1) 256 (O1)

gfx1030w64 gfx1031w64 gfx1032w64 gfx1033w64 gfx1034w64 gfx1035w64 gfx1036w64 gfx1102w64 gfx1103w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 32 (O16)
15 0 (O16) 32 (O16)
14 0 (O16) 32 (O16)
13 0 (O16) 32 (O16)
12 33 (O12) 40 (O12)
11 33 (O12) 40 (O12)
10 41 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 80 (O6)
5 81 (O5) 96 (O5)
4 97 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 169 (O2) 256 (O2)

gfx1100w64 gfx1101w64:
Occ MinVGPR MaxVGPR
16 0 (O16) 48 (O16)
15 0 (O16) 48 (O16)
14 0 (O16) 48 (O16)
13 0 (O16) 48 (O16)
12 49 (O12) 60 (O12)
11 49 (O12) 60 (O12)
10 61 (O10) 72 (O10)
9 73 (O9) 84 (O9)
8 85 (O8) 96 (O8)
7 97 (O7) 108 (O7)
6 109 (O6) 120 (O6)
5 121 (O5) 144 (O5)
4 145 (O4) 192 (O4)
3 193 (O3) 252 (O3)
2 253 (O2) 256 (O2)
1 253 (O2) 256 (O2)

gfx1030w32 gfx1031w32 gfx1032w32 gfx1033w32 gfx1034w32 gfx1035w32 gfx1036w32 gfx1102w32 gfx1103w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 64 (O16)
15 0 (O16) 64 (O16)
14 0 (O16) 64 (O16)
13 0 (O16) 64 (O16)
12 65 (O12) 80 (O12)
11 65 (O12) 80 (O12)
10 81 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 160 (O6)
5 161 (O5) 192 (O5)
4 193 (O4) 256 (O4)
3 193 (O4) 256 (O4)
2 193 (O4) 256 (O4)
1 193 (O4) 256 (O4)

gfx1100w32 gfx1101w32:
Occ MinVGPR MaxVGPR
16 0 (O16) 96 (O16)
15 0 (O16) 96 (O16)
14 0 (O16) 96 (O16)
13 0 (O16) 96 (O16)
12 97 (O12) 120 (O12)
11 97 (O12) 120 (O12)
10 121 (O10) 144 (O10)
9 145 (O9) 168 (O9)
8 169 (O8) 192 (O8)
7 193 (O7) 216 (O7)
6 217 (O6) 240 (O6)
5 241 (O5) 256 (O5)
4 241 (O5) 256 (O5)
3 241 (O5) 256 (O5)
2 241 (O5) 256 (O5)
1 241 (O5) 256 (O5)

gfx1010w64 gfx1011w64 gfx1012w64 gfx1013w64:
Occ MinVGPR MaxVGPR
20 0 (O20) 24 (O20)
19 0 (O20) 24 (O20)
18 25 (O18) 28 (O18)
17 25 (O18) 28 (O18)
16 29 (O16) 32 (O16)
15 29 (O16) 32 (O16)
14 33 (O14) 36 (O14)
13 33 (O14) 36 (O14)
12 37 (O12) 40 (O12)
11 41 (O11) 44 (O11)
10 45 (O10) 48 (O10)
9 49 (O9) 56 (O9)
8 57 (O8) 64 (O8)
7 65 (O7) 72 (O7)
6 73 (O6) 84 (O6)
5 85 (O5) 100 (O5)
4 101 (O4) 128 (O4)
3 129 (O3) 168 (O3)
2 169 (O2) 256 (O2)
1 169 (O2) 256 (O2)

gfx1010w32 gfx1011w32 gfx1012w32 gfx1013w32:
Occ MinVGPR MaxVGPR
20 0 (O20) 48 (O20)
19 0 (O20) 48 (O20)
18 49 (O18) 56 (O18)
17 49 (O18) 56 (O18)
16 57 (O16) 64 (O16)
15 57 (O16) 64 (O16)
14 65 (O14) 72 (O14)
13 65 (O14) 72 (O14)
12 73 (O12) 80 (O12)
11 81 (O11) 88 (O11)
10 89 (O10) 96 (O10)
9 97 (O9) 112 (O9)
8 113 (O8) 128 (O8)
7 129 (O7) 144 (O7)
6 145 (O6) 168 (O6)
5 169 (O5) 200 (O5)
4 201 (O4) 256 (O4)
3 201 (O4) 256 (O4)
2 201 (O4) 256 (O4)
1 201 (O4) 256 (O4)
```

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D138443


# 453eb9eb 05-Dec-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][MC] Correct handling of mandatory literals

Differential Revision: https://reviews.llvm.org/D138661


# 20cde154 03-Dec-2022 Kazu Hirata <kazu@google.com>

[Target] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716


# ca856fff 29-Nov-2022 Ron Lieberman <ron.lieberman@amd.com>

Revert "enable code-object-version=5"

very sorry wrong repo.

This reverts commit d882ba7aeac4b496dccd1b10cb58bd691786b691.


# d882ba7a 29-Nov-2022 Ron Lieberman <ron.lieberman@amd.com>

enable code-object-version=5


# 595a0884 17-Nov-2022 Mateja Marjanovic <mateja.marjanovic@amd.com>

[AMDGPU] Add support for new LLVM vector types

Add VReg, AReg and SReg on AMDGPU for bit widths: 288, 320, 352 and 384.

Differential Revision: https://reviews.llvm.org/D138205


# 9b8eb5fa 29-Nov-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][MC][GFX11] Correct op_sel handling for permlane*16

Differential Revision: https://reviews.llvm.org/D137969


# 869fc7ea 29-Nov-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][MC][MI100+] Enable VOP3 variants of dot2c/dot4c/dot8c opcodes

Differential Revision: https://reviews.llvm.org/D138494

