=========================
AMDGPU Instruction Syntax
=========================

.. contents::
   :local:

.. _amdgpu_syn_instructions:

Instructions
============

Syntax
~~~~~~

Syntax of Regular Instructions
------------------------------

An instruction has the following syntax:

  | ``<``\ *opcode mnemonic*\ ``>    <``\ *operand0*\ ``>,
      <``\ *operand1*\ ``>,...    <``\ *modifier0*\ ``> <``\ *modifier1*\ ``>...``

:doc:`Operands<AMDGPUOperandSyntax>` are normally comma-separated, while
:doc:`modifiers<AMDGPUModifierSyntax>` are space-separated.

The order of *operands* and *modifiers* is fixed.
Most *modifiers* are optional and may be omitted.

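As a rough illustration of this layout, the following Python sketch splits one
instruction line into its mnemonic, operands and modifiers. The helper name
``split_instruction`` and the sample input are assumptions made for this
example and are not part of the assembler. The sketch also assumes that the
last operand contains no spaces and that no modifier contains a comma, which
does not hold for every instruction.

.. code-block:: python

    def split_instruction(line):
        """Split an instruction into (mnemonic, operands, modifiers)."""
        text = line.split("//", 1)[0].strip()   # drop a trailing comment
        mnemonic, _, rest = text.partition(" ")
        chunks = [c.strip() for c in rest.split(",")] if rest.strip() else []
        modifiers = []
        if chunks:
            # Assumption: the last operand has no spaces, so the remaining
            # space-separated tokens of the final chunk are modifiers.
            tokens = chunks[-1].split()
            chunks[-1] = tokens[0] if tokens else ""
            modifiers = tokens[1:]
        return mnemonic, chunks, modifiers

For instance, ``split_instruction("v_add_f32_e64 v0, v1, v2 clamp")`` yields
the mnemonic ``v_add_f32_e64``, the operands ``v0``, ``v1``, ``v2`` and the
single modifier ``clamp``.
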
Syntax of VOPD Instructions
---------------------------

*VOPDX* and *VOPDY* instructions must be concatenated with the ``::`` operator
to form a single *VOPD* instruction:

    ``<``\ *VOPDX instruction*\ ``>  ::  <``\ *VOPDY instruction*\ ``>``

For example:

.. parsed-literal::

    v_dual_add_f32 v255, v255, v2 :: v_dual_fmaak_f32 v6, v2, v3, 1.0

Note that *VOPDX* and *VOPDY* instructions cannot be used as separate opcodes.

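A tool that processes assembly text could recover the two halves of a *VOPD*
instruction by splitting at the ``::`` operator. The sketch below is
illustrative only; ``split_vopd`` is a hypothetical helper, and it checks the
shape of the line without validating the opcodes themselves.

.. code-block:: python

    def split_vopd(line):
        """Return the (VOPDX, VOPDY) halves of a VOPD instruction."""
        halves = [h.strip() for h in line.split("::")]
        # Exactly one "::" with a non-empty instruction on each side.
        if len(halves) != 2 or not all(halves):
            raise ValueError("expected '<VOPDX> :: <VOPDY>'")
        return tuple(halves)

Applied to the example above, this returns ``v_dual_add_f32 v255, v255, v2``
as the *VOPDX* half and ``v_dual_fmaak_f32 v6, v2, v3, 1.0`` as the *VOPDY*
half.
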
.. _amdgpu_syn_instruction_mnemo:

Opcode Mnemonic
~~~~~~~~~~~~~~~

The opcode mnemonic describes the opcode semantics
and may include one or more suffixes in this order:

* :ref:`Packing suffix<amdgpu_syn_instruction_pk>`.
* :ref:`Destination operand type suffix<amdgpu_syn_instruction_type>`.
* :ref:`Source operand type suffix<amdgpu_syn_instruction_type>`.
* :ref:`Encoding suffix<amdgpu_syn_instruction_enc>`.

.. _amdgpu_syn_instruction_pk:

Packing Suffix
~~~~~~~~~~~~~~

Most instructions which operate on packed data have a *_pk* suffix.
Unless otherwise :ref:`noted<amdgpu_syn_instruction_operand_tags>`,
these instructions operate on and produce packed data composed of
two values. The type of these values is indicated by
:ref:`type suffixes<amdgpu_syn_instruction_type>`.

For example, the following instruction adds two pairs of f16 values
and produces a pair of f16 values:

.. parsed-literal::

    v_pk_add_f16 v1, v2, v3     // Each operand has f16x2 type

.. _amdgpu_syn_instruction_type:

Type and Size Suffixes
~~~~~~~~~~~~~~~~~~~~~~

Instructions which operate on data have an implied type for their *data* operands.
This data type is specified as a suffix of the instruction mnemonic.

Some instructions have two type suffixes:
the first is the data type of the destination operand,
the second is the data type of the source *data* operand(s).

Note that the data type specified by an instruction does not apply
to other kinds of operands such as *addresses*, *offsets* and so on.

The following table enumerates the most frequently used type suffixes.

    ============================================ ======================= ============================
    Type Suffixes                                Packed instruction?     Data Type
    ============================================ ======================= ============================
    _b512, _b256, _b128, _b64, _b32, _b16, _b8   No                      Bits.
    _u64, _u32, _u16, _u8                        No                      Unsigned integer.
    _i64, _i32, _i16, _i8                        No                      Signed integer.
    _f64, _f32, _f16                             No                      Floating-point.
    _b16, _u16, _i16, _f16                       Yes                     Packed (b16x2, u16x2, etc).
    ============================================ ======================= ============================

Instructions which have no type suffixes are assumed to operate on typeless data.
The size of typeless data is specified by size suffixes:

    ================= =================== =====================================
    Size Suffix       Implied data type   Required register size in dwords
    ================= =================== =====================================
    \-                b32                 1
    x2                b64                 2
    x3                b96                 3
    x4                b128                4
    x8                b256                8
    x16               b512                16
    x                 b32                 1
    xy                b64                 2
    xyz               b96                 3
    xyzw              b128                4
    d16_x             b16                 1
    d16_xy            b16x2               2 for GFX8.0, 1 for GFX8.1 and GFX9+
    d16_xyz           b16x3               3 for GFX8.0, 2 for GFX8.1 and GFX9+
    d16_xyzw          b16x4               4 for GFX8.0, 2 for GFX8.1 and GFX9+
    d16_format_x      b16                 1
    d16_format_xy     b16x2               1
    d16_format_xyz    b16x3               2
    d16_format_xyzw   b16x4               2
    ================= =================== =====================================

.. WARNING::
    There are exceptions to the rules described above.
    Operands which have a type different from the type specified by the opcode are
    :ref:`tagged<amdgpu_syn_instruction_operand_tags>` in the description.

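The register sizes in the size suffix table can be captured as a small lookup.
The Python sketch below is only an illustration of how the table reads; the
names ``DWORDS``, ``DWORDS_GFX8_0`` and ``register_dwords`` are hypothetical,
and the empty string stands for the ``-`` row (no size suffix).

.. code-block:: python

    # Register size in dwords per size suffix (GFX8.1 and GFX9+ values).
    DWORDS = {
        "": 1, "x2": 2, "x3": 3, "x4": 4, "x8": 8, "x16": 16,
        "x": 1, "xy": 2, "xyz": 3, "xyzw": 4,
        "d16_x": 1, "d16_xy": 1, "d16_xyz": 2, "d16_xyzw": 2,
        "d16_format_x": 1, "d16_format_xy": 1,
        "d16_format_xyz": 2, "d16_format_xyzw": 2,
    }

    # Rows of the table that list a different size for GFX8.0.
    DWORDS_GFX8_0 = {"d16_xy": 2, "d16_xyz": 3, "d16_xyzw": 4}

    def register_dwords(size_suffix, gfx8_0=False):
        """Look up the required register size for a size suffix."""
        if gfx8_0 and size_suffix in DWORDS_GFX8_0:
            return DWORDS_GFX8_0[size_suffix]
        return DWORDS[size_suffix]

With this lookup, ``register_dwords("d16_xyz")`` gives 2, while
``register_dwords("d16_xyz", gfx8_0=True)`` gives 3, matching the
corresponding table row.
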
Examples of instructions with different types of source and destination operands:

.. parsed-literal::

    s_bcnt0_i32_b64
    v_cvt_f32_u32

Examples of instructions with one data type:

.. parsed-literal::

    v_max3_f32
    v_max3_i16

Examples of instructions which operate on packed data:

.. parsed-literal::

    v_pk_add_u16
    v_pk_add_i16
    v_pk_add_f16

Examples of typeless instructions which operate on b128 data:

.. parsed-literal::

    buffer_store_dwordx4
    flat_load_dwordx4

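The type suffix rules can be sketched as a small Python helper that reads the
trailing type suffixes of a mnemonic. This is an approximation under stated
assumptions: ``data_types`` is a hypothetical name, the mnemonic is expected
without an encoding suffix, packed instructions are detected by the substring
``_pk_``, and the tagged exceptions mentioned in the warning above are not
handled.

.. code-block:: python

    import re

    # One or two trailing type suffixes such as _f32 or _i32_b64.
    _TYPES = re.compile(r"(?:_([buif]\d+))?_([buif]\d+)$")

    def data_types(mnemonic):
        """Return (destination type, source type), or None if typeless."""
        match = _TYPES.search(mnemonic)
        if match is None:
            return None
        dst, src = match.groups()
        dst = dst or src            # a single suffix covers both
        if "_pk_" in mnemonic:      # packed: two values per operand
            dst, src = dst + "x2", src + "x2"
        return dst, src

For the examples listed above, ``data_types("s_bcnt0_i32_b64")`` returns
``("i32", "b64")``, ``data_types("v_pk_add_f16")`` returns
``("f16x2", "f16x2")``, and ``data_types("buffer_store_dwordx4")`` returns
``None`` because the mnemonic carries a size suffix instead of a type suffix.
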
.. _amdgpu_syn_instruction_enc:

Encoding Suffixes
~~~~~~~~~~~~~~~~~

Most *VOP1*, *VOP2* and *VOPC* instructions have several variants:
they may also be encoded in *VOP3*, *DPP* and *SDWA* formats.

The assembler selects an optimal encoding automatically
based on instruction operands and modifiers,
unless a specific encoding is explicitly requested.
To force a specific encoding, add a suffix to the opcode of the instruction:

    =================================================== =================
    Encoding                                            Encoding Suffix
    =================================================== =================
    *VOP1*, *VOP2* and *VOPC* (32-bit) encoding         _e32
    *VOP3* (64-bit) encoding                            _e64
    *DPP* encoding                                      _dpp
    *SDWA* encoding                                     _sdwa
    *VOP3 DPP* encoding                                 _e64_dpp
    =================================================== =================

This reference uses encoding suffixes to specify which encoding is implied.
When no suffix is specified, the native instruction encoding is assumed.

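To make the suffix table concrete, the following Python sketch separates an
encoding suffix from the rest of a mnemonic. The names ``ENCODING_SUFFIXES``
and ``split_encoding`` are assumptions for this example; the sketch does not
check whether the requested encoding exists for a given opcode.

.. code-block:: python

    # "_e64_dpp" comes first so it is not mistaken for a plain "_dpp".
    ENCODING_SUFFIXES = ("_e64_dpp", "_e32", "_e64", "_dpp", "_sdwa")

    def split_encoding(mnemonic):
        """Return (base mnemonic, encoding suffix or None)."""
        for suffix in ENCODING_SUFFIXES:
            if mnemonic.endswith(suffix):
                return mnemonic[: -len(suffix)], suffix
        # No suffix: the native encoding of the instruction is assumed.
        return mnemonic, None

For example, ``split_encoding("v_add_f32_e64_dpp")`` returns
``("v_add_f32", "_e64_dpp")``, and ``split_encoding("v_add_f32")`` returns
``("v_add_f32", None)``.
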
Operands
========

Syntax
~~~~~~

The syntax of generic operands is described :doc:`in this document<AMDGPUOperandSyntax>`.

For detailed information about operands, follow *operand links* in GPU-specific documents.

Modifiers
=========

Syntax
~~~~~~

The syntax of modifiers is described :doc:`in this document<AMDGPUModifierSyntax>`.

Information about modifiers supported for individual instructions
may be found in GPU-specific documents.
