.. _gmir-opcodes:

Generic Opcodes
===============

.. contents::
   :local:

.. note::

  This documentation does not yet fully account for vectors. Many of the
  scalar/integer/floating-point operations can also take vectors.

Constants
---------

G_IMPLICIT_DEF
^^^^^^^^^^^^^^

An undefined value.

.. code-block:: none

  %0:_(s32) = G_IMPLICIT_DEF

G_CONSTANT
^^^^^^^^^^

An integer constant.

.. code-block:: none

  %0:_(s32) = G_CONSTANT i32 1

G_FCONSTANT
^^^^^^^^^^^

A floating point constant.

.. code-block:: none

  %0:_(s32) = G_FCONSTANT float 1.0

G_FRAME_INDEX
^^^^^^^^^^^^^

The address of an object in the stack frame.

.. code-block:: none

  %1:_(p0) = G_FRAME_INDEX %stack.0.ptr0

G_GLOBAL_VALUE
^^^^^^^^^^^^^^

The address of a global value.

.. code-block:: none

  %0(p0) = G_GLOBAL_VALUE @var_local

G_PTRAUTH_GLOBAL_VALUE
^^^^^^^^^^^^^^^^^^^^^^

The signed address of a global value. Operands: address to be signed (pointer),
key (32-bit imm), address for address discrimination (zero if not needed) and
an extra discriminator (64-bit imm).

.. code-block:: none

  %0:_(p0) = G_PTRAUTH_GLOBAL_VALUE %1:_(p0), s32, %2:_(p0), s64

G_BLOCK_ADDR
^^^^^^^^^^^^

The address of a basic block.

.. code-block:: none

  %0:_(p0) = G_BLOCK_ADDR blockaddress(@test_blockaddress, %ir-block.block)

G_CONSTANT_POOL
^^^^^^^^^^^^^^^

The address of an object in the constant pool.

.. code-block:: none

  %0:_(p0) = G_CONSTANT_POOL %const.0

Integer Extension and Truncation
--------------------------------

G_ANYEXT
^^^^^^^^

Extend the underlying scalar type of an operation, leaving the high bits
unspecified.

.. code-block:: none

  %1:_(s32) = G_ANYEXT %0:_(s16)

G_SEXT
^^^^^^

Sign extend the underlying scalar type of an operation, copying the sign bit
into the newly-created space.

.. code-block:: none

  %1:_(s32) = G_SEXT %0:_(s16)

G_SEXT_INREG
^^^^^^^^^^^^

Sign extend the value from an arbitrary bit position, copying the sign bit
into all bits above it. This is equivalent to a shl + ashr pair with an
appropriate shift amount. $sz is an immediate (MachineOperand::isImm()
returns true) to allow targets to have some bitwidths legal and others
lowered. This opcode is particularly useful if the target has sign-extension
instructions that are cheaper than the constituent shifts as the optimizer is
able to make decisions on whether it's better to hang on to the G_SEXT_INREG
or to lower it and optimize the individual shifts.

.. code-block:: none

  %1:_(s32) = G_SEXT_INREG %0:_(s32), 16
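
The shl + ashr equivalence of G_SEXT_INREG can be checked with a small model.
The Python sketch below is illustrative only and is not LLVM code:
``sext_inreg`` and ``to_signed`` are hypothetical helpers, and a register is
modelled as an unsigned integer that is ``bits`` wide.

.. code-block:: python

  def to_signed(value, bits):
      """Interpret an unsigned bit pattern as a two's complement number."""
      return value - (1 << bits) if value & (1 << (bits - 1)) else value

  def sext_inreg(value, bits, sz):
      """Model G_SEXT_INREG as a shl + ashr pair on a `bits`-wide register."""
      mask = (1 << bits) - 1
      shift = bits - sz
      shifted = (value << shift) & mask           # shl: drop the bits above sz
      result = to_signed(shifted, bits) >> shift  # ashr: replicate the sign bit
      return result & mask

  assert sext_inreg(0x00008000, 32, 16) == 0xFFFF8000
  assert sext_inreg(0x12347FFF, 32, 16) == 0x00007FFF

Bits above position ``sz`` in the input do not influence the result, which is
why the second input's upper half is discarded.
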

G_ZEXT
^^^^^^

Zero extend the underlying scalar type of an operation, putting zero bits
into the newly-created space.

.. code-block:: none

  %1:_(s32) = G_ZEXT %0:_(s16)

G_TRUNC
^^^^^^^

Truncate the underlying scalar type of an operation. This is equivalent to
G_EXTRACT for scalar types, but acts elementwise on vectors.

.. code-block:: none

  %1:_(s16) = G_TRUNC %0:_(s32)

Type Conversions
----------------

G_INTTOPTR
^^^^^^^^^^

Convert an integer to a pointer.

.. code-block:: none

  %1:_(p0) = G_INTTOPTR %0:_(s32)

G_PTRTOINT
^^^^^^^^^^

Convert a pointer to an integer.

.. code-block:: none

  %1:_(s32) = G_PTRTOINT %0:_(p0)

G_BITCAST
^^^^^^^^^

Reinterpret a value as a new type. This is usually done without
changing any bits but this is not always the case due to a subtlety in the
definition of the :ref:`LLVM-IR Bitcast Instruction <i_bitcast>`. It
is allowed to bitcast between pointers with the same size, but
different address spaces.

.. code-block:: none

  %1:_(s64) = G_BITCAST %0:_(<2 x s32>)

G_ADDRSPACE_CAST
^^^^^^^^^^^^^^^^

Convert a pointer in one address space to a pointer in another address space.

.. code-block:: none

  %1:_(p1) = G_ADDRSPACE_CAST %0:_(p0)

.. caution::

  :ref:`i_addrspacecast` doesn't mention what happens if the cast is simply
  invalid (i.e. if the address spaces are disjoint).

Scalar Operations
-----------------

G_EXTRACT
^^^^^^^^^

Extract a register of the specified size, starting from the block given by
index. This will almost certainly be mapped to sub-register COPYs after
register banks have been selected.

.. code-block:: none

  %3:_(s32) = G_EXTRACT %2:_(s64), 32

G_INSERT
^^^^^^^^

Insert a smaller register into a larger one at the specified bit-index.

.. code-block:: none

  %2:_(s64) = G_INSERT %0:_(s64), %1:_(s32), 0

G_MERGE_VALUES
^^^^^^^^^^^^^^

Concatenate multiple registers of the same size into a wider register.
The input operands are always ordered from lowest bits to highest:

.. code-block:: none

  %0:_(s32) = G_MERGE_VALUES %bits_0_7:_(s8), %bits_8_15:_(s8),
                             %bits_16_23:_(s8), %bits_24_31:_(s8)

G_UNMERGE_VALUES
^^^^^^^^^^^^^^^^

Extract multiple registers of the specified size, starting from blocks given by
indexes. This will almost certainly be mapped to sub-register COPYs after
register banks have been selected.
The output operands are always ordered from lowest bits to highest:

.. code-block:: none

  %bits_0_7:_(s8), %bits_8_15:_(s8),
      %bits_16_23:_(s8), %bits_24_31:_(s8) = G_UNMERGE_VALUES %0:_(s32)
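
The lowest-bits-first operand order of G_MERGE_VALUES and G_UNMERGE_VALUES can
be illustrated with a small model. This is a sketch in Python rather than LLVM
code; ``merge_values`` and ``unmerge_values`` are hypothetical helpers that
treat registers as unsigned integers.

.. code-block:: python

  def merge_values(parts, part_bits):
      """Model G_MERGE_VALUES: parts[0] supplies the lowest bits."""
      mask = (1 << part_bits) - 1
      result = 0
      for index, part in enumerate(parts):
          result |= (part & mask) << (index * part_bits)
      return result

  def unmerge_values(value, part_bits, count):
      """Model G_UNMERGE_VALUES: the first result holds the lowest bits."""
      mask = (1 << part_bits) - 1
      return [(value >> (index * part_bits)) & mask for index in range(count)]

  assert merge_values([0x78, 0x56, 0x34, 0x12], 8) == 0x12345678
  assert unmerge_values(0x12345678, 8, 4) == [0x78, 0x56, 0x34, 0x12]
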

G_BSWAP
^^^^^^^

Reverse the order of the bytes in a scalar.

.. code-block:: none

  %1:_(s32) = G_BSWAP %0:_(s32)

G_BITREVERSE
^^^^^^^^^^^^

Reverse the order of the bits in a scalar.

.. code-block:: none

  %1:_(s32) = G_BITREVERSE %0:_(s32)

G_SBFX, G_UBFX
^^^^^^^^^^^^^^

Extract a range of bits from a register.

The source operands are registers as follows:

- Source
- The least-significant bit for the extraction
- The width of the extraction

The least-significant bit (lsb) and width operands are in the range:

::

      0 <= lsb < lsb + width <= source bitwidth, where all values are unsigned

G_SBFX sign-extends the result, while G_UBFX zero-extends the result.

.. code-block:: none

  ; Extract 5 bits starting at bit 1 from %x and store them in %a.
  ; Sign-extend the result.
  ;
  ; Example:
  ; %x = 0...0000[10110]1 ---> %a = 1...111111[10110]
  %lsb_one = G_CONSTANT i32 1
  %width_five = G_CONSTANT i32 5
  %a:_(s32) = G_SBFX %x, %lsb_one, %width_five

  ; Extract 3 bits starting at bit 2 from %x and store them in %b. Zero-extend
  ; the result.
  ;
  ; Example:
  ; %x = 1...11111[100]11 ---> %b = 0...00000[100]
  %lsb_two = G_CONSTANT i32 2
  %width_three = G_CONSTANT i32 3
  %b:_(s32) = G_UBFX %x, %lsb_two, %width_three
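
The two G_SBFX and G_UBFX examples can be reproduced with a short model. The
Python sketch below is an illustration, not LLVM code; ``ubfx`` and ``sbfx``
are hypothetical helpers operating on unsigned integers, and the 32-bit
register width is an assumption taken from the examples.

.. code-block:: python

  def ubfx(src, lsb, width):
      """Model G_UBFX: extract `width` bits starting at `lsb`, zero-extended."""
      return (src >> lsb) & ((1 << width) - 1)

  def sbfx(src, lsb, width, bits=32):
      """Model G_SBFX: as ubfx, then sign-extend from bit `width - 1`."""
      field = ubfx(src, lsb, width)
      if field & (1 << (width - 1)):
          field -= 1 << width
      return field & ((1 << bits) - 1)

  # %x = 0...0000[10110]1 with lsb 1 and width 5
  assert sbfx(0b101101, 1, 5) == 0xFFFFFFF6
  # %x = 1...11111[100]11 with lsb 2 and width 3
  assert ubfx(0xFFFFFFF3, 2, 3) == 0b100
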

Integer Operations
------------------

G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SDIV, G_UDIV, G_SREM, G_UREM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These each perform their respective integer arithmetic on a scalar.

.. code-block:: none

  %dst:_(s32) = G_ADD %src0:_(s32), %src1:_(s32)

The above example adds %src1 to %src0 and stores the result in %dst.

G_SDIVREM, G_UDIVREM
^^^^^^^^^^^^^^^^^^^^

Perform integer division and remainder thereby producing two results.

.. code-block:: none

  %div:_(s32), %rem:_(s32) = G_SDIVREM %0:_(s32), %1:_(s32)

G_SADDSAT, G_UADDSAT, G_SSUBSAT, G_USUBSAT, G_SSHLSAT, G_USHLSAT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Signed and unsigned addition, subtraction and left shift with saturation.

.. code-block:: none

  %2:_(s32) = G_SADDSAT %0:_(s32), %1:_(s32)
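
Saturation means that a result outside the representable range is clamped to
the nearest bound instead of wrapping. The Python sketch below models three of
these opcodes; the helper names are hypothetical, and signed operands are
passed as ordinary Python integers rather than bit patterns.

.. code-block:: python

  def sadd_sat(a, b, bits):
      """Model G_SADDSAT: clamp to the signed range of a `bits`-wide type."""
      lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
      return max(lo, min(hi, a + b))

  def uadd_sat(a, b, bits):
      """Model G_UADDSAT: clamp to the largest unsigned value."""
      return min((1 << bits) - 1, a + b)

  def usub_sat(a, b):
      """Model G_USUBSAT: clamp to zero."""
      return max(0, a - b)

  assert sadd_sat(100, 100, 8) == 127
  assert sadd_sat(-100, -100, 8) == -128
  assert uadd_sat(200, 100, 8) == 255
  assert usub_sat(5, 10) == 0
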

G_SHL, G_LSHR, G_ASHR
^^^^^^^^^^^^^^^^^^^^^

Shift the bits of a scalar left or right inserting zeros (sign-bit for G_ASHR).

G_ROTR, G_ROTL
^^^^^^^^^^^^^^

Rotate the bits right (G_ROTR) or left (G_ROTL).
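
The shifts and rotates differ only in what fills the vacated bit positions.
The Python sketch below models that difference on unsigned bit patterns. The
helper names are hypothetical, and reducing the rotate amount modulo the bit
width is an assumption of this sketch, not something this document specifies.

.. code-block:: python

  def lshr(value, amount):
      """Model G_LSHR: vacated high bits are filled with zeros."""
      return value >> amount

  def ashr(value, amount, bits):
      """Model G_ASHR: vacated high bits are copies of the sign bit."""
      signed = value - (1 << bits) if value >> (bits - 1) else value
      return (signed >> amount) & ((1 << bits) - 1)

  def rotl(value, amount, bits):
      """Model G_ROTL: bits shifted out at the top re-enter at the bottom."""
      amount %= bits  # assumption: the amount wraps at the bit width
      mask = (1 << bits) - 1
      return ((value << amount) | (value >> (bits - amount))) & mask

  def rotr(value, amount, bits):
      """Model G_ROTR as a left rotate by the complementary amount."""
      return rotl(value, bits - amount % bits, bits)

  assert lshr(0x80, 3) == 0x10
  assert ashr(0x80, 3, 8) == 0xF0
  assert rotl(0x81, 1, 8) == 0x03
  assert rotr(0x03, 1, 8) == 0x81
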

G_ICMP
^^^^^^

Perform integer comparison producing non-zero (true) or zero (false). It's
target specific whether a true value is 1, ~0U, or some other non-zero value.

G_SCMP
^^^^^^

Perform signed 3-way integer comparison producing -1 (smaller), 0 (equal), or 1 (larger).

.. code-block:: none

  %5:_(s32) = G_SCMP %6, %2

G_UCMP
^^^^^^

Perform unsigned 3-way integer comparison producing -1 (smaller), 0 (equal), or 1 (larger).

.. code-block:: none

  %7:_(s32) = G_UCMP %2, %6
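
The same bit patterns can compare differently under G_SCMP and G_UCMP. The
Python sketch below illustrates this with hypothetical helpers that take
unsigned bit patterns; it is a model of the semantics, not LLVM code.

.. code-block:: python

  def to_signed(value, bits):
      """Interpret an unsigned bit pattern as a two's complement number."""
      return value - (1 << bits) if value & (1 << (bits - 1)) else value

  def ucmp(a, b):
      """Model G_UCMP: -1, 0 or 1 for smaller, equal or larger."""
      return (a > b) - (a < b)

  def scmp(a, b, bits):
      """Model G_SCMP: compare the operands as signed values."""
      return ucmp(to_signed(a, bits), to_signed(b, bits))

  # 0xFFFFFFFF is the largest unsigned s32 value but is -1 when signed.
  assert ucmp(0xFFFFFFFF, 1) == 1
  assert scmp(0xFFFFFFFF, 1, 32) == -1
  assert scmp(7, 7, 32) == 0
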

G_SELECT
^^^^^^^^

Select between two values depending on a zero/non-zero value.

.. code-block:: none

  %5:_(s32) = G_SELECT %4(s1), %6, %2

G_PTR_ADD
^^^^^^^^^

Add a scalar offset in addressable units to a pointer. Addressable units are
typically bytes but this may vary between targets.

.. code-block:: none

  %2:_(p0) = G_PTR_ADD %0:_(p0), %1:_(s32)

.. caution::

  There are currently no in-tree targets that use this with addressable units
  not equal to 8 bits.

G_PTRMASK
^^^^^^^^^

Zero out an arbitrary mask of bits of a pointer. The mask type must be
an integer, and the number of vector elements must match for all
operands. This corresponds to `i_intr_llvm_ptrmask`.

.. code-block:: none

  %2:_(p0) = G_PTRMASK %0, %1

G_SMIN, G_SMAX, G_UMIN, G_UMAX
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Take the minimum/maximum of two values.

.. code-block:: none

  %5:_(s32) = G_SMIN %6, %2

G_ABS
^^^^^

Take the absolute value of a signed integer. The absolute value of the minimum
negative value (e.g. the 8-bit value `0x80`) is defined to be itself.

.. code-block:: none

  %1:_(s32) = G_ABS %0
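
The special case for the minimum negative value follows from two's complement
wrap-around, as the Python sketch below shows. ``g_abs`` is a hypothetical
helper that takes and returns unsigned bit patterns.

.. code-block:: python

  def g_abs(value, bits):
      """Model G_ABS on a `bits`-wide two's complement bit pattern."""
      signed = value - (1 << bits) if value & (1 << (bits - 1)) else value
      return abs(signed) & ((1 << bits) - 1)

  assert g_abs(0xFF, 8) == 0x01  # |-1| = 1
  assert g_abs(0x80, 8) == 0x80  # |-128| does not fit in s8 and wraps to itself
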

G_UADDO, G_SADDO, G_USUBO, G_SSUBO, G_SMULO, G_UMULO
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform the requested arithmetic and produce a carry output in addition to the
normal result.

.. code-block:: none

  %3:_(s32), %4:_(s1) = G_UADDO %0, %1

G_UADDE, G_SADDE, G_USUBE, G_SSUBE
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform the requested arithmetic and consume a carry input in addition to the
normal input. Also produce a carry output in addition to the normal result.

.. code-block:: none

  %4:_(s32), %5:_(s1) = G_UADDE %0, %1, %3:_(s1)
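
One use of the carry operands is to build a wide addition out of narrow ones.
The Python sketch below chains a model of G_UADDO into a model of G_UADDE to
add two 64-bit values in 32-bit halves. The helper names are hypothetical and
the sketch is not taken from any LLVM lowering.

.. code-block:: python

  MASK32 = 0xFFFFFFFF

  def uaddo(a, b):
      """Model G_UADDO on s32: return (result, carry_out)."""
      total = a + b
      return total & MASK32, total >> 32

  def uadde(a, b, carry_in):
      """Model G_UADDE on s32: return (result, carry_out)."""
      total = a + b + carry_in
      return total & MASK32, total >> 32

  def add64(x, y):
      """Add two 64-bit values using only 32-bit additions."""
      lo, carry = uaddo(x & MASK32, y & MASK32)
      hi, _ = uadde(x >> 32, y >> 32, carry)
      return (hi << 32) | lo

  assert add64(0x00000001FFFFFFFF, 1) == 0x0000000200000000
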

G_UMULH, G_SMULH
^^^^^^^^^^^^^^^^

Multiply two numbers at twice the incoming bit width (unsigned or signed) and
return the high half of the result.

.. code-block:: none

  %3:_(s32) = G_UMULH %0, %1
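
The high half depends on whether the operands are treated as signed or
unsigned. The Python sketch below models both opcodes on unsigned bit
patterns; the helper names are hypothetical.

.. code-block:: python

  def to_signed(value, bits):
      """Interpret an unsigned bit pattern as a two's complement number."""
      return value - (1 << bits) if value & (1 << (bits - 1)) else value

  def umulh(a, b, bits):
      """Model G_UMULH: high half of the unsigned double-width product."""
      return (a * b) >> bits

  def smulh(a, b, bits):
      """Model G_SMULH: high half of the signed double-width product."""
      product = to_signed(a, bits) * to_signed(b, bits)
      return (product >> bits) & ((1 << bits) - 1)

  assert umulh(0xFFFFFFFF, 0xFFFFFFFF, 32) == 0xFFFFFFFE
  assert smulh(0xFFFFFFFF, 0xFFFFFFFF, 32) == 0x00000000  # (-1) * (-1) = 1
  assert smulh(0xFFFFFFFF, 2, 32) == 0xFFFFFFFF           # (-1) * 2 = -2
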

G_CTLZ, G_CTTZ, G_CTPOP
^^^^^^^^^^^^^^^^^^^^^^^

Count leading zeros, trailing zeros, or number of set bits.

.. code-block:: none

  %2:_(s33) = G_CTLZ %1
  %2:_(s33) = G_CTTZ %1
  %2:_(s33) = G_CTPOP %1

G_CTLZ_ZERO_UNDEF, G_CTTZ_ZERO_UNDEF
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Count leading zeros or trailing zeros. If the value is zero then the result is
undefined.

.. code-block:: none

  %2:_(s33) = G_CTLZ_ZERO_UNDEF %1
  %2:_(s33) = G_CTTZ_ZERO_UNDEF %1
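
The Python sketch below models the counting operations on an unsigned bit
pattern. The helper names are hypothetical. Returning the bit width for a zero
input is an assumption carried over from the corresponding LLVM IR intrinsics;
the ``_ZERO_UNDEF`` forms give no such guarantee.

.. code-block:: python

  def ctlz(value, bits):
      """Model G_CTLZ: leading zero bits in a `bits`-wide value."""
      return bits - value.bit_length()

  def cttz(value, bits):
      """Model G_CTTZ: trailing zero bits in a `bits`-wide value."""
      if value == 0:
          return bits
      return (value & -value).bit_length() - 1

  def ctpop(value):
      """Model G_CTPOP: number of set bits."""
      return bin(value).count("1")

  assert ctlz(1, 33) == 32
  assert ctlz(0, 33) == 33
  assert cttz(0b1000, 33) == 3
  assert ctpop(0b1011) == 3
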

G_ABDS, G_ABDU
^^^^^^^^^^^^^^

Compute the absolute difference (signed and unsigned), e.g. abs(x-y).

.. code-block:: none

  %0:_(s33) = G_ABDS %2, %3
  %1:_(s33) = G_ABDU %4, %5

Floating Point Operations
-------------------------

G_FCMP
^^^^^^

Perform floating point comparison producing non-zero (true) or zero
(false). It's target specific whether a true value is 1, ~0U, or some other
non-zero value.

G_FNEG
^^^^^^

Floating point negation.

G_FPEXT
^^^^^^^

Convert a floating point value to a larger type.

G_FPTRUNC
^^^^^^^^^

Convert a floating point value to a narrower type.

G_FPTOSI, G_FPTOUI, G_SITOFP, G_UITOFP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Convert between integer and floating point.

G_FPTOSI_SAT, G_FPTOUI_SAT
^^^^^^^^^^^^^^^^^^^^^^^^^^

Saturating conversion from floating point to integer.

G_FABS
^^^^^^

Take the absolute value of a floating point value.

G_FCOPYSIGN
^^^^^^^^^^^

Copy the value of the first operand, replacing the sign bit with that of the
second operand.

G_FCANONICALIZE
^^^^^^^^^^^^^^^

See :ref:`i_intr_llvm_canonicalize`.

G_IS_FPCLASS
^^^^^^^^^^^^

Tests if the first operand, which must be floating-point scalar or vector, has
floating-point class specified by the second operand. Returns non-zero (true)
or zero (false). It's target specific whether a true value is 1, ~0U, or some
other non-zero value. If the first operand is a vector, the returned value is a
vector of the same length.

G_FMINNUM
^^^^^^^^^

Perform floating-point minimum on two values.

In the case where a single input is a NaN (either signaling or quiet),
the non-NaN input is returned.

The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0.

G_FMAXNUM
^^^^^^^^^

Perform floating-point maximum on two values.

In the case where a single input is a NaN (either signaling or quiet),
the non-NaN input is returned.

The return value of (FMAXNUM 0.0, -0.0) could be either 0.0 or -0.0.

G_FMINNUM_IEEE
^^^^^^^^^^^^^^

Perform floating-point minimum on two values, following IEEE-754
definitions. This differs from FMINNUM in the handling of signaling
NaNs.

If one input is a signaling NaN, returns a quiet NaN. This matches
IEEE-754 2008's minnum/maxnum for signaling NaNs (which differs from
2019).

These treat -0 as ordered less than +0, matching the behavior of
IEEE-754 2019's minimumNumber/maximumNumber (which was unspecified in
2008).

G_FMAXNUM_IEEE
^^^^^^^^^^^^^^

Perform floating-point maximum on two values, following IEEE-754
definitions. This differs from FMAXNUM in the handling of signaling
NaNs.

If one input is a signaling NaN, returns a quiet NaN. This matches
IEEE-754 2008's minnum/maxnum for signaling NaNs (which differs from
2019).

These treat -0 as ordered less than +0, matching the behavior of
IEEE-754 2019's minimumNumber/maximumNumber (which was unspecified in
2008).

G_FMINIMUM
^^^^^^^^^^

NaN-propagating minimum that also treats -0.0 as less than 0.0. While
FMINNUM_IEEE follows IEEE 754-2008 semantics, FMINIMUM follows IEEE
754-2019 semantics.

G_FMAXIMUM
^^^^^^^^^^

NaN-propagating maximum that also treats -0.0 as less than 0.0. While
FMAXNUM_IEEE follows IEEE 754-2008 semantics, FMAXIMUM follows IEEE
754-2019 semantics.
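
The difference between the NaN-ignoring and NaN-propagating forms can be seen
in a small model. The Python sketch below uses hypothetical helpers,
``fminnum`` and ``fminimum``, on Python floats; it does not model signaling
NaNs, and it picks one of the two results G_FMINNUM permits for signed zeros.

.. code-block:: python

  import math

  def fminnum(a, b):
      """Model G_FMINNUM: a single NaN input is ignored."""
      if math.isnan(a):
          return b
      if math.isnan(b):
          return a
      return min(a, b)

  def fminimum(a, b):
      """Model G_FMINIMUM: NaN propagates and -0.0 orders below +0.0."""
      if math.isnan(a) or math.isnan(b):
          return math.nan
      if a == b == 0.0:
          return a if math.copysign(1.0, a) < 0 else b
      return min(a, b)

  assert fminnum(math.nan, 1.0) == 1.0
  assert math.isnan(fminimum(math.nan, 1.0))
  assert math.copysign(1.0, fminimum(0.0, -0.0)) == -1.0
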

G_FADD, G_FSUB, G_FMUL, G_FDIV, G_FREM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform the specified floating point arithmetic.

G_FMA
^^^^^

Perform a fused multiply add (i.e. without the intermediate rounding step).

G_FMAD
^^^^^^

Perform a non-fused multiply add (i.e. with the intermediate rounding step).
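
The intermediate rounding step can change the result. The Python sketch below
shows this for double precision only; ``fma`` and ``fmad`` are hypothetical
helpers, and the fused form is emulated with exact rational arithmetic
followed by a single rounding.

.. code-block:: python

  from fractions import Fraction

  def fma(a, b, c):
      """Fused: a * b + c is computed exactly, then rounded once."""
      return float(Fraction(a) * Fraction(b) + Fraction(c))

  def fmad(a, b, c):
      """Non-fused: the product is rounded before the addition."""
      return a * b + c

  x = 1.0 + 2.0 ** -27
  c = -(1.0 + 2.0 ** -26)
  # The exact product is 1 + 2**-26 + 2**-54; rounding drops the last term.
  assert fmad(x, x, c) == 0.0
  assert fma(x, x, c) == 2.0 ** -54
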

G_FPOW
^^^^^^

Raise the first operand to the power of the second.

G_FEXP, G_FEXP2
^^^^^^^^^^^^^^^

Calculate the base-e or base-2 exponential of a value.

G_FLOG, G_FLOG2, G_FLOG10
^^^^^^^^^^^^^^^^^^^^^^^^^

Calculate the base-e, base-2, or base-10 logarithm of a value, respectively.

G_FCEIL, G_FSQRT, G_FFLOOR, G_FRINT, G_FNEARBYINT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These correspond to the standard C functions of the same name.

G_FCOS, G_FSIN, G_FSINCOS, G_FTAN, G_FACOS, G_FASIN, G_FATAN, G_FATAN2, G_FCOSH, G_FSINH, G_FTANH
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These correspond to the standard C trigonometry functions of the same name.

G_INTRINSIC_TRUNC
^^^^^^^^^^^^^^^^^

Returns the operand rounded to the nearest integer not larger in magnitude than the operand.

G_INTRINSIC_ROUND
^^^^^^^^^^^^^^^^^

Returns the operand rounded to the nearest integer.

G_LROUND, G_LLROUND
^^^^^^^^^^^^^^^^^^^

Returns the source operand rounded to the nearest integer with ties away from
zero.

See the LLVM LangRef entry on '``llvm.lround.*``' for details on behaviour.

.. code-block:: none

  %rounded_32:_(s32) = G_LROUND %round_me:_(s64)
  %rounded_64:_(s64) = G_LLROUND %round_me:_(s64)
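
Ties away from zero is not the default rounding in every language, so it is
worth a concrete illustration. In the Python sketch below, ``lround`` is a
hypothetical helper built on the standard ``decimal`` module; the range of the
destination integer type is not modelled.

.. code-block:: python

  from decimal import Decimal, ROUND_HALF_UP

  def lround(x):
      """Round to the nearest integer, with ties away from zero."""
      return int(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_UP))

  assert lround(2.5) == 3
  assert lround(-2.5) == -3
  # Python's built-in round() uses ties-to-even instead.
  assert round(2.5) == 2
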

Vector Specific Operations
--------------------------

G_VSCALE
^^^^^^^^

Puts the value of the runtime ``vscale`` multiplied by the value in the source
operand into the destination register. This can be useful in determining the
actual runtime number of elements in a vector.

.. code-block:: none

  %0:_(s32) = G_VSCALE 4

G_INSERT_SUBVECTOR
^^^^^^^^^^^^^^^^^^

Insert the second source vector into the first source vector. The index operand
represents the starting index in the first source vector at which the second
source vector is inserted.

The index must be a constant multiple of the second source vector's minimum
vector length. If the vectors are scalable, then the index is first scaled by
the runtime scaling factor. The indices inserted in the source vector must be
valid indices of that vector. If this condition cannot be determined statically
but is false at runtime, then the result vector is undefined.

.. code-block:: none

  %2:_(<vscale x 4 x s64>) = G_INSERT_SUBVECTOR %0:_(<vscale x 4 x s64>), %1:_(<vscale x 2 x s64>), 0

G_EXTRACT_SUBVECTOR
^^^^^^^^^^^^^^^^^^^

Extract a vector of destination type from the source vector. The index operand
represents the starting index from which a subvector is extracted from
the source vector.

The index must be a constant multiple of the source vector's minimum vector
length. If the source vector is a scalable vector, then the index is first
scaled by the runtime scaling factor. The indices extracted from the source
vector must be valid indices of that vector. If this condition cannot be
determined statically but is false at runtime, then the result vector is
undefined.

Mixing scalable vectors and fixed vectors is not allowed.

.. code-block:: none

  %3:_(<vscale x 4 x s64>) = G_EXTRACT_SUBVECTOR %2:_(<vscale x 8 x s64>), 2

G_CONCAT_VECTORS
^^^^^^^^^^^^^^^^

Concatenate two vectors to form a longer vector.

G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Create a vector from multiple scalar registers. No implicit
conversion is performed (i.e. the result element type must be the
same as all source operands).

The _TRUNC version truncates the larger operand types to fit the
destination vector element type.

G_INSERT_VECTOR_ELT
^^^^^^^^^^^^^^^^^^^

Insert an element into a vector.

G_EXTRACT_VECTOR_ELT
^^^^^^^^^^^^^^^^^^^^

Extract an element from a vector.

G_SHUFFLE_VECTOR
^^^^^^^^^^^^^^^^

Concatenate two vectors and shuffle the elements according to the mask operand.
The mask operand should be an IR Constant which exactly matches the
corresponding mask for the IR shufflevector instruction.

G_SPLAT_VECTOR
^^^^^^^^^^^^^^

Create a vector where all elements are the scalar from the source operand.

The type of the operand must be equal to or larger than the vector element
type. If the operand is larger than the vector element type, the scalar is
implicitly truncated to the vector element type.

G_STEP_VECTOR
^^^^^^^^^^^^^

Create a scalable vector where all lanes are linear sequences starting at 0
with a given unsigned step.

The type of the operand must be equal to the vector element type. Arithmetic
is performed modulo the bitwidth of the element. The step must be > 0.
Otherwise the vector is zero.

.. code-block:: none

  %0:_(<vscale x 2 x s64>) = G_STEP_VECTOR i64 4

  %1:_(<vscale x s32>) = G_STEP_VECTOR i32 4

  0, 1*Step, 2*Step, 3*Step, 4*Step, ...
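
The wrap-around of the step sequence can be illustrated with a small model. In
the Python sketch below, ``step_vector`` is a hypothetical helper and
``lanes`` stands in for the runtime element count of the scalable vector.

.. code-block:: python

  def step_vector(step, lanes, bits):
      """Model G_STEP_VECTOR: lane i holds i * step modulo 2**bits."""
      return [(i * step) % (1 << bits) for i in range(lanes)]

  assert step_vector(4, 4, 64) == [0, 4, 8, 12]
  # With 8-bit elements the fourth lane wraps: 300 mod 256 = 44.
  assert step_vector(100, 4, 8) == [0, 100, 200, 44]
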

G_VECTOR_COMPRESS
^^^^^^^^^^^^^^^^^

Given an input vector, a mask vector, and a passthru vector, contiguously place
all selected (i.e., where mask[i] = true) input lanes in an output vector. All
remaining lanes in the output are taken from passthru, which may be undef.
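
A small model may make the lane movement easier to follow. The Python sketch
below uses a hypothetical helper, ``vector_compress``, on Python lists. Taking
the remaining lanes from the same lane positions of passthru is an assumption
of this sketch.

.. code-block:: python

  def vector_compress(vec, mask, passthru):
      """Pack the selected lanes to the front, then fill from passthru."""
      selected = [lane for lane, keep in zip(vec, mask) if keep]
      return selected + passthru[len(selected):]

  result = vector_compress([1, 2, 3, 4],
                           [False, True, False, True],
                           [90, 91, 92, 93])
  assert result == [2, 4, 92, 93]
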

Vector Reduction Operations
---------------------------

These operations represent horizontal vector reduction, producing a scalar result.

G_VECREDUCE_SEQ_FADD, G_VECREDUCE_SEQ_FMUL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The SEQ variants perform reductions in sequential order. The first operand is
an initial scalar accumulator value, and the second operand is the vector to reduce.

G_VECREDUCE_FADD, G_VECREDUCE_FMUL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These reductions are relaxed variants which may reduce the elements in any order.
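
The order matters because floating point addition is not associative. The
Python sketch below contrasts a sequential reduction with one possible
reassociation, using double precision; ``vecreduce_seq_fadd`` is a
hypothetical helper, and the pairwise order shown is only one of the orders a
relaxed reduction might use.

.. code-block:: python

  def vecreduce_seq_fadd(start, vec):
      """Model G_VECREDUCE_SEQ_FADD: accumulate strictly left to right."""
      acc = start
      for element in vec:
          acc += element
      return acc

  vec = [1e16, 1.0, -1e16, 1.0]
  # Sequential: 1e16 + 1.0 rounds back to 1e16, so only the last 1.0 survives.
  assert vecreduce_seq_fadd(0.0, vec) == 1.0
  # Reassociated: (1e16 + -1e16) + (1.0 + 1.0) keeps both small terms.
  assert (vec[0] + vec[2]) + (vec[1] + vec[3]) == 2.0
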

G_VECREDUCE_FMAX, G_VECREDUCE_FMIN, G_VECREDUCE_FMAXIMUM, G_VECREDUCE_FMINIMUM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

FMIN/FMAX/FMINIMUM/FMAXIMUM nodes can have flags, for NaN/NoNaN variants.

Integer/bitwise reductions
^^^^^^^^^^^^^^^^^^^^^^^^^^

* G_VECREDUCE_ADD
* G_VECREDUCE_MUL
* G_VECREDUCE_AND
* G_VECREDUCE_OR
* G_VECREDUCE_XOR
* G_VECREDUCE_SMAX
* G_VECREDUCE_SMIN
* G_VECREDUCE_UMAX
* G_VECREDUCE_UMIN

Integer reductions may have a result type larger than the vector element type.
However, the reduction is performed using the vector element type and the value
in the top bits is unspecified.

Memory Operations
-----------------

G_LOAD, G_SEXTLOAD, G_ZEXTLOAD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Generic load. Expects a MachineMemOperand in addition to explicit
operands. If the result size is larger than the memory size, the
high bits are undefined, sign-extended, or zero-extended respectively.

Only G_LOAD is valid if the result is a vector type. If the result is larger
than the memory size, the high elements are undefined (i.e. this is not a
per-element, vector anyextload).

Unlike in SelectionDAG, atomic loads are expressed with the same
opcodes as regular loads. G_LOAD, G_SEXTLOAD and G_ZEXTLOAD may all
have atomic memory operands.

G_INDEXED_LOAD
^^^^^^^^^^^^^^

Generic indexed load. Combines a GEP with a load. $newaddr is set to $base + $offset.
If $am is 0 (post-indexed), then the value is loaded from $base; if $am is 1 (pre-indexed)
then the value is loaded from $newaddr.
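
The two addressing modes differ only in which address the load uses. The
Python sketch below models this with a dictionary standing in for memory;
``indexed_load`` and its parameters are hypothetical names for illustration.

.. code-block:: python

  def indexed_load(memory, base, offset, pre_indexed):
      """Model G_INDEXED_LOAD: return (loaded value, new address)."""
      new_addr = base + offset
      value = memory[new_addr if pre_indexed else base]
      return value, new_addr

  memory = {0x100: "a", 0x104: "b"}
  # Post-indexed: load from the base, then advance the address.
  assert indexed_load(memory, 0x100, 4, pre_indexed=False) == ("a", 0x104)
  # Pre-indexed: advance the address, then load from it.
  assert indexed_load(memory, 0x100, 4, pre_indexed=True) == ("b", 0x104)
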

G_INDEXED_SEXTLOAD
^^^^^^^^^^^^^^^^^^

Same as G_INDEXED_LOAD except that the load performed is sign-extending, as with G_SEXTLOAD.

G_INDEXED_ZEXTLOAD
^^^^^^^^^^^^^^^^^^

Same as G_INDEXED_LOAD except that the load performed is zero-extending, as with G_ZEXTLOAD.

G_STORE
^^^^^^^

Generic store. Expects a MachineMemOperand in addition to explicit
operands. If the stored value size is greater than the memory size,
the high bits are implicitly truncated. If this is a vector store, the
high elements are discarded (i.e. this does not function as a per-lane
vector, truncating store).

G_INDEXED_STORE
^^^^^^^^^^^^^^^

Combines a store with a GEP. See description of G_INDEXED_LOAD for indexing behaviour.

G_ATOMIC_CMPXCHG_WITH_SUCCESS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Generic atomic cmpxchg with internal success check. Expects a
MachineMemOperand in addition to explicit operands.

G_ATOMIC_CMPXCHG
^^^^^^^^^^^^^^^^

Generic atomic cmpxchg. Expects a MachineMemOperand in addition to explicit
operands.

|all_g_atomicrmw|
^^^^^^^^^^^^^^^^^

.. |all_g_atomicrmw| replace:: G_ATOMICRMW_XCHG, G_ATOMICRMW_ADD,
                               G_ATOMICRMW_SUB, G_ATOMICRMW_AND,
                               G_ATOMICRMW_NAND, G_ATOMICRMW_OR,
                               G_ATOMICRMW_XOR, G_ATOMICRMW_MAX,
                               G_ATOMICRMW_MIN, G_ATOMICRMW_UMAX,
                               G_ATOMICRMW_UMIN, G_ATOMICRMW_FADD,
                               G_ATOMICRMW_FSUB, G_ATOMICRMW_FMAX,
                               G_ATOMICRMW_FMIN, G_ATOMICRMW_UINC_WRAP,
                               G_ATOMICRMW_UDEC_WRAP, G_ATOMICRMW_USUB_COND,
                               G_ATOMICRMW_USUB_SAT

Generic atomicrmw. Expects a MachineMemOperand in addition to explicit
operands.

G_FENCE
^^^^^^^

Generic fence. The first operand is the memory ordering. The second operand is
the syncscope.

See the LLVM LangRef entry on the '``fence``' instruction for more details.

G_MEMCPY
^^^^^^^^

Generic memcpy. Expects two MachineMemOperands covering the store and load
respectively, in addition to explicit operands.

G_MEMCPY_INLINE
^^^^^^^^^^^^^^^

Generic inlined memcpy. Like G_MEMCPY, but it is guaranteed that this version
will not be lowered as a call to an external function. Currently the size
operand is required to evaluate as a constant (not an immediate), though that is
expected to change when llvm.memcpy.inline is taught to support dynamic sizes.

G_MEMMOVE
^^^^^^^^^

Generic memmove. Similar to G_MEMCPY, but the source and destination memory
ranges are allowed to overlap.

G_MEMSET
^^^^^^^^

Generic memset. Expects a MachineMemOperand in addition to explicit operands.

G_BZERO
^^^^^^^

Generic bzero. Expects a MachineMemOperand in addition to explicit operands.

Control Flow
------------

G_PHI
^^^^^

Implement the φ node in the SSA graph representing the function.

.. code-block:: none

  %dst(s8) = G_PHI %src1(s8), %bb.<id1>, %src2(s8), %bb.<id2>

G_BR
^^^^

Unconditional branch.

.. code-block:: none

  G_BR %bb.<id>

G_BRCOND
^^^^^^^^

Conditional branch.

.. code-block:: none

  G_BRCOND %condition, %basicblock.<id>

G_BRINDIRECT
^^^^^^^^^^^^

Indirect branch.

.. code-block:: none

  G_BRINDIRECT %src(p0)

G_BRJT
^^^^^^

Indirect branch to a jump table entry.

.. code-block:: none

  G_BRJT %ptr(p0), %jti, %idx(s64)

G_JUMP_TABLE
^^^^^^^^^^^^

Generates a pointer to the address of the jump table specified by the source
operand. The source operand is a jump table index.
G_JUMP_TABLE can be used in conjunction with G_BRJT to support jump table
codegen with GlobalISel.

.. code-block:: none

  %dst:_(p0) = G_JUMP_TABLE %jump-table.0

The above example generates a pointer to the source jump table index.

G_INVOKE_REGION_START
^^^^^^^^^^^^^^^^^^^^^

A marker instruction that acts as a pseudo-terminator for regions of code that may
throw exceptions. Being a terminator, it prevents code from being inserted after
it during passes like legalization. This is needed because calls to exception
throw routines do not return, so code that must be on an executable path must
not be placed after the throw.

G_INTRINSIC, G_INTRINSIC_CONVERGENT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Call an intrinsic that has no side-effects.

The _CONVERGENT variant corresponds to an LLVM IR intrinsic marked `convergent`.

.. note::

  Unlike SelectionDAG, there is no _VOID variant. Both of these are permitted
  to have zero, one, or multiple results.

G_INTRINSIC_W_SIDE_EFFECTS, G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Call an intrinsic that is considered to have unknown side-effects and as such
cannot be reordered across other side-effecting instructions.

The _CONVERGENT variant corresponds to an LLVM IR intrinsic marked `convergent`.

.. note::

  Unlike SelectionDAG, there is no _VOID variant. Both of these are permitted
  to have zero, one, or multiple results.

G_TRAP, G_DEBUGTRAP, G_UBSANTRAP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Represents :ref:`llvm.trap <llvm.trap>`, :ref:`llvm.debugtrap <llvm.debugtrap>`
and :ref:`llvm.ubsantrap <llvm.ubsantrap>`, which generate target-dependent
trap instructions.

.. code-block:: none

  G_TRAP

.. code-block:: none

  G_DEBUGTRAP

.. code-block:: none

  G_UBSANTRAP 12

Variadic Arguments
------------------

G_VASTART
^^^^^^^^^

.. caution::

  I found no documentation for this instruction at the time of writing.

G_VAARG
^^^^^^^

.. caution::

  I found no documentation for this instruction at the time of writing.

Other Operations
----------------

G_DYN_STACKALLOC
^^^^^^^^^^^^^^^^

Dynamically realigns the stack pointer to the specified size and alignment.
An alignment value of `0` or `1` means no specific alignment.

.. code-block:: none

  %8:_(p0) = G_DYN_STACKALLOC %7(s64), 32

Optimization Hints
------------------

These instructions do not correspond to any target instructions. They act as
hints for various combines.

G_ASSERT_SEXT, G_ASSERT_ZEXT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This signifies that the contents of a register were previously extended from a
smaller type.

The smaller type is denoted using an immediate operand. For scalars, this is the
width of the entire smaller type. For vectors, this is the width of the smaller
element type.

.. code-block:: none

  %x_was_zexted:_(s32) = G_ASSERT_ZEXT %x(s32), 16
  %y_was_zexted:_(<2 x s32>) = G_ASSERT_ZEXT %y(<2 x s32>), 16

  %z_was_sexted:_(s32) = G_ASSERT_SEXT %z(s32), 8

G_ASSERT_SEXT and G_ASSERT_ZEXT act like copies, albeit with some restrictions.

The source and destination registers must

- Be virtual
- Belong to the same register class
- Belong to the same register bank

It should always be safe to

- Look through the source register
- Replace the destination register with the source register

Miscellaneous
-------------

G_CONSTANT_FOLD_BARRIER
^^^^^^^^^^^^^^^^^^^^^^^

This operation is used as an opaque barrier to prevent constant folding. Combines
and other transformations should not look through this. These have no other
semantics and can be safely eliminated if a target chooses.

Unlisted: G_STACKSAVE, G_STACKRESTORE, G_FSHL, G_FSHR, G_SMULFIX, G_UMULFIX, G_SMULFIXSAT, G_UMULFIXSAT, G_SDIVFIX, G_UDIVFIX, G_SDIVFIXSAT, G_UDIVFIXSAT, G_FPOWI, G_FEXP10, G_FLDEXP, G_FFREXP, G_GET_FPENV, G_SET_FPENV, G_RESET_FPENV, G_GET_FPMODE, G_SET_FPMODE, G_RESET_FPMODE, G_INTRINSIC_FPTRUNC_ROUND, G_INTRINSIC_LRINT, G_INTRINSIC_LLRINT, G_INTRINSIC_ROUNDEVEN, G_READCYCLECOUNTER, G_READSTEADYCOUNTER, G_PREFETCH, G_READ_REGISTER, G_WRITE_REGISTER, G_STRICT_FADD, G_STRICT_FSUB, G_STRICT_FMUL, G_STRICT_FDIV, G_STRICT_FREM, G_STRICT_FMA, G_STRICT_FSQRT, G_STRICT_FLDEXP, G_ASSERT_ALIGN
1140