.. _gmir-opcodes:

Generic Opcodes
===============

.. contents::
   :local:

.. note::

  This documentation does not yet fully account for vectors. Many of the
  scalar/integer/floating-point operations can also take vectors.

Constants
---------

G_IMPLICIT_DEF
^^^^^^^^^^^^^^

An undefined value.

.. code-block:: none

  %0:_(s32) = G_IMPLICIT_DEF

G_CONSTANT
^^^^^^^^^^

An integer constant.

.. code-block:: none

  %0:_(s32) = G_CONSTANT i32 1

G_FCONSTANT
^^^^^^^^^^^

A floating point constant.

.. code-block:: none

  %0:_(s32) = G_FCONSTANT float 1.0

G_FRAME_INDEX
^^^^^^^^^^^^^

The address of an object in the stack frame.

.. code-block:: none

  %1:_(p0) = G_FRAME_INDEX %stack.0.ptr0

G_GLOBAL_VALUE
^^^^^^^^^^^^^^

The address of a global value.

.. code-block:: none

  %0(p0) = G_GLOBAL_VALUE @var_local

G_PTRAUTH_GLOBAL_VALUE
^^^^^^^^^^^^^^^^^^^^^^

The signed address of a global value. Operands: address to be signed (pointer),
key (32-bit imm), address for address discrimination (zero if not needed) and
an extra discriminator (64-bit imm).

.. code-block:: none

  %0:_(p0) = G_PTRAUTH_GLOBAL_VALUE %1:_(p0), s32, %2:_(p0), s64

G_BLOCK_ADDR
^^^^^^^^^^^^

The address of a basic block.

.. code-block:: none

  %0:_(p0) = G_BLOCK_ADDR blockaddress(@test_blockaddress, %ir-block.block)

G_CONSTANT_POOL
^^^^^^^^^^^^^^^

The address of an object in the constant pool.

.. code-block:: none

  %0:_(p0) = G_CONSTANT_POOL %const.0

Integer Extension and Truncation
--------------------------------

G_ANYEXT
^^^^^^^^

Extend the underlying scalar type of an operation, leaving the high bits
unspecified.

.. code-block:: none

  %1:_(s32) = G_ANYEXT %0:_(s16)

G_SEXT
^^^^^^

Sign extend the underlying scalar type of an operation, copying the sign bit
into the newly-created space.

.. code-block:: none

  %1:_(s32) = G_SEXT %0:_(s16)

G_SEXT_INREG
^^^^^^^^^^^^

Sign extend the value from an arbitrary bit position, copying the sign bit
into all bits above it. This is equivalent to a shl + ashr pair with an
appropriate shift amount. $sz is an immediate (MachineOperand::isImm()
returns true) to allow targets to have some bitwidths legal and others
lowered. This opcode is particularly useful if the target has sign-extension
instructions that are cheaper than the constituent shifts, as the optimizer is
able to decide whether it is better to hang on to the G_SEXT_INREG or to lower
it and optimize the individual shifts.

.. code-block:: none

  %1:_(s32) = G_SEXT_INREG %0:_(s32), 16

G_ZEXT
^^^^^^

Zero extend the underlying scalar type of an operation, putting zero bits
into the newly-created space.

.. code-block:: none

  %1:_(s32) = G_ZEXT %0:_(s16)

G_TRUNC
^^^^^^^

Truncate the underlying scalar type of an operation. This is equivalent to
G_EXTRACT for scalar types, but acts elementwise on vectors.

.. code-block:: none

  %1:_(s16) = G_TRUNC %0:_(s32)

Type Conversions
----------------

G_INTTOPTR
^^^^^^^^^^

Convert an integer to a pointer.

.. code-block:: none

  %1:_(p0) = G_INTTOPTR %0:_(s32)

G_PTRTOINT
^^^^^^^^^^

Convert a pointer to an integer.

.. code-block:: none

  %1:_(s32) = G_PTRTOINT %0:_(p0)

G_BITCAST
^^^^^^^^^

Reinterpret a value as a new type. This is usually done without
changing any bits, but this is not always the case due to a subtlety in the
definition of the :ref:`LLVM-IR Bitcast Instruction <i_bitcast>`. It
is allowed to bitcast between pointers with the same size, but
different address spaces.

.. code-block:: none

  %1:_(s64) = G_BITCAST %0:_(<2 x s32>)

G_ADDRSPACE_CAST
^^^^^^^^^^^^^^^^

Convert a pointer to an address space to a pointer to another address space.

.. code-block:: none

  %1:_(p1) = G_ADDRSPACE_CAST %0:_(p0)

.. caution::

  :ref:`i_addrspacecast` doesn't mention what happens if the cast is simply
  invalid (i.e. if the address spaces are disjoint).

Scalar Operations
-----------------

G_EXTRACT
^^^^^^^^^

Extract a register of the specified size, starting from the block given by
index. This will almost certainly be mapped to sub-register COPYs after
register banks have been selected.

.. code-block:: none

  %3:_(s32) = G_EXTRACT %2:_(s64), 32

G_INSERT
^^^^^^^^

Insert a smaller register into a larger one at the specified bit-index.

.. code-block:: none

  %2:_(s64) = G_INSERT %0:_(s64), %1:_(s32), 0

G_MERGE_VALUES
^^^^^^^^^^^^^^

Concatenate multiple registers of the same size into a wider register.
The input operands are always ordered from lowest bits to highest:

.. code-block:: none

  %0:(s32) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8),
                            %bits_16_23:(s8), %bits_24_31:(s8)

G_UNMERGE_VALUES
^^^^^^^^^^^^^^^^

Extract multiple registers of the specified size, starting from blocks given by
indexes. This will almost certainly be mapped to sub-register COPYs after
register banks have been selected.
The output operands are always ordered from lowest bits to highest:

.. code-block:: none

  %bits_0_7:(s8), %bits_8_15:(s8),
  %bits_16_23:(s8), %bits_24_31:(s8) = G_UNMERGE_VALUES %0:(s32)

G_BSWAP
^^^^^^^

Reverse the order of the bytes in a scalar.

.. code-block:: none

  %1:_(s32) = G_BSWAP %0:_(s32)

G_BITREVERSE
^^^^^^^^^^^^

Reverse the order of the bits in a scalar.

.. code-block:: none

  %1:_(s32) = G_BITREVERSE %0:_(s32)

G_SBFX, G_UBFX
^^^^^^^^^^^^^^

Extract a range of bits from a register.

The source operands are registers as follows:

- Source
- The least-significant bit for the extraction
- The width of the extraction

The least-significant bit (lsb) and width operands are in the range:

::

      0 <= lsb < lsb + width <= source bitwidth, where all values are unsigned

G_SBFX sign-extends the result, while G_UBFX zero-extends the result.

.. code-block:: none

  ; Extract 5 bits starting at bit 1 from %x and store them in %a.
  ; Sign-extend the result.
  ;
  ; Example:
  ; %x = 0...0000[10110]1 ---> %a = 1...111111[10110]
  %lsb_one = G_CONSTANT i32 1
  %width_five = G_CONSTANT i32 5
  %a:_(s32) = G_SBFX %x, %lsb_one, %width_five

  ; Extract 3 bits starting at bit 2 from %x and store them in %b. Zero-extend
  ; the result.
  ;
  ; Example:
  ; %x = 1...11111[100]11 ---> %b = 0...00000[100]
  %lsb_two = G_CONSTANT i32 2
  %width_three = G_CONSTANT i32 3
  %b:_(s32) = G_UBFX %x, %lsb_two, %width_three

Integer Operations
------------------

G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SDIV, G_UDIV, G_SREM, G_UREM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These each perform their respective integer arithmetic on a scalar.

.. code-block:: none

  %dst:_(s32) = G_ADD %src0:_(s32), %src1:_(s32)

The above example adds %src1 to %src0 and stores the result in %dst.

G_SDIVREM, G_UDIVREM
^^^^^^^^^^^^^^^^^^^^

Perform integer division and remainder, thereby producing two results.

.. code-block:: none

  %div:_(s32), %rem:_(s32) = G_SDIVREM %0:_(s32), %1:_(s32)

G_SADDSAT, G_UADDSAT, G_SSUBSAT, G_USUBSAT, G_SSHLSAT, G_USHLSAT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Signed and unsigned addition, subtraction and left shift with saturation.

.. code-block:: none

  %2:_(s32) = G_SADDSAT %0:_(s32), %1:_(s32)

G_SHL, G_LSHR, G_ASHR
^^^^^^^^^^^^^^^^^^^^^

Shift the bits of a scalar left or right, inserting zeros (the sign bit for
G_ASHR).

G_ROTR, G_ROTL
^^^^^^^^^^^^^^

Rotate the bits right (G_ROTR) or left (G_ROTL).
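
Neither the shift group nor the rotate group above lists an example. As a
sketch, with illustrative register numbers (the shift or rotate amount is the
second source operand and may be a different scalar type than the value being
shifted):

.. code-block:: none

  %2:_(s32) = G_SHL %0:_(s32), %1:_(s32)
  %3:_(s32) = G_ASHR %0:_(s32), %1:_(s32)
  %4:_(s32) = G_ROTL %0:_(s32), %1:_(s32)
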
G_ICMP
^^^^^^

Perform integer comparison producing non-zero (true) or zero (false). It's
target specific whether a true value is 1, ~0U, or some other non-zero value.
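
No example is listed above; as a sketch, the predicate is written as an
``intpred(...)`` operand in MIR syntax (the ``s1`` result type and register
numbers are illustrative):

.. code-block:: none

  %2:_(s1) = G_ICMP intpred(slt), %0:_(s32), %1:_(s32)
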
G_SCMP
^^^^^^

Perform signed 3-way integer comparison producing -1 (smaller), 0 (equal), or 1 (larger).

.. code-block:: none

  %5:_(s32) = G_SCMP %6, %2

G_UCMP
^^^^^^

Perform unsigned 3-way integer comparison producing -1 (smaller), 0 (equal), or 1 (larger).

.. code-block:: none

  %7:_(s32) = G_UCMP %2, %6

G_SELECT
^^^^^^^^

Select between two values depending on a zero/non-zero value.

.. code-block:: none

  %5:_(s32) = G_SELECT %4(s1), %6, %2

G_PTR_ADD
^^^^^^^^^

Add a scalar offset in addressable units to a pointer. Addressable units are
typically bytes but this may vary between targets.

.. code-block:: none

  %2:_(p0) = G_PTR_ADD %0:_(p0), %1:_(s32)

.. caution::

  There are currently no in-tree targets that use this with addressable units
  not equal to 8 bits.

G_PTRMASK
^^^^^^^^^

Zero out an arbitrary mask of bits of a pointer. The mask type must be
an integer, and the number of vector elements must match for all
operands. This corresponds to `i_intr_llvm_ptrmask`.

.. code-block:: none

  %2:_(p0) = G_PTRMASK %0, %1

G_SMIN, G_SMAX, G_UMIN, G_UMAX
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Take the minimum/maximum of two values.

.. code-block:: none

  %5:_(s32) = G_SMIN %6, %2

G_ABS
^^^^^

Take the absolute value of a signed integer. The absolute value of the minimum
negative value (e.g. the 8-bit value `0x80`) is defined to be itself.

.. code-block:: none

  %1:_(s32) = G_ABS %0

G_UADDO, G_SADDO, G_USUBO, G_SSUBO, G_SMULO, G_UMULO
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform the requested arithmetic and produce a carry output in addition to the
normal result.

.. code-block:: none

  %3:_(s32), %4:_(s1) = G_UADDO %0, %1

G_UADDE, G_SADDE, G_USUBE, G_SSUBE
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform the requested arithmetic and consume a carry input in addition to the
normal input. Also produce a carry output in addition to the normal result.

.. code-block:: none

  %4:_(s32), %5:_(s1) = G_UADDE %0, %1, %3:_(s1)

G_UMULH, G_SMULH
^^^^^^^^^^^^^^^^

Multiply two numbers at twice the incoming bit width (unsigned or signed) and
return the high half of the result.

.. code-block:: none

  %3:_(s32) = G_UMULH %0, %1

G_CTLZ, G_CTTZ, G_CTPOP
^^^^^^^^^^^^^^^^^^^^^^^

Count leading zeros, trailing zeros, or number of set bits.

.. code-block:: none

  %2:_(s33) = G_CTLZ %1
  %2:_(s33) = G_CTTZ %1
  %2:_(s33) = G_CTPOP %1

G_CTLZ_ZERO_UNDEF, G_CTTZ_ZERO_UNDEF
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Count leading zeros or trailing zeros. If the value is zero then the result is
undefined.

.. code-block:: none

  %2:_(s33) = G_CTLZ_ZERO_UNDEF %1
  %2:_(s33) = G_CTTZ_ZERO_UNDEF %1

G_ABDS, G_ABDU
^^^^^^^^^^^^^^

Compute the absolute difference (signed and unsigned), e.g. abs(x-y).

.. code-block:: none

  %0:_(s33) = G_ABDS %2, %3
  %1:_(s33) = G_ABDU %4, %5

Floating Point Operations
-------------------------

G_FCMP
^^^^^^

Perform floating point comparison producing non-zero (true) or zero
(false). It's target specific whether a true value is 1, ~0U, or some other
non-zero value.
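
As a sketch, the predicate is written as a ``floatpred(...)`` operand in MIR
syntax (the ``s1`` result type and register numbers are illustrative):

.. code-block:: none

  %2:_(s1) = G_FCMP floatpred(oge), %0:_(s64), %1:_(s64)
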
G_FNEG
^^^^^^

Floating point negation.

G_FPEXT
^^^^^^^

Convert a floating point value to a larger type.

G_FPTRUNC
^^^^^^^^^

Convert a floating point value to a narrower type.

G_FPTOSI, G_FPTOUI, G_SITOFP, G_UITOFP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Convert between integer and floating point.

G_FPTOSI_SAT, G_FPTOUI_SAT
^^^^^^^^^^^^^^^^^^^^^^^^^^

Saturating convert between integer and floating point.

G_FABS
^^^^^^

Take the absolute value of a floating point value.

G_FCOPYSIGN
^^^^^^^^^^^

Copy the value of the first operand, replacing the sign bit with that of the
second operand.

G_FCANONICALIZE
^^^^^^^^^^^^^^^

See :ref:`i_intr_llvm_canonicalize`.

G_IS_FPCLASS
^^^^^^^^^^^^

Tests if the first operand, which must be floating-point scalar or vector, has
the floating-point class specified by the second operand. Returns non-zero (true)
or zero (false). It's target specific whether a true value is 1, ~0U, or some
other non-zero value. If the first operand is a vector, the returned value is a
vector of the same length.

G_FMINNUM
^^^^^^^^^

Perform floating-point minimum on two values.

In the case where a single input is a NaN (either signaling or quiet),
the non-NaN input is returned.

The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0.

G_FMAXNUM
^^^^^^^^^

Perform floating-point maximum on two values.

In the case where a single input is a NaN (either signaling or quiet),
the non-NaN input is returned.

The return value of (FMAXNUM 0.0, -0.0) could be either 0.0 or -0.0.

G_FMINNUM_IEEE
^^^^^^^^^^^^^^

Perform floating-point minimum on two values, following IEEE-754
definitions. This differs from FMINNUM in the handling of signaling
NaNs.

If one input is a signaling NaN, returns a quiet NaN. This matches
IEEE-754 2008's minnum/maxnum for signaling NaNs (which differs from
2019).

These treat -0 as ordered less than +0, matching the behavior of
IEEE-754 2019's minimumNumber/maximumNumber (which was unspecified in
2008).

G_FMAXNUM_IEEE
^^^^^^^^^^^^^^

Perform floating-point maximum on two values, following IEEE-754
definitions. This differs from FMAXNUM in the handling of signaling
NaNs.

If one input is a signaling NaN, returns a quiet NaN. This matches
IEEE-754 2008's minnum/maxnum for signaling NaNs (which differs from
2019).

These treat -0 as ordered less than +0, matching the behavior of
IEEE-754 2019's minimumNumber/maximumNumber (which was unspecified in
2008).

G_FMINIMUM
^^^^^^^^^^

NaN-propagating minimum that also treats -0.0 as less than 0.0. While
FMINNUM_IEEE follows IEEE 754-2008 semantics, FMINIMUM follows IEEE
754-2019 semantics.

G_FMAXIMUM
^^^^^^^^^^

NaN-propagating maximum that also treats -0.0 as less than 0.0. While
FMAXNUM_IEEE follows IEEE 754-2008 semantics, FMAXIMUM follows IEEE
754-2019 semantics.

G_FADD, G_FSUB, G_FMUL, G_FDIV, G_FREM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perform the specified floating point arithmetic.

G_FMA
^^^^^

Perform a fused multiply add (i.e. without the intermediate rounding step).

G_FMAD
^^^^^^

Perform a non-fused multiply add (i.e. with the intermediate rounding step).
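
As a sketch for both opcodes, with illustrative register numbers, the three
source operands are multiplicand, multiplier and addend (%0 * %1 + %2):

.. code-block:: none

  %3:_(s32) = G_FMA %0:_(s32), %1:_(s32), %2:_(s32)
  %4:_(s32) = G_FMAD %0:_(s32), %1:_(s32), %2:_(s32)
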
G_FPOW
^^^^^^

Raise the first operand to the power of the second.

G_FEXP, G_FEXP2
^^^^^^^^^^^^^^^

Calculate the base-e or base-2 exponential of a value.

G_FLOG, G_FLOG2, G_FLOG10
^^^^^^^^^^^^^^^^^^^^^^^^^

Calculate the base-e, base-2, or base-10 logarithm of a value, respectively.

G_FCEIL, G_FSQRT, G_FFLOOR, G_FRINT, G_FNEARBYINT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These correspond to the standard C functions of the same name.

G_FCOS, G_FSIN, G_FSINCOS, G_FTAN, G_FACOS, G_FASIN, G_FATAN, G_FATAN2, G_FCOSH, G_FSINH, G_FTANH
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These correspond to the standard C trigonometry functions of the same name.

G_INTRINSIC_TRUNC
^^^^^^^^^^^^^^^^^

Returns the operand rounded to the nearest integer not larger in magnitude than the operand.

G_INTRINSIC_ROUND
^^^^^^^^^^^^^^^^^

Returns the operand rounded to the nearest integer.

G_LROUND, G_LLROUND
^^^^^^^^^^^^^^^^^^^

Returns the source operand rounded to the nearest integer with ties away from
zero.

See the LLVM LangRef entry on '``llvm.lround.*``' for details on behaviour.

.. code-block:: none

  %rounded_32:_(s32) = G_LROUND %round_me:_(s64)
  %rounded_64:_(s64) = G_LLROUND %round_me:_(s64)

Vector Specific Operations
--------------------------

G_VSCALE
^^^^^^^^

Puts the value of the runtime ``vscale`` multiplied by the value in the source
operand into the destination register. This can be useful in determining the
actual runtime number of elements in a vector.

.. code-block:: none

  %0:_(s32) = G_VSCALE 4

G_INSERT_SUBVECTOR
^^^^^^^^^^^^^^^^^^

Insert the second source vector into the first source vector. The index operand
represents the starting index in the first source vector at which the second
source vector should be inserted.

The index must be a constant multiple of the second source vector's minimum
vector length. If the vectors are scalable, then the index is first scaled by
the runtime scaling factor. The indices inserted in the source vector must be
valid indices of that vector. If this condition cannot be determined statically
but is false at runtime, then the result vector is undefined.

.. code-block:: none

  %2:_(<vscale x 4 x s64>) = G_INSERT_SUBVECTOR %0:_(<vscale x 4 x s64>), %1:_(<vscale x 2 x s64>), 0

G_EXTRACT_SUBVECTOR
^^^^^^^^^^^^^^^^^^^

Extract a vector of the destination type from the source vector. The index operand
represents the starting index from which a subvector is extracted from
the source vector.

The index must be a constant multiple of the source vector's minimum vector
length. If the source vector is a scalable vector, then the index is first
scaled by the runtime scaling factor. The indices extracted from the source
vector must be valid indices of that vector. If this condition cannot be
determined statically but is false at runtime, then the result vector is
undefined.

Mixing scalable vectors and fixed vectors is not allowed.

.. code-block:: none

  %3:_(<vscale x 4 x s64>) = G_EXTRACT_SUBVECTOR %2:_(<vscale x 8 x s64>), 2

G_CONCAT_VECTORS
^^^^^^^^^^^^^^^^

Concatenate two vectors to form a longer vector.

G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Create a vector from multiple scalar registers. No implicit
conversion is performed (i.e. the result element type must be the
same as that of all source operands).

The _TRUNC version truncates the larger operand types to fit the
destination vector element type.

G_INSERT_VECTOR_ELT
^^^^^^^^^^^^^^^^^^^

Insert an element into a vector.

G_EXTRACT_VECTOR_ELT
^^^^^^^^^^^^^^^^^^^^

Extract an element from a vector.
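
Neither opcode lists an example above. As a sketch, with illustrative types and
names; the element index is a scalar register (shown here as ``s64``):

.. code-block:: none

  %2:_(<4 x s32>) = G_INSERT_VECTOR_ELT %0:_(<4 x s32>), %elt:_(s32), %idx:_(s64)
  %3:_(s32) = G_EXTRACT_VECTOR_ELT %0:_(<4 x s32>), %idx:_(s64)
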
G_SHUFFLE_VECTOR
^^^^^^^^^^^^^^^^

Concatenate two vectors and shuffle the elements according to the mask operand.
The mask operand should be an IR Constant which exactly matches the
corresponding mask for the IR shufflevector instruction.

G_SPLAT_VECTOR
^^^^^^^^^^^^^^

Create a vector where all elements are the scalar from the source operand.

The type of the operand must be equal to or larger than the vector element
type. If the operand is larger than the vector element type, the scalar is
implicitly truncated to the vector element type.

G_STEP_VECTOR
^^^^^^^^^^^^^

Create a scalable vector where all lanes are linear sequences starting at 0
with a given unsigned step.

The type of the operand must be equal to the vector element type. Arithmetic
is performed modulo the bitwidth of the element. The step must be > 0;
otherwise the vector is zero.

.. code-block:: none

  %0:_(<vscale x 2 x s64>) = G_STEP_VECTOR i64 4

  %1:_(<vscale x 2 x s32>) = G_STEP_VECTOR i32 4

  0, 1*Step, 2*Step, 3*Step, 4*Step, ...

G_VECTOR_COMPRESS
^^^^^^^^^^^^^^^^^

Given an input vector, a mask vector, and a passthru vector, continuously place
all selected (i.e., where mask[i] = true) input lanes in an output vector. All
remaining lanes in the output are taken from passthru, which may be undef.
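
A sketch of the expected form, with the operands in the order described above
(input vector, mask, passthru); names and types are illustrative:

.. code-block:: none

  %out:_(<4 x s32>) = G_VECTOR_COMPRESS %vec:_(<4 x s32>), %mask:_(<4 x s1>), %passthru:_(<4 x s32>)
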
Vector Reduction Operations
---------------------------

These operations represent horizontal vector reduction, producing a scalar result.

G_VECREDUCE_SEQ_FADD, G_VECREDUCE_SEQ_FMUL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The SEQ variants perform reductions in sequential order. The first operand is
an initial scalar accumulator value, and the second operand is the vector to reduce.

G_VECREDUCE_FADD, G_VECREDUCE_FMUL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These reductions are relaxed variants which may reduce the elements in any order.

G_VECREDUCE_FMAX, G_VECREDUCE_FMIN, G_VECREDUCE_FMAXIMUM, G_VECREDUCE_FMINIMUM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

FMIN/FMAX/FMINIMUM/FMAXIMUM nodes can have flags, for NaN/NoNaN variants.

Integer/bitwise reductions
^^^^^^^^^^^^^^^^^^^^^^^^^^

* G_VECREDUCE_ADD
* G_VECREDUCE_MUL
* G_VECREDUCE_AND
* G_VECREDUCE_OR
* G_VECREDUCE_XOR
* G_VECREDUCE_SMAX
* G_VECREDUCE_SMIN
* G_VECREDUCE_UMAX
* G_VECREDUCE_UMIN

Integer reductions may have a result type larger than the vector element type.
However, the reduction is performed using the vector element type and the value
in the top bits is unspecified.
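
As a sketch (names and types are illustrative): a sequential reduction takes
the scalar accumulator as its first operand, and an integer reduction may
produce a result wider than the vector element type:

.. code-block:: none

  %res:_(s32) = G_VECREDUCE_SEQ_FADD %acc:_(s32), %vec:_(<4 x s32>)
  %sum:_(s32) = G_VECREDUCE_ADD %ints:_(<8 x s8>)
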
Memory Operations
-----------------

G_LOAD, G_SEXTLOAD, G_ZEXTLOAD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Generic load. Expects a MachineMemOperand in addition to explicit
operands. If the result size is larger than the memory size, the
high bits are undefined, sign-extended, or zero-extended respectively.

Only G_LOAD is valid if the result is a vector type. If the result is larger
than the memory size, the high elements are undefined (i.e. this is not a
per-element, vector anyextload).

Unlike in SelectionDAG, atomic loads are expressed with the same
opcodes as regular loads. G_LOAD, G_SEXTLOAD and G_ZEXTLOAD may all
have atomic memory operands.
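
In MIR syntax the MachineMemOperand appears after ``::``. A minimal sketch
(the exact memory-operand text varies with how the instruction was created):

.. code-block:: none

  %1:_(s32) = G_LOAD %0:_(p0) :: (load (s32))
  %2:_(s32) = G_SEXTLOAD %0:_(p0) :: (load (s16))
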
G_INDEXED_LOAD
^^^^^^^^^^^^^^

Generic indexed load. Combines a GEP with a load. $newaddr is set to $base + $offset.
If $am is 0 (post-indexed), then the value is loaded from $base; if $am is 1 (pre-indexed)
then the value is loaded from $newaddr.

G_INDEXED_SEXTLOAD
^^^^^^^^^^^^^^^^^^

Same as G_INDEXED_LOAD except that the load performed is sign-extending, as with G_SEXTLOAD.

G_INDEXED_ZEXTLOAD
^^^^^^^^^^^^^^^^^^

Same as G_INDEXED_LOAD except that the load performed is zero-extending, as with G_ZEXTLOAD.

G_STORE
^^^^^^^

Generic store. Expects a MachineMemOperand in addition to explicit
operands. If the stored value size is greater than the memory size,
the high bits are implicitly truncated. If this is a vector store, the
high elements are discarded (i.e. this does not function as a per-lane
vector, truncating store).

G_INDEXED_STORE
^^^^^^^^^^^^^^^

Combines a store with a GEP. See description of G_INDEXED_LOAD for indexing behaviour.

G_ATOMIC_CMPXCHG_WITH_SUCCESS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Generic atomic cmpxchg with internal success check. Expects a
MachineMemOperand in addition to explicit operands.

G_ATOMIC_CMPXCHG
^^^^^^^^^^^^^^^^

Generic atomic cmpxchg. Expects a MachineMemOperand in addition to explicit
operands.

|all_g_atomicrmw|
^^^^^^^^^^^^^^^^^

.. |all_g_atomicrmw| replace:: G_ATOMICRMW_XCHG, G_ATOMICRMW_ADD,
                               G_ATOMICRMW_SUB, G_ATOMICRMW_AND,
                               G_ATOMICRMW_NAND, G_ATOMICRMW_OR,
                               G_ATOMICRMW_XOR, G_ATOMICRMW_MAX,
                               G_ATOMICRMW_MIN, G_ATOMICRMW_UMAX,
                               G_ATOMICRMW_UMIN, G_ATOMICRMW_FADD,
                               G_ATOMICRMW_FSUB, G_ATOMICRMW_FMAX,
                               G_ATOMICRMW_FMIN, G_ATOMICRMW_UINC_WRAP,
                               G_ATOMICRMW_UDEC_WRAP, G_ATOMICRMW_USUB_COND,
                               G_ATOMICRMW_USUB_SAT

Generic atomicrmw. Expects a MachineMemOperand in addition to explicit
operands.

G_FENCE
^^^^^^^

Generic fence. The first operand is the memory ordering. The second operand is
the syncscope.

See the LLVM LangRef entry on the '``fence``' instruction for more details.

G_MEMCPY
^^^^^^^^

Generic memcpy. Expects two MachineMemOperands covering the store and load
respectively, in addition to explicit operands.

G_MEMCPY_INLINE
^^^^^^^^^^^^^^^

Generic inlined memcpy. Like G_MEMCPY, but it is guaranteed that this version
will not be lowered as a call to an external function. Currently the size
operand is required to evaluate as a constant (not an immediate), though that is
expected to change when llvm.memcpy.inline is taught to support dynamic sizes.

G_MEMMOVE
^^^^^^^^^

Generic memmove. Similar to G_MEMCPY, but the source and destination memory
ranges are allowed to overlap.

G_MEMSET
^^^^^^^^

Generic memset. Expects a MachineMemOperand in addition to explicit operands.

G_BZERO
^^^^^^^

Generic bzero. Expects a MachineMemOperand in addition to explicit operands.

Control Flow
------------

G_PHI
^^^^^

Implement the φ node in the SSA graph representing the function.

.. code-block:: none

  %dst(s8) = G_PHI %src1(s8), %bb.<id1>, %src2(s8), %bb.<id2>

G_BR
^^^^

Unconditional branch.

.. code-block:: none

  G_BR %bb.<id>

G_BRCOND
^^^^^^^^

Conditional branch.

.. code-block:: none

  G_BRCOND %condition, %basicblock.<id>

G_BRINDIRECT
^^^^^^^^^^^^

Indirect branch.

.. code-block:: none

  G_BRINDIRECT %src(p0)

G_BRJT
^^^^^^

Indirect branch to a jump table entry.

.. code-block:: none

  G_BRJT %ptr(p0), %jti, %idx(s64)

G_JUMP_TABLE
^^^^^^^^^^^^

Generates a pointer to the address of the jump table specified by the source
operand. The source operand is a jump table index.
G_JUMP_TABLE can be used in conjunction with G_BRJT to support jump table
codegen with GlobalISel.

.. code-block:: none

  %dst:_(p0) = G_JUMP_TABLE %jump-table.0

The above example generates a pointer to the source jump table index.

G_INVOKE_REGION_START
^^^^^^^^^^^^^^^^^^^^^

A marker instruction that acts as a pseudo-terminator for regions of code that may
throw exceptions. Being a terminator, it prevents code from being inserted after
it during passes like legalization. This is needed because calls to exception
throw routines do not return, so code that must remain on an executable path
must not be placed after the throwing call.

G_INTRINSIC, G_INTRINSIC_CONVERGENT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Call an intrinsic that has no side-effects.

The _CONVERGENT variant corresponds to an LLVM IR intrinsic marked `convergent`.

.. note::

  Unlike SelectionDAG, there is no _VOID variant. Both of these are permitted
  to have zero, one, or multiple results.

G_INTRINSIC_W_SIDE_EFFECTS, G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Call an intrinsic that is considered to have unknown side-effects and as such
cannot be reordered across other side-effecting instructions.

The _CONVERGENT variant corresponds to an LLVM IR intrinsic marked `convergent`.

.. note::

  Unlike SelectionDAG, there is no _VOID variant. Both of these are permitted
  to have zero, one, or multiple results.

G_TRAP, G_DEBUGTRAP, G_UBSANTRAP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Represents :ref:`llvm.trap <llvm.trap>`, :ref:`llvm.debugtrap <llvm.debugtrap>`
and :ref:`llvm.ubsantrap <llvm.ubsantrap>`, which generate a target-dependent
trap instruction.

.. code-block:: none

  G_TRAP

.. code-block:: none

  G_DEBUGTRAP

.. code-block:: none

  G_UBSANTRAP 12

Variadic Arguments
------------------

G_VASTART
^^^^^^^^^

.. caution::

  I found no documentation for this instruction at the time of writing.

G_VAARG
^^^^^^^

.. caution::

  I found no documentation for this instruction at the time of writing.

Other Operations
----------------

G_DYN_STACKALLOC
^^^^^^^^^^^^^^^^

Dynamically allocates memory on the stack with the specified size and alignment,
producing a pointer to the allocated area. An alignment value of `0` or `1`
means no specific alignment.

.. code-block:: none

  %8:_(p0) = G_DYN_STACKALLOC %7(s64), 32

Optimization Hints
------------------

These instructions do not correspond to any target instructions. They act as
hints for various combines.

G_ASSERT_SEXT, G_ASSERT_ZEXT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This signifies that the contents of a register were previously extended from a
smaller type.

The smaller type is denoted using an immediate operand. For scalars, this is the
width of the entire smaller type. For vectors, this is the width of the smaller
element type.

.. code-block:: none

  %x_was_zexted:_(s32) = G_ASSERT_ZEXT %x(s32), 16
  %y_was_zexted:_(<2 x s32>) = G_ASSERT_ZEXT %y(<2 x s32>), 16

  %z_was_sexted:_(s32) = G_ASSERT_SEXT %z(s32), 8

G_ASSERT_SEXT and G_ASSERT_ZEXT act like copies, albeit with some restrictions.

The source and destination registers must

- Be virtual
- Belong to the same register class
- Belong to the same register bank

It should always be safe to

- Look through the source register
- Replace the destination register with the source register

Miscellaneous
-------------

G_CONSTANT_FOLD_BARRIER
^^^^^^^^^^^^^^^^^^^^^^^

This operation is used as an opaque barrier to prevent constant folding. Combines
and other transformations should not look through this. These have no other
semantics and can be safely eliminated if a target chooses.
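
A sketch of the form, assuming the usual single source operand:

.. code-block:: none

  %1:_(s32) = G_CONSTANT_FOLD_BARRIER %0:_(s32)
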
Unlisted: G_STACKSAVE, G_STACKRESTORE, G_FSHL, G_FSHR, G_SMULFIX, G_UMULFIX,
G_SMULFIXSAT, G_UMULFIXSAT, G_SDIVFIX, G_UDIVFIX, G_SDIVFIXSAT, G_UDIVFIXSAT,
G_FPOWI, G_FEXP10, G_FLDEXP, G_FFREXP, G_GET_FPENV, G_SET_FPENV, G_RESET_FPENV,
G_GET_FPMODE, G_SET_FPMODE, G_RESET_FPMODE, G_INTRINSIC_FPTRUNC_ROUND,
G_INTRINSIC_LRINT, G_INTRINSIC_LLRINT, G_INTRINSIC_ROUNDEVEN,
G_READCYCLECOUNTER, G_READSTEADYCOUNTER, G_PREFETCH, G_READ_REGISTER,
G_WRITE_REGISTER, G_STRICT_FADD, G_STRICT_FSUB, G_STRICT_FMUL, G_STRICT_FDIV,
G_STRICT_FREM, G_STRICT_FMA, G_STRICT_FSQRT, G_STRICT_FLDEXP, G_ASSERT_ALIGN