==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 3

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language.
There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

Syntax
======

.. _identifiers:

Identifiers
-----------

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
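
As a minimal sketch of the first two identifier formats, consider the
fragment below. The names ``@"global with spaces"``, ``@identifier_demo``
and ``%x.doubled`` are hypothetical and chosen only for illustration:

.. code-block:: llvm

    ; Named global whose name must be quoted because it contains spaces.
    @"global with spaces" = global i32 0

    define i32 @identifier_demo(i32 %x) {
    entry:
      %x.doubled = add i32 %x, %x     ; named local value
      %0 = mul i32 %x.doubled, 4      ; unnamed local value
      ret i32 %0
    }
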

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc.), for primitive type names ('``void``',
'``i32``', etc.), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           /* yields i32:%1 */
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
   Alternatively, comments can start with ``/*`` and terminate with ``*/``.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. By default, unnamed temporaries are numbered sequentially (using a
   per-function incrementing counter, starting with 0). However, when explicitly
   specifying temporary numbers, it is allowed to skip over numbers.

   Note that basic blocks and unnamed function parameters are included in this
   numbering.
   For example, if the entry basic block is not given a label name
   and all function parameters are named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.

.. _string_constants:

String constants
----------------

Strings in LLVM programs are delimited by ``"`` characters. Within a
string, all bytes are treated literally with the exception of ``\``
characters, which start escapes, and the first ``"`` character, which
ends the string.

There are two kinds of escapes.

* ``\\`` represents a single ``\`` character.

* ``\`` followed by two hexadecimal characters (0-9, a-f, or A-F)
  represents the byte with the given value (e.g. ``\00`` represents a
  null byte).

To represent a ``"`` character, use ``\22``. (``\"`` will end the string
with a trailing ``\``.)

Newlines do not terminate string constants; strings can span multiple
lines.

The interpretation of string constants (e.g. their character encoding)
depends on context.


High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(ptr captures(none)) nounwind

    ; Definition of main function
    define i32 @main() {
      ; Call puts function to write out the string to stdout.
      call i32 @puts(ptr @.str)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module.
    From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but they
    are used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced.
    ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if it is not linked, the symbol becomes
    null instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    The ``odr`` suffix indicates that all globals defined with the given name
    are equivalent, along the lines of the C++ "one definition rule" ("ODR").
    Informally, this means we can inline functions and fold loads of constants.

    Formally, use the following definition: when an ``odr`` function is
    called, one of the definitions is non-deterministically chosen to run. For
    ``odr`` variables, if any byte in the value is not equal in all
    initializers, that byte is a :ref:`poison value <poisonvalues>`. For
    aliases and ifuncs, apply the rule for the underlying function or variable.

    These linkage types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.
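
The fragment below is an illustrative sketch of how several of these linkage
types are written in a module. All global and function names in it are
hypothetical:

.. code-block:: llvm

    ; Module-local data, comparable to a C 'static' variable.
    @counter = internal global i32 0

    ; Private constant: never appears in the object file's symbol table.
    @.msg = private unnamed_addr constant [3 x i8] c"hi\00"

    ; Tentative definition such as 'int tentative;' in C.
    ; A common global must have a zero initializer.
    @tentative = common global i32 0

    ; Declaration whose address becomes null if nothing defines it.
    @maybe_present = extern_weak global i32

    ; Mergeable definition that the optimizer is allowed to inline.
    define linkonce_odr i32 @inline_helper(i32 %v) {
      %r = add i32 %v, 1
      ret i32 %r
    }

    ; Definition that a stronger definition elsewhere may override.
    define weak void @overridable_hook() {
      ret void
    }
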

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#tail-call-optimization>`_ This calling
    convention does not support varargs and requires the prototype of all
    callees to exactly match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``ghccc``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86, AArch64, and RISCV support this convention. The
    following limitations exist:

    -  On *X86-32* only up to 4 bit type parameters are supported. No
       floating-point types are supported.
    -  On *X86-64* only up to 10 bit type parameters and 6
       floating-point parameters are supported.
    -  On *AArch64* only up to 4 32-bit floating-point parameters,
       4 64-bit floating-point parameters, and 10 bit type parameters
       are supported.
    -  *RISCV64* only supports up to 11 bit type parameters, 4
       32-bit floating-point parameters, and 4 64-bit floating-point
       parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_.
    It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11 and return registers, if any. R11 can be used as a scratch register.
      The treatment of floating-point registers (XMMs/YMMs) matches the OS's C
      calling convention: on most platforms, they are not preserved and need to
      be saved by the caller, but on Windows, xmm6-xmm15 are preserved.

    - On AArch64 the callee preserves all general purpose registers, except
      X0-X8 and X16-X18.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention.
    This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    - On AArch64 the callee preserves all general purpose registers, except
      X0-X8 and X16-X18. Furthermore it also preserves the lower 128 bits of
      the V8-V31 SIMD floating-point registers.

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``preserve_nonecc``" - The `PreserveNone` calling convention
    This calling convention doesn't preserve any general registers. So all
    general registers are caller saved registers. It also uses all general
    registers to pass arguments. This attribute doesn't impact non-general
    purpose registers (e.g. floating point registers, on X86 XMMs/YMMs).
    Non-general purpose registers still follow the standard C calling
    convention. Currently it is for x86_64 and AArch64 only.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS.
    The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_
    This calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. On XCOFF, default visibility means no explicit
    visibility bit will be set and whether the symbol is visible
    (i.e. "exported") to other modules depends primarily on export lists
    provided to the linker.
    Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    On Microsoft Windows targets, "``dllexport``" causes the compiler to provide
    a global pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. The pointer name is formed by combining ``__imp_``
    and the function or variable name. On XCOFF targets, ``dllexport`` indicates
    that the symbol will be made visible to other modules using "exported"
    visibility and thus placed by the linker in the loader section symbol table.
    Since this storage class exists for defining a dll interface, the compiler,
    assembler and linker know it is externally referenced and must refrain from
    deleting the symbol.
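
As a minimal sketch of how the two storage classes are spelled, assuming a
Microsoft Windows target, consider the fragment below. The names
``@imported_fn``, ``@imported_var``, ``@exported_fn`` and ``@exported_var``
are hypothetical:

.. code-block:: llvm

    ; Symbols provided by some other DLL and reached through __imp_ pointers.
    declare dllimport void @imported_fn()
    @imported_var = external dllimport global i32

    ; Symbols this module makes available to users of the DLL.
    @exported_var = dllexport global i32 0

    define dllexport void @exported_fn() {
      call void @imported_fn()
      ret void
    }
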

A symbol with ``internal`` or ``private`` linkage cannot have a DLL storage
class.

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC-compatible emulated TLS
code.

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.

.. _nointptrtype:

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
integral (i.e. normal) pointers in that they convert integers to and from
corresponding pointer types, but there are additional implications to be
aware of. Because the bit-representation of a non-integral pointer may
not be stable, two identical casts of the same operand may or may not
return the same value.
Said differently, the conversion to or from the
non-integral type depends on environmental state in an implementation
defined manner.

If the frontend wishes to observe a *particular* value following a cast, the
generated IR must fence with the underlying environment in an implementation
defined manner. (In practice, this tends to require ``noinline`` routines for
such operations.)

From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
non-integral types are analogous to ones on integral types with one
key exception: the optimizer may not, in general, insert new dynamic
occurrences of such casts. If a new cast is inserted, the optimizer would
need to either ensure that a) all possible values are valid, or b)
appropriate fencing is inserted. Since the appropriate fencing is
implementation defined, the optimizer can't do the latter. The former is
challenging as many commonly expected properties, such as
``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
Similar restrictions apply to intrinsics that might examine the pointer bits,
such as :ref:`llvm.ptrmask<int_ptrmask>`.

The alignment information provided by the frontend for a non-integral pointer
(typically using attributes or metadata) must be valid for every possible
representation of the pointer.

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Global variables can optionally specify a :ref:`linkage type <linkage>`.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified.
If there is a mismatch between the explicit or inferred section information
for the variable declaration and its definition, the resulting behavior is
undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc.). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

A global variable may be declared to reside in a target-specific
numbered address space.
For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

External declarations may have an explicit section specified. Section
information is retained in LLVM IR for targets that make use of this
information. Attaching section information to an external declaration is an
assertion that its definition is located in the specified section. If the
definition is located in a different section, the behavior is undefined.

LLVM allows an explicit code model to be specified for globals. If the
target supports it, it will emit globals in the code model specified,
overriding the code model used to compile the translation unit.
The allowed values are "tiny", "small", "kernel", "medium", "large".
This may be extended in the future to specify global data layout that
doesn't cleanly fit into a specific code model.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer. This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used`` or dllexported variables. This assumption may be suppressed
by marking the variable with ``externally_initialized``.

An explicit alignment may be specified for a global, which must be a
power of 2.
If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it finds
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. For TLS variables, the module flag ``MaxTLSAlign``, if present,
limits the alignment to the given value. Optimizers are not allowed to
impose a stronger alignment on these variables. The maximum alignment
is ``1 << 32``.

For global variable declarations, as well as definitions that may be
replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
linkage types), the allocation size and alignment of the definition it resolves
to must be greater than or equal to that of the declaration or replaceable
definition, otherwise the behavior is undefined.

Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

Globals cannot be or contain :ref:`Scalable vectors <t_vector>` because their
size is unknown at compile time. They are allowed in structs to facilitate
intrinsics returning multiple values. Generally, structs containing scalable
vectors are not considered "sized" and cannot be used in loads, stores, allocas,
or GEPs. The only exception to this rule is for structs that contain scalable
vectors of the same type (e.g.
``{<vscale x 2 x i32>, <vscale x 2 x i32>}``
contains the same type while ``{<vscale x 2 x i32>, <vscale x 2 x i64>}``
doesn't). These kinds of structs (we may call them homogeneous scalable vector
structs) are considered sized and can be used in loads, stores, allocas, but
not GEPs.

Globals with the ``toc-data`` attribute set are stored in the TOC of XCOFF.
Their alignments are not larger than that of a TOC entry. Optimizations should
not increase their alignments to mitigate TOC overflow.

Syntax::

      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                         [DLLStorageClass] [ThreadLocal]
                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                         [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, partition "name"]
                         [, comdat [($name)]] [, align <Alignment>]
                         [, code_model "model"]
                         [, no_sanitize_address] [, no_sanitize_hwaddress]
                         [, sanitize_address_dyninit] [, sanitize_memtag]
                         (, !name !N)*

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32

The following example defines a global variable with the
``large`` code model:

.. code-block:: llvm

    @G = internal global i32 0, code_model "large"

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional address space, an optional section, an optional partition,
an optional alignment, an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.

Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
           [section "name"] [partition "name"] [comdat [($name)]] [align N]
           [gc] [prefix Constant] [prologue Constant] [personality Constant]
           (!name !N)* { ... }

The argument list is a comma separated sequence of arguments where each
argument is of the following form:

Syntax::

       <type> [parameter Attrs] [name]

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
or ``local_unnamed_addr`` attribute, an optional address space, a return type,
an optional :ref:`parameter attribute <paramattrs>` for the return type, a
function name, a possibly empty list of arguments, an optional alignment, an
optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix
<prefixdata>`, and an optional :ref:`prologue <prologuedata>`.

Syntax::

    declare [linkage] [visibility] [DLLStorageClass]
            [cconv] [ret attrs]
            <ResultType> @<FunctionName> ([argument list])
            [(unnamed_addr|local_unnamed_addr)] [align N] [gc]
            [prefix Constant] [prologue Constant]

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function. Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions
and :ref:`debug records <debugrecords>`,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label name is not provided, a block is assigned
an implicit numbered label, using the next value from the same counter as used
for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
function entry block does not have an explicit label, it will be assigned label
"%0", then the first unnamed temporary in that block will be "%1", etc.
If a numeric label is explicitly specified, it must match the numeric label
that would be used implicitly.

The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it finds convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

If an explicit address space is not given, it will default to the program
address space from the :ref:`datalayout string<langref_datalayout>`.

.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.
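
As a minimal sketch of this (the names ``@counter``, ``@counter_alias``,
``@impl`` and ``@impl_alias`` are hypothetical and chosen only for
illustration, and the opaque ``ptr`` type is assumed), an alias simply gives
an existing global a second name:

.. code-block:: llvm

    ; An ordinary global variable and an alias that names the same storage.
    @counter = global i32 0
    @counter_alias = alias i32, ptr @counter

    ; A function definition and an alias that names the same code.
    define void @impl() {
      ret void
    }
    @impl_alias = alias void (), ptr @impl

Neither alias in this sketch allocates storage or emits code of its own; each
only adds a symbol that refers to its aliasee.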

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
              [, partition "name"]

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``, ``available_externally``. Note
that some system linkers might not correctly handle dropping a weak symbol that
is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* If the alias has the ``available_externally`` linkage, the aliasee must be an
  ``available_externally`` global value; otherwise the aliasee can be an
  expression but no global value in the expression can be a declaration, since
  that would require a relocation, which is not possible.

* If either the alias or the aliasee may be replaced by a symbol outside the
  module at link time or runtime, no optimization may replace the alias with
  the aliasee, since the behavior may be different. The alias may be used as a
  name guaranteed to point to the content in the current module.

.. _langref_ifunc:

IFuncs
------

IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that is resolved at runtime by calling a resolver function.

On ELF platforms, IFuncs are resolved by the dynamic linker at load time. On
Mach-O platforms, they are lowered in terms of ``.symbol_resolver`` functions,
which lazily resolve the callee the first time they are called.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
              [, partition "name"]


.. _langref_comdats:

Comdats
-------

Comdat IR provides access to object file COMDAT/section group functionality
which represents interrelated sections.

Comdats have a name which represents the COMDAT key and a selection kind to
provide input on how the linker deduplicates comdats with the same key in two
different object files. A comdat must be included or omitted as a unit.
Discarding the whole comdat is allowed but discarding a subset is not.

A global object may be a member of at most one comdat. Aliases are placed in the
same COMDAT that their aliasee computes to, if any.

Syntax::

    $<Name> = comdat SelectionKind

For selection kinds other than ``nodeduplicate``, only one of the duplicate
comdats may be retained by the linker and the members of the remaining comdats
must be discarded.
The following selection kinds are supported:

``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``nodeduplicate``
    No deduplication is performed.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

- XCOFF and Mach-O don't support COMDATs.
- COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
  a non-local linkage COMDAT symbol.
- ELF supports ``any`` and ``nodeduplicate``.
- WebAssembly only supports ``any``.

Here is an example of a COFF COMDAT where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

   $foo = comdat largest
   @foo = global i32 2, comdat($foo)

   define void @bar() comdat($foo) {
     ret void
   }

In a COFF object file, this will create a COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.

As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: llvm

    $foo = comdat any
    @foo = global i32 2, comdat
    @bar = global i32 3, comdat($foo)

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: llvm

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat($foo)
    @g2 = global i32 42, section "sec", comdat($bar)

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

#. Named metadata are represented as a string of characters with the
   metadata prefix. The rules for metadata names are the same as for
   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
   are still valid, which allows any character to be part of a name.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type. Parameter attributes can
be placed both on function declarations/definitions, and at call-sites.

Parameter attributes are either simple keywords or strings that follow the
specified type. Multiple parameter attributes, when required, are separated by
spaces. For example:

.. code-block:: llvm

    ; On function declarations/definitions:
    declare i32 @printf(ptr noalias captures(none), ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()
    define void @baz(i32 "amdgpu-flat-work-group-size"="1,256" %x)

    ; On call-sites:
    call i32 @atoi(i8 zeroext %x)
    call signext i8 @returns_signed_char()

Note that any attributes for the function result (``nonnull``,
``signext``) come before the result type.

Parameter attributes can be broadly separated into two kinds: ABI attributes
that affect how values are passed to/from functions, like ``zeroext``,
``inreg``, ``byval``, or ``sret``, and optimization attributes, which provide
additional optimization guarantees, like ``noalias``, ``nonnull`` and
``dereferenceable``.

ABI attributes must be specified *both* at the function declaration/definition
and call-site, otherwise the behavior may be undefined. ABI attributes cannot
be safely dropped. Optimization attributes do not have to match between
call-site and function: The intersection of their implied semantics applies.
Optimization attributes can also be freely dropped.

If an integer argument to a function is not marked
``signext``/``zeroext``/``noext``, the kind of extension used is
target-specific. Some targets depend for correctness on the kind of
extension to be explicitly specified.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``noext``
    This indicates to the code generator that the parameter or return
    value has the high bits undefined, as for a struct in register, and
    therefore does not need to be sign or zero extended. This is the same
    as default behavior and is only actually used (by some targets) to
    validate that one of the attributes is always present.
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments.
    It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The byval type argument indicates the in-memory value type, and
    must be the same as the pointee type of the argument.

    The byval attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_byref:

``byref(<ty>)``

    The ``byref`` argument attribute allows specifying the pointee
    memory type of an argument. This is similar to ``byval``, but does
    not imply a copy is made anywhere, or that the argument is passed
    on the stack. This implies the pointer is dereferenceable up to
    the storage size of the type.

    It is not generally permissible to introduce a write to a
    ``byref`` pointer. The pointer may have any address space and may
    be read only.

    This is not a valid attribute for return values.

    The alignment for a ``byref`` parameter can be explicitly
    specified by combining it with the ``align`` attribute, similar to
    ``byval``. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

    This is intended for representing ABI constraints, and is not
    intended to be inferred for optimization use.

.. _attr_preallocated:

``preallocated(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function, and that the pointer parameter's pointee has
    already been initialized before the call instruction. This attribute
    is only valid on LLVM pointer arguments. The argument must be the value
    returned by the appropriate
    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
    ``musttail`` calls, or the corresponding caller parameter in ``musttail``
    calls, although it is ignored during codegen.

    A non ``musttail`` function call with a ``preallocated`` attribute in
    any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
    function call cannot have a ``"preallocated"`` operand bundle.

    The preallocated attribute requires a type argument, which must be
    the same as the pointee type of the argument.

    The preallocated attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_inalloca:

``inalloca(<ty>)``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    inalloca keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it.
    The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    behavior is undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    The inalloca attribute requires a type argument, which must be the
    same as the pointee type of the argument.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret(<ty>)``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee not
    to trap and to be properly aligned.

    The sret type argument specifies the in-memory type, which must be
    the same as the pointee type of the argument.

    A function that accepts an ``sret`` argument must return ``void``.
    A return value may not be ``sret``.

.. _attr_elementtype:

``elementtype(<ty>)``

    The ``elementtype`` argument attribute can be used to specify a pointer
    element type in a way that is compatible with `opaque pointers
    <OpaquePointers.html>`__.

    The ``elementtype`` attribute by itself does not carry any specific
    semantics. However, certain intrinsics may require this attribute to be
    present and assign it particular semantics.
    This will be documented on
    individual intrinsics.

    The attribute may only be applied to pointer typed arguments of intrinsic
    calls. It cannot be applied to non-intrinsic calls, and cannot be applied
    to parameters on function declarations. For non-opaque pointers, the type
    passed to ``elementtype`` must match the pointer element type.

.. _attr_align:

``align <n>`` or ``align(<n>)``
    This indicates that the pointer value or vector of pointers has the
    specified alignment. If applied to a vector of pointers, *all* pointers
    (elements) have the specified alignment. If the pointer value does not have
    the specified alignment, :ref:`poison value <poisonvalues>` is returned or
    passed instead. The ``align`` attribute should be combined with the
    ``noundef`` attribute to ensure a pointer is aligned, or otherwise the
    behavior is undefined. Note that ``align 1`` has no effect on non-byval,
    non-preallocated arguments.

    Note that this attribute has additional semantics when combined with the
    ``byval`` or ``preallocated`` attribute, which are documented there.

.. _noalias:

``noalias``
    This indicates that memory locations accessed via pointer values
    :ref:`based <pointeraliasing>` on the argument or return value are not also
    accessed, during the execution of the function, via pointer values not
    *based* on the argument or return value. This guarantee only holds for
    memory locations that are *modified*, by any means, during the execution of
    the function. If there are other accesses not based on the argument or
    return value, the behavior is undefined. The attribute on a return value
    also has additional semantics, as described below. Both the caller and the
    callee share the responsibility of ensuring that these requirements are
    met.
    For further details, please see the discussion of the NoAlias response
    in :ref:`alias analysis <Must, May, or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
    attribute on return values are stronger than the semantics of the attribute
    when used on function arguments. On function return values, the ``noalias``
    attribute indicates that the function acts like a system memory allocation
    function, returning a pointer to allocated storage disjoint from the
    storage for any other object accessible to the caller.

``captures(...)``
    This attribute restricts the ways in which the callee may capture the
    pointer. This is not a valid attribute for return values. This attribute
    applies only to the particular copy of the pointer passed in this argument.

    The argument of ``captures`` is a list of captured pointer components,
    which may be ``none``, or a combination of:

    - ``address``: The integral address of the pointer.
    - ``address_is_null`` (subset of ``address``): Whether the address is null.
    - ``provenance``: The ability to access the pointer for both read and write
      after the function returns.
    - ``read_provenance`` (subset of ``provenance``): The ability to access the
      pointer only for reads after the function returns.

    Additionally, it is possible to specify that some components are only
    captured in certain locations. Currently only the return value (``ret``)
    and other (default) locations are supported.

    The :ref:`pointer capture section <pointercapture>` discusses these
    semantics in more detail.

    Some examples of how to use the attribute:

    - ``captures(none)``: Pointer not captured.
    - ``captures(address, provenance)``: Equivalent to omitting the attribute.
    - ``captures(address)``: Address may be captured, but not provenance.
    - ``captures(address_is_null)``: Only captures whether the address is null.
    - ``captures(address, read_provenance)``: Both address and provenance
      captured, but only for read-only access.
    - ``captures(ret: address, provenance)``: Pointer captured through return
      value only.
    - ``captures(address_is_null, ret: address, provenance)``: The whole pointer
      is captured through the return value, and additionally whether the pointer
      is null is captured in some other way.

``nofree``
    This indicates that the callee does not free the pointer argument. This is
    not a valid attribute for return values.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

``returned``
    This indicates that the function always returns the argument as its return
    value. This is a hint to the optimizer and code generator used when
    generating the caller, allowing value propagation, tail call optimization,
    and omission of register saves and restores in some cases; it is not
    checked or enforced when generating the callee. The parameter and the
    function return type must be valid operands for the
    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
    return values and can only be applied to one parameter.

``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters.
    This is not
    checked or enforced by LLVM; if the parameter or return pointer is null,
    a :ref:`poison value <poisonvalues>` is returned or passed instead.
    To make a null pointer undefined behavior rather than poison, combine
    the ``nonnull`` attribute with the ``noundef`` attribute.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space), except if the
    ``null_pointer_is_valid`` function attribute is present.
    ``n`` should be a positive number. The pointer should be well defined,
    otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
    implies ``noundef``.

``dereferenceable_or_null(<n>)``
    This indicates that the parameter or return value isn't both
    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
    time. All non-null pointers tagged with
    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
    and in other address spaces ``dereferenceable_or_null(<n>)``
    implies that a pointer is at least one of ``dereferenceable(<n>)``
    or ``null`` (i.e. it may be both ``null`` and
    ``dereferenceable(<n>)``).
    This attribute may only be applied to
    pointer typed parameters.

``swiftself``
    This indicates that the parameter is the self/context parameter. This is not
    a valid attribute for return values and can only be applied to one
    parameter.

.. _swiftasync:

``swiftasync``
    This indicates that the parameter is the asynchronous context parameter and
    triggers the creation of a target-specific extended frame record to store
    this pointer. This is not a valid attribute for return values and can only
    be applied to one parameter.

``swifterror``
    This attribute is intended to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from, stored to, or used as
    a ``swifterror`` argument. This is not a valid attribute for return values
    and can only be applied to one parameter.

    These constraints allow the calling convention to optimize access to
    ``swifterror`` variables by associating them with a specific register at
    call boundaries rather than placing them in memory. Since this does change
    the calling convention, a function which uses the ``swifterror`` attribute
    on a parameter is not ABI-compatible with one which does not.

    These constraints also allow LLVM to assume that a ``swifterror`` argument
    does not alias any other memory visible within a function and that a
    ``swifterror`` alloca passed as an argument does not escape.

``immarg``
    This indicates the parameter is required to be an immediate
    value. This must be a trivial immediate integer or floating-point
    constant.
    Undef or constant expressions are not valid. This is
    only valid on intrinsic declarations and cannot be applied to a
    call site or arbitrary function.

``noundef``
    This attribute applies to parameters and return values. If the value
    representation contains any undefined or poison bits, the behavior is
    undefined. Note that this does not refer to padding introduced by the
    type's storage representation.

    If memory sanitizer is enabled, ``noundef`` becomes an ABI attribute and
    must match between the call-site and the function definition.

.. _nofpclass:

``nofpclass(<test mask>)``
    This attribute applies to parameters and return values with
    floating-point and vector of floating-point types, as well as
    :ref:`supported aggregates <fastmath_return_types>` of such types
    (matching the supported types for :ref:`fast-math flags <fastmath>`).
    The test mask has the same format as the second argument to the
    :ref:`llvm.is.fpclass <llvm.is.fpclass>`, and indicates which classes
    of floating-point values are not permitted for the value. For example
    a bitmask of 3 indicates the parameter may not be a NaN.

    If the value is a floating-point class indicated by the
    ``nofpclass`` test mask, a :ref:`poison value <poisonvalues>` is
    passed or returned instead.

.. code-block:: text
  :caption: The following invariants hold

       @llvm.is.fpclass(nofpclass(test_mask) %x, test_mask) => false
       @llvm.is.fpclass(nofpclass(test_mask) %x, ~test_mask) => true
       nofpclass(all) => poison
..

    In textual IR, various string names are supported for readability
    and can be combined. For example ``nofpclass(nan pinf nzero)``
    evaluates to a mask of 547.

    This does not depend on the floating-point environment. For
    example, a function parameter marked ``nofpclass(zero)`` indicates
    no zero inputs.
    If this is applied to an argument in a function
    marked with :ref:`\"denormal-fp-math\" <denormal_fp_math>`
    indicating zero treatment of input denormals, it does not imply the
    value cannot be a denormal value which would compare equal to 0.

.. table:: Recognized test mask names

    +-------+----------------------+---------------+
    | Name  | floating-point class | Bitmask value |
    +=======+======================+===============+
    | nan   | Any NaN              | 3             |
    +-------+----------------------+---------------+
    | inf   | +/- infinity         | 516           |
    +-------+----------------------+---------------+
    | norm  | +/- normal           | 264           |
    +-------+----------------------+---------------+
    | sub   | +/- subnormal        | 144           |
    +-------+----------------------+---------------+
    | zero  | +/- 0                | 96            |
    +-------+----------------------+---------------+
    | all   | All values           | 1023          |
    +-------+----------------------+---------------+
    | snan  | Signaling NaN        | 1             |
    +-------+----------------------+---------------+
    | qnan  | Quiet NaN            | 2             |
    +-------+----------------------+---------------+
    | ninf  | Negative infinity    | 4             |
    +-------+----------------------+---------------+
    | nnorm | Negative normal      | 8             |
    +-------+----------------------+---------------+
    | nsub  | Negative subnormal   | 16            |
    +-------+----------------------+---------------+
    | nzero | Negative zero        | 32            |
    +-------+----------------------+---------------+
    | pzero | Positive zero        | 64            |
    +-------+----------------------+---------------+
    | psub  | Positive subnormal   | 128           |
    +-------+----------------------+---------------+
    | pnorm | Positive normal      | 256           |
    +-------+----------------------+---------------+
    | pinf  | Positive infinity    | 512           |
    +-------+----------------------+---------------+


``alignstack(<n>)``
    This indicates the alignment that should be considered by the
    backend when
    assigning this parameter to a stack slot during calling convention
    lowering. The enforcement of the specified alignment is target-dependent,
    as target-specific calling convention rules may override this value. This
    attribute serves the purpose of carrying language specific alignment
    information that is not mapped to base types in the backend (for example,
    over-alignment specification through language attributes).

``allocalign``
    The function parameter marked with this attribute is the alignment in bytes of the
    newly allocated block returned by this function. The returned value must either have
    the specified alignment or be the null pointer. The return value MAY be more aligned
    than the requested alignment, but not less aligned. Invalid (e.g. non-power-of-2)
    alignments are permitted for the allocalign parameter, so long as the returned pointer
    is null. This attribute may only be applied to integer parameters.

``allocptr``
    The function parameter marked with this attribute is the pointer
    that will be manipulated by the allocator. For a realloc-like
    function the pointer will be invalidated upon success (but the
    same address may be returned), for a free-like function the
    pointer will always be invalidated.

``readnone``
    This attribute indicates that the function does not dereference that
    pointer argument, even though it may read or write the memory that the
    pointer points to if accessed through other pointers.

    If a function reads from or writes to a readnone pointer argument, the
    behavior is undefined.

``readonly``
    This attribute indicates that the function does not write through this
    pointer argument, even though it may write to the memory that the pointer
    points to.

    If a function writes to a readonly pointer argument, the behavior is
    undefined.
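
    As a hedged illustration of ``readnone`` and ``readonly`` (the function
    names below are hypothetical and not part of LLVM), ``@sum_pair`` only
    loads through its ``readonly`` argument, while ``@is_null`` inspects the
    address of its ``readnone`` argument without ever dereferencing it:

    .. code-block:: llvm

        ; Hypothetical example: every access through %p is a load.
        define i32 @sum_pair(ptr readonly %p) {
          %a = load i32, ptr %p
          %q = getelementptr inbounds i32, ptr %p, i64 1
          %b = load i32, ptr %q
          %s = add i32 %a, %b
          ret i32 %s
        }

        ; Hypothetical example: %p is compared but never loaded or stored.
        define i1 @is_null(ptr readnone %p) {
          %c = icmp eq ptr %p, null
          ret i1 %c
        }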

``writeonly``
    This attribute indicates that the function may write to, but does not read
    through this pointer argument (even though it may read from the memory that
    the pointer points to).

    This attribute is understood in the same way as the ``memory(write)``
    attribute. That is, the pointer may still be read as long as the read is
    not observable outside the function. See the ``memory`` documentation for
    precise semantics.

``writable``
    This attribute is only meaningful in conjunction with ``dereferenceable(N)``
    or another attribute that implies the first ``N`` bytes of the pointer
    argument are dereferenceable.

    In that case, the attribute indicates that the first ``N`` bytes will be
    (non-atomically) loaded and stored back on entry to the function.

    This implies that it's possible to introduce spurious stores on entry to
    the function without introducing traps or data races. This does not
    necessarily hold throughout the whole function, as the pointer may escape
    to a different thread during the execution of the function. See also the
    :ref:`atomic optimization guide <Optimization outside atomic>`.

    The "other attributes" that imply dereferenceability are
    ``dereferenceable_or_null`` (if the pointer is non-null) and the
    ``sret``, ``byval``, ``byref``, ``inalloca``, ``preallocated`` family of
    attributes. Note that not all of these combinations are useful, e.g.
    ``byval`` arguments are known to be writable even without this attribute.

    The ``writable`` attribute cannot be combined with ``readnone``,
    ``readonly`` or a ``memory`` attribute that does not contain
    ``argmem: write``.

``initializes((Lo1, Hi1), ...)``
    This attribute indicates that the function initializes the ranges of the
    pointer parameter's memory, ``[%p+LoN, %p+HiN)``.
    Initialization of memory
    means the first memory access is a non-volatile, non-atomic write. The
    write must happen before the function returns. If the function unwinds,
    the write may not happen.

    This attribute only holds for the memory accessed via this pointer
    parameter. Other arbitrary accesses to the same memory via other pointers
    are allowed.

    The ``writable`` and ``dereferenceable`` attributes do not imply the
    ``initializes`` attribute. The ``initializes`` attribute does not imply
    ``writeonly`` since ``initializes`` allows reading from the pointer
    after writing.

    This attribute is a list of constant ranges in ascending order with no
    overlapping or consecutive list elements. ``LoN/HiN`` are 64-bit integers,
    and negative values are allowed in case the argument points partway into
    an allocation. An empty list is not allowed.

``dead_on_unwind``
    At a high level, this attribute indicates that the pointer argument is dead
    if the call unwinds, in the sense that the caller will not depend on the
    contents of the memory. Stores that would only be visible on the unwind
    path can be elided.

    More precisely, the behavior is as-if any memory written through the
    pointer during the execution of the function is overwritten with a poison
    value on unwind. This includes memory written by the implicit write implied
    by the ``writable`` attribute. The caller is allowed to access the affected
    memory, but all loads that are not preceded by a store will return poison.

    This attribute cannot be applied to return values.

``range(<ty> <a>, <b>)``
    This attribute expresses the possible range of the parameter or return value.
    If the value is not in the specified range, it is converted to poison.
    The arguments passed to ``range`` have the following properties:

    - The type must match the scalar type of the parameter or return value.
    - The pair ``a,b`` represents the range ``[a,b)``.
    - Both ``a`` and ``b`` are constants.
    - The range is allowed to wrap.
    - The empty range is represented using ``0,0``.
    - Otherwise, ``a`` and ``b`` are not allowed to be equal.

    This attribute may only be applied to parameters or return values with integer
    or vector of integer types.

    For vector-typed parameters, the range is applied element-wise.

.. _gc:

Garbage Collector Strategy Names
--------------------------------

Each function may specify a garbage collector strategy name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.

.. _prefixdata:

Prefix Data
-----------

Prefix data is data associated with a function which the code
generator will emit immediately before the function's entrypoint.
The purpose of this feature is to allow frontends to associate
language-specific runtime metadata with specific functions and make it
available through the function pointer while still allowing the
function pointer to be called.

To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1.
This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``,

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as,

.. code-block:: llvm

    %a = getelementptr inbounds i32, ptr @f, i32 -1
    %b = load i32, ptr %a

Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned. If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.

A function may have prefix data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _prologuedata:

Prologue Data
-------------

The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
be inserted prior to the function body. This can be used for enabling
function hot-patching and instrumentation.

To maintain the semantics of ordinary function calls, the prologue data must
have a particular format. Specifically, it must begin with a sequence of
bytes which decode to a sequence of machine instructions, valid for the
module's target, which transfer control to the point immediately succeeding
the prologue data, without performing any other visible action. This allows
the inliner and other passes to reason about the semantics of the function
definition without needing to reason about the prologue data. Obviously this
makes the format of the prologue data highly target dependent.

A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
which encodes the ``nop`` instruction:

.. code-block:: text

    define void @f() prologue i8 144 { ... }

Generally prologue data can be formed by encoding a relative branch instruction
which skips the metadata, as in this example of valid prologue data for the
x86_64 architecture, where the first two bytes encode ``jmp .+10``:

.. code-block:: text

    %0 = type <{ i8, i8, ptr }>

    define void @f() prologue %0 <{ i8 235, i8 8, ptr @md}> { ... }

A function may have prologue data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _personalityfn:

Personality Function
--------------------

The ``personality`` attribute permits functions to specify what function
to use for exception handling.

.. _attrgrp:

Attribute Groups
----------------

Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because many
functions use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.

An attribute group is a module-level object. To use an attribute group, an
object references the attribute group's ID (e.g. ``#37``). An object may refer
to more than one attribute group. In that situation, the attributes from the
different groups are merged.

Here is an example of attribute groups for a function that should always be
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:

.. code-block:: llvm

    ; Target-independent attributes:
    attributes #0 = { alwaysinline alignstack=4 }

    ; Target-dependent attributes:
    attributes #1 = { "no-sse" }

    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
    define void @f() #0 #1 { ... }

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.

Function attributes are simple keywords or strings that follow the specified
type. Multiple attributes, when required, are separated by spaces.
For example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }
    define void @f() "no-sse" { ... }

``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
``"alloc-family"="FAMILY"``
    This indicates which "family" an allocator function is part of. To avoid
    collisions, the family name should match the mangled name of the primary
    allocator function, that is "malloc" for malloc/calloc/realloc/free,
    "_Znwm" for ``::operator new`` and ``::operator delete``, and
    "_ZnwmSt11align_val_t" for aligned ``::operator new`` and
    ``::operator delete``. Matching malloc/realloc/free calls within a family
    can be optimized, but mismatched ones will be left alone.
``allockind("KIND")``
    Describes the behavior of an allocation function.
    The KIND string contains comma
    separated entries from the following options:

    * "alloc": the function returns a new block of memory or null.
    * "realloc": the function returns a new block of memory or null. If the
      result is non-null the memory contents from the start of the block up to
      the smaller of the original allocation size and the new allocation size
      will match that of the ``allocptr`` argument and the ``allocptr``
      argument is invalidated, even if the function returns the same address.
    * "free": the function frees the block of memory specified by ``allocptr``.
      Functions marked as "free" ``allockind`` must return void.
    * "uninitialized": Any newly-allocated memory (either a new block from
      an "alloc" function or the enlarged capacity from a "realloc" function)
      will be uninitialized.
    * "zeroed": Any newly-allocated memory (either a new block from an "alloc"
      function or the enlarged capacity from a "realloc" function) will be
      zeroed.
    * "aligned": the function returns memory aligned according to the
      ``allocalign`` parameter.

    The first three options are mutually exclusive, and the remaining options
    describe more details of how the function behaves. The remaining options
    are invalid for "free"-type functions.
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer. If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available. The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
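
    The allocator attributes above are typically used together. The
    following declarations are a minimal sketch of how that might look; the
    ``@my_*`` functions and the ``"my_alloc"`` family name are hypothetical
    and chosen only for illustration:

    .. code-block:: llvm

        ; Hypothetical allocator family; none of these functions exist in LLVM.
        ; Returns at least Args[0] bytes of uninitialized memory, or null.
        declare ptr @my_alloc(i64) allockind("alloc,uninitialized") allocsize(0) "alloc-family"="my_alloc"

        ; Returns at least Args[0] * Args[1] bytes of zeroed memory, or null.
        declare ptr @my_calloc(i64, i64) allockind("alloc,zeroed") allocsize(0, 1) "alloc-family"="my_alloc"

        ; The first parameter is the requested alignment, the second the size.
        declare ptr @my_aligned_alloc(i64 allocalign, i64) allockind("alloc,uninitialized,aligned") allocsize(1) "alloc-family"="my_alloc"

        ; The allocptr argument is invalidated when the result is non-null.
        declare ptr @my_realloc(ptr allocptr, i64) allockind("realloc") allocsize(1) "alloc-family"="my_alloc"

        ; "free"-kind functions return void and take no extra options.
        declare void @my_free(ptr allocptr) allockind("free") "alloc-family"="my_alloc"

    All five declarations share one ``"alloc-family"`` value, following the
    convention above of naming the family after the primary allocator
    function.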
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold; and, thus, given low
    weight.

.. _attr_convergent:

``convergent``
    This attribute indicates that this function is convergent.
    When it appears on a call/invoke, the ``convergent`` attribute
    indicates that we should treat the call as though we’re calling a
    convergent function. This is particularly useful on indirect
    calls; without this we may treat such calls as though the target
    is non-convergent.

    See :doc:`ConvergentOperations` for further details.

    It is an error to call :ref:`llvm.experimental.convergence.entry
    <llvm.experimental.convergence.entry>` from a function that
    does not have this attribute.
``disable_sanitizer_instrumentation``
    When instrumenting code with sanitizers, it can be important to skip certain
    functions to ensure no instrumentation is applied to them.

    This attribute is not always equivalent to omitting the ``sanitize_<name>``
    attributes: depending on the specific sanitizer, code can be inserted into
    functions regardless of the ``sanitize_<name>`` attribute to prevent false
    positive reports.

    ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
    taking precedence over the ``sanitize_<name>`` attributes and other compiler
    flags.
``"dontcall-error"``
    This attribute denotes that an error diagnostic should be emitted when a
    call of a function with this attribute is not eliminated via optimization.
    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
    such callees to attach information about where in the source language such a
    call came from. A string value can be provided as a note.
``"dontcall-warn"``
    This attribute denotes that a warning diagnostic should be emitted when a
    call of a function with this attribute is not eliminated via optimization.
    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
    such callees to attach information about where in the source language such a
    call came from. A string value can be provided as a note.
``fn_ret_thunk_extern``
    This attribute tells the code generator that returns from functions should
    be replaced with jumps to externally-defined architecture-specific symbols.
    For X86, this symbol's identifier is ``__x86_return_thunk``.
``"frame-pointer"``
    This attribute tells the code generator whether the function
    should keep the frame pointer. The code generator may emit the frame pointer
    even if this attribute says the frame pointer can be eliminated.
    The allowed string values are:

    * ``"none"`` (default) - the frame pointer can be eliminated, and its
      register can be used for other purposes.
    * ``"reserved"`` - the frame pointer register must either be updated to
      point to a valid frame record for the current function, or not be
      modified.
    * ``"non-leaf"`` - the frame pointer should be kept if the function calls
      other functions.
    * ``"all"`` - the frame pointer should be kept.
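
    As a small sketch of how the string values above are attached (the
    function names ``@helper`` and ``@entry`` are hypothetical), the
    attribute is written as a key/value string pair:

    .. code-block:: llvm

        ; Hypothetical leaf function: with "non-leaf" the frame pointer may
        ; be eliminated here because the function makes no calls.
        define void @helper() "frame-pointer"="non-leaf" {
          ret void
        }

        ; Hypothetical caller: "all" asks for the frame pointer to be kept.
        define void @entry() "frame-pointer"="all" {
          call void @helper()
          ret void
        }

    Whether a frame pointer is actually emitted still depends on the code
    generator, as noted above.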
``hot``
    This attribute indicates that this function is a hot spot of the program
    execution. The function will be optimized more aggressively and will be
    placed into a special subsection of the text section to improve locality.

    When profile feedback is enabled, this attribute takes precedence over
    the profile information. By marking a function ``hot``, users can work
    around the cases where the training input does not have good coverage
    on all the hot functions.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer. Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
``memory(...)``
    This attribute specifies the possible memory effects of the call-site or
    function. It allows specifying the possible access kinds (``none``,
    ``read``, ``write``, or ``readwrite``) for the possible memory location
    kinds (``argmem``, ``inaccessiblemem``, as well as a default). It is best
    understood by example:

    - ``memory(none)``: Does not access any memory.
    - ``memory(read)``: May read (but not write) any memory.
    - ``memory(write)``: May write (but not read) any memory.
    - ``memory(readwrite)``: May read or write any memory.
    - ``memory(argmem: read)``: May only read argument memory.
    - ``memory(argmem: read, inaccessiblemem: write)``: May only read argument
      memory and only write inaccessible memory.
    - ``memory(read, argmem: readwrite)``: May read any memory (default mode)
      and additionally write argument memory.
    - ``memory(readwrite, argmem: none)``: May access any memory apart from
      argument memory.

    The supported access kinds are:

    - ``readwrite``: Any kind of access to the location is allowed.
    - ``read``: The location is only read. Writing to the location is immediate
      undefined behavior. This includes the case where the location is read from
      and then the same value is written back.
    - ``write``: Only writes to the location are observable outside the function
      call. However, the function may still internally read the location after
      writing it, as this is not observable. Reading the location prior to
      writing it results in a poison value.
    - ``none``: No reads or writes to the location are observed outside the
      function. It is always valid to read and write allocas, and to read global
      constants, even if ``memory(none)`` is used, as these effects are not
      externally observable.

    The supported memory location kinds are:

    - ``argmem``: This refers to accesses that are based on pointer arguments
      to the function.
    - ``inaccessiblemem``: This refers to accesses to memory which is not
      accessible by the current module (before return from the function -- an
      allocator function may return newly accessible memory while only
      accessing inaccessible memory itself). Inaccessible memory is often used
      to model control dependencies of intrinsics.
    - The default access kind (specified without a location prefix) applies to
      all locations that haven't been specified explicitly, including those that
      don't currently have a dedicated location kind (e.g. accesses to globals
      or captured pointers).
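
    The access and location kinds listed above might be combined as in the
    following sketch. All function names here are hypothetical and serve only
    to show where the attribute is written:

    .. code-block:: llvm

        ; Only reads memory reachable through its pointer argument.
        declare i32 @load_first(ptr) memory(argmem: read)

        ; Allocator-like function: only touches memory that the current
        ; module cannot access.
        declare ptr @my_alloc(i64) memory(inaccessiblemem: readwrite)

        ; No externally observable memory access at all.
        define i32 @square(i32 %x) memory(none) {
        entry:
          %r = mul i32 %x, %x
          ret i32 %r
        }

        define i32 @caller(ptr %p) {
        entry:
          ; The attribute may also be written on an individual call site.
          %v = call i32 @load_first(ptr %p) memory(argmem: read)
          ret i32 %v
        }
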

    If the ``memory`` attribute is not specified, then ``memory(readwrite)``
    is implied (all memory effects are possible).

    The memory effects of a call can be computed as
    ``CallSiteEffects & (FunctionEffects | OperandBundleEffects)``. Thus, the
    call-site annotation takes precedence over the potential effects described
    by either the function annotation or the operand bundles.
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
    This attribute is incompatible with the ``optdebug`` and ``optnone``
    attributes.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences. The arguments of
    a ``naked`` function cannot be referenced through IR values.
``"no-inline-line-tables"``
    When this attribute is set to true, the inliner discards source locations
    when inlining code and instead uses the source location of the call site.
    Breakpoints set on code that was inlined into the current function will
    not fire during the execution of the inlined call sites. If the debugger
    stops inside an inlined call site, it will appear to be stopped at the
    outermost inlined call site.
``no-jump-tables``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute.
    This is valid at call sites
    and on function declarations and definitions.
``nocallback``
    This attribute indicates that the function is only allowed to jump back into
    the caller's module by a return or an exception, and is not allowed to jump
    back by invoking a callback function, a direct, possibly transitive, external
    function call, use of ``longjmp``, or other means. It is a compiler hint that
    is used at module level to improve dataflow analysis, dropped during linking,
    and has no effect on functions defined in the current module.
``nodivergencesource``
    A call to this function is not a source of divergence. In uniformity
    analysis, a *source of divergence* is an instruction that generates
    divergence even if its inputs are uniform. A call with no further information
    would normally be considered a source of divergence; setting this attribute
    on a function means that a call to it is not a source of divergence.
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``nofree``
    This function attribute indicates that the function does not, directly or
    transitively, call a memory-deallocation function (``free``, for example)
    on a memory allocation which existed before the call.

    As a result, uncaptured pointers that are known to be dereferenceable
    prior to a call to a function with the ``nofree`` attribute are still
    known to be dereferenceable after the call.
    The capturing condition is
    necessary in environments where the function might communicate the
    pointer to another thread which then deallocates the memory. Alternatively,
    ``nosync`` would ensure such communication cannot happen and even captured
    pointers cannot be freed by the function.

    A ``nofree`` function is explicitly allowed to free memory which it
    allocated or (if not ``nosync``) arrange for another thread to free
    memory on its behalf. As a result, perhaps surprisingly, a ``nofree``
    function can return a pointer to a previously deallocated memory object.
``noimplicitfloat``
    Disallows implicit floating-point code. This inhibits optimizations that
    use floating-point code and floating-point registers for operations that are
    not nominally floating-point. LLVM instructions that perform floating-point
    operations or require access to floating-point registers may still cause
    floating-point code to be generated.

    Also inhibits optimizations that create SIMD/vector code and registers from
    scalar code such as vectorization or memcpy/memset optimization. This
    includes integer vectors. Vector instructions present in IR may still cause
    vector code to be generated.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nomerge``
    This attribute indicates that calls to this function should never be merged
    during optimization. For example, it will prevent tail merging otherwise
    identical code sequences that raise an exception or terminate the program.
    Tail merging normally reduces the precision of source location information,
    making stack traces less useful for debugging. This attribute gives the
    user control over the tradeoff between code size and debug information
    precision.
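
    As a rough illustration of the situation ``nomerge`` targets, the sketch
    below has two otherwise identical failure paths. The names
    ``@report_failure`` and ``@check`` are hypothetical:

    .. code-block:: llvm

        ; Hypothetical error handler. nomerge asks optimizations to keep
        ; each call below separate instead of folding them into one.
        declare void @report_failure() nomerge noreturn

        define void @check(i1 %a, i1 %b) {
        entry:
          br i1 %a, label %check_b, label %fail_a

        check_b:
          br i1 %b, label %ok, label %fail_b

        fail_a:
          call void @report_failure()
          unreachable

        fail_b:
          call void @report_failure()
          unreachable

        ok:
          ret void
        }
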
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``noprofile``
    This function attribute prevents instrumentation based profiling, used for
    coverage or profile based optimization, from being added to a function. It
    also blocks inlining if the caller and callee have different values of this
    attribute.
``skipprofile``
    This function attribute prevents instrumentation based profiling, used for
    coverage or profile based optimization, from being added to a function. This
    attribute does not restrict inlining, so instrumented instructions could end
    up in this function.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``indirect-tls-seg-refs``
    This attribute indicates that the code generator should not use
    direct TLS access through segment registers, even if the
    target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally, that is, through a return instruction. This produces undefined
    behavior at runtime if the function ever does dynamically return. Annotated
    functions may still raise an exception, i.e., ``nounwind`` is not implied.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.

.. _langref_willreturn:

``willreturn``
    This function attribute indicates that a call of this function will
    either exhibit undefined behavior or come back and continue execution
    at a point in the existing call stack that includes the current invocation.
    Annotated functions may still raise an exception, i.e., ``nounwind`` is not
    implied.
    If an invocation of an annotated function does not return control back
    to a point in the call stack, the behavior is undefined.
``nosync``
    This function attribute indicates that the function does not communicate
    (synchronize) with another thread through memory or other well-defined means.
    Synchronization is considered possible in the presence of ``atomic`` accesses
    that enforce an order, thus not "unordered" and "monotonic", ``volatile``
    accesses, as well as ``convergent`` function calls.

    Note that ``convergent`` operations can involve communication that is
    considered to be not through memory and does not necessarily imply an
    ordering between threads for the purposes of the memory model. Therefore,
    an operation can be both ``convergent`` and ``nosync``.

    If a ``nosync`` function does ever synchronize with another thread,
    the behavior is undefined.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked ``nounwind`` may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation defined semantics.
``nosanitize_bounds``
    This attribute indicates that bounds checking sanitizer instrumentation
    is disabled for this function.
``nosanitize_coverage``
    This attribute indicates that SanitizerCoverage instrumentation is disabled
    for this function.
``null_pointer_is_valid``
    If ``null_pointer_is_valid`` is set, then the ``null`` address
    in address-space 0 is considered to be a valid address for memory loads and
    stores. Any analysis or optimization should not treat dereferencing a
    pointer to ``null`` as undefined behavior in this function.
    Note: Comparing the address of a global variable to ``null`` may still
    evaluate to false because of a limitation in querying this attribute inside
    constant expressions.
``optdebug``
    This attribute suggests that optimization passes and code generator passes
    should make choices that try to preserve debug info without significantly
    degrading runtime performance.
    This attribute is incompatible with the ``minsize``, ``optsize``, and
    ``optnone`` attributes.
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize``, ``optsize``, and ``optdebug`` attributes.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
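
    A minimal sketch of the pairing described above, with ``optnone`` and
    ``noinline`` specified together. The function name ``@debug_me`` is
    hypothetical:

    .. code-block:: llvm

        ; optnone must be accompanied by noinline on the same function.
        define i32 @debug_me(i32 %x) optnone noinline {
        entry:
          %r = add i32 %x, 1
          ret i32 %r
        }
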
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
    This attribute is incompatible with the ``optdebug`` and ``optnone``
    attributes.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

    * ``"prologue-short-redirect"`` - This style of patchable
      function is intended to support patching a function prologue to
      redirect control away from the function in a thread safe
      manner. It guarantees that the first instruction of the
      function will be large enough to accommodate a short jump
      instruction, and will be sufficiently aligned to allow being
      fully changed via an atomic compare-and-swap instruction.
      While the first requirement can be satisfied by inserting a large
      enough NOP, LLVM can and will try to re-purpose an existing
      instruction (i.e. one that would have to be emitted anyway) as
      the patchable instruction larger than a short jump.

      ``"prologue-short-redirect"`` is currently only supported on
      x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. All of the semantic effects the
    patching may have must be separately conveyed via the linkage type.
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack.
    It ensures that accesses to the stack must be
    no further apart than the size of the guard region to a previous
    access of the stack. It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``"stack-probe-size"``
    This attribute controls the behavior of stack probes: either
    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.

    If a function that has a ``"stack-probe-size"`` attribute is inlined into
    a function with another ``"stack-probe-size"`` attribute, the resulting
    function has the ``"stack-probe-size"`` attribute that has the lower
    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
    inlined into a function that has no ``"stack-probe-size"`` attribute
    at all, the resulting function has the ``"stack-probe-size"`` attribute
    of the callee.
``"no-stack-arg-probe"``
    This attribute disables ABI-required stack probes, if any.
``returns_twice``
    This attribute indicates that this function can return twice. The C
    ``setjmp`` is an example of such a function. The compiler disables
    some optimizations (like tail calls) in the caller of these
    functions.
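
    A small sketch of how the ``setjmp`` case mentioned above might appear
    in IR. The caller ``@checkpoint`` is a hypothetical name, and the
    ``setjmp`` signature shown is a simplification:

    .. code-block:: llvm

        ; The declaration carries the attribute.
        declare i32 @setjmp(ptr) returns_twice

        define i32 @checkpoint(ptr %env) {
        entry:
          ; Control may arrive here a second time, so optimizations in
          ; @checkpoint have to be conservative around this call.
          %r = call i32 @setjmp(ptr %env)
          ret i32 %r
        }
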
``safestack``
    This attribute indicates that
    `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
    protection is enabled for this function.

    If a function that has a ``safestack`` attribute is inlined into a
    function that doesn't have a ``safestack`` attribute or which has an
    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
    function will have a ``safestack`` attribute.
``sanitize_address``
    This attribute indicates that AddressSanitizer checks
    (dynamic address safety analysis) are enabled for this function.
``sanitize_memory``
    This attribute indicates that MemorySanitizer checks (dynamic detection
    of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``
    This attribute indicates that ThreadSanitizer checks
    (dynamic thread safety analysis) are enabled for this function.
``sanitize_hwaddress``
    This attribute indicates that HWAddressSanitizer checks
    (dynamic address safety analysis based on tagged pointers) are enabled for
    this function.
``sanitize_memtag``
    This attribute indicates that MemTagSanitizer checks
    (dynamic address safety analysis based on Armv8 MTE) are enabled for
    this function.
``sanitize_realtime``
    This attribute indicates that RealtimeSanitizer checks
    (realtime safety analysis - no allocations, syscalls or exceptions) are enabled
    for this function.
``sanitize_realtime_blocking``
    This attribute indicates that RealtimeSanitizer should error immediately
    if the attributed function is called during invocation of a function
    attributed with ``sanitize_realtime``.
    This attribute is incompatible with the ``sanitize_realtime`` attribute.
``speculative_load_hardening``
    This attribute indicates that
    `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
    should be enabled for the function body.

    Speculative Load Hardening is a best-effort mitigation against
    information leak attacks that make use of control flow
    misspeculation, specifically misspeculation of whether a branch
    is taken or not. Typically vulnerabilities enabling such attacks are
    classified as "Spectre variant #1". Notably, this does not attempt to
    mitigate against misspeculation of branch target, classified as
    "Spectre variant #2" vulnerabilities.

    When inlining, the attribute is sticky. Inlining a function that carries
    this attribute will cause the caller to gain the attribute. This is intended
    to provide a maximally conservative model where the code in a function
    annotated with this attribute will always (even after inlining) end up
    hardened.
``speculatable``
    This function attribute indicates that the function does not have any
    effects besides calculating its result and does not have undefined behavior.
    Note that ``speculatable`` is not enough to conclude that along any
    particular execution path the number of calls to this function will not be
    externally observable. This attribute is only valid on functions
    and declarations, not on individual call sites. If a function is
    incorrectly marked as speculatable and really does exhibit
    undefined behavior, the undefined behavior may be observed even
    if the call site is dead code.

``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary": a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    If a function with an ``ssp`` attribute is inlined into a calling function,
    the attribute is not carried over to the calling function.

``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors. The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.

    If a function with an ``sspstrong`` attribute is inlined into a calling
    function which has an ``ssp`` attribute, the calling function's attribute
    will be upgraded to ``sspstrong``.

``sspreq``
    This attribute indicates that the function should *always* emit a stack
    smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
    attributes.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    If a function with an ``sspreq`` attribute is inlined into a calling
    function which has an ``ssp`` or ``sspstrong`` attribute, the calling
    function's attribute will be upgraded to ``sspreq``.

.. _strictfp:

``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating-point semantics. LLVM will not attempt any
    optimizations that require assumptions about the floating-point rounding
    mode or that might alter the state of floating-point status flags that
    might otherwise be set or cleared by calling this function. LLVM will
    not introduce any new floating-point instructions that may trap.

.. _denormal_fp_math:

``"denormal-fp-math"``
    This indicates the denormal (subnormal) handling that may be
    assumed for the default floating-point environment. This is a
    comma separated pair. The elements may be one of ``"ieee"``,
    ``"preserve-sign"``, ``"positive-zero"``, or ``"dynamic"``. The
    first entry indicates the flushing mode for the result of floating
    point operations. The second indicates the handling of denormal inputs
    to floating point instructions. For compatibility with older
    bitcode, if the second value is omitted, both input and output
    modes will assume the same mode.

    If this attribute is not specified, the default is ``"ieee,ieee"``.
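
    As a sketch of the comma separated pair described above, the attribute
    group below selects ``"preserve-sign"`` for results and ``"ieee"`` for
    inputs. The function name ``@halve`` is hypothetical:

    .. code-block:: llvm

        define float @halve(float %x) #0 {
        entry:
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }

        ; First entry: output mode. Second entry: input mode.
        attributes #0 = { "denormal-fp-math"="preserve-sign,ieee" }
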

    If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
    denormal outputs may be flushed to zero by standard floating-point
    operations. It is not mandated that flushing to zero occurs, but if
    a denormal output is flushed to zero, it must respect the sign
    mode. Not all targets support all modes.

    If the mode is ``"dynamic"``, the behavior is derived from the
    dynamic state of the floating-point environment. Transformations
    which depend on the behavior of denormal values should not be
    performed.

    While this indicates the expected floating point mode the function
    will be executed with, this does not make any attempt to ensure
    the mode is consistent. User or platform code is expected to set
    the floating point mode appropriately before function entry.

    If the input mode is ``"preserve-sign"``, or ``"positive-zero"``,
    a floating-point operation must treat any input denormal value as
    zero. In some situations, if an instruction does not respect this
    mode, the input may need to be converted to 0 as if by
    ``@llvm.canonicalize`` during lowering for correctness.

``"denormal-fp-math-f32"``
    Same as ``"denormal-fp-math"``, but only controls the behavior of
    the 32-bit float type (or vectors of 32-bit floats). If both
    are present, this overrides ``"denormal-fp-math"``. Not all targets
    support separately setting the denormal mode per type, and no
    attempt is made to diagnose unsupported uses. Currently this
    attribute is respected by the AMDGPU and NVPTX backends.

``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
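
    One possible shape of such a function is sketched below. It assumes a
    thunk whose prototype already matches its target, and the names
    ``@impl`` and ``@impl_thunk`` are hypothetical:

    .. code-block:: llvm

        declare void @impl(ptr)

        ; Delegates to @impl with a tail call and does nothing else.
        define void @impl_thunk(ptr %this) "thunk" {
        entry:
          musttail call void @impl(ptr %this)
          ret void
        }
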
``uwtable[(sync|async)]``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exception passes by it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units. The optional parameter describes what kind of unwind tables
    to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous
    (instruction precise) unwind tables. Without the parameter, the attribute
    ``uwtable`` is equivalent to ``uwtable(async)``.
``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity to give fine-grained control over the HW control flow protection
    mechanism. The flag is target independent and currently applies to a
    function or function pointer.
``shadowcallstack``
    This attribute indicates that the ShadowCallStack checks are enabled for
    the function. The instrumentation checks that the return address for the
    function has not changed between the function prolog and epilog. It is
    currently x86_64-specific.

.. _langref_mustprogress:

``mustprogress``
    This attribute indicates that the function is required to return, unwind,
    or interact with the environment in an observable way, e.g. via a volatile
    memory access, I/O, or other synchronization. The ``mustprogress``
    attribute is intended to model the requirements of the first section of
    [intro.progress] of the C++ Standard. As a consequence, a loop in a
    function with the ``mustprogress`` attribute can be assumed to terminate if
    it does not interact with the environment in an observable way, and
    terminating loops without side-effects can be removed. If a ``mustprogress``
    function does not satisfy this contract, the behavior is undefined.
    If a
    ``mustprogress`` function calls a function not marked ``mustprogress``,
    and that function never returns, the program is well-defined even if there
    isn't any other observable progress. Note that ``willreturn`` implies
    ``mustprogress``.
``"warn-stack-size"="<threshold>"``
    This attribute sets a threshold to emit diagnostics once the frame size is
    known should the frame size exceed the specified value. It takes one
    required integer value, which should be a non-negative integer, and less
    than ``UINT_MAX``. It's unspecified which threshold will be used when
    duplicate definitions are linked together with differing values.
``vscale_range(<min>[, <max>])``
    This function attribute indicates ``vscale`` is a power-of-two within a
    specified range. ``min`` must be a power-of-two that is greater than 0. When
    specified, ``max`` must be a power-of-two greater-than-or-equal to ``min`` or
    0 to signify an unbounded maximum. The syntax ``vscale_range(<val>)`` can be
    used to set both ``min`` and ``max`` to the same value. Functions that don't
    include this attribute make no assumptions about the value of ``vscale``.
``"nooutline"``
    This attribute indicates that outlining passes should not modify the
    function.

Call Site Attributes
--------------------

In addition to function attributes the following call site only
attributes are supported:

``vector-function-abi-variant``
    This attribute can be attached to a :ref:`call <i_call>` to list
    the vector functions associated with the function. Notice that the
    attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
    :ref:`callbr <i_callbr>` instruction. The attribute consists of a
    comma separated list of mangled names. The order of the list does
    not imply preference (it is logically a set). The compiler is free
    to pick any listed vector function of its choosing.

    The syntax for the mangled names is as follows::

        _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]

    When present, the attribute informs the compiler that the function
    ``<scalar_name>`` has a corresponding vector variant that can be
    used to perform the concurrent invocation of ``<scalar_name>`` on
    vectors. The shape of the vector function is described by the
    tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
    token. The standard name of the vector function is
    ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
    the optional token ``(<vector_redirection>)`` informs the compiler
    that a custom name is provided in addition to the standard one
    (custom names can be provided for example via the use of ``declare
    variant`` in OpenMP 5.0). The declaration of the variant must be
    present in the IR Module. The signature of the vector variant is
    determined by the rules of the Vector Function ABI (VFABI)
    specifications of the target. For Arm and X86, the VFABI can be
    found at https://github.com/ARM-software/abi-aa and
    https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
    respectively.

    For X86 and Arm targets, the values of the tokens in the standard
    name are those that are defined in the VFABI. LLVM has an internal
    ``<isa>`` token that can be used to create scalar-to-vector
    mappings for functions that are not directly associated with any of
    the target ISAs (for example, some of the mappings stored in the
    TargetLibraryInfo).
    Valid values for the ``<isa>`` token are::

        <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
              | n | s          -> Armv8 Advanced SIMD, SVE
              | __LLVM__       -> Internal LLVM Vector ISA

    For all targets currently supported (x86, Arm and Internal LLVM),
    the remaining tokens can have the following values::

        <mask>:= M | N         -> mask | no mask

        <vlen>:= number        -> number of lanes
               | x             -> VLA (Vector Length Agnostic)

        <parameters>:= v              -> vector
                     | l | l <number> -> linear
                     | R | R <number> -> linear with ref modifier
                     | L | L <number> -> linear with val modifier
                     | U | U <number> -> linear with uval modifier
                     | ls <pos>       -> runtime linear
                     | Rs <pos>       -> runtime linear with ref modifier
                     | Ls <pos>       -> runtime linear with val modifier
                     | Us <pos>       -> runtime linear with uval modifier
                     | u              -> uniform

        <scalar_name>:= name of the scalar function

        <vector_redirection>:= optional, custom name of the vector function

``preallocated(<ty>)``
    This attribute is required on calls to ``llvm.call.preallocated.arg``
    and cannot be used on any other call. See
    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
    details.

.. _glattrs:

Global Attributes
-----------------

Attributes may be set to communicate additional information about a global variable.
Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
are grouped into a single :ref:`attribute group <attrgrp>`.

``no_sanitize_address``
    This attribute indicates that the global variable should not have
    AddressSanitizer instrumentation applied to it, because it was annotated
    with `__attribute__((no_sanitize("address")))`,
    `__attribute__((disable_sanitizer_instrumentation))`, or included in the
    `-fsanitize-ignorelist` file.
``no_sanitize_hwaddress``
    This attribute indicates that the global variable should not have
    HWAddressSanitizer instrumentation applied to it, because it was annotated
    with `__attribute__((no_sanitize("hwaddress")))`,
    `__attribute__((disable_sanitizer_instrumentation))`, or included in the
    `-fsanitize-ignorelist` file.
``sanitize_memtag``
    This attribute indicates that the global variable should have AArch64 memory
    tags (MTE) instrumentation applied to it. This attribute causes the
    suppression of certain optimizations, like GlobalMerge, as well as ensuring
    extra directives are emitted in the assembly and extra bits of metadata are
    placed in the object file so that the linker can ensure the accesses are
    protected by MTE. This attribute is added by clang when
    `-fsanitize=memtag-globals` is provided, as long as the global is not marked
    with `__attribute__((no_sanitize("memtag")))`,
    `__attribute__((disable_sanitizer_instrumentation))`, or included in the
    `-fsanitize-ignorelist` file. The AArch64 Globals Tagging pass may remove
    this attribute when it's not possible to tag the global (e.g. it's a TLS
    variable).
``sanitize_address_dyninit``
    This attribute indicates that the global variable, when instrumented with
    AddressSanitizer, should be checked for ODR violations. This attribute is
    applied to global variables that are dynamically initialized according to
    C++ rules.

.. _opbundles:

Operand Bundles
---------------

Operand bundles are tagged sets of SSA values or metadata strings that can be
associated with certain LLVM instructions (currently only ``call`` s and
``invoke`` s). In a way they are like metadata, but dropping them is
incorrect and will change program semantics.

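
As a rough illustration, a call site carrying two operand bundles might look
like the sketch below. The function names and the bundle tags ``"example-a"``
and ``"example-b"`` are hypothetical; they are chosen only to show the shape of
the notation and carry no special meaning to LLVM.

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(ptr %p) {
      ; Two bundles on one call site: the first has two bundle operands,
      ; the second has none. Both tags are illustrative placeholders.
      call void @callee(i32 0) [ "example-a"(i32 1, ptr %p), "example-b"() ]
      ret void
    }

The bundles follow the ordinary argument list and are not part of it:
``@callee`` still takes a single ``i32`` argument.
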

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value | metadata string
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles. This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.

Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages. While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program. These restrictions
are described as the semantics of an "unknown" operand bundle. As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target specifies
  a ``memory`` attribute), unless they're overridden with
  callsite specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function. Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. Exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``. They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap. Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles. Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation. The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

    define void @f() {
      call void @x()  ;; no deopt state
      call void @y() [ "deopt"(i32 10) ]
      call void @y() [ "deopt"(i32 10), "unknown"(ptr null) ]
      ret void
    }

    define void @g() {
      call void @f() [ "deopt"(i32 20) ]
      ret void
    }

will result in

.. code-block:: llvm

    define void @g() {
      call void @x()  ;; still no deopt state
      call void @y() [ "deopt"(i32 20, i32 10) ]
      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(ptr null) ]
      ret void
    }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag. These operand bundles indicate that a call site
is within a particular funclet. There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.

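
The funclet rules above can be sketched with a small function. This is only an
illustration under stated assumptions: it assumes an MSVC-style personality
(``@__CxxFrameHandler3``), and the helpers ``@may_throw`` and ``@handle_error``
are hypothetical names that do not come from this document.

.. code-block:: llvm

    declare void @may_throw()
    declare void @handle_error()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      ; No funclet pad has been entered here, so this invoke carries no
      ; "funclet" bundle.
      invoke void @may_throw()
              to label %exit unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      ; Entering the catchpad makes %cp the most recently entered funclet pad.
      %cp = catchpad within %cs [ptr null, i32 64, ptr null]
      ; Inside the funclet, the call names %cp in its "funclet" bundle.
      call void @handle_error() [ "funclet"(token %cp) ]
      catchret from %cp to label %exit

    exit:
      ret void
    }

Under the rules stated above, omitting the bundle on the call in ``%handler``,
or adding one to the ``invoke`` in ``%entry``, would make executing that
instruction undefined behavior.
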

GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition between a function with one GC strategy to a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code. For more details, see :ref:`GC Transitions
<gc_transition_args>`.

The bundle contains an arbitrary list of Values which need to be passed
to GC transition code. They will be lowered and passed as operands to
the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
that these arguments must be available before and after (but not
necessarily during) the execution of the callee.

.. _assume_opbundles:

Assume Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^

Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
assumptions, such as that a :ref:`parameter attribute <paramattrs>` or a
:ref:`function attribute <fnattrs>` holds for a certain value at a certain
location. Operand bundles enable assumptions that are either hard or impossible
to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.

An assume operand bundle has the form:

::

      "<tag>"([ <arguments> ])

In the case of function or parameter attributes, the operand bundle has the
restricted form:

::

      "<tag>"([ <holds for value> [, <attribute argument>] ])

* The tag of the operand bundle is usually the name of the attribute that can
  be assumed to hold. It can also be `ignore`; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.

If there are no arguments the attribute is a property of the call location.

For example:

.. code-block:: llvm

      call void @llvm.assume(i1 true) ["align"(ptr %val, i32 8)]

allows the optimizer to assume that at the location of the call to
:ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.

.. code-block:: llvm

      call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(ptr %val)]

allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
call location is cold and that ``%val`` is not null.

Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
provided guarantees are violated at runtime the behavior is undefined.

While attributes expect constant arguments, assume operand bundles may be
provided a dynamic value, for example:

.. code-block:: llvm

      call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)]

If the operand bundle value violates any requirements on the attribute value,
the behavior is undefined, unless one of the following exceptions applies:

* ``"align"`` operand bundles may specify a non-power-of-two alignment
  (including a zero alignment). If this is the case, then the pointer value
  must be a null pointer, otherwise the behavior is undefined.

In addition to allowing operand bundles encoding function and parameter
attributes, an assume operand bundle may also encode a ``separate_storage``
operand bundle. This has the form:

.. code-block:: llvm

    separate_storage(<val1>, <val2>)

This indicates that no pointer :ref:`based <pointeraliasing>` on one of its
arguments can alias any pointer based on the other.

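
A minimal sketch of how a ``separate_storage`` bundle might be attached in
practice is shown below. The function ``@store_then_load`` and its parameters
are hypothetical and serve only to illustrate the form described above.

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @store_then_load(ptr %p, ptr %q) {
      ; Assert that nothing based on %p aliases anything based on %q.
      call void @llvm.assume(i1 true) [ "separate_storage"(ptr %p, ptr %q) ]
      store i32 1, ptr %p
      ; Given the assumption, this load cannot observe the store above.
      %v = load i32, ptr %q
      ret i32 %v
    }

As with other assume operand bundles, if ``%p`` and ``%q`` do alias at runtime
the behavior is undefined.
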

Even if the assumed property can be encoded as a boolean value, like
``nonnull``, using operand bundles to express the property can still have
benefits:

* Attributes that can be expressed via operand bundles are directly the
  property that the optimizer uses and cares about. Encoding attributes as
  operand bundles removes the need for an instruction sequence that represents
  the property (e.g., `icmp ne ptr %p, null` for `nonnull`) and for the
  optimizer to deduce the property from that instruction sequence.
* Expressing the property using operand bundles makes it easy to identify the
  use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
  simplifies and improves heuristics, e.g., for "use-sensitive"
  optimizations.

.. _ob_preallocated:

Preallocated Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Preallocated operand bundles are characterized by the ``"preallocated"``
operand bundle tag. These operand bundles allow separation of the allocation
of the call argument memory from the call site. This is necessary to pass
non-trivially copyable objects by value in a way that is compatible with MSVC
on some targets. There can be at most one ``"preallocated"`` operand bundle
attached to a call site and it must have exactly one bundle operand, which is
a token generated by ``@llvm.call.preallocated.setup``. A call with this
operand bundle should not adjust the stack before entering the function, as
that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.

.. code-block:: llvm

      %foo = type { i64, i32 }

      ...

      %t = call token @llvm.call.preallocated.setup(i32 1)
      %a = call ptr @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
      ; initialize %a
      call void @bar(i32 42, ptr preallocated(%foo) %a) ["preallocated"(token %t)]

.. _ob_gc_live:

GC Live Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
intrinsic. The operand bundle must contain every pointer to a garbage collected
object which potentially needs to be updated by the garbage collector.

When lowered, any relocated value will be recorded in the corresponding
:ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description
for further details.

ObjC ARC Attached Call Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
implicitly followed by a marker instruction and a call to an ObjC runtime
function that uses the result of the call. The operand bundle takes a mandatory
pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
``@objc_unsafeClaimAutoreleasedReturnValue``).
The return value of a call with this bundle is used by a call to
``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
void, in which case the operand bundle is ignored.

.. code-block:: llvm

   ; The marker instruction and a runtime function call are inserted after the call
   ; to @foo.
   call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_retainAutoreleasedReturnValue) ]
   call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_unsafeClaimAutoreleasedReturnValue) ]

The operand bundle is needed to ensure the call is immediately followed by the
marker instruction and the ObjC runtime call in the final output.

.. _ob_ptrauth:

Pointer Authentication Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Pointer Authentication operand bundles are characterized by the
``"ptrauth"`` operand bundle tag.
They are described in the
`Pointer Authentication <PointerAuth.html#operand-bundle>`__ document.

.. _ob_kcfi:

KCFI Operand Bundles
^^^^^^^^^^^^^^^^^^^^

A ``"kcfi"`` operand bundle on an indirect call indicates that the call will
be preceded by a runtime type check, which validates that the call target is
prefixed with a :ref:`type identifier<md_kcfi_type>` that matches the operand
bundle attribute. For example:

.. code-block:: llvm

      call void %0() ["kcfi"(i32 1234)]

Clang emits KCFI operand bundles and the necessary metadata with
``-fsanitize=kcfi``.

.. _convergencectrl:

Convergence Control Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A "convergencectrl" operand bundle is only valid on a ``convergent`` operation.
When present, the operand bundle must contain exactly one value of token type.
See the :doc:`ConvergentOperations` document for details.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.

.. _langref_datalayout:

Data Layout
-----------

A module may specify a target specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

    target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits. Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8-bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``P<address space>``
    Specifies the address space that corresponds to program memory.
    Harvard architectures can use this to specify what space LLVM
    should place things such as functions into. If omitted, the
    program memory space defaults to the default address space of 0,
    which corresponds to a Von Neumann architecture that has code
    and data in the same space.
``G<address space>``
    Specifies the address space to be used by default when creating global
    variables. If omitted, the globals address space defaults to the default
    address space 0.
    Note: variable declarations without an address space are always created in
    address space 0; this property only affects the default value to be used
    when creating globals without additional contextual information (e.g. in
    LLVM passes).

.. _alloca_addrspace:

``A<address space>``
    Specifies the address space of objects created by '``alloca``'.
    Defaults to the default address space of 0.
``p[n]:<size>:<abi>[:<pref>][:<idx>]``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
    and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
    index that is used for address calculation, which must be less than or equal
    to the pointer size. If not specified, the default index size is equal to
    the pointer size. All sizes are in bits. The address space, ``n``, is
    optional, and if not specified, denotes the default address space 0. The
    value of ``n`` must be in the range [1,2^24).
``i<size>:<abi>[:<pref>]``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
    ``<pref>`` is optional and defaults to ``<abi>``.
    For ``i8``, the ``<abi>`` value must equal 8,
    that is, ``i8`` must be naturally aligned.
``v<size>:<abi>[:<pref>]``
    This specifies the alignment for a vector type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
    ``<pref>`` is optional and defaults to ``<abi>``.
``f<size>:<abi>[:<pref>]``
    This specifies the alignment for a floating-point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
    The value of ``<size>`` must be in the range [1,2^24).
    ``<pref>`` is optional and defaults to ``<abi>``.
``a:<abi>[:<pref>]``
    This specifies the alignment for an object of aggregate type.
    ``<pref>`` is optional and defaults to ``<abi>``.
``F<type><abi>``
    This specifies the alignment for function pointers.
    The options for ``<type>`` are:

    * ``i``: The alignment of function pointers is independent of the alignment
      of functions, and is a multiple of ``<abi>``.
    * ``n``: The alignment of function pointers is a multiple of the explicit
      alignment specified on the function, and is a multiple of ``<abi>``.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character. The mangling style
    options are

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
      starting with ``?`` are not mangled in any way.
    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
      symbols do not receive a ``_`` prefix.
    * ``a``: XCOFF mangling: Private symbols get a ``L..`` prefix.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits.
    For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
    address space cannot be specified as non-integral.

On every specification that takes a ``<abi>:<pref>``, specifying the
``<pref>`` alignment is optional. If omitted, the preceding ``:``
should be omitted too and ``<pref>`` will be equal to ``<abi>``.

When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

-  ``e`` - little endian
-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
   same as the default address space.
-  ``S0`` - natural stack alignment is unspecified
-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
-  ``i8:8:8`` - i8 is 8-bit (byte) aligned as mandated
-  ``i16:16:16`` - i16 is 16-bit aligned
-  ``i32:32:32`` - i32 is 32-bit aligned
-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
   alignment of 64-bits
-  ``f16:16:16`` - half is 16-bit aligned
-  ``f32:32:32`` - float is 32-bit aligned
-  ``f64:64:64`` - double is 64-bit aligned
-  ``f128:128:128`` - quad is 128-bit aligned
-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
-  ``a:0:64`` - aggregates are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,
   that specification is used.
#. If no match is found, and the type sought is an integer type, then
   the smallest integer type that is larger than the bitwidth of the
   sought type is used. If none of the specifications are larger than
   the bitwidth then the largest integer type is used. For example,
   given the default specifications above, the i7 type will use the
   alignment of i8 (next largest) while both i65 and i256 will use the
   alignment of i64 (largest specified).

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects. This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. There is no way to generate IR
that does not embed this target-specific detail into the IR. If you
don't specify the string, the default specifications will be used to
generate a Data Layout and the optimization phases will operate
accordingly and introduce target specificity into the IR with respect to
these default specifications.

.. _langref_triple:

Target Triple
-------------

A module may specify a target triple string that describes the target
host. The syntax for the target triple is simply:

.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-').
The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` command line option.

.. _objectlifetime:

Object Lifetime
---------------

A memory object, or simply object, is a region of a memory space that is
reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
allocation calls, and global variable definitions.
Once it is allocated, the bytes stored in the region can only be read or written
through a pointer that is :ref:`based on <pointeraliasing>` the allocation
value.
If a pointer that is not based on the object tries to read or write to the
object, it is undefined behavior.

The lifetime of a memory object is a property that decides its accessibility.
Unless stated otherwise, a memory object is alive since its allocation, and
dead after its deallocation.
It is undefined behavior to access a memory object that isn't alive, but
operations that don't dereference it such as
:ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
:ref:`icmp <i_icmp>` return a valid result.
This explains code motion of these instructions across operations that
impact the object's lifetime.
A stack object's lifetime can be explicitly specified using
:ref:`llvm.lifetime.start <int_lifestart>` and
:ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.

.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access, otherwise the behavior is
undefined.
Pointer values are associated with address ranges according
to the following rules:

-  A pointer value is associated with the addresses associated with any
   value it is *based* on.
-  An address of a global variable is associated with the address range
   of the variable's storage.
-  The result value of an allocation instruction is associated with the
   address range of the allocated storage.
-  A null pointer in the default address-space is associated with no
   address.
-  An :ref:`undef value <undefvalues>` in *any* address-space is
   associated with no address.
-  An integer constant other than zero or a pointer value returned from
   a function not defined within LLVM may be associated with address
   ranges allocated through mechanisms other than those provided by
   LLVM. Such ranges shall not overlap with any ranges of addresses
   allocated by mechanisms provided by LLVM.

A pointer value is *based* on another pointer value according to the
following rules:

-  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
   the pointer-typed operand of the ``getelementptr``.
-  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
   is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
   of the ``getelementptr``.
-  The result value of a ``bitcast`` is *based* on the operand of the
   ``bitcast``.
-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
   values that contribute (directly or indirectly) to the computation of
   the pointer's value.
-  The "*based* on" relationship is transitive.

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.

LLVM IR does not associate types with memory.
The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.

.. _pointercapture:

Pointer Capture
---------------

Given a function call and a pointer that is passed as an argument or stored in
memory before the call, the call may capture two components of the pointer:

  * The address of the pointer, which is its integral value. This also includes
    parts of the address or any information about the address, including the
    fact that it does not equal one specific value. We further distinguish
    whether only the fact that the address is/isn't null is captured.
  * The provenance of the pointer, which is the ability to perform memory
    accesses through the pointer, in the sense of the :ref:`pointer aliasing
    rules <pointeraliasing>`. We further distinguish whether only read accesses
    are allowed, or both reads and writes.

For example, the following function captures the address of ``%a``, because
it is compared to a pointer, leaking information about the identity of the
pointer:

.. code-block:: llvm

    @glb = global i8 0

    define i1 @f(ptr %a) {
      %c = icmp eq ptr %a, @glb
      ret i1 %c
    }

The function does not capture the provenance of the pointer, because the
``icmp`` instruction only operates on the pointer address.
The following
function captures both the address and provenance of the pointer, as both
may be read from ``@glb`` after the function returns:

.. code-block:: llvm

    @glb = global ptr null

    define void @f(ptr %a) {
      store ptr %a, ptr @glb
      ret void
    }

The following function captures *neither* the address nor the provenance of
the pointer:

.. code-block:: llvm

    define i32 @f(ptr %a) {
      %v = load i32, ptr %a
      ret i32 %v
    }

While address capture includes uses of the address within the body of the
function, provenance capture refers exclusively to the ability to perform
accesses *after* the function returns. Memory accesses within the function
itself are not considered pointer captures.

We can further say that the capture only occurs through a specific location.
In the following example, the pointer (both address and provenance) is captured
through the return value only:

.. code-block:: llvm

    define ptr @f(ptr %a) {
      %gep = getelementptr i8, ptr %a, i64 4
      ret ptr %gep
    }

However, we always consider direct inspection of the pointer address
(e.g. using ``ptrtoint``) to be location-independent. The following example
is *not* considered a return-only capture, even though the ``ptrtoint``
ultimately only contributes to the return value:

.. code-block:: llvm

    @lookup = constant [4 x i8] [i8 0, i8 1, i8 2, i8 3]

    define ptr @f(ptr %a) {
      %a.addr = ptrtoint ptr %a to i64
      %mask = and i64 %a.addr, 3
      %gep = getelementptr i8, ptr @lookup, i64 %mask
      ret ptr %gep
    }

This definition is chosen to allow capture analysis to continue with the return
value in the usual fashion.
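
As a further, hedged illustration of the rule that only accesses possible
*after* the function returns count as provenance capture, the sketch below
spills ``%a`` to a stack slot and reloads it. The function name ``@spill`` is
hypothetical. Because the slot is deallocated when the function returns,
nothing can read the stored pointer afterwards, and the returned integer
depends only on the pointee, not on the address. Semantically the pointer is
therefore not captured, although a conservative analysis may still treat the
store as a potential capture.

.. code-block:: llvm

    ; Hypothetical example: the pointer is copied, but the copy never
    ; outlives the call.
    define i32 @spill(ptr %a) {
      %slot = alloca ptr
      store ptr %a, ptr %slot   ; the copy lives only in this stack frame
      %p = load ptr, ptr %slot
      %v = load i32, ptr %p     ; access inside the function, not a capture
      ret i32 %v
    }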

The following describes possible ways to capture a pointer in more detail,
where unqualified uses of the word "capture" refer to capturing both address
and provenance.

1. The call stores any bit of the pointer carrying information into a place,
   and the stored bits can be read from the place by the caller after this call
   exits.

.. code-block:: llvm

    @glb  = global ptr null
    @glb2 = global ptr null
    @glb3 = global ptr null
    @glbi = global i32 0

    declare void @g()

    define ptr @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
      store ptr %a, ptr @glb ; %a is captured by this call

      store ptr %b,   ptr @glb2 ; %b isn't captured because the stored value is overwritten by the store below
      store ptr null, ptr @glb2

      store ptr %c,   ptr @glb3
      call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
      store ptr null, ptr @glb3

      %i = ptrtoint ptr %d to i64
      %j = trunc i64 %i to i32
      store i32 %j, ptr @glbi ; %d is captured

      ret ptr %e ; %e is captured
    }

2. The call stores any bit of the pointer carrying information into a place,
   and the stored bits can be safely read from the place by another thread via
   synchronization.

.. code-block:: llvm

    @glb = global ptr null
    @lock = global i1 true

    define void @f(ptr %a) {
      store ptr %a, ptr @glb
      store atomic i1 false, ptr @lock release, align 1 ; %a is captured because another thread can safely read @glb
      store ptr null, ptr @glb
      ret void
    }

3. The call's behavior depends on any bit of the pointer carrying information
   (address capture only).

.. code-block:: llvm

    @glb = global i8 0

    declare void @exit()

    define void @f(ptr %a) {
      %c = icmp eq ptr %a, @glb
      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; captures address of %a only
    BB_EXIT:
      call void @exit()
      unreachable
    BB_CONTINUE:
      ret void
    }

4. The pointer is used as the pointer operand of a volatile access.

.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

A volatile load or store may have additional target-specific semantics.
Any volatile operation can have side effects, and any volatile operation
can read and/or modify state which is not accessible via a regular load
or store in this module. Volatile operations may use addresses which do
not point to memory (like MMIO registers). This means the compiler may
not use a volatile operation to prove a non-volatile access to that
address has defined behavior.

The allowed side-effects for volatile accesses are limited. If a
non-volatile store to a given address would be legal, a volatile
operation may modify the memory at that address. A volatile operation
may not modify any other memory accessible by the module being compiled.
A volatile operation may not call any code in the current module.

In general (without target specific context), the address space of a
volatile operation may not be changed. Different address spaces may
have different trapping behavior when dereferencing an invalid
pointer.

The compiler may assume execution will continue after a volatile operation,
so operations which modify memory or may have undefined behavior can be
hoisted past a volatile operation.

As an exception to the preceding rule, the compiler may not assume execution
will continue after a volatile store operation. This restriction is necessary
to support the somewhat common pattern in C of intentionally storing to an
invalid pointer to crash the program. In the future, it might make sense to
allow frontends to control this behavior.

IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
Likewise, the backend should never split or merge target-legal volatile
load/store instructions. Similarly, IR-level volatile loads and stores cannot
change from integer to floating-point or vice versa.

.. admonition:: Rationale

   Platforms may rely on volatile loads and stores of natively supported
   data width to be executed as single instruction. For example, in C
   this holds for an l-value of volatile primitive type with native
   hardware support, but not necessarily for aggregate types. The
   frontend upholds these expectations, which are intentionally
   unspecified in the IR. The rules above ensure that IR transformations
   do not violate the frontend's contract with the language.

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++ memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

- Is a superset of single-thread program order, and
- When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
  ``b``. *Synchronizes-with* pairs are introduced by platform-specific
  techniques, like pthread locks, thread creation, thread joining,
  etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
  Constraints <ordering>`).

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.). For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

- If write\ :sub:`1` happens before write\ :sub:`2`, and
  write\ :sub:`2` happens before R\ :sub:`byte`, then
  R\ :sub:`byte` does not see write\ :sub:`1`.
- If R\ :sub:`byte` happens before write\ :sub:`3`, then
  R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

- If R is volatile, the result is target-dependent. (Volatile is
  supposed to give guarantees which can support ``sig_atomic_t`` in
  C/C++, and may be used for accesses to addresses that do not behave
  like normal memory. It does not generally provide cross-thread
  synchronization.)
- Otherwise, if there is no write to the same byte that happens before
  R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
- Otherwise, if R\ :sub:`byte` may see exactly one write,
  R\ :sub:`byte` returns the value written by that write.
- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
  see are atomic, it chooses one of the values written. See the :ref:`Atomic
  Memory Ordering Constraints <ordering>` section for additional
  constraints on how the choice is made.
- Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics implement
the Java or C++ memory models; if these descriptions aren't precise
enough, check those specs (see spec references in the
:doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions
treat these orderings somewhat differently since they don't take an
address. See that instruction's documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
   The set of values that can be read is governed by the happens-before
   partial order. A value cannot be read unless some operation wrote
   it. This is intended to provide a guarantee strong enough to model
   Java's non-volatile shared variables. This ordering cannot be
   specified for read-modify-write operations; it is not strong enough
   to make them atomic in any interesting way.
``monotonic``
   In addition to the guarantees of ``unordered``, there is a single
   total order for modifications by ``monotonic`` operations on each
   address. All modification orders must be compatible with the
   happens-before order. There is no guarantee that the modification
   orders can be combined to a global total order for the whole program
   (and this often will not be possible). The read in an atomic
   read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
   :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
   order immediately before the value it writes. If one atomic read
   happens before another atomic read of the same address, the later
   read must see the same value or a later value in the address's
   modification order. This disallows reordering of ``monotonic`` (or
   stronger) operations on the same address. If an address is written
   ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
   read that address repeatedly, the other threads must eventually see
   the write. This corresponds to the C/C++ ``memory_order_relaxed``.
``acquire``
   In addition to the guarantees of ``monotonic``, a
   *synchronizes-with* edge may be formed with a ``release`` operation.
   This is intended to model C/C++'s ``memory_order_acquire``.
``release``
   In addition to the guarantees of ``monotonic``, if this operation
   writes a value which is subsequently read by an ``acquire``
   operation, it *synchronizes-with* that operation. Furthermore,
   this occurs even if the value written by a ``release`` operation
   has been modified by a read-modify-write operation before being
   read. (Such a set of operations comprises a *release
   sequence*). This corresponds to the C/C++
   ``memory_order_release``.
``acq_rel`` (acquire+release)
   Acts as both an ``acquire`` and ``release`` operation on its
   address. This corresponds to the C/C++ ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
   In addition to the guarantees of ``acq_rel`` (``acquire`` for an
   operation that only reads, ``release`` for an operation that only
   writes), there is a global total order on all
   sequentially-consistent operations on all addresses. Each
   sequentially-consistent read sees the last preceding write to the
   same address in this global order. This corresponds to the C/C++
   ``memory_order_seq_cst`` and Java ``volatile``.

   Note: this global total order is *not* guaranteed to be fully
   consistent with the *happens-before* partial order if
   non-``seq_cst`` accesses are involved. See the C++ standard
   `[atomics.order] <https://wg21.link/atomics.order>`_ section
   for more details on the exact guarantees.

.. _syncscope:

If an atomic operation is marked ``syncscope("singlethread")``, it only
*synchronizes with* and only participates in the seq\_cst total orderings of
other operations running in the same thread (for example, in signal handlers).
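
As a hedged illustration of the ``release``/``acquire`` pairing and of
``syncscope("singlethread")``, consider the following minimal sketch. The
globals ``@data`` and ``@flag`` and all function names are hypothetical. If the
``acquire`` load in ``@consumer`` reads the value written by the ``release``
store in ``@producer``, the store *synchronizes-with* the load, so the plain
store to ``@data`` happens before the plain load of ``@data``.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, ptr @data                          ; plain store
      store atomic i32 1, ptr @flag release, align 4   ; publishes @data
      ret void
    }

    define i32 @consumer() {
    entry:
      %f = load atomic i32, ptr @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %read, label %notready
    read:
      ; Reached only after observing the release store, so this load is
      ; ordered after the store of 42 by happens-before.
      %v = load i32, ptr @data
      ret i32 %v
    notready:
      ret i32 -1
    }

    define void @signal_fence() {
      ; Orders only against operations running in the same thread,
      ; for example a signal handler.
      fence syncscope("singlethread") seq_cst
      ret void
    }

Reading ``@data`` only on the ``%read`` path is a deliberate choice in this
sketch: without the synchronization edge, the load could race with the store
in ``@producer``.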

If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target specific synchronization scope, then it is target
dependent if it *synchronizes with* and participates in the seq\_cst total
orderings of other operations.

Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
seq\_cst total orderings of other operations that are not marked
``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

.. _floatenv:

Floating-Point Environment
--------------------------

The default LLVM floating-point environment assumes that traps are disabled and
status flags are not observable. Therefore, floating-point math operations do
not have side effects and may be speculated freely. Results assume the
round-to-nearest rounding mode, and subnormals are assumed to be preserved.

Running LLVM code in an environment where these assumptions are not met
typically leads to undefined behavior. The ``strictfp`` and ``denormal-fp-math``
attributes as well as :ref:`Constrained Floating-Point Intrinsics
<constrainedfp>` can be used to weaken LLVM's assumptions and ensure defined
behavior in non-default floating-point environments; see their respective
documentation for details.

.. _floatnan:

Behavior of Floating-Point NaN values
-------------------------------------

A floating-point NaN value consists of a sign bit, a quiet/signaling bit, and a
payload (which makes up the rest of the mantissa except for the quiet/signaling
bit). LLVM assumes that the quiet/signaling bit being set to ``1`` indicates a
quiet NaN (QNaN), and a value of ``0`` indicates a signaling NaN (SNaN). In the
following we will hence just call it the "quiet bit".
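
To make the bit layout concrete, here is a small sketch for the ``float`` type.
It assumes the IEEE-754 binary32 encoding (1 sign bit, 8 exponent bits, 23
mantissa bits, with the quiet bit as the most significant mantissa bit), which
is not spelled out in this section. The function names are hypothetical, and
the integer constants are written in decimal with their hexadecimal form in
the comments.

.. code-block:: llvm

    define float @preferred_qnan() {
      ; 0x7FC00000: sign = 0, exponent all ones, quiet bit = 1, payload = 0
      %q = bitcast i32 2143289344 to float
      ret float %q
    }

    define float @example_snan() {
      ; 0x7F800001: sign = 0, exponent all ones, quiet bit = 0, payload = 1
      %s = bitcast i32 2139095041 to float
      ret float %s
    }

A signaling NaN needs a non-zero payload, because an all-ones exponent with an
all-zero mantissa encodes an infinity rather than a NaN.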

The representation bits of a floating-point value do not mutate arbitrarily; in
particular, if there is no floating-point operation being performed, NaN signs,
quiet bits, and payloads are preserved.

For the purpose of this section, ``bitcast`` as well as the following operations
are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and
``llvm.copysign``. These operations act directly on the underlying bit
representation and never change anything except possibly for the sign bit.

Floating-point math operations that return a NaN are an exception from the
general principle that LLVM implements IEEE-754 semantics. Unless specified
otherwise, the following rules apply whenever the IEEE-754 semantics say that a
NaN value is returned: the result has a non-deterministic sign; the quiet bit
and payload are non-deterministically chosen from the following set of options:

- The quiet bit is set and the payload is all-zero. ("Preferred NaN" case)
- The quiet bit is set and the payload is copied from any input operand that is
  a NaN. ("Quieting NaN propagation" case)
- The quiet bit and payload are copied from any input operand that is a NaN.
  ("Unchanged NaN propagation" case)
- The quiet bit is set and the payload is picked from a target-specific set of
  "extra" possible NaN payloads. The set can depend on the input operand values.
  This set is empty on x86 and ARM, but can be non-empty on other architectures.
  (For instance, on wasm, if any input NaN does not have the preferred all-zero
  payload or any input NaN is an SNaN, then this set contains all possible
  payloads; otherwise, it is empty. On SPARC, this set consists of the all-one
  payload.)

In particular, if all input NaNs are quiet (or if there are no input NaNs), then
the output NaN is definitely quiet.
Signaling NaN outputs can only occur if they
are provided as an input value. For example, "fmul SNaN, 1.0" may be simplified
to SNaN rather than QNaN. Similarly, if all input NaNs are preferred (or if
there are no input NaNs) and the target does not have any "extra" NaN payloads,
then the output NaN is guaranteed to be preferred.

Floating-point math operations are allowed to treat all NaNs as if they were
quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.

Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
In particular, constrained intrinsics rule out the "Unchanged NaN propagation"
case; they are guaranteed to return a QNaN.

Unfortunately, due to hard-or-impossible-to-fix issues, LLVM violates its own
specification on some architectures:

- x86-32 without SSE2 enabled may convert floating-point values to x86_fp80 and
  back when performing floating-point math operations; this can lead to results
  with different precision than expected and it can alter NaN values. Since
  optimizations can make contradicting assumptions, this can lead to arbitrary
  miscompilations. See `issue #44218
  <https://github.com/llvm/llvm-project/issues/44218>`_.
- x86-32 (even with SSE2 enabled) may implicitly perform such a conversion on
  values returned from a function for some calling conventions. See `issue
  #66803 <https://github.com/llvm/llvm-project/issues/66803>`_.
- Older MIPS versions use the opposite polarity for the quiet/signaling bit, and
  LLVM does not correctly represent this. See `issue #60796
  <https://github.com/llvm/llvm-project/issues/60796>`_.

.. _floatsem:

Floating-Point Semantics
------------------------

This section defines the semantics for core floating-point operations on types
that use a format specified by IEEE-754.
These types are: ``half``, ``float``,
``double``, and ``fp128``, which correspond to the binary16, binary32, binary64,
and binary128 formats, respectively. The "core" operations are those defined in
section 5 of IEEE-754, which all have corresponding LLVM operations.

The value returned by those operations matches that of the corresponding
IEEE-754 operation executed in the :ref:`default LLVM floating-point environment
<floatenv>`, except that the behavior of NaN results is instead :ref:`as
specified here <floatnan>`. In particular, such a floating-point instruction
returning a non-NaN value is guaranteed to always return the same bit-identical
result on all machines and optimization levels.

This means that optimizations and backends may not change the observed bitwise
result of these operations in any way (unless NaNs are returned), and frontends
can rely on these operations providing correctly rounded results as described in
the standard.

(Note that this is only about the value returned by these operations; see the
:ref:`floating-point environment section <floatenv>` regarding flags and
exceptions.)

Various flags, attributes, and metadata can alter the behavior of these
operations and thus make them not bit-identical across machines and optimization
levels any more: most notably, the :ref:`fast-math flags <fastmath>` as well as
the :ref:`strictfp <strictfp>` and :ref:`denormal-fp-math <denormal_fp_math>`
attributes and :ref:`!fpmath metadata <fpmath-metadata>`. See their
corresponding documentation for details.

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`, :ref:`fptrunc <i_fptrunc>`,
:ref:`fpext <i_fpext>`), and :ref:`phi <i_phi>`, :ref:`select <i_select>`, or
:ref:`call <i_call>` instructions that return floating-point types may use the
following flags to enable otherwise unsafe floating-point transformations.

``fast``
   This flag is a shorthand for specifying all fast-math flags at once, and
   imparts no additional semantics from using all of them.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
   produces a :ref:`poison value <poisonvalues>` instead.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or zero result as insignificant. This does not imply that -0.0
   is poison and/or guaranteed to not exist in the operation.

Note: For :ref:`phi <i_phi>`, :ref:`select <i_select>`, and :ref:`call <i_call>`
instructions, the following return types are considered to be floating-point
types:

.. _fastmath_return_types:

- Floating-point scalar or vector types
- Array types (nested to any depth) of floating-point scalar or vector types
- Homogeneous literal struct types of floating-point scalar or vector types

Rewrite-based flags
^^^^^^^^^^^^^^^^^^^

The following flags have rewrite-based semantics.
These flags allow expressions,
potentially containing multiple non-consecutive instructions, to be rewritten
into alternative instructions. When multiple instructions are involved in an
expression, it is necessary that all of the instructions have the necessary
rewrite-based flag present on them, and the rewritten instructions will
generally have the intersection of the flags present on the input instructions.

In the following example, the floating-point expression in the body of ``@orig``
has ``contract`` and ``reassoc`` in common, and thus if it is rewritten into the
expression in the body of ``@target``, all of the new instructions get those two
flags and only those flags as a result. Since ``arcp`` is present on only
one of the instructions in the expression, it is not present in the transformed
expression. Furthermore, the reassociation here is only legal because both
instructions had the ``reassoc`` flag; if only one had it, it would not be legal
to make the transformation.

.. code-block:: llvm

    define double @orig(double %a, double %b, double %c) {
      %t1 = fmul contract reassoc double %a, %b
      %val = fmul contract reassoc arcp double %t1, %c
      ret double %val
    }

    define double @target(double %a, double %b, double %c) {
      %t1 = fmul contract reassoc double %b, %c
      %val = fmul contract reassoc double %a, %t1
      ret double %val
    }

These rules do not apply to the other fast-math flags. Whether or not a flag
like ``nnan`` is present on any or all of the rewritten instructions is based
on whether or not it is possible for said instruction to have a NaN input or
output, given the original flags.

``arcp``
   Allows division to be treated as a multiplication by a reciprocal.
   Specifically, this permits ``a / b`` to be considered equivalent to
   ``a * (1.0 / b)`` (which may subsequently be susceptible to code motion),
   and it also permits ``a / (b / c)`` to be considered equivalent to
   ``a * (c / b)``. Both of these rewrites can be applied in either direction:
   ``a * (c / b)`` can be rewritten into ``a / (b / c)``.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add). This does not enable reassociation
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.

.. _fastmath_afn:

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change results in floating-point.

.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder ptr @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:

The void type does not represent any value and has no size.

:Syntax:

::

    void

.. _t_function:

Function Type
-------------

:Overview:

The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or first class type, except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.

:Syntax:

::

    <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - A function taking an ``i32``, returning an ``i32``.

   * - ``i32 (ptr, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       argument and returns an integer. This is the signature for ``printf``
       in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values.

.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type specifies an arbitrary bit width for the desired
integer. Any bit width from 1 bit to 2\ :sup:`23` (about 8 million)
can be specified.

:Syntax:

::

    iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

+----------------+------------------------------------------------+
| ``i1``         | a single-bit integer.                          |
+----------------+------------------------------------------------+
| ``i32``        | a 32-bit integer.                              |
+----------------+------------------------------------------------+
| ``i1942652``   | a really big integer of over 1 million bits.   |
+----------------+------------------------------------------------+

.. _t_floating:

Floating-Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating-point value (IEEE-754 binary16)

   * - ``bfloat``
     - 16-bit "brain" floating-point value (7-bit significand). Provides the
       same number of exponent bits as ``float``, so that it matches its dynamic
       range, but with greatly reduced precision. Used in Intel's AVX-512 BF16
       extensions and Arm's ARMv8.6-A extensions, among others.

   * - ``float``
     - 32-bit floating-point value (IEEE-754 binary32)

   * - ``double``
     - 64-bit floating-point value (IEEE-754 binary64)

   * - ``fp128``
     - 128-bit floating-point value (IEEE-754 binary128)

   * - ``x86_fp80``
     - 80-bit floating-point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bits)

X86_amx Type
""""""""""""

:Overview:

The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
are allowed: stride load and store, zero and dot product. No instruction is
allowed for this type. There are no arguments, arrays, pointers, vectors
or constants of this type.

:Syntax:

::

    x86_amx

.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type ``ptr`` is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining
the numbered address space where the pointed-to object resides. For
example, ``ptr addrspace(5)`` is a pointer to address space 5.
In addition to integer constants, ``addrspace`` can also reference one of the
address spaces defined in the :ref:`datalayout string<langref_datalayout>`.
``addrspace("A")`` will use the alloca address space, ``addrspace("G")``
the default globals address space and ``addrspace("P")`` the program address
space.

The default address space is number zero.

The semantics of non-zero address spaces are target-specific. Memory
access through a non-dereferenceable pointer is undefined behavior in
any address space. Pointers with the bit-value 0 are only assumed to
be non-dereferenceable in address space 0, unless the function is
marked with the ``null_pointer_is_valid`` attribute.

If an object can be proven accessible through a pointer with a
different address space, the access may be modified to use that
address space. Exceptions apply if the operation is ``volatile``.

Prior to LLVM 15, pointer types also specified a pointee type, such as
``i8*``, ``[4 x i32]*`` or ``i32 (i32*)*``. In LLVM 15, such "typed
pointers" are still supported under non-default options. See the
`opaque pointers document <OpaquePointers.html>`__ for more information.

.. _t_target_type:

Target Extension Type
"""""""""""""""""""""

:Overview:

Target extension types represent types that must be preserved through
optimization, but are otherwise generally opaque to the compiler. They may be
used as function parameters or arguments, and in :ref:`phi <i_phi>` or
:ref:`select <i_select>` instructions. Some types may be also used in
:ref:`alloca <i_alloca>` instructions or as global values, and correspondingly
it is legal to use :ref:`load <i_load>` and :ref:`store <i_store>` instructions
on them. Full semantics for these types are defined by the target.

The only constants that target extension types may have are ``zeroinitializer``,
``undef``, and ``poison``. Other possible values for target extension types may
arise from target-specific intrinsics and functions.

These types cannot be converted to other types. As such, it is not legal to use
them in :ref:`bitcast <i_bitcast>` instructions (as a source or target type),
nor is it legal to use them in :ref:`ptrtoint <i_ptrtoint>` or
:ref:`inttoptr <i_inttoptr>` instructions. Similarly, they are not legal to use
in an :ref:`icmp <i_icmp>` instruction.

Target extension types have a name and optional type or integer parameters. The
meanings of name and parameters are defined by the target. When being defined in
LLVM IR, all of the type parameters must precede all of the integer parameters.

Specific target extension types are registered with LLVM as having specific
properties. These properties can be used to restrict the type from appearing in
certain contexts, such as being the type of a global variable or having a
``zeroinitializer`` constant be valid. A complete list of type properties may be
found in the documentation for ``llvm::TargetExtType::Property`` (`doxygen
<https://llvm.org/doxygen/classllvm_1_1TargetExtType.html>`_).

:Syntax:

.. code-block:: llvm

    target("label")
    target("label", void)
    target("label", void, i32)
    target("label", 0, 1, 2)
    target("label", void, i32, 0, 1, 2)

.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements), an underlying primitive data type,
and a scalable property to represent vectors where the exact hardware
vector length is unknown at compile time. Vector types are considered
:ref:`first class <t_firstclass>`.
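
As a minimal sketch of the SIMD use case described above, the following
function adds two fixed-length vectors element-wise with a single ``add``
instruction. The function name ``@vadd`` and its parameter names are
hypothetical and chosen only for illustration.

.. code-block:: llvm

    ; Hypothetical example: element-wise addition of two <4 x i32> vectors.
    define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) {
    entry:
      %sum = add <4 x i32> %a, %b   ; one instruction operates on all 4 lanes
      ret <4 x i32> %sum
    }

How such an operation is lowered (one hardware instruction, several, or a
scalar loop) depends on the target and is not specified by the type itself.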

:Memory Layout:

In general vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
integer type with N*M bits, and then following the rules for storing such an
integer to memory.

A bitcast from a vector type to a scalar integer type will see the elements
being packed together (without padding). The order in which elements are
inserted in the integer depends on endianness. For little endian element zero
is put in the least significant bits of the integer, and for big endian
element zero is put in the most significant bits.

Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
with the analogy that we can replace a vector store by a bitcast followed by
an integer store, we get this for big endian:

.. code-block:: llvm

    %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

    ; Bitcasting from a vector to an integral type can be seen as
    ; concatenating the values:
    ;   %val now has the hexadecimal value 0x1235.

    store i16 %val, ptr %ptr

    ; In memory the content will be (8-bit addressing):
    ;
    ;    [%ptr + 0]: 00010010  (0x12)
    ;    [%ptr + 1]: 00110101  (0x35)

The same example for little endian:

.. code-block:: llvm

    %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

    ; Bitcasting from a vector to an integral type can be seen as
    ; concatenating the values:
    ;   %val now has the hexadecimal value 0x5321.

    store i16 %val, ptr %ptr

    ; In memory the content will be (8-bit addressing):
    ;
    ;    [%ptr + 0]: 00100001  (0x21)
    ;    [%ptr + 1]: 01010011  (0x53)

When ``N*M`` isn't evenly divisible by the byte size the exact memory layout
is unspecified (just like it is for an integral type of the same size). This
is because different targets could put the padding at different positions when
the type size is smaller than the type's store size.

:Syntax:

::

    < <# elements> x <elementtype> >          ; Fixed-length vector
    < vscale x <# elements> x <elementtype> > ; Scalable vector

The number of elements is a constant integer value larger than 0;
elementtype may be any integer, floating-point or pointer type. Vectors
of size zero are not allowed. For scalable vectors, the total number of
elements is a constant multiple (called vscale) of the specified number
of elements; vscale is a positive integer that is unknown at compile time
and the same hardware-dependent constant for all scalable vectors at run
time. The size of a specific scalable vector type is thus constant within
IR, even if the exact size in bytes cannot be determined until run time.

:Examples:

+------------------------+----------------------------------------------------+
| ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
+------------------------+----------------------------------------------------+
| ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<4 x ptr>``          | Vector of 4 pointers.                              |
+------------------------+----------------------------------------------------+
| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
+------------------------+----------------------------------------------------+

.. _t_label:

Label Type
^^^^^^^^^^

:Overview:

The label type represents code labels.

:Syntax:

::

    label

.. _t_token:

Token Type
^^^^^^^^^^

:Overview:

The token type is used when a value is associated with an instruction
but all uses of the value must not attempt to introspect or obscure it.
As such, it is not appropriate to have a :ref:`phi <i_phi>` or
:ref:`select <i_select>` of type token.

:Syntax:

::

    token

.. _t_metadata:

Metadata Type
^^^^^^^^^^^^^

:Overview:

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

:Syntax:

::

    metadata

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

    [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

+------------------+--------------------------------------+
| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
+------------------+--------------------------------------+

Here are some examples of multidimensional arrays:

+-----------------------------+----------------------------------------------------------+
| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
+-----------------------------+----------------------------------------------------------+
| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
+-----------------------------+----------------------------------------------------------+
| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
+-----------------------------+----------------------------------------------------------+

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.

.. _t_struct:

Structure Type
""""""""""""""

:Overview:

The structure type is used to represent a collection of data members
together in memory.
The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``[2 x {i32, i32}]``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
opaque and are never uniqued. Identified types must not be recursive.
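
The two access patterns described above can be sketched as follows. The type
name ``%Pair`` and the functions ``@get_second`` and ``@swap`` are hypothetical
names introduced only for this illustration.

.. code-block:: llvm

    %Pair = type { i32, i32 }      ; identified struct type (hypothetical)

    ; In memory: compute a field address with getelementptr, then load.
    define i32 @get_second(ptr %p) {
    entry:
      %field = getelementptr inbounds %Pair, ptr %p, i32 0, i32 1
      %val = load i32, ptr %field
      ret i32 %val
    }

    ; In registers: use extractvalue and insertvalue on the aggregate value.
    define %Pair @swap(%Pair %in) {
    entry:
      %a = extractvalue %Pair %in, 0
      %b = extractvalue %Pair %in, 1
      %t = insertvalue %Pair poison, i32 %b, 0
      %out = insertvalue %Pair %t, i32 %a, 1
      ret %Pair %out
    }

The same code would work with the literal type ``{ i32, i32 }`` in place of
``%Pair``; the identified form is used here only to keep the lines short.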

:Syntax:

::

    %T1 = type { <type list> }     ; Identified normal struct type
    %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

.. list-table::

   * - ``{ i32, i32, i32 }``
     - A triple of three ``i32`` values (this is a "homogeneous" struct as
       all element types are the same).

   * - ``{ float, ptr }``
     - A pair, where the first element is a ``float`` and the second element
       is a :ref:`pointer <t_pointer>`.

   * - ``<{ i8, i32 }>``
     - A packed struct known to be 5 bytes in size.

.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure. They can be named (``%X``) or
unnamed (``%52``).

:Syntax:

::

    %X = type opaque
    %52 = type opaque

:Examples:

+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+

.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the :ref:`integer
    <t_integer>` type. They can be either decimal or
    hexadecimal. Decimal integers can be prefixed with - to represent
    negative integers, e.g. '``-1234``'. Hexadecimal integers must be
    prefixed with either u or s to indicate whether they are unsigned
    or signed respectively. For example, '``u0x8000``' gives 32768, whilst
    '``s0x8000``' gives -32768.

    Note that hexadecimal integers are sign extended from the number
    of active bits, i.e. the bit width minus the number of leading
    zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1.
**Floating-point constants**
    Floating-point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating-point constants must have a
    :ref:`floating-point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating-point constants.
For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating-point
constants are required (and the only time that they are generated by the
disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaN's, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types bfloat, half, float, and
double are represented using the 16-digit form shown above (which matches the
IEEE754 representation for double); bfloat, half and float values must, however,
be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
precision respectively. Hexadecimal format is always used for long double, and
there are three forms of long double. The 80-bit format used by x86 is
represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
by 32 hexadecimal digits. Long doubles will only work if they match the long
double format on your target. The IEEE 16-bit format (half precision) is
represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
format is represented by ``0xR`` followed by 4 hexadecimal digits. All
hexadecimal formats are big-endian (sign bit at the left).

There are no constants of type x86_amx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.
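
As a brief sketch of this recursive combination, the globals below nest
smaller constants inside larger ones. The names ``@G``, ``@nested`` and
``@table`` are hypothetical and exist only for this illustration.

.. code-block:: llvm

    @G = external global i32

    ; A structure constant containing an array constant and a pointer constant.
    @nested = constant { [2 x i32], ptr } { [2 x i32] [ i32 1, i32 2 ], ptr @G }

    ; An array constant whose elements are themselves structure constants.
    @table = constant [2 x { i32, float }] [
      { i32, float } { i32 0, float 1.0 },
      { i32, float } { i32 1, float 2.0 }
    ]

Each nested element repeats its own type, so the element types can be checked
against the enclosing aggregate type at every level.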

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, ptr @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than's (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.

    When creating a vector whose elements have the same constant value, the
    preferred syntax is ``splat (<Ty> Val)``. For example: "``splat (i32 11)``".
    These vector constants must have :ref:`vector type <t_vector>` with an
    element type that matches the ``splat`` operand.
**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a constant tuple without types. For example:
    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
    for example: "``!{!0, i32 0, ptr @global, ptr @function, !"str"}``".
    Unlike other typed constants that are meant to be interpreted as part of
    the instruction stream, metadata is a place to attach additional
    information such as debug info.

Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x ptr] [ ptr @X, ptr @Y ]

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type (other than '``label``'
or '``void``') and be used anywhere a constant is permitted.

.. note::

   A '``poison``' value (described in the next section) should be used instead of
   '``undef``' whenever possible. Poison values are stronger than undef, and
   enable more optimizations. Just the existence of '``undef``' blocks certain
   optimizations (see the examples below).

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Safe:
      %A = %X  ;; By choosing undef as 0
      %B = %X  ;; By choosing undef as -1
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.

.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y     (if %Y is provably not poison; unsafe otherwise)
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``'
conditions can go *either way*, but they have to come from one
of the two operands.
In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y`` if ``%Y`` is provably not '``poison``', allowing the whole '``select``'
to be eliminated. This is because '``poison``' is stronger than '``undef``'.

.. code-block:: llvm

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This matches C semantics, but it can be surprising
to people who assume that "``X^X``" is always zero, even if
``X`` is undefined. This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

To ensure all uses of a given register observe the same value (even if
'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.

.. code-block:: llvm

      %A = sdiv undef, %X
      %B = sdiv %X, undef
    Safe:
      %A = 0
    b: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern.
This means that the ``%A``
operation can be constant folded to '``0``', because the '``undef``'
could be zero, and zero divided by any value is zero.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: text

    a:  store undef -> %X
    b:  store %X -> undef
    Safe:
    a: <deleted>     (if the stored value in %X is provably not poison)
    b: unreachable

A store *of* an undefined value can be assumed to not have any effect;
we can assume that the value is overwritten with bits that happen to
match what was already there. This argument is only valid if the stored value
is provably not ``poison``. However, a store *to* an undefined
location could clobber arbitrary memory; therefore, it has undefined
behavior.

Branching on an undefined value is undefined behavior.
This explains optimizations that depend on branch conditions to construct
predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the branch condition should be frozen;
otherwise it is undefined behavior.

.. code-block:: llvm

    Unsafe:
      br undef, BB1, BB2 ; UB

      %X = and i32 undef, 255
      switch %X, label %ret [ .. ] ; UB

      store undef, ptr %ptr
      %X = load ptr %ptr ; %X is undef
      switch i8 %X, label %ret [ .. ] ; UB

    Safe:
      %X = or i8 undef, 255 ; always 255
      switch i8 %X, label %ret [ .. ] ; Well-defined

      %X = freeze i1 undef
      br %X, BB1, BB2 ; Well-defined (non-deterministic jump)



.. _poisonvalues:

Poison Values
-------------

A poison value is a result of an erroneous operation.
In order to facilitate speculative execution, many instructions do not
invoke immediate undefined behavior when provided with illegal operands,
and return a poison value instead.
The string '``poison``' can be used anywhere a constant is expected, and
operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
a poison value.

Most instructions return '``poison``' when one of their arguments is
'``poison``'. A notable exception is the :ref:`select instruction <i_select>`.
Propagation of poison can be stopped with the
:ref:`freeze instruction <i_freeze>`.

It is correct to replace a poison value with an
:ref:`undef value <undefvalues>` or any value of the type.

This means that immediate undefined behavior occurs if a poison value is
used as an instruction operand that has any values that trigger undefined
behavior. Notably this includes (but is not limited to):

- The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
  any other pointer dereferencing instruction (independent of address
  space).
- The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
  instruction.
- The condition operand of a :ref:`br <i_br>` instruction.
- The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
  instruction.
- The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
  instruction, when the function or invoking call site has a ``noundef``
  attribute in the corresponding position.
- The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
  call site has a ``noundef`` attribute in the return value position.
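
As a small illustration of the ``select`` exception and of ``freeze``
described above, the following sketch may help. The function name
``@select_and_freeze`` and its parameters are hypothetical and chosen only
for this example; it is not taken from any existing test.

.. code-block:: llvm

    define i32 @select_and_freeze(i1 %c, i32 %x) {
    entry:
      ; Signed overflow with the nsw flag: %p is poison.
      %p = add nsw i32 2147483647, 1
      ; %s is poison only when the poison arm is selected
      ; (assuming %c itself is not poison).
      %s = select i1 %c, i32 %x, i32 %p
      ; %f is some arbitrary but fixed i32 value; it is no longer poison.
      %f = freeze i32 %p
      ; %r is poison exactly when %s is poison.
      %r = add i32 %s, %f
      ret i32 %r
    }

Because the result is returned without a ``noundef`` attribute, returning
poison here does not by itself trigger undefined behavior.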

Here are some examples:

.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %poison2 = sub i32 poison, 1         ; Also results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32, ptr @h, i32 %still_poison
      store i32 0, ptr %poison_yet_again   ; Undefined behavior due to
                                           ; store to poison.

      store i32 %poison, ptr @g            ; Poison value stored to memory.
      %poison3 = load i32, ptr @g          ; Poison value loaded back from memory.

      %poison4 = load i16, ptr @g          ; Returns a poison value.
      %poison5 = load i64, ptr @g          ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %end, label %end   ; undefined behavior

    end:

.. _welldefinedvalues:

Well-Defined Values
-------------------

Given a program execution, a value is *well defined* if the value does not
have an undef bit and is not poison in the execution.
An aggregate value or vector is well defined if its elements are well defined.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.

A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
of its operand.

.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function.

It always has a ``ptr addrspace(P)`` type, where ``P`` is the address space
of the function containing ``%block`` (usually ``addrspace(0)``).
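
A minimal sketch of how such a constant is typically consumed: a table of
block addresses is indexed at run time and the loaded address is fed to
``indirectbr``. The names ``@targets`` and ``@dispatch`` are hypothetical,
and the sketch assumes that ``%idx`` is 0 or 1 and that the function lives
in address space 0.

.. code-block:: llvm

    ; Table of addresses of two non-entry blocks of @dispatch.
    @targets = private constant [2 x ptr] [ptr blockaddress(@dispatch, %one),
                                           ptr blockaddress(@dispatch, %two)]

    define i32 @dispatch(i64 %idx) {
    entry:
      %slot = getelementptr inbounds [2 x ptr], ptr @targets, i64 0, i64 %idx
      %dest = load ptr, ptr %slot
      ; Every possible destination must be listed here.
      indirectbr ptr %dest, [label %one, label %two]

    one:
      ret i32 1

    two:
      ret i32 2
    }

The block addresses are only loaded and branched on; their bits are never
inspected or compared with each other.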

Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction or for comparisons against
null. Pointer equality tests between label addresses result in undefined
behavior, though, again, comparison against null is OK, and no label is equal
to the null pointer. This may be passed around as an opaque pointer-sized
value as long as the bits are not inspected. This allows ``ptrtoint`` and
arithmetic to be performed on these values so long as the original value is
reconstituted before the ``indirectbr`` instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.

.. _dso_local_equivalent:

DSO Local Equivalent
--------------------

``dso_local_equivalent @func``

A '``dso_local_equivalent``' constant represents a function which is
functionally equivalent to a given function, but is always defined in the
current linkage unit. The resulting pointer has the same type as the underlying
function. The resulting pointer is permitted, but not required, to be different
from a pointer to the function, and it may have different values in different
translation units.

The target function may not have ``extern_weak`` linkage.

``dso_local_equivalent`` can be implemented as such:

- If the function has local linkage, hidden visibility, or is
  ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
  to the function.
- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on if the function is defined within the
  linkage unit; LLVM will use this when available.
  (This is commonly called a "PLT stub".) On other targets, the stub may
  need to be emitted explicitly.

This can be used wherever a ``dso_local`` instance of a function is needed without
needing to explicitly make the original function ``dso_local``. An instance where
this can be used is for static offset calculations between a function and some other
``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
where dynamic relocations for function pointers in VTables can be replaced with
static relocations for offsets between the VTable and virtual functions which
may not be ``dso_local``.

This is currently only supported for ELF binary formats.

.. _no_cfi:

No CFI
------

``no_cfi @func``

With `Control-Flow Integrity (CFI)
<https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, a '``no_cfi``'
constant represents a function reference that does not get replaced with a
reference to the CFI jump table in the ``LowerTypeTests`` pass. These constants
may be useful in low-level programs, such as operating system kernels, which
need to refer to the actual function body.

.. _ptrauth_constant:

Pointer Authentication Constants
--------------------------------

``ptrauth (ptr CST, i32 KEY[, i64 DISC[, ptr ADDRDISC]?]?)``

A '``ptrauth``' constant represents a pointer with a cryptographic
authentication signature embedded into some bits, as described in the
`Pointer Authentication <PointerAuth.html>`__ document.

A '``ptrauth``' constant is simply a constant equivalent to the
``llvm.ptrauth.sign`` intrinsic, potentially fed by a discriminator
``llvm.ptrauth.blend`` if needed.

Its type is the same as the first argument. An integer constant discriminator
and an address discriminator may be optionally specified. Otherwise, they have
values ``i64 0`` and ``ptr null``.
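
The three forms allowed by the syntax above might be written as follows.
This is only a sketch: the global names, the key ``i32 0``, and the
discriminator ``i64 1234`` are arbitrary values picked for illustration,
and which keys are meaningful is target-specific.

.. code-block:: llvm

    declare void @f()

    ; Key only: the discriminators default to i64 0 and ptr null.
    @signed = constant ptr ptrauth (ptr @f, i32 0)

    ; Key plus an integer constant discriminator.
    @signed_disc = constant ptr ptrauth (ptr @f, i32 0, i64 1234)

    ; Key, integer discriminator, and an address discriminator; here the
    ; address of the global that stores the signed pointer is used.
    @slot = global ptr ptrauth (ptr @f, i32 0, i64 1234, ptr @slot)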

If the address discriminator is ``null`` then the expression is equivalent to:

.. code-block:: llvm

    %tmp = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr CST to i64), i32 KEY, i64 DISC)
    %val = inttoptr i64 %tmp to ptr

Otherwise, the expression is equivalent to:

.. code-block:: llvm

    %tmp1 = call i64 @llvm.ptrauth.blend(i64 ptrtoint (ptr ADDRDISC to i64), i64 DISC)
    %tmp2 = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr CST to i64), i32 KEY, i64 %tmp1)
    %val = inttoptr i64 %tmp2 to ptr

.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Perform the :ref:`trunc operation <i_trunc>` on constants.
``ptrtoint (CST to TYPE)``
    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
``inttoptr (CST to TYPE)``
    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
    This one is *really* dangerous!
``bitcast (CST to TYPE)``
    Convert a constant, CST, to another TYPE.
    The constraints of the operands are the same as those for the
    :ref:`bitcast instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants.
    As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have one or more indexes, which are
    required to make sense for the type of "pointer to TY". These indexes
    may be implicitly sign-extended or truncated to match the index size
    of CSTPTR's address space.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``add (LHS, RHS)``
    Perform an addition on constants.
``sub (LHS, RHS)``
    Perform a subtraction on constants.
``mul (LHS, RHS)``
    Perform a multiplication on constants.
``shl (LHS, RHS)``
    Perform a left shift on constants.
``xor (LHS, RHS)``
    Perform a bitwise xor on constants.

Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.

The template string supports argument substitution of the operands using "``$``"
followed by a number, to indicate substitution of the given register/memory
location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (See :ref:`inline-asm-modifiers`).

A literal "``$``" may be included by using "``$$``" in the template. To include
other special characters into the output, the usual "``\XX``" escapes may be
used, just as in other strings. Note that after template substitution, the
resulting assembly string is parsed by LLVM's integrated assembler unless it is
disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
syntax known to LLVM.

LLVM also supports a few more substitutions useful for writing inline assembly:

- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
  This substitution is useful when declaring a local label. Many standard
  compiler optimizations, such as inlining, may duplicate an inline asm blob.
  Adding a blob-unique identifier ensures that the two labels will not conflict
  during assembly. This is used to implement `GCC's %= special format
  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
- ``${:comment}``: Expands to the comment character of the current target's
  assembly dialect. This is usually ``#``, but many targets use other strings,
  such as ``;``, ``//``, or ``!``.
- ``${:private}``: Expands to the assembler private label prefix. Labels with
  this prefix will not appear in the symbol table of the assembled object.
  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
  relatively popular.

LLVM's support for inline asm is modeled closely on the requirements of Clang's
GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
modifier codes listed here are similar or identical to those in GCC's inline asm
support.
However, to be clear, the syntax of the template and constraint strings
described here is *not* the same as the syntax accepted by GCC and Clang, and,
while most constraint letters are passed through as-is by Clang, some get
translated to other codes when converting from the C source to the LLVM
assembly.

An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"

Inline assembler expressions may **only** be used as the callee operand
of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
Thus, typically we have:

.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)

Inline asms with side effects not visible in the constraint list must be
marked as having side effects. This is done through the use of the
'``sideeffect``' keyword, like so:

.. code-block:: llvm

    call void asm sideeffect "eieio", ""()

In some cases inline asms will contain code that will not work unless
the stack is aligned in some way, such as calls or SSE instructions on
x86, yet will not contain code that does that alignment within the asm.
The compiler should make conservative assumptions about what the asm
might contain and should generate its usual stack alignment code in the
prologue if the '``alignstack``' keyword is present:

.. code-block:: llvm

    call void asm alignstack "eieio", ""()

Inline asms also support using non-standard assembly dialects. The
assumed dialect is ATT. When the '``inteldialect``' keyword is present,
the inline asm is using the Intel dialect. Currently, ATT and Intel are
the only supported dialects. An example is:

.. code-block:: llvm

    call void asm inteldialect "eieio", ""()

In the case that the inline asm might unwind the stack,
the '``unwind``' keyword must be used, so that the compiler emits
unwinding information:

.. code-block:: llvm

    call void asm unwind "call func", ""()

If the inline asm unwinds the stack and isn't marked with
the '``unwind``' keyword, the behavior is undefined.

If multiple keywords appear, the '``sideeffect``' keyword must come
first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
third and the '``unwind``' keyword last.

Inline Asm Constraint String
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The constraint list is a comma-separated string, each element containing one or
more constraint codes.

For each element in the constraint list an appropriate register or memory
operand will be chosen, and it will be made available to assembly template
string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
second, etc.

There are three different types of constraints, which are distinguished by a
prefix symbol in front of the constraint code: Output, Input, and Clobber. The
constraints must always be given in that order: outputs first, then inputs, then
clobbers. They cannot be intermingled.

There are also three different categories of constraint codes:

- Register constraint. This is either a register class, or a fixed physical
  register. This kind of constraint will allocate a register, and if necessary,
  bitcast the argument or result to the appropriate type.
- Memory constraint. This kind of constraint is for use with an instruction
  taking a memory operand. Different constraints allow for different addressing
  modes used by the target.
- Immediate value constraint.
  This kind of constraint is for an integer or other
  immediate value which can be rendered directly into an instruction. The
  various target-specific constraints allow the selection of a value in the
  proper range for the instruction you wish to use it with.

Output constraints
""""""""""""""""""

Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
indicates that the assembly will write to this operand, and the operand will
then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction. (Except, see
below about indirect outputs).

Normally, it is expected that no output locations are written to by the assembly
expression until *all* of the inputs have been read. As such, LLVM may assign
the same register to an output and an input. If this is not safe (e.g. if the
assembly contains two instructions, where the first writes to one output, and
the second reads an input and writes to a second output), then the "``&``"
modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).

Input constraints
"""""""""""""""""

Input constraints do not have a prefix -- just the constraint codes. Each input
constraint will consume one argument from the call instruction. It is not
permitted for the asm to write to any input register or memory location (unless
that input is tied to an output). Note also that multiple inputs may all be
assigned to the same register, if LLVM can determine that they necessarily all
contain the same value.
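
To make the early-clobber rule concrete, here is a hedged sketch that
assumes an x86 target and the default ATT dialect; the function name
``@sum_early_clobber`` is hypothetical. The first instruction writes the
output (``$0``) before the second instruction reads the last input
(``$2``), so the output is marked "``=&r``" to keep it out of the input
registers.

.. code-block:: llvm

    define i32 @sum_early_clobber(i32 %a, i32 %b) {
    entry:
      ; $0 is the "=&r" output, $1 is %a, and $2 is %b.
      ; "\0A\09" is a newline followed by a tab between the two instructions.
      ; addl modifies the flags register, so it is listed as a clobber.
      %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %r
    }

With a plain "``=r``" output, LLVM would be free to give ``$0`` and ``$2``
the same register, in which case the ``movl`` would overwrite ``%b`` before
it is added.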

Instead of providing a Constraint Code, input constraints may also "tie"
themselves to an output constraint, by providing an integer as the constraint
string. Tied inputs still consume an argument from the call instruction, and
take up a position in the asm template numbering as is usual -- they will simply
be constrained to always use the same register as the output they've been tied
to. For example, a constraint string of "``=r,0``" says to assign a register for
output, and use that register as an input as well (it being the 0'th
constraint).

It is permitted to tie an input to an "early-clobber" output. In that case, no
*other* input may share the same register as the input tied to the early-clobber
(even when the other input has the same value).

You may only tie an input to an output which has a register constraint, not a
memory constraint. Only a single input may be tied to an output.

There is also an "interesting" feature which deserves a bit of explanation: if a
register class constraint allocates a register which is too small for the value
type operand provided as input, the input value will be split into multiple
registers, and all of them passed to the inline asm.

However, this feature is often not as useful as you might think.

Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction that takes a single 32-bit register. The
hardware then loads into both the named register, and the next register. This
feature of inline asm would not be useful to support that.)

A few of the targets provide a template string modifier allowing explicit access
to the second register of a two-register operand (e.g.
MIPS ``L``, ``M``, and
``D``). On such an architecture, you can actually access the second allocated
register (yet, still, not any subsequent ones). But, in that case, you're still
probably better off simply splitting the value into two separate operands, for
clarity. (e.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use)

Indirect inputs and outputs
"""""""""""""""""""""""""""

Indirect output or input constraints can be specified by the "``*``" modifier
(which goes after the "``=``" in case of an output). This indicates that the asm
will write to or read from the contents of an *address* provided as an input
argument. (Note that in this way, indirect outputs act more like an *input* than
an output: just like an input, they consume an argument of the call expression,
rather than producing a return value. An indirect output constraint is an
"output" only in that the asm is expected to write to the contents of the input
memory location, instead of just read from it).

This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.

It is also possible to use an indirect *register* constraint, but only on output
(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. I would recommend not using it.)

Call arguments for indirect constraints must have pointer type and must specify
the :ref:`elementtype <attr_elementtype>` attribute to indicate the pointer
element type.
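
A short sketch of an indirect memory output, again assuming an x86 target
and the ATT dialect; ``@store_via_asm`` is a hypothetical name. The
"``=*m``" operand does not produce a return value. Instead, it consumes the
pointer argument ``%p``, which carries the required ``elementtype``
attribute.

.. code-block:: llvm

    define void @store_via_asm(ptr %p, i32 %v) {
    entry:
      ; $0 is the memory location addressed by %p (indirect output),
      ; $1 is a register holding %v.
      call void asm "movl $1, $0", "=*m,r"(ptr elementtype(i32) %p, i32 %v)
      ret void
    }

Note that the call returns ``void`` even though the constraint string
starts with an output: both operands are supplied as call arguments.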

Clobber constraints
"""""""""""""""""""

A clobber constraint is indicated by a "``~``" prefix. A clobber does not
consume an input operand, nor generate an output. Clobbers cannot use any of the
general constraint code letters -- they may use only explicit register
constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
memory locations -- not only the memory pointed to by a declared indirect
output.

Note that clobbering named registers that are also present in output
constraints is not legal.

Label constraints
"""""""""""""""""

A label constraint is indicated by a "``!``" prefix and typically used in the
form ``"!i"``. Instead of consuming call arguments, label constraints consume
indirect destination labels of ``callbr`` instructions.

Label constraints can only be used in conjunction with ``callbr`` and the
number of label constraints must match the number of indirect destination
labels in the ``callbr`` instruction.


Constraint Codes
""""""""""""""""

After a potential prefix comes the constraint code, or codes.

A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
(e.g. "``{eax}``").

The one and two letter constraint codes are typically chosen to be the same as
GCC's constraint codes.

A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.

There are two ways to specify alternatives, and either or both may be used in an
inline asm constraint list:

1) Append the codes to each other, making a constraint code set. E.g. "``im``"
   or "``{eax}m``". This means "choose any of the options in the set". The
   choice of constraint is made independently for each constraint in the
   constraint list.

2) Use "``|``" between constraint code sets, creating alternatives. Every
   constraint in the constraint list must have the same number of alternative
   sets. With this syntax, the same alternative in *all* of the items in the
   constraint list will be chosen together.

Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.

However, the use of either of the alternatives features is *NOT* recommended, as
LLVM is not able to make an intelligent choice about which one to use. (At the
point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (e.g., given "``rm``", it'll
always choose to use memory, not registers). And, if given multiple registers,
or multiple register classes, it will simply choose the first one. (In fact, it
doesn't currently even ensure explicitly specified physical registers are
unique, so specifying multiple physical registers as alternatives, like
``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
intended.)

Supported Constraint Code List
""""""""""""""""""""""""""""""

The constraint codes are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC.
A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Some constraint codes are typically supported by all targets:

- ``r``: A register in the target's general purpose register class.
- ``m``: A memory address operand. It is target-specific what addressing modes
  are supported, typical examples are register, or register + register offset,
  or register + immediate offset (of some target-specific size).
- ``p``: An address operand. Similar to ``m``, but used by "load address"
  type instructions without touching memory.
- ``i``: An integer constant (of target-specific width). Allows either a simple
  immediate, or a relocatable value.
- ``n``: An integer constant -- *not* including relocatable values.
- ``s``: A symbol or label reference with a constant offset.
- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
  useful to pass a label for an asm branch or call.

  .. FIXME: but that surely isn't actually okay to jump out of an asm
     block without telling llvm about the control transfer???)

- ``{register-name}``: Requires exactly the named physical register.

Other constraints are target-specific:

AArch64:

- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
  i.e. 0 to 4095 with optional shift by 12.
- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
  64-bit register. This is a superset of ``L``.
- ``Q``: Memory address operand must be in a single register (no
  offsets). (However, LLVM currently does this for the ``m`` constraint as
  well.)
- ``r``: A 32 or 64-bit integer register (W* or X*).
- ``S``: A symbol or label reference with a constant offset. The generic ``s``
  is not supported.
- ``Uci``: Like r, but restricted to registers 8 to 11 inclusive.
- ``Ucj``: Like r, but restricted to registers 12 to 15 inclusive.
- ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
- ``x``: Like w, but restricted to registers 0 to 15 inclusive.
- ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
- ``Uph``: One of the upper eight SVE predicate registers (P8 to P15)
- ``Upl``: One of the lower eight SVE predicate registers (P0 to P7)
- ``Upa``: Any of the SVE predicate registers (P0 to P15)

AMDGPU:

- ``r``: A 32 or 64-bit integer register.
- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
- ``I``: An integer inline constant in the range from -16 to 64.
- ``J``: A 16-bit signed integer constant.
- ``A``: An integer or a floating-point inline constant.
- ``B``: A 32-bit signed integer constant.
- ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
- ``DA``: A 64-bit constant that can be split into two "A" constants.
- ``DB``: A 64-bit constant that can be split into two "B" constants.

All ARM modes:

- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
  operand. Treated the same as operand ``m``, at the moment.
- ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
- ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``

ARM and ARM's Thumb2 mode:

- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
- ``I``: An immediate integer valid for a data-processing instruction.
- ``J``: An immediate integer between -4095 and 4095.
- ``K``: An immediate integer whose bitwise inverse is valid for a
  data-processing instruction. (Can be used with template modifier "``B``" to
  print the inverted value).
- ``L``: An immediate integer whose negation is valid for a data-processing
  instruction. (Can be used with template modifier "``n``" to print the negated
  value).
- ``M``: A power of two or an integer between 0 and 32.
- ``N``: Invalid immediate constraint.
- ``O``: Invalid immediate constraint.
- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
  as ``r``.
- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
  invalid.
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.

ARM's Thumb1 mode:

- ``I``: An immediate integer between 0 and 255.
- ``J``: An immediate integer between -255 and -1.
- ``K``: An immediate integer between 0 and 255, with optional left-shift by
  some amount.
- ``L``: An immediate integer between -7 and 7.
- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
- ``N``: An immediate integer between 0 and 31.
- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
- ``r``: A low 32-bit GPR register (``r0-r7``).
- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.

Hexagon:

- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
  at the moment.
- ``r``: A 32 or 64-bit register.

LoongArch:

- ``f``: A floating-point register (if available).
- ``k``: A memory operand whose address is formed by a base register and
  (optionally scaled) index register.
- ``l``: A signed 16-bit constant.
- ``m``: A memory operand whose address is formed by a base register and
  offset that is suitable for use in instructions with the same addressing
  mode as st.w and ld.w.
- ``I``: A signed 12-bit constant (for arithmetic instructions).
- ``J``: An immediate integer zero.
- ``K``: An unsigned 12-bit constant (for logic instructions).
- ``ZB``: An address that is held in a general-purpose register. The offset
  is zero.
- ``ZC``: A memory operand whose address is formed by a base register and
  offset that is suitable for use in instructions with the same addressing
  mode as ll.w and sc.w.

MSP430:

- ``r``: An 8 or 16-bit register.

MIPS:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate integer zero.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
- ``N``: An immediate integer between -65535 and -1.
- ``O``: An immediate signed 15-bit integer.
- ``P``: An immediate integer between 1 and 65535.
- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus 16-bit immediate offset. In MIPS mode, just a base register.
- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
  ``m``.
- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
  ``sc`` instruction on the given subtarget (details vary).
- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
- ``l``: The ``lo`` register, 32 or 64-bit.
- ``x``: Invalid.

NVPTX:

- ``b``: A 1-bit integer register.
- ``c`` or ``h``: A 16-bit integer register.
- ``r``: A 32-bit integer register.
- ``l`` or ``N``: A 64-bit integer register.
- ``q``: A 128-bit integer register.
- ``f``: A 32-bit float register.
- ``d``: A 64-bit float register.


PowerPC:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
- ``M``: An immediate integer greater than 31.
- ``N``: An immediate integer that is an exact power of 2.
- ``O``: The immediate integer constant 0.
- ``P``: An immediate integer constant whose negation is a signed 16-bit
  constant.
- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
  treated the same as ``m``.
- ``r``: A 32 or 64-bit integer register.
- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).

- ``y``: Condition register (``CR0-CR7``).
- ``wc``: An individual CR bit in a CR register.
- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
  register set (overlapping both the floating-point and vector register files).
- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
  set.

RISC-V:

- ``A``: An address operand (using a general-purpose register, without an
  offset).
- ``I``: A 12-bit signed integer immediate operand.
- ``J``: A zero integer immediate operand.
- ``K``: A 5-bit unsigned integer immediate operand.
- ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
  ``XLEN``).
- ``S``: Alias for ``s``.
- ``vd``: A vector register, excluding ``v0`` (requires V extension).
- ``vm``: The vector register ``v0`` (requires V extension).
- ``vr``: A vector register (requires V extension).

Sparc:

- ``I``: An immediate 13-bit signed integer.
- ``r``: A 32-bit integer register.
- ``f``: Any floating-point register on SparcV8, or a floating-point
  register in the "low" half of the registers on SparcV9.
- ``e``: Any floating-point register.
  (Same as ``f`` on SparcV8.)

SystemZ:

- ``I``: An immediate unsigned 8-bit integer.
- ``J``: An immediate unsigned 12-bit integer.
- ``K``: An immediate signed 16-bit integer.
- ``L``: An immediate signed 20-bit integer.
- ``M``: An immediate integer 0x7fffffff.
- ``Q``: A memory address operand with a base address and a 12-bit immediate
  unsigned displacement.
- ``R``: A memory address operand with a base address, a 12-bit immediate
  unsigned displacement, and an index register.
- ``S``: A memory address operand with a base address and a 20-bit immediate
  signed displacement.
- ``T``: A memory address operand with a base address, a 20-bit immediate
  signed displacement, and an index register.
- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific)
- ``f``: A 32, 64, or 128-bit floating-point register.

X86:

- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
- ``K``: An immediate signed 8-bit integer.
- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
  0xffffffff.
- ``M``: An immediate integer between 0 and 3.
- ``N``: An immediate unsigned 8-bit integer.
- ``O``: An immediate integer between 0 and 127.
- ``e``: An immediate 32-bit signed integer.
- ``Z``: An immediate 32-bit unsigned integer.
- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
  registers, and on X86-64, it is all of the integer registers. When features
  ``egpr`` and ``inline-asm-use-gpr32`` are both on, it will be extended to gpr32.
- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register. When features
  ``egpr`` and ``inline-asm-use-gpr32`` are both on, it will be extended to gpr32.
- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
  existed since i386, and can be accessed without the REX prefix.
- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``v``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
- ``Ws``: A symbolic reference with an optional constant addend or a label
  reference.
- ``x``: The same as ``v``, except that when AVX-512 is enabled, the ``x`` code
  only allocates into the first 16 AVX-512 registers, while the ``v`` code
  allocates into any of the 32 AVX-512 registers.
- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
  32-bit mode, a 64-bit integer operand will get split into two registers). It
  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
  operand will get allocated only to RAX -- if two 32-bit operands are needed,
  you're better off splitting it yourself, before passing it to the asm
  statement.
- ``jr``: An 8, 16, 32, or 64-bit integer gpr16. It won't be extended to gpr32
  when feature ``egpr`` or ``inline-asm-use-gpr32`` is on.
- ``jR``: An 8, 16, 32, or 64-bit integer gpr32 when feature ``egpr`` is on.
  Otherwise, same as ``r``.
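
As a rough illustration of how some of the X86 codes above appear in IR, the
following sketch shifts a value left by a constant. The function name
``@shl_by_3`` and the choice of instruction are assumptions made for this
example only. The template uses AT&T syntax, and the clobber list mirrors what
C front ends commonly emit for X86 inline asm.

.. code-block:: llvm

    ; "=r": the result goes in any general-purpose register.
    ; "0":  the first input must share the register of output operand 0.
    ; "I":  the second input must be an immediate in the range 0 to 31.
    define i32 @shl_by_3(i32 %x) {
      %r = call i32 asm "shll $2, $0", "=r,0,I,~{dirflag},~{fpsr},~{flags}"(i32 %x, i32 3)
      ret i32 %r
    }

Replacing ``I`` with ``r`` would also be accepted here, but the shift count
would then be materialized in a register instead of being encoded in the
instruction.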

XCore:

- ``r``: A 32-bit integer register.


.. _inline-asm-modifiers:

Asm template argument modifiers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the asm template string, modifiers can be used on the operand reference, like
"``${0:n}``".

The modifiers are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Target-independent:

- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``n``: Negate and print immediate integer constant unadorned, without the
  target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``l``: Print as an unadorned label, without the target-specific label
  punctuation (e.g. no ``$`` prefix).

AArch64:

- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
  instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
  ``v*``.

AMDGPU:

- ``r``: No effect.

ARM:

- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
  register).
- ``P``: No effect.
- ``q``: No effect.
- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
  as ``d4[1]`` instead of ``s9``)
- ``B``: Bitwise invert and print an immediate integer constant without ``#``
  prefix.
- ``L``: Print the low 16-bits of an immediate integer constant.
- ``M``: Print as a register set suitable for ldm/stm.
  Also prints *all*
  register operands subsequent to the specified one (!), so use carefully.
- ``Q``: Print the low-order register of a register-pair, or the low-order
  register of a two-register operand.
- ``R``: Print the high-order register of a register-pair, or the high-order
  register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)

  .. FIXME: H doesn't currently support printing the second register
     of a two-register operand.

- ``e``: Print the low doubleword register of a NEON quad register.
- ``f``: Print the high doubleword register of a NEON quad register.
- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
  adornment.

Hexagon:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.

LoongArch:

- ``z``: Print $zero register if operand is zero, otherwise print it normally.

MSP430:

No additional modifiers.

MIPS:

- ``X``: Print an immediate integer as hexadecimal.
- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
- ``d``: Print an immediate integer as decimal.
- ``m``: Subtract one and print an immediate integer as decimal.
- ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or prints the
  address of the low-order word of a double-word memory operand.

  .. FIXME: L seems to be missing memory operand support.

- ``M``: Print the high-order register of a two-register operand, or prints the
  address of the high-order word of a double-word memory operand.

  .. FIXME: M seems to be missing memory operand support.

- ``D``: Print the second register of a two-register operand, or prints the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.

NVPTX:

- ``r``: No effect.

PowerPC:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, prints formatter for a two-register X-form
  instruction. (Currently always prints ``r0,OPERAND``).
- ``U``: Prints 'u' if the memory operand is an update form, and nothing
  otherwise. (NOTE: LLVM does not support update form, so this will currently
  always print nothing)
- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
  not support indexed form, so this will currently always print nothing)

RISC-V:

- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
  nothing. Used to print 'addi' vs 'add' instructions, etc.
- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
  normally.

Sparc:

- ``L``: Print the low-order register of a two-register operand.
- ``H``: Print the high-order register of a two-register operand.
- ``r``: No effect.

SystemZ:

SystemZ implements only ``n``, and does *not* support any of the other
target-independent modifiers.

X86:

- ``c``: Print an unadorned integer or symbol name. (The latter is
  target-specific behavior for this typically target-independent modifier).
- ``A``: Print a register name with a '``*``' before it.
- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
  operand.
- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
  memory operand.
- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
  operand.
- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
  operand.
- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
  available, otherwise the 32-bit register name; do nothing on a memory operand.
- ``n``: Negate and print an unadorned integer, or, for operands other than an
  immediate integer (e.g. a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier)
- ``H``: Print a memory reference with additional offset +8.
- ``p``: Print a raw symbol name (without syntax-specific prefixes).
- ``P``: Print a memory reference used as the argument of a call instruction or
  used with explicit base reg and index reg as its offset. So it can not use
  additional regs to present the memory reference. (E.g. omit ``(rip)``, even
  though it's PC-relative.)

XCore:

No additional modifiers.
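
A minimal sketch of a modifier in use, assuming an X86 target and AT&T syntax
(the function name ``@low_byte`` is made up for this example): the X86 ``b``
modifier makes the template print the 8-bit name of whichever register is
allocated for the 32-bit input.

.. code-block:: llvm

    ; Operand 1 is an i32 held in a byte-addressable register ("q").
    ; "${1:b}" prints that register's 8-bit name (e.g. cl rather than ecx),
    ; so the operand sizes of the movb instruction agree.
    define i8 @low_byte(i32 %x) {
      %r = call i8 asm "movb ${1:b}, $0", "=q,q"(i32 %x)
      ret i8 %r
    }

Without the modifier, ``$1`` would print the 32-bit register name, which the
assembler would be expected to reject as an operand of ``movb``.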


Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i64 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions and global objects in
the program that can convey extra information about the code to the optimizers
and code generator.

There are two metadata primitives: strings and nodes. There are
also specialized nodes which have a distinguished name and a set of named
arguments.

.. note::

   One example application of metadata is source-level debug information,
   which is currently the only user of specialized nodes.

Metadata does not have a type, and is not a value.

A value of non-\ ``metadata`` type can be used in a metadata context using the
syntax '``<type> <value>``'.

All other metadata is identified in syntax as starting with an exclamation
point ('``!``').

Metadata may be used in the following value contexts by using the ``metadata``
type:

- Arguments to certain intrinsic functions, as described in their specification.
- Arguments to the ``catchpad``/``cleanuppad`` instructions.

.. note::

   Metadata can be "wrapped" in a ``MetadataAsValue`` so it can be referenced
   in a value context: ``MetadataAsValue`` is-a ``Value``.

   A typed value can be "wrapped" in ``ValueAsMetadata`` so it can be
   referenced in a metadata context: ``ValueAsMetadata`` is-a ``Metadata``.

   There is no explicit syntax for a ``ValueAsMetadata``, and instead
   the fact that a type identifier cannot begin with an exclamation point
   is used to resolve ambiguity.

   A ``metadata`` type implies a ``MetadataAsValue``, and when followed with a
   '``<type> <value>``' pair it wraps the typed value in a ``ValueAsMetadata``.

   For example, the first argument
   to this call is a ``MetadataAsValue(ValueAsMetadata(Value))``:

   .. code-block:: llvm

       call void @llvm.foo(metadata i32 1)

   Whereas the first argument to this call is a ``MetadataAsValue(MDNode)``:

   .. code-block:: llvm

       call void @llvm.foo(metadata !0)

   The first element of this ``MDTuple`` is a ``MDNode``:

   .. code-block:: llvm

       !{!0}

   And the first element of this ``MDTuple`` is a ``ValueAsMetadata(Value)``:

   .. code-block:: llvm

       !{i32 1}

.. _metadata-string:

Metadata Strings (``MDString``)
-------------------------------

.. FIXME Either fix all references to "MDString" in the docs, or make that
   identifier a formal part of the document.

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two digit hex code. For example:
"``!"test\00"``".

.. note::

   A metadata string is metadata, but is not a metadata node.

.. _metadata-node:

Metadata Nodes (``MDNode``)
---------------------------

.. FIXME Either fix all references to "MDNode" in the docs, or make that
   identifier a formal part of the document.

Metadata tuples are represented with notation similar to structure
constants: a comma separated list of elements, surrounded by braces and
preceded by an exclamation point. Metadata nodes can have any values as
their operands. For example:

.. code-block:: llvm

    !{!"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

    !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
intrinsic is using three metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)


.. FIXME Attachments cannot be ValueAsMetadata, but we don't have a
   particularly clear way to refer to ValueAsMetadata without getting into
   implementation details. Ideally the restriction would be explicit somewhere,
   though?

Metadata can be attached to an instruction. Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

Instructions may not have multiple metadata attachments with the same
identifier.
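
A single instruction can still carry several attachments, provided each one
uses a different identifier. In the sketch below, ``!range`` is an existing
attachment kind, while ``!my.note`` and the function ``@f`` are hypothetical
names chosen purely for illustration:

.. code-block:: llvm

    define i32 @f(ptr %p) {
      ; Two attachments with distinct identifiers on one instruction.
      %v = load i32, ptr %p, align 4, !range !0, !my.note !1
      ret i32 %v
    }

    !0 = !{i32 0, i32 10}            ; %v is known to lie in [0, 10)
    !1 = !{!"free-form annotation"}  ; payload of the made-up !my.note kind

Adding a second ``!range`` attachment to the same ``load`` would violate the
rule stated above.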

Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:

.. code-block:: llvm

    declare !dbg !22 void @f1()
    define void @f2() !dbg !22 {
      ret void
    }

    @g1 = global i32 0, !dbg !22
    @g2 = external global i32, !dbg !22

Unlike instructions, global objects (functions and global variables) may have
multiple metadata attachments with the same identifier.

A transformation is required to drop any metadata attachment that it
does not know or knows it can't preserve. Currently there is an
exception for metadata attachment to globals for ``!func_sanitize``,
``!type``, ``!absolute_symbol`` and ``!associated`` which can't be
unconditionally dropped unless the global is itself deleted.

Metadata attached to a module using named metadata may not be dropped, with
the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit.
The ``enums:``,
``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
containing the debug info to be emitted along with the compile unit, regardless
of code optimizations (some nodes are only emitted if there are references to
them from instructions). The ``debugInfoForProfiling:`` field is a boolean
indicating whether or not line-table discriminators are updated to provide
more-accurate debug info for profiling results.

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
                        macros: !6, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope. These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).

.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: none

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
                 checksumkind: CSK_MD5,
                 checksum: "000102030405060708090a0b0c0d0e0f")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.

The ``checksum:`` and ``checksumkind:`` fields are optional. If one of these
fields is present, then the other is required to be present as well. Valid
values for ``checksumkind:`` field are: {CSK_MD5, CSK_SHA1, CSK_SHA256}

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``.
``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71
    DW_TAG_immutable_type     = 75

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type``, ``DW_TAG_restrict_type``, ``DW_TAG_atomic_type`` and
``DW_TAG_immutable_type`` are used to qualify the ``baseType:``.

Note that the ``void *`` type is expressed as a type derived from NULL.

.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.
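
As an illustrative sketch only, a two-field structure with an ODR identifier
and its ``DW_TAG_member`` nodes might be written as follows. The type name
``Point``, the file, the line numbers, and the identifier string are all
invented for this example:

.. code-block:: text

    !0 = !DIFile(filename: "point.h", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    ; The composite type lists its members through the elements: tuple.
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64,
                                   identifier: "_ZTS5Point", elements: !3)
    !3 = !{!4, !5}
    ; Each member names the composite type as its scope:; offset: is in bits.
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32)
    !5 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, offset: 32)

Because ``!2`` carries an ``identifier:``, the two members fall under the
name-and-scope uniquing rule described for ``DW_TAG_member``.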

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_array_type       = 1
    DW_TAG_class_type       = 2
    DW_TAG_enumeration_type = 4
    DW_TAG_structure_type   = 19
    DW_TAG_union_type       = 23

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector. The optional ``dataLocation`` is a
DIExpression that describes how to get from an object's address to the actual
raw data, if they aren't equivalent. This is only supported for array types,
particularly to describe Fortran arrays, which have an array descriptor in
addition to the array data. Alternatively it can also be a DIVariable which
has the address of the actual raw data. The Fortran language supports pointer
arrays which can be attached to actual arrays; this attachment between pointer
and pointee is called association. The optional ``associated`` is a
DIExpression that describes whether the pointer array is currently associated.
The optional ``allocated`` is a DIExpression that describes whether the
allocatable array is currently allocated.
The optional ``rank`` is a DIExpression that describes the rank (number of
dimensions) of a Fortran assumed-rank array (the rank is known only at run
time).

For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.

.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`.

- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.

.. code-block:: text

   !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
   !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
   !2 = !DISubrange(count: -1) ; empty array.

   ; Scopes used in rest of example
   !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
   !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
   !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)

   ; Use of local variable as count value
   !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
   !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
   !11 = !DISubrange(count: !10, lowerBound: 0)

   ; Use of global variable as count value
   !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
   !13 = !DISubrange(count: !12, lowerBound: 0)

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: text

   !0 = !DIEnumerator(name: "SixKind", value: 6)
   !1 = !DIEnumerator(name: "SevenKind", value: 7)
   !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

   !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

   !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: text

   !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

.. _DIGlobalVariable:

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: text

   @foo = global i32, !dbg !0
   !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
   !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
                          file: !3, line: 7, type: !4, isLocal: true,
                          isDefinition: false, declaration: !5)


DIGlobalVariableExpression
""""""""""""""""""""""""""

``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
with a :ref:`DIExpression`.

.. code-block:: text

   @lower = global i32, !dbg !0
   @upper = global i32, !dbg !1
   !0 = !DIGlobalVariableExpression(
            var: !2,
            expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
            )
   !1 = !DIGlobalVariableExpression(
            var: !2,
            expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
            )
   !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
                          file: !4, line: 8, type: !5, declaration: !6)

All global variable expressions should be referenced by the ``globals:`` field
of a :ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A distinct
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. A unique ``DISubprogram`` may be attached to a function declaration
used for call site debug info.
The ``retainedNodes:`` field is a list of
:ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``spFlags: DISPFlagDefinition`` is not present, subprograms describe a
declaration in the type tree as opposed to a definition of a function. In this
case, the ``declaration`` field must be empty. If the scope is a composite type
with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``, then
the subprogram declaration is uniqued based only on its ``linkageName:`` and
``scope:``.

.. code-block:: text

   define void @_Z3foov() !dbg !0 {
     ...
   }

   !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                               file: !2, line: 7, type: !3,
                               spFlags: DISPFlagDefinition | DISPFlagLocalToUnit,
                               scopeLine: 8, containingType: !4,
                               virtuality: DW_VIRTUALITY_pure_virtual,
                               virtualIndex: 10, flags: DIFlagPrototyped,
                               isOptimized: true, unit: !5, templateParams: !6,
                               declaration: !7, retainedNodes: !8,
                               thrownTypes: !9)

.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

   !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

.. _DILexicalBlockFile:

DILexicalBlockFile
""""""""""""""""""

``DILexicalBlockFile`` nodes are used to discriminate between sections of a
:ref:`lexical block <DILexicalBlock>`.
The ``file:`` field can be changed to
indicate textual inclusion, or the ``discriminator:`` field can be used to
discriminate between control flow within a single block in the source language.

.. code-block:: text

   !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
   !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
   !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: text

   !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)

.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``retainedNodes:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

   !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                         type: !3, flags: DIFlagArtificial)
   !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                         type: !3)
   !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

.. _DIExpression:

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug records <debugrecords>`
(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the
referenced LLVM variable relates to the source language variable.
Debug
expressions are interpreted left-to-right: start by pushing the value/address
operand of the record onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.

The currently supported opcode vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
  them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that contrary to ``DW_OP_bit_piece``, the offset describes the location
  within the described source variable.
- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
  (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
  expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
  that references a base type constructed from the supplied values.
- ``DW_OP_LLVM_extract_bits_sext, 16, 8`` specifies the offset and size
  (``16`` and ``8`` here, respectively) of bits that are to be extracted and
  sign-extended from the value at the top of the expression stack. If the top of
  the expression stack is a memory location then these bits are extracted from
  the value pointed to by that memory location. Maps into a ``DW_OP_shl``
  followed by ``DW_OP_shra``.
- ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to
  ``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending.
  Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``.
- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
  optionally applied to the pointer. The memory tag is derived from the
  given tag offset in an implementation-defined manner.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
- ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
  function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
  DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to
  ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
  function entry onto the DWARF expression stack.

  The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
  block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
  DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
  the entry value of ``reg`` is pushed onto the stack and 123 is added to it.
  Due to framework limitations ``N`` must be 1; in other words,
  ``DW_OP_entry_value`` always refers to the value/address operand of the
  instruction.

  Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
  usually used in MIR, but it is also allowed in LLVM IR when targeting a
  :ref:`swiftasync <swiftasync>` argument. The operation is introduced by:

  - ``LiveDebugValues`` pass, which applies it to function parameters that
    are unmodified throughout the function. Support is limited to simple
    register location descriptions, or as indirect locations (e.g.,
    parameters passed-by-value to a callee via a pointer to a temporary copy
    made in the caller).
  - ``AsmPrinter`` pass when a call site parameter value
    (``DW_AT_call_site_parameter_value``) is represented as entry value of
    the parameter.
  - ``CoroSplit`` pass, which may move variables from allocas into a
    coroutine frame. If the coroutine frame is a
    :ref:`swiftasync <swiftasync>` argument, the variable is described with
    a ``DW_OP_LLVM_entry_value`` operation.

- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
  value, such as one that calculates the sum of two registers. This is always
  used in combination with an ordered list of values, such that
  ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
  signed offset from the specified register. The opcode is only generated by
  the ``AsmPrinter`` pass to describe a call site parameter value that requires
  an expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array that has array
  descriptors.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array whose rank is known at run time, where the current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables that are optimized out but whose
  pointed-to value is known. This operator is required because it differs from
  the DWARF operator ``DW_OP_implicit_pointer`` in representation and
  specification (number and types of operands), and the latter cannot be used
  for multiple levels of indirection.

.. code-block:: text

    IR for "*ptr = 4;"
    --------------
      #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DILocation(line: 10, scope: !12)

    IR for "**ptr = 4;"
    --------------
      #dbg_value(i32 4, !17,
        !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
        !21)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !21 = !DILocation(line: 10, scope: !12)

DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
descriptions describe the *concrete location* of a source variable (in the
sense that a debugger might modify its value), whereas *implicit locations*
describe merely the actual *value* of a source variable which might not exist
in registers or in memory (see ``DW_OP_stack_value``).

A ``#dbg_declare`` record describes an indirect value (the address) of a
source variable. The first operand of the record must be an address of some
kind.
A DIExpression operand to the record refines this address to produce a
concrete location for the source variable.

A ``#dbg_value`` record describes the direct value of a source variable.
The first operand of the record may be a direct or indirect value. A
DIExpression operand to the record refines the first operand to produce a
direct value. For example, if the first operand is an indirect value, it may be
necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
valid debug record.

.. note::

   A DIExpression is interpreted in the same way regardless of which kind of
   debug record it's attached to.

   DIExpressions are always printed and parsed inline; they can never be
   referenced by an ID (e.g. ``!1``).

.. code-block:: text

   !DIExpression(DW_OP_deref)
   !DIExpression(DW_OP_plus_uconst, 3)
   !DIExpression(DW_OP_constu, 3, DW_OP_plus)
   !DIExpression(DW_OP_bit_piece, 3, 7)
   !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
   !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
   !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)

DIAssignID
""""""""""

``DIAssignID`` nodes have no operands and are always distinct. They are used to
link together :ref:`#dbg_assign records <debugrecords>` and instructions
that store in IR. See `Debug Info Assignment Tracking
<AssignmentTracking.html>`_ for more info.

.. code-block:: llvm

   store i32 %a, ptr %a.addr, align 4, !DIAssignID !2
   #dbg_assign(%a, !1, !DIExpression(), !2, %a.addr, !DIExpression(), !3)

   !2 = distinct !DIAssignID()

DIArgList
"""""""""

.. FIXME In the implementation this is not a "node", but as it can only appear
   inline in a function context that distinction isn't observable anyway. Even
   if it is not required, it would be nice to be more clear about what is a
   "node", and what that actually means. The names in the implementation could
   also be updated to mirror whatever we decide here.

``DIArgList`` nodes hold a list of constant or SSA value references. These are
used in :ref:`debug records <debugrecords>` in combination with a
``DIExpression`` that uses the
``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
within a function, it must only be used as a function argument, must always be
inlined, and cannot appear in named metadata.

.. code-block:: text

   #dbg_value(!DIArgList(i32 %a, i32 %b),
              !16,
              !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus),
              !26)

DIFlags
"""""""

These flags encode various properties of DINodes.

The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether ``DW_AT_export_symbols`` can
be used for the structure type.

DIObjCProperty
""""""""""""""

``DIObjCProperty`` nodes represent Objective-C property nodes.

.. code-block:: text

   !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
                        getter: "getFoo", attributes: 7, type: !2)

DIImportedEntity
""""""""""""""""

``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit. The ``elements:`` field is a list of renamed entities (such as
variables and subprograms) in the imported entity (such as a module).

.. code-block:: text

   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
                          entity: !1, line: 7, elements: !3)
   !3 = !{!4}
   !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
                          entity: !5, line: 7)

DIMacro
"""""""

``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.

.. code-block:: text

   !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
                 value: "((x) + 1)")
   !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")

DIMacroFile
"""""""""""

``DIMacroFile`` nodes represent inclusion of source files.
The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
appear in the included source file.

.. code-block:: text

   !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !1,
                     nodes: !3)

.. _DILabel:

DILabel
"""""""

``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
a ``DILabel`` are mandatory. The ``scope:`` field must be a
:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
The ``name:`` field is the label identifier. The ``file:`` field is the
:ref:`DIFile` the label is present in. The ``line:`` field is the source line
within the file where the label is declared.

.. code-block:: text

   !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)

DICommonBlock
"""""""""""""

``DICommonBlock`` nodes represent Fortran common blocks. The ``scope:`` field
is mandatory and points to a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
The ``declaration:``,
``name:``, ``file:``, and ``line:`` fields are optional.

DIModule
""""""""

``DIModule`` nodes represent a source language module, for example, a Clang
module, or a Fortran module. The ``scope:`` field is mandatory and points to a
:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
The ``name:`` field is mandatory. The ``configMacros:``, ``includePath:``,
``apinotes:``, ``file:``, ``line:``, and ``isDecl:`` fields are optional.

DIStringType
""""""""""""

``DIStringType`` nodes represent a Fortran ``CHARACTER(n)`` type, with a
dynamic length and location encoded as an expression.
The ``tag:`` field is optional and defaults to ``DW_TAG_string_type``. The
``name:``, ``stringLength:``, ``stringLengthExpression:``,
``stringLocationExpression:``, ``size:``, ``align:``, and ``encoding:`` fields
are optional.

If not present, the ``size:`` and ``align:`` fields default to the value zero.

The length in bits of the string is specified by the first of the following
fields present:

- ``stringLength:``, which points to a ``DIVariable`` whose value is the string
  length in bits.
- ``stringLengthExpression:``, which points to a ``DIExpression`` which
  computes the length in bits.
- ``size:``, which contains the literal length in bits.

The ``stringLocationExpression:`` points to a ``DIExpression`` which describes
the "data location" of the string object, if present.

'``tbaa``' Metadata
^^^^^^^^^^^^^^^^^^^

In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing type based alias analysis (TBAA). Instead, metadata is
added to the IR to describe a type system of a higher level language.
This
can be used to implement C/C++ strict type aliasing rules, but it can also
be used to implement custom alias analysis behavior for other languages.

This description of LLVM's TBAA system is broken into two parts:
:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
:ref:`Representation<tbaa_node_representation>` talks about the metadata
encoding of various entities.

It is always possible to trace any TBAA node to a "root" TBAA node (details
in the :ref:`Representation<tbaa_node_representation>` section). TBAA
nodes with different roots have an unknown aliasing relationship, and LLVM
conservatively infers ``MayAlias`` between them. The rules mentioned in
this section only pertain to TBAA nodes living under the same root.

.. _tbaa_node_semantics:

Semantics
"""""""""

The TBAA metadata system, referred to as "struct path TBAA" (not to be
confused with ``tbaa.struct``), consists of the following high level
concepts: *Type Descriptors*, further subdivided into scalar type
descriptors and struct type descriptors; and *Access Tags*.

**Type descriptors** describe the type system of the higher level language
being compiled. **Scalar type descriptors** describe types that do not
contain other types. Each scalar type has a parent type, which must also
be a scalar type or the TBAA root. Via this parent relation, scalar types
within a TBAA root form a tree. **Struct type descriptors** denote types
that contain a sequence of other type descriptors, at known offsets. These
contained type descriptors can either be struct type descriptors themselves
or scalar type descriptors.

**Access tags** are metadata nodes attached to load and store instructions.
Access tags use type descriptors to describe the *location* being accessed
in terms of the type system of the higher level language.
Access tags are
tuples consisting of a base type, an access type and an offset. The base
type is a scalar type descriptor or a struct type descriptor, the access
type is a scalar type descriptor, and the offset is a constant integer.

The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
   or store) of a value of type ``AccessTy`` contained in the struct type
   ``BaseTy`` at offset ``Offset``.

 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
   ``AccessTy`` must be the same; and the access tag describes a scalar
   access with scalar type ``AccessTy``.

We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples this way:

 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
   described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
   undefined if ``Offset`` is non-zero.

 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
   to be relative within that inner type.

A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa. If memory
accesses alias even though they are noalias according to ``!tbaa`` metadata,
the behavior is undefined.

As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d; // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

is (note that in C and C++, ``char`` can be used to access any arbitrary
type):

.. code-block:: text

    Root = "TBAA Root"
    CharScalarTy = ("char", Root, 0)
    FloatScalarTy = ("float", CharScalarTy, 0)
    DoubleScalarTy = ("double", CharScalarTy, 0)
    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
                     (InnerStructTy, 12)}


with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.

.. _tbaa_node_representation:

Representation
""""""""""""""

The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
with exactly one ``MDString`` operand.

Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root.
Scalar
type descriptors can have an optional third argument, but that must be the
constant integer zero.

Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1. The first operand is an ``MDString`` denoting
the name of the struct type. As in scalar type descriptors, the actual
value of this name operand is irrelevant to LLVM. After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands. With N starting from 1, the (2N - 1)th operand,
an ``MDNode``, denotes a contained field, and the 2Nth operand, a
``ConstantInt``, is the offset of the said contained field. The offsets
must be in non-decreasing order.

Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
The first operand is an ``MDNode`` pointing to the node representing the
base type. The second operand is an ``MDNode`` pointing to the node
representing the access type. The third operand is a ``ConstantInt`` that
states the offset of the access. If a fourth field is present, it must be
a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
that the location being accessed is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
the access type and the base type of an access tag must be the same, and
that is the TBAA root of the access tag.

'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding.
Also, it doesn't contain any TBAA information about the fields 7033of the aggregate. 7034 7035``!tbaa.struct`` metadata can describe which memory subregions in a 7036memcpy are padding and what the TBAA tags of the struct are. 7037 7038The current metadata format is very simple. ``!tbaa.struct`` metadata 7039nodes are a list of operands which are in conceptual groups of three. 7040For each group of three, the first operand gives the byte offset of a 7041field in bytes, the second gives its size in bytes, and the third gives 7042its tbaa tag. e.g.: 7043 7044.. code-block:: llvm 7045 7046 !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 } 7047 7048This describes a struct with two fields. The first is at offset 0 bytes 7049with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes 7050and has size 4 bytes and has tbaa tag !2. 7051 7052Note that the fields need not be contiguous. In this example, there is a 70534 byte gap between the two fields. This gap represents padding which 7054does not carry useful data and need not be preserved. 7055 7056'``noalias``' and '``alias.scope``' Metadata 7057^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 7058 7059``noalias`` and ``alias.scope`` metadata provide the ability to specify generic 7060noalias memory-access sets. This means that some collection of memory access 7061instructions (loads, stores, memory-accessing calls, etc.) that carry 7062``noalias`` metadata can specifically be specified not to alias with some other 7063collection of memory access instructions that carry ``alias.scope`` metadata. If 7064accesses from different collections alias, the behavior is undefined. Each type 7065of metadata specifies a list of scopes where each scope has an id and a domain. 

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used, for example, during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, ptr %c, align 4, !alias.scope !5
    store float %0, ptr %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, ptr %c, align 4, !alias.scope !5
    store float %1, ptr %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float, ptr %c, align 4, !alias.scope !6
    store float %2, ptr %arrayidx.i, align 4, !noalias !7

.. _fpmath-metadata:

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating-point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

   If ``x`` is a real number that lies between two finite consecutive
   floating-point numbers ``a`` and ``b``, without being equal to one
   of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
   distance between the two non-equal finite floating-point numbers
   nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.

The metadata node shall consist of a single positive float type number
representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs

.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer or vector of integer types. It expresses the possible ranges the loaded
value or the value returned by the called function at this call site is in. If
the loaded or returned value is not in the specified range, a poison value is
returned instead. The ranges are represented with a flattened list of integers.
The loaded value or the value returned is known to be in the union of the ranges
defined by each consecutive pair. Each pair has the following properties:

- The type must match the scalar type of the instruction.
- The pair ``a,b`` represents the range ``[a,b)``.
- Both ``a`` and ``b`` are constants.
- The range is allowed to wrap.
- The range should not represent the full or empty set. That is,
  ``a!=b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

For vector-typed instructions, the range is applied element-wise.

Examples:

.. code-block:: llvm

      %a = load i8, ptr %x, align 1, !range !0 ; Can only be 0 or 1
      %b = load i8, ptr %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
      %d = invoke i8 @bar() to label %cont
             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
      %e = load <2 x i8>, ptr %x, !range !0 ; Can only be <0 or 1, 0 or 1>
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

'``absolute_symbol``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``absolute_symbol`` metadata may be attached to a global variable
declaration. It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code, and expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.

Example (assuming 64-bit pointers):

.. code-block:: llvm

      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)

    ...
    !0 = !{ i64 0, i64 256 }
    !1 = !{ i64 -1, i64 -1 }

'``callees``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``callees`` metadata may be attached to indirect call sites. If ``callees``
metadata is attached to a call site, and any callee is not among the set of
functions provided by the metadata, the behavior is undefined. The intent of
this metadata is to facilitate optimizations such as indirect-call promotion.
For example, in the code below, the call instruction may only target the
``add`` or ``sub`` functions:

.. code-block:: llvm

    %result = call i64 %binop(i64 %x, i64 %y), !callees !0

    ...
    !0 = !{ptr @add, ptr @sub}

'``callback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with the metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata. Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
semantic restriction on the broker function itself is that it is not allowed to
inspect or modify arguments referenced in the ``callback`` metadata as
pass-through to the callback function.

The broker is not required to actually invoke the callback function at runtime.
However, the assumptions about not inspecting or modifying arguments that would
be passed to the specified callback function still hold, even if the callback
function is not dynamically invoked. The broker is allowed to invoke the
callback function more than once per invocation of the broker. The broker is
also allowed to invoke (directly or indirectly) the function passed as a
callback through another use. Finally, the broker is also allowed to relay the
callback callee invocation to a different thread.

The metadata is structured as follows: At the outer level, ``callback``
metadata is a list of ``callback`` encodings. Each encoding starts with a
constant ``i64`` which describes the argument position of the callback function
in the call to the broker. The following elements, except the last, describe
what arguments are passed to the callback function. Each element is again an
``i64`` constant identifying the argument of the broker that is passed through,
or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
they are listed has to be the same in which they are passed to the callback
callee. The last element of the encoding is a boolean which specifies how
variadic arguments of the broker are handled. If it is true, all variadic
arguments of the broker are passed through to the callback function *after* the
arguments encoded explicitly before.
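
As a minimal sketch of a single encoding, consider a broker named
``@my_broker``. The name and signature are hypothetical and exist only for
illustration. Argument positions are assumed to count from zero, which is
consistent with the ``pthread_create`` declaration used in this section.

.. code-block:: llvm

    ; Hypothetical broker, for illustration only.
    ;   position 0: an i32 that the broker may inspect freely
    ;   position 1: the callback function
    ;   position 2: a payload pointer forwarded unmodified to the callback
    declare !callback !0 void @my_broker(i32, ptr, ptr)

    !0 = !{!1}                              ; the list of callback encodings
    !1 = !{i64 1, i64 -1, i64 2, i1 false}  ; a single encoding

Read from left to right, ``!1`` states that the callback callee is broker
argument 1, that its first argument is unknown (``i64 -1``), that its second
argument is broker argument 2, and that variadic arguments of the broker are
not forwarded (``i1 false``).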

In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. This encoding
identifies the callback function as the broker argument at position 2
(``i64 2``, counting from zero, so the third parameter of ``pthread_create``)
and the sole argument of the callback function as the broker argument at
position 3 (``i64 3``, the fourth parameter).

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !1 dso_local i32 @pthread_create(ptr, ptr, ptr, ptr)

    ...
    !2 = !{i64 2, i64 3, i1 false}
    !1 = !{!2}

Another example is shown below. The callback callee is the argument at
position 2 (``i64 2``, counting from zero) of the ``__kmpc_fork_call``
function. The callee is given two unknown values (each identified by a
``i64 -1``) and afterwards all variadic arguments that are passed to the
``__kmpc_fork_call`` call (due to the final ``i1 true``).

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !0 dso_local void @__kmpc_fork_call(ptr, i32, ptr, ...)

    ...
    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
    !0 = !{!1}

'``exclude``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``exclude`` metadata may be attached to a global variable to signify that its
section should not be included in the final executable or shared library. This
option is only valid for global variables with an explicit section targeting ELF
or COFF. This is done using the ``SHF_EXCLUDE`` flag on ELF targets and the
``IMAGE_SCN_LNK_REMOVE`` and ``IMAGE_SCN_MEM_DISCARDABLE`` flags for COFF
targets. Additionally, this metadata is only used as a flag, so the associated
node must be empty. The explicit section should not conflict with any other
sections that the user does not want removed after linking.

.. code-block:: text

    @object = private constant [1 x i8] c"\00", section ".foo", !exclude !0

    ...
    !0 = !{}

'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.

.. _md_dereferenceable:

'``dereferenceable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable;
otherwise, the behavior is undefined.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.

.. _md_dereferenceable_or_null:

'``dereferenceable_or_null``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null; otherwise, the behavior is undefined.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values.

.. _llvm.loop:

'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. The loop metadata node is a list of
other metadata nodes, each representing a property of the loop. Usually,
the first item of the property node is a string. For example, the
``llvm.loop.unroll.count`` property suggests an unroll factor to the loop
unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll.enable"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}

For legacy reasons, the first item of a loop metadata node must be a
reference to itself. Before the advent of the 'distinct' keyword, this
forced the preservation of otherwise identical metadata nodes. Since
the loop-metadata node can be attached to multiple nodes, the 'distinct'
keyword has become unnecessary.

Prior to the property nodes, one or two ``DILocation`` (debug location)
nodes can be present in the list. The first, if present, identifies the
source-code location where the loop begins. The second, if present,
identifies the source-code location where the loop ends.

Loop metadata nodes cannot be used as unique identifiers. They are
neither persistent for the same loop through transformations nor
necessarily unique to just one loop.

'``llvm.loop.disable_nonforced``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables all optional loop transformations unless
explicitly instructed using other transformation metadata such as
``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed to a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion
can make other transformations impossible. Mandatory loop canonicalizations
such as loop rotation are still applied.

It is recommended to use this metadata in addition to any ``llvm.loop.*``
transformation directive. Also, any loop should have at most one
directive applied to it (and a sequence of transformations built using
followup-attributes). Otherwise, which transformation will be applied
depends on implementation details such as the pass pipeline order.

See :ref:`transformation-metadata` for details.

'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata,
which contains information about loop-carried memory dependencies, can be helpful
in determining the safety of these transformations.

'``llvm.loop.interleave.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an interleave count to the loop interleaver.
The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.interleave.count", i32 4}

Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
then the interleave count will be determined automatically.

'``llvm.loop.vectorize.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
0 disables vectorization:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}

'``llvm.loop.vectorize.predicate.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables creating predicated instructions
for the loop, which can enable folding of the scalar epilogue loop into the
main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, predication is enabled. A value of 0 disables
predication:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}

'``llvm.loop.vectorize.scalable.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables scalable vectorization for the
loop, and only has any effect if vectorization for the loop is already enabled.
The first operand is the string ``llvm.loop.vectorize.scalable.enable``
and the second operand is a bit. If the bit operand value is 1, scalable
vectorization is enabled, whereas a value of 0 reverts to the default fixed
width vectorization:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}

'``llvm.loop.vectorize.width``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata sets the target width of the vectorizer. The first
operand is the string ``llvm.loop.vectorize.width`` and the second
operand is an integer specifying the width. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.width", i32 4}

Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0 or if the loop does not have this metadata, the width will be
determined automatically.

'``llvm.loop.vectorize.followup_vectorized``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the vectorized loop will
have. See :ref:`transformation-metadata` for details.

'``llvm.loop.vectorize.followup_epilogue``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the epilogue will have. The
epilogue is not vectorized and is executed when either the vectorized
loop is not known to preserve semantics (because e.g., it processes two
arrays that are found to alias by a runtime check) or for the last
iterations that do not fill a complete set of vector lanes. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.vectorize.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes in the metadata will be added to both the vectorized and
epilogue loop.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller. The
first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time. The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

'``llvm.loop.unroll.followup``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the unrolled loop will have.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll.followup_remainder``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the remainder loop after
partial/runtime unrolling will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too.)

The metadata for unroll and jam otherwise is the same as for ``unroll``.
``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
and the normal safety checks will still be performed.

'``llvm.loop.unroll_and_jam.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll and jam factor to use, similarly to
``llvm.loop.unroll.count``. The first operand is the string
``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
specifying the unroll factor. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.

'``llvm.loop.unroll_and_jam.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unroll and jamming. The metadata has a single
operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.disable"}

'``llvm.loop.unroll_and_jam.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
not known at compile time. The metadata has a single operand which is the
string ``llvm.loop.unroll_and_jam.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.enable"}

'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the epilogue of the outer loop
will have. This loop is usually unrolled, meaning there is no such
loop. This attribute will be ignored in this case. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the inner loop of the epilogue
will have. The outer epilogue will usually be unrolled, meaning there
can be multiple inner remainder loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes specified in the metadata are added to all
``llvm.loop.unroll_and_jam.*`` loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
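
As a minimal sketch of how a followup attribute might be combined with a
transformation directive, the loop metadata below requests unroll and jam by a
factor of 4 and asks that every resulting loop carry
``llvm.loop.unroll.disable``. The operand layout of the followup node (the name
string followed by the attribute nodes to apply) is an assumption here, and the
choice of ``llvm.loop.unroll.disable`` as the followup attribute is purely
illustrative.

.. code-block:: llvm

    ; Loop metadata node, attached to the latch branch of the outer loop
    ; with !llvm.loop !0.
    !0 = distinct !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
    ; Assumed layout: name string, then the attribute nodes to apply.
    !2 = !{!"llvm.loop.unroll_and_jam.followup_all", !3}
    !3 = !{!"llvm.loop.unroll.disable"}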

'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that the loop should not be versioned for the purpose
of enabling loop-invariant code motion (LICM). The metadata has a single operand
which is the string ``llvm.loop.licm_versioning.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.licm_versioning.disable"}

'``llvm.loop.distribute.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Loop distribution allows splitting a loop into multiple loops. Currently,
this is only performed if the entire loop cannot be vectorized due to unsafe
memory dependencies. The transformation will attempt to isolate the unsafe
dependencies into their own loop.

This metadata can be used to selectively enable or disable distribution of the
loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:

.. code-block:: llvm

    !0 = !{!"llvm.loop.distribute.enable", i1 0}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}

This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.

'``llvm.loop.distribute.followup_coincident``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the extracted loops with no cyclic
dependencies (i.e. loops that can be vectorized) will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_sequential``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the isolated loops with unsafe
memory dependencies will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_fallback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If loop versioning is necessary, this metadata defines the attributes
the non-distributed fallback version will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The attributes in this metadata are added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.licm.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that loop-invariant code motion (LICM) should not be
performed on this loop. The metadata has a single operand which is the string
``llvm.licm.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.licm.disable"}

Note that although it operates per loop, it isn't given the ``llvm.loop``
prefix because it is not affected by the ``llvm.loop.disable_nonforced``
metadata.

'``llvm.access.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.

.. code-block:: llvm

    %val = load i32, ptr %arrayidx, !llvm.access.group !0
    ...
    !0 = !{!1, !2}
    !1 = distinct !{}
    !2 = distinct !{}

It is illegal for the list node to be empty since it might be confused
with an access group.

The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which distinguishes it from a list of access groups.
Being empty also avoids the situation where the content must be updated
which, because metadata is immutable by design, would require finding and
updating all references to the access group node.

The access group can be used to refer to a memory access instruction
without pointing to it directly (which is not possible in global
metadata). Currently, the only metadata making use of it is
``llvm.loop.parallel_accesses``.

'``llvm.loop.parallel_accesses``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between instructions in the loop
that belong to these access groups.

Let ``m1`` and ``m2`` be two instructions whose ``llvm.access.group``
metadata refers to the access groups ``g1`` and ``g2``, respectively
(which might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.
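
To illustrate the last rule, here is a small sketch in which the load belongs
to two access groups through a list node, while the loop's
``llvm.loop.parallel_accesses`` names only one of them. Since at least one
group matches, the load is still covered. The function name ``@copy``, its
arguments, and the value names are hypothetical, and ``%n`` is assumed to be
at least 1.

.. code-block:: llvm

    define void @copy(ptr %src, ptr %dst, i64 %n) {
    entry:
      br label %for.body

    for.body:
      %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
      %src.addr = getelementptr inbounds i32, ptr %src, i64 %i
      %dst.addr = getelementptr inbounds i32, ptr %dst, i64 %i
      ; The load is a member of access groups !1 and !2 (via the list !3).
      %val = load i32, ptr %src.addr, align 4, !llvm.access.group !3
      ; The store is a member of access group !1 only.
      store i32 %val, ptr %dst.addr, align 4, !llvm.access.group !1
      %i.next = add nuw i64 %i, 1
      %exitcond = icmp eq i64 %i.next, %n
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
      ret void
    }

    ; The loop lists only !1; that is enough to cover both accesses.
    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
    !1 = distinct !{}
    !2 = distinct !{}
    !3 = !{!1, !2}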

If all memory-accessing instructions in a loop have
``llvm.access.group`` metadata that each refer to one of the access
groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependencies and is considered to be a
parallel loop. If there is a loop-carried dependency, the behavior is
undefined.

Note that if not all memory access instructions belong to an access
group referred to by ``llvm.loop.parallel_accesses``, then the loop must
not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be
considered sequential (if optimization passes that are unaware of the parallel
semantics insert new memory instructions into the loop body).

Example of a loop that is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata types:

.. code-block:: llvm

    for.body:
      ...
      %val0 = load i32, ptr %arrayidx, !llvm.access.group !1
      ...
      store i32 %val0, ptr %arrayidx1, !llvm.access.group !1
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
    !1 = distinct !{}

It is also possible to have nested parallel loops:

.. code-block:: llvm

    outer.for.body:
      ...
      %val1 = load i32, ptr %arrayidx3, !llvm.access.group !4
      ...
      br label %inner.for.body

    inner.for.body:
      ...
      %val0 = load i32, ptr %arrayidx1, !llvm.access.group !3
      ...
      store i32 %val0, ptr %arrayidx2, !llvm.access.group !3
      ...
      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

    inner.for.end:
      ...
      store i32 %val1, ptr %arrayidx4, !llvm.access.group !4
      ...
      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

    outer.for.end:                                          ; preds = %for.body
    ...
    !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
    !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
    !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
    !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop

.. _langref_llvm_loop_mustprogress:

'``llvm.loop.mustprogress``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
terminate, unwind, or interact with the environment in an observable way, e.g.
via a volatile memory access, I/O, or other synchronization. If such a loop is
not found to interact with the environment in an observable way, the loop may
be removed. This corresponds to the ``mustprogress`` function attribute.

'``irr_loop``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:

.. code-block:: llvm

    header0:
    ...
    br i1 %cmp, label %t1, label %t2, !irr_loop !0

    ...
    !0 = !{!"loop_header_weight", i64 100}

Irreducible loop header weights are typically based on profile data.

.. _md_invariant.group:

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The experimental ``invariant.group`` metadata may be attached to
``load``/``store`` instructions referencing a single metadata with no entries.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
can be assumed to load or store the same
value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
when two pointers are considered the same). Pointers returned by bitcast or
getelementptr with only zero indices are considered the same.

Examples:

.. code-block:: llvm

   @unknownPtr = external global i8
   ...
   %ptr = alloca i8
   store i8 42, ptr %ptr, !invariant.group !0
   call void @foo(ptr %ptr)

   %a = load i8, ptr %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
   call void @foo(ptr %ptr)

   %newPtr = call ptr @getPointer(ptr %ptr)
   %c = load i8, ptr %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

   %unknownValue = load i8, ptr @unknownPtr
   store i8 %unknownValue, ptr %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

   call void @foo(ptr %ptr)
   %newPtr2 = call ptr @llvm.launder.invariant.group.p0(ptr %ptr)
   %d = load i8, ptr %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr

   ...
   declare void @foo(ptr)
   declare ptr @getPointer(ptr)
   declare ptr @llvm.launder.invariant.group.p0(ptr)

   !0 = !{}

The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is
tied to the SSA value of the pointer operand.

.. code-block:: llvm

  %v = load i8, ptr %x, !invariant.group !0
  ; if %x mustalias %y then we can replace the above instruction with
  %v = load i8, ptr %y

Note that this is an experimental feature, which means that its semantics might
change in the future.

'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

'``associated``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``associated`` metadata may be attached to a global variable definition with
a single argument that references a global object (optionally through an alias).

This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
discarding of the global variable in linker GC unless the referenced object is
also discarded. The linker support for this feature is spotty. For best
compatibility, globals carrying this metadata should:

- Be in ``@llvm.compiler.used``.
- If the referenced global variable is in a comdat, be in the same comdat.

``!associated`` cannot express a many-to-one relationship. A global variable with
the metadata should generally not be referenced by a function: the function may
be inlined into other functions, leading to more references to the metadata.
Ideally we would want to keep metadata alive as long as any inline location is
alive, but this many-to-one relationship is not representable. Moreover, if the
metadata is retained while the function is discarded, the linker will report an
error of a relocation referencing a discarded section.

The metadata is often used with an explicit section consisting of valid C
identifiers so that the runtime can find the metadata section with
linker-defined encapsulation symbols ``__start_<section_name>`` and
``__stop_<section_name>``.

It does not have any effect on non-ELF targets.

Example:

.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{ptr @a}


'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
The first operand of the metadata node indicates the profile metadata
type. There are currently 3 types:
:ref:`branch_weights<prof_node_branch_weights>`,
:ref:`function_entry_count<prof_node_function_entry_count>`, and
:ref:`VP<prof_node_VP>`.

.. _prof_node_branch_weights:

branch_weights
""""""""""""""

Branch weight metadata attached to a branch, select, switch or call instruction
represents the likeliness of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_function_entry_count:

function_entry_count
""""""""""""""""""""

Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Used with BFI
information, it is also used to derive the basic block profile count.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_VP:

VP
""

VP (value profile) metadata can be attached to instructions that have
value profile information. Currently this is indirect calls (where it
records the hottest callees) and calls to memory intrinsics such as memcpy,
memmove, and memset (where it records the hottest byte lengths).

Each VP metadata node contains the string "VP", then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.

Note that the value counts do not need to add up to the total count
listed in the third operand (in practice only the top hottest values
are tracked and reported).

Indirect call example:

.. code-block:: llvm

    call void %f(), !prof !1
    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}

Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective target functions
was called.

.. _md_annotation:

'``annotation``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``annotation`` metadata can be used to attach a tuple of annotation strings
or a tuple of tuples of annotation strings to any instruction. This metadata does
not impact the semantics of the program and may only be used to provide additional
insight about the program and transformations to users.

Example:

.. code-block:: text

    %a.addr = alloca ptr, align 8, !annotation !0
    !0 = !{!"auto-init"}

Embedding tuple of strings example:

.. code-block:: text

    %a.ptr = getelementptr ptr, ptr %base, i64 0, !annotation !0
    !0 = !{!1}
    !1 = !{!"gep offset", !"0"}

'``func_sanitize``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``func_sanitize`` metadata is used to attach two values for the function
sanitizer instrumentation. The first value is the ubsan function signature.
The second value is the address of the proxy variable which stores the address
of the RTTI descriptor. If :ref:`prologue <prologuedata>` and '``func_sanitize``'
are used at the same time, :ref:`prologue <prologuedata>` is emitted before
'``func_sanitize``' in the output.

Example:

.. code-block:: text

    @__llvm_rtti_proxy = private unnamed_addr constant ptr @_ZTIFvvE
    define void @_Z3funv() !func_sanitize !0 {
      ret void
    }
    !0 = !{i32 846595819, ptr @__llvm_rtti_proxy}

.. _md_kcfi_type:

'``kcfi_type``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^

The ``kcfi_type`` metadata can be used to attach a type identifier to
functions that can be called indirectly. The type data is emitted before the
function entry in the assembly. Indirect calls with the :ref:`kcfi operand
bundle<ob_kcfi>` will emit a check that compares the type identifier to the
metadata.

Example:

.. code-block:: text

    define dso_local i32 @f() !kcfi_type !0 {
      ret i32 0
    }
    !0 = !{i32 12345678}

Clang emits ``kcfi_type`` metadata nodes for address-taken functions with
``-fsanitize=kcfi``.

.. _md_memprof:

'``memprof``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^

The ``memprof`` metadata is used to record memory profile data on heap
allocation calls. Multiple context-sensitive profiles can be represented
with a single ``memprof`` metadata attachment.

Example:

.. code-block:: text

    %call = call ptr @_Znam(i64 10), !memprof !0, !callsite !5
    !0 = !{!1, !3}
    !1 = !{!2, !"cold"}
    !2 = !{i64 4854880825882961848, i64 1905834578520680781}
    !3 = !{!4, !"notcold"}
    !4 = !{i64 4854880825882961848, i64 -6528110295079665978}
    !5 = !{i64 4854880825882961848}

Each operand in the ``memprof`` metadata attachment describes the profiled
behavior of memory allocated by the associated allocation for a given context.
In the above example, there were 2 profiled contexts, one allocating memory
that was typically cold and one allocating memory that was typically not cold.

The format of the metadata describing a context specific profile (e.g.
``!1`` and ``!3`` above) requires a first operand that is a metadata node
describing the context, followed by a list of string metadata tags describing
the profile behavior (e.g. ``cold`` and ``notcold`` above). The metadata nodes
describing the context (e.g. ``!2`` and ``!4`` above) are unique ids
corresponding to callsites, which can be matched to associated IR calls via
:ref:`callsite metadata<md_callsite>`. In practice these ids are formed via
a hash of the callsite's debug info, and the associated call may be in a
different module. The contexts are listed in order from leaf-most call (the
allocation itself) to the outermost callsite context required for uniquely
identifying the described profile behavior (note this may not be the top of
the profiled call stack).

.. _md_callsite:

'``callsite``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^

The ``callsite`` metadata is used to identify callsites involved in memory
profile contexts described in :ref:`memprof metadata<md_memprof>`.

It is attached both to the profiled allocation calls (see the example in
:ref:`memprof metadata<md_memprof>`), as well as to other callsites
in profiled contexts described in heap allocation ``memprof`` metadata.

Example:

.. code-block:: text

    %call = call ptr @_Z1Bb(void), !callsite !0
    !0 = !{i64 -6528110295079665978, i64 5462047985461644151}

Each operand in the ``callsite`` metadata attachment is a unique id
corresponding to a callsite (possibly inlined). In practice these ids are
formed via a hash of the callsite's debug info. If the call was not inlined
into any callers it will contain a single operand (id). If it was inlined
it will contain a list of ids, including the ids of the callsites in the
full inline sequence, in order from the leaf-most call's id to the outermost
inlined call.


'``noalias.addrspace``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``noalias.addrspace`` metadata is used to identify memory
operations which cannot access objects allocated in a range of address
spaces. It is attached to memory instructions, including
:ref:`atomicrmw <i_atomicrmw>`, :ref:`cmpxchg <i_cmpxchg>`, and
:ref:`call <i_call>` instructions.

This follows the same form as :ref:`range metadata <range-metadata>`,
except the field entries must be of type ``i32``. The values are interpreted
as the same numeric address spaces that apply to IR values.

Example:

.. code-block:: llvm

    ; %ptr cannot point to an object allocated in addrspace(5)
    %rmw.valid = atomicrmw and ptr %ptr, i64 %value seq_cst, !noalias.addrspace !0

    ; Undefined behavior. The underlying object is allocated in one of the listed
    ; address spaces.
    %alloca = alloca i64, addrspace(5)
    %alloca.cast = addrspacecast ptr addrspace(5) %alloca to ptr
    %rmw.ub = atomicrmw and ptr %alloca.cast, i64 %value seq_cst, !noalias.addrspace !0

    !0 = !{i32 5, i32 6} ; Exclude addrspace(5) only


This is intended for use on targets with a notion of generic address
spaces, which at runtime resolve to different physical memory
spaces. The interpretation of the address space values is target
specific. The behavior is undefined if the runtime memory address
resolves to an object defined in one of the indicated address spaces.


Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

- The first element is a *behavior* flag, which specifies the behavior
  when two (or more) modules are merged together and two (or more)
  metadata with the same ID are encountered. The supported behaviors are
  described below.
- The second element is a metadata string that is a unique ID for the
  metadata. Each module may only have one flag entry for each unique ID (not
  including entries with the **Require** behavior).
- The third element is the value of the flag.

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked, unless the
           other module uses **Min** or **Max**, in which case the result will
           be **Min** (with the min value) or **Max** (with the max value),
           respectively.

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to. This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

   * - 7
     - **Max**
           Takes the max of the two values, which are required to be integers.

   * - 8
     - **Min**
           Takes the min of the two values, which are required to be non-negative integers.
           An absent module flag is treated as having the value 0.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
   if two or more ``!"foo"`` flags are seen is to emit an error if their
   values are not equal.

-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
   behavior if two or more ``!"bar"`` flags are seen is to use the value
   '37'.

-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
   behavior if two or more ``!"qux"`` flags are seen is to emit a
   warning if their values are not equal.

-  Metadata ``!3`` has the ID ``!"qux"`` and the value:

   ::

       !{ !"foo", i32 1 }

   The behavior is to emit an error if the ``llvm.module.flags`` does not
   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
   performed.

Synthesized Functions Module Flags Metadata
-------------------------------------------

These metadata specify the default attributes synthesized functions should have.
These metadata are currently respected by a few instrumentation passes, such as
sanitizers.
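
As a minimal sketch of what such flags might look like, a module that asks
synthesized functions to keep all frame pointers and to use asynchronous
unwind tables could carry the entries below. The chosen values are
illustrative assumptions; the flag keys and the meaning of their values are
the ones enumerated in the list that follows.

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2} ; behavior 7 is Max; value 2 selects "all"
    !1 = !{i32 7, !"uwtable", i32 2}       ; behavior 7 is Max; value 2 selects uwtable(async)

Because both entries use the **Max** behavior, linking this module with one
that specifies a smaller value for the same key keeps the larger value.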

These metadata correspond to a few function attributes with significant code
generation behaviors. Function attributes with just optimization purposes
should not be listed because the performance impact of these synthesized
functions is small.

- "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
  will get the "frame-pointer" function attribute, with value being "none",
  "non-leaf", or "all", respectively.
- "function_return_thunk_extern": The synthesized function will get the
  ``fn_return_thunk_extern`` function attribute.
- "uwtable": **Max**. The value can be 0, 1, or 2. If the value is 1, a synthesized
  function will get the ``uwtable(sync)`` function attribute; if the value is 2,
  a synthesized function will get the ``uwtable(async)`` function attribute.

Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info". The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section to place the metadata. Valid values are
       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6. This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

-  If a module with ``Objective-C Garbage Collection`` set to 0 is
   merged with a module with ``Objective-C Garbage Collection`` set to
   2, then the resulting module has the
   ``Objective-C Garbage Collection`` flag set to 0.
-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
   merged with a module with ``Objective-C GC Only`` set to 6.

C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely wchar_t width and enum
width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of their values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 2 bytes, and that enums are at least as
large as an ``int``::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 1}
    !1 = !{i32 1, !"short_enum", i32 0}

Stack Alignment Metadata
------------------------

Changes the default stack alignment from the target ABI's implicit default
stack alignment. Takes an i32 value in bytes. It is considered an error to link
two modules together with different values for this metadata.

For example::

    !llvm.module.flags = !{!0}
    !0 = !{i32 1, !"override-stack-alignment", i32 8}

This will change the stack alignment to 8 bytes.

Embedded Objects Names Metadata
===============================

Offloading compilations need to embed device code into the host section table to
create a fat binary. This metadata node references each global that will be
embedded in the module. The primary use for this is to make referencing these
globals more efficient in the IR. The metadata references nodes containing
pointers to the global to be embedded followed by the section name it will be
stored at::

    !llvm.embedded.objects = !{!0}
    !0 = !{ptr @object, !".section"}

Automatic Linker Flags Named Metadata
=====================================

Some targets support embedding of flags to the linker inside individual object
files. Typically this is used in conjunction with language extensions which
allow source files to contain linker command line options, and have these
automatically be transmitted to the linker via object files.

These flags are encoded in the IR using named metadata with the name
``!llvm.linker.options``. Each operand is expected to be a metadata node
which should be a list of other metadata nodes, each of which should be a
list of metadata strings defining linker options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

Dependent Libs Named Metadata
=============================

Some targets support embedding of strings into object files to indicate
a set of libraries to add to the link. Typically this is used in conjunction
with language extensions which allow source files to explicitly declare the
libraries they depend on, and have these automatically be transmitted to the
linker via object files.

The list is encoded in the IR using named metadata with the name
``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
which should contain a single string operand.

For example, the following metadata section contains two library specifiers::

    !0 = !{!"a library specifier"}
    !1 = !{!"another library specifier"}
    !llvm.dependent-libraries = !{ !0, !1 }

Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.

.. _summary:

ThinLTO Summary
===============

Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
causes the building of a compact summary of the module that is emitted into
the bitcode. The summary is emitted into the LLVM assembly and identified
in syntax by a caret ('``^``').

The summary is parsed into a bitcode output, along with the Module
IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
summary entries (just as they currently ignore summary entries in a bitcode
input file).

Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where the summary index is currently built from bitcode.
Specifically, tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag
(this part is not yet implemented, use llvm-as to create a bitcode object
before feeding into thin link tools for now).

There are currently 3 types of summary entries in the LLVM assembly:
:ref:`module paths<module_path_summary>`,
:ref:`global values<gv_summary>`, and
:ref:`type identifiers<typeid_summary>`.

.. _module_path_summary:

Module Path Summary Entry
-------------------------

Each module path summary entry lists a module containing global values included
in the summary. For a single IR module there will be one such entry, but
in a combined summary index produced during the thin link, there will be
one module path entry per linked module with summary.

Example:

.. code-block:: text

    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))

The ``path`` field is a string path to the bitcode file, and the ``hash``
field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
incremental builds and caching.

.. _gv_summary:

Global Value Summary Entry
--------------------------

Each global value summary entry corresponds to a global value defined or
referenced by a summarized module.

Example:

.. code-block:: text

    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831

For declarations, there will not be a summary list. For definitions, a
global value will contain a list of summaries, one per module containing
a definition. There can be multiple entries in a combined summary index
for symbols with weak linkage.

Each ``Summary`` format will depend on whether the global value is a
:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
:ref:`alias<alias_summary>`.

.. _function_summary:

Function Summary
^^^^^^^^^^^^^^^^

If the global value is a function, the ``Summary`` entry will look like:

.. code-block:: text

    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)

The ``module`` field includes the summary entry id for the module containing
this definition, and the ``flags`` field contains information such as
the linkage type, a flag indicating whether it is legal to import the
definition, whether it is globally live and whether the linker resolved it
to a local definition (the latter two are populated during the thin link).
The ``insts`` field contains the number of IR instructions in the function.
Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
:ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.

.. _variable_summary:

Global Variable Summary
^^^^^^^^^^^^^^^^^^^^^^^

If the global value is a variable, the ``Summary`` entry will look like:

.. code-block:: text

    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)

The variable entry contains a subset of the fields in a
:ref:`function summary <function_summary>`, see the descriptions there.

.. _alias_summary:

Alias Summary
^^^^^^^^^^^^^

If the global value is an alias, the ``Summary`` entry will look like:

.. code-block:: text

    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)

The ``module`` and ``flags`` fields are as described for a
:ref:`function summary <function_summary>`. The ``aliasee`` field
contains a reference to the global value summary entry of the aliasee.

.. _funcflags_summary:

Function Flags
^^^^^^^^^^^^^^

The optional ``FuncFlags`` field looks like:

.. code-block:: text

    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)

If unspecified, flags are assumed to hold the conservative ``false`` value of
``0``.

.. _calls_summary:

Calls
^^^^^

The optional ``Calls`` field looks like:

.. code-block:: text

    calls: ((Callee)[, (Callee)]*)

where each ``Callee`` looks like:

.. code-block:: text

    callee: ^1[, hotness: None]?[, relbf: 0]?

The ``callee`` refers to the summary entry id of the callee. At most one
of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
branch frequency relative to the entry frequency, scaled down by 2^8)
may be specified. The defaults are ``Unknown`` and ``0``, respectively.

.. _params_summary:

Params
^^^^^^

The optional ``Params`` field is used by ``StackSafety`` and looks like:

.. code-block:: text

    params: ((Param)[, (Param)]*)

where each ``Param`` describes pointer parameter access inside of the
function and looks like:

.. code-block:: text

    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?

where ``param`` is the number of the parameter it describes, and
``offset`` is the inclusive range of offsets from the pointer parameter to bytes
which can be accessed by the function. This range does not include accesses by
function calls from the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded into other
functions and looks like:

.. code-block:: text

    callee: ^3, param: 5, offset: [-3, 3]

The ``callee`` refers to the summary entry id of the callee, and ``param`` is
the number of the callee parameter which points into the caller's parameter
with an offset known to be inside of the ``offset`` range. ``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe, and
access with any offset is assumed to be possible.

Example:

If we have the following function:

.. code-block:: text

    define i64 @foo(ptr %0, ptr %1, ptr %2, i8 %3) {
      store ptr %1, ptr @x
      %5 = getelementptr inbounds i8, ptr %2, i64 5
      %6 = load i8, ptr %5
      %7 = getelementptr inbounds i8, ptr %2, i8 %3
      tail call void @bar(i8 %3, ptr %7)
      %8 = load i64, ptr %0
      ret i64 %8
    }

We can expect a record like this:

.. code-block:: text

    params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))

The function may access just 8 bytes of the parameter %0. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter %1 escapes, so access is unknown.
The function itself can access just a single byte of the parameter %2. Additional
access is possible inside of the ``@bar`` or ``^3``. The function adds a signed
offset to the pointer and passes the result as the argument %1 into ``^3``.
This record itself does not tell us how ``^3`` will access the parameter.
Parameter %3 is not a pointer.

.. _refs_summary:

Refs
^^^^

The optional ``Refs`` field looks like:

.. code-block:: text

    refs: ((Ref)[, (Ref)]*)

where each ``Ref`` contains a reference to the summary id of the referenced
value (e.g. ``^1``).

.. _typeidinfo_summary:

TypeIdInfo
^^^^^^^^^^

The optional ``TypeIdInfo`` field, used for
`Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
looks like:

.. code-block:: text

    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?

These optional fields have the following forms:

TypeTests
"""""""""

.. code-block:: text

    typeTests: (TypeIdRef[, TypeIdRef]*)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID``.

TypeTestAssumeVCalls
""""""""""""""""""""

.. code-block:: text

    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format:

.. code-block:: text

    vFuncId: (TypeIdRef, offset: 16)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID`` preceded by a ``guid:`` tag.

TypeCheckedLoadVCalls
"""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.

TypeTestAssumeConstVCalls
"""""""""""""""""""""""""

.. code-block:: text

    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format:

.. code-block:: text

    (VFuncId, args: (Arg[, Arg]*))

and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
and each Arg is an integer argument number.

TypeCheckedLoadConstVCalls
""""""""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format described for
``TypeTestAssumeConstVCalls``.

.. _typeid_summary:

Type ID Summary Entry
---------------------

Each type id summary entry corresponds to a type identifier resolution
which is generated during the LTO link portion of the compile when building
with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
so these are only present in a combined summary index.

Example:

.. code-block:: text

    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778

The ``typeTestRes`` gives the type test resolution ``kind`` (which may
be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
the ``size-1`` bit width. It is followed by optional flags, which default to 0,
and an optional WpdResolutions (whole program devirtualization resolution)
field that looks like:

.. code-block:: text

    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)

where each entry is a mapping from the given byte offset to the whole-program
devirtualization resolution WpdRes, which has one of the following formats:

.. code-block:: text

    wpdRes: (kind: branchFunnel)
    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
    wpdRes: (kind: indir)

Additionally, each wpdRes has an optional ``resByArg`` field, which
describes the resolutions for calls with all constant integer arguments:

.. code-block:: text

    resByArg: (ResByArg[, ResByArg]*)

where ResByArg is:

.. code-block:: text

    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])

Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
or ``VirtualConstProp``. The ``info`` field is only used if the kind
is ``UniformRetVal`` (indicates the uniform return value), or
``UniqueRetVal`` (holds the return value associated with the unique vtable
(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
not support the use of absolute symbols to store constants.

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affect code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x ptr] [
       ptr @X,
       ptr @Y
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that it cannot see (which is why they have to be named). For example, if
a variable has internal linkage and no references other than that from the
``@llvm.used`` list, it cannot be deleted.
This is commonly used to represent
references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
removing the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This is a rare construct that should only be used in exceptional circumstances,
and should not be exposed to source languages.

.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, ptr, ptr }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, ptr @ctor, ptr @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.
On ELF the referenced global variable or function must be in a comdat.

.. _llvmglobaldtors:

The '``llvm.global_dtors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, ptr, ptr }
    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, ptr @dtor, ptr @data }]

The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in descending
order of priority (i.e. highest first) when the module is unloaded. The
order of functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the destructor function will only run if the associated
data from the current module is not discarded.
On ELF the referenced global variable or function must be in a comdat.

Instruction Reference
=====================

The LLVM instruction set consists of several different classifications
of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
instructions <binaryops>`, :ref:`bitwise binary
instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
:ref:`other instructions <otherops>`. There are also :ref:`debug records
<debugrecords>`, which are not instructions themselves but are printed
interleaved with instructions to describe changes in the state of the program's
debug information at each position in the program's execution.

.. _terminators:

Terminator Instructions
-----------------------

As mentioned :ref:`previously <functionstructure>`, every basic block in a
program ends with a "Terminator" instruction, which indicates which
block should be executed after the current block is finished. These
terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the one exception being the
':ref:`invoke <i_invoke>`' instruction).
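
As a minimal sketch of this rule (the function ``@max`` and its block labels
are illustrative names, not part of this reference), every basic block below
ends with exactly one terminator:

.. code-block:: llvm

    define i32 @max(i32 %a, i32 %b) {
    entry:
      %cmp = icmp sgt i32 %a, %b
      ; Conditional branch: terminator of the entry block.
      br i1 %cmp, label %ret_a, label %ret_b

    ret_a:
      ; Return: terminator of %ret_a.
      ret i32 %a

    ret_b:
      ; Return: terminator of %ret_b.
      ret i32 %b
    }

Neither ``br`` nor ``ret`` defines a value here; they only select where
control goes next.
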

The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.

.. _i_ret:

'``ret``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    ret <type> <value>       ; Return a value from a non-void function
    ret void                 ; Return from void function

Overview:
"""""""""

The '``ret``' instruction is used to return control flow (and optionally
a value) from a function back to the caller.

There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.

Arguments:
""""""""""

The '``ret``' instruction optionally accepts a single argument, the
return value. The type of the return value must be a ':ref:`first
class <t_firstclass>`' type.

A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.

Semantics:
""""""""""

When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller is an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block.
If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.

Example:
""""""""

.. code-block:: llvm

    ret i32 5                       ; Return an integer value of 5
    ret void                        ; Return from a void function
    ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

.. _i_br:

'``br``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    br i1 <cond>, label <iftrue>, label <iffalse>
    br label <dest>          ; Unconditional branch

Overview:
"""""""""

The '``br``' instruction is used to cause control flow to transfer to a
different basic block in the current function. There are two forms of
this instruction, corresponding to a conditional branch and an
unconditional branch.

Arguments:
""""""""""

The conditional branch form of the '``br``' instruction takes a single
'``i1``' value and two '``label``' values. The unconditional form of the
'``br``' instruction takes a single '``label``' value as a target.

Semantics:
""""""""""

Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
to the '``iffalse``' ``label`` argument.
If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Example:
""""""""

.. code-block:: llvm

    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

.. _i_switch:

'``switch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]

Overview:
"""""""""

The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.

Arguments:
""""""""""

The '``switch``' instruction uses three parameters: an integer
comparison value '``value``', a default '``label``' destination, and an
array of pairs of comparison value constants and '``label``'s. The table
is not allowed to contain duplicate constant entries.

Semantics:
""""""""""

The ``switch`` instruction specifies a table of values and destinations.
When the '``switch``' instruction is executed, this table is searched
for the given value. If the value is found, control flow is transferred
to the corresponding destination; otherwise, control flow is transferred
to the default destination.
If '``value``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

Depending on properties of the target machine and the particular
``switch`` instruction, this instruction may be code generated in
different ways. For example, it could be generated as a series of
chained conditional branches or with a lookup table.

Example:
""""""""

.. code-block:: llvm

    ; Emulate a conditional br instruction
    %Val = zext i1 %value to i32
    switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

    ; Emulate an unconditional br instruction
    switch i32 0, label %dest [ ]

    ; Implement a jump table:
    switch i32 %val, label %otherwise [ i32 0, label %onzero
                                        i32 1, label %onone
                                        i32 2, label %ontwo ]

.. _i_indirectbr:

'``indirectbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    indirectbr ptr <address>, [ label <dest1>, label <dest2>, ... ]

Overview:
"""""""""

The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". Address must be derived from a
:ref:`blockaddress <blockaddress>` constant.

Arguments:
""""""""""

The '``address``' argument is the address of the label to jump to. The
rest of the arguments indicate the full set of possible destinations
that the address may point to. Blocks are allowed to occur multiple
times in the destination list, though this isn't particularly useful.

This destination list is required so that dataflow analysis has an
accurate understanding of the CFG.

Semantics:
""""""""""

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list, otherwise
this instruction has undefined behavior. This implies that jumps to
labels defined in other functions have undefined behavior as well.
If '``address``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

This is typically implemented with a jump through a register.

Example:
""""""""

.. code-block:: llvm

    indirectbr ptr %Addr, [ label %bb1, label %bb2, label %bb3 ]

.. _i_invoke:

'``invoke``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                  [operand bundles] to label <normal label> unwind label <exception label>

Overview:
"""""""""

The '``invoke``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``normal``' label or the '``exception``' label. If the callee function
returns with the "``ret``" instruction, control flow will return to the
"normal" label. If the callee (or any indirect callees) returns via the
":ref:`resume <i_resume>`" instruction or other exception handling
mechanism, control is interrupted and continued at the dynamically
nearest "exception" label.

The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
   attributes are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being invoked. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with a label, which is used by the runtime
library to unwind the stack.
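
The association between an ``invoke`` and its unwind label might be sketched
as follows. The callee ``@may_throw`` and the caller ``@caller`` are
hypothetical names, and the use of the Itanium C++ personality routine
``@__gxx_personality_v0`` is an assumption made purely for illustration:

.. code-block:: llvm

    declare i32 @may_throw(i32)
    declare i32 @__gxx_personality_v0(...)

    define i32 @caller(i32 %x) personality ptr @__gxx_personality_v0 {
    entry:
      ; Control continues at %normal on return, or at %lpad on unwind.
      %r = invoke i32 @may_throw(i32 %x)
              to label %normal unwind label %lpad

    normal:
      ; %r is only available along the normal edge.
      ret i32 %r

    lpad:
      ; The unwind destination must begin with a landingpad.
      %exn = landingpad { ptr, i32 }
              cleanup
      ; Nothing to clean up in this sketch, so keep unwinding.
      resume { ptr, i32 } %exn
    }

A real cleanup block would typically run destructors before reaching the
``resume``.
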

This instruction is used in languages with destructors to ensure that
proper cleanup is performed in the case of either a ``longjmp`` or a
thrown exception. Additionally, this is important for implementation of
'``catch``' clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned
by the '``invoke``' instruction is deemed to occur on the edge from the
current block to the "normal" label. If the callee unwinds then no
return value is available.

Example:
""""""""

.. code-block:: llvm

    %retval = invoke i32 @Test(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set
    %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set

.. _i_callbr:

'``callbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                  [operand bundles] to label <fallthrough label> [indirect labels]

Overview:
"""""""""

The '``callbr``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``fallthrough``' label or one of the '``indirect``' labels.

This instruction should only be used to implement the "goto" feature of gcc
style inline assembly. Any other usage is an error in the IR verifier.

Note that in order to support outputs along indirect edges, LLVM may need to
split critical edges, which may require synthesizing a replacement block for
the ``indirect labels``.
Therefore, the address of a label as seen by another 9429``callbr`` instruction, or for a :ref:`blockaddress <blockaddress>` constant, 9430may not be equal to the address provided for the same block to this 9431instruction's ``indirect labels`` operand. The assembly code may only transfer 9432control to addresses provided via this instruction's ``indirect labels``. 9433 9434Arguments: 9435"""""""""" 9436 9437This instruction requires several arguments: 9438 9439#. The optional "cconv" marker indicates which :ref:`calling 9440 convention <callingconv>` the call should use. If none is 9441 specified, the call defaults to using C calling conventions. 9442#. The optional :ref:`Parameter Attributes <paramattrs>` list for return 9443 values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``' 9444 attributes are valid here. 9445#. The optional addrspace attribute can be used to indicate the address space 9446 of the called function. If it is not specified, the program address space 9447 from the :ref:`datalayout string<langref_datalayout>` will be used. 9448#. '``ty``': the type of the call instruction itself which is also the 9449 type of the return value. Functions that return no value are marked 9450 ``void``. 9451#. '``fnty``': shall be the signature of the function being called. The 9452 argument types must match the types implied by this signature. This 9453 type can be omitted if the function is not varargs. 9454#. '``fnptrval``': An LLVM value containing a pointer to a function to 9455 be called. In most cases, this is a direct function call, but 9456 other ``callbr``'s are just as possible, calling an arbitrary pointer 9457 to function value. 9458#. '``function args``': argument list whose types match the function 9459 signature argument types and parameter attributes. All arguments must 9460 be of :ref:`first class <t_firstclass>` type. 
   If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``fallthrough label``': the label reached when the inline assembly's
   execution exits the bottom.
#. '``indirect labels``': the labels reached when a callee transfers control
   to a location other than the '``fallthrough label``'. Label constraints
   refer to these destinations.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with additional labels to define where control
flow goes after the call.

The output values of a '``callbr``' instruction are available both in
the '``fallthrough``' block and in any '``indirect``' block(s).

The only use of this today is to implement the "goto" feature of gcc inline
assembly where additional labels can be provided as locations for the inline
assembly to jump to.

Example:
""""""""

.. code-block:: llvm

      ; "asm goto" without output constraints.
      callbr void asm "", "r,!i"(i32 %x)
                  to label %fallthrough [label %indirect]

      ; "asm goto" with output constraints.
      <result> = callbr i32 asm "", "=r,r,!i"(i32 %x)
                  to label %fallthrough [label %indirect]

.. _i_resume:

'``resume``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      resume <type> <value>

Overview:
"""""""""

The '``resume``' instruction is a terminator instruction that has no
successors.

Arguments:
""""""""""

The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.

Semantics:
""""""""""

The '``resume``' instruction resumes propagation of an existing
(in-flight) exception whose unwinding was interrupted with a
:ref:`landingpad <i_landingpad>` instruction.

Example:
""""""""

.. code-block:: llvm

      resume { ptr, i32 } %exn

.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>

Overview:
"""""""""

The '``catchswitch``' instruction is used by `LLVM's exception handling system
<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
that may be executed by the :ref:`EH personality routine <personalityfn>`.

Arguments:
""""""""""

The ``parent`` argument is the token of the funclet that contains the
``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
this operand may be the token ``none``.

The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.
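
The operands described above can be seen together in a complete function.
This is only a minimal sketch: the callee ``@may_throw`` and the function
``@catch_all`` are hypothetical, the personality ``@__CxxFrameHandler3`` is
assumed to be the MSVC C++ one, and the ``catchpad`` arguments are the ones
that personality is assumed to use for a catch-all handler.

.. code-block:: llvm

   declare void @may_throw()               ; hypothetical callee
   declare i32 @__CxxFrameHandler3(...)    ; assumed personality routine

   define void @catch_all() personality ptr @__CxxFrameHandler3 {
   entry:
     invoke void @may_throw()
             to label %done unwind label %dispatch

   dispatch:
     ; Not nested in a funclet, so the parent token is 'none'.
     %cs = catchswitch within none [label %handler] unwind to caller

   handler:
     ; Each handler block begins with a catchpad that names the catchswitch.
     %cp = catchpad within %cs [ptr null, i32 64, ptr null]
     catchret from %cp to label %done

   done:
     ret void
   }

Here ``%dispatch`` contains nothing but the ``catchswitch``, and the single
entry in ``handlers`` is a block that starts with a ``catchpad``.
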

Semantics:
""""""""""

Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.

The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
it must be both the first non-phi instruction and last instruction in the basic
block. Therefore, it must be the only non-phi instruction in the block.

Example:
""""""""

.. code-block:: text

    dispatch1:
      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
    dispatch2:
      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup

.. _i_catchret:

'``catchret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      catchret from <token> to label <normal>

Overview:
"""""""""

The '``catchret``' instruction is a terminator instruction that has a
single successor.

Arguments:
""""""""""

The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.

Semantics:
""""""""""

The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.

Example:
""""""""

.. code-block:: text

      catchret from %catch to label %continue

.. _i_cleanupret:

'``cleanupret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cleanupret from <value> unwind label <continue>
      cleanupret from <value> unwind to caller

Overview:
"""""""""

The '``cleanupret``' instruction is a terminator instruction that has
an optional successor.

Arguments:
""""""""""

The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.

The '``cleanupret``' instruction also has an optional successor, ``continue``,
which must be the label of another basic block beginning with either a
``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
be a legal target with respect to the ``parent`` links, as described in the
`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

Semantics:
""""""""""

The '``cleanupret``' instruction indicates to the
:ref:`personality function <personalityfn>` that one
:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
It transfers control to ``continue`` or unwinds out of the function.

Example:
""""""""

.. code-block:: text

      cleanupret from %cleanup unwind to caller
      cleanupret from %cleanup unwind label %continue

.. _i_unreachable:

'``unreachable``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      unreachable

Overview:
"""""""""

The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used to indicate that the code
after a no-return function cannot be reached, and other facts.

Semantics:
""""""""""

The '``unreachable``' instruction has no defined semantics.

.. _unaryops:

Unary Operations
----------------

Unary operators require a single operand, execute an operation on
it, and produce a single value. The operand might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operand.

.. _i_fneg:

'``fneg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result

Overview:
"""""""""

The '``fneg``' instruction returns the negation of its operand.

Arguments:
""""""""""

The argument to the '``fneg``' instruction must be a
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values.

Semantics:
""""""""""

The value produced is a copy of the operand with its sign bit flipped.
The value is otherwise completely identical; in particular, if the input is a
NaN, then the quiet/signaling bit and payload are perfectly preserved.
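
The bit-level effect described above can be sketched with integer operations.
This is an illustration only, not a definition of the instruction: the
function name ``@fneg_via_bits`` is hypothetical, and the constant assumes the
32-bit ``float`` layout in which bit 31 is the sign bit.

.. code-block:: llvm

   ; Hypothetical helper: toggles only the sign bit of a 32-bit float.
   define float @fneg_via_bits(float %x) {
     %bits = bitcast float %x to i32
     %flipped = xor i32 %bits, -2147483648   ; 0x80000000, the sign bit
     %res = bitcast i32 %flipped to float
     ret float %res
   }

Every bit other than the sign bit passes through unchanged, which is why a NaN
payload survives the operation.
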

This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fneg float %val          ; yields float:result = -%val

.. _binaryops:

Binary Operations
-----------------

Binary operators are used to do most of the computation in a program.
They require two operands of the same type, execute an operation on
them, and produce a single value. The operands might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operands.

There are several different binary operators:

.. _i_add:

'``add``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = add <ty> <op1>, <op2>          ; yields ty:result
      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``add``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``add``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer sum of the two operands.

If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively.
If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = add i32 4, %var          ; yields i32:result = 4 + %var

.. _i_fadd:

'``fadd``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fadd``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``fadd``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

.. _i_sub:

'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``sub``' instruction returns the difference of its two operands.

Note that the '``sub``' instruction is used to represent the '``neg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``sub``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val

.. _i_fsub:

'``fsub``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fsub``' instruction returns the difference of its two operands.

Arguments:
""""""""""

The two arguments to the '``fsub``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val

.. _i_mul:

'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``mul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``mul``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer product of the two operands.

If the result of the multiplication has unsigned overflow, the result
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
bit width of the result.

Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively.
If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

.. _i_fmul:

'``fmul``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fmul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``fmul``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

.. _i_udiv:

'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``udiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``udiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are
distinct operations; for signed integer division, use '``sdiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.

If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, "((a udiv exact b) mul b) == a").

Example:
""""""""

.. code-block:: text

      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_sdiv:

'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero is undefined behavior.
For vectors, if any element
of the divisor is zero, the operation has undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.

Example:
""""""""

.. code-block:: text

      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

.. _i_urem:

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.

Example:
""""""""

.. code-block:: text

      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_srem:

'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)

Example:
""""""""

.. code-block:: text

      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

.. note::

   The instruction is implemented as a call to libm's '``fmod``'
   for some targets, and using the instruction may thus require linking libm.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

.. _i_shl:

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
If the arguments are vectors, each vector element of ``op1`` is shifted
by the corresponding shift amount in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a poison
value if it shifts out any non-zero bits.
If the ``nsw`` keyword is present, then the shift produces a poison
value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison value
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>

.. _i_lshr:

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison value
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

.. _i_ashr:

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison (shift amount equals the bit width)
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>

.. _i_and:

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0

.. _i_or:

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>            ; yields ty:result
      <result> = or disjoint <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

``disjoint`` means that for each bit, that bit is zero in at least one of the
inputs. This allows the ``or`` to be treated as an ``add``, since no carry can
occur from any bit. If the ``disjoint`` keyword is present, the result value of
the ``or`` is a :ref:`poison value <poisonvalues>` if both inputs have a one in
the same bit position.
For vectors, only the element containing the bit is poison.

Example:
""""""""

::

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12

.. _i_xor:

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
      <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type, and will be treated as an unsigned integer.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val`` for a fixed-length vector, the result is a
:ref:`poison value <poisonvalues>`. For a scalable vector, if the value
of ``idx`` exceeds the runtime length of the vector, the result is a
:ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
      <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type, and will be treated as an
unsigned integer.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
if the value of ``idx`` exceeds the runtime length of the vector, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask vector constant
whose element type is ``i32``. The mask vector elements must be constant
integers or ``poison`` values. The result of the instruction is a vector
whose length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. For each element of the result vector, the
shuffle mask selects an element from one of the input vectors to copy
to the result. Non-negative elements in the mask represent an index
into the concatenated pair of input vectors.

A ``poison`` element in the mask vector specifies that the resulting element
is ``poison``.
For backwards-compatibility reasons, LLVM temporarily also accepts ``undef``
mask elements, which will be interpreted the same way as ``poison`` elements.
If the shuffle mask selects an ``undef`` element from one of the input
vectors, the resulting element is ``undef``.
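
As a hedged illustration of the indexing rule above (the function and value
names are hypothetical), with two ``<4 x i32>`` inputs the mask indices 0 to 3
select from the first vector and 4 to 7 select from the second:

.. code-block:: llvm

      define <4 x i32> @pick(<4 x i32> %v1, <4 x i32> %v2) {
        ; Index 1 is element 1 of %v1; indices 5 and 7 are elements 1 and 3
        ; of %v2; the poison mask element makes result element 2 poison.
        %r = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                           <4 x i32> <i32 1, i32 5, i32 poison, i32 7>
        ret <4 x i32> %r
      }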

For scalable vectors, the only valid mask values at present are
``zeroinitializer``, ``undef`` and ``poison``, since we cannot write all indices as
literals for a vector with a length unknown at compile time.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> poison,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> poison,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences to ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} poison, i32 1, 0              ; yields {i32 1, float poison}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1          ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} poison, float %val, 1, 0    ; yields {i32 poison, {float %val}}

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory. In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. If the address space is not explicitly
specified, the object is allocated in the alloca address space from the
:ref:`datalayout string<langref_datalayout>`.

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated, otherwise "NumElements" is defaulted
to be one.

If a constant alignment is specified, the value result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 32``.

The alignment is only optional when parsing textual IR; for in-memory IR,
it is always present. If not specified, the target can choose to align the
allocation on any convenient boundary compatible with the type.

'``type``' may be any sized type.

Structs containing scalable vectors cannot be used in allocas unless all
fields are the same scalable vector type (e.g. ``{<vscale x 2 x i32>,
<vscale x 2 x i32>}`` contains the same type while ``{<vscale x 2 x i32>,
<vscale x 2 x i64>}`` doesn't).

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The allocated memory is
uninitialized, and loading from uninitialized memory produces an undefined
value. The operation itself is undefined if there is insufficient stack
space for the allocation. '``alloca``'d memory is automatically released
when the function returns. The '``alloca``' instruction is commonly used
to represent automatic variables that must have an address available. When
the function returns (either with the ``ret`` or ``resume`` instructions),
the memory is reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
which way the stack grows) is not specified.

Note that '``alloca``' outside of the alloca address space from the
:ref:`datalayout string<langref_datalayout>` is meaningful only if the
target has assigned it a semantics.

If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
the returned object is initially dead.
See :ref:`llvm.lifetime.start <int_lifestart>` and
:ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
lifetime-manipulating intrinsics.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                             ; yields ptr
      %ptr = alloca i32, i32 4                      ; yields ptr
      %ptr = alloca i32, i32 4, align 1024          ; yields ptr
      %ptr = alloca i32, align 1024                 ; yields ptr

.. _i_load:

'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = load [volatile] <ty>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
      <result> = load atomic [volatile] <ty>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}
      !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
      !<align_node> = !{ i64 <value_alignment> }

Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit.
``align`` must be
explicitly specified on atomic loads. Note: if the alignment is not greater than
or equal to the size of the ``<value>`` type, the atomic operation is likely to
require a lock and have poor performance. ``!nontemporal`` does not have any
defined semantics for atomic loads.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). It is the
responsibility of the code emitter to ensure that the alignment information is
correct. Overestimating the alignment results in undefined behavior.
Underestimating the alignment may produce less efficient code. An alignment of
1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
value higher than the size of the loaded type implies memory up to the
alignment value bytes can be safely loaded without trapping in the default
address space. Access of the high bytes can interfere with debugging tools, so
should not be accessed if the function has the ``sanitize_thread`` or
``sanitize_address`` attributes.

The alignment is only optional when parsing textual IR; for in-memory IR, it is
always present. An omitted ``align`` argument means that the operation has the
ABI alignment for the target.

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<nontemp_node>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.
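
A minimal sketch of the ``!nontemporal`` attachment described above; the
function name ``@load_streaming`` and the metadata number ``!0`` are arbitrary
choices for illustration:

.. code-block:: llvm

      define i32 @load_streaming(ptr %p) {
        ; Hint that the loaded data is not expected to be reused in the cache.
        %v = load i32, ptr %p, align 4, !nontemporal !0
        ret i32 %v
      }

      ; The referenced node holds a single i32 entry of value 1.
      !0 = !{i32 1}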

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. If a load instruction tagged with the ``!invariant.load``
metadata is executed, the memory location referenced by the load has
to contain the same value at all points in the program where the
memory location is dereferenceable; otherwise, the behavior is
undefined.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. If the value is null at runtime, a poison value is returned
instead. This is analogous to the ``nonnull`` attribute on parameters and
return values. This metadata can only be applied to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
<md_dereferenceable_or_null>`.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type. If the returned
value is not appropriately aligned at runtime, a poison value is returned
instead.

The optional ``!noundef`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a node with no entries. The existence of
``!noundef`` metadata on the instruction tells the optimizer that the value
loaded is known to be :ref:`well defined <welldefinedvalues>`.
If the value isn't well defined, the behavior is undefined. If the ``!noundef``
metadata is combined with poison-generating metadata like ``!nonnull``,
violation of that metadata constraint will also result in undefined behavior.

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.
If the value being loaded is of aggregate type, the bytes that correspond to
padding may be accessed but are ignored, because it is impossible to observe
padding from the loaded aggregate value.
If ``<pointer>`` is not a well-defined value, the behavior is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields ptr
      store i32 3, ptr %ptr                           ; yields void
      %val = load i32, ptr %ptr                       ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      store [volatile] <ty> <value>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
      store atomic [volatile] <ty> <value>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store and an
address at which to store it. The type of the ``<pointer>`` operand must be a
pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``store`` with other
:ref:`volatile operations <volatile>`. Only values of :ref:`first class
<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
structural type <t_opaque>`) can be stored.

If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores.
The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic stores. Note: if the alignment is not greater than
or equal to the size of the ``<value>`` type, the atomic operation is likely to
require a lock and have poor performance. ``!nontemporal`` does not have any
defined semantics for atomic stores.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). It is the
responsibility of the code emitter to ensure that the alignment information is
correct. Overestimating the alignment results in undefined behavior.
Underestimating the alignment may produce less efficient code. An alignment of
1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be safely stored to without trapping in the default
address space. Access of the high bytes can interfere with debugging tools, so
should not be accessed if the function has the ``sanitize_thread`` or
``sanitize_address`` attributes.

The alignment is only optional when parsing textual IR; for in-memory IR, it is
always present. An omitted ``align`` argument means that the operation has the
ABI alignment for the target.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache.
The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

The optional ``!invariant.group`` metadata must reference a
single metadata name ``<empty_node>``. See ``invariant.group`` metadata.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand. If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.
If ``<value>`` is of aggregate type, padding is filled with
:ref:`undef <undefvalues>`.
If ``<pointer>`` is not a well-defined value, the behavior is undefined.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields ptr
      store i32 3, ptr %ptr                           ; yields void
      %val = load i32, ptr %ptr                       ; yields i32:val = i32 3

.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      fence [syncscope("<target-scope>")] <ordering>  ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
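
As a rough sketch of how these orderings are commonly paired (the globals
``@data`` and ``@flag`` and both function names are hypothetical), a
``release`` fence before a flag store and an ``acquire`` fence after the flag
load can be used to publish a non-atomic write to another thread:

.. code-block:: llvm

      @data = global i32 0
      @flag = global i32 0

      define void @producer() {
        store i32 42, ptr @data, align 4
        ; Orders the write to @data before the flag store below.
        fence release
        store atomic i32 1, ptr @flag monotonic, align 4
        ret void
      }

      define i32 @consumer() {
      entry:
        br label %spin

      spin:
        %f = load atomic i32, ptr @flag monotonic, align 4
        %ready = icmp eq i32 %f, 1
        br i1 %ready, label %done, label %spin

      done:
        ; Pairs with the release fence once the flag store has been observed.
        fence acquire
        %v = load i32, ptr @data, align 4
        ret i32 %v
      }

The atomic accesses to ``@flag`` are only ``monotonic`` in this sketch; the
ordering between the two accesses to ``@data`` is intended to come from the
fences.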

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B. Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or
fences. Furthermore, the global ordering created by a ``seq_cst``
fence must be compatible with the individual total orders of
``monotonic`` (or stronger) memory accesses occurring before and after
such a fence. The exact semantics of this interaction are somewhat
complicated; see the C++ standard's `[atomics.order]
<https://wg21.link/atomics.order>`_ section for more details.

A ``fence`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Example:
""""""""

.. code-block:: text

      fence acquire                                        ; yields void
      fence syncscope("singlethread") seq_cst              ; yields void
      fence syncscope("agent") seq_cst                     ; yields void

.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cmpxchg [weak] [volatile] ptr <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '<cmp>' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '<cmp>' and '<new>' must
have the same type, and the type of '<pointer>' must be a pointer to
that type. If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
``release`` or ``acq_rel``.

A ``cmpxchg`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Note: if the alignment is not greater than or equal to the size of the
``<value>`` type, the atomic operation is likely to require a lock and have
poor performance.

The alignment is only optional when parsing textual IR; for in-memory IR, it is
always present. If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when align
isn't specified.

The pointer passed into cmpxchg must have alignment greater than or
equal to the size in memory of the operand.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
written to the location. The original value at the location is returned,
together with a flag indicating success (true) or failure (false).

If the cmpxchg operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

Example:
""""""""

.. code-block:: llvm

    entry:
      %orig = load atomic i32, ptr %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg ptr %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...
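
For comparison, the same retry loop can be sketched in C11. This is only an
approximate source-level analogue, not a statement of what any front end
emits: the function name ``atomic_square`` is hypothetical, C has no
``unordered`` ordering so the initial load uses ``memory_order_relaxed``, and
``memory_order_relaxed`` is taken to correspond to LLVM's ``monotonic``.

.. code-block:: c

    #include <assert.h>
    #include <stdatomic.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Atomically replace *p with its square. Returns the value that was
       squared, i.e. the value the successful compare-exchange observed. */
    uint32_t atomic_square(_Atomic uint32_t *p) {
        assert(p != NULL);
        uint32_t expected = atomic_load_explicit(p, memory_order_relaxed);
        /* On failure, `expected` is overwritten with the value actually
           found in memory, much like the first member of the { ty, i1 }
           result, and the loop retries with that value. */
        while (!atomic_compare_exchange_strong_explicit(
                   p, &expected, expected * expected,
                   memory_order_acq_rel,      /* success ordering */
                   memory_order_relaxed)) {   /* failure ordering */
        }
        return expected;
    }

The boolean returned by ``atomic_compare_exchange_strong_explicit`` plays the
role of the ``i1`` success flag.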

.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    atomicrmw [volatile] <operation> ptr <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin
-  fadd
-  fsub
-  fmax
-  fmin
-  uinc_wrap
-  udec_wrap
-  usub_cond
-  usub_sat

For most of these operations, the type of '<value>' must be an integer
type whose bit width is a power of two greater than or equal to eight
and less than or equal to a target-specific size limit. For xchg, this
may also be a floating point or a pointer type with the same size constraints
as integers. For fadd/fsub/fmax/fmin, this must be a floating-point
or fixed vector of floating-point type. The type of the '``<pointer>``'
operand must be a pointer to that type. If the ``atomicrmw`` is marked
as ``volatile``, then the optimizer is not allowed to modify the
number or order of execution of this ``atomicrmw`` with other
:ref:`volatile operations <volatile>`.

Note: if the alignment is not greater than or equal to the size of the
``<value>`` type, the atomic operation is likely to require a lock and have
poor performance.

The alignment is only optional when parsing textual IR; for in-memory IR, it is
always present.
If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when align
isn't specified.

An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
-  fmax: ``*ptr = maxnum(*ptr, val)`` (matches the ``llvm.maxnum.*`` intrinsic)
-  fmin: ``*ptr = minnum(*ptr, val)`` (matches the ``llvm.minnum.*`` intrinsic)
-  uinc_wrap: ``*ptr = (*ptr u>= val) ? 0 : (*ptr + 1)`` (increment value with wraparound to zero when incremented above input value)
-  udec_wrap: ``*ptr = ((*ptr == 0) || (*ptr u> val)) ? val : (*ptr - 1)`` (decrement with wraparound to input value when decremented below zero).
-  usub_cond: ``*ptr = (*ptr u>= val) ? *ptr - val : *ptr`` (subtract only if no unsigned overflow).
-  usub_sat: ``*ptr = (*ptr u>= val) ? *ptr - val : 0`` (subtract with unsigned clamping to zero).
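
The less familiar update rules in the list above can be transcribed into
plain C for 32-bit unsigned operands, as in the sketch below. The helper
``atomic_rmw_u32`` is a hypothetical emulation built on a C11 compare-exchange
loop, with ``seq_cst`` chosen arbitrarily; it is meant to show the "store the
new value, return the original one" contract, not how a target actually
lowers ``atomicrmw``.

.. code-block:: c

    #include <assert.h>
    #include <stdatomic.h>
    #include <stddef.h>
    #include <stdint.h>

    /* New-value rules transcribed from the list above. */
    uint32_t uinc_wrap(uint32_t old, uint32_t val) {
        return old >= val ? 0 : old + 1;
    }
    uint32_t udec_wrap(uint32_t old, uint32_t val) {
        return (old == 0 || old > val) ? val : old - 1;
    }
    uint32_t usub_cond(uint32_t old, uint32_t val) {
        return old >= val ? old - val : old;
    }
    uint32_t usub_sat(uint32_t old, uint32_t val) {
        return old >= val ? old - val : 0;
    }

    typedef uint32_t (*rmw_op)(uint32_t old, uint32_t val);

    /* Hypothetical emulation: atomically store op(old, val) and return the
       original value, as atomicrmw does. */
    uint32_t atomic_rmw_u32(_Atomic uint32_t *p, uint32_t val, rmw_op op) {
        assert(p != NULL && op != NULL);
        uint32_t old = atomic_load_explicit(p, memory_order_relaxed);
        while (!atomic_compare_exchange_weak_explicit(
                   p, &old, op(old, val),
                   memory_order_seq_cst, memory_order_relaxed)) {
            /* `old` now holds the freshly observed value; retry. */
        }
        return old;
    }

For example, applying ``uinc_wrap`` with an input value of 5 to a location
holding 5 stores 0 and returns 5.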


Example:
""""""""

.. code-block:: llvm

    %old = atomicrmw add ptr %ptr, i32 1 acquire                        ; yields i32

.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = getelementptr <ty>, ptr <ptrval>{, <ty> <idx>}*
    <result> = getelementptr inbounds <ty>, ptr <ptrval>{, <ty> <idx>}*
    <result> = getelementptr nusw <ty>, ptr <ptrval>{, <ty> <idx>}*
    <result> = getelementptr nuw <ty>, ptr <ptrval>{, <ty> <idx>}*
    <result> = getelementptr inrange(S,E) <ty>, ptr <ptrval>{, <ty> <idx>}*
    <result> = getelementptr <ty>, <N x ptr> <ptrval>, <vector index type> <idx>

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory. The instruction can also
be used to calculate a vector of such addresses.

Arguments:
""""""""""

The first argument is always a type used as the basis for the calculations.
The second argument is always a pointer or a vector of pointers, and is the
base address to start from. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
second argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value; subsequent types can be arrays, vectors, and structs.
Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant). When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }

The LLVM code generated by Clang is approximately:

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define ptr @foo(ptr %s) {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret ptr %arrayidx
    }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array.
The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element.

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

    define ptr @foo(ptr %s) {
      %t1 = getelementptr %struct.ST, ptr %s, i32 1
      %t2 = getelementptr %struct.ST, ptr %t1, i32 0, i32 2
      %t3 = getelementptr %struct.RT, ptr %t2, i32 0, i32 1
      %t4 = getelementptr [10 x [20 x i32]], ptr %t3, i32 0, i32 5
      %t5 = getelementptr [20 x i32], ptr %t4, i32 0, i32 13
      ret ptr %t5
    }

The indices are first converted to offsets in the pointer's index type. If the
currently indexed type is a struct type, the struct offset corresponding to the
index is sign-extended or truncated to the pointer index type. Otherwise, the
index itself is sign-extended or truncated, and then multiplied by the type
allocation size (that is, the size rounded up to the ABI alignment) of the
currently indexed type.

The offsets are then added to the low bits of the base address up to the index
type width, with silently-wrapping two's complement arithmetic. If the pointer
size is larger than the index size, this means that the bits outside the index
type width will not be affected.

The result value of the ``getelementptr`` may be outside the object pointed
to by the base pointer. The result value may not necessarily be used to access
memory though, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.

The ``getelementptr`` instruction may have a number of attributes that impose
additional rules.
If any of the rules are violated, the result value is a
:ref:`poison value <poisonvalues>`. In cases where the base is a vector of
pointers, the attributes apply to each computation element-wise.

For ``nusw`` (no unsigned signed wrap):

 * If the type of an index is larger than the pointer index type, the
   truncation to the pointer index type preserves the signed value
   (``trunc nsw``).
 * The multiplication of an index by the type size does not wrap the pointer
   index type in a signed sense (``mul nsw``).
 * The successive addition of each offset (without adding the base address)
   does not wrap the pointer index type in a signed sense (``add nsw``).
 * The successive addition of the current address, truncated to the pointer
   index type and interpreted as an unsigned number, and each offset,
   interpreted as a signed number, does not wrap the pointer index type.

For ``nuw`` (no unsigned wrap):

 * If the type of an index is larger than the pointer index type, the
   truncation to the pointer index type preserves the unsigned value
   (``trunc nuw``).
 * The multiplication of an index by the type size does not wrap the pointer
   index type in an unsigned sense (``mul nuw``).
 * The successive addition of each offset (without adding the base address)
   does not wrap the pointer index type in an unsigned sense (``add nuw``).
 * The successive addition of the current address, truncated to the pointer
   index type and interpreted as an unsigned number, and each offset, also
   interpreted as an unsigned number, does not wrap the pointer index type
   (``add nuw``).

For ``inbounds`` all rules of the ``nusw`` attribute apply.
Additionally,
if the ``getelementptr`` has any non-zero indices, the following rules apply:

 * The base pointer has an *in bounds* address of the allocated object that it
   is :ref:`based <pointeraliasing>` on. This means that it points into that
   allocated object, or to its end. Note that the object does not have to be
   live anymore; being in-bounds of a deallocated object is sufficient.
 * During the successive addition of offsets to the address, the resulting
   pointer must remain *in bounds* of the allocated object at each step.

Note that ``getelementptr`` with all-zero indices is always considered to be
``inbounds``, even if the base pointer does not point to an allocated object.
As a corollary, the only pointer in bounds of the null pointer in the default
address space is the null pointer itself.

These rules are based on the assumption that no allocated object may cross
the unsigned address space boundary, and no allocated object may be larger
than half the pointer index type space.

If ``inbounds`` is present on a ``getelementptr`` instruction, the ``nusw``
attribute will be automatically set as well. For this reason, the ``nusw``
will also not be printed in textual IR if ``inbounds`` is already present.

If the ``inrange(Start, End)`` attribute is present, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside the half-open range
``[Start, End)`` from the ``getelementptr`` expression result. The result of
a pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons
in the case where both operands are in the closed range ``[Start, End]``.
Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.

The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

    %aptr = getelementptr {i32, [12 x i8]}, ptr %saptr, i64 0, i32 1
    %vptr = getelementptr {i32, <2 x i8>}, ptr %svptr, i64 0, i32 1, i32 1
    %eptr = getelementptr [12 x i8], ptr %aptr, i64 0, i32 1
    %iptr = getelementptr [10 x i32], ptr @arr, i16 0, i16 0

Vector of pointers:
"""""""""""""""""""

The ``getelementptr`` returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such cases, all vector
arguments should have the same number of elements, and every scalar argument
will be effectively broadcast into a vector during address calculation.

.. code-block:: llvm

    ; All arguments are vectors:
    ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
    %A = getelementptr i8, <4 x ptr> %ptrs, <4 x i64> %offsets

    ; Add the same scalar offset to each pointer of a vector:
    ;   A[i] = ptrs[i] + offset*sizeof(i8)
    %A = getelementptr i8, <4 x ptr> %ptrs, i64 %offset

    ; Add distinct offsets to the same pointer:
    ;   A[i] = ptr + offsets[i]*sizeof(i8)
    %A = getelementptr i8, ptr %ptr, <4 x i64> %offsets

    ; In all cases described above the type of the result is <4 x ptr>

The two following instructions are equivalent:

.. code-block:: llvm

    getelementptr  %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
      <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
      <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
      <4 x i32> %ind4,
      <4 x i64> <i64 13, i64 13, i64 13, i64 13>

    getelementptr  %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
      i32 2, i32 1, <4 x i32> %ind4, i64 13

Let's look at the C code, where the vector version of ``getelementptr``
makes sense:

.. code-block:: c

    // Let's assume that we vectorize the following loop:
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      A[i] = B[C[i]];
    }

.. code-block:: llvm

    ; get pointers for 8 elements from array B
    %ptrs = getelementptr double, ptr %B, <8 x i32> %C
    ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x ptr> %ptrs,
         i32 8, <8 x i1> %mask, <8 x double> %passthru)

Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.

.. _i_trunc:

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = trunc <ty> <value> to <ty2>             ; yields ty2
    <result> = trunc nsw <ty> <value> to <ty2>         ; yields ty2
    <result> = trunc nuw <ty> <value> to <ty2>         ; yields ty2
    <result> = trunc nuw nsw <ty> <value> to <ty2>     ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to trunc, and a type to trunc
it to. Both types must be of :ref:`integer <t_integer>` types, or vectors
of the same number of integers.
The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

If the ``nuw`` keyword is present, and any of the truncated bits are non-zero,
the result is a :ref:`poison value <poisonvalues>`. If the ``nsw`` keyword
is present, and any of the truncated bits are not the same as the top bit
of the truncation result, the result is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

    %X = trunc i32 257 to i8                        ; yields i8:1
    %Y = trunc i32 123 to i1                        ; yields i1:true
    %Z = trunc i32 122 to i1                        ; yields i1:false
    %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

.. _i_zext:

'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

The ``nneg`` (non-negative) flag, if present, specifies that the operand is
non-negative. This property may be used by optimization passes to later
convert the ``zext`` into a ``sext``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
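
To make the role of ``nneg`` concrete, the following C sketch models an
``i8`` to ``i16`` extension on raw bit patterns. The function names are
hypothetical and the code only illustrates the arithmetic: zero extension and
sign extension agree exactly when the sign bit of the narrow operand is
clear, which is why a ``zext`` known to be non-negative can be treated as a
``sext``.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Zero extension: the eight new high bits are zero. */
    uint16_t zext_i8_to_i16(uint8_t v) {
        return (uint16_t)v;
    }

    /* Sign extension: the eight new high bits copy bit 7 of the operand. */
    uint16_t sext_i8_to_i16(uint8_t v) {
        return (uint16_t)(v | ((v & 0x80u) ? 0xFF00u : 0u));
    }

    /* The nneg precondition: the operand's sign bit is clear. */
    bool is_nneg_i8(uint8_t v) {
        return (v & 0x80u) == 0;
    }

For the bit pattern ``0xFF`` (``i8 -1``) the two extensions differ, so a
``zext nneg`` of that operand has no meaningful result.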

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from i1, the result will always be either 0 or 1.

If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result
is a poison value.

Example:
""""""""

.. code-block:: llvm

    %X = zext i32 257 to i64              ; yields i64:257
    %Y = zext i1 true to i32              ; yields i32:1
    %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

    %a = zext nneg i8 127 to i16 ; yields i16 127
    %b = zext nneg i8 -1 to i16  ; yields i16 poison

.. _i_sext:

'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sext``' sign extends ``value`` to the type ``ty2``.

Arguments:
""""""""""

The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.

When sign extending from i1, the extension always results in -1 or 0.

Example:
""""""""

.. code-block:: llvm

    %X = sext i8  -1 to i16              ; yields i16:65535
    %Y = sext i1 true to i32             ; yields i32:-1
    %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

.. _i_fptrunc:

'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptrunc [fast-math flags]* <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
The size of ``value`` must be larger than the size of ``ty2``. This
implies that ``fptrunc`` cannot be used to make a *no-op cast*.

Semantics:
""""""""""

The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.

NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
NaN payload is propagated from the input ("Quieting NaN propagation" or
"Unchanged NaN propagation" cases), then the low order bits of the NaN payload
which cannot fit in the resulting type are discarded. Note that if discarding
the low order bits leads to an all-0 payload, this cannot be represented as a
signaling NaN (it would represent an infinity instead), so in that case
"Unchanged NaN propagation" is not possible.

This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: llvm

    %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
    %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity

.. _i_fpext:

'``fpext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fpext [fast-math flags]* <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fpext``' extends a floating-point ``value`` to a larger floating-point
value.

Arguments:
""""""""""

The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
to. The source type must be smaller than the destination type.

Semantics:
""""""""""

The '``fpext``' instruction extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The ``fpext`` cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.

NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
NaN payload is propagated from the input ("Quieting NaN propagation" or
"Unchanged NaN propagation" cases), then it is copied to the high order bits of
the resulting payload, and the remaining low order bits are zero.

This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: llvm

    %X = fpext float 3.125 to double         ; yields double:3.125000e+00
    %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000

'``fptoui .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptoui <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptoui``' converts a floating-point ``value`` to its unsigned
integer equivalent of type ``ty2``.

Arguments:
""""""""""

The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptoui``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

    %X = fptoui double 123.0 to i32      ; yields i32:123
    %Y = fptoui float 1.0E+300 to i1     ; yields undefined:1
    %Z = fptoui float 1.04E+17 to i8     ; yields undefined:1

'``fptosi .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptosi <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type.
If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptosi``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

    %X = fptosi double -123.0 to i32      ; yields i32:-123
    %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
    %Z = fptosi float 1.04E+17 to i8      ; yields undefined:1

'``uitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = uitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``uitofp``' instruction regards ``value`` as an unsigned integer
and converts that value to the ``ty2`` type.

The ``nneg`` (non-negative) flag, if present, specifies that the
operand is non-negative. This property may be used by optimization
passes to later convert the ``uitofp`` into a ``sitofp``.

Arguments:
""""""""""

The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating-point
value. If the value cannot be exactly represented, it is rounded using
the default rounding mode.
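
The rounding case can be seen with a small C sketch. It assumes that
``float`` is IEEE 754 binary32 and that the rounding mode in effect is
round-to-nearest, ties-to-even; the function names are hypothetical. Under
those assumptions 16777217 (2^24 + 1) is not representable as a ``float`` and
rounds to 16777216.

.. code-block:: c

    #include <stdint.h>

    /* Unsigned 32-bit integer to float. The volatile store forces the value
       to be rounded to float precision even on targets that evaluate
       floating-point expressions in a wider format. */
    float uitofp_i32_to_float(uint32_t v) {
        volatile float f = (float)v;
        return f;
    }

    /* Unsigned 8-bit integer to double: every input is exactly representable. */
    double uitofp_i8_to_double(uint8_t v) {
        return (double)v;
    }

Passing the bit pattern of ``i8 -1`` (that is, ``0xFF``) to
``uitofp_i8_to_double`` gives 255.0, because the operand is read as unsigned.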

If the ``nneg`` flag is set, and the ``uitofp`` argument is negative,
the result is a poison value.


Example:
""""""""

.. code-block:: llvm

      %X = uitofp i32 257 to float         ; yields float:257.0
      %Y = uitofp i8 -1 to double          ; yields double:255.0

      %a = uitofp nneg i32 256 to float    ; yields float:256.0
      %b = uitofp nneg i32 -256 to float   ; yields float poison

'``sitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sitofp``' instruction regards ``value`` as a signed integer and
converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.

Example:
""""""""

.. code-block:: llvm

      %X = sitofp i32 257 to float         ; yields float:257.0
      %Y = sitofp i8 -1 to double          ; yields double:-1.0

.. _i_ptrtoint:

'``ptrtoint .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``ptrtoint``' instruction converts the pointer or a vector of
pointers ``value`` to the integer (or vector of integers) type ``ty2``.

Arguments:
""""""""""

The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.

Semantics:
""""""""""

The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done. If
``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

Example:
""""""""

.. code-block:: llvm

      %X = ptrtoint ptr %P to i8                         ; yields truncation on 32-bit architecture
      %Y = ptrtoint ptr %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x ptr> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture

.. _i_inttoptr:

'``inttoptr .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2

Overview:
"""""""""

The '``inttoptr``' instruction converts an integer ``value`` to a
pointer type, ``ty2``.
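
The following sketch pairs '``inttoptr``' with '``ptrtoint``' to show the
two conversions side by side. It assumes a target with 64-bit pointers, so
that both casts are same-size conversions. The function name ``@round_trip``
is hypothetical.

.. code-block:: llvm

      ; Assumption: pointers are 64 bits wide on the target.
      define ptr @round_trip(ptr %p) {
        ; Same size as the pointer, so neither truncation nor extension occurs.
        %addr = ptrtoint ptr %p to i64
        ; Converts the integer back to a pointer, again without resizing.
        %q = inttoptr i64 %addr to ptr
        ret ptr %q
      }

On a target with 32-bit pointers, the first cast would zero extend and the
second would truncate.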

Arguments:
""""""""""

The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata.

Semantics:
""""""""""

The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
applying either a zero extension or a truncation depending on the size
of the integer ``value``. If ``value`` is larger than the size of a
pointer then a truncation is done. If ``value`` is smaller than the size
of a pointer then a zero extension is done. If they are the same size,
nothing is done (*no-op cast*).

Example:
""""""""

.. code-block:: llvm

      %X = inttoptr i32 255 to ptr           ; yields zero extension on 64-bit architecture
      %Y = inttoptr i32 255 to ptr           ; yields no-op on 32-bit architecture
      %Z = inttoptr i64 0 to ptr             ; yields truncation on 32-bit architecture
      %Z = inttoptr <4 x i32> %G to <4 x ptr>; yields truncation of vector G to four pointers

.. _i_bitcast:

'``bitcast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.
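
A minimal sketch of a bit-preserving conversion, reinterpreting a ``float``
as an ``i32`` of the same width. The function name ``@float_bits`` is
hypothetical.

.. code-block:: llvm

      define i32 @float_bits(float %f) {
        ; Both types are 32 bits wide, so the cast is legal and no bits change.
        %bits = bitcast float %f to i32
        ret i32 %bits
      }

For example, passing ``float 1.0`` would return the IEEE-754 single-precision
encoding ``0x3F800000`` as an integer.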

Arguments:
""""""""""

The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

There is a caveat for bitcasts involving vector types in relation to
endianness. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
of the vector in the least significant bits of the i16 for little-endian while
element zero ends up in the most significant bits for big-endian.

Example:
""""""""

.. code-block:: text

      %X = bitcast i8 255 to i8          ; yields i8 :-1
      %Y = bitcast i32* %x to i16*       ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64;  ; yields i64: %V (depends on endianness)
      %Z = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>

.. _i_addrspacecast:

'``addrspacecast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2

Overview:
"""""""""

The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
address space ``n`` to type ``pty2`` in address space ``m``.

Arguments:
""""""""""

The '``addrspacecast``' instruction takes a pointer or vector of pointer value
to cast and a pointer type to cast it to, which must have a different
address space.

Semantics:
""""""""""

The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address
space conversion produces a dereferenceable result then both result
and operand refer to the same memory location. The conversion must
have no side effects, and must not capture the value of the pointer.

If the source is :ref:`poison <poisonvalues>`, the result is
:ref:`poison <poisonvalues>`.

If the source is not :ref:`poison <poisonvalues>`, and both source and
destination are :ref:`integral pointers <nointptrtype>`, and the
result pointer is dereferenceable, the cast is assumed to be
reversible (i.e. casting the result back to the original address space
should yield the original bit pattern).

Example:
""""""""

.. code-block:: llvm

      %X = addrspacecast ptr %x to ptr addrspace(1)
      %Y = addrspacecast ptr addrspace(1) %y to ptr addrspace(2)
      %Z = addrspacecast <4 x ptr> %z to <4 x ptr addrspace(3)>

.. _otherops:

Other Operations
----------------

The instructions in this category are the "miscellaneous" instructions,
which defy better classification.

.. _i_icmp:

'``icmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
      <result> = icmp samesign <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``icmp``' instruction returns a boolean value or a vector of
boolean values based on comparison of its two integer, integer vector,
pointer, or pointer vector operands.

Arguments:
""""""""""

The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

.. _icmp_md_cc:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal

The remaining two arguments must be :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
must also be identical types.

Semantics:
""""""""""

The '``icmp``' compares ``op1`` and ``op2`` according to the condition
code given as ``cond``. The comparison performed always yields either an
:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:

.. _icmp_md_cc_sem:

#. ``eq``: yields ``true`` if the operands are equal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ugt``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than ``op2``.
#. ``uge``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than or equal to ``op2``.
#. ``ult``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than ``op2``.
#. ``ule``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than or equal to ``op2``.
#. ``sgt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than ``op2``.
#. ``sge``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than or equal to ``op2``.
#. ``slt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than ``op2``.
#. ``sle``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than or equal to ``op2``.

If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
are compared as if they were integers.

If the operands are integer vectors, then they are compared element by
element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.

If the ``samesign`` keyword is present and the operands are not of the
same sign then the result is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = icmp eq i32 4, 5          ; yields: result=false
      <result> = icmp ne ptr %X, %X        ; yields: result=false
      <result> = icmp ult i16  4, 5        ; yields: result=true
      <result> = icmp sgt i16  4, 5        ; yields: result=false
      <result> = icmp ule i16 -4, 5        ; yields: result=false
      <result> = icmp sge i16  4, 5        ; yields: result=false

.. _i_fcmp:

'``fcmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``fcmp``' instruction returns a boolean value or vector of boolean
values based on comparison of its operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

Arguments:
""""""""""

The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no nans)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either nans)
#. ``true``: no comparison, always returns true

*Ordered* means that neither operand is a QNAN while *unordered* means
that either operand may be a QNAN.

Each of the ``op1`` and ``op2`` arguments must be either a
:ref:`floating-point <t_floating>` type or a :ref:`vector <t_vector>` of
floating-point type. They must have identical types.

Semantics:
""""""""""

The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

#. ``false``: always yields ``false``, regardless of operands.
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
   is equal to ``op2``.
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than ``op2``.
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than or equal to ``op2``.
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than ``op2``.
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than or equal to ``op2``.
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
   is not equal to ``op2``.
#. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than ``op2``.
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than or equal to ``op2``.
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
   not equal to ``op2``.
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.

The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating-point optimizations.

Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.

Example:
""""""""

.. code-block:: text

      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

.. _i_phi:

'``phi``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...

Overview:
"""""""""

The '``phi``' instruction is used to implement the φ node in the SSA
graph representing the function.

Arguments:
""""""""""

The type of the incoming values is specified with the first type field.
After this, the '``phi``' instruction takes a list of pairs as
arguments, with one pair for each predecessor basic block of the current
block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.
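
The pairing of one incoming value with each predecessor block can be
sketched with a simple diamond-shaped control flow. The function and block
names below are hypothetical and chosen only for illustration.

.. code-block:: llvm

      define i32 @abs_diff(i32 %a, i32 %b) {
      entry:
        %cmp = icmp ugt i32 %a, %b
        br i1 %cmp, label %a_bigger, label %b_bigger

      a_bigger:
        %d1 = sub i32 %a, %b
        br label %join

      b_bigger:
        %d2 = sub i32 %b, %a
        br label %join

      join:
        ; One [value, label] pair per predecessor of %join.
        %d = phi i32 [ %d1, %a_bigger ], [ %d2, %b_bigger ]
        ret i32 %d
      }

Here ``%join`` has exactly two predecessors, so the '``phi``' lists exactly
two pairs, and each value is defined in the block named by its label.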

There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.

For the purposes of the SSA form, the use of each incoming value is
deemed to occur on the edge from the corresponding predecessor block to
the current block (but after any definition of an '``invoke``'
instruction's return value on the same edge).

The optional ``fast-math-flags`` marker indicates that the phi has one
or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
to enable otherwise unsafe floating-point optimizations. Fast-math-flags
are only valid for phis that return :ref:`supported floating-point types
<fastmath_return_types>`.

Semantics:
""""""""""

At runtime, the '``phi``' instruction logically takes on the value
specified by the pair corresponding to the predecessor basic block that
executed just prior to the current block.

Example:
""""""""

.. code-block:: llvm

    Loop:       ; Infinite loop that counts from 0 on up...
      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
      %nextindvar = add i32 %indvar, 1
      br label %Loop

.. _i_select:

'``select``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty

      selty is either i1 or {<N x i1>}

Overview:
"""""""""

The '``select``' instruction is used to choose one value based on a
condition, without IR-level branching.

Arguments:
""""""""""

The '``select``' instruction requires an 'i1' value or a vector of 'i1'
values indicating the condition, and two values of the same :ref:`first
class <t_firstclass>` type.

#. The optional ``fast-math flags`` marker indicates that the select has one or more
   :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for selects that return :ref:`supported floating-point types
   <fastmath_return_types>`.

Semantics:
""""""""""

If the condition is an i1 and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.

If the condition is a vector of i1, then the value arguments must be
vectors of the same size, and the selection is done element by element.

If the condition is an i1 and the value arguments are vectors of the
same size, then an entire vector is selected.

Example:
""""""""

.. code-block:: llvm

      %X = select i1 true, i8 17, i8 42          ; yields i8:17


.. _i_freeze:

'``freeze``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = freeze ty <val>    ; yields ty:result

Overview:
"""""""""

The '``freeze``' instruction is used to stop propagation of
:ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.

Arguments:
""""""""""

The '``freeze``' instruction takes a single argument.

Semantics:
""""""""""

If the argument is ``undef`` or ``poison``, '``freeze``' returns an
arbitrary, but fixed, value of type '``ty``'.
Otherwise, this instruction is a no-op and returns the input argument.
All uses of a value returned by the same '``freeze``' instruction are
guaranteed to always observe the same value, while different '``freeze``'
instructions may yield different values.

While ``undef`` and ``poison`` pointers can be frozen, the result is a
non-dereferenceable pointer. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
If an aggregate value or vector is frozen, the operand is frozen element-wise.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.


Example:
""""""""

.. code-block:: text

      %w = i32 undef
      %x = freeze i32 %w
      %y = add i32 %w, %w         ; undef
      %z = add i32 %x, %x         ; even number because all uses of %x observe
                                  ; the same value
      %x2 = freeze i32 %w
      %cmp = icmp eq i32 %x, %x2  ; can be true or false

      ; example with vectors
      %v = <2 x i32> <i32 undef, i32 poison>
      %a = extractelement <2 x i32> %v, i32 0    ; undef
      %b = extractelement <2 x i32> %v, i32 1    ; poison
      %add = add i32 %a, %a                      ; undef

      %v.fr = freeze <2 x i32> %v                ; element-wise freeze
      %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
      %add.f = add i32 %d, %d                    ; even number

      ; branching on frozen value
      %poison = add nsw i1 %k, undef   ; poison
      %c = freeze i1 %poison
      br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar


.. _i_call:

'``call``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
                 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]

Overview:
"""""""""

The '``call``' instruction represents a simple function call.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#tail-call-optimization>`_. The
   ``musttail`` marker means that the call must be tail call optimized in order
   for the program to be correct. This is true even in the presence of
   attributes like "disable-tail-calls". The ``musttail`` marker provides these
   guarantees:

   - The call will not cause unbounded stack growth if it is part of a
     recursive cycle in the call graph.
   - Arguments with the :ref:`inalloca <attr_inalloca>` or
     :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
   - If the musttail call appears in a function with the ``"thunk"`` attribute
     and the caller and callee both have varargs, then any unprototyped
     arguments in register or memory are forwarded to the callee. Similarly,
     the return value of the callee is returned to the caller's caller, even
     if a void return type is in use.

   Both markers imply that the callee does not access allocas, va_args, or
   byval arguments from the caller. As an exception to that, an alloca or byval
   argument may be passed to the callee as a byval argument, which can be
   dereferenced inside the callee. For example:

   .. code-block:: llvm

       declare void @take_byval(ptr byval(i64))
       declare void @take_ptr(ptr)

       ; Invalid (assuming @take_ptr dereferences the pointer), because %local
       ; may be de-allocated before the call to @take_ptr.
       define void @invalid_alloca() {
       entry:
         %local = alloca i64
         tail call void @take_ptr(ptr %local)
         ret void
       }

       ; Valid, the byval attribute causes the memory allocated by %local to be
       ; copied into @take_byval's stack frame.
       define void @byval_alloca() {
       entry:
         %local = alloca i64
         tail call void @take_byval(ptr byval(i64) %local)
         ret void
       }

       ; Invalid, because @use_global_va_list uses the variadic arguments from
       ; @invalid_va_list.
       %struct.va_list = type { ptr }
       @va_list = external global %struct.va_list
       define void @use_global_va_list() {
       entry:
         %arg = va_arg ptr @va_list, i64
         ret void
       }
       define void @invalid_va_list(i32 %a, ...) {
       entry:
         call void @llvm.va_start.p0(ptr @va_list)
         tail call void @use_global_va_list()
         ret void
       }

       ; Valid, byval argument forwarded to tail call as another byval argument.
       define void @forward_byval(ptr byval(i64) %x) {
       entry:
         tail call void @take_byval(ptr byval(i64) %x)
         ret void
       }

       ; Invalid (assuming @take_ptr dereferences the pointer), byval argument
       ; passed to tail callee as non-byval ptr.
       define void @invalid_byval(ptr byval(i64) %x) {
       entry:
         tail call void @take_ptr(ptr %x)
         ret void
       }

   Calls marked ``musttail`` must obey the following additional rules:

   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
     or a pointer bitcast followed by a ret instruction.
   - The ret instruction must return the (possibly bitcasted) value
     produced by the call, undef, or void.
   - The calling conventions of the caller and callee must match.
   - The callee must be varargs iff the caller is varargs. Bitcasting a
     non-varargs function to the appropriate varargs type is legal so
     long as the non-varargs prefixes obey the other rules.
   - The return type must not undergo automatic conversion to an ``sret`` pointer.

   In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:

   - All ABI-impacting function attributes, such as sret, byval, inreg,
     returned, and inalloca, must match.
   - The caller and callee prototypes must match. Pointer types of parameters
     or return types may differ in pointee type, but not in address space.

   On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:

   - Only these ABI-impacting attributes are allowed: sret, byval,
     swiftself, and swiftasync.
   - Prototypes are not required to match.

   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
   the following conditions are met:

   - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
   - The call is in tail position (ret immediately follows call and ret
     uses value of call or is void).
   - Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is
     ``true``, or the calling convention is ``tailcc``.
   - `Platform-specific constraints are met.
     <CodeGenerator.html#tail-call-optimization>`_

#. The optional ``notail`` marker indicates that the optimizers should not add
   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
   call optimization from being performed on the call.

#. The optional ``fast-math flags`` marker indicates that the call has one or more
   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for calls that return :ref:`supported floating-point types <fastmath_return_types>`.

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
   The calling convention of the call must match the calling convention of
   the target function, or else the behavior is undefined.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
   attributes are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect ``call``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values.
Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

Example:
""""""""

.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (ptr, ...) @printf(ptr %msg, i32 12, i8 42)        ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()                             ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended

LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.
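
As a minimal sketch of the behavior described above, the hypothetical
function below reads two consecutive variable arguments of different
types. It assumes a target on which ``va_list`` is a single pointer and on
which the code generator supports ``va_arg`` for these types; the name
``@second_vararg`` is illustrative only.

.. code-block:: llvm

    declare void @llvm.va_start.p0(ptr)
    declare void @llvm.va_end.p0(ptr)

    ; Hypothetical helper: skips one i32 and returns the following i64.
    define i64 @second_vararg(i32 %n, ...) {
      %ap = alloca ptr                      ; assumption: va_list is one pointer
      call void @llvm.va_start.p0(ptr %ap)
      %first = va_arg ptr %ap, i32          ; advances %ap past the i32
      %second = va_arg ptr %ap, i64         ; reads the next argument
      call void @llvm.va_end.p0(ptr %ap)
      ret i64 %second
    }

Each ``va_arg`` both yields a value and moves the list forward, which is
why the second read sees the argument after the first one.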

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument. For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be used in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = landingpad <resultty> <clause>+
      <resultval> = landingpad <resultty> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad --- one where the exception lands, and corresponds to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

The optional
``cleanup`` flag indicates that the landing pad block is a cleanup.
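
A cleanup landing pad might look like the following sketch. The functions
``@may_throw`` and ``@release`` are hypothetical, and the Itanium C++
personality ``@__gxx_personality_v0`` is assumed purely for illustration.

.. code-block:: llvm

    declare void @may_throw()
    declare void @release()
    declare i32 @__gxx_personality_v0(...)

    define void @f() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; No catch or filter clause: the block only runs cleanup code.
      %exn = landingpad { ptr, i32 }
              cleanup
      call void @release()
      ; Continue unwinding into the caller.
      resume { ptr, i32 } %exn
    }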

A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
contains the global variable representing the "type" that may be caught
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x ptr] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.

Example:
""""""""

.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { ptr, i32 }
               catch ptr @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { ptr, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { ptr, i32 }
               catch ptr @_ZTIi
               filter [1 x ptr] [ptr @_ZTId]

.. _i_catchpad:

'``catchpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchpad within <catchswitch> [<args>*]

Overview:
"""""""""

The '``catchpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
begins a catch handler --- one where a personality routine attempts to transfer
control to catch an exception.

Arguments:
""""""""""

The ``catchswitch`` operand must always be a token produced by a
:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
ensures that each ``catchpad`` has exactly one predecessor block, and it always
terminates in a ``catchswitch``.

The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown, the
exception is compared against the ``args``. If it doesn't match, control will
not reach the ``catchpad`` instruction. The representation of ``args`` is
entirely target and personality function-specific.
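
To illustrate how personality-specific the ``args`` are, the sketch below
shows a catch-all handler in the style used with the MSVC C++ personality
``@__CxxFrameHandler3``. The operand list ``[ptr null, i32 64, ptr null]``
is an assumption based on what Clang typically emits for ``catch (...)``
with that personality, and ``@may_throw`` is a hypothetical function.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      ; The meaning of these operands is defined by the personality, not by LLVM.
      %tok = catchpad within %cs [ptr null, i32 64, ptr null]
      catchret from %tok to label %done

    done:
      ret void
    }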

Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-PHI instruction of its parent basic block.

The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
instructions is described in the
`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.

When a ``catchpad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [ptr @_ZTIi]

.. _i_cleanuppad:

'``cleanuppad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""

The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction.
If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""

The instruction takes a list of arbitrary values which are interpreted
by the :ref:`personality function <personalityfn>`.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:

-  A cleanup block is a basic block which is the unwind destination of
   an exceptional instruction.
-  A cleanup block must have a '``cleanuppad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``cleanuppad``' instruction within the
   cleanup block.
-  A basic block that is not a cleanup block may not include a
   '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

      %tok = cleanuppad within %cs []

.. _debugrecords:

Debug Records
-------------

Debug records appear interleaved with instructions, but are not instructions;
they are used only to define debug information, and have no effect on generated
code. They are distinguished from instructions by the use of a leading ``#`` and
an extra level of indentation. As an example:

.. code-block:: llvm

  %inst1 = op1 %a, %b
    #dbg_value(%inst1, !10, !DIExpression(), !11)
  %inst2 = op2 %inst1, %c

These debug records replace the prior :ref:`debug intrinsics<dbg_intrinsics>`.
Debug records will be disabled if ``--write-experimental-debuginfo=false`` is
passed to LLVM; it is an error for both records and intrinsics to appear in the
same module. More information about debug records can be found in the `LLVM
Source Level Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.

.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions. Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc.).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that is
added must be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types.
Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not. For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.

:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
that depend on an unnamed type in one of their overloaded argument types get an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
different unnamed types as arguments. (For example:
``llvm.ssa.copy.p0s_s.2(%42*)``) The number is tracked in the LLVM module and
it ensures unique names in the module. While linking together two modules, it is
still possible to get a name clash. In that case one of the names will be
changed by getting a new number.
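
The naming rules for overloaded intrinsics can be sketched with a few
declarations. The ``llvm.ctpop`` forms follow the text above; the vector
form and the ``llvm.memcpy`` declaration are taken from how those
intrinsics are spelled elsewhere in this manual and are included only to
show more than one suffix. The function ``@popcount8`` is hypothetical.

.. code-block:: llvm

    ; Only the result type is overloaded, so a single suffix is needed.
    declare i8 @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)
    declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

    ; Three overloaded types (destination pointer, source pointer, length)
    ; produce three suffixes.
    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define i8 @popcount8(i8 %x) {
      %n = call i8 @llvm.ctpop.i8(i8 %x)
      ret i8 %n
    }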

For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
integer or floating point types should not be relied upon for correct
code generation. In such cases, the recommended approach for target
maintainers when defining intrinsics is to create separate integer and
FP intrinsics rather than rely on overloading. For example, if different
codegen is required for ``llvm.target.foo(<4 x i32>)`` and
``llvm.target.foo(<4 x float>)`` then these should be split into
different intrinsics.

To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions take as arguments pointers to a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used. The intrinsics
are overloaded, and can be used for pointers to different address spaces.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely a ptr.
    %struct.va_list = type { ptr }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, ptr, ptr }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      call void @llvm.va_start.p0(ptr %ap)

      ; Read a single integer argument
      %tmp = va_arg ptr %ap, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end.
      ; The destination must be large enough to hold a whole va_list.
      %aq = alloca %struct.va_list
      call void @llvm.va_copy.p0(ptr %aq, ptr %ap)
      call void @llvm.va_end.p0(ptr %aq)

      ; Stop processing of arguments.
      call void @llvm.va_end.p0(ptr %ap)
      ret i32 %tmp
    }

    declare void @llvm.va_start.p0(ptr)
    declare void @llvm.va_copy.p0(ptr, ptr)
    declare void @llvm.va_end.p0(ptr)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start.p0(ptr <arglist>)
      declare void @llvm.va_start.p5(ptr addrspace(5) <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C. In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end.p0(ptr <arglist>)
      declare void @llvm.va_end.p5(ptr addrspace(5) <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy.p0(ptr <destarglist>, ptr <srcarglist>)
      declare void @llvm.va_copy.p5(ptr addrspace(5) <destarglist>, ptr addrspace(5) <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.
The address spaces of the two arguments must match.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element.
This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.

LLVM provides a second, experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.
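
A minimal sketch of declaring a root is shown below. The function name
``@uses_root`` is hypothetical, the ``"shadow-stack"`` strategy is assumed
only as an example of a GC strategy name, and no metadata is attached
(the second operand is ``null``).

.. code-block:: llvm

    declare void @llvm.gcroot(ptr, ptr)

    ; The enclosing function must name a GC strategy.
    define void @uses_root() gc "shadow-stack" {
    entry:
      ; The root is a stack slot holding one pointer.
      %root = alloca ptr
      call void @llvm.gcroot(ptr %root, ptr null)
      ; ... store GC references into %root and reload them as needed ...
      ret void
    }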

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second pointer (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.

.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.gcread(ptr %ObjPtr, ptr %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(ptr %P1, ptr %Obj, ptr %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
Obj to store to. If the runtime does not require a pointer to the
object, Obj may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.


.. _gc_statepoint:

'``llvm.experimental.gc.statepoint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token
        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
                       ptr elementtype(func_type) <target>,
                       i64 <#call args>, i64 <flags>,
                       ... (call parameters),
                       i64 0, i64 0)

Overview:
"""""""""

The statepoint intrinsic represents a call which is parseable by the
runtime.

Operands:
"""""""""

The 'id' operand is a constant integer that is reported as the ID
field in the generated stackmap. LLVM does not interpret this
parameter in any way and its meaning is up to the statepoint user to
decide.
Note that LLVM is free to duplicate code containing
statepoint calls, and this may transform IR that had a unique 'id' per
lexical call to statepoint to IR that does not.

If 'num patch bytes' is non-zero then the call instruction
corresponding to the statepoint is not emitted and LLVM emits 'num
patch bytes' bytes of nops in its place. LLVM will emit code to
prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
sequence and the latter after the nop sequence. It is expected that
the user will patch over the 'num patch bytes' bytes of nops with a
calling sequence specific to their runtime before executing the
generated machine code. There are no guarantees with respect to the
alignment of the nop sequence. Unlike :doc:`StackMaps`, statepoints do
not have a concept of shadow bytes. Note that semantically the
statepoint still represents a call or invoke to 'target', and the nop
sequence after patching is expected to represent an operation
equivalent to a call or invoke to 'target'.

The 'target' operand is the function actually being called. The operand
must have an :ref:`elementtype <attr_elementtype>` attribute specifying
the function type of the target. The target can be specified as either
a symbolic LLVM function, or as an arbitrary Value of pointer type. Note
that the function type must match the signature of the callee and the
types of the 'call parameters' arguments.

The '#call args' operand is the number of arguments to the actual
call. It must exactly match the number of arguments passed in the
'call parameters' variable length section.

The 'flags' operand is used to specify extra information about the
statepoint. This is currently only used to mark certain statepoints
as GC transitions.
This operand is a 64-bit integer with the following
layout, where bit 0 is the least significant bit:

  +-------+---------------------------------------------------+
  | Bit # | Usage                                             |
  +=======+===================================================+
  | 0     | Set if the statepoint is a GC transition, cleared |
  |       | otherwise.                                        |
  +-------+---------------------------------------------------+
  | 1-63  | Reserved for future use; must be cleared.         |
  +-------+---------------------------------------------------+

The 'call parameters' arguments are simply the arguments which need to
be passed to the call target. They will be lowered according to the
specified calling convention and otherwise handled like a normal call
instruction. The number of arguments must exactly match what is
specified in '#call args'. The types must match the signature of
'target'.

The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced with the corresponding operand bundles. In a future
revision, these now redundant arguments will be removed.

Semantics:
""""""""""

A statepoint is assumed to read and write all memory. As a result,
memory operations cannot be reordered past a statepoint. It is
illegal to mark a statepoint as being either 'readonly' or 'readnone'.

Note that legal IR cannot perform any memory operation on a 'gc
pointer' argument of the statepoint in a location statically reachable
from the statepoint. Instead, the explicitly relocated value (from a
``gc.relocate``) must be used.
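
The sketch below shows how a statepoint and the relocation of a live
pointer might fit together. It is an illustration under several
assumptions: the callee ``@foo`` and the function ``@test`` are
hypothetical, the ``"statepoint-example"`` strategy name and the
``.p0``/``.p1`` name suffixes are assumed, and the ``i32`` widths used for
the '#call args' and 'flags' operands follow the form commonly seen in
LLVM's statepoint tests, which may differ from the syntax summary above.
Check all of these against the LLVM version in use.

.. code-block:: llvm

    declare void @foo()
    declare token @llvm.experimental.gc.statepoint.p0(i64, i32, ptr, i32, i32, ...)
    declare ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token, i32, i32)

    define ptr addrspace(1) @test(ptr addrspace(1) %obj) gc "statepoint-example" {
      ; id = 0, no patch bytes, zero call arguments, no flags,
      ; followed by the two trailing zero constants.
      %tok = call token (i64, i32, ptr, i32, i32, ...)
          @llvm.experimental.gc.statepoint.p0(i64 0, i32 0,
              ptr elementtype(void ()) @foo, i32 0, i32 0, i64 0, i64 0)
          [ "gc-live"(ptr addrspace(1) %obj) ]
      ; %obj must not be used after the statepoint; use the relocated value.
      %obj.relocated = call ptr addrspace(1)
          @llvm.experimental.gc.relocate.p1(token %tok, i32 0, i32 0)
      ret ptr addrspace(1) %obj.relocated
    }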

'``llvm.experimental.gc.result``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type
        @llvm.experimental.gc.result(token %statepoint_token)

Overview:
"""""""""

``gc.result`` extracts the result of the original call instruction
which was replaced by the ``gc.statepoint``. The ``gc.result``
intrinsic is actually a family of three intrinsics due to an
implementation limitation. Other than the type of the return value,
the semantics are the same.

Operands:
"""""""""

The first and only argument is the ``gc.statepoint`` which starts
the safepoint sequence of which this ``gc.result`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

Semantics:
""""""""""

The ``gc.result`` represents the return value of the call target of
the ``statepoint``. The type of the ``gc.result`` must exactly match
the return type of the target. If the call target returns void, there will
be no ``gc.result``.

A ``gc.result`` is modeled as a 'readnone' pure function. It has no
side effects since it is just a projection of the return value of the
previous call represented by the ``gc.statepoint``.

'``llvm.experimental.gc.relocate``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <pointer type>
        @llvm.experimental.gc.relocate(token %statepoint_token,
                                       i32 %base_offset,
                                       i32 %pointer_offset)

Overview:
"""""""""

A ``gc.relocate`` returns the potentially relocated value of a pointer
at the safepoint.

Operands:
"""""""""

The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

The second and third arguments are both indices into operands of the
corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.

The second argument is an index which specifies the allocation for the pointer
being relocated. The associated value must be within the object with which the
pointer being relocated is associated. The optimizer is free to change *which*
interior derived pointer is reported, provided that it does not replace an
actual base pointer with another interior derived pointer. Collectors are
allowed to rely on the base pointer operand remaining an actual base pointer if
so constructed.

The third argument is an index which specifies the (potentially) derived pointer
being relocated. It is legal for this index to be the same as the second
argument if and only if a base pointer is being relocated.

Semantics:
""""""""""

The return value of ``gc.relocate`` is the potentially relocated value
of the pointer specified by its arguments. It is unspecified how the
value of the returned pointer relates to the argument to the
``gc.statepoint`` other than that a) it points to the same source
language object with the same offset, and b) the 'based-on'
relationship of the newly relocated pointers is a projection of the
unrelocated pointers. In particular, the integer value of the pointer
returned is unspecified.

A ``gc.relocate`` is modeled as a ``readnone`` pure function.
It has no
side effects since it is just a way to extract information about work
done during the actual call modeled by the ``gc.statepoint``.

.. _gc.get.pointer.base:

'``llvm.experimental.gc.get.pointer.base``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <pointer type>
        @llvm.experimental.gc.get.pointer.base(
          <pointer type> readnone captures(none) %derived_ptr)
          nounwind willreturn memory(none)

Overview:
"""""""""

``gc.get.pointer.base`` for a derived pointer returns its base pointer.

Operands:
"""""""""

The only argument is a pointer which is based on some object with
an unknown offset from the base of said object.

Semantics:
""""""""""

This intrinsic is used in the abstract machine model for GC to represent
the base pointer for an arbitrary derived pointer.

This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer value of the
derived pointer. The replacement is done as part of the lowering to the
explicit statepoint model.

The return pointer type must be the same as the type of the parameter.


'``llvm.experimental.gc.get.pointer.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64
        @llvm.experimental.gc.get.pointer.offset(
          <pointer type> readnone captures(none) %derived_ptr)
          nounwind willreturn memory(none)

Overview:
"""""""""

``gc.get.pointer.offset`` for a derived pointer returns the offset from its
base pointer.

Operands:
"""""""""

The only argument is a pointer which is based on some object with
an unknown offset from the base of said object.

Semantics:
""""""""""

This intrinsic is used in the abstract machine model for GC to represent
the offset of an arbitrary derived pointer from its base pointer.

This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the offset of a derived pointer from
its base pointer value. The replacement is done as part of the lowering to the
explicit statepoint model.

In effect, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`), with both cast to integers
via ``ptrtoint``. However, performing this cast outside the
:ref:`RewriteStatepointsForGC` pass could cause the pointers to be lost during
further lowering from the abstract model to the explicit physical one.

Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.addressofreturnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.addressofreturnaddress()

Overview:
"""""""""

The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
pointer to the place in the stack frame where the return address of the
current function is stored.

Semantics:
""""""""""

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

This intrinsic is only implemented for x86 and AArch64.

'``llvm.sponentry``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.sponentry()

Overview:
"""""""""

The '``llvm.sponentry``' intrinsic returns the stack pointer value at
the entry of the current function calling this intrinsic.

Semantics:
""""""""""

Note that this intrinsic has only been verified on AArch64 and ARM.

'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.swift.async.context.addr``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.swift.async.context.addr()

Overview:
"""""""""

The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
the part of the extended frame record containing the asynchronous
context of a Swift execution.

Semantics:
""""""""""

If the caller has a ``swiftasync`` parameter, that argument will initially
be stored at the returned address. If not, it will be initialized to null.

'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.localescape(...)
      declare ptr @llvm.localrecover(ptr %func, ptr %fp, i32 %idx)

Overview:
"""""""""

The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
live frame pointer to recover the address of the allocation. The offset is
computed during frame layout of the caller of ``llvm.localescape``.

Arguments:
""""""""""

All arguments to '``llvm.localescape``' must be pointers to static allocas or
casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.

The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.

The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
call frame that is currently live. The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function.
The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.

'``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.seh.try.begin()
      declare void @llvm.seh.try.end()

Overview:
"""""""""

The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a ``__try`` region for Windows SEH Asynchronous Exception
Handling.

Semantics:
""""""""""

When a C function is compiled with the Windows SEH asynchronous exception
option, ``-feh_asynch`` (aka MSVC ``-EHa``), these two intrinsics are injected
to mark the ``__try`` boundary and to prevent potential exceptions from being
moved across that boundary. Any set of operations can then be confined to the
region by reading their leaf inputs via volatile loads and writing their root
outputs via volatile stores.

'``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.seh.scope.begin()
      declare void @llvm.seh.scope.end()

Overview:
"""""""""

The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
Handling (MSVC option ``-EHa``).

Semantics:
""""""""""

LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke``\ s, which normally correspond only to call sites.
To support arbitrary faulting instructions, it must be possible to recover the
current EH scope for any instruction. Turning every operation in LLVM that could
fault into an ``invoke`` of a new, potentially-throwing intrinsic would require
adding a large number of intrinsics, impede optimization of those operations,
and make compilation slower by introducing many extra basic blocks. These
intrinsics can be used instead to mark the region protected by a cleanup, such
as for a local C++ object with a non-trivial destructor.
``llvm.seh.scope.begin`` is used to mark the start of the region; it is always
called with ``invoke``, with the unwind block being the desired unwind
destination for any potentially-throwing instructions within the region.
``llvm.seh.scope.end`` is used to mark when the scope ends and the EH cleanup is
no longer required (e.g. because the destructor is being called).

.. _int_read_register:
.. _int_read_volatile_register:
.. _int_write_register:

'``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.read_register.i32(metadata)
      declare i64 @llvm.read_register.i64(metadata)
      declare i32 @llvm.read_volatile_register.i32(metadata)
      declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
      !0 = !{!"sp\00"}

Overview:
"""""""""

The '``llvm.read_register``', '``llvm.read_volatile_register``', and
'``llvm.write_register``' intrinsics provide access to the named register.
The register must be valid on the architecture being compiled to. The type
needs to be compatible with the register being read.

Semantics:
""""""""""

The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
return the current value of the register, where possible. The
'``llvm.write_register``' intrinsic sets the current value of the register,
where possible.

A call to '``llvm.read_volatile_register``' is assumed to have side effects
and possibly return a different value each time (e.g. for a timer register).

This is useful to implement named register global variables that need
to always be mapped to a specific register, as is common practice in
bare-metal programs including OS kernels.

The compiler doesn't check for register availability or use of the named
register in surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.
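
For illustration, a hypothetical helper that reads the stack pointer might look
like the following sketch. The function name ``@get_sp`` is an assumption, and
the register name ``sp`` is only meaningful on targets that define a register
with that name:

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      define i64 @get_sp() {
      entry:
        ; The metadata operand names the register to read.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"sp\00"}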

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and, even more so, allocatable
registers.

.. _int_stacksave:

'``llvm.stacksave``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.stacksave.p0()
      declare ptr addrspace(5) @llvm.stacksave.p5()

Overview:
"""""""""

The '``llvm.stacksave``' intrinsic is used to remember the current state
of the function stack, for use with
:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
implementing language features like scoped automatic variable sized
arrays in C99.

Semantics:
""""""""""

This intrinsic returns an opaque pointer value that can be passed to
:ref:`llvm.stackrestore <int_stackrestore>`. When an
``llvm.stackrestore`` intrinsic is executed with a value saved from
``llvm.stacksave``, it effectively restores the state of the stack to
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack
that were allocated after the ``llvm.stacksave`` was executed. The
address space should typically be the
:ref:`alloca address space <alloca_addrspace>`.

.. _int_stackrestore:

'``llvm.stackrestore``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackrestore.p0(ptr %ptr)
      declare void @llvm.stackrestore.p5(ptr addrspace(5) %ptr)

Overview:
"""""""""

The '``llvm.stackrestore``' intrinsic is used to restore the state of
the function stack to the state it was in when the corresponding
:ref:`llvm.stacksave <int_stacksave>` intrinsic executed.
This is
useful for implementing language features like scoped automatic
variable sized arrays in C99. The address space should typically be
the :ref:`alloca address space <alloca_addrspace>`.

Semantics:
""""""""""

See the description for :ref:`llvm.stacksave <int_stacksave>`.

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.dynamic.area.offset.i32()
      declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack pointer
yields the address of the most recent dynamic alloca. For targets where the
stack grows upwards, the situation is a bit more complicated, because
subtracting this value from the stack pointer yields the address one past the
end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.
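
As a sketch of the downward-growing case, the following hypothetical function
combines ``llvm.stacksave`` with the returned offset to compute the address of
its most recent dynamic alloca. The names ``@use`` and ``@most_recent_alloca``
are assumptions, and the ``i64`` variant is chosen on the assumption that
pointers in address space 0 are 64 bits wide on the target:

.. code-block:: llvm

      declare ptr @llvm.stacksave.p0()
      declare i64 @llvm.get.dynamic.area.offset.i64()
      declare void @use(ptr)        ; hypothetical consumer of the address

      define void @most_recent_alloca(i64 %n) {
      entry:
        %buf = alloca i8, i64 %n    ; the most recent dynamic alloca
        %sp = call ptr @llvm.stacksave.p0()
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Stack grows downwards: address = stack pointer + offset.
        %addr = getelementptr i8, ptr %sp, i64 %off
        call void @use(ptr %addr)
        ret void
      }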

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
must match the target's default address space's (address space 0) pointer type.

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.prefetch(ptr <address>, i32 <rw>, i32 <locality>, i32 <cache type>)

Overview:
"""""""""

The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a noop.
Prefetches have no effect on the behavior of the program but can change
its performance characteristics.

Arguments:
""""""""""

``address`` is the address to be prefetched, ``rw`` is the specifier
determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from (0), no
locality, to (3), extremely local (keep in cache). The ``cache type``
specifies whether the prefetch is performed on the data (1) or
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
arguments must be constant integers.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. In
particular, prefetches cannot trap and do not produce a value. On
targets that support this intrinsic, the prefetch can provide hints to
the processor cache for better performance.

'``llvm.pcmarker``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.pcmarker(i32 <id>)

Overview:
"""""""""

The '``llvm.pcmarker``' intrinsic is a method to export a Program
Counter (PC) in a region of code to simulators and other tools.
The
method is target specific, but it is expected that the marker will use
exported symbols to transmit the PC of the marker. The marker makes no
guarantees that it will remain with any specific instruction after
optimizations. It is possible that the presence of a marker will inhibit
optimizations. The intended use is to be inserted after optimizations to
allow correlations of simulation runs.

Arguments:
""""""""""

``id`` is a numerical id identifying the marker.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. Backends
that do not support this intrinsic may ignore it.

'``llvm.readcyclecounter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64 @llvm.readcyclecounter()

Overview:
"""""""""

The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.

Semantics:
""""""""""

When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to either return an application
specific value or a system wide value. On backends without support, this
is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the code
is running at and on the host platform.
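
A small timing can be sketched by reading the counter before and after the
region of interest and subtracting. The functions ``@work`` and ``@time_work``
below are hypothetical names used only for illustration:

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()          ; hypothetical region being timed

      define i64 @time_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Wrapping subtraction tolerates a single counter overflow.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }

On a backend without support, both reads are lowered to a constant 0, so the
computed difference is 0 as well.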

'``llvm.clear_cache``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.clear_cache(ptr, ptr)

Overview:
"""""""""

The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
in the specified range to the execution unit of the processor. On
targets with non-unified instruction and data cache, the implementation
flushes the instruction cache.

Semantics:
""""""""""

On platforms with coherent instruction and data caches (e.g. x86), this
intrinsic is a nop. On platforms with non-coherent instruction and data
cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
instructions or a system call, if cache flushing requires special
privileges.

The default behavior is to emit a call to ``__clear_cache`` from the run
time library.

This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.

'``llvm.instrprof.increment``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment(ptr <name>, i64 <hash>,
                                             i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.increment``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. These will be
lowered by the ``-instrprof`` pass to generate execution counts of a
program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source, and
the third is the number of counters associated with ``name``. It is an
error if ``hash`` or ``num-counters`` differ between two instances of
``instrprof.increment`` that refer to the same name.

The last argument refers to which of the counters for ``name`` should
be incremented. It should be a value in the range from 0 up to, but not
including, ``num-counters``.

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.

.. FIXME: write complete doc on contextual instrumentation and link from here
.. and from llvm.instrprof.callsite.

The intrinsic is lowered differently for contextual profiling by the
``-ctx-instr-lower`` pass. Here:

* the entry basic block increment counter is lowered as a call to compiler-rt,
  to either ``__llvm_ctx_profile_start_context`` or
  ``__llvm_ctx_profile_get_context``. Either returns a pointer to a context object
  which contains a buffer into which counter increments can happen. Note that the
  pointer value returned by compiler-rt may have its LSB set; counter increments
  happen offset from the address with the LSB cleared.

* all the other lowerings of ``llvm.instrprof.increment[.step]`` happen within
  that context.

* the context is assumed to be a local value to the function, and no concurrency
  concerns need to be handled by LLVM.
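
Returning to the basic, non-contextual form, a frontend might emit increments
along the lines of the following sketch. The name variable ``@__profn_foo``,
the function ``@foo``, and the hash value ``0`` are assumptions chosen for
illustration; both calls use the same ``hash`` and ``num-counters`` because
they refer to the same name:

.. code-block:: llvm

      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(ptr, i64, i32, i32)

      define void @foo(i1 %cond) {
      entry:
        ; Counter 0 of 2: function entry.
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 0)
        br i1 %cond, label %then, label %exit

      then:
        ; Counter 1 of 2: the conditional region.
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 1)
        br label %exit

      exit:
        ret void
      }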

'``llvm.instrprof.increment.step``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment.step(ptr <name>, i64 <hash>,
                                                  i32 <num-counters>,
                                                  i32 <index>, i64 <step>)

Overview:
"""""""""

The '``llvm.instrprof.increment.step``' intrinsic is an extension to
the '``llvm.instrprof.increment``' intrinsic with an additional fifth
argument to specify the step of the increment.

Arguments:
""""""""""

The first four arguments are the same as '``llvm.instrprof.increment``'
intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See description of '``llvm.instrprof.increment``' intrinsic.

'``llvm.instrprof.callsite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.callsite(ptr <name>, i64 <hash>,
                                            i32 <num-counters>,
                                            i32 <index>, ptr <callsite>)

Overview:
"""""""""

The '``llvm.instrprof.callsite``' intrinsic should be emitted before a callsite
that's not to a "fake" callee (like another intrinsic or asm). It is used by
contextual profiling and has side-effects. Its lowering happens in IR, and
target-specific backends should never encounter it.

Arguments:
""""""""""

The first 4 arguments are similar to ``llvm.instrprof.increment``. The indexing
is specific to callsites, meaning callsites are indexed from 0, independent from
the indexes used by the other intrinsics (such as
``llvm.instrprof.increment[.step]``).

The last argument is the called value of the callsite this intrinsic precedes.

Semantics:
""""""""""

This is lowered by contextual profiling.
In contextual profiling, functions get,
from compiler-rt, a pointer to a context object. The context object consists of
a buffer LLVM can use to perform counter increments (i.e. the lowering of
``llvm.instrprof.increment[.step]``). The address range following the counter
buffer, ``<num-counters>`` x ``sizeof(ptr)`` - sized, is expected to contain
pointers to contexts of functions called from this function ("subcontexts").
LLVM does not dereference into that memory region, just calculates GEPs.

The lowering of ``llvm.instrprof.callsite`` consists of:

* write to ``__llvm_ctx_profile_expected_callee`` the ``<callsite>`` value;

* write to ``__llvm_ctx_profile_callsite`` the address into this function's
  context of the ``<index>`` position into the subcontexts region.


``__llvm_ctx_profile_{expected_callee|callsite}`` are initialized by compiler-rt
and are TLS. They are both vectors of pointers of size 2. The index into each is
determined when the current function obtains the pointer to its context from
compiler-rt. The pointer's LSB gives the index.


'``llvm.instrprof.timestamp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.timestamp(ptr <name>, i64 <hash>,
                                             i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.timestamp``' intrinsic is used to implement temporal
profiling.

Arguments:
""""""""""

The arguments are the same as '``llvm.instrprof.increment``'. The ``index`` is
expected to always be zero.

Semantics:
""""""""""

Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores a
timestamp representing when this function was executed for the first time.

'``llvm.instrprof.cover``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.cover(ptr <name>, i64 <hash>,
                                         i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.cover``' intrinsic is used to implement coverage
instrumentation.

Arguments:
""""""""""

The arguments are the same as the first four arguments of
'``llvm.instrprof.increment``'.

Semantics:
""""""""""

Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores zero to
the profiling variable to signify that the function has been covered. We store
zero because this is more efficient on some targets.

'``llvm.instrprof.value.profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.value.profile(ptr <name>, i64 <hash>,
                                                 i64 <value>, i32 <value_kind>,
                                                 i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. ``name`` should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source. It
is an error if ``hash`` differs between two instances of
``llvm.instrprof.*`` that refer to the same name.

The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value. The
fourth argument represents the kind of value profiling that is being done. The
supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
index of the instrumented expression within ``name``. It should be >= 0.

Semantics:
""""""""""

This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The ``-instrprof``
pass will generate the appropriate data structures and replace the
``llvm.instrprof.value.profile`` intrinsic with the call to the profile
runtime library with proper arguments.

'``llvm.instrprof.mcdc.parameters``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.mcdc.parameters(ptr <name>, i64 <hash>,
                                                   i32 <bitmap-bits>)

Overview:
"""""""""

The '``llvm.instrprof.mcdc.parameters``' intrinsic is used to initiate MC/DC
code coverage instrumentation for a function.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source.

The third argument is the number of bitmap bits required by the function to
record the number of test vectors executed for each boolean expression.

Semantics:
""""""""""

This intrinsic represents basic MC/DC parameters initiating one or more MC/DC
instrumentation sequences in a function. It will cause the ``-instrprof`` pass
to generate the appropriate data structures and the code to instrument MC/DC
test vectors in a format that can be written out by a compiler runtime and
consumed via the ``llvm-profdata`` tool.

'``llvm.instrprof.mcdc.tvbitmap.update``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.mcdc.tvbitmap.update(ptr <name>, i64 <hash>,
                                                        i32 <bitmap-index>,
                                                        ptr <mcdc-temp-addr>)

Overview:
"""""""""

The '``llvm.instrprof.mcdc.tvbitmap.update``' intrinsic is used to track MC/DC
test vector execution after each boolean expression has been fully executed.
The overall value of the condition bitmap, after it has been successively
updated with the true or false evaluation of each condition, uniquely identifies
an executed MC/DC test vector and is used as a bit index into the global test
vector bitmap.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source.

The third argument is the bit index into the global test vector bitmap
corresponding to the function.

The fourth argument is the address of the condition bitmap, which contains a
value representing an executed MC/DC test vector. It is loaded and used as the
bit index of the test vector bitmap.

Semantics:
""""""""""

This intrinsic represents the final operation of an MC/DC instrumentation
sequence and will cause the ``-instrprof`` pass to generate the code to
instrument an update of a function's global test vector bitmap to indicate that
a test vector has been executed. The global test vector bitmap can be consumed
by the ``llvm-profdata`` and ``llvm-cov`` tools.

'``llvm.thread.pointer``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.thread.pointer()

Overview:
"""""""""

The '``llvm.thread.pointer``' intrinsic returns the value of the thread
pointer.

Semantics:
""""""""""

The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
for the current thread. The exact semantics of this value are target
specific: it may point to the start of the TLS area, to the end, or somewhere
in the middle. Depending on the target, this intrinsic may read a register,
call a helper function, read from an alternate memory space, or perform
other operations necessary to locate the TLS area. Not all targets support
this intrinsic.

'``llvm.call.preallocated.setup``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token @llvm.call.preallocated.setup(i32 %num_args)

Overview:
"""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
be used with a call's ``"preallocated"`` operand bundle to indicate that
certain arguments are allocated and initialized before the call.

Semantics:
""""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call.
The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
``"preallocated"`` operand bundle for the corresponding call.

Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]

.. _int_call_preallocated_arg:

'``llvm.call.preallocated.arg``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)

Overview:
"""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
corresponding preallocated argument for the preallocated call.

Semantics:
""""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
``%arg_index``\ th argument with the ``preallocated`` attribute for
the call associated with the ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.

A call to '``llvm.call.preallocated.arg``' must have a call site
``preallocated`` attribute. The type of the ``preallocated`` attribute must
match the type used by the ``preallocated`` attribute of the corresponding
argument at the preallocated call.
The type is used in the case that an
``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
to DCE), where otherwise we cannot know how large the arguments are.

It is undefined behavior if this is called with a token from an
'``llvm.call.preallocated.setup``' if another
'``llvm.call.preallocated.setup``' has already been called or if the
preallocated call corresponding to the '``llvm.call.preallocated.setup``'
has already been called.

.. _int_call_preallocated_teardown:

'``llvm.call.preallocated.teardown``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.call.preallocated.teardown(token %setup_token)

Overview:
"""""""""

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
created by a '``llvm.call.preallocated.setup``'.

Semantics:
""""""""""

The token argument must be the result of a
'``llvm.call.preallocated.setup``'.

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
one of this or the preallocated call must be called to prevent stack leaks.
It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
and the preallocated call for a given '``llvm.call.preallocated.setup``'.

For example, if the stack is allocated for a preallocated call by a
'``llvm.call.preallocated.setup``', then an initializer function called on an
allocated argument throws an exception, there should be a
'``llvm.call.preallocated.teardown``' in the exception handler to prevent
stack leaks.

Following the nesting rules in '``llvm.call.preallocated.setup``', nested
calls to '``llvm.call.preallocated.setup``' and
'``llvm.call.preallocated.teardown``' are allowed but must be properly
nested.

Example:
""""""""

.. code-block:: llvm

        %cs = call token @llvm.call.preallocated.setup(i32 1)
        %x = call ptr @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
        invoke void @constructor(ptr %x) to label %conta unwind label %contb
    conta:
        call void @foo1(ptr preallocated(i32) %x) ["preallocated"(token %cs)]
        ret void
    contb:
        %s = catchswitch within none [label %catch] unwind to caller
    catch:
        %p = catchpad within %s []
        call void @llvm.call.preallocated.teardown(token %cs)
        ret void

Standard C/C++ Library Intrinsics
---------------------------------

LLVM provides intrinsics for a few important standard C/C++ library
functions. These intrinsics allow source-language front-ends to pass
information about the alignment of the pointer arguments to the code
generator, providing opportunity for more efficient code generation.

.. _int_abs:

'``llvm.abs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.abs`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
      declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)

Overview:
"""""""""

The '``llvm.abs``' family of intrinsic functions returns the absolute value
of an argument.

Arguments:
""""""""""

The first argument is the value for which the absolute value is to be returned.
This argument may be of any integer type or a vector with integer element type.
The return type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether the
result value of the '``llvm.abs``' intrinsic is a
:ref:`poison value <poisonvalues>` if the first argument is statically or
dynamically an ``INT_MIN`` value.

Semantics:
""""""""""

The '``llvm.abs``' intrinsic returns the magnitude (always positive) of the
first argument or each element of a vector argument. If the first argument is
``INT_MIN``, then the result is also ``INT_MIN`` if ``is_int_min_poison == 0``
and ``poison`` otherwise.


.. _int_smax:

'``llvm.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.smax.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
Vector intrinsics operate on a per-element basis. The larger element of ``%a``
and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_smin:

'``llvm.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.smin.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_umax:

'``llvm.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.umax.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
integers. Vector intrinsics operate on a per-element basis. The larger element
of ``%a`` and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_umin:

'``llvm.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.umin.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
integers. Vector intrinsics operate on a per-element basis. The smaller element
of ``%a`` and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.

.. _int_scmp:

'``llvm.scmp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.scmp`` on any
integer bit width or any vector of integer elements.

::

      declare i2 @llvm.scmp.i2.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.scmp.v4i32.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return ``-1`` if ``%a`` is signed less than ``%b``, ``0`` if they are equal, and
``1`` if ``%a`` is signed greater than ``%b``. Vector intrinsics operate on a
per-element basis.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must be at least as wide as ``i2``, to hold the three possible return
values.

.. _int_ucmp:

'``llvm.ucmp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.ucmp`` on any
integer bit width or any vector of integer elements.

::

      declare i2 @llvm.ucmp.i2.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.ucmp.v4i32.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return ``-1`` if ``%a`` is unsigned less than ``%b``, ``0`` if they are equal,
and ``1`` if ``%a`` is unsigned greater than ``%b``. Vector intrinsics operate
on a per-element basis.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must be at least as wide as ``i2``, to hold the three possible return
values.

.. _int_memcpy:

'``llvm.memcpy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memcpy.p0.p0.i32(ptr <dest>, ptr <src>,
                                          i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.p0.p0.i64(ptr <dest>, ptr <src>,
                                          i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location.

Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.
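
As a minimal sketch of how the four arguments fit together (the function name
``@copy_header``, the value names, and the 32-byte length are illustrative
assumptions, not part of the intrinsic's definition):

.. code-block:: llvm

    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define void @copy_header(ptr %dst, ptr %src) {
      ; dest = %dst, src = %src, len = 32 bytes, isvolatile = false
      call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 32, i1 false)
      ret void
    }

In this sketch the ``.i64`` suffix selects the overload whose ``<len>`` argument
is ``i64``, and the ``.p0.p0`` part names the address spaces of the two
pointers.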

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
location to the destination location, which must either be equal or
non-overlapping. It copies ``<len>`` bytes of memory over. If the argument is
known to be aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
otherwise the behavior is undefined.

.. _int_memcpy_inline:

'``llvm.memcpy.inline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memcpy.inline.p0.p0.i32(ptr <dest>, ptr <src>,
                                                 i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.inline.p0.p0.i64(ptr <dest>, ptr <src>,
                                                 i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. It copies ``<len>`` bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on
the argument.
The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
'``llvm.memcpy.*``', but the generated code is guaranteed not to call any
external functions.

.. _int_memmove:

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.

::

      declare void @llvm.memmove.p0.p0.i32(ptr <dest>, ptr <src>,
                                           i32 <len>, i1 <isvolatile>)
      declare void @llvm.memmove.p0.p0.i64(ptr <dest>, ptr <src>,
                                           i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. It
copies ``<len>`` bytes of memory over. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
otherwise the behavior is undefined.

.. _int_memset:

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0.i32(ptr <dest>, i8 <val>,
                                       i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.p0.i64(ptr <dest>, i8 <val>,
                                       i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra ``isvolatile``
argument. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill ``<len>`` bytes of memory starting
at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.
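
A small sketch of a call that carries an alignment attribute (the function
name ``@zero_buffer``, the 64-byte length, and the 16-byte alignment are
assumptions chosen for illustration):

.. code-block:: llvm

    declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

    define void @zero_buffer(ptr %buf) {
      ; Fill 64 bytes starting at %buf with the byte value 0.
      ; "align 16" states that %buf is known to be 16-byte aligned.
      call void @llvm.memset.p0.i64(ptr align 16 %buf, i8 0, i64 64, i1 false)
      ret void
    }

Here the ``align 16`` attribute is attached to the destination pointer at the
call site, which is the first argument of the intrinsic.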

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
behavior is undefined.

.. _int_memset_inline:

'``llvm.memset.inline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.inline`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memset.inline.p0.i32(ptr <dest>, i8 <val>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.inline.p0.i64(ptr <dest>, i8 <val>,
                                              i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.inline.*``' intrinsics fill a block of memory with a
particular byte value and guarantee that no external functions are called.

Note that, unlike the standard libc function, the ``llvm.memset.inline.*``
intrinsics do not return a value, take an extra ``isvolatile`` argument, and
the pointer can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is a constant
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset.inline`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.
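
The following is a hedged sketch of a call with a constant length (the function
name ``@clear16`` and the chosen length and alignment are illustrative
assumptions):

.. code-block:: llvm

    declare void @llvm.memset.inline.p0.i64(ptr, i8, i64, i1)

    define void @clear16(ptr %p) {
      ; Zero 16 bytes at %p; the length is a constant, as described above.
      call void @llvm.memset.inline.p0.i64(ptr align 8 %p, i8 0, i64 16, i1 false)
      ret void
    }

Because the length is the constant ``16``, the fill can be emitted as a short
sequence of stores rather than a call to an external function.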

Semantics:
""""""""""

The '``llvm.memset.inline.*``' intrinsics fill ``<len>`` bytes of memory
starting at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
behavior is undefined.

The behavior of '``llvm.memset.inline.*``' is equivalent to the behavior of
'``llvm.memset.*``', but the generated code is guaranteed not to call any
external functions.

.. _int_experimental_memset_pattern:

'``llvm.experimental.memset.pattern``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.experimental.memset.pattern`` on any integer bit width and for
different address spaces. Not all targets support all bit widths however.

::

      declare void @llvm.experimental.memset.pattern.p0.i128.i64(ptr <dest>, i128 <val>,
                                                                 i64 <count>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.experimental.memset.pattern.*``' intrinsics fill a block of memory
with a particular value. This may be expanded to an inline loop, a sequence of
stores, or a libcall depending on what is available for the target and the
expected performance and code size impact.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the value with which to fill it, the third argument is an integer
argument specifying the number of times to fill the value, and the fourth is a
boolean indicating a volatile access.
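
A minimal sketch of a call using the declaration shown above (the function name
``@fill_pattern``, the value names, and the count of 4 are assumptions made for
illustration):

.. code-block:: llvm

    declare void @llvm.experimental.memset.pattern.p0.i128.i64(ptr, i128, i64, i1)

    define void @fill_pattern(ptr %dst, i128 %pat) {
      ; dest = %dst, val = %pat, count = 4 repetitions, isvolatile = false
      call void @llvm.experimental.memset.pattern.p0.i128.i64(ptr %dst, i128 %pat,
                                                              i64 4, i1 false)
      ret void
    }

Note that the third argument counts repetitions of the pattern value, not
bytes.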

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.

If the ``isvolatile`` parameter is ``true``, the
``llvm.experimental.memset.pattern`` call is a :ref:`volatile operation
<volatile>`. The detailed access behavior is not very cleanly specified and it
is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.experimental.memset.pattern.*``' intrinsic fills memory starting at
the destination location with the given pattern ``<count>`` times,
incrementing by the allocation size of the type each time. The stores follow
the usual semantics of store instructions, including regarding endianness and
padding. If the argument is known to be aligned to some boundary, this can be
specified as an attribute on the argument.

If ``<count>`` is 0, it is a no-op modulo the behavior of attributes attached
to the arguments.
If ``<count>`` is not a well-defined value, the behavior is undefined.
If ``<count>`` is not zero, ``<dest>`` should be well-defined, otherwise the
behavior is undefined.

.. _int_sqrt:

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sqrt.f32(float %Val)
      declare double    @llvm.sqrt.f64(double %Val)
      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128     @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the square root of the specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.
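
As an illustrative sketch (the function name ``@vec_length`` and the value
names are assumptions, not part of the intrinsic), a scalar call looks like
this:

.. code-block:: llvm

    declare float @llvm.sqrt.f32(float)

    define float @vec_length(float %a, float %b) {
      ; Compute sqrt(a*a + b*b); argument and result are both float.
      %aa = fmul float %a, %a
      %bb = fmul float %b, %b
      %sum = fadd float %aa, %bb
      %len = call float @llvm.sqrt.f32(float %sum)
      ret float %len
    }

The ``.f32`` suffix selects the ``float`` overload, so the argument and the
returned value have the same type.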

Semantics:
""""""""""

Return the same value as a corresponding libm '``sqrt``' function but without
trapping or setting ``errno``. For types specified by IEEE-754, the result
matches a conforming libm implementation.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

Generally, the only supported type for the exponent is the one matching
the C type ``int``.

::

      declare float @llvm.powi.f32.i32(float %Val, i32 %power)
      declare double @llvm.powi.f64.i16(double %Val, i16 %power)
      declare x86_fp80 @llvm.powi.f80.i32(x86_fp80 %Val, i32 %power)
      declare fp128 @llvm.powi.f128.i32(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating-point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the power given by the second
value, with an unspecified sequence of rounding operations.

.. _t_llvm_sin:

'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.
You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.sin.f32(float %Val)
      declare double @llvm.sin.f64(double %Val)
      declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val)
      declare fp128 @llvm.sin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sin.*``' intrinsics return the sine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sin``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _t_llvm_cos:

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.cos.f32(float %Val)
      declare double @llvm.cos.f64(double %Val)
      declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val)
      declare fp128 @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.tan.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.tan`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.tan.f32(float %Val)
      declare double @llvm.tan.f64(double %Val)
      declare x86_fp80 @llvm.tan.f80(x86_fp80 %Val)
      declare fp128 @llvm.tan.f128(fp128 %Val)
      declare ppc_fp128 @llvm.tan.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.tan.*``' intrinsics return the tangent of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``tan``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.asin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.asin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.asin.f32(float %Val)
      declare double @llvm.asin.f64(double %Val)
      declare x86_fp80 @llvm.asin.f80(x86_fp80 %Val)
      declare fp128 @llvm.asin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.asin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.asin.*``' intrinsics return the arcsine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``asin``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.acos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.acos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.acos.f32(float %Val)
      declare double @llvm.acos.f64(double %Val)
      declare x86_fp80 @llvm.acos.f80(x86_fp80 %Val)
      declare fp128 @llvm.acos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.acos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.acos.*``' intrinsics return the arccosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``acos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.atan.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.atan`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.atan.f32(float %Val)
      declare double @llvm.atan.f64(double %Val)
      declare x86_fp80 @llvm.atan.f80(x86_fp80 %Val)
      declare fp128 @llvm.atan.f128(fp128 %Val)
      declare ppc_fp128 @llvm.atan.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.atan.*``' intrinsics return the arctangent of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``atan``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.atan2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.atan2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.atan2.f32(float %Y, float %X)
      declare double @llvm.atan2.f64(double %Y, double %X)
      declare x86_fp80 @llvm.atan2.f80(x86_fp80 %Y, x86_fp80 %X)
      declare fp128 @llvm.atan2.f128(fp128 %Y, fp128 %X)
      declare ppc_fp128 @llvm.atan2.ppcf128(ppc_fp128 %Y, ppc_fp128 %X)

Overview:
"""""""""

The '``llvm.atan2.*``' intrinsics return the arctangent of ``Y/X`` accounting
for the quadrant.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``atan2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
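
For illustration, a minimal sketch of a call to the ``float`` overload of
``llvm.atan2``. The function name ``@angle`` and the value names are
hypothetical; note that ``Y`` is passed first and ``X`` second:

.. code-block:: llvm

      declare float @llvm.atan2.f32(float, float)

      define float @angle(float %y, float %x) {
        ; Arctangent of %y / %x; the signs of both operands select the quadrant.
        %r = call float @llvm.atan2.f32(float %y, float %x)
        ret float %r
      }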

'``llvm.sinh.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sinh`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.sinh.f32(float %Val)
      declare double @llvm.sinh.f64(double %Val)
      declare x86_fp80 @llvm.sinh.f80(x86_fp80 %Val)
      declare fp128 @llvm.sinh.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sinh.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sinh.*``' intrinsics return the hyperbolic sine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sinh``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.cosh.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cosh`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.cosh.f32(float %Val)
      declare double @llvm.cosh.f64(double %Val)
      declare x86_fp80 @llvm.cosh.f80(x86_fp80 %Val)
      declare fp128 @llvm.cosh.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cosh.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cosh.*``' intrinsics return the hyperbolic cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cosh``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.tanh.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.tanh`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.tanh.f32(float %Val)
      declare double @llvm.tanh.f64(double %Val)
      declare x86_fp80 @llvm.tanh.f80(x86_fp80 %Val)
      declare fp128 @llvm.tanh.f128(fp128 %Val)
      declare ppc_fp128 @llvm.tanh.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.tanh.*``' intrinsics return the hyperbolic tangent of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``tanh``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.


'``llvm.sincos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sincos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare { float, float } @llvm.sincos.f32(float %Val)
      declare { double, double } @llvm.sincos.f64(double %Val)
      declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
      declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
      declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
      declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)

Overview:
"""""""""

The '``llvm.sincos.*``' intrinsics return the sine and cosine of the operand.

Arguments:
""""""""""

The argument is a :ref:`floating-point <t_floating>` value or
:ref:`vector <t_vector>` of floating-point values. Returns two values matching
the argument type in a struct.

Semantics:
""""""""""

This intrinsic is equivalent to calling both :ref:`llvm.sin <t_llvm_sin>`
and :ref:`llvm.cos <t_llvm_cos>` on the argument.

The first result is the sine of the argument and the second result is the cosine
of the argument.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.pow.f32(float %Val, float %Power)
      declare double @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``pow``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_exp:

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.exp.f32(float %Val)
      declare double @llvm.exp.f64(double %Val)
      declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_exp2:

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.exp2.f32(float %Val)
      declare double @llvm.exp2.f64(double %Val)
      declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_exp10:

'``llvm.exp10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.exp10.f32(float %Val)
      declare double @llvm.exp10.f64(double %Val)
      declare x86_fp80 @llvm.exp10.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp10.*``' intrinsics compute the base-10 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
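
As a sketch of how the 'afn' flag is attached, the following call permits an
approximate result for ``llvm.exp10``. The function name ``@pow10`` is
hypothetical, and the same pattern is assumed to apply to the other math
intrinsics that mention 'afn':

.. code-block:: llvm

      declare float @llvm.exp10.f32(float)

      define float @pow10(float %x) {
        ; The 'afn' fast-math flag on the call allows a less accurate approximation.
        %r = call afn float @llvm.exp10.f32(float %x)
        ret float %r
      }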


'``llvm.ldexp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ldexp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.ldexp.f32.i32(float %Val, i32 %Exp)
      declare double @llvm.ldexp.f64.i32(double %Val, i32 %Exp)
      declare x86_fp80 @llvm.ldexp.f80.i32(x86_fp80 %Val, i32 %Exp)
      declare fp128 @llvm.ldexp.f128.i32(fp128 %Val, i32 %Exp)
      declare ppc_fp128 @llvm.ldexp.ppcf128.i32(ppc_fp128 %Val, i32 %Exp)
      declare <2 x float> @llvm.ldexp.v2f32.v2i32(<2 x float> %Val, <2 x i32> %Exp)

Overview:
"""""""""

The '``llvm.ldexp.*``' intrinsics perform the ldexp function.

Arguments:
""""""""""

The first argument and the return value are :ref:`floating-point
<t_floating>` or :ref:`vector <t_vector>` of floating-point values of
the same type. The second argument is an integer with the same number
of elements.

Semantics:
""""""""""

This function multiplies the first argument by 2 raised to the second
argument's power. If the first argument is NaN or infinite, the same
value is returned. If the result underflows, a zero with the same sign
is returned. If the result overflows, the result is an infinity with
the same sign.

.. _int_frexp:

'``llvm.frexp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.frexp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare { float, i32 } @llvm.frexp.f32.i32(float %Val)
      declare { double, i32 } @llvm.frexp.f64.i32(double %Val)
      declare { x86_fp80, i32 } @llvm.frexp.f80.i32(x86_fp80 %Val)
      declare { fp128, i32 } @llvm.frexp.f128.i32(fp128 %Val)
      declare { ppc_fp128, i32 } @llvm.frexp.ppcf128.i32(ppc_fp128 %Val)
      declare { <2 x float>, <2 x i32> } @llvm.frexp.v2f32.v2i32(<2 x float> %Val)

Overview:
"""""""""

The '``llvm.frexp.*``' intrinsics perform the frexp function.

Arguments:
""""""""""

The argument is a :ref:`floating-point <t_floating>` or
:ref:`vector <t_vector>` of floating-point values. Returns two values
in a struct. The first struct field matches the argument type, and the
second field is an integer or a vector of integer values with the same
number of elements as the argument.

Semantics:
""""""""""

This intrinsic splits a floating-point value into a normalized
fractional component and integral exponent.

For a non-zero argument, returns the argument multiplied by some power
of two such that the absolute value of the returned value is in the
range [0.5, 1.0), with the same sign as the argument. The second
result is an integer such that the first result multiplied by 2 raised
to the power of the second result is the input argument.

If the argument is a zero, returns a zero with the same sign and a 0
exponent.

If the argument is a NaN, a NaN is returned and the returned exponent
is unspecified.

If the argument is an infinity, returns an infinity with the same sign
and an unspecified exponent.

.. _int_log:

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type.
Not all targets support
all types however.

::

      declare float @llvm.log.f32(float %Val)
      declare double @llvm.log.f64(double %Val)
      declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
      declare fp128 @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_log10:

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.log10.f32(float %Val)
      declare double @llvm.log10.f64(double %Val)
      declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
      declare fp128 @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.


.. _int_log2:

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.log2.f32(float %Val)
      declare double @llvm.log2.f64(double %Val)
      declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
      declare fp128 @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_fma:

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.fma.f32(float %a, float %b, float %c)
      declare double @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as the IEEE-754 fusedMultiplyAdd operation. This
is assumed to not trap or set ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_fabs:

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.fabs.f32(float %Val)
      declare double @llvm.fabs.f64(double %Val)
      declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128 @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.
The returned value is completely identical to the input except for the sign bit;
in particular, if the input is a NaN, then the quiet/signaling bit and payload
are perfectly preserved.

.. _i_fminmax_family:

'``llvm.min.*``' Intrinsics Comparison
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Standard:
"""""""""

IEEE-754 and ISO C define several min/max operations, which differ in how they
handle qNaN/sNaN and +0.0/-0.0. The differences are listed here:

.. list-table::
   :header-rows: 2

   * - ``ISO C``
     - fmin/fmax
     - fminimum/fmaximum
     - fminimum_num/fmaximum_num

   * - ``IEEE754``
     - minNum/maxNum (2008)
     - minimum/maximum (2019)
     - minimumNumber/maximumNumber (2019)

   * - ``+0.0 vs -0.0``
     - either one
     - +0.0 > -0.0
     - +0.0 > -0.0

   * - ``NUM vs sNaN``
     - qNaN, invalid exception
     - qNaN, invalid exception
     - NUM, invalid exception

   * - ``qNaN vs sNaN``
     - qNaN, invalid exception
     - qNaN, invalid exception
     - qNaN, invalid exception

   * - ``NUM vs qNaN``
     - NUM, no exception
     - qNaN, no exception
     - NUM, no exception

LLVM Implementation:
""""""""""""""""""""

LLVM implements all ISO C flavors as listed in this table, except that
exceptions are ignored in the default floating-point environment. The
constrained versions of the intrinsics respect the exception behavior.

.. list-table::
   :header-rows: 1
   :widths: 16 28 28 28

   * - Operation
     - minnum/maxnum
     - minimum/maximum
     - minimumnum/maximumnum

   * - ``NUM vs qNaN``
     - NUM, no exception
     - qNaN, no exception
     - NUM, no exception

   * - ``NUM vs sNaN``
     - qNaN, invalid exception
     - qNaN, invalid exception
     - NUM, invalid exception

   * - ``qNaN vs sNaN``
     - qNaN, invalid exception
     - qNaN, invalid exception
     - qNaN, invalid exception

   * - ``sNaN vs sNaN``
     - qNaN, invalid exception
     - qNaN, invalid exception
     - qNaN, invalid exception

   * - ``+0.0 vs -0.0``
     - either one
     - +0.0(max)/-0.0(min)
     - +0.0(max)/-0.0(min)

   * - ``NUM vs NUM``
     - larger(max)/smaller(min)
     - larger(max)/smaller(min)
     - larger(max)/smaller(min)

.. _i_minnum:

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.minnum.f32(float %Val0, float %Val1)
      declare double @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs.
This matches the behavior of libm's fmin.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns either one of the operands. For example, this means that
fmin(+0.0, -0.0) returns either operand.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

.. _i_maxnum:

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
signaling NaNs. This matches the behavior of libm's fmax.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN.
If the operands compare equal,
returns either one of the operands. For example, this means that
fmax(+0.0, -0.0) returns either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

.. _i_minimum:

'``llvm.minimum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.minimum.f32(float %Val0, float %Val1)
      declare double    @llvm.minimum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minimum.*``' intrinsics return the minimum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2019.

.. _i_maximum:

'``llvm.maximum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.maximum.f32(float %Val0, float %Val1)
      declare double    @llvm.maximum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maximum.*``' intrinsics return the maximum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2019.

.. _i_minimumnum:

'``llvm.minimumnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minimumnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.


::

      declare float     @llvm.minimumnum.f32(float %Val0, float %Val1)
      declare double    @llvm.minimumnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minimumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minimumnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minimumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minimumnum.*``' intrinsics return the minimum of the two
arguments, not propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If both operands are NaNs (including sNaN), returns qNaN. If one operand
is NaN (including sNaN) and the other operand is a number, returns the
number. Otherwise returns the lesser of the two arguments. -0.0 is
considered to be less than +0.0 for this intrinsic.

Note that these are the semantics of minimumNumber specified in IEEE 754-2019.

It differs from '``llvm.minnum.*``' in two ways:

1. '``llvm.minnum.*``' will return qNaN if either operand is sNaN.
2. '``llvm.minnum.*``' may return either operand when comparing +0.0 and -0.0.

.. _i_maximumnum:

'``llvm.maximumnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maximumnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.


::

      declare float     @llvm.maximumnum.f32(float %Val0, float %Val1)
      declare double    @llvm.maximumnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maximumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maximumnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maximumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maximumnum.*``' intrinsics return the maximum of the two
arguments, not propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If both operands are NaNs (including sNaN), returns qNaN. If one operand
is NaN (including sNaN) and the other operand is a number, returns the
number. Otherwise returns the greater of the two arguments. -0.0 is
considered to be less than +0.0 for this intrinsic.

Note that these are the semantics of maximumNumber specified in IEEE 754-2019.

It differs from '``llvm.maxnum.*``' in two ways:

1. '``llvm.maxnum.*``' will return qNaN if either operand is sNaN.
2. '``llvm.maxnum.*``' may return either operand when comparing +0.0 and -0.0.

.. _int_copysign:

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.


::

      declare float     @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80  @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.
The returned value is identical to the first operand except for the
sign bit; in particular, if the input is a NaN, then the quiet/signaling bit
and payload are preserved exactly.

.. _int_floor:

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.floor.f32(float %Val)
      declare double    @llvm.floor.f64(double %Val)
      declare x86_fp80  @llvm.floor.f80(x86_fp80 %Val)
      declare fp128     @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.


Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

.. _int_ceil:

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.ceil.f32(float %Val)
      declare double    @llvm.ceil.f64(double %Val)
      declare x86_fp80  @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128     @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

.. _int_llvm_trunc:

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.trunc.f32(float %Val)
      declare double    @llvm.trunc.f64(double %Val)
      declare x86_fp80  @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128     @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

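
As an informal illustration, the following Python sketch contrasts this
rounding direction with those of '``llvm.floor.*``' and '``llvm.ceil.*``'
documented above. It uses Python's ``math`` module as a stand-in for libm on
``double`` values. The helper name ``rounding_table`` is hypothetical and not
part of LLVM, and the Python functions return ``int`` whereas the intrinsics
return a floating-point value of the operand's type.

.. code-block:: python

    import math

    def rounding_table(values):
        """Hypothetical helper: (value, floor, ceil, trunc) for finite inputs."""
        return [(v, math.floor(v), math.ceil(v), math.trunc(v)) for v in values]

    for row in rounding_table([2.7, -2.7, 2.0]):
        print(row)
    # (2.7, 2, 3, 2)
    # (-2.7, -3, -2, -2)
    # (2.0, 2, 2, 2)

For negative inputs, truncation agrees with the ceiling rather than the floor,
which is the only case in which the three directions are easy to confuse.
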

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

.. _int_rint:

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.rint.f32(float %Val)
      declare double    @llvm.rint.f64(double %Val)
      declare x86_fp80  @llvm.rint.f80(x86_fp80 %Val)
      declare fp128     @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand is not an integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. Since LLVM assumes the
:ref:`default floating-point environment <floatenv>`, the rounding mode is
assumed to be set to "nearest", so halfway cases are rounded to the even
integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`
to avoid that assumption.

.. _int_nearbyint:

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating-point or vector of floating-point type.
Not all targets support
all types, however.

::

      declare float     @llvm.nearbyint.f32(float %Val)
      declare double    @llvm.nearbyint.f64(double %Val)
      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way. Since LLVM
assumes the :ref:`default floating-point environment <floatenv>`, the rounding
mode is assumed to be set to "nearest", so halfway cases are rounded to the even
integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>` to
avoid that assumption.

.. _int_round:

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.round.f32(float %Val)
      declare double    @llvm.round.f64(double %Val)
      declare x86_fp80  @llvm.round.f80(x86_fp80 %Val)
      declare fp128     @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

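
A rough sketch of how '``llvm.round.*``' differs from '``llvm.rint.*``' and
'``llvm.nearbyint.*``' on halfway cases: libm ``round`` rounds ties away from
zero, whereas the other two round ties to even under the default
floating-point environment. The Python helpers ``round_half_away`` and
``round_half_even`` below are hypothetical models for finite ``double`` inputs,
not LLVM APIs, and the second one does not preserve the sign of a zero result.

.. code-block:: python

    import math

    def round_half_away(x):
        """Hypothetical model of libm round(): ties go away from zero."""
        frac, whole = math.modf(x)      # x == whole + frac, computed exactly
        if abs(frac) >= 0.5:
            whole += math.copysign(1.0, x)
        return whole

    def round_half_even(x):
        """Hypothetical model of rint()/nearbyint() in round-to-nearest mode."""
        return float(round(x))          # Python's round() sends ties to even

    for v in (0.5, 1.5, 2.5, -2.5):
        print(v, round_half_away(v), round_half_even(v))
    # 0.5 1.0 0.0
    # 1.5 2.0 2.0
    # 2.5 3.0 2.0
    # -2.5 -3.0 -2.0

The two models agree everywhere except on exact halfway values, so code that
never sees such inputs cannot distinguish them.
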

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.

.. _int_roundeven:

'``llvm.roundeven.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.roundeven.f32(float %Val)
      declare double    @llvm.roundeven.f64(double %Val)
      declare x86_fp80  @llvm.roundeven.f80(x86_fp80 %Val)
      declare fp128     @llvm.roundeven.f128(fp128 %Val)
      declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to
the nearest value that is an even integer).

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven``,
except that it does not raise floating-point exceptions.

'``llvm.lround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lround`` on any
floating-point type or vector of floating-point type. Not all targets
support all types, however.


::

      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lround`` functions
would, but without setting errno. If the rounded value is too large to
be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).

'``llvm.llround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llround`` on any
floating-point type. Not all targets support all types, however.

::

      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

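
As a hedged illustration, the Python sketch below models the value computed by
'``llvm.lround.*``' and '``llvm.llround.*``' for a finite input and a given
result width. The function ``lround_model`` and its ``bits`` parameter are
hypothetical. Returning ``None`` is only this sketch's way of flagging a
rounded value that does not fit in the result type, a case for which no single
result can be modeled.

.. code-block:: python

    import math

    def lround_model(x, bits=64):
        """Hypothetical model: round a finite float to a signed integer.

        Ties go away from zero. Returns None when the rounded value does
        not fit in a signed integer of the given width.
        """
        frac, whole = math.modf(x)      # x == whole + frac, computed exactly
        rounded = int(whole)
        if abs(frac) >= 0.5:
            rounded += 1 if x > 0 else -1
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        return rounded if lo <= rounded <= hi else None

    print(lround_model(2.5))            # 3
    print(lround_model(-2.5))           # -3
    print(lround_model(3e9, bits=32))   # None: 3000000000 is outside i32

The width matters only for the range check: the rounding itself is the same
for the ``i32`` and ``i64`` variants.
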

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``llround``
functions would, but without setting errno. If the rounded value is
too large to be stored in the result type, the return value is a
non-deterministic value (equivalent to ``freeze poison``).

.. _int_lrint:

'``llvm.lrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
floating-point type or vector of floating-point type. Not all targets
support all types, however.

::

      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
integer.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint`` functions
would, but without setting errno. If the rounded value is too large to
be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).

.. _int_llrint:

'``llvm.llrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
floating-point type or vector of floating-point type. Not all targets
support all types, however.

::

      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
integer.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``llrint`` functions
would, but without setting errno. If the rounded value is too large to
be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).

Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.

.. _int_bitreverse:

'``llvm.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bitreverse`` on
any integer type.


::

      declare i16 @llvm.bitreverse.i16(i16 <id>)
      declare i32 @llvm.bitreverse.i32(i32 <id>)
      declare i64 @llvm.bitreverse.i64(i64 <id>)
      declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)

Overview:
"""""""""

The '``llvm.bitreverse``' family of intrinsics is used to reverse the
bit pattern of an integer value or vector of integer values; for example
``0b10110110`` becomes ``0b01101101``.

Semantics:
""""""""""

The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output. The vector
intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
basis and the element order is not affected.

.. _int_bswap:

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)
      declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
value or vector of integer values with an even number of bytes (positive
multiple of 16 bits).

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order.
The ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
operate on a per-element basis and the element order is not affected.

.. _int_ctpop:

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.ctpop.i8(i8 <src>)
      declare i16 @llvm.ctpop.i16(i16 <src>)
      declare i32 @llvm.ctpop.i32(i32 <src>)
      declare i64 @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the bits set to 1 in a variable, or
within each element of a vector.

.. _int_ctlz:

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.


::

      declare i8        @llvm.ctlz.i8   (i8        <src>, i1 <is_zero_poison>)
      declare <2 x i37> @llvm.ctlz.v2i37(<2 x i37> <src>, i1 <is_zero_poison>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument is a constant flag that indicates whether the intrinsic
returns a valid result if the first argument is zero. If the first
argument is zero and the second argument is true, the result is poison.
Historically some architectures did not provide a defined result for zero
values as efficiently, and many algorithms are now predicated on avoiding
zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of the vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.

.. _int_cttz:

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i42       @llvm.cttz.i42  (i42       <src>, i1 <is_zero_poison>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_poison>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

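
The counting intrinsics in this section can be modeled informally in Python on
an ``N``-bit unsigned value. The helper names ``ctpop``, ``ctlz`` and ``cttz``
below and their ``bits`` parameter are assumptions of this sketch, not LLVM
APIs, and ``None`` stands in for ``poison``, which has no Python equivalent.

.. code-block:: python

    def ctpop(x, bits):
        """Hypothetical model: number of set bits in the low `bits` bits."""
        return bin(x & ((1 << bits) - 1)).count("1")

    def ctlz(x, bits, is_zero_poison=False):
        """Hypothetical model: leading zeros of a `bits`-wide value."""
        x &= (1 << bits) - 1
        if x == 0:
            return None if is_zero_poison else bits
        return bits - x.bit_length()

    def cttz(x, bits, is_zero_poison=False):
        """Hypothetical model: trailing zeros of a `bits`-wide value."""
        x &= (1 << bits) - 1
        if x == 0:
            return None if is_zero_poison else bits
        return (x & -x).bit_length() - 1    # x & -x isolates the lowest set bit

    print(ctlz(2, 32), cttz(2, 32), ctpop(0b10110110, 8))   # 30 1 5

A zero input is the only case in which the flag matters: without it both
counts equal the bit width, and with it the model yields ``None``.
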

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument is a constant flag that indicates whether the intrinsic
returns a valid result if the first argument is zero. If the first
argument is zero and the second argument is true, the result is poison.
Historically some architectures did not provide a defined result for zero
values as efficiently, and many algorithms are now predicated on avoiding
zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_poison == 0`` and ``poison`` otherwise. For example,
``llvm.cttz(2) = 1``.

.. _int_overflow:

.. _int_fshl:

'``llvm.fshl.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
      declare i64 @llvm.fshl.i64(i64 %a, i64 %b, i64 %c)
      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted left, and the most
significant bits are extracted to produce a result that is the same size as the
original arguments.
If the first 2 arguments are identical, this is equivalent
to a rotate left operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)

.. _int_fshr:

'``llvm.fshr.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
      declare i64 @llvm.fshr.i64(i64 %a, i64 %b, i64 %c)
      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted right, and the least
significant bits are extracted to produce a result that is the same size as the
original arguments.
If the first 2 arguments are identical, this is equivalent
to a rotate right operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct. The first
element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise.
An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments, and indicates whether an overflow
occurred during the signed summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments.
They return a structure whose
first element is the signed sum and whose second element
is a bit specifying whether the signed addition resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments, and indicates whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments.
They return a structure whose
first element is the sum and whose second element is a
bit specifying whether the unsigned addition resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal

'``llvm.ssub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments, and indicates whether an
overflow occurred during the signed subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
subtraction.

Semantics:
""""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments.
They return a structure whose
first element is the difference and whose second element
is a bit specifying whether the signed subtraction resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments, and indicates whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments.
They return a structure whose
first element is the difference and whose second element
is a bit specifying whether the unsigned subtraction resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments, and indicates whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments.
They return a structure whose
first element is the product and whose second element
is a bit specifying whether the signed multiplication resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments, and indicates whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments.
They return a structure whose
first element is the product and whose second
element is a bit specifying whether the unsigned multiplication
resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

Saturation Arithmetic Intrinsics
---------------------------------

Saturation arithmetic is a version of arithmetic in which operations are
limited to a fixed range between a minimum and maximum value. If the result of
an operation is greater than the maximum value, the result is set (or
"clamped") to this maximum. If it is below the minimum, it is clamped to this
minimum.

.. _int_sadd_sat:

'``llvm.sadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.sadd.sat``' family of intrinsic functions performs signed
saturating addition on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed addition.
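
As a rough illustration of the clamping described for this family, a signed
saturating addition could be expanded by hand using
``llvm.sadd.with.overflow``. This is only a sketch for ``i32``: the function
name ``@sadd_sat_sketch`` is hypothetical, and a real lowering may look quite
different.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      ; Hypothetical helper, not part of LLVM.
      define i32 @sadd_sat_sketch(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %ovf = extractvalue {i32, i1} %res, 1
        ; On signed overflow the wrapped sum has the opposite sign of the
        ; exact result, so a negative wrapped sum means "clamp to the maximum".
        %wrapped_neg = icmp slt i32 %sum, 0
        %clamp = select i1 %wrapped_neg, i32 2147483647, i32 -2147483648
        %sat = select i1 %ovf, i32 %clamp, i32 %sum
        ret i32 %sat
      }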

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8


.. _int_uadd_sat:

'``llvm.uadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.uadd.sat``' family of intrinsic functions performs unsigned
saturating addition on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments. Because this is an unsigned
operation, the result will never saturate towards zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15


.. _int_ssub_sat:

'``llvm.ssub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ssub.sat``' family of intrinsic functions performs signed
saturating subtraction on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed subtraction.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7


.. _int_usub_sat:

'``llvm.usub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic.
You can use ``llvm.usub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.usub.sat``' family of intrinsic functions performs unsigned
saturating subtraction on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned subtraction.

Semantics:
""""""""""

The minimum value this operation can clamp to is 0, which is the smallest
unsigned value representable by the bit width of the unsigned arguments.
Because this is an unsigned operation, the result will never saturate towards
the largest possible value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0


'``llvm.sshl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.sshl.sat``' family of intrinsic functions performs signed
saturating left shift on the first argument.
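
One way to picture a signed saturating left shift is to shift, shift back, and
check whether the original value survives the round trip. The following is a
minimal sketch for ``i32`` that assumes the shift amount is smaller than the
bit width; ``@sshl_sat_sketch`` is a hypothetical name, and this is not
necessarily how any target lowers the intrinsic.

.. code-block:: llvm

      ; Hypothetical helper, not part of LLVM. Assumes %b < 32.
      define i32 @sshl_sat_sketch(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b
        ; If no significant bits were shifted out, an arithmetic shift
        ; right recovers the original value.
        %back = ashr i32 %shl, %b
        %lossless = icmp eq i32 %back, %a
        ; Otherwise clamp towards the sign of the value being shifted.
        %is_neg = icmp slt i32 %a, 0
        %clamp = select i1 %is_neg, i32 -2147483648, i32 2147483647
        %sat = select i1 %lossless, i32 %shl, i32 %clamp
        ret i32 %sat
      }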

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.


Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2


'``llvm.ushl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ushl.sat``' family of intrinsic functions performs unsigned
saturating left shift on the first argument.
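
The unsigned variant can be pictured the same way, with a logical shift for
the round trip and the all-ones value as the clamp. This is a minimal sketch
for ``i32`` that assumes the shift amount is smaller than the bit width;
``@ushl_sat_sketch`` is a hypothetical name.

.. code-block:: llvm

      ; Hypothetical helper, not part of LLVM. Assumes %b < 32.
      define i32 @ushl_sat_sketch(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b
        ; A logical shift right recovers %a only if no set bits were lost.
        %back = lshr i32 %shl, %b
        %lossless = icmp eq i32 %back, %a
        ; i32 -1 is the all-ones pattern, i.e. the largest unsigned value.
        %sat = select i1 %lossless, i32 %shl, i32 -1
        ret i32 %sat
      }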

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15


Fixed Point Arithmetic Intrinsics
---------------------------------

A fixed point number represents a real data type for a number that has a fixed
number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. These
are useful for representing fractional values to a specific precision. The
following intrinsics perform fixed point arithmetic operations on two operands
of the same scale, specified as the third argument.

The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
of fixed point numbers through scaled integers. Therefore, fixed point
multiplication can be represented as

.. code-block:: llvm

        %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %mul = mul nsw nuw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
        %result = trunc i8 %r to i4

The ``llvm.*div.fix`` family of intrinsic functions represents a division of
fixed point numbers through scaled integers. Fixed point division can be
represented as:

.. code-block:: llvm

        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %scale2 = trunc i32 %scale to i8
        %a3 = shl i8 %a2, %scale2
        %r = sdiv i8 %a3, %b2  ; this is for a target rounding towards zero
        %result = trunc i8 %r to i4

For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is not fixed
by this document, since the preferred rounding may vary between targets;
instead, it is specified through a target hook. Different pipelines should
legalize or optimize this using the rounding specified by this hook if it is
provided. Operations like constant folding, instruction combining, KnownBits,
and ValueTracking should also use this hook, if provided, and not assume the
direction of rounding. A rounded result must always be within one unit of
precision from the true result. That is, the error between the returned result
and the true result must be less than 1/2^(scale).


'``llvm.smul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix``' family of intrinsic functions performs signed
fixed point multiplication on two arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the two arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)


'``llvm.umul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix``' family of intrinsic functions performs unsigned
fixed point multiplication on two arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs unsigned fixed point multiplication on the two
arguments of a specified scale. The result will also be returned in the same
scale specified in the third argument.
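
In the style of the signed expansion given in the introduction to this family,
an unsigned multiplication could be sketched as follows for ``i16`` operands
with a constant scale of 4. The function name ``@umul_fix_q4_sketch`` and the
choice of scale are assumptions made for illustration; the sketch truncates the
discarded fraction bits, which is only one of the roundings a target might
choose.

.. code-block:: llvm

      ; Hypothetical helper, not part of LLVM: i16 operands, scale = 4.
      define i16 @umul_fix_q4_sketch(i16 %a, i16 %b) {
        ; Widen so the full 32-bit product is available.
        %a2 = zext i16 %a to i32
        %b2 = zext i16 %b to i32
        %mul = mul nuw i32 %a2, %b2
        ; The product carries twice the scale; drop 4 fraction bits
        ; (this rounds down).
        %r = lshr i32 %mul, 4
        ; The truncation is only meaningful when the result fits in i16.
        %result = trunc i32 %r to i16
        ret i16 %result
      }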

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 3.5 or up to 4
      %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)


'``llvm.smul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix.sat``' family of intrinsic functions performs signed
fixed point saturating multiplication on two arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
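
A hand-written expansion of the saturating form might widen, multiply, rescale,
and then clamp before truncating. The following is a minimal sketch for ``i16``
operands with a constant scale of 4, using the ``llvm.smax`` and ``llvm.smin``
intrinsics for the clamp. The name ``@smul_fix_sat_q4_sketch`` and the scale
are assumptions for illustration, and the sketch rounds towards negative
infinity, which is only one permitted choice.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare i32 @llvm.smin.i32(i32, i32)

      ; Hypothetical helper, not part of LLVM: i16 operands, scale = 4.
      define i16 @smul_fix_sat_q4_sketch(i16 %a, i16 %b) {
        %a2 = sext i16 %a to i32
        %b2 = sext i16 %b to i32
        ; The product of two sign-extended i16 values always fits in i32.
        %mul = mul nsw i32 %a2, %b2
        ; Drop 4 fraction bits (rounds towards negative infinity).
        %r = ashr i32 %mul, 4
        ; Clamp to the signed i16 range [-32768, 32767].
        %lo = call i32 @llvm.smax.i32(i32 %r, i32 -32768)
        %clamped = call i32 @llvm.smin.i32(i32 %lo, i32 32767)
        %result = trunc i32 %clamped to i16
        ret i16 %result
      }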

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)

      ; Saturation
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7

      ; Scale can affect the saturation result
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)


'``llvm.umul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic.
You can use ``llvm.umul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix.sat``' family of intrinsic functions performs unsigned
fixed point saturating multiplication on two arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the two arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first two arguments. The minimum value is the
smallest unsigned value representable by this bit width (zero).


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 2 or up to 2.5
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)

      ; Saturation
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)

      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)


'``llvm.sdiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix``' family of intrinsic functions performs signed
fixed point division on two arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division.
The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.udiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
fixed point division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also work with
int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.sdiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest signed value representable by this bit width.

It is undefined behavior if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)


'``llvm.udiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
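
The scaling and clamping performed by ``llvm.udiv.fix.sat`` can be modeled
informally on raw integer encodings. The Python sketch below is an
illustration only and is not part of the IR: the name ``udiv_fix_sat`` and the
explicit ``bits`` parameter are hypothetical, and the sketch rounds toward
zero, which is just one of the rounding directions the intrinsic permits.

.. code-block:: python

      def udiv_fix_sat(a: int, b: int, scale: int, bits: int) -> int:
          """Hypothetical model of unsigned saturating fixed point division.

          ``a`` and ``b`` are raw encodings of a / 2**scale and b / 2**scale.
          """
          if b == 0:
              # The intrinsic has undefined behavior here; the model just refuses.
              raise ZeroDivisionError("second argument must not be zero")
          # Pre-shift the dividend so the quotient is expressed in the same scale.
          quotient = (a << scale) // b  # assumption: round toward zero
          # Clamp to the largest unsigned value representable in ``bits`` bits.
          return min(quotient, (1 << bits) - 1)

      print(udiv_fix_sat(6, 4, 1, 4))  # 3 / 2 = 1.5, encoded as 3
      print(udiv_fix_sat(8, 2, 2, 4))  # 2 / 0.5 = 4, clamped to 15 (3.75)

With an ``i4`` operand width, the two calls correspond to the second and last
entries of the examples for this intrinsic.
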

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest unsigned value representable by this bit width (zero).

It is undefined behavior if the second argument is zero.

Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)

      ; The result in the following could be rounded down to 0.5 or up to 1
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)


Specialized Arithmetic Intrinsics
---------------------------------

.. _i_intr_llvm_canonicalize:

'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as frexp.
The canonical encoding is
defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division by
1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
-0.0 is also sufficient provided that the rounding mode is not -Infinity.

``@llvm.canonicalize`` must preserve the equality relation.
That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent
  to ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.

The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating-point value are not examined.

.. _int_fmuladd:

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, a and b, and an addend c.
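
Whether the product is rounded before the addition can change the result that
a multiply-add produces. The Python sketch below is an illustration only and
is not part of the IR: the helper names are hypothetical, the arithmetic is
done in double precision, and exact rational arithmetic from the ``fractions``
module stands in for a hardware fused operation.

.. code-block:: python

      from fractions import Fraction

      def unfused_muladd(a: float, b: float, c: float) -> float:
          """Round the product first, then add and round again."""
          product = a * b
          return product + c

      def fused_muladd(a: float, b: float, c: float) -> float:
          """Evaluate a * b + c exactly and round once, as a fused operation does."""
          exact = Fraction(a) * Fraction(b) + Fraction(c)
          return float(exact)

      a = 1.0 + 2.0 ** -27
      c = -(1.0 + 2.0 ** -26)
      unfused = unfused_muladd(a, a, c)  # 0.0: the 2**-54 term is lost when the product is rounded
      fused = fused_muladd(a, a, c)      # 2**-54: the term survives the single rounding

Either value is an acceptable result for a multiply-add whose fusion is left
to the code generator, which is why code that depends on one of them needs an
operation with a fixed rounding behavior.
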

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression a \* b + c, except that it is unspecified
whether rounding will be performed between the multiplication and addition
steps. Fusion is not guaranteed, even if the target platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this never sets errno.

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c


Hardware-Loop Intrinsics
------------------------

LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required to either lower these intrinsics
further to target specific instructions, or revert the hardware-loop to a
normal loop if target specific restrictions are not met and a hardware-loop
can't be generated.

These intrinsics may be modified in the future and are not intended to be used
outside the backend. Thus, front-end and mid-level optimizations should not be
generating these intrinsics.


'``llvm.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare void @llvm.set.loop.iterations.i32(i32)
      declare void @llvm.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
hardware-loop trip count. They are placed in the loop preheader basic block and
are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.


'``llvm.start.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i64 @llvm.start.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the
hardware-loop trip count, but they also produce a value identical to the input
that can be used as the input to the loop. They are placed in the loop
preheader basic block and the output is expected to be the input to the
phi for the induction variable of the loop, decremented by the
'``llvm.loop.decrement.reg.*``'.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
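
The way the value produced in the preheader feeds a counter that is
decremented on every iteration can be sketched as a small Python model. This
is an illustration only and is not part of the IR: the function names mirror
the intrinsics but are hypothetical helpers, the loop variable plays the role
of the phi, and a decrement of 1 per iteration is assumed.

.. code-block:: python

      def start_loop_iterations(trip_count: int) -> int:
          """Model of the preheader intrinsic: returns its operand unchanged."""
          return trip_count

      def loop_decrement_reg(counter: int, step: int) -> int:
          """Model of the decrement: a subtraction that must not wrap."""
          remaining = counter - step
          if remaining < 0:
              raise OverflowError("the decrement is not allowed to wrap")
          return remaining

      def run_hardware_loop(trip_count: int, body) -> int:
          """Run ``body`` once per iteration and return how often it ran."""
          if trip_count <= 0:
              # Assumption: this form has no entry test, so the count is non-zero.
              raise ValueError("trip count must be positive")
          counter = start_loop_iterations(trip_count)  # preheader
          iterations = 0
          while True:
              body(iterations)
              iterations += 1
              counter = loop_decrement_reg(counter, 1)  # value fed back to the phi
              if counter == 0:                          # compare and branch
                  break
          return iterations

      seen = []
      count = run_hardware_loop(3, seen.append)

The operand is the trip count rather than the back-edge taken count, so a trip
count of 3 runs the body three times.
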

'``llvm.test.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.test.set.loop.iterations.i32(i32)
      declare i1 @llvm.test.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
them to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is the conditional value of whether the given count is
not zero.


'``llvm.test.start.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
      declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics: they specify the hardware-loop trip count, but they also produce a
value identical to the input that can be used as the input to the loop. The
second i1 output controls entry to a while-loop.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair of the input and a conditional value of
whether the given count is not zero.


'``llvm.loop.decrement.reg.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
      declare i64 @llvm.loop.decrement.reg.i64(i64, i64)

Overview:
"""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
iteration counter and return an updated value that will be used in the next
loop test check.

Arguments:
""""""""""

Both arguments must have identical integer types. The first operand is the
loop iteration counter. The second operand is the maximum number of elements
processed in an iteration.

Semantics:
""""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimizations are allowed to treat it as a ``SUB``, and it is supported by
SCEV, so it's the backend's responsibility to handle cases where it may be
optimized. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.


'``llvm.loop.decrement.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.loop.decrement.i32(i32)
      declare i1 @llvm.loop.decrement.i64(i64)

Overview:
"""""""""

The HardwareLoops pass allows the loop decrement value to be specified with an
option. It defaults to a loop decrement value of 1, but it can be an unsigned
integer value provided by this option. The '``llvm.loop.decrement.*``'
intrinsics decrement the loop iteration counter with this value, and return a
false predicate if the loop should exit, and true otherwise.
This is emitted if the loop counter is not updated via a ``PHI`` node, which
can also be controlled with an option.

Arguments:
""""""""""

The integer argument is the loop decrement value used to decrement the loop
iteration counter.

Semantics:
""""""""""

The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap.
The result is a condition
that is used by the conditional branch controlling the loop.


Vector Reduction Intrinsics
---------------------------

Horizontal reductions of vectors can be expressed using the following
intrinsics. Each one takes a vector operand as an input and applies its
respective operation across all elements of the vector, returning a single
scalar result of the same element type.

.. _int_vector_reduce_add:

'``llvm.vector.reduce.add.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_fadd:

'``llvm.vector.reduce.fadd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
``ADD`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart.
Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction. That is, the reduction begins with
the start value and performs an fadd operation with consecutively increasing
vector element indices. See the following pseudocode:

::

      float sequential_fadd(start_value, input_vector)
        result = start_value
        for i = 0 to length(input_vector) - 1
          result = result + input_vector[i]
        return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating point addition.

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction


.. _int_vector_reduce_mul:

'``llvm.vector.reduce.mul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_fmul:

'``llvm.vector.reduce.fmul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
``MUL`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart. Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction. That is, the reduction begins with
the start value and performs an fmul operation with consecutively increasing
vector element indices. See the following pseudocode:

::

      float sequential_fmul(start_value, input_vector)
        result = start_value
        for i = 0 to length(input_vector) - 1
          result = result * input_vector[i]
        return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating point multiplication.

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction

.. _int_vector_reduce_and:

'``llvm.vector.reduce.and.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_or:

'``llvm.vector.reduce.or.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
of a vector, returning the result as a scalar. The return type matches the
element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_xor:

'``llvm.vector.reduce.xor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_smax:

'``llvm.vector.reduce.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_smin:

'``llvm.vector.reduce.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_umax:

'``llvm.vector.reduce.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
integer ``MAX`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_umin:

'``llvm.vector.reduce.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
integer ``MIN`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_fmax:

'``llvm.vector.reduce.fmax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with maximum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

.. _int_vector_reduce_fmin:

'``llvm.vector.reduce.fmin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.minnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with minimum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

.. _int_vector_reduce_fmaximum:

'``llvm.vector.reduce.fmaximum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmaximum.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmaximum.*``' intrinsics do a floating-point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.maximum.*``'
intrinsic. That is, this intrinsic propagates NaNs and +0.0 is considered
greater than -0.0. If any element of the vector is a NaN, the result is NaN.
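
The NaN and signed-zero rules above can be illustrated with a small scalar
model. This is only a sketch under stated assumptions: ``maximum_model`` and
``reduce_fmaximum_model`` are hypothetical helper names, the input is assumed
to be a non-empty fixed-width vector, and the code is not how LLVM lowers the
intrinsic.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical scalar model of the llvm.maximum comparison rule:
// a NaN in either operand yields NaN, and +0.0 orders above -0.0.
inline float maximum_model(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<float>::quiet_NaN();
  if (a == b)                       // equal values can differ only in the sign of zero
    return std::signbit(a) ? b : a; // prefer the operand without the sign bit
  return a > b ? a : b;
}

// Hypothetical model of llvm.vector.reduce.fmaximum on a non-empty vector.
inline float reduce_fmaximum_model(const std::vector<float> &v) {
  float acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    acc = maximum_model(acc, v[i]);
  return acc;
}
```

In this model a single NaN lane makes the whole result NaN, which is the
behavioral difference from the ``llvm.vector.reduce.fmax.*`` intrinsics
described earlier.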

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

.. _int_vector_reduce_fminimum:

'``llvm.vector.reduce.fminimum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fminimum.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fminimum.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.minimum.*``'
intrinsic. That is, this intrinsic propagates NaNs and -0.0 is considered less
than +0.0. If any element of the vector is a NaN, the result is NaN.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

'``llvm.vector.insert``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      ; Insert fixed type into scalable type
      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 <idx>)
      declare <vscale x 2 x double> @llvm.vector.insert.nxv2f64.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 <idx>)

      ; Insert scalable type into scalable type
      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.nxv2f32(<vscale x 4 x float> %vec, <vscale x 2 x float> %subvec, i64 <idx>)

      ; Insert fixed type into fixed type
      declare <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %vec, <2 x double> %subvec, i64 <idx>)

Overview:
"""""""""

The '``llvm.vector.insert.*``' intrinsics insert a vector into another vector
starting from a given index. The return type matches the type of the vector we
insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors; however, this intrinsic can also be used on purely fixed
types.

Scalable vectors can only be inserted into other scalable vectors.

Arguments:
""""""""""

The ``vec`` is the vector which ``subvec`` will be inserted into.
The ``subvec`` is the vector that will be inserted.

``idx`` represents the starting element number at which ``subvec`` will be
inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
cannot be determined statically but is false at runtime, then the result vector
is a :ref:`poison value <poisonvalues>`.
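
For fixed-width vectors, the overwrite rule described above can be modeled in
a few lines. This is a sketch only: ``vector_insert_model`` is a hypothetical
name, a ``false`` return stands in for the poison result, and the requirement
that ``idx`` be a multiple of the subvector length is assumed to have been
checked by the caller.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of llvm.vector.insert on fixed-width vectors.
// Overwrites vec[idx .. idx + subvec.size() - 1] with subvec.
// Returns false, leaving vec untouched, when those indices are not all
// valid; the intrinsic itself would produce a poison value in that case.
template <typename T>
bool vector_insert_model(std::vector<T> &vec, const std::vector<T> &subvec,
                         std::size_t idx) {
  if (idx > vec.size() || subvec.size() > vec.size() - idx)
    return false;
  for (std::size_t i = 0; i < subvec.size(); ++i)
    vec[idx + i] = subvec[i];
  return true;
}
```

Inserting ``<8, 9>`` into ``<0, 1, 2, 3>`` at ``idx`` 2 gives ``<0, 1, 8, 9>``
in this model, while ``idx`` 4 is rejected because the subvector would run
past the end of ``vec``.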


'``llvm.vector.extract``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      ; Extract fixed type from scalable type
      declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
      declare <2 x double> @llvm.vector.extract.v2f64.nxv2f64(<vscale x 2 x double> %vec, i64 <idx>)

      ; Extract scalable type from scalable type
      declare <vscale x 2 x float> @llvm.vector.extract.nxv2f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)

      ; Extract fixed type from fixed type
      declare <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double> %vec, i64 <idx>)

Overview:
"""""""""

The '``llvm.vector.extract.*``' intrinsics extract a vector from within another
vector starting from a given index. The return type must be explicitly
specified. Conceptually, this can be used to decompose a scalable vector into
non-scalable parts, however this intrinsic can also be used on purely fixed
types.

Scalable vectors can only be extracted from other scalable vectors.

Arguments:
""""""""""

The ``vec`` is the vector from which we will extract a subvector.

The ``idx`` specifies the starting element number within ``vec`` from which a
subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
vector length of the result type. If the result type is a scalable vector,
``idx`` is first scaled by the result type's runtime scaling factor. Elements
``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
indices. If this condition cannot be determined statically but is false at
runtime, then the result vector is a :ref:`poison value <poisonvalues>`. The
``idx`` parameter must be a vector index constant type (for most targets this
will be an integer pointer type).

'``llvm.vector.reverse``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x i8> @llvm.vector.reverse.v2i8(<2 x i8> %a)
      declare <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reverse.*``' intrinsics reverse a vector.
The intrinsic takes a single vector and returns a vector of matching type but
with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic supports all vector types
the recommended way to express this operation for fixed-width vectors is
still to use a shufflevector, as that may allow for more optimization
opportunities.

Arguments:
""""""""""

The argument to this intrinsic must be a vector.

'``llvm.vector.deinterleave2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare {<2 x double>, <2 x double>} @llvm.vector.deinterleave2.v4f64(<4 x double> %vec1)
      declare {<vscale x 4 x i32>, <vscale x 4 x i32>}  @llvm.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> %vec1)

Overview:
"""""""""

The '``llvm.vector.deinterleave2``' intrinsic constructs two
vectors by deinterleaving the even and odd lanes of the input vector.

This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types the recommended way to express this operation for
fixed-width vectors is still to use a shufflevector, as that may allow for more
optimization opportunities.

For example:

.. code-block:: text

   {<2 x i64>, <2 x i64>} llvm.vector.deinterleave2.v4i64(<4 x i64> <i64 0, i64 1, i64 2, i64 3>); ==> {<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>}

Arguments:
""""""""""

The argument is a vector whose type corresponds to the logical concatenation of
the two result types.

'``llvm.vector.interleave2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x double> @llvm.vector.interleave2.v4f64(<2 x double> %vec1, <2 x double> %vec2)
      declare <vscale x 8 x i32> @llvm.vector.interleave2.nxv8i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2)

Overview:
"""""""""

The '``llvm.vector.interleave2``' intrinsic constructs a vector
by interleaving two input vectors.

This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
fixed-width vectors is still to use a shufflevector, as that may allow for more
optimization opportunities.

For example:

.. code-block:: text

   <4 x i64> llvm.vector.interleave2.v4i64(<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>); ==> <4 x i64> <i64 0, i64 1, i64 2, i64 3>

Arguments:
""""""""""
Both arguments must be vectors of the same type whereby their logical
concatenation matches the result type.

'``llvm.experimental.cttz.elts``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.experimental.cttz.elts``
on any vector of integer elements, both fixed width and scalable.

::

      declare i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1> <src>, i1 <is_zero_poison>)

Overview:
"""""""""

The '``llvm.experimental.cttz.elts``' intrinsic counts the number of trailing
zero elements of a vector.

Arguments:
""""""""""

The first argument is the vector to be counted. This argument must be a vector
with integer element type. The return type must also be an integer type which is
wide enough to hold the maximum number of elements of the source vector. The
behavior of this intrinsic is undefined if the return type is not wide enough
for the number of elements in the input vector.

The second argument is a constant flag that indicates whether the intrinsic
returns a valid result if the first argument is all zero. If the first argument
is all zero and the second argument is true, the result is poison.

Semantics:
""""""""""

The '``llvm.experimental.cttz.elts``' intrinsic counts the trailing (least
significant) zero elements in a vector. If ``src == 0`` the result is the
number of elements in the input vector.

'``llvm.vector.splice``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x double> @llvm.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
      declare <vscale x 4 x i32> @llvm.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)

Overview:
"""""""""

The '``llvm.vector.splice.*``' intrinsics construct a vector by
concatenating elements from the first input vector with elements of the second
input vector, returning a vector of the same type as the input vectors. The
signed immediate, modulo the number of elements in the vector, is the index
into the first vector from which to extract the result value. This means
conceptually that for a positive immediate, a vector is extracted from
``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
immediate, it extracts ``-imm`` trailing elements from the first vector, and
the remaining elements from ``%vec2``.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
supports all vector types the recommended way to express this operation for
fixed-width vectors is still to use a shufflevector, as that may allow for more
optimization opportunities.

For example:

.. code-block:: text

   llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, 1);  ==> <B, C, D, E> index
   llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, -3); ==> <B, C, D, E> trailing elements


Arguments:
""""""""""

The first two operands are vectors with the same type. The start index is imm
modulo the runtime number of elements in the source vector. For a fixed-width
vector <N x eltty>, imm is a signed integer constant in the range
-N <= imm < N. For a scalable vector <vscale x N x eltty>, imm is a signed
integer constant in the range -X <= imm < X where X=vscale_range_min * N.

'``llvm.stepvector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is an overloaded intrinsic. You can use ``llvm.stepvector``
to generate a vector whose lane values comprise the linear sequence
<0, 1, 2, ...>. It is primarily intended for scalable vectors.

::

      declare <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
      declare <vscale x 8 x i16> @llvm.stepvector.nxv8i16()

The '``llvm.stepvector``' intrinsics are used to create vectors
of integers whose elements contain a linear sequence of values starting from 0
with a step of 1. This intrinsic can only be used for vectors with integer
elements that are at least 8 bits in size. If the sequence value exceeds
the allowed limit for the element type then the result for that lane is
a poison value.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
fixed-width vectors is still to generate a constant vector instead.


Arguments:
""""""""""

None.


'``llvm.experimental.get.vector.length``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.experimental.get.vector.length.i32(i32 %cnt, i32 immarg %vf, i1 immarg %scalable)
      declare i32 @llvm.experimental.get.vector.length.i64(i64 %cnt, i32 immarg %vf, i1 immarg %scalable)

Overview:
"""""""""

The '``llvm.experimental.get.vector.length.*``' intrinsics take a number of
elements to process and return how many of the elements can be processed
with the requested vectorization factor.

Arguments:
""""""""""

The first argument is an unsigned value of any scalar integer type and specifies
the total number of elements to be processed. The second argument is an i32
immediate for the vectorization factor. The third argument indicates if the
vectorization factor should be multiplied by vscale.
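
How the three arguments combine can be sketched with a scalar model. The
names below are hypothetical, ``vscale`` is passed in explicitly because it is
a target-dependent runtime value, ``vf`` is assumed to be positive, and the
clamp ``min(cnt, max_lanes)`` is only one result permitted by the constraints
listed under Semantics; a target is free to return other values within them.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: number of lanes described by %vf and %scalable.
inline std::uint64_t max_lanes_model(std::uint32_t vf, bool scalable,
                                     std::uint64_t vscale) {
  return scalable ? std::uint64_t{vf} * vscale : std::uint64_t{vf};
}

// One permissible result of the intrinsic: clamp the remaining count
// to the lane limit. Real targets may choose differently.
inline std::uint64_t get_vector_length_model(std::uint64_t cnt,
                                             std::uint32_t vf, bool scalable,
                                             std::uint64_t vscale) {
  return std::min(cnt, max_lanes_model(vf, scalable, vscale));
}

// Counts the iterations of a loop that consumes the returned number of
// elements each time until the count reaches zero (requires vf > 0).
inline std::uint64_t count_iterations(std::uint64_t cnt, std::uint32_t vf,
                                      bool scalable, std::uint64_t vscale) {
  std::uint64_t iters = 0;
  while (cnt != 0) {
    cnt -= get_vector_length_model(cnt, vf, scalable, vscale);
    ++iters;
  }
  return iters;
}
```

With 10 elements and a fixed vectorization factor of 4, this model processes
4, 4, and 2 elements, for three iterations in total.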

Semantics:
""""""""""

Returns a non-negative i32 value (explicit vector length) that is unknown at compile
time and depends on the hardware specification.
If the result value does not fit in the result type, then the result is
a :ref:`poison value <poisonvalues>`.

This intrinsic is intended to be used by loop vectorization with VP intrinsics
in order to get the number of elements to process on each loop iteration. The
result should be used to decrease the count for the next iteration until the
count reaches zero.

Let ``%max_lanes`` be the number of lanes in the type described by ``%vf`` and
``%scalable``. The returned value is subject to the following constraints:

- If ``%cnt`` is equal to 0, the returned value is 0.
- The returned value is always less than or equal to ``%max_lanes``.
- The returned value is always greater than or equal to ``ceil(%cnt / ceil(%cnt / %max_lanes))``,
  if ``%cnt`` is non-zero.
- The returned values are monotonically non-increasing in each loop iteration. That is,
  the returned value of an iteration is at least as large as that of any later
  iteration.

Note that these constraints have the following implications:

- For a loop that uses this intrinsic, the number of iterations is equal to
  ``ceil(%C / %max_lanes)`` where ``%C`` is the initial ``%cnt`` value.
- If ``%cnt`` is non-zero, the return value is non-zero as well.
- If ``%cnt`` is less than or equal to ``%max_lanes``, the return value is equal to ``%cnt``.

'``llvm.experimental.vector.partial.reduce.add.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v8i32(<4 x i32> %a, <8 x i32> %b)
      declare <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v16i32(<4 x i32> %a, <16 x i32> %b)
      declare <vscale x 4 x i32> @llvm.experimental.vector.partial.reduce.add.nxv4i32.nxv4i32.nxv8i32(<vscale x 4 x i32> %a, <vscale x 8 x i32> %b)
      declare <vscale x 4 x i32> @llvm.experimental.vector.partial.reduce.add.nxv4i32.nxv4i32.nxv16i32(<vscale x 4 x i32> %a, <vscale x 16 x i32> %b)

Overview:
"""""""""

The '``llvm.experimental.vector.partial.reduce.add.*``' intrinsics reduce the
concatenation of the two vector operands down to the number of elements dictated
by the result type. The result type is a vector type that matches the type of the
first operand vector.

Arguments:
""""""""""

Both arguments must be vectors of matching element types. The first argument type must
match the result type, while the second argument type must have a vector length that is a
positive integer multiple of the first vector/result type. The arguments must either
both be fixed or both be scalable vectors.


'``llvm.experimental.vector.histogram.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These intrinsics are overloaded.

These intrinsics represent histogram-like operations; that is, updating values
in memory that may not be contiguous, and where multiple elements within a
single vector may be updating the same value in memory.

The update operation must be specified as part of the intrinsic name. For a
simple histogram like the following, the ``add`` operation would be used.

.. code-block:: c

    void simple_histogram(int *restrict buckets, unsigned *indices, int N, int inc) {
      for (int i = 0; i < N; ++i)
        buckets[indices[i]] += inc;
    }

More update operation types may be added in the future.

::

      declare void @llvm.experimental.vector.histogram.add.v8p0.i32(<8 x ptr> %ptrs, i32 %inc, <8 x i1> %mask)
      declare void @llvm.experimental.vector.histogram.add.nxv2p0.i64(<vscale x 2 x ptr> %ptrs, i64 %inc, <vscale x 2 x i1> %mask)

Arguments:
""""""""""

The first argument is a vector of pointers to the memory locations to be
updated. The second argument is a scalar used to update the value from
memory; it must match the type of value to be updated. The final argument
is a mask value to exclude locations from being modified.

Semantics:
""""""""""

The '``llvm.experimental.vector.histogram.*``' intrinsics are used to perform
updates on potentially overlapping values in memory. The intrinsics represent
the following sequence of operations:

1. Gather load from the ``ptrs`` operand, with element type matching that of
   the ``inc`` operand.
2. Update of the values loaded from memory. In the case of the ``add``
   update operation, this means:

   1. Perform a cross-vector histogram operation on the ``ptrs`` operand.
   2. Multiply the result by the ``inc`` operand.
   3. Add the result to the values loaded from memory.
3. Scatter the result of the update operation to the memory locations from
   the ``ptrs`` operand.

The ``mask`` operand will apply to at least the gather and scatter operations.

'``llvm.experimental.vector.extract.last.active``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is an overloaded intrinsic.

::

      declare i32 @llvm.experimental.vector.extract.last.active.v4i32(<4 x i32> %data, <4 x i1> %mask, i32 %passthru)
      declare i16 @llvm.experimental.vector.extract.last.active.nxv8i16(<vscale x 8 x i16> %data, <vscale x 8 x i1> %mask, i16 %passthru)

Arguments:
""""""""""

The first argument is the data vector to extract a lane from. The second is a
mask vector controlling the extraction. The third argument is a passthru
value.

The two input vectors must have the same number of elements, and the type of
the passthru value must match that of the elements of the data vector.

Semantics:
""""""""""

The '``llvm.experimental.vector.extract.last.active``' intrinsic will extract an
element from the data vector at the index matching the highest active lane of
the mask vector. If no mask lanes are active then the passthru value is
returned instead.

.. _int_vector_compress:

'``llvm.experimental.vector.compress.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM provides an intrinsic for compressing data within a vector based on a selection mask.
Semantically, this is similar to :ref:`llvm.masked.compressstore <int_compressstore>` but with weaker assumptions
and without storing the results to memory, i.e., the data remains in the vector.

Syntax:
"""""""
This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected
from an input vector and placed adjacently within the result vector. A mask defines which elements to collect from the vector.
The remaining lanes are filled with values from ``passthru``.

.. code-block:: llvm

      declare <8 x i32> @llvm.experimental.vector.compress.v8i32(<8 x i32> <value>, <8 x i1> <mask>, <8 x i32> <passthru>)
      declare <16 x float> @llvm.experimental.vector.compress.v16f32(<16 x float> <value>, <16 x i1> <mask>, <16 x float> undef)

Overview:
"""""""""

Selects elements from input vector ``value`` according to the ``mask``.
All selected elements are written into adjacent lanes in the result vector,
from lower to higher.
The mask holds an entry for each vector lane, and is used to select elements
to be kept.
If a ``passthru`` vector is given, all remaining lanes are filled with the
corresponding lane's value from ``passthru``.
The main difference to :ref:`llvm.masked.compressstore <int_compressstore>` is
that we do not need to guard against memory access for unselected lanes.
This allows for branchless code and better optimization for all targets that
do not support, or have only inefficient instructions for, the explicit
semantics of :ref:`llvm.masked.compressstore <int_compressstore>` but still
have some form of compress operations.
The result vector can be written with a similar effect, as all the selected
values are at the lower positions of the vector, but without requiring
branches to avoid writes where the mask is ``false``.

Arguments:
""""""""""

The first operand is the input vector, from which elements are selected.
The second operand is the mask, a vector of boolean values.
The third operand is the passthru vector, from which elements are filled
into remaining lanes.
The mask and the input vector must have the same number of vector elements.
The input and passthru vectors must have the same type.

Semantics:
""""""""""

The ``llvm.experimental.vector.compress`` intrinsic compresses data within a vector.
It collects elements from possibly non-adjacent lanes of a vector and places
them contiguously in the result vector based on a selection mask, filling the
remaining lanes with values from ``passthru``.
This intrinsic performs the logic of the following C++ example.
All values in ``out`` after the last selected one are undefined if
``passthru`` is undefined.
If all entries in the ``mask`` are 0, the ``out`` vector is ``passthru``.
If any element of the mask is poison, all elements of the result are poison.
Otherwise, if any element of the mask is undef, all elements of the result are undef.
If ``passthru`` is undefined, the number of valid lanes is equal to the number
of ``true`` entries in the mask, i.e., all lanes >= number-of-selected-values
are undefined.

.. code-block:: cpp

    // Consecutively place selected values in a vector.
    // N is the vector size in bytes and must be a compile-time constant.
    using VecT __attribute__((vector_size(N))) = int;
    VecT compress(VecT vec, VecT mask, VecT passthru) {
      VecT out;
      int idx = 0;
      for (int i = 0; i < N / sizeof(int); ++i) {
        out[idx] = vec[i];
        idx += static_cast<bool>(mask[i]);
      }
      for (; idx < N / sizeof(int); ++idx) {
        out[idx] = passthru[idx];
      }
      return out;
    }


'``llvm.experimental.vector.match.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

    declare <<n> x i1> @llvm.experimental.vector.match(<<n> x <ty>> %op1, <<m> x <ty>> %op2, <<n> x i1> %mask)
    declare <vscale x <n> x i1> @llvm.experimental.vector.match(<vscale x <n> x <ty>> %op1, <<m> x <ty>> %op2, <vscale x <n> x i1> %mask)

Overview:
"""""""""

Find active elements of the first argument matching any elements of the second.
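
The one-line overview above can be modeled for fixed-width vectors as a
membership test per active lane. This is a sketch only:
``vector_match_model`` is a hypothetical name, and the three inputs stand for
the search vector, the vector of values searched for, and the mask, with the
first and third assumed to have the same length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of llvm.experimental.vector.match on fixed-width
// vectors: out[i] is true when lane i is active in the mask and op1[i]
// equals at least one element of op2; inactive lanes produce false.
template <typename T>
std::vector<bool> vector_match_model(const std::vector<T> &op1,
                                     const std::vector<T> &op2,
                                     const std::vector<bool> &mask) {
  std::vector<bool> out(op1.size(), false);
  for (std::size_t i = 0; i < op1.size(); ++i) {
    if (!mask[i])
      continue;
    for (const T &needle : op2) {
      if (op1[i] == needle) {
        out[i] = true;
        break;
      }
    }
  }
  return out;
}
```

Searching ``<1, 2, 3, 4>`` for the values ``<2, 4>`` with the last lane masked
off yields ``<0, 1, 0, 0>`` in this model: lane 3 holds a matching value but
is inactive.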

Arguments:
""""""""""

The first argument is the search vector, the second argument the vector of
elements we are searching for (i.e. for which we consider a match successful),
and the third argument is a mask that controls which elements of the first
argument are active. The first two arguments must be vectors of matching
integer element types. The first and third arguments and the result type must
have matching element counts (fixed or scalable). The second argument must be a
fixed vector, but its length may be different from the remaining arguments.

Semantics:
""""""""""

The '``llvm.experimental.vector.match``' intrinsic compares each active element
in the first argument against the elements of the second argument, placing
``1`` in the corresponding element of the output vector if any equality
comparison is successful, and ``0`` otherwise. Inactive elements in the mask
are set to ``0`` in the output.

Matrix Intrinsics
-----------------

Operations on matrixes requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrixes are passed and returned as vectors. This means that for a ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently column-major layout
is assumed. The intrinsics support both integer and floating point matrixes.


'``llvm.matrix.transpose.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
<Cols>`` matrix and return the transposed matrix in the result vector.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
number of rows and columns, respectively, and must be positive, constant
integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
the same float or integer element type as ``%In``.

'``llvm.matrix.multiply.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)

Overview:
"""""""""

The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.

Arguments:
""""""""""

The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
<Inner>`` elements, and the second argument ``%B`` to a matrix with
``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
Vectors ``%A``, ``%B``, and the returned vector all have the same float or
integer element type.
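
The column-major indexing rule (element ``i`` of column ``j`` at index
``j * R + i``) and the shape arguments can be tied together with a small
reference model. This is a sketch under stated assumptions:
``matrix_multiply_model`` is a hypothetical name, and the model says nothing
about how floating-point operations may be reassociated or fused by an actual
lowering.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of llvm.matrix.multiply on flattened column-major
// vectors. A is outer_rows x inner, B is inner x outer_cols, and the
// result is outer_rows x outer_cols. Element (i, j) of a matrix with
// R rows is stored at index j * R + i.
template <typename T>
std::vector<T> matrix_multiply_model(const std::vector<T> &A,
                                     const std::vector<T> &B,
                                     std::size_t outer_rows, std::size_t inner,
                                     std::size_t outer_cols) {
  std::vector<T> result(outer_rows * outer_cols, T{});
  for (std::size_t j = 0; j < outer_cols; ++j)
    for (std::size_t k = 0; k < inner; ++k)
      for (std::size_t i = 0; i < outer_rows; ++i)
        result[j * outer_rows + i] += A[k * outer_rows + i] * B[j * inner + k];
  return result;
}
```

For the 2 x 2 matrixes with rows ``(1, 2), (3, 4)`` and ``(5, 6), (7, 8)``,
the flattened inputs are ``<1, 3, 2, 4>`` and ``<5, 7, 6, 8>``, and the model
returns ``<19, 43, 22, 50>``, which is the product with rows ``(19, 22)`` and
``(43, 50)``.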


'``llvm.matrix.column.major.load.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.column.major.load.*(
          ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
matrix using a stride of ``%Stride`` to compute the start address of the
different columns. The offset is computed using ``%Stride``'s bitwidth. This
allows for convenient loading of sub-matrixes. If ``<IsVolatile>`` is true, the
intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
matrix is returned in the result vector. If the ``%Ptr`` argument is known to
be aligned to some boundary, this can be specified as an attribute on the
argument.

Arguments:
""""""""""

The first argument ``%Ptr`` is a pointer to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value. The fourth and fifth arguments,
``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
respectively, and must be positive, constant integers. The returned vector must
have ``<Rows> * <Cols>`` elements.

The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
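
The address computation ``%Ptr + C * %Stride`` can be illustrated with a small
Python model. This is a sketch under stated assumptions rather than a
definition of the intrinsic: memory is modelled as a flat list, ``base`` and
``stride`` are assumed to be measured in elements, and the helper name
``column_major_load`` is hypothetical.

.. code-block:: python

    def column_major_load(memory, base, stride, rows, cols):
        """Model a strided column-major load; offsets are in elements."""
        assert rows > 0 and cols > 0 and stride >= rows
        out = []
        for c in range(cols):
            start = base + c * stride  # start of column c
            out.extend(memory[start:start + rows])
        return out


    # A 4 x 3 matrix stored column-major; element (i, j) holds 10 * j + i.
    memory = [10 * j + i for j in range(3) for i in range(4)]

    # Load the 2 x 2 sub-matrix whose top-left element is (1, 1). The stride
    # stays 4 (the row count of the enclosing matrix), not 2.
    sub = column_major_load(memory, base=1 * 4 + 1, stride=4, rows=2, cols=2)
    # sub == [11, 12, 21, 22]

Using a stride larger than the number of loaded rows is what makes it possible
to pick a sub-matrix out of a larger matrix without copying it first.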


'``llvm.matrix.column.major.store.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare void @llvm.matrix.column.major.store.*(
          vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
<Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
columns. The offset is computed using ``%Stride``'s bitwidth. If
``<IsVolatile>`` is true, the intrinsic is considered a
:ref:`volatile memory access <volatile>`.

If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
specified as an attribute on the argument.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
pointer to the vector type of ``%In``, and is the start address of the matrix
in memory. The third argument ``%Stride`` is a positive, constant integer with
``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean
value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
and columns, respectively, and must be positive, constant integers.

The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.


Half Precision Floating-Point Intrinsics
----------------------------------------

For most target platforms, half precision floating-point is a
storage-only format.
This means that it is a dense encoding (in memory)
but does not support computation in the format.

Code must therefore first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point type to half precision floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument - the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point format to half precision floating-point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, ptr @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument - the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format. The input half-float value is
represented by an ``i16`` value.

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, ptr @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

Saturating floating-point to integer conversions
------------------------------------------------

The ``fptoui`` and ``fptosi`` instructions return a
:ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
representable by the result type. These intrinsics provide an alternative
conversion, which will saturate towards the smallest and largest representable
integer values instead.

'``llvm.fptoui.sat.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
floating-point argument type and any integer result type, or vectors thereof.
Not all targets may support all types, however.

::

      declare i32 @llvm.fptoui.sat.i32.f32(float %f)
      declare i19 @llvm.fptoui.sat.i19.f64(double %f)
      declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)

Overview:
"""""""""

This intrinsic converts the argument into an unsigned integer using saturating
semantics.

Arguments:
""""""""""

The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The number of vector
elements in argument and return must be the same.

Semantics:
""""""""""

The conversion to integer is performed subject to the following rules:

- If the argument is any NaN, zero is returned.
- If the argument is smaller than zero (this includes negative infinity),
  zero is returned.
- If the argument is larger than the largest representable unsigned integer of
  the result type (this includes positive infinity), the largest representable
  unsigned integer is returned.
- Otherwise, the result of rounding the argument towards zero is returned.

Example:
""""""""

.. code-block:: text

      %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.875)            ; yields i8: 123
      %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.75)              ; yields i8: 0
      %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
      %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0

'``llvm.fptosi.sat.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
floating-point argument type and any integer result type, or vectors thereof.
Not all targets may support all types, however.

::

      declare i32 @llvm.fptosi.sat.i32.f32(float %f)
      declare i19 @llvm.fptosi.sat.i19.f64(double %f)
      declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)

Overview:
"""""""""

This intrinsic converts the argument into a signed integer using saturating
semantics.

Arguments:
""""""""""

The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The number of vector
elements in argument and return must be the same.

Semantics:
""""""""""

The conversion to integer is performed subject to the following rules:

- If the argument is any NaN, zero is returned.
- If the argument is smaller than the smallest representable signed integer of
  the result type (this includes negative infinity), the smallest
  representable signed integer is returned.
- If the argument is larger than the largest representable signed integer of
  the result type (this includes positive infinity), the largest representable
  signed integer is returned.
- Otherwise, the result of rounding the argument towards zero is returned.

Example:
""""""""

.. code-block:: text

      %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.875)             ; yields i8: 23
      %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.75)            ; yields i8: -128
      %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8: 127
      %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0

Convergence Intrinsics
----------------------

The LLVM convergence intrinsics for controlling the semantics of ``convergent``
operations, which all start with the ``llvm.experimental.convergence.``
prefix, are described in the :doc:`ConvergentOperations` document.

.. _dbg_intrinsics:

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.

Pointer Authentication Intrinsics
---------------------------------

The LLVM pointer authentication intrinsics (which all start with the
``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
<PointerAuth.html#intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter - the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(ptr nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
      %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(ptr %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(ptr <tramp>, ptr <func>, ptr <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific -
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with target
dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type.
Calling the new function is equivalent to calling ``func`` with the same
argument list, but with ``nval`` used for the missing ``nest`` argument. If,
after calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.adjust.trampoline(ptr <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.


.. _int_vp:

Vector Predication Intrinsics
-----------------------------

VP intrinsics are intended for predicated SIMD/vector code. A typical VP
operation takes a vector mask and an explicit vector length parameter as in:

::

      <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)

The vector mask parameter (%mask) always has a vector of ``i1`` type, for example
``<32 x i1>``. The explicit vector length parameter always has the type ``i32`` and
is an unsigned integer value.
The explicit vector length parameter (%evl) is in
the range:

::

      0 <= %evl <= W, where W is the number of vector elements

Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
length of the vector.

The VP intrinsic has undefined behavior if ``%evl > W``. The explicit vector
length (%evl) creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set
to True, and all other lanes ``%evl <= i < W`` to False. A new mask %M is
calculated with an element-wise AND from %mask and %EVLmask:

::

      M = %mask AND %EVLmask

A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:

::

      A <opcode> B = { A[i] <opcode> B[i]   if M[i] = True, and
                     { undef                otherwise

Optimization Hint
^^^^^^^^^^^^^^^^^

Some targets, such as AVX512, do not support the %evl parameter in hardware.
The use of an effective %evl is discouraged for those targets. The function
``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
has native support for %evl.

.. _int_vp_select:

'``llvm.vp.select.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
      declare <vscale x 4 x i64> @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)

Overview:
"""""""""

The '``llvm.vp.select``' intrinsic is used to choose one value based on a
condition vector, without IR-level branching.

Arguments:
""""""""""

The first argument is a vector of ``i1`` and indicates the condition.
The second argument is the value that is selected where the condition vector is
true. The third argument is the value that is selected where the condition
vector is false. The vectors must be of the same size. The fourth argument is
the explicit vector length.

#. The optional ``fast-math flags`` marker indicates that the select has one or
   more :ref:`fast-math flags <fastmath>`. These are optimization hints to
   enable otherwise unsafe floating-point optimizations. Fast-math flags are
   only valid for selects that return :ref:`supported floating-point types
   <fastmath_return_types>`.

Semantics:
""""""""""

The intrinsic selects lanes from the second and third argument depending on a
condition vector.

All result lanes at positions greater than or equal to ``%evl`` are undefined.
For all lanes below ``%evl`` where the condition vector is true the lane is
taken from the second argument. Otherwise, the lane is taken from the third
argument.

Example:
""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)

      ;;; Expansion.
      ;; Any result is legal on lanes at and above %evl.
      %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false


.. _int_vp_merge:

'``llvm.vp.merge.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.merge.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <pivot>)
      declare <vscale x 4 x i64> @llvm.vp.merge.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <pivot>)

Overview:
"""""""""

The '``llvm.vp.merge``' intrinsic is used to choose one value based on a
condition vector and an index argument, without IR-level branching.

Arguments:
""""""""""

The first argument is a vector of ``i1`` and indicates the condition. The
second argument is the value that is merged where the condition vector is true.
The third argument is the value that is selected where the condition vector is
false or the lane position is greater than or equal to the pivot. The fourth
argument is the pivot.

#. The optional ``fast-math flags`` marker indicates that the merge has one or
   more :ref:`fast-math flags <fastmath>`. These are optimization hints to
   enable otherwise unsafe floating-point optimizations. Fast-math flags are
   only valid for merges that return :ref:`supported floating-point types
   <fastmath_return_types>`.

Semantics:
""""""""""

The intrinsic selects lanes from the second and third argument depending on a
condition vector and pivot value.

For all lanes where the condition vector is true and the lane position is less
than ``%pivot`` the lane is taken from the second argument. Otherwise, the lane
is taken from the third argument.

Example:
""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %pivot)

      ;;; Expansion.
      ;; Lanes at and above %pivot are taken from %on_false
      %atfirst = insertelement <4 x i32> poison, i32 %pivot, i32 0
      %splat = shufflevector <4 x i32> %atfirst, <4 x i32> poison, <4 x i32> zeroinitializer
      %pivotmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32> %splat
      %mergemask = and <4 x i1> %cond, <4 x i1> %pivotmask
      %also.r = select <4 x i1> %mergemask, <4 x i32> %on_true, <4 x i32> %on_false



.. _int_vp_add:

'``llvm.vp.add.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer addition of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = add <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison

.. _int_vp_sub:

'``llvm.vp.sub.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer subtraction of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sub``' intrinsic performs integer subtraction
(:ref:`sub <i_sub>`) of the first and second vector arguments on each enabled
lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sub <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison



.. _int_vp_mul:

'``llvm.vp.mul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer multiplication of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.mul``' intrinsic performs integer multiplication
(:ref:`mul <i_mul>`) of the first and second vector arguments on each enabled
lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = mul <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_sdiv:

'``llvm.vp.sdiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, signed division of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sdiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_udiv:

'``llvm.vp.udiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, unsigned division of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.udiv``' intrinsic performs unsigned division
(:ref:`udiv <i_udiv>`) of the first and second vector arguments on each enabled
lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = udiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison



.. _int_vp_srem:

'``llvm.vp.srem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated computation of the signed remainder of two integer vectors.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
(:ref:`srem <i_srem>`) of the first and second vector arguments on each enabled
lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = srem <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison



.. _int_vp_urem:

'``llvm.vp.urem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated computation of the unsigned remainder of two integer vectors.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
(:ref:`urem <i_urem>`) of the first and second vector arguments on each enabled
lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = urem <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_ashr:

'``llvm.vp.ashr.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated arithmetic right-shift.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
(:ref:`ashr <i_ashr>`) of the first argument by the second argument on each
enabled lane. The result on disabled lanes is a
:ref:`poison value <poisonvalues>`.

Examples:
"""""""""

..
code-block:: llvm 21395 21396 %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) 21397 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 21398 21399 %t = ashr <4 x i32> %a, %b 21400 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison 21401 21402 21403.. _int_vp_lshr: 21404 21405 21406'``llvm.vp.lshr.*``' Intrinsics 21407^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 21408 21409Syntax: 21410""""""" 21411This is an overloaded intrinsic. 21412 21413:: 21414 21415 declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 21416 declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 21417 declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 21418 21419Overview: 21420""""""""" 21421 21422Vector-predicated logical right-shift. 21423 21424 21425Arguments: 21426"""""""""" 21427 21428The first two arguments and the result have the same vector of integer type. The 21429third argument is the vector mask and has the same number of elements as the 21430result vector type. The fourth argument is the explicit vector length of the 21431operation. 21432 21433Semantics: 21434"""""""""" 21435 21436The '``llvm.vp.lshr``' intrinsic computes the logical right shift 21437(:ref:`lshr <i_lshr>`) of the first argument by the second argument on each 21438enabled lane. The result on disabled lanes is a 21439:ref:`poison value <poisonvalues>`. 21440 21441Examples: 21442""""""""" 21443 21444.. 
code-block:: llvm 21445 21446 %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) 21447 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 21448 21449 %t = lshr <4 x i32> %a, %b 21450 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison 21451 21452 21453.. _int_vp_shl: 21454 21455'``llvm.vp.shl.*``' Intrinsics 21456^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 21457 21458Syntax: 21459""""""" 21460This is an overloaded intrinsic. 21461 21462:: 21463 21464 declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 21465 declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 21466 declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 21467 21468Overview: 21469""""""""" 21470 21471Vector-predicated left shift. 21472 21473 21474Arguments: 21475"""""""""" 21476 21477The first two arguments and the result have the same vector of integer type. The 21478third argument is the vector mask and has the same number of elements as the 21479result vector type. The fourth argument is the explicit vector length of the 21480operation. 21481 21482Semantics: 21483"""""""""" 21484 21485The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of 21486the first argument by the second argument on each enabled lane. The result on 21487disabled lanes is a :ref:`poison value <poisonvalues>`. 21488 21489Examples: 21490""""""""" 21491 21492.. code-block:: llvm 21493 21494 %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) 21495 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 21496 21497 %t = shl <4 x i32> %a, %b 21498 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison 21499 21500 21501.. 
_int_vp_or: 21502 21503'``llvm.vp.or.*``' Intrinsics 21504^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 21505 21506Syntax: 21507""""""" 21508This is an overloaded intrinsic. 21509 21510:: 21511 21512 declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 21513 declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 21514 declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 21515 21516Overview: 21517""""""""" 21518 21519Vector-predicated or. 21520 21521 21522Arguments: 21523"""""""""" 21524 21525The first two arguments and the result have the same vector of integer type. The 21526third argument is the vector mask and has the same number of elements as the 21527result vector type. The fourth argument is the explicit vector length of the 21528operation. 21529 21530Semantics: 21531"""""""""" 21532 21533The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the 21534first two arguments on each enabled lane. The result on disabled lanes is 21535a :ref:`poison value <poisonvalues>`. 21536 21537Examples: 21538""""""""" 21539 21540.. code-block:: llvm 21541 21542 %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) 21543 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 21544 21545 %t = or <4 x i32> %a, %b 21546 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison 21547 21548 21549.. _int_vp_and: 21550 21551'``llvm.vp.and.*``' Intrinsics 21552^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 21553 21554Syntax: 21555""""""" 21556This is an overloaded intrinsic. 

::

      declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated and.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two arguments on each enabled lane. The result on disabled lanes is
a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = and <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_xor:

'``llvm.vp.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated, bitwise xor.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
the first two arguments on each enabled lane.
The result on disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = xor <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison

.. _int_vp_abs:

'``llvm.vp.abs.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.abs.v16i32 (<16 x i32> <op>, i1 <is_int_min_poison>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.abs.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_int_min_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.abs.v256i64 (<256 x i64> <op>, i1 <is_int_min_poison>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated abs of a vector of integers.


Arguments:
""""""""""

The first argument and the result have the same vector of integer type. The
second argument must be a constant and is a flag to indicate whether the result
value of the '``llvm.vp.abs``' intrinsic is a :ref:`poison value <poisonvalues>`
if the first argument is statically or dynamically an ``INT_MIN`` value. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.abs``' intrinsic performs abs (:ref:`abs <int_abs>`) of the first argument on each
enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.abs.v4i32(<4 x i32> %a, i1 false, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %a, i1 false)
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_smax:

'``llvm.vp.smax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.smax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.smax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.smax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer signed maximum of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.smax``' intrinsic performs integer signed maximum (:ref:`smax <int_smax>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_smin:

'``llvm.vp.smin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.smin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.smin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.smin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer signed minimum of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.smin``' intrinsic performs integer signed minimum (:ref:`smin <int_smin>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.smin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_umax:

'``llvm.vp.umax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.umax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.umax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.umax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer unsigned maximum of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.umax``' intrinsic performs integer unsigned maximum (:ref:`umax <int_umax>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.umax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_umin:

'``llvm.vp.umin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.umin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.umin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.umin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer unsigned minimum of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.umin``' intrinsic performs integer unsigned minimum (:ref:`umin <int_umin>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.umin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_copysign:

'``llvm.vp.copysign.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.copysign.v16f32 (<16 x float> <mag_op>, <16 x float> <sign_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.copysign.nxv4f32 (<vscale x 4 x float> <mag_op>, <vscale x 4 x float> <sign_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.copysign.v256f64 (<256 x double> <mag_op>, <256 x double> <sign_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point copysign of two vectors of floating-point values.


Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.copysign``' intrinsic performs floating-point copysign (:ref:`copysign <int_copysign>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.copysign.v4f32(<4 x float> %mag, <4 x float> %sign, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.copysign.v4f32(<4 x float> %mag, <4 x float> %sign)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_minnum:

'``llvm.vp.minnum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.minnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.minnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.minnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point IEEE-754 minNum of two vectors of floating-point values.


Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.minnum``' intrinsic performs floating-point minimum (:ref:`minnum <i_minnum>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.minnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %b)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_maxnum:

'``llvm.vp.maxnum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.maxnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.maxnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.maxnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point IEEE-754 maxNum of two vectors of floating-point values.


Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.maxnum``' intrinsic performs floating-point maximum (:ref:`maxnum <i_maxnum>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.maxnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %b)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_minimum:

'``llvm.vp.minimum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.minimum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.minimum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.minimum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point minimum of two vectors of floating-point values,
propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.minimum``' intrinsic performs floating-point minimum (:ref:`minimum <i_minimum>`)
of the first and second vector arguments on each enabled lane, the result being
NaN if either argument is a NaN. -0.0 is considered to be less than +0.0 for this
intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
The operation is performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.minimum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.minimum.v4f32(<4 x float> %a, <4 x float> %b)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_maximum:

'``llvm.vp.maximum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.maximum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.maximum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.maximum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point maximum of two vectors of floating-point values,
propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.maximum``' intrinsic performs floating-point maximum (:ref:`maximum <i_maximum>`)
of the first and second vector arguments on each enabled lane, the result being
NaN if either argument is a NaN. -0.0 is considered to be less than +0.0 for this
intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
The operation is performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.maximum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.maximum.v4f32(<4 x float> %a, <4 x float> %b)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_fadd:

'``llvm.vp.fadd.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point addition of two vectors of floating-point values.


Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fadd <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_fsub:

'``llvm.vp.fsub.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point subtraction of two vectors of floating-point values.


Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fsub <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_fmul:

'``llvm.vp.fmul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point multiplication of two vectors of floating-point values.


Arguments:
""""""""""

The first two arguments and the result have the same vector of floating-point type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

..
code-block:: llvm 22273 22274 %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) 22275 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 22276 22277 %t = fmul <4 x float> %a, %b 22278 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 22279 22280 22281.. _int_vp_fdiv: 22282 22283'``llvm.vp.fdiv.*``' Intrinsics 22284^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 22285 22286Syntax: 22287""""""" 22288This is an overloaded intrinsic. 22289 22290:: 22291 22292 declare <16 x float> @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 22293 declare <vscale x 4 x float> @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 22294 declare <256 x double> @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 22295 22296Overview: 22297""""""""" 22298 22299Predicated floating-point division of two vectors of floating-point values. 22300 22301 22302Arguments: 22303"""""""""" 22304 22305The first two arguments and the result have the same vector of floating-point type. The 22306third argument is the vector mask and has the same number of elements as the 22307result vector type. The fourth argument is the explicit vector length of the 22308operation. 22309 22310Semantics: 22311"""""""""" 22312 22313The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`) 22314of the first and second vector arguments on each enabled lane. The result on 22315disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is 22316performed in the default floating-point environment. 22317 22318Examples: 22319""""""""" 22320 22321.. 
code-block:: llvm 22322 22323 %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) 22324 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 22325 22326 %t = fdiv <4 x float> %a, %b 22327 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 22328 22329 22330.. _int_vp_frem: 22331 22332'``llvm.vp.frem.*``' Intrinsics 22333^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 22334 22335Syntax: 22336""""""" 22337This is an overloaded intrinsic. 22338 22339:: 22340 22341 declare <16 x float> @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 22342 declare <vscale x 4 x float> @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 22343 declare <256 x double> @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 22344 22345Overview: 22346""""""""" 22347 22348Predicated floating-point remainder of two vectors of floating-point values. 22349 22350 22351Arguments: 22352"""""""""" 22353 22354The first two arguments and the result have the same vector of floating-point type. The 22355third argument is the vector mask and has the same number of elements as the 22356result vector type. The fourth argument is the explicit vector length of the 22357operation. 22358 22359Semantics: 22360"""""""""" 22361 22362The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`) 22363of the first and second vector arguments on each enabled lane. The result on 22364disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is 22365performed in the default floating-point environment. 22366 22367Examples: 22368""""""""" 22369 22370.. 
code-block:: llvm 22371 22372 %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) 22373 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 22374 22375 %t = frem <4 x float> %a, %b 22376 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 22377 22378 22379.. _int_vp_fneg: 22380 22381'``llvm.vp.fneg.*``' Intrinsics 22382^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 22383 22384Syntax: 22385""""""" 22386This is an overloaded intrinsic. 22387 22388:: 22389 22390 declare <16 x float> @llvm.vp.fneg.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 22391 declare <vscale x 4 x float> @llvm.vp.fneg.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 22392 declare <256 x double> @llvm.vp.fneg.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 22393 22394Overview: 22395""""""""" 22396 22397Predicated floating-point negation of a vector of floating-point values. 22398 22399 22400Arguments: 22401"""""""""" 22402 22403The first argument and the result have the same vector of floating-point type. 22404The second argument is the vector mask and has the same number of elements as the 22405result vector type. The third argument is the explicit vector length of the 22406operation. 22407 22408Semantics: 22409"""""""""" 22410 22411The '``llvm.vp.fneg``' intrinsic performs floating-point negation (:ref:`fneg <i_fneg>`) 22412of the first vector argument on each enabled lane. The result on disabled lanes 22413is a :ref:`poison value <poisonvalues>`. 22414 22415Examples: 22416""""""""" 22417 22418.. code-block:: llvm 22419 22420 %r = call <4 x float> @llvm.vp.fneg.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 22421 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 22422 22423 %t = fneg <4 x float> %a 22424 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 22425 22426 22427.. 
_int_vp_fabs: 22428 22429'``llvm.vp.fabs.*``' Intrinsics 22430^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 22431 22432Syntax: 22433""""""" 22434This is an overloaded intrinsic. 22435 22436:: 22437 22438 declare <16 x float> @llvm.vp.fabs.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 22439 declare <vscale x 4 x float> @llvm.vp.fabs.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 22440 declare <256 x double> @llvm.vp.fabs.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 22441 22442Overview: 22443""""""""" 22444 22445Predicated floating-point absolute value of a vector of floating-point values. 22446 22447 22448Arguments: 22449"""""""""" 22450 22451The first argument and the result have the same vector of floating-point type. 22452The second argument is the vector mask and has the same number of elements as the 22453result vector type. The third argument is the explicit vector length of the 22454operation. 22455 22456Semantics: 22457"""""""""" 22458 22459The '``llvm.vp.fabs``' intrinsic performs floating-point absolute value 22460(:ref:`fabs <int_fabs>`) of the first vector argument on each enabled lane. The 22461result on disabled lanes is a :ref:`poison value <poisonvalues>`. 22462 22463Examples: 22464""""""""" 22465 22466.. code-block:: llvm 22467 22468 %r = call <4 x float> @llvm.vp.fabs.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 22469 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 22470 22471 %t = call <4 x float> @llvm.fabs.v4f32(<4 x float> %a) 22472 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 22473 22474 22475.. _int_vp_sqrt: 22476 22477'``llvm.vp.sqrt.*``' Intrinsics 22478^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 22479 22480Syntax: 22481""""""" 22482This is an overloaded intrinsic. 

::

      declare <16 x float> @llvm.vp.sqrt.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.sqrt.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.sqrt.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point square root of a vector of floating-point values.


Arguments:
""""""""""

The first argument and the result have the same vector of floating-point type.
The second argument is the vector mask and has the same number of elements as the
result vector type. The third argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sqrt``' intrinsic performs floating-point square root (:ref:`sqrt <int_sqrt>`) of
the first vector argument on each enabled lane. The result on disabled lanes is
a :ref:`poison value <poisonvalues>`. The operation is performed in the default
floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x float> @llvm.vp.sqrt.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %a)
   %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_fma:

'``llvm.vp.fma.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fma.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fma.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fma.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point fused multiply-add of three vectors of floating-point values.


Arguments:
""""""""""

The first three arguments and the result have the same vector of floating-point type. The
fourth argument is the vector mask and has the same number of elements as the
result vector type. The fifth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fma``' intrinsic performs floating-point fused multiply-add (:ref:`llvm.fma <int_fma>`)
of the first, second, and third vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x float> @llvm.vp.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x float> @llvm.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
   %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_fmuladd:

'``llvm.vp.fmuladd.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fmuladd.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fmuladd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fmuladd.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point multiply-add of three vectors of floating-point values
that can be fused if the code generator determines that (a) the target instruction
set has support for a fused operation, and (b) that the fused operation is more
efficient than the equivalent, separate pair of mul and add instructions.

Arguments:
""""""""""

The first three arguments and the result have the same vector of floating-point
type. The fourth argument is the vector mask and has the same number of elements
as the result vector type. The fifth argument is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.fmuladd``' intrinsic performs floating-point multiply-add (:ref:`llvm.fmuladd <int_fmuladd>`)
of the first, second, and third vector arguments on each enabled lane. The result
on disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x float> @llvm.vp.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x float> @llvm.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
   %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_reduce_add:

'``llvm.vp.reduce.add.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
returning the result as a scalar.

Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second argument is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third argument is the vector mask and
is a vector of boolean values with the same number of elements as the vector
argument. The fourth argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
(:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector argument
``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
lanes are treated as containing the neutral value ``0`` (i.e. having no effect
on the reduction operation).
If the vector length is zero, the result is equal
to ``start_value``.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

   %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
   ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
   ; are treated as though %mask were false for those lanes.

   %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
   %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
   %also.r = add i32 %reduction, %start


.. _int_vp_reduce_fadd:

'``llvm.vp.reduce.fadd.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
value, returning the result as a scalar.

Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second argument is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third argument is the
vector mask and is a vector of boolean values with the same number of elements
as the vector argument. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
vector argument ``val`` on each enabled lane, adding it to the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to ``start_value``.

To ignore the start value, the neutral value can be used.

See the unpredicated version (:ref:`llvm.vector.reduce.fadd
<int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.

Examples:
"""""""""

.. code-block:: llvm

   %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
   ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
   ; are treated as though %mask were false for those lanes.

   %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
   %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)


.. _int_vp_reduce_mul:

'``llvm.vp.reduce.mul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second argument is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third argument is the vector mask and
is a vector of boolean values with the same number of elements as the vector
argument. The fourth argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
(:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector argument ``val``
on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
lanes are treated as containing the neutral value ``1`` (i.e. having no effect
on the reduction operation). If the vector length is zero, the result is the
start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

   %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
   ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
   ; are treated as though %mask were false for those lanes.

   %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
   %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
   %also.r = mul i32 %reduction, %start

.. _int_vp_reduce_fmul:

'``llvm.vp.reduce.fmul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second argument is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third argument is the
vector mask and is a vector of boolean values with the same number of elements
as the vector argument. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
vector argument ``val`` on each enabled lane, multiplying it by the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to the starting value.

To ignore the start value, the neutral value can be used.

See the unpredicated version (:ref:`llvm.vector.reduce.fmul
<int_vector_reduce_fmul>`) for more detail on the semantics.

Examples:
"""""""""

.. code-block:: llvm

   %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
   ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
   ; are treated as though %mask were false for those lanes.

   %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
   %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)


.. _int_vp_reduce_and:

'``llvm.vp.reduce.and.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``AND`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second argument is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third argument is the vector mask and
is a vector of boolean values with the same number of elements as the vector
argument. The fourth argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
(:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector argument
``val`` on each enabled lane, performing an '``and``' of that with the
scalar ``start_value``.
Disabled lanes are treated as containing the neutral
value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
operation). If the vector length is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

   %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
   ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
   ; are treated as though %mask were false for those lanes.

   %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
   %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
   %also.r = and i32 %reduction, %start


.. _int_vp_reduce_or:

'``llvm.vp.reduce.or.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``OR`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second argument is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third argument is the vector mask and
is a vector of boolean values with the same number of elements as the vector
argument.
The fourth argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
(:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector argument
``val`` on each enabled lane, performing an '``or``' of that with the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``0`` (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

   %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
   ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
   ; are treated as though %mask were false for those lanes.

   %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
   %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
   %also.r = or i32 %reduction, %start

.. _int_vp_reduce_xor:

'``llvm.vp.reduce.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
integer type equal to the result type.
The second argument is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third argument is the vector mask and
is a vector of boolean values with the same number of elements as the vector
argument. The fourth argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
(:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector argument
``val`` on each enabled lane, performing an '``xor``' of that with the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``0`` (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

   %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
   ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
   ; are treated as though %mask were false for those lanes.

   %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
   %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
   %also.r = xor i32 %reduction, %start


.. _int_vp_reduce_smax:

'``llvm.vp.reduce.smax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.
23033 23034:: 23035 23036 declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>) 23037 declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>) 23038 23039Overview: 23040""""""""" 23041 23042Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting 23043value, returning the result as a scalar. 23044 23045 23046Arguments: 23047"""""""""" 23048 23049The first argument is the start value of the reduction, which must be a scalar 23050integer type equal to the result type. The second argument is the vector on 23051which the reduction is performed and must be a vector of integer values whose 23052element type is the result/start type. The third argument is the vector mask and 23053is a vector of boolean values with the same number of elements as the vector 23054argument. The fourth argument is the explicit vector length of the operation. 23055 23056Semantics: 23057"""""""""" 23058 23059The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX`` 23060reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the 23061vector argument ``val`` on each enabled lane, and taking the maximum of that and 23062the scalar ``start_value``. Disabled lanes are treated as containing the 23063neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation). 23064If the vector length is zero, the result is the start value. 23065 23066To ignore the start value, the neutral value can be used. 23067 23068Examples: 23069""""""""" 23070 23071.. code-block:: llvm 23072 23073 %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl) 23074 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl 23075 ; are treated as though %mask were false for those lanes. 
23076 23077 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128> 23078 %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a) 23079 %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start) 23080 23081 23082.. _int_vp_reduce_smin: 23083 23084'``llvm.vp.reduce.smin.*``' Intrinsics 23085^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23086 23087Syntax: 23088""""""" 23089This is an overloaded intrinsic. 23090 23091:: 23092 23093 declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>) 23094 declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>) 23095 23096Overview: 23097""""""""" 23098 23099Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting 23100value, returning the result as a scalar. 23101 23102 23103Arguments: 23104"""""""""" 23105 23106The first argument is the start value of the reduction, which must be a scalar 23107integer type equal to the result type. The second argument is the vector on 23108which the reduction is performed and must be a vector of integer values whose 23109element type is the result/start type. The third argument is the vector mask and 23110is a vector of boolean values with the same number of elements as the vector 23111argument. The fourth argument is the explicit vector length of the operation. 23112 23113Semantics: 23114"""""""""" 23115 23116The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN`` 23117reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the 23118vector argument ``val`` on each enabled lane, and taking the minimum of that and 23119the scalar ``start_value``. Disabled lanes are treated as containing the 23120neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation). 23121If the vector length is zero, the result is the start value. 
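
As a minimal sketch of the disabled-lane rule for '``llvm.vp.reduce.smin``',
the call below uses an all-false mask, so every lane contributes the neutral
value ``INT_MAX`` (``127`` for ``i8``) and the result is ``%start``. The
function name ``@smin_no_active_lanes`` is hypothetical and chosen only for
illustration.

.. code-block:: llvm

      declare i8 @llvm.vp.reduce.smin.v4i8(i8, <4 x i8>, <4 x i1>, i32)

      define i8 @smin_no_active_lanes(i8 %start, <4 x i8> %a) {
        ; All four lanes are disabled by the mask, so %a has no effect
        ; and %r is equal to %start.
        %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> zeroinitializer, i32 4)
        ret i8 %r
      }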
23122 23123To ignore the start value, the neutral value can be used. 23124 23125Examples: 23126""""""""" 23127 23128.. code-block:: llvm 23129 23130 %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl) 23131 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl 23132 ; are treated as though %mask were false for those lanes. 23133 23134 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127> 23135 %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a) 23136 %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start) 23137 23138 23139.. _int_vp_reduce_umax: 23140 23141'``llvm.vp.reduce.umax.*``' Intrinsics 23142^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23143 23144Syntax: 23145""""""" 23146This is an overloaded intrinsic. 23147 23148:: 23149 23150 declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>) 23151 declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>) 23152 23153Overview: 23154""""""""" 23155 23156Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting 23157value, returning the result as a scalar. 23158 23159 23160Arguments: 23161"""""""""" 23162 23163The first argument is the start value of the reduction, which must be a scalar 23164integer type equal to the result type. The second argument is the vector on 23165which the reduction is performed and must be a vector of integer values whose 23166element type is the result/start type. The third argument is the vector mask and 23167is a vector of boolean values with the same number of elements as the vector 23168argument. The fourth argument is the explicit vector length of the operation. 
23169 23170Semantics: 23171"""""""""" 23172 23173The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX`` 23174reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the 23175vector argument ``val`` on each enabled lane, and taking the maximum of that and 23176the scalar ``start_value``. Disabled lanes are treated as containing the 23177neutral value ``0`` (i.e. having no effect on the reduction operation). If the 23178vector length is zero, the result is the start value. 23179 23180To ignore the start value, the neutral value can be used. 23181 23182Examples: 23183""""""""" 23184 23185.. code-block:: llvm 23186 23187 %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl) 23188 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl 23189 ; are treated as though %mask were false for those lanes. 23190 23191 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0> 23192 %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a) 23193 %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start) 23194 23195 23196.. _int_vp_reduce_umin: 23197 23198'``llvm.vp.reduce.umin.*``' Intrinsics 23199^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23200 23201Syntax: 23202""""""" 23203This is an overloaded intrinsic. 23204 23205:: 23206 23207 declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>) 23208 declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>) 23209 23210Overview: 23211""""""""" 23212 23213Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting 23214value, returning the result as a scalar. 23215 23216 23217Arguments: 23218"""""""""" 23219 23220The first argument is the start value of the reduction, which must be a scalar 23221integer type equal to the result type. 
The second argument is the vector on 23222which the reduction is performed and must be a vector of integer values whose 23223element type is the result/start type. The third argument is the vector mask and 23224is a vector of boolean values with the same number of elements as the vector 23225argument. The fourth argument is the explicit vector length of the operation. 23226 23227Semantics: 23228"""""""""" 23229 23230The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN`` 23231reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the 23232vector argument ``val`` on each enabled lane, taking the minimum of that and the 23233scalar ``start_value``. Disabled lanes are treated as containing the neutral 23234value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction 23235operation). If the vector length is zero, the result is the start value. 23236 23237To ignore the start value, the neutral value can be used. 23238 23239Examples: 23240""""""""" 23241 23242.. code-block:: llvm 23243 23244 %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl) 23245 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl 23246 ; are treated as though %mask were false for those lanes. 23247 23248 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1> 23249 %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a) 23250 %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start) 23251 23252 23253.. _int_vp_reduce_fmax: 23254 23255'``llvm.vp.reduce.fmax.*``' Intrinsics 23256^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23257 23258Syntax: 23259""""""" 23260This is an overloaded intrinsic. 
23261 23262:: 23263 23264 declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>) 23265 declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>) 23266 23267Overview: 23268""""""""" 23269 23270Predicated floating-point ``MAX`` reduction of a vector and a scalar starting 23271value, returning the result as a scalar. 23272 23273 23274Arguments: 23275"""""""""" 23276 23277The first argument is the start value of the reduction, which must be a scalar 23278floating-point type equal to the result type. The second argument is the vector 23279on which the reduction is performed and must be a vector of floating-point 23280values whose element type is the result/start type. The third argument is the 23281vector mask and is a vector of boolean values with the same number of elements 23282as the vector argument. The fourth argument is the explicit vector length of the 23283operation. 23284 23285Semantics: 23286"""""""""" 23287 23288The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX`` 23289reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the 23290vector argument ``val`` on each enabled lane, taking the maximum of that and the 23291scalar ``start_value``. Disabled lanes are treated as containing the neutral 23292value (i.e. having no effect on the reduction operation). If the vector length 23293is zero, the result is the start value. 23294 23295The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no 23296flags are set, the neutral value is ``-QNAN``. If ``nnan`` and ``ninf`` are 23297both set, then the neutral value is the smallest floating-point value for the 23298result type. If only ``nnan`` is set then the neutral value is ``-Infinity``. 
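
As a hedged illustration of the neutral values listed above for
'``llvm.vp.reduce.fmax``', the sketch below passes each one as the start value
so that only the enabled vector lanes determine the result. The function names
are hypothetical. The hexadecimal constants assume the usual IR spelling of
``float`` constants, and "smallest floating-point value" is read here as the
most negative finite value of the type.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @fmax_neutral_no_flags(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; No fast-math flags: the neutral value is -QNAN.
        %r = call float @llvm.vp.reduce.fmax.v4f32(float 0xFFF8000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }

      define float @fmax_neutral_nnan(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; Only nnan set: the neutral value is -Infinity.
        %r = call nnan float @llvm.vp.reduce.fmax.v4f32(float 0xFFF0000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }

      define float @fmax_neutral_nnan_ninf(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; nnan and ninf set: the neutral value is the most negative finite float.
        %r = call nnan ninf float @llvm.vp.reduce.fmax.v4f32(float 0xC7EFFFFFE0000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }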

This instruction has the same comparison semantics as the
:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
'``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
unless all elements of the vector and the starting value are ``NaN``. For a
vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
``-0.0`` elements, the sign of the result is unspecified.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
      %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
      %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)


.. _int_vp_reduce_fmin:

'``llvm.vp.reduce.fmin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
floating-point type equal to the result type.
The second argument is the vector 23349on which the reduction is performed and must be a vector of floating-point 23350values whose element type is the result/start type. The third argument is the 23351vector mask and is a vector of boolean values with the same number of elements 23352as the vector argument. The fourth argument is the explicit vector length of the 23353operation. 23354 23355Semantics: 23356"""""""""" 23357 23358The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN`` 23359reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the 23360vector argument ``val`` on each enabled lane, taking the minimum of that and the 23361scalar ``start_value``. Disabled lanes are treated as containing the neutral 23362value (i.e. having no effect on the reduction operation). If the vector length 23363is zero, the result is the start value. 23364 23365The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no 23366flags are set, the neutral value is ``+QNAN``. If ``nnan`` and ``ninf`` are 23367both set, then the neutral value is the largest floating-point value for the 23368result type. If only ``nnan`` is set then the neutral value is ``+Infinity``. 23369 23370This instruction has the same comparison semantics as the 23371:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the 23372'``llvm.minnum.*``' intrinsic). That is, the result will always be a number 23373unless all elements of the vector and the starting value are ``NaN``. For a 23374vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and 23375``-0.0`` elements, the sign of the result is unspecified. 23376 23377To ignore the start value, the neutral value can be used. 23378 23379Examples: 23380""""""""" 23381 23382.. 
code-block:: llvm 23383 23384 %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl) 23385 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl 23386 ; are treated as though %mask were false for those lanes. 23387 23388 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN> 23389 %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a) 23390 %also.r = call float @llvm.minnum.f32(float %reduction, float %start) 23391 23392 23393.. _int_vp_reduce_fmaximum: 23394 23395'``llvm.vp.reduce.fmaximum.*``' Intrinsics 23396^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23397 23398Syntax: 23399""""""" 23400This is an overloaded intrinsic. 23401 23402:: 23403 23404 declare float @llvm.vp.reduce.fmaximum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>) 23405 declare double @llvm.vp.reduce.fmaximum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>) 23406 23407Overview: 23408""""""""" 23409 23410Predicated floating-point ``MAX`` reduction of a vector and a scalar starting 23411value, returning the result as a scalar. 23412 23413 23414Arguments: 23415"""""""""" 23416 23417The first argument is the start value of the reduction, which must be a scalar 23418floating-point type equal to the result type. The second argument is the vector 23419on which the reduction is performed and must be a vector of floating-point 23420values whose element type is the result/start type. The third argument is the 23421vector mask and is a vector of boolean values with the same number of elements 23422as the vector argument. The fourth argument is the explicit vector length of the 23423operation. 

Semantics:
""""""""""

The '``llvm.vp.reduce.fmaximum``' intrinsic performs the floating-point ``MAX``
reduction (:ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>`) of
the vector argument ``val`` on each enabled lane, taking the maximum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
neutral value (i.e. having no effect on the reduction operation). If the vector
length is zero, the result is the start value.

The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set or only ``nnan`` is set, the neutral value is ``-Infinity``.
If ``ninf`` is set, then the neutral value is the smallest floating-point value
for the result type.

This instruction has the same comparison semantics as the
:ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>` intrinsic (and
thus the '``llvm.maximum.*``' intrinsic). That is, the result will always be a
number unless any of the elements in the vector or the starting value is
``NaN``. Namely, this intrinsic propagates ``NaN``. Also, ``-0.0`` is considered
less than ``+0.0``.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fmaximum.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
      %reduction = call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %masked.a)
      %also.r = call float @llvm.maximum.f32(float %reduction, float %start)


.. _int_vp_reduce_fminimum:

'``llvm.vp.reduce.fminimum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fminimum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fminimum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first argument is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second argument is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third argument is the
vector mask and is a vector of boolean values with the same number of elements
as the vector argument. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fminimum``' intrinsic performs the floating-point ``MIN``
reduction (:ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>`) of
the vector argument ``val`` on each enabled lane, taking the minimum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the neutral
value (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set or only ``nnan`` is set, the neutral value is ``+Infinity``.
23507If ``ninf`` is set, then the neutral value is the largest floating-point value 23508for the result type. 23509 23510This instruction has the same comparison semantics as the 23511:ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>` intrinsic (and 23512thus the '``llvm.minimum.*``' intrinsic). That is, the result will always be a 23513number unless any of the elements in the vector or the starting value is 23514``NaN``. Namely, this intrinsic propagates ``NaN``. Also, -0.0 is considered 23515less than +0.0. 23516 23517To ignore the start value, the neutral value can be used. 23518 23519Examples: 23520""""""""" 23521 23522.. code-block:: llvm 23523 23524 %r = call float @llvm.vp.reduce.fminimum.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl) 23525 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl 23526 ; are treated as though %mask were false for those lanes. 23527 23528 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float infinity, float infinity, float infinity, float infinity> 23529 %reduction = call float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %masked.a) 23530 %also.r = call float @llvm.minimum.f32(float %reduction, float %start) 23531 23532 23533.. _int_get_active_lane_mask: 23534 23535'``llvm.get.active.lane.mask.*``' Intrinsics 23536^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23537 23538Syntax: 23539""""""" 23540This is an overloaded intrinsic. 23541 23542:: 23543 23544 declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n) 23545 declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n) 23546 declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n) 23547 declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n) 23548 23549 23550Overview: 23551""""""""" 23552 23553Create a mask representing active and inactive vector lanes. 
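
As a small worked illustration, assuming the lane-wise unsigned comparison
given under Semantics for this intrinsic, the sketch below builds the mask for
a four-lane vector whose first lane has index 6 when the bound is 8. The
function name ``@tail_mask`` is hypothetical.

.. code-block:: llvm

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32, i32)

      define <4 x i1> @tail_mask() {
        ; The lanes cover indices 6, 7, 8 and 9. Only 6 and 7 are below the
        ; bound 8, so %m is <i1 true, i1 true, i1 false, i1 false>.
        %m = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 6, i32 8)
        ret <4 x i1> %m
      }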


Arguments:
""""""""""

Both arguments have the same scalar integer type. The result is a vector with
the i1 element type.

Semantics:
""""""""""

The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
to:

::

      %m[i] = icmp ult (%base + i), %n

where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
the unsigned less-than comparison operator. Overflow cannot occur in
``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
poison value. The above is equivalent to:

::

      %m = @llvm.get.active.lane.mask(%base, %n)

This can, for example, be emitted by the loop vectorizer in which case
``%base`` is the first element of the vector induction variable (VIV) and
``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
less than comparison of VIV with the loop tripcount, producing a mask of
true/false values representing active/inactive vector lanes, except if the VIV
overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what the type of the step vector needs to be that enumerates its
lanes without overflow.

This mask ``%m`` can e.g. be used in masked load/store instructions. These
intrinsics provide a hint to the backend. I.e., for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.


Examples:
"""""""""

.. code-block:: llvm

      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> poison)


.. _int_experimental_vp_splice:

'``llvm.experimental.vp.splice``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
      declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)

Overview:
"""""""""

The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
predicated version of the '``llvm.vector.splice.*``' intrinsic.

Arguments:
""""""""""

The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
the same type. The third argument ``imm`` is an immediate signed integer that
indicates the offset index. The fourth argument ``mask`` is a vector mask and
has the same number of elements as the result. The last two arguments ``evl1``
and ``evl2`` are unsigned integers indicating the explicit vector lengths of
``vec1`` and ``vec2`` respectively. ``imm``, ``evl1`` and ``evl2`` should
respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
constraints are not satisfied the intrinsic has undefined behavior.
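
The following sketch shows one call that respects the constraints above for a
four-element vector, using ``imm = 1``, ``evl1 = 2`` and ``evl2 = 3`` with an
all-true mask. The function name ``@splice_example`` and the ``v4i32``
overload are assumptions made for illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32>, <4 x i32>, i32, <4 x i1>, i32, i32)

      define <4 x i32> @splice_example(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; -2 <= 1 < 2 holds for imm, and both explicit vector lengths are
        ; within the four available lanes.
        %r = call <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2, i32 3)
        ret <4 x i32> %r
      }

With ``%vec1 = <A, B, C, D>`` and ``%vec2 = <E, F, G, H>``, the first three
lanes of ``%r`` hold ``B``, ``E`` and ``F``.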
23641 23642Semantics: 23643"""""""""" 23644 23645Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and 23646``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a 23647window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of 23648the concatenated vector. Elements in the result vector beyond ``evl2`` are 23649``undef``. If ``imm`` is negative the starting index is ``evl1 + imm``. The result 23650vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for 23651negative ``imm``) elements from indices ``[imm..evl1 - 1]`` 23652(``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the 23653first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of 23654``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2`` 23655elements are considered and the remaining are ``undef``. The lanes in the result 23656vector disabled by ``mask`` are ``poison``. 23657 23658Examples: 23659""""""""" 23660 23661.. code-block:: text 23662 23663 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3); ==> <B, E, F, poison> index 23664 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2); ==> <B, C, poison, poison> trailing elements 23665 23666 23667.. _int_experimental_vp_splat: 23668 23669 23670'``llvm.experimental.vp.splat``' Intrinsic 23671^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23672 23673Syntax: 23674""""""" 23675This is an overloaded intrinsic. 23676 23677:: 23678 23679 declare <2 x double> @llvm.experimental.vp.splat.v2f64(double %scalar, <2 x i1> %mask, i32 %evl) 23680 declare <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32 %scalar, <vscale x 4 x i1> %mask, i32 %evl) 23681 23682Overview: 23683""""""""" 23684 23685The '``llvm.experimental.vp.splat.*``' intrinsic is to create a predicated splat 23686with specific effective vector length. 

Arguments:
""""""""""

The result is a vector and it is a splat of the first scalar argument. The
second argument ``mask`` is a vector mask and has the same number of elements as
the result. The third argument is the explicit vector length of the operation.

Semantics:
""""""""""

This intrinsic splats a vector with ``evl`` elements of a scalar argument.
The lanes in the result vector disabled by ``mask`` are ``poison``. The
elements past ``evl`` are poison.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.experimental.vp.splat.v4f32(float %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
      %e = insertelement <4 x float> poison, float %a, i32 0
      %s = shufflevector <4 x float> %e, <4 x float> poison, <4 x i32> zeroinitializer
      %also.r = select <4 x i1> %mask, <4 x float> %s, <4 x float> poison


.. _int_experimental_vp_reverse:


'``llvm.experimental.vp.reverse``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x double> @llvm.experimental.vp.reverse.v2f64(<2 x double> %vec, <2 x i1> %mask, i32 %evl)
      declare <vscale x 4 x i32> @llvm.experimental.vp.reverse.nxv4i32(<vscale x 4 x i32> %vec, <vscale x 4 x i1> %mask, i32 %evl)

Overview:
"""""""""

The '``llvm.experimental.vp.reverse.*``' intrinsic is the vector length
predicated version of the '``llvm.vector.reverse.*``' intrinsic.

Arguments:
""""""""""

The result and the first argument ``vec`` are vectors with the same type.
The second argument ``mask`` is a vector mask and has the same number of
elements as the result. The third argument is the explicit vector length of
the operation.
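
As a minimal sketch of how these arguments fit together, the call below
reverses the first three elements of a four-element vector with an all-true
mask, following the behavior specified under Semantics for this intrinsic. The
function name ``@reverse_first_three`` and the ``v4i32`` overload are
assumptions made for illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32>, <4 x i1>, i32)

      define <4 x i32> @reverse_first_three(<4 x i32> %vec) {
        ; With %vec = <A, B, C, D> and evl = 3, the first three elements are
        ; reversed and the lane past evl is poison: <C, B, A, poison>.
        %r = call <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32> %vec, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ret <4 x i32> %r
      }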
23742 23743Semantics: 23744"""""""""" 23745 23746This intrinsic reverses the order of the first ``evl`` elements in a vector. 23747The lanes in the result vector disabled by ``mask`` are ``poison``. The 23748elements past ``evl`` are poison. 23749 23750.. _int_vp_load: 23751 23752'``llvm.vp.load``' Intrinsic 23753^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 23754 23755Syntax: 23756""""""" 23757This is an overloaded intrinsic. 23758 23759:: 23760 23761 declare <4 x float> @llvm.vp.load.v4f32.p0(ptr %ptr, <4 x i1> %mask, i32 %evl) 23762 declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0(ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl) 23763 declare <8 x float> @llvm.vp.load.v8f32.p1(ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl) 23764 declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6(ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl) 23765 23766Overview: 23767""""""""" 23768 23769The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of 23770the :ref:`llvm.masked.load <int_mload>` intrinsic. 23771 23772Arguments: 23773"""""""""" 23774 23775The first argument is the base pointer for the load. The second argument is a 23776vector of boolean values with the same number of elements as the return type. 23777The third is the explicit vector length of the operation. The return type and 23778underlying type of the base pointer are the same vector types. 23779 23780The :ref:`align <attr_align>` parameter attribute can be provided for the first 23781argument. 23782 23783Semantics: 23784"""""""""" 23785 23786The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as 23787the '``llvm.masked.load``' intrinsic, where the mask is taken from the 23788combination of the '``mask``' and '``evl``' arguments in the usual VP way. 
Certain '``llvm.masked.load``' arguments do not have corresponding arguments in
'``llvm.vp.load``': the '``passthru``' argument is implicitly ``poison``; the
'``alignment``' argument is taken as the ``align`` parameter attribute, if
provided. The default alignment is taken as the ABI alignment of the return
type as specified by the :ref:`datalayout string<langref_datalayout>`.

Examples:
"""""""""

.. code-block:: text

      %r = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0(ptr %ptr, i32 2, <8 x i1> %mask, <8 x i8> poison)


.. _int_vp_store:

'``llvm.vp.store``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare void @llvm.vp.store.v4f32.p0(<4 x float> %val, ptr %ptr, <4 x i1> %mask, i32 %evl)
      declare void @llvm.vp.store.nxv2i16.p0(<vscale x 2 x i16> %val, ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
      declare void @llvm.vp.store.v8f32.p1(<8 x float> %val, ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
      declare void @llvm.vp.store.nxv1i64.p6(<vscale x 1 x i64> %val, ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)

Overview:
"""""""""

The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
the :ref:`llvm.masked.store <int_mstore>` intrinsic.

Arguments:
""""""""""

The first argument is the vector value to be written to memory. The second
argument is the base pointer for the store. It has the same underlying type as
the value argument. The third argument is a vector of boolean values with the
same number of elements as the value argument.
The fourth is the explicit vector
length of the operation.

The :ref:`align <attr_align>` parameter attribute can be provided for the
second argument.

Semantics:
""""""""""

The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
the '``llvm.masked.store``' intrinsic, where the mask is taken from the
combination of the '``mask``' and '``evl``' arguments in the usual VP way. The
alignment of the operation (corresponding to the '``alignment``' argument of
'``llvm.masked.store``') is specified by the ``align`` parameter attribute (see
above). If it is not provided then the ABI alignment of the type of the
'``value``' argument as specified by the :ref:`datalayout
string<langref_datalayout>` is used instead.

Examples:
"""""""""

.. code-block:: text

      call void @llvm.vp.store.v8i8.p0(<8 x i8> %val, ptr align 4 %ptr, <8 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.

      call void @llvm.masked.store.v8i8.p0(<8 x i8> %val, ptr %ptr, i32 4, <8 x i1> %mask)


.. _int_experimental_vp_strided_load:

'``llvm.experimental.vp.strided.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x float> @llvm.experimental.vp.strided.load.v4f32.i64(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
      declare <vscale x 2 x i16> @llvm.experimental.vp.strided.load.nxv2i16.i64(ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)

Overview:
"""""""""

The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, scalar values from
memory locations evenly spaced apart by '``stride``' number of bytes, starting from '``ptr``'.
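
As a sketch of how the stride is counted, assume a hypothetical pointer
``%base`` to an array of ``i64`` values (8 bytes each). A stride of 16 bytes
then selects every second array element. The mangled name suffix mirrors the
declarations above and is an assumption of this example.

.. code-block:: llvm

      ;; Each enabled lane i below %evl reads the i64 at byte offset 16 * i
      ;; from %base, that is, array elements 0, 2, 4 and 6.
      %r = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.i64(ptr %base, i64 16, <4 x i1> %mask, i32 %evl)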

Arguments:
""""""""""

The first argument is the base pointer for the load. The second argument is the stride
value expressed in bytes. The third argument is a vector of boolean values
with the same number of elements as the return type. The fourth is the explicit
vector length of the operation. The base pointer underlying type matches the type of the scalar
elements of the return type.

The :ref:`align <attr_align>` parameter attribute can be provided for the first
argument.

Semantics:
""""""""""

The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, multiple scalar
values from memory in the same way as the :ref:`llvm.vp.gather <int_vp_gather>` intrinsic,
where the vector of pointers is in the form:

    ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,

with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
integer and all arithmetic occurring in the pointer type.

Examples:
"""""""""

.. code-block:: text

      %r = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.i64(ptr %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
      ;; The operation can also be expressed like this:

      ;; Create a vector of pointers %ptrs in the form:
      ;; %ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ...>
      ;; with all offsets counted in bytes.
      %also.r = call <8 x i64> @llvm.vp.gather.v8i64.v8p0(<8 x ptr> %ptrs, <8 x i1> %mask, i32 %evl)


.. _int_experimental_vp_strided_store:

'``llvm.experimental.vp.strided.store``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %val, ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
      declare void @llvm.experimental.vp.strided.store.nxv2i16.i64(<vscale x 2 x i16> %val, ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)

Overview:
"""""""""

The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
'``val``' into memory locations evenly spaced apart by '``stride``' number of
bytes, starting from '``ptr``'.

Arguments:
""""""""""

The first argument is the vector value to be written to memory. The second
argument is the base pointer for the store. Its underlying type matches the
scalar element type of the value argument. The third argument is the stride value
expressed in bytes. The fourth argument is a vector of boolean values with the
same number of elements as the value argument. The fifth is the explicit vector
length of the operation.

The :ref:`align <attr_align>` parameter attribute can be provided for the
second argument.

Semantics:
""""""""""

The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
'``val``' in the same way as the :ref:`llvm.vp.scatter <int_vp_scatter>` intrinsic,
where the vector of pointers is in the form:

    ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,

with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
integer and all arithmetic occurring in the pointer type.

Examples:
"""""""""

.. code-block:: text

      call void @llvm.experimental.vp.strided.store.v8i64.i64(<8 x i64> %val, ptr %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
      ;; The operation can also be expressed like this:

      ;; Create a vector of pointers %ptrs in the form:
      ;; %ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ...>
      ;; with all offsets counted in bytes.
      call void @llvm.vp.scatter.v8i64.v8p0(<8 x i64> %val, <8 x ptr> %ptrs, <8 x i1> %mask, i32 %evl)


.. _int_vp_gather:

'``llvm.vp.gather``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x double> @llvm.vp.gather.v4f64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
      declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
      declare <2 x float> @llvm.vp.gather.v2f32.v2p2(<2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
      declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4(<vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)

Overview:
"""""""""

The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
the :ref:`llvm.masked.gather <int_mgather>` intrinsic.

Arguments:
""""""""""

The first argument is a vector of pointers which holds all memory addresses to
read. The second argument is a vector of boolean values with the same number of
elements as the return type. The third is the explicit vector length of the
operation. The return type and underlying type of the vector of pointers are
the same vector types.

The :ref:`align <attr_align>` parameter attribute can be provided for the first
argument.
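
One way such a vector of pointers might be formed is a ``getelementptr`` with
a vector of indices. In the sketch below, ``%base`` and ``%idx`` are
hypothetical values introduced only for illustration.

.. code-block:: llvm

      ;; Lane i of %ptrs is the address of the i32 element at index %idx[i] from %base.
      %ptrs = getelementptr i32, ptr %base, <4 x i64> %idx
      ;; Each enabled lane reads through its own pointer.
      %r = call <4 x i32> @llvm.vp.gather.v4i32.v4p0(<4 x ptr> align 4 %ptrs, <4 x i1> %mask, i32 %evl)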

Semantics:
""""""""""

The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
from the combination of the '``mask``' and '``evl``' arguments in the usual VP
way. Certain '``llvm.masked.gather``' arguments do not have corresponding
arguments in '``llvm.vp.gather``': the '``passthru``' argument is implicitly
``poison``; the '``alignment``' argument is taken as the ``align`` parameter
attribute, if provided. The default alignment is taken as the ABI alignment of
the source addresses as specified by the :ref:`datalayout
string<langref_datalayout>`.

Examples:
"""""""""

.. code-block:: text

      %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr> align 8 %ptrs, <8 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0(<8 x ptr> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> poison)


.. _int_vp_scatter:

'``llvm.vp.scatter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare void @llvm.vp.scatter.v4f64.v4p0(<4 x double> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
      declare void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> %val, <vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
      declare void @llvm.vp.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
      declare void @llvm.vp.scatter.nxv4i32.nxv4p4(<vscale x 4 x i32> %val, <vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)

Overview:
"""""""""

The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.

Arguments:
""""""""""

The first argument is a vector value to be written to memory. The second argument
is a vector of pointers, pointing to where the value elements should be stored.
The third argument is a vector of boolean values with the same number of
elements as the value argument. The fourth is the explicit vector length of the
operation.

The :ref:`align <attr_align>` parameter attribute can be provided for the
second argument.

Semantics:
""""""""""

The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
taken from the combination of the '``mask``' and '``evl``' arguments in the
usual VP way. The '``alignment``' argument of '``llvm.masked.scatter``' does
not have a corresponding argument in '``llvm.vp.scatter``': it is instead
provided via the optional ``align`` parameter attribute on the
vector-of-pointers argument. Otherwise it is taken as the ABI alignment of the
destination addresses as specified by the :ref:`datalayout
string<langref_datalayout>`.

Examples:
"""""""""

.. code-block:: text

      call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> align 1 %ptrs, <8 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.

      call void @llvm.masked.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> %ptrs, i32 1, <8 x i1> %mask)


.. _int_vp_trunc:

'``llvm.vp.trunc.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i16> @llvm.vp.trunc.v16i16.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i16> @llvm.vp.trunc.nxv4i16.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.trunc``' intrinsic truncates its first argument to the return
type. The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.trunc``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`integer <t_integer>` type. The bit size of the value must be larger than
the bit size of the return type. The second argument is the vector mask. The
return type, the value to cast, and the vector mask have the same number of
elements. The third argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.trunc``' intrinsic truncates the high order bits in value and
converts the remaining bits to return type. Since the source size must be larger
than the destination size, '``llvm.vp.trunc``' cannot be a *no-op cast*. It will
always truncate bits. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true.
Masked-off lanes are
``poison``.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = trunc <4 x i32> %a to <4 x i16>
      %also.r = select <4 x i1> %mask, <4 x i16> %t, <4 x i16> poison


.. _int_vp_zext:

'``llvm.vp.zext.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.zext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.zext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.zext``' intrinsic zero extends its first argument to the return
type. The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.zext``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`integer <t_integer>` type. The bit size of the value must be smaller than
the bit size of the return type. The second argument is the vector mask. The
return type, the value to cast, and the vector mask have the same number of
elements. The third argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.zext``' intrinsic fills the high order bits of the value with zero
bits until it reaches the size of the return type. When zero extending from i1,
the result will always be either 0 or 1. The conversion is performed on lane
positions below the explicit vector length and where the vector mask is true.
Masked-off lanes are ``poison``.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = zext <4 x i16> %a to <4 x i32>
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_sext:

'``llvm.vp.sext.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.sext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.sext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.sext``' intrinsic sign extends its first argument to the return
type. The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.sext``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`integer <t_integer>` type. The bit size of the value must be smaller than
the bit size of the return type. The second argument is the vector mask. The
return type, the value to cast, and the vector mask have the same number of
elements. The third argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.sext``' intrinsic performs a sign extension by copying the sign
bit (highest order bit) of the value until it reaches the size of the return
type. When sign extending from i1, the result will always be either -1 or 0.
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are ``poison``.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sext <4 x i16> %a to <4 x i32>
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_fptrunc:

'``llvm.vp.fptrunc.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fptrunc.v16f32.v16f64 (<16 x double> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fptrunc.nxv4f32.nxv4f64 (<vscale x 4 x double> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.fptrunc``' intrinsic truncates its first argument to the return
type. The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.fptrunc``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`floating-point <t_floating>` type. The bit size of the value must be
larger than the bit size of the return type. This implies that
'``llvm.vp.fptrunc``' cannot be used to make a *no-op cast*. The second argument
is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third argument is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.fptrunc``' intrinsic casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This intrinsic is assumed to execute in the default :ref:`floating-point
environment <floatenv>`. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fptrunc <4 x double> %a to <4 x float>
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_fpext:

'``llvm.vp.fpext.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x double> @llvm.vp.fpext.v16f64.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x double> @llvm.vp.fpext.nxv4f64.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.fpext``' intrinsic extends its first argument to the return
type. The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.fpext``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`floating-point <t_floating>` type. The bit size of the value must be
smaller than the bit size of the return type. This implies that
'``llvm.vp.fpext``' cannot be used to make a *no-op cast*.
The second argument
is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third argument is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.fpext``' intrinsic extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The '``llvm.vp.fpext``' cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are ``poison``.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fpext <4 x float> %a to <4 x double>
      %also.r = select <4 x i1> %mask, <4 x double> %t, <4 x double> poison


.. _int_vp_fptoui:

'``llvm.vp.fptoui.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.fptoui.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.fptoui.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.fptoui.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.fptoui``' intrinsic converts the :ref:`floating-point
<t_floating>` argument to the unsigned integer return type.
The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.fptoui``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fptoui``' intrinsic converts its :ref:`floating-point
<t_floating>` argument into the nearest (rounding towards zero) unsigned integer
value where the lane position is below the explicit vector length and the
vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
conversion takes place and the value cannot fit in the return type, the result
on that lane is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.fptoui.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fptoui <4 x float> %a to <4 x i32>
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_fptosi:

'``llvm.vp.fptosi.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.fptosi.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.fptosi.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.fptosi.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.fptosi``' intrinsic converts the :ref:`floating-point
<t_floating>` argument to the signed integer return type.
The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.fptosi``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fptosi``' intrinsic converts its :ref:`floating-point
<t_floating>` argument into the nearest (rounding towards zero) signed integer
value where the lane position is below the explicit vector length and the
vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
conversion takes place and the value cannot fit in the return type, the result
on that lane is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.fptosi.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fptosi <4 x float> %a to <4 x i32>
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_uitofp:

'``llvm.vp.uitofp.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.uitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.uitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.uitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.uitofp``' intrinsic converts its unsigned integer argument to the
:ref:`floating-point <t_floating>` return type. The operation has a mask and
an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.uitofp``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.uitofp``' intrinsic interprets its first argument as an unsigned
integer quantity and converts it to the corresponding floating-point value.
If
the value cannot be exactly represented, it is rounded using the default
rounding mode. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.uitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = uitofp <4 x i32> %a to <4 x float>
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_sitofp:

'``llvm.vp.sitofp.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.sitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.sitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.sitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.sitofp``' intrinsic converts its signed integer argument to the
:ref:`floating-point <t_floating>` return type. The operation has a mask and
an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.sitofp``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements.
The third argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sitofp``' intrinsic interprets its first argument as a signed
integer quantity and converts it to the corresponding floating-point value. If
the value cannot be exactly represented, it is rounded using the default
rounding mode. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sitofp <4 x i32> %a to <4 x float>
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_ptrtoint:

'``llvm.vp.ptrtoint.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i8> @llvm.vp.ptrtoint.v16i8.v16p0(<16 x ptr> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i8> @llvm.vp.ptrtoint.nxv4i8.nxv4p0(<vscale x 4 x ptr> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.ptrtoint.v256i64.v256p0(<256 x ptr> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.ptrtoint``' intrinsic converts its pointer argument to the
integer return type. The operation has a mask and an explicit vector length
parameter.


Arguments:
""""""""""

The '``llvm.vp.ptrtoint``' intrinsic takes a value to cast as its first
argument, which must be a vector of pointers, and a return type to cast it to,
which must be a vector of :ref:`integer <t_integer>` type.
The second argument is the vector mask. The return type, the value to cast, and
the vector mask have the same number of elements.
The third argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.ptrtoint``' intrinsic converts ``value`` to the return type by
interpreting the pointer value as an integer and either truncating or zero
extending that value to the size of the integer type.
If ``value`` is smaller than the return type, then a zero extension is done. If
``value`` is larger than the return type, then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are ``poison``.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i8> @llvm.vp.ptrtoint.v4i8.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = ptrtoint <4 x ptr> %a to <4 x i8>
   %also.r = select <4 x i1> %mask, <4 x i8> %t, <4 x i8> poison


.. _int_vp_inttoptr:

'``llvm.vp.inttoptr.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x ptr> @llvm.vp.inttoptr.v16p0.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x ptr> @llvm.vp.inttoptr.nxv4p0.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x ptr> @llvm.vp.inttoptr.v256p0.v256i32 (<256 x i32> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.inttoptr``' intrinsic converts its integer value to the pointer
return type.
The operation has a mask and an explicit vector length parameter.


Arguments:
""""""""""

The '``llvm.vp.inttoptr``' intrinsic takes a value to cast as its first
argument, which must be a vector of :ref:`integer <t_integer>` type. The return
type is the type to cast the value to, which must be a vector of pointers.
The second argument is the vector mask. The return type, the value to cast, and
the vector mask have the same number of elements.
The third argument is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.inttoptr``' intrinsic converts ``value`` to the return type by
applying either a zero extension or a truncation depending on the size of the
integer ``value``. If ``value`` is larger than the size of a pointer, then a
truncation is done. If ``value`` is smaller than the size of a pointer, then a
zero extension is done. If they are the same size, nothing is done (*no-op cast*).
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are ``poison``.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = inttoptr <4 x i32> %a to <4 x ptr>
   %also.r = select <4 x i1> %mask, <4 x ptr> %t, <4 x ptr> poison


.. _int_vp_fcmp:

'``llvm.vp.fcmp.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.
24711 24712:: 24713 24714 declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> <left_op>, <16 x float> <right_op>, metadata <condition code>, <16 x i1> <mask>, i32 <vector_length>) 24715 declare <vscale x 4 x i1> @llvm.vp.fcmp.nxv4f32(<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, metadata <condition code>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 24716 declare <256 x i1> @llvm.vp.fcmp.v256f64(<256 x double> <left_op>, <256 x double> <right_op>, metadata <condition code>, <256 x i1> <mask>, i32 <vector_length>) 24717 24718Overview: 24719""""""""" 24720 24721The '``llvm.vp.fcmp``' intrinsic returns a vector of boolean values based on 24722the comparison of its arguments. The operation has a mask and an explicit vector 24723length parameter. 24724 24725 24726Arguments: 24727"""""""""" 24728 24729The '``llvm.vp.fcmp``' intrinsic takes the two values to compare as its first 24730and second arguments. These two values must be vectors of :ref:`floating-point 24731<t_floating>` types. 24732The return type is the result of the comparison. The return type must be a 24733vector of :ref:`i1 <t_integer>` type. The fourth argument is the vector mask. 24734The return type, the values to compare, and the vector mask have the same 24735number of elements. The third argument is the condition code indicating the kind 24736of comparison to perform. It must be a metadata string with :ref:`one of the 24737supported floating-point condition code values <fcmp_md_cc>`. The fifth argument 24738is the explicit vector length of the operation. 24739 24740Semantics: 24741"""""""""" 24742 24743The '``llvm.vp.fcmp``' compares its first two arguments according to the 24744condition code given as the third argument. The arguments are compared element by 24745element on each enabled lane, where the semantics of the comparison are 24746defined :ref:`according to the condition code <fcmp_md_cc_sem>`. Masked-off 24747lanes are ``poison``. 
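
As a rough illustration of these lane rules, the sketch below calls the
intrinsic on constant operands. The function name ``@vp_fcmp_sketch`` and the
operand values are hypothetical and chosen only for this example; the per-lane
results noted in the comments follow from the ``oeq`` condition code, the mask,
and the explicit vector length.

.. code-block:: llvm

   declare <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float>, <4 x float>, metadata, <4 x i1>, i32)

   ; Hypothetical wrapper used only for illustration.
   define <4 x i1> @vp_fcmp_sketch() {
     ; Lane 2 of both operands is a quiet NaN (0x7FF8000000000000).
     %r = call <4 x i1> @llvm.vp.fcmp.v4f32(
              <4 x float> <float 1.0, float 2.0, float 0x7FF8000000000000, float 4.0>,
              <4 x float> <float 1.0, float 3.0, float 0x7FF8000000000000, float 4.0>,
              metadata !"oeq",
              <4 x i1> <i1 true, i1 true, i1 true, i1 false>,
              i32 4)
     ; lane 0: 1.0 oeq 1.0   -> true
     ; lane 1: 2.0 oeq 3.0   -> false
     ; lane 2: NaN oeq NaN   -> false (ordered comparisons are false on NaN)
     ; lane 3: mask is false -> poison
     ret <4 x i1> %r
   }

With an explicit vector length smaller than 4, the lanes at or above that
length would be disabled in the same way as lane 3 here.
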
24748 24749Examples: 24750""""""""" 24751 24752.. code-block:: llvm 24753 24754 %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"oeq", <4 x i1> %mask, i32 %evl) 24755 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 24756 24757 %t = fcmp oeq <4 x float> %a, %b 24758 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison 24759 24760 24761.. _int_vp_icmp: 24762 24763'``llvm.vp.icmp.*``' Intrinsics 24764^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 24765 24766Syntax: 24767""""""" 24768This is an overloaded intrinsic. 24769 24770:: 24771 24772 declare <32 x i1> @llvm.vp.icmp.v32i32(<32 x i32> <left_op>, <32 x i32> <right_op>, metadata <condition code>, <32 x i1> <mask>, i32 <vector_length>) 24773 declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32> <left_op>, <vscale x 2 x i32> <right_op>, metadata <condition code>, <vscale x 2 x i1> <mask>, i32 <vector_length>) 24774 declare <128 x i1> @llvm.vp.icmp.v128i8(<128 x i8> <left_op>, <128 x i8> <right_op>, metadata <condition code>, <128 x i1> <mask>, i32 <vector_length>) 24775 24776Overview: 24777""""""""" 24778 24779The '``llvm.vp.icmp``' intrinsic returns a vector of boolean values based on 24780the comparison of its arguments. The operation has a mask and an explicit vector 24781length parameter. 24782 24783 24784Arguments: 24785"""""""""" 24786 24787The '``llvm.vp.icmp``' intrinsic takes the two values to compare as its first 24788and second arguments. These two values must be vectors of :ref:`integer 24789<t_integer>` types. 24790The return type is the result of the comparison. The return type must be a 24791vector of :ref:`i1 <t_integer>` type. The fourth argument is the vector mask. 24792The return type, the values to compare, and the vector mask have the same 24793number of elements. The third argument is the condition code indicating the kind 24794of comparison to perform. 
It must be a metadata string with :ref:`one of the 24795supported integer condition code values <icmp_md_cc>`. The fifth argument is the 24796explicit vector length of the operation. 24797 24798Semantics: 24799"""""""""" 24800 24801The '``llvm.vp.icmp``' compares its first two arguments according to the 24802condition code given as the third argument. The arguments are compared element by 24803element on each enabled lane, where the semantics of the comparison are 24804defined :ref:`according to the condition code <icmp_md_cc_sem>`. Masked-off 24805lanes are ``poison``. 24806 24807Examples: 24808""""""""" 24809 24810.. code-block:: llvm 24811 24812 %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ne", <4 x i1> %mask, i32 %evl) 24813 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 24814 24815 %t = icmp ne <4 x i32> %a, %b 24816 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison 24817 24818.. _int_vp_ceil: 24819 24820'``llvm.vp.ceil.*``' Intrinsics 24821^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 24822 24823Syntax: 24824""""""" 24825This is an overloaded intrinsic. 24826 24827:: 24828 24829 declare <16 x float> @llvm.vp.ceil.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 24830 declare <vscale x 4 x float> @llvm.vp.ceil.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 24831 declare <256 x double> @llvm.vp.ceil.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 24832 24833Overview: 24834""""""""" 24835 24836Predicated floating-point ceiling of a vector of floating-point values. 24837 24838 24839Arguments: 24840"""""""""" 24841 24842The first argument and the result have the same vector of floating-point type. 24843The second argument is the vector mask and has the same number of elements as the 24844result vector type. The third argument is the explicit vector length of the 24845operation. 
24846 24847Semantics: 24848"""""""""" 24849 24850The '``llvm.vp.ceil``' intrinsic performs floating-point ceiling 24851(:ref:`ceil <int_ceil>`) of the first vector argument on each enabled lane. The 24852result on disabled lanes is a :ref:`poison value <poisonvalues>`. 24853 24854Examples: 24855""""""""" 24856 24857.. code-block:: llvm 24858 24859 %r = call <4 x float> @llvm.vp.ceil.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 24860 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 24861 24862 %t = call <4 x float> @llvm.ceil.v4f32(<4 x float> %a) 24863 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 24864 24865.. _int_vp_floor: 24866 24867'``llvm.vp.floor.*``' Intrinsics 24868^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 24869 24870Syntax: 24871""""""" 24872This is an overloaded intrinsic. 24873 24874:: 24875 24876 declare <16 x float> @llvm.vp.floor.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 24877 declare <vscale x 4 x float> @llvm.vp.floor.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 24878 declare <256 x double> @llvm.vp.floor.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 24879 24880Overview: 24881""""""""" 24882 24883Predicated floating-point floor of a vector of floating-point values. 24884 24885 24886Arguments: 24887"""""""""" 24888 24889The first argument and the result have the same vector of floating-point type. 24890The second argument is the vector mask and has the same number of elements as the 24891result vector type. The third argument is the explicit vector length of the 24892operation. 24893 24894Semantics: 24895"""""""""" 24896 24897The '``llvm.vp.floor``' intrinsic performs floating-point floor 24898(:ref:`floor <int_floor>`) of the first vector argument on each enabled lane. 24899The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 24900 24901Examples: 24902""""""""" 24903 24904.. 
code-block:: llvm 24905 24906 %r = call <4 x float> @llvm.vp.floor.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 24907 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 24908 24909 %t = call <4 x float> @llvm.floor.v4f32(<4 x float> %a) 24910 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 24911 24912.. _int_vp_rint: 24913 24914'``llvm.vp.rint.*``' Intrinsics 24915^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 24916 24917Syntax: 24918""""""" 24919This is an overloaded intrinsic. 24920 24921:: 24922 24923 declare <16 x float> @llvm.vp.rint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 24924 declare <vscale x 4 x float> @llvm.vp.rint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 24925 declare <256 x double> @llvm.vp.rint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 24926 24927Overview: 24928""""""""" 24929 24930Predicated floating-point rint of a vector of floating-point values. 24931 24932 24933Arguments: 24934"""""""""" 24935 24936The first argument and the result have the same vector of floating-point type. 24937The second argument is the vector mask and has the same number of elements as the 24938result vector type. The third argument is the explicit vector length of the 24939operation. 24940 24941Semantics: 24942"""""""""" 24943 24944The '``llvm.vp.rint``' intrinsic performs floating-point rint 24945(:ref:`rint <int_rint>`) of the first vector argument on each enabled lane. 24946The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 24947 24948Examples: 24949""""""""" 24950 24951.. code-block:: llvm 24952 24953 %r = call <4 x float> @llvm.vp.rint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 24954 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 24955 24956 %t = call <4 x float> @llvm.rint.v4f32(<4 x float> %a) 24957 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 24958 24959.. 
_int_vp_nearbyint: 24960 24961'``llvm.vp.nearbyint.*``' Intrinsics 24962^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 24963 24964Syntax: 24965""""""" 24966This is an overloaded intrinsic. 24967 24968:: 24969 24970 declare <16 x float> @llvm.vp.nearbyint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 24971 declare <vscale x 4 x float> @llvm.vp.nearbyint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 24972 declare <256 x double> @llvm.vp.nearbyint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 24973 24974Overview: 24975""""""""" 24976 24977Predicated floating-point nearbyint of a vector of floating-point values. 24978 24979 24980Arguments: 24981"""""""""" 24982 24983The first argument and the result have the same vector of floating-point type. 24984The second argument is the vector mask and has the same number of elements as the 24985result vector type. The third argument is the explicit vector length of the 24986operation. 24987 24988Semantics: 24989"""""""""" 24990 24991The '``llvm.vp.nearbyint``' intrinsic performs floating-point nearbyint 24992(:ref:`nearbyint <int_nearbyint>`) of the first vector argument on each enabled lane. 24993The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 24994 24995Examples: 24996""""""""" 24997 24998.. code-block:: llvm 24999 25000 %r = call <4 x float> @llvm.vp.nearbyint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 25001 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 25002 25003 %t = call <4 x float> @llvm.nearbyint.v4f32(<4 x float> %a) 25004 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 25005 25006.. _int_vp_round: 25007 25008'``llvm.vp.round.*``' Intrinsics 25009^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 25010 25011Syntax: 25012""""""" 25013This is an overloaded intrinsic. 
25014 25015:: 25016 25017 declare <16 x float> @llvm.vp.round.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 25018 declare <vscale x 4 x float> @llvm.vp.round.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 25019 declare <256 x double> @llvm.vp.round.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 25020 25021Overview: 25022""""""""" 25023 25024Predicated floating-point round of a vector of floating-point values. 25025 25026 25027Arguments: 25028"""""""""" 25029 25030The first argument and the result have the same vector of floating-point type. 25031The second argument is the vector mask and has the same number of elements as the 25032result vector type. The third argument is the explicit vector length of the 25033operation. 25034 25035Semantics: 25036"""""""""" 25037 25038The '``llvm.vp.round``' intrinsic performs floating-point round 25039(:ref:`round <int_round>`) of the first vector argument on each enabled lane. 25040The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 25041 25042Examples: 25043""""""""" 25044 25045.. code-block:: llvm 25046 25047 %r = call <4 x float> @llvm.vp.round.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 25048 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 25049 25050 %t = call <4 x float> @llvm.round.v4f32(<4 x float> %a) 25051 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 25052 25053.. _int_vp_roundeven: 25054 25055'``llvm.vp.roundeven.*``' Intrinsics 25056^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 25057 25058Syntax: 25059""""""" 25060This is an overloaded intrinsic. 
25061 25062:: 25063 25064 declare <16 x float> @llvm.vp.roundeven.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>) 25065 declare <vscale x 4 x float> @llvm.vp.roundeven.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 25066 declare <256 x double> @llvm.vp.roundeven.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>) 25067 25068Overview: 25069""""""""" 25070 25071Predicated floating-point roundeven of a vector of floating-point values. 25072 25073 25074Arguments: 25075"""""""""" 25076 25077The first argument and the result have the same vector of floating-point type. 25078The second argument is the vector mask and has the same number of elements as the 25079result vector type. The third argument is the explicit vector length of the 25080operation. 25081 25082Semantics: 25083"""""""""" 25084 25085The '``llvm.vp.roundeven``' intrinsic performs floating-point roundeven 25086(:ref:`roundeven <int_roundeven>`) of the first vector argument on each enabled 25087lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 25088 25089Examples: 25090""""""""" 25091 25092.. code-block:: llvm 25093 25094 %r = call <4 x float> @llvm.vp.roundeven.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl) 25095 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 25096 25097 %t = call <4 x float> @llvm.roundeven.v4f32(<4 x float> %a) 25098 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison 25099 25100.. _int_vp_roundtozero: 25101 25102'``llvm.vp.roundtozero.*``' Intrinsics 25103^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 25104 25105Syntax: 25106""""""" 25107This is an overloaded intrinsic. 

::

      declare <16 x float> @llvm.vp.roundtozero.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.roundtozero.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.roundtozero.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point round-to-zero of a vector of floating-point values.


Arguments:
""""""""""

The first argument and the result have the same vector of floating-point type.
The second argument is the vector mask and has the same number of elements as the
result vector type. The third argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.roundtozero``' intrinsic performs floating-point round-to-zero
(:ref:`llvm.trunc <int_llvm_trunc>`) of the first vector argument on each
enabled lane. The result on disabled lanes is a
:ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x float> @llvm.vp.roundtozero.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x float> @llvm.trunc.v4f32(<4 x float> %a)
   %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison

.. _int_vp_lrint:

'``llvm.vp.lrint.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.lrint.v16i32.v16f32(<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.lrint.nxv4i32.nxv4f32(<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.lrint.v256i64.v256f64(<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated lrint of a vector of floating-point values.


Arguments:
""""""""""

The result is an integer vector and the first argument is a vector of
:ref:`floating-point <t_floating>` type with the same number of elements as the
result vector type. The second argument is the vector mask and has the same
number of elements as the result vector type. The third argument is the
explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.lrint``' intrinsic performs lrint (:ref:`lrint <int_lrint>`) of
the first vector argument on each enabled lane. The result on disabled lanes is a
:ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.lrint.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.lrint.v4i32.v4f32(<4 x float> %a)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison

.. _int_vp_llrint:

'``llvm.vp.llrint.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.llrint.v16i32.v16f32(<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.llrint.nxv4i32.nxv4f32(<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.llrint.v256i64.v256f64(<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated llrint of a vector of floating-point values.


Arguments:
""""""""""

The result is an integer vector and the first argument is a vector of
:ref:`floating-point <t_floating>` type with the same number of elements as the
result vector type. The second argument is the vector mask and has the same
number of elements as the result vector type. The third argument is the
explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.llrint``' intrinsic performs llrint (:ref:`llrint <int_llrint>`) of
the first vector argument on each enabled lane. The result on disabled lanes is a
:ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.llrint.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.llrint.v4i32.v4f32(<4 x float> %a)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_bitreverse:

'``llvm.vp.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.
25249 25250:: 25251 25252 declare <16 x i32> @llvm.vp.bitreverse.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>) 25253 declare <vscale x 4 x i32> @llvm.vp.bitreverse.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 25254 declare <256 x i64> @llvm.vp.bitreverse.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>) 25255 25256Overview: 25257""""""""" 25258 25259Predicated bitreverse of a vector of integers. 25260 25261 25262Arguments: 25263"""""""""" 25264 25265The first argument and the result have the same vector of integer type. The 25266second argument is the vector mask and has the same number of elements as the 25267result vector type. The third argument is the explicit vector length of the 25268operation. 25269 25270Semantics: 25271"""""""""" 25272 25273The '``llvm.vp.bitreverse``' intrinsic performs bitreverse (:ref:`bitreverse <int_bitreverse>`) of the first argument on each 25274enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 25275 25276Examples: 25277""""""""" 25278 25279.. code-block:: llvm 25280 25281 %r = call <4 x i32> @llvm.vp.bitreverse.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl) 25282 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 25283 25284 %t = call <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> %a) 25285 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison 25286 25287 25288.. _int_vp_bswap: 25289 25290'``llvm.vp.bswap.*``' Intrinsics 25291^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 25292 25293Syntax: 25294""""""" 25295This is an overloaded intrinsic. 
25296 25297:: 25298 25299 declare <16 x i32> @llvm.vp.bswap.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>) 25300 declare <vscale x 4 x i32> @llvm.vp.bswap.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 25301 declare <256 x i64> @llvm.vp.bswap.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>) 25302 25303Overview: 25304""""""""" 25305 25306Predicated bswap of a vector of integers. 25307 25308 25309Arguments: 25310"""""""""" 25311 25312The first argument and the result have the same vector of integer type. The 25313second argument is the vector mask and has the same number of elements as the 25314result vector type. The third argument is the explicit vector length of the 25315operation. 25316 25317Semantics: 25318"""""""""" 25319 25320The '``llvm.vp.bswap``' intrinsic performs bswap (:ref:`bswap <int_bswap>`) of the first argument on each 25321enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 25322 25323Examples: 25324""""""""" 25325 25326.. code-block:: llvm 25327 25328 %r = call <4 x i32> @llvm.vp.bswap.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl) 25329 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 25330 25331 %t = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %a) 25332 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison 25333 25334 25335.. _int_vp_ctpop: 25336 25337'``llvm.vp.ctpop.*``' Intrinsics 25338^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 25339 25340Syntax: 25341""""""" 25342This is an overloaded intrinsic. 
25343 25344:: 25345 25346 declare <16 x i32> @llvm.vp.ctpop.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>) 25347 declare <vscale x 4 x i32> @llvm.vp.ctpop.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 25348 declare <256 x i64> @llvm.vp.ctpop.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>) 25349 25350Overview: 25351""""""""" 25352 25353Predicated ctpop of a vector of integers. 25354 25355 25356Arguments: 25357"""""""""" 25358 25359The first argument and the result have the same vector of integer type. The 25360second argument is the vector mask and has the same number of elements as the 25361result vector type. The third argument is the explicit vector length of the 25362operation. 25363 25364Semantics: 25365"""""""""" 25366 25367The '``llvm.vp.ctpop``' intrinsic performs ctpop (:ref:`ctpop <int_ctpop>`) of the first argument on each 25368enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 25369 25370Examples: 25371""""""""" 25372 25373.. code-block:: llvm 25374 25375 %r = call <4 x i32> @llvm.vp.ctpop.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl) 25376 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 25377 25378 %t = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> %a) 25379 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison 25380 25381 25382.. _int_vp_ctlz: 25383 25384'``llvm.vp.ctlz.*``' Intrinsics 25385^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 25386 25387Syntax: 25388""""""" 25389This is an overloaded intrinsic. 

::

      declare <16 x i32> @llvm.vp.ctlz.v16i32 (<16 x i32> <op>, i1 <is_zero_poison>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.ctlz.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_zero_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.ctlz.v256i64 (<256 x i64> <op>, i1 <is_zero_poison>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated ctlz of a vector of integers.


Arguments:
""""""""""

The first argument and the result have the same vector of integer type. The
second argument is a constant flag that indicates whether the intrinsic returns
a valid result if the first argument is zero. The third argument is the vector
mask and has the same number of elements as the result vector type. The fourth
argument is the explicit vector length of the operation. If the first argument
is zero and the second argument is true, the result is poison.

Semantics:
""""""""""

The '``llvm.vp.ctlz``' intrinsic performs ctlz (:ref:`ctlz <int_ctlz>`) of the
first argument on each enabled lane. The result on disabled lanes is a
:ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.ctlz.v4i32(<4 x i32> %a, i1 false, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %a, i1 false)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_cttz:

'``llvm.vp.cttz.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.cttz.v16i32 (<16 x i32> <op>, i1 <is_zero_poison>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.cttz.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_zero_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.cttz.v256i64 (<256 x i64> <op>, i1 <is_zero_poison>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated cttz of a vector of integers.


Arguments:
""""""""""

The first argument and the result have the same vector of integer type. The
second argument is a constant flag that indicates whether the intrinsic
returns a valid result if the first argument is zero. The third argument is
the vector mask and has the same number of elements as the result vector type.
The fourth argument is the explicit vector length of the operation. If the
first argument is zero and the second argument is true, the result is poison.

Semantics:
""""""""""

The '``llvm.vp.cttz``' intrinsic performs cttz (:ref:`cttz <int_cttz>`) of the
first argument on each enabled lane. The result on disabled lanes is a
:ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.cttz.v4i32(<4 x i32> %a, i1 false, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> %a, i1 false)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_cttz_elts:

'``llvm.vp.cttz.elts.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. You can use ``llvm.vp.cttz.elts`` on any
vector of integer elements, both fixed width and scalable.

::

      declare i32 @llvm.vp.cttz.elts.i32.v16i32 (<16 x i32> <op>, i1 <is_zero_poison>, <16 x i1> <mask>, i32 <vector_length>)
      declare i64 @llvm.vp.cttz.elts.i64.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_zero_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare i64 @llvm.vp.cttz.elts.i64.v256i1 (<256 x i1> <op>, i1 <is_zero_poison>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

The '``llvm.vp.cttz.elts``' intrinsic counts the number of trailing zero
elements of a vector. It is the vector-predicated version of
'``llvm.experimental.cttz.elts``'.

Arguments:
""""""""""

The first argument is the vector to be counted. This argument must be a vector
with integer element type. The return type must also be an integer type which is
wide enough to hold the maximum number of elements of the source vector. The
behavior of this intrinsic is undefined if the return type is not wide enough
for the number of elements in the input vector.

The second argument is a constant flag that indicates whether the intrinsic
returns a valid result if the first argument is all zero.

The third argument is the vector mask and has the same number of elements as the
input vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.cttz.elts``' intrinsic counts the trailing (least
significant / lowest-numbered) zero elements in the first argument on each
enabled lane. If the first argument is all zero, the result is poison when the
second argument is true; otherwise, the result is the explicit vector length
(i.e. the fourth argument).

.. _int_vp_sadd_sat:

'``llvm.vp.sadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.sadd.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.sadd.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.sadd.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated signed saturating addition of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sadd.sat``' intrinsic performs sadd.sat (:ref:`sadd.sat <int_sadd_sat>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.


Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_uadd_sat:

'``llvm.vp.uadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.uadd.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.uadd.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.uadd.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated unsigned saturating addition of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.uadd.sat``' intrinsic performs uadd.sat (:ref:`uadd.sat <int_uadd_sat>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.


Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_ssub_sat:

'``llvm.vp.ssub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.ssub.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.ssub.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.ssub.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated signed saturating subtraction of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.ssub.sat``' intrinsic performs ssub.sat (:ref:`ssub.sat <int_ssub_sat>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.


Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_usub_sat:

'``llvm.vp.usub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.usub.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.usub.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.usub.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated unsigned saturating subtraction of two vectors of integers.


Arguments:
""""""""""

The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.usub.sat``' intrinsic performs usub.sat (:ref:`usub.sat <int_usub_sat>`)
of the first and second vector arguments on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.


Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_fshl:

'``llvm.vp.fshl.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.fshl.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.fshl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.fshl.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated fshl of three vectors of integers.


Arguments:
""""""""""

The first three arguments and the result have the same vector of integer type. The
fourth argument is the vector mask and has the same number of elements as the
result vector type. The fifth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fshl``' intrinsic performs fshl (:ref:`fshl <int_fshl>`) of the first, second, and third
vector arguments on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.


Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


'``llvm.vp.fshr.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.fshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.fshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.fshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated fshr of three vectors of integers.


Arguments:
""""""""""

The first three arguments and the result have the same vector of integer type. The
fourth argument is the vector mask and has the same number of elements as the
result vector type. The fifth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fshr``' intrinsic performs fshr (:ref:`fshr <int_fshr>`) of the first, second, and third
vector arguments on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.


Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x i32> @llvm.vp.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x i32> @llvm.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
   %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison

'``llvm.vp.is.fpclass.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f32(<vscale x 2 x float> <op>, i32 <test>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
      declare <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> <op>, i32 <test>, <2 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated version of :ref:`llvm.is.fpclass <llvm.is.fpclass>`.

Arguments:
""""""""""

The first argument is a floating-point vector; the result type is a vector of
booleans with the same number of elements as the first argument. The second
argument specifies which tests to perform (see :ref:`llvm.is.fpclass <llvm.is.fpclass>`).
The third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.is.fpclass``' intrinsic performs llvm.is.fpclass (:ref:`llvm.is.fpclass <llvm.is.fpclass>`).


Examples:
"""""""""

.. code-block:: llvm

   ;; Fixed-width vector
   %r = call <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> %x, i32 3, <2 x i1> %m, i32 %evl)
   ;; Scalable vector
   %t = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f16(<vscale x 2 x half> %y, i32 3, <vscale x 2 x i1> %n, i32 %evl)

.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask argument, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.

::

      declare <16 x float> @llvm.masked.load.v16f32.p0(ptr <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.load.v2f64.p0(ptr <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers
      declare <8 x ptr> @llvm.masked.load.v8p0.p0(ptr <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' argument.


Arguments:
""""""""""

The first argument is the base pointer for the load. The second argument is the alignment of the source location. It must be a power of two constant integer value. The third argument, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' argument are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask, except that the masked-off lanes are not accessed.
Only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
In particular, using this intrinsic prevents exceptions on memory accesses to masked-off lanes.
Masked-off lanes are also not considered accessed for the purpose of data races or ``noalias`` constraints.


::

       %res = call <16 x float> @llvm.masked.load.v16f32.p0(ptr %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; The result of the two following instructions is identical aside from potential memory access exception
       %loadlal = load <16 x float>, ptr %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru

.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.

::

       declare void @llvm.masked.store.v8i32.p0 (<8 x i32> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.store.v16f32.p0(<16 x float> <value>, ptr <ptr>, i32 <alignment>, <16 x i1> <mask>)
       ;; The data is a vector of pointers
       declare void @llvm.masked.store.v8p0.p0 (<8 x ptr> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first argument is the vector value to be written to memory.
The second argument is the base pointer for the store; it has the same underlying type as the value argument. The third argument is the alignment of the destination location. It must be a power of two constant integer value. The fourth argument, mask, is a vector of boolean values. The types of the mask and the value argument must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence, except that the masked-off lanes are not accessed.
Only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
In particular, using this intrinsic prevents exceptions on memory accesses to masked-off lanes.
Masked-off lanes are also not considered accessed for the purpose of data races or ``noalias`` constraints.

::

       call void @llvm.masked.store.v16f32.p0(<16 x float> %value, ptr %ptr, i32 4, <16 x i1> %mask)

       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
       %oldval = load <16 x float>, ptr %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
       store <16 x float> %res, ptr %ptr, align 4


Masked Vector Gather and Scatter Intrinsics
-------------------------------------------

LLVM provides intrinsics for vector gather and scatter operations.
They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask argument, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.

.. _int_mgather:

'``llvm.masked.gather.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.

::

      declare <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.gather.v2f64.v2p1(<2 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      declare <8 x ptr> @llvm.masked.gather.v8p0.v8p0(<8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)

Overview:
"""""""""

Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' argument.


Arguments:
""""""""""

The first argument is a vector of pointers which holds all memory addresses to read. The second argument is the alignment of the source addresses. It must be 0 or a power of two constant integer value.
The third argument, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' argument are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.


::

       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> poison)

       ;; The gather with all-true mask is equivalent to the following instruction sequence
       %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
       %ptr1 = extractelement <4 x ptr> %ptrs, i32 1
       %ptr2 = extractelement <4 x ptr> %ptrs, i32 2
       %ptr3 = extractelement <4 x ptr> %ptrs, i32 3

       %val0 = load double, ptr %ptr0, align 8
       %val1 = load double, ptr %ptr1, align 8
       %val2 = load double, ptr %ptr2, align 8
       %val3 = load double, ptr %ptr3, align 8

       %vec0    = insertelement <4 x double> poison, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3

.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.

::

       declare void @llvm.masked.scatter.v8i32.v8p0 (<8 x i32> <value>, <8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.scatter.v16f32.v16p1(<16 x float> <value>, <16 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
       declare void @llvm.masked.scatter.v4p0.v4p0 (<4 x ptr> <value>, <4 x ptr> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first argument is a vector value to be written to memory. The second argument is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value argument. The third argument is the alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth argument, mask, is a vector of boolean values. The types of the mask and the value argument must have the same number of vector elements.

Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation.
The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.

::

       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0(<8 x i32> %value, <8 x ptr> %ptrs, i32 4, <8 x i1> <true, true, .. true>)

       ;; It is equivalent to a list of scalar stores
       %val0 = extractelement <8 x i32> %value, i32 0
       %val1 = extractelement <8 x i32> %value, i32 1
       ..
       %val7 = extractelement <8 x i32> %value, i32 7
       %ptr0 = extractelement <8 x ptr> %ptrs, i32 0
       %ptr1 = extractelement <8 x ptr> %ptrs, i32 1
       ..
       %ptr7 = extractelement <8 x ptr> %ptrs, i32 7
       ;; Note: the order of the following stores is important when they overlap:
       store i32 %val0, ptr %ptr0, align 4
       store i32 %val1, ptr %ptr1, align 4
       ..
       store i32 %val7, ptr %ptr7, align 4


Masked Vector Expanding Load and Compressing Store Intrinsics
-------------------------------------------------------------

LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.

.. _int_expandload:

'``llvm.masked.expandload.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.

::

      declare <16 x float> @llvm.masked.expandload.v16f32 (ptr <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x i64> @llvm.masked.expandload.v2i64 (ptr <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)

Overview:
"""""""""

Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' argument.


Arguments:
""""""""""

The first argument is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second argument, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' argument have the same vector type.

The :ref:`align <attr_align>` parameter attribute can be provided for the first
argument. The pointer alignment defaults to 1.
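
As a rough illustration of the arguments described above, the following sketch
calls the intrinsic with the ``align`` attribute on the base pointer and with
the constant mask '10010001' used in the overview. The function name
``@expand_example`` and the value names ``%src`` and ``%fill`` are placeholders
invented for this example and carry no special meaning.

.. code-block:: llvm

   declare <8 x double> @llvm.masked.expandload.v8f64(ptr, <8 x i1>, <8 x double>)

   ; Hypothetical wrapper, for illustration only.
   define <8 x double> @expand_example(ptr %src, <8 x double> %fill) {
     ; Mask '10010001': lanes 0, 3 and 7 are enabled, so 3 elements are read.
     %v = call <8 x double> @llvm.masked.expandload.v8f64(
              ptr align 8 %src,
              <8 x i1> <i1 true, i1 false, i1 false, i1 true,
                        i1 false, i1 false, i1 false, i1 true>,
              <8 x double> %fill)
     ; lane 0 <- src[0], lane 3 <- src[1], lane 7 <- src[2]
     ; lanes 1, 2, 4, 5 and 6 <- the corresponding lanes of %fill
     ret <8 x double> %v
   }

If the ``align 8`` attribute were omitted, the pointer alignment would default
to 1, as stated above.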

Semantics:
""""""""""

The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies like in the following example:

.. code-block:: c

    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }


.. code-block:: llvm

    ; Load several elements from array B and expand them in a vector.
    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(ptr %Bptr, <8 x i1> %Mask, <8 x double> poison)
    ; Store the result in A
    call void @llvm.masked.store.v8f64.p0(<8 x double> %Tmp, ptr %Aptr, i32 8, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.

.. _int_compressstore:

'``llvm.masked.compressstore.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses.
A mask defines which elements to collect from the vector. 26139 26140:: 26141 26142 declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, ptr <ptr>, <8 x i1> <mask>) 26143 declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, ptr <ptr>, <16 x i1> <mask>) 26144 26145Overview: 26146""""""""" 26147 26148Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '`ptr`', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask. 26149 26150Arguments: 26151"""""""""" 26152 26153The first argument is the input vector, from which elements are collected and written to memory. The second argument is the base pointer for the store, it has the same underlying type as the element of the input vector argument. The third argument is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements. 26154 26155The :ref:`align <attr_align>` parameter attribute can be provided for the second 26156argument. The pointer alignment defaults to 1. 26157 26158Semantics: 26159"""""""""" 26160 26161The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows to collect elements from possibly non-adjacent lanes of a vector and store them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependencies like in the following example: 26162 26163.. code-block:: c 26164 26165 // In this loop we load elements from A and store them consecutively in B 26166 double *A, B; int *C; 26167 for (int i = 0; i < size; ++i) { 26168 if (C[i] != 0) 26169 B[j++] = A[i] 26170 } 26171 26172 26173.. code-block:: llvm 26174 26175 ; Load elements from A. 
26176 %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0(ptr %Aptr, i32 8, <8 x i1> %Mask, <8 x double> poison) 26177 ; Store all selected elements consecutively in array B 26178 call <void> @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, ptr %Bptr, <8 x i1> %Mask) 26179 26180 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask. 26181 %MaskI = bitcast <8 x i1> %Mask to i8 26182 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI) 26183 %MaskI64 = zext i8 %MaskIPopcnt to i64 26184 %BNextInd = add i64 %BInd, %MaskI64 26185 26186 26187Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations. 26188 26189 26190Memory Use Markers 26191------------------ 26192 26193This class of intrinsics provides information about the 26194:ref:`lifetime of memory objects <objectlifetime>` and ranges where variables 26195are immutable. 26196 26197.. _int_lifestart: 26198 26199'``llvm.lifetime.start``' Intrinsic 26200^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 26201 26202Syntax: 26203""""""" 26204 26205:: 26206 26207 declare void @llvm.lifetime.start(i64 <size>, ptr captures(none) <ptr>) 26208 26209Overview: 26210""""""""" 26211 26212The '``llvm.lifetime.start``' intrinsic specifies the start of a memory 26213object's lifetime. 26214 26215Arguments: 26216"""""""""" 26217 26218The first argument is a constant integer representing the size of the 26219object, or -1 if it is variable sized. The second argument is a pointer 26220to the object. 26221 26222Semantics: 26223"""""""""" 26224 26225If ``ptr`` is a stack-allocated object and it points to the first byte of 26226the object, the object is initially marked as dead. 26227``ptr`` is conservatively considered as a non-stack-allocated object if 26228the stack coloring algorithm that is used in the optimization pipeline cannot 26229conclude that ``ptr`` is a stack-allocated object. 

After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is marked
as alive and has an uninitialized value.
The stack object is marked as dead when either
:ref:`llvm.lifetime.end <int_lifeend>` is executed on the alloca or the
function returns.

After :ref:`llvm.lifetime.end <int_lifeend>` is called,
'``llvm.lifetime.start``' on the stack object can be called again.
The second '``llvm.lifetime.start``' call marks the object as alive, but it
does not change the address of the object.

If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.


.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, ptr captures(none) <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

If ``ptr`` is a stack-allocated object and it points to the first byte of the
object, the object is dead.
``ptr`` is conservatively considered as a non-stack-allocated object if
the stack coloring algorithm that is used in the optimization pipeline cannot
conclude that ``ptr`` is a stack-allocated object.

Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.

If ``ptr`` is a non-stack-allocated object or it does not point to the first
byte of the object, the intrinsic is equivalent to simply filling all bytes of
the object with ``poison``.


'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare ptr @llvm.invariant.start.p0(i64 <size>, ptr captures(none) <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare void @llvm.invariant.end.p0(ptr <start>, i64 <size>, ptr captures(none) <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the return value of the matching
``llvm.invariant.start`` call. The second argument is a constant integer
representing the size of the object, or -1 if it is variable sized, and the
third argument is a pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.

'``llvm.launder.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare ptr @llvm.launder.invariant.group.p0(ptr <ptr>)

Overview:
"""""""""

The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new
pointer value that carries fresh invariant group information. It is an
experimental intrinsic, which means that its semantics might change in the
future.


Arguments:
""""""""""

The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
It does not read any accessible memory and the execution can be speculated.

'``llvm.strip.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare ptr @llvm.strip.invariant.group.p0(ptr <ptr>)

Overview:
"""""""""

The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
value that does not carry the invariant information. It is an experimental
intrinsic, which means that its semantics might change in the future.


Arguments:
""""""""""

The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which has no associated
``invariant.group`` metadata.
It does not read any memory and can be speculated.



.. _constrainedfp:

Constrained Floating-Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating-point
operations when a specific rounding mode or floating-point exception behavior is
required. By default, LLVM optimization passes assume that the rounding mode is
round-to-nearest and that floating-point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can create miscompiles if constrained and normal
operations are mixed. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.

Each of these intrinsics corresponds to a normal floating-point operation. The
data arguments and the return value are the same as the corresponding FP
operation.

The rounding mode argument is a metadata string specifying what
assumptions, if any, the optimizer can make when transforming constant
values. Some constrained FP intrinsics omit this argument. If required
by the intrinsic, this argument must be one of the following strings:

::

      "round.dynamic"
      "round.tonearest"
      "round.downward"
      "round.upward"
      "round.towardzero"
      "round.tonearestaway"

If this argument is "round.dynamic", optimization passes must assume that the
rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes. If the argument is any of these values,
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.

For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
'x-0' should evaluate to '-0' when rounding downward. However, this
transformation is legal for all other rounding modes.

For values other than "round.dynamic", optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed. Using a specific
non-dynamic rounding mode which does not match the actual rounding mode at
runtime results in undefined behavior.

The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:

::

      "fpexcept.ignore"
      "fpexcept.maytrap"
      "fpexcept.strict"

If this argument is "fpexcept.ignore", optimization passes may assume that the
exception status flags will not be read and that floating-point exceptions will
be masked. This allows transformations to be performed that may change the
exception semantics of the original code. For example, FP operations may be
speculatively executed in this case whereas they must not be for either of the
other possible values of this argument.

If the exception behavior argument is "fpexcept.maytrap", optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code. For example, exceptions may be potentially hidden by constant
folding.

If the exception behavior argument is "fpexcept.strict", all transformations must
strictly preserve the floating-point exception semantics of the original code.
Any FP exception that would have been raised by the original code must be raised
by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code. This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.

The number and order of floating-point exceptions is NOT guaranteed. For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.

Proper :ref:`function attributes <fnattrs>` usage is required for the
constrained intrinsics to function correctly.

All function *calls* done in a function that uses constrained floating
point intrinsics must have the ``strictfp`` attribute either on the
calling instruction or on the declaration or definition of the function
being called.

All function *definitions* that use constrained floating point intrinsics
must have the ``strictfp`` attribute.

'``llvm.experimental.constrained.fadd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
two arguments.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fadd``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point sum of the two value arguments and has
the same type as the arguments.


'``llvm.experimental.constrained.fsub``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
of its two arguments.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fsub``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point difference of the two value arguments
and has the same type as the arguments.


'``llvm.experimental.constrained.fmul``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
its two arguments.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fmul``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point product of the two value arguments and
has the same type as the arguments.


'``llvm.experimental.constrained.fdiv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
its two arguments.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fdiv``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two value arguments and
has the same type as the arguments.


'``llvm.experimental.constrained.frem``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
from the division of its two arguments.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.frem``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above. The rounding mode argument has no effect, since
the result of frem is never rounded, but the argument is included for
consistency with the other constrained floating-point intrinsics.

Semantics:
""""""""""

The value produced is the floating-point remainder from the division of the two
value arguments and has the same type as the arguments. The remainder has the
same sign as the dividend.
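
The sign convention can be illustrated with a small C sketch based on a truncated quotient. The helper ``frem_model`` is hypothetical and deliberately simplified: it assumes finite inputs, a non-zero divisor, a quotient that fits in ``long long``, and values small enough that the multiplication and subtraction are exact. The C library function ``fmod`` follows the same sign convention.

.. code-block:: c

    /* Hypothetical, simplified model of a truncating remainder (not an
       LLVM API). The quotient is truncated toward zero, so the result
       takes the sign of the dividend x. */
    double frem_model(double x, double y) {
      long long q = (long long)(x / y); /* C casts truncate toward zero */
      return x - (double)q * y;
    }

For example, dividing -7.5 by 2.0 yields a remainder of -1.5, while dividing 7.5 by -2.0 yields 1.5.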

'``llvm.experimental.constrained.fma``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
fused-multiply-add operation on its arguments.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fma``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. All arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The result produced is the product of the first two arguments added to the third
argument computed with infinite precision, and then rounded to the target
precision.

'``llvm.experimental.constrained.fptoui``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptoui(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptoui``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is an unsigned integer converted from the floating
point argument. The value is truncated, so it is rounded towards zero.

'``llvm.experimental.constrained.fptosi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptosi(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
:ref:`floating-point <t_floating>` ``value`` to its signed integer equivalent
of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptosi``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a signed integer converted from the floating
point argument. The value is truncated, so it is rounded towards zero.

'``llvm.experimental.constrained.uitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.uitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point value of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.uitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer argument.

'``llvm.experimental.constrained.sitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.sitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.sitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer argument.

'``llvm.experimental.constrained.fptrunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptrunc(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
to type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptrunc``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values. This argument must be larger in size
than the result.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The result produced is a floating point value truncated to be smaller in size
than the argument.

'``llvm.experimental.constrained.fpext``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fpext(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fpext``' intrinsic extends a
floating-point ``value`` to a larger floating-point value.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fpext``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values. This argument must be smaller in size
than the result.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a floating point value extended to be larger in size
than the argument. All restrictions that apply to the fpext instruction also
apply to this intrinsic.
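
The absence of a rounding mode argument on '``llvm.experimental.constrained.fpext``' can be illustrated with ordinary C conversions. The sketch below assumes that ``float`` and ``double`` are the IEEE-754 binary32 and binary64 formats, which holds on most but not all platforms. Under that assumption widening is always exact, whereas narrowing may have to round. The helper names are hypothetical, and the code models only the arithmetic, not the intrinsics themselves.

.. code-block:: c

    /* Returns 1 if widening v to double and narrowing it back reproduces v.
       This holds for every non-NaN float, because each binary32 value is
       exactly representable in binary64. */
    int fpext_round_trips(float v) {
      double wide = (double)v;
      return (float)wide == v;
    }

    /* Returns 1 if narrowing v to float loses no information. */
    int fptrunc_is_exact(double v) {
      float narrow = (float)v;
      return (double)narrow == v;
    }

For instance, 0.5 narrows exactly, but the ``double`` closest to 0.1 has no exact ``float`` representation, so narrowing it must round and the rounding mode matters.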
26960 26961'``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics 26962^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 26963 26964Syntax: 26965""""""" 26966 26967:: 26968 26969 declare <ty2> 26970 @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, 26971 metadata <condition code>, 26972 metadata <exception behavior>) 26973 declare <ty2> 26974 @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, 26975 metadata <condition code>, 26976 metadata <exception behavior>) 26977 26978Overview: 26979""""""""" 26980 26981The '``llvm.experimental.constrained.fcmp``' and 26982'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean 26983value or vector of boolean values based on comparison of its arguments. 26984 26985If the arguments are floating-point scalars, then the result type is a 26986boolean (:ref:`i1 <t_integer>`). 26987 26988If the arguments are floating-point vectors, then the result type is a 26989vector of boolean with the same number of elements as the arguments being 26990compared. 26991 26992The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet 26993comparison operation while the '``llvm.experimental.constrained.fcmps``' 26994intrinsic performs a signaling comparison operation. 26995 26996Arguments: 26997"""""""""" 26998 26999The first two arguments to the '``llvm.experimental.constrained.fcmp``' 27000and '``llvm.experimental.constrained.fcmps``' intrinsics must be 27001:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` 27002of floating-point values. Both arguments must have identical types. 27003 27004The third argument is the condition code indicating the kind of comparison 27005to perform. It must be a metadata string with one of the following values: 27006 27007.. 
_fcmp_md_cc: 27008 27009- "``oeq``": ordered and equal 27010- "``ogt``": ordered and greater than 27011- "``oge``": ordered and greater than or equal 27012- "``olt``": ordered and less than 27013- "``ole``": ordered and less than or equal 27014- "``one``": ordered and not equal 27015- "``ord``": ordered (no nans) 27016- "``ueq``": unordered or equal 27017- "``ugt``": unordered or greater than 27018- "``uge``": unordered or greater than or equal 27019- "``ult``": unordered or less than 27020- "``ule``": unordered or less than or equal 27021- "``une``": unordered or not equal 27022- "``uno``": unordered (either nans) 27023 27024*Ordered* means that neither argument is a NAN while *unordered* means 27025that either argument may be a NAN. 27026 27027The fourth argument specifies the exception behavior as described above. 27028 27029Semantics: 27030"""""""""" 27031 27032``op1`` and ``op2`` are compared according to the condition code given 27033as the third argument. If the arguments are vectors, then the 27034vectors are compared element by element. Each comparison performed 27035always yields an :ref:`i1 <t_integer>` result, as follows: 27036 27037.. _fcmp_md_cc_sem: 27038 27039- "``oeq``": yields ``true`` if both arguments are not a NAN and ``op1`` 27040 is equal to ``op2``. 27041- "``ogt``": yields ``true`` if both arguments are not a NAN and ``op1`` 27042 is greater than ``op2``. 27043- "``oge``": yields ``true`` if both arguments are not a NAN and ``op1`` 27044 is greater than or equal to ``op2``. 27045- "``olt``": yields ``true`` if both arguments are not a NAN and ``op1`` 27046 is less than ``op2``. 27047- "``ole``": yields ``true`` if both arguments are not a NAN and ``op1`` 27048 is less than or equal to ``op2``. 27049- "``one``": yields ``true`` if both arguments are not a NAN and ``op1`` 27050 is not equal to ``op2``. 27051- "``ord``": yields ``true`` if both arguments are not a NAN. 
- "``ueq``": yields ``true`` if either argument is a NAN or ``op1`` is
  equal to ``op2``.
- "``ugt``": yields ``true`` if either argument is a NAN or ``op1`` is
  greater than ``op2``.
- "``uge``": yields ``true`` if either argument is a NAN or ``op1`` is
  greater than or equal to ``op2``.
- "``ult``": yields ``true`` if either argument is a NAN or ``op1`` is
  less than ``op2``.
- "``ule``": yields ``true`` if either argument is a NAN or ``op1`` is
  less than or equal to ``op2``.
- "``une``": yields ``true`` if either argument is a NAN or ``op1`` is
  not equal to ``op2``.
- "``uno``": yields ``true`` if either argument is a NAN.

The quiet comparison operation performed by
'``llvm.experimental.constrained.fcmp``' will only raise an exception
if either argument is a SNAN. The signaling comparison operation
performed by '``llvm.experimental.constrained.fcmps``' will raise an
exception if either argument is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g., the exception might only
set a flag); therefore, the distinction between ordered and unordered
comparisons is also relevant for the
'``llvm.experimental.constrained.fcmps``' intrinsic.
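
As an illustrative sketch (not part of the reference text), the following
functions contrast the quiet and signaling forms on ``double`` operands. The
function names ``@lt_quiet`` and ``@lt_signaling`` are hypothetical. The
``.f64`` suffix assumes the usual overload name mangling, and
``!"fpexcept.strict"`` assumes the exception behavior values described earlier
for the constrained intrinsics.

.. code-block:: llvm

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double, metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)

      ; Quiet "ordered and less than": raises an exception only for an SNAN operand.
      define i1 @lt_quiet(double %a, double %b) strictfp {
        %r = call i1 @llvm.experimental.constrained.fcmp.f64(
                         double %a, double %b,
                         metadata !"olt",
                         metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }

      ; Signaling "ordered and less than": raises an exception for any NAN operand.
      define i1 @lt_signaling(double %a, double %b) strictfp {
        %r = call i1 @llvm.experimental.constrained.fcmps.f64(
                         double %a, double %b,
                         metadata !"olt",
                         metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }

With a QNAN operand both functions yield ``false``, because ``olt`` is an
ordered predicate, but only ``@lt_signaling`` raises an exception.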

'``llvm.experimental.constrained.fmuladd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
                                             <type> <op3>,
                                             metadata <rounding mode>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
multiply-add expressions that can be fused if the code generator determines
that (a) the target instruction set has support for a fused operation,
and (b) that the fused operation is more efficient than the equivalent,
separate pair of mul and add instructions.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
intrinsic must be floating-point or vector of floating-point values.
All three arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.experimental.constrained.fmuladd.f32(float %a, float %b, float %c,
                                                                 metadata <rounding mode>,
                                                                 metadata <exception behavior>)

is equivalent to the expression:

::

      %0 = call float @llvm.experimental.constrained.fmul.f32(float %a, float %b,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)
      %1 = call float @llvm.experimental.constrained.fadd.f32(float %0, float %c,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)

except that it is unspecified whether rounding will be performed between the
multiplication and addition steps. Fusion is not guaranteed, even if the target
platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets
``errno``.

Constrained libm-equivalent Intrinsics
--------------------------------------

In addition to the basic floating-point operations for which constrained
intrinsics are described above, there are constrained versions of various
operations which provide equivalent behavior to a corresponding libm function.
These intrinsics allow the precise behavior of these operations with respect to
rounding mode and exception behavior to be controlled.

As with the basic constrained floating-point intrinsics, the rounding mode
and exception behavior arguments only control the behavior of the optimizer.
They do not change the runtime floating-point environment.


'``llvm.experimental.constrained.sqrt``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sqrt(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
of the specified value, returning the same value as the libm '``sqrt``'
functions would, but without setting ``errno``.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the nonnegative square root of the specified value.
If the value is less than negative zero, a floating-point exception occurs
and the return value is architecture-specific.


'``llvm.experimental.constrained.pow``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.pow``' intrinsic returns the first argument
raised to the (positive or negative) power specified by the second argument.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power,
returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.


'``llvm.experimental.constrained.powi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.powi``' intrinsic returns the first argument
raised to the (positive or negative) power specified by the second argument. The
order of evaluation of multiplications is not defined. When a vector of
floating-point type is used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type. The second argument is a 32-bit signed integer specifying the power to
which the first argument should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.


'``llvm.experimental.constrained.ldexp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type0>
      @llvm.experimental.constrained.ldexp(<type0> <op1>, <type1> <op2>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.ldexp``' intrinsic performs the ``ldexp``
function.

Arguments:
""""""""""

The first argument and the return value are :ref:`floating-point
<t_floating>` or :ref:`vector <t_vector>` of floating-point values of
the same type. The second argument is an integer with the same number
of elements.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function multiplies the first argument by 2 raised to the second
argument's power. If the first argument is NaN or infinite, the same
value is returned. If the result underflows, a zero with the same sign
is returned. If the result overflows, the result is an infinity with
the same sign.
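
A minimal sketch of a scalar ``ldexp`` call follows. The function name
``@scale_by_8`` is hypothetical, the ``.f32.i32`` suffix assumes the usual
mangling over both overloaded types, and the metadata strings assume the
rounding mode and exception behavior values described earlier for the
constrained intrinsics.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.ldexp.f32.i32(float, i32, metadata, metadata)

      ; Computes %x * 2^3 without letting the optimizer assume a rounding mode
      ; or ignore floating-point exceptions.
      define float @scale_by_8(float %x) strictfp {
        %r = call float @llvm.experimental.constrained.ldexp.f32.i32(
                            float %x, i32 3,
                            metadata !"round.dynamic",
                            metadata !"fpexcept.strict") strictfp
        ret float %r
      }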


'``llvm.experimental.constrained.sin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sin(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first argument.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the sine of the specified argument, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.cos``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.cos(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
first argument.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the cosine of the specified argument, returning the
same values as the libm ``cos`` functions would, and handles error
conditions in the same way.
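
The single-argument libm-equivalent intrinsics all share the same call shape.
As a sketch, the hypothetical function ``@sin_and_cos`` below evaluates both
``sin`` and ``cos`` of one ``double``. The ``.f64`` suffix and the metadata
strings are assumptions based on the usual overload mangling and on the values
described earlier for the constrained intrinsics.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)
      declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)

      define { double, double } @sin_and_cos(double %x) strictfp {
        %s = call double @llvm.experimental.constrained.sin.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict") strictfp
        %c = call double @llvm.experimental.constrained.cos.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict") strictfp
        ; Pack both results into a two-element struct.
        %t0 = insertvalue { double, double } poison, double %s, 0
        %t1 = insertvalue { double, double } %t0, double %c, 1
        ret { double, double } %t1
      }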


'``llvm.experimental.constrained.tan``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.tan(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.tan``' intrinsic returns the tangent of the
first argument.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the tangent of the specified argument, returning the
same values as the libm ``tan`` functions would, and handles error
conditions in the same way.

'``llvm.experimental.constrained.asin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.asin(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.asin``' intrinsic returns the arcsine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the arcsine of the specified operand, returning the
same values as the libm ``asin`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.acos``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.acos(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.acos``' intrinsic returns the arccosine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the arccosine of the specified operand, returning the
same values as the libm ``acos`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.atan``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.atan(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.atan``' intrinsic returns the arctangent of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the arctangent of the specified operand, returning the
same values as the libm ``atan`` functions would, and handles error
conditions in the same way.

'``llvm.experimental.constrained.atan2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.atan2(<type> <op1>,
                                           <type> <op2>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.atan2``' intrinsic returns the arctangent
of ``<op1>`` divided by ``<op2>`` accounting for the quadrant.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers of the
same type.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the quadrant-specific arctangent using the specified
operands, returning the same values as the libm ``atan2`` functions would, and
handles error conditions in the same way.

'``llvm.experimental.constrained.sinh``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sinh(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sinh``' intrinsic returns the hyperbolic sine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the hyperbolic sine of the specified operand, returning the
same values as the libm ``sinh`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.cosh``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.cosh(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.cosh``' intrinsic returns the hyperbolic cosine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the hyperbolic cosine of the specified operand, returning the
same values as the libm ``cosh`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.tanh``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.tanh(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.tanh``' intrinsic returns the hyperbolic tangent of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the hyperbolic tangent of the specified operand, returning the
same values as the libm ``tanh`` functions would, and handles error
conditions in the same way.

'``llvm.experimental.constrained.exp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
exponential of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.exp2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
exponential of the specified value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.


Semantics:
""""""""""

This function returns the same values as the libm ``log`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log10``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log10(<type> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log10`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.rint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.rint(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.rint``' intrinsic returns the first
argument rounded to the nearest integer. It may raise an inexact floating-point
exception if the argument is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.lrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lrint(<fptype> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
argument rounded to the nearest integer. An inexact floating-point exception
will be raised if the argument is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument. The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.lrint`` intrinsic.


'``llvm.experimental.constrained.llrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.llrint(<fptype> <op1>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
argument rounded to the nearest integer. An inexact floating-point exception
will be raised if the argument is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``llrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument. The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.llrint`` intrinsic.
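
The following sketch converts a ``double`` to a 64-bit integer with
``lrint``. The function name ``@to_i64`` is hypothetical. The ``.i64.f64``
suffix assumes the usual mangling (return type first, then argument type),
and the metadata strings assume the values described earlier for the
constrained intrinsics.

.. code-block:: llvm

      declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

      define i64 @to_i64(double %x) strictfp {
        ; "round.dynamic" tells the optimizer that the rounding mode is not known
        ; at compile time; the runtime environment decides how %x is rounded.
        %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                          double %x,
                          metadata !"round.dynamic",
                          metadata !"fpexcept.strict") strictfp
        ret i64 %r
      }

An ``llrint`` call has the same shape, with
``@llvm.experimental.constrained.llrint.i64.f64`` as the callee.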


'``llvm.experimental.constrained.nearbyint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.nearbyint(<type> <op1>,
                                               metadata <rounding mode>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
argument rounded to the nearest integer. It will not raise an inexact
floating-point exception if the argument is not an integer.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.maxnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for maxNum.


'``llvm.experimental.constrained.minnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for minNum.


'``llvm.experimental.constrained.maximum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2019.
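
As a sketch of the difference between the two families, the hypothetical
functions below apply ``maxnum`` and ``maximum`` to the same ``float``
operands. The ``.f32`` suffix and the ``!"fpexcept.strict"`` string are
assumptions based on the usual overload mangling and on the exception behavior
values described earlier. Note that these intrinsics take no rounding mode
argument.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)
      declare float @llvm.experimental.constrained.maximum.f32(float, float, metadata)

      ; IEEE-754 maxNum semantics.
      define float @max_num(float %a, float %b) strictfp {
        %r = call float @llvm.experimental.constrained.maxnum.f32(
                            float %a, float %b,
                            metadata !"fpexcept.strict") strictfp
        ret float %r
      }

      ; NaN-propagating maximum; -0.0 is treated as less than +0.0.
      define float @max_imum(float %a, float %b) strictfp {
        %r = call float @llvm.experimental.constrained.maximum.f32(
                            float %a, float %b,
                            metadata !"fpexcept.strict") strictfp
        ret float %r
      }

When exactly one operand is a quiet NaN, IEEE-754 maxNum returns the other
operand, so ``@max_num`` would return the number while ``@max_imum`` would
return a NaN.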


'``llvm.experimental.constrained.minimum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2019.


'``llvm.experimental.constrained.ceil``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.ceil(<type> <op1>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
first argument.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.floor``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.floor(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
first argument.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.round``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.round(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.round``' intrinsic returns the first
argument rounded to the nearest integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``round`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.roundeven``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.roundeven(<type> <op1>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
argument rounded to the nearest integer in floating-point format, rounding
halfway cases to even (that is, to the nearest value that is an even integer),
regardless of the current rounding direction.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven`` and can
signal the invalid operation exception for a signaling NaN argument.


'``llvm.experimental.constrained.lround``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lround(<fptype> <op1>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lround``' intrinsic returns the first
argument rounded to the nearest integer with ties away from zero. It will
raise an inexact floating-point exception if the argument is not an integer.
An invalid exception is raised if the result is too large to fit into a
supported integer type, and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number.
The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lround`` intrinsic and the ``lround``
libm functions.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``lround`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.llround``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.llround(<fptype> <op1>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.llround``' intrinsic returns the first
argument rounded to the nearest integer with ties away from zero. It will
raise an inexact floating-point exception if the argument is not an integer.
An invalid exception is raised if the result is too large to fit into a
supported integer type, and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llround`` intrinsic and the ``llround``
libm functions.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``llround`` functions
would and handles error conditions in the same way.
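
The ``<inttype>`` and ``<fptype>`` placeholders are independent, so a concrete
declaration names both. The following is a minimal sketch, assuming an ``i64``
result, a ``double`` operand, and strict exception behavior; the function name
``@nearest_i64`` is hypothetical.

.. code-block:: llvm

  declare i64 @llvm.experimental.constrained.lround.i64.f64(double, metadata)

  ; Hypothetical helper: rounds %x to the nearest integer, ties away from zero.
  ; If the result does not fit in i64, the returned value is undefined.
  define i64 @nearest_i64(double %x) strictfp {
  entry:
    %r = call i64 @llvm.experimental.constrained.lround.i64.f64(
                      double %x,
                      metadata !"fpexcept.strict") strictfp
    ret i64 %r
  }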


'``llvm.experimental.constrained.trunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.trunc(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
argument rounded to the nearest integer not larger in magnitude than the
argument.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would and handles error conditions in the same way.

.. _int_experimental_noalias_scope_decl:

'``llvm.experimental.noalias.scope.decl``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)

Overview:
"""""""""

The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.


Arguments:
""""""""""

The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
metadata references. The format is identical to that required for ``noalias``
metadata. This list must have exactly one element.

Semantics:
""""""""""

The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared.
When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.

For example, when the intrinsic is used inside a loop body, and that loop is
unrolled, the associated noalias scope must also be duplicated. Otherwise, the
noalias property it signifies would spill across loop iterations, whereas it
was only valid within a single iteration.

.. code-block:: llvm

  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
  ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
  ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
  declare i1 @cond()   ; external loop condition used below

  define void @decl_in_loop(ptr %a.base, ptr %b.base) {
  entry:
    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
    br label %loop

  loop:
    %a = phi ptr [ %a.base, %entry ], [ %a.inc, %loop ]
    %b = phi ptr [ %b.base, %entry ], [ %b.inc, %loop ]
    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
    %val = load i8, ptr %a, !alias.scope !2
    store i8 %val, ptr %b, !noalias !2
    %a.inc = getelementptr inbounds i8, ptr %a, i64 1
    %b.inc = getelementptr inbounds i8, ptr %b, i64 1
    %cond = call i1 @cond()
    br i1 %cond, label %loop, label %exit

  exit:
    ret void
  }

  !0 = !{!0} ; domain
  !1 = !{!1, !0} ; scope
  !2 = !{!1} ; scope list

Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
are possible, but one should never dominate another. Violations are pointed out
by the verifier as they indicate a problem in either a transformation pass or
the input.


Floating Point Environment Manipulation intrinsics
--------------------------------------------------

These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.

.. _int_get_rounding:

'``llvm.get.rounding``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.rounding()

Overview:
"""""""""

The '``llvm.get.rounding``' intrinsic reads the current rounding mode.

Semantics:
""""""""""

The '``llvm.get.rounding``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of
``FLT_ROUNDS``, specified by the C standard:

::

    0  - toward zero
    1  - to nearest, ties to even
    2  - toward positive infinity
    3  - toward negative infinity
    4  - to nearest, ties away from zero

Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.

.. _int_set_rounding:

'``llvm.set.rounding``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.set.rounding(i32 <val>)

Overview:
"""""""""

The '``llvm.set.rounding``' intrinsic sets the current rounding mode.

Arguments:
""""""""""

The argument is the required rounding mode. The encoding of the rounding mode
is the same as that used by '``llvm.get.rounding``'.

Semantics:
""""""""""

The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
It is
similar to the C library function 'fesetround'; however, this intrinsic does
not return any value and uses a platform-independent representation of IEEE
rounding modes.

.. _int_get_fpenv:

'``llvm.get.fpenv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <integer_type> @llvm.get.fpenv()

Overview:
"""""""""

The '``llvm.get.fpenv``' intrinsic returns bits of the current floating-point
environment. The return value type is platform-specific.

Semantics:
""""""""""

The '``llvm.get.fpenv``' intrinsic reads the current floating-point environment
and returns it as an integer value.

.. _int_set_fpenv:

'``llvm.set.fpenv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.set.fpenv(<integer_type> <val>)

Overview:
"""""""""

The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment.

Arguments:
""""""""""

The argument is an integer representing the new floating-point environment. The
integer type is platform-specific.

Semantics:
""""""""""

The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment
to the state specified by the argument. The state may be previously obtained by a
call to '``llvm.get.fpenv``' or synthesized in a platform-dependent way.


'``llvm.reset.fpenv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.reset.fpenv()

Overview:
"""""""""

The '``llvm.reset.fpenv``' intrinsic sets the default floating-point environment.

Semantics:
""""""""""

The '``llvm.reset.fpenv``' intrinsic sets the current floating-point environment
to the default state. It is similar to the call 'fesetenv(FE_DFL_ENV)', except
that it does not return any value.

.. _int_get_fpmode:

'``llvm.get.fpmode``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

The '``llvm.get.fpmode``' intrinsic returns bits of the current floating-point
control modes. The return value type is platform-specific.

::

      declare <integer_type> @llvm.get.fpmode()

Overview:
"""""""""

The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
control modes and returns them as an integer value.

Arguments:
""""""""""

None.

Semantics:
""""""""""

The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
control modes, such as rounding direction, precision, treatment of denormals and
so on. It is similar to the C library function 'fegetmode'; however, this
function does not store the set of control modes into memory but returns it as
an integer value. Interpretation of the bits in this value is target-dependent.

'``llvm.set.fpmode``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

The '``llvm.set.fpmode``' intrinsic sets the current floating-point control modes.

::

      declare void @llvm.set.fpmode(<integer_type> <val>)

Overview:
"""""""""

The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
control modes.

Arguments:
""""""""""

The argument is a set of floating-point control modes, represented as an integer
value in a target-dependent way.

Semantics:
""""""""""

The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
control modes to the state specified by the argument, which must be obtained by
a call to '``llvm.get.fpmode``' or constructed in a target-specific way. It is
similar to the C library function 'fesetmode'; however, this function does not
read the set of control modes from memory but takes it as an integer value.

'``llvm.reset.fpmode``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.reset.fpmode()

Overview:
"""""""""

The '``llvm.reset.fpmode``' intrinsic sets the default dynamic floating-point
control modes.

Arguments:
""""""""""

None.

Semantics:
""""""""""

The '``llvm.reset.fpmode``' intrinsic sets the current dynamic floating-point
control modes to the default state. It is similar to the C library function call
'fesetmode(FE_DFL_MODE)'; however, this function does not return any value.


Floating-Point Test Intrinsics
------------------------------

These functions get properties of floating-point values.


.. _llvm.is.fpclass:

'``llvm.is.fpclass``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.is.fpclass(<fptype> <op>, i32 <test>)
      declare <N x i1> @llvm.is.fpclass(<vector-fptype> <op>, i32 <test>)

Overview:
"""""""""

The '``llvm.is.fpclass``' intrinsic returns a boolean value or vector of boolean
values depending on whether the first argument satisfies the test specified by
the second argument.

If the first argument is a floating-point scalar, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the first argument is a floating-point vector, then the result type is a
vector of booleans with the same number of elements as the first argument.

Arguments:
""""""""""

The first argument to the '``llvm.is.fpclass``' intrinsic must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values.

The second argument specifies which tests to perform. It must be a compile-time
integer constant in which each bit selects a floating-point class:

+-------+----------------------+
| Bit # | floating-point class |
+=======+======================+
| 0     | Signaling NaN        |
+-------+----------------------+
| 1     | Quiet NaN            |
+-------+----------------------+
| 2     | Negative infinity    |
+-------+----------------------+
| 3     | Negative normal      |
+-------+----------------------+
| 4     | Negative subnormal   |
+-------+----------------------+
| 5     | Negative zero        |
+-------+----------------------+
| 6     | Positive zero        |
+-------+----------------------+
| 7     | Positive subnormal   |
+-------+----------------------+
| 8     | Positive normal      |
+-------+----------------------+
| 9     | Positive infinity    |
+-------+----------------------+

Semantics:
""""""""""

The function checks if ``op`` belongs to any of the floating-point classes
specified by ``test``. If ``op`` is a vector, then the check is made element by
element. Each check yields an :ref:`i1 <t_integer>` result, which is ``true``
if the element value satisfies the specified test. The argument ``test`` is a
bit mask where each bit specifies a floating-point class to test. For example,
the value 0x108 tests for a normal value: bits 3 and 8 are set, which
means that the function returns ``true`` if ``op`` is a positive or negative
normal value.
The function never raises floating-point exceptions. The
function does not canonicalize its input value and does not depend
on the floating-point environment. If the floating-point environment
has a zeroing treatment of subnormal input values (such as indicated
by the ``"denormal-fp-math"`` attribute), a subnormal value will be
observed (will not be implicitly treated as zero).


General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(ptr <val>, ptr <str>, ptr <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE* you must specify an address space for
the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare ptr @llvm.ptr.annotation.p0(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
      declare ptr @llvm.ptr.annotation.p1(ptr addrspace(1) <val>, ptr <str>, ptr <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; transformations preserve
annotations on a best-effort basis but are allowed to replace the intrinsic with
its first argument without breaking semantics and the intrinsic is completely
dropped during instruction selection.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, ptr <str>, ptr <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions with
arbitrary strings. This can be useful for special purpose optimizations that
want to look for these annotations. These have no other defined use;
transformations preserve annotations on a best-effort basis but are allowed to
replace the intrinsic with its first argument without breaking semantics and the
intrinsic is completely dropped during instruction selection.

'``llvm.codeview.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This annotation emits a label at its program point and an associated
``S_ANNOTATION`` codeview record with some additional string metadata. This is
used to implement MSVC's ``__annotation`` intrinsic. It is marked
``noduplicate``, so calls to this intrinsic prevent inlining and should be
considered expensive.

::

      declare void @llvm.codeview.annotation(metadata)

Arguments:
""""""""""

The argument should be an MDTuple containing any number of MDStrings.

.. _llvm.trap:

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() cold noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction.
If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.

.. _llvm.debugtrap:

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

.. _llvm.ubsantrap:

'``llvm.ubsantrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind

Overview:
"""""""""

The '``llvm.ubsantrap``' intrinsic.

Arguments:
""""""""""

An integer describing the kind of failure detected.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an execution trap,
embedding the argument into the encoding of that trap, where possible, so that
crashes can be told apart.

Equivalent to ``@llvm.trap`` for targets that do not support this behavior.

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(ptr <guard>, ptr <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently some platforms have IR-level customized stack guard loading (e.g.
X86 Linux) that is not handled by ``llvm.stackguard()``, although it should be
in the future.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
      declare i64 @llvm.objectsize.i64(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to the
optimizer to determine whether a) an operation (like memcpy) will overflow a
buffer that corresponds to an object, or b) that a runtime check for overflow
isn't necessary. An object in this context means an allocation of a specific
class, structure, array, or other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
pointer to or into the ``object``. The second argument determines whether
``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
in address space 0 is used as its pointer argument. If it's ``false``,
``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
the ``null`` is in a non-zero address space or if ``true`` is given for the
third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
argument to ``llvm.objectsize`` determines if the value should be evaluated at
runtime.

The second, third, and fourth arguments only accept constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
the object concerned. If the size cannot be determined, ``llvm.objectsize``
returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
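
The following minimal sketch queries the size of a stack object. The function
name ``@buffer_size`` is hypothetical, and the ``.p0`` suffix on the intrinsic
name (encoding the pointer's address space) is an assumption of this example
rather than something taken from the syntax above.

.. code-block:: llvm

  declare i64 @llvm.objectsize.i64.p0(ptr, i1 immarg, i1 immarg, i1 immarg)

  ; Hypothetical helper: asks for the size of a 16-byte stack buffer.
  define i64 @buffer_size() {
  entry:
    %buf = alloca [16 x i8]
    ; min = false         -> report -1 if the size is unknown
    ; nullunknown = true  -> treat a null pointer as having unknown size
    ; dynamic = false     -> do not evaluate the size at runtime
    %size = call i64 @llvm.objectsize.i64.p0(ptr %buf, i1 false, i1 true, i1 false)
    ret i64 %size
  }

Because the allocation is visible in the same function, the call would be
expected to lower to the constant 16; if ``%buf`` were instead an opaque
function argument, the result would be -1 with these flags.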

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is an expected value.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

'``llvm.expect.with.probability``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
You can use ``llvm.expect.with.probability`` on any integer bit width.

::

      declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
      declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
      declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)

Overview:
"""""""""

The ``llvm.expect.with.probability`` intrinsic provides information about the
expected value of ``val`` with probability (or confidence) ``prob``, which can
be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
argument is a value. The second argument is an expected value. The third
argument is a probability.
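
As a sketch of how the three arguments fit together, the following function
hints that a comparison is expected to be ``false`` with probability 0.75. The
function name ``@select_path``, the block labels, and the probability value are
all hypothetical choices made for illustration.

.. code-block:: llvm

  declare i1 @llvm.expect.with.probability.i1(i1, i1, double)

  define i32 @select_path(i32 %x) {
  entry:
    %is_zero = icmp eq i32 %x, 0
    ; val = %is_zero, expected_val = false, prob = 0.75
    %hint = call i1 @llvm.expect.with.probability.i1(i1 %is_zero, i1 false,
                                                     double 7.500000e-01)
    br i1 %hint, label %rare, label %common

  rare:                                   ; taken when %x == 0
    ret i32 -1

  common:                                 ; hinted as the more likely path
    ret i32 %x
  }

The hint does not change the computed value: ``%hint`` always equals
``%is_zero``, and only the optimizer's view of branch likelihood is affected.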

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

More complex assumptions can be encoded as
:ref:`assume operand bundles <assume_opbundles>`.

Arguments:
""""""""""

The argument of the call is the condition which the optimizer may assume is
always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.


.. _int_ssa_copy:

'``llvm.ssa.copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.ssa.copy(type returned %operand) memory(none)

Arguments:
""""""""""

The first argument is an operand which is used as the returned value.

Overview:
""""""""""

The ``llvm.ssa.copy`` intrinsic can be used to attach information to
operations by copying them and giving them new names. For example,
the PredicateInfo utility uses it to build Extended SSA form, and
attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(ptr %ptr, metadata %type) nounwind memory(none)


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.

.. _type.checked.load:

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {ptr, i1} @llvm.type.checked.load(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer.
The third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.


.. _type.checked.load.relative:

'``llvm.type.checked.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {ptr, i1} @llvm.type.checked.load.relative(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)

Overview:
"""""""""

The ``llvm.type.checked.load.relative`` intrinsic loads a relative pointer to a
function from a virtual table pointer using metadata. Otherwise, its semantics
are identical to those of the ``llvm.type.checked.load`` intrinsic.

A relative pointer is a pointer to an offset to the pointed-to value. The
address of the underlying pointer of the relative pointer is obtained by adding
the offset to the address of the offset value.

'``llvm.arithmetic.fence``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type> @llvm.arithmetic.fence(<type> <op>)

Overview:
"""""""""

The purpose of the ``llvm.arithmetic.fence`` intrinsic
is to prevent the optimizer from performing fast-math optimizations,
particularly reassociation,
between the argument and the expression that contains the argument.
It can be used to preserve the parentheses in the source language.

Arguments:
""""""""""

The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
The argument and the return value are floating-point numbers,
or vector floating-point numbers, of the same type.

Semantics:
""""""""""

This intrinsic returns the value of its operand. The optimizer can optimize
the argument, but the optimizer cannot hoist any component of the operand
to the containing context, and the optimizer cannot move the calculation of
any expression in the containing context into the operand.
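
As a hedged sketch, the following shows how a frontend might preserve the
parentheses in a source expression such as ``(a + b) + c`` when reassociation
is otherwise permitted. The function name ``@sum`` is hypothetical, and the
``.f32`` suffix assumes the usual overloaded-intrinsic name mangling.

.. code-block:: llvm

      declare float @llvm.arithmetic.fence.f32(float)

      define float @sum(float %a, float %b, float %c) {
      entry:
        ; Inner sum; reassociation is allowed within the fenced operand.
        %t = fadd reassoc float %a, %b
        ; The fence keeps %t from being reassociated with the outer addition.
        %fenced = call float @llvm.arithmetic.fence.f32(float %t)
        %r = fadd reassoc float %fenced, %c
        ret float %r
      }
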


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind memory(none)

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (besides ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined).
The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` are defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.

In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``. Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize a guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); and this allows for "check widening" type optimizations.

``@llvm.experimental.guard`` cannot be invoked.

After ``@llvm.experimental.guard`` was first added, a more general
formulation was found in ``@llvm.experimental.widenable.condition``.
Support for ``@llvm.experimental.guard`` is slowly being rephrased in
terms of this alternate.

'``llvm.experimental.widenable.condition``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.experimental.widenable.condition()

Overview:
"""""""""

This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is `true` or `false`, the program is correct and
well-defined.

Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
``@llvm.experimental.widenable.condition`` allows frontends to
express guards or checks on optimistic assumptions made during
compilation and represent them as branch instructions on special
conditions.

While this may appear similar in semantics to `undef`, it is very
different in that an invocation produces a particular, singular
value. It is also intended to be lowered late, and remain available
for specific optimizations and transforms that can benefit from its
special properties.

Arguments:
""""""""""

None.

Semantics:
""""""""""

The intrinsic ``@llvm.experimental.widenable.condition()``
returns either `true` or `false`. For each evaluation of a call
to this intrinsic, the program must be valid and correct both if
it returns `true` and if it returns `false`. This allows
transformation passes to replace evaluations of this intrinsic
with either value whenever one is beneficial.

When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as
in the example below:


.. code-block:: text

    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.

Whether the result of the intrinsic's call is `true` or `false`,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
`i1` expressions.

This is how it can be used to represent guards as widenable branches:

.. code-block:: text

  block:
    ; Unguarded instructions
    call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
    ; Guarded instructions

This can be expressed in an alternative equivalent form as an explicit branch
using ``@llvm.experimental.widenable.condition``:

.. code-block:: text

  block:
    ; Unguarded instructions
    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
    %guard_condition = and i1 %cond, %widenable_condition
    br i1 %guard_condition, label %guarded, label %deopt

  guarded:
    ; Guarded instructions

  deopt:
    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]

So the block `guarded` is only reachable when `%cond` is `true`,
and it should be valid to go to the block `deopt` whenever `%cond`
is `true` or `false`.

``@llvm.experimental.widenable.condition`` will never throw, thus
it cannot be invoked.

Guard widening:
"""""""""""""""

When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.

Guard widening replaces

.. code-block:: text

    %widenable_cond = call i1 @llvm.experimental.widenable.condition()
    %guard_cond = and i1 %cond, %widenable_cond
    br i1 %guard_cond, label %guarded, label %deopt

with

.. code-block:: text

    %widenable_cond = call i1 @llvm.experimental.widenable.condition()
    %new_cond = and i1 %any_other_cond, %widenable_cond
    %new_guard_cond = and i1 %cond, %new_cond
    br i1 %new_guard_cond, label %guarded, label %deopt

for this branch. Here `%any_other_cond` is an arbitrarily chosen
well-defined `i1` value. By widening the guard, we may
impose stricter conditions on the `guarded` block and bail to the
deopt block when the new condition is not met.

Lowering:
"""""""""

The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant `true`. However, it is always correct to replace
it with any other `i1` value. Any pass can
freely do so if it can benefit from non-default lowering.

'``llvm.allow.ubsan.check``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.allow.ubsan.check(i8 immarg %kind)

Overview:
"""""""""

This intrinsic returns ``true`` if and only if the compiler opted to enable the
ubsan check in the current basic block.

The rules for allowing ubsan checks are not part of the intrinsic declaration;
they are controlled by compiler options.

This intrinsic is the ubsan-specific version of ``@llvm.allow.runtime.check()``.

Arguments:
""""""""""

An integer describing the kind of ubsan check guarded by the intrinsic.

Semantics:
""""""""""

The intrinsic ``@llvm.allow.ubsan.check()`` returns either ``true`` or
``false``, depending on compiler options.

For each evaluation of a call to this intrinsic, the program must be valid and
correct both if it returns ``true`` and if it returns ``false``.

When used in a branch condition, it selects one of the two paths:

* ``true``: Executes the UBSan check and reports any failures.

* ``false``: Bypasses the check, assuming it always succeeds.

Example:

.. code-block:: text

    %allow = call i1 @llvm.allow.ubsan.check(i8 5)
    %not.allow = xor i1 %allow, true
    %cond = or i1 %ubcheck, %not.allow
    br i1 %cond, label %cont, label %trap

  cont:
    ; Proceed

  trap:
    call void @llvm.ubsantrap(i8 5)
    unreachable


'``llvm.allow.runtime.check``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.allow.runtime.check(metadata %kind)

Overview:
"""""""""

This intrinsic returns ``true`` if and only if the compiler opted to enable
runtime checks in the current basic block.

The rules for allowing runtime checks are not part of the intrinsic declaration;
they are controlled by compiler options.

This intrinsic is the non-ubsan-specific version of ``@llvm.allow.ubsan.check()``.

Arguments:
""""""""""

A string identifying the kind of runtime check guarded by the intrinsic. The
string can be used to control rules to allow checks.

Semantics:
""""""""""

The intrinsic ``@llvm.allow.runtime.check()`` returns either ``true`` or
``false``, depending on compiler options.

For each evaluation of a call to this intrinsic, the program must be valid and
correct both if it returns ``true`` and if it returns ``false``.

When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem.

If the intrinsic is evaluated as ``true``, the program should execute a guarded
check. If the intrinsic is evaluated as ``false``, the program should avoid any
unnecessary checks.

Example:

.. code-block:: text

    %allow = call i1 @llvm.allow.runtime.check(metadata !"my_check")
    br i1 %allow, label %fast_path, label %slow_path

  fast_path:
    ; Omit diagnostics.

  slow_path:
    ; Additional diagnostics.


'``llvm.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.load.relative.iN(ptr %ptr, iN %offset) nounwind memory(argmem: read)

Overview:
"""""""""

This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
adds ``%ptr`` to that value and returns it. The constant folder specifically
recognizes the form of this intrinsic and the constant initializers it may
load from; if a loaded constant initializer is known to have the form
``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.


.. _llvm_sideeffect:

'``llvm.sideeffect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.sideeffect() inaccessiblememonly nounwind willreturn

Overview:
"""""""""

The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
treat it as having side effects, so it can be inserted into a loop to
indicate that the loop shouldn't be assumed to terminate (which could
potentially lead to the loop being optimized away entirely), even if it's
an infinite loop with no other side effects.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic actually does nothing, but optimizers must assume that it
has externally observable side effects.

'``llvm.is.constant.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.

::

      declare i1 @llvm.is.constant.i32(i32 %operand) nounwind memory(none)
      declare i1 @llvm.is.constant.f32(float %operand) nounwind memory(none)
      declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind memory(none)

Overview:
"""""""""

The '``llvm.is.constant``' intrinsic will return true if the argument
is known to be a manifest compile-time constant. It is guaranteed to
fold to either true or false before generating machine code.

Semantics:
""""""""""

This intrinsic generates no code. If its argument is known to be a
manifest compile-time constant value, then the intrinsic will be
converted to a constant true value. Otherwise, it will be converted to
a constant false value.
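
As a small illustrative sketch (the function names ``@on_literal`` and
``@on_param`` are hypothetical), the first call below has a literal operand and
would be expected to fold to ``true``, while the second has a function
parameter as its operand and would be expected to fold to ``false`` as long as
the optimizer learns nothing more about that value.

.. code-block:: llvm

      declare i1 @llvm.is.constant.i32(i32)

      define i1 @on_literal() {
      entry:
        ; Operand is a manifest constant.
        %c = call i1 @llvm.is.constant.i32(i32 42)
        ret i1 %c
      }

      define i1 @on_param(i32 %param) {
      entry:
        ; Operand is not known to be a constant within this function.
        %c = call i1 @llvm.is.constant.i32(i32 %param)
        ret i1 %c
      }
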

In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.

The result also intentionally depends on the result of optimization
passes -- e.g., the result can change depending on whether a
function gets inlined or not. A function's parameters are
obviously not constant. However, a call like
``llvm.is.constant.i32(i32 %param)`` *can* return true after the
function is inlined, if the value passed to the function parameter was
a constant.

.. _int_ptrmask:

'``llvm.ptrmask``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) speculatable memory(none)

Arguments:
""""""""""

The first argument is a pointer or vector of pointers. The second argument is
an integer or vector of integers with the same bit width as the index type
size of the first argument.

Overview:
""""""""""

The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
This allows stripping data from tagged pointers without converting them to an
integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
to facilitate alias analysis and underlying-object detection.
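
The following is a minimal sketch of stripping a tag from a pointer. It assumes
a target whose address space 0 has a 64-bit index type and a scheme that
stores tag bits in the low 3 bits of the pointer; the function name
``@strip_tag`` is hypothetical, and the ``.p0.i64`` suffix assumes the usual
overloaded-intrinsic name mangling.

.. code-block:: llvm

      declare ptr @llvm.ptrmask.p0.i64(ptr, i64)

      define ptr @strip_tag(ptr %tagged) {
      entry:
        ; -8 is ~7, so the low three bits are cleared and all others are kept.
        %clean = call ptr @llvm.ptrmask.p0.i64(ptr %tagged, i64 -8)
        ret ptr %clean
      }
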

Semantics:
""""""""""

The result of ``ptrmask(%ptr, %mask)`` is equivalent to the following expansion,
where ``iPtrIdx`` is the index type size of the pointer::

    %intptr = ptrtoint ptr %ptr to iPtrIdx ; this may truncate
    %masked = and iPtrIdx %intptr, %mask
    %diff = sub iPtrIdx %masked, %intptr
    %result = getelementptr i8, ptr %ptr, iPtrIdx %diff

If the pointer index type size is smaller than the pointer type size, this
implies that pointer bits beyond the index size are not affected by this
intrinsic. For integral pointers, it behaves as if the mask were extended with
1 bits to the pointer type size.

Both the returned pointer(s) and the first argument are based on the same
underlying object (for more information on the *based on* terminology see
:ref:`the pointer aliasing rules <pointeraliasing>`).

The intrinsic only captures the pointer argument through the return value.

.. _int_threadlocal_address:

'``llvm.threadlocal.address``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptr @llvm.threadlocal.address(ptr) nounwind willreturn memory(none)

Arguments:
""""""""""

The ``llvm.threadlocal.address`` intrinsic requires a global value argument (a
:ref:`global variable <globalvars>` or alias) that is thread local.

Semantics:
""""""""""

The address of a thread local global is not a constant, since it depends on
the calling thread. The ``llvm.threadlocal.address`` intrinsic returns the
address of the given thread local global in the calling thread.


.. _int_vscale:

'``llvm.vscale``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()

Overview:
"""""""""

The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
vectors such as ``<vscale x 16 x i8>``.

Semantics:
""""""""""

``vscale`` is a positive value that is constant throughout program
execution, but is unknown at compile time.
If the result value does not fit in the result type, then the result is
a :ref:`poison value <poisonvalues>`.

.. _llvm_fake_use:

'``llvm.fake.use``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.fake.use(...)

Overview:
"""""""""

The ``llvm.fake.use`` intrinsic is a no-op. It takes a single
value as an operand and is treated as a use of that operand, to force the
optimizer to preserve that value prior to the fake use. This is used for
extending the lifetimes of variables, where this intrinsic placed at the end of
a variable's scope helps prevent that variable from being optimized out.

Arguments:
""""""""""

The ``llvm.fake.use`` intrinsic takes one argument, which may be any
function-local SSA value. Note that the signature is variadic so that the
intrinsic can take any type of argument, but passing more than one argument will
result in an error.

Semantics:
""""""""""

This intrinsic does nothing, but optimizers must consider it a use of its single
operand and should try to preserve the intrinsic and its position in the
function.
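
A minimal sketch of the intended usage, assuming a frontend that wants a
temporary to stay available (for example, to a debugger) until the end of its
scope. The function name ``@f`` and the value names are hypothetical. Because
the intrinsic is declared variadic, the call spells out the ``void (...)``
function type.

.. code-block:: llvm

      declare void @llvm.fake.use(...)

      define i32 @f(i32 %x) {
      entry:
        ; %tmp has no real uses and would otherwise be trivially dead.
        %tmp = mul i32 %x, 3
        %r = add i32 %x, 1
        ; Treated as a use of %tmp, placed at the end of its scope.
        call void (...) @llvm.fake.use(i32 %tmp)
        ret i32 %r
      }
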


Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.

Element Wise Atomic Memory Intrinsics
-------------------------------------

These intrinsics are similar to the standard library memory intrinsics except
that they perform memory transfer as a sequence of atomic memory accesses.

.. _int_memcpy_element_unordered_atomic:

'``llvm.memcpy.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(ptr <dest>,
                                                                   ptr <src>,
                                                                   i32 <len>,
                                                                   i32 <element_size>)
      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr <dest>,
                                                                   ptr <src>,
                                                                   i64 <len>,
                                                                   i32 <element_size>)

Overview:
"""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
as arrays with elements that are exactly ``element_size`` bytes, and the copy between
buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of
``element_size``, then the behavior of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap. The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memcpy_element_unordered_atomic_*``, where '*' is replaced with an actual
element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

'``llvm.memmove.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths however.

::

      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i32(ptr <dest>,
                                                                    ptr <src>,
                                                                    i32 <len>,
                                                                    i32 <element_size>)
      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr <dest>,
                                                                    ptr <src>,
                                                                    i64 <len>,
                                                                    i32 <element_size>)

Overview:
"""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
``src`` are treated as arrays with elements that are exactly ``element_size``
bytes, and the copy between buffers uses a sequence of
:ref:`unordered atomic <ordering>` load/store operations that are a positive
integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behavior of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``.
The caller
guarantees that both the source and destination pointers are aligned to that
boundary.

Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap. The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces.
Not all targets
support all bit widths however.

::

      declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr <dest>,
                                                                i8 <value>,
                                                                i32 <len>,
                                                                i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr <dest>,
                                                                i8 <value>,
                                                                i64 <len>,
                                                                i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behavior of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
memory starting at the destination location to the given ``value``. The memory is
set with a sequence of store operations where each access is guaranteed to be a
multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
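
A minimal sketch of a call that satisfies these constraints follows. The
function name ``@zero_qwords`` and the sizes are illustrative assumptions: 64
bytes are cleared as eight 8-byte elements, so ``len`` is a multiple of
``element_size`` and the destination alignment is no less than
``element_size``.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr, i8, i32, i32)

      ; Hypothetical caller: %buf is assumed to be 8-byte aligned.
      define void @zero_qwords(ptr %buf) {
        ; Each store covers a multiple of 8 bytes and is unordered atomic.
        call void @llvm.memset.element.unordered.atomic.p0.i32(
            ptr align 8 %buf, i8 0, i32 64, i32 8)
        ret void
      }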

The order of the assignment is unspecified. Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memset_element_unordered_atomic_*``, where '*' is replaced with the actual
element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.

Objective-C ARC Runtime Intrinsics
----------------------------------

LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
LLVM is aware of the semantics of these functions, and optimizes based on that
knowledge. You can read more about the details of Objective-C ARC `here
<https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.

'``llvm.objc.autorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.autorelease(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.

'``llvm.objc.autoreleasePoolPop``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.autoreleasePoolPop(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.

'``llvm.objc.autoreleasePoolPush``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.autoreleasePoolPush()

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.

'``llvm.objc.autoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.autoreleaseReturnValue(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.

'``llvm.objc.copyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.copyWeak(ptr, ptr)

Lowering:
"""""""""

Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.

'``llvm.objc.destroyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.destroyWeak(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.

'``llvm.objc.initWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.initWeak(ptr, ptr)

Lowering:
"""""""""

Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.

'``llvm.objc.loadWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.loadWeak(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.

'``llvm.objc.loadWeakRetained``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.loadWeakRetained(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.

'``llvm.objc.moveWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.moveWeak(ptr, ptr)

Lowering:
"""""""""

Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.

'``llvm.objc.release``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.release(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.

'``llvm.objc.retain``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.retain(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
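
As a minimal sketch of how these intrinsics appear in IR, the following
hypothetical function retains an object, passes it to a callee, and releases
it. The names ``@use`` and ``@retain_use_release`` are assumptions made for
illustration. Per the lowering notes above, the two intrinsic calls lower to
``objc_retain`` and ``objc_release``.

.. code-block:: llvm

      declare ptr @llvm.objc.retain(ptr)
      declare void @llvm.objc.release(ptr)

      ; Hypothetical external function that consumes the object.
      declare void @use(ptr)

      define void @retain_use_release(ptr %obj) {
        ; The retain intrinsic returns a pointer, used here as the object.
        %retained = call ptr @llvm.objc.retain(ptr %obj)
        call void @use(ptr %retained)
        call void @llvm.objc.release(ptr %retained)
        ret void
      }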

'``llvm.objc.retainAutorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.retainAutorelease(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.

'``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.retainAutoreleaseReturnValue(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.

'``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.retainAutoreleasedReturnValue(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.

'``llvm.objc.retainBlock``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.retainBlock(ptr)

Lowering:
"""""""""

Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.

'``llvm.objc.storeStrong``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.storeStrong(ptr, ptr)

Lowering:
"""""""""

Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.

'``llvm.objc.storeWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare ptr @llvm.objc.storeWeak(ptr, ptr)

Lowering:
"""""""""

Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.

Preserving Debug Information Intrinsics
---------------------------------------

These intrinsics are used to carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. Such information is lost in the IR GetElementPtr instruction
because the IR types differ from the debuginfo types and unions
are converted to structs in IR.

'``llvm.preserve.array.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <ret_type>
      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
                                                                           i32 dim,
                                                                           i32 index)

Overview:
"""""""""

The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
based on array base ``base``, array dimension ``dim`` and the last access index ``index``
into the array. The return type ``ret_type`` is a pointer type to the array element.
The array ``dim`` and ``index`` are preserved, which is more robust than a
getelementptr instruction, which may be subject to compiler transformation.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the array or pointer debuginfo type.
The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
debuginfo version of ``type``.

Arguments:
""""""""""

The ``base`` is the array base address. The ``dim`` is the array dimension.
The ``base`` is a pointer if ``dim`` equals 0.
The ``index`` is the last access index into the array or pointer.

The ``base`` argument must be annotated with an :ref:`elementtype
<attr_elementtype>` attribute at the call-site. This attribute specifies the
getelementptr element type.

Semantics:
""""""""""

The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``,
that is, ``dim`` zero indices followed by ``index``.

'``llvm.preserve.union.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <type>
      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
                                                                        i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
``di_index`` and returns the ``base`` address.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the union debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``type`` is the same as the ``base`` type.

Arguments:
""""""""""

The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.

Semantics:
""""""""""

The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.

'``llvm.preserve.struct.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <ret_type>
      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
                                                                 i32 gep_index,
                                                                 i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
based on struct base ``base`` and IR struct member index ``gep_index``.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide struct debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``ret_type`` is a pointer type to the structure member.

Arguments:
""""""""""

The ``base`` is the structure base address. The ``gep_index`` is the struct member index
based on IR structures. The ``di_index`` is the struct member index based on debuginfo.

The ``base`` argument must be annotated with an :ref:`elementtype
<attr_elementtype>` attribute at the call-site. This attribute specifies the
getelementptr element type.

Semantics:
""""""""""

The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
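
The equivalence stated above can be sketched as follows. The struct type
``%struct.S``, the function names, and the ``.p0.p0`` name suffix are
assumptions made for illustration. The ``llvm.preserve.access.index`` metadata
that would normally be attached to the call is omitted for brevity. Both
functions are expected to return the address of the second member.

.. code-block:: llvm

      ; Hypothetical struct with two members.
      %struct.S = type { i32, i64 }

      declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32, i32)

      define ptr @member_via_intrinsic(ptr %base) {
        ; gep_index = 1 (IR member index), di_index = 1 (debuginfo member index).
        %addr = call ptr @llvm.preserve.struct.access.index.p0.p0(
            ptr elementtype(%struct.S) %base, i32 1, i32 1)
        ret ptr %addr
      }

      define ptr @member_via_gep(ptr %base) {
        ; Same address computed with access operands {0, gep_index}.
        %addr = getelementptr %struct.S, ptr %base, i32 0, i32 1
        ret ptr %addr
      }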

'``llvm.fptrunc.round``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.fptrunc.round(<type> <value>, metadata <rounding mode>)

Overview:
"""""""""

The '``llvm.fptrunc.round``' intrinsic truncates
:ref:`floating-point <t_floating>` ``value`` to type ``ty2``
with a specified rounding mode.

Arguments:
""""""""""

The '``llvm.fptrunc.round``' intrinsic takes a :ref:`floating-point
<t_floating>` ``value`` to cast and a :ref:`floating-point <t_floating>` type
``ty2`` to cast it to. The type of ``value`` must be larger in size than the
result type.

The second argument specifies the rounding mode as described in the constrained
intrinsics section.
For this intrinsic, the "round.dynamic" mode is not supported.

Semantics:
""""""""""

The '``llvm.fptrunc.round``' intrinsic casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This intrinsic is assumed to execute in the default :ref:`floating-point
environment <floatenv>` *except* for the rounding mode.
This intrinsic is not supported on all targets. Some targets may not support
all rounding modes.
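
As a minimal sketch, the following hypothetical function truncates a ``float``
to ``half`` while rounding toward zero. The function name and the
``.f16.f32`` name suffix are assumptions made for illustration, and a given
target may not support this combination of types and rounding mode.

.. code-block:: llvm

      declare half @llvm.fptrunc.round.f16.f32(float, metadata)

      define half @trunc_toward_zero(float %x) {
        ; The rounding mode is passed as a metadata string.
        %r = call half @llvm.fptrunc.round.f16.f32(float %x,
                                                   metadata !"round.towardzero")
        ret half %r
      }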