==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language.
There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
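
As a minimal sketch of the first two identifier formats, the module below
uses a plain named global, a quoted global name containing an escaped
character, a named local, and an unnamed local. The names
``@"my global\20name"`` and ``@example`` are hypothetical and chosen only
for illustration.

.. code-block:: llvm

    ; A named global matching the identifier regular expression.
    @DivisionByZero = global i32 0

    ; A quoted name may contain other characters; "\20" is the hexadecimal
    ; escape for a space.
    @"my global\20name" = global i32 1

    define i32 @example(i32 %a.really.long.identifier) {
      ; The only parameter is named and the entry block has no label, so the
      ; entry block is numbered 0 and the first unnamed value is %1.
      %1 = add i32 %a.really.long.identifier, 1
      ret i32 %1
    }

Because every value carries a ``%`` or ``@`` prefix, none of these names can
collide with a reserved word such as ``add`` or ``ret``.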

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document.
When demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded.
    Note that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers).
    This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating-point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    llvm.experimental.patchpoint because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible.
    This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the ObjectiveC
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the ObjectiveC runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too.
    The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the ObjectiveC runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time the function is called. The entry and exit
    blocks can access a few TLS IR variables; each access will be lowered to a
    platform-specific sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#id80>`_ This
    calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on under which circumstances the different models may
be used.
The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect in the aliasee.

For platforms without linker support of ELF TLS model, the -femulated-tls
flag can be used to generate GCC compatible emulated TLS code.

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.

.. _nointptrtype:

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` instructions converting integers to non-integral pointer types are
ill-typed, and so are ``ptrtoint`` instructions converting values of
non-integral pointer types to integers. Vector versions of said instructions
are ill-typed as well.

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Global variables can optionally specify a :ref:`linkage type <linkage>`.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified. If there
is a mismatch between the explicit or inferred section information for the
variable declaration and its definition the resulting behavior is undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.
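
The distinction between definitions, declarations, and ``constant`` globals
can be sketched as follows. The names ``@table``, ``@counter``, ``@ext`` and
``@bump`` are hypothetical, and the snippet assumes the typed-pointer syntax
(``i32*``) used by the other examples in this document.

.. code-block:: llvm

    ; A definition must have an initializer. Marking it constant promises
    ; that the contents are never modified, so it may be placed in read-only
    ; data.
    @table = constant [4 x i32] [i32 1, i32 2, i32 4, i32 8]

    ; This global is stored to at run time, so it cannot be marked constant.
    @counter = global i32 0

    ; A declaration of a global defined in another translation unit has no
    ; initializer.
    @ext = external global i32

    define void @bump() {
      %old = load i32, i32* @counter
      %new = add i32 %old, 1
      store i32 %new, i32* @counter
      ret void
    }

A store to ``@table`` would contradict the ``constant`` marker, which states
that the contents are never modified.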

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

External declarations may have an explicit section specified.
Section 671information is retained in LLVM IR for targets that make use of this 672information. Attaching section information to an external declaration is an 673assertion that its definition is located in the specified section. If the 674definition is located in a different section, the behavior is undefined. 675 676By default, global initializers are optimized by assuming that global 677variables defined within the module are not modified from their 678initial values before the start of the global initializer. This is 679true even for variables potentially accessible from outside the 680module, including those with external linkage or appearing in 681``@llvm.used`` or dllexported variables. This assumption may be suppressed 682by marking the variable with ``externally_initialized``. 683 684An explicit alignment may be specified for a global, which must be a 685power of 2. If not present, or if the alignment is set to zero, the 686alignment of the global is set by the target to whatever it feels 687convenient. If an explicit alignment is specified, the global is forced 688to have exactly that alignment. Targets and optimizers are not allowed 689to over-align the global if the global has an assigned section. In this 690case, the extra alignment could be observable: for example, code could 691assume that the globals are densely packed in their section and try to 692iterate over them as an array, alignment padding would break this 693iteration. The maximum alignment is ``1 << 29``. 694 695For global variables declarations, as well as definitions that may be 696replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common`` 697linkage types), LLVM makes no assumptions about the allocation size of the 698variables, except that they may not overlap. The alignment of a global variable 699declaration or replaceable definition must not be greater than the alignment of 700the definition it resolves to. 
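
The ``unnamed_addr``, address space and ``externally_initialized`` markers
described above might be combined as in the following sketch. All global names
are hypothetical, and the meaning of address space 1 is an assumption that
depends entirely on the target:

.. code-block:: llvm

    ; Two constants whose addresses are not significant and whose
    ; initializers are identical, so they are candidates for merging.
    @.str.a = private unnamed_addr constant [3 x i8] c"ok\00", align 1
    @.str.b = private unnamed_addr constant [3 x i8] c"ok\00", align 1

    ; The address space qualifier precedes the global/constant keyword.
    @scratch = addrspace(1) global [16 x i8] zeroinitializer, align 16

    ; Optimizers must not assume this variable still holds its initializer.
    @patched = externally_initialized global i32 0
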

Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

:ref:`Scalable vectors <t_vector>` cannot be global variables or members of
arrays because their size is unknown at compile time. They are allowed in
structs to facilitate intrinsics returning multiple values. Structs containing
scalable vectors cannot be used in loads, stores, allocas, or GEPs.

Syntax::

      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                         [DLLStorageClass] [ThreadLocal]
                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                         [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, comdat [($name)]]
                         [, align <Alignment>] (, !name !N)*

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional address space, an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
or ``local_unnamed_addr`` attribute, an optional address space, a return type,
an optional :ref:`parameter attribute <paramattrs>` for the return type, a
function name, a possibly empty list of arguments, an optional alignment, an
optional :ref:`garbage collector name <gc>`, an optional
:ref:`prefix <prefixdata>`, and an optional :ref:`prologue <prologuedata>`.

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function. Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label name is not provided, a block is assigned
an implicit numbered label, using the next value from the same counter as used
for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
function entry block does not have an explicit label, it will be assigned label
"%0", then the first unnamed temporary in that block will be "%1", etc. If a
numeric label is explicitly specified, it must match the numeric label that
would be used implicitly.

The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it feels convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.
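
The implicit numbering of basic blocks and unnamed temporaries described above
can be illustrated with a small sketch. The function name ``@abs_value`` and
the labels ``negate`` and ``done`` are hypothetical. Because the parameter is
named, the unlabeled entry block takes the number ``%0`` and the unnamed
temporaries continue from ``%1``:

.. code-block:: llvm

    define i32 @abs_value(i32 %x) unnamed_addr align 16 {
      ; Unlabeled entry block: implicitly numbered %0. It has no
      ; predecessors and therefore no PHI nodes.
      %1 = icmp slt i32 %x, 0
      br i1 %1, label %negate, label %done

    negate:
      %2 = sub i32 0, %x
      br label %done

    done:
      ; The entry block is referred to by its implicit label %0.
      %result = phi i32 [ %2, %negate ], [ %x, %0 ]
      ret i32 %result
    }
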

If an explicit address space is not given, it will default to the program
address space from the :ref:`datalayout string<langref_datalayout>`.

Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
           [section "name"] [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }

The argument list is a comma separated sequence of arguments where each
argument is of the following form:

Syntax::

       <type> [parameter Attrs] [name]


.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.
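
A minimal sketch of the alias syntax follows. The names ``@impl_data``,
``@exported_data``, ``@do_work`` and ``@do_work_v1`` are hypothetical, and both
aliasees are definitions in the same module:

.. code-block:: llvm

    @impl_data = global i32 7

    define void @do_work() {
      ret void
    }

    ; A second name for @impl_data, guaranteed to have the same address.
    @exported_data = alias i32, i32* @impl_data

    ; A weak alias of a function. With unnamed_addr, only the content is
    ; guaranteed to match the aliasee.
    @do_work_v1 = weak unnamed_addr alias void (), void ()* @do_work
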

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* No global value in the expression can be a declaration, since that
  would require a relocation, which is not possible.

.. _langref_ifunc:

IFuncs
-------

IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

An IFunc may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>


.. _langref_comdats:

Comdats
-------

Comdat IR provides access to COFF and ELF object file COMDAT functionality.

Comdats have a name which represents the COMDAT key. All global objects that
specify this key will only end up in the final object file if the linker chooses
that key over some other key. Aliases are placed in the same COMDAT that their
aliasee computes to, if any.

Comdats have a selection kind to provide input on how the linker should
choose between keys in two different object files.

Syntax::

    $<Name> = comdat SelectionKind

The selection kind must be one of the following:

``any``
    The linker may choose any COMDAT key, the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

Note that XCOFF and the Mach-O platform don't support COMDATs, and ELF and
WebAssembly only support ``any`` as a selection kind.

Here is an example of a COMDAT group where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

    $foo = comdat largest
    @foo = global i32 2, comdat($foo)

    define void @bar() comdat($foo) {
      ret void
    }

As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: llvm

    $foo = comdat any
    @foo = global i32 2, comdat


In a COFF object file, this will create a COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: llvm

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat($foo)
    @g2 = global i32 42, section "sec", comdat($bar)

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

#. Named metadata are represented as a string of characters with the
   metadata prefix. The rules for metadata names are the same as for
   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
   are still valid, which allows any character to be part of a name.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that any attributes for the function result (``nounwind``,
``readonly``) come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The byval type argument indicates the in-memory value type, and
    must be the same as the pointee type of the argument.

    The byval attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_byref:

``byref(<ty>)``

    The ``byref`` argument attribute allows specifying the pointee
    memory type of an argument. This is similar to ``byval``, but does
    not imply a copy is made anywhere, or that the argument is passed
    on the stack. This implies the pointer is dereferenceable up to
    the storage size of the type.

    It is not generally permissible to introduce a write to a
    ``byref`` pointer. The pointer may have any address space and may
    be read only.

    This is not a valid attribute for return values.

    The alignment for a ``byref`` parameter can be explicitly
    specified by combining it with the ``align`` attribute, similar to
    ``byval``. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

    This is intended for representing ABI constraints, and is not
    intended to be inferred for optimization use.

.. _attr_preallocated:

``preallocated(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function, and that the pointer parameter's pointee has
    already been initialized before the call instruction. This attribute
    is only valid on LLVM pointer arguments. The argument must be the value
    returned by the appropriate
    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
    ``musttail`` calls, or the corresponding caller parameter in ``musttail``
    calls, although it is ignored during codegen.

    A non ``musttail`` function call with a ``preallocated`` attribute in
    any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
    function call cannot have a ``"preallocated"`` operand bundle.

    The preallocated attribute requires a type argument, which must be
    the same as the pointee type of the argument.

    The preallocated attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_inalloca:

``inalloca(<ty>)``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    inalloca keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it. The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    behavior is undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    The inalloca attribute requires a type argument, which must be the
    same as the pointee type of the argument.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret(<ty>)``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee not
    to trap and to be properly aligned. This is not a valid attribute
    for return values.

    The sret type argument specifies the in memory type, which must be
    the same as the pointee type of the argument.

.. _attr_align:

``align <n>`` or ``align(<n>)``
    This indicates that the pointer value has the specified alignment.
    If the pointer value does not have the specified alignment, a
    :ref:`poison value <poisonvalues>` is returned or passed instead. The
    ``align`` attribute should be combined with the ``noundef`` attribute to
    ensure a pointer is aligned, or otherwise the behavior is undefined. Note
    that ``align 1`` has no effect on non-byval, non-preallocated arguments.

    Note that this attribute has additional semantics when combined with the
    ``byval`` or ``preallocated`` attribute, which are documented there.

.. _noalias:

``noalias``
    This indicates that memory locations accessed via pointer values
    :ref:`based <pointeraliasing>` on the argument or return value are not also
    accessed, during the execution of the function, via pointer values not
    *based* on the argument or return value. This guarantee only holds for
    memory locations that are *modified*, by any means, during the execution of
    the function. The attribute on a return value also has additional semantics
    described below. The caller shares the responsibility with the callee for
    ensuring that these requirements are met. For further details, please see
    the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
    or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
    attribute on return values are stronger than the semantics of the attribute
    when used on function arguments. On function return values, the ``noalias``
    attribute indicates that the function acts like a system memory allocation
    function, returning a pointer to allocated storage disjoint from the
    storage for any other object accessible to the caller.

.. _nocapture:

``nocapture``
    This indicates that the callee does not :ref:`capture <pointercapture>` the
    pointer. This is not a valid attribute for return values.
    This attribute applies only to the particular copy of the pointer passed in
    this argument. A caller could pass two copies of the same pointer with one
    being annotated nocapture and the other not, and the callee could validly
    capture through the non annotated parameter.

.. code-block:: llvm

    define void @f(i8* nocapture %a, i8* %b) {
      ; (capture %b)
    }

    call void @f(i8* @glb, i8* @glb) ; well-defined

``nofree``
    This indicates that callee does not free the pointer argument. This is not
    a valid attribute for return values.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

``returned``
    This indicates that the function always returns the argument as its return
    value. This is a hint to the optimizer and code generator used when
    generating the caller, allowing value propagation, tail call optimization,
    and omission of register saves and restores in some cases; it is not
    checked or enforced when generating the callee. The parameter and the
    function return type must be valid operands for the
    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
    return values and can only be applied to one parameter.

``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters. This is not
    checked or enforced by LLVM; if the parameter or return pointer is null, a
    :ref:`poison value <poisonvalues>` is returned or passed instead.
    The ``nonnull`` attribute should be combined with the ``noundef`` attribute
    to ensure a pointer is not null or otherwise the behavior is undefined.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space), except if the
    ``null_pointer_is_valid`` function attribute is present.
    ``n`` should be a positive number. The pointer should be well defined,
    otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
    implies ``noundef``.

``dereferenceable_or_null(<n>)``
    This indicates that the parameter or return value isn't both
    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
    time. All non-null pointers tagged with
    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
    and in other address spaces ``dereferenceable_or_null(<n>)``
    implies that a pointer is at least one of ``dereferenceable(<n>)``
    or ``null`` (i.e. it may be both ``null`` and
    ``dereferenceable(<n>)``). This attribute may only be applied to
    pointer typed parameters.

``swiftself``
    This indicates that the parameter is the self/context parameter. This is not
    a valid attribute for return values and can only be applied to one
    parameter.

``swiftasync``
    This indicates that the parameter is the asynchronous context parameter and
    triggers the creation of a target-specific extended frame record to store
    this pointer. This is not a valid attribute for return values and can only
    be applied to one parameter.

``swifterror``
    This attribute exists to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from and stored to, or used
    as a ``swifterror`` argument. This is not a valid attribute for return
    values and can only be applied to one parameter.

    These constraints allow the calling convention to optimize access to
    ``swifterror`` variables by associating them with a specific register at
    call boundaries rather than placing them in memory. Since this does change
    the calling convention, a function which uses the ``swifterror`` attribute
    on a parameter is not ABI-compatible with one which does not.

    These constraints also allow LLVM to assume that a ``swifterror`` argument
    does not alias any other memory visible within a function and that a
    ``swifterror`` alloca passed as an argument does not escape.

``immarg``
    This indicates the parameter is required to be an immediate
    value. This must be a trivial immediate integer or floating-point
    constant. Undef or constant expressions are not valid. This is
    only valid on intrinsic declarations and cannot be applied to a
    call site or arbitrary function.

``noundef``
    This attribute applies to parameters and return values. If the value
    representation contains any undefined or poison bits, the behavior is
    undefined. Note that this does not refer to padding introduced by the
    type's storage representation.

``alignstack(<n>)``
    This indicates the alignment that should be considered by the backend when
    assigning this parameter to a stack slot during calling convention
    lowering. The enforcement of the specified alignment is target-dependent,
    as target-specific calling convention rules may override this value. This
    attribute serves the purpose of carrying language specific alignment
    information that is not mapped to base types in the backend (for example,
    over-alignment specification through language attributes).

.. _gc:

Garbage Collector Strategy Names
--------------------------------

Each function may specify a garbage collector strategy name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.

.. _prefixdata:

Prefix Data
-----------

Prefix data is data associated with a function which the code
generator will emit immediately before the function's entrypoint.
The purpose of this feature is to allow frontends to associate
language-specific runtime metadata with specific functions and make it
available through the function pointer while still allowing the
function pointer to be called.

To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1. This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``,

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as,

.. code-block:: llvm

    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a

Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned. If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.

A function may have prefix data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _prologuedata:

Prologue Data
-------------

The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
be inserted prior to the function body. This can be used for enabling
function hot-patching and instrumentation.

To maintain the semantics of ordinary function calls, the prologue data must
have a particular format. Specifically, it must begin with a sequence of
bytes which decode to a sequence of machine instructions, valid for the
module's target, which transfer control to the point immediately succeeding
the prologue data, without performing any other visible action.
This allows
the inliner and other passes to reason about the semantics of the function
definition without needing to reason about the prologue data. This
makes the format of the prologue data highly target-dependent.

A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
which encodes the ``nop`` instruction:

.. code-block:: text

    define void @f() prologue i8 144 { ... }

Generally prologue data can be formed by encoding a relative branch instruction
which skips the metadata, as in this example of valid prologue data for the
x86_64 architecture, where the first two bytes encode ``jmp .+10``:

.. code-block:: text

    %0 = type <{ i8, i8, i8* }>

    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }

A function may have prologue data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _personalityfn:

Personality Function
--------------------

The ``personality`` attribute permits functions to specify what function
to use for exception handling.

.. _attrgrp:

Attribute Groups
----------------

Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.

An attribute group is a module-level object. To use an attribute group, an
object references the attribute group's ID (e.g. ``#37``). An object may refer
to more than one attribute group.
In that situation, the attributes from the
different groups are merged.

Here is an example of attribute groups for a function that should always be
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:

.. code-block:: llvm

    ; Target-independent attributes:
    attributes #0 = { alwaysinline alignstack=4 }

    ; Target-dependent attributes:
    attributes #1 = { "no-sse" }

    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
    define void @f() #0 #1 { ... }

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.

Function attributes are simple keywords that follow the type specified.
If multiple attributes are needed, they are space separated. For
example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }

``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer.
    If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available. The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold; and, thus, given low
    weight.
``convergent``
    In some parallel execution models, there exist operations that cannot be
    made control-dependent on any additional values. We call such operations
    ``convergent``, and mark them with this attribute.

    The ``convergent`` attribute may appear on functions or call/invoke
    instructions. When it appears on a function, it indicates that calls to
    this function should not be made control-dependent on additional values.
    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
    calls to this intrinsic cannot be made control-dependent on additional
    values.

    When it appears on a call/invoke, the ``convergent`` attribute indicates
    that we should treat the call as though we're calling a convergent
    function. This is particularly useful on indirect calls; without this we
    may treat such calls as though the target is non-convergent.
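
    As a minimal sketch of the call-site form (``@launch`` and ``%fp`` are
    hypothetical names, and the typed-pointer syntax follows the other examples
    in this document), an indirect call can carry ``convergent`` itself so that
    it is treated like a call to a convergent function:

    .. code-block:: llvm

        ; Hypothetical example: the callee is not known statically, so the
        ; call site is marked instead of a declaration.
        define void @launch(void ()* %fp) {
          call void %fp() convergent
          ret void
        }
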

    The optimizer may remove the ``convergent`` attribute on functions when it
    can prove that the function does not execute any convergent operations.
    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
    can prove that the call/invoke cannot call a convergent function.
``"frame-pointer"``
    This attribute tells the code generator whether the function
    should keep the frame pointer. The code generator may emit the frame pointer
    even if this attribute says the frame pointer can be eliminated.
    The allowed string values are:

    * ``"none"`` (default) - the frame pointer can be eliminated.
    * ``"non-leaf"`` - the frame pointer should be kept if the function calls
      other functions.
    * ``"all"`` - the frame pointer should be kept.
``hot``
    This attribute indicates that this function is a hot spot of the program
    execution. The function will be optimized more aggressively and will be
    placed into a special subsection of the text section to improve locality.

    When profile feedback is enabled, this attribute takes precedence over
    the profile information. By marking a function ``hot``, users can work
    around the cases where the training input does not have good coverage
    on all the hot functions.
``inaccessiblememonly``
    This attribute indicates that the function may only access memory that
    is not accessible by the module being compiled. This is a weaker form
    of ``readnone``. If the function reads or writes other memory, the
    behavior is undefined.
``inaccessiblemem_or_argmemonly``
    This attribute indicates that the function may only access memory that is
    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``. If the
    function reads or writes other memory, the behavior is undefined.
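
    A minimal sketch of how these two attributes might be attached to
    allocator-style declarations. The names ``@my_alloc`` and ``@my_release``
    are hypothetical, and whether a given allocator really satisfies these
    contracts is an assumption that has to be checked case by case:

    .. code-block:: llvm

        ; Assumed to touch only allocator-internal state that this module
        ; cannot access.
        declare noalias i8* @my_alloc(i64) inaccessiblememonly

        ; Assumed to touch only allocator-internal state and the memory
        ; reachable through its pointer argument.
        declare void @my_release(i8*) inaccessiblemem_or_argmemonly
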
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer. Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
``"no-inline-line-tables"``
    When this attribute is set to true, the inliner discards source locations
    when inlining code and instead uses the source location of the call site.
    Breakpoints set on code that was inlined into the current function will
    not fire during the execution of the inlined call sites. If the debugger
    stops inside an inlined call site, it will appear to be stopped at the
    outermost inlined call site.
``"no-jump-tables"``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function.
    LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``nofree``
    This function attribute indicates that the function does not, directly or
    transitively, call a memory-deallocation function (``free``, for example)
    on a memory allocation which existed before the call.

    As a result, uncaptured pointers that are known to be dereferenceable
    prior to a call to a function with the ``nofree`` attribute are still
    known to be dereferenceable after the call. The capturing condition is
    necessary in environments where the function might communicate the
    pointer to another thread which then deallocates the memory. Alternatively,
    ``nosync`` would ensure such communication cannot happen and even captured
    pointers cannot be freed by the function.

    A ``nofree`` function is explicitly allowed to free memory which it
    allocated or (if not ``nosync``) arrange for another thread to free
    memory on its behalf. As a result, perhaps surprisingly, a ``nofree``
    function can return a pointer to a previously deallocated memory object.
``noimplicitfloat``
    This attribute disables implicit floating-point instructions.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nomerge``
    This attribute indicates that calls to this function should never be merged
    during optimization. For example, it will prevent tail merging otherwise
    identical code sequences that raise an exception or terminate the program.
    Tail merging normally reduces the precision of source location information,
    making stack traces less useful for debugging. This attribute gives the
    user control over the tradeoff between code size and debug information
    precision.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``"indirect-tls-seg-refs"``
    This attribute indicates that the code generator should not use
    direct TLS access through segment registers, even if the
    target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally, i.e., through a return instruction. This produces undefined
    behavior at runtime if the function ever does dynamically return. Annotated
    functions may still raise an exception, i.e., ``nounwind`` is not implied.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.
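
    A minimal sketch combining ``noreturn`` and ``norecurse``. The functions
    ``@fatal`` and ``@checked`` are hypothetical, and marking ``@checked`` as
    ``norecurse`` rests on the assumption that ``@fatal`` never calls back
    into it:

    .. code-block:: llvm

        ; Hypothetical error handler that never returns through a ret.
        declare void @fatal(i8*) noreturn

        define i32 @checked(i32 %x) norecurse {
        entry:
          %bad = icmp slt i32 %x, 0
          br i1 %bad, label %fail, label %ok
        fail:
          ; Control cannot continue past a call to a noreturn function.
          call void @fatal(i8* null)
          unreachable
        ok:
          ret i32 %x
        }
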
``willreturn``
    This function attribute indicates that a call of this function will
    either exhibit undefined behavior or come back and continue execution
    at a point in the existing call stack that includes the current invocation.
    Annotated functions may still raise an exception, i.e., ``nounwind`` is not
    implied.
    If an invocation of an annotated function does not return control back
    to a point in the call stack, the behavior is undefined.
``nosync``
    This function attribute indicates that the function does not communicate
    (synchronize) with another thread through memory or other well-defined
    means. Synchronization is considered possible in the presence of ``atomic``
    accesses that enforce an order (that is, orderings other than "unordered"
    and "monotonic"), ``volatile`` accesses, as well as ``convergent`` function
    calls. Note that through ``convergent`` function calls non-memory
    communication, e.g., cross-lane operations, is possible and is also
    considered synchronization. However, ``convergent`` does not contradict
    ``nosync``. If an annotated function does ever synchronize with another
    thread, the behavior is undefined.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked nounwind may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation defined semantics.
``null_pointer_is_valid``
    If ``null_pointer_is_valid`` is set, then the ``null`` address
    in address-space 0 is considered to be a valid address for memory loads and
    stores. Any analysis or optimization should not treat dereferencing a
    pointer to ``null`` as undefined behavior in this function.
    Note: Comparing the address of a global variable to ``null`` may still
    evaluate to false because of a limitation in querying this attribute inside
    constant expressions.
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

    * ``"prologue-short-redirect"`` - This style of patchable
      function is intended to support patching a function prologue to
      redirect control away from the function in a thread-safe
      manner.
      It guarantees that the first instruction of the
      function will be large enough to accommodate a short jump
      instruction, and will be sufficiently aligned to allow being
      fully changed via an atomic compare-and-swap instruction.
      While the first requirement can be satisfied by inserting a large
      enough NOP, LLVM can and will try to re-purpose an existing
      instruction (i.e. one that would have to be emitted anyway) as
      the patchable instruction larger than a short jump.

      ``"prologue-short-redirect"`` is currently only supported on
      x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. Any semantic effects the
    patching may have must be separately conveyed via the linkage type.
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that accesses to the stack must be
    no further apart than the size of the guard region to a previous
    access of the stack. It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``readnone``
    On a function, this attribute indicates that the function computes its
    result (or decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc.) visible to
    caller functions.
    It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means while it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods (since they write to memory), there may
    be non-``C++`` mechanisms that throw exceptions without writing to LLVM
    visible memory.

    On an argument, this attribute indicates that the function does not
    dereference that pointer argument, even though it may read or write the
    memory that the pointer points to if accessed through other pointers.

    If a readnone function reads or writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads from
    or writes to a readnone pointer argument, the behavior is undefined.
``readonly``
    On a function, this attribute indicates that the function does not write
    through any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc.) visible to
    caller functions. It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state. This means while it
    cannot unwind exceptions by calling the ``C++`` exception throwing methods
    (since they write to memory), there may be non-``C++`` mechanisms that throw
    exceptions without writing to LLVM visible memory.

    On an argument, this attribute indicates that the function does not write
    through this pointer argument, even though it may write to the memory that
    the pointer points to.

    If a readonly function writes memory visible to the program, or
    has other side-effects, the behavior is undefined.
    If a function writes to
    a readonly pointer argument, the behavior is undefined.
``"stack-probe-size"``
    This attribute controls the behavior of stack probes: either
    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.

    If a function that has a ``"stack-probe-size"`` attribute is inlined into
    a function with another ``"stack-probe-size"`` attribute, the resulting
    function has the ``"stack-probe-size"`` attribute that has the lower
    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
    inlined into a function that has no ``"stack-probe-size"`` attribute
    at all, the resulting function has the ``"stack-probe-size"`` attribute
    of the callee.
``"no-stack-arg-probe"``
    This attribute disables ABI-required stack probes, if any.
``writeonly``
    On a function, this attribute indicates that the function may write to but
    does not read from memory.

    On an argument, this attribute indicates that the function may write to but
    does not read through this pointer argument (even though it may read from
    the memory that the pointer points to).

    If a writeonly function reads memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads
    from a writeonly pointer argument, the behavior is undefined.
``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed arguments,
    with arbitrary offsets. In other words, all memory operations in the
    function can refer to memory only using pointers based on its function
    arguments.
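
    A minimal sketch of ``argmemonly`` on declarations, alone and combined
    with ``readonly``. The functions ``@fill_bytes`` and ``@sum_words`` are
    hypothetical, and their behavior is assumed rather than taken from any
    real library:

    .. code-block:: llvm

        ; Assumed to access memory only through %dst.
        declare void @fill_bytes(i8* %dst, i8 %value, i64 %len) argmemonly

        ; Assumed to only read memory, and only through %src.
        declare i64 @sum_words(i64* %src, i64 %count) argmemonly readonly
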

    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute in order to specify that the function reads only from its
    arguments.

    If an argmemonly function reads or writes memory other than the pointer
    arguments, or has other side-effects, the behavior is undefined.
``returns_twice``
    This attribute indicates that this function can return twice. The C
    ``setjmp`` is an example of such a function. The compiler disables
    some optimizations (like tail calls) in the caller of these
    functions.
``safestack``
    This attribute indicates that
    `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
    protection is enabled for this function.

    If a function that has a ``safestack`` attribute is inlined into a
    function that doesn't have a ``safestack`` attribute or which has an
    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
    function will have a ``safestack`` attribute.
``sanitize_address``
    This attribute indicates that AddressSanitizer checks
    (dynamic address safety analysis) are enabled for this function.
``sanitize_memory``
    This attribute indicates that MemorySanitizer checks (dynamic detection
    of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``
    This attribute indicates that ThreadSanitizer checks
    (dynamic thread safety analysis) are enabled for this function.
``sanitize_hwaddress``
    This attribute indicates that HWAddressSanitizer checks
    (dynamic address safety analysis based on tagged pointers) are enabled for
    this function.
``sanitize_memtag``
    This attribute indicates that MemTagSanitizer checks
    (dynamic address safety analysis based on Armv8 MTE) are enabled for
    this function.
``speculative_load_hardening``
    This attribute indicates that
    `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
    should be enabled for the function body.

    Speculative Load Hardening is a best-effort mitigation against
    information leak attacks that make use of control flow
    misspeculation - specifically misspeculation of whether a branch
    is taken or not. Typically vulnerabilities enabling such attacks are
    classified as "Spectre variant #1". Notably, this does not attempt to
    mitigate against misspeculation of branch target, classified as
    "Spectre variant #2" vulnerabilities.

    When inlining, the attribute is sticky. Inlining a function that carries
    this attribute will cause the caller to gain the attribute. This is intended
    to provide a maximally conservative model where the code in a function
    annotated with this attribute will always (even after inlining) end up
    hardened.
``speculatable``
    This function attribute indicates that the function does not have any
    effects besides calculating its result and does not have undefined behavior.
    Note that ``speculatable`` is not enough to conclude that along any
    particular execution path the number of calls to this function will not be
    externally observable. This attribute is only valid on functions
    and declarations, not on individual call sites. If a function is
    incorrectly marked as speculatable and really does exhibit
    undefined behavior, the undefined behavior may be observed even
    if the call site is dead code.

``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary" --- a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten.
    A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    A function with the ``ssp`` attribute but without the ``alwaysinline``
    attribute cannot be inlined into a function without a
    ``ssp/sspreq/sspstrong`` attribute. If inlined, the caller will get the
    ``ssp`` attribute.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors. The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.
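
    A minimal sketch of a function where the two heuristics, as described
    above, would be expected to differ. The names ``@consume`` and
    ``@small_buffer`` are hypothetical. The 4-byte array is not larger than
    the default ``ssp-buffer-size`` of 8, so the ``ssp`` rules listed earlier
    would not select it, while the ``sspstrong`` rules cover arrays of any
    size:

    .. code-block:: llvm

        declare void @consume(i8*)

        define void @small_buffer() sspstrong {
          ; A small local array whose address is passed to another function.
          %buf = alloca [4 x i8]
          %p = getelementptr inbounds [4 x i8], [4 x i8]* %buf, i64 0, i64 0
          call void @consume(i8* %p)
          ret void
        }
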

    A function with the ``sspstrong`` attribute but without the
    ``alwaysinline`` attribute cannot be inlined into a function without a
    ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
    ``sspstrong`` attribute unless the ``sspreq`` attribute exists.
``sspreq``
    This attribute indicates that the function should *always* emit a stack
    smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
    attributes.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    A function with the ``sspreq`` attribute but without the ``alwaysinline``
    attribute cannot be inlined into a function without a
    ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
    ``sspreq`` attribute.

``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating-point semantics. LLVM will not attempt any
    optimizations that require assumptions about the floating-point rounding
    mode or that might alter the state of floating-point status flags that
    might otherwise be set or cleared by calling this function. LLVM will
    not introduce any new floating-point instructions that may trap.

``"denormal-fp-math"``
    This indicates the denormal (subnormal) handling that may be
    assumed for the default floating-point environment. This is a
    comma-separated pair.
    The elements may be one of ``"ieee"``,
    ``"preserve-sign"``, or ``"positive-zero"``. The first entry
    indicates the flushing mode for the result of floating point
    operations. The second indicates the handling of denormal inputs
    to floating point instructions. For compatibility with older
    bitcode, if the second value is omitted, both input and output
    modes will assume the same mode.

    If this attribute is not specified, the default is
    ``"ieee,ieee"``.

    If the output mode is ``"preserve-sign"`` or ``"positive-zero"``,
    denormal outputs may be flushed to zero by standard floating-point
    operations. It is not mandated that flushing to zero occurs, but if
    a denormal output is flushed to zero, it must respect the sign
    mode. Not all targets support all modes. While this indicates the
    expected floating point mode the function will be executed with,
    this does not make any attempt to ensure the mode is
    consistent. User or platform code is expected to set the floating
    point mode appropriately before function entry.

    If the input mode is ``"preserve-sign"`` or ``"positive-zero"``, a
    floating-point operation must treat any input denormal value as
    zero. In some situations, if an instruction does not respect this
    mode, the input may need to be converted to 0 as if by
    ``@llvm.canonicalize`` during lowering for correctness.

``"denormal-fp-math-f32"``
    Same as ``"denormal-fp-math"``, but only controls the behavior of
    the 32-bit float type (or vectors of 32-bit floats). If both
    are present, this overrides ``"denormal-fp-math"``. Not all targets
    support separately setting the denormal mode per type, and no
    attempt is made to diagnose unsupported uses. Currently this
    attribute is respected by the AMDGPU and NVPTX backends.
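
    As a rough illustration of how the two denormal attributes combine, the
    sketch below keeps IEEE behavior by default while allowing sign-preserving
    flushing for ``float`` only. The function name ``@scale`` and the attribute
    group number ``#0`` are hypothetical and chosen only for this example.

    .. code-block:: llvm

        define float @scale(float %x) #0 {
          ; A 32-bit float operation, so "denormal-fp-math-f32" applies here.
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }

        ; Other floating-point types use "ieee,ieee"; float uses
        ; "preserve-sign" for both the output and the input mode.
        attributes #0 = { "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="preserve-sign,preserve-sign" }

    Whether a backend honors the per-type setting depends on the target, as
    noted above.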

``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exceptions pass through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables -fcf-protection=<> for a specific
    entity to fine grain the HW control flow protection mechanism. The flag
    is target independent and currently appertains to a function or function
    pointer.
``shadowcallstack``
    This attribute indicates that the ShadowCallStack checks are enabled for
    the function. The instrumentation checks that the return address for the
    function has not changed between the function prolog and epilog. It is
    currently x86_64-specific.
``mustprogress``
    This attribute indicates that the function is required to return, unwind,
    or interact with the environment in an observable way, e.g. via a volatile
    memory access, I/O, or other synchronization. The ``mustprogress``
    attribute is intended to model the requirements of the first section of
    [intro.progress] of the C++ Standard. As a consequence, a loop in a
    function with the ``mustprogress`` attribute can be assumed to terminate if
    it does not interact with the environment in an observable way, and
    terminating loops without side-effects can be removed. If a ``mustprogress``
    function does not satisfy this contract, the behavior is undefined.
    This attribute does not apply transitively to callees, but does apply to call
    sites within the function. Note that ``willreturn`` implies ``mustprogress``.
``vscale_range(<min>[, <max>])``
    This attribute indicates the minimum and maximum vscale value for the given
    function. A value of 0 means unbounded. If the optional max value is omitted
    then max is set to the value of min. If the attribute is not present, no
    assumptions are made about the range of vscale.

Call Site Attributes
--------------------

In addition to function attributes the following call site only
attributes are supported:

``vector-function-abi-variant``
    This attribute can be attached to a :ref:`call <i_call>` to list
    the vector functions associated with the function. Notice that the
    attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
    :ref:`callbr <i_callbr>` instruction. The attribute consists of a
    comma separated list of mangled names. The order of the list does
    not imply preference (it is logically a set). The compiler is free
    to pick any listed vector function of its choosing.

    The syntax for the mangled names is as follows::

        _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]

    When present, the attribute informs the compiler that the function
    ``<scalar_name>`` has a corresponding vector variant that can be
    used to perform the concurrent invocation of ``<scalar_name>`` on
    vectors. The shape of the vector function is described by the
    tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
    token. The standard name of the vector function is
    ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``.
    When present,
    the optional token ``(<vector_redirection>)`` informs the compiler
    that a custom name is provided in addition to the standard one
    (custom names can be provided for example via the use of ``declare
    variant`` in OpenMP 5.0). The declaration of the variant must be
    present in the IR Module. The signature of the vector variant is
    determined by the rules of the Vector Function ABI (VFABI)
    specifications of the target. For Arm and X86, the VFABI can be
    found at https://github.com/ARM-software/abi-aa and
    https://software.intel.com/en-us/articles/vector-simd-function-abi,
    respectively.

    For X86 and Arm targets, the values of the tokens in the standard
    name are those that are defined in the VFABI. LLVM has an internal
    ``<isa>`` token that can be used to create scalar-to-vector
    mappings for functions that are not directly associated to any of
    the target ISAs (for example, some of the mappings stored in the
    TargetLibraryInfo).
    Valid values for the ``<isa>`` token are::

        <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
              | n | s          -> Armv8 Advanced SIMD, SVE
              | __LLVM__       -> Internal LLVM Vector ISA

    For all targets currently supported (x86, Arm and Internal LLVM),
    the remaining tokens can have the following values::

        <mask>:= M | N         -> mask | no mask

        <vlen>:= number        -> number of lanes
               | x             -> VLA (Vector Length Agnostic)

        <parameters>:= v              -> vector
                     | l | l <number> -> linear
                     | R | R <number> -> linear with ref modifier
                     | L | L <number> -> linear with val modifier
                     | U | U <number> -> linear with uval modifier
                     | ls <pos>       -> runtime linear
                     | Rs <pos>       -> runtime linear with ref modifier
                     | Ls <pos>       -> runtime linear with val modifier
                     | Us <pos>       -> runtime linear with uval modifier
                     | u              -> uniform

        <scalar_name>:= name of the scalar function

        <vector_redirection>:= optional, custom name of the vector function

``preallocated(<ty>)``
    This attribute is required on calls to ``llvm.call.preallocated.arg``
    and cannot be used on any other call. See
    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
    details.

.. _glattrs:

Global Attributes
-----------------

Attributes may be set to communicate additional information about a global variable.
Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
are grouped into a single :ref:`attribute group <attrgrp>`.

.. _opbundles:

Operand Bundles
---------------

Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call`` s and
``invoke`` s). In a way they are like metadata, but dropping them is
incorrect and will change program semantics.

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles. This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.

Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages. While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program. These restrictions
are described as the semantics of an "unknown" operand bundle. As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target is
  ``readnone`` or ``readonly``), unless they're overridden with
  callsite specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function. Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. Exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``. They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap. Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles. Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation. The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

    define void @f() {
      call void @x()  ;; no deopt state
      call void @y() [ "deopt"(i32 10) ]
      call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
      ret void
    }

    define void @g() {
      call void @f() [ "deopt"(i32 20) ]
      ret void
    }

will result in

.. code-block:: llvm

    define void @g() {
      call void @x()  ;; still no deopt state
      call void @y() [ "deopt"(i32 20, i32 10) ]
      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
      ret void
    }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag. These operand bundles indicate that a call site
is within a particular funclet. There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
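
The funclet rules above can be sketched with a small cleanup funclet. This is
only an illustration: the functions ``@may_throw`` and ``@cleanup_helper`` are
hypothetical, and the personality routine is assumed to be the MSVC C++ one.

.. code-block:: llvm

    declare i32 @__CxxFrameHandler3(...)
    declare void @may_throw()
    declare void @cleanup_helper()

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      ; No funclet pad has been entered yet, so this call site carries no
      ; "funclet" bundle.
      invoke void @may_throw()
              to label %exit unwind label %cleanup

    cleanup:
      %pad = cleanuppad within none []
      ; Inside the funclet: the bundle names the most recently entered,
      ; not-yet-exited pad.
      call void @cleanup_helper() [ "funclet"(token %pad) ]
      cleanupret from %pad unwind to caller

    exit:
      ret void
    }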

GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition between a function with one GC strategy to a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code. For more details, see :ref:`GC Transitions
<gc_transition_args>`.

The bundle contains an arbitrary list of Values which need to be passed
to GC transition code. They will be lowered and passed as operands to
the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
that these arguments must be available before and after (but not
necessarily during) the execution of the callee.

.. _assume_opbundles:

Assume Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^

Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
assumptions that a :ref:`parameter attribute <paramattrs>` or a
:ref:`function attribute <fnattrs>` holds for a certain value at a certain
location. Operand bundles enable assumptions that are either hard or impossible
to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.

An assume operand bundle has the form:

::

      "<tag>"([ <holds for value> [, <attribute argument>] ])

* The tag of the operand bundle is usually the name of an attribute that can be
  assumed to hold. It can also be ``ignore``; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.

If there are no arguments the attribute is a property of the call location.

If the represented attribute expects a constant argument, the argument provided
to the operand bundle should be a constant as well.

For example:

.. code-block:: llvm

      call void @llvm.assume(i1 true) ["align"(i32* %val, i32 8)]

allows the optimizer to assume that at the location of the call to
:ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.

.. code-block:: llvm

      call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(i64* %val)]

allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
call location is cold and that ``%val`` may not be null.

Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
provided guarantees are violated at runtime the behavior is undefined.

Even if the assumed property can be encoded as a boolean value, like
``nonnull``, using operand bundles to express the property can still have
benefits:

* Attributes that can be expressed via operand bundles are directly the
  property that the optimizer uses and cares about. Encoding attributes as
  operand bundles removes the need for an instruction sequence that represents
  the property (e.g., ``icmp ne i32* %p, null`` for ``nonnull``) and for the
  optimizer to deduce the property from that instruction sequence.
* Expressing the property using operand bundles makes it easy to identify the
  use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
  simplifies and improves heuristics, e.g., for "use-sensitive"
  optimizations.

.. _ob_preallocated:

Preallocated Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Preallocated operand bundles are characterized by the ``"preallocated"``
operand bundle tag. These operand bundles allow separation of the allocation
of the call argument memory from the call site.
This is necessary to pass
non-trivially copyable objects by value in a way that is compatible with MSVC
on some targets. There can be at most one ``"preallocated"`` operand bundle
attached to a call site and it must have exactly one bundle operand, which is
a token generated by ``@llvm.call.preallocated.setup``. A call with this
operand bundle should not adjust the stack before entering the function, as
that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.

.. code-block:: llvm

      %foo = type { i64, i32 }

      ...

      %t = call token @llvm.call.preallocated.setup(i32 1)
      %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
      %b = bitcast i8* %a to %foo*
      ; initialize %b
      call void @bar(i32 42, %foo* preallocated(%foo) %b) ["preallocated"(token %t)]

.. _ob_gc_live:

GC Live Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
intrinsic. The operand bundle must contain every pointer to a garbage collected
object which potentially needs to be updated by the garbage collector.

When lowered, any relocated value will be recorded in the corresponding
:ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description
for further details.

ObjC ARC Attached Call Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
implicitly followed by a marker instruction and a call to an ObjC runtime
function that uses the result of the call. If the argument passed to the operand
bundle is 0, ``@objc_retainAutoreleasedReturnValue`` is called. If 1 is passed,
``@objc_unsafeClaimAutoreleasedReturnValue`` is called. A call with this bundle
implicitly uses its return value.
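
A minimal sketch of such a call site follows. The callee ``@make_object`` is
hypothetical, and writing the bundle operand as an ``i64`` constant is an
assumption of this example rather than something stated above.

.. code-block:: llvm

      declare i8* @make_object()

      define i8* @caller() {
        ; Operand 0 selects @objc_retainAutoreleasedReturnValue; operand 1
        ; would select @objc_unsafeClaimAutoreleasedReturnValue instead.
        %obj = call i8* @make_object() [ "clang.arc.attachedcall"(i64 0) ]
        ret i8* %obj
      }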

The operand bundle is needed to ensure the call is immediately followed by the
marker instruction or the ObjC runtime call in the final output.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.

.. _langref_datalayout:

Data Layout
-----------

A module may specify a target specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

    target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits. Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8 bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``P<address space>``
    Specifies the address space that corresponds to program memory.
    Harvard architectures can use this to specify what space LLVM
    should place things such as functions into. If omitted, the
    program memory space defaults to the default address space of 0,
    which corresponds to a Von Neumann architecture that has code
    and data in the same space.
``G<address space>``
    Specifies the address space to be used by default when creating global
    variables. If omitted, the globals address space defaults to the default
    address space 0.
    Note: variable declarations without an address space are always created in
    address space 0; this property only affects the default value to be used
    when creating globals without additional contextual information (e.g. in
    LLVM passes).
``A<address space>``
    Specifies the address space of objects created by '``alloca``'.
    Defaults to the default address space of 0.
``p[n]:<size>:<abi>:<pref>:<idx>``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index used for address calculation. If not
    specified, the default index size is equal to the pointer size. All sizes
    are in bits. The address space, ``n``, is optional, and if not specified,
    denotes the default address space 0. The value of ``n`` must be
    in the range [1,2^23).
``i<size>:<abi>:<pref>``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
``v<size>:<abi>:<pref>``
    This specifies the alignment for a vector type of a given bit
    ``<size>``.
``f<size>:<abi>:<pref>``
    This specifies the alignment for a floating-point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
``a:<abi>:<pref>``
    This specifies the alignment for an object of aggregate type.
``F<type><abi>``
    This specifies the alignment for function pointers.
    The options for ``<type>`` are:

    * ``i``: The alignment of function pointers is independent of the alignment
      of functions, and is a multiple of ``<abi>``.
    * ``n``: The alignment of function pointers is a multiple of the explicit
      alignment specified on the function, and is a multiple of ``<abi>``.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character. The mangling style
    options are

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
      starting with ``?`` are not mangled in any way.
    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
      symbols do not receive a ``_`` prefix.
    * ``a``: XCOFF mangling: Private symbols get a ``L..`` prefix.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
    address space cannot be specified as non-integral.

On every specification that takes a ``<abi>:<pref>``, specifying the
``<pref>`` alignment is optional. If omitted, the preceding ``:``
should be omitted too and ``<pref>`` will be equal to ``<abi>``.

When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

-  ``E`` - big endian
-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
   same as the default address space.
-  ``S0`` - natural stack alignment is unspecified
-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
-  ``i8:8:8`` - i8 is 8-bit (byte) aligned
-  ``i16:16:16`` - i16 is 16-bit aligned
-  ``i32:32:32`` - i32 is 32-bit aligned
-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
   alignment of 64-bits
-  ``f16:16:16`` - half is 16-bit aligned
-  ``f32:32:32`` - float is 32-bit aligned
-  ``f64:64:64`` - double is 64-bit aligned
-  ``f128:128:128`` - quad is 128-bit aligned
-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
-  ``a:0:64`` - aggregates are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,
   that specification is used.
#. If no match is found, and the type sought is an integer type, then
   the smallest integer type that is larger than the bitwidth of the
   sought type is used. If none of the specifications are larger than
   the bitwidth then the largest integer type is used. For example,
   given the default specifications above, the i7 type will use the
   alignment of i8 (next largest) while both i65 and i256 will use the
   alignment of i64 (largest specified).
#. If no match is found, and the type sought is a vector type, then the
   largest vector type that is smaller than the sought vector type will
   be used as a fall back. This happens because <128 x double> can be
   implemented in terms of 64 <2 x double>, for example.

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects.
This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. There is no way to generate IR
that does not embed this target-specific detail into the IR. If you
don't specify the string, the default specifications will be used to
generate a Data Layout and the optimization phases will operate
accordingly and introduce target specificity into the IR with respect to
these default specifications.

.. _langref_triple:

Target Triple
-------------

A module may specify a target triple string that describes the target
host. The syntax for the target triple is simply:

.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` command line option.

.. _objectlifetime:

Object Lifetime
---------------

A memory object, or simply object, is a region of a memory space that is
reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
allocation calls, and global variable definitions.
Once it is allocated, the bytes stored in the region can only be read or written
through a pointer that is :ref:`based on <pointeraliasing>` the allocation
value.
If a pointer that is not based on the object tries to read or write to the
object, it is undefined behavior.

A lifetime of a memory object is a property that decides its accessibility.
Unless stated otherwise, a memory object is alive since its allocation, and
dead after its deallocation.
It is undefined behavior to access a memory object that isn't alive, but
operations that don't dereference it such as
:ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
:ref:`icmp <i_icmp>` return a valid result.
This explains code motion of these instructions across operations that
impact the object's lifetime.
A stack object's lifetime can be explicitly specified using
:ref:`llvm.lifetime.start <int_lifestart>` and
:ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.

.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access, otherwise the behavior is
undefined. Pointer values are associated with address ranges according
to the following rules:

-  A pointer value is associated with the addresses associated with any
   value it is *based* on.
-  An address of a global variable is associated with the address range
   of the variable's storage.
-  The result value of an allocation instruction is associated with the
   address range of the allocated storage.
-  A null pointer in the default address-space is associated with no
   address.
-  An :ref:`undef value <undefvalues>` in *any* address-space is
   associated with no address.
-  An integer constant other than zero or a pointer value returned from
   a function not defined within LLVM may be associated with address
   ranges allocated through mechanisms other than those provided by
   LLVM. Such ranges shall not overlap with any ranges of addresses
   allocated by mechanisms provided by LLVM.
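
As a small sketch of the association rules above (the names ``@counter`` and
``@example`` are hypothetical and used only for illustration):

.. code-block:: llvm

    @counter = global i32 0

    define i32 @example() {
      ; %slot is associated with the address range of the allocated storage.
      %slot = alloca i32
      store i32 7, i32* %slot
      ; @counter is associated with the address range of the global's storage.
      %old = load i32, i32* @counter
      ; A null pointer in the default address space is associated with no
      ; address, so an access such as the following would be undefined:
      ;   store i32 0, i32* null
      %new = load i32, i32* %slot
      %sum = add i32 %old, %new
      ret i32 %sum
    }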

A pointer value is *based* on another pointer value according to the
following rules:

- A pointer value formed from a scalar ``getelementptr`` operation is *based* on
  the pointer-typed operand of the ``getelementptr``.
- The pointer in lane *l* of the result of a vector ``getelementptr`` operation
  is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
  of the ``getelementptr``.
- The result value of a ``bitcast`` is *based* on the operand of the
  ``bitcast``.
- A pointer value formed by an ``inttoptr`` is *based* on all pointer
  values that contribute (directly or indirectly) to the computation of
  the pointer's value.
- The "*based* on" relationship is transitive.

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.

.. _pointercapture:

Pointer Capture
---------------

Given a function call and a pointer that is passed as an argument or stored in
the memory before the call, a pointer is *captured* by the call if it makes a
copy of any part of the pointer that outlives the call.
To be precise, a pointer is captured if one or more of the following conditions
hold:

1. The call stores any bit of the pointer carrying information into a place,
   and the stored bits can be read from the place by the caller after this call
   exits.

.. code-block:: llvm

    @glb  = global i8* null
    @glb2 = global i8* null
    @glb3 = global i8* null
    @glbi = global i32 0

    define i8* @f(i8* %a, i8* %b, i8* %c, i8* %d, i8* %e) {
      store i8* %a, i8** @glb ; %a is captured by this call

      store i8* %b,   i8** @glb2 ; %b isn't captured because the stored value is overwritten by the store below
      store i8* null, i8** @glb2

      store i8* %c,   i8** @glb3
      call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
      store i8* null, i8** @glb3

      %i = ptrtoint i8* %d to i64
      %j = trunc i64 %i to i32
      store i32 %j, i32* @glbi ; %d is captured

      ret i8* %e ; %e is captured
    }

2. The call stores any bit of the pointer carrying information into a place,
   and the stored bits can be safely read from the place by another thread via
   synchronization.

.. code-block:: llvm

    @lock = global i1 true

    define void @f(i8* %a) {
      store i8* %a, i8** @glb
      store atomic i1 false, i1* @lock release ; %a is captured because another thread can safely read @glb
      store i8* null, i8** @glb
      ret void
    }

3. The call's behavior depends on any bit of the pointer carrying information.

.. code-block:: llvm

    @glb = global i8 0

    define void @f(i8* %a) {
      %c = icmp eq i8* %a, @glb
      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; escapes %a
    BB_EXIT:
      call void @exit()
      unreachable
    BB_CONTINUE:
      ret void
    }

4. The pointer is used in a volatile access as its address.


.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

A volatile load or store may have additional target-specific semantics.
Any volatile operation can have side effects, and any volatile operation
can read and/or modify state which is not accessible via a regular load
or store in this module. Volatile operations may use addresses which do
not point to memory (like MMIO registers). This means the compiler may
not use a volatile operation to prove a non-volatile access to that
address has defined behavior.

The allowed side-effects for volatile accesses are limited. If a
non-volatile store to a given address would be legal, a volatile
operation may modify the memory at that address. A volatile operation
may not modify any other memory accessible by the module being compiled.
A volatile operation may not call any code in the current module.

The compiler may assume execution will continue after a volatile operation,
so operations which modify memory or may have undefined behavior can be
hoisted past a volatile operation.

IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
Likewise, the backend should never split or merge target-legal volatile
load/store instructions.
Similarly, IR-level volatile loads and stores cannot
change from integer to floating-point or vice versa.

.. admonition:: Rationale

   Platforms may rely on volatile loads and stores of natively supported
   data width to be executed as a single instruction. For example, in C
   this holds for an l-value of volatile primitive type with native
   hardware support, but not necessarily for aggregate types. The
   frontend upholds these expectations, which are intentionally
   unspecified in the IR. The rules above ensure that IR transformations
   do not violate the frontend's contract with the language.

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

- Is a superset of single-thread program order, and
- When a *synchronizes-with* ``b``, includes an edge from ``a`` to
  ``b``. *Synchronizes-with* pairs are introduced by platform-specific
  techniques, like pthread locks, thread creation, thread joining,
  etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
  Constraints <ordering>`).

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.).
For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

- If write\ :sub:`1` happens before write\ :sub:`2`, and
  write\ :sub:`2` happens before R\ :sub:`byte`, then
  R\ :sub:`byte` does not see write\ :sub:`1`.
- If R\ :sub:`byte` happens before write\ :sub:`3`, then
  R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

- If R is volatile, the result is target-dependent. (Volatile is
  supposed to give guarantees which can support ``sig_atomic_t`` in
  C/C++, and may be used for accesses to addresses that do not behave
  like normal memory. It does not generally provide cross-thread
  synchronization.)
- Otherwise, if there is no write to the same byte that happens before
  R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
- Otherwise, if R\ :sub:`byte` may see exactly one write,
  R\ :sub:`byte` returns the value written by that write.
- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
  see are atomic, it chooses one of the values written. See the :ref:`Atomic
  Memory Ordering Constraints <ordering>` section for additional
  constraints on how the choice is made.
- Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.
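
The per-byte rule can be seen in a small single-threaded sketch. The function
name ``@partial_undef`` is hypothetical and used only for illustration; the
example assumes nothing beyond the rules stated above.

.. code-block:: llvm

    define i16 @partial_undef() {
    entry:
      %p = alloca i16
      %b0 = bitcast i16* %p to i8*

      ; Only the byte at offset 0 of the object is written.
      store i8 1, i8* %b0

      ; The 2-byte read sees exactly one write for the byte at offset 0 and
      ; returns its value. No write to the byte at offset 1 happens before
      ; the read, so that byte is undef. The loaded i16 is therefore partly
      ; defined rather than undef as a whole.
      %v = load i16, i16* %p
      ret i16 %v
    }

Which bits of ``%v`` correspond to the defined byte depends on the target's
endianness.
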

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial. If these
descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C++0x/C1x
    ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. (This isn't a
    complete description; see the C++0x definition of a release
    sequence.) This corresponds to the C++0x/C1x
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation that only reads, ``release`` for an operation that only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses, which is
    consistent with the *happens-before* partial order and with the
    modification orders of all the affected addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C++0x/C1x
    ``memory_order_seq_cst`` and Java volatile.

.. _syncscope:

If an atomic operation is marked ``syncscope("singlethread")``, it only
*synchronizes with* and only participates in the seq\_cst total orderings of
other operations running in the same thread (for example, in signal handlers).

If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target specific synchronization scope, then it is target
dependent if it *synchronizes with* and participates in the seq\_cst total
orderings of other operations.

Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
seq\_cst total orderings of other operations that are not marked
``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

.. _floatenv:

Floating-Point Environment
--------------------------

The default LLVM floating-point environment assumes that floating-point
instructions do not have side effects. Results assume the round-to-nearest
rounding mode. No floating-point exception state is maintained in this
environment. Therefore, there is no attempt to create or preserve invalid
operation (SNaN) or division-by-zero exceptions.
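
As a small sketch of what the default environment means in practice, consider
a division whose divisor is zero. The function name ``@div_by_zero`` is
hypothetical and used only for illustration.

.. code-block:: llvm

    define double @div_by_zero(double %x) {
    entry:
      ; In the default environment this instruction is treated as having no
      ; side effects: no division-by-zero exception state is created or
      ; preserved. It simply yields a value (an infinity for a finite,
      ; nonzero %x, or a NaN when %x is zero or NaN).
      %r = fdiv double %x, 0.0
      ret double %r
    }

Because the ``fdiv`` here is only a value computation, nothing in the default
environment ties it to a particular point in the program.
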

The benefit of this exception-free assumption is that floating-point
operations may be speculated freely without any other fast-math relaxations
to the floating-point model.

Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
:ref:`select <i_select>` and :ref:`call <i_call>`
may use the following flags to enable otherwise unsafe
floating-point transformations.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
   produces a :ref:`poison value <poisonvalues>` instead.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant. This does not imply that -0.0
   is poison and/or guaranteed to not exist in the operation.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add). This does not enable reassociating
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change results in floating-point.

``fast``
   This flag implies all of the others.

.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:


The void type does not represent any value and has no size.

:Syntax:


::

    void


.. _t_function:

Function Type
-------------

:Overview:


The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or a first class type, except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.

:Syntax:

::

    <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers.
Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - A function taking an ``i32``, returning an ``i32``.

   * - ``float (i16, i32 *) *``
     - :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a
       :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.

   * - ``i32 (i8*, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       to ``i8`` (char in C), which returns an integer. This is the
       signature for ``printf`` in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values.

.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a very simple type that simply specifies an
arbitrary bit width for the integer type desired. Any bit width from 1
bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.

:Syntax:

::

    iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

+----------------+------------------------------------------------+
| ``i1``         | a single-bit integer.                          |
+----------------+------------------------------------------------+
| ``i32``        | a 32-bit integer.                              |
+----------------+------------------------------------------------+
| ``i1942652``   | a really big integer of over 1 million bits.   |
+----------------+------------------------------------------------+

.. _t_floating:

Floating-Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating-point value

   * - ``bfloat``
     - 16-bit "brain" floating-point value (7-bit significand). Provides the
       same number of exponent bits as ``float``, so that it matches its dynamic
       range, but with greatly reduced precision. Used in Intel's AVX-512 BF16
       extensions and Arm's ARMv8.6-A extensions, among others.

   * - ``float``
     - 32-bit floating-point value

   * - ``double``
     - 64-bit floating-point value

   * - ``fp128``
     - 128-bit floating-point value (113-bit significand)

   * - ``x86_fp80``
     - 80-bit floating-point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bits)

The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
respectively.

X86_amx Type
""""""""""""

:Overview:

The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
are allowed: stride load and store, zero, and dot product. No instruction is
allowed for this type. There are no arguments, arrays, pointers, vectors
or constants of this type.

:Syntax:

::

    x86_amx


X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

    x86_mmx


.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

LLVM is in the process of transitioning to opaque pointers. Opaque pointers do
not have a pointee type. Rather, instructions interacting through pointers
specify the type of the underlying memory they are interacting with. Opaque
pointers are still in the process of being worked on and are not complete.

:Syntax:

::

    <type> *
    ptr

:Examples:

.. list-table::

   * - ``[4 x i32]*``
     - A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32``
       values.

   * - ``i32 (i32*) *``
     - A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32*``, returning an ``i32``.

   * - ``i32 addrspace(5)*``
     - A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in
       address space 5.

   * - ``ptr``
     - An opaque pointer type to a value that resides in address space 0.

   * - ``ptr addrspace(5)``
     - An opaque pointer type to a value that resides in address space 5.

.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements), an underlying primitive data type,
and a scalable property to represent vectors where the exact hardware
vector length is unknown at compile time. Vector types are considered
:ref:`first class <t_firstclass>`.

:Memory Layout:

In general vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
integer type with N*M bits, and then following the rules for storing such an
integer to memory.

A bitcast from a vector type to a scalar integer type will see the elements
being packed together (without padding). The order in which elements are
inserted in the integer depends on endianness.
For little endian, element zero
is put in the least significant bits of the integer, and for big endian,
element zero is put in the most significant bits.

Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
with the analogy that we can replace a vector store by a bitcast followed by
an integer store, we get this for big endian:

.. code-block:: llvm

    %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

    ; Bitcasting from a vector to an integral type can be seen as
    ; concatenating the values:
    ;   %val now has the hexadecimal value 0x1235.

    store i16 %val, i16* %ptr

    ; In memory the content will be (8-bit addressing):
    ;
    ;    [%ptr + 0]: 00010010  (0x12)
    ;    [%ptr + 1]: 00110101  (0x35)

The same example for little endian:

.. code-block:: llvm

    %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

    ; Bitcasting from a vector to an integral type can be seen as
    ; concatenating the values:
    ;   %val now has the hexadecimal value 0x5321.

    store i16 %val, i16* %ptr

    ; In memory the content will be (8-bit addressing). A little-endian
    ; store places the least significant byte at the lowest address:
    ;
    ;    [%ptr + 0]: 00100001  (0x21)
    ;    [%ptr + 1]: 01010011  (0x53)

When ``<N*M>`` isn't evenly divisible by the byte size the exact memory layout
is unspecified (just like it is for an integral type of the same size). This
is because different targets could put the padding at different positions when
the type size is smaller than the type's store size.

:Syntax:

::

    < <# elements> x <elementtype> >          ; Fixed-length vector
    < vscale x <# elements> x <elementtype> > ; Scalable vector

The number of elements is a constant integer value larger than 0;
elementtype may be any integer, floating-point or pointer type. Vectors
of size zero are not allowed.
For scalable vectors, the total number of
elements is a constant multiple (called vscale) of the specified number
of elements; vscale is a positive integer that is unknown at compile time
and the same hardware-dependent constant for all scalable vectors at run
time. The size of a specific scalable vector type is thus constant within
IR, even if the exact size in bytes cannot be determined until run time.

:Examples:

+------------------------+----------------------------------------------------+
| ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
+------------------------+----------------------------------------------------+
| ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<4 x i64*>``         | Vector of 4 pointers to 64-bit integer values.     |
+------------------------+----------------------------------------------------+
| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
+------------------------+----------------------------------------------------+

.. _t_label:

Label Type
^^^^^^^^^^

:Overview:

The label type represents code labels.

:Syntax:

::

      label

.. _t_token:

Token Type
^^^^^^^^^^

:Overview:

The token type is used when a value is associated with an instruction
but all uses of the value must not attempt to introspect or obscure it.
As such, it is not appropriate to have a :ref:`phi <i_phi>` or
:ref:`select <i_select>` of type token.

:Syntax:

::

      token



.. 
_t_metadata:

Metadata Type
^^^^^^^^^^^^^

:Overview:

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

:Syntax:

::

      metadata

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

      [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

+------------------+--------------------------------------+
| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
+------------------+--------------------------------------+

Here are some examples of multidimensional arrays:

+-----------------------------+----------------------------------------------------------+
| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
+-----------------------------+----------------------------------------------------------+
| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   
|
+-----------------------------+----------------------------------------------------------+
| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
+-----------------------------+----------------------------------------------------------+

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.

.. _t_struct:

Structure Type
""""""""""""""

:Overview:

The structure type is used to represent a collection of data members
together in memory. The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one.
Identified types can be
recursive, can be opaque, and are never uniqued.

:Syntax:

::

      %T1 = type { <type list> }     ; Identified normal struct type
      %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

+------------------------------+----------------------------------------------------------------------+
| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values.                                    |
+------------------------------+----------------------------------------------------------------------+
| ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second        |
|                              | element is a :ref:`pointer <t_pointer>` to a                         |
|                              | :ref:`function <t_function>` that takes an ``i32``, returning an     |
|                              | ``i32``.                                                             |
+------------------------------+----------------------------------------------------------------------+
| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                         |
+------------------------------+----------------------------------------------------------------------+

.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure. They can be named (``%X``) or
unnamed (``%52``).
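
As a hedged illustration of the forward-declaration analogy, the sketch below
declares an opaque type and only ever handles it through pointers. The names
``%struct.Handle``, ``@open_handle``, ``@close_handle`` and ``@use_handle``
are hypothetical, and the snippet assumes the typed-pointer syntax used by the
other examples in this document:

.. code-block:: llvm

      ; No body is given in this module, much like "struct Handle;" in C.
      %struct.Handle = type opaque

      ; Hypothetical external functions that create and destroy a handle.
      declare %struct.Handle* @open_handle()
      declare void @close_handle(%struct.Handle*)

      define void @use_handle() {
      entry:
        ; The value is only passed around by pointer; its layout is unknown here.
        %h = call %struct.Handle* @open_handle()
        call void @close_handle(%struct.Handle* %h)
        ret void
      }

Because the type has no body, it has no size in this module, so the sketch
never allocates, loads, or stores a ``%struct.Handle`` value directly.
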

:Syntax:

::

      %X = type opaque
      %52 = type opaque

:Examples:

+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+

.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating-point constants**
    Floating-point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating-point constants must have a
    :ref:`floating-point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating-point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'.
The only time hexadecimal floating-point
constants are required (and the only time that they are generated by the
disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types bfloat, half, float, and
double are represented using the 16-digit form shown above (which matches the
IEEE754 representation for double); bfloat, half and float values must, however,
be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
precision respectively. Hexadecimal format is always used for long double, and
there are three forms of long double. The 80-bit format used by x86 is
represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
by 32 hexadecimal digits. Long doubles will only work if they match the long
double format on your target. The IEEE 16-bit format (half precision) is
represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
format is represented by ``0xR`` followed by 4 hexadecimal digits. All
hexadecimal formats are big-endian (sign bit at the left).

There are no constants of type x86_mmx or x86_amx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.
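
To make the recursive combination concrete, here is a minimal sketch of a
module whose global initializers nest simple constants inside structure,
array, and vector constants. The names ``%Pair``, ``@pairs``, ``@mixed`` and
``@zeros`` are hypothetical and chosen only for illustration:

.. code-block:: llvm

      %Pair = type { i32, float }

      ; An array constant whose elements are structure constants.
      @pairs = constant [2 x %Pair] [ %Pair { i32 1, float 1.0 },
                                      %Pair { i32 2, float 2.5 } ]

      ; A literal structure constant holding a character array and a vector.
      @mixed = constant { [3 x i8], <2 x i32> }
                        { [3 x i8] c"hi\00", <2 x i32> <i32 42, i32 11> }

      ; A nested array whose elements are all zero.
      @zeros = global [4 x [4 x i32]] zeroinitializer

In each initializer the number and types of the elements match the declared
type, which is the requirement stated for every kind of complex constant in
this section.
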

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than's (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to initialize a value of
    *any* type to zero, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a constant tuple without types. For example:
    "``!{!0, !{!2, !0}, !"test"}``".
    Metadata can reference constant values,
    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
    Unlike other typed constants that are meant to be interpreted as part of
    the instruction stream, metadata is a place to attach additional
    information such as debug info.

Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

      @X = global i32 17
      @Y = global i32 42
      @Z = global [2 x i32*] [ i32* @X, i32* @Y ]

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type (other than '``label``'
or '``void``') and be used anywhere a constant is permitted.

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. 
code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Safe:
      %A = %X  ;; By choosing undef as 0
      %B = %X  ;; By choosing undef as -1
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.

.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.

.. 
code-block:: text

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (and it also matches
C semantics). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

To ensure all uses of a given register observe the same value (even if
'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.

.. code-block:: llvm

      %A = sdiv undef, %X
      %B = sdiv %X, undef
    Safe:
      %A = 0
    b: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``0``', because the '``undef``'
could be zero, and zero divided by any value is zero.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero.
Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: text

      a:  store undef -> %X
      b:  store %X -> undef
    Safe:
      a: <deleted>
      b: unreachable

A store *of* an undefined value can be assumed to not have any effect;
we can assume that the value is overwritten with bits that happen to
match what was already there. However, a store *to* an undefined
location could clobber arbitrary memory, therefore, it has undefined
behavior.

Branching on an undefined value is undefined behavior.
This explains optimizations that depend on branch conditions to construct
predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the branch condition should be frozen;
otherwise it is undefined behavior.

.. code-block:: text

    Unsafe:
      br undef, BB1, BB2 ; UB

      %X = and i32 undef, 255
      switch %X, label %ret [ .. ] ; UB

      store undef, i8* %ptr
      %X = load i8* %ptr ; %X is undef
      switch i8 %X, label %ret [ .. ] ; UB

    Safe:
      %X = or i8 undef, 255 ; always 255
      switch i8 %X, label %ret [ .. ] ; Well-defined

      %X = freeze i1 undef
      br %X, BB1, BB2 ; Well-defined (non-deterministic jump)


This is also consistent with the behavior of MemorySanitizer.
MemorySanitizer, a detector of uses of uninitialized memory,
defines a branch with a condition that depends on an undef value (or
certain other values, such as the result of a load from heap-allocated
memory that has never been stored to) to have an externally visible
side effect.
For this reason functions with the *sanitize_memory*
attribute are not allowed to produce such branches "out of thin
air". More strictly, an optimization that inserts a conditional branch
is only valid if in all executions where the branch condition has at
least one undefined bit, the same branch condition is evaluated in the
input IR as well.

.. _poisonvalues:

Poison Values
-------------

A poison value is a result of an erroneous operation.
In order to facilitate speculative execution, many instructions do not
invoke immediate undefined behavior when provided with illegal operands,
and return a poison value instead.
The string '``poison``' can be used anywhere a constant is expected, and
operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
a poison value.

Poison value behavior is defined in terms of value *dependence*:

-  Values other than :ref:`phi <i_phi>` nodes, :ref:`select <i_select>`, and
   :ref:`freeze <i_freeze>` instructions depend on their operands.
-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
   their dynamic predecessor basic block.
-  :ref:`Select <i_select>` instructions depend on their condition operand and
   their selected operand.
-  Function arguments depend on the corresponding actual argument values
   in the dynamic callers of their functions.
-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
   instructions that dynamically transfer control back to them.
-  :ref:`Invoke <i_invoke>` instructions depend on the
   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
   call instructions that dynamically transfer control back to them.
-  Non-volatile loads and stores depend on the most recent stores to all
   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`.)
-  An instruction with externally visible side effects depends on the
   most recent preceding instruction with externally visible side
   effects, following the order in the IR. (This includes :ref:`volatile
   operations <volatile>`.)
-  An instruction *control-depends* on a :ref:`terminator
   instruction <terminators>` if the terminator instruction has
   multiple successors and the instruction is always executed when
   control transfers to one of the successors, and may not be executed
   when control is transferred to another.
-  Additionally, an instruction also *control-depends* on a terminator
   instruction if the set of instructions it otherwise depends on would
   be different if the terminator had transferred control to a different
   successor.
-  Dependence is transitive.
-  Vector elements may be independently poisoned. Therefore, transforms
   on instructions such as shufflevector must be careful to propagate
   poison across values or elements only as allowed by the original code.

An instruction that *depends* on a poison value produces a poison value
itself. A poison value may be relaxed into an
:ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern.
Propagation of poison can be stopped with the
:ref:`freeze instruction <i_freeze>`.

This means that immediate undefined behavior occurs if a poison value is
used as an instruction operand that has any values that trigger undefined
behavior. Notably this includes (but is not limited to):

-  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
   any other pointer dereferencing instruction (independent of address
   space).
-  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
   instruction.
-  The condition operand of a :ref:`br <i_br>` instruction.
-  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
   instruction.
-  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
   instruction, when the function or invoking call site has a ``noundef``
   attribute in the corresponding position.
-  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
   call site has a ``noundef`` attribute in the return value position.

Here are some examples:

.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %poison2 = sub i32 poison, 1         ; Also results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
      store i32 0, i32* %poison_yet_again  ; Undefined behavior due to
                                           ; store to poison.

      store i32 %poison, i32* @g           ; Poison value stored to memory.
      %poison3 = load i32, i32* @g         ; Poison value loaded back from memory.

      %narrowaddr = bitcast i32* @g to i16*
      %wideaddr = bitcast i32* @g to i64*
      %poison4 = load i16, i16* %narrowaddr ; Returns a poison value.
      %poison5 = load i64, i64* %wideaddr   ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %end, label %end   ; undefined behavior

    end:

.. _welldefinedvalues:

Well-Defined Values
-------------------

Given a program execution, a value is *well defined* if the value does not
have an undef bit and is not poison in the execution.
An aggregate value or vector is well defined if its elements are well defined.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.

A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither '``undef``' constant nor '``poison``' constant.
The result of :ref:`freeze instruction <i_freeze>` is well defined regardless
of its operand.

.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function, and always has an ``i8*`` type.
Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
for comparisons against null. Pointer equality tests between label addresses
result in undefined behavior, though, again, comparison against null is ok,
and no label is equal to the null pointer. This may be passed around as an
opaque pointer sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr`` or
``callbr`` instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.

.. _dso_local_equivalent:

DSO Local Equivalent
--------------------

``dso_local_equivalent @func``

A '``dso_local_equivalent``' constant represents a function which is
functionally equivalent to a given function, but is always defined in the
current linkage unit. The resulting pointer has the same type as the underlying
function.
The resulting pointer is permitted, but not required, to be different
from a pointer to the function, and it may have different values in different
translation units.

The target function may not have ``extern_weak`` linkage.

``dso_local_equivalent`` can be implemented as such:

- If the function has local linkage, hidden visibility, or is
  ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
  to the function.
- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined within
  the linkage unit; LLVM will use this when available. (This is commonly called a
  "PLT stub".) On other targets, the stub may need to be emitted explicitly.

This can be used wherever a ``dso_local`` instance of a function is needed without
needing to explicitly make the original function ``dso_local``. An instance where
this can be used is for static offset calculations between a function and some other
``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
where dynamic relocations for function pointers in VTables can be replaced with
static relocations for offsets between the VTable and virtual functions which
may not be ``dso_local``.

This is currently only supported for ELF binary formats.

.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Perform the :ref:`trunc operation <i_trunc>` on constants.
``zext (CST to TYPE)``
    Perform the :ref:`zext operation <i_zext>` on constants.
``sext (CST to TYPE)``
    Perform the :ref:`sext operation <i_sext>` on constants.
``fptrunc (CST to TYPE)``
    Truncate a floating-point constant to another floating-point type.
    The size of CST must be larger than the size of TYPE. Both types
    must be floating-point.
``fpext (CST to TYPE)``
    Floating-point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating-point.
``fptoui (CST to TYPE)``
    Convert a floating-point constant to the corresponding unsigned
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``fptosi (CST to TYPE)``
    Convert a floating-point constant to the corresponding signed
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``uitofp (CST to TYPE)``
    Convert an unsigned integer constant to the corresponding
    floating-point constant. TYPE must be a scalar or vector floating-point
    type. CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``sitofp (CST to TYPE)``
    Convert a signed integer constant to the corresponding floating-point
    constant.
TYPE must be a scalar or vector floating-point type. 4155 CST must be of scalar or vector integer type. Both CST and TYPE must 4156 be scalars, or vectors of the same number of elements. 4157``ptrtoint (CST to TYPE)`` 4158 Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants. 4159``inttoptr (CST to TYPE)`` 4160 Perform the :ref:`inttoptr operation <i_inttoptr>` on constants. 4161 This one is *really* dangerous! 4162``bitcast (CST to TYPE)`` 4163 Convert a constant, CST, to another TYPE. 4164 The constraints of the operands are the same as those for the 4165 :ref:`bitcast instruction <i_bitcast>`. 4166``addrspacecast (CST to TYPE)`` 4167 Convert a constant pointer or constant vector of pointer, CST, to another 4168 TYPE in a different address space. The constraints of the operands are the 4169 same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`. 4170``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)`` 4171 Perform the :ref:`getelementptr operation <i_getelementptr>` on 4172 constants. As with the :ref:`getelementptr <i_getelementptr>` 4173 instruction, the index list may have one or more indexes, which are 4174 required to make sense for the type of "pointer to TY". 4175``select (COND, VAL1, VAL2)`` 4176 Perform the :ref:`select operation <i_select>` on constants. 4177``icmp COND (VAL1, VAL2)`` 4178 Perform the :ref:`icmp operation <i_icmp>` on constants. 4179``fcmp COND (VAL1, VAL2)`` 4180 Perform the :ref:`fcmp operation <i_fcmp>` on constants. 4181``extractelement (VAL, IDX)`` 4182 Perform the :ref:`extractelement operation <i_extractelement>` on 4183 constants. 4184``insertelement (VAL, ELT, IDX)`` 4185 Perform the :ref:`insertelement operation <i_insertelement>` on 4186 constants. 4187``shufflevector (VEC1, VEC2, IDXMASK)`` 4188 Perform the :ref:`shufflevector operation <i_shufflevector>` on 4189 constants. 
``extractvalue (VAL, IDX0, IDX1, ...)``
    Perform the :ref:`extractvalue operation <i_extractvalue>` on
    constants. The index list is interpreted in a similar manner as
    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
    least one index value must be specified.
``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
    The index list is interpreted in a similar manner as indices in a
    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
    value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation of the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating-point values are allowed).

Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.

The template string supports argument substitution of the operands using "``$``"
followed by a number, to indicate substitution of the given register/memory
location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (See :ref:`inline-asm-modifiers`).

A literal "``$``" may be included by using "``$$``" in the template. To include
other special characters into the output, the usual "``\XX``" escapes may be
used, just as in other strings. Note that after template substitution, the
resulting assembly string is parsed by LLVM's integrated assembler unless it is
disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
syntax known to LLVM.

LLVM also supports a few more substitutions useful for writing inline assembly:

- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
  This substitution is useful when declaring a local label. Many standard
  compiler optimizations, such as inlining, may duplicate an inline asm blob.
  Adding a blob-unique identifier ensures that the two labels will not conflict
  during assembly. This is used to implement `GCC's %= special format
  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
- ``${:comment}``: Expands to the comment character of the current target's
  assembly dialect. This is usually ``#``, but many targets use other strings,
  such as ``;``, ``//``, or ``!``.
- ``${:private}``: Expands to the assembler private label prefix. Labels with
  this prefix will not appear in the symbol table of the assembled object.
  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
  relatively popular.

LLVM's support for inline asm is modeled closely on the requirements of Clang's
GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
modifier codes listed here are similar or identical to those in GCC's inline asm
support.
However, to be clear, the syntax of the template and constraint strings
described here is *not* the same as the syntax accepted by GCC and Clang, and,
while most constraint letters are passed through as-is by Clang, some get
translated to other codes when converting from the C source to the LLVM
assembly.

An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"

Inline assembler expressions may **only** be used as the callee operand
of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
Thus, typically we have:

.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)

Inline asms with side effects not visible in the constraint list must be
marked as having side effects. This is done through the use of the
'``sideeffect``' keyword, like so:

.. code-block:: llvm

    call void asm sideeffect "eieio", ""()

In some cases inline asms will contain code that will not work unless
the stack is aligned in some way, such as calls or SSE instructions on
x86, yet will not contain code that does that alignment within the asm.
The compiler should make conservative assumptions about what the asm
might contain and should generate its usual stack alignment code in the
prologue if the '``alignstack``' keyword is present:

.. code-block:: llvm

    call void asm alignstack "eieio", ""()

Inline asms also support using non-standard assembly dialects. The
assumed dialect is ATT. When the '``inteldialect``' keyword is present,
the inline asm is using the Intel dialect. Currently, ATT and Intel are
the only supported dialects. An example is:

.. code-block:: llvm

    call void asm inteldialect "eieio", ""()

If multiple keywords appear, the '``sideeffect``' keyword must come
first, the '``alignstack``' keyword second and the '``inteldialect``'
keyword last.

Inline Asm Constraint String
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The constraint list is a comma-separated string, each element containing one or
more constraint codes.

For each element in the constraint list an appropriate register or memory
operand will be chosen, and it will be made available to assembly template
string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
second, etc.

There are three different types of constraints, which are distinguished by a
prefix symbol in front of the constraint code: Output, Input, and Clobber. The
constraints must always be given in that order: outputs first, then inputs, then
clobbers. They cannot be intermingled.

There are also three different categories of constraint codes:

- Register constraint. This is either a register class, or a fixed physical
  register. This kind of constraint will allocate a register, and if necessary,
  bitcast the argument or result to the appropriate type.
- Memory constraint. This kind of constraint is for use with an instruction
  taking a memory operand. Different constraints allow for different addressing
  modes used by the target.
- Immediate value constraint. This kind of constraint is for an integer or other
  immediate value which can be rendered directly into an instruction. The
  various target-specific constraints allow the selection of a value in the
  proper range for the instruction you wish to use it with.

Output constraints
""""""""""""""""""

Output constraints are specified by an "``=``" prefix (e.g. "``=r``").
This indicates that the assembly will write to this operand, and the operand
will then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction. (The
exception is indirect outputs, described below.)

Normally, it is expected that no output locations are written to by the assembly
expression until *all* of the inputs have been read. As such, LLVM may assign
the same register to an output and an input. If this is not safe (e.g. if the
assembly contains two instructions, where the first writes to one output, and
the second reads an input and writes to a second output), then the "``&``"
modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).

Input constraints
"""""""""""""""""

Input constraints do not have a prefix -- just the constraint codes. Each input
constraint will consume one argument from the call instruction. It is not
permitted for the asm to write to any input register or memory location (unless
that input is tied to an output). Note also that multiple inputs may all be
assigned to the same register, if LLVM can determine that they necessarily all
contain the same value.

Instead of providing a Constraint Code, input constraints may also "tie"
themselves to an output constraint, by providing an integer as the constraint
string. Tied inputs still consume an argument from the call instruction, and
take up a position in the asm template numbering as is usual -- they will simply
be constrained to always use the same register as the output they've been tied
to.
For example, a constraint string of "``=r,0``" says to assign a register for
output, and use that register as an input as well (it being the 0'th
constraint).

It is permitted to tie an input to an "early-clobber" output. In that case, no
*other* input may share the same register as the input tied to the early-clobber
(even when the other input has the same value).

You may only tie an input to an output which has a register constraint, not a
memory constraint. Only a single input may be tied to an output.

There is also an "interesting" feature which deserves a bit of explanation: if a
register class constraint allocates a register which is too small for the value
type operand provided as input, the input value will be split into multiple
registers, and all of them passed to the inline asm.

However, this feature is often not as useful as you might think.

Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load, which instruction takes a single 32-bit register. The
hardware then loads into both the named register, and the next register. This
feature of inline asm would not be useful to support that.)

A few of the targets provide a template string modifier allowing explicit access
to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
``D``). On such an architecture, you can actually access the second allocated
register (yet, still, not any subsequent ones). But, in that case, you're still
probably better off simply splitting the value into two separate operands, for
clarity. (e.g.
see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use)

Indirect inputs and outputs
"""""""""""""""""""""""""""

Indirect output or input constraints can be specified by the "``*``" modifier
(which goes after the "``=``" in case of an output). This indicates that the asm
will write to or read from the contents of an *address* provided as an input
argument. (Note that in this way, indirect outputs act more like an *input* than
an output: just like an input, they consume an argument of the call expression,
rather than producing a return value. An indirect output constraint is an
"output" only in that the asm is expected to write to the contents of the input
memory location, instead of just read from it).

This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.

It is also possible to use an indirect *register* constraint, but only on output
(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. I would recommend not using it.)


Clobber constraints
"""""""""""""""""""

A clobber constraint is indicated by a "``~``" prefix. A clobber does not
consume an input operand, nor generate an output. Clobbers cannot use any of the
general constraint code letters -- they may use only explicit register
constraints, e.g. "``~{eax}``".
The one exception is that a clobber string of
"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
memory locations -- not only the memory pointed to by a declared indirect
output.

Note that clobbering named registers that are also present in output
constraints is not legal.


Constraint Codes
""""""""""""""""

After a potential prefix comes the constraint code, or codes.

A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
(e.g. "``{eax}``").

The one and two letter constraint codes are typically chosen to be the same as
GCC's constraint codes.

A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.

There are two ways to specify alternatives, and either or both may be used in an
inline asm constraint list:

1) Append the codes to each other, making a constraint code set. E.g. "``im``"
   or "``{eax}m``". This means "choose any of the options in the set". The
   choice of constraint is made independently for each constraint in the
   constraint list.

2) Use "``|``" between constraint code sets, creating alternatives. Every
   constraint in the constraint list must have the same number of alternative
   sets. With this syntax, the same alternative in *all* of the items in the
   constraint list will be chosen together.

Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.
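
A minimal sketch of where such a constraint string appears in a call, assuming
an x86 target and the default AT&T dialect. The function name ``@compare``, the
``cmpl`` template, and the ``~{flags}`` clobber are illustrative choices, not
taken from this reference:

.. code-block:: llvm

    define void @compare(i32 %a, i32 %b) {
      ; Two inputs, each with two "|"-separated alternative sets.
      ; $0 is the first input (%a) and $1 is the second input (%b).
      call void asm sideeffect "cmpl $1, $0", "rm|r,ri|rm,~{flags}"(i32 %a, i32 %b)
      ret void
    }

In both alternatives at most one of the two operands can be a memory operand,
which matches the restriction described above.
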

However, the use of either of the alternatives features is *NOT* recommended, as
LLVM is not able to make an intelligent choice about which one to use. (At the
point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (e.g., given "``rm``", it'll
always choose to use memory, not registers). And, if given multiple registers,
or multiple register classes, it will simply choose the first one. (In fact, it
doesn't currently even ensure explicitly specified physical registers are
unique, so specifying multiple physical registers as alternatives, like
``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
intended.)

Supported Constraint Code List
""""""""""""""""""""""""""""""

The constraint codes are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Some constraint codes are typically supported by all targets:

- ``r``: A register in the target's general purpose register class.
- ``m``: A memory address operand. It is target-specific what addressing modes
  are supported, typical examples are register, or register + register offset,
  or register + immediate offset (of some target-specific size).
- ``i``: An integer constant (of target-specific width). Allows either a simple
  immediate, or a relocatable value.
- ``n``: An integer constant -- *not* including relocatable values.
- ``s``: An integer constant, but allowing *only* relocatable values.
- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
  useful to pass a label for an asm branch or call.

  .. FIXME: but that surely isn't actually okay to jump out of an asm
     block without telling llvm about the control transfer???)

- ``{register-name}``: Requires exactly the named physical register.

Other constraints are target-specific:

AArch64:

- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
  i.e. 0 to 4095 with optional shift by 12.
- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
  64-bit register. This is a superset of ``L``.
- ``Q``: Memory address operand must be in a single register (no
  offsets). (However, LLVM currently does this for the ``m`` constraint as
  well.)
- ``r``: A 32 or 64-bit integer register (W* or X*).
- ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
- ``x``: Like w, but restricted to registers 0 to 15 inclusive.
- ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
- ``Upl``: One of the low eight SVE predicate registers (P0 to P7)
- ``Upa``: Any of the SVE predicate registers (P0 to P15)

AMDGPU:

- ``r``: A 32 or 64-bit integer register.
- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
- ``I``: An integer inline constant in the range from -16 to 64.
- ``J``: A 16-bit signed integer constant.
- ``A``: An integer or a floating-point inline constant.
- ``B``: A 32-bit signed integer constant.
- ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
- ``DA``: A 64-bit constant that can be split into two "A" constants.
- ``DB``: A 64-bit constant that can be split into two "B" constants.

All ARM modes:

- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
  operand. Treated the same as operand ``m``, at the moment.
- ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
- ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``

ARM and ARM's Thumb2 mode:

- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
- ``I``: An immediate integer valid for a data-processing instruction.
- ``J``: An immediate integer between -4095 and 4095.
- ``K``: An immediate integer whose bitwise inverse is valid for a
  data-processing instruction. (Can be used with template modifier "``B``" to
  print the inverted value).
- ``L``: An immediate integer whose negation is valid for a data-processing
  instruction. (Can be used with template modifier "``n``" to print the negated
  value).
- ``M``: A power of two or an integer between 0 and 32.
- ``N``: Invalid immediate constraint.
- ``O``: Invalid immediate constraint.
- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
  as ``r``.
- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
  invalid.
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.

ARM's Thumb1 mode:

- ``I``: An immediate integer between 0 and 255.
- ``J``: An immediate integer between -255 and -1.
- ``K``: An immediate integer between 0 and 255, with optional left-shift by
  some amount.
- ``L``: An immediate integer between -7 and 7.
- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
- ``N``: An immediate integer between 0 and 31.
- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
- ``r``: A low 32-bit GPR register (``r0-r7``).
- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.


Hexagon:

- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
  at the moment.
- ``r``: A 32 or 64-bit register.

MSP430:

- ``r``: An 8 or 16-bit register.

MIPS:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate integer zero.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
- ``N``: An immediate integer between -65535 and -1.
- ``O``: An immediate signed 15-bit integer.
- ``P``: An immediate integer between 1 and 65535.
- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus 16-bit immediate offset. In MIPS mode, just a base register.
- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
  ``m``.
- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
  ``sc`` instruction on the given subtarget (details vary).
- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
- ``l``: The ``lo`` register, 32 or 64-bit.
- ``x``: Invalid.

NVPTX:

- ``b``: A 1-bit integer register.
- ``c`` or ``h``: A 16-bit integer register.
- ``r``: A 32-bit integer register.
- ``l`` or ``N``: A 64-bit integer register.
- ``f``: A 32-bit float register.
- ``d``: A 64-bit float register.


PowerPC:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
- ``M``: An immediate integer greater than 31.
- ``N``: An immediate integer that is an exact power of 2.
- ``O``: The immediate integer constant 0.
- ``P``: An immediate integer constant whose negation is a signed 16-bit
  constant.
- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
  treated the same as ``m``.
- ``r``: A 32 or 64-bit integer register.
- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).

- ``y``: Condition register (``CR0-CR7``).
- ``wc``: An individual CR bit in a CR register.
- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
  register set (overlapping both the floating-point and vector register files).
- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
  set.

RISC-V:

- ``A``: An address operand (using a general-purpose register, without an
  offset).
- ``I``: A 12-bit signed integer immediate operand.
- ``J``: A zero integer immediate operand.
- ``K``: A 5-bit unsigned integer immediate operand.
- ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
  ``XLEN``).

Sparc:

- ``I``: An immediate 13-bit signed integer.
- ``r``: A 32-bit integer register.
- ``f``: Any floating-point register on SparcV8, or a floating-point
  register in the "low" half of the registers on SparcV9.
- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)

SystemZ:

- ``I``: An immediate unsigned 8-bit integer.
- ``J``: An immediate unsigned 12-bit integer.
- ``K``: An immediate signed 16-bit integer.
- ``L``: An immediate signed 20-bit integer.
- ``M``: An immediate integer 0x7fffffff.
- ``Q``: A memory address operand with a base address and a 12-bit immediate
  unsigned displacement.
- ``R``: A memory address operand with a base address, a 12-bit immediate
  unsigned displacement, and an index register.
- ``S``: A memory address operand with a base address and a 20-bit immediate
  signed displacement.
- ``T``: A memory address operand with a base address, a 20-bit immediate
  signed displacement, and an index register.
- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific)
- ``f``: A 32, 64, or 128-bit floating-point register.

X86:

- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 64.
- ``K``: An immediate signed 8-bit integer.
- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
  0xffffffff.
- ``M``: An immediate integer between 0 and 3.
- ``N``: An immediate unsigned 8-bit integer.
- ``O``: An immediate integer between 0 and 127.
- ``e``: An immediate 32-bit signed integer.
- ``Z``: An immediate 32-bit unsigned integer.
- ``o``, ``v``: Treated the same as ``m``, at the moment.
- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
  registers, and on X86-64, it is all of the integer registers.
- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
  existed since i386, and can be accessed without the REX prefix.
- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
  32-bit mode, a 64-bit integer operand will get split into two registers). It
  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
  operand will get allocated only to RAX -- if two 32-bit operands are needed,
  you're better off splitting it yourself, before passing it to the asm
  statement.

XCore:

- ``r``: A 32-bit integer register.


.. _inline-asm-modifiers:

Asm template argument modifiers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the asm template string, modifiers can be used on the operand reference, like
"``${0:n}``".

The modifiers are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Target-independent:

- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``n``: Negate and print immediate integer constant unadorned, without the
  target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``l``: Print as an unadorned label, without the target-specific label
  punctuation (e.g. no ``$`` prefix).
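
A minimal sketch of the ``c`` and ``n`` modifiers, assuming an x86 target where
a plain immediate operand is printed with a ``$`` prefix. The function name
``@emit_constants`` and the ``.long`` directive are illustrative choices:

.. code-block:: llvm

    define void @emit_constants() {
      ; ${0:c} prints the immediate without target-specific punctuation.
      call void asm sideeffect ".long ${0:c}", "i"(i32 42)
      ; ${0:n} negates the immediate, then prints it unadorned.
      call void asm sideeffect ".long ${0:n}", "i"(i32 42)
      ret void
    }

Under these assumptions the two templates are expected to expand to
``.long 42`` and ``.long -42``, whereas a bare ``$0`` would carry the ``$``
prefix and would not be accepted as a data directive operand.
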

AArch64:

- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
  instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with an ``x*`` name (this is the default).
- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
  ``v*``.

AMDGPU:

- ``r``: No effect.

ARM:

- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
  register).
- ``P``: No effect.
- ``q``: No effect.
- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
  as ``d4[1]`` instead of ``s9``).
- ``B``: Bitwise invert and print an immediate integer constant without ``#``
  prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
  register operands subsequent to the specified one (!), so use carefully.
- ``Q``: Print the low-order register of a register-pair, or the low-order
  register of a two-register operand.
- ``R``: Print the high-order register of a register-pair, or the high-order
  register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)

  .. FIXME: H doesn't currently support printing the second register
     of a two-register operand.

- ``e``: Print the low doubleword register of a NEON quad register.
- ``f``: Print the high doubleword register of a NEON quad register.
- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
  adornment.

Hexagon:

- ``L``: Print the second register of a two-register operand.
  Requires that it has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.

MSP430:

No additional modifiers.

MIPS:

- ``X``: Print an immediate integer as hexadecimal.
- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
- ``d``: Print an immediate integer as decimal.
- ``m``: Subtract one and print an immediate integer as decimal.
- ``z``: Print ``$0`` if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.

  .. FIXME: L seems to be missing memory operand support.

- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.

  .. FIXME: M seems to be missing memory operand support.

- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.

NVPTX:

- ``r``: No effect.

PowerPC:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, prints formatter for a two-register X-form
  instruction. (Currently always prints ``r0,OPERAND``).
- ``U``: Prints 'u' if the memory operand is an update form, and nothing
  otherwise. (NOTE: LLVM does not support update form, so this will currently
  always print nothing)
- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
  not support indexed form, so this will currently always print nothing)

RISC-V:

- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
  nothing. Used to print 'addi' vs 'add' instructions, etc.
- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
  normally.

Sparc:

- ``r``: No effect.

SystemZ:

SystemZ implements only ``n``, and does *not* support any of the other
target-independent modifiers.

X86:

- ``c``: Print an unadorned integer or symbol name. (The latter is
  target-specific behavior for this typically target-independent modifier).
- ``A``: Print a register name with a '``*``' before it.
- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
  operand.
- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
  memory operand.
- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
  operand.
- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
  operand.
- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
  available, otherwise the 32-bit register name; do nothing on a memory operand.
- ``n``: Negate and print an unadorned integer, or, for operands other than an
  immediate integer (e.g.
  a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier)
- ``H``: Print a memory reference with additional offset +8.
- ``P``: Print a memory reference or operand for use as the argument of a call
  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)

XCore:

No additional modifiers.


Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions in the program
that can convey extra information about the code to the optimizers and
code generator. One example application of metadata is source-level
debug information. There are two metadata primitives: strings and nodes.

Metadata does not have a type, and is not a value. If referenced from a
``call`` instruction, it uses the ``metadata`` type.

All metadata are identified in syntax by an exclamation point ('``!``').

.. _metadata-string:

Metadata Nodes and Metadata Strings
-----------------------------------

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two-digit hex code. For example:
"``!"test\00"``".

Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:

.. code-block:: llvm

    !{ !"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

    !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
intrinsic is using three metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)

Metadata can be attached to an instruction. Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:

.. code-block:: llvm

    declare !dbg !22 void @f1()
    define void @f2() !dbg !22 {
      ret void
    }

    @g1 = global i32 0, !dbg !22
    @g2 = external global i32, !dbg !22

A transformation is required to drop any metadata attachment that it does not
know, or that it knows it can't preserve. Currently there is an exception for
metadata attachment to globals for ``!type`` and ``!absolute_symbol`` which
can't be unconditionally dropped unless the global is itself deleted.

Metadata attached to a module using named metadata may not be dropped, with
the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
containing the debug info to be emitted along with the compile unit, regardless
of code optimizations (some nodes are only emitted if there are references to
them from instructions). The ``debugInfoForProfiling:`` field is a boolean
indicating whether or not line-table discriminators are updated to provide
more-accurate debug info for profiling results.

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
                        macros: !6, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope. These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).

.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: none

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
                 checksumkind: CSK_MD5,
                 checksum: "000102030405060708090a0b0c0d0e0f")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.
Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5``,
``CSK_SHA1`` and ``CSK_SHA256``.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``. ``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.
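
As a minimal sketch of how ``DW_TAG_member`` nodes fit together, the following
describes a hypothetical C struct ``Point`` with two ``int`` members. The file
name, node numbers, and source line are illustrative assumptions, not output
copied from a front-end:

.. code-block:: text

    ; Hypothetical C source: struct Point { int x; int y; };
    !0 = !DIFile(filename: "point.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64,
                                   elements: !{!3, !4})
    ; Each member names its parent in scope: and its type in baseType:.
    ; offset: is in bits, so "y" starts 32 bits into the struct.
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 1, baseType: !1, size: 32)
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 1, baseType: !1, size: 32, offset: 32)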

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
are used to qualify the ``baseType:``.

Note that the ``void *`` type is expressed as a type derived from NULL.

.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_array_type       = 1
    DW_TAG_class_type       = 2
    DW_TAG_enumeration_type = 4
    DW_TAG_structure_type   = 19
    DW_TAG_union_type       = 23

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector. The optional ``dataLocation`` is a
DIExpression that describes how to get from an object's address to the actual
raw data, if they aren't equivalent. This is only supported for array types,
particularly to describe Fortran arrays, which have an array descriptor in
addition to the array data. Alternatively, it can be a DIVariable which
has the address of the actual raw data. The Fortran language supports pointer
arrays which can be attached to actual arrays; this attachment between pointer
and pointee is called association. The optional ``associated`` is a
DIExpression that describes whether the pointer array is currently associated.
The optional ``allocated`` is a DIExpression that describes whether the
allocatable array is currently allocated. The optional ``rank`` is a
DIExpression that describes the rank (number of dimensions) of a Fortran
assumed-rank array (the rank is known at run time).

For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
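
As a sketch of the ``DW_TAG_array_type`` case described above, a hypothetical
two-dimensional C array could be described with one subrange per level of
indexing. The variable name and node numbers are illustrative assumptions:

.. code-block:: text

    ; Hypothetical C source: int grid[3][4];
    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DISubrange(count: 3, lowerBound: 0) ; outer dimension
    !2 = !DISubrange(count: 4, lowerBound: 0) ; inner dimension
    ; size: is in bits: 3 * 4 * 32 = 384.
    !3 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 384,
                          elements: !{!1, !2})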

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.

.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`.

- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.

.. code-block:: text

    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
    !2 = !DISubrange(count: -1) ; empty array.

    ; Scopes used in rest of example
    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
    !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)

    ; Use of local variable as count value
    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
    !11 = !DISubrange(count: !10, lowerBound: 0)

    ; Use of global variable as count value
    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
    !13 = !DISubrange(count: !12, lowerBound: 0)

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: text

    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

.. _DIGlobalVariable:

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: text

    @foo = global i32 0, !dbg !0
    !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
    !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
                           file: !3, line: 7, type: !4, isLocal: true,
                           isDefinition: false, declaration: !5)


DIGlobalVariableExpression
""""""""""""""""""""""""""

``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
with a :ref:`DIExpression`.

.. code-block:: text

    @lower = global i32 0, !dbg !0
    @upper = global i32 0, !dbg !1
    !0 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
             )
    !1 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
             )
    !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
                           file: !4, line: 8, type: !5, declaration: !6)

All global variable expressions should be referenced by the ``globals:`` field of
a :ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A distinct
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. A unique ``DISubprogram`` may be attached to a function declaration
used for call site debug info. The ``retainedNodes:`` field is a list of
:ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function.
If the scope is a composite
type with an ODR ``identifier:`` and that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.

.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, retainedNodes: !8,
                                thrownTypes: !9)

.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

.. _DILexicalBlockFile:

DILexicalBlockFile
""""""""""""""""""

``DILexicalBlockFile`` nodes are used to discriminate between sections of a
:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
indicate textual inclusion, or the ``discriminator:`` field can be used to
discriminate between control flow within a single block in the source language.

.. code-block:: text

    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)

.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``retainedNodes:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                          type: !3, flags: DIFlagArtificial)
    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                          type: !3)
    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

.. _DIExpression:

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable. Debug
intrinsics are interpreted left-to-right: start by pushing the value/address
operand of the intrinsic onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.

The currently supported opcode vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
  them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that contrary to ``DW_OP_bit_piece``, the offset is describing the location
  within the described source variable.
- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
  (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
  expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
  that references a base type constructed from the supplied values.
- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
  optionally applied to the pointer. The memory tag is derived from the
  given tag offset in an implementation-defined manner.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
- ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` may only appear in MIR and at the
  beginning of a ``DIExpression``. In DWARF a ``DBG_VALUE``
  instruction binding a ``DIExpression(DW_OP_LLVM_entry_value`` to a
  register is lowered to a ``DW_OP_entry_value [reg]``, pushing the
  value the register had upon function entry onto the stack. The next
  ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
  block argument.
  For example, ``!DIExpression(DW_OP_LLVM_entry_value,
  1, DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an
  expression where the entry value of the debug value instruction's
  value/address operand is pushed to the stack, and 123 is added
  to it. Due to framework limitations ``N`` can currently only
  be 1.

  The operation is introduced by the ``LiveDebugValues`` pass, which
  applies it only to function parameters that are unmodified
  throughout the function. Support is limited to simple register
  location descriptions, or as indirect locations (e.g., when a struct
  is passed-by-value to a callee via a pointer to a temporary copy
  made in the caller). The entry value op is also introduced by the
  ``AsmPrinter`` pass when a call site parameter value
  (``DW_AT_call_site_parameter_value``) is represented as entry value
  of the parameter.
- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
  value, such as one that calculates the sum of two registers. This is always
  used in combination with an ordered list of values, such that
  ``DW_OP_LLVM_arg, N`` refers to the ``N``\ th element in that list. For
  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents a content on the provided
  signed offset of the specified register. The opcode is only generated by the
  ``AsmPrinter`` pass to describe call site parameter value which requires an
  expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object which can then
  serve as a descriptor in subsequent calculation.
  This opcode can be used to
  calculate the bounds of a Fortran allocatable array which has array
  descriptors.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables which are optimized out but whose
  pointed-to value is known. This operator is required because it differs from
  the DWARF operator ``DW_OP_implicit_pointer`` in representation and
  specification (number and types of operands), and the latter cannot be used
  for multiple levels.

.. code-block:: text

    IR for "*ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)

    IR for "**ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
                        DW_OP_LLVM_implicit_pointer)

DWARF specifies three kinds of simple location descriptions: Register, memory,
and implicit location descriptions.
Note that a location description is 5592defined over certain ranges of a program, i.e the location of a variable may 5593change over the course of the program. Register and memory location 5594descriptions describe the *concrete location* of a source variable (in the 5595sense that a debugger might modify its value), whereas *implicit locations* 5596describe merely the actual *value* of a source variable which might not exist 5597in registers or in memory (see ``DW_OP_stack_value``). 5598 5599A ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect 5600value (the address) of a source variable. The first operand of the intrinsic 5601must be an address of some kind. A DIExpression attached to the intrinsic 5602refines this address to produce a concrete location for the source variable. 5603 5604A ``llvm.dbg.value`` intrinsic describes the direct value of a source variable. 5605The first operand of the intrinsic may be a direct or indirect value. A 5606DIExpression attached to the intrinsic refines the first operand to produce a 5607direct value. For example, if the first operand is an indirect value, it may be 5608necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a 5609valid debug intrinsic. 5610 5611.. note:: 5612 5613 A DIExpression is interpreted in the same way regardless of which kind of 5614 debug intrinsic it's attached to. 5615 5616.. code-block:: text 5617 5618 !0 = !DIExpression(DW_OP_deref) 5619 !1 = !DIExpression(DW_OP_plus_uconst, 3) 5620 !1 = !DIExpression(DW_OP_constu, 3, DW_OP_plus) 5621 !2 = !DIExpression(DW_OP_bit_piece, 3, 7) 5622 !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7) 5623 !4 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef) 5624 !5 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value) 5625 5626DIArgList 5627"""""""""""" 5628 5629``DIArgList`` nodes hold a list of constant or SSA value references. 
These are 5630used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in 5631``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the 5632``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values 5633within a function, it must only be used as a function argument, must always be 5634inlined, and cannot appear in named metadata. 5635 5636.. code-block:: text 5637 5638 llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b), 5639 metadata !16, 5640 metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus)) 5641 5642DIFlags 5643""""""""""""""" 5644 5645These flags encode various properties of DINodes. 5646 5647The `ExportSymbols` flag marks a class, struct or union whose members 5648may be referenced as if they were defined in the containing class or 5649union. This flag is used to decide whether the DW_AT_export_symbols can 5650be used for the structure type. 5651 5652DIObjCProperty 5653"""""""""""""" 5654 5655``DIObjCProperty`` nodes represent Objective-C property nodes. 5656 5657.. code-block:: text 5658 5659 !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo", 5660 getter: "getFoo", attributes: 7, type: !2) 5661 5662DIImportedEntity 5663"""""""""""""""" 5664 5665``DIImportedEntity`` nodes represent entities (such as modules) imported into a 5666compile unit. 5667 5668.. code-block:: text 5669 5670 !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0, 5671 entity: !1, line: 7) 5672 5673DIMacro 5674""""""" 5675 5676``DIMacro`` nodes represent definition or undefinition of a macro identifiers. 5677The ``name:`` field is the macro identifier, followed by macro parameters when 5678defining a function-like macro, and the ``value`` field is the token-string 5679used to expand the macro identifier. 5680 5681.. 
code-block:: text 5682 5683 !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)", 5684 value: "((x) + 1)") 5685 !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo") 5686 5687DIMacroFile 5688""""""""""" 5689 5690``DIMacroFile`` nodes represent inclusion of source files. 5691The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that 5692appear in the included source file. 5693 5694.. code-block:: text 5695 5696 !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2, 5697 nodes: !3) 5698 5699.. _DILabel: 5700 5701DILabel 5702""""""" 5703 5704``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of 5705a ``DILabel`` are mandatory. The ``scope:`` field must be one of either a 5706:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`. 5707The ``name:`` field is the label identifier. The ``file:`` field is the 5708:ref:`DIFile` the label is present in. The ``line:`` field is the source line 5709within the file where the label is declared. 5710 5711.. code-block:: text 5712 5713 !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7) 5714 5715'``tbaa``' Metadata 5716^^^^^^^^^^^^^^^^^^^ 5717 5718In LLVM IR, memory does not have types, so LLVM's own type system is not 5719suitable for doing type based alias analysis (TBAA). Instead, metadata is 5720added to the IR to describe a type system of a higher level language. This 5721can be used to implement C/C++ strict type aliasing rules, but it can also 5722be used to implement custom alias analysis behavior for other languages. 5723 5724This description of LLVM's TBAA system is broken into two parts: 5725:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and 5726:ref:`Representation<tbaa_node_representation>` talks about the metadata 5727encoding of various entities. 

It is always possible to trace any TBAA node to a "root" TBAA node (details
in the :ref:`Representation<tbaa_node_representation>` section). TBAA
nodes with different roots have an unknown aliasing relationship, and LLVM
conservatively infers ``MayAlias`` between them. The rules mentioned in
this section only pertain to TBAA nodes living under the same root.

.. _tbaa_node_semantics:

Semantics
"""""""""

The TBAA metadata system, referred to as "struct path TBAA" (not to be
confused with ``tbaa.struct``), consists of the following high level
concepts: *Type Descriptors*, further subdivided into scalar type
descriptors and struct type descriptors; and *Access Tags*.

**Type descriptors** describe the type system of the higher level language
being compiled. **Scalar type descriptors** describe types that do not
contain other types. Each scalar type has a parent type, which must also
be a scalar type or the TBAA root. Via this parent relation, scalar types
within a TBAA root form a tree. **Struct type descriptors** denote types
that contain a sequence of other type descriptors, at known offsets. These
contained type descriptors can either be struct type descriptors themselves
or scalar type descriptors.

**Access tags** are metadata nodes attached to load and store instructions.
Access tags use type descriptors to describe the *location* being accessed
in terms of the type system of the higher level language. Access tags are
tuples consisting of a base type, an access type and an offset. The base
type is a scalar type descriptor or a struct type descriptor, the access
type is a scalar type descriptor, and the offset is a constant integer.
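
As a rough illustration of the concepts above, the following Python sketch
models type descriptors and access tags as plain objects and walks a
descriptor up to its root. It is not how LLVM implements TBAA: the names
``Root``, ``Scalar``, ``Struct``, ``AccessTag``, ``root_of`` and ``same_root``
are hypothetical, and tracing a struct through its first contained field is an
assumption made only for this sketch.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Tuple, Union


    @dataclass(frozen=True)
    class Root:
        """A TBAA root (hypothetical model, identified by its name)."""
        name: str = "TBAA Root"


    @dataclass(frozen=True)
    class Scalar:
        """Scalar type descriptor: a name and a parent (scalar or root)."""
        name: str
        parent: Union["Scalar", Root]


    @dataclass(frozen=True)
    class Struct:
        """Struct type descriptor: a name and (contained type, offset) pairs."""
        name: str
        fields: Tuple[Tuple[Union["Struct", Scalar], int], ...]


    @dataclass(frozen=True)
    class AccessTag:
        """Access tag: (base type, access type, offset)."""
        base: Union[Struct, Scalar]
        access: Scalar
        offset: int


    def root_of(node):
        """Walk a descriptor up to its root."""
        while not isinstance(node, Root):
            if isinstance(node, Scalar):
                node = node.parent
            else:
                # Sketch assumption: follow the first contained field.
                node = node.fields[0][0]
        return node


    def same_root(a: AccessTag, b: AccessTag) -> bool:
        """The rules in this section only apply to tags under one root."""
        return root_of(a.base) == root_of(b.base)

Two tags for which ``same_root`` returns ``False`` fall outside the rules of
this section; as stated above, LLVM conservatively infers ``MayAlias`` for
them.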

The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
   or store) of a value of type ``AccessTy`` contained in the struct type
   ``BaseTy`` at offset ``Offset``.

 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
   ``AccessTy`` must be the same; and the access tag describes a scalar
   access with scalar type ``AccessTy``.

We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples this way:

 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
   described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
   undefined if ``Offset`` is non-zero.

 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
   to be relative within that inner type.

A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.

As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d;  // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

is (note that in C and C++, ``char`` can be used to access any arbitrary
type):

.. code-block:: text

    Root = "TBAA Root"
    CharScalarTy = ("char", Root, 0)
    FloatScalarTy = ("float", CharScalarTy, 0)
    DoubleScalarTy = ("double", CharScalarTy, 0)
    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
                     (InnerStructTy, 12)}


with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.

.. _tbaa_node_representation:

Representation
""""""""""""""

The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
with exactly one ``MDString`` operand.

Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third argument, but that must be the
constant integer zero.

Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1. The first operand is an ``MDString`` denoting
the name of the struct type. As in scalar type descriptors, the actual
value of this name operand is irrelevant to LLVM. After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
an ``MDNode``, denotes a contained field, and the 2N th operand, a
``ConstantInt``, is the offset of the said contained field. The offsets
must be in non-decreasing order.

Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
The first operand is an ``MDNode`` pointing to the node representing the
base type. The second operand is an ``MDNode`` pointing to the node
representing the access type. The third operand is a ``ConstantInt`` that
states the offset of the access. If a fourth field is present, it must be
a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
that the location being accessed is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
the access type and the base type of an access tag must be the same, and
that is the TBAA root of the access tag.

'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages; however, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.

``!tbaa.struct`` metadata can describe which memory subregions in a
memcpy are padding and what the TBAA tags of the struct are.

The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its TBAA tag. For example:

.. code-block:: llvm

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has TBAA tag !1. The second is at offset 8 bytes
and has size 4 bytes and has TBAA tag !2.

Note that the fields need not be contiguous. In this example, there is a
4 byte gap between the two fields. This gap represents padding which
does not carry useful data and need not be preserved.

'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be specified not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.
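
The subset rule above can be sketched in a few lines of Python. This is only
an illustration, not LLVM's implementation: scopes are modeled as hypothetical
``(scope_id, domain_id)`` pairs, the function names are invented for this
sketch, and the sketch assumes that only domains actually present in the
``alias.scope`` list are considered.

.. code-block:: python

    from collections import defaultdict


    def scopes_by_domain(scope_list):
        """Group (scope_id, domain_id) pairs into {domain_id: {scope_id, ...}}."""
        grouped = defaultdict(set)
        for scope_id, domain_id in scope_list:
            grouped[domain_id].add(scope_id)
        return grouped


    def assumed_not_to_alias(alias_scope_list, noalias_list):
        """Return True if, for some domain, the scopes in the alias.scope list
        are a subset of (or equal to) the scopes in the noalias list."""
        have = scopes_by_domain(alias_scope_list)
        excluded = scopes_by_domain(noalias_list)
        # Assumption of this sketch: a domain that does not occur in the
        # alias.scope list never proves anything.
        return any(scopes <= excluded.get(domain, set())
                   for domain, scopes in have.items())

For instance, with two scopes in one domain and a third scope in another, an
``alias.scope`` list holding only the third scope is covered by any ``noalias``
list that contains it, while an ``alias.scope`` list holding all three scopes
is not covered by a ``noalias`` list naming just one of them.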

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used, for example, during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %2 = load float, float* %c, align 4, !alias.scope !5
    store float %2, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %3 = load float, float* %c, align 4, !alias.scope !6
    store float %3, float* %arrayidx.i, align 4, !noalias !7

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating-point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

    If ``x`` is a real number that lies between two finite consecutive
    floating-point numbers ``a`` and ``b``, without being equal to one
    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
    distance between the two non-equal finite floating-point numbers
    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.

The metadata node shall consist of a single positive floating-point number
representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs

.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges the loaded value or the value
returned by the called function at this call site is in. If the loaded or
returned value is not in the specified range, the behavior is undefined. The
ranges are represented with a flattened list of integers. The loaded value or
the value returned is known to be in the union of the ranges defined by each
consecutive pair. Each pair has the following properties:

-  The type must match the type loaded by the instruction.
-  The pair ``a,b`` represents the range ``[a,b)``.
-  Both ``a`` and ``b`` are constants.
-  The range is allowed to wrap.
-  The range should not represent the full or empty set. That is,
   ``a != b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

Examples:

.. code-block:: llvm

      %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
      %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
      %d = invoke i8 @bar() to label %cont
             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

'``absolute_symbol``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``absolute_symbol`` metadata may be attached to a global variable
declaration. It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code, and expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.

Example (assuming 64-bit pointers):

.. code-block:: llvm

      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)

    ...
    !0 = !{ i64 0, i64 256 }
    !1 = !{ i64 -1, i64 -1 }

'``callees``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``callees`` metadata may be attached to indirect call sites. If ``callees``
metadata is attached to a call site, and any callee is not among the set of
functions provided by the metadata, the behavior is undefined. The intent of
this metadata is to facilitate optimizations such as indirect-call promotion.
For example, in the code below, the call instruction may only target the
``add`` or ``sub`` functions:

.. code-block:: llvm

    %result = call i64 %binop(i64 %x, i64 %y), !callees !0

    ...
    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}

'``callback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``callback`` metadata may be attached to a function declaration, or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata. Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
semantic restriction on the broker function itself is that it is not allowed to
inspect or modify arguments referenced in the ``callback`` metadata as
pass-through to the callback function.

The broker is not required to actually invoke the callback function at runtime.
However, the assumptions about not inspecting or modifying arguments that would
be passed to the specified callback function still hold, even if the callback
function is not dynamically invoked. The broker is allowed to invoke the
callback function more than once per invocation of the broker. The broker is
also allowed to invoke (directly or indirectly) the function passed as a
callback through another use. Finally, the broker is also allowed to relay the
callback callee invocation to a different thread.

The metadata is structured as follows: At the outer level, ``callback``
metadata is a list of ``callback`` encodings. Each encoding starts with a
constant ``i64`` which describes the argument position of the callback function
in the call to the broker. The following elements, except the last, describe
what arguments are passed to the callback function. Each element is again an
``i64`` constant identifying the argument of the broker that is passed through,
or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
they are listed has to be the same in which they are passed to the callback
callee. The last element of the encoding is a boolean which specifies how
variadic arguments of the broker are handled. If it is true, all variadic
arguments of the broker are passed through to the callback function *after* the
arguments encoded explicitly before.
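
To make the layout of one encoding concrete, here is a small Python sketch
that splits an encoding into its three parts. It is illustrative only: the
encoding is modeled as a plain Python sequence rather than real metadata, and
``CallbackEncoding`` and ``decode_callback_encoding`` are hypothetical names
that do not correspond to any LLVM API.

.. code-block:: python

    from typing import List, NamedTuple, Sequence, Union


    class CallbackEncoding(NamedTuple):
        callee_position: int    # broker argument that is the callback function
        passed_args: List[int]  # broker argument positions, or -1 for unknown
        forwards_varargs: bool  # True if broker varargs are appended


    def decode_callback_encoding(
            encoding: Sequence[Union[int, bool]]) -> CallbackEncoding:
        """Split one encoding into callee position, arguments and vararg flag."""
        if len(encoding) < 2:
            raise ValueError("need at least a callee position and a vararg flag")
        *positions, varargs = encoding
        if not isinstance(varargs, bool):
            raise ValueError("the last element must be a boolean")
        callee, *args = positions
        if any(arg < -1 for arg in args):
            raise ValueError("argument entries must be positions or -1")
        return CallbackEncoding(callee, list(args), varargs)

Under these assumptions, ``decode_callback_encoding((2, -1, -1, True))``
yields a callee position of 2, two unknown arguments, and a flag saying that
the broker's variadic arguments follow them.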

In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. This encoding
identifies the callback function as the broker argument at position 2
(``i64 2``) and the sole argument of the callback function as the broker
argument at position 3 (``i64 3``); positions are counted from zero.

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)

    ...
    !2 = !{i64 2, i64 3, i1 false}
    !1 = !{!2}

Another example is shown below. The callback callee is the
``__kmpc_fork_call`` argument at position 2, counted from zero (``i64 2``).
The callee is given two unknown values (each identified by a ``i64 -1``) and
afterwards all variadic arguments that are passed to the ``__kmpc_fork_call``
call (due to the final ``i1 true``).

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...)

    ...
    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
    !0 = !{!1}


'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.

.. _md_dereferenceable:

'``dereferenceable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.

.. _md_dereferenceable_or_null:

'``dereferenceable_or_null``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the
``dereferenceable_or_null`` attribute on parameters and return values.

.. _llvm.loop:

'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. The loop metadata node is a list of
other metadata nodes, each representing a property of the loop. Usually,
the first item of the property node is a string. For example, the
``llvm.loop.unroll.count`` suggests an unroll factor to the loop
unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll.enable"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}

For legacy reasons, the first item of a loop metadata node must be a
reference to itself. Before the advent of the 'distinct' keyword, this
forced the preservation of otherwise identical metadata nodes. Since
the loop-metadata node can be attached to multiple nodes, the 'distinct'
keyword has become unnecessary.

Prior to the property nodes, one or two ``DILocation`` (debug location)
nodes can be present in the list. The first, if present, identifies the
source-code location where the loop begins. The second, if present,
identifies the source-code location where the loop ends.

Loop metadata nodes cannot be used as unique identifiers. They are
neither persistent for the same loop through transformations nor
necessarily unique to just one loop.

'``llvm.loop.disable_nonforced``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables all optional loop transformations unless
explicitly instructed using other transformation metadata such as
``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed to a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion can
make other transformations impossible. Mandatory loop canonicalizations such
as loop rotation are still applied.

It is recommended to use this metadata in addition to any ``llvm.loop.*``
transformation directive. Also, any loop should have at most one
directive applied to it (and a sequence of transformations built using
followup-attributes). Otherwise, which transformation will be applied
depends on implementation details such as the pass pipeline order.

See :ref:`transformation-metadata` for details.
6231 6232'``llvm.loop.vectorize``' and '``llvm.loop.interleave``' 6233^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 6234 6235Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are 6236used to control per-loop vectorization and interleaving parameters such as 6237vectorization width and interleave count. These metadata should be used in 6238conjunction with ``llvm.loop`` loop identification metadata. The 6239``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only 6240optimization hints and the optimizer will only interleave and vectorize loops if 6241it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata 6242which contains information about loop-carried memory dependencies can be helpful 6243in determining the safety of these transformations. 6244 6245'``llvm.loop.interleave.count``' Metadata 6246^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 6247 6248This metadata suggests an interleave count to the loop interleaver. 6249The first operand is the string ``llvm.loop.interleave.count`` and the 6250second operand is an integer specifying the interleave count. For 6251example: 6252 6253.. code-block:: llvm 6254 6255 !0 = !{!"llvm.loop.interleave.count", i32 4} 6256 6257Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving 6258multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0 6259then the interleave count will be determined automatically. 6260 6261'``llvm.loop.vectorize.enable``' Metadata 6262^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 6263 6264This metadata selectively enables or disables vectorization for the loop. The 6265first operand is the string ``llvm.loop.vectorize.enable`` and the second operand 6266is a bit. If the bit operand value is 1 vectorization is enabled. A value of 62670 disables vectorization: 6268 6269.. 
code-block:: llvm 6270 6271 !0 = !{!"llvm.loop.vectorize.enable", i1 0} 6272 !1 = !{!"llvm.loop.vectorize.enable", i1 1} 6273 6274'``llvm.loop.vectorize.predicate.enable``' Metadata 6275^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 6276 6277This metadata selectively enables or disables creating predicated instructions 6278for the loop, which can enable folding of the scalar epilogue loop into the 6279main loop. The first operand is the string 6280``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If 6281the bit operand value is 1 vectorization is enabled. A value of 0 disables 6282vectorization: 6283 6284.. code-block:: llvm 6285 6286 !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0} 6287 !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1} 6288 6289'``llvm.loop.vectorize.scalable.enable``' Metadata 6290^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 6291 6292This metadata selectively enables or disables scalable vectorization for the 6293loop, and only has any effect if vectorization for the loop is already enabled. 6294The first operand is the string ``llvm.loop.vectorize.scalable.enable`` 6295and the second operand is a bit. If the bit operand value is 1 scalable 6296vectorization is enabled, whereas a value of 0 reverts to the default fixed 6297width vectorization: 6298 6299.. code-block:: llvm 6300 6301 !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0} 6302 !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1} 6303 6304'``llvm.loop.vectorize.width``' Metadata 6305^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 6306 6307This metadata sets the target width of the vectorizer. The first 6308operand is the string ``llvm.loop.vectorize.width`` and the second 6309operand is an integer specifying the width. For example: 6310 6311.. code-block:: llvm 6312 6313 !0 = !{!"llvm.loop.vectorize.width", i32 4} 6314 6315Note that setting ``llvm.loop.vectorize.width`` to 1 disables 6316vectorization of the loop. 
If ``llvm.loop.vectorize.width`` is set to
0 or if the loop does not have this metadata the width will be
determined automatically.

'``llvm.loop.vectorize.followup_vectorized``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the vectorized loop will
have. See :ref:`transformation-metadata` for details.

'``llvm.loop.vectorize.followup_epilogue``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the epilogue will have. The
epilogue is not vectorized and is executed when either the vectorized
loop is not known to preserve semantics (because e.g., it processes two
arrays that are found to alias by a runtime check) or for the last
iterations that do not fill a complete set of vector lanes. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.vectorize.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes in the metadata will be added to both the vectorized and
epilogue loop.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller.
The
first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time. The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll.full"}

'``llvm.loop.unroll.followup``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the unrolled loop will have.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll.followup_remainder``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the remainder loop after
partial/runtime unrolling will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).

The metadata for unroll and jam otherwise is the same as for ``unroll``.
``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
``llvm.loop.unroll_and_jam.full`` is not supported. Again, these are only hints
and the normal safety checks will still be performed.

'``llvm.loop.unroll_and_jam.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll and jam factor to use, similarly to
``llvm.loop.unroll.count``. The first operand is the string
``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
specifying the unroll factor. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}

If the trip count of the loop is less than the unroll count the loop
will be partially unrolled and jammed.

'``llvm.loop.unroll_and_jam.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unroll and jam. The metadata has a single
operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll_and_jam.disable"}

'``llvm.loop.unroll_and_jam.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
not known at compile time. The metadata has a single operand which is the
string ``llvm.loop.unroll_and_jam.enable``. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.unroll_and_jam.enable"}

'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the epilogue of the outer loop
will have. This loop is usually unrolled, meaning there is no such
loop. This attribute will be ignored in this case.
See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the inner loop of the epilogue
will have. The outer epilogue will usually be unrolled, meaning there
can be multiple inner remainder loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes specified in the metadata are added to all
``llvm.loop.unroll_and_jam.*`` loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that the loop should not be versioned for the purpose
of enabling loop-invariant code motion (LICM). The metadata has a single operand
which is the string ``llvm.loop.licm_versioning.disable``. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.licm_versioning.disable"}

'``llvm.loop.distribute.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Loop distribution allows splitting a loop into multiple loops. Currently,
this is only performed if the entire loop cannot be vectorized due to unsafe
memory dependencies. The transformation will attempt to isolate the unsafe
dependencies into their own loop.

This metadata can be used to selectively enable or disable distribution of the
loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:

.. code-block:: llvm

   !0 = !{!"llvm.loop.distribute.enable", i1 0}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}

This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.

'``llvm.loop.distribute.followup_coincident``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes extracted loops with no cyclic
dependencies will have (i.e. can be vectorized). See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_sequential``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the isolated loops with unsafe
memory dependencies will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_fallback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If loop versioning is necessary, this metadata defines the attributes
the non-distributed fallback version will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The attributes in this metadata are added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.licm.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that loop-invariant code motion (LICM) should not be
performed on this loop. The metadata has a single operand which is the string
``llvm.licm.disable``. For example:

.. code-block:: llvm

   !0 = !{!"llvm.licm.disable"}

Note that although it operates per loop it isn't given the ``llvm.loop`` prefix
as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.

'``llvm.access.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, illustrated by the following example.

.. code-block:: llvm

   %val = load i32, i32* %arrayidx, !llvm.access.group !0
   ...
   !0 = !{!1, !2}
   !1 = distinct !{}
   !2 = distinct !{}

It is illegal for the list node to be empty since it might be confused
with an access group.

The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty avoids the
situation that the content must be updated which, because metadata is
immutable by design, would require finding and updating all references
to the access group node.

The access group can be used to refer to a memory access instruction
without pointing to it directly (which is not possible in global
metadata). Currently, the only metadata making use of it is
``llvm.loop.parallel_accesses``.

'``llvm.loop.parallel_accesses``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``).
It denotes that
no loop-carried memory dependence exists between it and other instructions
in the loop with this metadata.

Let ``m1`` and ``m2`` be two instructions that both have the
``llvm.access.group`` metadata to the access group ``g1``, respectively
``g2`` (which might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered having this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.

If all memory-accessing instructions in a loop have
``llvm.access.group`` metadata that each refer to one of the access
groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependences and is considered to be a
parallel loop.

Note that if not all memory access instructions belong to an access
group referred to by ``llvm.loop.parallel_accesses``, then the loop must
not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be considered
sequential (if optimization passes that are unaware of the parallel semantics
insert new memory instructions into the loop body).

Example of a loop that is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata types.

.. code-block:: llvm

   for.body:
     ...
     %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
     ...
     store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
   !1 = distinct !{}

It is also possible to have nested parallel loops:

.. code-block:: llvm

   outer.for.body:
     ...
     %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4
     ...
     br label %inner.for.body

   inner.for.body:
     ...
     %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3
     ...
     store i32 %val0, i32* %arrayidx2, !llvm.access.group !3
     ...
     br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

   inner.for.end:
     ...
     store i32 %val1, i32* %arrayidx4, !llvm.access.group !4
     ...
     br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

   outer.for.end:                                          ; preds = %for.body
   ...
   !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
   !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
   !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
   !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop

'``llvm.loop.mustprogress``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
terminate, unwind, or interact with the environment in an observable way, e.g.
via a volatile memory access, I/O, or other synchronization. If such a loop is
not found to interact with the environment in an observable way, the loop may
be removed. This corresponds to the ``mustprogress`` function attribute.

'``irr_loop``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block).
If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:

.. code-block:: llvm

   header0:
   ...
   br i1 %cmp, label %t1, label %t2, !irr_loop !0

   ...
   !0 = !{!"loop_header_weight", i64 100}

Irreducible loop header weights are typically based on profile data.

.. _md_invariant.group:

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The experimental ``invariant.group`` metadata may be attached to
``load``/``store`` instructions referencing a single metadata with no entries.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
can be assumed to load or store the same
value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
when two pointers are considered the same). Pointers returned by bitcast or
getelementptr with only zero indices are considered the same.

Examples:

.. code-block:: llvm

   @unknownPtr = external global i8
   ...
   %ptr = alloca i8
   store i8 42, i8* %ptr, !invariant.group !0
   call void @foo(i8* %ptr)

   %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
   call void @foo(i8* %ptr)

   %newPtr = call i8* @getPointer(i8* %ptr)
   %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

   %unknownValue = load i8, i8* @unknownPtr
   store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

   call void @foo(i8* %ptr)
   %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
   %d = load i8, i8* %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr

   ...
   declare void @foo(i8*)
   declare i8* @getPointer(i8*)
   declare i8* @llvm.launder.invariant.group(i8*)

   !0 = !{}

The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is tied
to the SSA value of the pointer operand.

.. code-block:: llvm

   %v = load i8, i8* %x, !invariant.group !0
   ; if %x mustalias %y then we can replace the above instruction with
   %v = load i8, i8* %y

Note that this is an experimental feature, which means that its semantics might
change in the future.

'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

'``associated``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``associated`` metadata may be attached to a global object
declaration with a single argument that references another global object.

This metadata prevents discarding of the global object in linker GC
unless the referenced object is also discarded. The linker support for
this feature is spotty.
For best compatibility, globals carrying this
metadata may also:

- Be in a comdat with the referenced global.
- Be in ``@llvm.compiler.used``.
- Have an explicit section with a name which is a valid C identifier.

It does not have any effect on non-ELF targets.

Example:

.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{i32* @a}


'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
The first operand of the metadata node indicates the profile metadata
type. There are currently 3 types:
:ref:`branch_weights<prof_node_branch_weights>`,
:ref:`function_entry_count<prof_node_function_entry_count>`, and
:ref:`VP<prof_node_VP>`.

.. _prof_node_branch_weights:

branch_weights
""""""""""""""

Branch weight metadata attached to a branch, select, switch or call instruction
represents the likeliness of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_function_entry_count:

function_entry_count
""""""""""""""""""""

Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Used with BFI
information, it is also used to derive the basic block profile count.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_VP:

VP
""

VP (value profile) metadata can be attached to instructions that have
value profile information. Currently this is indirect calls (where it
records the hottest callees) and calls to memory intrinsics such as memcpy,
memmove, and memset (where it records the hottest byte lengths).

Each VP metadata node contains the string "VP", then a uint32_t value for the value
profiling kind, a uint64_t value for the total number of times the instruction
is executed, followed by pairs of a uint64_t value and its execution count.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.

Note that the value counts do not need to add up to the total count
listed in the third operand (in practice only the top hottest values
are tracked and reported).

Indirect call example:

.. code-block:: llvm

    call void %f(), !prof !1
    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}

Note that the VP type is 0 (the second operand), which indicates this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective prior target
functions was called.

.. _md_annotation:

'``annotation``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``annotation`` metadata can be used to attach a tuple of annotation strings
to any instruction. This metadata does not impact the semantics of the program
and may only be used to provide additional insight about the program and
transformations to users.

Example:

.. code-block:: text

    %a.addr = alloca float*, align 8, !annotation !0
    !0 = !{!"auto-init"}

Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

- The first element is a *behavior* flag, which specifies the behavior
  when two (or more) modules are merged together, and it encounters two
  (or more) metadata with the same ID. The supported behaviors are
  described below.
- The second element is a metadata string that is a unique ID for the
  metadata. Each module may only have one flag entry for each unique ID (not
  including entries with the **Require** behavior).
- The third element is the value of the flag.

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked, or the max
           if the other module uses **Max** (in which case the resulting flag
           will be **Max**).

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to. This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

   * - 7
     - **Max**
           Takes the max of the two values, which are required to be integers.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
   if two or more ``!"foo"`` flags are seen is to emit an error if their
   values are not equal.

-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
   behavior if two or more ``!"bar"`` flags are seen is to use the value
   '37'.

-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
   behavior if two or more ``!"qux"`` flags are seen is to emit a
   warning if their values are not equal.

-  Metadata ``!3`` has the ID ``!"qux"`` and the value:

   ::

       !{ !"foo", i32 1 }

   The behavior is to emit an error if the ``llvm.module.flags`` does not
   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
   performed.

Synthesized Functions Module Flags Metadata
-------------------------------------------

These metadata specify the default attributes synthesized functions should have.
These metadata are currently respected by a few instrumentation passes, such as
sanitizers.

These metadata correspond to a few function attributes with significant code
generation behaviors. Function attributes with just optimization purposes
should not be listed because the performance impact of these synthesized
functions is small.

- "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
  will get the "frame-pointer" function attribute, with value being "none",
  "non-leaf", or "all", respectively.
- "uwtable": **Max**. The value can be 0 or 1. If the value is 1, a synthesized
  function will get the ``uwtable`` function attribute.

Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info".
The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section to place the metadata. Valid values are
       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6. This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

-  If a module with ``Objective-C Garbage Collection`` set to 0 is
   merged with a module with ``Objective-C Garbage Collection`` set to
   2, then the resulting module has the
   ``Objective-C Garbage Collection`` flag set to 0.
-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
   merged with a module with ``Objective-C GC Only`` set to 6.
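
As a rough illustration of the key-value pairs above, the flags of a module
built for Objective-C ABI version 2 without garbage collection might be
encoded as in the sketch below. The keys and values are taken from the table;
the behavior values (``i32 1`` for **Error**, ``i32 4`` for **Override**) are
assumptions made for this example only, and a front end may choose different
merge behaviors.

.. code-block:: llvm

   ; Sketch only: the behavior values (first operand of each flag) are assumed.
   !llvm.module.flags = !{!0, !1, !2, !3}
   !0 = !{i32 1, !"Objective-C Version", i32 2}
   !1 = !{i32 1, !"Objective-C Image Info Version", i32 0}
   !2 = !{i32 1, !"Objective-C Image Info Section",
          !"__DATA,__objc_imageinfo, regular, no_dead_strip"}
   ; 0 means garbage collection is not supported by this module.
   !3 = !{i32 4, !"Objective-C Garbage Collection", i32 0}
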

C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely wchar_t width and enum
width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of its values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 2 bytes, and that enums are at least as
large as an ``int``::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 1}
    !1 = !{i32 1, !"short_enum", i32 0}

LTO Post-Link Module Flags Metadata
-----------------------------------

Some optimisations are only possible when the entire LTO unit is present in the
current module. This is represented by the ``LTOPostLink`` module flags metadata,
which will be created with a value of ``1`` when LTO linking occurs.

Automatic Linker Flags Named Metadata
=====================================

Some targets support embedding of flags to the linker inside individual object
files.
Typically this is used in conjunction with language extensions which
allow source files to contain linker command line options, and have these
automatically be transmitted to the linker via object files.

These flags are encoded in the IR using named metadata with the name
``!llvm.linker.options``. Each operand is expected to be a metadata node
which should be a list of metadata strings defining linker options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

Dependent Libs Named Metadata
=============================

Some targets support embedding of strings into object files to indicate
a set of libraries to add to the link. Typically this is used in conjunction
with language extensions which allow source files to explicitly declare the
libraries they depend on, and have these automatically be transmitted to the
linker via object files.

The list is encoded in the IR using named metadata with the name
``!llvm.dependent-libraries``.
Each operand is expected to be a metadata node
which should contain a single string operand.

For example, the following metadata section contains two library specifiers::

    !0 = !{!"a library specifier"}
    !1 = !{!"another library specifier"}
    !llvm.dependent-libraries = !{ !0, !1 }

Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.

.. _summary:

ThinLTO Summary
===============

Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
causes the building of a compact summary of the module that is emitted into
the bitcode. The summary is emitted into the LLVM assembly and identified
in syntax by a caret ('``^``').

The summary is parsed into a bitcode output, along with the Module
IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
summary entries (just as they currently ignore summary entries in a bitcode
input file).

Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where summary index is currently built from bitcode.
Specifically, tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag
(this part is not yet implemented, use llvm-as to create a bitcode object
before feeding into thin link tools for now).

There are currently 3 types of summary entries in the LLVM assembly:
:ref:`module paths<module_path_summary>`,
:ref:`global values<gv_summary>`, and
:ref:`type identifiers<typeid_summary>`.

.. _module_path_summary:

Module Path Summary Entry
-------------------------

Each module path summary entry lists a module containing global values included
in the summary. For a single IR module there will be one such entry, but
in a combined summary index produced during the thin link, there will be
one module path entry per linked module with summary.

Example:

.. code-block:: text

    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))

The ``path`` field is a string path to the bitcode file, and the ``hash``
field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
incremental builds and caching.

.. _gv_summary:

Global Value Summary Entry
--------------------------

Each global value summary entry corresponds to a global value defined or
referenced by a summarized module.

Example:

.. code-block:: text

    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831

For declarations, there will not be a summary list. For definitions, a
global value will contain a list of summaries, one per module containing
a definition. There can be multiple entries in a combined summary index
for symbols with weak linkage.

Each ``Summary`` format will depend on whether the global value is a
:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
:ref:`alias<alias_summary>`.

.. _function_summary:

Function Summary
^^^^^^^^^^^^^^^^

If the global value is a function, the ``Summary`` entry will look like:

.. code-block:: text

    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)

The ``module`` field includes the summary entry id for the module containing
this definition, and the ``flags`` field contains information such as
the linkage type, a flag indicating whether it is legal to import the
definition, whether it is globally live and whether the linker resolved it
to a local definition (the latter two are populated during the thin link).
The ``insts`` field contains the number of IR instructions in the function.
Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
:ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.

.. _variable_summary:

Global Variable Summary
^^^^^^^^^^^^^^^^^^^^^^^

If the global value is a variable, the ``Summary`` entry will look like:

.. code-block:: text

    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)

The variable entry contains a subset of the fields in a
:ref:`function summary <function_summary>`, see the descriptions there.

.. _alias_summary:

Alias Summary
^^^^^^^^^^^^^

If the global value is an alias, the ``Summary`` entry will look like:

.. code-block:: text

    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)

The ``module`` and ``flags`` fields are as described for a
:ref:`function summary <function_summary>`. The ``aliasee`` field
contains a reference to the global value summary entry of the aliasee.

.. _funcflags_summary:

Function Flags
^^^^^^^^^^^^^^

The optional ``FuncFlags`` field looks like:

.. code-block:: text

    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0)

If unspecified, flags are assumed to hold the conservative ``false`` value of
``0``.

.. _calls_summary:

Calls
^^^^^

The optional ``Calls`` field looks like:

.. code-block:: text

    calls: ((Callee)[, (Callee)]*)

where each ``Callee`` looks like:

.. code-block:: text

    callee: ^1[, hotness: None]?[, relbf: 0]?

The ``callee`` refers to the summary entry id of the callee. At most one
of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
branch frequency relative to the entry frequency, scaled down by 2^8)
may be specified. The defaults are ``Unknown`` and ``0``, respectively.

.. _params_summary:

Params
^^^^^^

The optional ``Params`` field is used by ``StackSafety`` and looks like:

.. code-block:: text

    params: ((Param)[, (Param)]*)

where each ``Param`` describes pointer parameter access inside of the
function and looks like:

.. code-block:: text

    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?

Here ``param`` is the number of the parameter it describes, and
``offset`` is the inclusive range of offsets from the pointer parameter to bytes
which can be accessed by the function. This range does not include accesses by
function calls from the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded into other
functions and looks like:

.. code-block:: text

    callee: ^3, param: 5, offset: [-3, 3]

The ``callee`` refers to the summary entry id of the callee, ``param`` is
the number of the callee parameter which points into the caller's parameter
with offset known to be inside of the ``offset`` range.
``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe, and
we assume that access with any offset is possible.

Example:

If we have the following function:

.. code-block:: text

    define i64 @foo(i64* %0, i32* %1, i8* %2, i8 %3) {
      store i32* %1, i32** @x
      %5 = getelementptr inbounds i8, i8* %2, i64 5
      %6 = load i8, i8* %5
      %7 = getelementptr inbounds i8, i8* %2, i8 %3
      tail call void @bar(i8 %3, i8* %7)
      %8 = load i64, i64* %0
      ret i64 %8
    }

We can expect a record like this:

.. code-block:: text

    params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))

The function may access just 8 bytes of the parameter %0. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter %1 escapes, so access is unknown.
The function itself can access just a single byte of the parameter %2. Additional
access is possible inside of the ``@bar`` or ``^3``. The function adds a signed
offset to the pointer and passes the result as the argument %1 into ``^3``.
This record itself does not tell us how ``^3`` will access the parameter.
Parameter %3 is not a pointer.

.. _refs_summary:

Refs
^^^^

The optional ``Refs`` field looks like:

.. code-block:: text

    refs: ((Ref)[, (Ref)]*)

where each ``Ref`` contains a reference to the summary id of the referenced
value (e.g. ``^1``).

.. _typeidinfo_summary:

TypeIdInfo
^^^^^^^^^^

The optional ``TypeIdInfo`` field, used for
`Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
looks like:

.. code-block:: text

    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?

These optional fields have the following forms:

TypeTests
"""""""""

.. code-block:: text

    typeTests: (TypeIdRef[, TypeIdRef]*)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID``.

TypeTestAssumeVCalls
""""""""""""""""""""

.. code-block:: text

    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format:

.. code-block:: text

    vFuncId: (TypeIdRef, offset: 16)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID`` preceded by a ``guid:`` tag.

TypeCheckedLoadVCalls
"""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.

TypeTestAssumeConstVCalls
"""""""""""""""""""""""""

.. code-block:: text

    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format:

.. code-block:: text

    (VFuncId, args: (Arg[, Arg]*))

and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
and each Arg is an integer argument number.

TypeCheckedLoadConstVCalls
""""""""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format described for
``TypeTestAssumeConstVCalls``.

.. _typeid_summary:

Type ID Summary Entry
---------------------

Each type id summary entry corresponds to a type identifier resolution
which is generated during the LTO link portion of the compile when building
with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
so these are only present in a combined summary index.

Example:

.. code-block:: text

    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778

The ``typeTestRes`` gives the type test resolution ``kind`` (which may
be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
the ``size-1`` bit width. It is followed by optional flags, which default to 0,
and an optional WpdResolutions (whole program devirtualization resolution)
field that looks like:

.. code-block:: text

    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)

where each entry is a mapping from the given byte offset to the whole-program
devirtualization resolution WpdRes, that has one of the following formats:

.. code-block:: text

    wpdRes: (kind: branchFunnel)
    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
    wpdRes: (kind: indir)

Additionally, each wpdRes has an optional ``resByArg`` field, which
describes the resolutions for calls with all constant integer arguments:

.. code-block:: text

    resByArg: (ResByArg[, ResByArg]*)

where ResByArg is:

.. code-block:: text

    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])

Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
or ``VirtualConstProp``.
The ``info`` field is only used if the kind
is ``UniformRetVal`` (indicates the uniform return value), or
``UniqueRetVal`` (holds the return value associated with the unique vtable
(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
not support the use of absolute symbols to store constants.

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affects code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x i8*] [
       i8* @X,
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that it cannot see (which is why they have to be named). For example, if
a variable has internal linkage and no references other than that from the
``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.
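
As a sketch of the internal-linkage case described above, the following module
keeps an otherwise unreferenced internal variable alive by listing it in
``@llvm.used``. The name ``@hidden_counter`` is hypothetical and chosen only
for illustration.

.. code-block:: llvm

    ; @hidden_counter has internal linkage and no other uses in the IR, so it
    ; could normally be deleted. Its entry in @llvm.used prevents that.
    @hidden_counter = internal global i32 0

    @llvm.used = appending global [1 x i8*] [
       i8* bitcast (i32* @hidden_counter to i8*)
    ], section "llvm.metadata"
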

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
removing the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This construct should only be used in rare circumstances,
and should not be exposed to source languages.

.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.

.. _llvmglobaldtors:

The '``llvm.global_dtors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]

The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in descending
order of priority (i.e. highest first) when the module is unloaded. The
order of functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the destructor function will only run if the associated
data from the current module is not discarded.

Instruction Reference
=====================

The LLVM instruction set consists of several different classifications
of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
instructions <binaryops>`, :ref:`bitwise binary
instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
:ref:`other instructions <otherops>`.

.. _terminators:

Terminator Instructions
-----------------------

As mentioned :ref:`previously <functionstructure>`, every basic block in a
program ends with a "Terminator" instruction, which indicates which
block should be executed after the current block is finished. These
terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the one exception being the
':ref:`invoke <i_invoke>`' instruction).

The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.

.. _i_ret:

'``ret``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      ret <type> <value>       ; Return a value from a non-void function
      ret void                 ; Return from void function

Overview:
"""""""""

The '``ret``' instruction is used to return control flow (and optionally
a value) from a function back to the caller.

There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.

Arguments:
""""""""""

The '``ret``' instruction optionally accepts a single argument, the
return value. The type of the return value must be a ':ref:`first
class <t_firstclass>`' type.

A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.

Semantics:
""""""""""

When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller was an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.

Example:
""""""""

.. code-block:: llvm

      ret i32 5                       ; Return an integer value of 5
      ret void                        ; Return from a void function
      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

.. _i_br:

'``br``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      br i1 <cond>, label <iftrue>, label <iffalse>
      br label <dest>          ; Unconditional branch

Overview:
"""""""""

The '``br``' instruction is used to cause control flow to transfer to a
different basic block in the current function. There are two forms of
this instruction, corresponding to a conditional branch and an
unconditional branch.

Arguments:
""""""""""

The conditional branch form of the '``br``' instruction takes a single
'``i1``' value and two '``label``' values. The unconditional form of the
'``br``' instruction takes a single '``label``' value as a target.

Semantics:
""""""""""

Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
to the '``iffalse``' ``label`` argument.
If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Example:
""""""""

.. code-block:: llvm

    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

.. _i_switch:

'``switch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]

Overview:
"""""""""

The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.

Arguments:
""""""""""

The '``switch``' instruction uses three parameters: an integer
comparison value '``value``', a default '``label``' destination, and an
array of pairs of comparison value constants and '``label``'s. The table
is not allowed to contain duplicate constant entries.

Semantics:
""""""""""

The ``switch`` instruction specifies a table of values and destinations.
When the '``switch``' instruction is executed, this table is searched
for the given value. If the value is found, control flow is transferred
to the corresponding destination; otherwise, control flow is transferred
to the default destination.
If '``value``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

Depending on properties of the target machine and the particular
``switch`` instruction, this instruction may be code generated in
different ways. For example, it could be generated as a series of
chained conditional branches or with a lookup table.

Example:
""""""""

.. code-block:: llvm

     ; Emulate a conditional br instruction
     %Val = zext i1 %value to i32
     switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

     ; Emulate an unconditional br instruction
     switch i32 0, label %dest [ ]

     ; Implement a jump table:
     switch i32 %val, label %otherwise [ i32 0, label %onzero
                                         i32 1, label %onone
                                         i32 2, label %ontwo ]

.. _i_indirectbr:

'``indirectbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]

Overview:
"""""""""

The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``".
Address must be derived from a
:ref:`blockaddress <blockaddress>` constant.

Arguments:
""""""""""

The '``address``' argument is the address of the label to jump to. The
rest of the arguments indicate the full set of possible destinations
that the address may point to. Blocks are allowed to occur multiple
times in the destination list, though this isn't particularly useful.

This destination list is required so that dataflow analysis has an
accurate understanding of the CFG.

Semantics:
""""""""""

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list, otherwise
this instruction has undefined behavior. This implies that jumps to
labels defined in other functions have undefined behavior as well.
If '``address``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

This is typically implemented with a jump through a register.

Example:
""""""""

.. code-block:: llvm

     indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]

.. _i_invoke:

'``invoke``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                    [operand bundles] to label <normal label> unwind label <exception label>

Overview:
"""""""""

The '``invoke``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``normal``' label or the '``exception``' label. If the callee function
returns with the "``ret``" instruction, control flow will return to the
"normal" label.
If the callee (or any indirect callees) returns via the 7942":ref:`resume <i_resume>`" instruction or other exception handling 7943mechanism, control is interrupted and continued at the dynamically 7944nearest "exception" label. 7945 7946The '``exception``' label is a `landing 7947pad <ExceptionHandling.html#overview>`_ for the exception. As such, 7948'``exception``' label is required to have the 7949":ref:`landingpad <i_landingpad>`" instruction, which contains the 7950information about the behavior of the program after unwinding happens, 7951as its first non-PHI instruction. The restrictions on the 7952"``landingpad``" instruction's tightly couples it to the "``invoke``" 7953instruction, so that the important information contained within the 7954"``landingpad``" instruction can't be lost through normal code motion. 7955 7956Arguments: 7957"""""""""" 7958 7959This instruction requires several arguments: 7960 7961#. The optional "cconv" marker indicates which :ref:`calling 7962 convention <callingconv>` the call should use. If none is 7963 specified, the call defaults to using C calling conventions. 7964#. The optional :ref:`Parameter Attributes <paramattrs>` list for return 7965 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes 7966 are valid here. 7967#. The optional addrspace attribute can be used to indicate the address space 7968 of the called function. If it is not specified, the program address space 7969 from the :ref:`datalayout string<langref_datalayout>` will be used. 7970#. '``ty``': the type of the call instruction itself which is also the 7971 type of the return value. Functions that return no value are marked 7972 ``void``. 7973#. '``fnty``': shall be the signature of the function being invoked. The 7974 argument types must match the types implied by this signature. This 7975 type can be omitted if the function is not varargs. 7976#. '``fnptrval``': An LLVM value containing a pointer to a function to 7977 be invoked. 
   In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with a label, which is used by the runtime
library to unwind the stack.

This instruction is used in languages with destructors to ensure that
proper cleanup is performed in the case of either a ``longjmp`` or a
thrown exception. Additionally, this is important for implementation of
'``catch``' clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned
by the '``invoke``' instruction is deemed to occur on the edge from the
current block to the "normal" label. If the callee unwinds then no
return value is available.

Example:
""""""""

.. code-block:: llvm

    %retval = invoke i32 @Test(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set
    %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set

.. _i_callbr:

'``callbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                  [operand bundles] to label <fallthrough label> [indirect labels]

Overview:
"""""""""

The '``callbr``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``fallthrough``' label or one of the '``indirect``' labels.

This instruction should only be used to implement the "goto" feature of gcc
style inline assembly. Any other usage is an error in the IR verifier.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature.
   This type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect ``callbr``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``fallthrough label``': the label reached when the inline assembly's
   execution exits the bottom.
#. '``indirect labels``': the labels reached when a callee transfers control
   to a location other than the '``fallthrough label``'. The blockaddress
   constant for these should also be in the list of '``function args``'.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with additional labels to define where control
flow goes after the call.

The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).

The only use of this today is to implement the "goto" feature of gcc inline
assembly where additional labels can be provided as locations for the inline
assembly to jump to.

Example:
""""""""

.. code-block:: llvm

    ; "asm goto" without output constraints.
    callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
                to label %fallthrough [label %indirect]

    ; "asm goto" with output constraints.
    <result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
                to label %fallthrough [label %indirect]

.. _i_resume:

'``resume``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    resume <type> <value>

Overview:
"""""""""

The '``resume``' instruction is a terminator instruction that has no
successors.

Arguments:
""""""""""

The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.

Semantics:
""""""""""

The '``resume``' instruction resumes propagation of an existing
(in-flight) exception whose unwinding was interrupted with a
:ref:`landingpad <i_landingpad>` instruction.

Example:
""""""""

.. code-block:: llvm

    resume { i8*, i32 } %exn

.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
    <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>

Overview:
"""""""""

The '``catchswitch``' instruction is used by `LLVM's exception handling system
<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
that may be executed by the :ref:`EH personality routine <personalityfn>`.

Arguments:
""""""""""

The ``parent`` argument is the token of the funclet that contains the
``catchswitch`` instruction.
If the ``catchswitch`` is not inside a funclet,
this operand may be the token ``none``.

The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.

Semantics:
""""""""""

Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.

The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
it must be both the first non-phi instruction and last instruction in the basic
block. Therefore, it must be the only non-phi instruction in the block.

Example:
""""""""

.. code-block:: text

    dispatch1:
      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
    dispatch2:
      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup

.. _i_catchret:

'``catchret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    catchret from <token> to label <normal>

Overview:
"""""""""

The '``catchret``' instruction is a terminator instruction that has a
single successor.

Arguments:
""""""""""

The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.

Semantics:
""""""""""

The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.

Example:
""""""""

.. code-block:: text

    catchret from %catch to label %continue

.. _i_cleanupret:

'``cleanupret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    cleanupret from <value> unwind label <continue>
    cleanupret from <value> unwind to caller

Overview:
"""""""""

The '``cleanupret``' instruction is a terminator instruction that has
an optional successor.

Arguments:
""""""""""

The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.

The '``cleanupret``' instruction also has an optional successor, ``continue``,
which must be the label of another basic block beginning with either a
``cleanuppad`` or ``catchswitch`` instruction.
This unwind destination must
be a legal target with respect to the ``parent`` links, as described in the
`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

Semantics:
""""""""""

The '``cleanupret``' instruction indicates to the
:ref:`personality function <personalityfn>` that one
:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
It transfers control to ``continue`` or unwinds out of the function.

Example:
""""""""

.. code-block:: text

    cleanupret from %cleanup unwind to caller
    cleanupret from %cleanup unwind label %continue

.. _i_unreachable:

'``unreachable``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    unreachable

Overview:
"""""""""

The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used to indicate that the code
after a no-return function cannot be reached, and other facts.

Semantics:
""""""""""

The '``unreachable``' instruction has no defined semantics.

.. _unaryops:

Unary Operations
----------------

Unary operators require a single operand, execute an operation on
it, and produce a single value. The operand might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operand.

.. _i_fneg:

'``fneg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result

Overview:
"""""""""

The '``fneg``' instruction returns the negation of its operand.
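
As a minimal sketch of the instruction in context (the function name
``@negate`` is hypothetical and chosen only for illustration), a
function that negates a scalar floating-point argument could look like
this:

.. code-block:: llvm

    ; Hypothetical helper, not part of LLVM: returns -%x.
    define float @negate(float %x) {
    entry:
      %neg = fneg float %x          ; negation of the single operand
      ret float %neg
    }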

Arguments:
""""""""""

The argument to the '``fneg``' instruction must be a
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values.

Semantics:
""""""""""

The value produced is a copy of the operand with its sign bit flipped.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fneg float %val          ; yields float:result = -%val

.. _binaryops:

Binary Operations
-----------------

Binary operators are used to do most of the computation in a program.
They require two operands of the same type, execute an operation on
them, and produce a single value. The operands might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operands.

There are several different binary operators:

.. _i_add:

'``add``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = add <ty> <op1>, <op2>          ; yields ty:result
    <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
    <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
    <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``add``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``add``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer sum of the two operands.
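
The scalar and vector forms can be sketched as follows. The function
names ``@sum`` and ``@vsum`` are hypothetical and serve only to make
the fragment a complete piece of IR; the operand types are an arbitrary
choice:

.. code-block:: llvm

    ; Hypothetical helper: scalar integer sum.
    define i32 @sum(i32 %a, i32 %b) {
    entry:
      %s = add i32 %a, %b           ; both operands have type i32
      ret i32 %s
    }

    ; Hypothetical helper: the same operation applied to each vector element.
    define <4 x i32> @vsum(<4 x i32> %a, <4 x i32> %b) {
    entry:
      %s = add <4 x i32> %a, %b     ; both operands have type <4 x i32>
      ret <4 x i32> %s
    }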

If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

    <result> = add i32 4, %var          ; yields i32:result = 4 + %var

.. _i_fadd:

'``fadd``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fadd``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``fadd``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

.. _i_sub:

'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sub <ty> <op1>, <op2>          ; yields ty:result
    <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
    <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
    <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``sub``' instruction returns the difference of its two operands.

Note that the '``sub``' instruction is used to represent the '``neg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``sub``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

    <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
    <result> = sub i32 0, %val          ; yields i32:result = -%val

.. _i_fsub:

'``fsub``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fsub``' instruction returns the difference of its two operands.

Arguments:
""""""""""

The two arguments to the '``fsub``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
    <result> = fsub float -0.0, %val          ; yields float:result = -%val

.. _i_mul:

'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = mul <ty> <op1>, <op2>          ; yields ty:result
    <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
    <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
    <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``mul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``mul``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer product of the two operands.
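
Because the product has the same type as the operands, a caller that
needs a wider result has to widen the operands before multiplying. One
possible way to do this for signed ``i32`` inputs is sketched below; the
function name ``@mul_wide`` is hypothetical, and the choice of ``sext``
assumes the inputs are meant to be interpreted as signed:

.. code-block:: llvm

    ; Hypothetical helper: signed i32 * i32 -> i64 product.
    define i64 @mul_wide(i32 %a, i32 %b) {
    entry:
      %a.ext = sext i32 %a to i64   ; widen first (zext would suit unsigned inputs)
      %b.ext = sext i32 %b to i64
      %p = mul i64 %a.ext, %b.ext   ; multiply at the width of the result
      ret i64 %p
    }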

If the result of the multiplication has unsigned overflow, the result
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
bit width of the result.

Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

    <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

.. _i_fmul:

'``fmul``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fmul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``fmul``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

.. _i_udiv:

'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
    <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``udiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``udiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are
distinct operations; for signed integer division, use '``sdiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.

If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2`` (as
such, "((a udiv exact b) mul b) == a").

Example:
""""""""

.. code-block:: text

    <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_sdiv:

'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
    <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values.
Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.

Example:
""""""""

.. code-block:: text

    <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

.. _i_urem:

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.

Example:
""""""""

.. code-block:: text

    <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_srem:

'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.
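
The operand rule above can be sketched with one scalar and one vector
use. The function names ``@rem_scalar`` and ``@rem_vector`` are
hypothetical, and the element type and vector length are arbitrary
choices for illustration:

.. code-block:: llvm

    ; Hypothetical helper: both operands are i32.
    define i32 @rem_scalar(i32 %n, i32 %d) {
    entry:
      %r = srem i32 %n, %d
      ret i32 %r
    }

    ; Hypothetical helper: both operands are <2 x i32>, an integer vector.
    define <2 x i32> @rem_vector(<2 x i32> %n, <2 x i32> %d) {
    entry:
      %r = srem <2 x i32> %n, %d
      ret <2 x i32> %r
    }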

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets srem be implemented using instructions that return both the
result of the division and the remainder.)

Example:
""""""""

.. code-block:: text

    <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

.. _i_shl:

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = shl <ty> <op1>, <op2>           ; yields ty:result
    <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
    <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
    <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.
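
A consequence of the rule above is that the shift amount always has the
same type as the value being shifted, even when a narrower type would
be enough to hold it. A small sketch, with hypothetical function names
``@shift_scalar`` and ``@shift_vector`` and arbitrarily chosen types:

.. code-block:: llvm

    ; Hypothetical helper: the shift amount is an i8 because %x is an i8.
    define i8 @shift_scalar(i8 %x, i8 %amt) {
    entry:
      %r = shl i8 %x, %amt          ; %amt is interpreted as unsigned
      ret i8 %r
    }

    ; Hypothetical helper: vector operands also share a single type.
    define <4 x i16> @shift_vector(<4 x i16> %x, <4 x i16> %amt) {
    entry:
      %r = shl <4 x i16> %x, %amt
      ret <4 x i16> %r
    }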

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
If the arguments are vectors, each vector element of ``op1`` is shifted
by the corresponding shift amount in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a poison
value if it shifts out any non-zero bits.
If the ``nsw`` keyword is present, then the shift produces a poison
value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison value (shift amount equals the bit width)
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>

.. _i_lshr:

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift.
If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison value (shift amount equals the bit width)
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

.. _i_ashr:

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`.
If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison value (shift amount equals the bit width)
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>

.. _i_and:

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0

.. _i_or:

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12

.. _i_xor:

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. An ``xor`` with an all-ones operand (``-1``) is used to
implement the "one's complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.
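
As a sketch of the one's complement idiom mentioned in the overview, the
functions below flip every bit of a scalar and of a vector. The names
``@bitwise_not`` and ``@bitwise_not_vec`` are hypothetical and exist only for
this illustration. In the vector case the all-ones constant must be spelled
out as a vector of the same type as the other operand.

.. code-block:: llvm

      define i32 @bitwise_not(i32 %v) {
        ; Equivalent to ~v in C.
        %r = xor i32 %v, -1
        ret i32 %r
      }

      define <4 x i8> @bitwise_not_vec(<4 x i8> %v) {
        ; Element-wise complement; both operands are <4 x i8>.
        %r = xor <4 x i8> %v, <i8 -1, i8 -1, i8 -1, i8 -1>
        ret <4 x i8> %r
      }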

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
      <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val`` for a fixed-length vector, the result is a
:ref:`poison value <poisonvalues>`. For a scalable vector, if the value
of ``idx`` exceeds the runtime length of the vector, the result is a
:ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
      <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
if the value of ``idx`` exceeds the runtime length of the vector, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask vector constant
whose element type is ``i32``. The mask vector elements must be constant
integers or ``undef`` values. The result of the instruction is a vector
whose length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. For each element of the result vector, the
shuffle mask selects an element from one of the input vectors to copy
to the result. Non-negative elements in the mask represent an index
into the concatenated pair of input vectors.

If the shuffle mask is undefined, the result vector is undefined. If
the shuffle mask selects an undefined element from one of the input
vectors, the resulting element is undefined. An undefined element
in the mask vector specifies that the resulting element is undefined.
An undefined element in the mask vector prevents a poisoned vector
element from propagating.
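
The numbering scheme can be illustrated with a short sketch. The function
name ``@pick`` and its parameter names are made up for this example. With two
``<4 x i32>`` inputs, mask indices 0 through 3 select from the first operand
and indices 4 through 7 select from the second.

.. code-block:: llvm

      define <4 x i32> @pick(<4 x i32> %a, <4 x i32> %b) {
        ; Index 0 -> element 0 of %a
        ; Index 4 -> element 0 of %b
        ; undef   -> the result element is undefined
        ; Index 7 -> element 3 of %b
        %r = shufflevector <4 x i32> %a, <4 x i32> %b,
                           <4 x i32> <i32 0, i32 4, i32 undef, i32 7>
        ret <4 x i32> %r
      }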

For scalable vectors, the only valid mask values at present are
``zeroinitializer`` and ``undef``, since we cannot write all indices as
literals for a vector with a length unknown at compile time.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences from ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.
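
To make the comparison concrete, the sketch below reads the same nested
element twice: once from an aggregate value with ``extractvalue`` and once
through a pointer with ``getelementptr``. The type name ``%pair`` and both
function names are hypothetical and are used only for illustration.

.. code-block:: llvm

      %pair = type { i32, [4 x float] }

      define float @third_float(%pair %agg) {
        ; Field 1 of the struct, then element 2 of the array.
        ; No leading zero index is written.
        %f = extractvalue %pair %agg, 1, 2
        ret float %f
      }

      define float* @third_float_addr(%pair* %p) {
        ; The getelementptr form needs the extra leading index (i32 0)
        ; to step through the pointer before indexing the aggregate.
        %addr = getelementptr %pair, %pair* %p, i32 0, i32 1, i32 2
        ret float* %addr
      }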

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory.
In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. If the address space is not explicitly
specified, the object is allocated in the alloca address space from the
:ref:`datalayout string<langref_datalayout>`.

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 29``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.

'``type``' may be any sized type.

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The allocated memory is
uninitialized, and loading from uninitialized memory produces an undefined
value. The operation itself is undefined if there is insufficient stack
space for the allocation. '``alloca``'d memory is automatically released
when the function returns.
The '``alloca``' instruction is commonly used
to represent automatic variables that must have an address available. When
the function returns (either with the ``ret`` or ``resume`` instructions),
the memory is reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
which way the stack grows) is not specified.

Note that '``alloca``' outside of the alloca address space from the
:ref:`datalayout string<langref_datalayout>` is meaningful only if the
target has assigned it a semantics.

If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
the returned object is initially dead.
See :ref:`llvm.lifetime.start <int_lifestart>` and
:ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
lifetime-manipulating intrinsics.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                             ; yields i32*:ptr
      %ptr = alloca i32, i32 4                      ; yields i32*:ptr
      %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
      %ptr = alloca i32, align 1024                 ; yields i32*:ptr

.. _i_load:

'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}
      !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
      !<align_node> = !{ i64 <value_alignment> }

Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit.
``align`` must be
explicitly specified on atomic loads, and the load has undefined behavior if the
alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 29``. An alignment value higher
than the size of the loaded type implies memory up to the alignment
value bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<nontemp_node>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries.
If a load instruction tagged with the ``!invariant.load``
metadata is executed, the memory location referenced by the load has
to contain the same value at all points in the program where the
memory location is dereferenceable; otherwise, the behavior is
undefined.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. If the value is null at runtime, the behavior is undefined.
This is analogous to the ``nonnull`` attribute on parameters and return
values. This metadata can only be applied to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
<md_dereferenceable_or_null>`.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type. If the returned
value is not appropriately aligned at runtime, the behavior is undefined.

The optional ``!noundef`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a node with no entries. The existence of
``!noundef`` metadata on the instruction tells the optimizer that the value
loaded is known to be :ref:`well defined <welldefinedvalues>`.
If the value isn't well defined, the behavior is undefined.

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.
If the value being loaded is of aggregate type, the bytes that correspond to
padding may be accessed but are ignored, because it is impossible to observe
padding from the loaded aggregate value.
If ``<pointer>`` is not a well-defined value, the behavior is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store and an
address at which to store it. The type of the ``<pointer>`` operand must be a
pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``store`` with other
:ref:`volatile operations <volatile>`. Only values of :ref:`first class
<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
structural type <t_opaque>`) can be stored.

If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic stores, and the store has undefined behavior if
the alignment is not set to a value which is at least the size in bytes of the
pointee.
``!nontemporal`` does not have any defined semantics for atomic stores.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe. The maximum possible alignment is ``1 << 29``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be stored to without trapping in the default
address space. Storing to the higher bytes, however, may result in data
races if another thread can access the same address. Introducing a
data race is not allowed. Storing to the extra bytes is not allowed
even in situations where a data race is known to not exist if the
function has the ``sanitize_address`` attribute.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

The optional ``!invariant.group`` metadata must reference a
single metadata name ``<empty_node>``. See ``invariant.group`` metadata.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand.
If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.
If ``<value>`` is of aggregate type, padding is filled with
:ref:`undef <undefvalues>`.
If ``<pointer>`` is not a well-defined value, the behavior is undefined.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      fence [syncscope("<target-scope>")] <ordering>  ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B.
Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (respectively) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.

A ``fence`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Example:
""""""""

.. code-block:: text

   fence acquire                                        ; yields void
   fence syncscope("singlethread") seq_cst              ; yields void
   fence syncscope("agent") seq_cst                     ; yields void

.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '<cmp>' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '<cmp>' and '<new>' must
have the same type, and the type of '<pointer>' must be a pointer to
that type.
If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
``release`` or ``acq_rel``.

A ``cmpxchg`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<cmp>' type. If unspecified, the alignment is assumed to be equal to the
size of the '<cmp>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when
``align`` isn't specified.

The pointer passed into ``cmpxchg`` must have alignment greater than or
equal to the size in memory of the operand.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
written to the location. The original value at the location is returned,
together with a flag indicating success (true) or failure (false).

If the ``cmpxchg`` operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the ``cmpxchg`` operation is strong (the default), the ``i1`` value is 1 if
and only if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences.
A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

Example:
""""""""

.. code-block:: llvm

   entry:
     %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
     br label %loop

   loop:
     %cmp = phi i32 [ %orig, %entry ], [ %value_loaded, %loop ]
     %squared = mul i32 %cmp, %cmp
     %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
     %value_loaded = extractvalue { i32, i1 } %val_success, 0
     %success = extractvalue { i32, i1 } %val_success, 1
     br i1 %success, label %done, label %loop

   done:
     ...

.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin
-  fadd
-  fsub

For most of these operations, the type of '<value>' must be an integer
type whose bit width is a power of two greater than or equal to eight
and less than or equal to a target-specific size limit. For xchg, this
may also be a floating point type with the same size constraints as
integers. For fadd/fsub, this must be a floating point type. The
type of the '``<pointer>``' operand must be a pointer to that type.
If
the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this
``atomicrmw`` with other :ref:`volatile operations <volatile>`.

The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<value>' type. If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when
``align`` isn't specified.

An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)

Example:
""""""""

.. code-block:: llvm

   %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32

.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
    <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
    <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory. The instruction can also
be used to calculate a vector of such addresses.

Arguments:
""""""""""

The first argument is always a type used as the basis for the calculations.
The second argument is always a pointer or a vector of pointers, and is the
base address to start from. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
second argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value, subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant).
When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

   struct RT {
     char A;
     int B[10][20];
     char C;
   };
   struct ST {
     int X;
     double Y;
     struct RT Z;
   };

   int *foo(struct ST *s) {
     return &s[1].Z.B[5][13];
   }

The LLVM code generated by Clang is:

.. code-block:: llvm

   %struct.RT = type { i8, [10 x [20 x i32]], i8 }
   %struct.ST = type { i32, double, %struct.RT }

   define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
   entry:
     %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
     ret i32* %arrayidx
   }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

   define i32* @foo(%struct.ST* %s) {
     %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
     %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
     %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
     %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
     %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
     ret i32* %t5
   }

If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
following rules is violated:

*  The base pointer has an *in bounds* address of an allocated object, which
   means that it points into an allocated object, or to its end. The only
   *in bounds* address for a null pointer in the default address-space is the
   null pointer itself.
*  If the type of an index is larger than the pointer index type, the
   truncation to the pointer index type preserves the signed value.
*  The multiplication of an index by the type size does not wrap the pointer
   index type in a signed sense (``nsw``).
*  The successive addition of offsets (without adding the base address) does
   not wrap the pointer index type in a signed sense (``nsw``).
*  The successive addition of the current address, interpreted as an unsigned
   number, and an offset, interpreted as a signed number, does not wrap the
   unsigned address space and remains *in bounds* of the allocated object.
   As a corollary, if the added offset is non-negative, the addition does not
   wrap in an unsigned sense (``nuw``).
*  In cases where the base is a vector of pointers, the ``inbounds`` keyword
   applies to each of the computations element-wise.
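
As a rough sketch of the address arithmetic that the ``foo`` example above
performs, the following C fragment recomputes the byte offset one index at a
time and compares it with the address the C compiler derives. The helper names
``foo_offset_by_index`` and ``foo_offset_by_address`` are hypothetical and
exist only for this illustration; the sketch models offsets only and says
nothing about ``inbounds`` or poison semantics.

.. code-block:: c

   #include <stddef.h>

   struct RT {
     char A;
     int B[10][20];
     char C;
   };
   struct ST {
     int X;
     double Y;
     struct RT Z;
   };

   /* Byte offset contributed by the five indices (1, 2, 1, 5, 13). */
   size_t foo_offset_by_index(void) {
     return 1 * sizeof(struct ST)     /* i64 1: step over one struct ST      */
          + offsetof(struct ST, Z)    /* i32 2: third field of struct ST     */
          + offsetof(struct RT, B)    /* i32 1: second field of struct RT    */
          + 5 * sizeof(int[20])       /* i64 5: row 5 of the 10 x 20 array   */
          + 13 * sizeof(int);         /* i64 13: element 13 within that row  */
   }

   /* The same offset, measured from the C expression used in foo().
      The caller must pass an array of at least two struct ST objects. */
   size_t foo_offset_by_address(struct ST *s) {
     return (size_t)((char *)&s[1].Z.B[5][13] - (char *)s);
   }

For an array ``struct ST arr[2]``, ``foo_offset_by_index()`` and
``foo_offset_by_address(arr)`` are expected to agree whatever padding the
target ABI inserts, because both are derived from the same layout.
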

These rules are based on the assumption that no allocated object may cross
the unsigned address space boundary, and no allocated object may be larger
than half the pointer index type space.

If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer. The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. The result value may not necessarily be used to access memory
though, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.

If the ``inrange`` keyword is present before any index, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as ``inrange``. The result of a
pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons
in the case where both operands are in the range of the element selected
by the ``inrange`` keyword, inclusive of the address one past the end of
that element. Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.

The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

   ; yields [12 x i8]*:aptr
   %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
   ; yields i8*:vptr
   %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
   ; yields i8*:eptr
   %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
   ; yields i32*:iptr
   %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0

Vector of pointers:
"""""""""""""""""""

The ``getelementptr`` returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such cases, all vector
arguments should have the same number of elements, and every scalar argument
will be effectively broadcast into a vector during address calculation.

.. code-block:: llvm

   ; All arguments are vectors:
   ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
   %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

   ; Add the same scalar offset to each pointer of a vector:
   ;   A[i] = ptrs[i] + offset*sizeof(i8)
   %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

   ; Add distinct offsets to the same pointer:
   ;   A[i] = ptr + offsets[i]*sizeof(i8)
   %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets

   ; In all cases described above the type of the result is <4 x i8*>

The two following instructions are equivalent:

.. code-block:: llvm

   getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
     <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
     <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
     <4 x i32> %ind4,
     <4 x i64> <i64 13, i64 13, i64 13, i64 13>

   getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
     i32 2, i32 1, <4 x i32> %ind4, i64 13

Consider the following C code, for which the vector version of
``getelementptr`` makes sense:

.. code-block:: c

   // Let's assume that we vectorize the following loop:
   double *A, *B; int *C;
   for (int i = 0; i < size; ++i) {
     A[i] = B[C[i]];
   }

.. code-block:: llvm

   ; get pointers for 8 elements from array B
   %ptrs = getelementptr double, double* %B, <8 x i32> %C
   ; load 8 elements from array B into A
   %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
        i32 8, <8 x i1> %mask, <8 x double> %passthru)

Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.

.. _i_trunc:

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

Example:
""""""""

.. code-block:: llvm

   %X = trunc i32 257 to i8                        ; yields i8:1
   %Y = trunc i32 123 to i1                        ; yields i1:true
   %Z = trunc i32 122 to i1                        ; yields i1:false
   %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

.. _i_zext:

'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from ``i1``, the result will always be either 0 or 1.

Example:
""""""""

.. code-block:: llvm

   %X = zext i32 257 to i64              ; yields i64:257
   %Y = zext i1 true to i32              ; yields i32:1
   %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

.. _i_sext:

'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sext``' instruction sign extends ``value`` to the type ``ty2``.

Arguments:
""""""""""

The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers.
The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.

When sign extending from ``i1``, the extension always results in -1 or 0.

Example:
""""""""

.. code-block:: llvm

   %X = sext i8  -1 to i16              ; yields i16:-1 (bit pattern 0xFFFF, 65535 if read as unsigned)
   %Y = sext i1 true to i32             ; yields i32:-1
   %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
The size of ``value`` must be larger than the size of ``ty2``. This
implies that ``fptrunc`` cannot be used to make a *no-op cast*.

Semantics:
""""""""""

The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.

Example:
""""""""

.. code-block:: llvm

   %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
   %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity

'``fpext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fpext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fpext``' instruction extends a floating-point ``value`` to a larger
floating-point value.

Arguments:
""""""""""

The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
to. The source type must be smaller than the destination type.

Semantics:
""""""""""

The '``fpext``' instruction extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The ``fpext`` cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.

Example:
""""""""

.. code-block:: llvm

   %X = fpext float 3.125 to double         ; yields double:3.125000e+00
   %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000

'``fptoui .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptoui <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptoui``' instruction converts a floating-point ``value`` to its
unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type.
If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptoui``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

   %X = fptoui double 123.0 to i32      ; yields i32:123
   %Y = fptoui float 1.0E+300 to i1     ; yields poison (value does not fit in i1)
   %Z = fptoui float 1.04E+17 to i8     ; yields poison (value does not fit in i8)

'``fptosi .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptosi <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptosi``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

   %X = fptosi double -123.0 to i32      ; yields i32:-123
   %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
   %Z = fptosi float 1.04E+17 to i8      ; yields poison (value does not fit in i8)

'``uitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = uitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``uitofp``' instruction regards ``value`` as an unsigned integer
and converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating-point
value. If the value cannot be exactly represented, it is rounded using
the default rounding mode.

Example:
""""""""

.. code-block:: llvm

   %X = uitofp i32 257 to float         ; yields float:257.0
   %Y = uitofp i8 -1 to double          ; yields double:255.0

'``sitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sitofp``' instruction regards ``value`` as a signed integer and
converts that value to the ``ty2`` type.
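
As an informal illustration of how ``uitofp`` and ``sitofp`` differ, the
following C sketch converts the same 8-bit pattern under both interpretations.
The helper names ``u8_bits_to_double`` and ``s8_bits_to_double`` are
hypothetical, and the code only models the two readings of the bits; it is not
a description of how LLVM lowers either instruction.

.. code-block:: c

   #include <stdint.h>

   /* Reads the 8 bits as an unsigned quantity, in the spirit of
      "uitofp i8 ... to double". */
   double u8_bits_to_double(uint8_t bits) {
     return (double)bits;
   }

   /* Reads the same 8 bits as a two's complement signed quantity, in the
      spirit of "sitofp i8 ... to double". The subtraction avoids relying on
      an implementation-defined narrowing conversion. */
   double s8_bits_to_double(uint8_t bits) {
     return bits >= 128 ? (double)bits - 256.0 : (double)bits;
   }

With the all-ones pattern ``0xFF`` (``i8 -1``), the first helper gives
``255.0`` and the second gives ``-1.0``, matching the ``uitofp`` and
``sitofp`` examples in this section.
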

Arguments:
""""""""""

The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.

Example:
""""""""

.. code-block:: llvm

   %X = sitofp i32 257 to float         ; yields float:257.0
   %Y = sitofp i8 -1 to double          ; yields double:-1.0

.. _i_ptrtoint:

'``ptrtoint .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``ptrtoint``' instruction converts the pointer or a vector of
pointers ``value`` to the integer (or vector of integers) type ``ty2``.

Arguments:
""""""""""

The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.

Semantics:
""""""""""

The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done.
If
``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

Example:
""""""""

.. code-block:: llvm

   %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
   %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
   %Z = ptrtoint <4 x i32*> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture

.. _i_inttoptr:

'``inttoptr .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2

Overview:
"""""""""

The '``inttoptr``' instruction converts an integer ``value`` to a
pointer type, ``ty2``.

Arguments:
""""""""""

The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata.

Semantics:
""""""""""

The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
applying either a zero extension or a truncation depending on the size
of the integer ``value``. If ``value`` is larger than the size of a
pointer then a truncation is done.
If ``value`` is smaller than the size
of a pointer then a zero extension is done. If they are the same size,
nothing is done (*no-op cast*).

Example:
""""""""

.. code-block:: llvm

   %X = inttoptr i32 255 to i32*           ; yields zero extension on 64-bit architecture
   %Y = inttoptr i32 255 to i32*           ; yields no-op on 32-bit architecture
   %Z = inttoptr i64 0 to i32*             ; yields truncation on 32-bit architecture
   %W = inttoptr <4 x i32> %G to <4 x i8*> ; yields truncation of vector G to four pointers

.. _i_bitcast:

'``bitcast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = bitcast <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.

Arguments:
""""""""""

The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

There is a caveat for bitcasts involving vector types in relation to
endianness. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
of the vector in the least significant bits of the i16 for little-endian while
element zero ends up in the most significant bits for big-endian.

Example:
""""""""

.. code-block:: text

   %X = bitcast i8 255 to i8                ; yields i8 :-1
   %Y = bitcast i32* %x to i16*             ; yields i16*:%x
   %Z = bitcast <2 x i32> %V to i64         ; yields i64: %V (depends on endianness)
   %W = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>

.. _i_addrspacecast:

'``addrspacecast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2

Overview:
"""""""""

The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
address space ``n`` to type ``pty2`` in address space ``m``.

Arguments:
""""""""""

The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer type to cast it to, which must have a different
address space.

Semantics:
""""""""""

The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address space
conversion is legal then both result and operand refer to the same memory
location.

Example:
""""""""

.. code-block:: llvm

   %X = addrspacecast i32* %x to i32 addrspace(1)*                 ; yields i32 addrspace(1)*:%x
   %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
   %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z

.. _otherops:

Other Operations
----------------

The instructions in this category are the "miscellaneous" instructions,
which defy better classification.

.. _i_icmp:

'``icmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``icmp``' instruction returns a boolean value or a vector of
boolean values based on comparison of its two integer, integer vector,
pointer, or pointer vector operands.

Arguments:
""""""""""

The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal

The remaining two arguments must be :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
must also be identical types.

Semantics:
""""""""""

The '``icmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``.
The comparison performed always yields either an
:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:

#. ``eq``: yields ``true`` if the operands are equal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ugt``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than ``op2``.
#. ``uge``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than or equal to ``op2``.
#. ``ult``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than ``op2``.
#. ``ule``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than or equal to ``op2``.
#. ``sgt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than ``op2``.
#. ``sge``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than or equal to ``op2``.
#. ``slt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than ``op2``.
#. ``sle``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than or equal to ``op2``.

If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
are compared as if they were integers.

If the operands are integer vectors, then they are compared element by
element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.

Example:
""""""""

.. code-block:: text

   <result> = icmp eq i32 4, 5          ; yields: result=false
   <result> = icmp ne float* %X, %X     ; yields: result=false
   <result> = icmp ult i16  4, 5        ; yields: result=true
   <result> = icmp sgt i16  4, 5        ; yields: result=false
   <result> = icmp ule i16 -4, 5        ; yields: result=false
   <result> = icmp sge i16  4, 5        ; yields: result=false

.. _i_fcmp:

'``fcmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``fcmp``' instruction returns a boolean value or vector of boolean
values based on comparison of its operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

Arguments:
""""""""""

The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no nans)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either nans)
#. ``true``: no comparison, always returns true

*Ordered* means that neither operand is a QNAN while *unordered* means
that either operand may be a QNAN.

Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.

Semantics:
""""""""""

The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

#. ``false``: always yields ``false``, regardless of operands.
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
   is equal to ``op2``.
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than ``op2``.
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than or equal to ``op2``.
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than ``op2``.
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than or equal to ``op2``.
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
   is not equal to ``op2``.
#. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than ``op2``.
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than or equal to ``op2``.
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
   not equal to ``op2``.
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.

The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating-point optimizations.

Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.

Example:
""""""""

.. code-block:: text

   <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
   <result> = fcmp one float 4.0, 5.0    ; yields: result=true
   <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
   <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

.. _i_phi:

'``phi``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...

Overview:
"""""""""

The '``phi``' instruction is used to implement the φ node in the SSA
graph representing the function.

Arguments:
""""""""""

The type of the incoming values is specified with the first type field.
After this, the '``phi``' instruction takes a list of pairs as
arguments, with one pair for each predecessor basic block of the current
block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.
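
As a minimal sketch of the pairing described above, the hypothetical function
``@pick`` below merges two values at a join block. The function, block, and
value names are illustrative assumptions, not taken from this manual. The
``phi`` lists one ``[ value, label ]`` pair for each of the two predecessors
of ``%merge``:

.. code-block:: llvm

   ; Hypothetical example: all names here are illustrative.
   define i32 @pick(i1 %c, i32 %a, i32 %b) {
   entry:
     br i1 %c, label %then, label %else

   then:                                   ; first predecessor of %merge
     %x = add i32 %a, 1
     br label %merge

   else:                                   ; second predecessor of %merge
     %y = mul i32 %b, 2
     br label %merge

   merge:
     ; One pair per predecessor; every incoming value has type i32.
     %r = phi i32 [ %x, %then ], [ %y, %else ]
     ret i32 %r
   }
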

There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.

For the purposes of the SSA form, the use of each incoming value is
deemed to occur on the edge from the corresponding predecessor block to
the current block (but after any definition of an '``invoke``'
instruction's return value on the same edge).

The optional ``fast-math-flags`` marker indicates that the phi has one
or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
to enable otherwise unsafe floating-point optimizations. Fast-math-flags
are only valid for phis that return a floating-point scalar or vector
type, or an array (nested to any depth) of floating-point scalar or vector
types.

Semantics:
""""""""""

At runtime, the '``phi``' instruction logically takes on the value
specified by the pair corresponding to the predecessor basic block that
executed just prior to the current block.

Example:
""""""""

.. code-block:: llvm

   Loop:       ; Infinite loop that counts from 0 on up...
     %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
     %nextindvar = add i32 %indvar, 1
     br label %Loop

.. _i_select:

'``select``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty

   selty is either i1 or {<N x i1>}

Overview:
"""""""""

The '``select``' instruction is used to choose one value based on a
condition, without IR-level branching.
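
For instance, a signed maximum can be sketched in a single basic block. The
function name ``@smax`` and the value names are hypothetical and serve only
as an illustration:

.. code-block:: llvm

   ; Hypothetical example: returns the larger of %a and %b (signed).
   define i32 @smax(i32 %a, i32 %b) {
     %cmp = icmp sgt i32 %a, %b              ; i1 condition
     %max = select i1 %cmp, i32 %a, i32 %b   ; %a if %cmp is true, else %b
     ret i32 %max
   }

Both value operands are ordinary SSA values that are already available when
the ``select`` executes; only the choice of result depends on the condition.
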

Arguments:
""""""""""

The '``select``' instruction requires an 'i1' value or a vector of 'i1'
values indicating the condition, and two values of the same :ref:`first
class <t_firstclass>` type.

#. The optional ``fast-math flags`` marker indicates that the select has one or more
   :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for selects that return a floating-point scalar or vector type, or an array
   (nested to any depth) of floating-point scalar or vector types.

Semantics:
""""""""""

If the condition is an i1 and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.

If the condition is a vector of i1, then the value arguments must be
vectors of the same size, and the selection is done element by element.

If the condition is an i1 and the value arguments are vectors of the
same size, then an entire vector is selected.

Example:
""""""""

.. code-block:: llvm

   %X = select i1 true, i8 17, i8 42          ; yields i8:17


.. _i_freeze:

'``freeze``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = freeze ty <val>    ; yields ty:result

Overview:
"""""""""

The '``freeze``' instruction is used to stop propagation of
:ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.

Arguments:
""""""""""

The '``freeze``' instruction takes a single argument.

Semantics:
""""""""""

If the argument is ``undef`` or ``poison``, '``freeze``' returns an
arbitrary, but fixed, value of type '``ty``'.
Otherwise, this instruction is a no-op and returns the input argument.
All uses of a value returned by the same '``freeze``' instruction are
guaranteed to always observe the same value, while different '``freeze``'
instructions may yield different values.

While ``undef`` and ``poison`` pointers can be frozen, the result is a
non-dereferenceable pointer. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
If an aggregate value or vector is frozen, the operand is frozen element-wise.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.


Example:
""""""""

.. code-block:: text

   %w = i32 undef
   %x = freeze i32 %w
   %y = add i32 %w, %w         ; undef
   %z = add i32 %x, %x         ; even number because all uses of %x observe
                               ; the same value
   %x2 = freeze i32 %w
   %cmp = icmp eq i32 %x, %x2  ; can be true or false

   ; example with vectors
   %v = <2 x i32> <i32 undef, i32 poison>
   %a = extractelement <2 x i32> %v, i32 0    ; undef
   %b = extractelement <2 x i32> %v, i32 1    ; poison
   %add = add i32 %a, %a                      ; undef

   %v.fr = freeze <2 x i32> %v                ; element-wise freeze
   %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
   %add.f = add i32 %d, %d                    ; even number

   ; branching on frozen value
   %poison = add nsw i1 %k, undef   ; poison
   %c = freeze i1 %poison
   br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar


.. _i_call:

'``call``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
              <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]

Overview:
"""""""""

The '``call``' instruction represents a simple function call.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
   means that the call must be tail call optimized in order for the program to
   be correct. The ``musttail`` marker provides these guarantees:

   #. The call will not cause unbounded stack growth if it is part of a
      recursive cycle in the call graph.
   #. Arguments with the :ref:`inalloca <attr_inalloca>` or
      :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
   #. If the musttail call appears in a function with the ``"thunk"`` attribute
      and the caller and callee both have varargs, then any unprototyped
      arguments in register or memory are forwarded to the callee. Similarly,
      the return value of the callee is returned to the caller's caller, even
      if a void return type is in use.

   Both markers imply that the callee does not access allocas from the caller.
   The ``tail`` marker additionally implies that the callee does not access
   varargs from the caller. Calls marked ``musttail`` must obey the following
   additional rules:

   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
     or a pointer bitcast followed by a ret instruction.
   - The ret instruction must return the (possibly bitcasted) value
     produced by the call or void.
   - The caller and callee prototypes must match. Pointer types of
     parameters or return types may differ in pointee type, but not
     in address space.
   - The calling conventions of the caller and callee must match.
   - All ABI-impacting function attributes, such as sret, byval, inreg,
     returned, and inalloca, must match.
   - The callee must be varargs iff the caller is varargs. Bitcasting a
     non-varargs function to the appropriate varargs type is legal so
     long as the non-varargs prefixes obey the other rules.

   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
   the following conditions are met:

   - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
   - The call is in tail position (ret immediately follows call and ret
     uses value of call or is void).
   - Option ``-tailcallopt`` is enabled,
     ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
     is ``tailcc``.
   - `Platform-specific constraints are
     met. <CodeGenerator.html#tailcallopt>`_

#. The optional ``notail`` marker indicates that the optimizers should not add
   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
   call optimization from being performed on the call.

#. The optional ``fast-math flags`` marker indicates that the call has one or more
   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for calls that return a floating-point scalar or vector type, or an array
   (nested to any depth) of floating-point scalar or vector types.

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions. The
   calling convention of the call must match the calling convention of
   the target function, or else the behavior is undefined.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect ``call``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values. Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

Example:
""""""""

.. code-block:: llvm

   %retval = call i32 @test(i32 %argc)
   call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
   %X = tail call i32 @foo()                                    ; yields i32
   %Y = tail call fastcc i32 @foo()                             ; yields i32
   call void %foo(i8 signext 97)

   %struct.A = type { i32, i8 }
   %r = call %struct.A @foo()                        ; yields { i32, i8 }
   %gr = extractvalue %struct.A %r, 0                ; yields i32
   %gr1 = extractvalue %struct.A %r, 1               ; yields i8
   call void @foo() noreturn                         ; indicates that @foo never returns normally
   %ZZ = call zeroext i32 @bar()                     ; return value is zero extended

LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument.
For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support va\_arg on many
targets. Also, it does not currently support va\_arg with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <resultval> = landingpad <resultty> <clause>+
   <resultval> = landingpad <resultty> cleanup <clause>*

   <clause> := catch <type> <value>
   <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad --- one where the exception lands, and corresponds to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

The optional
``cleanup`` flag indicates that the landing pad block is a cleanup.

A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
contains the global variable representing the "type" that may be caught
or filtered respectively.
Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.

Example:
""""""""

.. code-block:: llvm

   ;; A landing pad which can catch an integer.
   %res = landingpad { i8*, i32 }
            catch i8** @_ZTIi
   ;; A landing pad that is a cleanup.
   %res = landingpad { i8*, i32 }
            cleanup
   ;; A landing pad which can catch an integer and can only throw a double.
   %res = landingpad { i8*, i32 }
            catch i8** @_ZTIi
            filter [1 x i8**] [i8** @_ZTId]

.. _i_catchpad:

'``catchpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   <resultval> = catchpad within <catchswitch> [<args>*]

Overview:
"""""""""

The '``catchpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
begins a catch handler --- one where a personality routine attempts to transfer
control to catch an exception.

Arguments:
""""""""""

The ``catchswitch`` operand must always be a token produced by a
:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
ensures that each ``catchpad`` has exactly one predecessor block, and it always
terminates in a ``catchswitch``.

The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown, the
exception is compared against the ``args``. If it doesn't match, control will
not reach the ``catchpad`` instruction. The representation of ``args`` is
entirely target and personality function-specific.

Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-phi of its parent basic block.

The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
instructions is described in the
`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.

When a ``catchpad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [i8** @_ZTIi]

.. _i_cleanuppad:

'``cleanuppad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""

The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction.
If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""

The instruction takes a list of arbitrary values which are interpreted
by the :ref:`personality function <personalityfn>`.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:

-  A cleanup block is a basic block which is the unwind destination of
   an exceptional instruction.
-  A cleanup block must have a '``cleanuppad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``cleanuppad``' instruction within the
   cleanup block.
-  A basic block that is not a cleanup block may not include a
   '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

      %tok = cleanuppad within %cs []

.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions.
Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc...).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that is
added must be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types. Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not.
For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.

:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. An overloaded intrinsic
that depends on an unnamed type in one of its overloaded argument types gets an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
different unnamed types as arguments. (For example:
``llvm.ssa.copy.p0s_s.2(%42*)``) The number is tracked in the LLVM module and
it ensures unique names in the module. While linking together two modules, it is
still possible to get a name clash. In that case one of the names will be
changed by getting a new number.

For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
integer or floating point types should not be relied upon for correct
code generation. In such cases, the recommended approach for target
maintainers when defining intrinsics is to create separate integer and
FP intrinsics rather than rely on overloading. For example, if different
codegen is required for ``llvm.target.foo(<4 x i32>)`` and
``llvm.target.foo(<4 x float>)`` then these should be split into
different intrinsics.

To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions operate on arguments that use a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca i8*
      %aq2 = bitcast i8** %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C. In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
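
As a minimal sketch of what a frontend might emit, the following function
registers one stack root and performs a barriered read and write through the
``llvm.gcroot``, ``llvm.gcread`` and ``llvm.gcwrite`` intrinsics. The function
name ``@swap_field``, its parameters, the null metadata argument, and the choice
of the ``shadow-stack`` strategy are assumptions made purely for illustration; a
real frontend would pick the strategy and metadata that match its collector.

.. code-block:: llvm

    ; Hypothetical example: names and GC strategy are illustrative only.
    define i8* @swap_field(i8* %obj, i8** %field, i8* %newval) gc "shadow-stack" {
    entry:
      ; Declare a stack slot as a GC root (no metadata attached).
      %root = alloca i8*
      call void @llvm.gcroot(i8** %root, i8* null)
      store i8* %obj, i8** %root

      ; Read the old reference through a read barrier.
      %old = call i8* @llvm.gcread(i8* %obj, i8** %field)

      ; Store the new reference through a write barrier.
      call void @llvm.gcwrite(i8* %newval, i8* %obj, i8** %field)
      ret i8* %old
    }

    declare void @llvm.gcroot(i8**, i8*)
    declare i8* @llvm.gcread(i8*, i8**)
    declare void @llvm.gcwrite(i8*, i8*, i8**)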

LLVM provides a second, experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second pointer (which must be either a constant or
a global value address) contains the meta-data to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.

.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
Obj to store to. If the runtime does not require a pointer to the
object, Obj may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.


.. _gc_statepoint:

'llvm.experimental.gc.statepoint' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token
        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
                       func_type <target>,
                       i64 <#call args>, i64 <flags>,
                       ... (call parameters),
                       i64 0, i64 0)

Overview:
"""""""""

The statepoint intrinsic represents a call which is parse-able by the
runtime.

Operands:
"""""""""

The 'id' operand is a constant integer that is reported as the ID
field in the generated stackmap. LLVM does not interpret this
parameter in any way and its meaning is up to the statepoint user to
decide. Note that LLVM is free to duplicate code containing
statepoint calls, and this may transform IR that had a unique 'id' per
lexical call to statepoint to IR that does not.

If 'num patch bytes' is non-zero then the call instruction
corresponding to the statepoint is not emitted and LLVM emits 'num
patch bytes' bytes of nops in its place. LLVM will emit code to
prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
sequence and the latter after the nop sequence. It is expected that
the user will patch over the 'num patch bytes' bytes of nops with a
calling sequence specific to their runtime before executing the
generated machine code.
There are no guarantees with respect to the
alignment of the nop sequence. Unlike :doc:`StackMaps`, statepoints do
not have a concept of shadow bytes. Note that semantically the
statepoint still represents a call or invoke to 'target', and the nop
sequence after patching is expected to represent an operation
equivalent to a call or invoke to 'target'.

The 'target' operand is the function actually being called. The
target can be specified as either a symbolic LLVM function, or as an
arbitrary Value of appropriate function type. Note that the function
type must match the signature of the callee and the types of the 'call
parameters' arguments.

The '#call args' operand is the number of arguments to the actual
call. It must exactly match the number of arguments passed in the
'call parameters' variable length section.

The 'flags' operand is used to specify extra information about the
statepoint. This is currently only used to mark certain statepoints
as GC transitions. This operand is a 64-bit integer with the following
layout, where bit 0 is the least significant bit:

  +-------+---------------------------------------------------+
  | Bit # | Usage                                             |
  +=======+===================================================+
  |     0 | Set if the statepoint is a GC transition, cleared |
  |       | otherwise.                                        |
  +-------+---------------------------------------------------+
  |  1-63 | Reserved for future use; must be cleared.         |
  +-------+---------------------------------------------------+

The 'call parameters' arguments are simply the arguments which need to
be passed to the call target. They will be lowered according to the
specified calling convention and otherwise handled like a normal call
instruction. The number of arguments must exactly match what is
specified in '#call args'.
The types must match the signature of
'target'.

The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced with the corresponding operand bundles. In a future
revision, these now redundant arguments will be removed.

Semantics:
""""""""""

A statepoint is assumed to read and write all memory. As a result,
memory operations can not be reordered past a statepoint. It is
illegal to mark a statepoint as being either 'readonly' or 'readnone'.

Note that legal IR can not perform any memory operation on a 'gc
pointer' argument of the statepoint in a location statically reachable
from the statepoint. Instead, the explicitly relocated value (from a
``gc.relocate``) must be used.

'llvm.experimental.gc.result' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type*
        @llvm.experimental.gc.result(token %statepoint_token)

Overview:
"""""""""

``gc.result`` extracts the result of the original call instruction
which was replaced by the ``gc.statepoint``. The ``gc.result``
intrinsic is actually a family of three intrinsics due to an
implementation limitation. Other than the type of the return value,
the semantics are the same.

Operands:
"""""""""

The first and only argument is the ``gc.statepoint`` which starts
the safepoint sequence of which this ``gc.result`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

Semantics:
""""""""""

The ``gc.result`` represents the return value of the call target of
the ``statepoint``.
The type of the ``gc.result`` must exactly match
the type of the target. If the call target returns void, there will
be no ``gc.result``.

A ``gc.result`` is modeled as a 'readnone' pure function. It has no
side effects since it is just a projection of the return value of the
previous call represented by the ``gc.statepoint``.

'llvm.experimental.gc.relocate' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <pointer type>
        @llvm.experimental.gc.relocate(token %statepoint_token,
                                       i32 %base_offset,
                                       i32 %pointer_offset)

Overview:
"""""""""

A ``gc.relocate`` returns the potentially relocated value of a pointer
at the safepoint.

Operands:
"""""""""

The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

The second and third arguments are both indices into operands of the
corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.

The second argument is an index which specifies the allocation for the pointer
being relocated. The associated value must be within the object with which the
pointer being relocated is associated. The optimizer is free to change *which*
interior derived pointer is reported, provided that it does not replace an
actual base pointer with another interior derived pointer. Collectors are
allowed to rely on the base pointer operand remaining an actual base pointer if
so constructed.

The third argument is an index which specifies the (potentially) derived pointer
being relocated.
It is legal for this index to be the same as the second
argument if-and-only-if a base pointer is being relocated.

Semantics:
""""""""""

The return value of ``gc.relocate`` is the potentially relocated value
of the pointer specified by its arguments. It is unspecified how the
value of the returned pointer relates to the argument to the
``gc.statepoint`` other than that a) it points to the same source
language object with the same offset, and b) the 'based-on'
relationship of the newly relocated pointers is a projection of the
unrelocated pointers. In particular, the integer value of the pointer
returned is unspecified.

A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no
side effects since it is just a way to extract information about work
done during the actual call modeled by the ``gc.statepoint``.

Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.
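
For illustration, a minimal sketch of a call with a constant level of zero,
which asks for the return address of the function containing the call. The
wrapper name ``@current_return_address`` is hypothetical and not part of LLVM.

.. code-block:: llvm

    ; Hypothetical wrapper: returns the address this function will return to.
    define i8* @current_return_address() {
      ; The level operand must be a constant; 0 selects the calling function.
      %ra = call i8* @llvm.returnaddress(i32 0)
      ret i8* %ra
    }

    declare i8* @llvm.returnaddress(i32)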

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.addressofreturnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.addressofreturnaddress()

Overview:
"""""""""

The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
pointer to the place in the stack frame where the return address of the
current function is stored.

Semantics:
""""""""""

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

This intrinsic is only implemented for x86 and aarch64.

'``llvm.sponentry``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.sponentry()

Overview:
"""""""""

The '``llvm.sponentry``' intrinsic returns the stack pointer value at
the entry of the current function calling this intrinsic.

Semantics:
""""""""""

Note this intrinsic is only verified on AArch64.

'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.swift.async.context.addr``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8** @llvm.swift.async.context.addr()

Overview:
"""""""""

The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
the part of the extended frame record containing the asynchronous
context of a Swift execution.

Semantics:
""""""""""

If the caller has a ``swiftasync`` parameter, that argument will initially
be stored at the returned address. If not, it will be initialized to null.
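
A minimal sketch of reading the stored context back, assuming a target that
supports the extended frame record. The function name ``@load_async_context``
and the ``"frame-pointer"="all"`` attribute are assumptions chosen for
illustration, not requirements stated by this manual.

.. code-block:: llvm

    ; Hypothetical example: the name and the frame-pointer attribute are
    ; illustrative assumptions.
    define i8* @load_async_context(i8* swiftasync %ctx) "frame-pointer"="all" {
      ; Address of the slot in the extended frame record.
      %addr = call i8** @llvm.swift.async.context.addr()
      ; Because this function has a swiftasync parameter, the slot
      ; initially holds %ctx.
      %cur = load i8*, i8** %addr
      ret i8* %cur
    }

    declare i8** @llvm.swift.async.context.addr()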

'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)

Overview:
"""""""""

The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
live frame pointer to recover the address of the allocation. The offset is
computed during frame layout of the caller of ``llvm.localescape``.

Arguments:
""""""""""

All arguments to '``llvm.localescape``' must be pointers to static allocas or
casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.

The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.

The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
call frame that is currently live. The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function.
The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.

'``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.seh.try.begin()
      declare void @llvm.seh.try.end()

Overview:
"""""""""

The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a _try region for Windows SEH Asynchronous Exception Handling.

Semantics:
""""""""""

When a C function is compiled with the Windows SEH Asynchronous Exception option,
-feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark the _try
boundary and to prevent potential exceptions from being moved across the boundary.
Any set of operations can then be confined to the region by reading their leaf
inputs via volatile loads and writing their root outputs via volatile stores.

'``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.seh.scope.begin()
      declare void @llvm.seh.scope.end()

Overview:
"""""""""

The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a CPP object lifetime for Windows SEH Asynchronous Exception
Handling (MSVC option -EHa).

Semantics:
""""""""""

LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke``\ s, which normally correspond only to call sites. To
support arbitrary faulting instructions, it must be possible to recover the current
EH scope for any instruction. Turning every operation in LLVM that could fault
into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
large number of intrinsics, impede optimization of those operations, and make
compilation slower by introducing many extra basic blocks. These intrinsics can
be used instead to mark the region protected by a cleanup, such as for a local
C++ object with a non-trivial destructor. ``llvm.seh.scope.begin`` is used to mark
the start of the region; it is always called with ``invoke``, with the unwind block
being the desired unwind destination for any potentially-throwing instructions
within the region. ``llvm.seh.scope.end`` is used to mark when the scope ends
and the EH cleanup is no longer required (e.g. because the destructor is being
called).

.. _int_read_register:
.. _int_read_volatile_register:
.. _int_write_register:

'``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.read_register.i32(metadata)
      declare i64 @llvm.read_register.i64(metadata)
      declare i32 @llvm.read_volatile_register.i32(metadata)
      declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
      !0 = !{!"sp\00"}

Overview:
"""""""""

The '``llvm.read_register``', '``llvm.read_volatile_register``', and
'``llvm.write_register``' intrinsics provide access to the named register.
The register must be valid on the architecture being compiled to. The type
needs to be compatible with the register being read.

Semantics:
""""""""""

The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
return the current value of the register, where possible. The
'``llvm.write_register``' intrinsic sets the current value of the register,
where possible.

A call to '``llvm.read_volatile_register``' is assumed to have side-effects
and possibly return a different value each time (e.g. for a timer register).

This is useful to implement named register global variables that need
to always be mapped to a specific register, as is common practice on
bare-metal programs including OS kernels.

The compiler doesn't check for register availability or use of the
register in surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.
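
As a minimal sketch of the usage described above, the following function reads
the stack pointer through a named-register metadata node. The function name
``@get_stack_pointer`` is hypothetical, and the example assumes a 64-bit target
on which ``sp`` is a valid register name.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      ; Hypothetical helper: returns the current value of the stack pointer.
      define i64 @get_stack_pointer() {
      entry:
        ; !0 names the register to read; "sp" is assumed valid on this target.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"sp\00"}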

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and even more so, allocatable
registers.

.. _int_stacksave:

'``llvm.stacksave``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stacksave()

Overview:
"""""""""

The '``llvm.stacksave``' intrinsic is used to remember the current state
of the function stack, for use with
:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
implementing language features like scoped automatic variable-sized
arrays in C99.

Semantics:
""""""""""

This intrinsic returns an opaque pointer value that can be passed to
:ref:`llvm.stackrestore <int_stackrestore>`. When an
``llvm.stackrestore`` intrinsic is executed with a value saved from
``llvm.stacksave``, it effectively restores the state of the stack to
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
were allocated after the ``llvm.stacksave`` was executed.

.. _int_stackrestore:

'``llvm.stackrestore``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackrestore(i8* %ptr)

Overview:
"""""""""

The '``llvm.stackrestore``' intrinsic is used to restore the state of
the function stack to the state it was in when the corresponding
:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
useful for implementing language features like scoped automatic
variable-sized arrays in C99.

Semantics:
""""""""""

See the description for :ref:`llvm.stacksave <int_stacksave>`.
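
The pairing of the two intrinsics can be sketched as follows. This is an
illustration only: ``@scoped_vla`` and ``@use`` are hypothetical names, and the
dynamic ``alloca`` stands in for a C99 variable-length array whose scope ends
before the function returns.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)
      declare void @use(i32*)          ; hypothetical consumer of the array

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state before the scoped array is allocated.
        %saved = call i8* @llvm.stacksave()
        ; Dynamic alloca: %n elements of i32.
        %vla = alloca i32, i32 %n
        call void @use(i32* %vla)
        ; Leaving the scope: pop everything allocated since the save.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }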

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.dynamic.area.offset.i32()
      declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

      The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
      get the offset from the native stack pointer to the address of the most
      recent dynamic alloca on the caller's stack. These intrinsics are
      intended for use in combination with
      :ref:`llvm.stacksave <int_stacksave>` to get a
      pointer to the most recent dynamic alloca. This is useful, for example,
      for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

      These intrinsics return a non-negative integer value that can be used to
      get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
      on the caller's stack. In particular, for targets where the stack grows downwards,
      adding this offset to the native stack pointer would get the address of the most
      recent dynamic alloca. For targets where the stack grows upwards, the situation is a bit more
      complicated, because subtracting this value from the stack pointer would get the address
      one past the end of the most recent dynamic alloca.

      Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
      returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
      compile-time-known constant value.

      The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
      must match the target's default address space's (address space 0) pointer type.
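
A minimal sketch of the combination with ``llvm.stacksave`` described above,
assuming a target with 64-bit pointers in address space 0 and a stack that
grows downwards. The names ``@locate_dynamic_alloca`` and ``@consume`` are
hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()
      declare void @consume(i8*)       ; hypothetical, e.g. an unpoisoning routine

      define void @locate_dynamic_alloca(i64 %n) {
      entry:
        ; The most recent dynamic alloca in this frame.
        %buf = alloca i8, i64 %n
        ; Native stack pointer after the allocation.
        %sp = call i8* @llvm.stacksave()
        ; Offset from the stack pointer to the dynamic area (often zero).
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Downward-growing stack assumed: add the offset to the stack pointer.
        %addr = getelementptr i8, i8* %sp, i64 %off
        call void @consume(i8* %addr)
        ret void
      }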

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)

Overview:
"""""""""

The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a no-op.
Prefetches have no effect on the behavior of the program but can change
its performance characteristics.

Arguments:
""""""""""

``address`` is the address to be prefetched, ``rw`` is the specifier
determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from (0), no
locality, to (3), extremely local (keep in cache). The ``cache type``
specifies whether the prefetch is performed on the data (1) or
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
arguments must be constant integers.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. In
particular, prefetches cannot trap and do not produce a value. On
targets that support this intrinsic, the prefetch can provide hints to
the processor cache for better performance.

'``llvm.pcmarker``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.pcmarker(i32 <id>)

Overview:
"""""""""

The '``llvm.pcmarker``' intrinsic is a method to export a Program
Counter (PC) in a region of code to simulators and other tools. The
method is target specific, but it is expected that the marker will use
exported symbols to transmit the PC of the marker. The marker makes no
guarantees that it will remain with any specific instruction after
optimizations.
It is possible that the presence of a marker will inhibit
optimizations. The intended use is to be inserted after optimizations to
allow correlations of simulation runs.

Arguments:
""""""""""

``id`` is a numerical id identifying the marker.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. Backends
that do not support this intrinsic may ignore it.

'``llvm.readcyclecounter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64 @llvm.readcyclecounter()

Overview:
"""""""""

The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.

Semantics:
""""""""""

When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to either return an application
specific value or a system wide value. On backends without support, this
is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the
code is running at and on the host platform.

'``llvm.clear_cache``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.clear_cache(i8*, i8*)

Overview:
"""""""""

The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
in the specified range to the execution unit of the processor. On
targets with non-unified instruction and data caches, the implementation
flushes the instruction cache.
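
A call might look like the sketch below. The reading of the two pointer
operands as the start and end of the modified range is an assumption made
here by analogy with ``__clear_cache``; ``@publish_code`` is a hypothetical
name.

.. code-block:: llvm

      declare void @llvm.clear_cache(i8*, i8*)

      ; Hypothetical helper: called after code has been written into
      ; the range delimited by %begin and %end (assumed meaning).
      define void @publish_code(i8* %begin, i8* %end) {
      entry:
        call void @llvm.clear_cache(i8* %begin, i8* %end)
        ret void
      }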

Semantics:
""""""""""

On platforms with coherent instruction and data caches (e.g. x86), this
intrinsic is a nop. On platforms with non-coherent instruction and data
caches (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
instructions or a system call, if cache flushing requires special
privileges.

The default behavior is to emit a call to ``__clear_cache`` from the
runtime library.

This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.

'``llvm.instrprof.increment``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
                                             i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.increment``' intrinsic can be emitted by a
frontend for use with instrumentation-based profiling. These will be
lowered by the ``-instrprof`` pass to generate execution counts of a
program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source, and
the third is the number of counters associated with ``name``. It is an
error if ``hash`` or ``num-counters`` differ between two instances of
``instrprof.increment`` that refer to the same name.

The last argument refers to which of the counters for ``name`` should
be incremented. It should be a value between 0 and ``num-counters``.
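
The arguments above can be sketched in a call such as the following. The
global ``@__profn_foo``, the hash value ``1234`` and the counter count ``2``
are placeholders chosen for illustration, not values produced by any
particular frontend.

.. code-block:: llvm

      ; Hypothetical name variable for a function called "foo".
      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo() {
      entry:
        ; Increment counter 0 of the 2 counters associated with "foo".
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 1234, i32 2, i32 0)
        ret void
      }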

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.

'``llvm.instrprof.increment.step``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
                                                  i32 <num-counters>,
                                                  i32 <index>, i64 <step>)

Overview:
"""""""""

The '``llvm.instrprof.increment.step``' intrinsic is an extension to
the '``llvm.instrprof.increment``' intrinsic with an additional fifth
argument to specify the step of the increment.

Arguments:
""""""""""

The first four arguments are the same as those of the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.


'``llvm.instrprof.value.profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
                                                 i64 <value>, i32 <value_kind>,
                                                 i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation-based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. ``name`` should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source. It
is an error if ``hash`` differs between two instances of
``llvm.instrprof.*`` that refer to the same name.

The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value. The
fourth argument represents the kind of value profiling that is being done. The
supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
index of the instrumented expression within ``name``. It should be >= 0.

Semantics:
""""""""""

This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The ``-instrprof``
pass will generate the appropriate data structures and replace the
``llvm.instrprof.value.profile`` intrinsic with the call to the profile
runtime library with proper arguments.

'``llvm.thread.pointer``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.thread.pointer()

Overview:
"""""""""

The '``llvm.thread.pointer``' intrinsic returns the value of the thread
pointer.

Semantics:
""""""""""

The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
for the current thread.
The exact semantics of this value are target
specific: it may point to the start of the TLS area, to the end, or somewhere
in the middle. Depending on the target, this intrinsic may read a register,
call a helper function, read from an alternate memory space, or perform
other operations necessary to locate the TLS area. Not all targets support
this intrinsic.

'``llvm.call.preallocated.setup``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token @llvm.call.preallocated.setup(i32 %num_args)

Overview:
"""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
be used with a call's ``"preallocated"`` operand bundle to indicate that
certain arguments are allocated and initialized before the call.

Semantics:
""""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
``"preallocated"`` operand bundle for the corresponding call.

Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]

.. _int_call_preallocated_arg:

'``llvm.call.preallocated.arg``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)

Overview:
"""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
corresponding preallocated argument for the preallocated call.

Semantics:
""""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
``%arg_index``\ th argument with the ``preallocated`` attribute for
the call associated with the ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.

A call to '``llvm.call.preallocated.arg``' must have a call site
``preallocated`` attribute. The type of the ``preallocated`` attribute must
match the type used by the ``preallocated`` attribute of the corresponding
argument at the preallocated call. The type is used in the case that an
``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
to DCE), where otherwise we cannot know how large the arguments are.

It is undefined behavior if this is called with a token from an
'``llvm.call.preallocated.setup``' if another
'``llvm.call.preallocated.setup``' has already been called or if the
preallocated call corresponding to the '``llvm.call.preallocated.setup``'
has already been called.

.. _int_call_preallocated_teardown:

'``llvm.call.preallocated.teardown``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.call.preallocated.teardown(token %setup_token)

Overview:
"""""""""

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
created by a '``llvm.call.preallocated.setup``'.

Semantics:
""""""""""

The token argument must be a '``llvm.call.preallocated.setup``'.

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
one of this or the preallocated call must be called to prevent stack leaks.
It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
and the preallocated call for a given '``llvm.call.preallocated.setup``'.

For example, if the stack is allocated for a preallocated call by a
'``llvm.call.preallocated.setup``', then an initializer function called on an
allocated argument throws an exception, there should be a
'``llvm.call.preallocated.teardown``' in the exception handler to prevent
stack leaks.

Following the nesting rules in '``llvm.call.preallocated.setup``', nested
calls to '``llvm.call.preallocated.setup``' and
'``llvm.call.preallocated.teardown``' are allowed but must be properly
nested.

Example:
""""""""

.. code-block:: llvm

        %cs = call token @llvm.call.preallocated.setup(i32 1)
        %x = call i8* @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
        %y = bitcast i8* %x to i32*
        invoke void @constructor(i32* %y) to label %conta unwind label %contb
    conta:
        call void @foo1(i32* preallocated(i32) %y) ["preallocated"(token %cs)]
        ret void
    contb:
        %s = catchswitch within none [label %catch] unwind to caller
    catch:
        %p = catchpad within %s []
        call void @llvm.call.preallocated.teardown(token %cs)
        ret void

Standard C/C++ Library Intrinsics
---------------------------------

LLVM provides intrinsics for a few important standard C/C++ library
functions.
These intrinsics allow source-language front-ends to pass
information about the alignment of the pointer arguments to the code
generator, providing opportunity for more efficient code generation.


'``llvm.abs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.abs`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
      declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)

Overview:
"""""""""

The '``llvm.abs``' family of intrinsic functions returns the absolute value
of an argument.

Arguments:
""""""""""

The first argument is the value for which the absolute value is to be returned.
This argument may be of any integer type or a vector with integer element type.
The return type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether the
result value of the '``llvm.abs``' intrinsic is a
:ref:`poison value <poisonvalues>` if the argument is statically or dynamically
an ``INT_MIN`` value.

Semantics:
""""""""""

The '``llvm.abs``' intrinsic returns the magnitude (always positive) of the
argument or each element of a vector argument. If the argument is ``INT_MIN``,
then the result is also ``INT_MIN`` if ``is_int_min_poison == 0`` and
``poison`` otherwise.


'``llvm.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.smax.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
Vector intrinsics operate on a per-element basis. The larger element of ``%a``
and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


'``llvm.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.smin.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


'``llvm.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.umax.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
integers. Vector intrinsics operate on a per-element basis. The larger element
of ``%a`` and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


'``llvm.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.umin.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
integers. Vector intrinsics operate on a per-element basis. The smaller element
of ``%a`` and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_memcpy:

'``llvm.memcpy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                              i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location.

Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
location to the destination location, which must either be equal or
non-overlapping. It copies "len" bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on the
argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
otherwise the behavior is undefined.

.. _int_memcpy_inline:

'``llvm.memcpy.inline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                                     i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                                     i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is a constant integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. It copies "len" bytes of memory over.
If the argument is known
to be aligned to some boundary, this can be specified as an attribute on
the argument.
The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
'``llvm.memcpy.*``', but the generated code is guaranteed not to call any
external functions.

.. _int_memmove:

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths, however.

::

      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                               i32 <len>, i1 <isvolatile>)
      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                               i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.
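
To illustrate the arguments, the sketch below shifts 16 bytes of a buffer
forward by 4 bytes, so the source and destination ranges overlap. The function
name ``@shift_by_four`` is hypothetical, and the buffer is assumed to be at
least 20 bytes long.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @shift_by_four(i8* %buf) {
      entry:
        ; Destination starts 4 bytes into the source range.
        %dst = getelementptr inbounds i8, i8* %buf, i64 4
        ; Non-volatile move of 16 bytes; align attributes on both pointers.
        call void @llvm.memmove.p0i8.p0i8.i64(i8* align 1 %dst, i8* align 1 %buf,
                                              i64 16, i1 false)
        ret void
      }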

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. They
copy "len" bytes of memory over. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
otherwise the behavior is undefined.

.. _int_memset:

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
                                         i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
                                         i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra volatile
argument. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
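
A short sketch of a ``llvm.memset`` call with an alignment attribute on the
destination follows. The function name ``@zero16``, the 16-byte length, and
the 8-byte alignment are assumptions chosen for illustration.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      ; Hypothetical helper: zero a 16-byte buffer assumed to be 8-byte aligned.
      define void @zero16(i8* %buf) {
      entry:
        ; Arguments: destination, fill byte, length in bytes, isvolatile.
        call void @llvm.memset.p0i8.i64(i8* align 8 %buf, i8 0, i64 16, i1 false)
        ret void
      }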

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined,
otherwise the behavior is undefined.

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sqrt.f32(float %Val)
      declare double    @llvm.sqrt.f64(double %Val)
      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128     @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the square root of the specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sqrt``' function but without
trapping or setting ``errno``. For types specified by IEEE-754, the result
matches a conforming libm implementation.
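
The syntax listing above shows only scalar declarations. As a sketch of how
the scalar and vector overloads might be used side by side, the example
below assumes a ``<4 x float>`` operand. The function names
``@scalar_sqrt`` and ``@vector_sqrt`` are hypothetical.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)
      declare <4 x float> @llvm.sqrt.v4f32(<4 x float>)

      ; Scalar overload: one input, one result of the same type.
      define float @scalar_sqrt(float %x) {
        %r = call float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      ; Vector overload: the square root is taken per element.
      define <4 x float> @vector_sqrt(<4 x float> %v) {
        %r = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }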

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.powi.f32(float %Val, i32 %power)
      declare double    @llvm.powi.f64(double %Val, i32 %power)
      declare x86_fp80  @llvm.powi.f80(x86_fp80 %Val, i32 %power)
      declare fp128     @llvm.powi.f128(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating-point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.

'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sin.f32(float %Val)
      declare double    @llvm.sin.f64(double %Val)
      declare x86_fp80  @llvm.sin.f80(x86_fp80 %Val)
      declare fp128     @llvm.sin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sin.*``' intrinsics return the sine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sin``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.cos.f32(float %Val)
      declare double    @llvm.cos.f64(double %Val)
      declare x86_fp80  @llvm.cos.f80(x86_fp80 %Val)
      declare fp128     @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
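
As an illustration of where the 'afn' flag is written, the sketch below
places it on one call and omits it on another. The function name
``@sin_times_cos`` is a hypothetical example. Whether a target actually uses
an approximation for the flagged call is not something this sketch can
promise.

.. code-block:: llvm

      declare float @llvm.sin.f32(float)
      declare float @llvm.cos.f32(float)

      ; Hypothetical helper computing sin(x) * cos(x).
      define float @sin_times_cos(float %x) {
        ; 'afn' on this call permits an approximated result.
        %s = call afn float @llvm.sin.f32(float %x)
        ; No fast-math flags on this call.
        %c = call float @llvm.cos.f32(float %x)
        %p = fmul float %s, %c
        ret float %p
      }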

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.pow.f32(float %Val, float %Power)
      declare double    @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80  @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``pow``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.exp.f32(float %Val)
      declare double    @llvm.exp.f64(double %Val)
      declare x86_fp80  @llvm.exp.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.exp2.f32(float %Val)
      declare double    @llvm.exp2.f64(double %Val)
      declare x86_fp80  @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log.f32(float %Val)
      declare double    @llvm.log.f64(double %Val)
      declare x86_fp80  @llvm.log.f80(x86_fp80 %Val)
      declare fp128     @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log10.f32(float %Val)
      declare double    @llvm.log10.f64(double %Val)
      declare x86_fp80  @llvm.log10.f80(x86_fp80 %Val)
      declare fp128     @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log2.f32(float %Val)
      declare double    @llvm.log2.f64(double %Val)
      declare x86_fp80  @llvm.log2.f80(x86_fp80 %Val)
      declare fp128     @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_fma:

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.fma.f32(float %a, float %b, float %c)
      declare double    @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
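
To make the difference from separate instructions concrete, the sketch
below contrasts an unfused multiply and add with a call to ``llvm.fma``. The
function names ``@mul_then_add`` and ``@fused`` are hypothetical. The
rounding comments describe the libm ``fma`` behavior that the intrinsic is
defined to match.

.. code-block:: llvm

      declare double @llvm.fma.f64(double, double, double)

      ; Unfused: the product is rounded, then the sum is rounded again.
      define double @mul_then_add(double %a, double %b, double %c) {
        %m = fmul double %a, %b
        %r = fadd double %m, %c
        ret double %r
      }

      ; Fused: a * b + c is computed and rounded once.
      define double @fused(double %a, double %b, double %c) {
        %r = call double @llvm.fma.f64(double %a, double %b, double %c)
        ret double %r
      }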

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``fma``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.fabs.f32(float %Val)
      declare double    @llvm.fabs.f64(double %Val)
      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128     @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs. This matches the behavior of libm's fmin.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. The returned NaN is always
quiet. If the operands compare equal, returns a value that compares
equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
return either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
signaling NaNs. This matches the behavior of libm's fmax.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. The returned NaN is always
quiet. If the operands compare equal, returns a value that compares
equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
return either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

'``llvm.minimum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.minimum.f32(float %Val0, float %Val1)
      declare double    @llvm.minimum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minimum.*``' intrinsics return the minimum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2018.

'``llvm.maximum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.maximum.f32(float %Val0, float %Val1)
      declare double    @llvm.maximum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maximum.*``' intrinsics return the maximum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.
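
The NaN handling is the main practical difference between
``llvm.maxnum`` and ``llvm.maximum``. The sketch below passes the same
operands to both. The function names are hypothetical, and the comments
restate the behavior this document specifies rather than the output of any
particular target.

.. code-block:: llvm

      declare float @llvm.maxnum.f32(float, float)
      declare float @llvm.maximum.f32(float, float)

      ; 0x7FF8000000000000 is a quiet NaN in LLVM's hexadecimal
      ; floating-point constant syntax.
      define float @maxnum_with_nan() {
        ; maxnum: the non-NaN operand (1.0) is returned.
        %r = call float @llvm.maxnum.f32(float 0x7FF8000000000000, float 1.0)
        ret float %r
      }

      define float @maximum_with_nan() {
        ; maximum: the NaN propagates to the result.
        %r = call float @llvm.maximum.f32(float 0x7FF8000000000000, float 1.0)
        ret float %r
      }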

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2018.

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80  @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.floor.f32(float %Val)
      declare double    @llvm.floor.f64(double %Val)
      declare x86_fp80  @llvm.floor.f80(x86_fp80 %Val)
      declare fp128     @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.ceil.f32(float %Val)
      declare double    @llvm.ceil.f64(double %Val)
      declare x86_fp80  @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128     @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.trunc.f32(float %Val)
      declare double    @llvm.trunc.f64(double %Val)
      declare x86_fp80  @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128     @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.rint.f32(float %Val)
      declare double    @llvm.rint.f64(double %Val)
      declare x86_fp80  @llvm.rint.f80(x86_fp80 %Val)
      declare fp128     @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand isn't an integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.
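
The rounding intrinsics in this group differ in the direction they round.
As a rough sketch, the two hypothetical helpers below compute a fractional
part using ``llvm.trunc`` and ``llvm.floor``. The values in the comments
follow from the libm ``trunc`` and ``floor`` definitions for an assumed
input of -2.5.

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.trunc.f64(double)

      ; For %x = -2.5: trunc(-2.5) is -2.0, so the result is -0.5.
      define double @frac_trunc(double %x) {
        %t = call double @llvm.trunc.f64(double %x)
        %f = fsub double %x, %t
        ret double %f
      }

      ; For %x = -2.5: floor(-2.5) is -3.0, so the result is 0.5.
      define double @frac_floor(double %x) {
        %t = call double @llvm.floor.f64(double %x)
        %f = fsub double %x, %t
        ret double %f
      }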

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.nearbyint.f32(float %Val)
      declare double    @llvm.nearbyint.f64(double %Val)
      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.round.f32(float %Val)
      declare double    @llvm.round.f64(double %Val)
      declare x86_fp80  @llvm.round.f80(x86_fp80 %Val)
      declare fp128     @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.

'``llvm.roundeven.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.roundeven.f32(float %Val)
      declare double    @llvm.roundeven.f64(double %Val)
      declare x86_fp80  @llvm.roundeven.f80(x86_fp80 %Val)
      declare fp128     @llvm.roundeven.f128(fp128 %Val)
      declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to the
nearest value that is an even integer).

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as the C standard function ``roundeven``, except that
it does not raise floating-point exceptions.

'``llvm.lround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lround`` on any
floating-point type. Not all targets support all types however.

::

      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lround``
functions would, but without setting ``errno``.

'``llvm.llround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llround`` on any
floating-point type. Not all targets support all types however.

::

      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.
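
A minimal sketch of a call follows, assuming the intrinsic is named after
this section's title (``llvm.llround``) and mangled with the result and
operand types. The function name ``@to_nearest_i64`` is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.llround.i64.f64(double)

      ; Hypothetical helper. Halfway cases round away from zero, so an
      ; input of 2.5 maps to 3 and an input of -2.5 maps to -3.
      define i64 @to_nearest_i64(double %x) {
        %r = call i64 @llvm.llround.i64.f64(double %x)
        ret i64 %r
      }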

Semantics:
""""""""""

This function returns the same values as the libm ``llround``
functions would, but without setting errno.

'``llvm.lrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
floating-point type. Not all targets support all types, however.

::

      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
integer.


Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint``
functions would, but without setting errno.

'``llvm.llrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
floating-point type. Not all targets support all types, however.

::

      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
integer.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``llrint``
functions would, but without setting errno.

Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.

'``llvm.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bitreverse``
on any integer type.

::

      declare i16 @llvm.bitreverse.i16(i16 <id>)
      declare i32 @llvm.bitreverse.i32(i32 <id>)
      declare i64 @llvm.bitreverse.i64(i64 <id>)
      declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)

Overview:
"""""""""

The '``llvm.bitreverse``' family of intrinsics is used to reverse the
bit pattern of an integer value or vector of integer values; for example
``0b10110110`` becomes ``0b01101101``.

Semantics:
""""""""""

The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output, with bits
numbered from 0. The vector intrinsics, such as
``llvm.bitreverse.v4i32``, operate on a per-element basis and the
element order is not affected.
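
Example:
""""""""

The following is an illustrative sketch rather than normative text. It
assumes the ``i8`` overload, which is not among the declarations listed
above, and uses constants chosen for this example; the second call reuses
the bit pattern from the overview.

.. code-block:: llvm

      %r = call i8 @llvm.bitreverse.i8(i8 1)    ; %r = i8: 128 (0b10000000), bit 0 moves to bit 7
      %r = call i8 @llvm.bitreverse.i8(i8 182)  ; %r = i8: 109 (0b10110110 becomes 0b01101101)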

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)
      declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
value or vector of integer values with an even number of bytes (positive
multiple of 16 bits).

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order. The
``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
operate on a per-element basis and the element order is not affected.

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.ctpop.i8(i8 <src>)
      declare i16 @llvm.ctpop.i16(i16 <src>)
      declare i32 @llvm.ctpop.i32(i32 <src>)
      declare i64 @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
each element of a vector.

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.

::

      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of the vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(2) = 1``.

.. _int_overflow:

'``llvm.fshl.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted left, and the most
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate left operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.
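
As an illustrative sketch of the rotate equivalence (the ``i32`` overload,
the value names ``%x`` and ``%n``, and the constants are assumptions chosen
for this example), a rotate left can be written by passing the same value as
both of the first two arguments:

.. code-block:: llvm

      %rotl = call i32 @llvm.fshl.i32(i32 %x, i32 %x, i32 %n)  ; %x rotated left by (%n % 32) bits
      %r = call i8 @llvm.fshl.i8(i8 129, i8 129, i8 1)         ; %r = i8: 3 (0b10000001 becomes 0b00000011)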

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)

'``llvm.fshr.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted right, and the least
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate right operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.
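
A minimal sketch of the rotate right case, under the same kind of
assumptions (the ``i32`` overload, the value names ``%x`` and ``%n``, and
the constants are chosen for illustration only):

.. code-block:: llvm

      %rotr = call i32 @llvm.fshr.i32(i32 %x, i32 %x, i32 %n)  ; %x rotated right by (%n % 32) bits
      %r = call i8 @llvm.fshr.i8(i8 129, i8 129, i8 1)         ; %r = i8: 192 (0b10000001 becomes 0b11000000)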

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct. The first
element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.
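
As a worked illustration of this definition (a sketch only; the ``i8``
overloads, the value names and the constants are assumptions chosen for this
example), the same operands overflow as a signed addition but not as an
unsigned one:

.. code-block:: llvm

      %s = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 127, i8 1)
      ; %s = {i8 -128, i1 true}: sext(127) + sext(1) is 128 in i16, but the
      ; wrapped i8 result sign-extends to -128, so the signed addition overflowed.

      %u = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 127, i8 1)
      ; %u = {i8 -128, i1 false}: the same bit pattern (0x80) zero-extends to 128,
      ; which equals zext(127) + zext(1), so there is no unsigned overflow.

In both calls the first element is the result of a plain 8-bit ``add``; only
the second element depends on whether the operation is treated as signed or
unsigned.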

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments, and indicate whether an overflow
occurred during the signed summation.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two variables. They return a structure --- the
first element of which is the signed summation, and the second element
of which is a bit specifying if the signed summation resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
an unsigned addition of the two arguments, and indicate whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
an unsigned addition of the two arguments. They return a structure --- the
first element of which is the sum, and the second element of which is a
bit specifying if the unsigned summation resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal

'``llvm.ssub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments, and indicate whether an
overflow occurred during the signed subtraction.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
subtraction.

Semantics:
""""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure --- the
first element of which is the difference, and the second element of
which is a bit specifying if the signed subtraction resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments, and indicate whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure ---
the first element of which is the difference, and the second element of
which is a bit specifying if the unsigned subtraction resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments, and indicate whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second element
of which is a bit specifying if the signed multiplication resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second
element of which is a bit specifying if the unsigned multiplication
resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

Saturation Arithmetic Intrinsics
---------------------------------

Saturation arithmetic is a version of arithmetic in which operations are
limited to a fixed range between a minimum and maximum value. If the result of
an operation is greater than the maximum value, the result is set (or
"clamped") to this maximum. If it is below the minimum, it is clamped to this
minimum.


'``llvm.sadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.sadd.sat``' family of intrinsic functions perform signed
saturating addition on the 2 arguments.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8


'``llvm.uadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
saturating addition on the 2 arguments.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments. Because this is an unsigned
operation, the result will never saturate towards zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15


'``llvm.ssub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ssub.sat``' family of intrinsic functions perform signed
saturating subtraction on the 2 arguments.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed subtraction.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7


'``llvm.usub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
saturating subtraction on the 2 arguments.
15368 15369Arguments 15370"""""""""" 15371 15372The arguments (%a and %b) and the result may be of integer types of any bit 15373width, but they must have the same bit width. ``%a`` and ``%b`` are the two 15374values that will undergo unsigned subtraction. 15375 15376Semantics: 15377"""""""""" 15378 15379The minimum value this operation can clamp to is 0, which is the smallest 15380unsigned value representable by the bit width of the unsigned arguments. 15381Because this is an unsigned operation, the result will never saturate towards 15382the largest possible value representable by this bit width. 15383 15384 15385Examples 15386""""""""" 15387 15388.. code-block:: llvm 15389 15390 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1) ; %res = 1 15391 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6) ; %res = 0 15392 15393 15394'``llvm.sshl.sat.*``' Intrinsics 15395^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15396 15397Syntax 15398""""""" 15399 15400This is an overloaded intrinsic. You can use ``llvm.sshl.sat`` 15401on integers or vectors of integers of any bit width. 15402 15403:: 15404 15405 declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b) 15406 declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b) 15407 declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b) 15408 declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b) 15409 15410Overview 15411""""""""" 15412 15413The '``llvm.sshl.sat``' family of intrinsic functions perform signed 15414saturating left shift on the first argument. 15415 15416Arguments 15417"""""""""" 15418 15419The arguments (``%a`` and ``%b``) and the result may be of integer types of any 15420bit width, but they must have the same bit width. ``%a`` is the value to be 15421shifted, and ``%b`` is the amount to shift by. If ``b`` is (statically or 15422dynamically) equal to or larger than the integer bit width of the arguments, 15423the result is a :ref:`poison value <poisonvalues>`. 
If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.


Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)   ; %res = 4
      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)   ; %res = 7
      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2


'``llvm.ushl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
saturating left shift on the first argument.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
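
For shift amounts smaller than the bit width, the unsigned variant can be
modeled by shifting left, shifting back, and checking whether any bits were
lost. This is only an illustrative sketch for ``i32`` with a hypothetical
function name (``@ushl_sat_sketch``), not a description of how any target
lowers the intrinsic.

.. code-block:: llvm

      ; Hypothetical helper: models @llvm.ushl.sat.i32 for in-range shift amounts.
      ; An out-of-range %b makes the shl poison, as it does for the intrinsic.
      define i32 @ushl_sat_sketch(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b
        %back = lshr i32 %shl, %b                    ; undo the shift
        %lost = icmp ne i32 %back, %a                ; were any set bits shifted out?
        %res = select i1 %lost, i32 -1, i32 %shl     ; -1 is the all-ones unsigned maximum
        ret i32 %res
      }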

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15


Fixed Point Arithmetic Intrinsics
---------------------------------

A fixed point number represents a real data type for a number that has a fixed
number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. These
are useful for representing fractional values to a specific precision. The
following intrinsics perform fixed point arithmetic operations on 2 operands
of the same scale, specified as the third argument.

The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
of fixed point numbers through scaled integers. Therefore, fixed point
multiplication can be represented as

.. code-block:: llvm

        %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %mul = mul nsw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
        %result = trunc i8 %r to i4

The ``llvm.*div.fix`` family of intrinsic functions represents a division of
fixed point numbers through scaled integers. Fixed point division can be
represented as:

.. code-block:: llvm

        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %scale2 = trunc i32 %scale to i8
        %a3 = shl i8 %a2, %scale2
        %r = sdiv i8 %a3, %b2  ; this is for a target rounding towards zero
        %result = trunc i8 %r to i4

For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. Rounding is unspecified since
preferred rounding may vary for different targets. Rounding is specified
through a target hook. Different pipelines should legalize or optimize this
using the rounding specified by this hook if it is provided. Operations like
constant folding, instruction combining, KnownBits, and ValueTracking should
also use this hook, if provided, and not assume the direction of rounding. A
rounded result must always be within one unit of precision from the true
result. That is, the error between the returned result and the true result must
be less than 1/2^(scale).


'``llvm.smul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix``' family of intrinsic functions perform signed
fixed point multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also work with
int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)


'``llvm.umul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
fixed point multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also work with
int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs unsigned fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 3.5 or up to 4
      %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)


'``llvm.smul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
fixed point saturating multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)

      ; Saturation
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7

      ; Scale can affect the saturation result
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)


'``llvm.umul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturating multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest unsigned value representable by this bit width (zero).


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 2 or up to 2.5
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)

      ; Saturation
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)

      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)


'``llvm.sdiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
fixed point division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also work with
int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division.
The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.udiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
fixed point division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also work with
int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4)  ; %res = 2 (0.0625 / 0.5 = 0.125)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.sdiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest signed value representable by this bit width.

It is undefined behavior if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)


'``llvm.udiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
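
By analogy with the signed division expansion shown in the introduction to
the fixed point intrinsics, the unsigned saturating form can be sketched as a
widened division followed by a clamp. The sketch below assumes ``i16``
operands, a scale of 8, and rounding towards zero (one of the permitted
choices, since the rounding direction is unspecified); the function name
``@udiv_fix_sat_scale8_sketch`` is hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: models @llvm.udiv.fix.sat.i16 with a scale of 8.
      ; As with the intrinsic, a zero %b is undefined behavior.
      define i16 @udiv_fix_sat_scale8_sketch(i16 %a, i16 %b) {
        %a.wide = zext i16 %a to i32
        %b.wide = zext i16 %b to i32
        %a.shifted = shl i32 %a.wide, 8                    ; pre-scale the dividend
        %quot = udiv i32 %a.shifted, %b.wide               ; truncating division rounds towards zero
        %over = icmp ugt i32 %quot, 65535                  ; larger than the i16 maximum?
        %clamped = select i1 %over, i32 65535, i32 %quot   ; saturate to the largest unsigned value
        %res = trunc i32 %clamped to i16
        ret i16 %res
      }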

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest unsigned value representable by this bit width (zero).

It is undefined behavior if the second argument is zero.

Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)

      ; The result in the following could be rounded down to 0.5 or up to 1
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)


Specialised Arithmetic Intrinsics
---------------------------------

.. _i_intr_llvm_canonicalize:

'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as frexp.
The canonical encoding is
defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division by
1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
-0.0 is also sufficient provided that the rounding mode is not -Infinity.

``@llvm.canonicalize`` must preserve the equality relation.
That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.

The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating-point value are not examined.

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, a and b, and an addend c.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression a \* b + c, except that it is unspecified
whether rounding will be performed between the multiplication and addition
steps. Fusion is not guaranteed, even if the target platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this never sets errno.

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c


Hardware-Loop Intrinsics
------------------------

LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required to either lower these intrinsics
further to target-specific instructions or revert the hardware-loop to a
normal loop if target-specific restrictions are not met and a hardware-loop
can't be generated.

These intrinsics may be modified in the future and are not intended to be used
outside the backend. Thus, front-end and mid-level optimizations should not be
generating these intrinsics.


'``llvm.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare void @llvm.set.loop.iterations.i32(i32)
      declare void @llvm.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
hardware-loop trip count. They are placed in the loop preheader basic block and
are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. It's a hint to the backend that can use this to set up the
hardware-loop count with a target specific instruction, usually a move of this
value to a special register or a hardware-loop instruction.


'``llvm.start.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i64 @llvm.start.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics, used to specify the
hardware-loop trip count but also produce a value identical to the input
that can be used as the input to the loop. They are placed in the loop
preheader basic block and the output is expected to be the input to the
phi for the induction variable of the loop, decremented by the
'``llvm.loop.decrement.reg.*``'.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. It's a hint to the backend that can use this to set up the
hardware-loop count with a target specific instruction, usually a move of this
value to a special register or a hardware-loop instruction.
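
The placement described above (intrinsic call in the preheader, result feeding
the induction-variable phi, counter updated by
'``llvm.loop.decrement.reg.*``') might look roughly as follows. This is an
illustrative sketch only: the function ``@repeat_add`` and its value names are
hypothetical, the trip count ``%n`` is assumed to be non-zero, and
``llvm.loop.decrement.reg`` is assumed to return its first operand minus its
second. Such code would normally be produced by a backend pass rather than
written by hand.

.. code-block:: llvm

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      ; Hypothetical function: adds %x to an accumulator %n times (%n != 0).
      define i32 @repeat_add(i32 %x, i32 %n) {
      entry:                                   ; loop preheader
        %start = call i32 @llvm.start.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %count = phi i32 [ %start, %entry ], [ %next, %loop ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %x
        ; One element is processed per iteration, so decrement by 1.
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
        %again = icmp ne i32 %next, 0
        br i1 %again, label %loop, label %exit

      exit:
        ret i32 %acc.next
      }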

'``llvm.test.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.test.set.loop.iterations.i32(i32)
      declare i1 @llvm.test.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
them to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not, for example, the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use them
to set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
The result is the conditional value of whether the given count is not zero.


'``llvm.test.start.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
      declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics: they specify the hardware-loop trip count, but also produce a
value identical to the input that can be used as the input to the loop. The
second output, of type ``i1``, controls entry to a while-loop.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not, for example, the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use them
to set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
The result is a pair of the input and a conditional value of whether the
given count is not zero.


'``llvm.loop.decrement.reg.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
      declare i64 @llvm.loop.decrement.reg.i64(i64, i64)

Overview:
"""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
iteration counter and return an updated value that will be used in the next
loop test check.

Arguments:
""""""""""

Both arguments must have identical integer types. The first operand is the
loop iteration counter. The second operand is the maximum number of elements
processed in an iteration.

Semantics:
""""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisations are allowed to treat it as a ``SUB``, and it is supported by
SCEV, so it's the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.


'``llvm.loop.decrement.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.loop.decrement.i32(i32)
      declare i1 @llvm.loop.decrement.i64(i64)

Overview:
"""""""""

The HardwareLoops pass allows the loop decrement value to be specified with an
option. It defaults to a loop decrement value of 1, but it can be an unsigned
integer value provided by this option. The '``llvm.loop.decrement.*``'
intrinsics decrement the loop iteration counter by this value, and return a
false predicate if the loop should exit, and true otherwise.
They are emitted if the loop counter is not updated via a ``PHI`` node, which
can also be controlled with an option.

Arguments:
""""""""""

The integer argument is the loop decrement value used to decrement the loop
iteration counter.

Semantics:
""""""""""

The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap.
The result is a condition
that is used by the conditional branch controlling the loop.


Vector Reduction Intrinsics
---------------------------

Horizontal reductions of vectors can be expressed using the following
intrinsics. Each one takes a vector operand as an input and applies its
respective operation across all elements of the vector, returning a single
scalar result of the same element type.


'``llvm.vector.reduce.add.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.fadd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
``ADD`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart. Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction.
That is, the reduction begins with
the start value and performs an fadd operation with consecutively increasing
vector element indices. See the following pseudocode:

::

    float sequential_fadd(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector)
        result = result + input_vector[i]
      return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating point addition.

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction


'``llvm.vector.reduce.mul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.fmul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
``MUL`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart. Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction. That is, the reduction begins with
the start value and performs an fmul operation with consecutively increasing
vector element indices. See the following pseudocode:

::

    float sequential_fmul(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector)
        result = result * input_vector[i]
      return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating point multiplication.
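
Why the evaluation order matters can be seen in a small Python model. This is
a sketch under stated assumptions: Python floats stand in for the ``double``
element type, ``sequential_fmul`` mirrors the pseudocode above, and
``pairwise_fmul`` is just one hypothetical order that a ``reassoc`` reduction
might use (the actual order is not specified).

.. code-block:: python

      import math


      def sequential_fmul(start_value, input_vector):
          """Multiply left to right, starting from start_value."""
          result = start_value
          for element in input_vector:
              result = result * element
          return result


      def pairwise_fmul(start_value, input_vector):
          """Multiply neighbouring pairs first (one possible reassociation)."""
          values = list(input_vector)
          while len(values) > 1:
              paired = [values[i] * values[i + 1]
                        for i in range(0, len(values) - 1, 2)]
              if len(values) % 2:
                  paired.append(values[-1])  # odd element carried over
              values = paired
          return start_value * values[0] if values else start_value


      lanes = [1e308, 1.0, 4.0, 0.25]
      print(sequential_fmul(1.0, lanes))  # overflows when multiplying by 4.0
      print(pairwise_fmul(1.0, lanes))    # 4.0 * 0.25 is formed first

With these inputs the sequential order overflows to infinity, while the
pairwise order stays finite, so the two forms are not interchangeable.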

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction

'``llvm.vector.reduce.and.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.or.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
of a vector, returning the result as a scalar. The return type matches the
element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.xor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.
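
The three bitwise reductions above can be modelled in a few lines of Python.
This is an illustrative sketch only: the helper names are hypothetical, and
lanes are represented as non-negative Python integers rather than fixed-width
LLVM values.

.. code-block:: python

      from functools import reduce
      import operator


      def reduce_and(lanes):
          """Bitwise AND across all lanes."""
          return reduce(operator.and_, lanes)


      def reduce_or(lanes):
          """Bitwise OR across all lanes."""
          return reduce(operator.or_, lanes)


      def reduce_xor(lanes):
          """Bitwise XOR across all lanes."""
          return reduce(operator.xor, lanes)


      lanes = [0b1100, 0b1010, 0b1111]
      print(bin(reduce_and(lanes)), bin(reduce_or(lanes)), bin(reduce_xor(lanes)))

When every lane is a single bit, the same model reads as "all lanes set"
(AND), "any lane set" (OR) and "odd number of lanes set" (XOR).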

'``llvm.vector.reduce.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
integer ``MAX`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.
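
The signed and unsigned variants differ only in how the same lane bits are
interpreted. The following Python sketch illustrates this for 32-bit lanes;
the helper names and the ``bits`` parameter are assumptions made for the
example, and two's complement interpretation is assumed for the signed case.

.. code-block:: python

      def as_signed(value, bits):
          """Interpret the low `bits` bits of value as a two's complement number."""
          value &= (1 << bits) - 1
          return value - (1 << bits) if value >> (bits - 1) else value


      def reduce_smax(lanes, bits=32):
          """Signed maximum across all lanes."""
          return max(as_signed(lane, bits) for lane in lanes)


      def reduce_smin(lanes, bits=32):
          """Signed minimum across all lanes."""
          return min(as_signed(lane, bits) for lane in lanes)


      def reduce_umax(lanes, bits=32):
          """Unsigned maximum across all lanes."""
          mask = (1 << bits) - 1
          return max(lane & mask for lane in lanes)


      lanes = [0x00000001, 0xFFFFFFFF, 0x7FFFFFFF, 0x80000000]
      print(reduce_smax(lanes), reduce_smin(lanes), hex(reduce_umax(lanes)))

For these lanes the signed maximum is ``0x7FFFFFFF``, whereas the unsigned
maximum is ``0xFFFFFFFF``, which the signed reading treats as -1.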

'``llvm.vector.reduce.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
integer ``MIN`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.fmax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This intrinsic has the same comparison semantics as the '``llvm.maxnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with maximum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

'``llvm.vector.reduce.fmin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This intrinsic has the same comparison semantics as the '``llvm.minnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with minimum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

'``llvm.experimental.vector.insert``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. You can use ``llvm.experimental.vector.insert``
to insert a fixed-width vector into a scalable vector, but not the other way
around.

::

      declare <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 %idx)
      declare <vscale x 2 x double> @llvm.experimental.vector.insert.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 %idx)

Overview:
"""""""""

The '``llvm.experimental.vector.insert.*``' intrinsics insert a vector into another vector
starting from a given index. The return type matches the type of the vector we
insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors.

Arguments:
""""""""""

The ``vec`` is the vector which ``subvec`` will be inserted into.
The ``subvec`` is the vector that will be inserted.

``idx`` represents the starting element number at which ``subvec`` will be
inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
cannot be determined statically but is false at runtime, then the result vector
is undefined.


'``llvm.experimental.vector.extract``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. You can use
``llvm.experimental.vector.extract`` to extract a fixed-width vector from a
scalable vector, but not the other way around.

::

      declare <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 %idx)
      declare <2 x double> @llvm.experimental.vector.extract.v2f64(<vscale x 2 x double> %vec, i64 %idx)

Overview:
"""""""""

The '``llvm.experimental.vector.extract.*``' intrinsics extract a vector from
within another vector starting from a given index. The return type must be
explicitly specified. Conceptually, this can be used to decompose a scalable
vector into non-scalable parts.

Arguments:
""""""""""

The ``vec`` is the vector from which we will extract a subvector.

The ``idx`` specifies the starting element number within ``vec`` from which a
subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
vector length of the result type.
If the result type is a scalable vector,
``idx`` is first scaled by the result type's runtime scaling factor. Elements
``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
indices. If this condition cannot be determined statically but is false at
runtime, then the result vector is undefined. The ``idx`` parameter must be a
vector index constant type (for most targets this will be an integer pointer
type).

'``llvm.experimental.vector.reverse``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
      declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
The intrinsic takes a single vector and returns a vector of matching type but
with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
recommended way to express reverse operations for fixed-width vectors is still
to use a shufflevector, as that may allow for more optimization opportunities.

Arguments:
""""""""""

The argument to this intrinsic must be a vector.

'``llvm.experimental.vector.splice``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
      declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)

Overview:
"""""""""

The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
concatenating elements from the first input vector with elements of the second
input vector, returning a vector of the same type as the input vectors. The
signed immediate, modulo the number of elements in the vector, is the index
into the first vector from which to extract the result value. This means
conceptually that for a positive immediate, a vector is extracted from
``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
immediate, it extracts ``-imm`` trailing elements from the first vector, and
the remaining elements from ``%vec2``.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
is marked as experimental, the recommended way to express this operation for
fixed-width vectors is still to use a shufflevector, as that may allow for more
optimization opportunities.

For example:

.. code-block:: text

 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1)  ==> <B, C, D, E> ; index
 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements


Arguments:
""""""""""

The first two operands are vectors with the same type. The third argument
``imm`` is the start index, modulo VL, where VL is the runtime vector length of
the source/result vector. The ``imm`` is a signed integer constant in the range
``-VL <= imm < VL``. For values outside of this range the result is poison.
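
The splice semantics described above can be sketched as a Python reference
model. This is an illustration only: ``vector_splice`` is a hypothetical
helper, vectors are plain Python lists, and an out-of-range ``imm`` raises an
exception here because the model has no way to represent a poison value.

.. code-block:: python

      def vector_splice(vec1, vec2, imm):
          """Return len(vec1) elements of concat(vec1, vec2) selected by imm."""
          vl = len(vec1)
          if len(vec2) != vl:
              raise ValueError("both input vectors must have the same length")
          if not -vl <= imm < vl:
              raise ValueError("imm out of range (the IR result would be poison)")
          # A negative imm counts trailing elements of vec1, which is the same
          # as starting at index vl + imm of the concatenation.
          start = imm if imm >= 0 else vl + imm
          concat = list(vec1) + list(vec2)
          return concat[start:start + vl]


      print(vector_splice(list("ABCD"), list("EFGH"), 1))
      print(vector_splice(list("ABCD"), list("EFGH"), -3))

Both calls select the same window, matching the two example lines in the
overview: an index of 1 and three trailing elements describe the same start.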

'``llvm.experimental.stepvector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
to generate a vector whose lane values comprise the linear sequence
<0, 1, 2, ...>. It is primarily intended for scalable vectors.

::

      declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
      declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()

The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
of integers whose elements contain a linear sequence of values starting from 0
with a step of 1. This experimental intrinsic can only be used for vectors
with integer elements that are at least 8 bits in size. If the sequence value
exceeds the allowed limit for the element type then the result for that lane is
undefined.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
is marked as experimental, the recommended way to express this operation for
fixed-width vectors is still to generate a constant vector instead.


Arguments:
""""""""""

None.


Matrix Intrinsics
-----------------

Operations on matrixes requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrixes are passed and returned as vectors. This means that for a ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently column-major layout
is assumed. The intrinsics support both integer and floating point matrixes.
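
The ``j * R + i`` indexing rule can be made concrete with a short Python
sketch. The helper names are hypothetical and a matrix is represented here as
a list of rows, purely for illustration.

.. code-block:: python

      def flatten_column_major(matrix):
          """Flatten a list-of-rows matrix into a column-major vector."""
          rows = len(matrix)
          cols = len(matrix[0])
          return [matrix[i][j] for j in range(cols) for i in range(rows)]


      def element_index(i, j, rows):
          """Vector index of element i of column j in a matrix with `rows` rows."""
          return j * rows + i


      # A 2 x 3 matrix: R = 2 rows, C = 3 columns.
      matrix = [[1, 2, 3],
                [4, 5, 6]]
      vector = flatten_column_major(matrix)
      print(vector)                           # columns are stored one after another
      print(vector[element_index(1, 2, 2)])   # row 1, column 2

Here the vector is ``[1, 4, 2, 5, 3, 6]``, and the element in row 1 of
column 2 is found at index ``2 * 2 + 1 = 5``.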


'``llvm.matrix.transpose.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
<Cols>`` matrix and return the transposed matrix in the result vector.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
number of rows and columns, respectively, and must be positive, constant
integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
the same float or integer element type as ``%In``.

'``llvm.matrix.multiply.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)

Overview:
"""""""""

The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.

Arguments:
""""""""""

The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
<Inner>`` elements, and the second argument ``%B`` to a matrix with
``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
Vectors ``%A``, ``%B``, and the returned vector all have the same float or
integer element type.


'``llvm.matrix.column.major.load.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.column.major.load.*(
          ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
matrix using a stride of ``%Stride`` to compute the start address of the
different columns. This allows for convenient loading of sub matrixes. If
``<IsVolatile>`` is true, the intrinsic is considered a :ref:`volatile memory
access <volatile>`. The result matrix is returned in the result vector. If the
``%Ptr`` argument is known to be aligned to some boundary, this can be
specified as an attribute on the argument.

Arguments:
""""""""""

The first argument ``%Ptr`` is a pointer type to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value. The fourth and fifth arguments,
``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
respectively, and must be positive, constant integers. The returned vector must
have ``<Rows> * <Cols>`` elements.

The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
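
The column address computation ``%Ptr + C * %Stride`` can be illustrated with
a Python model. This is a sketch under stated assumptions: memory is modelled
as a flat list, ``ptr`` and ``stride`` are measured in elements rather than
bytes, and ``column_major_load`` is a hypothetical helper, not an LLVM API.

.. code-block:: python

      def column_major_load(memory, ptr, stride, rows, cols):
          """Load a rows x cols matrix whose columns start `stride` elements apart."""
          if stride < rows:
              raise ValueError("stride must be at least the number of rows")
          result = []
          for column in range(cols):
              start = ptr + column * stride   # start address of this column
              result.extend(memory[start:start + rows])
          return result


      # A 4 x 4 column-major matrix in which element (i, j) holds j * 4 + i.
      memory = list(range(16))
      # Load the 2 x 2 sub matrix whose top-left element is row 1, column 1.
      print(column_major_load(memory, ptr=5, stride=4, rows=2, cols=2))

Because the stride (4) is larger than the number of loaded rows (2), the load
skips the unused rows of each column, which is what makes sub matrix loads
convenient.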


'``llvm.matrix.column.major.store.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.matrix.column.major.store.*(
          vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
<Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
columns. If ``<IsVolatile>`` is true, the intrinsic is considered a
:ref:`volatile memory access <volatile>`.

If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
specified as an attribute on the argument.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
pointer to the vector type of ``%In``, and is the start address of the matrix
in memory. The third argument ``%Stride`` is a positive, constant integer with
``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean
value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
and columns, respectively, and must be positive, constant integers.

The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.


Half Precision Floating-Point Intrinsics
----------------------------------------

For most target platforms, half precision floating-point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

This means that code must first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point type to half precision floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point format to half precision floating-point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single precision
floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

Saturating floating-point to integer conversions
------------------------------------------------

The ``fptoui`` and ``fptosi`` instructions return a
:ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
representable by the result type. These intrinsics provide an alternative
conversion, which will saturate towards the smallest and largest representable
integer values instead.

'``llvm.fptoui.sat.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
floating-point argument type and any integer result type, or vectors thereof.
Not all targets may support all types, however.

::

      declare i32 @llvm.fptoui.sat.i32.f32(float %f)
      declare i19 @llvm.fptoui.sat.i19.f64(double %f)
      declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)

Overview:
"""""""""

This intrinsic converts the argument into an unsigned integer using saturating
semantics.

Arguments:
""""""""""

The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The number of vector
elements in argument and return must be the same.

Semantics:
""""""""""

The conversion to integer is performed subject to the following rules:

- If the argument is any NaN, zero is returned.
- If the argument is smaller than zero (this includes negative infinity),
  zero is returned.
- If the argument is larger than the largest representable unsigned integer of
  the result type (this includes positive infinity), the largest representable
  unsigned integer is returned.
- Otherwise, the result of rounding the argument towards zero is returned.

Example:
""""""""

.. code-block:: text

      %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9)              ; yields i8: 123
      %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7)               ; yields i8:   0
      %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
      %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:   0

'``llvm.fptosi.sat.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
floating-point argument type and any integer result type, or vectors thereof.
Not all targets may support all types, however.

::

      declare i32 @llvm.fptosi.sat.i32.f32(float %f)
      declare i19 @llvm.fptosi.sat.i19.f64(double %f)
      declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)

Overview:
"""""""""

This intrinsic converts the argument into a signed integer using saturating
semantics.

Arguments:
""""""""""

The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The number of vector
elements in argument and return must be the same.

Semantics:
""""""""""

The conversion to integer is performed subject to the following rules:

- If the argument is any NaN, zero is returned.
- If the argument is smaller than the smallest representable signed integer of
  the result type (this includes negative infinity), the smallest
  representable signed integer is returned.
- If the argument is larger than the largest representable signed integer of
  the result type (this includes positive infinity), the largest representable
  signed integer is returned.
- Otherwise, the result of rounding the argument towards zero is returned.

Example:
""""""""

.. code-block:: text

      %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9)               ; yields i8:   23
      %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8)             ; yields i8: -128
      %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8:  127
      %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:    0

.. _dbg_intrinsics:

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with ``llvm.dbg.``
prefix), are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with
``llvm.eh.`` prefix), are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter - the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific -
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with
target-dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.


.. _int_vp:

Vector Predication Intrinsics
-----------------------------

VP intrinsics are intended for predicated SIMD/vector code. A typical VP
operation takes a vector mask and an explicit vector length parameter as in:

::

      <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)

The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for
example ``<32 x i1>``. The explicit vector length parameter always has the type
``i32`` and is an unsigned integer value. The explicit vector length parameter
(``%evl``) is in the range:

::

      0 <= %evl <= W,  where W is the number of vector elements

Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
length of the vector.

The VP intrinsic has undefined behavior if ``%evl > W``.
The explicit vector
length (%evl) creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set
to True, and all other lanes ``%evl <= i < W`` to False. A new mask %M is
calculated with an element-wise AND from %mask and %EVLmask:

::

      M = %mask AND %EVLmask

A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:

::

       A <opcode> B =  {  A[i] <opcode> B[i]   M[i] = True, and
                       {  undef otherwise

Optimization Hint
^^^^^^^^^^^^^^^^^

Some targets, such as AVX512, do not support the %evl parameter in hardware.
The use of an effective %evl is discouraged for those targets. The function
``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
has native support for %evl.


.. _int_vp_add:

'``llvm.vp.add.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer addition of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
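
How the mask and explicit vector length operands combine can be sketched with
ordinary instructions. The function below is an illustration only, not how any
target is required to lower the intrinsic: it builds the lane mask implied by
``%evl`` for a fixed ``<4 x i32>`` vector, ANDs it with ``%mask``, and selects
between the sum and ``undef``. The function name ``@vp_add_expanded`` is
hypothetical, and the sketch assumes ``%evl <= 4``, since larger values are
undefined behavior for the intrinsic itself.

.. code-block:: llvm

      define <4 x i32> @vp_add_expanded(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        ; Broadcast %evl to every lane.
        %evl.ins = insertelement <4 x i32> undef, i32 %evl, i32 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> undef, <4 x i32> zeroinitializer
        ; Lane i of the EVL mask is true iff i < %evl.
        %evl.mask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        ; A lane is enabled only if both masks enable it.
        %m = and <4 x i1> %mask, %evl.mask
        %t = add <4 x i32> %a, %b
        ; Disabled lanes are left undefined.
        %r = select <4 x i1> %m, <4 x i32> %t, <4 x i32> undef
        ret <4 x i32> %r
      }
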

Semantics:
""""""""""

The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = add <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef

.. _int_vp_sub:

'``llvm.vp.sub.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer subtraction of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sub``' intrinsic performs integer subtraction
(:ref:`sub <i_sub>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sub <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_mul:

'``llvm.vp.mul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer multiplication of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.mul``' intrinsic performs integer multiplication
(:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = mul <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_sdiv:

'``llvm.vp.sdiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, signed division of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sdiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_udiv:

'``llvm.vp.udiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, unsigned division of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.udiv``' intrinsic performs unsigned division
(:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = udiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_srem:

'``llvm.vp.srem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated computation of the signed remainder of two integer vectors.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
(:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = srem <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_urem:

'``llvm.vp.urem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated computation of the unsigned remainder of two integer vectors.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
(:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = urem <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_ashr:

'``llvm.vp.ashr.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated arithmetic right-shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
(:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
enabled lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = ashr <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_lshr:

'``llvm.vp.lshr.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated logical right-shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.lshr``' intrinsic computes the logical right shift
(:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
enabled lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = lshr <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_shl:

'``llvm.vp.shl.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated left shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
the first operand by the second operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = shl <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_or:

'``llvm.vp.or.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated or.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
first two operands on each enabled lane. The result on disabled lanes is
undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = or <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_and:

'``llvm.vp.and.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated and.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two operands on each enabled lane. The result on disabled lanes is
undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = and <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_xor:

'``llvm.vp.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated, bitwise xor.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
the first two operands on each enabled lane.
The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = xor <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_get_active_lane_mask:

'``llvm.get.active.lane.mask.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)


Overview:
"""""""""

Create a mask representing active and inactive vector lanes.


Arguments:
""""""""""

Both operands have the same scalar integer type. The result is a vector with
the i1 element type.

Semantics:
""""""""""

The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
to:

::

      %m[i] = icmp ult (%base + i), %n

where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
the unsigned less-than comparison operator. Overflow cannot occur in
``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
poison value. The above is equivalent to:

::

      %m = @llvm.get.active.lane.mask(%base, %n)

This can, for example, be emitted by the loop vectorizer in which case
``%base`` is the first element of the vector induction variable (VIV) and
``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
less than comparison of VIV with the loop tripcount, producing a mask of
true/false values representing active/inactive vector lanes, except if the VIV
overflows in which case they return false in the lanes where the VIV overflows.
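
As a worked illustration of the lane-wise comparison above, the following
sketch shows the resulting mask for two sets of constant arguments. The
function name ``@lane_mask_example`` and the constants are assumptions chosen
for illustration; the lane values in the comments follow from evaluating
``icmp ult (%base + i), %n`` without wrap-around.

.. code-block:: llvm

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32, i32)

      define <4 x i1> @lane_mask_example() {
      entry:
        ; base = 6, n = 8: lanes compare 6, 7, 8, 9 against 8
        ; => <i1 true, i1 true, i1 false, i1 false>
        %tail = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 6, i32 8)

        ; base = 4294967294 (written as i32 -2), n = 4294967295 (written as i32 -1):
        ; lanes compare 4294967294, 4294967295, 4294967296, 4294967297
        ; against 4294967295 without wrap-around
        ; => <i1 true, i1 false, i1 false, i1 false>
        %wrap = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 -2, i32 -1)

        %both = and <4 x i1> %tail, %wrap
        ret <4 x i1> %both
      }

The second call shows the overflow case: lanes whose index would wrap in a
32-bit addition are inactive rather than spuriously active.
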
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what type the step vector that enumerates its lanes would need to
have to avoid overflow.

This mask ``%m`` can e.g. be used in masked load/store instructions. These
intrinsics also provide a hint to the backend: for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.


Examples:
"""""""""

.. code-block:: llvm

      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)


.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.

::

      declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers to double
      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
      ;; The data is a vector of function pointers
      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets.
Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.


::

       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; The result of the two following instructions is identical aside from potential memory access exception
       %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru

.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.

::

       declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>)
       ;; The data is a vector of pointers to double
       declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       ;; The data is a vector of function pointers
       declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked store and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.

::

       call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask)

       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
       %oldval = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
       store <16 x float> %res, <16 x float>* %ptr, align 4


Masked Vector Gather and Scatter Intrinsics
-------------------------------------------

LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses.
Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.

.. _int_mgather:

'``llvm.masked.gather.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.

::

      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>)

Overview:
"""""""""

Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result.
The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads with subsequent gathering of all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.


::

       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

       ;; The gather with all-true mask is equivalent to the following instruction sequence
       %ptr0 = extractelement <4 x double*> %ptrs, i32 0
       %ptr1 = extractelement <4 x double*> %ptrs, i32 1
       %ptr2 = extractelement <4 x double*> %ptrs, i32 2
       %ptr3 = extractelement <4 x double*> %ptrs, i32 3

       %val0 = load double, double* %ptr0, align 8
       %val1 = load double, double* %ptr1, align 8
       %val2 = load double, double* %ptr2, align 8
       %val3 = load double, double* %ptr3, align 8

       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3

.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.

::

       declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
       declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.

Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation.
The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.

::

       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>)

       ;; It is equivalent to a list of scalar stores
       %val0 = extractelement <8 x i32> %value, i32 0
       %val1 = extractelement <8 x i32> %value, i32 1
       ..
       %val7 = extractelement <8 x i32> %value, i32 7
       %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
       %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
       ..
       %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
       ;; Note: the order of the following stores is important when they overlap:
       store i32 %val0, i32* %ptr0, align 4
       store i32 %val1, i32* %ptr1, align 4
       ..
       store i32 %val7, i32* %ptr7, align 4


Masked Vector Expanding Load and Compressing Store Intrinsics
-------------------------------------------------------------

LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.

.. _int_expandload:

'``llvm.masked.expandload.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.

::

      declare <16 x float> @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x i64> @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)

Overview:
"""""""""

Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.

Semantics:
""""""""""

The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes.
It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:

.. code-block:: c

    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }


.. code-block:: llvm

    ; Load several elements from array B and expand them in a vector.
    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
    ; Store the result in A
    call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.

.. _int_compressstore:

'``llvm.masked.compressstore.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.

::

      declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, i32* <ptr>, <8 x i1> <mask>)
      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)

Overview:
"""""""""

Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.

Arguments:
""""""""""

The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences, as in the following example:

.. code-block:: c

    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }


.. code-block:: llvm

    ; Load elements from A.
    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.


Memory Use Markers
------------------

This class of intrinsics provides information about the
:ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

If ``ptr`` is a stack-allocated object and it points to the first byte of
the object, the object is initially marked as dead.
``ptr`` is conservatively considered as a non-stack-allocated object if
the stack coloring algorithm that is used in the optimization pipeline cannot
conclude that ``ptr`` is a stack-allocated object.

After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is
marked as alive and has an uninitialized value.
The stack object is marked as dead when either
:ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
function returns.

After :ref:`llvm.lifetime.end <int_lifeend>` is called,
'``llvm.lifetime.start``' on the stack object can be called again.
The second '``llvm.lifetime.start``' call marks the object as alive, but it
does not change the address of the object.

If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.


.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

If ``ptr`` is a stack-allocated object and it points to the first byte of the
object, the object is dead.
``ptr`` is conservatively considered as a non-stack-allocated object if
the stack coloring algorithm that is used in the optimization pipeline cannot
conclude that ``ptr`` is a stack-allocated object.

Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
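
The pairing of the two lifetime markers can be sketched as follows. This is a
minimal illustration, not a prescribed pattern: the function ``@scoped_buffer``
and the callee ``@use`` are hypothetical, and the intrinsic declarations are
spelled as in the Syntax sections above.

.. code-block:: llvm

      declare void @llvm.lifetime.start(i64, i8* nocapture)
      declare void @llvm.lifetime.end(i64, i8* nocapture)
      declare void @use(i8*)   ; hypothetical consumer of the buffer

      define void @scoped_buffer() {
      entry:
        %buf = alloca [16 x i8]
        %p = bitcast [16 x i8]* %buf to i8*   ; points to the first byte of %buf

        ; %buf becomes alive; its contents are uninitialized.
        call void @llvm.lifetime.start(i64 16, i8* %p)
        call void @use(i8* %p)
        ; %buf is dead from here on.
        call void @llvm.lifetime.end(i64 16, i8* %p)

        ; The object may be made alive again; its address does not change.
        call void @llvm.lifetime.start(i64 16, i8* %p)
        call void @use(i8* %p)
        call void @llvm.lifetime.end(i64 16, i8* %p)
        ret void
      }

Because ``%p`` is derived directly from the ``alloca`` and points to its first
byte, both markers fall under the stack-object case described above.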

If ``ptr`` is a non-stack-allocated object or it does not point to the first
byte of the object, it is equivalent to simply filling all bytes of the object
with ``poison``.


'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The third argument is a
pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.
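
A minimal sketch of how the two invariant markers bracket a region, assuming a
4-byte object reached through ``%p``. The function ``@read_twice`` and the
callee ``@opaque`` are hypothetical names introduced only for this
illustration.

.. code-block:: llvm

      declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture)
      declare void @llvm.invariant.end.p0i8({}*, i64, i8* nocapture)
      declare void @opaque()   ; hypothetical call with unknown side effects

      define i32 @read_twice(i32* %p) {
      entry:
        %p8 = bitcast i32* %p to i8*
        ; From here until the matching invariant.end, the 4 bytes at %p
        ; are stated to be constant and unchanging.
        %inv = call {}* @llvm.invariant.start.p0i8(i64 4, i8* %p8)
        %a = load i32, i32* %p
        call void @opaque()
        ; An optimizer is permitted to assume this load yields the same value as %a.
        %b = load i32, i32* %p
        ; The token returned by invariant.start identifies the region being closed.
        call void @llvm.invariant.end.p0i8({}* %inv, i64 4, i8* %p8)
        %sum = add i32 %a, %b
        ret i32 %sum
      }

After the ``llvm.invariant.end`` call the memory is mutable again, so loads
placed after it may not be merged with ``%a`` on the basis of the marker alone.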

'``llvm.launder.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new
pointer value that carries fresh invariant group information. It is an
experimental intrinsic, which means that its semantics might change in the
future.


Arguments:
""""""""""

The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
It does not read any accessible memory and the execution can be speculated.

'``llvm.strip.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
value that does not carry the invariant information. It is an experimental
intrinsic, which means that its semantics might change in the future.
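
The two invariant-group intrinsics can be contrasted in a short sketch. It
assumes a hypothetical function ``@replace_object`` standing for any operation
after which the old invariant no longer holds (for example, constructing a new
object in the same storage); ``@reload`` and the empty metadata node ``!0`` are
likewise chosen only for illustration.

.. code-block:: llvm

      declare i8* @llvm.launder.invariant.group.p0i8(i8*)
      declare i8* @llvm.strip.invariant.group.p0i8(i8*)
      declare void @replace_object(i8*)   ; hypothetical

      define i8 @reload(i8* %ptr) {
      entry:
        %old = load i8, i8* %ptr, !invariant.group !0
        call void @replace_object(i8* %ptr)

        ; %fresh aliases %ptr but carries fresh invariant group information.
        %fresh = call i8* @llvm.launder.invariant.group.p0i8(i8* %ptr)
        %new = load i8, i8* %fresh, !invariant.group !0

        ; %plain aliases %ptr and carries no invariant group information.
        %plain = call i8* @llvm.strip.invariant.group.p0i8(i8* %ptr)
        %raw = load i8, i8* %plain

        %t = add i8 %old, %new
        %r = add i8 %t, %raw
        ret i8 %r
      }

      !0 = !{}

Loading through ``%fresh`` rather than ``%ptr`` is what keeps the second
``invariant.group`` load from being treated as a reload of ``%old``.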


Arguments:
""""""""""

The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which has no associated
``invariant.group`` metadata.
It does not read any memory and can be speculated.



.. _constrainedfp:

Constrained Floating-Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating-point
operations when a specific rounding mode or floating-point exception behavior
is required. By default, LLVM optimization passes assume that the rounding mode
is round-to-nearest and that floating-point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can create miscompiles if mixing of constrained and normal
operations is done. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.

Each of these intrinsics corresponds to a normal floating-point operation. The
data arguments and the return value are the same as the corresponding FP
operation.

The rounding mode argument is a metadata string specifying what
assumptions, if any, the optimizer can make when transforming constant
values. Some constrained FP intrinsics omit this argument.
If required by the intrinsic, this argument must be one of the following
strings:

::

      "round.dynamic"
      "round.tonearest"
      "round.downward"
      "round.upward"
      "round.towardzero"
      "round.tonearestaway"

If this argument is "round.dynamic", optimization passes must assume that the
rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes. If the argument is any of these values,
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.

For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
'x-0' should evaluate to '-0' when rounding downward. However, this
transformation is legal for all other rounding modes.

For values other than "round.dynamic", optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed. Using a specific
non-dynamic rounding mode which does not match the actual rounding mode at
runtime results in undefined behavior.

The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:

::

      "fpexcept.ignore"
      "fpexcept.maytrap"
      "fpexcept.strict"

If this argument is "fpexcept.ignore", optimization passes may assume that the
exception status flags will not be read and that floating-point exceptions will
be masked.
This allows transformations to be performed that may change the
exception semantics of the original code. For example, FP operations may be
speculatively executed in this case whereas they must not be for either of the
other possible values of this argument.

If the exception behavior argument is "fpexcept.maytrap", optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code. For example, exceptions may be potentially hidden by constant
folding.

If the exception behavior argument is "fpexcept.strict", all transformations must
strictly preserve the floating-point exception semantics of the original code.
Any FP exception that would have been raised by the original code must be raised
by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code. This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.

The number and order of floating-point exceptions are NOT guaranteed. For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.

Proper :ref:`function attributes <fnattrs>` usage is required for the
constrained intrinsics to function correctly.

All function *calls* done in a function that uses constrained floating
point intrinsics must have the ``strictfp`` attribute.

All function *definitions* that use constrained floating point intrinsics
must have the ``strictfp`` attribute.
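
The rules above can be sketched with a small function. This is an illustrative
example only: the function name ``@add_strict`` is hypothetical, and the
``.f64`` suffix assumes the usual overloaded-intrinsic name mangling for
``double``. The rounding mode is declared unknown and exception semantics are
strictly preserved, and both the definition and the call carry ``strictfp``.

.. code-block:: llvm

      ; Declaration of the constrained add, assumed to be overloaded on double.
      declare double @llvm.experimental.constrained.fadd.f64(double, double,
                                                             metadata, metadata)

      ; Hypothetical function: every FP operation in it is constrained, and
      ; the definition itself is marked strictfp.
      define double @add_strict(double %a, double %b) strictfp {
      entry:
        ; "round.dynamic": the optimizer may not assume any rounding mode.
        ; "fpexcept.strict": FP exceptions must be preserved exactly.
        %sum = call double @llvm.experimental.constrained.fadd.f64(
                                double %a, double %b,
                                metadata !"round.dynamic",
                                metadata !"fpexcept.strict") strictfp
        ret double %sum
      }

Replacing the two metadata strings with ``"round.tonearest"`` and
``"fpexcept.ignore"`` would mark the call as having LLVM's default behavior
while keeping the function uniformly constrained.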

'``llvm.experimental.constrained.fadd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fadd``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point sum of the two value operands and has
the same type as the operands.


'``llvm.experimental.constrained.fsub``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fsub``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point difference of the two value operands
and has the same type as the operands.


'``llvm.experimental.constrained.fmul``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fmul``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point product of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.fdiv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fdiv``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.frem``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
from the division of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.frem``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above. The rounding mode argument has no effect, since
the result of frem is never rounded, but the argument is included for
consistency with the other constrained floating-point intrinsics.

Semantics:
""""""""""

The value produced is the floating-point remainder from the division of the two
value operands and has the same type as the operands. The remainder has the
same sign as the dividend.
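
As a brief illustration of the sign rule, consider the sketch below. The
wrapper name ``@rem_strict`` is hypothetical and the ``.f64`` suffix assumes
the usual overloaded-intrinsic mangling. With a dividend of -5.5 and a divisor
of 2.0, the remainder would be -1.5, taking the sign of the dividend.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.frem.f64(double, double,
                                                             metadata, metadata)

      ; Hypothetical wrapper around the constrained remainder.
      define double @rem_strict(double %x, double %y) strictfp {
      entry:
        ; The rounding mode operand is still required syntactically, even
        ; though the result of frem is never rounded.
        %r = call double @llvm.experimental.constrained.frem.f64(
                              double %x, double %y,
                              metadata !"round.dynamic",
                              metadata !"fpexcept.strict") strictfp
        ret double %r
      }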

'``llvm.experimental.constrained.fma``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
fused-multiply-add operation on its operands.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fma``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. All arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The result produced is the product of the first two operands added to the third
operand computed with infinite precision, and then rounded to the target
precision.

'``llvm.experimental.constrained.fptoui``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptoui(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptoui``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is an unsigned integer converted from the floating
point operand. The value is truncated, so it is rounded towards zero.

'``llvm.experimental.constrained.fptosi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptosi(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
:ref:`floating-point <t_floating>` ``value`` to its signed integer equivalent
of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptosi``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a signed integer converted from the floating
point operand. The value is truncated, so it is rounded towards zero.

'``llvm.experimental.constrained.uitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.uitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point value of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.uitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer operand.

'``llvm.experimental.constrained.sitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.sitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.sitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer operand.

'``llvm.experimental.constrained.fptrunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptrunc(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
to type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptrunc``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values. This argument must be larger in size
than the result.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The result produced is a floating point value truncated to be smaller in size
than the operand.

'``llvm.experimental.constrained.fpext``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fpext(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fpext``' intrinsic extends a
floating-point ``value`` to a larger floating-point value.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fpext``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values. This argument must be smaller in size
than the result.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a floating point value extended to be larger in size
than the operand. All restrictions that apply to the fpext instruction also
apply to this intrinsic.
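
The difference in operand lists between the two conversions can be sketched as
follows. The function name ``@narrow_then_widen`` is hypothetical, and the
``.f32.f64`` and ``.f64.f32`` suffixes assume the usual mangling of result type
followed by operand type. Truncation may round and therefore takes a rounding
mode, while extension does not.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.fptrunc.f32.f64(double,
                                                                   metadata,
                                                                   metadata)
      declare double @llvm.experimental.constrained.fpext.f64.f32(float,
                                                                  metadata)

      ; Hypothetical function: narrow a double to float, then widen it back.
      define double @narrow_then_widen(double %x) strictfp {
      entry:
        ; fptrunc takes both a rounding mode and an exception behavior.
        %narrow = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                                  double %x,
                                  metadata !"round.dynamic",
                                  metadata !"fpexcept.strict") strictfp
        ; fpext omits the rounding mode and takes only the exception behavior.
        %wide = call double @llvm.experimental.constrained.fpext.f64.f32(
                                  float %narrow,
                                  metadata !"fpexcept.strict") strictfp
        ret double %wide
      }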

'``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
                                          metadata <condition code>,
                                          metadata <exception behavior>)
      declare <ty2>
      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
                                           metadata <condition code>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on comparison of their operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
comparison operation while the '``llvm.experimental.constrained.fcmps``'
intrinsic performs a signaling comparison operation.

Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fcmp``'
and '``llvm.experimental.constrained.fcmps``' intrinsics must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third argument is the condition code indicating the kind of comparison
to perform.
It must be a metadata string with one of the following values:

- "``oeq``": ordered and equal
- "``ogt``": ordered and greater than
- "``oge``": ordered and greater than or equal
- "``olt``": ordered and less than
- "``ole``": ordered and less than or equal
- "``one``": ordered and not equal
- "``ord``": ordered (no nans)
- "``ueq``": unordered or equal
- "``ugt``": unordered or greater than
- "``uge``": unordered or greater than or equal
- "``ult``": unordered or less than
- "``ule``": unordered or less than or equal
- "``une``": unordered or not equal
- "``uno``": unordered (either nans)

*Ordered* means that neither operand is a NAN while *unordered* means
that either operand may be a NAN.

The fourth argument specifies the exception behavior as described above.

Semantics:
""""""""""

``op1`` and ``op2`` are compared according to the condition code given
as the third argument. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

- "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
  is equal to ``op2``.
- "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
  is greater than ``op2``.
- "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
  is greater than or equal to ``op2``.
- "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
  is less than ``op2``.
- "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
  is less than or equal to ``op2``.
- "``one``": yields ``true`` if both operands are not a NAN and ``op1``
  is not equal to ``op2``.
- "``ord``": yields ``true`` if both operands are not a NAN.
- "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
  equal to ``op2``.
- "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
  greater than ``op2``.
- "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
  greater than or equal to ``op2``.
- "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
  less than ``op2``.
- "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
  less than or equal to ``op2``.
- "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
  not equal to ``op2``.
- "``uno``": yields ``true`` if either operand is a NAN.

The quiet comparison operation performed by
'``llvm.experimental.constrained.fcmp``' will only raise an exception
if either operand is a SNAN. The signaling comparison operation
performed by '``llvm.experimental.constrained.fcmps``' will raise an
exception if either operand is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g. the exception might only
set a flag), therefore the distinction between ordered and unordered
comparisons is also relevant for the
'``llvm.experimental.constrained.fcmps``' intrinsic.
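
The quiet and signaling forms can be contrasted in a minimal sketch. The
function names ``@less_quiet`` and ``@less_signaling`` are hypothetical, and
the ``.f64`` suffix assumes the usual overloaded-intrinsic mangling. Both
functions return the same ``i1`` for the same inputs; they differ only in
which NAN operands raise an exception.

.. code-block:: llvm

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double,
                                                         metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double,
                                                          metadata, metadata)

      ; Quiet ordered less-than: raises an exception only for an SNAN operand.
      define i1 @less_quiet(double %a, double %b) strictfp {
      entry:
        %r = call i1 @llvm.experimental.constrained.fcmp.f64(
                          double %a, double %b,
                          metadata !"olt",
                          metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }

      ; Signaling ordered less-than: raises an exception for any NAN operand,
      ; but still yields false in that case because "olt" is ordered.
      define i1 @less_signaling(double %a, double %b) strictfp {
      entry:
        %r = call i1 @llvm.experimental.constrained.fcmps.f64(
                          double %a, double %b,
                          metadata !"olt",
                          metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }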

'``llvm.experimental.constrained.fmuladd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
                                             <type> <op3>,
                                             metadata <rounding mode>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
multiply-add expressions that can be fused if the code generator determines
that (a) the target instruction set has support for a fused operation,
and (b) that the fused operation is more efficient than the equivalent,
separate pair of mul and add instructions.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
intrinsic must be floating-point or vector of floating-point values.
All three arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
                                                                 metadata <rounding mode>,
                                                                 metadata <exception behavior>)

is equivalent to the expression:

::

      %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)
      %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)

except that it is unspecified whether rounding will be performed between the
multiplication and addition steps. Fusion is not guaranteed, even if the target
platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets
``errno``.

Constrained libm-equivalent Intrinsics
--------------------------------------

In addition to the basic floating-point operations for which constrained
intrinsics are described above, there are constrained versions of various
operations which provide equivalent behavior to a corresponding libm function.
These intrinsics allow the precise behavior of these operations with respect to
rounding mode and exception behavior to be controlled.

As with the basic constrained floating-point intrinsics, the rounding mode
and exception behavior arguments only control the behavior of the optimizer.
They do not change the runtime floating-point environment.


'``llvm.experimental.constrained.sqrt``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sqrt(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
of the specified value, returning the same value as the libm '``sqrt``'
functions would, but without setting ``errno``.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the nonnegative square root of the specified value.
If the value is less than negative zero, a floating-point exception occurs
and the return value is architecture-specific.


'``llvm.experimental.constrained.pow``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power,
returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.


'``llvm.experimental.constrained.powi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand. The
order of evaluation of multiplications is not defined. When a vector of
floating-point type is used, the second argument remains a scalar integer value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type. The second argument is a 32-bit signed integer specifying the power to
which the first argument should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.


'``llvm.experimental.constrained.sin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sin(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the sine of the specified operand, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.cos``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.cos(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the cosine of the specified operand, returning the
same values as the libm ``cos`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.exp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
exponential of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.exp2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
exponential of the specified value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.


Semantics:
""""""""""

This function returns the same values as the libm ``log`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log10``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log10(<type> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log10`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.rint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.rint(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.rint``' intrinsic returns the first
operand rounded to the nearest integer. It may raise an inexact floating-point
exception if the operand is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.lrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lrint(<fptype> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
operand rounded to the nearest integer. An inexact floating-point exception
will be raised if the operand is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument.
The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.lrint`` intrinsic.


'``llvm.experimental.constrained.llrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.llrint(<fptype> <op1>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
operand rounded to the nearest integer. An inexact floating-point exception
will be raised if the operand is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``llrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument. The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.llrint`` intrinsic.


'``llvm.experimental.constrained.nearbyint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.nearbyint(<type> <op1>,
                                               metadata <rounding mode>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
operand rounded to the nearest integer. It will not raise an inexact
floating-point exception if the operand is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.maxnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for maxNum.


'``llvm.experimental.constrained.minnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for minNum.


'``llvm.experimental.constrained.maximum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2018.
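
The difference in call shape between the two maximum flavours can be sketched
as follows. This is only an illustration: the ``.f64`` overload suffix, the
function names ``@maxnum_example`` and ``@maximum_example``, and the choice of
``fpexcept.strict`` are assumptions made for the example, not requirements.

.. code-block:: llvm

    ; Both intrinsics take two operands plus a single metadata argument
    ; describing the exception behavior; there is no rounding mode argument.
    declare double @llvm.experimental.constrained.maxnum.f64(double, double, metadata)
    declare double @llvm.experimental.constrained.maximum.f64(double, double, metadata)

    ; maxNum flavour.
    define double @maxnum_example(double %x, double %y) strictfp {
    entry:
      %m = call double @llvm.experimental.constrained.maxnum.f64(
               double %x, double %y, metadata !"fpexcept.strict") strictfp
      ret double %m
    }

    ; NaN-propagating flavour, which also orders -0.0 below +0.0.
    define double @maximum_example(double %x, double %y) strictfp {
    entry:
      %m = call double @llvm.experimental.constrained.maximum.f64(
               double %x, double %y, metadata !"fpexcept.strict") strictfp
      ret double %m
    }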


'``llvm.experimental.constrained.minimum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2018.


'``llvm.experimental.constrained.ceil``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.ceil(<type> <op1>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.floor``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.floor(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.round``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.round(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.round``' intrinsic returns the first
operand rounded to the nearest integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``round`` functions
would and handles error conditions in the same way.
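
Because ``ceil``, ``floor`` and ``round`` fix their own rounding direction,
they take only the exception behavior argument. A minimal sketch, assuming the
``.f32`` overload and a hypothetical function ``@floor_then_round``:

.. code-block:: llvm

    declare float @llvm.experimental.constrained.floor.f32(float, metadata)
    declare float @llvm.experimental.constrained.round.f32(float, metadata)

    define float @floor_then_round(float %x) strictfp {
    entry:
      ; Exceptions are assumed to be observable, hence "fpexcept.strict".
      %f = call float @llvm.experimental.constrained.floor.f32(
               float %x, metadata !"fpexcept.strict") strictfp
      ; Rounding an already integral value leaves it unchanged.
      %r = call float @llvm.experimental.constrained.round.f32(
               float %f, metadata !"fpexcept.strict") strictfp
      ret float %r
    }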


'``llvm.experimental.constrained.roundeven``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.roundeven(<type> <op1>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
operand rounded to the nearest integer in floating-point format, rounding
halfway cases to even (that is, to the nearest value that is an even integer),
regardless of the current rounding direction.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as C standard function ``roundeven`` and can signal
the invalid operation exception for a SNAN operand.


'``llvm.experimental.constrained.lround``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lround(<fptype> <op1>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lround``' intrinsic returns the first
operand rounded to the nearest integer with ties away from zero. It will
raise an inexact floating-point exception if the operand is not an integer.
An invalid exception is raised if the result is too large to fit into a
supported integer type, and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number.
The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lround`` intrinsic and the ``lround``
libm functions.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``lround`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.llround``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.llround(<fptype> <op1>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.llround``' intrinsic returns the first
operand rounded to the nearest integer with ties away from zero. It will
raise an inexact floating-point exception if the operand is not an integer.
An invalid exception is raised if the result is too large to fit into a
supported integer type, and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llround`` intrinsic and the ``llround``
libm functions.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``llround`` functions
would and handles error conditions in the same way.
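
Since the result type differs from the operand type, both types appear in the
overloaded name. The sketch below assumes the ``.i32.f64`` and ``.i64.f64``
overloads; whether a given pair is supported depends on the target, and the
function names are hypothetical.

.. code-block:: llvm

    declare i32 @llvm.experimental.constrained.lround.i32.f64(double, metadata)
    declare i64 @llvm.experimental.constrained.llround.i64.f64(double, metadata)

    ; Round to a 32-bit integer, ties away from zero.
    define i32 @to_i32(double %x) strictfp {
    entry:
      %r = call i32 @llvm.experimental.constrained.lround.i32.f64(
               double %x, metadata !"fpexcept.strict") strictfp
      ret i32 %r
    }

    ; Same rounding, 64-bit result.
    define i64 @to_i64(double %x) strictfp {
    entry:
      %r = call i64 @llvm.experimental.constrained.llround.i64.f64(
               double %x, metadata !"fpexcept.strict") strictfp
      ret i64 %r
    }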


'``llvm.experimental.constrained.trunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.trunc(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
operand rounded to the nearest integer not larger in magnitude than the
operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would and handles error conditions in the same way.

.. _int_experimental_noalias_scope_decl:

'``llvm.experimental.noalias.scope.decl``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""


::

      declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)

Overview:
"""""""""

The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.


Arguments:
""""""""""

The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
metadata references. The format is identical to that required for ``noalias``
metadata. This list must have exactly one element.

Semantics:
""""""""""

The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared.
When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.

For example, when the intrinsic is used inside a loop body, and that loop is
unrolled, the associated noalias scope must also be duplicated. Otherwise, the
noalias property it signifies would spill across loop iterations, whereas it
was only valid within a single iteration.

.. code-block:: llvm

    ; This example shows two possible positions for noalias.decl and how they impact the semantics:
    ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
    ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
    declare i1 @cond()

    define void @decl_in_loop(i8* %a.base, i8* %b.base) {
    entry:
      ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
      br label %loop

    loop:
      %a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]
      %b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]
      ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
      %val = load i8, i8* %a, !alias.scope !2
      store i8 %val, i8* %b, !noalias !2
      %a.inc = getelementptr inbounds i8, i8* %a, i64 1
      %b.inc = getelementptr inbounds i8, i8* %b, i64 1
      %cond = call i1 @cond()
      br i1 %cond, label %loop, label %exit

    exit:
      ret void
    }

    !0 = !{!0} ; domain
    !1 = !{!1, !0} ; scope
    !2 = !{!1} ; scope list

Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
are possible, but one should never dominate another. Violations are pointed out
by the verifier as they indicate a problem in either a transformation pass or
the input.


Floating Point Environment Manipulation intrinsics
--------------------------------------------------

These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.

'``llvm.flt.rounds``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.flt.rounds()

Overview:
"""""""""

The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.

Semantics:
""""""""""

The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of
``FLT_ROUNDS``, specified by the C standard:

::

    0  - toward zero
    1  - to nearest, ties to even
    2  - toward positive infinity
    3  - toward negative infinity
    4  - to nearest, ties away from zero

Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.


'``llvm.set.rounding``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.set.rounding(i32 <val>)

Overview:
"""""""""

The '``llvm.set.rounding``' intrinsic sets the current rounding mode.

Arguments:
""""""""""

The argument is the required rounding mode. The encoding of the rounding mode
is the same as used by '``llvm.flt.rounds``'.

Semantics:
""""""""""

The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does
not return any value and uses a platform-independent representation of IEEE
rounding modes.
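
Because the two intrinsics share one encoding, a value read with
``llvm.flt.rounds`` can be passed back to ``llvm.set.rounding`` unchanged. The
following is a minimal save/modify/restore sketch; the function name
``@with_upward_rounding`` is hypothetical, and any floating-point work done
between the two ``llvm.set.rounding`` calls is assumed to use constrained
intrinsics.

.. code-block:: llvm

    declare i32 @llvm.flt.rounds()
    declare void @llvm.set.rounding(i32)

    define void @with_upward_rounding() strictfp {
    entry:
      ; Remember the rounding mode that was active on entry.
      %saved = call i32 @llvm.flt.rounds()
      ; 2 = toward positive infinity in the FLT_ROUNDS encoding.
      call void @llvm.set.rounding(i32 2)
      ; ... rounding-sensitive computation would go here ...
      ; Restore the caller's rounding mode before returning.
      call void @llvm.set.rounding(i32 %saved)
      ret void
    }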


General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE* you must specify an address space for
the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.codeview.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This annotation emits a label at its program point and an associated
``S_ANNOTATION`` codeview record with some additional string metadata. This is
used to implement MSVC's ``__annotation`` intrinsic. It is marked
``noduplicate``, so calls to this intrinsic prevent inlining and should be
considered expensive.

::

      declare void @llvm.codeview.annotation(metadata)

Arguments:
""""""""""

The argument should be an MDTuple containing any number of MDStrings.
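
As a small sketch of the expected argument shape, the call below passes a
tuple of two strings. The function name ``@annotated`` and the string contents
are placeholders chosen for the example.

.. code-block:: llvm

    declare void @llvm.codeview.annotation(metadata)

    define void @annotated() {
    entry:
      ; The operand is a metadata tuple whose elements are all strings.
      call void @llvm.codeview.annotation(metadata !0)
      ret void
    }

    !0 = !{!"category", !"message"}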

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() cold noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

'``llvm.ubsantrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind

Overview:
"""""""""

The '``llvm.ubsantrap``' intrinsic.

Arguments:
""""""""""

An integer describing the kind of failure detected.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an execution trap.
Where possible, the argument is embedded into the encoding of that trap so that
different crashes can be told apart.

Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.
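
A typical use is on the failure path of a runtime check. The sketch below is
illustrative only: the function ``@checked_load`` is hypothetical, and the
failure kind ``1`` is an arbitrary value picked for the example rather than a
documented code.

.. code-block:: llvm

    declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind

    define i32 @checked_load(i32* %base, i64 %idx, i64 %len) {
    entry:
      ; Unsigned comparison also rejects negative indices.
      %inbounds = icmp ult i64 %idx, %len
      br i1 %inbounds, label %ok, label %fail

    ok:
      %p = getelementptr inbounds i32, i32* %base, i64 %idx
      %v = load i32, i32* %p
      ret i32 %v

    fail:
      ; The argument must be an immediate constant.
      call void @llvm.ubsantrap(i8 1)
      unreachable
    }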

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently some platforms have IR-level customized stack guard loading (e.g.
X86 Linux) that is not handled by ``llvm.stackguard()``, although it should be
in the future.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to the
optimizer to determine whether a) an operation (like memcpy) will overflow a
buffer that corresponds to an object, or b) that a runtime check for overflow
isn't necessary. An object in this context means an allocation of a specific
class, structure, array, or other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
pointer to or into the ``object``. The second argument determines whether
``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
in address space 0 is used as its pointer argument. If it's ``false``,
``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
the ``null`` is in a non-zero address space or if ``true`` is given for the
third argument of ``llvm.objectsize``, we assume its size is unknown.
The fourth argument to ``llvm.objectsize`` determines if the value should be
evaluated at runtime.

The second, third, and fourth arguments only accept constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
the object concerned. If the size cannot be determined, ``llvm.objectsize``
returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is an expected value.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

'``llvm.expect.with.probability``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
You can use ``llvm.expect.with.probability`` on any integer bit width.

::

      declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
      declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
      declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)

Overview:
"""""""""

The ``llvm.expect.with.probability`` intrinsic provides information about the
expected value of ``val`` with probability (or confidence) ``prob``, which can
be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
argument is a value. The second argument is an expected value. The third
argument is a probability.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

More complex assumptions can be encoded as
:ref:`assume operand bundles <assume_opbundles>`.

Arguments:
""""""""""

The argument of the call is the condition which the optimizer may assume is
always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.
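
As a minimal sketch of these semantics, the hypothetical function
``@div_by_nonzero`` below tells the optimizer that ``%b`` is never zero before
dividing by it. The function and value names are illustrative assumptions and
not part of LLVM.

.. code-block:: llvm

      declare void @llvm.assume(i1)

      ; Hypothetical example function: the frontend knows %b is non-zero.
      define i32 @div_by_nonzero(i32 %a, i32 %b) {
      entry:
        ; The comparison exists only to feed the assumption; no code is
        ; generated for it or for the call to llvm.assume.
        %nonzero = icmp ne i32 %b, 0
        call void @llvm.assume(i1 %nonzero)
        %q = udiv i32 %a, %b
        ret i32 %q
      }

If ``%b`` is zero when control reaches the call, the behavior of the program is
undefined, so an assumption like this is only appropriate when the frontend can
guarantee the condition.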

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.

.. _int_ssa_copy:

'``llvm.ssa.copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.ssa.copy(type %operand) returned(1) readnone

Arguments:
""""""""""

The first argument is an operand which is used as the returned value.

Overview:
""""""""""

The ``llvm.ssa.copy`` intrinsic can be used to attach information to
operations by copying them and giving them new names. For example,
the PredicateInfo utility uses it to build Extended SSA form, and
attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.

.. _type.checked.load:

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer. The
third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (the others being ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.
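
The tail-position rules can be illustrated with a minimal sketch. The function
``@f``, its blocks, and the contents of the ``"deopt"`` bundle are hypothetical,
and the ``.i32`` suffix is the return-type mangling assumed here for an ``i32``
result.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      ; Hypothetical specialized function with a deoptimizing slow path.
      define i32 @f(i32 %x) {
      entry:
        %ok = icmp sgt i32 %x, 0
        br i1 %ok, label %fast, label %slow

      fast:
        %r = add nsw i32 %x, 1
        ret i32 %r

      slow:
        ; The call is not an invoke, immediately precedes the ret, and the
        ; ret returns the value produced by the call.
        %d = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %d
      }

The return type of the call is ``i32`` because ``@f`` returns ``i32``, which is
the implication stated above.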

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined). The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` is defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.
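
A call site might look like the following minimal sketch, in which a
hypothetical function ``@load_checked`` guards a bounds check before a load. The
function name, its parameters, and the state recorded in the ``"deopt"`` bundle
are assumptions made for illustration.

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      ; Hypothetical example: deoptimize unless %idx is within bounds.
      define i32 @load_checked(i32* %p, i32 %idx, i32 %len) {
      entry:
        %in_bounds = icmp ult i32 %idx, %len
        call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %idx) ]
        ; Instructions below the guard only run if the guard did not deoptimize.
        %addr = getelementptr inbounds i32, i32* %p, i32 %idx
        %v = load i32, i32* %addr
        ret i32 %v
      }

The load is reached only when the guard passes, so it may rely on ``%idx`` being
less than ``%len``.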

In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``. Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize a guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); and this allows for "check widening" type optimizations.

``@llvm.experimental.guard`` cannot be invoked.

After ``@llvm.experimental.guard`` was first added, a more general
formulation was found in ``@llvm.experimental.widenable.condition``.
Support for ``@llvm.experimental.guard`` is slowly being rephrased in
terms of this alternative.

'``llvm.experimental.widenable.condition``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.experimental.widenable.condition()

Overview:
"""""""""

This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is ``true`` or ``false``, the program is correct and
well-defined.

Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
``@llvm.experimental.widenable.condition`` allows frontends to
express guards or checks on optimistic assumptions made during
compilation and represent them as branch instructions on special
conditions.

While this may appear similar in semantics to ``undef``, it is very
different in that an invocation produces a particular, singular
value. It is also intended to be lowered late, and remain available
for specific optimizations and transforms that can benefit from its
special properties.

Arguments:
""""""""""

None.

Semantics:
""""""""""

The intrinsic ``@llvm.experimental.widenable.condition()``
returns either ``true`` or ``false``. For each evaluation of a call
to this intrinsic, the program must be valid and correct both if
it returns ``true`` and if it returns ``false``. This allows
transformation passes to replace evaluations of this intrinsic
with either value whenever one is beneficial.

When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as
in the example below:

.. code-block:: text

    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.

Whether the result of the intrinsic call is ``true`` or ``false``,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
``i1`` expressions.

This is how it can be used to represent guards as widenable branches:

.. code-block:: text

  block:
    ; Unguarded instructions
    call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
    ; Guarded instructions

This can be expressed in an alternative, equivalent form as an explicit branch
using ``@llvm.experimental.widenable.condition``:

.. code-block:: text

  block:
    ; Unguarded instructions
    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
    %guard_condition = and i1 %cond, %widenable_condition
    br i1 %guard_condition, label %guarded, label %deopt

  guarded:
    ; Guarded instructions

  deopt:
    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]

So the block ``guarded`` is only reachable when ``%cond`` is ``true``,
and it should be valid to go to the block ``deopt`` whenever ``%cond``
is ``true`` or ``false``.

``@llvm.experimental.widenable.condition`` will never throw, thus
it cannot be invoked.

Guard widening:
"""""""""""""""

When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.

Guard widening replaces

.. code-block:: text

    %widenable_cond = call i1 @llvm.experimental.widenable.condition()
    %guard_cond = and i1 %cond, %widenable_cond
    br i1 %guard_cond, label %guarded, label %deopt

with

.. code-block:: text

    %widenable_cond = call i1 @llvm.experimental.widenable.condition()
    %new_cond = and i1 %any_other_cond, %widenable_cond
    %new_guard_cond = and i1 %cond, %new_cond
    br i1 %new_guard_cond, label %guarded, label %deopt

for this branch. Here ``%any_other_cond`` is an arbitrarily chosen
well-defined ``i1`` value. By widening the guard, we may
impose stricter conditions on the ``guarded`` block and bail to the
deopt block when the new condition is not met.

Lowering:
"""""""""

The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant ``true``.
However, it is always correct to replace
it with any other ``i1`` value. Any pass can
freely do so if it can benefit from non-default lowering.


'``llvm.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly

Overview:
"""""""""

This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
adds ``%ptr`` to that value and returns it. The constant folder specifically
recognizes the form of this intrinsic and the constant initializers it may
load from; if a loaded constant initializer is known to have the form
``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.

LLVM guarantees that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

.. _llvm_sideeffect:

'``llvm.sideeffect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.sideeffect() inaccessiblememonly nounwind

Overview:
"""""""""

The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
treat it as having side effects, so it can be inserted into a loop to
indicate that the loop shouldn't be assumed to terminate (which could
potentially lead to the loop being optimized away entirely), even if it's
an infinite loop with no other side effects.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic actually does nothing, but optimizers must assume that it
has externally observable side effects.

'``llvm.is.constant.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any
argument type.

::

      declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
      declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
      declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone

Overview:
"""""""""

The '``llvm.is.constant``' intrinsic will return true if the argument
is known to be a manifest compile-time constant. It is guaranteed to
fold to either true or false before generating machine code.

Semantics:
""""""""""

This intrinsic generates no code. If its argument is known to be a
manifest compile-time constant value, then the intrinsic will be
converted to a constant true value. Otherwise, it will be converted to
a constant false value.

In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.

The result also intentionally depends on the result of optimization
passes -- e.g., the result can change depending on whether a
function gets inlined or not. A function's parameters are
obviously not constant. However, a call like
``llvm.is.constant.i32(i32 %param)`` *can* return true after the
function is inlined, if the value passed to the function parameter was
a constant.

On the other hand, if constant folding is not run, it will never
evaluate to true, even in simple cases.

.. _int_ptrmask:

'``llvm.ptrmask``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable

Arguments:
""""""""""

The first argument is a pointer. The second argument is an integer.

Overview:
""""""""""

The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
This allows stripping data from tagged pointers without converting them to an
integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
to facilitate alias analysis and underlying-object detection.

Semantics:
""""""""""

The result of ``ptrmask(ptr, mask)`` is equivalent to
``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``. Both the returned
pointer and the first argument are based on the same underlying object (for more
information on the *based on* terminology see
:ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
mask argument does not match the pointer size of the target, the mask is
zero-extended or truncated accordingly.

.. _int_vscale:

'``llvm.vscale``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()

Overview:
"""""""""

The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
vectors such as ``<vscale x 16 x i8>``.

Semantics:
""""""""""

``vscale`` is a positive value that is constant throughout program
execution, but is unknown at compile time.
If the result value does not fit in the result type, then the result is
a :ref:`poison value <poisonvalues>`.
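
As a minimal sketch, the hypothetical function ``@num_lanes`` below computes the
runtime element count of a ``<vscale x 16 x i8>`` vector by scaling the result
of ``llvm.vscale``. The function name is an assumption made for illustration.

.. code-block:: llvm

      declare i64 @llvm.vscale.i64()

      ; Hypothetical helper: number of i8 elements in <vscale x 16 x i8>.
      define i64 @num_lanes() {
      entry:
        %vs = call i64 @llvm.vscale.i64()
        ; The vector holds vscale * 16 elements.
        %lanes = mul i64 %vs, 16
        ret i64 %lanes
      }

Because ``vscale`` is constant throughout program execution, a value computed
this way can be hoisted and reused by every loop over such vectors.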


Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.

Element Wise Atomic Memory Intrinsics
-------------------------------------

These intrinsics are similar to the standard library memory intrinsics except
that they perform memory transfer as a sequence of atomic memory accesses.

.. _int_memcpy_element_unordered_atomic:

'``llvm.memcpy.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                       i8* <src>,
                                                                       i32 <len>,
                                                                       i32 <element_size>)
      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                       i8* <src>,
                                                                       i64 <len>,
                                                                       i32 <element_size>)

Overview:
"""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
as arrays with elements that are exactly ``element_size`` bytes, and the copy between
buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap. The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memcpy_element_unordered_atomic_*``, where '*' is replaced with the actual
element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.
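
A call that satisfies the argument constraints might look like the following
minimal sketch. The function ``@copy_words`` and the chosen sizes are
hypothetical: it copies 16 bytes as 4-byte elements, and both pointers carry an
``align`` attribute no smaller than the element size.

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

      ; Hypothetical example: copy four 4-byte elements (len = 16, element_size = 4).
      define void @copy_words(i8* %dst, i8* %src) {
      entry:
        call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(
            i8* align 4 %dst, i8* align 4 %src, i32 16, i32 4)
        ret void
      }

With an ``element_size`` of 4, the general lowering would target the symbol
``__llvm_memcpy_element_unordered_atomic_4``.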

The optimizer is allowed to inline the memory copy when it's profitable to do so.

'``llvm.memmove.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths, however.

::

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                        i8* <src>,
                                                                        i32 <len>,
                                                                        i32 <element_size>)
      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                        i8* <src>,
                                                                        i64 <len>,
                                                                        i32 <element_size>)

Overview:
"""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
``src`` are treated as arrays with elements that are exactly ``element_size``
bytes, and the copy between buffers uses a sequence of
:ref:`unordered atomic <ordering>` load/store operations that are a positive
integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``.
The caller
guarantees that both the source and destination pointers are aligned to that
boundary.

Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap. The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces.
Not all targets
support all bit widths, however.

::

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
                                                                  i8 <value>,
                                                                  i32 <len>,
                                                                  i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
                                                                  i8 <value>,
                                                                  i64 <len>,
                                                                  i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
memory starting at the destination location to the given ``value``. The memory is
set with a sequence of store operations where each access is guaranteed to be a
multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
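
As a minimal sketch of a call that satisfies the argument constraints above,
the following zeroes 64 bytes using 4-byte elements. The value name ``%dest``
is hypothetical and is assumed to be an ``i8*`` that the caller knows to be
4-byte aligned; ``len`` (64) is a multiple of ``element_size`` (4), and the
``align`` attribute is no less than ``element_size``::

      ; %dest is a hypothetical, 4-byte aligned i8* (assumption for illustration).
      call void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* align 4 %dest, i8 0, i32 64, i32 4)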

The order of the assignment is unspecified. Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memset_element_unordered_atomic_*``, where '*' is replaced with the actual
element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.

Objective-C ARC Runtime Intrinsics
----------------------------------

LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
LLVM is aware of the semantics of these functions, and optimizes based on that
knowledge. You can read more about the details of Objective-C ARC `here
<https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.

'``llvm.objc.autorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.autorelease(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.

'``llvm.objc.autoreleasePoolPop``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.autoreleasePoolPop(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.

'``llvm.objc.autoreleasePoolPush``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.autoreleasePoolPush()

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.

'``llvm.objc.autoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.autoreleaseReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.

'``llvm.objc.copyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.copyWeak(i8**, i8**)

Lowering:
"""""""""

Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.

'``llvm.objc.destroyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.destroyWeak(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.

'``llvm.objc.initWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.initWeak(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.

'``llvm.objc.loadWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.loadWeak(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.

'``llvm.objc.loadWeakRetained``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.loadWeakRetained(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.

'``llvm.objc.moveWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.moveWeak(i8**, i8**)

Lowering:
"""""""""

Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.

'``llvm.objc.release``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.release(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.

'``llvm.objc.retain``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retain(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
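
As an illustration of how the ``llvm.objc.retain`` and ``llvm.objc.release``
declarations above might be used together, the following is a minimal sketch.
The function name ``@use`` and the value names are hypothetical and are not
taken from any real frontend output::

      declare i8* @llvm.objc.retain(i8*)
      declare void @llvm.objc.release(i8*)

      ; @use is a hypothetical function that holds a reference to %obj
      ; for the duration of its body.
      define void @use(i8* %obj) {
      entry:
        %retained = call i8* @llvm.objc.retain(i8* %obj)
        ; ... uses of %retained would go here ...
        call void @llvm.objc.release(i8* %retained)
        ret void
      }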

'``llvm.objc.retainAutorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainAutorelease(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.

'``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.

'``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.

'``llvm.objc.retainBlock``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainBlock(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.

'``llvm.objc.storeStrong``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.storeStrong(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.

'``llvm.objc.storeWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.storeWeak(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.

Preserving Debug Information Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These intrinsics are used to carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. Such information is lost in the IR GetElementPtr instruction,
since the IR types differ from debuginfo types and unions
are converted to structs in IR.

'``llvm.preserve.array.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <ret_type>
      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
                                                                           i32 dim,
                                                                           i32 index)

Overview:
"""""""""

The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
based on array base ``base``, array dimension ``dim`` and the last access index ``index``
into the array. The return type ``ret_type`` is a pointer type to the array element.
Preserving the array ``dim`` and ``index`` in the call is more robust than relying on a
getelementptr instruction, which may be subject to compiler transformation.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the array or pointer debuginfo type.
The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
debuginfo version of ``type``.

Arguments:
""""""""""

The ``base`` is the array base address. The ``dim`` is the array dimension.
The ``base`` is a pointer if ``dim`` equals 0.
The ``index`` is the last access index into the array or pointer.

Semantics:
""""""""""

The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``
(that is, ``dim`` zero indices followed by ``index``).

'``llvm.preserve.union.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <type>
      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
                                                                        i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
``di_index`` and returns the ``base`` address.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the union debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``type`` is the same as the ``base`` type.

Arguments:
""""""""""

The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.

Semantics:
""""""""""

The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
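
A minimal sketch of a ``llvm.preserve.union.access.index`` call follows. The
type ``%union.u``, the value names, and the mangled name suffix are
hypothetical and chosen only for illustration; the
``!llvm.preserve.access.index`` metadata attachment is omitted for brevity::

      %union.u = type { i32 }

      ; Records debuginfo field index 1. The result %addr is the same
      ; address as %base (a hypothetical %union.u* value).
      %addr = call %union.u* @llvm.preserve.union.access.index.p0s_union.us.p0s_union.us(%union.u* %base, i32 1)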

'``llvm.preserve.struct.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <ret_type>
      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
                                                                 i32 gep_index,
                                                                 i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
based on struct base ``base`` and IR struct member index ``gep_index``.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide struct debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``ret_type`` is a pointer type to the structure member.

Arguments:
""""""""""

The ``base`` is the structure base address. The ``gep_index`` is the struct member index
based on IR structures. The ``di_index`` is the struct member index based on debuginfo.

Semantics:
""""""""""

The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
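
The equivalence stated above can be sketched as follows. The type
``%struct.s``, the value names, and the mangled name suffix are hypothetical
and chosen only for illustration; the ``!llvm.preserve.access.index`` metadata
attachment is omitted for brevity::

      %struct.s = type { i32, i32 }

      ; Address of the second member via the intrinsic
      ; (gep_index = 1, di_index = 1); %base is a hypothetical %struct.s*.
      %a = call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(%struct.s* %base, i32 1, i32 1)

      ; The same address computed with access operands {0, gep_index}.
      %b = getelementptr %struct.s, %struct.s* %base, i32 0, i32 1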