==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 3

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
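
As a hedged illustration of the dominance rule, the following sketch shows
one well-formed way to express a value that depends on its own previous
value: a ``phi`` node in a loop. The function name ``@count_to`` and all
value names are hypothetical and chosen only for this example.

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x is defined by the phi, so its use below is dominated by its definition.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 %x, 1
      %done = icmp eq i32 %x.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %x.next
    }

The ``phi`` may name ``%x.next`` before its textual definition because a
``phi`` operand only has to be available at the end of the corresponding
incoming block, here ``%loop``.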

Syntax
======

.. _identifiers:

Identifiers
-----------

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
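
The three forms can be sketched side by side as follows. This is only an
illustration: every name in it (``@plain.name``, ``@use_locals`` and so on)
is hypothetical.

.. code-block:: llvm

    @plain.name = global i32 0            ; named global
    @"name with spaces" = global i32 1    ; quoted name containing spaces
    @"has\22quote" = global i32 2         ; \22 is the hexadecimal code of a quote
    @0 = global i32 3                     ; unnamed global

    define i32 @use_locals(i32 %arg) {
      %named = add i32 %arg, 42           ; named local; 42 is a constant
      ret i32 %named
    }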

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           /* yields i32:%1 */
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
   Alternatively, comments can start with ``/*`` and terminate with ``*/``.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. By default, unnamed temporaries are numbered sequentially (using a
   per-function incrementing counter, starting with 0). However, when explicitly
   specifying temporary numbers, it is allowed to skip over numbers.

   Note that basic blocks and unnamed function parameters are included in this
   numbering. For example, if the entry basic block is not given a label name
   and all function parameters are named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.
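
The numbering rule for unnamed temporaries can be sketched with two small
functions, assuming the hypothetical names ``@named_params`` and
``@unnamed_params``:

.. code-block:: llvm

    ; All parameters are named, so the unlabeled entry block takes number 0
    ; and the first unnamed result is %1.
    define i32 @named_params(i32 %a, i32 %b) {
      %1 = add i32 %a, %b               ; yields i32:%1
      ret i32 %1
    }

    ; The two unnamed parameters take %0 and %1 and the unlabeled entry
    ; block takes number 2, so the first unnamed result is %3.
    define i32 @unnamed_params(i32, i32) {
      %3 = add i32 %0, %1               ; yields i32:%3
      ret i32 %3
    }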

.. _string_constants:

String constants
----------------

Strings in LLVM programs are delimited by ``"`` characters. Within a
string, all bytes are treated literally with the exception of ``\``
characters, which start escapes, and the first ``"`` character, which
ends the string.

There are two kinds of escapes.

* ``\\`` represents a single ``\`` character.

* ``\`` followed by two hexadecimal characters (0-9, a-f, or A-F)
  represents the byte with the given value (e.g. ``\00`` represents a
  null byte).

To represent a ``"`` character, use ``\22``. (``\"`` will end the string
with a trailing ``\``.)

Newlines do not terminate string constants; strings can span multiple
lines.

The interpretation of string constants (e.g. their character encoding)
depends on context.
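
A minimal sketch of these escapes, assuming the strings are used as
``c"..."`` character array constants (one context in which string constants
appear); the global names are hypothetical:

.. code-block:: llvm

    @plain  = private constant [6 x i8] c"hello\00"     ; 5 letters and a null byte
    @quoted = private constant [4 x i8] c"\22a\22\00"   ; quote, 'a', quote, null byte
    @slash  = private constant [2 x i8] c"\\\00"        ; one backslash and a null byte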


High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(ptr captures(none)) nounwind

    ; Definition of main function
    define i32 @main() {
      ; Call puts function to write out the string to stdout.
      call i32 @puts(ptr @.str)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    The ``odr`` suffix indicates that all globals defined with the given name
    are equivalent, along the lines of the C++ "one definition rule" ("ODR").
    Informally, this means we can inline functions and fold loads of constants.

    Formally, use the following definition: when an ``odr`` function is
    called, one of the definitions is non-deterministically chosen to run. For
    ``odr`` variables, if any byte in the value is not equal in all
    initializers, that byte is a :ref:`poison value <poisonvalues>`. For
    aliases and ifuncs, apply the rule for the underlying function or variable.

    These linkage types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.
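
As a rough sketch, the following module fragment attaches several of these
linkage types to globals and functions. All names are hypothetical, and the
comments restate the rules above rather than add new ones.

.. code-block:: llvm

    @table = private constant [2 x i32] [i32 1, i32 2]   ; not in any symbol table
    @counter = internal global i32 0                     ; like a C 'static' variable
    @tentative = common global i32 0                     ; like C 'int tentative;'
    @optional_data = extern_weak global i32              ; declaration; null if never defined

    define linkonce_odr i32 @inline_candidate() {        ; may be inlined and discarded
      ret i32 7
    }

    define weak void @default_hook() {                   ; may be overridden, never discarded
      ret void
    }

    define void @visible() {                             ; no keyword: external linkage
      ret void
    }

    declare extern_weak void @optional_fn()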

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#tail-call-optimization>`_ This calling
    convention does not support varargs and requires the prototype of all
    callees to exactly match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``ghccc``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86, AArch64, and RISCV support this convention. The
    following limitations exist:

    -  On *X86-32* only up to 4 bit type parameters are supported. No
       floating-point types are supported.
    -  On *X86-64* only up to 10 bit type parameters and 6
       floating-point parameters are supported.
    -  On *AArch64* only up to 4 32-bit floating-point parameters,
       4 64-bit floating-point parameters, and 10 bit type parameters
       are supported.
    -  *RISCV64* only supports up to 11 bit type parameters, 4
       32-bit floating-point parameters, and 4 64-bit floating-point
       parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11 and return registers, if any. R11 can be used as a scratch register.
      The treatment of floating-point registers (XMMs/YMMs) matches the OS's C
      calling convention: on most platforms, they are not preserved and need to
      be saved by the caller, but on Windows, xmm6-xmm15 are preserved.

    - On AArch64 the callee preserves all general purpose registers, except
      X0-X8 and X16-X18.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    - On AArch64 the callee preserves all general purpose registers, except
      X0-X8 and X16-X18. Furthermore it also preserves the lower 128 bits of
      the V8-V31 SIMD/floating-point registers.

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``preserve_nonecc``" - The `PreserveNone` calling convention
    This calling convention doesn't preserve any general registers. So all
    general registers are caller saved registers. It also uses all general
    registers to pass arguments. This attribute doesn't impact non-general
    purpose registers (e.g. floating point registers, on X86 XMMs/YMMs).
    Non-general purpose registers still follow the standard C calling
    convention. Currently it is for x86_64 and AArch64 only.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_
    This calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
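
A minimal sketch of how a convention is written, assuming the hypothetical
functions ``@helper`` and ``@report_failure``: the keyword appears on the
declaration or definition and is repeated at each call site, so that caller
and callee agree.

.. code-block:: llvm

    declare fastcc i32 @helper(i32)
    declare coldcc void @report_failure()

    ; No keyword on the definition: @entry_point uses the default ccc.
    define i32 @entry_point(i32 %x) {
      %r = call fastcc i32 @helper(i32 %x)
      %failed = icmp slt i32 %r, 0
      br i1 %failed, label %fail, label %done

    fail:
      ; The call site names coldcc to match the callee's declaration.
      call coldcc void @report_failure()
      br label %done

    done:
      ret i32 %r
    }

Omitting ``fastcc`` or ``coldcc`` at either call site would make the caller
and callee disagree, which is the undefined behavior described at the start
of this section.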

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. On XCOFF, default visibility means no explicit
    visibility bit will be set and whether the symbol is visible
    (i.e. "exported") to other modules depends primarily on export lists
    provided to the linker. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
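
The placement of the visibility keyword can be sketched as follows; the
global and function names are hypothetical:

.. code-block:: llvm

    @exported_state = global i32 0              ; no keyword: default visibility
    @library_private = hidden global i32 0      ; not referenced by other modules

    define protected void @stable_entry() {     ; exported, but not overridable
      ret void
    }

    declare hidden void @helper_in_same_dso()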

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    On Microsoft Windows targets, "``dllexport``" causes the compiler to provide
    a global pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. The pointer name is formed by combining ``__imp_``
    and the function or variable name. On XCOFF targets, ``dllexport`` indicates
    that the symbol will be made visible to other modules using "exported"
    visibility and thus placed by the linker in the loader section symbol table.
    Since this storage class exists for defining a DLL interface, the compiler,
    assembler and linker know it is externally referenced and must refrain from
    deleting the symbol.

A symbol with ``internal`` or ``private`` linkage cannot have a DLL storage
class.
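
A short sketch of both storage classes, using hypothetical names; it only
shows where the keywords are written and is not tied to any particular DLL:

.. code-block:: llvm

    ; Symbols provided by some other DLL.
    @imported_counter = external dllimport global i32
    declare dllimport void @imported_fn()

    ; Symbol this module makes available to importers.
    define dllexport void @exported_fn() {
      ret void
    }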

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models
may be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC-compatible emulated TLS
code.
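
The four choices can be written as follows; the variable names are
hypothetical:

.. code-block:: llvm

    @tls_general = thread_local global i32 0                  ; general dynamic
    @tls_ld      = thread_local(localdynamic) global i32 0
    @tls_ie      = thread_local(initialexec) global i32 0
    @tls_le      = thread_local(localexec) global i32 0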

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.
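
A brief sketch of both specifiers, with hypothetical names:

.. code-block:: llvm

    @preemptable = global i32 0                   ; implicitly dso_preemptable
    @local_only = dso_local global i32 0          ; resolves within this linkage unit

    ; Defined elsewhere, but promised to be in the same linkage unit.
    declare dso_local void @defined_in_this_dso()

    define dso_preemptable void @interposable() { ; same as omitting the keyword
      ret void
    }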
631
632.. _namedtypes:
633
634Structure Types
635---------------
636
637LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
638types <t_struct>`. Literal types are uniqued structurally, but identified types
639are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
640to forward declare a type that is not yet available.
641
642An example of an identified structure specification is:
643
644.. code-block:: llvm
645
    %mytype = type { ptr, i32 }
647
648Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
649literal types are uniqued in recent versions of LLVM.
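
To make the distinction concrete, the following sketch (with hypothetical
names) uses an identified type, an opaque forward declaration, and a literal
type. The two globals have structurally identical contents but different
types.

.. code-block:: llvm

    %pair = type { i32, i32 }       ; identified structure type
    %handle = type opaque           ; opaque type, body not yet available

    @a = global %pair { i32 1, i32 2 }          ; uses the identified type
    @b = global { i32, i32 } { i32 1, i32 2 }   ; uses a literal type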
650
651.. _nointptrtype:
652
653Non-Integral Pointer Type
654-------------------------
655
656Note: non-integral pointer types are a work in progress, and they should be
657considered experimental at this time.
658
659LLVM IR optionally allows the frontend to denote pointers in certain address
660spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
661Non-integral pointer types represent pointers that have an *unspecified* bitwise
662representation; that is, the integral representation may be target dependent or
663unstable (not backed by a fixed integer).
664
665``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
666integral (i.e. normal) pointers in that they convert integers to and from
667corresponding pointer types, but there are additional implications to be
668aware of.  Because the bit-representation of a non-integral pointer may
669not be stable, two identical casts of the same operand may or may not
670return the same value.  Said differently, the conversion to or from the
671non-integral type depends on environmental state in an implementation
672defined manner.
673
674If the frontend wishes to observe a *particular* value following a cast, the
675generated IR must fence with the underlying environment in an implementation
676defined manner. (In practice, this tends to require ``noinline`` routines for
677such operations.)
678
679From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
680non-integral types are analogous to ones on integral types with one
681key exception: the optimizer may not, in general, insert new dynamic
682occurrences of such casts.  If a new cast is inserted, the optimizer would
683need to either ensure that a) all possible values are valid, or b)
684appropriate fencing is inserted.  Since the appropriate fencing is
685implementation defined, the optimizer can't do the latter.  The former is
686challenging as many commonly expected properties, such as
687``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
688Similar restrictions apply to intrinsics that might examine the pointer bits,
689such as :ref:`llvm.ptrmask<int_ptrmask>`.
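
The following sketch illustrates the point about unstable representations. It
assumes, purely for illustration, that address space 1 has been declared
non-integral with an ``ni:1`` datalayout entry; the function name is
hypothetical.

.. code-block:: llvm

    target datalayout = "ni:1"

    define i64 @cast_twice(ptr addrspace(1) %p) {
      ; Two identical casts of the same non-integral pointer.
      %a = ptrtoint ptr addrspace(1) %p to i64
      %b = ptrtoint ptr addrspace(1) %p to i64
      ; The difference is not guaranteed to be 0 for a non-integral pointer.
      %d = sub i64 %a, %b
      ret i64 %d
    }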
690
691The alignment information provided by the frontend for a non-integral pointer
692(typically using attributes or metadata) must be valid for every possible
693representation of the pointer.
694
695.. _globalvars:
696
697Global Variables
698----------------
699
700Global variables define regions of memory allocated at compilation time
701instead of run-time.
702
703Global variable definitions must be initialized.
704
705Global variables in other translation units can also be declared, in which
706case they don't have an initializer.
707
708Global variables can optionally specify a :ref:`linkage type <linkage>`.
709
710Either global variable definitions or declarations may have an explicit section
711to be placed in and may have an optional explicit alignment specified. If there
712is a mismatch between the explicit or inferred section information for the
713variable declaration and its definition the resulting behavior is undefined.
714
715A variable may be defined as a global ``constant``, which indicates that
716the contents of the variable will **never** be modified (enabling better
717optimization, allowing the global data to be placed in the read-only
section of an executable, etc.). Note that variables that need runtime
719initialization cannot be marked ``constant`` as there is a store to the
720variable.
721
722LLVM explicitly allows *declarations* of global variables to be marked
723constant, even if the final definition of the global is not. This
724capability can be used to enable slightly better optimization of the
725program, but requires the language definition to guarantee that
726optimizations based on the 'constantness' are valid for the translation
727units that do not include the definition.
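
As a small sketch with hypothetical names, a constant definition and a
constant declaration look like this:

.. code-block:: llvm

    ; Definition: the contents are never modified.
    @primes = constant [3 x i32] [i32 2, i32 3, i32 5]

    ; Declaration marked constant; the definition lives in another
    ; translation unit, so there is no initializer here.
    @limit = external constant i32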
728
729As SSA values, global variables define pointer values that are in scope
730(i.e. they dominate) all basic blocks in the program. Global variables
731always define a pointer to their "content" type because they describe a
732region of memory, and all memory objects in LLVM are accessed through
733pointers.
734
735Global variables can be marked with ``unnamed_addr`` which indicates
736that the address is not significant, only the content. Constants marked
737like this can be merged with other constants if they have the same
initializer. Note that a constant with a significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.
741
742If the ``local_unnamed_addr`` attribute is given, the address is known to
743not be significant within the module.
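
For illustration, here are two hypothetical constants, one for each
attribute:

.. code-block:: llvm

    ; Only the contents matter, so this may be merged with an identical constant.
    @.str = private unnamed_addr constant [3 x i8] c"hi\00", align 1

    ; The address is not significant within this module only.
    @tag = local_unnamed_addr constant i32 7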
744
745A global variable may be declared to reside in a target-specific
746numbered address space. For targets that support them, address spaces
747may affect how optimizations are performed and/or what target
748instructions are used to access the variable. The default address space
749is zero. The address space qualifier must precede any other attributes.
750
751LLVM allows an explicit section to be specified for globals. If the
752target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the necessary
support.
755
756External declarations may have an explicit section specified. Section
757information is retained in LLVM IR for targets that make use of this
758information. Attaching section information to an external declaration is an
759assertion that its definition is located in the specified section. If the
760definition is located in a different section, the behavior is undefined.
761
762LLVM allows an explicit code model to be specified for globals. If the
763target supports it, it will emit globals in the code model specified,
764overriding the code model used to compile the translation unit.
765The allowed values are "tiny", "small", "kernel", "medium", "large".
766This may be extended in the future to specify global data layout that
767doesn't cleanly fit into a specific code model.
768
769By default, global initializers are optimized by assuming that global
770variables defined within the module are not modified from their
771initial values before the start of the global initializer. This is
772true even for variables potentially accessible from outside the
773module, including those with external linkage or appearing in
774``@llvm.used`` or dllexported variables. This assumption may be suppressed
775by marking the variable with ``externally_initialized``.
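
As a minimal sketch, a global whose initial value may be changed by some
external mechanism before the module's initializers run could be written as
follows. The name ``@device_flag`` is hypothetical.

.. code-block:: llvm

    ; Optimizers may not assume that this still holds 0 when initializers run.
    @device_flag = externally_initialized global i32 0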
776
777An explicit alignment may be specified for a global, which must be a
778power of 2. If not present, or if the alignment is set to zero, the
779alignment of the global is set by the target to whatever it feels
780convenient. If an explicit alignment is specified, the global is forced
781to have exactly that alignment. Targets and optimizers are not allowed
782to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array, in which case alignment padding would break
this iteration. For TLS variables, the module flag ``MaxTLSAlign``, if present,
787limits the alignment to the given value. Optimizers are not allowed to
788impose a stronger alignment on these variables. The maximum alignment
789is ``1 << 32``.
790
791For global variable declarations, as well as definitions that may be
792replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
793linkage types), the allocation size and alignment of the definition it resolves
794to must be greater than or equal to that of the declaration or replaceable
795definition, otherwise the behavior is undefined.
796
Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.
801
802Variables and aliases can have a
803:ref:`Thread Local Storage Model <tls_model>`.
804
Globals cannot be or contain :ref:`scalable vectors <t_vector>` because their
size is unknown at compile time. Scalable vectors are allowed in structs to facilitate
807intrinsics returning multiple values. Generally, structs containing scalable
808vectors are not considered "sized" and cannot be used in loads, stores, allocas,
809or GEPs. The only exception to this rule is for structs that contain scalable
810vectors of the same type (e.g. ``{<vscale x 2 x i32>, <vscale x 2 x i32>}``
811contains the same type while ``{<vscale x 2 x i32>, <vscale x 2 x i64>}``
812doesn't). These kinds of structs (we may call them homogeneous scalable vector
813structs) are considered sized and can be used in loads, stores, allocas, but
814not GEPs.
815
Globals with the ``toc-data`` attribute set are stored in the TOC of XCOFF. Their
alignments are not larger than that of a TOC entry. To mitigate TOC overflow,
optimizations should not increase their alignments.
819
820Syntax::
821
822      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
823                         [DLLStorageClass] [ThreadLocal]
824                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
825                         [ExternallyInitialized]
826                         <global | constant> <Type> [<InitializerConstant>]
827                         [, section "name"] [, partition "name"]
828                         [, comdat [($name)]] [, align <Alignment>]
829                         [, code_model "model"]
830                         [, no_sanitize_address] [, no_sanitize_hwaddress]
831                         [, sanitize_address_dyninit] [, sanitize_memtag]
832                         (, !name !N)*
833
834For example, the following defines a global in a numbered address space
835with an initializer, section, and alignment:
836
837.. code-block:: llvm
838
839    @G = addrspace(5) constant float 1.0, section "foo", align 4
840
The following example just declares a global variable:
842
843.. code-block:: llvm
844
845   @G = external global i32
846
847The following example defines a global variable with the
848``large`` code model:
849
850.. code-block:: llvm
851
852    @G = internal global i32 0, code_model "large"
853
854The following example defines a thread-local global with the
855``initialexec`` TLS model:
856
857.. code-block:: llvm
858
859    @G = thread_local(initialexec) global i32 0, align 4
860
861.. _functionstructure:
862
863Functions
864---------
865
866LLVM function definitions consist of the "``define``" keyword, an
867optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
869style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
870an optional :ref:`calling convention <callingconv>`,
871an optional ``unnamed_addr`` attribute, a return type, an optional
872:ref:`parameter attribute <paramattrs>` for the return type, a function
873name, a (possibly empty) argument list (each with optional :ref:`parameter
874attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
875an optional address space, an optional section, an optional partition,
876an optional alignment, an optional :ref:`comdat <langref_comdats>`,
877an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
878an optional :ref:`prologue <prologuedata>`,
879an optional :ref:`personality <personalityfn>`,
880an optional list of attached :ref:`metadata <metadata>`,
881an opening curly brace, a list of basic blocks, and a closing curly brace.
882
883Syntax::
884
885    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
886           [cconv] [ret attrs]
887           <ResultType> @<FunctionName> ([argument list])
888           [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
889           [section "name"] [partition "name"] [comdat [($name)]] [align N]
890           [gc] [prefix Constant] [prologue Constant] [personality Constant]
891           (!name !N)* { ... }
892
893The argument list is a comma separated sequence of arguments where each
894argument is of the following form:
895
896Syntax::
897
898   <type> [parameter Attrs] [name]
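
A hedged sketch of a complete definition that uses several of the optional
pieces follows. The function name, the section name ``.text.add``, and the
particular choice of linkage, calling convention, and attributes are
assumptions made only for this example.

.. code-block:: llvm

    define internal fastcc i32 @add(i32 %a, i32 %b) unnamed_addr nounwind section ".text.add" align 16 {
    entry:
      %sum = add i32 %a, %b
      ret i32 %sum
    }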
899
900LLVM function declarations consist of the "``declare``" keyword, an
901optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
902<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
903optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
904or ``local_unnamed_addr`` attribute, an optional address space, a return type,
an optional :ref:`parameter attribute <paramattrs>` for the return type, a
function name, a possibly empty list of arguments, an optional alignment, an
optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix
<prefixdata>`, and an optional :ref:`prologue <prologuedata>`.
909
910Syntax::
911
912    declare [linkage] [visibility] [DLLStorageClass]
913            [cconv] [ret attrs]
914            <ResultType> @<FunctionName> ([argument list])
915            [(unnamed_addr|local_unnamed_addr)] [align N] [gc]
916            [prefix Constant] [prologue Constant]
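
For illustration, two declarations are sketched below. ``@next_byte`` is a
hypothetical function that combines a calling convention, a return attribute,
and ``local_unnamed_addr``.

.. code-block:: llvm

    declare i32 @puts(ptr)
    declare fastcc zeroext i8 @next_byte(ptr) local_unnamed_addr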
917
918A function definition contains a list of basic blocks, forming the CFG (Control
919Flow Graph) for the function. Each basic block may optionally start with a label
920(giving the basic block a symbol table entry), contains a list of instructions
921and :ref:`debug records <debugrecords>`,
922and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
923function return). If an explicit label name is not provided, a block is assigned
924an implicit numbered label, using the next value from the same counter as used
925for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
926function entry block does not have an explicit label, it will be assigned label
927"%0", then the first unnamed temporary in that block will be "%1", etc. If a
928numeric label is explicitly specified, it must match the numeric label that
929would be used implicitly.
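
The numbering rule can be seen in this small sketch. The function name is
hypothetical, and the argument is named so that it does not consume a number.

.. code-block:: llvm

    define i32 @inc(i32 %x) {
      ; The entry block has no explicit label, so it is implicitly %0.
      ; The first unnamed temporary is therefore %1.
      %1 = add i32 %x, 1
      ret i32 %1
    }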
930
931The first basic block in a function is special in two ways: it is
932immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
934the entry block of a function). Because the block can have no
935predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
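
As a sketch of a common consequence, a loop header that needs a PHI node is
placed in its own block and the entry block simply branches to it. The
function and value names here are hypothetical.

.. code-block:: llvm

    define i32 @count(i32 %n) {
    entry:
      ; The entry block has no predecessors and no PHI nodes.
      br label %loop

    loop:
      ; This block has two predecessors, so it may use a PHI node.
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp eq i32 %next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %next
    }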
936
937LLVM allows an explicit section to be specified for functions. If the
938target supports it, it will emit functions to the section specified.
939Additionally, the function can be placed in a COMDAT.
940
941An explicit alignment may be specified for a function. If not present,
942or if the alignment is set to zero, the alignment of the function is set
943by the target to whatever it feels convenient. If an explicit alignment
944is specified, the function is forced to have at least that much
945alignment. All alignments must be a power of 2.
946
947If the ``unnamed_addr`` attribute is given, the address is known to not
948be significant and two identical functions can be merged.
949
950If the ``local_unnamed_addr`` attribute is given, the address is known to
951not be significant within the module.
952
953If an explicit address space is not given, it will default to the program
954address space from the :ref:`datalayout string<langref_datalayout>`.
955
956.. _langref_aliases:
957
958Aliases
959-------
960
Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.
963
964Aliases have a name and an aliasee that is either a global value or a
965constant expression.
966
967Aliases may have an optional :ref:`linkage type <linkage>`, an optional
968:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
969:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
970<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
971
972Syntax::
973
    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass]
              [ThreadLocal] [(unnamed_addr|local_unnamed_addr)]
              alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
              [, partition "name"]
976
977The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
978``linkonce_odr``, ``weak_odr``, ``external``, ``available_externally``. Note
979that some system linkers might not correctly handle dropping a weak symbol that
980is aliased.
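
A brief sketch of two aliases follows. It assumes opaque pointers, so the
aliasee is written with the ``ptr`` type, and all names are hypothetical.

.. code-block:: llvm

    @var = global i32 42
    ; A second name for @var.
    @var_alias = alias i32, ptr @var

    define void @impl() {
      ret void
    }
    ; A weak alias to a function.
    @impl_weak = weak alias void (), ptr @impl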
981
982Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
983the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
984to the same content.
985
986If the ``local_unnamed_addr`` attribute is given, the address is known to
987not be significant within the module.
988
989Since aliases are only a second name, some restrictions apply, of which
990some can only be checked when producing an object file:
991
992* The expression defining the aliasee must be computable at assembly
993  time. Since it is just a name, no relocations can be used.
994
995* No alias in the expression can be weak as the possibility of the
996  intermediate alias being overridden cannot be represented in an
997  object file.
998
999* If the alias has the ``available_externally`` linkage, the aliasee must be an
1000  ``available_externally`` global value; otherwise the aliasee can be an
1001  expression but no global value in the expression can be a declaration, since
1002  that would require a relocation, which is not possible.
1003
* If either the alias or the aliasee may be replaced by a symbol outside the
  module at link time or runtime, optimizations cannot replace the alias with
  the aliasee, since the behavior may be different. The alias may be used as a
  name guaranteed to point to the content in the current module.
1008
1009.. _langref_ifunc:
1010
1011IFuncs
1012-------
1013
IFuncs, like aliases, don't create any new data or functions. They are just a new
symbol that is resolved at runtime by calling a resolver function.
1016
1017On ELF platforms, IFuncs are resolved by the dynamic linker at load time. On
1018Mach-O platforms, they are lowered in terms of ``.symbol_resolver`` functions,
1019which lazily resolve the callee the first time they are called.
1020
IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.
1023
1024Syntax::
1025
1026    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
1027              [, partition "name"]
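
The following sketch shows one way an IFunc and its resolver might fit
together. It assumes opaque pointers, so the resolver is referenced with the
``ptr`` type; the names ``@square``, ``@square_impl`` and ``@square_resolver``
are hypothetical.

.. code-block:: llvm

    define internal i32 @square_impl(i32 %x) {
      %r = mul i32 %x, %x
      ret i32 %r
    }

    ; The resolver returns the address of the implementation to use.
    define internal ptr @square_resolver() {
      ret ptr @square_impl
    }

    @square = ifunc i32 (i32), ptr @square_resolver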
1028
1029
1030.. _langref_comdats:
1031
1032Comdats
1033-------
1034
1035Comdat IR provides access to object file COMDAT/section group functionality
1036which represents interrelated sections.
1037
1038Comdats have a name which represents the COMDAT key and a selection kind to
1039provide input on how the linker deduplicates comdats with the same key in two
1040different object files. A comdat must be included or omitted as a unit.
1041Discarding the whole comdat is allowed but discarding a subset is not.
1042
1043A global object may be a member of at most one comdat. Aliases are placed in the
1044same COMDAT that their aliasee computes to, if any.
1045
1046Syntax::
1047
1048    $<Name> = comdat SelectionKind
1049
1050For selection kinds other than ``nodeduplicate``, only one of the duplicate
1051comdats may be retained by the linker and the members of the remaining comdats
1052must be discarded. The following selection kinds are supported:
1053
1054``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
1056``exactmatch``
1057    The linker may choose any COMDAT key but the sections must contain the
1058    same data.
1059``largest``
1060    The linker will choose the section containing the largest COMDAT key.
1061``nodeduplicate``
1062    No deduplication is performed.
1063``samesize``
1064    The linker may choose any COMDAT key but the sections must contain the
1065    same amount of data.
1066
1067- XCOFF and Mach-O don't support COMDATs.
1068- COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
1069  a non-local linkage COMDAT symbol.
1070- ELF supports ``any`` and ``nodeduplicate``.
1071- WebAssembly only supports ``any``.
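
As a small sketch of a selection kind that ELF supports, the following places
two hypothetical globals in a single ``nodeduplicate`` comdat, so that they
are kept or discarded as a unit:

.. code-block:: llvm

    $group = comdat nodeduplicate
    @first = global i32 1, comdat($group)
    @second = global i32 2, comdat($group)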
1072
1073Here is an example of a COFF COMDAT where a function will only be selected if
1074the COMDAT key's section is the largest:
1075
1076.. code-block:: text
1077
1078   $foo = comdat largest
1079   @foo = global i32 2, comdat($foo)
1080
1081   define void @bar() comdat($foo) {
1082     ret void
1083   }
1084
1085In a COFF object file, this will create a COMDAT section with selection kind
1086``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
1087and another COMDAT section with selection kind
1088``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
1089section and contains the contents of the ``@bar`` symbol.
1090
As syntactic sugar, the ``$name`` can be omitted if the name is the same as
1092the global name:
1093
1094.. code-block:: llvm
1095
1096  $foo = comdat any
1097  @foo = global i32 2, comdat
1098  @bar = global i32 3, comdat($foo)
1099
1100There are some restrictions on the properties of the global object.
1101It, or an alias to it, must have the same name as the COMDAT group when
1102targeting COFF.
1103The contents and size of this object may be used during link-time to determine
1104which COMDAT groups get selected depending on the selection kind.
1105Because the name of the object must match the name of the COMDAT group, the
1106linkage of the global object must not be local; local symbols can get renamed
1107if a collision occurs in the symbol table.
1108
The combined use of COMDATs and section attributes may yield surprising results.
1110For example:
1111
1112.. code-block:: llvm
1113
1114   $foo = comdat any
1115   $bar = comdat any
1116   @g1 = global i32 42, section "sec", comdat($foo)
1117   @g2 = global i32 42, section "sec", comdat($bar)
1118
1119From the object file perspective, this requires the creation of two sections
1120with the same name. This is necessary because both globals belong to different
1121COMDAT groups and COMDATs, at the object file level, are represented by
1122sections.
1123
1124Note that certain IR constructs like global variables and functions may
1125create COMDATs in the object file in addition to any which are specified using
1126COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
1129
1130.. _namedmetadatastructure:
1131
1132Named Metadata
1133--------------
1134
1135Named metadata is a collection of metadata. :ref:`Metadata
1136nodes <metadata>` (but not metadata strings) are the only valid
1137operands for a named metadata.
1138
1139#. Named metadata are represented as a string of characters with the
1140   metadata prefix. The rules for metadata names are the same as for
1141   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1142   are still valid, which allows any character to be part of a name.
1143
1144Syntax::
1145
1146    ; Some unnamed metadata nodes, which are referenced by the named metadata.
1147    !0 = !{!"zero"}
1148    !1 = !{!"one"}
1149    !2 = !{!"two"}
1150    ; A named metadata.
1151    !name = !{!0, !1, !2}
1152
1153.. _paramattrs:
1154
1155Parameter Attributes
1156--------------------
1157
1158The return type and each parameter of a function type may have a set of
1159*parameter attributes* associated with them. Parameter attributes are
1160used to communicate additional information about the result or
1161parameters of a function. Parameter attributes are considered to be part
1162of the function, not of the function type, so functions with different
1163parameter attributes can have the same function type. Parameter attributes can
1164be placed both on function declarations/definitions, and at call-sites.
1165
1166Parameter attributes are either simple keywords or strings that follow the
1167specified type. Multiple parameter attributes, when required, are separated by
1168spaces. For example:
1169
1170.. code-block:: llvm
1171
1172    ; On function declarations/definitions:
1173    declare i32 @printf(ptr noalias captures(none), ...)
1174    declare i32 @atoi(i8 zeroext)
1175    declare signext i8 @returns_signed_char()
1176    define void @baz(i32 "amdgpu-flat-work-group-size"="1,256" %x)
1177
1178    ; On call-sites:
1179    call i32 @atoi(i8 zeroext %x)
1180    call signext i8 @returns_signed_char()
1181
1182Note that any attributes for the function result (``nonnull``,
1183``signext``) come before the result type.
1184
Parameter attributes can be broadly separated into two kinds: ABI attributes
that affect how values are passed to/from functions, like ``zeroext``,
``inreg``, ``byval``, or ``sret``, and optimization attributes, which provide
additional optimization guarantees, like ``noalias``, ``nonnull`` and
``dereferenceable``.
1190
1191ABI attributes must be specified *both* at the function declaration/definition
1192and call-site, otherwise the behavior may be undefined. ABI attributes cannot
1193be safely dropped. Optimization attributes do not have to match between
1194call-site and function: The intersection of their implied semantics applies.
1195Optimization attributes can also be freely dropped.
1196
If an integer argument to a function is not marked ``signext``, ``zeroext``, or
``noext``, the kind of extension used is target-specific. Some targets depend
for correctness on the kind of extension being explicitly specified.
1200
1201Currently, only the following parameter attributes are defined:
1202
1203``zeroext``
1204    This indicates to the code generator that the parameter or return
1205    value should be zero-extended to the extent required by the target's
1206    ABI by the caller (for a parameter) or the callee (for a return value).
1207``signext``
1208    This indicates to the code generator that the parameter or return
1209    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
1211    the callee (for a return value).
1212``noext``
1213    This indicates to the code generator that the parameter or return
1214    value has the high bits undefined, as for a struct in register, and
1215    therefore does not need to be sign or zero extended. This is the same
1216    as default behavior and is only actually used (by some targets) to
1217    validate that one of the attributes is always present.
1218``inreg``
1219    This indicates that this parameter or return value should be treated
1220    in a special target-dependent fashion while emitting code for
1221    a function call or return (usually, by putting it in a register as
1222    opposed to memory, though some targets use it to distinguish between
1223    two different kinds of registers). Use of this attribute is
1224    target-specific.
1225``byval(<ty>)``
1226    This indicates that the pointer parameter should really be passed by
1227    value to the function. The attribute implies that a hidden copy of
1228    the pointee is made between the caller and the callee, so the callee
1229    is unable to modify the value in the caller. This attribute is only
1230    valid on LLVM pointer arguments. It is generally used to pass
1231    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
1233    callee (for example, ``readonly`` functions should not write to
1234    ``byval`` parameters). This is not a valid attribute for return
1235    values.
1236
1237    The byval type argument indicates the in-memory value type, and
1238    must be the same as the pointee type of the argument.
1239
1240    The byval attribute also supports specifying an alignment with the
1241    align attribute. It indicates the alignment of the stack slot to
1242    form and the known alignment of the pointer specified to the call
1243    site. If the alignment is not specified, then the code generator
1244    makes a target-specific assumption.
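
    As a hedged sketch, the following passes a struct by value. The type
    ``%struct.S`` and the function names are hypothetical; note that the
    attribute appears on both the declaration and the call site.

    .. code-block:: llvm

        %struct.S = type { i32, i32 }

        declare void @consume(ptr byval(%struct.S) align 4)

        define void @caller(ptr %s) {
          ; The callee receives a hidden copy of the pointee.
          call void @consume(ptr byval(%struct.S) align 4 %s)
          ret void
        }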
1245
1246.. _attr_byref:
1247
1248``byref(<ty>)``
1249
1250    The ``byref`` argument attribute allows specifying the pointee
1251    memory type of an argument. This is similar to ``byval``, but does
1252    not imply a copy is made anywhere, or that the argument is passed
1253    on the stack. This implies the pointer is dereferenceable up to
1254    the storage size of the type.
1255
    It is not generally permissible to introduce a write to a
    ``byref`` pointer. The pointer may have any address space and may
    be read only.
1259
1260    This is not a valid attribute for return values.
1261
    The alignment for a ``byref`` parameter can be explicitly
1263    specified by combining it with the ``align`` attribute, similar to
1264    ``byval``. If the alignment is not specified, then the code generator
1265    makes a target-specific assumption.
1266
1267    This is intended for representing ABI constraints, and is not
1268    intended to be inferred for optimization use.
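
    A minimal sketch, with a hypothetical function name, of a parameter that
    is passed by reference and only read:

    .. code-block:: llvm

        define i32 @read_config(ptr byref(i32) align 4 %cfg) {
          ; The pointer is dereferenceable for the storage size of i32.
          %v = load i32, ptr %cfg, align 4
          ret i32 %v
        }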
1269
1270.. _attr_preallocated:
1271
1272``preallocated(<ty>)``
1273    This indicates that the pointer parameter should really be passed by
1274    value to the function, and that the pointer parameter's pointee has
1275    already been initialized before the call instruction. This attribute
1276    is only valid on LLVM pointer arguments. The argument must be the value
1277    returned by the appropriate
1278    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1279    ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1280    calls, although it is ignored during codegen.
1281
1282    A non ``musttail`` function call with a ``preallocated`` attribute in
1283    any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1284    function call cannot have a ``"preallocated"`` operand bundle.
1285
1286    The preallocated attribute requires a type argument, which must be
1287    the same as the pointee type of the argument.
1288
1289    The preallocated attribute also supports specifying an alignment with the
1290    align attribute. It indicates the alignment of the stack slot to
1291    form and the known alignment of the pointer specified to the call
1292    site. If the alignment is not specified, then the code generator
1293    makes a target-specific assumption.
1294
1295.. _attr_inalloca:
1296
1297``inalloca(<ty>)``
1298
1299    The ``inalloca`` argument attribute allows the caller to take the
1300    address of outgoing stack arguments. An ``inalloca`` argument must
1301    be a pointer to stack memory produced by an ``alloca`` instruction.
1302    The alloca, or argument allocation, must also be tagged with the
1303    inalloca keyword. Only the last argument may have the ``inalloca``
1304    attribute, and that argument is guaranteed to be passed in memory.
1305
1306    An argument allocation may be used by a call at most once because
1307    the call may deallocate it. The ``inalloca`` attribute cannot be
1308    used in conjunction with other attributes that affect argument
1309    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1310    ``inalloca`` attribute also disables LLVM's implicit lowering of
1311    large aggregate return values, which means that frontend authors
1312    must lower them with ``sret`` pointers.
1313
1314    When the call site is reached, the argument allocation must have
1315    been the most recent stack allocation that is still live, or the
1316    behavior is undefined. It is possible to allocate additional stack
1317    space after an argument allocation and before its call site, but it
1318    must be cleared off with :ref:`llvm.stackrestore
1319    <int_stackrestore>`.
1320
    The ``inalloca`` attribute requires a type argument, which must be the
1322    same as the pointee type of the argument.
1323
1324    See :doc:`InAlloca` for more information on how to use this
1325    attribute.
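
    A minimal sketch of the pattern, using the hypothetical names
    ``%struct.Args``, ``@callee`` and ``@caller``, could look like this. It
    omits any stack save and restore around the call:

    .. code-block:: llvm

        %struct.Args = type <{ i32, i32 }>

        ; Hypothetical callee; inalloca is on the last (here: only) argument.
        declare void @callee(ptr inalloca(%struct.Args))

        define void @caller() {
          ; The argument allocation itself must be tagged with inalloca.
          %args = alloca inalloca %struct.Args
          store i32 1, ptr %args
          %second = getelementptr inbounds %struct.Args, ptr %args, i32 0, i32 1
          store i32 2, ptr %second
          ; The allocation is used by exactly one call.
          call void @callee(ptr inalloca(%struct.Args) %args)
          ret void
        }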
1326
1327``sret(<ty>)``
1328    This indicates that the pointer parameter specifies the address of a
1329    structure that is the return value of the function in the source
1330    program. This pointer must be guaranteed by the caller to be valid:
1331    loads and stores to the structure may be assumed by the callee not
1332    to trap and to be properly aligned.
1333
    The ``sret`` type argument specifies the in-memory type, which must be
    the same as the pointee type of the argument.
1336
1337    A function that accepts an ``sret`` argument must return ``void``.
1338    A return value may not be ``sret``.
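
    For illustration, a struct return lowered through an ``sret`` pointer
    might look like the following sketch. The names ``%struct.Pair``,
    ``@make_pair`` and ``@caller`` are hypothetical:

    .. code-block:: llvm

        %struct.Pair = type { i32, i32 }

        ; The callee writes its result through the sret pointer and returns void.
        define void @make_pair(ptr sret(%struct.Pair) align 4 %out) {
          store i32 1, ptr %out
          %second = getelementptr inbounds %struct.Pair, ptr %out, i32 0, i32 1
          store i32 2, ptr %second
          ret void
        }

        define i32 @caller() {
          ; The caller provides valid, suitably aligned storage for the result.
          %tmp = alloca %struct.Pair, align 4
          call void @make_pair(ptr sret(%struct.Pair) align 4 %tmp)
          %first = load i32, ptr %tmp
          ret i32 %first
        }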
1339
1340.. _attr_elementtype:
1341
1342``elementtype(<ty>)``
1343
1344    The ``elementtype`` argument attribute can be used to specify a pointer
1345    element type in a way that is compatible with `opaque pointers
1346    <OpaquePointers.html>`__.
1347
1348    The ``elementtype`` attribute by itself does not carry any specific
1349    semantics. However, certain intrinsics may require this attribute to be
1350    present and assign it particular semantics. This will be documented on
1351    individual intrinsics.
1352
1353    The attribute may only be applied to pointer typed arguments of intrinsic
1354    calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1355    to parameters on function declarations. For non-opaque pointers, the type
1356    passed to ``elementtype`` must match the pointer element type.
1357
1358.. _attr_align:
1359
1360``align <n>`` or ``align(<n>)``
1361    This indicates that the pointer value or vector of pointers has the
1362    specified alignment. If applied to a vector of pointers, *all* pointers
    (elements) have the specified alignment. If the pointer value does not have
    the specified alignment, a :ref:`poison value <poisonvalues>` is returned or
    passed instead. Combine the ``align`` attribute with the ``noundef``
    attribute to guarantee that the pointer is aligned: a misaligned pointer
    then causes undefined behavior instead of producing poison. Note that
    ``align 1`` has no effect on non-byval, non-preallocated arguments.
1369
1370    Note that this attribute has additional semantics when combined with the
1371    ``byval`` or ``preallocated`` attribute, which are documented there.
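
    As a small sketch (the function name ``@load_aligned`` is hypothetical),
    combining ``align`` with ``noundef`` lets the callee rely on the alignment:

    .. code-block:: llvm

        ; Passing a pointer that is not 16-byte aligned is undefined behavior
        ; here, because the resulting poison value would violate noundef.
        define i32 @load_aligned(ptr noundef align 16 %p) {
          %v = load i32, ptr %p, align 16
          ret i32 %v
        }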
1372
1373.. _noalias:
1374
1375``noalias``
1376    This indicates that memory locations accessed via pointer values
1377    :ref:`based <pointeraliasing>` on the argument or return value are not also
1378    accessed, during the execution of the function, via pointer values not
1379    *based* on the argument or return value. This guarantee only holds for
1380    memory locations that are *modified*, by any means, during the execution of
1381    the function. If there are other accesses not based on the argument or
1382    return value, the behavior is undefined. The attribute on a return value
1383    also has additional semantics, as described below. Both the caller and the
1384    callee share the responsibility of ensuring that these requirements are
1385    met. For further details, please see the discussion of the NoAlias response
1386    in :ref:`alias analysis <Must, May,  or No>`.
1387
1388    Note that this definition of ``noalias`` is intentionally similar
1389    to the definition of ``restrict`` in C99 for function arguments.
1390
1391    For function return values, C99's ``restrict`` is not meaningful,
1392    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1393    attribute on return values are stronger than the semantics of the attribute
1394    when used on function arguments. On function return values, the ``noalias``
1395    attribute indicates that the function acts like a system memory allocation
1396    function, returning a pointer to allocated storage disjoint from the
1397    storage for any other object accessible to the caller.
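
    The two uses can be sketched as follows. The names ``@my_alloc`` and
    ``@copy_word`` are hypothetical:

    .. code-block:: llvm

        ; On a return value: behaves like an allocation function, so the result
        ; does not alias any other object accessible to the caller.
        declare noalias ptr @my_alloc(i64)

        ; On arguments: memory modified during the call is accessed either
        ; through pointers based on %dst or through pointers based on %src,
        ; never both.
        define void @copy_word(ptr noalias %dst, ptr noalias %src) {
          %v = load i32, ptr %src
          store i32 %v, ptr %dst
          ret void
        }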
1398
1399``captures(...)``
    This attribute restricts the ways in which the callee may capture the
    pointer. This is not a valid attribute for return values. This attribute
    applies only to the particular copy of the pointer passed in this argument.

    The argument of ``captures`` is a list of captured pointer components,
    which may be ``none``, or a combination of:
1406
1407    - ``address``: The integral address of the pointer.
1408    - ``address_is_null`` (subset of ``address``): Whether the address is null.
1409    - ``provenance``: The ability to access the pointer for both read and write
1410      after the function returns.
1411    - ``read_provenance`` (subset of ``provenance``): The ability to access the
1412      pointer only for reads after the function returns.
1413
1414    Additionally, it is possible to specify that some components are only
1415    captured in certain locations. Currently only the return value (``ret``)
1416    and other (default) locations are supported.
1417
    The :ref:`pointer capture section <pointercapture>` discusses these
    semantics in more detail.
1420
1421    Some examples of how to use the attribute:
1422
1423    - ``captures(none)``: Pointer not captured.
1424    - ``captures(address, provenance)``: Equivalent to omitting the attribute.
1425    - ``captures(address)``: Address may be captured, but not provenance.
1426    - ``captures(address_is_null)``: Only captures whether the address is null.
1427    - ``captures(address, read_provenance)``: Both address and provenance
1428      captured, but only for read-only access.
1429    - ``captures(ret: address, provenance)``: Pointer captured through return
1430      value only.
1431    - ``captures(address_is_null, ret: address, provenance)``: The whole pointer
1432      is captured through the return value, and additionally whether the pointer
1433      is null is captured in some other way.
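
    In declarations, these forms might appear as in the following sketch. The
    function names are hypothetical:

    .. code-block:: llvm

        ; The callee does not capture the pointer in any way.
        declare void @observe(ptr captures(none))

        ; Only whether the pointer is null may be captured.
        declare i1 @is_null(ptr captures(address_is_null))

        ; The pointer may be captured, but only through the return value.
        declare ptr @identity(ptr captures(ret: address, provenance))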
1434
1435``nofree``
    This indicates that the callee does not free the pointer argument. This is
    not a valid attribute for return values.
1438
1439.. _nest:
1440
1441``nest``
1442    This indicates that the pointer parameter can be excised using the
1443    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1444    attribute for return values and can only be applied to one parameter.
1445
1446``returned``
1447    This indicates that the function always returns the argument as its return
1448    value. This is a hint to the optimizer and code generator used when
1449    generating the caller, allowing value propagation, tail call optimization,
1450    and omission of register saves and restores in some cases; it is not
1451    checked or enforced when generating the callee. The parameter and the
1452    function return type must be valid operands for the
1453    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1454    return values and can only be applied to one parameter.
1455
1456``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters. This is not
    checked or enforced by LLVM; if the parameter or return pointer is null,
    a :ref:`poison value <poisonvalues>` is returned or passed instead.
    Combine the ``nonnull`` attribute with the ``noundef`` attribute to
    guarantee that the pointer is not null: a null pointer then causes
    undefined behavior instead of producing poison.
1463
1464``dereferenceable(<n>)``
1465    This indicates that the parameter or return pointer is dereferenceable. This
1466    attribute may only be applied to pointer typed parameters. A pointer that
1467    is dereferenceable can be loaded from speculatively without a risk of
1468    trapping. The number of bytes known to be dereferenceable must be provided
1469    in parentheses. It is legal for the number of bytes to be less than the
1470    size of the pointee type. The ``nonnull`` attribute does not imply
1471    dereferenceability (consider a pointer to one element past the end of an
1472    array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1473    ``addrspace(0)`` (which is the default address space), except if the
1474    ``null_pointer_is_valid`` function attribute is present.
1475    ``n`` should be a positive number. The pointer should be well defined,
1476    otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1477    implies ``noundef``.
1478
1479``dereferenceable_or_null(<n>)``
1480    This indicates that the parameter or return value isn't both
1481    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1482    time. All non-null pointers tagged with
1483    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1484    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1485    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1486    and in other address spaces ``dereferenceable_or_null(<n>)``
1487    implies that a pointer is at least one of ``dereferenceable(<n>)``
1488    or ``null`` (i.e. it may be both ``null`` and
1489    ``dereferenceable(<n>)``). This attribute may only be applied to
1490    pointer typed parameters.
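
    The following sketch contrasts ``nonnull``, ``dereferenceable`` and
    ``dereferenceable_or_null``. All function names are hypothetical:

    .. code-block:: llvm

        ; Non-null, but not necessarily dereferenceable.
        declare void @use_nonnull(ptr nonnull noundef)

        ; At least 4 bytes behind %p can be loaded without a risk of trapping.
        define i32 @read(ptr dereferenceable(4) %p) {
          %v = load i32, ptr %p
          ret i32 %v
        }

        ; The result is either null or points to at least 8 dereferenceable bytes.
        declare dereferenceable_or_null(8) ptr @maybe_alloc(i64)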
1491
1492``swiftself``
1493    This indicates that the parameter is the self/context parameter. This is not
1494    a valid attribute for return values and can only be applied to one
1495    parameter.
1496
1497.. _swiftasync:
1498
1499``swiftasync``
1500    This indicates that the parameter is the asynchronous context parameter and
1501    triggers the creation of a target-specific extended frame record to store
1502    this pointer. This is not a valid attribute for return values and can only
1503    be applied to one parameter.
1504
1505``swifterror``
    This attribute exists to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from, stored to, or used as
    a ``swifterror`` argument. This is not a valid attribute for return values
    and can only be applied to one parameter.
1514
1515    These constraints allow the calling convention to optimize access to
1516    ``swifterror`` variables by associating them with a specific register at
1517    call boundaries rather than placing them in memory. Since this does change
1518    the calling convention, a function which uses the ``swifterror`` attribute
1519    on a parameter is not ABI-compatible with one which does not.
1520
1521    These constraints also allow LLVM to assume that a ``swifterror`` argument
1522    does not alias any other memory visible within a function and that a
1523    ``swifterror`` alloca passed as an argument does not escape.
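
    A minimal sketch of the permitted uses, assuming the ``swiftcc`` calling
    convention and the hypothetical functions ``@may_fail`` and ``@caller``:

    .. code-block:: llvm

        declare swiftcc void @may_fail(ptr swifterror)

        define swiftcc ptr @caller() {
          ; A pointer-sized swifterror alloca holds the error value.
          %err = alloca swifterror ptr
          store ptr null, ptr %err
          ; The argument comes directly from the swifterror alloca.
          call swiftcc void @may_fail(ptr swifterror %err)
          ; The only other permitted uses are loads and stores.
          %e = load ptr, ptr %err
          ret ptr %e
        }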
1524
1525``immarg``
1526    This indicates the parameter is required to be an immediate
1527    value. This must be a trivial immediate integer or floating-point
1528    constant. Undef or constant expressions are not valid. This is
1529    only valid on intrinsic declarations and cannot be applied to a
1530    call site or arbitrary function.
1531
1532``noundef``
1533    This attribute applies to parameters and return values. If the value
1534    representation contains any undefined or poison bits, the behavior is
1535    undefined. Note that this does not refer to padding introduced by the
1536    type's storage representation.
1537
1538    If memory sanitizer is enabled, ``noundef`` becomes an ABI attribute and
1539    must match between the call-site and the function definition.
1540
1541.. _nofpclass:
1542
1543``nofpclass(<test mask>)``
1544    This attribute applies to parameters and return values with
1545    floating-point and vector of floating-point types, as well as
1546    :ref:`supported aggregates <fastmath_return_types>` of such types
1547    (matching the supported types for :ref:`fast-math flags <fastmath>`).
1548    The test mask has the same format as the second argument to the
1549    :ref:`llvm.is.fpclass <llvm.is.fpclass>`, and indicates which classes
1550    of floating-point values are not permitted for the value. For example
1551    a bitmask of 3 indicates the parameter may not be a NaN.
1552
1553    If the value is a floating-point class indicated by the
1554    ``nofpclass`` test mask, a :ref:`poison value <poisonvalues>` is
1555    passed or returned instead.
1556
1557.. code-block:: text
1558  :caption: The following invariants hold
1559
1560       @llvm.is.fpclass(nofpclass(test_mask) %x, test_mask) => false
1561       @llvm.is.fpclass(nofpclass(test_mask) %x, ~test_mask) => true
1562       nofpclass(all) => poison
1563..
1564
1565   In textual IR, various string names are supported for readability
1566   and can be combined. For example ``nofpclass(nan pinf nzero)``
1567   evaluates to a mask of 547.
1568
1569   This does not depend on the floating-point environment. For
1570   example, a function parameter marked ``nofpclass(zero)`` indicates
1571   no zero inputs. If this is applied to an argument in a function
1572   marked with :ref:`\"denormal-fp-math\" <denormal_fp_math>`
1573   indicating zero treatment of input denormals, it does not imply the
1574   value cannot be a denormal value which would compare equal to 0.
1575
1576.. table:: Recognized test mask names
1577
1578    +-------+----------------------+---------------+
1579    | Name  | floating-point class | Bitmask value |
1580    +=======+======================+===============+
1581    |  nan  | Any NaN              |       3       |
1582    +-------+----------------------+---------------+
1583    |  inf  | +/- infinity         |      516      |
1584    +-------+----------------------+---------------+
1585    |  norm | +/- normal           |      264      |
1586    +-------+----------------------+---------------+
1587    |  sub  | +/- subnormal        |      144      |
1588    +-------+----------------------+---------------+
1589    |  zero | +/- 0                |       96      |
1590    +-------+----------------------+---------------+
1591    |  all  | All values           |     1023      |
1592    +-------+----------------------+---------------+
1593    | snan  | Signaling NaN        |       1       |
1594    +-------+----------------------+---------------+
1595    | qnan  | Quiet NaN            |       2       |
1596    +-------+----------------------+---------------+
1597    | ninf  | Negative infinity    |       4       |
1598    +-------+----------------------+---------------+
1599    | nnorm | Negative normal      |       8       |
1600    +-------+----------------------+---------------+
1601    | nsub  | Negative subnormal   |       16      |
1602    +-------+----------------------+---------------+
1603    | nzero | Negative zero        |       32      |
1604    +-------+----------------------+---------------+
1605    | pzero | Positive zero        |       64      |
1606    +-------+----------------------+---------------+
1607    | psub  | Positive subnormal   |       128     |
1608    +-------+----------------------+---------------+
1609    | pnorm | Positive normal      |       256     |
1610    +-------+----------------------+---------------+
1611    | pinf  | Positive infinity    |       512     |
1612    +-------+----------------------+---------------+
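
As a short sketch of the attribute in use (the function names ``@scale`` and
``@no_nan_result`` are hypothetical), the named classes from the table can be
combined on parameters and return values:

.. code-block:: llvm

    ; Passing a NaN or an infinity for %x yields poison instead.
    ; "nan inf" corresponds to the mask 3 + 516 = 519.
    define float @scale(float nofpclass(nan inf) %x) {
      %r = fmul float %x, 2.0
      ret float %r
    }

    ; The returned value is never a NaN.
    declare nofpclass(nan) float @no_nan_result(float)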
1613
1614
1615``alignstack(<n>)``
1616    This indicates the alignment that should be considered by the backend when
1617    assigning this parameter to a stack slot during calling convention
1618    lowering. The enforcement of the specified alignment is target-dependent,
1619    as target-specific calling convention rules may override this value. This
1620    attribute serves the purpose of carrying language specific alignment
1621    information that is not mapped to base types in the backend (for example,
1622    over-alignment specification through language attributes).
1623
1624``allocalign``
1625    The function parameter marked with this attribute is the alignment in bytes of the
1626    newly allocated block returned by this function. The returned value must either have
1627    the specified alignment or be the null pointer. The return value MAY be more aligned
1628    than the requested alignment, but not less aligned.  Invalid (e.g. non-power-of-2)
1629    alignments are permitted for the allocalign parameter, so long as the returned pointer
1630    is null. This attribute may only be applied to integer parameters.
1631
1632``allocptr``
1633    The function parameter marked with this attribute is the pointer
1634    that will be manipulated by the allocator. For a realloc-like
1635    function the pointer will be invalidated upon success (but the
1636    same address may be returned), for a free-like function the
1637    pointer will always be invalidated.
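
    As a sketch, a hypothetical allocator family ``"my_alloc"`` could use
    ``allocalign`` and ``allocptr`` together with the ``allockind``,
    ``allocsize`` and ``"alloc-family"`` function attributes:

    .. code-block:: llvm

        ; Parameter 0 is the requested alignment, parameter 1 the size in bytes.
        declare noalias ptr @my_aligned_alloc(i64 allocalign, i64) allockind("alloc,uninitialized,aligned") allocsize(1) "alloc-family"="my_alloc"

        ; The pointer passed in is invalidated on success.
        declare noalias ptr @my_realloc(ptr allocptr, i64) allockind("realloc") allocsize(1) "alloc-family"="my_alloc"

        ; The pointer passed in is always invalidated.
        declare void @my_free(ptr allocptr) allockind("free") "alloc-family"="my_alloc"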
1638
1639``readnone``
    This attribute indicates that the function does not dereference this
1641    pointer argument, even though it may read or write the memory that the
1642    pointer points to if accessed through other pointers.
1643
1644    If a function reads from or writes to a readnone pointer argument, the
1645    behavior is undefined.
1646
1647``readonly``
1648    This attribute indicates that the function does not write through this
1649    pointer argument, even though it may write to the memory that the pointer
1650    points to.
1651
1652    If a function writes to a readonly pointer argument, the behavior is
1653    undefined.
1654
1655``writeonly``
1656    This attribute indicates that the function may write to, but does not read
1657    through this pointer argument (even though it may read from the memory that
1658    the pointer points to).
1659
1660    This attribute is understood in the same way as the ``memory(write)``
1661    attribute. That is, the pointer may still be read as long as the read is
1662    not observable outside the function. See the ``memory`` documentation for
1663    precise semantics.
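
    The three memory access attributes on pointer arguments can be sketched
    as follows. The function names are hypothetical:

    .. code-block:: llvm

        ; Only reads through %p.
        define i32 @sum2(ptr readonly %p) {
          %a = load i32, ptr %p
          %q = getelementptr inbounds i32, ptr %p, i64 1
          %b = load i32, ptr %q
          %s = add i32 %a, %b
          ret i32 %s
        }

        ; Only writes through %p.
        define void @clear(ptr writeonly %p) {
          store i32 0, ptr %p
          ret void
        }

        ; Never dereferences its argument; it may still inspect the address.
        declare void @inspect_address(ptr readnone)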
1664
1665``writable``
1666    This attribute is only meaningful in conjunction with ``dereferenceable(N)``
1667    or another attribute that implies the first ``N`` bytes of the pointer
1668    argument are dereferenceable.
1669
1670    In that case, the attribute indicates that the first ``N`` bytes will be
1671    (non-atomically) loaded and stored back on entry to the function.
1672
1673    This implies that it's possible to introduce spurious stores on entry to
1674    the function without introducing traps or data races. This does not
1675    necessarily hold throughout the whole function, as the pointer may escape
1676    to a different thread during the execution of the function. See also the
    :ref:`atomic optimization guide <Optimization outside atomic>`.
1678
1679    The "other attributes" that imply dereferenceability are
1680    ``dereferenceable_or_null`` (if the pointer is non-null) and the
1681    ``sret``, ``byval``, ``byref``, ``inalloca``, ``preallocated`` family of
1682    attributes. Note that not all of these combinations are useful, e.g.
1683    ``byval`` arguments are known to be writable even without this attribute.
1684
1685    The ``writable`` attribute cannot be combined with ``readnone``,
1686    ``readonly`` or a ``memory`` attribute that does not contain
1687    ``argmem: write``.
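
    As an illustrative sketch (the function ``@maybe_store`` is hypothetical),
    the combination below would permit an optimizer to replace the
    conditional store with an unconditional load, select and store of the
    first 4 bytes on entry:

    .. code-block:: llvm

        define void @maybe_store(ptr writable dereferenceable(4) %p, i1 %c) {
        entry:
          br i1 %c, label %then, label %exit

        then:
          ; Without writable, this store could not be made unconditional.
          store i32 1, ptr %p
          br label %exit

        exit:
          ret void
        }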
1688
1689``initializes((Lo1, Hi1), ...)``
1690    This attribute indicates that the function initializes the ranges of the
1691    pointer parameter's memory, ``[%p+LoN, %p+HiN)``. Initialization of memory
1692    means the first memory access is a non-volatile, non-atomic write. The
1693    write must happen before the function returns. If the function unwinds,
1694    the write may not happen.
1695
1696    This attribute only holds for the memory accessed via this pointer
1697    parameter. Other arbitrary accesses to the same memory via other pointers
1698    are allowed.
1699
    The ``writable`` and ``dereferenceable`` attributes do not imply the
1701    ``initializes`` attribute. The ``initializes`` attribute does not imply
1702    ``writeonly`` since ``initializes`` allows reading from the pointer
1703    after writing.
1704
1705    This attribute is a list of constant ranges in ascending order with no
1706    overlapping or consecutive list elements. ``LoN/HiN`` are 64-bit integers,
1707    and negative values are allowed in case the argument points partway into
1708    an allocation. An empty list is not allowed.
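
    For example, a sketch with the hypothetical functions ``@init_header``
    and ``@init_parts``:

    .. code-block:: llvm

        ; The first access to bytes [%p+0, %p+8) through %p is a plain store.
        define void @init_header(ptr initializes((0, 8)) %p) {
          store i64 0, ptr %p
          ret void
        }

        ; Two ranges in ascending order, neither overlapping nor adjacent.
        declare void @init_parts(ptr initializes((0, 4), (8, 12)))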
1709
1710``dead_on_unwind``
1711    At a high level, this attribute indicates that the pointer argument is dead
1712    if the call unwinds, in the sense that the caller will not depend on the
1713    contents of the memory. Stores that would only be visible on the unwind
1714    path can be elided.
1715
1716    More precisely, the behavior is as-if any memory written through the
1717    pointer during the execution of the function is overwritten with a poison
1718    value on unwind. This includes memory written by the implicit write implied
1719    by the ``writable`` attribute. The caller is allowed to access the affected
1720    memory, but all loads that are not preceded by a store will return poison.
1721
1722    This attribute cannot be applied to return values.
1723
1724``range(<ty> <a>, <b>)``
1725    This attribute expresses the possible range of the parameter or return value.
1726    If the value is not in the specified range, it is converted to poison.
1727    The arguments passed to ``range`` have the following properties:
1728
1729    -  The type must match the scalar type of the parameter or return value.
1730    -  The pair ``a,b`` represents the range ``[a,b)``.
1731    -  Both ``a`` and ``b`` are constants.
1732    -  The range is allowed to wrap.
1733    -  The empty range is represented using ``0,0``.
1734    -  Otherwise, ``a`` and ``b`` are not allowed to be equal.
1735
1736    This attribute may only be applied to parameters or return values with integer
1737    or vector of integer types.
1738
1739    For vector-typed parameters, the range is applied element-wise.
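
    A few forms of the attribute are sketched below. All function names are
    hypothetical:

    .. code-block:: llvm

        ; %x is in [0, 10); any other value is converted to poison.
        define i8 @digit(i8 range(i8 0, 10) %x) {
          ret i8 %x
        }

        ; The result is -1, 0 or 1.
        declare range(i32 -1, 2) i32 @compare(ptr, ptr)

        ; A wrapping range: values from 100 up to 127 and from -128 up to -101.
        declare void @wrapped(i8 range(i8 100, -100))

        ; For vectors, each element is in [0, 100).
        declare void @vec(<4 x i32> range(i32 0, 100))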
1740
1741.. _gc:
1742
1743Garbage Collector Strategy Names
1744--------------------------------
1745
1746Each function may specify a garbage collector strategy name, which is simply a
1747string:
1748
1749.. code-block:: llvm
1750
1751    define void @f() gc "name" { ... }
1752
The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.
1759
1760.. _prefixdata:
1761
1762Prefix Data
1763-----------
1764
1765Prefix data is data associated with a function which the code
1766generator will emit immediately before the function's entrypoint.
1767The purpose of this feature is to allow frontends to associate
1768language-specific runtime metadata with specific functions and make it
1769available through the function pointer while still allowing the
1770function pointer to be called.
1771
1772To access the data for a given function, a program may bitcast the
1773function pointer to a pointer to the constant's type and dereference
1774index -1. This implies that the IR symbol points just past the end of
1775the prefix data. For instance, take the example of a function annotated
1776with a single ``i32``,
1777
1778.. code-block:: llvm
1779
1780    define void @f() prefix i32 123 { ... }
1781
1782The prefix data can be referenced as,
1783
1784.. code-block:: llvm
1785
1786    %a = getelementptr inbounds i32, ptr @f, i32 -1
1787    %b = load i32, ptr %a
1788
1789Prefix data is laid out as if it were an initializer for a global variable
1790of the prefix data's type. The function will be placed such that the
1791beginning of the prefix data is aligned. This means that if the size
1792of the prefix data is not a multiple of the alignment size, the
1793function's entrypoint will not be aligned. If alignment of the
1794function's entrypoint is desired, padding must be added to the prefix
1795data.
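
As a sketch of such padding, assume a desired entry-point alignment of 16 bytes
and that the start of the prefix data is aligned to the function's alignment.
Padding the ``i32`` to a total of 16 bytes, with the padding placed first,
keeps the value reachable at index -1:

.. code-block:: llvm

    ; 12 bytes of padding followed by the 4-byte value: 16 bytes in total.
    define void @f() align 16 prefix <{ [12 x i8], i32 }> <{ [12 x i8] zeroinitializer, i32 123 }> {
      ret void
    }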
1796
1797A function may have prefix data but no body. This has similar semantics
1798to the ``available_externally`` linkage in that the data may be used by the
1799optimizers but will not be emitted in the object file.
1800
1801.. _prologuedata:
1802
1803Prologue Data
1804-------------
1805
1806The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1807be inserted prior to the function body. This can be used for enabling
1808function hot-patching and instrumentation.
1809
1810To maintain the semantics of ordinary function calls, the prologue data must
1811have a particular format. Specifically, it must begin with a sequence of
1812bytes which decode to a sequence of machine instructions, valid for the
1813module's target, which transfer control to the point immediately succeeding
1814the prologue data, without performing any other visible action. This allows
1815the inliner and other passes to reason about the semantics of the function
1816definition without needing to reason about the prologue data. Obviously this
1817makes the format of the prologue data highly target dependent.
1818
1819A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1820which encodes the ``nop`` instruction:
1821
1822.. code-block:: text
1823
1824    define void @f() prologue i8 144 { ... }
1825
1826Generally prologue data can be formed by encoding a relative branch instruction
1827which skips the metadata, as in this example of valid prologue data for the
1828x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1829
1830.. code-block:: text
1831
1832    %0 = type <{ i8, i8, ptr }>
1833
1834    define void @f() prologue %0 <{ i8 235, i8 8, ptr @md}> { ... }
1835
1836A function may have prologue data but no body. This has similar semantics
1837to the ``available_externally`` linkage in that the data may be used by the
1838optimizers but will not be emitted in the object file.
1839
1840.. _personalityfn:
1841
1842Personality Function
1843--------------------
1844
1845The ``personality`` attribute permits functions to specify what function
1846to use for exception handling.
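
As a minimal sketch, assuming the Itanium C++ personality routine
``__gxx_personality_v0`` and a hypothetical callee ``@may_throw``, a function
containing an ``invoke`` might name its personality function like this:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; A cleanup-only landing pad that resumes unwinding.
      %lp = landingpad { ptr, i32 }
              cleanup
      resume { ptr, i32 } %lp
    }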
1847
1848.. _attrgrp:
1849
1850Attribute Groups
1851----------------
1852
1853Attribute groups are groups of attributes that are referenced by objects within
1854the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
1856``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1857group will capture the important command line flags used to build that file.
1858
1859An attribute group is a module-level object. To use an attribute group, an
1860object references the attribute group's ID (e.g. ``#37``). An object may refer
1861to more than one attribute group. In that situation, the attributes from the
1862different groups are merged.
1863
1864Here is an example of attribute groups for a function that should always be
1865inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1866
1867.. code-block:: llvm
1868
1869   ; Target-independent attributes:
1870   attributes #0 = { alwaysinline alignstack=4 }
1871
1872   ; Target-dependent attributes:
1873   attributes #1 = { "no-sse" }
1874
1875   ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1876   define void @f() #0 #1 { ... }
1877
1878.. _fnattrs:
1879
1880Function Attributes
1881-------------------
1882
1883Function attributes are set to communicate additional information about
1884a function. Function attributes are considered to be part of the
1885function, not of the function type, so functions with different function
1886attributes can have the same function type.
1887
1888Function attributes are simple keywords or strings that follow the specified
1889type. Multiple attributes, when required, are separated by spaces.
1890For example:
1891
1892.. code-block:: llvm
1893
1894    define void @f() noinline { ... }
1895    define void @f() alwaysinline { ... }
1896    define void @f() alwaysinline optsize { ... }
1897    define void @f() optsize { ... }
1898    define void @f() "no-sse" { ... }
1899
1900``alignstack(<n>)``
1901    This attribute indicates that, when emitting the prologue and
1902    epilogue, the backend should forcibly align the stack pointer.
1903    Specify the desired alignment, which must be a power of two, in
1904    parentheses.
1905``"alloc-family"="FAMILY"``
1906    This indicates which "family" an allocator function is part of. To avoid
    collisions, the family name should match the mangled name of the primary
    allocator function, that is "malloc" for malloc/calloc/realloc/free,
    "_Znwm" for ``::operator new`` and ``::operator delete``, and
    "_ZnwmSt11align_val_t" for aligned ``::operator new`` and
    ``::operator delete``. Matching malloc/realloc/free calls within a family
    can be optimized, but mismatched ones will be left alone.
1913``allockind("KIND")``
1914    Describes the behavior of an allocation function. The KIND string contains comma
1915    separated entries from the following options:
1916
1917    * "alloc": the function returns a new block of memory or null.
1918    * "realloc": the function returns a new block of memory or null. If the
1919      result is non-null the memory contents from the start of the block up to
1920      the smaller of the original allocation size and the new allocation size
1921      will match that of the ``allocptr`` argument and the ``allocptr``
1922      argument is invalidated, even if the function returns the same address.
1923    * "free": the function frees the block of memory specified by ``allocptr``.
1924      Functions marked as "free" ``allockind`` must return void.
    * "uninitialized": Any newly-allocated memory (either a new block from
      an "alloc" function or the enlarged capacity from a "realloc" function)
      will be uninitialized.
    * "zeroed": Any newly-allocated memory (either a new block from an "alloc"
      function or the enlarged capacity from a "realloc" function) will be
      zeroed.
1931    * "aligned": the function returns memory aligned according to the
1932      ``allocalign`` parameter.
1933
1934    The first three options are mutually exclusive, and the remaining options
1935    describe more details of how the function behaves. The remaining options
1936    are invalid for "free"-type functions.
1937``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1938    This attribute indicates that the annotated function will always return at
1939    least a given number of bytes (or null). Its arguments are zero-indexed
1940    parameter numbers; if one argument is provided, then it's assumed that at
1941    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1942    returned pointer. If two are provided, then it's assumed that
1943    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1944    available. The referenced parameters must be integer types. No assumptions
1945    are made about the contents of the returned block of memory.
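
    As an illustration of how the allocator attributes combine, a C-library
    style allocator family might be declared as follows. This is a sketch
    rather than a listing of what any particular frontend emits; the ``i64``
    size type assumes a 64-bit target.

    .. code-block:: llvm

        ; All four functions belong to the "malloc" family.
        declare noalias ptr @malloc(i64) allockind("alloc,uninitialized") allocsize(0) "alloc-family"="malloc"

        ; Both parameters are multiplied to obtain the minimum size of the
        ; returned block, and the new memory is zeroed.
        declare noalias ptr @calloc(i64, i64) allockind("alloc,zeroed") allocsize(0, 1) "alloc-family"="malloc"

        ; The block being resized is identified by the allocptr parameter;
        ; parameter 1 is the new size.
        declare noalias ptr @realloc(ptr allocptr, i64) allockind("realloc") allocsize(1) "alloc-family"="malloc"

        ; "free"-kind functions must return void.
        declare void @free(ptr allocptr) allockind("free") "alloc-family"="malloc"
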
1946``alwaysinline``
1947    This attribute indicates that the inliner should attempt to inline
1948    this function into callers whenever possible, ignoring any active
1949    inlining size threshold for this caller.
1950``builtin``
1951    This indicates that the callee function at a call site should be
1952    recognized as a built-in function, even though the function's declaration
1953    uses the ``nobuiltin`` attribute. This is only valid at call sites for
1954    direct calls to functions that are declared with the ``nobuiltin``
1955    attribute.
1956``cold``
1957    This attribute indicates that this function is rarely called. When
1958    computing edge weights, basic blocks post-dominated by a cold
1959    function call are also considered cold and are therefore given low
1960    weight.
1961
1962.. _attr_convergent:
1963
1964``convergent``
1965    This attribute indicates that this function is convergent.
1966    When it appears on a call/invoke, the convergent attribute
1967    indicates that we should treat the call as though we’re calling a
1968    convergent function. This is particularly useful on indirect
1969    calls; without this we may treat such calls as though the target
1970    is non-convergent.
1971
1972    See :doc:`ConvergentOperations` for further details.
1973
1974    It is an error to call :ref:`llvm.experimental.convergence.entry
1975    <llvm.experimental.convergence.entry>` from a function that
1976    does not have this attribute.
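
    A minimal sketch of both uses, on a declaration and on an indirect call
    site. The function names ``@barrier`` and ``@sync_then_call`` are
    hypothetical and chosen only for illustration.

    .. code-block:: llvm

        ; A convergent function, for example a cross-thread barrier.
        declare void @barrier() convergent

        define void @sync_then_call(ptr %fp) convergent {
          call void @barrier()
          ; The target of this indirect call is unknown, so the call site is
          ; explicitly marked as convergent.
          call void %fp() convergent
          ret void
        }
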
1977``disable_sanitizer_instrumentation``
1978    When instrumenting code with sanitizers, it can be important to skip certain
1979    functions to ensure no instrumentation is applied to them.
1980
1981    This attribute is not always equivalent to absent ``sanitize_<name>``
1982    attributes: depending on the specific sanitizer, code can be inserted into
1983    functions regardless of the ``sanitize_<name>`` attribute to prevent false
1984    positive reports.
1985
1986    ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
1987    taking precedence over the ``sanitize_<name>`` attributes and other compiler
1988    flags.
1989``"dontcall-error"``
1990    This attribute denotes that an error diagnostic should be emitted when a
1991    call of a function with this attribute is not eliminated via optimization.
1992    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1993    such callees to attach information about where in the source language such a
1994    call came from. A string value can be provided as a note.
1995``"dontcall-warn"``
1996    This attribute denotes that a warning diagnostic should be emitted when a
1997    call of a function with this attribute is not eliminated via optimization.
1998    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1999    such callees to attach information about where in the source language such a
2000    call came from. A string value can be provided as a note.
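
    The following sketch shows how the two diagnostic attributes and the
    optional ``srcloc`` metadata might appear together. The function names and
    note strings are hypothetical, and the ``srcloc`` operand is a placeholder:
    its encoding is defined by the front end.

    .. code-block:: llvm

        declare void @old_api() "dontcall-warn"="use new_api() instead"
        declare void @forbidden_api() "dontcall-error"="forbidden_api() must not be called"

        define void @caller() {
          ; If this call survives optimization, an error diagnostic is emitted.
          call void @forbidden_api(), !srcloc !0
          ret void
        }

        ; Placeholder source location supplied by the front end.
        !0 = !{i64 0}
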
2001``fn_ret_thunk_extern``
2002    This attribute tells the code generator that returns from functions should
2003    be replaced with jumps to externally-defined architecture-specific symbols.
2004    For X86, this symbol's identifier is ``__x86_return_thunk``.
2005``"frame-pointer"``
2006    This attribute tells the code generator whether the function
2007    should keep the frame pointer. The code generator may emit the frame pointer
2008    even if this attribute says the frame pointer can be eliminated.
2009    The allowed string values are:
2010
2011     * ``"none"`` (default) - the frame pointer can be eliminated, and its
2012       register can be used for other purposes.
2013     * ``"reserved"`` - the frame pointer register must either be updated to
2014       point to a valid frame record for the current function, or not be
2015       modified.
2016     * ``"non-leaf"`` - the frame pointer should be kept if the function calls
2017       other functions.
2018     * ``"all"`` - the frame pointer should be kept.
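
    As a small illustration of the ``"non-leaf"`` value, consider the two
    functions below (all names are hypothetical). What the code generator
    actually emits remains target-dependent.

    .. code-block:: llvm

        declare void @callee()

        ; Calls another function, so the frame pointer should be kept.
        define void @outer() "frame-pointer"="non-leaf" {
          call void @callee()
          ret void
        }

        ; Makes no calls, so the frame pointer may be eliminated.
        define void @leaf() "frame-pointer"="non-leaf" {
          ret void
        }
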
2019``hot``
2020    This attribute indicates that this function is a hot spot of the program
2021    execution. The function will be optimized more aggressively and will be
2022    placed into a special subsection of the text section to improve locality.
2023
2024    When profile feedback is enabled, this attribute takes precedence over
2025    the profile information. By marking a function ``hot``, users can work
2026    around the cases where the training input does not have good coverage
2027    on all the hot functions.
2028``inlinehint``
2029    This attribute indicates that the source code contained a hint that
2030    inlining this function is desirable (such as the "inline" keyword in
2031    C/C++). It is just a hint; it imposes no requirements on the
2032    inliner.
2033``jumptable``
2034    This attribute indicates that the function should be added to a
2035    jump-instruction table at code-generation time, and that all address-taken
2036    references to this function should be replaced with a reference to the
2037    appropriate jump-instruction-table function pointer. Note that this creates
2038    a new pointer for the original function, which means that code that depends
2039    on function-pointer identity can break. So, any function annotated with
2040    ``jumptable`` must also be ``unnamed_addr``.
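
    For example, a definition satisfying this requirement could look like the
    following sketch (``@handler`` is a hypothetical name):

    .. code-block:: llvm

        ; unnamed_addr states that the address of the function is not
        ; significant, which jumptable requires.
        define void @handler() unnamed_addr jumptable {
          ret void
        }
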
2041``memory(...)``
2042    This attribute specifies the possible memory effects of the call-site or
2043    function. It allows specifying the possible access kinds (``none``,
2044    ``read``, ``write``, or ``readwrite``) for the possible memory location
2045    kinds (``argmem``, ``inaccessiblemem``, as well as a default). It is best
2046    understood by example:
2047
2048    - ``memory(none)``: Does not access any memory.
2049    - ``memory(read)``: May read (but not write) any memory.
2050    - ``memory(write)``: May write (but not read) any memory.
2051    - ``memory(readwrite)``: May read or write any memory.
2052    - ``memory(argmem: read)``: May only read argument memory.
2053    - ``memory(argmem: read, inaccessiblemem: write)``: May only read argument
2054      memory and only write inaccessible memory.
2055    - ``memory(read, argmem: readwrite)``: May read any memory (default mode)
2056      and additionally write argument memory.
2057    - ``memory(readwrite, argmem: none)``: May access any memory apart from
2058      argument memory.
2059
2060    The supported access kinds are:
2061
2062    - ``readwrite``: Any kind of access to the location is allowed.
2063    - ``read``: The location is only read. Writing to the location is immediate
2064      undefined behavior. This includes the case where the location is read from
2065      and then the same value is written back.
2066    - ``write``: Only writes to the location are observable outside the function
2067      call. However, the function may still internally read the location after
2068      writing it, as this is not observable. Reading the location prior to
2069      writing it results in a poison value.
2070    - ``none``: No reads or writes to the location are observed outside the
2071      function. It is always valid to read and write allocas, and to read global
2072      constants, even if ``memory(none)`` is used, as these effects are not
2073      externally observable.
2074
2075    The supported memory location kinds are:
2076
2077    - ``argmem``: This refers to accesses that are based on pointer arguments
2078      to the function.
2079    - ``inaccessiblemem``: This refers to accesses to memory which is not
2080      accessible by the current module (before return from the function -- an
2081      allocator function may return newly accessible memory while only
2082      accessing inaccessible memory itself). Inaccessible memory is often used
2083      to model control dependencies of intrinsics.
2084    - The default access kind (specified without a location prefix) applies to
2085      all locations that haven't been specified explicitly, including those that
2086      don't currently have a dedicated location kind (e.g. accesses to globals
2087      or captured pointers).
2088
2089    If the ``memory`` attribute is not specified, then ``memory(readwrite)``
2090    is implied (all memory effects are possible).
2091
2092    The memory effects of a call can be computed as
2093    ``CallSiteEffects & (FunctionEffects | OperandBundleEffects)``. Thus, the
2094    call-site annotation takes precedence over the potential effects described
2095    by either the function annotation or the operand bundles.
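
    The sketch below puts a few of these forms on declarations and shows the
    intersection rule at a call site that has no operand bundles. All function
    names are hypothetical.

    .. code-block:: llvm

        ; Only reads memory reachable through its pointer argument.
        declare i32 @sum(ptr, i64) memory(argmem: read)

        ; Only writes memory reachable through its pointer argument.
        declare void @fill(ptr, i64, i8) memory(argmem: write)

        ; May read any memory (default mode) and additionally write argument
        ; memory.
        declare void @update(ptr) memory(read, argmem: readwrite)

        ; May read any memory.
        declare i32 @lookup(ptr) memory(read)

        define i32 @caller(ptr %p) {
          ; Call-site effects are intersected with the function's effects:
          ; (argmem: readwrite) & (read) leaves only argmem: read.
          %v = call i32 @lookup(ptr %p) memory(argmem: readwrite)
          ret i32 %v
        }
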
2096``minsize``
2097    This attribute suggests that optimization passes and code generator
2098    passes make choices that keep the code size of this function as small
2099    as possible and perform optimizations that may sacrifice runtime
2100    performance in order to minimize the size of the generated code.
2101    This attribute is incompatible with the ``optdebug`` and ``optnone``
2102    attributes.
2103``naked``
2104    This attribute disables prologue / epilogue emission for the
2105    function. This can have very system-specific consequences. The arguments of
2106    a ``naked`` function cannot be referenced through IR values.
2107``"no-inline-line-tables"``
2108    When this attribute is set to true, the inliner discards source locations
2109    when inlining code and instead uses the source location of the call site.
2110    Breakpoints set on code that was inlined into the current function will
2111    not fire during the execution of the inlined call sites. If the debugger
2112    stops inside an inlined call site, it will appear to be stopped at the
2113    outermost inlined call site.
2114``"no-jump-tables"``
2115    When this attribute is set to true, the jump tables and lookup tables that
2116    can be generated from a switch case lowering are disabled.
2117``nobuiltin``
2118    This indicates that the callee function at a call site is not recognized as
2119    a built-in function. LLVM will retain the original call and not replace it
2120    with equivalent code based on the semantics of the built-in function, unless
2121    the call site uses the ``builtin`` attribute. This is valid at call sites
2122    and on function declarations and definitions.
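
    A minimal sketch of how ``nobuiltin`` on a declaration interacts with
    ``builtin`` on a call site, using the mangled name of ``::operator new``
    (the ``i64`` parameter assumes a 64-bit target, and ``@alloc8`` is a
    hypothetical name):

    .. code-block:: llvm

        ; The declaration opts out of built-in treatment by default.
        declare ptr @_Znwm(i64) nobuiltin

        define ptr @alloc8() {
          ; This direct call opts back in, so it may again be recognized as a
          ; call to the built-in allocation function.
          %p = call ptr @_Znwm(i64 8) builtin
          ret ptr %p
        }
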
2123``nocallback``
2124    This attribute indicates that the function is only allowed to jump back into
2125    the caller's module by a return or an exception, and is not allowed to jump back
2126    by invoking a callback function, a direct, possibly transitive, external
2127    function call, use of ``longjmp``, or other means. It is a compiler hint that
2128    is used at module level to improve dataflow analysis, dropped during linking,
2129    and has no effect on functions defined in the current module.
2130``nodivergencesource``
2131    A call to this function is not a source of divergence. In uniformity
2132    analysis, a *source of divergence* is an instruction that generates
2133    divergence even if its inputs are uniform. A call with no further information
2134    would normally be considered a source of divergence; setting this attribute
2135    on a function means that a call to it is not a source of divergence.
2136``noduplicate``
2137    This attribute indicates that calls to the function cannot be
2138    duplicated. A call to a ``noduplicate`` function may be moved
2139    within its parent function, but may not be duplicated within
2140    its parent function.
2141
2142    A function containing a ``noduplicate`` call may still
2143    be an inlining candidate, provided that the call is not
2144    duplicated by inlining. That implies that the function has
2145    internal linkage and only has one call site, so the original
2146    call is dead after inlining.
2147``nofree``
2148    This function attribute indicates that the function does not, directly or
2149    transitively, call a memory-deallocation function (``free``, for example)
2150    on a memory allocation which existed before the call.
2151
2152    As a result, uncaptured pointers that are known to be dereferenceable
2153    prior to a call to a function with the ``nofree`` attribute are still
2154    known to be dereferenceable after the call. The capturing condition is
2155    necessary in environments where the function might communicate the
2156    pointer to another thread which then deallocates the memory.  Alternatively,
2157    ``nosync`` would ensure such communication cannot happen and even captured
2158    pointers cannot be freed by the function.
2159
2160    A ``nofree`` function is explicitly allowed to free memory which it
2161    allocated or (if not ``nosync``) arrange for another thread to free
2162    memory on its behalf.  As a result, perhaps surprisingly, a ``nofree``
2163    function can return a pointer to a previously deallocated memory object.
2164``noimplicitfloat``
2165    Disallows implicit floating-point code. This inhibits optimizations that
2166    use floating-point code and floating-point registers for operations that are
2167    not nominally floating-point. LLVM instructions that perform floating-point
2168    operations or require access to floating-point registers may still cause
2169    floating-point code to be generated.
2170
2171    Also inhibits optimizations that create SIMD/vector code and registers from
2172    scalar code such as vectorization or memcpy/memset optimization. This
2173    includes integer vectors. Vector instructions present in IR may still cause
2174    vector code to be generated.
2175``noinline``
2176    This attribute indicates that the inliner should never inline this
2177    function in any situation. This attribute may not be used together
2178    with the ``alwaysinline`` attribute.
2179``nomerge``
2180    This attribute indicates that calls to this function should never be merged
2181    during optimization. For example, it will prevent tail merging otherwise
2182    identical code sequences that raise an exception or terminate the program.
2183    Tail merging normally reduces the precision of source location information,
2184    making stack traces less useful for debugging. This attribute gives the
2185    user control over the tradeoff between code size and debug information
2186    precision.
2187``nonlazybind``
2188    This attribute suppresses lazy symbol binding for the function. This
2189    may make calls to the function faster, at the cost of extra program
2190    startup time if the function is not called during program startup.
2191``noprofile``
2192    This function attribute prevents instrumentation based profiling, used for
2193    coverage or profile based optimization, from being added to a function. It
2194    also blocks inlining if the caller and callee have different values of this
2195    attribute.
2196``skipprofile``
2197    This function attribute prevents instrumentation based profiling, used for
2198    coverage or profile based optimization, from being added to a function. This
2199    attribute does not restrict inlining, so instrumented instructions could end
2200    up in this function.
2201``noredzone``
2202    This attribute indicates that the code generator should not use a
2203    red zone, even if the target-specific ABI normally permits it.
2204``"indirect-tls-seg-refs"``
2205    This attribute indicates that the code generator should not use
2206    direct TLS access through segment registers, even if the
2207    target-specific ABI normally permits it.
2208``noreturn``
2209    This function attribute indicates that the function never returns
2210    normally, that is, through a return instruction. This produces undefined
2211    behavior at runtime if the function ever does dynamically return. Annotated
2212    functions may still raise an exception, i.e., ``nounwind`` is not implied.
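
    A typical pattern, sketched here with hypothetical names, is a call to a
    ``noreturn`` function followed by ``unreachable``:

    .. code-block:: llvm

        declare void @fatal(ptr) noreturn

        define void @check(i1 %ok, ptr %msg) {
        entry:
          br i1 %ok, label %done, label %fail

        fail:
          ; Control never comes back from this call through a return.
          call void @fatal(ptr %msg) noreturn
          unreachable

        done:
          ret void
        }
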
2213``norecurse``
2214    This function attribute indicates that the function does not call itself
2215    either directly or indirectly down any possible call path. This produces
2216    undefined behavior at runtime if the function ever does recurse.
2217
2218.. _langref_willreturn:
2219
2220``willreturn``
2221    This function attribute indicates that a call of this function will
2222    either exhibit undefined behavior or come back and continue execution
2223    at a point in the existing call stack that includes the current invocation.
2224    Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
2225    If an invocation of an annotated function does not return control back
2226    to a point in the call stack, the behavior is undefined.
2227``nosync``
2228    This function attribute indicates that the function does not communicate
2229    (synchronize) with another thread through memory or other well-defined means.
2230    Synchronization is considered possible in the presence of ``atomic`` accesses
2231    that enforce an order (that is, not "unordered" or "monotonic"), ``volatile``
2232    accesses, as well as ``convergent`` function calls.
2233
2234    Note that ``convergent`` operations can involve communication that is
2235    considered to be not through memory and does not necessarily imply an
2236    ordering between threads for the purposes of the memory model. Therefore,
2237    an operation can be both ``convergent`` and ``nosync``.
2238
2239    If a ``nosync`` function does ever synchronize with another thread,
2240    the behavior is undefined.
2241``nounwind``
2242    This function attribute indicates that the function never raises an
2243    exception. If the function does raise an exception, its runtime
2244    behavior is undefined. However, functions marked nounwind may still
2245    trap or generate asynchronous exceptions. Exception handling schemes
2246    that are recognized by LLVM to handle asynchronous exceptions, such
2247    as SEH, will still provide their implementation defined semantics.
2248``nosanitize_bounds``
2249    This attribute indicates that bounds checking sanitizer instrumentation
2250    is disabled for this function.
2251``nosanitize_coverage``
2252    This attribute indicates that SanitizerCoverage instrumentation is disabled
2253    for this function.
2254``null_pointer_is_valid``
2255    If ``null_pointer_is_valid`` is set, then the ``null`` address
2256    in address-space 0 is considered to be a valid address for memory loads and
2257    stores. Any analysis or optimization should not treat dereferencing a
2258    pointer to ``null`` as undefined behavior in this function.
2259    Note: Comparing the address of a global variable to ``null`` may still
2260    evaluate to false because of a limitation in querying this attribute inside
2261    constant expressions.
2262``optdebug``
2263    This attribute suggests that optimization passes and code generator passes
2264    should make choices that try to preserve debug info without significantly
2265    degrading runtime performance.
2266    This attribute is incompatible with the ``minsize``, ``optsize``, and
2267    ``optnone`` attributes.
2268``optforfuzzing``
2269    This attribute indicates that this function should be optimized
2270    for maximum fuzzing signal.
2271``optnone``
2272    This function attribute indicates that most optimization passes will skip
2273    this function, with the exception of interprocedural optimization passes.
2274    Code generation defaults to the "fast" instruction selector.
2275    This attribute cannot be used together with the ``alwaysinline``
2276    attribute; this attribute is also incompatible
2277    with the ``minsize``, ``optsize``, and ``optdebug`` attributes.
2278
2279    This attribute requires the ``noinline`` attribute to be specified on
2280    the function as well, so the function is never inlined into any caller.
2281    Only functions with the ``alwaysinline`` attribute are valid
2282    candidates for inlining into the body of this function.
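
    Because of this requirement, the two attributes appear together, as in
    this sketch (``@debug_me`` is a hypothetical name):

    .. code-block:: llvm

        ; optnone must be accompanied by noinline.
        define i32 @debug_me(i32 %x) noinline optnone {
          %r = add i32 %x, 1
          ret i32 %r
        }
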
2283``optsize``
2284    This attribute suggests that optimization passes and code generator
2285    passes make choices that keep the code size of this function low,
2286    and otherwise do optimizations specifically to reduce code size as
2287    long as they do not significantly impact runtime performance.
2288    This attribute is incompatible with the ``optdebug`` and ``optnone``
2289    attributes.
2290``"patchable-function"``
2291    This attribute tells the code generator that the code
2292    generated for this function needs to follow certain conventions that
2293    make it possible for a runtime function to patch over it later.
2294    The exact effect of this attribute depends on its string value,
2295    for which there currently is one legal possibility:
2296
2297     * ``"prologue-short-redirect"`` - This style of patchable
2298       function is intended to support patching a function prologue to
2299       redirect control away from the function in a thread safe
2300       manner.  It guarantees that the first instruction of the
2301       function will be large enough to accommodate a short jump
2302       instruction, and will be sufficiently aligned to allow being
2303       fully changed via an atomic compare-and-swap instruction.
2304       While the first requirement can be satisfied by inserting a large
2305       enough NOP, LLVM can and will try to re-purpose an existing
2306       instruction (i.e. one that would have to be emitted anyway) as
2307       the patchable instruction larger than a short jump.
2308
2309       ``"prologue-short-redirect"`` is currently only supported on
2310       x86-64.
2311
2312    This attribute by itself does not imply restrictions on
2313    inter-procedural optimizations.  All of the semantic effects the
2314    patching may have must be separately conveyed via the linkage type.
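
    In textual IR the attribute is written as a string key/value pair, as in
    this sketch (``@patch_me`` is a hypothetical name):

    .. code-block:: llvm

        define void @patch_me() "patchable-function"="prologue-short-redirect" {
          ret void
        }
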
2315``"probe-stack"``
2316    This attribute indicates that the function will trigger a guard region
2317    at the end of the stack. It ensures that accesses to the stack must be
2318    no further apart than the size of the guard region to a previous
2319    access of the stack. It takes one required string value, the name of
2320    the stack probing function that will be called.
2321
2322    If a function that has a ``"probe-stack"`` attribute is inlined into
2323    a function with another ``"probe-stack"`` attribute, the resulting
2324    function has the ``"probe-stack"`` attribute of the caller. If a
2325    function that has a ``"probe-stack"`` attribute is inlined into a
2326    function that has no ``"probe-stack"`` attribute at all, the resulting
2327    function has the ``"probe-stack"`` attribute of the callee.
2328``"stack-probe-size"``
2329    This attribute controls the behavior of stack probes: either
2330    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
2331    It defines the size of the guard region. It ensures that if the function
2332    may use more stack space than the size of the guard region, a stack probing
2333    sequence will be emitted. It takes one required integer value, which
2334    is 4096 by default.
2335
2336    If a function that has a ``"stack-probe-size"`` attribute is inlined into
2337    a function with another ``"stack-probe-size"`` attribute, the resulting
2338    function has the ``"stack-probe-size"`` attribute that has the lower
2339    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
2340    inlined into a function that has no ``"stack-probe-size"`` attribute
2341    at all, the resulting function has the ``"stack-probe-size"`` attribute
2342    of the callee.
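
    The following sketch combines both stack probe attributes. The probe
    function name ``__my_probestack`` and the other names are hypothetical, and
    the integer value of ``"stack-probe-size"`` is written as a string, like
    any other string attribute value.

    .. code-block:: llvm

        declare void @use(ptr)

        ; The frame (16384 bytes) is larger than the 8192-byte guard region,
        ; so a probing sequence using the named probe function is expected.
        define void @big_frame() "probe-stack"="__my_probestack" "stack-probe-size"="8192" {
          %buf = alloca [16384 x i8]
          call void @use(ptr %buf)
          ret void
        }
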
2343``"no-stack-arg-probe"``
2344    This attribute disables ABI-required stack probes, if any.
2345``returns_twice``
2346    This attribute indicates that this function can return twice. The C
2347    ``setjmp`` is an example of such a function. The compiler disables
2348    some optimizations (like tail calls) in the caller of these
2349    functions.
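
    A sketch of a ``setjmp`` declaration and a call to it. The signature is
    simplified for illustration, and ``@try_once`` is a hypothetical name.

    .. code-block:: llvm

        declare i32 @setjmp(ptr) returns_twice

        define i32 @try_once(ptr %env) {
          ; Control may come back from this call more than once.
          %rc = call i32 @setjmp(ptr %env) returns_twice
          ret i32 %rc
        }
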
2350``safestack``
2351    This attribute indicates that
2352    `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
2353    protection is enabled for this function.
2354
2355    If a function that has a ``safestack`` attribute is inlined into a
2356    function that doesn't have a ``safestack`` attribute or which has an
2357    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
2358    function will have a ``safestack`` attribute.
2359``sanitize_address``
2360    This attribute indicates that AddressSanitizer checks
2361    (dynamic address safety analysis) are enabled for this function.
2362``sanitize_memory``
2363    This attribute indicates that MemorySanitizer checks (dynamic detection
2364    of accesses to uninitialized memory) are enabled for this function.
2365``sanitize_thread``
2366    This attribute indicates that ThreadSanitizer checks
2367    (dynamic thread safety analysis) are enabled for this function.
2368``sanitize_hwaddress``
2369    This attribute indicates that HWAddressSanitizer checks
2370    (dynamic address safety analysis based on tagged pointers) are enabled for
2371    this function.
2372``sanitize_memtag``
2373    This attribute indicates that MemTagSanitizer checks
2374    (dynamic address safety analysis based on Armv8 MTE) are enabled for
2375    this function.
2376``sanitize_realtime``
2377    This attribute indicates that RealtimeSanitizer checks
2378    (realtime safety analysis - no allocations, syscalls or exceptions) are enabled
2379    for this function.
2380``sanitize_realtime_blocking``
2381    This attribute indicates that RealtimeSanitizer should error immediately
2382    if the attributed function is called during invocation of a function
2383    attributed with ``sanitize_realtime``.
2384    This attribute is incompatible with the ``sanitize_realtime`` attribute.
2385``speculative_load_hardening``
2386    This attribute indicates that
2387    `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
2388    should be enabled for the function body.
2389
2390    Speculative Load Hardening is a best-effort mitigation against
2391    information leak attacks that make use of control flow
2392    misspeculation - specifically misspeculation of whether a branch
2393    is taken or not. Typically vulnerabilities enabling such attacks are
2394    classified as "Spectre variant #1". Notably, this does not attempt to
2395    mitigate against misspeculation of branch target, classified as
2396    "Spectre variant #2" vulnerabilities.
2397
2398    When inlining, the attribute is sticky. Inlining a function that carries
2399    this attribute will cause the caller to gain the attribute. This is intended
2400    to provide a maximally conservative model where the code in a function
2401    annotated with this attribute will always (even after inlining) end up
2402    hardened.
2403``speculatable``
2404    This function attribute indicates that the function does not have any
2405    effects besides calculating its result and does not have undefined behavior.
2406    Note that ``speculatable`` is not enough to conclude that along any
2407    particular execution path the number of calls to this function will not be
2408    externally observable. This attribute is only valid on functions
2409    and declarations, not on individual call sites. If a function is
2410    incorrectly marked as speculatable and really does exhibit
2411    undefined behavior, the undefined behavior may be observed even
2412    if the call site is dead code.
2413
2414``ssp``
2415    This attribute indicates that the function should emit a stack
2416    smashing protector. It is in the form of a "canary" --- a random value
2417    placed on the stack before the local variables that's checked upon
2418    return from the function to see if it has been overwritten. A
2419    heuristic is used to determine if a function needs stack protectors
2420    or not. The heuristic used will enable protectors for functions with:
2421
2422    - Character arrays larger than ``ssp-buffer-size`` (default 8).
2423    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
2424    - Calls to alloca() with variable sizes or constant sizes greater than
2425      ``ssp-buffer-size``.
2426
2427    Variables that are identified as requiring a protector will be arranged
2428    on the stack such that they are adjacent to the stack protector guard.
2429
2430    If a function with an ``ssp`` attribute is inlined into a calling function,
2431    the attribute is not carried over to the calling function.
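
    To make the heuristic concrete, the sketch below contrasts two ``ssp``
    functions, assuming the default ``ssp-buffer-size`` of 8. The function
    names are hypothetical.

    .. code-block:: llvm

        declare void @use(ptr)

        ; Contains a 16-byte character array, which is larger than the
        ; default ssp-buffer-size, so a protector is expected.
        define void @has_buffer() ssp {
          %buf = alloca [16 x i8]
          call void @use(ptr %buf)
          ret void
        }

        ; Contains no arrays and no alloca, so the heuristic is not expected
        ; to add a protector even though the attribute is present.
        define i32 @no_buffer(i32 %x) ssp {
          %r = add i32 %x, 1
          ret i32 %r
        }
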
2432
2433``sspstrong``
2434    This attribute indicates that the function should emit a stack smashing
2435    protector. This attribute causes a strong heuristic to be used when
2436    determining if a function needs stack protectors. The strong heuristic
2437    will enable protectors for functions with:
2438
2439    - Arrays of any size and type
2440    - Aggregates containing an array of any size and type.
2441    - Calls to alloca().
2442    - Local variables that have had their address taken.
2443
2444    Variables that are identified as requiring a protector will be arranged
2445    on the stack such that they are adjacent to the stack protector guard.
2446    The specific layout rules are:
2447
2448    #. Large arrays and structures containing large arrays
2449       (``>= ssp-buffer-size``) are closest to the stack protector.
2450    #. Small arrays and structures containing small arrays
2451       (``< ssp-buffer-size``) are 2nd closest to the protector.
2452    #. Variables that have had their address taken are 3rd closest to the
2453       protector.
2454
2455    This overrides the ``ssp`` function attribute.
2456
2457    If a function with an ``sspstrong`` attribute is inlined into a calling
2458    function which has an ``ssp`` attribute, the calling function's attribute
2459    will be upgraded to ``sspstrong``.
2460
2461``sspreq``
2462    This attribute indicates that the function should *always* emit a stack
2463    smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2464    attributes.
2465
2466    Variables that are identified as requiring a protector will be arranged
2467    on the stack such that they are adjacent to the stack protector guard.
2468    The specific layout rules are:
2469
2470    #. Large arrays and structures containing large arrays
2471       (``>= ssp-buffer-size``) are closest to the stack protector.
2472    #. Small arrays and structures containing small arrays
2473       (``< ssp-buffer-size``) are 2nd closest to the protector.
2474    #. Variables that have had their address taken are 3rd closest to the
2475       protector.
2476
2477    If a function with an ``sspreq`` attribute is inlined into a calling
2478    function which has an ``ssp`` or ``sspstrong`` attribute, the calling
2479    function's attribute will be upgraded to ``sspreq``.
2480
2481.. _strictfp:
2482
2483``strictfp``
2484    This attribute indicates that the function was called from a scope that
2485    requires strict floating-point semantics.  LLVM will not attempt any
2486    optimizations that require assumptions about the floating-point rounding
2487    mode or that might alter the state of floating-point status flags that
2488    might otherwise be set or cleared by calling this function. LLVM will
2489    not introduce any new floating-point instructions that may trap.
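
    A minimal sketch of a ``strictfp`` function, assuming the frontend lowers
    floating-point arithmetic to the constrained intrinsics (the function name
    ``@add_strict`` is illustrative only):

    .. code-block:: llvm

        define double @add_strict(double %a, double %b) strictfp {
          ; The rounding mode is taken from the dynamic environment and
          ; floating-point exceptions are treated as observable.
          %sum = call double @llvm.experimental.constrained.fadd.f64(
                     double %a, double %b,
                     metadata !"round.dynamic",
                     metadata !"fpexcept.strict") strictfp
          ret double %sum
        }

        declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)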
2490
2491.. _denormal_fp_math:
2492
2493``"denormal-fp-math"``
2494    This indicates the denormal (subnormal) handling that may be
2495    assumed for the default floating-point environment. This is a
2496    comma separated pair. The elements may be one of ``"ieee"``,
2497    ``"preserve-sign"``, ``"positive-zero"``, or ``"dynamic"``. The
2498    first entry indicates the flushing mode for the result of floating
2499    point operations. The second indicates the handling of denormal inputs
2500    to floating point instructions. For compatibility with older
2501    bitcode, if the second value is omitted, both input and output
2502    modes will assume the same mode.
2503
    If this attribute is not specified, the default is ``"ieee,ieee"``.
2505
2506    If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2507    denormal outputs may be flushed to zero by standard floating-point
2508    operations. It is not mandated that flushing to zero occurs, but if
2509    a denormal output is flushed to zero, it must respect the sign
2510    mode. Not all targets support all modes.
2511
2512    If the mode is ``"dynamic"``, the behavior is derived from the
2513    dynamic state of the floating-point environment. Transformations
2514    which depend on the behavior of denormal values should not be
2515    performed.
2516
2517    While this indicates the expected floating point mode the function
2518    will be executed with, this does not make any attempt to ensure
2519    the mode is consistent. User or platform code is expected to set
2520    the floating point mode appropriately before function entry.
2521
2522    If the input mode is ``"preserve-sign"``, or ``"positive-zero"``,
2523    a floating-point operation must treat any input denormal value as
2524    zero. In some situations, if an instruction does not respect this
2525    mode, the input may need to be converted to 0 as if by
2526    ``@llvm.canonicalize`` during lowering for correctness.
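
    For illustration, the following sketch (function names are hypothetical)
    marks one function whose denormal results may be flushed to a
    sign-preserving zero while denormal inputs keep IEEE handling, and a second
    function that uses the single-value compatibility form:

    .. code-block:: llvm

        define float @scale(float %x) #0 {
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }

        define float @scale_ftz(float %x) #1 {
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }

        ; First entry: output mode. Second entry: input mode.
        attributes #0 = { "denormal-fp-math"="preserve-sign,ieee" }
        ; Single value: output and input modes are both preserve-sign.
        attributes #1 = { "denormal-fp-math"="preserve-sign" }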
2527
2528``"denormal-fp-math-f32"``
2529    Same as ``"denormal-fp-math"``, but only controls the behavior of
    the 32-bit float type (or vectors of 32-bit floats). If both are
    present, this overrides ``"denormal-fp-math"``. Not all targets
2532    support separately setting the denormal mode per type, and no
2533    attempt is made to diagnose unsupported uses. Currently this
2534    attribute is respected by the AMDGPU and NVPTX backends.
2535
2536``"thunk"``
2537    This attribute indicates that the function will delegate to some other
2538    function with a tail call. The prototype of a thunk should not be used for
2539    optimization purposes. The caller is expected to cast the thunk prototype to
2540    match the thunk target prototype.
2541``uwtable[(sync|async)]``
2542    This attribute indicates that the ABI being targeted requires that
2543    an unwind table entry be produced for this function even if we can
    show that no exception passes through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
2546    units. The optional parameter describes what kind of unwind tables
2547    to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous
2548    (instruction precise) unwind tables. Without the parameter, the attribute
2549    ``uwtable`` is equivalent to ``uwtable(async)``.
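
    As a brief sketch (function names are hypothetical), the two explicit
    forms and the parameterless form can be written as:

    .. code-block:: llvm

        define void @sync_tables() uwtable(sync) {
          ret void
        }

        define void @async_tables() uwtable(async) {
          ret void
        }

        ; Without a parameter, equivalent to uwtable(async).
        define void @default_tables() uwtable {
          ret void
        }

    All three functions request an unwind table entry; only ``@sync_tables``
    asks for the normal (synchronous) kind.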
2550``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity, allowing fine-grained control over the hardware control-flow
    protection mechanism. The flag is target independent and currently applies
    to a function or function pointer.
2556``shadowcallstack``
2557    This attribute indicates that the ShadowCallStack checks are enabled for
2558    the function. The instrumentation checks that the return address for the
2559    function has not changed between the function prolog and epilog. It is
2560    currently x86_64-specific.
2561
2562.. _langref_mustprogress:
2563
2564``mustprogress``
2565    This attribute indicates that the function is required to return, unwind,
2566    or interact with the environment in an observable way e.g. via a volatile
2567    memory access, I/O, or other synchronization.  The ``mustprogress``
2568    attribute is intended to model the requirements of the first section of
2569    [intro.progress] of the C++ Standard. As a consequence, a loop in a
2570    function with the ``mustprogress`` attribute can be assumed to terminate if
2571    it does not interact with the environment in an observable way, and
2572    terminating loops without side-effects can be removed. If a ``mustprogress``
2573    function does not satisfy this contract, the behavior is undefined. If a
2574    ``mustprogress`` function calls a function not marked ``mustprogress``,
2575    and that function never returns, the program is well-defined even if there
2576    isn't any other observable progress.  Note that ``willreturn`` implies
2577    ``mustprogress``.
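
    As a sketch of the consequence described above (the function name is
    hypothetical), the loop below has no side effects and either exits or spins
    forever depending on ``%c``:

    .. code-block:: llvm

        define void @wait(i1 %c) mustprogress {
        entry:
          br label %loop

        loop:
          ; No memory access, call, or other observable interaction here.
          br i1 %c, label %loop, label %exit

        exit:
          ret void
        }

    Because spinning forever would violate the ``mustprogress`` contract, an
    optimizer may assume the loop exits and reduce the body to ``ret void``.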
2578``"warn-stack-size"="<threshold>"``
2579    This attribute sets a threshold to emit diagnostics once the frame size is
2580    known should the frame size exceed the specified value.  It takes one
2581    required integer value, which should be a non-negative integer, and less
2582    than `UINT_MAX`.  It's unspecified which threshold will be used when
2583    duplicate definitions are linked together with differing values.
2584``vscale_range(<min>[, <max>])``
2585    This function attribute indicates `vscale` is a power-of-two within a
2586    specified range. `min` must be a power-of-two that is greater than 0. When
2587    specified, `max` must be a power-of-two greater-than-or-equal to `min` or 0
2588    to signify an unbounded maximum. The syntax `vscale_range(<val>)` can be
2589    used to set both `min` and `max` to the same value. Functions that don't
2590    include this attribute make no assumptions about the value of `vscale`.
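
    A small sketch (function names are hypothetical) of the accepted
    spellings:

    .. code-block:: llvm

        ; vscale is a power of two in the range [1, 16].
        define void @bounded() vscale_range(1,16) {
          ret void
        }

        ; min and max are both 2, so vscale is exactly 2.
        define void @fixed() vscale_range(2) {
          ret void
        }

        ; A max of 0 leaves the upper bound open.
        define void @unbounded() vscale_range(2,0) {
          ret void
        }

    In ``@bounded``, for example, a ``<vscale x 4 x i32>`` value can be assumed
    to hold at most 64 elements.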
2591``"nooutline"``
2592    This attribute indicates that outlining passes should not modify the
2593    function.
2594
2595Call Site Attributes
--------------------
2597
2598In addition to function attributes the following call site only
2599attributes are supported:
2600
2601``vector-function-abi-variant``
2602    This attribute can be attached to a :ref:`call <i_call>` to list
2603    the vector functions associated to the function. Notice that the
2604    attribute cannot be attached to a :ref:`invoke <i_invoke>` or a
2605    :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2606    comma separated list of mangled names. The order of the list does
2607    not imply preference (it is logically a set). The compiler is free
2608    to pick any listed vector function of its choosing.
2609
    The syntax for the mangled names is as follows::
2611
2612        _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2613
2614    When present, the attribute informs the compiler that the function
2615    ``<scalar_name>`` has a corresponding vector variant that can be
2616    used to perform the concurrent invocation of ``<scalar_name>`` on
2617    vectors. The shape of the vector function is described by the
2618    tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2619    token. The standard name of the vector function is
2620    ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2621    the optional token ``(<vector_redirection>)`` informs the compiler
2622    that a custom name is provided in addition to the standard one
2623    (custom names can be provided for example via the use of ``declare
2624    variant`` in OpenMP 5.0). The declaration of the variant must be
2625    present in the IR Module. The signature of the vector variant is
2626    determined by the rules of the Vector Function ABI (VFABI)
2627    specifications of the target. For Arm and X86, the VFABI can be
2628    found at https://github.com/ARM-software/abi-aa and
2629    https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2630    respectively.
2631
2632    For X86 and Arm targets, the values of the tokens in the standard
2633    name are those that are defined in the VFABI. LLVM has an internal
2634    ``<isa>`` token that can be used to create scalar-to-vector
2635    mappings for functions that are not directly associated to any of
2636    the target ISAs (for example, some of the mappings stored in the
    TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2638
2639        <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
2640              | n | s          -> Armv8 Advanced SIMD, SVE
2641              | __LLVM__       -> Internal LLVM Vector ISA
2642
2643    For all targets currently supported (x86, Arm and Internal LLVM),
    the remaining tokens can have the following values::
2645
2646        <mask>:= M | N         -> mask | no mask
2647
2648        <vlen>:= number        -> number of lanes
2649               | x             -> VLA (Vector Length Agnostic)
2650
2651        <parameters>:= v              -> vector
2652                     | l | l <number> -> linear
2653                     | R | R <number> -> linear with ref modifier
2654                     | L | L <number> -> linear with val modifier
2655                     | U | U <number> -> linear with uval modifier
2656                     | ls <pos>       -> runtime linear
2657                     | Rs <pos>       -> runtime linear with ref modifier
2658                     | Ls <pos>       -> runtime linear with val modifier
2659                     | Us <pos>       -> runtime linear with uval modifier
2660                     | u              -> uniform
2661
2662        <scalar_name>:= name of the scalar function
2663
2664        <vector_redirection>:= optional, custom name of the vector function
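
    As an illustration, the sketch below attaches one mapping to a call. The
    names ``@foo`` and ``@vector_foo`` are hypothetical; the mangled name
    decodes as ``n`` (Armv8 Advanced SIMD), ``N`` (no mask), ``2`` (two lanes)
    and ``v`` (one vector parameter):

    .. code-block:: llvm

        define double @caller(double %x) {
          %r = call double @foo(double %x) #0
          ret double %r
        }

        declare double @foo(double)

        ; The variant named by <vector_redirection> must be declared in the module.
        declare <2 x double> @vector_foo(<2 x double>)

        attributes #0 = { "vector-function-abi-variant"="_ZGVnN2v_foo(vector_foo)" }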
2665
2666``preallocated(<ty>)``
2667    This attribute is required on calls to ``llvm.call.preallocated.arg``
2668    and cannot be used on any other call. See
2669    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2670    details.
2671
2672.. _glattrs:
2673
2674Global Attributes
2675-----------------
2676
2677Attributes may be set to communicate additional information about a global variable.
2678Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2679are grouped into a single :ref:`attribute group <attrgrp>`.
2680
2681``no_sanitize_address``
2682    This attribute indicates that the global variable should not have
2683    AddressSanitizer instrumentation applied to it, because it was annotated
2684    with `__attribute__((no_sanitize("address")))`,
2685    `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2686    `-fsanitize-ignorelist` file.
2687``no_sanitize_hwaddress``
2688    This attribute indicates that the global variable should not have
2689    HWAddressSanitizer instrumentation applied to it, because it was annotated
2690    with `__attribute__((no_sanitize("hwaddress")))`,
2691    `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2692    `-fsanitize-ignorelist` file.
2693``sanitize_memtag``
2694    This attribute indicates that the global variable should have AArch64 memory
2695    tags (MTE) instrumentation applied to it. This attribute causes the
2696    suppression of certain optimizations, like GlobalMerge, as well as ensuring
2697    extra directives are emitted in the assembly and extra bits of metadata are
2698    placed in the object file so that the linker can ensure the accesses are
2699    protected by MTE. This attribute is added by clang when
2700    `-fsanitize=memtag-globals` is provided, as long as the global is not marked
2701    with `__attribute__((no_sanitize("memtag")))`,
2702    `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2703    `-fsanitize-ignorelist` file. The AArch64 Globals Tagging pass may remove
2704    this attribute when it's not possible to tag the global (e.g. it's a TLS
2705    variable).
2706``sanitize_address_dyninit``
2707    This attribute indicates that the global variable, when instrumented with
2708    AddressSanitizer, should be checked for ODR violations. This attribute is
2709    applied to global variables that are dynamically initialized according to
2710    C++ rules.
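
As a rough sketch (the global names are hypothetical), these attributes are
written after the initializer of a global variable:

.. code-block:: llvm

    @plain_buffer = global [16 x i8] zeroinitializer, no_sanitize_address
    @hw_buffer = global [16 x i8] zeroinitializer, no_sanitize_hwaddress
    @tagged_counter = global i32 0, sanitize_memtag
    @dyn_init = global i32 0, sanitize_address_dyninit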
2711
2712.. _opbundles:
2713
2714Operand Bundles
2715---------------
2716
2717Operand bundles are tagged sets of SSA values or metadata strings that can be
2718associated with certain LLVM instructions (currently only ``call`` s and
2719``invoke`` s).  In a way they are like metadata, but dropping them is
2720incorrect and will change program semantics.
2721
2722Syntax::
2723
2724    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2725    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2726    bundle operand ::= SSA value | metadata string
2727    tag ::= string constant
2728
2729Operand bundles are **not** part of a function's signature, and a
2730given function may be called from multiple places with different kinds
2731of operand bundles.  This reflects the fact that the operand bundles
2732are conceptually a part of the ``call`` (or ``invoke``), not the
2733callee being dispatched to.
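
For instance, in the following sketch the same callee is called with no
bundle, with one bundle, and with two bundles. The tags ``"tag-a"`` and
``"tag-b"`` are made up for illustration and have no semantics known to LLVM:

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %x, ptr %p) {
      call void @callee(i32 %x)
      call void @callee(i32 %x) [ "tag-a"(i32 1, ptr %p) ]
      call void @callee(i32 %x) [ "tag-a"(i32 2), "tag-b"() ]
      ret void
    }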
2734
2735Operand bundles are a generic mechanism intended to support
2736runtime-introspection-like functionality for managed languages.  While
2737the exact semantics of an operand bundle depend on the bundle tag,
2738there are certain limitations to how much the presence of an operand
2739bundle can influence the semantics of a program.  These restrictions
2740are described as the semantics of an "unknown" operand bundle.  As
2741long as the behavior of an operand bundle is describable within these
2742restrictions, LLVM does not need to have special knowledge of the
2743operand bundle to not miscompile programs containing it.
2744
2745- The bundle operands for an unknown operand bundle escape in unknown
2746  ways before control is transferred to the callee or invokee.
2747- Calls and invokes with operand bundles have unknown read / write
2748  effect on the heap on entry and exit (even if the call target specifies
2749  a ``memory`` attribute), unless they're overridden with
2750  callsite specific attributes.
2751- An operand bundle at a call site cannot change the implementation
2752  of the called function.  Inter-procedural optimizations work as
2753  usual as long as they take into account the first two properties.
2754
2755More specific types of operand bundles are described below.
2756
2757.. _deopt_opbundles:
2758
2759Deoptimization Operand Bundles
2760^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2761
2762Deoptimization operand bundles are characterized by the ``"deopt"``
2763operand bundle tag.  These operand bundles represent an alternate
2764"safe" continuation for the call site they're attached to, and can be
2765used by a suitable runtime to deoptimize the compiled frame at the
2766specified call site.  There can be at most one ``"deopt"`` operand
bundle attached to a call site.  Exact details of deoptimization are
2768out of scope for the language reference, but it usually involves
2769rewriting a compiled frame into a set of interpreted frames.
2770
2771From the compiler's perspective, deoptimization operand bundles make
2772the call sites they're attached to at least ``readonly``.  They read
2773through all of their pointer typed operands (even if they're not
2774otherwise escaped) and the entire visible heap.  Deoptimization
2775operand bundles do not capture their operands except during
2776deoptimization, in which case control will not be returned to the
2777compiled frame.
2778
2779The inliner knows how to inline through calls that have deoptimization
2780operand bundles.  Just like inlining through a normal call site
2781involves composing the normal and exceptional continuations, inlining
2782through a call site with a deoptimization operand bundle needs to
2783appropriately compose the "safe" deoptimization continuation.  The
2784inliner does this by prepending the parent's deoptimization
2785continuation to every deoptimization continuation in the inlined body.
2786E.g. inlining ``@f`` into ``@g`` in the following example
2787
2788.. code-block:: llvm
2789
2790    define void @f() {
2791      call void @x()  ;; no deopt state
2792      call void @y() [ "deopt"(i32 10) ]
2793      call void @y() [ "deopt"(i32 10), "unknown"(ptr null) ]
2794      ret void
2795    }
2796
2797    define void @g() {
2798      call void @f() [ "deopt"(i32 20) ]
2799      ret void
2800    }
2801
2802will result in
2803
2804.. code-block:: llvm
2805
2806    define void @g() {
2807      call void @x()  ;; still no deopt state
2808      call void @y() [ "deopt"(i32 20, i32 10) ]
2809      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(ptr null) ]
2810      ret void
2811    }
2812
2813It is the frontend's responsibility to structure or encode the
2814deoptimization state in a way that syntactically prepending the
2815caller's deoptimization state to the callee's deoptimization state is
2816semantically equivalent to composing the caller's deoptimization
2817continuation after the callee's deoptimization continuation.
2818
2819.. _ob_funclet:
2820
2821Funclet Operand Bundles
2822^^^^^^^^^^^^^^^^^^^^^^^
2823
2824Funclet operand bundles are characterized by the ``"funclet"``
2825operand bundle tag.  These operand bundles indicate that a call site
2826is within a particular funclet.  There can be at most one
2827``"funclet"`` operand bundle attached to a call site and it must have
2828exactly one bundle operand.
2829
2830If any funclet EH pads have been "entered" but not "exited" (per the
2831`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2832it is undefined behavior to execute a ``call`` or ``invoke`` which:
2833
2834* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2835  intrinsic, or
2836* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2837  not-yet-exited funclet EH pad.
2838
2839Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2840executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
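
The sketch below shows the common case of a call inside a cleanup funclet. The
function names ``@may_throw`` and ``@dtor`` are hypothetical, and the MSVC C++
personality is assumed only to make the example funclet-based:

.. code-block:: llvm

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %exit unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; The bundle operand names the most recently entered funclet pad.
      call void @dtor() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    exit:
      ret void
    }

    declare void @may_throw()
    declare void @dtor()
    declare i32 @__CxxFrameHandler3(...)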
2841
2842GC Transition Operand Bundles
2843^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2844
2845GC transition operand bundles are characterized by the
2846``"gc-transition"`` operand bundle tag. These operand bundles mark a
2847call as a transition between a function with one GC strategy to a
2848function with a different GC strategy. If coordinating the transition
2849between GC strategies requires additional code generation at the call
2850site, these bundles may contain any values that are needed by the
2851generated code.  For more details, see :ref:`GC Transitions
2852<gc_transition_args>`.
2853
The bundle contains an arbitrary list of Values which need to be passed
2855to GC transition code. They will be lowered and passed as operands to
2856the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2857that these arguments must be available before and after (but not
2858necessarily during) the execution of the callee.
2859
2860.. _assume_opbundles:
2861
2862Assume Operand Bundles
2863^^^^^^^^^^^^^^^^^^^^^^
2864
Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2866assumptions, such as that a :ref:`parameter attribute <paramattrs>` or a
2867:ref:`function attribute <fnattrs>` holds for a certain value at a certain
2868location. Operand bundles enable assumptions that are either hard or impossible
2869to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2870
2871An assume operand bundle has the form:
2872
2873::
2874
      "<tag>"([ <arguments> ])
2876
2877In the case of function or parameter attributes, the operand bundle has the
2878restricted form:
2879
2880::
2881
2882      "<tag>"([ <holds for value> [, <attribute argument>] ])
2883
* The tag of the operand bundle is usually the name of the attribute that can
  be assumed to hold. It can also be `ignore`; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.
2889
2890If there are no arguments the attribute is a property of the call location.
2891
2892For example:
2893
2894.. code-block:: llvm
2895
2896      call void @llvm.assume(i1 true) ["align"(ptr %val, i32 8)]
2897
allows the optimizer to assume that, at the location of the call to
:ref:`llvm.assume <int_assume>`, ``%val`` has an alignment of at least 8.
2900
2901.. code-block:: llvm
2902
2903      call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(ptr %val)]
2904
2905allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2906call location is cold and that ``%val`` may not be null.
2907
2908Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2909provided guarantees are violated at runtime the behavior is undefined.
2910
2911While attributes expect constant arguments, assume operand bundles may be
2912provided a dynamic value, for example:
2913
2914.. code-block:: llvm
2915
2916      call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)]
2917
2918If the operand bundle value violates any requirements on the attribute value,
2919the behavior is undefined, unless one of the following exceptions applies:
2920
2921* ``"align"`` operand bundles may specify a non-power-of-two alignment
2922  (including a zero alignment). If this is the case, then the pointer value
2923  must be a null pointer, otherwise the behavior is undefined.
2924
2925In addition to allowing operand bundles encoding function and parameter
attributes, an assume operand bundle may also encode a ``separate_storage``
2927operand bundle. This has the form:
2928
2929.. code-block:: llvm
2930
    separate_storage(<val1>, <val2>)
2932
2933This indicates that no pointer :ref:`based <pointeraliasing>` on one of its
2934arguments can alias any pointer based on the other.
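
For example, the following sketch (with hypothetical pointers ``%a`` and
``%b``) lets the optimizer assume that accesses derived from ``%a`` never
overlap accesses derived from ``%b``:

.. code-block:: llvm

      call void @llvm.assume(i1 true) ["separate_storage"(ptr %a, ptr %b)]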
2935
2936Even if the assumed property can be encoded as a boolean value, like
2937``nonnull``, using operand bundles to express the property can still have
2938benefits:
2939
2940* Attributes that can be expressed via operand bundles are directly the
2941  property that the optimizer uses and cares about. Encoding attributes as
2942  operand bundles removes the need for an instruction sequence that represents
2943  the property (e.g., `icmp ne ptr %p, null` for `nonnull`) and for the
2944  optimizer to deduce the property from that instruction sequence.
2945* Expressing the property using operand bundles makes it easy to identify the
2946  use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
  simplifies and improves heuristics, e.g., for "use-sensitive"
2948  optimizations.
2949
2950.. _ob_preallocated:
2951
2952Preallocated Operand Bundles
2953^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2954
2955Preallocated operand bundles are characterized by the ``"preallocated"``
2956operand bundle tag.  These operand bundles allow separation of the allocation
2957of the call argument memory from the call site.  This is necessary to pass
2958non-trivially copyable objects by value in a way that is compatible with MSVC
2959on some targets.  There can be at most one ``"preallocated"`` operand bundle
2960attached to a call site and it must have exactly one bundle operand, which is
2961a token generated by ``@llvm.call.preallocated.setup``.  A call with this
2962operand bundle should not adjust the stack before entering the function, as
2963that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2964
2965.. code-block:: llvm
2966
2967      %foo = type { i64, i32 }
2968
2969      ...
2970
2971      %t = call token @llvm.call.preallocated.setup(i32 1)
2972      %a = call ptr @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
      ; initialize %a
2974      call void @bar(i32 42, ptr preallocated(%foo) %a) ["preallocated"(token %t)]
2975
2976.. _ob_gc_live:
2977
2978GC Live Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^
2980
2981A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2982intrinsic. The operand bundle must contain every pointer to a garbage collected
2983object which potentially needs to be updated by the garbage collector.
2984
2985When lowered, any relocated value will be recorded in the corresponding
2986:ref:`stackmap entry <statepoint-stackmap-format>`.  See the intrinsic description
2987for further details.
2988
2989ObjC ARC Attached Call Operand Bundles
2990^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2991
2992A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2993implicitly followed by a marker instruction and a call to an ObjC runtime
2994function that uses the result of the call. The operand bundle takes a mandatory
2995pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2996``@objc_unsafeClaimAutoreleasedReturnValue``).
2997The return value of a call with this bundle is used by a call to
2998``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2999void, in which case the operand bundle is ignored.
3000
3001.. code-block:: llvm
3002
3003   ; The marker instruction and a runtime function call are inserted after the call
3004   ; to @foo.
3005   call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_retainAutoreleasedReturnValue) ]
3006   call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_unsafeClaimAutoreleasedReturnValue) ]
3007
3008The operand bundle is needed to ensure the call is immediately followed by the
3009marker instruction and the ObjC runtime call in the final output.
3010
3011.. _ob_ptrauth:
3012
3013Pointer Authentication Operand Bundles
3014^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3015
3016Pointer Authentication operand bundles are characterized by the
3017``"ptrauth"`` operand bundle tag.  They are described in the
3018`Pointer Authentication <PointerAuth.html#operand-bundle>`__ document.
3019
3020.. _ob_kcfi:
3021
3022KCFI Operand Bundles
3023^^^^^^^^^^^^^^^^^^^^
3024
3025A ``"kcfi"`` operand bundle on an indirect call indicates that the call will
3026be preceded by a runtime type check, which validates that the call target is
3027prefixed with a :ref:`type identifier<md_kcfi_type>` that matches the operand
3028bundle attribute. For example:
3029
3030.. code-block:: llvm
3031
3032      call void %0() ["kcfi"(i32 1234)]
3033
3034Clang emits KCFI operand bundles and the necessary metadata with
3035``-fsanitize=kcfi``.
3036
3037.. _convergencectrl:
3038
3039Convergence Control Operand Bundles
3040^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3041
3042A "convergencectrl" operand bundle is only valid on a ``convergent`` operation.
3043When present, the operand bundle must contain exactly one value of token type.
3044See the :doc:`ConvergentOperations` document for details.
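
A minimal sketch, assuming the experimental convergence control intrinsics
described in that document; the callee ``@barrier_like`` is hypothetical:

.. code-block:: llvm

    define void @kernel() convergent {
    entry:
      ; Obtain a token for the set of threads that entered the function.
      %tok = call token @llvm.experimental.convergence.entry()
      ; Tie the convergent call to that token.
      call void @barrier_like() [ "convergencectrl"(token %tok) ]
      ret void
    }

    declare void @barrier_like() convergent
    declare token @llvm.experimental.convergence.entry()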
3045
3046.. _moduleasm:
3047
3048Module-Level Inline Assembly
3049----------------------------
3050
Modules may contain "module-level inline asm" blocks, which correspond
3052to the GCC "file scope inline asm" blocks. These blocks are internally
3053concatenated by LLVM and treated as a single unit, but may be separated
3054in the ``.ll`` file if desired. The syntax is very simple:
3055
3056.. code-block:: llvm
3057
3058    module asm "inline asm code goes here"
3059    module asm "more can go here"
3060
3061The strings can contain any character by escaping non-printable
3062characters. The escape sequence used is simply "\\xx" where "xx" is the
3063two digit hex code for the number.
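
As a small sketch, the string below uses the ``\09`` (tab) and ``\0A``
(newline) escapes to embed a two-line assembly fragment. The ``.globl``
directive and the symbol name ``my_symbol`` are assumptions chosen for
illustration; which directives are accepted depends on the target assembler:

.. code-block:: llvm

    module asm "\09.globl my_symbol\0Amy_symbol:"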
3064
3065Note that the assembly string *must* be parseable by LLVM's integrated assembler
3066(unless it is disabled), even when emitting a ``.s`` file.
3067
3068.. _langref_datalayout:
3069
3070Data Layout
3071-----------
3072
3073A module may specify a target specific data layout string that specifies
3074how data is to be laid out in memory. The syntax for the data layout is
3075simply:
3076
3077.. code-block:: llvm
3078
3079    target datalayout = "layout specification"
3080
3081The *layout specification* consists of a list of specifications
3082separated by the minus sign character ('-'). Each specification starts
3083with a letter and may include other information after the letter to
3084define some aspect of the data layout. The specifications accepted are
3085as follows:
3086
3087``E``
3088    Specifies that the target lays out data in big-endian form. That is,
3089    the bits with the most significance have the lowest address
3090    location.
3091``e``
3092    Specifies that the target lays out data in little-endian form. That
3093    is, the bits with the least significance have the lowest address
3094    location.
3095``S<size>``
3096    Specifies the natural alignment of the stack in bits. Alignment
3097    promotion of stack variables is limited to the natural stack
3098    alignment to avoid dynamic stack realignment. The stack alignment
3099    must be a multiple of 8-bits. If omitted, the natural stack
3100    alignment defaults to "unspecified", which does not prevent any
3101    alignment promotions.
3102``P<address space>``
3103    Specifies the address space that corresponds to program memory.
3104    Harvard architectures can use this to specify what space LLVM
3105    should place things such as functions into. If omitted, the
3106    program memory space defaults to the default address space of 0,
3107    which corresponds to a Von Neumann architecture that has code
3108    and data in the same space.
3109``G<address space>``
3110    Specifies the address space to be used by default when creating global
3111    variables. If omitted, the globals address space defaults to the default
3112    address space 0.
3113    Note: variable declarations without an address space are always created in
    address space 0; this property only affects the default value to be used
3115    when creating globals without additional contextual information (e.g. in
3116    LLVM passes).
3117
3118.. _alloca_addrspace:
3119
3120``A<address space>``
3121    Specifies the address space of objects created by '``alloca``'.
3122    Defaults to the default address space of 0.
3123``p[n]:<size>:<abi>[:<pref>][:<idx>]``
3124    This specifies the *size* of a pointer and its ``<abi>`` and
3125    ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
3126    and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
    index that is used for address calculation, which must be less than or
    equal to the pointer size. If not specified, the default index size is
    equal to the pointer size. All sizes are in bits. The address space, ``n``,
    is optional, and if not specified, denotes the default address space 0.
    The value of ``n`` must be in the range [1,2^24).
3133``i<size>:<abi>[:<pref>]``
3134    This specifies the alignment for an integer type of a given bit
3135    ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
3136    ``<pref>`` is optional and defaults to ``<abi>``.
3137    For ``i8``, the ``<abi>`` value must equal 8,
3138    that is, ``i8`` must be naturally aligned.
3139``v<size>:<abi>[:<pref>]``
3140    This specifies the alignment for a vector type of a given bit
3141    ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
3142    ``<pref>`` is optional and defaults to ``<abi>``.
3143``f<size>:<abi>[:<pref>]``
3144    This specifies the alignment for a floating-point type of a given bit
3145    ``<size>``. Only values of ``<size>`` that are supported by the target
3146    will work. 32 (float) and 64 (double) are supported on all targets; 80
3147    or 128 (different flavors of long double) are also supported on some
3148    targets. The value of ``<size>`` must be in the range [1,2^24).
3149    ``<pref>`` is optional and defaults to ``<abi>``.
3150``a:<abi>[:<pref>]``
3151    This specifies the alignment for an object of aggregate type.
3152    ``<pref>`` is optional and defaults to ``<abi>``.
3153``F<type><abi>``
3154    This specifies the alignment for function pointers.
3155    The options for ``<type>`` are:
3156
3157    * ``i``: The alignment of function pointers is independent of the alignment
3158      of functions, and is a multiple of ``<abi>``.
3159    * ``n``: The alignment of function pointers is a multiple of the explicit
3160      alignment specified on the function, and is a multiple of ``<abi>``.
3161``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character. The mangling style
    options are:
3166
3167    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
3168    * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
3169    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
3172    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
3173      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
3174      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
3175      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
3176      starting with ``?`` are not mangled in any way.
3177    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
3178      symbols do not receive a ``_`` prefix.
    * ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
3180``n<size1>:<size2>:<size3>...``
3181    This specifies a set of native integer widths for the target CPU in
3182    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
3183    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
3184    this set are considered to support most general arithmetic operations
3185    efficiently.
3186``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Types <nointptrtype>`. The ``0``
    address space cannot be specified as non-integral.
3190
3191On every specification that takes a ``<abi>:<pref>``, specifying the
3192``<pref>`` alignment is optional. If omitted, the preceding ``:``
3193should be omitted too and ``<pref>`` will be equal to ``<abi>``.
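
As an illustration, a module might carry a layout string such as the one
below. The particular combination of specifications is made up for this
example and does not correspond to any real target:

.. code-block:: llvm

    ; Hypothetical layout, chosen only to exercise the syntax described above.
    ;   e               little endian
    ;   m:e             ELF mangling
    ;   p:64:64         64-bit pointers in address space 0, 64-bit ABI alignment
    ;                   (<pref> is omitted, so it equals <abi>)
    ;   p1:32:32:32:16  32-bit pointers in address space 1 with a 16-bit index
    ;   i64:64          i64 is 64-bit aligned
    ;   n32:64          native integer widths are 32 and 64 bits
    ;   A5              objects created by alloca live in address space 5
    target datalayout = "e-m:e-p:64:64-p1:32:32:32:16-i64:64-n32:64-A5"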
3194
3195When constructing the data layout for a given target, LLVM starts with a
3196default set of specifications which are then (possibly) overridden by
3197the specifications in the ``datalayout`` keyword. The default
3198specifications are given in this list:
3199
3200-  ``e`` - little endian
3201-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
3202-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
3203   same as the default address space.
3204-  ``S0`` - natural stack alignment is unspecified
3205-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
3206-  ``i8:8:8`` - i8 is 8-bit (byte) aligned as mandated
3207-  ``i16:16:16`` - i16 is 16-bit aligned
3208-  ``i32:32:32`` - i32 is 32-bit aligned
3209-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
3210   alignment of 64-bits
3211-  ``f16:16:16`` - half is 16-bit aligned
3212-  ``f32:32:32`` - float is 32-bit aligned
3213-  ``f64:64:64`` - double is 64-bit aligned
3214-  ``f128:128:128`` - quad is 128-bit aligned
3215-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
3216-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
3217-  ``a:0:64`` - aggregates are 64-bit aligned
3218
3219When LLVM is determining the alignment for a given type, it uses the
3220following rules:
3221
3222#. If the type sought is an exact match for one of the specifications,
3223   that specification is used.
3224#. If no match is found, and the type sought is an integer type, then
3225   the smallest integer type that is larger than the bitwidth of the
3226   sought type is used. If none of the specifications are larger than
3227   the bitwidth then the largest integer type is used. For example,
3228   given the default specifications above, the i7 type will use the
3229   alignment of i8 (next largest) while both i65 and i256 will use the
3230   alignment of i64 (largest specified).
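
A small sketch of rule 2 as stated above, assuming a module that relies on
the default specifications and using the fact that a ``load`` without an
explicit ``align`` takes the ABI alignment of the loaded type. The function
names are hypothetical:

.. code-block:: llvm

    define i7 @load_i7(ptr %p) {
      ; i7 has no specification of its own, so it takes the alignment of
      ; i8, the smallest specified integer type that is larger.
      %v = load i7, ptr %p
      ret i7 %v
    }

    define i65 @load_i65(ptr %p) {
      ; No specified integer type is wider than 65 bits, so i65 takes the
      ; alignment of i64, the largest specified integer type.
      %v = load i65, ptr %p
      ret i65 %v
    }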
3231
3232The function of the data layout string may not be what you expect.
3233Notably, this is not a specification from the frontend of what alignment
3234the code generator should use.
3235
3236Instead, if specified, the target data layout is required to match what
3237the ultimate *code generator* expects. This string is used by the
3238mid-level optimizers to improve code, and this only works if it matches
3239what the ultimate code generator uses. There is no way to generate IR
3240that does not embed this target-specific detail into the IR. If you
3241don't specify the string, the default specifications will be used to
3242generate a Data Layout and the optimization phases will operate
3243accordingly and introduce target specificity into the IR with respect to
3244these default specifications.
3245
3246.. _langref_triple:
3247
3248Target Triple
3249-------------
3250
3251A module may specify a target triple string that describes the target
3252host. The syntax for the target triple is simply:
3253
3254.. code-block:: llvm
3255
3256    target triple = "x86_64-apple-macosx10.7.0"
3257
3258The *target triple* string consists of a series of identifiers delimited
3259by the minus sign character ('-'). The canonical forms are:
3260
3261::
3262
3263    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
3264    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
3265
3266This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` option.
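
For comparison with the three-component example above, a triple in the
four-component canonical form could look as follows. The specific triple is
only an illustration:

.. code-block:: llvm

    ; ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
    target triple = "x86_64-unknown-linux-gnu"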
3269
3270.. _objectlifetime:
3271
3272Object Lifetime
3273----------------------
3274
3275A memory object, or simply object, is a region of a memory space that is
3276reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
3277allocation calls, and global variable definitions.
3278Once it is allocated, the bytes stored in the region can only be read or written
3279through a pointer that is :ref:`based on <pointeraliasing>` the allocation
3280value.
3281If a pointer that is not based on the object tries to read or write to the
3282object, it is undefined behavior.
3283
The lifetime of a memory object is a property that decides its accessibility.
Unless stated otherwise, a memory object is alive from its allocation until
its deallocation, and dead thereafter.
3287It is undefined behavior to access a memory object that isn't alive, but
3288operations that don't dereference it such as
3289:ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
3290:ref:`icmp <i_icmp>` return a valid result.
3291This explains code motion of these instructions across operations that
3292impact the object's lifetime.
3293A stack object's lifetime can be explicitly specified using
3294:ref:`llvm.lifetime.start <int_lifestart>` and
3295:ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
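
The distinction between accessing a dead object and merely computing with
its address can be sketched as follows. The example assumes that ``@free``
is the C library deallocation function and that ``%p`` was returned by a
matching allocation call; both are assumptions made for illustration:

.. code-block:: llvm

    declare void @free(ptr)

    define i64 @addr_after_free(ptr %p) {
      call void @free(ptr %p)               ; the object is dead from here on
      %q = getelementptr i8, ptr %p, i64 4  ; no dereference: still a valid result
      %i = ptrtoint ptr %q to i64           ; no dereference: still a valid result
      ret i64 %i
      ; A load or store through %p or %q at this point would be undefined
      ; behavior, because the object is no longer alive.
    }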
3296
3297.. _pointeraliasing:
3298
3299Pointer Aliasing Rules
3300----------------------
3301
3302Any memory access must be done through a pointer value associated with
3303an address range of the memory access, otherwise the behavior is
3304undefined. Pointer values are associated with address ranges according
3305to the following rules:
3306
3307-  A pointer value is associated with the addresses associated with any
3308   value it is *based* on.
3309-  An address of a global variable is associated with the address range
3310   of the variable's storage.
3311-  The result value of an allocation instruction is associated with the
3312   address range of the allocated storage.
3313-  A null pointer in the default address-space is associated with no
3314   address.
3315-  An :ref:`undef value <undefvalues>` in *any* address-space is
3316   associated with no address.
3317-  An integer constant other than zero or a pointer value returned from
3318   a function not defined within LLVM may be associated with address
3319   ranges allocated through mechanisms other than those provided by
3320   LLVM. Such ranges shall not overlap with any ranges of addresses
3321   allocated by mechanisms provided by LLVM.
3322
3323A pointer value is *based* on another pointer value according to the
3324following rules:
3325
3326-  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
3327   the pointer-typed operand of the ``getelementptr``.
3328-  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
3329   is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
3330   of the ``getelementptr``.
3331-  The result value of a ``bitcast`` is *based* on the operand of the
3332   ``bitcast``.
3333-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
3334   values that contribute (directly or indirectly) to the computation of
3335   the pointer's value.
3336-  The "*based* on" relationship is transitive.
3337
3338Note that this definition of *"based"* is intentionally similar to the
3339definition of *"based"* in C99, though it is slightly weaker.
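
A minimal sketch of how the rules compose (the function and value names are
hypothetical):

.. code-block:: llvm

    define i8 @f(ptr %p) {
      %q = getelementptr i8, ptr %p, i64 1  ; %q is based on %p
      %i = ptrtoint ptr %q to i64
      %j = add i64 %i, 1
      %r = inttoptr i64 %j to ptr           ; %r is based on %q, which contributed
                                            ; to its value, and transitively on %p
      %v = load i8, ptr %r                  ; accesses memory associated with %p
      ret i8 %v
    }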
3340
3341LLVM IR does not associate types with memory. The result type of a
3342``load`` merely indicates the size and alignment of the memory from
3343which to load, as well as the interpretation of the value. The first
3344operand type of a ``store`` similarly only indicates the size and
3345alignment of the store.
3346
3347Consequently, type-based alias analysis, aka TBAA, aka
3348``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
3349:ref:`Metadata <metadata>` may be used to encode additional information
3350which specialized optimization passes may use to implement type-based
3351alias analysis.
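
As a sketch of what such metadata can look like, the following attaches a
``!tbaa`` access tag to a load. The node names mirror what a C frontend
typically emits for ``int``, but they are assumptions of this example rather
than anything mandated by the IR:

.. code-block:: llvm

    define i32 @read_int(ptr %p) {
      %v = load i32, ptr %p, align 4, !tbaa !0
      ret i32 %v
    }

    !0 = !{!1, !1, i64 0}                  ; access tag: base type, access type, offset
    !1 = !{!"int", !2, i64 0}              ; scalar type node for "int"
    !2 = !{!"omnipotent char", !3, i64 0}  ; parent type of "int"
    !3 = !{!"Simple C/C++ TBAA"}           ; root of the type hierarchy

Without the ``!tbaa`` attachment, the load carries no type-based aliasing
information at all.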
3352
3353.. _pointercapture:
3354
3355Pointer Capture
3356---------------
3357
3358Given a function call and a pointer that is passed as an argument or stored in
3359memory before the call, the call may capture two components of the pointer:
3360
3361  * The address of the pointer, which is its integral value. This also includes
3362    parts of the address or any information about the address, including the
3363    fact that it does not equal one specific value. We further distinguish
3364    whether only the fact that the address is/isn't null is captured.
3365  * The provenance of the pointer, which is the ability to perform memory
3366    accesses through the pointer, in the sense of the :ref:`pointer aliasing
    rules <pointeraliasing>`. We further distinguish whether only read accesses
3368    are allowed, or both reads and writes.
3369
3370For example, the following function captures the address of ``%a``, because
it is compared to a pointer, leaking information about the identity of the
3372pointer:
3373
3374.. code-block:: llvm
3375
3376    @glb = global i8 0
3377
3378    define i1 @f(ptr %a) {
3379      %c = icmp eq ptr %a, @glb
3380      ret i1 %c
3381    }
3382
3383The function does not capture the provenance of the pointer, because the
3384``icmp`` instruction only operates on the pointer address. The following
3385function captures both the address and provenance of the pointer, as both
3386may be read from ``@glb`` after the function returns:
3387
3388.. code-block:: llvm
3389
3390    @glb = global ptr null
3391
3392    define void @f(ptr %a) {
3393      store ptr %a, ptr @glb
3394      ret void
3395    }
3396
3397The following function captures *neither* the address nor the provenance of
3398the pointer:
3399
3400.. code-block:: llvm
3401
    define i32 @f(ptr %a) {
      %v = load i32, ptr %a
      ret i32 %v
    }
3406
3407While address capture includes uses of the address within the body of the
3408function, provenance capture refers exclusively to the ability to perform
3409accesses *after* the function returns. Memory accesses within the function
3410itself are not considered pointer captures.
3411
3412We can further say that the capture only occurs through a specific location.
3413In the following example, the pointer (both address and provenance) is captured
3414through the return value only:
3415
3416.. code-block:: llvm
3417
3418    define ptr @f(ptr %a) {
3419      %gep = getelementptr i8, ptr %a, i64 4
3420      ret ptr %gep
3421    }
3422
3423However, we always consider direct inspection of the pointer address
3424(e.g. using ``ptrtoint``) to be location-independent. The following example
3425is *not* considered a return-only capture, even though the ``ptrtoint``
ultimately only contributes to the return value:
3427
3428.. code-block:: llvm
3429
3430    @lookup = constant [4 x i8] [i8 0, i8 1, i8 2, i8 3]
3431
3432    define ptr @f(ptr %a) {
3433      %a.addr = ptrtoint ptr %a to i64
3434      %mask = and i64 %a.addr, 3
3435      %gep = getelementptr i8, ptr @lookup, i64 %mask
3436      ret ptr %gep
3437    }
3438
3439This definition is chosen to allow capture analysis to continue with the return
3440value in the usual fashion.
3441
3442The following describes possible ways to capture a pointer in more detail,
3443where unqualified uses of the word "capture" refer to capturing both address
3444and provenance.
3445
34461. The call stores any bit of the pointer carrying information into a place,
3447   and the stored bits can be read from the place by the caller after this call
3448   exits.
3449
3450.. code-block:: llvm
3451
3452    @glb  = global ptr null
3453    @glb2 = global ptr null
3454    @glb3 = global ptr null
    @glbi = global i32 0

    declare void @g()
3456
3457    define ptr @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
3458      store ptr %a, ptr @glb ; %a is captured by this call
3459
3460      store ptr %b,   ptr @glb2 ; %b isn't captured because the stored value is overwritten by the store below
3461      store ptr null, ptr @glb2
3462
3463      store ptr %c,   ptr @glb3
3464      call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
3465      store ptr null, ptr @glb3
3466
3467      %i = ptrtoint ptr %d to i64
3468      %j = trunc i64 %i to i32
3469      store i32 %j, ptr @glbi ; %d is captured
3470
3471      ret ptr %e ; %e is captured
3472    }
3473
34742. The call stores any bit of the pointer carrying information into a place,
3475   and the stored bits can be safely read from the place by another thread via
3476   synchronization.
3477
3478.. code-block:: llvm
3479
    @glb  = global ptr null
    @lock = global i8 1

    define void @f(ptr %a) {
      store ptr %a, ptr @glb
      store atomic i8 0, ptr @lock release, align 1 ; %a is captured because another thread can safely read @glb
      store ptr null, ptr @glb
      ret void
    }
3487    }
3488
34893. The call's behavior depends on any bit of the pointer carrying information
3490   (address capture only).
3491
3492.. code-block:: llvm
3493
    @glb = global i8 0

    declare void @exit()
3495
3496    define void @f(ptr %a) {
3497      %c = icmp eq ptr %a, @glb
3498      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; captures address of %a only
3499    BB_EXIT:
3500      call void @exit()
3501      unreachable
3502    BB_CONTINUE:
3503      ret void
3504    }
3505
35064. The pointer is used as the pointer operand of a volatile access.
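
A minimal illustration of this last case (the function name is
hypothetical):

.. code-block:: llvm

    define void @f(ptr %a) {
      store volatile i32 0, ptr %a ; %a is captured: it is the pointer operand of a volatile access
      ret void
    }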
3507
3508.. _volatile:
3509
3510Volatile Memory Accesses
3511------------------------
3512
3513Certain memory accesses, such as :ref:`load <i_load>`'s,
3514:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
3515marked ``volatile``. The optimizers must not change the number of
3516volatile operations or change their order of execution relative to other
3517volatile operations. The optimizers *may* change the order of volatile
3518operations relative to non-volatile operations. This is not Java's
3519"volatile" and has no cross-thread synchronization behavior.
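
As a small sketch of the first rule, the two loads below read the same
address, yet neither may be removed or merged with the other. The function
and parameter names are hypothetical:

.. code-block:: llvm

    define i32 @read_twice(ptr %reg) {
      %a = load volatile i32, ptr %reg  ; first volatile access
      %b = load volatile i32, ptr %reg  ; second volatile access: must stay separate
                                        ; and must stay after the first
      %sum = add i32 %a, %b
      ret i32 %sum
    }

Had the loads not been marked ``volatile``, an optimizer would be free to
replace ``%b`` with ``%a``.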
3520
3521A volatile load or store may have additional target-specific semantics.
3522Any volatile operation can have side effects, and any volatile operation
3523can read and/or modify state which is not accessible via a regular load
3524or store in this module. Volatile operations may use addresses which do
3525not point to memory (like MMIO registers). This means the compiler may
3526not use a volatile operation to prove a non-volatile access to that
3527address has defined behavior.
3528
3529The allowed side-effects for volatile accesses are limited.  If a
3530non-volatile store to a given address would be legal, a volatile
3531operation may modify the memory at that address. A volatile operation
3532may not modify any other memory accessible by the module being compiled.
3533A volatile operation may not call any code in the current module.
3534
3535In general (without target specific context), the address space of a
3536volatile operation may not be changed. Different address spaces may
3537have different trapping behavior when dereferencing an invalid
3538pointer.
3539
3540The compiler may assume execution will continue after a volatile operation,
3541so operations which modify memory or may have undefined behavior can be
3542hoisted past a volatile operation.
3543
3544As an exception to the preceding rule, the compiler may not assume execution
3545will continue after a volatile store operation. This restriction is necessary
3546to support the somewhat common pattern in C of intentionally storing to an
3547invalid pointer to crash the program. In the future, it might make sense to
3548allow frontends to control this behavior.
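
The pattern in question can be sketched as follows. This is an illustration
under the rules stated above, not a description of any particular
frontend's output:

.. code-block:: llvm

    define void @crash_then_write(ptr %p) {
      store volatile i32 0, ptr null  ; intentionally stores to an invalid pointer;
                                      ; execution may not continue past this point
      store i32 1, ptr %p             ; must not be hoisted above the volatile store
      ret void
    }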
3549
IR-level volatile loads and stores cannot safely be optimized into ``llvm.memcpy``
or ``llvm.memmove`` intrinsics even when those intrinsics are flagged volatile.
3552Likewise, the backend should never split or merge target-legal volatile
3553load/store instructions. Similarly, IR-level volatile loads and stores cannot
3554change from integer to floating-point or vice versa.
3555
3556.. admonition:: Rationale
3557
3558 Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
3560 this holds for an l-value of volatile primitive type with native
3561 hardware support, but not necessarily for aggregate types. The
3562 frontend upholds these expectations, which are intentionally
3563 unspecified in the IR. The rules above ensure that IR transformations
3564 do not violate the frontend's contract with the language.
3565
3566.. _memmodel:
3567
3568Memory Model for Concurrent Operations
3569--------------------------------------
3570
3571The LLVM IR does not define any way to start parallel threads of
3572execution or to register signal handlers. Nonetheless, there are
3573platform-specific ways to create them, and we define LLVM IR's behavior
3574in their presence. This model is inspired by the C++ memory model.
3575
3576For a more informal introduction to this model, see the :doc:`Atomics`.
3577
3578We define a *happens-before* partial order as the least partial order
3579that
3580
3581-  Is a superset of single-thread program order, and
3582-  When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
3583   ``b``. *Synchronizes-with* pairs are introduced by platform-specific
3584   techniques, like pthread locks, thread creation, thread joining,
3585   etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
3586   Constraints <ordering>`).
3587
3588Note that program order does not introduce *happens-before* edges
3589between a thread and signals executing inside that thread.
3590
3591Every (defined) read operation (load instructions, memcpy, atomic
3592loads/read-modify-writes, etc.) R reads a series of bytes written by
3593(defined) write operations (store instructions, atomic
3594stores/read-modify-writes, memcpy, etc.). For the purposes of this
3595section, initialized globals are considered to have a write of the
3596initializer which is atomic and happens before any other read or write
3597of the memory in question. For each byte of a read R, R\ :sub:`byte`
3598may see any write to the same byte, except:
3599
3600-  If write\ :sub:`1`  happens before write\ :sub:`2`, and
3601   write\ :sub:`2` happens before R\ :sub:`byte`, then
3602   R\ :sub:`byte` does not see write\ :sub:`1`.
3603-  If R\ :sub:`byte` happens before write\ :sub:`3`, then
3604   R\ :sub:`byte` does not see write\ :sub:`3`.
3605
3606Given that definition, R\ :sub:`byte` is defined as follows:
3607
3608-  If R is volatile, the result is target-dependent. (Volatile is
3609   supposed to give guarantees which can support ``sig_atomic_t`` in
3610   C/C++, and may be used for accesses to addresses that do not behave
3611   like normal memory. It does not generally provide cross-thread
3612   synchronization.)
3613-  Otherwise, if there is no write to the same byte that happens before
3614   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
3615-  Otherwise, if R\ :sub:`byte` may see exactly one write,
3616   R\ :sub:`byte` returns the value written by that write.
3617-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
3618   see are atomic, it chooses one of the values written. See the :ref:`Atomic
3619   Memory Ordering Constraints <ordering>` section for additional
3620   constraints on how the choice is made.
3621-  Otherwise R\ :sub:`byte` returns ``undef``.
3622
3623R returns the value composed of the series of bytes it read. This
3624implies that some bytes within the value may be ``undef`` **without**
3625the entire value being ``undef``. Note that this only defines the
3626semantics of the operation; it doesn't mean that targets will emit more
3627than one instruction to read the series of bytes.
3628
3629Note that in cases where none of the atomic intrinsics are used, this
3630model places only one restriction on IR transformations on top of what
3631is required for single-threaded execution: introducing a store to a byte
3632which might not otherwise be stored is not allowed in general.
3633(Specifically, in the case where another thread might write to and read
3634from an address, introducing a store can change a load that may see
3635exactly one write into a load that may see multiple writes.)
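
To make the restriction concrete, consider this hypothetical function, in
which the store only executes when ``%c`` is true:

.. code-block:: llvm

    define void @maybe_store(ptr %p, i1 %c) {
    entry:
      br i1 %c, label %then, label %end
    then:
      store i32 1, ptr %p   ; the only store, executed only when %c is true
      br label %end
    end:
      ret void
    }

Rewriting the body into an unconditional load of ``%p``, a ``select`` between
the loaded value and ``1``, and an unconditional store would be fine for
single-threaded execution, but it introduces a store on the path where ``%c``
is false and is therefore not allowed in general.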
3636
3637.. _ordering:
3638
3639Atomic Memory Ordering Constraints
3640----------------------------------
3641
3642Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3643:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3644:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3645ordering parameters that determine which other atomic instructions on
3646the same address they *synchronize with*. These semantics implement
3647the Java or C++ memory models; if these descriptions aren't precise
3648enough, check those specs (see spec references in the
3649:doc:`atomics guide <Atomics>`). :ref:`fence <i_fence>` instructions
3650treat these orderings somewhat differently since they don't take an
3651address. See that instruction's documentation for details.
3652
3653For a simpler introduction to the ordering constraints, see the
3654:doc:`Atomics`.
3655
3656``unordered``
3657    The set of values that can be read is governed by the happens-before
3658    partial order. A value cannot be read unless some operation wrote
3659    it. This is intended to provide a guarantee strong enough to model
3660    Java's non-volatile shared variables. This ordering cannot be
3661    specified for read-modify-write operations; it is not strong enough
3662    to make them atomic in any interesting way.
3663``monotonic``
3664    In addition to the guarantees of ``unordered``, there is a single
3665    total order for modifications by ``monotonic`` operations on each
3666    address. All modification orders must be compatible with the
3667    happens-before order. There is no guarantee that the modification
3668    orders can be combined to a global total order for the whole program
3669    (and this often will not be possible). The read in an atomic
3670    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3671    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3672    order immediately before the value it writes. If one atomic read
3673    happens before another atomic read of the same address, the later
3674    read must see the same value or a later value in the address's
3675    modification order. This disallows reordering of ``monotonic`` (or
3676    stronger) operations on the same address. If an address is written
3677    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3678    read that address repeatedly, the other threads must eventually see
3679    the write. This corresponds to the C/C++ ``memory_order_relaxed``.
3680``acquire``
3681    In addition to the guarantees of ``monotonic``, a
3682    *synchronizes-with* edge may be formed with a ``release`` operation.
3683    This is intended to model C/C++'s ``memory_order_acquire``.
3684``release``
3685    In addition to the guarantees of ``monotonic``, if this operation
3686    writes a value which is subsequently read by an ``acquire``
3687    operation, it *synchronizes-with* that operation. Furthermore,
3688    this occurs even if the value written by a ``release`` operation
3689    has been modified by a read-modify-write operation before being
3690    read. (Such a set of operations comprises a *release
3691    sequence*). This corresponds to the C/C++
3692    ``memory_order_release``.
3693``acq_rel`` (acquire+release)
3694    Acts as both an ``acquire`` and ``release`` operation on its
3695    address. This corresponds to the C/C++ ``memory_order_acq_rel``.
3696``seq_cst`` (sequentially consistent)
3697    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3698    operation that only reads, ``release`` for an operation that only
3699    writes), there is a global total order on all
3700    sequentially-consistent operations on all addresses. Each
3701    sequentially-consistent read sees the last preceding write to the
3702    same address in this global order. This corresponds to the C/C++
3703    ``memory_order_seq_cst`` and Java ``volatile``.
3704
3705    Note: this global total order is *not* guaranteed to be fully
3706    consistent with the *happens-before* partial order if
3707    non-``seq_cst`` accesses are involved. See the C++ standard
3708    `[atomics.order] <https://wg21.link/atomics.order>`_ section
3709    for more details on the exact guarantees.
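
A minimal message-passing sketch using a ``release`` store paired with an
``acquire`` load. The globals and function names are hypothetical, and the
two functions are assumed to run on different threads:

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, ptr @data                         ; plain, non-atomic write
      store atomic i32 1, ptr @flag release, align 4  ; publishes the write above
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      %f = load atomic i32, ptr @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin
    done:
      ; The acquire load read the value written by the release store, so the
      ; store synchronizes-with the load and the write to @data happens
      ; before this read.
      %v = load i32, ptr @data
      ret i32 %v
    }

If both atomic operations were ``monotonic`` instead, no *synchronizes-with*
edge would be formed and the final load of ``@data`` would race with the
producer's write.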
3710
3711.. _syncscope:
3712
3713If an atomic operation is marked ``syncscope("singlethread")``, it only
3714*synchronizes with* and only participates in the seq\_cst total orderings of
3715other operations running in the same thread (for example, in signal handlers).
3716
3717If an atomic operation is marked ``syncscope("<target-scope>")``, where
3718``<target-scope>`` is a target specific synchronization scope, then it is target
3719dependent if it *synchronizes with* and participates in the seq\_cst total
3720orderings of other operations.
3721
3722Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3723or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3724seq\_cst total orderings of other operations that are not marked
3725``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
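
A short sketch of the ``singlethread`` scope, with hypothetical globals and
assuming a signal handler in the same thread that reads ``@ready`` with a
matching ``syncscope("singlethread")`` ``acquire`` load:

.. code-block:: llvm

    @data  = global i32 0
    @ready = global i32 0

    define void @publish() {
      store i32 42, ptr @data
      ; Synchronizes only with operations running in the same thread, such as
      ; a signal handler; it provides no ordering with respect to other threads.
      store atomic i32 1, ptr @ready syncscope("singlethread") release, align 4
      ret void
    }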
3726
3727.. _floatenv:
3728
3729Floating-Point Environment
3730--------------------------
3731
3732The default LLVM floating-point environment assumes that traps are disabled and
3733status flags are not observable. Therefore, floating-point math operations do
3734not have side effects and may be speculated freely. Results assume the
3735round-to-nearest rounding mode, and subnormals are assumed to be preserved.
3736
3737Running LLVM code in an environment where these assumptions are not met
3738typically leads to undefined behavior. The ``strictfp`` and ``denormal-fp-math``
3739attributes as well as :ref:`Constrained Floating-Point Intrinsics
3740<constrainedfp>` can be used to weaken LLVM's assumptions and ensure defined
3741behavior in non-default floating-point environments; see their respective
3742documentation for details.
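
As a sketch, a function that must honor a dynamically chosen rounding mode
and observable exceptions could be written with a constrained intrinsic as
follows. The function name is hypothetical:

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    define double @add_dynamic(double %a, double %b) strictfp {
      ; "round.dynamic": the rounding mode is not known at compile time.
      ; "fpexcept.strict": exception behavior must be preserved exactly.
      %r = call double @llvm.experimental.constrained.fadd.f64(
                         double %a, double %b,
                         metadata !"round.dynamic",
                         metadata !"fpexcept.strict") strictfp
      ret double %r
    }

A plain ``fadd double %a, %b`` in the same position would instead assume the
default environment described above.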
3743
3744.. _floatnan:
3745
3746Behavior of Floating-Point NaN values
3747-------------------------------------
3748
3749A floating-point NaN value consists of a sign bit, a quiet/signaling bit, and a
3750payload (which makes up the rest of the mantissa except for the quiet/signaling
3751bit). LLVM assumes that the quiet/signaling bit being set to ``1`` indicates a
3752quiet NaN (QNaN), and a value of ``0`` indicates a signaling NaN (SNaN). In the
3753following we will hence just call it the "quiet bit".
3754
3755The representation bits of a floating-point value do not mutate arbitrarily; in
3756particular, if there is no floating-point operation being performed, NaN signs,
3757quiet bits, and payloads are preserved.
3758
3759For the purpose of this section, ``bitcast`` as well as the following operations
3760are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and
3761``llvm.copysign``. These operations act directly on the underlying bit
3762representation and never change anything except possibly for the sign bit.
3763
3764Floating-point math operations that return a NaN are an exception from the
3765general principle that LLVM implements IEEE-754 semantics. Unless specified
3766otherwise, the following rules apply whenever the IEEE-754 semantics say that a
3767NaN value is returned: the result has a non-deterministic sign; the quiet bit
3768and payload are non-deterministically chosen from the following set of options:
3769
3770- The quiet bit is set and the payload is all-zero. ("Preferred NaN" case)
3771- The quiet bit is set and the payload is copied from any input operand that is
3772  a NaN. ("Quieting NaN propagation" case)
3773- The quiet bit and payload are copied from any input operand that is a NaN.
3774  ("Unchanged NaN propagation" case)
3775- The quiet bit is set and the payload is picked from a target-specific set of
3776  "extra" possible NaN payloads. The set can depend on the input operand values.
3777  This set is empty on x86 and ARM, but can be non-empty on other architectures.
3778  (For instance, on wasm, if any input NaN does not have the preferred all-zero
3779  payload or any input NaN is an SNaN, then this set contains all possible
3780  payloads; otherwise, it is empty. On SPARC, this set consists of the all-one
3781  payload.)
3782
3783In particular, if all input NaNs are quiet (or if there are no input NaNs), then
3784the output NaN is definitely quiet. Signaling NaN outputs can only occur if they
3785are provided as an input value. For example, "fmul SNaN, 1.0" may be simplified
3786to SNaN rather than QNaN. Similarly, if all input NaNs are preferred (or if
3787there are no input NaNs) and the target does not have any "extra" NaN payloads,
3788then the output NaN is guaranteed to be preferred.
3789
3790Floating-point math operations are allowed to treat all NaNs as if they were
3791quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.
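
The difference between the two kinds of operations can be sketched as
follows (the function names are hypothetical):

.. code-block:: llvm

    define double @flip_sign(double %x) {
      ; Not a floating-point math operation: if %x is a NaN, only its sign
      ; bit changes; the quiet bit and payload are preserved.
      %n = fneg double %x
      ret double %n
    }

    define double @times_one(double %x) {
      ; A floating-point math operation: if %x is an SNaN, the result may be
      ; that same SNaN (unchanged propagation), a quieted copy of it, or the
      ; preferred NaN, each with a non-deterministic sign.
      %m = fmul double %x, 1.0
      ret double %m
    }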
3792
3793Code that requires different behavior than this should use the
3794:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
3795In particular, constrained intrinsics rule out the "Unchanged NaN propagation"
3796case; they are guaranteed to return a QNaN.
3797
3798Unfortunately, due to hard-or-impossible-to-fix issues, LLVM violates its own
3799specification on some architectures:
3800
3801- x86-32 without SSE2 enabled may convert floating-point values to x86_fp80 and
3802  back when performing floating-point math operations; this can lead to results
3803  with different precision than expected and it can alter NaN values. Since
3804  optimizations can make contradicting assumptions, this can lead to arbitrary
3805  miscompilations. See `issue #44218
3806  <https://github.com/llvm/llvm-project/issues/44218>`_.
3807- x86-32 (even with SSE2 enabled) may implicitly perform such a conversion on
3808  values returned from a function for some calling conventions. See `issue
3809  #66803 <https://github.com/llvm/llvm-project/issues/66803>`_.
3810- Older MIPS versions use the opposite polarity for the quiet/signaling bit, and
3811  LLVM does not correctly represent this. See `issue #60796
3812  <https://github.com/llvm/llvm-project/issues/60796>`_.
3813
3814.. _floatsem:
3815
3816Floating-Point Semantics
3817------------------------
3818
This section defines the semantics for core floating-point operations on types
that use a format specified by IEEE-754. These types are: ``half``, ``float``,
``double``, and ``fp128``, which correspond to the binary16, binary32, binary64,
and binary128 formats, respectively. The "core" operations are those defined in
section 5 of IEEE-754, which all have corresponding LLVM operations.
3824
3825The value returned by those operations matches that of the corresponding
3826IEEE-754 operation executed in the :ref:`default LLVM floating-point environment
3827<floatenv>`, except that the behavior of NaN results is instead :ref:`as
3828specified here <floatnan>`. In particular, such a floating-point instruction
3829returning a non-NaN value is guaranteed to always return the same bit-identical
3830result on all machines and optimization levels.
3831
3832This means that optimizations and backends may not change the observed bitwise
3833result of these operations in any way (unless NaNs are returned), and frontends
3834can rely on these operations providing correctly rounded results as described in
3835the standard.
3836
3837(Note that this is only about the value returned by these operations; see the
3838:ref:`floating-point environment section <floatenv>` regarding flags and
3839exceptions.)
3840
3841Various flags, attributes, and metadata can alter the behavior of these
3842operations and thus make them not bit-identical across machines and optimization
3843levels any more: most notably, the :ref:`fast-math flags <fastmath>` as well as
3844the :ref:`strictfp <strictfp>` and :ref:`denormal-fp-math <denormal_fp_math>`
3845attributes and :ref:`!fpmath metadata <fpmath-metadata>`. See their
3846corresponding documentation for details.
3847
3848.. _fastmath:
3849
3850Fast-Math Flags
3851---------------
3852
3853LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3854:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3855:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`, :ref:`fptrunc <i_fptrunc>`,
3856:ref:`fpext <i_fpext>`), and :ref:`phi <i_phi>`, :ref:`select <i_select>`, or
3857:ref:`call <i_call>` instructions that return floating-point types may use the
3858following flags to enable otherwise unsafe floating-point transformations.
3859
3860``fast``
3861   This flag is a shorthand for specifying all fast-math flags at once, and
3862   imparts no additional semantics from using all of them.
3863
3864``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.
3868
3869``ninf``
3870   No Infs - Allow optimizations to assume the arguments and result are not
3871   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3872   produces a :ref:`poison value <poisonvalues>` instead.
3873
3874``nsz``
3875   No Signed Zeros - Allow optimizations to treat the sign of a zero
3876   argument or zero result as insignificant. This does not imply that -0.0
3877   is poison and/or guaranteed to not exist in the operation.
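
The following sketch shows where these flags are written on an instruction.
The function and value names are hypothetical; the comments restate the
definitions above rather than add new semantics.

.. code-block:: llvm

      define float @flags(float %a, float %b) {
        ; Poison if %a, %b, or the sum is NaN.
        %s = fadd nnan float %a, %b
        ; Poison if an operand or the product is +/-Inf.
        %p = fmul ninf float %s, %b
        ; The sign of a zero operand or zero result may be treated as
        ; insignificant.
        %d = fsub nsz float %p, %a
        ; 'fast' is shorthand for all fast-math flags at once.
        %q = fdiv fast float %d, %b
        ret float %q
      }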
3878
3879Note: For :ref:`phi <i_phi>`, :ref:`select <i_select>`, and :ref:`call <i_call>`
3880instructions, the following return types are considered to be floating-point
3881types:
3882
3883.. _fastmath_return_types:
3884
3885- Floating-point scalar or vector types
3886- Array types (nested to any depth) of floating-point scalar or vector types
3887- Homogeneous literal struct types of floating-point scalar or vector types
3888
3889Rewrite-based flags
3890^^^^^^^^^^^^^^^^^^^
3891
The following flags have rewrite-based semantics. These flags allow expressions,
potentially containing multiple non-consecutive instructions, to be rewritten
into alternative instructions. When multiple instructions are involved in an
expression, all of the instructions must have the necessary rewrite-based flag
present on them, and the rewritten instructions will generally have the
intersection of the flags present on the input instructions.
3898
In the following example, the instructions of the floating-point expression in
the body of ``@orig`` have ``contract`` and ``reassoc`` in common, and thus if
it is rewritten into the expression in the body of ``@target``, all of the new
instructions get those two flags and only those flags as a result. Since
``arcp`` is present on only one of the instructions in the expression, it is not
present in the transformed expression. Furthermore, this reassociation is only
legal because both instructions have the ``reassoc`` flag; if only one had it,
the transformation would not be legal.
3907
3908.. code-block:: llvm
3909
3910      define double @orig(double %a, double %b, double %c) {
3911        %t1 = fmul contract reassoc double %a, %b
3912        %val = fmul contract reassoc arcp double %t1, %c
3913        ret double %val
3914      }
3915
3916      define double @target(double %a, double %b, double %c) {
3917        %t1 = fmul contract reassoc double %b, %c
3918        %val = fmul contract reassoc double %a, %t1
3919        ret double %val
3920      }
3921
3922These rules do not apply to the other fast-math flags. Whether or not a flag
3923like ``nnan`` is present on any or all of the rewritten instructions is based
3924on whether or not it is possible for said instruction to have a NaN input or
3925output, given the original flags.
3926
3927``arcp``
3928   Allows division to be treated as a multiplication by a reciprocal.
3929   Specifically, this permits ``a / b`` to be considered equivalent to
3930   ``a * (1.0 / b)`` (which may subsequently be susceptible to code motion),
3931   and it also permits ``a / (b / c)`` to be considered equivalent to
3932   ``a * (c / b)``. Both of these rewrites can be applied in either direction:
3933   ``a * (c / b)`` can be rewritten into ``a / (b / c)``.
3934
3935``contract``
3936   Allow floating-point contraction (e.g. fusing a multiply followed by an
3937   addition into a fused multiply-and-add). This does not enable reassociation
3938   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` can not
3939   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3940
3941.. _fastmath_afn:
3942
3943``afn``
3944   Approximate functions - Allow substitution of approximate calculations for
3945   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
3946   for places where this can apply to LLVM's intrinsic math functions.
3947
3948``reassoc``
3949   Allow reassociation transformations for floating-point instructions.
3950   This may dramatically change results in floating-point.
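
As a hedged illustration of ``contract`` and ``arcp``, the sketch below pairs
each original function with one rewrite that the flag permits. All function
names are hypothetical, and an optimizer is free to perform neither rewrite.

.. code-block:: llvm

      ; Original: a multiply feeding an add, both marked 'contract'.
      define double @muladd(double %a, double %b, double %c) {
        %m = fmul contract double %a, %b
        %s = fadd contract double %m, %c
        ret double %s
      }

      ; One permitted rewrite: a single fused multiply-add.
      declare double @llvm.fma.f64(double, double, double)

      define double @muladd_fused(double %a, double %b, double %c) {
        %s = call contract double @llvm.fma.f64(double %a, double %b, double %c)
        ret double %s
      }

      ; Original: a division marked 'arcp'.
      define double @quot(double %a, double %b) {
        %q = fdiv arcp double %a, %b
        ret double %q
      }

      ; One permitted rewrite: multiplication by the reciprocal.
      define double @quot_recip(double %a, double %b) {
        %r = fdiv arcp double 1.0, %b
        %q = fmul arcp double %a, %r
        ret double %q
      }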
3951
3952.. _uselistorder:
3953
3954Use-list Order Directives
3955-------------------------
3956
3957Use-list directives encode the in-memory order of each use-list, allowing the
3958order to be recreated. ``<order-indexes>`` is a comma-separated list of
3959indexes that are assigned to the referenced value's uses. The referenced
3960value's use-list is immediately sorted by these indexes.
3961
3962Use-list directives may appear at function scope or global scope. They are not
3963instructions, and have no effect on the semantics of the IR. When they're at
3964function scope, they must appear after the terminator of the final basic block.
3965
3966If basic blocks have their address taken via ``blockaddress()`` expressions,
3967``uselistorder_bb`` can be used to reorder their use-lists from outside their
3968function's scope.
3969
3970:Syntax:
3971
3972::
3973
3974    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }
3976
3977:Examples:
3978
3979::
3980
3981    define void @foo(i32 %arg1, i32 %arg2) {
3982    entry:
3983      ; ... instructions ...
3984    bb:
3985      ; ... instructions ...
3986
3987      ; At function scope.
3988      uselistorder i32 %arg1, { 1, 0, 2 }
3989      uselistorder label %bb, { 1, 0 }
3990    }
3991
3992    ; At global scope.
3993    uselistorder ptr @global, { 1, 2, 0 }
3994    uselistorder i32 7, { 1, 0 }
3995    uselistorder i32 (i32) @bar, { 1, 0 }
3996    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3997
3998.. _source_filename:
3999
4000Source Filename
4001---------------
4002
4003The *source filename* string is set to the original module identifier,
4004which will be the name of the compiled source file when compiling from
4005source through the clang front end, for example. It is then preserved through
4006the IR and bitcode.
4007
4008This is currently necessary to generate a consistent unique global
4009identifier for local functions used in profile data, which prepends the
4010source file name to the local function name.
4011
4012The syntax for the source file name is simply:
4013
4014.. code-block:: text
4015
4016    source_filename = "/path/to/source.c"
4017
4018.. _typesystem:
4019
4020Type System
4021===========
4022
4023The LLVM type system is one of the most important features of the
4024intermediate representation. Being typed enables a number of
4025optimizations to be performed on the intermediate representation
4026directly, without having to do extra analyses on the side before the
4027transformation. A strong type system makes it easier to read the
4028generated code and enables novel analyses and transformations that are
4029not feasible to perform on normal three address code representations.
4030
4031.. _t_void:
4032
4033Void Type
4034---------
4035
4036:Overview:
4037
4038
4039The void type does not represent any value and has no size.
4040
4041:Syntax:
4042
4043
4044::
4045
4046      void
4047
4048
4049.. _t_function:
4050
4051Function Type
4052-------------
4053
4054:Overview:
4055
4056
4057The function type can be thought of as a function signature. It consists of a
4058return type and a list of formal parameter types. The return type of a function
4059type is a void type or first class type --- except for :ref:`label <t_label>`
4060and :ref:`metadata <t_metadata>` types.
4061
4062:Syntax:
4063
4064::
4065
4066      <returntype> (<parameter list>)
4067
4068...where '``<parameter list>``' is a comma-separated list of type
4069specifiers. Optionally, the parameter list may include a type ``...``, which
4070indicates that the function takes a variable number of arguments. Variable
4071argument functions can access their arguments with the :ref:`variable argument
4072handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
4073except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
4074
4075:Examples:
4076
4077+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4078| ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
4079+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4080| ``i32 (ptr, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` argument and returns an integer. This is the signature for ``printf`` in LLVM.                 |
4081+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4082| ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
4083+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
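
The sketch below shows where such function types appear in a module. The names
``@split``, ``@fmt``, and ``@caller`` are hypothetical; only ``printf`` comes
from the table above.

.. code-block:: llvm

      ; i32 (ptr, ...): the signature of printf.
      declare i32 @printf(ptr, ...)

      ; {i32, i32} (i32): returns a structure containing two i32 values.
      declare { i32, i32 } @split(i32)

      @fmt = private constant [4 x i8] c"%d\0A\00"

      define i32 @caller(i32 %x) {
        ; A call to a vararg function spells out the full function type.
        %n = call i32 (ptr, ...) @printf(ptr @fmt, i32 %x)
        %pair = call { i32, i32 } @split(i32 %x)
        %first = extractvalue { i32, i32 } %pair, 0
        ret i32 %first
      }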
4084
4085.. _t_firstclass:
4086
4087First Class Types
4088-----------------
4089
4090The :ref:`first class <t_firstclass>` types are perhaps the most important.
4091Values of these types are the only ones which can be produced by
4092instructions.
4093
4094.. _t_single_value:
4095
4096Single Value Types
4097^^^^^^^^^^^^^^^^^^
4098
4099These are the types that are valid in registers from CodeGen's perspective.
4100
4101.. _t_integer:
4102
4103Integer Type
4104""""""""""""
4105
4106:Overview:
4107
The integer type specifies an arbitrary bit width for the desired integer.
Any bit width from 1 bit to 2\ :sup:`23` (about 8 million) can be
specified.
4111
4112:Syntax:
4113
4114::
4115
4116      iN
4117
4118The number of bits the integer will occupy is specified by the ``N``
4119value.
4120
:Examples:
4123
4124+----------------+------------------------------------------------+
4125| ``i1``         | a single-bit integer.                          |
4126+----------------+------------------------------------------------+
4127| ``i32``        | a 32-bit integer.                              |
4128+----------------+------------------------------------------------+
4129| ``i1942652``   | a really big integer of over 1 million bits.   |
4130+----------------+------------------------------------------------+
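
As a small sketch, the functions below use integer types of different widths,
including one that is not a power of two. The function names are hypothetical.

.. code-block:: llvm

      ; Arithmetic on a 17-bit integer.
      define i17 @add17(i17 %a, i17 %b) {
        %s = add i17 %a, %b
        ret i17 %s
      }

      ; Comparisons produce the single-bit integer type i1.
      define i1 @is_zero(i32 %x) {
        %c = icmp eq i32 %x, 0
        ret i1 %c
      }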
4131
4132.. _t_floating:
4133
4134Floating-Point Types
4135""""""""""""""""""""
4136
4137.. list-table::
4138   :header-rows: 1
4139
4140   * - Type
4141     - Description
4142
4143   * - ``half``
4144     - 16-bit floating-point value (IEEE-754 binary16)
4145
4146   * - ``bfloat``
4147     - 16-bit "brain" floating-point value (7-bit significand).  Provides the
4148       same number of exponent bits as ``float``, so that it matches its dynamic
4149       range, but with greatly reduced precision.  Used in Intel's AVX-512 BF16
4150       extensions and Arm's ARMv8.6-A extensions, among others.
4151
4152   * - ``float``
4153     - 32-bit floating-point value (IEEE-754 binary32)
4154
4155   * - ``double``
4156     - 64-bit floating-point value (IEEE-754 binary64)
4157
4158   * - ``fp128``
4159     - 128-bit floating-point value (IEEE-754 binary128)
4160
4161   * - ``x86_fp80``
     - 80-bit floating-point value (X87)
4163
4164   * - ``ppc_fp128``
4165     - 128-bit floating-point value (two 64-bits)
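
The following sketch converts between some of these types with ``fpext`` and
``fptrunc``. The function names are hypothetical.

.. code-block:: llvm

      ; Widen half to float, then float to double.
      define double @widen(half %h) {
        %f = fpext half %h to float
        %d = fpext float %f to double
        ret double %d
      }

      ; Narrow float to bfloat, keeping the exponent range but losing precision.
      define bfloat @narrow(float %f) {
        %b = fptrunc float %f to bfloat
        ret bfloat %b
      }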
4166
4167X86_amx Type
4168""""""""""""
4169
4170:Overview:
4171
The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited: only a few intrinsics
are allowed (stride load and store, zero, and dot product), and no instruction
is allowed for this type. There are no arguments, arrays, pointers, vectors,
or constants of this type.
4177
4178:Syntax:
4179
4180::
4181
4182      x86_amx
4183
4184
4185
4186.. _t_pointer:
4187
4188Pointer Type
4189""""""""""""
4190
4191:Overview:
4192
4193The pointer type ``ptr`` is used to specify memory locations. Pointers are
4194commonly used to reference objects in memory.
4195
4196Pointer types may have an optional address space attribute defining
4197the numbered address space where the pointed-to object resides. For
4198example, ``ptr addrspace(5)`` is a pointer to address space 5.
4199In addition to integer constants, ``addrspace`` can also reference one of the
4200address spaces defined in the :ref:`datalayout string<langref_datalayout>`.
4201``addrspace("A")`` will use the alloca address space, ``addrspace("G")``
4202the default globals address space and ``addrspace("P")`` the program address
4203space.
4204
4205The default address space is number zero.
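
A minimal sketch of address-space syntax follows. The address space numbers 1
and 5 and all names are chosen only for illustration; their meaning is
target-specific.

.. code-block:: llvm

      ; A global variable placed in address space 1.
      @g = addrspace(1) global i32 0

      define i32 @load_as1() {
        ; The pointer operand carries the address space of the global.
        %v = load i32, ptr addrspace(1) @g
        ret i32 %v
      }

      define ptr @to_default(ptr addrspace(5) %p) {
        ; Convert a pointer in address space 5 to the default address space 0.
        %q = addrspacecast ptr addrspace(5) %p to ptr
        ret ptr %q
      }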
4206
4207The semantics of non-zero address spaces are target-specific. Memory
4208access through a non-dereferenceable pointer is undefined behavior in
4209any address space. Pointers with the bit-value 0 are only assumed to
4210be non-dereferenceable in address space 0, unless the function is
4211marked with the ``null_pointer_is_valid`` attribute.
4212
4213If an object can be proven accessible through a pointer with a
4214different address space, the access may be modified to use that
4215address space. Exceptions apply if the operation is ``volatile``.
4216
4217Prior to LLVM 15, pointer types also specified a pointee type, such as
4218``i8*``, ``[4 x i32]*`` or ``i32 (i32*)*``. In LLVM 15, such "typed
4219pointers" are still supported under non-default options. See the
4220`opaque pointers document <OpaquePointers.html>`__ for more information.
4221
4222.. _t_target_type:
4223
4224Target Extension Type
4225"""""""""""""""""""""
4226
4227:Overview:
4228
4229Target extension types represent types that must be preserved through
4230optimization, but are otherwise generally opaque to the compiler. They may be
4231used as function parameters or arguments, and in :ref:`phi <i_phi>` or
4232:ref:`select <i_select>` instructions. Some types may be also used in
4233:ref:`alloca <i_alloca>` instructions or as global values, and correspondingly
4234it is legal to use :ref:`load <i_load>` and :ref:`store <i_store>` instructions
4235on them. Full semantics for these types are defined by the target.
4236
4237The only constants that target extension types may have are ``zeroinitializer``,
4238``undef``, and ``poison``. Other possible values for target extension types may
4239arise from target-specific intrinsics and functions.
4240
4241These types cannot be converted to other types. As such, it is not legal to use
4242them in :ref:`bitcast <i_bitcast>` instructions (as a source or target type),
4243nor is it legal to use them in :ref:`ptrtoint <i_ptrtoint>` or
4244:ref:`inttoptr <i_inttoptr>` instructions. Similarly, they are not legal to use
4245in an :ref:`icmp <i_icmp>` instruction.
4246
4247Target extension types have a name and optional type or integer parameters. The
4248meanings of name and parameters are defined by the target. When being defined in
4249LLVM IR, all of the type parameters must precede all of the integer parameters.
4250
4251Specific target extension types are registered with LLVM as having specific
4252properties. These properties can be used to restrict the type from appearing in
4253certain contexts, such as being the type of a global variable or having a
4254``zeroinitializer`` constant be valid. A complete list of type properties may be
4255found in the documentation for ``llvm::TargetExtType::Property`` (`doxygen
4256<https://llvm.org/doxygen/classllvm_1_1TargetExtType.html>`_).
4257
4258:Syntax:
4259
4260.. code-block:: llvm
4261
4262      target("label")
4263      target("label", void)
4264      target("label", void, i32)
4265      target("label", 0, 1, 2)
4266      target("label", void, i32, 0, 1, 2)
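
As a hedged sketch, the module below passes a target extension type through
function parameters and return values, which the overview above permits. The
type name ``"example.handle"`` and the functions ``@acquire`` and ``@release``
are hypothetical and are not defined by any real target.

.. code-block:: llvm

      declare target("example.handle") @acquire()
      declare void @release(target("example.handle"))

      define void @use() {
        ; The value is opaque to the compiler and is only passed along.
        %h = call target("example.handle") @acquire()
        call void @release(target("example.handle") %h)
        ret void
      }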
4267
4268
4269.. _t_vector:
4270
4271Vector Type
4272"""""""""""
4273
4274:Overview:
4275
A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements), an underlying primitive data type,
and a scalable property to represent vectors where the exact hardware
vector length is unknown at compile time. Vector types are considered
:ref:`first class <t_firstclass>`.
4283
4284:Memory Layout:
4285
In general vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
integer type with N*M bits, and then following the rules for storing such an
integer to memory.
4293
4294A bitcast from a vector type to a scalar integer type will see the elements
4295being packed together (without padding). The order in which elements are
4296inserted in the integer depends on endianness. For little endian element zero
4297is put in the least significant bits of the integer, and for big endian
4298element zero is put in the most significant bits.
4299
4300Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
4301with the analogy that we can replace a vector store by a bitcast followed by
4302an integer store, we get this for big endian:
4303
4304.. code-block:: llvm
4305
4306      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
4307
4308      ; Bitcasting from a vector to an integral type can be seen as
4309      ; concatenating the values:
4310      ;   %val now has the hexadecimal value 0x1235.
4311
4312      store i16 %val, ptr %ptr
4313
4314      ; In memory the content will be (8-bit addressing):
4315      ;
4316      ;    [%ptr + 0]: 00010010  (0x12)
4317      ;    [%ptr + 1]: 00110101  (0x35)
4318
4319The same example for little endian:
4320
4321.. code-block:: llvm
4322
4323      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
4324
4325      ; Bitcasting from a vector to an integral type can be seen as
4326      ; concatenating the values:
4327      ;   %val now has the hexadecimal value 0x5321.
4328
4329      store i16 %val, ptr %ptr
4330
4331      ; In memory the content will be (8-bit addressing):
4332      ;
4333      ;    [%ptr + 0]: 00100001  (0x21)
4334      ;    [%ptr + 1]: 01010011  (0x53)
4335
When ``N*M`` isn't evenly divisible by the byte size, the exact memory layout
is unspecified (just like it is for an integral type of the same size). This
is because different targets could put the padding at different positions when
the type size is smaller than the type's store size.
4340
4341:Syntax:
4342
4343::
4344
4345      < <# elements> x <elementtype> >          ; Fixed-length vector
4346      < vscale x <# elements> x <elementtype> > ; Scalable vector
4347
4348The number of elements is a constant integer value larger than 0;
4349elementtype may be any integer, floating-point or pointer type. Vectors
4350of size zero are not allowed. For scalable vectors, the total number of
4351elements is a constant multiple (called vscale) of the specified number
4352of elements; vscale is a positive integer that is unknown at compile time
4353and the same hardware-dependent constant for all scalable vectors at run
4354time. The size of a specific scalable vector type is thus constant within
4355IR, even if the exact size in bytes cannot be determined until run time.
4356
4357:Examples:
4358
4359+------------------------+----------------------------------------------------+
4360| ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
4361+------------------------+----------------------------------------------------+
4362| ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
4363+------------------------+----------------------------------------------------+
4364| ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
4365+------------------------+----------------------------------------------------+
4366| ``<4 x ptr>``          | Vector of 4 pointers                               |
4367+------------------------+----------------------------------------------------+
4368| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
4369+------------------------+----------------------------------------------------+
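
The sketch below applies the same element-wise addition to a fixed-length and
a scalable vector. The function names are hypothetical.

.. code-block:: llvm

      ; Adds four i32 lanes with one instruction.
      define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) {
        %s = add <4 x i32> %a, %b
        ret <4 x i32> %s
      }

      ; Adds (vscale * 4) i32 lanes; vscale is only known at run time.
      define <vscale x 4 x i32> @svadd(<vscale x 4 x i32> %a,
                                       <vscale x 4 x i32> %b) {
        %s = add <vscale x 4 x i32> %a, %b
        ret <vscale x 4 x i32> %s
      }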
4370
4371.. _t_label:
4372
4373Label Type
4374^^^^^^^^^^
4375
4376:Overview:
4377
4378The label type represents code labels.
4379
4380:Syntax:
4381
4382::
4383
4384      label
4385
4386.. _t_token:
4387
4388Token Type
4389^^^^^^^^^^
4390
4391:Overview:
4392
4393The token type is used when a value is associated with an instruction
4394but all uses of the value must not attempt to introspect or obscure it.
4395As such, it is not appropriate to have a :ref:`phi <i_phi>` or
4396:ref:`select <i_select>` of type token.
4397
4398:Syntax:
4399
4400::
4401
4402      token
4403
4404
4405
4406.. _t_metadata:
4407
4408Metadata Type
4409^^^^^^^^^^^^^
4410
4411:Overview:
4412
4413The metadata type represents embedded metadata. No derived types may be
4414created from metadata except for :ref:`function <t_function>` arguments.
4415
4416:Syntax:
4417
4418::
4419
4420      metadata
4421
4422.. _t_aggregate:
4423
4424Aggregate Types
4425^^^^^^^^^^^^^^^
4426
4427Aggregate Types are a subset of derived types that can contain multiple
4428member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
4429aggregate types. :ref:`Vectors <t_vector>` are not considered to be
4430aggregate types.
4431
4432.. _t_array:
4433
4434Array Type
4435""""""""""
4436
4437:Overview:
4438
4439The array type is a very simple derived type that arranges elements
4440sequentially in memory. The array type requires a size (number of
4441elements) and an underlying data type.
4442
4443:Syntax:
4444
4445::
4446
4447      [<# elements> x <elementtype>]
4448
4449The number of elements is a constant integer value; ``elementtype`` may
4450be any type with a size.
4451
4452:Examples:
4453
4454+------------------+--------------------------------------+
4455| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
4456+------------------+--------------------------------------+
4457| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
4458+------------------+--------------------------------------+
4459| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
4460+------------------+--------------------------------------+
4461
4462Here are some examples of multidimensional arrays:
4463
4464+-----------------------------+----------------------------------------------------------+
4465| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
4466+-----------------------------+----------------------------------------------------------+
4467| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
4468+-----------------------------+----------------------------------------------------------+
4469| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
4470+-----------------------------+----------------------------------------------------------+
4471
4472There is no restriction on indexing beyond the end of the array implied
4473by a static type (though there are restrictions on indexing beyond the
4474bounds of an allocated object in some cases). This means that
4475single-dimension 'variable sized array' addressing can be implemented in
4476LLVM with a zero length array type. An implementation of 'pascal style
4477arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
4478example.
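
As a minimal sketch of the 'pascal style array' idea, the functions below read
the length field and index into the zero-length array member. The type name
``%pstring`` and the function names are hypothetical, and the caller is assumed
to have allocated enough storage behind ``%p``.

.. code-block:: llvm

      %pstring = type { i32, [0 x float] }

      ; Load the element count stored in field 0.
      define i32 @length(ptr %p) {
        %len.addr = getelementptr %pstring, ptr %p, i64 0, i32 0
        %len = load i32, ptr %len.addr
        ret i32 %len
      }

      ; Load element %i of the trailing array in field 1. The static array
      ; length of zero does not restrict the index.
      define float @element(ptr %p, i64 %i) {
        %elt.addr = getelementptr %pstring, ptr %p, i64 0, i32 1, i64 %i
        %elt = load float, ptr %elt.addr
        ret float %elt
      }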
4479
4480.. _t_struct:
4481
4482Structure Type
4483""""""""""""""
4484
4485:Overview:
4486
4487The structure type is used to represent a collection of data members
4488together in memory. The elements of a structure may be any type that has
4489a size.
4490
4491Structures in memory are accessed using '``load``' and '``store``' by
4492getting a pointer to a field with the '``getelementptr``' instruction.
4493Structures in registers are accessed using the '``extractvalue``' and
4494'``insertvalue``' instructions.
4495
4496Structures may optionally be "packed" structures, which indicate that
4497the alignment of the struct is one byte, and that there is no padding
4498between the elements. In non-packed structs, padding between field types
4499is inserted as defined by the DataLayout string in the module, which is
4500required to match what the underlying code generator expects.
4501
4502Structures can either be "literal" or "identified". A literal structure
4503is defined inline with other types (e.g. ``[2 x {i32, i32}]``) whereas
4504identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
opaque and are never uniqued. Identified types must not be recursive.
4508
4509:Syntax:
4510
4511::
4512
4513      %T1 = type { <type list> }     ; Identified normal struct type
4514      %T2 = type <{ <type list> }>   ; Identified packed struct type
4515
4516:Examples:
4517
4518+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4519| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values (this is a "homogeneous" struct as all element types are the same)                                                                                   |
4520+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4521| ``{ float, ptr }``           | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>`.                                                                                |
4522+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4523| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
4524+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
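
The sketch below contrasts the two access styles described in the overview:
``getelementptr`` with ``load`` for a structure in memory, and
``extractvalue`` and ``insertvalue`` for a structure held in a register. The
type name ``%pair`` and the function names are hypothetical.

.. code-block:: llvm

      %pair = type { i32, float }

      ; Structure in memory: compute the field address, then load.
      define float @second_in_memory(ptr %p) {
        %addr = getelementptr %pair, ptr %p, i32 0, i32 1
        %v = load float, ptr %addr
        ret float %v
      }

      ; Structure in a register: read field 0 directly.
      define i32 @first(%pair %s) {
        %v = extractvalue %pair %s, 0
        ret i32 %v
      }

      ; Structure in a register: produce a copy with field 0 replaced.
      define %pair @set_first(%pair %s, i32 %x) {
        %r = insertvalue %pair %s, i32 %x, 0
        ret %pair %r
      }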
4525
4526.. _t_opaque:
4527
4528Opaque Structure Types
4529""""""""""""""""""""""
4530
4531:Overview:
4532
4533Opaque structure types are used to represent structure types that
4534do not have a body specified. This corresponds (for example) to the C
4535notion of a forward declared structure. They can be named (``%X``) or
4536unnamed (``%52``).
4537
4538:Syntax:
4539
4540::
4541
4542      %X = type opaque
4543      %52 = type opaque
4544
4545:Examples:
4546
4547+--------------+-------------------+
4548| ``opaque``   | An opaque type.   |
4549+--------------+-------------------+
4550
4551.. _constants:
4552
4553Constants
4554=========
4555
4556LLVM has several different basic types of constants. This section
4557describes them all and their syntax.
4558
4559Simple Constants
4560----------------
4561
4562**Boolean constants**
4563    The two strings '``true``' and '``false``' are both valid constants
4564    of the ``i1`` type.
4565**Integer constants**
4566    Standard integers (such as '4') are constants of the :ref:`integer
4567    <t_integer>` type. They can be either decimal or
    hexadecimal. Decimal integers can be prefixed with ``-`` to represent
    negative integers, e.g. '``-1234``'. Hexadecimal integers must be
    prefixed with either ``u`` or ``s`` to indicate whether they are unsigned
    or signed respectively, e.g. '``u0x8000``' gives 32768, whilst
    '``s0x8000``' gives -32768.
4573
4574    Note that hexadecimal integers are sign extended from the number
4575    of active bits, i.e. the bit width minus the number of leading
4576    zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1.
4577**Floating-point constants**
4578    Floating-point constants use standard decimal notation (e.g.
4579    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
4580    hexadecimal notation (see below). The assembler requires the exact
4581    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 has a repeating
    (non-terminating) expansion in binary. Floating-point constants must
    have a :ref:`floating-point <t_floating>` type.
4585**Null pointer constants**
4586    The identifier '``null``' is recognized as a null pointer constant
4587    and must be of :ref:`pointer type <t_pointer>`.
4588**Token constants**
4589    The identifier '``none``' is recognized as an empty token constant
4590    and must be of :ref:`token type <t_token>`.
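
A minimal sketch of the simple constant forms described above, used as global
initializers. The global names are hypothetical, and the values in the comments
are the ones stated in the descriptions above.

.. code-block:: llvm

    @flag    = global i1 true
    @neg     = global i32 -1234
    @hex_u   = global i32 u0x8000       ; 32768
    @hex_s   = global i32 s0x8000       ; -32768 (sign extended from bit 15)
    @ratio   = global double 1.25
    @scaled  = global double 1.25e+2    ; 125.0 in exponential notation
    @nothing = global ptr null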
4591
4592The one non-intuitive notation for constants is the hexadecimal form of
4593floating-point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
4595than) '``double 4.5e+15``'. The only time hexadecimal floating-point
4596constants are required (and the only time that they are generated by the
4597disassembler) is when a floating-point constant must be emitted but it
4598cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
4600values are represented in their IEEE hexadecimal format so that assembly
4601and disassembly do not cause any bits to change in the constants.
4602
4603When using the hexadecimal form, constants of types bfloat, half, float, and
4604double are represented using the 16-digit form shown above (which matches the
IEEE 754 representation for double); bfloat, half and float values must, however,
4606be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
4607precision respectively. Hexadecimal format is always used for long double, and
4608there are three forms of long double. The 80-bit format used by x86 is
4609represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
4610used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
4611hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
4612by 32 hexadecimal digits. Long doubles will only work if they match the long
4613double format on your target.  The IEEE 16-bit format (half precision) is
4614represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
4615format is represented by ``0xR`` followed by 4 hexadecimal digits. All
4616hexadecimal formats are big-endian (sign bit at the left).
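
The following sketch shows several of the hexadecimal forms side by side. The
global names are hypothetical; the bit patterns for 1.0 and infinity follow
from the respective encodings and are not taken from this document.

.. code-block:: llvm

    ; 16-digit form, shared by double, float, half and bfloat.
    @big   = global double 0x432FF973CAFA8000        ; same value as 4.5e+15
    @inf   = global double 0x7FF0000000000000        ; +infinity
    @finf  = global float 0x7FF0000000000000         ; +infinity, exactly representable as float

    ; Prefixed forms with a fixed number of digits.
    @h_one = global half 0xH3C00                     ; 1.0 (4 digits)
    @b_one = global bfloat 0xR3F80                   ; 1.0 (4 digits)
    @x_one = global x86_fp80 0xK3FFF8000000000000000 ; 1.0 (20 digits)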
4617
4618There are no constants of type x86_amx.
4619
4620.. _complexconstants:
4621
4622Complex Constants
4623-----------------
4624
4625Complex constants are a (potentially recursive) combination of simple
4626constants and smaller complex constants.
4627
4628**Structure constants**
4629    Structure constants are represented with notation similar to
4630    structure type definitions (a comma separated list of elements,
4631    surrounded by braces (``{}``)). For example:
4632    "``{ i32 4, float 17.0, ptr @G }``", where "``@G``" is declared as
4633    "``@G = external global i32``". Structure constants must have
4634    :ref:`structure type <t_struct>`, and the number and types of elements
4635    must match those specified by the type.
4636**Array constants**
4637    Array constants are represented with notation similar to array type
4638    definitions (a comma separated list of elements, surrounded by
4639    square brackets (``[]``)). For example:
4640    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
4641    :ref:`array type <t_array>`, and the number and types of elements must
4642    match those specified by the type. As a special case, character array
4643    constants may also be represented as a double-quoted string using the ``c``
4644    prefix. For example: "``c"Hello World\0A\00"``".
4645**Vector constants**
4646    Vector constants are represented with notation similar to vector
4647    type definitions (a comma separated list of elements, surrounded by
    angle brackets (``<>``)). For example:
4649    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
4650    must have :ref:`vector type <t_vector>`, and the number and types of
4651    elements must match those specified by the type.
4652
4653    When creating a vector whose elements have the same constant value, the
4654    preferred syntax is ``splat (<Ty> Val)``. For example: "``splat (i32 11)``".
4655    These vector constants must have :ref:`vector type <t_vector>` with an
4656    element type that matches the ``splat`` operand.
4657**Zero initialization**
    The string '``zeroinitializer``' can be used to initialize a value of
    *any* type to zero, including scalar and
4660    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
4661    having to print large zero initializers (e.g. for large arrays) and
4662    is always exactly equivalent to using explicit zero initializers.
4663**Metadata node**
4664    A metadata node is a constant tuple without types. For example:
4665    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
4666    for example: "``!{!0, i32 0, ptr @global, ptr @function, !"str"}``".
4667    Unlike other typed constants that are meant to be interpreted as part of
4668    the instruction stream, metadata is a place to attach additional
4669    information such as debug info.
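
The forms above can be combined in a single module, as in the following
sketch. The global names other than ``@G`` are hypothetical.

.. code-block:: llvm

    @G = external global i32

    ; Structure, array, and vector constants mirror their type syntax.
    @pair  = global { i32, float, ptr } { i32 4, float 17.0, ptr @G }
    @nums  = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @vec   = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; 11 characters plus a newline and a NUL terminator: 13 bytes.
    @hello = constant [13 x i8] c"Hello World\0A\00"

    ; Every element is 11.
    @same  = global <4 x i32> splat (i32 11)

    ; Avoids spelling out 1024 zero elements.
    @zeros = global [1024 x i32] zeroinitializer

    ; A metadata node mixing a string, a typed constant, and a global address.
    !0 = !{!"test", i32 0, ptr @G}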
4670
4671Global Variable and Function Addresses
4672--------------------------------------
4673
4674The addresses of :ref:`global variables <globalvars>` and
4675:ref:`functions <functionstructure>` are always implicitly valid
4676(link-time) constants. These constants are explicitly referenced when
4677the :ref:`identifier for the global <identifiers>` is used and always have
4678:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
4679file:
4680
4681.. code-block:: llvm
4682
4683    @X = global i32 17
4684    @Y = global i32 42
4685    @Z = global [2 x ptr] [ ptr @X, ptr @Y ]
4686
4687.. _undefvalues:
4688
4689Undefined Values
4690----------------
4691
The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type other than '``label``'
or '``void``'.
4696
4697.. note::
4698
4699  A '``poison``' value (described in the next section) should be used instead of
4700  '``undef``' whenever possible. Poison values are stronger than undef, and
4701  enable more optimizations. Just the existence of '``undef``' blocks certain
4702  optimizations (see the examples below).
4703
4704Undefined values are useful because they indicate to the compiler that
4705the program is well defined no matter what value is used. This gives the
4706compiler more freedom to optimize. Here are some examples of
4707(potentially surprising) transformations that are valid (in pseudo IR):
4708
4709.. code-block:: llvm
4710
4711      %A = add %X, undef
4712      %B = sub %X, undef
4713      %C = xor %X, undef
4714    Safe:
4715      %A = undef
4716      %B = undef
4717      %C = undef
4718
4719This is safe because all of the output bits are affected by the undef
4720bits. Any output bit can have a zero or one depending on the input bits.
4721
4722.. code-block:: llvm
4723
4724      %A = or %X, undef
4725      %B = and %X, undef
4726    Safe:
4727      %A = -1
4728      %B = 0
4729    Safe:
4730      %A = %X  ;; By choosing undef as 0
4731      %B = %X  ;; By choosing undef as -1
4732    Unsafe:
4733      %A = undef
4734      %B = undef
4735
4736These logical operations have bits that are not always affected by the
4737input. For example, if ``%X`` has a zero bit, then the output of the
4738'``and``' operation will always be a zero for that bit, no matter what
4739the corresponding bit from the '``undef``' is. As such, it is unsafe to
4740optimize or assume that the result of the '``and``' is '``undef``'.
4741However, it is safe to assume that all bits of the '``undef``' could be
47420, and optimize the '``and``' to 0. Likewise, it is safe to assume that
4743all the bits of the '``undef``' operand to the '``or``' could be set,
4744allowing the '``or``' to be folded to -1.
4745
4746.. code-block:: llvm
4747
4748      %A = select undef, %X, %Y
4749      %B = select undef, 42, %Y
4750      %C = select %X, %Y, undef
4751    Safe:
4752      %A = %X     (or %Y)
4753      %B = 42     (or %Y)
4754      %C = %Y     (if %Y is provably not poison; unsafe otherwise)
4755    Unsafe:
4756      %A = undef
4757      %B = undef
4758      %C = undef
4759
4760This set of examples shows that undefined '``select``'
4761conditions can go *either way*, but they have to come from one
4762of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
4763both known to have a clear low bit, then ``%A`` would have to have a
4764cleared low bit. However, in the ``%C`` example, the optimizer is
4765allowed to assume that the '``undef``' operand could be the same as
4766``%Y`` if ``%Y`` is provably not '``poison``', allowing the whole '``select``'
4767to be eliminated. This is because '``poison``' is stronger than '``undef``'.
4768
4769.. code-block:: llvm
4770
4771      %A = xor undef, undef
4772
4773      %B = undef
4774      %C = xor %B, %B
4775
4776      %D = undef
4777      %E = icmp slt %D, 4
      %F = icmp sge %D, 4
4779
4780    Safe:
4781      %A = undef
4782      %B = undef
4783      %C = undef
4784      %D = undef
4785      %E = undef
4786      %F = undef
4787
This example points out that two '``undef``' operands are not
necessarily the same. This behavior, which matches C semantics, can
surprise people who assume that "``X^X``" is always zero, even if
``X`` is undefined. That assumption fails for a number of reasons, but the
4792short answer is that an '``undef``' "variable" can arbitrarily change
4793its value over its "live range". This is true because the variable
4794doesn't actually *have a live range*. Instead, the value is logically
4795read from arbitrary registers that happen to be around when needed, so
4796the value is not necessarily consistent over time. In fact, ``%A`` and
4797``%C`` need to have the same semantics or the core LLVM "replace all
4798uses with" concept would not hold.
4799
4800To ensure all uses of a given register observe the same value (even if
4801'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
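
A minimal sketch of that use of ``freeze``, with a hypothetical function name.
Because both operands of the ``xor`` read the same frozen register, the
result is zero here, unlike the ``%C`` example above:

.. code-block:: llvm

    define i32 @xor_of_frozen() {
      %b = freeze i32 undef    ; some arbitrary but fixed i32 value
      %c = xor i32 %b, %b      ; both uses observe the same value, so this is 0
      ret i32 %c
    }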
4802
4803.. code-block:: llvm
4804
4805      %A = sdiv undef, %X
4806      %B = sdiv %X, undef
4807    Safe:
4808      %A = 0
4809    b: unreachable
4810
4811These examples show the crucial difference between an *undefined value*
4812and *undefined behavior*. An undefined value (like '``undef``') is
4813allowed to have an arbitrary bit-pattern. This means that the ``%A``
4814operation can be constant folded to '``0``', because the '``undef``'
4815could be zero, and zero divided by any value is zero.
4816However, in the second example, we can make a more aggressive
4817assumption: because the ``undef`` is allowed to be an arbitrary value,
4818we are allowed to assume that it could be zero. Since a divide by zero
4819has *undefined behavior*, we are allowed to assume that the operation
4820does not execute at all. This allows us to delete the divide and all
4821code after it. Because the undefined operation "can't happen", the
4822optimizer can assume that it occurs in dead code.
4823
4824.. code-block:: text
4825
4826    a:  store undef -> %X
4827    b:  store %X -> undef
4828    Safe:
4829    a: <deleted>     (if the stored value in %X is provably not poison)
4830    b: unreachable
4831
4832A store *of* an undefined value can be assumed to not have any effect;
4833we can assume that the value is overwritten with bits that happen to
4834match what was already there. This argument is only valid if the stored value
4835is provably not ``poison``. However, a store *to* an undefined
4836location could clobber arbitrary memory, therefore, it has undefined
4837behavior.
4838
4839Branching on an undefined value is undefined behavior.
4840This explains optimizations that depend on branch conditions to construct
4841predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the branch condition should be frozen;
otherwise, it is undefined behavior.
4844
4845.. code-block:: llvm
4846
4847    Unsafe:
4848      br undef, BB1, BB2 ; UB
4849
4850      %X = and i32 undef, 255
4851      switch %X, label %ret [ .. ] ; UB
4852
4853      store undef, ptr %ptr
4854      %X = load ptr %ptr ; %X is undef
4855      switch i8 %X, label %ret [ .. ] ; UB
4856
4857    Safe:
4858      %X = or i8 undef, 255 ; always 255
4859      switch i8 %X, label %ret [ .. ] ; Well-defined
4860
4861      %X = freeze i1 undef
4862      br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4863
4864
4865
4866.. _poisonvalues:
4867
4868Poison Values
4869-------------
4870
4871A poison value is a result of an erroneous operation.
4872In order to facilitate speculative execution, many instructions do not
4873invoke immediate undefined behavior when provided with illegal operands,
4874and return a poison value instead.
4875The string '``poison``' can be used anywhere a constant is expected, and
4876operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4877a poison value.
4878
4879Most instructions return '``poison``' when one of their arguments is
4880'``poison``'. A notable exception is the :ref:`select instruction <i_select>`.
4881Propagation of poison can be stopped with the
4882:ref:`freeze instruction <i_freeze>`.
4883
4884It is correct to replace a poison value with an
4885:ref:`undef value <undefvalues>` or any value of the type.
4886
4887This means that immediate undefined behavior occurs if a poison value is
4888used as an instruction operand that has any values that trigger undefined
4889behavior. Notably this includes (but is not limited to):
4890
4891-  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4892   any other pointer dereferencing instruction (independent of address
4893   space).
4894-  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4895   instruction.
4896-  The condition operand of a :ref:`br <i_br>` instruction.
4897-  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4898   instruction.
4899-  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4900   instruction, when the function or invoking call site has a ``noundef``
4901   attribute in the corresponding position.
4902-  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
   call site has a ``noundef`` attribute in the return value position.
4904
4905Here are some examples:
4906
4907.. code-block:: llvm
4908
4909    entry:
4910      %poison = sub nuw i32 0, 1           ; Results in a poison value.
4911      %poison2 = sub i32 poison, 1         ; Also results in a poison value.
4912      %still_poison = and i32 %poison, 0   ; 0, but also poison.
4913      %poison_yet_again = getelementptr i32, ptr @h, i32 %still_poison
4914      store i32 0, ptr %poison_yet_again   ; Undefined behavior due to
4915                                           ; store to poison.
4916
4917      store i32 %poison, ptr @g            ; Poison value stored to memory.
4918      %poison3 = load i32, ptr @g          ; Poison value loaded back from memory.
4919
4920      %poison4 = load i16, ptr @g          ; Returns a poison value.
4921      %poison5 = load i64, ptr @g          ; Returns a poison value.
4922
4923      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
4924      br i1 %cmp, label %end, label %end   ; undefined behavior
4925
4926    end:
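
For contrast, the following sketch (with a hypothetical function name) shows
how ``freeze`` stops propagation so that a later branch is well defined even
when the addition overflows:

.. code-block:: llvm

    define i32 @guarded_increment(i32 %x) {
      %inc = add nsw i32 %x, 1      ; poison if %x is the largest signed i32
      %safe = freeze i32 %inc       ; some fixed i32 value, never poison
      %neg = icmp slt i32 %safe, 0
      br i1 %neg, label %negative, label %done   ; well defined

    negative:
      ret i32 0

    done:
      ret i32 %safe
    }

Branching directly on a comparison of ``%inc`` would instead be undefined
behavior whenever the addition overflows.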
4927
4928.. _welldefinedvalues:
4929
4930Well-Defined Values
4931-------------------
4932
4933Given a program execution, a value is *well defined* if the value does not
4934have an undef bit and is not poison in the execution.
4935An aggregate value or vector is well defined if its elements are well defined.
4936The padding of an aggregate isn't considered, since it isn't visible
4937without storing it into memory and loading it with a different type.
4938
A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
of its operand.
4943
4944.. _blockaddress:
4945
4946Addresses of Basic Blocks
4947-------------------------
4948
4949``blockaddress(@function, %block)``
4950
4951The '``blockaddress``' constant computes the address of the specified
4952basic block in the specified function.
4953
It always has a ``ptr addrspace(P)`` type, where ``P`` is the address space
4955of the function containing ``%block`` (usually ``addrspace(0)``).
4956
4957Taking the address of the entry block is illegal.
4958
This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction or for comparisons against null.
Pointer equality tests between label addresses result in undefined behavior,
though, again, comparison against null is ok, and no label is equal to the null
pointer. This may be passed around as an opaque pointer-sized value as long as
4964the bits are not inspected. This allows ``ptrtoint`` and arithmetic to be
4965performed on these values so long as the original value is reconstituted before
4966the ``indirectbr`` instruction.
4967
4968Finally, some targets may provide defined semantics when using the value
4969as the operand to an inline assembly, but that is target specific.
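
A minimal sketch of the intended use, a jump table consumed by
``indirectbr``. The names ``@targets`` and ``@dispatch`` are hypothetical.

.. code-block:: llvm

    ; A table of block addresses; neither entry is the entry block.
    @targets = private constant [2 x ptr] [
        ptr blockaddress(@dispatch, %even),
        ptr blockaddress(@dispatch, %odd)
    ]

    define i32 @dispatch(i32 %n) {
    entry:
      %bit = and i32 %n, 1
      %slot = getelementptr inbounds [2 x ptr], ptr @targets, i32 0, i32 %bit
      %dest = load ptr, ptr %slot
      ; Every possible destination must be listed.
      indirectbr ptr %dest, [label %even, label %odd]

    even:
      ret i32 0

    odd:
      ret i32 1
    }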
4970
4971.. _dso_local_equivalent:
4972
4973DSO Local Equivalent
4974--------------------
4975
4976``dso_local_equivalent @func``
4977
4978A '``dso_local_equivalent``' constant represents a function which is
4979functionally equivalent to a given function, but is always defined in the
4980current linkage unit. The resulting pointer has the same type as the underlying
4981function. The resulting pointer is permitted, but not required, to be different
4982from a pointer to the function, and it may have different values in different
4983translation units.
4984
4985The target function may not have ``extern_weak`` linkage.
4986
4987``dso_local_equivalent`` can be implemented as such:
4988
4989- If the function has local linkage, hidden visibility, or is
4990  ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
4991  to the function.
4992- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
4993  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined within the
4995  linkage unit; LLVM will use this when available. (This is commonly called a
4996  "PLT stub".) On other targets, the stub may need to be emitted explicitly.
4997
4998This can be used wherever a ``dso_local`` instance of a function is needed without
4999needing to explicitly make the original function ``dso_local``. An instance where
5000this can be used is for static offset calculations between a function and some other
5001``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
5002where dynamic relocations for function pointers in VTables can be replaced with
5003static relocations for offsets between the VTable and virtual functions which
5004may not be ``dso_local``.
5005
5006This is currently only supported for ELF binary formats.
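
The static offset calculation mentioned above can be sketched as follows. The
names ``@extern_func`` and ``@vtable`` are hypothetical, and the layout is
only loosely modeled on a relative vtable entry.

.. code-block:: llvm

    declare void @extern_func()

    ; 32-bit offset from @vtable to a locally defined equivalent of
    ; @extern_func (the function itself or a stub for it).
    @vtable = dso_local constant i32 trunc (
        i64 sub (i64 ptrtoint (ptr dso_local_equivalent @extern_func to i64),
                 i64 ptrtoint (ptr @vtable to i64)) to i32)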
5007
5008.. _no_cfi:
5009
5010No CFI
5011------
5012
5013``no_cfi @func``
5014
5015With `Control-Flow Integrity (CFI)
5016<https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, a '``no_cfi``'
5017constant represents a function reference that does not get replaced with a
5018reference to the CFI jump table in the ``LowerTypeTests`` pass. These constants
5019may be useful in low-level programs, such as operating system kernels, which
5020need to refer to the actual function body.
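
A minimal sketch contrasting the two kinds of reference, using hypothetical
names:

.. code-block:: llvm

    declare void @handler()

    ; With CFI enabled, a plain reference may be redirected to the jump table;
    ; the no_cfi reference is left pointing at the function itself.
    @checked = global ptr @handler
    @raw     = global ptr no_cfi @handler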
5021
5022.. _ptrauth_constant:
5023
5024Pointer Authentication Constants
5025--------------------------------
5026
5027``ptrauth (ptr CST, i32 KEY[, i64 DISC[, ptr ADDRDISC]?]?)``
5028
5029A '``ptrauth``' constant represents a pointer with a cryptographic
5030authentication signature embedded into some bits, as described in the
5031`Pointer Authentication <PointerAuth.html>`__ document.
5032
5033A '``ptrauth``' constant is simply a constant equivalent to the
5034``llvm.ptrauth.sign`` intrinsic, potentially fed by a discriminator
5035``llvm.ptrauth.blend`` if needed.
5036
5037Its type is the same as the first argument.  An integer constant discriminator
5038and an address discriminator may be optionally specified.  Otherwise, they have
5039values ``i64 0`` and ``ptr null``.
5040
If the address discriminator is ``null`` then the expression is equivalent to:
5042
5043.. code-block:: llvm
5044
5045    %tmp = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr CST to i64), i32 KEY, i64 DISC)
5046    %val = inttoptr i64 %tmp to ptr
5047
5048Otherwise, the expression is equivalent to:
5049
5050.. code-block:: llvm
5051
5052    %tmp1 = call i64 @llvm.ptrauth.blend(i64 ptrtoint (ptr ADDRDISC to i64), i64 DISC)
5053    %tmp2 = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr CST to i64), i32 KEY, i64 %tmp1)
5054    %val = inttoptr i64 %tmp2 to ptr
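
Written as constants, the three argument forms look like the following
sketch. The global names are hypothetical, and the key numbers are
placeholders, since the meaning of a key is target specific.

.. code-block:: llvm

    @callee = external global i8

    ; Key only: the discriminators default to i64 0 and ptr null.
    @p0 = global ptr ptrauth (ptr @callee, i32 0)

    ; Key plus an integer discriminator.
    @p1 = global ptr ptrauth (ptr @callee, i32 0, i64 1234)

    ; Key, integer discriminator, and the address of the storage slot as the
    ; address discriminator.
    @p2 = global ptr ptrauth (ptr @callee, i32 2, i64 1234, ptr @p2)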
5055
5056.. _constantexprs:
5057
5058Constant Expressions
5059--------------------
5060
5061Constant expressions are used to allow expressions involving other
5062constants to be used as constants. Constant expressions may be of any
5063:ref:`first class <t_firstclass>` type and may involve any LLVM operation
5064that does not have side effects (e.g. load and call are not supported).
5065The following is the syntax for constant expressions:
5066
5067``trunc (CST to TYPE)``
5068    Perform the :ref:`trunc operation <i_trunc>` on constants.
5069``ptrtoint (CST to TYPE)``
5070    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
5071``inttoptr (CST to TYPE)``
5072    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
5073    This one is *really* dangerous!
5074``bitcast (CST to TYPE)``
5075    Convert a constant, CST, to another TYPE.
5076    The constraints of the operands are the same as those for the
5077    :ref:`bitcast instruction <i_bitcast>`.
5078``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
5080    TYPE in a different address space. The constraints of the operands are the
5081    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
5082``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
5083    Perform the :ref:`getelementptr operation <i_getelementptr>` on
5084    constants. As with the :ref:`getelementptr <i_getelementptr>`
5085    instruction, the index list may have one or more indexes, which are
5086    required to make sense for the type of "pointer to TY". These indexes
5087    may be implicitly sign-extended or truncated to match the index size
5088    of CSTPTR's address space.
5089``extractelement (VAL, IDX)``
5090    Perform the :ref:`extractelement operation <i_extractelement>` on
5091    constants.
5092``insertelement (VAL, ELT, IDX)``
5093    Perform the :ref:`insertelement operation <i_insertelement>` on
5094    constants.
5095``shufflevector (VEC1, VEC2, IDXMASK)``
5096    Perform the :ref:`shufflevector operation <i_shufflevector>` on
5097    constants.
5098``add (LHS, RHS)``
5099    Perform an addition on constants.
5100``sub (LHS, RHS)``
5101    Perform a subtraction on constants.
5102``mul (LHS, RHS)``
5103    Perform a multiplication on constants.
5104``shl (LHS, RHS)``
5105    Perform a left shift on constants.
5106``xor (LHS, RHS)``
5107    Perform a bitwise xor on constants.
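
A short sketch of constant expressions used as global initializers, with
hypothetical names:

.. code-block:: llvm

    @arr   = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of element 2 of @arr, computed without any instruction.
    @third = global ptr getelementptr inbounds ([4 x i32], ptr @arr, i64 0, i64 2)

    ; Integer arithmetic on a link-time address.
    @past  = global i64 add (i64 ptrtoint (ptr @arr to i64), i64 16)

    ; Arithmetic on plain integer constants.
    @seven = global i32 add (i32 3, i32 4)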
5108
5109Other Values
5110============
5111
5112.. _inlineasmexprs:
5113
5114Inline Assembler Expressions
5115----------------------------
5116
5117LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
5118Inline Assembly <moduleasm>`) through the use of a special value. This value
5119represents the inline assembler as a template string (containing the
5120instructions to emit), a list of operand constraints (stored as a string), a
5121flag that indicates whether or not the inline asm expression has side effects,
5122and a flag indicating whether the function containing the asm needs to align its
5123stack conservatively.
5124
5125The template string supports argument substitution of the operands using "``$``"
5126followed by a number, to indicate substitution of the given register/memory
5127location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
5128be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (see :ref:`inline-asm-modifiers`).
5130
5131A literal "``$``" may be included by using "``$$``" in the template. To include
5132other special characters into the output, the usual "``\XX``" escapes may be
5133used, just as in other strings. Note that after template substitution, the
5134resulting assembly string is parsed by LLVM's integrated assembler unless it is
5135disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
5136syntax known to LLVM.
5137
5138LLVM also supports a few more substitutions useful for writing inline assembly:
5139
5140- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
5141  This substitution is useful when declaring a local label. Many standard
5142  compiler optimizations, such as inlining, may duplicate an inline asm blob.
5143  Adding a blob-unique identifier ensures that the two labels will not conflict
5144  during assembly. This is used to implement `GCC's %= special format
5145  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
5146- ``${:comment}``: Expands to the comment character of the current target's
5147  assembly dialect. This is usually ``#``, but many targets use other strings,
5148  such as ``;``, ``//``, or ``!``.
5149- ``${:private}``: Expands to the assembler private label prefix. Labels with
5150  this prefix will not appear in the symbol table of the assembled object.
5151  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
5152  relatively popular.
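
The substitutions above can be combined in one template, as in the following
sketch. It assumes an x86-64 ELF target, and the label name and function name
are hypothetical.

.. code-block:: llvm

    define void @marker() {
      ; On the assumed target this expands to something like
      ;   .Lmark42: nop # cost $0
      ; where the number is unique to this asm blob and "$$" yields a
      ; literal "$".
      call void asm sideeffect "${:private}mark${:uid}: nop ${:comment} cost $$0", ""()
      ret void
    }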
5153
5154LLVM's support for inline asm is modeled closely on the requirements of Clang's
5155GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
5156modifier codes listed here are similar or identical to those in GCC's inline asm
5157support. However, to be clear, the syntax of the template and constraint strings
5158described here is *not* the same as the syntax accepted by GCC and Clang, and,
5159while most constraint letters are passed through as-is by Clang, some get
5160translated to other codes when converting from the C source to the LLVM
5161assembly.
5162
5163An example inline assembler expression is:
5164
5165.. code-block:: llvm
5166
5167    i32 (i32) asm "bswap $0", "=r,r"
5168
5169Inline assembler expressions may **only** be used as the callee operand
5170of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
5171Thus, typically we have:
5172
5173.. code-block:: llvm
5174
5175    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
5176
5177Inline asms with side effects not visible in the constraint list must be
5178marked as having side effects. This is done through the use of the
5179'``sideeffect``' keyword, like so:
5180
5181.. code-block:: llvm
5182
5183    call void asm sideeffect "eieio", ""()
5184
5185In some cases inline asms will contain code that will not work unless
5186the stack is aligned in some way, such as calls or SSE instructions on
5187x86, yet will not contain code that does that alignment within the asm.
5188The compiler should make conservative assumptions about what the asm
5189might contain and should generate its usual stack alignment code in the
5190prologue if the '``alignstack``' keyword is present:
5191
5192.. code-block:: llvm
5193
5194    call void asm alignstack "eieio", ""()
5195
5196Inline asms also support using non-standard assembly dialects. The
5197assumed dialect is ATT. When the '``inteldialect``' keyword is present,
5198the inline asm is using the Intel dialect. Currently, ATT and Intel are
5199the only supported dialects. An example is:
5200
5201.. code-block:: llvm
5202
5203    call void asm inteldialect "eieio", ""()
5204
5205In the case that the inline asm might unwind the stack,
5206the '``unwind``' keyword must be used, so that the compiler emits
5207unwinding information:
5208
5209.. code-block:: llvm
5210
5211    call void asm unwind "call func", ""()
5212
5213If the inline asm unwinds the stack and isn't marked with
5214the '``unwind``' keyword, the behavior is undefined.
5215
5216If multiple keywords appear, the '``sideeffect``' keyword must come
5217first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
5218third and the '``unwind``' keyword last.
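
For instance, a call that uses all four keywords would be written in this
order (a sketch with a placeholder template and a hypothetical function name):

.. code-block:: llvm

    define void @all_flags() {
      call void asm sideeffect alignstack inteldialect unwind "nop", ""()
      ret void
    }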
5219
5220Inline Asm Constraint String
5221^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5222
5223The constraint list is a comma-separated string, each element containing one or
5224more constraint codes.
5225
5226For each element in the constraint list an appropriate register or memory
5227operand will be chosen, and it will be made available to assembly template
5228string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
5229second, etc.
5230
5231There are three different types of constraints, which are distinguished by a
5232prefix symbol in front of the constraint code: Output, Input, and Clobber. The
5233constraints must always be given in that order: outputs first, then inputs, then
5234clobbers. They cannot be intermingled.
5235
5236There are also three different categories of constraint codes:
5237
5238- Register constraint. This is either a register class, or a fixed physical
5239  register. This kind of constraint will allocate a register, and if necessary,
5240  bitcast the argument or result to the appropriate type.
5241- Memory constraint. This kind of constraint is for use with an instruction
5242  taking a memory operand. Different constraints allow for different addressing
5243  modes used by the target.
5244- Immediate value constraint. This kind of constraint is for an integer or other
5245  immediate value which can be rendered directly into an instruction. The
5246  various target-specific constraints allow the selection of a value in the
5247  proper range for the instruction you wish to use it with.
5248
5249Output constraints
5250""""""""""""""""""
5251
5252Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
5253indicates that the assembly will write to this operand, and the operand will
5254then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction (except for
indirect outputs, described below).
5257
5258Normally, it is expected that no output locations are written to by the assembly
5259expression until *all* of the inputs have been read. As such, LLVM may assign
5260the same register to an output and an input. If this is not safe (e.g. if the
5261assembly contains two instructions, where the first writes to one output, and
5262the second reads an input and writes to a second output), then the "``&``"
5263modifier must be used (e.g. "``=&r``") to specify that the output is an
5264"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
5265will not use the same register for any inputs (other than an input tied to this
5266output).
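
The following sketch illustrates a case where the early-clobber modifier is
needed. It assumes an x86 target with AT&T syntax, and ``@add_pair`` is a
hypothetical name:

.. code-block:: llvm

    define i32 @add_pair(i32 %a, i32 %b) {
      ; The first instruction writes $0 before the second instruction reads $2.
      ; Without "&", LLVM could place $0 and $2 in the same register, and the
      ; value of %b would be overwritten before it is read.
      %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %r
    }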
5267
5268Input constraints
5269"""""""""""""""""
5270
5271Input constraints do not have a prefix -- just the constraint codes. Each input
5272constraint will consume one argument from the call instruction. It is not
5273permitted for the asm to write to any input register or memory location (unless
5274that input is tied to an output). Note also that multiple inputs may all be
5275assigned to the same register, if LLVM can determine that they necessarily all
5276contain the same value.
5277
5278Instead of providing a Constraint Code, input constraints may also "tie"
5279themselves to an output constraint, by providing an integer as the constraint
5280string. Tied inputs still consume an argument from the call instruction, and
5281take up a position in the asm template numbering as is usual -- they will simply
5282be constrained to always use the same register as the output they've been tied
5283to. For example, a constraint string of "``=r,0``" says to assign a register for
5284output, and use that register as an input as well (it being the 0'th
5285constraint).
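
A short sketch of a tied input, again assuming an x86 target (``@byte_swap`` is
a hypothetical name):

.. code-block:: llvm

    define i32 @byte_swap(i32 %x) {
      ; "0" ties the input to constraint 0, so %x is placed in the same
      ; register that holds the result. The tied input still occupies
      ; position $1 in the template numbering, although it is unused here.
      %r = call i32 asm "bswap $0", "=r,0"(i32 %x)
      ret i32 %r
    }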
5286
5287It is permitted to tie an input to an "early-clobber" output. In that case, no
5288*other* input may share the same register as the input tied to the early-clobber
5289(even when the other input has the same value).
5290
5291You may only tie an input to an output which has a register constraint, not a
5292memory constraint. Only a single input may be tied to an output.
5293
5294There is also an "interesting" feature which deserves a bit of explanation: if a
5295register class constraint allocates a register which is too small for the value
5296type operand provided as input, the input value will be split into multiple
5297registers, and all of them passed to the inline asm.
5298
5299However, this feature is often not as useful as you might think.
5300
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (For example, the
32-bit SparcV8 has a 64-bit load instruction that names a single 32-bit
register; the hardware then loads into both the named register and the next
register. This feature of inline asm would not be useful to support that.)
5307
5308A few of the targets provide a template string modifier allowing explicit access
5309to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
5310``D``). On such an architecture, you can actually access the second allocated
5311register (yet, still, not any subsequent ones). But, in that case, you're still
5312probably better off simply splitting the value into two separate operands, for
5313clarity. (e.g. see the description of the ``A`` constraint on X86, which,
5314despite existing only for use with this feature, is not really a good idea to
5315use)
5316
5317Indirect inputs and outputs
5318"""""""""""""""""""""""""""
5319
5320Indirect output or input constraints can be specified by the "``*``" modifier
5321(which goes after the "``=``" in case of an output). This indicates that the asm
5322will write to or read from the contents of an *address* provided as an input
5323argument. (Note that in this way, indirect outputs act more like an *input* than
5324an output: just like an input, they consume an argument of the call expression,
5325rather than producing a return value. An indirect output constraint is an
5326"output" only in that the asm is expected to write to the contents of the input
5327memory location, instead of just read from it).
5328
This is most typically used for memory constraints, e.g. "``=*m``", to pass the
address of a variable as a value.
5331
5332It is also possible to use an indirect *register* constraint, but only on output
5333(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
5334value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Its use is not recommended.)
5339
5340Call arguments for indirect constraints must have pointer type and must specify
5341the :ref:`elementtype <attr_elementtype>` attribute to indicate the pointer
5342element type.
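
The sketch below shows one indirect memory output and one indirect memory
input. It assumes an x86 target with AT&T syntax; the function names are
hypothetical:

.. code-block:: llvm

    define void @store_42(ptr %p) {
      ; "=*m" is an indirect output: it consumes %p as a call argument, and the
      ; asm writes to the i32 that %p points to. No value is returned.
      call void asm "movl $$42, $0", "=*m"(ptr elementtype(i32) %p)
      ret void
    }

    define i32 @load_value(ptr %p) {
      ; "*m" is an indirect input: the asm reads the i32 stored at %p.
      %v = call i32 asm "movl $1, $0", "=r,*m"(ptr elementtype(i32) %p)
      ret i32 %v
    }

In both calls the pointer argument carries ``elementtype(i32)``, as required
for indirect constraints.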
5343
5344Clobber constraints
5345"""""""""""""""""""
5346
5347A clobber constraint is indicated by a "``~``" prefix. A clobber does not
5348consume an input operand, nor generate an output. Clobbers cannot use any of the
5349general constraint code letters -- they may use only explicit register
5350constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
5351"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
5352memory locations -- not only the memory pointed to by a declared indirect
5353output.
5354
5355Note that clobbering named registers that are also present in output
5356constraints is not legal.
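
As an illustration, assuming an x86 target with AT&T syntax (the function name
is hypothetical), register and memory clobbers might be written as follows:

.. code-block:: llvm

    define void @clobber_examples() {
      ; The asm overwrites EAX and the flags without declaring them as
      ; outputs, so both are listed as clobbers.
      call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()
      ; "~{memory}" declares that the asm may write to arbitrary memory.
      call void asm sideeffect "mfence", "~{memory}"()
      ret void
    }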
5357
5358Label constraints
5359"""""""""""""""""
5360
5361A label constraint is indicated by a "``!``" prefix and typically used in the
5362form ``"!i"``. Instead of consuming call arguments, label constraints consume
5363indirect destination labels of ``callbr`` instructions.
5364
5365Label constraints can only be used in conjunction with ``callbr`` and the
5366number of label constraints must match the number of indirect destination
5367labels in the ``callbr`` instruction.
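
A minimal sketch of a label constraint, assuming an x86 target with AT&T
syntax. The function and block names are hypothetical:

.. code-block:: llvm

    define i32 @is_nonzero(i32 %x) {
    entry:
      ; $0 is the register input; $1 is the label supplied by "!i". The "l"
      ; template modifier prints it as a plain label. One label constraint
      ; matches the one indirect destination in the callbr.
      callbr void asm sideeffect "testl $0, $0\0A\09jne ${1:l}", "r,!i,~{flags}"(i32 %x)
              to label %fallthrough [label %nonzero]

    fallthrough:
      ret i32 0

    nonzero:
      ret i32 1
    }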
5368
5369
5370Constraint Codes
5371""""""""""""""""

After an optional prefix comes the constraint code, or codes.
5373
5374A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
5375followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
5376(e.g. "``{eax}``").
5377
5378The one and two letter constraint codes are typically chosen to be the same as
5379GCC's constraint codes.
5380
A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.
5384
5385There are two ways to specify alternatives, and either or both may be used in an
5386inline asm constraint list:
5387
53881) Append the codes to each other, making a constraint code set. E.g. "``im``"
5389   or "``{eax}m``". This means "choose any of the options in the set". The
5390   choice of constraint is made independently for each constraint in the
5391   constraint list.
5392
53932) Use "``|``" between constraint code sets, creating alternatives. Every
5394   constraint in the constraint list must have the same number of alternative
5395   sets. With this syntax, the same alternative in *all* of the items in the
5396   constraint list will be chosen together.
5397
Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But operands 0 and 1 cannot both be of type ``m``.
5402
5403However, the use of either of the alternatives features is *NOT* recommended, as
5404LLVM is not able to make an intelligent choice about which one to use. (At the
5405point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that yields optimal performance. (e.g., given "``rm``", it'll
always choose to use memory, not registers). And, if given multiple registers,
5409or multiple register classes, it will simply choose the first one. (In fact, it
5410doesn't currently even ensure explicitly specified physical registers are
5411unique, so specifying multiple physical registers as alternatives, like
5412``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
5413intended.)
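
For illustration only, a constraint code set on an input might look like the
sketch below (x86 target assumed, hypothetical function name). Per the caveat
above, LLVM would be expected to satisfy "``rm``" with a memory operand here,
even though a register would likely be cheaper:

.. code-block:: llvm

    define i32 @copy_value(i32 %x) {
      ; "rm" offers LLVM a choice between a register and a memory operand
      ; for the input; the choice is made by LLVM, not by the IR author.
      %r = call i32 asm "movl $1, $0", "=r,rm"(i32 %x)
      ret i32 %r
    }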
5414
5415Supported Constraint Code List
5416""""""""""""""""""""""""""""""
5417
5418The constraint codes are, in general, expected to behave the same way they do in
5419GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5420inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5421and GCC likely indicates a bug in LLVM.
5422
5423Some constraint codes are typically supported by all targets:
5424
5425- ``r``: A register in the target's general purpose register class.
5426- ``m``: A memory address operand. It is target-specific what addressing modes
5427  are supported, typical examples are register, or register + register offset,
5428  or register + immediate offset (of some target-specific size).
5429- ``p``: An address operand. Similar to ``m``, but used by "load address"
5430  type instructions without touching memory.
5431- ``i``: An integer constant (of target-specific width). Allows either a simple
5432  immediate, or a relocatable value.
5433- ``n``: An integer constant -- *not* including relocatable values.
5434- ``s``: A symbol or label reference with a constant offset.
5435- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
5436  useful to pass a label for an asm branch or call.
5437
5438  .. FIXME: but that surely isn't actually okay to jump out of an asm
5439     block without telling llvm about the control transfer???)
5440
5441- ``{register-name}``: Requires exactly the named physical register.
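
The following sketch combines a fixed physical register with an immediate
constraint. It assumes an x86 target with AT&T syntax, and ``@add_five`` is a
hypothetical name:

.. code-block:: llvm

    define i32 @add_five(i32 %x) {
      ; "={eax}" pins the output to EAX, "0" ties the first input to it,
      ; and "i" accepts the constant 5 as an immediate operand ($2).
      %r = call i32 asm "addl $2, $0", "={eax},0,i,~{flags}"(i32 %x, i32 5)
      ret i32 %r
    }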
5442
5443Other constraints are target-specific:
5444
5445AArch64:
5446
5447- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
5448- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
5449  i.e. 0 to 4095 with optional shift by 12.
5450- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
5451  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
5452- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
5453  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
5454- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
5455  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
5456- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
5457  32-bit register. This is a superset of ``K``: in addition to the bitmask
5458  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
5460- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
5461  64-bit register. This is a superset of ``L``.
5462- ``Q``: Memory address operand must be in a single register (no
5463  offsets). (However, LLVM currently does this for the ``m`` constraint as
5464  well.)
5465- ``r``: A 32 or 64-bit integer register (W* or X*).
5466- ``S``: A symbol or label reference with a constant offset. The generic ``s``
5467  is not supported.
5468- ``Uci``: Like r, but restricted to registers 8 to 11 inclusive.
5469- ``Ucj``: Like r, but restricted to registers 12 to 15 inclusive.
5470- ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
5471- ``x``: Like w, but restricted to registers 0 to 15 inclusive.
5472- ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
5473- ``Uph``: One of the upper eight SVE predicate registers (P8 to P15)
5474- ``Upl``: One of the lower eight SVE predicate registers (P0 to P7)
5475- ``Upa``: Any of the SVE predicate registers (P0 to P15)
5476
5477AMDGPU:
5478
5479- ``r``: A 32 or 64-bit integer register.
5480- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
5481- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
5482- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
5483- ``I``: An integer inline constant in the range from -16 to 64.
5484- ``J``: A 16-bit signed integer constant.
5485- ``A``: An integer or a floating-point inline constant.
5486- ``B``: A 32-bit signed integer constant.
5487- ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
5488- ``DA``: A 64-bit constant that can be split into two "A" constants.
5489- ``DB``: A 64-bit constant that can be split into two "B" constants.
5490
5491All ARM modes:
5492
5493- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
5494  operand. Treated the same as operand ``m``, at the moment.
5495- ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
5496- ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
5497
5498ARM and ARM's Thumb2 mode:
5499
5500- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
5501- ``I``: An immediate integer valid for a data-processing instruction.
5502- ``J``: An immediate integer between -4095 and 4095.
5503- ``K``: An immediate integer whose bitwise inverse is valid for a
5504  data-processing instruction. (Can be used with template modifier "``B``" to
5505  print the inverted value).
5506- ``L``: An immediate integer whose negation is valid for a data-processing
5507  instruction. (Can be used with template modifier "``n``" to print the negated
5508  value).
5509- ``M``: A power of two or an integer between 0 and 32.
5510- ``N``: Invalid immediate constraint.
5511- ``O``: Invalid immediate constraint.
5512- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
5513- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
5514  as ``r``.
5515- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
5516  invalid.
5517- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5518  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5519- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5520  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5521- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5522  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
5523
5524ARM's Thumb1 mode:
5525
5526- ``I``: An immediate integer between 0 and 255.
5527- ``J``: An immediate integer between -255 and -1.
5528- ``K``: An immediate integer between 0 and 255, with optional left-shift by
5529  some amount.
5530- ``L``: An immediate integer between -7 and 7.
5531- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
5532- ``N``: An immediate integer between 0 and 31.
5533- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
5534- ``r``: A low 32-bit GPR register (``r0-r7``).
5535- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
5537- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5538  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5539- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5540  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5541- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5542  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
5543
5544Hexagon:
5545
5546- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
5547  at the moment.
5548- ``r``: A 32 or 64-bit register.
5549
5550LoongArch:
5551
5552- ``f``: A floating-point register (if available).
5553- ``k``: A memory operand whose address is formed by a base register and
5554  (optionally scaled) index register.
5555- ``l``: A signed 16-bit constant.
5556- ``m``: A memory operand whose address is formed by a base register and
5557  offset that is suitable for use in instructions with the same addressing
5558  mode as st.w and ld.w.
5559- ``I``: A signed 12-bit constant (for arithmetic instructions).
5560- ``J``: An immediate integer zero.
5561- ``K``: An unsigned 12-bit constant (for logic instructions).
5562- ``ZB``: An address that is held in a general-purpose register. The offset
5563  is zero.
5564- ``ZC``: A memory operand whose address is formed by a base register and
5565  offset that is suitable for use in instructions with the same addressing
5566  mode as ll.w and sc.w.
5567
5568MSP430:
5569
5570- ``r``: An 8 or 16-bit register.
5571
5572MIPS:
5573
5574- ``I``: An immediate signed 16-bit integer.
5575- ``J``: An immediate integer zero.
5576- ``K``: An immediate unsigned 16-bit integer.
5577- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
5578- ``N``: An immediate integer between -65535 and -1.
5579- ``O``: An immediate signed 15-bit integer.
5580- ``P``: An immediate integer between 1 and 65535.
5581- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
5582  register plus 16-bit immediate offset. In MIPS mode, just a base register.
5583- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
5584  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
5585  ``m``.
5586- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
5587  ``sc`` instruction on the given subtarget (details vary).
5588- ``r``, ``d``,  ``y``: A 32 or 64-bit GPR register.
5589- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
5590  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
5591  argument modifier for compatibility with GCC.
5592- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
5593  ``25``).
5594- ``l``: The ``lo`` register, 32 or 64-bit.
5595- ``x``: Invalid.
5596
5597NVPTX:
5598
5599- ``b``: A 1-bit integer register.
5600- ``c`` or ``h``: A 16-bit integer register.
5601- ``r``: A 32-bit integer register.
5602- ``l`` or ``N``: A 64-bit integer register.
5603- ``q``: A 128-bit integer register.
5604- ``f``: A 32-bit float register.
5605- ``d``: A 64-bit float register.
5606
5607
5608PowerPC:
5609
5610- ``I``: An immediate signed 16-bit integer.
5611- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
5612- ``K``: An immediate unsigned 16-bit integer.
5613- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
5614- ``M``: An immediate integer greater than 31.
5615- ``N``: An immediate integer that is an exact power of 2.
5616- ``O``: The immediate integer constant 0.
5617- ``P``: An immediate integer constant whose negation is a signed 16-bit
5618  constant.
5619- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
5620  treated the same as ``m``.
5621- ``r``: A 32 or 64-bit integer register.
5622- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
5623  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).
5627
5628- ``y``: Condition register (``CR0-CR7``).
5629- ``wc``: An individual CR bit in a CR register.
5630- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
5631  register set (overlapping both the floating-point and vector register files).
5632- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
5633  set.
5634
5635RISC-V:
5636
5637- ``A``: An address operand (using a general-purpose register, without an
5638  offset).
5639- ``I``: A 12-bit signed integer immediate operand.
5640- ``J``: A zero integer immediate operand.
5641- ``K``: A 5-bit unsigned integer immediate operand.
5642- ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
5643- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
5644  ``XLEN``).
5645- ``S``: Alias for ``s``.
5646- ``vd``: A vector register, excluding ``v0`` (requires V extension).
5647- ``vm``: The vector register ``v0`` (requires V extension).
5648- ``vr``: A vector register (requires V extension).
5649
5650Sparc:
5651
5652- ``I``: An immediate 13-bit signed integer.
5653- ``r``: A 32-bit integer register.
5654- ``f``: Any floating-point register on SparcV8, or a floating-point
5655  register in the "low" half of the registers on SparcV9.
5656- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
5657
5658SystemZ:
5659
5660- ``I``: An immediate unsigned 8-bit integer.
5661- ``J``: An immediate unsigned 12-bit integer.
5662- ``K``: An immediate signed 16-bit integer.
5663- ``L``: An immediate signed 20-bit integer.
5664- ``M``: An immediate integer 0x7fffffff.
5665- ``Q``: A memory address operand with a base address and a 12-bit immediate
5666  unsigned displacement.
5667- ``R``: A memory address operand with a base address, a 12-bit immediate
5668  unsigned displacement, and an index register.
5669- ``S``: A memory address operand with a base address and a 20-bit immediate
5670  signed displacement.
5671- ``T``: A memory address operand with a base address, a 20-bit immediate
5672  signed displacement, and an index register.
5673- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
5674- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
5675  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
5678- ``f``: A 32, 64, or 128-bit floating-point register.
5679
5680X86:
5681
5682- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
5684- ``K``: An immediate signed 8-bit integer.
5685- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
5686  0xffffffff.
5687- ``M``: An immediate integer between 0 and 3.
5688- ``N``: An immediate unsigned 8-bit integer.
5689- ``O``: An immediate integer between 0 and 127.
5690- ``e``: An immediate 32-bit signed integer.
5691- ``Z``: An immediate 32-bit unsigned integer.
- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
  registers, and on X86-64, it is all of the integer registers. When features
  ``egpr`` and ``inline-asm-use-gpr32`` are both on, it will be extended to gpr32.
- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register. When features
  ``egpr`` and ``inline-asm-use-gpr32`` are both on, it will be extended to gpr32.
5700- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
5701  existed since i386, and can be accessed without the REX prefix.
5702- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
5703- ``y``: A 64-bit MMX register, if MMX is enabled.
5704- ``v``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
5705  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
5706  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
5707  512-bit vector operand in an AVX512 register. Otherwise, an error.
5708- ``Ws``: A symbolic reference with an optional constant addend or a label
5709  reference.
5710- ``x``: The same as ``v``, except that when AVX-512 is enabled, the ``x`` code
5711  only allocates into the first 16 AVX-512 registers, while the ``v`` code
5712  allocates into any of the 32 AVX-512 registers.
5713- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
5714- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
5715  32-bit mode, a 64-bit integer operand will get split into two registers). It
5716  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
5717  operand will get allocated only to RAX -- if two 32-bit operands are needed,
5718  you're better off splitting it yourself, before passing it to the asm
5719  statement.
- ``jr``: An 8, 16, 32, or 64-bit integer gpr16. It won't be extended to gpr32
  when feature ``egpr`` or ``inline-asm-use-gpr32`` is on.
- ``jR``: An 8, 16, 32, or 64-bit integer gpr32 when feature ``egpr`` is on.
  Otherwise, same as ``r``.
5724
5725XCore:
5726
5727- ``r``: A 32-bit integer register.
5728
5729
5730.. _inline-asm-modifiers:
5731
5732Asm template argument modifiers
5733^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5734
5735In the asm template string, modifiers can be used on the operand reference, like
5736"``${0:n}``".
5737
5738The modifiers are, in general, expected to behave the same way they do in
5739GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5740inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5741and GCC likely indicates a bug in LLVM.
5742
5743Target-independent:
5744
5745- ``c``: Print an immediate integer constant unadorned, without
5746  the target-specific immediate punctuation (e.g. no ``$`` prefix).
5747- ``n``: Negate and print immediate integer constant unadorned, without the
5748  target-specific immediate punctuation (e.g. no ``$`` prefix).
5749- ``l``: Print as an unadorned label, without the target-specific label
5750  punctuation (e.g. no ``$`` prefix).
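
As a sketch of the ``c`` modifier, the example below assumes an x86-64 target
with AT&T syntax, where immediates are normally printed with a ``$`` prefix.
The function name is hypothetical:

.. code-block:: llvm

    define i64 @offset_by_eight(i64 %x) {
      ; ${2:c} prints the immediate as a bare "8" rather than "$8", which is
      ; the form required for an address displacement.
      %r = call i64 asm "leaq ${2:c}($1), $0", "=r,r,i"(i64 %x, i64 8)
      ret i64 %r
    }

Using ``${2:n}`` in the same position would print the negated constant
instead.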
5751
5752AArch64:
5753
5754- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
5755  instead of ``x30``, print ``w30``.
5756- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
5757- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
5758  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
5759  ``v*``.
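
A minimal sketch of the ``w`` modifier, assuming an AArch64 target
(``@add32`` is a hypothetical name):

.. code-block:: llvm

    define i32 @add32(i32 %a, i32 %b) {
      ; GPR operands print with an x* name by default; ${N:w} selects the
      ; 32-bit w* name so that a 32-bit add is emitted.
      %r = call i32 asm "add ${0:w}, ${1:w}, ${2:w}", "=r,r,r"(i32 %a, i32 %b)
      ret i32 %r
    }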
5760
5761AMDGPU:
5762
5763- ``r``: No effect.
5764
5765ARM:
5766
5767- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
5768  register).
5769- ``P``: No effect.
5770- ``q``: No effect.
5771- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
5772  as ``d4[1]`` instead of ``s9``)
5773- ``B``: Bitwise invert and print an immediate integer constant without ``#``
5774  prefix.
5775- ``L``: Print the low 16-bits of an immediate integer constant.
5776- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
5777  register operands subsequent to the specified one (!), so use carefully.
5778- ``Q``: Print the low-order register of a register-pair, or the low-order
5779  register of a two-register operand.
5780- ``R``: Print the high-order register of a register-pair, or the high-order
5781  register of a two-register operand.
5782- ``H``: Print the second register of a register-pair. (On a big-endian system,
5783  ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
5784  to ``R``.)
5785
5786  .. FIXME: H doesn't currently support printing the second register
5787     of a two-register operand.
5788
5789- ``e``: Print the low doubleword register of a NEON quad register.
5790- ``f``: Print the high doubleword register of a NEON quad register.
5791- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
5792  adornment.
5793
5794Hexagon:
5795
5796- ``L``: Print the second register of a two-register operand. Requires that it
5797  has been allocated consecutively to the first.
5798
5799  .. FIXME: why is it restricted to consecutive ones? And there's
5800     nothing that ensures that happens, is there?
5801
5802- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5803  nothing. Used to print 'addi' vs 'add' instructions.
5804
5805LoongArch:
5806
- ``z``: Print the ``$zero`` register if the operand is zero, otherwise print it
  normally.
5808
5809MSP430:
5810
5811No additional modifiers.
5812
5813MIPS:
5814
5815- ``X``: Print an immediate integer as hexadecimal
5816- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
5817- ``d``: Print an immediate integer as decimal.
5818- ``m``: Subtract one and print an immediate integer as decimal.
5819- ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.
5822
5823  .. FIXME: L seems to be missing memory operand support.
5824
- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.
5827
5828  .. FIXME: M seems to be missing memory operand support.
5829
- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
5834- ``w``: No effect. Provided for compatibility with GCC which requires this
5835  modifier in order to print MSA registers (``W0-W31``) with the ``f``
5836  constraint.
5837
5838NVPTX:
5839
5840- ``r``: No effect.
5841
5842PowerPC:
5843
5844- ``L``: Print the second register of a two-register operand. Requires that it
5845  has been allocated consecutively to the first.
5846
5847  .. FIXME: why is it restricted to consecutive ones? And there's
5848     nothing that ensures that happens, is there?
5849
5850- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5851  nothing. Used to print 'addi' vs 'add' instructions.
5852- ``y``: For a memory operand, prints formatter for a two-register X-form
5853  instruction. (Currently always prints ``r0,OPERAND``).
5854- ``U``: Prints 'u' if the memory operand is an update form, and nothing
5855  otherwise. (NOTE: LLVM does not support update form, so this will currently
5856  always print nothing)
5857- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5858  not support indexed form, so this will currently always print nothing)
5859
5860RISC-V:
5861
5862- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
5863  nothing. Used to print 'addi' vs 'add' instructions, etc.
5864- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
5865  normally.
5866
5867Sparc:
5868
5869- ``L``: Print the low-order register of a two-register operand.
5870- ``H``: Print the high-order register of a two-register operand.
5871- ``r``: No effect.
5872
5873SystemZ:
5874
5875SystemZ implements only ``n``, and does *not* support any of the other
5876target-independent modifiers.
5877
5878X86:
5879
5880- ``c``: Print an unadorned integer or symbol name. (The latter is
5881  target-specific behavior for this typically target-independent modifier).
5882- ``A``: Print a register name with a '``*``' before it.
5883- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5884  operand.
5885- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5886  memory operand.
5887- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5888  operand.
5889- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5890  operand.
5891- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5892  available, otherwise the 32-bit register name; do nothing on a memory operand.
5893- ``n``: Negate and print an unadorned integer, or, for operands other than an
5894  immediate integer (e.g. a relocatable symbol expression), print a '-' before
5895  the operand. (The behavior for relocatable symbol expressions is a
5896  target-specific behavior for this typically target-independent modifier)
5897- ``H``: Print a memory reference with additional offset +8.
5898- ``p``: Print a raw symbol name (without syntax-specific prefixes).
- ``P``: Print a memory reference used as the argument of a call instruction or
  used with explicit base reg and index reg as its offset. Such a reference
  cannot use additional registers to represent the memory reference. (E.g. omit
  ``(rip)``, even though it's PC-relative.)
5903
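As a sketch of how a template modifier is spelled in practice, the following
call uses the X86 ``b`` and ``k`` modifiers described above. The operand names
and the choice of instruction are illustrative assumptions, not taken from a
particular front-end:

.. code-block:: llvm

    ; "${1:b}" prints the 8-bit name of the register chosen for operand 1
    ; (e.g. al), and "${0:k}" prints the 32-bit name of the output register
    ; (e.g. eax). The "q" constraint asks for a byte-addressable register.
    %r = call i32 asm "movzbl ${1:b}, ${0:k}", "=r,q"(i32 %x)
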
5904XCore:
5905
5906No additional modifiers.
5907
5908
5909Inline Asm Metadata
5910^^^^^^^^^^^^^^^^^^^
5911
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:
5919
5920.. code-block:: llvm
5921
5922    call void asm sideeffect "something bad", ""(), !srcloc !42
5923    ...
5924    !42 = !{ i64 1234567 }
5925
5926It is up to the front-end to make sense of the magic numbers it places
5927in the IR. If the MDNode contains multiple constants, the code generator
5928will use the one that corresponds to the line of the asm that the error
5929occurs on.
5930
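When the asm string spans several lines, the MDNode may carry one cookie per
line. A minimal sketch, where the first instruction and both cookie values are
made up for illustration:

.. code-block:: llvm

    ; "\0A" is a newline, so this asm string has two lines.
    call void asm sideeffect "nop\0Asomething bad", ""(), !srcloc !43
    ...
    !43 = !{ i64 1234567, i64 1234568 }

Under the rule above, an error on the second line of the asm string would be
reported with the second cookie.
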
5931.. _metadata:
5932
5933Metadata
5934========
5935
5936LLVM IR allows metadata to be attached to instructions and global objects in
5937the program that can convey extra information about the code to the optimizers
5938and code generator.
5939
5940There are two metadata primitives: strings and nodes. There are
5941also specialized nodes which have a distinguished name and a set of named
5942arguments.
5943
5944.. note::
5945
5946    One example application of metadata is source-level debug information,
5947    which is currently the only user of specialized nodes.
5948
5949Metadata does not have a type, and is not a value.
5950
5951A value of non-\ ``metadata`` type can be used in a metadata context using the
5952syntax '``<type> <value>``'.
5953
5954All other metadata is identified in syntax as starting with an exclamation
5955point ('``!``').
5956
5957Metadata may be used in the following value contexts by using the ``metadata``
5958type:
5959
5960- Arguments to certain intrinsic functions, as described in their specification.
5961- Arguments to the ``catchpad``/``cleanuppad`` instructions.
5962
5963.. note::
5964
5965    Metadata can be "wrapped" in a ``MetadataAsValue`` so it can be referenced
5966    in a value context: ``MetadataAsValue`` is-a ``Value``.
5967
5968    A typed value can be "wrapped" in ``ValueAsMetadata`` so it can be
5969    referenced in a metadata context: ``ValueAsMetadata`` is-a ``Metadata``.
5970
5971    There is no explicit syntax for a ``ValueAsMetadata``, and instead
5972    the fact that a type identifier cannot begin with an exclamation point
5973    is used to resolve ambiguity.
5974
5975    A ``metadata`` type implies a ``MetadataAsValue``, and when followed with a
5976    '``<type> <value>``' pair it wraps the typed value in a ``ValueAsMetadata``.
5977
5978    For example, the first argument
5979    to this call is a ``MetadataAsValue(ValueAsMetadata(Value))``:
5980
5981    .. code-block:: llvm
5982
5983        call void @llvm.foo(metadata i32 1)
5984
5985    Whereas the first argument to this call is a ``MetadataAsValue(MDNode)``:
5986
5987    .. code-block:: llvm
5988
5989        call void @llvm.foo(metadata !0)
5990
5991    The first element of this ``MDTuple`` is a ``MDNode``:
5992
5993    .. code-block:: llvm
5994
5995        !{!0}
5996
5997    And the first element of this ``MDTuple`` is a ``ValueAsMetadata(Value)``:
5998
5999    .. code-block:: llvm
6000
6001        !{i32 1}
6002
6003.. _metadata-string:
6004
6005Metadata Strings (``MDString``)
6006-------------------------------
6007
6008.. FIXME Either fix all references to "MDString" in the docs, or make that
6009   identifier a formal part of the document.
6010
6011A metadata string is a string surrounded by double quotes. It can
6012contain any character by escaping non-printable characters with
6013"``\xx``" where "``xx``" is the two digit hex code. For example:
6014"``!"test\00"``".
6015
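As a small illustration of the escape syntax, a double quote or a newline
inside a metadata string is written with its hex code. The strings below are
invented for the example:

.. code-block:: llvm

    ; "\22" is a double quote and "\0A" is a newline.
    !0 = !{!"say \22hi\22", !"line one\0Aline two"}
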
6016.. note::
6017
6018   A metadata string is metadata, but is not a metadata node.
6019
6020.. _metadata-node:
6021
6022Metadata Nodes (``MDNode``)
6023---------------------------
6024
6025.. FIXME Either fix all references to "MDNode" in the docs, or make that
6026   identifier a formal part of the document.
6027
6028Metadata tuples are represented with notation similar to structure
6029constants: a comma separated list of elements, surrounded by braces and
preceded by an exclamation point. Metadata nodes can have any values as
their operands. For example:
6032
6033.. code-block:: llvm
6034
6035    !{!"test\00", i32 10}
6036
6037Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
6038
6039.. code-block:: text
6040
6041    !0 = distinct !{!"test\00", i32 10}
6042
6043``distinct`` nodes are useful when nodes shouldn't be merged based on their
6044content. They can also occur when transformations cause uniquing collisions
6045when metadata operands change.
6046
6047A :ref:`named metadata <namedmetadatastructure>` is a collection of
6048metadata nodes, which can be looked up in the module symbol table. For
6049example:
6050
6051.. code-block:: llvm
6052
6053    !foo = !{!4, !3}
6054
6055Metadata can be used as function arguments. Here the ``llvm.dbg.value``
6056intrinsic is using three metadata arguments:
6057
6058.. code-block:: llvm
6059
6060    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
6061
6062
6063.. FIXME Attachments cannot be ValueAsMetadata, but we don't have a
6064   particularly clear way to refer to ValueAsMetadata without getting into
6065   implementation details. Ideally the restriction would be explicit somewhere,
6066   though?
6067
6068Metadata can be attached to an instruction. Here metadata ``!21`` is attached
6069to the ``add`` instruction using the ``!dbg`` identifier:
6070
6071.. code-block:: llvm
6072
6073    %indvar.next = add i64 %indvar, 1, !dbg !21
6074
6075Instructions may not have multiple metadata attachments with the same
6076identifier.
6077
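Attachments with different identifiers can coexist on one instruction. In the
sketch below the node numbers are arbitrary, and ``!range`` serves only as a
second, unrelated attachment kind:

.. code-block:: llvm

    %v = load i32, ptr %p, align 4, !dbg !21, !range !30
    ...
    !30 = !{i32 0, i32 10}

A second ``!dbg`` attachment on the same ``load`` is what the rule above
excludes.
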
6078Metadata can also be attached to a function or a global variable. Here metadata
6079``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
6080and ``g2`` using the ``!dbg`` identifier:
6081
6082.. code-block:: llvm
6083
6084    declare !dbg !22 void @f1()
6085    define void @f2() !dbg !22 {
6086      ret void
6087    }
6088
6089    @g1 = global i32 0, !dbg !22
6090    @g2 = external global i32, !dbg !22
6091
6092Unlike instructions, global objects (functions and global variables) may have
6093multiple metadata attachments with the same identifier.
6094
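One case where this matters is ``!type`` metadata, where a single global can
carry several attachments of the same kind. The global and the type
identifiers below are hypothetical:

.. code-block:: llvm

    @vt = constant [2 x ptr] zeroinitializer, !type !0, !type !1
    ...
    !0 = !{i64 0, !"typeid.A"}
    !1 = !{i64 8, !"typeid.B"}
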
A transformation is required to drop any metadata attachment that it
does not know or knows it can't preserve. Currently there is an
exception for metadata attachment to globals for ``!func_sanitize``,
``!type``, ``!absolute_symbol`` and ``!associated`` which can't be
unconditionally dropped unless the global is itself deleted.
6100
6101Metadata attached to a module using named metadata may not be dropped, with
6102the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
6103
6104More information about specific metadata nodes recognized by the
6105optimizers and code generator is found below.
6106
6107.. _specialized-metadata:
6108
6109Specialized Metadata Nodes
6110^^^^^^^^^^^^^^^^^^^^^^^^^^
6111
6112Specialized metadata nodes are custom data structures in metadata (as opposed
6113to generic tuples). Their fields are labelled, and can be specified in any
6114order.
6115
6116These aren't inherently debug info centric, but currently all the specialized
6117metadata nodes are related to debug info.
6118
6119.. _DICompileUnit:
6120
6121DICompileUnit
6122"""""""""""""
6123
6124``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
6125``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
6126containing the debug info to be emitted along with the compile unit, regardless
6127of code optimizations (some nodes are only emitted if there are references to
6128them from instructions). The ``debugInfoForProfiling:`` field is a boolean
6129indicating whether or not line-table discriminators are updated to provide
6130more-accurate debug info for profiling results.
6131
6132.. code-block:: text
6133
6134    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
6135                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
6136                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
6137                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
6138                        macros: !6, dwoId: 0x0abcd)
6139
6140Compile unit descriptors provide the root scope for objects declared in a
6141specific compilation unit. File descriptors are defined using this scope.  These
6142descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
6143track of global variables, type information, and imported entities (declarations
6144and namespaces).
6145
6146.. _DIFile:
6147
6148DIFile
6149""""""
6150
6151``DIFile`` nodes represent files. The ``filename:`` can include slashes.
6152
6153.. code-block:: none
6154
6155    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
6156                 checksumkind: CSK_MD5,
6157                 checksum: "000102030405060708090a0b0c0d0e0f")
6158
6159Files are sometimes used in ``scope:`` fields, and are the only valid target
6160for ``file:`` fields.
6161
6162The ``checksum:`` and ``checksumkind:`` fields are optional. If one of these
fields is present, then the other is required to be present as well. Valid
values for the ``checksumkind:`` field are ``CSK_MD5``, ``CSK_SHA1``, and
``CSK_SHA256``.
6165
6166.. _DIBasicType:
6167
6168DIBasicType
6169"""""""""""
6170
6171``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
6172``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
6173
6174.. code-block:: text
6175
6176    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
6177                      encoding: DW_ATE_unsigned_char)
6178    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
6179
6180The ``encoding:`` describes the details of the type. Usually it's one of the
6181following:
6182
6183.. code-block:: text
6184
6185  DW_ATE_address       = 1
6186  DW_ATE_boolean       = 2
6187  DW_ATE_float         = 4
6188  DW_ATE_signed        = 5
6189  DW_ATE_signed_char   = 6
6190  DW_ATE_unsigned      = 7
6191  DW_ATE_unsigned_char = 8
6192
6193.. _DISubroutineType:
6194
6195DISubroutineType
6196""""""""""""""""
6197
6198``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
6199refers to a tuple; the first operand is the return type, while the rest are the
6200types of the formal arguments in order. If the first operand is ``null``, that
6201represents a function with no return value (such as ``void foo() {}`` in C++).
6202
6203.. code-block:: text
6204
    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
6207    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
6208
6209.. _DIDerivedType:
6210
6211DIDerivedType
6212"""""""""""""
6213
6214``DIDerivedType`` nodes represent types derived from other types, such as
6215qualified types.
6216
6217.. code-block:: text
6218
6219    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
6220                      encoding: DW_ATE_unsigned_char)
6221    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
6222                        align: 32)
6223
6224The following ``tag:`` values are valid:
6225
6226.. code-block:: text
6227
6228  DW_TAG_member             = 13
6229  DW_TAG_pointer_type       = 15
6230  DW_TAG_reference_type     = 16
6231  DW_TAG_typedef            = 22
6232  DW_TAG_inheritance        = 28
6233  DW_TAG_ptr_to_member_type = 31
6234  DW_TAG_const_type         = 38
6235  DW_TAG_friend             = 42
6236  DW_TAG_volatile_type      = 53
6237  DW_TAG_restrict_type      = 55
6238  DW_TAG_atomic_type        = 71
6239  DW_TAG_immutable_type     = 75
6240
6241.. _DIDerivedTypeMember:
6242
6243``DW_TAG_member`` is used to define a member of a :ref:`composite type
6244<DICompositeType>`. The type of the member is the ``baseType:``. The
6245``offset:`` is the member's bit offset.  If the composite type has an ODR
6246``identifier:`` and does not set ``flags: DIFwdDecl``, then the member is
6247uniqued based only on its ``name:`` and ``scope:``.
6248
6249``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
6250field of :ref:`composite types <DICompositeType>` to describe parents and
6251friends.
6252
6253``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
6254
6255``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
6256``DW_TAG_volatile_type``, ``DW_TAG_restrict_type``, ``DW_TAG_atomic_type`` and
6257``DW_TAG_immutable_type`` are used to qualify the ``baseType:``.
6258
6259Note that the ``void *`` type is expressed as a type derived from NULL.
6260
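A sketch of that encoding, assuming a 64-bit target (the ``size:`` value is an
assumption of the example):

.. code-block:: text

    !0 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64) ; void *
    !1 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !0) ; void *const
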
6261.. _DICompositeType:
6262
6263DICompositeType
6264"""""""""""""""
6265
6266``DICompositeType`` nodes represent types composed of other types, like
6267structures and unions. ``elements:`` points to a tuple of the composed types.
6268
6269If the source language supports ODR, the ``identifier:`` field gives the unique
6270identifier used for type merging between modules.  When specified,
6271:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
6272derived types <DIDerivedTypeMember>` that reference the ODR-type in their
6273``scope:`` change uniquing rules.
6274
6275For a given ``identifier:``, there should only be a single composite type that
6276does not have  ``flags: DIFlagFwdDecl`` set.  LLVM tools that link modules
6277together will unique such definitions at parse time via the ``identifier:``
6278field, even if the nodes are ``distinct``.
6279
6280.. code-block:: text
6281
    !0 = !DIEnumerator(name: "SixKind", value: 6)
6283    !1 = !DIEnumerator(name: "SevenKind", value: 7)
6284    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
6285    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
6286                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
6287                          elements: !{!0, !1, !2})
6288
6289The following ``tag:`` values are valid:
6290
6291.. code-block:: text
6292
6293  DW_TAG_array_type       = 1
6294  DW_TAG_class_type       = 2
6295  DW_TAG_enumeration_type = 4
6296  DW_TAG_structure_type   = 19
6297  DW_TAG_union_type       = 23
6298
For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.

The optional ``dataLocation`` is a DIExpression that describes how to get from
an object's address to the actual raw data, if they aren't equivalent. This is
only supported for array types, particularly to describe Fortran arrays, which
have an array descriptor in addition to the array data. Alternatively, it can
be a DIVariable that holds the address of the actual raw data.

Fortran supports pointer arrays that can be attached to actual arrays; this
attachment between pointer and pointee is called association. The optional
``associated`` is a DIExpression that describes whether the pointer array is
currently associated. The optional ``allocated`` is a DIExpression that
describes whether the allocatable array is currently allocated. The optional
``rank`` is a DIExpression that describes the rank (number of dimensions) of a
Fortran assumed-rank array, whose rank is known at run time.
6315
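As a minimal sketch, a C array ``int a[5]`` could be described as follows. The
node numbers are arbitrary, and ``size:`` is the total size in bits (5 elements
of 32 bits each):

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DISubrange(count: 5)
    !2 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 160,
                          elements: !{!1})
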
6316For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
6317descriptors <DIEnumerator>`, each representing the definition of an enumeration
6318value for the set. All enumeration type descriptors are collected in the
6319``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
6320
6321For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
6322``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
6323<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
6324``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
6325``isDefinition: false``.
6326
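For illustration, a two-member C structure might be described like this. The
structure, file name and node numbers are hypothetical, and the composite type
is written as ``distinct`` because its members refer back to it through
``scope:``:

.. code-block:: text

    ; struct Point { int x; int y; };
    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIFile(filename: "point.c", directory: "/path/to/dir")
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !1, line: 1, size: 64,
                                   elements: !{!3, !4})
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !1,
                        line: 1, baseType: !0, size: 32)
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !1,
                        line: 1, baseType: !0, size: 32, offset: 32)
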
6327.. _DISubrange:
6328
6329DISubrange
6330""""""""""
6331
6332``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
6333:ref:`DICompositeType`.
6334
6335- ``count: -1`` indicates an empty array.
6336- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
6337- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
6338
6339.. code-block:: text
6340
6341    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
6342    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
6343    !2 = !DISubrange(count: -1) ; empty array.
6344
6345    ; Scopes used in rest of example
6346    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
6347    !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
6348    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
6349
6350    ; Use of local variable as count value
6351    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6352    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
6353    !11 = !DISubrange(count: !10, lowerBound: 0)
6354
6355    ; Use of global variable as count value
6356    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
6357    !13 = !DISubrange(count: !12, lowerBound: 0)
6358
6359.. _DIEnumerator:
6360
6361DIEnumerator
6362""""""""""""
6363
6364``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
6365variants of :ref:`DICompositeType`.
6366
6367.. code-block:: text
6368
    !0 = !DIEnumerator(name: "SixKind", value: 6)
6370    !1 = !DIEnumerator(name: "SevenKind", value: 7)
6371    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
6372
6373DITemplateTypeParameter
6374"""""""""""""""""""""""
6375
6376``DITemplateTypeParameter`` nodes represent type parameters to generic source
6377language constructs. They are used (optionally) in :ref:`DICompositeType` and
6378:ref:`DISubprogram` ``templateParams:`` fields.
6379
6380.. code-block:: text
6381
6382    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
6383
6384DITemplateValueParameter
6385""""""""""""""""""""""""
6386
6387``DITemplateValueParameter`` nodes represent value parameters to generic source
6388language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
6389but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
6390``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
6391:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
6392
6393.. code-block:: text
6394
6395    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
6396
6397DINamespace
6398"""""""""""
6399
6400``DINamespace`` nodes represent namespaces in the source language.
6401
6402.. code-block:: text
6403
6404    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
6405
6406.. _DIGlobalVariable:
6407
6408DIGlobalVariable
6409""""""""""""""""
6410
6411``DIGlobalVariable`` nodes represent global variables in the source language.
6412
6413.. code-block:: text
6414
    @foo = global i32 0, !dbg !0
6416    !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
6417    !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
6418                           file: !3, line: 7, type: !4, isLocal: true,
6419                           isDefinition: false, declaration: !5)
6420
6421
6422DIGlobalVariableExpression
6423""""""""""""""""""""""""""
6424
6425``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
6426with a :ref:`DIExpression`.
6427
6428.. code-block:: text
6429
    @lower = global i32 0, !dbg !0
    @upper = global i32 0, !dbg !1
6432    !0 = !DIGlobalVariableExpression(
6433             var: !2,
6434             expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
6435             )
6436    !1 = !DIGlobalVariableExpression(
6437             var: !2,
6438             expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
6439             )
6440    !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
6441                           file: !4, line: 8, type: !5, declaration: !6)
6442
All global variable expressions should be referenced by the ``globals:`` field of
6444a :ref:`compile unit <DICompileUnit>`.
6445
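A sketch of that linkage, with arbitrary node numbers and a hypothetical
variable ``foo``:

.. code-block:: text

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, globals: !2)
    !2 = !{!3}
    !3 = !DIGlobalVariableExpression(var: !4, expr: !DIExpression())
    !4 = distinct !DIGlobalVariable(name: "foo", scope: !0, file: !1, line: 7,
                                    type: !5, isDefinition: true)
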
6446.. _DISubprogram:
6447
6448DISubprogram
6449""""""""""""
6450
6451``DISubprogram`` nodes represent functions from the source language. A distinct
6452``DISubprogram`` may be attached to a function definition using ``!dbg``
6453metadata. A unique ``DISubprogram`` may be attached to a function declaration
6454used for call site debug info. The ``retainedNodes:`` field is a list of
6455:ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
6456retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.
6458
6459.. _DISubprogramDeclaration:
6460
6461When ``spFlags: DISPFlagDefinition`` is not present, subprograms describe a
6462declaration in the type tree as opposed to a definition of a function. In this
6463case, the ``declaration`` field must be empty. If the scope is a composite type
6464with an ODR ``identifier:`` and that does not set ``flags: DIFwdDecl``, then
6465the subprogram declaration is uniqued based only on its ``linkageName:`` and
6466``scope:``.
6467
6468.. code-block:: text
6469
6470    define void @_Z3foov() !dbg !0 {
6471      ...
6472    }
6473
    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
6475                                file: !2, line: 7, type: !3,
6476                                spFlags: DISPFlagDefinition | DISPFlagLocalToUnit,
6477                                scopeLine: 8, containingType: !4,
6478                                virtuality: DW_VIRTUALITY_pure_virtual,
6479                                virtualIndex: 10, flags: DIFlagPrototyped,
6480                                isOptimized: true, unit: !5, templateParams: !6,
6481                                declaration: !7, retainedNodes: !8,
6482                                thrownTypes: !9)
6483
6484.. _DILexicalBlock:
6485
6486DILexicalBlock
6487""""""""""""""
6488
``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
6492fields.
6493
6494.. code-block:: text
6495
6496    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
6497
6498Usually lexical blocks are ``distinct`` to prevent node merging based on
6499operands.
6500
6501.. _DILexicalBlockFile:
6502
6503DILexicalBlockFile
6504""""""""""""""""""
6505
6506``DILexicalBlockFile`` nodes are used to discriminate between sections of a
6507:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
6508indicate textual inclusion, or the ``discriminator:`` field can be used to
6509discriminate between control flow within a single block in the source language.
6510
6511.. code-block:: text
6512
6513    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
6514    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
6515    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
6516
6517.. _DILocation:
6518
6519DILocation
6520""""""""""
6521
6522``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6525
6526.. code-block:: text
6527
6528    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
6529
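The ``inlinedAt:`` field chains a location inside an inlined callee to the
call site in the caller. In this sketch, ``!1`` and ``!2`` are assumed to be
the :ref:`DISubprogram` nodes of the callee and the caller, and the line
numbers are invented:

.. code-block:: text

    !3 = distinct !DILocation(line: 20, column: 3, scope: !2) ; call site
    !4 = !DILocation(line: 5, column: 7, scope: !1, inlinedAt: !3)
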
6530.. _DILocalVariable:
6531
6532DILocalVariable
6533"""""""""""""""
6534
6535``DILocalVariable`` nodes represent local variables in the source language. If
6536the ``arg:`` field is set to non-zero, then this variable is a subprogram
6537parameter, and it will be included in the ``retainedNodes:`` field of its
6538:ref:`DISubprogram`.
6539
6540.. code-block:: text
6541
6542    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
6543                          type: !3, flags: DIFlagArtificial)
6544    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
6545                          type: !3)
6546    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
6547
6548.. _DIExpression:
6549
6550DIExpression
6551""""""""""""
6552
6553``DIExpression`` nodes represent expressions that are inspired by the DWARF
6554expression language. They are used in :ref:`debug records <debugrecords>`
6555(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the
6556referenced LLVM variable relates to the source language variable. Debug
6557expressions are interpreted left-to-right: start by pushing the value/address
6558operand of the record onto a stack, then repeatedly push and evaluate
6559opcodes from the DIExpression until the final variable description is produced.
6560
The currently supported opcode vocabulary is limited:
6562
6563- ``DW_OP_deref`` dereferences the top of the expression stack.
6564- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
6565  them together and appends the result to the expression stack.
6566- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
6567  the last entry from the second last entry and appends the result to the
6568  expression stack.
6569- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
6570- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
6571  here, respectively) of the variable fragment from the working expression. Note
6572  that contrary to DW_OP_bit_piece, the offset is describing the location
6573  within the described source variable.
6574- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
6575  (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
6576  expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
6577  that references a base type constructed from the supplied values.
- ``DW_OP_LLVM_extract_bits_sext, 16, 8`` specifies the offset and size
6579  (``16`` and ``8`` here, respectively) of bits that are to be extracted and
6580  sign-extended from the value at the top of the expression stack. If the top of
6581  the expression stack is a memory location then these bits are extracted from
6582  the value pointed to by that memory location. Maps into a ``DW_OP_shl``
6583  followed by ``DW_OP_shra``.
6584- ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to
6585  ``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending.
6586  Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``.
6587- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
6588  optionally applied to the pointer. The memory tag is derived from the
6589  given tag offset in an implementation-defined manner.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the top
6592  of the stack is treated as an address. The second stack entry is treated as an
6593  address space identifier.
6594- ``DW_OP_stack_value`` marks a constant value.
6595- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
  function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
  DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to
6598  ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
6599  function entry onto the DWARF expression stack.
6600
6601  The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
6602  block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
6603  DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
  the entry value of ``reg`` is pushed onto the stack, and 123 is added to it.
6605  Due to framework limitations ``N`` must be 1, in other words,
6606  ``DW_OP_entry_value`` always refers to the value/address operand of the
6607  instruction.
6608
6609  Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
6610  usually used in MIR, but it is also allowed in LLVM IR when targeting a
6611  :ref:`swiftasync <swiftasync>` argument. The operation is introduced by:
6612
6613    - ``LiveDebugValues`` pass, which applies it to function parameters that
6614      are unmodified throughout the function. Support is limited to simple
6615      register location descriptions, or as indirect locations (e.g.,
6616      parameters passed-by-value to a callee via a pointer to a temporary copy
6617      made in the caller).
6618    - ``AsmPrinter`` pass when a call site parameter value
6619      (``DW_AT_call_site_parameter_value``) is represented as entry value of
6620      the parameter.
6621    - ``CoroSplit`` pass, which may move variables from allocas into a
6622      coroutine frame. If the coroutine frame is a
6623      :ref:`swiftasync <swiftasync>` argument, the variable is described with
      a ``DW_OP_LLVM_entry_value`` operation.
6625
6626- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
6627  value, such as one that calculates the sum of two registers. This is always
6628  used in combination with an ordered list of values, such that
6629  ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
6630  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
6631  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
6633  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the provided
  signed offset from the specified register. The opcode is only generated by the
  ``AsmPrinter`` pass to describe a call site parameter value that requires an
  expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object which can then
  serve as a descriptor in subsequent calculation. This opcode can be used to
  calculate the bounds of a Fortran allocatable array, which has an array
  descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array, whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables which are optimized out but whose
  pointed-to value is known. This operator is required because it differs from
  the DWARF operator ``DW_OP_implicit_pointer`` in representation and
  specification (number and types of operands), and the latter cannot be used
  for multiple levels of indirection.
6650
6651.. code-block:: text
6652
6653    IR for "*ptr = 4;"
6654    --------------
6655      #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
6656    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6657                           type: !18)
6658    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6659    !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6660    !20 = !DILocation(line: 10, scope: !12)
6661
6662    IR for "**ptr = 4;"
6663    --------------
6664      #dbg_value(i32 4, !17,
6665        !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
6666        !21)
6667    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6668                           type: !18)
6669    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6670    !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
6671    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6672    !21 = !DILocation(line: 10, scope: !12)
6673
DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions.  Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
6677change over the course of the program. Register and memory location
6678descriptions describe the *concrete location* of a source variable (in the
6679sense that a debugger might modify its value), whereas *implicit locations*
6680describe merely the actual *value* of a source variable which might not exist
6681in registers or in memory (see ``DW_OP_stack_value``).
6682
6683A ``#dbg_declare`` record describes an indirect value (the address) of a
6684source variable. The first operand of the record must be an address of some
6685kind. A DIExpression operand to the record refines this address to produce a
6686concrete location for the source variable.
6687
6688A ``#dbg_value`` record describes the direct value of a source variable.
6689The first operand of the record may be a direct or indirect value. A
6690DIExpression operand to the record refines the first operand to produce a
6691direct value. For example, if the first operand is an indirect value, it may be
6692necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
6693valid debug record.
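
As a minimal sketch of the distinction (the value names ``%x`` and ``%x.addr``
and the metadata IDs ``!10`` and ``!20`` are placeholders, not taken from a real
module), the same variable could be described in any of the following ways:

.. code-block:: text

    ; %x.addr is the address of the variable, so an empty expression suffices.
      #dbg_declare(ptr %x.addr, !10, !DIExpression(), !20)

    ; In a #dbg_value, the same address is an indirect value and has to be
    ; dereferenced to yield the variable's direct value.
      #dbg_value(ptr %x.addr, !10, !DIExpression(DW_OP_deref), !20)

    ; An SSA value that already holds the variable's value needs no refinement.
      #dbg_value(i32 %x, !10, !DIExpression(), !20)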
6694
6695.. note::
6696
6697   A DIExpression is interpreted in the same way regardless of which kind of
6698   debug record it's attached to.
6699
6700   DIExpressions are always printed and parsed inline; they can never be
6701   referenced by an ID (e.g. ``!1``).
6702
6703.. code-block:: text
6704
6705    !DIExpression(DW_OP_deref)
6706    !DIExpression(DW_OP_plus_uconst, 3)
6707    !DIExpression(DW_OP_constu, 3, DW_OP_plus)
6708    !DIExpression(DW_OP_bit_piece, 3, 7)
6709    !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
6710    !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
6711    !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
6712
6713DIAssignID
6714""""""""""
6715
6716``DIAssignID`` nodes have no operands and are always distinct. They are used to
link together :ref:`#dbg_assign records <debugrecords>` and instructions
6718that store in IR. See `Debug Info Assignment Tracking
6719<AssignmentTracking.html>`_ for more info.
6720
6721.. code-block:: llvm
6722
6723    store i32 %a, ptr %a.addr, align 4, !DIAssignID !2
    #dbg_assign(i32 %a, !1, !DIExpression(), !2, ptr %a.addr, !DIExpression(), !3)
6725
6726    !2 = distinct !DIAssignID()
6727
6728DIArgList
6729"""""""""
6730
6731.. FIXME In the implementation this is not a "node", but as it can only appear
6732   inline in a function context that distinction isn't observable anyway. Even
6733   if it is not required, it would be nice to be more clear about what is a
6734   "node", and what that actually means. The names in the implementation could
6735   also be updated to mirror whatever we decide here.
6736
6737``DIArgList`` nodes hold a list of constant or SSA value references. These are
6738used in :ref:`debug records <debugrecords>` in combination with a
6739``DIExpression`` that uses the
6740``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
6741within a function, it must only be used as a function argument, must always be
6742inlined, and cannot appear in named metadata.
6743
6744.. code-block:: text
6745
6746    #dbg_value(!DIArgList(i32 %a, i32 %b),
6747               !16,
6748               !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus),
6749               !26)
6750
6751DIFlags
6752"""""""
6753
6754These flags encode various properties of DINodes.
6755
The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether the ``DW_AT_export_symbols``
attribute can be used for the structure type.
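
For illustration, a flag is written in the ``flags:`` field of a node using its
``DIFlag``-prefixed spelling. The sketch below assumes an anonymous struct
nested in another type; the node IDs and field values are placeholders:

.. code-block:: text

    !5 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !4,
                                   file: !1, line: 3, size: 32,
                                   flags: DIFlagExportSymbols, elements: !6)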
6760
6761DIObjCProperty
6762""""""""""""""
6763
6764``DIObjCProperty`` nodes represent Objective-C property nodes.
6765
6766.. code-block:: text
6767
6768    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
6769                         getter: "getFoo", attributes: 7, type: !2)
6770
6771DIImportedEntity
6772""""""""""""""""
6773
6774``DIImportedEntity`` nodes represent entities (such as modules) imported into a
6775compile unit. The ``elements`` field is a list of renamed entities (such as
variables and subprograms) in the imported entity (such as a module).
6777
6778.. code-block:: text
6779
6780   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
6781                          entity: !1, line: 7, elements: !3)
6782   !3 = !{!4}
6783   !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
6784                          entity: !5, line: 7)
6785
6786DIMacro
6787"""""""
6788
``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.
6793
6794.. code-block:: text
6795
6796   !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
6797                 value: "((x) + 1)")
6798   !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")
6799
6800DIMacroFile
6801"""""""""""
6802
6803``DIMacroFile`` nodes represent inclusion of source files.
6804The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
6805appear in the included source file.
6806
6807.. code-block:: text
6808
   !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !1,
6810                     nodes: !3)
6811
6812.. _DILabel:
6813
6814DILabel
6815"""""""
6816
6817``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
6818a ``DILabel`` are mandatory. The ``scope:`` field must be one of either a
6819:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6820The ``name:`` field is the label identifier. The ``file:`` field is the
6821:ref:`DIFile` the label is present in. The ``line:`` field is the source line
6822within the file where the label is declared.
6823
6824.. code-block:: text
6825
6826  !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
6827
6828DICommonBlock
6829"""""""""""""
6830
6831``DICommonBlock`` nodes represent Fortran common blocks. The ``scope:`` field
6832is mandatory and points to a :ref:`DILexicalBlockFile`, a
6833:ref:`DILexicalBlock`, or a :ref:`DISubprogram`. The ``declaration:``,
6834``name:``, ``file:``, and ``line:`` fields are optional.
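
A hypothetical common block named ``alpha`` could be written as follows. The
node IDs are placeholders, with ``!2`` assumed to be the enclosing
:ref:`DISubprogram` and ``!4`` assumed to be a ``DIGlobalVariable`` that stands
for the block's declaration:

.. code-block:: text

    !3 = distinct !DICommonBlock(scope: !2, declaration: !4, name: "alpha",
                                 file: !1, line: 4)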
6835
6836DIModule
6837""""""""
6838
6839``DIModule`` nodes represent a source language module, for example, a Clang
6840module, or a Fortran module. The ``scope:`` field is mandatory and points to a
6841:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6842The ``name:`` field is mandatory. The ``configMacros:``, ``includePath:``,
6843``apinotes:``, ``file:``, ``line:``, and ``isDecl:`` fields are optional.
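
As a sketch, a module node that uses most of these fields might look like the
following. All node IDs, names, and paths are made up for illustration:

.. code-block:: text

    !4 = !DIModule(scope: !0, name: "DebugModule", configMacros: "-DMODULES=0",
                   includePath: "/path/to/include", apinotes: "m.apinotes",
                   file: !1, line: 2)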
6844
6845DIStringType
6846""""""""""""
6847
6848``DIStringType`` nodes represent a Fortran ``CHARACTER(n)`` type, with a
6849dynamic length and location encoded as an expression.
The ``tag:`` field is optional and defaults to ``DW_TAG_string_type``. The
``name:``, ``stringLength:``, ``stringLengthExpression:``,
``stringLocationExpression:``, ``size:``, ``align:``, and ``encoding:`` fields
are optional.
6853
6854If not present, the ``size:`` and ``align:`` fields default to the value zero.
6855
6856The length in bits of the string is specified by the first of the following
6857fields present:
6858
6859- ``stringLength:``, which points to a ``DIVariable`` whose value is the string
6860  length in bits.
6861- ``stringLengthExpression:``, which points to a ``DIExpression`` which
6862  computes the length in bits.
- ``size:``, which contains the literal length in bits.
6864
6865The ``stringLocationExpression:`` points to a ``DIExpression`` which describes
6866the "data location" of the string object, if present.
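
The three ways of giving the length can be sketched as follows. The node IDs,
names, and expression operands are illustrative only; ``!7`` is assumed to be a
``DIVariable`` that holds the length:

.. code-block:: text

    ; Fixed length: CHARACTER(10), i.e. 80 bits, given directly by size:.
    !5 = !DIStringType(name: "character(10)", size: 80, align: 8)

    ; Length read from the variable described by !7.
    !6 = !DIStringType(name: "character(*)", stringLength: !7)

    ; Length and data location computed from a descriptor object.
    !8 = !DIStringType(name: "character(*)",
                       stringLengthExpression: !DIExpression(DW_OP_push_object_address, DW_OP_plus_uconst, 8),
                       stringLocationExpression: !DIExpression(DW_OP_push_object_address, DW_OP_deref))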
6867
6868'``tbaa``' Metadata
6869^^^^^^^^^^^^^^^^^^^
6870
6871In LLVM IR, memory does not have types, so LLVM's own type system is not
6872suitable for doing type based alias analysis (TBAA). Instead, metadata is
6873added to the IR to describe a type system of a higher level language. This
6874can be used to implement C/C++ strict type aliasing rules, but it can also
6875be used to implement custom alias analysis behavior for other languages.
6876
6877This description of LLVM's TBAA system is broken into two parts:
6878:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
6879:ref:`Representation<tbaa_node_representation>` talks about the metadata
6880encoding of various entities.
6881
6882It is always possible to trace any TBAA node to a "root" TBAA node (details
6883in the :ref:`Representation<tbaa_node_representation>` section).  TBAA
6884nodes with different roots have an unknown aliasing relationship, and LLVM
6885conservatively infers ``MayAlias`` between them.  The rules mentioned in
6886this section only pertain to TBAA nodes living under the same root.
6887
6888.. _tbaa_node_semantics:
6889
6890Semantics
6891"""""""""
6892
6893The TBAA metadata system, referred to as "struct path TBAA" (not to be
6894confused with ``tbaa.struct``), consists of the following high level
6895concepts: *Type Descriptors*, further subdivided into scalar type
6896descriptors and struct type descriptors; and *Access Tags*.
6897
6898**Type descriptors** describe the type system of the higher level language
6899being compiled.  **Scalar type descriptors** describe types that do not
6900contain other types.  Each scalar type has a parent type, which must also
6901be a scalar type or the TBAA root.  Via this parent relation, scalar types
6902within a TBAA root form a tree.  **Struct type descriptors** denote types
6903that contain a sequence of other type descriptors, at known offsets.  These
6904contained type descriptors can either be struct type descriptors themselves
6905or scalar type descriptors.
6906
6907**Access tags** are metadata nodes attached to load and store instructions.
6908Access tags use type descriptors to describe the *location* being accessed
6909in terms of the type system of the higher level language.  Access tags are
6910tuples consisting of a base type, an access type and an offset.  The base
6911type is a scalar type descriptor or a struct type descriptor, the access
6912type is a scalar type descriptor, and the offset is a constant integer.
6913
6914The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
6915things:
6916
6917 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
6918   or store) of a value of type ``AccessTy`` contained in the struct type
6919   ``BaseTy`` at offset ``Offset``.
6920
6921 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
6922   ``AccessTy`` must be the same; and the access tag describes a scalar
6923   access with scalar type ``AccessTy``.
6924
6925We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
6926tuples this way:
6927
6928 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
6929   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
6930   described in the TBAA metadata.  ``ImmediateParent(BaseTy, Offset)`` is
6931   undefined if ``Offset`` is non-zero.
6932
6933 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
6934   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
6935   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
6936   to be relative within that inner type.
6937
A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa. If memory
accesses alias even though they are noalias according to ``!tbaa`` metadata,
the behavior is undefined.
6944
6945As a concrete example, the type descriptor graph for the following program
6946
6947.. code-block:: c
6948
6949    struct Inner {
6950      int i;    // offset 0
6951      float f;  // offset 4
6952    };
6953
6954    struct Outer {
6955      float f;  // offset 0
6956      double d; // offset 4
6957      struct Inner inner_a;  // offset 12
6958    };
6959
6960    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
6961      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
6962      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
6963      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
6964      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
6965    }
6966
6967is (note that in C and C++, ``char`` can be used to access any arbitrary
6968type):
6969
6970.. code-block:: text
6971
6972    Root = "TBAA Root"
6973    CharScalarTy = ("char", Root, 0)
6974    FloatScalarTy = ("float", CharScalarTy, 0)
6975    DoubleScalarTy = ("double", CharScalarTy, 0)
6976    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
6978    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
6979                     (InnerStructTy, 12)}
6980
6981
6982with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
69830)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
6984``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
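
Applying the aliasing rule to the four tags in the example gives the following
chains. This is a worked sketch in which each arrow is one application of
``ImmediateParent``; every chain continues from ``CharScalarTy`` to the root:

.. code-block:: text

    tag0: (OuterStructTy, 0)  -> (FloatScalarTy, 0) -> (CharScalarTy, 0)
    tag1: (OuterStructTy, 12) -> (InnerStructTy, 0) -> (IntScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag2: (OuterStructTy, 16) -> (InnerStructTy, 4) -> (FloatScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag3: (FloatScalarTy, 0)  -> (CharScalarTy, 0)

The starting point of ``tag3``, ``(FloatScalarTy, 0)``, appears on the chains
of ``tag0`` and ``tag2``, so the store through ``f`` may alias the stores to
``outer->f`` and ``outer->inner_a.f``. It does not appear on the chain of
``tag1``, and ``(OuterStructTy, 12)`` is not reachable from
``(FloatScalarTy, 0)``, so the accesses tagged ``tag1`` and ``tag3`` do not
alias.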
6985
6986.. _tbaa_node_representation:
6987
6988Representation
6989""""""""""""""
6990
6991The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
6992with exactly one ``MDString`` operand.
6993
Scalar type descriptors are represented as ``MDNode`` s with two
operands.  The first operand is an ``MDString`` denoting the name of the
scalar type.  LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``.  The second operand is an
6998``MDNode`` which points to the parent for said scalar type descriptor,
6999which is either another scalar type descriptor or the TBAA root.  Scalar
7000type descriptors can have an optional third argument, but that must be the
7001constant integer zero.
7002
7003Struct type descriptors are represented as ``MDNode`` s with an odd number
7004of operands greater than 1.  The first operand is an ``MDString`` denoting
the name of the struct type.  As in scalar type descriptors, the actual
7006value of this name operand is irrelevant to LLVM.  After the name operand,
7007the struct type descriptors have a sequence of alternating ``MDNode`` and
7008``ConstantInt`` operands.  With N starting from 1, the 2N - 1 th operand,
7009an ``MDNode``, denotes a contained field, and the 2N th operand, a
7010``ConstantInt``, is the offset of the said contained field.  The offsets
7011must be in non-decreasing order.
7012
7013Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
7014The first operand is an ``MDNode`` pointing to the node representing the
7015base type.  The second operand is an ``MDNode`` pointing to the node
7016representing the access type.  The third operand is a ``ConstantInt`` that
7017states the offset of the access.  If a fourth field is present, it must be
7018a ``ConstantInt`` valued at 0 or 1.  If it is 1 then the access tag states
7019that the location being accessed is "constant" (meaning
7020``pointsToConstantMemory`` should return true; see `other useful
7021AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).  The TBAA root of
7022the access type and the base type of an access tag must be the same, and
7023that is the TBAA root of the access tag.
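
As an illustration, the type descriptor graph and the four access tags from the
:ref:`Semantics<tbaa_node_semantics>` example could be encoded as shown below.
This is a sketch rather than the output of any particular frontend: the
metadata numbering and the value names are arbitrary, and the byte offsets
follow the layout assumed in that example.

.. code-block:: llvm

    define void @f(ptr %outer, ptr %f) {
    entry:
      store float 0.0, ptr %outer, align 4, !tbaa !7        ; tag0
      %inner.i = getelementptr inbounds i8, ptr %outer, i64 12
      store i32 0, ptr %inner.i, align 4, !tbaa !8          ; tag1
      %inner.f = getelementptr inbounds i8, ptr %outer, i64 16
      store float 0.0, ptr %inner.f, align 4, !tbaa !9      ; tag2
      store float 0.0, ptr %f, align 4, !tbaa !10           ; tag3
      ret void
    }

    ; Root and scalar type descriptors: name, parent, optional zero.
    !0 = !{!"TBAA Root"}
    !1 = !{!"char", !0, i64 0}
    !2 = !{!"float", !1, i64 0}
    !3 = !{!"double", !1, i64 0}
    !4 = !{!"int", !1, i64 0}

    ; Struct type descriptors: name, then (field type, offset) pairs.
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}

    ; Access tags: base type, access type, offset.
    !7 = !{!6, !2, i64 0}
    !8 = !{!6, !4, i64 12}
    !9 = !{!6, !2, i64 16}
    !10 = !{!2, !2, i64 0}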
7024
7025'``tbaa.struct``' Metadata
7026^^^^^^^^^^^^^^^^^^^^^^^^^^
7027
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
7030is defined to copy a contiguous region of memory, which is more than
7031strictly necessary for aggregate types which contain holes due to
7032padding. Also, it doesn't contain any TBAA information about the fields
7033of the aggregate.
7034
7035``!tbaa.struct`` metadata can describe which memory subregions in a
7036memcpy are padding and what the TBAA tags of the struct are.
7037
The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a field
in bytes, the second gives its size in bytes, and the third gives its
TBAA tag. For example:
7043
7044.. code-block:: llvm
7045
7046    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
7047
This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes,
has size 4 bytes, and has TBAA tag ``!2``.
7051
7052Note that the fields need not be contiguous. In this example, there is a
70534 byte gap between the two fields. This gap represents padding which
7054does not carry useful data and need not be preserved.
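
A sketch of how such a node might be attached to a copy is shown below. The
function name, the 12-byte aggregate, and the scalar types behind ``!1`` and
``!2`` are assumptions made for the example:

.. code-block:: llvm

    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define void @copy(ptr %dst, ptr %src) {
    entry:
      ; Copy a 12-byte aggregate; bytes 4 through 7 are padding.
      call void @llvm.memcpy.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src,
                                       i64 12, i1 false), !tbaa.struct !4
      ret void
    }

    !0 = !{!"TBAA Root"}
    !5 = !{!"char", !0, i64 0}
    !6 = !{!"int", !5, i64 0}
    !7 = !{!"float", !5, i64 0}
    !1 = !{!6, !6, i64 0}   ; access tag of the field at offset 0
    !2 = !{!7, !7, i64 0}   ; access tag of the field at offset 8
    !4 = !{i64 0, i64 4, !1, i64 8, i64 4, !2}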
7055
7056'``noalias``' and '``alias.scope``' Metadata
7057^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7058
7059``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
7060noalias memory-access sets. This means that some collection of memory access
7061instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be explicitly declared not to alias with some other
7063collection of memory access instructions that carry ``alias.scope`` metadata. If
7064accesses from different collections alias, the behavior is undefined. Each type
7065of metadata specifies a list of scopes where each scope has an id and a domain.
7066
7067When evaluating an aliasing query, if for some domain, the set
7068of scopes with that domain in one instruction's ``alias.scope`` list is a
7069subset of (or equal to) the set of scopes for that domain in another
7070instruction's ``noalias`` list, then the two memory accesses are assumed not to
7071alias.
7072
7073Because scopes in one domain don't affect scopes in other domains, separate
7074domains can be used to compose multiple independent noalias sets.  This is
7075used for example during inlining.  As the noalias function parameters are
7076turned into noalias scope metadata, a new domain is used every time the
7077function is inlined.
7078
7079The metadata identifying each domain is itself a list containing one or two
7080entries. The first entry is the name of the domain. Note that if the name is a
7081string then it can be combined across functions and translation units. A
7082self-reference can be used to create globally unique domain names. A
7083descriptive string may optionally be provided as a second list entry.
7084
7085The metadata identifying each scope is also itself a list containing two or
7086three entries. The first entry is the name of the scope. Note that if the name
7087is a string then it can be combined across functions and translation units. A
7088self-reference can be used to create globally unique scope names. A metadata
7089reference to the scope's domain is the second entry. A descriptive string may
7090optionally be provided as a third list entry.
7091
7092For example,
7093
7094.. code-block:: llvm
7095
7096    ; Two scope domains:
7097    !0 = !{!0}
7098    !1 = !{!1}
7099
7100    ; Some scopes in these domains:
7101    !2 = !{!2, !0}
7102    !3 = !{!3, !0}
7103    !4 = !{!4, !1}
7104
7105    ; Some scope lists:
7106    !5 = !{!4} ; A list containing only scope !4
7107    !6 = !{!4, !3, !2}
7108    !7 = !{!3}
7109
7110    ; These two instructions don't alias:
7111    %0 = load float, ptr %c, align 4, !alias.scope !5
7112    store float %0, ptr %arrayidx.i, align 4, !noalias !5
7113
7114    ; These two instructions also don't alias (for domain !1, the set of scopes
7115    ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, ptr %c, align 4, !alias.scope !5
    store float %1, ptr %arrayidx.i2, align 4, !noalias !6
7118
7119    ; These two instructions may alias (for domain !0, the set of scopes in
7120    ; the !noalias list is not a superset of, or equal to, the scopes in the
7121    ; !alias.scope list):
7122    %2 = load float, ptr %c, align 4, !alias.scope !6
7123    store float %0, ptr %arrayidx.i, align 4, !noalias !7
7124
7125.. _fpmath-metadata:
7126
7127'``fpmath``' Metadata
7128^^^^^^^^^^^^^^^^^^^^^
7129
7130``fpmath`` metadata may be attached to any instruction of floating-point
7131type. It can be used to express the maximum acceptable error in the
7132result of that instruction, in ULPs, thus potentially allowing the
7133compiler to use a more efficient but less accurate method of computing
7134it. ULP is defined as follows:
7135
7136    If ``x`` is a real number that lies between two finite consecutive
7137    floating-point numbers ``a`` and ``b``, without being equal to one
7138    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
7139    distance between the two non-equal finite floating-point numbers
7140    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
7141
7142The metadata node shall consist of a single positive float type number
7143representing the maximum relative error, for example:
7144
7145.. code-block:: llvm
7146
7147    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
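
For instance, the node could be attached to a division as in the following
sketch (the function and value names are illustrative):

.. code-block:: llvm

    define float @approx_div(float %a, float %b) {
    entry:
      ; The result may differ from the exact quotient by up to 2.5 ULPs.
      %q = fdiv float %a, %b, !fpmath !0
      ret float %q
    }

    !0 = !{float 2.5}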
7148
7149.. _range-metadata:
7150
7151'``range``' Metadata
7152^^^^^^^^^^^^^^^^^^^^
7153
7154``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
7155integer or vector of integer types. It expresses the possible ranges the loaded
7156value or the value returned by the called function at this call site is in. If
7157the loaded or returned value is not in the specified range, a poison value is
7158returned instead. The ranges are represented with a flattened list of integers.
7159The loaded value or the value returned is known to be in the union of the ranges
7160defined by each consecutive pair. Each pair has the following properties:
7161
7162-  The type must match the scalar type of the instruction.
7163-  The pair ``a,b`` represents the range ``[a,b)``.
7164-  Both ``a`` and ``b`` are constants.
7165-  The range is allowed to wrap.
7166-  The range should not represent the full or empty set. That is,
7167   ``a!=b``.
7168
7169In addition, the pairs must be in signed order of the lower bound and
7170they must be non-contiguous.
7171
7172For vector-typed instructions, the range is applied element-wise.
7173
7174Examples:
7175
7176.. code-block:: llvm
7177
7178      %a = load i8, ptr %x, align 1, !range !0 ; Can only be 0 or 1
7179      %b = load i8, ptr %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
7180      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
7181      %d = invoke i8 @bar() to label %cont
7182             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
      %e = load <2 x i8>, ptr %x, !range !0 ; Can only be <0 or 1, 0 or 1>
7184    ...
7185    !0 = !{ i8 0, i8 2 }
7186    !1 = !{ i8 255, i8 2 }
7187    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
7188    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
7189
7190'``absolute_symbol``' Metadata
7191^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7192
7193``absolute_symbol`` metadata may be attached to a global variable
7194declaration. It marks the declaration as a reference to an absolute symbol,
7195which causes the backend to use absolute relocations for the symbol even
7196in position independent code, and expresses the possible ranges that the
7197global variable's *address* (not its value) is in, in the same format as
7198``range`` metadata, with the extension that the pair ``all-ones,all-ones``
7199may be used to represent the full set.
7200
7201Example (assuming 64-bit pointers):
7202
7203.. code-block:: llvm
7204
7205      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
7206      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
7207
7208    ...
7209    !0 = !{ i64 0, i64 256 }
7210    !1 = !{ i64 -1, i64 -1 }
7211
7212'``callees``' Metadata
7213^^^^^^^^^^^^^^^^^^^^^^
7214
7215``callees`` metadata may be attached to indirect call sites. If ``callees``
7216metadata is attached to a call site, and any callee is not among the set of
7217functions provided by the metadata, the behavior is undefined. The intent of
7218this metadata is to facilitate optimizations such as indirect-call promotion.
7219For example, in the code below, the call instruction may only target the
7220``add`` or ``sub`` functions:
7221
7222.. code-block:: llvm
7223
7224    %result = call i64 %binop(i64 %x, i64 %y), !callees !0
7225
7226    ...
7227    !0 = !{ptr @add, ptr @sub}
7228
7229'``callback``' Metadata
7230^^^^^^^^^^^^^^^^^^^^^^^
7231
``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with the metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata. Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
7239semantic restriction on the broker function itself is that it is not allowed to
7240inspect or modify arguments referenced in the ``callback`` metadata as
7241pass-through to the callback function.
7242
7243The broker is not required to actually invoke the callback function at runtime.
7244However, the assumptions about not inspecting or modifying arguments that would
7245be passed to the specified callback function still hold, even if the callback
7246function is not dynamically invoked. The broker is allowed to invoke the
7247callback function more than once per invocation of the broker. The broker is
7248also allowed to invoke (directly or indirectly) the function passed as a
7249callback through another use. Finally, the broker is also allowed to relay the
7250callback callee invocation to a different thread.
7251
7252The metadata is structured as follows: At the outer level, ``callback``
7253metadata is a list of ``callback`` encodings. Each encoding starts with a
7254constant ``i64`` which describes the argument position of the callback function
7255in the call to the broker. The following elements, except the last, describe
7256what arguments are passed to the callback function. Each element is again an
7257``i64`` constant identifying the argument of the broker that is passed through,
7258or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
7259they are listed has to be the same in which they are passed to the callback
7260callee. The last element of the encoding is a boolean which specifies how
7261variadic arguments of the broker are handled. If it is true, all variadic
7262arguments of the broker are passed through to the callback function *after* the
7263arguments encoded explicitly before.
7264
In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. Argument
positions are counted from zero, so this encoding identifies the callback
function as the argument at position 2 of the broker (``i64 2``) and the sole
argument of the callback function as the argument at position 3 of the broker
function (``i64 3``).
7271
7272.. FIXME why does the llvm-sphinx-docs builder give a highlighting
7273   error if the below is set to highlight as 'llvm', despite that we
7274   have misc.highlighting_failure set?
7275
7276.. code-block:: text
7277
7278    declare !callback !1 dso_local i32 @pthread_create(ptr, ptr, ptr, ptr)
7279
7280    ...
7281    !2 = !{i64 2, i64 3, i1 false}
7282    !1 = !{!2}
7283
Another example is shown below. The callback callee is the argument at
zero-based position 2 of the ``__kmpc_fork_call`` function (``i64 2``). The
callee is given two unknown values (each identified by an ``i64 -1``) and
afterwards all variadic arguments that are passed to the ``__kmpc_fork_call``
call (due to the final ``i1 true``).
7289
7290.. FIXME why does the llvm-sphinx-docs builder give a highlighting
7291   error if the below is set to highlight as 'llvm', despite that we
7292   have misc.highlighting_failure set?
7293
7294.. code-block:: text
7295
7296    declare !callback !0 dso_local void @__kmpc_fork_call(ptr, i32, ptr, ...)
7297
7298    ...
7299    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
7300    !0 = !{!1}
7301
7302'``exclude``' Metadata
7303^^^^^^^^^^^^^^^^^^^^^^
7304
7305``exclude`` metadata may be attached to a global variable to signify that its
7306section should not be included in the final executable or shared library. This
7307option is only valid for global variables with an explicit section targeting ELF
7308or COFF. This is done using the ``SHF_EXCLUDE`` flag on ELF targets and the
7309``IMAGE_SCN_LNK_REMOVE`` and ``IMAGE_SCN_MEM_DISCARDABLE`` flags for COFF
7310targets. Additionally, this metadata is only used as a flag, so the associated
7311node must be empty. The explicit section should not conflict with any other
7312sections that the user does not want removed after linking.
7313
7314.. code-block:: text
7315
  @object = private constant [1 x i8] c"\00", section ".foo", !exclude !0
7317
7318  ...
7319  !0 = !{}
7320
7321'``unpredictable``' Metadata
7322^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7323
7324``unpredictable`` metadata may be attached to any branch or switch
7325instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
7327optimizations related to compare and branch instructions. The metadata
7328is treated as a boolean value; if it exists, it signals that the branch
7329or switch that it is attached to is completely unpredictable.
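
Because only the presence of the metadata matters, an empty node is enough. A
minimal sketch, with illustrative function and label names:

.. code-block:: llvm

    define i32 @select_path(i1 %c) {
    entry:
      br i1 %c, label %then, label %else, !unpredictable !0

    then:
      ret i32 1

    else:
      ret i32 2
    }

    !0 = !{}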
7330
7331.. _md_dereferenceable:
7332
7333'``dereferenceable``' Metadata
7334^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7335
The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable;
otherwise, the behavior is undefined.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.
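
For illustration, a load of a pointer that is assumed to point to at least
16 dereferenceable bytes could be written as follows. The names ``%slot`` and
``%p`` and the byte count are hypothetical.

.. code-block:: llvm

   %p = load ptr, ptr %slot, align 8, !dereferenceable !0
   ...
   !0 = !{i64 16}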
7342
7343.. _md_dereferenceable_or_null:
7344
7345'``dereferenceable_or_null``' Metadata
7346^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7347
The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null; otherwise, the behavior is undefined.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values.
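
A sketch of the nullable variant, again with hypothetical names: the loaded
pointer is either null or points to at least 16 dereferenceable bytes.

.. code-block:: llvm

   %p = load ptr, ptr %slot, align 8, !dereferenceable_or_null !0
   ...
   !0 = !{i64 16}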
7354
7355.. _llvm.loop:
7356
7357'``llvm.loop``'
7358^^^^^^^^^^^^^^^
7359
7360It is sometimes useful to attach information to loop constructs. Currently,
7361loop metadata is implemented as metadata attached to the branch instruction
7362in the loop latch block. The loop metadata node is a list of
7363other metadata nodes, each representing a property of the loop. Usually,
7364the first item of the property node is a string. For example, the
7365``llvm.loop.unroll.count`` suggests an unroll factor to the loop
7366unroller:
7367
7368.. code-block:: llvm
7369
7370      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
7371    ...
7372    !0 = !{!0, !1, !2}
7373    !1 = !{!"llvm.loop.unroll.enable"}
7374    !2 = !{!"llvm.loop.unroll.count", i32 4}
7375
7376For legacy reasons, the first item of a loop metadata node must be a
7377reference to itself. Before the advent of the 'distinct' keyword, this
7378forced the preservation of otherwise identical metadata nodes. Since
7379the loop-metadata node can be attached to multiple nodes, the 'distinct'
7380keyword has become unnecessary.
7381
7382Prior to the property nodes, one or two ``DILocation`` (debug location)
7383nodes can be present in the list. The first, if present, identifies the
7384source-code location where the loop begins. The second, if present,
7385identifies the source-code location where the loop ends.
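
A sketch of a loop metadata node that carries both locations ahead of a
property node is shown below. The line and column numbers are made up, and
``!4`` stands for the enclosing ``DISubprogram``, which is assumed to be
defined elsewhere in the module.

.. code-block:: llvm

   br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1, !2, !3}
   !1 = !DILocation(line: 10, column: 3, scope: !4) ; where the loop begins
   !2 = !DILocation(line: 14, column: 3, scope: !4) ; where the loop ends
   !3 = !{!"llvm.loop.unroll.count", i32 4}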
7386
7387Loop metadata nodes cannot be used as unique identifiers. They are
7388neither persistent for the same loop through transformations nor
7389necessarily unique to just one loop.
7390
7391'``llvm.loop.disable_nonforced``'
7392^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7393
7394This metadata disables all optional loop transformations unless
7395explicitly instructed using other transformation metadata such as
7396``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly requested
7399(forced) transformation is applied. For instance, loop fusion can make
7400other transformations impossible. Mandatory loop canonicalizations such
7401as loop rotation are still applied.
7402
It is recommended to use this metadata in addition to any ``llvm.loop.*``
7404transformation directive. Also, any loop should have at most one
7405directive applied to it (and a sequence of transformations built using
7406followup-attributes). Otherwise, which transformation will be applied
7407depends on implementation details such as the pass pipeline order.
7408
7409See :ref:`transformation-metadata` for details.
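
As an illustrative sketch, the loop below requests an unroll factor of 4 and
disables every other optional transformation. The block and value names are
placeholders.

.. code-block:: llvm

   br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.disable_nonforced"}
   !2 = !{!"llvm.loop.unroll.count", i32 4}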
7410
7411'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
7412^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7413
7414Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
7415used to control per-loop vectorization and interleaving parameters such as
7416vectorization width and interleave count. These metadata should be used in
7417conjunction with ``llvm.loop`` loop identification metadata. The
7418``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
7419optimization hints and the optimizer will only interleave and vectorize loops if
7420it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
7421which contains information about loop-carried memory dependencies can be helpful
7422in determining the safety of these transformations.
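
For example, several of the hints described in the following subsections can
be combined on a single loop. This is only a sketch; the width and count are
arbitrary, and the optimizer remains free to ignore them.

.. code-block:: llvm

   br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.vectorize.width", i32 4}
   !2 = !{!"llvm.loop.interleave.count", i32 2}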
7423
7424'``llvm.loop.interleave.count``' Metadata
7425^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7426
7427This metadata suggests an interleave count to the loop interleaver.
7428The first operand is the string ``llvm.loop.interleave.count`` and the
7429second operand is an integer specifying the interleave count. For
7430example:
7431
7432.. code-block:: llvm
7433
7434   !0 = !{!"llvm.loop.interleave.count", i32 4}
7435
7436Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
7437multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
7438then the interleave count will be determined automatically.
7439
7440'``llvm.loop.vectorize.enable``' Metadata
7441^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7442
7443This metadata selectively enables or disables vectorization for the loop. The
7444first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
7445is a bit. If the bit operand value is 1 vectorization is enabled. A value of
74460 disables vectorization:
7447
7448.. code-block:: llvm
7449
7450   !0 = !{!"llvm.loop.vectorize.enable", i1 0}
7451   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
7452
7453'``llvm.loop.vectorize.predicate.enable``' Metadata
7454^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7455
This metadata selectively enables or disables creating predicated instructions
for the loop, which can enable folding of the scalar epilogue loop into the
main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, predication is enabled. A value of 0 disables
predication:
7462
7463.. code-block:: llvm
7464
7465   !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
7466   !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
7467
7468'``llvm.loop.vectorize.scalable.enable``' Metadata
7469^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7470
7471This metadata selectively enables or disables scalable vectorization for the
7472loop, and only has any effect if vectorization for the loop is already enabled.
7473The first operand is the string ``llvm.loop.vectorize.scalable.enable``
7474and the second operand is a bit. If the bit operand value is 1 scalable
7475vectorization is enabled, whereas a value of 0 reverts to the default fixed
7476width vectorization:
7477
7478.. code-block:: llvm
7479
7480   !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
7481   !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
7482
7483'``llvm.loop.vectorize.width``' Metadata
7484^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7485
7486This metadata sets the target width of the vectorizer. The first
7487operand is the string ``llvm.loop.vectorize.width`` and the second
7488operand is an integer specifying the width. For example:
7489
7490.. code-block:: llvm
7491
7492   !0 = !{!"llvm.loop.vectorize.width", i32 4}
7493
7494Note that setting ``llvm.loop.vectorize.width`` to 1 disables
7495vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
74960 or if the loop does not have this metadata the width will be
7497determined automatically.
7498
7499'``llvm.loop.vectorize.followup_vectorized``' Metadata
7500^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7501
7502This metadata defines which loop attributes the vectorized loop will
7503have. See :ref:`transformation-metadata` for details.
7504
7505'``llvm.loop.vectorize.followup_epilogue``' Metadata
7506^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7507
7508This metadata defines which loop attributes the epilogue will have. The
7509epilogue is not vectorized and is executed when either the vectorized
7510loop is not known to preserve semantics (because e.g., it processes two
7511arrays that are found to alias by a runtime check) or for the last
7512iterations that do not fill a complete set of vector lanes. See
7513:ref:`Transformation Metadata <transformation-metadata>` for details.
7514
7515'``llvm.loop.vectorize.followup_all``' Metadata
7516^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7517
7518Attributes in the metadata will be added to both the vectorized and
7519epilogue loop.
7520See :ref:`Transformation Metadata <transformation-metadata>` for details.
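
The sketch below shows how the followup attributes might be spelled, assuming
the convention that each followup node lists the attribute nodes to attach to
the resulting loop. The chosen followup properties are arbitrary examples.

.. code-block:: llvm

   br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1, !2, !3}
   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
   !2 = !{!"llvm.loop.vectorize.followup_vectorized", !4} ; for the vectorized loop
   !3 = !{!"llvm.loop.vectorize.followup_epilogue", !5}   ; for the epilogue loop
   !4 = !{!"llvm.loop.unroll.count", i32 2}
   !5 = !{!"llvm.loop.unroll.disable"}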
7521
7522'``llvm.loop.unroll``'
7523^^^^^^^^^^^^^^^^^^^^^^
7524
7525Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
7526optimization hints such as the unroll factor. ``llvm.loop.unroll``
7527metadata should be used in conjunction with ``llvm.loop`` loop
7528identification metadata. The ``llvm.loop.unroll`` metadata are only
7529optimization hints and the unrolling will only be performed if the
7530optimizer believes it is safe to do so.
7531
7532'``llvm.loop.unroll.count``' Metadata
7533^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7534
7535This metadata suggests an unroll factor to the loop unroller. The
7536first operand is the string ``llvm.loop.unroll.count`` and the second
7537operand is a positive integer specifying the unroll factor. For
7538example:
7539
7540.. code-block:: llvm
7541
7542   !0 = !{!"llvm.loop.unroll.count", i32 4}
7543
7544If the trip count of the loop is less than the unroll count the loop
7545will be partially unrolled.
7546
7547'``llvm.loop.unroll.disable``' Metadata
7548^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7549
7550This metadata disables loop unrolling. The metadata has a single operand
7551which is the string ``llvm.loop.unroll.disable``. For example:
7552
7553.. code-block:: llvm
7554
7555   !0 = !{!"llvm.loop.unroll.disable"}
7556
7557'``llvm.loop.unroll.runtime.disable``' Metadata
7558^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7559
7560This metadata disables runtime loop unrolling. The metadata has a single
7561operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
7562
7563.. code-block:: llvm
7564
7565   !0 = !{!"llvm.loop.unroll.runtime.disable"}
7566
7567'``llvm.loop.unroll.enable``' Metadata
7568^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7569
7570This metadata suggests that the loop should be fully unrolled if the trip count
7571is known at compile time and partially unrolled if the trip count is not known
7572at compile time. The metadata has a single operand which is the string
7573``llvm.loop.unroll.enable``.  For example:
7574
7575.. code-block:: llvm
7576
7577   !0 = !{!"llvm.loop.unroll.enable"}
7578
7579'``llvm.loop.unroll.full``' Metadata
7580^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7581
7582This metadata suggests that the loop should be unrolled fully. The
7583metadata has a single operand which is the string ``llvm.loop.unroll.full``.
7584For example:
7585
7586.. code-block:: llvm
7587
7588   !0 = !{!"llvm.loop.unroll.full"}
7589
7590'``llvm.loop.unroll.followup``' Metadata
7591^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7592
7593This metadata defines which loop attributes the unrolled loop will have.
7594See :ref:`Transformation Metadata <transformation-metadata>` for details.
7595
7596'``llvm.loop.unroll.followup_remainder``' Metadata
7597^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7598
7599This metadata defines which loop attributes the remainder loop after
7600partial/runtime unrolling will have. See
7601:ref:`Transformation Metadata <transformation-metadata>` for details.
7602
7603'``llvm.loop.unroll_and_jam``'
7604^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7605
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
7612
7613The metadata for unroll and jam otherwise is the same as for ``unroll``.
7614``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
7615``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
7616``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
7617and the normal safety checks will still be performed.
7618
7619'``llvm.loop.unroll_and_jam.count``' Metadata
7620^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7621
7622This metadata suggests an unroll and jam factor to use, similarly to
7623``llvm.loop.unroll.count``. The first operand is the string
7624``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
7625specifying the unroll factor. For example:
7626
7627.. code-block:: llvm
7628
7629   !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
7630
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
7633
7634'``llvm.loop.unroll_and_jam.disable``' Metadata
7635^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7636
This metadata disables loop unroll and jam. The metadata has a single
7638operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
7639
7640.. code-block:: llvm
7641
7642   !0 = !{!"llvm.loop.unroll_and_jam.disable"}
7643
7644'``llvm.loop.unroll_and_jam.enable``' Metadata
7645^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7646
This metadata suggests that the loop should be fully unrolled and jammed if the
7648trip count is known at compile time and partially unrolled if the trip count is
7649not known at compile time. The metadata has a single operand which is the
7650string ``llvm.loop.unroll_and_jam.enable``.  For example:
7651
7652.. code-block:: llvm
7653
7654   !0 = !{!"llvm.loop.unroll_and_jam.enable"}
7655
7656'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
7657^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7658
7659This metadata defines which loop attributes the outer unrolled loop will
7660have. See :ref:`Transformation Metadata <transformation-metadata>` for
7661details.
7662
7663'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
7664^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7665
7666This metadata defines which loop attributes the inner jammed loop will
7667have. See :ref:`Transformation Metadata <transformation-metadata>` for
7668details.
7669
7670'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
7671^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7672
7673This metadata defines which attributes the epilogue of the outer loop
7674will have. This loop is usually unrolled, meaning there is no such
7675loop. This attribute will be ignored in this case. See
7676:ref:`Transformation Metadata <transformation-metadata>` for details.
7677
7678'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
7679^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7680
7681This metadata defines which attributes the inner loop of the epilogue
7682will have. The outer epilogue will usually be unrolled, meaning there
7683can be multiple inner remainder loops. See
7684:ref:`Transformation Metadata <transformation-metadata>` for details.
7685
7686'``llvm.loop.unroll_and_jam.followup_all``' Metadata
7687^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7688
Attributes specified in the metadata are added to all
7690``llvm.loop.unroll_and_jam.*`` loops. See
7691:ref:`Transformation Metadata <transformation-metadata>` for details.
7692
7693'``llvm.loop.licm_versioning.disable``' Metadata
7694^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7695
7696This metadata indicates that the loop should not be versioned for the purpose
7697of enabling loop-invariant code motion (LICM). The metadata has a single operand
7698which is the string ``llvm.loop.licm_versioning.disable``. For example:
7699
7700.. code-block:: llvm
7701
7702   !0 = !{!"llvm.loop.licm_versioning.disable"}
7703
7704'``llvm.loop.distribute.enable``' Metadata
7705^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7706
7707Loop distribution allows splitting a loop into multiple loops.  Currently,
7708this is only performed if the entire loop cannot be vectorized due to unsafe
7709memory dependencies.  The transformation will attempt to isolate the unsafe
7710dependencies into their own loop.
7711
7712This metadata can be used to selectively enable or disable distribution of the
7713loop.  The first operand is the string ``llvm.loop.distribute.enable`` and the
7714second operand is a bit. If the bit operand value is 1 distribution is
7715enabled. A value of 0 disables distribution:
7716
7717.. code-block:: llvm
7718
7719   !0 = !{!"llvm.loop.distribute.enable", i1 0}
7720   !1 = !{!"llvm.loop.distribute.enable", i1 1}
7721
7722This metadata should be used in conjunction with ``llvm.loop`` loop
7723identification metadata.
7724
7725'``llvm.loop.distribute.followup_coincident``' Metadata
7726^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7727
7728This metadata defines which attributes extracted loops with no cyclic
7729dependencies will have (i.e. can be vectorized). See
7730:ref:`Transformation Metadata <transformation-metadata>` for details.
7731
7732'``llvm.loop.distribute.followup_sequential``' Metadata
7733^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7734
7735This metadata defines which attributes the isolated loops with unsafe
7736memory dependencies will have. See
7737:ref:`Transformation Metadata <transformation-metadata>` for details.
7738
7739'``llvm.loop.distribute.followup_fallback``' Metadata
7740^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7741
If loop versioning is necessary, this metadata defines the attributes
7743the non-distributed fallback version will have. See
7744:ref:`Transformation Metadata <transformation-metadata>` for details.
7745
7746'``llvm.loop.distribute.followup_all``' Metadata
7747^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7748
The attributes in this metadata are added to all followup loops of the
7750loop distribution pass. See
7751:ref:`Transformation Metadata <transformation-metadata>` for details.
7752
7753'``llvm.licm.disable``' Metadata
7754^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7755
7756This metadata indicates that loop-invariant code motion (LICM) should not be
7757performed on this loop. The metadata has a single operand which is the string
7758``llvm.licm.disable``. For example:
7759
7760.. code-block:: llvm
7761
7762   !0 = !{!"llvm.licm.disable"}
7763
Note that although it operates per loop, it isn't given the ``llvm.loop`` prefix
as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
7766
7767'``llvm.access.group``' Metadata
7768^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7769
``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.
7776
7777.. code-block:: llvm
7778
7779   %val = load i32, ptr %arrayidx, !llvm.access.group !0
7780   ...
7781   !0 = !{!1, !2}
7782   !1 = distinct !{}
7783   !2 = distinct !{}
7784
7785It is illegal for the list node to be empty since it might be confused
7786with an access group.
7787
The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty also means the
content never needs to be updated, which, because metadata is
immutable by design, would require finding and updating all references
to the access group node.
7795
7796The access group can be used to refer to a memory access instruction
7797without pointing to it directly (which is not possible in global
7798metadata). Currently, the only metadata making use of it is
7799``llvm.loop.parallel_accesses``.
7800
7801'``llvm.loop.parallel_accesses``' Metadata
7802^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7803
The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between instructions in the loop
that belong to these access groups.
7808
Let ``m1`` and ``m2`` be two instructions that have
``llvm.access.group`` metadata referring to the access groups ``g1`` and
``g2``, respectively (which might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.
7817
7818If all memory-accessing instructions in a loop have
7819``llvm.access.group`` metadata that each refer to one of the access
7820groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependencies and is considered to be a
7822parallel loop. If there is a loop-carried dependency, the behavior is
7823undefined.
7824
7825Note that if not all memory access instructions belong to an access
7826group referred to by ``llvm.loop.parallel_accesses``, then the loop must
7827not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be considered
7830sequential (if optimization passes that are unaware of the parallel semantics
7831insert new memory instructions into the loop body).
7832
The following loop is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata:
7836
7837.. code-block:: llvm
7838
7839   for.body:
7840     ...
7841     %val0 = load i32, ptr %arrayidx, !llvm.access.group !1
7842     ...
7843     store i32 %val0, ptr %arrayidx1, !llvm.access.group !1
7844     ...
7845     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
7846
7847   for.end:
7848   ...
7849   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
7850   !1 = distinct !{}
7851
7852It is also possible to have nested parallel loops:
7853
7854.. code-block:: llvm
7855
7856   outer.for.body:
7857     ...
7858     %val1 = load i32, ptr %arrayidx3, !llvm.access.group !4
7859     ...
7860     br label %inner.for.body
7861
7862   inner.for.body:
7863     ...
7864     %val0 = load i32, ptr %arrayidx1, !llvm.access.group !3
7865     ...
7866     store i32 %val0, ptr %arrayidx2, !llvm.access.group !3
7867     ...
7868     br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
7869
7870   inner.for.end:
7871     ...
7872     store i32 %val1, ptr %arrayidx4, !llvm.access.group !4
7873     ...
7874     br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
7875
   outer.for.end:                                          ; preds = %inner.for.end
7877   ...
7878   !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
7879   !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
7880   !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
7881   !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
7882
7883.. _langref_llvm_loop_mustprogress:
7884
7885'``llvm.loop.mustprogress``' Metadata
7886^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7887
7888The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
7889terminate, unwind, or interact with the environment in an observable way e.g.
7890via a volatile memory access, I/O, or other synchronization. If such a loop is
7891not found to interact with the environment in an observable way, the loop may
7892be removed. This corresponds to the ``mustprogress`` function attribute.
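
A minimal sketch of a loop carrying this property, with placeholder block and
value names:

.. code-block:: llvm

   br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.mustprogress"}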
7893
7894'``irr_loop``' Metadata
7895^^^^^^^^^^^^^^^^^^^^^^^
7896
``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
7900terminator instruction of a basic block that is not really an irreducible loop
7901header, the behavior is undefined. The intent of this metadata is to improve the
7902accuracy of the block frequency propagation. For example, in the code below, the
7903block ``header0`` may have a loop header weight (relative to the other headers of
7904the irreducible loop) of 100:
7905
7906.. code-block:: llvm
7907
7908    header0:
7909    ...
7910    br i1 %cmp, label %t1, label %t2, !irr_loop !0
7911
7912    ...
    !0 = !{!"loop_header_weight", i64 100}
7914
7915Irreducible loop header weights are typically based on profile data.
7916
7917.. _md_invariant.group:
7918
7919'``invariant.group``' Metadata
7920^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7921
7922The experimental ``invariant.group`` metadata may be attached to
7923``load``/``store`` instructions referencing a single metadata with no entries.
7924The existence of the ``invariant.group`` metadata on the instruction tells
7925the optimizer that every ``load`` and ``store`` to the same pointer operand
7926can be assumed to load or store the same
7927value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
7928when two pointers are considered the same). Pointers returned by bitcast or
7929getelementptr with only zero indices are considered the same.
7930
7931Examples:
7932
7933.. code-block:: llvm
7934
7935   @unknownPtr = external global i8
7936   ...
7937   %ptr = alloca i8
7938   store i8 42, ptr %ptr, !invariant.group !0
7939   call void @foo(ptr %ptr)
7940
7941   %a = load i8, ptr %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
7942   call void @foo(ptr %ptr)
7943
7944   %newPtr = call ptr @getPointer(ptr %ptr)
7945   %c = load i8, ptr %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
7946
7947   %unknownValue = load i8, ptr @unknownPtr
7948   store i8 %unknownValue, ptr %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
7949
7950   call void @foo(ptr %ptr)
7951   %newPtr2 = call ptr @llvm.launder.invariant.group.p0(ptr %ptr)
7952   %d = load i8, ptr %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr
7953
7954   ...
7955   declare void @foo(ptr)
7956   declare ptr @getPointer(ptr)
7957   declare ptr @llvm.launder.invariant.group.p0(ptr)
7958
7959   !0 = !{}
7960
7961The invariant.group metadata must be dropped when replacing one pointer by
7962another based on aliasing information. This is because invariant.group is tied
7963to the SSA value of the pointer operand.
7964
7965.. code-block:: llvm
7966
7967  %v = load i8, ptr %x, !invariant.group !0
7968  ; if %x mustalias %y then we can replace the above instruction with
7969  %v = load i8, ptr %y
7970
7971Note that this is an experimental feature, which means that its semantics might
7972change in the future.
7973
7974'``type``' Metadata
7975^^^^^^^^^^^^^^^^^^^
7976
7977See :doc:`TypeMetadata`.
7978
7979'``associated``' Metadata
7980^^^^^^^^^^^^^^^^^^^^^^^^^
7981
7982The ``associated`` metadata may be attached to a global variable definition with
7983a single argument that references a global object (optionally through an alias).
7984
7985This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
7986discarding of the global variable in linker GC unless the referenced object is
7987also discarded. The linker support for this feature is spotty. For best
7988compatibility, globals carrying this metadata should:
7989
7990- Be in ``@llvm.compiler.used``.
7991- If the referenced global variable is in a comdat, be in the same comdat.
7992
``!associated`` cannot express a many-to-one relationship. A global variable with
7994the metadata should generally not be referenced by a function: the function may
7995be inlined into other functions, leading to more references to the metadata.
7996Ideally we would want to keep metadata alive as long as any inline location is
7997alive, but this many-to-one relationship is not representable. Moreover, if the
7998metadata is retained while the function is discarded, the linker will report an
7999error of a relocation referencing a discarded section.
8000
8001The metadata is often used with an explicit section consisting of valid C
8002identifiers so that the runtime can find the metadata section with
8003linker-defined encapsulation symbols ``__start_<section_name>`` and
8004``__stop_<section_name>``.
8005
8006It does not have any effect on non-ELF targets.
8007
8008Example:
8009
8010.. code-block:: text
8011
8012    $a = comdat any
8013    @a = global i32 1, comdat $a
8014    @b = internal global i32 2, comdat $a, section "abc", !associated !0
8015    !0 = !{ptr @a}
8016
8017
8018'``prof``' Metadata
8019^^^^^^^^^^^^^^^^^^^
8020
8021The ``prof`` metadata is used to record profile data in the IR.
8022The first operand of the metadata node indicates the profile metadata
8023type. There are currently 3 types:
8024:ref:`branch_weights<prof_node_branch_weights>`,
8025:ref:`function_entry_count<prof_node_function_entry_count>`, and
8026:ref:`VP<prof_node_VP>`.
8027
8028.. _prof_node_branch_weights:
8029
8030branch_weights
8031""""""""""""""
8032
8033Branch weight metadata attached to a branch, select, switch or call instruction
8034represents the likeliness of the associated branch being taken.
8035For more information, see :doc:`BranchWeightMetadata`.
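
As a brief illustration, the branch below is annotated so that the first
successor is treated as far more likely than the second. The weights and names
are invented for the example.

.. code-block:: llvm

   br i1 %cmp, label %likely, label %unlikely, !prof !0
   ...
   !0 = !{!"branch_weights", i32 99, i32 1}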
8036
8037.. _prof_node_function_entry_count:
8038
8039function_entry_count
8040""""""""""""""""""""
8041
8042Function entry count metadata can be attached to function definitions
8043to record the number of times the function is called. Used with BFI
8044information, it is also used to derive the basic block profile count.
8045For more information, see :doc:`BranchWeightMetadata`.
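
A sketch of a function definition annotated with an entry count; the function
name ``@hot`` and the count are hypothetical.

.. code-block:: llvm

   define void @hot() !prof !0 {
     ret void
   }

   !0 = !{!"function_entry_count", i64 1000}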
8046
8047.. _prof_node_VP:
8048
8049VP
8050""
8051
8052VP (value profile) metadata can be attached to instructions that have
8053value profile information. Currently this is indirect calls (where it
8054records the hottest callees) and calls to memory intrinsics such as memcpy,
8055memmove, and memset (where it records the hottest byte lengths).
8056
Each VP metadata node contains the string ``"VP"``, then a uint32_t value for
the value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.
8064
8065Note that the value counts do not need to add up to the total count
8066listed in the third operand (in practice only the top hottest values
8067are tracked and reported).
8068
8069Indirect call example:
8070
8071.. code-block:: llvm
8072
8073    call void %f(), !prof !1
8074    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
8075
Note that the VP kind (the second operand) is 0, which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective target functions
was called.
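
Memory operation example. This is a sketch with made-up counts. The function
name ``@copy`` and all profile numbers are assumptions chosen for illustration:

.. code-block:: llvm

    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define void @copy(ptr %dst, ptr %src, i64 %n) {
      call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %n, i1 false), !prof !0
      ret void
    }

    ; Kind 1 (memory operation), 1000 executions in total:
    ; length 8 was seen 700 times and length 16 was seen 250 times.
    !0 = !{!"VP", i32 1, i64 1000, i64 8, i64 700, i64 16, i64 250}

Here the two recorded counts sum to 950 rather than 1000, which is permitted
because only the hottest values are tracked.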
8083
8084.. _md_annotation:
8085
8086'``annotation``' Metadata
8087^^^^^^^^^^^^^^^^^^^^^^^^^
8088
The ``annotation`` metadata can be used to attach a tuple of annotation strings
or a tuple of tuples of annotation strings to any instruction. This metadata does
not impact the semantics of the program and may only be used to provide additional
insight about the program and transformations to users.
8093
8094Example:
8095
8096.. code-block:: text
8097
8098    %a.addr = alloca ptr, align 8, !annotation !0
8099    !0 = !{!"auto-init"}
8100
8101Embedding tuple of strings example:
8102
8103.. code-block:: text
8104
  %a.ptr = getelementptr ptr, ptr %base, i64 0, !annotation !0
8106  !0 = !{!1}
8107  !1 = !{!"gep offset", !"0"}
8108
8109'``func_sanitize``' Metadata
8110^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8111
8112The ``func_sanitize`` metadata is used to attach two values for the function
8113sanitizer instrumentation. The first value is the ubsan function signature.
8114The second value is the address of the proxy variable which stores the address
8115of the RTTI descriptor. If :ref:`prologue <prologuedata>` and '``func_sanitize``'
8116are used at the same time, :ref:`prologue <prologuedata>` is emitted before
8117'``func_sanitize``' in the output.
8118
8119Example:
8120
8121.. code-block:: text
8122
8123    @__llvm_rtti_proxy = private unnamed_addr constant ptr @_ZTIFvvE
8124    define void @_Z3funv() !func_sanitize !0 {
      ret void
8126    }
8127    !0 = !{i32 846595819, ptr @__llvm_rtti_proxy}
8128
8129.. _md_kcfi_type:
8130
8131'``kcfi_type``' Metadata
8132^^^^^^^^^^^^^^^^^^^^^^^^
8133
8134The ``kcfi_type`` metadata can be used to attach a type identifier to
8135functions that can be called indirectly. The type data is emitted before the
8136function entry in the assembly. Indirect calls with the :ref:`kcfi operand
8137bundle<ob_kcfi>` will emit a check that compares the type identifier to the
8138metadata.
8139
8140Example:
8141
8142.. code-block:: text
8143
8144    define dso_local i32 @f() !kcfi_type !0 {
8145      ret i32 0
8146    }
8147    !0 = !{i32 12345678}
8148
8149Clang emits ``kcfi_type`` metadata nodes for address-taken functions with
8150``-fsanitize=kcfi``.
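
The matching indirect call side can be sketched as follows. The function name
``@caller`` and the pointer ``%fp`` are hypothetical, and the type identifier
simply repeats the value from the example above:

.. code-block:: llvm

    define i32 @caller(ptr %fp) {
      ; The bundle operand is the type identifier expected at the call target.
      %r = call i32 %fp() [ "kcfi"(i32 12345678) ]
      ret i32 %r
    }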
8151
8152.. _md_memprof:
8153
8154'``memprof``' Metadata
8155^^^^^^^^^^^^^^^^^^^^^^^^
8156
8157The ``memprof`` metadata is used to record memory profile data on heap
8158allocation calls. Multiple context-sensitive profiles can be represented
8159with a single ``memprof`` metadata attachment.
8160
8161Example:
8162
8163.. code-block:: text
8164
8165    %call = call ptr @_Znam(i64 10), !memprof !0, !callsite !5
8166    !0 = !{!1, !3}
8167    !1 = !{!2, !"cold"}
8168    !2 = !{i64 4854880825882961848, i64 1905834578520680781}
8169    !3 = !{!4, !"notcold"}
8170    !4 = !{i64 4854880825882961848, i64 -6528110295079665978}
8171    !5 = !{i64 4854880825882961848}
8172
8173Each operand in the ``memprof`` metadata attachment describes the profiled
8174behavior of memory allocated by the associated allocation for a given context.
8175In the above example, there were 2 profiled contexts, one allocating memory
8176that was typically cold and one allocating memory that was typically not cold.
8177
8178The format of the metadata describing a context specific profile (e.g.
8179``!1`` and ``!3`` above) requires a first operand that is a metadata node
8180describing the context, followed by a list of string metadata tags describing
the profile behavior (e.g. ``cold`` and ``notcold`` above). The metadata nodes
8182describing the context (e.g. ``!2`` and ``!4`` above) are unique ids
8183corresponding to callsites, which can be matched to associated IR calls via
8184:ref:`callsite metadata<md_callsite>`. In practice these ids are formed via
8185a hash of the callsite's debug info, and the associated call may be in a
8186different module. The contexts are listed in order from leaf-most call (the
8187allocation itself) to the outermost callsite context required for uniquely
8188identifying the described profile behavior (note this may not be the top of
8189the profiled call stack).
8190
8191.. _md_callsite:
8192
8193'``callsite``' Metadata
8194^^^^^^^^^^^^^^^^^^^^^^^^
8195
8196The ``callsite`` metadata is used to identify callsites involved in memory
8197profile contexts described in :ref:`memprof metadata<md_memprof>`.
8198
It is attached both to the profiled allocation calls (see the example in
:ref:`memprof metadata<md_memprof>`), as well as to other callsites
in profiled contexts described in heap allocation ``memprof`` metadata.
8202
8203Example:
8204
8205.. code-block:: text
8206
    %call = call ptr @_Z1Bb(i1 %flag), !callsite !0
8208    !0 = !{i64 -6528110295079665978, i64 5462047985461644151}
8209
8210Each operand in the ``callsite`` metadata attachment is a unique id
8211corresponding to a callsite (possibly inlined). In practice these ids are
8212formed via a hash of the callsite's debug info. If the call was not inlined
8213into any callers it will contain a single operand (id). If it was inlined
8214it will contain a list of ids, including the ids of the callsites in the
8215full inline sequence, in order from the leaf-most call's id to the outermost
8216inlined call.
8217
8218
8219'``noalias.addrspace``' Metadata
8220^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8221
8222The ``noalias.addrspace`` metadata is used to identify memory
8223operations which cannot access objects allocated in a range of address
8224spaces. It is attached to memory instructions, including
8225:ref:`atomicrmw <i_atomicrmw>`, :ref:`cmpxchg <i_cmpxchg>`, and
8226:ref:`call <i_call>` instructions.
8227
This follows the same form as :ref:`range metadata <range-metadata>`,
except that the field entries must be of type ``i32``. The values are
interpreted as the same numeric address spaces that apply to IR values.
8231
8232Example:
8233
8234.. code-block:: llvm
8235
8236    ; %ptr cannot point to an object allocated in addrspace(5)
8237    %rmw.valid = atomicrmw and ptr %ptr, i64 %value seq_cst, !noalias.addrspace !0
8238
8239    ; Undefined behavior. The underlying object is allocated in one of the listed
8240    ; address spaces.
8241    %alloca = alloca i64, addrspace(5)
8242    %alloca.cast = addrspacecast ptr addrspace(5) %alloca to ptr
8243    %rmw.ub = atomicrmw and ptr %alloca.cast, i64 %value seq_cst, !noalias.addrspace !0
8244
8245    !0 = !{i32 5, i32 6} ; Exclude addrspace(5) only
8246
8247
8248This is intended for use on targets with a notion of generic address
8249spaces, which at runtime resolve to different physical memory
8250spaces. The interpretation of the address space values is target
8251specific. The behavior is undefined if the runtime memory address does
8252resolve to an object defined in one of the indicated address spaces.
8253
8254
8255Module Flags Metadata
8256=====================
8257
8258Information about the module as a whole is difficult to convey to LLVM's
8259subsystems. The LLVM IR isn't sufficient to transmit this information.
8260The ``llvm.module.flags`` named metadata exists in order to facilitate
8261this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
8263look it up.
8264
8265The ``llvm.module.flags`` metadata contains a list of metadata triplets.
8266Each triplet has the following form:
8267
8268-  The first element is a *behavior* flag, which specifies the behavior
8269   when two (or more) modules are merged together, and it encounters two
8270   (or more) metadata with the same ID. The supported behaviors are
8271   described below.
8272-  The second element is a metadata string that is a unique ID for the
8273   metadata. Each module may only have one flag entry for each unique ID (not
8274   including entries with the **Require** behavior).
8275-  The third element is the value of the flag.
8276
8277When two (or more) modules are merged together, the resulting
8278``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
8279each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
8281be determined by the merge behavior flag, as described below. The only exception
8282is that entries with the *Require* behavior are always preserved.
8283
8284The following behaviors are supported:
8285
8286.. list-table::
8287   :header-rows: 1
8288   :widths: 10 90
8289
8290   * - Value
8291     - Behavior
8292
8293   * - 1
8294     - **Error**
8295           Emits an error if two values disagree, otherwise the resulting value
8296           is that of the operands.
8297
8298   * - 2
8299     - **Warning**
8300           Emits a warning if two values disagree. The result value will be the
8301           operand for the flag from the first module being linked, unless the
8302           other module uses **Min** or **Max**, in which case the result will
8303           be **Min** (with the min value) or **Max** (with the max value),
8304           respectively.
8305
8306   * - 3
8307     - **Require**
8308           Adds a requirement that another module flag be present and have a
8309           specified value after linking is performed. The value must be a
8310           metadata pair, where the first element of the pair is the ID of the
8311           module flag to be restricted, and the second element of the pair is
8312           the value the module flag should be restricted to. This behavior can
8313           be used to restrict the allowable results (via triggering of an
8314           error) of linking IDs with the **Override** behavior.
8315
8316   * - 4
8317     - **Override**
8318           Uses the specified value, regardless of the behavior or value of the
8319           other module. If both modules specify **Override**, but the values
8320           differ, an error will be emitted.
8321
8322   * - 5
8323     - **Append**
8324           Appends the two values, which are required to be metadata nodes.
8325
8326   * - 6
8327     - **AppendUnique**
8328           Appends the two values, which are required to be metadata
8329           nodes. However, duplicate entries in the second list are dropped
8330           during the append operation.
8331
8332   * - 7
8333     - **Max**
8334           Takes the max of the two values, which are required to be integers.
8335
8336   * - 8
8337     - **Min**
8338           Takes the min of the two values, which are required to be non-negative integers.
8339           An absent module flag is treated as having the value 0.
8340
8341It is an error for a particular unique flag ID to have multiple behaviors,
8342except in the case of **Require** (which adds restrictions on another metadata
8343value) or **Override**.
8344
8345An example of module flags:
8346
8347.. code-block:: llvm
8348
8349    !0 = !{ i32 1, !"foo", i32 1 }
8350    !1 = !{ i32 4, !"bar", i32 37 }
8351    !2 = !{ i32 2, !"qux", i32 42 }
8352    !3 = !{ i32 3, !"qux",
8353      !{
8354        !"foo", i32 1
8355      }
8356    }
8357    !llvm.module.flags = !{ !0, !1, !2, !3 }
8358
8359-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
8360   if two or more ``!"foo"`` flags are seen is to emit an error if their
8361   values are not equal.
8362
8363-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
8364   behavior if two or more ``!"bar"`` flags are seen is to use the value
8365   '37'.
8366
8367-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
8368   behavior if two or more ``!"qux"`` flags are seen is to emit a
8369   warning if their values are not equal.
8370
8371-  Metadata ``!3`` has the ID ``!"qux"`` and the value:
8372
8373   ::
8374
8375       !{ !"foo", i32 1 }
8376
8377   The behavior is to emit an error if the ``llvm.module.flags`` does not
8378   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
8379   performed.
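
As an illustrative sketch of the merge rules, consider linking two modules
that use the **Max** and **AppendUnique** behaviors. The flag IDs
``example-level`` and ``example-list`` are hypothetical and are not recognized
by LLVM. The three fragments below are separate modules, shown in one block
only for comparison:

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example-level", i32 1 }
    !1 = !{ i32 6, !"example-list", !{ !"a", !"b" } }

    ; Module B
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example-level", i32 2 }
    !1 = !{ i32 6, !"example-list", !{ !"b", !"c" } }

    ; Flags expected after linking A with B, following the behavior table
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example-level", i32 2 }
    !1 = !{ i32 6, !"example-list", !{ !"a", !"b", !"c" } }

**Max** keeps the larger of 1 and 2, and **AppendUnique** drops the duplicate
``!"b"`` from the second list while appending.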
8380
8381Synthesized Functions Module Flags Metadata
8382-------------------------------------------
8383
8384These metadata specify the default attributes synthesized functions should have.
8385These metadata are currently respected by a few instrumentation passes, such as
8386sanitizers.
8387
8388These metadata correspond to a few function attributes with significant code
8389generation behaviors. Function attributes with just optimization purposes
8390should not be listed because the performance impact of these synthesized
8391functions is small.
8392
8393- "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
8394  will get the "frame-pointer" function attribute, with value being "none",
8395  "non-leaf", or "all", respectively.
8396- "function_return_thunk_extern": The synthesized function will get the
8397  ``fn_return_thunk_extern`` function attribute.
8398- "uwtable": **Max**. The value can be 0, 1, or 2. If the value is 1, a synthesized
8399  function will get the ``uwtable(sync)`` function attribute, if the value is 2,
8400  a synthesized function will get the ``uwtable(async)`` function attribute.
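
For example, a module that wants synthesized functions to keep all frame
pointers and to use asynchronous unwind tables could carry the following flags.
This is a sketch based on the value mappings listed above:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2} ; "frame-pointer"="all"
    !1 = !{i32 7, !"uwtable", i32 2}       ; uwtable(async)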
8401
8402Objective-C Garbage Collection Module Flags Metadata
8403----------------------------------------------------
8404
8405On the Mach-O platform, Objective-C stores metadata about garbage
8406collection in a special section called "image info". The metadata
8407consists of a version number and a bitmask specifying what types of
8408garbage collection are supported (if any) by the file. If two or more
8409modules are linked together their garbage collection metadata needs to
8410be merged rather than appended together.
8411
8412The Objective-C garbage collection module flags metadata consists of the
8413following key-value pairs:
8414
8415.. list-table::
8416   :header-rows: 1
8417   :widths: 30 70
8418
8419   * - Key
8420     - Value
8421
8422   * - ``Objective-C Version``
8423     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
8424
8425   * - ``Objective-C Image Info Version``
8426     - **[Required]** --- The version of the image info section. Currently
8427       always 0.
8428
8429   * - ``Objective-C Image Info Section``
8430     - **[Required]** --- The section to place the metadata. Valid values are
8431       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
8432       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
8433       Objective-C ABI version 2.
8434
8435   * - ``Objective-C Garbage Collection``
8436     - **[Required]** --- Specifies whether garbage collection is supported or
8437       not. Valid values are 0, for no garbage collection, and 2, for garbage
8438       collection supported.
8439
8440   * - ``Objective-C GC Only``
8441     - **[Optional]** --- Specifies that only garbage collection is supported.
8442       If present, its value must be 6. This flag requires that the
8443       ``Objective-C Garbage Collection`` flag have the value 2.
8444
8445Some important flag interactions:
8446
8447-  If a module with ``Objective-C Garbage Collection`` set to 0 is
8448   merged with a module with ``Objective-C Garbage Collection`` set to
8449   2, then the resulting module has the
8450   ``Objective-C Garbage Collection`` flag set to 0.
8451-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
8452   merged with a module with ``Objective-C GC Only`` set to 6.
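
A sketch of these flags for an Objective-C ABI version 2 module without garbage
collection might look as follows. The keys and values come from the table
above. The merge behavior chosen for each flag (the first operand) is an
assumption made for illustration and may differ from what a frontend emits:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1, !2, !3}
    !0 = !{i32 1, !"Objective-C Version", i32 2}
    !1 = !{i32 1, !"Objective-C Image Info Version", i32 0}
    !2 = !{i32 1, !"Objective-C Image Info Section", !"__DATA,__objc_imageinfo, regular, no_dead_strip"}
    !3 = !{i32 1, !"Objective-C Garbage Collection", i32 0}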
8453
8454C type width Module Flags Metadata
8455----------------------------------
8456
8457The ARM backend emits a section into each generated object file describing the
8458options that it was compiled with (in a compiler-independent way) to prevent
8459linking incompatible objects, and to allow automatic library selection. Some
8460of these options are not visible at the IR level, namely wchar_t width and enum
8461width.
8462
8463To pass this information to the backend, these options are encoded in module
8464flags metadata, using the following key-value pairs:
8465
8466.. list-table::
8467   :header-rows: 1
8468   :widths: 30 70
8469
8470   * - Key
8471     - Value
8472
8473   * - short_wchar
8474     - * 0 --- sizeof(wchar_t) == 4
8475       * 1 --- sizeof(wchar_t) == 2
8476
8477   * - short_enum
8478     - * 0 --- Enums are at least as large as an ``int``.
8479       * 1 --- Enums are stored in the smallest integer type which can
8480         represent all of its values.
8481
8482For example, the following metadata section specifies that the module was
8483compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
8484enum is the smallest type which can represent all of its values::
8485
8486    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}
8489
8490Stack Alignment Metadata
8491------------------------
8492
8493Changes the default stack alignment from the target ABI's implicit default
8494stack alignment. Takes an i32 value in bytes. It is considered an error to link
8495two modules together with different values for this metadata.
8496
For example::

    !llvm.module.flags = !{!0}
    !0 = !{i32 1, !"override-stack-alignment", i32 8}

This will change the stack alignment to 8 bytes.
8503
8504Embedded Objects Names Metadata
8505===============================
8506
8507Offloading compilations need to embed device code into the host section table to
8508create a fat binary. This metadata node references each global that will be
8509embedded in the module. The primary use for this is to make referencing these
8510globals more efficient in the IR. The metadata references nodes containing
8511pointers to the global to be embedded followed by the section name it will be
8512stored at::
8513
8514    !llvm.embedded.objects = !{!0}
8515    !0 = !{ptr @object, !".section"}
8516
8517Automatic Linker Flags Named Metadata
8518=====================================
8519
8520Some targets support embedding of flags to the linker inside individual object
8521files. Typically this is used in conjunction with language extensions which
8522allow source files to contain linker command line options, and have these
8523automatically be transmitted to the linker via object files.
8524
8525These flags are encoded in the IR using named metadata with the name
8526``!llvm.linker.options``. Each operand is expected to be a metadata node
8527which should be a list of other metadata nodes, each of which should be a
8528list of metadata strings defining linker options.
8529
8530For example, the following metadata section specifies two separate sets of
8531linker options, presumably to link against ``libz`` and the ``Cocoa``
8532framework::
8533
8534    !0 = !{ !"-lz" }
8535    !1 = !{ !"-framework", !"Cocoa" }
8536    !llvm.linker.options = !{ !0, !1 }
8537
8538The metadata encoding as lists of lists of options, as opposed to a collapsed
8539list of options, is chosen so that the IR encoding can use multiple option
8540strings to specify e.g., a single library, while still having that specifier be
8541preserved as an atomic element that can be recognized by a target specific
8542assembly writer or object file emitter.
8543
8544Each individual option is required to be either a valid option for the target's
8545linker, or an option that is reserved by the target specific assembly writer or
8546object file emitter. No other aspect of these options is defined by the IR.
8547
8548Dependent Libs Named Metadata
8549=============================
8550
8551Some targets support embedding of strings into object files to indicate
8552a set of libraries to add to the link. Typically this is used in conjunction
8553with language extensions which allow source files to explicitly declare the
8554libraries they depend on, and have these automatically be transmitted to the
8555linker via object files.
8556
8557The list is encoded in the IR using named metadata with the name
8558``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
8559which should contain a single string operand.
8560
8561For example, the following metadata section contains two library specifiers::
8562
8563    !0 = !{!"a library specifier"}
8564    !1 = !{!"another library specifier"}
8565    !llvm.dependent-libraries = !{ !0, !1 }
8566
8567Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.
8569
8570.. _summary:
8571
8572ThinLTO Summary
8573===============
8574
8575Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
8576causes the building of a compact summary of the module that is emitted into
8577the bitcode. The summary is emitted into the LLVM assembly and identified
8578in syntax by a caret ('``^``').
8579
8580The summary is parsed into a bitcode output, along with the Module
8581IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
8582of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
8583summary entries (just as they currently ignore summary entries in a bitcode
8584input file).
8585
Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where a summary index is currently built from bitcode.
Specifically, this applies to tools that test the Thin Link portion of a
ThinLTO compile (i.e. llvm-lto and llvm-lto2), and to parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag
(this part is not yet implemented; for now, use llvm-as to create a bitcode
object before feeding it into thin link tools).
8593
8594There are currently 3 types of summary entries in the LLVM assembly:
8595:ref:`module paths<module_path_summary>`,
8596:ref:`global values<gv_summary>`, and
8597:ref:`type identifiers<typeid_summary>`.
8598
8599.. _module_path_summary:
8600
8601Module Path Summary Entry
8602-------------------------
8603
8604Each module path summary entry lists a module containing global values included
8605in the summary. For a single IR module there will be one such entry, but
8606in a combined summary index produced during the thin link, there will be
8607one module path entry per linked module with summary.
8608
8609Example:
8610
8611.. code-block:: text
8612
8613    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
8614
8615The ``path`` field is a string path to the bitcode file, and the ``hash``
8616field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
8617incremental builds and caching.
8618
8619.. _gv_summary:
8620
8621Global Value Summary Entry
8622--------------------------
8623
8624Each global value summary entry corresponds to a global value defined or
8625referenced by a summarized module.
8626
8627Example:
8628
8629.. code-block:: text
8630
8631    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
8632
8633For declarations, there will not be a summary list. For definitions, a
8634global value will contain a list of summaries, one per module containing
8635a definition. There can be multiple entries in a combined summary index
8636for symbols with weak linkage.
8637
8638Each ``Summary`` format will depend on whether the global value is a
8639:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
8640:ref:`alias<alias_summary>`.
8641
8642.. _function_summary:
8643
8644Function Summary
8645^^^^^^^^^^^^^^^^
8646
8647If the global value is a function, the ``Summary`` entry will look like:
8648
8649.. code-block:: text
8650
    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
8652
8653The ``module`` field includes the summary entry id for the module containing
8654this definition, and the ``flags`` field contains information such as
8655the linkage type, a flag indicating whether it is legal to import the
8656definition, whether it is globally live and whether the linker resolved it
8657to a local definition (the latter two are populated during the thin link).
8658The ``insts`` field contains the number of IR instructions in the function.
8659Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
8660:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
8661:ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
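
As a concrete sketch, the entries below describe a function ``caller`` that
calls ``callee`` and references a global ``counter``. The module path, the
all-zero hash, the symbol names, and the summary ids are placeholders chosen
for illustration, and the example has not been run through ``llvm-as``:

.. code-block:: text

    ^0 = module: (path: "a.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "callee")
    ^2 = gv: (name: "counter")
    ^3 = gv: (name: "caller", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 3, calls: ((callee: ^1)), refs: (^2))))

Here ``^1`` and ``^2`` have no summary list, so in this module they are only
declarations.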
8662
8663.. _variable_summary:
8664
8665Global Variable Summary
8666^^^^^^^^^^^^^^^^^^^^^^^
8667
8668If the global value is a variable, the ``Summary`` entry will look like:
8669
8670.. code-block:: text
8671
    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
8673
8674The variable entry contains a subset of the fields in a
8675:ref:`function summary <function_summary>`, see the descriptions there.
8676
8677.. _alias_summary:
8678
8679Alias Summary
8680^^^^^^^^^^^^^
8681
8682If the global value is an alias, the ``Summary`` entry will look like:
8683
8684.. code-block:: text
8685
8686    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
8687
8688The ``module`` and ``flags`` fields are as described for a
8689:ref:`function summary <function_summary>`. The ``aliasee`` field
8690contains a reference to the global value summary entry of the aliasee.
8691
8692.. _funcflags_summary:
8693
8694Function Flags
8695^^^^^^^^^^^^^^
8696
8697The optional ``FuncFlags`` field looks like:
8698
8699.. code-block:: text
8700
8701    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
8702
8703If unspecified, flags are assumed to hold the conservative ``false`` value of
8704``0``.
8705
8706.. _calls_summary:
8707
8708Calls
8709^^^^^
8710
8711The optional ``Calls`` field looks like:
8712
8713.. code-block:: text
8714
8715    calls: ((Callee)[, (Callee)]*)
8716
8717where each ``Callee`` looks like:
8718
8719.. code-block:: text
8720
8721    callee: ^1[, hotness: None]?[, relbf: 0]?
8722
8723The ``callee`` refers to the summary entry id of the callee. At most one
8724of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
8725``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
8726branch frequency relative to the entry frequency, scaled down by 2^8)
8727may be specified. The defaults are ``Unknown`` and ``0``, respectively.
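
As a worked sketch of the ``relbf`` scaling, assume the stored integer is the
relative frequency multiplied by 2^8, so that dividing it by 256 recovers the
ratio to the entry frequency. The summary ids are placeholders:

.. code-block:: text

    calls: ((callee: ^1, relbf: 256), (callee: ^2, relbf: 64))

Under that assumption, the call to ``^1`` executes as often as the function
entry (256 / 256 = 1), and the call to ``^2`` executes a quarter as often
(64 / 256 = 0.25).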
8728
8729.. _params_summary:
8730
8731Params
8732^^^^^^
8733
8734The optional ``Params`` is used by ``StackSafety`` and looks like:
8735
8736.. code-block:: text
8737
    params: ((Param)[, (Param)]*)
8739
8740where each ``Param`` describes pointer parameter access inside of the
8741function and looks like:
8742
8743.. code-block:: text
8744
8745    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
8746
where ``param`` is the number of the parameter it describes, and
``offset`` is the inclusive range of offsets from the pointer parameter to bytes
which can be accessed by the function. This range does not include accesses by
function calls from the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded into other
functions and looks like:
8754
8755.. code-block:: text
8756
8757    callee: ^3, param: 5, offset: [-3, 3]
8758
The ``callee`` refers to the summary entry id of the callee, and ``param`` is
the number of the callee parameter which points into the caller's parameter
with an offset known to be inside of the ``offset`` range. ``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.
8764
A pointer parameter without a corresponding ``Param`` is considered unsafe, and
we assume that access with any offset is possible.
8767
8768Example:
8769
8770If we have the following function:
8771
8772.. code-block:: text
8773
8774    define i64 @foo(ptr %0, ptr %1, ptr %2, i8 %3) {
8775      store ptr %1, ptr @x
8776      %5 = getelementptr inbounds i8, ptr %2, i64 5
8777      %6 = load i8, ptr %5
8778      %7 = getelementptr inbounds i8, ptr %2, i8 %3
8779      tail call void @bar(i8 %3, ptr %7)
8780      %8 = load i64, ptr %0
8781      ret i64 %8
8782    }
8783
We can expect a record like this:
8785
8786.. code-block:: text
8787
8788    params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
8789
This record can be read as follows:

- The function may access just 8 bytes of the parameter ``%0``. ``calls`` is
  empty, so the parameter is either not used for function calls or ``offset``
  already covers all accesses from nested function calls.
- Parameter ``%1`` escapes, so its access is unknown.
- The function itself can access just a single byte of the parameter ``%2``.
  Additional access is possible inside of ``@bar`` (``^3``). The function adds
  a signed offset to the pointer and passes the result as the argument ``%1``
  into ``^3``. This record itself does not tell us how ``^3`` will access the
  parameter.
- Parameter ``%3`` is not a pointer.
8799
8800.. _refs_summary:
8801
8802Refs
8803^^^^
8804
8805The optional ``Refs`` field looks like:
8806
8807.. code-block:: text
8808
8809    refs: ((Ref)[, (Ref)]*)
8810
8811where each ``Ref`` contains a reference to the summary id of the referenced
8812value (e.g. ``^1``).
8813
8814.. _typeidinfo_summary:
8815
8816TypeIdInfo
8817^^^^^^^^^^
8818
8819The optional ``TypeIdInfo`` field, used for
8820`Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8821looks like:
8822
8823.. code-block:: text
8824
8825    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
8826
8827These optional fields have the following forms:
8828
8829TypeTests
8830"""""""""
8831
8832.. code-block:: text
8833
8834    typeTests: (TypeIdRef[, TypeIdRef]*)
8835
8836Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8837by summary id or ``GUID``.
8838
8839TypeTestAssumeVCalls
8840""""""""""""""""""""
8841
8842.. code-block:: text
8843
8844    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
8845
8846Where each VFuncId has the format:
8847
8848.. code-block:: text
8849
8850    vFuncId: (TypeIdRef, offset: 16)
8851
8852Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8853by summary id or ``GUID`` preceded by a ``guid:`` tag.
8854
8855TypeCheckedLoadVCalls
8856"""""""""""""""""""""
8857
8858.. code-block:: text
8859
8860    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
8861
8862Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
8863
8864TypeTestAssumeConstVCalls
8865"""""""""""""""""""""""""
8866
8867.. code-block:: text
8868
8869    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
8870
8871Where each ConstVCall has the format:
8872
8873.. code-block:: text
8874
8875    (VFuncId, args: (Arg[, Arg]*))
8876
8877and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
8878and each Arg is an integer argument number.
8879
8880TypeCheckedLoadConstVCalls
8881""""""""""""""""""""""""""
8882
8883.. code-block:: text
8884
8885    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
8886
8887Where each ConstVCall has the format described for
8888``TypeTestAssumeConstVCalls``.
8889
8890.. _typeid_summary:
8891
8892Type ID Summary Entry
8893---------------------
8894
8895Each type id summary entry corresponds to a type identifier resolution
8896which is generated during the LTO link portion of the compile when building
8897with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8898so these are only present in a combined summary index.
8899
8900Example:
8901
8902.. code-block:: text
8903
8904    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
8905
8906The ``typeTestRes`` gives the type test resolution ``kind`` (which may
8907be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
8908the ``size-1`` bit width. It is followed by optional flags, which default to 0,
8909and an optional WpdResolutions (whole program devirtualization resolution)
8910field that looks like:
8911
8912.. code-block:: text
8913
    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
8915
8916where each entry is a mapping from the given byte offset to the whole-program
8917devirtualization resolution WpdRes, that has one of the following formats:
8918
8919.. code-block:: text
8920
8921    wpdRes: (kind: branchFunnel)
8922    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
8923    wpdRes: (kind: indir)
8924
8925Additionally, each wpdRes has an optional ``resByArg`` field, which
8926describes the resolutions for calls with all constant integer arguments:
8927
8928.. code-block:: text
8929
8930    resByArg: (ResByArg[, ResByArg]*)
8931
8932where ResByArg is:
8933
8934.. code-block:: text
8935
    args: (Arg[, Arg]*), byArg: (kind: uniformRetVal[, info: 0][, byte: 0][, bit: 0])

Where the ``kind`` can be ``indir``, ``uniformRetVal``, ``uniqueRetVal``
or ``virtualConstProp``. The ``info`` field is only used if the kind
is ``uniformRetVal`` (where it holds the uniform return value) or
``uniqueRetVal`` (where it holds the return value associated with the unique
vtable, 0 or 1). The ``byte`` and ``bit`` fields are only used if the target
does not support the use of absolute symbols to store constants.
8944
8945.. _intrinsicglobalvariables:
8946
8947Intrinsic Global Variables
8948==========================
8949
8950LLVM has a number of "magic" global variables that contain data that
8951affect code generation or other IR semantics. These are documented here.
8952All globals of this sort should have a section specified as
8953"``llvm.metadata``". This section and all globals that start with
8954"``llvm.``" are reserved for use by LLVM.
8955
8956.. _gv_llvmused:
8957
8958The '``llvm.used``' Global Variable
8959-----------------------------------
8960
8961The ``@llvm.used`` global is an array which has
8962:ref:`appending linkage <linkage_appending>`. This array contains a list of
8963pointers to named global variables, functions and aliases which may optionally
8964have a pointer cast formed of bitcast or getelementptr. For example, a legal
8965use of it is:
8966
8967.. code-block:: llvm
8968
8969    @X = global i8 4
8970    @Y = global i32 123
8971
8972    @llvm.used = appending global [2 x ptr] [
8973       ptr @X,
8974       ptr @Y
8975    ], section "llvm.metadata"
8976
If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that they cannot see (which is why the symbols have to be named). For
example, if a variable has internal linkage and no references other than that
from the ``@llvm.used`` list, it cannot be deleted. This is commonly used to
represent references from inline asms and other things the compiler cannot
"see", and corresponds to "``__attribute__((used))``" in GNU C.
8984
8985On some targets, the code generator must emit a directive to the
8986assembler or object file to prevent the assembler and linker from
8987removing the symbol.
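
The internal-linkage case described above can be sketched as follows. The
names ``@counter`` and ``@bump`` are hypothetical; the point is only that
``@bump`` has no visible callers and would normally be eligible for deletion.

.. code-block:: llvm

    ; Hypothetical module: @bump has internal linkage and no direct uses.
    @counter = internal global i32 0

    define internal void @bump() {
      %old = load i32, ptr @counter
      %new = add i32 %old, 1
      store i32 %new, ptr @counter
      ret void
    }

    ; Listing @bump here makes the toolchain treat it as referenced,
    ; so it is kept, and with it the use of @counter.
    @llvm.used = appending global [1 x ptr] [ptr @bump], section "llvm.metadata"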
8988
8989.. _gv_llvmcompilerused:
8990
8991The '``llvm.compiler.used``' Global Variable
8992--------------------------------------------
8993
8994The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
8995directive, except that it only prevents the compiler from touching the
8996symbol. On targets that support it, this allows an intelligent linker to
8997optimize references to the symbol without being impeded as it would be
8998by ``@llvm.used``.
8999
This construct should only be used in rare circumstances, and should not
be exposed to source languages.
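
A minimal sketch, assuming a hypothetical internal table ``@table`` that the
compiler must keep but that the linker remains free to discard:

.. code-block:: llvm

    ; Hypothetical data with no visible uses inside the module.
    @table = internal constant [2 x i32] [i32 1, i32 2]

    ; Keeps @table alive through compilation only.
    @llvm.compiler.used = appending global [1 x ptr] [ptr @table], section "llvm.metadata"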
9002
9003.. _gv_llvmglobalctors:
9004
9005The '``llvm.global_ctors``' Global Variable
9006-------------------------------------------
9007
9008.. code-block:: llvm
9009
9010    %0 = type { i32, ptr, ptr }
9011    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, ptr @ctor, ptr @data }]
9012
9013The ``@llvm.global_ctors`` array contains a list of constructor
9014functions, priorities, and an associated global or function.
9015The functions referenced by this array will be called in ascending order
9016of priority (i.e. lowest first) when the module is loaded. The order of
9017functions with the same priority is not defined.
9018
9019If the third field is non-null, and points to a global variable
9020or function, the initializer function will only run if the associated
9021data from the current module is not discarded.
9022On ELF the referenced global variable or function must be in a comdat.
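
To illustrate the priority ordering, the following sketch registers two
constructors with different priorities. The function names are hypothetical,
and the third field is null because no associated data is used here.

.. code-block:: llvm

    ; Hypothetical constructors; both take no arguments and return void.
    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }

    ; Priority 101 is lower than 65535, so @init_early is called first.
    @llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [
      { i32, ptr, ptr } { i32 101, ptr @init_early, ptr null },
      { i32, ptr, ptr } { i32 65535, ptr @init_late, ptr null }
    ]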
9023
9024.. _llvmglobaldtors:
9025
9026The '``llvm.global_dtors``' Global Variable
9027-------------------------------------------
9028
9029.. code-block:: llvm
9030
9031    %0 = type { i32, ptr, ptr }
9032    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, ptr @dtor, ptr @data }]
9033
9034The ``@llvm.global_dtors`` array contains a list of destructor
9035functions, priorities, and an associated global or function.
9036The functions referenced by this array will be called in descending
9037order of priority (i.e. highest first) when the module is unloaded. The
9038order of functions with the same priority is not defined.
9039
9040If the third field is non-null, and points to a global variable
9041or function, the destructor function will only run if the associated
9042data from the current module is not discarded.
9043On ELF the referenced global variable or function must be in a comdat.
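
The associated-data field can be sketched as follows, assuming an ELF target.
The names ``@cache`` and ``@flush_cache`` are hypothetical. Both are placed in
the same comdat, so the destructor is meant to run only if this module's copy
of ``@cache`` is the one that is kept.

.. code-block:: llvm

    ; Hypothetical comdat group shared by the data and its destructor.
    $cache = comdat any

    @cache = linkonce_odr global i32 0, comdat

    define linkonce_odr void @flush_cache() comdat($cache) {
      store i32 0, ptr @cache
      ret void
    }

    ; The third field ties the destructor to @cache.
    @llvm.global_dtors = appending global [1 x { i32, ptr, ptr }] [
      { i32, ptr, ptr } { i32 65535, ptr @flush_cache, ptr @cache }
    ]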
9044
9045Instruction Reference
9046=====================
9047
9048The LLVM instruction set consists of several different classifications
9049of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
9050instructions <binaryops>`, :ref:`bitwise binary
9051instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
9052:ref:`other instructions <otherops>`. There are also :ref:`debug records
9053<debugrecords>`, which are not instructions themselves but are printed
9054interleaved with instructions to describe changes in the state of the program's
9055debug information at each position in the program's execution.
9056
9057.. _terminators:
9058
9059Terminator Instructions
9060-----------------------
9061
9062As mentioned :ref:`previously <functionstructure>`, every basic block in a
9063program ends with a "Terminator" instruction, which indicates which
9064block should be executed after the current block is finished. These
9065terminator instructions typically yield a '``void``' value: they produce
9066control flow, not values (the one exception being the
9067':ref:`invoke <i_invoke>`' instruction).
9068
9069The terminator instructions are: ':ref:`ret <i_ret>`',
9070':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
9071':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
9073':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
9074':ref:`catchret <i_catchret>`',
9075':ref:`cleanupret <i_cleanupret>`',
9076and ':ref:`unreachable <i_unreachable>`'.
9077
9078.. _i_ret:
9079
9080'``ret``' Instruction
9081^^^^^^^^^^^^^^^^^^^^^
9082
9083Syntax:
9084"""""""
9085
9086::
9087
9088      ret <type> <value>       ; Return a value from a non-void function
9089      ret void                 ; Return from void function
9090
9091Overview:
9092"""""""""
9093
9094The '``ret``' instruction is used to return control flow (and optionally
9095a value) from a function back to the caller.
9096
There are two forms of the '``ret``' instruction: one that returns a
value and then transfers control, and one that only transfers control.
9100
9101Arguments:
9102""""""""""
9103
9104The '``ret``' instruction optionally accepts a single argument, the
9105return value. The type of the return value must be a ':ref:`first
9106class <t_firstclass>`' type.
9107
9108A function is not :ref:`well formed <wellformed>` if it has a non-void
9109return type and contains a '``ret``' instruction with no return value or
9110a return value with a type that does not match its type, or if it has a
9111void return type and contains a '``ret``' instruction with a return
9112value.
9113
9114Semantics:
9115""""""""""
9116
9117When the '``ret``' instruction is executed, control flow returns back to
9118the calling function's context. If the caller is a
9119":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller is an
9121":ref:`invoke <i_invoke>`" instruction, execution continues at the
9122beginning of the "normal" destination block. If the instruction returns
9123a value, that value shall set the call or invoke instruction's return
9124value.
9125
9126Example:
9127""""""""
9128
9129.. code-block:: llvm
9130
9131      ret i32 5                       ; Return an integer value of 5
9132      ret void                        ; Return from a void function
9133      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
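
As a fuller sketch of the semantics, the following hypothetical pair of
functions shows a returned value becoming the result of the ``call`` and
execution continuing at the instruction after the call:

.. code-block:: llvm

    ; Hypothetical callee: returns its argument plus one.
    define i32 @callee(i32 %x) {
      %y = add i32 %x, 1
      ret i32 %y
    }

    define i32 @caller() {
      %r = call i32 @callee(i32 41)   ; %r is set by the callee's ret
      %s = mul i32 %r, 2              ; execution continues here
      ret i32 %s
    }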
9134
9135.. _i_br:
9136
9137'``br``' Instruction
9138^^^^^^^^^^^^^^^^^^^^
9139
9140Syntax:
9141"""""""
9142
9143::
9144
9145      br i1 <cond>, label <iftrue>, label <iffalse>
9146      br label <dest>          ; Unconditional branch
9147
9148Overview:
9149"""""""""
9150
9151The '``br``' instruction is used to cause control flow to transfer to a
9152different basic block in the current function. There are two forms of
9153this instruction, corresponding to a conditional branch and an
9154unconditional branch.
9155
9156Arguments:
9157""""""""""
9158
9159The conditional branch form of the '``br``' instruction takes a single
9160'``i1``' value and two '``label``' values. The unconditional form of the
9161'``br``' instruction takes a single '``label``' value as a target.
9162
9163Semantics:
9164""""""""""
9165
9166Upon execution of a conditional '``br``' instruction, the '``i1``'
9167argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
9169to the '``iffalse``' ``label`` argument.
9170If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
9171behavior.
9172
9173Example:
9174""""""""
9175
9176.. code-block:: llvm
9177
9178    Test:
9179      %cond = icmp eq i32 %a, %b
9180      br i1 %cond, label %IfEqual, label %IfUnequal
9181    IfEqual:
9182      ret i32 1
9183    IfUnequal:
9184      ret i32 0
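
Both forms can be combined to build a loop. The following is a minimal sketch
with hypothetical names; for ``%n`` of at least 1 it adds up the integers from
0 through ``%n - 1``.

.. code-block:: llvm

    define i32 @sum_to(i32 %n) {
    entry:
      br label %loop                  ; unconditional form
    loop:
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      %acc.next = add i32 %acc, %i
      %i.next = add i32 %i, 1
      %done = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %loop   ; conditional form
    exit:
      ret i32 %acc.next
    }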
9185
9186.. _i_switch:
9187
9188'``switch``' Instruction
9189^^^^^^^^^^^^^^^^^^^^^^^^
9190
9191Syntax:
9192"""""""
9193
9194::
9195
9196      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
9197
9198Overview:
9199"""""""""
9200
9201The '``switch``' instruction is used to transfer control flow to one of
9202several different places. It is a generalization of the '``br``'
9203instruction, allowing a branch to occur to one of many possible
9204destinations.
9205
9206Arguments:
9207""""""""""
9208
9209The '``switch``' instruction uses three parameters: an integer
9210comparison value '``value``', a default '``label``' destination, and an
9211array of pairs of comparison value constants and '``label``'s. The table
9212is not allowed to contain duplicate constant entries.
9213
9214Semantics:
9215""""""""""
9216
9217The ``switch`` instruction specifies a table of values and destinations.
9218When the '``switch``' instruction is executed, this table is searched
9219for the given value. If the value is found, control flow is transferred
9220to the corresponding destination; otherwise, control flow is transferred
9221to the default destination.
9222If '``value``' is ``poison`` or ``undef``, this instruction has undefined
9223behavior.
9224
9225Implementation:
9226"""""""""""""""
9227
9228Depending on properties of the target machine and the particular
9229``switch`` instruction, this instruction may be code generated in
9230different ways. For example, it could be generated as a series of
9231chained conditional branches or with a lookup table.
9232
9233Example:
9234""""""""
9235
9236.. code-block:: llvm
9237
9238     ; Emulate a conditional br instruction
9239     %Val = zext i1 %value to i32
9240     switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
9241
9242     ; Emulate an unconditional br instruction
9243     switch i32 0, label %dest [ ]
9244
9245     ; Implement a jump table:
9246     switch i32 %val, label %otherwise [ i32 0, label %onzero
9247                                         i32 1, label %onone
9248                                         i32 2, label %ontwo ]
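
The fragments above omit their destination blocks. A self-contained sketch,
using a hypothetical function ``@classify``, might look like this:

.. code-block:: llvm

    define i32 @classify(i32 %val) {
    entry:
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone ]
    onzero:
      ret i32 10
    onone:
      ret i32 20
    otherwise:                        ; taken for any value not in the table
      ret i32 -1
    }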
9249
9250.. _i_indirectbr:
9251
9252'``indirectbr``' Instruction
9253^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9254
9255Syntax:
9256"""""""
9257
9258::
9259
9260      indirectbr ptr <address>, [ label <dest1>, label <dest2>, ... ]
9261
9262Overview:
9263"""""""""
9264
9265The '``indirectbr``' instruction implements an indirect branch to a
9266label within the current function, whose address is specified by
9267"``address``". Address must be derived from a
9268:ref:`blockaddress <blockaddress>` constant.
9269
9270Arguments:
9271""""""""""
9272
9273The '``address``' argument is the address of the label to jump to. The
9274rest of the arguments indicate the full set of possible destinations
9275that the address may point to. Blocks are allowed to occur multiple
9276times in the destination list, though this isn't particularly useful.
9277
9278This destination list is required so that dataflow analysis has an
9279accurate understanding of the CFG.
9280
9281Semantics:
9282""""""""""
9283
9284Control transfers to the block specified in the address argument. All
9285possible destination blocks must be listed in the label list, otherwise
9286this instruction has undefined behavior. This implies that jumps to
9287labels defined in other functions have undefined behavior as well.
9288If '``address``' is ``poison`` or ``undef``, this instruction has undefined
9289behavior.
9290
9291Implementation:
9292"""""""""""""""
9293
9294This is typically implemented with a jump through a register.
9295
9296Example:
9297""""""""
9298
9299.. code-block:: llvm
9300
9301     indirectbr ptr %Addr, [ label %bb1, label %bb2, label %bb3 ]
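
The following sketch shows where such an address can come from. The function
``@dispatch`` is hypothetical; it selects between two ``blockaddress``
constants and lists both blocks as possible destinations.

.. code-block:: llvm

    define i32 @dispatch(i1 %pick) {
    entry:
      %addr = select i1 %pick, ptr blockaddress(@dispatch, %bb1),
                               ptr blockaddress(@dispatch, %bb2)
      ; Every block the address may refer to appears in the list.
      indirectbr ptr %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }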
9302
9303.. _i_invoke:
9304
9305'``invoke``' Instruction
9306^^^^^^^^^^^^^^^^^^^^^^^^
9307
9308Syntax:
9309"""""""
9310
9311::
9312
9313      <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
9314                    [operand bundles] to label <normal label> unwind label <exception label>
9315
9316Overview:
9317"""""""""
9318
9319The '``invoke``' instruction causes control to transfer to a specified
9320function, with the possibility of control flow transfer to either the
9321'``normal``' label or the '``exception``' label. If the callee function
9322returns with the "``ret``" instruction, control flow will return to the
9323"normal" label. If the callee (or any indirect callees) returns via the
9324":ref:`resume <i_resume>`" instruction or other exception handling
9325mechanism, control is interrupted and continued at the dynamically
9326nearest "exception" label.
9327
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
9337
9338Arguments:
9339""""""""""
9340
9341This instruction requires several arguments:
9342
9343#. The optional "cconv" marker indicates which :ref:`calling
9344   convention <callingconv>` the call should use. If none is
9345   specified, the call defaults to using C calling conventions.
9346#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
9347   values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
9348   attributes are valid here.
9349#. The optional addrspace attribute can be used to indicate the address space
9350   of the called function. If it is not specified, the program address space
9351   from the :ref:`datalayout string<langref_datalayout>` will be used.
9352#. '``ty``': the type of the call instruction itself which is also the
9353   type of the return value. Functions that return no value are marked
9354   ``void``.
9355#. '``fnty``': shall be the signature of the function being invoked. The
9356   argument types must match the types implied by this signature. This
9357   type can be omitted if the function is not varargs.
9358#. '``fnptrval``': An LLVM value containing a pointer to a function to
9359   be invoked. In most cases, this is a direct function invocation, but
9360   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
9361   to function value.
9362#. '``function args``': argument list whose types match the function
9363   signature argument types and parameter attributes. All arguments must
9364   be of :ref:`first class <t_firstclass>` type. If the function signature
9365   indicates the function accepts a variable number of arguments, the
9366   extra arguments can be specified.
9367#. '``normal label``': the label reached when the called function
9368   executes a '``ret``' instruction.
9369#. '``exception label``': the label reached when a callee returns via
9370   the :ref:`resume <i_resume>` instruction or other exception handling
9371   mechanism.
9372#. The optional :ref:`function attributes <fnattrs>` list.
9373#. The optional :ref:`operand bundles <opbundles>` list.
9374
9375Semantics:
9376""""""""""
9377
9378This instruction is designed to operate as a standard '``call``'
9379instruction in most regards. The primary difference is that it
9380establishes an association with a label, which is used by the runtime
9381library to unwind the stack.
9382
9383This instruction is used in languages with destructors to ensure that
9384proper cleanup is performed in the case of either a ``longjmp`` or a
9385thrown exception. Additionally, this is important for implementation of
9386'``catch``' clauses in high-level languages that support them.
9387
9388For the purposes of the SSA form, the definition of the value returned
9389by the '``invoke``' instruction is deemed to occur on the edge from the
9390current block to the "normal" label. If the callee unwinds then no
9391return value is available.
9392
9393Example:
9394""""""""
9395
9396.. code-block:: llvm
9397
9398      %retval = invoke i32 @Test(i32 15) to label %Continue
9399                  unwind label %TestCleanup              ; i32:retval set
9400      %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
9401                  unwind label %TestCleanup              ; i32:retval set
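
A self-contained sketch of an ``invoke`` together with its landing pad is
shown below. It assumes the Itanium C++ personality routine
``__gxx_personality_v0``; ``@may_throw`` is a hypothetical callee. The landing
pad performs no cleanup of its own and simply continues unwinding with
``resume``.

.. code-block:: llvm

    declare void @may_throw()                 ; hypothetical callee
    declare i32 @__gxx_personality_v0(...)

    define i32 @guarded() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %ok unwind label %lpad
    ok:                                       ; reached if the callee returns
      ret i32 0
    lpad:                                     ; reached if the callee unwinds
      %exn = landingpad { ptr, i32 }
              cleanup
      resume { ptr, i32 } %exn
    }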
9402
9403.. _i_callbr:
9404
9405'``callbr``' Instruction
9406^^^^^^^^^^^^^^^^^^^^^^^^
9407
9408Syntax:
9409"""""""
9410
9411::
9412
9413      <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
9414                    [operand bundles] to label <fallthrough label> [indirect labels]
9415
9416Overview:
9417"""""""""
9418
9419The '``callbr``' instruction causes control to transfer to a specified
9420function, with the possibility of control flow transfer to either the
9421'``fallthrough``' label or one of the '``indirect``' labels.
9422
9423This instruction should only be used to implement the "goto" feature of gcc
9424style inline assembly. Any other usage is an error in the IR verifier.
9425
9426Note that in order to support outputs along indirect edges, LLVM may need to
9427split critical edges, which may require synthesizing a replacement block for
9428the ``indirect labels``. Therefore, the address of a label as seen by another
9429``callbr`` instruction, or for a :ref:`blockaddress <blockaddress>` constant,
9430may not be equal to the address provided for the same block to this
9431instruction's ``indirect labels`` operand. The assembly code may only transfer
9432control to addresses provided via this instruction's ``indirect labels``.
9433
9434Arguments:
9435""""""""""
9436
9437This instruction requires several arguments:
9438
9439#. The optional "cconv" marker indicates which :ref:`calling
9440   convention <callingconv>` the call should use. If none is
9441   specified, the call defaults to using C calling conventions.
9442#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
9443   values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
9444   attributes are valid here.
9445#. The optional addrspace attribute can be used to indicate the address space
9446   of the called function. If it is not specified, the program address space
9447   from the :ref:`datalayout string<langref_datalayout>` will be used.
9448#. '``ty``': the type of the call instruction itself which is also the
9449   type of the return value. Functions that return no value are marked
9450   ``void``.
9451#. '``fnty``': shall be the signature of the function being called. The
9452   argument types must match the types implied by this signature. This
9453   type can be omitted if the function is not varargs.
9454#. '``fnptrval``': An LLVM value containing a pointer to a function to
9455   be called. In most cases, this is a direct function call, but
   indirect ``callbr``'s are just as possible, calling an arbitrary pointer
9457   to function value.
9458#. '``function args``': argument list whose types match the function
9459   signature argument types and parameter attributes. All arguments must
9460   be of :ref:`first class <t_firstclass>` type. If the function signature
9461   indicates the function accepts a variable number of arguments, the
9462   extra arguments can be specified.
9463#. '``fallthrough label``': the label reached when the inline assembly's
9464   execution exits the bottom.
9465#. '``indirect labels``': the labels reached when a callee transfers control
9466   to a location other than the '``fallthrough label``'. Label constraints
9467   refer to these destinations.
9468#. The optional :ref:`function attributes <fnattrs>` list.
9469#. The optional :ref:`operand bundles <opbundles>` list.
9470
9471Semantics:
9472""""""""""
9473
9474This instruction is designed to operate as a standard '``call``'
9475instruction in most regards. The primary difference is that it
9476establishes an association with additional labels to define where control
9477flow goes after the call.
9478
The output values of a '``callbr``' instruction are available both in
the '``fallthrough``' block and in any '``indirect``' block(s).
9481
9482The only use of this today is to implement the "goto" feature of gcc inline
9483assembly where additional labels can be provided as locations for the inline
9484assembly to jump to.
9485
9486Example:
9487""""""""
9488
9489.. code-block:: llvm
9490
9491      ; "asm goto" without output constraints.
9492      callbr void asm "", "r,!i"(i32 %x)
9493                  to label %fallthrough [label %indirect]
9494
9495      ; "asm goto" with output constraints.
9496      <result> = callbr i32 asm "", "=r,r,!i"(i32 %x)
9497                  to label %fallthrough [label %indirect]
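
Placed in a complete function, the first form might look like the following
sketch. The function name is hypothetical and the assembly string is left
empty, as in the fragments above, so only the control-flow structure is shown.

.. code-block:: llvm

    define i32 @asm_goto(i32 %x) {
    entry:
      callbr void asm "", "r,!i"(i32 %x)
              to label %fallthrough [label %indirect]
    fallthrough:                      ; the assembly ran off its end
      ret i32 0
    indirect:                         ; the assembly jumped to its label operand
      ret i32 1
    }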
9498
9499.. _i_resume:
9500
9501'``resume``' Instruction
9502^^^^^^^^^^^^^^^^^^^^^^^^
9503
9504Syntax:
9505"""""""
9506
9507::
9508
9509      resume <type> <value>
9510
9511Overview:
9512"""""""""
9513
9514The '``resume``' instruction is a terminator instruction that has no
9515successors.
9516
9517Arguments:
9518""""""""""
9519
9520The '``resume``' instruction requires one argument, which must have the
9521same type as the result of any '``landingpad``' instruction in the same
9522function.
9523
9524Semantics:
9525""""""""""
9526
9527The '``resume``' instruction resumes propagation of an existing
9528(in-flight) exception whose unwinding was interrupted with a
9529:ref:`landingpad <i_landingpad>` instruction.
9530
9531Example:
9532""""""""
9533
9534.. code-block:: llvm
9535
9536      resume { ptr, i32 } %exn
9537
9538.. _i_catchswitch:
9539
9540'``catchswitch``' Instruction
9541^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9542
9543Syntax:
9544"""""""
9545
9546::
9547
9548      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
9549      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
9550
9551Overview:
9552"""""""""
9553
9554The '``catchswitch``' instruction is used by `LLVM's exception handling system
9555<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
9556that may be executed by the :ref:`EH personality routine <personalityfn>`.
9557
9558Arguments:
9559""""""""""
9560
9561The ``parent`` argument is the token of the funclet that contains the
9562``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
9563this operand may be the token ``none``.
9564
9565The ``default`` argument is the label of another basic block beginning with
9566either a ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination
9567must be a legal target with respect to the ``parent`` links, as described in
9568the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
9569
9570The ``handlers`` are a nonempty list of successor blocks that each begin with a
9571:ref:`catchpad <i_catchpad>` instruction.
9572
9573Semantics:
9574""""""""""
9575
9576Executing this instruction transfers control to one of the successors in
9577``handlers``, if appropriate, or continues to unwind via the unwind label if
9578present.
9579
9580The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
9581it must be both the first non-phi instruction and last instruction in the basic
9582block. Therefore, it must be the only non-phi instruction in the block.
9583
9584Example:
9585""""""""
9586
9587.. code-block:: text
9588
9589    dispatch1:
9590      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
9591    dispatch2:
9592      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
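
The fragments above can be placed in context as follows. This is a sketch
that assumes the MSVC C++ personality routine ``__CxxFrameHandler3``;
``@may_throw`` is a hypothetical callee, and the ``catchpad`` arguments are
the ones commonly used for a catch-all handler under that personality.

.. code-block:: llvm

    declare void @may_throw()                 ; hypothetical callee
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch
    dispatch:                                 ; catchswitch is the only instruction
      %cs = catchswitch within none [label %handler] unwind to caller
    handler:
      %cp = catchpad within %cs [ptr null, i32 64, ptr null]
      catchret from %cp to label %done
    done:
      ret void
    }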
9593
9594.. _i_catchret:
9595
9596'``catchret``' Instruction
9597^^^^^^^^^^^^^^^^^^^^^^^^^^
9598
9599Syntax:
9600"""""""
9601
9602::
9603
9604      catchret from <token> to label <normal>
9605
9606Overview:
9607"""""""""
9608
9609The '``catchret``' instruction is a terminator instruction that has a
9610single successor.
9611
9612
9613Arguments:
9614""""""""""
9615
9616The first argument to a '``catchret``' indicates which ``catchpad`` it
9617exits.  It must be a :ref:`catchpad <i_catchpad>`.
9618The second argument to a '``catchret``' specifies where control will
9619transfer to next.
9620
9621Semantics:
9622""""""""""
9623
9624The '``catchret``' instruction ends an existing (in-flight) exception whose
9625unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction.  The
9626:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
9627code to, for example, destroy the active exception.  Control then transfers to
9628``normal``.
9629
9630The ``token`` argument must be a token produced by a ``catchpad`` instruction.
9631If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
9632funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9633the ``catchret``'s behavior is undefined.
9634
9635Example:
9636""""""""
9637
9638.. code-block:: text
9639
9640      catchret from %catch to label %continue
9641
9642.. _i_cleanupret:
9643
9644'``cleanupret``' Instruction
9645^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9646
9647Syntax:
9648"""""""
9649
9650::
9651
9652      cleanupret from <value> unwind label <continue>
9653      cleanupret from <value> unwind to caller
9654
9655Overview:
9656"""""""""
9657
9658The '``cleanupret``' instruction is a terminator instruction that has
9659an optional successor.
9660
9661
9662Arguments:
9663""""""""""
9664
9665The '``cleanupret``' instruction requires one argument, which indicates
9666which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
9667If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
9668funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9669the ``cleanupret``'s behavior is undefined.
9670
9671The '``cleanupret``' instruction also has an optional successor, ``continue``,
9672which must be the label of another basic block beginning with either a
9673``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination must
9674be a legal target with respect to the ``parent`` links, as described in the
9675`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
9676
9677Semantics:
9678""""""""""
9679
9680The '``cleanupret``' instruction indicates to the
9681:ref:`personality function <personalityfn>` that one
9682:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
9683It transfers control to ``continue`` or unwinds out of the function.
9684
9685Example:
9686""""""""
9687
9688.. code-block:: text
9689
9690      cleanupret from %cleanup unwind to caller
9691      cleanupret from %cleanup unwind label %continue
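
A sketch of a complete cleanup funclet, again assuming the
``__CxxFrameHandler3`` personality and the hypothetical functions
``@may_throw`` and ``@release``:

.. code-block:: llvm

    declare void @may_throw()                 ; hypothetical callee
    declare void @release()                   ; hypothetical cleanup action
    declare i32 @__CxxFrameHandler3(...)

    define void @g() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup
    cleanup:
      %cp = cleanuppad within none []
      call void @release() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller    ; leave the pad, keep unwinding
    done:
      ret void
    }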
9692
9693.. _i_unreachable:
9694
9695'``unreachable``' Instruction
9696^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9697
9698Syntax:
9699"""""""
9700
9701::
9702
9703      unreachable
9704
9705Overview:
9706"""""""""
9707
9708The '``unreachable``' instruction has no defined semantics. This
9709instruction is used to inform the optimizer that a particular portion of
9710the code is not reachable. This can be used to indicate that the code
9711after a no-return function cannot be reached, and other facts.
9712
9713Semantics:
9714""""""""""
9715
9716The '``unreachable``' instruction has no defined semantics.
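
The no-return case mentioned in the overview can be sketched as follows. The
function ``@checked`` is hypothetical, and ``@abort`` is assumed to be
declared ``noreturn``.

.. code-block:: llvm

    declare void @abort() noreturn

    define i32 @checked(i32 %x) {
    entry:
      %bad = icmp slt i32 %x, 0
      br i1 %bad, label %fail, label %ok
    fail:
      call void @abort()
      unreachable                     ; control never continues past the call
    ok:
      ret i32 %x
    }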
9717
9718.. _unaryops:
9719
9720Unary Operations
9721-----------------
9722
9723Unary operators require a single operand, execute an operation on
9724it, and produce a single value. The operand might represent multiple
9725data, as is the case with the :ref:`vector <t_vector>` data type. The
9726result value has the same type as its operand.
9727
9728.. _i_fneg:
9729
9730'``fneg``' Instruction
9731^^^^^^^^^^^^^^^^^^^^^^
9732
9733Syntax:
9734"""""""
9735
9736::
9737
9738      <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result
9739
9740Overview:
9741"""""""""
9742
9743The '``fneg``' instruction returns the negation of its operand.
9744
9745Arguments:
9746""""""""""
9747
9748The argument to the '``fneg``' instruction must be a
9749:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9750floating-point values.
9751
9752Semantics:
9753""""""""""
9754
9755The value produced is a copy of the operand with its sign bit flipped.
9756The value is otherwise completely identical; in particular, if the input is a
9757NaN, then the quiet/signaling bit and payload are perfectly preserved.
9758
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9762
9763Example:
9764""""""""
9765
9766.. code-block:: text
9767
      <result> = fneg float %val          ; yields float:result = -%val
9769
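The sign-bit description in the semantics can be sketched with integer
operations. Both function names are hypothetical. The second function
assumes the IEEE-754 binary32 layout of ``float``, in which bit 31 is the
sign bit.

.. code-block:: llvm

      ; Hypothetical example: the direct form.
      define float @negate(float %x) {
        %r = fneg float %x
        ret float %r
      }

      ; Hypothetical example: the same bit pattern computed by hand.
      define float @negate_by_bits(float %x) {
        %bits = bitcast float %x to i32
        %flipped = xor i32 %bits, -2147483648   ; toggle only bit 31
        %r = bitcast i32 %flipped to float
        ret float %r
      }
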
9770.. _binaryops:
9771
9772Binary Operations
9773-----------------
9774
9775Binary operators are used to do most of the computation in a program.
9776They require two operands of the same type, execute an operation on
9777them, and produce a single value. The operands might represent multiple
9778data, as is the case with the :ref:`vector <t_vector>` data type. The
9779result value has the same type as its operands.
9780
9781There are several different binary operators:
9782
9783.. _i_add:
9784
9785'``add``' Instruction
9786^^^^^^^^^^^^^^^^^^^^^
9787
9788Syntax:
9789"""""""
9790
9791::
9792
9793      <result> = add <ty> <op1>, <op2>          ; yields ty:result
9794      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
9795      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
9796      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
9797
9798Overview:
9799"""""""""
9800
9801The '``add``' instruction returns the sum of its two operands.
9802
9803Arguments:
9804""""""""""
9805
9806The two arguments to the '``add``' instruction must be
9807:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9808arguments must have identical types.
9809
9810Semantics:
9811""""""""""
9812
9813The value produced is the integer sum of the two operands.
9814
9815If the sum has unsigned overflow, the result returned is the
9816mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
9817the result.
9818
9819Because LLVM integers use a two's complement representation, this
9820instruction is appropriate for both signed and unsigned integers.
9821
9822``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9823respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9824result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
9825unsigned and/or signed overflow, respectively, occurs.
9826
9827Example:
9828""""""""
9829
9830.. code-block:: text
9831
9832      <result> = add i32 4, %var          ; yields i32:result = 4 + %var
9833
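The wrapping rule and the effect of the ``nuw`` and ``nsw`` keywords can be
sketched on ``i8``, where the numbers stay small. The function names and the
input values mentioned in the comments are illustrative assumptions.

.. code-block:: llvm

      ; Plain add wraps. For the unsigned inputs %a = 200 and %b = 100 the
      ; result is 300 mod 2^8 = 44.
      define i8 @add_wrapping(i8 %a, i8 %b) {
        %sum = add i8 %a, %b
        ret i8 %sum
      }

      ; For the same inputs the unsigned sum 300 does not fit in 8 bits, so
      ; the result is a poison value.
      define i8 @add_no_unsigned_wrap(i8 %a, i8 %b) {
        %sum = add nuw i8 %a, %b
        ret i8 %sum
      }

      ; For the signed inputs %a = 100 and %b = 100 the sum 200 is larger than
      ; 127, so the result is a poison value.
      define i8 @add_no_signed_wrap(i8 %a, i8 %b) {
        %sum = add nsw i8 %a, %b
        ret i8 %sum
      }
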
9834.. _i_fadd:
9835
9836'``fadd``' Instruction
9837^^^^^^^^^^^^^^^^^^^^^^
9838
9839Syntax:
9840"""""""
9841
9842::
9843
9844      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9845
9846Overview:
9847"""""""""
9848
9849The '``fadd``' instruction returns the sum of its two operands.
9850
9851Arguments:
9852""""""""""
9853
9854The two arguments to the '``fadd``' instruction must be
9855:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9856floating-point values. Both arguments must have identical types.
9857
9858Semantics:
9859""""""""""
9860
9861The value produced is the floating-point sum of the two operands.
9862This instruction is assumed to execute in the default :ref:`floating-point
9863environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9867
9868Example:
9869""""""""
9870
9871.. code-block:: text
9872
9873      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
9874
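A small sketch of how a fast-math flag attaches to ``fadd``. The function
names are hypothetical. ``reassoc`` is one of the :ref:`fast-math
flags <fastmath>`; it grants a permission, and whether an optimizer uses it
is not guaranteed.

.. code-block:: llvm

      ; Strict form: evaluated as (%a + %b) + %c.
      define float @sum3(float %a, float %b, float %c) {
        %ab = fadd float %a, %b
        %abc = fadd float %ab, %c
        ret float %abc
      }

      ; With reassoc on both instructions, the optimizer is permitted to
      ; regroup the additions, for example as %a + (%b + %c).
      define float @sum3_reassoc(float %a, float %b, float %c) {
        %ab = fadd reassoc float %a, %b
        %abc = fadd reassoc float %ab, %c
        ret float %abc
      }
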
9875.. _i_sub:
9876
9877'``sub``' Instruction
9878^^^^^^^^^^^^^^^^^^^^^
9879
9880Syntax:
9881"""""""
9882
9883::
9884
9885      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
9886      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
9887      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
9888      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
9889
9890Overview:
9891"""""""""
9892
9893The '``sub``' instruction returns the difference of its two operands.
9894
9895Note that the '``sub``' instruction is used to represent the '``neg``'
9896instruction present in most other intermediate representations.
9897
9898Arguments:
9899""""""""""
9900
9901The two arguments to the '``sub``' instruction must be
9902:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9903arguments must have identical types.
9904
9905Semantics:
9906""""""""""
9907
9908The value produced is the integer difference of the two operands.
9909
9910If the difference has unsigned overflow, the result returned is the
9911mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
9912the result.
9913
9914Because LLVM integers use a two's complement representation, this
9915instruction is appropriate for both signed and unsigned integers.
9916
9917``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9918respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9919result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
9920unsigned and/or signed overflow, respectively, occurs.
9921
9922Example:
9923""""""""
9924
9925.. code-block:: text
9926
9927      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val
9929
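The negation idiom and the wrap keywords can be sketched as follows. The
function names are hypothetical.

.. code-block:: llvm

      ; Integer negation written as a subtraction from zero.
      define i32 @negate(i32 %x) {
        %neg = sub i32 0, %x
        ret i32 %neg
      }

      ; With nsw the result is poison when %x is -2147483648, because
      ; +2147483648 is not representable as a signed i32.
      define i32 @negate_nsw(i32 %x) {
        %neg = sub nsw i32 0, %x
        ret i32 %neg
      }

      ; With nuw the result is poison whenever %a is less than %b when both
      ; are read as unsigned values.
      define i32 @distance_nuw(i32 %a, i32 %b) {
        %d = sub nuw i32 %a, %b
        ret i32 %d
      }
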
9930.. _i_fsub:
9931
9932'``fsub``' Instruction
9933^^^^^^^^^^^^^^^^^^^^^^
9934
9935Syntax:
9936"""""""
9937
9938::
9939
9940      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9941
9942Overview:
9943"""""""""
9944
9945The '``fsub``' instruction returns the difference of its two operands.
9946
9947Arguments:
9948""""""""""
9949
9950The two arguments to the '``fsub``' instruction must be
9951:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9952floating-point values. Both arguments must have identical types.
9953
9954Semantics:
9955""""""""""
9956
9957The value produced is the floating-point difference of the two operands.
9958This instruction is assumed to execute in the default :ref:`floating-point
9959environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9963
9964Example:
9965""""""""
9966
9967.. code-block:: text
9968
9969      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val
9971
9972.. _i_mul:
9973
9974'``mul``' Instruction
9975^^^^^^^^^^^^^^^^^^^^^
9976
9977Syntax:
9978"""""""
9979
9980::
9981
9982      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
9983      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
9984      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
9985      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
9986
9987Overview:
9988"""""""""
9989
9990The '``mul``' instruction returns the product of its two operands.
9991
9992Arguments:
9993""""""""""
9994
9995The two arguments to the '``mul``' instruction must be
9996:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9997arguments must have identical types.
9998
9999Semantics:
10000""""""""""
10001
10002The value produced is the integer product of the two operands.
10003
10004If the result of the multiplication has unsigned overflow, the result
10005returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
10006bit width of the result.
10007
10008Because LLVM integers use a two's complement representation, and the
10009result is the same width as the operands, this instruction returns the
10010correct result for both signed and unsigned integers. If a full product
10011(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
10012sign-extended or zero-extended as appropriate to the width of the full
10013product.
10014
10015``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
10016respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
10017result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
10018unsigned and/or signed overflow, respectively, occurs.
10019
10020Example:
10021""""""""
10022
10023.. code-block:: text
10024
10025      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
10026
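The full-product advice in the semantics can be sketched like this: extend
both operands first, then multiply at the wider type. The function names are
hypothetical.

.. code-block:: llvm

      ; Unsigned 32 x 32 -> 64 multiply. Both inputs are below 2^32, so the
      ; product is below 2^64 and nuw can never be violated.
      define i64 @umul_wide(i32 %a, i32 %b) {
        %a.ext = zext i32 %a to i64
        %b.ext = zext i32 %b to i64
        %prod = mul nuw i64 %a.ext, %b.ext
        ret i64 %prod
      }

      ; Signed 32 x 32 -> 64 multiply. The magnitude of the product is at
      ; most 2^62, so nsw can never be violated.
      define i64 @smul_wide(i32 %a, i32 %b) {
        %a.ext = sext i32 %a to i64
        %b.ext = sext i32 %b to i64
        %prod = mul nsw i64 %a.ext, %b.ext
        ret i64 %prod
      }
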
10027.. _i_fmul:
10028
10029'``fmul``' Instruction
10030^^^^^^^^^^^^^^^^^^^^^^
10031
10032Syntax:
10033"""""""
10034
10035::
10036
10037      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
10038
10039Overview:
10040"""""""""
10041
10042The '``fmul``' instruction returns the product of its two operands.
10043
10044Arguments:
10045""""""""""
10046
10047The two arguments to the '``fmul``' instruction must be
10048:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
10049floating-point values. Both arguments must have identical types.
10050
10051Semantics:
10052""""""""""
10053
10054The value produced is the floating-point product of the two operands.
10055This instruction is assumed to execute in the default :ref:`floating-point
10056environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
10060
10061Example:
10062""""""""
10063
10064.. code-block:: text
10065
10066      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
10067
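A minimal sketch of ``fmul`` combined with ``fadd`` under the ``contract``
:ref:`fast-math flag <fastmath>`. The function name is hypothetical. The
flag only permits fusing the two operations; it does not require it.

.. code-block:: llvm

      ; Computes %a * %b + %c. Because both instructions carry contract, a
      ; backend may emit a single fused multiply-add here.
      define float @mul_add(float %a, float %b, float %c) {
        %prod = fmul contract float %a, %b
        %sum = fadd contract float %prod, %c
        ret float %sum
      }
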
10068.. _i_udiv:
10069
10070'``udiv``' Instruction
10071^^^^^^^^^^^^^^^^^^^^^^
10072
10073Syntax:
10074"""""""
10075
10076::
10077
10078      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
10079      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
10080
10081Overview:
10082"""""""""
10083
10084The '``udiv``' instruction returns the quotient of its two operands.
10085
10086Arguments:
10087""""""""""
10088
10089The two arguments to the '``udiv``' instruction must be
10090:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10091arguments must have identical types.
10092
10093Semantics:
10094""""""""""
10095
10096The value produced is the unsigned integer quotient of the two operands.
10097
10098Note that unsigned integer division and signed integer division are
10099distinct operations; for signed integer division, use '``sdiv``'.
10100
10101Division by zero is undefined behavior. For vectors, if any element
10102of the divisor is zero, the operation has undefined behavior.
10103
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2`` (as
such, "((a udiv exact b) mul b) == a").
10108
10109Example:
10110""""""""
10111
10112.. code-block:: text
10113
10114      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var
10115
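The ``exact`` keyword can be sketched with a size conversion. The function
name and the assumption that the caller always passes a multiple of 8 are
hypothetical.

.. code-block:: llvm

      ; Converts a byte count to a count of 8-byte words and back again.
      define i64 @round_trip(i64 %nbytes) {
        ; Assumption: %nbytes is a multiple of 8. For any other value the
        ; quotient is poison.
        %nwords = udiv exact i64 %nbytes, 8
        ; Whenever %nwords is not poison, multiplying back yields %nbytes.
        %back = mul nuw i64 %nwords, 8
        ret i64 %back
      }
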
10116.. _i_sdiv:
10117
10118'``sdiv``' Instruction
10119^^^^^^^^^^^^^^^^^^^^^^
10120
10121Syntax:
10122"""""""
10123
10124::
10125
10126      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
10127      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result
10128
10129Overview:
10130"""""""""
10131
10132The '``sdiv``' instruction returns the quotient of its two operands.
10133
10134Arguments:
10135""""""""""
10136
10137The two arguments to the '``sdiv``' instruction must be
10138:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10139arguments must have identical types.
10140
10141Semantics:
10142""""""""""
10143
10144The value produced is the signed integer quotient of the two operands
10145rounded towards zero.
10146
10147Note that signed integer division and unsigned integer division are
10148distinct operations; for unsigned integer division, use '``udiv``'.
10149
10150Division by zero is undefined behavior. For vectors, if any element
10151of the divisor is zero, the operation has undefined behavior.
10152Overflow also leads to undefined behavior; this is a rare case, but can
10153occur, for example, by doing a 32-bit division of -2147483648 by -1.
10154
10155If the ``exact`` keyword is present, the result value of the ``sdiv`` is
10156a :ref:`poison value <poisonvalues>` if the result would be rounded.
10157
10158Example:
10159""""""""
10160
10161.. code-block:: text
10162
10163      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var
10164
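Because both division by zero and the overflowing case are undefined
behavior, a front end that cannot rule them out may guard the instruction.
The sketch below is one possible shape. The function name and the fallback
value of 0 are hypothetical choices.

.. code-block:: llvm

      define i32 @guarded_sdiv(i32 %a, i32 %b) {
      entry:
        %b.zero = icmp eq i32 %b, 0
        %a.min = icmp eq i32 %a, -2147483648
        %b.m1 = icmp eq i32 %b, -1
        %overflow = and i1 %a.min, %b.m1      ; -2147483648 / -1
        %bad = or i1 %b.zero, %overflow
        br i1 %bad, label %done, label %divide

      divide:
        ; Reached only when the division is well defined. The quotient is
        ; rounded towards zero, so -7 / 2 gives -3.
        %q = sdiv i32 %a, %b
        br label %done

      done:
        ; Hypothetical policy: return 0 when the division was skipped.
        %r = phi i32 [ 0, %entry ], [ %q, %divide ]
        ret i32 %r
      }
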
10165.. _i_fdiv:
10166
10167'``fdiv``' Instruction
10168^^^^^^^^^^^^^^^^^^^^^^
10169
10170Syntax:
10171"""""""
10172
10173::
10174
10175      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
10176
10177Overview:
10178"""""""""
10179
10180The '``fdiv``' instruction returns the quotient of its two operands.
10181
10182Arguments:
10183""""""""""
10184
10185The two arguments to the '``fdiv``' instruction must be
10186:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
10187floating-point values. Both arguments must have identical types.
10188
10189Semantics:
10190""""""""""
10191
10192The value produced is the floating-point quotient of the two operands.
10193This instruction is assumed to execute in the default :ref:`floating-point
10194environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
10198
10199Example:
10200""""""""
10201
10202.. code-block:: text
10203
10204      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var
10205
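A minimal sketch of ``fdiv`` with the ``arcp`` :ref:`fast-math
flag <fastmath>`. The function name is hypothetical. The flag is a
permission for the optimizer, not a request.

.. code-block:: llvm

      ; arcp allows the optimizer to multiply by the reciprocal of %d
      ; instead of performing the division.
      define float @scale(float %x, float %d) {
        %r = fdiv arcp float %x, %d
        ret float %r
      }
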
10206.. _i_urem:
10207
10208'``urem``' Instruction
10209^^^^^^^^^^^^^^^^^^^^^^
10210
10211Syntax:
10212"""""""
10213
10214::
10215
10216      <result> = urem <ty> <op1>, <op2>   ; yields ty:result
10217
10218Overview:
10219"""""""""
10220
10221The '``urem``' instruction returns the remainder from the unsigned
10222division of its two arguments.
10223
10224Arguments:
10225""""""""""
10226
10227The two arguments to the '``urem``' instruction must be
10228:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10229arguments must have identical types.
10230
10231Semantics:
10232""""""""""
10233
10234This instruction returns the unsigned integer *remainder* of a division.
10235This instruction always performs an unsigned division to get the
10236remainder.
10237
10238Note that unsigned integer remainder and signed integer remainder are
10239distinct operations; for signed integer remainder, use '``srem``'.
10240
10241Taking the remainder of a division by zero is undefined behavior.
10242For vectors, if any element of the divisor is zero, the operation has
10243undefined behavior.
10244
10245Example:
10246""""""""
10247
10248.. code-block:: text
10249
10250      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var
10251
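For a power-of-two divisor the unsigned remainder is the same as keeping the
low bits. The sketch below shows both forms. The function names are
hypothetical.

.. code-block:: llvm

      ; Remainder of an unsigned division by 8.
      define i32 @low_bits_urem(i32 %x) {
        %r = urem i32 %x, 8
        ret i32 %r
      }

      ; Produces the same value for every %x by masking the low three bits.
      define i32 @low_bits_and(i32 %x) {
        %r = and i32 %x, 7
        ret i32 %r
      }
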
10252.. _i_srem:
10253
10254'``srem``' Instruction
10255^^^^^^^^^^^^^^^^^^^^^^
10256
10257Syntax:
10258"""""""
10259
10260::
10261
10262      <result> = srem <ty> <op1>, <op2>   ; yields ty:result
10263
10264Overview:
10265"""""""""
10266
10267The '``srem``' instruction returns the remainder from the signed
10268division of its two operands. This instruction can also take
10269:ref:`vector <t_vector>` versions of the values in which case the elements
10270must be integers.
10271
10272Arguments:
10273""""""""""
10274
10275The two arguments to the '``srem``' instruction must be
10276:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10277arguments must have identical types.
10278
10279Semantics:
10280""""""""""
10281
10282This instruction returns the *remainder* of a division (where the result
10283is either zero or has the same sign as the dividend, ``op1``), not the
10284*modulo* operator (where the result is either zero or has the same sign
10285as the divisor, ``op2``) of a value. For more information about the
10286difference, see `The Math
10287Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
10288table of how this is implemented in various languages, please see
10289`Wikipedia: modulo
10290operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
10291
10292Note that signed integer remainder and unsigned integer remainder are
10293distinct operations; for unsigned integer remainder, use '``urem``'.
10294
10295Taking the remainder of a division by zero is undefined behavior.
10296For vectors, if any element of the divisor is zero, the operation has
10297undefined behavior.
10298Overflow also leads to undefined behavior; this is a rare case, but can
10299occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)
10303
10304Example:
10305""""""""
10306
10307.. code-block:: text
10308
10309      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var
10310
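The difference between remainder and modulo can be made concrete by building
a modulo on top of ``srem``. This is only a sketch. The function name is
hypothetical, and the function inherits the undefined behavior of ``srem``
for a zero divisor and for -2147483648 with -1.

.. code-block:: llvm

      ; Result is zero or has the sign of the divisor %b.
      ; For %a = -7 and %b = 3: srem gives -1, and this function gives 2.
      define i32 @floor_mod(i32 %a, i32 %b) {
        %rem = srem i32 %a, %b
        %rem.nonzero = icmp ne i32 %rem, 0
        %signs = xor i32 %rem, %b
        %signs.differ = icmp slt i32 %signs, 0   ; sign bits of %rem and %b differ
        %adjust = and i1 %rem.nonzero, %signs.differ
        %rem.plus.b = add i32 %rem, %b
        %mod = select i1 %adjust, i32 %rem.plus.b, i32 %rem
        ret i32 %mod
      }
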
10311.. _i_frem:
10312
10313'``frem``' Instruction
10314^^^^^^^^^^^^^^^^^^^^^^
10315
10316Syntax:
10317"""""""
10318
10319::
10320
10321      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
10322
10323Overview:
10324"""""""""
10325
10326The '``frem``' instruction returns the remainder from the division of
10327its two operands.
10328
10329.. note::
10330
   The instruction is implemented as a call to libm's '``fmod``'
   for some targets, and using the instruction may thus require linking libm.
10333
10334
10335Arguments:
10336""""""""""
10337
10338The two arguments to the '``frem``' instruction must be
10339:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
10340floating-point values. Both arguments must have identical types.
10341
10342Semantics:
10343""""""""""
10344
10345The value produced is the floating-point remainder of the two operands.
10346This is the same output as a libm '``fmod``' function, but without any
10347possibility of setting ``errno``. The remainder has the same sign as the
10348dividend.
10349This instruction is assumed to execute in the default :ref:`floating-point
10350environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
10354
10355Example:
10356""""""""
10357
10358.. code-block:: text
10359
10360      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var
10361
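A short sketch of the sign rule: the remainder follows the dividend. The
function name and the use of degrees are hypothetical.

.. code-block:: llvm

      ; Reduces an angle in degrees to the open interval (-360, 360).
      ; For %deg = 450.0 the result is 90.0; for %deg = -450.0 it is -90.0.
      define double @wrap_degrees(double %deg) {
        %r = frem double %deg, 3.600000e+02
        ret double %r
      }
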
10362.. _bitwiseops:
10363
10364Bitwise Binary Operations
10365-------------------------
10366
10367Bitwise binary operators are used to do various forms of bit-twiddling
10368in a program. They are generally very efficient instructions and can
10369commonly be strength reduced from other instructions. They require two
10370operands of the same type, execute an operation on them, and produce a
10371single value. The resulting value is the same type as its operands.
10372
10373.. _i_shl:
10374
10375'``shl``' Instruction
10376^^^^^^^^^^^^^^^^^^^^^
10377
10378Syntax:
10379"""""""
10380
10381::
10382
10383      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
10384      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
10385      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
10386      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result
10387
10388Overview:
10389"""""""""
10390
10391The '``shl``' instruction returns the first operand shifted to the left
10392a specified number of bits.
10393
10394Arguments:
10395""""""""""
10396
10397Both arguments to the '``shl``' instruction must be the same
10398:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
10399'``op2``' is treated as an unsigned value.
10400
10401Semantics:
10402""""""""""
10403
10404The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
10405where ``n`` is the width of the result. If ``op2`` is (statically or
10406dynamically) equal to or larger than the number of bits in
10407``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
10408If the arguments are vectors, each vector element of ``op1`` is shifted
10409by the corresponding shift amount in ``op2``.
10410
10411If the ``nuw`` keyword is present, then the shift produces a poison
10412value if it shifts out any non-zero bits.
10413If the ``nsw`` keyword is present, then the shift produces a poison
10414value if it shifts out any bits that disagree with the resultant sign bit.
10415
10416Example:
10417""""""""
10418
10419.. code-block:: text
10420
10421      <result> = shl i32 4, %var   ; yields i32: 4 << %var
10422      <result> = shl i32 4, 2      ; yields i32: 16
10423      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison value
10425      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
10426
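The sketch below shows a constant shift and one way to keep a variable shift
amount in range. The function names are hypothetical. Masking the amount is
an illustrative policy, not something the instruction does by itself.

.. code-block:: llvm

      ; Multiplies by 8 modulo 2^32.
      define i32 @times_eight(i32 %x) {
        %r = shl i32 %x, 3
        ret i32 %r
      }

      ; Keeps the shift amount in the range 0..31, so the result is never
      ; poison because of an oversized shift.
      define i32 @shl_masked(i32 %x, i32 %amt) {
        %amt.masked = and i32 %amt, 31
        %r = shl i32 %x, %amt.masked
        ret i32 %r
      }
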
10427.. _i_lshr:
10428
10429
10430'``lshr``' Instruction
10431^^^^^^^^^^^^^^^^^^^^^^
10432
10433Syntax:
10434"""""""
10435
10436::
10437
10438      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
10439      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result
10440
10441Overview:
10442"""""""""
10443
10444The '``lshr``' instruction (logical shift right) returns the first
10445operand shifted to the right a specified number of bits with zero fill.
10446
10447Arguments:
10448""""""""""
10449
10450Both arguments to the '``lshr``' instruction must be the same
10451:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
10452'``op2``' is treated as an unsigned value.
10453
10454Semantics:
10455""""""""""
10456
10457This instruction always performs a logical shift right operation. The
10458most significant bits of the result will be filled with zero bits after
10459the shift. If ``op2`` is (statically or dynamically) equal to or larger
10460than the number of bits in ``op1``, this instruction returns a :ref:`poison
10461value <poisonvalues>`. If the arguments are vectors, each vector element
10462of ``op1`` is shifted by the corresponding shift amount in ``op2``.
10463
10464If the ``exact`` keyword is present, the result value of the ``lshr`` is
10465a poison value if any of the bits shifted out are non-zero.
10466
10467Example:
10468""""""""
10469
10470.. code-block:: text
10471
10472      <result> = lshr i32 4, 1   ; yields i32:result = 2
10473      <result> = lshr i32 4, 2   ; yields i32:result = 1
10474      <result> = lshr i8  4, 3   ; yields i8:result = 0
10475      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison value
10477      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
10478
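A common use of the zero fill is extracting an unsigned bit field. The
following is a minimal sketch with a hypothetical function name.

.. code-block:: llvm

      ; Returns bits 8..15 of %x. For %x = 0x12345678 the result is 0x56.
      define i32 @second_byte(i32 %x) {
        %shifted = lshr i32 %x, 8
        %byte = and i32 %shifted, 255
        ret i32 %byte
      }
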
10479.. _i_ashr:
10480
10481'``ashr``' Instruction
10482^^^^^^^^^^^^^^^^^^^^^^
10483
10484Syntax:
10485"""""""
10486
10487::
10488
10489      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
10490      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result
10491
10492Overview:
10493"""""""""
10494
10495The '``ashr``' instruction (arithmetic shift right) returns the first
10496operand shifted to the right a specified number of bits with sign
10497extension.
10498
10499Arguments:
10500""""""""""
10501
10502Both arguments to the '``ashr``' instruction must be the same
10503:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
10504'``op2``' is treated as an unsigned value.
10505
10506Semantics:
10507""""""""""
10508
This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
10511of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
10512than the number of bits in ``op1``, this instruction returns a :ref:`poison
10513value <poisonvalues>`. If the arguments are vectors, each vector element
10514of ``op1`` is shifted by the corresponding shift amount in ``op2``.
10515
10516If the ``exact`` keyword is present, the result value of the ``ashr`` is
10517a poison value if any of the bits shifted out are non-zero.
10518
10519Example:
10520""""""""
10521
10522.. code-block:: text
10523
10524      <result> = ashr i32 4, 1   ; yields i32:result = 2
10525      <result> = ashr i32 4, 2   ; yields i32:result = 1
10526      <result> = ashr i8  4, 3   ; yields i8:result = 0
10527      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison value
10529      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
10530
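Two small sketches that rely on the sign fill. The function names are
hypothetical.

.. code-block:: llvm

      ; Yields 0 when %x is non-negative and -1 when %x is negative.
      define i32 @sign_mask(i32 %x) {
        %m = ashr i32 %x, 31
        ret i32 %m
      }

      ; Sign-extends the low 8 bits of %x within an i32. For %x = 255 the
      ; result is -1.
      define i32 @sext_low_byte(i32 %x) {
        %up = shl i32 %x, 24
        %r = ashr i32 %up, 24
        ret i32 %r
      }
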
10531.. _i_and:
10532
10533'``and``' Instruction
10534^^^^^^^^^^^^^^^^^^^^^
10535
10536Syntax:
10537"""""""
10538
10539::
10540
10541      <result> = and <ty> <op1>, <op2>   ; yields ty:result
10542
10543Overview:
10544"""""""""
10545
10546The '``and``' instruction returns the bitwise logical and of its two
10547operands.
10548
10549Arguments:
10550""""""""""
10551
10552The two arguments to the '``and``' instruction must be
10553:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10554arguments must have identical types.
10555
10556Semantics:
10557""""""""""
10558
10559The truth table used for the '``and``' instruction is:
10560
10561+-----+-----+-----+
10562| In0 | In1 | Out |
10563+-----+-----+-----+
10564|   0 |   0 |   0 |
10565+-----+-----+-----+
10566|   0 |   1 |   0 |
10567+-----+-----+-----+
10568|   1 |   0 |   0 |
10569+-----+-----+-----+
10570|   1 |   1 |   1 |
10571+-----+-----+-----+
10572
10573Example:
10574""""""""
10575
10576.. code-block:: text
10577
10578      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
10579      <result> = and i32 15, 40          ; yields i32:result = 8
10580      <result> = and i32 4, 8            ; yields i32:result = 0
10581
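Masking is the typical use of ``and``. The sketch below uses hypothetical
function names.

.. code-block:: llvm

      ; Rounds an address down to a multiple of 16 by clearing the low four
      ; bits. In two's complement, -16 has every bit set except those four.
      define i64 @align_down_16(i64 %addr) {
        %aligned = and i64 %addr, -16
        ret i64 %aligned
      }

      ; Tests the lowest bit.
      define i1 @is_odd(i32 %x) {
        %low = and i32 %x, 1
        %odd = icmp ne i32 %low, 0
        ret i1 %odd
      }
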
10582.. _i_or:
10583
10584'``or``' Instruction
10585^^^^^^^^^^^^^^^^^^^^
10586
10587Syntax:
10588"""""""
10589
10590::
10591
10592      <result> = or <ty> <op1>, <op2>   ; yields ty:result
10593      <result> = or disjoint <ty> <op1>, <op2>   ; yields ty:result
10594
10595Overview:
10596"""""""""
10597
10598The '``or``' instruction returns the bitwise logical inclusive or of its
10599two operands.
10600
10601Arguments:
10602""""""""""
10603
10604The two arguments to the '``or``' instruction must be
10605:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10606arguments must have identical types.
10607
10608Semantics:
10609""""""""""
10610
10611The truth table used for the '``or``' instruction is:
10612
10613+-----+-----+-----+
10614| In0 | In1 | Out |
10615+-----+-----+-----+
10616|   0 |   0 |   0 |
10617+-----+-----+-----+
10618|   0 |   1 |   1 |
10619+-----+-----+-----+
10620|   1 |   0 |   1 |
10621+-----+-----+-----+
10622|   1 |   1 |   1 |
10623+-----+-----+-----+
10624
``disjoint`` means that for each bit, that bit is zero in at least one of the
inputs. This allows the ``or`` to be treated as an ``add`` since no carry can
occur from any bit. If the ``disjoint`` keyword is present, the result value of
the ``or`` is a :ref:`poison value <poisonvalues>` if both inputs have a one in
the same bit position. For vectors, only the element containing the bit is
poison.
10630
10631Example:
10632""""""""
10633
.. code-block:: text
10635
10636      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
10637      <result> = or i32 15, 40          ; yields i32:result = 47
10638      <result> = or i32 4, 8            ; yields i32:result = 12
10639
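A sketch of a case where ``disjoint`` holds by construction: packing two
16-bit halves into one 32-bit value. The function name is hypothetical.

.. code-block:: llvm

      define i32 @pack_halves(i16 %hi, i16 %lo) {
        %hi.ext = zext i16 %hi to i32
        %lo.ext = zext i16 %lo to i32
        ; Only zero bits are shifted out, so nuw cannot be violated.
        %hi.shl = shl nuw i32 %hi.ext, 16
        ; %hi.shl has its low 16 bits clear and %lo.ext has its high 16 bits
        ; clear, so no bit position is set in both operands.
        %packed = or disjoint i32 %hi.shl, %lo.ext
        ret i32 %packed
      }
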
10640.. _i_xor:
10641
10642'``xor``' Instruction
10643^^^^^^^^^^^^^^^^^^^^^
10644
10645Syntax:
10646"""""""
10647
10648::
10649
10650      <result> = xor <ty> <op1>, <op2>   ; yields ty:result
10651
10652Overview:
10653"""""""""
10654
10655The '``xor``' instruction returns the bitwise logical exclusive or of
10656its two operands. The ``xor`` is used to implement the "one's
10657complement" operation, which is the "~" operator in C.
10658
10659Arguments:
10660""""""""""
10661
10662The two arguments to the '``xor``' instruction must be
10663:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10664arguments must have identical types.
10665
10666Semantics:
10667""""""""""
10668
10669The truth table used for the '``xor``' instruction is:
10670
10671+-----+-----+-----+
10672| In0 | In1 | Out |
10673+-----+-----+-----+
10674|   0 |   0 |   0 |
10675+-----+-----+-----+
10676|   0 |   1 |   1 |
10677+-----+-----+-----+
10678|   1 |   0 |   1 |
10679+-----+-----+-----+
10680|   1 |   1 |   0 |
10681+-----+-----+-----+
10682
10683Example:
10684""""""""
10685
10686.. code-block:: text
10687
10688      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
10689      <result> = xor i32 15, 40          ; yields i32:result = 39
10690      <result> = xor i32 4, 8            ; yields i32:result = 12
10691      <result> = xor i32 %V, -1          ; yields i32:result = ~%V
10692
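The complement idiom extends to vectors and to ``i1``. The following is a
minimal sketch with hypothetical function names.

.. code-block:: llvm

      ; Bitwise complement of a scalar, the "~" operator in C.
      define i32 @bitwise_not(i32 %x) {
        %r = xor i32 %x, -1
        ret i32 %r
      }

      ; The same operation applied to every element of a vector.
      define <4 x i32> @bitwise_not_vec(<4 x i32> %v) {
        %r = xor <4 x i32> %v, <i32 -1, i32 -1, i32 -1, i32 -1>
        ret <4 x i32> %r
      }

      ; On i1, xor with true is a logical negation.
      define i1 @logical_not(i1 %b) {
        %r = xor i1 %b, true
        ret i1 %r
      }
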
10693Vector Operations
10694-----------------
10695
10696LLVM supports several instructions to represent vector operations in a
10697target-independent manner. These instructions cover the element-access
10698and vector-specific operations needed to process vectors effectively.
10699While LLVM does directly support these vector operations, many
10700sophisticated algorithms will want to use target-specific intrinsics to
10701take full advantage of a specific target.
10702
10703.. _i_extractelement:
10704
10705'``extractelement``' Instruction
10706^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10707
10708Syntax:
10709"""""""
10710
10711::
10712
10713      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
10714      <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
10715
10716Overview:
10717"""""""""
10718
10719The '``extractelement``' instruction extracts a single scalar element
10720from a vector at a specified index.
10721
10722Arguments:
10723""""""""""
10724
10725The first operand of an '``extractelement``' instruction is a value of
10726:ref:`vector <t_vector>` type. The second operand is an index indicating
10727the position from which to extract the element. The index may be a
10728variable of any integer type, and will be treated as an unsigned integer.
10729
10730Semantics:
10731""""""""""
10732
10733The result is a scalar of the same type as the element type of ``val``.
10734Its value is the value at position ``idx`` of ``val``. If ``idx``
10735exceeds the length of ``val`` for a fixed-length vector, the result is a
10736:ref:`poison value <poisonvalues>`. For a scalable vector, if the value
10737of ``idx`` exceeds the runtime length of the vector, the result is a
10738:ref:`poison value <poisonvalues>`.
10739
10740Example:
10741""""""""
10742
10743.. code-block:: text
10744
10745      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
10746
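The sketch below shows constant indices and a variable index of a different
integer type. The function names are hypothetical.

.. code-block:: llvm

      ; Adds the four elements of a vector.
      define i32 @horizontal_sum(<4 x i32> %v) {
        %e0 = extractelement <4 x i32> %v, i32 0
        %e1 = extractelement <4 x i32> %v, i32 1
        %e2 = extractelement <4 x i32> %v, i32 2
        %e3 = extractelement <4 x i32> %v, i32 3
        %s01 = add i32 %e0, %e1
        %s23 = add i32 %e2, %e3
        %sum = add i32 %s01, %s23
        ret i32 %sum
      }

      ; The index may be a variable of any integer type. The result is
      ; poison if %idx is out of range.
      define i32 @element_at(<4 x i32> %v, i64 %idx) {
        %e = extractelement <4 x i32> %v, i64 %idx
        ret i32 %e
      }
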
10747.. _i_insertelement:
10748
10749'``insertelement``' Instruction
10750^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10751
10752Syntax:
10753"""""""
10754
10755::
10756
10757      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
10758      <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
10759
10760Overview:
10761"""""""""
10762
10763The '``insertelement``' instruction inserts a scalar element into a
10764vector at a specified index.
10765
10766Arguments:
10767""""""""""
10768
10769The first operand of an '``insertelement``' instruction is a value of
10770:ref:`vector <t_vector>` type. The second operand is a scalar value whose
10771type must equal the element type of the first operand. The third operand
10772is an index indicating the position at which to insert the value. The
10773index may be a variable of any integer type, and will be treated as an
10774unsigned integer.
10775
10776Semantics:
10777""""""""""
10778
10779The result is a vector of the same type as ``val``. Its element values
10780are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` is greater than or equal to the length of ``val`` for a
fixed-length vector, the result is a :ref:`poison value <poisonvalues>`. For
a scalable vector, if the value of ``idx`` is greater than or equal to the
runtime length of the vector, the result is a
:ref:`poison value <poisonvalues>`.
10785
10786Example:
10787""""""""
10788
10789.. code-block:: text
10790
10791      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
10792
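A common use is building a vector one element at a time, starting from
``poison``. The sketch below assumes a hypothetical function ``@make_pair``;
each ``insertelement`` yields a new vector value and leaves its input
unchanged.

.. code-block:: llvm

      define <2 x float> @make_pair(float %a, float %b) {
        %v0 = insertelement <2 x float> poison, float %a, i32 0  ; <%a, poison>
        %v1 = insertelement <2 x float> %v0, float %b, i32 1     ; <%a, %b>
        ret <2 x float> %v1
      }
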
10793.. _i_shufflevector:
10794
10795'``shufflevector``' Instruction
10796^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10797
10798Syntax:
10799"""""""
10800
10801::
10802
10803      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>
10805
10806Overview:
10807"""""""""
10808
10809The '``shufflevector``' instruction constructs a permutation of elements
10810from two input vectors, returning a vector with the same element type as
10811the input and length that is the same as the shuffle mask.
10812
10813Arguments:
10814""""""""""
10815
10816The first two operands of a '``shufflevector``' instruction are vectors
10817with the same type. The third argument is a shuffle mask vector constant
10818whose element type is ``i32``. The mask vector elements must be constant
10819integers or ``poison`` values. The result of the instruction is a vector
10820whose length is the same as the shuffle mask and whose element type is the
10821same as the element type of the first two operands.
10822
10823Semantics:
10824""""""""""
10825
10826The elements of the two input vectors are numbered from left to right
10827across both of the vectors. For each element of the result vector, the
10828shuffle mask selects an element from one of the input vectors to copy
10829to the result. Non-negative elements in the mask represent an index
10830into the concatenated pair of input vectors.
10831
10832A ``poison`` element in the mask vector specifies that the resulting element
10833is ``poison``.
10834For backwards-compatibility reasons, LLVM temporarily also accepts ``undef``
10835mask elements, which will be interpreted the same way as ``poison`` elements.
10836If the shuffle mask selects an ``undef`` element from one of the input
10837vectors, the resulting element is ``undef``.
10838
10839For scalable vectors, the only valid mask values at present are
10840``zeroinitializer``, ``undef`` and ``poison``, since we cannot write all indices as
10841literals for a vector with a length unknown at compile time.
10842
10843Example:
10844""""""""
10845
10846.. code-block:: text
10847
10848      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
10849                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
10850      <result> = shufflevector <4 x i32> %v1, <4 x i32> poison,
10851                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
10852      <result> = shufflevector <8 x i32> %v1, <8 x i32> poison,
10853                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
10854      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
10855                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
10856
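The following sketch illustrates two typical shuffles under the rules above:
reversing a fixed-length vector, and broadcasting ("splatting") a scalar
across a scalable vector with a ``zeroinitializer`` mask. The function names
``@reverse`` and ``@splat`` are hypothetical.

.. code-block:: llvm

      define <4 x i32> @reverse(<4 x i32> %v) {
        ; Mask indices 0..3 select from %v; the second operand is unused.
        %r = shufflevector <4 x i32> %v, <4 x i32> poison,
                           <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %r
      }

      define <vscale x 4 x i32> @splat(i32 %x) {
        ; Place %x in lane 0, then let an all-zero mask copy lane 0 everywhere.
        %ins = insertelement <vscale x 4 x i32> poison, i32 %x, i64 0
        %res = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> poison,
                             <vscale x 4 x i32> zeroinitializer
        ret <vscale x 4 x i32> %res
      }
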
10857Aggregate Operations
10858--------------------
10859
10860LLVM supports several instructions for working with
10861:ref:`aggregate <t_aggregate>` values.
10862
10863.. _i_extractvalue:
10864
10865'``extractvalue``' Instruction
10866^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10867
10868Syntax:
10869"""""""
10870
10871::
10872
10873      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
10874
10875Overview:
10876"""""""""
10877
10878The '``extractvalue``' instruction extracts the value of a member field
10879from an :ref:`aggregate <t_aggregate>` value.
10880
10881Arguments:
10882""""""""""
10883
10884The first operand of an '``extractvalue``' instruction is a value of
10885:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
10886constant indices to specify which value to extract in a similar manner
10887as indices in a '``getelementptr``' instruction.
10888
10889The major differences to ``getelementptr`` indexing are:
10890
10891-  Since the value being indexed is not a pointer, the first index is
10892   omitted and assumed to be zero.
10893-  At least one index must be specified.
10894-  Not only struct indices but also array indices must be in bounds.
10895
10896Semantics:
10897""""""""""
10898
10899The result is the value at the position in the aggregate specified by
10900the index operands.
10901
10902Example:
10903""""""""
10904
10905.. code-block:: text
10906
10907      <result> = extractvalue {i32, float} %agg, 0    ; yields i32
10908
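Multiple indices walk into nested aggregates. As a small sketch (the function
name ``@second_of_inner`` is an assumption made for illustration):

.. code-block:: llvm

      define float @second_of_inner({ i32, [2 x float] } %agg) {
        ; Index 1 selects the array member; the next index 1 selects its
        ; second element. Both indices must be constant and in bounds.
        %f = extractvalue { i32, [2 x float] } %agg, 1, 1
        ret float %f
      }
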
10909.. _i_insertvalue:
10910
10911'``insertvalue``' Instruction
10912^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10913
10914Syntax:
10915"""""""
10916
10917::
10918
10919      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>
10920
10921Overview:
10922"""""""""
10923
10924The '``insertvalue``' instruction inserts a value into a member field in
10925an :ref:`aggregate <t_aggregate>` value.
10926
10927Arguments:
10928""""""""""
10929
10930The first operand of an '``insertvalue``' instruction is a value of
10931:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
10932a first-class value to insert. The following operands are constant
10933indices indicating the position at which to insert the value in a
10934similar manner as indices in a '``extractvalue``' instruction. The value
10935to insert must have the same type as the value identified by the
10936indices.
10937
10938Semantics:
10939""""""""""
10940
10941The result is an aggregate of the same type as ``val``. Its value is
10942that of ``val`` except that the value at the position specified by the
10943indices is that of ``elt``.
10944
10945Example:
10946""""""""
10947
10948.. code-block:: llvm
10949
10950      %agg1 = insertvalue {i32, float} poison, i32 1, 0              ; yields {i32 1, float poison}
10951      %agg2 = insertvalue {i32, float} %agg1, float %val, 1          ; yields {i32 1, float %val}
10952      %agg3 = insertvalue {i32, {float}} poison, float %val, 1, 0    ; yields {i32 poison, {float %val}}
10953
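Because aggregates are SSA values, updating a member means producing a new
aggregate. The sketch below combines ``extractvalue`` and ``insertvalue`` to
swap the two fields of a struct; the function name ``@swap`` is hypothetical.

.. code-block:: llvm

      define { i32, i32 } @swap({ i32, i32 } %p) {
        %a = extractvalue { i32, i32 } %p, 0
        %b = extractvalue { i32, i32 } %p, 1
        %t = insertvalue { i32, i32 } poison, i32 %b, 0   ; {%b, poison}
        %r = insertvalue { i32, i32 } %t, i32 %a, 1       ; {%b, %a}
        ret { i32, i32 } %r
      }
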
10954.. _memoryops:
10955
10956Memory Access and Addressing Operations
10957---------------------------------------
10958
10959A key design point of an SSA-based representation is how it represents
10960memory. In LLVM, no memory locations are in SSA form, which makes things
10961very simple. This section describes how to read, write, and allocate
10962memory in LLVM.
10963
10964.. _i_alloca:
10965
10966'``alloca``' Instruction
10967^^^^^^^^^^^^^^^^^^^^^^^^
10968
10969Syntax:
10970"""""""
10971
10972::
10973
10974      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result
10975
10976Overview:
10977"""""""""
10978
10979The '``alloca``' instruction allocates memory on the stack frame of the
10980currently executing function, to be automatically released when this
10981function returns to its caller.  If the address space is not explicitly
10982specified, the object is allocated in the alloca address space from the
10983:ref:`datalayout string<langref_datalayout>`.
10984
10985Arguments:
10986""""""""""
10987
10988The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
10989bytes of memory on the runtime stack, returning a pointer of the
10990appropriate type to the program. If "NumElements" is specified, it is
10991the number of elements allocated, otherwise "NumElements" is defaulted
10992to be one.
10993
10994If a constant alignment is specified, the value result of the
10995allocation is guaranteed to be aligned to at least that boundary. The
10996alignment may not be greater than ``1 << 32``.
10997
10998The alignment is only optional when parsing textual IR; for in-memory IR,
10999it is always present. If not specified, the target can choose to align the
11000allocation on any convenient boundary compatible with the type.
11001
11002'``type``' may be any sized type.
11003
11004Structs containing scalable vectors cannot be used in allocas unless all
11005fields are the same scalable vector type (e.g. ``{<vscale x 2 x i32>,
11006<vscale x 2 x i32>}`` contains the same type while ``{<vscale x 2 x i32>,
11007<vscale x 2 x i64>}`` doesn't).
11008
11009Semantics:
11010""""""""""
11011
11012Memory is allocated; a pointer is returned. The allocated memory is
11013uninitialized, and loading from uninitialized memory produces an undefined
11014value. The operation itself is undefined if there is insufficient stack
space for the allocation. The '``alloca``' instruction is commonly used
to represent automatic variables that must have an address available. When
the function returns (either with the ``ret`` or ``resume`` instructions),
the memory is reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
which way the stack grows) is not specified.
11022
11023Note that '``alloca``' outside of the alloca address space from the
11024:ref:`datalayout string<langref_datalayout>` is meaningful only if the
11025target has assigned it a semantics.
11026
11027If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
11028the returned object is initially dead.
11029See :ref:`llvm.lifetime.start <int_lifestart>` and
11030:ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
11031lifetime-manipulating intrinsics.
11032
11033Example:
11034""""""""
11035
11036.. code-block:: llvm
11037
11038      %ptr = alloca i32                             ; yields ptr
11039      %ptr = alloca i32, i32 4                      ; yields ptr
11040      %ptr = alloca i32, i32 4, align 1024          ; yields ptr
11041      %ptr = alloca i32, align 1024                 ; yields ptr
11042
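As a sketch of typical usage, the first function below uses an ``alloca`` as
an addressable local variable, and the second allocates a buffer whose
element count is only known at run time. The names ``@local_slot``,
``@dynamic_buffer`` and the external function ``@use`` are assumptions made
for illustration.

.. code-block:: llvm

      declare void @use(ptr)

      define i32 @local_slot(i32 %x) {
        %slot = alloca i32, align 4          ; one uninitialized i32 on the stack
        store i32 %x, ptr %slot, align 4
        %v = load i32, ptr %slot, align 4
        ret i32 %v                           ; %slot is released on return
      }

      define void @dynamic_buffer(i32 %n) {
        ; Reserves space for %n elements of type i64.
        %buf = alloca i64, i32 %n, align 8
        call void @use(ptr %buf)
        ret void
      }
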
11043.. _i_load:
11044
11045'``load``' Instruction
11046^^^^^^^^^^^^^^^^^^^^^^
11047
11048Syntax:
11049"""""""
11050
11051::
11052
11053      <result> = load [volatile] <ty>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
11054      <result> = load atomic [volatile] <ty>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
11055      !<nontemp_node> = !{ i32 1 }
11056      !<empty_node> = !{}
11057      !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
11058      !<align_node> = !{ i64 <value_alignment> }
11059
11060Overview:
11061"""""""""
11062
11063The '``load``' instruction is used to read from memory.
11064
11065Arguments:
11066""""""""""
11067
11068The argument to the ``load`` instruction specifies the memory address from which
11069to load. The type specified must be a :ref:`first class <t_firstclass>` type of
11070known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
11071the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
11072modify the number or order of execution of this ``load`` with other
11073:ref:`volatile operations <volatile>`.
11074
11075If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
11076<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
11077``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
11078Atomic loads produce :ref:`defined <memmodel>` results when they may see
11079multiple atomic stores. The type of the pointee must be an integer, pointer, or
11080floating-point type whose bit width is a power of two greater than or equal to
11081eight and less than or equal to a target-specific size limit.  ``align`` must be
explicitly specified on atomic loads. Note: if the alignment is not greater than
or equal to the size of the ``<ty>`` type, the atomic operation is likely to
11084require a lock and have poor performance. ``!nontemporal`` does not have any
11085defined semantics for atomic loads.
11086
11087The optional constant ``align`` argument specifies the alignment of the
11088operation (that is, the alignment of the memory address). It is the
11089responsibility of the code emitter to ensure that the alignment information is
11090correct. Overestimating the alignment results in undefined behavior.
11091Underestimating the alignment may produce less efficient code. An alignment of
110921 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
11093value higher than the size of the loaded type implies memory up to the
11094alignment value bytes can be safely loaded without trapping in the default
11095address space. Access of the high bytes can interfere with debugging tools, so
11096should not be accessed if the function has the ``sanitize_thread`` or
11097``sanitize_address`` attributes.
11098
11099The alignment is only optional when parsing textual IR; for in-memory IR, it is
11100always present. An omitted ``align`` argument means that the operation has the
11101ABI alignment for the target.
11102
11103The optional ``!nontemporal`` metadata must reference a single
11104metadata name ``<nontemp_node>`` corresponding to a metadata node with one
11105``i32`` entry of value 1. The existence of the ``!nontemporal``
11106metadata on the instruction tells the optimizer and code generator
11107that this load is not expected to be reused in the cache. The code
11108generator may select special instructions to save cache bandwidth, such
11109as the ``MOVNT`` instruction on x86.
11110
11111The optional ``!invariant.load`` metadata must reference a single
11112metadata name ``<empty_node>`` corresponding to a metadata node with no
11113entries. If a load instruction tagged with the ``!invariant.load``
11114metadata is executed, the memory location referenced by the load has
11115to contain the same value at all points in the program where the
11116memory location is dereferenceable; otherwise, the behavior is
11117undefined.
11118
The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See :ref:`invariant.group <md_invariant.group>` metadata.
11122
11123The optional ``!nonnull`` metadata must reference a single
11124metadata name ``<empty_node>`` corresponding to a metadata node with no
11125entries. The existence of the ``!nonnull`` metadata on the
11126instruction tells the optimizer that the value loaded is known to
11127never be null. If the value is null at runtime, a poison value is returned
11128instead.  This is analogous to the ``nonnull`` attribute on parameters and
11129return values. This metadata can only be applied to loads of a pointer type.
11130
11131The optional ``!dereferenceable`` metadata must reference a single metadata
11132name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
11133entry.
See :ref:`dereferenceable <md_dereferenceable>` metadata.
11135
11136The optional ``!dereferenceable_or_null`` metadata must reference a single
11137metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
11138``i64`` entry.
See :ref:`dereferenceable_or_null <md_dereferenceable_or_null>` metadata.
11141
11142The optional ``!align`` metadata must reference a single metadata name
11143``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
11144The existence of the ``!align`` metadata on the instruction tells the
11145optimizer that the value loaded is known to be aligned to a boundary specified
11146by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
11148This metadata can only be applied to loads of a pointer type. If the returned
11149value is not appropriately aligned at runtime, a poison value is returned
11150instead.
11151
11152The optional ``!noundef`` metadata must reference a single metadata name
11153``<empty_node>`` corresponding to a node with no entries. The existence of
11154``!noundef`` metadata on the instruction tells the optimizer that the value
11155loaded is known to be :ref:`well defined <welldefinedvalues>`.
11156If the value isn't well defined, the behavior is undefined. If the ``!noundef``
11157metadata is combined with poison-generating metadata like ``!nonnull``,
11158violation of that metadata constraint will also result in undefined behavior.
11159
11160Semantics:
11161""""""""""
11162
11163The location of memory pointed to is loaded. If the value being loaded
11164is of scalar type then the number of bytes read does not exceed the
11165minimum number of bytes needed to hold all bits of the type. For
11166example, loading an ``i24`` reads at most three bytes. When loading a
11167value of a type like ``i20`` with a size that is not an integral number
11168of bytes, the result is undefined if the value was not originally
11169written using a store of the same type.
11170If the value being loaded is of aggregate type, the bytes that correspond to
11171padding may be accessed but are ignored, because it is impossible to observe
11172padding from the loaded aggregate value.
11173If ``<pointer>`` is not a well-defined value, the behavior is undefined.
11174
11175Examples:
11176"""""""""
11177
11178.. code-block:: llvm
11179
11180      %ptr = alloca i32                               ; yields ptr
11181      store i32 3, ptr %ptr                           ; yields void
11182      %val = load i32, ptr %ptr                       ; yields i32:val = i32 3
11183
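The sketch below shows several of the forms described above in one place: a
plain load, a volatile load, an atomic load, and a pointer load carrying
``!nonnull`` and ``!noundef`` metadata. The function name ``@load_variants``
and the alignments chosen are assumptions made for illustration.

.. code-block:: llvm

      define i32 @load_variants(ptr %p, ptr %pp) {
        %a = load i32, ptr %p, align 4                 ; plain load
        %b = load volatile i32, ptr %p, align 4        ; not reordered with other volatile operations
        %c = load atomic i32, ptr %p acquire, align 4  ; align is mandatory here
        ; The loaded pointer is asserted to be non-null and well defined.
        %q = load ptr, ptr %pp, align 8, !nonnull !0, !noundef !0
        %d = load i32, ptr %q, align 4
        %s0 = add i32 %a, %b
        %s1 = add i32 %s0, %c
        %s2 = add i32 %s1, %d
        ret i32 %s2
      }

      !0 = !{}
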
11184.. _i_store:
11185
11186'``store``' Instruction
11187^^^^^^^^^^^^^^^^^^^^^^^
11188
11189Syntax:
11190"""""""
11191
11192::
11193
11194      store [volatile] <ty> <value>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
11195      store atomic [volatile] <ty> <value>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
11196      !<nontemp_node> = !{ i32 1 }
11197      !<empty_node> = !{}
11198
11199Overview:
11200"""""""""
11201
11202The '``store``' instruction is used to write to memory.
11203
11204Arguments:
11205""""""""""
11206
11207There are two arguments to the ``store`` instruction: a value to store and an
11208address at which to store it. The type of the ``<pointer>`` operand must be a
11209pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
11210operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
11211allowed to modify the number or order of execution of this ``store`` with other
11212:ref:`volatile operations <volatile>`.  Only values of :ref:`first class
11213<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
11214structural type <t_opaque>`) can be stored.
11215
11216If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
11217<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
11218``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
11219Atomic loads produce :ref:`defined <memmodel>` results when they may see
11220multiple atomic stores. The type of the pointee must be an integer, pointer, or
11221floating-point type whose bit width is a power of two greater than or equal to
11222eight and less than or equal to a target-specific size limit.  ``align`` must be
explicitly specified on atomic stores. Note: if the alignment is not greater than
or equal to the size of the ``<value>`` type, the atomic operation is likely to
11225require a lock and have poor performance. ``!nontemporal`` does not have any
11226defined semantics for atomic stores.
11227
11228The optional constant ``align`` argument specifies the alignment of the
11229operation (that is, the alignment of the memory address). It is the
11230responsibility of the code emitter to ensure that the alignment information is
11231correct. Overestimating the alignment results in undefined behavior.
11232Underestimating the alignment may produce less efficient code. An alignment of
112331 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
value higher than the size of the stored type implies memory up to the
11235alignment value bytes can be safely loaded without trapping in the default
11236address space. Access of the high bytes can interfere with debugging tools, so
11237should not be accessed if the function has the ``sanitize_thread`` or
11238``sanitize_address`` attributes.
11239
11240The alignment is only optional when parsing textual IR; for in-memory IR, it is
11241always present. An omitted ``align`` argument means that the operation has the
11242ABI alignment for the target.
11243
11244The optional ``!nontemporal`` metadata must reference a single metadata
11245name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
11246of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
11248be reused in the cache. The code generator may select special
11249instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
11250x86.
11251
The optional ``!invariant.group`` metadata must reference a
single metadata name ``<empty_node>``. See
:ref:`invariant.group <md_invariant.group>` metadata.
11254
11255Semantics:
11256""""""""""
11257
11258The contents of memory are updated to contain ``<value>`` at the
11259location specified by the ``<pointer>`` operand. If ``<value>`` is
11260of scalar type then the number of bytes written does not exceed the
11261minimum number of bytes needed to hold all bits of the type. For
11262example, storing an ``i24`` writes at most three bytes. When writing a
11263value of a type like ``i20`` with a size that is not an integral number
11264of bytes, it is unspecified what happens to the extra bits that do not
11265belong to the type, but they will typically be overwritten.
11266If ``<value>`` is of aggregate type, padding is filled with
11267:ref:`undef <undefvalues>`.
11268If ``<pointer>`` is not a well-defined value, the behavior is undefined.
11269
11270Example:
11271""""""""
11272
11273.. code-block:: llvm
11274
11275      %ptr = alloca i32                               ; yields ptr
11276      store i32 3, ptr %ptr                           ; yields void
11277      %val = load i32, ptr %ptr                       ; yields i32:val = i32 3
11278
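As a further sketch, the forms described above can be written as follows. The
function name ``@store_variants`` and the alignments chosen are assumptions
made for illustration.

.. code-block:: llvm

      define void @store_variants(ptr %p, i32 %v) {
        store i32 %v, ptr %p, align 4                  ; plain store
        store volatile i32 %v, ptr %p, align 4         ; not reordered with other volatile operations
        store atomic i32 %v, ptr %p release, align 4   ; align is mandatory here
        store i32 %v, ptr %p, align 4, !nontemporal !0 ; hint: data is not expected to be reused
        ret void
      }

      !0 = !{ i32 1 }
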
11279.. _i_fence:
11280
11281'``fence``' Instruction
11282^^^^^^^^^^^^^^^^^^^^^^^
11283
11284Syntax:
11285"""""""
11286
11287::
11288
11289      fence [syncscope("<target-scope>")] <ordering>  ; yields void
11290
11291Overview:
11292"""""""""
11293
11294The '``fence``' instruction is used to introduce happens-before edges
11295between operations.
11296
11297Arguments:
11298""""""""""
11299
11300'``fence``' instructions take an :ref:`ordering <ordering>` argument which
11301defines what *synchronizes-with* edges they add. They can only be given
11302``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
11303
11304Semantics:
11305""""""""""
11306
11307A fence A which has (at least) ``release`` ordering semantics
11308*synchronizes with* a fence B with (at least) ``acquire`` ordering
11309semantics if and only if there exist atomic operations X and Y, both
11310operating on some atomic object M, such that A is sequenced before X, X
11311modifies M (either directly or through some side effect of a sequence
11312headed by X), Y is sequenced before B, and Y observes M. This provides a
11313*happens-before* dependency between A and B. Rather than an explicit
11314``fence``, one (but not both) of the atomic operations X or Y might
11315provide a ``release`` or ``acquire`` (resp.) ordering constraint and
11316still *synchronize-with* the explicit ``fence`` and establish the
11317*happens-before* edge.
11318
11319A ``fence`` which has ``seq_cst`` ordering, in addition to having both
11320``acquire`` and ``release`` semantics specified above, participates in
11321the global program order of other ``seq_cst`` operations and/or
11322fences. Furthermore, the global ordering created by a ``seq_cst``
11323fence must be compatible with the individual total orders of
11324``monotonic`` (or stronger) memory accesses occurring before and after
11325such a fence. The exact semantics of this interaction are somewhat
11326complicated, see the C++ standard's `[atomics.order]
11327<https://wg21.link/atomics.order>`_ section for more details.
11328
11329A ``fence`` instruction can also take an optional
11330":ref:`syncscope <syncscope>`" argument.
11331
11332Example:
11333""""""""
11334
11335.. code-block:: text
11336
11337      fence acquire                                        ; yields void
11338      fence syncscope("singlethread") seq_cst              ; yields void
11339      fence syncscope("agent") seq_cst                     ; yields void
11340
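The synchronization rule from the semantics above can be sketched as a
message-passing pattern. In ``@publish`` the release fence plays the role of
A and the monotonic store to ``%flag`` is X; in ``@consume`` the monotonic
load of ``%flag`` is Y and the acquire fence is B. The function names and the
use of ``1`` as the "ready" value are assumptions made for illustration.

.. code-block:: llvm

      define void @publish(ptr %data, ptr %flag, i32 %v) {
        store i32 %v, ptr %data, align 4
        fence release                                     ; A
        store atomic i32 1, ptr %flag monotonic, align 4  ; X
        ret void
      }

      define i32 @consume(ptr %data, ptr %flag) {
      entry:
        br label %spin

      spin:
        %f = load atomic i32, ptr %flag monotonic, align 4  ; Y
        %ready = icmp eq i32 %f, 1
        br i1 %ready, label %done, label %spin

      done:
        fence acquire                                       ; B
        ; The store to %data in @publish happens-before this load.
        %v = load i32, ptr %data, align 4
        ret i32 %v
      }
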
11341.. _i_cmpxchg:
11342
11343'``cmpxchg``' Instruction
11344^^^^^^^^^^^^^^^^^^^^^^^^^
11345
11346Syntax:
11347"""""""
11348
11349::
11350
11351      cmpxchg [weak] [volatile] ptr <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }
11352
11353Overview:
11354"""""""""
11355
11356The '``cmpxchg``' instruction is used to atomically modify memory. It
11357loads a value in memory and compares it to a given value. If they are
11358equal, it tries to store a new value into the memory.
11359
11360Arguments:
11361""""""""""
11362
11363There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
11365address, and a new value to place at that address if the compared values
11366are equal. The type of '<cmp>' must be an integer or pointer type whose
11367bit width is a power of two greater than or equal to eight and less
11368than or equal to a target-specific size limit. '<cmp>' and '<new>' must
11369have the same type, and the type of '<pointer>' must be a pointer to
11370that type. If the ``cmpxchg`` is marked as ``volatile``, then the
11371optimizer is not allowed to modify the number or order of execution of
11372this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
11373
11374The success and failure :ref:`ordering <ordering>` arguments specify how this
11375``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
11377``release`` or ``acq_rel``.
11378
11379A ``cmpxchg`` instruction can also take an optional
11380":ref:`syncscope <syncscope>`" argument.
11381
Note: if the alignment is not greater than or equal to the size of the
``<ty>`` type, the atomic operation is likely to require a lock and have
poor performance.
11385
11386The alignment is only optional when parsing textual IR; for in-memory IR, it is
11387always present. If unspecified, the alignment is assumed to be equal to the
size of the ``<ty>`` type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when
``align`` isn't specified.
11391
11392The pointer passed into cmpxchg must have alignment greater than or
11393equal to the size in memory of the operand.
11394
11395Semantics:
11396""""""""""
11397
11398The contents of memory at the location specified by the '``<pointer>``' operand
11399is read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
11400written to the location. The original value at the location is returned,
11401together with a flag indicating success (true) or failure (false).
11402
11403If the cmpxchg operation is marked as ``weak`` then a spurious failure is
11404permitted: the operation may not write ``<new>`` even if the comparison
11405matched.
11406
11407If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
11408if the value loaded equals ``cmp``.
11409
11410A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
11411identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
11413
11414Example:
11415""""""""
11416
11417.. code-block:: llvm
11418
11419    entry:
11420      %orig = load atomic i32, ptr %ptr unordered, align 4                      ; yields i32
11421      br label %loop
11422
11423    loop:
11424      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
11425      %squared = mul i32 %cmp, %cmp
11426      %val_success = cmpxchg ptr %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
11427      %value_loaded = extractvalue { i32, i1 } %val_success, 0
11428      %success = extractvalue { i32, i1 } %val_success, 1
11429      br i1 %success, label %done, label %loop
11430
11431    done:
11432      ...
11433
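A single strong ``cmpxchg`` is also enough for a one-shot attempt, where only
the success flag matters. The sketch below assumes a hypothetical function
``@try_lock`` and a lock word convention of ``0`` for free and ``1`` for
held.

.. code-block:: llvm

      define i1 @try_lock(ptr %lock) {
        ; Try to change the lock word from 0 to 1. Because the operation is
        ; strong, %ok is true if and only if the loaded value was 0.
        %pair = cmpxchg ptr %lock, i32 0, i32 1 acquire monotonic, align 4
        %ok = extractvalue { i32, i1 } %pair, 1
        ret i1 %ok
      }
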
11434.. _i_atomicrmw:
11435
11436'``atomicrmw``' Instruction
11437^^^^^^^^^^^^^^^^^^^^^^^^^^^
11438
11439Syntax:
11440"""""""
11441
11442::
11443
11444      atomicrmw [volatile] <operation> ptr <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty
11445
11446Overview:
11447"""""""""
11448
11449The '``atomicrmw``' instruction is used to atomically modify memory.
11450
11451Arguments:
11452""""""""""
11453
11454There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:
11457
11458-  xchg
11459-  add
11460-  sub
11461-  and
11462-  nand
11463-  or
11464-  xor
11465-  max
11466-  min
11467-  umax
11468-  umin
11469-  fadd
11470-  fsub
11471-  fmax
11472-  fmin
11473-  uinc_wrap
11474-  udec_wrap
11475-  usub_cond
11476-  usub_sat
11477
11478For most of these operations, the type of '<value>' must be an integer
11479type whose bit width is a power of two greater than or equal to eight
11480and less than or equal to a target-specific size limit. For xchg, this
11481may also be a floating point or a pointer type with the same size constraints
11482as integers.  For fadd/fsub/fmax/fmin, this must be a floating-point
11483or fixed vector of floating-point type.  The type of the '``<pointer>``'
11484operand must be a pointer to that type. If the ``atomicrmw`` is marked
11485as ``volatile``, then the optimizer is not allowed to modify the
11486number or order of execution of this ``atomicrmw`` with other
11487:ref:`volatile operations <volatile>`.
11488
Note: if the alignment is not greater than or equal to the size of the ``<value>``
type, the atomic operation is likely to require a lock and have poor
performance.
11492
11493The alignment is only optional when parsing textual IR; for in-memory IR, it is
11494always present. If unspecified, the alignment is assumed to be equal to the
11495size of the '<value>' type. Note that this default alignment assumption is
11496different from the alignment used for the load/store instructions when align
11497isn't specified.
11498
An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.
11501
11502Semantics:
11503""""""""""
11504
11505The contents of memory at the location specified by the '``<pointer>``'
11506operand are atomically read, modified, and written back. The original
11507value at the location is returned. The modification is specified by the
11508operation argument:
11509
11510-  xchg: ``*ptr = val``
11511-  add: ``*ptr = *ptr + val``
11512-  sub: ``*ptr = *ptr - val``
11513-  and: ``*ptr = *ptr & val``
11514-  nand: ``*ptr = ~(*ptr & val)``
11515-  or: ``*ptr = *ptr | val``
11516-  xor: ``*ptr = *ptr ^ val``
11517-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
11518-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
11519-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
11520-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
-  fmax: ``*ptr = maxnum(*ptr, val)`` (matches the ``llvm.maxnum.*`` intrinsic)
-  fmin: ``*ptr = minnum(*ptr, val)`` (matches the ``llvm.minnum.*`` intrinsic)
11525-  uinc_wrap: ``*ptr = (*ptr u>= val) ? 0 : (*ptr + 1)`` (increment value with wraparound to zero when incremented above input value)
11526-  udec_wrap: ``*ptr = ((*ptr == 0) || (*ptr u> val)) ? val : (*ptr - 1)`` (decrement with wraparound to input value when decremented below zero).
11527-  usub_cond: ``*ptr = (*ptr u>= val) ? *ptr - val : *ptr`` (subtract only if no unsigned overflow).
11528-  usub_sat: ``*ptr = (*ptr u>= val) ? *ptr - val : 0`` (subtract with unsigned clamping to zero).
11529
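The less familiar operations in the list above can be read as ordinary
functions from the old memory value to the new one. The following C sketch
models only that arithmetic for 32-bit operands; it is not atomic, and the
``rmw_*`` helper names are hypothetical, chosen for illustration.

```c
#include <stdint.h>

/* New value stored by each operation, given the old value of *ptr and the
   operand val. These helpers are NOT atomic; they only model the arithmetic. */
uint32_t rmw_nand(uint32_t old, uint32_t val) { return ~(old & val); }

/* Increment, wrapping to zero once the old value has reached val. */
uint32_t rmw_uinc_wrap(uint32_t old, uint32_t val) {
  return (old >= val) ? 0u : old + 1u;
}

/* Decrement, wrapping to val when the old value is zero or above val. */
uint32_t rmw_udec_wrap(uint32_t old, uint32_t val) {
  return (old == 0u || old > val) ? val : old - 1u;
}

/* Subtract only if the subtraction does not wrap below zero. */
uint32_t rmw_usub_cond(uint32_t old, uint32_t val) {
  return (old >= val) ? old - val : old;
}

/* Subtract, clamping the result at zero. */
uint32_t rmw_usub_sat(uint32_t old, uint32_t val) {
  return (old >= val) ? old - val : 0u;
}
```

In every case the instruction itself returns the old value, not the value
computed by these helpers.
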
11530
11531Example:
11532""""""""
11533
11534.. code-block:: llvm
11535
11536      %old = atomicrmw add ptr %ptr, i32 1 acquire                        ; yields i32
11537
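For readers coming from C, the example above roughly corresponds to a C11
``atomic_fetch_add_explicit`` call with acquire ordering. This is a sketch of
the correspondence, not a statement about the exact IR a given compiler emits;
``fetch_increment`` is a hypothetical helper name.

```c
#include <stdatomic.h>

/* Atomically adds 1 to *counter with acquire ordering and returns the value
   that was stored before the addition, like %old in the IR example. */
int fetch_increment(_Atomic int *counter) {
  return atomic_fetch_add_explicit(counter, 1, memory_order_acquire);
}
```

As with ``atomicrmw``, the call returns the original contents of the memory
location rather than the updated value.
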
11538.. _i_getelementptr:
11539
11540'``getelementptr``' Instruction
11541^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11542
11543Syntax:
11544"""""""
11545
11546::
11547
11548      <result> = getelementptr <ty>, ptr <ptrval>{, <ty> <idx>}*
11549      <result> = getelementptr inbounds <ty>, ptr <ptrval>{, <ty> <idx>}*
11550      <result> = getelementptr nusw <ty>, ptr <ptrval>{, <ty> <idx>}*
11551      <result> = getelementptr nuw <ty>, ptr <ptrval>{, <ty> <idx>}*
11552      <result> = getelementptr inrange(S,E) <ty>, ptr <ptrval>{, <ty> <idx>}*
11553      <result> = getelementptr <ty>, <N x ptr> <ptrval>, <vector index type> <idx>
11554
11555Overview:
11556"""""""""
11557
11558The '``getelementptr``' instruction is used to get the address of a
11559subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
11560address calculation only and does not access memory. The instruction can also
11561be used to calculate a vector of such addresses.
11562
11563Arguments:
11564""""""""""
11565
11566The first argument is always a type used as the basis for the calculations.
11567The second argument is always a pointer or a vector of pointers, and is the
11568base address to start from. The remaining arguments are indices
11569that indicate which of the elements of the aggregate object are indexed.
11570The interpretation of each index is dependent on the type being indexed
11571into. The first index always indexes the pointer value given as the
11572second argument, the second index indexes a value of the type pointed to
11573(not necessarily the value directly pointed to, since the first index
11574can be non-zero), etc. The first type indexed into must be a pointer
11575value, subsequent types can be arrays, vectors, and structs. Note that
11576subsequent types being indexed into can never be pointers, since that
11577would require loading the pointer before continuing calculation.
11578
11579The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
11581**constants** are allowed (when using a vector of indices they must all
11582be the **same** ``i32`` integer constant). When indexing into an array,
11583pointer or vector, integers of any width are allowed, and they are not
11584required to be constant. These integers are treated as signed values
11585where relevant.
11586
11587For example, let's consider a C code fragment and how it gets compiled
11588to LLVM:
11589
11590.. code-block:: c
11591
11592    struct RT {
11593      char A;
11594      int B[10][20];
11595      char C;
11596    };
11597    struct ST {
11598      int X;
11599      double Y;
11600      struct RT Z;
11601    };
11602
11603    int *foo(struct ST *s) {
11604      return &s[1].Z.B[5][13];
11605    }
11606
11607The LLVM code generated by Clang is approximately:
11608
11609.. code-block:: llvm
11610
11611    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
11612    %struct.ST = type { i32, double, %struct.RT }
11613
11614    define ptr @foo(ptr %s) {
11615    entry:
11616      %arrayidx = getelementptr inbounds %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
11617      ret ptr %arrayidx
11618    }
11619
11620Semantics:
11621""""""""""
11622
11623In the example above, the first index is indexing into the
11624'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
11625= '``{ i32, double, %struct.RT }``' type, a structure. The second index
11626indexes into the third element of the structure, yielding a
11627'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
11628structure. The third index indexes into the second element of the
11629structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
11630dimensions of the array are subscripted into, yielding an '``i32``'
11631type. The '``getelementptr``' instruction returns a pointer to this
11632element.
11633
11634Note that it is perfectly legal to index partially through a structure,
11635returning a pointer to an inner element. Because of this, the LLVM code
11636for the given testcase is equivalent to:
11637
11638.. code-block:: llvm
11639
11640    define ptr @foo(ptr %s) {
11641      %t1 = getelementptr %struct.ST, ptr %s, i32 1
11642      %t2 = getelementptr %struct.ST, ptr %t1, i32 0, i32 2
11643      %t3 = getelementptr %struct.RT, ptr %t2, i32 0, i32 1
11644      %t4 = getelementptr [10 x [20 x i32]], ptr %t3, i32 0, i32 5
11645      %t5 = getelementptr [20 x i32], ptr %t4, i32 0, i32 13
11646      ret ptr %t5
11647    }
11648
11649The indices are first converted to offsets in the pointer's index type. If the
11650currently indexed type is a struct type, the struct offset corresponding to the
11651index is sign-extended or truncated to the pointer index type. Otherwise, the
11652index itself is sign-extended or truncated, and then multiplied by the type
11653allocation size (that is, the size rounded up to the ABI alignment) of the
11654currently indexed type.
11655
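The offset rule can be checked against the C fragment from the Arguments
section. The sketch below recomputes the byte offset of
``&s[1].Z.B[5][13]`` one index at a time, assuming the C compiler lays out the
structs the same way as the corresponding LLVM types; ``gep_byte_offset`` is a
hypothetical helper name.

```c
#include <stddef.h>

struct RT {
  char A;
  int B[10][20];
  char C;
};
struct ST {
  int X;
  double Y;
  struct RT Z;
};

/* Byte offset computed by
     getelementptr %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
   with one term per index, in order. */
size_t gep_byte_offset(void) {
  size_t off = 0;
  off += 1 * sizeof(struct ST);   /* i64 1: index times allocation size of %struct.ST */
  off += offsetof(struct ST, Z);  /* i32 2: struct offset of the third field */
  off += offsetof(struct RT, B);  /* i32 1: struct offset of the second field */
  off += 5 * sizeof(int[20]);     /* i64 5: index times size of [20 x i32] */
  off += 13 * sizeof(int);        /* i64 13: index times size of i32 */
  return off;
}

/* The function from the C fragment, for comparison. */
int *foo(struct ST *s) { return &s[1].Z.B[5][13]; }
```

The distance in bytes between ``foo(s)`` and ``s`` equals
``gep_byte_offset()``, which mirrors how struct indices contribute field
offsets while array and pointer indices are scaled by the element size.
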
11656The offsets are then added to the low bits of the base address up to the index
11657type width, with silently-wrapping two's complement arithmetic. If the pointer
11658size is larger than the index size, this means that the bits outside the index
11659type width will not be affected.
11660
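A minimal C model of this addition, assuming a hypothetical target whose
pointers are 64 bits wide and whose index type may be narrower. The function
name ``gep_add_offset`` and its parameters are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Adds a byte offset to a 64-bit address whose index type is index_bits wide.
   Only the low index_bits bits take part in the wrapping addition; the
   remaining high bits of the address are preserved. */
uint64_t gep_add_offset(uint64_t addr, uint64_t offset, unsigned index_bits) {
  assert(index_bits >= 1 && index_bits <= 64);
  uint64_t mask =
      (index_bits == 64) ? UINT64_MAX : ((UINT64_C(1) << index_bits) - 1);
  uint64_t low = (addr + offset) & mask; /* wraps within the index width */
  return (addr & ~mask) | low;           /* bits above the index width are kept */
}
```

With a 32-bit index type, adding 1 to ``0x00000001FFFFFFFF`` wraps the low
half to zero and leaves the high half unchanged; with a 64-bit index type the
carry propagates as usual.
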
11661The result value of the ``getelementptr`` may be outside the object pointed
11662to by the base pointer. The result value may not necessarily be used to access
11663memory though, even if it happens to point into allocated storage. See the
11664:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
11665information.
11666
11667The ``getelementptr`` instruction may have a number of attributes that impose
11668additional rules. If any of the rules are violated, the result value is a
11669:ref:`poison value <poisonvalues>`. In cases where the base is a vector of
11670pointers, the attributes apply to each computation element-wise.
11671
11672For ``nusw`` (no unsigned signed wrap):
11673
11674 * If the type of an index is larger than the pointer index type, the
11675   truncation to the pointer index type preserves the signed value
11676   (``trunc nsw``).
11677 * The multiplication of an index by the type size does not wrap the pointer
11678   index type in a signed sense (``mul nsw``).
11679 * The successive addition of each offset (without adding the base address)
11680   does not wrap the pointer index type in a signed sense (``add nsw``).
11681 * The successive addition of the current address, truncated to the pointer
11682   index type and interpreted as an unsigned number, and each offset,
11683   interpreted as a signed number, does not wrap the pointer index type.
11684
11685For ``nuw`` (no unsigned wrap):
11686
11687 * If the type of an index is larger than the pointer index type, the
11688   truncation to the pointer index type preserves the unsigned value
11689   (``trunc nuw``).
11690 * The multiplication of an index by the type size does not wrap the pointer
11691   index type in an unsigned sense (``mul nuw``).
11692 * The successive addition of each offset (without adding the base address)
11693   does not wrap the pointer index type in an unsigned sense (``add nuw``).
11694 * The successive addition of the current address, truncated to the pointer
11695   index type and interpreted as an unsigned number, and each offset, also
11696   interpreted as an unsigned number, does not wrap the pointer index type
11697   (``add nuw``).
11698
11699For ``inbounds`` all rules of the ``nusw`` attribute apply. Additionally,
11700if the ``getelementptr`` has any non-zero indices, the following rules apply:
11701
11702 * The base pointer has an *in bounds* address of the allocated object that it
11703   is :ref:`based <pointeraliasing>` on. This means that it points into that
11704   allocated object, or to its end. Note that the object does not have to be
11705   live anymore; being in-bounds of a deallocated object is sufficient.
11706 * During the successive addition of offsets to the address, the resulting
11707   pointer must remain *in bounds* of the allocated object at each step.
11708
11709Note that ``getelementptr`` with all-zero indices is always considered to be
11710``inbounds``, even if the base pointer does not point to an allocated object.
11711As a corollary, the only pointer in bounds of the null pointer in the default
11712address space is the null pointer itself.
11713
11714These rules are based on the assumption that no allocated object may cross
11715the unsigned address space boundary, and no allocated object may be larger
11716than half the pointer index type space.
11717
11718If ``inbounds`` is present on a ``getelementptr`` instruction, the ``nusw``
11719attribute will be automatically set as well. For this reason, the ``nusw``
11720will also not be printed in textual IR if ``inbounds`` is already present.
11721
11722If the ``inrange(Start, End)`` attribute is present, loading from or
11723storing to any pointer derived from the ``getelementptr`` has undefined
11724behavior if the load or store would access memory outside the half-open range
11725``[Start, End)`` from the ``getelementptr`` expression result. The result of
11726a pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
11727involving memory) involving a pointer derived from a ``getelementptr`` with
11728the ``inrange`` keyword is undefined, with the exception of comparisons
11729in the case where both operands are in the closed range ``[Start, End]``.
11730Note that the ``inrange`` keyword is currently only allowed
11731in constant ``getelementptr`` expressions.
11732
11733The getelementptr instruction is often confusing. For some more insight
11734into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
11735
11736Example:
11737""""""""
11738
11739.. code-block:: llvm
11740
11741        %aptr = getelementptr {i32, [12 x i8]}, ptr %saptr, i64 0, i32 1
11742        %vptr = getelementptr {i32, <2 x i8>}, ptr %svptr, i64 0, i32 1, i32 1
11743        %eptr = getelementptr [12 x i8], ptr %aptr, i64 0, i32 1
11744        %iptr = getelementptr [10 x i32], ptr @arr, i16 0, i16 0
11745
11746Vector of pointers:
11747"""""""""""""""""""
11748
11749The ``getelementptr`` returns a vector of pointers, instead of a single address,
11750when one or more of its arguments is a vector. In such cases, all vector
11751arguments should have the same number of elements, and every scalar argument
11752will be effectively broadcast into a vector during address calculation.
11753
11754.. code-block:: llvm
11755
11756     ; All arguments are vectors:
11757     ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
     %A = getelementptr i8, <4 x ptr> %ptrs, <4 x i64> %offsets
11759
11760     ; Add the same scalar offset to each pointer of a vector:
11761     ;   A[i] = ptrs[i] + offset*sizeof(i8)
11762     %A = getelementptr i8, <4 x ptr> %ptrs, i64 %offset
11763
11764     ; Add distinct offsets to the same pointer:
11765     ;   A[i] = ptr + offsets[i]*sizeof(i8)
11766     %A = getelementptr i8, ptr %ptr, <4 x i64> %offsets
11767
11768     ; In all cases described above the type of the result is <4 x ptr>
11769
11770The two following instructions are equivalent:
11771
11772.. code-block:: llvm
11773
11774     getelementptr  %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
11775       <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
11776       <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
11777       <4 x i32> %ind4,
11778       <4 x i64> <i64 13, i64 13, i64 13, i64 13>
11779
11780     getelementptr  %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
11781       i32 2, i32 1, <4 x i32> %ind4, i64 13
11782
11783Let's look at the C code, where the vector version of ``getelementptr``
11784makes sense:
11785
11786.. code-block:: c
11787
11788    // Let's assume that we vectorize the following loop:
11789    double *A, *B; int *C;
11790    for (int i = 0; i < size; ++i) {
11791      A[i] = B[C[i]];
11792    }
11793
11794.. code-block:: llvm
11795
11796    ; get pointers for 8 elements from array B
11797    %ptrs = getelementptr double, ptr %B, <8 x i32> %C
11798    ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0(<8 x ptr> %ptrs,
         i32 8, <8 x i1> %mask, <8 x double> %passthru)
11801
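The vectorized form can be compared with a scalar model. The following C
sketch computes one address per lane, as the vector ``getelementptr`` does, and
then loads only the lanes whose mask bit is set. The name ``gather8`` and the
fixed width of eight lanes are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar model of an 8-lane masked gather: lane i loads through its own
   pointer when mask[i] is set and takes passthru[i] otherwise. */
void gather8(double out[8], const double *B, const int C[8],
             const bool mask[8], const double passthru[8]) {
  for (size_t i = 0; i < 8; ++i) {
    const double *ptr = B + C[i]; /* lane i of the vector getelementptr */
    out[i] = mask[i] ? *ptr : passthru[i];
  }
}
```

The address computation happens for every lane, while the memory access is
skipped for masked-off lanes.
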
11802Conversion Operations
11803---------------------
11804
11805The instructions in this category are the conversion instructions
11806(casting) which all take a single operand and a type. They perform
11807various bit conversions on the operand.
11808
11809.. _i_trunc:
11810
11811'``trunc .. to``' Instruction
11812^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11813
11814Syntax:
11815"""""""
11816
11817::
11818
11819      <result> = trunc <ty> <value> to <ty2>             ; yields ty2
11820      <result> = trunc nsw <ty> <value> to <ty2>         ; yields ty2
11821      <result> = trunc nuw <ty> <value> to <ty2>         ; yields ty2
11822      <result> = trunc nuw nsw <ty> <value> to <ty2>     ; yields ty2
11823
11824Overview:
11825"""""""""
11826
11827The '``trunc``' instruction truncates its operand to the type ``ty2``.
11828
11829Arguments:
11830""""""""""
11831
The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal-sized
types are not allowed.
11837
11838Semantics:
11839""""""""""
11840
11841The '``trunc``' instruction truncates the high order bits in ``value``
11842and converts the remaining bits to ``ty2``. Since the source size must
11843be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
11844It will always truncate bits.
11845
11846If the ``nuw`` keyword is present, and any of the truncated bits are non-zero,
11847the result is a :ref:`poison value <poisonvalues>`. If the ``nsw`` keyword
11848is present, and any of the truncated bits are not the same as the top bit
11849of the truncation result, the result is a :ref:`poison value <poisonvalues>`.
11850
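The two flag conditions can be stated as bit tests. The sketch below models
``trunc i32 %v to i8`` in C and reports whether each flag's condition holds;
the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct trunc_result {
  uint8_t value; /* the low 8 bits of the input */
  bool nuw_ok;   /* false: 'trunc nuw' would yield poison */
  bool nsw_ok;   /* false: 'trunc nsw' would yield poison */
};

/* Models 'trunc i32 %v to i8' and checks the nuw and nsw conditions. */
struct trunc_result trunc_i32_to_i8(uint32_t v) {
  struct trunc_result r;
  uint32_t dropped = v >> 8;   /* the 24 truncated high bits */
  r.value = (uint8_t)(v & 0xFFu);
  r.nuw_ok = (dropped == 0);   /* all truncated bits are zero */
  /* Every truncated bit must equal the top bit of the result. */
  r.nsw_ok = (r.value & 0x80u) ? (dropped == 0xFFFFFFu) : (dropped == 0);
  return r;
}
```

For example, 128 keeps its unsigned value in 8 bits but not its signed value,
so only the ``nsw`` condition fails for it.
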
11851Example:
11852""""""""
11853
11854.. code-block:: llvm
11855
11856      %X = trunc i32 257 to i8                        ; yields i8:1
11857      %Y = trunc i32 123 to i1                        ; yields i1:true
11858      %Z = trunc i32 122 to i1                        ; yields i1:false
11859      %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
11860
11861.. _i_zext:
11862
11863'``zext .. to``' Instruction
11864^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11865
11866Syntax:
11867"""""""
11868
11869::
11870
11871      <result> = zext <ty> <value> to <ty2>             ; yields ty2
11872
11873Overview:
11874"""""""""
11875
11876The '``zext``' instruction zero extends its operand to type ``ty2``.
11877
11878The ``nneg`` (non-negative) flag, if present, specifies that the operand is
11879non-negative. This property may be used by optimization passes to later
11880convert the ``zext`` into a ``sext``.
11881
11882Arguments:
11883""""""""""
11884
11885The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
11887the same number of integers. The bit size of the ``value`` must be
11888smaller than the bit size of the destination type, ``ty2``.
11889
11890Semantics:
11891""""""""""
11892
11893The ``zext`` fills the high order bits of the ``value`` with zero bits
11894until it reaches the size of the destination type, ``ty2``.
11895
11896When zero extending from i1, the result will always be either 0 or 1.
11897
11898If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result
11899is a poison value.
11900
11901Example:
11902""""""""
11903
11904.. code-block:: llvm
11905
11906      %X = zext i32 257 to i64              ; yields i64:257
11907      %Y = zext i1 true to i32              ; yields i32:1
11908      %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
11909
11910      %a = zext nneg i8 127 to i16 ; yields i16 127
11911      %b = zext nneg i8 -1 to i16  ; yields i16 poison
11912
11913.. _i_sext:
11914
11915'``sext .. to``' Instruction
11916^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11917
11918Syntax:
11919"""""""
11920
11921::
11922
11923      <result> = sext <ty> <value> to <ty2>             ; yields ty2
11924
11925Overview:
11926"""""""""
11927
11928The '``sext``' sign extends ``value`` to the type ``ty2``.
11929
11930Arguments:
11931""""""""""
11932
11933The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
11935the same number of integers. The bit size of the ``value`` must be
11936smaller than the bit size of the destination type, ``ty2``.
11937
11938Semantics:
11939""""""""""
11940
11941The '``sext``' instruction performs a sign extension by copying the sign
11942bit (highest order bit) of the ``value`` until it reaches the bit size
11943of the type ``ty2``.
11944
11945When sign extending from i1, the extension always results in -1 or 0.
11946
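The difference between zero and sign extension is easiest to see on raw bit
patterns. The C sketch below works on values held in a ``uint64_t``;
``zext_bits`` and ``sext_bits`` are hypothetical helper names, and widths are
limited to 64 bits for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension: the source value occupies the low from_bits bits and all
   new high bits are zero. */
uint64_t zext_bits(uint64_t v, unsigned from_bits) {
  assert(from_bits >= 1 && from_bits < 64);
  return v & ((UINT64_C(1) << from_bits) - 1);
}

/* Sign extension from from_bits to to_bits: copy the source sign bit into
   every new high bit, then keep only to_bits bits of the result. */
uint64_t sext_bits(uint64_t v, unsigned from_bits, unsigned to_bits) {
  assert(from_bits >= 1 && from_bits < to_bits && to_bits <= 64);
  uint64_t low = v & ((UINT64_C(1) << from_bits) - 1);
  uint64_t sign = UINT64_C(1) << (from_bits - 1);
  uint64_t extended = (low ^ sign) - sign; /* high bits become copies of the sign bit */
  return (to_bits == 64) ? extended
                         : extended & ((UINT64_C(1) << to_bits) - 1);
}
```

Extending the 8-bit pattern ``0xFF`` to 16 bits gives ``0x00FF`` with zero
extension and ``0xFFFF`` with sign extension; extending a set ``i1`` gives 1
and all-ones (that is, -1) respectively.
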
11947Example:
11948""""""""
11949
11950.. code-block:: llvm
11951
      %X = sext i8 -1 to i16               ; yields i16:-1 (65535 if read as unsigned)
11953      %Y = sext i1 true to i32             ; yields i32:-1
11954      %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
11955
11956.. _i_fptrunc:
11957
11958'``fptrunc .. to``' Instruction
11959^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11960
11961Syntax:
11962"""""""
11963
11964::
11965
11966      <result> = fptrunc [fast-math flags]* <ty> <value> to <ty2> ; yields ty2
11967
11968Overview:
11969"""""""""
11970
11971The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
11972
11973Arguments:
11974""""""""""
11975
11976The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
11977value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
11978The size of ``value`` must be larger than the size of ``ty2``. This
11979implies that ``fptrunc`` cannot be used to make a *no-op cast*.
11980
11981Semantics:
11982""""""""""
11983
11984The '``fptrunc``' instruction casts a ``value`` from a larger
11985:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
11986<t_floating>` type.
11987This instruction is assumed to execute in the default :ref:`floating-point
11988environment <floatenv>`.
11989
NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
11991NaN payload is propagated from the input ("Quieting NaN propagation" or
11992"Unchanged NaN propagation" cases), then the low order bits of the NaN payload
11993which cannot fit in the resulting type are discarded. Note that if discarding
11994the low order bits leads to an all-0 payload, this cannot be represented as a
11995signaling NaN (it would represent an infinity instead), so in that case
11996"Unchanged NaN propagation" is not possible.
11997
11998This instruction can also take any number of :ref:`fast-math
11999flags <fastmath>`, which are optimization hints to enable otherwise
12000unsafe floating-point optimizations.
12001
12002Example:
12003""""""""
12004
12005.. code-block:: llvm
12006
12007      %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
12008      %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity
12009
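The first example can be reproduced with a plain C narrowing conversion,
assuming an IEEE 754 implementation running in the default
round-to-nearest-even mode. ``narrow_to_float`` is a hypothetical helper name.

```c
/* Narrowing conversion from double to float. Under the default IEEE 754
   rounding mode, values that float cannot represent are rounded to the
   nearest representable float, with ties going to the even significand. */
float narrow_to_float(double d) {
  return (float)d;
}
```

The value 16777217.0 lies exactly halfway between the neighbouring floats
16777216.0 and 16777218.0, so the tie is resolved toward 16777216.0, whose
significand is even.
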
12010.. _i_fpext:
12011
12012'``fpext .. to``' Instruction
12013^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12014
12015Syntax:
12016"""""""
12017
12018::
12019
12020      <result> = fpext [fast-math flags]* <ty> <value> to <ty2> ; yields ty2
12021
12022Overview:
12023"""""""""
12024
12025The '``fpext``' extends a floating-point ``value`` to a larger floating-point
12026value.
12027
12028Arguments:
12029""""""""""
12030
12031The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
12032``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
12033to. The source type must be smaller than the destination type.
12034
12035Semantics:
12036""""""""""
12037
12038The '``fpext``' instruction extends the ``value`` from a smaller
12039:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
12040<t_floating>` type. The ``fpext`` cannot be used to make a
12041*no-op cast* because it always changes bits. Use ``bitcast`` to make a
12042*no-op cast* for a floating-point cast.
12043
NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
12045NaN payload is propagated from the input ("Quieting NaN propagation" or
12046"Unchanged NaN propagation" cases), then it is copied to the high order bits of
12047the resulting payload, and the remaining low order bits are zero.
12048
12049This instruction can also take any number of :ref:`fast-math
12050flags <fastmath>`, which are optimization hints to enable otherwise
12051unsafe floating-point optimizations.
12052
12053Example:
12054""""""""
12055
12056.. code-block:: llvm
12057
12058      %X = fpext float 3.125 to double         ; yields double:3.125000e+00
12059      %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
12060
12061'``fptoui .. to``' Instruction
12062^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12063
12064Syntax:
12065"""""""
12066
12067::
12068
12069      <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
12070
12071Overview:
12072"""""""""
12073
12074The '``fptoui``' converts a floating-point ``value`` to its unsigned
12075integer equivalent of type ``ty2``.
12076
12077Arguments:
12078""""""""""
12079
The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
12085
12086Semantics:
12087""""""""""
12088
12089The '``fptoui``' instruction converts its :ref:`floating-point
12090<t_floating>` operand into the nearest (rounding towards zero)
12091unsigned integer value. If the value cannot fit in ``ty2``, the result
12092is a :ref:`poison value <poisonvalues>`.
12093
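A sketch of the range rule for the ``float`` to ``i8`` case, written in C. The
function returns ``false`` where the IR result would be poison; its name and
its boolean convention are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models 'fptoui float %x to i8'. Returns false where the IR result would be
   poison (the value does not fit after rounding toward zero, or is NaN);
   otherwise stores the converted value in *out. */
bool fptoui_f32_to_i8(float x, uint8_t *out) {
  /* Rounding toward zero lands in [0, 255] exactly when -1 < x < 256.
     Comparisons with NaN are false, so NaN is rejected as well. */
  if (!(x > -1.0f && x < 256.0f))
    return false;
  *out = (uint8_t)x; /* C float-to-integer conversion also truncates toward zero */
  return true;
}
```

Note that inputs strictly between -1 and 0 are in range, because rounding
toward zero maps them to 0.
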
12094Example:
12095""""""""
12096
12097.. code-block:: llvm
12098
12099      %X = fptoui double 123.0 to i32      ; yields i32:123
      %Y = fptoui float 1.0E+300 to i1     ; yields poison (value cannot fit in i1)
      %Z = fptoui float 1.04E+17 to i8     ; yields poison (value cannot fit in i8)
12102
12103'``fptosi .. to``' Instruction
12104^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12105
12106Syntax:
12107"""""""
12108
12109::
12110
12111      <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
12112
12113Overview:
12114"""""""""
12115
12116The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
12117``value`` to type ``ty2``.
12118
12119Arguments:
12120""""""""""
12121
The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
12127
12128Semantics:
12129""""""""""
12130
12131The '``fptosi``' instruction converts its :ref:`floating-point
12132<t_floating>` operand into the nearest (rounding towards zero)
12133signed integer value. If the value cannot fit in ``ty2``, the result
12134is a :ref:`poison value <poisonvalues>`.
12135
12136Example:
12137""""""""
12138
12139.. code-block:: llvm
12140
12141      %X = fptosi double -123.0 to i32      ; yields i32:-123
      %Y = fptosi float 1.0E-247 to i1      ; yields i1:0 (magnitude below 1 rounds toward zero)
      %Z = fptosi float 1.04E+17 to i8      ; yields poison (value cannot fit in i8)
12144
12145'``uitofp .. to``' Instruction
12146^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12147
12148Syntax:
12149"""""""
12150
12151::
12152
12153      <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
12154
12155Overview:
12156"""""""""
12157
12158The '``uitofp``' instruction regards ``value`` as an unsigned integer
12159and converts that value to the ``ty2`` type.
12160
12161The ``nneg`` (non-negative) flag, if present, specifies that the
12162operand is non-negative. This property may be used by optimization
12163passes to later convert the ``uitofp`` into a ``sitofp``.
12164
12165Arguments:
12166""""""""""
12167
The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
12173
12174Semantics:
12175""""""""""
12176
12177The '``uitofp``' instruction interprets its operand as an unsigned
12178integer quantity and converts it to the corresponding floating-point
12179value. If the value cannot be exactly represented, it is rounded using
12180the default rounding mode.
12181
12182If the ``nneg`` flag is set, and the ``uitofp`` argument is negative,
12183the result is a poison value.
12184
12185
12186Example:
12187""""""""
12188
12189.. code-block:: llvm
12190
12191      %X = uitofp i32 257 to float         ; yields float:257.0
12192      %Y = uitofp i8 -1 to double          ; yields double:255.0
12193
      %a = uitofp nneg i32 256 to float    ; yields float:256.0
      %b = uitofp nneg i32 -256 to float   ; yields float poison
12196
12197'``sitofp .. to``' Instruction
12198^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12199
12200Syntax:
12201"""""""
12202
12203::
12204
12205      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
12206
12207Overview:
12208"""""""""
12209
12210The '``sitofp``' instruction regards ``value`` as a signed integer and
12211converts that value to the ``ty2`` type.
12212
12213Arguments:
12214""""""""""
12215
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
12221
12222Semantics:
12223""""""""""
12224
12225The '``sitofp``' instruction interprets its operand as a signed integer
12226quantity and converts it to the corresponding floating-point value. If the
12227value cannot be exactly represented, it is rounded using the default rounding
12228mode.
12229
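Since ``uitofp`` and ``sitofp`` differ only in how the operand bits are
interpreted, a side-by-side C sketch may help. The helper names are
hypothetical, and the rounding behaviour shown assumes an IEEE 754
implementation in the default round-to-nearest-even mode.

```c
#include <stdint.h>

/* 'uitofp i8 %v to double': read the 8 bits as an unsigned number. */
double uitofp_i8(uint8_t bits) {
  return (double)bits;
}

/* 'sitofp i8 %v to double': read the same 8 bits as two's complement. */
double sitofp_i8(uint8_t bits) {
  return (bits & 0x80u) ? (double)bits - 256.0 : (double)bits;
}

/* 'uitofp i32 %v to float': values that float cannot represent exactly are
   rounded according to the rounding mode. */
float uitofp_i32(uint32_t v) {
  return (float)v;
}
```

The bit pattern ``0xFF`` converts to 255.0 when treated as unsigned and to
-1.0 when treated as signed, matching the ``i8 -1`` examples of both
instructions.
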
12230Example:
12231""""""""
12232
12233.. code-block:: llvm
12234
12235      %X = sitofp i32 257 to float         ; yields float:257.0
12236      %Y = sitofp i8 -1 to double          ; yields double:-1.0
12237
12238.. _i_ptrtoint:
12239
12240'``ptrtoint .. to``' Instruction
12241^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12242
12243Syntax:
12244"""""""
12245
12246::
12247
12248      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
12249
12250Overview:
12251"""""""""
12252
12253The '``ptrtoint``' instruction converts the pointer or a vector of
12254pointers ``value`` to the integer (or vector of integers) type ``ty2``.
12255
12256Arguments:
12257""""""""""
12258
12259The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
12260a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
12261type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
12262a vector of integers type.
12263
12264Semantics:
12265""""""""""
12266
12267The '``ptrtoint``' instruction converts ``value`` to integer type
12268``ty2`` by interpreting the pointer value as an integer and either
12269truncating or zero extending that value to the size of the integer type.
12270If ``value`` is smaller than ``ty2`` then a zero extension is done. If
12271``value`` is larger than ``ty2`` then a truncation is done. If they are
12272the same size, then nothing is done (*no-op cast*) other than a type
12273change.
12274
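The size rule can be modelled with two masks. In the C sketch below an address
is carried in a ``uint64_t``, and the pointer and integer widths are passed
explicitly; ``ptrtoint_model`` is a hypothetical name and the model ignores
everything about pointers except their bits.

```c
#include <assert.h>
#include <stdint.h>

/* Converts an address of ptr_bits bits to an integer of int_bits bits:
   zero-extend when the integer is wider, truncate when it is narrower. */
uint64_t ptrtoint_model(uint64_t addr, unsigned ptr_bits, unsigned int_bits) {
  assert(ptr_bits >= 1 && ptr_bits <= 64 && int_bits >= 1 && int_bits <= 64);
  uint64_t ptr_mask =
      (ptr_bits == 64) ? UINT64_MAX : (UINT64_C(1) << ptr_bits) - 1;
  uint64_t int_mask =
      (int_bits == 64) ? UINT64_MAX : (UINT64_C(1) << int_bits) - 1;
  return (addr & ptr_mask) & int_mask;
}
```

On a 32-bit target, converting the address ``0xDEADBEEF`` to ``i8`` keeps only
``0xEF``, while converting it to ``i64`` fills the upper 32 bits with zeros.
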
12275Example:
12276""""""""
12277
12278.. code-block:: llvm
12279
12280      %X = ptrtoint ptr %P to i8                         ; yields truncation on 32-bit architecture
12281      %Y = ptrtoint ptr %P to i64                        ; yields zero extension on 32-bit architecture
12282      %Z = ptrtoint <4 x ptr> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
12283
12284.. _i_inttoptr:
12285
12286'``inttoptr .. to``' Instruction
12287^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12288
12289Syntax:
12290"""""""
12291
12292::
12293
12294      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2
12295
12296Overview:
12297"""""""""
12298
12299The '``inttoptr``' instruction converts an integer ``value`` to a
12300pointer type, ``ty2``.
12301
12302Arguments:
12303""""""""""
12304
12305The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
12306cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
12307type.
12308
12309The optional ``!dereferenceable`` metadata must reference a single metadata
12310name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
12311entry.
12312See ``dereferenceable`` metadata.
12313
12314The optional ``!dereferenceable_or_null`` metadata must reference a single
12315metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
12316``i64`` entry.
12317See ``dereferenceable_or_null`` metadata.
12318
12319Semantics:
12320""""""""""
12321
12322The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
12323applying either a zero extension or a truncation depending on the size
12324of the integer ``value``. If ``value`` is larger than the size of a
12325pointer then a truncation is done. If ``value`` is smaller than the size
12326of a pointer then a zero extension is done. If they are the same size,
12327nothing is done (*no-op cast*).
12328
12329Example:
12330""""""""
12331
12332.. code-block:: llvm
12333
12334      %X = inttoptr i32 255 to ptr           ; yields zero extension on 64-bit architecture
12335      %Y = inttoptr i32 255 to ptr           ; yields no-op on 32-bit architecture
12336      %Z = inttoptr i64 0 to ptr             ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x ptr>; yields a vector of four pointers, one per element of G
12338
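The following sketch shows the optional metadata in use. The hypothetical
function ``@load_at`` converts an integer address to a pointer and attaches
``!dereferenceable`` metadata whose node holds a single ``i64`` byte count. It
assumes the caller guarantees that at least 4 bytes are dereferenceable at
``%addr``; if that does not hold, the metadata is incorrect.

.. code-block:: llvm

      define i32 @load_at(i64 %addr) {
        ; Convert the integer to a pointer; !0 asserts 4 dereferenceable bytes.
        %p = inttoptr i64 %addr to ptr, !dereferenceable !0
        %v = load i32, ptr %p
        ret i32 %v
      }

      ; Metadata node with one i64 entry: the number of dereferenceable bytes.
      !0 = !{i64 4}
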
12339.. _i_bitcast:
12340
12341'``bitcast .. to``' Instruction
12342^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12343
12344Syntax:
12345"""""""
12346
12347::
12348
12349      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
12350
12351Overview:
12352"""""""""
12353
12354The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
12355changing any bits.
12356
12357Arguments:
12358""""""""""
12359
12360The '``bitcast``' instruction takes a value to cast, which must be a
12361non-aggregate first class value, and a type to cast it to, which must
12362also be a non-aggregate :ref:`first class <t_firstclass>` type. The
12363bit sizes of ``value`` and the destination type, ``ty2``, must be
12364identical. If the source type is a pointer, the destination type must
12365also be a pointer of the same size. This instruction supports bitwise
12366conversion of vectors to integers and to vectors of other types (as
12367long as they have the same size).
12368
12369Semantics:
12370""""""""""
12371
12372The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
12373is always a *no-op cast* because no bits change with this
12374conversion. The conversion is done as if the ``value`` had been stored
12375to memory and read back as type ``ty2``. Pointer (or vector of
12376pointers) types may only be converted to other pointer (or vector of
12377pointers) types with the same address space through this instruction.
12378To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
12379or :ref:`ptrtoint <i_ptrtoint>` instructions first.
12380
12381There is a caveat for bitcasts involving vector types in relation to
12382endianness. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
12383of the vector in the least significant bits of the i16 for little-endian while
12384element zero ends up in the most significant bits for big-endian.
12385
12386Example:
12387""""""""
12388
12389.. code-block:: text
12390
12391      %X = bitcast i8 255 to i8         ; yields i8 :-1
12392      %Y = bitcast i32* %x to i16*      ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64  ; yields i64: %V (depends on endianness)
      %W = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>
12395
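The two functions below are a small sketch of the cases described above; the
names ``@float_bits`` and ``@pack_bytes`` are illustrative. The first
reinterprets the bits of a ``float`` as an ``i32``, and the second shows the
endianness caveat for a vector-to-integer cast.

.. code-block:: llvm

      define i32 @float_bits(float %f) {
        ; Same 32 bits, viewed as an integer instead of a float.
        %bits = bitcast float %f to i32
        ret i32 %bits
      }

      define i16 @pack_bytes(<2 x i8> %v) {
        ; For %v = <i8 1, i8 2> this yields 0x0201 on a little-endian
        ; target and 0x0102 on a big-endian target.
        %packed = bitcast <2 x i8> %v to i16
        ret i16 %packed
      }
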
12396.. _i_addrspacecast:
12397
12398'``addrspacecast .. to``' Instruction
12399^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12400
12401Syntax:
12402"""""""
12403
12404::
12405
12406      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2
12407
12408Overview:
12409"""""""""
12410
12411The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
12412address space ``n`` to type ``pty2`` in address space ``m``.
12413
12414Arguments:
12415""""""""""
12416
The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer or vector of pointers type to cast it to, which
must have a different address space.
12420
12421Semantics:
12422""""""""""
12423
12424The '``addrspacecast``' instruction converts the pointer value
12425``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
12426value modification, depending on the target and the address space
12427pair. Pointer conversions within the same address space must be
12428performed with the ``bitcast`` instruction. Note that if the address
12429space conversion produces a dereferenceable result then both result
12430and operand refer to the same memory location. The conversion must
12431have no side effects, and must not capture the value of the pointer.
12432
12433If the source is :ref:`poison <poisonvalues>`, the result is
12434:ref:`poison <poisonvalues>`.
12435
12436If the source is not :ref:`poison <poisonvalues>`, and both source and
12437destination are :ref:`integral pointers <nointptrtype>`, and the
12438result pointer is dereferenceable, the cast is assumed to be
12439reversible (i.e. casting the result back to the original address space
12440should yield the original bit pattern).
12441
12442Example:
12443""""""""
12444
12445.. code-block:: llvm
12446
12447      %X = addrspacecast ptr %x to ptr addrspace(1)
12448      %Y = addrspacecast ptr addrspace(1) %y to ptr addrspace(2)
12449      %Z = addrspacecast <4 x ptr> %z to <4 x ptr addrspace(3)>
12450
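The reversibility assumption can be sketched with a round trip. The function
name ``@round_trip`` and the choice of address space 1 are illustrative; the
meaning of any non-zero address space is target specific.

.. code-block:: llvm

      define ptr @round_trip(ptr %p) {
        ; Cast from the default address space (0) into address space 1.
        %q = addrspacecast ptr %p to ptr addrspace(1)
        ; Cast back. If %p is not poison, both address spaces are integral,
        ; and %q is dereferenceable, %r is assumed to have the same bit
        ; pattern as %p.
        %r = addrspacecast ptr addrspace(1) %q to ptr
        ret ptr %r
      }
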
12451.. _otherops:
12452
12453Other Operations
12454----------------
12455
12456The instructions in this category are the "miscellaneous" instructions,
12457which defy better classification.
12458
12459.. _i_icmp:
12460
12461'``icmp``' Instruction
12462^^^^^^^^^^^^^^^^^^^^^^
12463
12464Syntax:
12465"""""""
12466
12467::
12468
12469      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
12470      <result> = icmp samesign <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
12471
12472Overview:
12473"""""""""
12474
12475The '``icmp``' instruction returns a boolean value or a vector of
12476boolean values based on comparison of its two integer, integer vector,
12477pointer, or pointer vector operands.
12478
12479Arguments:
12480""""""""""
12481
12482The '``icmp``' instruction takes three operands. The first operand is
12483the condition code indicating the kind of comparison to perform. It is
12484not a value, just a keyword. The possible condition codes are:
12485
12486.. _icmp_md_cc:
12487
12488#. ``eq``: equal
12489#. ``ne``: not equal
12490#. ``ugt``: unsigned greater than
12491#. ``uge``: unsigned greater or equal
12492#. ``ult``: unsigned less than
12493#. ``ule``: unsigned less or equal
12494#. ``sgt``: signed greater than
12495#. ``sge``: signed greater or equal
12496#. ``slt``: signed less than
12497#. ``sle``: signed less or equal
12498
The remaining two arguments must be :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` typed, or a :ref:`vector <t_vector>` of integers
or pointers. They must also be identical types.
12502
12503Semantics:
12504""""""""""
12505
The '``icmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. The comparison always yields either an
:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
12509
12510.. _icmp_md_cc_sem:
12511
12512#. ``eq``: yields ``true`` if the operands are equal, ``false``
12513   otherwise. No sign interpretation is necessary or performed.
12514#. ``ne``: yields ``true`` if the operands are unequal, ``false``
12515   otherwise. No sign interpretation is necessary or performed.
12516#. ``ugt``: interprets the operands as unsigned values and yields
12517   ``true`` if ``op1`` is greater than ``op2``.
12518#. ``uge``: interprets the operands as unsigned values and yields
12519   ``true`` if ``op1`` is greater than or equal to ``op2``.
12520#. ``ult``: interprets the operands as unsigned values and yields
12521   ``true`` if ``op1`` is less than ``op2``.
12522#. ``ule``: interprets the operands as unsigned values and yields
12523   ``true`` if ``op1`` is less than or equal to ``op2``.
12524#. ``sgt``: interprets the operands as signed values and yields ``true``
12525   if ``op1`` is greater than ``op2``.
12526#. ``sge``: interprets the operands as signed values and yields ``true``
12527   if ``op1`` is greater than or equal to ``op2``.
12528#. ``slt``: interprets the operands as signed values and yields ``true``
12529   if ``op1`` is less than ``op2``.
12530#. ``sle``: interprets the operands as signed values and yields ``true``
12531   if ``op1`` is less than or equal to ``op2``.
12532
12533If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
12534are compared as if they were integers.
12535
If the operands are integer or pointer vectors, then they are compared
element by element. The result is an ``i1`` vector with the same number of
elements as the values being compared. Otherwise, the result is an ``i1``.
12539
12540If the ``samesign`` keyword is present and the operands are not of the
12541same sign then the result is a :ref:`poison value <poisonvalues>`.
12542
12543Example:
12544""""""""
12545
12546.. code-block:: text
12547
12548      <result> = icmp eq i32 4, 5          ; yields: result=false
12549      <result> = icmp ne ptr %X, %X        ; yields: result=false
12550      <result> = icmp ult i16  4, 5        ; yields: result=true
12551      <result> = icmp sgt i16  4, 5        ; yields: result=false
12552      <result> = icmp ule i16 -4, 5        ; yields: result=false
12553      <result> = icmp sge i16  4, 5        ; yields: result=false
12554
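The following sketch shows how the choice of condition code and the
``samesign`` keyword affect a comparison inside complete functions. All
function names are hypothetical.

.. code-block:: llvm

      define i1 @in_range(i32 %i, i32 %n) {
        ; Unsigned comparison: if %n is non-negative when read as signed,
        ; this is true exactly when 0 <= %i < %n in the signed sense,
        ; because a negative %i is a very large unsigned value.
        %ok = icmp ult i32 %i, %n
        ret i1 %ok
      }

      define <4 x i1> @lanes_equal(<4 x i32> %a, <4 x i32> %b) {
        ; Vector operands are compared element by element.
        %eq = icmp eq <4 x i32> %a, %b
        ret <4 x i1> %eq
      }

      define i1 @less_same_sign(i32 %a, i32 %b) {
        ; The result is poison if %a and %b differ in sign.
        %lt = icmp samesign ult i32 %a, %b
        ret i1 %lt
      }
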
12555.. _i_fcmp:
12556
12557'``fcmp``' Instruction
12558^^^^^^^^^^^^^^^^^^^^^^
12559
12560Syntax:
12561"""""""
12562
12563::
12564
12565      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result
12566
12567Overview:
12568"""""""""
12569
12570The '``fcmp``' instruction returns a boolean value or vector of boolean
12571values based on comparison of its operands.
12572
12573If the operands are floating-point scalars, then the result type is a
12574boolean (:ref:`i1 <t_integer>`).
12575
12576If the operands are floating-point vectors, then the result type is a
12577vector of boolean with the same number of elements as the operands being
12578compared.
12579
12580Arguments:
12581""""""""""
12582
12583The '``fcmp``' instruction takes three operands. The first operand is
12584the condition code indicating the kind of comparison to perform. It is
12585not a value, just a keyword. The possible condition codes are:
12586
12587#. ``false``: no comparison, always returns false
12588#. ``oeq``: ordered and equal
12589#. ``ogt``: ordered and greater than
12590#. ``oge``: ordered and greater than or equal
12591#. ``olt``: ordered and less than
12592#. ``ole``: ordered and less than or equal
12593#. ``one``: ordered and not equal
12594#. ``ord``: ordered (no nans)
12595#. ``ueq``: unordered or equal
12596#. ``ugt``: unordered or greater than
12597#. ``uge``: unordered or greater than or equal
12598#. ``ult``: unordered or less than
12599#. ``ule``: unordered or less than or equal
12600#. ``une``: unordered or not equal
12601#. ``uno``: unordered (either nans)
12602#. ``true``: no comparison, always returns true
12603
12604*Ordered* means that neither operand is a QNAN while *unordered* means
12605that either operand may be a QNAN.
12606
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.
12610
12611Semantics:
12612""""""""""
12613
12614The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
12615condition code given as ``cond``. If the operands are vectors, then the
12616vectors are compared element by element. Each comparison performed
12617always yields an :ref:`i1 <t_integer>` result, as follows:
12618
12619#. ``false``: always yields ``false``, regardless of operands.
12620#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
12621   is equal to ``op2``.
12622#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
12623   is greater than ``op2``.
12624#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
12625   is greater than or equal to ``op2``.
12626#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
12627   is less than ``op2``.
12628#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
12629   is less than or equal to ``op2``.
12630#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
12631   is not equal to ``op2``.
12632#. ``ord``: yields ``true`` if both operands are not a QNAN.
12633#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
12634   equal to ``op2``.
12635#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
12636   greater than ``op2``.
12637#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
12638   greater than or equal to ``op2``.
12639#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
12640   less than ``op2``.
12641#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
12642   less than or equal to ``op2``.
12643#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
12644   not equal to ``op2``.
12645#. ``uno``: yields ``true`` if either operand is a QNAN.
12646#. ``true``: always yields ``true``, regardless of operands.
12647
12648The ``fcmp`` instruction can also optionally take any number of
12649:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
12650otherwise unsafe floating-point optimizations.
12651
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
12653only flags that have any effect on its semantics are those that allow
12654assumptions to be made about the values of input arguments; namely
12655``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
12656
12657Example:
12658""""""""
12659
12660.. code-block:: text
12661
12662      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
12663      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
12664      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
12665      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
12666
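The difference between ordered and unordered predicates matters only when a
NaN is involved. The sketch below, with hypothetical function names, shows a
NaN test, an unordered comparison, and a vector comparison carrying a fast-math
flag.

.. code-block:: llvm

      define i1 @is_nan(double %x) {
        ; 'uno' is true if either operand is a NaN; comparing %x with
        ; itself therefore tests whether %x is a NaN.
        %nan = fcmp uno double %x, %x
        ret i1 %nan
      }

      define i1 @ge_or_unordered(double %a, double %b) {
        ; True when %a >= %b, and also when either operand is a NaN.
        %r = fcmp uge double %a, %b
        ret i1 %r
      }

      define <2 x i1> @lt_lanes(<2 x float> %a, <2 x float> %b) {
        ; Element-wise ordered less-than. With 'nnan', a lane whose
        ; operand is a NaN produces poison instead of false.
        %r = fcmp nnan olt <2 x float> %a, %b
        ret <2 x i1> %r
      }
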
12667.. _i_phi:
12668
12669'``phi``' Instruction
12670^^^^^^^^^^^^^^^^^^^^^
12671
12672Syntax:
12673"""""""
12674
12675::
12676
12677      <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
12678
12679Overview:
12680"""""""""
12681
12682The '``phi``' instruction is used to implement the φ node in the SSA
12683graph representing the function.
12684
12685Arguments:
12686""""""""""
12687
12688The type of the incoming values is specified with the first type field.
12689After this, the '``phi``' instruction takes a list of pairs as
12690arguments, with one pair for each predecessor basic block of the current
12691block. Only values of :ref:`first class <t_firstclass>` type may be used as
12692the value arguments to the PHI node. Only labels may be used as the
12693label arguments.
12694
12695There must be no non-phi instructions between the start of a basic block
12696and the PHI instructions: i.e. PHI instructions must be first in a basic
12697block.
12698
12699For the purposes of the SSA form, the use of each incoming value is
12700deemed to occur on the edge from the corresponding predecessor block to
12701the current block (but after any definition of an '``invoke``'
12702instruction's return value on the same edge).
12703
12704The optional ``fast-math-flags`` marker indicates that the phi has one
12705or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
12706to enable otherwise unsafe floating-point optimizations. Fast-math-flags
12707are only valid for phis that return :ref:`supported floating-point types
12708<fastmath_return_types>`.
12709
12710Semantics:
12711""""""""""
12712
12713At runtime, the '``phi``' instruction logically takes on the value
12714specified by the pair corresponding to the predecessor basic block that
12715executed just prior to the current block.
12716
12717Example:
12718""""""""
12719
12720.. code-block:: llvm
12721
12722    Loop:       ; Infinite loop that counts from 0 on up...
12723      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
12724      %nextindvar = add i32 %indvar, 1
12725      br label %Loop
12726
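The loop above never terminates. As a complementary sketch, the hypothetical
function ``@sum_below`` uses two ``phi`` instructions to carry an induction
variable and an accumulator around a loop that exits. Each ``phi`` has one
incoming pair per predecessor of ``%loop``: the ``%entry`` block and the loop's
own back edge.

.. code-block:: llvm

      define i32 @sum_below(i32 %n) {
      entry:
        br label %loop

      loop:
        ; Both phis come first in the block, before any non-phi instruction.
        %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %i
        %i.next = add i32 %i, 1
        ; Leave the loop once the next index reaches %n.
        %done = icmp uge i32 %i.next, %n
        br i1 %done, label %exit, label %loop

      exit:
        ; Sum of 0, 1, ..., %n - 1 (with wrapping arithmetic); 0 if %n is 0.
        ret i32 %acc.next
      }
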
12727.. _i_select:
12728
12729'``select``' Instruction
12730^^^^^^^^^^^^^^^^^^^^^^^^
12731
12732Syntax:
12733"""""""
12734
12735::
12736
12737      <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
12738
12739      selty is either i1 or {<N x i1>}
12740
12741Overview:
12742"""""""""
12743
12744The '``select``' instruction is used to choose one value based on a
12745condition, without IR-level branching.
12746
12747Arguments:
12748""""""""""
12749
12750The '``select``' instruction requires an 'i1' value or a vector of 'i1'
12751values indicating the condition, and two values of the same :ref:`first
12752class <t_firstclass>` type.
12753
The optional ``fast-math flags`` marker indicates that the select has one or
more :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
otherwise unsafe floating-point optimizations. Fast-math flags are only valid
for selects that return :ref:`supported floating-point types
<fastmath_return_types>`.
12759
12760Semantics:
12761""""""""""
12762
12763If the condition is an i1 and it evaluates to 1, the instruction returns
12764the first value argument; otherwise, it returns the second value
12765argument.
12766
12767If the condition is a vector of i1, then the value arguments must be
12768vectors of the same size, and the selection is done element by element.
12769
12770If the condition is an i1 and the value arguments are vectors of the
12771same size, then an entire vector is selected.
12772
12773Example:
12774""""""""
12775
12776.. code-block:: llvm
12777
12778      %X = select i1 true, i8 17, i8 42          ; yields i8:17
12779
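The three condition/value combinations described under Semantics can be
sketched as follows. The function names are hypothetical.

.. code-block:: llvm

      define i32 @smax(i32 %a, i32 %b) {
        ; Scalar condition, scalar values: a branch-free signed maximum.
        %gt = icmp sgt i32 %a, %b
        %max = select i1 %gt, i32 %a, i32 %b
        ret i32 %max
      }

      define <4 x i32> @blend(<4 x i1> %mask, <4 x i32> %a, <4 x i32> %b) {
        ; Vector condition: each lane picks from %a or %b independently.
        %r = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %r
      }

      define <4 x i32> @pick_whole(i1 %c, <4 x i32> %a, <4 x i32> %b) {
        ; Scalar condition, vector values: an entire vector is selected.
        %r = select i1 %c, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %r
      }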
12780
12781.. _i_freeze:
12782
12783'``freeze``' Instruction
12784^^^^^^^^^^^^^^^^^^^^^^^^
12785
12786Syntax:
12787"""""""
12788
12789::
12790
12791      <result> = freeze ty <val>    ; yields ty:result
12792
12793Overview:
12794"""""""""
12795
12796The '``freeze``' instruction is used to stop propagation of
12797:ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
12798
12799Arguments:
12800""""""""""
12801
12802The '``freeze``' instruction takes a single argument.
12803
12804Semantics:
12805""""""""""
12806
12807If the argument is ``undef`` or ``poison``, '``freeze``' returns an
12808arbitrary, but fixed, value of type '``ty``'.
12809Otherwise, this instruction is a no-op and returns the input argument.
12810All uses of a value returned by the same '``freeze``' instruction are
12811guaranteed to always observe the same value, while different '``freeze``'
12812instructions may yield different values.
12813
12814While ``undef`` and ``poison`` pointers can be frozen, the result is a
12815non-dereferenceable pointer. See the
12816:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
12817If an aggregate value or vector is frozen, the operand is frozen element-wise.
12818The padding of an aggregate isn't considered, since it isn't visible
12819without storing it into memory and loading it with a different type.
12820
12821
12822Example:
12823""""""""
12824
12825.. code-block:: text
12826
12827      %w = i32 undef
12828      %x = freeze i32 %w
12829      %y = add i32 %w, %w         ; undef
12830      %z = add i32 %x, %x         ; even number because all uses of %x observe
12831                                  ; the same value
12832      %x2 = freeze i32 %w
12833      %cmp = icmp eq i32 %x, %x2  ; can be true or false
12834
12835      ; example with vectors
12836      %v = <2 x i32> <i32 undef, i32 poison>
12837      %a = extractelement <2 x i32> %v, i32 0    ; undef
12838      %b = extractelement <2 x i32> %v, i32 1    ; poison
12839      %add = add i32 %a, %a                      ; undef
12840
12841      %v.fr = freeze <2 x i32> %v                ; element-wise freeze
12842      %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
12843      %add.f = add i32 %d, %d                    ; even number
12844
12845      ; branching on frozen value
12846      %poison = add nsw i1 %k, undef   ; poison
12847      %c = freeze i1 %poison
12848      br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
12849
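The listing above is schematic and is not parseable IR. As a minimal sketch of
the last case in a complete function, the hypothetical ``@pick`` below freezes
a condition before branching on it, so the branch is well defined even if the
caller passes ``undef`` or ``poison`` for ``%c``.

.. code-block:: llvm

      define i32 @pick(i1 %c, i32 %a, i32 %b) {
      entry:
        ; If %c is undef or poison, %c.fr is an arbitrary but fixed i1;
        ; otherwise it is simply %c.
        %c.fr = freeze i1 %c
        br i1 %c.fr, label %then, label %else

      then:
        ret i32 %a

      else:
        ret i32 %b
      }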
12850
12851.. _i_call:
12852
12853'``call``' Instruction
12854^^^^^^^^^^^^^^^^^^^^^^
12855
12856Syntax:
12857"""""""
12858
12859::
12860
12861      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
12862                 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
12863
12864Overview:
12865"""""""""
12866
12867The '``call``' instruction represents a simple function call.
12868
12869Arguments:
12870""""""""""
12871
12872This instruction requires several arguments:
12873
12874#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
12875   should perform tail call optimization. The ``tail`` marker is a hint that
12876   `can be ignored <CodeGenerator.html#tail-call-optimization>`_. The
12877   ``musttail`` marker means that the call must be tail call optimized in order
12878   for the program to be correct. This is true even in the presence of
12879   attributes like "disable-tail-calls". The ``musttail`` marker provides these
12880   guarantees:
12881
12882   -  The call will not cause unbounded stack growth if it is part of a
12883      recursive cycle in the call graph.
12884   -  Arguments with the :ref:`inalloca <attr_inalloca>` or
12885      :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
12886   -  If the musttail call appears in a function with the ``"thunk"`` attribute
12887      and the caller and callee both have varargs, then any unprototyped
12888      arguments in register or memory are forwarded to the callee. Similarly,
12889      the return value of the callee is returned to the caller's caller, even
12890      if a void return type is in use.
12891
12892   Both markers imply that the callee does not access allocas, va_args, or
12893   byval arguments from the caller. As an exception to that, an alloca or byval
12894   argument may be passed to the callee as a byval argument, which can be
12895   dereferenced inside the callee. For example:
12896
12897   .. code-block:: llvm
12898
12899      declare void @take_byval(ptr byval(i64))
12900      declare void @take_ptr(ptr)
12901
12902      ; Invalid (assuming @take_ptr dereferences the pointer), because %local
12903      ; may be de-allocated before the call to @take_ptr.
12904      define void @invalid_alloca() {
12905      entry:
12906        %local = alloca i64
12907        tail call void @take_ptr(ptr %local)
12908        ret void
12909      }
12910
12911      ; Valid, the byval attribute causes the memory allocated by %local to be
12912      ; copied into @take_byval's stack frame.
12913      define void @byval_alloca() {
12914      entry:
12915        %local = alloca i64
12916        tail call void @take_byval(ptr byval(i64) %local)
12917        ret void
12918      }
12919
12920      ; Invalid, because @use_global_va_list uses the variadic arguments from
12921      ; @invalid_va_list.
12922      %struct.va_list = type { ptr }
12923      @va_list = external global %struct.va_list
12924      define void @use_global_va_list() {
12925      entry:
12926        %arg = va_arg ptr @va_list, i64
12927        ret void
12928      }
12929      define void @invalid_va_list(i32 %a, ...) {
12930      entry:
12931        call void @llvm.va_start.p0(ptr @va_list)
12932        tail call void @use_global_va_list()
12933        ret void
12934      }
12935
12936      ; Valid, byval argument forwarded to tail call as another byval argument.
12937      define void @forward_byval(ptr byval(i64) %x) {
12938      entry:
12939        tail call void @take_byval(ptr byval(i64) %x)
12940        ret void
12941      }
12942
12943      ; Invalid (assuming @take_ptr dereferences the pointer), byval argument
12944      ; passed to tail callee as non-byval ptr.
12945      define void @invalid_byval(ptr byval(i64) %x) {
12946      entry:
12947        tail call void @take_ptr(ptr %x)
12948        ret void
12949      }
12950
12951   Calls marked ``musttail`` must obey the following additional rules:
12952
12953   -  The call must immediately precede a :ref:`ret <i_ret>` instruction,
12954      or a pointer bitcast followed by a ret instruction.
12955   -  The ret instruction must return the (possibly bitcasted) value
12956      produced by the call, undef, or void.
12957   -  The calling conventions of the caller and callee must match.
12958   -  The callee must be varargs iff the caller is varargs. Bitcasting a
12959      non-varargs function to the appropriate varargs type is legal so
12960      long as the non-varargs prefixes obey the other rules.
   -  The return type must not undergo automatic conversion to an ``sret`` pointer.
12962
   In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:
12964
12965   -  All ABI-impacting function attributes, such as sret, byval, inreg,
12966      returned, and inalloca, must match.
12967   -  The caller and callee prototypes must match. Pointer types of parameters
12968      or return types may differ in pointee type, but not in address space.
12969
   On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:

   -  Only these ABI-impacting attributes are allowed: sret, byval,
      swiftself, and swiftasync.
12974   -  Prototypes are not required to match.
12975
12976   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
12977   the following conditions are met:
12978
12979   -  Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
12980   -  The call is in tail position (ret immediately follows call and ret
12981      uses value of call or is void).
12982   -  Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is
12983      ``true``, or the calling convention is ``tailcc``.
12984   -  `Platform-specific constraints are met.
12985      <CodeGenerator.html#tail-call-optimization>`_
12986
12987#. The optional ``notail`` marker indicates that the optimizers should not add
12988   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
12989   call optimization from being performed on the call.
12990
12991#. The optional ``fast-math flags`` marker indicates that the call has one or more
12992   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
12993   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
12994   for calls that return :ref:`supported floating-point types <fastmath_return_types>`.
12995
12996#. The optional "cconv" marker indicates which :ref:`calling
12997   convention <callingconv>` the call should use. If none is
12998   specified, the call defaults to using C calling conventions. The
12999   calling convention of the call must match the calling convention of
13000   the target function, or else the behavior is undefined.
13001#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
13002   values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
13003   attributes are valid here.
13004#. The optional addrspace attribute can be used to indicate the address space
13005   of the called function. If it is not specified, the program address space
13006   from the :ref:`datalayout string<langref_datalayout>` will be used.
13007#. '``ty``': the type of the call instruction itself which is also the
13008   type of the return value. Functions that return no value are marked
13009   ``void``.
13010#. '``fnty``': shall be the signature of the function being called. The
13011   argument types must match the types implied by this signature. This
13012   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect calls, which call an arbitrary pointer-to-function value,
   are equally possible.
13017#. '``function args``': argument list whose types match the function
13018   signature argument types and parameter attributes. All arguments must
13019   be of :ref:`first class <t_firstclass>` type. If the function signature
13020   indicates the function accepts a variable number of arguments, the
13021   extra arguments can be specified.
13022#. The optional :ref:`function attributes <fnattrs>` list.
13023#. The optional :ref:`operand bundles <opbundles>` list.
13024
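The ``musttail`` and ``notail`` markers described in the list above can be
sketched as follows. The functions ``@callee``, ``@log_event``, and
``@forward`` are hypothetical. Caller and callee use the default C calling
convention and have matching prototypes, and the ``musttail`` call immediately
precedes a ``ret`` that returns its value.

.. code-block:: llvm

      declare i32 @callee(i32)
      declare void @log_event()

      define i32 @forward(i32 %x) {
        ; Optimizers should not add a tail marker to this call.
        notail call void @log_event()
        ; Must be emitted as a tail call for the program to be correct.
        %r = musttail call i32 @callee(i32 %x)
        ret i32 %r
      }
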
13025Semantics:
13026""""""""""
13027
13028The '``call``' instruction is used to cause control flow to transfer to
13029a specified function, with its incoming arguments bound to the specified
13030values. Upon a '``ret``' instruction in the called function, control
13031flow continues with the instruction after the function call, and the
13032return value of the function is bound to the result argument.
13033
13034Example:
13035""""""""
13036
13037.. code-block:: llvm
13038
13039      %retval = call i32 @test(i32 %argc)
13040      call i32 (ptr, ...) @printf(ptr %msg, i32 12, i8 42)        ; yields i32
13041      %X = tail call i32 @foo()                                    ; yields i32
13042      %Y = tail call fastcc i32 @foo()  ; yields i32
13043      call void %foo(i8 signext 97)
13044
13045      %struct.A = type { i32, i8 }
13046      %r = call %struct.A @foo()                        ; yields { i32, i8 }
13047      %gr = extractvalue %struct.A %r, 0                ; yields i32
13048      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; Return value is zero extended
13051
LLVM treats calls to some functions with names and arguments that match
13053the standard C99 library as being the C99 library functions, and may
13054perform optimizations or generate code for them under that assumption.
13055This is something we'd like to change in the future to provide better
13056support for freestanding environments and non-C-based languages.
13057
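The fragments in the example above are independent of one another. As a sketch
of how direct, indirect, and varargs calls fit together, the small module below
should be self-contained; the names ``@twice``, ``@apply``, and ``@.fmt`` are
illustrative, and ``@printf`` is assumed to resolve to the C library function.
When run (for example with ``lli``), it is expected to print ``42``.

.. code-block:: llvm

      @.fmt = private unnamed_addr constant [4 x i8] c"%d\0A\00"

      declare i32 @printf(ptr, ...)

      define i32 @twice(i32 %x) {
        %r = shl i32 %x, 1
        ret i32 %r
      }

      define i32 @apply(ptr %fn, i32 %x) {
        ; Indirect call through a function pointer value.
        %r = call i32 %fn(i32 %x)
        ret i32 %r
      }

      define i32 @main() {
        ; Direct call, passing @twice as a function pointer argument.
        %v = call i32 @apply(ptr @twice, i32 21)
        ; Varargs callee: the full function type is spelled out.
        %n = call i32 (ptr, ...) @printf(ptr @.fmt, i32 %v)
        ret i32 0
      }
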
13058.. _i_va_arg:
13059
13060'``va_arg``' Instruction
13061^^^^^^^^^^^^^^^^^^^^^^^^
13062
13063Syntax:
13064"""""""
13065
13066::
13067
13068      <resultval> = va_arg <va_list*> <arglist>, <argty>
13069
13070Overview:
13071"""""""""
13072
13073The '``va_arg``' instruction is used to access arguments passed through
13074the "variable argument" area of a function call. It is used to implement
13075the ``va_arg`` macro in C.
13076
13077Arguments:
13078""""""""""
13079
13080This instruction takes a ``va_list*`` value and the type of the
13081argument. It returns a value of the specified argument type and
13082increments the ``va_list`` to point to the next argument. The actual
13083type of ``va_list`` is target specific.
13084
13085Semantics:
13086""""""""""
13087
13088The '``va_arg``' instruction loads an argument of the specified type
13089from the specified ``va_list`` and causes the ``va_list`` to point to
13090the next argument. For more information, see the variable argument
13091handling :ref:`Intrinsic Functions <int_varargs>`.
13092
It is legal to use this instruction in a function that does not itself
take a variable number of arguments, such as the ``vfprintf``
function, which receives an already-initialized ``va_list`` from its caller.
13096
13097``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
13098function <intrinsics>` because it takes a type as an argument.
13099
13100Example:
13101""""""""
13102
13103See the :ref:`variable argument processing <int_varargs>` section.
13104
Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.
13108
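As a minimal sketch of the semantics above, the following function reads
``%n`` variadic ``i32`` arguments in a loop. The function name ``@sum_ints``
is hypothetical, and the ``%struct.va_list`` layout is an assumption that
matches the x86-64 System V ABI; other targets use a different ``va_list``
type.

.. code-block:: llvm

    ; Assumption: va_list laid out as on x86-64 System V. Other targets differ.
    %struct.va_list = type { i32, i32, ptr, ptr }

    declare void @llvm.va_start.p0(ptr)
    declare void @llvm.va_end.p0(ptr)

    ; Hypothetical helper: adds up %n variadic i32 arguments.
    define i32 @sum_ints(i32 %n, ...) {
    entry:
      %ap = alloca %struct.va_list
      call void @llvm.va_start.p0(ptr %ap)
      br label %loop

    loop:
      %i = phi i32 [ 0, %entry ], [ %i.next, %body ]
      %acc = phi i32 [ 0, %entry ], [ %acc.next, %body ]
      %more = icmp slt i32 %i, %n
      br i1 %more, label %body, label %exit

    body:
      ; Each va_arg loads one i32 and advances %ap to the next argument.
      %arg = va_arg ptr %ap, i32
      %acc.next = add i32 %acc, %arg
      %i.next = add i32 %i, 1
      br label %loop

    exit:
      call void @llvm.va_end.p0(ptr %ap)
      ret i32 %acc
    }
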
13109.. _i_landingpad:
13110
13111'``landingpad``' Instruction
13112^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13113
13114Syntax:
13115"""""""
13116
13117::
13118
13119      <resultval> = landingpad <resultty> <clause>+
13120      <resultval> = landingpad <resultty> cleanup <clause>*
13121
13122      <clause> := catch <type> <value>
13123      <clause> := filter <array constant type> <array constant>
13124
13125Overview:
13126"""""""""
13127
13128The '``landingpad``' instruction is used by `LLVM's exception handling
13129system <ExceptionHandling.html#overview>`_ to specify that a basic block
13130is a landing pad --- one where the exception lands, and corresponds to the
13131code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
13132defines values supplied by the :ref:`personality function <personalityfn>` upon
13133re-entry to the function. The ``resultval`` has the type ``resultty``.
13134
13135Arguments:
13136""""""""""
13137
13138The optional
13139``cleanup`` flag indicates that the landing pad block is a cleanup.
13140
13141A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
13142contains the global variable representing the "type" that may be caught
13143or filtered respectively. Unlike the ``catch`` clause, the ``filter``
13144clause takes an array constant as its argument. Use
13145"``[0 x ptr] undef``" for a filter which cannot throw. The
13146'``landingpad``' instruction must contain *at least* one ``clause`` or
13147the ``cleanup`` flag.
13148
13149Semantics:
13150""""""""""
13151
13152The '``landingpad``' instruction defines the values which are set by the
13153:ref:`personality function <personalityfn>` upon re-entry to the function, and
13154therefore the "result type" of the ``landingpad`` instruction. As with
13155calling conventions, how the personality function results are
13156represented in LLVM IR is target specific.
13157
13158The clauses are applied in order from top to bottom. If two
13159``landingpad`` instructions are merged together through inlining, the
13160clauses from the calling function are appended to the list of clauses.
13161When the call stack is being unwound due to an exception being thrown,
13162the exception is compared against each ``clause`` in turn. If it doesn't
13163match any of the clauses, and the ``cleanup`` flag is not set, then
13164unwinding continues further up the call stack.
13165
13166The ``landingpad`` instruction has several restrictions:
13167
13168-  A landing pad block is a basic block which is the unwind destination
13169   of an '``invoke``' instruction.
13170-  A landing pad block must have a '``landingpad``' instruction as its
13171   first non-PHI instruction.
13172-  There can be only one '``landingpad``' instruction within the landing
13173   pad block.
13174-  A basic block that is not a landing pad block may not include a
13175   '``landingpad``' instruction.
13176
13177Example:
13178""""""""
13179
13180.. code-block:: llvm
13181
13182      ;; A landing pad which can catch an integer.
13183      %res = landingpad { ptr, i32 }
13184               catch ptr @_ZTIi
13185      ;; A landing pad that is a cleanup.
13186      %res = landingpad { ptr, i32 }
13187               cleanup
13188      ;; A landing pad which can catch an integer and can only throw a double.
13189      %res = landingpad { ptr, i32 }
13190               catch ptr @_ZTIi
13191               filter [1 x ptr] [ptr @_ZTId]
13192
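The snippets above show the ``landingpad`` instruction in isolation. The
following sketch places a cleanup landing pad in a complete function, as the
unwind destination of an ``invoke``. The functions ``@may_throw``,
``@release`` and ``@guarded`` are hypothetical, and the Itanium C++
personality routine ``@__gxx_personality_v0`` is assumed.

.. code-block:: llvm

    declare void @may_throw()
    declare void @release()
    declare i32 @__gxx_personality_v0(...)

    define void @guarded() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %normal unwind label %lpad

    normal:
      ret void

    lpad:
      ; First non-PHI instruction of the unwind destination.
      %exn = landingpad { ptr, i32 }
               cleanup
      call void @release()
      ; Continue unwinding once the cleanup has run.
      resume { ptr, i32 } %exn
    }
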
13193.. _i_catchpad:
13194
13195'``catchpad``' Instruction
13196^^^^^^^^^^^^^^^^^^^^^^^^^^
13197
13198Syntax:
13199"""""""
13200
13201::
13202
13203      <resultval> = catchpad within <catchswitch> [<args>*]
13204
13205Overview:
13206"""""""""
13207
13208The '``catchpad``' instruction is used by `LLVM's exception handling
13209system <ExceptionHandling.html#overview>`_ to specify that a basic block
13210begins a catch handler --- one where a personality routine attempts to transfer
13211control to catch an exception.
13212
13213Arguments:
13214""""""""""
13215
13216The ``catchswitch`` operand must always be a token produced by a
13217:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
13218ensures that each ``catchpad`` has exactly one predecessor block, and it always
13219terminates in a ``catchswitch``.
13220
13221The ``args`` correspond to whatever information the personality routine
13222requires to know if this is an appropriate handler for the exception. Control
13223will transfer to the ``catchpad`` if this is the first appropriate handler for
13224the exception.
13225
13226The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
13227``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
13228pads.
13229
13230Semantics:
13231""""""""""
13232
13233When the call stack is being unwound due to an exception being thrown, the
13234exception is compared against the ``args``. If it doesn't match, control will
13235not reach the ``catchpad`` instruction.  The representation of ``args`` is
13236entirely target and personality function-specific.
13237
13238Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
13239instruction must be the first non-phi of its parent basic block.
13240
13241The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
13242instructions is described in the
13243`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
13244
13245When a ``catchpad`` has been "entered" but not yet "exited" (as
13246described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
13247it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
13248that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
13249
13250Example:
13251""""""""
13252
13253.. code-block:: text
13254
13255    dispatch:
13256      %cs = catchswitch within none [label %handler0] unwind to caller
13257      ;; A catch block which can catch an integer.
13258    handler0:
13259      %tok = catchpad within %cs [ptr @_ZTIi]
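A slightly fuller sketch, assuming the MSVC C++ personality
``@__CxxFrameHandler3``, shows the ``catchpad`` token being consumed by a
"funclet" operand bundle and by a ``catchret``. The functions ``@may_throw``,
``@handle`` and ``@f`` are hypothetical, and the ``catchpad`` arguments shown
are the catch-all form commonly used with that personality.

.. code-block:: llvm

    declare void @may_throw()
    declare void @handle()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      ; Assumed catch-all arguments for this personality.
      %tok = catchpad within %cs [ptr null, i32 64, ptr null]
      ; Calls inside the handler carry the "funclet" bundle.
      call void @handle() [ "funclet"(token %tok) ]
      catchret from %tok to label %done

    done:
      ret void
    }
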
13260
13261.. _i_cleanuppad:
13262
13263'``cleanuppad``' Instruction
13264^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13265
13266Syntax:
13267"""""""
13268
13269::
13270
13271      <resultval> = cleanuppad within <parent> [<args>*]
13272
13273Overview:
13274"""""""""
13275
13276The '``cleanuppad``' instruction is used by `LLVM's exception handling
13277system <ExceptionHandling.html#overview>`_ to specify that a basic block
13278is a cleanup block --- one where a personality routine attempts to
13279transfer control to run cleanup actions.
13280The ``args`` correspond to whatever additional
13281information the :ref:`personality function <personalityfn>` requires to
13282execute the cleanup.
13283The ``resultval`` has the type :ref:`token <t_token>` and is used to
13284match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
13285The ``parent`` argument is the token of the funclet that contains the
13286``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
13287this operand may be the token ``none``.
13288
13289Arguments:
13290""""""""""
13291
13292The instruction takes a list of arbitrary values which are interpreted
13293by the :ref:`personality function <personalityfn>`.
13294
13295Semantics:
13296""""""""""
13297
13298When the call stack is being unwound due to an exception being thrown,
13299the :ref:`personality function <personalityfn>` transfers control to the
13300``cleanuppad`` with the aid of the personality-specific arguments.
13301As with calling conventions, how the personality function results are
13302represented in LLVM IR is target specific.
13303
13304The ``cleanuppad`` instruction has several restrictions:
13305
13306-  A cleanup block is a basic block which is the unwind destination of
13307   an exceptional instruction.
13308-  A cleanup block must have a '``cleanuppad``' instruction as its
13309   first non-PHI instruction.
13310-  There can be only one '``cleanuppad``' instruction within the
13311   cleanup block.
13312-  A basic block that is not a cleanup block may not include a
13313   '``cleanuppad``' instruction.
13314
13315When a ``cleanuppad`` has been "entered" but not yet "exited" (as
13316described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
13317it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
13318that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
13319
13320Example:
13321""""""""
13322
13323.. code-block:: text
13324
13325      %tok = cleanuppad within %cs []
13326
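For context, a minimal sketch of a complete function with a cleanup funclet
follows. It assumes the MSVC C++ personality ``@__CxxFrameHandler3``; the
functions ``@may_throw``, ``@run_dtor`` and ``@g`` are hypothetical. The
``cleanuppad`` is not nested inside another funclet, so its parent is
``none``.

.. code-block:: llvm

    declare void @may_throw()
    declare void @run_dtor()
    declare i32 @__CxxFrameHandler3(...)

    define void @g() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %tok = cleanuppad within none []
      ; Calls inside the cleanup carry the "funclet" bundle.
      call void @run_dtor() [ "funclet"(token %tok) ]
      ; Leave the funclet and keep unwinding into the caller.
      cleanupret from %tok unwind to caller

    done:
      ret void
    }
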
13327.. _debugrecords:
13328
13329Debug Records
13330-----------------------
13331
13332Debug records appear interleaved with instructions, but are not instructions;
13333they are used only to define debug information, and have no effect on generated
code. They are distinguished from instructions by the use of a leading ``#`` and
13335an extra level of indentation. As an example:
13336
13337.. code-block:: llvm
13338
13339  %inst1 = op1 %a, %b
13340    #dbg_value(%inst1, !10, !DIExpression(), !11)
13341  %inst2 = op2 %inst1, %c
13342
13343These debug records replace the prior :ref:`debug intrinsics<dbg_intrinsics>`.
13344Debug records will be disabled if ``--write-experimental-debuginfo=false`` is
13345passed to LLVM; it is an error for both records and intrinsics to appear in the
13346same module. More information about debug records can be found in the `LLVM
13347Source Level Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
13348document.
13349
13350.. _intrinsics:
13351
13352Intrinsic Functions
13353===================
13354
13355LLVM supports the notion of an "intrinsic function". These functions
13356have well known names and semantics and are required to follow certain
13357restrictions. Overall, these intrinsics represent an extension mechanism
13358for the LLVM language that does not require changing all of the
13359transformations in LLVM when adding to the language (or the bitcode
13360reader/writer, the parser, etc...).
13361
13362Intrinsic function names must all start with an "``llvm.``" prefix. This
13363prefix is reserved in LLVM for intrinsic names; thus, function names may
13364not begin with this prefix. Intrinsic functions must always be external
13365functions: you cannot define the body of intrinsic functions. Intrinsic
13366functions may only be used in call or invoke instructions: it is illegal
13367to take the address of an intrinsic function. Additionally, because
13368intrinsic functions are part of the LLVM language, it is required if any
13369are added that they be documented here.
13370
13371Some intrinsic functions can be overloaded, i.e., the intrinsic
13372represents a family of functions that perform the same operation but on
13373different data types. Because LLVM can represent over 8 million
13374different integer types, overloading is used commonly to allow an
13375intrinsic function to operate on any integer type. One or more of the
13376argument types or the result type can be overloaded to accept any
13377integer type. Argument types may also be defined as exactly matching a
13378previous argument's type or the result type. This allows an intrinsic
13379function which accepts multiple arguments, but needs all of them to be
13380of the same type, to only be overloaded with respect to a single
13381argument or the result.
13382
An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
13385those types which are overloaded result in a name suffix. Arguments
13386whose type is matched against another type do not. For example, the
13387``llvm.ctpop`` function can take an integer of any width and returns an
13388integer of exactly the same integer width. This leads to a family of
13389functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
13390``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
13391overloaded, and only one type suffix is required. Because the argument's
13392type is matched against the return type, it does not require its own
13393name suffix.
13394
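The naming rule can be illustrated with a few declarations. This is only a
sketch: the wrapper ``@popcount8`` is hypothetical. ``llvm.memcpy`` is
included as an intrinsic with three overloaded types (destination pointer,
source pointer and length), which therefore carries three suffixes.

.. code-block:: llvm

    ; One overloaded type, so one suffix.
    declare i8 @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)
    declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

    ; Three overloaded types, so three suffixes.
    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define i8 @popcount8(i8 %val) {
      %n = call i8 @llvm.ctpop.i8(i8 %val)
      ret i8 %n
    }
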
:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. An overloaded intrinsic
that depends on an unnamed type in one of its overloaded argument types gets an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
different unnamed types as arguments (for example,
``llvm.ssa.copy.p0s_s.2(%42*)``). The number is tracked in the LLVM module and
ensures unique names within the module. When linking two modules together, it is
still possible to get a name clash. In that case, one of the names will be
changed by assigning it a new number.
13403
13404For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
13406integer or floating point types should not be relied upon for correct
13407code generation. In such cases, the recommended approach for target
13408maintainers when defining intrinsics is to create separate integer and
13409FP intrinsics rather than rely on overloading. For example, if different
13410codegen is required for ``llvm.target.foo(<4 x i32>)`` and
13411``llvm.target.foo(<4 x float>)`` then these should be split into
13412different intrinsics.
13413
13414To learn how to add an intrinsic function, please see the `Extending
13415LLVM Guide <ExtendingLLVM.html>`_.
13416
13417.. _int_varargs:
13418
13419Variable Argument Handling Intrinsics
13420-------------------------------------
13421
13422Variable argument support is defined in LLVM with the
13423:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
13424functions. These functions are related to the similarly named macros
13425defined in the ``<stdarg.h>`` header file.
13426
13427All of these functions take as arguments pointers to a target-specific
13428value type "``va_list``". The LLVM assembly language reference manual
13429does not define what this type is, so all transformations should be
13430prepared to handle these functions regardless of the type used. The intrinsics
13431are overloaded, and can be used for pointers to different address spaces.
13432
13433This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
13434variable argument handling intrinsic functions are used.
13435
13436.. code-block:: llvm
13437
13438    ; This struct is different for every platform. For most platforms,
13439    ; it is merely a ptr.
13440    %struct.va_list = type { ptr }
13441
13442    ; For Unix x86_64 platforms, va_list is the following struct:
13443    ; %struct.va_list = type { i32, i32, ptr, ptr }
13444
13445    define i32 @test(i32 %X, ...) {
13446      ; Initialize variable argument processing
13447      %ap = alloca %struct.va_list
13448      call void @llvm.va_start.p0(ptr %ap)
13449
13450      ; Read a single integer argument
13451      %tmp = va_arg ptr %ap, i32
13452
13453      ; Demonstrate usage of llvm.va_copy and llvm.va_end
13454      %aq = alloca ptr
13455      call void @llvm.va_copy.p0(ptr %aq, ptr %ap)
13456      call void @llvm.va_end.p0(ptr %aq)
13457
13458      ; Stop processing of arguments.
13459      call void @llvm.va_end.p0(ptr %ap)
13460      ret i32 %tmp
13461    }
13462
13463    declare void @llvm.va_start.p0(ptr)
13464    declare void @llvm.va_copy.p0(ptr, ptr)
13465    declare void @llvm.va_end.p0(ptr)
13466
13467.. _int_va_start:
13468
13469'``llvm.va_start``' Intrinsic
13470^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13471
13472Syntax:
13473"""""""
13474
13475::
13476
13477      declare void @llvm.va_start.p0(ptr <arglist>)
13478      declare void @llvm.va_start.p5(ptr addrspace(5) <arglist>)
13479
13480Overview:
13481"""""""""
13482
13483The '``llvm.va_start``' intrinsic initializes ``<arglist>`` for
13484subsequent use by ``va_arg``.
13485
13486Arguments:
13487""""""""""
13488
13489The argument is a pointer to a ``va_list`` element to initialize.
13490
13491Semantics:
13492""""""""""
13493
13494The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
13495available in C. In a target-dependent way, it initializes the
13496``va_list`` element to which the argument points, so that the next call
13497to ``va_arg`` will produce the first variable argument passed to the
13498function. Unlike the C ``va_start`` macro, this intrinsic does not need
13499to know the last argument of the function as the compiler can figure
13500that out.
13501
13502'``llvm.va_end``' Intrinsic
13503^^^^^^^^^^^^^^^^^^^^^^^^^^^
13504
13505Syntax:
13506"""""""
13507
13508::
13509
13510      declare void @llvm.va_end.p0(ptr <arglist>)
13511      declare void @llvm.va_end.p5(ptr addrspace(5) <arglist>)
13512
13513Overview:
13514"""""""""
13515
13516The '``llvm.va_end``' intrinsic destroys ``<arglist>``, which has been
13517initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
13518
13519Arguments:
13520""""""""""
13521
13522The argument is a pointer to a ``va_list`` to destroy.
13523
13524Semantics:
13525""""""""""
13526
13527The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
13528available in C. In a target-dependent way, it destroys the ``va_list``
13529element to which the argument points. Calls to
13530:ref:`llvm.va_start <int_va_start>` and
13531:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
13532``llvm.va_end``.
13533
13534.. _int_va_copy:
13535
13536'``llvm.va_copy``' Intrinsic
13537^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13538
13539Syntax:
13540"""""""
13541
13542::
13543
13544      declare void @llvm.va_copy.p0(ptr <destarglist>, ptr <srcarglist>)
13545      declare void @llvm.va_copy.p5(ptr addrspace(5) <destarglist>, ptr addrspace(5) <srcarglist>)
13546
13547Overview:
13548"""""""""
13549
13550The '``llvm.va_copy``' intrinsic copies the current argument position
13551from the source argument list to the destination argument list.
13552
13553Arguments:
13554""""""""""
13555
13556The first argument is a pointer to a ``va_list`` element to initialize.
13557The second argument is a pointer to a ``va_list`` element to copy from.
13558The address spaces of the two arguments must match.
13559
13560Semantics:
13561""""""""""
13562
13563The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
13564available in C. In a target-dependent way, it copies the source
13565``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
13567arbitrarily complex and require, for example, memory allocation.
13568
13569Accurate Garbage Collection Intrinsics
13570--------------------------------------
13571
13572LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
13573(GC) requires the frontend to generate code containing appropriate intrinsic
13574calls and select an appropriate GC strategy which knows how to lower these
13575intrinsics in a manner which is appropriate for the target collector.
13576
13577These intrinsics allow identification of :ref:`GC roots on the
13578stack <int_gcroot>`, as well as garbage collector implementations that
13579require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
13580Frontends for type-safe garbage collected languages should generate
13581these intrinsics to make use of the LLVM garbage collectors. For more
13582details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
13583
LLVM provides a second experimental set of intrinsics for describing garbage
13585collection safepoints in compiled code. These intrinsics are an alternative
13586to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
13587:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
13588differences in approach are covered in the `Garbage Collection with LLVM
13589<GarbageCollection.html>`_ documentation. The intrinsics themselves are
13590described in :doc:`Statepoints`.
13591
13592.. _int_gcroot:
13593
13594'``llvm.gcroot``' Intrinsic
13595^^^^^^^^^^^^^^^^^^^^^^^^^^^
13596
13597Syntax:
13598"""""""
13599
13600::
13601
13602      declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)
13603
13604Overview:
13605"""""""""
13606
13607The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
13608the code generator, and allows some metadata to be associated with it.
13609
13610Arguments:
13611""""""""""
13612
13613The first argument specifies the address of a stack object that contains
13614the root pointer. The second pointer (which must be either a constant or
13615a global value address) contains the meta-data to be associated with the
13616root.
13617
13618Semantics:
13619""""""""""
13620
13621At runtime, a call to this intrinsic stores a null pointer into the
13622"ptrloc" location. At compile-time, the code generator generates
13623information to allow the runtime to find the pointer at GC safe points.
13624The '``llvm.gcroot``' intrinsic may only be used in a function which
13625:ref:`specifies a GC algorithm <gc>`.
13626
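A minimal sketch of registering a root follows. The allocator
``@allocate_object`` and the function ``@use_root`` are hypothetical, and the
built-in ``shadow-stack`` strategy is used only as an example of a function
that specifies a GC algorithm.

.. code-block:: llvm

    declare void @llvm.gcroot(ptr, ptr)
    declare ptr @allocate_object()   ; hypothetical allocator

    define void @use_root() gc "shadow-stack" {
    entry:
      ; The root is a stack slot that holds a pointer.
      %root = alloca ptr
      ; No metadata is attached, so the second argument is null.
      call void @llvm.gcroot(ptr %root, ptr null)
      %obj = call ptr @allocate_object()
      store ptr %obj, ptr %root
      ret void
    }
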
13627.. _int_gcread:
13628
13629'``llvm.gcread``' Intrinsic
13630^^^^^^^^^^^^^^^^^^^^^^^^^^^
13631
13632Syntax:
13633"""""""
13634
13635::
13636
13637      declare ptr @llvm.gcread(ptr %ObjPtr, ptr %Ptr)
13638
13639Overview:
13640"""""""""
13641
13642The '``llvm.gcread``' intrinsic identifies reads of references from heap
13643locations, allowing garbage collector implementations that require read
13644barriers.
13645
13646Arguments:
13647""""""""""
13648
The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
13651pointer to the start of the referenced object, if needed by the language
13652runtime (otherwise null).
13653
13654Semantics:
13655""""""""""
13656
13657The '``llvm.gcread``' intrinsic has the same semantics as a load
13658instruction, but may be replaced with substantially more complex code by
13659the garbage collector runtime, as needed. The '``llvm.gcread``'
13660intrinsic may only be used in a function which :ref:`specifies a GC
13661algorithm <gc>`.
13662
13663.. _int_gcwrite:
13664
13665'``llvm.gcwrite``' Intrinsic
13666^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13667
13668Syntax:
13669"""""""
13670
13671::
13672
13673      declare void @llvm.gcwrite(ptr %P1, ptr %Obj, ptr %P2)
13674
13675Overview:
13676"""""""""
13677
13678The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
13679locations, allowing garbage collector implementations that require write
13680barriers (such as generational or reference counting collectors).
13681
13682Arguments:
13683""""""""""
13684
13685The first argument is the reference to store, the second is the start of
13686the object to store it to, and the third is the address of the field of
13687Obj to store to. If the runtime does not require a pointer to the
13688object, Obj may be null.
13689
13690Semantics:
13691""""""""""
13692
13693The '``llvm.gcwrite``' intrinsic has the same semantics as a store
13694instruction, but may be replaced with substantially more complex code by
13695the garbage collector runtime, as needed. The '``llvm.gcwrite``'
13696intrinsic may only be used in a function which :ref:`specifies a GC
13697algorithm <gc>`.
13698
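The two barrier intrinsics can be sketched together as follows. The functions
``@load_field`` and ``@store_field`` are hypothetical, and the
``shadow-stack`` strategy is used only so that the functions specify a GC
algorithm. Whether a given strategy lowers these calls to plain loads and
stores or to something more elaborate depends on the collector.

.. code-block:: llvm

    declare ptr @llvm.gcread(ptr, ptr)
    declare void @llvm.gcwrite(ptr, ptr, ptr)

    ; Read a reference from %field, which lies inside the object %obj.
    define ptr @load_field(ptr %obj, ptr %field) gc "shadow-stack" {
      %val = call ptr @llvm.gcread(ptr %obj, ptr %field)
      ret ptr %val
    }

    ; Store the reference %val into %field, which lies inside the object %obj.
    define void @store_field(ptr %val, ptr %obj, ptr %field) gc "shadow-stack" {
      call void @llvm.gcwrite(ptr %val, ptr %obj, ptr %field)
      ret void
    }
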
13699
13700.. _gc_statepoint:
13701
13702'``llvm.experimental.gc.statepoint``' Intrinsic
13703^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13704
13705Syntax:
13706"""""""
13707
13708::
13709
13710      declare token
13711        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
13712                       ptr elementtype(func_type) <target>,
13713                       i64 <#call args>, i64 <flags>,
13714                       ... (call parameters),
13715                       i64 0, i64 0)
13716
13717Overview:
13718"""""""""
13719
13720The statepoint intrinsic represents a call which is parse-able by the
13721runtime.
13722
13723Operands:
13724"""""""""
13725
13726The 'id' operand is a constant integer that is reported as the ID
13727field in the generated stackmap.  LLVM does not interpret this
13728parameter in any way and its meaning is up to the statepoint user to
13729decide.  Note that LLVM is free to duplicate code containing
13730statepoint calls, and this may transform IR that had a unique 'id' per
13731lexical call to statepoint to IR that does not.
13732
13733If 'num patch bytes' is non-zero then the call instruction
13734corresponding to the statepoint is not emitted and LLVM emits 'num
13735patch bytes' bytes of nops in its place.  LLVM will emit code to
13736prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
13738sequence and the latter after the nop sequence.  It is expected that
13739the user will patch over the 'num patch bytes' bytes of nops with a
13740calling sequence specific to their runtime before executing the
13741generated machine code.  There are no guarantees with respect to the
alignment of the nop sequence.  Unlike :doc:`StackMaps`, statepoints do
13743not have a concept of shadow bytes.  Note that semantically the
13744statepoint still represents a call or invoke to 'target', and the nop
13745sequence after patching is expected to represent an operation
13746equivalent to a call or invoke to 'target'.
13747
13748The 'target' operand is the function actually being called. The operand
13749must have an :ref:`elementtype <attr_elementtype>` attribute specifying
13750the function type of the target. The target can be specified as either
13751a symbolic LLVM function, or as an arbitrary Value of pointer type. Note
13752that the function type must match the signature of the callee and the
13753types of the 'call parameters' arguments.
13754
13755The '#call args' operand is the number of arguments to the actual
13756call.  It must exactly match the number of arguments passed in the
13757'call parameters' variable length section.
13758
13759The 'flags' operand is used to specify extra information about the
13760statepoint. This is currently only used to mark certain statepoints
13761as GC transitions. This operand is a 64-bit integer with the following
13762layout, where bit 0 is the least significant bit:
13763
13764  +-------+---------------------------------------------------+
13765  | Bit # | Usage                                             |
13766  +=======+===================================================+
13767  |     0 | Set if the statepoint is a GC transition, cleared |
13768  |       | otherwise.                                        |
13769  +-------+---------------------------------------------------+
13770  |  1-63 | Reserved for future use; must be cleared.         |
13771  +-------+---------------------------------------------------+
13772
13773The 'call parameters' arguments are simply the arguments which need to
13774be passed to the call target.  They will be lowered according to the
13775specified calling convention and otherwise handled like a normal call
13776instruction.  The number of arguments must exactly match what is
specified in '#call args'.  The types must match the signature of
13778'target'.
13779
The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced by the corresponding operand bundles.  In a future
revision, these now-redundant arguments will be removed.
13785
13786Semantics:
13787""""""""""
13788
13789A statepoint is assumed to read and write all memory.  As a result,
memory operations cannot be reordered past a statepoint.  It is
13791illegal to mark a statepoint as being either 'readonly' or 'readnone'.
13792
Note that legal IR cannot perform any memory operation on a 'gc
13794pointer' argument of the statepoint in a location statically reachable
13795from the statepoint.  Instead, the explicitly relocated value (from a
13796``gc.relocate``) must be used.
13797
13798'``llvm.experimental.gc.result``' Intrinsic
13799^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13800
13801Syntax:
13802"""""""
13803
13804::
13805
13806      declare type
13807        @llvm.experimental.gc.result(token %statepoint_token)
13808
13809Overview:
13810"""""""""
13811
13812``gc.result`` extracts the result of the original call instruction
13813which was replaced by the ``gc.statepoint``.  The ``gc.result``
13814intrinsic is actually a family of three intrinsics due to an
13815implementation limitation.  Other than the type of the return value,
13816the semantics are the same.
13817
13818Operands:
13819"""""""""
13820
13821The first and only argument is the ``gc.statepoint`` which starts
13822the safepoint sequence of which this ``gc.result`` is a part.
13823Despite the typing of this as a generic token, *only* the value defined
13824by a ``gc.statepoint`` is legal here.
13825
13826Semantics:
13827""""""""""
13828
13829The ``gc.result`` represents the return value of the call target of
13830the ``statepoint``.  The type of the ``gc.result`` must exactly match
13831the type of the target.  If the call target returns void, there will
13832be no ``gc.result``.
13833
13834A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
13835side effects since it is just a projection of the return value of the
13836previous call represented by the ``gc.statepoint``.
13837
13838'``llvm.experimental.gc.relocate``' Intrinsic
13839^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13840
13841Syntax:
13842"""""""
13843
13844::
13845
13846      declare <pointer type>
13847        @llvm.experimental.gc.relocate(token %statepoint_token,
13848                                       i32 %base_offset,
13849                                       i32 %pointer_offset)
13850
13851Overview:
13852"""""""""
13853
13854A ``gc.relocate`` returns the potentially relocated value of a pointer
13855at the safepoint.
13856
13857Operands:
13858"""""""""
13859
13860The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
13862Despite the typing of this as a generic token, *only* the value defined
13863by a ``gc.statepoint`` is legal here.
13864
13865The second and third arguments are both indices into operands of the
13866corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
13867
13868The second argument is an index which specifies the allocation for the pointer
13869being relocated. The associated value must be within the object with which the
13870pointer being relocated is associated. The optimizer is free to change *which*
13871interior derived pointer is reported, provided that it does not replace an
13872actual base pointer with another interior derived pointer. Collectors are
13873allowed to rely on the base pointer operand remaining an actual base pointer if
13874so constructed.
13875
The third argument is an index which specifies the (potentially) derived pointer
13877being relocated.  It is legal for this index to be the same as the second
13878argument if-and-only-if a base pointer is being relocated.
13879
13880Semantics:
13881""""""""""
13882
13883The return value of ``gc.relocate`` is the potentially relocated value
13884of the pointer specified by its arguments.  It is unspecified how the
13885value of the returned pointer relates to the argument to the
13886``gc.statepoint`` other than that a) it points to the same source
13887language object with the same offset, and b) the 'based-on'
13888relationship of the newly relocated pointers is a projection of the
13889unrelocated pointers.  In particular, the integer value of the pointer
13890returned is unspecified.
13891
13892A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
13893side effects since it is just a way to extract information about work
13894done during the actual call modeled by the ``gc.statepoint``.
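
As a hedged sketch of how the operands fit together, the following relocates
both a base pointer and a pointer derived from it across one safepoint. The
names ``@foo`` and ``@relocate_derived``, the use of address space 1 for GC
pointers, and the ``statepoint-example`` strategy are assumptions made for
illustration only.

.. code-block:: llvm

      declare void @foo()
      declare token @llvm.experimental.gc.statepoint.p0(i64, i32, ptr, i32, i32, ...)
      declare ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token, i32, i32)

      define ptr addrspace(1) @relocate_derived(ptr addrspace(1) %base) gc "statepoint-example" {
      entry:
        %derived = getelementptr i8, ptr addrspace(1) %base, i64 8
        ; gc-live operand 0 is the base pointer, operand 1 is the derived pointer.
        %tok = call token (i64, i32, ptr, i32, i32, ...) @llvm.experimental.gc.statepoint.p0(i64 0, i32 0, ptr elementtype(void ()) @foo, i32 0, i32 0, i32 0, i32 0) [ "gc-live"(ptr addrspace(1) %base, ptr addrspace(1) %derived) ]
        ; Relocating a base pointer: both indices name the same gc-live operand.
        %base.relocated = call ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token %tok, i32 0, i32 0)
        ; Relocating the derived pointer: base index 0, pointer index 1.
        %derived.relocated = call ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token %tok, i32 0, i32 1)
        ret ptr addrspace(1) %derived.relocated
      }

After the safepoint, only the relocated values should be used; the
unrelocated ``%base`` and ``%derived`` may no longer point at the object.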
13895
13896.. _gc.get.pointer.base:
13897
13898'``llvm.experimental.gc.get.pointer.base``' Intrinsic
13899^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13900
13901Syntax:
13902"""""""
13903
13904::
13905
13906      declare <pointer type>
13907        @llvm.experimental.gc.get.pointer.base(
13908          <pointer type> readnone captures(none) %derived_ptr)
13909          nounwind willreturn memory(none)
13910
13911Overview:
13912"""""""""
13913
13914``gc.get.pointer.base`` for a derived pointer returns its base pointer.
13915
13916Operands:
13917"""""""""
13918
13919The only argument is a pointer which is based on some object with
13920an unknown offset from the base of said object.
13921
13922Semantics:
13923""""""""""
13924
13925This intrinsic is used in the abstract machine model for GC to represent
13926the base pointer for an arbitrary derived pointer.
13927
This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer of the derived
pointer. The replacement is done as part of the lowering to the
explicit statepoint model.
13932
13933The return pointer type must be the same as the type of the parameter.
13934
13935
13936'``llvm.experimental.gc.get.pointer.offset``' Intrinsic
13937^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13938
13939Syntax:
13940"""""""
13941
13942::
13943
13944      declare i64
13945        @llvm.experimental.gc.get.pointer.offset(
13946          <pointer type> readnone captures(none) %derived_ptr)
13947          nounwind willreturn memory(none)
13948
13949Overview:
13950"""""""""
13951
13952``gc.get.pointer.offset`` for a derived pointer returns the offset from its
13953base pointer.
13954
13955Operands:
13956"""""""""
13957
13958The only argument is a pointer which is based on some object with
13959an unknown offset from the base of said object.
13960
13961Semantics:
13962""""""""""
13963
13964This intrinsic is used in the abstract machine model for GC to represent
13965the offset of an arbitrary derived pointer from its base pointer.
13966
13967This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
13968replacing all uses of this callsite with the offset of a derived pointer from
13969its base pointer value. The replacement is done as part of the lowering to the
13970explicit statepoint model.
13971
Conceptually, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`) after casting both with
``ptrtoint``. However, performing that cast outside the
:ref:`RewriteStatepointsForGC` pass could hide the pointers from the later
lowering of the abstract model to the explicit physical one.
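
The arithmetic that the intrinsic stands for can be sketched by hand as
follows. The function name ``@offset_by_hand`` and the use of address space 1
are illustrative assumptions; as noted above, writing this out manually before
:ref:`RewriteStatepointsForGC` runs is exactly what the intrinsic is meant to
avoid.

.. code-block:: llvm

      define i64 @offset_by_hand(ptr addrspace(1) %base, ptr addrspace(1) %derived) {
      entry:
        ; Cast both pointers to integers, then subtract base from derived.
        %base.int = ptrtoint ptr addrspace(1) %base to i64
        %derived.int = ptrtoint ptr addrspace(1) %derived to i64
        %offset = sub i64 %derived.int, %base.int
        ret i64 %offset
      }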
13977
13978Code Generator Intrinsics
13979-------------------------
13980
13981These intrinsics are provided by LLVM to expose special features that
13982may only be implemented with code generator support.
13983
13984'``llvm.returnaddress``' Intrinsic
13985^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13986
13987Syntax:
13988"""""""
13989
13990::
13991
13992      declare ptr @llvm.returnaddress(i32 <level>)
13993
13994Overview:
13995"""""""""
13996
13997The '``llvm.returnaddress``' intrinsic attempts to compute a
13998target-specific value indicating the return address of the current
13999function or one of its callers.
14000
14001Arguments:
14002""""""""""
14003
14004The argument to this intrinsic indicates which function to return the
14005address for. Zero indicates the calling function, one indicates its
14006caller, etc. The argument is **required** to be a constant integer
14007value.
14008
14009Semantics:
14010""""""""""
14011
14012The '``llvm.returnaddress``' intrinsic either returns a pointer
14013indicating the return address of the specified call frame, or zero if it
14014cannot be identified. The value returned by this intrinsic is likely to
14015be incorrect or 0 for arguments other than zero, so it should only be
14016used for debugging purposes.
14017
14018Note that calling this intrinsic does not prevent function inlining or
14019other aggressive transformations, so the value returned may not be that
14020of the obvious source-language caller.
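
A minimal sketch of the level-zero query, which is the case most likely to
produce a meaningful value (the function name ``@current_return_address`` is
illustrative):

.. code-block:: llvm

      declare ptr @llvm.returnaddress(i32)

      define ptr @current_return_address() {
      entry:
        ; Level 0 asks for the return address of the function making this call.
        %ra = call ptr @llvm.returnaddress(i32 0)
        ret ptr %ra
      }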
14021
14022'``llvm.addressofreturnaddress``' Intrinsic
14023^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14024
14025Syntax:
14026"""""""
14027
14028::
14029
14030      declare ptr @llvm.addressofreturnaddress()
14031
14032Overview:
14033"""""""""
14034
14035The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
14036pointer to the place in the stack frame where the return address of the
14037current function is stored.
14038
14039Semantics:
14040""""""""""
14041
14042Note that calling this intrinsic does not prevent function inlining or
14043other aggressive transformations, so the value returned may not be that
14044of the obvious source-language caller.
14045
This intrinsic is only implemented for x86 and AArch64.
14047
14048'``llvm.sponentry``' Intrinsic
14049^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14050
14051Syntax:
14052"""""""
14053
14054::
14055
14056      declare ptr @llvm.sponentry()
14057
14058Overview:
14059"""""""""
14060
14061The '``llvm.sponentry``' intrinsic returns the stack pointer value at
14062the entry of the current function calling this intrinsic.
14063
14064Semantics:
14065""""""""""
14066
14067Note this intrinsic is only verified on AArch64 and ARM.
14068
14069'``llvm.frameaddress``' Intrinsic
14070^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14071
14072Syntax:
14073"""""""
14074
14075::
14076
14077      declare ptr @llvm.frameaddress(i32 <level>)
14078
14079Overview:
14080"""""""""
14081
14082The '``llvm.frameaddress``' intrinsic attempts to return the
14083target-specific frame pointer value for the specified stack frame.
14084
14085Arguments:
14086""""""""""
14087
14088The argument to this intrinsic indicates which function to return the
14089frame pointer for. Zero indicates the calling function, one indicates
14090its caller, etc. The argument is **required** to be a constant integer
14091value.
14092
14093Semantics:
14094""""""""""
14095
14096The '``llvm.frameaddress``' intrinsic either returns a pointer
14097indicating the frame address of the specified call frame, or zero if it
14098cannot be identified. The value returned by this intrinsic is likely to
14099be incorrect or 0 for arguments other than zero, so it should only be
14100used for debugging purposes.
14101
14102Note that calling this intrinsic does not prevent function inlining or
14103other aggressive transformations, so the value returned may not be that
14104of the obvious source-language caller.
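
A minimal sketch, assuming the overloaded spelling ``llvm.frameaddress.p0``
for a frame pointer in the default address space (the function name
``@current_frame`` is illustrative):

.. code-block:: llvm

      declare ptr @llvm.frameaddress.p0(i32)

      define ptr @current_frame() {
      entry:
        ; Level 0 asks for the frame address of the function making this call.
        %fp = call ptr @llvm.frameaddress.p0(i32 0)
        ret ptr %fp
      }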
14105
14106'``llvm.swift.async.context.addr``' Intrinsic
14107^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14108
14109Syntax:
14110"""""""
14111
14112::
14113
14114      declare ptr @llvm.swift.async.context.addr()
14115
14116Overview:
14117"""""""""
14118
14119The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
14120the part of the extended frame record containing the asynchronous
14121context of a Swift execution.
14122
14123Semantics:
14124""""""""""
14125
14126If the caller has a ``swiftasync`` parameter, that argument will initially
14127be stored at the returned address. If not, it will be initialized to null.
14128
14129'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
14130^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14131
14132Syntax:
14133"""""""
14134
14135::
14136
14137      declare void @llvm.localescape(...)
14138      declare ptr @llvm.localrecover(ptr %func, ptr %fp, i32 %idx)
14139
14140Overview:
14141"""""""""
14142
14143The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
14144allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
14145live frame pointer to recover the address of the allocation. The offset is
14146computed during frame layout of the caller of ``llvm.localescape``.
14147
14148Arguments:
14149""""""""""
14150
14151All arguments to '``llvm.localescape``' must be pointers to static allocas or
14152casts of static allocas. Each function can only call '``llvm.localescape``'
14153once, and it can only do so from the entry block.
14154
14155The ``func`` argument to '``llvm.localrecover``' must be a constant
14156bitcasted pointer to a function defined in the current module. The code
14157generator cannot determine the frame allocation offset of functions defined in
14158other modules.
14159
14160The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
14161call frame that is currently live. The return value of '``llvm.localaddress``'
14162is one way to produce such a value, but various runtimes also expose a suitable
14163pointer in platform-specific ways.
14164
14165The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
14166'``llvm.localescape``' to recover. It is zero-indexed.
14167
14168Semantics:
14169""""""""""
14170
14171These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
14173'``llvm.localescape``' intrinsic once from the function entry block, and the
14174child functions can use '``llvm.localrecover``' to access the escaped allocas.
14175The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
14176the escaped allocas are allocated, which would break attempts to use
14177'``llvm.localrecover``'.
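
The parent/child relationship can be sketched as below. The function names
``@parent`` and ``@child`` are illustrative, and the sketch assumes that the
value returned by ``llvm.localaddress`` is a suitable frame pointer on the
target being compiled for.

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare ptr @llvm.localrecover(ptr, ptr, i32)
      declare ptr @llvm.localaddress()

      define void @parent() {
      entry:
        %counter = alloca i32
        ; Escape the static alloca; it becomes index 0.
        call void (...) @llvm.localescape(ptr %counter)
        store i32 0, ptr %counter
        %fp = call ptr @llvm.localaddress()
        call void @child(ptr %fp)
        ret void
      }

      define internal void @child(ptr %fp) {
      entry:
        ; Recover the address of alloca 0 in @parent's live frame.
        %counter.addr = call ptr @llvm.localrecover(ptr @parent, ptr %fp, i32 0)
        store i32 1, ptr %counter.addr
        ret void
      }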
14178
14179'``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
14180^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14181
14182Syntax:
14183"""""""
14184
14185::
14186
14187      declare void @llvm.seh.try.begin()
14188      declare void @llvm.seh.try.end()
14189
14190Overview:
14191"""""""""
14192
The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a ``__try`` region for Windows SEH Asynchronous Exception Handling.
14195
14196Semantics:
14197""""""""""
14198
When a C function is compiled with the Windows SEH Asynchronous Exception option,
``-feh_asynch`` (aka MSVC ``-EHa``), these two intrinsics are injected to mark the
``__try`` boundary and to prevent potential exceptions from being moved across it.
Any set of operations can then be confined to the region by reading their leaf
inputs via volatile loads and writing their root outputs via volatile stores.
14204
14205'``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
14206^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14207
14208Syntax:
14209"""""""
14210
14211::
14212
14213      declare void @llvm.seh.scope.begin()
14214      declare void @llvm.seh.scope.end()
14215
14216Overview:
14217"""""""""
14218
14219The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
14221Handling (MSVC option -EHa).
14222
14223Semantics:
14224""""""""""
14225
14226LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke`` instructions, which normally correspond only to call sites.  To
14228support arbitrary faulting instructions, it must be possible to recover the current
14229EH scope for any instruction.  Turning every operation in LLVM that could fault
14230into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
14231large number of intrinsics, impede optimization of those operations, and make
14232compilation slower by introducing many extra basic blocks.  These intrinsics can
14233be used instead to mark the region protected by a cleanup, such as for a local
14234C++ object with a non-trivial destructor.  ``llvm.seh.scope.begin`` is used to mark
14235the start of the region; it is always called with ``invoke``, with the unwind block
14236being the desired unwind destination for any potentially-throwing instructions
within the region.  ``llvm.seh.scope.end`` is used to mark when the scope ends
14238and the EH cleanup is no longer required (e.g. because the destructor is being
14239called).
14240
14241.. _int_read_register:
14242.. _int_read_volatile_register:
14243.. _int_write_register:
14244
14245'``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
14246^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14247
14248Syntax:
14249"""""""
14250
14251::
14252
14253      declare i32 @llvm.read_register.i32(metadata)
14254      declare i64 @llvm.read_register.i64(metadata)
14255      declare i32 @llvm.read_volatile_register.i32(metadata)
14256      declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
14259      !0 = !{!"sp\00"}
14260
14261Overview:
14262"""""""""
14263
14264The '``llvm.read_register``', '``llvm.read_volatile_register``', and
14265'``llvm.write_register``' intrinsics provide access to the named register.
14266The register must be valid on the architecture being compiled to. The type
14267needs to be compatible with the register being read.
14268
14269Semantics:
14270""""""""""
14271
14272The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
14273return the current value of the register, where possible. The
14274'``llvm.write_register``' intrinsic sets the current value of the register,
14275where possible.
14276
14277A call to '``llvm.read_volatile_register``' is assumed to have side-effects
14278and possibly return a different value each time (e.g. for a timer register).
14279
14280This is useful to implement named register global variables that need
14281to always be mapped to a specific register, as is common practice on
14282bare-metal programs including OS kernels.
14283
The compiler does not check whether the register is available, nor whether
surrounding code (including inline assembly) uses it. Because of that,
allocatable registers are not supported.

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and, even more so, allocatable
registers.
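
A minimal sketch of reading the stack pointer. It assumes a 64-bit target on
which the stack pointer register is named ``sp`` (for example AArch64); the
function name ``@get_sp`` is illustrative.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      define i64 @get_sp() {
      entry:
        ; The metadata operand names the register to read.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"sp"}

On a target where the register has a different name, only the metadata string
would need to change.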
14292
14293.. _int_stacksave:
14294
14295'``llvm.stacksave``' Intrinsic
14296^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14297
14298Syntax:
14299"""""""
14300
14301::
14302
14303      declare ptr @llvm.stacksave.p0()
14304      declare ptr addrspace(5) @llvm.stacksave.p5()
14305
14306Overview:
14307"""""""""
14308
14309The '``llvm.stacksave``' intrinsic is used to remember the current state
14310of the function stack, for use with
14311:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
14312implementing language features like scoped automatic variable sized
14313arrays in C99.
14314
14315Semantics:
14316""""""""""
14317
14318This intrinsic returns an opaque pointer value that can be passed to
14319:ref:`llvm.stackrestore <int_stackrestore>`. When an
14320``llvm.stackrestore`` intrinsic is executed with a value saved from
14321``llvm.stacksave``, it effectively restores the state of the stack to
14322the state it was in when the ``llvm.stacksave`` intrinsic executed. In
14323practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack
14324that were allocated after the ``llvm.stacksave`` was executed. The
14325address space should typically be the
14326:ref:`alloca address space <alloca_addrspace>`.
14327
14328.. _int_stackrestore:
14329
14330'``llvm.stackrestore``' Intrinsic
14331^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14332
14333Syntax:
14334"""""""
14335
14336::
14337
14338      declare void @llvm.stackrestore.p0(ptr %ptr)
14339      declare void @llvm.stackrestore.p5(ptr addrspace(5) %ptr)
14340
14341Overview:
14342"""""""""
14343
14344The '``llvm.stackrestore``' intrinsic is used to restore the state of
14345the function stack to the state it was in when the corresponding
14346:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
14347useful for implementing language features like scoped automatic
14348variable sized arrays in C99. The address space should typically be
14349the :ref:`alloca address space <alloca_addrspace>`.
14350
14351Semantics:
14352""""""""""
14353
14354See the description for :ref:`llvm.stacksave <int_stacksave>`.
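
The save/restore pairing for a scoped variable-sized array can be sketched as
follows. The names ``@scoped_vla`` and ``@consume`` are illustrative, and the
sketch assumes the alloca address space is 0.

.. code-block:: llvm

      declare ptr @llvm.stacksave.p0()
      declare void @llvm.stackrestore.p0(ptr)
      declare void @consume(ptr)

      define void @scoped_vla(i64 %n) {
      entry:
        ; Remember the stack state on entry to the scope.
        %saved = call ptr @llvm.stacksave.p0()
        ; Dynamically sized allocation that lives only within the scope.
        %vla = alloca i32, i64 %n
        call void @consume(ptr %vla)
        ; Leaving the scope: pop everything allocated since the save.
        call void @llvm.stackrestore.p0(ptr %saved)
        ret void
      }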
14355
14356.. _int_get_dynamic_area_offset:
14357
14358'``llvm.get.dynamic.area.offset``' Intrinsic
14359^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14360
14361Syntax:
14362"""""""
14363
14364::
14365
14366      declare i32 @llvm.get.dynamic.area.offset.i32()
14367      declare i64 @llvm.get.dynamic.area.offset.i64()
14368
14369Overview:
14370"""""""""
14371
14372      The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
14373      get the offset from native stack pointer to the address of the most
14374      recent dynamic alloca on the caller's stack. These intrinsics are
14375      intended for use in combination with
14376      :ref:`llvm.stacksave <int_stacksave>` to get a
14377      pointer to the most recent dynamic alloca. This is useful, for example,
14378      for AddressSanitizer's stack unpoisoning routines.
14379
14380Semantics:
14381""""""""""
14382
      These intrinsics return a non-negative integer value that can be used to
      get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
      on the caller's stack. In particular, for targets where the stack grows downwards,
      adding this offset to the native stack pointer yields the address of the most
      recent dynamic alloca. For targets where the stack grows upwards, the situation is a bit more
      complicated, because subtracting this value from the stack pointer yields the address
      one past the end of the most recent dynamic alloca.

      Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
      returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
      compile-time-known constant value.
14394
14395      The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
14396      must match the target's default address space's (address space 0) pointer type.
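
A rough sketch of the intended combination with ``llvm.stacksave``. It assumes
a target with 64-bit pointers in address space 0, a stack that grows
downwards, and that the value returned by ``llvm.stacksave`` is the native
stack pointer; the function name ``@most_recent_dynamic_alloca`` is
illustrative.

.. code-block:: llvm

      declare ptr @llvm.stacksave.p0()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      define ptr @most_recent_dynamic_alloca() {
      entry:
        %sp = call ptr @llvm.stacksave.p0()
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Downward-growing stack: stack pointer plus offset.
        %top = getelementptr i8, ptr %sp, i64 %off
        ret ptr %top
      }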
14397
14398'``llvm.prefetch``' Intrinsic
14399^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14400
14401Syntax:
14402"""""""
14403
14404::
14405
14406      declare void @llvm.prefetch(ptr <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
14407
14408Overview:
14409"""""""""
14410
14411The '``llvm.prefetch``' intrinsic is a hint to the code generator to
14412insert a prefetch instruction if supported; otherwise, it is a noop.
14413Prefetches have no effect on the behavior of the program but can change
14414its performance characteristics.
14415
14416Arguments:
14417""""""""""
14418
14419``address`` is the address to be prefetched, ``rw`` is the specifier
14420determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from 0 (no
locality) to 3 (extremely local, keep in cache). The ``cache type``
14423specifies whether the prefetch is performed on the data (1) or
14424instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
14425arguments must be constant integers.
14426
14427Semantics:
14428""""""""""
14429
14430This intrinsic does not modify the behavior of the program. In
14431particular, prefetches cannot trap and do not produce a value. On
14432targets that support this intrinsic, the prefetch can provide hints to
14433the processor cache for better performance.
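
A minimal sketch of a read prefetch into the data cache, assuming the
overloaded spelling ``llvm.prefetch.p0`` for a pointer in the default address
space (the function name ``@prefetch_for_read`` is illustrative):

.. code-block:: llvm

      declare void @llvm.prefetch.p0(ptr, i32, i32, i32)

      define void @prefetch_for_read(ptr %p) {
      entry:
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
        call void @llvm.prefetch.p0(ptr %p, i32 0, i32 3, i32 1)
        ret void
      }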
14434
14435'``llvm.pcmarker``' Intrinsic
14436^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14437
14438Syntax:
14439"""""""
14440
14441::
14442
14443      declare void @llvm.pcmarker(i32 <id>)
14444
14445Overview:
14446"""""""""
14447
14448The '``llvm.pcmarker``' intrinsic is a method to export a Program
14449Counter (PC) in a region of code to simulators and other tools. The
14450method is target specific, but it is expected that the marker will use
14451exported symbols to transmit the PC of the marker. The marker makes no
14452guarantees that it will remain with any specific instruction after
14453optimizations. It is possible that the presence of a marker will inhibit
14454optimizations. The intended use is to be inserted after optimizations to
14455allow correlations of simulation runs.
14456
14457Arguments:
14458""""""""""
14459
14460``id`` is a numerical id identifying the marker.
14461
14462Semantics:
14463""""""""""
14464
14465This intrinsic does not modify the behavior of the program. Backends
14466that do not support this intrinsic may ignore it.
14467
14468'``llvm.readcyclecounter``' Intrinsic
14469^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14470
14471Syntax:
14472"""""""
14473
14474::
14475
14476      declare i64 @llvm.readcyclecounter()
14477
14478Overview:
14479"""""""""
14480
14481The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
14482counter register (or similar low latency, high accuracy clocks) on those
14483targets that support it. On X86, it should map to RDTSC. On Alpha, it
14484should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
14486timings.
14487
14488Semantics:
14489""""""""""
14490
14491When directly supported, reading the cycle counter should not modify any
14492memory. Implementations are allowed to either return an application
14493specific value or a system wide value. On backends without support, this
14494is lowered to a constant 0.
14495
Note that runtime support may be conditional on the privilege level the code
is running at and on the host platform.
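
A minimal sketch of timing a short region (the names ``@time_work`` and
``@work`` are illustrative):

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()

      define i64 @time_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed cycles; only meaningful if the counter did not wrap.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }

On a backend without support, both reads lower to 0 and the result is 0.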
14498
14499'``llvm.clear_cache``' Intrinsic
14500^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14501
14502Syntax:
14503"""""""
14504
14505::
14506
14507      declare void @llvm.clear_cache(ptr, ptr)
14508
14509Overview:
14510"""""""""
14511
14512The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
14513in the specified range to the execution unit of the processor. On
14514targets with non-unified instruction and data cache, the implementation
14515flushes the instruction cache.
14516
14517Semantics:
14518""""""""""
14519
14520On platforms with coherent instruction and data caches (e.g. x86), this
14521intrinsic is a nop. On platforms with non-coherent instruction and data
14522cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
14523instructions or a system call, if cache flushing requires special
14524privileges.
14525
14526The default behavior is to emit a call to ``__clear_cache`` from the run
14527time library.
14528
14529This intrinsic does *not* empty the instruction pipeline. Modifications
14530of the current function are outside the scope of the intrinsic.
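
A minimal sketch of making freshly written code visible before it is
executed. The function name ``@publish_code`` is illustrative, and the sketch
assumes the two operands are the start and the end of the modified range.

.. code-block:: llvm

      declare void @llvm.clear_cache(ptr, ptr)

      define void @publish_code(ptr %begin, i64 %size) {
      entry:
        ; Compute the end of the modified range.
        %end = getelementptr i8, ptr %begin, i64 %size
        call void @llvm.clear_cache(ptr %begin, ptr %end)
        ret void
      }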
14531
14532'``llvm.instrprof.increment``' Intrinsic
14533^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14534
14535Syntax:
14536"""""""
14537
14538::
14539
14540      declare void @llvm.instrprof.increment(ptr <name>, i64 <hash>,
14541                                             i32 <num-counters>, i32 <index>)
14542
14543Overview:
14544"""""""""
14545
14546The '``llvm.instrprof.increment``' intrinsic can be emitted by a
14547frontend for use with instrumentation based profiling. These will be
14548lowered by the ``-instrprof`` pass to generate execution counts of a
14549program at runtime.
14550
14551Arguments:
14552""""""""""
14553
14554The first argument is a pointer to a global variable containing the
14555name of the entity being instrumented. This should generally be the
14556(mangled) function name for a set of counters.
14557
14558The second argument is a hash value that can be used by the consumer
14559of the profile data to detect changes to the instrumented source, and
14560the third is the number of counters associated with ``name``. It is an
14561error if ``hash`` or ``num-counters`` differ between two instances of
14562``instrprof.increment`` that refer to the same name.
14563
14564The last argument refers to which of the counters for ``name`` should
be incremented. It should be at least 0 and less than ``num-counters``.
14566
14567Semantics:
14568""""""""""
14569
14570This intrinsic represents an increment of a profiling counter. It will
14571cause the ``-instrprof`` pass to generate the appropriate data
14572structures and the code to increment the appropriate value, in a
14573format that can be written out by a compiler runtime and consumed via
14574the ``llvm-profdata`` tool.
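
As a hedged sketch of what a frontend might emit for a function with two
counters: the global ``@__profn_foo``, the function ``@foo``, and the hash
value ``0`` are placeholders chosen for illustration.

.. code-block:: llvm

      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(ptr, i64, i32, i32)

      define void @foo(i1 %cond) {
      entry:
        ; Counter 0 of 2: function entry.
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 0)
        br i1 %cond, label %then, label %exit

      then:
        ; Counter 1 of 2: the conditional block. Same name, hash and count.
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 1)
        br label %exit

      exit:
        ret void
      }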
14575
14576.. FIXME: write complete doc on contextual instrumentation and link from here
14577.. and from llvm.instrprof.callsite.
14578
14579The intrinsic is lowered differently for contextual profiling by the
14580``-ctx-instr-lower`` pass. Here:
14581
14582* the entry basic block increment counter is lowered as a call to compiler-rt,
14583  to either ``__llvm_ctx_profile_start_context`` or
14584  ``__llvm_ctx_profile_get_context``. Either returns a pointer to a context object
14585  which contains a buffer into which counter increments can happen. Note that the
14586  pointer value returned by compiler-rt may have its LSB set - counter increments
14587  happen offset from the address with the LSB cleared.
14588
14589* all the other lowerings of ``llvm.instrprof.increment[.step]`` happen within
14590  that context.
14591
14592* the context is assumed to be a local value to the function, and no concurrency
14593  concerns need to be handled by LLVM.
14594
14595'``llvm.instrprof.increment.step``' Intrinsic
14596^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14597
14598Syntax:
14599"""""""
14600
14601::
14602
14603      declare void @llvm.instrprof.increment.step(ptr <name>, i64 <hash>,
14604                                                  i32 <num-counters>,
14605                                                  i32 <index>, i64 <step>)
14606
14607Overview:
14608"""""""""
14609
14610The '``llvm.instrprof.increment.step``' intrinsic is an extension to
14611the '``llvm.instrprof.increment``' intrinsic with an additional fifth
14612argument to specify the step of the increment.
14613
14614Arguments:
14615""""""""""
The first four arguments are the same as for the '``llvm.instrprof.increment``'
intrinsic.
14618
14619The last argument specifies the value of the increment of the counter variable.
14620
14621Semantics:
14622""""""""""
14623See description of '``llvm.instrprof.increment``' intrinsic.
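
A minimal sketch in which the increment amount is a runtime value. The names
``@__profn_bar`` and ``@bar`` and the hash value ``0`` are placeholders.

.. code-block:: llvm

      @__profn_bar = private constant [3 x i8] c"bar"

      declare void @llvm.instrprof.increment.step(ptr, i64, i32, i32, i64)

      define void @bar(i64 %weight) {
      entry:
        ; Add %weight, rather than 1, to counter 0.
        call void @llvm.instrprof.increment.step(ptr @__profn_bar, i64 0, i32 1, i32 0, i64 %weight)
        ret void
      }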
14624
14625'``llvm.instrprof.callsite``' Intrinsic
14626^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14627
14628Syntax:
14629"""""""
14630
14631::
14632
14633      declare void @llvm.instrprof.callsite(ptr <name>, i64 <hash>,
14634                                            i32 <num-counters>,
14635                                            i32 <index>, ptr <callsite>)
14636
14637Overview:
14638"""""""""
14639
14640The '``llvm.instrprof.callsite``' intrinsic should be emitted before a callsite
14641that's not to a "fake" callee (like another intrinsic or asm). It is used by
14642contextual profiling and has side-effects. Its lowering happens in IR, and
14643target-specific backends should never encounter it.
14644
14645Arguments:
14646""""""""""
14647The first 4 arguments are similar to ``llvm.instrprof.increment``. The indexing
14648is specific to callsites, meaning callsites are indexed from 0, independent from
14649the indexes used by the other intrinsics (such as
14650``llvm.instrprof.increment[.step]``).
14651
14652The last argument is the called value of the callsite this intrinsic precedes.
14653
14654Semantics:
14655""""""""""
14656
14657This is lowered by contextual profiling. In contextual profiling, functions get,
14658from compiler-rt, a pointer to a context object. The context object consists of
14659a buffer LLVM can use to perform counter increments (i.e. the lowering of
``llvm.instrprof.increment[.step]``). The address range following the counter
14661buffer, ``<num-counters>`` x ``sizeof(ptr)`` - sized, is expected to contain
14662pointers to contexts of functions called from this function ("subcontexts").
14663LLVM does not dereference into that memory region, just calculates GEPs.
14664
14665The lowering of ``llvm.instrprof.callsite`` consists of:
14666
14667* write to ``__llvm_ctx_profile_expected_callee`` the ``<callsite>`` value;
14668
14669* write to ``__llvm_ctx_profile_callsite`` the address into this function's
14670  context of the ``<index>`` position into the subcontexts region.
14671
14672
14673``__llvm_ctx_profile_{expected_callee|callsite}`` are initialized by compiler-rt
14674and are TLS. They are both vectors of pointers of size 2. The index into each is
14675determined when the current function obtains the pointer to its context from
14676compiler-rt. The pointer's LSB gives the index.
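
A minimal sketch of the placement relative to the call it describes. The names
``@__profn_caller``, ``@caller`` and ``@callee`` and the hash value ``0`` are
placeholders.

.. code-block:: llvm

      @__profn_caller = private constant [6 x i8] c"caller"

      declare void @llvm.instrprof.callsite(ptr, i64, i32, i32, ptr)
      declare void @callee()

      define void @caller() {
      entry:
        ; Callsite index 0; the last operand is the called value.
        call void @llvm.instrprof.callsite(ptr @__profn_caller, i64 0, i32 1, i32 0, ptr @callee)
        call void @callee()
        ret void
      }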
14677
14678
14679'``llvm.instrprof.timestamp``' Intrinsic
14680^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14681
14682Syntax:
14683"""""""
14684
14685::
14686
      declare void @llvm.instrprof.timestamp(ptr <name>, i64 <hash>,
14688                                             i32 <num-counters>, i32 <index>)
14689
14690Overview:
14691"""""""""
14692
14693The '``llvm.instrprof.timestamp``' intrinsic is used to implement temporal
14694profiling.
14695
14696Arguments:
14697""""""""""
14698The arguments are the same as '``llvm.instrprof.increment``'. The ``index`` is
14699expected to always be zero.
14700
14701Semantics:
14702""""""""""
14703Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores a
14704timestamp representing when this function was executed for the first time.
14705
14706'``llvm.instrprof.cover``' Intrinsic
14707^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14708
14709Syntax:
14710"""""""
14711
14712::
14713
14714      declare void @llvm.instrprof.cover(ptr <name>, i64 <hash>,
14715                                         i32 <num-counters>, i32 <index>)
14716
14717Overview:
14718"""""""""
14719
14720The '``llvm.instrprof.cover``' intrinsic is used to implement coverage
14721instrumentation.
14722
14723Arguments:
14724""""""""""
14725The arguments are the same as the first four arguments of
14726'``llvm.instrprof.increment``'.
14727
14728Semantics:
14729""""""""""
14730Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores zero to
14731the profiling variable to signify that the function has been covered. We store
14732zero because this is more efficient on some targets.
14733
14734'``llvm.instrprof.value.profile``' Intrinsic
14735^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14736
14737Syntax:
14738"""""""
14739
14740::
14741
14742      declare void @llvm.instrprof.value.profile(ptr <name>, i64 <hash>,
14743                                                 i64 <value>, i32 <value_kind>,
14744                                                 i32 <index>)
14745
14746Overview:
14747"""""""""
14748
14749The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
14750frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.
14753
14754Arguments:
14755""""""""""
14756
14757The first argument is a pointer to a global variable containing the
14758name of the entity being instrumented. ``name`` should generally be the
14759(mangled) function name for a set of counters.
14760
14761The second argument is a hash value that can be used by the consumer
14762of the profile data to detect changes to the instrumented source. It
14763is an error if ``hash`` differs between two instances of
14764``llvm.instrprof.*`` that refer to the same name.
14765
14766The third argument is the value of the expression being profiled. The profiled
14767expression's value should be representable as an unsigned 64-bit value. The
14768fourth argument represents the kind of value profiling that is being done. The
14769supported value profiling kinds are enumerated through the
14770``InstrProfValueKind`` type declared in the
14771``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
14772index of the instrumented expression within ``name``. It should be >= 0.
14773
14774Semantics:
14775""""""""""
14776
This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and
replace the ``llvm.instrprof.value.profile`` intrinsic with a call to the
profile runtime library with the proper arguments.
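
The following sketch shows how the target of an indirect call might be
profiled. The global ``@__profn_caller``, the hash ``12345``, and the use of
``0`` as the ``value_kind`` for indirect call targets are assumptions made for
illustration; check the ``InstrProfValueKind`` enumeration for the actual
values.

.. code-block:: llvm

      declare void @llvm.instrprof.value.profile(ptr, i64, i64, i32, i32)

      ; Hypothetical name variable for the function @caller.
      @__profn_caller = private constant [6 x i8] c"caller"

      define void @caller(ptr %callee) {
      entry:
        ; The profiled value must be representable as an unsigned 64-bit value.
        %target = ptrtoint ptr %callee to i64
        ; value_kind 0 is assumed to mean "indirect call target"; index 0 is
        ; the first instrumented expression within @caller.
        call void @llvm.instrprof.value.profile(ptr @__profn_caller, i64 12345,
                                                i64 %target, i32 0, i32 0)
        call void %callee()
        ret void
      }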
14782
14783'``llvm.instrprof.mcdc.parameters``' Intrinsic
14784^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14785
14786Syntax:
14787"""""""
14788
14789::
14790
14791      declare void @llvm.instrprof.mcdc.parameters(ptr <name>, i64 <hash>,
14792                                                   i32 <bitmap-bits>)
14793
14794Overview:
14795"""""""""
14796
14797The '``llvm.instrprof.mcdc.parameters``' intrinsic is used to initiate MC/DC
14798code coverage instrumentation for a function.
14799
14800Arguments:
14801""""""""""
14802
14803The first argument is a pointer to a global variable containing the
14804name of the entity being instrumented. This should generally be the
14805(mangled) function name for a set of counters.
14806
14807The second argument is a hash value that can be used by the consumer
14808of the profile data to detect changes to the instrumented source.
14809
14810The third argument is the number of bitmap bits required by the function to
14811record the number of test vectors executed for each boolean expression.
14812
14813Semantics:
14814""""""""""
14815
14816This intrinsic represents basic MC/DC parameters initiating one or more MC/DC
14817instrumentation sequences in a function. It will cause the ``-instrprof`` pass
14818to generate the appropriate data structures and the code to instrument MC/DC
14819test vectors in a format that can be written out by a compiler runtime and
14820consumed via the ``llvm-profdata`` tool.
14821
14822'``llvm.instrprof.mcdc.tvbitmap.update``' Intrinsic
14823^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14824
14825Syntax:
14826"""""""
14827
14828::
14829
14830      declare void @llvm.instrprof.mcdc.tvbitmap.update(ptr <name>, i64 <hash>,
14831                                                        i32 <bitmap-index>,
14832                                                        ptr <mcdc-temp-addr>)
14833
14834Overview:
14835"""""""""
14836
14837The '``llvm.instrprof.mcdc.tvbitmap.update``' intrinsic is used to track MC/DC
14838test vector execution after each boolean expression has been fully executed.
14839The overall value of the condition bitmap, after it has been successively
14840updated with the true or false evaluation of each condition, uniquely identifies
14841an executed MC/DC test vector and is used as a bit index into the global test
14842vector bitmap.
14843
14844Arguments:
14845""""""""""
14846
14847The first argument is a pointer to a global variable containing the
14848name of the entity being instrumented. This should generally be the
14849(mangled) function name for a set of counters.
14850
14851The second argument is a hash value that can be used by the consumer
14852of the profile data to detect changes to the instrumented source.
14853
14854The third argument is the bit index into the global test vector bitmap
14855corresponding to the function.
14856
14857The fourth argument is the address of the condition bitmap, which contains a
14858value representing an executed MC/DC test vector. It is loaded and used as the
14859bit index of the test vector bitmap.
14860
14861Semantics:
14862""""""""""
14863
14864This intrinsic represents the final operation of an MC/DC instrumentation
14865sequence and will cause the ``-instrprof`` pass to generate the code to
14866instrument an update of a function's global test vector bitmap to indicate that
14867a test vector has been executed. The global test vector bitmap can be consumed
14868by the ``llvm-profdata`` and ``llvm-cov`` tools.
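
As a rough sketch, an MC/DC instrumentation sequence for one boolean
expression could be bracketed by the two intrinsics as shown below. The global
``@__profn_f``, the hash ``12345``, the bitmap size ``4``, and the local
``%mcdc.addr`` are hypothetical; the code that records each condition into the
condition bitmap is elided.

.. code-block:: llvm

      declare void @llvm.instrprof.mcdc.parameters(ptr, i64, i32)
      declare void @llvm.instrprof.mcdc.tvbitmap.update(ptr, i64, i32, ptr)

      ; Hypothetical name variable for the function @f.
      @__profn_f = private constant [1 x i8] c"f"

      define void @f(i1 %a, i1 %b) {
      entry:
        ; Condition bitmap for the decision, cleared before evaluation.
        %mcdc.addr = alloca i32, align 4
        store i32 0, ptr %mcdc.addr, align 4
        ; Announce the number of bitmap bits this function needs (assumed: 4).
        call void @llvm.instrprof.mcdc.parameters(ptr @__profn_f, i64 12345, i32 4)
        ; ... conditions %a and %b are evaluated and recorded in %mcdc.addr ...
        ; Mark the executed test vector, starting at bit index 0 of the
        ; function's global test vector bitmap.
        call void @llvm.instrprof.mcdc.tvbitmap.update(ptr @__profn_f, i64 12345,
                                                       i32 0, ptr %mcdc.addr)
        ret void
      }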
14869
14870'``llvm.thread.pointer``' Intrinsic
14871^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14872
14873Syntax:
14874"""""""
14875
14876::
14877
14878      declare ptr @llvm.thread.pointer()
14879
14880Overview:
14881"""""""""
14882
14883The '``llvm.thread.pointer``' intrinsic returns the value of the thread
14884pointer.
14885
14886Semantics:
14887""""""""""
14888
14889The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
14890for the current thread.  The exact semantics of this value are target
14891specific: it may point to the start of TLS area, to the end, or somewhere
14892in the middle.  Depending on the target, this intrinsic may read a register,
14893call a helper function, read from an alternate memory space, or perform
14894other operations necessary to locate the TLS area.  Not all targets support
14895this intrinsic.
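
A minimal usage sketch follows; the wrapper function ``@get_thread_pointer``
is a hypothetical name. How the returned pointer relates to the TLS area
remains target specific.

.. code-block:: llvm

      declare ptr @llvm.thread.pointer()

      define ptr @get_thread_pointer() {
        ; Read the thread pointer for the current thread.
        %tp = call ptr @llvm.thread.pointer()
        ret ptr %tp
      }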
14896
14897'``llvm.call.preallocated.setup``' Intrinsic
14898^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14899
14900Syntax:
14901"""""""
14902
14903::
14904
14905      declare token @llvm.call.preallocated.setup(i32 %num_args)
14906
14907Overview:
14908"""""""""
14909
14910The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
14911be used with a call's ``"preallocated"`` operand bundle to indicate that
14912certain arguments are allocated and initialized before the call.
14913
14914Semantics:
14915""""""""""
14916
The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
``"preallocated"`` operand bundle for the corresponding call.
14922
Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]
14941
14942.. _int_call_preallocated_arg:
14943
14944'``llvm.call.preallocated.arg``' Intrinsic
14945^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14946
14947Syntax:
14948"""""""
14949
14950::
14951
14952      declare ptr @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
14953
14954Overview:
14955"""""""""
14956
14957The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
14958corresponding preallocated argument for the preallocated call.
14959
14960Semantics:
14961""""""""""
14962
The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
argument at index ``%arg_index`` among the arguments with the
``preallocated`` attribute for the call associated with the
``%setup_token``, which must be from '``llvm.call.preallocated.setup``'.
14967
14968A call to '``llvm.call.preallocated.arg``' must have a call site
14969``preallocated`` attribute. The type of the ``preallocated`` attribute must
14970match the type used by the ``preallocated`` attribute of the corresponding
14971argument at the preallocated call. The type is used in the case that an
14972``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
14973to DCE), where otherwise we cannot know how large the arguments are.
14974
14975It is undefined behavior if this is called with a token from an
14976'``llvm.call.preallocated.setup``' if another
14977'``llvm.call.preallocated.setup``' has already been called or if the
14978preallocated call corresponding to the '``llvm.call.preallocated.setup``'
14979has already been called.
14980
14981.. _int_call_preallocated_teardown:
14982
14983'``llvm.call.preallocated.teardown``' Intrinsic
14984^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14985
14986Syntax:
14987"""""""
14988
14989::
14990
      declare void @llvm.call.preallocated.teardown(token %setup_token)
14992
14993Overview:
14994"""""""""
14995
14996The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
14997created by a '``llvm.call.preallocated.setup``'.
14998
14999Semantics:
15000""""""""""
15001
The token argument must be the result of a call to
'``llvm.call.preallocated.setup``'.
15003
15004The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
15005allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
15006one of this or the preallocated call must be called to prevent stack leaks.
15007It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
15008and the preallocated call for a given '``llvm.call.preallocated.setup``'.
15009
15010For example, if the stack is allocated for a preallocated call by a
15011'``llvm.call.preallocated.setup``', then an initializer function called on an
15012allocated argument throws an exception, there should be a
15013'``llvm.call.preallocated.teardown``' in the exception handler to prevent
15014stack leaks.
15015
15016Following the nesting rules in '``llvm.call.preallocated.setup``', nested
15017calls to '``llvm.call.preallocated.setup``' and
15018'``llvm.call.preallocated.teardown``' are allowed but must be properly
15019nested.
15020
15021Example:
15022""""""""
15023
15024.. code-block:: llvm
15025
15026        %cs = call token @llvm.call.preallocated.setup(i32 1)
15027        %x = call ptr @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
15028        invoke void @constructor(ptr %x) to label %conta unwind label %contb
15029    conta:
15030        call void @foo1(ptr preallocated(i32) %x) ["preallocated"(token %cs)]
15031        ret void
15032    contb:
15033        %s = catchswitch within none [label %catch] unwind to caller
15034    catch:
15035        %p = catchpad within %s []
15036        call void @llvm.call.preallocated.teardown(token %cs)
15037        ret void
15038
15039Standard C/C++ Library Intrinsics
15040---------------------------------
15041
15042LLVM provides intrinsics for a few important standard C/C++ library
15043functions. These intrinsics allow source-language front-ends to pass
15044information about the alignment of the pointer arguments to the code
15045generator, providing opportunity for more efficient code generation.
15046
15047.. _int_abs:
15048
15049'``llvm.abs.*``' Intrinsic
15050^^^^^^^^^^^^^^^^^^^^^^^^^^
15051
15052Syntax:
15053"""""""
15054
15055This is an overloaded intrinsic. You can use ``llvm.abs`` on any
15056integer bit width or any vector of integer elements.
15057
15058::
15059
15060      declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
15061      declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
15062
15063Overview:
15064"""""""""
15065
15066The '``llvm.abs``' family of intrinsic functions returns the absolute value
15067of an argument.
15068
15069Arguments:
15070""""""""""
15071
15072The first argument is the value for which the absolute value is to be returned.
15073This argument may be of any integer type or a vector with integer element type.
15074The return type must match the first argument type.
15075
15076The second argument must be a constant and is a flag to indicate whether the
15077result value of the '``llvm.abs``' intrinsic is a
15078:ref:`poison value <poisonvalues>` if the first argument is statically or
15079dynamically an ``INT_MIN`` value.
15080
15081Semantics:
15082""""""""""
15083
The '``llvm.abs``' intrinsic returns the magnitude (always positive) of the
first argument or each element of a vector argument. If the first argument is
``INT_MIN``, then the result is also ``INT_MIN`` if ``is_int_min_poison == 0``
and ``poison`` otherwise.
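
The effect of the ``is_int_min_poison`` flag can be illustrated with a short
sketch; the function name ``@abs_examples`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      define i32 @abs_examples() {
        %r0 = call i32 @llvm.abs.i32(i32 -7, i1 false)           ; yields i32 7
        ; -2147483648 is INT_MIN for i32.
        %r1 = call i32 @llvm.abs.i32(i32 -2147483648, i1 false)  ; yields i32 -2147483648
        %r2 = call i32 @llvm.abs.i32(i32 -2147483648, i1 true)   ; yields poison
        ret i32 %r0
      }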
15088
15089
15090.. _int_smax:
15091
15092'``llvm.smax.*``' Intrinsic
15093^^^^^^^^^^^^^^^^^^^^^^^^^^^
15094
15095Syntax:
15096"""""""
15097
15098This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
15099integer bit width or any vector of integer elements.
15100
15101::
15102
15103      declare i32 @llvm.smax.i32(i32 %a, i32 %b)
15104      declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
15105
15106Overview:
15107"""""""""
15108
15109Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
15110Vector intrinsics operate on a per-element basis. The larger element of ``%a``
15111and ``%b`` at a given index is returned for that index.
15112
15113Arguments:
15114""""""""""
15115
15116The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15117integer element type. The argument types must match each other, and the return
15118type must match the argument type.
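
A brief sketch of the scalar and vector forms follows; the function name
``@smax_examples`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)

      define <4 x i32> @smax_examples() {
        %s = call i32 @llvm.smax.i32(i32 -3, i32 2)  ; yields i32 2
        ; The vector form selects the larger element at each index.
        %v = call <4 x i32> @llvm.smax.v4i32(<4 x i32> <i32 1, i32 -2, i32 5, i32 0>,
                                             <4 x i32> <i32 0, i32 3, i32 -5, i32 0>)
        ; %v is <4 x i32> <i32 1, i32 3, i32 5, i32 0>
        ret <4 x i32> %v
      }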
15119
15120
15121.. _int_smin:
15122
15123'``llvm.smin.*``' Intrinsic
15124^^^^^^^^^^^^^^^^^^^^^^^^^^^
15125
15126Syntax:
15127"""""""
15128
15129This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
15130integer bit width or any vector of integer elements.
15131
15132::
15133
15134      declare i32 @llvm.smin.i32(i32 %a, i32 %b)
15135      declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
15136
15137Overview:
15138"""""""""
15139
15140Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
15141Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
15142and ``%b`` at a given index is returned for that index.
15143
15144Arguments:
15145""""""""""
15146
15147The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15148integer element type. The argument types must match each other, and the return
15149type must match the argument type.
15150
15151
15152.. _int_umax:
15153
15154'``llvm.umax.*``' Intrinsic
15155^^^^^^^^^^^^^^^^^^^^^^^^^^^
15156
15157Syntax:
15158"""""""
15159
15160This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
15161integer bit width or any vector of integer elements.
15162
15163::
15164
15165      declare i32 @llvm.umax.i32(i32 %a, i32 %b)
15166      declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
15167
15168Overview:
15169"""""""""
15170
15171Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
15172integers. Vector intrinsics operate on a per-element basis. The larger element
15173of ``%a`` and ``%b`` at a given index is returned for that index.
15174
15175Arguments:
15176""""""""""
15177
15178The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15179integer element type. The argument types must match each other, and the return
15180type must match the argument type.
15181
15182
15183.. _int_umin:
15184
15185'``llvm.umin.*``' Intrinsic
15186^^^^^^^^^^^^^^^^^^^^^^^^^^^
15187
15188Syntax:
15189"""""""
15190
15191This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
15192integer bit width or any vector of integer elements.
15193
15194::
15195
15196      declare i32 @llvm.umin.i32(i32 %a, i32 %b)
15197      declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
15198
15199Overview:
15200"""""""""
15201
15202Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
15203integers. Vector intrinsics operate on a per-element basis. The smaller element
15204of ``%a`` and ``%b`` at a given index is returned for that index.
15205
15206Arguments:
15207""""""""""
15208
15209The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15210integer element type. The argument types must match each other, and the return
15211type must match the argument type.
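
The signed and unsigned variants differ only in how the operands are
interpreted. The sketch below applies all four min/max intrinsics to the same
operands; the function name ``@signedness_examples`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.smin.i32(i32, i32)
      declare i32 @llvm.umin.i32(i32, i32)
      declare i32 @llvm.smax.i32(i32, i32)
      declare i32 @llvm.umax.i32(i32, i32)

      define i32 @signedness_examples() {
        ; Interpreted as an unsigned i32, -1 is 4294967295, the largest value.
        %smin = call i32 @llvm.smin.i32(i32 -1, i32 1)  ; yields i32 -1
        %umin = call i32 @llvm.umin.i32(i32 -1, i32 1)  ; yields i32 1
        %smax = call i32 @llvm.smax.i32(i32 -1, i32 1)  ; yields i32 1
        %umax = call i32 @llvm.umax.i32(i32 -1, i32 1)  ; yields i32 -1
        ret i32 %umin
      }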
15212
15213.. _int_scmp:
15214
15215'``llvm.scmp.*``' Intrinsic
15216^^^^^^^^^^^^^^^^^^^^^^^^^^^
15217
15218Syntax:
15219"""""""
15220
15221This is an overloaded intrinsic. You can use ``@llvm.scmp`` on any
15222integer bit width or any vector of integer elements.
15223
15224::
15225
15226      declare i2 @llvm.scmp.i2.i32(i32 %a, i32 %b)
15227      declare <4 x i32> @llvm.scmp.v4i32.v4i32(<4 x i32> %a, <4 x i32> %b)
15228
15229Overview:
15230"""""""""
15231
15232Return ``-1`` if ``%a`` is signed less than ``%b``, ``0`` if they are equal, and
15233``1`` if ``%a`` is signed greater than ``%b``. Vector intrinsics operate on a per-element basis.
15234
15235Arguments:
15236""""""""""
15237
15238The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15239integer element type. The argument types must match each other, and the return
15240type must be at least as wide as ``i2``, to hold the three possible return values.
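
The three possible results can be sketched as follows, using both the minimal
``i2`` return type and a wider ``i8`` return type; the function name
``@scmp_examples`` is hypothetical.

.. code-block:: llvm

      declare i2 @llvm.scmp.i2.i32(i32, i32)
      declare i8 @llvm.scmp.i8.i32(i32, i32)

      define i8 @scmp_examples() {
        %lt = call i2 @llvm.scmp.i2.i32(i32 -5, i32 3)  ; yields i2 -1
        %eq = call i2 @llvm.scmp.i2.i32(i32 7, i32 7)   ; yields i2 0
        %gt = call i8 @llvm.scmp.i8.i32(i32 3, i32 -5)  ; yields i8 1
        ret i8 %gt
      }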
15241
15242.. _int_ucmp:
15243
15244'``llvm.ucmp.*``' Intrinsic
15245^^^^^^^^^^^^^^^^^^^^^^^^^^^
15246
15247Syntax:
15248"""""""
15249
15250This is an overloaded intrinsic. You can use ``@llvm.ucmp`` on any
15251integer bit width or any vector of integer elements.
15252
15253::
15254
15255      declare i2 @llvm.ucmp.i2.i32(i32 %a, i32 %b)
15256      declare <4 x i32> @llvm.ucmp.v4i32.v4i32(<4 x i32> %a, <4 x i32> %b)
15257
15258Overview:
15259"""""""""
15260
15261Return ``-1`` if ``%a`` is unsigned less than ``%b``, ``0`` if they are equal, and
15262``1`` if ``%a`` is unsigned greater than ``%b``. Vector intrinsics operate on a per-element basis.
15263
15264Arguments:
15265""""""""""
15266
15267The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15268integer element type. The argument types must match each other, and the return
15269type must be at least as wide as ``i2``, to hold the three possible return values.
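
A short sketch of the unsigned comparison follows; the function name
``@ucmp_examples`` is hypothetical. Note that the operand ``-5`` compares as a
large unsigned value.

.. code-block:: llvm

      declare i8 @llvm.ucmp.i8.i32(i32, i32)

      define i8 @ucmp_examples() {
        ; Interpreted as an unsigned i32, -5 is 4294967291.
        %gt = call i8 @llvm.ucmp.i8.i32(i32 -5, i32 3)  ; yields i8 1
        %lt = call i8 @llvm.ucmp.i8.i32(i32 3, i32 -5)  ; yields i8 -1
        %eq = call i8 @llvm.ucmp.i8.i32(i32 3, i32 3)   ; yields i8 0
        ret i8 %gt
      }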
15270
15271.. _int_memcpy:
15272
15273'``llvm.memcpy``' Intrinsic
15274^^^^^^^^^^^^^^^^^^^^^^^^^^^
15275
15276Syntax:
15277"""""""
15278
15279This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
15280integer bit width and for different address spaces. Not all targets
15281support all bit widths however.
15282
15283::
15284
15285      declare void @llvm.memcpy.p0.p0.i32(ptr <dest>, ptr <src>,
15286                                          i32 <len>, i1 <isvolatile>)
15287      declare void @llvm.memcpy.p0.p0.i64(ptr <dest>, ptr <src>,
15288                                          i64 <len>, i1 <isvolatile>)
15289
15290Overview:
15291"""""""""
15292
15293The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
15294source location to the destination location.
15295
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
15299
15300Arguments:
15301""""""""""
15302
15303The first argument is a pointer to the destination, the second is a
15304pointer to the source. The third argument is an integer argument
15305specifying the number of bytes to copy, and the fourth is a
15306boolean indicating a volatile access.
15307
15308The :ref:`align <attr_align>` parameter attribute can be provided
15309for the first and second arguments.
15310
15311If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
15312a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15313very cleanly specified and it is unwise to depend on it.
15314
15315Semantics:
15316""""""""""
15317
15318The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
15319location to the destination location, which must either be equal or
15320non-overlapping. It copies "len" bytes of memory over. If the argument is known
15321to be aligned to some boundary, this can be specified as an attribute on the
15322argument.
15323
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
15325the arguments.
15326If ``<len>`` is not a well-defined value, the behavior is undefined.
15327If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
15328otherwise the behavior is undefined.
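
As a minimal sketch, a non-volatile copy of 16 bytes with alignment
information attached to both pointers could look like this. The function name
``@copy16`` and the chosen alignment are illustrative assumptions.

.. code-block:: llvm

      declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy16(ptr %dst, ptr %src) {
        ; Copy 16 bytes. The align attributes assert that both pointers are
        ; 8-byte aligned. %dst and %src must be equal or non-overlapping.
        call void @llvm.memcpy.p0.p0.i64(ptr align 8 %dst, ptr align 8 %src,
                                         i64 16, i1 false)
        ret void
      }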
15329
15330.. _int_memcpy_inline:
15331
15332'``llvm.memcpy.inline``' Intrinsic
15333^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15334
15335Syntax:
15336"""""""
15337
15338This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
15339integer bit width and for different address spaces. Not all targets
15340support all bit widths however.
15341
15342::
15343
15344      declare void @llvm.memcpy.inline.p0.p0.i32(ptr <dest>, ptr <src>,
15345                                                 i32 <len>, i1 <isvolatile>)
15346      declare void @llvm.memcpy.inline.p0.p0.i64(ptr <dest>, ptr <src>,
15347                                                 i64 <len>, i1 <isvolatile>)
15348
15349Overview:
15350"""""""""
15351
The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
15359
15360Arguments:
15361""""""""""
15362
15363The first argument is a pointer to the destination, the second is a
15364pointer to the source. The third argument is an integer argument
15365specifying the number of bytes to copy, and the fourth is a
15366boolean indicating a volatile access.
15367
15368The :ref:`align <attr_align>` parameter attribute can be provided
15369for the first and second arguments.
15370
15371If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
15372a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15373very cleanly specified and it is unwise to depend on it.
15374
15375Semantics:
15376""""""""""
15377
15378The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
15379source location to the destination location, which are not allowed to
15380overlap. It copies "len" bytes of memory over. If the argument is known
15381to be aligned to some boundary, this can be specified as an attribute on
15382the argument.
15383The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
15384'``llvm.memcpy.*``', but the generated code is guaranteed not to call any
15385external functions.
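
A sketch of a small fixed-size copy that must not be lowered to a library call
follows; the function name ``@copy8_inline`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy8_inline(ptr %dst, ptr %src) {
        ; Copy 8 bytes without emitting a call to an external function.
        call void @llvm.memcpy.inline.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src,
                                                i64 8, i1 false)
        ret void
      }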
15386
15387.. _int_memmove:
15388
15389'``llvm.memmove``' Intrinsic
15390^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15391
15392Syntax:
15393"""""""
15394
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.
15398
15399::
15400
15401      declare void @llvm.memmove.p0.p0.i32(ptr <dest>, ptr <src>,
15402                                           i32 <len>, i1 <isvolatile>)
15403      declare void @llvm.memmove.p0.p0.i64(ptr <dest>, ptr <src>,
15404                                           i64 <len>, i1 <isvolatile>)
15405
15406Overview:
15407"""""""""
15408
15409The '``llvm.memmove.*``' intrinsics move a block of memory from the
15410source location to the destination location. It is similar to the
15411'``llvm.memcpy``' intrinsic but allows the two memory locations to
15412overlap.
15413
Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
15417
15418Arguments:
15419""""""""""
15420
15421The first argument is a pointer to the destination, the second is a
15422pointer to the source. The third argument is an integer argument
15423specifying the number of bytes to copy, and the fourth is a
15424boolean indicating a volatile access.
15425
15426The :ref:`align <attr_align>` parameter attribute can be provided
15427for the first and second arguments.
15428
15429If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
15430is a :ref:`volatile operation <volatile>`. The detailed access behavior is
15431not very cleanly specified and it is unwise to depend on it.
15432
15433Semantics:
15434""""""""""
15435
15436The '``llvm.memmove.*``' intrinsics copy a block of memory from the
15437source location to the destination location, which may overlap. It
15438copies "len" bytes of memory over. If the argument is known to be
15439aligned to some boundary, this can be specified as an attribute on
15440the argument.
15441
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
15443the arguments.
15444If ``<len>`` is not a well-defined value, the behavior is undefined.
15445If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
15446otherwise the behavior is undefined.
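
The sketch below shifts the contents of a 16-byte buffer down by one byte, a
case where the source and destination ranges overlap. The function name
``@shift_down`` and the buffer size are illustrative assumptions.

.. code-block:: llvm

      declare void @llvm.memmove.p0.p0.i64(ptr, ptr, i64, i1)

      define void @shift_down(ptr %buf) {
        %src = getelementptr inbounds i8, ptr %buf, i64 1
        ; The source bytes [1, 16) overlap the destination bytes [0, 15).
        call void @llvm.memmove.p0.p0.i64(ptr %buf, ptr %src, i64 15, i1 false)
        ret void
      }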
15447
15448.. _int_memset:
15449
15450'``llvm.memset.*``' Intrinsics
15451^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15452
15453Syntax:
15454"""""""
15455
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.
15459
15460::
15461
15462      declare void @llvm.memset.p0.i32(ptr <dest>, i8 <val>,
15463                                       i32 <len>, i1 <isvolatile>)
15464      declare void @llvm.memset.p0.i64(ptr <dest>, i8 <val>,
15465                                       i64 <len>, i1 <isvolatile>)
15466
15467Overview:
15468"""""""""
15469
15470The '``llvm.memset.*``' intrinsics fill a block of memory with a
15471particular byte value.
15472
15473Note that, unlike the standard libc function, the ``llvm.memset``
15474intrinsic does not return a value and takes an extra volatile
15475argument. Also, the destination can be in an arbitrary address space.
15476
15477Arguments:
15478""""""""""
15479
15480The first argument is a pointer to the destination to fill, the second
15481is the byte value with which to fill it, the third argument is an
15482integer argument specifying the number of bytes to fill, and the fourth
15483is a boolean indicating a volatile access.
15484
15485The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
15487
15488If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
15489a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15490very cleanly specified and it is unwise to depend on it.
15491
15492Semantics:
15493""""""""""
15494
15495The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
15496at the destination location. If the argument is known to be
15497aligned to some boundary, this can be specified as an attribute on
15498the argument.
15499
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
15501the arguments.
15502If ``<len>`` is not a well-defined value, the behavior is undefined.
15503If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
15504behavior is undefined.
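
As a minimal sketch, zeroing a 32-byte region could be written as follows. The
function name ``@zero32`` and the chosen alignment are illustrative
assumptions.

.. code-block:: llvm

      declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

      define void @zero32(ptr %p) {
        ; Fill 32 bytes starting at %p with the byte value 0.
        call void @llvm.memset.p0.i64(ptr align 16 %p, i8 0, i64 32, i1 false)
        ret void
      }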
15505
15506.. _int_memset_inline:
15507
15508'``llvm.memset.inline``' Intrinsic
15509^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15510
15511Syntax:
15512"""""""
15513
15514This is an overloaded intrinsic. You can use ``llvm.memset.inline`` on any
15515integer bit width and for different address spaces. Not all targets
15516support all bit widths however.
15517
15518::
15519
      declare void @llvm.memset.inline.p0.i32(ptr <dest>, i8 <val>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.inline.p0.i64(ptr <dest>, i8 <val>,
                                              i64 <len>, i1 <isvolatile>)
15524
15525Overview:
15526"""""""""
15527
The '``llvm.memset.inline.*``' intrinsics fill a block of memory with a
particular byte value and guarantee that no external functions are called.
15530
15531Note that, unlike the standard libc function, the ``llvm.memset.inline.*``
15532intrinsics do not return a value, take an extra isvolatile argument and the
15533pointer can be in specified address spaces.
15534
15535Arguments:
15536""""""""""
15537
15538The first argument is a pointer to the destination to fill, the second
15539is the byte value with which to fill it, the third argument is a constant
15540integer argument specifying the number of bytes to fill, and the fourth
15541is a boolean indicating a volatile access.
15542
15543The :ref:`align <attr_align>` parameter attribute can be provided
15544for the first argument.
15545
15546If the ``isvolatile`` parameter is ``true``, the ``llvm.memset.inline`` call is
15547a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15548very cleanly specified and it is unwise to depend on it.
15549
15550Semantics:
15551""""""""""
15552
15553The '``llvm.memset.inline.*``' intrinsics fill "len" bytes of memory starting
15554at the destination location. If the argument is known to be
15555aligned to some boundary, this can be specified as an attribute on
15556the argument.
15557
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
15559the arguments.
15560If ``<len>`` is not a well-defined value, the behavior is undefined.
15561If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
15562behavior is undefined.
15563
15564The behavior of '``llvm.memset.inline.*``' is equivalent to the behavior of
15565'``llvm.memset.*``', but the generated code is guaranteed not to call any
15566external functions.
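
A sketch of a small fill that must not be lowered to a library call follows.
The function name ``@fill16_inline`` is hypothetical, and the intrinsic name
assumes the usual overload suffixes for the destination pointer and length
types.

.. code-block:: llvm

      declare void @llvm.memset.inline.p0.i64(ptr, i8, i64, i1)

      define void @fill16_inline(ptr %p) {
        ; Fill 16 bytes with the byte value 255 (written as the signed i8 -1)
        ; without calling an external function.
        call void @llvm.memset.inline.p0.i64(ptr align 8 %p, i8 -1, i64 16, i1 false)
        ret void
      }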
15567
15568.. _int_experimental_memset_pattern:
15569
15570'``llvm.experimental.memset.pattern``' Intrinsic
15571^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15572
15573Syntax:
15574"""""""
15575
15576This is an overloaded intrinsic. You can use
15577``llvm.experimental.memset.pattern`` on any integer bit width and for
15578different address spaces. Not all targets support all bit widths however.
15579
15580::
15581
15582      declare void @llvm.experimental.memset.pattern.p0.i128.i64(ptr <dest>, i128 <val>,
15583                                                                 i64 <count>, i1 <isvolatile>)
15584
15585Overview:
15586"""""""""
15587
15588The '``llvm.experimental.memset.pattern.*``' intrinsics fill a block of memory
15589with a particular value. This may be expanded to an inline loop, a sequence of
15590stores, or a libcall depending on what is available for the target and the
15591expected performance and code size impact.
15592
15593Arguments:
15594""""""""""
15595
15596The first argument is a pointer to the destination to fill, the second
15597is the value with which to fill it, the third argument is an integer
15598argument specifying the number of times to fill the value, and the fourth is a
15599boolean indicating a volatile access.
15600
15601The :ref:`align <attr_align>` parameter attribute can be provided
15602for the first argument.
15603
15604If the ``isvolatile`` parameter is ``true``, the
15605``llvm.experimental.memset.pattern`` call is a :ref:`volatile operation
15606<volatile>`. The detailed access behavior is not very cleanly specified and it
15607is unwise to depend on it.
15608
15609Semantics:
15610""""""""""
15611
The '``llvm.experimental.memset.pattern.*``' intrinsics fill memory starting at
the destination location with the given pattern ``<count>`` times,
advancing by the allocation size of the pattern type after each store. The
stores follow the usual semantics of store instructions, including regarding
endianness and padding. If the destination is known to be aligned to some
boundary, this can be specified with the ``align`` attribute on the first
argument.
15618
If ``<count>`` is 0, the call is a no-op modulo the behavior of attributes
attached to the arguments.
15621If ``<count>`` is not a well-defined value, the behavior is undefined.
15622If ``<count>`` is not zero, ``<dest>`` should be well-defined, otherwise the
15623behavior is undefined.
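
Example:
""""""""

The following is a minimal sketch of a call, not a normative example. The
function name ``@fill_words`` and the value names are illustrative only, the
``.p0.i32.i64`` suffix is formed by analogy with the declaration above, and the
byte count in the comment assumes a data layout in which ``i32`` has an
allocation size of 4 bytes.

.. code-block:: llvm

      declare void @llvm.experimental.memset.pattern.p0.i32.i64(ptr, i32, i64, i1)

      define void @fill_words(ptr %buf) {
        ; Store the i32 pattern 0x12345678 eight times. With a 4-byte
        ; allocation size for i32 this writes 32 bytes starting at %buf.
        call void @llvm.experimental.memset.pattern.p0.i32.i64(ptr align 4 %buf, i32 305419896, i64 8, i1 false)
        ret void
      }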
15624
15625.. _int_sqrt:
15626
15627'``llvm.sqrt.*``' Intrinsic
15628^^^^^^^^^^^^^^^^^^^^^^^^^^^
15629
15630Syntax:
15631"""""""
15632
15633This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
15634floating-point or vector of floating-point type. Not all targets support
15635all types however.
15636
15637::
15638
15639      declare float     @llvm.sqrt.f32(float %Val)
15640      declare double    @llvm.sqrt.f64(double %Val)
15641      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
15642      declare fp128     @llvm.sqrt.f128(fp128 %Val)
15643      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
15644
15645Overview:
15646"""""""""
15647
15648The '``llvm.sqrt``' intrinsics return the square root of the specified value.
15649
15650Arguments:
15651""""""""""
15652
15653The argument and return value are floating-point numbers of the same type.
15654
15655Semantics:
15656""""""""""
15657
15658Return the same value as a corresponding libm '``sqrt``' function but without
15659trapping or setting ``errno``. For types specified by IEEE-754, the result
15660matches a conforming libm implementation.
15661
15662When specified with the fast-math-flag 'afn', the result may be approximated
15663using a less accurate calculation.
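
Example:
""""""""

As an illustration only, the sketch below contrasts a plain call with one
carrying the ``afn`` flag on a vector overload. The function names
``@exact_sqrt`` and ``@approx_sqrt`` are hypothetical.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)
      declare <4 x float> @llvm.sqrt.v4f32(<4 x float>)

      define float @exact_sqrt(float %x) {
        ; No fast-math flags: the result follows the semantics described above.
        %r = call float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      define <4 x float> @approx_sqrt(<4 x float> %v) {
        ; 'afn' permits a less accurate approximation for each lane.
        %r = call afn <4 x float> @llvm.sqrt.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }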
15664
15665'``llvm.powi.*``' Intrinsic
15666^^^^^^^^^^^^^^^^^^^^^^^^^^^
15667
15668Syntax:
15669"""""""
15670
15671This is an overloaded intrinsic. You can use ``llvm.powi`` on any
15672floating-point or vector of floating-point type. Not all targets support
15673all types however.
15674
Generally, the only supported type for the exponent is the one matching
the C type ``int``.
15677
15678::
15679
15680      declare float     @llvm.powi.f32.i32(float  %Val, i32 %power)
15681      declare double    @llvm.powi.f64.i16(double %Val, i16 %power)
15682      declare x86_fp80  @llvm.powi.f80.i32(x86_fp80  %Val, i32 %power)
15683      declare fp128     @llvm.powi.f128.i32(fp128 %Val, i32 %power)
15684      declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128  %Val, i32 %power)
15685
15686Overview:
15687"""""""""
15688
15689The '``llvm.powi.*``' intrinsics return the first operand raised to the
15690specified (positive or negative) power. The order of evaluation of
15691multiplications is not defined. When a vector of floating-point type is
15692used, the second argument remains a scalar integer value.
15693
15694Arguments:
15695""""""""""
15696
15697The second argument is an integer power, and the first is a value to
15698raise to that power.
15699
15700Semantics:
15701""""""""""
15702
15703This function returns the first value raised to the second power with an
15704unspecified sequence of rounding operations.
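
Example:
""""""""

A short sketch, assuming a target where the exponent type is ``i32``. It shows
a scalar call and a vector call in which the exponent remains a scalar integer.
The function names ``@cube`` and ``@vec_powi`` are illustrative.

.. code-block:: llvm

      declare float @llvm.powi.f32.i32(float, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define float @cube(float %x) {
        ; %x raised to the third power; the multiplication order is unspecified.
        %r = call float @llvm.powi.f32.i32(float %x, i32 3)
        ret float %r
      }

      define <4 x float> @vec_powi(<4 x float> %v, i32 %n) {
        ; Every lane of %v is raised to the same scalar power %n.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 %n)
        ret <4 x float> %r
      }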
15705
15706.. _t_llvm_sin:
15707
15708'``llvm.sin.*``' Intrinsic
15709^^^^^^^^^^^^^^^^^^^^^^^^^^
15710
15711Syntax:
15712"""""""
15713
15714This is an overloaded intrinsic. You can use ``llvm.sin`` on any
15715floating-point or vector of floating-point type. Not all targets support
15716all types however.
15717
15718::
15719
15720      declare float     @llvm.sin.f32(float  %Val)
15721      declare double    @llvm.sin.f64(double %Val)
15722      declare x86_fp80  @llvm.sin.f80(x86_fp80  %Val)
15723      declare fp128     @llvm.sin.f128(fp128 %Val)
15724      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128  %Val)
15725
15726Overview:
15727"""""""""
15728
15729The '``llvm.sin.*``' intrinsics return the sine of the operand.
15730
15731Arguments:
15732""""""""""
15733
15734The argument and return value are floating-point numbers of the same type.
15735
15736Semantics:
15737""""""""""
15738
15739Return the same value as a corresponding libm '``sin``' function but without
15740trapping or setting ``errno``.
15741
15742When specified with the fast-math-flag 'afn', the result may be approximated
15743using a less accurate calculation.
15744
15745.. _t_llvm_cos:
15746
15747'``llvm.cos.*``' Intrinsic
15748^^^^^^^^^^^^^^^^^^^^^^^^^^
15749
15750Syntax:
15751"""""""
15752
15753This is an overloaded intrinsic. You can use ``llvm.cos`` on any
15754floating-point or vector of floating-point type. Not all targets support
15755all types however.
15756
15757::
15758
15759      declare float     @llvm.cos.f32(float  %Val)
15760      declare double    @llvm.cos.f64(double %Val)
15761      declare x86_fp80  @llvm.cos.f80(x86_fp80  %Val)
15762      declare fp128     @llvm.cos.f128(fp128 %Val)
15763      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128  %Val)
15764
15765Overview:
15766"""""""""
15767
15768The '``llvm.cos.*``' intrinsics return the cosine of the operand.
15769
15770Arguments:
15771""""""""""
15772
15773The argument and return value are floating-point numbers of the same type.
15774
15775Semantics:
15776""""""""""
15777
15778Return the same value as a corresponding libm '``cos``' function but without
15779trapping or setting ``errno``.
15780
15781When specified with the fast-math-flag 'afn', the result may be approximated
15782using a less accurate calculation.
15783
15784'``llvm.tan.*``' Intrinsic
15785^^^^^^^^^^^^^^^^^^^^^^^^^^
15786
15787Syntax:
15788"""""""
15789
15790This is an overloaded intrinsic. You can use ``llvm.tan`` on any
15791floating-point or vector of floating-point type. Not all targets support
15792all types however.
15793
15794::
15795
15796      declare float     @llvm.tan.f32(float  %Val)
15797      declare double    @llvm.tan.f64(double %Val)
15798      declare x86_fp80  @llvm.tan.f80(x86_fp80  %Val)
15799      declare fp128     @llvm.tan.f128(fp128 %Val)
15800      declare ppc_fp128 @llvm.tan.ppcf128(ppc_fp128  %Val)
15801
15802Overview:
15803"""""""""
15804
15805The '``llvm.tan.*``' intrinsics return the tangent of the operand.
15806
15807Arguments:
15808""""""""""
15809
15810The argument and return value are floating-point numbers of the same type.
15811
15812Semantics:
15813""""""""""
15814
15815Return the same value as a corresponding libm '``tan``' function but without
15816trapping or setting ``errno``.
15817
15818When specified with the fast-math-flag 'afn', the result may be approximated
15819using a less accurate calculation.
15820
15821'``llvm.asin.*``' Intrinsic
15822^^^^^^^^^^^^^^^^^^^^^^^^^^^
15823
15824Syntax:
15825"""""""
15826
15827This is an overloaded intrinsic. You can use ``llvm.asin`` on any
15828floating-point or vector of floating-point type. Not all targets support
15829all types however.
15830
15831::
15832
15833      declare float     @llvm.asin.f32(float  %Val)
15834      declare double    @llvm.asin.f64(double %Val)
15835      declare x86_fp80  @llvm.asin.f80(x86_fp80  %Val)
15836      declare fp128     @llvm.asin.f128(fp128 %Val)
15837      declare ppc_fp128 @llvm.asin.ppcf128(ppc_fp128  %Val)
15838
15839Overview:
15840"""""""""
15841
15842The '``llvm.asin.*``' intrinsics return the arcsine of the operand.
15843
15844Arguments:
15845""""""""""
15846
15847The argument and return value are floating-point numbers of the same type.
15848
15849Semantics:
15850""""""""""
15851
15852Return the same value as a corresponding libm '``asin``' function but without
15853trapping or setting ``errno``.
15854
15855When specified with the fast-math-flag 'afn', the result may be approximated
15856using a less accurate calculation.
15857
15858'``llvm.acos.*``' Intrinsic
15859^^^^^^^^^^^^^^^^^^^^^^^^^^^
15860
15861Syntax:
15862"""""""
15863
15864This is an overloaded intrinsic. You can use ``llvm.acos`` on any
15865floating-point or vector of floating-point type. Not all targets support
15866all types however.
15867
15868::
15869
15870      declare float     @llvm.acos.f32(float  %Val)
15871      declare double    @llvm.acos.f64(double %Val)
15872      declare x86_fp80  @llvm.acos.f80(x86_fp80  %Val)
15873      declare fp128     @llvm.acos.f128(fp128 %Val)
15874      declare ppc_fp128 @llvm.acos.ppcf128(ppc_fp128  %Val)
15875
15876Overview:
15877"""""""""
15878
15879The '``llvm.acos.*``' intrinsics return the arccosine of the operand.
15880
15881Arguments:
15882""""""""""
15883
15884The argument and return value are floating-point numbers of the same type.
15885
15886Semantics:
15887""""""""""
15888
15889Return the same value as a corresponding libm '``acos``' function but without
15890trapping or setting ``errno``.
15891
15892When specified with the fast-math-flag 'afn', the result may be approximated
15893using a less accurate calculation.
15894
15895'``llvm.atan.*``' Intrinsic
15896^^^^^^^^^^^^^^^^^^^^^^^^^^^
15897
15898Syntax:
15899"""""""
15900
15901This is an overloaded intrinsic. You can use ``llvm.atan`` on any
15902floating-point or vector of floating-point type. Not all targets support
15903all types however.
15904
15905::
15906
15907      declare float     @llvm.atan.f32(float  %Val)
15908      declare double    @llvm.atan.f64(double %Val)
15909      declare x86_fp80  @llvm.atan.f80(x86_fp80  %Val)
15910      declare fp128     @llvm.atan.f128(fp128 %Val)
15911      declare ppc_fp128 @llvm.atan.ppcf128(ppc_fp128  %Val)
15912
15913Overview:
15914"""""""""
15915
15916The '``llvm.atan.*``' intrinsics return the arctangent of the operand.
15917
15918Arguments:
15919""""""""""
15920
15921The argument and return value are floating-point numbers of the same type.
15922
15923Semantics:
15924""""""""""
15925
15926Return the same value as a corresponding libm '``atan``' function but without
15927trapping or setting ``errno``.
15928
15929When specified with the fast-math-flag 'afn', the result may be approximated
15930using a less accurate calculation.
15931
15932'``llvm.atan2.*``' Intrinsic
15933^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15934
15935Syntax:
15936"""""""
15937
15938This is an overloaded intrinsic. You can use ``llvm.atan2`` on any
15939floating-point or vector of floating-point type. Not all targets support
15940all types however.
15941
15942::
15943
15944      declare float     @llvm.atan2.f32(float  %Y, float %X)
15945      declare double    @llvm.atan2.f64(double %Y, double %X)
15946      declare x86_fp80  @llvm.atan2.f80(x86_fp80  %Y, x86_fp80 %X)
15947      declare fp128     @llvm.atan2.f128(fp128 %Y, fp128 %X)
15948      declare ppc_fp128 @llvm.atan2.ppcf128(ppc_fp128  %Y, ppc_fp128 %X)
15949
15950Overview:
15951"""""""""
15952
15953The '``llvm.atan2.*``' intrinsics return the arctangent of ``Y/X`` accounting
15954for the quadrant.
15955
15956Arguments:
15957""""""""""
15958
15959The arguments and return value are floating-point numbers of the same type.
15960
15961Semantics:
15962""""""""""
15963
15964Return the same value as a corresponding libm '``atan2``' function but without
15965trapping or setting ``errno``.
15966
15967When specified with the fast-math-flag 'afn', the result may be approximated
15968using a less accurate calculation.
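
Example:
""""""""

The operand order is easy to get wrong, so the sketch below spells it out. The
function name ``@angle_of`` and its parameter order are hypothetical.

.. code-block:: llvm

      declare double @llvm.atan2.f64(double, double)

      define double @angle_of(double %x, double %y) {
        ; The intrinsic takes Y first and X second, as in the declarations above.
        %theta = call double @llvm.atan2.f64(double %y, double %x)
        ret double %theta
      }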
15969
15970'``llvm.sinh.*``' Intrinsic
15971^^^^^^^^^^^^^^^^^^^^^^^^^^^
15972
15973Syntax:
15974"""""""
15975
15976This is an overloaded intrinsic. You can use ``llvm.sinh`` on any
15977floating-point or vector of floating-point type. Not all targets support
15978all types however.
15979
15980::
15981
15982      declare float     @llvm.sinh.f32(float  %Val)
15983      declare double    @llvm.sinh.f64(double %Val)
15984      declare x86_fp80  @llvm.sinh.f80(x86_fp80  %Val)
15985      declare fp128     @llvm.sinh.f128(fp128 %Val)
15986      declare ppc_fp128 @llvm.sinh.ppcf128(ppc_fp128  %Val)
15987
15988Overview:
15989"""""""""
15990
15991The '``llvm.sinh.*``' intrinsics return the hyperbolic sine of the operand.
15992
15993Arguments:
15994""""""""""
15995
15996The argument and return value are floating-point numbers of the same type.
15997
15998Semantics:
15999""""""""""
16000
16001Return the same value as a corresponding libm '``sinh``' function but without
16002trapping or setting ``errno``.
16003
16004When specified with the fast-math-flag 'afn', the result may be approximated
16005using a less accurate calculation.
16006
16007'``llvm.cosh.*``' Intrinsic
16008^^^^^^^^^^^^^^^^^^^^^^^^^^^
16009
16010Syntax:
16011"""""""
16012
16013This is an overloaded intrinsic. You can use ``llvm.cosh`` on any
16014floating-point or vector of floating-point type. Not all targets support
16015all types however.
16016
16017::
16018
16019      declare float     @llvm.cosh.f32(float  %Val)
16020      declare double    @llvm.cosh.f64(double %Val)
16021      declare x86_fp80  @llvm.cosh.f80(x86_fp80  %Val)
16022      declare fp128     @llvm.cosh.f128(fp128 %Val)
16023      declare ppc_fp128 @llvm.cosh.ppcf128(ppc_fp128  %Val)
16024
16025Overview:
16026"""""""""
16027
16028The '``llvm.cosh.*``' intrinsics return the hyperbolic cosine of the operand.
16029
16030Arguments:
16031""""""""""
16032
16033The argument and return value are floating-point numbers of the same type.
16034
16035Semantics:
16036""""""""""
16037
16038Return the same value as a corresponding libm '``cosh``' function but without
16039trapping or setting ``errno``.
16040
16041When specified with the fast-math-flag 'afn', the result may be approximated
16042using a less accurate calculation.
16043
16044'``llvm.tanh.*``' Intrinsic
16045^^^^^^^^^^^^^^^^^^^^^^^^^^^
16046
16047Syntax:
16048"""""""
16049
16050This is an overloaded intrinsic. You can use ``llvm.tanh`` on any
16051floating-point or vector of floating-point type. Not all targets support
16052all types however.
16053
16054::
16055
16056      declare float     @llvm.tanh.f32(float  %Val)
16057      declare double    @llvm.tanh.f64(double %Val)
16058      declare x86_fp80  @llvm.tanh.f80(x86_fp80  %Val)
16059      declare fp128     @llvm.tanh.f128(fp128 %Val)
16060      declare ppc_fp128 @llvm.tanh.ppcf128(ppc_fp128  %Val)
16061
16062Overview:
16063"""""""""
16064
16065The '``llvm.tanh.*``' intrinsics return the hyperbolic tangent of the operand.
16066
16067Arguments:
16068""""""""""
16069
16070The argument and return value are floating-point numbers of the same type.
16071
16072Semantics:
16073""""""""""
16074
16075Return the same value as a corresponding libm '``tanh``' function but without
16076trapping or setting ``errno``.
16077
16078When specified with the fast-math-flag 'afn', the result may be approximated
16079using a less accurate calculation.
16080
16081
16082'``llvm.sincos.*``' Intrinsic
16083^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16084
16085Syntax:
16086"""""""
16087
16088This is an overloaded intrinsic. You can use ``llvm.sincos`` on any
16089floating-point or vector of floating-point type. Not all targets support
16090all types however.
16091
16092::
16093
16094      declare { float, float }          @llvm.sincos.f32(float  %Val)
16095      declare { double, double }        @llvm.sincos.f64(double %Val)
16096      declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
16097      declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
16098      declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
16099      declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
16100
16101Overview:
16102"""""""""
16103
The '``llvm.sincos.*``' intrinsics return the sine and cosine of the operand.
16105
16106Arguments:
16107""""""""""
16108
16109The argument is a :ref:`floating-point <t_floating>` value or
16110:ref:`vector <t_vector>` of floating-point values. Returns two values matching
16111the argument type in a struct.
16112
16113Semantics:
16114""""""""""
16115
This intrinsic is equivalent to calling both :ref:`llvm.sin <t_llvm_sin>`
and :ref:`llvm.cos <t_llvm_cos>` on the argument.
16118
16119The first result is the sine of the argument and the second result is the cosine
16120of the argument.
16121
16122When specified with the fast-math-flag 'afn', the result may be approximated
16123using a less accurate calculation.
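
Example:
""""""""

A minimal sketch of unpacking the returned struct. The function name
``@sin_plus_cos`` and the value names are illustrative only.

.. code-block:: llvm

      declare { float, float } @llvm.sincos.f32(float)

      define float @sin_plus_cos(float %x) {
        %both = call { float, float } @llvm.sincos.f32(float %x)
        ; Field 0 holds the sine, field 1 holds the cosine.
        %sin = extractvalue { float, float } %both, 0
        %cos = extractvalue { float, float } %both, 1
        %sum = fadd float %sin, %cos
        ret float %sum
      }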
16124
16125'``llvm.pow.*``' Intrinsic
16126^^^^^^^^^^^^^^^^^^^^^^^^^^
16127
16128Syntax:
16129"""""""
16130
16131This is an overloaded intrinsic. You can use ``llvm.pow`` on any
16132floating-point or vector of floating-point type. Not all targets support
16133all types however.
16134
16135::
16136
16137      declare float     @llvm.pow.f32(float  %Val, float %Power)
16138      declare double    @llvm.pow.f64(double %Val, double %Power)
16139      declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
16140      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
16142
16143Overview:
16144"""""""""
16145
16146The '``llvm.pow.*``' intrinsics return the first operand raised to the
16147specified (positive or negative) power.
16148
16149Arguments:
16150""""""""""
16151
16152The arguments and return value are floating-point numbers of the same type.
16153
16154Semantics:
16155""""""""""
16156
16157Return the same value as a corresponding libm '``pow``' function but without
16158trapping or setting ``errno``.
16159
16160When specified with the fast-math-flag 'afn', the result may be approximated
16161using a less accurate calculation.
16162
16163.. _int_exp:
16164
16165'``llvm.exp.*``' Intrinsic
16166^^^^^^^^^^^^^^^^^^^^^^^^^^
16167
16168Syntax:
16169"""""""
16170
16171This is an overloaded intrinsic. You can use ``llvm.exp`` on any
16172floating-point or vector of floating-point type. Not all targets support
16173all types however.
16174
16175::
16176
16177      declare float     @llvm.exp.f32(float  %Val)
16178      declare double    @llvm.exp.f64(double %Val)
16179      declare x86_fp80  @llvm.exp.f80(x86_fp80  %Val)
16180      declare fp128     @llvm.exp.f128(fp128 %Val)
16181      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128  %Val)
16182
16183Overview:
16184"""""""""
16185
16186The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
16187value.
16188
16189Arguments:
16190""""""""""
16191
16192The argument and return value are floating-point numbers of the same type.
16193
16194Semantics:
16195""""""""""
16196
16197Return the same value as a corresponding libm '``exp``' function but without
16198trapping or setting ``errno``.
16199
16200When specified with the fast-math-flag 'afn', the result may be approximated
16201using a less accurate calculation.
16202
16203.. _int_exp2:
16204
16205'``llvm.exp2.*``' Intrinsic
16206^^^^^^^^^^^^^^^^^^^^^^^^^^^
16207
16208Syntax:
16209"""""""
16210
16211This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
16212floating-point or vector of floating-point type. Not all targets support
16213all types however.
16214
16215::
16216
16217      declare float     @llvm.exp2.f32(float  %Val)
16218      declare double    @llvm.exp2.f64(double %Val)
16219      declare x86_fp80  @llvm.exp2.f80(x86_fp80  %Val)
16220      declare fp128     @llvm.exp2.f128(fp128 %Val)
16221      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128  %Val)
16222
16223Overview:
16224"""""""""
16225
16226The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
16227specified value.
16228
16229Arguments:
16230""""""""""
16231
16232The argument and return value are floating-point numbers of the same type.
16233
16234Semantics:
16235""""""""""
16236
16237Return the same value as a corresponding libm '``exp2``' function but without
16238trapping or setting ``errno``.
16239
16240When specified with the fast-math-flag 'afn', the result may be approximated
16241using a less accurate calculation.
16242
16243.. _int_exp10:
16244
16245'``llvm.exp10.*``' Intrinsic
16246^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16247
16248Syntax:
16249"""""""
16250
16251This is an overloaded intrinsic. You can use ``llvm.exp10`` on any
16252floating-point or vector of floating-point type. Not all targets support
16253all types however.
16254
16255::
16256
16257      declare float     @llvm.exp10.f32(float  %Val)
16258      declare double    @llvm.exp10.f64(double %Val)
16259      declare x86_fp80  @llvm.exp10.f80(x86_fp80  %Val)
16260      declare fp128     @llvm.exp10.f128(fp128 %Val)
16261      declare ppc_fp128 @llvm.exp10.ppcf128(ppc_fp128  %Val)
16262
16263Overview:
16264"""""""""
16265
16266The '``llvm.exp10.*``' intrinsics compute the base-10 exponential of the
16267specified value.
16268
16269Arguments:
16270""""""""""
16271
16272The argument and return value are floating-point numbers of the same type.
16273
16274Semantics:
16275""""""""""
16276
16277Return the same value as a corresponding libm '``exp10``' function but without
16278trapping or setting ``errno``.
16279
16280When specified with the fast-math-flag 'afn', the result may be approximated
16281using a less accurate calculation.
16282
16283
16284'``llvm.ldexp.*``' Intrinsic
16285^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16286
16287Syntax:
16288"""""""
16289
This is an overloaded intrinsic. You can use ``llvm.ldexp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
16293
16294::
16295
16296      declare float     @llvm.ldexp.f32.i32(float %Val, i32 %Exp)
16297      declare double    @llvm.ldexp.f64.i32(double %Val, i32 %Exp)
16298      declare x86_fp80  @llvm.ldexp.f80.i32(x86_fp80 %Val, i32 %Exp)
16299      declare fp128     @llvm.ldexp.f128.i32(fp128 %Val, i32 %Exp)
16300      declare ppc_fp128 @llvm.ldexp.ppcf128.i32(ppc_fp128 %Val, i32 %Exp)
16301      declare <2 x float> @llvm.ldexp.v2f32.v2i32(<2 x float> %Val, <2 x i32> %Exp)
16302
16303Overview:
16304"""""""""
16305
16306The '``llvm.ldexp.*``' intrinsics perform the ldexp function.
16307
16308Arguments:
16309""""""""""
16310
16311The first argument and the return value are :ref:`floating-point
16312<t_floating>` or :ref:`vector <t_vector>` of floating-point values of
16313the same type. The second argument is an integer with the same number
16314of elements.
16315
16316Semantics:
16317""""""""""
16318
This function multiplies the first argument by 2 raised to the second
argument's power. If the first argument is NaN or infinite, the same
value is returned. If the result underflows, a zero with the same sign
is returned. If the result overflows, the result is an infinity with
the same sign.
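
Example:
""""""""

A small worked sketch with constant operands. The function name
``@scale_example`` is hypothetical, and the value in the comment is the
mathematical result of the rule above rather than the output of any particular
target.

.. code-block:: llvm

      declare float @llvm.ldexp.f32.i32(float, i32)

      define float @scale_example() {
        ; 0.75 multiplied by 2 raised to the power 4, which is 12.0.
        %r = call float @llvm.ldexp.f32.i32(float 7.500000e-01, i32 4)
        ret float %r
      }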
16324
16325.. _int_frexp:
16326
16327'``llvm.frexp.*``' Intrinsic
16328^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16329
16330Syntax:
16331"""""""
16332
This is an overloaded intrinsic. You can use ``llvm.frexp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
16336
16337::
16338
16339      declare { float, i32 }     @llvm.frexp.f32.i32(float %Val)
16340      declare { double, i32 }    @llvm.frexp.f64.i32(double %Val)
16341      declare { x86_fp80, i32 }  @llvm.frexp.f80.i32(x86_fp80 %Val)
16342      declare { fp128, i32 }     @llvm.frexp.f128.i32(fp128 %Val)
16343      declare { ppc_fp128, i32 } @llvm.frexp.ppcf128.i32(ppc_fp128 %Val)
16344      declare { <2 x float>, <2 x i32> }  @llvm.frexp.v2f32.v2i32(<2 x float> %Val)
16345
16346Overview:
16347"""""""""
16348
16349The '``llvm.frexp.*``' intrinsics perform the frexp function.
16350
16351Arguments:
16352""""""""""
16353
16354The argument is a :ref:`floating-point <t_floating>` or
16355:ref:`vector <t_vector>` of floating-point values. Returns two values
16356in a struct. The first struct field matches the argument type, and the
16357second field is an integer or a vector of integer values with the same
16358number of elements as the argument.
16359
16360Semantics:
16361""""""""""
16362
This intrinsic splits a floating-point value into a normalized
fractional component and an integral exponent.

For a non-zero argument, returns the argument multiplied by some power
of two such that the absolute value of the returned value is in the
range [0.5, 1.0), with the same sign as the argument. The second
result is an integer such that the first result multiplied by 2 raised
to the power of the second result is the input argument.
16371
16372If the argument is a zero, returns a zero with the same sign and a 0
16373exponent.
16374
16375If the argument is a NaN, a NaN is returned and the returned exponent
16376is unspecified.
16377
16378If the argument is an infinity, returns an infinity with the same sign
16379and an unspecified exponent.
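
Example:
""""""""

A minimal sketch of reading both results. The function name ``@split`` and the
``%exp.out`` parameter are illustrative only. For an input of 12.0, the rule
above gives a fraction of 0.75 and an exponent of 4, since 0.75 multiplied by
2 raised to the power 4 is 12.0.

.. code-block:: llvm

      declare { float, i32 } @llvm.frexp.f32.i32(float)

      define float @split(float %x, ptr %exp.out) {
        %res = call { float, i32 } @llvm.frexp.f32.i32(float %x)
        ; Field 0 is the fraction, field 1 is the exponent.
        %fract = extractvalue { float, i32 } %res, 0
        %exp = extractvalue { float, i32 } %res, 1
        store i32 %exp, ptr %exp.out
        ret float %fract
      }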
16380
16381.. _int_log:
16382
16383'``llvm.log.*``' Intrinsic
16384^^^^^^^^^^^^^^^^^^^^^^^^^^
16385
16386Syntax:
16387"""""""
16388
16389This is an overloaded intrinsic. You can use ``llvm.log`` on any
16390floating-point or vector of floating-point type. Not all targets support
16391all types however.
16392
16393::
16394
16395      declare float     @llvm.log.f32(float  %Val)
16396      declare double    @llvm.log.f64(double %Val)
16397      declare x86_fp80  @llvm.log.f80(x86_fp80  %Val)
16398      declare fp128     @llvm.log.f128(fp128 %Val)
16399      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128  %Val)
16400
16401Overview:
16402"""""""""
16403
16404The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
16405value.
16406
16407Arguments:
16408""""""""""
16409
16410The argument and return value are floating-point numbers of the same type.
16411
16412Semantics:
16413""""""""""
16414
16415Return the same value as a corresponding libm '``log``' function but without
16416trapping or setting ``errno``.
16417
16418When specified with the fast-math-flag 'afn', the result may be approximated
16419using a less accurate calculation.
16420
16421.. _int_log10:
16422
16423'``llvm.log10.*``' Intrinsic
16424^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16425
16426Syntax:
16427"""""""
16428
16429This is an overloaded intrinsic. You can use ``llvm.log10`` on any
16430floating-point or vector of floating-point type. Not all targets support
16431all types however.
16432
16433::
16434
16435      declare float     @llvm.log10.f32(float  %Val)
16436      declare double    @llvm.log10.f64(double %Val)
16437      declare x86_fp80  @llvm.log10.f80(x86_fp80  %Val)
16438      declare fp128     @llvm.log10.f128(fp128 %Val)
16439      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128  %Val)
16440
16441Overview:
16442"""""""""
16443
16444The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
16445specified value.
16446
16447Arguments:
16448""""""""""
16449
16450The argument and return value are floating-point numbers of the same type.
16451
16452Semantics:
16453""""""""""
16454
16455Return the same value as a corresponding libm '``log10``' function but without
16456trapping or setting ``errno``.
16457
16458When specified with the fast-math-flag 'afn', the result may be approximated
16459using a less accurate calculation.
16460
16461
16462.. _int_log2:
16463
16464'``llvm.log2.*``' Intrinsic
16465^^^^^^^^^^^^^^^^^^^^^^^^^^^
16466
16467Syntax:
16468"""""""
16469
16470This is an overloaded intrinsic. You can use ``llvm.log2`` on any
16471floating-point or vector of floating-point type. Not all targets support
16472all types however.
16473
16474::
16475
16476      declare float     @llvm.log2.f32(float  %Val)
16477      declare double    @llvm.log2.f64(double %Val)
16478      declare x86_fp80  @llvm.log2.f80(x86_fp80  %Val)
16479      declare fp128     @llvm.log2.f128(fp128 %Val)
16480      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128  %Val)
16481
16482Overview:
16483"""""""""
16484
16485The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
16486value.
16487
16488Arguments:
16489""""""""""
16490
16491The argument and return value are floating-point numbers of the same type.
16492
16493Semantics:
16494""""""""""
16495
16496Return the same value as a corresponding libm '``log2``' function but without
16497trapping or setting ``errno``.
16498
16499When specified with the fast-math-flag 'afn', the result may be approximated
16500using a less accurate calculation.
16501
16502.. _int_fma:
16503
16504'``llvm.fma.*``' Intrinsic
16505^^^^^^^^^^^^^^^^^^^^^^^^^^
16506
16507Syntax:
16508"""""""
16509
16510This is an overloaded intrinsic. You can use ``llvm.fma`` on any
16511floating-point or vector of floating-point type. Not all targets support
16512all types however.
16513
16514::
16515
16516      declare float     @llvm.fma.f32(float  %a, float  %b, float  %c)
16517      declare double    @llvm.fma.f64(double %a, double %b, double %c)
16518      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
16519      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
16520      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
16521
16522Overview:
16523"""""""""
16524
16525The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
16526
16527Arguments:
16528""""""""""
16529
16530The arguments and return value are floating-point numbers of the same type.
16531
16532Semantics:
16533""""""""""
16534
16535Return the same value as the IEEE-754 fusedMultiplyAdd operation. This
16536is assumed to not trap or set ``errno``.
16537
16538When specified with the fast-math-flag 'afn', the result may be approximated
16539using a less accurate calculation.
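
Example:
""""""""

The sketch below contrasts the fused operation with a separate multiply and
add. The function names ``@fused`` and ``@unfused`` are hypothetical. The two
functions may return different values because the unfused form rounds the
intermediate product.

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      define float @fused(float %a, float %b, float %c) {
        ; Computes a * b + c with a single rounding of the final result.
        %r = call float @llvm.fma.f32(float %a, float %b, float %c)
        ret float %r
      }

      define float @unfused(float %a, float %b, float %c) {
        ; Rounds once after the multiply and again after the add.
        %m = fmul float %a, %b
        %r = fadd float %m, %c
        ret float %r
      }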
16540
16541.. _int_fabs:
16542
16543'``llvm.fabs.*``' Intrinsic
16544^^^^^^^^^^^^^^^^^^^^^^^^^^^
16545
16546Syntax:
16547"""""""
16548
16549This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
16550floating-point or vector of floating-point type. Not all targets support
16551all types however.
16552
16553::
16554
16555      declare float     @llvm.fabs.f32(float  %Val)
16556      declare double    @llvm.fabs.f64(double %Val)
16557      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
16558      declare fp128     @llvm.fabs.f128(fp128 %Val)
16559      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
16560
16561Overview:
16562"""""""""
16563
16564The '``llvm.fabs.*``' intrinsics return the absolute value of the
16565operand.
16566
16567Arguments:
16568""""""""""
16569
16570The argument and return value are floating-point numbers of the same
16571type.
16572
16573Semantics:
16574""""""""""
16575
16576This function returns the same values as the libm ``fabs`` functions
16577would, and handles error conditions in the same way.
16578The returned value is completely identical to the input except for the sign bit;
16579in particular, if the input is a NaN, then the quiet/signaling bit and payload
16580are perfectly preserved.
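
Example:
""""""""

Because only the sign bit changes, the intrinsic can be pictured as clearing
the top bit of the value's bit pattern. The sketch below shows both forms for
``float``. The function names are illustrative, and the mask form is given
only to explain the semantics, not as a recommended lowering.

.. code-block:: llvm

      declare float @llvm.fabs.f32(float)

      define float @abs_via_intrinsic(float %x) {
        %r = call float @llvm.fabs.f32(float %x)
        ret float %r
      }

      define float @abs_via_mask(float %x) {
        ; 2147483647 is 0x7FFFFFFF: every bit set except the sign bit.
        %bits = bitcast float %x to i32
        %cleared = and i32 %bits, 2147483647
        %r = bitcast i32 %cleared to float
        ret float %r
      }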
16581
16582.. _i_fminmax_family:
16583
16584'``llvm.min.*``' Intrinsics Comparation
16585^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16586
16587Standard:
16588"""""""""
16589
IEEE-754 and ISO C define several min/max operations that differ in how they
handle qNaN/sNaN and +0.0/-0.0. The differences are summarized here:
16592
16593.. list-table::
16594   :header-rows: 2
16595
16596   * - ``ISO C``
16597     - fmin/fmax
     - fminimum/fmaximum
16599     - fminimum_num/fmaximum_num
16600
16601   * - ``IEEE754``
16602     - minNum/maxNum (2008)
16603     - minimum/maximum (2019)
16604     - minimumNumber/maximumNumber (2019)
16605
16606   * - ``+0.0 vs -0.0``
16607     - either one
16608     - +0.0 > -0.0
16609     - +0.0 > -0.0
16610
16611   * - ``NUM vs sNaN``
16612     - qNaN, invalid exception
16613     - qNaN, invalid exception
16614     - NUM, invalid exception
16615
16616   * - ``qNaN vs sNaN``
16617     - qNaN, invalid exception
16618     - qNaN, invalid exception
16619     - qNaN, invalid exception
16620
16621   * - ``NUM vs qNaN``
16622     - NUM, no exception
16623     - qNaN, no exception
16624     - NUM, no exception
16625
16626LLVM Implementation:
16627""""""""""""""""""""
16628
LLVM implements all ISO C flavors as listed in the table below, except that
exceptions are ignored in the default floating-point environment. The
constrained versions of the intrinsics respect the exception behavior.
16632
16633.. list-table::
16634   :header-rows: 1
16635   :widths: 16 28 28 28
16636
16637   * - Operation
16638     - minnum/maxnum
16639     - minimum/maximum
16640     - minimumnum/maximumnum
16641
16642   * - ``NUM vs qNaN``
16643     - NUM, no exception
16644     - qNaN, no exception
16645     - NUM, no exception
16646
16647   * - ``NUM vs sNaN``
16648     - qNaN, invalid exception
16649     - qNaN, invalid exception
16650     - NUM, invalid exception
16651
16652   * - ``qNaN vs sNaN``
16653     - qNaN, invalid exception
16654     - qNaN, invalid exception
16655     - qNaN, invalid exception
16656
16657   * - ``sNaN vs sNaN``
16658     - qNaN, invalid exception
16659     - qNaN, invalid exception
16660     - qNaN, invalid exception
16661
16662   * - ``+0.0 vs -0.0``
16663     - either one
16664     - +0.0(max)/-0.0(min)
16665     - +0.0(max)/-0.0(min)
16666
16667   * - ``NUM vs NUM``
16668     - larger(max)/smaller(min)
16669     - larger(max)/smaller(min)
16670     - larger(max)/smaller(min)
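
The ``NUM vs qNaN`` row of the table above can be sketched as follows. The
function name ``@min_with_qnan`` is hypothetical, ``%x`` is assumed to be a
number (not a NaN), and ``%out`` is assumed to point to storage for at least
three ``float`` values.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)
      declare float @llvm.minimum.f32(float, float)
      declare float @llvm.minimumnum.f32(float, float)

      define void @min_with_qnan(float %x, ptr %out) {
        ; 0x7FF8000000000000 is a quiet NaN in LLVM's hexadecimal notation.
        ; minnum: NUM vs qNaN gives NUM, so %r0 is %x.
        %r0 = call float @llvm.minnum.f32(float %x, float 0x7FF8000000000000)
        ; minimum: NUM vs qNaN gives qNaN, so %r1 is a NaN.
        %r1 = call float @llvm.minimum.f32(float %x, float 0x7FF8000000000000)
        ; minimumnum: NUM vs qNaN gives NUM, so %r2 is %x.
        %r2 = call float @llvm.minimumnum.f32(float %x, float 0x7FF8000000000000)
        %p1 = getelementptr inbounds float, ptr %out, i64 1
        %p2 = getelementptr inbounds float, ptr %out, i64 2
        store float %r0, ptr %out
        store float %r1, ptr %p1
        store float %r2, ptr %p2
        ret void
      }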
16671
16672.. _i_minnum:
16673
16674'``llvm.minnum.*``' Intrinsic
16675^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16676
16677Syntax:
16678"""""""
16679
16680This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
16681floating-point or vector of floating-point type. Not all targets support
16682all types however.
16683
16684::
16685
16686      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
16687      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
16688      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16689      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
16690      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16691
16692Overview:
16693"""""""""
16694
16695The '``llvm.minnum.*``' intrinsics return the minimum of the two
16696arguments.
16697
16698
16699Arguments:
16700""""""""""
16701
16702The arguments and return value are floating-point numbers of the same
16703type.
16704
16705Semantics:
16706""""""""""
16707
Follows the IEEE-754 semantics for minNum, except for the handling of
signaling NaNs. This matches the behavior of libm's fmin.
16710
16711If either operand is a NaN, returns the other non-NaN operand. Returns
16712NaN only if both operands are NaN. If the operands compare equal,
16713returns either one of the operands. For example, this means that
16714fmin(+0.0, -0.0) returns either operand.
16715
16716Unlike the IEEE-754 2008 behavior, this does not distinguish between
16717signaling and quiet NaN inputs. If a target's implementation follows
16718the standard and returns a quiet NaN if either input is a signaling
16719NaN, the intrinsic lowering is responsible for quieting the inputs to
16720correctly return the non-NaN input (e.g. by using the equivalent of
16721``llvm.canonicalize``).
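
Because the intrinsic is overloaded on vector types as well, an element-wise
minimum can be sketched as below. The function name ``@vmin`` is hypothetical,
and whether the ``v4f32`` overload is lowered efficiently depends on the
target.

.. code-block:: llvm

      declare <4 x float> @llvm.minnum.v4f32(<4 x float>, <4 x float>)

      define <4 x float> @vmin(<4 x float> %a, <4 x float> %b) {
        ; Each lane is computed independently with the scalar semantics.
        %r = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %b)
        ret <4 x float> %r
      }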
16722
16723.. _i_maxnum:
16724
16725'``llvm.maxnum.*``' Intrinsic
16726^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16727
16728Syntax:
16729"""""""
16730
16731This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
16732floating-point or vector of floating-point type. Not all targets support
16733all types however.
16734
16735::
16736
16737      declare float     @llvm.maxnum.f32(float  %Val0, float  %Val1)
16738      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
16739      declare x86_fp80  @llvm.maxnum.f80(x86_fp80  %Val0, x86_fp80  %Val1)
16740      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
16741      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128  %Val0, ppc_fp128  %Val1)
16742
16743Overview:
16744"""""""""
16745
16746The '``llvm.maxnum.*``' intrinsics return the maximum of the two
16747arguments.
16748
16749
16750Arguments:
16751""""""""""
16752
16753The arguments and return value are floating-point numbers of the same
16754type.
16755
16756Semantics:
16757""""""""""
16758Follows the IEEE-754 semantics for maxNum except for the handling of
16759signaling NaNs. This matches the behavior of libm's fmax.
16760
16761If either operand is a NaN, returns the other non-NaN operand. Returns
16762NaN only if both operands are NaN. If the operands compare equal,
16763returns either one of the operands. For example, this means that
16764fmax(+0.0, -0.0) returns either -0.0 or 0.0.
16765
16766Unlike the IEEE-754 2008 behavior, this does not distinguish between
16767signaling and quiet NaN inputs. If a target's implementation follows
16768the standard and returns a quiet NaN if either input is a signaling
16769NaN, the intrinsic lowering is responsible for quieting the inputs to
16770correctly return the non-NaN input (e.g. by using the equivalent of
16771``llvm.canonicalize``).
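
One common use is clamping a value to a range by combining ``llvm.maxnum``
with ``llvm.minnum``. The sketch below uses a hypothetical function name
``@clamp01``. Per the semantics above, a quiet NaN input yields ``0.0``,
because ``llvm.maxnum`` returns the non-NaN operand.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)
      declare float @llvm.maxnum.f32(float, float)

      define float @clamp01(float %x) {
        ; Raise values below 0.0 to 0.0.
        %lo = call float @llvm.maxnum.f32(float %x, float 0.0)
        ; Lower values above 1.0 to 1.0.
        %hi = call float @llvm.minnum.f32(float %lo, float 1.0)
        ret float %hi
      }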
16772
16773.. _i_minimum:
16774
16775'``llvm.minimum.*``' Intrinsic
16776^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16777
16778Syntax:
16779"""""""
16780
16781This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
16782floating-point or vector of floating-point type. Not all targets support
16783all types however.
16784
16785::
16786
16787      declare float     @llvm.minimum.f32(float %Val0, float %Val1)
16788      declare double    @llvm.minimum.f64(double %Val0, double %Val1)
16789      declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16790      declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
16791      declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16792
16793Overview:
16794"""""""""
16795
16796The '``llvm.minimum.*``' intrinsics return the minimum of the two
16797arguments, propagating NaNs and treating -0.0 as less than +0.0.
16798
16799
16800Arguments:
16801""""""""""
16802
16803The arguments and return value are floating-point numbers of the same
16804type.
16805
16806Semantics:
16807""""""""""
16808If either operand is a NaN, returns NaN. Otherwise returns the lesser
16809of the two arguments. -0.0 is considered to be less than +0.0 for this
16810intrinsic. Note that these are the semantics specified in the draft of
16811IEEE 754-2019.
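
The two distinguishing properties, signed-zero ordering and NaN propagation,
can be sketched as follows. Both function names are hypothetical.

.. code-block:: llvm

      declare float @llvm.minimum.f32(float, float)

      define float @min_signed_zero() {
        ; -0.0 is treated as less than +0.0, so the result is -0.0.
        %r = call float @llvm.minimum.f32(float 0.0, float -0.0)
        ret float %r
      }

      define float @min_with_nan(float %x) {
        ; A NaN operand makes the result NaN regardless of %x.
        %r = call float @llvm.minimum.f32(float %x, float 0x7FF8000000000000)
        ret float %r
      }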
16812
16813.. _i_maximum:
16814
16815'``llvm.maximum.*``' Intrinsic
16816^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16817
16818Syntax:
16819"""""""
16820
16821This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
16822floating-point or vector of floating-point type. Not all targets support
16823all types however.
16824
16825::
16826
16827      declare float     @llvm.maximum.f32(float %Val0, float %Val1)
16828      declare double    @llvm.maximum.f64(double %Val0, double %Val1)
16829      declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16830      declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
16831      declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16832
16833Overview:
16834"""""""""
16835
16836The '``llvm.maximum.*``' intrinsics return the maximum of the two
16837arguments, propagating NaNs and treating -0.0 as less than +0.0.
16838
16839
16840Arguments:
16841""""""""""
16842
16843The arguments and return value are floating-point numbers of the same
16844type.
16845
16846Semantics:
16847""""""""""
16848If either operand is a NaN, returns NaN. Otherwise returns the greater
16849of the two arguments. -0.0 is considered to be less than +0.0 for this
16850intrinsic. Note that these are the semantics specified in the draft of
16851IEEE 754-2019.
16852
16853.. _i_minimumnum:
16854
16855'``llvm.minimumnum.*``' Intrinsic
16856^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16857
16858Syntax:
16859"""""""
16860
16861This is an overloaded intrinsic. You can use ``llvm.minimumnum`` on any
16862floating-point or vector of floating-point type. Not all targets support
16863all types however.
16864
16865::
16866
16867      declare float     @llvm.minimumnum.f32(float %Val0, float %Val1)
16868      declare double    @llvm.minimumnum.f64(double %Val0, double %Val1)
16869      declare x86_fp80  @llvm.minimumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16870      declare fp128     @llvm.minimumnum.f128(fp128 %Val0, fp128 %Val1)
16871      declare ppc_fp128 @llvm.minimumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16872
16873Overview:
16874"""""""""
16875
16876The '``llvm.minimumnum.*``' intrinsics return the minimum of the two
16877arguments, not propagating NaNs and treating -0.0 as less than +0.0.
16878
16879
16880Arguments:
16881""""""""""
16882
16883The arguments and return value are floating-point numbers of the same
16884type.
16885
16886Semantics:
16887""""""""""
If both operands are NaNs (including sNaN), returns qNaN. If one operand
is NaN (including sNaN) and the other operand is a number, returns the number.
Otherwise returns the lesser of the two arguments. -0.0 is considered to
be less than +0.0 for this intrinsic.
16892
16893Note that these are the semantics of minimumNumber specified in IEEE 754-2019.
16894
It differs from '``llvm.minnum.*``' in two ways:

1. '``llvm.minnum.*``' will return qNaN if either operand is sNaN.
2. '``llvm.minnum.*``' may return either operand when comparing +0.0 and -0.0.
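
Since a NaN operand is dropped whenever the other operand is a number, the
intrinsic can serve as one step of a NaN-skipping reduction. This is only a
sketch; the function name ``@min_step`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.minimumnum.f64(double, double)

      ; One reduction step: fold the next element into the running minimum.
      define double @min_step(double %acc, double %elt) {
        ; If %elt is NaN and %acc is a number, %acc is returned unchanged.
        %r = call double @llvm.minimumnum.f64(double %acc, double %elt)
        ret double %r
      }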
16898
16899.. _i_maximumnum:
16900
16901'``llvm.maximumnum.*``' Intrinsic
16902^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16903
16904Syntax:
16905"""""""
16906
16907This is an overloaded intrinsic. You can use ``llvm.maximumnum`` on any
16908floating-point or vector of floating-point type. Not all targets support
16909all types however.
16910
16911::
16912
16913      declare float     @llvm.maximumnum.f32(float %Val0, float %Val1)
16914      declare double    @llvm.maximumnum.f64(double %Val0, double %Val1)
16915      declare x86_fp80  @llvm.maximumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16916      declare fp128     @llvm.maximumnum.f128(fp128 %Val0, fp128 %Val1)
16917      declare ppc_fp128 @llvm.maximumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16918
16919Overview:
16920"""""""""
16921
16922The '``llvm.maximumnum.*``' intrinsics return the maximum of the two
16923arguments, not propagating NaNs and treating -0.0 as less than +0.0.
16924
16925
16926Arguments:
16927""""""""""
16928
16929The arguments and return value are floating-point numbers of the same
16930type.
16931
16932Semantics:
16933""""""""""
If both operands are NaNs (including sNaN), returns qNaN. If one operand
is NaN (including sNaN) and the other operand is a number, returns the number.
Otherwise returns the greater of the two arguments. -0.0 is considered to
be less than +0.0 for this intrinsic.
16938
16939Note that these are the semantics of maximumNumber specified in IEEE 754-2019.
16940
It differs from '``llvm.maxnum.*``' in two ways:

1. '``llvm.maxnum.*``' will return qNaN if either operand is sNaN.
2. '``llvm.maxnum.*``' may return either operand when comparing +0.0 and -0.0.
16944
16945.. _int_copysign:
16946
16947'``llvm.copysign.*``' Intrinsic
16948^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16949
16950Syntax:
16951"""""""
16952
16953This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
16954floating-point or vector of floating-point type. Not all targets support
16955all types however.
16956
16957::
16958
16959      declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
16960      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
16961      declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
16962      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
16963      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)
16964
16965Overview:
16966"""""""""
16967
16968The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
16969first operand and the sign of the second operand.
16970
16971Arguments:
16972""""""""""
16973
16974The arguments and return value are floating-point numbers of the same
16975type.
16976
16977Semantics:
16978""""""""""
16979
16980This function returns the same values as the libm ``copysign``
16981functions would, and handles error conditions in the same way.
16982The returned value is completely identical to the first operand except for the
16983sign bit; in particular, if the input is a NaN, then the quiet/signaling bit and
16984payload are perfectly preserved.
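
For illustration, a sign function can be sketched by copying the sign of the
input onto the constant ``1.0``. The function name ``@sign_of`` is
hypothetical.

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      define double @sign_of(double %x) {
        ; The result is +1.0 or -1.0, depending only on the sign bit of %x.
        ; This also holds when %x is a zero or a NaN.
        %s = call double @llvm.copysign.f64(double 1.0, double %x)
        ret double %s
      }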
16985
16986.. _int_floor:
16987
16988'``llvm.floor.*``' Intrinsic
16989^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16990
16991Syntax:
16992"""""""
16993
16994This is an overloaded intrinsic. You can use ``llvm.floor`` on any
16995floating-point or vector of floating-point type. Not all targets support
16996all types however.
16997
16998::
16999
17000      declare float     @llvm.floor.f32(float  %Val)
17001      declare double    @llvm.floor.f64(double %Val)
17002      declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
17003      declare fp128     @llvm.floor.f128(fp128 %Val)
17004      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)
17005
17006Overview:
17007"""""""""
17008
17009The '``llvm.floor.*``' intrinsics return the floor of the operand.
17010
17011Arguments:
17012""""""""""
17013
17014The argument and return value are floating-point numbers of the same
17015type.
17016
17017Semantics:
17018""""""""""
17019
17020This function returns the same values as the libm ``floor`` functions
17021would, and handles error conditions in the same way.
17022
17023.. _int_ceil:
17024
17025'``llvm.ceil.*``' Intrinsic
17026^^^^^^^^^^^^^^^^^^^^^^^^^^^
17027
17028Syntax:
17029"""""""
17030
17031This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
17032floating-point or vector of floating-point type. Not all targets support
17033all types however.
17034
17035::
17036
17037      declare float     @llvm.ceil.f32(float  %Val)
17038      declare double    @llvm.ceil.f64(double %Val)
17039      declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
17040      declare fp128     @llvm.ceil.f128(fp128 %Val)
17041      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)
17042
17043Overview:
17044"""""""""
17045
17046The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
17047
17048Arguments:
17049""""""""""
17050
17051The argument and return value are floating-point numbers of the same
17052type.
17053
17054Semantics:
17055""""""""""
17056
17057This function returns the same values as the libm ``ceil`` functions
17058would, and handles error conditions in the same way.
17059
17060
17061.. _int_llvm_trunc:
17062
17063'``llvm.trunc.*``' Intrinsic
17064^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17065
17066Syntax:
17067"""""""
17068
17069This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
17070floating-point or vector of floating-point type. Not all targets support
17071all types however.
17072
17073::
17074
17075      declare float     @llvm.trunc.f32(float  %Val)
17076      declare double    @llvm.trunc.f64(double %Val)
17077      declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
17078      declare fp128     @llvm.trunc.f128(fp128 %Val)
17079      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)
17080
17081Overview:
17082"""""""""
17083
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.
17086
17087Arguments:
17088""""""""""
17089
17090The argument and return value are floating-point numbers of the same
17091type.
17092
17093Semantics:
17094""""""""""
17095
17096This function returns the same values as the libm ``trunc`` functions
17097would, and handles error conditions in the same way.
17098
17099.. _int_rint:
17100
17101'``llvm.rint.*``' Intrinsic
17102^^^^^^^^^^^^^^^^^^^^^^^^^^^
17103
17104Syntax:
17105"""""""
17106
17107This is an overloaded intrinsic. You can use ``llvm.rint`` on any
17108floating-point or vector of floating-point type. Not all targets support
17109all types however.
17110
17111::
17112
17113      declare float     @llvm.rint.f32(float  %Val)
17114      declare double    @llvm.rint.f64(double %Val)
17115      declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
17116      declare fp128     @llvm.rint.f128(fp128 %Val)
17117      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)
17118
17119Overview:
17120"""""""""
17121
The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand isn't an integer.
17125
17126Arguments:
17127""""""""""
17128
17129The argument and return value are floating-point numbers of the same
17130type.
17131
17132Semantics:
17133""""""""""
17134
17135This function returns the same values as the libm ``rint`` functions
17136would, and handles error conditions in the same way. Since LLVM assumes the
17137:ref:`default floating-point environment <floatenv>`, the rounding mode is
17138assumed to be set to "nearest", so halfway cases are rounded to the even
17139integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`
17140to avoid that assumption.
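
The effect of the assumed "nearest" rounding mode on halfway cases can be
sketched with constant operands. The function name ``@rint_half`` is
hypothetical.

.. code-block:: llvm

      declare double @llvm.rint.f64(double)

      define double @rint_half() {
        ; Halfway cases are rounded to the even neighbor.
        %a = call double @llvm.rint.f64(double 0.5)   ; 0.0
        %b = call double @llvm.rint.f64(double 1.5)   ; 2.0
        %c = call double @llvm.rint.f64(double 2.5)   ; 2.0
        %ab = fadd double %a, %b
        %sum = fadd double %ab, %c                    ; 4.0
        ret double %sum
      }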
17141
17142.. _int_nearbyint:
17143
17144'``llvm.nearbyint.*``' Intrinsic
17145^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17146
17147Syntax:
17148"""""""
17149
17150This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
17151floating-point or vector of floating-point type. Not all targets support
17152all types however.
17153
17154::
17155
17156      declare float     @llvm.nearbyint.f32(float  %Val)
17157      declare double    @llvm.nearbyint.f64(double %Val)
17158      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
17159      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
17160      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)
17161
17162Overview:
17163"""""""""
17164
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.
17167
17168Arguments:
17169""""""""""
17170
17171The argument and return value are floating-point numbers of the same
17172type.
17173
17174Semantics:
17175""""""""""
17176
17177This function returns the same values as the libm ``nearbyint``
17178functions would, and handles error conditions in the same way. Since LLVM
17179assumes the :ref:`default floating-point environment <floatenv>`, the rounding
17180mode is assumed to be set to "nearest", so halfway cases are rounded to the even
17181integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>` to
17182avoid that assumption.
17183
17184.. _int_round:
17185
17186'``llvm.round.*``' Intrinsic
17187^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17188
17189Syntax:
17190"""""""
17191
17192This is an overloaded intrinsic. You can use ``llvm.round`` on any
17193floating-point or vector of floating-point type. Not all targets support
17194all types however.
17195
17196::
17197
17198      declare float     @llvm.round.f32(float  %Val)
17199      declare double    @llvm.round.f64(double %Val)
17200      declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
17201      declare fp128     @llvm.round.f128(fp128 %Val)
17202      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)
17203
17204Overview:
17205"""""""""
17206
The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, with halfway cases rounded away from zero.
17209
17210Arguments:
17211""""""""""
17212
17213The argument and return value are floating-point numbers of the same
17214type.
17215
17216Semantics:
17217""""""""""
17218
17219This function returns the same values as the libm ``round``
17220functions would, and handles error conditions in the same way.
17221
17222.. _int_roundeven:
17223
17224'``llvm.roundeven.*``' Intrinsic
17225^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17226
17227Syntax:
17228"""""""
17229
17230This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
17231floating-point or vector of floating-point type. Not all targets support
17232all types however.
17233
17234::
17235
17236      declare float     @llvm.roundeven.f32(float  %Val)
17237      declare double    @llvm.roundeven.f64(double %Val)
17238      declare x86_fp80  @llvm.roundeven.f80(x86_fp80  %Val)
17239      declare fp128     @llvm.roundeven.f128(fp128 %Val)
17240      declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128  %Val)
17241
17242Overview:
17243"""""""""
17244
The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to
the nearest value that is an even integer).
17248
17249Arguments:
17250""""""""""
17251
17252The argument and return value are floating-point numbers of the same type.
17253
17254Semantics:
17255""""""""""
17256
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven``,
except that it does not raise floating-point exceptions.
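
To see how ``llvm.roundeven`` differs from the other rounding intrinsics, the
sketch below applies each of them to the halfway case ``2.5``. The function
name ``@round_2_5`` is hypothetical, and ``%out`` is assumed to point to
storage for at least five ``double`` values.

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.ceil.f64(double)
      declare double @llvm.trunc.f64(double)
      declare double @llvm.round.f64(double)
      declare double @llvm.roundeven.f64(double)

      define void @round_2_5(ptr %out) {
        %f = call double @llvm.floor.f64(double 2.5)      ; 2.0
        %c = call double @llvm.ceil.f64(double 2.5)       ; 3.0
        %t = call double @llvm.trunc.f64(double 2.5)      ; 2.0
        %r = call double @llvm.round.f64(double 2.5)      ; 3.0 (ties away from zero)
        %e = call double @llvm.roundeven.f64(double 2.5)  ; 2.0 (ties to even)
        %p1 = getelementptr inbounds double, ptr %out, i64 1
        %p2 = getelementptr inbounds double, ptr %out, i64 2
        %p3 = getelementptr inbounds double, ptr %out, i64 3
        %p4 = getelementptr inbounds double, ptr %out, i64 4
        store double %f, ptr %out
        store double %c, ptr %p1
        store double %t, ptr %p2
        store double %r, ptr %p3
        store double %e, ptr %p4
        ret void
      }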
17260
17261
17262'``llvm.lround.*``' Intrinsic
17263^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17264
17265Syntax:
17266"""""""
17267
17268This is an overloaded intrinsic. You can use ``llvm.lround`` on any
17269floating-point type or vector of floating-point type. Not all targets
17270support all types however.
17271
::

      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
17285
17286Overview:
17287"""""""""
17288
17289The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
17290integer with ties away from zero.
17291
17292
17293Arguments:
17294""""""""""
17295
17296The argument is a floating-point number and the return value is an integer
17297type.
17298
17299Semantics:
17300""""""""""
17301
This function returns the same values as the libm ``lround`` functions
would, but without setting errno. If the rounded value is too large to
be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).
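
A minimal sketch of a call with an ``i64`` result and a ``double`` operand
follows. The function name ``@round_to_i64`` is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.lround.i64.f64(double)

      define i64 @round_to_i64(double %x) {
        ; Ties round away from zero: 2.5 gives 3 and -2.5 gives -3.
        %r = call i64 @llvm.lround.i64.f64(double %x)
        ret i64 %r
      }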
17306
17307'``llvm.llround.*``' Intrinsic
17308^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17309
17310Syntax:
17311"""""""
17312
17313This is an overloaded intrinsic. You can use ``llvm.llround`` on any
17314floating-point type. Not all targets support all types however.
17315
::

      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
17323
17324Overview:
17325"""""""""
17326
17327The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
17328integer with ties away from zero.
17329
17330Arguments:
17331""""""""""
17332
17333The argument is a floating-point number and the return value is an integer
17334type.
17335
17336Semantics:
17337""""""""""
17338
This function returns the same values as the libm ``llround``
functions would, but without setting errno. If the rounded value is
too large to be stored in the result type, the return value is a
non-deterministic value (equivalent to ``freeze poison``).
17343
17344.. _int_lrint:
17345
17346'``llvm.lrint.*``' Intrinsic
17347^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17348
17349Syntax:
17350"""""""
17351
17352This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
17353floating-point type or vector of floating-point type. Not all targets
17354support all types however.
17355
::

      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
17369
17370Overview:
17371"""""""""
17372
17373The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
17374integer.
17375
17376
17377Arguments:
17378""""""""""
17379
17380The argument is a floating-point number and the return value is an integer
17381type.
17382
17383Semantics:
17384""""""""""
17385
This function returns the same values as the libm ``lrint`` functions
would, but without setting errno. If the rounded value is too large to
be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).
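
As a sketch, the call below converts a ``double`` to ``i64``. The function
name ``@rint_to_i64`` is hypothetical, and the values in the comment assume
the default floating-point environment, in which the rounding mode is taken
to be "nearest" with ties to even.

.. code-block:: llvm

      declare i64 @llvm.lrint.i64.f64(double)

      define i64 @rint_to_i64(double %x) {
        ; Under round-to-nearest-even: 2.5 gives 2 and 3.5 gives 4.
        %r = call i64 @llvm.lrint.i64.f64(double %x)
        ret i64 %r
      }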
17390
17391.. _int_llrint:
17392
17393'``llvm.llrint.*``' Intrinsic
17394^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17395
17396Syntax:
17397"""""""
17398
17399This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
17400floating-point type or vector of floating-point type. Not all targets
17401support all types however.
17402
::

      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
17410
17411Overview:
17412"""""""""
17413
17414The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
17415integer.
17416
17417Arguments:
17418""""""""""
17419
17420The argument is a floating-point number and the return value is an integer
17421type.
17422
17423Semantics:
17424""""""""""
17425
This function returns the same values as the libm ``llrint`` functions
would, but without setting errno. If the rounded value is too large to
be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).
17430
17431Bit Manipulation Intrinsics
17432---------------------------
17433
17434LLVM provides intrinsics for a few important bit manipulation
17435operations. These allow efficient code generation for some algorithms.
17436
17437.. _int_bitreverse:
17438
17439'``llvm.bitreverse.*``' Intrinsics
17440^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17441
17442Syntax:
17443"""""""
17444
17445This is an overloaded intrinsic function. You can use bitreverse on any
17446integer type.
17447
17448::
17449
17450      declare i16 @llvm.bitreverse.i16(i16 <id>)
17451      declare i32 @llvm.bitreverse.i32(i32 <id>)
17452      declare i64 @llvm.bitreverse.i64(i64 <id>)
17453      declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
17454
17455Overview:
17456"""""""""
17457
17458The '``llvm.bitreverse``' family of intrinsics is used to reverse the
17459bitpattern of an integer value or vector of integer values; for example
17460``0b10110110`` becomes ``0b01101101``.
17461
17462Semantics:
17463""""""""""
17464
17465The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
17466``M`` in the input moved to bit ``N-M-1`` in the output. The vector
17467intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
17468basis and the element order is not affected.
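
As an illustration, the sketch below tests whether an ``i8`` value reads the
same in both bit directions. The function name ``@is_bit_palindrome`` is
hypothetical.

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i1 @is_bit_palindrome(i8 %x) {
        ; Bit M of %x moves to bit 7-M of %rev.
        %rev = call i8 @llvm.bitreverse.i8(i8 %x)
        %same = icmp eq i8 %x, %rev
        ret i1 %same
      }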
17469
17470.. _int_bswap:
17471
17472'``llvm.bswap.*``' Intrinsics
17473^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17474
17475Syntax:
17476"""""""
17477
17478This is an overloaded intrinsic function. You can use bswap on any
17479integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
17480
17481::
17482
17483      declare i16 @llvm.bswap.i16(i16 <id>)
17484      declare i32 @llvm.bswap.i32(i32 <id>)
17485      declare i64 @llvm.bswap.i64(i64 <id>)
17486      declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
17487
17488Overview:
17489"""""""""
17490
17491The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
17492value or vector of integer values with an even number of bytes (positive
17493multiple of 16 bits).
17494
17495Semantics:
17496""""""""""
17497
17498The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
17499and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
17500intrinsic returns an i32 value that has the four bytes of the input i32
17501swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
17502returned i32 will have its bytes in 3, 2, 1, 0 order. The
17503``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
17504concept to additional even-byte lengths (6 bytes, 8 bytes and more,
17505respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
17506operate on a per-element basis and the element order is not affected.
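
A typical use is converting between byte orders. The sketch below reads a
big-endian 32-bit value on what is assumed to be a little-endian target. The
function name ``@load_be32`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.bswap.i32(i32)

      define i32 @load_be32(ptr %p) {
        ; The load makes no alignment assumption about %p.
        %raw = load i32, ptr %p, align 1
        ; Reverse the byte order: 0x11223344 becomes 0x44332211.
        %val = call i32 @llvm.bswap.i32(i32 %raw)
        ret i32 %val
      }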
17507
17508.. _int_ctpop:
17509
17510'``llvm.ctpop.*``' Intrinsic
17511^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17512
17513Syntax:
17514"""""""
17515
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.
17519
17520::
17521
17522      declare i8 @llvm.ctpop.i8(i8  <src>)
17523      declare i16 @llvm.ctpop.i16(i16 <src>)
17524      declare i32 @llvm.ctpop.i32(i32 <src>)
17525      declare i64 @llvm.ctpop.i64(i64 <src>)
17526      declare i256 @llvm.ctpop.i256(i256 <src>)
17527      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
17528
17529Overview:
17530"""""""""
17531
17532The '``llvm.ctpop``' family of intrinsics counts the number of bits set
17533in a value.
17534
17535Arguments:
17536""""""""""
17537
17538The only argument is the value to be counted. The argument may be of any
17539integer type, or a vector with integer elements. The return type must
17540match the argument type.
17541
17542Semantics:
17543""""""""""
17544
17545The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
17546each element of a vector.
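
For example, a value is a power of two exactly when it has a single bit set.
The sketch below uses the hypothetical function name ``@is_pow2``.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)

      define i1 @is_pow2(i32 %x) {
        ; Count the set bits; zero has none, so it is rejected as well.
        %n = call i32 @llvm.ctpop.i32(i32 %x)
        %r = icmp eq i32 %n, 1
        ret i1 %r
      }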
17547
17548.. _int_ctlz:
17549
17550'``llvm.ctlz.*``' Intrinsic
17551^^^^^^^^^^^^^^^^^^^^^^^^^^^
17552
17553Syntax:
17554"""""""
17555
17556This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
17557integer bit width, or any vector whose elements are integers. Not all
17558targets support all bit widths or vector types, however.
17559
17560::
17561
17562      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_poison>)
17563      declare <2 x i37> @llvm.ctlz.v2i37(<2 x i37> <src>, i1 <is_zero_poison>)
17564
17565Overview:
17566"""""""""
17567
17568The '``llvm.ctlz``' family of intrinsic functions counts the number of
17569leading zeros in a variable.
17570
17571Arguments:
17572""""""""""
17573
17574The first argument is the value to be counted. This argument may be of
17575any integer type, or a vector with integer element type. The return
17576type must match the first argument type.
17577
17578The second argument is a constant flag that indicates whether the intrinsic
17579returns a valid result if the first argument is zero. If the first
17580argument is zero and the second argument is true, the result is poison.
17581Historically some architectures did not provide a defined result for zero
17582values as efficiently, and many algorithms are now predicated on avoiding
17583zero-value inputs.
17584
17585Semantics:
17586""""""""""
17587
17588The '``llvm.ctlz``' intrinsic counts the leading (most significant)
17589zeros in a variable, or within each element of the vector. If
17590``src == 0`` then the result is the size in bits of the type of ``src``
17591if ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
17592``llvm.ctlz(i32 2) = 30``.
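
The zero-input rule can be modeled with a short Python sketch (illustrative
only, not part of LLVM). ``POISON`` is a hypothetical stand-in for a poison
value, and the ``bits`` parameter names the integer width explicitly.

.. code-block:: python

      POISON = object()  # hypothetical marker for an LLVM poison value

      def ctlz(src, bits, is_zero_poison=False):
          """Count leading zeros of src viewed as an unsigned bits-wide integer."""
          src &= (1 << bits) - 1  # keep only the low `bits` bits
          if src == 0:
              # Zero input: poison if requested, otherwise the full bit width.
              return POISON if is_zero_poison else bits
          return bits - src.bit_length()

Under this model, ``ctlz(2, 32)`` gives 30, matching the example above.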
17593
17594.. _int_cttz:
17595
17596'``llvm.cttz.*``' Intrinsic
17597^^^^^^^^^^^^^^^^^^^^^^^^^^^
17598
17599Syntax:
17600"""""""
17601
17602This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
17603integer bit width, or any vector of integer elements. Not all targets
17604support all bit widths or vector types, however.
17605
17606::
17607
17608      declare i42   @llvm.cttz.i42  (i42   <src>, i1 <is_zero_poison>)
17609      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_poison>)
17610
17611Overview:
17612"""""""""
17613
17614The '``llvm.cttz``' family of intrinsic functions counts the number of
17615trailing zeros.
17616
17617Arguments:
17618""""""""""
17619
17620The first argument is the value to be counted. This argument may be of
17621any integer type, or a vector with integer element type. The return
17622type must match the first argument type.
17623
17624The second argument is a constant flag that indicates whether the intrinsic
17625returns a valid result if the first argument is zero. If the first
17626argument is zero and the second argument is true, the result is poison.
17627Historically some architectures did not provide a defined result for zero
17628values as efficiently, and many algorithms are now predicated on avoiding
17629zero-value inputs.
17630
17631Semantics:
17632""""""""""
17633
17634The '``llvm.cttz``' intrinsic counts the trailing (least significant)
17635zeros in a variable, or within each element of a vector. If ``src == 0``
17636then the result is the size in bits of the type of ``src`` if
17637``is_zero_poison == 0`` and ``poison`` otherwise. For example,
17638``llvm.cttz(2) = 1``.
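
A matching Python sketch for trailing zeros (illustrative only, not part of
LLVM). ``POISON`` is a hypothetical stand-in for a poison value, and ``bits``
names the integer width explicitly.

.. code-block:: python

      POISON = object()  # hypothetical marker for an LLVM poison value

      def cttz(src, bits, is_zero_poison=False):
          """Count trailing zeros of src viewed as an unsigned bits-wide integer."""
          src &= (1 << bits) - 1  # keep only the low `bits` bits
          if src == 0:
              # Zero input: poison if requested, otherwise the full bit width.
              return POISON if is_zero_poison else bits
          lowest_set_bit = src & -src  # isolates the least significant 1 bit
          return lowest_set_bit.bit_length() - 1
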
17639
17640.. _int_overflow:
17641
17642.. _int_fshl:
17643
17644'``llvm.fshl.*``' Intrinsic
17645^^^^^^^^^^^^^^^^^^^^^^^^^^^
17646
17647Syntax:
17648"""""""
17649
17650This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
17651integer bit width or any vector of integer elements. Not all targets
17652support all bit widths or vector types, however.
17653
17654::
17655
17656      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
17657      declare i64 @llvm.fshl.i64(i64 %a, i64 %b, i64 %c)
17658      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
17659
17660Overview:
17661"""""""""
17662
17663The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
17664the first two values are concatenated as { %a : %b } (%a is the most significant
17665bits of the wide value), the combined value is shifted left, and the most
17666significant bits are extracted to produce a result that is the same size as the
17667original arguments. If the first 2 arguments are identical, this is equivalent
17668to a rotate left operation. For vector types, the operation occurs for each
17669element of the vector. The shift argument is treated as an unsigned amount
17670modulo the element size of the arguments.
17671
17672Arguments:
17673""""""""""
17674
17675The first two arguments are the values to be concatenated. The third
17676argument is the shift amount. The arguments may be any integer type or a
17677vector with integer element type. All arguments and the return value must
17678have the same type.
17679
17680Example:
17681""""""""
17682
17683.. code-block:: text
17684
17685      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
17686      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
17687      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
17688      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)
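
The concatenate, shift, and extract steps can be sketched in Python as follows.
This is an illustrative model rather than LLVM code; the function name and the
``bits`` parameter (the element size) are assumptions of the example.

.. code-block:: python

      def fshl(a, b, c, bits=8):
          """Model of a funnel shift left on bits-wide unsigned values."""
          mask = (1 << bits) - 1
          wide = ((a & mask) << bits) | (b & mask)  # concat(a, b), 2*bits wide
          shift = (c & mask) % bits                 # unsigned amount modulo element size
          shifted = (wide << shift) & ((1 << (2 * bits)) - 1)
          return shifted >> bits                    # most significant `bits` bits

When both value arguments are the same, the model behaves as a rotate left:
``fshl(0b10000001, 0b10000001, 1)`` yields ``0b00000011``.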
17689
17690.. _int_fshr:
17691
17692'``llvm.fshr.*``' Intrinsic
17693^^^^^^^^^^^^^^^^^^^^^^^^^^^
17694
17695Syntax:
17696"""""""
17697
17698This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
17699integer bit width or any vector of integer elements. Not all targets
17700support all bit widths or vector types, however.
17701
17702::
17703
17704      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
17705      declare i64 @llvm.fshr.i64(i64 %a, i64 %b, i64 %c)
17706      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
17707
17708Overview:
17709"""""""""
17710
17711The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
17712the first two values are concatenated as { %a : %b } (%a is the most significant
17713bits of the wide value), the combined value is shifted right, and the least
17714significant bits are extracted to produce a result that is the same size as the
17715original arguments. If the first 2 arguments are identical, this is equivalent
17716to a rotate right operation. For vector types, the operation occurs for each
17717element of the vector. The shift argument is treated as an unsigned amount
17718modulo the element size of the arguments.
17719
17720Arguments:
17721""""""""""
17722
17723The first two arguments are the values to be concatenated. The third
17724argument is the shift amount. The arguments may be any integer type or a
17725vector with integer element type. All arguments and the return value must
17726have the same type.
17727
17728Example:
17729""""""""
17730
17731.. code-block:: text
17732
17733      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
17734      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
17735      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
17736      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)
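
The same steps for the right shift can be sketched in Python. This is an
illustrative model rather than LLVM code; the function name and the ``bits``
parameter (the element size) are assumptions of the example.

.. code-block:: python

      def fshr(a, b, c, bits=8):
          """Model of a funnel shift right on bits-wide unsigned values."""
          mask = (1 << bits) - 1
          wide = ((a & mask) << bits) | (b & mask)  # concat(a, b), 2*bits wide
          shift = (c & mask) % bits                 # unsigned amount modulo element size
          return (wide >> shift) & mask             # least significant `bits` bits

When both value arguments are the same, the model behaves as a rotate right:
``fshr(0b10000001, 0b10000001, 1)`` yields ``0b11000000``.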
17737
17738Arithmetic with Overflow Intrinsics
17739-----------------------------------
17740
17741LLVM provides intrinsics for fast arithmetic overflow checking.
17742
17743Each of these intrinsics returns a two-element struct. The first
17744element of this struct contains the result of the corresponding
17745arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
17746the result. Therefore, for example, the first element of the struct
17747returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
17748result of a 32-bit ``add`` instruction with the same operands, where
17749the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
17750
The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. For operands ``A`` and
``B``, an operation overflows if, for any ``N`` larger than the operands'
width, ``ext(A op B) to iN`` is not equal to
``(ext(A) to iN) op (ext(B) to iN)``, where ``ext`` is ``sext`` for signed
overflow and ``zext`` for unsigned overflow, and ``op`` is the underlying
arithmetic operation.
17758
17759The behavior of these intrinsics is well-defined for all argument
17760values.
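
The two-element result can be modeled with a short Python sketch. It is
illustrative only: ``with_overflow`` and ``to_signed`` are hypothetical helper
names, and the sketch compares the wrapped result against Python's exact
integer arithmetic, which plays the role of an extension to a width large
enough to hold the true result.

.. code-block:: python

      import operator

      def to_signed(x, bits):
          """Interpret the low `bits` bits of x as a two's complement value."""
          x &= (1 << bits) - 1
          return x - (1 << bits) if x >> (bits - 1) else x

      def with_overflow(op, a, b, bits, signed):
          """Return (result modulo 2**bits, overflow flag) for `a op b`."""
          mask = (1 << bits) - 1
          if signed:
              a, b = to_signed(a, bits), to_signed(b, bits)
          else:
              a, b = a & mask, b & mask
          exact = op(a, b)        # Python integers never wrap
          wrapped = exact & mask  # first struct element, as a bit pattern
          value = to_signed(wrapped, bits) if signed else wrapped
          return wrapped, value != exact

For example, ``with_overflow(operator.add, 255, 1, 8, signed=False)`` models
``llvm.uadd.with.overflow.i8`` and returns ``(0, True)``.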
17761
17762'``llvm.sadd.with.overflow.*``' Intrinsics
17763^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17764
17765Syntax:
17766"""""""
17767
17768This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
17769on any integer bit width or vectors of integers.
17770
17771::
17772
17773      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
17774      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
17775      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
17776      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17777
17778Overview:
17779"""""""""
17780
17781The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
17782a signed addition of the two arguments, and indicate whether an overflow
17783occurred during the signed summation.
17784
17785Arguments:
17786""""""""""
17787
17788The arguments (%a and %b) and the first element of the result structure
17789may be of integer types of any bit width, but they must have the same
17790bit width. The second element of the result structure must be of type
17791``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
17792addition.
17793
17794Semantics:
17795""""""""""
17796
The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments. They return a structure whose
first element is the signed sum and whose second element is a bit
specifying whether the signed addition overflowed.
17802
17803Examples:
17804"""""""""
17805
17806.. code-block:: llvm
17807
17808      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
17809      %sum = extractvalue {i32, i1} %res, 0
17810      %obit = extractvalue {i32, i1} %res, 1
17811      br i1 %obit, label %overflow, label %normal
17812
17813'``llvm.uadd.with.overflow.*``' Intrinsics
17814^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17815
17816Syntax:
17817"""""""
17818
17819This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
17820on any integer bit width or vectors of integers.
17821
17822::
17823
17824      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
17825      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
17826      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
17827      declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17828
17829Overview:
17830"""""""""
17831
17832The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
17833an unsigned addition of the two arguments, and indicate whether a carry
17834occurred during the unsigned summation.
17835
17836Arguments:
17837""""""""""
17838
17839The arguments (%a and %b) and the first element of the result structure
17840may be of integer types of any bit width, but they must have the same
17841bit width. The second element of the result structure must be of type
17842``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
17843addition.
17844
17845Semantics:
17846""""""""""
17847
17848The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
17849an unsigned addition of the two arguments. They return a structure --- the
17850first element of which is the sum, and the second element of which is a
17851bit specifying if the unsigned summation resulted in a carry.
17852
17853Examples:
17854"""""""""
17855
17856.. code-block:: llvm
17857
17858      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
17859      %sum = extractvalue {i32, i1} %res, 0
17860      %obit = extractvalue {i32, i1} %res, 1
17861      br i1 %obit, label %carry, label %normal
17862
17863'``llvm.ssub.with.overflow.*``' Intrinsics
17864^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17865
17866Syntax:
17867"""""""
17868
17869This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
17870on any integer bit width or vectors of integers.
17871
17872::
17873
17874      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
17875      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
17876      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
17877      declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17878
17879Overview:
17880"""""""""
17881
17882The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
17883a signed subtraction of the two arguments, and indicate whether an
17884overflow occurred during the signed subtraction.
17885
17886Arguments:
17887""""""""""
17888
17889The arguments (%a and %b) and the first element of the result structure
17890may be of integer types of any bit width, but they must have the same
17891bit width. The second element of the result structure must be of type
17892``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
17893subtraction.
17894
17895Semantics:
17896""""""""""
17897
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction overflowed.
17903
17904Examples:
17905"""""""""
17906
17907.. code-block:: llvm
17908
17909      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
17911      %obit = extractvalue {i32, i1} %res, 1
17912      br i1 %obit, label %overflow, label %normal
17913
17914'``llvm.usub.with.overflow.*``' Intrinsics
17915^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17916
17917Syntax:
17918"""""""
17919
17920This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
17921on any integer bit width or vectors of integers.
17922
17923::
17924
17925      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
17926      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
17927      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
17928      declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17929
17930Overview:
17931"""""""""
17932
17933The '``llvm.usub.with.overflow``' family of intrinsic functions perform
17934an unsigned subtraction of the two arguments, and indicate whether an
17935overflow occurred during the unsigned subtraction.
17936
17937Arguments:
17938""""""""""
17939
17940The arguments (%a and %b) and the first element of the result structure
17941may be of integer types of any bit width, but they must have the same
17942bit width. The second element of the result structure must be of type
17943``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
17944subtraction.
17945
17946Semantics:
17947""""""""""
17948
The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the unsigned subtraction overflowed.
17954
17955Examples:
17956"""""""""
17957
17958.. code-block:: llvm
17959
17960      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
17962      %obit = extractvalue {i32, i1} %res, 1
17963      br i1 %obit, label %overflow, label %normal
17964
17965'``llvm.smul.with.overflow.*``' Intrinsics
17966^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17967
17968Syntax:
17969"""""""
17970
17971This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
17972on any integer bit width or vectors of integers.
17973
17974::
17975
17976      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
17977      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
17978      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
17979      declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17980
17981Overview:
17982"""""""""
17983
17984The '``llvm.smul.with.overflow``' family of intrinsic functions perform
17985a signed multiplication of the two arguments, and indicate whether an
17986overflow occurred during the signed multiplication.
17987
17988Arguments:
17989""""""""""
17990
17991The arguments (%a and %b) and the first element of the result structure
17992may be of integer types of any bit width, but they must have the same
17993bit width. The second element of the result structure must be of type
17994``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
17995multiplication.
17996
17997Semantics:
17998""""""""""
17999
The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure whose
first element is the product and whose second element is a bit
specifying whether the signed multiplication overflowed.
18005
18006Examples:
18007"""""""""
18008
18009.. code-block:: llvm
18010
18011      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
18013      %obit = extractvalue {i32, i1} %res, 1
18014      br i1 %obit, label %overflow, label %normal
18015
18016'``llvm.umul.with.overflow.*``' Intrinsics
18017^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18018
18019Syntax:
18020"""""""
18021
18022This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
18023on any integer bit width or vectors of integers.
18024
18025::
18026
18027      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
18028      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
18029      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
18030      declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
18031
18032Overview:
18033"""""""""
18034
18035The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
18037overflow occurred during the unsigned multiplication.
18038
18039Arguments:
18040""""""""""
18041
18042The arguments (%a and %b) and the first element of the result structure
18043may be of integer types of any bit width, but they must have the same
18044bit width. The second element of the result structure must be of type
18045``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
18046multiplication.
18047
18048Semantics:
18049""""""""""
18050
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure whose
first element is the product and whose second element is a bit
specifying whether the unsigned multiplication overflowed.
18056
18057Examples:
18058"""""""""
18059
18060.. code-block:: llvm
18061
18062      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
18064      %obit = extractvalue {i32, i1} %res, 1
18065      br i1 %obit, label %overflow, label %normal
18066
18067Saturation Arithmetic Intrinsics
18068---------------------------------
18069
18070Saturation arithmetic is a version of arithmetic in which operations are
18071limited to a fixed range between a minimum and maximum value. If the result of
18072an operation is greater than the maximum value, the result is set (or
18073"clamped") to this maximum. If it is below the minimum, it is clamped to this
18074minimum.
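
The clamping rule can be sketched in Python as follows. This is an illustrative
model, not LLVM code: the function names mirror the intrinsics but are
hypothetical helpers, operands are passed as mathematical integers already
within the range of the type, and ``bits`` names the width explicitly.

.. code-block:: python

      def saturate(exact, bits, signed):
          """Clamp an exact result to the range of a bits-wide integer type."""
          if signed:
              lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          else:
              lo, hi = 0, (1 << bits) - 1
          return max(lo, min(hi, exact))

      def sadd_sat(a, b, bits):
          return saturate(a + b, bits, signed=True)

      def uadd_sat(a, b, bits):
          return saturate(a + b, bits, signed=False)

      def ssub_sat(a, b, bits):
          return saturate(a - b, bits, signed=True)

      def usub_sat(a, b, bits):
          return saturate(a - b, bits, signed=False)

For an ``i4``, ``sadd_sat(5, 6, 4)`` clamps to 7 and ``usub_sat(2, 6, 4)``
clamps to 0.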
18075
18076.. _int_sadd_sat:
18077
18078'``llvm.sadd.sat.*``' Intrinsics
18079^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18080
18081Syntax
18082"""""""
18083
18084This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
18085on any integer bit width or vectors of integers.
18086
18087::
18088
18089      declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
18090      declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
18091      declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
18092      declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18093
18094Overview
18095"""""""""
18096
18097The '``llvm.sadd.sat``' family of intrinsic functions perform signed
18098saturating addition on the 2 arguments.
18099
18100Arguments
18101""""""""""
18102
18103The arguments (%a and %b) and the result may be of integer types of any bit
18104width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18105values that will undergo signed addition.
18106
18107Semantics:
18108""""""""""
18109
18110The maximum value this operation can clamp to is the largest signed value
18111representable by the bit width of the arguments. The minimum value is the
18112smallest signed value representable by this bit width.
18113
18114
18115Examples
18116"""""""""
18117
18118.. code-block:: llvm
18119
18120      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
18121      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
18122      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
18123      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8
18124
18125
18126.. _int_uadd_sat:
18127
18128'``llvm.uadd.sat.*``' Intrinsics
18129^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18130
18131Syntax
18132"""""""
18133
18134This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
18135on any integer bit width or vectors of integers.
18136
18137::
18138
18139      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
18140      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
18141      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
18142      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18143
18144Overview
18145"""""""""
18146
18147The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
18148saturating addition on the 2 arguments.
18149
18150Arguments
18151""""""""""
18152
18153The arguments (%a and %b) and the result may be of integer types of any bit
18154width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18155values that will undergo unsigned addition.
18156
18157Semantics:
18158""""""""""
18159
18160The maximum value this operation can clamp to is the largest unsigned value
18161representable by the bit width of the arguments. Because this is an unsigned
18162operation, the result will never saturate towards zero.
18163
18164
18165Examples
18166"""""""""
18167
18168.. code-block:: llvm
18169
18170      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
18171      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
18172      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15
18173
18174
18175.. _int_ssub_sat:
18176
18177'``llvm.ssub.sat.*``' Intrinsics
18178^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18179
18180Syntax
18181"""""""
18182
18183This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
18184on any integer bit width or vectors of integers.
18185
18186::
18187
18188      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
18189      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
18190      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
18191      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18192
18193Overview
18194"""""""""
18195
18196The '``llvm.ssub.sat``' family of intrinsic functions perform signed
18197saturating subtraction on the 2 arguments.
18198
18199Arguments
18200""""""""""
18201
18202The arguments (%a and %b) and the result may be of integer types of any bit
18203width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18204values that will undergo signed subtraction.
18205
18206Semantics:
18207""""""""""
18208
18209The maximum value this operation can clamp to is the largest signed value
18210representable by the bit width of the arguments. The minimum value is the
18211smallest signed value representable by this bit width.
18212
18213
18214Examples
18215"""""""""
18216
18217.. code-block:: llvm
18218
18219      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
18220      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
18221      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
18222      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7
18223
18224
18225.. _int_usub_sat:
18226
18227'``llvm.usub.sat.*``' Intrinsics
18228^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18229
18230Syntax
18231"""""""
18232
18233This is an overloaded intrinsic. You can use ``llvm.usub.sat``
18234on any integer bit width or vectors of integers.
18235
18236::
18237
18238      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
18239      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
18240      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
18241      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18242
18243Overview
18244"""""""""
18245
18246The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
18247saturating subtraction on the 2 arguments.
18248
18249Arguments
18250""""""""""
18251
18252The arguments (%a and %b) and the result may be of integer types of any bit
18253width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18254values that will undergo unsigned subtraction.
18255
18256Semantics:
18257""""""""""
18258
18259The minimum value this operation can clamp to is 0, which is the smallest
18260unsigned value representable by the bit width of the unsigned arguments.
18261Because this is an unsigned operation, the result will never saturate towards
18262the largest possible value representable by this bit width.
18263
18264
18265Examples
18266"""""""""
18267
18268.. code-block:: llvm
18269
18270      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
18271      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0
18272
18273
18274'``llvm.sshl.sat.*``' Intrinsics
18275^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18276
18277Syntax
18278"""""""
18279
18280This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
18281on integers or vectors of integers of any bit width.
18282
18283::
18284
18285      declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
18286      declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
18287      declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
18288      declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18289
18290Overview
18291"""""""""
18292
18293The '``llvm.sshl.sat``' family of intrinsic functions perform signed
18294saturating left shift on the first argument.
18295
18296Arguments
18297""""""""""
18298
18299The arguments (``%a`` and ``%b``) and the result may be of integer types of any
18300bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
18306
18307
18308Semantics:
18309""""""""""
18310
18311The maximum value this operation can clamp to is the largest signed value
18312representable by the bit width of the arguments. The minimum value is the
18313smallest signed value representable by this bit width.
18314
18315
18316Examples
18317"""""""""
18318
18319.. code-block:: llvm
18320
18321      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
18322      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
18323      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
18324      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2
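
A minimal Python sketch of the signed saturating shift, for illustration only.
``POISON`` is a hypothetical stand-in for a poison value, ``a`` is passed as a
signed mathematical integer, ``b`` as the unsigned shift amount, and ``bits``
names the width explicitly.

.. code-block:: python

      POISON = object()  # hypothetical marker for an LLVM poison value

      def sshl_sat(a, b, bits):
          """Model of a signed saturating left shift on bits-wide values."""
          if not 0 <= b < bits:
              return POISON  # shift amount equal to or larger than the bit width
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          return max(lo, min(hi, a << b))  # clamp the exact shifted value
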
18325
18326
18327'``llvm.ushl.sat.*``' Intrinsics
18328^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18329
18330Syntax
18331"""""""
18332
18333This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
18334on integers or vectors of integers of any bit width.
18335
18336::
18337
18338      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
18339      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
18340      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
18341      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18342
18343Overview
18344"""""""""
18345
18346The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
18347saturating left shift on the first argument.
18348
18349Arguments
18350""""""""""
18351
18352The arguments (``%a`` and ``%b``) and the result may be of integer types of any
18353bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
18359
18360Semantics:
18361""""""""""
18362
18363The maximum value this operation can clamp to is the largest unsigned value
18364representable by the bit width of the arguments.
18365
18366
18367Examples
18368"""""""""
18369
18370.. code-block:: llvm
18371
18372      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
18373      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15
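
A minimal Python sketch of the unsigned saturating shift, for illustration
only. ``POISON`` is a hypothetical stand-in for a poison value, both operands
are passed as unsigned mathematical integers, and ``bits`` names the width
explicitly.

.. code-block:: python

      POISON = object()  # hypothetical marker for an LLVM poison value

      def ushl_sat(a, b, bits):
          """Model of an unsigned saturating left shift on bits-wide values."""
          if not 0 <= b < bits:
              return POISON  # shift amount equal to or larger than the bit width
          return min((1 << bits) - 1, a << b)  # clamp to the largest unsigned value
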
18374
18375
18376Fixed Point Arithmetic Intrinsics
18377---------------------------------
18378
A fixed point number is a representation of a real number that has a fixed
number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. Fixed
point numbers are useful for representing fractional values to a specific
precision. The following intrinsics perform fixed point arithmetic operations
on 2 operands of the same scale, specified as the third argument.
18385
18386The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
18387of fixed point numbers through scaled integers. Therefore, fixed point
18388multiplication can be represented as
18389
18390.. code-block:: llvm
18391
18392        %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)
18393
18394        ; Expands to
18395        %a2 = sext i4 %a to i8
18396        %b2 = sext i4 %b to i8
        %mul = mul nsw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
18400        %result = trunc i8 %r to i4
18401
18402The ``llvm.*div.fix`` family of intrinsic functions represents a division of
18403fixed point numbers through scaled integers. Fixed point division can be
18404represented as:
18405
18406.. code-block:: llvm
18407
        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)
18409
18410        ; Expands to
18411        %a2 = sext i4 %a to i8
18412        %b2 = sext i4 %b to i8
18413        %scale2 = trunc i32 %scale to i8
18414        %a3 = shl i8 %a2, %scale2
18415        %r = sdiv i8 %a3, %b2 ; this is for a target rounding towards zero
18416        %result = trunc i8 %r to i4
18417
For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is left
unspecified by the IR because the preferred rounding may vary between targets;
it is instead specified through a target hook. Different pipelines should
legalize or optimize this using the rounding specified by this hook if it is
provided. Operations like constant folding, instruction combining, KnownBits,
and ValueTracking should also use this hook, if provided, and not assume the
direction of rounding. A rounded result must always be within one unit of
precision from the true result. That is, the error between the returned result
and the true result must be less than 1/2^(scale).
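
The division expansion and the error bound can be checked with a small Python
sketch. It is illustrative only: ``sdiv_fix`` and ``fix_to_fraction`` are
hypothetical helpers, the sketch rounds towards zero as in the expansion above,
assumes a nonzero divisor, and does not truncate the result to the operand
width.

.. code-block:: python

      from fractions import Fraction

      def sdiv_fix(a, b, scale):
          """Fixed point division of scaled integers, rounding towards zero."""
          num = a << scale          # pre-shift so the quotient keeps the scale
          q = abs(num) // abs(b)    # magnitude of the truncated quotient
          return -q if (num < 0) != (b < 0) else q

      def fix_to_fraction(x, scale):
          """Exact real value represented by scaled integer x."""
          return Fraction(x, 1 << scale)

With a scale of 2, ``sdiv_fix(1, 3, 2)`` divides 0.25 by 0.75 and returns 1,
which represents 0.25; the error against the true quotient 1/3 is below
1/2^2.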
18428
18429
18430'``llvm.smul.fix.*``' Intrinsics
18431^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18432
18433Syntax
18434"""""""
18435
18436This is an overloaded intrinsic. You can use ``llvm.smul.fix``
18437on any integer bit width or vectors of integers.
18438
18439::
18440
18441      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
18442      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
18443      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
18444      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18445
18446Overview
18447"""""""""
18448
18449The '``llvm.smul.fix``' family of intrinsic functions perform signed
18450fixed point multiplication on 2 arguments of the same scale.
18451
18452Arguments
18453""""""""""
18454
The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the
two values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
18461
18462Semantics:
18463""""""""""
18464
18465This operation performs fixed point multiplication on the 2 arguments of a
18466specified scale. The result will also be returned in the same scale specified
18467in the third argument.
18468
18469If the result value cannot be precisely represented in the given scale, the
18470value is rounded up or down to the closest representable value. The rounding
18471direction is unspecified.
18472
18473It is undefined behavior if the result value does not fit within the range of
18474the fixed point type.
18475
18476
18477Examples
18478"""""""""
18479
18480.. code-block:: llvm
18481
18482      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
18483      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
18484      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
18485
18486      ; The result in the following could be rounded up to -2 or down to -2.5
18487      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
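
The vector overloads apply the same operation to each pair of corresponding
elements. The following sketch uses a scale of 16 with operands chosen so that
every product is exactly representable; the constants are illustrative only.

.. code-block:: llvm

      ; scale = 16: <1.5, 1.0, -1.0, 0.5> x <2.0, 2.0, 2.0, 2.0>
      %res = call <4 x i32> @llvm.smul.fix.v4i32(
                 <4 x i32> <i32 98304, i32 65536, i32 -65536, i32 32768>,
                 <4 x i32> <i32 131072, i32 131072, i32 131072, i32 131072>,
                 i32 16)
      ; %res = <i32 196608, i32 131072, i32 -131072, i32 65536>
      ;      = <3.0, 2.0, -2.0, 1.0>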
18488
18489
18490'``llvm.umul.fix.*``' Intrinsics
18491^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18492
18493Syntax
18494"""""""
18495
18496This is an overloaded intrinsic. You can use ``llvm.umul.fix``
18497on any integer bit width or vectors of integers.
18498
18499::
18500
18501      declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
18502      declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
18503      declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
18504      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18505
18506Overview
18507"""""""""
18508
18509The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
18510fixed point multiplication on 2 arguments of the same scale.
18511
18512Arguments
18513""""""""""
18514
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, in which case all of them must have the same number of elements and
the same element bit width. ``%a`` and ``%b`` are the two values that will
undergo unsigned fixed point multiplication. The argument ``%scale`` represents
the scale of both operands, and must be a constant integer.
18521
18522Semantics:
18523""""""""""
18524
18525This operation performs unsigned fixed point multiplication on the 2 arguments of a
18526specified scale. The result will also be returned in the same scale specified
18527in the third argument.
18528
18529If the result value cannot be precisely represented in the given scale, the
18530value is rounded up or down to the closest representable value. The rounding
18531direction is unspecified.
18532
18533It is undefined behavior if the result value does not fit within the range of
18534the fixed point type.
18535
18536
18537Examples
18538"""""""""
18539
18540.. code-block:: llvm
18541
18542      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
18543      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
18544
18545      ; The result in the following could be rounded down to 3.5 or up to 4
18546      %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
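
One further case, sketched here on the assumption that an unsigned scale may
equal the bit width (as in the ``llvm.udiv.fix`` example that uses ``i32 4``
with ``i4`` operands): every bit is then fractional and the operands lie in
the range [0, 1).

.. code-block:: llvm

      ; scale = 4 on i4: raw 8 represents 8/16 = 0.5
      %res = call i4 @llvm.umul.fix.i4(i4 8, i4 8, i32 4)  ; %res = 4 (0.5 x 0.5 = 0.25)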
18547
18548
18549'``llvm.smul.fix.sat.*``' Intrinsics
18550^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18551
18552Syntax
18553"""""""
18554
18555This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
18556on any integer bit width or vectors of integers.
18557
18558::
18559
18560      declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18561      declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18562      declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18563      declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18564
18565Overview
18566"""""""""
18567
18568The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
18569fixed point saturating multiplication on 2 arguments of the same scale.
18570
18571Arguments
18572""""""""""
18573
18574The arguments (%a and %b) and the result may be of integer types of any bit
18575width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18576values that will undergo signed fixed point multiplication. The argument
18577``%scale`` represents the scale of both operands, and must be a constant
18578integer.
18579
18580Semantics:
18581""""""""""
18582
18583This operation performs fixed point multiplication on the 2 arguments of a
18584specified scale. The result will also be returned in the same scale specified
18585in the third argument.
18586
18587If the result value cannot be precisely represented in the given scale, the
18588value is rounded up or down to the closest representable value. The rounding
18589direction is unspecified.
18590
18591The maximum value this operation can clamp to is the largest signed value
18592representable by the bit width of the first 2 arguments. The minimum value is the
18593smallest signed value representable by this bit width.
18594
18595
18596Examples
18597"""""""""
18598
18599.. code-block:: llvm
18600
18601      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
18602      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
18603      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
18604
18605      ; The result in the following could be rounded up to -2 or down to -2.5
18606      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
18607
18608      ; Saturation
18609      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
18610      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
18611      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
18612      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7
18613
18614      ; Scale can affect the saturation result
18615      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
18616      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)
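
The clamping can be pictured as the following expansion for ``i4`` operands
and a scale of 1. This is only a sketch of one possible lowering, for a target
that rounds towards negative infinity; it assumes the ``llvm.smax`` and
``llvm.smin`` intrinsics are available, and no backend is required to
implement the operation this way.

.. code-block:: llvm

      ; %res = call i4 @llvm.smul.fix.sat.i4(i4 %a, i4 %b, i32 1)
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %mul = mul i8 %a2, %b2                        ; exact product, 2 fractional bits
      %shr = ashr i8 %mul, 1                        ; back to 1 fractional bit
      %lo = call i8 @llvm.smax.i8(i8 %shr, i8 -8)   ; clamp to the i4 minimum
      %hi = call i8 @llvm.smin.i8(i8 %lo, i8 7)     ; clamp to the i4 maximum
      %res = trunc i8 %hi to i4

For example, with ``%a = -8`` and ``%b = -2`` the shifted product is 8, which
is clamped to 7.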
18617
18618
18619'``llvm.umul.fix.sat.*``' Intrinsics
18620^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18621
18622Syntax
18623"""""""
18624
18625This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
18626on any integer bit width or vectors of integers.
18627
18628::
18629
18630      declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18631      declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18632      declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18633      declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18634
18635Overview
18636"""""""""
18637
18638The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
18639fixed point saturating multiplication on 2 arguments of the same scale.
18640
18641Arguments
18642""""""""""
18643
18644The arguments (%a and %b) and the result may be of integer types of any bit
18645width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18646values that will undergo unsigned fixed point multiplication. The argument
18647``%scale`` represents the scale of both operands, and must be a constant
18648integer.
18649
18650Semantics:
18651""""""""""
18652
18653This operation performs fixed point multiplication on the 2 arguments of a
18654specified scale. The result will also be returned in the same scale specified
18655in the third argument.
18656
18657If the result value cannot be precisely represented in the given scale, the
18658value is rounded up or down to the closest representable value. The rounding
18659direction is unspecified.
18660
18661The maximum value this operation can clamp to is the largest unsigned value
18662representable by the bit width of the first 2 arguments. The minimum value is the
18663smallest unsigned value representable by this bit width (zero).
18664
18665
18666Examples
18667"""""""""
18668
18669.. code-block:: llvm
18670
18671      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
18672      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
18673
18674      ; The result in the following could be rounded down to 2 or up to 2.5
18675      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
18676
18677      ; Saturation
18678      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
18679      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)
18680
18681      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
18684
18685
18686'``llvm.sdiv.fix.*``' Intrinsics
18687^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18688
18689Syntax
18690"""""""
18691
18692This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
18693on any integer bit width or vectors of integers.
18694
18695::
18696
18697      declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
18698      declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
18699      declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
18700      declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18701
18702Overview
18703"""""""""
18704
18705The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
18706fixed point division on 2 arguments of the same scale.
18707
18708Arguments
18709""""""""""
18710
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, in which case all of them must have the same number of elements and
the same element bit width. ``%a`` and ``%b`` are the two values that will
undergo signed fixed point division. The argument ``%scale`` represents the
scale of both operands, and must be a constant integer.
18717
18718Semantics:
18719""""""""""
18720
18721This operation performs fixed point division on the 2 arguments of a
18722specified scale. The result will also be returned in the same scale specified
18723in the third argument.
18724
18725If the result value cannot be precisely represented in the given scale, the
18726value is rounded up or down to the closest representable value. The rounding
18727direction is unspecified.
18728
18729It is undefined behavior if the result value does not fit within the range of
18730the fixed point type, or if the second argument is zero.
18731
18732
18733Examples
18734"""""""""
18735
18736.. code-block:: llvm
18737
18738      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
18739      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
18740      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
18741
18742      ; The result in the following could be rounded up to 1 or down to 0.5
18743      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
18744
18745
18746'``llvm.udiv.fix.*``' Intrinsics
18747^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18748
18749Syntax
18750"""""""
18751
18752This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
18753on any integer bit width or vectors of integers.
18754
18755::
18756
18757      declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
18758      declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
18759      declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
18760      declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18761
18762Overview
18763"""""""""
18764
18765The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
18766fixed point division on 2 arguments of the same scale.
18767
18768Arguments
18769""""""""""
18770
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, in which case all of them must have the same number of elements and
the same element bit width. ``%a`` and ``%b`` are the two values that will
undergo unsigned fixed point division. The argument ``%scale`` represents the
scale of both operands, and must be a constant integer.
18777
18778Semantics:
18779""""""""""
18780
18781This operation performs fixed point division on the 2 arguments of a
18782specified scale. The result will also be returned in the same scale specified
18783in the third argument.
18784
18785If the result value cannot be precisely represented in the given scale, the
18786value is rounded up or down to the closest representable value. The rounding
18787direction is unspecified.
18788
18789It is undefined behavior if the result value does not fit within the range of
18790the fixed point type, or if the second argument is zero.
18791
18792
18793Examples
18794"""""""""
18795
18796.. code-block:: llvm
18797
18798      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
18799      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
18800      %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
18801
18802      ; The result in the following could be rounded up to 1 or down to 0.5
18803      %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
18804
18805
18806'``llvm.sdiv.fix.sat.*``' Intrinsics
18807^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18808
18809Syntax
18810"""""""
18811
18812This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
18813on any integer bit width or vectors of integers.
18814
18815::
18816
18817      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18818      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18819      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18820      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18821
18822Overview
18823"""""""""
18824
18825The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
18826fixed point saturating division on 2 arguments of the same scale.
18827
18828Arguments
18829""""""""""
18830
18831The arguments (%a and %b) and the result may be of integer types of any bit
18832width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18833values that will undergo signed fixed point division. The argument
18834``%scale`` represents the scale of both operands, and must be a constant
18835integer.
18836
18837Semantics:
18838""""""""""
18839
18840This operation performs fixed point division on the 2 arguments of a
18841specified scale. The result will also be returned in the same scale specified
18842in the third argument.
18843
18844If the result value cannot be precisely represented in the given scale, the
18845value is rounded up or down to the closest representable value. The rounding
18846direction is unspecified.
18847
18848The maximum value this operation can clamp to is the largest signed value
18849representable by the bit width of the first 2 arguments. The minimum value is the
18850smallest signed value representable by this bit width.
18851
18852It is undefined behavior if the second argument is zero.
18853
18854
18855Examples
18856"""""""""
18857
18858.. code-block:: llvm
18859
18860      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
18861      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
18862      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
18863
18864      ; The result in the following could be rounded up to 1 or down to 0.5
18865      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
18866
18867      ; Saturation
18868      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
18869      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
18870      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)
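
Because saturation removes the out-of-range case, a zero divisor is the only
remaining source of undefined behavior. A front end could guard against it as
sketched below. The function name ``@checked_div_q8`` is hypothetical, and
returning 0 for a zero divisor is an arbitrary policy chosen for illustration.

.. code-block:: llvm

      declare i16 @llvm.sdiv.fix.sat.i16(i16, i16, i32)

      define i16 @checked_div_q8(i16 %a, i16 %b) {
      entry:
        %is.zero = icmp eq i16 %b, 0
        br i1 %is.zero, label %div.by.zero, label %do.div

      do.div:
        ; Scale 8: both operands and the result have 8 fractional bits.
        %q = call i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 8)
        ret i16 %q

      div.by.zero:
        ret i16 0
      }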
18871
18872
18873'``llvm.udiv.fix.sat.*``' Intrinsics
18874^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18875
18876Syntax
18877"""""""
18878
18879This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
18880on any integer bit width or vectors of integers.
18881
18882::
18883
18884      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18885      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18886      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18887      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18888
18889Overview
18890"""""""""
18891
18892The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
18893fixed point saturating division on 2 arguments of the same scale.
18894
18895Arguments
18896""""""""""
18897
18898The arguments (%a and %b) and the result may be of integer types of any bit
18899width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18900values that will undergo unsigned fixed point division. The argument
18901``%scale`` represents the scale of both operands, and must be a constant
18902integer.
18903
18904Semantics:
18905""""""""""
18906
18907This operation performs fixed point division on the 2 arguments of a
18908specified scale. The result will also be returned in the same scale specified
18909in the third argument.
18910
18911If the result value cannot be precisely represented in the given scale, the
18912value is rounded up or down to the closest representable value. The rounding
18913direction is unspecified.
18914
18915The maximum value this operation can clamp to is the largest unsigned value
18916representable by the bit width of the first 2 arguments. The minimum value is the
18917smallest unsigned value representable by this bit width (zero).
18918
18919It is undefined behavior if the second argument is zero.
18920
18921Examples
18922"""""""""
18923
18924.. code-block:: llvm
18925
18926      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
18927      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
18928
18929      ; The result in the following could be rounded down to 0.5 or up to 1
18930      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)
18931
18932      ; Saturation
18933      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)
18934
18935
18936Specialized Arithmetic Intrinsics
18937---------------------------------
18938
18939.. _i_intr_llvm_canonicalize:
18940
18941'``llvm.canonicalize.*``' Intrinsic
18942^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18943
18944Syntax:
18945"""""""
18946
18947::
18948
18949      declare float @llvm.canonicalize.f32(float %a)
18950      declare double @llvm.canonicalize.f64(double %b)
18951
18952Overview:
18953"""""""""
18954
18955The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
18956encoding of a floating-point number. This canonicalization is useful for
18957implementing certain numeric primitives such as frexp. The canonical encoding is
18958defined by IEEE-754-2008 to be:
18959
18960::
18961
18962      2.1.8 canonical encoding: The preferred encoding of a floating-point
18963      representation in a format. Applied to declets, significands of finite
18964      numbers, infinities, and NaNs, especially in decimal formats.
18965
18966This operation can also be considered equivalent to the IEEE-754-2008
18967conversion of a floating-point value to the same format. NaNs are handled
18968according to section 6.2.
18969
18970Examples of non-canonical encodings:
18971
18972- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
18973  converted to a canonical representation per hardware-specific protocol.
18974- Many normal decimal floating-point numbers have non-canonical alternative
18975  encodings.
18976- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
18977  These are treated as non-canonical encodings of zero and will be flushed to
18978  a zero of the same sign by this operation.
18979
18980Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
18981default exception handling must signal an invalid exception, and produce a
18982quiet NaN result.
18983
18984This function should always be implementable as multiplication by 1.0, provided
18985that the compiler does not constant fold the operation. Likewise, division by
189861.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
18987-0.0 is also sufficient provided that the rounding mode is not -Infinity.
18988
18989``@llvm.canonicalize`` must preserve the equality relation. That is:
18990
18991- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
18992- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent
18993  to ``(x == y)``
18994
18995Additionally, the sign of zero must be conserved:
18996``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
18997
18998The payload bits of a NaN must be conserved, with two exceptions.
18999First, environments which use only a single canonical representation of NaN
19000must perform said canonicalization. Second, SNaNs must be quieted per the
19001usual methods.
19002
19003The canonicalization operation may be optimized away if:
19004
19005- The input is known to be canonical. For example, it was produced by a
19006  floating-point operation that is required by the standard to be canonical.
19007- The result is consumed only by (or fused with) other floating-point
19008  operations. That is, the bits of the floating-point value are not examined.
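
The equality requirement stated above can be written out in IR as follows.
This is an illustrative sketch rather than a test taken from the LLVM tree,
and the function name is hypothetical.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      define i1 @canonicalize_preserves_equality(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        %eq.c = fcmp oeq float %c, %x     ; canonicalize(x) == x
        %eq.x = fcmp oeq float %x, %x     ; x == x (false only for NaN)
        %same = icmp eq i1 %eq.c, %eq.x   ; expected to be true for every %x
        ret i1 %same
      }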
19009
19010.. _int_fmuladd:
19011
19012'``llvm.fmuladd.*``' Intrinsic
19013^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19014
19015Syntax:
19016"""""""
19017
19018::
19019
19020      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
19021      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
19022
19023Overview:
19024"""""""""
19025
19026The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
19027expressions that can be fused if the code generator determines that (a) the
19028target instruction set has support for a fused operation, and (b) that the
19029fused operation is more efficient than the equivalent, separate pair of mul
19030and add instructions.
19031
19032Arguments:
19033""""""""""
19034
19035The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
19036multiplicands, a and b, and an addend c.
19037
19038Semantics:
19039""""""""""
19040
The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression a \* b + c, except that it is unspecified
whether rounding will be performed between the multiplication and addition
steps. Fusion is not guaranteed, even if the target platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets ``errno``.
19053
19054Examples:
19055"""""""""
19056
19057.. code-block:: llvm
19058
19059      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
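
For comparison, the three ways of expressing a multiply-add differ only in
where rounding may occur. The fragment below is a sketch with arbitrary value
names, and it assumes no fast-math flags are present on the separate
instructions.

.. code-block:: llvm

      ; Either one rounding (fused) or two roundings (unfused); the code
      ; generator decides.
      %r.either = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

      ; Two roundings: the product is rounded before the addition.
      %mul = fmul float %a, %b
      %r.unfused = fadd float %mul, %c

      ; One rounding: the product is not rounded before the addition.
      %r.fused = call float @llvm.fma.f32(float %a, float %b, float %c)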
19060
19061
19062Hardware-Loop Intrinsics
19063------------------------
19064
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required either to lower these intrinsics
further to target-specific instructions, or to revert the hardware-loop to a
normal loop if target-specific restrictions are not met and a hardware-loop
can't be generated.

These intrinsics may be modified in the future and are not intended to be used
outside the backend. Thus, front ends and mid-level optimizations should not
generate these intrinsics.
19073
19074
19075'``llvm.set.loop.iterations.*``' Intrinsic
19076^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19077
19078Syntax:
19079"""""""
19080
19081This is an overloaded intrinsic.
19082
19083::
19084
19085      declare void @llvm.set.loop.iterations.i32(i32)
19086      declare void @llvm.set.loop.iterations.i64(i64)
19087
19088Overview:
19089"""""""""
19090
19091The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
19092hardware-loop trip count. They are placed in the loop preheader basic block and
19093are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
19094instructions.
19095
19096Arguments:
19097""""""""""
19098
19099The integer operand is the loop trip count of the hardware-loop, and thus
19100not e.g. the loop back-edge taken count.
19101
19102Semantics:
19103""""""""""
19104
The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
19109
19110
19111'``llvm.start.loop.iterations.*``' Intrinsic
19112^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19113
19114Syntax:
19115"""""""
19116
19117This is an overloaded intrinsic.
19118
19119::
19120
19121      declare i32 @llvm.start.loop.iterations.i32(i32)
19122      declare i64 @llvm.start.loop.iterations.i64(i64)
19123
19124Overview:
19125"""""""""
19126
The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the hardware-loop
trip count, but they also produce a value identical to the input that can be
used as the input to the loop. They are placed in the loop preheader basic
block, and the output is expected to be the input to the phi for the induction
variable of the loop, which is decremented by the
'``llvm.loop.decrement.reg.*``' intrinsic.
19134
19135Arguments:
19136""""""""""
19137
19138The integer operand is the loop trip count of the hardware-loop, and thus
19139not e.g. the loop back-edge taken count.
19140
19141Semantics:
19142""""""""""
19143
The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
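
The following sketch shows the shape of loop these intrinsics are meant to
describe, with the returned value feeding the counter phi. It is illustrative
only: the function name is hypothetical, ``%n`` is assumed to be non-zero, and
such code is expected to be produced by the HardwareLoops pass rather than
written by hand.

.. code-block:: llvm

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      ; Returns %n + (%n - 1) + ... + 1.
      define i32 @sum_down(i32 %n) {
      entry:                                   ; acts as the loop preheader
        %start = call i32 @llvm.start.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %rem = phi i32 [ %start, %entry ], [ %rem.next, %loop ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %rem
        %rem.next = call i32 @llvm.loop.decrement.reg.i32(i32 %rem, i32 1)
        %continue = icmp ne i32 %rem.next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        ret i32 %acc.next
      }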
19148
19149'``llvm.test.set.loop.iterations.*``' Intrinsic
19150^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19151
19152Syntax:
19153"""""""
19154
19155This is an overloaded intrinsic.
19156
19157::
19158
19159      declare i1 @llvm.test.set.loop.iterations.i32(i32)
19160      declare i1 @llvm.test.set.loop.iterations.i64(i64)
19161
19162Overview:
19163"""""""""
19164
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
them to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
19170
19171Arguments:
19172""""""""""
19173
19174The integer operand is the loop trip count of the hardware-loop, and thus
19175not e.g. the loop back-edge taken count.
19176
19177Semantics:
19178""""""""""
19179
The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a condition that is true if the given count is not
zero.
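
A sketch of a guarded while-loop using this intrinsic follows. The function
and block names are hypothetical, and the loop body is elided to a comment.

.. code-block:: llvm

      declare i1 @llvm.test.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define void @while_loop(i32 %n) {
      entry:                                   ; predecessor of the preheader
        ; True if %n is not zero, so a zero trip count skips the loop.
        %enter = call i1 @llvm.test.set.loop.iterations.i32(i32 %n)
        br i1 %enter, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        ; ... loop body ...
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }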
19185
19186
19187'``llvm.test.start.loop.iterations.*``' Intrinsic
19188^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19189
19190Syntax:
19191"""""""
19192
19193This is an overloaded intrinsic.
19194
19195::
19196
19197      declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
19198      declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
19199
19200Overview:
19201"""""""""
19202
The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics: they specify the hardware-loop trip count, but also produce a
value identical to the input that can be used as the input to the loop. The
second output, of type ``i1``, controls entry to a while-loop.
19208
19209Arguments:
19210""""""""""
19211
19212The integer operand is the loop trip count of the hardware-loop, and thus
19213not e.g. the loop back-edge taken count.
19214
19215Semantics:
19216""""""""""
19217
The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair of the input and a condition that is true
if the given count is not zero.
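
The two results are typically separated with ``extractvalue``, as in the
sketch below. The function and block names are hypothetical; the structure is
intended to illustrate the description above, not to prescribe it.

.. code-block:: llvm

      declare { i32, i1 } @llvm.test.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      ; Returns %n + (%n - 1) + ... + 1, or 0 if %n is zero.
      define i32 @sum_down_or_zero(i32 %n) {
      entry:
        %pair = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %n)
        %enter = extractvalue { i32, i1 } %pair, 1
        br i1 %enter, label %preheader, label %exit

      preheader:
        %start = extractvalue { i32, i1 } %pair, 0
        br label %loop

      loop:
        %rem = phi i32 [ %start, %preheader ], [ %rem.next, %loop ]
        %acc = phi i32 [ 0, %preheader ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %rem
        %rem.next = call i32 @llvm.loop.decrement.reg.i32(i32 %rem, i32 1)
        %continue = icmp ne i32 %rem.next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        %result = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        ret i32 %result
      }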
19224
19225
19226'``llvm.loop.decrement.reg.*``' Intrinsic
19227^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19228
19229Syntax:
19230"""""""
19231
19232This is an overloaded intrinsic.
19233
19234::
19235
19236      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
19237      declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
19238
19239Overview:
19240"""""""""
19241
19242The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
19243iteration counter and return an updated value that will be used in the next
19244loop test check.
19245
19246Arguments:
19247""""""""""
19248
19249Both arguments must have identical integer types. The first operand is the
19250loop iteration counter. The second operand is the maximum number of elements
19251processed in an iteration.
19252
19253Semantics:
19254""""""""""
19255
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimization is allowed to treat the intrinsic as a ``SUB``, and it is
supported by SCEV, so it is the backend's responsibility to handle cases where
it may be optimized. These intrinsics are marked as ``IntrNoDuplicate`` to
avoid optimizers duplicating these instructions.
19264
19265
19266'``llvm.loop.decrement.*``' Intrinsic
19267^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19268
19269Syntax:
19270"""""""
19271
19272This is an overloaded intrinsic.
19273
19274::
19275
19276      declare i1 @llvm.loop.decrement.i32(i32)
19277      declare i1 @llvm.loop.decrement.i64(i64)
19278
19279Overview:
19280"""""""""
19281
The HardwareLoops pass allows the loop decrement value to be specified with an
option. It defaults to a loop decrement value of 1, but it can be any unsigned
integer value provided by this option. The '``llvm.loop.decrement.*``'
intrinsics decrement the loop iteration counter by this value, and return
false if the loop should exit and true otherwise. These intrinsics are emitted
if the loop counter is not updated via a ``PHI`` node, which can also be
controlled with an option.
19289
19290Arguments:
19291""""""""""
19292
19293The integer argument is the loop decrement value used to decrement the loop
19294iteration counter.
19295
19296Semantics:
19297""""""""""
19298
The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
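
As an illustration, the sketch below assumes the counter was initialized with
the '``llvm.set.loop.iterations.*``' intrinsic (documented elsewhere in this
reference), that ``%n`` is at least 1, and that the decrement value is 1. The
function name ``@run_n_times`` is hypothetical and the loop body is elided:

.. code-block:: llvm

    declare void @llvm.set.loop.iterations.i32(i32)
    declare i1 @llvm.loop.decrement.i32(i32)

    define void @run_n_times(i32 %n) {
    entry:
      call void @llvm.set.loop.iterations.i32(i32 %n)
      br label %loop

    loop:
      ; ... loop body ...
      ; The counter is not an SSA value here, so no PHI node is needed.
      %continue = call i1 @llvm.loop.decrement.i32(i32 1)
      br i1 %continue, label %loop, label %exit

    exit:
      ret void
    }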
19303
19304
19305Vector Reduction Intrinsics
19306---------------------------
19307
19308Horizontal reductions of vectors can be expressed using the following
19309intrinsics. Each one takes a vector operand as an input and applies its
19310respective operation across all elements of the vector, returning a single
19311scalar result of the same element type.
19312
19313.. _int_vector_reduce_add:
19314
19315'``llvm.vector.reduce.add.*``' Intrinsic
19316^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19317
19318Syntax:
19319"""""""
19320
19321::
19322
19323      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
19324      declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
19325
19326Overview:
19327"""""""""
19328
19329The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
19330reduction of a vector, returning the result as a scalar. The return type matches
19331the element-type of the vector input.
19332
19333Arguments:
19334""""""""""
19335The argument to this intrinsic must be a vector of integer values.
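
For illustration, a small sketch that reduces a constant vector. The function
name ``@sum4`` is hypothetical:

.. code-block:: llvm

    declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)

    define i32 @sum4() {
      ; 1 + 2 + 3 + 4
      %r = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>)
      ret i32 %r   ; 10
    }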
19336
19337.. _int_vector_reduce_fadd:
19338
19339'``llvm.vector.reduce.fadd.*``' Intrinsic
19340^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19341
19342Syntax:
19343"""""""
19344
19345::
19346
19347      declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
19348      declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
19349
19350Overview:
19351"""""""""
19352
19353The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
19354``ADD`` reduction of a vector, returning the result as a scalar. The return type
19355matches the element-type of the vector input.
19356
19357If the intrinsic call has the 'reassoc' flag set, then the reduction will not
19358preserve the associativity of an equivalent scalarized counterpart. Otherwise
19359the reduction will be *sequential*, thus implying that the operation respects
19360the associativity of a scalarized reduction. That is, the reduction begins with
19361the start value and performs an fadd operation with consecutively increasing
19362vector element indices. See the following pseudocode:
19363
19364::
19365
    float sequential_fadd(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result + input_vector[i]
      return result
19371
19372
19373Arguments:
19374""""""""""
19375The first argument to this intrinsic is a scalar start value for the reduction.
19376The type of the start value matches the element-type of the vector input.
19377The second argument must be a vector of floating-point values.
19378
To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating-point addition.
19381
19382Examples:
19383"""""""""
19384
19385::
19386
19387      %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
19388      %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
19389
19390
19391.. _int_vector_reduce_mul:
19392
19393'``llvm.vector.reduce.mul.*``' Intrinsic
19394^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19395
19396Syntax:
19397"""""""
19398
19399::
19400
19401      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
19402      declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
19403
19404Overview:
19405"""""""""
19406
19407The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
19408reduction of a vector, returning the result as a scalar. The return type matches
19409the element-type of the vector input.
19410
19411Arguments:
19412""""""""""
19413The argument to this intrinsic must be a vector of integer values.
19414
19415.. _int_vector_reduce_fmul:
19416
19417'``llvm.vector.reduce.fmul.*``' Intrinsic
19418^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19419
19420Syntax:
19421"""""""
19422
19423::
19424
19425      declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
19426      declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
19427
19428Overview:
19429"""""""""
19430
19431The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
19432``MUL`` reduction of a vector, returning the result as a scalar. The return type
19433matches the element-type of the vector input.
19434
19435If the intrinsic call has the 'reassoc' flag set, then the reduction will not
19436preserve the associativity of an equivalent scalarized counterpart. Otherwise
19437the reduction will be *sequential*, thus implying that the operation respects
19438the associativity of a scalarized reduction. That is, the reduction begins with
19439the start value and performs an fmul operation with consecutively increasing
19440vector element indices. See the following pseudocode:
19441
19442::
19443
    float sequential_fmul(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result * input_vector[i]
      return result
19449
19450
19451Arguments:
19452""""""""""
19453The first argument to this intrinsic is a scalar start value for the reduction.
19454The type of the start value matches the element-type of the vector input.
19455The second argument must be a vector of floating-point values.
19456
To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating-point multiplication.
19459
19460Examples:
19461"""""""""
19462
19463::
19464
19465      %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
19466      %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
19467
19468.. _int_vector_reduce_and:
19469
19470'``llvm.vector.reduce.and.*``' Intrinsic
19471^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19472
19473Syntax:
19474"""""""
19475
19476::
19477
19478      declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
19479
19480Overview:
19481"""""""""
19482
19483The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
19484reduction of a vector, returning the result as a scalar. The return type matches
19485the element-type of the vector input.
19486
19487Arguments:
19488""""""""""
19489The argument to this intrinsic must be a vector of integer values.
19490
19491.. _int_vector_reduce_or:
19492
19493'``llvm.vector.reduce.or.*``' Intrinsic
19494^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19495
19496Syntax:
19497"""""""
19498
19499::
19500
19501      declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
19502
19503Overview:
19504"""""""""
19505
19506The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
19507of a vector, returning the result as a scalar. The return type matches the
19508element-type of the vector input.
19509
19510Arguments:
19511""""""""""
19512The argument to this intrinsic must be a vector of integer values.
19513
19514.. _int_vector_reduce_xor:
19515
19516'``llvm.vector.reduce.xor.*``' Intrinsic
19517^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19518
19519Syntax:
19520"""""""
19521
19522::
19523
19524      declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
19525
19526Overview:
19527"""""""""
19528
19529The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
19530reduction of a vector, returning the result as a scalar. The return type matches
19531the element-type of the vector input.
19532
19533Arguments:
19534""""""""""
19535The argument to this intrinsic must be a vector of integer values.
19536
19537.. _int_vector_reduce_smax:
19538
19539'``llvm.vector.reduce.smax.*``' Intrinsic
19540^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19541
19542Syntax:
19543"""""""
19544
19545::
19546
19547      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
19548
19549Overview:
19550"""""""""
19551
19552The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
19553``MAX`` reduction of a vector, returning the result as a scalar. The return type
19554matches the element-type of the vector input.
19555
19556Arguments:
19557""""""""""
19558The argument to this intrinsic must be a vector of integer values.
19559
19560.. _int_vector_reduce_smin:
19561
19562'``llvm.vector.reduce.smin.*``' Intrinsic
19563^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19564
19565Syntax:
19566"""""""
19567
19568::
19569
19570      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
19571
19572Overview:
19573"""""""""
19574
19575The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
19576``MIN`` reduction of a vector, returning the result as a scalar. The return type
19577matches the element-type of the vector input.
19578
19579Arguments:
19580""""""""""
19581The argument to this intrinsic must be a vector of integer values.
19582
19583.. _int_vector_reduce_umax:
19584
19585'``llvm.vector.reduce.umax.*``' Intrinsic
19586^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19587
19588Syntax:
19589"""""""
19590
19591::
19592
19593      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
19594
19595Overview:
19596"""""""""
19597
19598The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
19599integer ``MAX`` reduction of a vector, returning the result as a scalar. The
19600return type matches the element-type of the vector input.
19601
19602Arguments:
19603""""""""""
19604The argument to this intrinsic must be a vector of integer values.
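
The difference from the signed reduction can be seen by applying both to the
same input. This is a sketch only; the function names are hypothetical:

.. code-block:: llvm

    declare i8 @llvm.vector.reduce.smax.v4i8(<4 x i8>)
    declare i8 @llvm.vector.reduce.umax.v4i8(<4 x i8>)

    define i8 @signed_max() {
      %r = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> <i8 -1, i8 0, i8 1, i8 2>)
      ret i8 %r   ; 2
    }

    define i8 @unsigned_max() {
      ; The bit pattern of -1 is 0xFF, the largest unsigned i8 value.
      %r = call i8 @llvm.vector.reduce.umax.v4i8(<4 x i8> <i8 -1, i8 0, i8 1, i8 2>)
      ret i8 %r   ; -1, i.e. 255 when read as unsigned
    }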
19605
19606.. _int_vector_reduce_umin:
19607
19608'``llvm.vector.reduce.umin.*``' Intrinsic
19609^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19610
19611Syntax:
19612"""""""
19613
19614::
19615
19616      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
19617
19618Overview:
19619"""""""""
19620
19621The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
19622integer ``MIN`` reduction of a vector, returning the result as a scalar. The
19623return type matches the element-type of the vector input.
19624
19625Arguments:
19626""""""""""
19627The argument to this intrinsic must be a vector of integer values.
19628
19629.. _int_vector_reduce_fmax:
19630
19631'``llvm.vector.reduce.fmax.*``' Intrinsic
19632^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19633
19634Syntax:
19635"""""""
19636
19637::
19638
19639      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
19640      declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
19641
19642Overview:
19643"""""""""
19644
19645The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
19646``MAX`` reduction of a vector, returning the result as a scalar. The return type
19647matches the element-type of the vector input.
19648
This intrinsic has the same comparison semantics as the '``llvm.maxnum.*``'
19650intrinsic. That is, the result will always be a number unless all elements of
19651the vector are NaN. For a vector with maximum element magnitude 0.0 and
19652containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
19653
19654If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
19655assume that NaNs are not present in the input vector.
19656
19657Arguments:
19658""""""""""
19659The argument to this intrinsic must be a vector of floating-point values.
19660
19661.. _int_vector_reduce_fmin:
19662
19663'``llvm.vector.reduce.fmin.*``' Intrinsic
19664^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19665
19666Syntax:
19667"""""""
19668This is an overloaded intrinsic.
19669
19670::
19671
19672      declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
19673      declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
19674
19675Overview:
19676"""""""""
19677
19678The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
19679``MIN`` reduction of a vector, returning the result as a scalar. The return type
19680matches the element-type of the vector input.
19681
This intrinsic has the same comparison semantics as the '``llvm.minnum.*``'
19683intrinsic. That is, the result will always be a number unless all elements of
19684the vector are NaN. For a vector with minimum element magnitude 0.0 and
19685containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
19686
19687If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
19688assume that NaNs are not present in the input vector.
19689
19690Arguments:
19691""""""""""
19692The argument to this intrinsic must be a vector of floating-point values.
19693
19694.. _int_vector_reduce_fmaximum:
19695
19696'``llvm.vector.reduce.fmaximum.*``' Intrinsic
19697^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19698
19699Syntax:
19700"""""""
19701This is an overloaded intrinsic.
19702
19703::
19704
19705      declare float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %a)
19706      declare double @llvm.vector.reduce.fmaximum.v2f64(<2 x double> %a)
19707
19708Overview:
19709"""""""""
19710
19711The '``llvm.vector.reduce.fmaximum.*``' intrinsics do a floating-point
19712``MAX`` reduction of a vector, returning the result as a scalar. The return type
19713matches the element-type of the vector input.
19714
This intrinsic has the same comparison semantics as the '``llvm.maximum.*``'
19716intrinsic. That is, this intrinsic propagates NaNs and +0.0 is considered
19717greater than -0.0. If any element of the vector is a NaN, the result is NaN.
19718
19719Arguments:
19720""""""""""
19721The argument to this intrinsic must be a vector of floating-point values.
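
The sketch below contrasts this intrinsic with '``llvm.vector.reduce.fmax.*``'
on an input that contains one quiet NaN (written ``0x7FF8000000000000``). The
function names are hypothetical:

.. code-block:: llvm

    declare float @llvm.vector.reduce.fmax.v4f32(<4 x float>)
    declare float @llvm.vector.reduce.fmaximum.v4f32(<4 x float>)

    define float @max_ignoring_nan() {
      ; Not every element is NaN, so the result is a number.
      %r = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>)
      ret float %r   ; 3.0
    }

    define float @max_propagating_nan() {
      ; A single NaN element makes the whole result NaN.
      %r = call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>)
      ret float %r   ; NaN
    }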
19722
19723.. _int_vector_reduce_fminimum:
19724
19725'``llvm.vector.reduce.fminimum.*``' Intrinsic
19726^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19727
19728Syntax:
19729"""""""
19730This is an overloaded intrinsic.
19731
19732::
19733
19734      declare float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %a)
19735      declare double @llvm.vector.reduce.fminimum.v2f64(<2 x double> %a)
19736
19737Overview:
19738"""""""""
19739
19740The '``llvm.vector.reduce.fminimum.*``' intrinsics do a floating-point
19741``MIN`` reduction of a vector, returning the result as a scalar. The return type
19742matches the element-type of the vector input.
19743
This intrinsic has the same comparison semantics as the '``llvm.minimum.*``'
19745intrinsic. That is, this intrinsic propagates NaNs and -0.0 is considered less
19746than +0.0. If any element of the vector is a NaN, the result is NaN.
19747
19748Arguments:
19749""""""""""
19750The argument to this intrinsic must be a vector of floating-point values.
19751
19752'``llvm.vector.insert``' Intrinsic
19753^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19754
19755Syntax:
19756"""""""
19757This is an overloaded intrinsic.
19758
19759::
19760
19761      ; Insert fixed type into scalable type
19762      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 <idx>)
19763      declare <vscale x 2 x double> @llvm.vector.insert.nxv2f64.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 <idx>)
19764
19765      ; Insert scalable type into scalable type
      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.nxv2f32(<vscale x 4 x float> %vec, <vscale x 2 x float> %subvec, i64 <idx>)
19767
19768      ; Insert fixed type into fixed type
19769      declare <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %vec, <2 x double> %subvec, i64 <idx>)
19770
19771Overview:
19772"""""""""
19773
19774The '``llvm.vector.insert.*``' intrinsics insert a vector into another vector
19775starting from a given index. The return type matches the type of the vector we
insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors; however, this intrinsic can also be used on purely fixed
types.
19779
19780Scalable vectors can only be inserted into other scalable vectors.
19781
19782Arguments:
19783""""""""""
19784
19785The ``vec`` is the vector which ``subvec`` will be inserted into.
19786The ``subvec`` is the vector that will be inserted.
19787
19788``idx`` represents the starting element number at which ``subvec`` will be
19789inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
19790vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
19791the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
19792``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
19793num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
19794cannot be determined statically but is false at runtime, then the result vector
19795is a :ref:`poison value <poisonvalues>`.
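
As a small fixed-width illustration, the sketch below overwrites the upper
half of a four-element vector. The function name ``@insert_high_half`` is
hypothetical; ``idx`` is 2, a multiple of the subvector length:

.. code-block:: llvm

    declare <4 x i32> @llvm.vector.insert.v4i32.v2i32(<4 x i32>, <2 x i32>, i64)

    define <4 x i32> @insert_high_half() {
      ; Elements 2 and 3 of %vec are replaced by the two elements of %subvec.
      %r = call <4 x i32> @llvm.vector.insert.v4i32.v2i32(<4 x i32> <i32 0, i32 1, i32 2, i32 3>, <2 x i32> <i32 8, i32 9>, i64 2)
      ret <4 x i32> %r   ; <i32 0, i32 1, i32 8, i32 9>
    }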
19796
19797
19798'``llvm.vector.extract``' Intrinsic
19799^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19800
19801Syntax:
19802"""""""
19803This is an overloaded intrinsic.
19804
19805::
19806
19807      ; Extract fixed type from scalable type
19808      declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
19809      declare <2 x double> @llvm.vector.extract.v2f64.nxv2f64(<vscale x 2 x double> %vec, i64 <idx>)
19810
19811      ; Extract scalable type from scalable type
19812      declare <vscale x 2 x float> @llvm.vector.extract.nxv2f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
19813
19814      ; Extract fixed type from fixed type
19815      declare <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double> %vec, i64 <idx>)
19816
19817Overview:
19818"""""""""
19819
19820The '``llvm.vector.extract.*``' intrinsics extract a vector from within another
19821vector starting from a given index. The return type must be explicitly
specified. Conceptually, this can be used to decompose a scalable vector into
non-scalable parts; however, this intrinsic can also be used on purely fixed
types.
19825
19826Scalable vectors can only be extracted from other scalable vectors.
19827
19828Arguments:
19829""""""""""
19830
19831The ``vec`` is the vector from which we will extract a subvector.
19832
19833The ``idx`` specifies the starting element number within ``vec`` from which a
19834subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
19835vector length of the result type. If the result type is a scalable vector,
19836``idx`` is first scaled by the result type's runtime scaling factor. Elements
19837``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
19838indices. If this condition cannot be determined statically but is false at
19839runtime, then the result vector is a :ref:`poison value <poisonvalues>`. The
19840``idx`` parameter must be a vector index constant type (for most targets this
19841will be an integer pointer type).
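
As a small fixed-width illustration, the sketch below extracts the upper half
of a four-element vector. The function name ``@extract_high_half`` is
hypothetical; ``idx`` is 2, a multiple of the result vector length:

.. code-block:: llvm

    declare <2 x i32> @llvm.vector.extract.v2i32.v4i32(<4 x i32>, i64)

    define <2 x i32> @extract_high_half() {
      ; Elements 2 and 3 of %vec form the result.
      %r = call <2 x i32> @llvm.vector.extract.v2i32.v4i32(<4 x i32> <i32 0, i32 1, i32 2, i32 3>, i64 2)
      ret <2 x i32> %r   ; <i32 2, i32 3>
    }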
19842
19843'``llvm.vector.reverse``' Intrinsic
19844^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19845
19846Syntax:
19847"""""""
19848This is an overloaded intrinsic.
19849
19850::
19851
19852      declare <2 x i8> @llvm.vector.reverse.v2i8(<2 x i8> %a)
19853      declare <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
19854
19855Overview:
19856"""""""""
19857
19858The '``llvm.vector.reverse.*``' intrinsics reverse a vector.
19859The intrinsic takes a single vector and returns a vector of matching type but
19860with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic supports all vector types,
19862the recommended way to express this operation for fixed-width vectors is
19863still to use a shufflevector, as that may allow for more optimization
19864opportunities.
19865
19866Arguments:
19867""""""""""
19868
19869The argument to this intrinsic must be a vector.
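
The following sketch shows the intrinsic next to the ``shufflevector`` form
recommended for fixed-width vectors. Both functions (whose names are
hypothetical) return the lanes of ``%v`` in reverse order:

.. code-block:: llvm

    declare <4 x i32> @llvm.vector.reverse.v4i32(<4 x i32>)

    define <4 x i32> @reverse_intrinsic(<4 x i32> %v) {
      %rev = call <4 x i32> @llvm.vector.reverse.v4i32(<4 x i32> %v)
      ret <4 x i32> %rev
    }

    define <4 x i32> @reverse_shuffle(<4 x i32> %v) {
      ; The mask selects lanes 3, 2, 1, 0 of the first operand.
      %rev = shufflevector <4 x i32> %v, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
      ret <4 x i32> %rev
    }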
19870
19871'``llvm.vector.deinterleave2``' Intrinsic
19872^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19873
19874Syntax:
19875"""""""
19876This is an overloaded intrinsic.
19877
19878::
19879
19880      declare {<2 x double>, <2 x double>} @llvm.vector.deinterleave2.v4f64(<4 x double> %vec1)
19881      declare {<vscale x 4 x i32>, <vscale x 4 x i32>}  @llvm.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> %vec1)
19882
19883Overview:
19884"""""""""
19885
19886The '``llvm.vector.deinterleave2``' intrinsic constructs two
19887vectors by deinterleaving the even and odd lanes of the input vector.
19888
19889This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
19891fixed-width vectors is still to use a shufflevector, as that may allow for more
19892optimization opportunities.
19893
19894For example:
19895
19896.. code-block:: text
19897
19898  {<2 x i64>, <2 x i64>} llvm.vector.deinterleave2.v4i64(<4 x i64> <i64 0, i64 1, i64 2, i64 3>); ==> {<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>}
19899
19900Arguments:
19901""""""""""
19902
19903The argument is a vector whose type corresponds to the logical concatenation of
19904the two result types.
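
The sketch below shows how the two results might be unpacked with
``extractvalue``, together with the ``shufflevector`` form that yields the
same even lanes for a fixed-width input. The function names are hypothetical:

.. code-block:: llvm

    declare {<2 x i64>, <2 x i64>} @llvm.vector.deinterleave2.v4i64(<4 x i64>)

    define <2 x i64> @even_lanes(<4 x i64> %vec) {
      %res = call {<2 x i64>, <2 x i64>} @llvm.vector.deinterleave2.v4i64(<4 x i64> %vec)
      ; Field 0 holds the even lanes, field 1 the odd lanes.
      %even = extractvalue {<2 x i64>, <2 x i64>} %res, 0
      ret <2 x i64> %even
    }

    define <2 x i64> @even_lanes_shuffle(<4 x i64> %vec) {
      %even = shufflevector <4 x i64> %vec, <4 x i64> poison, <2 x i32> <i32 0, i32 2>
      ret <2 x i64> %even
    }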
19905
19906'``llvm.vector.interleave2``' Intrinsic
19907^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19908
19909Syntax:
19910"""""""
19911This is an overloaded intrinsic.
19912
19913::
19914
19915      declare <4 x double> @llvm.vector.interleave2.v4f64(<2 x double> %vec1, <2 x double> %vec2)
19916      declare <vscale x 8 x i32> @llvm.vector.interleave2.nxv8i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2)
19917
19918Overview:
19919"""""""""
19920
19921The '``llvm.vector.interleave2``' intrinsic constructs a vector
19922by interleaving two input vectors.
19923
19924This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
19926fixed-width vectors is still to use a shufflevector, as that may allow for more
19927optimization opportunities.
19928
19929For example:
19930
19931.. code-block:: text
19932
19933   <4 x i64> llvm.vector.interleave2.v4i64(<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>); ==> <4 x i64> <i64 0, i64 1, i64 2, i64 3>
19934
19935Arguments:
19936""""""""""
19937Both arguments must be vectors of the same type whereby their logical
19938concatenation matches the result type.
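
For fixed-width vectors, the same result can be written as a
``shufflevector`` whose mask alternates between the two operands. A sketch,
with hypothetical function names:

.. code-block:: llvm

    declare <4 x i64> @llvm.vector.interleave2.v4i64(<2 x i64>, <2 x i64>)

    define <4 x i64> @interleave_intrinsic(<2 x i64> %a, <2 x i64> %b) {
      %r = call <4 x i64> @llvm.vector.interleave2.v4i64(<2 x i64> %a, <2 x i64> %b)
      ret <4 x i64> %r
    }

    define <4 x i64> @interleave_shuffle(<2 x i64> %a, <2 x i64> %b) {
      ; Mask indices 0-1 select from %a and 2-3 from %b: a0, b0, a1, b1.
      %r = shufflevector <2 x i64> %a, <2 x i64> %b, <4 x i32> <i32 0, i32 2, i32 1, i32 3>
      ret <4 x i64> %r
    }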
19939
19940'``llvm.experimental.cttz.elts``' Intrinsic
19941^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19942
19943Syntax:
19944"""""""
19945
This is an overloaded intrinsic. You can use ``llvm.experimental.cttz.elts``
on any vector of integer elements, both fixed width and scalable.
19948
19949::
19950
19951      declare i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1> <src>, i1 <is_zero_poison>)
19952
19953Overview:
19954"""""""""
19955
19956The '``llvm.experimental.cttz.elts``' intrinsic counts the number of trailing
19957zero elements of a vector.
19958
19959Arguments:
19960""""""""""
19961
19962The first argument is the vector to be counted. This argument must be a vector
19963with integer element type. The return type must also be an integer type which is
19964wide enough to hold the maximum number of elements of the source vector. The
19965behavior of this intrinsic is undefined if the return type is not wide enough
19966for the number of elements in the input vector.
19967
19968The second argument is a constant flag that indicates whether the intrinsic
19969returns a valid result if the first argument is all zero. If the first argument
19970is all zero and the second argument is true, the result is poison.
19971
19972Semantics:
19973""""""""""
19974
The '``llvm.experimental.cttz.elts``' intrinsic counts the trailing (least
significant) zero elements in a vector. If every element of ``src`` is zero
and ``is_zero_poison`` is false, the result is the number of elements in the
input vector.
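
A short sketch of both cases, assuming that the count starts at element 0
(the least significant element). The function names are hypothetical:

.. code-block:: llvm

    declare i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1>, i1)

    define i8 @first_set_lane() {
      ; Elements 0 and 1 are zero and element 2 is the first non-zero one.
      %r = call i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1> <i1 false, i1 false, i1 true, i1 false, i1 true, i1 false, i1 false, i1 false>, i1 false)
      ret i8 %r   ; 2
    }

    define i8 @no_set_lane() {
      ; All elements are zero and is_zero_poison is false.
      %r = call i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1> zeroinitializer, i1 false)
      ret i8 %r   ; 8
    }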
19978
19979'``llvm.vector.splice``' Intrinsic
19980^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19981
19982Syntax:
19983"""""""
19984This is an overloaded intrinsic.
19985
19986::
19987
19988      declare <2 x double> @llvm.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
19989      declare <vscale x 4 x i32> @llvm.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
19990
19991Overview:
19992"""""""""
19993
19994The '``llvm.vector.splice.*``' intrinsics construct a vector by
19995concatenating elements from the first input vector with elements of the second
19996input vector, returning a vector of the same type as the input vectors. The
19997signed immediate, modulo the number of elements in the vector, is the index
19998into the first vector from which to extract the result value. This means
19999conceptually that for a positive immediate, a vector is extracted from
20000``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
20001immediate, it extracts ``-imm`` trailing elements from the first vector, and
20002the remaining elements from ``%vec2``.
20003
20004These intrinsics work for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
20006fixed-width vectors is still to use a shufflevector, as that may allow for more
20007optimization opportunities.
20008
20009For example:
20010
20011.. code-block:: text
20012
20013 llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, 1);  ==> <B, C, D, E> index
20014 llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, -3); ==> <B, C, D, E> trailing elements
20015
20016
20017Arguments:
20018""""""""""
20019
20020The first two operands are vectors with the same type. The start index is imm
20021modulo the runtime number of elements in the source vector. For a fixed-width
20022vector <N x eltty>, imm is a signed integer constant in the range
20023-N <= imm < N. For a scalable vector <vscale x N x eltty>, imm is a signed
20024integer constant in the range -X <= imm < X where X=vscale_range_min * N.
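
The two cases from the example above can be sketched in IR as follows. For
four-element vectors, an immediate of 1 and an immediate of -3 select the same
lanes, which is also what the ``shufflevector`` in the last function produces.
The function names are hypothetical:

.. code-block:: llvm

    declare <4 x i32> @llvm.vector.splice.v4i32(<4 x i32>, <4 x i32>, i32)

    define <4 x i32> @splice_index(<4 x i32> %vec1, <4 x i32> %vec2) {
      ; Start at index 1 of concat(%vec1, %vec2).
      %r = call <4 x i32> @llvm.vector.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1)
      ret <4 x i32> %r
    }

    define <4 x i32> @splice_trailing(<4 x i32> %vec1, <4 x i32> %vec2) {
      ; Take the 3 trailing elements of %vec1, then 1 element of %vec2.
      %r = call <4 x i32> @llvm.vector.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 -3)
      ret <4 x i32> %r
    }

    define <4 x i32> @splice_shuffle(<4 x i32> %vec1, <4 x i32> %vec2) {
      %r = shufflevector <4 x i32> %vec1, <4 x i32> %vec2, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
      ret <4 x i32> %r
    }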
20025
20026'``llvm.stepvector``' Intrinsic
20027^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20028
20029This is an overloaded intrinsic. You can use ``llvm.stepvector``
20030to generate a vector whose lane values comprise the linear sequence
20031<0, 1, 2, ...>. It is primarily intended for scalable vectors.
20032
20033::
20034
20035      declare <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
20036      declare <vscale x 8 x i16> @llvm.stepvector.nxv8i16()
20037
20038The '``llvm.stepvector``' intrinsics are used to create vectors
20039of integers whose elements contain a linear sequence of values starting from 0
20040with a step of 1. This intrinsic can only be used for vectors with integer
20041elements that are at least 8 bits in size. If the sequence value exceeds
20042the allowed limit for the element type then the result for that lane is
20043a poison value.
20044
20045These intrinsics work for both fixed and scalable vectors. While this intrinsic
20046supports all vector types, the recommended way to express this operation for
20047fixed-width vectors is still to generate a constant vector instead.
20048
20049
20050Arguments:
20051""""""""""
20052
20053None.
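
One possible use, sketched below, is to form a vector of per-lane indices by
adding the step vector to a splat of a scalar base. The function name
``@lane_indices`` and the parameter ``%base`` are hypothetical:

.. code-block:: llvm

    declare <vscale x 4 x i32> @llvm.stepvector.nxv4i32()

    define <vscale x 4 x i32> @lane_indices(i32 %base) {
      %step = call <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
      ; Broadcast %base to every lane.
      %base.ins = insertelement <vscale x 4 x i32> poison, i32 %base, i64 0
      %base.splat = shufflevector <vscale x 4 x i32> %base.ins, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
      ; Lane i of the result holds %base + i.
      %indices = add <vscale x 4 x i32> %base.splat, %step
      ret <vscale x 4 x i32> %indices
    }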
20054
20055
20056'``llvm.experimental.get.vector.length``' Intrinsic
20057^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20058
20059Syntax:
20060"""""""
20061This is an overloaded intrinsic.
20062
20063::
20064
20065      declare i32 @llvm.experimental.get.vector.length.i32(i32 %cnt, i32 immarg %vf, i1 immarg %scalable)
20066      declare i32 @llvm.experimental.get.vector.length.i64(i64 %cnt, i32 immarg %vf, i1 immarg %scalable)
20067
20068Overview:
20069"""""""""
20070
The '``llvm.experimental.get.vector.length.*``' intrinsics take a number of
elements to process and return how many of the elements can be processed
with the requested vectorization factor.
20074
20075Arguments:
20076""""""""""
20077
20078The first argument is an unsigned value of any scalar integer type and specifies
20079the total number of elements to be processed. The second argument is an i32
20080immediate for the vectorization factor. The third argument indicates if the
20081vectorization factor should be multiplied by vscale.
20082
20083Semantics:
20084""""""""""
20085
20086Returns a non-negative i32 value (explicit vector length) that is unknown at compile
20087time and depends on the hardware specification.
20088If the result value does not fit in the result type, then the result is
20089a :ref:`poison value <poisonvalues>`.
20090
20091This intrinsic is intended to be used by loop vectorization with VP intrinsics
20092in order to get the number of elements to process on each loop iteration. The
20093result should be used to decrease the count for the next iteration until the
20094count reaches zero.
20095
20096Let ``%max_lanes`` be the number of lanes in the type described by ``%vf`` and
20097``%scalable``, here are the constraints on the returned value:
20098
-  If ``%cnt`` is equal to 0, returns 0.
20100-  The returned value is always less than or equal to ``%max_lanes``.
20101-  The returned value is always greater than or equal to ``ceil(%cnt / ceil(%cnt / %max_lanes))``,
20102   if ``%cnt`` is non-zero.
20103-  The returned values are monotonically non-increasing in each loop iteration. That is,
20104   the returned value of an iteration is at least as large as that of any later
20105   iteration.
20106
20107Note that it has the following implications:
20108
20109-  For a loop that uses this intrinsic, the number of iterations is equal to
20110   ``ceil(%C / %max_lanes)`` where ``%C`` is the initial ``%cnt`` value.
20111-  If ``%cnt`` is non-zero, the return value is non-zero as well.
20112-  If ``%cnt`` is less than or equal to ``%max_lanes``, the return value is equal to ``%cnt``.
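
The intended usage pattern can be sketched as a strip-mined loop. This is an
illustration under assumptions, not a prescribed form: the function name
``@strip_mine`` is hypothetical, the vectorization factor is a fixed 4
(``%scalable`` is false), and the VP operations that would consume ``%evl``
are elided:

.. code-block:: llvm

    declare i32 @llvm.experimental.get.vector.length.i64(i64, i32 immarg, i1 immarg)

    define void @strip_mine(i64 %n) {
    entry:
      br label %loop

    loop:
      %remaining = phi i64 [ %n, %entry ], [ %next, %loop ]
      ; Number of elements to process in this iteration (at most 4).
      %evl = call i32 @llvm.experimental.get.vector.length.i64(i64 %remaining, i32 4, i1 false)
      ; ... VP operations using %evl as the explicit vector length ...
      %evl.zext = zext i32 %evl to i64
      %next = sub i64 %remaining, %evl.zext
      %done = icmp eq i64 %next, 0
      br i1 %done, label %exit, label %loop

    exit:
      ret void
    }

As a worked instance of the constraints, take ``%n`` equal to 10 with
``%max_lanes`` equal to 4. The first call must return at least
``ceil(10 / ceil(10 / 4)) = ceil(10 / 3) = 4`` and at most 4, so it returns 4.
With 6 elements left, the lower bound is ``ceil(6 / 2) = 3``, so the second
call returns 3 or 4. The third call then sees 3 or 2 elements, which is no more
than ``%max_lanes``, and returns exactly that count. The loop runs
``ceil(10 / 4) = 3`` times in either case.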
20113
20114'``llvm.experimental.vector.partial.reduce.add.*``' Intrinsic
20115^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20116
20117Syntax:
20118"""""""
20119This is an overloaded intrinsic.
20120
20121::
20122
20123      declare <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v8i32(<4 x i32> %a, <8 x i32> %b)
20124      declare <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v16i32(<4 x i32> %a, <16 x i32> %b)
20125      declare <vscale x 4 x i32> @llvm.experimental.vector.partial.reduce.add.nxv4i32.nxv4i32.nxv8i32(<vscale x 4 x i32> %a, <vscale x 8 x i32> %b)
20126      declare <vscale x 4 x i32> @llvm.experimental.vector.partial.reduce.add.nxv4i32.nxv4i32.nxv16i32(<vscale x 4 x i32> %a, <vscale x 16 x i32> %b)
20127
20128Overview:
20129"""""""""
20130
The '``llvm.experimental.vector.partial.reduce.add.*``' intrinsics reduce the
concatenation of the two vector operands down to the number of elements dictated
by the result type. The result type is a vector type that matches the type of the
first operand vector.
20135
20136Arguments:
20137""""""""""
20138
Both arguments must be vectors of matching element types. The first argument type must
match the result type, while the second argument type must have a vector length that is a
positive integer multiple of the vector length of the first argument/result type. The
arguments must be either both fixed or both scalable vectors.
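
The sketch below shows a call next to one expansion that would be consistent
with the overview above. The text above does not specify which input lanes are
combined into which result lane, so the pairing used in
``@partial_reduce_expanded`` is an assumption made for illustration; only the
sum over all result lanes is the same in both functions. The function names
are hypothetical:

.. code-block:: llvm

    declare <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v8i32(<4 x i32>, <8 x i32>)
    declare <4 x i32> @llvm.vector.extract.v4i32.v8i32(<8 x i32>, i64)

    define <4 x i32> @partial_reduce(<4 x i32> %acc, <8 x i32> %in) {
      %r = call <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v8i32(<4 x i32> %acc, <8 x i32> %in)
      ret <4 x i32> %r
    }

    define <4 x i32> @partial_reduce_expanded(<4 x i32> %acc, <8 x i32> %in) {
      ; Split the wide input into two halves and add them lane by lane.
      %lo = call <4 x i32> @llvm.vector.extract.v4i32.v8i32(<8 x i32> %in, i64 0)
      %hi = call <4 x i32> @llvm.vector.extract.v4i32.v8i32(<8 x i32> %in, i64 4)
      %sum = add <4 x i32> %lo, %hi
      %r = add <4 x i32> %acc, %sum
      ret <4 x i32> %r
    }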
20143
20144
20145'``llvm.experimental.vector.histogram.*``' Intrinsic
20146^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20147
20148These intrinsics are overloaded.
20149
20150These intrinsics represent histogram-like operations; that is, updating values
20151in memory that may not be contiguous, and where multiple elements within a
20152single vector may be updating the same value in memory.
20153
The update operation must be specified as part of the intrinsic name. For a
simple histogram like the following, the ``add`` operation would be used.
20156
20157.. code-block:: c
20158
20159    void simple_histogram(int *restrict buckets, unsigned *indices, int N, int inc) {
20160      for (int i = 0; i < N; ++i)
20161        buckets[indices[i]] += inc;
20162    }
20163
20164More update operation types may be added in the future.
20165
20166::
20167
20168    declare void @llvm.experimental.vector.histogram.add.v8p0.i32(<8 x ptr> %ptrs, i32 %inc, <8 x i1> %mask)
20169    declare void @llvm.experimental.vector.histogram.add.nxv2p0.i64(<vscale x 2 x ptr> %ptrs, i64 %inc, <vscale x 2 x i1> %mask)
20170
20171Arguments:
20172""""""""""
20173
20174The first argument is a vector of pointers to the memory locations to be
20175updated. The second argument is a scalar used to update the value from
20176memory; it must match the type of value to be updated. The final argument
20177is a mask value to exclude locations from being modified.
20178
20179Semantics:
20180""""""""""
20181
20182The '``llvm.experimental.vector.histogram.*``' intrinsics are used to perform
20183updates on potentially overlapping values in memory. The intrinsics represent
the following sequence of operations:
20185
201861. Gather load from the ``ptrs`` operand, with element type matching that of
20187   the ``inc`` operand.
201882. Update of the values loaded from memory. In the case of the ``add``
20189   update operation, this means:
20190
20191   1. Perform a cross-vector histogram operation on the ``ptrs`` operand.
20192   2. Multiply the result by the ``inc`` operand.
   3. Add the result to the values loaded from memory.
201943. Scatter the result of the update operation to the memory locations from
20195   the ``ptrs`` operand.
20196
20197The ``mask`` operand will apply to at least the gather and scatter operations.
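
The sequence above can be sketched as a scalar model. This is an illustration
under stated assumptions: memory is modelled as a ``std::vector<int>``, the
pointer vector as indices into it, the helper name ``histogram_add`` is
hypothetical, and the mask is assumed to exclude inactive lanes from the
histogram count as well as from the gather and scatter.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Scalar model of the `add` update. `idx` stands in for the `ptrs` operand.
    void histogram_add(std::vector<int>& mem, const std::vector<std::size_t>& idx,
                       int inc, const std::vector<bool>& mask) {
      assert(idx.size() == mask.size());
      const std::size_t n = idx.size();
      std::vector<int> updated(n, 0);
      for (std::size_t i = 0; i < n; ++i) {
        if (!mask[i]) continue;
        int hits = 0;  // active lanes that address the same location as lane i
        for (std::size_t j = 0; j < n; ++j)
          hits += (mask[j] && idx[j] == idx[i]) ? 1 : 0;
        updated[i] = mem[idx[i]] + hits * inc;  // gather, multiply, add
      }
      for (std::size_t i = 0; i < n; ++i)
        if (mask[i]) mem[idx[i]] = updated[i];  // scatter
    }

Lanes that address the same location all compute the same updated value, so
the order of the scatter does not matter in this model.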
20198
20199'``llvm.experimental.vector.extract.last.active``' Intrinsic
20200^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20201
20202This is an overloaded intrinsic.
20203
20204::
20205
20206    declare i32 @llvm.experimental.vector.extract.last.active.v4i32(<4 x i32> %data, <4 x i1> %mask, i32 %passthru)
20207    declare i16 @llvm.experimental.vector.extract.last.active.nxv8i16(<vscale x 8 x i16> %data, <vscale x 8 x i1> %mask, i16 %passthru)
20208
20209Arguments:
20210""""""""""
20211
20212The first argument is the data vector to extract a lane from. The second is a
20213mask vector controlling the extraction. The third argument is a passthru
20214value.
20215
20216The two input vectors must have the same number of elements, and the type of
20217the passthru value must match that of the elements of the data vector.
20218
20219Semantics:
20220""""""""""
20221
20222The '``llvm.experimental.vector.extract.last.active``' intrinsic will extract an
20223element from the data vector at the index matching the highest active lane of
20224the mask vector. If no mask lanes are active then the passthru value is
20225returned instead.
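
A minimal scalar sketch of this behaviour, with a hypothetical helper name and
``int`` elements chosen for illustration:

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Returns the element at the highest active lane, or `passthru` if no
    // lane of the mask is active.
    int extract_last_active(const std::vector<int>& data,
                            const std::vector<bool>& mask, int passthru) {
      assert(data.size() == mask.size());
      for (std::size_t i = data.size(); i-- > 0;)
        if (mask[i]) return data[i];
      return passthru;
    }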
20226
20227.. _int_vector_compress:
20228
20229'``llvm.experimental.vector.compress.*``' Intrinsics
20230^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20231
20232LLVM provides an intrinsic for compressing data within a vector based on a selection mask.
20233Semantically, this is similar to :ref:`llvm.masked.compressstore <int_compressstore>` but with weaker assumptions
20234and without storing the results to memory, i.e., the data remains in the vector.
20235
20236Syntax:
20237"""""""
20238This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected
20239from an input vector and placed adjacently within the result vector. A mask defines which elements to collect from the vector.
20240The remaining lanes are filled with values from ``passthru``.
20241
20242.. code-block:: llvm
20243
20244      declare <8 x i32> @llvm.experimental.vector.compress.v8i32(<8 x i32> <value>, <8 x i1> <mask>, <8 x i32> <passthru>)
20245      declare <16 x float> @llvm.experimental.vector.compress.v16f32(<16 x float> <value>, <16 x i1> <mask>, <16 x float> undef)
20246
20247Overview:
20248"""""""""
20249
20250Selects elements from input vector ``value`` according to the ``mask``.
20251All selected elements are written into adjacent lanes in the result vector,
20252from lower to higher.
20253The mask holds an entry for each vector lane, and is used to select elements
20254to be kept.
20255If a ``passthru`` vector is given, all remaining lanes are filled with the
20256corresponding lane's value from ``passthru``.
The main difference to :ref:`llvm.masked.compressstore <int_compressstore>` is
that we do not need to guard against memory access for unselected lanes.
This allows for branchless code and better optimization for all targets that
do not support the explicit semantics of
:ref:`llvm.masked.compressstore <int_compressstore>`, or implement them only
with inefficient instructions, but still have some form
of compress operations.
20264The result vector can be written with a similar effect, as all the selected
20265values are at the lower positions of the vector, but without requiring
20266branches to avoid writes where the mask is ``false``.
20267
20268Arguments:
20269""""""""""
20270
20271The first operand is the input vector, from which elements are selected.
20272The second operand is the mask, a vector of boolean values.
20273The third operand is the passthru vector, from which elements are filled
20274into remaining lanes.
20275The mask and the input vector must have the same number of vector elements.
20276The input and passthru vectors must have the same type.
20277
20278Semantics:
20279""""""""""
20280
20281The ``llvm.experimental.vector.compress`` intrinsic compresses data within a vector.
20282It collects elements from possibly non-adjacent lanes of a vector and places
20283them contiguously in the result vector based on a selection mask, filling the
20284remaining lanes with values from ``passthru``.
20285This intrinsic performs the logic of the following C++ example.
20286All values in ``out`` after the last selected one are undefined if
20287``passthru`` is undefined.
20288If all entries in the ``mask`` are 0, the ``out`` vector is ``passthru``.
20289If any element of the mask is poison, all elements of the result are poison.
20290Otherwise, if any element of the mask is undef, all elements of the result are undef.
20291If ``passthru`` is undefined, the number of valid lanes is equal to the number
20292of ``true`` entries in the mask, i.e., all lanes >= number-of-selected-values
20293are undefined.
20294
20295.. code-block:: cpp
20296
20297    // Consecutively place selected values in a vector.
20298    using VecT __attribute__((vector_size(N))) = int;
20299    VecT compress(VecT vec, VecT mask, VecT passthru) {
20300      VecT out;
20301      int idx = 0;
20302      for (int i = 0; i < N / sizeof(int); ++i) {
20303        out[idx] = vec[i];
20304        idx += static_cast<bool>(mask[i]);
20305      }
20306      for (; idx < N / sizeof(int); ++idx) {
20307        out[idx] = passthru[idx];
20308      }
20309      return out;
20310    }
20311
20312
20313'``llvm.experimental.vector.match.*``' Intrinsic
20314^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20315
20316Syntax:
20317"""""""
20318
20319This is an overloaded intrinsic.
20320
20321::
20322
20323    declare <<n> x i1> @llvm.experimental.vector.match(<<n> x <ty>> %op1, <<m> x <ty>> %op2, <<n> x i1> %mask)
20324    declare <vscale x <n> x i1> @llvm.experimental.vector.match(<vscale x <n> x <ty>> %op1, <<m> x <ty>> %op2, <vscale x <n> x i1> %mask)
20325
20326Overview:
20327"""""""""
20328
20329Find active elements of the first argument matching any elements of the second.
20330
20331Arguments:
20332""""""""""
20333
20334The first argument is the search vector, the second argument the vector of
20335elements we are searching for (i.e. for which we consider a match successful),
20336and the third argument is a mask that controls which elements of the first
20337argument are active. The first two arguments must be vectors of matching
20338integer element types. The first and third arguments and the result type must
20339have matching element counts (fixed or scalable). The second argument must be a
20340fixed vector, but its length may be different from the remaining arguments.
20341
20342Semantics:
20343""""""""""
20344
20345The '``llvm.experimental.vector.match``' intrinsic compares each active element
20346in the first argument against the elements of the second argument, placing
20347``1`` in the corresponding element of the output vector if any equality
20348comparison is successful, and ``0`` otherwise. Inactive elements in the mask
20349are set to ``0`` in the output.
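
The comparison can be modelled in scalar code as follows. This is a sketch
only; the helper name ``vector_match`` and the use of ``int`` elements are
assumptions made for illustration.

.. code-block:: cpp

    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // For each active lane of `op1`, reports whether its value equals any
    // element of `op2`. Inactive lanes are false in the result.
    std::vector<bool> vector_match(const std::vector<int>& op1,
                                   const std::vector<int>& op2,
                                   const std::vector<bool>& mask) {
      assert(op1.size() == mask.size());
      std::vector<bool> result(op1.size(), false);
      for (std::size_t i = 0; i < op1.size(); ++i)
        if (mask[i])
          result[i] = std::find(op2.begin(), op2.end(), op1[i]) != op2.end();
      return result;
    }

Note that ``op2`` may have a different length from ``op1``, matching the
argument rules above.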
20350
20351Matrix Intrinsics
20352-----------------
20353
20354Operations on matrixes requiring shape information (like number of rows/columns
20355or the memory layout) can be expressed using the matrix intrinsics. These
20356intrinsics require matrix dimensions to be passed as immediate arguments, and
20357matrixes are passed and returned as vectors. This means that for a ``R`` x
20358``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
20359corresponding vector, with indices starting at 0. Currently column-major layout
20360is assumed.  The intrinsics support both integer and floating point matrixes.
20361
20362
20363'``llvm.matrix.transpose.*``' Intrinsic
20364^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20365
20366Syntax:
20367"""""""
20368This is an overloaded intrinsic.
20369
20370::
20371
20372      declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
20373
20374Overview:
20375"""""""""
20376
20377The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
20378<Cols>`` matrix and return the transposed matrix in the result vector.
20379
20380Arguments:
20381""""""""""
20382
20383The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
20384<Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
20385number of rows and columns, respectively, and must be positive, constant
20386integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
20387the same float or integer element type as ``%In``.
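
The column-major indexing rule (element ``i`` of column ``j`` is at index
``j * R + i``) makes the transpose straightforward to model. The following is
an illustrative scalar sketch with a hypothetical helper name, not a
description of how the intrinsic is lowered.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Transposes a rows x cols matrix stored in column-major order. The
    // result is a cols x rows matrix, also stored in column-major order.
    std::vector<int> matrix_transpose(const std::vector<int>& in,
                                      std::size_t rows, std::size_t cols) {
      assert(rows > 0 && cols > 0 && in.size() == rows * cols);
      std::vector<int> out(in.size());
      for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
          out[i * cols + j] = in[j * rows + i];  // (i, j) becomes (j, i)
      return out;
    }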
20388
20389'``llvm.matrix.multiply.*``' Intrinsic
20390^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20391
20392Syntax:
20393"""""""
20394This is an overloaded intrinsic.
20395
20396::
20397
20398      declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
20399
20400Overview:
20401"""""""""
20402
The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
20406
20407Arguments:
20408""""""""""
20409
20410The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
20411<Inner>`` elements, and the second argument ``%B`` to a matrix with
20412``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
20413``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
20414returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
20415Vectors ``%A``, ``%B``, and the returned vector all have the same float or
20416integer element type.
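
As an illustration of how the three dimension arguments relate to the flattened
vectors, here is a scalar sketch using the column-major layout described at the
start of this section. The helper name and the ``int`` element type are
assumptions.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Multiplies an outer_rows x inner matrix `a` by an inner x outer_cols
    // matrix `b`. All matrices are flattened in column-major order.
    std::vector<int> matrix_multiply(const std::vector<int>& a,
                                     const std::vector<int>& b,
                                     std::size_t outer_rows, std::size_t inner,
                                     std::size_t outer_cols) {
      assert(a.size() == outer_rows * inner && b.size() == inner * outer_cols);
      std::vector<int> out(outer_rows * outer_cols, 0);
      for (std::size_t c = 0; c < outer_cols; ++c)
        for (std::size_t r = 0; r < outer_rows; ++r)
          for (std::size_t k = 0; k < inner; ++k)
            out[c * outer_rows + r] += a[k * outer_rows + r] * b[c * inner + k];
      return out;
    }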
20417
20418
20419'``llvm.matrix.column.major.load.*``' Intrinsic
20420^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20421
20422Syntax:
20423"""""""
20424This is an overloaded intrinsic.
20425
20426::
20427
20428      declare vectorty @llvm.matrix.column.major.load.*(
20429          ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
20430
20431Overview:
20432"""""""""
20433
20434The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
20435matrix using a stride of ``%Stride`` to compute the start address of the
20436different columns.  The offset is computed using ``%Stride``'s bitwidth. This
20437allows for convenient loading of sub matrixes. If ``<IsVolatile>`` is true, the
20438intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
20439matrix is returned in the result vector. If the ``%Ptr`` argument is known to
20440be aligned to some boundary, this can be specified as an attribute on the
20441argument.
20442
20443Arguments:
20444""""""""""
20445
20446The first argument ``%Ptr`` is a pointer type to the returned vector type, and
20447corresponds to the start address to load from. The second argument ``%Stride``
20448is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value.  The fourth and fifth arguments,
20452``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
20453respectively, and must be positive, constant integers. The returned vector must
20454have ``<Rows> * <Cols>`` elements.
20455
The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
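
The role of the stride when loading a sub-matrix can be sketched with a scalar
model. Assumptions: memory is modelled as a ``std::vector<int>`` whose first
element plays the role of ``%Ptr``, the stride is counted in elements, and the
helper name is hypothetical.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Loads a rows x cols matrix whose columns start `stride` elements apart.
    // The result is densely packed in column-major order.
    std::vector<int> column_major_load(const std::vector<int>& mem,
                                       std::size_t stride, std::size_t rows,
                                       std::size_t cols) {
      assert(rows > 0 && cols > 0 && stride >= rows);
      assert(mem.size() >= (cols - 1) * stride + rows);
      std::vector<int> out(rows * cols);
      for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
          out[c * rows + r] = mem[c * stride + r];
      return out;
    }

Loading a 2 x 2 matrix with a stride of 4 from a buffer that holds a 4 x 3
matrix picks the top-left 2 x 2 sub-matrix.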
20458
20459
20460'``llvm.matrix.column.major.store.*``' Intrinsic
20461^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20462
20463Syntax:
20464"""""""
20465
20466::
20467
20468      declare void @llvm.matrix.column.major.store.*(
20469          vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
20470
20471Overview:
20472"""""""""
20473
20474The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
20475<Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
20476columns. The offset is computed using ``%Stride``'s bitwidth. If
20477``<IsVolatile>`` is true, the intrinsic is considered a
20478:ref:`volatile memory access <volatile>`.
20479
20480If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
20481specified as an attribute on the argument.
20482
20483Arguments:
20484""""""""""
20485
20486The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
20487<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
20488pointer to the vector type of ``%In``, and is the start address of the matrix
20489in memory. The third argument ``%Stride`` is a positive, constant integer with
20490``%Stride >= <Rows>``.  ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
20492with ``%Ptr + C * %Stride``.  The fourth argument ``<IsVolatile>`` is a boolean
20493value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
20494and columns, respectively, and must be positive, constant integers.
20495
20496The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
20498
20499
20500Half Precision Floating-Point Intrinsics
20501----------------------------------------
20502
20503For most target platforms, half precision floating-point is a
20504storage-only format. This means that it is a dense encoding (in memory)
20505but does not support computation in the format.
20506
20507This means that code must first load the half-precision floating-point
20508value as an i16, then convert it to float with
20509:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
20510then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.
20515
20516.. _int_convert_to_fp16:
20517
20518'``llvm.convert.to.fp16``' Intrinsic
20519^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20520
20521Syntax:
20522"""""""
20523
20524::
20525
20526      declare i16 @llvm.convert.to.fp16.f32(float %a)
20527      declare i16 @llvm.convert.to.fp16.f64(double %a)
20528
20529Overview:
20530"""""""""
20531
20532The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
20533conventional floating-point type to half precision floating-point format.
20534
20535Arguments:
20536""""""""""
20537
The intrinsic function takes a single argument: the value to be
converted.
20540
20541Semantics:
20542""""""""""
20543
20544The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
20545conventional floating-point format to half precision floating-point format. The
20546return value is an ``i16`` which contains the converted number.
20547
20548Examples:
20549"""""""""
20550
20551.. code-block:: llvm
20552
20553      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, ptr @x, align 2
20555
20556.. _int_convert_from_fp16:
20557
20558'``llvm.convert.from.fp16``' Intrinsic
20559^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20560
20561Syntax:
20562"""""""
20563
20564::
20565
20566      declare float @llvm.convert.from.fp16.f32(i16 %a)
20567      declare double @llvm.convert.from.fp16.f64(i16 %a)
20568
20569Overview:
20570"""""""""
20571
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format.
20575
20576Arguments:
20577""""""""""
20578
The intrinsic function takes a single argument: the value to be
converted.
20581
20582Semantics:
20583""""""""""
20584
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format. The input half-float value is
represented by an ``i16`` value.
20589
20590Examples:
20591"""""""""
20592
20593.. code-block:: llvm
20594
20595      %a = load i16, ptr @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
20597
20598Saturating floating-point to integer conversions
20599------------------------------------------------
20600
20601The ``fptoui`` and ``fptosi`` instructions return a
20602:ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
20603representable by the result type. These intrinsics provide an alternative
20604conversion, which will saturate towards the smallest and largest representable
20605integer values instead.
20606
20607'``llvm.fptoui.sat.*``' Intrinsic
20608^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20609
20610Syntax:
20611"""""""
20612
20613This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
20614floating-point argument type and any integer result type, or vectors thereof.
20615Not all targets may support all types, however.
20616
20617::
20618
20619      declare i32 @llvm.fptoui.sat.i32.f32(float %f)
20620      declare i19 @llvm.fptoui.sat.i19.f64(double %f)
20621      declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
20622
20623Overview:
20624"""""""""
20625
20626This intrinsic converts the argument into an unsigned integer using saturating
20627semantics.
20628
20629Arguments:
20630""""""""""
20631
20632The argument may be any floating-point or vector of floating-point type. The
20633return value may be any integer or vector of integer type. The number of vector
20634elements in argument and return must be the same.
20635
20636Semantics:
20637""""""""""
20638
20639The conversion to integer is performed subject to the following rules:
20640
20641- If the argument is any NaN, zero is returned.
20642- If the argument is smaller than zero (this includes negative infinity),
20643  zero is returned.
20644- If the argument is larger than the largest representable unsigned integer of
20645  the result type (this includes positive infinity), the largest representable
20646  unsigned integer is returned.
20647- Otherwise, the result of rounding the argument towards zero is returned.
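
These rules can be written out as a scalar model for one concrete pair of
types. The sketch below assumes a ``float`` argument and an 8-bit unsigned
result; the function name is hypothetical.

.. code-block:: cpp

    #include <cmath>
    #include <cstdint>

    // Scalar model of a saturating float -> unsigned 8-bit conversion.
    std::uint8_t fptoui_sat_u8(float f) {
      if (std::isnan(f)) return 0;            // any NaN yields zero
      if (f < 0.0f) return 0;                 // includes negative infinity
      if (f > 255.0f) return 255;             // includes positive infinity
      return static_cast<std::uint8_t>(f);    // in range: round towards zero
    }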
20648
20649Example:
20650""""""""
20651
20652.. code-block:: text
20653
20654      %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.875)            ; yields i8: 123
20655      %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.75)              ; yields i8:   0
20656      %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
20657      %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:   0
20658
20659'``llvm.fptosi.sat.*``' Intrinsic
20660^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20661
20662Syntax:
20663"""""""
20664
20665This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
20666floating-point argument type and any integer result type, or vectors thereof.
20667Not all targets may support all types, however.
20668
20669::
20670
20671      declare i32 @llvm.fptosi.sat.i32.f32(float %f)
20672      declare i19 @llvm.fptosi.sat.i19.f64(double %f)
20673      declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
20674
20675Overview:
20676"""""""""
20677
20678This intrinsic converts the argument into a signed integer using saturating
20679semantics.
20680
20681Arguments:
20682""""""""""
20683
20684The argument may be any floating-point or vector of floating-point type. The
20685return value may be any integer or vector of integer type. The number of vector
20686elements in argument and return must be the same.
20687
20688Semantics:
20689""""""""""
20690
20691The conversion to integer is performed subject to the following rules:
20692
20693- If the argument is any NaN, zero is returned.
20694- If the argument is smaller than the smallest representable signed integer of
20695  the result type (this includes negative infinity), the smallest
20696  representable signed integer is returned.
20697- If the argument is larger than the largest representable signed integer of
20698  the result type (this includes positive infinity), the largest representable
20699  signed integer is returned.
20700- Otherwise, the result of rounding the argument towards zero is returned.
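
The same rules for the signed case, again as a scalar sketch that assumes a
``float`` argument and an 8-bit signed result, with a hypothetical function
name:

.. code-block:: cpp

    #include <cmath>
    #include <cstdint>

    // Scalar model of a saturating float -> signed 8-bit conversion.
    std::int8_t fptosi_sat_i8(float f) {
      if (std::isnan(f)) return 0;            // any NaN yields zero
      if (f < -128.0f) return -128;           // includes negative infinity
      if (f > 127.0f) return 127;             // includes positive infinity
      return static_cast<std::int8_t>(f);     // in range: round towards zero
    }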
20701
20702Example:
20703""""""""
20704
20705.. code-block:: text
20706
20707      %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.875)             ; yields i8:   23
20708      %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.75)            ; yields i8: -128
20709      %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8:  127
20710      %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:    0
20711
20712Convergence Intrinsics
20713----------------------
20714
20715The LLVM convergence intrinsics for controlling the semantics of ``convergent``
20716operations, which all start with the ``llvm.experimental.convergence.``
20717prefix, are described in the :doc:`ConvergentOperations` document.
20718
20719.. _dbg_intrinsics:
20720
20721Debugger Intrinsics
20722-------------------
20723
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
20728
20729Exception Handling Intrinsics
20730-----------------------------
20731
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
20735
20736Pointer Authentication Intrinsics
20737---------------------------------
20738
The LLVM pointer authentication intrinsics (which all start with the
``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
<PointerAuth.html#intrinsics>`_ document.
20742
20743.. _int_trampoline:
20744
20745Trampoline Intrinsics
20746---------------------
20747
20748These intrinsics make it possible to excise one parameter, marked with
20749the :ref:`nest <nest>` attribute, from a function. The result is a
20750callable function pointer lacking the nest parameter - the caller does
20751not need to provide a value for it. Instead, the value to use is stored
20752in advance in a "trampoline", a block of memory usually allocated on the
20753stack, which also contains code to splice the nest value into the
20754argument list. This is used to implement the GCC nested function address
20755extension.
20756
20757For example, if the function is ``i32 f(ptr nest %c, i32 %x, i32 %y)``
20758then the resulting function pointer has signature ``i32 (i32, i32)``.
20759It can be created as follows:
20760
20761.. code-block:: llvm
20762
20763      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
20765      %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)
20766
The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(ptr %nval, i32 %x, i32 %y)``.
20769
20770.. _int_it:
20771
20772'``llvm.init.trampoline``' Intrinsic
20773^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20774
20775Syntax:
20776"""""""
20777
20778::
20779
20780      declare void @llvm.init.trampoline(ptr <tramp>, ptr <func>, ptr <nval>)
20781
20782Overview:
20783"""""""""
20784
20785This fills the memory pointed to by ``tramp`` with executable code,
20786turning it into a trampoline.
20787
20788Arguments:
20789""""""""""
20790
20791The ``llvm.init.trampoline`` intrinsic takes three arguments, all
20792pointers. The ``tramp`` argument must point to a sufficiently large and
20793sufficiently aligned block of memory; this memory is written to by the
20794intrinsic. Note that the size and the alignment are target-specific -
20795LLVM currently provides no portable way of determining them, so a
20796front-end that generates this intrinsic needs to have some
20797target-specific knowledge. The ``func`` argument must hold a function.
20798
20799Semantics:
20800""""""""""
20801
20802The block of memory pointed to by ``tramp`` is filled with target
20803dependent code, turning it into a function. Then ``tramp`` needs to be
20804passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
20805be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
20806function's signature is the same as that of ``func`` with any arguments
20807marked with the ``nest`` attribute removed. At most one such ``nest``
20808argument is allowed, and it must be of pointer type. Calling the new
20809function is equivalent to calling ``func`` with the same argument list,
20810but with ``nval`` used for the missing ``nest`` argument. If, after
20811calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
20812modified, then the effect of any later call to the returned function
20813pointer is undefined.
20814
20815.. _int_at:
20816
20817'``llvm.adjust.trampoline``' Intrinsic
20818^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20819
20820Syntax:
20821"""""""
20822
20823::
20824
20825      declare ptr @llvm.adjust.trampoline(ptr <tramp>)
20826
20827Overview:
20828"""""""""
20829
20830This performs any required machine-specific adjustment to the address of
20831a trampoline (passed as ``tramp``).
20832
20833Arguments:
20834""""""""""
20835
20836``tramp`` must point to a block of memory which already has trampoline
20837code filled in by a previous call to
20838:ref:`llvm.init.trampoline <int_it>`.
20839
20840Semantics:
20841""""""""""
20842
20843On some architectures the address of the code to be executed needs to be
20844different than the address where the trampoline is actually stored. This
20845intrinsic returns the executable address corresponding to ``tramp``
20846after performing the required machine specific adjustments. The pointer
20847returned can then be :ref:`bitcast and executed <int_trampoline>`.
20848
20849
20850.. _int_vp:
20851
20852Vector Predication Intrinsics
20853-----------------------------
20854VP intrinsics are intended for predicated SIMD/vector code.  A typical VP
20855operation takes a vector mask and an explicit vector length parameter as in:
20856
20857::
20858
20859      <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
20860
The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for example
``<32 x i1>``.  The explicit vector length parameter always has the type ``i32`` and
is an unsigned integer value.  The explicit vector length parameter (``%evl``) is in
the range:
20865
20866::
20867
20868      0 <= %evl <= W,  where W is the number of vector elements
20869
20870Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
20871length of the vector.
20872
20873The VP intrinsic has undefined behavior if ``%evl > W``.  The explicit vector
20874length (%evl) creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set
20875to True, and all other lanes ``%evl <= i < W`` to False.  A new mask %M is
20876calculated with an element-wise AND from %mask and %EVLmask:
20877
20878::
20879
20880      M = %mask AND %EVLmask
20881
20882A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
20883
20884::
20885
       A <opcode> B =  {  A[i] <opcode> B[i]   if M[i] = True, and
                       {  undef                otherwise
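
The way the mask and the explicit vector length combine can be sketched in
scalar code. The helper name ``effective_mask`` is hypothetical, and the
assertion stands in for the undefined behavior when the explicit vector length
exceeds the number of elements.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Computes M = mask AND EVLmask, where EVLmask is true for lanes below
    // `evl` and false for all other lanes.
    std::vector<bool> effective_mask(const std::vector<bool>& mask,
                                     std::size_t evl) {
      assert(evl <= mask.size());  // evl > W is undefined behavior
      std::vector<bool> m(mask.size());
      for (std::size_t i = 0; i < mask.size(); ++i)
        m[i] = mask[i] && i < evl;
      return m;
    }

Only lanes where the resulting mask is true produce a defined result.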
20888
20889Optimization Hint
20890^^^^^^^^^^^^^^^^^
20891
20892Some targets, such as AVX512, do not support the %evl parameter in hardware.
20893The use of an effective %evl is discouraged for those targets.  The function
20894``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
20895has native support for %evl.
20896
20897.. _int_vp_select:
20898
20899'``llvm.vp.select.*``' Intrinsics
20900^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20901
20902Syntax:
20903"""""""
20904This is an overloaded intrinsic.
20905
20906::
20907
20908      declare <16 x i32>  @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
20909      declare <vscale x 4 x i64>  @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
20910
20911Overview:
20912"""""""""
20913
20914The '``llvm.vp.select``' intrinsic is used to choose one value based on a
20915condition vector, without IR-level branching.
20916
20917Arguments:
20918""""""""""
20919
20920The first argument is a vector of ``i1`` and indicates the condition.  The
20921second argument is the value that is selected where the condition vector is
20922true.  The third argument is the value that is selected where the condition
20923vector is false.  The vectors must be of the same size.  The fourth argument is
20924the explicit vector length.
20925
20926#. The optional ``fast-math flags`` marker indicates that the select has one or
20927   more :ref:`fast-math flags <fastmath>`. These are optimization hints to
20928   enable otherwise unsafe floating-point optimizations. Fast-math flags are
20929   only valid for selects that return :ref:`supported floating-point types
20930   <fastmath_return_types>`.
20931
20932Semantics:
20933""""""""""
20934
20935The intrinsic selects lanes from the second and third argument depending on a
20936condition vector.
20937
All result lanes at positions greater than or equal to ``%evl`` are undefined.
20939For all lanes below ``%evl`` where the condition vector is true the lane is
20940taken from the second argument.  Otherwise, the lane is taken from the third
20941argument.
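
A scalar sketch of these rules follows. It is illustrative only: the helper
name is hypothetical, ``int`` elements are assumed, and undefined lanes are
modelled as an empty ``std::optional``.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <optional>
    #include <vector>

    // Lanes below `evl` are chosen by `cond`. Lanes at or above `evl` are
    // left empty to model an undefined result.
    std::vector<std::optional<int>> vp_select(const std::vector<bool>& cond,
                                              const std::vector<int>& on_true,
                                              const std::vector<int>& on_false,
                                              std::size_t evl) {
      assert(cond.size() == on_true.size() && cond.size() == on_false.size());
      assert(evl <= cond.size());
      std::vector<std::optional<int>> result(cond.size());
      for (std::size_t i = 0; i < evl; ++i)
        result[i] = cond[i] ? on_true[i] : on_false[i];
      return result;
    }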
20942
20943Example:
20944""""""""
20945
20946.. code-block:: llvm
20947
20948      %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
20949
20950      ;;; Expansion.
20951      ;; Any result is legal on lanes at and above %evl.
20952      %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
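
A concrete instance with constant operands may make the ``%evl`` cutoff easier
to see. This is only a sketch: the function name ``@select_example`` and all
operand values are assumptions chosen for illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.select.v4i32(<4 x i1>, <4 x i32>, <4 x i32>, i32)

      ; Hypothetical wrapper; operand values are illustrative only.
      define <4 x i32> @select_example() {
        %r = call <4 x i32> @llvm.vp.select.v4i32(
                 <4 x i1> <i1 true, i1 false, i1 true, i1 false>,
                 <4 x i32> <i32 1, i32 2, i32 3, i32 4>,
                 <4 x i32> <i32 10, i32 20, i32 30, i32 40>,
                 i32 3)
        ; lane 0: condition true  -> 1  (second argument)
        ; lane 1: condition false -> 20 (third argument)
        ; lane 2: condition true  -> 3  (second argument)
        ; lane 3: at or above %evl, so any value is legal
        ret <4 x i32> %r
      }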
20953
20954
20955.. _int_vp_merge:
20956
20957'``llvm.vp.merge.*``' Intrinsics
20958^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20959
20960Syntax:
20961"""""""
20962This is an overloaded intrinsic.
20963
20964::
20965
20966      declare <16 x i32>  @llvm.vp.merge.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <pivot>)
20967      declare <vscale x 4 x i64>  @llvm.vp.merge.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <pivot>)
20968
20969Overview:
20970"""""""""
20971
20972The '``llvm.vp.merge``' intrinsic is used to choose one value based on a
20973condition vector and an index argument, without IR-level branching.
20974
20975Arguments:
20976""""""""""
20977
The first argument is a vector of ``i1`` and indicates the condition.  The
second argument is the value that is merged where the condition vector is true.
The third argument is the value that is selected where the condition vector is
false or the lane position is greater than or equal to the pivot. The fourth
argument is the pivot.
20983
20984#. The optional ``fast-math flags`` marker indicates that the merge has one or
20985   more :ref:`fast-math flags <fastmath>`. These are optimization hints to
20986   enable otherwise unsafe floating-point optimizations. Fast-math flags are
20987   only valid for merges that return :ref:`supported floating-point types
20988   <fastmath_return_types>`.
20989
20990Semantics:
20991""""""""""
20992
20993The intrinsic selects lanes from the second and third argument depending on a
20994condition vector and pivot value.
20995
20996For all lanes where the condition vector is true and the lane position is less
20997than ``%pivot`` the lane is taken from the second argument.  Otherwise, the lane
20998is taken from the third argument.
20999
21000Example:
21001""""""""
21002
21003.. code-block:: llvm
21004
      %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %pivot)

      ;;; Expansion.
      ;; Lanes at and above %pivot are taken from %on_false
      %atfirst = insertelement <4 x i32> poison, i32 %pivot, i32 0
      %splat = shufflevector <4 x i32> %atfirst, <4 x i32> poison, <4 x i32> zeroinitializer
      %pivotmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %splat
      %mergemask = and <4 x i1> %cond, %pivotmask
      %also.r = select <4 x i1> %mergemask, <4 x i32> %on_true, <4 x i32> %on_false
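
The following sketch applies the merge rule to constant operands. Unlike
'``llvm.vp.select``', every result lane is defined, because lanes at and above
the pivot come from the third argument. The function name ``@merge_example``
and the operand values are assumptions made for this illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.merge.v4i32(<4 x i1>, <4 x i32>, <4 x i32>, i32)

      ; Hypothetical wrapper; operand values are illustrative only.
      define <4 x i32> @merge_example() {
        %r = call <4 x i32> @llvm.vp.merge.v4i32(
                 <4 x i1> <i1 true, i1 false, i1 true, i1 true>,
                 <4 x i32> <i32 1, i32 2, i32 3, i32 4>,
                 <4 x i32> <i32 10, i32 20, i32 30, i32 40>,
                 i32 3)
        ; lane 0: condition true,  below pivot -> 1  (second argument)
        ; lane 1: condition false              -> 20 (third argument)
        ; lane 2: condition true,  below pivot -> 3  (second argument)
        ; lane 3: condition true,  but at pivot -> 40 (third argument)
        ret <4 x i32> %r
      }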
21014
21015
21016
21017.. _int_vp_add:
21018
21019'``llvm.vp.add.*``' Intrinsics
21020^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21021
21022Syntax:
21023"""""""
21024This is an overloaded intrinsic.
21025
21026::
21027
21028      declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21029      declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21030      declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21031
21032Overview:
21033"""""""""
21034
21035Predicated integer addition of two vectors of integers.
21036
21037
21038Arguments:
21039""""""""""
21040
21041The first two arguments and the result have the same vector of integer type. The
21042third argument is the vector mask and has the same number of elements as the
21043result vector type. The fourth argument is the explicit vector length of the
21044operation.
21045
21046Semantics:
21047""""""""""
21048
21049The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
21050of the first and second vector arguments on each enabled lane.  The result on
21051disabled lanes is a :ref:`poison value <poisonvalues>`.
21052
21053Examples:
21054"""""""""
21055
21056.. code-block:: llvm
21057
21058      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21059      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21060
21061      %t = add <4 x i32> %a, %b
21062      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
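
The expansion above only covers lanes below ``%evl``. A sketch of an expansion
that also accounts for ``%evl`` is shown below. It assumes that a lane is
enabled exactly when its mask bit is set and its position is below ``%evl``,
and it reuses the splat-and-compare pattern from the '``llvm.vp.merge``'
expansion. The function name ``@vp_add_expanded`` is hypothetical.

.. code-block:: llvm

      ; Hypothetical helper, not part of LLVM.
      define <4 x i32> @vp_add_expanded(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        ; Splat %evl and compare it against the lane indices <0, 1, 2, 3>.
        %evl.ins = insertelement <4 x i32> poison, i32 %evl, i32 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
        %evl.mask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        ; Assumption: enabled = mask bit set AND lane index below %evl.
        %enabled = and <4 x i1> %mask, %evl.mask
        %t = add <4 x i32> %a, %b
        ; Disabled lanes become poison.
        %r = select <4 x i1> %enabled, <4 x i32> %t, <4 x i32> poison
        ret <4 x i32> %r
      }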
21063
21064.. _int_vp_sub:
21065
21066'``llvm.vp.sub.*``' Intrinsics
21067^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21068
21069Syntax:
21070"""""""
21071This is an overloaded intrinsic.
21072
21073::
21074
21075      declare <16 x i32>  @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21076      declare <vscale x 4 x i32>  @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21077      declare <256 x i64>  @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21078
21079Overview:
21080"""""""""
21081
21082Predicated integer subtraction of two vectors of integers.
21083
21084
21085Arguments:
21086""""""""""
21087
21088The first two arguments and the result have the same vector of integer type. The
21089third argument is the vector mask and has the same number of elements as the
21090result vector type. The fourth argument is the explicit vector length of the
21091operation.
21092
21093Semantics:
21094""""""""""
21095
The '``llvm.vp.sub``' intrinsic performs integer subtraction
(:ref:`sub <i_sub>`) of the first and second vector arguments on each enabled
lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21099
21100Examples:
21101"""""""""
21102
21103.. code-block:: llvm
21104
21105      %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21106      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21107
21108      %t = sub <4 x i32> %a, %b
21109      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21110
21111
21112
21113.. _int_vp_mul:
21114
21115'``llvm.vp.mul.*``' Intrinsics
21116^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21117
21118Syntax:
21119"""""""
21120This is an overloaded intrinsic.
21121
21122::
21123
      declare <16 x i32>  @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21127
21128Overview:
21129"""""""""
21130
21131Predicated integer multiplication of two vectors of integers.
21132
21133
21134Arguments:
21135""""""""""
21136
21137The first two arguments and the result have the same vector of integer type. The
21138third argument is the vector mask and has the same number of elements as the
21139result vector type. The fourth argument is the explicit vector length of the
21140operation.
21141
21142Semantics:
21143""""""""""
21144The '``llvm.vp.mul``' intrinsic performs integer multiplication
21145(:ref:`mul <i_mul>`) of the first and second vector arguments on each enabled
21146lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21147
21148Examples:
21149"""""""""
21150
21151.. code-block:: llvm
21152
21153      %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21154      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21155
21156      %t = mul <4 x i32> %a, %b
21157      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21158
21159
21160.. _int_vp_sdiv:
21161
21162'``llvm.vp.sdiv.*``' Intrinsics
21163^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21164
21165Syntax:
21166"""""""
21167This is an overloaded intrinsic.
21168
21169::
21170
21171      declare <16 x i32>  @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21172      declare <vscale x 4 x i32>  @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21173      declare <256 x i64>  @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21174
21175Overview:
21176"""""""""
21177
21178Predicated, signed division of two vectors of integers.
21179
21180
21181Arguments:
21182""""""""""
21183
21184The first two arguments and the result have the same vector of integer type. The
21185third argument is the vector mask and has the same number of elements as the
21186result vector type. The fourth argument is the explicit vector length of the
21187operation.
21188
21189Semantics:
21190""""""""""
21191
21192The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
21193of the first and second vector arguments on each enabled lane.  The result on
21194disabled lanes is a :ref:`poison value <poisonvalues>`.
21195
21196Examples:
21197"""""""""
21198
21199.. code-block:: llvm
21200
21201      %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21202      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21203
21204      %t = sdiv <4 x i32> %a, %b
21205      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
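
Note that the unpredicated ``sdiv`` in the expansion above divides on every
lane, and division by zero is undefined behavior for ``sdiv``. If the divisor
on a disabled lane may be zero, one possible way to write the expansion is to
substitute a harmless divisor on those lanes first. The sketch below is an
illustration under that assumption, not a description of what any particular
LLVM pass emits. The function name ``@vp_sdiv_guarded`` is hypothetical, and
``%enabled`` stands for the already combined mask and ``%evl`` predicate.

.. code-block:: llvm

      ; Hypothetical helper, not part of LLVM.
      define <4 x i32> @vp_sdiv_guarded(<4 x i32> %a, <4 x i32> %b, <4 x i1> %enabled) {
        ; Use a divisor of 1 on disabled lanes so they cannot divide by zero.
        %safe.b = select <4 x i1> %enabled, <4 x i32> %b, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
        %t = sdiv <4 x i32> %a, %safe.b
        ; Disabled lanes are still poison in the final result.
        %r = select <4 x i1> %enabled, <4 x i32> %t, <4 x i32> poison
        ret <4 x i32> %r
      }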
21206
21207
21208.. _int_vp_udiv:
21209
21210'``llvm.vp.udiv.*``' Intrinsics
21211^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21212
21213Syntax:
21214"""""""
21215This is an overloaded intrinsic.
21216
21217::
21218
21219      declare <16 x i32>  @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21220      declare <vscale x 4 x i32>  @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21221      declare <256 x i64>  @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21222
21223Overview:
21224"""""""""
21225
21226Predicated, unsigned division of two vectors of integers.
21227
21228
21229Arguments:
21230""""""""""
21231
21232The first two arguments and the result have the same vector of integer type. The
21233third argument is the vector mask and has the same number of elements as the
21234result vector type. The fourth argument is the explicit vector length of the
21235operation.
21236
21237Semantics:
21238""""""""""
21239
21240The '``llvm.vp.udiv``' intrinsic performs unsigned division
21241(:ref:`udiv <i_udiv>`) of the first and second vector arguments on each enabled
21242lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21243
21244Examples:
21245"""""""""
21246
21247.. code-block:: llvm
21248
21249      %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21250      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21251
21252      %t = udiv <4 x i32> %a, %b
21253      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21254
21255
21256
21257.. _int_vp_srem:
21258
21259'``llvm.vp.srem.*``' Intrinsics
21260^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21261
21262Syntax:
21263"""""""
21264This is an overloaded intrinsic.
21265
21266::
21267
21268      declare <16 x i32>  @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21269      declare <vscale x 4 x i32>  @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21270      declare <256 x i64>  @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21271
21272Overview:
21273"""""""""
21274
Predicated computation of the signed remainder of two integer vectors.
21276
21277
21278Arguments:
21279""""""""""
21280
21281The first two arguments and the result have the same vector of integer type. The
21282third argument is the vector mask and has the same number of elements as the
21283result vector type. The fourth argument is the explicit vector length of the
21284operation.
21285
21286Semantics:
21287""""""""""
21288
21289The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
21290(:ref:`srem <i_srem>`) of the first and second vector arguments on each enabled
21291lane.  The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21292
21293Examples:
21294"""""""""
21295
21296.. code-block:: llvm
21297
21298      %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21299      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21300
21301      %t = srem <4 x i32> %a, %b
21302      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21303
21304
21305
21306.. _int_vp_urem:
21307
21308'``llvm.vp.urem.*``' Intrinsics
21309^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21310
21311Syntax:
21312"""""""
21313This is an overloaded intrinsic.
21314
21315::
21316
21317      declare <16 x i32>  @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21318      declare <vscale x 4 x i32>  @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21319      declare <256 x i64>  @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21320
21321Overview:
21322"""""""""
21323
21324Predicated computation of the unsigned remainder of two integer vectors.
21325
21326
21327Arguments:
21328""""""""""
21329
21330The first two arguments and the result have the same vector of integer type. The
21331third argument is the vector mask and has the same number of elements as the
21332result vector type. The fourth argument is the explicit vector length of the
21333operation.
21334
21335Semantics:
21336""""""""""
21337
21338The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
21339(:ref:`urem <i_urem>`) of the first and second vector arguments on each enabled
21340lane.  The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21341
21342Examples:
21343"""""""""
21344
21345.. code-block:: llvm
21346
21347      %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21348      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21349
21350      %t = urem <4 x i32> %a, %b
21351      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21352
21353
21354.. _int_vp_ashr:
21355
21356'``llvm.vp.ashr.*``' Intrinsics
21357^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21358
21359Syntax:
21360"""""""
21361This is an overloaded intrinsic.
21362
21363::
21364
21365      declare <16 x i32>  @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21366      declare <vscale x 4 x i32>  @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21367      declare <256 x i64>  @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21368
21369Overview:
21370"""""""""
21371
21372Vector-predicated arithmetic right-shift.
21373
21374
21375Arguments:
21376""""""""""
21377
21378The first two arguments and the result have the same vector of integer type. The
21379third argument is the vector mask and has the same number of elements as the
21380result vector type. The fourth argument is the explicit vector length of the
21381operation.
21382
21383Semantics:
21384""""""""""
21385
21386The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
21387(:ref:`ashr <i_ashr>`) of the first argument by the second argument on each
21388enabled lane. The result on disabled lanes is a
21389:ref:`poison value <poisonvalues>`.
21390
21391Examples:
21392"""""""""
21393
21394.. code-block:: llvm
21395
21396      %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21397      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21398
21399      %t = ashr <4 x i32> %a, %b
21400      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21401
21402
21403.. _int_vp_lshr:
21404
21405
21406'``llvm.vp.lshr.*``' Intrinsics
21407^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21408
21409Syntax:
21410"""""""
21411This is an overloaded intrinsic.
21412
21413::
21414
21415      declare <16 x i32>  @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21416      declare <vscale x 4 x i32>  @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21417      declare <256 x i64>  @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21418
21419Overview:
21420"""""""""
21421
21422Vector-predicated logical right-shift.
21423
21424
21425Arguments:
21426""""""""""
21427
21428The first two arguments and the result have the same vector of integer type. The
21429third argument is the vector mask and has the same number of elements as the
21430result vector type. The fourth argument is the explicit vector length of the
21431operation.
21432
21433Semantics:
21434""""""""""
21435
21436The '``llvm.vp.lshr``' intrinsic computes the logical right shift
21437(:ref:`lshr <i_lshr>`) of the first argument by the second argument on each
21438enabled lane. The result on disabled lanes is a
21439:ref:`poison value <poisonvalues>`.
21440
21441Examples:
21442"""""""""
21443
21444.. code-block:: llvm
21445
21446      %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21447      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21448
21449      %t = lshr <4 x i32> %a, %b
21450      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
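
To contrast the logical shift with '``llvm.vp.ashr``', the sketch below feeds
the same constant operands to both intrinsics. The function names and operand
values are assumptions chosen for illustration. Lane 2 relies on the rule for
``ashr`` and ``lshr`` that a shift amount equal to or larger than the bit width
yields a poison value.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical wrappers; operand values are illustrative only.
      define <4 x i32> @ashr_example() {
        %r = call <4 x i32> @llvm.vp.ashr.v4i32(
                 <4 x i32> <i32 -8, i32 16, i32 -8, i32 -8>,
                 <4 x i32> <i32 1, i32 2, i32 32, i32 1>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 false>,
                 i32 4)
        ; lane 0: -8 ashr 1 = -4 (the sign bit is replicated)
        ; lane 1: 16 ashr 2 = 4
        ; lane 2: poison, the shift amount equals the bit width
        ; lane 3: poison, the lane is disabled by the mask
        ret <4 x i32> %r
      }

      define <4 x i32> @lshr_example() {
        %r = call <4 x i32> @llvm.vp.lshr.v4i32(
                 <4 x i32> <i32 -8, i32 16, i32 -8, i32 -8>,
                 <4 x i32> <i32 1, i32 2, i32 32, i32 1>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 false>,
                 i32 4)
        ; lane 0: -8 lshr 1 = 2147483644 (a zero bit is shifted in)
        ; lane 1: 16 lshr 2 = 4
        ; lane 2: poison, the shift amount equals the bit width
        ; lane 3: poison, the lane is disabled by the mask
        ret <4 x i32> %r
      }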
21451
21452
21453.. _int_vp_shl:
21454
21455'``llvm.vp.shl.*``' Intrinsics
21456^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21457
21458Syntax:
21459"""""""
21460This is an overloaded intrinsic.
21461
21462::
21463
21464      declare <16 x i32>  @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21465      declare <vscale x 4 x i32>  @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21466      declare <256 x i64>  @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21467
21468Overview:
21469"""""""""
21470
21471Vector-predicated left shift.
21472
21473
21474Arguments:
21475""""""""""
21476
21477The first two arguments and the result have the same vector of integer type. The
21478third argument is the vector mask and has the same number of elements as the
21479result vector type. The fourth argument is the explicit vector length of the
21480operation.
21481
21482Semantics:
21483""""""""""
21484
21485The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
21486the first argument by the second argument on each enabled lane.  The result on
21487disabled lanes is a :ref:`poison value <poisonvalues>`.
21488
21489Examples:
21490"""""""""
21491
21492.. code-block:: llvm
21493
21494      %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21495      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21496
21497      %t = shl <4 x i32> %a, %b
21498      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21499
21500
21501.. _int_vp_or:
21502
21503'``llvm.vp.or.*``' Intrinsics
21504^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21505
21506Syntax:
21507"""""""
21508This is an overloaded intrinsic.
21509
21510::
21511
21512      declare <16 x i32>  @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21513      declare <vscale x 4 x i32>  @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21514      declare <256 x i64>  @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21515
21516Overview:
21517"""""""""
21518
Vector-predicated, bitwise or.
21520
21521
21522Arguments:
21523""""""""""
21524
21525The first two arguments and the result have the same vector of integer type. The
21526third argument is the vector mask and has the same number of elements as the
21527result vector type. The fourth argument is the explicit vector length of the
21528operation.
21529
21530Semantics:
21531""""""""""
21532
21533The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
21534first two arguments on each enabled lane.  The result on disabled lanes is
21535a :ref:`poison value <poisonvalues>`.
21536
21537Examples:
21538"""""""""
21539
21540.. code-block:: llvm
21541
21542      %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21543      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21544
21545      %t = or <4 x i32> %a, %b
21546      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21547
21548
21549.. _int_vp_and:
21550
21551'``llvm.vp.and.*``' Intrinsics
21552^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21553
21554Syntax:
21555"""""""
21556This is an overloaded intrinsic.
21557
21558::
21559
21560      declare <16 x i32>  @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21561      declare <vscale x 4 x i32>  @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21562      declare <256 x i64>  @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21563
21564Overview:
21565"""""""""
21566
Vector-predicated, bitwise and.
21568
21569
21570Arguments:
21571""""""""""
21572
21573The first two arguments and the result have the same vector of integer type. The
21574third argument is the vector mask and has the same number of elements as the
21575result vector type. The fourth argument is the explicit vector length of the
21576operation.
21577
21578Semantics:
21579""""""""""
21580
The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two arguments on each enabled lane.  The result on disabled lanes is
a :ref:`poison value <poisonvalues>`.
21584
21585Examples:
21586"""""""""
21587
21588.. code-block:: llvm
21589
21590      %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21591      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21592
21593      %t = and <4 x i32> %a, %b
21594      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21595
21596
21597.. _int_vp_xor:
21598
21599'``llvm.vp.xor.*``' Intrinsics
21600^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21601
21602Syntax:
21603"""""""
21604This is an overloaded intrinsic.
21605
21606::
21607
21608      declare <16 x i32>  @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21609      declare <vscale x 4 x i32>  @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21610      declare <256 x i64>  @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21611
21612Overview:
21613"""""""""
21614
21615Vector-predicated, bitwise xor.
21616
21617
21618Arguments:
21619""""""""""
21620
21621The first two arguments and the result have the same vector of integer type. The
21622third argument is the vector mask and has the same number of elements as the
21623result vector type. The fourth argument is the explicit vector length of the
21624operation.
21625
21626Semantics:
21627""""""""""
21628
21629The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
21630the first two arguments on each enabled lane.
21631The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21632
21633Examples:
21634"""""""""
21635
21636.. code-block:: llvm
21637
21638      %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21639      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21640
21641      %t = xor <4 x i32> %a, %b
21642      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21643
21644.. _int_vp_abs:
21645
21646'``llvm.vp.abs.*``' Intrinsics
21647^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21648
21649Syntax:
21650"""""""
21651This is an overloaded intrinsic.
21652
21653::
21654
21655      declare <16 x i32>  @llvm.vp.abs.v16i32 (<16 x i32> <op>, i1 <is_int_min_poison>, <16 x i1> <mask>, i32 <vector_length>)
21656      declare <vscale x 4 x i32>  @llvm.vp.abs.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_int_min_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21657      declare <256 x i64>  @llvm.vp.abs.v256i64 (<256 x i64> <op>, i1 <is_int_min_poison>, <256 x i1> <mask>, i32 <vector_length>)
21658
21659Overview:
21660"""""""""
21661
Predicated absolute value of a vector of integers.
21663
21664
21665Arguments:
21666""""""""""
21667
21668The first argument and the result have the same vector of integer type. The
21669second argument must be a constant and is a flag to indicate whether the result
21670value of the '``llvm.vp.abs``' intrinsic is a :ref:`poison value <poisonvalues>`
21671if the first argument is statically or dynamically an ``INT_MIN`` value. The
21672third argument is the vector mask and has the same number of elements as the
21673result vector type. The fourth argument is the explicit vector length of the
21674operation.
21675
21676Semantics:
21677""""""""""
21678
The '``llvm.vp.abs``' intrinsic computes the absolute value
(:ref:`abs <int_abs>`) of the first argument on each enabled lane.  The result
on disabled lanes is a :ref:`poison value <poisonvalues>`.
21681
21682Examples:
21683"""""""""
21684
21685.. code-block:: llvm
21686
21687      %r = call <4 x i32> @llvm.vp.abs.v4i32(<4 x i32> %a, i1 false, <4 x i1> %mask, i32 %evl)
21688      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21689
21690      %t = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %a, i1 false)
21691      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
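
The effect of the ``is_int_min_poison`` flag can be seen with constant
operands. In the sketch below the flag is ``true``, so the ``INT_MIN`` lane
yields a poison value. With the flag set to ``false`` that lane would yield
``INT_MIN`` instead, following the rules of :ref:`abs <int_abs>`. The function
name ``@abs_example`` and the operand values are assumptions chosen for
illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.abs.v4i32(<4 x i32>, i1, <4 x i1>, i32)

      ; Hypothetical wrapper; operand values are illustrative only.
      define <4 x i32> @abs_example() {
        %r = call <4 x i32> @llvm.vp.abs.v4i32(
                 <4 x i32> <i32 -5, i32 7, i32 -2147483648, i32 -1>,
                 i1 true,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 false>,
                 i32 4)
        ; lane 0: abs(-5) = 5
        ; lane 1: abs(7)  = 7
        ; lane 2: poison, the operand is INT_MIN and the flag is true
        ; lane 3: poison, the lane is disabled by the mask
        ret <4 x i32> %r
      }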
21692
21693
21694
21695.. _int_vp_smax:
21696
21697'``llvm.vp.smax.*``' Intrinsics
21698^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21699
21700Syntax:
21701"""""""
21702This is an overloaded intrinsic.
21703
21704::
21705
21706      declare <16 x i32>  @llvm.vp.smax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21707      declare <vscale x 4 x i32>  @llvm.vp.smax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21708      declare <256 x i64>  @llvm.vp.smax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21709
21710Overview:
21711"""""""""
21712
21713Predicated integer signed maximum of two vectors of integers.
21714
21715
21716Arguments:
21717""""""""""
21718
21719The first two arguments and the result have the same vector of integer type. The
21720third argument is the vector mask and has the same number of elements as the
21721result vector type. The fourth argument is the explicit vector length of the
21722operation.
21723
21724Semantics:
21725""""""""""
21726
21727The '``llvm.vp.smax``' intrinsic performs integer signed maximum (:ref:`smax <int_smax>`)
21728of the first and second vector arguments on each enabled lane.  The result on
21729disabled lanes is a :ref:`poison value <poisonvalues>`.
21730
21731Examples:
21732"""""""""
21733
21734.. code-block:: llvm
21735
21736      %r = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21737      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21738
21739      %t = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
21740      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21741
21742
21743.. _int_vp_smin:
21744
21745'``llvm.vp.smin.*``' Intrinsics
21746^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21747
21748Syntax:
21749"""""""
21750This is an overloaded intrinsic.
21751
21752::
21753
21754      declare <16 x i32>  @llvm.vp.smin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21755      declare <vscale x 4 x i32>  @llvm.vp.smin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21756      declare <256 x i64>  @llvm.vp.smin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21757
21758Overview:
21759"""""""""
21760
21761Predicated integer signed minimum of two vectors of integers.
21762
21763
21764Arguments:
21765""""""""""
21766
21767The first two arguments and the result have the same vector of integer type. The
21768third argument is the vector mask and has the same number of elements as the
21769result vector type. The fourth argument is the explicit vector length of the
21770operation.
21771
21772Semantics:
21773""""""""""
21774
21775The '``llvm.vp.smin``' intrinsic performs integer signed minimum (:ref:`smin <int_smin>`)
21776of the first and second vector arguments on each enabled lane.  The result on
21777disabled lanes is a :ref:`poison value <poisonvalues>`.
21778
21779Examples:
21780"""""""""
21781
21782.. code-block:: llvm
21783
21784      %r = call <4 x i32> @llvm.vp.smin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21785      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21786
21787      %t = call <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
21788      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21789
21790
21791.. _int_vp_umax:
21792
21793'``llvm.vp.umax.*``' Intrinsics
21794^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21795
21796Syntax:
21797"""""""
21798This is an overloaded intrinsic.
21799
21800::
21801
21802      declare <16 x i32>  @llvm.vp.umax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21803      declare <vscale x 4 x i32>  @llvm.vp.umax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21804      declare <256 x i64>  @llvm.vp.umax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21805
21806Overview:
21807"""""""""
21808
21809Predicated integer unsigned maximum of two vectors of integers.
21810
21811
21812Arguments:
21813""""""""""
21814
21815The first two arguments and the result have the same vector of integer type. The
21816third argument is the vector mask and has the same number of elements as the
21817result vector type. The fourth argument is the explicit vector length of the
21818operation.
21819
21820Semantics:
21821""""""""""
21822
21823The '``llvm.vp.umax``' intrinsic performs integer unsigned maximum (:ref:`umax <int_umax>`)
21824of the first and second vector arguments on each enabled lane.  The result on
21825disabled lanes is a :ref:`poison value <poisonvalues>`.
21826
21827Examples:
21828"""""""""
21829
21830.. code-block:: llvm
21831
21832      %r = call <4 x i32> @llvm.vp.umax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21833      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21834
21835      %t = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
21836      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21837
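The explicit vector length can also be folded into the mask. The following
is an illustrative sketch only: it assumes ``%evl`` does not exceed the number
of vector elements (4 here), and the names ``%evl.ins``, ``%evl.splat``,
``%evl.mask`` and ``%all.mask`` are hypothetical. With the combined mask,
lanes at or above ``%evl`` are poison in ``%also.r`` just as they are in the
result of the predicated call.

.. code-block:: llvm

      ;; Broadcast %evl and compare it against the lane indices.
      %evl.ins = insertelement <4 x i32> poison, i32 %evl, i64 0
      %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
      %evl.mask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
      %all.mask = and <4 x i1> %mask, %evl.mask

      %t = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
      %also.r = select <4 x i1> %all.mask, <4 x i32> %t, <4 x i32> poison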
21838
21839.. _int_vp_umin:
21840
21841'``llvm.vp.umin.*``' Intrinsics
21842^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21843
21844Syntax:
21845"""""""
21846This is an overloaded intrinsic.
21847
21848::
21849
21850      declare <16 x i32>  @llvm.vp.umin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21851      declare <vscale x 4 x i32>  @llvm.vp.umin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21852      declare <256 x i64>  @llvm.vp.umin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21853
21854Overview:
21855"""""""""
21856
21857Predicated integer unsigned minimum of two vectors of integers.
21858
21859
21860Arguments:
21861""""""""""
21862
21863The first two arguments and the result have the same vector of integer type. The
21864third argument is the vector mask and has the same number of elements as the
21865result vector type. The fourth argument is the explicit vector length of the
21866operation.
21867
21868Semantics:
21869""""""""""
21870
21871The '``llvm.vp.umin``' intrinsic performs integer unsigned minimum (:ref:`umin <int_umin>`)
21872of the first and second vector arguments on each enabled lane.  The result on
21873disabled lanes is a :ref:`poison value <poisonvalues>`.
21874
21875Examples:
21876"""""""""
21877
21878.. code-block:: llvm
21879
21880      %r = call <4 x i32> @llvm.vp.umin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21881      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21882
21883      %t = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
21884      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21885
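As a further illustration, the sketch below uses made-up constant operands to
show the unsigned comparison together with the effect of the explicit vector
length. All mask bits are set, but only the first two lanes lie below the
vector length of 2.

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.umin.v4i32(
               <4 x i32> <i32 -1, i32 7, i32 0, i32 9>,
               <4 x i32> <i32 1, i32 8, i32 -5, i32 9>,
               <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2)
      ;; lane 0: 1 (i32 -1 is 0xFFFFFFFF when compared as unsigned)
      ;; lane 1: 7
      ;; lanes 2 and 3: poison (at or above the vector length)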
21886
21887.. _int_vp_copysign:
21888
21889'``llvm.vp.copysign.*``' Intrinsics
21890^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21891
21892Syntax:
21893"""""""
21894This is an overloaded intrinsic.
21895
21896::
21897
21898      declare <16 x float>  @llvm.vp.copysign.v16f32 (<16 x float> <mag_op>, <16 x float> <sign_op>, <16 x i1> <mask>, i32 <vector_length>)
21899      declare <vscale x 4 x float>  @llvm.vp.copysign.nxv4f32 (<vscale x 4 x float> <mag_op>, <vscale x 4 x float> <sign_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21900      declare <256 x double>  @llvm.vp.copysign.v256f64 (<256 x double> <mag_op>, <256 x double> <sign_op>, <256 x i1> <mask>, i32 <vector_length>)
21901
21902Overview:
21903"""""""""
21904
21905Predicated floating-point copysign of two vectors of floating-point values.
21906
21907
21908Arguments:
21909""""""""""
21910
21911The first two arguments and the result have the same vector of floating-point type. The
21912third argument is the vector mask and has the same number of elements as the
21913result vector type. The fourth argument is the explicit vector length of the
21914operation.
21915
21916Semantics:
21917""""""""""
21918
21919The '``llvm.vp.copysign``' intrinsic performs floating-point copysign (:ref:`copysign <int_copysign>`)
21920of the first and second vector arguments on each enabled lane.  The result on
21921disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
21922performed in the default floating-point environment.
21923
21924Examples:
21925"""""""""
21926
21927.. code-block:: llvm
21928
21929      %r = call <4 x float> @llvm.vp.copysign.v4f32(<4 x float> %mag, <4 x float> %sign, <4 x i1> %mask, i32 %evl)
21930      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21931
21932      %t = call <4 x float> @llvm.copysign.v4f32(<4 x float> %mag, <4 x float> %sign)
21933      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
21934
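The following sketch, with made-up constant operands, shows the lane-wise
results, including a lane that is disabled by the mask.

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.copysign.v4f32(
               <4 x float> <float 1.0, float -2.0, float 3.0, float -4.0>,
               <4 x float> <float -0.0, float 5.0, float 6.0, float -7.0>,
               <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
      ;; lane 0: -1.0 (the sign is taken from -0.0)
      ;; lane 1: 2.0
      ;; lane 2: poison (masked off)
      ;; lane 3: -4.0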
21935
21936.. _int_vp_minnum:
21937
21938'``llvm.vp.minnum.*``' Intrinsics
21939^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21940
21941Syntax:
21942"""""""
21943This is an overloaded intrinsic.
21944
21945::
21946
21947      declare <16 x float>  @llvm.vp.minnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21948      declare <vscale x 4 x float>  @llvm.vp.minnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21949      declare <256 x double>  @llvm.vp.minnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21950
21951Overview:
21952"""""""""
21953
21954Predicated floating-point IEEE-754 minNum of two vectors of floating-point values.
21955
21956
21957Arguments:
21958""""""""""
21959
21960The first two arguments and the result have the same vector of floating-point type. The
21961third argument is the vector mask and has the same number of elements as the
21962result vector type. The fourth argument is the explicit vector length of the
21963operation.
21964
21965Semantics:
21966""""""""""
21967
21968The '``llvm.vp.minnum``' intrinsic performs floating-point minimum (:ref:`minnum <i_minnum>`)
21969of the first and second vector arguments on each enabled lane.  The result on
21970disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
21971performed in the default floating-point environment.
21972
21973Examples:
21974"""""""""
21975
21976.. code-block:: llvm
21977
21978      %r = call <4 x float> @llvm.vp.minnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
21979      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21980
21981      %t = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %b)
21982      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
21983
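To illustrate how quiet NaN operands are handled, the sketch below uses
made-up constant operands with all lanes enabled. It assumes the quiet NaN
behavior of the underlying ``llvm.minnum`` intrinsic, where the non-NaN
operand is returned. Signaling NaNs and signed zeros are deliberately left out.

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.minnum.v4f32(
               <4 x float> <float 1.0, float 0x7FF8000000000000, float 5.0, float 4.0>,
               <4 x float> <float 2.0, float 3.0, float 0x7FF8000000000000, float 4.0>,
               <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
      ;; lane 0: 1.0
      ;; lane 1: 3.0 (the quiet NaN operand is ignored)
      ;; lane 2: 5.0 (the quiet NaN operand is ignored)
      ;; lane 3: 4.0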
21984
21985.. _int_vp_maxnum:
21986
21987'``llvm.vp.maxnum.*``' Intrinsics
21988^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21989
21990Syntax:
21991"""""""
21992This is an overloaded intrinsic.
21993
21994::
21995
21996      declare <16 x float>  @llvm.vp.maxnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21997      declare <vscale x 4 x float>  @llvm.vp.maxnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21998      declare <256 x double>  @llvm.vp.maxnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21999
22000Overview:
22001"""""""""
22002
22003Predicated floating-point IEEE-754 maxNum of two vectors of floating-point values.
22004
22005
22006Arguments:
22007""""""""""
22008
22009The first two arguments and the result have the same vector of floating-point type. The
22010third argument is the vector mask and has the same number of elements as the
22011result vector type. The fourth argument is the explicit vector length of the
22012operation.
22013
22014Semantics:
22015""""""""""
22016
22017The '``llvm.vp.maxnum``' intrinsic performs floating-point maximum (:ref:`maxnum <i_maxnum>`)
22018of the first and second vector arguments on each enabled lane.  The result on
22019disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
22020performed in the default floating-point environment.
22021
22022Examples:
22023"""""""""
22024
22025.. code-block:: llvm
22026
      %r = call <4 x float> @llvm.vp.maxnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %b)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22032
22033
22034.. _int_vp_minimum:
22035
22036'``llvm.vp.minimum.*``' Intrinsics
22037^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22038
22039Syntax:
22040"""""""
22041This is an overloaded intrinsic.
22042
22043::
22044
22045      declare <16 x float>  @llvm.vp.minimum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22046      declare <vscale x 4 x float>  @llvm.vp.minimum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22047      declare <256 x double>  @llvm.vp.minimum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22048
22049Overview:
22050"""""""""
22051
22052Predicated floating-point minimum of two vectors of floating-point values,
22053propagating NaNs and treating -0.0 as less than +0.0.
22054
22055Arguments:
22056""""""""""
22057
22058The first two arguments and the result have the same vector of floating-point type. The
22059third argument is the vector mask and has the same number of elements as the
22060result vector type. The fourth argument is the explicit vector length of the
22061operation.
22062
22063Semantics:
22064""""""""""
22065
22066The '``llvm.vp.minimum``' intrinsic performs floating-point minimum (:ref:`minimum <i_minimum>`)
22067of the first and second vector arguments on each enabled lane, the result being
22068NaN if either argument is a NaN. -0.0 is considered to be less than +0.0 for this
22069intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22070The operation is performed in the default floating-point environment.
22071
22072Examples:
22073"""""""""
22074
22075.. code-block:: llvm
22076
22077      %r = call <4 x float> @llvm.vp.minimum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22078      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22079
22080      %t = call <4 x float> @llvm.minimum.v4f32(<4 x float> %a, <4 x float> %b)
22081      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22082
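The NaN and signed-zero rules can be illustrated with made-up constant
operands. This is a sketch only, with all lanes enabled.

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.minimum.v4f32(
               <4 x float> <float 1.0, float 0x7FF8000000000000, float -0.0, float 4.0>,
               <4 x float> <float 2.0, float 3.0, float 0.0, float 0x7FF8000000000000>,
               <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
      ;; lane 0: 1.0
      ;; lane 1: NaN (a NaN operand propagates)
      ;; lane 2: -0.0 (-0.0 is treated as less than +0.0)
      ;; lane 3: NaN (a NaN operand propagates)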
22083
22084.. _int_vp_maximum:
22085
22086'``llvm.vp.maximum.*``' Intrinsics
22087^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22088
22089Syntax:
22090"""""""
22091This is an overloaded intrinsic.
22092
22093::
22094
22095      declare <16 x float>  @llvm.vp.maximum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22096      declare <vscale x 4 x float>  @llvm.vp.maximum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22097      declare <256 x double>  @llvm.vp.maximum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22098
22099Overview:
22100"""""""""
22101
22102Predicated floating-point maximum of two vectors of floating-point values,
22103propagating NaNs and treating -0.0 as less than +0.0.
22104
22105Arguments:
22106""""""""""
22107
22108The first two arguments and the result have the same vector of floating-point type. The
22109third argument is the vector mask and has the same number of elements as the
22110result vector type. The fourth argument is the explicit vector length of the
22111operation.
22112
22113Semantics:
22114""""""""""
22115
22116The '``llvm.vp.maximum``' intrinsic performs floating-point maximum (:ref:`maximum <i_maximum>`)
22117of the first and second vector arguments on each enabled lane, the result being
22118NaN if either argument is a NaN. -0.0 is considered to be less than +0.0 for this
22119intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22120The operation is performed in the default floating-point environment.
22121
22122Examples:
22123"""""""""
22124
22125.. code-block:: llvm
22126
      %r = call <4 x float> @llvm.vp.maximum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.maximum.v4f32(<4 x float> %a, <4 x float> %b)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22132
22133
22134.. _int_vp_fadd:
22135
22136'``llvm.vp.fadd.*``' Intrinsics
22137^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22138
22139Syntax:
22140"""""""
22141This is an overloaded intrinsic.
22142
22143::
22144
22145      declare <16 x float>  @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22146      declare <vscale x 4 x float>  @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22147      declare <256 x double>  @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22148
22149Overview:
22150"""""""""
22151
22152Predicated floating-point addition of two vectors of floating-point values.
22153
22154
22155Arguments:
22156""""""""""
22157
22158The first two arguments and the result have the same vector of floating-point type. The
22159third argument is the vector mask and has the same number of elements as the
22160result vector type. The fourth argument is the explicit vector length of the
22161operation.
22162
22163Semantics:
22164""""""""""
22165
22166The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
22167of the first and second vector arguments on each enabled lane.  The result on
22168disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
22169performed in the default floating-point environment.
22170
22171Examples:
22172"""""""""
22173
22174.. code-block:: llvm
22175
22176      %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22177      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22178
22179      %t = fadd <4 x float> %a, %b
22180      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22181
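The same pattern applies to scalable vectors. The sketch below is illustrative
only: ``%remaining`` is a hypothetical count of elements still to be
processed, and the explicit vector length is assumed to come from the
``llvm.experimental.get.vector.length`` intrinsic, as a strip-mined loop might
do.

.. code-block:: llvm

      ;; Request up to (vscale x 4) elements for this iteration.
      %evl = call i32 @llvm.experimental.get.vector.length.i64(i64 %remaining, i32 4, i1 true)
      %r = call <vscale x 4 x float> @llvm.vp.fadd.nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x float> %b, <vscale x 4 x i1> %mask, i32 %evl)
      ;; Lanes at or above %evl, and lanes where %mask is false, are poison.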
22182
22183.. _int_vp_fsub:
22184
22185'``llvm.vp.fsub.*``' Intrinsics
22186^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22187
22188Syntax:
22189"""""""
22190This is an overloaded intrinsic.
22191
22192::
22193
22194      declare <16 x float>  @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22195      declare <vscale x 4 x float>  @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22196      declare <256 x double>  @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22197
22198Overview:
22199"""""""""
22200
22201Predicated floating-point subtraction of two vectors of floating-point values.
22202
22203
22204Arguments:
22205""""""""""
22206
22207The first two arguments and the result have the same vector of floating-point type. The
22208third argument is the vector mask and has the same number of elements as the
22209result vector type. The fourth argument is the explicit vector length of the
22210operation.
22211
22212Semantics:
22213""""""""""
22214
22215The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
22216of the first and second vector arguments on each enabled lane.  The result on
22217disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
22218performed in the default floating-point environment.
22219
22220Examples:
22221"""""""""
22222
22223.. code-block:: llvm
22224
22225      %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22226      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22227
22228      %t = fsub <4 x float> %a, %b
22229      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22230
22231
22232.. _int_vp_fmul:
22233
22234'``llvm.vp.fmul.*``' Intrinsics
22235^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22236
22237Syntax:
22238"""""""
22239This is an overloaded intrinsic.
22240
22241::
22242
22243      declare <16 x float>  @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22244      declare <vscale x 4 x float>  @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22245      declare <256 x double>  @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22246
22247Overview:
22248"""""""""
22249
22250Predicated floating-point multiplication of two vectors of floating-point values.
22251
22252
22253Arguments:
22254""""""""""
22255
22256The first two arguments and the result have the same vector of floating-point type. The
22257third argument is the vector mask and has the same number of elements as the
22258result vector type. The fourth argument is the explicit vector length of the
22259operation.
22260
22261Semantics:
22262""""""""""
22263
22264The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
22265of the first and second vector arguments on each enabled lane.  The result on
22266disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
22267performed in the default floating-point environment.
22268
22269Examples:
22270"""""""""
22271
22272.. code-block:: llvm
22273
22274      %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22275      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22276
22277      %t = fmul <4 x float> %a, %b
22278      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22279
22280
22281.. _int_vp_fdiv:
22282
22283'``llvm.vp.fdiv.*``' Intrinsics
22284^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22285
22286Syntax:
22287"""""""
22288This is an overloaded intrinsic.
22289
22290::
22291
22292      declare <16 x float>  @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22293      declare <vscale x 4 x float>  @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22294      declare <256 x double>  @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22295
22296Overview:
22297"""""""""
22298
22299Predicated floating-point division of two vectors of floating-point values.
22300
22301
22302Arguments:
22303""""""""""
22304
22305The first two arguments and the result have the same vector of floating-point type. The
22306third argument is the vector mask and has the same number of elements as the
22307result vector type. The fourth argument is the explicit vector length of the
22308operation.
22309
22310Semantics:
22311""""""""""
22312
22313The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
22314of the first and second vector arguments on each enabled lane.  The result on
22315disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
22316performed in the default floating-point environment.
22317
22318Examples:
22319"""""""""
22320
22321.. code-block:: llvm
22322
22323      %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22324      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22325
22326      %t = fdiv <4 x float> %a, %b
22327      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22328
22329
22330.. _int_vp_frem:
22331
22332'``llvm.vp.frem.*``' Intrinsics
22333^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22334
22335Syntax:
22336"""""""
22337This is an overloaded intrinsic.
22338
22339::
22340
22341      declare <16 x float>  @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22342      declare <vscale x 4 x float>  @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22343      declare <256 x double>  @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22344
22345Overview:
22346"""""""""
22347
22348Predicated floating-point remainder of two vectors of floating-point values.
22349
22350
22351Arguments:
22352""""""""""
22353
22354The first two arguments and the result have the same vector of floating-point type. The
22355third argument is the vector mask and has the same number of elements as the
22356result vector type. The fourth argument is the explicit vector length of the
22357operation.
22358
22359Semantics:
22360""""""""""
22361
22362The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
22363of the first and second vector arguments on each enabled lane.  The result on
22364disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
22365performed in the default floating-point environment.
22366
22367Examples:
22368"""""""""
22369
22370.. code-block:: llvm
22371
22372      %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22373      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22374
22375      %t = frem <4 x float> %a, %b
22376      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22377
22378
22379.. _int_vp_fneg:
22380
22381'``llvm.vp.fneg.*``' Intrinsics
22382^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22383
22384Syntax:
22385"""""""
22386This is an overloaded intrinsic.
22387
22388::
22389
22390      declare <16 x float>  @llvm.vp.fneg.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22391      declare <vscale x 4 x float>  @llvm.vp.fneg.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22392      declare <256 x double>  @llvm.vp.fneg.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22393
22394Overview:
22395"""""""""
22396
22397Predicated floating-point negation of a vector of floating-point values.
22398
22399
22400Arguments:
22401""""""""""
22402
22403The first argument and the result have the same vector of floating-point type.
22404The second argument is the vector mask and has the same number of elements as the
22405result vector type. The third argument is the explicit vector length of the
22406operation.
22407
22408Semantics:
22409""""""""""
22410
22411The '``llvm.vp.fneg``' intrinsic performs floating-point negation (:ref:`fneg <i_fneg>`)
22412of the first vector argument on each enabled lane.  The result on disabled lanes
22413is a :ref:`poison value <poisonvalues>`.
22414
22415Examples:
22416"""""""""
22417
22418.. code-block:: llvm
22419
22420      %r = call <4 x float> @llvm.vp.fneg.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22421      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22422
22423      %t = fneg <4 x float> %a
22424      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22425
22426
22427.. _int_vp_fabs:
22428
22429'``llvm.vp.fabs.*``' Intrinsics
22430^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22431
22432Syntax:
22433"""""""
22434This is an overloaded intrinsic.
22435
22436::
22437
22438      declare <16 x float>  @llvm.vp.fabs.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22439      declare <vscale x 4 x float>  @llvm.vp.fabs.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22440      declare <256 x double>  @llvm.vp.fabs.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22441
22442Overview:
22443"""""""""
22444
22445Predicated floating-point absolute value of a vector of floating-point values.
22446
22447
22448Arguments:
22449""""""""""
22450
22451The first argument and the result have the same vector of floating-point type.
22452The second argument is the vector mask and has the same number of elements as the
22453result vector type. The third argument is the explicit vector length of the
22454operation.
22455
22456Semantics:
22457""""""""""
22458
22459The '``llvm.vp.fabs``' intrinsic performs floating-point absolute value
22460(:ref:`fabs <int_fabs>`) of the first vector argument on each enabled lane.  The
22461result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22462
22463Examples:
22464"""""""""
22465
22466.. code-block:: llvm
22467
22468      %r = call <4 x float> @llvm.vp.fabs.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22469      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22470
22471      %t = call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
22472      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22473
22474
22475.. _int_vp_sqrt:
22476
22477'``llvm.vp.sqrt.*``' Intrinsics
22478^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22479
22480Syntax:
22481"""""""
22482This is an overloaded intrinsic.
22483
22484::
22485
22486      declare <16 x float>  @llvm.vp.sqrt.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22487      declare <vscale x 4 x float>  @llvm.vp.sqrt.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22488      declare <256 x double>  @llvm.vp.sqrt.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22489
22490Overview:
22491"""""""""
22492
22493Predicated floating-point square root of a vector of floating-point values.
22494
22495
22496Arguments:
22497""""""""""
22498
22499The first argument and the result have the same vector of floating-point type.
22500The second argument is the vector mask and has the same number of elements as the
22501result vector type. The third argument is the explicit vector length of the
22502operation.
22503
22504Semantics:
22505""""""""""
22506
22507The '``llvm.vp.sqrt``' intrinsic performs floating-point square root (:ref:`sqrt <int_sqrt>`) of
22508the first vector argument on each enabled lane.  The result on disabled lanes is
22509a :ref:`poison value <poisonvalues>`. The operation is performed in the default
22510floating-point environment.
22511
22512Examples:
22513"""""""""
22514
22515.. code-block:: llvm
22516
22517      %r = call <4 x float> @llvm.vp.sqrt.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22518      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22519
22520      %t = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %a)
22521      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22522
22523
22524.. _int_vp_fma:
22525
22526'``llvm.vp.fma.*``' Intrinsics
22527^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22528
22529Syntax:
22530"""""""
22531This is an overloaded intrinsic.
22532
22533::
22534
22535      declare <16 x float>  @llvm.vp.fma.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22536      declare <vscale x 4 x float>  @llvm.vp.fma.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22537      declare <256 x double>  @llvm.vp.fma.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22538
22539Overview:
22540"""""""""
22541
Predicated floating-point fused multiply-add of three vectors of floating-point values.
22543
22544
22545Arguments:
22546""""""""""
22547
22548The first three arguments and the result have the same vector of floating-point type. The
22549fourth argument is the vector mask and has the same number of elements as the
22550result vector type. The fifth argument is the explicit vector length of the
22551operation.
22552
22553Semantics:
22554""""""""""
22555
The '``llvm.vp.fma``' intrinsic performs floating-point fused multiply-add (:ref:`llvm.fma <int_fma>`)
of the first, second, and third vector arguments on each enabled lane.  The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
performed in the default floating-point environment.
22560
22561Examples:
22562"""""""""
22563
22564.. code-block:: llvm
22565
      %r = call <4 x float> @llvm.vp.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22571
22572
22573.. _int_vp_fmuladd:
22574
22575'``llvm.vp.fmuladd.*``' Intrinsics
22576^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22577
22578Syntax:
22579"""""""
22580This is an overloaded intrinsic.
22581
22582::
22583
22584      declare <16 x float>  @llvm.vp.fmuladd.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22585      declare <vscale x 4 x float>  @llvm.vp.fmuladd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22586      declare <256 x double>  @llvm.vp.fmuladd.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22587
22588Overview:
22589"""""""""
22590
Predicated floating-point multiply-add of three vectors of floating-point values
that can be fused if the code generator determines that (a) the target instruction
set has support for a fused operation, and (b) the fused operation is more
efficient than the equivalent, separate pair of mul and add instructions.
22595
22596Arguments:
22597""""""""""
22598
22599The first three arguments and the result have the same vector of floating-point
22600type. The fourth argument is the vector mask and has the same number of elements
22601as the result vector type. The fifth argument is the explicit vector length of
22602the operation.
22603
22604Semantics:
22605""""""""""
22606
The '``llvm.vp.fmuladd``' intrinsic performs floating-point multiply-add (:ref:`llvm.fmuladd <int_fmuladd>`)
of the first, second, and third vector arguments on each enabled lane.  The result
on disabled lanes is a :ref:`poison value <poisonvalues>`.  The operation is
performed in the default floating-point environment.
22611
22612Examples:
22613"""""""""
22614
22615.. code-block:: llvm
22616
      %r = call <4 x float> @llvm.vp.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22622
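When the operation is not fused, each enabled lane may instead be computed
with a separate multiply and add, rounding the intermediate product. The
sketch below shows that unfused form as one permitted outcome, not as the
required lowering. The name ``%mul`` is illustrative.

.. code-block:: llvm

      ;; Unfused alternative: the product is rounded before the addition.
      %mul = fmul <4 x float> %a, %b
      %t = fadd <4 x float> %mul, %c
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison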
22623
22624.. _int_vp_reduce_add:
22625
22626'``llvm.vp.reduce.add.*``' Intrinsics
22627^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22628
22629Syntax:
22630"""""""
22631This is an overloaded intrinsic.
22632
22633::
22634
22635      declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22636      declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22637
22638Overview:
22639"""""""""
22640
22641Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
22642returning the result as a scalar.
22643
22644Arguments:
22645""""""""""
22646
22647The first argument is the start value of the reduction, which must be a scalar
22648integer type equal to the result type. The second argument is the vector on
22649which the reduction is performed and must be a vector of integer values whose
22650element type is the result/start type. The third argument is the vector mask and
22651is a vector of boolean values with the same number of elements as the vector
22652argument. The fourth argument is the explicit vector length of the operation.
22653
22654Semantics:
22655""""""""""
22656
22657The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
22658(:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector argument
22659``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
22660lanes are treated as containing the neutral value ``0`` (i.e. having no effect
22661on the reduction operation). If the vector length is zero, the result is equal
22662to ``start_value``.
22663
22664To ignore the start value, the neutral value can be used.
22665
22666Examples:
22667"""""""""
22668
22669.. code-block:: llvm
22670
22671      %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22672      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22673      ; are treated as though %mask were false for those lanes.
22674
22675      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
22676      %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
22677      %also.r = add i32 %reduction, %start
22678
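The effect of ``%evl`` on the equivalence above can be made explicit by folding
it into the mask. The following is a minimal sketch, not a definitive lowering.
It assumes ``%evl`` does not exceed the vector length of 4, and the names
``%evl.ins``, ``%evl.splat``, ``%evl.mask``, and ``%eff.mask`` are hypothetical.

.. code-block:: llvm

      ; Splat %evl and compare it against the lane indices 0..3.
      %evl.ins = insertelement <4 x i32> poison, i32 %evl, i64 0
      %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
      %evl.mask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat

      ; A lane is enabled only if its mask bit is set and its index is below %evl.
      %eff.mask = and <4 x i1> %mask, %evl.mask
      %masked.a = select <4 x i1> %eff.mask, <4 x i32> %a, <4 x i32> zeroinitializer
      %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
      %also.r = add i32 %reduction, %start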
22679
22680.. _int_vp_reduce_fadd:
22681
22682'``llvm.vp.reduce.fadd.*``' Intrinsics
22683^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22684
22685Syntax:
22686"""""""
22687This is an overloaded intrinsic.
22688
22689::
22690
22691      declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
22692      declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22693
22694Overview:
22695"""""""""
22696
22697Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
22698value, returning the result as a scalar.
22699
22700Arguments:
22701""""""""""
22702
22703The first argument is the start value of the reduction, which must be a scalar
22704floating-point type equal to the result type. The second argument is the vector
22705on which the reduction is performed and must be a vector of floating-point
22706values whose element type is the result/start type. The third argument is the
22707vector mask and is a vector of boolean values with the same number of elements
22708as the vector argument. The fourth argument is the explicit vector length of the
22709operation.
22710
22711Semantics:
22712""""""""""
22713
22714The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
22715reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
22716vector argument ``val`` on each enabled lane, adding it to the scalar
22717``start_value``. Disabled lanes are treated as containing the neutral value
22718``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
22719enabled, the resulting value will be equal to ``start_value``.
22720
22721To ignore the start value, the neutral value can be used.
22722
22723See the unpredicated version (:ref:`llvm.vector.reduce.fadd
22724<int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
22725
22726Examples:
22727"""""""""
22728
22729.. code-block:: llvm
22730
22731      %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
22732      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22733      ; are treated as though %mask were false for those lanes.
22734
22735      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
22736      %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
22737
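The two calls below sketch how the start value and the fast-math flags might be
used. The second call assumes that the ``reassoc`` behavior documented for the
unpredicated :ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>` intrinsic
carries over to the predicated form, as the cross-reference above suggests. The
names ``%ordered`` and ``%unordered`` are hypothetical.

.. code-block:: llvm

      ; Ordered reduction that ignores the start value by passing the neutral -0.0.
      %ordered = call float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)

      ; With 'reassoc', the additions may be reassociated (unordered reduction).
      %unordered = call reassoc float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)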
22738
22739.. _int_vp_reduce_mul:
22740
22741'``llvm.vp.reduce.mul.*``' Intrinsics
22742^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22743
22744Syntax:
22745"""""""
22746This is an overloaded intrinsic.
22747
22748::
22749
22750      declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22751      declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22752
22753Overview:
22754"""""""""
22755
22756Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
22757returning the result as a scalar.
22758
22759
22760Arguments:
22761""""""""""
22762
22763The first argument is the start value of the reduction, which must be a scalar
22764integer type equal to the result type. The second argument is the vector on
22765which the reduction is performed and must be a vector of integer values whose
22766element type is the result/start type. The third argument is the vector mask and
22767is a vector of boolean values with the same number of elements as the vector
22768argument. The fourth argument is the explicit vector length of the operation.
22769
22770Semantics:
22771""""""""""
22772
22773The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
22774(:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector argument ``val``
22775on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
22776lanes are treated as containing the neutral value ``1`` (i.e. having no effect
22777on the reduction operation). If the vector length is zero, the result is the
22778start value.
22779
22780To ignore the start value, the neutral value can be used.
22781
22782Examples:
22783"""""""""
22784
22785.. code-block:: llvm
22786
22787      %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22788      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22789      ; are treated as though %mask were false for those lanes.
22790
22791      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
22792      %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
22793      %also.r = mul i32 %reduction, %start
22794
22795.. _int_vp_reduce_fmul:
22796
22797'``llvm.vp.reduce.fmul.*``' Intrinsics
22798^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22799
22800Syntax:
22801"""""""
22802This is an overloaded intrinsic.
22803
22804::
22805
22806      declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
22807      declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22808
22809Overview:
22810"""""""""
22811
22812Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
22813value, returning the result as a scalar.
22814
22815
22816Arguments:
22817""""""""""
22818
22819The first argument is the start value of the reduction, which must be a scalar
22820floating-point type equal to the result type. The second argument is the vector
22821on which the reduction is performed and must be a vector of floating-point
22822values whose element type is the result/start type. The third argument is the
22823vector mask and is a vector of boolean values with the same number of elements
22824as the vector argument. The fourth argument is the explicit vector length of the
22825operation.
22826
22827Semantics:
22828""""""""""
22829
The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
vector argument ``val`` on each enabled lane, multiplying it by the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to ``start_value``.
22836
22837To ignore the start value, the neutral value can be used.
22838
22839See the unpredicated version (:ref:`llvm.vector.reduce.fmul
22840<int_vector_reduce_fmul>`) for more detail on the semantics.
22841
22842Examples:
22843"""""""""
22844
22845.. code-block:: llvm
22846
22847      %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
22848      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22849      ; are treated as though %mask were false for those lanes.
22850
22851      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
22852      %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
22853
22854
22855.. _int_vp_reduce_and:
22856
22857'``llvm.vp.reduce.and.*``' Intrinsics
22858^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22859
22860Syntax:
22861"""""""
22862This is an overloaded intrinsic.
22863
22864::
22865
22866      declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22867      declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22868
22869Overview:
22870"""""""""
22871
22872Predicated integer ``AND`` reduction of a vector and a scalar starting value,
22873returning the result as a scalar.
22874
22875
22876Arguments:
22877""""""""""
22878
22879The first argument is the start value of the reduction, which must be a scalar
22880integer type equal to the result type. The second argument is the vector on
22881which the reduction is performed and must be a vector of integer values whose
22882element type is the result/start type. The third argument is the vector mask and
22883is a vector of boolean values with the same number of elements as the vector
22884argument. The fourth argument is the explicit vector length of the operation.
22885
22886Semantics:
22887""""""""""
22888
The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
(:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector argument
``val`` on each enabled lane, performing an '``and``' of that with the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
operation). If the vector length is zero, the result is the start value.
22895
22896To ignore the start value, the neutral value can be used.
22897
22898Examples:
22899"""""""""
22900
22901.. code-block:: llvm
22902
22903      %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22904      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22905      ; are treated as though %mask were false for those lanes.
22906
22907      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
22908      %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
22909      %also.r = and i32 %reduction, %start
22910
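Because ``i1`` is an integer type, the same reduction can be sketched on a
vector of booleans to test whether a condition holds on every enabled lane. The
``.v4i1`` suffix is assumed to follow the usual overload mangling, and
``%cond`` and ``%all`` are hypothetical names.

.. code-block:: llvm

      ; The start value 'true' is the neutral value (-1) for an i1 AND.
      ; %all is true only if every enabled lane of %cond is true.
      %all = call i1 @llvm.vp.reduce.and.v4i1(i1 true, <4 x i1> %cond, <4 x i1> %mask, i32 %evl)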
22911
22912.. _int_vp_reduce_or:
22913
22914'``llvm.vp.reduce.or.*``' Intrinsics
22915^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22916
22917Syntax:
22918"""""""
22919This is an overloaded intrinsic.
22920
22921::
22922
22923      declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22924      declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22925
22926Overview:
22927"""""""""
22928
22929Predicated integer ``OR`` reduction of a vector and a scalar starting value,
22930returning the result as a scalar.
22931
22932
22933Arguments:
22934""""""""""
22935
22936The first argument is the start value of the reduction, which must be a scalar
22937integer type equal to the result type. The second argument is the vector on
22938which the reduction is performed and must be a vector of integer values whose
22939element type is the result/start type. The third argument is the vector mask and
22940is a vector of boolean values with the same number of elements as the vector
22941argument. The fourth argument is the explicit vector length of the operation.
22942
22943Semantics:
22944""""""""""
22945
22946The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
22947(:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector argument
22948``val`` on each enabled lane, performing an '``or``' of that with the scalar
22949``start_value``. Disabled lanes are treated as containing the neutral value
22950``0`` (i.e. having no effect on the reduction operation). If the vector length
22951is zero, the result is the start value.
22952
22953To ignore the start value, the neutral value can be used.
22954
22955Examples:
22956"""""""""
22957
22958.. code-block:: llvm
22959
22960      %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22961      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22962      ; are treated as though %mask were false for those lanes.
22963
22964      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
22965      %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
22966      %also.r = or i32 %reduction, %start
22967
22968.. _int_vp_reduce_xor:
22969
22970'``llvm.vp.reduce.xor.*``' Intrinsics
22971^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22972
22973Syntax:
22974"""""""
22975This is an overloaded intrinsic.
22976
22977::
22978
22979      declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22980      declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22981
22982Overview:
22983"""""""""
22984
22985Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
22986returning the result as a scalar.
22987
22988
22989Arguments:
22990""""""""""
22991
22992The first argument is the start value of the reduction, which must be a scalar
22993integer type equal to the result type. The second argument is the vector on
22994which the reduction is performed and must be a vector of integer values whose
22995element type is the result/start type. The third argument is the vector mask and
22996is a vector of boolean values with the same number of elements as the vector
22997argument. The fourth argument is the explicit vector length of the operation.
22998
22999Semantics:
23000""""""""""
23001
23002The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
23003(:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector argument
23004``val`` on each enabled lane, performing an '``xor``' of that with the scalar
23005``start_value``. Disabled lanes are treated as containing the neutral value
23006``0`` (i.e. having no effect on the reduction operation). If the vector length
23007is zero, the result is the start value.
23008
23009To ignore the start value, the neutral value can be used.
23010
23011Examples:
23012"""""""""
23013
23014.. code-block:: llvm
23015
23016      %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
23017      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23018      ; are treated as though %mask were false for those lanes.
23019
23020      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
23021      %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
23022      %also.r = xor i32 %reduction, %start
23023
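As a further illustration, with an all-true mask, a vector length equal to the
number of elements, and the neutral start value, the predicated reduction
should reduce to the unpredicated one. This is a sketch under those stated
assumptions.

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 0, <4 x i32> %a, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
      ; All four lanes are enabled and the start value 0 is neutral, so %r is
      ; equivalent to %also.r.

      %also.r = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)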
23024
23025.. _int_vp_reduce_smax:
23026
23027'``llvm.vp.reduce.smax.*``' Intrinsics
23028^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23029
23030Syntax:
23031"""""""
23032This is an overloaded intrinsic.
23033
23034::
23035
23036      declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
23037      declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23038
23039Overview:
23040"""""""""
23041
23042Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
23043value, returning the result as a scalar.
23044
23045
23046Arguments:
23047""""""""""
23048
23049The first argument is the start value of the reduction, which must be a scalar
23050integer type equal to the result type. The second argument is the vector on
23051which the reduction is performed and must be a vector of integer values whose
23052element type is the result/start type. The third argument is the vector mask and
23053is a vector of boolean values with the same number of elements as the vector
23054argument. The fourth argument is the explicit vector length of the operation.
23055
23056Semantics:
23057""""""""""
23058
The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
vector argument ``val`` on each enabled lane, taking the maximum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
If the vector length is zero, the result is the start value.
23065
23066To ignore the start value, the neutral value can be used.
23067
23068Examples:
23069"""""""""
23070
23071.. code-block:: llvm
23072
23073      %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
23074      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23075      ; are treated as though %mask were false for those lanes.
23076
23077      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
23078      %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
23079      %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
23080
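The scalable overload listed under Syntax can be used in the same way. The
sketch below passes ``-32768``, the ``INT_MIN`` value for ``i16``, as the start
value so that only the enabled lanes of ``%a`` contribute to the result.

.. code-block:: llvm

      ; i16 -32768 is the neutral value for a signed i16 MAX reduction.
      %r = call i16 @llvm.vp.reduce.smax.nxv8i16(i16 -32768, <vscale x 8 x i16> %a, <vscale x 8 x i1> %mask, i32 %evl)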
23081
23082.. _int_vp_reduce_smin:
23083
23084'``llvm.vp.reduce.smin.*``' Intrinsics
23085^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23086
23087Syntax:
23088"""""""
23089This is an overloaded intrinsic.
23090
23091::
23092
23093      declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
23094      declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23095
23096Overview:
23097"""""""""
23098
23099Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
23100value, returning the result as a scalar.
23101
23102
23103Arguments:
23104""""""""""
23105
23106The first argument is the start value of the reduction, which must be a scalar
23107integer type equal to the result type. The second argument is the vector on
23108which the reduction is performed and must be a vector of integer values whose
23109element type is the result/start type. The third argument is the vector mask and
23110is a vector of boolean values with the same number of elements as the vector
23111argument. The fourth argument is the explicit vector length of the operation.
23112
23113Semantics:
23114""""""""""
23115
The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector argument ``val`` on each enabled lane, taking the minimum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
If the vector length is zero, the result is the start value.
23122
23123To ignore the start value, the neutral value can be used.
23124
23125Examples:
23126"""""""""
23127
23128.. code-block:: llvm
23129
23130      %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
23131      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23132      ; are treated as though %mask were false for those lanes.
23133
23134      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
23135      %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
23136      %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
23137
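The zero-length case described under Semantics can be illustrated with a
constant vector length. This is only a sketch of that rule.

.. code-block:: llvm

      %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 0)
      ; No lane is enabled, so %r is equal to %start regardless of %a and %mask.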
23138
23139.. _int_vp_reduce_umax:
23140
23141'``llvm.vp.reduce.umax.*``' Intrinsics
23142^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23143
23144Syntax:
23145"""""""
23146This is an overloaded intrinsic.
23147
23148::
23149
23150      declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
23151      declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23152
23153Overview:
23154"""""""""
23155
23156Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
23157value, returning the result as a scalar.
23158
23159
23160Arguments:
23161""""""""""
23162
23163The first argument is the start value of the reduction, which must be a scalar
23164integer type equal to the result type. The second argument is the vector on
23165which the reduction is performed and must be a vector of integer values whose
23166element type is the result/start type. The third argument is the vector mask and
23167is a vector of boolean values with the same number of elements as the vector
23168argument. The fourth argument is the explicit vector length of the operation.
23169
23170Semantics:
23171""""""""""
23172
The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector argument ``val`` on each enabled lane, taking the maximum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
neutral value ``0`` (i.e. having no effect on the reduction operation). If the
vector length is zero, the result is the start value.
23179
23180To ignore the start value, the neutral value can be used.
23181
23182Examples:
23183"""""""""
23184
23185.. code-block:: llvm
23186
23187      %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
23188      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23189      ; are treated as though %mask were false for those lanes.
23190
23191      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
23192      %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
23193      %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
23194
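One use of the start value is to carry a running result across several
reductions. The sketch below chains two calls. The names ``%chunk0``,
``%chunk1``, ``%mask0``, ``%mask1``, ``%evl0``, and ``%evl1`` are hypothetical
inputs, for example from two iterations of a strip-mined loop.

.. code-block:: llvm

      ; Reduce the first chunk, starting from the neutral value 0.
      %m0 = call i32 @llvm.vp.reduce.umax.v4i32(i32 0, <4 x i32> %chunk0, <4 x i1> %mask0, i32 %evl0)
      ; Fold the second chunk into the running maximum.
      %m1 = call i32 @llvm.vp.reduce.umax.v4i32(i32 %m0, <4 x i32> %chunk1, <4 x i1> %mask1, i32 %evl1)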
23195
23196.. _int_vp_reduce_umin:
23197
23198'``llvm.vp.reduce.umin.*``' Intrinsics
23199^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23200
23201Syntax:
23202"""""""
23203This is an overloaded intrinsic.
23204
23205::
23206
23207      declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
23208      declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23209
23210Overview:
23211"""""""""
23212
23213Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
23214value, returning the result as a scalar.
23215
23216
23217Arguments:
23218""""""""""
23219
23220The first argument is the start value of the reduction, which must be a scalar
23221integer type equal to the result type. The second argument is the vector on
23222which the reduction is performed and must be a vector of integer values whose
23223element type is the result/start type. The third argument is the vector mask and
23224is a vector of boolean values with the same number of elements as the vector
23225argument. The fourth argument is the explicit vector length of the operation.
23226
23227Semantics:
23228""""""""""
23229
23230The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
23231reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
23232vector argument ``val`` on each enabled lane, taking the minimum of that and the
23233scalar ``start_value``. Disabled lanes are treated as containing the neutral
23234value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
23235operation). If the vector length is zero, the result is the start value.
23236
23237To ignore the start value, the neutral value can be used.
23238
23239Examples:
23240"""""""""
23241
23242.. code-block:: llvm
23243
23244      %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
23245      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23246      ; are treated as though %mask were false for those lanes.
23247
23248      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
23249      %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
23250      %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
23251
23252
23253.. _int_vp_reduce_fmax:
23254
23255'``llvm.vp.reduce.fmax.*``' Intrinsics
23256^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23257
23258Syntax:
23259"""""""
23260This is an overloaded intrinsic.
23261
23262::
23263
23264      declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23265      declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23266
23267Overview:
23268"""""""""
23269
23270Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
23271value, returning the result as a scalar.
23272
23273
23274Arguments:
23275""""""""""
23276
23277The first argument is the start value of the reduction, which must be a scalar
23278floating-point type equal to the result type. The second argument is the vector
23279on which the reduction is performed and must be a vector of floating-point
23280values whose element type is the result/start type. The third argument is the
23281vector mask and is a vector of boolean values with the same number of elements
23282as the vector argument. The fourth argument is the explicit vector length of the
23283operation.
23284
23285Semantics:
23286""""""""""
23287
23288The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
23289reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
23290vector argument ``val`` on each enabled lane, taking the maximum of that and the
23291scalar ``start_value``. Disabled lanes are treated as containing the neutral
23292value (i.e. having no effect on the reduction operation). If the vector length
23293is zero, the result is the start value.
23294
The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set, the neutral value is ``-QNAN``. If only ``nnan`` is set, the
neutral value is ``-Infinity``. If ``nnan`` and ``ninf`` are both set, the
neutral value is the smallest (most negative) finite floating-point value for
the result type.
23299
23300This instruction has the same comparison semantics as the
23301:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
23302'``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
23303unless all elements of the vector and the starting value are ``NaN``. For a
23304vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
23305``-0.0`` elements, the sign of the result is unspecified.
23306
23307To ignore the start value, the neutral value can be used.
23308
23309Examples:
23310"""""""""
23311
23312.. code-block:: llvm
23313
      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      ; 0xFFF8000000000000 is the bit pattern of -QNAN.
      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 0xFFF8000000000000, float 0xFFF8000000000000, float 0xFFF8000000000000, float 0xFFF8000000000000>
      %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
      %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
23321
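When only ``nnan`` is set, the neutral value described above is ``-Infinity``.
The following sketch repeats the expansion for that case. It assumes the flag
is placed on every call involved and is not meant as a definitive lowering.

.. code-block:: llvm

      %r = call nnan float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      ; 0xFFF0000000000000 is the bit pattern of -Infinity.
      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 0xFFF0000000000000, float 0xFFF0000000000000, float 0xFFF0000000000000, float 0xFFF0000000000000>
      %reduction = call nnan float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
      %also.r = call nnan float @llvm.maxnum.f32(float %reduction, float %start)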
23322
23323.. _int_vp_reduce_fmin:
23324
23325'``llvm.vp.reduce.fmin.*``' Intrinsics
23326^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23327
23328Syntax:
23329"""""""
23330This is an overloaded intrinsic.
23331
23332::
23333
23334      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23335      declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23336
23337Overview:
23338"""""""""
23339
23340Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
23341value, returning the result as a scalar.
23342
23343
23344Arguments:
23345""""""""""
23346
23347The first argument is the start value of the reduction, which must be a scalar
23348floating-point type equal to the result type. The second argument is the vector
23349on which the reduction is performed and must be a vector of floating-point
23350values whose element type is the result/start type. The third argument is the
23351vector mask and is a vector of boolean values with the same number of elements
23352as the vector argument. The fourth argument is the explicit vector length of the
23353operation.
23354
23355Semantics:
23356""""""""""
23357
23358The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
23359reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
23360vector argument ``val`` on each enabled lane, taking the minimum of that and the
23361scalar ``start_value``. Disabled lanes are treated as containing the neutral
23362value (i.e. having no effect on the reduction operation). If the vector length
23363is zero, the result is the start value.
23364
The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set, the neutral value is ``+QNAN``. If only ``nnan`` is set, the
neutral value is ``+Infinity``. If ``nnan`` and ``ninf`` are both set, the
neutral value is the largest finite floating-point value for the result type.
23369
This intrinsic has the same comparison semantics as the
:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
'``llvm.minnum.*``' intrinsic). That is, the result will always be a number
unless all elements of the vector and the starting value are ``NaN``. For a
vector with minimum element magnitude ``0.0`` and containing both ``+0.0`` and
``-0.0`` elements, the sign of the result is unspecified.
23376
23377To ignore the start value, the neutral value can be used.
23378
23379Examples:
23380"""""""""
23381
23382.. code-block:: llvm
23383
23384      %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
23385      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23386      ; are treated as though %mask were false for those lanes.
23387
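      ; 0x7FF8000000000000 is the bit pattern of +QNAN.
      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 0x7FF8000000000000, float 0x7FF8000000000000, float 0x7FF8000000000000, float 0x7FF8000000000000>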
23389      %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
23390      %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
23391
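When both ``nnan`` and ``ninf`` are set, the neutral value described above is
the largest finite value of the result type. The sketch below passes that value
for ``float`` as the start value so that only the enabled lanes of ``%a``
contribute to the result.

.. code-block:: llvm

      ; 0x47EFFFFFE0000000 is the largest finite float (FLT_MAX).
      %r = call nnan ninf float @llvm.vp.reduce.fmin.v4f32(float 0x47EFFFFFE0000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)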
23392
23393.. _int_vp_reduce_fmaximum:
23394
23395'``llvm.vp.reduce.fmaximum.*``' Intrinsics
23396^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23397
23398Syntax:
23399"""""""
23400This is an overloaded intrinsic.
23401
23402::
23403
23404      declare float @llvm.vp.reduce.fmaximum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23405      declare double @llvm.vp.reduce.fmaximum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23406
23407Overview:
23408"""""""""
23409
23410Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
23411value, returning the result as a scalar.
23412
23413
23414Arguments:
23415""""""""""
23416
23417The first argument is the start value of the reduction, which must be a scalar
23418floating-point type equal to the result type. The second argument is the vector
23419on which the reduction is performed and must be a vector of floating-point
23420values whose element type is the result/start type. The third argument is the
23421vector mask and is a vector of boolean values with the same number of elements
23422as the vector argument. The fourth argument is the explicit vector length of the
23423operation.
23424
23425Semantics:
23426""""""""""
23427
23428The '``llvm.vp.reduce.fmaximum``' intrinsic performs the floating-point ``MAX``
23429reduction (:ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>`) of
23430the vector argument ``val`` on each enabled lane, taking the maximum of that and
23431the scalar ``start_value``. Disabled lanes are treated as containing the
23432neutral value (i.e. having no effect on the reduction operation). If the vector
23433length is zero, the result is the start value.
23434
23435The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set or only ``nnan`` is set, the neutral value is ``-Infinity``.
23437If ``ninf`` is set, then the neutral value is the smallest floating-point value
23438for the result type.
23439
23440This instruction has the same comparison semantics as the
23441:ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>` intrinsic (and
23442thus the '``llvm.maximum.*``' intrinsic). That is, the result will always be a
23443number unless any of the elements in the vector or the starting value is
23444``NaN``. Namely, this intrinsic propagates ``NaN``. Also, -0.0 is considered
23445less than +0.0.
23446
23447To ignore the start value, the neutral value can be used.
23448
23449Examples:
23450"""""""""
23451
23452.. code-block:: llvm
23453
      %r = call float @llvm.vp.reduce.fmaximum.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
23455      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23456      ; are treated as though %mask were false for those lanes.
23457
23458      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
23459      %reduction = call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %masked.a)
23460      %also.r = call float @llvm.maximum.f32(float %reduction, float %start)
23461
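The following sketch shows one way to ignore the start value: it passes the
neutral value as ``start_value``. It assumes that no fast-math flags are present
on the call, so the neutral value is ``-Infinity``, spelled here in hexadecimal
as ``0xFFF0000000000000``. The function name ``@reduce_fmaximum_no_start`` is
hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmaximum.v4f32(float, <4 x float>, <4 x i1>, i32)

      ; Hypothetical wrapper: reduce only the enabled lanes of %a.
      define float @reduce_fmaximum_no_start(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; 0xFFF0000000000000 is -Infinity, the neutral value when neither
        ; nnan nor ninf is set. A NaN in an enabled lane still propagates.
        %r = call float @llvm.vp.reduce.fmaximum.v4f32(float 0xFFF0000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }
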
23462
23463.. _int_vp_reduce_fminimum:
23464
23465'``llvm.vp.reduce.fminimum.*``' Intrinsics
23466^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23467
23468Syntax:
23469"""""""
23470This is an overloaded intrinsic.
23471
23472::
23473
23474      declare float @llvm.vp.reduce.fminimum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23475      declare double @llvm.vp.reduce.fminimum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23476
23477Overview:
23478"""""""""
23479
23480Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
23481value, returning the result as a scalar.
23482
23483
23484Arguments:
23485""""""""""
23486
23487The first argument is the start value of the reduction, which must be a scalar
23488floating-point type equal to the result type. The second argument is the vector
23489on which the reduction is performed and must be a vector of floating-point
23490values whose element type is the result/start type. The third argument is the
23491vector mask and is a vector of boolean values with the same number of elements
23492as the vector argument. The fourth argument is the explicit vector length of the
23493operation.
23494
23495Semantics:
23496""""""""""
23497
23498The '``llvm.vp.reduce.fminimum``' intrinsic performs the floating-point ``MIN``
23499reduction (:ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>`) of
23500the vector argument ``val`` on each enabled lane, taking the minimum of that and
23501the scalar ``start_value``. Disabled lanes are treated as containing the neutral
23502value (i.e. having no effect on the reduction operation). If the vector length
23503is zero, the result is the start value.
23504
23505The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set or only ``nnan`` is set, the neutral value is ``+Infinity``.
23507If ``ninf`` is set, then the neutral value is the largest floating-point value
23508for the result type.
23509
23510This instruction has the same comparison semantics as the
23511:ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>` intrinsic (and
23512thus the '``llvm.minimum.*``' intrinsic). That is, the result will always be a
23513number unless any of the elements in the vector or the starting value is
23514``NaN``. Namely, this intrinsic propagates ``NaN``. Also, -0.0 is considered
23515less than +0.0.
23516
23517To ignore the start value, the neutral value can be used.
23518
23519Examples:
23520"""""""""
23521
23522.. code-block:: llvm
23523
23524      %r = call float @llvm.vp.reduce.fminimum.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
23525      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23526      ; are treated as though %mask were false for those lanes.
23527
23528      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float infinity, float infinity, float infinity, float infinity>
23529      %reduction = call float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %masked.a)
23530      %also.r = call float @llvm.minimum.f32(float %reduction, float %start)
23531
23532
23533.. _int_get_active_lane_mask:
23534
23535'``llvm.get.active.lane.mask.*``' Intrinsics
23536^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23537
23538Syntax:
23539"""""""
23540This is an overloaded intrinsic.
23541
23542::
23543
23544      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
23545      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
23546      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
23547      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
23548
23549
23550Overview:
23551"""""""""
23552
23553Create a mask representing active and inactive vector lanes.
23554
23555
23556Arguments:
23557""""""""""
23558
23559Both arguments have the same scalar integer type. The result is a vector with
23560the i1 element type.
23561
23562Semantics:
23563""""""""""
23564
23565The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
23566to:
23567
23568::
23569
23570      %m[i] = icmp ult (%base + i), %n
23571
23572where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
23575the unsigned less-than comparison operator.  Overflow cannot occur in
23576``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
23577numbers and not in machine numbers.  If ``%n`` is ``0``, then the result is a
23578poison value. The above is equivalent to:
23579
23580::
23581
23582      %m = @llvm.get.active.lane.mask(%base, %n)
23583
23584This can, for example, be emitted by the loop vectorizer in which case
23585``%base`` is the first element of the vector induction variable (VIV) and
23586``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
23587less than comparison of VIV with the loop tripcount, producing a mask of
23588true/false values representing active/inactive vector lanes, except if the VIV
23589overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what type the step vector would need in order to enumerate its
lanes without overflow.
23593
23594This mask ``%m`` can e.g. be used in masked load/store instructions. These
23595intrinsics provide a hint to the backend. I.e., for a vector loop, the
23596back-edge taken count of the original scalar loop is explicit as the second
23597argument.
23598
23599
23600Examples:
23601"""""""""
23602
23603.. code-block:: llvm
23604
23605      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> poison)
23607
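As a worked illustration of the comparison ``icmp ult (%base + i), %n``, the
sketch below uses constant arguments so that the resulting masks can be read off
directly. The function names are hypothetical. The second function illustrates
that ``%base + i`` does not wrap around.

.. code-block:: llvm

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64, i64)
      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32, i32)

      ; Lanes compare 6, 7, 8, 9 against 8.
      define <4 x i1> @tail_mask() {
        ; Result: <true, true, false, false>
        %m = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 6, i64 8)
        ret <4 x i1> %m
      }

      ; i32 -2 and i32 -1 are 4294967294 and 4294967295 when read as unsigned.
      ; Lanes compare 4294967294, 4294967295, 4294967296, 4294967297
      ; against 4294967295 without wrapping.
      define <4 x i1> @near_overflow_mask() {
        ; Result: <true, false, false, false>
        %m = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 -2, i32 -1)
        ret <4 x i1> %m
      }
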
23608
23609.. _int_experimental_vp_splice:
23610
23611'``llvm.experimental.vp.splice``' Intrinsic
23612^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23613
23614Syntax:
23615"""""""
23616This is an overloaded intrinsic.
23617
23618::
23619
23620      declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
23621      declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
23622
23623Overview:
23624"""""""""
23625
23626The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
23627predicated version of the '``llvm.vector.splice.*``' intrinsic.
23628
23629Arguments:
23630""""""""""
23631
23632The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
23633the same type.  The third argument ``imm`` is an immediate signed integer that
23634indicates the offset index.  The fourth argument ``mask`` is a vector mask and
23635has the same number of elements as the result.  The last two arguments ``evl1``
23636and ``evl2`` are unsigned integers indicating the explicit vector lengths of
23637``vec1`` and ``vec2`` respectively.  ``imm``, ``evl1`` and ``evl2`` should
23638respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
23639and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
23640constraints are not satisfied the intrinsic has undefined behavior.
23641
23642Semantics:
23643""""""""""
23644
23645Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
23646``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
23647window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
23648the concatenated vector. Elements in the result vector beyond ``evl2`` are
23649``undef``.  If ``imm`` is negative the starting index is ``evl1 + imm``.  The result
23650vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
23651negative ``imm``) elements from indices ``[imm..evl1 - 1]``
23652(``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
23653first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
23654``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
23655elements are considered and the remaining are ``undef``.  The lanes in the result
23656vector disabled by ``mask`` are ``poison``.
23657
23658Examples:
23659"""""""""
23660
23661.. code-block:: text
23662
23663 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3);  ==> <B, E, F, poison> index
23664 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2); ==> <B, C, poison, poison> trailing elements
23665
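The first example above can be sketched as a complete call. The function name
``@splice_example`` and the choice of ``<4 x i32>`` operands are assumptions
made for illustration. The mask enables all lanes.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32>, <4 x i32>, i32, <4 x i1>, i32, i32)

      define <4 x i32> @splice_example(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; imm = 1, evl1 = 2, evl2 = 3.
        ; The concatenation of vec1[0..1] and vec2[0..2] is read from index 1,
        ; so lanes 0..2 of the result are vec1[1], vec2[0], vec2[1].
        ; Lane 3 lies beyond evl2 and carries no defined value.
        %r = call <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2, i32 3)
        ret <4 x i32> %r
      }
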
23666
23667.. _int_experimental_vp_splat:
23668
23669
23670'``llvm.experimental.vp.splat``' Intrinsic
23671^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23672
23673Syntax:
23674"""""""
23675This is an overloaded intrinsic.
23676
23677::
23678
23679      declare <2 x double> @llvm.experimental.vp.splat.v2f64(double %scalar, <2 x i1> %mask, i32 %evl)
23680      declare <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32 %scalar, <vscale x 4 x i1> %mask, i32 %evl)
23681
23682Overview:
23683"""""""""
23684
The '``llvm.experimental.vp.splat.*``' intrinsic creates a predicated splat
with a specific effective vector length.
23687
23688Arguments:
23689""""""""""
23690
23691The result is a vector and it is a splat of the first scalar argument. The
23692second argument ``mask`` is a vector mask and has the same number of elements as
23693the result. The third argument is the explicit vector length of the operation.
23694
23695Semantics:
23696""""""""""
23697
23698This intrinsic splats a vector with ``evl`` elements of a scalar argument.
23699The lanes in the result vector disabled by ``mask`` are ``poison``. The
23700elements past ``evl`` are poison.
23701
23702Examples:
23703"""""""""
23704
23705.. code-block:: llvm
23706
      %r = call <4 x float> @llvm.experimental.vp.splat.v4f32(float %a, <4 x i1> %mask, i32 %evl)
23708      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23709      %e = insertelement <4 x float> poison, float %a, i32 0
23710      %s = shufflevector <4 x float> %e, <4 x float> poison, <4 x i32> zeroinitializer
23711      %also.r = select <4 x i1> %mask, <4 x float> %s, <4 x float> poison
23712
23713
23714.. _int_experimental_vp_reverse:
23715
23716
23717'``llvm.experimental.vp.reverse``' Intrinsic
23718^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23719
23720Syntax:
23721"""""""
23722This is an overloaded intrinsic.
23723
23724::
23725
23726      declare <2 x double> @llvm.experimental.vp.reverse.v2f64(<2 x double> %vec, <2 x i1> %mask, i32 %evl)
23727      declare <vscale x 4 x i32> @llvm.experimental.vp.reverse.nxv4i32(<vscale x 4 x i32> %vec, <vscale x 4 x i1> %mask, i32 %evl)
23728
23729Overview:
23730"""""""""
23731
23732The '``llvm.experimental.vp.reverse.*``' intrinsic is the vector length
23733predicated version of the '``llvm.vector.reverse.*``' intrinsic.
23734
23735Arguments:
23736""""""""""
23737
23738The result and the first argument ``vec`` are vectors with the same type.
23739The second argument ``mask`` is a vector mask and has the same number of
23740elements as the result. The third argument is the explicit vector length of
23741the operation.
23742
23743Semantics:
23744""""""""""
23745
23746This intrinsic reverses the order of the first ``evl`` elements in a vector.
23747The lanes in the result vector disabled by ``mask`` are ``poison``. The
23748elements past ``evl`` are poison.
23749
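Examples:
"""""""""

A minimal sketch of the semantics described above, assuming a ``<4 x i32>``
operand, an all-true mask and a constant vector length of 3. The function name
``@reverse_first_three`` is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32>, <4 x i1>, i32)

      define <4 x i32> @reverse_first_three(<4 x i32> %vec) {
        ; With %vec = <A, B, C, D> and evl = 3, the first three elements are
        ; reversed and the lane past evl is poison: <C, B, A, poison>
        %r = call <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32> %vec, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ret <4 x i32> %r
      }
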
23750.. _int_vp_load:
23751
23752'``llvm.vp.load``' Intrinsic
23753^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23754
23755Syntax:
23756"""""""
23757This is an overloaded intrinsic.
23758
23759::
23760
23761    declare <4 x float> @llvm.vp.load.v4f32.p0(ptr %ptr, <4 x i1> %mask, i32 %evl)
23762    declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0(ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
23763    declare <8 x float> @llvm.vp.load.v8f32.p1(ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
23764    declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6(ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
23765
23766Overview:
23767"""""""""
23768
23769The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of
23770the :ref:`llvm.masked.load <int_mload>` intrinsic.
23771
23772Arguments:
23773""""""""""
23774
23775The first argument is the base pointer for the load. The second argument is a
23776vector of boolean values with the same number of elements as the return type.
23777The third is the explicit vector length of the operation. The return type and
23778underlying type of the base pointer are the same vector types.
23779
23780The :ref:`align <attr_align>` parameter attribute can be provided for the first
23781argument.
23782
23783Semantics:
23784""""""""""
23785
23786The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as
23787the '``llvm.masked.load``' intrinsic, where the mask is taken from the
23788combination of the '``mask``' and '``evl``' arguments in the usual VP way.
23789Certain '``llvm.masked.load``' arguments do not have corresponding arguments in
23790'``llvm.vp.load``': the '``passthru``' argument is implicitly ``poison``; the
23791'``alignment``' argument is taken as the ``align`` parameter attribute, if
23792provided. The default alignment is taken as the ABI alignment of the return
23793type as specified by the :ref:`datalayout string<langref_datalayout>`.
23794
23795Examples:
23796"""""""""
23797
23798.. code-block:: text
23799
23800     %r = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
23801     ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23802
23803     %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0(ptr %ptr, i32 2, <8 x i1> %mask, <8 x i8> poison)
23804
23805
23806.. _int_vp_store:
23807
23808'``llvm.vp.store``' Intrinsic
23809^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23810
23811Syntax:
23812"""""""
23813This is an overloaded intrinsic.
23814
23815::
23816
23817    declare void @llvm.vp.store.v4f32.p0(<4 x float> %val, ptr %ptr, <4 x i1> %mask, i32 %evl)
23818    declare void @llvm.vp.store.nxv2i16.p0(<vscale x 2 x i16> %val, ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
23819    declare void @llvm.vp.store.v8f32.p1(<8 x float> %val, ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
23820    declare void @llvm.vp.store.nxv1i64.p6(<vscale x 1 x i64> %val, ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
23821
23822Overview:
23823"""""""""
23824
23825The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
23826the :ref:`llvm.masked.store <int_mstore>` intrinsic.
23827
23828Arguments:
23829""""""""""
23830
23831The first argument is the vector value to be written to memory. The second
23832argument is the base pointer for the store. It has the same underlying type as
the value argument. The third argument is a vector of boolean values with the
same number of elements as the value argument. The fourth is the explicit vector
23835length of the operation.
23836
23837The :ref:`align <attr_align>` parameter attribute can be provided for the
23838second argument.
23839
23840Semantics:
23841""""""""""
23842
The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
23844the '``llvm.masked.store``' intrinsic, where the mask is taken from the
23845combination of the '``mask``' and '``evl``' arguments in the usual VP way. The
23846alignment of the operation (corresponding to the '``alignment``' argument of
23847'``llvm.masked.store``') is specified by the ``align`` parameter attribute (see
23848above). If it is not provided then the ABI alignment of the type of the
23849'``value``' argument as specified by the :ref:`datalayout
23850string<langref_datalayout>` is used instead.
23851
23852Examples:
23853"""""""""
23854
23855.. code-block:: text
23856
23857     call void @llvm.vp.store.v8i8.p0(<8 x i8> %val, ptr align 4 %ptr, <8 x i1> %mask, i32 %evl)
23858     ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
23859
23860     call void @llvm.masked.store.v8i8.p0(<8 x i8> %val, ptr %ptr, i32 4, <8 x i1> %mask)
23861
23862
23863.. _int_experimental_vp_strided_load:
23864
23865'``llvm.experimental.vp.strided.load``' Intrinsic
23866^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23867
23868Syntax:
23869"""""""
23870This is an overloaded intrinsic.
23871
23872::
23873
23874    declare <4 x float> @llvm.experimental.vp.strided.load.v4f32.i64(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
23875    declare <vscale x 2 x i16> @llvm.experimental.vp.strided.load.nxv2i16.i64(ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
23876
23877Overview:
23878"""""""""
23879
23880The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, scalar values from
23881memory locations evenly spaced apart by '``stride``' number of bytes, starting from '``ptr``'.
23882
23883Arguments:
23884""""""""""
23885
23886The first argument is the base pointer for the load. The second argument is the stride
23887value expressed in bytes. The third argument is a vector of boolean values
23888with the same number of elements as the return type. The fourth is the explicit
23889vector length of the operation. The base pointer underlying type matches the type of the scalar
23890elements of the return argument.
23891
23892The :ref:`align <attr_align>` parameter attribute can be provided for the first
23893argument.
23894
23895Semantics:
23896""""""""""
23897
23898The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, multiple scalar
23899values from memory in the same way as the :ref:`llvm.vp.gather <int_vp_gather>` intrinsic,
23900where the vector of pointers is in the form:
23901
23902   ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
23903
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always
interpreted as a signed integer, and all arithmetic occurring in the pointer type.
23906
23907Examples:
23908"""""""""
23909
23910.. code-block:: text
23911
     %r = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.i64(ptr %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
     ;; The operation can also be expressed like this:

     ;; Create a vector of pointers %ptrs in the form:
     ;; %ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ...>
     %also.r = call <8 x i64> @llvm.vp.gather.v8i64.v8p0(<8 x ptr> %ptrs, <8 x i1> %mask, i32 %evl)
23920
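The vector of pointers described above can be materialized explicitly. The
following is a sketch under stated assumptions, not the lowering used by any
particular backend: it uses a fixed ``<4 x i32>`` result, byte-wise
``getelementptr`` arithmetic on ``i8``, and the hypothetical function name
``@strided_load_as_gather``. The ``getelementptr`` index is treated as signed,
matching the signed interpretation of ``stride``.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.gather.v4i32.v4p0(<4 x ptr>, <4 x i1>, i32)

      define <4 x i32> @strided_load_as_gather(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl) {
        ; Broadcast the stride to every lane.
        %s.ins = insertelement <4 x i64> poison, i64 %stride, i64 0
        %s.splat = shufflevector <4 x i64> %s.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        ; Byte offsets per lane: <0, stride, 2 * stride, 3 * stride>
        %offsets = mul <4 x i64> %s.splat, <i64 0, i64 1, i64 2, i64 3>
        ; A scalar base with a vector index yields a vector of pointers.
        %ptrs = getelementptr i8, ptr %ptr, <4 x i64> %offsets
        %r = call <4 x i32> @llvm.vp.gather.v4i32.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }
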
23921
23922.. _int_experimental_vp_strided_store:
23923
23924'``llvm.experimental.vp.strided.store``' Intrinsic
23925^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23926
23927Syntax:
23928"""""""
23929This is an overloaded intrinsic.
23930
23931::
23932
23933    declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %val, ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
23934    declare void @llvm.experimental.vp.strided.store.nxv2i16.i64(<vscale x 2 x i16> %val, ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
23935
23936Overview:
23937"""""""""
23938
The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
23940'``val``' into memory locations evenly spaced apart by '``stride``' number of
23941bytes, starting from '``ptr``'.
23942
23943Arguments:
23944""""""""""
23945
23946The first argument is the vector value to be written to memory. The second
23947argument is the base pointer for the store. Its underlying type matches the
23948scalar element type of the value argument. The third argument is the stride value
expressed in bytes. The fourth argument is a vector of boolean values with the
same number of elements as the value argument. The fifth is the explicit vector
23951length of the operation.
23952
23953The :ref:`align <attr_align>` parameter attribute can be provided for the
23954second argument.
23955
23956Semantics:
23957""""""""""
23958
23959The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
23960'``val``' in the same way as the :ref:`llvm.vp.scatter <int_vp_scatter>` intrinsic,
23961where the vector of pointers is in the form:
23962
23963	``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
23964
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always
interpreted as a signed integer, and all arithmetic occurring in the pointer type.
23967
23968Examples:
23969"""""""""
23970
23971.. code-block:: text
23972
     call void @llvm.experimental.vp.strided.store.v8i64.i64(<8 x i64> %val, ptr %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
     ;; The operation can also be expressed like this:

     ;; Create a vector of pointers %ptrs in the form:
     ;; %ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ...>
     call void @llvm.vp.scatter.v8i64.v8p0(<8 x i64> %val, <8 x ptr> %ptrs, <8 x i1> %mask, i32 %evl)
23981
23982
23983.. _int_vp_gather:
23984
23985'``llvm.vp.gather``' Intrinsic
23986^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23987
23988Syntax:
23989"""""""
23990This is an overloaded intrinsic.
23991
23992::
23993
23994    declare <4 x double> @llvm.vp.gather.v4f64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
23995    declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
23996    declare <2 x float> @llvm.vp.gather.v2f32.v2p2(<2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
23997    declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4(<vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
23998
23999Overview:
24000"""""""""
24001
24002The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
24003the :ref:`llvm.masked.gather <int_mgather>` intrinsic.
24004
24005Arguments:
24006""""""""""
24007
24008The first argument is a vector of pointers which holds all memory addresses to
24009read. The second argument is a vector of boolean values with the same number of
24010elements as the return type. The third is the explicit vector length of the
24011operation. The return type and underlying type of the vector of pointers are
24012the same vector types.
24013
24014The :ref:`align <attr_align>` parameter attribute can be provided for the first
24015argument.
24016
24017Semantics:
24018""""""""""
24019
24020The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
24021the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
24022from the combination of the '``mask``' and '``evl``' arguments in the usual VP
24023way. Certain '``llvm.masked.gather``' arguments do not have corresponding
24024arguments in '``llvm.vp.gather``': the '``passthru``' argument is implicitly
``poison``; the '``alignment``' argument is taken as the ``align`` parameter
attribute, if provided. The default alignment is taken as the ABI alignment of
the source addresses as specified by the
:ref:`datalayout string<langref_datalayout>`.
24028
24029Examples:
24030"""""""""
24031
24032.. code-block:: text
24033
24034     %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr>  align 8 %ptrs, <8 x i1> %mask, i32 %evl)
24035     ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24036
24037     %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0(<8 x ptr> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> poison)
24038
24039
24040.. _int_vp_scatter:
24041
24042'``llvm.vp.scatter``' Intrinsic
24043^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24044
24045Syntax:
24046"""""""
24047This is an overloaded intrinsic.
24048
24049::
24050
24051    declare void @llvm.vp.scatter.v4f64.v4p0(<4 x double> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
24052    declare void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> %val, <vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
24053    declare void @llvm.vp.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
24054    declare void @llvm.vp.scatter.nxv4i32.nxv4p4(<vscale x 4 x i32> %val, <vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
24055
24056Overview:
24057"""""""""
24058
24059The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
24060the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.
24061
24062Arguments:
24063""""""""""
24064
24065The first argument is a vector value to be written to memory. The second argument
24066is a vector of pointers, pointing to where the value elements should be stored.
24067The third argument is a vector of boolean values with the same number of
elements as the value argument. The fourth is the explicit vector length of the
24069operation.
24070
24071The :ref:`align <attr_align>` parameter attribute can be provided for the
24072second argument.
24073
24074Semantics:
24075""""""""""
24076
24077The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
24078the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
24079taken from the combination of the '``mask``' and '``evl``' arguments in the
24080usual VP way. The '``alignment``' argument of the '``llvm.masked.scatter``' does
24081not have a corresponding argument in '``llvm.vp.scatter``': it is instead
24082provided via the optional ``align`` parameter attribute on the
24083vector-of-pointers argument. Otherwise it is taken as the ABI alignment of the
24084destination addresses as specified by the :ref:`datalayout
24085string<langref_datalayout>`.
24086
24087Examples:
24088"""""""""
24089
24090.. code-block:: text
24091
24092     call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> align 1 %ptrs, <8 x i1> %mask, i32 %evl)
24093     ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
24094
24095     call void @llvm.masked.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> %ptrs, i32 1, <8 x i1> %mask)
24096
24097
24098.. _int_vp_trunc:
24099
24100'``llvm.vp.trunc.*``' Intrinsics
24101^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24102
24103Syntax:
24104"""""""
24105This is an overloaded intrinsic.
24106
24107::
24108
24109      declare <16 x i16>  @llvm.vp.trunc.v16i16.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
24110      declare <vscale x 4 x i16>  @llvm.vp.trunc.nxv4i16.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24111
24112Overview:
24113"""""""""
24114
24115The '``llvm.vp.trunc``' intrinsic truncates its first argument to the return
24116type. The operation has a mask and an explicit vector length parameter.
24117
24118
24119Arguments:
24120""""""""""
24121
24122The '``llvm.vp.trunc``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
24124:ref:`integer <t_integer>` type. The bit size of the value must be larger than
24125the bit size of the return type. The second argument is the vector mask. The
24126return type, the value to cast, and the vector mask have the same number of
24127elements.  The third argument is the explicit vector length of the operation.
24128
24129Semantics:
24130""""""""""
24131
24132The '``llvm.vp.trunc``' intrinsic truncates the high order bits in value and
24133converts the remaining bits to return type. Since the source size must be larger
24134than the destination size, '``llvm.vp.trunc``' cannot be a *no-op cast*. It will
24135always truncate bits. The conversion is performed on lane positions below the
24136explicit vector length and where the vector mask is true.  Masked-off lanes are
24137``poison``.
24138
24139Examples:
24140"""""""""
24141
24142.. code-block:: llvm
24143
24144      %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
24145      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24146
24147      %t = trunc <4 x i32> %a to <4 x i16>
24148      %also.r = select <4 x i1> %mask, <4 x i16> %t, <4 x i16> poison
24149
24150
24151.. _int_vp_zext:
24152
24153'``llvm.vp.zext.*``' Intrinsics
24154^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24155
24156Syntax:
24157"""""""
24158This is an overloaded intrinsic.
24159
24160::
24161
24162      declare <16 x i32>  @llvm.vp.zext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
24163      declare <vscale x 4 x i32>  @llvm.vp.zext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24164
24165Overview:
24166"""""""""
24167
24168The '``llvm.vp.zext``' intrinsic zero extends its first argument to the return
24169type. The operation has a mask and an explicit vector length parameter.
24170
24171
24172Arguments:
24173""""""""""
24174
24175The '``llvm.vp.zext``' intrinsic takes a value to cast as its first argument.
24176The return type is the type to cast the value to. Both types must be vectors of
24177:ref:`integer <t_integer>` type. The bit size of the value must be smaller than
24178the bit size of the return type. The second argument is the vector mask. The
24179return type, the value to cast, and the vector mask have the same number of
24180elements.  The third argument is the explicit vector length of the operation.
24181
24182Semantics:
24183""""""""""
24184
The '``llvm.vp.zext``' intrinsic fills the high order bits of the value with
zero bits until it reaches the size of the return type. When zero extending from
``i1``, the result will always be either 0 or 1. The conversion is performed on
lane positions below the explicit vector length and where the vector mask is
true. Masked-off lanes are ``poison``.
24190
24191Examples:
24192"""""""""
24193
24194.. code-block:: llvm
24195
24196      %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
24197      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24198
24199      %t = zext <4 x i16> %a to <4 x i32>
24200      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
24201
24202
24203.. _int_vp_sext:
24204
24205'``llvm.vp.sext.*``' Intrinsics
24206^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24207
24208Syntax:
24209"""""""
24210This is an overloaded intrinsic.
24211
24212::
24213
24214      declare <16 x i32>  @llvm.vp.sext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
24215      declare <vscale x 4 x i32>  @llvm.vp.sext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24216
24217Overview:
24218"""""""""
24219
24220The '``llvm.vp.sext``' intrinsic sign extends its first argument to the return
24221type. The operation has a mask and an explicit vector length parameter.
24222
24223
24224Arguments:
24225""""""""""
24226
24227The '``llvm.vp.sext``' intrinsic takes a value to cast as its first argument.
24228The return type is the type to cast the value to. Both types must be vectors of
24229:ref:`integer <t_integer>` type. The bit size of the value must be smaller than
24230the bit size of the return type. The second argument is the vector mask. The
24231return type, the value to cast, and the vector mask have the same number of
24232elements.  The third argument is the explicit vector length of the operation.
24233
24234Semantics:
24235""""""""""
24236
24237The '``llvm.vp.sext``' intrinsic performs a sign extension by copying the sign
24238bit (highest order bit) of the value until it reaches the size of the return
24239type. When sign extending from i1, the result will always be either -1 or 0.
24240The conversion is performed on lane positions below the explicit vector length
24241and where the vector mask is true. Masked-off lanes are ``poison``.
24242
24243Examples:
24244"""""""""
24245
24246.. code-block:: llvm
24247
24248      %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
24249      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24250
24251      %t = sext <4 x i16> %a to <4 x i32>
24252      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
24253
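The difference between sign extension and zero extension is easiest to see when
extending from ``i1``. The sketch below is a minimal illustration, not a
canonical usage pattern; the function names ``@bools_to_masks`` and
``@bools_to_ints`` are hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.sext.v4i32.v4i1(<4 x i1>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.zext.v4i32.v4i1(<4 x i1>, <4 x i1>, i32)

      ; Hypothetical helper: each enabled lane becomes -1 (all bits set) when
      ; the flag is true and 0 when it is false.
      define <4 x i32> @bools_to_masks(<4 x i1> %flags, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i1(<4 x i1> %flags, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }

      ; Hypothetical helper: each enabled lane becomes 1 when the flag is true
      ; and 0 when it is false.
      define <4 x i32> @bools_to_ints(<4 x i1> %flags, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i1(<4 x i1> %flags, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }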
24254
24255.. _int_vp_fptrunc:
24256
24257'``llvm.vp.fptrunc.*``' Intrinsics
24258^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24259
24260Syntax:
24261"""""""
24262This is an overloaded intrinsic.
24263
24264::
24265
      declare <16 x float>  @llvm.vp.fptrunc.v16f32.v16f64 (<16 x double> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fptrunc.nxv4f32.nxv4f64 (<vscale x 4 x double> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24268
24269Overview:
24270"""""""""
24271
24272The '``llvm.vp.fptrunc``' intrinsic truncates its first argument to the return
24273type. The operation has a mask and an explicit vector length parameter.
24274
24275
24276Arguments:
24277""""""""""
24278
The '``llvm.vp.fptrunc``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`floating-point <t_floating>` type. The bit size of the value must be
larger than the bit size of the return type. This implies that
'``llvm.vp.fptrunc``' cannot be used to make a *no-op cast*. The second argument
is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third argument is the explicit vector length of
the operation.
24287
24288Semantics:
24289""""""""""
24290
The '``llvm.vp.fptrunc``' intrinsic casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This intrinsic is assumed to execute in the default :ref:`floating-point
environment <floatenv>`. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
24298
24299Examples:
24300"""""""""
24301
24302.. code-block:: llvm
24303
24304      %r = call <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double> %a, <4 x i1> %mask, i32 %evl)
24305      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24306
24307      %t = fptrunc <4 x double> %a to <4 x float>
24308      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24309
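The intrinsic can be used with scalable vectors in the same way. The following
minimal sketch assumes a hypothetical function ``@narrow_to_float`` that simply
forwards its mask and explicit vector length to the intrinsic.

.. code-block:: llvm

      declare <vscale x 4 x float> @llvm.vp.fptrunc.nxv4f32.nxv4f64(<vscale x 4 x double>, <vscale x 4 x i1>, i32)

      ; Hypothetical helper; the name is an assumption for illustration.
      define <vscale x 4 x float> @narrow_to_float(<vscale x 4 x double> %a, <vscale x 4 x i1> %mask, i32 %evl) {
        ; Enabled lanes are narrowed from double to float; all other lanes
        ; of %r are poison.
        %r = call <vscale x 4 x float> @llvm.vp.fptrunc.nxv4f32.nxv4f64(<vscale x 4 x double> %a, <vscale x 4 x i1> %mask, i32 %evl)
        ret <vscale x 4 x float> %r
      }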
24310
24311.. _int_vp_fpext:
24312
24313'``llvm.vp.fpext.*``' Intrinsics
24314^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24315
24316Syntax:
24317"""""""
24318This is an overloaded intrinsic.
24319
24320::
24321
24322      declare <16 x double>  @llvm.vp.fpext.v16f64.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24323      declare <vscale x 4 x double>  @llvm.vp.fpext.nxv4f64.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24324
24325Overview:
24326"""""""""
24327
24328The '``llvm.vp.fpext``' intrinsic extends its first argument to the return
24329type. The operation has a mask and an explicit vector length parameter.
24330
24331
24332Arguments:
24333""""""""""
24334
The '``llvm.vp.fpext``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`floating-point <t_floating>` type. The bit size of the value must be
smaller than the bit size of the return type. This implies that
'``llvm.vp.fpext``' cannot be used to make a *no-op cast*. The second argument
is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third argument is the explicit vector length of
the operation.
24343
24344Semantics:
24345""""""""""
24346
The '``llvm.vp.fpext``' intrinsic extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The '``llvm.vp.fpext``' intrinsic cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are ``poison``.
24354
24355Examples:
24356"""""""""
24357
24358.. code-block:: llvm
24359
24360      %r = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24361      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24362
24363      %t = fpext <4 x float> %a to <4 x double>
24364      %also.r = select <4 x i1> %mask, <4 x double> %t, <4 x double> poison
24365
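A common reason to extend is to perform arithmetic at a higher precision. The
sketch below is one possible illustration: it widens two ``float`` vectors and
adds them with '``llvm.vp.fadd``', reusing the same mask and explicit vector
length for every call. The function name ``@widening_add`` is hypothetical.

.. code-block:: llvm

      declare <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float>, <4 x i1>, i32)
      declare <4 x double> @llvm.vp.fadd.v4f64(<4 x double>, <4 x double>, <4 x i1>, i32)

      ; Hypothetical helper; the name is an assumption for illustration.
      define <4 x double> @widening_add(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) {
        %wa = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
        %wb = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %b, <4 x i1> %mask, i32 %evl)
        ; The addition only reads lanes that the extensions above enabled.
        %sum = call <4 x double> @llvm.vp.fadd.v4f64(<4 x double> %wa, <4 x double> %wb, <4 x i1> %mask, i32 %evl)
        ret <4 x double> %sum
      }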
24366
24367.. _int_vp_fptoui:
24368
24369'``llvm.vp.fptoui.*``' Intrinsics
24370^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24371
24372Syntax:
24373"""""""
24374This is an overloaded intrinsic.
24375
24376::
24377
24378      declare <16 x i32>  @llvm.vp.fptoui.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24379      declare <vscale x 4 x i32>  @llvm.vp.fptoui.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24380      declare <256 x i64>  @llvm.vp.fptoui.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24381
24382Overview:
24383"""""""""
24384
24385The '``llvm.vp.fptoui``' intrinsic converts the :ref:`floating-point
24386<t_floating>` argument to the unsigned integer return type.
24387The operation has a mask and an explicit vector length parameter.
24388
24389
24390Arguments:
24391""""""""""
24392
The '``llvm.vp.fptoui``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third argument is the explicit vector length of the
operation.
24400
24401Semantics:
24402""""""""""
24403
24404The '``llvm.vp.fptoui``' intrinsic converts its :ref:`floating-point
24405<t_floating>` argument into the nearest (rounding towards zero) unsigned integer
24406value where the lane position is below the explicit vector length and the
24407vector mask is true.  Masked-off lanes are ``poison``. On enabled lanes where
24408conversion takes place and the value cannot fit in the return type, the result
24409on that lane is a :ref:`poison value <poisonvalues>`.
24410
24411Examples:
24412"""""""""
24413
24414.. code-block:: llvm
24415
24416      %r = call <4 x i32> @llvm.vp.fptoui.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24417      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24418
24419      %t = fptoui <4 x float> %a to <4 x i32>
24420      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
24421
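The rounding and out-of-range rules can be seen with constant inputs. The
following sketch is illustrative only; the function name ``@to_bytes`` and the
chosen constants are assumptions made for this example.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float>, <4 x i1>, i32)

      ; Hypothetical helper; all four lanes are enabled.
      define <4 x i8> @to_bytes() {
        %r = call <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float> <float 1.500000e+00, float 2.550000e+02, float 2.560000e+02, float -1.000000e+00>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ; Lane 0: 1.5 rounds towards zero to 1.
        ; Lane 1: 255.0 converts to 255, the largest i8 value when unsigned.
        ; Lane 2: 256.0 does not fit in i8, so the lane is poison.
        ; Lane 3: -1.0 is not representable as unsigned, so the lane is poison.
        ret <4 x i8> %r
      }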
24422
24423.. _int_vp_fptosi:
24424
24425'``llvm.vp.fptosi.*``' Intrinsics
24426^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24427
24428Syntax:
24429"""""""
24430This is an overloaded intrinsic.
24431
24432::
24433
24434      declare <16 x i32>  @llvm.vp.fptosi.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24435      declare <vscale x 4 x i32>  @llvm.vp.fptosi.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24436      declare <256 x i64>  @llvm.vp.fptosi.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24437
24438Overview:
24439"""""""""
24440
24441The '``llvm.vp.fptosi``' intrinsic converts the :ref:`floating-point
24442<t_floating>` argument to the signed integer return type.
24443The operation has a mask and an explicit vector length parameter.
24444
24445
24446Arguments:
24447""""""""""
24448
The '``llvm.vp.fptosi``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third argument is the explicit vector length of the
operation.
24456
24457Semantics:
24458""""""""""
24459
24460The '``llvm.vp.fptosi``' intrinsic converts its :ref:`floating-point
24461<t_floating>` argument into the nearest (rounding towards zero) signed integer
24462value where the lane position is below the explicit vector length and the
24463vector mask is true.  Masked-off lanes are ``poison``. On enabled lanes where
24464conversion takes place and the value cannot fit in the return type, the result
24465on that lane is a :ref:`poison value <poisonvalues>`.
24466
24467Examples:
24468"""""""""
24469
24470.. code-block:: llvm
24471
24472      %r = call <4 x i32> @llvm.vp.fptosi.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24473      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24474
24475      %t = fptosi <4 x float> %a to <4 x i32>
24476      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
24477
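The signed conversion follows the same rules with a signed range. The sketch
below uses constant inputs for illustration; the function name
``@to_signed_bytes`` and the chosen constants are assumptions.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.fptosi.v4i8.v4f32(<4 x float>, <4 x i1>, i32)

      ; Hypothetical helper; all four lanes are enabled.
      define <4 x i8> @to_signed_bytes() {
        %r = call <4 x i8> @llvm.vp.fptosi.v4i8.v4f32(<4 x float> <float -1.500000e+00, float 1.270000e+02, float 1.280000e+02, float -1.280000e+02>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ; Lane 0: -1.5 rounds towards zero to -1.
        ; Lane 1: 127.0 converts to 127, the largest signed i8 value.
        ; Lane 2: 128.0 does not fit in a signed i8, so the lane is poison.
        ; Lane 3: -128.0 converts to -128, the smallest signed i8 value.
        ret <4 x i8> %r
      }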
24478
24479.. _int_vp_uitofp:
24480
24481'``llvm.vp.uitofp.*``' Intrinsics
24482^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24483
24484Syntax:
24485"""""""
24486This is an overloaded intrinsic.
24487
24488::
24489
24490      declare <16 x float>  @llvm.vp.uitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
24491      declare <vscale x 4 x float>  @llvm.vp.uitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24492      declare <256 x double>  @llvm.vp.uitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
24493
24494Overview:
24495"""""""""
24496
24497The '``llvm.vp.uitofp``' intrinsic converts its unsigned integer argument to the
24498:ref:`floating-point <t_floating>` return type.  The operation has a mask and
24499an explicit vector length parameter.
24500
24501
24502Arguments:
24503""""""""""
24504
The '``llvm.vp.uitofp``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third argument is the explicit vector length of the
operation.
24512
24513Semantics:
24514""""""""""
24515
24516The '``llvm.vp.uitofp``' intrinsic interprets its first argument as an unsigned
24517integer quantity and converts it to the corresponding floating-point value. If
24518the value cannot be exactly represented, it is rounded using the default
24519rounding mode.  The conversion is performed on lane positions below the
24520explicit vector length and where the vector mask is true.  Masked-off lanes are
24521``poison``.
24522
24523Examples:
24524"""""""""
24525
24526.. code-block:: llvm
24527
24528      %r = call <4 x float> @llvm.vp.uitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
24529      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24530
24531      %t = uitofp <4 x i32> %a to <4 x float>
24532      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24533
24534
24535.. _int_vp_sitofp:
24536
24537'``llvm.vp.sitofp.*``' Intrinsics
24538^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24539
24540Syntax:
24541"""""""
24542This is an overloaded intrinsic.
24543
24544::
24545
24546      declare <16 x float>  @llvm.vp.sitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
24547      declare <vscale x 4 x float>  @llvm.vp.sitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24548      declare <256 x double>  @llvm.vp.sitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
24549
24550Overview:
24551"""""""""
24552
24553The '``llvm.vp.sitofp``' intrinsic converts its signed integer argument to the
24554:ref:`floating-point <t_floating>` return type.  The operation has a mask and
24555an explicit vector length parameter.
24556
24557
24558Arguments:
24559""""""""""
24560
The '``llvm.vp.sitofp``' intrinsic takes a value to cast as its first argument.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second argument is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third argument is the explicit vector length of the
operation.
24568
24569Semantics:
24570""""""""""
24571
24572The '``llvm.vp.sitofp``' intrinsic interprets its first argument as a signed
24573integer quantity and converts it to the corresponding floating-point value. If
24574the value cannot be exactly represented, it is rounded using the default
24575rounding mode.  The conversion is performed on lane positions below the
24576explicit vector length and where the vector mask is true.  Masked-off lanes are
24577``poison``.
24578
24579Examples:
24580"""""""""
24581
24582.. code-block:: llvm
24583
24584      %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
24585      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24586
24587      %t = sitofp <4 x i32> %a to <4 x float>
24588      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24589
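The signed and unsigned conversions differ only in how the integer bits are
interpreted. The following sketch contrasts '``llvm.vp.sitofp``' with
'``llvm.vp.uitofp``' on the same ``i8`` input; the function names
``@bytes_as_signed`` and ``@bytes_as_unsigned`` are hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.sitofp.v4f32.v4i8(<4 x i8>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.uitofp.v4f32.v4i8(<4 x i8>, <4 x i1>, i32)

      ; Hypothetical helper: an enabled lane holding the bit pattern 0xFF
      ; is read as -1 and converts to -1.0.
      define <4 x float> @bytes_as_signed(<4 x i8> %a, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i8(<4 x i8> %a, <4 x i1> %mask, i32 %evl)
        ret <4 x float> %r
      }

      ; Hypothetical helper: an enabled lane holding the bit pattern 0xFF
      ; is read as 255 and converts to 255.0.
      define <4 x float> @bytes_as_unsigned(<4 x i8> %a, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x float> @llvm.vp.uitofp.v4f32.v4i8(<4 x i8> %a, <4 x i1> %mask, i32 %evl)
        ret <4 x float> %r
      }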
24590
24591.. _int_vp_ptrtoint:
24592
24593'``llvm.vp.ptrtoint.*``' Intrinsics
24594^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24595
24596Syntax:
24597"""""""
24598This is an overloaded intrinsic.
24599
24600::
24601
      declare <16 x i8>  @llvm.vp.ptrtoint.v16i8.v16p0(<16 x ptr> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i8>  @llvm.vp.ptrtoint.nxv4i8.nxv4p0(<vscale x 4 x ptr> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.ptrtoint.v256i64.v256p0(<256 x ptr> <op>, <256 x i1> <mask>, i32 <vector_length>)
24605
24606Overview:
24607"""""""""
24608
The '``llvm.vp.ptrtoint``' intrinsic converts its pointer argument to the
integer return type. The operation has a mask and an explicit vector length
parameter.
24611
24612
24613Arguments:
24614""""""""""
24615
The '``llvm.vp.ptrtoint``' intrinsic takes a value to cast as its first
argument, which must be a vector of pointers. The return type is the type to
cast the value to, which must be a vector of :ref:`integer <t_integer>` type.
The second argument is the vector mask. The return type, the value to cast, and
the vector mask have the same number of elements.
The third argument is the explicit vector length of the operation.
24622
24623Semantics:
24624""""""""""
24625
The '``llvm.vp.ptrtoint``' intrinsic converts ``value`` to the return type by
interpreting the pointer value as an integer and either truncating or zero
extending that value to the size of the integer type.
If ``value`` is smaller than the return type, then a zero extension is done. If
``value`` is larger than the return type, then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are ``poison``.
24635
24636Examples:
24637"""""""""
24638
24639.. code-block:: llvm
24640
      %r = call <4 x i8> @llvm.vp.ptrtoint.v4i8.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = ptrtoint <4 x ptr> %a to <4 x i8>
      %also.r = select <4 x i1> %mask, <4 x i8> %t, <4 x i8> poison
24646
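Whether the conversion truncates, zero extends, or is a no-op depends on the
pointer size of the target. The following minimal sketch converts to ``i64``;
the function name ``@addresses`` is hypothetical, and the comment about 64-bit
pointers is an assumption about the target's data layout.

.. code-block:: llvm

      declare <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr>, <4 x i1>, i32)

      ; Hypothetical helper; the name is an assumption for illustration.
      define <4 x i64> @addresses(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl) {
        ; Assuming 64-bit pointers in address space 0, this is a no-op cast
        ; on enabled lanes. With 32-bit pointers it would zero extend.
        %r = call <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
        ret <4 x i64> %r
      }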
24647
24648.. _int_vp_inttoptr:
24649
24650'``llvm.vp.inttoptr.*``' Intrinsics
24651^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24652
24653Syntax:
24654"""""""
24655This is an overloaded intrinsic.
24656
24657::
24658
24659      declare <16 x ptr>  @llvm.vp.inttoptr.v16p0.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
24660      declare <vscale x 4 x ptr>  @llvm.vp.inttoptr.nxv4p0.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24661      declare <256 x ptr>  @llvm.vp.inttoptr.v256p0.v256i32 (<256 x i32> <op>, <256 x i1> <mask>, i32 <vector_length>)
24662
24663Overview:
24664"""""""""
24665
The '``llvm.vp.inttoptr``' intrinsic converts its integer value to the pointer
return type. The operation has a mask and an explicit vector length parameter.
24668
24669
24670Arguments:
24671""""""""""
24672
The '``llvm.vp.inttoptr``' intrinsic takes a value to cast as its first
argument, which must be a vector of :ref:`integer <t_integer>` type. The return
type is the type to cast the value to, which must be a vector of pointers.
The second argument is the vector mask. The return type, the value to cast, and
the vector mask have the same number of elements.
The third argument is the explicit vector length of the operation.
24679
24680Semantics:
24681""""""""""
24682
The '``llvm.vp.inttoptr``' intrinsic converts ``value`` to the return type by
applying either a zero extension or a truncation depending on the size of the
integer ``value``. If ``value`` is larger than the size of a pointer, then a
truncation is done. If ``value`` is smaller than the size of a pointer, then a
zero extension is done. If they are the same size, nothing is done (*no-op cast*).
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are ``poison``.
24690
24691Examples:
24692"""""""""
24693
24694.. code-block:: llvm
24695
      %r = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = inttoptr <4 x i32> %a to <4 x ptr>
      %also.r = select <4 x i1> %mask, <4 x ptr> %t, <4 x ptr> poison
24701
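The reverse direction can be sketched in the same way. In the minimal example
below, the function name ``@to_pointers`` is hypothetical, and the comment about
64-bit pointers is an assumption about the target's data layout.

.. code-block:: llvm

      declare <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i64(<4 x i64>, <4 x i1>, i32)

      ; Hypothetical helper; the name is an assumption for illustration.
      define <4 x ptr> @to_pointers(<4 x i64> %bits, <4 x i1> %mask, i32 %evl) {
        ; Assuming 64-bit pointers in address space 0, this is a no-op cast
        ; on enabled lanes. With 32-bit pointers it would truncate.
        %r = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i64(<4 x i64> %bits, <4 x i1> %mask, i32 %evl)
        ret <4 x ptr> %r
      }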
24702
24703.. _int_vp_fcmp:
24704
24705'``llvm.vp.fcmp.*``' Intrinsics
24706^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24707
24708Syntax:
24709"""""""
24710This is an overloaded intrinsic.
24711
24712::
24713
24714      declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> <left_op>, <16 x float> <right_op>, metadata <condition code>, <16 x i1> <mask>, i32 <vector_length>)
24715      declare <vscale x 4 x i1> @llvm.vp.fcmp.nxv4f32(<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, metadata <condition code>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24716      declare <256 x i1> @llvm.vp.fcmp.v256f64(<256 x double> <left_op>, <256 x double> <right_op>, metadata <condition code>, <256 x i1> <mask>, i32 <vector_length>)
24717
24718Overview:
24719"""""""""
24720
24721The '``llvm.vp.fcmp``' intrinsic returns a vector of boolean values based on
24722the comparison of its arguments. The operation has a mask and an explicit vector
24723length parameter.
24724
24725
24726Arguments:
24727""""""""""
24728
The '``llvm.vp.fcmp``' intrinsic takes the two values to compare as its first
and second arguments. These two values must be vectors of :ref:`floating-point
<t_floating>` type. The third argument is the condition code indicating the kind
of comparison to perform. It must be a metadata string with :ref:`one of the
supported floating-point condition code values <fcmp_md_cc>`. The fourth
argument is the vector mask. The fifth argument is the explicit vector length of
the operation. The return type is the result of the comparison and must be a
vector of :ref:`i1 <t_integer>` type. The return type, the values to compare,
and the vector mask have the same number of elements.
24739
24740Semantics:
24741""""""""""
24742
The '``llvm.vp.fcmp``' intrinsic compares its first two arguments according to
the condition code given as the third argument. The arguments are compared
element by element on each enabled lane, where the semantics of the comparison
are defined :ref:`according to the condition code <fcmp_md_cc_sem>`. Masked-off
lanes are ``poison``.
24748
24749Examples:
24750"""""""""
24751
24752.. code-block:: llvm
24753
24754      %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"oeq", <4 x i1> %mask, i32 %evl)
24755      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24756
24757      %t = fcmp oeq <4 x float> %a, %b
24758      %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
24759
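The choice of condition code matters when an operand may be NaN. The following
sketch contrasts the ordered ``"olt"`` code with the unordered ``"ult"`` code;
the function names ``@ordered_lt`` and ``@unordered_lt`` are hypothetical.

.. code-block:: llvm

      declare <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float>, <4 x float>, metadata, <4 x i1>, i32)

      ; Hypothetical helper: an enabled lane where either operand is NaN
      ; yields false.
      define <4 x i1> @ordered_lt(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"olt", <4 x i1> %mask, i32 %evl)
        ret <4 x i1> %r
      }

      ; Hypothetical helper: an enabled lane where either operand is NaN
      ; yields true.
      define <4 x i1> @unordered_lt(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"ult", <4 x i1> %mask, i32 %evl)
        ret <4 x i1> %r
      }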
24760
24761.. _int_vp_icmp:
24762
24763'``llvm.vp.icmp.*``' Intrinsics
24764^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24765
24766Syntax:
24767"""""""
24768This is an overloaded intrinsic.
24769
24770::
24771
24772      declare <32 x i1> @llvm.vp.icmp.v32i32(<32 x i32> <left_op>, <32 x i32> <right_op>, metadata <condition code>, <32 x i1> <mask>, i32 <vector_length>)
24773      declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32> <left_op>, <vscale x 2 x i32> <right_op>, metadata <condition code>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
24774      declare <128 x i1> @llvm.vp.icmp.v128i8(<128 x i8> <left_op>, <128 x i8> <right_op>, metadata <condition code>, <128 x i1> <mask>, i32 <vector_length>)
24775
24776Overview:
24777"""""""""
24778
24779The '``llvm.vp.icmp``' intrinsic returns a vector of boolean values based on
24780the comparison of its arguments. The operation has a mask and an explicit vector
24781length parameter.
24782
24783
24784Arguments:
24785""""""""""
24786
The '``llvm.vp.icmp``' intrinsic takes the two values to compare as its first
and second arguments. These two values must be vectors of :ref:`integer
<t_integer>` type. The third argument is the condition code indicating the kind
of comparison to perform. It must be a metadata string with :ref:`one of the
supported integer condition code values <icmp_md_cc>`. The fourth argument is
the vector mask. The fifth argument is the explicit vector length of the
operation. The return type is the result of the comparison and must be a
vector of :ref:`i1 <t_integer>` type. The return type, the values to compare,
and the vector mask have the same number of elements.
24797
24798Semantics:
24799""""""""""
24800
The '``llvm.vp.icmp``' intrinsic compares its first two arguments according to
the condition code given as the third argument. The arguments are compared
element by element on each enabled lane, where the semantics of the comparison
are defined :ref:`according to the condition code <icmp_md_cc_sem>`. Masked-off
lanes are ``poison``.
24806
24807Examples:
24808"""""""""
24809
24810.. code-block:: llvm
24811
24812      %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ne", <4 x i1> %mask, i32 %evl)
24813      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24814
24815      %t = icmp ne <4 x i32> %a, %b
24816      %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
24817
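Signed and unsigned condition codes can give different results for the same
bits. The following sketch contrasts ``"slt"`` with ``"ult"``; the function
names ``@signed_lt`` and ``@unsigned_lt`` are hypothetical.

.. code-block:: llvm

      declare <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32>, <4 x i32>, metadata, <4 x i1>, i32)

      ; Hypothetical helper: on an enabled lane where %a is -1 and %b is 0,
      ; the signed comparison yields true.
      define <4 x i1> @signed_lt(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"slt", <4 x i1> %mask, i32 %evl)
        ret <4 x i1> %r
      }

      ; Hypothetical helper: on the same lane the unsigned comparison reads
      ; %a as 4294967295 and yields false.
      define <4 x i1> @unsigned_lt(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ult", <4 x i1> %mask, i32 %evl)
        ret <4 x i1> %r
      }
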
24818.. _int_vp_ceil:
24819
24820'``llvm.vp.ceil.*``' Intrinsics
24821^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24822
24823Syntax:
24824"""""""
24825This is an overloaded intrinsic.
24826
24827::
24828
24829      declare <16 x float>  @llvm.vp.ceil.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24830      declare <vscale x 4 x float>  @llvm.vp.ceil.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24831      declare <256 x double>  @llvm.vp.ceil.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24832
24833Overview:
24834"""""""""
24835
24836Predicated floating-point ceiling of a vector of floating-point values.
24837
24838
24839Arguments:
24840""""""""""
24841
The first argument and the result have the same type, which must be a vector of
floating-point type. The second argument is the vector mask and has the same
number of elements as the result vector type. The third argument is the explicit
vector length of the operation.
24846
24847Semantics:
24848""""""""""
24849
24850The '``llvm.vp.ceil``' intrinsic performs floating-point ceiling
24851(:ref:`ceil <int_ceil>`) of the first vector argument on each enabled lane. The
24852result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24853
24854Examples:
24855"""""""""
24856
24857.. code-block:: llvm
24858
24859      %r = call <4 x float> @llvm.vp.ceil.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24860      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24861
24862      %t = call <4 x float> @llvm.ceil.v4f32(<4 x float> %a)
24863      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24864
24865.. _int_vp_floor:
24866
24867'``llvm.vp.floor.*``' Intrinsics
24868^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24869
24870Syntax:
24871"""""""
24872This is an overloaded intrinsic.
24873
24874::
24875
24876      declare <16 x float>  @llvm.vp.floor.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24877      declare <vscale x 4 x float>  @llvm.vp.floor.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24878      declare <256 x double>  @llvm.vp.floor.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24879
24880Overview:
24881"""""""""
24882
24883Predicated floating-point floor of a vector of floating-point values.
24884
24885
24886Arguments:
24887""""""""""
24888
The first argument and the result have the same type, which must be a vector of
floating-point type. The second argument is the vector mask and has the same
number of elements as the result vector type. The third argument is the explicit
vector length of the operation.
24893
24894Semantics:
24895""""""""""
24896
24897The '``llvm.vp.floor``' intrinsic performs floating-point floor
24898(:ref:`floor <int_floor>`) of the first vector argument on each enabled lane.
24899The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24900
24901Examples:
24902"""""""""
24903
24904.. code-block:: llvm
24905
24906      %r = call <4 x float> @llvm.vp.floor.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24907      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24908
24909      %t = call <4 x float> @llvm.floor.v4f32(<4 x float> %a)
24910      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24911
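The two rounding directions can be compared on constant inputs. The sketch
below applies '``llvm.vp.ceil``' and '``llvm.vp.floor``' to the same vector
with all lanes enabled; the function names ``@ceil_demo`` and ``@floor_demo``
and the chosen constants are assumptions made for this example.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.ceil.v4f32(<4 x float>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.floor.v4f32(<4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: the result lanes are 2.0, -1.0, 2.0 and 1.0.
      define <4 x float> @ceil_demo() {
        %r = call <4 x float> @llvm.vp.ceil.v4f32(<4 x float> <float 1.500000e+00, float -1.500000e+00, float 2.000000e+00, float 2.500000e-01>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x float> %r
      }

      ; Hypothetical helper: the result lanes are 1.0, -2.0, 2.0 and 0.0.
      define <4 x float> @floor_demo() {
        %r = call <4 x float> @llvm.vp.floor.v4f32(<4 x float> <float 1.500000e+00, float -1.500000e+00, float 2.000000e+00, float 2.500000e-01>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x float> %r
      }
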
24912.. _int_vp_rint:
24913
24914'``llvm.vp.rint.*``' Intrinsics
24915^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24916
24917Syntax:
24918"""""""
24919This is an overloaded intrinsic.
24920
24921::
24922
24923      declare <16 x float>  @llvm.vp.rint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24924      declare <vscale x 4 x float>  @llvm.vp.rint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24925      declare <256 x double>  @llvm.vp.rint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24926
24927Overview:
24928"""""""""
24929
24930Predicated floating-point rint of a vector of floating-point values.
24931
24932
24933Arguments:
24934""""""""""
24935
The first argument and the result have the same type, which must be a vector of
floating-point type. The second argument is the vector mask and has the same
number of elements as the result vector type. The third argument is the explicit
vector length of the operation.
24940
24941Semantics:
24942""""""""""
24943
24944The '``llvm.vp.rint``' intrinsic performs floating-point rint
24945(:ref:`rint <int_rint>`) of the first vector argument on each enabled lane.
24946The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24947
24948Examples:
24949"""""""""
24950
24951.. code-block:: llvm
24952
24953      %r = call <4 x float> @llvm.vp.rint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24954      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24955
24956      %t = call <4 x float> @llvm.rint.v4f32(<4 x float> %a)
24957      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24958
24959.. _int_vp_nearbyint:
24960
24961'``llvm.vp.nearbyint.*``' Intrinsics
24962^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24963
24964Syntax:
24965"""""""
24966This is an overloaded intrinsic.
24967
24968::
24969
24970      declare <16 x float>  @llvm.vp.nearbyint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24971      declare <vscale x 4 x float>  @llvm.vp.nearbyint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24972      declare <256 x double>  @llvm.vp.nearbyint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24973
24974Overview:
24975"""""""""
24976
24977Predicated floating-point nearbyint of a vector of floating-point values.
24978
24979
24980Arguments:
24981""""""""""
24982
24983The first argument and the result have the same vector of floating-point type.
24984The second argument is the vector mask and has the same number of elements as the
24985result vector type. The third argument is the explicit vector length of the
24986operation.
24987
24988Semantics:
24989""""""""""
24990
24991The '``llvm.vp.nearbyint``' intrinsic performs floating-point nearbyint
24992(:ref:`nearbyint <int_nearbyint>`) of the first vector argument on each enabled lane.
24993The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24994
24995Examples:
24996"""""""""
24997
24998.. code-block:: llvm
24999
25000      %r = call <4 x float> @llvm.vp.nearbyint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25001      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25002
25003      %t = call <4 x float> @llvm.nearbyint.v4f32(<4 x float> %a)
25004      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
25005
25006.. _int_vp_round:
25007
25008'``llvm.vp.round.*``' Intrinsics
25009^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25010
25011Syntax:
25012"""""""
25013This is an overloaded intrinsic.
25014
25015::
25016
25017      declare <16 x float>  @llvm.vp.round.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
25018      declare <vscale x 4 x float>  @llvm.vp.round.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25019      declare <256 x double>  @llvm.vp.round.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
25020
25021Overview:
25022"""""""""
25023
25024Predicated floating-point round of a vector of floating-point values.
25025
25026
25027Arguments:
25028""""""""""
25029
25030The first argument and the result have the same vector of floating-point type.
25031The second argument is the vector mask and has the same number of elements as the
25032result vector type. The third argument is the explicit vector length of the
25033operation.
25034
25035Semantics:
25036""""""""""
25037
25038The '``llvm.vp.round``' intrinsic performs floating-point round
25039(:ref:`round <int_round>`) of the first vector argument on each enabled lane.
25040The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25041
25042Examples:
25043"""""""""
25044
25045.. code-block:: llvm
25046
25047      %r = call <4 x float> @llvm.vp.round.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25048      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25049
25050      %t = call <4 x float> @llvm.round.v4f32(<4 x float> %a)
25051      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
25052
25053.. _int_vp_roundeven:
25054
25055'``llvm.vp.roundeven.*``' Intrinsics
25056^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25057
25058Syntax:
25059"""""""
25060This is an overloaded intrinsic.
25061
25062::
25063
25064      declare <16 x float>  @llvm.vp.roundeven.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
25065      declare <vscale x 4 x float>  @llvm.vp.roundeven.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25066      declare <256 x double>  @llvm.vp.roundeven.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
25067
25068Overview:
25069"""""""""
25070
25071Predicated floating-point roundeven of a vector of floating-point values.
25072
25073
25074Arguments:
25075""""""""""
25076
25077The first argument and the result have the same vector of floating-point type.
25078The second argument is the vector mask and has the same number of elements as the
25079result vector type. The third argument is the explicit vector length of the
25080operation.
25081
25082Semantics:
25083""""""""""
25084
25085The '``llvm.vp.roundeven``' intrinsic performs floating-point roundeven
25086(:ref:`roundeven <int_roundeven>`) of the first vector argument on each enabled
25087lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25088
25089Examples:
25090"""""""""
25091
25092.. code-block:: llvm
25093
25094      %r = call <4 x float> @llvm.vp.roundeven.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25095      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25096
25097      %t = call <4 x float> @llvm.roundeven.v4f32(<4 x float> %a)
25098      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
25099
25100.. _int_vp_roundtozero:
25101
25102'``llvm.vp.roundtozero.*``' Intrinsics
25103^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25104
25105Syntax:
25106"""""""
25107This is an overloaded intrinsic.
25108
25109::
25110
25111      declare <16 x float>  @llvm.vp.roundtozero.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
25112      declare <vscale x 4 x float>  @llvm.vp.roundtozero.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25113      declare <256 x double>  @llvm.vp.roundtozero.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
25114
25115Overview:
25116"""""""""
25117
25118Predicated floating-point round-to-zero of a vector of floating-point values.
25119
25120
25121Arguments:
25122""""""""""
25123
25124The first argument and the result have the same vector of floating-point type.
25125The second argument is the vector mask and has the same number of elements as the
25126result vector type. The third argument is the explicit vector length of the
25127operation.
25128
25129Semantics:
25130""""""""""
25131
The '``llvm.vp.roundtozero``' intrinsic performs floating-point round-to-zero
(:ref:`llvm.trunc <int_llvm_trunc>`) of the first vector argument on each enabled lane. The
result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25135
25136Examples:
25137"""""""""
25138
25139.. code-block:: llvm
25140
25141      %r = call <4 x float> @llvm.vp.roundtozero.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25142      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25143
25144      %t = call <4 x float> @llvm.trunc.v4f32(<4 x float> %a)
25145      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
25146
25147.. _int_vp_lrint:
25148
25149'``llvm.vp.lrint.*``' Intrinsics
25150^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25151
25152Syntax:
25153"""""""
25154This is an overloaded intrinsic.
25155
25156::
25157
25158      declare <16 x i32> @llvm.vp.lrint.v16i32.v16f32(<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
25159      declare <vscale x 4 x i32> @llvm.vp.lrint.nxv4i32.nxv4f32(<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25160      declare <256 x i64> @llvm.vp.lrint.v256i64.v256f64(<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
25161
25162Overview:
25163"""""""""
25164
25165Predicated lrint of a vector of floating-point values.
25166
25167
25168Arguments:
25169""""""""""
25170
25171The result is an integer vector and the first argument is a vector of :ref:`floating-point <t_floating>`
25172type with the same number of elements as the result vector type. The second
25173argument is the vector mask and has the same number of elements as the result
25174vector type. The third argument is the explicit vector length of the operation.
25175
25176Semantics:
25177""""""""""
25178
25179The '``llvm.vp.lrint``' intrinsic performs lrint (:ref:`lrint <int_lrint>`) of
25180the first vector argument on each enabled lane. The result on disabled lanes is a
25181:ref:`poison value <poisonvalues>`.
25182
25183Examples:
25184"""""""""
25185
25186.. code-block:: llvm
25187
25188      %r = call <4 x i32> @llvm.vp.lrint.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25189      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25190
      %t = call <4 x i32> @llvm.lrint.v4i32.v4f32(<4 x float> %a)
25192      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
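
Because the intrinsic is overloaded on both the result and the operand type,
the result element width may differ from the operand element width. The
following is a minimal sketch, assuming the name mangling follows the pattern
of the declarations in the Syntax section (result type first, then operand
type). The function name ``@lrint_wide`` is hypothetical.

.. code-block:: llvm

      declare <4 x i64> @llvm.vp.lrint.v4i64.v4f32(<4 x float>, <4 x i1>, i32)

      define <4 x i64> @lrint_wide(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ;; Each enabled lane of %a is rounded to an i64 value;
        ;; disabled lanes of %r are poison.
        %r = call <4 x i64> @llvm.vp.lrint.v4i64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret <4 x i64> %r
      }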
25193
25194.. _int_vp_llrint:
25195
25196'``llvm.vp.llrint.*``' Intrinsics
25197^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25198
25199Syntax:
25200"""""""
25201This is an overloaded intrinsic.
25202
25203::
25204
25205      declare <16 x i32> @llvm.vp.llrint.v16i32.v16f32(<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
25206      declare <vscale x 4 x i32> @llvm.vp.llrint.nxv4i32.nxv4f32(<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25207      declare <256 x i64> @llvm.vp.llrint.v256i64.v256f64(<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
25208
25209Overview:
25210"""""""""
25211
25212Predicated llrint of a vector of floating-point values.
25213
25214
25215Arguments:
""""""""""

The result is an integer vector and the first argument is a vector of :ref:`floating-point <t_floating>`
25218type with the same number of elements as the result vector type. The second
25219argument is the vector mask and has the same number of elements as the result
25220vector type. The third argument is the explicit vector length of the operation.
25221
25222Semantics:
25223""""""""""
25224
The '``llvm.vp.llrint``' intrinsic performs llrint (:ref:`llrint <int_llrint>`) of
the first vector argument on each enabled lane. The result on disabled lanes is a
:ref:`poison value <poisonvalues>`.
25228
25229Examples:
25230"""""""""
25231
25232.. code-block:: llvm
25233
25234      %r = call <4 x i32> @llvm.vp.llrint.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25235      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25236
      %t = call <4 x i32> @llvm.llrint.v4i32.v4f32(<4 x float> %a)
25238      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25239
25240
25241.. _int_vp_bitreverse:
25242
25243'``llvm.vp.bitreverse.*``' Intrinsics
25244^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25245
25246Syntax:
25247"""""""
25248This is an overloaded intrinsic.
25249
25250::
25251
25252      declare <16 x i32>  @llvm.vp.bitreverse.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
25253      declare <vscale x 4 x i32>  @llvm.vp.bitreverse.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25254      declare <256 x i64>  @llvm.vp.bitreverse.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
25255
25256Overview:
25257"""""""""
25258
25259Predicated bitreverse of a vector of integers.
25260
25261
25262Arguments:
25263""""""""""
25264
25265The first argument and the result have the same vector of integer type. The
25266second argument is the vector mask and has the same number of elements as the
25267result vector type. The third argument is the explicit vector length of the
25268operation.
25269
25270Semantics:
25271""""""""""
25272
25273The '``llvm.vp.bitreverse``' intrinsic performs bitreverse (:ref:`bitreverse <int_bitreverse>`) of the first argument on each
25274enabled lane.  The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25275
25276Examples:
25277"""""""""
25278
25279.. code-block:: llvm
25280
25281      %r = call <4 x i32> @llvm.vp.bitreverse.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
25282      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25283
25284      %t = call <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> %a)
25285      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25286
25287
25288.. _int_vp_bswap:
25289
25290'``llvm.vp.bswap.*``' Intrinsics
25291^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25292
25293Syntax:
25294"""""""
25295This is an overloaded intrinsic.
25296
25297::
25298
25299      declare <16 x i32>  @llvm.vp.bswap.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
25300      declare <vscale x 4 x i32>  @llvm.vp.bswap.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25301      declare <256 x i64>  @llvm.vp.bswap.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
25302
25303Overview:
25304"""""""""
25305
25306Predicated bswap of a vector of integers.
25307
25308
25309Arguments:
25310""""""""""
25311
25312The first argument and the result have the same vector of integer type. The
25313second argument is the vector mask and has the same number of elements as the
25314result vector type. The third argument is the explicit vector length of the
25315operation.
25316
25317Semantics:
25318""""""""""
25319
25320The '``llvm.vp.bswap``' intrinsic performs bswap (:ref:`bswap <int_bswap>`) of the first argument on each
25321enabled lane.  The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25322
25323Examples:
25324"""""""""
25325
25326.. code-block:: llvm
25327
25328      %r = call <4 x i32> @llvm.vp.bswap.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
25329      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25330
25331      %t = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %a)
25332      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25333
25334
25335.. _int_vp_ctpop:
25336
25337'``llvm.vp.ctpop.*``' Intrinsics
25338^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25339
25340Syntax:
25341"""""""
25342This is an overloaded intrinsic.
25343
25344::
25345
25346      declare <16 x i32>  @llvm.vp.ctpop.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
25347      declare <vscale x 4 x i32>  @llvm.vp.ctpop.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25348      declare <256 x i64>  @llvm.vp.ctpop.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
25349
25350Overview:
25351"""""""""
25352
25353Predicated ctpop of a vector of integers.
25354
25355
25356Arguments:
25357""""""""""
25358
25359The first argument and the result have the same vector of integer type. The
25360second argument is the vector mask and has the same number of elements as the
25361result vector type. The third argument is the explicit vector length of the
25362operation.
25363
25364Semantics:
25365""""""""""
25366
25367The '``llvm.vp.ctpop``' intrinsic performs ctpop (:ref:`ctpop <int_ctpop>`) of the first argument on each
25368enabled lane.  The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25369
25370Examples:
25371"""""""""
25372
25373.. code-block:: llvm
25374
25375      %r = call <4 x i32> @llvm.vp.ctpop.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
25376      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25377
25378      %t = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> %a)
25379      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25380
25381
25382.. _int_vp_ctlz:
25383
25384'``llvm.vp.ctlz.*``' Intrinsics
25385^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25386
25387Syntax:
25388"""""""
25389This is an overloaded intrinsic.
25390
25391::
25392
25393      declare <16 x i32>  @llvm.vp.ctlz.v16i32 (<16 x i32> <op>, i1 <is_zero_poison>, <16 x i1> <mask>, i32 <vector_length>)
25394      declare <vscale x 4 x i32>  @llvm.vp.ctlz.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_zero_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25395      declare <256 x i64>  @llvm.vp.ctlz.v256i64 (<256 x i64> <op>, i1 <is_zero_poison>, <256 x i1> <mask>, i32 <vector_length>)
25396
25397Overview:
25398"""""""""
25399
25400Predicated ctlz of a vector of integers.
25401
25402
25403Arguments:
25404""""""""""
25405
25406The first argument and the result have the same vector of integer type. The
25407second argument is a constant flag that indicates whether the intrinsic returns
25408a valid result if the first argument is zero. The third argument is the vector
mask and has the same number of elements as the result vector type. The fourth
25410argument is the explicit vector length of the operation. If the first argument
25411is zero and the second argument is true, the result is poison.
25412
25413Semantics:
25414""""""""""
25415
25416The '``llvm.vp.ctlz``' intrinsic performs ctlz (:ref:`ctlz <int_ctlz>`) of the first argument on each
25417enabled lane.  The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25418
25419Examples:
25420"""""""""
25421
25422.. code-block:: llvm
25423
25424      %r = call <4 x i32> @llvm.vp.ctlz.v4i32(<4 x i32> %a, i1 false, <4 x i1> %mask, i32 %evl)
25425      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25426
25427      %t = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %a, i1 false)
25428      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
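
The example above passes ``false`` for the zero-is-poison flag. For contrast,
the following sketch passes ``true``. It assumes the rule stated in the
Arguments section applies lane-wise; the function name ``@ctlz_nonzero`` is
hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.ctlz.v4i32(<4 x i32>, i1, <4 x i1>, i32)

      define <4 x i32> @ctlz_nonzero(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ;; With the flag set to true, an enabled lane of %a that is zero
        ;; yields poison for that lane (assumed lane-wise), in addition to
        ;; poison on disabled lanes.
        %r = call <4 x i32> @llvm.vp.ctlz.v4i32(<4 x i32> %a, i1 true, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }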
25429
25430
25431.. _int_vp_cttz:
25432
25433'``llvm.vp.cttz.*``' Intrinsics
25434^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25435
25436Syntax:
25437"""""""
25438This is an overloaded intrinsic.
25439
25440::
25441
25442      declare <16 x i32>  @llvm.vp.cttz.v16i32 (<16 x i32> <op>, i1 <is_zero_poison>, <16 x i1> <mask>, i32 <vector_length>)
25443      declare <vscale x 4 x i32>  @llvm.vp.cttz.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_zero_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25444      declare <256 x i64>  @llvm.vp.cttz.v256i64 (<256 x i64> <op>, i1 <is_zero_poison>, <256 x i1> <mask>, i32 <vector_length>)
25445
25446Overview:
25447"""""""""
25448
25449Predicated cttz of a vector of integers.
25450
25451
25452Arguments:
25453""""""""""
25454
25455The first argument and the result have the same vector of integer type. The
25456second argument is a constant flag that indicates whether the intrinsic
25457returns a valid result if the first argument is zero. The third argument is
25458the vector mask and has the same number of elements as the result vector type.
25459The fourth argument is the explicit vector length of the operation. If the
25460first argument is zero and the second argument is true, the result is poison.
25461
25462Semantics:
25463""""""""""
25464
25465The '``llvm.vp.cttz``' intrinsic performs cttz (:ref:`cttz <int_cttz>`) of the first argument on each
25466enabled lane.  The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25467
25468Examples:
25469"""""""""
25470
25471.. code-block:: llvm
25472
25473      %r = call <4 x i32> @llvm.vp.cttz.v4i32(<4 x i32> %a, i1 false, <4 x i1> %mask, i32 %evl)
25474      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25475
25476      %t = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> %a, i1 false)
25477      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25478
25479
25480.. _int_vp_cttz_elts:
25481
25482'``llvm.vp.cttz.elts.*``' Intrinsics
25483^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25484
25485Syntax:
25486"""""""
This is an overloaded intrinsic. You can use ``llvm.vp.cttz.elts`` on any
vector of integer elements, both fixed width and scalable.
25489
25490::
25491
25492      declare i32  @llvm.vp.cttz.elts.i32.v16i32 (<16 x i32> <op>, i1 <is_zero_poison>, <16 x i1> <mask>, i32 <vector_length>)
25493      declare i64  @llvm.vp.cttz.elts.i64.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_zero_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25494      declare i64  @llvm.vp.cttz.elts.i64.v256i1 (<256 x i1> <op>, i1 <is_zero_poison>, <256 x i1> <mask>, i32 <vector_length>)
25495
25496Overview:
25497"""""""""
25498
The '``llvm.vp.cttz.elts``' intrinsic counts the number of trailing zero
elements of a vector. It is the vector-predicated version of
'``llvm.experimental.cttz.elts``'.
25502
25503Arguments:
25504""""""""""
25505
25506The first argument is the vector to be counted. This argument must be a vector
25507with integer element type. The return type must also be an integer type which is
25508wide enough to hold the maximum number of elements of the source vector. The
25509behavior of this intrinsic is undefined if the return type is not wide enough
25510for the number of elements in the input vector.
25511
25512The second argument is a constant flag that indicates whether the intrinsic
25513returns a valid result if the first argument is all zero.
25514
25515The third argument is the vector mask and has the same number of elements as the
25516input vector type. The fourth argument is the explicit vector length of the
25517operation.
25518
25519Semantics:
25520""""""""""
25521
The '``llvm.vp.cttz.elts``' intrinsic counts the trailing (least
significant / lowest-numbered) zero elements in the first argument on each
enabled lane. If the first argument is all zero and the second argument is true,
the result is poison. If the first argument is all zero and the second argument
is false, the result is the explicit vector length (i.e. the fourth argument).
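
Examples:
"""""""""

The following is a minimal usage sketch rather than a normative example. The
mangled name follows the pattern of the declarations in the Syntax section, and
the function name ``@first_nonzero`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.cttz.elts.i32.v4i32(<4 x i32>, i1, <4 x i1>, i32)

      define i32 @first_nonzero(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ;; Counts the trailing (lowest-numbered) zero elements of %a over the
        ;; enabled lanes. The flag is false, so an all-zero input yields %evl
        ;; rather than poison.
        %r = call i32 @llvm.vp.cttz.elts.i32.v4i32(<4 x i32> %a, i1 false, <4 x i1> %mask, i32 %evl)
        ret i32 %r
      }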
25527
25528.. _int_vp_sadd_sat:
25529
25530'``llvm.vp.sadd.sat.*``' Intrinsics
25531^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25532
25533Syntax:
25534"""""""
25535This is an overloaded intrinsic.
25536
25537::
25538
      declare <16 x i32>  @llvm.vp.sadd.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25540      declare <vscale x 4 x i32>  @llvm.vp.sadd.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25541      declare <256 x i64>  @llvm.vp.sadd.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25542
25543Overview:
25544"""""""""
25545
25546Predicated signed saturating addition of two vectors of integers.
25547
25548
25549Arguments:
25550""""""""""
25551
25552The first two arguments and the result have the same vector of integer type. The
25553third argument is the vector mask and has the same number of elements as the
25554result vector type. The fourth argument is the explicit vector length of the
25555operation.
25556
25557Semantics:
25558""""""""""
25559
25560The '``llvm.vp.sadd.sat``' intrinsic performs sadd.sat (:ref:`sadd.sat <int_sadd_sat>`)
25561of the first and second vector arguments on each enabled lane. The result on
25562disabled lanes is a :ref:`poison value <poisonvalues>`.
25563
25564
25565Examples:
25566"""""""""
25567
25568.. code-block:: llvm
25569
25570      %r = call <4 x i32> @llvm.vp.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25571      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25572
25573      %t = call <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25574      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
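
To make the saturating behavior concrete, the following sketch uses constant
``i8`` operands with every lane enabled. The function name ``@sadd_sat_clamp``
and the operand values are hypothetical.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.sadd.sat.v4i8(<4 x i8>, <4 x i8>, <4 x i1>, i32)

      define <4 x i8> @sadd_sat_clamp() {
        ;; All four lanes are enabled (all-true mask, vector length 4).
        ;; Lane 0: 100 + 100 saturates to 127.
        ;; Lane 1: -100 + -100 saturates to -128.
        ;; Lanes 2 and 3 do not overflow: 1 + 2 = 3 and -5 + 5 = 0.
        %r = call <4 x i8> @llvm.vp.sadd.sat.v4i8(<4 x i8> <i8 100, i8 -100, i8 1, i8 -5>, <4 x i8> <i8 100, i8 -100, i8 2, i8 5>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x i8> %r
      }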
25575
25576
25577.. _int_vp_uadd_sat:
25578
25579'``llvm.vp.uadd.sat.*``' Intrinsics
25580^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25581
25582Syntax:
25583"""""""
25584This is an overloaded intrinsic.
25585
25586::
25587
      declare <16 x i32>  @llvm.vp.uadd.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25589      declare <vscale x 4 x i32>  @llvm.vp.uadd.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25590      declare <256 x i64>  @llvm.vp.uadd.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25591
25592Overview:
25593"""""""""
25594
25595Predicated unsigned saturating addition of two vectors of integers.
25596
25597
25598Arguments:
25599""""""""""
25600
25601The first two arguments and the result have the same vector of integer type. The
25602third argument is the vector mask and has the same number of elements as the
25603result vector type. The fourth argument is the explicit vector length of the
25604operation.
25605
25606Semantics:
25607""""""""""
25608
25609The '``llvm.vp.uadd.sat``' intrinsic performs uadd.sat (:ref:`uadd.sat <int_uadd_sat>`)
25610of the first and second vector arguments on each enabled lane. The result on
25611disabled lanes is a :ref:`poison value <poisonvalues>`.
25612
25613
25614Examples:
25615"""""""""
25616
25617.. code-block:: llvm
25618
25619      %r = call <4 x i32> @llvm.vp.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25620      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25621
25622      %t = call <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25623      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25624
25625
25626.. _int_vp_ssub_sat:
25627
25628'``llvm.vp.ssub.sat.*``' Intrinsics
25629^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25630
25631Syntax:
25632"""""""
25633This is an overloaded intrinsic.
25634
25635::
25636
      declare <16 x i32>  @llvm.vp.ssub.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25638      declare <vscale x 4 x i32>  @llvm.vp.ssub.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25639      declare <256 x i64>  @llvm.vp.ssub.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25640
25641Overview:
25642"""""""""
25643
25644Predicated signed saturating subtraction of two vectors of integers.
25645
25646
25647Arguments:
25648""""""""""
25649
25650The first two arguments and the result have the same vector of integer type. The
25651third argument is the vector mask and has the same number of elements as the
25652result vector type. The fourth argument is the explicit vector length of the
25653operation.
25654
25655Semantics:
25656""""""""""
25657
25658The '``llvm.vp.ssub.sat``' intrinsic performs ssub.sat (:ref:`ssub.sat <int_ssub_sat>`)
25659of the first and second vector arguments on each enabled lane. The result on
25660disabled lanes is a :ref:`poison value <poisonvalues>`.
25661
25662
25663Examples:
25664"""""""""
25665
25666.. code-block:: llvm
25667
25668      %r = call <4 x i32> @llvm.vp.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25669      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25670
25671      %t = call <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25672      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25673
25674
25675.. _int_vp_usub_sat:
25676
25677'``llvm.vp.usub.sat.*``' Intrinsics
25678^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25679
25680Syntax:
25681"""""""
25682This is an overloaded intrinsic.
25683
25684::
25685
      declare <16 x i32>  @llvm.vp.usub.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25687      declare <vscale x 4 x i32>  @llvm.vp.usub.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25688      declare <256 x i64>  @llvm.vp.usub.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25689
25690Overview:
25691"""""""""
25692
25693Predicated unsigned saturating subtraction of two vectors of integers.
25694
25695
25696Arguments:
25697""""""""""
25698
25699The first two arguments and the result have the same vector of integer type. The
25700third argument is the vector mask and has the same number of elements as the
25701result vector type. The fourth argument is the explicit vector length of the
25702operation.
25703
25704Semantics:
25705""""""""""
25706
25707The '``llvm.vp.usub.sat``' intrinsic performs usub.sat (:ref:`usub.sat <int_usub_sat>`)
25708of the first and second vector arguments on each enabled lane. The result on
25709disabled lanes is a :ref:`poison value <poisonvalues>`.
25710
25711
25712Examples:
25713"""""""""
25714
25715.. code-block:: llvm
25716
25717      %r = call <4 x i32> @llvm.vp.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25718      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25719
25720      %t = call <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25721      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
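
For unsigned saturating subtraction, results that would fall below zero clamp
to zero. The following sketch uses constant ``i8`` operands with every lane
enabled. The function name ``@usub_sat_clamp`` and the operand values are
hypothetical.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.usub.sat.v4i8(<4 x i8>, <4 x i8>, <4 x i1>, i32)

      define <4 x i8> @usub_sat_clamp() {
        ;; All four lanes are enabled (all-true mask, vector length 4).
        ;; Lane 0: 5 - 10 would go below zero, so it saturates to 0.
        ;; Lane 1: 10 - 5 = 5.
        ;; Lane 2: 0 - 1 saturates to 0.
        ;; Lane 3: 7 - 7 = 0.
        %r = call <4 x i8> @llvm.vp.usub.sat.v4i8(<4 x i8> <i8 5, i8 10, i8 0, i8 7>, <4 x i8> <i8 10, i8 5, i8 1, i8 7>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x i8> %r
      }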
25722
25723
25724.. _int_vp_fshl:
25725
25726'``llvm.vp.fshl.*``' Intrinsics
25727^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25728
25729Syntax:
25730"""""""
25731This is an overloaded intrinsic.
25732
25733::
25734
25735      declare <16 x i32>  @llvm.vp.fshl.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25736      declare <vscale x 4 x i32>  @llvm.vp.fshl.nxv4i32  (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25737      declare <256 x i64>  @llvm.vp.fshl.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25738
25739Overview:
25740"""""""""
25741
25742Predicated fshl of three vectors of integers.
25743
25744
25745Arguments:
25746""""""""""
25747
25748The first three arguments and the result have the same vector of integer type. The
25749fourth argument is the vector mask and has the same number of elements as the
25750result vector type. The fifth argument is the explicit vector length of the
25751operation.
25752
25753Semantics:
25754""""""""""
25755
The '``llvm.vp.fshl``' intrinsic performs fshl (:ref:`fshl <int_fshl>`) of the first, second, and third
vector arguments on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25758
25759
25760Examples:
25761"""""""""
25762
25763.. code-block:: llvm
25764
25765      %r = call <4 x i32> @llvm.vp.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
25766      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25767
25768      %t = call <4 x i32> @llvm.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
25769      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
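
A funnel shift whose first two operands are identical is a rotate, so the
predicated form can express a lane-wise rotate-left. The following is a minimal
sketch; the function name ``@vp_rotl`` is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.fshl.v4i32(<4 x i32>, <4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @vp_rotl(<4 x i32> %a, <4 x i32> %amt, <4 x i1> %mask, i32 %evl) {
        ;; Passing %a as both the first and the second argument rotates each
        ;; enabled lane of %a left by the matching lane of %amt (modulo 32).
        %r = call <4 x i32> @llvm.vp.fshl.v4i32(<4 x i32> %a, <4 x i32> %a, <4 x i32> %amt, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }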
25770
25771
25772'``llvm.vp.fshr.*``' Intrinsics
25773^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25774
25775Syntax:
25776"""""""
25777This is an overloaded intrinsic.
25778
25779::
25780
25781      declare <16 x i32>  @llvm.vp.fshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25782      declare <vscale x 4 x i32>  @llvm.vp.fshr.nxv4i32  (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25783      declare <256 x i64>  @llvm.vp.fshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25784
25785Overview:
25786"""""""""
25787
25788Predicated fshr of three vectors of integers.
25789
25790
25791Arguments:
25792""""""""""
25793
25794The first three arguments and the result have the same vector of integer type. The
25795fourth argument is the vector mask and has the same number of elements as the
25796result vector type. The fifth argument is the explicit vector length of the
25797operation.
25798
25799Semantics:
25800""""""""""
25801
The '``llvm.vp.fshr``' intrinsic performs fshr (:ref:`fshr <int_fshr>`) of the first, second, and third
vector arguments on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25804
25805
25806Examples:
25807"""""""""
25808
25809.. code-block:: llvm
25810
25811      %r = call <4 x i32> @llvm.vp.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
25812      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25813
25814      %t = call <4 x i32> @llvm.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
25815      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25816
25817'``llvm.vp.is.fpclass.*``' Intrinsics
25818^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25819
25820Syntax:
25821"""""""
25822This is an overloaded intrinsic.
25823
25824::
25825
25826      declare <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f32(<vscale x 2 x float> <op>, i32 <test>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
25827      declare <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> <op>, i32 <test>, <2 x i1> <mask>, i32 <vector_length>)
25828
25829Overview:
25830"""""""""
25831
Predicated :ref:`llvm.is.fpclass <llvm.is.fpclass>`.
25833
25834Arguments:
25835""""""""""
25836
The first argument is a floating-point vector. The result type is a vector of
booleans with the same number of elements as the first argument. The second
argument specifies which tests to perform; see :ref:`llvm.is.fpclass <llvm.is.fpclass>`.
The third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
25843
25844Semantics:
25845""""""""""
25846
25847The '``llvm.vp.is.fpclass``' intrinsic performs llvm.is.fpclass (:ref:`llvm.is.fpclass <llvm.is.fpclass>`).
25848
25849
25850Examples:
25851"""""""""
25852
25853.. code-block:: llvm
25854
25855      %r = call <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> %x, i32 3, <2 x i1> %m, i32 %evl)
25856      %t = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f16(<vscale x 2 x half> %x, i32 3, <vscale x 2 x i1> %m, i32 %evl)
25857
25858.. _int_mload_mstore:
25859
25860Masked Vector Load and Store Intrinsics
25861---------------------------------------
25862
25863LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask argument, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
25864
25865.. _int_mload:
25866
25867'``llvm.masked.load.*``' Intrinsics
25868^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25869
25870Syntax:
25871"""""""
25872This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
25873
25874::
25875
25876      declare <16 x float>  @llvm.masked.load.v16f32.p0(ptr <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
25877      declare <2 x double>  @llvm.masked.load.v2f64.p0(ptr <ptr>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
25878      ;; The data is a vector of pointers
25879      declare <8 x ptr> @llvm.masked.load.v8p0.p0(ptr <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
25880
25881Overview:
25882"""""""""
25883
25884Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' argument.
25885
25886
25887Arguments:
25888""""""""""
25889
25890The first argument is the base pointer for the load. The second argument is the alignment of the source location. It must be a power of two constant integer value. The third argument, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' argument are the same vector types.
25891
25892Semantics:
25893""""""""""
25894
25895The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
25896The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask, except that the masked-off lanes are not accessed.
25897Only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
25898In particular, using this intrinsic prevents exceptions on memory accesses to masked-off lanes.
25899Masked-off lanes are also not considered accessed for the purpose of data races or ``noalias`` constraints.
25900
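As a rough scalar sketch of the semantics described above, the following C
model reads only the masked-on lanes and fills the rest from the pass-through
vector. The function name ``masked_load_model`` and the fixed ``float`` element
type are assumptions made for illustration; alignment is not modeled.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.load on float lanes.
       ptr[i] is only read when mask[i] is true, so masked-off lanes
       never touch memory. */
    static void masked_load_model(const float *ptr, const bool *mask,
                                  const float *passthru, float *res,
                                  size_t n) {
      for (size_t i = 0; i < n; ++i)
        res[i] = mask[i] ? ptr[i] : passthru[i];
    }

Because the conditional guards the read itself, the model also reflects why
only the masked-on lanes need to be in bounds.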
25901
25902::
25903
25904       %res = call <16 x float> @llvm.masked.load.v16f32.p0(ptr %ptr, i32 4, <16 x i1>%mask, <16 x float> %passthru)
25905
25906       ;; The result of the two following instructions is identical aside from potential memory access exception
25907       %loadlal = load <16 x float>, ptr %ptr, align 4
25908       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
25909
25910.. _int_mstore:
25911
25912'``llvm.masked.store.*``' Intrinsics
25913^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25914
25915Syntax:
25916"""""""
25917This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
25918
25919::
25920
25921       declare void @llvm.masked.store.v8i32.p0 (<8  x i32>   <value>, ptr <ptr>, i32 <alignment>, <8  x i1> <mask>)
25922       declare void @llvm.masked.store.v16f32.p0(<16 x float> <value>, ptr <ptr>, i32 <alignment>, <16 x i1> <mask>)
25923       ;; The data is a vector of pointers
25924       declare void @llvm.masked.store.v8p0.p0  (<8 x ptr>    <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
25925
25926Overview:
25927"""""""""
25928
25929Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
25930
25931Arguments:
25932""""""""""
25933
The first argument is the vector value to be written to memory. The second argument is the base pointer for the store; it has the same underlying type as the value argument. The third argument is the alignment of the destination location. It must be a power of two constant integer value. The fourth argument, mask, is a vector of boolean values. The types of the mask and the value argument must have the same number of vector elements.
25935
25936
25937Semantics:
25938""""""""""
25939
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
25941The result of this operation is equivalent to a load-modify-store sequence, except that the masked-off lanes are not accessed.
25942Only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
25943In particular, using this intrinsic prevents exceptions on memory accesses to masked-off lanes.
25944Masked-off lanes are also not considered accessed for the purpose of data races or ``noalias`` constraints.
25945
25946::
25947
25948       call void @llvm.masked.store.v16f32.p0(<16 x float> %value, ptr %ptr, i32 4,  <16 x i1> %mask)
25949
25950       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
25951       %oldval = load <16 x float>, ptr %ptr, align 4
25952       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
25953       store <16 x float> %res, ptr %ptr, align 4
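The difference from the load-modify-store sequence can be sketched in scalar C:
the model below writes only the masked-on lanes and never reads or writes the
others. The name ``masked_store_model`` and the ``float`` element type are
assumptions for illustration; alignment is not modeled.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.store on float lanes.
       Memory at ptr[i] is written only when mask[i] is true; masked-off
       lanes are neither read nor written. */
    static void masked_store_model(const float *value, float *ptr,
                                   const bool *mask, size_t n) {
      for (size_t i = 0; i < n; ++i)
        if (mask[i])
          ptr[i] = value[i];
    }
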
25954
25955
25956Masked Vector Gather and Scatter Intrinsics
25957-------------------------------------------
25958
25959LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask argument, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
25960
25961.. _int_mgather:
25962
25963'``llvm.masked.gather.*``' Intrinsics
25964^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25965
25966Syntax:
25967"""""""
25968This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
25969
25970::
25971
25972      declare <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
25973      declare <2 x double> @llvm.masked.gather.v2f64.v2p1(<2 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
25974      declare <8 x ptr> @llvm.masked.gather.v8p0.v8p0(<8 x ptr> <ptrs>, i32 <alignment>, <8 x i1>  <mask>, <8 x ptr> <passthru>)
25975
25976Overview:
25977"""""""""
25978
25979Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' argument.
25980
25981
25982Arguments:
25983""""""""""
25984
25985The first argument is a vector of pointers which holds all memory addresses to read. The second argument is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third argument, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' argument are the same vector types.
25986
25987Semantics:
25988""""""""""
25989
25990The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
25992
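For a mask that is not all-true, the conditional scalar loads can be sketched
in C as follows. The function name ``masked_gather_model`` and the ``double``
element type are hypothetical choices for this sketch; alignment is not
modeled.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.gather on double lanes.
       ptrs[i] is dereferenced only when mask[i] is true, so a masked-off
       lane may hold any pointer value, including NULL. */
    static void masked_gather_model(const double *const *ptrs,
                                    const bool *mask,
                                    const double *passthru, double *res,
                                    size_t n) {
      for (size_t i = 0; i < n; ++i)
        res[i] = mask[i] ? *ptrs[i] : passthru[i];
    }
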
25993
25994::
25995
25996       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> poison)
25997
25998       ;; The gather with all-true mask is equivalent to the following instruction sequence
25999       %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
26000       %ptr1 = extractelement <4 x ptr> %ptrs, i32 1
26001       %ptr2 = extractelement <4 x ptr> %ptrs, i32 2
26002       %ptr3 = extractelement <4 x ptr> %ptrs, i32 3
26003
26004       %val0 = load double, ptr %ptr0, align 8
26005       %val1 = load double, ptr %ptr1, align 8
26006       %val2 = load double, ptr %ptr2, align 8
26007       %val3 = load double, ptr %ptr3, align 8
26008
       %vec0    = insertelement <4 x double> poison, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
26013
26014.. _int_mscatter:
26015
26016'``llvm.masked.scatter.*``' Intrinsics
26017^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26018
26019Syntax:
26020"""""""
26021This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
26022
26023::
26024
26025       declare void @llvm.masked.scatter.v8i32.v8p0  (<8 x i32>    <value>, <8 x ptr>               <ptrs>, i32 <alignment>, <8 x i1>  <mask>)
26026       declare void @llvm.masked.scatter.v16f32.v16p1(<16 x float> <value>, <16 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
26027       declare void @llvm.masked.scatter.v4p0.v4p0   (<4 x ptr>    <value>, <4 x ptr>               <ptrs>, i32 <alignment>, <4 x i1>  <mask>)
26028
26029Overview:
26030"""""""""
26031
26032Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
26033
26034Arguments:
26035""""""""""
26036
26037The first argument is a vector value to be written to memory. The second argument is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value argument. The third argument is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth argument, mask, is a vector of boolean values. The types of the mask and the value argument must have the same number of vector elements.
26038
26039Semantics:
26040""""""""""
26041
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
26043
26044::
26045
26046       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0(<8 x i32> %value, <8 x ptr> %ptrs, i32 4,  <8 x i1>  <true, true, .. true>)
26048
26049       ;; It is equivalent to a list of scalar stores
26050       %val0 = extractelement <8 x i32> %value, i32 0
26051       %val1 = extractelement <8 x i32> %value, i32 1
26052       ..
26053       %val7 = extractelement <8 x i32> %value, i32 7
26054       %ptr0 = extractelement <8 x ptr> %ptrs, i32 0
26055       %ptr1 = extractelement <8 x ptr> %ptrs, i32 1
26056       ..
26057       %ptr7 = extractelement <8 x ptr> %ptrs, i32 7
26058       ;; Note: the order of the following stores is important when they overlap:
26059       store i32 %val0, ptr %ptr0, align 4
26060       store i32 %val1, ptr %ptr1, align 4
26061       ..
26062       store i32 %val7, ptr %ptr7, align 4
26063
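The ordering guarantee for overlapping addresses can be illustrated with a
scalar C model that stores lanes in increasing index order, so that a later
lane overwrites an earlier one targeting the same address. The name
``masked_scatter_model`` and the ``int`` element type are assumptions for this
sketch.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.scatter on int lanes.
       Lanes are stored from lowest to highest index; when two enabled
       lanes share an address, the higher-indexed lane wins. */
    static void masked_scatter_model(const int *value, int *const *ptrs,
                                     const bool *mask, size_t n) {
      for (size_t i = 0; i < n; ++i)
        if (mask[i])
          *ptrs[i] = value[i];
    }
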
26064
26065Masked Vector Expanding Load and Compressing Store Intrinsics
26066-------------------------------------------------------------
26067
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
26069
26070.. _int_expandload:
26071
26072'``llvm.masked.expandload.*``' Intrinsics
26073^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26074
26075Syntax:
26076"""""""
26077This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
26078
26079::
26080
26081      declare <16 x float>  @llvm.masked.expandload.v16f32 (ptr <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
26082      declare <2 x i64>     @llvm.masked.expandload.v2i64 (ptr <ptr>, <2 x i1>  <mask>, <2 x i64> <passthru>)
26083
26084Overview:
26085"""""""""
26086
26087Reads a number of scalar values sequentially from memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' argument.
26088
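The '10010001' case above can be reproduced with a small scalar C model. The
function name ``expandload_model`` and the ``double`` element type are
hypothetical; the returned count is an addition of this sketch, not part of
the intrinsic.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.expandload on double lanes.
       Consecutive elements ptr[0], ptr[1], ... are consumed only by the
       masked-on lanes; masked-off lanes take the passthru value.
       Returns the number of elements read from memory. */
    static size_t expandload_model(const double *ptr, const bool *mask,
                                   const double *passthru, double *res,
                                   size_t n) {
      size_t j = 0;
      for (size_t i = 0; i < n; ++i)
        res[i] = mask[i] ? ptr[j++] : passthru[i];
      return j;
    }

With lanes 0, 3 and 7 enabled, the model reads three elements and places them
in exactly those lanes.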
26089
26090Arguments:
26091""""""""""
26092
26093The first argument is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second argument, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' argument have the same vector type.
26094
26095The :ref:`align <attr_align>` parameter attribute can be provided for the first
26096argument. The pointer alignment defaults to 1.
26097
26098Semantics:
26099""""""""""
26100
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:
26102
26103.. code-block:: c
26104
    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }
26111
26112
26113.. code-block:: llvm
26114
26115    ; Load several elements from array B and expand them in a vector.
26116    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
26117    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(ptr %Bptr, <8 x i1> %Mask, <8 x double> poison)
26118    ; Store the result in A
26119    call void @llvm.masked.store.v8f64.p0(<8 x double> %Tmp, ptr %Aptr, i32 8, <8 x i1> %Mask)
26120
26121    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
26122    %MaskI = bitcast <8 x i1> %Mask to i8
26123    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
26124    %MaskI64 = zext i8 %MaskIPopcnt to i64
26125    %BNextInd = add i64 %BInd, %MaskI64
26126
26127
26128Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
26129If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
26130
26131.. _int_compressstore:
26132
26133'``llvm.masked.compressstore.*``' Intrinsics
26134^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26135
26136Syntax:
26137"""""""
26138This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
26139
26140::
26141
26142      declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, ptr <ptr>, <8  x i1> <mask>)
26143      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, ptr <ptr>, <16 x i1> <mask>)
26144
26145Overview:
26146"""""""""
26147
Selects elements from the input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
26149
26150Arguments:
26151""""""""""
26152
The first argument is the input vector, from which elements are collected and written to memory. The second argument is the base pointer for the store; it has the same underlying type as the element of the input vector argument. The third argument is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
26154
26155The :ref:`align <attr_align>` parameter attribute can be provided for the second
26156argument. The pointer alignment defaults to 1.
26157
26158Semantics:
26159""""""""""
26160
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependencies, as in the following example:
26162
26163.. code-block:: c
26164
    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }
26171
26172
26173.. code-block:: llvm
26174
26175    ; Load elements from A.
26176    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0(ptr %Aptr, i32 8, <8 x i1> %Mask, <8 x double> poison)
26177    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, ptr %Bptr, <8 x i1> %Mask)
26179
26180    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
26181    %MaskI = bitcast <8 x i1> %Mask to i8
26182    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
26183    %MaskI64 = zext i8 %MaskIPopcnt to i64
26184    %BNextInd = add i64 %BInd, %MaskI64
26185
26186
26187Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
26188
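A scalar C sketch of the compressing store, under the assumption of ``double``
lanes, is shown below. The function name ``compressstore_model`` is
hypothetical, and the returned count is an addition of this sketch that
mirrors the population-count step in the IR example.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.compressstore on double
       lanes. Enabled lanes are written to ptr[0], ptr[1], ... in lane
       order; memory past the last written element is left untouched.
       Returns the number of elements stored. */
    static size_t compressstore_model(const double *value, double *ptr,
                                      const bool *mask, size_t n) {
      size_t j = 0;
      for (size_t i = 0; i < n; ++i)
        if (mask[i])
          ptr[j++] = value[i];
      return j;
    }
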
26189
26190Memory Use Markers
26191------------------
26192
26193This class of intrinsics provides information about the
26194:ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
26195are immutable.
26196
26197.. _int_lifestart:
26198
26199'``llvm.lifetime.start``' Intrinsic
26200^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26201
26202Syntax:
26203"""""""
26204
26205::
26206
26207      declare void @llvm.lifetime.start(i64 <size>, ptr captures(none) <ptr>)
26208
26209Overview:
26210"""""""""
26211
26212The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
26213object's lifetime.
26214
26215Arguments:
26216""""""""""
26217
26218The first argument is a constant integer representing the size of the
26219object, or -1 if it is variable sized. The second argument is a pointer
26220to the object.
26221
26222Semantics:
26223""""""""""
26224
26225If ``ptr`` is a stack-allocated object and it points to the first byte of
26226the object, the object is initially marked as dead.
26227``ptr`` is conservatively considered as a non-stack-allocated object if
26228the stack coloring algorithm that is used in the optimization pipeline cannot
26229conclude that ``ptr`` is a stack-allocated object.
26230
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is marked
as alive and has an uninitialized value.
26233The stack object is marked as dead when either
26234:ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
26235function returns.
26236
26237After :ref:`llvm.lifetime.end <int_lifeend>` is called,
26238'``llvm.lifetime.start``' on the stack object can be called again.
26239The second '``llvm.lifetime.start``' call marks the object as alive, but it
26240does not change the address of the object.
26241
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.
26245
26246
26247.. _int_lifeend:
26248
26249'``llvm.lifetime.end``' Intrinsic
26250^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26251
26252Syntax:
26253"""""""
26254
26255::
26256
26257      declare void @llvm.lifetime.end(i64 <size>, ptr captures(none) <ptr>)
26258
26259Overview:
26260"""""""""
26261
26262The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
26263lifetime.
26264
26265Arguments:
26266""""""""""
26267
26268The first argument is a constant integer representing the size of the
26269object, or -1 if it is variable sized. The second argument is a pointer
26270to the object.
26271
26272Semantics:
26273""""""""""
26274
26275If ``ptr`` is a stack-allocated object and it points to the first byte of the
26276object, the object is dead.
26277``ptr`` is conservatively considered as a non-stack-allocated object if
26278the stack coloring algorithm that is used in the optimization pipeline cannot
26279conclude that ``ptr`` is a stack-allocated object.
26280
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
26282
26283If ``ptr`` is a non-stack-allocated object or it does not point to the first
26284byte of the object, it is equivalent to simply filling all bytes of the object
26285with ``poison``.
26286
26287
26288'``llvm.invariant.start``' Intrinsic
26289^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26290
26291Syntax:
26292"""""""
26293This is an overloaded intrinsic. The memory object can belong to any address space.
26294
26295::
26296
26297      declare ptr @llvm.invariant.start.p0(i64 <size>, ptr captures(none) <ptr>)
26298
26299Overview:
26300"""""""""
26301
26302The '``llvm.invariant.start``' intrinsic specifies that the contents of
26303a memory object will not change.
26304
26305Arguments:
26306""""""""""
26307
26308The first argument is a constant integer representing the size of the
26309object, or -1 if it is variable sized. The second argument is a pointer
26310to the object.
26311
26312Semantics:
26313""""""""""
26314
26315This intrinsic indicates that until an ``llvm.invariant.end`` that uses
26316the return value, the referenced memory location is constant and
26317unchanging.
26318
26319'``llvm.invariant.end``' Intrinsic
26320^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26321
26322Syntax:
26323"""""""
26324This is an overloaded intrinsic. The memory object can belong to any address space.
26325
26326::
26327
26328      declare void @llvm.invariant.end.p0(ptr <start>, i64 <size>, ptr captures(none) <ptr>)
26329
26330Overview:
26331"""""""""
26332
26333The '``llvm.invariant.end``' intrinsic specifies that the contents of a
26334memory object are mutable.
26335
26336Arguments:
26337""""""""""
26338
26339The first argument is the matching ``llvm.invariant.start`` intrinsic.
26340The second argument is a constant integer representing the size of the
26341object, or -1 if it is variable sized and the third argument is a
26342pointer to the object.
26343
26344Semantics:
26345""""""""""
26346
26347This intrinsic indicates that the memory is mutable again.
26348
26349'``llvm.launder.invariant.group``' Intrinsic
26350^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26351
26352Syntax:
26353"""""""
26354This is an overloaded intrinsic. The memory object can belong to any address
26355space. The returned pointer must belong to the same address space as the
26356argument.
26357
26358::
26359
26360      declare ptr @llvm.launder.invariant.group.p0(ptr <ptr>)
26361
26362Overview:
26363"""""""""
26364
26365The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
26366established by ``invariant.group`` metadata no longer holds, to obtain a new
26367pointer value that carries fresh invariant group information. It is an
26368experimental intrinsic, which means that its semantics might change in the
26369future.
26370
26371
26372Arguments:
26373""""""""""
26374
26375The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
26376to the memory.
26377
26378Semantics:
26379""""""""""
26380
26381Returns another pointer that aliases its argument but which is considered different
26382for the purposes of ``load``/``store`` ``invariant.group`` metadata.
26383It does not read any accessible memory and the execution can be speculated.
26384
26385'``llvm.strip.invariant.group``' Intrinsic
26386^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26387
26388Syntax:
26389"""""""
26390This is an overloaded intrinsic. The memory object can belong to any address
26391space. The returned pointer must belong to the same address space as the
26392argument.
26393
26394::
26395
26396      declare ptr @llvm.strip.invariant.group.p0(ptr <ptr>)
26397
26398Overview:
26399"""""""""
26400
26401The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
26402established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
26403value that does not carry the invariant information. It is an experimental
26404intrinsic, which means that its semantics might change in the future.
26405
26406
26407Arguments:
26408""""""""""
26409
26410The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
26411to the memory.
26412
26413Semantics:
26414""""""""""
26415
26416Returns another pointer that aliases its argument but which has no associated
26417``invariant.group`` metadata.
26418It does not read any memory and can be speculated.
26419
26420
26421
26422.. _constrainedfp:
26423
26424Constrained Floating-Point Intrinsics
26425-------------------------------------
26426
These intrinsics are used to provide special handling of floating-point
operations when a specific rounding mode or floating-point exception behavior is
required.  By default, LLVM optimization passes assume that the rounding mode is
26430round-to-nearest and that floating-point exceptions will not be monitored.
26431Constrained FP intrinsics are used to support non-default rounding modes and
26432accurately preserve exception behavior without compromising LLVM's ability to
26433optimize FP code when the default behavior is used.
26434
26435If any FP operation in a function is constrained then they all must be
26436constrained. This is required for correct LLVM IR. Optimizations that
26437move code around can create miscompiles if mixing of constrained and normal
26438operations is done. The correct way to mix constrained and less constrained
26439operations is to use the rounding mode and exception handling metadata to
26440mark constrained intrinsics as having LLVM's default behavior.
26441
26442Each of these intrinsics corresponds to a normal floating-point operation. The
26443data arguments and the return value are the same as the corresponding FP
26444operation.
26445
26446The rounding mode argument is a metadata string specifying what
26447assumptions, if any, the optimizer can make when transforming constant
26448values. Some constrained FP intrinsics omit this argument. If required
26449by the intrinsic, this argument must be one of the following strings:
26450
26451::
26452
26453      "round.dynamic"
26454      "round.tonearest"
26455      "round.downward"
26456      "round.upward"
26457      "round.towardzero"
26458      "round.tonearestaway"
26459
If this argument is "round.dynamic", optimization passes must assume that the
rounding mode is unknown and may change at runtime.  No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes.  If the argument is any of these values,
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.
26468
26469For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
26470"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
26471'x-0' should evaluate to '-0' when rounding downward.  However, this
26472transformation is legal for all other rounding modes.
26473
For values other than "round.dynamic", optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed.  Using a specific
non-dynamic rounding mode that does not match the actual rounding mode at
runtime results in undefined behavior.
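
The 'x-0' case can be written out as follows. This is an illustrative sketch
only; the function name ``@sub_zero`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fsub.f64(double, double,
                                                             metadata, metadata)

      ; @sub_zero is a hypothetical name used only for illustration.
      define double @sub_zero(double %x) strictfp {
      entry:
        ; Must not be folded to %x: if %x is +0.0, the result is -0.0 when
        ; rounding downward. With "round.tonearest" the fold would be legal.
        %r = call double @llvm.experimental.constrained.fsub.f64(
                 double %x, double 0.0,
                 metadata !"round.downward", metadata !"fpexcept.ignore") strictfp
        ret double %r
      }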
26479
The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:
26483
26484::
26485
26486      "fpexcept.ignore"
26487      "fpexcept.maytrap"
26488      "fpexcept.strict"
26489
If this argument is "fpexcept.ignore", optimization passes may assume that the
26491exception status flags will not be read and that floating-point exceptions will
26492be masked.  This allows transformations to be performed that may change the
26493exception semantics of the original code.  For example, FP operations may be
26494speculatively executed in this case whereas they must not be for either of the
26495other possible values of this argument.
26496
If the exception behavior argument is "fpexcept.maytrap", optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code.  For example, exceptions may be hidden by constant folding.
26503
If the exception behavior argument is "fpexcept.strict", all transformations must
26505strictly preserve the floating-point exception semantics of the original code.
26506Any FP exception that would have been raised by the original code must be raised
26507by the transformed code, and the transformed code must not raise any FP
26508exceptions that would not have been raised by the original code.  This is the
26509exception behavior argument that will be used if the code being compiled reads
26510the FP exception status flags, but this mode can also be used with code that
26511unmasks FP exceptions.
26512
The number and order of floating-point exceptions are NOT guaranteed.  For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.
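
To illustrate how the exception behavior argument limits speculation, consider
the following sketch. The function name ``@guarded_div`` and its control flow
are hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fdiv.f64(double, double,
                                                             metadata, metadata)

      ; @guarded_div is a hypothetical name used only for illustration.
      define double @guarded_div(double %n, double %d, i1 %do_div) strictfp {
      entry:
        br i1 %do_div, label %div, label %done

      div:
        ; With "fpexcept.strict" or "fpexcept.maytrap" this call must stay in
        ; the guarded block: hoisting it into %entry could raise an exception
        ; the original code would not raise. With "fpexcept.ignore" it could
        ; be speculated.
        %q = call double @llvm.experimental.constrained.fdiv.f64(
                 double %n, double %d,
                 metadata !"round.tonearest", metadata !"fpexcept.strict") strictfp
        br label %done

      done:
        %r = phi double [ %q, %div ], [ 0.0, %entry ]
        ret double %r
      }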
26517
Proper use of :ref:`function attributes <fnattrs>` is required for the
constrained intrinsics to function correctly.

All function *calls* made in a function that uses constrained floating-point
intrinsics must have the ``strictfp`` attribute either on the
calling instruction or on the declaration or definition of the function
being called.

All function *definitions* that use constrained floating-point intrinsics
must have the ``strictfp`` attribute.
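
A sketch of the attribute placement, assuming two hypothetical external
functions ``@helper`` and ``@log_value``:

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fadd.f64(double, double,
                                                             metadata, metadata)

      ; Hypothetical helper: strictfp on the declaration covers calls to it.
      declare double @helper(double) strictfp
      ; Hypothetical helper without the attribute on its declaration.
      declare void @log_value(double)

      ; The definition uses a constrained intrinsic, so it carries strictfp.
      define double @caller(double %a, double %b) strictfp {
      entry:
        %sum = call double @llvm.experimental.constrained.fadd.f64(
                   double %a, double %b,
                   metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ; Covered by the strictfp attribute on the declaration of @helper.
        %h = call double @helper(double %sum)
        ; The declaration lacks strictfp, so the call site provides it.
        call void @log_value(double %h) strictfp
        ret double %h
      }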
26528
26529'``llvm.experimental.constrained.fadd``' Intrinsic
26530^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26531
26532Syntax:
26533"""""""
26534
26535::
26536
26537      declare <type>
26538      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
26539                                          metadata <rounding mode>,
26540                                          metadata <exception behavior>)
26541
26542Overview:
26543"""""""""
26544
26545The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
26546two arguments.
26547
26548
26549Arguments:
26550""""""""""
26551
26552The first two arguments to the '``llvm.experimental.constrained.fadd``'
26553intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26554of floating-point values. Both arguments must have identical types.
26555
26556The third and fourth arguments specify the rounding mode and exception
26557behavior as described above.
26558
26559Semantics:
26560""""""""""
26561
26562The value produced is the floating-point sum of the two value arguments and has
26563the same type as the arguments.
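
As a minimal sketch, a vector addition might look like the following. The
function name ``@vadd`` is hypothetical. The ``fsub``, ``fmul``, ``fdiv`` and
``frem`` variants described below take the same argument list.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                              <4 x float>, <4 x float>, metadata, metadata)

      ; @vadd is a hypothetical name used only for illustration.
      define <4 x float> @vadd(<4 x float> %a, <4 x float> %b) strictfp {
      entry:
        %r = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                 <4 x float> %a, <4 x float> %b,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ret <4 x float> %r
      }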
26564
26565
26566'``llvm.experimental.constrained.fsub``' Intrinsic
26567^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26568
26569Syntax:
26570"""""""
26571
26572::
26573
26574      declare <type>
26575      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
26576                                          metadata <rounding mode>,
26577                                          metadata <exception behavior>)
26578
26579Overview:
26580"""""""""
26581
26582The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
26583of its two arguments.
26584
26585
26586Arguments:
26587""""""""""
26588
26589The first two arguments to the '``llvm.experimental.constrained.fsub``'
26590intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26591of floating-point values. Both arguments must have identical types.
26592
26593The third and fourth arguments specify the rounding mode and exception
26594behavior as described above.
26595
26596Semantics:
26597""""""""""
26598
26599The value produced is the floating-point difference of the two value arguments
26600and has the same type as the arguments.
26601
26602
26603'``llvm.experimental.constrained.fmul``' Intrinsic
26604^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26605
26606Syntax:
26607"""""""
26608
26609::
26610
26611      declare <type>
26612      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
26613                                          metadata <rounding mode>,
26614                                          metadata <exception behavior>)
26615
26616Overview:
26617"""""""""
26618
26619The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
26620its two arguments.
26621
26622
26623Arguments:
26624""""""""""
26625
26626The first two arguments to the '``llvm.experimental.constrained.fmul``'
26627intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26628of floating-point values. Both arguments must have identical types.
26629
26630The third and fourth arguments specify the rounding mode and exception
26631behavior as described above.
26632
26633Semantics:
26634""""""""""
26635
26636The value produced is the floating-point product of the two value arguments and
26637has the same type as the arguments.
26638
26639
26640'``llvm.experimental.constrained.fdiv``' Intrinsic
26641^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26642
26643Syntax:
26644"""""""
26645
26646::
26647
26648      declare <type>
26649      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
26650                                          metadata <rounding mode>,
26651                                          metadata <exception behavior>)
26652
26653Overview:
26654"""""""""
26655
26656The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
26657its two arguments.
26658
26659
26660Arguments:
26661""""""""""
26662
26663The first two arguments to the '``llvm.experimental.constrained.fdiv``'
26664intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26665of floating-point values. Both arguments must have identical types.
26666
26667The third and fourth arguments specify the rounding mode and exception
26668behavior as described above.
26669
26670Semantics:
26671""""""""""
26672
26673The value produced is the floating-point quotient of the two value arguments and
26674has the same type as the arguments.
26675
26676
26677'``llvm.experimental.constrained.frem``' Intrinsic
26678^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26679
26680Syntax:
26681"""""""
26682
26683::
26684
26685      declare <type>
26686      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
26687                                          metadata <rounding mode>,
26688                                          metadata <exception behavior>)
26689
26690Overview:
26691"""""""""
26692
26693The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
26694from the division of its two arguments.
26695
26696
26697Arguments:
26698""""""""""
26699
26700The first two arguments to the '``llvm.experimental.constrained.frem``'
26701intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26702of floating-point values. Both arguments must have identical types.
26703
26704The third and fourth arguments specify the rounding mode and exception
26705behavior as described above.  The rounding mode argument has no effect, since
26706the result of frem is never rounded, but the argument is included for
26707consistency with the other constrained floating-point intrinsics.
26708
26709Semantics:
26710""""""""""
26711
26712The value produced is the floating-point remainder from the division of the two
26713value arguments and has the same type as the arguments.  The remainder has the
26714same sign as the dividend.
26715
26716'``llvm.experimental.constrained.fma``' Intrinsic
26717^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26718
26719Syntax:
26720"""""""
26721
26722::
26723
      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)
26728
26729Overview:
26730"""""""""
26731
26732The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
26733fused-multiply-add operation on its arguments.
26734
26735Arguments:
26736""""""""""
26737
26738The first three arguments to the '``llvm.experimental.constrained.fma``'
26739intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
26740<t_vector>` of floating-point values. All arguments must have identical types.
26741
26742The fourth and fifth arguments specify the rounding mode and exception behavior
26743as described above.
26744
26745Semantics:
26746""""""""""
26747
26748The result produced is the product of the first two arguments added to the third
26749argument computed with infinite precision, and then rounded to the target
26750precision.
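
For illustration, a scalar call could be written as below. The function name
``@fused`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fma.f64(double, double, double,
                                                            metadata, metadata)

      ; @fused is a hypothetical name used only for illustration.
      define double @fused(double %a, double %b, double %c) strictfp {
      entry:
        ; Computes (%a * %b) + %c with a single rounding step.
        %r = call double @llvm.experimental.constrained.fma.f64(
                 double %a, double %b, double %c,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ret double %r
      }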
26751
26752'``llvm.experimental.constrained.fptoui``' Intrinsic
26753^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26754
26755Syntax:
26756"""""""
26757
26758::
26759
      declare <ty2>
      @llvm.experimental.constrained.fptoui(<type> <value>,
                                            metadata <exception behavior>)
26763
26764Overview:
26765"""""""""
26766
26767The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
26768floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
26769
26770Arguments:
26771""""""""""
26772
26773The first argument to the '``llvm.experimental.constrained.fptoui``'
26774intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26775<t_vector>` of floating point values.
26776
26777The second argument specifies the exception behavior as described above.
26778
26779Semantics:
26780""""""""""
26781
26782The result produced is an unsigned integer converted from the floating
26783point argument. The value is truncated, so it is rounded towards zero.
26784
26785'``llvm.experimental.constrained.fptosi``' Intrinsic
26786^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26787
26788Syntax:
26789"""""""
26790
26791::
26792
      declare <ty2>
      @llvm.experimental.constrained.fptosi(<type> <value>,
                                            metadata <exception behavior>)
26796
26797Overview:
26798"""""""""
26799
The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
:ref:`floating-point <t_floating>` ``value`` to its signed integer equivalent
of type ``ty2``.
26802
26803Arguments:
26804""""""""""
26805
26806The first argument to the '``llvm.experimental.constrained.fptosi``'
26807intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26808<t_vector>` of floating point values.
26809
26810The second argument specifies the exception behavior as described above.
26811
26812Semantics:
26813""""""""""
26814
26815The result produced is a signed integer converted from the floating
26816point argument. The value is truncated, so it is rounded towards zero.
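
A short sketch of both floating-point-to-integer conversions follows. The
function names are hypothetical. Note that these intrinsics take no rounding
mode argument.

.. code-block:: llvm

      declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
      declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)

      ; @to_signed and @to_unsigned are hypothetical names.
      define i32 @to_signed(double %x) strictfp {
      entry:
        %r = call i32 @llvm.experimental.constrained.fptosi.i32.f64(
                 double %x, metadata !"fpexcept.strict") strictfp
        ret i32 %r
      }

      define i32 @to_unsigned(double %x) strictfp {
      entry:
        %r = call i32 @llvm.experimental.constrained.fptoui.i32.f64(
                 double %x, metadata !"fpexcept.strict") strictfp
        ret i32 %r
      }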
26817
26818'``llvm.experimental.constrained.uitofp``' Intrinsic
26819^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26820
26821Syntax:
26822"""""""
26823
26824::
26825
      declare <ty2>
      @llvm.experimental.constrained.uitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
26830
26831Overview:
26832"""""""""
26833
The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point value of type ``ty2``.
26836
26837Arguments:
26838""""""""""
26839
26840The first argument to the '``llvm.experimental.constrained.uitofp``'
26841intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
26842<t_vector>` of integer values.
26843
26844The second and third arguments specify the rounding mode and exception
26845behavior as described above.
26846
26847Semantics:
26848""""""""""
26849
26850An inexact floating-point exception will be raised if rounding is required.
26851Any result produced is a floating point value converted from the input
26852integer argument.
26853
26854'``llvm.experimental.constrained.sitofp``' Intrinsic
26855^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26856
26857Syntax:
26858"""""""
26859
26860::
26861
      declare <ty2>
      @llvm.experimental.constrained.sitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
26866
26867Overview:
26868"""""""""
26869
The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.
26872
26873Arguments:
26874""""""""""
26875
26876The first argument to the '``llvm.experimental.constrained.sitofp``'
26877intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
26878<t_vector>` of integer values.
26879
26880The second and third arguments specify the rounding mode and exception
26881behavior as described above.
26882
26883Semantics:
26884""""""""""
26885
26886An inexact floating-point exception will be raised if rounding is required.
26887Any result produced is a floating point value converted from the input
26888integer argument.
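
The following sketch shows both integer-to-floating-point conversions. The
function names are hypothetical.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.uitofp.f32.i64(i64, metadata,
                                                                  metadata)
      declare double @llvm.experimental.constrained.sitofp.f64.i32(i32, metadata,
                                                                   metadata)

      ; @u64_to_float and @s32_to_double are hypothetical names.
      define float @u64_to_float(i64 %x) strictfp {
      entry:
        ; Not every i64 value is representable as a float, so this conversion
        ; may round and raise the inexact exception.
        %r = call float @llvm.experimental.constrained.uitofp.f32.i64(
                 i64 %x,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ret float %r
      }

      define double @s32_to_double(i32 %x) strictfp {
      entry:
        ; Every i32 value is exactly representable as a double.
        %r = call double @llvm.experimental.constrained.sitofp.f64.i32(
                 i32 %x,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ret double %r
      }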
26889
26890'``llvm.experimental.constrained.fptrunc``' Intrinsic
26891^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26892
26893Syntax:
26894"""""""
26895
26896::
26897
      declare <ty2>
      @llvm.experimental.constrained.fptrunc(<type> <value>,
                                             metadata <rounding mode>,
                                             metadata <exception behavior>)
26902
26903Overview:
26904"""""""""
26905
26906The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
26907to type ``ty2``.
26908
26909Arguments:
26910""""""""""
26911
26912The first argument to the '``llvm.experimental.constrained.fptrunc``'
26913intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26914<t_vector>` of floating point values. This argument must be larger in size
26915than the result.
26916
26917The second and third arguments specify the rounding mode and exception
26918behavior as described above.
26919
26920Semantics:
26921""""""""""
26922
26923The result produced is a floating point value truncated to be smaller in size
26924than the argument.
26925
26926'``llvm.experimental.constrained.fpext``' Intrinsic
26927^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26928
26929Syntax:
26930"""""""
26931
26932::
26933
      declare <ty2>
      @llvm.experimental.constrained.fpext(<type> <value>,
                                           metadata <exception behavior>)
26937
26938Overview:
26939"""""""""
26940
26941The '``llvm.experimental.constrained.fpext``' intrinsic extends a
26942floating-point ``value`` to a larger floating-point value.
26943
26944Arguments:
26945""""""""""
26946
26947The first argument to the '``llvm.experimental.constrained.fpext``'
26948intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26949<t_vector>` of floating point values. This argument must be smaller in size
26950than the result.
26951
26952The second argument specifies the exception behavior as described above.
26953
26954Semantics:
26955""""""""""
26956
26957The result produced is a floating point value extended to be larger in size
26958than the argument. All restrictions that apply to the fpext instruction also
26959apply to this intrinsic.
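
As an illustrative sketch, the function below narrows a value with
'``llvm.experimental.constrained.fptrunc``' and widens it again with
'``llvm.experimental.constrained.fpext``'. The function name
``@narrow_then_widen`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata,
                                                                   metadata)
      declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)

      ; @narrow_then_widen is a hypothetical name used only for illustration.
      define double @narrow_then_widen(double %x) strictfp {
      entry:
        ; Truncation takes both a rounding mode and an exception behavior.
        %n = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                 double %x,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ; Extension takes only the exception behavior argument.
        %w = call double @llvm.experimental.constrained.fpext.f64.f32(
                 float %n, metadata !"fpexcept.strict") strictfp
        ret double %w
      }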
26960
26961'``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
26962^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26963
26964Syntax:
26965"""""""
26966
26967::
26968
26969      declare <ty2>
26970      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
26971                                          metadata <condition code>,
26972                                          metadata <exception behavior>)
26973      declare <ty2>
26974      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
26975                                           metadata <condition code>,
26976                                           metadata <exception behavior>)
26977
26978Overview:
26979"""""""""
26980
The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on comparison of their arguments.
26984
26985If the arguments are floating-point scalars, then the result type is a
26986boolean (:ref:`i1 <t_integer>`).
26987
If the arguments are floating-point vectors, then the result type is a
vector of booleans with the same number of elements as the arguments being
compared.
26991
26992The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
26993comparison operation while the '``llvm.experimental.constrained.fcmps``'
26994intrinsic performs a signaling comparison operation.
26995
26996Arguments:
26997""""""""""
26998
26999The first two arguments to the '``llvm.experimental.constrained.fcmp``'
27000and '``llvm.experimental.constrained.fcmps``' intrinsics must be
27001:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
27002of floating-point values. Both arguments must have identical types.
27003
27004The third argument is the condition code indicating the kind of comparison
27005to perform. It must be a metadata string with one of the following values:
27006
27007.. _fcmp_md_cc:
27008
27009- "``oeq``": ordered and equal
27010- "``ogt``": ordered and greater than
27011- "``oge``": ordered and greater than or equal
27012- "``olt``": ordered and less than
27013- "``ole``": ordered and less than or equal
27014- "``one``": ordered and not equal
27015- "``ord``": ordered (no nans)
27016- "``ueq``": unordered or equal
27017- "``ugt``": unordered or greater than
27018- "``uge``": unordered or greater than or equal
27019- "``ult``": unordered or less than
27020- "``ule``": unordered or less than or equal
27021- "``une``": unordered or not equal
27022- "``uno``": unordered (either nans)
27023
27024*Ordered* means that neither argument is a NAN while *unordered* means
27025that either argument may be a NAN.
27026
27027The fourth argument specifies the exception behavior as described above.
27028
27029Semantics:
27030""""""""""
27031
27032``op1`` and ``op2`` are compared according to the condition code given
27033as the third argument. If the arguments are vectors, then the
27034vectors are compared element by element. Each comparison performed
27035always yields an :ref:`i1 <t_integer>` result, as follows:
27036
27037.. _fcmp_md_cc_sem:
27038
27039- "``oeq``": yields ``true`` if both arguments are not a NAN and ``op1``
27040  is equal to ``op2``.
27041- "``ogt``": yields ``true`` if both arguments are not a NAN and ``op1``
27042  is greater than ``op2``.
27043- "``oge``": yields ``true`` if both arguments are not a NAN and ``op1``
27044  is greater than or equal to ``op2``.
27045- "``olt``": yields ``true`` if both arguments are not a NAN and ``op1``
27046  is less than ``op2``.
27047- "``ole``": yields ``true`` if both arguments are not a NAN and ``op1``
27048  is less than or equal to ``op2``.
27049- "``one``": yields ``true`` if both arguments are not a NAN and ``op1``
27050  is not equal to ``op2``.
27051- "``ord``": yields ``true`` if both arguments are not a NAN.
27052- "``ueq``": yields ``true`` if either argument is a NAN or ``op1`` is
27053  equal to ``op2``.
27054- "``ugt``": yields ``true`` if either argument is a NAN or ``op1`` is
27055  greater than ``op2``.
27056- "``uge``": yields ``true`` if either argument is a NAN or ``op1`` is
27057  greater than or equal to ``op2``.
27058- "``ult``": yields ``true`` if either argument is a NAN or ``op1`` is
27059  less than ``op2``.
27060- "``ule``": yields ``true`` if either argument is a NAN or ``op1`` is
27061  less than or equal to ``op2``.
27062- "``une``": yields ``true`` if either argument is a NAN or ``op1`` is
27063  not equal to ``op2``.
27064- "``uno``": yields ``true`` if either argument is a NAN.
27065
The quiet comparison operation performed by
'``llvm.experimental.constrained.fcmp``' will only raise an exception
if either argument is a SNAN.  The signaling comparison operation
performed by '``llvm.experimental.constrained.fcmps``' will raise an
exception if either argument is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g., the exception might only
set a flag); therefore, the distinction between ordered and unordered
comparisons is also relevant for the
'``llvm.experimental.constrained.fcmps``' intrinsic.
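
The sketch below contrasts the two intrinsics. The function names are
hypothetical.

.. code-block:: llvm

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double,
                                                         metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double,
                                                          metadata, metadata)

      ; @either_is_nan and @less_than are hypothetical names.
      define i1 @either_is_nan(double %a, double %b) strictfp {
      entry:
        ; Quiet comparison: raises an exception only if an argument is a SNAN.
        %r = call i1 @llvm.experimental.constrained.fcmp.f64(
                 double %a, double %b,
                 metadata !"uno", metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }

      define i1 @less_than(double %a, double %b) strictfp {
      entry:
        ; Signaling comparison: raises an exception if either argument is a
        ; NAN, and still yields false because "olt" is an ordered predicate.
        %r = call i1 @llvm.experimental.constrained.fcmps.f64(
                 double %a, double %b,
                 metadata !"olt", metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }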
27075
27076'``llvm.experimental.constrained.fmuladd``' Intrinsic
27077^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27078
27079Syntax:
27080"""""""
27081
27082::
27083
27084      declare <type>
27085      @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
27086                                             <type> <op3>,
27087                                             metadata <rounding mode>,
27088                                             metadata <exception behavior>)
27089
27090Overview:
27091"""""""""
27092
27093The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
27094multiply-add expressions that can be fused if the code generator determines
27095that (a) the target instruction set has support for a fused operation,
27096and (b) that the fused operation is more efficient than the equivalent,
27097separate pair of mul and add instructions.
27098
27099Arguments:
27100""""""""""
27101
27102The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
27103intrinsic must be floating-point or vector of floating-point values.
27104All three arguments must have identical types.
27105
27106The fourth and fifth arguments specify the rounding mode and exception behavior
27107as described above.
27108
27109Semantics:
27110""""""""""
27111
27112The expression:
27113
27114::
27115
      %0 = call float @llvm.experimental.constrained.fmuladd.f32(float %a, float %b, float %c,
                                                                 metadata <rounding mode>,
                                                                 metadata <exception behavior>)
27119
27120is equivalent to the expression:
27121
27122::
27123
      %0 = call float @llvm.experimental.constrained.fmul.f32(float %a, float %b,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)
      %1 = call float @llvm.experimental.constrained.fadd.f32(float %0, float %c,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)
27130
27131except that it is unspecified whether rounding will be performed between the
27132multiplication and addition steps. Fusion is not guaranteed, even if the target
27133platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets
``errno``.
27138
27139Constrained libm-equivalent Intrinsics
27140--------------------------------------
27141
27142In addition to the basic floating-point operations for which constrained
27143intrinsics are described above, there are constrained versions of various
27144operations which provide equivalent behavior to a corresponding libm function.
27145These intrinsics allow the precise behavior of these operations with respect to
27146rounding mode and exception behavior to be controlled.
27147
27148As with the basic constrained floating-point intrinsics, the rounding mode
27149and exception behavior arguments only control the behavior of the optimizer.
27150They do not change the runtime floating-point environment.
27151
27152
27153'``llvm.experimental.constrained.sqrt``' Intrinsic
27154^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27155
27156Syntax:
27157"""""""
27158
27159::
27160
27161      declare <type>
27162      @llvm.experimental.constrained.sqrt(<type> <op1>,
27163                                          metadata <rounding mode>,
27164                                          metadata <exception behavior>)
27165
27166Overview:
27167"""""""""
27168
27169The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
27170of the specified value, returning the same value as the libm '``sqrt``'
27171functions would, but without setting ``errno``.
27172
27173Arguments:
27174""""""""""
27175
27176The first argument and the return type are floating-point numbers of the same
27177type.
27178
27179The second and third arguments specify the rounding mode and exception
27180behavior as described above.
27181
27182Semantics:
27183""""""""""
27184
27185This function returns the nonnegative square root of the specified value.
27186If the value is less than negative zero, a floating-point exception occurs
27187and the return value is architecture specific.
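
A minimal usage sketch follows. The function name ``@root`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.sqrt.f64(double, metadata,
                                                             metadata)

      ; @root is a hypothetical name used only for illustration.
      define double @root(double %x) strictfp {
      entry:
        ; With "fpexcept.strict", an exception raised for a negative %x must
        ; be preserved by any transformation.
        %r = call double @llvm.experimental.constrained.sqrt.f64(
                 double %x,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ret double %r
      }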
27188
27189
27190'``llvm.experimental.constrained.pow``' Intrinsic
27191^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27192
27193Syntax:
27194"""""""
27195
27196::
27197
27198      declare <type>
27199      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
27200                                         metadata <rounding mode>,
27201                                         metadata <exception behavior>)
27202
27203Overview:
27204"""""""""
27205
27206The '``llvm.experimental.constrained.pow``' intrinsic returns the first argument
27207raised to the (positive or negative) power specified by the second argument.
27208
27209Arguments:
27210""""""""""
27211
27212The first two arguments and the return value are floating-point numbers of the
27213same type.  The second argument specifies the power to which the first argument
27214should be raised.
27215
27216The third and fourth arguments specify the rounding mode and exception
27217behavior as described above.
27218
27219Semantics:
27220""""""""""
27221
This function returns the first argument raised to the power given by the
second argument, returning the same values as the libm ``pow`` functions
would, and handles error conditions in the same way.
27225
27226
27227'``llvm.experimental.constrained.powi``' Intrinsic
27228^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27229
27230Syntax:
27231"""""""
27232
27233::
27234
27235      declare <type>
27236      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
27237                                          metadata <rounding mode>,
27238                                          metadata <exception behavior>)
27239
27240Overview:
27241"""""""""
27242
27243The '``llvm.experimental.constrained.powi``' intrinsic returns the first argument
27244raised to the (positive or negative) power specified by the second argument. The
27245order of evaluation of multiplications is not defined. When a vector of
27246floating-point type is used, the second argument remains a scalar integer value.
27247
27248
27249Arguments:
27250""""""""""
27251
27252The first argument and the return value are floating-point numbers of the same
27253type.  The second argument is a 32-bit signed integer specifying the power to
27254which the first argument should be raised.
27255
27256The third and fourth arguments specify the rounding mode and exception
27257behavior as described above.
27258
27259Semantics:
27260""""""""""
27261
This function returns the first argument raised to the power given by the
second argument, computed with an unspecified sequence of rounding operations.
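
For illustration, cubing a value might be written as below. The function name
``@cube`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.powi.f64(double, i32,
                                                             metadata, metadata)

      ; @cube is a hypothetical name used only for illustration.
      define double @cube(double %x) strictfp {
      entry:
        ; The exponent is a 32-bit signed integer, not a floating-point value.
        %r = call double @llvm.experimental.constrained.powi.f64(
                 double %x, i32 3,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ret double %r
      }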
27264
27265
27266'``llvm.experimental.constrained.ldexp``' Intrinsic
27267^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27268
27269Syntax:
27270"""""""
27271
27272::
27273
      declare <type0>
      @llvm.experimental.constrained.ldexp(<type0> <op1>, <type1> <op2>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)
27278
27279Overview:
27280"""""""""
27281
The '``llvm.experimental.constrained.ldexp``' intrinsic performs the ldexp
operation.
27283
27284
27285Arguments:
27286""""""""""
27287
The first argument and the return value are :ref:`floating-point
<t_floating>` or :ref:`vector <t_vector>` of floating-point values of
the same type. The second argument is an integer, or a vector of integers
with the same number of elements as the first argument.
27292
27293
27294The third and fourth arguments specify the rounding mode and exception
27295behavior as described above.
27296
27297Semantics:
27298""""""""""
27299
This function multiplies the first argument by 2 raised to the second
argument's power. If the first argument is NaN or infinite, the same
value is returned. If the result underflows, a zero with the same sign
is returned. If the result overflows, the result is an infinity with
the same sign.
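
A brief sketch with a scalar ``double`` and an ``i32`` exponent follows. The
function name ``@scale_by_16`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.ldexp.f64.i32(double, i32,
                                                                  metadata, metadata)

      ; @scale_by_16 is a hypothetical name used only for illustration.
      define double @scale_by_16(double %x) strictfp {
      entry:
        ; Computes %x * 2^4. For example, an input of 1.5 yields 24.0.
        %r = call double @llvm.experimental.constrained.ldexp.f64.i32(
                 double %x, i32 4,
                 metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
        ret double %r
      }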
27305
27306
27307'``llvm.experimental.constrained.sin``' Intrinsic
27308^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27309
27310Syntax:
27311"""""""
27312
27313::
27314
27315      declare <type>
27316      @llvm.experimental.constrained.sin(<type> <op1>,
27317                                         metadata <rounding mode>,
27318                                         metadata <exception behavior>)
27319
27320Overview:
27321"""""""""
27322
27323The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
27324first argument.
27325
27326Arguments:
27327""""""""""
27328
27329The first argument and the return type are floating-point numbers of the same
27330type.
27331
27332The second and third arguments specify the rounding mode and exception
27333behavior as described above.
27334
27335Semantics:
27336""""""""""
27337
27338This function returns the sine of the specified argument, returning the
27339same values as the libm ``sin`` functions would, and handles error
27340conditions in the same way.
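
The following sketch shows one possible call for the ``f32`` overload. The
function name ``@sine_of`` is hypothetical, and the choice of
``round.tonearest`` with ``fpexcept.ignore`` is only an illustrative
assumption about the surrounding floating-point environment:

.. code-block:: llvm

      declare float @llvm.experimental.constrained.sin.f32(float, metadata, metadata)

      ; Hypothetical caller that assumes round-to-nearest and ignores FP exceptions.
      define float @sine_of(float %x) #0 {
        %r = call float @llvm.experimental.constrained.sin.f32(
                 float %x,
                 metadata !"round.tonearest",
                 metadata !"fpexcept.ignore") #0
        ret float %r
      }

      attributes #0 = { strictfp }

The same call shape applies to the other unary constrained intrinsics in this
family that take both a rounding mode and an exception behavior argument.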
27341
27342
27343'``llvm.experimental.constrained.cos``' Intrinsic
27344^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27345
27346Syntax:
27347"""""""
27348
27349::
27350
27351      declare <type>
27352      @llvm.experimental.constrained.cos(<type> <op1>,
27353                                         metadata <rounding mode>,
27354                                         metadata <exception behavior>)
27355
27356Overview:
27357"""""""""
27358
27359The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
27360first argument.
27361
27362Arguments:
27363""""""""""
27364
27365The first argument and the return type are floating-point numbers of the same
27366type.
27367
27368The second and third arguments specify the rounding mode and exception
27369behavior as described above.
27370
27371Semantics:
27372""""""""""
27373
27374This function returns the cosine of the specified argument, returning the
27375same values as the libm ``cos`` functions would, and handles error
27376conditions in the same way.
27377
27378
27379'``llvm.experimental.constrained.tan``' Intrinsic
27380^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27381
27382Syntax:
27383"""""""
27384
27385::
27386
27387      declare <type>
27388      @llvm.experimental.constrained.tan(<type> <op1>,
27389                                         metadata <rounding mode>,
27390                                         metadata <exception behavior>)
27391
27392Overview:
27393"""""""""
27394
27395The '``llvm.experimental.constrained.tan``' intrinsic returns the tangent of the
27396first argument.
27397
27398Arguments:
27399""""""""""
27400
27401The first argument and the return type are floating-point numbers of the same
27402type.
27403
27404The second and third arguments specify the rounding mode and exception
27405behavior as described above.
27406
27407Semantics:
27408""""""""""
27409
27410This function returns the tangent of the specified argument, returning the
27411same values as the libm ``tan`` functions would, and handles error
27412conditions in the same way.
27413
27414'``llvm.experimental.constrained.asin``' Intrinsic
27415^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27416
27417Syntax:
27418"""""""
27419
27420::
27421
27422      declare <type>
27423      @llvm.experimental.constrained.asin(<type> <op1>,
27424                                          metadata <rounding mode>,
27425                                          metadata <exception behavior>)
27426
27427Overview:
27428"""""""""
27429
27430The '``llvm.experimental.constrained.asin``' intrinsic returns the arcsine of the
27431first operand.
27432
27433Arguments:
27434""""""""""
27435
27436The first argument and the return type are floating-point numbers of the same
27437type.
27438
27439The second and third arguments specify the rounding mode and exception
27440behavior as described above.
27441
27442Semantics:
27443""""""""""
27444
27445This function returns the arcsine of the specified operand, returning the
27446same values as the libm ``asin`` functions would, and handles error
27447conditions in the same way.
27448
27449
27450'``llvm.experimental.constrained.acos``' Intrinsic
27451^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27452
27453Syntax:
27454"""""""
27455
27456::
27457
27458      declare <type>
27459      @llvm.experimental.constrained.acos(<type> <op1>,
27460                                          metadata <rounding mode>,
27461                                          metadata <exception behavior>)
27462
27463Overview:
27464"""""""""
27465
27466The '``llvm.experimental.constrained.acos``' intrinsic returns the arccosine of the
27467first operand.
27468
27469Arguments:
27470""""""""""
27471
27472The first argument and the return type are floating-point numbers of the same
27473type.
27474
27475The second and third arguments specify the rounding mode and exception
27476behavior as described above.
27477
27478Semantics:
27479""""""""""
27480
27481This function returns the arccosine of the specified operand, returning the
27482same values as the libm ``acos`` functions would, and handles error
27483conditions in the same way.
27484
27485
27486'``llvm.experimental.constrained.atan``' Intrinsic
27487^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27488
27489Syntax:
27490"""""""
27491
27492::
27493
27494      declare <type>
27495      @llvm.experimental.constrained.atan(<type> <op1>,
27496                                          metadata <rounding mode>,
27497                                          metadata <exception behavior>)
27498
27499Overview:
27500"""""""""
27501
27502The '``llvm.experimental.constrained.atan``' intrinsic returns the arctangent of the
27503first operand.
27504
27505Arguments:
27506""""""""""
27507
27508The first argument and the return type are floating-point numbers of the same
27509type.
27510
27511The second and third arguments specify the rounding mode and exception
27512behavior as described above.
27513
27514Semantics:
27515""""""""""
27516
27517This function returns the arctangent of the specified operand, returning the
27518same values as the libm ``atan`` functions would, and handles error
27519conditions in the same way.
27520
27521'``llvm.experimental.constrained.atan2``' Intrinsic
27522^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27523
27524Syntax:
27525"""""""
27526
27527::
27528
27529      declare <type>
27530      @llvm.experimental.constrained.atan2(<type> <op1>,
27531                                           <type> <op2>,
27532                                           metadata <rounding mode>,
27533                                           metadata <exception behavior>)
27534
27535Overview:
27536"""""""""
27537
27538The '``llvm.experimental.constrained.atan2``' intrinsic returns the arctangent
27539of ``<op1>`` divided by ``<op2>`` accounting for the quadrant.
27540
27541Arguments:
27542""""""""""
27543
27544The first two arguments and the return value are floating-point numbers of the
27545same type.
27546
27547The third and fourth arguments specify the rounding mode and exception
27548behavior as described above.
27549
27550Semantics:
27551""""""""""
27552
27553This function returns the quadrant-specific arctangent using the specified
27554operands, returning the same values as the libm ``atan2`` functions would, and
27555handles error conditions in the same way.
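
A minimal sketch of a two-operand call, assuming the ``f64`` overload; the
function name ``@angle_of`` and the value names ``%y`` and ``%x`` are
hypothetical. As with libm ``atan2``, the first operand is the numerator:

.. code-block:: llvm

      declare double @llvm.experimental.constrained.atan2.f64(double, double, metadata, metadata)

      ; Hypothetical helper: angle of the point (%x, %y), i.e. atan2(%y, %x).
      define double @angle_of(double %y, double %x) #0 {
        %r = call double @llvm.experimental.constrained.atan2.f64(
                 double %y, double %x,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }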
27556
27557'``llvm.experimental.constrained.sinh``' Intrinsic
27558^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27559
27560Syntax:
27561"""""""
27562
27563::
27564
27565      declare <type>
27566      @llvm.experimental.constrained.sinh(<type> <op1>,
27567                                          metadata <rounding mode>,
27568                                          metadata <exception behavior>)
27569
27570Overview:
27571"""""""""
27572
27573The '``llvm.experimental.constrained.sinh``' intrinsic returns the hyperbolic sine of the
27574first operand.
27575
27576Arguments:
27577""""""""""
27578
27579The first argument and the return type are floating-point numbers of the same
27580type.
27581
27582The second and third arguments specify the rounding mode and exception
27583behavior as described above.
27584
27585Semantics:
27586""""""""""
27587
27588This function returns the hyperbolic sine of the specified operand, returning the
27589same values as the libm ``sinh`` functions would, and handles error
27590conditions in the same way.
27591
27592
27593'``llvm.experimental.constrained.cosh``' Intrinsic
27594^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27595
27596Syntax:
27597"""""""
27598
27599::
27600
27601      declare <type>
27602      @llvm.experimental.constrained.cosh(<type> <op1>,
27603                                          metadata <rounding mode>,
27604                                          metadata <exception behavior>)
27605
27606Overview:
27607"""""""""
27608
27609The '``llvm.experimental.constrained.cosh``' intrinsic returns the hyperbolic cosine of the
27610first operand.
27611
27612Arguments:
27613""""""""""
27614
27615The first argument and the return type are floating-point numbers of the same
27616type.
27617
27618The second and third arguments specify the rounding mode and exception
27619behavior as described above.
27620
27621Semantics:
27622""""""""""
27623
27624This function returns the hyperbolic cosine of the specified operand, returning the
27625same values as the libm ``cosh`` functions would, and handles error
27626conditions in the same way.
27627
27628
27629'``llvm.experimental.constrained.tanh``' Intrinsic
27630^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27631
27632Syntax:
27633"""""""
27634
27635::
27636
27637      declare <type>
27638      @llvm.experimental.constrained.tanh(<type> <op1>,
27639                                          metadata <rounding mode>,
27640                                          metadata <exception behavior>)
27641
27642Overview:
27643"""""""""
27644
27645The '``llvm.experimental.constrained.tanh``' intrinsic returns the hyperbolic tangent of the
27646first operand.
27647
27648Arguments:
27649""""""""""
27650
27651The first argument and the return type are floating-point numbers of the same
27652type.
27653
27654The second and third arguments specify the rounding mode and exception
27655behavior as described above.
27656
27657Semantics:
27658""""""""""
27659
27660This function returns the hyperbolic tangent of the specified operand, returning the
27661same values as the libm ``tanh`` functions would, and handles error
27662conditions in the same way.
27663
27664'``llvm.experimental.constrained.exp``' Intrinsic
27665^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27666
27667Syntax:
27668"""""""
27669
27670::
27671
27672      declare <type>
27673      @llvm.experimental.constrained.exp(<type> <op1>,
27674                                         metadata <rounding mode>,
27675                                         metadata <exception behavior>)
27676
27677Overview:
27678"""""""""
27679
27680The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
27681exponential of the specified value.
27682
27683Arguments:
27684""""""""""
27685
27686The first argument and the return value are floating-point numbers of the same
27687type.
27688
27689The second and third arguments specify the rounding mode and exception
27690behavior as described above.
27691
27692Semantics:
27693""""""""""
27694
27695This function returns the same values as the libm ``exp`` functions
27696would, and handles error conditions in the same way.
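
The sketch below assumes a vector overload (``v4f32``) to illustrate the name
mangling for vector types; the function name ``@exp_lanes`` is hypothetical,
and the metadata values are only one possible choice:

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.exp.v4f32(<4 x float>, metadata, metadata)

      ; Hypothetical helper: applies exp to each of the four lanes.
      define <4 x float> @exp_lanes(<4 x float> %v) #0 {
        %r = call <4 x float> @llvm.experimental.constrained.exp.v4f32(
                 <4 x float> %v,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.maytrap") #0
        ret <4 x float> %r
      }

      attributes #0 = { strictfp }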
27697
27698
27699'``llvm.experimental.constrained.exp2``' Intrinsic
27700^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27701
27702Syntax:
27703"""""""
27704
27705::
27706
27707      declare <type>
27708      @llvm.experimental.constrained.exp2(<type> <op1>,
27709                                          metadata <rounding mode>,
27710                                          metadata <exception behavior>)
27711
27712Overview:
27713"""""""""
27714
27715The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
27716exponential of the specified value.
27717
27718
27719Arguments:
27720""""""""""
27721
27722The first argument and the return value are floating-point numbers of the same
27723type.
27724
27725The second and third arguments specify the rounding mode and exception
27726behavior as described above.
27727
27728Semantics:
27729""""""""""
27730
27731This function returns the same values as the libm ``exp2`` functions
27732would, and handles error conditions in the same way.
27733
27734
27735'``llvm.experimental.constrained.log``' Intrinsic
27736^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27737
27738Syntax:
27739"""""""
27740
27741::
27742
27743      declare <type>
27744      @llvm.experimental.constrained.log(<type> <op1>,
27745                                         metadata <rounding mode>,
27746                                         metadata <exception behavior>)
27747
27748Overview:
27749"""""""""
27750
27751The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
27752logarithm of the specified value.
27753
27754Arguments:
27755""""""""""
27756
27757The first argument and the return value are floating-point numbers of the same
27758type.
27759
27760The second and third arguments specify the rounding mode and exception
27761behavior as described above.
27762
27763
27764Semantics:
27765""""""""""
27766
27767This function returns the same values as the libm ``log`` functions
27768would, and handles error conditions in the same way.
27769
27770
27771'``llvm.experimental.constrained.log10``' Intrinsic
27772^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27773
27774Syntax:
27775"""""""
27776
27777::
27778
27779      declare <type>
27780      @llvm.experimental.constrained.log10(<type> <op1>,
27781                                           metadata <rounding mode>,
27782                                           metadata <exception behavior>)
27783
27784Overview:
27785"""""""""
27786
27787The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
27788logarithm of the specified value.
27789
27790Arguments:
27791""""""""""
27792
27793The first argument and the return value are floating-point numbers of the same
27794type.
27795
27796The second and third arguments specify the rounding mode and exception
27797behavior as described above.
27798
27799Semantics:
27800""""""""""
27801
27802This function returns the same values as the libm ``log10`` functions
27803would, and handles error conditions in the same way.
27804
27805
27806'``llvm.experimental.constrained.log2``' Intrinsic
27807^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27808
27809Syntax:
27810"""""""
27811
27812::
27813
27814      declare <type>
27815      @llvm.experimental.constrained.log2(<type> <op1>,
27816                                          metadata <rounding mode>,
27817                                          metadata <exception behavior>)
27818
27819Overview:
27820"""""""""
27821
27822The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
27823logarithm of the specified value.
27824
27825Arguments:
27826""""""""""
27827
27828The first argument and the return value are floating-point numbers of the same
27829type.
27830
27831The second and third arguments specify the rounding mode and exception
27832behavior as described above.
27833
27834Semantics:
27835""""""""""
27836
27837This function returns the same values as the libm ``log2`` functions
27838would, and handles error conditions in the same way.
27839
27840
27841'``llvm.experimental.constrained.rint``' Intrinsic
27842^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27843
27844Syntax:
27845"""""""
27846
27847::
27848
27849      declare <type>
27850      @llvm.experimental.constrained.rint(<type> <op1>,
27851                                          metadata <rounding mode>,
27852                                          metadata <exception behavior>)
27853
27854Overview:
27855"""""""""
27856
27857The '``llvm.experimental.constrained.rint``' intrinsic returns the first
27858argument rounded to the nearest integer. It may raise an inexact floating-point
27859exception if the argument is not an integer.
27860
27861Arguments:
27862""""""""""
27863
27864The first argument and the return value are floating-point numbers of the same
27865type.
27866
27867The second and third arguments specify the rounding mode and exception
27868behavior as described above.
27869
27870Semantics:
27871""""""""""
27872
27873This function returns the same values as the libm ``rint`` functions
27874would, and handles error conditions in the same way.  The rounding mode is
27875described, not determined, by the rounding mode argument.  The actual rounding
27876mode is determined by the runtime floating-point environment.  The rounding
27877mode argument is only intended as information to the compiler.
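
Because the rounding mode argument only describes the environment, a caller
that does not know the current mode can say so with ``round.dynamic``. The
sketch below assumes the ``f64`` overload and a hypothetical function name
``@round_current_mode``:

.. code-block:: llvm

      declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)

      ; "round.dynamic" tells the compiler that the rounding mode is unknown at
      ; compile time; the result follows the runtime floating-point environment.
      define double @round_current_mode(double %x) #0 {
        %r = call double @llvm.experimental.constrained.rint.f64(
                 double %x,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }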
27878
27879
27880'``llvm.experimental.constrained.lrint``' Intrinsic
27881^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27882
27883Syntax:
27884"""""""
27885
27886::
27887
27888      declare <inttype>
27889      @llvm.experimental.constrained.lrint(<fptype> <op1>,
27890                                           metadata <rounding mode>,
27891                                           metadata <exception behavior>)
27892
27893Overview:
27894"""""""""
27895
27896The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
27897argument rounded to the nearest integer. An inexact floating-point exception
27898will be raised if the argument is not an integer. An invalid exception is
27899raised if the result is too large to fit into a supported integer type,
27900and in this case the result is undefined.
27901
27902Arguments:
27903""""""""""
27904
27905The first argument is a floating-point number. The return value is an
27906integer type. Not all types are supported on all targets. The supported
27907types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
27908libm functions.
27909
27910The second and third arguments specify the rounding mode and exception
27911behavior as described above.
27912
27913Semantics:
27914""""""""""
27915
27916This function returns the same values as the libm ``lrint`` functions
27917would, and handles error conditions in the same way.
27918
27919The rounding mode is described, not determined, by the rounding mode
27920argument.  The actual rounding mode is determined by the runtime floating-point
27921environment.  The rounding mode argument is only intended as information
27922to the compiler.
27923
If the runtime floating-point environment is using the default rounding mode,
the results are the same as those of the ``llvm.lrint`` intrinsic.
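
A minimal sketch, assuming an ``i64`` result and an ``f64`` operand (whether
this combination is supported depends on the target, as noted above); the
function name ``@to_long`` is hypothetical:

.. code-block:: llvm

      declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

      ; Hypothetical helper: converts %x to an integer using the runtime
      ; rounding mode.
      define i64 @to_long(double %x) #0 {
        %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                 double %x,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret i64 %r
      }

      attributes #0 = { strictfp }

The overloaded name carries both the result type and the operand type, in
that order.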
27926
27927
27928'``llvm.experimental.constrained.llrint``' Intrinsic
27929^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27930
27931Syntax:
27932"""""""
27933
27934::
27935
27936      declare <inttype>
27937      @llvm.experimental.constrained.llrint(<fptype> <op1>,
27938                                            metadata <rounding mode>,
27939                                            metadata <exception behavior>)
27940
27941Overview:
27942"""""""""
27943
27944The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
27945argument rounded to the nearest integer. An inexact floating-point exception
27946will be raised if the argument is not an integer. An invalid exception is
27947raised if the result is too large to fit into a supported integer type,
27948and in this case the result is undefined.
27949
27950Arguments:
27951""""""""""
27952
27953The first argument is a floating-point number. The return value is an
27954integer type. Not all types are supported on all targets. The supported
27955types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
27956libm functions.
27957
27958The second and third arguments specify the rounding mode and exception
27959behavior as described above.
27960
27961Semantics:
27962""""""""""
27963
27964This function returns the same values as the libm ``llrint`` functions
27965would, and handles error conditions in the same way.
27966
27967The rounding mode is described, not determined, by the rounding mode
27968argument.  The actual rounding mode is determined by the runtime floating-point
27969environment.  The rounding mode argument is only intended as information
27970to the compiler.
27971
If the runtime floating-point environment is using the default rounding mode,
the results are the same as those of the ``llvm.llrint`` intrinsic.
27974
27975
27976'``llvm.experimental.constrained.nearbyint``' Intrinsic
27977^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27978
27979Syntax:
27980"""""""
27981
27982::
27983
27984      declare <type>
27985      @llvm.experimental.constrained.nearbyint(<type> <op1>,
27986                                               metadata <rounding mode>,
27987                                               metadata <exception behavior>)
27988
27989Overview:
27990"""""""""
27991
27992The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
27993argument rounded to the nearest integer. It will not raise an inexact
27994floating-point exception if the argument is not an integer.
27995
27996
27997Arguments:
27998""""""""""
27999
28000The first argument and the return value are floating-point numbers of the same
28001type.
28002
28003The second and third arguments specify the rounding mode and exception
28004behavior as described above.
28005
28006Semantics:
28007""""""""""
28008
28009This function returns the same values as the libm ``nearbyint`` functions
28010would, and handles error conditions in the same way.  The rounding mode is
28011described, not determined, by the rounding mode argument.  The actual rounding
28012mode is determined by the runtime floating-point environment.  The rounding
28013mode argument is only intended as information to the compiler.
28014
28015
28016'``llvm.experimental.constrained.maxnum``' Intrinsic
28017^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28018
28019Syntax:
28020"""""""
28021
28022::
28023
      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
28027
28028Overview:
28029"""""""""
28030
28031The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
28032of the two arguments.
28033
28034Arguments:
28035""""""""""
28036
28037The first two arguments and the return value are floating-point numbers
28038of the same type.
28039
28040The third argument specifies the exception behavior as described above.
28041
28042Semantics:
28043""""""""""
28044
28045This function follows the IEEE-754 semantics for maxNum.
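
Unlike the intrinsics that round, this one takes only an exception behavior
argument. A minimal sketch, assuming the ``f64`` overload and a hypothetical
function name ``@larger_of``:

.. code-block:: llvm

      declare double @llvm.experimental.constrained.maxnum.f64(double, double, metadata)

      ; Hypothetical helper: no rounding mode metadata is passed.
      define double @larger_of(double %a, double %b) #0 {
        %r = call double @llvm.experimental.constrained.maxnum.f64(
                 double %a, double %b,
                 metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }

The ``minnum``, ``maximum``, and ``minimum`` variants are called with the same
argument layout.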
28046
28047
28048'``llvm.experimental.constrained.minnum``' Intrinsic
28049^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28050
28051Syntax:
28052"""""""
28053
28054::
28055
      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
28059
28060Overview:
28061"""""""""
28062
28063The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
28064of the two arguments.
28065
28066Arguments:
28067""""""""""
28068
28069The first two arguments and the return value are floating-point numbers
28070of the same type.
28071
28072The third argument specifies the exception behavior as described above.
28073
28074Semantics:
28075""""""""""
28076
28077This function follows the IEEE-754 semantics for minNum.
28078
28079
28080'``llvm.experimental.constrained.maximum``' Intrinsic
28081^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28082
28083Syntax:
28084"""""""
28085
28086::
28087
      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
28091
28092Overview:
28093"""""""""
28094
28095The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
28096of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
28097
28098Arguments:
28099""""""""""
28100
28101The first two arguments and the return value are floating-point numbers
28102of the same type.
28103
28104The third argument specifies the exception behavior as described above.
28105
28106Semantics:
28107""""""""""
28108
28109This function follows semantics specified in the draft of IEEE 754-2019.
28110
28111
28112'``llvm.experimental.constrained.minimum``' Intrinsic
28113^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28114
28115Syntax:
28116"""""""
28117
28118::
28119
      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
28123
28124Overview:
28125"""""""""
28126
28127The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
28128of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
28129
28130Arguments:
28131""""""""""
28132
28133The first two arguments and the return value are floating-point numbers
28134of the same type.
28135
28136The third argument specifies the exception behavior as described above.
28137
28138Semantics:
28139""""""""""
28140
28141This function follows semantics specified in the draft of IEEE 754-2019.
28142
28143
28144'``llvm.experimental.constrained.ceil``' Intrinsic
28145^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28146
28147Syntax:
28148"""""""
28149
28150::
28151
28152      declare <type>
28153      @llvm.experimental.constrained.ceil(<type> <op1>,
28154                                          metadata <exception behavior>)
28155
28156Overview:
28157"""""""""
28158
28159The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
28160first argument.
28161
28162Arguments:
28163""""""""""
28164
28165The first argument and the return value are floating-point numbers of the same
28166type.
28167
28168The second argument specifies the exception behavior as described above.
28169
28170Semantics:
28171""""""""""
28172
28173This function returns the same values as the libm ``ceil`` functions
28174would and handles error conditions in the same way.
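
Since the rounding direction is fixed by the operation itself, only the
exception behavior is supplied. A minimal sketch, assuming the ``f64``
overload and a hypothetical function name ``@round_up``:

.. code-block:: llvm

      declare double @llvm.experimental.constrained.ceil.f64(double, metadata)

      ; Hypothetical helper: a single metadata argument follows the operand.
      define double @round_up(double %x) #0 {
        %r = call double @llvm.experimental.constrained.ceil.f64(
                 double %x,
                 metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }

The ``floor``, ``round``, ``roundeven``, and ``trunc`` variants use the same
two-argument form.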
28175
28176
28177'``llvm.experimental.constrained.floor``' Intrinsic
28178^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28179
28180Syntax:
28181"""""""
28182
28183::
28184
28185      declare <type>
28186      @llvm.experimental.constrained.floor(<type> <op1>,
28187                                           metadata <exception behavior>)
28188
28189Overview:
28190"""""""""
28191
28192The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
28193first argument.
28194
28195Arguments:
28196""""""""""
28197
28198The first argument and the return value are floating-point numbers of the same
28199type.
28200
28201The second argument specifies the exception behavior as described above.
28202
28203Semantics:
28204""""""""""
28205
28206This function returns the same values as the libm ``floor`` functions
28207would and handles error conditions in the same way.
28208
28209
28210'``llvm.experimental.constrained.round``' Intrinsic
28211^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28212
28213Syntax:
28214"""""""
28215
28216::
28217
28218      declare <type>
28219      @llvm.experimental.constrained.round(<type> <op1>,
28220                                           metadata <exception behavior>)
28221
28222Overview:
28223"""""""""
28224
28225The '``llvm.experimental.constrained.round``' intrinsic returns the first
28226argument rounded to the nearest integer.
28227
28228Arguments:
28229""""""""""
28230
28231The first argument and the return value are floating-point numbers of the same
28232type.
28233
28234The second argument specifies the exception behavior as described above.
28235
28236Semantics:
28237""""""""""
28238
28239This function returns the same values as the libm ``round`` functions
28240would and handles error conditions in the same way.
28241
28242
28243'``llvm.experimental.constrained.roundeven``' Intrinsic
28244^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28245
28246Syntax:
28247"""""""
28248
28249::
28250
28251      declare <type>
28252      @llvm.experimental.constrained.roundeven(<type> <op1>,
28253                                               metadata <exception behavior>)
28254
28255Overview:
28256"""""""""
28257
28258The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
28259argument rounded to the nearest integer in floating-point format, rounding
28260halfway cases to even (that is, to the nearest value that is an even integer),
28261regardless of the current rounding direction.
28262
28263Arguments:
28264""""""""""
28265
28266The first argument and the return value are floating-point numbers of the same
28267type.
28268
28269The second argument specifies the exception behavior as described above.
28270
28271Semantics:
28272""""""""""
28273
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven`` and
can signal the invalid operation exception for a signaling NaN (sNaN) argument.
28277
28278
28279'``llvm.experimental.constrained.lround``' Intrinsic
28280^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28281
28282Syntax:
28283"""""""
28284
28285::
28286
28287      declare <inttype>
28288      @llvm.experimental.constrained.lround(<fptype> <op1>,
28289                                            metadata <exception behavior>)
28290
28291Overview:
28292"""""""""
28293
28294The '``llvm.experimental.constrained.lround``' intrinsic returns the first
28295argument rounded to the nearest integer with ties away from zero.  It will
28296raise an inexact floating-point exception if the argument is not an integer.
28297An invalid exception is raised if the result is too large to fit into a
28298supported integer type, and in this case the result is undefined.
28299
28300Arguments:
28301""""""""""
28302
28303The first argument is a floating-point number. The return value is an
28304integer type. Not all types are supported on all targets. The supported
28305types are the same as the ``llvm.lround`` intrinsic and the ``lround``
28306libm functions.
28307
28308The second argument specifies the exception behavior as described above.
28309
28310Semantics:
28311""""""""""
28312
28313This function returns the same values as the libm ``lround`` functions
28314would and handles error conditions in the same way.
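
A minimal sketch, assuming an ``i32`` result and an ``f32`` operand (target
support for this combination is an assumption); the function name
``@nearest_int`` is hypothetical:

.. code-block:: llvm

      declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)

      ; Hypothetical helper: rounds to nearest with ties away from zero.
      define i32 @nearest_int(float %x) #0 {
        %r = call i32 @llvm.experimental.constrained.lround.i32.f32(
                 float %x,
                 metadata !"fpexcept.strict") #0
        ret i32 %r
      }

      attributes #0 = { strictfp }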
28315
28316
28317'``llvm.experimental.constrained.llround``' Intrinsic
28318^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28319
28320Syntax:
28321"""""""
28322
28323::
28324
28325      declare <inttype>
28326      @llvm.experimental.constrained.llround(<fptype> <op1>,
28327                                             metadata <exception behavior>)
28328
28329Overview:
28330"""""""""
28331
28332The '``llvm.experimental.constrained.llround``' intrinsic returns the first
28333argument rounded to the nearest integer with ties away from zero. It will
28334raise an inexact floating-point exception if the argument is not an integer.
28335An invalid exception is raised if the result is too large to fit into a
28336supported integer type, and in this case the result is undefined.
28337
28338Arguments:
28339""""""""""
28340
28341The first argument is a floating-point number. The return value is an
28342integer type. Not all types are supported on all targets. The supported
28343types are the same as the ``llvm.llround`` intrinsic and the ``llround``
28344libm functions.
28345
28346The second argument specifies the exception behavior as described above.
28347
28348Semantics:
28349""""""""""
28350
28351This function returns the same values as the libm ``llround`` functions
28352would and handles error conditions in the same way.
28353
28354
28355'``llvm.experimental.constrained.trunc``' Intrinsic
28356^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28357
28358Syntax:
28359"""""""
28360
28361::
28362
28363      declare <type>
28364      @llvm.experimental.constrained.trunc(<type> <op1>,
28365                                           metadata <exception behavior>)
28366
28367Overview:
28368"""""""""
28369
28370The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
28371argument rounded to the nearest integer not larger in magnitude than the
28372argument.
28373
28374Arguments:
28375""""""""""
28376
28377The first argument and the return value are floating-point numbers of the same
28378type.
28379
28380The second argument specifies the exception behavior as described above.
28381
28382Semantics:
28383""""""""""
28384
28385This function returns the same values as the libm ``trunc`` functions
28386would and handles error conditions in the same way.
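
A minimal sketch of a call, assuming the ``.f64`` overload suffix and a
hypothetical wrapper function ``@trunc_ignore``. Here the exception behavior
is ``fpexcept.ignore``, which tells the optimizer that the status flags are
not observed.

.. code-block:: llvm

  declare double @llvm.experimental.constrained.trunc.f64(double, metadata)

  ; Hypothetical wrapper: round toward zero to an integral floating-point value.
  define double @trunc_ignore(double %x) strictfp {
    %t = call double @llvm.experimental.constrained.trunc.f64(
                         double %x, metadata !"fpexcept.ignore") strictfp
    ret double %t
  }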
28387
28388.. _int_experimental_noalias_scope_decl:
28389
28390'``llvm.experimental.noalias.scope.decl``' Intrinsic
28391^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28392
28393Syntax:
28394"""""""
28395
28396
28397::
28398
28399      declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
28400
28401Overview:
28402"""""""""
28403
28404The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
28405noalias scope is declared. When the intrinsic is duplicated, a decision must
28406also be made about the scope: depending on the reason of the duplication,
28407the scope might need to be duplicated as well.
28408
28409
28410Arguments:
28411""""""""""
28412
28413The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
28414metadata references. The format is identical to that required for ``noalias``
28415metadata. This list must have exactly one element.
28416
28417Semantics:
28418""""""""""
28419
28420The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
28421noalias scope is declared. When the intrinsic is duplicated, a decision must
28422also be made about the scope: depending on the reason of the duplication,
28423the scope might need to be duplicated as well.
28424
28425For example, when the intrinsic is used inside a loop body, and that loop is
28426unrolled, the associated noalias scope must also be duplicated. Otherwise, the
28427noalias property it signifies would spill across loop iterations, whereas it
28428was only valid within a single iteration.
28429
28430.. code-block:: llvm
28431
  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
  ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
  ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
  declare i1 @cond()

  define void @decl_in_loop(ptr %a.base, ptr %b.base) {
28436  entry:
28437    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
28438    br label %loop
28439
28440  loop:
28441    %a = phi ptr [ %a.base, %entry ], [ %a.inc, %loop ]
28442    %b = phi ptr [ %b.base, %entry ], [ %b.inc, %loop ]
28443    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
28444    %val = load i8, ptr %a, !alias.scope !2
28445    store i8 %val, ptr %b, !noalias !2
28446    %a.inc = getelementptr inbounds i8, ptr %a, i64 1
28447    %b.inc = getelementptr inbounds i8, ptr %b, i64 1
28448    %cond = call i1 @cond()
28449    br i1 %cond, label %loop, label %exit
28450
28451  exit:
28452    ret void
28453  }
28454
28455  !0 = !{!0} ; domain
28456  !1 = !{!1, !0} ; scope
28457  !2 = !{!1} ; scope list
28458
Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
28460are possible, but one should never dominate another. Violations are pointed out
28461by the verifier as they indicate a problem in either a transformation pass or
28462the input.
28463
28464
28465Floating Point Environment Manipulation intrinsics
28466--------------------------------------------------
28467
These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See :ref:`Floating Point Environment <floatenv>`.
28471
28472.. _int_get_rounding:
28473
28474'``llvm.get.rounding``' Intrinsic
28475^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28476
28477Syntax:
28478"""""""
28479
28480::
28481
28482      declare i32 @llvm.get.rounding()
28483
28484Overview:
28485"""""""""
28486
28487The '``llvm.get.rounding``' intrinsic reads the current rounding mode.
28488
28489Semantics:
28490""""""""""
28491
28492The '``llvm.get.rounding``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of ``FLT_ROUNDS``,
as specified by the C standard:
28495
28496::
28497
28498    0  - toward zero
28499    1  - to nearest, ties to even
28500    2  - toward positive infinity
28501    3  - toward negative infinity
28502    4  - to nearest, ties away from zero
28503
Other values may be used to represent additional rounding modes supported by a
28505target. These values are target-specific.
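
The encoding above can be tested directly. The following sketch, with a
hypothetical function name, checks whether the current mode is
round-to-nearest with ties to even (encoding ``1``).

.. code-block:: llvm

  declare i32 @llvm.get.rounding()

  ; Hypothetical helper: true if the current rounding mode has encoding 1.
  define i1 @is_nearest_even() {
    %mode = call i32 @llvm.get.rounding()
    %is.rne = icmp eq i32 %mode, 1
    ret i1 %is.rne
  }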
28506
28507.. _int_set_rounding:
28508
28509'``llvm.set.rounding``' Intrinsic
28510^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28511
28512Syntax:
28513"""""""
28514
28515::
28516
28517      declare void @llvm.set.rounding(i32 <val>)
28518
28519Overview:
28520"""""""""
28521
The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
28523
28524Arguments:
28525""""""""""
28526
The argument is the required rounding mode. The encoding of the rounding mode
is the same as that used by '``llvm.get.rounding``'.
28529
28530Semantics:
28531""""""""""
28532
The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function 'fesetround'; however, this intrinsic does not
return any value and uses a platform-independent representation of IEEE rounding
28536modes.
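
A common pattern is to save the current mode, switch modes for one operation,
and then restore the saved mode. The sketch below assumes a hypothetical
function ``@add_toward_zero`` and uses the constrained ``fadd`` intrinsic with
``round.dynamic`` so that the addition observes the dynamically selected mode.

.. code-block:: llvm

  declare i32 @llvm.get.rounding()
  declare void @llvm.set.rounding(i32)
  declare double @llvm.experimental.constrained.fadd.f64(double, double,
                                                         metadata, metadata)

  define double @add_toward_zero(double %a, double %b) strictfp {
    %saved = call i32 @llvm.get.rounding() strictfp
    call void @llvm.set.rounding(i32 0) strictfp        ; 0 = toward zero
    %sum = call double @llvm.experimental.constrained.fadd.f64(
               double %a, double %b,
               metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
    call void @llvm.set.rounding(i32 %saved) strictfp   ; restore previous mode
    ret double %sum
  }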
28537
28538.. _int_get_fpenv:
28539
28540'``llvm.get.fpenv``' Intrinsic
28541^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28542
28543Syntax:
28544"""""""
28545
28546::
28547
28548      declare <integer_type> @llvm.get.fpenv()
28549
28550Overview:
28551"""""""""
28552
28553The '``llvm.get.fpenv``' intrinsic returns bits of the current floating-point
28554environment. The return value type is platform-specific.
28555
28556Semantics:
28557""""""""""
28558
28559The '``llvm.get.fpenv``' intrinsic reads the current floating-point environment
28560and returns it as an integer value.
28561
28562.. _int_set_fpenv:
28563
28564'``llvm.set.fpenv``' Intrinsic
28565^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28566
28567Syntax:
28568"""""""
28569
28570::
28571
28572      declare void @llvm.set.fpenv(<integer_type> <val>)
28573
28574Overview:
28575"""""""""
28576
28577The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment.
28578
28579Arguments:
28580""""""""""
28581
28582The argument is an integer representing the new floating-point environment. The
28583integer type is platform-specific.
28584
28585Semantics:
28586""""""""""
28587
28588The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment
28589to the state specified by the argument. The state may be previously obtained by a
28590call to '``llvm.get.fpenv``' or synthesized in a platform-dependent way.
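
The following sketch saves the environment before calling a function that may
modify it and restores it afterwards. It assumes a target whose environment
fits in an ``i64`` (the actual integer type is platform-specific), and
``@may_change_fpenv`` is a hypothetical external function.

.. code-block:: llvm

  declare i64 @llvm.get.fpenv.i64()
  declare void @llvm.set.fpenv.i64(i64)
  declare void @may_change_fpenv()   ; hypothetical

  define void @with_saved_fpenv() strictfp {
    %env = call i64 @llvm.get.fpenv.i64() strictfp
    call void @may_change_fpenv() strictfp
    call void @llvm.set.fpenv.i64(i64 %env) strictfp
    ret void
  }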
28591
28592
28593'``llvm.reset.fpenv``' Intrinsic
28594^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28595
28596Syntax:
28597"""""""
28598
28599::
28600
28601      declare void @llvm.reset.fpenv()
28602
28603Overview:
28604"""""""""
28605
The '``llvm.reset.fpenv``' intrinsic resets the floating-point environment to
its default state.
28607
28608Semantics:
28609""""""""""
28610
The '``llvm.reset.fpenv``' intrinsic sets the current floating-point environment
to its default state. It is similar to the call 'fesetenv(FE_DFL_ENV)', except it
28613does not return any value.
28614
28615.. _int_get_fpmode:
28616
28617'``llvm.get.fpmode``' Intrinsic
28618^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28619
28620Syntax:
28621"""""""
28622
28623The '``llvm.get.fpmode``' intrinsic returns bits of the current floating-point
28624control modes. The return value type is platform-specific.
28625
28626::
28627
28628      declare <integer_type> @llvm.get.fpmode()
28629
28630Overview:
28631"""""""""
28632
The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
control modes and returns them as an integer value.
28635
28636Arguments:
28637""""""""""
28638
28639None.
28640
28641Semantics:
28642""""""""""
28643
28644The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
28645control modes, such as rounding direction, precision, treatment of denormals and
28646so on. It is similar to the C library function 'fegetmode', however this
28647function does not store the set of control modes into memory but returns it as
28648an integer value. Interpretation of the bits in this value is target-dependent.
28649
28650'``llvm.set.fpmode``' Intrinsic
28651^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28652
28653Syntax:
28654"""""""
28655
28656The '``llvm.set.fpmode``' intrinsic sets the current floating-point control modes.
28657
28658::
28659
28660      declare void @llvm.set.fpmode(<integer_type> <val>)
28661
28662Overview:
28663"""""""""
28664
28665The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
28666control modes.
28667
28668Arguments:
28669""""""""""
28670
28671The argument is a set of floating-point control modes, represented as an integer
28672value in a target-dependent way.
28673
28674Semantics:
28675""""""""""
28676
28677The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
28678control modes to the state specified by the argument, which must be obtained by
28679a call to '``llvm.get.fpmode``' or constructed in a target-specific way. It is
28680similar to the C library function 'fesetmode', however this function does not
read the set of control modes from memory but receives it as an integer value.
28682
28683'``llvm.reset.fpmode``' Intrinsic
28684^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28685
28686Syntax:
28687"""""""
28688
28689::
28690
28691      declare void @llvm.reset.fpmode()
28692
28693Overview:
28694"""""""""
28695
The '``llvm.reset.fpmode``' intrinsic resets the dynamic floating-point
control modes to their defaults.
28698
28699Arguments:
28700""""""""""
28701
28702None.
28703
28704Semantics:
28705""""""""""
28706
The '``llvm.reset.fpmode``' intrinsic resets the current dynamic floating-point
control modes to their default state. It is similar to the C library function call
28709'fesetmode(FE_DFL_MODE)', however this function does not return any value.
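
The three control-mode intrinsics can be combined to run code under the
default modes and then restore the caller's modes. This is a sketch only: it
assumes a target where the control modes fit in an ``i32``, and ``@callee`` is
a hypothetical function.

.. code-block:: llvm

  declare i32 @llvm.get.fpmode.i32()
  declare void @llvm.set.fpmode.i32(i32)
  declare void @llvm.reset.fpmode()
  declare void @callee()   ; hypothetical

  define void @call_with_default_modes() strictfp {
    %saved = call i32 @llvm.get.fpmode.i32() strictfp
    call void @llvm.reset.fpmode() strictfp
    call void @callee() strictfp
    call void @llvm.set.fpmode.i32(i32 %saved) strictfp
    ret void
  }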
28710
28711
28712Floating-Point Test Intrinsics
28713------------------------------
28714
28715These functions get properties of floating-point values.
28716
28717
28718.. _llvm.is.fpclass:
28719
28720'``llvm.is.fpclass``' Intrinsic
28721^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28722
28723Syntax:
28724"""""""
28725
28726::
28727
28728      declare i1 @llvm.is.fpclass(<fptype> <op>, i32 <test>)
28729      declare <N x i1> @llvm.is.fpclass(<vector-fptype> <op>, i32 <test>)
28730
28731Overview:
28732"""""""""
28733
28734The '``llvm.is.fpclass``' intrinsic returns a boolean value or vector of boolean
28735values depending on whether the first argument satisfies the test specified by
28736the second argument.
28737
28738If the first argument is a floating-point scalar, then the result type is a
28739boolean (:ref:`i1 <t_integer>`).
28740
28741If the first argument is a floating-point vector, then the result type is a
28742vector of boolean with the same number of elements as the first argument.
28743
28744Arguments:
28745""""""""""
28746
28747The first argument to the '``llvm.is.fpclass``' intrinsic must be
28748:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
28749of floating-point values.
28750
The second argument specifies which tests to perform. It must be a compile-time
integer constant in which each bit specifies a floating-point class:
28753
28754+-------+----------------------+
28755| Bit # | floating-point class |
28756+=======+======================+
28757| 0     | Signaling NaN        |
28758+-------+----------------------+
28759| 1     | Quiet NaN            |
28760+-------+----------------------+
28761| 2     | Negative infinity    |
28762+-------+----------------------+
28763| 3     | Negative normal      |
28764+-------+----------------------+
28765| 4     | Negative subnormal   |
28766+-------+----------------------+
28767| 5     | Negative zero        |
28768+-------+----------------------+
28769| 6     | Positive zero        |
28770+-------+----------------------+
28771| 7     | Positive subnormal   |
28772+-------+----------------------+
28773| 8     | Positive normal      |
28774+-------+----------------------+
28775| 9     | Positive infinity    |
28776+-------+----------------------+
28777
28778Semantics:
28779""""""""""
28780
The function checks whether ``op`` belongs to any of the floating-point classes
specified by ``test``. If ``op`` is a vector, the check is made element by
element. Each check yields an :ref:`i1 <t_integer>` result, which is ``true``
if the element value satisfies the specified test. The argument ``test`` is a
bit mask in which each bit specifies a floating-point class to test for. For
example, the value 0x108 tests for a normal value: bits 3 and 8 are set, which
means that the function returns ``true`` if ``op`` is a positive or negative
normal value. The function never raises floating-point exceptions. The
28789function does not canonicalize its input value and does not depend
28790on the floating-point environment. If the floating-point environment
28791has a zeroing treatment of subnormal input values (such as indicated
28792by the ``"denormal-fp-math"`` attribute), a subnormal value will be
28793observed (will not be implicitly treated as zero).
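
The masks below are built from the bit table above. The function names are
hypothetical, and the ``.f64`` and ``.v4f32`` suffixes assume the usual
overload name mangling.

.. code-block:: llvm

  declare i1 @llvm.is.fpclass.f64(double, i32 immarg)
  declare <4 x i1> @llvm.is.fpclass.v4f32(<4 x float>, i32 immarg)

  define i1 @is_nan(double %x) {
    ; 3 = bit 0 (signaling NaN) | bit 1 (quiet NaN)
    %r = call i1 @llvm.is.fpclass.f64(double %x, i32 3)
    ret i1 %r
  }

  define i1 @is_normal(double %x) {
    ; 264 = 0x108 = bit 3 (negative normal) | bit 8 (positive normal)
    %r = call i1 @llvm.is.fpclass.f64(double %x, i32 264)
    ret i1 %r
  }

  define <4 x i1> @is_inf_v4(<4 x float> %v) {
    ; 516 = 0x204 = bit 2 (negative infinity) | bit 9 (positive infinity)
    %r = call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> %v, i32 516)
    ret <4 x i1> %r
  }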
28794
28795
28796General Intrinsics
28797------------------
28798
28799This class of intrinsics is designed to be generic and has no specific
28800purpose.
28801
28802'``llvm.var.annotation``' Intrinsic
28803^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28804
28805Syntax:
28806"""""""
28807
28808::
28809
28810      declare void @llvm.var.annotation(ptr <val>, ptr <str>, ptr <str>, i32  <int>)
28811
28812Overview:
28813"""""""""
28814
28815The '``llvm.var.annotation``' intrinsic.
28816
28817Arguments:
28818""""""""""
28819
28820The first argument is a pointer to a value, the second is a pointer to a
28821global string, the third is a pointer to a global string which is the
28822source file name, and the last argument is the line number.
28823
28824Semantics:
28825""""""""""
28826
28827This intrinsic allows annotation of local variables with arbitrary
28828strings. This can be useful for special purpose optimizations that want
28829to look for these annotations. These have no other defined use; they are
28830ignored by code generation and optimization.
28831
28832'``llvm.ptr.annotation.*``' Intrinsic
28833^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28834
28835Syntax:
28836"""""""
28837
28838This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
28839pointer to an integer of any width. *NOTE* you must specify an address space for
28840the pointer. The identifier for the default address space is the integer
28841'``0``'.
28842
28843::
28844
28845      declare ptr @llvm.ptr.annotation.p0(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
28846      declare ptr @llvm.ptr.annotation.p1(ptr addrspace(1) <val>, ptr <str>, ptr <str>, i32 <int>)
28847
28848Overview:
28849"""""""""
28850
28851The '``llvm.ptr.annotation``' intrinsic.
28852
28853Arguments:
28854""""""""""
28855
28856The first argument is a pointer to an integer value of arbitrary bitwidth
28857(result of some expression), the second is a pointer to a global string, the
28858third is a pointer to a global string which is the source file name, and the
28859last argument is the line number. It returns the value of the first argument.
28860
28861Semantics:
28862""""""""""
28863
28864This intrinsic allows annotation of a pointer to an integer with arbitrary
28865strings. This can be useful for special purpose optimizations that want to look
28866for these annotations. These have no other defined use; transformations preserve
28867annotations on a best-effort basis but are allowed to replace the intrinsic with
28868its first argument without breaking semantics and the intrinsic is completely
28869dropped during instruction selection.
28870
28871'``llvm.annotation.*``' Intrinsic
28872^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28873
28874Syntax:
28875"""""""
28876
28877This is an overloaded intrinsic. You can use '``llvm.annotation``' on
28878any integer bit width.
28879
28880::
28881
28882      declare i8 @llvm.annotation.i8(i8 <val>, ptr <str>, ptr <str>, i32  <int>)
28883      declare i16 @llvm.annotation.i16(i16 <val>, ptr <str>, ptr <str>, i32  <int>)
28884      declare i32 @llvm.annotation.i32(i32 <val>, ptr <str>, ptr <str>, i32  <int>)
28885      declare i64 @llvm.annotation.i64(i64 <val>, ptr <str>, ptr <str>, i32  <int>)
28886      declare i256 @llvm.annotation.i256(i256 <val>, ptr <str>, ptr <str>, i32  <int>)
28887
28888Overview:
28889"""""""""
28890
28891The '``llvm.annotation``' intrinsic.
28892
28893Arguments:
28894""""""""""
28895
28896The first argument is an integer value (result of some expression), the
28897second is a pointer to a global string, the third is a pointer to a
28898global string which is the source file name, and the last argument is
28899the line number. It returns the value of the first argument.
28900
28901Semantics:
28902""""""""""
28903
28904This intrinsic allows annotations to be put on arbitrary expressions with
28905arbitrary strings. This can be useful for special purpose optimizations that
28906want to look for these annotations. These have no other defined use;
28907transformations preserve annotations on a best-effort basis but are allowed to
28908replace the intrinsic with its first argument without breaking semantics and the
28909intrinsic is completely dropped during instruction selection.
28910
28911'``llvm.codeview.annotation``' Intrinsic
28912^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28913
28914Syntax:
28915"""""""
28916
28917This annotation emits a label at its program point and an associated
28918``S_ANNOTATION`` codeview record with some additional string metadata. This is
28919used to implement MSVC's ``__annotation`` intrinsic. It is marked
28920``noduplicate``, so calls to this intrinsic prevent inlining and should be
28921considered expensive.
28922
28923::
28924
28925      declare void @llvm.codeview.annotation(metadata)
28926
28927Arguments:
28928""""""""""
28929
28930The argument should be an MDTuple containing any number of MDStrings.
28931
28932.. _llvm.trap:
28933
28934'``llvm.trap``' Intrinsic
28935^^^^^^^^^^^^^^^^^^^^^^^^^
28936
28937Syntax:
28938"""""""
28939
28940::
28941
28942      declare void @llvm.trap() cold noreturn nounwind
28943
28944Overview:
28945"""""""""
28946
28947The '``llvm.trap``' intrinsic.
28948
28949Arguments:
28950""""""""""
28951
28952None.
28953
28954Semantics:
28955""""""""""
28956
28957This intrinsic is lowered to the target dependent trap instruction. If
28958the target does not have a trap instruction, this intrinsic will be
28959lowered to a call of the ``abort()`` function.
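
A typical use is to terminate execution on a failed check. In the following
sketch, ``@checked_load`` is a hypothetical function that traps when an index
is out of bounds. Because the intrinsic is ``noreturn``, the call is followed
by ``unreachable``.

.. code-block:: llvm

  declare void @llvm.trap() cold noreturn nounwind

  define i32 @checked_load(ptr %base, i64 %idx, i64 %len) {
  entry:
    %in.bounds = icmp ult i64 %idx, %len
    br i1 %in.bounds, label %ok, label %fail

  ok:
    %p = getelementptr inbounds i32, ptr %base, i64 %idx
    %v = load i32, ptr %p
    ret i32 %v

  fail:
    call void @llvm.trap()
    unreachable
  }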
28960
28961.. _llvm.debugtrap:
28962
28963'``llvm.debugtrap``' Intrinsic
28964^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28965
28966Syntax:
28967"""""""
28968
28969::
28970
28971      declare void @llvm.debugtrap() nounwind
28972
28973Overview:
28974"""""""""
28975
28976The '``llvm.debugtrap``' intrinsic.
28977
28978Arguments:
28979""""""""""
28980
28981None.
28982
28983Semantics:
28984""""""""""
28985
28986This intrinsic is lowered to code which is intended to cause an
28987execution trap with the intention of requesting the attention of a
28988debugger.
28989
28990.. _llvm.ubsantrap:
28991
28992'``llvm.ubsantrap``' Intrinsic
28993^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28994
28995Syntax:
28996"""""""
28997
28998::
28999
29000      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
29001
29002Overview:
29003"""""""""
29004
29005The '``llvm.ubsantrap``' intrinsic.
29006
29007Arguments:
29008""""""""""
29009
29010An integer describing the kind of failure detected.
29011
29012Semantics:
29013""""""""""
29014
This intrinsic is lowered to code which is intended to cause an execution trap,
embedding the argument into the encoding of that trap, where possible, so that
different kinds of crashes can be told apart.
29018
29019Equivalent to ``@llvm.trap`` for targets that do not support this behavior.
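
The sketch below traps on signed overflow. The function name ``@add_or_trap``
is hypothetical, and the failure code ``0`` is an arbitrary illustrative value
rather than a documented check kind.

.. code-block:: llvm

  declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
  declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

  define i32 @add_or_trap(i32 %a, i32 %b) {
  entry:
    %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
    %ovf = extractvalue { i32, i1 } %res, 1
    br i1 %ovf, label %fail, label %cont

  cont:
    %sum = extractvalue { i32, i1 } %res, 0
    ret i32 %sum

  fail:
    call void @llvm.ubsantrap(i8 0)   ; illustrative failure code
    unreachable
  }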
29020
29021'``llvm.stackprotector``' Intrinsic
29022^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29023
29024Syntax:
29025"""""""
29026
29027::
29028
29029      declare void @llvm.stackprotector(ptr <guard>, ptr <slot>)
29030
29031Overview:
29032"""""""""
29033
29034The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
29035onto the stack at ``slot``. The stack slot is adjusted to ensure that it
29036is placed on the stack before local variables.
29037
29038Arguments:
29039""""""""""
29040
29041The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
29042The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
29044enough space to hold the value of the guard.
29045
29046Semantics:
29047""""""""""
29048
29049This intrinsic causes the prologue/epilogue inserter to force the position of
29050the ``AllocaInst`` stack slot to be before local variables on the stack. This is
29051to ensure that if a local variable on the stack is overwritten, it will destroy
29052the value of the guard. When the function exits, the guard on the stack is
29053checked against the original guard by ``llvm.stackprotectorcheck``. If they are
29054different, then ``llvm.stackprotectorcheck`` causes the program to abort by
29055calling the ``__stack_chk_fail()`` function.
29056
29057'``llvm.stackguard``' Intrinsic
29058^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29059
29060Syntax:
29061"""""""
29062
29063::
29064
29065      declare ptr @llvm.stackguard()
29066
29067Overview:
29068"""""""""
29069
29070The ``llvm.stackguard`` intrinsic returns the system stack guard value.
29071
It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.
29075
29076Arguments:
29077""""""""""
29078
29079None.
29080
29081Semantics:
29082""""""""""
29083
29084On some platforms, the value returned by this intrinsic remains unchanged
29085between loads in the same thread. On other platforms, it returns the same
29086global variable value, if any, e.g. ``@__stack_chk_guard``.
29087
Currently, some platforms (e.g. X86 Linux) have customized IR-level stack guard
loading that is not handled by ``llvm.stackguard()``, although it should be in
the future.
29091
29092'``llvm.objectsize``' Intrinsic
29093^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29094
29095Syntax:
29096"""""""
29097
29098::
29099
29100      declare i32 @llvm.objectsize.i32(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
29101      declare i64 @llvm.objectsize.i64(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
29102
29103Overview:
29104"""""""""
29105
29106The ``llvm.objectsize`` intrinsic is designed to provide information to the
29107optimizer to determine whether a) an operation (like memcpy) will overflow a
29108buffer that corresponds to an object, or b) that a runtime check for overflow
29109isn't necessary. An object in this context means an allocation of a specific
29110class, structure, array, or other object.
29111
29112Arguments:
29113""""""""""
29114
29115The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
29116pointer to or into the ``object``. The second argument determines whether
29117``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
29118unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
29119in address space 0 is used as its pointer argument. If it's ``false``,
29120``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
29121the ``null`` is in a non-zero address space or if ``true`` is given for the
29122third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
29123argument to ``llvm.objectsize`` determines if the value should be evaluated at
29124runtime.
29125
29126The second, third, and fourth arguments only accept constants.
29127
29128Semantics:
29129""""""""""
29130
29131The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
29132the object concerned. If the size cannot be determined, ``llvm.objectsize``
29133returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
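
As a sketch, the following queries the size of a 16-byte stack allocation. The
function name is hypothetical, and the ``.p0`` suffix assumes that the
intrinsic name is also mangled on the pointer's address space. For an
allocation of known size such as this one, the call would be expected to fold
to the constant ``16``.

.. code-block:: llvm

  declare i64 @llvm.objectsize.i64.p0(ptr, i1 immarg, i1 immarg, i1 immarg)

  define i64 @size_of_local() {
    %buf = alloca [16 x i8]
    ; min = false (return -1 if unknown), nullunknown = true, dynamic = false
    %size = call i64 @llvm.objectsize.i64.p0(ptr %buf, i1 false, i1 true, i1 false)
    ret i64 %size
  }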
29134
29135'``llvm.expect``' Intrinsic
29136^^^^^^^^^^^^^^^^^^^^^^^^^^^
29137
29138Syntax:
29139"""""""
29140
29141This is an overloaded intrinsic. You can use ``llvm.expect`` on any
29142integer bit width.
29143
29144::
29145
29146      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
29147      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
29148      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
29149
29150Overview:
29151"""""""""
29152
The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.
29155
29156Arguments:
29157""""""""""
29158
29159The ``llvm.expect`` intrinsic takes two arguments. The first argument is
29160a value. The second argument is an expected value.
29161
29162Semantics:
29163""""""""""
29164
29165This intrinsic is lowered to the ``val``.
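
For example, a frontend might mark an error path as unlikely as sketched
below. ``@check`` and ``@handle_error`` are hypothetical names.

.. code-block:: llvm

  declare i1 @llvm.expect.i1(i1, i1)
  declare void @handle_error()   ; hypothetical

  define void @check(i32 %status) {
  entry:
    %failed = icmp ne i32 %status, 0
    ; Hint that %failed is expected to be false.
    %failed.exp = call i1 @llvm.expect.i1(i1 %failed, i1 false)
    br i1 %failed.exp, label %error, label %done

  error:
    call void @handle_error()
    br label %done

  done:
    ret void
  }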
29166
29167'``llvm.expect.with.probability``' Intrinsic
29168^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29169
29170Syntax:
29171"""""""
29172
29173This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
29174You can use ``llvm.expect.with.probability`` on any integer bit width.
29175
29176::
29177
29178      declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
29179      declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
29180      declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
29181
29182Overview:
29183"""""""""
29184
29185The ``llvm.expect.with.probability`` intrinsic provides information about
the expected value of ``val`` with probability (or confidence) ``prob``, which can
29187be used by optimizers.
29188
29189Arguments:
29190""""""""""
29191
29192The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
29193argument is a value. The second argument is an expected value. The third
29194argument is a probability.
29195
29196Semantics:
29197""""""""""
29198
29199This intrinsic is lowered to the ``val``.
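
A minimal sketch, with a hypothetical function ``@pick``, stating that
``%sel`` is expected to be ``0`` with probability 0.75:

.. code-block:: llvm

  declare i32 @llvm.expect.with.probability.i32(i32, i32, double)

  define i32 @pick(i32 %sel) {
  entry:
    %sel.exp = call i32 @llvm.expect.with.probability.i32(
                   i32 %sel, i32 0, double 7.500000e-01)
    %is.zero = icmp eq i32 %sel.exp, 0
    br i1 %is.zero, label %common, label %rare

  common:
    ret i32 1

  rare:
    ret i32 2
  }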
29200
29201.. _int_assume:
29202
29203'``llvm.assume``' Intrinsic
29204^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29205
29206Syntax:
29207"""""""
29208
29209::
29210
29211      declare void @llvm.assume(i1 %cond)
29212
29213Overview:
29214"""""""""
29215
The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
29217condition is true. This information can then be used in simplifying other parts
29218of the code.
29219
29220More complex assumptions can be encoded as
29221:ref:`assume operand bundles <assume_opbundles>`.
29222
29223Arguments:
29224""""""""""
29225
29226The argument of the call is the condition which the optimizer may assume is
29227always true.
29228
29229Semantics:
29230""""""""""
29231
29232The intrinsic allows the optimizer to assume that the provided condition is
29233always true whenever the control flow reaches the intrinsic call. No code is
29234generated for this intrinsic, and instructions that contribute only to the
29235provided condition are not used for code generation. If the condition is
29236violated during execution, the behavior is undefined.
29237
29238Note that the optimizer might limit the transformations performed on values
29239used by the ``llvm.assume`` intrinsic in order to preserve the instructions
29240only used to form the intrinsic's input argument. This might prove undesirable
29241if the extra information provided by the ``llvm.assume`` intrinsic does not cause
29242sufficient overall improvement in code quality. For this reason,
29243``llvm.assume`` should not be used to document basic mathematical invariants
29244that the optimizer can otherwise deduce or facts that are of little use to the
29245optimizer.
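
The sketch below shows both forms: a boolean condition and an ``align``
operand bundle. The function name is hypothetical. If ``%p`` is null or not
16-byte aligned when either call is reached, the behavior is undefined.

.. code-block:: llvm

  declare void @llvm.assume(i1)

  define i32 @load_nonnull_aligned(ptr %p) {
    ; Assume the pointer is non-null.
    %nonnull = icmp ne ptr %p, null
    call void @llvm.assume(i1 %nonnull)
    ; Assume 16-byte alignment via an operand bundle.
    call void @llvm.assume(i1 true) [ "align"(ptr %p, i64 16) ]
    %v = load i32, ptr %p, align 16
    ret i32 %v
  }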
29246
29247.. _int_ssa_copy:
29248
29249'``llvm.ssa.copy``' Intrinsic
29250^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29251
29252Syntax:
29253"""""""
29254
29255::
29256
29257      declare type @llvm.ssa.copy(type returned %operand) memory(none)
29258
29259Arguments:
29260""""""""""
29261
29262The first argument is an operand which is used as the returned value.
29263
29264Overview:
29265""""""""""
29266
29267The ``llvm.ssa.copy`` intrinsic can be used to attach information to
29268operations by copying them and giving them new names.  For example,
29269the PredicateInfo utility uses it to build Extended SSA form, and
29270attach various forms of information to operands that dominate specific
29271uses.  It is not meant for general use, only for building temporary
29272renaming forms that require value splits at certain points.
29273
29274.. _type.test:
29275
29276'``llvm.type.test``' Intrinsic
29277^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29278
29279Syntax:
29280"""""""
29281
29282::
29283
29284      declare i1 @llvm.type.test(ptr %ptr, metadata %type) nounwind memory(none)
29285
29286
29287Arguments:
29288""""""""""
29289
29290The first argument is a pointer to be tested. The second argument is a
29291metadata object representing a :doc:`type identifier <TypeMetadata>`.
29292
29293Overview:
29294"""""""""
29295
29296The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
29297with the given type identifier.
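
A sketch of a guarded indirect call follows. The type identifier
``!"typeid1"`` and the function name are hypothetical. For the test to be
meaningful, a module would also need matching ``!type`` metadata on its
globals.

.. code-block:: llvm

  declare i1 @llvm.type.test(ptr, metadata)
  declare void @llvm.trap()

  define void @checked_call(ptr %fn) {
  entry:
    %ok = call i1 @llvm.type.test(ptr %fn, metadata !"typeid1")
    br i1 %ok, label %do.call, label %fail

  do.call:
    call void %fn()
    ret void

  fail:
    call void @llvm.trap()
    unreachable
  }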
29298
29299.. _type.checked.load:
29300
29301'``llvm.type.checked.load``' Intrinsic
29302^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29303
29304Syntax:
29305"""""""
29306
29307::
29308
29309      declare {ptr, i1} @llvm.type.checked.load(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)
29310
29311
29312Arguments:
29313""""""""""
29314
29315The first argument is a pointer from which to load a function pointer. The
29316second argument is the byte offset from which to load the function pointer. The
29317third argument is a metadata object representing a :doc:`type identifier
29318<TypeMetadata>`.
29319
29320Overview:
29321"""""""""
29322
29323The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
29324virtual table pointer using type metadata. This intrinsic is used to implement
29325control flow integrity in conjunction with virtual call optimization. The
29326virtual call optimization pass will optimize away ``llvm.type.checked.load``
29327intrinsics associated with devirtualized calls, thereby removing the type
29328check in cases where it is not needed to enforce the control flow integrity
29329constraint.
29330
29331If the given pointer is associated with a type metadata identifier, this
29332function returns true as the second element of its return value. (Note that
29333the function may also return true if the given pointer is not associated
29334with a type metadata identifier.) If the function's return value's second
29335element is true, the following rules apply to the first element:
29336
29337- If the given pointer is associated with the given type metadata identifier,
29338  it is the function pointer loaded from the given byte offset from the given
29339  pointer.
29340
29341- If the given pointer is not associated with the given type metadata
29342  identifier, it is one of the following (the choice of which is unspecified):
29343
29344  1. The function pointer that would have been loaded from an arbitrarily chosen
29345     (through an unspecified mechanism) pointer associated with the type
29346     metadata.
29347
29348  2. If the function has a non-void return type, a pointer to a function that
29349     returns an unspecified value without causing side effects.
29350
29351If the function's return value's second element is false, the value of the
29352first element is undefined.
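
As an illustration of the rules above, the following is a minimal sketch of a
checked virtual call. The function name ``@call_virtual``, the type identifier
``!"_ZTS1A"``, the zero byte offset, and the choice to trap on failure are
assumptions made for the example and are not mandated by the intrinsic.

.. code-block:: llvm

      declare {ptr, i1} @llvm.type.checked.load(ptr, i32, metadata)
      declare void @llvm.trap()

      define i32 @call_virtual(ptr %obj) {
      entry:
        ; Load the virtual table pointer stored at the start of the object.
        %vtable = load ptr, ptr %obj
        ; Load the function pointer at byte offset 0 and test the type.
        %pair = call {ptr, i1} @llvm.type.checked.load(ptr %vtable, i32 0, metadata !"_ZTS1A")
        %ok = extractvalue {ptr, i1} %pair, 1
        br i1 %ok, label %cont, label %fail

      cont:
        ; The first element is only meaningful when the second one is true.
        %fn = extractvalue {ptr, i1} %pair, 0
        %r = call i32 %fn(ptr %obj)
        ret i32 %r

      fail:
        call void @llvm.trap()
        unreachable
      }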
29353
29354.. _type.checked.load.relative:
29355
29356'``llvm.type.checked.load.relative``' Intrinsic
29357^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29358
29359Syntax:
29360"""""""
29361
29362::
29363
29364      declare {ptr, i1} @llvm.type.checked.load.relative(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)
29365
29366Overview:
29367"""""""""
29368
The ``llvm.type.checked.load.relative`` intrinsic loads a relative pointer to a
function from a virtual table pointer using metadata. Otherwise, its semantics
are identical to those of the ``llvm.type.checked.load`` intrinsic.
29372
A relative pointer is a pointer to an offset to the pointed-to value. The
29374address of the underlying pointer of the relative pointer is obtained by adding
29375the offset to the address of the offset value.
29376
29377'``llvm.arithmetic.fence``' Intrinsic
29378^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29379
29380Syntax:
29381"""""""
29382
29383::
29384
29385      declare <type>
29386      @llvm.arithmetic.fence(<type> <op>)
29387
29388Overview:
29389"""""""""
29390
29391The purpose of the ``llvm.arithmetic.fence`` intrinsic
29392is to prevent the optimizer from performing fast-math optimizations,
29393particularly reassociation,
29394between the argument and the expression that contains the argument.
29395It can be used to preserve the parentheses in the source language.
29396
29397Arguments:
29398""""""""""
29399
29400The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
29401The argument and the return value are floating-point numbers,
29402or vector floating-point numbers, of the same type.
29403
29404Semantics:
29405""""""""""
29406
29407This intrinsic returns the value of its operand. The optimizer can optimize
29408the argument, but the optimizer cannot hoist any component of the operand
29409to the containing context, and the optimizer cannot move the calculation of
29410any expression in the containing context into the operand.
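
For example, the following sketch keeps the grouping of ``(a + b) + c`` even
though both additions carry the ``reassoc`` flag. The function name ``@sum`` is
hypothetical; the ``.f32`` suffix is the usual overload mangling for a
``float`` operand.

.. code-block:: llvm

      declare float @llvm.arithmetic.fence.f32(float)

      define float @sum(float %a, float %b, float %c) {
        ; The parenthesized subexpression (a + b).
        %ab = fadd reassoc float %a, %b
        ; The fence keeps %ab from being reassociated with the outer addition.
        %fenced = call float @llvm.arithmetic.fence.f32(float %ab)
        %r = fadd reassoc float %fenced, %c
        ret float %r
      }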
29411
29412
29413'``llvm.donothing``' Intrinsic
29414^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29415
29416Syntax:
29417"""""""
29418
29419::
29420
29421      declare void @llvm.donothing() nounwind memory(none)
29422
29423Overview:
29424"""""""""
29425
29426The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
29427three intrinsics (besides ``llvm.experimental.patchpoint`` and
29428``llvm.experimental.gc.statepoint``) that can be called with an invoke
29429instruction.
29430
29431Arguments:
29432""""""""""
29433
29434None.
29435
29436Semantics:
29437""""""""""
29438
29439This intrinsic does nothing, and it's removed by optimizers and ignored
29440by codegen.
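
A minimal sketch of an ``invoke`` of this intrinsic is shown below. The
function name ``@f`` and the use of ``@__gxx_personality_v0`` as the
personality function are assumptions made for the example.

.. code-block:: llvm

      declare void @llvm.donothing() nounwind memory(none)
      declare i32 @__gxx_personality_v0(...)

      define void @f() personality ptr @__gxx_personality_v0 {
      entry:
        ; The invoke has no effect; control always continues at %cont.
        invoke void @llvm.donothing()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        %lp = landingpad { ptr, i32 }
                cleanup
        ret void
      }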
29441
29442'``llvm.experimental.deoptimize``' Intrinsic
29443^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29444
29445Syntax:
29446"""""""
29447
29448::
29449
29450      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
29451
29452Overview:
29453"""""""""
29454
This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.
29460
29461In languages with a fully integrated managed runtime like Java and JavaScript
29462this intrinsic can be used to implement "uncommon trap" or "side exit" like
29463functionality.  In unmanaged languages like C and C++, this intrinsic can be
29464used to represent the slow paths of specialized functions.
29465
29466
29467Arguments:
29468""""""""""
29469
29470The intrinsic takes an arbitrary number of arguments, whose meaning is
29471decided by the :ref:`lowering strategy<deoptimize_lowering>`.
29472
29473Semantics:
29474""""""""""
29475
29476The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
29477deoptimization continuation (denoted using a :ref:`deoptimization
29478operand bundle <deopt_opbundles>`) and returns the value returned by
29479the deoptimization continuation.  Defining the semantic properties of
29480the continuation itself is out of scope of the language reference --
29481as far as LLVM is concerned, the deoptimization continuation can
29482invoke arbitrary side effects, including reading from and writing to
29483the entire heap.
29484
29485Deoptimization continuations expressed using ``"deopt"`` operand bundles always
29486continue execution to the end of the physical frame containing them, so all
29487calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
29488
29489   - ``@llvm.experimental.deoptimize`` cannot be invoked.
29490   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
29491   - The ``ret`` instruction must return the value produced by the
29492     ``@llvm.experimental.deoptimize`` call if there is one, or void.
29493
29494Note that the above restrictions imply that the return type for a call to
29495``@llvm.experimental.deoptimize`` will match the return type of its immediate
29496caller.
29497
29498The inliner composes the ``"deopt"`` continuations of the caller into the
29499``"deopt"`` continuations present in the inlinee, and also updates calls to this
29500intrinsic to return directly from the frame of the function it inlined into.
29501
29502All declarations of ``@llvm.experimental.deoptimize`` must share the
29503same calling convention.
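
The tail-position rules can be illustrated with the following sketch. The
function ``@f``, its specialization condition, and the contents of the
``"deopt"`` bundle are hypothetical; the ``.i32`` suffix reflects the
overloaded return type.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      define i32 @f(i32 %x) {
      entry:
        ; The specialized code only handles non-negative inputs.
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %slow, label %fast

      fast:
        ret i32 %x

      slow:
        ; The call immediately precedes the ret, and the ret returns its value.
        %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %v
      }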
29504
29505.. _deoptimize_lowering:
29506
29507Lowering:
29508"""""""""
29509
29510Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
29511symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
29512ensure that this symbol is defined).  The call arguments to
29513``@llvm.experimental.deoptimize`` are lowered as if they were formal
29514arguments of the specified types, and not as varargs.
29515
29516
29517'``llvm.experimental.guard``' Intrinsic
29518^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29519
29520Syntax:
29521"""""""
29522
29523::
29524
29525      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
29526
29527Overview:
29528"""""""""
29529
29530This intrinsic, together with :ref:`deoptimization operand bundles
29531<deopt_opbundles>`, allows frontends to express guards or checks on
29532optimistic assumptions made during compilation.  The semantics of
29533``@llvm.experimental.guard`` is defined in terms of
29534``@llvm.experimental.deoptimize`` -- its body is defined to be
29535equivalent to:
29536
29537.. code-block:: text
29538
29539  define void @llvm.experimental.guard(i1 %pred, <args...>) {
29540    %realPred = and i1 %pred, undef
29541    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
29542
29543  leave:
29544    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
29545    ret void
29546
29547  continue:
29548    ret void
29549  }
29550
29551
29552with the optional ``[, !make.implicit !{}]`` present if and only if it
29553is present on the call site.  For more details on ``!make.implicit``,
29554see :doc:`FaultMaps`.
29555
29556In words, ``@llvm.experimental.guard`` executes the attached
29557``"deopt"`` continuation if (but **not** only if) its first argument
29558is ``false``.  Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize the guard to fail "spuriously",
29560i.e. without the original condition being false (hence the "not only
29561if"); and this allows for "check widening" type optimizations.
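
As a sketch of a call site, the following function guards a load with a bounds
check. The function name ``@load_checked`` and the state recorded in the
``"deopt"`` bundle are assumptions made for the example.

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      define i32 @load_checked(ptr %arr, i32 %idx, i32 %len) {
        %in.bounds = icmp ult i32 %idx, %len
        ; Deoptimizes if %in.bounds is false (and possibly spuriously).
        call void (i1, ...) @llvm.experimental.guard(i1 %in.bounds) [ "deopt"(i32 %idx) ]
        ; Code below the guard may rely on %in.bounds being true.
        %addr = getelementptr inbounds i32, ptr %arr, i32 %idx
        %v = load i32, ptr %addr
        ret i32 %v
      }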
29562
29563``@llvm.experimental.guard`` cannot be invoked.
29564
29565After ``@llvm.experimental.guard`` was first added, a more general
29566formulation was found in ``@llvm.experimental.widenable.condition``.
29567Support for ``@llvm.experimental.guard`` is slowly being rephrased in
29568terms of this alternate.
29569
29570'``llvm.experimental.widenable.condition``' Intrinsic
29571^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29572
29573Syntax:
29574"""""""
29575
29576::
29577
29578      declare i1 @llvm.experimental.widenable.condition()
29579
29580Overview:
29581"""""""""
29582
This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is ``true`` or ``false``, the program is correct and
well-defined.
29587
29588Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
29589``@llvm.experimental.widenable.condition`` allows frontends to
29590express guards or checks on optimistic assumptions made during
29591compilation and represent them as branch instructions on special
29592conditions.
29593
29594While this may appear similar in semantics to `undef`, it is very
29595different in that an invocation produces a particular, singular
29596value. It is also intended to be lowered late, and remain available
29597for specific optimizations and transforms that can benefit from its
29598special properties.
29599
29600Arguments:
29601""""""""""
29602
29603None.
29604
29605Semantics:
29606""""""""""
29607
29608The intrinsic ``@llvm.experimental.widenable.condition()``
29609returns either `true` or `false`. For each evaluation of a call
29610to this intrinsic, the program must be valid and correct both if
29611it returns `true` and if it returns `false`. This allows
29612transformation passes to replace evaluations of this intrinsic
29613with either value whenever one is beneficial.
29614
When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as in
the example below:
29618
29619.. code-block:: text
29620
29621    %cond = call i1 @llvm.experimental.widenable.condition()
29622    br i1 %cond, label %fast_path, label %slow_path
29623
29624  fast_path:
29625    ; Apply memory-consuming but fast solution for a task.
29626
29627  slow_path:
29628    ; Cheap in memory but slow solution.
29629
Whether the result of the intrinsic's call is ``true`` or ``false``,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
``i1`` expressions.
29635
29636This is how it can be used to represent guards as widenable branches:
29637
29638.. code-block:: text
29639
29640  block:
29641    ; Unguarded instructions
29642    call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
29643    ; Guarded instructions
29644
The guard above can be expressed in an equivalent form as an explicit branch
using ``@llvm.experimental.widenable.condition``:
29647
29648.. code-block:: text
29649
29650  block:
29651    ; Unguarded instructions
29652    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
29653    %guard_condition = and i1 %cond, %widenable_condition
29654    br i1 %guard_condition, label %guarded, label %deopt
29655
29656  guarded:
29657    ; Guarded instructions
29658
29659  deopt:
29660    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
29661
So the block ``guarded`` is only reachable when ``%cond`` is ``true``,
and it should be valid to go to the block ``deopt`` regardless of whether
``%cond`` is ``true`` or ``false``.
29665
29666``@llvm.experimental.widenable.condition`` will never throw, thus
29667it cannot be invoked.
29668
29669Guard widening:
29670"""""""""""""""
29671
When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.
29676
29677Guard widening looks like replacement of
29678
29679.. code-block:: text
29680
29681  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
29682  %guard_cond = and i1 %cond, %widenable_cond
29683  br i1 %guard_cond, label %guarded, label %deopt
29684
29685with
29686
29687.. code-block:: text
29688
29689  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
29690  %new_cond = and i1 %any_other_cond, %widenable_cond
29691  %new_guard_cond = and i1 %cond, %new_cond
29692  br i1 %new_guard_cond, label %guarded, label %deopt
29693
for this branch. Here ``%any_other_cond`` is an arbitrarily chosen
well-defined ``i1`` value. By widening the guard, we may
impose stricter conditions on the ``guarded`` block and bail out to the
``deopt`` block when the new condition is not met.
29698
29699Lowering:
29700"""""""""
29701
The default lowering strategy replaces the result of a call to
``@llvm.experimental.widenable.condition`` with the
constant ``true``. However, it is always correct to replace
it with any other ``i1`` value. Any pass can
freely do so if it can benefit from non-default lowering.
29707
29708'``llvm.allow.ubsan.check``' Intrinsic
29709^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29710
29711Syntax:
29712"""""""
29713
29714::
29715
29716      declare i1 @llvm.allow.ubsan.check(i8 immarg %kind)
29717
29718Overview:
29719"""""""""
29720
29721This intrinsic returns ``true`` if and only if the compiler opted to enable the
29722ubsan check in the current basic block.
29723
Rules to allow ubsan checks are not part of the intrinsic declaration and
are controlled by compiler options.
29726
29727This intrinsic is the ubsan specific version of ``@llvm.allow.runtime.check()``.
29728
29729Arguments:
29730""""""""""
29731
29732An integer describing the kind of ubsan check guarded by the intrinsic.
29733
29734Semantics:
29735""""""""""
29736
29737The intrinsic ``@llvm.allow.ubsan.check()`` returns either ``true`` or
29738``false``, depending on compiler options.
29739
29740For each evaluation of a call to this intrinsic, the program must be valid and
29741correct both if it returns ``true`` and if it returns ``false``.
29742
29743When used in a branch condition, it selects one of the two paths:
29744
* ``true``: Executes the UBSan check and reports any failures.

* ``false``: Bypasses the check, assuming it always succeeds.
29748
29749Example:
29750
29751.. code-block:: text
29752
29753    %allow = call i1 @llvm.allow.ubsan.check(i8 5)
29754    %not.allow = xor i1 %allow, true
29755    %cond = or i1 %ubcheck, %not.allow
29756    br i1 %cond, label %cont, label %trap
29757
29758  cont:
29759    ; Proceed
29760
29761  trap:
29762    call void @llvm.ubsantrap(i8 5)
29763    unreachable
29764
29765
29766'``llvm.allow.runtime.check``' Intrinsic
29767^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29768
29769Syntax:
29770"""""""
29771
29772::
29773
29774      declare i1 @llvm.allow.runtime.check(metadata %kind)
29775
29776Overview:
29777"""""""""
29778
29779This intrinsic returns ``true`` if and only if the compiler opted to enable
29780runtime checks in the current basic block.
29781
Rules to allow runtime checks are not part of the intrinsic declaration and
are controlled by compiler options.

This intrinsic is the non-ubsan-specific version of ``@llvm.allow.ubsan.check()``.
29786
29787Arguments:
29788""""""""""
29789
29790A string identifying the kind of runtime check guarded by the intrinsic. The
29791string can be used to control rules to allow checks.
29792
29793Semantics:
29794""""""""""
29795
29796The intrinsic ``@llvm.allow.runtime.check()`` returns either ``true`` or
29797``false``, depending on compiler options.
29798
29799For each evaluation of a call to this intrinsic, the program must be valid and
29800correct both if it returns ``true`` and if it returns ``false``.
29801
29802When used in a branch condition, it allows us to choose between
29803two alternative correct solutions for the same problem.
29804
If the intrinsic evaluates to ``true``, the program should execute a guarded
check. If the intrinsic evaluates to ``false``, the program should avoid any
unnecessary checks.
29808
29809Example:
29810
29811.. code-block:: text
29812
29813    %allow = call i1 @llvm.allow.runtime.check(metadata !"my_check")
29814    br i1 %allow, label %fast_path, label %slow_path
29815
29816  fast_path:
29817    ; Omit diagnostics.
29818
29819  slow_path:
29820    ; Additional diagnostics.
29821
29822
29823'``llvm.load.relative``' Intrinsic
29824^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29825
29826Syntax:
29827"""""""
29828
29829::
29830
29831      declare ptr @llvm.load.relative.iN(ptr %ptr, iN %offset) nounwind memory(argmem: read)
29832
29833Overview:
29834"""""""""
29835
29836This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
29837adds ``%ptr`` to that value and returns it. The constant folder specifically
29838recognizes the form of this intrinsic and the constant initializers it may
29839load from; if a loaded constant initializer is known to have the form
29840``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
29841
29842LLVM provides that the calculation of such a constant initializer will
29843not overflow at link time under the medium code model if ``x`` is an
29844``unnamed_addr`` function. However, it does not provide this guarantee for
29845a constant initializer folded into a function body. This intrinsic can be
29846used to avoid the possibility of overflows when loading from such a constant.
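
The following sketch shows a call with a 32-bit offset next to a hand-written
expansion of the behavior described above. The function names, the byte offset
of 4, and the ``align 4`` on the load are assumptions made for the example.

.. code-block:: llvm

      declare ptr @llvm.load.relative.i32(ptr, i32)

      define ptr @via_intrinsic(ptr %table) {
        %fn = call ptr @llvm.load.relative.i32(ptr %table, i32 4)
        ret ptr %fn
      }

      define ptr @by_hand(ptr %table) {
        ; Address of the 32-bit entry: %table + 4.
        %slot = getelementptr i8, ptr %table, i32 4
        ; The entry holds a value relative to %table.
        %rel = load i32, ptr %slot, align 4
        ; Add %table back to obtain the target address.
        %fn = getelementptr i8, ptr %table, i32 %rel
        ret ptr %fn
      }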
29847
29848.. _llvm_sideeffect:
29849
29850'``llvm.sideeffect``' Intrinsic
29851^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29852
29853Syntax:
29854"""""""
29855
29856::
29857
      declare void @llvm.sideeffect() memory(inaccessiblemem: readwrite) nounwind willreturn
29859
29860Overview:
29861"""""""""
29862
29863The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
29864treat it as having side effects, so it can be inserted into a loop to
29865indicate that the loop shouldn't be assumed to terminate (which could
29866potentially lead to the loop being optimized away entirely), even if it's
29867an infinite loop with no other side effects.
29868
29869Arguments:
29870""""""""""
29871
29872None.
29873
29874Semantics:
29875""""""""""
29876
29877This intrinsic actually does nothing, but optimizers must assume that it
29878has externally observable side effects.
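
For instance, a frontend for a language in which infinite loops are well
defined might emit a loop such as the following sketch (the function name
``@spin`` is hypothetical):

.. code-block:: llvm

      declare void @llvm.sideeffect()

      define void @spin() {
      entry:
        br label %loop

      loop:
        ; Without this call the loop would have no side effects at all.
        call void @llvm.sideeffect()
        br label %loop
      }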
29879
29880'``llvm.is.constant.*``' Intrinsic
29881^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29882
29883Syntax:
29884"""""""
29885
This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.
29887
29888::
29889
29890      declare i1 @llvm.is.constant.i32(i32 %operand) nounwind memory(none)
29891      declare i1 @llvm.is.constant.f32(float %operand) nounwind memory(none)
29892      declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind memory(none)
29893
29894Overview:
29895"""""""""
29896
29897The '``llvm.is.constant``' intrinsic will return true if the argument
29898is known to be a manifest compile-time constant. It is guaranteed to
29899fold to either true or false before generating machine code.
29900
29901Semantics:
29902""""""""""
29903
29904This intrinsic generates no code. If its argument is known to be a
29905manifest compile-time constant value, then the intrinsic will be
29906converted to a constant true value. Otherwise, it will be converted to
29907a constant false value.
29908
29909In particular, note that if the argument is a constant expression
29910which refers to a global (the address of which _is_ a constant, but
29911not manifest during the compile), then the intrinsic evaluates to
29912false.
29913
29914The result also intentionally depends on the result of optimization
29915passes -- e.g., the result can change depending on whether a
29916function gets inlined or not. A function's parameters are
29917obviously not constant. However, a call like
29918``llvm.is.constant.i32(i32 %param)`` *can* return true after the
29919function is inlined, if the value passed to the function parameter was
29920a constant.
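
The two cases can be sketched as follows; the function names are hypothetical.
The first call has a manifest constant argument and would be expected to fold
to ``true``. The second would be expected to fold to ``false`` unless
``@on_parameter`` is first inlined into a caller that passes a constant.

.. code-block:: llvm

      declare i1 @llvm.is.constant.i32(i32)

      define i1 @on_literal() {
        ; The argument is a manifest compile-time constant.
        %r = call i1 @llvm.is.constant.i32(i32 42)
        ret i1 %r
      }

      define i1 @on_parameter(i32 %param) {
        ; The argument is a function parameter, which is not constant here.
        %r = call i1 @llvm.is.constant.i32(i32 %param)
        ret i1 %r
      }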
29921
29922.. _int_ptrmask:
29923
29924'``llvm.ptrmask``' Intrinsic
29925^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29926
29927Syntax:
29928"""""""
29929
29930::
29931
      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) speculatable memory(none)
29933
29934Arguments:
29935""""""""""
29936
29937The first argument is a pointer or vector of pointers. The second argument is
29938an integer or vector of integers with the same bit width as the index type
29939size of the first argument.
29940
29941Overview:
29942""""""""""
29943
29944The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
29945This allows stripping data from tagged pointers without converting them to an
29946integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
29947to facilitate alias analysis and underlying-object detection.
29948
29949Semantics:
29950""""""""""
29951
29952The result of ``ptrmask(%ptr, %mask)`` is equivalent to the following expansion,
29953where ``iPtrIdx`` is the index type size of the pointer::
29954
29955    %intptr = ptrtoint ptr %ptr to iPtrIdx ; this may truncate
29956    %masked = and iPtrIdx %intptr, %mask
29957    %diff = sub iPtrIdx %masked, %intptr
29958    %result = getelementptr i8, ptr %ptr, iPtrIdx %diff
29959
29960If the pointer index type size is smaller than the pointer type size, this
29961implies that pointer bits beyond the index size are not affected by this
29962intrinsic. For integral pointers, it behaves as if the mask were extended with
299631 bits to the pointer type size.
29964
29965Both the returned pointer(s) and the first argument are based on the same
29966underlying object (for more information on the *based on* terminology see
29967:ref:`the pointer aliasing rules <pointeraliasing>`).
29968
29969The intrinsic only captures the pointer argument through the return value.
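
As a sketch, the following function clears the low three bits of a tagged
pointer. It assumes pointers in address space 0 with a 64-bit index type; the
function name ``@strip_tag`` and the mask value are chosen for illustration.

.. code-block:: llvm

      declare ptr @llvm.ptrmask.p0.i64(ptr, i64)

      define ptr @strip_tag(ptr %tagged) {
        ; -8 is 0xFFFFFFFFFFFFFFF8, so the low three tag bits are cleared.
        %p = call ptr @llvm.ptrmask.p0.i64(ptr %tagged, i64 -8)
        ret ptr %p
      }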
29970
29971.. _int_threadlocal_address:
29972
29973'``llvm.threadlocal.address``' Intrinsic
29974^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29975
29976Syntax:
29977"""""""
29978
29979::
29980
29981      declare ptr @llvm.threadlocal.address(ptr) nounwind willreturn memory(none)
29982
29983Arguments:
29984""""""""""
29985
The ``llvm.threadlocal.address`` intrinsic requires a global value argument (a
29987:ref:`global variable <globalvars>` or alias) that is thread local.
29988
29989Semantics:
29990""""""""""
29991
29992The address of a thread local global is not a constant, since it depends on
the calling thread. The ``llvm.threadlocal.address`` intrinsic returns the
29994address of the given thread local global in the calling thread.
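
A minimal sketch of reading a thread local variable through the intrinsic is
shown below. The global ``@counter`` and the function ``@read_counter`` are
hypothetical; the ``.p0`` suffix assumes a pointer in address space 0.

.. code-block:: llvm

      @counter = thread_local global i32 0

      declare ptr @llvm.threadlocal.address.p0(ptr)

      define i32 @read_counter() {
        ; Obtain the address of @counter for the calling thread.
        %addr = call ptr @llvm.threadlocal.address.p0(ptr @counter)
        %v = load i32, ptr %addr
        ret i32 %v
      }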
29995
29996.. _int_vscale:
29997
29998'``llvm.vscale``' Intrinsic
29999^^^^^^^^^^^^^^^^^^^^^^^^^^^
30000
30001Syntax:
30002"""""""
30003
30004::
30005
      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()
30008
30009Overview:
30010"""""""""
30011
30012The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
30013vectors such as ``<vscale x 16 x i8>``.
30014
30015Semantics:
30016""""""""""
30017
30018``vscale`` is a positive value that is constant throughout program
30019execution, but is unknown at compile time.
30020If the result value does not fit in the result type, then the result is
30021a :ref:`poison value <poisonvalues>`.
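
For example, the run-time element count of ``<vscale x 16 x i8>`` can be
computed as in the following sketch (the function name ``@num_elements`` is
hypothetical):

.. code-block:: llvm

      declare i64 @llvm.vscale.i64()

      define i64 @num_elements() {
        %vs = call i64 @llvm.vscale.i64()
        ; <vscale x 16 x i8> holds vscale * 16 elements.
        %n = mul i64 %vs, 16
        ret i64 %n
      }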
30022
30023.. _llvm_fake_use:
30024
30025'``llvm.fake.use``' Intrinsic
30026^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30027
30028Syntax:
30029"""""""
30030
30031::
30032
30033      declare void @llvm.fake.use(...)
30034
30035Overview:
30036"""""""""
30037
30038The ``llvm.fake.use`` intrinsic is a no-op. It takes a single
30039value as an operand and is treated as a use of that operand, to force the
30040optimizer to preserve that value prior to the fake use. This is used for
30041extending the lifetimes of variables, where this intrinsic placed at the end of
30042a variable's scope helps prevent that variable from being optimized out.
30043
30044Arguments:
30045""""""""""
30046
30047The ``llvm.fake.use`` intrinsic takes one argument, which may be any
30048function-local SSA value. Note that the signature is variadic so that the
30049intrinsic can take any type of argument, but passing more than one argument will
30050result in an error.
30051
30052Semantics:
30053""""""""""
30054
30055This intrinsic does nothing, but optimizers must consider it a use of its single
30056operand and should try to preserve the intrinsic and its position in the
30057function.
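
A sketch of a typical placement, at the end of the scope of the value being
kept alive, is shown below. The function ``@f`` and its body are hypothetical.

.. code-block:: llvm

      declare void @llvm.fake.use(...)

      define i32 @f(i32 %x) {
        %y = add i32 %x, 1
        ; Treated as a use of %x, so %x should stay available up to this point.
        call void (...) @llvm.fake.use(i32 %x)
        ret i32 %y
      }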
30058
30059
30060Stack Map Intrinsics
30061--------------------
30062
30063LLVM provides experimental intrinsics to support runtime patching
30064mechanisms commonly desired in dynamic language JITs. These intrinsics
30065are described in :doc:`StackMaps`.
30066
30067Element Wise Atomic Memory Intrinsics
30068-------------------------------------
30069
30070These intrinsics are similar to the standard library memory intrinsics except
30071that they perform memory transfer as a sequence of atomic memory accesses.
30072
30073.. _int_memcpy_element_unordered_atomic:
30074
30075'``llvm.memcpy.element.unordered.atomic``' Intrinsic
30076^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30077
30078Syntax:
30079"""""""
30080
30081This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
30082any integer bit width and for different address spaces. Not all targets
30083support all bit widths however.
30084
30085::
30086
30087      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(ptr <dest>,
30088                                                                   ptr <src>,
30089                                                                   i32 <len>,
30090                                                                   i32 <element_size>)
30091      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr <dest>,
30092                                                                   ptr <src>,
30093                                                                   i64 <len>,
30094                                                                   i32 <element_size>)
30095
30096Overview:
30097"""""""""
30098
30099The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
30100'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
30101as arrays with elements that are exactly ``element_size`` bytes, and the copy between
30102buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
30103that are a positive integer multiple of the ``element_size`` in size.
30104
30105Arguments:
30106""""""""""
30107
30108The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
30109intrinsic, with the added constraint that ``len`` is required to be a positive integer
30110multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
30111``element_size``, then the behavior of the intrinsic is undefined.
30112
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.
30119
30120Semantics:
30121""""""""""
30122
30123The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
30124memory from the source location to the destination location. These locations are not
30125allowed to overlap. The memory copy is performed as a sequence of load/store operations
30126where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
30127aligned at an ``element_size`` boundary.
30128
30129The order of the copy is unspecified. The same value may be read from the source
30130buffer many times, but only one write is issued to the destination buffer per
30131element. It is well defined to have concurrent reads and writes to both source and
30132destination provided those reads and writes are unordered atomic when specified.
30133
30134This intrinsic does not provide any additional ordering guarantees over those
30135provided by a set of unordered loads from the source location and stores to the
30136destination.
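
The argument constraints can be illustrated with the following sketch, which
copies 16 bytes as 4-byte elements. The function name ``@copy_elements`` and
the sizes are assumptions made for the example; note that ``len`` is a
multiple of ``element_size`` and that both pointers carry an ``align``
attribute no smaller than ``element_size``.

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32 immarg)

      define void @copy_elements(ptr %dst, ptr %src) {
        ; len = 16 bytes, element_size = 4 bytes, both pointers 4-byte aligned.
        call void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(
            ptr align 4 %dst, ptr align 4 %src, i64 16, i32 4)
        ret void
      }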
30137
30138Lowering:
30139"""""""""
30140
In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.
30148
30149'``llvm.memmove.element.unordered.atomic``' Intrinsic
30150^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30151
30152Syntax:
30153"""""""
30154
30155This is an overloaded intrinsic. You can use
30156``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
30157different address spaces. Not all targets support all bit widths however.
30158
30159::
30160
30161      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i32(ptr <dest>,
30162                                                                    ptr <src>,
30163                                                                    i32 <len>,
30164                                                                    i32 <element_size>)
30165      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr <dest>,
30166                                                                    ptr <src>,
30167                                                                    i64 <len>,
30168                                                                    i32 <element_size>)
30169
30170Overview:
30171"""""""""
30172
30173The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
30174of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
30175``src`` are treated as arrays with elements that are exactly ``element_size``
30176bytes, and the copy between buffers uses a sequence of
30177:ref:`unordered atomic <ordering>` load/store operations that are a positive
30178integer multiple of the ``element_size`` in size.
30179
30180Arguments:
30181""""""""""
30182
30183The first three arguments are the same as they are in the
30184:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
30185``len`` is required to be a positive integer multiple of the ``element_size``.
30186If ``len`` is not a positive integer multiple of ``element_size``, then the
30187behavior of the intrinsic is undefined.
30188
30189``element_size`` must be a compile-time constant positive power of two no
30190greater than a target-specific atomic access size limit.
30191
30192For each of the input pointers the ``align`` parameter attribute must be
30193specified. It must be a power of two no less than the ``element_size``. Caller
30194guarantees that both the source and destination pointers are aligned to that
30195boundary.
30196
30197Semantics:
30198""""""""""
30199
30200The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
30201of memory from the source location to the destination location. These locations
30202are allowed to overlap. The memory copy is performed as a sequence of load/store
30203operations where each access is guaranteed to be a multiple of ``element_size``
30204bytes wide and aligned at an ``element_size`` boundary.
30205
30206The order of the copy is unspecified. The same value may be read from the source
30207buffer many times, but only one write is issued to the destination buffer per
30208element. It is well defined to have concurrent reads and writes to both source
30209and destination provided those reads and writes are unordered atomic when
30210specified.
30211
30212This intrinsic does not provide any additional ordering guarantees over those
30213provided by a set of unordered loads from the source location and stores to the
30214destination.
30215
30216Lowering:
30217"""""""""
30218
In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on
GC-specific lowering.
30225
30226The optimizer is allowed to inline the memory copy when it's profitable to do so.
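
As an illustration of the constraints above, the following is a minimal sketch
of a call that moves sixteen 4-byte elements within one buffer. The function
name ``@shift_up`` and the chosen sizes are assumptions made for this example
only.

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32)

      ; Hypothetical helper: moves sixteen i32 elements up by one slot within
      ; the same buffer, so the source and destination ranges overlap.
      define void @shift_up(ptr align 4 %buf) {
      entry:
        %dest = getelementptr inbounds i32, ptr %buf, i64 1
        ; len = 64 bytes is a positive multiple of element_size = 4, and both
        ; pointers carry an align attribute no less than the element size.
        call void @llvm.memmove.element.unordered.atomic.p0.p0.i64(
            ptr align 4 %dest, ptr align 4 %buf, i64 64, i32 4)
        ret void
      }

With ``element_size`` equal to 4, the general lowering described above would
call the symbol ``__llvm_memmove_element_unordered_atomic_4``.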
30227
30228.. _int_memset_element_unordered_atomic:
30229
30230'``llvm.memset.element.unordered.atomic``' Intrinsic
30231^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30232
30233Syntax:
30234"""""""
30235
30236This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces. However, not all targets
support all bit widths.
30239
30240::
30241
30242      declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr <dest>,
30243                                                                i8 <value>,
30244                                                                i32 <len>,
30245                                                                i32 <element_size>)
30246      declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr <dest>,
30247                                                                i8 <value>,
30248                                                                i64 <len>,
30249                                                                i32 <element_size>)
30250
30251Overview:
30252"""""""""
30253
30254The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
30255'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
30256with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
30258that are a positive integer multiple of the ``element_size`` in size.
30259
30260Arguments:
30261""""""""""
30262
30263The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
30264intrinsic, with the added constraint that ``len`` is required to be a positive integer
30265multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
30266``element_size``, then the behavior of the intrinsic is undefined.
30267
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.
30270
30271The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
30272must be a power of two no less than the ``element_size``. Caller guarantees that
30273the destination pointer is aligned to that boundary.
30274
30275Semantics:
30276""""""""""
30277
30278The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
30279memory starting at the destination location to the given ``value``. The memory is
30280set with a sequence of store operations where each access is guaranteed to be a
30281multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
30282
30283The order of the assignment is unspecified. Only one write is issued to the
30284destination buffer per element. It is well defined to have concurrent reads and
30285writes to the destination provided those reads and writes are unordered atomic
30286when specified.
30287
30288This intrinsic does not provide any additional ordering guarantees over those
30289provided by a set of unordered stores to the destination.
30290
30291Lowering:
30292"""""""""
30293
In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memset_element_unordered_atomic_*``, where '*' is replaced with the actual
element size.
30297
30298The optimizer is allowed to inline the memory assignment when it's profitable to do so.
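
A minimal sketch of a conforming call follows. The function name
``@zero_words`` and the sizes used are assumptions chosen for illustration.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr, i8, i64, i32)

      ; Hypothetical helper: zeroes 128 bytes as sixteen 8-byte elements.
      define void @zero_words(ptr align 8 %dest) {
      entry:
        ; len = 128 is a positive multiple of element_size = 8, and the
        ; destination carries an align attribute no less than the element size.
        call void @llvm.memset.element.unordered.atomic.p0.i64(
            ptr align 8 %dest, i8 0, i64 128, i32 8)
        ret void
      }

With ``element_size`` equal to 8, the general lowering described above would
call the symbol ``__llvm_memset_element_unordered_atomic_8``.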
30299
30300Objective-C ARC Runtime Intrinsics
30301----------------------------------
30302
30303LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
30304LLVM is aware of the semantics of these functions, and optimizes based on that
30305knowledge. You can read more about the details of Objective-C ARC `here
30306<https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
30307
30308'``llvm.objc.autorelease``' Intrinsic
30309^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30310
30311Syntax:
30312"""""""
30313::
30314
30315      declare ptr @llvm.objc.autorelease(ptr)
30316
30317Lowering:
30318"""""""""
30319
30320Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
30321
30322'``llvm.objc.autoreleasePoolPop``' Intrinsic
30323^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30324
30325Syntax:
30326"""""""
30327::
30328
30329      declare void @llvm.objc.autoreleasePoolPop(ptr)
30330
30331Lowering:
30332"""""""""
30333
30334Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
30335
30336'``llvm.objc.autoreleasePoolPush``' Intrinsic
30337^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30338
30339Syntax:
30340"""""""
30341::
30342
30343      declare ptr @llvm.objc.autoreleasePoolPush()
30344
30345Lowering:
30346"""""""""
30347
30348Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
30349
30350'``llvm.objc.autoreleaseReturnValue``' Intrinsic
30351^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30352
30353Syntax:
30354"""""""
30355::
30356
30357      declare ptr @llvm.objc.autoreleaseReturnValue(ptr)
30358
30359Lowering:
30360"""""""""
30361
30362Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
30363
30364'``llvm.objc.copyWeak``' Intrinsic
30365^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30366
30367Syntax:
30368"""""""
30369::
30370
30371      declare void @llvm.objc.copyWeak(ptr, ptr)
30372
30373Lowering:
30374"""""""""
30375
30376Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
30377
30378'``llvm.objc.destroyWeak``' Intrinsic
30379^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30380
30381Syntax:
30382"""""""
30383::
30384
30385      declare void @llvm.objc.destroyWeak(ptr)
30386
30387Lowering:
30388"""""""""
30389
30390Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
30391
30392'``llvm.objc.initWeak``' Intrinsic
30393^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30394
30395Syntax:
30396"""""""
30397::
30398
30399      declare ptr @llvm.objc.initWeak(ptr, ptr)
30400
30401Lowering:
30402"""""""""
30403
30404Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
30405
30406'``llvm.objc.loadWeak``' Intrinsic
30407^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30408
30409Syntax:
30410"""""""
30411::
30412
30413      declare ptr @llvm.objc.loadWeak(ptr)
30414
30415Lowering:
30416"""""""""
30417
30418Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
30419
30420'``llvm.objc.loadWeakRetained``' Intrinsic
30421^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30422
30423Syntax:
30424"""""""
30425::
30426
30427      declare ptr @llvm.objc.loadWeakRetained(ptr)
30428
30429Lowering:
30430"""""""""
30431
30432Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
30433
30434'``llvm.objc.moveWeak``' Intrinsic
30435^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30436
30437Syntax:
30438"""""""
30439::
30440
30441      declare void @llvm.objc.moveWeak(ptr, ptr)
30442
30443Lowering:
30444"""""""""
30445
30446Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
30447
30448'``llvm.objc.release``' Intrinsic
30449^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30450
30451Syntax:
30452"""""""
30453::
30454
30455      declare void @llvm.objc.release(ptr)
30456
30457Lowering:
30458"""""""""
30459
30460Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
30461
30462'``llvm.objc.retain``' Intrinsic
30463^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30464
30465Syntax:
30466"""""""
30467::
30468
30469      declare ptr @llvm.objc.retain(ptr)
30470
30471Lowering:
30472"""""""""
30473
30474Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
30475
30476'``llvm.objc.retainAutorelease``' Intrinsic
30477^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30478
30479Syntax:
30480"""""""
30481::
30482
30483      declare ptr @llvm.objc.retainAutorelease(ptr)
30484
30485Lowering:
30486"""""""""
30487
30488Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
30489
30490'``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
30491^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30492
30493Syntax:
30494"""""""
30495::
30496
30497      declare ptr @llvm.objc.retainAutoreleaseReturnValue(ptr)
30498
30499Lowering:
30500"""""""""
30501
30502Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
30503
30504'``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
30505^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30506
30507Syntax:
30508"""""""
30509::
30510
30511      declare ptr @llvm.objc.retainAutoreleasedReturnValue(ptr)
30512
30513Lowering:
30514"""""""""
30515
30516Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
30517
30518'``llvm.objc.retainBlock``' Intrinsic
30519^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30520
30521Syntax:
30522"""""""
30523::
30524
30525      declare ptr @llvm.objc.retainBlock(ptr)
30526
30527Lowering:
30528"""""""""
30529
30530Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
30531
30532'``llvm.objc.storeStrong``' Intrinsic
30533^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30534
30535Syntax:
30536"""""""
30537::
30538
30539      declare void @llvm.objc.storeStrong(ptr, ptr)
30540
30541Lowering:
30542"""""""""
30543
30544Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
30545
30546'``llvm.objc.storeWeak``' Intrinsic
30547^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30548
30549Syntax:
30550"""""""
30551::
30552
30553      declare ptr @llvm.objc.storeWeak(ptr, ptr)
30554
30555Lowering:
30556"""""""""
30557
30558Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
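
The following sketch shows how a few of the intrinsics listed above might
appear together in IR. The function name ``@arc_example`` is hypothetical, and
the sequence is only an illustration of the signatures, not a description of
what a frontend actually emits.

.. code-block:: llvm

      declare ptr @llvm.objc.retain(ptr)
      declare void @llvm.objc.release(ptr)
      declare ptr @llvm.objc.initWeak(ptr, ptr)
      declare ptr @llvm.objc.loadWeakRetained(ptr)
      declare void @llvm.objc.destroyWeak(ptr)

      ; Hypothetical function: takes a strong reference, creates a weak
      ; reference to the same object, reads it back, and cleans up.
      define void @arc_example(ptr %obj) {
      entry:
        %weak = alloca ptr
        %strong = call ptr @llvm.objc.retain(ptr %obj)
        %init = call ptr @llvm.objc.initWeak(ptr %weak, ptr %strong)
        ; Loading through the weak slot yields a retained reference that
        ; must be released separately.
        %loaded = call ptr @llvm.objc.loadWeakRetained(ptr %weak)
        call void @llvm.objc.release(ptr %loaded)
        call void @llvm.objc.destroyWeak(ptr %weak)
        call void @llvm.objc.release(ptr %strong)
        ret void
      }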
30559
30560Preserving Debug Information Intrinsics
30561---------------------------------------
30562
30563These intrinsics are used to carry certain debuginfo together with
30564IR-level operations. For example, it may be desirable to
30565know the structure/union name and the original user-level field
indices. Such information is lost in the IR GetElementPtr instruction
because the IR types differ from the debuginfo types and unions
are converted to structs in IR.
30569
30570'``llvm.preserve.array.access.index``' Intrinsic
30571^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30572
30573Syntax:
30574"""""""
30575::
30576
30577      declare <ret_type>
30578      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
30579                                                                           i32 dim,
30580                                                                           i32 index)
30581
30582Overview:
30583"""""""""
30584
30585The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
30586based on array base ``base``, array dimension ``dim`` and the last access index ``index``
30587into the array. The return type ``ret_type`` is a pointer type to the array element.
Preserving the array ``dim`` and ``index`` is more robust than relying on a
getelementptr instruction, which may be subject to compiler transformations.
30590The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
30591to provide array or pointer debuginfo type.
30592The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
30593debuginfo version of ``type``.
30594
30595Arguments:
30596""""""""""
30597
30598The ``base`` is the array base address.  The ``dim`` is the array dimension.
30599The ``base`` is a pointer if ``dim`` equals 0.
30600The ``index`` is the last access index into the array or pointer.
30601
30602The ``base`` argument must be annotated with an :ref:`elementtype
30603<attr_elementtype>` attribute at the call-site. This attribute specifies the
30604getelementptr element type.
30605
30606Semantics:
30607""""""""""
30608
30609The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``,
that is, ``dim`` zero indices followed by ``index``.
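
As a sketch of the equivalence stated above, consider an access to element 3
of a ``[10 x i32]`` array. The function name ``@fourth_element`` is
hypothetical, the ``.p0.p0`` suffix assumes opaque-pointer name mangling, and
the ``llvm.preserve.access.index`` metadata attachment is omitted for brevity.

.. code-block:: llvm

      declare ptr @llvm.preserve.array.access.index.p0.p0(ptr, i32, i32)

      define ptr @fourth_element(ptr %base) {
      entry:
        ; dim = 1 and index = 3, so the access operands are {0, 3}.
        %elt = call ptr @llvm.preserve.array.access.index.p0.p0(
            ptr elementtype([10 x i32]) %base, i32 1, i32 3)
        ; Per the semantics above, %elt is the same address as:
        ;   getelementptr [10 x i32], ptr %base, i32 0, i32 3
        ret ptr %elt
      }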
30611
30612'``llvm.preserve.union.access.index``' Intrinsic
30613^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30614
30615Syntax:
30616"""""""
30617::
30618
30619      declare <type>
30620      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
30621                                                                        i32 di_index)
30622
30623Overview:
30624"""""""""
30625
30626The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
30627``di_index`` and returns the ``base`` address.
30628The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
30629to provide union debuginfo type.
30630The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
30631The return type ``type`` is the same as the ``base`` type.
30632
30633Arguments:
30634""""""""""
30635
30636The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
30637
30638Semantics:
30639""""""""""
30640
30641The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
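
A minimal sketch of a call follows. The function name ``@union_member`` and
the field index are assumptions, the ``.p0.p0`` suffix assumes opaque-pointer
name mangling, and the ``llvm.preserve.access.index`` metadata attachment is
omitted for brevity.

.. code-block:: llvm

      declare ptr @llvm.preserve.union.access.index.p0.p0(ptr, i32)

      define ptr @union_member(ptr %base) {
      entry:
        ; di_index = 1 records the debuginfo field index; the returned
        ; address is the same as %base.
        %member = call ptr @llvm.preserve.union.access.index.p0.p0(ptr %base, i32 1)
        ret ptr %member
      }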
30642
30643'``llvm.preserve.struct.access.index``' Intrinsic
30644^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30645
30646Syntax:
30647"""""""
30648::
30649
30650      declare <ret_type>
30651      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
30652                                                                 i32 gep_index,
30653                                                                 i32 di_index)
30654
30655Overview:
30656"""""""""
30657
30658The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
30659based on struct base ``base`` and IR struct member index ``gep_index``.
30660The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
30661to provide struct debuginfo type.
30662The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
30663The return type ``ret_type`` is a pointer type to the structure member.
30664
30665Arguments:
30666""""""""""
30667
30668The ``base`` is the structure base address. The ``gep_index`` is the struct member index
30669based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
30670
30671The ``base`` argument must be annotated with an :ref:`elementtype
30672<attr_elementtype>` attribute at the call-site. This attribute specifies the
30673getelementptr element type.
30674
30675Semantics:
30676""""""""""
30677
30678The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
30679as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
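
As a sketch of the equivalence stated above, consider an access to the second
member of a two-field struct. The type ``%struct.S`` and the function name
``@second_field`` are hypothetical, the ``.p0.p0`` suffix assumes
opaque-pointer name mangling, and the ``llvm.preserve.access.index`` metadata
attachment is omitted for brevity.

.. code-block:: llvm

      %struct.S = type { i32, i64 }

      declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32, i32)

      define ptr @second_field(ptr %base) {
      entry:
        ; gep_index = 1 selects the IR struct member; di_index = 1 is the
        ; corresponding member index in debuginfo.
        %field = call ptr @llvm.preserve.struct.access.index.p0.p0(
            ptr elementtype(%struct.S) %base, i32 1, i32 1)
        ; Per the semantics above, %field is the same address as:
        ;   getelementptr %struct.S, ptr %base, i32 0, i32 1
        ret ptr %field
      }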
30680
30681'``llvm.fptrunc.round``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30683
30684Syntax:
30685"""""""
30686
30687::
30688
30689      declare <ty2>
30690      @llvm.fptrunc.round(<type> <value>, metadata <rounding mode>)
30691
30692Overview:
30693"""""""""
30694
30695The '``llvm.fptrunc.round``' intrinsic truncates
30696:ref:`floating-point <t_floating>` ``value`` to type ``ty2``
30697with a specified rounding mode.
30698
30699Arguments:
30700""""""""""
30701
The first argument of the '``llvm.fptrunc.round``' intrinsic is the
:ref:`floating-point <t_floating>` value to cast; the return type ``ty2`` is the
:ref:`floating-point <t_floating>` type to cast it to. The type of the value
must be larger in size than ``ty2``.
30705
30706The second argument specifies the rounding mode as described in the constrained
30707intrinsics section.
30708For this intrinsic, the "round.dynamic" mode is not supported.
30709
30710Semantics:
30711""""""""""
30712
30713The '``llvm.fptrunc.round``' intrinsic casts a ``value`` from a larger
30714:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
30715<t_floating>` type.
30716This intrinsic is assumed to execute in the default :ref:`floating-point
30717environment <floatenv>` *except* for the rounding mode.
30718This intrinsic is not supported on all targets. Some targets may not support
30719all rounding modes.
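
A minimal sketch of a call that truncates ``float`` to ``half`` while rounding
toward zero follows. The function name ``@trunc_toward_zero`` is hypothetical,
and the ``.f16.f32`` suffix assumes the usual overloaded-intrinsic name
mangling (result type, then argument type). Whether this particular type pair
and rounding mode are supported depends on the target.

.. code-block:: llvm

      declare half @llvm.fptrunc.round.f16.f32(float, metadata)

      define half @trunc_toward_zero(float %x) {
      entry:
        ; The rounding mode is passed as a metadata string; "round.dynamic"
        ; is not accepted by this intrinsic.
        %r = call half @llvm.fptrunc.round.f16.f32(float %x,
                                                   metadata !"round.towardzero")
        ret half %r
      }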
30720