========================================
Machine IR (MIR) Format Reference Manual
========================================

.. contents::
   :local:

.. warning::
  This is a work in progress.

Introduction
============

This document is a reference manual for the Machine IR (MIR) serialization
format. MIR is a human readable serialization format that is used to represent
LLVM's :ref:`machine specific intermediate representation
<machine code representation>`.

The MIR serialization format is designed to be used for testing the code
generation passes in LLVM.

Overview
========

The MIR serialization format uses a YAML container. YAML is a standard
data serialization language, and the full YAML language spec can be read at
`yaml.org
<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_.

A MIR file is split up into a series of `YAML documents`_. The first document
can contain an optional embedded LLVM IR module, and the rest of the documents
contain the serialized machine functions.

.. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132

MIR Testing Guide
=================

You can use the MIR format for testing in two different ways:

- You can write MIR tests that invoke a single code generation pass using the
  ``-run-pass`` option in llc.

- You can use llc's ``-stop-after`` option with existing or new LLVM assembly
  tests and check the MIR output of a specific code generation pass.

Testing Individual Code Generation Passes
-----------------------------------------

The ``-run-pass`` option in llc allows you to create MIR tests that invoke just
a single code generation pass. When this option is used, llc will parse an
input MIR file, run the specified code generation pass(es), and output the
resulting MIR code.

You can generate an input MIR file for the test by using the ``-stop-after`` or
``-stop-before`` option in llc.
For example, if you would like to write a test
for the post register allocation pseudo instruction expansion pass, you can
specify the machine copy propagation pass in the ``-stop-after`` option, as it
runs just before the pass that we are trying to test:

   ``llc -stop-after=machine-cp bug-trigger.ll -o test.mir``

If the same pass is run multiple times, a run index can be included
after the name with a comma.

   ``llc -stop-after=dead-mi-elimination,1 bug-trigger.ll -o test.mir``

After generating the input MIR file, you'll have to add a run line that uses
the ``-run-pass`` option to it. In order to test the post register allocation
pseudo instruction expansion pass on X86-64, a run line like the one shown
below can be used:

    ``# RUN: llc -o - %s -mtriple=x86_64-- -run-pass=postrapseudos | FileCheck %s``

The MIR files are target dependent, so they have to be placed in the target
specific test directories (``test/CodeGen/TARGETNAME``). They also need to
specify a target triple or a target architecture either in the run line or in
the embedded LLVM IR module.

Simplifying MIR files
^^^^^^^^^^^^^^^^^^^^^

The MIR code coming out of ``-stop-after``/``-stop-before`` is very verbose;
tests are more accessible and future proof when simplified:

- Use the ``-simplify-mir`` option with llc.

- Machine function attributes often have default values or the test works just
  as well with default values. Typical candidates for this are: `alignment:`,
  `exposesReturnsTwice`, `legalized`, `regBankSelected`, `selected`.
  The whole `frameInfo` section is often unnecessary if there is no special
  frame usage in the function. `tracksRegLiveness` on the other hand is often
  necessary for some passes that care about block livein lists.

- The (global) `liveins:` list is typically only interesting for early
  instruction selection passes and can be removed when testing later passes.
  The per-block `liveins:` on the other hand are necessary if
  `tracksRegLiveness` is true.

- Branch probability data in block `successors:` lists can be dropped if the
  test doesn't depend on it. Example:
  `successors: %bb.1(0x40000000), %bb.2(0x40000000)` can be replaced with
  `successors: %bb.1, %bb.2`.

- MIR code contains a whole IR module. This is necessary because there are
  no equivalents in MIR for global variables, references to external functions,
  function attributes, metadata, or debug info. Instead some MIR data references
  the IR constructs. You can often remove them if the test doesn't depend on
  them.

- Alias Analysis is performed on IR values. These are referenced by memory
  operands in MIR. Example: `:: (load 8 from %ir.foobar, !alias.scope !9)`.
  If the test doesn't depend on (good) alias analysis the references can be
  dropped: `:: (load 8)`

- MIR blocks can reference IR blocks for debug printing, profile information
  or debug locations. Example: `bb.42.myblock` in MIR references the IR block
  `myblock`. It is usually possible to drop the `.myblock` reference and simply
  use `bb.42`.

- If there are no memory operands or blocks referencing the IR then the
  IR function can be replaced by a parameterless dummy function like
  `define void @func() { ret void }`.

- It is possible to drop the whole IR section of the MIR file if it only
  contains dummy functions (see above). The .mir loader will create the
  IR functions automatically in this case.

.. _limitations:

Limitations
-----------

Currently the MIR format has several limitations in terms of which state it
can serialize:

- The target-specific state in the target-specific ``MachineFunctionInfo``
  subclasses isn't serialized at the moment.

- The target-specific ``MachineConstantPoolValue`` subclasses (in the ARM and
  SystemZ backends) aren't serialized at the moment.

- The ``MCSymbol`` machine operands don't support temporary or local symbols.

- A lot of the state in ``MachineModuleInfo`` isn't serialized - only the CFI
  instructions and the variable debug information from MMI are serialized right
  now.

These limitations impose restrictions on what you can test with the MIR format.
For now, tests that depend on the state of temporary or local ``MCSymbol``
operands or on the exception handling state in MMI can't use the MIR format.
Likewise, tests that depend on the state of the target specific
``MachineFunctionInfo`` or ``MachineConstantPoolValue`` subclasses can't use
the MIR format at the moment.

High Level Structure
====================

.. _embedded-module:

Embedded Module
---------------

When the first YAML document contains a `YAML block literal string`_, the MIR
parser will treat this string as an LLVM assembly language string that
represents an embedded LLVM IR module.
Here is an example of a YAML document that contains an LLVM module:

.. code-block:: llvm

       define i32 @inc(ptr %x) {
       entry:
         %0 = load i32, ptr %x
         %1 = add i32 %0, 1
         store i32 %1, ptr %x
         ret i32 %1
       }

.. _YAML block literal string: http://www.yaml.org/spec/1.2/spec.html#id2795688

Machine Functions
-----------------

The remaining YAML documents contain the machine functions. This is an example
of such a YAML document:

.. code-block:: text

     ---
     name: inc
     tracksRegLiveness: true
     liveins:
       - { reg: '$rdi' }
     callSites:
       - { bb: 0, offset: 3, fwdArgRegs:
           - { arg: 0, reg: '$edi' } }
     body: |
       bb.0.entry:
         liveins: $rdi

         $eax = MOV32rm $rdi, 1, _, 0, _
         $eax = INC32r killed $eax, implicit-def dead $eflags
         MOV32mr killed $rdi, 1, _, 0, _, $eax
         CALL64pcrel32 @foo <regmask...>
         RETQ $eax
     ...

The document above consists of attributes that represent the various
properties and data structures in a machine function.

The attribute ``name`` is required, and its value should be identical to the
name of a function that this machine function is based on.

The attribute ``body`` is a `YAML block literal string`_. Its value represents
the function's machine basic blocks and their machine instructions.

The attribute ``callSites`` is a representation of call site information which
keeps track of call instructions and registers used to transfer call arguments.

Machine Instructions Format Reference
=====================================

The machine basic blocks and their instructions are represented using a custom,
human readable serialization language. This language is used in the
`YAML block literal string`_ that corresponds to the machine function's body.

A source string that uses this language contains a list of machine basic
blocks, which are described in the section below.

Machine Basic Blocks
--------------------

A machine basic block is defined in a single block definition source construct
that contains the block's ID.
The example below defines two blocks that have an ID of zero and one:

.. code-block:: text

    bb.0:
      <instructions>
    bb.1:
      <instructions>

A machine basic block can also have a name. It should be specified after the ID
in the block's definition:

.. code-block:: text

    bb.0.entry:       ; This block's name is "entry"
      <instructions>

The block's name should be identical to the name of the IR block that this
machine block is based on.

.. _block-references:

Block References
^^^^^^^^^^^^^^^^

The machine basic blocks are identified by their ID numbers. Individual
blocks are referenced using the following syntax:

.. code-block:: text

    %bb.<id>

Example:

.. code-block:: llvm

    %bb.0

The following syntax is also supported, but the former syntax is preferred for
block references:

.. code-block:: text

    %bb.<id>[.<name>]

Example:

.. code-block:: llvm

    %bb.1.then

Successors
^^^^^^^^^^

The machine basic block's successors have to be specified before any of the
instructions:

.. code-block:: text

    bb.0.entry:
      successors: %bb.1.then, %bb.2.else
      <instructions>
    bb.1.then:
      <instructions>
    bb.2.else:
      <instructions>

The branch weights can be specified in brackets after the successor blocks.
The example below defines a block that has two successors with branch weights
of 32 and 16:

.. code-block:: text

    bb.0.entry:
      successors: %bb.1.then(32), %bb.2.else(16)

.. _bb-liveins:

Live In Registers
^^^^^^^^^^^^^^^^^

The machine basic block's live in registers have to be specified before any of
the instructions:

.. code-block:: text

    bb.0.entry:
      liveins: $edi, $esi

The list of live in registers and successors can be empty. The language also
allows multiple live in register and successor lists - they are combined into
one list by the parser.

Miscellaneous Attributes
^^^^^^^^^^^^^^^^^^^^^^^^

The attributes ``IsAddressTaken``, ``IsLandingPad``,
``IsInlineAsmBrIndirectTarget`` and ``Alignment`` can be specified in brackets
after the block's definition:

.. code-block:: text

    bb.0.entry (address-taken):
      <instructions>
    bb.2.else (align 4):
      <instructions>
    bb.3(landing-pad, align 4):
      <instructions>
    bb.4 (inlineasm-br-indirect-target):
      <instructions>

.. TODO: Describe the way the reference to an unnamed LLVM IR block can be
   preserved.

``Alignment`` is specified in bytes, and must be a power of two.

.. _mir-instructions:

Machine Instructions
--------------------

A machine instruction is composed of a name,
:ref:`machine operands <machine-operands>`,
:ref:`instruction flags <instruction-flags>`, and machine memory operands.

The instruction's name is usually specified before the operands. The example
below shows an instance of the X86 ``RETQ`` instruction with a single machine
operand:

.. code-block:: text

    RETQ $eax

However, if the machine instruction has one or more explicitly defined register
operands, the instruction's name has to be specified after them. The example
below shows an instance of the AArch64 ``LDPXpost`` instruction with three
defined register operands:

.. code-block:: text

    $sp, $fp, $lr = LDPXpost $sp, 2

The instruction names are serialized using the exact definitions from the
target's ``*InstrInfo.td`` files, and they are case sensitive. This means that
similar instruction names like ``TSTri`` and ``tSTRi`` represent different
machine instructions.

.. _instruction-flags:

Instruction Flags
^^^^^^^^^^^^^^^^^

The flag ``frame-setup`` or ``frame-destroy`` can be specified before the
instruction's name:

.. code-block:: text

    $fp = frame-setup ADDXri $sp, 0, 0

.. code-block:: text

    $x21, $x20 = frame-destroy LDPXi $sp

.. _registers:

Bundled Instructions
^^^^^^^^^^^^^^^^^^^^

The syntax for bundled instructions is the following:

.. code-block:: text

    BUNDLE implicit-def $r0, implicit-def $r1, implicit $r2 {
      $r0 = SOME_OP $r2
      $r1 = ANOTHER_OP internal $r0
    }

The first instruction is often a bundle header. The instructions between ``{``
and ``}`` are bundled with the first instruction.

.. _mir-registers:

Registers
---------

Registers are one of the key primitives in the machine instructions
serialization language. They are primarily used in the
:ref:`register machine operands <register-operands>`,
but they can also be used in a number of other places, like the
:ref:`basic block's live in list <bb-liveins>`.

The physical registers are identified by their name and by the '$' prefix sigil.
They use the following syntax:

.. code-block:: text

    $<name>

The example below shows three X86 physical registers:

.. code-block:: text

    $eax
    $r15
    $eflags

The virtual registers are identified by their ID number and by the '%' sigil.
They use the following syntax:

.. code-block:: text

    %<id>

Example:

.. code-block:: text

    %0

The null registers are represented using an underscore ('``_``'). They can also be
represented using a '``$noreg``' named register, although the former syntax
is preferred.

.. _machine-operands:

Machine Operands
----------------

There are eighteen different kinds of machine operands, and all of them can be
serialized.

Immediate Operands
^^^^^^^^^^^^^^^^^^

The immediate machine operands are untyped, 64-bit signed integers. The
example below shows an instance of the X86 ``MOV32ri`` instruction that has an
immediate machine operand ``-42``:

.. code-block:: text

    $eax = MOV32ri -42

An immediate operand is also used to represent a subregister index when the
machine instruction has one of the following opcodes:

- ``EXTRACT_SUBREG``

- ``INSERT_SUBREG``

- ``REG_SEQUENCE``

- ``SUBREG_TO_REG``

In case this is true, the Machine Operand is printed according to the target.

For example:

In AArch64RegisterInfo.td:

.. code-block:: text

  def sub_32 : SubRegIndex<32>;

If the third operand is an immediate with the value ``15`` (target-dependent
value), based on the instruction's opcode and the operand's index the operand
will be printed as ``%subreg.sub_32``:

.. code-block:: text

    %1:gpr64 = SUBREG_TO_REG 0, %0, %subreg.sub_32

For integers wider than 64 bits, we use a special machine operand,
``MO_CImmediate``, which stores the immediate in a ``ConstantInt`` using an
``APInt`` (LLVM's arbitrary precision integers).

.. TODO: Describe the FPIMM immediate operands.

.. _register-operands:

Register Operands
^^^^^^^^^^^^^^^^^

The :ref:`register <registers>` primitive is used to represent the register
machine operands. The register operands can also have optional
:ref:`register flags <register-flags>`,
:ref:`a subregister index <subregister-indices>`,
and a reference to the tied register operand.
The full syntax of a register operand is shown below:

.. code-block:: text

    [<flags>] <register> [ :<subregister-idx-name> ] [ (tied-def <tied-op>) ]

This example shows an instance of the X86 ``XOR32rr`` instruction that has
5 register operands with different register flags:

.. code-block:: text

  dead $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, implicit-def $al

.. _register-flags:

Register Flags
~~~~~~~~~~~~~~

The table below shows all of the possible register flags along with the
corresponding internal ``llvm::RegState`` representation:

..
   Keep this in sync with MachineInstrBuilder.h

.. list-table::
   :header-rows: 1

   * - Flag
     - Internal Value
     - Meaning

   * - ``implicit``
     - ``RegState::Implicit``
     - Not emitted register (e.g. carry, or temporary result).

   * - ``implicit-def``
     - ``RegState::ImplicitDefine``
     - ``implicit`` and ``def``

   * - ``def``
     - ``RegState::Define``
     - Register definition.

   * - ``dead``
     - ``RegState::Dead``
     - Unused definition.

   * - ``killed``
     - ``RegState::Kill``
     - The last use of a register.

   * - ``undef``
     - ``RegState::Undef``
     - Value of the register doesn't matter.

   * - ``internal``
     - ``RegState::InternalRead``
     - Register reads a value that is defined inside the same instruction or bundle.

   * - ``early-clobber``
     - ``RegState::EarlyClobber``
     - Register definition happens before uses.

   * - ``debug-use``
     - ``RegState::Debug``
     - Register 'use' is for debugging purpose.

   * - ``renamable``
     - ``RegState::Renamable``
     - Register that may be renamed.

.. _subregister-indices:

Subregister Indices
~~~~~~~~~~~~~~~~~~~

The register machine operands can reference a portion of a register by using
the subregister indices. The example below shows an instance of the ``COPY``
pseudo instruction that uses the X86 ``sub_8bit`` subregister index to copy the
lower 8 bits from the 32-bit virtual register 0 to the 8-bit virtual register 1:

.. code-block:: text

    %1 = COPY %0:sub_8bit

The names of the subregister indices are target specific, and are typically
defined in the target's ``*RegisterInfo.td`` file.

Constant Pool Indices
^^^^^^^^^^^^^^^^^^^^^

A constant pool index (CPI) operand is printed using its index in the
function's ``MachineConstantPool`` and an offset.

For example, a CPI with the index 1 and offset 8:

.. code-block:: text

    %1:gr64 = MOV64ri %const.1 + 8

For a CPI with the index 0 and offset -12:

.. code-block:: text

    %1:gr64 = MOV64ri %const.0 - 12

A constant pool entry is bound to an LLVM IR ``Constant`` or a target-specific
``MachineConstantPoolValue``. When serializing all the function's constants the
following format is used:

.. code-block:: text

    constants:
      - id:               <index>
        value:            <value>
        alignment:        <alignment>
        isTargetSpecific: <target-specific>

where:
  - ``<index>`` is a 32-bit unsigned integer;
  - ``<value>`` is an `LLVM IR Constant
    <https://www.llvm.org/docs/LangRef.html#constants>`_;
  - ``<alignment>`` is a 32-bit unsigned integer specified in bytes, and must be
    a power of two;
  - ``<target-specific>`` is either true or false.

Example:

.. code-block:: text

    constants:
      - id:               0
        value:            'double 3.250000e+00'
        alignment:        8
      - id:               1
        value:            'g-(LPC0+8)'
        alignment:        4
        isTargetSpecific: true

Global Value Operands
^^^^^^^^^^^^^^^^^^^^^

The global value machine operands reference the global values from the
:ref:`embedded LLVM IR module <embedded-module>`.
The example below shows an instance of the X86 ``MOV64rm`` instruction that has
a global value operand named ``G``:

.. code-block:: text

    $rax = MOV64rm $rip, 1, _, @G, _

The named global values are represented using an identifier with the '@' prefix.
If the identifier doesn't match the regular expression
`[-a-zA-Z$._][-a-zA-Z$._0-9]*`, then this identifier must be quoted.
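
The quoting rule above can be checked mechanically. The following is a minimal
Python sketch, not code taken from LLVM: ``needs_quotes`` is a hypothetical
helper name, and the only thing it relies on is the regular expression quoted
in the paragraph above, applied to the whole identifier.

.. code-block:: python

    import re

    # Pattern copied from the text above: names that may be printed unquoted.
    UNQUOTED_NAME = re.compile(r"[-a-zA-Z$._][-a-zA-Z$._0-9]*")


    def needs_quotes(name: str) -> bool:
        """Return True if a named global value must be quoted.

        Hypothetical helper for illustration only. The whole name has to
        match the pattern, so ``fullmatch`` is used rather than ``match``.
        """
        return UNQUOTED_NAME.fullmatch(name) is None


    if __name__ == "__main__":
        for candidate in ["G", "_ZN3fooEv", "my var", "0abc"]:
            print(candidate, needs_quotes(candidate))

Under this sketch, ``G`` and ``_ZN3fooEv`` can be written unquoted, while a
name containing a space or starting with a digit has to be quoted.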

The unnamed global values are represented using an unsigned numeric value with
the '@' prefix, like in the following examples: ``@0``, ``@989``.

Target-dependent Index Operands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A target index operand is a target-specific index and an offset. The
target-specific index is printed using target-specific names and a positive or
negative offset.

For example, the ``amdgpu-constdata-start`` is associated with the index ``0``
in the AMDGPU backend. So if we have a target index operand with the index 0
and the offset 8:

.. code-block:: text

    $sgpr2 = S_ADD_U32 _, target-index(amdgpu-constdata-start) + 8, implicit-def _, implicit-def _

Jump-table Index Operands
^^^^^^^^^^^^^^^^^^^^^^^^^

A jump-table index operand with the index 0 is printed as follows:

.. code-block:: text

    tBR_JTr killed $r0, %jump-table.0

A machine jump-table entry contains a list of ``MachineBasicBlocks``. When
serializing all the function's jump-table entries, the following format is used:

.. code-block:: text

    jumpTable:
      kind:             <kind>
      entries:
        - id:             <index>
          blocks:         [ <bbreference>, <bbreference>, ... ]

where ``<kind>`` describes how the jump table is represented and emitted (plain
address, relocations, PIC, etc.), each ``<index>`` is a 32-bit unsigned integer,
and ``blocks`` contains a list of
:ref:`machine basic block references <block-references>`.

Example:

.. code-block:: text

    jumpTable:
      kind:             inline
      entries:
        - id:             0
          blocks:         [ '%bb.3', '%bb.9', '%bb.4.d3' ]
        - id:             1
          blocks:         [ '%bb.7', '%bb.7', '%bb.4.d3', '%bb.5' ]

External Symbol Operands
^^^^^^^^^^^^^^^^^^^^^^^^^

An external symbol operand is represented using an identifier with the ``&``
prefix. The identifier is surrounded with ""'s and escaped if it has any
special non-printable characters in it.

Example:

.. code-block:: text

    CALL64pcrel32 &__stack_chk_fail, csr_64, implicit $rsp, implicit-def $rsp

MCSymbol Operands
^^^^^^^^^^^^^^^^^

An MCSymbol operand holds a pointer to an ``MCSymbol``. For the limitations
of this operand in MIR, see :ref:`limitations <limitations>`.

The syntax is:

.. code-block:: text

    EH_LABEL <mcsymbol Ltmp1>

Debug Instruction Reference Operands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A debug instruction reference operand is a pair of indices, referring to an
instruction and an operand within that instruction respectively; see
:ref:`Instruction referencing locations <instruction-referencing-locations>`.

The example below uses a reference to Instruction 1, Operand 0:

.. code-block:: text

    DBG_INSTR_REF !123, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(1, 0), debug-location !456

CFIIndex Operands
^^^^^^^^^^^^^^^^^

A CFI Index operand holds an index into a per-function side-table,
``MachineFunction::getFrameInstructions()``, which references all the frame
instructions in a ``MachineFunction``. A ``CFI_INSTRUCTION`` may look like it
contains multiple operands, but the only operand it contains is the CFI Index.
The other operands are tracked by the ``MCCFIInstruction`` object.

The syntax is:

.. code-block:: text

    CFI_INSTRUCTION offset $w30, -16

which may be emitted later in the MC layer as:

.. code-block:: text

    .cfi_offset w30, -16

IntrinsicID Operands
^^^^^^^^^^^^^^^^^^^^

An Intrinsic ID operand contains a generic intrinsic ID or a target-specific ID.

The syntax for the ``returnaddress`` intrinsic is:

.. code-block:: text

   $x0 = COPY intrinsic(@llvm.returnaddress)

Predicate Operands
^^^^^^^^^^^^^^^^^^

A Predicate operand contains an IR predicate from ``CmpInst::Predicate``, like
``ICMP_EQ``, etc.

For an int eq predicate ``ICMP_EQ``, the syntax is:

.. code-block:: text

   %2:gpr(s32) = G_ICMP intpred(eq), %0, %1

.. TODO: Describe the parser's default behaviour when optional YAML attributes
   are missing.
.. TODO: Describe the syntax for virtual register YAML definitions.
.. TODO: Describe the machine function's YAML flag attributes.
.. TODO: Describe the syntax for the register mask machine operands.
.. TODO: Describe the frame information YAML mapping.
.. TODO: Describe the syntax of the stack object machine operands and their
   YAML definitions.
.. TODO: Describe the syntax of the block address machine operands.
.. TODO: Describe the syntax of the metadata machine operands, and the
   instructions debug location attribute.
.. TODO: Describe the syntax of the register live out machine operands.
.. TODO: Describe the syntax of the machine memory operands.

Comments
^^^^^^^^

Machine operands can have C/C++ style comments, which are annotations enclosed
between ``/*`` and ``*/`` to improve readability of e.g. immediate operands.
In the example below, ARM instructions EOR and BCC and immediate operands
``14`` and ``0`` have been annotated with their condition codes (CC)
definitions, i.e. the ``always`` and ``eq`` condition codes:

.. code-block:: text

  dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14 /* CC::always */, $noreg
  t2Bcc %bb.4, 0 /* CC:eq */, killed $cpsr

As these annotations are comments, they are ignored by the MI parser.
Comments can be added or customized by overriding InstrInfo's hook
``createMIROperandComment()``.

Debug-Info constructs
---------------------

Most of the debugging information in a MIR file is to be found in the metadata
of the embedded module. Within a machine function, that metadata is referred to
by various constructs to describe source locations and variable locations.

Source locations
^^^^^^^^^^^^^^^^

Every MIR instruction may optionally have a trailing reference to a
``DILocation`` metadata node, after all operands and symbols, but before
memory operands:

.. code-block:: text

   $rbp = MOV64rr $rdi, debug-location !12

The source location attachment is synonymous with the ``!dbg`` metadata
attachment in LLVM-IR. The absence of a source location attachment will be
represented by an empty ``DebugLoc`` object in the machine instruction.

Fixed variable locations
^^^^^^^^^^^^^^^^^^^^^^^^

There are several ways of specifying variable locations. The simplest is
describing a variable that is permanently located on the stack. In the stack
or fixedStack attribute of the machine function, the variable, scope, and
any qualifying location modifier are provided:

.. code-block:: text

    - { id: 0, name: offset.addr, offset: -24, size: 8, alignment: 8, stack-id: default,
        debug-info-variable: '!1', debug-info-expression: '!DIExpression()',
        debug-info-location: '!2' }

Where:

- ``debug-info-variable`` identifies a DILocalVariable metadata node,

- ``debug-info-expression`` adds qualifiers to the variable location,

- ``debug-info-location`` identifies a DILocation metadata node.

These metadata attributes correspond to the operands of a ``#dbg_declare``
IR debug record, see the :ref:`source level
debugging<debug_records>` documentation.

Varying variable locations
^^^^^^^^^^^^^^^^^^^^^^^^^^

Variables that are not always on the stack or change location are specified
with the ``DBG_VALUE`` meta machine instruction. It is synonymous with the
``#dbg_value`` IR record, and is written:

.. code-block:: text

    DBG_VALUE $rax, $noreg, !123, !DIExpression(), debug-location !456

Its operands, respectively:

1. Identify a machine location such as a register, immediate, or frame index,

2. Are either ``$noreg``, or immediate value zero if an extra level of
   indirection is to be added to the first operand,

3. Identify a ``DILocalVariable`` metadata node,

4. Specify an expression qualifying the variable location, either inline or as
   a metadata node reference.

The source location identifies the ``DILocation`` for the scope of the
variable. The second operand (``IsIndirect``) is deprecated and to be deleted.
All additional qualifiers for the variable location should be made through the
expression metadata.

.. _instruction-referencing-locations:

Instruction referencing locations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This experimental feature aims to separate the specification of variable
*values* from the program point where a variable takes on that value. Changes
in variable value occur in the same manner as ``DBG_VALUE`` meta instructions
but using ``DBG_INSTR_REF``. Variable values are identified by a pair of
instruction number and operand number. Consider the example below:

.. code-block:: text

    $rbp = MOV64ri 0, debug-instr-number 1, debug-location !12
    DBG_INSTR_REF !123, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(1, 0), debug-location !456

Instruction numbers are directly attached to machine instructions with an
optional ``debug-instr-number`` attachment, before the optional
``debug-location`` attachment. The value defined in ``$rbp`` in the code
above would be identified by the pair ``<1, 0>``.

The 3rd operand of the ``DBG_INSTR_REF`` above records the instruction
and operand number ``<1, 0>``, identifying the value defined by the ``MOV64ri``.
The first two operands to ``DBG_INSTR_REF`` are identical to ``DBG_VALUE_LIST``,
and the ``DBG_INSTR_REF``'s position records where the variable takes on the
designated value in the same way.
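
To make the pairing concrete, here is a minimal Python sketch that matches each
``dbg-instr-ref(<inst>, <op>)`` operand with the instruction carrying the
corresponding ``debug-instr-number``. It is only a text-level illustration
under the assumption that each instruction sits on one line; it is not how
LLVM's MIR parser works, and ``resolve_instr_refs`` is a hypothetical name.

.. code-block:: python

    import re

    # Text-level patterns for the two attachments described above.
    NUMBERED = re.compile(r"debug-instr-number (\d+)")
    REFERENCE = re.compile(r"dbg-instr-ref\((\d+), (\d+)\)")


    def resolve_instr_refs(lines):
        """Pair every dbg-instr-ref(inst, op) with the numbered instruction.

        Returns a list of ((inst, op), defining_line) tuples; defining_line
        is None when no instruction carries that debug-instr-number.
        """
        numbered = {}
        for line in lines:
            match = NUMBERED.search(line)
            if match:
                numbered[int(match.group(1))] = line.strip()

        resolved = []
        for line in lines:
            for match in REFERENCE.finditer(line):
                inst, op = int(match.group(1)), int(match.group(2))
                resolved.append(((inst, op), numbered.get(inst)))
        return resolved


    if __name__ == "__main__":
        mir = [
            "$rbp = MOV64ri 0, debug-instr-number 1, debug-location !12",
            "DBG_INSTR_REF !123, !DIExpression(DW_OP_LLVM_arg, 0), "
            "dbg-instr-ref(1, 0), debug-location !456",
        ]
        for pair, definition in resolve_instr_refs(mir):
            print(pair, "->", definition)

Run on the two instructions from the example, the sketch associates the pair
``(1, 0)`` with the ``MOV64ri`` line.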

More information about how these constructs are used is available in
:doc:`InstrRefDebugInfo`. The related documents :doc:`SourceLevelDebugging` and
:doc:`HowToUpdateDebugInfo` may be useful as well.