=======================
Writing an LLVM Backend
=======================

.. toctree::
   :hidden:

   HowToUseInstrMappings

.. contents::
   :local:

Introduction
============

This document describes techniques for writing compiler backends that convert
the LLVM Intermediate Representation (IR) to code for a specified machine or
other languages. Code intended for a specific machine can take the form of
either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may
create output for several types of target CPUs --- including X86, PowerPC,
ARM, and SPARC. The backend may also be used to generate code targeted at SPUs
of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of
``llvm/lib/Target`` in a downloaded LLVM release. In particular, this document
focuses on the example of creating a static compiler (one that emits text
assembly) for a SPARC target, because SPARC has fairly standard
characteristics, such as a RISC instruction set and straightforward calling
conventions.

Audience
--------

The audience for this document is anyone who needs to write an LLVM backend to
generate code for a specific hardware or software target.

Prerequisite Reading
--------------------

These essential documents must be read before reading this document:

* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
  the LLVM assembly language.

* :doc:`CodeGenerator` --- a guide to the components (classes and code
  generation algorithms) for translating the LLVM internal representation into
  machine code for a specified target.
  Pay particular attention to the
  descriptions of code generation stages: Instruction Selection, Scheduling and
  Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
  Insertion, Late Machine Code Optimizations, and Code Emission.

* :doc:`TableGen/index` --- a document that describes the TableGen
  (``tblgen``) application that manages domain-specific information to support
  LLVM code generation. TableGen processes input from a target description
  file (``.td`` suffix) and generates C++ code that can be used for code
  generation.

* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
  are several ``SelectionDAG`` processing steps.

To follow the SPARC examples in this document, have a copy of `The SPARC
Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
reference. For details about the ARM instruction set, refer to the `ARM
Architecture Reference Manual <http://infocenter.arm.com/>`_. For more about
the GNU Assembler format (``GAS``), see `Using As
<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
assembly printer. "Using As" contains a list of target machine dependent
features.

Basic Steps
-----------

To write a compiler backend for LLVM that converts the LLVM IR to code for a
specified target (machine or other language), follow these steps:

* Create a subclass of the ``TargetMachine`` class that describes
  characteristics of your target machine. Copy existing examples of specific
  ``TargetMachine`` class and header files; for example, start with
  ``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
  names for your target. Similarly, change code that references "``Sparc``" to
  reference your target.

* Describe the register set of the target.
  Use TableGen to generate code for
  register definition, register aliases, and register classes from a
  target-specific ``RegisterInfo.td`` input file. You should also write
  additional code for a subclass of the ``TargetRegisterInfo`` class that
  represents the class register file data used for register allocation and also
  describes the interactions between registers.

* Describe the instruction set of the target. Use TableGen to generate code
  for target-specific instructions from target-specific versions of
  ``TargetInstrFormats.td`` and ``TargetInstrInfo.td``. You should write
  additional code for a subclass of the ``TargetInstrInfo`` class to represent
  machine instructions supported by the target machine.

* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
  Graph (DAG) representation of instructions to native target-specific
  instructions. Use TableGen to generate code that matches patterns and
  selects instructions based on additional information in a target-specific
  version of ``TargetInstrInfo.td``. Write code for ``XXXISelDAGToDAG.cpp``,
  where ``XXX`` identifies the specific target, to perform pattern matching and
  DAG-to-DAG instruction selection. Also write code in ``XXXISelLowering.cpp``
  to replace or remove operations and data types that are not supported
  natively in a SelectionDAG.

* Write code for an assembly printer that converts LLVM IR to a GAS format for
  your target machine. You should add assembly strings to the instructions
  defined in your target-specific version of ``TargetInstrInfo.td``. You
  should also write code for a subclass of ``AsmPrinter`` that performs the
  LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.

* Optionally, add support for subtargets (i.e., variants with different
  capabilities).
  You should also write code for a subclass of the
  ``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
  ``-mattr=`` command-line options.

* Optionally, add JIT support and create a machine code emitter (subclass of
  ``TargetJITInfo``) that is used to emit binary code directly into memory.

In the ``.cpp`` and ``.h`` files, initially stub up these methods and then
implement them later. Initially, you may not know which private members the
class will need and which components will need to be subclassed.

Preliminaries
-------------

To actually create your compiler backend, you need to create and modify a few
files. The absolute minimum is discussed here. But to actually use the LLVM
target-independent code generator, you must perform the steps described in the
:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.

First, you should create a subdirectory under ``lib/Target`` to hold all the
files related to your target. If your target is called "Dummy", create the
directory ``lib/Target/Dummy``.

In this new directory, create a ``CMakeLists.txt``. It is easiest to copy a
``CMakeLists.txt`` of another target and modify it. It should at least contain
the ``LLVM_TARGET_DEFINITIONS`` variable. The library can be named ``LLVMDummy``
(for example, see the MIPS target). Alternatively, you can split the library
into ``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which
should be implemented in a subdirectory below ``lib/Target/Dummy`` (for example,
see the PowerPC target).

Note that these two naming schemes are hardcoded into ``llvm-config``. Using
any other naming scheme will confuse ``llvm-config`` and produce a lot of
(seemingly unrelated) linker errors when linking ``llc``.

To make your target actually do something, you need to implement a subclass of
``TargetMachine``.
This implementation should typically be in the file
``lib/Target/Dummy/DummyTargetMachine.cpp``, but any file in the
``lib/Target/Dummy`` directory will be built and should work. To use LLVM's
target-independent code generator, you should do what all current machine
backends do: create a subclass of ``CodeGenTargetMachineImpl``. (To create a
target from scratch, create a subclass of ``TargetMachine``.)

To get LLVM to actually build and link your target, you need to run ``cmake``
with ``-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=Dummy``. This will build your
target without needing to add it to the list of all the targets.

Once your target is stable, you can add it to the ``LLVM_ALL_TARGETS`` variable
located in the main ``CMakeLists.txt``.

Target Machine
==============

``CodeGenTargetMachineImpl`` is designed as a base class for targets implemented with
the LLVM target-independent code generator. The ``CodeGenTargetMachineImpl`` class
should be specialized by a concrete target class that implements the various
virtual methods. ``CodeGenTargetMachineImpl`` is defined as a subclass of
``TargetMachine`` in ``include/llvm/CodeGen/CodeGenTargetMachineImpl.h``. The
``TargetMachine`` class implementation (``lib/Target/TargetMachine.cpp``)
also processes numerous command-line options.

To create a concrete target-specific subclass of ``CodeGenTargetMachineImpl``, start
by copying an existing ``TargetMachine`` class and header. You should name the
files that you create to reflect your specific target. For instance, for the
SPARC target, name the files ``SparcTargetMachine.h`` and
``SparcTargetMachine.cpp``.

For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
have access methods to obtain objects that represent target components.
These methods are named ``get*Info``, and are intended to obtain the
instruction set (``getInstrInfo``), register set (``getRegisterInfo``), stack
frame layout (``getFrameInfo``), and similar information. ``XXXTargetMachine``
must also implement the ``getDataLayout`` method to access an object with
target-specific data characteristics, such as data type size and alignment
requirements.

For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
simply return a class member.

.. code-block:: c++

  namespace llvm {

  class Module;

  class SparcTargetMachine : public CodeGenTargetMachineImpl {
    const DataLayout DataLayout;       // Calculates type size & alignment
    SparcSubtarget Subtarget;
    SparcInstrInfo InstrInfo;
    TargetFrameInfo FrameInfo;

  protected:
    virtual const TargetAsmInfo *createTargetAsmInfo() const;

  public:
    SparcTargetMachine(const Module &M, const std::string &FS);

    virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
    virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
    virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
    virtual const TargetRegisterInfo *getRegisterInfo() const {
      return &InstrInfo.getRegisterInfo();
    }
    virtual const DataLayout *getDataLayout() const { return &DataLayout; }

    // Pass Pipeline Configuration
    virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
    virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
  };

  } // end namespace llvm

* ``getInstrInfo()``
* ``getRegisterInfo()``
* ``getFrameInfo()``
* ``getDataLayout()``
* ``getSubtargetImpl()``

For some targets, you also need to support the following methods:

* ``getTargetLowering()``
* ``getJITInfo()``

Some architectures, such as
GPUs, do not support jumping to an arbitrary
program location and implement branching using masked execution and loops using
special instructions around the loop body. In order to avoid CFG modifications
that introduce irreducible control flow not handled by such hardware, a target
must call ``setRequiresStructuredCFG(true)`` when being initialized.

In addition, the ``XXXTargetMachine`` constructor should specify a
``TargetDescription`` string that determines the data layout for the target
machine, including characteristics such as pointer size, alignment, and
endianness. For example, the constructor for ``SparcTargetMachine`` contains
the following:

.. code-block:: c++

  SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
    : DataLayout("E-p:32:32-f128:128:128"),
      Subtarget(M, FS), InstrInfo(Subtarget),
      FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
  }

Hyphens separate portions of the ``TargetDescription`` string.

* An upper-case "``E``" in the string indicates a big-endian target data model.
  A lower-case "``e``" indicates little-endian.

* "``p:``" is followed by pointer information: size, ABI alignment, and
  preferred alignment. If only two figures follow "``p:``", then the first
  value is pointer size, and the second value is both ABI and preferred
  alignment.

* Then a letter for numeric type alignment: "``i``", "``f``", "``v``", or
  "``a``" (corresponding to integer, floating point, vector, or aggregate).
  "``i``", "``v``", or "``a``" are followed by ABI alignment and preferred
  alignment. "``f``" is followed by three values: the first indicates the size
  of a long double, then ABI alignment, and then preferred alignment.
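
As a rough illustration of how such a string breaks down, the following
self-contained sketch splits a layout string on hyphens and decodes only the
endianness and pointer fields. The ``LayoutSummary`` and ``summarizeLayout``
names are hypothetical and are not part of LLVM; a real target should rely on
LLVM's own ``DataLayout`` class to interpret the string.

.. code-block:: c++

  #include <cassert>
  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical helper types for illustration only; not LLVM classes.
  struct PointerSpec {
    unsigned SizeInBits = 0;
    unsigned ABIAlignInBits = 0;
    unsigned PrefAlignInBits = 0;
  };

  struct LayoutSummary {
    bool BigEndian = false;
    PointerSpec Pointer;
  };

  // Split Text on the separator character Sep.
  static std::vector<std::string> splitOn(const std::string &Text, char Sep) {
    std::vector<std::string> Parts;
    std::stringstream Stream(Text);
    std::string Part;
    while (std::getline(Stream, Part, Sep))
      Parts.push_back(Part);
    return Parts;
  }

  // Decode only the "E"/"e" and "p:" fields; every other field is skipped.
  // Assumes the numeric fields are well formed (std::stoul throws otherwise).
  LayoutSummary summarizeLayout(const std::string &Description) {
    LayoutSummary Summary;
    for (const std::string &Field : splitOn(Description, '-')) {
      if (Field == "E") {
        Summary.BigEndian = true;
      } else if (Field == "e") {
        Summary.BigEndian = false;
      } else if (Field.rfind("p:", 0) == 0) {
        std::vector<std::string> Values = splitOn(Field.substr(2), ':');
        if (Values.size() < 2)
          continue; // too few values; ignored in this sketch
        Summary.Pointer.SizeInBits = std::stoul(Values[0]);
        Summary.Pointer.ABIAlignInBits = std::stoul(Values[1]);
        // With only two values, the second is also the preferred alignment.
        Summary.Pointer.PrefAlignInBits =
            Values.size() > 2 ? std::stoul(Values[2])
                              : Summary.Pointer.ABIAlignInBits;
      }
    }
    return Summary;
  }

For ``"E-p:32:32-f128:128:128"``, this sketch reports a big-endian target with
32-bit pointers whose ABI and preferred alignments are both 32 bits.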

Target Registration
===================

You must also register your target with the ``TargetRegistry``, which is what
other LLVM tools use to look up and use your target at runtime. The
``TargetRegistry`` can be used directly, but for most targets there are helper
templates which should take care of the work for you.

All targets should declare a global ``Target`` object which is used to
represent the target during registration. Then, in the target's ``TargetInfo``
library, the target should define that object and use the ``RegisterTarget``
template to register the target. For example, the Sparc registration code
looks like this:

.. code-block:: c++

  // Accessor that owns the single global Target object for Sparc.
  Target &llvm::getTheSparcTarget() {
    static Target TheSparcTarget;
    return TheSparcTarget;
  }

  extern "C" void LLVMInitializeSparcTargetInfo() {
    RegisterTarget<Triple::sparc, /*HasJIT=*/false>
      X(getTheSparcTarget(), "sparc", "Sparc");
  }

This allows the ``TargetRegistry`` to look up the target by name or by target
triple. In addition, most targets will also register additional features which
are available in separate libraries. These registration steps are separate,
because some clients may wish to only link in some parts of the target --- the
JIT code generator does not require the use of the assembly printer, for
example. Here is an example of registering the Sparc assembly printer:

.. code-block:: c++

  extern "C" void LLVMInitializeSparcAsmPrinter() {
    RegisterAsmPrinter<SparcAsmPrinter> X(getTheSparcTarget());
  }

For more information, see "`llvm/Target/TargetRegistry.h
</doxygen/TargetRegistry_8h-source.html>`_".

Register Set and Register Classes
=================================

You should describe a concrete target-specific class that represents the
register file of a target machine.
This class is called ``XXXRegisterInfo``
(where ``XXX`` identifies the target) and represents the class register file
data that is used for register allocation. It also describes the interactions
between registers.

You also need to define register classes to categorize related registers. A
register class should be added for groups of registers that are all treated the
same way for some instruction. Typical examples are register classes for
integer, floating-point, or vector registers. A register allocator allows an
instruction to use any register in a specified register class to perform the
instruction in a similar manner. Register classes allocate virtual registers
to instructions from these sets, and register classes let the
target-independent register allocator automatically choose the actual
registers.

Much of the code for registers, including register definition, register
aliases, and register classes, is generated by TableGen from
``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
and ``XXXGenRegisterInfo.inc`` output files. Some of the code in the
implementation of ``XXXRegisterInfo`` requires hand-coding.

Defining a Register
-------------------

The ``XXXRegisterInfo.td`` file typically starts with register definitions for
a target machine. The ``Register`` class (specified in ``Target.td``) is used
to define an object for each register. The specified string ``n`` becomes the
``Name`` of the register. The basic ``Register`` object does not have any
subregisters and does not specify any aliases.

.. code-block:: text

  class Register<string n> {
    string Namespace = "";
    string AsmName = n;
    string Name = n;
    int SpillSize = 0;
    int SpillAlignment = 0;
    list<Register> Aliases = [];
    list<Register> SubRegs = [];
    list<int> DwarfNumbers = [];
  }

For example, in the ``X86RegisterInfo.td`` file, there are register definitions
that utilize the ``Register`` class, such as:

.. code-block:: text

  def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
register. For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
representing 3 different modes: the first element is for X86-64, the second for
exception handling (EH) on X86-32, and the third is generic. -1 is a special
Dwarf number that indicates the gcc number is undefined, and -2 indicates the
register number is invalid for this mode.

From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
generates this code in the ``X86GenRegisterInfo.inc`` file:

.. code-block:: c++

  static const unsigned GR8[] = { X86::AL, ... };

  const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

  const TargetRegisterDesc RegisterDescriptors[] = {
    ...
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

From the register info file, TableGen generates a ``TargetRegisterDesc`` object
for each register. ``TargetRegisterDesc`` is defined in
``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:

.. code-block:: c++

  struct TargetRegisterDesc {
    const char     *AsmName;      // Assembly language name for the register
    const char     *Name;         // Printable name for the reg (for debugging)
    const unsigned *AliasSet;     // Register Alias Set
    const unsigned *SubRegs;      // Sub-register set
    const unsigned *ImmSubRegs;   // Immediate sub-register set
    const unsigned *SuperRegs;    // Super-register set
  };

TableGen uses the entire target description file (``.td``) to determine text
names for the register (in the ``AsmName`` and ``Name`` fields of
``TargetRegisterDesc``) and the relationships of other registers to the defined
register (in the other ``TargetRegisterDesc`` fields). In this example, other
definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
aliases for one another, so TableGen generates a null-terminated array
(``AL_AliasSet``) for this register alias set.

The ``Register`` class is commonly used as a base class for more complex
classes. In ``Target.td``, the ``Register`` class is the base for the
``RegisterWithSubRegs`` class that is used to define registers that need to
specify subregisters in the ``SubRegs`` list, as shown here:

.. code-block:: text

  class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
    let SubRegs = subregs;
  }

In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
and ``Rd``. SPARC registers are identified by 5-bit ID numbers, which is a
feature common to these subclasses. Note the use of "``let``" expressions to
override values that are initially defined in a superclass (such as ``SubRegs``
field in the ``Rd`` class).

.. code-block:: text

  class SparcReg<string n> : Register<n> {
    field bits<5> Num;
    let Namespace = "SP";
  }
  // Ri - 32-bit integer registers
  class Ri<bits<5> num, string n> : SparcReg<n> {
    let Num = num;
  }
  // Rf - 32-bit floating-point registers
  class Rf<bits<5> num, string n> : SparcReg<n> {
    let Num = num;
  }
  // Rd - Slots in the FP register file for 64-bit floating-point values.
  class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
    let Num = num;
    let SubRegs = subregs;
  }

In the ``SparcRegisterInfo.td`` file, there are register definitions that
utilize these subclasses of ``Register``, such as:

.. code-block:: text

  def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
  def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
  ...
  def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
  def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
  ...
  def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
  def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

The last two registers shown above (``D0`` and ``D1``) are double-precision
floating-point registers that are aliases for pairs of single-precision
floating-point sub-registers. In addition to aliases, the sub-register and
super-register relationships of the defined register are in fields of a
register's ``TargetRegisterDesc``.

Defining a Register Class
-------------------------

The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
object that represents a group of related registers and also defines the
default allocation order of the registers. A target description file
``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
using the following class:

.. code-block:: text

  class RegisterClass<string namespace,
  list<ValueType> regTypes, int alignment, dag regList> {
    string Namespace = namespace;
    list<ValueType> RegTypes = regTypes;
    int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
    int Alignment = alignment;

    // CopyCost is the cost of copying a value between two registers
    // default value 1 means a single instruction
    // A negative value means copying is extremely expensive or impossible
    int CopyCost = 1;
    dag MemberList = regList;

    // for register classes that are subregisters of this class
    list<RegisterClass> SubRegClassList = [];

    code MethodProtos = [{}];  // to insert arbitrary code
    code MethodBodies = [{}];
  }

To define a ``RegisterClass``, use the following 4 arguments:

* The first argument of the definition is the name of the namespace.

* The second argument is a list of ``ValueType`` register type values that are
  defined in ``include/llvm/CodeGen/ValueTypes.td``. Defined values include
  integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
  floating-point types (``f32``, ``f64``), and vector types (for example,
  ``v8i16`` for an ``8 x i16`` vector). All registers in a ``RegisterClass``
  must have the same ``ValueType``, but some registers may store vector data in
  different configurations. For example a register that can process a 128-bit
  vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
  32-bit integers, and so on.

* The third argument of the ``RegisterClass`` definition specifies the
  alignment required of the registers when they are stored or loaded to
  memory.

* The final argument, ``regList``, specifies which registers are in this class.
  If an alternative allocation order method is not specified, then ``regList``
  also defines the order of allocation used by the register allocator.
  Besides
  simply listing registers with ``(add R0, R1, ...)``, more advanced set
  operators are available. See ``include/llvm/Target/Target.td`` for more
  information.

In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
``FPRegs``, ``DFPRegs``, and ``IntRegs``. For all three register classes, the
first argument defines the namespace with the string "``SP``". ``FPRegs``
defines a group of 32 single-precision floating-point registers (``F0`` to
``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
(``D0-D15``).

.. code-block:: text

  // F0, F1, F2, ..., F31
  def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;

  def DFPRegs : RegisterClass<"SP", [f64], 64,
                              (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
                                   D9, D10, D11, D12, D13, D14, D15)>;

  def IntRegs : RegisterClass<"SP", [i32], 32,
      (add L0, L1, L2, L3, L4, L5, L6, L7,
           I0, I1, I2, I3, I4, I5,
           O0, O1, O2, O3, O4, O5, O7,
           G1,
           // Non-allocatable regs:
           G2, G3, G4,
           O6,        // stack ptr
           I6,        // frame ptr
           I7,        // return address
           G0,        // constant zero
           G5, G6, G7 // reserved for kernel
      )>;

Using ``SparcRegisterInfo.td`` with TableGen generates several output files
that are intended for inclusion in other source code that you write.
``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
be included in the header file for the SPARC register implementation that you
write (``SparcRegisterInfo.h``). In ``SparcGenRegisterInfo.h.inc`` a new
structure is defined called ``SparcGenRegisterInfo`` that uses
``TargetRegisterInfo`` as its base. It also specifies types, based upon the
defined register classes: ``DFPRegsClass``, ``FPRegsClass``, and
``IntRegsClass``.

``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
implementation. The code below shows only the generated integer registers and
associated register classes. The order of registers in ``IntRegs`` reflects
the order in the definition of ``IntRegs`` in the target description file.

.. code-block:: c++

  // IntRegs Register Class...
  static const unsigned IntRegs[] = {
    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
    SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
    SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
    SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
    SP::G6, SP::G7,
  };

  // IntRegsVTs Register Class Value Types...
  static const MVT::ValueType IntRegsVTs[] = {
    MVT::i32, MVT::Other
  };

  namespace SP {   // Register class instances
    DFPRegsClass    DFPRegsRegClass;
    FPRegsClass     FPRegsRegClass;
    IntRegsClass    IntRegsRegClass;
  ...
    // IntRegs Sub-register Classes...
    static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Super-register Classes..
    static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class sub-classes...
    static const TargetRegisterClass* const IntRegsSubclasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class super-classes...
    static const TargetRegisterClass* const IntRegsSuperclasses [] = {
      NULL
    };

    IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
      IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
      IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
  }

The register allocators will avoid using reserved registers, and callee-saved
registers are not used until all the volatile registers have been used. That
is usually good enough, but in some cases it may be necessary to provide custom
allocation orders.

Implement a subclass of ``TargetRegisterInfo``
----------------------------------------------

The final step is to hand code portions of ``XXXRegisterInfo``, which
implements the interface described in ``TargetRegisterInfo.h`` (see
:ref:`TargetRegisterInfo`). These functions return ``0``, ``NULL``, or
``false``, unless overridden. Here is a list of functions that are overridden
for the SPARC implementation in ``SparcRegisterInfo.cpp``:

* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
  order of the desired callee-save stack frame offset.

* ``getReservedRegs`` --- Returns a bitset indexed by physical register
  numbers, indicating if a particular register is unavailable.

* ``hasFP`` --- Returns a Boolean indicating if a function should have a
  dedicated frame pointer register.

* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
  instructions are used, this can be called to eliminate them.

* ``eliminateFrameIndex`` --- Eliminates abstract frame indices from
  instructions that may use them.

* ``emitPrologue`` --- Inserts prologue code into the function.

* ``emitEpilogue`` --- Inserts epilogue code into the function.

.. _instruction-set:

Instruction Set
===============

During the early stages of code generation, the LLVM IR code is converted to a
``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
containing target instructions. An ``SDNode`` has an opcode, operands, type
requirements, and operation properties. Operation properties record, for
example, whether an operation is commutative or whether it loads from memory.
The various operation node types are described in the
``include/llvm/CodeGen/SelectionDAGNodes.h`` file (values of the ``NodeType``
enum in the ``ISD`` namespace).

TableGen uses the following target description (``.td``) input files to
generate much of the code for instruction definition:

* ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
  other fundamental classes are defined.

* ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
  generators, contains ``SDTC*`` classes (selection DAG type constraint),
  definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
  ``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
  ``PatFrag``, ``PatLeaf``, ``ComplexPattern``).

* ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
  instructions.

* ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
  condition codes, and instructions of an instruction set. For architecture
  modifications, a different file name may be used. For example, for Pentium
  with SSE instructions, this file is ``X86InstrSSE.td``, and for Pentium with
  MMX, this file is ``X86InstrMMX.td``.

There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
the target. The ``XXX.td`` file includes the other ``.td`` input files, but
its contents are only directly important for subtargets.

You should describe a concrete target-specific class ``XXXInstrInfo`` that
represents machine instructions supported by a target machine.
``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
which describes one instruction. An instruction descriptor defines:

* Opcode mnemonic
* Number of operands
* List of implicit register definitions and uses
* Target-independent properties (such as memory access, is commutable)
* Target-specific flags

The ``Instruction`` class (defined in ``Target.td``) is mostly used as a base
for more complex instruction classes.

.. code-block:: text

  class Instruction {
    string Namespace = "";
    dag OutOperandList;    // A dag containing the MI def operand list.
    dag InOperandList;     // A dag containing the MI use operand list.
    string AsmString = ""; // The .s format to print the instruction with.
    list<dag> Pattern;     // Set to the DAG pattern for this instruction.
    list<Register> Uses = [];
    list<Register> Defs = [];
    list<Predicate> Predicates = [];  // predicates turned into isel match code
    ... remainder not shown for space ...
  }

A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
target-specific instruction that is defined in ``XXXInstrInfo.td``. The
instruction objects should represent instructions from the architecture manual
of the target machine (such as the SPARC Architecture Manual for the SPARC
target).

A single instruction from the architecture manual is often modeled as multiple
target instructions, depending upon its operands. For example, a manual might
describe an add instruction that takes a register or an immediate operand. An
LLVM target could model this with two instructions named ``ADDri`` and
``ADDrr``.
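
The choice between the two forms can be pictured with the small sketch below.
It is not LLVM code: the ``AddOpcode`` enum and both functions are hypothetical
names, and the sketch assumes a SPARC-style 13-bit signed immediate field. In a
real backend this decision is made by the TableGen-generated instruction
selector rather than by hand-written code like this.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Hypothetical opcode names that mirror the ADDrr / ADDri convention.
  enum class AddOpcode { ADDrr, ADDri };

  // True if Value fits in a 13-bit signed immediate field (assumed width).
  bool fitsInSimm13(int64_t Value) {
    return Value >= -4096 && Value <= 4095;
  }

  // Pick the register/immediate form when the right-hand operand is a
  // constant that can be encoded directly. Otherwise fall back to the
  // register/register form; a constant that does not fit would first have
  // to be placed in a register by other instructions.
  AddOpcode selectAddOpcode(bool RHSIsConstant, int64_t RHSValue) {
    if (RHSIsConstant && fitsInSimm13(RHSValue))
      return AddOpcode::ADDri;
    return AddOpcode::ADDrr;
  }

Under these assumptions, adding the constant 10 selects ``ADDri``, while adding
5000 or adding a value held in a register selects ``ADDrr``.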

You should define a class for each instruction category and define each opcode
as a subclass of the category with appropriate parameters such as the fixed
binary encoding of opcodes and extended opcodes. You should map the register
bits to the bits of the instruction in which they are encoded (for the JIT).
You should also specify how the instruction should be printed when the
automatic assembly printer is used.

As is described in the SPARC Architecture Manual, Version 8, there are three
major 32-bit formats for instructions. Format 1 is only for the ``CALL``
instruction. Format 2 is for branch on condition codes and ``SETHI`` (set high
bits of a register) instructions. Format 3 is for other instructions.

Each of these formats has corresponding classes in ``SparcInstrFormats.td``.
``InstSP`` is a base class for other instruction classes. Additional base
classes are specified for more precise formats: for example in
``SparcInstrFormats.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
branches. There are three other base classes: ``F3_1`` for register/register
operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
floating-point operations. ``SparcInstrInfo.td`` also adds the base class
``Pseudo`` for synthetic SPARC instructions.

``SparcInstrInfo.td`` largely consists of operand and instruction definitions
for the SPARC target. In ``SparcInstrInfo.td``, the following target
description file entry, ``LDrr``, defines the Load Integer instruction for a
Word (the ``LD`` SPARC opcode) from a memory address to a register. The first
parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
category of operation. The second parameter (``000000``\ :sub:`2`) is the
specific operation value for ``LD``/Load Word.
The third parameter is the
output destination, which is a register operand that is defined in the
``Register`` target description file (``IntRegs``).

.. code-block:: text

  def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$rd), (ins (MEMrr $rs1, $rs2):$addr),
                   "ld [$addr], $rd",
                   [(set i32:$rd, (load ADDRrr:$addr))]>;

The fourth parameter is the input source, which uses the address operand
``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:

.. code-block:: text

  def MEMrr : Operand<i32> {
    let PrintMethod = "printMemOperand";
    let MIOperandInfo = (ops IntRegs, IntRegs);
  }

The fifth parameter is a string that is used by the assembly printer and can be
left as an empty string until the assembly printer interface is implemented.
The sixth and final parameter is the pattern used to match the instruction
during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
This parameter is detailed in the next section, :ref:`instruction-selector`.

Instruction class definitions are not overloaded for different operand types,
so separate versions of instructions are needed for register, memory, or
immediate value operands. For example, to perform a Load Integer instruction
for a Word from a register-plus-immediate address to a register, the following
instruction class is defined:

.. code-block:: text

  def LDri : F3_2 <3, 0b000000, (outs IntRegs:$rd), (ins (MEMri $rs1, $simm13):$addr),
                   "ld [$addr], $rd",
                   [(set i32:$rd, (load ADDRri:$addr))]>;

Writing these definitions for so many similar instructions can involve a lot of
cut and paste. In ``.td`` files, the ``multiclass`` directive enables the
creation of templates to define several instruction classes at once (using the
``defm`` directive).
For example, in ``SparcInstrInfo.td``, the ``multiclass``
pattern ``F3_12`` is defined to create two instruction classes each time
``F3_12`` is invoked:

.. code-block:: text

  multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
    def rr  : F3_1 <2, Op3Val,
                   (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2),
                   !strconcat(OpcStr, " $rs1, $rs2, $rd"),
                   [(set i32:$rd, (OpNode i32:$rs1, i32:$rs2))]>;
    def ri  : F3_2 <2, Op3Val,
                   (outs IntRegs:$rd), (ins IntRegs:$rs1, i32imm:$simm13),
                   !strconcat(OpcStr, " $rs1, $simm13, $rd"),
                   [(set i32:$rd, (OpNode i32:$rs1, simm13:$simm13))]>;
  }

So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
instructions, as seen below, it creates four instruction objects: ``XORrr``,
``XORri``, ``ADDrr``, and ``ADDri``.

.. code-block:: text

  defm XOR   : F3_12<"xor", 0b000011, xor>;
  defm ADD   : F3_12<"add", 0b000000, add>;

``SparcInstrInfo.td`` also includes definitions for condition codes that are
referenced by branch instructions. The following definitions in
``SparcInstrInfo.td`` give the numeric value of each SPARC condition code.
For example, the value 10 represents the "greater than" condition for
integers, and the value 22 represents the "greater than" condition for
floats.

.. code-block:: text

  def ICC_NE  : ICC_VAL< 9>;  // Not Equal
  def ICC_E   : ICC_VAL< 1>;  // Equal
  def ICC_G   : ICC_VAL<10>;  // Greater
  ...
  def FCC_U   : FCC_VAL<23>;  // Unordered
  def FCC_G   : FCC_VAL<22>;  // Greater
  def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
  ...

(Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
condition codes. Care must be taken to ensure the values in ``Sparc.h``
correspond to the values in ``SparcInstrInfo.td``. For example,
``SPCC::ICC_NE = 9``, ``SPCC::FCC_U = 23``, and so on.)
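
One way to guard against the two sets of values drifting apart is a
compile-time check. The following self-contained C++ sketch is not taken from
LLVM: the ``SPCCSketch`` namespace, the ``Td...`` constants, and
``condCodeName`` are hypothetical names, and the numbers are copied by hand
from the condition code listing in this section.

.. code-block:: c++

  namespace SPCCSketch {
  // C++ side: mirrors the values the text says Sparc.h must define.
  enum CondCodes {
    ICC_E = 1,   // Equal
    ICC_NE = 9,  // Not Equal
    ICC_G = 10,  // Greater
    FCC_UG = 21, // Unordered or Greater
    FCC_G = 22,  // Greater
    FCC_U = 23   // Unordered
  };

  // TableGen side: numbers copied by hand from the ICC_VAL<...> and
  // FCC_VAL<...> entries (hypothetical constant names).
  constexpr int TdICC_E = 1, TdICC_NE = 9, TdICC_G = 10;
  constexpr int TdFCC_UG = 21, TdFCC_G = 22, TdFCC_U = 23;

  // Compilation fails here if the two sides stop agreeing.
  static_assert(ICC_E == TdICC_E && ICC_NE == TdICC_NE && ICC_G == TdICC_G,
                "integer condition codes out of sync");
  static_assert(FCC_UG == TdFCC_UG && FCC_G == TdFCC_G && FCC_U == TdFCC_U,
                "floating-point condition codes out of sync");

  // Maps a condition code value back to its name, or "unknown" otherwise.
  inline const char *condCodeName(int CC) {
    switch (CC) {
    case ICC_E:  return "ICC_E";
    case ICC_NE: return "ICC_NE";
    case ICC_G:  return "ICC_G";
    case FCC_UG: return "FCC_UG";
    case FCC_G:  return "FCC_G";
    case FCC_U:  return "FCC_U";
    default:     return "unknown";
    }
  }
  } // namespace SPCCSketch

In this sketch the TableGen values are duplicated by hand, so the check is
only as good as that copy. It illustrates the kind of consistency the note
above asks for and is not a mechanism that LLVM provides.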

Instruction Operand Mapping
---------------------------

The code generator backend maps instruction operands to fields in the
instruction. Whenever a bit in the instruction encoding ``Inst`` is assigned
to a field without a concrete value, an operand from the ``outs`` or ``ins``
list is expected to have a matching name. This operand then populates that
undefined field. For example, the SPARC target defines the ``XNORrr``
instruction as a ``F3_1`` format instruction having three operands: the output
``$rd``, and the inputs ``$rs1`` and ``$rs2``.

.. code-block:: text

  def XNORrr  : F3_1<2, 0b000111,
                     (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2),
                     "xnor $rs1, $rs2, $rd",
                     [(set i32:$rd, (not (xor i32:$rs1, i32:$rs2)))]>;

The instruction templates in ``SparcInstrFormats.td`` show the base class for
``F3_1`` is ``InstSP``.

.. code-block:: text

  class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
    field bits<32> Inst;
    let Namespace = "SP";
    bits<2> op;
    let Inst{31-30} = op;
    dag OutOperandList = outs;
    dag InOperandList = ins;
    let AsmString   = asmstr;
    let Pattern = pattern;
  }

``InstSP`` defines the ``op`` field, and uses it to define bits 30 and 31 of the
instruction, but does not assign a value to it.

.. code-block:: text

  class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
      : InstSP<outs, ins, asmstr, pattern> {
    bits<5> rd;
    bits<6> op3;
    bits<5> rs1;
    let op{1} = 1;   // Op = 2 or 3
    let Inst{29-25} = rd;
    let Inst{24-19} = op3;
    let Inst{18-14} = rs1;
  }

``F3`` defines the ``rd``, ``op3``, and ``rs1`` fields, and uses them in the
instruction, and again does not assign values.

.. code-block:: text

  class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
             string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
    bits<8> asi = 0; // asi not currently used
    bits<5> rs2;
    let op         = opVal;
    let op3        = op3val;
    let Inst{13}   = 0;     // i field = 0
    let Inst{12-5} = asi;   // address space identifier
    let Inst{4-0}  = rs2;
  }

``F3_1`` assigns a value to the ``op`` and ``op3`` fields, and defines the
``rs2`` field. Therefore, a ``F3_1`` format instruction will require a
definition for ``rd``, ``rs1``, and ``rs2`` in order to fully specify the
instruction encoding.

The ``XNORrr`` instruction then provides those three operands in its
``OutOperandList`` and ``InOperandList``, which bind to the corresponding
fields, and thus complete the instruction encoding.

For some instructions, a single operand may contain sub-operands. As shown
earlier, the instruction ``LDrr`` uses an input operand of type ``MEMrr``. This
operand type contains two register sub-operands, defined by the
``MIOperandInfo`` value to be ``(ops IntRegs, IntRegs)``.

.. code-block:: text

  def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$rd), (ins (MEMrr $rs1, $rs2):$addr),
                   "ld [$addr], $rd",
                   [(set i32:$rd, (load ADDRrr:$addr))]>;

As this instruction is also the ``F3_1`` format, it will expect operands named
``rd``, ``rs1``, and ``rs2`` as well. In order to allow this, a complex operand
can optionally give names to each of its sub-operands. In this example
``MEMrr``'s first sub-operand is named ``$rs1``, the second ``$rs2``, and the
operand as a whole is also given the name ``$addr``.

When a particular instruction doesn't use all the operands that the instruction
format defines, a constant value may instead be bound to one or all. For
example, the ``RDASR`` instruction only takes a single register operand, so we
assign a constant zero to ``rs2``:

.. code-block:: text

  let rs2 = 0 in
    def RDASR : F3_1<2, 0b101000,
                     (outs IntRegs:$rd), (ins ASRRegs:$rs1),
                     "rd $rs1, $rd", []>;

Instruction Operand Name Mapping
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen will also generate a function called ``getNamedOperandIdx()`` which
can be used to look up an operand's index in a ``MachineInstr`` based on its
TableGen name. Setting the ``UseNamedOperandTable`` bit in an instruction's
TableGen definition will add all of its operands to an enumeration in the
``llvm::XXX::OpName`` namespace and also add an entry for it into the
``OperandMap`` table, which can be queried using ``getNamedOperandIdx()``.
The example below uses the operand names of the ``XNORrr`` definition shown
earlier; ``simm13`` is not an operand of ``XNORrr``, so the lookup fails.

.. code-block:: text

  int RdIndex  = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::rd);     // => 0
  int Rs1Index = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::rs1);    // => 1
  int Rs2Index = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::rs2);    // => 2
  int ImmIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::simm13); // => -1

  ...

The entries in the ``OpName`` enum are taken verbatim from the TableGen
definitions, so operands with lowercase names will have lowercase entries in
the enum.

To include the ``getNamedOperandIdx()`` function in your backend, you will need
to define a few preprocessor macros in ``XXXInstrInfo.cpp`` and
``XXXInstrInfo.h``. For example:

XXXInstrInfo.cpp:

.. code-block:: c++

  #define GET_INSTRINFO_NAMED_OPS // For getNamedOperandIdx() function
  #include "XXXGenInstrInfo.inc"

XXXInstrInfo.h:

.. code-block:: c++

  #define GET_INSTRINFO_OPERAND_ENUM // For OpName enum
  #include "XXXGenInstrInfo.inc"

  namespace XXX {
    int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIndex);
  } // End namespace XXX

Instruction Operand Types
^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen will also generate an enumeration consisting of all named Operand
types defined in the backend, in the ``llvm::XXX::OpTypes`` namespace.
Some common immediate Operand types (for instance ``i8``, ``i32``, ``i64``,
``f32``, ``f64``) are defined for all targets in
``include/llvm/Target/Target.td``, and are available in each Target's
``OpTypes`` enum. Also, only named Operand types appear in the enumeration:
anonymous types are ignored.
For example, the X86 backend defines ``brtarget`` and ``brtarget8``, both
instances of the TableGen ``Operand`` class, which represent branch target
operands:

.. code-block:: text

  def brtarget : Operand<OtherVT>;
  def brtarget8 : Operand<OtherVT>;

This results in:

.. code-block:: c++

  namespace X86 {
  namespace OpTypes {
  enum OperandType {
    ...
    brtarget,
    brtarget8,
    ...
    i32imm,
    i64imm,
    ...
    OPERAND_TYPE_LIST_END
  };
  } // End namespace OpTypes
  } // End namespace X86

In typical TableGen fashion, to use the enum, you will need to define a
preprocessor macro:

.. code-block:: c++

  #define GET_INSTRINFO_OPERAND_TYPES_ENUM // For OpTypes enum
  #include "XXXGenInstrInfo.inc"


Instruction Scheduling
----------------------

Instruction itineraries can be queried using ``MCInstrDesc::getSchedClass()``.
The value can be named by an enumeration in the ``llvm::XXX::Sched`` namespace
generated by TableGen in ``XXXGenInstrInfo.inc``. The names of the schedule
classes are the same as those provided in ``XXXSchedule.td``, plus a default
``NoItinerary`` class.

TableGen generates the schedule models in its ``SubtargetEmitter``, using the
``CodeGenSchedModels`` class. This is distinct from the itinerary
method of specifying machine resource use. The tool ``utils/schedcover.py``
can be used to determine which instructions have been covered by the
schedule model description and which haven't. The first step is to use the
instructions below to create an output file. Then run ``schedcover.py`` on the
output file:

.. code-block:: shell

  $ <src>/utils/schedcover.py <build>/lib/Target/AArch64/tblGenSubtarget.with
  instruction, default, CortexA53Model, CortexA57Model, CycloneModel, ExynosM3Model, FalkorModel, KryoModel, ThunderX2T99Model, ThunderXT8XModel
  ABSv16i8, WriteV, , , CyWriteV3, M3WriteNMISC1, FalkorWr_2VXVY_2cyc, KryoWrite_2cyc_XY_XY_150ln, ,
  ABSv1i64, WriteV, , , CyWriteV3, M3WriteNMISC1, FalkorWr_1VXVY_2cyc, KryoWrite_2cyc_XY_noRSV_67ln, ,
  ...

To capture the debug output from generating a schedule model, change to the
appropriate target directory and use the following command with the
``subtarget-emitter`` debug option:

.. code-block:: shell

  $ <build>/bin/llvm-tblgen -debug-only=subtarget-emitter -gen-subtarget \
    -I <src>/lib/Target/<target> -I <src>/include \
    -I <src>/lib/Target <src>/lib/Target/<target>/<target>.td \
    -o <build>/lib/Target/<target>/<target>GenSubtargetInfo.inc.tmp \
    > tblGenSubtarget.dbg 2>&1

Where ``<build>`` is the build directory, ``<src>`` is the source directory,
and ``<target>`` is the name of the target.
To double check that the above command is what is needed, one can capture the
exact TableGen command from a build by using:

.. code-block:: shell

  $ VERBOSE=1 make ...

and search for ``llvm-tblgen`` commands in the output.


Instruction Relation Mapping
----------------------------

This TableGen feature is used to relate instructions with each other. It is
particularly useful when you have multiple instruction formats and need to
switch between them after instruction selection. This entire feature is driven
by relation models which can be defined in ``XXXInstrInfo.td`` files
according to the target-specific instruction set. Relation models are defined
using the ``InstrMapping`` class as a base. TableGen parses all the models
and generates instruction relation maps using the specified information.
Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
along with the functions to query them. For detailed information on how to
use this feature, please refer to :doc:`HowToUseInstrMappings`.

Implement a subclass of ``TargetInstrInfo``
-------------------------------------------

The final step is to hand-code portions of ``XXXInstrInfo``, which implements
the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
These functions return ``0`` or a Boolean or they assert, unless overridden.
Here's a list of functions that are overridden for the SPARC implementation in
``SparcInstrInfo.cpp``:

* ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
  load from a stack slot, return the register number of the destination and the
  ``FrameIndex`` of the stack slot.

* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
  store to a stack slot, return the register number of the source and the
  ``FrameIndex`` of the stack slot.

* ``copyPhysReg`` --- Copy values between a pair of physical registers.

* ``storeRegToStackSlot`` --- Store a register value to a stack slot.

* ``loadRegFromStackSlot`` --- Load a register value from a stack slot.

* ``storeRegToAddr`` --- Store a register value to memory.

* ``loadRegFromAddr`` --- Load a register value from memory.

* ``foldMemoryOperand`` --- Attempt to fold a load or store into the
  instruction for the specified operand(s).

Branch Folding and If Conversion
--------------------------------

Performance can be improved by combining instructions or by eliminating
instructions that are never reached. The ``analyzeBranch`` method in
``XXXInstrInfo`` may be implemented to examine conditional instructions and
remove unnecessary instructions. ``analyzeBranch`` looks at the end of a
machine basic block (MBB) for opportunities for improvement, such as branch
folding and if conversion. The ``BranchFolder`` and ``IfConverter`` machine
function passes (see the source files ``BranchFolding.cpp`` and
``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``analyzeBranch``
to improve the control flow graph that represents the instructions.

Several implementations of ``analyzeBranch`` (for ARM, Alpha, and X86) can be
examined as models for your own ``analyzeBranch`` implementation. Since SPARC
does not implement a useful ``analyzeBranch``, the ARM target implementation is
shown below.

``analyzeBranch`` returns a Boolean value and takes four parameters:

* ``MachineBasicBlock &MBB`` --- The incoming block to be examined.

* ``MachineBasicBlock *&TBB`` --- A destination block that is returned. For a
  conditional branch that evaluates to true, ``TBB`` is the destination.

* ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
  false, ``FBB`` is returned as the destination.

* ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
  condition for a conditional branch.
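
The contract expressed by these four parameters can be sketched with a toy
model. The code below is not LLVM code: ``Block``, ``Term``, ``TermKind``, and
``analyzeBranchSketch`` are hypothetical stand-ins that replace
``MachineBasicBlock`` and ``MachineOperand`` with plain structs, and the single
integer pushed into ``Cond`` stands for whatever operands a real target would
need. It only mirrors the return convention used in this section, where
``false`` means the block's terminators were understood.

.. code-block:: c++

  #include <vector>

  // Hypothetical stand-ins for MachineBasicBlock and its terminators.
  struct Block;
  enum class TermKind { Uncond, Cond, Indirect };
  struct Term {
    TermKind Kind;
    Block *Target; // branch destination (unused for Indirect)
    int CondCode;  // condition operand (used only for Cond)
  };
  struct Block {
    std::vector<Term> Terminators; // trailing branch instructions, in order
  };

  // Returns false on success and true when the terminators are not understood.
  bool analyzeBranchSketch(const Block &MBB, Block *&TBB, Block *&FBB,
                           std::vector<int> &Cond) {
    TBB = nullptr;
    FBB = nullptr;
    Cond.clear();
    const std::vector<Term> &T = MBB.Terminators;
    if (T.empty())
      return false; // falls through: TBB and FBB stay null
    if (T.size() == 1 && T[0].Kind == TermKind::Uncond) {
      TBB = T[0].Target; // single unconditional branch
      return false;
    }
    if (T.size() == 1 && T[0].Kind == TermKind::Cond) {
      TBB = T[0].Target; // conditional branch; falls through when false
      Cond.push_back(T[0].CondCode);
      return false;
    }
    if (T.size() == 2 && T[0].Kind == TermKind::Cond &&
        T[1].Kind == TermKind::Uncond) {
      TBB = T[0].Target; // taken when the condition is true
      FBB = T[1].Target; // taken when the condition is false
      Cond.push_back(T[0].CondCode);
      return false;
    }
    // Every other shape, including shapes a real target may choose to
    // handle, is reported as not analyzable in this sketch.
    return true;
  }

A caller in this model checks the return value first and only then reads
``TBB``, ``FBB``, and ``Cond``. A null ``TBB`` after a successful call means
the block simply falls through.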

In the simplest case, if a block ends without a branch, then it falls through
to the successor block. No destination blocks are specified for either ``TBB``
or ``FBB``, so both parameters return ``NULL``. The start of the
``analyzeBranch`` (see code below for the ARM target) shows the function
parameters and the code for the simplest case.

.. code-block:: c++

  bool ARMInstrInfo::analyzeBranch(MachineBasicBlock &MBB,
                                   MachineBasicBlock *&TBB,
                                   MachineBasicBlock *&FBB,
                                   std::vector<MachineOperand> &Cond) const
  {
    MachineBasicBlock::iterator I = MBB.end();
    if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
      return false;

If a block ends with a single unconditional branch instruction, then
``analyzeBranch`` (shown below) should return the destination of that branch in
the ``TBB`` parameter.

.. code-block:: c++

    if (LastOpc == ARM::B || LastOpc == ARM::tB) {
      TBB = LastInst->getOperand(0).getMBB();
      return false;
    }

If a block ends with two unconditional branches, then the second branch is
never reached. In that situation, as shown below, remove the last branch
instruction and return the destination of the penultimate branch in the
``TBB`` parameter.

.. code-block:: c++

    if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
        (LastOpc == ARM::B || LastOpc == ARM::tB)) {
      TBB = SecondLastInst->getOperand(0).getMBB();
      I = LastInst;
      I->eraseFromParent();
      return false;
    }

A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false. In that case,
``analyzeBranch`` (shown below) should return the destination of that
conditional branch in the ``TBB`` parameter and a list of operands in the
``Cond`` parameter to evaluate the condition.

.. code-block:: c++

    if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
      // Block ends with fall-through condbranch.
      TBB = LastInst->getOperand(0).getMBB();
      Cond.push_back(LastInst->getOperand(1));
      Cond.push_back(LastInst->getOperand(2));
      return false;
    }

If a block ends with both a conditional branch and an ensuing unconditional
branch, then ``analyzeBranch`` (shown below) should return the conditional
branch destination (assuming it corresponds to a conditional evaluation of
"``true``") in the ``TBB`` parameter and the unconditional branch destination
in the ``FBB`` (corresponding to a conditional evaluation of "``false``"). A
list of operands to evaluate the condition should be returned in the ``Cond``
parameter.

.. code-block:: c++

    unsigned SecondLastOpc = SecondLastInst->getOpcode();

    if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
        (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
      TBB =  SecondLastInst->getOperand(0).getMBB();
      Cond.push_back(SecondLastInst->getOperand(1));
      Cond.push_back(SecondLastInst->getOperand(2));
      FBB = LastInst->getOperand(0).getMBB();
      return false;
    }

For the last two cases (ending with a single conditional branch or ending with
one conditional and one unconditional branch), the operands returned in the
``Cond`` parameter can be passed to methods of other instructions to create new
branches or perform other operations. An implementation of ``analyzeBranch``
requires the helper methods ``removeBranch`` and ``insertBranch`` to manage
subsequent operations.

``analyzeBranch`` should return false indicating success in most circumstances.
``analyzeBranch`` should only return true when the method is stumped about what
to do, for example, if a block has three terminating branches.
``analyzeBranch`` may return true if it encounters a terminator it cannot
handle, such as an indirect branch.

.. _instruction-selector:

Instruction Selector
====================

LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
the ``SelectionDAG`` ideally represent native target instructions. During code
generation, instruction selection passes are performed to convert non-native
DAG instructions into native target-specific instructions. The pass defined
in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
instruction selection. Optionally, a pass may be defined (in
``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
instructions. The code in ``XXXISelLowering.cpp`` replaces or removes
(legalizes) operations and data types that are not natively supported in a
``SelectionDAG``.

TableGen generates code for instruction selection using the following target
description input files:

* ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
  target-specific instruction set and generates ``XXXGenDAGISel.inc``, which is
  included in ``XXXISelDAGToDAG.cpp``.

* ``XXXCallingConv.td`` --- Contains the calling and return value conventions
  for the target architecture and generates ``XXXGenCallingConv.inc``,
  which is included in ``XXXISelLowering.cpp``.

The implementation of an instruction selection pass must include a header that
declares the ``FunctionPass`` class or a subclass of ``FunctionPass``. In
``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
selection pass into the queue of passes to run.

The LLVM static compiler (``llc``) is an excellent tool for visualizing the
contents of DAGs.
To display the ``SelectionDAG`` before or after specific
processing phases, use the command line options for ``llc``, described at
:ref:`SelectionDAG-Process`.

To describe instruction selector behavior, you should add patterns for lowering
LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
definitions in ``XXXInstrInfo.td``. For example, in ``SparcInstrInfo.td``,
this entry defines a register store operation, and the last parameter describes
a pattern with the store DAG operator.

.. code-block:: text

  def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
                   "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;

``ADDRrr`` is a memory addressing mode that is also defined in
``SparcInstrInfo.td``:

.. code-block:: text

  def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;

The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
defined in an implementation of the Instruction Selector (such as
``SparcISelDAGToDAG.cpp``).

In ``lib/Target/TargetSelectionDAG.td``, the DAG operator for store is defined
as follows:

.. code-block:: text

  def store : PatFrag<(ops node:$val, node:$ptr),
                      (unindexedstore node:$val, node:$ptr)> {
    let IsStore = true;
    let IsTruncStore = false;
  }

``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
``SelectCode`` method that is used to call the appropriate processing method
for an instruction. In this example, ``SelectCode`` calls ``Select_ISD_STORE``
for the ``ISD::STORE`` opcode.

.. code-block:: c++

  SDNode *SelectCode(SDValue N) {
    ...
    MVT::ValueType NVT = N.getNode()->getValueType(0);
    switch (N.getOpcode()) {
    case ISD::STORE: {
      switch (NVT) {
      default:
        return Select_ISD_STORE(N);
        break;
      }
      break;
    }
    ...

The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
code for ``STrr`` is created for ``Select_ISD_STORE``. The ``Emit_22`` method
is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
instruction.

.. code-block:: c++

  SDNode *Select_ISD_STORE(const SDValue &N) {
    SDValue Chain = N.getOperand(0);
    if (Predicate_store(N.getNode())) {
      SDValue N1 = N.getOperand(1);
      SDValue N2 = N.getOperand(2);
      SDValue CPTmp0;
      SDValue CPTmp1;

      // Pattern: (st:void i32:i32:$src,
      //           ADDRrr:i32:$addr)<<P:Predicate_store>>
      // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
      // Pattern complexity = 13  cost = 1  size = 0
      if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
          N1.getNode()->getValueType(0) == MVT::i32 &&
          N2.getNode()->getValueType(0) == MVT::i32) {
        return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
      }
  ...

The SelectionDAG Legalize Phase
-------------------------------

The Legalize phase converts a DAG to use types and operations that are natively
supported by the target. For natively unsupported types and operations, you
need to add code to the target-specific ``XXXTargetLowering`` implementation to
convert unsupported types and operations to supported ones.

In the constructor for the ``XXXTargetLowering`` class, first use the
``addRegisterClass`` method to specify which types are supported and which
register classes are associated with them. The code for the register classes
is generated by TableGen from ``XXXRegisterInfo.td`` and placed in
``XXXGenRegisterInfo.h.inc``. For example, the implementation of the
constructor for the ``SparcTargetLowering`` class (in
``SparcISelLowering.cpp``) starts with the following code:

.. code-block:: c++

  addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
  addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
  addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);

You should examine the node types in the ``ISD`` namespace
(``include/llvm/CodeGen/SelectionDAGNodes.h``) and determine which operations
the target natively supports. For operations that do **not** have native
support, add a callback to the constructor for the ``XXXTargetLowering`` class,
so the instruction selection process knows what to do. The ``TargetLowering``
class callback methods (declared in ``llvm/Target/TargetLowering.h``) are:

* ``setOperationAction`` --- General operation.
* ``setLoadExtAction`` --- Load with extension.
* ``setTruncStoreAction`` --- Truncating store.
* ``setIndexedLoadAction`` --- Indexed load.
* ``setIndexedStoreAction`` --- Indexed store.
* ``setConvertAction`` --- Type conversion.
* ``setCondCodeAction`` --- Support for a given condition code.

Note: on older releases, ``setLoadXAction`` is used instead of
``setLoadExtAction``. Also, on older releases, ``setCondCodeAction`` may not
be supported. Examine your release to see what methods are specifically
supported.

These callbacks are used to specify whether an operation works with a
specified type (or types). In all cases, the third parameter is a
``LegalizeAction`` type enum value: ``Promote``, ``Expand``, ``Custom``, or
``Legal``. ``SparcISelLowering.cpp`` contains examples of all four
``LegalizeAction`` values.

Promote
^^^^^^^

For an operation without native support for a given type, the specified type
may be promoted to a larger type that is supported.
For example, SPARC does
not support a sign-extending load for Boolean values (``i1`` type), so in
``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
``i1`` type values to a larger type before loading.

.. code-block:: c++

  setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);

Expand
^^^^^^

For a type without native support, a value may need to be broken down further,
rather than promoted. For an operation without native support, a combination
of other operations may be used to similar effect. In SPARC, the
floating-point sine and cosine trig operations are supported by expansion to
other operations, as indicated by the third parameter, ``Expand``, to
``setOperationAction``:

.. code-block:: c++

  setOperationAction(ISD::FSIN, MVT::f32, Expand);
  setOperationAction(ISD::FCOS, MVT::f32, Expand);

Custom
^^^^^^

For some operations, simple type promotion or operation expansion may be
insufficient. In some cases, a special intrinsic function must be implemented.

For example, a constant value may require special treatment, or an operation
may require spilling and restoring registers in the stack and working with
register allocators.

As seen in the ``SparcISelLowering.cpp`` code below, to perform a type
conversion from a floating-point value to a signed integer, first
``setOperationAction`` should be called with ``Custom`` as the third parameter:

.. code-block:: c++

  setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);

In the ``LowerOperation`` method, for each ``Custom`` operation, a case
statement should be added to indicate what function to call. In the following
code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:

.. code-block:: c++

  SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
    switch (Op.getOpcode()) {
    case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
    ...
    }
  }

Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
convert the floating-point value to an integer.

.. code-block:: c++

  static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
    assert(Op.getValueType() == MVT::i32);
    Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
    return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
  }

Legal
^^^^^

The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
**is** natively supported. ``Legal`` represents the default condition, so it
is rarely used. In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
operation to count the bits set in an integer) is natively supported only for
SPARC v9. The following code enables the ``Expand`` conversion technique for
non-v9 SPARC implementations.

.. code-block:: c++

  setOperationAction(ISD::CTPOP, MVT::i32, Expand);
  ...
  if (TM.getSubtarget<SparcSubtarget>().isV9())
    setOperationAction(ISD::CTPOP, MVT::i32, Legal);

.. _backend-calling-convs:

Calling Conventions
-------------------

To support target-specific calling conventions, ``XXXCallingConv.td`` uses
interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
``lib/Target/TargetCallingConv.td``. TableGen can take the target descriptor
file ``XXXCallingConv.td`` and generate the header file
``XXXGenCallingConv.inc``, which is typically included in
``XXXISelLowering.cpp``. You can use the interfaces in
``TargetCallingConv.td`` to specify:

* The order of parameter allocation.

* Where parameters and return values are placed (that is, on the stack or in
  registers).

* Which registers may be used.

* Whether the caller or callee unwinds the stack.

The following example demonstrates the use of the ``CCIfType`` and
``CCAssignToReg`` interfaces. If the ``CCIfType`` predicate is true (that is,
if the current argument is of type ``f32`` or ``f64``), then the action is
performed. In this case, the ``CCAssignToReg`` action assigns the argument
value to the first available register: either ``R0`` or ``R1``.

.. code-block:: text

  CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>

``SparcCallingConv.td`` contains definitions for a target-specific return-value
calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention
(``CC_Sparc32``). The definition of ``RetCC_Sparc32`` (shown below) indicates
which registers are used for specified scalar return types. A single-precision
float is returned in register ``F0``, and a double-precision float is returned
in register ``D0``. A 32-bit integer is returned in register ``I0`` or ``I1``.

.. code-block:: text

  def RetCC_Sparc32 : CallingConv<[
    CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
    CCIfType<[f32], CCAssignToReg<[F0]>>,
    CCIfType<[f64], CCAssignToReg<[D0]>>
  ]>;

The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces
``CCAssignToStack``, which assigns the value to a stack slot with the specified
size and alignment. In the example below, the first parameter, 4, indicates
the size of the slot, and the second parameter, also 4, indicates the stack
alignment along 4-byte units. (Special cases: if size is zero, then the ABI
size is used; if alignment is zero, then the ABI alignment is used.)

.. code-block:: text

  def CC_Sparc32 : CallingConv<[
    // All arguments get passed in integer registers if there is space.
    CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
    CCAssignToStack<4, 4>
  ]>;

``CCDelegateTo`` is another commonly used interface, which tries to find a
specified sub-calling convention, and, if a match is found, it is invoked. In
the following example (in ``X86CallingConv.td``), the definition of
``RetCC_X86_32_C`` ends with ``CCDelegateTo``. After the current value is
assigned to the register ``ST0`` or ``ST1``, the ``RetCC_X86Common`` is
invoked.

.. code-block:: text

  def RetCC_X86_32_C : CallingConv<[
    CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
    CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
    CCDelegateTo<RetCC_X86Common>
  ]>;

``CCIfCC`` is an interface that attempts to match the given name to the current
calling convention. If the name identifies the current calling convention,
then a specified action is invoked. In the following example (in
``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then
``RetCC_X86_32_Fast`` is invoked. If the ``SSECall`` calling convention is in
use, then ``RetCC_X86_32_SSE`` is invoked.

.. code-block:: text

  def RetCC_X86_32 : CallingConv<[
    CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
    CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
    CCDelegateTo<RetCC_X86_32_C>
  ]>;

``CCAssignToRegAndStack`` is the same as ``CCAssignToReg``, but also allocates
a stack slot when a register is used. Basically, it works like:
``CCIf<CCAssignToReg<regList>, CCAssignToStack<size, align>>``.

.. code-block:: text

  class CCAssignToRegAndStack<list<Register> regList, int size, int align>
      : CCAssignToReg<regList> {
    int Size = size;
    int Align = align;
  }

Other calling convention interfaces include:

* ``CCIf <predicate, action>`` --- If the predicate matches, apply the action.

* ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``"
  attribute, then apply the action.

* ``CCIfNest <action>`` --- If the argument is marked with the "``nest``"
  attribute, then apply the action.

* ``CCIfNotVarArg <action>`` --- If the current function does not take a
  variable number of arguments, apply the action.

* ``CCAssignToRegWithShadow <registerList, shadowList>`` --- similar to
  ``CCAssignToReg``, but with a shadow list of registers.

* ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
  minimum specified size and alignment.

* ``CCPromoteToType <type>`` --- Promote the current value to the specified
  type.

* ``CallingConv <[actions]>`` --- Define each calling convention that is
  supported.

Assembly Printer
================

During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output. To do this, implement the code for a printer that
converts LLVM IR to a GAS-format assembly language for your target machine,
using the following steps:

* Define all the assembly strings for your target, adding them to the
  instructions defined in the ``XXXInstrInfo.td`` file. (See
  :ref:`instruction-set`.) TableGen will produce an output file
  (``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
  method for the ``XXXAsmPrinter`` class.

* Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
  the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).

* Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
  ``TargetAsmInfo`` properties and sometimes new implementations for methods.

* Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
  performs the LLVM-to-assembly conversion.

The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``. Similarly,
``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
replacement values that override the default values in ``TargetAsmInfo.cpp``.
For example in ``SparcTargetAsmInfo.cpp``:

.. code-block:: c++

  SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
    Data16bitsDirective = "\t.half\t";
    Data32bitsDirective = "\t.word\t";
    Data64bitsDirective = 0;  // .xword is only supported by V9.
    ZeroDirective = "\t.skip\t";
    CommentString = "!";
    ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
  }

The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
where the target-specific ``TargetAsmInfo`` class uses an overridden method:
``ExpandInlineAsm``.

A target-specific implementation of ``AsmPrinter`` is written in
``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that converts
the LLVM code to printable assembly. The implementation must include the
following headers that have declarations for the ``AsmPrinter`` and
``MachineFunctionPass`` classes. The ``MachineFunctionPass`` is a subclass of
``FunctionPass``.

.. code-block:: c++

  #include "llvm/CodeGen/AsmPrinter.h"
  #include "llvm/CodeGen/MachineFunctionPass.h"

As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
up the ``AsmPrinter``. In ``SparcAsmPrinter``, a ``Mangler`` object is
instantiated to process variable names.

In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``. In
``MachineFunctionPass``, the ``runOnFunction`` method invokes
``runOnMachineFunction``. Target-specific implementations of
``runOnMachineFunction`` differ, but generally do the following to process each
machine function:

* Call ``SetupMachineFunction`` to perform initialization.

* Call ``EmitConstantPool`` to print out (to the output stream) constants which
  have been spilled to memory.

* Call ``EmitJumpTableInfo`` to print out jump tables used by the current
  function.

* Print out the label for the current function.

* Print out the code for the function, including basic block labels and the
  assembly for the instruction (using ``printInstruction``).

The ``XXXAsmPrinter`` implementation must also include the code generated by
TableGen that is output in the ``XXXGenAsmWriter.inc`` file. The code in
``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
method that may call these methods:

* ``printOperand``
* ``printMemOperand``
* ``printCCOperand`` (for conditional statements)
* ``printDataDirective``
* ``printDeclare``
* ``printImplicitDef``
* ``printInlineAsm``

The implementations of ``printDeclare``, ``printImplicitDef``,
``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
adequate for printing assembly and do not need to be overridden.
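
The order of the ``runOnMachineFunction`` steps listed above can be
illustrated with a small standalone model. This is only a sketch that does not
use any LLVM headers: ``ToyBlock``, ``ToyFunction``, and ``ToyAsmPrinter`` are
hypothetical names invented for this example, the ``.LCPI`` label format is an
assumption, and the setup and jump table steps are left out for brevity.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical stand-ins for machine-level classes; not part of LLVM.
  struct ToyBlock {
    std::string Label;
    std::vector<std::string> Instrs; // already-formatted assembly strings
  };

  struct ToyFunction {
    std::string Name;
    std::vector<std::string> Constants; // constants spilled to memory
    std::vector<ToyBlock> Blocks;
  };

  class ToyAsmPrinter {
    std::ostringstream O; // plays the role of the output stream

    // Print one labeled entry per spilled constant.
    void emitConstantPool(const ToyFunction &F) {
      for (std::size_t I = 0; I != F.Constants.size(); ++I)
        O << ".LCPI" << I << ":\t.word\t" << F.Constants[I] << "\n";
    }

    // Stand-in for the TableGen-generated printInstruction method.
    void printInstruction(const std::string &Instr) {
      O << "\t" << Instr << "\n";
    }

  public:
    // Follows the step order described above: constants, label, then code.
    bool runOnMachineFunction(const ToyFunction &F) {
      assert(!F.Name.empty() && "function needs a name for its label");
      emitConstantPool(F);
      O << F.Name << ":\n";
      for (const ToyBlock &B : F.Blocks) {
        O << B.Label << ":\n";
        for (const std::string &I : B.Instrs)
          printInstruction(I);
      }
      return false; // the function itself was not modified
    }

    std::string str() const { return O.str(); }
  };

Calling ``runOnMachineFunction`` on a ``ToyFunction`` and then reading
``str()`` yields the constant pool entries first, then the function label,
then each block label followed by its tab-indented instructions.
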

The ``printOperand`` method is implemented with a long ``switch``/``case``
statement for the type of operand: register, immediate, basic block, external
symbol, global address, constant pool index, or jump table index. For an
instruction with a memory address operand, the ``printMemOperand`` method
should be implemented to generate the proper output. Similarly,
``printCCOperand`` should be used to print a conditional operand.

``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
called to shut down the assembly printer. During ``doFinalization``, global
variables and constants are printed to output.

Subtarget Support
=================

Subtarget support is used to inform the code generation process of instruction
set variations for a given chip set. For example, the LLVM SPARC
implementation provided covers three major versions of the SPARC microprocessor
architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
64-bit architecture), and the UltraSPARC architecture. V8 has 16
double-precision floating-point registers that are also usable as either 32
single-precision or 8 quad-precision registers. V8 is also purely big-endian.
V9 has 32 double-precision floating-point registers that are also usable as 16
quad-precision registers, but cannot be used as single-precision registers.
The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
extensions.

If subtarget support is needed, you should implement a target-specific
``XXXSubtarget`` class for your architecture. This class should process the
command-line options ``-mcpu=`` and ``-mattr=``.

TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
generate code in ``SparcGenSubtarget.inc``. In ``Target.td``, shown below, the
``SubtargetFeature`` interface is defined.
The first 4 string parameters of
the ``SubtargetFeature`` interface are a feature name, an ``XXXSubtarget``
field set by the feature, the value of the ``XXXSubtarget`` field, and a
description of the feature. (The fifth parameter is a list of features whose
presence is implied, and its default value is an empty array.)

If the value for the field is the string "true" or "false", the field
is assumed to be a bool and only one ``SubtargetFeature`` should refer to it.
Otherwise, it is assumed to be an integer. The integer value may be the name
of an enum constant. If multiple features use the same integer field, the
field will be set to the maximum value of all enabled features that share
the field.

.. code-block:: text

  class SubtargetFeature<string n, string f, string v, string d,
                         list<SubtargetFeature> i = []> {
    string Name = n;
    string FieldName = f;
    string Value = v;
    string Desc = d;
    list<SubtargetFeature> Implies = i;
  }

In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
following features.

.. code-block:: text

  def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                       "Enable SPARC-V9 instructions">;
  def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                       "UseV8DeprecatedInsts", "true",
                       "Enable deprecated V8 instructions in V9 mode">;
  def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                       "Enable UltraSPARC Visual Instruction Set extensions">;

Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
define particular SPARC processor subtypes that may have the previously
described features.

.. code-block:: text

  class Proc<string Name, list<SubtargetFeature> Features>
    : Processor<Name, NoItineraries, Features>;

  def : Proc<"generic",         []>;
  def : Proc<"v8",              []>;
  def : Proc<"supersparc",      []>;
  def : Proc<"sparclite",       []>;
  def : Proc<"f934",            []>;
  def : Proc<"hypersparc",      []>;
  def : Proc<"sparclite86x",    []>;
  def : Proc<"sparclet",        []>;
  def : Proc<"tsc701",          []>;
  def : Proc<"v9",              [FeatureV9]>;
  def : Proc<"ultrasparc",      [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;

From the ``Target.td`` and ``Sparc.td`` files, the resulting
``SparcGenSubtarget.inc`` specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
``ParseSubtargetFeatures`` method that parses the features string that sets
specified subtarget options. The generated ``SparcGenSubtarget.inc`` file
should be included in ``SparcSubtarget.cpp``. The target-specific
implementation of the ``XXXSubtarget`` constructor should follow this
pseudocode:

.. code-block:: c++

  XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
    // Set the default features
    // Determine default and user specified characteristics of the CPU
    // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
    // Perform any additional operations
  }

JIT Support
===========

The implementation of a target machine optionally includes a Just-In-Time (JIT)
code generator that emits machine code and auxiliary structures as binary
output that can be written directly to memory.
To do this, implement JIT code
generation by performing the following steps:

* Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
  that transforms target-machine instructions into relocatable machine
  code.

* Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
  target-specific code-generation activities, such as emitting machine code and
  stubs.

* Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
  through its ``getJITInfo`` method.

There are several different approaches to writing the JIT support code. For
instance, TableGen and target descriptor files may be used for creating a JIT
code generator, but are not mandatory. For the Alpha and PowerPC target
machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
contains the binary coding of machine instructions and the
``getBinaryCodeForInstr`` method to access those codes. Other JIT
implementations do not.

Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
``MachineCodeEmitter`` class containing code for several callback functions
that write data (in bytes, words, strings, etc.) to the output stream.

Machine Code Emitter
--------------------

In ``XXXCodeEmitter.cpp``, a target-specific version of the ``Emitter`` class
is implemented as a function pass (subclass of ``MachineFunctionPass``). The
target-specific implementation of ``runOnMachineFunction`` (invoked by
``runOnFunction`` in ``MachineFunctionPass``) iterates through each
``MachineBasicBlock`` and calls ``emitInstruction`` to process each instruction
and emit binary code. ``emitInstruction`` is largely implemented with case
statements on the instruction types defined in ``XXXInstrInfo.h``.
For
example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
around the following ``switch``/``case`` statements:

.. code-block:: c++

  switch (Desc->TSFlags & X86II::FormMask) {
  case X86II::Pseudo:  // for not yet implemented instructions
     ...               // or pseudo-instructions
     break;
  case X86II::RawFrm:  // for instructions with a fixed opcode value
     ...
     break;
  case X86II::AddRegFrm: // for instructions that have one register operand
     ...                 // added to their opcode
     break;
  case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
     ...                 // to specify a destination (register)
     break;
  case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
     ...                 // to specify a destination (memory)
     break;
  case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
     ...                 // to specify a source (register)
     break;
  case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
     ...                 // to specify a source (memory)
     break;
  case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
  case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
  case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
  case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
     ...
     break;
  case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
  case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
  case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
  case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
     ...
     break;
  case X86II::MRMInitReg: // for instructions whose source and
     ...                  // destination are the same register
     break;
  }

The implementations of these case statements often first emit the opcode and
then get the operand(s).
Then, depending upon the operand, helper methods may
be called to process the operand(s). For example, in ``X86CodeEmitter.cpp``,
for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
the opcode added to the register operand. Then an object representing the
machine operand, ``MO1``, is extracted. The helper methods such as
``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
(``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
and ``emitJumpTableAddress`` that emit the data into the output stream.)

.. code-block:: c++

  case X86II::AddRegFrm:
    MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

    if (CurOp != NumOps) {
      const MachineOperand &MO1 = MI.getOperand(CurOp++);
      unsigned Size = X86InstrInfo::sizeOfImm(Desc);
      if (MO1.isImmediate())
        emitConstant(MO1.getImm(), Size);
      else {
        unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
          : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
        if (Opcode == X86::MOV64ri)
          rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
        if (MO1.isGlobalAddress()) {
          bool NeedStub = isa<Function>(MO1.getGlobal());
          bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
          emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                            NeedStub, isLazy);
        } else if (MO1.isExternalSymbol())
          emitExternalSymbolAddress(MO1.getSymbolName(), rt);
        else if (MO1.isConstantPoolIndex())
          emitConstPoolAddress(MO1.getIndex(), rt);
        else if (MO1.isJumpTableIndex())
          emitJumpTableAddress(MO1.getIndex(), rt);
      }
    }
    break;

In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
is a ``RelocationType`` enum that may be used to relocate addresses (for
example, a global address with a PIC base offset). The ``RelocationType`` enum
for that target is defined in the short target-specific ``XXXRelocations.h``
file. The ``RelocationType`` is used by the ``relocate`` method defined in
``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.

For example, ``X86Relocations.h`` specifies the following relocation types for
the X86 addresses. In all four cases, the relocated value is added to the
value already in memory. For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
there is an additional initial adjustment.

.. code-block:: c++

  enum RelocationType {
    reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
    reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
    reloc_absolute_word = 2, // absolute relocation; no additional adjustment
    reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
  };

Target JIT Info
---------------

``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
code-generation activities, such as emitting machine code and stubs.
At
minimum, a target-specific version of ``XXXJITInfo`` implements the following:

* ``getLazyResolverFunction`` --- Initializes the JIT, gives the target a
  function that is used for compilation.

* ``emitFunctionStub`` --- Returns a native function with a specified address
  for a callback function.

* ``relocate`` --- Changes the addresses of referenced globals, based on
  relocation types.

* Callback functions that are wrappers to a function stub that is used when the
  real target is not initially known.

``getLazyResolverFunction`` is generally trivial to implement. It stores the
incoming parameter in the global ``JITCompilerFunction`` and returns the
callback function that will be used as a function wrapper. For the Alpha target
(in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction`` implementation is
simply:

.. code-block:: c++

  TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                              JITCompilerFn F) {
    JITCompilerFunction = F;
    return AlphaCompilationCallback;
  }

For the X86 target, the ``getLazyResolverFunction`` implementation is a little
more complicated, because it returns a different callback function for
processors with SSE instructions and XMM registers.

The callback function initially saves and later restores the callee register
values, incoming arguments, and frame and return address. The callback
function needs low-level access to the registers or stack, so it is typically
implemented with assembler.
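
The relationship between ``getLazyResolverFunction``, the global
``JITCompilerFunction``, and the callback can be modeled in portable C++.
The following is a sketch under simplifying assumptions, not the LLVM
implementation: the function-pointer signatures are invented for the example
(the real ``TargetJITInfo`` typedefs differ), ``ToyCompilationCallback`` is a
hypothetical name, and the register saving that a real callback performs in
assembler is omitted.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Hypothetical, simplified signatures chosen for this sketch only.
  using JITCompilerFn = std::intptr_t (*)(int StubId);
  using LazyResolverFn = std::intptr_t (*)(int StubId);

  // Global that remembers which function compiles code on demand.
  static JITCompilerFn JITCompilerFunction = nullptr;

  // Plays the role of the compilation callback. A real callback would save
  // and restore registers in assembler; this one only forwards the request.
  static std::intptr_t ToyCompilationCallback(int StubId) {
    assert(JITCompilerFunction && "resolver has not been installed");
    return JITCompilerFunction(StubId);
  }

  // Store the incoming compiler function, then hand back the callback.
  LazyResolverFn getLazyResolverFunction(JITCompilerFn F) {
    JITCompilerFunction = F;
    return ToyCompilationCallback;
  }

A caller passes its own compile-on-demand function to
``getLazyResolverFunction`` and later invokes the returned pointer; each
invocation is forwarded to the stored function.
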