=======================
Writing an LLVM Backend
=======================

.. toctree::
   :hidden:

   HowToUseInstrMappings

.. contents::
   :local:

Introduction
============

This document describes techniques for writing compiler backends that convert
the LLVM Intermediate Representation (IR) to code for a specified machine or
other languages.  Code intended for a specific machine can take the form of
either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may
create output for several types of target CPUs --- including X86, PowerPC,
ARM, and SPARC.  The backend may also be used to generate code targeted at SPUs
of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of
``llvm/lib/Target`` in a downloaded LLVM release.  In particular, this document
focuses on the example of creating a static compiler (one that emits text
assembly) for a SPARC target, because SPARC has fairly standard
characteristics, such as a RISC instruction set and straightforward calling
conventions.

Audience
--------

The audience for this document is anyone who needs to write an LLVM backend to
generate code for a specific hardware or software target.

Prerequisite Reading
--------------------

These essential documents must be read before reading this document:

* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
  the LLVM assembly language.

* :doc:`CodeGenerator` --- a guide to the components (classes and code
  generation algorithms) for translating the LLVM internal representation into
  machine code for a specified target.  Pay particular attention to the
  descriptions of code generation stages: Instruction Selection, Scheduling and
  Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
  Insertion, Late Machine Code Optimizations, and Code Emission.

* :doc:`TableGen/index` --- a document that describes the TableGen
  (``tblgen``) application that manages domain-specific information to support
  LLVM code generation.  TableGen processes input from a target description
  file (``.td`` suffix) and generates C++ code that can be used for code
  generation.

* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
  are several ``SelectionDAG`` processing steps.

To follow the SPARC examples in this document, have a copy of `The SPARC
Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
reference.  For details about the ARM instruction set, refer to the `ARM
Architecture Reference Manual <http://infocenter.arm.com/>`_.  For more about
the GNU Assembler format (``GAS``), see `Using As
<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
assembly printer.  "Using As" contains a list of target machine dependent
features.

Basic Steps
-----------

To write a compiler backend for LLVM that converts the LLVM IR to code for a
specified target (machine or other language), follow these steps:

* Create a subclass of the ``TargetMachine`` class that describes
  characteristics of your target machine.  Copy existing examples of specific
  ``TargetMachine`` class and header files; for example, start with
  ``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
  names for your target.  Similarly, change code that references "``Sparc``" to
  reference your target.

* Describe the register set of the target.  Use TableGen to generate code for
  register definition, register aliases, and register classes from a
  target-specific ``RegisterInfo.td`` input file.  You should also write
  additional code for a subclass of the ``TargetRegisterInfo`` class that
  represents the class register file data used for register allocation and also
  describes the interactions between registers.

* Describe the instruction set of the target.  Use TableGen to generate code
  for target-specific instructions from target-specific versions of
  ``TargetInstrFormats.td`` and ``TargetInstrInfo.td``.  You should write
  additional code for a subclass of the ``TargetInstrInfo`` class to represent
  machine instructions supported by the target machine.

* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
  Graph (DAG) representation of instructions to native target-specific
  instructions.  Use TableGen to generate code that matches patterns and
  selects instructions based on additional information in a target-specific
  version of ``TargetInstrInfo.td``.  Write code for ``XXXISelDAGToDAG.cpp``,
  where ``XXX`` identifies the specific target, to perform pattern matching and
  DAG-to-DAG instruction selection.  Also write code in ``XXXISelLowering.cpp``
  to replace or remove operations and data types that are not supported
  natively in a SelectionDAG.

* Write code for an assembly printer that converts LLVM IR to a GAS format for
  your target machine.  You should add assembly strings to the instructions
  defined in your target-specific version of ``TargetInstrInfo.td``.  You
  should also write code for a subclass of ``AsmPrinter`` that performs the
  LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.

* Optionally, add support for subtargets (i.e., variants with different
  capabilities).  You should also write code for a subclass of the
  ``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
  ``-mattr=`` command-line options.

* Optionally, add JIT support and create a machine code emitter (subclass of
  ``TargetJITInfo``) that is used to emit binary code directly into memory.

In the ``.cpp`` and ``.h`` files, initially stub up these methods and then
implement them later.  Initially, you may not know which private members the
class will need and which components will need to be subclassed.

Preliminaries
-------------

To actually create your compiler backend, you need to create and modify a few
files.  The absolute minimum is discussed here.  But to actually use the LLVM
target-independent code generator, you must perform the steps described in the
:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.

First, you should create a subdirectory under ``lib/Target`` to hold all the
files related to your target.  If your target is called "Dummy", create the
directory ``lib/Target/Dummy``.

In this new directory, create a ``CMakeLists.txt``.  It is easiest to copy a
``CMakeLists.txt`` of another target and modify it.  It should at least contain
the ``LLVM_TARGET_DEFINITIONS`` variable. The library can be named ``LLVMDummy``
(for example, see the MIPS target).  Alternatively, you can split the library
into ``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which
should be implemented in a subdirectory below ``lib/Target/Dummy`` (for example,
see the PowerPC target).

Note that these two naming schemes are hardcoded into ``llvm-config``.  Using
any other naming scheme will confuse ``llvm-config`` and produce a lot of
(seemingly unrelated) linker errors when linking ``llc``.

To make your target actually do something, you need to implement a subclass of
``TargetMachine``.  This implementation should typically be in the file
``lib/Target/DummyTargetMachine.cpp``, but any file in the ``lib/Target``
directory will be built and should work.  To use LLVM's target independent code
generator, you should do what all current machine backends do: create a
subclass of ``CodeGenTargetMachineImpl``.  (To create a target from scratch, create a
subclass of ``TargetMachine``.)

To get LLVM to actually build and link your target, you need to run ``cmake``
with ``-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=Dummy``. This will build your
target without needing to add it to the list of all the targets.

Once your target is stable, you can add it to the ``LLVM_ALL_TARGETS`` variable
located in the main ``CMakeLists.txt``.

Target Machine
==============

``CodeGenTargetMachineImpl`` is designed as a base class for targets implemented with
the LLVM target-independent code generator. The ``CodeGenTargetMachineImpl`` class
should be specialized by a concrete target class that implements the various
virtual methods.  ``CodeGenTargetMachineImpl`` is defined as a subclass of
``TargetMachine`` in ``include/llvm/CodeGen/CodeGenTargetMachineImpl.h``.  The
``TargetMachine`` class implementation (``lib/Target/TargetMachine.cpp``)
also processes numerous command-line options.

To create a concrete target-specific subclass of ``CodeGenTargetMachineImpl``, start
by copying an existing ``TargetMachine`` class and header.  You should name the
files that you create to reflect your specific target.  For instance, for the
SPARC target, name the files ``SparcTargetMachine.h`` and
``SparcTargetMachine.cpp``.

For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
have access methods to obtain objects that represent target components.  These
methods are named ``get*Info``, and are intended to obtain the instruction set
(``getInstrInfo``), register set (``getRegisterInfo``), stack frame layout
(``getFrameInfo``), and similar information.  ``XXXTargetMachine`` must also
implement the ``getDataLayout`` method to access an object with target-specific
data characteristics, such as data type size and alignment requirements.

For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
simply return a class member.

.. code-block:: c++

  namespace llvm {

  class Module;

  class SparcTargetMachine : public CodeGenTargetMachineImpl {
    const DataLayout DataLayout;       // Calculates type size & alignment
    SparcSubtarget Subtarget;
    SparcInstrInfo InstrInfo;
    TargetFrameInfo FrameInfo;

  protected:
    virtual const TargetAsmInfo *createTargetAsmInfo() const;

  public:
    SparcTargetMachine(const Module &M, const std::string &FS);

    virtual const SparcInstrInfo *getInstrInfo() const {return &InstrInfo; }
    virtual const TargetFrameInfo *getFrameInfo() const {return &FrameInfo; }
    virtual const TargetSubtarget *getSubtargetImpl() const{return &Subtarget; }
    virtual const TargetRegisterInfo *getRegisterInfo() const {
      return &InstrInfo.getRegisterInfo();
    }
    virtual const DataLayout *getDataLayout() const { return &DataLayout; }

    // Pass Pipeline Configuration
    virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
    virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
  };

  } // end namespace llvm

* ``getInstrInfo()``
* ``getRegisterInfo()``
* ``getFrameInfo()``
* ``getDataLayout()``
* ``getSubtargetImpl()``

For some targets, you also need to support the following methods:

* ``getTargetLowering()``
* ``getJITInfo()``

Some architectures, such as GPUs, do not support jumping to an arbitrary
program location.  They implement branching using masked execution and
implement loops using special instructions around the loop body.  To avoid CFG
modifications that introduce irreducible control flow not handled by such
hardware, a target must call ``setRequiresStructuredCFG(true)`` when being
initialized.

In addition, the ``XXXTargetMachine`` constructor should specify a
``TargetDescription`` string that determines the data layout for the target
machine, including characteristics such as pointer size, alignment, and
endianness.  For example, the constructor for ``SparcTargetMachine`` contains
the following:

.. code-block:: c++

  SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
    : DataLayout("E-p:32:32-f128:128:128"),
      Subtarget(M, FS), InstrInfo(Subtarget),
      FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
  }

Hyphens separate portions of the ``TargetDescription`` string.

* An upper-case "``E``" in the string indicates a big-endian target data model.
  A lower-case "``e``" indicates little-endian.

* "``p:``" is followed by pointer information: size, ABI alignment, and
  preferred alignment.  If only two figures follow "``p:``", then the first
  value is pointer size, and the second value is both ABI and preferred
  alignment.

* Then a letter for numeric type alignment: "``i``", "``f``", "``v``", or
  "``a``" (corresponding to integer, floating point, vector, or aggregate).
  "``i``", "``v``", or "``a``" are followed by ABI alignment and preferred
  alignment. "``f``" is followed by three values: the first indicates the size
  of a long double, then ABI alignment, and then ABI preferred alignment.
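
As a rough illustration of how such a string is put together, the following
standalone sketch splits a description on hyphens and decodes only the
endianness and pointer fields.  ``LayoutSummary``, ``splitOn``, and
``summarizeLayout`` are hypothetical names invented for this example; LLVM's
own ``DataLayout`` class does the real parsing and understands many more
specifications than this sketch does.

.. code-block:: c++

  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical summary of the fields discussed above; not an LLVM type.
  struct LayoutSummary {
    bool BigEndian = false;
    unsigned PointerSizeBits = 0;
    unsigned PointerABIAlignBits = 0;
    unsigned PointerPrefAlignBits = 0;
  };

  // Split Text on Sep, dropping empty pieces.
  static std::vector<std::string> splitOn(const std::string &Text, char Sep) {
    std::vector<std::string> Parts;
    std::stringstream Stream(Text);
    std::string Part;
    while (std::getline(Stream, Part, Sep))
      if (!Part.empty())
        Parts.push_back(Part);
    return Parts;
  }

  // Decode only the "E"/"e" and "p:" specifications; all others are ignored.
  LayoutSummary summarizeLayout(const std::string &Description) {
    LayoutSummary Summary;
    for (const std::string &Spec : splitOn(Description, '-')) {
      if (Spec == "E") {
        Summary.BigEndian = true;
      } else if (Spec == "e") {
        Summary.BigEndian = false;
      } else if (Spec.size() > 2 && Spec[0] == 'p' && Spec[1] == ':') {
        std::vector<std::string> Fields = splitOn(Spec.substr(2), ':');
        if (Fields.size() >= 2) {
          Summary.PointerSizeBits = std::stoul(Fields[0]);
          Summary.PointerABIAlignBits = std::stoul(Fields[1]);
          // With only two figures, the second is both ABI and preferred.
          Summary.PointerPrefAlignBits = Fields.size() >= 3
                                             ? std::stoul(Fields[2])
                                             : Summary.PointerABIAlignBits;
        }
      }
    }
    return Summary;
  }

Under these assumptions, ``summarizeLayout("E-p:32:32-f128:128:128")`` reports
a big-endian target with 32-bit pointers whose ABI and preferred alignments are
both 32 bits.
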

Target Registration
===================

You must also register your target with the ``TargetRegistry``, which is what
other LLVM tools use to look up and use your target at runtime.  The
``TargetRegistry`` can be used directly, but for most targets there are helper
templates which should take care of the work for you.

All targets should declare a global ``Target`` object which is used to
represent the target during registration.  Then, in the target's ``TargetInfo``
library, the target should define that object and use the ``RegisterTarget``
template to register the target.  For example, the Sparc registration code
looks like this:

.. code-block:: c++

  // Accessor for the single global Target object that represents Sparc.
  Target &llvm::getTheSparcTarget() {
    static Target TheSparcTarget;
    return TheSparcTarget;
  }

  extern "C" void LLVMInitializeSparcTargetInfo() {
    RegisterTarget<Triple::sparc, /*HasJIT=*/false>
      X(getTheSparcTarget(), "sparc", "Sparc");
  }

This allows the ``TargetRegistry`` to look up the target by name or by target
triple.  In addition, most targets will also register additional features which
are available in separate libraries.  These registration steps are separate,
because some clients may wish to only link in some parts of the target --- the
JIT code generator does not require the use of the assembler printer, for
example.  Here is an example of registering the Sparc assembly printer:

.. code-block:: c++

  extern "C" void LLVMInitializeSparcAsmPrinter() {
    RegisterAsmPrinter<SparcAsmPrinter> X(getTheSparcTarget());
  }

For more information, see "`llvm/Target/TargetRegistry.h
</doxygen/TargetRegistry_8h-source.html>`_".
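
The idea of registering a target once and looking it up by name later can be
pictured with a toy registry.  This is only a conceptual sketch: ``ToyTarget``
and ``ToyRegistry`` are hypothetical stand-ins written for this example, and
they do not reproduce the interface of LLVM's ``Target`` or ``TargetRegistry``
(which, among other things, also matches targets by triple).

.. code-block:: c++

  #include <map>
  #include <string>

  // Hypothetical stand-in for a registered target.
  struct ToyTarget {
    std::string Name;
    std::string Description;
    bool HasJIT;
  };

  // Hypothetical registry keyed by target name.
  class ToyRegistry {
    std::map<std::string, ToyTarget> Targets;

  public:
    // Record a target under its name.  A later registration with the same
    // name is ignored, so the first registration wins.
    void registerTarget(const std::string &Name, const std::string &Desc,
                        bool HasJIT) {
      Targets.emplace(Name, ToyTarget{Name, Desc, HasJIT});
    }

    // Return the registered target, or nullptr when the name is unknown.
    const ToyTarget *lookupTarget(const std::string &Name) const {
      auto It = Targets.find(Name);
      return It == Targets.end() ? nullptr : &It->second;
    }
  };

A tool would call ``registerTarget("sparc", "Sparc", false)`` once during
initialization and later use ``lookupTarget("sparc")`` to find the entry; a
lookup for a name that was never registered yields ``nullptr``.
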

Register Set and Register Classes
=================================

You should describe a concrete target-specific class that represents the
register file of a target machine.  This class is called ``XXXRegisterInfo``
(where ``XXX`` identifies the target) and represents the class register file
data that is used for register allocation.  It also describes the interactions
between registers.

You also need to define register classes to categorize related registers.  A
register class should be added for groups of registers that are all treated the
same way for some instruction.  Typical examples are register classes for
integer, floating-point, or vector registers.  A register allocator allows an
instruction to use any register in a specified register class to perform the
instruction in a similar manner.  Register classes allocate virtual registers
to instructions from these sets, and register classes let the
target-independent register allocator automatically choose the actual
registers.

Much of the code for registers, including register definition, register
aliases, and register classes, is generated by TableGen from
``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
and ``XXXGenRegisterInfo.inc`` output files.  Some of the code in the
implementation of ``XXXRegisterInfo`` requires hand-coding.

Defining a Register
-------------------

The ``XXXRegisterInfo.td`` file typically starts with register definitions for
a target machine.  The ``Register`` class (specified in ``Target.td``) is used
to define an object for each register.  The specified string ``n`` becomes the
``Name`` of the register.  The basic ``Register`` object does not have any
subregisters and does not specify any aliases.

.. code-block:: text

  class Register<string n> {
    string Namespace = "";
    string AsmName = n;
    string Name = n;
    int SpillSize = 0;
    int SpillAlignment = 0;
    list<Register> Aliases = [];
    list<Register> SubRegs = [];
    list<int> DwarfNumbers = [];
  }

For example, in the ``X86RegisterInfo.td`` file, there are register definitions
that utilize the ``Register`` class, such as:

.. code-block:: text

  def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
register.  For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
representing 3 different modes: the first element is for X86-64, the second for
exception handling (EH) on X86-32, and the third is generic. -1 is a special
Dwarf number that indicates the gcc number is undefined, and -2 indicates the
register number is invalid for this mode.
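
The per-mode numbering and the two sentinel values can be pictured with a small
standalone sketch.  The names ``DwarfMode``, ``DwarfNumbers``, and
``getDwarfNumber`` are hypothetical and exist only for this example; the code
that TableGen actually generates for Dwarf numbers is organized differently.

.. code-block:: c++

  #include <array>

  // Hypothetical names for the three modes described above.
  enum DwarfMode { X86_64Mode = 0, X86_32EHMode = 1, GenericMode = 2 };

  const int DwarfUndefined = -1; // the gcc number is undefined
  const int DwarfInvalid = -2;   // the register is invalid for this mode

  // One entry per mode, in the same order as the DwarfRegNum list.
  struct DwarfNumbers {
    std::array<int, 3> PerMode;
  };

  // Return true and set Number only when the mode has a usable number.
  bool getDwarfNumber(const DwarfNumbers &Reg, DwarfMode Mode, int &Number) {
    int Value = Reg.PerMode[Mode];
    if (Value == DwarfUndefined || Value == DwarfInvalid)
      return false;
    Number = Value;
    return true;
  }

For a register declared with ``DwarfRegNum<[0, 0, 0]>``, every mode yields the
number 0, while an entry of -1 or -2 makes the lookup report failure for that
mode.
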

From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
generates this code in the ``X86GenRegisterInfo.inc`` file:

.. code-block:: c++

  static const unsigned GR8[] = { X86::AL, ... };

  const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

  const TargetRegisterDesc RegisterDescriptors[] = {
    ...
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

From the register info file, TableGen generates a ``TargetRegisterDesc`` object
for each register.  ``TargetRegisterDesc`` is defined in
``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:

.. code-block:: c++

  struct TargetRegisterDesc {
    const char     *AsmName;      // Assembly language name for the register
    const char     *Name;         // Printable name for the reg (for debugging)
    const unsigned *AliasSet;     // Register Alias Set
    const unsigned *SubRegs;      // Sub-register set
    const unsigned *ImmSubRegs;   // Immediate sub-register set
    const unsigned *SuperRegs;    // Super-register set
  };

TableGen uses the entire target description file (``.td``) to determine text
names for the register (in the ``AsmName`` and ``Name`` fields of
``TargetRegisterDesc``) and the relationships of other registers to the defined
register (in the other ``TargetRegisterDesc`` fields).  In this example, other
definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
aliases for one another, so TableGen generates a null-terminated array
(``AL_AliasSet``) for this register alias set.
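
A zero-terminated alias array of this kind can be scanned without knowing its
length.  The sketch below is a minimal standalone example of that idiom; the
``ToyReg`` enumeration, ``ToyAL_AliasSet``, and ``isInAliasSet`` are
hypothetical names, and the register numbers are invented rather than taken
from the generated X86 tables.

.. code-block:: c++

  // Hypothetical register numbers.  0 is reserved as the list terminator,
  // matching the trailing 0 in AL_AliasSet above.
  enum ToyReg : unsigned { NoReg = 0, AL, AX, EAX, RAX, BL };

  const unsigned ToyAL_AliasSet[] = {AX, EAX, RAX, 0};

  // Walk a zero-terminated alias list and report whether Reg appears in it.
  bool isInAliasSet(const unsigned *AliasSet, unsigned Reg) {
    for (const unsigned *Alias = AliasSet; *Alias != 0; ++Alias)
      if (*Alias == Reg)
        return true;
    return false;
  }

With these toy values, ``isInAliasSet(ToyAL_AliasSet, EAX)`` is true and
``isInAliasSet(ToyAL_AliasSet, BL)`` is false.
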

The ``Register`` class is commonly used as a base class for more complex
classes.  In ``Target.td``, the ``Register`` class is the base for the
``RegisterWithSubRegs`` class that is used to define registers that need to
specify subregisters in the ``SubRegs`` list, as shown here:

.. code-block:: text

  class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
    let SubRegs = subregs;
  }

In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
and ``Rd``.  SPARC registers are identified by 5-bit ID numbers, which is a
feature common to these subclasses.  Note the use of "``let``" expressions to
override values that are initially defined in a superclass (such as the
``SubRegs`` field in the ``Rd`` class).

.. code-block:: text

  class SparcReg<string n> : Register<n> {
    field bits<5> Num;
    let Namespace = "SP";
  }
  // Ri - 32-bit integer registers
  class Ri<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rf - 32-bit floating-point registers
  class Rf<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rd - Slots in the FP register file for 64-bit floating-point values.
  class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
    let Num = num;
    let SubRegs = subregs;
  }

In the ``SparcRegisterInfo.td`` file, there are register definitions that
utilize these subclasses of ``Register``, such as:

.. code-block:: text

  def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
  def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
  ...
  def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
  def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
  ...
  def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
  def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

The last two registers shown above (``D0`` and ``D1``) are double-precision
floating-point registers that are aliases for pairs of single-precision
floating-point sub-registers.  In addition to aliases, the sub-register and
super-register relationships of the defined register are in fields of a
register's ``TargetRegisterDesc``.
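
The pairing of ``D`` registers with ``F`` sub-registers follows a simple
arithmetic pattern in the two definitions shown.  The standalone sketch below
assumes that the same pattern continues for the remaining double-precision
registers, which the excerpt above does not show; ``subRegsOfD`` and
``dOverlapsF`` are hypothetical helper names.

.. code-block:: c++

  #include <string>
  #include <utility>

  // For the double-precision register D<Index>, return the names of the two
  // single-precision sub-registers it covers, following the pattern of
  // D0 = [F0, F1] and D1 = [F2, F3] shown above.
  std::pair<std::string, std::string> subRegsOfD(unsigned Index) {
    unsigned Even = 2 * Index;
    return {"F" + std::to_string(Even), "F" + std::to_string(Even + 1)};
  }

  // Report whether F<FIndex> is one of the two sub-registers of D<DIndex>.
  bool dOverlapsF(unsigned DIndex, unsigned FIndex) {
    return FIndex / 2 == DIndex;
  }

Under that assumption, ``subRegsOfD(1)`` yields ``F2`` and ``F3``, so a value
held in ``D1`` occupies both of those registers and conflicts with either.
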

Defining a Register Class
-------------------------

The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
object that represents a group of related registers and also defines the
default allocation order of the registers.  A target description file
``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
using the following class:

.. code-block:: text

  class RegisterClass<string namespace,
  list<ValueType> regTypes, int alignment, dag regList> {
    string Namespace = namespace;
    list<ValueType> RegTypes = regTypes;
    int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
    int Alignment = alignment;

    // CopyCost is the cost of copying a value between two registers
    // default value 1 means a single instruction
    // A negative value means copying is extremely expensive or impossible
    int CopyCost = 1;
    dag MemberList = regList;

    // for register classes that are subregisters of this class
    list<RegisterClass> SubRegClassList = [];

    code MethodProtos = [{}];  // to insert arbitrary code
    code MethodBodies = [{}];
  }

To define a ``RegisterClass``, use the following 4 arguments:

* The first argument of the definition is the name of the namespace.

* The second argument is a list of ``ValueType`` register type values that are
  defined in ``include/llvm/CodeGen/ValueTypes.td``.  Defined values include
  integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
  floating-point types (``f32``, ``f64``), and vector types (for example,
  ``v8i16`` for an ``8 x i16`` vector).  All registers in a ``RegisterClass``
  must have the same ``ValueType``, but some registers may store vector data in
  different configurations.  For example, a register that can process a 128-bit
  vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
  32-bit integers, and so on.

* The third argument of the ``RegisterClass`` definition specifies the
  alignment required of the registers when they are stored to or loaded from
  memory.

* The final argument, ``regList``, specifies which registers are in this class.
  If an alternative allocation order method is not specified, then ``regList``
  also defines the order of allocation used by the register allocator.  Besides
  simply listing registers with ``(add R0, R1, ...)``, more advanced set
  operators are available.  See ``include/llvm/Target/Target.td`` for more
  information.

In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
``FPRegs``, ``DFPRegs``, and ``IntRegs``.  For all three register classes, the
first argument defines the namespace with the string "``SP``".  ``FPRegs``
defines a group of 32 single-precision floating-point registers (``F0`` to
``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
(``D0-D15``).

.. code-block:: text

  // F0, F1, F2, ..., F31
  def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;

  def DFPRegs : RegisterClass<"SP", [f64], 64,
                              (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
                                   D9, D10, D11, D12, D13, D14, D15)>;

  def IntRegs : RegisterClass<"SP", [i32], 32,
      (add L0, L1, L2, L3, L4, L5, L6, L7,
           I0, I1, I2, I3, I4, I5,
           O0, O1, O2, O3, O4, O5, O7,
           G1,
           // Non-allocatable regs:
           G2, G3, G4,
           O6,        // stack ptr
           I6,        // frame ptr
           I7,        // return address
           G0,        // constant zero
           G5, G6, G7 // reserved for kernel
      )>;
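
The ``(sequence "F%u", 0, 31)`` operator in ``FPRegs`` stands for the list
``F0``, ``F1``, ..., ``F31``, as the comment above it indicates.  The following
standalone sketch mimics that expansion in plain C++ to make the meaning
concrete; ``expandSequence`` is a hypothetical helper and says nothing about
how TableGen implements the operator.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Expand a pattern containing a single "%u" placeholder over the inclusive
  // range [From, To], producing one name per value.
  std::vector<std::string> expandSequence(const std::string &Pattern,
                                          unsigned From, unsigned To) {
    std::vector<std::string> Names;
    std::string::size_type Pos = Pattern.find("%u");
    for (unsigned I = From; I <= To; ++I) {
      std::string Name = Pattern;
      if (Pos != std::string::npos)
        Name.replace(Pos, 2, std::to_string(I));
      Names.push_back(Name);
    }
    return Names;
  }

Calling ``expandSequence("F%u", 0, 31)`` produces 32 names, beginning with
``F0`` and ending with ``F31``, in the order in which the register class lists
them.
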

Using ``SparcRegisterInfo.td`` with TableGen generates several output files
that are intended for inclusion in other source code that you write.
``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
be included in the header file for the SPARC register implementation that you
write (``SparcRegisterInfo.h``).  In ``SparcGenRegisterInfo.h.inc`` a new
structure is defined called ``SparcGenRegisterInfo`` that uses
``TargetRegisterInfo`` as its base.  It also specifies types, based upon the
defined register classes: ``DFPRegsClass``, ``FPRegsClass``, and
``IntRegsClass``.

``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
implementation.  The code below shows only the generated integer registers and
associated register classes.  The order of registers in ``IntRegs`` reflects
the order in the definition of ``IntRegs`` in the target description file.

.. code-block:: c++

  // IntRegs Register Class...
  static const unsigned IntRegs[] = {
    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
    SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
    SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
    SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
    SP::G6, SP::G7,
  };

  // IntRegsVTs Register Class Value Types...
  static const MVT::ValueType IntRegsVTs[] = {
    MVT::i32, MVT::Other
  };

  namespace SP {   // Register class instances
    DFPRegsClass    DFPRegsRegClass;
    FPRegsClass     FPRegsRegClass;
    IntRegsClass    IntRegsRegClass;
  ...
    // IntRegs Sub-register Classes...
    static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Super-register Classes..
    static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class sub-classes...
    static const TargetRegisterClass* const IntRegsSubclasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class super-classes...
    static const TargetRegisterClass* const IntRegsSuperclasses [] = {
      NULL
    };

    IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
      IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
      IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
  }

The register allocators will avoid using reserved registers, and callee saved
registers are not used until all the volatile registers have been used.  That
is usually good enough, but in some cases it may be necessary to provide custom
allocation orders.
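
The default policy described above can be sketched as a small standalone
function.  This is an illustration of the stated policy only, not the
allocator's actual algorithm: ``ToyRegInfo`` and ``allocationOrder`` are
hypothetical names, and the reserved and callee-saved flags are supplied by the
caller rather than derived from a real target.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Hypothetical description of one register for this sketch.
  struct ToyRegInfo {
    std::string Name;
    bool Reserved;
    bool CalleeSaved;
  };

  // Build an allocation order: skip reserved registers, list volatile
  // (caller-saved) registers first, then callee-saved ones.  Within each
  // group, the order of the input list is preserved.
  std::vector<std::string>
  allocationOrder(const std::vector<ToyRegInfo> &Regs) {
    std::vector<std::string> Order;
    for (const ToyRegInfo &Reg : Regs)
      if (!Reg.Reserved && !Reg.CalleeSaved)
        Order.push_back(Reg.Name);
    for (const ToyRegInfo &Reg : Regs)
      if (!Reg.Reserved && Reg.CalleeSaved)
        Order.push_back(Reg.Name);
    return Order;
  }

A custom allocation order amounts to replacing this default ordering with one
chosen by the target.
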

Implement a subclass of ``TargetRegisterInfo``
----------------------------------------------

The final step is to hand code portions of ``XXXRegisterInfo``, which
implements the interface described in ``TargetRegisterInfo.h`` (see
:ref:`TargetRegisterInfo`).  These functions return ``0``, ``NULL``, or
``false``, unless overridden.  Here is a list of functions that are overridden
for the SPARC implementation in ``SparcRegisterInfo.cpp``:

* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
  order of the desired callee-save stack frame offset.

* ``getReservedRegs`` --- Returns a bitset indexed by physical register
  numbers, indicating if a particular register is unavailable.

* ``hasFP`` --- Returns a Boolean indicating if a function should have a
  dedicated frame pointer register.

* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
  instructions are used, this can be called to eliminate them.

* ``eliminateFrameIndex`` --- Eliminates abstract frame indices from
  instructions that may use them.

* ``emitPrologue`` --- Inserts prologue code into the function.

* ``emitEpilogue`` --- Inserts epilogue code into the function.
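
To make the idea behind ``eliminateFrameIndex`` more concrete, the standalone
sketch below replaces abstract frame indices with byte offsets.  It is a
conceptual model only: ``ToyOperand`` and ``eliminateFrameIndices`` are
hypothetical, and the real hook has a different signature and operates on
machine instructions rather than on a plain vector.

.. code-block:: c++

  #include <vector>

  // Hypothetical operand: either an abstract frame index or a concrete value
  // such as a byte offset.
  struct ToyOperand {
    bool IsFrameIndex;
    int Value; // frame index, or the concrete value once resolved
  };

  // Replace each frame-index operand with the byte offset of the stack object
  // it names.  ObjectOffsets[i] is the offset assigned to frame index i.
  void eliminateFrameIndices(std::vector<ToyOperand> &Operands,
                             const std::vector<int> &ObjectOffsets) {
    for (ToyOperand &Op : Operands) {
      if (!Op.IsFrameIndex)
        continue;
      // at() throws std::out_of_range for an index with no stack object.
      Op.Value = ObjectOffsets.at(Op.Value);
      Op.IsFrameIndex = false;
    }
  }

After the call, no operand in the list refers to a frame index any more;
operands that were already concrete are left untouched.
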

.. _instruction-set:

Instruction Set
===============

During the early stages of code generation, the LLVM IR code is converted to a
``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
containing target instructions.  An ``SDNode`` has an opcode, operands, type
requirements, and operation properties (for example, whether an operation is
commutative or whether it loads from memory).  The various operation node
types are described in the ``include/llvm/CodeGen/SelectionDAGNodes.h`` file
(values of the ``NodeType`` enum in the ``ISD`` namespace).

TableGen uses the following target description (``.td``) input files to
generate much of the code for instruction definition:

* ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
  other fundamental classes are defined.

* ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
  generators, contains ``SDTC*`` classes (selection DAG type constraint),
  definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
  ``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
  ``PatFrag``, ``PatLeaf``, ``ComplexPattern``).

* ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
  instructions.

* ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
  condition codes, and instructions of an instruction set.  For architecture
  modifications, a different file name may be used.  For example, for Pentium
  with SSE instructions, this file is ``X86InstrSSE.td``, and for Pentium with
  MMX, this file is ``X86InstrMMX.td``.

There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
the target.  The ``XXX.td`` file includes the other ``.td`` input files, but
its contents are only directly important for subtargets.
690
691You should describe a concrete target-specific class ``XXXInstrInfo`` that
692represents machine instructions supported by a target machine.
693``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
694which describes one instruction.  An instruction descriptor defines:
695
696* Opcode mnemonic
697* Number of operands
698* List of implicit register definitions and uses
699* Target-independent properties (such as memory access, is commutable)
700* Target-specific flags
701
702The Instruction class (defined in ``Target.td``) is mostly used as a base for
703more complex instruction classes.
704
705.. code-block:: text
706
707  class Instruction {
708    string Namespace = "";
709    dag OutOperandList;    // A dag containing the MI def operand list.
710    dag InOperandList;     // A dag containing the MI use operand list.
711    string AsmString = ""; // The .s format to print the instruction with.
712    list<dag> Pattern;     // Set to the DAG pattern for this instruction.
713    list<Register> Uses = [];
714    list<Register> Defs = [];
715    list<Predicate> Predicates = [];  // predicates turned into isel match code
716    ... remainder not shown for space ...
717  }
718
719A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
720target-specific instruction that is defined in ``XXXInstrInfo.td``.  The
721instruction objects should represent instructions from the architecture manual
722of the target machine (such as the SPARC Architecture Manual for the SPARC
723target).
724
725A single instruction from the architecture manual is often modeled as multiple
726target instructions, depending upon its operands.  For example, a manual might
727describe an add instruction that takes a register or an immediate operand.  An
728LLVM target could model this with two instructions named ``ADDri`` and
729``ADDrr``.
730
731You should define a class for each instruction category and define each opcode
732as a subclass of the category with appropriate parameters such as the fixed
733binary encoding of opcodes and extended opcodes.  You should map the register
734bits to the bits of the instruction in which they are encoded (for the JIT).
735Also you should specify how the instruction should be printed when the
736automatic assembly printer is used.
737
738As is described in the SPARC Architecture Manual, Version 8, there are three
739major 32-bit formats for instructions.  Format 1 is only for the ``CALL``
740instruction.  Format 2 is for branch on condition codes and ``SETHI`` (set high
741bits of a register) instructions.  Format 3 is for other instructions.
742
Each of these formats has corresponding classes in ``SparcInstrFormats.td``.
``InstSP`` is a base class for other instruction classes.  Additional base
classes are specified for more precise formats: for example, in
``SparcInstrFormats.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
branches.  There are three other base classes: ``F3_1`` for register/register
operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
floating-point operations.  ``SparcInstrInfo.td`` also adds the base class
``Pseudo`` for synthetic SPARC instructions.
751
752``SparcInstrInfo.td`` largely consists of operand and instruction definitions
753for the SPARC target.  In ``SparcInstrInfo.td``, the following target
754description file entry, ``LDrr``, defines the Load Integer instruction for a
755Word (the ``LD`` SPARC opcode) from a memory address to a register.  The first
756parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
757category of operation.  The second parameter (``000000``\ :sub:`2`) is the
758specific operation value for ``LD``/Load Word.  The third parameter is the
759output destination, which is a register operand and defined in the ``Register``
760target description file (``IntRegs``).
761
762.. code-block:: text
763
  def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$rd), (ins (MEMrr $rs1, $rs2):$addr),
                   "ld [$addr], $rd",
                   [(set i32:$rd, (load ADDRrr:$addr))]>;
767
768The fourth parameter is the input source, which uses the address operand
769``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:
770
771.. code-block:: text
772
773  def MEMrr : Operand<i32> {
774    let PrintMethod = "printMemOperand";
775    let MIOperandInfo = (ops IntRegs, IntRegs);
776  }
777
778The fifth parameter is a string that is used by the assembly printer and can be
779left as an empty string until the assembly printer interface is implemented.
780The sixth and final parameter is the pattern used to match the instruction
781during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
782This parameter is detailed in the next section, :ref:`instruction-selector`.
783
Instruction class definitions are not overloaded for different operand types,
so separate versions of instructions are needed for register, memory, or
immediate value operands.  For example, to perform a Load Integer instruction
for a Word from an address formed by a register plus an immediate offset, the
following instruction class is defined:
789
790.. code-block:: text
791
  def LDri : F3_2 <3, 0b000000, (outs IntRegs:$rd), (ins (MEMri $rs1, $simm13):$addr),
                   "ld [$addr], $rd",
                   [(set i32:$rd, (load ADDRri:$addr))]>;
795
796Writing these definitions for so many similar instructions can involve a lot of
797cut and paste.  In ``.td`` files, the ``multiclass`` directive enables the
798creation of templates to define several instruction classes at once (using the
799``defm`` directive).  For example in ``SparcInstrInfo.td``, the ``multiclass``
800pattern ``F3_12`` is defined to create 2 instruction classes each time
801``F3_12`` is invoked:
802
803.. code-block:: text
804
805  multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
806    def rr  : F3_1 <2, Op3Val,
                   (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2),
808                   !strconcat(OpcStr, " $rs1, $rs2, $rd"),
809                   [(set i32:$rd, (OpNode i32:$rs1, i32:$rs2))]>;
810    def ri  : F3_2 <2, Op3Val,
811                   (outs IntRegs:$rd), (ins IntRegs:$rs1, i32imm:$simm13),
812                   !strconcat(OpcStr, " $rs1, $simm13, $rd"),
813                   [(set i32:$rd, (OpNode i32:$rs1, simm13:$simm13))]>;
814  }
815
816So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
817instructions, as seen below, it creates four instruction objects: ``XORrr``,
818``XORri``, ``ADDrr``, and ``ADDri``.
819
820.. code-block:: text
821
822  defm XOR   : F3_12<"xor", 0b000011, xor>;
823  defm ADD   : F3_12<"add", 0b000000, add>;
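
The naming rule that ``defm`` applies here can be modeled in a few lines of
C++.  This is a toy illustration only: ``expandDefm`` is a hypothetical
function, and TableGen does far more than concatenate names when it
instantiates a ``multiclass``.

.. code-block:: c++

  #include <string>
  #include <vector>

  namespace sketch {

  // Model of the record names produced by `defm Prefix : SomeMulticlass<...>`:
  // the defm name is prepended to the name of every `def` in the multiclass.
  // (Hypothetical helper, not part of TableGen or LLVM.)
  std::vector<std::string>
  expandDefm(const std::string &Prefix,
             const std::vector<std::string> &DefSuffixes) {
    std::vector<std::string> Names;
    for (const std::string &Suffix : DefSuffixes)
      Names.push_back(Prefix + Suffix);
    return Names;
  }

  } // namespace sketch

With the two ``def`` entries of ``F3_12`` (``rr`` and ``ri``),
``sketch::expandDefm("XOR", {"rr", "ri"})`` yields ``XORrr`` and ``XORri``.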
824
``SparcInstrInfo.td`` also includes definitions for condition codes that are
referenced by branch instructions.  The following definitions in
``SparcInstrInfo.td`` assign a numeric value to each SPARC condition code.
For example, the value 10 represents the "greater than" condition for
integers, and the value 22 represents the "greater than" condition for
floats.
831
832.. code-block:: text
833
834  def ICC_NE  : ICC_VAL< 9>;  // Not Equal
835  def ICC_E   : ICC_VAL< 1>;  // Equal
836  def ICC_G   : ICC_VAL<10>;  // Greater
837  ...
838  def FCC_U   : FCC_VAL<23>;  // Unordered
839  def FCC_G   : FCC_VAL<22>;  // Greater
840  def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
841  ...
842
843(Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
844condition codes.  Care must be taken to ensure the values in ``Sparc.h``
845correspond to the values in ``SparcInstrInfo.td``.  I.e., ``SPCC::ICC_NE = 9``,
846``SPCC::FCC_U = 23`` and so on.)
847
848Instruction Operand Mapping
849---------------------------
850
The code generator backend maps instruction operands to fields in the
instruction.  Whenever a bit in the instruction encoding ``Inst`` is assigned
to a field without a concrete value, an operand from the ``outs`` or ``ins``
list is expected to have a matching name.  This operand then populates that
undefined field.  For example, the Sparc target defines the ``XNORrr``
instruction as an ``F3_1`` format instruction having three operands: the output
``$rd`` and the inputs ``$rs1`` and ``$rs2``.
858
859.. code-block:: text
860
861  def XNORrr  : F3_1<2, 0b000111,
862                     (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2),
863                     "xnor $rs1, $rs2, $rd",
864                     [(set i32:$rd, (not (xor i32:$rs1, i32:$rs2)))]>;
865
866The instruction templates in ``SparcInstrFormats.td`` show the base class for
867``F3_1`` is ``InstSP``.
868
869.. code-block:: text
870
871  class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
872    field bits<32> Inst;
873    let Namespace = "SP";
874    bits<2> op;
875    let Inst{31-30} = op;
876    dag OutOperandList = outs;
877    dag InOperandList = ins;
878    let AsmString   = asmstr;
879    let Pattern = pattern;
880  }
881
882``InstSP`` defines the ``op`` field, and uses it to define bits 30 and 31 of the
883instruction, but does not assign a value to it.
884
885.. code-block:: text
886
887  class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
888      : InstSP<outs, ins, asmstr, pattern> {
889    bits<5> rd;
890    bits<6> op3;
891    bits<5> rs1;
892    let op{1} = 1;   // Op = 2 or 3
893    let Inst{29-25} = rd;
894    let Inst{24-19} = op3;
895    let Inst{18-14} = rs1;
896  }
897
898``F3`` defines the ``rd``, ``op3``, and ``rs1`` fields, and uses them in the
899instruction, and again does not assign values.
900
901.. code-block:: text
902
903  class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
904             string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
905    bits<8> asi = 0; // asi not currently used
906    bits<5> rs2;
907    let op         = opVal;
908    let op3        = op3val;
909    let Inst{13}   = 0;     // i field = 0
910    let Inst{12-5} = asi;   // address space identifier
911    let Inst{4-0}  = rs2;
912  }
913
914``F3_1`` assigns a value to ``op`` and ``op3`` fields, and defines the ``rs2``
915field.  Therefore, a ``F3_1`` format instruction will require a definition for
916``rd``, ``rs1``, and ``rs2`` in order to fully specify the instruction encoding.
917
918The ``XNORrr`` instruction then provides those three operands in its
919OutOperandList and InOperandList, which bind to the corresponding fields, and
920thus complete the instruction encoding.
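
The way these ``let Inst{...}`` assignments build up an encoding can be
mirrored in ordinary C++.  The following is only a sketch: ``encodeF3_1`` is a
hypothetical helper, not part of LLVM, and it hard-codes the bit ranges shown
in ``InstSP``, ``F3``, and ``F3_1`` above.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  namespace sketch {

  // Pack the fields of a SPARC format-3 register/register instruction into a
  // 32-bit word.  Hypothetical helper; the bit positions are taken from the
  // InstSP, F3, and F3_1 class definitions.
  inline uint32_t encodeF3_1(unsigned op, unsigned rd, unsigned op3,
                             unsigned rs1, unsigned rs2, unsigned asi = 0) {
    assert(op < 4 && rd < 32 && op3 < 64 && rs1 < 32 && rs2 < 32 &&
           asi < 256 && "field value does not fit in its bit range");
    return (uint32_t(op)  << 30) |  // Inst{31-30} = op
           (uint32_t(rd)  << 25) |  // Inst{29-25} = rd
           (uint32_t(op3) << 19) |  // Inst{24-19} = op3
           (uint32_t(rs1) << 14) |  // Inst{18-14} = rs1
           (uint32_t(0)   << 13) |  // Inst{13}    = 0 (i field)
           (uint32_t(asi) << 5)  |  // Inst{12-5}  = asi
            uint32_t(rs2);          // Inst{4-0}   = rs2
  }

  } // namespace sketch

For ``XNORrr`` (``op`` = 2, ``op3`` = ``0b000111``) with ``rd`` = 3,
``rs1`` = 1, and ``rs2`` = 2, ``sketch::encodeF3_1(2, 3, 0b000111, 1, 2)``
evaluates to ``0x86384002``.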
921
922For some instructions, a single operand may contain sub-operands. As shown
923earlier, the instruction ``LDrr`` uses an input operand of type ``MEMrr``. This
924operand type contains two register sub-operands, defined by the
925``MIOperandInfo`` value to be ``(ops IntRegs, IntRegs)``.
926
927.. code-block:: text
928
  def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$rd), (ins (MEMrr $rs1, $rs2):$addr),
                   "ld [$addr], $rd",
                   [(set i32:$rd, (load ADDRrr:$addr))]>;
932
933As this instruction is also the ``F3_1`` format, it will expect operands named
934``rd``, ``rs1``, and ``rs2`` as well. In order to allow this, a complex operand
935can optionally give names to each of its sub-operands. In this example
936``MEMrr``'s first sub-operand is named ``$rs1``, the second ``$rs2``, and the
937operand as a whole is also given the name ``$addr``.
938
939When a particular instruction doesn't use all the operands that the instruction
940format defines, a constant value may instead be bound to one or all. For
941example, the ``RDASR`` instruction only takes a single register operand, so we
942assign a constant zero to ``rs2``:
943
944.. code-block:: text
945
946  let rs2 = 0 in
947    def RDASR : F3_1<2, 0b101000,
948                     (outs IntRegs:$rd), (ins ASRRegs:$rs1),
949                     "rd $rs1, $rd", []>;
950
951Instruction Operand Name Mapping
952^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
953
TableGen will also generate a function called ``getNamedOperandIdx()`` which
can be used to look up an operand's index in a ``MachineInstr`` based on its
TableGen name.  Setting the ``UseNamedOperandTable`` bit in an instruction's
TableGen definition will add all of its operands to an enumeration in the
``llvm::XXX::OpName`` namespace and also add an entry for it into the
``OperandMap`` table, which can be queried using ``getNamedOperandIdx()``.
960
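.. code-block:: c++

  // Operand names follow the XNORrr definition shown earlier; this assumes
  // UseNamedOperandTable is set for the instructions involved.
  int RdIndex  = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::rd);     // => 0
  int Rs1Index = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::rs1);    // => 1
  int Rs2Index = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::rs2);    // => 2
  // A name that is not an operand of XNORrr yields -1.
  int ImmIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::simm13); // => -1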
967
968  ...
969
970The entries in the OpName enum are taken verbatim from the TableGen definitions,
971so operands with lowercase names will have lower case entries in the enum.
972
973To include the getNamedOperandIdx() function in your backend, you will need
974to define a few preprocessor macros in XXXInstrInfo.cpp and XXXInstrInfo.h.
975For example:
976
977XXXInstrInfo.cpp:
978
979.. code-block:: c++
980
981  #define GET_INSTRINFO_NAMED_OPS // For getNamedOperandIdx() function
982  #include "XXXGenInstrInfo.inc"
983
984XXXInstrInfo.h:
985
986.. code-block:: c++
987
988  #define GET_INSTRINFO_OPERAND_ENUM // For OpName enum
989  #include "XXXGenInstrInfo.inc"
990
991  namespace XXX {
992    int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIndex);
993  } // End namespace XXX
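
The lookup performed by ``getNamedOperandIdx()`` can be pictured as a table
keyed by opcode and operand name.  The following self-contained model is a
sketch only: the ``sketch`` namespace, its enums, and the ``std::map``
representation are assumptions made for illustration, and the table that
TableGen actually emits is laid out differently.

.. code-block:: c++

  #include <cstdint>
  #include <map>
  #include <utility>

  namespace sketch {

  enum Opcode : uint16_t { XNORrr, RDASR };     // hypothetical opcode enum
  namespace OpName {
  enum : uint16_t { rd, rs1, rs2 };             // names taken from the .td defs
  } // namespace OpName

  // Return the index of the named operand in the instruction's operand list,
  // or -1 if the instruction has no operand with that name.
  int16_t getNamedOperandIdx(uint16_t Opc, uint16_t Name) {
    static const std::map<std::pair<uint16_t, uint16_t>, int16_t> OperandMap = {
        {{XNORrr, OpName::rd}, 0},
        {{XNORrr, OpName::rs1}, 1},
        {{XNORrr, OpName::rs2}, 2},
        {{RDASR, OpName::rd}, 0},
        {{RDASR, OpName::rs1}, 1},
    };
    auto It = OperandMap.find({Opc, Name});
    return It == OperandMap.end() ? int16_t(-1) : It->second;
  }

  } // namespace sketch

In this model ``RDASR`` has no ``rs2`` operand, so
``sketch::getNamedOperandIdx(sketch::RDASR, sketch::OpName::rs2)`` returns
``-1``.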
994
995Instruction Operand Types
996^^^^^^^^^^^^^^^^^^^^^^^^^
997
998TableGen will also generate an enumeration consisting of all named Operand
999types defined in the backend, in the llvm::XXX::OpTypes namespace.
1000Some common immediate Operand types (for instance i8, i32, i64, f32, f64)
1001are defined for all targets in ``include/llvm/Target/Target.td``, and are
1002available in each Target's OpTypes enum.  Also, only named Operand types appear
1003in the enumeration: anonymous types are ignored.
1004For example, the X86 backend defines ``brtarget`` and ``brtarget8``, both
1005instances of the TableGen ``Operand`` class, which represent branch target
1006operands:
1007
1008.. code-block:: text
1009
1010  def brtarget : Operand<OtherVT>;
1011  def brtarget8 : Operand<OtherVT>;
1012
1013This results in:
1014
1015.. code-block:: c++
1016
1017  namespace X86 {
1018  namespace OpTypes {
1019  enum OperandType {
1020    ...
1021    brtarget,
1022    brtarget8,
1023    ...
1024    i32imm,
1025    i64imm,
1026    ...
    OPERAND_TYPE_LIST_END
  };
  } // End namespace OpTypes
  } // End namespace X86
1030
1031In typical TableGen fashion, to use the enum, you will need to define a
1032preprocessor macro:
1033
1034.. code-block:: c++
1035
1036  #define GET_INSTRINFO_OPERAND_TYPES_ENUM // For OpTypes enum
1037  #include "XXXGenInstrInfo.inc"
1038
1039
1040Instruction Scheduling
1041----------------------
1042
Instruction itineraries can be queried using ``MCInstrDesc::getSchedClass()``.
The value can be named by an enumeration in the ``llvm::XXX::Sched`` namespace
generated by TableGen in ``XXXGenInstrInfo.inc``.  The names of the schedule
classes are the same as provided in ``XXXSchedule.td`` plus a default
``NoItinerary`` class.
1047
1048The schedule models are generated by TableGen by the SubtargetEmitter,
1049using the ``CodeGenSchedModels`` class. This is distinct from the itinerary
1050method of specifying machine resource use.  The tool ``utils/schedcover.py``
1051can be used to determine which instructions have been covered by the
1052schedule model description and which haven't. The first step is to use the
1053instructions below to create an output file. Then run ``schedcover.py`` on the
1054output file:
1055
1056.. code-block:: shell
1057
1058  $ <src>/utils/schedcover.py <build>/lib/Target/AArch64/tblGenSubtarget.with
1059  instruction, default, CortexA53Model, CortexA57Model, CycloneModel, ExynosM3Model, FalkorModel, KryoModel, ThunderX2T99Model, ThunderXT8XModel
1060  ABSv16i8, WriteV, , , CyWriteV3, M3WriteNMISC1, FalkorWr_2VXVY_2cyc, KryoWrite_2cyc_XY_XY_150ln, ,
1061  ABSv1i64, WriteV, , , CyWriteV3, M3WriteNMISC1, FalkorWr_1VXVY_2cyc, KryoWrite_2cyc_XY_noRSV_67ln, ,
1062  ...
1063
To capture the debug output from generating a schedule model, change to the
appropriate target directory and use the following command with the
``subtarget-emitter`` debug option:
1067
1068.. code-block:: shell
1069
1070  $ <build>/bin/llvm-tblgen -debug-only=subtarget-emitter -gen-subtarget \
1071    -I <src>/lib/Target/<target> -I <src>/include \
1072    -I <src>/lib/Target <src>/lib/Target/<target>/<target>.td \
1073    -o <build>/lib/Target/<target>/<target>GenSubtargetInfo.inc.tmp \
1074    > tblGenSubtarget.dbg 2>&1
1075
Where ``<build>`` is the build directory, ``<src>`` is the source directory,
and ``<target>`` is the name of the target.
To double check that the above command is what is needed, one can capture the
exact TableGen command from a build by using:
1080
1081.. code-block:: shell
1082
1083  $ VERBOSE=1 make ...
1084
1085and search for ``llvm-tblgen`` commands in the output.
1086
1087
1088Instruction Relation Mapping
1089----------------------------
1090
This TableGen feature is used to relate instructions with each other.  It is
particularly useful when you have multiple instruction formats and need to
switch between them after instruction selection.  This entire feature is driven
by relation models which can be defined in ``XXXInstrInfo.td`` files
according to the target-specific instruction set.  Relation models are defined
using the ``InstrMapping`` class as a base.  TableGen parses all the models
and generates instruction relation maps using the specified information.
Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
along with the functions to query them.  For detailed information on how to
use this feature, please refer to :doc:`HowToUseInstrMappings`.
1101
1102Implement a subclass of ``TargetInstrInfo``
1103-------------------------------------------
1104
1105The final step is to hand code portions of ``XXXInstrInfo``, which implements
1106the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
1107These functions return ``0`` or a Boolean or they assert, unless overridden.
1108Here's a list of functions that are overridden for the SPARC implementation in
1109``SparcInstrInfo.cpp``:
1110
1111* ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
1112  load from a stack slot, return the register number of the destination and the
1113  ``FrameIndex`` of the stack slot.
1114
* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
  store to a stack slot, return the register number of the source and the
  ``FrameIndex`` of the stack slot.
1118
1119* ``copyPhysReg`` --- Copy values between a pair of physical registers.
1120
1121* ``storeRegToStackSlot`` --- Store a register value to a stack slot.
1122
1123* ``loadRegFromStackSlot`` --- Load a register value from a stack slot.
1124
1125* ``storeRegToAddr`` --- Store a register value to memory.
1126
1127* ``loadRegFromAddr`` --- Load a register value from memory.
1128
1129* ``foldMemoryOperand`` --- Attempt to combine instructions of any load or
1130  store instruction for the specified operand(s).
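
To make the contract of the first hook in this list concrete, the following
sketch applies it to a deliberately simplified instruction model.  Everything
in the ``sketch`` namespace is hypothetical: the operand layout (destination
register, frame index, zero offset) and the convention that ``0`` means "not a
direct stack-slot load" are assumptions of this example, not a copy of
``SparcInstrInfo.cpp``.

.. code-block:: c++

  #include <vector>

  namespace sketch {

  struct Operand {
    enum Kind { Register, FrameIndex, Immediate } K;
    int Value; // register number, frame index, or immediate, depending on K
  };

  struct Instr {
    enum Opcode { LDri, STri, ADDrr } Opc;
    std::vector<Operand> Ops;
  };

  // If MI is a direct load from a stack slot, store the slot in FrameIndex
  // and return the destination register number; otherwise return 0.
  unsigned isLoadFromStackSlot(const Instr &MI, int &FrameIndex) {
    if (MI.Opc != Instr::LDri || MI.Ops.size() != 3)
      return 0;
    const Operand &Dst = MI.Ops[0];
    const Operand &Base = MI.Ops[1];
    const Operand &Offset = MI.Ops[2];
    if (Dst.K == Operand::Register && Base.K == Operand::FrameIndex &&
        Offset.K == Operand::Immediate && Offset.Value == 0) {
      FrameIndex = Base.Value;
      return static_cast<unsigned>(Dst.Value);
    }
    return 0;
  }

  } // namespace sketch

A load whose address is a frame index with a nonzero offset is not treated as
a direct stack-slot load in this model, so the function returns ``0`` for it.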
1131
1132Branch Folding and If Conversion
1133--------------------------------
1134
1135Performance can be improved by combining instructions or by eliminating
1136instructions that are never reached.  The ``analyzeBranch`` method in
1137``XXXInstrInfo`` may be implemented to examine conditional instructions and
1138remove unnecessary instructions.  ``analyzeBranch`` looks at the end of a
1139machine basic block (MBB) for opportunities for improvement, such as branch
1140folding and if conversion.  The ``BranchFolder`` and ``IfConverter`` machine
1141function passes (see the source files ``BranchFolding.cpp`` and
1142``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``analyzeBranch``
1143to improve the control flow graph that represents the instructions.
1144
1145Several implementations of ``analyzeBranch`` (for ARM, Alpha, and X86) can be
1146examined as models for your own ``analyzeBranch`` implementation.  Since SPARC
1147does not implement a useful ``analyzeBranch``, the ARM target implementation is
1148shown below.
1149
1150``analyzeBranch`` returns a Boolean value and takes four parameters:
1151
1152* ``MachineBasicBlock &MBB`` --- The incoming block to be examined.
1153
1154* ``MachineBasicBlock *&TBB`` --- A destination block that is returned.  For a
1155  conditional branch that evaluates to true, ``TBB`` is the destination.
1156
1157* ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
1158  false, ``FBB`` is returned as the destination.
1159
1160* ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
1161  condition for a conditional branch.
1162
1163In the simplest case, if a block ends without a branch, then it falls through
1164to the successor block.  No destination blocks are specified for either ``TBB``
1165or ``FBB``, so both parameters return ``NULL``.  The start of the
1166``analyzeBranch`` (see code below for the ARM target) shows the function
1167parameters and the code for the simplest case.
1168
1169.. code-block:: c++
1170
1171  bool ARMInstrInfo::analyzeBranch(MachineBasicBlock &MBB,
1172                                   MachineBasicBlock *&TBB,
1173                                   MachineBasicBlock *&FBB,
1174                                   std::vector<MachineOperand> &Cond) const
1175  {
1176    MachineBasicBlock::iterator I = MBB.end();
1177    if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
1178      return false;
1179
1180If a block ends with a single unconditional branch instruction, then
1181``analyzeBranch`` (shown below) should return the destination of that branch in
1182the ``TBB`` parameter.
1183
1184.. code-block:: c++
1185
1186    if (LastOpc == ARM::B || LastOpc == ARM::tB) {
1187      TBB = LastInst->getOperand(0).getMBB();
1188      return false;
1189    }
1190
If a block ends with two unconditional branches, then the second branch is
never reached.  In that situation, as shown below, remove the last branch
instruction and return the destination of the penultimate branch in the
``TBB`` parameter.
1194
1195.. code-block:: c++
1196
1197    if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
1198        (LastOpc == ARM::B || LastOpc == ARM::tB)) {
1199      TBB = SecondLastInst->getOperand(0).getMBB();
1200      I = LastInst;
1201      I->eraseFromParent();
1202      return false;
1203    }
1204
A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false.  In that case,
``analyzeBranch`` (shown below) should return the destination of that
conditional branch in the ``TBB`` parameter and a list of operands in the
``Cond`` parameter to evaluate the condition.
1210
1211.. code-block:: c++
1212
1213    if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
1214      // Block ends with fall-through condbranch.
1215      TBB = LastInst->getOperand(0).getMBB();
1216      Cond.push_back(LastInst->getOperand(1));
1217      Cond.push_back(LastInst->getOperand(2));
1218      return false;
1219    }
1220
1221If a block ends with both a conditional branch and an ensuing unconditional
1222branch, then ``analyzeBranch`` (shown below) should return the conditional
1223branch destination (assuming it corresponds to a conditional evaluation of
1224"``true``") in the ``TBB`` parameter and the unconditional branch destination
1225in the ``FBB`` (corresponding to a conditional evaluation of "``false``").  A
1226list of operands to evaluate the condition should be returned in the ``Cond``
1227parameter.
1228
1229.. code-block:: c++
1230
1231    unsigned SecondLastOpc = SecondLastInst->getOpcode();
1232
1233    if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
1234        (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
1235      TBB =  SecondLastInst->getOperand(0).getMBB();
1236      Cond.push_back(SecondLastInst->getOperand(1));
1237      Cond.push_back(SecondLastInst->getOperand(2));
1238      FBB = LastInst->getOperand(0).getMBB();
1239      return false;
1240    }
1241
1242For the last two cases (ending with a single conditional branch or ending with
1243one conditional and one unconditional branch), the operands returned in the
1244``Cond`` parameter can be passed to methods of other instructions to create new
1245branches or perform other operations.  An implementation of ``analyzeBranch``
1246requires the helper methods ``removeBranch`` and ``insertBranch`` to manage
1247subsequent operations.
1248
``analyzeBranch`` should return false, indicating success, in most
circumstances.  It should return true only when it cannot make sense of the
block's terminators, for example, if a block has three terminating branches or
ends with a terminator that it cannot handle, such as an indirect branch.
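
The return contract described in this section can be summarized with a small,
self-contained model.  This is a sketch under simplifying assumptions, not the
ARM implementation: blocks are identified by integer ids, ``-1`` stands in for
a null block pointer, each condition is a single integer, and the ``sketch``
namespace and its types are hypothetical.

.. code-block:: c++

  #include <vector>

  namespace sketch {

  enum class TermKind { Uncond, Cond, Indirect };

  struct Terminator {
    TermKind Kind;
    int Target;   // destination block id (unused for Indirect)
    int CondCode; // condition operand (meaningful only for Cond)
  };

  // Terms holds the terminators at the end of a block, in program order.
  // Returns false on success and true if the terminators cannot be analyzed.
  bool analyzeBranch(const std::vector<Terminator> &Terms, int &TBB, int &FBB,
                     std::vector<int> &Cond) {
    TBB = FBB = -1;
    Cond.clear();
    if (Terms.empty())
      return false; // No branch: the block falls through.
    if (Terms.size() > 2)
      return true; // Three or more terminators: give up.

    const Terminator &Last = Terms.back();
    if (Terms.size() == 1) {
      if (Last.Kind == TermKind::Uncond) {
        TBB = Last.Target;
        return false;
      }
      if (Last.Kind == TermKind::Cond) {
        TBB = Last.Target; // Falls through when the condition is false.
        Cond.push_back(Last.CondCode);
        return false;
      }
      return true; // Indirect branch: cannot be analyzed.
    }

    const Terminator &SecondLast = Terms.front();
    if (SecondLast.Kind == TermKind::Cond && Last.Kind == TermKind::Uncond) {
      TBB = SecondLast.Target;
      Cond.push_back(SecondLast.CondCode);
      FBB = Last.Target;
      return false;
    }
    if (SecondLast.Kind == TermKind::Uncond && Last.Kind == TermKind::Uncond) {
      // The second branch is unreachable; a real implementation would also
      // erase it from the block.
      TBB = SecondLast.Target;
      return false;
    }
    return true;
  }

  } // namespace sketch

For a block that ends with a conditional branch to block 3 followed by an
unconditional branch to block 5, this model reports ``TBB`` = 3, ``FBB`` = 5,
and a one-element ``Cond``, and returns false.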
1254
1255.. _instruction-selector:
1256
1257Instruction Selector
1258====================
1259
1260LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
1261the ``SelectionDAG`` ideally represent native target instructions.  During code
1262generation, instruction selection passes are performed to convert non-native
1263DAG instructions into native target-specific instructions.  The pass described
1264in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
1265instruction selection.  Optionally, a pass may be defined (in
1266``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
1267instructions.  Later, the code in ``XXXISelLowering.cpp`` replaces or removes
1268operations and data types not supported natively (legalizes) in a
1269``SelectionDAG``.
1270
1271TableGen generates code for instruction selection using the following target
1272description input files:
1273
1274* ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
1275  target-specific instruction set, generates ``XXXGenDAGISel.inc``, which is
1276  included in ``XXXISelDAGToDAG.cpp``.
1277
1278* ``XXXCallingConv.td`` --- Contains the calling and return value conventions
1279  for the target architecture, and it generates ``XXXGenCallingConv.inc``,
1280  which is included in ``XXXISelLowering.cpp``.
1281
1282The implementation of an instruction selection pass must include a header that
1283declares the ``FunctionPass`` class or a subclass of ``FunctionPass``.  In
1284``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
1285selection pass into the queue of passes to run.
1286
1287The LLVM static compiler (``llc``) is an excellent tool for visualizing the
1288contents of DAGs.  To display the ``SelectionDAG`` before or after specific
1289processing phases, use the command line options for ``llc``, described at
1290:ref:`SelectionDAG-Process`.
1291
1292To describe instruction selector behavior, you should add patterns for lowering
1293LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
1294definitions in ``XXXInstrInfo.td``.  For example, in ``SparcInstrInfo.td``,
1295this entry defines a register store operation, and the last parameter describes
1296a pattern with the store DAG operator.
1297
1298.. code-block:: text
1299
1300  def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
1301                   "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;
1302
1303``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:
1304
1305.. code-block:: text
1306
1307  def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
1308
The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
defined in an implementation of the Instruction Selector (such as
``SparcISelDAGToDAG.cpp``).
1312
In ``include/llvm/Target/TargetSelectionDAG.td``, the DAG operator for store is
defined as follows:
1315
1316.. code-block:: text
1317
1318  def store : PatFrag<(ops node:$val, node:$ptr),
1319                      (unindexedstore node:$val, node:$ptr)> {
1320    let IsStore = true;
1321    let IsTruncStore = false;
1322  }
1323
1324``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
1325``SelectCode`` method that is used to call the appropriate processing method
1326for an instruction.  In this example, ``SelectCode`` calls ``Select_ISD_STORE``
1327for the ``ISD::STORE`` opcode.
1328
1329.. code-block:: c++
1330
1331  SDNode *SelectCode(SDValue N) {
1332    ...
1333    MVT::ValueType NVT = N.getNode()->getValueType(0);
1334    switch (N.getOpcode()) {
1335    case ISD::STORE: {
1336      switch (NVT) {
1337      default:
1338        return Select_ISD_STORE(N);
1339        break;
1340      }
1341      break;
1342    }
1343    ...
1344
1345The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
1346code for ``STrr`` is created for ``Select_ISD_STORE``.  The ``Emit_22`` method
1347is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
1348instruction.
1349
1350.. code-block:: c++
1351
1352  SDNode *Select_ISD_STORE(const SDValue &N) {
1353    SDValue Chain = N.getOperand(0);
1354    if (Predicate_store(N.getNode())) {
1355      SDValue N1 = N.getOperand(1);
1356      SDValue N2 = N.getOperand(2);
1357      SDValue CPTmp0;
1358      SDValue CPTmp1;
1359
1360      // Pattern: (st:void i32:i32:$src,
1361      //           ADDRrr:i32:$addr)<<P:Predicate_store>>
1362      // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
1363      // Pattern complexity = 13  cost = 1  size = 0
1364      if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
1365          N1.getNode()->getValueType(0) == MVT::i32 &&
1366          N2.getNode()->getValueType(0) == MVT::i32) {
1367        return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
1368      }
1369  ...
1370
1371The SelectionDAG Legalize Phase
1372-------------------------------
1373
1374The Legalize phase converts a DAG to use types and operations that are natively
1375supported by the target.  For natively unsupported types and operations, you
1376need to add code to the target-specific ``XXXTargetLowering`` implementation to
1377convert unsupported types and operations to supported ones.
1378
In the constructor for the ``XXXTargetLowering`` class, first use the
``addRegisterClass`` method to specify which types are supported and which
register classes are associated with them.  The code for the register classes
is generated by TableGen from ``XXXRegisterInfo.td`` and placed in
``XXXGenRegisterInfo.h.inc``.  For example, the implementation of the
constructor for the ``SparcTargetLowering`` class (in
``SparcISelLowering.cpp``) starts with the following code:
1386
1387.. code-block:: c++
1388
1389  addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
1390  addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
1391  addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
1392
You should examine the node types in the ``ISD`` namespace
(``include/llvm/CodeGen/ISDOpcodes.h``) and determine which operations
the target natively supports.  For operations that do **not** have native
support, add a callback to the constructor for the ``XXXTargetLowering`` class,
so the instruction selection process knows what to do.  The ``TargetLowering``
class callback methods (declared in ``llvm/CodeGen/TargetLowering.h``) are:
1399
1400* ``setOperationAction`` --- General operation.
1401* ``setLoadExtAction`` --- Load with extension.
1402* ``setTruncStoreAction`` --- Truncating store.
1403* ``setIndexedLoadAction`` --- Indexed load.
1404* ``setIndexedStoreAction`` --- Indexed store.
1405* ``setConvertAction`` --- Type conversion.
1406* ``setCondCodeAction`` --- Support for a given condition code.
1407
1408Note: on older releases, ``setLoadXAction`` is used instead of
1409``setLoadExtAction``.  Also, on older releases, ``setCondCodeAction`` may not
1410be supported.  Examine your release to see what methods are specifically
1411supported.
1412
These callbacks specify whether an operation works with a specified type (or
types).  In all cases, the third parameter is a ``LegalizeAction`` enum value:
``Promote``, ``Expand``, ``Custom``, or ``Legal``.  ``SparcISelLowering.cpp``
contains examples of all four ``LegalizeAction`` values.
1418
1419Promote
1420^^^^^^^
1421
For an operation without native support for a given type, the specified type
may be promoted to a larger type that is supported.  For example, SPARC does
not support a sign-extending load for Boolean values (``i1`` type), so in
``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
``i1`` type values to a larger type before loading.
1427
.. code-block:: c++

  setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
1431
1432Expand
1433^^^^^^
1434
1435For a type without native support, a value may need to be broken down further,
1436rather than promoted.  For an operation without native support, a combination
1437of other operations may be used to similar effect.  In SPARC, the
1438floating-point sine and cosine trig operations are supported by expansion to
1439other operations, as indicated by the third parameter, ``Expand``, to
1440``setOperationAction``:
1441
.. code-block:: c++

  setOperationAction(ISD::FSIN, MVT::f32, Expand);
  setOperationAction(ISD::FCOS, MVT::f32, Expand);
1446
1447Custom
1448^^^^^^
1449
1450For some operations, simple type promotion or operation expansion may be
1451insufficient.  In some cases, a special intrinsic function must be implemented.
1452
1453For example, a constant value may require special treatment, or an operation
1454may require spilling and restoring registers in the stack and working with
1455register allocators.
1456
1457As seen in ``SparcISelLowering.cpp`` code below, to perform a type conversion
1458from a floating point value to a signed integer, first the
1459``setOperationAction`` should be called with ``Custom`` as the third parameter:
1460
.. code-block:: c++

  setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
1464
1465In the ``LowerOperation`` method, for each ``Custom`` operation, a case
1466statement should be added to indicate what function to call.  In the following
1467code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:
1468
.. code-block:: c++

  SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
    switch (Op.getOpcode()) {
    case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
    ...
    }
  }
1477
1478Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
1479convert the floating-point value to an integer.
1480
.. code-block:: c++

  static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
    assert(Op.getValueType() == MVT::i32);
    Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
    return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
  }
1488
1489Legal
1490^^^^^
1491
The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
**is** natively supported.  ``Legal`` represents the default condition, so it
is rarely used.  In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
operation to count the bits set in an integer) is natively supported only for
SPARC v9.  The following code first selects the ``Expand`` conversion technique
for ``CTPOP``, then overrides it with ``Legal`` on v9 subtargets.
1498
.. code-block:: c++

  setOperationAction(ISD::CTPOP, MVT::i32, Expand);
  ...
  if (TM.getSubtarget<SparcSubtarget>().isV9())
    setOperationAction(ISD::CTPOP, MVT::i32, Legal);
1505
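The four actions can be pictured as entries in a lookup table keyed by
operation and type.  The following self-contained sketch is not LLVM code:
``Opcode``, ``ValueType``, ``Action``, ``ActionTable``, and
``makeSparcLikeTable`` are hypothetical stand-ins chosen for illustration.  It
only shows that unset entries report ``Legal`` and that a later call for the
same operation and type overrides an earlier one, as in the ``CTPOP`` example
above.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <utility>

  // Hypothetical stand-ins for ISD opcodes, MVT types, and LegalizeAction.
  enum class Opcode { CTPOP, FSIN, FP_TO_SINT };
  enum class ValueType { i1, i32, f32 };
  enum class Action { Legal, Promote, Expand, Custom };

  class ActionTable {
    std::map<std::pair<Opcode, ValueType>, Action> Actions;

  public:
    // Record how the legalizer should treat Op when it is applied to VT.
    void setOperationAction(Opcode Op, ValueType VT, Action A) {
      Actions[std::make_pair(Op, VT)] = A;
    }

    // Entries that were never set fall back to Legal, the default.
    Action getOperationAction(Opcode Op, ValueType VT) const {
      auto It = Actions.find(std::make_pair(Op, VT));
      return It == Actions.end() ? Action::Legal : It->second;
    }
  };

  // Mirrors the CTPOP pattern: Expand first, then Legal when IsV9 is set.
  ActionTable makeSparcLikeTable(bool IsV9) {
    ActionTable T;
    T.setOperationAction(Opcode::FSIN, ValueType::f32, Action::Expand);
    T.setOperationAction(Opcode::CTPOP, ValueType::i32, Action::Expand);
    if (IsV9)
      T.setOperationAction(Opcode::CTPOP, ValueType::i32, Action::Legal);
    // Nothing was registered for FP_TO_SINT here, so it reports the default.
    assert(T.getOperationAction(Opcode::FP_TO_SINT, ValueType::i32) ==
           Action::Legal);
    return T;
  }
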
1506.. _backend-calling-convs:
1507
1508Calling Conventions
1509-------------------
1510
To support target-specific calling conventions, ``XXXCallingConv.td`` uses
interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
``lib/Target/TargetCallingConv.td``.  TableGen takes the target descriptor
file ``XXXCallingConv.td`` and generates the header file
``XXXGenCallingConv.inc``, which is typically included in
``XXXISelLowering.cpp``.  You can use the interfaces in
``TargetCallingConv.td`` to specify:
1518
1519* The order of parameter allocation.
1520
1521* Where parameters and return values are placed (that is, on the stack or in
1522  registers).
1523
1524* Which registers may be used.
1525
1526* Whether the caller or callee unwinds the stack.
1527
1528The following example demonstrates the use of the ``CCIfType`` and
1529``CCAssignToReg`` interfaces.  If the ``CCIfType`` predicate is true (that is,
1530if the current argument is of type ``f32`` or ``f64``), then the action is
1531performed.  In this case, the ``CCAssignToReg`` action assigns the argument
1532value to the first available register: either ``R0`` or ``R1``.
1533
.. code-block:: text

  CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
1537
``SparcCallingConv.td`` contains definitions for a target-specific return-value
calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention
(``CC_Sparc32``).  The definition of ``RetCC_Sparc32`` (shown below) indicates
which registers are used for specified scalar return types.  A single-precision
float is returned in register ``F0``, and a double-precision float is returned
in register ``D0``.  A 32-bit integer is returned in register ``I0`` or ``I1``.
1544
.. code-block:: text

  def RetCC_Sparc32 : CallingConv<[
    CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
    CCIfType<[f32], CCAssignToReg<[F0]>>,
    CCIfType<[f64], CCAssignToReg<[D0]>>
  ]>;
1552
1553The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces
1554``CCAssignToStack``, which assigns the value to a stack slot with the specified
1555size and alignment.  In the example below, the first parameter, 4, indicates
1556the size of the slot, and the second parameter, also 4, indicates the stack
1557alignment along 4-byte units.  (Special cases: if size is zero, then the ABI
1558size is used; if alignment is zero, then the ABI alignment is used.)
1559
.. code-block:: text

  def CC_Sparc32 : CallingConv<[
    // All arguments get passed in integer registers if there is space.
    CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
    CCAssignToStack<4, 4>
  ]>;
1567
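To make the fall-through from ``CCAssignToReg`` to ``CCAssignToStack``
concrete, here is a small stand-alone model of that allocation order.  It is
only a sketch and not the code that TableGen generates: ``ArgLocation`` and
``assignArguments`` are hypothetical names, every argument is assumed to be a
single 32-bit value, and the zero-means-ABI-default special case is not
modeled.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <string>
  #include <vector>

  // Where one argument ends up: in a named register or at a stack offset.
  struct ArgLocation {
    bool InRegister;
    std::string Register; // meaningful when InRegister is true
    unsigned StackOffset; // meaningful when InRegister is false
  };

  // Toy allocator: try the register list first, then fall back to stack
  // slots of SlotSize bytes aligned to SlotAlign bytes.
  std::vector<ArgLocation> assignArguments(std::size_t NumArgs,
                                           unsigned SlotSize = 4,
                                           unsigned SlotAlign = 4) {
    assert(SlotAlign != 0 && "the ABI-default case is not modeled");
    static const char *const Regs[] = {"I0", "I1", "I2", "I3", "I4", "I5"};
    const std::size_t NumRegs = sizeof(Regs) / sizeof(Regs[0]);

    std::vector<ArgLocation> Locations;
    std::size_t NextReg = 0;
    unsigned NextOffset = 0;
    for (std::size_t I = 0; I != NumArgs; ++I) {
      if (NextReg != NumRegs) {
        // The register action succeeds: take the next free register.
        Locations.push_back({true, Regs[NextReg++], 0});
        continue;
      }
      // Registers are exhausted: round the offset up to the alignment,
      // then reserve one slot.
      NextOffset = (NextOffset + SlotAlign - 1) / SlotAlign * SlotAlign;
      Locations.push_back({false, "", NextOffset});
      NextOffset += SlotSize;
    }
    return Locations;
  }
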
1568``CCDelegateTo`` is another commonly used interface, which tries to find a
1569specified sub-calling convention, and, if a match is found, it is invoked.  In
1570the following example (in ``X86CallingConv.td``), the definition of
1571``RetCC_X86_32_C`` ends with ``CCDelegateTo``.  After the current value is
1572assigned to the register ``ST0`` or ``ST1``, the ``RetCC_X86Common`` is
1573invoked.
1574
.. code-block:: text

  def RetCC_X86_32_C : CallingConv<[
    CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
    CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
    CCDelegateTo<RetCC_X86Common>
  ]>;
1582
1583``CCIfCC`` is an interface that attempts to match the given name to the current
1584calling convention.  If the name identifies the current calling convention,
1585then a specified action is invoked.  In the following example (in
1586``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then
1587``RetCC_X86_32_Fast`` is invoked.  If the ``SSECall`` calling convention is in
1588use, then ``RetCC_X86_32_SSE`` is invoked.
1589
.. code-block:: text

  def RetCC_X86_32 : CallingConv<[
    CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
    CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
    CCDelegateTo<RetCC_X86_32_C>
  ]>;
1597
``CCAssignToRegAndStack`` is the same as ``CCAssignToReg``, but it also
allocates a stack slot whenever a register is assigned.  Basically, it works
like: ``CCIf<CCAssignToReg<regList>, CCAssignToStack<size, align>>``.
1601
.. code-block:: text

  class CCAssignToRegAndStack<list<Register> regList, int size, int align>
      : CCAssignToReg<regList> {
    int Size = size;
    int Align = align;
  }
1609
1610Other calling convention interfaces include:
1611
1612* ``CCIf <predicate, action>`` --- If the predicate matches, apply the action.
1613
1614* ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``"
1615  attribute, then apply the action.
1616
1617* ``CCIfNest <action>`` --- If the argument is marked with the "``nest``"
1618  attribute, then apply the action.
1619
1620* ``CCIfNotVarArg <action>`` --- If the current function does not take a
1621  variable number of arguments, apply the action.
1622
1623* ``CCAssignToRegWithShadow <registerList, shadowList>`` --- similar to
1624  ``CCAssignToReg``, but with a shadow list of registers.
1625
1626* ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
1627  minimum specified size and alignment.
1628
1629* ``CCPromoteToType <type>`` --- Promote the current value to the specified
1630  type.
1631
1632* ``CallingConv <[actions]>`` --- Define each calling convention that is
1633  supported.
1634
1635Assembly Printer
1636================
1637
During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output.  To do this, implement a printer that converts LLVM IR
to a GAS-format assembly language for your target machine, using the following
steps:
1642
1643* Define all the assembly strings for your target, adding them to the
1644  instructions defined in the ``XXXInstrInfo.td`` file.  (See
1645  :ref:`instruction-set`.)  TableGen will produce an output file
1646  (``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
1647  method for the ``XXXAsmPrinter`` class.
1648
1649* Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
1650  the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).
1651
1652* Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
1653  ``TargetAsmInfo`` properties and sometimes new implementations for methods.
1654
1655* Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
1656  performs the LLVM-to-assembly conversion.
1657
1658The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
1659``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``.  Similarly,
1660``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
1661replacement values that override the default values in ``TargetAsmInfo.cpp``.
1662For example in ``SparcTargetAsmInfo.cpp``:
1663
.. code-block:: c++

  SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
    Data16bitsDirective = "\t.half\t";
    Data32bitsDirective = "\t.word\t";
    Data64bitsDirective = 0;  // .xword is only supported by V9.
    ZeroDirective = "\t.skip\t";
    CommentString = "!";
    ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
  }
1674
The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
where the target-specific ``TargetAsmInfo`` class uses an overridden method:
``ExpandInlineAsm``.
1678
A target-specific implementation of ``AsmPrinter`` is written in
``XXXAsmPrinter.cpp``; it converts the LLVM code to printable assembly.  The
implementation must include the following headers, which declare the
``AsmPrinter`` and ``MachineFunctionPass`` classes.  ``MachineFunctionPass`` is
a subclass of ``FunctionPass``.
1685
.. code-block:: c++

  #include "llvm/CodeGen/AsmPrinter.h"
  #include "llvm/CodeGen/MachineFunctionPass.h"
1690
1691As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
1692up the ``AsmPrinter``.  In ``SparcAsmPrinter``, a ``Mangler`` object is
1693instantiated to process variable names.
1694
1695In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
1696``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``.  In
1697``MachineFunctionPass``, the ``runOnFunction`` method invokes
1698``runOnMachineFunction``.  Target-specific implementations of
1699``runOnMachineFunction`` differ, but generally do the following to process each
1700machine function:
1701
1702* Call ``SetupMachineFunction`` to perform initialization.
1703
1704* Call ``EmitConstantPool`` to print out (to the output stream) constants which
1705  have been spilled to memory.
1706
1707* Call ``EmitJumpTableInfo`` to print out jump tables used by the current
1708  function.
1709
1710* Print out the label for the current function.
1711
* Print out the code for the function, including basic block labels and the
  assembly for each instruction (using ``printInstruction``).
1714
1715The ``XXXAsmPrinter`` implementation must also include the code generated by
1716TableGen that is output in the ``XXXGenAsmWriter.inc`` file.  The code in
1717``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
1718method that may call these methods:
1719
1720* ``printOperand``
1721* ``printMemOperand``
1722* ``printCCOperand`` (for conditional statements)
1723* ``printDataDirective``
1724* ``printDeclare``
1725* ``printImplicitDef``
1726* ``printInlineAsm``
1727
1728The implementations of ``printDeclare``, ``printImplicitDef``,
1729``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
1730adequate for printing assembly and do not need to be overridden.
1731
1732The ``printOperand`` method is implemented with a long ``switch``/``case``
1733statement for the type of operand: register, immediate, basic block, external
1734symbol, global address, constant pool index, or jump table index.  For an
1735instruction with a memory address operand, the ``printMemOperand`` method
1736should be implemented to generate the proper output.  Similarly,
1737``printCCOperand`` should be used to print a conditional operand.
1738
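The shape of such a ``switch`` can be sketched without any LLVM headers.  In
the following illustration, the ``Operand`` struct is a hypothetical, reduced
stand-in for ``MachineOperand``, and the ``%`` register prefix, the ``.LCPI``
label prefix, and the ``[base+offset]`` memory syntax are assumptions made for
the example rather than the output of any particular target.

.. code-block:: c++

  #include <cassert>
  #include <sstream>
  #include <string>

  // Hypothetical, reduced operand model; the real class has more kinds.
  struct Operand {
    enum Kind { Register, Immediate, GlobalAddress, ConstantPoolIndex } Type;
    std::string Name; // register or symbol name
    long Value;       // immediate value or pool index
  };

  // Writes the textual form of one operand to OS, one case per operand kind.
  void printOperand(const Operand &Op, std::ostream &OS) {
    switch (Op.Type) {
    case Operand::Register:
      OS << '%' << Op.Name;
      break;
    case Operand::Immediate:
      OS << Op.Value;
      break;
    case Operand::GlobalAddress:
      OS << Op.Name;
      break;
    case Operand::ConstantPoolIndex:
      OS << ".LCPI" << Op.Value;
      break;
    }
  }

  // Prints a memory reference by delegating to printOperand for each part.
  void printMemOperand(const Operand &Base, const Operand &Offset,
                       std::ostream &OS) {
    assert(Base.Type == Operand::Register && "base must be a register");
    OS << '[';
    printOperand(Base, OS);
    OS << '+';
    printOperand(Offset, OS);
    OS << ']';
  }
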
1739``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
1740called to shut down the assembly printer.  During ``doFinalization``, global
1741variables and constants are printed to output.
1742
1743Subtarget Support
1744=================
1745
Subtarget support is used to inform the code generation process of instruction
set variations for a given chip set.  For example, the LLVM SPARC
implementation provided covers three major versions of the SPARC microprocessor
architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
64-bit architecture), and the UltraSPARC architecture.  V8 has 16
double-precision floating-point registers that are also usable as either 32
single-precision or 8 quad-precision registers.  V8 is also purely big-endian.
V9 has 32 double-precision floating-point registers that are also usable as 16
quad-precision registers, but cannot be used as single-precision registers.
The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
extensions.
1757
1758If subtarget support is needed, you should implement a target-specific
1759``XXXSubtarget`` class for your architecture.  This class should process the
1760command-line options ``-mcpu=`` and ``-mattr=``.
1761
TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
generate code in ``SparcGenSubtarget.inc``.  In ``Target.td``, shown below, the
``SubtargetFeature`` interface is defined.  The first four string parameters of
the ``SubtargetFeature`` interface are a feature name, an ``XXXSubtarget`` field
set by the feature, the value of the ``XXXSubtarget`` field, and a description
of the feature.  (The fifth parameter is a list of features whose presence is
implied, and its default value is an empty array.)
1769
1770If the value for the field is the string "true" or "false", the field
1771is assumed to be a bool and only one SubtargetFeature should refer to it.
1772Otherwise, it is assumed to be an integer. The integer value may be the name
1773of an enum constant. If multiple features use the same integer field, the
1774field will be set to the maximum value of all enabled features that share
1775the field.
1776
.. code-block:: text

  class SubtargetFeature<string n, string f, string v, string d,
                         list<SubtargetFeature> i = []> {
    string Name = n;
    string FieldName = f;
    string Value = v;
    string Desc = d;
    list<SubtargetFeature> Implies = i;
  }
1787
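The rule for integer-valued fields described above (a shared field ends up
with the maximum value among the enabled features) can be illustrated with a
short stand-alone model.  This is a sketch, not the generated subtarget code:
``IntFeature`` and ``applyFeatures`` are hypothetical names, and the field's
initial default value is deliberately left out of the model.

.. code-block:: c++

  #include <algorithm>
  #include <cassert>
  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical reduced form of a feature record with an integer value.
  struct IntFeature {
    std::string Name;      // feature name
    std::string FieldName; // subtarget field that the feature writes to
    int Value;             // value the feature requests for that field
  };

  // Applies every enabled feature.  A field shared by several features ends
  // up holding the largest requested value, whatever the order of the list.
  std::map<std::string, int>
  applyFeatures(const std::vector<IntFeature> &Enabled) {
    std::map<std::string, int> Fields;
    for (const IntFeature &F : Enabled) {
      assert(!F.FieldName.empty() && "a feature must name a field");
      auto It = Fields.find(F.FieldName);
      if (It == Fields.end())
        Fields[F.FieldName] = F.Value;
      else
        It->second = std::max(It->second, F.Value);
    }
    return Fields;
  }
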
1788In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
1789following features.
1790
.. code-block:: text

  def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                       "Enable SPARC-V9 instructions">;
  def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                       "UseV8DeprecatedInsts", "true",
                       "Enable deprecated V8 instructions in V9 mode">;
  def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                       "Enable UltraSPARC Visual Instruction Set extensions">;
1800
1801Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
1802define particular SPARC processor subtypes that may have the previously
1803described features.
1804
.. code-block:: text

  class Proc<string Name, list<SubtargetFeature> Features>
    : Processor<Name, NoItineraries, Features>;

  def : Proc<"generic",         []>;
  def : Proc<"v8",              []>;
  def : Proc<"supersparc",      []>;
  def : Proc<"sparclite",       []>;
  def : Proc<"f934",            []>;
  def : Proc<"hypersparc",      []>;
  def : Proc<"sparclite86x",    []>;
  def : Proc<"sparclet",        []>;
  def : Proc<"tsc701",          []>;
  def : Proc<"v9",              [FeatureV9]>;
  def : Proc<"ultrasparc",      [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
1823
From the ``Target.td`` and ``Sparc.td`` files, the resulting
``SparcGenSubtarget.inc`` specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
``ParseSubtargetFeatures`` method, which parses the features string and sets
the specified subtarget options.  The generated ``SparcGenSubtarget.inc`` file
should be included in ``SparcSubtarget.cpp``.  The target-specific
implementation of the ``XXXSubtarget`` constructor should follow this
pseudocode:
1831
.. code-block:: c++

  XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
    // Set the default features
    // Determine default and user specified characteristics of the CPU
    // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
    // Perform any additional operations
  }
1840
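As a rough illustration of the pseudocode above, the following stand-alone
sketch sets defaults and then applies a comma-separated features string in
which each entry starts with ``+`` or ``-``.  ``ToySubtarget``,
``parseFeatures``, and ``makeSubtarget`` are hypothetical names; the real
parsing is done by the TableGen-generated ``ParseSubtargetFeatures`` method,
and CPU selection is omitted here.

.. code-block:: c++

  #include <cassert>
  #include <sstream>
  #include <string>

  // Hypothetical stand-in for the fields of a SPARC-like subtarget.
  struct ToySubtarget {
    bool IsV9 = false;
    bool UseV8DeprecatedInsts = false;
    bool IsVIS = false;

    // Parses a list such as "+v9,-vis".  Returns false if an entry lacks a
    // leading '+' or '-' or names a feature this sketch does not know.
    bool parseFeatures(const std::string &FS) {
      std::istringstream In(FS);
      std::string Entry;
      while (std::getline(In, Entry, ',')) {
        if (Entry.size() < 2 || (Entry[0] != '+' && Entry[0] != '-'))
          return false;
        const bool Enable = Entry[0] == '+';
        const std::string Name = Entry.substr(1);
        if (Name == "v9")
          IsV9 = Enable;
        else if (Name == "deprecated-v8")
          UseV8DeprecatedInsts = Enable;
        else if (Name == "vis")
          IsVIS = Enable;
        else
          return false;
      }
      return true;
    }
  };

  // Follows the constructor outline: defaults first, then the features string.
  ToySubtarget makeSubtarget(const std::string &FS) {
    ToySubtarget ST; // default features
    const bool Ok = ST.parseFeatures(FS);
    assert(Ok && "malformed features string");
    (void)Ok;
    return ST;
  }
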
1841JIT Support
1842===========
1843
1844The implementation of a target machine optionally includes a Just-In-Time (JIT)
1845code generator that emits machine code and auxiliary structures as binary
1846output that can be written directly to memory.  To do this, implement JIT code
1847generation by performing the following steps:
1848
1849* Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
1850  that transforms target-machine instructions into relocatable machine
1851  code.
1852
1853* Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
1854  target-specific code-generation activities, such as emitting machine code and
1855  stubs.
1856
1857* Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
1858  through its ``getJITInfo`` method.
1859
1860There are several different approaches to writing the JIT support code.  For
1861instance, TableGen and target descriptor files may be used for creating a JIT
1862code generator, but are not mandatory.  For the Alpha and PowerPC target
1863machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
1864contains the binary coding of machine instructions and the
1865``getBinaryCodeForInstr`` method to access those codes.  Other JIT
1866implementations do not.
1867
1868Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
1869``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
1870``MachineCodeEmitter`` class containing code for several callback functions
1871that write data (in bytes, words, strings, etc.) to the output stream.
1872
1873Machine Code Emitter
1874--------------------
1875
In ``XXXCodeEmitter.cpp``, a target-specific version of the ``Emitter`` class
is implemented as a function pass (subclass of ``MachineFunctionPass``).  The
target-specific implementation of ``runOnMachineFunction`` (invoked by
``runOnFunction`` in ``MachineFunctionPass``) iterates through each
``MachineBasicBlock`` and calls ``emitInstruction`` to process each instruction
and emit binary code.  ``emitInstruction`` is largely implemented with case
statements on the instruction types defined in ``XXXInstrInfo.h``.  For
example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
around the following ``switch``/``case`` statements:
1885
.. code-block:: c++

  switch (Desc->TSFlags & X86::FormMask) {
  case X86II::Pseudo:  // for not yet implemented instructions
     ...               // or pseudo-instructions
     break;
  case X86II::RawFrm:  // for instructions with a fixed opcode value
     ...
     break;
  case X86II::AddRegFrm: // for instructions that have one register operand
     ...                 // added to their opcode
     break;
  case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
     ...                 // to specify a destination (register)
     break;
  case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
     ...                 // to specify a destination (memory)
     break;
  case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
     ...                 // to specify a source (register)
     break;
  case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
     ...                 // to specify a source (memory)
     break;
  case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
  case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
  case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
  case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
     ...
     break;
  case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
  case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
  case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
  case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
     ...
     break;
  case X86II::MRMInitReg: // for instructions whose source and
     ...                  // destination are the same register
     break;
  }
1926
1927The implementations of these case statements often first emit the opcode and
1928then get the operand(s).  Then depending upon the operand, helper methods may
1929be called to process the operand(s).  For example, in ``X86CodeEmitter.cpp``,
1930for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
1931the opcode added to the register operand.  Then an object representing the
1932machine operand, ``MO1``, is extracted.  The helper methods such as
1933``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
1934``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
1935(``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
1936``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
1937and ``emitJumpTableAddress`` that emit the data into the output stream.)
1938
.. code-block:: c++

  case X86II::AddRegFrm:
    MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

    if (CurOp != NumOps) {
      const MachineOperand &MO1 = MI.getOperand(CurOp++);
      unsigned Size = X86InstrInfo::sizeOfImm(Desc);
      if (MO1.isImmediate())
        emitConstant(MO1.getImm(), Size);
      else {
        unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
          : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
        if (Opcode == X86::MOV64ri)
          rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
        if (MO1.isGlobalAddress()) {
          bool NeedStub = isa<Function>(MO1.getGlobal());
          bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
          emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                            NeedStub, isLazy);
        } else if (MO1.isExternalSymbol())
          emitExternalSymbolAddress(MO1.getSymbolName(), rt);
        else if (MO1.isConstantPoolIndex())
          emitConstPoolAddress(MO1.getIndex(), rt);
        else if (MO1.isJumpTableIndex())
          emitJumpTableAddress(MO1.getIndex(), rt);
      }
    }
    break;
1968
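The immediate-operand path of the ``AddRegFrm`` case can be tried out in
isolation.  The sketch below is a simplified model rather than the code in
``X86CodeEmitter.cpp``: the output is a plain byte vector instead of a
``MachineCodeEmitter``, the free functions ``emitConstant`` and
``emitAddRegFrm`` are hypothetical, and only a 32-bit immediate and the eight
legacy register numbers are handled.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <vector>

  // Appends Value to Out as Size little-endian bytes, the x86 byte order.
  void emitConstant(std::uint64_t Value, unsigned Size,
                    std::vector<std::uint8_t> &Out) {
    for (unsigned I = 0; I != Size; ++I) {
      Out.push_back(static_cast<std::uint8_t>(Value & 0xFF));
      Value >>= 8;
    }
  }

  // AddRegFrm: the register number is added to the base opcode byte, and
  // the 32-bit immediate follows it.
  void emitAddRegFrm(std::uint8_t BaseOpcode, unsigned RegNum,
                     std::uint32_t Imm, std::vector<std::uint8_t> &Out) {
    assert(RegNum < 8 && "only register numbers 0 to 7 are modeled");
    Out.push_back(static_cast<std::uint8_t>(BaseOpcode + RegNum));
    emitConstant(Imm, 4, Out);
  }
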
1969In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
1970is a ``RelocationType`` enum that may be used to relocate addresses (for
1971example, a global address with a PIC base offset).  The ``RelocationType`` enum
1972for that target is defined in the short target-specific ``XXXRelocations.h``
1973file.  The ``RelocationType`` is used by the ``relocate`` method defined in
1974``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.
1975
1976For example, ``X86Relocations.h`` specifies the following relocation types for
1977the X86 addresses.  In all four cases, the relocated value is added to the
1978value already in memory.  For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
1979there is an additional initial adjustment.
1980
.. code-block:: c++

  enum RelocationType {
    reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
    reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
    reloc_absolute_word = 2, // absolute relocation; no additional adjustment
    reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
  };
1989
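How the four relocation types differ can be sketched as follows.  This is an
illustrative model, not the ``relocate`` method in ``X86JITInfo.cpp``:
``applyRelocation`` and its parameters are hypothetical, values are read and
written in host byte order, and the PC-relative case is assumed to be measured
from the end of the 4-byte field.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <cstring>

  enum RelocationType {
    reloc_pcrel_word = 0,
    reloc_picrel_word = 1,
    reloc_absolute_word = 2,
    reloc_absolute_dword = 3
  };

  // Adds the (possibly adjusted) target address to the value stored at Field.
  // FieldAddr is the run-time address of Field; PICBase is the PIC base.
  void applyRelocation(RelocationType Type, unsigned char *Field,
                       std::uint64_t FieldAddr, std::uint64_t Target,
                       std::uint64_t PICBase) {
    if (Type == reloc_absolute_dword) {
      // 64-bit field, no adjustment.
      std::uint64_t Stored;
      std::memcpy(&Stored, Field, sizeof(Stored));
      Stored += Target;
      std::memcpy(Field, &Stored, sizeof(Stored));
      return;
    }
    std::uint64_t Adjusted = Target;
    if (Type == reloc_pcrel_word)
      Adjusted -= FieldAddr + 4; // assumption: relative to the end of the field
    else if (Type == reloc_picrel_word)
      Adjusted -= PICBase;       // relative to the PIC base
    else
      assert(Type == reloc_absolute_word);
    // 32-bit field: add the low 32 bits of the adjusted address.
    std::uint32_t Stored;
    std::memcpy(&Stored, Field, sizeof(Stored));
    Stored += static_cast<std::uint32_t>(Adjusted);
    std::memcpy(Field, &Stored, sizeof(Stored));
  }
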
1990Target JIT Info
1991---------------
1992
1993``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
1994code-generation activities, such as emitting machine code and stubs.  At
1995minimum, a target-specific version of ``XXXJITInfo`` implements the following:
1996
1997* ``getLazyResolverFunction`` --- Initializes the JIT, gives the target a
1998  function that is used for compilation.
1999
2000* ``emitFunctionStub`` --- Returns a native function with a specified address
2001  for a callback function.
2002
2003* ``relocate`` --- Changes the addresses of referenced globals, based on
2004  relocation types.
2005
* Callback functions that are wrappers to a function stub that is used when the
  real target is not initially known.
2008
``getLazyResolverFunction`` is generally trivial to implement.  It stores the
incoming parameter in the global ``JITCompilerFunction`` and returns the
callback function that will be used as a function wrapper.  For the Alpha
target (in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction``
implementation is simply:
2014
.. code-block:: c++

  TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                              JITCompilerFn F) {
    JITCompilerFunction = F;
    return AlphaCompilationCallback;
  }
2022
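The store-and-return pattern can be exercised outside of LLVM with plain
function pointers.  In this sketch the typedefs are simplified stand-ins, and
``ToyCompilationCallback``, ``PendingStub``, and ``ResolvedAddress`` are
hypothetical: a real callback is written in assembly and recovers the stub
address from the machine state instead of reading a global.

.. code-block:: c++

  #include <cassert>

  // Simplified stand-ins for the function-pointer types used by the JIT.
  typedef void *(*JITCompilerFn)(void *Stub);
  typedef void (*LazyResolverFn)();

  static JITCompilerFn JITCompilerFunction = nullptr;
  static void *PendingStub = nullptr;     // stub whose target is unknown
  static void *ResolvedAddress = nullptr; // result of the last resolution

  // Stand-in for the callback: asks the JIT to compile the function behind
  // the pending stub and records the address it returns.
  static void ToyCompilationCallback() {
    assert(JITCompilerFunction && "resolver was not installed");
    ResolvedAddress = JITCompilerFunction(PendingStub);
  }

  // Stores the JIT's compile function and hands back the callback.
  LazyResolverFn getLazyResolverFunction(JITCompilerFn F) {
    JITCompilerFunction = F;
    return ToyCompilationCallback;
  }
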
2023For the X86 target, the ``getLazyResolverFunction`` implementation is a little
2024more complicated, because it returns a different callback function for
2025processors with SSE instructions and XMM registers.
2026
2027The callback function initially saves and later restores the callee register
2028values, incoming arguments, and frame and return address.  The callback
2029function needs low-level access to the registers or stack, so it is typically
2030implemented with assembler.
2031