.. _gmir:

Generic Machine IR
==================

.. contents::
   :local:

Generic MIR (gMIR) is an intermediate representation that shares the same data
structures as :doc:`MachineIR (MIR) <../MIRLangRef>` but has more relaxed
constraints. As the compilation pipeline proceeds, these constraints are
gradually tightened until gMIR has become MIR.

The rest of this document will assume that you are familiar with the concepts
in :doc:`MachineIR (MIR) <../MIRLangRef>` and will highlight the differences
between MIR and gMIR.

.. _gmir-instructions:

Generic Machine Instructions
----------------------------

.. note::

  This section expands on :ref:`mir-instructions` from the MIR Language
  Reference.

Whereas MIR deals largely in Target Instructions and only has a small set of
target independent opcodes such as ``COPY``, ``PHI``, and ``REG_SEQUENCE``,
gMIR defines a rich collection of ``Generic Opcodes`` which are target
independent and describe operations which are typically supported by targets.
One example is ``G_ADD`` which is the generic opcode for an integer addition.
More information on each of the generic opcodes can be found at
:doc:`GenericOpcode`.

The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides
a convenient way to create these generic instructions.
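
As a rough illustration, the hand-written fragment below mixes the target
independent ``COPY`` with the generic opcode ``G_ADD``. The AArch64 physical
registers ``$w0`` and ``$w1`` are assumptions chosen only for this sketch; it
is not the output of any particular pass.

.. code-block:: none

  ; Bring the inputs into generic virtual registers.
  %0:_(s32) = COPY $w0
  %1:_(s32) = COPY $w1
  ; Target independent integer addition.
  %2:_(s32) = G_ADD %0, %1
  ; Hand the result back in a physical register.
  $w0 = COPY %2(s32)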

.. _gmir-gvregs:

Generic Virtual Registers
-------------------------

.. note::

  This section expands on :ref:`mir-registers` from the MIR Language
  Reference.

Generic virtual registers are like virtual registers but they are not assigned a
Register Class constraint. Instead, generic virtual registers have less strict
constraints: they start with a :ref:`gmir-llt` and are then further constrained
to a :ref:`gmir-regbank`. Eventually they will be constrained to a register
class, at which point they become normal virtual registers.
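
This tightening can be pictured by following one definition through the
pipeline. The sketch below is hand-written and assumes an AArch64 target, where
``gpr`` is a register bank, ``gpr32`` is a register class and ``ADDWrr`` is a
target instruction. The three lines show the same definition at successive
stages and are not meant to form one valid function.

.. code-block:: none

  ; Generic virtual register: constrained only by a low level type.
  %2:_(s32) = G_ADD %0, %1
  ; After a register bank has been assigned: type and bank.
  %2:gpr(s32) = G_ADD %0, %1
  ; After instruction selection: a register class, so a normal virtual register.
  %2:gpr32 = ADDWrr %0, %1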

Generic virtual registers can be used with all the virtual register APIs
provided by ``MachineRegisterInfo``. In particular, the def-use chain APIs can
be used without needing to distinguish them from non-generic virtual registers.

For simplicity, most generic instructions only accept virtual registers (both
generic and non-generic). There are some exceptions to this but in general:

* instead of immediates, they use a generic virtual register defined by an
  instruction that materializes the immediate value (see
  :ref:`irtranslator-constants`). Typically this is a ``G_CONSTANT`` or a
  ``G_FCONSTANT``. One example of an exception to this rule is
  ``G_SEXT_INREG``, where having an immediate is mandatory.
* instead of physical registers, they use a generic virtual register that is
  either defined by a ``COPY`` from the physical register or used by a ``COPY``
  that defines the physical register.
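
Both conventions, and the ``G_SEXT_INREG`` exception, can be seen in a small
hand-written fragment. The AArch64 register ``$w0``, the constant ``42`` and
the bit width ``8`` are assumptions picked for illustration only.

.. code-block:: none

  ; Physical register read through a COPY.
  %0:_(s32) = COPY $w0
  ; Immediate materialized into a generic virtual register.
  %1:_(s32) = G_CONSTANT i32 42
  %2:_(s32) = G_ADD %0, %1
  ; Exception: G_SEXT_INREG takes the bit width as an immediate operand.
  %3:_(s32) = G_SEXT_INREG %2, 8
  ; Physical register written through a COPY.
  $w0 = COPY %3(s32)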

.. admonition:: Historical Note

  We started with an alternative representation, where MRI tracks a size for
  each generic virtual register, and instructions have lists of types.
  That had two flaws: the type and size are redundant, and there was no generic
  way of getting a given operand's type (as there was no 1:1 mapping between
  instruction types and operands).
  We considered putting the type in some variant of MCInstrDesc instead (see
  `PR26576 <https://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
  need a type), but this increases the memory footprint of the related objects.

.. _gmir-regbank:

Register Bank
-------------

A Register Bank is a set of register classes defined by the target. This
definition is rather loose so let's talk about what they can achieve.

Suppose we have a processor that has two register files, A and B. These are
equal in every way and support the same instructions for the same cost. They're
just physically stored apart and each instruction can only access registers from
A or B but never a mix of the two. If we want to perform an operation on data
that's split between the two register files, we must first copy all the data
into a single register file.

Given a processor like this, we would benefit from clustering related data
together into one register file so that we minimize the cost of copying data
back and forth to satisfy the (possibly conflicting) requirements of all the
instructions. Register Banks are a means to constrain the register allocator to
use a particular register file for a virtual register.

In practice, register files A and B are rarely equal. They can typically store
the same data but there are usually some restrictions on what operations you can
do on each register file. A fairly common pattern is for one of them to be
accessible to integer operations and the other accessible to floating point
operations. To accommodate this, let's rename A and B to GPR (general purpose
registers) and FPR (floating point registers).

We now have some additional constraints that limit us. An operation like
``G_FMUL`` has to happen in FPR and ``G_ADD`` has to happen in GPR. However,
even though this prescribes a lot of the assignments we still have some freedom.
A ``G_LOAD`` can happen in both GPR and FPR, and which we want depends on who is
going to consume the loaded data. Similarly, ``G_FNEG`` can happen in both GPR
and FPR. If we assign it to FPR, then we'll use floating point negation.
However, if we assign it to GPR then we can equivalently ``G_XOR`` the sign bit
with 1 to invert it.
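
The two options for ``G_FNEG`` might look as follows. This is a hand-written
sketch that assumes a 32-bit value whose sign bit is bit 31, and banks named
``gpr`` and ``fpr``. The two alternatives are shown side by side and would not
appear together in one function.

.. code-block:: none

  ; Alternative 1, value lives in FPR: floating point negation.
  %1:fpr(s32) = G_FNEG %0

  ; Alternative 2, value lives in GPR: flip bit 31 with an integer XOR.
  ; -2147483648 is 0x80000000, i.e. only the sign bit set.
  %2:gpr(s32) = G_CONSTANT i32 -2147483648
  %1:gpr(s32) = G_XOR %0, %2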

In summary, Register Banks are a means of disambiguating between seemingly
equivalent choices based on some analysis of the differences when each choice
is applied in a given context.

To give some concrete examples:

AArch64

  AArch64 has three main banks. GPR is for integer operations. FPR is for
  floating point and also for the NEON vector instruction set. The third is CCR
  and describes the condition code register used for predication.

MIPS

  MIPS has five main banks of which many programs only really use one or two.
  GPR is the general purpose bank for integer operations. FGR or CP1 is for
  the floating point operations as well as the MSA vector instructions and a
  few other application specific extensions. CP0 is for system registers and
  few programs will use it. CP2 and CP3 are for any application specific
  coprocessors that may be present in the chip. Arguably, there is also a sixth
  for the LO and HI registers but these are only used for the result of a few
  operations and it's of questionable value to model distinctly from GPR.

X86

  X86 can be seen as having 3 main banks: general-purpose, x87, and
  vector (which could be further split into a bank per domain for single vs
  double precision instructions). There are arguably a few more potential
  banks, such as one for the AVX512 Mask Registers.

Register banks are described by a target-provided API,
:ref:`RegisterBankInfo <api-registerbankinfo>`.

.. _gmir-llt:

Low Level Type
--------------

Additionally, every generic virtual register has a type, represented by an
instance of the ``LLT`` class.

Like ``EVT``/``MVT``/``Type``, it has no distinction between unsigned and signed
integer types.  Furthermore, it also has no distinction between integer and
floating-point types: it mainly conveys absolutely necessary information, such
as size and number of vector lanes:

* ``sN`` for scalars
* ``pN`` for pointers
* ``<N x sM>`` for vectors

``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG.

Here are some LLT examples and their ``EVT`` and ``Type`` equivalents:

   =============  =========  ======================================
   LLT            EVT        IR Type
   =============  =========  ======================================
   ``s1``         ``i1``     ``i1``
   ``s8``         ``i8``     ``i8``
   ``s32``        ``i32``    ``i32``
   ``s32``        ``f32``    ``float``
   ``s17``        ``i17``    ``i17``
   ``s16``        N/A        ``{i8, i8}`` [#abi-dependent]_
   ``s32``        N/A        ``[4 x i8]`` [#abi-dependent]_
   ``p0``         ``iPTR``   ``i8*``, ``i32*``, ``%opaque*``
   ``p2``         ``iPTR``   ``i8 addrspace(2)*``
   ``<4 x s32>``  ``v4f32``  ``<4 x float>``
   ``s64``        ``v1f64``  ``<1 x double>``
   ``<3 x s32>``  ``v3i32``  ``<3 x i32>``
   =============  =========  ======================================


Rationale: instructions already encode a specific interpretation of types
(e.g., ``add`` vs. ``fadd``, or ``sdiv`` vs. ``udiv``).  Also encoding that
information in the type system requires introducing bitcast with no real
advantage for the selector.
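
A short hand-written fragment may make this concrete. Both operations below
read the same ``s32`` registers, and only the opcode decides whether the bits
are treated as an integer or as a float, so no bitcast is needed in between.
The AArch64 registers ``$w0`` and ``$w1`` are assumptions used only for this
sketch.

.. code-block:: none

  %0:_(s32) = COPY $w0
  %1:_(s32) = COPY $w1
  ; Integer interpretation of the two 32-bit values.
  %2:_(s32) = G_ADD %0, %1
  ; Floating point interpretation of the very same registers.
  %3:_(s32) = G_FADD %0, %1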

Pointer types are distinguished by address space.  This matches IR, as opposed
to SelectionDAG where address space is an attribute on operations.
This representation better supports pointers having different sizes depending
on their address space.

.. note::

  .. caution::

    Is this still true? I thought we'd removed the 1-element vector concept.
    Hypothetically, it could be distinct from a scalar but I think we failed to
    find a real occurrence.

  Currently, LLT requires at least 2 elements in vectors, but some targets have
  the concept of a '1-element vector'.  Representing them as their underlying
  scalar type is a nice simplification.

.. rubric:: Footnotes

.. [#abi-dependent] This mapping is ABI dependent. Here we've assumed no additional padding is required.

Generic Opcode Reference
------------------------

The Generic Opcodes that are available are described at :doc:`GenericOpcode`.