============================================================
Extending LLVM: Adding instructions, intrinsics, types, etc.
============================================================

Introduction and Warning
========================

During the course of using LLVM, you may wish to customize it for your research
project or for experimentation. At this point, you may realize that you need to
add something to LLVM, whether it be a new fundamental type, a new intrinsic
function, or a whole new instruction.

When you come to this realization, stop and think. Do you really need to extend
LLVM? Is it a new fundamental capability that LLVM does not support in its
current incarnation, or can it be synthesized from existing LLVM elements? If
you are not sure, ask on the `LLVM forums <https://discourse.llvm.org>`_. The
reason is that extending LLVM gets involved: you need to update every pass that
you intend to use with your extension, and there are *many* LLVM analyses and
transformations, so it can be quite a bit of work.

Adding an `intrinsic function`_ is far easier than adding an
instruction, and is transparent to optimization passes. If your added
functionality can be expressed as a function call, an intrinsic function is the
method of choice for LLVM extension.

Before you invest a significant amount of effort into a non-trivial extension,
**ask on the forums** if what you are looking to do can be done with
already-existing infrastructure, or if maybe someone else is already working on
it. You will save yourself a lot of time and effort by doing so.

.. _intrinsic function:

Adding a new intrinsic function
===============================

Adding a new intrinsic function to LLVM is much easier than adding a new
instruction. Almost all extensions to LLVM should start as an intrinsic
function and then be turned into an instruction if warranted.

#. ``llvm/docs/LangRef.rst``:

   Document the intrinsic. Decide whether it is code generator specific and
   what the restrictions are. Talk to other people about it so that you are
   sure it's a good idea.

#. ``llvm/include/llvm/IR/Intrinsics*.td``:

   Add an entry for your intrinsic. Describe its memory access
   characteristics for optimization (this controls whether it will be
   DCE'd, CSE'd, etc). If any arguments need to be immediates, these
   must be indicated with the ``ImmArg`` property. Note that any intrinsic
   using one of the ``llvm_any*_ty`` types for an argument or return
   type will be deemed by ``tblgen`` as overloaded and the
   corresponding suffix will be required on the intrinsic's name.

#. ``llvm/lib/Analysis/ConstantFolding.cpp``:

   If it is possible to constant fold your intrinsic, add support for it in the
   ``canConstantFoldCallTo`` and ``ConstantFoldCall`` functions.

#. ``llvm/test/*``:

   Add test cases for your intrinsic to the test suite.

Once the intrinsic has been added to the system, you must add code generator
support for it. Generally you must do the following steps:

Add support to the .td file for the target(s) of your choice in
``llvm/lib/Target/*/*.td``.

  This is usually a matter of adding a pattern to the .td file that matches the
  intrinsic, though it may also require adding the instructions you want to
  generate. There are lots of examples in the PowerPC and X86 backends
  to follow.

Adding a new SelectionDAG node
==============================

As with intrinsics, adding a new SelectionDAG node to LLVM is much easier than
adding a new instruction. New nodes are often added to help represent
instructions common to many targets. These nodes often map to an LLVM
instruction (add, sub) or intrinsic (byteswap, population count). In other
cases, new nodes have been added to allow many targets to perform a common task
(converting between floating point and integer representation) or capture more
complicated behavior in a single node (rotate).

#. ``include/llvm/CodeGen/ISDOpcodes.h``:

   Add an enum value for the new SelectionDAG node.

#. ``lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp``:

   Add code to print the node to ``getOperationName``. If your new node can be
   evaluated at compile time when given constant arguments (such as an add of a
   constant with another constant), find the ``getNode`` method that takes the
   appropriate number of arguments, and add a case for your node to the switch
   statement that performs constant folding for nodes that take the same number
   of arguments as your new node.

#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:

   Add code to `legalize, promote, and expand
   <CodeGenerator.html#selectiondag_legalize>`_ the node as necessary. At a
   minimum, you will need to add a case statement for your node in
   ``LegalizeOp`` which calls ``LegalizeOp`` on the node's operands, and
   returns a new node if any of the operands changed as a result of being
   legalized. It is likely that not all targets supported by the SelectionDAG
   framework will natively support the new node. In this case, you must also
   add code in your node's case statement in ``LegalizeOp`` to Expand your node
   into simpler, legal operations. The case for ``ISD::UREM`` for expanding a
   remainder into a divide, multiply, and a subtract is a good example.

#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:

   If targets may support the new node being added only at certain sizes, you
   will also need to add code to your node's case statement in ``LegalizeOp``
   to Promote your node's operands to a larger size, and perform the correct
   operation. You will also need to add code to ``PromoteOp`` to do this as
   well. For a good example, see ``ISD::BSWAP``, which promotes its operand to
   a wider size, performs the byteswap, and then shifts the correct bytes right
   to emulate the narrower byteswap in the wider type.

#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:

   Add a case for your node in ``ExpandOp`` to teach the legalizer how to
   perform the action represented by the new node on a value that has been split
   into high and low halves. This case will be used to support your node with a
   64-bit operand on a 32-bit target.

#. ``lib/CodeGen/SelectionDAG/DAGCombiner.cpp``:

   If your node can be combined with itself, or other existing nodes in a
   peephole-like fashion, add a visit function for it, and call that function
   from the combiner's dispatch code. There are several good examples for
   simple combines you can do; ``visitFABS`` and ``visitSRL`` are good starting
   places.

#. ``lib/Target/PowerPC/PPCISelLowering.cpp``:

   Each target has an implementation of the ``TargetLowering`` class, usually in
   its own file (although some targets include it in the same file as the
   DAGToDAGISel). The default behavior for a target is to assume that your new
   node is legal for all types that are legal for that target. If this target
   does not natively support your node, then tell the target to either Promote
   it (if it is supported at a larger type) or Expand it. This will cause the
   code you wrote in ``LegalizeOp`` above to decompose your new node into other
   legal nodes for this target.

#. ``include/llvm/Target/TargetSelectionDAG.td``:

   Most current targets supported by LLVM generate code using the DAGToDAG
   method, where SelectionDAG nodes are pattern matched to target-specific
   nodes, which represent individual instructions. In order for the targets to
   match an instruction to your new node, you must add a def for that node to
   the list in this file, with the appropriate type constraints. Look at
   ``add``, ``bswap``, and ``fadd`` for examples.

#. ``lib/Target/PowerPC/PPCInstrInfo.td``:

   Each target has a tablegen file that describes the target's instruction set.
   For targets that use the DAGToDAG instruction selection framework, add a
   pattern for your new node that uses one or more target nodes. Documentation
   for this is a bit sparse right now, but there are several decent examples.
   See the patterns for ``rotl`` in ``PPCInstrInfo.td``.

#. TODO: document complex patterns.

#. ``llvm/test/CodeGen/*``:

   Add test cases for your new node to the test suite.
   ``llvm/test/CodeGen/X86/bswap.ll`` is a good example.

Adding a new instruction
========================

.. warning::

  Adding instructions changes the bitcode format, and it will take some effort
  to maintain compatibility with the previous version. Only add an instruction
  if it is absolutely necessary.

#. ``llvm/include/llvm/IR/Instruction.def``:

   add a number for your instruction and an enum name

#. ``llvm/include/llvm/IR/Instructions.h``:

   add a definition for the class that will represent your instruction

#. ``llvm/include/llvm/IR/InstVisitor.h``:

   add a prototype for a visitor to your new instruction type

#. ``llvm/lib/AsmParser/LLLexer.cpp``:

   add a new token to parse your instruction from assembly text file

#. ``llvm/lib/AsmParser/LLParser.cpp``:

   add the grammar on how your instruction can be read and what it will
   construct as a result

#. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:

   add a case for your instruction and how it will be parsed from bitcode

#. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:

   add a case for your instruction and how it will be written to bitcode

#. ``llvm/lib/IR/Instruction.cpp``:

   add a case for how your instruction will be printed out to assembly

#. ``llvm/lib/IR/Instructions.cpp``:

   implement the class you defined in ``llvm/include/llvm/IR/Instructions.h``

#. Test your instruction

#. ``llvm/lib/Target/*``:

   add support for your instruction to code generators, or add a lowering pass.

#. ``llvm/test/*``:

   add your test cases to the test suite.

Also, you need to implement (or modify) any analyses or passes that you want to
understand this new instruction.

Adding a new type
=================

.. warning::

  Adding new types changes the bitcode format, and will break compatibility with
  currently-existing LLVM installations. Only add new types if it is absolutely
  necessary.

Adding a fundamental type
-------------------------

#. ``llvm/include/llvm/IR/Type.h``:

   add enum for the new type; add static ``Type*`` for this type

#. ``llvm/lib/IR/Type.cpp`` and ``llvm/lib/CodeGen/ValueTypes.cpp``:

   add mapping from ``TypeID`` => ``Type*``; initialize the static ``Type*``

#. ``llvm/include/llvm-c/Core.h`` and ``llvm/lib/IR/Core.cpp``:

   add enum ``LLVMTypeKind`` and modify
   ``LLVMTypeKind LLVMGetTypeKind(LLVMTypeRef Ty)`` for the new type

#. ``llvm/lib/AsmParser/LLLexer.cpp``:

   add ability to parse in the type from text assembly

#. ``llvm/lib/AsmParser/LLParser.cpp``:

   add a token for that type

#. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:

   modify ``void ModuleBitcodeWriter::writeTypeTable()`` to serialize your type

#. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:

   modify ``Error BitcodeReader::parseTypeTableBody()`` to read your data type

#. ``llvm/include/llvm/Bitcode/LLVMBitCodes.h``:

   add enum ``TypeCodes`` for the new type

Adding a derived type
---------------------

#. ``llvm/include/llvm/IR/Type.h``:

   add enum for the new type; add a forward declaration of the type also

#. ``llvm/include/llvm/IR/DerivedTypes.h``:

   add new class to represent the new type in the hierarchy; add forward
   declaration to the TypeMap value type

#. ``llvm/lib/IR/Type.cpp`` and ``llvm/lib/CodeGen/ValueTypes.cpp``:

   add support for derived type, notably ``enum TypeID`` and ``is``, ``get``
   methods.

#. ``llvm/include/llvm-c/Core.h`` and ``llvm/lib/IR/Core.cpp``:

   add enum ``LLVMTypeKind`` and modify
   ``LLVMTypeKind LLVMGetTypeKind(LLVMTypeRef Ty)`` for the new type

#. ``llvm/lib/AsmParser/LLLexer.cpp``:

   modify ``lltok::Kind LLLexer::LexIdentifier()`` to add ability to
   parse in the type from text assembly

#. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:

   modify ``void ModuleBitcodeWriter::writeTypeTable()`` to serialize your type

#. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:

   modify ``Error BitcodeReader::parseTypeTableBody()`` to read your data type

#. ``llvm/include/llvm/Bitcode/LLVMBitCodes.h``:

   add enum ``TypeCodes`` for the new type

#. ``llvm/lib/IR/AsmWriter.cpp``:

   modify ``void TypePrinting::print(Type *Ty, raw_ostream &OS)``
   to output the new derived type
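
The bitcode-related steps listed above (a new ``TypeCodes`` entry, a writer
case, and a matching reader case) only work if all three agree with each other.
The following is a minimal, self-contained sketch of that round trip. It does
not use LLVM's headers or its actual bitstream API: ``ToyTypeID``,
``ToyTypeCode``, ``ToyType``, and the toy ``writeTypeTable`` and
``parseTypeTable`` free functions are hypothetical stand-ins, and the
two-word record layout is an assumption chosen purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the in-memory type kinds (compare TypeID).
enum class ToyTypeID { Integer, Pointer, Fancy /* the newly added type */ };

// Hypothetical stand-in for the on-disk codes (compare TypeCodes).
// Existing values keep their numbers; the new code is appended at the end.
enum ToyTypeCode : std::uint64_t {
  TOY_CODE_INTEGER = 1,
  TOY_CODE_POINTER = 2,
  TOY_CODE_FANCY = 3 // appended for the new type
};

struct ToyType {
  ToyTypeID ID;
  std::uint64_t Param; // bit width, address space, or a type-specific payload
};

// Writer side: every type becomes a flat [code, param] record.
std::vector<std::uint64_t> writeTypeTable(const std::vector<ToyType> &Types) {
  std::vector<std::uint64_t> Out;
  for (const ToyType &T : Types) {
    switch (T.ID) {
    case ToyTypeID::Integer: Out.push_back(TOY_CODE_INTEGER); break;
    case ToyTypeID::Pointer: Out.push_back(TOY_CODE_POINTER); break;
    case ToyTypeID::Fancy:   Out.push_back(TOY_CODE_FANCY);   break;
    }
    Out.push_back(T.Param);
  }
  return Out;
}

// Reader side: returns false on a malformed table or an unknown code,
// leaving Out untouched in that case.
bool parseTypeTable(const std::vector<std::uint64_t> &Records,
                    std::vector<ToyType> &Out) {
  if (Records.size() % 2 != 0)
    return false;
  std::vector<ToyType> Parsed;
  for (std::size_t I = 0; I < Records.size(); I += 2) {
    ToyTypeID ID;
    switch (Records[I]) {
    case TOY_CODE_INTEGER: ID = ToyTypeID::Integer; break;
    case TOY_CODE_POINTER: ID = ToyTypeID::Pointer; break;
    case TOY_CODE_FANCY:   ID = ToyTypeID::Fancy;   break;
    default: return false; // a reader that predates the code rejects it
    }
    ToyType T = {ID, Records[I + 1]};
    Parsed.push_back(T);
  }
  Out = Parsed;
  return true;
}
```

In this sketch, a reader built before ``TOY_CODE_FANCY`` existed would fall
into the ``default`` case and reject the whole table, which illustrates why the
warning above says that adding a type breaks compatibility with
currently-existing installations.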