=================
TableGen BackEnds
=================

.. contents::
   :local:

Introduction
============

TableGen backends are at the core of TableGen's functionality. The source
files provide the classes and records that are parsed and end up as a
collection of record instances, but it's up to the backend to interpret and
print the records in a way that is meaningful to the user (normally a C++
include file or a textual list of warnings, options, and error messages).

TableGen is used by LLVM, Clang, and MLIR with very different goals.
LLVM uses it as a way to automate the generation of massive amounts of
information regarding instructions, schedules, cores, and architecture
features. Some backends generate output that is consumed by more than one
source file, so they need to be created in a way that makes it easy for
preprocessor tricks to be used. Some backends can also print C++ code
structures, so that they can be directly included as-is.

Clang, on the other hand, uses it mainly for diagnostic messages (errors,
warnings, tips) and attributes, so more on the textual end of the scale.

MLIR uses TableGen to define operations, operation dialects, and operation
traits.

See the :doc:`TableGen Programmer's Reference <./ProgRef>` for an in-depth
description of TableGen, and the :doc:`TableGen Backend Developer's Guide
<./BackGuide>` for a guide to writing a new backend.

LLVM BackEnds
=============

.. warning::
   This portion is incomplete. Each section below needs three subsections:
   description of its purpose with a list of users, output generated from
   generic input, and finally why it needed a new backend (in case there's
   something similar).

Overall, each backend will take the same TableGen file type and transform it
into similar output for different targets/uses. There is an implicit contract
between the TableGen files, the back-ends and their users.

For instance, a global contract is that each back-end produces macro-guarded
sections. Based on whether the file is included by a header or a source file,
or even in which context of each file the include is being used, you have
to define a macro just before including it, to get the right output:

.. code-block:: c++

  #define GET_REGINFO_TARGET_DESC
  #include "ARMGenRegisterInfo.inc"

And just part of the generated file would be included. This is useful if
you need the same information in multiple formats (instantiation, initialization,
getter/setter functions, etc.) from the same source TableGen file without having
to re-compile the TableGen file multiple times.

Sometimes, multiple macros might be defined before the same include file to
output multiple blocks:

.. code-block:: c++

  #define GET_REGISTER_MATCHER
  #define GET_SUBTARGET_FEATURE_NAME
  #define GET_MATCHER_IMPLEMENTATION
  #include "ARMGenAsmMatcher.inc"

The macros will be undef'd automatically as they're used, in the include file.

On all LLVM back-ends, the ``llvm-tblgen`` binary will be executed on the root
TableGen file ``<Target>.td``, which should include all others. This guarantees
that all information needed is accessible, and that no duplication is needed
in the TableGen files.

CodeEmitter
-----------

**Purpose**: CodeEmitterGen uses the descriptions of instructions and their fields to
construct an automated code emitter: a function that, given a MachineInstr,
returns the (currently, 32-bit unsigned) value of the instruction.

**Output**: C++ code, implementing the target's CodeEmitter
class by overriding the virtual functions as ``<Target>CodeEmitter::function()``.

**Usage**: Included directly at the end of ``<Target>MCCodeEmitter.cpp``.

RegisterInfo
------------

**Purpose**: This tablegen backend is responsible for emitting a description of a target
register file for a code generator.  It uses instances of the Register,
RegisterAliases, and RegisterClass classes to gather this information.

**Output**: C++ code with enums and structures representing the register mappings,
properties, masks, etc.

**Usage**: Included in both ``<Target>BaseRegisterInfo`` and
``<Target>MCTargetDesc`` (headers and source files), with macros selecting
which part is emitted in each place (declarations vs. initialization).

InstrInfo
---------

**Purpose**: This tablegen backend is responsible for emitting a description of the target
instruction set for the code generator. (what are the differences from CodeEmitter?)

**Output**: C++ code with enums and structures representing the instruction mappings,
properties, masks, etc.

**Usage**: Included in both ``<Target>BaseInstrInfo`` and
``<Target>MCTargetDesc`` (headers and source files), with macros selecting
which part is emitted in each place (declarations vs. initialization).

AsmWriter
---------

**Purpose**: Emits an assembly printer for the current target.

**Output**: Implementation of ``<Target>InstPrinter::printInstruction()``, among
other things.

**Usage**: Included directly into ``InstPrinter/<Target>InstPrinter.cpp``.

AsmMatcher
----------

**Purpose**: Emits a target specifier matcher for
converting parsed assembly operands into the MCInst structures. It also
emits a matcher for custom operand parsing. Extensive documentation is
available in the ``AsmMatcherEmitter.cpp`` file.

**Output**: Assembler parsers' matcher functions, declarations, etc.

**Usage**: Used in back-ends' ``AsmParser/<Target>AsmParser.cpp`` for
building the AsmParser class.

Disassembler
------------

**Purpose**: Contains disassembler table emitters for various
architectures. Extensive documentation is available in the
``DisassemblerEmitter.cpp`` file.

**Output**: Decoding tables, static decoding functions, etc.

**Usage**: Directly included in ``Disassembler/<Target>Disassembler.cpp``
to cater for all default decodings, after all hand-made ones.

PseudoLowering
--------------

**Purpose**: Generate pseudo instruction lowering.

**Output**: Implements ``<Target>AsmPrinter::emitPseudoExpansionLowering()``.

**Usage**: Included directly into ``<Target>AsmPrinter.cpp``.

CallingConv
-----------

**Purpose**: Responsible for emitting descriptions of the calling
conventions supported by this target.

**Output**: Implement static functions to deal with calling conventions
chained by matching styles, returning false on no match.

**Usage**: Used in ISelLowering and FastIsel as function pointers to
implementation returned by a CC selection function.

DAGISel
-------

**Purpose**: Generate a DAG instruction selector.

**Output**: Creates huge functions for automating DAG selection.

**Usage**: Included in ``<Target>ISelDAGToDAG.cpp`` inside the target's
implementation of ``SelectionDAGISel``.

DFAPacketizer
-------------

**Purpose**: This class parses the Schedule.td file and produces an API that
can be used to reason about whether an instruction can be added to a packet
on a VLIW architecture. The class internally generates a deterministic finite
automaton (DFA) that models all possible mappings of machine instructions
to functional units as instructions are added to a packet.

**Output**: Scheduling tables for VLIW back-ends (Hexagon, AMD).

**Usage**: Included directly in ``<Target>InstrInfo.cpp``.

FastISel
--------

**Purpose**: This tablegen backend emits code for use by the "fast"
instruction selection algorithm. See the comments at the top of
lib/CodeGen/SelectionDAG/FastISel.cpp for background. This file
scans through the target's tablegen instruction-info files
and extracts instructions with obvious-looking patterns, and it emits
code to look up these instructions by type and operator.

**Output**: Generates ``Predicate`` and ``FastEmit`` methods.

**Usage**: Implements private methods of the targets' implementation
of ``FastISel`` class.

Subtarget
---------

**Purpose**: Generate subtarget enumerations.

**Output**: Enums, globals, local tables for sub-target information.

**Usage**: Populates ``<Target>Subtarget`` and
``MCTargetDesc/<Target>MCTargetDesc`` files (both headers and source).

Intrinsic
---------

**Purpose**: Generate (target) intrinsic information.

OptParserDefs
-------------

**Purpose**: Print enum values for a class.

SearchableTables
----------------

**Purpose**: Generate custom searchable tables.

**Output**: Enums, global tables, and lookup helper functions.

**Usage**: This backend allows generating free-form, target-specific tables
from TableGen records. The ARM and AArch64 targets use this backend to generate
tables of system registers; the AMDGPU target uses it to generate meta-data
about complex image and memory buffer instructions.

See `SearchableTables Reference`_ for a detailed description.

CTags
-----

**Purpose**: This tablegen backend emits an index of definitions in ctags(1)
format. A helper script, utils/TableGen/tdtags, provides an easier-to-use
interface; run 'tdtags -H' for documentation.

X86EVEX2VEX
-----------

**Purpose**: This X86-specific tablegen backend emits tables that map
EVEX-encoded instructions to their identical VEX-encoded instructions.

Clang BackEnds
==============

ClangAttrClasses
----------------

**Purpose**: Creates Attrs.inc, which contains semantic attribute class
declarations for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
This file is included as part of ``Attr.h``.

ClangAttrParserStringSwitches
-----------------------------

**Purpose**: Creates AttrParserStringSwitches.inc, which contains
StringSwitch::Case statements for parser-related string switches. Each switch
is given its own macro (such as ``CLANG_ATTR_ARG_CONTEXT_LIST``, or
``CLANG_ATTR_IDENTIFIER_ARG_LIST``), which is expected to be defined before
including AttrParserStringSwitches.inc, and undefined after.

ClangAttrImpl
-------------

**Purpose**: Creates AttrImpl.inc, which contains semantic attribute class
definitions for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
This file is included as part of ``AttrImpl.cpp``.

ClangAttrList
-------------

**Purpose**: Creates AttrList.inc, which is used when a list of semantic
attribute identifiers is required. For instance, ``AttrKinds.h`` includes this
file to generate the list of ``attr::Kind`` enumeration values. This list is
separated out into multiple categories: attributes, inheritable attributes, and
inheritable parameter attributes. This categorization happens automatically
based on information in ``Attr.td`` and is used to implement the ``classof``
functionality required for ``dyn_cast`` and similar APIs.

ClangAttrPCHRead
----------------

**Purpose**: Creates AttrPCHRead.inc, which is used to deserialize attributes
in the ``ASTReader::ReadAttributes`` function.

ClangAttrPCHWrite
-----------------

**Purpose**: Creates AttrPCHWrite.inc, which is used to serialize attributes in
the ``ASTWriter::WriteAttributes`` function.

ClangAttrSpellings
------------------

**Purpose**: Creates AttrSpellings.inc, which is used to implement the
``__has_attribute`` feature test macro.

ClangAttrSpellingListIndex
--------------------------

**Purpose**: Creates AttrSpellingListIndex.inc, which is used to map parsed
attribute spellings (including which syntax or scope was used) to an attribute
spelling list index. These spelling list index values are internal
implementation details exposed via
``AttributeList::getAttributeSpellingListIndex``.

ClangAttrVisitor
----------------

**Purpose**: Creates AttrVisitor.inc, which is used when implementing
recursive AST visitors.

ClangAttrTemplateInstantiate
----------------------------

**Purpose**: Creates AttrTemplateInstantiate.inc, which implements the
``instantiateTemplateAttribute`` function, used when instantiating a template
that requires an attribute to be cloned.

ClangAttrParsedAttrList
-----------------------

**Purpose**: Creates AttrParsedAttrList.inc, which is used to generate the
``AttributeList::Kind`` parsed attribute enumeration.

ClangAttrParsedAttrImpl
-----------------------

**Purpose**: Creates AttrParsedAttrImpl.inc, which is used by
``AttributeList.cpp`` to implement several functions on the ``AttributeList``
class. This functionality is implemented via the ``AttrInfoMap ParsedAttrInfo``
array, which contains one element per parsed attribute object.

ClangAttrParsedAttrKinds
------------------------

**Purpose**: Creates AttrParsedAttrKinds.inc, which is used to implement the
``AttributeList::getKind`` function, mapping a string (and syntax) to a parsed
attribute ``AttributeList::Kind`` enumeration.

ClangAttrDump
-------------

**Purpose**: Creates AttrDump.inc, which dumps information about an attribute.
It is used to implement ``ASTDumper::dumpAttr``.

ClangDiagsDefs
--------------

Generate Clang diagnostics definitions.

ClangDiagGroups
---------------

Generate Clang diagnostic groups.

ClangDiagsIndexName
-------------------

Generate Clang diagnostic name index.

ClangCommentNodes
-----------------

Generate Clang AST comment nodes.

ClangDeclNodes
--------------

Generate Clang AST declaration nodes.

ClangStmtNodes
--------------

Generate Clang AST statement nodes.

ClangSACheckers
---------------

Generate Clang Static Analyzer checkers.

ClangCommentHTMLTags
--------------------

Generate efficient matchers for HTML tag names that are used in documentation comments.

ClangCommentHTMLTagsProperties
------------------------------

Generate efficient matchers for HTML tag properties.

ClangCommentHTMLNamedCharacterReferences
----------------------------------------

Generate function to translate named character references to UTF-8 sequences.

ClangCommentCommandInfo
-----------------------

Generate command properties for commands that are used in documentation comments.

ClangCommentCommandList
-----------------------

Generate list of commands that are used in documentation comments.

ArmNeon
-------

Generate arm_neon.h for clang.

ArmNeonSema
-----------

Generate ARM NEON sema support for clang.

ArmNeonTest
-----------

Generate ARM NEON tests for clang.

AttrDocs
--------

**Purpose**: Creates ``AttributeReference.rst`` from ``AttrDocs.td``, and is
used for documenting user-facing attributes.

General BackEnds
================

Print Records
-------------

The TableGen command option ``--print-records`` invokes a simple backend
that prints all the classes and records defined in the source files. This is
the default backend option. See the :doc:`TableGen Backend Developer's Guide
<./BackGuide>` for more information.

Print Detailed Records
----------------------

The TableGen command option ``--print-detailed-records`` invokes a backend
that prints all the global variables, classes, and records defined in the
source files, with more detail than the default record printer. See the
:doc:`TableGen Backend Developer's Guide <./BackGuide>` for more
information.

JSON Reference
--------------

**Purpose**: Output all the values in every ``def``, as a JSON data
structure that can be easily parsed by a variety of languages. Useful
for writing custom backends without having to modify TableGen itself,
or for performing auxiliary analysis on the same TableGen data passed
to a built-in backend.

**Output**:

The root of the output file is a JSON object (i.e. dictionary),
containing the following fixed keys:

* ``!tablegen_json_version``: a numeric version field that will
  increase if an incompatible change is ever made to the structure of
  this data. The format described here corresponds to version 1.

* ``!instanceof``: a dictionary whose keys are the class names defined
  in the TableGen input. For each key, the corresponding value is an
  array of strings giving the names of ``def`` records that derive
  from that class. So ``root["!instanceof"]["Instruction"]``, for
  example, would list the names of all the records deriving from the
  class ``Instruction``.

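As a minimal sketch of how a script might consume these root keys, the
following Python snippet parses a small hand-written sample and lists the
records that derive from ``Instruction``. The sample text, the record names
``ADD``, ``SUB``, and ``R0``, and the helper ``records_of_class`` are
assumptions made for illustration; they are not real ``llvm-tblgen`` output.

.. code-block:: python

  import json

  # Hand-written sample that imitates the documented layout (hypothetical data).
  SAMPLE = """
  {
    "!tablegen_json_version": 1,
    "!instanceof": {"Instruction": ["ADD", "SUB"], "Register": ["R0"]},
    "ADD": {"!name": "ADD"},
    "SUB": {"!name": "SUB"},
    "R0": {"!name": "R0"}
  }
  """

  def records_of_class(root, class_name):
      """Return the record objects of every def deriving from class_name."""
      names = root["!instanceof"].get(class_name, [])
      return [root[name] for name in names]

  root = json.loads(SAMPLE)

  # The layout described in this section corresponds to version 1.
  if root["!tablegen_json_version"] != 1:
      raise ValueError("unexpected JSON format version")

  instruction_names = [rec["!name"] for rec in records_of_class(root, "Instruction")]
  print(instruction_names)

Checking the version field first lets a script fail early if the structure
of the data ever changes incompatibly.
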
For each ``def`` record, the root object also has a key for the record
name. The corresponding value is a subsidiary object containing the
following fixed keys:

* ``!superclasses``: an array of strings giving the names of all the
  classes that this record derives from.

* ``!fields``: an array of strings giving the names of all the variables
  in this record that were defined with the ``field`` keyword.

* ``!name``: a string giving the name of the record. This is always
  identical to the key in the JSON root object corresponding to this
  record's dictionary. (If the record is anonymous, the name is
  arbitrary.)

* ``!anonymous``: a boolean indicating whether the record's name was
  specified by the TableGen input (if it is ``false``), or invented by
  TableGen itself (if ``true``).

* ``!locs``: an array of strings giving the source locations associated with
  this record. For records instantiated from a ``multiclass``, this gives the
  location of each ``def`` or ``defm``, starting with the inner-most
  ``multiclass``, and ending with the top-level ``defm``. Each string contains
  the file name and line number, separated by a colon.

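The per-record keys above can be used to filter and locate records. The
sketch below works on a hypothetical, already-parsed root object; the record
names, the file name ``Demo.td``, and the helpers ``named_records`` and
``parse_loc`` are illustrative assumptions. It also assumes that every root
key not starting with ``!`` names a ``def`` record, as described above.

.. code-block:: python

  def named_records(root):
      """Yield (name, record) for defs whose names came from the TableGen input."""
      for key, value in root.items():
          if key.startswith("!"):
              continue  # skip fixed root keys such as "!instanceof"
          if not value["!anonymous"]:
              yield key, value

  def parse_loc(loc):
      """Split a "file:line" string on its last colon."""
      path, _, line = loc.rpartition(":")
      return path, int(line)

  # Hypothetical data shaped like the parsed JSON root.
  root = {
      "!tablegen_json_version": 1,
      "!instanceof": {"Instruction": ["ADD", "anonymous_0"]},
      "ADD": {
          "!name": "ADD",
          "!anonymous": False,
          "!superclasses": ["Instruction"],
          "!fields": [],
          "!locs": ["Demo.td:12"],
      },
      "anonymous_0": {
          "!name": "anonymous_0",
          "!anonymous": True,
          "!superclasses": ["Instruction"],
          "!fields": [],
          "!locs": ["Demo.td:20"],
      },
  }

  for name, record in named_records(root):
      print(name, [parse_loc(loc) for loc in record["!locs"]])

Splitting on the last colon keeps file names that themselves contain a colon
(for example a drive letter) intact.
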
For each variable defined in a record, the ``def`` object for that
record also has a key for the variable name. The corresponding value
is a translation into JSON of the variable's value, using the
conventions described below.

Some TableGen data types are translated directly into the
corresponding JSON type:

* A completely undefined value (e.g. for a variable declared without
  initializer in some superclass of this record, and never initialized
  by the record itself or any other superclass) is emitted as the JSON
  ``null`` value.

* ``int`` and ``bit`` values are emitted as numbers. Note that
  TableGen ``int`` values are capable of holding integers too large to
  be exactly representable in IEEE double precision. The integer
  literal in the JSON output will show the full exact integer value.
  So if you need to retrieve large integers with full precision, you
  should use a JSON reader capable of translating such literals back
  into 64-bit integers without losing precision, such as Python's
  standard ``json`` module.

* ``string`` and ``code`` values are emitted as JSON strings.

* ``list<T>`` values, for any element type ``T``, are emitted as JSON
  arrays. Each element of the array is represented in turn using these
  same conventions.

* ``bits`` values are also emitted as arrays. A ``bits`` array is
  ordered from least-significant bit to most-significant. So the
  element with index ``i`` corresponds to the bit described as
  ``x{i}`` in TableGen source. However, note that this means that
  scripting languages are likely to *display* the array in the
  opposite order from the way it appears in the TableGen source or in
  the diagnostic ``-print-records`` output.

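The two points about large integers and ``bits`` ordering can be
illustrated with a short Python sketch. The field names ``Opcode`` and
``Mask`` and both helper functions are hypothetical; the sketch also assumes
every element of the ``bits`` array is a literal 0 or 1 and rejects anything
else rather than guessing.

.. code-block:: python

  import json

  def bits_to_int(bits):
      """Combine a bits array, where index i holds x{i}, into an integer."""
      value = 0
      for index, bit in enumerate(bits):
          if bit not in (0, 1):
              # Unset or symbolic bits are not plain numbers; refuse to guess.
              raise ValueError("bit %d is not a literal 0 or 1" % index)
          value |= bit << index
      return value

  def bits_in_source_order(bits):
      """Render the bits most-significant first, as TableGen source shows them."""
      return "".join(str(bit) for bit in reversed(bits))

  # Hypothetical record fragment: a 4-bit field and a large 64-bit integer.
  record = json.loads('{"Opcode": [0, 0, 1, 1], "Mask": 9223372036854775807}')

  opcode = bits_to_int(record["Opcode"])
  print(opcode, bits_in_source_order(record["Opcode"]))

  # Python's json module keeps the integer exact; a float round trip does not.
  exact = record["Mask"] == 2**63 - 1
  survives_float = int(float(record["Mask"])) == record["Mask"]
  print(exact, survives_float)

Here the array ``[0, 0, 1, 1]`` has ``x{2}`` and ``x{3}`` set, so it
represents the value 12 even though a naive left-to-right reading suggests
otherwise.
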
All other TableGen value types are emitted as a JSON object,
containing two standard fields: ``kind`` is a discriminator describing
which kind of value the object represents, and ``printable`` is a
string giving the same representation of the value that would appear
in ``-print-records``.

* A reference to a ``def`` object has ``kind=="def"``, and has an
  extra field ``def`` giving the name of the object referred to.

* A reference to another variable in the same record has
  ``kind=="var"``, and has an extra field ``var`` giving the name of
  the variable referred to.

* A reference to a specific bit of a ``bits``-typed variable in the
  same record has ``kind=="varbit"``, and has two extra fields:
  ``var`` gives the name of the variable referred to, and ``index``
  gives the index of the bit.

* A value of type ``dag`` has ``kind=="dag"``, and has two extra
  fields. ``operator`` gives the initial value after the opening
  parenthesis of the dag initializer; ``args`` is an array giving the
  following arguments. The elements of ``args`` are arrays of length
  2, giving the value of each argument followed by its colon-suffixed
  name (if any). For example, in the JSON representation of the dag
  value ``(Op 22, "hello":$foo)`` (assuming that ``Op`` is the name of
  a record defined elsewhere with a ``def`` statement):

  * ``operator`` will be an object in which ``kind=="def"`` and
    ``def=="Op"``

  * ``args`` will be the array ``[[22, null], ["hello", "foo"]]``.

* If any other kind of value or complicated expression appears in the
  output, it will have ``kind=="complex"``, and no additional fields.
  These values are not expected to be needed by backends. The standard
  ``printable`` field can be used to extract a representation of them
  in TableGen source syntax if necessary.

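One way a script might dispatch on the ``kind`` discriminator is sketched
below in Python. The ``render`` helper is hypothetical and only rebuilds a
short TableGen-like string; the standard ``printable`` field already carries
such a string, so the point here is the traversal of ``operator`` and
``args``, using the dag value from the example above.

.. code-block:: python

  import json

  def render(value):
      """Return a short TableGen-like string for a translated JSON value."""
      if isinstance(value, dict):
          kind = value["kind"]
          if kind in ("def", "var"):
              return value[kind]  # the extra field is named after the kind
          if kind == "varbit":
              return "%s{%d}" % (value["var"], value["index"])
          if kind == "dag":
              args = []
              for arg, name in value["args"]:
                  text = render(arg)
                  args.append(text if name is None else "%s:$%s" % (text, name))
              head = render(value["operator"])
              return "(%s)" % (head + (" " + ", ".join(args) if args else ""))
          # "complex" or anything unrecognized: fall back to the standard field.
          return value["printable"]
      if value is None:
          return "?"
      if isinstance(value, list):
          return "[%s]" % ", ".join(render(item) for item in value)
      if isinstance(value, str):
          return json.dumps(value)
      return str(value)

  # The dag value (Op 22, "hello":$foo) as described in the text.
  dag = {
      "kind": "dag",
      "operator": {"kind": "def", "def": "Op", "printable": "Op"},
      "args": [[22, None], ["hello", "foo"]],
      "printable": '(Op 22, "hello":$foo)',
  }
  print(render(dag))

Falling back to ``printable`` for unknown kinds keeps the script usable if
a value of ``kind=="complex"`` shows up.
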
SearchableTables Reference
--------------------------

A TableGen include file, ``SearchableTable.td``, provides classes for
generating C++ searchable tables. These tables are described in the
following sections. To generate the C++ code, run ``llvm-tblgen`` with the
``--gen-searchable-tables`` option, which invokes the backend that generates
the tables from the records you provide.

Each of the data structures generated for searchable tables is guarded by an
``#ifdef``. This allows you to include the generated ``.inc`` file and select only
certain data structures for inclusion. The examples below show the macro
names used in these guards.

Generic Enumerated Types
~~~~~~~~~~~~~~~~~~~~~~~~

The ``GenericEnum`` class makes it easy to define a C++ enumerated type and
the enumerated *elements* of that type. To define the type, define a record
whose parent class is ``GenericEnum`` and whose name is the desired enum
type. This class provides three fields, which you can set in the record
using the ``let`` statement.

* ``string FilterClass``. The enum type will have one element for each record
  that derives from this class. These records are collected to assemble the
  complete set of elements.

* ``string NameField``. The name of a field *in the collected records* that specifies
  the name of the element. If a record has no such field, the record's
  name will be used.

* ``string ValueField``. The name of a field *in the collected records* that
  specifies the numerical value of the element. If a record has no such
  field, it will be assigned an integer value. Values are assigned in
  alphabetical order starting with 0.

Here is an example where the values of the elements are specified
explicitly, as a template argument to the ``BEntry`` class. The resulting
C++ code is shown.

.. code-block:: text

  def BValues : GenericEnum {
    let FilterClass = "BEntry";
    let NameField = "Name";
    let ValueField = "Encoding";
  }

  class BEntry<bits<16> enc> {
    string Name = NAME;
    bits<16> Encoding = enc;
  }

  def BFoo   : BEntry<0xac>;
  def BBar   : BEntry<0x14>;
  def BZoo   : BEntry<0x80>;
  def BSnork : BEntry<0x4c>;

.. code-block:: text

  #ifdef GET_BValues_DECL
  enum BValues {
    BBar = 20,
    BFoo = 172,
    BSnork = 76,
    BZoo = 128,
  };
  #endif

In the following example, the values of the elements are assigned
automatically. Note that values are assigned from 0, in alphabetical order
by element name.

.. code-block:: text

  def CEnum : GenericEnum {
    let FilterClass = "CEnum";
  }

  class CEnum;

  def CFoo : CEnum;
  def CBar : CEnum;
  def CBaz : CEnum;

.. code-block:: text

  #ifdef GET_CEnum_DECL
  enum CEnum {
    CBar = 0,
    CBaz = 1,
    CFoo = 2,
  };
  #endif


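The naming and numbering rules can be modeled in a few lines of Python. This
is only a sketch of the behavior shown in the two examples above, not the
backend's implementation: records are represented as plain dictionaries with
a ``!name`` key, the helper ``enum_elements`` is hypothetical, and the case
where only some records carry the value field is not modeled faithfully.

.. code-block:: python

  def enum_elements(records, name_field=None, value_field=None):
      """Return (element name, value) pairs in the order of the listings above."""
      # Use the name field when a record has it, else the record's own name.
      named = sorted(
          ((rec.get(name_field, rec["!name"]), rec) for rec in records),
          key=lambda pair: pair[0],
      )
      elements = []
      for index, (name, rec) in enumerate(named):
          # Explicit value if present, otherwise the alphabetical position.
          value = rec[value_field] if value_field in rec else index
          elements.append((name, value))
      return elements

  b_records = [
      {"!name": "BFoo", "Name": "BFoo", "Encoding": 0xAC},
      {"!name": "BBar", "Name": "BBar", "Encoding": 0x14},
      {"!name": "BZoo", "Name": "BZoo", "Encoding": 0x80},
      {"!name": "BSnork", "Name": "BSnork", "Encoding": 0x4C},
  ]
  c_records = [{"!name": "CFoo"}, {"!name": "CBar"}, {"!name": "CBaz"}]

  b_values = enum_elements(b_records, name_field="Name", value_field="Encoding")
  c_values = enum_elements(c_records)
  print(b_values)
  print(c_values)

Both results line up with the generated ``BValues`` and ``CEnum``
enumerations shown above.
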
Generic Tables
~~~~~~~~~~~~~~

The ``GenericTable`` class is used to define a searchable generic table.
TableGen produces C++ code to define the table entries and also produces
the declaration and definition of a function to search the table based on a
primary key. To define the table, define a record whose parent class is
``GenericTable`` and whose name is the name of the global table of entries.
This class provides the following fields.

* ``string FilterClass``. The table will have one entry for each record
  that derives from this class.

* ``string FilterClassField``. The name of an optional field of
  ``FilterClass``, which should be of type ``bit``. If specified, only records
  in which this field is true have corresponding entries in the table. This
  field is not included in the generated C++ fields unless it is also listed
  in ``Fields``.

* ``string CppTypeName``. The name of the C++ struct/class type of the
  table that holds the entries. If unspecified, the ``FilterClass`` name is
  used.

* ``list<string> Fields``. A list of the names of the fields *in the
  collected records* that contain the data for the table entries. The order of
  this list determines the order of the values in the C++ initializers. See
  below for information about the types of these fields.

* ``list<string> PrimaryKey``. The list of fields that make up the
  primary key.

* ``string PrimaryKeyName``. The name of the generated C++ function
  that performs a lookup on the primary key.

* ``bit PrimaryKeyEarlyOut``. See the third example below.

* ``bit PrimaryKeyReturnRange``. When set to 1, modifies the lookup function's
  definition to return a range of results rather than a single pointer to the
  object. This feature proves useful when multiple objects meet the criteria
  specified by the lookup function. Currently, it is supported only for primary
  lookup functions. Refer to the second example below for further details.

TableGen attempts to deduce the type of each of the table fields so that it
can format the C++ initializers in the emitted table. It can deduce ``bit``,
``bits<n>``, ``string``, ``Intrinsic``, and ``Instruction``.  These can be
used in the primary key. Any other field types must be specified
explicitly; this is done as shown in the second example below. Such fields
cannot be used in the primary key.

One special case of the field type has to do with code. Arbitrary code is
represented by a string, but has to be emitted as a C++ initializer without
quotes. If the code field was defined using a code literal (``[{...}]``),
then TableGen will know to emit it without quotes. However, if it was
defined using a string literal or complex string expression, then TableGen
will not know. In this case, you can force TableGen to treat the field as
code by including the following line in the ``GenericTable`` record, where
*xxx* is the code field name.

.. code-block:: text

  string TypeOf_xxx = "code";

Here is an example where TableGen can deduce the field types. Note that the
table entry records are anonymous; the names of entry records are
irrelevant.

.. code-block:: text

  def ATable : GenericTable {
    let FilterClass = "AEntry";
    let FilterClassField = "IsNeeded";
    let Fields = ["Str", "Val1", "Val2"];
    let PrimaryKey = ["Val1", "Val2"];
    let PrimaryKeyName = "lookupATableByValues";
  }

  class AEntry<string str, int val1, int val2, bit isNeeded> {
    string Str = str;
    bits<8> Val1 = val1;
    bits<10> Val2 = val2;
    bit IsNeeded = isNeeded;
  }

  def : AEntry<"Bob",   5, 3, 1>;
  def : AEntry<"Carol", 2, 6, 1>;
  def : AEntry<"Ted",   4, 4, 1>;
  def : AEntry<"Alice", 4, 5, 1>;
  def : AEntry<"Costa", 2, 1, 1>;
  def : AEntry<"Dale",  2, 1, 0>;

Here is the generated C++ code. The declaration of ``lookupATableByValues``
is guarded by ``GET_ATable_DECL``, while the definitions are guarded by
``GET_ATable_IMPL``.

.. code-block:: text

  #ifdef GET_ATable_DECL
  const AEntry *lookupATableByValues(uint8_t Val1, uint16_t Val2);
  #endif

  #ifdef GET_ATable_IMPL
  constexpr AEntry ATable[] = {
    { "Costa", 0x2, 0x1 }, // 0
    { "Carol", 0x2, 0x6 }, // 1
    { "Ted", 0x4, 0x4 }, // 2
    { "Alice", 0x4, 0x5 }, // 3
    { "Bob", 0x5, 0x3 }, // 4
    /* { "Dale", 0x2, 0x1 }, // 5 */ // We don't generate this line as `IsNeeded` is 0.
  };

  const AEntry *lookupATableByValues(uint8_t Val1, uint16_t Val2) {
    struct KeyType {
      uint8_t Val1;
      uint16_t Val2;
    };
    KeyType Key = { Val1, Val2 };
    auto Table = ArrayRef(ATable);
    auto Idx = std::lower_bound(Table.begin(), Table.end(), Key,
      [](const AEntry &LHS, const KeyType &RHS) {
        if (LHS.Val1 < RHS.Val1)
          return true;
        if (LHS.Val1 > RHS.Val1)
          return false;
        if (LHS.Val2 < RHS.Val2)
          return true;
        if (LHS.Val2 > RHS.Val2)
          return false;
        return false;
      });

    if (Idx == Table.end() ||
        Key.Val1 != Idx->Val1 ||
        Key.Val2 != Idx->Val2)
      return nullptr;
    return &*Idx;
  }
  #endif

The table entries in ``ATable`` are sorted in order by ``Val1``, and within
each of those values, by ``Val2``. This allows a binary search of the table,
which is performed in the lookup function by ``std::lower_bound``. The
lookup function returns a pointer to the found table entry, or the null
pointer if no entry is found. If the table has a single primary key field
which is integral and densely numbered, a direct lookup is generated rather
than a binary search.

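The filter, sort, and binary-search steps can be modeled in Python as a
rough sketch. It mirrors the ``ATable`` example rather than the backend's
actual code; the helpers ``build_table`` and ``lookup`` are hypothetical, and
``bisect_left`` stands in for ``std::lower_bound``.

.. code-block:: python

  from bisect import bisect_left

  def build_table(records, fields, primary_key, filter_field=None):
      """Drop filtered-out records, sort by primary key, keep only `fields`."""
      rows = [r for r in records if filter_field is None or r[filter_field]]
      rows.sort(key=lambda r: tuple(r[k] for k in primary_key))
      return [{f: r[f] for f in fields} for r in rows]

  def lookup(table, primary_key, *key):
      """Return the entry whose primary key equals `key`, or None."""
      # Rebuilding the key list on each call keeps the sketch short; the
      # generated C++ searches the sorted table in place.
      keys = [tuple(row[k] for k in primary_key) for row in table]
      idx = bisect_left(keys, key)
      if idx == len(table) or keys[idx] != key:
          return None
      return table[idx]

  # The AEntry records from the example: (Str, Val1, Val2, IsNeeded).
  raw = [("Bob", 5, 3, 1), ("Carol", 2, 6, 1), ("Ted", 4, 4, 1),
         ("Alice", 4, 5, 1), ("Costa", 2, 1, 1), ("Dale", 2, 1, 0)]
  records = [dict(zip(("Str", "Val1", "Val2", "IsNeeded"), r)) for r in raw]

  pk = ["Val1", "Val2"]
  table = build_table(records, ["Str", "Val1", "Val2"], pk, filter_field="IsNeeded")
  order = [row["Str"] for row in table]
  print(order)
  print(lookup(table, pk, 4, 5), lookup(table, pk, 3, 3))

The resulting row order matches the generated ``ATable`` initializer, and a
key with no matching entry yields ``None``, the analogue of the null pointer.
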
830This example includes a field whose type TableGen cannot deduce. The ``Kind``
831field uses the enumerated type ``CEnum`` defined above. To inform TableGen
832of the type, the record derived from ``GenericTable`` must include a string field
833named ``TypeOf_``\ *field*, where *field* is the name of the field whose type
834is required.

.. code-block:: text

  def CTable : GenericTable {
    let FilterClass = "CEntry";
    let Fields = ["Name", "Kind", "Encoding"];
    string TypeOf_Kind = "CEnum";
    let PrimaryKey = ["Encoding"];
    let PrimaryKeyName = "lookupCEntryByEncoding";
  }

  class CEntry<string name, CEnum kind, int enc> {
    string Name = name;
    CEnum Kind = kind;
    bits<16> Encoding = enc;
  }

  def : CEntry<"Apple", CFoo, 10>;
  def : CEntry<"Pear",  CBaz, 15>;
  def : CEntry<"Apple", CBar, 13>;

Here is the generated C++ code.

.. code-block:: text

  #ifdef GET_CTable_DECL
  const CEntry *lookupCEntryByEncoding(uint16_t Encoding);
  #endif

  #ifdef GET_CTable_IMPL
  constexpr CEntry CTable[] = {
    { "Apple", CFoo, 0xA }, // 0
    { "Apple", CBar, 0xD }, // 1
    { "Pear", CBaz, 0xF }, // 2
  };

  const CEntry *lookupCEntryByEncoding(uint16_t Encoding) {
    struct KeyType {
      uint16_t Encoding;
    };
    KeyType Key = { Encoding };
    auto Table = ArrayRef(CTable);
    auto Idx = std::lower_bound(Table.begin(), Table.end(), Key,
      [](const CEntry &LHS, const KeyType &RHS) {
        if (LHS.Encoding < RHS.Encoding)
          return true;
        if (LHS.Encoding > RHS.Encoding)
          return false;
        return false;
      });

    if (Idx == Table.end() ||
        Key.Encoding != Idx->Encoding)
      return nullptr;
    return &*Idx;
  }
  #endif

In the above example, let's add one more record whose encoding is the same as
that of the record ``CEntry<"Pear",  CBaz, 15>``.

.. code-block:: text

  def CFoobar : CEnum;
  def : CEntry<"Banana", CFoobar, 15>;

Below is the newly generated ``CTable``:

.. code-block:: text

  #ifdef GET_CTable_IMPL
  constexpr CEntry CTable[] = {
    { "Apple", CFoo, 0xA }, // 0
    { "Apple", CBar, 0xD }, // 1
    { "Banana", CFoobar, 0xF }, // 2
    { "Pear", CBaz, 0xF }, // 3
  };

Because ``Banana`` sorts lexicographically before ``Pear``, the ``Banana``
record precedes the ``Pear`` record in ``CTable``. As a result, the
``lookupCEntryByEncoding`` function always returns a pointer to the ``Banana``
record, even in cases where the ``Pear`` record is the correct result. The
existing lookup function is insufficient in this scenario because it returns a
pointer to a single table entry, whereas multiple entries match the criteria
sought by the lookup. To make the lookup function return a range of results
instead, set ``PrimaryKeyReturnRange``.

.. code-block:: text

  def CTable : GenericTable {
    let FilterClass = "CEntry";
    let Fields = ["Name", "Kind", "Encoding"];
    string TypeOf_Kind = "CEnum";
    let PrimaryKey = ["Encoding"];
    let PrimaryKeyName = "lookupCEntryByEncoding";
    let PrimaryKeyReturnRange = true;
  }

Here is the modified lookup function.

.. code-block:: text

  llvm::iterator_range<const CEntry *> lookupCEntryByEncoding(uint16_t Encoding) {
    struct KeyType {
      uint16_t Encoding;
    };
    KeyType Key = {Encoding};
    struct Comp {
      bool operator()(const CEntry &LHS, const KeyType &RHS) const {
        if (LHS.Encoding < RHS.Encoding)
          return true;
        if (LHS.Encoding > RHS.Encoding)
          return false;
        return false;
      }
      bool operator()(const KeyType &LHS, const CEntry &RHS) const {
        if (LHS.Encoding < RHS.Encoding)
          return true;
        if (LHS.Encoding > RHS.Encoding)
          return false;
        return false;
      }
    };
    auto Table = ArrayRef(CTable);
    auto It = std::equal_range(Table.begin(), Table.end(), Key, Comp());
    return llvm::make_range(It.first, It.second);
  }

The new lookup function returns an iterator range whose begin iterator points
to the first matching entry and whose end iterator points one past the last
matching entry in the table. Note that support for emitting this modified
definition exists only for ``PrimaryKeyName``.
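
The following self-contained sketch shows how a caller might consume such a
range. It is an approximation under stated assumptions: ``std::pair`` stands in
for ``llvm::iterator_range`` so that the example builds without LLVM headers,
the ``Kind`` field is omitted, and ``CompareEncoding``,
``lookupRangeByEncoding``, and ``namesForEncoding`` are hypothetical names that
TableGen does not generate.

.. code-block:: c++

  #include <algorithm>
  #include <cassert>
  #include <cstdint>
  #include <iterator>
  #include <string>
  #include <utility>
  #include <vector>

  // Stand-in for the generated entry type; Kind is omitted for brevity.
  struct CEntry {
    const char *Name;
    std::uint16_t Encoding;
  };

  // Same ordering as the regenerated table with the duplicate encoding.
  constexpr CEntry CTable[] = {
    { "Apple", 0xA },  // 0
    { "Apple", 0xD },  // 1
    { "Banana", 0xF }, // 2
    { "Pear", 0xF },   // 3
  };

  // Heterogeneous comparator: std::equal_range compares in both
  // directions (entry < key and key < entry).
  struct CompareEncoding {
    bool operator()(const CEntry &LHS, std::uint16_t RHS) const {
      return LHS.Encoding < RHS;
    }
    bool operator()(std::uint16_t LHS, const CEntry &RHS) const {
      return LHS < RHS.Encoding;
    }
  };

  // Returns the half-open range [first, second) of matching entries.
  std::pair<const CEntry *, const CEntry *>
  lookupRangeByEncoding(std::uint16_t Encoding) {
    // std::equal_range requires the table to be sorted by the key.
    assert(std::is_sorted(std::begin(CTable), std::end(CTable),
                          [](const CEntry &A, const CEntry &B) {
                            return A.Encoding < B.Encoding;
                          }));
    return std::equal_range(std::begin(CTable), std::end(CTable), Encoding,
                            CompareEncoding());
  }

  // Collects the names of all entries that share the given encoding.
  std::vector<std::string> namesForEncoding(std::uint16_t Encoding) {
    std::vector<std::string> Names;
    auto Range = lookupRangeByEncoding(Encoding);
    for (const CEntry *I = Range.first; I != Range.second; ++I)
      Names.push_back(I->Name);
    return Names;
  }

Here ``namesForEncoding(0xF)`` yields both ``Banana`` and ``Pear``, so the
caller can choose between the candidates, and an encoding that is absent from
the table yields an empty range.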

The ``PrimaryKeyEarlyOut`` field, when set to 1, modifies the lookup
function so that it tests the first field of the primary key to determine
whether it is within the range of the collected records' primary keys. If
not, the function returns the null pointer without performing the binary
search. This is useful for tables that provide data for only some of the
elements of a larger enum-based space. The first field of the primary key
must be an integral type; it cannot be a string.

Adding ``let PrimaryKeyEarlyOut = 1`` to the ``ATable`` above:

.. code-block:: text

  def ATable : GenericTable {
    let FilterClass = "AEntry";
    let Fields = ["Str", "Val1", "Val2"];
    let PrimaryKey = ["Val1", "Val2"];
    let PrimaryKeyName = "lookupATableByValues";
    let PrimaryKeyEarlyOut = 1;
  }

causes the lookup function to change as follows:

.. code-block:: text

  const AEntry *lookupATableByValues(uint8_t Val1, uint16_t Val2) {
    if ((Val1 < 0x2) ||
        (Val1 > 0x5))
      return nullptr;

    struct KeyType {
    ...

We can construct two GenericTables with the same ``FilterClass``, so that they
select from the same overall set of records, but assign them different
``FilterClassField`` values so that they include different subsets of the
records of that class.

For example, we can create two tables that contain only even or odd records.
The ``IsEven`` and ``IsOdd`` fields are not emitted as C++ fields because they
are not listed in ``Fields``.

.. code-block:: text

  class EEntry<bits<8> value> {
    bits<8> Value = value;
    bit IsEven = !eq(!and(value, 1), 0);
    bit IsOdd = !not(IsEven);
  }

  foreach i = {1-10} in {
    def : EEntry<i>;
  }

  def EEntryEvenTable : GenericTable {
    let FilterClass = "EEntry";
    let FilterClassField = "IsEven";
    let Fields = ["Value"];
    let PrimaryKey = ["Value"];
    let PrimaryKeyName = "lookupEEntryEvenTableByValue";
  }

  def EEntryOddTable : GenericTable {
    let FilterClass = "EEntry";
    let FilterClassField = "IsOdd";
    let Fields = ["Value"];
    let PrimaryKey = ["Value"];
    let PrimaryKeyName = "lookupEEntryOddTableByValue";
  }

The generated tables are:

.. code-block:: text

  constexpr EEntry EEntryEvenTable[] = {
    { 0x2 }, // 0
    { 0x4 }, // 1
    { 0x6 }, // 2
    { 0x8 }, // 3
    { 0xA }, // 4
  };

  constexpr EEntry EEntryOddTable[] = {
    { 0x1 }, // 0
    { 0x3 }, // 1
    { 0x5 }, // 2
    { 0x7 }, // 3
    { 0x9 }, // 4
  };

Search Indexes
~~~~~~~~~~~~~~

The ``SearchIndex`` class is used to define additional lookup functions for
generic tables. To define an additional function, define a record whose parent
class is ``SearchIndex`` and whose name is the name of the desired lookup
function. This class provides three fields.

* ``GenericTable Table``. The name of the table that is to receive another
  lookup function.

* ``list<string> Key``. The list of fields that make up the secondary key.

* ``bit EarlyOut``. See the third example in `Generic Tables`_.

Here is an example of a secondary key added to the ``CTable`` above. The
generated function looks up entries based on the ``Name`` and ``Kind`` fields.

.. code-block:: text

  def lookupCEntry : SearchIndex {
    let Table = CTable;
    let Key = ["Name", "Kind"];
  }

This use of ``SearchIndex`` generates the following additional C++ code.

.. code-block:: text

  const CEntry *lookupCEntry(StringRef Name, unsigned Kind);

  ...

  const CEntry *lookupCEntry(StringRef Name, unsigned Kind) {
    struct IndexType {
      const char * Name;
      unsigned Kind;
      unsigned _index;
    };
    static const struct IndexType Index[] = {
      { "APPLE", CBar, 1 },
      { "APPLE", CFoo, 0 },
      { "PEAR", CBaz, 2 },
    };

    struct KeyType {
      std::string Name;
      unsigned Kind;
    };
    KeyType Key = { Name.upper(), Kind };
    auto Table = ArrayRef(Index);
    auto Idx = std::lower_bound(Table.begin(), Table.end(), Key,
      [](const IndexType &LHS, const KeyType &RHS) {
        int CmpName = StringRef(LHS.Name).compare(RHS.Name);
        if (CmpName < 0) return true;
        if (CmpName > 0) return false;
        if ((unsigned)LHS.Kind < (unsigned)RHS.Kind)
          return true;
        if ((unsigned)LHS.Kind > (unsigned)RHS.Kind)
          return false;
        return false;
      });

    if (Idx == Table.end() ||
        Key.Name != Idx->Name ||
        Key.Kind != Idx->Kind)
      return nullptr;
    return &CTable[Idx->_index];
  }

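The secondary-index technique can be approximated in standalone C++ as follows.
This is a sketch rather than the emitted code: ``std::string`` replaces
``StringRef``, the ``toUpper`` helper and the ``NameKindIndex`` array name are
hypothetical, and the ``CEnum`` enumerator order (``CBar``, ``CBaz``, ``CFoo``)
is an assumption chosen to match the ordering of the index shown above. Because
the key is upper-cased before the search, the lookup is case-insensitive with
respect to ``Name``.

.. code-block:: c++

  #include <algorithm>
  #include <cassert>
  #include <cctype>
  #include <cstddef>
  #include <cstring>
  #include <iterator>
  #include <string>

  // Assumed enumerator order; it matches the index ordering shown above.
  enum CEnum { CBar, CBaz, CFoo };

  struct CEntry {
    const char *Name;
    CEnum Kind;
    unsigned Encoding;
  };

  // Primary table, sorted by Encoding.
  constexpr CEntry CTable[] = {
    { "Apple", CFoo, 0xA }, // 0
    { "Apple", CBar, 0xD }, // 1
    { "Pear", CBaz, 0xF },  // 2
  };

  // Secondary index, sorted by (upper-cased Name, Kind). The Index field
  // holds the position of the entry in CTable.
  struct IndexType {
    const char *Name;
    CEnum Kind;
    unsigned Index;
  };

  constexpr IndexType NameKindIndex[] = {
    { "APPLE", CBar, 1 },
    { "APPLE", CFoo, 0 },
    { "PEAR", CBaz, 2 },
  };

  // Hypothetical std::string replacement for StringRef::upper().
  std::string toUpper(std::string S) {
    std::transform(S.begin(), S.end(), S.begin(), [](unsigned char C) {
      return static_cast<char>(std::toupper(C));
    });
    return S;
  }

  const CEntry *lookupCEntry(const std::string &Name, CEnum Kind) {
    struct KeyType {
      std::string Name;
      CEnum Kind;
    };
    KeyType Key = { toUpper(Name), Kind };
    // Binary search over the index, comparing Name first and then Kind.
    const IndexType *Idx = std::lower_bound(
        std::begin(NameKindIndex), std::end(NameKindIndex), Key,
        [](const IndexType &LHS, const KeyType &RHS) {
          int CmpName = std::strcmp(LHS.Name, RHS.Name.c_str());
          if (CmpName != 0)
            return CmpName < 0;
          return LHS.Kind < RHS.Kind;
        });

    if (Idx == std::end(NameKindIndex) || Key.Name != Idx->Name ||
        Key.Kind != Idx->Kind)
      return nullptr;
    // The index entry redirects to the row in the primary table.
    constexpr std::size_t Size = sizeof(CTable) / sizeof(CTable[0]);
    assert(Idx->Index < Size && "index entry points outside CTable");
    return &CTable[Idx->Index];
  }

In this sketch, ``lookupCEntry("apple", CFoo)`` returns ``&CTable[0]``, and a
name/kind pair that is not in the table, such as ``("Pear", CFoo)``, returns a
null pointer.
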