# MLIR Python Bindings

**Current status**: Under development and not enabled by default

[TOC]

## Building

### Pre-requisites

*   A relatively recent Python3 installation
*   Installation of python dependencies as specified in
    `mlir/python/requirements.txt`

### CMake variables

*   **`MLIR_ENABLE_BINDINGS_PYTHON`**`:BOOL`

    Enables building the Python bindings. Defaults to `OFF`.

*   **`Python3_EXECUTABLE`**`:STRING`

    Specifies the `python` executable used for the LLVM build, including for
    determining header/link flags for the Python bindings. On systems with
    multiple Python implementations, setting this explicitly to the preferred
    `python3` executable is strongly recommended.
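
As a small convenience, the two variables above can be derived from whichever
interpreter is currently active. The following is a minimal sketch; the helper
name `mlir_python_cmake_flags` is made up for this illustration, and only the
two CMake variable names come from this document.

```python
import sys


def mlir_python_cmake_flags(python_executable=None):
    """Return the CMake flags that enable the bindings for one interpreter."""
    # Default to the interpreter running this script, e.g. the active venv.
    executable = python_executable or sys.executable
    return [
        "-DMLIR_ENABLE_BINDINGS_PYTHON=ON",
        f"-DPython3_EXECUTABLE={executable}",
    ]


if __name__ == "__main__":
    # Print the flags so they can be pasted into a `cmake` invocation.
    print(" ".join(mlir_python_cmake_flags()))
```

Running the script inside the intended virtual environment makes it harder to
accidentally configure the build against a different Python.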

### Recommended development practices

It is recommended to use a Python virtual environment. Many ways exist for this,
but the following is the simplest:

```shell
# Make sure your 'python' is what you expect. Note that on multi-python
# systems, this may have a version suffix, and on many Linuxes and macOS where
# python2 and python3 co-exist, you may also want to use `python3`.
which python
python -m venv ~/.venv/mlirdev
source ~/.venv/mlirdev/bin/activate

# Note that many LTS distros will bundle a version of pip itself that is too
# old to download all of the latest binaries for certain platforms.
# The pip version can be obtained with `python -m pip --version`, and for
# Linux specifically, this should be cross checked with minimum versions
# here: https://github.com/pypa/manylinux
# It is recommended to upgrade pip:
python -m pip install --upgrade pip

# Now the `python` command will resolve to your virtual environment and
# packages will be installed there.
python -m pip install -r mlir/python/requirements.txt

# Now run `cmake`, `ninja`, et al.
```

For interactive use, it is sufficient to add the
`tools/mlir/python_packages/mlir_core/` directory in your `build/` directory to
the `PYTHONPATH`. Typically:

```shell
export PYTHONPATH=$(cd build && pwd)/tools/mlir/python_packages/mlir_core
```

Note that if you have installed (e.g. via `ninja install`), then Python
packages for all enabled projects will be in your install tree under
`python_packages/` (e.g. `python_packages/mlir_core`). Official distributions
are built with a more specialized setup.
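
To check that the `PYTHONPATH` setting took effect, one can ask Python where it
would import a package from without actually importing it. This is a minimal
sketch using only the standard library; the helper name `locate_package` is an
assumption of this example, not part of the bindings.

```python
import importlib.util


def locate_package(name="mlir"):
    """Return where `name` would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Regular packages report the path of their __init__.py in `origin`;
    # namespace packages have no single file, so fall back to their search path.
    if spec.origin is not None:
        return spec.origin
    return ", ".join(spec.submodule_search_locations or [])


if __name__ == "__main__":
    print(locate_package() or "mlir is not importable; check PYTHONPATH")
```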

## Design

### Use cases

There are likely two primary use cases for the MLIR python bindings:

1.  Support users who expect that an installed version of LLVM/MLIR will yield
    the ability to `import mlir` and use the API in a pure way out of the box.

1.  Downstream integrations will likely want to include parts of the API in
    their private namespace or specially built libraries, probably mixing it
    with other python native bits.

### Composable modules

In order to support use case \#2, the Python bindings are organized into
composable modules that downstream integrators can include and re-export into
their own namespace if desired. This forces several design points:

*   Separate the construction/populating of a `py::module` from the
    `PYBIND11_MODULE` global constructor.

*   Introduce headers for C++-only wrapper classes, as other related C++ modules
    will need to interop with them.

*   Separate any initialization routines that depend on optional components into
    their own module/dependency (currently, things like `registerAllDialects`
    fall into this category).

There are many related issues, such as shared library linkage and distribution
concerns, that affect such things. Organizing the code into composable
modules (versus a monolithic `cpp` file) allows the flexibility to address many
of these as needed over time. Also, compilation time for all of the template
meta-programming in pybind scales with the number of things you define in a
translation unit. Breaking into multiple translation units can significantly aid
compile times for APIs with a large surface area.

### Submodules

Generally, the C++ codebase namespaces most things into the `mlir` namespace.
However, in order to modularize and make the Python bindings easier to
understand, sub-packages are defined that map roughly to the directory structure
of functional units in MLIR.

Examples:

*   `mlir.ir`
*   `mlir.passes` (`pass` is a reserved word :( )
*   `mlir.dialect`
*   `mlir.execution_engine` (aside from namespacing, it is important that
    "bulky"/optional parts like this are isolated)

In addition, initialization functions that imply optional dependencies should be
in underscored (notionally private) modules such as `_init` and linked
separately. This allows downstream integrators to completely customize what is
included "in the box" and covers things like dialect registration, pass
registration, etc.

### Loader

LLVM/MLIR is a non-trivial python-native project that is likely to co-exist with
other non-trivial native extensions. As such, the native extension (i.e. the
`.so`/`.pyd`/`.dylib`) is exported as a notionally private top-level symbol
(`_mlir`), while a small set of Python code is provided in
`mlir/_cext_loader.py` and siblings which loads and re-exports it. This split
provides a place to stage code that needs to prepare the environment *before*
the shared library is loaded into the Python runtime, and also provides a place
that one-time initialization code can be invoked apart from module constructors.

It is recommended to avoid using `__init__.py` files to the extent possible,
until reaching a leaf package that represents a discrete component. The rule to
keep in mind is that the presence of an `__init__.py` file prevents the ability
to split anything at that level or below in the namespace into different
directories, deployment packages, wheels, etc.

See the documentation for more information and advice:
https://packaging.python.org/guides/packaging-namespace-packages/
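
The `__init__.py` rule can be observed with a throwaway experiment. The sketch
below is not taken from the MLIR sources: the package name `demo_ns` and the
leaf names are invented. It creates two separate directories that each
contribute one leaf package to the same namespace, and imports both.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_split_namespace():
    """Import two leaf packages of one namespace from two separate directories."""
    with tempfile.TemporaryDirectory() as tmp:
        roots = []
        for leaf in ("part_a", "part_b"):
            # Each root stands in for a separately deployed wheel or directory.
            root = Path(tmp) / f"dist_{leaf}"
            leaf_dir = root / "demo_ns" / leaf
            leaf_dir.mkdir(parents=True)
            # Only the leaf gets an __init__.py; `demo_ns` itself has none.
            (leaf_dir / "__init__.py").write_text(f"NAME = {leaf!r}\n")
            roots.append(str(root))

        sys.path[:0] = roots
        importlib.invalidate_caches()
        try:
            part_a = importlib.import_module("demo_ns.part_a")
            part_b = importlib.import_module("demo_ns.part_b")
            return part_a.NAME, part_b.NAME
        finally:
            # Undo the path change; the temporary files are removed on exit.
            for root in roots:
                sys.path.remove(root)
```

Both imports succeed because `demo_ns` has no `__init__.py` and is therefore
treated as a namespace package that may span several directories. Adding an
`__init__.py` to `demo_ns` in the first directory would pin the package to that
directory and make `demo_ns.part_b` unreachable.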

### Use the C-API

The Python APIs should seek to layer on top of the C-API to the degree possible.
Especially for the core, dialect-independent parts, such a binding enables
packaging decisions that would be difficult or impossible if spanning a C++ ABI
boundary. In addition, factoring in this way side-steps some very difficult
issues that arise when combining RTTI-based modules (which pybind derived things
are) with non-RTTI polymorphic C++ code (the default compilation mode of LLVM).

### Ownership in the Core IR

There are several top-level types in the core IR that are strongly owned by
their python-side reference:

*   `PyMlirContext` (`mlir.ir.Context`)
*   `PyModule` (`mlir.ir.Module`)
*   `PyOperation` (`mlir.ir.Operation`) - but with caveats

All other objects are dependent. All objects maintain a back-reference
(keep-alive) to their closest containing top-level object. Further, dependent
objects fall into two categories: a) uniqued (which live for the life-time of
the context) and b) mutable. Mutable objects need additional machinery for
keeping track of when the C++ instance that backs their Python object is no
longer valid (typically due to some specific mutation of the IR, deletion, or
bulk operation).
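
The keep-alive relationship can be pictured in plain Python. The classes below
are stand-ins invented for this sketch (they are not the binding classes); the
point is only that a dependent object holding a strong back-reference keeps its
top-level object alive after the user has dropped their own reference.

```python
import gc
import weakref


class ToyContext:
    """Stand-in for a top-level object owned by its Python reference."""


class ToyType:
    """Stand-in for a dependent object that keeps its context alive."""

    def __init__(self, context):
        # The back-reference is a strong reference to the top-level object.
        self.context = context


def demo_keep_alive():
    ctx = ToyContext()
    probe = weakref.ref(ctx)  # Observes the context without owning it.
    dependent = ToyType(ctx)

    del ctx  # Drop the only direct reference to the context.
    gc.collect()
    alive_while_referenced = probe() is not None

    del dependent  # Drop the last dependent object.
    gc.collect()
    alive_after_release = probe() is not None
    return alive_while_referenced, alive_after_release
```

This sketch does not model the extra validity tracking that mutable dependent
objects need.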

### Optionality and argument ordering in the Core IR

The following types support being bound to the current thread as a context
manager:

*   `PyLocation` (`loc: mlir.ir.Location = None`)
*   `PyInsertionPoint` (`ip: mlir.ir.InsertionPoint = None`)
*   `PyMlirContext` (`context: mlir.ir.Context = None`)

To keep function arguments composable, when these types appear as arguments
they should always come last, in the above order and with the given names (this
is generally the order in which they are expected to need to be expressed
explicitly in special cases). Each should carry a default value of `py::none()`
and use either a manual or automatic conversion (e.g. `DefaultingPyMlirContext`
or `DefaultingPyLocation`) that resolves to the explicit value or, failing
that, to the value from the thread context manager.

The rationale for this is that in Python, trailing keyword arguments to the
*right* are the most composable, enabling a variety of strategies such as kwarg
passthrough, default values, etc. Keeping function signatures composable
increases the chances that interesting DSLs and higher level APIs can be
constructed without a lot of exotic boilerplate.

Used consistently, this enables a style of IR construction that rarely needs to
use explicit contexts, locations, or insertion points but is free to do so when
extra control is needed.
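
The composability argument is easiest to see with kwarg passthrough. In the
following sketch both functions are hypothetical and plain dictionaries and
strings stand in for IR objects. Only the convention of trailing `loc`, `ip`
and `context` keyword arguments is taken from the text above.

```python
def create_op(name, operands=(), *, loc=None, ip=None, context=None):
    """Hypothetical low-level builder with the trailing optional arguments."""
    return {"name": name, "operands": tuple(operands),
            "loc": loc, "ip": ip, "context": context}


def create_add(lhs, rhs, **kwargs):
    """Hypothetical DSL helper: forwards loc/ip/context without naming them."""
    return create_op("toy.add", [lhs, rhs], **kwargs)


# Common case: nothing is spelled out, so the defaults (None) are forwarded.
implicit = create_add("a", "b")

# Special case: the location is made explicit and passes straight through.
explicit = create_add("a", "b", loc="f.mlir:42:1")
```

Because the optional arguments are always last and always carry the same
names, `create_add` does not need to know about them.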

#### Operation hierarchy

As mentioned above, `PyOperation` is special because it can exist in either a
top-level or dependent state. The life-cycle is unidirectional: operations can
be created detached (top-level) and once added to another operation, they are
then dependent for the remainder of their lifetime. The situation is more
complicated when considering construction scenarios where an operation is added
to a transitive parent that is still detached, necessitating further accounting
at such transition points (i.e. all such added children are initially added to
the IR with a parent of their outer-most detached operation, but then once it is
added to an attached operation, they need to be re-parented to the containing
module).

Due to the validity and parenting accounting needs, `PyOperation` is the owner
for regions and blocks and needs to be a top-level type that we can count on not
aliasing. This lets us do things like selectively invalidating instances when
mutations occur without worrying that there is some alias to the same operation
in the hierarchy. Operations are also the only entity that is allowed to be in
a detached state, and they are interned at the context level so that there is
never more than one Python `mlir.ir.Operation` object for a unique
`MlirOperation`, regardless of how it is obtained.
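
One way to picture context-level interning is a table that maps an underlying
handle to its single live Python wrapper. This is an illustrative sketch only,
not the actual implementation: the class names are invented, an integer stands
in for an `MlirOperation`, and the use of `weakref.WeakValueDictionary` is an
assumption of the example.

```python
import weakref


class ToyOperation:
    """Stand-in for the Python peer of one underlying operation handle."""

    def __init__(self, handle):
        self.handle = handle
        self.valid = True


class ToyContext:
    """Stand-in context that interns operation wrappers by handle."""

    def __init__(self):
        # Weak values: an entry disappears once no one uses the wrapper.
        self._live_operations = weakref.WeakValueDictionary()

    def operation_for(self, handle):
        op = self._live_operations.get(handle)
        if op is None:
            op = ToyOperation(handle)
            self._live_operations[handle] = op
        return op

    def invalidate(self, handle):
        # Because wrappers are unique, flagging this one reaches every user.
        op = self._live_operations.get(handle)
        if op is not None:
            op.valid = False


ctx = ToyContext()
first = ctx.operation_for(0x1000)
second = ctx.operation_for(0x1000)
ctx.invalidate(0x1000)
```

Here `first` and `second` are the same object, so invalidating the handle is
visible through both names. This is the property that makes selective
invalidation safe.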

The C/C++ API allows for Region/Block to also be detached, but it simplifies the
ownership model a lot to eliminate that possibility in this API, allowing the
Region/Block to be completely dependent on its owning operation for accounting.
The aliasing of Python `Region`/`Block` instances to underlying
`MlirRegion`/`MlirBlock` is considered benign and these objects are not interned
in the context (unlike operations).

If we ever want to re-introduce detached regions/blocks, we could do so with a
new "DetachedRegion" class or similar and also avoid the complexity of
accounting. With the way it is now, we can avoid having a global live list for
regions and blocks. We may end up needing an op-local one at some point (TBD),
depending on how hard it is to guarantee how mutations interact with their
Python peer objects. We can cross that bridge easily when we get there.

Module, when used purely from the Python API, can't alias anyway, so we can use
it as a top-level ref type without a live-list for interning. If the API ever
changes such that this cannot be guaranteed (e.g. by letting you marshal a
native-defined Module in), then there would need to be a live table for it too.

## User-level API

### Context Management

The bindings rely on Python
[context managers](https://docs.python.org/3/reference/datamodel.html#context-managers)
(`with` statements) to simplify creation and handling of IR objects by omitting
repeated arguments such as MLIR contexts, operation insertion points and
locations. A context manager sets up the default object to be used by all
binding calls within the following context and in the same thread. This default
can be overridden by specific calls through the dedicated keyword arguments.

#### MLIR Context

An MLIR context is a top-level entity that owns attributes and types and is
referenced from virtually all IR constructs. Contexts also provide thread safety
at the C++ level. In Python bindings, the MLIR context is also a Python context
manager, so one can write:

```python
from mlir.ir import Context, Module

with Context() as ctx:
  # IR construction using `ctx` as context.

  # For example, parsing an MLIR module from string requires the context.
  Module.parse("builtin.module {}")
```

IR objects referencing a context usually provide access to it through the
`.context` property. Most IR-constructing functions expect the context to be
provided in some form. In case of attributes and types, the context may be
extracted from the contained attribute or type. In case of operations, the
context is systematically extracted from Locations (see below). When the context
cannot be extracted from any argument, the bindings API expects the (keyword)
argument `context`. If it is not provided or set to `None` (default), it will be
looked up from an implicit stack of contexts maintained by the bindings in the
current thread and updated by context managers. If there is no surrounding
context, an error will be raised.
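
The lookup rule just described (explicit argument first, then the innermost
active context of the current thread, otherwise an error) can be modelled in a
few lines of plain Python. This is a sketch of the idea rather than the
bindings' implementation: `ToyContext` and `resolve_context` are invented
names, and raising `ValueError` is an assumption, since the text above does not
name the exception type.

```python
import threading

_state = threading.local()


def _stack():
    # Each thread lazily gets its own stack of active contexts.
    if not hasattr(_state, "stack"):
        _state.stack = []
    return _state.stack


class ToyContext:
    """Stand-in context that can be made the current default via `with`."""

    def __enter__(self):
        _stack().append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        _stack().pop()
        return False  # Do not swallow exceptions.


def resolve_context(context=None):
    """Return the explicit context, else the innermost active one."""
    if context is not None:
        return context
    if not _stack():
        raise ValueError("no context provided and none is active in this thread")
    return _stack()[-1]


standalone = ToyContext()
with ToyContext() as managed:
    from_manager = resolve_context()
    overridden = resolve_context(context=standalone)
```

Keeping the stack in `threading.local()` is what makes the default apply only
"in the same thread".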

Note that it is possible to manually specify the MLIR context both inside and
outside of the `with` statement:

```python
from mlir.ir import Context, Module

standalone_ctx = Context()
with Context() as managed_ctx:
  # Parse a module in managed_ctx.
  Module.parse("...")

  # Parse a module in standalone_ctx (override the context manager).
  Module.parse("...", context=standalone_ctx)

# Parse a module without using context managers.
Module.parse("...", context=standalone_ctx)
```

The context object remains live as long as there are IR objects referencing it.

#### Insertion Points and Locations

When constructing an MLIR operation, two pieces of information are required:

-   an *insertion point* that indicates where the operation is to be created in
    the IR region/block/operation structure (usually before or after another
    operation, or at the end of some block); it may be missing, in which case
    the operation is created in the *detached* state;
-   a *location* that contains user-understandable information about the source
    of the operation (for example, file/line/column information), which must
    always be provided as it carries a reference to the MLIR context.

Both can be provided using context managers or explicitly as the keyword
arguments `ip` and `loc` of the operation constructor, both within and outside
of a context manager.

```python
from mlir.ir import Context, InsertionPoint, Location, Module, Operation

with Context() as ctx:
  module = Module.create()

  # Prepare for inserting operations into the body of the module and indicate
  # that these operations originate in the "f.mlir" file at the given line and
  # column.
  with InsertionPoint(module.body), Location.file("f.mlir", line=42, col=1):
    # This operation will be inserted at the end of the module body and will
    # have the location set up by the context manager.
    Operation(<...>)

    # This operation will be inserted at the end of the module (and after the
    # previously constructed operation) and will have the location provided as
    # the keyword argument.
    Operation(<...>, loc=Location.file("g.mlir", line=1, col=10))

    # This operation will be inserted at the *beginning* of the block rather
    # than at its end.
    Operation(<...>, ip=InsertionPoint.at_block_begin(module.body))
```

Note that `Location` needs an MLIR context to be constructed. It can take the
context set up in the current thread by some surrounding context manager, or
accept it as an explicit argument:

```python
from mlir.ir import Context, Location

# Create a context and a location in this context in the same `with` statement.
with Context() as ctx, Location.file("f.mlir", line=42, col=1, context=ctx):
  pass
```

Locations are owned by the context and live as long as they are (transitively)
referenced from somewhere in Python code.

Unlike locations, the insertion point may be left unspecified (or, equivalently,
set to `None` or `False`) during operation construction. In this case, the
operation is created in the *detached* state, that is, it is not added into the
region of another operation and is owned by the caller. This is usually the case
for top-level operations that contain the IR, such as modules. Regions, blocks
and values contained in an operation point back to it and keep it alive.

### Inspecting IR Objects

Inspecting the IR is one of the primary tasks the Python bindings are designed
for. One can traverse the IR operation/region/block structure and inspect their
aspects such as operation attributes and value types.

#### Operations, Regions and Blocks

Operations are represented as either:

-   the generic `Operation` class, useful in particular for generic processing
    of unregistered operations; or
-   a specific subclass of `OpView` that provides more semantically-loaded
    accessors to operation properties.

Given an `OpView` subclass, one can obtain an `Operation` using its `.operation`
property. Given an `Operation`, one can obtain the corresponding `OpView` using
its `.opview` property *as long as* the corresponding class has been set up.
This typically means that the Python module of its dialect has been loaded. By
default, the `OpView` version is produced when navigating the IR tree.

One can check if an operation has a specific type by means of Python's
`isinstance` function:

```python
operation = <...>
opview = <...>
if isinstance(operation.opview, mydialect.MyOp):
  pass
if isinstance(opview, mydialect.MyOp):
  pass
```

The components of an operation can be inspected using its properties.

-   `attributes` is a collection of operation attributes. It can be subscripted
    as both dictionary and sequence, e.g., both `operation.attributes["value"]`
    and `operation.attributes[0]` will work. There is no guarantee on the order
    in which the attributes are traversed when iterating over the `attributes`
    property as sequence.
-   `operands` is a sequence collection of operation operands.
-   `results` is a sequence collection of operation results.
-   `regions` is a sequence collection of regions attached to the operation.

The objects produced by `operands` and `results` have a `.types` property that
contains a sequence collection of types of the corresponding values.

```python
from mlir.ir import Operation

operation1 = <...>
operation2 = <...>
if operation1.results.types == operation2.operands.types:
  pass
```

`OpView` subclasses for specific operations may provide leaner accessors to
properties of an operation. For example, named attributes, operands and results
are usually accessible as properties of the `OpView` subclass with the same
name, such as `operation.const_value` instead of
`operation.attributes["const_value"]`. If this name is a reserved Python
keyword, it is suffixed with an underscore.
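
The underscore rule can be reproduced with the standard `keyword` module. The
helper below is a hypothetical illustration of the naming convention, not the
code the bindings generator uses.

```python
import keyword


def accessor_name(ods_name):
    """Return a Python-safe property name for an ODS argument or result name."""
    # Reserved words such as `in` or `lambda` cannot be used as `obj.<name>`.
    return ods_name + "_" if keyword.iskeyword(ods_name) else ods_name
```

With this helper, `accessor_name("in")` gives `"in_"`, while
`accessor_name("const_value")` is returned unchanged.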

The operation itself is iterable, which provides access to the attached regions
in order:

```python
from mlir.ir import Operation

operation = <...>
for region in operation:
  do_something_with_region(region)
```

A region is conceptually a sequence of blocks. Objects of the `Region` class are
thus iterable, which provides access to the blocks. One can also use the
`.blocks` property.

```python
# Regions are directly iterable and give access to blocks.
for block1, block2 in zip(operation.regions[0], operation.regions[0].blocks):
  assert block1 == block2
```

A block contains a sequence of operations, and has several additional
properties. Objects of the `Block` class are iterable and provide access to the
operations contained in the block. So does the `.operations` property. Blocks
also have a list of arguments available as a sequence collection using the
`.arguments` property.

Blocks and regions belong to the parent operation in Python bindings and keep it
alive. This operation can be accessed using the `.owner` property.
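
Putting the three iteration levels together gives a recursive walk over the
whole IR tree. To stay runnable without an MLIR build, the sketch below uses
small stand-in classes (`ToyOperation`, `ToyRegion`, `ToyBlock`, all invented
here) that mimic only the iteration behaviour described above: operations
yield regions, regions yield blocks, blocks yield operations.

```python
class ToyBlock:
    def __init__(self, operations=()):
        self.operations = list(operations)

    def __iter__(self):  # Iterating a block yields its operations.
        return iter(self.operations)


class ToyRegion:
    def __init__(self, blocks=()):
        self.blocks = list(blocks)

    def __iter__(self):  # Iterating a region yields its blocks.
        return iter(self.blocks)


class ToyOperation:
    def __init__(self, name, regions=()):
        self.name = name
        self.regions = list(regions)

    def __iter__(self):  # Iterating an operation yields its regions.
        return iter(self.regions)


def walk(operation):
    """Yield `operation` and every nested operation in pre-order."""
    yield operation
    for region in operation:
        for block in region:
            for nested in block:
                yield from walk(nested)


ret = ToyOperation("func.return")
fn = ToyOperation("func.func", [ToyRegion([ToyBlock([ret])])])
module = ToyOperation("builtin.module", [ToyRegion([ToyBlock([fn])])])
names = [op.name for op in walk(module)]
```

The `walk` function relies only on the iteration protocol, so the same shape
of loop should carry over to the real classes, though that has not been checked
here against an actual build.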

#### Attributes and Types

Attributes and types are (mostly) immutable context-owned objects. They are
represented as either:

-   an opaque `Attribute` or `Type` object supporting printing and comparison;
    or
-   a concrete subclass thereof with access to properties of the attribute or
    type.

Given an `Attribute` or `Type` object, one can obtain a concrete subclass using
the constructor of the subclass. This may raise a `ValueError` if the attribute
or type is not of the expected subclass:

```python
from mlir.ir import Attribute, Type
from mlir.<dialect> import ConcreteAttr, ConcreteType

attribute = <...>
type = <...>
try:
  concrete_attr = ConcreteAttr(attribute)
  concrete_type = ConcreteType(type)
except ValueError as e:
  # Handle incorrect subclass.
  pass
```

In addition, concrete attribute and type classes provide a static `isinstance`
method to check whether an object of the opaque `Attribute` or `Type` type can
be downcast:

```python
from mlir.ir import Attribute, Type
from mlir.<dialect> import ConcreteAttr, ConcreteType

attribute = <...>
type = <...>

# No need to handle errors here.
if ConcreteAttr.isinstance(attribute):
  concrete_attr = ConcreteAttr(attribute)
if ConcreteType.isinstance(type):
  concrete_type = ConcreteType(type)
```

By default, and unlike operations, attributes and types are returned from IR
traversals using the opaque `Attribute` or `Type` that needs to be downcast.

Concrete attribute and type classes usually expose their properties as Python
read-only properties. For example, the elemental type of a tensor type can be
accessed using the `.element_type` property.

#### Values

MLIR has two kinds of values based on their defining object: block arguments and
operation results. Values are handled similarly to attributes and types. They
are represented as either:

-   a generic `Value` object; or
-   a concrete `BlockArgument` or `OpResult` object.

The former provides all the generic functionality such as comparison, type
access and printing. The latter provide access to the defining block or
operation and the position of the value within it. By default, the generic
`Value` objects are returned from IR traversals. Downcasting is implemented
through concrete subclass constructors, similarly to attributes and types:

```python
from mlir.ir import BlockArgument, OpResult, Value

value = ...

# Set `concrete` to the specific value subclass.
try:
  concrete = BlockArgument(value)
except ValueError:
  # This must not raise another ValueError as values are either block arguments
  # or op results.
  concrete = OpResult(value)
```

#### Interfaces

MLIR interfaces are a mechanism to interact with the IR without needing to know
specific types of operations but only some of their aspects. Operation
interfaces are available as Python classes with the same name as their C++
counterparts. Objects of these classes can be constructed from either:

-   an object of the `Operation` class or of any `OpView` subclass; in this
    case, all interface methods are available;
-   a subclass of `OpView` and a context; in this case, only the *static*
    interface methods are available as there is no associated operation.

In both cases, construction of the interface raises a `ValueError` if the
operation class does not implement the interface in the given context (or, for
operations, in the context that the operation is defined in). Similarly to
attributes and types, the MLIR context may be set up by a surrounding context
manager.

```python
from mlir.ir import Context, InferTypeOpInterface

with Context():
  op = <...>

  # Attempt to cast the operation into an interface.
  try:
    iface = InferTypeOpInterface(op)
  except ValueError:
    print("Operation does not implement InferTypeOpInterface.")
    raise

  # All methods are available on interface objects constructed from an Operation
  # or an OpView.
  iface.someInstanceMethod()

  # An interface object can also be constructed given an OpView subclass. It
  # also needs a context in which the interface will be looked up. The context
  # can be provided explicitly or set up by the surrounding context manager.
  try:
    iface = InferTypeOpInterface(some_dialect.SomeOp)
  except ValueError:
    print("SomeOp does not implement InferTypeOpInterface.")
    raise

  # Calling an instance method on an interface object constructed from a class
  # will raise TypeError.
  try:
    iface.someInstanceMethod()
  except TypeError:
    pass

  # One can still call static interface methods though.
  iface.inferOpReturnTypes(<...>)
```

If an interface object was constructed from an `Operation` or an `OpView`, they
are available as `.operation` and `.opview` properties of the interface object,
respectively.

Only a subset of operation interfaces are currently provided in Python bindings.
Attribute and type interfaces are not yet available in Python bindings.

### Creating IR Objects

Python bindings also support IR creation and manipulation.

#### Operations, Regions and Blocks

Operations can be created given a `Location` and an optional `InsertionPoint`.
It is often easier to use context managers to specify locations and insertion
points for several operations created in a row as described above.

Concrete operations can be created by using constructors of the corresponding
`OpView` subclasses. The generic, default form of the constructor accepts:

-   an optional sequence of types for operation results (`results`);
-   an optional sequence of values for operation operands, or another operation
    producing those values (`operands`);
-   an optional dictionary of operation attributes (`attributes`);
-   an optional sequence of successor blocks (`successors`);
-   the number of regions to attach to the operation (`regions`, default `0`);
-   the `loc` keyword argument containing the `Location` of this operation; if
    `None`, the location created by the closest context manager is used or an
    exception will be raised if there is no context manager;
-   the `ip` keyword argument indicating where the operation will be inserted in
    the IR; if `None`, the insertion point created by the closest context
    manager is used; if there is no surrounding context manager, the operation
    is created in the detached state.

Most operations will customize the constructor to accept a reduced list of
arguments that are relevant for the operation. For example, zero-result
operations may omit the `results` argument, so can the operations where the
result types can be derived from operand types unambiguously. As a concrete
example, `func.FuncOp` operations can be constructed by providing a function
name as string and its argument and result types as a tuple of sequences:

```python
from mlir.ir import Context, InsertionPoint, Location, Module
from mlir.dialects import func

with Context():
  module = Module.create()
  with InsertionPoint(module.body), Location.unknown():
    # Use a distinct name so that the `func` dialect module is not shadowed.
    main_func = func.FuncOp("main", ([], []))
```

Also see below for constructors generated from ODS.

Operations can also be constructed using the generic class and based on the
canonical string name of the operation using `Operation.create`. It accepts the
operation name as string, which must exactly match the canonical name of the
operation in C++ or ODS, followed by the same argument list as the default
constructor for `OpView`. *This form is discouraged* from use and is intended
for generic operation processing.

```python
from mlir.ir import (Context, FunctionType, InsertionPoint, Location, Module,
                     Operation, TypeAttr)
from mlir.dialects import func

with Context():
  module = Module.create()
  with InsertionPoint(module.body), Location.unknown():
    # Operations can be created in a generic way.
    func_op = Operation.create(
        "func.func", results=[], operands=[],
        attributes={"function_type":TypeAttr.get(FunctionType.get([], []))},
        successors=None, regions=1)
    # The result will be downcast to the concrete `OpView` subclass if
    # available.
    assert isinstance(func_op, func.FuncOp)
```

Regions are created for an operation when constructing it on the C++ side. They
are not constructible in Python and are not expected to exist outside of
operations (unlike in C++ that supports detached regions).

Blocks can be created within a given region and inserted before or after another
block of the same region using the `create_before()` and `create_after()`
methods of the `Block` class, or the `create_at_start()` static method of the
same class. They are not expected to exist outside of regions (unlike in C++
that supports detached blocks).

```python
from mlir.ir import Block, Context, Operation

with Context():
  op = Operation.create("generic.op", regions=1)

  # Create the first block in the region.
  entry_block = Block.create_at_start(op.regions[0])

  # Create further blocks.
  other_block = entry_block.create_after()
```

Blocks can be used to create `InsertionPoint`s, which can point to the beginning
or the end of the block, or just before its terminator. It is common for
`OpView` subclasses to provide a `.body` property that can be used to construct
an `InsertionPoint`. For example, builtin `Module` and `FuncOp` provide a
`.body` and `.add_entry_block()`, respectively.
699
700#### Attributes and Types
701
702Attributes and types can be created given a `Context` or another attribute or
703type object that already references the context. To indicate that they are owned
704by the context, they are obtained by calling the static `get` method on the
705concrete attribute or type class. These method take as arguments the data
706necessary to construct the attribute or type and a the keyword `context`
707argument when the context cannot be derived from other arguments.
708
709```python
710from mlir.ir import Context, F32Type, FloatAttr
711
712# Attribute and types require access to an MLIR context, either directly or
713# through another context-owned object.
714ctx = Context()
715f32 = F32Type.get(context=ctx)
716pi = FloatAttr.get(f32, 3.14)
717
718# They may use the context defined by the surrounding context manager.
719with Context():
720  f32 = F32Type.get()
721  pi = FloatAttr.get(f32, 3.14)
722```
723
724Some attributes provide additional construction methods for clarity.
725
726```python
727from mlir.ir import Context, IntegerAttr, IntegerType
728
729with Context():
730  i8 = IntegerType.get_signless(8)
731  IntegerAttr.get(i8, 42)
732```
733
Builtin attributes can often be constructed from Python types with similar
structure. For example, an `ArrayAttr` can be constructed from a sequence of
attributes, and a `DictAttr` can be constructed from a dictionary:
737
738```python
739from mlir.ir import ArrayAttr, Context, DictAttr, UnitAttr
740
741with Context():
742  array = ArrayAttr.get([UnitAttr.get(), UnitAttr.get()])
743  dictionary = DictAttr.get({"array": array, "unit": UnitAttr.get()})
744```
745
Custom builders for attributes to be used during operation creation can be
registered with the `register_attribute_builder` decorator. For example, the
following registers a custom builder for `I32Attr`:
749
750```python
751@register_attribute_builder("I32Attr")
752def _i32Attr(x: int, context: Context):
753  return IntegerAttr.get(
754        IntegerType.get_signless(32, context=context), x)
755```
756
This allows an op that takes an `I32Attr` to be created with

```python
foo.Op(30)
```

instead of

```python
foo.Op(IntegerAttr.get(IntegerType.get_signless(32, context=context), 30))
```

The registration is keyed on the ODS name, but the registry itself is
implemented in pure Python. Only a single custom builder may be registered per
ODS attribute type (e.g., `I32Attr` can have only one builder, although several
ODS attribute types may share the underlying `IntegerAttr` type).
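
The one-builder-per-name rule can be pictured with a small, self-contained
sketch. It does not use MLIR: `_BUILDERS`, `register_builder` and `build` are
hypothetical stand-ins written only to illustrate a registry keyed by ODS name,
and the `replace` flag is an assumption of this sketch. The actual
`register_attribute_builder` may differ in its details.

```python
# Stand-alone illustration; none of these names are part of the MLIR API.
_BUILDERS = {}


def register_builder(ods_name, replace=False):
  """Registers the decorated function as the builder for `ods_name`."""

  def decorator(fn):
    if ods_name in _BUILDERS and not replace:
      raise RuntimeError(f"builder for {ods_name} is already registered")
    _BUILDERS[ods_name] = fn
    return fn

  return decorator


def build(ods_name, value):
  """Looks up the builder by ODS name and applies it to a Python value."""
  return _BUILDERS[ods_name](value)


@register_builder("I32Attr")
def _i32_attr(x):
  # A tuple stands in for a real attribute object.
  return ("i32", int(x))


try:
  # A second registration under the same name is rejected.
  register_builder("I32Attr")(lambda x: ("i32", x))
  duplicate_rejected = False
except RuntimeError:
  duplicate_rejected = True
```

Keying the registry on the ODS name rather than on the Python type is what lets
a plain `int` be turned into the right attribute for a given op argument.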
773
774## Style
775
776In general, for the core parts of MLIR, the Python bindings should be largely
777isomorphic with the underlying C++ structures. However, concessions are made
778either for practicality or to give the resulting library an appropriately
779"Pythonic" flavor.
780
781### Properties vs get\*() methods
782
Generally favor converting trivial methods like `getContext()`, `getName()`,
`isEntryBlock()`, etc. to read-only Python properties (e.g. `context`). It is
primarily a matter of calling `def_property_readonly` vs `def` in binding code,
and makes things feel much nicer on the Python side.
787
788For example, prefer:
789
790```c++
791m.def_property_readonly("context", ...)
792```
793
794Over:
795
796```c++
797m.def("getContext", ...)
798```
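
From the Python side, a read-only property behaves roughly like a `@property`
without a setter. The following sketch uses a hypothetical `BlockLike` class
(not part of MLIR) to show what callers see: attribute-style access, and an
`AttributeError` on assignment.

```python
class BlockLike:
  """Hypothetical stand-in for a bound class; not part of MLIR."""

  def __init__(self, context, is_entry):
    self._context = context
    self._is_entry = is_entry

  @property
  def context(self):
    # Read-only: no setter is defined.
    return self._context

  @property
  def is_entry_block(self):
    return self._is_entry


block = BlockLike(context="ctx0", is_entry=True)
try:
  block.context = "other"
  assignment_allowed = True
except AttributeError:
  assignment_allowed = False
```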
799
### \_\_repr\_\_ methods
801
802Things that have nice printed representations are really great :) If there is a
803reasonable printed form, it can be a significant productivity boost to wire that
804to the `__repr__` method (and verify it with a [doctest](#sample-doctest)).
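
As a minimal sketch of this pairing, the hypothetical `NamedThing` class below
(a stand-in, not an MLIR class) defines `__repr__` and checks it with a doctest
using only the standard `doctest` module.

```python
import doctest


class NamedThing:
  """Hypothetical stand-in for a bound IR object.

  >>> NamedThing("entry")
  NamedThing(name='entry')
  """

  def __init__(self, name):
    self.name = name

  def __repr__(self):
    return f"NamedThing(name={self.name!r})"


# Collect and run the docstring examples of this one class.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(NamedThing, "NamedThing", module=False,
                        globs={"NamedThing": NamedThing}):
  runner.run(test)
```

After the loop, `runner.tries` holds the number of examples that ran and
`runner.failures` the number that did not match.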
805
806### CamelCase vs snake\_case
807
808Name functions/methods/properties in `snake_case` and classes in `CamelCase`. As
809a mechanical concession to Python style, this can go a long way to making the
810API feel like it fits in with its peers in the Python landscape.
811
812If in doubt, choose names that will flow properly with other
813[PEP 8 style names](https://pep8.org/#descriptive-naming-styles).
814
815### Prefer pseudo-containers
816
817Many core IR constructs provide methods directly on the instance to query count
818and begin/end iterators. Prefer hoisting these to dedicated pseudo containers.
819
820For example, a direct mapping of blocks within regions could be done this way:
821
822```python
823region = ...
824
for block in region:
  pass
828```
829
830However, this way is preferred:
831
832```python
833region = ...
834
for block in region.blocks:
  pass
838
839print(len(region.blocks))
840print(region.blocks[0])
841print(region.blocks[-1])
842```
843
844Instead of leaking STL-derived identifiers (`front`, `back`, etc), translate
845them to appropriate `__dunder__` methods and iterator wrappers in the bindings.
846
847Note that this can be taken too far, so use good judgment. For example, block
848arguments may appear container-like but have defined methods for lookup and
849mutation that would be hard to model properly without making semantics
850complicated. If running into these, just mirror the C/C++ API.
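
The pseudo-container idea can be sketched in plain Python. `BlockList` and
`RegionLike` below are hypothetical stand-ins (the real bindings implement this
in C++), and strings take the place of blocks. The point is only to show which
`__dunder__` methods make `len()`, indexing and iteration work.

```python
class BlockList:
  """Hypothetical read-only view over a region's blocks; not an MLIR class."""

  def __init__(self, blocks):
    self._blocks = list(blocks)

  def __len__(self):
    return len(self._blocks)

  def __getitem__(self, index):
    # Delegating to the list gives negative indices and IndexError for free.
    return self._blocks[index]

  def __iter__(self):
    return iter(self._blocks)


class RegionLike:
  """Hypothetical stand-in for a region exposing a `blocks` pseudo-container."""

  def __init__(self, blocks):
    self._blocks = blocks

  @property
  def blocks(self):
    return BlockList(self._blocks)


region = RegionLike(["entry", "loop", "exit"])
```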
851
852### Provide one stop helpers for common things
853
854One stop helpers that aggregate over multiple low level entities can be
855incredibly helpful and are encouraged within reason. For example, making
856`Context` have a `parse_asm` or equivalent that avoids needing to explicitly
857construct a SourceMgr can be quite nice. One stop helpers do not have to be
858mutually exclusive with a more complete mapping of the backing constructs.
859
860## Testing
861
862Tests should be added in the `test/Bindings/Python` directory and should
863typically be `.py` files that have a lit run line.
864
865We use `lit` and `FileCheck` based tests:
866
867*   For generative tests (those that produce IR), define a Python module that
868    constructs/prints the IR and pipe it through `FileCheck`.
869*   Parsing should be kept self-contained within the module under test by use of
870    raw constants and an appropriate `parse_asm` call.
871*   Any file I/O code should be staged through a tempfile vs relying on file
872    artifacts/paths outside of the test module.
873*   For convenience, we also test non-generative API interactions with the same
874    mechanisms, printing and `CHECK`ing as needed.
875
876### Sample FileCheck test
877
878```python
# RUN: %PYTHON %s | mlir-opt -split-input-file | FileCheck %s
880
881# TODO: Move to a test utility class once any of this actually exists.
882def print_module(f):
883  m = f()
884  print("// -----")
885  print("// TEST_FUNCTION:", f.__name__)
886  print(m.to_asm())
887  return f
888
889# CHECK-LABEL: TEST_FUNCTION: create_my_op
890@print_module
891def create_my_op():
892  m = mlir.ir.Module()
893  builder = m.new_op_builder()
894  # CHECK: mydialect.my_operation ...
895  builder.my_op()
896  return m
897```
898
899## Integration with ODS
900
901The MLIR Python bindings integrate with the tablegen-based ODS system for
902providing user-friendly wrappers around MLIR dialects and operations. There are
903multiple parts to this integration, outlined below. Most details have been
904elided: refer to the build rules and python sources under `mlir.dialects` for
905the canonical way to use this facility.
906
907Users are responsible for providing a `{DIALECT_NAMESPACE}.py` (or an equivalent
908directory with `__init__.py` file) as the entrypoint.
909
910### Generating `_{DIALECT_NAMESPACE}_ops_gen.py` wrapper modules
911
Each dialect with a mapping to python requires that an appropriate
`_{DIALECT_NAMESPACE}_ops_gen.py` wrapper module is created. This is done by
invoking `mlir-tblgen` on a python-bindings specific tablegen wrapper that
includes the boilerplate and the actual dialect specific `td` file. For example,
for the `Func` dialect (which is assigned the namespace `func` as a special
case):
917
918```tablegen
919#ifndef PYTHON_BINDINGS_FUNC_OPS
920#define PYTHON_BINDINGS_FUNC_OPS
921
922include "mlir/Dialect/Func/IR/FuncOps.td"
923
924#endif // PYTHON_BINDINGS_FUNC_OPS
925```
926
927In the main repository, building the wrapper is done via the CMake function
928`declare_mlir_dialect_python_bindings`, which invokes:
929
930```
931mlir-tblgen -gen-python-op-bindings -bind-dialect={DIALECT_NAMESPACE} \
932    {PYTHON_BINDING_TD_FILE}
933```
934
The generated op classes must be included in the `{DIALECT_NAMESPACE}.py` file
in much the same way that generated headers are included for C++ generated code:
937
938```python
939from ._my_dialect_ops_gen import *
940```
941
942### Extending the search path for wrapper modules
943
944When the python bindings need to locate a wrapper module, they consult the
945`dialect_search_path` and use it to find an appropriately named module. For the
946main repository, this search path is hard-coded to include the `mlir.dialects`
947module, which is where wrappers are emitted by the above build rule. Out of tree
948dialects can add their modules to the search path by calling:
949
950```python
951from mlir.dialects._ods_common import _cext
952_cext.globals.append_dialect_search_prefix("myproject.mlir.dialects")
953```
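
Conceptually, the lookup tries each prefix in turn until a module named after
the dialect namespace can be imported. The sketch below is an assumption-laden
illustration of that idea in plain Python: `search_prefixes`,
`append_search_prefix` and `find_dialect_module` are hypothetical names, and the
real search path lives in the native extension rather than in a Python list.

```python
import importlib

# Hypothetical stand-in for the search path; the real one lives in the
# native extension and is not a plain Python list.
search_prefixes = ["mlir.dialects"]


def append_search_prefix(prefix):
  search_prefixes.append(prefix)


def find_dialect_module(namespace):
  """Returns the first importable `<prefix>.<namespace>` module, else None."""
  for prefix in search_prefixes:
    try:
      return importlib.import_module(f"{prefix}.{namespace}")
    except ImportError:
      continue
  return None


# Standard-library packages stand in for a project's dialect package here.
append_search_prefix("xml")
found = find_dialect_module("dom")
```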
954
955### Wrapper module code organization
956
957The wrapper module tablegen emitter outputs:
958
959*   A `_Dialect` class (extending `mlir.ir.Dialect`) with a `DIALECT_NAMESPACE`
960    attribute.
961*   An `{OpName}` class for each operation (extending `mlir.ir.OpView`).
962*   Decorators for each of the above to register with the system.
963
964Note: In order to avoid naming conflicts, all internal names used by the wrapper
965module are prefixed by `_ods_`.
966
967Each concrete `OpView` subclass further defines several public-intended
968attributes:
969
*   `OPERATION_NAME` attribute with the `str` fully qualified operation name
    (e.g. `math.absf`).
*   An `__init__` method for the *default builder* if one is defined or inferred
    for the operation.
*   `@property` getter for each operand or result (using an auto-generated name
    for each unnamed one).
976*   `@property` getter, setter and deleter for each declared attribute.
977
978It further emits additional private-intended attributes meant for subclassing
979and customization (default cases omit these attributes in favor of the defaults
980on `OpView`):
981
982*   `_ODS_REGIONS`: A specification on the number and types of regions.
983    Currently a tuple of (min_region_count, has_no_variadic_regions). Note that
984    the API does some light validation on this but the primary purpose is to
985    capture sufficient information to perform other default building and region
986    accessor generation.
987*   `_ODS_OPERAND_SEGMENTS` and `_ODS_RESULT_SEGMENTS`: Black-box value which
988    indicates the structure of either the operand or results with respect to
989    variadics. Used by `OpView._ods_build_default` to decode operand and result
990    lists that contain lists.
991
992#### Default Builder
993
994Presently, only a single, default builder is mapped to the `__init__` method.
995The intent is that this `__init__` method represents the *most specific* of the
996builders typically generated for C++; however currently it is just the generic
997form below.
998
999*   One argument for each declared result:
1000    *   For single-valued results: Each will accept an `mlir.ir.Type`.
1001    *   For variadic results: Each will accept a `List[mlir.ir.Type]`.
1002*   One argument for each declared operand or attribute:
1003    *   For single-valued operands: Each will accept an `mlir.ir.Value`.
1004    *   For variadic operands: Each will accept a `List[mlir.ir.Value]`.
1005    *   For attributes, it will accept an `mlir.ir.Attribute`.
1006*   Trailing usage-specific, optional keyword arguments:
    *   `loc`: An explicit `mlir.ir.Location` to use. Defaults to the location
        bound to the thread (e.g. `with Location.unknown():`); it is an error if
        none is bound or specified.
    *   `ip`: An explicit `mlir.ir.InsertionPoint` to use. Defaults to the
        insertion point bound to the thread (e.g. `with InsertionPoint(...):`).
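
The fallback rule for `loc` (explicit argument first, then the value bound to
the current thread, otherwise an error) can be sketched without MLIR.
`BoundLocation` and `resolve_loc` are hypothetical names used only for this
illustration; the bindings keep their own per-thread state and may implement it
differently.

```python
import threading

_state = threading.local()


class BoundLocation:
  """Hypothetical location that binds itself to the current thread."""

  def __init__(self, name):
    self.name = name

  def __enter__(self):
    stack = getattr(_state, "stack", None)
    if stack is None:
      stack = _state.stack = []
    stack.append(self)
    return self

  def __exit__(self, exc_type, exc, tb):
    _state.stack.pop()
    return False


def resolve_loc(loc=None):
  """Mirrors the documented rule: explicit, else thread-bound, else error."""
  if loc is not None:
    return loc
  stack = getattr(_state, "stack", None)
  if not stack:
    raise ValueError("no location specified and none bound to the thread")
  return stack[-1]


with BoundLocation("outer") as outer:
  implicit = resolve_loc()
  explicit = resolve_loc(BoundLocation("explicit"))

try:
  resolve_loc()
  unbound_is_error = False
except ValueError:
  unbound_is_error = True
```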
1012
1013In addition, each `OpView` inherits a `build_generic` method which allows
1014construction via a (nested in the case of variadic) sequence of `results` and
1015`operands`. This can be used to get some default construction semantics for
1016operations that are otherwise unsupported in Python, at the expense of having a
1017very generic signature.
1018
1019#### Extending Generated Op Classes
1020
1021As mentioned above, the build system generates Python sources like
1022`_{DIALECT_NAMESPACE}_ops_gen.py` for each dialect with Python bindings. It is
1023often desirable to use these generated classes as a starting point for
1024further customization, so an extension mechanism is provided to make this easy.
1025This mechanism uses conventional inheritance combined with `OpView` registration.
1026For example, the default builder for `arith.constant`
1027
1028```python
1029class ConstantOp(_ods_ir.OpView):
1030  OPERATION_NAME = "arith.constant"
1031
1032  _ODS_REGIONS = (0, True)
1033
1034  def __init__(self, value, *, loc=None, ip=None):
1035    ...
1036```
1037
expects `value` to be a `TypedAttr` (e.g., `IntegerAttr` or `FloatAttr`).
Thus, a natural extension is a builder that accepts an MLIR type and a Python
value and instantiates the appropriate `TypedAttr`:
1040
1041```python
1042from typing import Union
1043
1044from mlir.ir import Type, IntegerAttr, FloatAttr
1045from mlir.dialects._arith_ops_gen import _Dialect, ConstantOp
1046from mlir.dialects._ods_common import _cext
1047
1048@_cext.register_operation(_Dialect, replace=True)
1049class ConstantOpExt(ConstantOp):
1050    def __init__(
1051        self, result: Type, value: Union[int, float], *, loc=None, ip=None
1052    ):
1053        if isinstance(value, int):
1054            super().__init__(IntegerAttr.get(result, value), loc=loc, ip=ip)
1055        elif isinstance(value, float):
1056            super().__init__(FloatAttr.get(result, value), loc=loc, ip=ip)
1057        else:
1058            raise NotImplementedError(f"Building `arith.constant` not supported for {result=} {value=}")
1059```
1060
1061which enables building an instance of `arith.constant` like so:
1062
1063```python
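from mlir.ir import Context, F32Type, InsertionPoint, IntegerType, Location, Module

# Ops need a context, a location and an insertion point to be built.
with Context(), Location.unknown():
    module = Module.create()
    with InsertionPoint(module.body):
        a = ConstantOpExt(F32Type.get(), 42.42)
        b = ConstantOpExt(IntegerType.get_signless(32), 42)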
1068```
1069
Note three key aspects of the extension mechanism in this example:

1. `ConstantOpExt` directly inherits from the generated `ConstantOp`;
2. in this simplest case, all that is required is a call to the superclass initializer, i.e., `super().__init__(...)`;
3. in order to register `ConstantOpExt` as the preferred `OpView` that is returned by `mlir.ir.Operation.opview` (see [Operations, Regions and Blocks](#operations-regions-and-blocks)),
   we decorate the class with `@_cext.register_operation(_Dialect, replace=True)`, **where `replace=True` must be used**.
1076
In some more complex cases it might be necessary to explicitly build the `OpView` through `OpView.build_generic` (see [Default Builder](#default-builder)), just as the generated builders do.
That is, we must call `OpView.build_generic` **and pass the result to `OpView.__init__`**; the complication is that the latter is already overridden by the generated builder.
Thus, we must call a method of the superclass' superclass (the "grandparent"); for example:
1080
1081```python
1082from mlir.dialects._scf_ops_gen import _Dialect, ForOp
1083from mlir.dialects._ods_common import _cext
1084
1085@_cext.register_operation(_Dialect, replace=True)
1086class ForOpExt(ForOp):
1087    def __init__(self, lower_bound, upper_bound, step, iter_args, *, loc=None, ip=None):
1088        ...
1089        super(ForOp, self).__init__(self.build_generic(...))
1090```
1091
where `OpView.__init__` is called via `super(ForOp, self).__init__`.
Note that there are alternative ways to implement this (e.g., explicitly writing `OpView.__init__`); see any discussion on Python inheritance.
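
The "grandparent" call is ordinary Python and can be tried without MLIR. In the
sketch below, `Base`, `Generated` and `Extended` are hypothetical classes playing
the roles of `OpView`, a generated op class and a hand-written extension; the
`build_generic` here only records its arguments and is not the real builder.

```python
class Base:
    """Plays the role of `OpView` in this hypothetical sketch."""

    def __init__(self, operation):
        self.operation = operation

    @classmethod
    def build_generic(cls, **kwargs):
        # Stand-in for the real builder: just records what was requested.
        return ("generic", cls.__name__, kwargs)


class Generated(Base):
    """Plays the role of a generated op class such as `ForOp`."""

    def __init__(self, lower, upper):
        super().__init__(self.build_generic(operands=[lower, upper]))


class Extended(Generated):
    """Plays the role of a hand-written extension such as `ForOpExt`."""

    def __init__(self, lower, upper, step):
        # Skip Generated.__init__ and call Base.__init__ directly.
        super(Generated, self).__init__(
            self.build_generic(operands=[lower, upper, step])
        )


op = Extended(0, 10, 1)
```

`super(Generated, self)` starts the method lookup after `Generated` in the
method resolution order, so `Generated.__init__` never runs even though
`Extended` still inherits everything else from it.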
1094
1095## Providing Python bindings for a dialect
1096
1097Python bindings are designed to support MLIR’s open dialect ecosystem. A dialect
1098can be exposed to Python as a submodule of `mlir.dialects` and interoperate with
1099the rest of the bindings. For dialects containing only operations, it is
1100sufficient to provide Python APIs for those operations. Note that the majority
1101of boilerplate APIs can be generated from ODS. For dialects containing
1102attributes and types, it is necessary to thread those through the C API since
1103there is no generic mechanism to create attributes and types. Passes need to be
1104registered with the context in order to be usable in a text-specified pass
1105manager, which may be done at Python module load time. Other functionality can
1106be provided, similar to attributes and types, by exposing the relevant C API and
1107building Python API on top.
1108
1109
1110### Operations
1111
1112Dialect operations are provided in Python by wrapping the generic
1113`mlir.ir.Operation` class with operation-specific builder functions and
1114properties. Therefore, there is no need to implement a separate C API for them.
1115For operations defined in ODS, `mlir-tblgen -gen-python-op-bindings
1116-bind-dialect=<dialect-namespace>` generates the Python API from the declarative
1117description.
1118It is sufficient to create a new `.td` file that includes the original ODS
1119definition and use it as source for the `mlir-tblgen` call.
1120Such `.td` files reside in
1121[`python/mlir/dialects/`](https://github.com/llvm/llvm-project/tree/main/mlir/python/mlir/dialects).
1122The results of `mlir-tblgen` are expected to produce a file named
1123`_<dialect-namespace>_ops_gen.py` by convention. The generated operation classes
1124can be extended as described above. MLIR provides [CMake
1125functions](https://github.com/llvm/llvm-project/blob/main/mlir/cmake/modules/AddMLIRPython.cmake)
1126to automate the production of such files. Finally, a
1127`python/mlir/dialects/<dialect-namespace>.py` or a
1128`python/mlir/dialects/<dialect-namespace>/__init__.py` file must be created and
1129filled with `import`s from the generated files to enable `import
1130mlir.dialects.<dialect-namespace>` in Python.
1131
1132
1133### Attributes and Types
1134
1135Dialect attributes and types are provided in Python as subclasses of the
1136`mlir.ir.Attribute` and `mlir.ir.Type` classes, respectively. Python APIs for
1137attributes and types must connect to the relevant C APIs for building and
1138inspection, which must be provided first. Bindings for `Attribute` and `Type`
1139subclasses can be defined using
1140[`include/mlir/Bindings/Python/PybindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/PybindAdaptors.h)
1141or
1142[`include/mlir/Bindings/Python/NanobindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h)
1143utilities that mimic pybind11/nanobind API for defining functions and
1144properties. These bindings are to be included in a separate module. The
1145utilities also provide automatic casting between C API handles `MlirAttribute`
1146and `MlirType` and their Python counterparts so that the C API handles can be
1147used directly in binding implementations. The methods and properties provided by
1148the bindings should follow the principles discussed above.
1149
1150The attribute and type bindings for a dialect can be located in
1151`lib/Bindings/Python/Dialect<Name>.cpp` and should be compiled into a separate
1152“Python extension” library placed in `python/mlir/_mlir_libs` that will be
1153loaded by Python at runtime. MLIR provides [CMake
1154functions](https://github.com/llvm/llvm-project/blob/main/mlir/cmake/modules/AddMLIRPython.cmake)
1155to automate the production of such libraries. This library should be `import`ed
1156from the main dialect file, i.e. `python/mlir/dialects/<dialect-namespace>.py`
1157or `python/mlir/dialects/<dialect-namespace>/__init__.py`, to ensure the types
1158are available when the dialect is loaded from Python.
1159
1160
1161### Passes
1162
1163Dialect-specific passes can be made available to the pass manager in Python by
1164registering them with the context and relying on the API for pass pipeline
1165parsing from string descriptions. This can be achieved by creating a new
1166pybind11 module, defined in `lib/Bindings/Python/<Dialect>Passes.cpp`, that
calls the registration C API, which must be provided first. For passes defined
declaratively using Tablegen, `mlir-tblgen -gen-pass-capi-header` and
`mlir-tblgen -gen-pass-capi-impl` automate the generation of the C API. The
pybind11 module must be compiled into a separate “Python extension” library,
which can be `import`ed from the main dialect file, i.e.
1172`python/mlir/dialects/<dialect-namespace>.py` or
1173`python/mlir/dialects/<dialect-namespace>/__init__.py`, or from a separate
1174`passes` submodule to be put in
1175`python/mlir/dialects/<dialect-namespace>/passes.py` if it is undesirable to
1176make the passes available along with the dialect.
1177
1178
1179### Other functionality
1180
Dialect functionality other than IR objects or passes, such as helper functions,
can be exposed to Python similarly to attributes and types. A C API is expected
to exist for this functionality, which can then be wrapped using pybind11 and
[`include/mlir/Bindings/Python/PybindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/PybindAdaptors.h),
or nanobind and
[`include/mlir/Bindings/Python/NanobindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h)
utilities to connect to the rest of the Python API. The bindings can be located
in a separate module or in the same module as attributes and types, and
loaded along with the dialect.
1190
1191## Free-threading (No-GIL) support
1192
Free-threading or no-GIL support refers to the CPython interpreter (>=3.13) built with the Global Interpreter Lock made optional. For details on the topic, please check [PEP-703](https://peps.python.org/pep-0703/) and this [Python free-threading guide](https://py-free-threading.github.io/).
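
To check which kind of interpreter is running, the standard library exposes both
the build-time setting and the runtime state. The sketch below is independent of
MLIR; `describe_gil` is a hypothetical helper name.

```python
import sys
import sysconfig


def describe_gil():
    """Returns (built_free_threaded, gil_enabled_now) for this interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in CPython 3.13; older versions always
    # run with the GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled_now = check() if check is not None else True
    return built_free_threaded, gil_enabled_now


built_free_threaded, gil_enabled_now = describe_gil()
```

On a free-threaded build the GIL can still be re-enabled at runtime, which is
why the two values are reported separately.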
1194
MLIR Python bindings are free-threading compatible, with the exceptions discussed below, in the following sense: it is safe to work in multiple threads with **independent** contexts. The following example shows safe usage:
1196
1197```python
1198# python3.13t example.py
1199import concurrent.futures
1200
1201import mlir.dialects.arith as arith
1202from mlir.ir import Context, Location, Module, IntegerType, InsertionPoint
1203
1204
1205def func(py_value):
1206    with Context() as ctx:
1207        module = Module.create(loc=Location.file("foo.txt", 0, 0))
1208
1209        dtype = IntegerType.get_signless(64)
1210        with InsertionPoint(module.body), Location.name("a"):
1211            arith.constant(dtype, py_value)
1212
1213    return module
1214
1215
1216num_workers = 8
1217with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
1218    futures = []
1219    for i in range(num_workers):
1220        futures.append(executor.submit(func, i))
1221    assert len(list(f.result() for f in futures)) == num_workers
1222```
1223
The exceptions to free-threading compatibility are:
1225- IR printing is unsafe, e.g. when using `PassManager` with `PassManager.enable_ir_printing()` which calls thread-unsafe `llvm::raw_ostream`.
1226- Usage of `Location.emit_error` is unsafe (due to thread-unsafe `llvm::raw_ostream`).
1227- Usage of `Module.dump` is unsafe (due to thread-unsafe `llvm::raw_ostream`).
1228- Usage of `mlir.dialects.transform.interpreter` is unsafe.
1229- Usage of `mlir.dialects.gpu` and `gpu-module-to-binary` is unsafe.