# MLIR Python Bindings

**Current status**: Under development and not enabled by default

[TOC]

## Building

### Pre-requisites

*   A relatively recent Python3 installation
*   Installation of python dependencies as specified in
    `mlir/python/requirements.txt`

### CMake variables

*   **`MLIR_ENABLE_BINDINGS_PYTHON`**`:BOOL`

    Enables building the Python bindings. Defaults to `OFF`.

*   **`Python3_EXECUTABLE`**`:STRING`

    Specifies the `python` executable used for the LLVM build, including for
    determining header/link flags for the Python bindings. On systems with
    multiple Python implementations, setting this explicitly to the preferred
    `python3` executable is strongly recommended.

### Recommended development practices

It is recommended to use a python virtual environment. Many ways exist for this,
but the following is the simplest:

```shell
# Make sure your 'python' is what you expect. Note that on multi-python
# systems, this may have a version suffix, and on many Linuxes and MacOS where
# python2 and python3 co-exist, you may also want to use `python3`.
which python
python -m venv ~/.venv/mlirdev
source ~/.venv/mlirdev/bin/activate

# Note that many LTS distros will bundle a version of pip itself that is too
# old to download all of the latest binaries for certain platforms.
# The pip version can be obtained with `python -m pip --version`, and for
# Linux specifically, this should be cross checked with minimum versions
# here: https://github.com/pypa/manylinux
# It is recommended to upgrade pip:
python -m pip install --upgrade pip

# Now the `python` command will resolve to your virtual environment and
# packages will be installed there.
python -m pip install -r mlir/python/requirements.txt

# Now run `cmake`, `ninja`, et al.
```

For interactive use, it is sufficient to add the
`tools/mlir/python_packages/mlir_core/` directory in your `build/` directory to
the `PYTHONPATH`. Typically:

```shell
export PYTHONPATH=$(cd build && pwd)/tools/mlir/python_packages/mlir_core
```

Note that if you have installed (e.g. via `ninja install`, et al), then python
packages for all enabled projects will be in your install tree under
`python_packages/` (e.g. `python_packages/mlir_core`). Official distributions
are built with a more specialized setup.

## Design

### Use cases

There are likely two primary use cases for the MLIR python bindings:

1.  Support users who expect that an installed version of LLVM/MLIR will yield
    the ability to `import mlir` and use the API in a pure way out of the box.

1.  Downstream integrations will likely want to include parts of the API in
    their private namespace or specially built libraries, probably mixing it
    with other python native bits.

### Composable modules

In order to support use case \#2, the Python bindings are organized into
composable modules that downstream integrators can include and re-export into
their own namespace if desired. This forces several design points:

*   Separate the construction/populating of a `py::module` from the
    `PYBIND11_MODULE` global constructor.

*   Introduce headers for C++-only wrapper classes, as other related C++
    modules will need to interoperate with them.

*   Separate any initialization routines that depend on optional components into
    their own module/dependency (currently, things like `registerAllDialects`
    fall into this category).

There are a lot of related issues of shared library linkage, distribution
concerns, etc. that affect such things. Organizing the code into composable
modules (versus a monolithic `cpp` file) allows the flexibility to address many
of these as needed over time.
Also, compilation time for all of the template
meta-programming in pybind scales with the number of things you define in a
translation unit. Breaking into multiple translation units can significantly aid
compile times for APIs with a large surface area.

### Submodules

Generally, the C++ codebase namespaces most things into the `mlir` namespace.
However, in order to modularize and make the Python bindings easier to
understand, sub-packages are defined that map roughly to the directory structure
of functional units in MLIR.

Examples:

*   `mlir.ir`
*   `mlir.passes` (`pass` is a reserved word :( )
*   `mlir.dialects`
*   `mlir.execution_engine` (aside from namespacing, it is important that
    "bulky"/optional parts like this are isolated)

In addition, initialization functions that imply optional dependencies should be
in underscored (notionally private) modules such as `_init` and linked
separately. This allows downstream integrators to completely customize what is
included "in the box" and covers things like dialect registration, pass
registration, etc.

### Loader

LLVM/MLIR is a non-trivial python-native project that is likely to co-exist with
other non-trivial native extensions. As such, the native extension (i.e. the
`.so`/`.pyd`/`.dylib`) is exported as a notionally private top-level symbol
(`_mlir`), while a small set of Python code is provided in
`mlir/_cext_loader.py` and siblings which loads and re-exports it. This split
provides a place to stage code that needs to prepare the environment *before*
the shared library is loaded into the Python runtime, and also provides a place
where one-time initialization code can be invoked apart from module
constructors.

It is recommended to avoid using `__init__.py` files to the extent possible,
until reaching a leaf package that represents a discrete component.
The rule to
keep in mind is that the presence of an `__init__.py` file makes it impossible
to split anything at that level or below in the namespace into different
directories, deployment packages, wheels, etc.

See the documentation for more information and advice:
https://packaging.python.org/guides/packaging-namespace-packages/

### Use the C-API

The Python APIs should seek to layer on top of the C-API to the degree possible.
Especially for the core, dialect-independent parts, such a binding enables
packaging decisions that would be difficult or impossible if spanning a C++ ABI
boundary. In addition, factoring in this way side-steps some very difficult
issues that arise when combining RTTI-based modules (which pybind derived things
are) with non-RTTI polymorphic C++ code (the default compilation mode of LLVM).

### Ownership in the Core IR

There are several top-level types in the core IR that are strongly owned by
their python-side reference:

*   `PyMlirContext` (`mlir.ir.Context`)
*   `PyModule` (`mlir.ir.Module`)
*   `PyOperation` (`mlir.ir.Operation`) - but with caveats

All other objects are dependent. All objects maintain a back-reference
(keep-alive) to their closest containing top-level object. Further, dependent
objects fall into two categories: a) uniqued (which live for the lifetime of
the context) and b) mutable. Mutable objects need additional machinery for
keeping track of when the C++ instance that backs their Python object is no
longer valid (typically due to some specific mutation of the IR, deletion, or
bulk operation).
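
The keep-alive rule can be illustrated with a toy model in plain Python. This is
only a sketch of the idea, not the bindings' implementation: the `Context` and
`Dependent` classes and the `invalidate` method below are hypothetical
stand-ins, and the real accounting lives in the C++ binding code.

```python
import gc
import weakref


class Context:
    """Hypothetical stand-in for a top-level object such as a context."""


class Dependent:
    """Hypothetical stand-in for a dependent, mutable IR object."""

    def __init__(self, owner):
        # Strong back-reference: the owner cannot be collected while this
        # dependent object is still reachable from Python.
        self._owner = owner
        self._valid = True

    def invalidate(self):
        # Would be called when the backing instance is erased or mutated away.
        self._valid = False

    @property
    def owner(self):
        if not self._valid:
            raise RuntimeError("the backing object is no longer valid")
        return self._owner


ctx = Context()
ctx_alive = weakref.ref(ctx)  # Observes liveness without extending it.
dep = Dependent(ctx)

del ctx
gc.collect()
owner_kept_alive = ctx_alive() is not None  # `dep` still holds the owner.

dep.invalidate()
try:
    dep.owner
    access_failed = False
except RuntimeError:
    access_failed = True

del dep
gc.collect()
owner_released = ctx_alive() is None  # The last dependent is gone.
```

The strong reference points from the dependent object to its owner, never the
other way around, so dropping every dependent object is enough to release the
top-level object.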

### Optionality and argument ordering in the Core IR

The following types support being bound to the current thread as a context
manager:

*   `PyLocation` (`loc: mlir.ir.Location = None`)
*   `PyInsertionPoint` (`ip: mlir.ir.InsertionPoint = None`)
*   `PyMlirContext` (`context: mlir.ir.Context = None`)

To keep function arguments composable, when these types appear as arguments
they should always come last, in the order and with the names given above (which
is generally the order in which they are expected to need to be stated
explicitly in special cases). Each should carry a default value of `py::none()`
and resolve, through either a manual or an automatic conversion, to the explicit
value if one is given and otherwise to the value from the thread context manager
(e.g. `DefaultingPyMlirContext` or `DefaultingPyLocation`).

The rationale for this is that in Python, trailing keyword arguments to the
*right* are the most composable, enabling a variety of strategies such as kwarg
passthrough, default values, etc. Keeping function signatures composable
increases the chances that interesting DSLs and higher-level APIs can be
constructed without a lot of exotic boilerplate.

Used consistently, this enables a style of IR construction that rarely needs to
use explicit contexts, locations, or insertion points but is free to do so when
extra control is needed.

#### Operation hierarchy

As mentioned above, `PyOperation` is special because it can exist in either a
top-level or dependent state. The life-cycle is unidirectional: operations can
be created detached (top-level) and once added to another operation, they are
then dependent for the remainder of their lifetime.
The situation is more
complicated when considering construction scenarios where an operation is added
to a transitive parent that is still detached, necessitating further accounting
at such transition points (i.e. all such added children are initially added to
the IR with a parent of their outer-most detached operation, but then once it is
added to an attached operation, they need to be re-parented to the containing
module).

Due to the validity and parenting accounting needs, `PyOperation` is the owner
for regions and blocks and needs to be a top-level type that we can count on not
aliasing. This lets us do things like selectively invalidating instances when
mutations occur without worrying that there is some alias to the same operation
in the hierarchy. Operations are also the only entities that are allowed to be
in a detached state, and they are interned at the context level so that there is
never more than one Python `mlir.ir.Operation` object for a unique
`MlirOperation`, regardless of how it is obtained.

The C/C++ API allows for Region/Block to also be detached, but it simplifies the
ownership model a lot to eliminate that possibility in this API, allowing the
Region/Block to be completely dependent on its owning operation for accounting.
The aliasing of Python `Region`/`Block` instances to underlying
`MlirRegion`/`MlirBlock` is considered benign and these objects are not interned
in the context (unlike operations).

If we ever want to re-introduce detached regions/blocks, we could do so with a
new "DetachedRegion" class or similar and also avoid the complexity of
accounting. With the way it is now, we can avoid having a global live list for
regions and blocks. We may end up needing an op-local one at some point (TBD),
depending on how hard it is to guarantee how mutations interact with their
Python peer objects. We can cross that bridge easily when we get there.
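
The interning of operations described above can be sketched with a weak-value
table. This is a toy model under stated assumptions, not the actual
implementation: the `Context.operation_for` and `Context.erase` methods are
hypothetical, and integer handles stand in for `MlirOperation` values.

```python
import weakref


class Operation:
    """Hypothetical Python peer of a native operation."""

    def __init__(self, handle):
        self.handle = handle
        self.valid = True


class Context:
    """Hypothetical context that interns the peers of its operations."""

    def __init__(self):
        # Weak values: the live table never keeps a peer alive by itself.
        self._live_operations = weakref.WeakValueDictionary()

    def operation_for(self, handle):
        # Return the existing peer if there is one, otherwise create it.
        op = self._live_operations.get(handle)
        if op is None:
            op = Operation(handle)
            self._live_operations[handle] = op
        return op

    def erase(self, handle):
        # Peers are unique, so invalidating one object reaches every holder.
        op = self._live_operations.pop(handle, None)
        if op is not None:
            op.valid = False


ctx = Context()
first = ctx.operation_for(0x1000)
second = ctx.operation_for(0x1000)
same_peer = first is second

ctx.erase(0x1000)
stale_everywhere = not first.valid and not second.valid
```

Because both lookups return the same object, a single invalidation is visible
through every reference, which is the property that makes selective
invalidation safe.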

Module, when used purely from the Python API, can't alias anyway, so we can use
it as a top-level ref type without a live-list for interning. If the API ever
changes such that this cannot be guaranteed (e.g. by letting you marshal a
native-defined Module in), then there would need to be a live table for it too.

## User-level API

### Context Management

The bindings rely on Python
[context managers](https://docs.python.org/3/reference/datamodel.html#context-managers)
(`with` statements) to simplify creation and handling of IR objects by omitting
repeated arguments such as MLIR contexts, operation insertion points and
locations. A context manager sets up the default object to be used by all
binding calls within the following context and in the same thread. This default
can be overridden by specific calls through the dedicated keyword arguments.

#### MLIR Context

An MLIR context is a top-level entity that owns attributes and types and is
referenced from virtually all IR constructs. Contexts also provide thread safety
at the C++ level. In Python bindings, the MLIR context is also a Python context
manager, so one can write:

```python
from mlir.ir import Context, Module

with Context() as ctx:
    # IR construction using `ctx` as context.

    # For example, parsing an MLIR module from string requires the context.
    Module.parse("builtin.module {}")
```

IR objects referencing a context usually provide access to it through the
`.context` property. Most IR-constructing functions expect the context to be
provided in some form. In case of attributes and types, the context may be
extracted from the contained attribute or type. In case of operations, the
context is systematically extracted from Locations (see below). When the context
cannot be extracted from any argument, the bindings API expects the (keyword)
argument `context`.
If it is not provided or set to `None` (default), it will be
looked up from an implicit stack of contexts maintained by the bindings in the
current thread and updated by context managers. If there is no surrounding
context, an error will be raised.

Note that it is possible to manually specify the MLIR context both inside and
outside of the `with` statement:

```python
from mlir.ir import Context, Module

standalone_ctx = Context()
with Context() as managed_ctx:
    # Parse a module in managed_ctx.
    Module.parse("...")

    # Parse a module in standalone_ctx (override the context manager).
    Module.parse("...", context=standalone_ctx)

# Parse a module without using context managers.
Module.parse("...", context=standalone_ctx)
```

The context object remains live as long as there are IR objects referencing it.

#### Insertion Points and Locations

When constructing an MLIR operation, two pieces of information are required:

-   an *insertion point* that indicates where the operation is to be created in
    the IR region/block/operation structure (usually before or after another
    operation, or at the end of some block); it may be missing, in which case
    the operation is created in the *detached* state;
-   a *location* that contains user-understandable information about the source
    of the operation (for example, file/line/column information), which must
    always be provided as it carries a reference to the MLIR context.

Both can be provided using context managers or explicitly as the keyword
arguments `ip` and `loc` of the operation constructor, both within and outside
of a context manager.

```python
from mlir.ir import Context, InsertionPoint, Location, Module, Operation

with Context() as ctx:
    # No location context manager is active yet, so pass one explicitly.
    module = Module.create(loc=Location.unknown())

    # Prepare for inserting operations into the body of the module and indicate
    # that these operations originate in the "f.mlir" file at the given line and
    # column.
    with InsertionPoint(module.body), Location.file("f.mlir", line=42, col=1):
        # This operation will be inserted at the end of the module body and will
        # have the location set up by the context manager.
        Operation(<...>)

        # This operation will be inserted at the end of the module (and after
        # the previously constructed operation) and will have the location
        # provided as the keyword argument.
        Operation(<...>, loc=Location.file("g.mlir", line=1, col=10))

        # This operation will be inserted at the *beginning* of the block rather
        # than at its end.
        Operation(<...>, ip=InsertionPoint.at_block_begin(module.body))
```

Note that `Location` needs an MLIR context to be constructed. It can take the
context set up in the current thread by some surrounding context manager, or
accept it as an explicit argument:

```python
from mlir.ir import Context, Location

# Create a context and a location in this context in the same `with` statement.
with Context() as ctx, Location.file("f.mlir", line=42, col=1, context=ctx):
    pass
```

Locations are owned by the context and live as long as they are (transitively)
referenced from somewhere in Python code.

Unlike locations, the insertion point may be left unspecified (or, equivalently,
set to `None` or `False`) during operation construction. In this case, the
operation is created in the *detached* state, that is, it is not added into the
region of another operation and is owned by the caller. This is usually the case
for top-level operations that contain the IR, such as modules.
Regions, blocks
and values contained in an operation point back to it and keep it alive.

### Inspecting IR Objects

Inspecting the IR is one of the primary tasks the Python bindings are designed
for. One can traverse the IR operation/region/block structure and inspect
aspects of it such as operation attributes and value types.

#### Operations, Regions and Blocks

Operations are represented as either:

-   the generic `Operation` class, useful in particular for generic processing
    of unregistered operations; or
-   a specific subclass of `OpView` that provides more semantically loaded
    accessors to operation properties.

Given an instance of an `OpView` subclass, one can obtain an `Operation` using
its `.operation` property. Given an `Operation`, one can obtain the
corresponding `OpView` using its `.opview` property *as long as* the
corresponding class has been set up. This typically means that the Python module
of its dialect has been loaded. By default, the `OpView` version is produced
when navigating the IR tree.

One can check if an operation has a specific type by means of Python's
`isinstance` function:

```python
operation = <...>
opview = <...>
if isinstance(operation.opview, mydialect.MyOp):
    pass
if isinstance(opview, mydialect.MyOp):
    pass
```

The components of an operation can be inspected using its properties.

-   `attributes` is a collection of operation attributes. It can be subscripted
    as both dictionary and sequence, e.g., both `operation.attributes["value"]`
    and `operation.attributes[0]` will work. There is no guarantee on the order
    in which the attributes are traversed when iterating over the `attributes`
    property as sequence.
-   `operands` is a sequence collection of operation operands.
-   `results` is a sequence collection of operation results.
-   `regions` is a sequence collection of regions attached to the operation.
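
To make the dual subscripting of `attributes` concrete, here is a toy container
in plain Python that accepts both names and positions. The `NamedAttributes`
class is hypothetical and only mimics the behavior described above; in
particular, returning a `(name, value)` pair for integer subscripts is an
assumption of this sketch rather than a statement about the bindings.

```python
class NamedAttributes:
    """Toy stand-in for the collection behind `operation.attributes`."""

    def __init__(self, named):
        # Keep (name, value) pairs so that both access styles share one store.
        self._items = list(named.items())

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        if isinstance(key, str):
            # Dictionary-style access by attribute name.
            for name, value in self._items:
                if name == key:
                    return value
            raise KeyError(key)
        # Sequence-style access by position (assumed to yield the pair).
        return self._items[key]


attrs = NamedAttributes({"value": 42, "sym_name": "main"})
by_name = attrs["value"]
by_position = attrs[0]
# Sort the names so that the result does not depend on traversal order.
names = sorted(name for name, _ in attrs)
```

Code that iterates over such a collection should not rely on the traversal
order, which is why the names are sorted before use here.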

The objects produced by `operands` and `results` have a `.types` property that
contains a sequence collection of types of the corresponding values.

```python
from mlir.ir import Operation

operation1 = <...>
operation2 = <...>
if operation1.results.types == operation2.operands.types:
    pass
```

`OpView` subclasses for specific operations may provide leaner accessors to
properties of an operation. For example, named attributes, operands and results
are usually accessible as properties of the `OpView` subclass with the same
name, such as `operation.const_value` instead of
`operation.attributes["const_value"]`. If this name is a reserved Python
keyword, it is suffixed with an underscore.

The operation itself is iterable, which provides access to the attached regions
in order:

```python
from mlir.ir import Operation

operation = <...>
for region in operation:
    do_something_with_region(region)
```

A region is conceptually a sequence of blocks. Objects of the `Region` class are
thus iterable, which provides access to the blocks. One can also use the
`.blocks` property.

```python
# Regions are directly iterable and give access to blocks.
for block1, block2 in zip(operation.regions[0], operation.regions[0].blocks):
    assert block1 == block2
```

A block contains a sequence of operations, and has several additional
properties. Objects of the `Block` class are iterable and provide access to the
operations contained in the block. So does the `.operations` property. Blocks
also have a list of arguments available as a sequence collection using the
`.arguments` property.

Blocks and regions belong to the parent operation in Python bindings and keep it
alive. This operation can be accessed using the `.owner` property.

#### Attributes and Types

Attributes and types are (mostly) immutable context-owned objects.
They are
represented as either:

-   an opaque `Attribute` or `Type` object supporting printing and comparison;
    or
-   a concrete subclass thereof with access to properties of the attribute or
    type.

Given an `Attribute` or `Type` object, one can obtain an instance of a concrete
subclass using the constructor of the subclass. This may raise a `ValueError` if
the attribute or type is not of the expected subclass:

```python
from mlir.ir import Attribute, Type
from mlir.<dialect> import ConcreteAttr, ConcreteType

attribute = <...>
type = <...>
try:
    concrete_attr = ConcreteAttr(attribute)
    concrete_type = ConcreteType(type)
except ValueError:
    # Handle incorrect subclass.
    pass
```

In addition, concrete attribute and type classes provide a static `isinstance`
method to check whether an object of the opaque `Attribute` or `Type` type can
be downcast:

```python
from mlir.ir import Attribute, Type
from mlir.<dialect> import ConcreteAttr, ConcreteType

attribute = <...>
type = <...>

# No need to handle errors here.
if ConcreteAttr.isinstance(attribute):
    concrete_attr = ConcreteAttr(attribute)
if ConcreteType.isinstance(type):
    concrete_type = ConcreteType(type)
```

By default, and unlike operations, attributes and types are returned from IR
traversals using the opaque `Attribute` or `Type` that needs to be downcast.

Concrete attribute and type classes usually expose their properties as Python
read-only properties. For example, the element type of a tensor type can be
accessed using the `.element_type` property.

#### Values

MLIR has two kinds of values based on their defining object: block arguments and
operation results. Values are handled similarly to attributes and types. They
are represented as either:

-   a generic `Value` object; or
-   a concrete `BlockArgument` or `OpResult` object.

The former provides all the generic functionality such as comparison, type
access and printing. The latter provide access to the defining block or
operation and the position of the value within it. By default, the generic
`Value` objects are returned from IR traversals. Downcasting is implemented
through concrete subclass constructors, similarly to attributes and types:

```python
from mlir.ir import BlockArgument, OpResult, Value

value = ...

# Set `concrete` to the specific value subclass.
try:
    concrete = BlockArgument(value)
except ValueError:
    # This must not raise another ValueError as values are either block
    # arguments or op results.
    concrete = OpResult(value)
```

#### Interfaces

MLIR interfaces are a mechanism to interact with the IR while knowing only some
aspects of the operations involved rather than their specific types. Operation
interfaces are available as Python classes with the same name as their C++
counterparts. Objects of these classes can be constructed from either:

-   an object of the `Operation` class or of any `OpView` subclass; in this
    case, all interface methods are available;
-   a subclass of `OpView` and a context; in this case, only the *static*
    interface methods are available as there is no associated operation.

In both cases, construction of the interface raises a `ValueError` if the
operation class does not implement the interface in the given context (or, for
operations, in the context that the operation is defined in). Similarly to
attributes and types, the MLIR context may be set up by a surrounding context
manager.

```python
from mlir.ir import Context, InferTypeOpInterface

with Context():
    op = <...>

    # Attempt to cast the operation into an interface.
    try:
        iface = InferTypeOpInterface(op)
    except ValueError:
        print("Operation does not implement InferTypeOpInterface.")
        raise

    # All methods are available on interface objects constructed from an
    # Operation or an OpView.
    iface.someInstanceMethod()

    # An interface object can also be constructed given an OpView subclass. It
    # also needs a context in which the interface will be looked up. The context
    # can be provided explicitly or set up by the surrounding context manager.
    try:
        iface = InferTypeOpInterface(some_dialect.SomeOp)
    except ValueError:
        print("SomeOp does not implement InferTypeOpInterface.")
        raise

    # Calling an instance method on an interface object constructed from a class
    # will raise TypeError.
    try:
        iface.someInstanceMethod()
    except TypeError:
        pass

    # One can still call static interface methods though.
    iface.inferOpReturnTypes(<...>)
```

If an interface object was constructed from an `Operation` or an `OpView`, they
are available as `.operation` and `.opview` properties of the interface object,
respectively.

Only a subset of operation interfaces is currently provided in Python bindings.
Attribute and type interfaces are not yet available in Python bindings.

### Creating IR Objects

Python bindings also support IR creation and manipulation.

#### Operations, Regions and Blocks

Operations can be created given a `Location` and an optional `InsertionPoint`.
It is often easier to use context managers to specify locations and insertion
points for several operations created in a row as described above.

Concrete operations can be created by using constructors of the corresponding
`OpView` subclasses.
The generic, default form of the constructor accepts:

-   an optional sequence of types for operation results (`results`);
-   an optional sequence of values for operation operands, or another operation
    producing those values (`operands`);
-   an optional dictionary of operation attributes (`attributes`);
-   an optional sequence of successor blocks (`successors`);
-   the number of regions to attach to the operation (`regions`, default `0`);
-   the `loc` keyword argument containing the `Location` of this operation; if
    `None`, the location created by the closest context manager is used or an
    exception will be raised if there is no context manager;
-   the `ip` keyword argument indicating where the operation will be inserted in
    the IR; if `None`, the insertion point created by the closest context
    manager is used; if there is no surrounding context manager, the operation
    is created in the detached state.

Most operations will customize the constructor to accept a reduced list of
arguments that are relevant for the operation. For example, zero-result
operations may omit the `results` argument, as may operations whose result
types can be derived unambiguously from the operand types. As a concrete
example, function operations from the `func` dialect can be constructed by
providing a function name as string and its argument and result types as a tuple
of sequences:

```python
from mlir.ir import Context, InsertionPoint, Location, Module
from mlir.dialects import func

# The location context manager also covers the creation of the module.
with Context(), Location.unknown():
    module = Module.create()
    with InsertionPoint(module.body):
        # Use a name other than `func` to avoid shadowing the dialect module.
        main = func.FuncOp("main", ([], []))
```

Also see below for constructors generated from ODS.

Operations can also be constructed using the generic class and based on the
canonical string name of the operation using `Operation.create`.
It accepts the
operation name as string, which must exactly match the canonical name of the
operation in C++ or ODS, followed by the same argument list as the default
constructor for `OpView`. *This form is discouraged* and is intended for generic
operation processing.

```python
from mlir.ir import (Context, FunctionType, InsertionPoint, Location, Module,
                     Operation, TypeAttr)
from mlir.dialects import func

with Context(), Location.unknown():
    module = Module.create()
    with InsertionPoint(module.body):
        # Operations can be created in a generic way.
        op = Operation.create(
            "func.func", results=[], operands=[],
            attributes={"function_type": TypeAttr.get(FunctionType.get([], []))},
            successors=None, regions=1)
        # The result will be downcast to the concrete `OpView` subclass if
        # available.
        assert isinstance(op, func.FuncOp)
```

Regions are created for an operation when constructing it on the C++ side. They
are not constructible in Python and are not expected to exist outside of
operations (unlike in C++, which supports detached regions).

Blocks can be created within a given region and inserted before or after another
block of the same region using the `create_before()` and `create_after()`
methods of the `Block` class, or the `create_at_start()` static method of the
same class. They are not expected to exist outside of regions (unlike in C++,
which supports detached blocks).

```python
from mlir.ir import Block, Context, Location, Operation

# Creating an operation requires a location, here set by a context manager.
with Context(), Location.unknown():
    op = Operation.create("generic.op", regions=1)

    # Create the first block in the region.
    entry_block = Block.create_at_start(op.regions[0])

    # Create further blocks.
    other_block = entry_block.create_after()
```

Blocks can be used to create `InsertionPoint`s, which can point to the beginning
or the end of the block, or just before its terminator.
It is common for
`OpView` subclasses to provide a `.body` property that can be used to construct
an `InsertionPoint`. For example, builtin `Module` and `FuncOp` provide a
`.body` and `.add_entry_block()`, respectively.

#### Attributes and Types

Attributes and types can be created given a `Context` or another attribute or
type object that already references the context. To indicate that they are owned
by the context, they are obtained by calling the static `get` method on the
concrete attribute or type class. These methods take as arguments the data
necessary to construct the attribute or type and the keyword `context` argument
when the context cannot be derived from other arguments.

```python
from mlir.ir import Context, F32Type, FloatAttr

# Attributes and types require access to an MLIR context, either directly or
# through another context-owned object.
ctx = Context()
f32 = F32Type.get(context=ctx)
pi = FloatAttr.get(f32, 3.14)

# They may use the context defined by the surrounding context manager.
with Context():
    f32 = F32Type.get()
    pi = FloatAttr.get(f32, 3.14)
```

Some attributes provide additional construction methods for clarity.

```python
from mlir.ir import Context, IntegerAttr, IntegerType

with Context():
    i8 = IntegerType.get_signless(8)
    IntegerAttr.get(i8, 42)
```

Builtin attributes can often be constructed from Python types with similar
structure.
For example, `ArrayAttr` can be constructed from a sequence
collection of attributes, and a `DictAttr` can be constructed from a dictionary:

```python
from mlir.ir import ArrayAttr, Context, DictAttr, UnitAttr

with Context():
    array = ArrayAttr.get([UnitAttr.get(), UnitAttr.get()])
    dictionary = DictAttr.get({"array": array, "unit": UnitAttr.get()})
```

Custom builders for Attributes to be used during Operation creation can be
registered by way of the `register_attribute_builder`. In particular the
following is how a custom builder is registered for `I32Attr`:

```python
@register_attribute_builder("I32Attr")
def _i32Attr(x: int, context: Context):
    return IntegerAttr.get(
        IntegerType.get_signless(32, context=context), x)
```

This makes it possible to create an op that takes an `I32Attr` with

```python
foo.Op(30)
```

instead of

```python
foo.Op(IntegerAttr.get(IntegerType.get_signless(32, context=context), 30))
```

The registration is keyed on the ODS name, but the registry itself is
implemented in pure Python. Only a single custom builder may be registered per
ODS attribute type (e.g., `I32Attr` can have only one, although the underlying
`IntegerAttr` type may back several ODS attribute types).

## Style

In general, for the core parts of MLIR, the Python bindings should be largely
isomorphic with the underlying C++ structures. However, concessions are made
either for practicality or to give the resulting library an appropriately
"Pythonic" flavor.

### Properties vs get\*() methods

Generally favor converting trivial methods like `getContext()`, `getName()`,
`isEntryBlock()`, etc. to read-only Python properties (e.g. `context`). It is
primarily a matter of calling `def_property_readonly` vs `def` in binding code,
and makes things feel much nicer to the Python side.

For example, prefer:

```c++
m.def_property_readonly("context", ...)
```

Over:

```c++
m.def("getContext", ...)
```

### \_\_repr\_\_ methods

Things that have nice printed representations are really great :) If there is a
reasonable printed form, it can be a significant productivity boost to wire that
to the `__repr__` method (and verify it with a [doctest](#sample-doctest)).

### CamelCase vs snake\_case

Name functions/methods/properties in `snake_case` and classes in `CamelCase`. As
a mechanical concession to Python style, this can go a long way to making the
API feel like it fits in with its peers in the Python landscape.

If in doubt, choose names that will flow properly with other
[PEP 8 style names](https://pep8.org/#descriptive-naming-styles).

### Prefer pseudo-containers

Many core IR constructs provide methods directly on the instance to query count
and begin/end iterators. Prefer hoisting these to dedicated pseudo containers.

For example, a direct mapping of blocks within regions could be done this way:

```python
region = ...

for block in region:
    pass
```

However, this way is preferred:

```python
region = ...

for block in region.blocks:
    pass

print(len(region.blocks))
print(region.blocks[0])
print(region.blocks[-1])
```

Instead of leaking STL-derived identifiers (`front`, `back`, etc.), translate
them to appropriate `__dunder__` methods and iterator wrappers in the bindings.

Note that this can be taken too far, so use good judgment. For example, block
arguments may appear container-like but have defined methods for lookup and
mutation that would be hard to model properly without making semantics
complicated. If running into these, just mirror the C/C++ API.
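
As a rough illustration of the pseudo-container idea, the following pure-Python sketch wraps a low-level handle in a small container class. Everything here is hypothetical: `_RawRegion`, `block_count()`, `block_at()`, and this toy `Region` are stand-ins invented for the example, and the real bindings implement the equivalent in C++ rather than in Python.

```python
from typing import Iterator, List


class _RawRegion:
    """Hypothetical low-level handle exposing count/index accessors only."""

    def __init__(self, blocks: List[str]):
        self._blocks = blocks

    def block_count(self) -> int:
        return len(self._blocks)

    def block_at(self, index: int) -> str:
        return self._blocks[index]


class BlockList:
    """Pseudo-container that maps the raw accessors onto Python protocols."""

    def __init__(self, raw: _RawRegion):
        self._raw = raw

    def __len__(self) -> int:
        return self._raw.block_count()

    def __getitem__(self, index: int) -> str:
        size = len(self)
        # Support negative indices the way Python sequences do.
        if index < 0:
            index += size
        if not 0 <= index < size:
            raise IndexError("block index out of range")
        return self._raw.block_at(index)

    def __iter__(self) -> Iterator[str]:
        for i in range(len(self)):
            yield self._raw.block_at(i)


class Region:
    """Toy region (not mlir.ir.Region) that hoists blocks into `.blocks`."""

    def __init__(self, raw: _RawRegion):
        self._raw = raw

    @property
    def blocks(self) -> BlockList:
        return BlockList(self._raw)


# Strings stand in for block objects in this sketch.
region = Region(_RawRegion(["entry", "loop", "exit"]))
```

With this shape, `len(region.blocks)`, `region.blocks[-1]`, and `for block in region.blocks` all work without putting count or begin/end style methods on the region itself.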

### Provide one stop helpers for common things

One stop helpers that aggregate over multiple low level entities can be
incredibly helpful and are encouraged within reason. For example, making
`Context` have a `parse_asm` or equivalent that avoids needing to explicitly
construct a SourceMgr can be quite nice. One stop helpers do not have to be
mutually exclusive with a more complete mapping of the backing constructs.

## Testing

Tests should be added in the `test/Bindings/Python` directory and should
typically be `.py` files that have a lit run line.

We use `lit` and `FileCheck` based tests:

* For generative tests (those that produce IR), define a Python module that
  constructs/prints the IR and pipe it through `FileCheck`.
* Parsing should be kept self-contained within the module under test by use of
  raw constants and an appropriate `parse_asm` call.
* Any file I/O code should be staged through a tempfile vs relying on file
  artifacts/paths outside of the test module.
* For convenience, we also test non-generative API interactions with the same
  mechanisms, printing and `CHECK`ing as needed.

### Sample FileCheck test

```python
# RUN: %PYTHON %s | mlir-opt -split-input-file | FileCheck %s

import mlir.ir

# TODO: Move to a test utility class once any of this actually exists.
def print_module(f):
    m = f()
    print("// -----")
    print("// TEST_FUNCTION:", f.__name__)
    print(m.to_asm())
    return f

# CHECK-LABEL: TEST_FUNCTION: create_my_op
@print_module
def create_my_op():
    m = mlir.ir.Module()
    builder = m.new_op_builder()
    # CHECK: mydialect.my_operation ...
    builder.my_op()
    return m
```

## Integration with ODS

The MLIR Python bindings integrate with the tablegen-based ODS system for
providing user-friendly wrappers around MLIR dialects and operations. There are
multiple parts to this integration, outlined below.
Most details have been
elided: refer to the build rules and python sources under `mlir.dialects` for
the canonical way to use this facility.

Users are responsible for providing a `{DIALECT_NAMESPACE}.py` (or an equivalent
directory with `__init__.py` file) as the entrypoint.

### Generating `_{DIALECT_NAMESPACE}_ops_gen.py` wrapper modules

Each dialect with a mapping to python requires that an appropriate
`_{DIALECT_NAMESPACE}_ops_gen.py` wrapper module is created. This is done by
invoking `mlir-tblgen` on a python-bindings specific tablegen wrapper that
includes the boilerplate and actual dialect specific `td` file. An example for
the `Func` dialect (which is assigned the namespace `func` as a special case):

```tablegen
#ifndef PYTHON_BINDINGS_FUNC_OPS
#define PYTHON_BINDINGS_FUNC_OPS

include "mlir/Dialect/Func/IR/FuncOps.td"

#endif // PYTHON_BINDINGS_FUNC_OPS
```

In the main repository, building the wrapper is done via the CMake function
`declare_mlir_dialect_python_bindings`, which invokes:

```
mlir-tblgen -gen-python-op-bindings -bind-dialect={DIALECT_NAMESPACE} \
    {PYTHON_BINDING_TD_FILE}
```

The generated op classes must be included in the `{DIALECT_NAMESPACE}.py` file,
much as generated headers are included for C++ generated code:

```python
from ._my_dialect_ops_gen import *
```

### Extending the search path for wrapper modules

When the python bindings need to locate a wrapper module, they consult the
`dialect_search_path` and use it to find an appropriately named module. For the
main repository, this search path is hard-coded to include the `mlir.dialects`
module, which is where wrappers are emitted by the above build rule.
Out of tree
dialects can add their modules to the search path by calling:

```python
from mlir.dialects._ods_common import _cext
_cext.globals.append_dialect_search_prefix("myproject.mlir.dialects")
```

### Wrapper module code organization

The wrapper module tablegen emitter outputs:

* A `_Dialect` class (extending `mlir.ir.Dialect`) with a `DIALECT_NAMESPACE`
  attribute.
* An `{OpName}` class for each operation (extending `mlir.ir.OpView`).
* Decorators for each of the above to register with the system.

Note: In order to avoid naming conflicts, all internal names used by the wrapper
module are prefixed by `_ods_`.

Each concrete `OpView` subclass further defines several public-intended
attributes:

* `OPERATION_NAME` attribute with the `str` fully qualified operation name
  (e.g. `math.absf`).
* An `__init__` method for the *default builder* if one is defined or inferred
  for the operation.
* `@property` getter for each operand or result (using an auto-generated name
  for each unnamed one).
* `@property` getter, setter and deleter for each declared attribute.

It further emits additional private-intended attributes meant for subclassing
and customization (default cases omit these attributes in favor of the defaults
on `OpView`):

* `_ODS_REGIONS`: A specification on the number and types of regions.
  Currently a tuple of (min_region_count, has_no_variadic_regions). Note that
  the API does some light validation on this but the primary purpose is to
  capture sufficient information to perform other default building and region
  accessor generation.
* `_ODS_OPERAND_SEGMENTS` and `_ODS_RESULT_SEGMENTS`: Black-box value which
  indicates the structure of either the operands or results with respect to
  variadics. Used by `OpView._ods_build_default` to decode operand and result
  lists that contain lists.
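
To make "operand and result lists that contain lists" more concrete, the sketch below flattens a nested operand list and records the size of each group. It is an illustration only: `flatten_segments` is a hypothetical helper, strings stand in for `mlir.ir.Value` objects, and the actual encoding of `_ODS_OPERAND_SEGMENTS` and the logic inside `OpView._ods_build_default` are treated as a black box here and may well differ.

```python
from typing import List, Sequence, Tuple, Union

# Strings stand in for mlir.ir.Value objects in this sketch.
Operand = Union[str, Sequence[str]]


def flatten_segments(operands: Sequence[Operand]) -> Tuple[List[str], List[int]]:
    """Flatten single and variadic operand groups into one list.

    Returns the flat operand list plus the number of values contributed by
    each declared operand group. Hypothetical helper, not an MLIR API.
    """
    flat: List[str] = []
    sizes: List[int] = []
    for group in operands:
        if isinstance(group, str):
            # A single-valued operand contributes exactly one value.
            flat.append(group)
            sizes.append(1)
        else:
            # A variadic operand contributes zero or more values.
            flat.extend(group)
            sizes.append(len(group))
    return flat, sizes


# One single operand, one variadic group of two, one empty variadic group.
flat, sizes = flatten_segments(["%lb", ["%a", "%b"], []])
```

The per-group sizes are what allow a flat operand list to be mapped back onto the declared operands, which is why some description of the variadic structure has to travel with the generated class.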

#### Default Builder

Presently, only a single, default builder is mapped to the `__init__` method.
The intent is that this `__init__` method represents the *most specific* of the
builders typically generated for C++; however currently it is just the generic
form below.

* One argument for each declared result:
    * For single-valued results: Each will accept an `mlir.ir.Type`.
    * For variadic results: Each will accept a `List[mlir.ir.Type]`.
* One argument for each declared operand or attribute:
    * For single-valued operands: Each will accept an `mlir.ir.Value`.
    * For variadic operands: Each will accept a `List[mlir.ir.Value]`.
    * For attributes, it will accept an `mlir.ir.Attribute`.
* Trailing usage-specific, optional keyword arguments:
    * `loc`: An explicit `mlir.ir.Location` to use. Defaults to the location
      bound to the thread (e.g. `with Location.unknown():`), or raises an
      error if none is bound or specified.
    * `ip`: An explicit `mlir.ir.InsertionPoint` to use. Defaults to the
      insertion point bound to the thread (e.g. `with InsertionPoint(...):`).

In addition, each `OpView` inherits a `build_generic` method which allows
construction via a (nested in the case of variadic) sequence of `results` and
`operands`. This can be used to get some default construction semantics for
operations that are otherwise unsupported in Python, at the expense of having a
very generic signature.

#### Extending Generated Op Classes

As mentioned above, the build system generates Python sources like
`_{DIALECT_NAMESPACE}_ops_gen.py` for each dialect with Python bindings. It is
often desirable to use these generated classes as a starting point for
further customization, so an extension mechanism is provided to make this easy.
This mechanism uses conventional inheritance combined with `OpView` registration.
For example, the default builder for `arith.constant`

```python
class ConstantOp(_ods_ir.OpView):
    OPERATION_NAME = "arith.constant"

    _ODS_REGIONS = (0, True)

    def __init__(self, value, *, loc=None, ip=None):
        ...
```

expects `value` to be a `TypedAttr` (e.g., `IntegerAttr` or `FloatAttr`).
Thus, a natural extension is a builder that accepts an MLIR type and a Python value and instantiates the appropriate `TypedAttr`:

```python
from typing import Union

from mlir.ir import Type, IntegerAttr, FloatAttr
from mlir.dialects._arith_ops_gen import _Dialect, ConstantOp
from mlir.dialects._ods_common import _cext

@_cext.register_operation(_Dialect, replace=True)
class ConstantOpExt(ConstantOp):
    def __init__(
        self, result: Type, value: Union[int, float], *, loc=None, ip=None
    ):
        if isinstance(value, int):
            super().__init__(IntegerAttr.get(result, value), loc=loc, ip=ip)
        elif isinstance(value, float):
            super().__init__(FloatAttr.get(result, value), loc=loc, ip=ip)
        else:
            raise NotImplementedError(f"Building `arith.constant` not supported for {result=} {value=}")
```

which enables building an instance of `arith.constant` like so:

```python
from mlir.ir import F32Type, IntegerType

a = ConstantOpExt(F32Type.get(), 42.42)
b = ConstantOpExt(IntegerType.get_signless(32), 42)
```

Note three key aspects of the extension mechanism in this example:

1. `ConstantOpExt` directly inherits from the generated `ConstantOp`;
2. in this simplest case, all that's required is a call to the super class' initializer, i.e., `super().__init__(...)`;
3.
   in order to register `ConstantOpExt` as the preferred `OpView` that is returned by `mlir.ir.Operation.opview` (see [Operations, Regions and Blocks](#operations-regions-and-blocks))
   we decorate the class with `@_cext.register_operation(_Dialect, replace=True)`, **where the `replace=True` must be used**.

In some more complex cases it might be necessary to explicitly build the `OpView` through `OpView.build_generic` (see [Default Builder](#default-builder)), just as is performed by the generated builders.
That is, we must call `OpView.build_generic` **and pass the result to `OpView.__init__`**; the small complication is that the latter is already overridden by the generated builder.
Thus, we must call a method of a super class' super class (the "grandparent"); for example:

```python
from mlir.dialects._scf_ops_gen import _Dialect, ForOp
from mlir.dialects._ods_common import _cext

@_cext.register_operation(_Dialect, replace=True)
class ForOpExt(ForOp):
    def __init__(self, lower_bound, upper_bound, step, iter_args, *, loc=None, ip=None):
        ...
        super(ForOp, self).__init__(self.build_generic(...))
```

where `OpView.__init__` is called via `super(ForOp, self).__init__`.
Note that there are alternative ways to implement this (e.g., explicitly writing `OpView.__init__`); see any discussion on Python inheritance.

## Providing Python bindings for a dialect

Python bindings are designed to support MLIR’s open dialect ecosystem. A dialect
can be exposed to Python as a submodule of `mlir.dialects` and interoperate with
the rest of the bindings. For dialects containing only operations, it is
sufficient to provide Python APIs for those operations. Note that the majority
of boilerplate APIs can be generated from ODS.
For dialects containing
attributes and types, it is necessary to thread those through the C API since
there is no generic mechanism to create attributes and types. Passes need to be
registered with the context in order to be usable in a text-specified pass
manager, which may be done at Python module load time. Other functionality can
be provided, similar to attributes and types, by exposing the relevant C API and
building Python API on top.

### Operations

Dialect operations are provided in Python by wrapping the generic
`mlir.ir.Operation` class with operation-specific builder functions and
properties. Therefore, there is no need to implement a separate C API for them.
For operations defined in ODS, `mlir-tblgen -gen-python-op-bindings
-bind-dialect=<dialect-namespace>` generates the Python API from the declarative
description.
It is sufficient to create a new `.td` file that includes the original ODS
definition and use it as source for the `mlir-tblgen` call.
Such `.td` files reside in
[`python/mlir/dialects/`](https://github.com/llvm/llvm-project/tree/main/mlir/python/mlir/dialects).
By convention, the output of `mlir-tblgen` is expected to be a file named
`_<dialect-namespace>_ops_gen.py`. The generated operation classes
can be extended as described above. MLIR provides [CMake
functions](https://github.com/llvm/llvm-project/blob/main/mlir/cmake/modules/AddMLIRPython.cmake)
to automate the production of such files. Finally, a
`python/mlir/dialects/<dialect-namespace>.py` or a
`python/mlir/dialects/<dialect-namespace>/__init__.py` file must be created and
filled with `import`s from the generated files to enable `import
mlir.dialects.<dialect-namespace>` in Python.

### Attributes and Types

Dialect attributes and types are provided in Python as subclasses of the
`mlir.ir.Attribute` and `mlir.ir.Type` classes, respectively. Python APIs for
attributes and types must connect to the relevant C APIs for building and
inspection, which must be provided first. Bindings for `Attribute` and `Type`
subclasses can be defined using
[`include/mlir/Bindings/Python/PybindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/PybindAdaptors.h)
or
[`include/mlir/Bindings/Python/NanobindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h)
utilities that mimic pybind11/nanobind API for defining functions and
properties. These bindings are to be included in a separate module. The
utilities also provide automatic casting between C API handles `MlirAttribute`
and `MlirType` and their Python counterparts so that the C API handles can be
used directly in binding implementations. The methods and properties provided by
the bindings should follow the principles discussed above.

The attribute and type bindings for a dialect can be located in
`lib/Bindings/Python/Dialect<Name>.cpp` and should be compiled into a separate
“Python extension” library placed in `python/mlir/_mlir_libs` that will be
loaded by Python at runtime. MLIR provides [CMake
functions](https://github.com/llvm/llvm-project/blob/main/mlir/cmake/modules/AddMLIRPython.cmake)
to automate the production of such libraries. This library should be `import`ed
from the main dialect file, i.e. `python/mlir/dialects/<dialect-namespace>.py`
or `python/mlir/dialects/<dialect-namespace>/__init__.py`, to ensure the types
are available when the dialect is loaded from Python.

### Passes

Dialect-specific passes can be made available to the pass manager in Python by
registering them with the context and relying on the API for pass pipeline
parsing from string descriptions. This can be achieved by creating a new
pybind11 module, defined in `lib/Bindings/Python/<Dialect>Passes.cpp`, that
calls the registration C API, which must be provided first. For passes defined
declaratively using Tablegen, `mlir-tblgen -gen-pass-capi-header` and
`mlir-tblgen -gen-pass-capi-impl` automate the generation of the C API. The
pybind11 module must be compiled into a separate “Python extension” library,
which can be `import`ed from the main dialect file, i.e.
`python/mlir/dialects/<dialect-namespace>.py` or
`python/mlir/dialects/<dialect-namespace>/__init__.py`, or from a separate
`passes` submodule to be put in
`python/mlir/dialects/<dialect-namespace>/passes.py` if it is undesirable to
make the passes available along with the dialect.

### Other functionality

Dialect functionality other than IR objects or passes, such as helper functions,
can be exposed to Python similarly to attributes and types. A C API is expected to
exist for this functionality, which can then be wrapped using pybind11 and
[`include/mlir/Bindings/Python/PybindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/PybindAdaptors.h),
or nanobind and
[`include/mlir/Bindings/Python/NanobindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h)
utilities to connect to the rest of the Python API. The bindings can be located in a
separate module or in the same module as attributes and types, and
loaded along with the dialect.

## Free-threading (No-GIL) support

Free-threading or no-GIL support refers to the CPython interpreter (>=3.13) with the Global Interpreter Lock made optional. For details on the topic, please check [PEP-703](https://peps.python.org/pep-0703/) and this [Python free-threading guide](https://py-free-threading.github.io/).

MLIR Python bindings are free-threading compatible, with exceptions (discussed below), in the following sense: it is safe to work in multiple threads with **independent** contexts. The example below shows safe usage:

```python
# python3.13t example.py
import concurrent.futures

import mlir.dialects.arith as arith
from mlir.ir import Context, Location, Module, IntegerType, InsertionPoint


def func(py_value):
    with Context() as ctx:
        module = Module.create(loc=Location.file("foo.txt", 0, 0))

        dtype = IntegerType.get_signless(64)
        with InsertionPoint(module.body), Location.name("a"):
            arith.constant(dtype, py_value)

    return module


num_workers = 8
with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
    futures = []
    for i in range(num_workers):
        futures.append(executor.submit(func, i))
    assert len(list(f.result() for f in futures)) == num_workers
```

The exceptions to the free-threading compatibility:
- IR printing is unsafe, e.g. when using `PassManager` with `PassManager.enable_ir_printing()`, which calls the thread-unsafe `llvm::raw_ostream`.
- Usage of `Location.emit_error` is unsafe (due to thread-unsafe `llvm::raw_ostream`).
- Usage of `Module.dump` is unsafe (due to thread-unsafe `llvm::raw_ostream`).
- Usage of `mlir.dialects.transform.interpreter` is unsafe.
- Usage of `mlir.dialects.gpu` and `gpu-module-to-binary` is unsafe.