# 'omp' Dialect

The `omp` dialect is for representing directives, clauses and other definitions
of the [OpenMP programming model](https://www.openmp.org). This directive-based
programming model, defined for the C, C++ and Fortran programming languages,
provides abstractions to simplify the development of parallel and accelerated
programs. All versions of the OpenMP specification can be found
[here](https://www.openmp.org/specifications/).

Operations in this MLIR dialect generally correspond to a single OpenMP
directive, taking arguments that represent their supported clauses, though this
is not always the case. For detailed information on operations, types and
other definitions in this dialect, refer to the automatically-generated
[ODS Documentation](ODS.md).

[TOC]

## Operation Naming Conventions

This section aims to standardize how dialect operation names are chosen, to
ensure a level of consistency. There are two categories of names: tablegen names
and assembly names. The former also corresponds to the C++ class that is
generated for the operation, whereas the latter is used to represent it in MLIR
text form.

Tablegen names are CamelCase, with the first letter capitalized and an "Op"
suffix, whereas assembly names are snake_case, with all lowercase letters and
words separated by underscores.

If the operation corresponds to a directive, clause or other kind of definition
in the OpenMP specification, it must use the same name split into words in the
same way. For example, the `target data` directive would become `TargetDataOp` /
`omp.target_data`, whereas `taskloop` would become `TaskloopOp` /
`omp.taskloop`.

Operations intended to carry extra information for another particular operation
or clause must be named after that other operation or clause, followed by the
name of the additional information. The assembly name must use a period to
separate both parts.
For example, the operation used to define some extra
mapping information is named `MapInfoOp` / `omp.map.info`. The same rules are
followed if multiple operations are created for different variants of the same
directive, e.g. `atomic` becomes `Atomic{Read,Write,Update,Capture}Op` /
`omp.atomic.{read,write,update,capture}`.

## Clause-Based Operation Definition

One main feature of the OpenMP specification is that, even though the set of
clauses that could be applied to a given directive is independent from other
directives, these clauses can generally apply to multiple directives. Since
clauses usually define which arguments the corresponding MLIR operation takes,
it is possible (and preferred) to define OpenMP dialect operations based on the
list of clauses taken by the corresponding directive. This makes it simpler to
keep their representation consistent across operations and minimizes redundancy
in the dialect.

To achieve this, the base `OpenMP_Clause` tablegen class has been created. It is
intended to be used to create clause definitions that can then be attached to
multiple `OpenMP_Op` definitions, resulting in the latter inheriting by default
all properties defined by the attached clauses, similarly to the trait mechanism.
This mechanism is implemented in
[OpenMPOpBase.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td).

### Adding a Clause

OpenMP clause definitions are located in
[OpenMPClauses.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td).
For each clause, an `OpenMP_Clause` subclass and a definition based on it must
be created. The subclass must take a `bit` template argument for each of the
properties it can populate on associated `OpenMP_Op`s. These must be forwarded
to the base class.
The definition must be an instantiation of the base class
where all these template arguments are set to `false`. The definition's name
must be `OpenMP_<Name>Clause`, whereas its base class's name must be
`OpenMP_<Name>ClauseSkip`. Following this pattern makes it possible to
optionally skip the inheritance of some properties when defining operations:
[more info](#overriding-clause-inherited-properties).

Clauses can define the following properties:
  - `list<Trait> traits`: To be used when having a certain clause always
implies some op trait, like the `map` clause and the `MapClauseOwningInterface`.
  - `dag(ins) arguments`: Mandatory property holding values and attributes
used to represent the clause. Argument names use snake_case and should contain
the clause name to avoid name clashes between clauses. Variadic arguments
(non-attributes) must contain the "_vars" suffix.
  - `string {req,opt}AssemblyFormat`: Optional formatting strings to produce
custom human-friendly printers and parsers for arguments associated with the
clause. They will be combined with assembly formats for other clauses as
explained [below](#adding-an-operation).
  - `string description`: Optional description text to describe the clause and
its representation.
  - `string extraClassDeclaration`: Optional C++ declarations to be added to
operation classes including the clause.

For example:

```tablegen
class OpenMP_ExampleClauseSkip<
    bit traits = false, bit arguments = false, bit assemblyFormat = false,
    bit description = false, bit extraClassDeclaration = false
  > : OpenMP_Clause<traits, arguments, assemblyFormat, description,
                    extraClassDeclaration> {
  let arguments = (ins
    Optional<AnyType>:$example_var
  );

  let optAssemblyFormat = [{
    `example` `(` $example_var `:` type($example_var) `)`
  }];

  let description = [{
    The `example_var` argument defines the variable to which the EXAMPLE clause
    applies.
  }];
}

def OpenMP_ExampleClause : OpenMP_ExampleClauseSkip<>;
```

### Adding an Operation

Operations in the OpenMP dialect, located in
[OpenMPOps.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td),
can be defined like any other regular operation by just specifying a `mnemonic`
and optional list of `traits` when inheriting from `OpenMP_Op`, and then
defining the expected `description`, `arguments`, etc. properties inside of its
body. However, in most cases, basing the operation definition on its list of
accepted clauses is significantly simpler because some of the properties can
just be inherited from these clauses.

In general, the way to achieve this is to specify, in addition to the `mnemonic`
and optional list of `traits`, a list of `clauses` where all the applicable
`OpenMP_<Name>Clause` definitions are added. Then, the only properties that
would have to be defined in the operation's body are the `summary` and
`description`. For the latter, only the operation itself needs to be
described, and the description for its clause-inherited arguments is appended
through the inherited `clausesDescription` property. By convention, the list of
clauses for an operation must be specified in alphabetical order.

If the operation is intended to have a single region, this is better achieved by
setting the `singleRegion=true` template argument of `OpenMP_Op` rather than
manually populating the `regions` property of the operation, because that way
the default `assemblyFormat` is also updated correspondingly.

For example:

```tablegen
def ExampleOp : OpenMP_Op<"example", traits = [
    AttrSizedOperandSegments, ...
  ], clauses = [
    OpenMP_AlignedClause, OpenMP_IfClause, OpenMP_LinearClause, ...
  ], singleRegion = true> {
  let summary = "example construct";
  let description = [{
    The example construct represents...
  }] # clausesDescription;
}
```

This is possible because the `arguments`, `assemblyFormat` and
`extraClassDeclaration` properties of the operation are by default
populated by concatenating the corresponding properties of the clauses on the
list. In the case of the `assemblyFormat`, this involves combining the
`reqAssemblyFormat` and the `optAssemblyFormat` properties. The
`reqAssemblyFormat` of all clauses is concatenated first and separated using
spaces, whereas the `optAssemblyFormat` is wrapped in an `oilist()` and
interleaved with "|" instead of spaces. The resulting `assemblyFormat` contains
the required assembly format strings, followed by the optional assembly format
strings, optionally the `$region` and the `attr-dict`.

### Overriding Clause-Inherited Properties

Although the clause-based definition of operations can greatly reduce work, it's
also somewhat restrictive, since there may be some situations where only part of
the operation definition can be automated in that manner. For fine-grained
control over the properties inherited from each clause, two features are
available:

  - Inhibition of properties. By using `OpenMP_<Name>ClauseSkip` tablegen
classes, the list of properties copied from the clause to the operation can be
selected. For example, `OpenMP_IfClauseSkip<assemblyFormat = true>` would result
in every property defined for the `OpenMP_IfClause` except for the
`assemblyFormat` being used to initially populate the properties of the
operation.
  - Augmentation of properties. There are times when there is a need to add to
a clause-populated operation property.
Instead of overriding the property in the
definition of the operation and having to manually replicate what would
otherwise be automatically populated before adding to it, some internal
properties are defined to hold this default value: `clausesArgs`,
`clausesAssemblyFormat`, `clauses{Req,Opt}AssemblyFormat` and
`clausesExtraClassDeclaration`.

In the following example, assuming both the `OpenMP_InReductionClause` and the
`OpenMP_ReductionClause` define a `getReductionVars` extra class declaration,
we skip the conflicting `extraClassDeclaration`s inherited from both clauses and
provide another implementation, without having to also re-define other
declarations inherited from the `OpenMP_AllocateClause`:

```tablegen
def ExampleOp : OpenMP_Op<"example", traits = [
    AttrSizedOperandSegments, ...
  ], clauses = [
    OpenMP_AllocateClause,
    OpenMP_InReductionClauseSkip<extraClassDeclaration = true>,
    OpenMP_ReductionClauseSkip<extraClassDeclaration = true>
  ], singleRegion = true> {
  let summary = "example construct";
  let description = [{
    This operation represents...
  }] # clausesDescription;

  // Override the clause-populated extraClassDeclaration and add the default
  // back via appending clausesExtraClassDeclaration to it. This has the effect
  // of adding one declaration. Since this property is skipped for the
  // InReduction and Reduction clauses, clausesExtraClassDeclaration won't
  // incorporate the definition of this property for these clauses.
  let extraClassDeclaration = [{
    SmallVector<Value> getReductionVars() {
      // Concatenate inReductionVars and reductionVars and return the result...
    }
  }] # clausesExtraClassDeclaration;
}
```

These features are intended for complex edge cases, but an effort should be made
to avoid having to use them, since they may introduce inconsistencies and
complexity to the dialect.
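
The body of `getReductionVars` is left elided in the example above. Purely as
an illustration of the concatenation its comment describes, the following
standalone C++ sketch uses `std::vector<int>` as a stand-in for
`SmallVector<Value>`. The `Value` alias and the `concatReductionVars` helper are
hypothetical names chosen for this sketch and are not part of the dialect:

```cpp
#include <vector>

// Hypothetical stand-in for mlir::Value, so the sketch builds without MLIR.
using Value = int;

// Returns the in_reduction values followed by the reduction values, in that
// order, mirroring the comment in the tablegen example. This is an assumption
// about the intended behavior, not the dialect's actual implementation.
std::vector<Value> concatReductionVars(
    const std::vector<Value> &inReductionVars,
    const std::vector<Value> &reductionVars) {
  std::vector<Value> result;
  result.reserve(inReductionVars.size() + reductionVars.size());
  result.insert(result.end(), inReductionVars.begin(), inReductionVars.end());
  result.insert(result.end(), reductionVars.begin(), reductionVars.end());
  return result;
}
```

Keeping a fixed order between the two lists means that callers can still tell
which values came from which clause by looking at the sizes of the inputs.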

### Tablegen Verification Pass

As a result of the implicit way in which fundamental properties of MLIR
operations are populated following this approach, and the ability to override
them, forgetting to append clause-inherited values might result in hard-to-debug
tablegen errors.

For this reason, the `-verify-openmp-ops` tablegen pseudo-backend was created.
It runs before any other tablegen backends are triggered for the
[OpenMPOps.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td)
file and warns any time a property defined for a clause is not found in the
corresponding operation, except if it is explicitly skipped as described
[above](#overriding-clause-inherited-properties). This way, in case of a later
tablegen failure while processing OpenMP dialect operations, earlier messages
triggered by that pass can point to a likely solution.

### Operand Structures

One consequence of basing the representation of operations on the set of values
and attributes defined for each clause applicable to the corresponding OpenMP
directive is that operation argument lists tend to be long. This makes C++
operation builders difficult to work with and makes it easy to mistakenly
pass arguments in the wrong order, which may sometimes introduce hard-to-detect
problems.

Operand structures are the solution provided for this issue. The main idea
behind them is that there is one defined for each clause, holding a set of
fields that contain the data needed to initialize each of the arguments
associated with that clause. Clause operand structures are aggregated into
operation operand structures via class inheritance. Then, a custom builder is
defined for each operation taking the corresponding operand structure as a
parameter.
Since each
argument is a named member of the structure, it becomes much simpler to set up
the desired arguments to create a new operation.

Ad-hoc operand structures available for use within the ODS definition of custom
operation builders can be defined in
[OpenMPClauseOperands.h](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h).
However, this is generally not needed for clause-based operation definitions.
The `-gen-openmp-clause-ops` tablegen backend, triggered when building the 'omp'
dialect, will automatically produce structures in the following way:

- It will create a `<Name>ClauseOps` structure for each `OpenMP_Clause`
definition with one field per argument.
- The name of each field will match the tablegen name of the corresponding
argument, except for replacing snake case with camel case.
- The type of the field will be obtained from the corresponding tablegen
argument's type:
  - Values are represented with `mlir::Value`, except for `Variadic`, which
  makes it an `llvm::SmallVector<mlir::Value>`.
  - `OptionalAttr` is represented by the translation of its `baseAttr`.
  - `TypedArrayAttrBase`-based attribute types are represented by wrapping the
  translation of their `elementAttr` in an `llvm::SmallVector`. The only
  exception for this case is if the `elementAttr` is a "scalar" (i.e. non
  array-like) attribute type, in which case the more generic `mlir::Attribute`
  will be used in place of its `storageType`.
  - For `ElementsAttrBase`-based attribute types a best effort is attempted to
  obtain an element type (`llvm::APInt`, `llvm::APFloat` or
  `DenseArrayAttrBase`'s `returnType`) to be wrapped in an `llvm::SmallVector`.
  If it cannot be obtained, which will happen with non-builtin direct subclasses
  of `ElementsAttrBase`, a warning will be emitted and the `storageType` (i.e.
  specific `mlir::Attribute` subclass) will be used instead.
  - Other attribute types will be represented with their `storageType`.
- It will create a `<Name>Operands` structure for each operation, which is an
empty structure subclassing all operand structures defined for the corresponding
`OpenMP_Op`'s clauses.

### Entry Block Argument-Defining Clauses

In their MLIR representation, certain OpenMP clauses introduce a mapping between
values defined outside the operation they are applied to and entry block
arguments for the region of that MLIR operation. This enables, for example, the
introduction of private copies of the same underlying variable defined outside
the MLIR operation the clause is attached to. Currently, clauses with this
property can be classified into three main categories:
  - Map-like clauses: `host_eval` (compiler internal, not defined by the OpenMP
  specification: [see more](#host-evaluated-clauses-in-target-regions)), `map`,
  `use_device_addr` and `use_device_ptr`.
  - Reduction-like clauses: `in_reduction`, `reduction` and `task_reduction`.
  - Privatization clauses: `private`.

All three kinds of entry block argument-defining clauses use a similar custom
assembly format representation, only differing based on the different pieces of
information attached to each kind. Below, one example of each is shown:

```mlir
omp.target map_entries(%x -> %x.m, %y -> %y.m : !llvm.ptr, !llvm.ptr) {
  // Use %x.m, %y.m in place of %x and %y...
}

omp.wsloop reduction(@add.i32 %x -> %x.r, byref @add.f32 %y -> %y.r : !llvm.ptr, !llvm.ptr) {
  // Use %x.r, %y.r in place of %x and %y...
}

omp.parallel private(@x.privatizer %x -> %x.p, @y.privatizer %y -> %y.p : !llvm.ptr, !llvm.ptr) {
  // Use %x.p, %y.p in place of %x and %y...
}
```

As a consequence of parsing and printing the operation's first region entry
block argument names together with the custom assembly format of these clauses,
entry block arguments (i.e. the `^bb0(...):` line) must not be explicitly
defined for these operations. Additionally, it is not possible to implement this
feature while allowing each clause to be independently parsed and printed,
because they need to be printed/parsed together with the corresponding
operation's first region. There must also be a well-defined ordering in which
multiple of these clauses are specified for a given operation.

The parsing/printing of these clauses together with the region provides the
ability to define entry block arguments directly after the `->`. Forcing a
specific ordering between these clauses makes the block argument ordering
well-defined, which is the property used to easily match each clause with the
entry block arguments defined by it.

Custom printers and parsers for operation regions based on the entry block
argument-defining clauses they take are implemented based on the
`{parse,print}BlockArgRegion` functions, which take care of the sorting and
formatting of each kind of clause, minimizing code duplication resulting from
this approach. One example of the custom assembly format of an operation taking
the `private` and `reduction` clauses is the following:

```tablegen
let assemblyFormat = clausesAssemblyFormat # [{
  custom<PrivateReductionRegion>($region, $private_vars, type($private_vars),
      $private_syms, $reduction_vars, type($reduction_vars), $reduction_byref,
      $reduction_syms) attr-dict
}];
```

The `BlockArgOpenMPOpInterface` has been introduced to simplify the addition and
handling of these kinds of clauses.
It holds `num<ClauseName>BlockArgs()`
functions that by default return 0, to be overridden by each clause through the
`extraClassDeclaration` property. Based on these functions and the expected
alphabetical sorting between entry block argument-defining clauses, it
implements `get<ClauseName>BlockArgs()` functions that are the intended method
of accessing clause-defined block arguments.

## Loop-Associated Directives

Loop-associated OpenMP constructs are represented in the dialect as loop wrapper
operations. These implement the `LoopWrapperInterface`, which enforces a series
of restrictions upon the operation:
  - It has the `NoTerminator` and `SingleBlock` traits;
  - It contains a single region; and
  - Its only block contains exactly one operation, which must be another loop
wrapper or `omp.loop_nest` operation.

This approach splits the representation for a loop nest and the loop-associated
constructs that specify how its iterations are executed, possibly across various
SIMD lanes (`omp.simd`), threads (`omp.wsloop`), teams of threads
(`omp.distribute`) or tasks (`omp.taskloop`). The ability to directly nest
multiple loop wrappers to impact the execution of a single loop nest is used to
represent composite constructs in a modular way.

The `omp.loop_nest` operation represents a collapsed rectangular loop nest that
must always be wrapped by at least one loop wrapper, which defines how it is
intended to be executed. It serves as a simpler and more restrictive
representation of OpenMP loops while a more general approach to support
non-rectangular loop nests, loop transformations and non-perfectly nested loops
based on a new `omp.canonical_loop` definition is developed.

The following example shows how a `parallel {do,for}` construct would be
represented:
```mlir
omp.parallel ... {
  ...
  omp.wsloop ... {
    omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
      // %a, %b and %c are memrefs defined outside the loop.
      %a_val = memref.load %a[%i] : memref<?xf32>
      %b_val = memref.load %b[%i] : memref<?xf32>
      %sum = arith.addf %a_val, %b_val : f32
      memref.store %sum, %c[%i] : memref<?xf32>
      omp.yield
    }
  }
  ...
  omp.terminator
}
```

### Loop Transformations

In addition to the worksharing loop-associated constructs described above, the
OpenMP specification also defines a set of loop transformation constructs. They
replace the associated loop(s) before worksharing constructs are executed on the
generated loop(s). Some examples of such constructs are `tile` and `unroll`.

A general approach for representing these types of OpenMP constructs has not yet
been implemented, but it is closely linked to the `omp.canonical_loop` work.
Nevertheless, the loop transformation that the `collapse` clause for
loop-associated worksharing constructs defines can be represented by introducing
multiple bounds, step and induction variables to the `omp.loop_nest` operation.

## Compound Construct Representation

The OpenMP specification defines certain shortcuts that allow specifying
multiple constructs in a single directive, which are referred to as compound
constructs (e.g. `parallel do` contains the `parallel` and `do` constructs).
These can be further classified into [combined](#combined-constructs) and
[composite](#composite-constructs) constructs. This section describes how they
are represented in the dialect.

When clauses are specified for compound constructs, the OpenMP specification
defines a set of rules to decide to which leaf constructs they apply, as well as
potentially introducing some other implicit clauses. These rules must be taken
into account by those creating the MLIR representation, since it is a per-leaf
representation that expects these rules to have already been followed.

### Combined Constructs

Combined constructs are semantically equivalent to specifying one construct
immediately nested inside another. This property is used to simplify the dialect
by representing them through the operations associated with each leaf construct.
For example, `target teams` would be represented as follows:

```mlir
omp.target ... {
  ...
  omp.teams ... {
    ...
    omp.terminator
  }
  ...
  omp.terminator
}
```

### Composite Constructs

Composite constructs are similar to combined constructs in that they specify the
effect of one construct being applied immediately after another. However, they
group together constructs that cannot be directly nested into each other.
Specifically, they group together multiple loop-associated constructs that apply
to the same collapsed loop nest.

As of version 5.2 of the OpenMP specification, the list of composite constructs
is the following:
  - `{do,for} simd`;
  - `distribute simd`;
  - `distribute parallel {do,for}`;
  - `distribute parallel {do,for} simd`; and
  - `taskloop simd`.

Even though the list of composite constructs is relatively short and it would
also be possible to create dialect operations for each, it was decided to
allow attaching multiple loop wrappers to a single loop instead. This minimizes
redundancy in the dialect and maximizes its modularity, since there is a single
operation for each leaf construct regardless of whether it can be part of a
composite construct. On the other hand, this means the `omp.loop_nest` operation
will have to be interpreted differently depending on how many and which loop
wrappers are attached to it.

To simplify the detection of operations taking part in the representation of a
composite construct, the `ComposableOpInterface` was introduced.
Its purpose is
to handle the `omp.composite` discardable dialect attribute that can optionally
be attached to these operations. Operation verifiers ensure that its presence is
consistent with the context the operation appears in: the attribute must be
present if and only if the operation represents a leaf of a composite
construct.

For example, the `distribute simd` composite construct is represented as
follows:

```mlir
omp.distribute ... {
  omp.simd ... {
    omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
      ...
      omp.yield
    }
  } {omp.composite}
} {omp.composite}
```

One exception to this is the representation of the
`distribute parallel {do,for}` composite construct. The presence of a
block-associated `parallel` leaf construct would introduce many problems if it
were allowed to work as a loop wrapper. In this case, the "hoisted
`omp.parallel` representation" is used instead. This consists of making
`omp.parallel` the parent operation, with a nested `omp.loop_nest` wrapped by
`omp.distribute` and `omp.wsloop` (and `omp.simd`, in the
`distribute parallel {do,for} simd` case).

This approach works because `parallel` is a parallelism-generating construct,
whereas `distribute` is a worksharing construct impacting the higher level
`teams` construct, so the ordering between these constructs does not cause
semantic mismatches. This property is also exploited by LLVM's SPMD-mode.

```mlir
omp.parallel ... {
  ...
  omp.distribute ... {
    omp.wsloop ... {
      omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
        ...
        omp.yield
      }
    } {omp.composite}
  } {omp.composite}
  ...
  omp.terminator
} {omp.composite}
```

## Host-Evaluated Clauses in Target Regions

The `omp.target` operation, which represents the OpenMP `target` construct, is
marked with the `IsolatedFromAbove` trait.
This means that, inside of its
region, no MLIR values defined outside of the op itself can be used. This is
consistent with the OpenMP specification of the `target` construct, which
mandates that all host device values used inside of the `target` region must
either be privatized (data-sharing) or mapped (data-mapping).

Normally, clauses applied to a construct are evaluated before entering that
construct. Further, in some cases, the OpenMP specification stipulates that
clauses be evaluated _on the host device_ on entry to a parent `target`
construct. In particular, the `num_teams` and `thread_limit` clauses of the
`teams` construct must be evaluated on the host device if it's nested inside or
combined with a `target` construct.

Additionally, the runtime library targeted by the MLIR to LLVM IR translation of
the OpenMP dialect supports the optimized launch of SPMD kernels (i.e.
`target teams distribute parallel {do,for}` in OpenMP), which requires
specifying in advance what the total trip count of the loop is. Consequently, it
is also beneficial to evaluate the trip count on the host device prior to the
kernel launch.

These host-evaluated values in MLIR would need to be placed outside of the
`omp.target` region and also attached to the corresponding nested operations,
which is not possible because of the `IsolatedFromAbove` trait. The solution
implemented to address this problem has been to introduce the `host_eval`
argument to the `omp.target` operation. It works similarly to a `map` clause,
but its only intended use is to forward host-evaluated values to their
corresponding operation inside of the region. Any use other than those
described above results in a verifier error.

```mlir
// Initialize %0, %1, %2, %3...
omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) {
  omp.teams num_teams(to %nt : i32) {
    omp.parallel {
      omp.distribute {
        omp.wsloop {
          omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
            // ...
            omp.yield
          }
        } {omp.composite}
      } {omp.composite}
      omp.terminator
    } {omp.composite}
    omp.terminator
  }
  omp.terminator
}
```