# 'omp' Dialect

The `omp` dialect is for representing directives, clauses and other definitions
of the [OpenMP programming model](https://www.openmp.org). This directive-based
programming model, defined for the C, C++ and Fortran programming languages,
provides abstractions to simplify the development of parallel and accelerated
programs. All versions of the OpenMP specification can be found
[here](https://www.openmp.org/specifications/).

Operations in this MLIR dialect generally correspond to a single OpenMP
directive, taking arguments that represent their supported clauses, though this
is not always the case. For detailed information on operations, types and
other definitions in this dialect, refer to the automatically-generated
[ODS Documentation](ODS.md).

[TOC]

## Operation Naming Conventions

This section aims to standardize how dialect operation names are chosen, to
ensure a level of consistency. There are two categories of names: tablegen names
and assembly names. The former also corresponds to the C++ class that is
generated for the operation, whereas the latter is used to represent it in MLIR
text form.

Tablegen names are CamelCase, with the first letter capitalized and an "Op"
suffix, whereas assembly names are snake_case, with all lowercase letters and
words separated by underscores.

If the operation corresponds to a directive, clause or other kind of definition
in the OpenMP specification, it must use the same name split into words in the
same way. For example, the `target data` directive would become `TargetDataOp` /
`omp.target_data`, whereas `taskloop` would become `TaskloopOp` /
`omp.taskloop`.

Operations intended to carry extra information for another particular operation
or clause must be named after that other operation or clause, followed by the
name of the additional information. The assembly name must use a period to
separate both parts. For example, the operation used to define some extra
mapping information is named `MapInfoOp` / `omp.map.info`. The same rules are
followed if multiple operations are created for different variants of the same
directive, e.g. `atomic` becomes `Atomic{Read,Write,Update,Capture}Op` /
`omp.atomic.{read,write,update,capture}`.

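As a rough illustration of the rules for plain directive names, the following
self-contained C++ sketch derives both names from a lowercase directive name.
The helpers `splitWords`, `toTablegenName` and `toAssemblyName` are hypothetical
and not part of MLIR, and the sketch does not cover the period-separated names
used for extra-information or variant operations.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a directive name such as "target data" into its
// space-separated words.
static std::vector<std::string> splitWords(const std::string &directive) {
  std::istringstream stream(directive);
  std::vector<std::string> words;
  for (std::string word; stream >> word;)
    words.push_back(word);
  return words;
}

// Hypothetical helper: "target data" -> "TargetDataOp".
std::string toTablegenName(const std::string &directive) {
  std::string name;
  for (std::string word : splitWords(directive)) {
    // Capitalize the first letter of each word and keep the rest unchanged.
    word[0] = static_cast<char>(
        std::toupper(static_cast<unsigned char>(word[0])));
    name += word;
  }
  return name + "Op";
}

// Hypothetical helper: "target data" -> "omp.target_data".
std::string toAssemblyName(const std::string &directive) {
  std::string name = "omp.";
  const std::vector<std::string> words = splitWords(directive);
  for (std::size_t i = 0; i < words.size(); ++i) {
    if (i != 0)
      name += '_';
    name += words[i];
  }
  return name;
}
```

Note that `taskloop` is a single word in the OpenMP specification, so it maps
to `TaskloopOp` rather than `TaskLoopOp`.
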
## Clause-Based Operation Definition

One main feature of the OpenMP specification is that, even though the set of
clauses that could be applied to a given directive is independent from other
directives, these clauses can generally apply to multiple directives. Since
clauses usually define which arguments the corresponding MLIR operation takes,
it is possible (and preferred) to define OpenMP dialect operations based on the
list of clauses taken by the corresponding directive. This makes it simpler to
keep their representation consistent across operations and minimizes redundancy
in the dialect.

To achieve this, the base `OpenMP_Clause` tablegen class has been created. It is
intended to be used to create clause definitions that can be then attached to
multiple `OpenMP_Op` definitions, resulting in the latter inheriting by default
all properties defined by clauses attached, similarly to the trait mechanism.
This mechanism is implemented in
[OpenMPOpBase.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td).

### Adding a Clause

OpenMP clause definitions are located in
[OpenMPClauses.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td).
For each clause, an `OpenMP_Clause` subclass and a definition based on it must
be created. The subclass must take a `bit` template argument for each of the
properties it can populate on associated `OpenMP_Op`s. These must be forwarded
to the base class. The definition must be an instantiation of the base class
where all these template arguments are set to `false`. The definition's name
must be `OpenMP_<Name>Clause`, whereas its base class' must be
`OpenMP_<Name>ClauseSkip`. Following this pattern makes it possible to
optionally skip the inheritance of some properties when defining operations:
[more info](#overriding-clause-inherited-properties).

Clauses can define the following properties:
  - `list<Traits> traits`: To be used when having a certain clause always
implies some op trait, like the `map` clause and the `MapClauseOwningInterface`.
  - `dag(ins) arguments`: Mandatory property holding values and attributes
used to represent the clause. Argument names use snake_case and should contain
the clause name to avoid name clashes between clauses. Variadic arguments
(non-attributes) must contain the "_vars" suffix.
  - `string {req,opt}AssemblyFormat`: Optional formatting strings to produce
custom human-friendly printers and parsers for arguments associated with the
clause. It will be combined with assembly formats for other clauses as explained
[below](#adding-an-operation).
  - `string description`: Optional description text to describe the clause and
its representation.
  - `string extraClassDeclaration`: Optional C++ declarations to be added to
operation classes including the clause.

For example:

```tablegen
class OpenMP_ExampleClauseSkip<
    bit traits = false, bit arguments = false, bit assemblyFormat = false,
    bit description = false, bit extraClassDeclaration = false
  > : OpenMP_Clause<traits, arguments, assemblyFormat, description,
                    extraClassDeclaration> {
  let arguments = (ins
    Optional<AnyType>:$example_var
  );

  let optAssemblyFormat = [{
    `example` `(` $example_var `:` type($example_var) `)`
  }];

  let description = [{
    The `example_var` argument defines the variable to which the EXAMPLE clause
    applies.
  }];
}

def OpenMP_ExampleClause : OpenMP_ExampleClauseSkip<>;
```

### Adding an Operation

Operations in the OpenMP dialect, located in
[OpenMPOps.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td),
can be defined like any other regular operation by just specifying a `mnemonic`
and optional list of `traits` when inheriting from `OpenMP_Op`, and then
defining the expected `description`, `arguments`, etc. properties inside of its
body. However, in most cases, basing the operation definition on its list of
accepted clauses is significantly simpler because some of the properties can
just be inherited from these clauses.

In general, the way to achieve this is to specify, in addition to the `mnemonic`
and optional list of `traits`, a list of `clauses` where all the applicable
`OpenMP_<Name>Clause` definitions are added. Then, the only properties that
would have to be defined in the operation's body are the `summary` and
`description`. For the latter, only the operation itself would have to be
described, since the description of its clause-inherited arguments is appended
through the inherited `clausesDescription` property. By convention, the list of
clauses for an operation must be specified in alphabetical order.

If the operation is intended to have a single region, this is better achieved by
setting the `singleRegion=true` template argument of `OpenMP_Op` rather than
manually populating the `regions` property of the operation, because that way
the default `assemblyFormat` is also updated correspondingly.

For example:

```tablegen
def ExampleOp : OpenMP_Op<"example", traits = [
    AttrSizedOperandSegments, ...
  ], clauses = [
    OpenMP_AlignedClause, OpenMP_IfClause, OpenMP_LinearClause, ...
  ], singleRegion = true> {
  let summary = "example construct";
  let description = [{
    The example construct represents...
  }] # clausesDescription;
}
```

This is possible because the `arguments`, `assemblyFormat` and
`extraClassDeclaration` properties of the operation are by default
populated by concatenating the corresponding properties of the clauses on the
list. In the case of the `assemblyFormat`, this involves combining the
`reqAssemblyFormat` and the `optAssemblyFormat` properties. The
`reqAssemblyFormat` of all clauses is concatenated first and separated using
spaces, whereas the `optAssemblyFormat` is wrapped in an `oilist()` and
interleaved with "|" instead of spaces. The resulting `assemblyFormat` contains
the required assembly format strings, followed by the optional assembly format
strings, optionally the `$region` and the `attr-dict`.

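The combination rule described above can be sketched in plain C++ as follows.
This is only a model of the behavior, not the tablegen implementation: the
`ClauseFormat` structure and the `joinNonEmpty` and `buildAssemblyFormat`
functions are hypothetical names, and the exact whitespace of the generated
string is an assumption.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for the formatting-related properties of a clause.
struct ClauseFormat {
  std::string reqAssemblyFormat; // Empty if the clause defines none.
  std::string optAssemblyFormat; // Empty if the clause defines none.
};

// Join the non-empty strings of `parts` using `separator`.
static std::string joinNonEmpty(const std::vector<std::string> &parts,
                                const std::string &separator) {
  std::string result;
  for (const std::string &part : parts) {
    if (part.empty())
      continue;
    if (!result.empty())
      result += separator;
    result += part;
  }
  return result;
}

// Required formats come first (space-separated), then the optional formats
// wrapped in a single oilist() and separated by "|", then `$region` if the
// operation has a single region, and finally `attr-dict`.
std::string buildAssemblyFormat(const std::vector<ClauseFormat> &clauses,
                                bool singleRegion) {
  std::vector<std::string> required, optional;
  for (const ClauseFormat &clause : clauses) {
    required.push_back(clause.reqAssemblyFormat);
    optional.push_back(clause.optAssemblyFormat);
  }
  std::string optionalPart = joinNonEmpty(optional, " | ");
  if (!optionalPart.empty())
    optionalPart = "oilist(" + optionalPart + ")";
  const std::vector<std::string> pieces = {
      joinNonEmpty(required, " "), optionalPart,
      singleRegion ? "$region" : "", "attr-dict"};
  return joinNonEmpty(pieces, " ");
}
```

For instance, three clauses where the first defines only a required format, the
second only an optional one and the third both would yield the two required
formats, followed by one `oilist(...)` holding both optional formats, followed
by `$region attr-dict` when `singleRegion` is set.
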
### Overriding Clause-Inherited Properties

Although the clause-based definition of operations can greatly reduce work, it's
also somewhat restrictive, since there may be some situations where only part of
the operation definition can be automated in that manner. For fine-grained
control over properties inherited from each clause, two features are available:

  - Inhibition of properties. By using `OpenMP_<Name>ClauseSkip` tablegen
classes, the list of properties copied from the clause to the operation can be
selected. For example, `OpenMP_IfClauseSkip<assemblyFormat = true>` would result
in every property defined for the `OpenMP_IfClause` except for the
`assemblyFormat` being used to initially populate the properties of the
operation.
  - Augmentation of properties. There are times when there is a need to add to
a clause-populated operation property. Instead of overriding the property in the
definition of the operation and having to manually replicate what would
otherwise be automatically populated before adding to it, some internal
properties are defined to hold this default value: `clausesArgs`,
`clausesAssemblyFormat`, `clauses{Req,Opt}AssemblyFormat` and
`clausesExtraClassDeclaration`.

In the following example, assuming both the `OpenMP_InReductionClause` and the
`OpenMP_ReductionClause` define a `getReductionVars` extra class declaration,
we skip the conflicting `extraClassDeclaration`s inherited by both clauses and
provide another implementation, without having to also re-define other
declarations inherited from the `OpenMP_AllocateClause`:

```tablegen
def ExampleOp : OpenMP_Op<"example", traits = [
    AttrSizedOperandSegments, ...
  ], clauses = [
    OpenMP_AllocateClause,
    OpenMP_InReductionClauseSkip<extraClassDeclaration = true>,
    OpenMP_ReductionClauseSkip<extraClassDeclaration = true>
  ], singleRegion = true> {
  let summary = "example construct";
  let description = [{
    This operation represents...
  }] # clausesDescription;

  // Override the clause-populated extraClassDeclaration and add the default
  // back via appending clausesExtraClassDeclaration to it. This has the effect
  // of adding one declaration. Since this property is skipped for the
  // InReduction and Reduction clauses, clausesExtraClassDeclaration won't
  // incorporate the definition of this property for these clauses.
  let extraClassDeclaration = [{
    SmallVector<Value> getReductionVars() {
      // Concatenate inReductionVars and reductionVars and return the result...
    }
  }] # clausesExtraClassDeclaration;
}
```

These features are intended for complex edge cases, but an effort should be made
to avoid having to use them, since they may introduce inconsistencies and
complexity to the dialect.

### Tablegen Verification Pass

As a result of the implicit way in which fundamental properties of MLIR
operations are populated following this approach, and the ability to override
them, forgetting to append clause-inherited values might result in
hard-to-debug tablegen errors.

For this reason, the `-verify-openmp-ops` tablegen pseudo-backend was created.
It runs before any other tablegen backends are triggered for the
[OpenMPOps.td](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td)
file and warns any time a property defined for a clause is not found in the
corresponding operation, except if it is explicitly skipped as described
[above](#overriding-clause-inherited-properties). This way, in case of a later
tablegen failure while processing OpenMP dialect operations, earlier messages
triggered by that pass can point to a likely solution.

### Operand Structures

One consequence of basing the representation of operations on the set of values
and attributes defined for each clause applicable to the corresponding OpenMP
directive is that operation argument lists tend to be long. This makes C++
operation builders difficult to work with and makes it easy to pass arguments
in the wrong order by mistake, which may introduce hard-to-detect problems.

Operand structures are provided as a solution to this issue. The main idea
behind them is that there is one defined for each clause, holding a set of
fields that contain the data needed to initialize each of the arguments
associated with that clause. Clause operand structures are aggregated into
operation operand structures via class inheritance. Then, a custom builder is
defined for each operation taking the corresponding operand structure as a
parameter. Since each argument is a named member of the structure, it becomes
much simpler to set up the desired arguments to create a new operation.

Ad-hoc operand structures available for use within the ODS definition of custom
operation builders might be defined in
[OpenMPClauseOperands.h](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h).
However, this is generally not needed for clause-based operation definitions.
The `-gen-openmp-clause-ops` tablegen backend, triggered when building the 'omp'
dialect, will automatically produce structures in the following way:

- It will create a `<Name>ClauseOps` structure for each `OpenMP_Clause`
definition with one field per argument.
- The name of each field will match the tablegen name of the corresponding
argument, except for replacing snake case with camel case.
- The type of the field will be obtained from the corresponding tablegen
argument's type:
  - Values are represented with `mlir::Value`, except for `Variadic`, which
  makes it an `llvm::SmallVector<mlir::Value>`.
  - `OptionalAttr` is represented by the translation of its `baseAttr`.
  - `TypedArrayAttrBase`-based attribute types are represented by wrapping the
  translation of their `elementAttr` in an `llvm::SmallVector`. The only
  exception for this case is if the `elementAttr` is a "scalar" (i.e. non
  array-like) attribute type, in which case the more generic `mlir::Attribute`
  will be used in place of its `storageType`.
  - For `ElementsAttrBase`-based attribute types a best effort is attempted to
  obtain an element type (`llvm::APInt`, `llvm::APFloat` or
  `DenseArrayAttrBase`'s `returnType`) to be wrapped in an `llvm::SmallVector`.
  If it cannot be obtained, which will happen with non-builtin direct subclasses
  of `ElementsAttrBase`, a warning will be emitted and the `storageType` (i.e.
  specific `mlir::Attribute` subclass) will be used instead.
  - Other attribute types will be represented with their `storageType`.
- It will create a `<Name>Operands` structure for each operation, which is an
empty structure subclassing all operand structures defined for the corresponding
`OpenMP_Op`'s clauses.

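To make the shape of these structures more concrete, the following C++ sketch
mimics what could be generated for an operation taking two clauses. It is not
generated code: `Value` is a stand-in for `mlir::Value`, `std::vector` stands in
for `llvm::SmallVector`, the `Other` clause and the `countOperands` function are
hypothetical, and `ExampleClauseOps` only follows the `$example_var` argument of
the example clause shown earlier.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for mlir::Value so that the sketch compiles without MLIR. An empty
// name models a value that has not been set.
struct Value {
  std::string name;
};

// Clause operand structure for a clause with one value argument named
// `$example_var`: the field name is the camel case version of the argument.
struct ExampleClauseOps {
  Value exampleVar;
};

// Clause operand structure for a hypothetical clause with one variadic
// argument named `$other_vars`.
struct OtherClauseOps {
  std::vector<Value> otherVars;
};

// Operation operand structure: an empty structure that inherits the fields of
// the operand structures of all clauses of the operation.
struct ExampleOperands : public ExampleClauseOps, public OtherClauseOps {};

// Hypothetical stand-in for a builder taking the operand structure. It only
// counts how many operands were set, instead of creating an operation.
std::size_t countOperands(const ExampleOperands &operands) {
  return (operands.exampleVar.name.empty() ? 0 : 1) +
         operands.otherVars.size();
}
```

Because every argument is a named field, a caller fills in only the fields it
needs (for example `operands.exampleVar = ...`) and passes the whole structure
on, which avoids relying on the position of each argument.
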
### Entry Block Argument-Defining Clauses

In their MLIR representation, certain OpenMP clauses introduce a mapping between
values defined outside the operation they are applied to and entry block
arguments for the region of that MLIR operation. This enables, for example, the
introduction of private copies of the same underlying variable defined outside
the MLIR operation the clause is attached to. Currently, clauses with this
property can be classified into three main categories:
  - Map-like clauses: `host_eval` (compiler internal, not defined by the OpenMP
  specification: [see more](#host-evaluated-clauses-in-target-regions)), `map`,
  `use_device_addr` and `use_device_ptr`.
  - Reduction-like clauses: `in_reduction`, `reduction` and `task_reduction`.
  - Privatization clauses: `private`.

All three kinds of entry block argument-defining clauses use a similar custom
assembly format representation, only differing based on the different pieces of
information attached to each kind. Below, one example of each is shown:

```mlir
omp.target map_entries(%x -> %x.m, %y -> %y.m : !llvm.ptr, !llvm.ptr) {
  // Use %x.m, %y.m in place of %x and %y...
}

omp.wsloop reduction(@add.i32 %x -> %x.r, byref @add.f32 %y -> %y.r : !llvm.ptr, !llvm.ptr) {
  // Use %x.r, %y.r in place of %x and %y...
}

omp.parallel private(@x.privatizer %x -> %x.p, @y.privatizer %y -> %y.p : !llvm.ptr, !llvm.ptr) {
  // Use %x.p, %y.p in place of %x and %y...
}
```

As a consequence of parsing and printing the operation's first region entry
block argument names together with the custom assembly format of these clauses,
entry block arguments (i.e. the `^bb0(...):` line) must not be explicitly
defined for these operations. Additionally, it is not possible to implement this
feature while allowing each clause to be independently parsed and printed,
because they need to be printed/parsed together with the corresponding
operation's first region. There must also be a well-defined ordering between
these clauses when several of them are specified for a given operation.

The parsing/printing of these clauses together with the region provides the
ability to define entry block arguments directly after the `->`. Forcing a
specific ordering between these clauses makes the block argument ordering
well-defined, which is the property used to easily match each clause with the
entry block arguments defined by it.

Custom printers and parsers for operation regions based on the entry block
argument-defining clauses they take are implemented based on the
`{parse,print}BlockArgRegion` functions, which take care of the sorting and
formatting of each kind of clause, minimizing code duplication resulting from
this approach. One example of the custom assembly format of an operation taking
the `private` and `reduction` clauses is the following:

```tablegen
let assemblyFormat = clausesAssemblyFormat # [{
  custom<PrivateReductionRegion>($region, $private_vars, type($private_vars),
      $private_syms, $reduction_vars, type($reduction_vars), $reduction_byref,
      $reduction_syms) attr-dict
}];
```

The `BlockArgOpenMPOpInterface` has been introduced to simplify the addition and
handling of these kinds of clauses. It holds `num<ClauseName>BlockArgs()`
functions that by default return 0, to be overridden by each clause through the
`extraClassDeclaration` property. Based on these functions and the expected
alphabetical sorting between entry block argument-defining clauses, it
implements `get<ClauseName>BlockArgs()` functions that are the intended method
of accessing clause-defined block arguments.

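The way block arguments can be located from per-clause counts and the
alphabetical ordering is sketched below in standalone C++. This only models the
idea and is not the interface implementation: the `BlockArgCounts` type and the
`getBlockArgRange` function are hypothetical, and clauses are identified here by
their name as a string.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Number of entry block arguments defined by each clause of one operation,
// keyed by clause name. std::map iterates over its keys in sorted order, which
// models the alphabetical ordering between clauses.
using BlockArgCounts = std::map<std::string, std::size_t>;

// Return the half-open [begin, end) index range of the entry block arguments
// that belong to `clause`. Clauses that are not present get an empty range.
std::pair<std::size_t, std::size_t>
getBlockArgRange(const BlockArgCounts &counts, const std::string &clause) {
  std::size_t begin = 0;
  for (const auto &entry : counts) {
    if (entry.first == clause)
      return {begin, begin + entry.second};
    if (entry.first > clause)
      break; // All remaining clauses are sorted after the requested one.
    begin += entry.second;
  }
  return {begin, begin};
}
```

With two `private` variables and one `reduction` variable, the `private`
arguments occupy indices 0 and 1 and the `reduction` argument index 2, because
`private` sorts before `reduction`.
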
## Loop-Associated Directives

Loop-associated OpenMP constructs are represented in the dialect as loop wrapper
operations. These implement the `LoopWrapperInterface`, which enforces a series
of restrictions upon the operation:
  - It has the `NoTerminator` and `SingleBlock` traits;
  - It contains a single region; and
  - Its only block contains exactly one operation, which must be another loop
wrapper or `omp.loop_nest` operation.

This approach splits the representation for a loop nest and the loop-associated
constructs that specify how its iterations are executed, possibly across various
SIMD lanes (`omp.simd`), threads (`omp.wsloop`), teams of threads
(`omp.distribute`) or tasks (`omp.taskloop`). The ability to directly nest
multiple loop wrappers to impact the execution of a single loop nest is used to
represent composite constructs in a modular way.

The `omp.loop_nest` operation represents a collapsed rectangular loop nest that
must always be wrapped by at least one loop wrapper, which defines how it is
intended to be executed. It serves as a simpler and more restrictive
representation of OpenMP loops while a more general approach to support
non-rectangular loop nests, loop transformations and non-perfectly nested loops
based on a new `omp.canonical_loop` definition is developed.

The following example shows how a `parallel {do,for}` construct would be
represented:
```mlir
omp.parallel ... {
  ...
  omp.wsloop ... {
    omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
      %a_val = memref.load %a[%i] : memref<?xf32>
      %b_val = memref.load %b[%i] : memref<?xf32>
      %sum = arith.addf %a_val, %b_val : f32
      memref.store %sum, %c[%i] : memref<?xf32>
      omp.yield
    }
  }
  ...
  omp.terminator
}
```

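The nesting restriction on loop wrappers can be modeled with a small standalone
C++ check. This is only a sketch: the `Op` structure, `isLoopWrapper` and
`verifyLoopWrapper` are hypothetical names, the set of wrappers is limited to
the four operations named in this section, and the trait and region-count
restrictions are not modeled.

```cpp
#include <set>
#include <string>
#include <vector>

// Minimal stand-in for an operation: its name and the operations held by the
// only block of its only region.
struct Op {
  std::string name;
  std::vector<Op> body;
};

// Loop wrappers named in this section; the dialect may define others.
static bool isLoopWrapper(const Op &op) {
  static const std::set<std::string> wrappers = {
      "omp.distribute", "omp.simd", "omp.taskloop", "omp.wsloop"};
  return wrappers.count(op.name) != 0;
}

// Check that a loop wrapper holds exactly one operation, which is either an
// `omp.loop_nest` or another loop wrapper that is itself valid.
bool verifyLoopWrapper(const Op &op) {
  if (!isLoopWrapper(op) || op.body.size() != 1)
    return false;
  const Op &nested = op.body.front();
  return nested.name == "omp.loop_nest" || verifyLoopWrapper(nested);
}
```

In this model, an `omp.wsloop` holding a single `omp.loop_nest` is accepted,
whereas an `omp.wsloop` with an empty body, or one followed by a terminator
inside the same block, is rejected.
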
### Loop Transformations

In addition to the worksharing loop-associated constructs described above, the
OpenMP specification also defines a set of loop transformation constructs. They
replace the associated loop(s) before worksharing constructs are executed on the
generated loop(s). Some examples of such constructs are `tile` and `unroll`.

A general approach for representing these types of OpenMP constructs has not yet
been implemented, but it is closely linked to the `omp.canonical_loop` work.
Nevertheless, the loop transformation defined by the `collapse` clause of
loop-associated worksharing constructs can be represented by introducing
multiple bounds, steps and induction variables to the `omp.loop_nest` operation.

## Compound Construct Representation

The OpenMP specification defines certain shortcuts that allow specifying
multiple constructs in a single directive, which are referred to as compound
constructs (e.g. `parallel do` contains the `parallel` and `do` constructs).
These can be further classified into [combined](#combined-constructs) and
[composite](#composite-constructs) constructs. This section describes how they
are represented in the dialect.

When clauses are specified for compound constructs, the OpenMP specification
defines a set of rules to decide to which leaf constructs they apply, as well as
potentially introducing some other implicit clauses. These rules must be taken
into account by those creating the MLIR representation, since it is a per-leaf
representation that expects these rules to have already been followed.

### Combined Constructs

Combined constructs are semantically equivalent to specifying one construct
immediately nested inside another. This property is used to simplify the dialect
by representing them through the operations associated with each leaf construct.
For example, `target teams` would be represented as follows:

```mlir
omp.target ... {
  ...
  omp.teams ... {
    ...
    omp.terminator
  }
  ...
  omp.terminator
}
```

### Composite Constructs

Composite constructs are similar to combined constructs in that they specify the
effect of one construct being applied immediately after another. However, they
group together constructs that cannot be directly nested into each other.
Specifically, they group together multiple loop-associated constructs that apply
to the same collapsed loop nest.

As of version 5.2 of the OpenMP specification, the list of composite constructs
is the following:
  - `{do,for} simd`;
  - `distribute simd`;
  - `distribute parallel {do,for}`;
  - `distribute parallel {do,for} simd`; and
  - `taskloop simd`.

Even though the list of composite constructs is relatively short and it would
also be possible to create dialect operations for each, it was decided to
allow attaching multiple loop wrappers to a single loop instead. This minimizes
redundancy in the dialect and maximizes its modularity, since there is a single
operation for each leaf construct regardless of whether it can be part of a
composite construct. On the other hand, this means the `omp.loop_nest` operation
will have to be interpreted differently depending on how many and which loop
wrappers are attached to it.

To simplify the detection of operations taking part in the representation of a
composite construct, the `ComposableOpInterface` was introduced. Its purpose is
to handle the `omp.composite` discardable dialect attribute that can optionally
be attached to these operations. Operation verifiers will ensure its presence is
consistent with the context the operation appears in, so that the attribute is
present if and only if the operation represents a leaf of a composite construct.

For example, the `distribute simd` composite construct is represented as
follows:

```mlir
omp.distribute ... {
  omp.simd ... {
    omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
      ...
      omp.yield
    }
  } {omp.composite}
} {omp.composite}
```

One exception to this is the representation of the
`distribute parallel {do,for}` composite construct. The presence of a
block-associated `parallel` leaf construct would introduce many problems if it
was allowed to work as a loop wrapper. In this case, the "hoisted `omp.parallel`
representation" is used instead. This consists of making `omp.parallel` the
parent operation, with a nested `omp.loop_nest` wrapped by `omp.distribute` and
`omp.wsloop` (and `omp.simd`, in the `distribute parallel {do,for} simd` case).

This approach works because `parallel` is a parallelism-generating construct,
whereas `distribute` is a worksharing construct impacting the higher level
`teams` construct, making the ordering between these constructs not cause
semantic mismatches. This property is also exploited by LLVM's SPMD-mode.

```mlir
omp.parallel ... {
  ...
  omp.distribute ... {
    omp.wsloop ... {
      omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
        ...
        omp.yield
      }
    } {omp.composite}
  } {omp.composite}
  ...
  omp.terminator
} {omp.composite}
```

## Host-Evaluated Clauses in Target Regions

The `omp.target` operation, which represents the OpenMP `target` construct, is
marked with the `IsolatedFromAbove` trait. This means that, inside of its
region, no MLIR values defined outside of the op itself can be used. This is
consistent with the OpenMP specification of the `target` construct, which
mandates that all host device values used inside of the `target` region must
either be privatized (data-sharing) or mapped (data-mapping).

Normally, clauses applied to a construct are evaluated before entering that
construct. Further, in some cases, the OpenMP specification stipulates that
clauses be evaluated _on the host device_ on entry to a parent `target`
construct. In particular, the `num_teams` and `thread_limit` clauses of the
`teams` construct must be evaluated on the host device if it's nested inside or
combined with a `target` construct.

Additionally, the runtime library targeted by the MLIR to LLVM IR translation of
the OpenMP dialect supports the optimized launch of SPMD kernels (i.e.
`target teams distribute parallel {do,for}` in OpenMP), which requires
specifying in advance what the total trip count of the loop is. Consequently, it
is also beneficial to evaluate the trip count on the host device prior to the
kernel launch.

These host-evaluated values in MLIR would need to be placed outside of the
`omp.target` region and also attached to the corresponding nested operations,
which is not possible because of the `IsolatedFromAbove` trait. The solution
implemented to address this problem has been to introduce the `host_eval`
argument to the `omp.target` operation. It works similarly to a `map` clause,
but its only intended use is to forward host-evaluated values to their
corresponding operation inside of the region. Any use other than the ones
described above results in a verifier error.

```mlir
// Initialize %0, %1, %2, %3...
omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) {
  omp.teams num_teams(to %nt : i32) {
    omp.parallel {
      // Loop wrappers (omp.distribute, omp.wsloop) have the NoTerminator trait
      // and hold a single nested operation, so they take no omp.terminator.
      omp.distribute {
        omp.wsloop {
          omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
            // ...
            omp.yield
          }
        } {omp.composite}
      } {omp.composite}
      omp.terminator
    } {omp.composite}
    omp.terminator
  }
  omp.terminator
}
```

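As an illustration of the kind of value that is worth evaluating on the host,
the following standalone C++ sketch computes the total trip count of a collapsed
rectangular loop nest from its bounds and steps. It is not the code used by the
MLIR to LLVM IR translation: `LoopBounds` and `computeTripCount` are
hypothetical names, and the sketch assumes exclusive upper bounds and positive
steps.

```cpp
#include <cstdint>
#include <vector>

// Bounds of one loop of a rectangular nest. This sketch assumes an exclusive
// upper bound and a positive step.
struct LoopBounds {
  std::int64_t lowerBound;
  std::int64_t upperBound;
  std::int64_t step;
};

// Total number of iterations of a collapsed rectangular loop nest: the product
// of the iteration counts of each of its loops.
std::int64_t computeTripCount(const std::vector<LoopBounds> &loops) {
  std::int64_t tripCount = 1;
  for (const LoopBounds &loop : loops) {
    if (loop.upperBound <= loop.lowerBound)
      return 0; // An empty loop makes the whole nest empty.
    // Ceiling division of the size of the iteration space by the step.
    tripCount *=
        (loop.upperBound - loop.lowerBound + loop.step - 1) / loop.step;
  }
  return tripCount;
}
```

For a two-loop nest with bounds `[0, 10)` step 3 and `[0, 4)` step 2, the
per-loop counts are 4 and 2, giving a total trip count of 8.
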