1OpenMP Extensions for OpenACC 2============================= 3 4OpenACC provides some functionality that OpenMP does not. In some 5cases, Clang supports OpenMP extensions to provide similar 6functionality, taking advantage of the runtime implementation already 7required for OpenACC. This section documents those extensions. 8 9By default, Clang recognizes these extensions. The command-line 10option ``-fno-openmp-extensions`` can be specified to disable all 11OpenMP extensions, including those described in this section. 12 13.. _ompx-motivation: 14 15Motivation 16---------- 17 18There are multiple benefits to exposing OpenACC functionality as LLVM 19OpenMP extensions: 20 21* OpenMP applications can take advantage of the additional 22 functionality. 23* As LLVM's implementation of these extensions matures, it can serve 24 as a basis for including these extensions in the OpenMP standard. 25* Source-to-source translation from certain OpenACC features to OpenMP 26 is otherwise impossible. 27* Runtime tests can be written in terms of OpenMP instead of OpenACC 28 or low-level runtime calls. 29* More generally, there is a clean separation of concerns between 30 OpenACC and OpenMP development in LLVM. That is, LLVM's OpenMP 31 developers can discuss, modify, and debug LLVM's extended OpenMP 32 implementation and test suite without directly considering OpenACC's 33 language and execution model, which are handled by LLVM's OpenACC 34 developers. 35 36.. _ompx-hold: 37 38``ompx_hold`` Map Type Modifier 39------------------------------- 40 41.. _ompx-holdExample: 42 43Example 44^^^^^^^ 45 46.. code-block:: c++ 47 48 #pragma omp target data map(ompx_hold, tofrom: x) // holds onto mapping of x throughout region 49 { 50 foo(); // might have map(delete: x) 51 #pragma omp target map(present, alloc: x) // x is guaranteed to be present 52 printf("%d\n", x); 53 } 54 55The ``ompx_hold`` map type modifier above specifies that the ``target 56data`` directive holds onto the mapping for ``x`` throughout the 57associated region regardless of any ``target exit data`` directives 58executed during the call to ``foo``. Thus, the presence assertion for 59``x`` at the enclosed ``target`` construct cannot fail. 60 61.. _ompx-holdBehavior: 62 63Behavior 64^^^^^^^^ 65 66* Stated more generally, the ``ompx_hold`` map type modifier specifies 67 that the associated data is not unmapped until the end of the 68 construct. As usual, the standard OpenMP reference count for the 69 data must also reach zero before the data is unmapped. 70* If ``ompx_hold`` is specified for the same data on lexically or 71 dynamically enclosed constructs, there is no additional effect as 72 the data mapping is already held throughout their regions. 73* The ``ompx_hold`` map type modifier is permitted to appear only on 74 ``target`` constructs (and associated combined constructs) and 75 ``target data`` constructs. It is not permitted to appear on 76 ``target enter data`` or ``target exit data`` directives because 77 there is no associated statement, so it is not meaningful to hold 78 onto a mapping until the end of the directive. 79* The runtime reports an error if ``omp_target_disassociate_ptr`` is 80 called for a mapping for which the ``ompx_hold`` map type modifier 81 is in effect. 82* Like the ``present`` map type modifier, the ``ompx_hold`` map type 83 modifier applies to an entire struct if it's specified for any 84 member of that struct even if other ``map`` clauses on the same 85 directive specify other members without the ``ompx_hold`` map type 86 modifier. 87* ``ompx_hold`` support is not yet provided for ``defaultmap``. 88 89Implementation 90^^^^^^^^^^^^^^ 91 92* LLVM uses the term *dynamic reference count* for the standard OpenMP 93 reference count for host/device data mappings. 94* The ``ompx_hold`` map type modifier selects an alternate reference 95 count, called the *hold reference count*. 96* A mapping is removed only once both its reference counts reach zero. 97* Because ``ompx_hold`` can appear only constructs, increments and 98 decrements of the hold reference count are guaranteed to be 99 balanced, so it is impossible to decrement it below zero. 100* The dynamic reference count is used wherever ``ompx_hold`` is not 101 specified (and possibly cannot be specified). Decrementing the 102 dynamic reference count has no effect if it is already zero. 103* The runtime determines that the ``ompx_hold`` map type modifier is 104 *in effect* (see :ref:`Behavior <ompx-holdBehavior>` above) when the 105 hold reference count is greater than zero. 106 107Relationship with OpenACC 108^^^^^^^^^^^^^^^^^^^^^^^^^ 109 110OpenACC specifies two reference counts for tracking host/device data 111mappings. Which reference count is used to implement an OpenACC 112directive is determined by the nature of that directive, either 113dynamic or structured: 114 115* The *dynamic reference count* is always used for ``enter data`` and 116 ``exit data`` directives and corresponding OpenACC routines. 117* The *structured reference count* is always used for ``data`` and 118 compute constructs, which are similar to OpenMP's ``target data`` 119 and ``target`` constructs. 120 121Contrast with OpenMP, where the dynamic reference count is always used 122unless the application developer specifies an alternate behavior via 123our map type modifier extension. We chose the name *hold* for that 124map type modifier because, as demonstrated in the above :ref:`example 125<ompx-holdExample>`, *hold* concisely identifies the desired behavior 126from the application developer's perspective without referencing the 127implementation of that behavior. 128 129The hold reference count is otherwise modeled after OpenACC's 130structured reference count. For example, calling ``acc_unmap_data``, 131which is similar to ``omp_target_disassociate_ptr``, is an error when 132the structured reference count is not zero. 133 134While Flang and Clang obviously must implement the syntax and 135semantics for selecting OpenACC reference counts differently than for 136selecting OpenMP reference counts, the implementation is the same at 137the runtime level. That is, OpenACC's dynamic reference count is 138OpenMP's dynamic reference count, and OpenACC's structured reference 139count is our OpenMP hold reference count extension. 140 141.. _atomicWithinTeams: 142 143``atomic`` Strictly Nested Within ``teams`` 144------------------------------------------- 145 146Example 147^^^^^^^ 148 149OpenMP 5.2, sec. 10.2 "teams Construct", p. 232, L9-12 restricts what 150regions can be strictly nested within a ``teams`` region. As an 151extension, Clang relaxes that restriction in the case of the 152``atomic`` construct so that, for example, the following case is 153permitted: 154 155.. code-block:: c++ 156 157 #pragma omp target teams map(tofrom:x) 158 #pragma omp atomic update 159 x++; 160 161Relationship with OpenACC 162^^^^^^^^^^^^^^^^^^^^^^^^^ 163 164This extension is important when translating OpenACC to OpenMP because 165OpenACC does not have the same restriction for its corresponding 166constructs. For example, the following is conforming OpenACC: 167 168.. code-block:: c++ 169 170 #pragma acc parallel copy(x) 171 #pragma acc atomic update 172 x++; 173