.. _omp112:

Found thread data sharing on the GPU. Expect degraded performance due to data globalization. [OMP112]
======================================================================================================

This missed remark indicates that a globalized value was found on the target
device that was replaced neither with stack memory by :ref:`OMP110 <omp110>`
nor with shared memory by :ref:`OMP111 <omp111>`. Globalization that has not
been removed must be handled by the runtime and will significantly impact
performance.

The OpenMP standard requires that threads are able to share their data with
each other. However, this is not true by default when offloading to a target
device such as a GPU. Threads on a GPU cannot share their data unless it is
first placed in global or shared memory. In order to create standards-compliant
code, the Clang compiler will globalize any variables that could potentially be
shared between the threads. In the majority of cases, globalized variables can
either be returned to a thread-local stack or pushed to shared memory. However,
in a few cases globalization is necessary and will cause a performance penalty.

Examples
--------

This example shows legitimate data sharing on the device. It is a convoluted
example, but it is fully compliant with the OpenMP standard. Without
globalization, this code would produce different results on different target
devices.

.. code-block:: c++

  #include <omp.h>
  #include <cstdio>

  #pragma omp declare target
  static int *p;
  #pragma omp end declare target

  void foo() {
    int x = omp_get_thread_num();
    if (omp_get_thread_num() == 1)
      p = &x; // Thread 1 publishes the address of its local variable.

  #pragma omp barrier

    // Every thread reads thread 1's value through the shared pointer.
    printf("Thread %d: %d\n", omp_get_thread_num(), *p);
  }

  int main() {
  #pragma omp target parallel
    foo();
  }

.. code-block:: console

  $ clang++ -fopenmp -fopenmp-targets=nvptx64 -O1 -Rpass-missed=openmp-opt omp112.cpp
  omp112.cpp:9:7: remark: Found thread data sharing on the GPU. Expect degraded performance
  due to data globalization. [OMP112] [-Rpass-missed=openmp-opt]
    int x = omp_get_thread_num();
        ^

A less convoluted example of globalization that cannot be removed occurs when
calling functions that are not visible from the current translation unit.

.. code-block:: c++

  extern void use(int *x);

  void foo() {
    int x;
    use(&x); // The pointer escapes into a function the compiler cannot see.
  }

  int main() {
  #pragma omp target parallel
    foo();
  }

.. code-block:: console

  $ clang++ -fopenmp -fopenmp-targets=nvptx64 -O1 -Rpass-missed=openmp-opt omp112.cpp
  omp112.cpp:4:7: remark: Found thread data sharing on the GPU. Expect degraded performance
  due to data globalization. [OMP112] [-Rpass-missed=openmp-opt]
    int x;
        ^

Diagnostic Scope
----------------

OpenMP target offloading missed remark.
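
Returning to the second example in the Examples section, one possible
mitigation is sketched here; it is not a guaranteed fix. If the external
function never stores the pointer it receives, its parameter can be marked with
Clang's ``noescape`` attribute, which the related OMP113 remark suggests as an
override. The ``NOESCAPE`` macro is a hypothetical helper introduced for this
sketch, and ``use`` is defined in the same file only to keep the snippet
self-contained; in practice its definition would live in another translation
unit. Whether the remark actually disappears is an assumption that depends on
the compiler version and optimization level.

.. code-block:: c++

  // Hypothetical helper macro: expands to Clang's noescape attribute when the
  // compiler supports it, and to nothing otherwise.
  #if defined(__has_attribute)
  #  if __has_attribute(noescape)
  #    define NOESCAPE __attribute__((noescape))
  #  endif
  #endif
  #ifndef NOESCAPE
  #  define NOESCAPE
  #endif

  // The annotation promises that 'use' does not retain 'x' after it returns.
  void use(NOESCAPE int *x);

  int foo() {
    int x = 0;
    use(&x); // The address is passed out, but the callee may not keep it.
    return x;
  }

  // Stand-in definition so the sketch links on its own.
  void use(NOESCAPE int *x) { *x = 42; }

The annotation is a promise made by the programmer. If ``use`` did store the
pointer for another thread to read, as in the first example, marking it
``noescape`` would be incorrect and globalization would still be required.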