| #
1d66649a |
| 21-Sep-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) on the number of arguments, (3) forwarded arguments must be cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D102107
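The two calling conventions described in this message can be contrasted in plain C++. This is only a sketch under stated assumptions: `toy_fork_variadic`, `toy_fork_aggregate`, `Captures`, and the outlined functions are hypothetical names, not libomp or Clang symbols, and the toy "runtime" simply calls the outlined function once on the calling thread.

```cpp
#include <cassert>
#include <cstdarg>

// All names below are hypothetical stand-ins, not libomp or Clang symbols.

// Old scheme: each capture is its own pointer parameter.
void outlined_separate(int *counter, double *scale) {
  *counter += 1;
  *scale *= 2.0;
}

// The toy runtime unpacks a variadic list into a fixed-size array
// (mirroring the hardcoded limit of 16) and forwards the pointers.
// For brevity this toy only handles exactly two captures.
void toy_fork_variadic(void (*fn)(int *, double *), int argc, ...) {
  assert(argc == 2);
  void *args[16] = {};
  va_list ap;
  va_start(ap, argc);
  for (int i = 0; i < argc; ++i)
    args[i] = va_arg(ap, void *);
  va_end(ap);
  fn(static_cast<int *>(args[0]), static_cast<double *>(args[1]));
}

// New scheme: captures travel together in one struct behind one pointer.
struct Captures {
  int *counter;
  double *scale;
};

void outlined_aggregate(void *opaque) {
  auto *captures = static_cast<Captures *>(opaque);
  *captures->counter += 1;
  *captures->scale *= 2.0;
}

// No variadic list and no argument limit: the runtime forwards one pointer.
void toy_fork_aggregate(void (*fn)(void *), void *captures) { fn(captures); }

int run_variadic() {
  int counter = 1;
  double scale = 1.5;
  // Every argument has to be cast to a generic pointer before forwarding.
  toy_fork_variadic(outlined_separate, 2, static_cast<void *>(&counter),
                    static_cast<void *>(&scale));
  return counter + static_cast<int>(scale);
}

int run_aggregate() {
  int counter = 1;
  double scale = 1.5;
  Captures captures{&counter, &scale};
  toy_fork_aggregate(outlined_aggregate, &captures);
  return counter + static_cast<int>(scale);
}
```

Both helpers compute the same result; the aggregate form keeps the member types visible in the struct definition, which is the debugging benefit the message alludes to.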
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
fb0cf017 |
| 19-Jul-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit e9c7291cb25f071f1a1dfa4049ed9f5a8a217b3e.
Fix failing tests
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
e9c7291c |
| 15-Jun-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) on the number of arguments, (3) forwarded arguments must be cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
6ff380f4 |
| 19-May-2021 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Remove SIMD check lines for non-simd tests
If a test does not contain " simd" but has -fopenmp-simd RUN lines, we can just check that we do not create __kmpc|__tgt calls.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101973
|
| #
16d03818 |
| 13-May-2021 |
Roman Lebedev <lebedev.ri@gmail.com> |
Return "[CGCall] Annotate `this` argument with alignment"
The original change was reverted because it was discovered that clang mishandles thunks, and they receive wrong attributes for their this/return types - the ones for the function they will call, not the ones they have.
While I have tried to fix this in https://reviews.llvm.org/D100388, that patch has been up and stuck for a month now, with little sign of progress.
So while it will be good to solve this for real, for now we can simply avoid introducing the bug, by not annotating this/return for thunks.
This reverts commit 6270b3a1eafaba4279e021418c5a2c5a35abc002, relanding 0aa0458f1429372038ca6a4edc7e94c96cd9a753.
|
| #
df729e2b |
| 22-Apr-2021 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to avoid the "ref" global use introduced for globals with internal linkage. From there it went down the rabbit hole, e.g., all variables, even `nohost` ones, were emitted into the device code so it was impossible to determine if "ref" was needed late in the game (based on the name only). To make it really useful, `begin declare target` was needed as it can carry the `device_type`. Not emitting variables eagerly had a ripple effect. Finally, the precedence of the (explicit) declare target list items needed to be taken into account, that meant we cannot just look for any declare target attribute to make a decision. This caused the handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse declarations and definitions" as part of OpenMP parsing, this will always break at some point. Instead, we keep track of what region we are in and act on definitions and declarations instead, this is what we do for declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specified in the standard,
- delayed emission of globals not mentioned in an explicit list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the device,
- no explicit parsing of declarations in-between `omp [begin] declare variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do on their own lead to "inconsistent states", which seem less desirable overall.
After working through this I feel the standard should remove the explicit declare target forms as the delayed emission is horrible. That said, while we delay things anyway, it seems to me we check too often for the current status even though that is often not sufficient to act upon. There seems to be a lot of duplication that can probably be trimmed down. Eagerly emitting some things seems pretty weak as an argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
|
| #
207b08a9 |
| 05-May-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
|
| #
f016c06a |
| 05-May-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
Revert "[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks"
This reverts commit 956cae2f09b21429dbcb02066c99e35a239aa4bf.
|
| #
956cae2f |
| 04-May-2021 |
Giorgis Georgakoudis <georgakoudis1@llnl.gov> |
[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5 |
|
| #
d222a07d |
| 01-Apr-2021 |
Thomas Preud'homme <thomasp@graphcore.ai> |
[OpenMP, test] Fix uses of undef S*VAR FileCheck var
Fix the many cases of use of undefined SIVAR/SVAR/SFVAR in OpenMP *private_codegen tests, due to a missing BLOCK directive to capture the IR variable when it is declared. It also fixes a few typos in their use.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99770
|
|
Revision tags: llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
6b335179 |
| 31-Dec-2020 |
Fangrui Song <i@maskray.me> |
[test] Add {{.*}} to make tests immune to dso_local/dso_preemptable/(none) differences
For a definition (of most linkage types), dso_local is set for ELF -fno-pic/-fpie and COFF, but not for Mach-O. This nuance causes unneeded binary format differences.
This patch replaces (function) `define ` with `define{{.*}} `, (variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} ` if there is an explicit linkage.
* Clang will set dso_local for Mach-O, which is currently implied by TargetMachine.cpp. This will make COFF/Mach-O and executable ELF similar.
* Eventually I hope we can make dso_local the textual LLVM IR default (write explicit "dso_preemptable" when applicable) and -fpic ELF will be similar to everything else. This patch helps move toward that goal.
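The effect of relaxing `define ` to `define{{.*}} ` can be approximated with an ordinary regular expression. The sketch below is not FileCheck itself; it assumes that the text inside `{{...}}` behaves as a regex and everything else as literal text, and the function names and the `@foo` symbol are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: approximates a check written as `define{{.*}} void @foo`.
// The `.*` absorbs any linkage or dso_local keywords, including none at all.
bool matches_relaxed_define(const std::string &line) {
  static const std::regex pattern(R"(define.* void @foo)");
  return std::regex_search(line, pattern);
}

// Hypothetical helper: approximates the older, stricter `define void @foo`.
bool matches_strict_define(const std::string &line) {
  return line.find("define void @foo") != std::string::npos;
}

// Small self-check over the spellings the commit message is concerned with.
void check_define_examples() {
  assert(matches_relaxed_define("define void @foo()"));
  assert(matches_relaxed_define("define dso_local void @foo()"));
  assert(!matches_strict_define("define dso_local void @foo()"));
}
```

The strict form only accepts the spelling without `dso_local`, which is why a single test file could not pass on both ELF and Mach-O targets before the change.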
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
69cd776e |
| 16-Nov-2020 |
CJ Johnson <johnsoncj@google.com> |
[CodeGen] Apply 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments.
* Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments
* Gates 'nonnull' on -f(no-)delete-null-pointer-checks
* Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to explicitly test the behavior of this change
* Refactors hundreds of over-constrained clang tests to permit these attributes, where needed
* Updates Clang12 patch notes mentioning this change
Reviewed-by: rsmith, jdoerfert
Differential Revision: https://reviews.llvm.org/D17993
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
fa5d22a0 |
| 28-May-2020 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][NFC] Reuse OMPIRBuilder `struct ident_t` handling in Clang
Replace the `ident_t` handling in Clang with the methods offered by the OMPIRBuilder. This cuts down on the clang code as well as the differences between the two, making further transitions easier. Tests have changed but there should not be a real functional change. The most interesting difference is probably that we stop generating local ident_t allocations for now and just use globals. Given that this happens only with debug info, the location part of the `ident_t` is probably bigger than the test anyway. As the location part is already a global, we can avoid the allocation, memcpy, and store in favor of a constant global that is slightly bigger. This can be revisited if there are complications.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D80735
|
|
Revision tags: llvmorg-10.0.1-rc1 |
|
| #
62f3ef2b |
| 18-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
[CGCall] Annotate references with "align" attribute.
If we're going to assume references are dereferenceable, we should also assume they're aligned: otherwise, we can't actually dereference them.
See also D80072.
Differential Revision: https://reviews.llvm.org/D80166
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
| #
d900dd0c |
| 15-Oct-2018 |
Sean Fertile <sfertile@ca.ibm.com> |
Revert "[CodeGenCXX] Treat 'this' as noalias in constructors"
This reverts commit https://reviews.llvm.org/rL344150 which causes MachineOutliner related failures on the ppc64le multistage buildbot.
llvm-svn: 344526
|
| #
cc7e7475 |
| 10-Oct-2018 |
Anton Bikineev <ant.bikineev@gmail.com> |
[CodeGenCXX] Treat 'this' as noalias in constructors
This is currently a clang extension and a resolution of the defect report in the C++ Standard.
Differential Revision: https://reviews.llvm.org/D46441
llvm-svn: 344150
|
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3 |
|
| #
e1ca7b61 |
| 29-Aug-2018 |
Mike Rice <michael.p.rice@intel.com> |
[OPENMP] Create non-const ident_t objects.
Currently ident_t objects are created const when debug info is not enabled, but the libittnotify library in the OpenMP runtime writes to the reserved_2 field (See __kmp_itt_region_forking in openmp/runtime/src/kmp_itt.inl). Now create ident_t objects non-const.
Differential Revision: https://reviews.llvm.org/D51331
llvm-svn: 340934
|
|
Revision tags: llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2 |
|
| #
6e938eff |
| 19-Jan-2018 |
Daniel Neilson <dneilson@azul.com> |
Change memcpy/memmove/memset to have dest and source alignment attributes (Step 1).
Summary: Upstream LLVM is changing the prototypes of the @llvm.memcpy/memmove/memset intrinsics. This change updates the Clang tests for this change.
The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two.
This change removes the alignment argument in favour of placing the alignment attribute on the source and destination pointers of the memory intrinsic call.
For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
At this time the source and destination alignments must be the same (Step 1). Step 2 of the change, to be landed shortly, will relax that constraint and allow the source and destination to have different alignments.
llvm-svn: 322964
|
|
Revision tags: llvmorg-6.0.0-rc1 |
|
| #
a8a9153a |
| 29-Dec-2017 |
Alexey Bataev <a.bataev@hotmail.com> |
[OPENMP] Support for -fopenmp-simd option with compilation of simd loops only.
Added support for -fopenmp-simd option that allows compilation of simd-based constructs without emission of OpenMP runtime calls.
llvm-svn: 321560
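As an illustration of what the option distinguishes, the sketch below contains one simd construct and one non-simd construct. The function names are hypothetical. The assumption, based on the description above, is that with -fopenmp-simd only the `simd` loop is treated as an OpenMP construct while `parallel for` is ignored, so no runtime calls are needed; without any OpenMP flag both pragmas are ignored and the code runs serially with the same result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: a simd-based construct, honored under -fopenmp-simd.
long sum_of_squares(const std::vector<int> &data) {
  long total = 0;
  const int n = static_cast<int>(data.size());
#pragma omp simd reduction(+ : total)
  for (int i = 0; i < n; ++i)
    total += static_cast<long>(data[i]) * data[i];
  return total;
}

// Hypothetical example: a non-simd construct, which would need the OpenMP
// runtime under -fopenmp and is assumed to be skipped under -fopenmp-simd.
void scale_in_place(std::vector<int> &data, int factor) {
  const int n = static_cast<int>(data.size());
#pragma omp parallel for
  for (int i = 0; i < n; ++i)
    data[i] *= factor;
}

// Small self-check; the results do not depend on which flag is used.
void check_simd_examples() {
  std::vector<int> values{1, 2, 3};
  assert(sum_of_squares(values) == 14);
  scale_in_place(values, 3);
  assert(values[2] == 9);
}
```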
|
| #
3f82cfc3 |
| 13-Dec-2017 |
Alexey Bataev <a.bataev@hotmail.com> |
[OPENMP] Fix handling of clauses in clause parsing mode.
The compiler may generate incorrect code if we try to capture the variable in clause parsing mode.
llvm-svn: 320590
|
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
| #
7a203715 |
| 21-Oct-2016 |
Reid Kleckner <rnk@google.com> |
Remove unnecessary x86 backend requirements from OpenMP tests
Clang can generate LLVM IR for x86 without a registered x86 backend.
llvm-svn: 284836
|
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
| #
6d004264 |
| 16-Jun-2016 |
Samuel Antao <sfantao@us.ibm.com> |
Re-apply r272900 - [OpenMP] Cast captures by copy when passed to fork call so that they are compatible to what the runtime library expects.
An issue in one of the regression tests was fixed for 32-bit hosts.
llvm-svn: 272931
|
| #
b1f95012 |
| 16-Jun-2016 |
Samuel Antao <sfantao@us.ibm.com> |
Revert r272900 - [OpenMP] Cast captures by copy when passed to fork call so that they are compatible to what the runtime library expects.
Was causing trouble in one of the regression tests for a 32-bit address space.
llvm-svn: 272908
|
| #
49516179 |
| 16-Jun-2016 |
Samuel Antao <sfantao@us.ibm.com> |
[OpenMP] Cast captures by copy when passed to fork call so that they are compatible to what the runtime library expects.
Summary: This patch fixes an issue detected when firstprivate variables are passed to an OpenMP outlined function vararg list. Currently they are not compatible with what the runtime library expects causing malfunction in some targets.
This patch fixes the issue by moving the casting logic already in place for offloading to the common code that creates the outline function and arguments and updates the regression tests accordingly.
Reviewers: hfinkel, arpith-jacob, carlo.bertolli, kkwli0, ABataev
Subscribers: cfe-commits, caomhin
Differential Revision: http://reviews.llvm.org/D21150
llvm-svn: 272900
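The idea of making a by-copy capture compatible with a pointer-sized argument slot can be sketched as follows. This is an illustration only, not the Clang codegen: `pack_capture` and `unpack_capture` are hypothetical helpers, and the assumption is that the runtime forwards each argument as a pointer-sized value, so a smaller value has to be stored into such a slot and read back with its original type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: store a by-copy capture into a pointer-sized slot.
std::uintptr_t pack_capture(int value) {
  static_assert(sizeof(int) <= sizeof(std::uintptr_t),
                "capture must fit into a pointer-sized slot");
  std::uintptr_t slot = 0;
  std::memcpy(&slot, &value, sizeof(value));
  return slot;
}

// Hypothetical helper: recover the original value inside the outlined function.
int unpack_capture(std::uintptr_t slot) {
  int value = 0;
  std::memcpy(&value, &slot, sizeof(value));
  return value;
}

// Small self-check: the value survives the round trip unchanged.
void check_capture_round_trip() {
  assert(unpack_capture(pack_capture(42)) == 42);
  assert(unpack_capture(pack_capture(-7)) == -7);
}
```

Using `memcpy` on both sides keeps the round trip independent of byte order, since the same bytes of the slot are written and read.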
|
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
| #
9afe5754 |
| 24-May-2016 |
Alexey Bataev <a.bataev@hotmail.com> |
[OPENMP] Fixed codegen for firstprivate vars in standalone worksharing directives.
If a firstprivate variable is captured by value in an outlined region and then used as a firstprivate variable in an inner worksharing directive, the copy for this firstprivate variable was not created. Fixed this bug.
llvm-svn: 270536
|