#
7d5de9a1 |
| 05-Jan-2016 |
Samuel Antao <sfantao@us.ibm.com> |
[OpenMP] Revert rL256842: [OpenMP] Offloading descriptor registration and device codegen.
It was causing two regressions, so I'm reverting until the cause is found.
llvm-svn: 256858
|
#
4d5f0bbe |
| 05-Jan-2016 |
Samuel Antao <sfantao@us.ibm.com> |
[OpenMP] Offloading descriptor registration and device codegen.
Summary: For offloading to work properly, two things need to be in place:
- A descriptor with all the offloading information (device entry functions and global variables) has to be created by the host and registered in the OpenMP offloading runtime library.
- All the device functions need to be emitted for the device, and a convention has to be in place so that the runtime library can easily map the host ID of an entry point to the actual function on the device.
This patch adds support for these two things. However, only entry functions are being registered, given that the 'declare target' directive is not yet implemented.
About offloading descriptor:
The descriptor is explained in more detail in http://goo.gl/L1rnKJ. Basically, the descriptor has fields that specify the number of devices, pointers to where the device images begin and end (defined by the linker), and pointers to the beginning and end of a table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
  void *addr;
  char *name;
  int64_t size;
};
```
and is emitted in a predetermined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding between entries (like a C array). Code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device, so that the runtime finds corresponding entries at the same index of the host and device tables and can implement the mapping efficiently.
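As a rough illustration of the index-based mapping described above, the sketch below models the host and device tables as plain arrays of `__tgt_offload_entry`. The helper `lookupDeviceEntry` is hypothetical and is not part of this patch or of the runtime library; it only shows why emitting entries in the same order on both sides makes the mapping simple.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Entry layout as given in the commit message.
struct __tgt_offload_entry {
  void *addr;
  char *name;
  int64_t size;
};

// Hypothetical helper (an assumption for illustration): because host and
// device emit their entries in the same order, the index of a host entry in
// the host table is also the index of the matching entry in the device table.
const __tgt_offload_entry *
lookupDeviceEntry(const __tgt_offload_entry *hostBegin,
                  const __tgt_offload_entry *hostEnd,
                  const __tgt_offload_entry *deviceBegin,
                  const void *hostAddr) {
  for (const __tgt_offload_entry *e = hostBegin; e != hostEnd; ++e)
    if (e->addr == hostAddr)
      return deviceBegin + (e - hostBegin); // same index on the device side
  return nullptr; // hostAddr is not a registered entry point
}
```

The linear scan is only for clarity; a real runtime would presumably use a faster lookup, but the index correspondence is the point of the example.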
The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high-priority global initializer so that it always happens before any initializer (which can potentially include target regions) is run.
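The ordering guarantee can be sketched with the GCC/Clang `constructor` attribute, which accepts an optional priority. This is only an analogy under stated assumptions: the priority value 101 is an arbitrary choice for illustration, and `registerDescriptor` and `userInitializer` are hypothetical stand-ins rather than functions emitted by this patch.

```cpp
#include <cassert>

// Plain ints with constant initializers are set up before any constructor
// function runs, so they are safe to use for recording the order.
static int nextSlot = 0;
static int registrationSlot = -1;
static int userInitSlot = -1;

// Hypothetical stand-in for the code that would call __tgt_register_lib.
// A constructor with an explicit priority runs before constructors that
// have no priority (GCC/Clang extension).
__attribute__((constructor(101))) static void registerDescriptor() {
  registrationSlot = nextSlot++;
}

// Hypothetical stand-in for an ordinary initializer that might contain a
// target region and therefore needs the descriptor to be registered already.
__attribute__((constructor)) static void userInitializer() {
  userInitSlot = nextSlot++;
}
```

Both functions run before `main`, so a program can check afterwards that the registration slot precedes the user initializer slot.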
The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.
About target codegen:
The target codegen is fairly straightforward, as it completely reuses the logic of the host version for the same target region. The tricky part is identifying the meaningful target regions on the device side. Unlike other programming models, such as CUDA, there are no already outlined functions with attributes that mark what should be emitted. So, the information on what to emit is passed in the form of metadata in the host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the emission of global declarations is intercepted to check whether it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information is checked. Right now, there is only one form of metadata, which passes information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related to the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}
!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used, meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contains the target region lives.
Entry 3) a unique ID of the file where the input source file that contains the target region lives.
Entry 4) the mangled name of the function that encloses the target region.
Entries 5) and 6) the line and column number where the target region was found.
Entry 7) the order in which the entry was emitted.
Entries 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones).
Entries 5) and 6) are required to distinguish the particular target region in the body of the function (it is possible that a given target region is not an entry point - an if clause can always evaluate to zero - and therefore we need to identify the "interesting" target regions).
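To make the role of each field concrete, here is a minimal sketch of how such entries could be held in memory and queried. The struct `TargetRegionEntry`, its field names, and the function `findEntryOrder` are assumptions made for illustration; they are not the interface of `OffloadEntriesInfoManagerTy`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of one !omp_offload.info entry. The field
// names are illustrative; the field order follows the commit message.
struct TargetRegionEntry {
  uint32_t kind;      // entry 1: 0 means "OpenMP target region"
  uint32_t deviceID;  // entry 2
  uint32_t fileID;    // entry 3
  std::string parent; // entry 4: mangled name of the enclosing function
  uint32_t line;      // entry 5
  uint32_t column;    // entry 6
  uint32_t order;     // entry 7: emission order
};

// Returns the emission order of the target region identified by
// (device, file, enclosing function, line, column), or -1 if that region
// was not recorded as an entry point.
int findEntryOrder(const std::vector<TargetRegionEntry> &entries,
                   uint32_t deviceID, uint32_t fileID,
                   const std::string &parent, uint32_t line,
                   uint32_t column) {
  for (const TargetRegionEntry &e : entries)
    if (e.deviceID == deviceID && e.fileID == fileID && e.parent == parent &&
        e.line == line && e.column == column)
      return static_cast<int>(e.order);
  return -1;
}
```

With the sample metadata above, the region in `_Z3fooi` at line 127, column 11 would map to order 1, while a location that is not listed yields -1 and would be skipped on the device side.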
This patch replaces http://reviews.llvm.org/D12306.
Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao
Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits
Differential Revision: http://reviews.llvm.org/D12614
llvm-svn: 256842
|
#
695890c9 |
| 17-Dec-2015 |
Easwaran Raman <eraman@google.com> |
Attach maximum function count to Module when using PGO mode.
This sets the maximum entry count among all functions in the program to the module using module flags. This allows the optimizer to use this information.
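The quantity being attached is simply the largest entry count over all profiled functions. A minimal sketch, assuming the per-function entry counts are already available as a list (the helper name `maxFunctionCount` is hypothetical and not taken from the patch):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the largest entry count among all functions,
// or 0 when there is no profile data at all.
uint64_t maxFunctionCount(const std::vector<uint64_t> &entryCounts) {
  if (entryCounts.empty())
    return 0;
  return *std::max_element(entryCounts.begin(), entryCounts.end());
}
```

An optimizer can compare a function's own entry count against this maximum to judge how hot the function is relative to the rest of the program.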
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255918
|
#
fd6f92d5 |
| 15-Dec-2015 |
Evgeniy Stepanov <eugeni.stepanov@gmail.com> |
Cross-DSO control flow integrity (Clang part).
Clang-side cross-DSO CFI.
* Adds a command line flag -f[no-]sanitize-cfi-cross-dso.
* Links a runtime library when enabled.
* Emits __cfi_slowpath calls if bitset test fails.
* Emits extra hash-based bitsets for external CFI checks.
* Sets a module flag to enable __cfi_check generation during LTO.
This mode does not yet support diagnostics.
llvm-svn: 255694
|
#
0b17d44f |
| 15-Dec-2015 |
David Majnemer <david.majnemer@gmail.com> |
[WinEH] Update clang to use operand bundles on call sites
This updates clang to use bundle operands to associate an invoke with the funclet which it is contained within.
Depends on D15517.
Differential Revision: http://reviews.llvm.org/D15518
llvm-svn: 255675
|
#
dd4c71ca |
| 12-Dec-2015 |
Easwaran Raman <eraman@google.com> |
Revert r254647.
Reason: The testcase fails on many architectures.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255416
|
#
d547e5e1 |
| 12-Dec-2015 |
Easwaran Raman <eraman@google.com> |
Attach maximum function count to Module when using PGO mode
This sets the maximum entry count among all functions in the program to the module using module flags. This allows the optimizer to use this information.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255397
|
#
953fe036 |
| 05-Dec-2015 |
Reid Kleckner <rnk@google.com> |
Revert "[x86] Exclusion of incorrect include headers paths for MCU target"
This reverts commit r254195.
From the description, I suspect that the wrong patch was committed here, and this is causing assertion failures in EmitDeferred() when the global value ends up being a bitcast of a global.
llvm-svn: 254823
|
#
3e3bb95b |
| 02-Dec-2015 |
George Burgess IV <george.burgess.iv@gmail.com> |
Add the `pass_object_size` attribute to clang.
`pass_object_size` is our way of enabling `__builtin_object_size` to produce high quality results without requiring inlining to happen everywhere.
A link to the design doc for this attribute is available at the Differential review link below.
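A small usage sketch may help show what the attribute is for. It assumes a Clang that supports `pass_object_size`; the macro `PASS_OBJECT_SIZE` and the function `knownSize` are hypothetical names introduced here, and the fallback branch exists only so the snippet still compiles with compilers that lack the attribute.

```cpp
#include <cassert>
#include <cstddef>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

// Use the attribute when the compiler provides it; otherwise expand to
// nothing so the code still builds (an assumption for portability).
#if __has_attribute(pass_object_size)
#define PASS_OBJECT_SIZE(type) __attribute__((pass_object_size(type)))
#else
#define PASS_OBJECT_SIZE(type)
#endif

// With the attribute, the caller passes the object size of the argument as
// a hidden parameter, so __builtin_object_size can use it here even when
// this function is not inlined.
std::size_t knownSize(void *const p PASS_OBJECT_SIZE(0)) {
  return __builtin_object_size(p, 0);
}
```

Called as `char buf[16]; knownSize(buf);`, a compiler with the attribute is expected to report the size of `buf`; without it, and without inlining, `__builtin_object_size` falls back to `(size_t)-1`, which is the situation the attribute is meant to avoid.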
Differential Revision: http://reviews.llvm.org/D13263
llvm-svn: 254554
|
#
5a99c49d |
| 01-Dec-2015 |
Richard Smith <richard-llvm@metafoo.co.uk> |
Fix use-after-free when a C++ thread_local variable gets replaced (because its type changes when the initializer is attached). Don't hold onto the GlobalVariable*; recompute it from the VarDecl* instead.
llvm-svn: 254359
|
Revision tags: llvmorg-3.7.1 |
|
#
2a4db901 |
| 27-Nov-2015 |
Andrey Bokhanko <andreybokhanko@gmail.com> |
[x86] Exclusion of incorrect include headers paths for MCU target
Exclusion of /usr/include and /usr/local/include headers paths for MCU target.
Differential Revision: http://reviews.llvm.org/D14954
llvm-svn: 254195
|
Revision tags: llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
#
f93fff27 |
| 11-Nov-2015 |
Manman Ren <manman.ren@gmail.com> |
[TLS on Darwin] treat all Darwin platforms in the same way.
rdar://problem/9001553
llvm-svn: 252820
|
#
2b90a64e |
| 11-Nov-2015 |
Eric Christopher <echristo@gmail.com> |
Extract out a function onto CodeGenModule for getting the map of features for a particular function, then use it to clean up some code.
llvm-svn: 252819
|
#
68150269 |
| 11-Nov-2015 |
Manman Ren <manman.ren@gmail.com> |
[TLS on Darwin] change how we handle globals with linkonce or weak linkage.
This is about how we handle static members of a template. Before this commit, we used internal linkage for the IR thread-local variable, which is inefficient. With this commit, we start to follow the Itanium C++ ABI.
rdar://problem/23415206
Reviewed by John McCall.
llvm-svn: 252814
|
#
9f5260ab |
| 06-Nov-2015 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
CodeGen: Remove implicit ilist iterator conversions, NFC
Make ilist iterator conversions explicit in clangCodeGen. Eventually I'll remove them everywhere.
llvm-svn: 252358
|
#
0e2d4222 |
| 05-Nov-2015 |
Keno Fischer <kfischer@college.harvard.edu> |
Fix crash in EmitDeclMetadata mode
Summary: This fixes a bug that's easily encountered in LLDB (https://llvm.org/bugs/show_bug.cgi?id=22875). The problem here is that we mangle a name during debug info emission, but never actually emit the actual Decl, so we run into problems in EmitDeclMetadata (which assumes such a Decl exists). Fix that by just skipping metadata emissions for mangled names that don't have associated Decls.
Reviewers: rjmccall
Subscribers: labath, cfe-commits
Differential Revision: http://reviews.llvm.org/D13959
llvm-svn: 252229
|
#
756447a6 |
| 30-Oct-2015 |
Tim Northover <tnorthover@apple.com> |
Watch and TV OS: wire up basic ABI choices
This sets the mostly expected Darwin default ABI options for these two platforms. Active changes from these defaults for watchOS are in a later patch.
llvm-svn: 251708
|
#
b04ecb75 |
| 21-Oct-2015 |
John McCall <rjmccall@apple.com> |
Unify the ObjC entrypoint caches.
llvm-svn: 250918
|
#
c2d2b425 |
| 15-Oct-2015 |
Benjamin Kramer <benny.kra@googlemail.com> |
[CodeGen] Remove dead code. NFC.
llvm-svn: 250418
|
#
aec6b2c2 |
| 08-Oct-2015 |
Akira Hatanaka <ahatanaka@apple.com> |
[CodeGen] Attach function attributes to functions created in CGBlocks.cpp.
This commit fixes a bug in clang's code-gen where it creates the following functions but doesn't attach function attributes to them:
__copy_helper_block_
__destroy_helper_block_
__Block_byref_object_copy_
__Block_byref_object_dispose_
rdar://problem/20828324
Differential Revision: http://reviews.llvm.org/D13525
llvm-svn: 249735
|
#
200500d6 |
| 08-Oct-2015 |
Akira Hatanaka <ahatanaka@apple.com> |
[CodeGen] Check if the Decl pointer passed is null, and if so, return early.
[CodeGen] Check if the Decl pointer passed is null, and if so, return early.
This is needed in a patch I plan to commit later, in which a null Decl pointer is passed to SetLLVMFunctionAttributesForDefinition.
Relevant discussion is in http://reviews.llvm.org/D13525.
llvm-svn: 249722
show more ...
|
#
3f02150d |
| 08-Oct-2015 |
David Majnemer <david.majnemer@gmail.com> |
[MSVC Compat] Enable ABI impacting non-conforming behavior independently of -fms-compatibility
[MSVC Compat] Enable ABI impacting non-conforming behavior independently of -fms-compatibility
No ABI for C++ currently makes it possible to implement the standard 100% perfectly. We wrongly hid some of our compatible behavior behind -fms-compatibility instead of tying it to the compiler ABI.
llvm-svn: 249656
show more ...
|
#
ed1fe5d0 |
| 03-Oct-2015 |
Yaron Keren <yaron.keren@gmail.com> |
Replace double-negated !SourceLocation.isInvalid() with SourceLocation.isValid().
llvm-svn: 249228
|
#
c005cc06 |
| 27-Sep-2015 |
Craig Topper <craig.topper@gmail.com> |
Use llvm::makeArrayRef. NFC.
llvm-svn: 248678
|
#
510d7c71 |
| 21-Sep-2015 |
Akira Hatanaka <ahatanaka@apple.com> |
Remove attributes minsize and optsize, which conflict with optnone.
Remove attributes minsize and optsize, which conflict with optnone.
This commit fixes an assert that is triggered when optnone is being added to an IR function that is already marked with minsize and optsize.
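The rule can be sketched with a function's attributes modeled as a set of strings. This is a simplified model under stated assumptions, not the Clang or LLVM attribute API; the helper name `addOptNone` is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model: a function's attributes as a set of names. Adding
// "optnone" first drops the size-optimization attributes that conflict
// with it, so the resulting set is never contradictory.
void addOptNone(std::set<std::string> &attrs) {
  attrs.erase("minsize");
  attrs.erase("optsize");
  attrs.insert("optnone");
}
```

Unrelated attributes are left untouched, so only the conflicting pair is affected.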
rdar://problem/22723716
Differential Revision: http://reviews.llvm.org/D13004
llvm-svn: 248191
show more ...
|