#
74c10e32 |
| 09-Jul-2018 |
Craig Topper <craig.topper@intel.com> |
[Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 instrinsics with it
This is part of an ongoing attempt at
[Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 instrinsics with it
This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.
Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.
To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future.
To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.
To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.
To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.
There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch.
Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.
Differential Revision: https://reviews.llvm.org/D48617
llvm-svn: 336583
show more ...
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3 |
|
#
1a83d067 |
| 07-Jun-2018 |
Gabor Buella <gabor.buella@intel.com> |
[CodeGen] Improve diagnostics related to target attributes
Summary: When requirement imposed by __target__ attributes on functions are not satisfied, prefer printing those requirements, which are ex
[CodeGen] Improve diagnostics related to target attributes
Summary: When requirement imposed by __target__ attributes on functions are not satisfied, prefer printing those requirements, which are explicitly mentioned in the attributes.
This makes such messages more useful, e.g. printing avx512f instead of avx2 in the following scenario:
``` $ cat foo.c static inline void __attribute__((__always_inline__, __target__("avx512f"))) x(void) { }
int main(void) { x(); } $ clang foo.c foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2' x(); ^ 1 error generated. ```
bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338
Reviewers: craig.topper, echristo, dblaikie
Reviewed By: craig.topper, echristo
Differential Revision: https://reviews.llvm.org/D46541
llvm-svn: 334174
show more ...
|
Revision tags: llvmorg-6.0.1-rc2 |
|
#
274506da |
| 03-May-2018 |
Craig Topper <craig.topper@intel.com> |
[CodeGenFunction] Use the StringRef::split function that takes a char separator instead of StringRef separator. NFC
The char separator version should be a little better optimized.
llvm-svn: 331482
|
#
f437e356 |
| 17-Apr-2018 |
Keith Wyss <wyssman@gmail.com> |
[XRay] Add clang builtin for xray typed events.
Summary: A clang builtin for xray typed events. Differs from __xray_customevent(...) by the presence of a type tag that is vended by compiler-rt in ty
[XRay] Add clang builtin for xray typed events.
Summary: A clang builtin for xray typed events. Differs from __xray_customevent(...) by the presence of a type tag that is vended by compiler-rt in typical usage. This allows xray handlers to expand logged events with their type description and plugins to process traced events based on type.
This change depends on D45633 for the intrinsic definition.
Reviewers: dberris, pelikan, rnk, eizan
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D45716
llvm-svn: 330220
show more ...
|
Revision tags: llvmorg-6.0.1-rc1 |
|
#
94ece8fb |
| 16-Apr-2018 |
Brock Wyma <brock.wyma@intel.com> |
[CodeView] Initial support for emitting S_THUNK32 symbols for compiler...
When emitting CodeView debug information, compiler-generated thunk routines should be emitted using S_THUNK32 symbols instea
[CodeView] Initial support for emitting S_THUNK32 symbols for compiler...
When emitting CodeView debug information, compiler-generated thunk routines should be emitted using S_THUNK32 symbols instead of S_GPROC32_ID symbols so Visual Studio can properly step into the user code. This initial support only handles standard thunk ordinals.
Differential Revision: https://reviews.llvm.org/D43838
llvm-svn: 330132
show more ...
|
#
1ba9d9c6 |
| 13-Apr-2018 |
Andrey Konovalov <andreyknvl@google.com> |
hwasan: add -fsanitize=kernel-hwaddress flag
This patch adds -fsanitize=kernel-hwaddress flag, that essentially enables -hwasan-kernel=1 -hwasan-recover=1 -hwasan-match-all-tag=0xff.
Differential R
hwasan: add -fsanitize=kernel-hwaddress flag
This patch adds -fsanitize=kernel-hwaddress flag, that essentially enables -hwasan-kernel=1 -hwasan-recover=1 -hwasan-match-all-tag=0xff.
Differential Revision: https://reviews.llvm.org/D45046
llvm-svn: 330044
show more ...
|
#
488f7c2b |
| 13-Apr-2018 |
Dean Michael Berris <dberris@google.com> |
[XRay][clang] Add flag to choose instrumentation bundles
Summary: This change addresses http://llvm.org/PR36926 by allowing users to pick which instrumentation bundles to use, when instrumenting wit
[XRay][clang] Add flag to choose instrumentation bundles
Summary: This change addresses http://llvm.org/PR36926 by allowing users to pick which instrumentation bundles to use, when instrumenting with XRay. In particular, the flag `-fxray-instrumentation-bundle=` has four valid values:
- `all`: the default, emits all instrumentation kinds - `none`: equivalent to -fnoxray-instrument - `function`: emits the entry/exit instrumentation - `custom`: emits the custom event instrumentation
These can be combined either as comma-separated values, or as repeated flag values.
Reviewers: echristo, kpw, eizan, pelikan
Reviewed By: pelikan
Subscribers: mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D44970
llvm-svn: 329985
show more ...
|
#
69a2e18b |
| 09-Apr-2018 |
Vitaly Buka <vitalybuka@google.com> |
asan: kernel: make no_sanitize("address") attribute work with -fsanitize=kernel-address
Summary: Right now to disable -fsanitize=kernel-address instrumentation, one needs to use no_sanitize("kernel-
asan: kernel: make no_sanitize("address") attribute work with -fsanitize=kernel-address
Summary: Right now to disable -fsanitize=kernel-address instrumentation, one needs to use no_sanitize("kernel-address"). Make either no_sanitize("address") or no_sanitize("kernel-address") disable both ASan and KASan instrumentation. Also remove redundant test.
Patch by Andrey Konovalov
Reviewers: eugenis, kcc, glider, dvyukov, vitalybuka
Reviewed By: eugenis, vitalybuka
Differential Revision: https://reviews.llvm.org/D44981
llvm-svn: 329612
show more ...
|
#
e55aa03a |
| 03-Apr-2018 |
Vlad Tsyrklevich <vlad@tsyrklevich.net> |
Add the -fsanitize=shadow-call-stack flag
Summary: Add support for the -fsanitize=shadow-call-stack flag which causes clang to add ShadowCallStack attribute to functions compiled with that flag enab
Add the -fsanitize=shadow-call-stack flag
Summary: Add support for the -fsanitize=shadow-call-stack flag which causes clang to add ShadowCallStack attribute to functions compiled with that flag enabled.
Reviewers: pcc, kcc
Reviewed By: pcc, kcc
Subscribers: cryptoad, cfe-commits, kcc
Differential Revision: https://reviews.llvm.org/D44801
llvm-svn: 329122
show more ...
|
Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2 |
|
#
5317f2e4 |
| 23-Mar-2018 |
Matt Morehouse <mascasa@google.com> |
[libFuzzer] Use OptForFuzzing attribute with -fsanitize=fuzzer.
Summary: Disables certain CMP optimizations to improve fuzzing signal under -O1 and -O2.
Switches all fuzzer tests to -O2 except for
[libFuzzer] Use OptForFuzzing attribute with -fsanitize=fuzzer.
Summary: Disables certain CMP optimizations to improve fuzzing signal under -O1 and -O2.
Switches all fuzzer tests to -O2 except for a few leak tests where the leak is optimized out under -O2.
Reviewers: kcc, vitalybuka
Reviewed By: vitalybuka
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D44798
llvm-svn: 328384
show more ...
|
Revision tags: llvmorg-5.0.2-rc1 |
|
#
d3dcf2f0 |
| 14-Mar-2018 |
Gheorghe-Teodor Bercea <gheorghe-teod.bercea@ibm.com> |
[OpenMP] Add OpenMP data sharing infrastructure using global memory
Summary: This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure.
TODO: add a more detailed
[OpenMP] Add OpenMP data sharing infrastructure using global memory
Summary: This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure.
TODO: add a more detailed description.
Reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43660
llvm-svn: 327513
show more ...
|
#
886b4505 |
| 02-Mar-2018 |
Manoj Gupta <manojgupta@google.com> |
Do not generate calls to fentry with __attribute__((no_instrument_function))
Summary: Currently only calls to mcount were suppressed with no_instrument_function attribute. Linux kernel requires that
Do not generate calls to fentry with __attribute__((no_instrument_function))
Summary: Currently only calls to mcount were suppressed with no_instrument_function attribute. Linux kernel requires that calls to fentry should also not be generated. This is an extended fix for PR PR33515.
Reviewers: hfinkel, rengolin, srhines, rnk, rsmith, rjmccall, hans
Reviewed By: rjmccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43995
llvm-svn: 326639
show more ...
|
Revision tags: llvmorg-6.0.0 |
|
#
00f70bd9 |
| 01-Mar-2018 |
George Burgess IV <george.burgess.iv@gmail.com> |
Remove redundant casts. NFC
So I wrote a clang-tidy check to lint out redundant `isa`, `cast`, and `dyn_cast`s for fun. This is a portion of what it found for clang; I plan to do similar cleanups in
Remove redundant casts. NFC
So I wrote a clang-tidy check to lint out redundant `isa`, `cast`, and `dyn_cast`s for fun. This is a portion of what it found for clang; I plan to do similar cleanups in LLVM and other subprojects when I find time.
Because of the volume of changes, I explicitly avoided making any change that wasn't highly local and obviously correct to me (e.g. we still have a number of foo(cast<Bar>(baz)) that I didn't touch, since overloading is a thing and the cast<Bar> did actually change the type -- just up the class hierarchy).
I also tried to leave the types we were cast<>ing to somewhere nearby, in cases where it wasn't locally obvious what we were dealing with before.
llvm-svn: 326416
show more ...
|
Revision tags: llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2 |
|
#
891af03a |
| 03-Feb-2018 |
Sander de Smalen <sander.desmalen@arm.com> |
Recommit rL323952: [DebugInfo] Enable debug information for C99 VLA types.
Fixed build issue when building with g++-4.8 (specialization after instantiation).
llvm-svn: 324173
|
#
4e9a1264 |
| 01-Feb-2018 |
Sander de Smalen <sander.desmalen@arm.com> |
Reverting patch rL323952 due to build errors that I haven't encountered in local builds.
llvm-svn: 323956
|
#
17c4633e |
| 01-Feb-2018 |
Sander de Smalen <sander.desmalen@arm.com> |
[DebugInfo] Enable debug information for C99 VLA types
Summary: This patch enables debugging of C99 VLA types by generating more precise LLVM Debug metadata, using the extended DISubrange 'count' fi
[DebugInfo] Enable debug information for C99 VLA types
Summary: This patch enables debugging of C99 VLA types by generating more precise LLVM Debug metadata, using the extended DISubrange 'count' field that takes a DIVariable. This should implement: Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's) https://bugs.llvm.org/show_bug.cgi?id=30553
Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie
Reviewed By: aprantl
Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D41698
llvm-svn: 323952
show more ...
|
Revision tags: llvmorg-6.0.0-rc1 |
|
#
5cdf0383 |
| 12-Jan-2018 |
John McCall <rjmccall@apple.com> |
Allocate and access NormalCleanupDest with the natural alignment of i32.
This alignment can be less than 4 on certain embedded targets, which may not even be able to deal with 4-byte alignment on th
Allocate and access NormalCleanupDest with the natural alignment of i32.
This alignment can be less than 4 on certain embedded targets, which may not even be able to deal with 4-byte alignment on the stack.
Patch by Jacob Young!
llvm-svn: 322406
show more ...
|
#
57cc1a5d |
| 09-Jan-2018 |
Oren Ben Simhon <oren.ben.simhon@intel.com> |
Added Control Flow Protection Flag
Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc. For example in X86 this fla
Added Control Flow Protection Flag
Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc. For example in X86 this flag will be used to instrument Indirect Branch Tracking instructions.
Differential Revision: https://reviews.llvm.org/D40478
Change-Id: I5126e766c0e6b84118cae0ee8a20fe78cc373dea llvm-svn: 322063
show more ...
|
#
281d20b6 |
| 08-Jan-2018 |
Erich Keane <erich.keane@intel.com> |
Implement Attribute Target MultiVersioning
GCC's attribute 'target', in addition to being an optimization hint, also allows function multiversioning. We currently have the former implemented, this i
Implement Attribute Target MultiVersioning
GCC's attribute 'target', in addition to being an optimization hint, also allows function multiversioning. We currently have the former implemented, this is the latter's implementation.
This works by enabling functions with the same name/signature to coexist, so that they can all be emitted. Multiversion state is stored in the FunctionDecl itself, and SemaDecl manages the definitions. Note that it ends up having to permit redefinition of functions so that they can all be emitted. Additionally, all versions of the function must be emitted, so this also manages that.
Note that this includes some additional rules that GCC does not, since defining something as a MultiVersion function after a usage has been made illegal.
The only 'history rewriting' that happens is if a function is emitted before it has been converted to a multiversion'ed function, at which point its name needs to be changed.
Function templates and virtual functions are NOT yet supported (not supported in GCC either).
Additionally, constructors/destructors are disallowed, but the former is planned.
llvm-svn: 322028
show more ...
|
#
8c85bca5 |
| 05-Jan-2018 |
Stephan Bergmann <sbergman@redhat.com> |
No -fsanitize=function warning when calling noexcept function through non-noexcept pointer in C++17
As discussed in the mail thread <https://groups.google.com/a/isocpp.org/forum/ #!topic/std-discuss
No -fsanitize=function warning when calling noexcept function through non-noexcept pointer in C++17
As discussed in the mail thread <https://groups.google.com/a/isocpp.org/forum/ #!topic/std-discussion/T64_dW3WKUk> "Calling noexcept function throug non- noexcept pointer is undefined behavior?", such a call should not be UB. However, Clang currently warns about it.
This change removes exception specifications from the function types recorded for -fsanitize=function, both in the functions themselves and at the call sites. That means that calling a non-noexcept function through a noexcept pointer will also not be flagged as UB. In the review of this change, that was deemed acceptable, at least for now. (See the "TODO" in compiler-rt test/ubsan/TestCases/TypeCheck/Function/function.cpp.)
To remove exception specifications from types, the existing internal ASTContext::getFunctionTypeWithExceptionSpec was made public, and some places otherwise unrelated to this change have been adapted to call it, too.
This is the cfe part of a patch covering both cfe and compiler-rt.
Differential Revision: https://reviews.llvm.org/D40720
llvm-svn: 321859
show more ...
|
#
06f19a0d |
| 02-Jan-2018 |
Reid Kleckner <rnk@google.com> |
[WinEH] Allow for multiple terminatepads
Fixes verifier errors with Windows EH and OpenMP, which injects a terminate scope around parallel blocks.
Fixes PR35778
llvm-svn: 321676
|
#
cb8c0098 |
| 16-Dec-2017 |
Sanjay Patel <spatel@rotateright.com> |
[Driver, CodeGen] pass through and apply -fassociative-math
There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:
1. In the driver/frontend, we accept the fl
[Driver, CodeGen] pass through and apply -fassociative-math
There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:
1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math. This was mostly already done - we just need to translate the flag as a codegen option. The test file is complicated because there are many potential combinations of flags here. Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math.
2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and corresponding test.
For the motivating example from PR27372:
float foo(float a, float x) { return ((a + x) - x); }
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math -emit-llvm | egrep 'fadd|fsub' %add = fadd nnan ninf nsz arcp contract float %0, %1 %sub = fsub nnan ninf nsz arcp contract float %add, %2
So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch). This case now works as expected end-to-end although the underlying logic is still wrong:
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math | grep xmm addss %xmm1, %xmm0 subss %xmm1, %xmm0
We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example:
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm | grep fadd %add = fadd reassoc float %0, %1
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math | grep xmm addss %xmm1, %xmm0 subss %xmm1, %xmm0
Differential Revision: https://reviews.llvm.org/D39812
llvm-svn: 320920
show more ...
|
#
b952e639 |
| 15-Dec-2017 |
Alexey Bataev <a.bataev@hotmail.com> |
[OPENMP] Codegen `declare simd` for function declarations.
Previously the attributes were emitted only for function definitions. Patch adds emission of the attributes for function declarations.
llv
[OPENMP] Codegen `declare simd` for function declarations.
Previously the attributes were emitted only for function definitions. Patch adds emission of the attributes for function declarations.
llvm-svn: 320826
show more ...
|
#
12817e59 |
| 09-Dec-2017 |
Evgeniy Stepanov <eugeni.stepanov@gmail.com> |
Hardware-assisted AddressSanitizer (clang part).
Summary: Driver, frontend and LLVM codegen for HWASan. A clone of ASan, basically.
Reviewers: kcc, pcc, alekseyshl
Subscribers: srhines, javed.absa
Hardware-assisted AddressSanitizer (clang part).
Summary: Driver, frontend and LLVM codegen for HWASan. A clone of ASan, basically.
Reviewers: kcc, pcc, alekseyshl
Subscribers: srhines, javed.absar, cfe-commits
Differential Revision: https://reviews.llvm.org/D40936
llvm-svn: 320232
show more ...
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3 |
|
#
1a5b10d5 |
| 30-Nov-2017 |
Dean Michael Berris <dberris@google.com> |
[XRay][clang] Introduce -fxray-always-emit-customevents
Summary: The -fxray-always-emit-customevents flag instructs clang to always emit the LLVM IR for calls to the `__xray_customevent(...)` built-
[XRay][clang] Introduce -fxray-always-emit-customevents
Summary: The -fxray-always-emit-customevents flag instructs clang to always emit the LLVM IR for calls to the `__xray_customevent(...)` built-in function. The default behaviour currently respects whether the function has an `[[clang::xray_never_instrument]]` attribute, and thus not lower the appropriate IR code for the custom event built-in.
This change allows users calling through to the `__xray_customevent(...)` built-in to always see those calls lowered to the corresponding LLVM IR to lay down instrumentation points for these custom event calls.
Using this flag enables us to emit even just the user-provided custom events even while never instrumenting the start/end of the function where they appear. This is useful in cases where "phase markers" using __xray_customevent(...) can have very few instructions, must never be instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
llvm-svn: 319388
show more ...
|