#
5ab7c285 |
| 27-Aug-2023 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Modernize PeepholeProtection (NFC)
|
#
7a41af86 |
| 24-Aug-2023 |
Fangrui Song <i@maskray.me> |
[X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute
GCC 12 (https://gcc.gnu.org/PR101696) allows `arch=x86-64` `arch=x86-64-v2` `arch=x86-64-v3` `arch=x86-64-v4` in the target_clones
[X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute
GCC 12 (https://gcc.gnu.org/PR101696) allows `arch=x86-64` `arch=x86-64-v2` `arch=x86-64-v3` `arch=x86-64-v4` in the target_clones function attribute. This patch ports the feature.
* Set KeyFeature to `x86-64{,-v2,-v3,-v4}` in `Processors[]`, to be used by X86TargetInfo::multiVersionSortPriority * builtins: change `__cpu_features2` to an array like libgcc. Define `FEATURE_X86_64_{BASELINE,V2,V3,V4}` and depended ISA feature bits. * CGBuiltin.cpp: update EmitX86CpuSupports to handle `arch=x86-64*`.
Close https://github.com/llvm/llvm-project/issues/55830
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D158329
show more ...
|
#
30c60ec5 |
| 23-Aug-2023 |
Manna, Soumi <soumi.manna@intel.com> |
[NFC][CLANG] Fix static analyzer bugs about large copy by values
Static Analyzer Tool complains about a large function call parameter which is is passed by value in CGBuiltin.cpp file.
1. In CodeGe
[NFC][CLANG] Fix static analyzer bugs about large copy by values
Static Analyzer Tool complains about a large function call parameter which is is passed by value in CGBuiltin.cpp file.
1. In CodeGenFunction::EmitSMELdrStr(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.
2. In CodeGenFunction::EmitSMEZero(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.
3. In CodeGenFunction::EmitSMEReadWrite(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.
4. In CodeGenFunction::EmitSMELd1St1(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.
I see many places in CGBuiltin.cpp file, we are passing parameter TypeFlags of type clang::SVETypeFlags by reference.
clang::SVETypeFlags inherits several other types.
This patch passes parameter TypeFlags by reference instead of by value in the function.
Reviewed By: tahonermann, sdesmalen
Differential Revision: https://reviews.llvm.org/D158522
show more ...
|
#
c4672454 |
| 14-Aug-2023 |
Chuanqi Xu <yedeng.yd@linux.alibaba.com> |
[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty
Close https://github.com/llvm/llvm-project/issues/56301 Close https://github.com/llvm/llvm-project/issues/64151
See t
[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty
Close https://github.com/llvm/llvm-project/issues/56301 Close https://github.com/llvm/llvm-project/issues/64151
See the summary and the discussion of https://reviews.llvm.org/D157070 to get the full context.
As @rjmccall pointed out, the key point of the root cause is that currently we didn't implement the semantics for '@llvm.coro.save' well ("after the await-ready returns false, the coroutine is considered to be suspended ") well. Since the semantics implies that we (the compiler) shouldn't write the spills into the coroutine frame in the await_suspend. But now it is possible due to some combinations of the optimizations so the semantics are broken. And the inlining is the root optimization of such optimizations. So in this patch, we tried to add the `noinline` attribute to the await_suspend call.
Also as an optimization, we don't add the `noinline` attribute to the await_suspend call if the awaiter is an empty class. This should be correct since the programmers can't access the local variables in await_suspend if the awaiter is empty. I think this is necessary for the performance since it is pretty common.
Another potential optimization is:
call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle, ptr @awaitSuspendFn)
Then it is much easier to perform the safety analysis in the middle end. If it is safe to inline the call to awaitSuspend, we can replace it in the CoroEarly pass. Otherwise we could replace it in the CoroSplit pass.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D157833
show more ...
|
#
ea8d3b1f |
| 09-Aug-2023 |
wanglei <wanglei@loongson.cn> |
[Clang][LoongArch] Use the ClangBuiltin class to automatically generate support for CBE and CFE
Fixed the type modifier (L->W), removed redundant feature checking code since the feature has already
[Clang][LoongArch] Use the ClangBuiltin class to automatically generate support for CBE and CFE
Fixed the type modifier (L->W), removed redundant feature checking code since the feature has already been checked in `EmitBuiltinExpr`. And Cleaned up unused diagnostic information.
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D156866
show more ...
|
#
27dab4d3 |
| 23-Jun-2023 |
Amy Huang <akhuang@google.com> |
Reland "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."t
This reverts commit 8ed7aa59f489715d39d32e72a787b8e75cfda151.
Differential Revision: https://revi
Reland "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."t
This reverts commit 8ed7aa59f489715d39d32e72a787b8e75cfda151.
Differential Revision: https://reviews.llvm.org/D154007
show more ...
|
#
f225898a |
| 20-Jul-2023 |
Bryan Chan <bryan.chan@huawei.com> |
[Clang][AArch64][SME] Add intrinsics for ZA array load/store (LDR/STR)
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html
[Clang][AArch64][SME] Add intrinsics for ZA array load/store (LDR/STR)
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html):
- svldr_vnum_za - svstr_vnum_za
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D134678
show more ...
|
#
578b0bd4 |
| 20-Jul-2023 |
Bryan Chan <bryan.chan@huawei.com> |
[Clang][AArch64][SME] Add ZA zeroing intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html):
- svzero_mask_
[Clang][AArch64][SME] Add ZA zeroing intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html):
- svzero_mask_za - svzero_za
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D134677
show more ...
|
#
6dc94c54 |
| 20-Jul-2023 |
Bryan Chan <bryan.chan@huawei.com> |
[Clang][AArch64][SME] Add vector read/write (mova) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html):
- s
[Clang][AArch64][SME] Add vector read/write (mova) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html):
- svread_hor_za8[_s8]_m // also for u8 - svread_hor_za16[_s16]_m // also for u16, f16, bf16 - svread_hor_za32[_s32]_m // also for u32, f32 - svread_hor_za64[_s64]_m // also for u64, f64 - svread_hor_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 - svread_ver_za8[_s8]_m // also for u8 - svread_ver_za16[_s16]_m // also for u16, f16, bf16 - svread_ver_za32[_s32]_m // also for u32, f32 - svread_ver_za64[_s64]_m // also for u64, f64 - svread_ver_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 - svwrite_hor_za8[_s8]_m // also for u8 - svwrite_hor_za16[_s16]_m // also for u16, f16, bf16 - svwrite_hor_za32[_s32]_m // also for u32, f32 - svwrite_hor_za64[_s64]_m // also for u64, f64 - svwrite_hor_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 - svwrite_ver_za8[_s8]_m // also for u8 - svwrite_ver_za16[_s16]_m // also for u16, f16, bf16 - svwrite_ver_za32[_s32]_m // also for u32, f32 - svwrite_ver_za64[_s64]_m // also for u64, f64 - svwrite_ver_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen, kmclaughlin
Differential Revision: https://reviews.llvm.org/D128648
show more ...
|
#
bac2a075 |
| 03-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
clang: Attach !fpmath metadata to __builtin_sqrt based on language flags
OpenCL and HIP have -cl-fp32-correctly-rounded-divide-sqrt and -fno-hip-correctly-rounded-divide-sqrt. The corresponding fpma
clang: Attach !fpmath metadata to __builtin_sqrt based on language flags
OpenCL and HIP have -cl-fp32-correctly-rounded-divide-sqrt and -fno-hip-correctly-rounded-divide-sqrt. The corresponding fpmath metadata was only set on fdiv, and not sqrt. The backend is currently underutilizing sqrt lowering options, and the responsibility is split between the libraries and backend and this metadata is needed.
CUDA/NVCC has -prec-div and -prev-sqrt but clang doesn't appear to be aiming for compatibility with those. Don't know if OpenMP has a similar control.
show more ...
|
#
227012cb |
| 12-Jul-2023 |
Akash Banerjee <Akash.Banerjee@amd.com> |
[OpenMP] Migrate device code privatisation from Clang CodeGen to OMPIRBuilder
This patch migrates the UseDevicePtr and UseDeviceAddr clause related code for handling privatisation from Clang codegen
[OpenMP] Migrate device code privatisation from Clang CodeGen to OMPIRBuilder
This patch migrates the UseDevicePtr and UseDeviceAddr clause related code for handling privatisation from Clang codegen to the OMPIRBuilder
Depends on D150860
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D152554
show more ...
|
#
5942ae86 |
| 27-Jun-2023 |
Sindhu Chittireddy <sindhu.chittireddy@intel.com> |
[NFC] Initialize class member pointers to nullptr.
Reviewed here: https://reviews.llvm.org/D153926
|
#
13888870 |
| 22-Jun-2023 |
Doru Bercea <doru.bercea@amd.com> |
Enable dynamic-sized VLAs for data sharing in OpenMP offloaded target regions.
Review: https://reviews.llvm.org/D153883
|
#
eb61bde8 |
| 02-Jul-2023 |
Dave Pagan <dave.pagan@amd.com> |
[OpenMP][CodeGen] Add codegen for combined 'loop' directives.
The loop directive is a descriptive construct which allows the compiler flexibility in how it generates code for the directive's associa
[OpenMP][CodeGen] Add codegen for combined 'loop' directives.
The loop directive is a descriptive construct which allows the compiler flexibility in how it generates code for the directive's associated loop(s). See OpenMP specification 5.2 [257:8-9].
Codegen added in this patch for the combined 'loop' directives are:
'target teams loop' -> 'target teams distribute parallel for' 'teams loop' -> 'teams distribute parallel for' 'target parallel loop' -> 'target parallel for' 'parallel loop' -> 'parallel for'
NOTE: The implementation of the 'loop' directive itself is unchanged.
Differential Revision: https://reviews.llvm.org/D145823
show more ...
|
#
23489022 |
| 24-Jun-2023 |
Sergei Barannikov <barannikov88@gmail.com> |
[clang][CodeGen] Remove no-op EmitCastToVoidPtr (NFC)
Reviewed By: JOE1994
Differential Revision: https://reviews.llvm.org/D153694
|
#
8ed7aa59 |
| 22-Jun-2023 |
Amy Huang <akhuang@google.com> |
Revert "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."
Causes a clang crash (see crbug.com/1457256).
This reverts commit 015049338d7e8e0e81f2ad2f94e5a43e
Revert "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."
Causes a clang crash (see crbug.com/1457256).
This reverts commit 015049338d7e8e0e81f2ad2f94e5a43e2e3f5220.
show more ...
|
#
01504933 |
| 07-Nov-2022 |
Amy Huang <akhuang@google.com> |
Try to implement lambdas with inalloca parameters by forwarding without use of inallocas.
Differential Revision: https://reviews.llvm.org/D137872
|
#
9f6250f5 |
| 15-May-2023 |
Bryan Chan <bryan.chan@huawei.com> |
[Clang][AArch64][SME] Add vector load/store (ld1/st1) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html):
[Clang][AArch64][SME] Add vector load/store (ld1/st1) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined in https://arm-software.github.io/acle/main/acle.html):
- svld1_hor_za8 // also for _za16, _za32, _za64 and _za128 - svld1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128 - svld1_ver_za8 // also for _za16, _za32, _za64 and _za128 - svld1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128 - svst1_hor_za8 // also for _za16, _za32, _za64 and _za128 - svst1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128 - svst1_ver_za8 // also for _za16, _za32, _za64 and _za128 - svst1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
SveEmitter.cpp is extended to generate arm_sme.h (currently named arm_sme_draft_spec_subject_to_change.h) and other SME definitions from arm_sme.td, which is modeled after arm_sve.td. Common TableGen definitions are moved into arm_sve_sme_incl.td.
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen, kmclaughlin
Differential Revision: https://reviews.llvm.org/D127910
show more ...
|
#
46f36649 |
| 20-May-2023 |
Fangrui Song <i@maskray.me> |
-fsanitize=function: use type hashes instead of RTTI objects
Currently we use RTTI objects to check type compatibility. To support non-unique RTTI objects, commit 5745eccef54ddd3caca278d1d292a88b228
-fsanitize=function: use type hashes instead of RTTI objects
Currently we use RTTI objects to check type compatibility. To support non-unique RTTI objects, commit 5745eccef54ddd3caca278d1d292a88b2281528b added a `checkTypeInfoEquality` string matching to the runtime. The scheme is inefficient.
``` _Z1fv: .long 846595819 # jmp .long .L__llvm_rtti_proxy-_Z3funv ...
main: ... # Load the second word (pointer to the RTTI object) and dereference it. movslq 4(%rsi), %rax movq (%rax,%rsi), %rdx # Is it the desired typeinfo object? leaq _ZTIFvvE(%rip), %rax # If not, call __ubsan_handle_function_type_mismatch_v1, which may recover if checkTypeInfoEquality allows cmpq %rax, %rdx jne .LBB1_2 ...
.section .data.rel.ro,"aw",@progbits .p2align 3, 0x0 .L__llvm_rtti_proxy: .quad _ZTIFvvE ```
Let's replace the indirect `_ZTI` pointer with a type hash similar to `-fsanitize=kcfi`.
``` _Z1fv: .long 3238382334 .long 2772461324 # type hash
main: ... # Load the second word (callee type hash) and check whether it is expected cmpl $-1522505972, -4(%rax) # If not, fail: call __ubsan_handle_function_type_mismatch jne .LBB2_2 ```
The RTTI object derives its name from `clang::MangleContext::mangleCXXRTTI`, which uses `mangleType`. `mangleTypeName` uses `mangleType` as well. So the type compatibility change is high-fidelity.
Since we no longer need RTTI pointers in `__ubsan::__ubsan_handle_function_type_mismatch_v1`, let's switch it back to version 0, the original signature before e215996a2932ed7c472f4e94dc4345b30fd0c373 (2019). `__ubsan::__ubsan_handle_function_type_mismatch_abort` is not recoverable, so we can revert some changes from e215996a2932ed7c472f4e94dc4345b30fd0c373.
Reviewed By: samitolvanen
Differential Revision: https://reviews.llvm.org/D148785
show more ...
|
#
64549f09 |
| 03-Apr-2023 |
Rafael A. Herrera Guaitero <randres2011@gmail.com> |
[OpenMP][5.1] Fix parallel masked is ignored #59939
Code generation support for 'parallel masked' directive.
The `EmitOMPParallelMaskedDirective` was implemented. In addition, the appropiate device
[OpenMP][5.1] Fix parallel masked is ignored #59939
Code generation support for 'parallel masked' directive.
The `EmitOMPParallelMaskedDirective` was implemented. In addition, the appropiate device functions were added.
Fix #59939.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D143527
show more ...
|
#
ce7eb2e0 |
| 23-Feb-2023 |
Wei Wang <apollo.mobility@gmail.com> |
[Coroutines] Avoid creating conditional cleanup markers in suspend block
We shouldn't access coro frame after returning from `await_suspend()` and before `llvm.coro.suspend()`. Make sure we always h
[Coroutines] Avoid creating conditional cleanup markers in suspend block
We shouldn't access coro frame after returning from `await_suspend()` and before `llvm.coro.suspend()`. Make sure we always hoist conditional cleanup markers when inside the `await.suspend` block.
Fix https://github.com/llvm/llvm-project/issues/59181
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D144680
show more ...
|
#
57865bc5 |
| 26-Jan-2023 |
Akira Hatanaka <ahatanaka@apple.com> |
[CodeGen] Add a flag to `Address` and `Lvalue` that is used to keep track of whether the pointer is known not to be null
The flag will be used for the arm64e work we plan to upstream in the future (
[CodeGen] Add a flag to `Address` and `Lvalue` that is used to keep track of whether the pointer is known not to be null
The flag will be used for the arm64e work we plan to upstream in the future (see https://lists.llvm.org/pipermail/llvm-dev/2019-October/136091.html). Currently the flag has no effect on code generation.
Differential Revision: https://reviews.llvm.org/D142584
show more ...
|
#
6ad0788c |
| 14-Jan-2023 |
Kazu Hirata <kazu@google.com> |
[clang] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h".
This is p
[clang] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h".
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
#
a1580d7b |
| 14-Jan-2023 |
Kazu Hirata <kazu@google.com> |
[clang] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>.
I'll post a separate patch to actually replace llvm::Option
[clang] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>.
I'll post a separate patch to actually replace llvm::Optional with std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
#
bf5c17ed |
| 13-Jan-2023 |
Guillaume Chatelet <gchatelet@google.com> |
[clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment
|