History log of /llvm-project/clang/lib/CodeGen/CodeGenModule.cpp (Results 376 – 400 of 2157)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1d97cb1f 04-Feb-2022 Yaxun (Sam) Liu <yaxun.liu@amd.com>

[HIP] Emit amdgpu_code_object_version module flag

code object version determines ABI, therefore should not be mixed.

This patch emits amdgpu_code_object_version module flag in LLVM IR
based on code

[HIP] Emit amdgpu_code_object_version module flag

code object version determines ABI, therefore should not be mixed.

This patch emits amdgpu_code_object_version module flag in LLVM IR
based on code object version (default 4).

The amdgpu_code_object_version value is code object version times 100.

LLVM IR with different amdgpu_code_object_version module flag cannot
be linked.

The -cc1 option -mcode-object-version=none is for ROCm device library use
only, which supports multiple ABI.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D119026

show more ...


# 171da443 04-Feb-2022 Yaxun (Sam) Liu <yaxun.liu@amd.com>

[HIPSPV] Fix literals are mapped to Generic address space

This issue is an oversight in D108621.

Literals in HIP are emitted as global constant variables with default
address space which maps to Ge

[HIPSPV] Fix literals are mapped to Generic address space

This issue is an oversight in D108621.

Literals in HIP are emitted as global constant variables with default
address space which maps to Generic address space for HIPSPV. In
SPIR-V such variables translate to OpVariable instructions with
Generic storage class which are not legal. Fix by mapping literals
to CrossWorkGroup address space.

The literals are not mapped to UniformConstant because the “flat”
pointers in HIP may reference them and “flat” pointers are modeled
as Generic pointers in SPIR-V. In SPIR-V/OpenCL UniformConstant
pointers may not be casted to Generic.

Patch by: Henry Linjamäki

Reviewed by: Yaxun Liu

Differential Revision: https://reviews.llvm.org/D118876

show more ...


# 853e0aa4 04-Feb-2022 Hans Wennborg <hans@chromium.org>

Don't dllexport reference temporaries

Even if the reference itself is dllexport, the temporary should not be.
In fact, we're already giving it internal linkage, so dllexporting it
is not just wastef

Don't dllexport reference temporaries

Even if the reference itself is dllexport, the temporary should not be.
In fact, we're already giving it internal linkage, so dllexporting it
is not just wasteful, but will fail to link, as in the example below:

$ cat /tmp/a.cc
void _DllMainCRTStartup() {}
const int __declspec(dllexport) &foo = 42;

$ clang-cl -fuse-ld=lld /tmp/a.cc /Zl /link /dll /out:a.dll
lld-link: error: <root>: undefined symbol: int const &foo::$RT1

Differential revision: https://reviews.llvm.org/D118980

show more ...


# 1f08b086 28-Jan-2022 Amilendra Kodithuwakku <Amilendra.Kodithuwakku@arm.com>

[clang][ARM] Emit warnings when PACBTI-M is used with unsupported architectures

Branch protection in M-class is supported by
- Armv8.1-M.Main
- Armv8-M.Main
- Armv7-M

Attempting to enable this f

[clang][ARM] Emit warnings when PACBTI-M is used with unsupported architectures

Branch protection in M-class is supported by
- Armv8.1-M.Main
- Armv8-M.Main
- Armv7-M

Attempting to enable this for other architectures, either by
command-line (e.g -mbranch-protection=bti) or by target attribute
in source code (e.g. __attribute__((target("branch-protection=..."))) )
will generate a warning.

In both cases function attributes related to branch protection will not
be emitted. Regardless of the warning, module level attributes related to
branch protection will be emitted when it is enabled via the command-line.

The following people also contributed to this patch:
- Victor Campos

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D115501

show more ...


# 82af9502 21-Jan-2022 Joao Moreira <joao.moreira@intel.com>

[X86] Enable ibt-seal optimization when LTO is used in Kernel

Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit

[X86] Enable ibt-seal optimization when LTO is used in Kernel

Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instruction on function's prologues. Because this is a security feature, it is desirable that only actual indirect-branch-targeted functions are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end being reachable through PLT entries, that will use an indirect branch for such. Because this cannot be determined during compilation-time, the compiler currently emits ENDBRs to every non-local-linkage function.

Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the most fit ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs for NOPs directly in the binary. The discussion of this feature can be seen in [1].

This diff brings the enablement of the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement in when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs to address taken functions, ignoring non-address taken functions that are don't have local linkage.

A comparison between an LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540.

The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged.

[1] - https://lkml.org/lkml/2021/11/22/591

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D116070

show more ...


# 85c2bd2a 19-Jan-2022 Yaxun (Sam) Liu <yaxun.liu@amd.com>

Prevent adding module flag amdgpu_hostcall multiple times

HIP program with printf call fails to compile with -fsanitize=address
option, because of appending module flag - amdgpu_hostcall twice, one

Prevent adding module flag amdgpu_hostcall multiple times

HIP program with printf call fails to compile with -fsanitize=address
option, because of appending module flag - amdgpu_hostcall twice, one
for printf and one for sanitize option. This patch fixes that issue.

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Roman Lebedev

Differential Revision: https://reviews.llvm.org/D116216

show more ...


# c63a3175 15-Jan-2022 Nikita Popov <nikita.ppv@gmail.com>

[AttrBuilder] Remove ctor accepting AttributeList and Index

Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should exact the AttributeSet from the
AttributeLi

[AttrBuilder] Remove ctor accepting AttributeList and Index

Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should exact the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.

show more ...


# 2bcba21c 14-Jan-2022 Erich Keane <erich.keane@intel.com>

[CPU-Dispatch] Make sure Dispatch names get updated if previously mangled

Cases where there is a mangling of a cpu-dispatch/cpu-specific function
before the function becomes 'multiversion' (such as

[CPU-Dispatch] Make sure Dispatch names get updated if previously mangled

Cases where there is a mangling of a cpu-dispatch/cpu-specific function
before the function becomes 'multiversion' (such as a member function)
causes the wrong name to be emitted for one of the variants/resolver,
since the name is cached. Make sure we invalidate the cache in
cpu-dispatch/cpu-specific modes, like we previously did for just target
multiversioning.

show more ...


# b699e8b1 13-Jan-2022 Erich Keane <erich.keane@intel.com>

Add another assert to cpu-dispatch emission to help track down a tough
to repro error.

As mentioned yesterday, I've got a problem that I can only reproduce on
Godbolt (none of the build configs on m

Add another assert to cpu-dispatch emission to help track down a tough
to repro error.

As mentioned yesterday, I've got a problem that I can only reproduce on
Godbolt (none of the build configs on my local machine!), so this is at
least somewhat usable until I figure out a cause.

show more ...


# 6e77ad11 12-Jan-2022 Erich Keane <erich.keane@intel.com>

Add an assert in cpudispatch emit to try to track down an error.

I'm attempting to debug an issue that I can only get to happen on
godbolt, where the cpu-dispatch resolver for an out of line member

Add an assert in cpudispatch emit to try to track down an error.

I'm attempting to debug an issue that I can only get to happen on
godbolt, where the cpu-dispatch resolver for an out of line member
function is generated with the wrong name, causing a link failure.

show more ...


# d2cc6c2d 03-Jan-2022 Serge Guelton <sguelton@redhat.com>

Use a sorted array instead of a map to store AttrBuilder string attributes

Using and std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor sligh

Use a sorted array instead of a map to store AttrBuilder string attributes

Using and std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599

show more ...


# 40446663 09-Jan-2022 Kazu Hirata <kazu@google.com>

[clang] Use true/false instead of 1/0 (NFC)

Identified with modernize-use-bool-literals.


# 9290ccc3 04-Jan-2022 serge-sans-paille <sguelton@redhat.com>

Introduce the AttributeMask class

This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for bu

Introduce the AttributeMask class

This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for building and removing, which introduced odd situation like creation of
Attribute with dummy value because the only relevant part was the attribute
kind.

Differential Revision: https://reviews.llvm.org/D116110

show more ...


# ec2e26ea 10-Aug-2021 Sami Tolvanen <samitolvanen@google.com>

[Clang] Add __builtin_function_start

Control-Flow Integrity (CFI) replaces references to address-taken
functions with pointers to the CFI jump table. This is a problem
for low-level code, such as op

[Clang] Add __builtin_function_start

Control-Flow Integrity (CFI) replaces references to address-taken
functions with pointers to the CFI jump table. This is a problem
for low-level code, such as operating system kernels, which may
need the address of an actual function body without the jump table
indirection.

This change adds the __builtin_function_start() builtin, which
accepts an argument that can be constant-evaluated to a function,
and returns the address of the function body.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Depends on D108478

Reviewed By: pcc, rjmccall

Differential Revision: https://reviews.llvm.org/D108479

show more ...


# c3b624a1 15-Dec-2021 Nikita Popov <npopov@redhat.com>

[CodeGen] Avoid deprecated ConstantAddress constructor

Change all uses of the deprecated constructor to pass the
element type explicitly and drop it.

For cases where the correct element type was no

[CodeGen] Avoid deprecated ConstantAddress constructor

Change all uses of the deprecated constructor to pass the
element type explicitly and drop it.

For cases where the correct element type was not immediately
obvious to me or would require a slightly larger change I'm
falling back to explicitly calling getPointerElementType() for now.

show more ...


# 0a14674f 03-Dec-2021 Peter Collingbourne <peter@pcc.me.uk>

CodeGen: Strip exception specifications from function types in CFI type names.

With C++17 the exception specification has been made part of the
function type, and therefore part of mangled type name

CodeGen: Strip exception specifications from function types in CFI type names.

With C++17 the exception specification has been made part of the
function type, and therefore part of mangled type names.

However, it's valid to convert function pointers with an exception
specification to function pointers with the same argument and return
types but without an exception specification, which means that e.g. a
function of type "void () noexcept" can be called through a pointer
of type "void ()". We must therefore consider the two types to be
compatible for CFI purposes.

We can do this by stripping the exception specification before mangling
the type name, which is what this patch does.

Differential Revision: https://reviews.llvm.org/D115015

show more ...


# e3b2f022 01-Dec-2021 Ties Stuij <ties.stuij@arm.com>

[clang][ARM] PACBTI-M frontend support

Handle branch protection option on the commandline as well as a function
attribute. One patch for both mechanisms, as they use the same underlying
parsing mech

[clang][ARM] PACBTI-M frontend support

Handle branch protection option on the commandline as well as a function
attribute. One patch for both mechanisms, as they use the same underlying
parsing mechanism.

These are recorded in a set of LLVM IR module-level attributes like we do for
AArch64 PAC/BTI (see https://reviews.llvm.org/D85649):

- command-line options are "translated" to module-level LLVM IR
attributes (metadata).

- functions have PAC/BTI specific attributes iff the
__attribute__((target("branch-protection=...))) was used in the function
declaration.

- command-line option -mbranch-protection to armclang targeting Arm,
following this grammar:

branch-protection ::= "-mbranch-protection=" <protection>
protection ::= "none" | "standard" | "bti" [ "+" <pac-ret-clause> ]
| <pac-ret-clause> [ "+" "bti"]
pac-ret-clause ::= "pac-ret" [ "+" <pac-ret-option> ]
pac-ret-option ::= "leaf" ["+" "b-key"] | "b-key" ["+" "leaf"]

b-key is simply a placeholder to make it consistent with AArch64's
version. In Arm, however, it triggers a warning informing that b-key is
unsupported and a-key will be selected instead.

- Handle _attribute_((target(("branch-protection=..."))) for AArch32 with the
same grammer as the commandline options.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Momchil Velikov
- Victor Campos
- Ties Stuij

Reviewed By: vhscampos

Differential Revision: https://reviews.llvm.org/D112421

show more ...


# fc53eb69 29-Nov-2021 Erich Keane <erich.keane@intel.com>

Reapply 'Implement target_clones multiversioning'

See discussion in D51650, this change was a little aggressive in an
error while doing a 'while we were here', so this removes that error
condition,

Reapply 'Implement target_clones multiversioning'

See discussion in D51650, this change was a little aggressive in an
error while doing a 'while we were here', so this removes that error
condition, as it is apparently useful.

This reverts commit bb4934601d731465e01e2e22c80ce2dbe687d73f.

show more ...


# de34a940 18-Nov-2021 Phoebe Wang <pengfei.wang@intel.com>

[X86] Add -mskip-rax-setup support to align with GCC

AMD64 ABI mandates caller to specify the number of used SSE registers
when passing variable arguments.
GCC also provides option -mskip-rax-setup

[X86] Add -mskip-rax-setup support to align with GCC

AMD64 ABI mandates caller to specify the number of used SSE registers
when passing variable arguments.
GCC also provides option -mskip-rax-setup to skip the setup of rax when
SSE is disabled. This helps to reduce the code size, see pr23258.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112413

show more ...


# d0ac215d 14-Nov-2021 Kazu Hirata <kazu@google.com>

[clang] Use isa instead of dyn_cast (NFC)


# bb493460 12-Nov-2021 Adrian Kuegel <akuegel@google.com>

Revert "Implement target_clones multiversioning"

This reverts commit 9deab60ae710f8c4cc810cd680edfb64c803f42d.
There is a possibly unintended semantic change.


# 9deab60a 05-Nov-2021 Erich Keane <erich.keane@intel.com>

Implement target_clones multiversioning

As discussed here: https://lwn.net/Articles/691932/

GCC6.0 adds target_clones multiversioning. This functionality is
an odd cross between the cpu_dispatch an

Implement target_clones multiversioning

As discussed here: https://lwn.net/Articles/691932/

GCC6.0 adds target_clones multiversioning. This functionality is
an odd cross between the cpu_dispatch and 'target' MV, but is compatible
with neither.

This attribute allows you to list all options, then emits a separately
optimized version of each function per-option (similar to the
cpu_specific attribute). It automatically generates a resolver, just
like the other two.

The mangling however, is... ODD to say the least. The mangling format
is:
<normal_mangling>.<option string>.<option ordinal>.

Differential Revision:https://reviews.llvm.org/D51650

show more ...


# 4b3881e9 10-Nov-2021 Yaxun (Sam) Liu <yaxun.liu@amd.com>

Emit hidden hostcall argument for sanitized kernels

this patch - https://reviews.llvm.org/D110337 changes the way how hostcall
hidden argument is emitted for printf, but the sanitized kernels also u

Emit hidden hostcall argument for sanitized kernels

this patch - https://reviews.llvm.org/D110337 changes the way how hostcall
hidden argument is emitted for printf, but the sanitized kernels also use
hostcall buffer to report a error for invalid memory access, which is not
handled by the above patch and it leads to vdi runtime error:

Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT:
Agent attempted to access an inaccessible address. code: 0x2b

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D112820

show more ...


# 80072fde 04-Nov-2021 Yaxun (Sam) Liu <yaxun.liu@amd.com>

[CUDA][HIP] Allow comdat for kernels

Two identical instantiations of a template function can be emitted by two TU's
with linkonce_odr linkage without causing duplicate symbols in linker. MSVC
also r

[CUDA][HIP] Allow comdat for kernels

Two identical instantiations of a template function can be emitted by two TU's
with linkonce_odr linkage without causing duplicate symbols in linker. MSVC
also requires these symbols be in comdat sections. Linux does not require
the symbols in comdat sections to be merged by linker but by default
clang puts them in comdat sections.

If a template kernel is instantiated identically in two TU's. MSVC requires
that them to be in comdat sections, otherwise MSVC linker will diagnose them as
duplicate symbols. However, currently clang does not put instantiated template
kernels in comdat sections, which causes link error for MSVC.

This patch allows putting instantiated template kernels into comdat sections.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D112492

show more ...


# 9efce0ba 06-Nov-2021 Itay Bookstein <ibookstein@gmail.com>

[clang] Run LLVM Verifier in modes without CodeGen too

Previously, the Backend_Emit{Nothing,BC,LL} modes did
not run the LLVM verifier since it is usually added via
the TargetMachine::addPassesToEmi

[clang] Run LLVM Verifier in modes without CodeGen too

Previously, the Backend_Emit{Nothing,BC,LL} modes did
not run the LLVM verifier since it is usually added via
the TargetMachine::addPassesToEmitFile method according
to the DisableVerify parameter. This is called from
EmitAssemblyHelper::AddEmitPasses, which is only relevant
for BackendAction-s that require CodeGen.

Note:
* In these particular situations the verifier is added
to the optimization pipeline rather than the codegen
pipeline so that it runs prior to the BC/LL emission
pass.
* This change applies to both the old and the new PMs.
* Because the clang tests use -emit-llvm ubiquitously,
this change will enable the verifier for them.
* A small bug is fixed in emitIFuncDefinition so that
the clang/test/CodeGen/ifunc.c test would pass:
the emitIFuncDefinition incorrectly passed the
GlobalDecl of the IFunc itself to the call to
GetOrCreateLLVMFunction for creating the resolver.

Signed-off-by: Itay Bookstein <ibookstein@gmail.com>

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D113352

show more ...


1...<<11121314151617181920>>...87