History log of /llvm-project/clang/lib/CodeGen/CodeGenModule.cpp (Results 351 – 375 of 2157)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 4ea1d435 08-Apr-2022 Yaxun (Sam) Liu <yaxun.liu@amd.com>

[CUDA][HIP] Externalize kernels in anonymous name space

kernels in anonymous name space needs to have unique name
to avoid duplicate symbols.

Fixes: https://github.com/llvm/llvm-project/issues/5456

[CUDA][HIP] Externalize kernels in anonymous name space

kernels in anonymous name space needs to have unique name
to avoid duplicate symbols.

Fixes: https://github.com/llvm/llvm-project/issues/54560

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D123353

show more ...


# fa34951f 08-Apr-2022 Mitch Phillips <31459023+hctim@users.noreply.github.com>

Reland "[MTE] Add -fsanitize=memtag* and friends."

Differential Revision: https://reviews.llvm.org/D118948


# 4aaf25b4 08-Apr-2022 Aaron Ballman <aaron@aaronballman.com>

Revert "[MTE] Add -fsanitize=memtag* and friends."

This reverts commit 8aa1490513f111afd407d87c3f07d26f65c8a686.

Broke testing: https://lab.llvm.org/buildbot/#/builders/109/builds/36233


# 8aa14905 01-Apr-2022 Mitch Phillips <31459023+hctim@users.noreply.github.com>

[MTE] Add -fsanitize=memtag* and friends.

Currently, enablement of heap MTE on Android is specified by an ELF note, which
signals to the linker to enable heap MTE. This change allows
-fsanitize=memt

[MTE] Add -fsanitize=memtag* and friends.

Currently, enablement of heap MTE on Android is specified by an ELF note, which
signals to the linker to enable heap MTE. This change allows
-fsanitize=memtag-heap to synthesize these notes, rather than adding them
through the build system. We need to extend this feature to also signal the
linker to do special work for MTE globals (in future) and MTE stack (currently
implemented in the toolchain, but not implemented in the loader).

Current Android uses a non-backwards-compatible ELF note, called
".note.android.memtag". Stack MTE is an ABI break anyway, so we don't mind that
we won't be able to run executables with stack MTE on Android 11/12 devices.

The current expectation is to support the verbiage used by Android, in
that "SYNC" means MTE Synchronous mode, and "ASYNC" effectively means
"fast", using the Kernel auto-upgrade feature that allows
hardware-specific and core-specific configuration as to whether "ASYNC"
would end up being Asynchronous, Asymmetric, or Synchronous on that
particular core, whichever has a reasonable performance delta. Of
course, this is platform and loader-specific.

Differential Revision: https://reviews.llvm.org/D118948

show more ...


# 74b56e02 08-Apr-2022 Chuanqi Xu <yedeng.yd@linux.alibaba.com>

[NFC] Remove unused variable in CodeGenModules

This eliminates an unused-variable warning


# b1ea0191 07-Mar-2022 Kavitha Natarajan <kavitha.natarajan@amd.com>

[clang][DebugInfo] Support debug info for alias variable

clang to emit DWARF information for global alias variable as
DW_TAG_imported_declaration. This change also handles nested
(recursive) importe

[clang][DebugInfo] Support debug info for alias variable

clang to emit DWARF information for global alias variable as
DW_TAG_imported_declaration. This change also handles nested
(recursive) imported declarations.

Reviewed by: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D120989

show more ...


# 5531abaf 24-Mar-2022 Tom Honermann <tom.honermann@intel.com>

[clang] Corrections for target_clones multiversion functions.

This change merges code for emit of target and target_clones multiversion
resolver functions and, in doing so, corrects handling of targ

[clang] Corrections for target_clones multiversion functions.

This change merges code for emit of target and target_clones multiversion
resolver functions and, in doing so, corrects handling of target_clones
functions that are declared but not defined. Previously, a use of such
a target_clones function would result in an attempted emit of an ifunc
that referenced an undefined resolver function. Ifunc references to
undefined resolver functions are not allowed and, when the LLVM verifier
is not disabled (via '-disable-llvm-verifier'), resulted in the verifier
issuing a "IFunc resolver must be a definition" error and aborting the
compilation. With this change, ifuncs and resolver function definitions
are always emitted for used target_clones functions regardless of whether
the target_clones function is defined (if the function is defined, then
the ifunc and resolver are emitted regardless of whether the function is
used).

This change has the side effect of causing target_clones variants and
resolver functions to be emitted in a different order than they were
previously. This is harmless and is reflected in the updated tests.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122958

show more ...


# 40af8df6 01-Apr-2022 Tom Honermann <tom.honermann@intel.com>

[clang] NFC: Preparation for merging code to emit target and target_clones resolvers.

This change modifies CodeGenModule::emitMultiVersionFunctions() in preparation
for a change that will merge supp

[clang] NFC: Preparation for merging code to emit target and target_clones resolvers.

This change modifies CodeGenModule::emitMultiVersionFunctions() in preparation
for a change that will merge support for emitting target_clones resolvers into
this function. This change mostly serves to isolate indentation changes from
later behavior modifying changes.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122957

show more ...


# 0ace0100 01-Apr-2022 Tom Honermann <tom.honermann@intel.com>

[clang] NFC: Simplify the interface to CodeGenModule::GetOrCreateMultiVersionResolver().

Previously, GetOrCreateMultiVersionResolver() required the caller to provide
a GlobalDecl along with an llvm:

[clang] NFC: Simplify the interface to CodeGenModule::GetOrCreateMultiVersionResolver().

Previously, GetOrCreateMultiVersionResolver() required the caller to provide
a GlobalDecl along with an llvm::type and FunctionDecl. The latter two can be
cheaply obtained from the first, and the llvm::type parameter is not always
used, so requiring the caller to provide them was unnecessary and created the
possibility that callers would pass an inconsistent set. This change simplifies
the interface to only require the GlobalDecl value.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122956

show more ...


# bed5ee3f 31-Mar-2022 Tom Honermann <tom.honermann@intel.com>

[clang] NFC: Enhance comments in CodeGen for multiversion function support.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122955


# 15a17696 04-Apr-2022 Shangwu Yao <shangwuyao@gmail.com>

Emit OpenCL metadata when targeting SPIR-V

This is required for converting function calls such as get_global_id()
into SPIR-V builtins.

Differential Revision: https://reviews.llvm.org/D123049


# 7c53fc4f 25-Mar-2022 Tom Honermann <tom.honermann@intel.com>

[clang] Emit target_clones resolver functions as COMDAT.

Previously, resolver functions synthesized for target_clones multiversion
functions were not emitted as COMDAT. Now fixed.


# ff18b158 05-Apr-2022 Nikita Popov <npopov@redhat.com>

[CodeGen] Avoid unnecessary ConstantExpr cast

With opaque pointers, this is not necessarily a ConstantExpr. And
we don't need one here either, just Constant is sufficient.


# 9ba8c402 28-Mar-2022 Erich Keane <erich.keane@intel.com>

Fix behavior of ifuncs with 'used' extern "C" static functions

We expect that `extern "C"` static functions to be usable in things like
inline assembly, as well as ifuncs:
See the bug report here: h

Fix behavior of ifuncs with 'used' extern "C" static functions

We expect that `extern "C"` static functions to be usable in things like
inline assembly, as well as ifuncs:
See the bug report here: https://github.com/llvm/llvm-project/issues/54549

However, we were diagnosing this as 'not defined', because the
ifunc's attempt to look up its resolver would generate a declared IR
function.

Additionally, as background, the way we allow these static extern "C"
functions to work in inline assembly is by making an alias with the C
mangling in MOST situations to the version we emit with
internal-linkage/mangling.

The problem here was multi-fold: First- We generated the alias after the
ifunc was checked, so the function by that name didn't exist yet.
Second, the ifunc's generation caused a symbol to exist under the name
of the alias already (the declared function above), which suppressed the
alias generation.

This patch fixes all of this by moving the checking of ifuncs/CFE aliases
until AFTER we have generated the extern-C alias. Then, it does a
'fixup' around the GlobalIFunc to make sure we correct the reference.

Differential Revision: https://reviews.llvm.org/D122608

show more ...


# dfde3549 29-Mar-2022 Chris Bieneman <chris.bieneman@me.com>

NFC. Fixing warnings from adding DXContainer

Adds DXContainer to switch statements in Clang and LLDB to silence
warnings.


# d6148749 28-Mar-2022 James Y Knight <jyknight@google.com>

[Clang] Implement __builtin_source_location.

This builtin returns the address of a global instance of the
`std::source_location::__impl` type, which must be defined (with an
appropriate shape) befor

[Clang] Implement __builtin_source_location.

This builtin returns the address of a global instance of the
`std::source_location::__impl` type, which must be defined (with an
appropriate shape) before calling the builtin.

It will be used to implement std::source_location in libc++ in a
future change. The builtin is compatible with GCC's implementation,
and libstdc++'s usage. An intentional divergence is that GCC declares
the builtin's return type to be `const void*` (for
ease-of-implementation reasons), while Clang uses the actual type,
`const std::source_location::__impl*`.

In order to support this new functionality, I've also added a new
'UnnamedGlobalConstantDecl'. This artificial Decl is modeled after
MSGuidDecl, and is used to represent a generic concept of an lvalue
constant with global scope, deduplicated by its value. It's possible
that MSGuidDecl itself, or some of the other similar sorts of things
in Clang might be able to be refactored onto this more-generic
concept, but there's enough special-case weirdness in MSGuidDecl that
I gave up attempting to share code there, at least for now.

Finally, for compatibility with libstdc++'s <source_location> header,
I've added a second exception to the "cannot cast from void* to T* in
constant evaluation" rule. This seems a bit distasteful, but feels
like the best available option.

Reviewers: aaron.ballman, erichkeane

Differential Revision: https://reviews.llvm.org/D120159

show more ...


# b8f0e128 22-Mar-2022 Nikita Popov <npopov@redhat.com>

[CodeGen] Remove some uses of deprecated Address constructor

Remove two stray uses in CodeGenModule and CGCUDANV.


# 5d2ce766 18-Mar-2022 Benjamin Kramer <benny.kra@googlemail.com>

Use llvm::append_range instead of push_back loops where applicable. NFCI.


# 806bbc49 21-Feb-2022 Joseph Huber <jhuber6@vols.utk.edu>

[OpenMP] Try to embed offloading objects after codegen

Currently we use the `-fembed-offload-object` option to embed a binary
file into the host as a named section. This is currently only used as a

[OpenMP] Try to embed offloading objects after codegen

Currently we use the `-fembed-offload-object` option to embed a binary
file into the host as a named section. This is currently only used as a
codegen action, meaning we only handle this option correctly when the
input is a bitcode file. This patch adds the same handling to embed an
offloading object after we complete code generation. This allows us to
embed the object correctly if the input file is source or bitcode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120270

show more ...


# dc152659 10-Mar-2022 Erich Keane <erich.keane@intel.com>

Have cpu-specific variants set 'tune-cpu' as an optimization hint

Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the

Have cpu-specific variants set 'tune-cpu' as an optimization hint

Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it. This results in the
identified processor in source-code not being propogated to the
optimizer, and thus, not able to be tuned for.

This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.

Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or with different feature lists!).

If this is not done, there is two potential performance issues with the
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.

1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus
features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.

2- In the event that the user spelled VendorI's processor, and the
feature
list allows it to run on VendorA's processor of similar features, AND
there
is an optimization opportunity for VendorIs that negatively affects
"A"s,
the optimizer will likely choose to do so. This can be fixed by adding
an
alias to X86TargetParser.def.

Differential Revision: https://reviews.llvm.org/D121410

show more ...


# f3480390 29-Jan-2022 Itay Bookstein <ibookstein@gmail.com>

[clang][CodeGen] Avoid emitting ifuncs with undefined resolvers

The purpose of this change is to fix the following codegen bug:

```
// main.c
__attribute__((cpu_specific(generic)))
int *foo(void) {

[clang][CodeGen] Avoid emitting ifuncs with undefined resolvers

The purpose of this change is to fix the following codegen bug:

```
// main.c
__attribute__((cpu_specific(generic)))
int *foo(void) { static int z; return &z;}
int main() { return *foo() = 5; }

// other.c
__attribute__((cpu_dispatch(generic))) int *foo(void);

// run:
clang main.c other.c -o main; ./main
```

This will segfault prior to the change, and return the correct
exit code 5 after the change.

The underlying cause is that when a translation unit contains
a cpu_specific function without the corresponding cpu_dispatch
the generated code binds the reference to foo() against a
GlobalIFunc whose resolver is undefined. This is invalid: the
resolver must be defined in the same translation unit as the
ifunc, but historically the LLVM bitcode verifier did not check
that. The generated code then binds against the resolver rather
than the ifunc, so it ends up calling the resolver rather than
the resolvee. In the example above it treats its return value as
an int *, therefore trying to write to program text.

The root issue at the representation level is that GlobalIFunc,
like GlobalAlias, does not support a "declaration" state. The
object which provides the correct semantics in these cases
is a Function declaration, but unlike Functions, changing a
declaration to a definition in the GlobalIFunc case constitutes
a change of the object type, as opposed to simply emitting code
into a Function.

I think this limitation is unlikely to change, so I implemented
the fix by returning a function declaration rather than an ifunc
when encountering cpu_specific, and upgrading it to an ifunc
when emitting cpu_dispatch.
This uses `takeName` + `replaceAllUsesWith` in similar vein to
other places where the correct IR object type cannot be known
locally/up-front, like in `CodeGenModule::EmitAliasDefinition`.

Previous discussion in: https://reviews.llvm.org/D112349

Signed-off-by: Itay Bookstein <ibookstein@gmail.com>

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D120266

show more ...


# 50650766 16-Feb-2022 Nikita Popov <npopov@redhat.com>

[CodeGen] Rename deprecated Address constructor

To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().

While d

[CodeGen] Rename deprecated Address constructor

To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().

While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.

show more ...


# 6398903a 14-Feb-2022 Momchil Velikov <momchil.velikov@arm.com>

Extend the `uwtable` attribute with unwind table kind

We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tabl

Extend the `uwtable` attribute with unwind table kind

We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e. we lose the information whether to generate want just
some unwind tables or asynchronous unwind tables.

Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE) in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.

That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.

This patch extends the `uwtable` attribute with an optional
value:
- `uwtable` (default to `async`)
- `uwtable(sync)`, synchronous unwind tables
- `uwtable(async)`, asynchronous (instruction precise) unwind tables

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D114543

show more ...


# 87dd3d35 11-Feb-2022 Arthur Eubanks <aeubanks@google.com>

[clang][OpaquePtr] Remove call to getPointerElementType() in CodeGenModule::GetAddrOfGlobalTemporary()


# d8f99bb6 11-Feb-2022 Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
r

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216

show more ...


1...<<11121314151617181920>>...87