History log of /llvm-project/clang/lib/CodeGen/CodeGenModule.cpp (Results 551 – 575 of 2157)
Revision Date Author Comments
# 3bab88b7 14-Jun-2020 Tyker <tyker1@outlook.com>

Prevent IR-gen from emitting consteval declarations

Summary: With this patch, instead of emitting calls to consteval functions, IR-gen will emit a store of the already computed result.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76420


# 4db2b702 08-Jun-2020 Jian Cai <caij2003@gmail.com>

Add a flag to debug automatic variable initialization

Summary:
Add -ftrivial-auto-var-init-stop-after= to limit the number of times
stack variables are initialized when -ftrivial-auto-var-init= is used to
initialize stack variables to zero or a pattern. This flag can be used
to bisect uninitialized uses of a stack variable exposed by automatic
variable initialization, such as http://crrev.com/c/2020401.

Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich

Reviewed By: jfb

Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77168


# ecd682bb 04-Jun-2020 Ties Stuij <ties.stuij@arm.com>

[ARM] Add __bf16 as new Bfloat16 C Type

Summary:
This patch upstreams support for a new storage only bfloat16 C type.
This type is used to implement primitive support for bfloat16 data, in
line with the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

In detail this patch:
- introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type.

This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.

The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Alexandros Lamprineas
- Luke Geeson
- Simon Tatham
- Ties Stuij

Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli

Reviewed By: SjoerdMeijer

Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76077
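
As a rough illustration of what "storage-only" means for bfloat16, the sketch below emulates the format in portable C++ as the upper 16 bits of an IEEE-754 binary32 float. It does not use `__bf16` itself, which needs a target that supports it. The `Bf16Storage` struct and both helper names are hypothetical, the code assumes `float` is binary32, and the narrowing conversion truncates rather than rounds.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a storage-only bfloat16 value: 1 sign bit,
// 8 exponent bits, 7 mantissa bits (the top half of a binary32 float).
struct Bf16Storage {
  std::uint16_t bits;
};

// Narrow a float by dropping its low 16 mantissa bits (truncation, no rounding).
inline Bf16Storage toBf16(float value) {
  std::uint32_t raw;
  std::memcpy(&raw, &value, sizeof raw);
  return Bf16Storage{static_cast<std::uint16_t>(raw >> 16)};
}

// Widen back to float by zero-filling the dropped mantissa bits.
inline float fromBf16(Bf16Storage value) {
  std::uint32_t raw = static_cast<std::uint32_t>(value.bits) << 16;
  float result;
  std::memcpy(&result, &raw, sizeof result);
  return result;
}
```

Because the type is storage-only, arithmetic would be done by widening to `float` first, as `fromBf16` does, and narrowing the result afterwards.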


# ef1d4bec 01-Jun-2020 Nick Desaulniers <ndesaulniers@google.com>

[Clang][CGM] style cleanups NFC

Summary:
Forked from:
https://reviews.llvm.org/D80242

Use the getter for access to DebugInfo consistently.
Use break in switch in CodeGenModule::EmitTopLevelDecl consistently.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: cfe-commits, srhines

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80840


# aca3d067 27-May-2020 James Y Knight <jyknight@google.com>

Fix Darwin 'constinit thread_local' variables.

Unlike other platforms using ItaniumCXXABI, Darwin does not allow the
creation of a thread-wrapper function for a variable in the TU of
users. Because of this, it can set the linkage of the thread-local
symbol to internal, with the assumption that no TUs other than the one
defining the variable will need it.

However, constinit thread_local variables do not require the use of
the thread-wrapper call, so users reference the variable
directly. Thus, it must not be converted to internal, or users will
get a link failure.

This was a regression introduced by the optimization in
00223827a952f66e7426c9881a2a4229e59bb019.

Differential Revision: https://reviews.llvm.org/D80417


# 9d55e4ee 25-May-2020 Fangrui Song <maskray@google.com>

Make explicit -fno-semantic-interposition (in -fpic mode) infer dso_local

-fno-semantic-interposition is currently the CC1 default. (The opposite
disables some interprocedural optimizations.) However, it does not infer
dso_local: on most targets accesses to ExternalLinkage functions/variables
defined in the current module still need PLT/GOT.

This patch makes explicit -fno-semantic-interposition infer dso_local,
so that PLT/GOT can be eliminated if targets implement local aliases
for AsmPrinter::getSymbolPreferLocal (currently only x86).

Currently we check whether the module flag "SemanticInterposition" is 0.
If yes, infer dso_local. In the future, we can infer dso_local unless
"SemanticInterposition" is 1: frontends other than clang will also
benefit from the optimization if they don't bother setting the flag.
(There will be risks if they do want ELF interposition: they need to set
"SemanticInterposition" to 1.)

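The decision rule in the last paragraph of this message can be modeled in a few lines. This is only a sketch of one reading of the text, not the code in CodeGenModule.cpp: the enum and both function names are hypothetical, and an unset flag is modeled as a distinct state because the message says frontends that never set the flag do not yet benefit.

```cpp
// Hypothetical model of the "SemanticInterposition" module flag.
enum class SemanticInterposition { Unset, Zero, One };

// Rule described for this patch: infer dso_local only when the flag is
// explicitly 0.
inline bool inferDsoLocalCurrent(SemanticInterposition flag) {
  return flag == SemanticInterposition::Zero;
}

// Possible future rule from the message: infer dso_local unless the flag is 1,
// so frontends that never set the flag would also get the optimization.
inline bool inferDsoLocalFuture(SemanticInterposition flag) {
  return flag != SemanticInterposition::One;
}
```

The two rules differ only for the unset case, which is exactly the risk the message points out for frontends that do want ELF interposition.
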

# 62f3ef2b 18-May-2020 Eli Friedman <efriedma@quicinc.com>

[CGCall] Annotate references with "align" attribute.

If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.

See also D80072.

Differential Revision: https://reviews.llvm.org/D80166
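
A small function that takes a reference to an over-aligned type makes the effect easy to inspect. The type and function names below are hypothetical, and the IR shown in the comment is an approximation of what one might see with `clang++ -S -emit-llvm`; the exact spelling depends on the Clang version and target.

```cpp
// Hypothetical over-aligned aggregate: 16 bytes in size, 16-byte aligned.
struct alignas(16) Vec4 {
  float lanes[4];
};

static_assert(alignof(Vec4) == 16, "Vec4 is expected to be 16-byte aligned");

// With this change, the reference parameter is expected to carry both
// attributes in LLVM IR, roughly: `align 16 dereferenceable(16) %v`.
float firstLane(const Vec4 &v) { return v.lanes[0]; }
```

Binding a reference to a misaligned object is already undefined behavior in C++, so the extra `align` attribute does not add a new requirement on callers.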


# 94908088 16-Apr-2020 George Burgess IV <george.burgess.iv@gmail.com>

[CodeGen] fix inline builtin-related breakage from D78162

In cases where we have multiple decls of an inline builtin, we may need
to go hunting for the one with a definition when setting function
attributes.

An additional test-case was provided on
https://github.com/ClangBuiltLinux/linux/issues/979


# bab6df86 12-Apr-2020 Richard Smith <richard@metafoo.co.uk>

Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented.

Summary:
Previously, we treated CXXUuidofExpr as quite a special case: it was the
only kind of expression that could be a canonical template argument, it
could be a constant lvalue base object, and so on. In addition, we
represented the UUID value as a string, whose source form we did not
preserve faithfully, and that we partially parsed in multiple different
places.

With this patch, we create an MSGuidDecl object to represent the
implicit object of type 'struct _GUID' created by a UuidAttr. Each
UuidAttr holds a pointer to its 'struct _GUID' and its original
(as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves
like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue
representation of the GUID on the MSGuidDecl and use it from constant
evaluation where needed.

This allows removing a lot of the special-case logic to handle these
expressions. Unfortunately, many parts of Clang assume there are only
a couple of interesting kinds of ValueDecl, so the total amount of
special-case logic is not really reduced very much.

This fixes a few bugs and issues:
* PR38490: we now support reading from GUID objects returned from
__uuidof during constant evaluation.
* Our Itanium mangling for a non-instantiation-dependent template
argument involving __uuidof no longer depends on which CXXUuidofExpr
template argument we happened to see first.
* We now predeclare ::_GUID, and permit use of __uuidof without
any header inclusion, better matching MSVC's behavior. We do not
predefine ::__s_GUID, though; that seems like a step too far.
* Our IR representation for GUID constants now uses the correct IR type
wherever possible. We will still fall back to using the
{i32, i16, i16, [8 x i8]}
layout if a definition of struct _GUID is not available. This is not
ideal: in principle the two layouts could have different padding.

Reviewers: rnk, jdoerfert

Subscribers: arphaman, cfe-commits, aeubanks

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78171
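
The fallback IR layout `{i32, i16, i16, [8 x i8]}` mentioned above corresponds to a C++ struct like the one sketched here. `GuidLayout` is a hypothetical name, and the field names follow the Windows SDK convention rather than anything stated in this message. The static assertions check the size and offsets on the host where the code is compiled, which is the kind of check that would expose a padding difference.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the fallback layout {i32, i16, i16, [8 x i8]}.
struct GuidLayout {
  std::uint32_t Data1;
  std::uint16_t Data2;
  std::uint16_t Data3;
  std::uint8_t Data4[8];
};

// On common ABIs the fields pack without padding into 16 bytes.
static_assert(sizeof(GuidLayout) == 16, "unexpected padding in GuidLayout");
static_assert(offsetof(GuidLayout, Data2) == 4, "Data2 should follow Data1");
static_assert(offsetof(GuidLayout, Data4) == 8, "Data4 should start at byte 8");
```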


# 2dd17ff0 15-Apr-2020 George Burgess IV <george.burgess.iv@gmail.com>

[CodeGen] only add nobuiltin to inline builtins if we'll emit them

There are some inline builtin definitions that we can't emit
(isTriviallyRecursive & callers go into why). Marking these
nobuiltin is only useful if we actually emit the body, so don't mark
these as such unless we _do_ plan on emitting that.

This suboptimality was encountered in Linux (see some discussion on
D71082, and https://github.com/ClangBuiltLinux/linux/issues/979).

Differential Revision: https://reviews.llvm.org/D78162


# 91c8c741 14-Apr-2020 George Burgess IV <george.burgess.iv@gmail.com>

[CodeGen] clarify a comment; NFC

Prompted by discussion on https://reviews.llvm.org/D78148.


# cfc00271 14-Mar-2020 Ayke van Laethem <aykevanlaethem@gmail.com>

[AVR] Support aliases in non-zero address space

This fixes code like the following on AVR:

void foo(void) {
}
void bar(void) __attribute__((alias("foo")));

Code like this is present in compiler-rt, which I'm trying to build.

Differential Revision: https://reviews.llvm.org/D76182


# c97be2c3 26-Mar-2020 Michael Liao <michael.hliao@gmail.com>

[hip] Remove `hip_pinned_shadow`.

Summary:
- Use `device_builtin_surface` and `device_builtin_texture` for
surface/texture reference support. So far, both the host and device
use the same reference type, which could be revised later when
interface/implementation is stabilized.

Reviewers: yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77583


# 83fa811e 03-Apr-2020 Eli Friedman <efriedma@quicinc.com>

[clang][opaque pointers] Fix up a bunch of "getType()->getElementType()"

In contexts where we know an LLVM type is a pointer, there's generally
some simpler way to get the pointee type.


# 5be9b8cb 27-Mar-2020 Michael Liao <michael.hliao@gmail.com>

[cuda][hip] Add CUDA builtin surface/texture reference support.

Summary: - Re-commit after fixing Sema checks on partial template specialization.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365


# fe8063e1 27-Mar-2020 Artem Belevich <tra@google.com>

Revert "[cuda][hip] Add CUDA builtin surface/texture reference support."

This reverts commit 6a9ad5f3f4ac66f0cae592e911f4baeb6ee5eca6.
The patch breaks CUDA compilation.

Differential Revision: https://reviews.llvm.org/D76365


# befb4be3 25-Feb-2020 Johannes Doerfert <johannes@jdoerfert.de>

[OpenMP] `omp begin/end declare variant` - part 2, sema ("+CG")

This is the second part loosely extracted from D71179 and cleaned up.

This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant`
support. As a consequence, a lot of unneeded code generation and
complexity is removed.

A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].

The logic is as follows:

If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
1) Create a function declaration for the definition we were about to generate.
2) Create a function definition but with a mangled name (according to
`<SELECTOR>`).
3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
one used already for `omp declare variant`, using the mangled
function definition as specialization for the context defined by
`<SELECTOR>`.

When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.

[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman

Subscribers: bollu, guansong, openmp-commits, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75779
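
The call-specialization step described above (check all variants, pick the best applicable one, otherwise keep the base function) can be sketched as a plain selection loop. This is a simplified model, not Clang's implementation: `VariantCandidate`, its fields, and `selectCallee` are hypothetical, and real context matching and scoring follow the OpenMP selector rules rather than a single integer.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one variant attached to a base function.
struct VariantCandidate {
  std::string mangledName; // name of the specialized definition
  int score;               // higher means a better match for the context
  bool applicable;         // whether <SELECTOR> matches the call context
};

// Return the name to call: the best applicable variant, or the base
// function when no variant applies.
inline std::string selectCallee(const std::string &baseName,
                                const std::vector<VariantCandidate> &variants) {
  const VariantCandidate *best = nullptr;
  for (const VariantCandidate &candidate : variants) {
    if (!candidate.applicable)
      continue;
    if (best == nullptr || candidate.score > best->score)
      best = &candidate;
  }
  return best ? best->mangledName : baseName;
}
```

In the patch itself the original call is kept as the syntactic form of a PseudoObjectExpr, so a model like this only covers the choice of callee and not the AST representation.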


# 6a9ad5f3 07-Mar-2020 Michael Liao <michael.hliao@gmail.com>

[cuda][hip] Add CUDA builtin surface/texture reference support.

Summary:
- Even though the bindless surface/texture interfaces are promoted,
there is still code using surface/texture references. For example,
[PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
compilation issue for code using `tex2D` with texture references. For
better compatibility, this patch proposes the support of
surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
`nvcc` does use builtins for texture support. From the limited NVVM
documentation[^nvvm] and NVPTX backend texture/surface related
tests[^test], it's believed that surface/texture references are
supported by replacing their reference types, which are annotated with
`device_builtin_surface_type`/`device_builtin_texture_type`, with the
corresponding handle-like object types, `cudaSurfaceObject_t` or
`cudaTextureObject_t`, in the device-side compilation. On the host
side, that global handle variables are registered and will be
established and updated later when corresponding binding/unbinding
APIs are called[^bind]. Surface/texture references are most like
device global variables but represented in different types on the host
and device sides.
- In this patch, the following changes are proposed to support that
behavior:
+ Refine `device_builtin_surface_type` and
`device_builtin_texture_type` attributes to be applied on `Type`
decl only to check whether a variable is of the surface/texture
reference type.
+ Add hooks in code generation to replace those reference types with
the corresponding object types as well as all accesses to them. In
particular, `nvvm.texsurf.handle.internal` should be used to load
object handles from global reference variables[^texsurf] as well as
metadata annotations.
+ Generate host-side registration with proper template argument
parsing.

---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 `Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365


# fc375266 10-Mar-2020 Shiva Chen <shiva@andestech.com>

[RISCV] Passing small data limitation value to RISCV backend

Pass the small data limit to RISCVELFTargetObjectFile via a module flag,
so the backend can set the small data section threshold from that value.
Data is placed in the small data section if it is smaller than
the threshold.

Differential Revision: https://reviews.llvm.org/D57497


# 4cf01ed7 18-Mar-2020 Michael Liao <michael.hliao@gmail.com>

[hip] Revise `GlobalDecl` constructors. NFC.

Summary:
- https://reviews.llvm.org/D68578 revises the `GlobalDecl` constructors
to ensure all GPU kernels have `ReferenceKenelKind` initialized
properly with an explicit constructor and static one. But, there are
lots of places using the implicit constructor triggering the assertion
on non-GPU kernels. That's found in compilation of many tests and
workloads.
- Fixing all of them may change more code and, more importantly, all of
them assume the default kernel reference kind. This patch changes
that constructor to tell `CUDAGlobalAttr` and construct `GlobalDecl`
properly.

Reviewers: yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76344


# c45eaeab 17-Mar-2020 Jon Chesterfield <jonathanchesterfield@gmail.com>

[Clang] Undef attribute for global variables

Summary:
[Clang] Attribute to allow defining undef global variables

Initializing global variables is very cheap on hosted implementations. The
C semantics of zero initializing globals work very well there. It is not
necessarily cheap on freestanding implementations. Where there is no loader
available, code must be emitted near the start point to write the appropriate
values into memory.

At present, external variables can be declared in C++ and definitions provided
in assembly (or IR) to achieve this effect. This patch provides an attribute in
order to remove this reason for writing assembly for performance sensitive
freestanding implementations.

A close analogue in tree is LDS memory for amdgcn, where the kernel is
responsible for initializing the memory after it starts executing on the gpu.
Uninitialized variables in LDS are observably cheaper than zero initialized.

Patch is loosely based on the cuda __shared__ and opencl __local variable
implementation which also produces undef global variables.

Reviewers: kcc, rjmccall, rsmith, glider, vitalybuka, pcc, eugenis, vlad.tsyrklevich, jdoerfert, gregrodgers, jfb, aaron.ballman

Reviewed By: rjmccall, aaron.ballman

Subscribers: Anastasia, aaron.ballman, davidb, Quuxplusone, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74361
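
A minimal sketch of how such an attribute might be used follows. It assumes the attribute landed under the spelling `loader_uninitialized`, which this message does not state; compilers that do not know it are expected to warn and ignore it. The variable and function names are hypothetical. Since the variable starts with an indeterminate value, the program must write each element before reading it.

```cpp
// Assumption: the attribute from this patch is spelled `loader_uninitialized`.
// The definition has no initializer, so the compiler may emit it as an
// undef global instead of zero-initialized storage.
[[clang::loader_uninitialized]] int scratchBuffer[4];

// Write every element before reading it; nothing may rely on the buffer
// being zero at startup.
inline int fillAndSum() {
  int sum = 0;
  for (int i = 0; i < 4; ++i) {
    scratchBuffer[i] = i + 1;
    sum += scratchBuffer[i];
  }
  return sum;
}
```

On a hosted system this behaves the same as an ordinary global; the saving described in the message only shows up where startup code would otherwise have to zero the memory.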


# e08464fb 29-Feb-2020 Reid Kleckner <rnk@google.com>

Avoid including FileManager.h from SourceManager.h

Most clients of SourceManager.h need to do things like turning source
locations into file & line number pairs, but this doesn't require
bringing in FileManager.h and LLVM's FS headers.

The main code change here is to sink SM::createFileID into the cpp file.
I reason that this is not performance critical because it doesn't happen
on the diagnostic path, it happens along the paths of macro expansion
(could be hot) and new includes (less hot).

Saves some includes:
309 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileManager.h
272 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileSystemOptions.h
271 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/VirtualFileSystem.h
267 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/FileSystem.h
266 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Chrono.h

Differential Revision: https://reviews.llvm.org/D75406


# 22c457a8 05-Mar-2020 Yaxun (Sam) Liu <yaxun.liu@amd.com>

[HIP] Fix device stub name

HIP emits a device stub function for each kernel in host code.

The HIP debugger requires the device stub function to have a different unmangled name than the kernel.

Currently the name of the device stub function is the mangled name with a postfix .stub. However,
this does not work with the HIP debugger since the unmangled name is the same as the kernel.

This patch adds prefix __device__stub__ to the unmangled name of the device stub before mangling,
therefore the device stub function has a valid mangled name which is different than the device kernel
name. The device side kernel name is kept unchanged. Kernels with extern "C" also get the prefix added
to the corresponding device stub function.

Differential Revision: https://reviews.llvm.org/D68578
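
The difference between the two naming schemes can be shown with a toy mangler. The sketch below handles only a global function with no parameters, using the Itanium form `_Z<length><name>v`; the helper names are hypothetical and real mangling in Clang covers far more cases.

```cpp
#include <string>

// Toy Itanium-style mangler for a global function taking no parameters:
// `void foo()` becomes `_Z3foov`.
inline std::string mangleNullary(const std::string &name) {
  return "_Z" + std::to_string(name.size()) + name + "v";
}

// Old scheme per the message: the kernel's mangled name plus ".stub".
// Demangling the `_Z...` part still yields the kernel's own name.
inline std::string oldStubName(const std::string &kernel) {
  return mangleNullary(kernel) + ".stub";
}

// New scheme: add the prefix to the unmangled name first, then mangle,
// so the stub demangles to a name distinct from the kernel.
inline std::string newStubName(const std::string &kernel) {
  return mangleNullary("__device__stub__" + kernel);
}
```

For a kernel named `foo`, the old scheme gives `_Z3foov.stub` and the new one gives `_Z19__device__stub__foov`, since the prefixed name is 19 characters long.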


# 7b661608 09-Mar-2020 Erich Keane <erich.keane@intel.com>

Fix Target Multiversioning renaming.

The initial implementation only did 'first declaration renaming' when
a default version came after. This is insufficient in cases where a
default does not exist, so this patch makes sure that we do the renaming
in all cases.

This renaming is necessary because we emit the first declaration before
knowing that it IS a target multiversion function, which would change
its name. The second declaration (the one that caused the
multiversioning) then needs to make sure that the first one has its name
changed to be consistent with the resolver usage.


# 29e1a16b 05-Mar-2020 Yaxun (Sam) Liu <yaxun.liu@amd.com>

[NFC] Let mangler accept GlobalDecl

Differential Revision: https://reviews.llvm.org/D75700

