#
358c012d |
| 02-Jun-2017 |
David Blaikie <dblaikie@gmail.com> |
BitcodeWriter: Removing unnecessary std::function in favor of template
More cleanup from post-commit discussion on r304516
llvm-svn: 304579
|
#
b6b42e01 |
| 02-Jun-2017 |
David Blaikie <dblaikie@gmail.com> |
Tidy up a bit of r304516, use SmallVector::assign rather than for loop
This might give a few better opportunities to optimize these to memcpy rather than loops - also a few minor cleanups (StringRef
Tidy up a bit of r304516, use SmallVector::assign rather than for loop
This might give a few better opportunities to optimize these to memcpy rather than loops - also a few minor cleanups (StringRef-izing, templating (to avoid std::function indirection), etc).
The SmallVector::assign(iter, iter) could be improved with the use of SFINAE, but the (iter, iter) ctor and append(iter, iter) need it to and don't have it - so, workaround it for now rather than bothering with the added complexity.
(also, as noted in the added FIXME, these assign ops could potentially be optimized better at least for non-trivially-copyable types)
llvm-svn: 304566
show more ...
|
#
7a27b132 |
| 02-Jun-2017 |
Teresa Johnson <tejohnson@google.com> |
[ThinLTO] Efficiency improvement when writing module path string table
Summary: When writing the combined index, we are walking the entire module path StringMap in the full index, and checking wheth
[ThinLTO] Efficiency improvement when writing module path string table
Summary: When writing the combined index, we are walking the entire module path StringMap in the full index, and checking whether each one should be included in the index being written. For distributed backends, where we write an individual combined index for each file, each with only a few module paths, this is incredibly inefficient. Add a method that takes a callback and hides the details of whether we are writing the full combined index, or just a slice, and in the latter case it walks the set of modules to include instead of the entire index.
For a huge application with around 23K files (i.e. where we were iterating through the 23K-entry modulePath StringMap 23K times), this change improved the thin link time by a whopping 48%.
Reviewers: pcc
Subscribers: Prazek, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D33813
llvm-svn: 304516
show more ...
|
#
56584bbf |
| 01-Jun-2017 |
Evgeniy Stepanov <eugeni.stepanov@gmail.com> |
(NFC) Track global summary liveness in GVFlags.
Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of all the DeadSymbols sets. This is refactoring in order to make liveness informati
(NFC) Track global summary liveness in GVFlags.
Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of all the DeadSymbols sets. This is refactoring in order to make liveness information available in the RegularLTO pipeline.
llvm-svn: 304466
show more ...
|
#
a6a3fb57 |
| 31-May-2017 |
Teresa Johnson <tejohnson@google.com> |
[ThinLTO] Reduce unnecessary map lookups during combined summary write
Summary: Don't assign values to undefined references, simply don't emit those reference edges as they are not useful (we were a
[ThinLTO] Reduce unnecessary map lookups during combined summary write
Summary: Don't assign values to undefined references, simply don't emit those reference edges as they are not useful (we were already not emitting call edges to undefined refs).
Also, streamline the later lookup of value ids when writing the summaries, by combining the check for value id existence with the access of that value id.
Reviewers: pcc
Subscribers: Prazek, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D33634
llvm-svn: 304323
show more ...
|
Revision tags: llvmorg-4.0.1-rc2 |
|
#
2c26a185 |
| 26-May-2017 |
Peter Collingbourne <peter@pcc.me.uk> |
Bitcode: Remove some dead code. Spotted by Teresa.
Differential Revision: https://reviews.llvm.org/D33609
llvm-svn: 304046
|
#
8bf67fe9 |
| 23-May-2017 |
Reid Kleckner <rnk@google.com> |
[IR] Switch AttributeList to use an array for O(1) access
Summary: Before this change, AttributeLists stored a pair of index and AttributeSet. This is memory efficient if most arguments do not have
[IR] Switch AttributeList to use an array for O(1) access
Summary: Before this change, AttributeLists stored a pair of index and AttributeSet. This is memory efficient if most arguments do not have attributes. However, it requires doing a search over the pairs to test an argument or function attribute. Profiling shows that this loop was 0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly tests values for nullability.
This was worth about 2.5% of mid-level optimization cycles on the sqlite3 amalgamation. Here are the full perf results: https://reviews.llvm.org/P7995
Here are just the before and after cycle counts: ``` $ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null 13,274,181,184 cycles # 3.047 GHz ( +- 0.28% ) $ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null 12,906,927,263 cycles # 3.043 GHz ( +- 0.51% ) ```
This patch *does not* change the indices used to query attributes, as requested by reviewers. Tracking whether an index is usable for array indexing is a huge pain that affects many of the internal APIs, so it would be good to come back later and do a cleanup to remove this internal adjustment.
Reviewers: pete, chandlerc
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D32819
llvm-svn: 303654
show more ...
|
#
f3d7904d |
| 11-May-2017 |
Javed Absar <javed.absar@arm.com> |
[IR] Allow attributes with global variables
This patch extends llvm-ir to allow attributes to be set on global variables. An RFC was sent out earlier by my colleague James Molloy: http://lists.llvm.
[IR] Allow attributes with global variables
This patch extends llvm-ir to allow attributes to be set on global variables. An RFC was sent out earlier by my colleague James Molloy: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053100.html A key part of that proposal was to extend LLVM-IR to carry attributes on global variables. This generic feature could be useful for multiple purposes. In our present context, it would be useful to carry user specified sections for bss/rodata/data.
Reviewed by: Jonathan Roelofs, Reid Kleckner Differential Revision: https://reviews.llvm.org/D32009
llvm-svn: 302794
show more ...
|
#
9667b91b |
| 04-May-2017 |
Peter Collingbourne <peter@pcc.me.uk> |
Re-apply r302108, "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI." with a fix for the clang backend.
llvm-svn: 302176
|
#
f6039f25 |
| 04-May-2017 |
Eric Liu <ioeric@google.com> |
Revert "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI."
This reverts commit r302108. This causes crash in clang bootstrap with LTO.
Contacted the auther in the or
Revert "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI."
This reverts commit r302108. This causes crash in clang bootstrap with LTO.
Contacted the auther in the original commit.
llvm-svn: 302140
show more ...
|
#
5f85a9de |
| 04-May-2017 |
Peter Collingbourne <peter@pcc.me.uk> |
IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI.
When profiling a no-op incremental link of Chromium I found that the functions computeImportForFunction and computeD
IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI.
When profiling a no-op incremental link of Chromium I found that the functions computeImportForFunction and computeDeadSymbols were consuming roughly 10% of the profile. The goal of this change is to improve the performance of those functions by changing the map lookups that they were previously doing into pointer dereferences.
This is achieved by changing the ValueInfo data structure to be a pointer to an element of the global value map owned by ModuleSummaryIndex, and changing reference lists in the GlobalValueSummary to hold ValueInfos instead of GUIDs. This means that a ValueInfo will take a client directly to the summary list for a given GUID.
Differential Revision: https://reviews.llvm.org/D32471
llvm-svn: 302108
show more ...
|
#
7c2c4097 |
| 02-May-2017 |
Peter Collingbourne <peter@pcc.me.uk> |
Bitcode: Simplify how we enumerate summaries in the index. NFCI.
Instead of defining a custom iterator class, just use a function with a callback, which is much easier to understand and less error p
Bitcode: Simplify how we enumerate summaries in the index. NFCI.
Instead of defining a custom iterator class, just use a function with a callback, which is much easier to understand and less error prone.
Differential Revision: https://reviews.llvm.org/D32470
llvm-svn: 301942
show more ...
|
#
fed4f399 |
| 28-Apr-2017 |
Adrian Prantl <aprantl@apple.com> |
Remove line and file from DINamespace.
Fixes the issue highlighted in http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html.
The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces
Remove line and file from DINamespace.
Fixes the issue highlighted in http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html.
The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces can prevent LLVM from uniquing types that are in the same namespace. They also don't carry any meaningful information.
rdar://problem/17484998 Differential Revision: https://reviews.llvm.org/D32648
llvm-svn: 301706
show more ...
|
#
b19b57ea |
| 28-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Add speculatable function attribute
This attribute tells the optimizer that the function may be speculated.
Patch by Tom Stellard
llvm-svn: 301680
|
Revision tags: llvmorg-4.0.1-rc1 |
|
#
1d12b885 |
| 26-Apr-2017 |
Adrian Prantl <aprantl@apple.com> |
Add support for DW_TAG_thrown_type.
For Swift we would like to be able to encode the error types that a function may throw, so the debugger can display them alongside the function's return value whe
Add support for DW_TAG_thrown_type.
For Swift we would like to be able to encode the error types that a function may throw, so the debugger can display them alongside the function's return value when finish-ing a function.
DWARF defines DW_TAG_thrown_type (intended to be used for C++ throw() declarations) that is a perfect fit for this purpose. This patch wires up support for DW_TAG_thrown_type in LLVM by adding a list of thrown types to DISubprogram.
To offset the cost of the extra pointer, there is a follow-up patch that turns DISubprogram into a variable-length node.
rdar://problem/29481673
Differential Revision: https://reviews.llvm.org/D32559
llvm-svn: 301489
show more ...
|
#
63b26f0e |
| 24-Apr-2017 |
Reid Kleckner <rnk@google.com> |
Make getSlotAttributes return an AttributeSet instead of a wrapper list
Remove the temporary, poorly named getSlotSet method which did the same thing. Also remove getSlotNode, which is a hold-over f
Make getSlotAttributes return an AttributeSet instead of a wrapper list
Remove the temporary, poorly named getSlotSet method which did the same thing. Also remove getSlotNode, which is a hold-over from when we were dealing with AttributeSetNode* instead of AttributeSet.
llvm-svn: 301267
show more ...
|
#
b4a2d187 |
| 24-Apr-2017 |
Reid Kleckner <rnk@google.com> |
[Bitcode] Refactor attribute group writing to avoid getSlotAttributes
Summary: That API creates a temporary AttributeList to carry an index and a single AttributeSet. We need to carry the index in a
[Bitcode] Refactor attribute group writing to avoid getSlotAttributes
Summary: That API creates a temporary AttributeList to carry an index and a single AttributeSet. We need to carry the index in addition to the set, because that is how attribute groups are currently encoded.
NFC
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32262
llvm-svn: 301245
show more ...
|
#
6825fb64 |
| 18-Apr-2017 |
Adrian Prantl <aprantl@apple.com> |
PR32382: Fix emitting complex DWARF expressions.
The DWARF specification knows 3 kinds of non-empty simple location descriptions: 1. Register location descriptions - describe a variable in a regis
PR32382: Fix emitting complex DWARF expressions.
The DWARF specification knows 3 kinds of non-empty simple location descriptions: 1. Register location descriptions - describe a variable in a register - consist of only a DW_OP_reg 2. Memory location descriptions - describe the address of a variable 3. Implicit location descriptions - describe the value of a variable - end with DW_OP_stack_value & friends
The existing DwarfExpression code is pretty much ignorant of these restrictions. This used to not matter because we only emitted very short expressions that we happened to get right by accident. This patch makes DwarfExpression aware of the rules defined by the DWARF standard and now chooses the right kind of location description for each expression being emitted.
This would have been an NFC commit (for the existing testsuite) if not for the way that clang describes captured block variables. Based on how the previous code in LLVM emitted locations, DW_OP_deref operations that should have come at the end of the expression are put at its beginning. Fixing this means changing the semantics of DIExpression, so this patch bumps the version number of DIExpression and implements a bitcode upgrade.
There are two major changes in this patch:
I had to fix the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the *address* of a variable as the first argument, even if the argument is not an alloca.
When lowering a DBG_VALUE, the decision of whether to emit a register location description or a memory location description depends on the MachineLocation — register machine locations may get promoted to memory locations based on their DIExpression. (Future) optimization passes that want to salvage implicit debug location for variables may do so by appending a DW_OP_stack_value. For example: DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8 DBG_VALUE, RAX --> DW_OP_reg0 +0 DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0
All testcases that were modified were regenerated from clang. I also added source-based testcases for each of these to the debuginfo-tests repository over the last week to make sure that no synchronized bugs slip in. The debuginfo-tests compile from source and run the debugger.
https://bugs.llvm.org/show_bug.cgi?id=32382 <rdar://problem/31205000>
Differential Revision: https://reviews.llvm.org/D31439
llvm-svn: 300522
show more ...
|
#
a0f371a1 |
| 17-Apr-2017 |
Peter Collingbourne <peter@pcc.me.uk> |
Bitcode: Add a string table to the bitcode format.
Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDA
Bitcode: Add a string table to the bitcode format.
Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in the string table.
This change allows us to share names between globals and comdats as well as between modules, and improves the efficiency of loading bitcode files by no longer using a bit encoding for symbol names. Once we start writing the irsymtab to the bitcode file we will also be able to share strings between it and the module.
On my machine, link time for Chromium for Linux with ThinLTO decreases by about 7% for no-op incremental builds or about 1% for full builds. Total bitcode file size decreases by about 3%.
As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html
Differential Revision: https://reviews.llvm.org/D31838
llvm-svn: 300464
show more ...
|
#
927d8e61 |
| 12-Apr-2017 |
Chandler Carruth <chandlerc@gmail.com> |
[IR] Redesign the case iterator in SwitchInst to actually be an iterator and to expose a handle to represent the actual case rather than having the iterator return a reference to itself.
All of this
[IR] Redesign the case iterator in SwitchInst to actually be an iterator and to expose a handle to represent the actual case rather than having the iterator return a reference to itself.
All of this allows the iterator to be used with common STL facilities, standard algorithms, etc.
Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API.
Differential Revision: https://reviews.llvm.org/D31548
llvm-svn: 300032
show more ...
|
#
db4cafa6 |
| 06-Apr-2017 |
Peter Collingbourne <peter@pcc.me.uk> |
Bitcode: Do not create FNENTRYs for aliases of functions.
There doesn't seem to be any point in doing this.
Differential Revision: https://reviews.llvm.org/D31691
llvm-svn: 299694
|
#
e935296f |
| 05-Apr-2017 |
Peter Collingbourne <peter@pcc.me.uk> |
Bitcode: Remove an unused declaration. NFC.
llvm-svn: 299598
|
#
cd847a8f |
| 28-Mar-2017 |
Adam Nemet <anemet@apple.com> |
[IR] Add AllowContract to FastMathFlags
-ffp-contract=fast does not currently work with LTO because it's passed as a TargetOption to the backend rather than in the IR. This adds it to FastMathFlags.
[IR] Add AllowContract to FastMathFlags
-ffp-contract=fast does not currently work with LTO because it's passed as a TargetOption to the backend rather than in the IR. This adds it to FastMathFlags.
This is toward fixing PR25721
Differential Revision: https://reviews.llvm.org/D31164
llvm-svn: 298939
show more ...
|
#
0c6a4ff8 |
| 23-Mar-2017 |
Teresa Johnson <tejohnson@google.com> |
[ThinLTO] Add support for emitting minimized bitcode for thin link
Summary: The cumulative size of the bitcode files for a very large application can be huge, particularly with -g. In a distributed
[ThinLTO] Add support for emitting minimized bitcode for thin link
Summary: The cumulative size of the bitcode files for a very large application can be huge, particularly with -g. In a distributed build environment, all of these files must be sent to the remote build node that performs the thin link step, and this can exceed size limits.
The thin link actually only needs the summary along with a bitcode symbol table. Until we have a proper bitcode symbol table, simply stripping the debug metadata results in significant size reduction.
Add support for an option to additionally emit minimized bitcode modules, just for use in the thin link step, which for now just strips all debug metadata. I plan to add a cc1 option so this can be invoked easily during the compile step.
However, care must be taken to ensure that these minimized thin link bitcode files produce the same index as with the original bitcode files, as these original bitcode files will be used in the backends.
Specifically: 1) The module hash used for caching is typically produced by hashing the written bitcode, and we want to include the hash that would correspond to the original bitcode file. This is because we want to ensure that changes in the stripped portions affect caching. Added plumbing to emit the same module hash in the minimized thin link bitcode file. 2) The module paths in the index are constructed from the module ID of each thin linked bitcode, and typically is automatically generated from the input file path. This is the path used for finding the modules to import from, and obviously we need this to point to the original bitcode files. Added gold-plugin support to take a suffix replacement during the thin link that is used to override the identifier on the MemoryBufferRef constructed from the loaded thin link bitcode file. The assumption is that the build system can specify that the minimized bitcode file has a name that is similar but uses a different suffix (e.g. out.thinlink.bc instead of out.o).
Added various tests to ensure that we get identical index files out of the thin link step.
Reviewers: mehdi_amini, pcc
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D31027
llvm-svn: 298638
show more ...
|
#
b518054b |
| 21-Mar-2017 |
Reid Kleckner <rnk@google.com> |
Rename AttributeSet to AttributeList
Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttr
Rename AttributeSet to AttributeList
Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name.
Rename AttributeSetImpl to AttributeListImpl to follow suit.
It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value.
Reviewers: sanjoy, javed.absar, chandlerc, pete
Reviewed By: pete
Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
Differential Revision: https://reviews.llvm.org/D31102
llvm-svn: 298393
show more ...
|