#
e3ba652a |
| 03-Apr-2020 |
Wei Mi <wmi@google.com> |
[SampleFDO] Add flag for partial profile.
The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases
[SampleFDO] Add flag for partial profile.
The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them.
Differential Revision: https://reviews.llvm.org/D77426
show more ...
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
#
ebad6788 |
| 03-Mar-2020 |
Wei Mi <wmi@google.com> |
[SampleFDO] Port MD5 name table support to extbinary format.
Compbinary format uses MD5 to represent strings in name table. That gives smaller profile without the need of compression/decompression w
[SampleFDO] Port MD5 name table support to extbinary format.
Compbinary format uses MD5 to represent strings in name table. That gives smaller profile without the need of compression/decompression when writing/reading the profile. The patch adds the support in extbinary format. It is off by default but user can choose to enable it.
Note the feature of using MD5 in name table can bring very small chance of name conflict leading to profile mismatch. Besides, profile using the feature won't have the profile remapping support.
Differential Revision: https://reviews.llvm.org/D76255
show more ...
|
#
8404aeb5 |
| 14-Feb-2020 |
Alexandre Ganea <alexandre.ganea@ubisoft.com> |
[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, s
[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background == Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem == The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch == To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
show more ...
|
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1 |
|
#
adcd0268 |
| 28-Jan-2020 |
Benjamin Kramer <benny.kra@googlemail.com> |
Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with std::string_view. There should be no functional change here.
This is mostly m
Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
show more ...
|
#
01bfb366 |
| 20-Jan-2020 |
Yi Kong <yikong@google.com> |
[llvm-profdata] Fix hint message since argument format has changed
"-sample" option is now changed to "--sample".
|
Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
1f43ea41 |
| 21-Oct-2019 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
Prune Pass.h include from DataLayout.h. NFCI
Summary: Reduce include dependencies by no longer including Pass.h from DataLayout.h. That include seemed irrelevant to DataLayout, as well as being irre
Prune Pass.h include from DataLayout.h. NFCI
Summary: Reduce include dependencies by no longer including Pass.h from DataLayout.h. That include seemed irrelevant to DataLayout, as well as being irrelevant to several users of DataLayout.
Reviewers: rnk
Reviewed By: rnk
Subscribers: mehdi_amini, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69261
llvm-svn: 375436
show more ...
|
#
b523790a |
| 07-Oct-2019 |
Wei Mi <wmi@google.com> |
[SampleFDO] Add compression support for any section in ExtBinary profile format
Previously ExtBinary profile format only supports compression using zlib for profile symbol list. In this patch, we ex
[SampleFDO] Add compression support for any section in ExtBinary profile format
Previously ExtBinary profile format only supports compression using zlib for profile symbol list. In this patch, we extend the compression support to any section. User can select some or all of the sections to compress. In an experiment, for a 45M profile in ExtBinary format, compressing name table reduced its size to 24M, and compressing all the sections reduced its size to 11M.
Differential Revision: https://reviews.llvm.org/D68253
llvm-svn: 373914
show more ...
|
#
e0fa2689 |
| 01-Oct-2019 |
Rong Xu <xur@google.com> |
[PGO] Fix typos from r359612. NFC.
llvm-svn: 373369
|
#
eee532cd |
| 21-Sep-2019 |
Wei Mi <wmi@google.com> |
Recommit [SampleFDO] Expose an interface to return the size of a section or the size of the profile for profile in ExtBinary format.
Fix a test failure on Mac.
[SampleFDO] Expose an interface to re
Recommit [SampleFDO] Expose an interface to return the size of a section or the size of the profile for profile in ExtBinary format.
Fix a test failure on Mac.
[SampleFDO] Expose an interface to return the size of a section or the size of the profile for profile in ExtBinary format.
Sometimes we want to limit the size of the profile by stripping some functions with low sample count or by stripping some function names with small text size from profile symbol list. That requires the profile reader to have the interfaces returning the size of a section or the size of total profile. The patch add those interfaces.
At the same time, add some dump facility to show the size of each section.
Differential revision: https://reviews.llvm.org/D67726
llvm-svn: 372478
show more ...
|
#
3bb56fa4 |
| 21-Sep-2019 |
Amara Emerson <aemerson@apple.com> |
Revert "[SampleFDO] Expose an interface to return the size of a section or the size"
This reverts commit f118852046a1d255ed8c65c6b5db320e8cea53a0.
Broke the macOS build/greendragon bots.
llvm-svn:
Revert "[SampleFDO] Expose an interface to return the size of a section or the size"
This reverts commit f118852046a1d255ed8c65c6b5db320e8cea53a0.
Broke the macOS build/greendragon bots.
llvm-svn: 372464
show more ...
|
#
f1188520 |
| 20-Sep-2019 |
Wei Mi <wmi@google.com> |
[SampleFDO] Expose an interface to return the size of a section or the size of the profile for profile in ExtBinary format.
Sometimes we want to limit the size of the profile by stripping some funct
[SampleFDO] Expose an interface to return the size of a section or the size of the profile for profile in ExtBinary format.
Sometimes we want to limit the size of the profile by stripping some functions with low sample count or by stripping some function names with small text size from profile symbol list. That requires the profile reader to have the interfaces returning the size of a section or the size of total profile. The patch add those interfaces.
At the same time, add some dump facility to show the size of each section.
llvm-svn: 372439
show more ...
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4 |
|
#
0fcfe897 |
| 03-Sep-2019 |
Vedant Kumar <vsk@apple.com> |
[llvm-profdata] Add mode to recover from profile read failures
Add a mode in which profile read errors are not immediately treated as fatal. In this mode, merging makes forward progress and reports
[llvm-profdata] Add mode to recover from profile read failures
Add a mode in which profile read errors are not immediately treated as fatal. In this mode, merging makes forward progress and reports failure only if no inputs can be read.
Differential Revision: https://reviews.llvm.org/D66985
llvm-svn: 370827
show more ...
|
#
198009ae |
| 31-Aug-2019 |
Wei Mi <wmi@google.com> |
Fix some errors introduced by rL370563 which were not exposed on my local machine. 1. zlib::compress accept &size_t but the param is an uint64_t. 2. Some systems don't have zlib installed. Don't use
Fix some errors introduced by rL370563 which were not exposed on my local machine. 1. zlib::compress accept &size_t but the param is an uint64_t. 2. Some systems don't have zlib installed. Don't use compression by default.
llvm-svn: 370564
show more ...
|
#
798e59b8 |
| 31-Aug-2019 |
Wei Mi <wmi@google.com> |
[SampleFDO] Add profile symbol list section to discriminate function being cold versus function being newly added.
This is the second half of https://reviews.llvm.org/D66374.
Profile symbol list is
[SampleFDO] Add profile symbol list section to discriminate function being cold versus function being newly added.
This is the second half of https://reviews.llvm.org/D66374.
Profile symbol list is the collection of function symbols showing up in the binary which generates the current profile. It is used to discriminate function being cold versus function being newly added. Profile symbol list is only added for profile with ExtBinary format.
During profile use compilation, when profile-sample-accurate is enabled, a function without profile will be regarded as cold only when it is contained in that list.
Differential Revision: https://reviews.llvm.org/D66766
llvm-svn: 370563
show more ...
|
Revision tags: llvmorg-9.0.0-rc3 |
|
#
be907324 |
| 23-Aug-2019 |
Wei Mi <wmi@google.com> |
[SampleFDO] Add ExtBinary format to support extension of binary profile.
This is a patch split from https://reviews.llvm.org/D66374. It tries to add a new format of profile called ExtBinary. The for
[SampleFDO] Add ExtBinary format to support extension of binary profile.
This is a patch split from https://reviews.llvm.org/D66374. It tries to add a new format of profile called ExtBinary. The format adds a section header table to the profile and organize the profile in sections, so the future extension like adding a new section or extending an existing section will be easier while keeping backward compatiblity feasible.
Differential Revision: https://reviews.llvm.org/D66513
llvm-svn: 369798
show more ...
|
#
0eaee545 |
| 15-Aug-2019 |
Jonas Devlieghere <jonas@devlieghere.com> |
[llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of
[llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
show more ...
|
Revision tags: llvmorg-9.0.0-rc2 |
|
#
d9b948b6 |
| 05-Aug-2019 |
Fangrui Song <maskray@google.com> |
Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC
F_{None,Text,Append} are kept for compatibility since r334221.
llvm-svn: 367800
|
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
#
f0d3dcec |
| 08-Jul-2019 |
Rong Xu <xur@google.com> |
llvm-profdata] Handle the cases of overlapping input file and output file
Currently llvm-profdata does not expect the same file name for the input profile and the output profile. >llvm-profdata merg
llvm-profdata] Handle the cases of overlapping input file and output file
Currently llvm-profdata does not expect the same file name for the input profile and the output profile. >llvm-profdata merge A.profraw B.profraw -o B.profraw The above command runs successfully but the resulted B.profraw is not correct. This patch fixes the issue by moving the initialization of writer after loading the profile.
For the show command, the following will report a confusing error of "Empty raw profile file": >llvm-profdata show B.profraw -o B.profraw It's harder to fix as we need to output something before loading the input profile. I don't think that a fix for this is worth the effort. I just make the error explicit for the show command.
Differential Revision: https://reviews.llvm.org/D64360
llvm-svn: 365386
show more ...
|
Revision tags: llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
#
998b97f6 |
| 30-Apr-2019 |
Rong Xu <xur@google.com> |
[llvm-profdata] Add overlap command to compute similarity b/w two profile files
Add overlap functionality to llvm-profdata tool to compute the similarity between two profile files.
Differential Rev
[llvm-profdata] Add overlap command to compute similarity b/w two profile files
Add overlap functionality to llvm-profdata tool to compute the similarity between two profile files.
Differential Revision: https://reviews.llvm.org/D60977
llvm-svn: 359612
show more ...
|
#
8b0a15b0 |
| 15-Mar-2019 |
Fangrui Song <maskray@google.com> |
[llvm-profdata] Deleted unused Cutoffs added by D16005
llvm-svn: 356248
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4 |
|
#
a6ff69f6 |
| 28-Feb-2019 |
Rong Xu <xur@google.com> |
[PGO] Context sensitive PGO (part 2)
Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dep
[PGO] Context sensitive PGO (part 2)
Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch.
Differential Revision: https://reviews.llvm.org/D54175
llvm-svn: 355131
show more ...
|
Revision tags: llvmorg-8.0.0-rc3 |
|
#
ef598759 |
| 21-Feb-2019 |
Fangrui Song <maskray@google.com> |
Fix some include order and file headers issues. NFC
llvm-svn: 354550
|
Revision tags: llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2 |
|
#
40a7f63c |
| 05-Feb-2019 |
Petar Jovanovic <petar.jovanovic@mips.com> |
[PGO] Fix the type of the formated variable
Change the format type of Value to PRIu64 since it is a uint64_t. The problem was detected on mips boards building 32-bit binaries, where it was printing
[PGO] Fix the type of the formated variable
Change the format type of Value to PRIu64 since it is a uint64_t. The problem was detected on mips boards building 32-bit binaries, where it was printing junk values and causing test failure.
Patch by Milos Stojanovic.
Differential Revision: https://reviews.llvm.org/D57583
llvm-svn: 353194
show more ...
|
Revision tags: llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
#
52aa224a |
| 08-Jan-2019 |
Rong Xu <xur@google.com> |
[llvm-profdata] add value-cutoff functionality in show command
This patch improves llvm-profdata show command: (1) add -value-cutoff=<N> option: Show only those functions whose max count values
[llvm-profdata] add value-cutoff functionality in show command
This patch improves llvm-profdata show command: (1) add -value-cutoff=<N> option: Show only those functions whose max count values are greater or equal to N. (2) add -list-below-cutoff option: Only output names of functions whose max count value are below the cutoff. (3) formats value-profile counts and prints out percentage.
Differential Revision: https://reviews.llvm.org/D56342
llvm-svn: 350673
show more ...
|