<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Chapter 19. Profile Mode</title><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><meta name="keywords" content="C++, library, profile" /><meta name="keywords" content="ISO C++, library" /><meta name="keywords" content="ISO C++, runtime, library" /><link rel="home" href="../index.html" title="The GNU C++ Library" /><link rel="up" href="extensions.html" title="Part III. Extensions" /><link rel="prev" href="parallel_mode_test.html" title="Testing" /><link rel="next" href="profile_mode_design.html" title="Design" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter 19. Profile Mode</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="parallel_mode_test.html">Prev</a> </td><th width="60%" align="center">Part III. Extensions</th><td width="20%" align="right"> <a accesskey="n" href="profile_mode_design.html">Next</a></td></tr></table><hr /></div><div class="chapter"><div class="titlepage"><div><div><h2 class="title"><a id="manual.ext.profile_mode"></a>Chapter 19. 
Profile Mode</h2></div></div></div><div class="toc"><p><strong>Table of Contents</strong></p><dl class="toc"><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.intro">Intro</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.using">Using the Profile Mode</a></span></dt><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.tuning">Tuning the Profile Mode</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_design.html">Design</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.wrapper">Wrapper Model</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.instrumentation">Instrumentation</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.rtlib">Run Time Behavior</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.analysis">Analysis and Diagnostics</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.cost-model">Cost Model</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.reports">Reports</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.testing">Testing</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_api.html">Extensions for Custom Containers</a></span></dt><dt><span class="section"><a href="profile_mode_cost_model.html">Empirical Cost Model</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html">Implementation Issues</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.stack">Stack Traces</a></span></dt><dt><span class="section"><a 
href="profile_mode_impl.html#manual.ext.profile_mode.implementation.symbols">Symbolization of Instruction Addresses</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.concurrency">Concurrency</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.stdlib-in-proflib">Using the Standard Library in the Instrumentation Implementation</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.malloc-hooks">Malloc Hooks</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.construction-destruction">Construction and Destruction of Global Objects</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_devel.html">Developer Information</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_devel.html#manual.ext.profile_mode.developer.bigpic">Big Picture</a></span></dt><dt><span class="section"><a href="profile_mode_devel.html#manual.ext.profile_mode.developer.howto">How To Add A Diagnostic</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html">Diagnostics</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.template">Diagnostic Template</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.containers">Containers</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_too_small">Hashtable Too Small</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_too_large">Hashtable Too Large</a></span></dt><dt><span class="section"><a 
href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.inefficient_hash">Inefficient Hash</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_too_small">Vector Too Small</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_too_large">Vector Too Large</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_to_hashtable">Vector to Hashtable</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_to_vector">Hashtable to Vector</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_to_list">Vector to List</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.list_to_vector">List to Vector</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.list_to_slist">List to Forward List (Slist)</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.assoc_ord_to_unord">Ordered to Unordered Associative Container</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.algorithms">Algorithms</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.algorithms.sort">Sort Algorithm Performance</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality">Data Locality</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality.sw_prefetch">Need Software Prefetch</a></span></dt><dt><span class="section"><a 
href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality.linked">Linked Structure Locality</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread">Multithreaded Data Access</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread.ddtest">Data Dependence Violations at Container Level</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread.false_share">False Sharing</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.statistics">Statistics</a></span></dt></dl></dd><dt><span class="bibliography"><a href="profile_mode.html#profile_mode.biblio">Bibliography</a></span></dt></dl></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.profile_mode.intro"></a>Intro</h2></div></div></div><p><span class="emphasis"><em>Goal: </em></span>Give performance improvement advice based on recognition of suboptimal usage patterns of the standard library.</p><p><span class="emphasis"><em>Method: </em></span>Wrap the standard library code. Insert calls to an instrumentation library to record the internal state of various components at interesting entry/exit points to/from the standard library. Process the resulting trace, recognize suboptimal patterns, and give advice. For details, see the <a class="link" href="https://ieeexplore.ieee.org/document/4907670/" target="_top">Perflint paper presented at CGO 2009</a>. 
</p><p><span class="emphasis"><em>Strengths: </em></span></p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Non-intrusive solution. The application code does not require any modification.</p></li><li class="listitem"><p>The advice is call context sensitive, so it can precisely identify interesting dynamic performance behavior.</p></li><li class="listitem"><p>The overhead model is pay-per-view. When you turn off a diagnostic class at compile time, its overhead disappears.</p></li></ul></div><p> </p><p><span class="emphasis"><em>Drawbacks: </em></span></p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>You must recompile the application code with custom options.</p></li><li class="listitem"><p>You must run the application on representative input. The advice is input dependent.</p></li><li class="listitem"><p>The execution time will increase, in some cases severalfold. 
</p></li></ul></div><p> </p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.using"></a>Using the Profile Mode</h3></div></div></div><p>This is the anticipated common workflow for program <code class="code">foo.cc</code>:</p><pre class="programlisting">
$ cat foo.cc
#include &lt;vector&gt;
int main() {
  std::vector&lt;int&gt; v;
  for (int k = 0; k &lt; 1024; ++k) v.insert(v.begin(), k);
}

$ g++ -D_GLIBCXX_PROFILE foo.cc
$ ./a.out
$ cat libstdcxx-profile.txt
vector-to-list: improvement = 5: call stack = 0x804842c ...
    : advice = change std::vector to std::list
vector-size: improvement = 3: call stack = 0x804842c ...
    : advice = change initial container size from 0 to 1024
</pre><p> </p><p>Anatomy of a warning:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Warning id. This is a short descriptive string for the class that this warning belongs to, e.g., "vector-to-list".</p></li><li class="listitem"><p>Estimated improvement. This is an approximation of the benefit expected from implementing the change suggested by the warning. It is given on a log10 scale. Negative values mean that the alternative would actually do worse than the current choice. In the example above, 5 comes from the fact that the overhead of inserting at the beginning of a vector vs. a list is around 1024 * 1024 / 2, which is on the order of 10<sup>5</sup>. 
The improvement from setting the initial size to 1024 is in the range of 10<sup>3</sup>, since the overhead of dynamic resizing is linear in this case.</p></li><li class="listitem"><p>Call stack. Currently, the addresses are printed without symbol name or code location attribution. Users are expected to postprocess the output using, for instance, <code class="code">addr2line</code>.</p></li><li class="listitem"><p>The warning message. For some warnings, this is static text, e.g., "change vector to list". For other warnings, such as the vector-size warning above, the message contains numeric advice, e.g., the suggested initial size of the vector.</p></li></ul></div><p> </p><p>Three files are generated. <code class="code">libstdcxx-profile.txt</code> contains human readable advice. <code class="code">libstdcxx-profile.raw</code> contains implementation specific data about each diagnostic. Its format is not documented, but it is sufficient to generate all the advice given in <code class="code">libstdcxx-profile.txt</code>. The advantage of keeping this raw format is that traces from multiple executions can be aggregated simply by concatenating the raw traces. We intend to offer an external utility program that can issue advice from a trace. <code class="code">libstdcxx-profile.conf.out</code> lists the actual diagnostic parameters used. To alter parameters, edit this file and rename it to <code class="code">libstdcxx-profile.conf</code>.</p><p>Advice is given regardless of whether the transformation is valid. For instance, we advise changing a map to an unordered_map even if the application semantics require that data be ordered. 
We believe such warnings can help users understand the performance behavior of their application better, which can lead to changes at a higher abstraction level.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.tuning"></a>Tuning the Profile Mode</h3></div></div></div><p>Compile time switches and environment variables (see also file <code class="code">profiler.h</code>). Unless specified otherwise, they can be set at compile time using <code class="code">-D_&lt;name&gt;</code> or by setting variable <code class="code">&lt;name&gt;</code> in the environment where the program is run, before starting execution.</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="code">_GLIBCXX_PROFILE_NO_&lt;diagnostic&gt;</code>: disable specific diagnostics. See section Diagnostics for possible values. (Environment variables not supported.)</p></li><li class="listitem"><p><code class="code">_GLIBCXX_PROFILE_TRACE_PATH_ROOT</code>: set an alternative root path for the output files.</p></li><li class="listitem"><p><code class="code">_GLIBCXX_PROFILE_MAX_WARN_COUNT</code>: set it to the maximum number of warnings desired. The default value is 10.</p></li><li class="listitem"><p><code class="code">_GLIBCXX_PROFILE_MAX_STACK_DEPTH</code>: if set to 0, the advice will be collected and reported for the program as a whole, and not for each call context. This could also be used in continuous regression tests, where you just need to know whether there is a regression or not. The default value is 32. 
</p></li><li class="listitem"><p><code class="code">_GLIBCXX_PROFILE_MEM_PER_DIAGNOSTIC</code>: set a limit on how much memory to use for the accounting tables for each diagnostic type. When this limit is reached, new events are ignored until the memory usage drops below the limit. Generally, this means that newly created containers will not be instrumented until some live containers are destroyed. The default is 128 MB.</p></li><li class="listitem"><p><code class="code">_GLIBCXX_PROFILE_NO_THREADS</code>: make the library not use threads. If thread local storage (TLS) is not available, you will get a preprocessor error asking you to set <code class="code">-D_GLIBCXX_PROFILE_NO_THREADS</code> if your program is single-threaded. Multithreaded execution without TLS is not supported. (Environment variable not supported.)</p></li><li class="listitem"><p><code class="code">_GLIBCXX_HAVE_EXECINFO_H</code>: this name should be defined automatically at library configuration time. If your library was configured without <code class="code">execinfo.h</code>, but you have it in your include path, you can define it explicitly. Without it, advice is collected for the program as a whole, and not for each call context. (Environment variable not supported.) 
</p></li></ul></div><p> </p></div></div><div class="bibliography"><div class="titlepage"><div><div><h2 class="title"><a id="profile_mode.biblio"></a>Bibliography</h2></div></div></div><div class="biblioentry"><a id="id-1.3.5.6.9.2"></a><p><span class="citetitle"><em class="citetitle">Perflint: A Context Sensitive Performance Advisor for C++ Programs</em>. </span><span class="author"><span class="firstname">Lixia</span> <span class="surname">Liu</span>. </span><span class="author"><span class="firstname">Silvius</span> <span class="surname">Rus</span>. </span><span class="copyright">Copyright © 2009. </span><span class="publisher"><span class="publishername">Proceedings of the 2009 International Symposium on Code Generation and Optimization. </span></span></p></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="parallel_mode_test.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="extensions.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="profile_mode_design.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Testing </td><td width="20%" align="center"><a accesskey="h" href="../index.html">Home</a></td><td width="40%" align="right" valign="top"> Design</td></tr></table></div></body></html>