xref: /dflybsd-src/contrib/binutils-2.27/gprof/gprof.texi (revision e656dc90e3d65d744d534af2f5ea88cf8101ebcf)
1*a9fa9459Szrj\input texinfo @c -*-texinfo-*-
2*a9fa9459Szrj@setfilename gprof.info
3*a9fa9459Szrj@c Copyright (C) 1988-2016 Free Software Foundation, Inc.
4*a9fa9459Szrj@settitle GNU gprof
5*a9fa9459Szrj@setchapternewpage odd
6*a9fa9459Szrj
7*a9fa9459Szrj@c man begin INCLUDE
8*a9fa9459Szrj@include bfdver.texi
9*a9fa9459Szrj@c man end
10*a9fa9459Szrj
11*a9fa9459Szrj@ifnottex
12*a9fa9459Szrj@c This is a dir.info fragment to support semi-automated addition of
13*a9fa9459Szrj@c manuals to an info tree.  zoo@cygnus.com is developing this facility.
14*a9fa9459Szrj@dircategory Software development
15*a9fa9459Szrj@direntry
16*a9fa9459Szrj* gprof: (gprof).                Profiling your program's execution
17*a9fa9459Szrj@end direntry
18*a9fa9459Szrj@end ifnottex
19*a9fa9459Szrj
20*a9fa9459Szrj@copying
21*a9fa9459SzrjThis file documents the gprof profiler of the GNU system.
22*a9fa9459Szrj
23*a9fa9459Szrj@c man begin COPYRIGHT
24*a9fa9459SzrjCopyright @copyright{} 1988-2016 Free Software Foundation, Inc.
25*a9fa9459Szrj
26*a9fa9459SzrjPermission is granted to copy, distribute and/or modify this document
27*a9fa9459Szrjunder the terms of the GNU Free Documentation License, Version 1.3
28*a9fa9459Szrjor any later version published by the Free Software Foundation;
29*a9fa9459Szrjwith no Invariant Sections, with no Front-Cover Texts, and with no
30*a9fa9459SzrjBack-Cover Texts.  A copy of the license is included in the
31*a9fa9459Szrjsection entitled ``GNU Free Documentation License''.
32*a9fa9459Szrj
33*a9fa9459Szrj@c man end
34*a9fa9459Szrj@end copying
35*a9fa9459Szrj
36*a9fa9459Szrj@finalout
37*a9fa9459Szrj@smallbook
38*a9fa9459Szrj
39*a9fa9459Szrj@titlepage
40*a9fa9459Szrj@title GNU gprof
41*a9fa9459Szrj@subtitle The @sc{gnu} Profiler
42*a9fa9459Szrj@ifset VERSION_PACKAGE
43*a9fa9459Szrj@subtitle @value{VERSION_PACKAGE}
44*a9fa9459Szrj@end ifset
45*a9fa9459Szrj@subtitle Version @value{VERSION}
46*a9fa9459Szrj@author Jay Fenlason and Richard Stallman
47*a9fa9459Szrj
48*a9fa9459Szrj@page
49*a9fa9459Szrj
50*a9fa9459SzrjThis manual describes the @sc{gnu} profiler, @code{gprof}, and how you
51*a9fa9459Szrjcan use it to determine which parts of a program are taking most of the
52*a9fa9459Szrjexecution time.  We assume that you know how to write, compile, and
53*a9fa9459Szrjexecute programs.  @sc{gnu} @code{gprof} was written by Jay Fenlason.
54*a9fa9459SzrjEric S. Raymond made some minor corrections and additions in 2003.
55*a9fa9459Szrj
56*a9fa9459Szrj@vskip 0pt plus 1filll
57*a9fa9459SzrjCopyright @copyright{} 1988-2016 Free Software Foundation, Inc.
58*a9fa9459Szrj
59*a9fa9459Szrj      Permission is granted to copy, distribute and/or modify this document
60*a9fa9459Szrj      under the terms of the GNU Free Documentation License, Version 1.3
61*a9fa9459Szrj      or any later version published by the Free Software Foundation;
62*a9fa9459Szrj      with no Invariant Sections, with no Front-Cover Texts, and with no
63*a9fa9459Szrj      Back-Cover Texts.  A copy of the license is included in the
64*a9fa9459Szrj      section entitled ``GNU Free Documentation License''.
65*a9fa9459Szrj
66*a9fa9459Szrj@end titlepage
67*a9fa9459Szrj@contents
68*a9fa9459Szrj
69*a9fa9459Szrj@ifnottex
70*a9fa9459Szrj@node Top
71*a9fa9459Szrj@top Profiling a Program: Where Does It Spend Its Time?
72*a9fa9459Szrj
73*a9fa9459SzrjThis manual describes the @sc{gnu} profiler, @code{gprof}, and how you
74*a9fa9459Szrjcan use it to determine which parts of a program are taking most of the
75*a9fa9459Szrjexecution time.  We assume that you know how to write, compile, and
76*a9fa9459Szrjexecute programs.  @sc{gnu} @code{gprof} was written by Jay Fenlason.
77*a9fa9459Szrj
78*a9fa9459SzrjThis manual is for @code{gprof}
79*a9fa9459Szrj@ifset VERSION_PACKAGE
80*a9fa9459Szrj@value{VERSION_PACKAGE}
81*a9fa9459Szrj@end ifset
82*a9fa9459Szrjversion @value{VERSION}.
83*a9fa9459Szrj
84*a9fa9459SzrjThis document is distributed under the terms of the GNU Free
85*a9fa9459SzrjDocumentation License version 1.3.  A copy of the license is included
86*a9fa9459Szrjin the section entitled ``GNU Free Documentation License''.
87*a9fa9459Szrj
88*a9fa9459Szrj@menu
89*a9fa9459Szrj* Introduction::        What profiling means, and why it is useful.
90*a9fa9459Szrj
91*a9fa9459Szrj* Compiling::           How to compile your program for profiling.
92*a9fa9459Szrj* Executing::           Executing your program to generate profile data
93*a9fa9459Szrj* Invoking::            How to run @code{gprof}, and its options
94*a9fa9459Szrj
95*a9fa9459Szrj* Output::              Interpreting @code{gprof}'s output
96*a9fa9459Szrj
97*a9fa9459Szrj* Inaccuracy::          Potential problems you should be aware of
98*a9fa9459Szrj* How do I?::           Answers to common questions
99*a9fa9459Szrj* Incompatibilities::   (between @sc{gnu} @code{gprof} and Unix @code{gprof}.)
100*a9fa9459Szrj* Details::             Details of how profiling is done
101*a9fa9459Szrj* GNU Free Documentation License::  GNU Free Documentation License
102*a9fa9459Szrj@end menu
103*a9fa9459Szrj@end ifnottex
104*a9fa9459Szrj
105*a9fa9459Szrj@node Introduction
106*a9fa9459Szrj@chapter Introduction to Profiling
107*a9fa9459Szrj
108*a9fa9459Szrj@ifset man
109*a9fa9459Szrj@c man title gprof display call graph profile data
110*a9fa9459Szrj
111*a9fa9459Szrj@smallexample
112*a9fa9459Szrj@c man begin SYNOPSIS
113*a9fa9459Szrjgprof [ -[abcDhilLrsTvwxyz] ] [ -[ACeEfFJnNOpPqQZ][@var{name}] ]
114*a9fa9459Szrj [ -I @var{dirs} ] [ -d[@var{num}] ] [ -k @var{from/to} ]
115*a9fa9459Szrj [ -m @var{min-count} ] [ -R @var{map_file} ] [ -t @var{table-length} ]
116*a9fa9459Szrj [ --[no-]annotated-source[=@var{name}] ]
117*a9fa9459Szrj [ --[no-]exec-counts[=@var{name}] ]
118*a9fa9459Szrj [ --[no-]flat-profile[=@var{name}] ] [ --[no-]graph[=@var{name}] ]
119*a9fa9459Szrj [ --[no-]time=@var{name}] [ --all-lines ] [ --brief ]
120*a9fa9459Szrj [ --debug[=@var{level}] ] [ --function-ordering ]
121*a9fa9459Szrj [ --file-ordering @var{map_file} ] [ --directory-path=@var{dirs} ]
122*a9fa9459Szrj [ --display-unused-functions ] [ --file-format=@var{name} ]
123*a9fa9459Szrj [ --file-info ] [ --help ] [ --line ] [ --inline-file-names ]
124*a9fa9459Szrj [ --min-count=@var{n} ] [ --no-static ] [ --print-path ]
125*a9fa9459Szrj [ --separate-files ] [ --static-call-graph ] [ --sum ]
126*a9fa9459Szrj [ --table-length=@var{len} ] [ --traditional ] [ --version ]
127*a9fa9459Szrj [ --width=@var{n} ] [ --ignore-non-functions ]
128*a9fa9459Szrj [ --demangle[=@var{STYLE}] ] [ --no-demangle ]
129*a9fa9459Szrj [--external-symbol-table=name]
130*a9fa9459Szrj [ @var{image-file} ] [ @var{profile-file} @dots{} ]
131*a9fa9459Szrj@c man end
132*a9fa9459Szrj@end smallexample
133*a9fa9459Szrj
134*a9fa9459Szrj@c man begin DESCRIPTION
135*a9fa9459Szrj@code{gprof} produces an execution profile of C, Pascal, or Fortran77
136*a9fa9459Szrjprograms.  The effect of called routines is incorporated in the profile
137*a9fa9459Szrjof each caller.  The profile data is taken from the call graph profile file
138*a9fa9459Szrj(@file{gmon.out} default) which is created by programs
139*a9fa9459Szrjthat are compiled with the @samp{-pg} option of
140*a9fa9459Szrj@code{cc}, @code{pc}, and @code{f77}.
141*a9fa9459SzrjThe @samp{-pg} option also links in versions of the library routines
142*a9fa9459Szrjthat are compiled for profiling.  @code{Gprof} reads the given object
143*a9fa9459Szrjfile (the default is @code{a.out}) and establishes the relation between
144*a9fa9459Szrjits symbol table and the call graph profile from @file{gmon.out}.
145*a9fa9459SzrjIf more than one profile file is specified, the @code{gprof}
146*a9fa9459Szrjoutput shows the sum of the profile information in the given profile files.
147*a9fa9459Szrj
148*a9fa9459Szrj@code{Gprof} calculates the amount of time spent in each routine.
149*a9fa9459SzrjNext, these times are propagated along the edges of the call graph.
150*a9fa9459SzrjCycles are discovered, and calls into a cycle are made to share the time
151*a9fa9459Szrjof the cycle.
152*a9fa9459Szrj
153*a9fa9459Szrj@c man end
154*a9fa9459Szrj
155*a9fa9459Szrj@c man begin BUGS
156*a9fa9459SzrjThe granularity of the sampling is shown, but remains
157*a9fa9459Szrjstatistical at best.
158*a9fa9459SzrjWe assume that the time for each execution of a function
159*a9fa9459Szrjcan be expressed by the total time for the function divided
160*a9fa9459Szrjby the number of times the function is called.
161*a9fa9459SzrjThus the time propagated along the call graph arcs to the function's
162*a9fa9459Szrjparents is directly proportional to the number of times that
163*a9fa9459Szrjarc is traversed.
164*a9fa9459Szrj
165*a9fa9459SzrjParents that are not themselves profiled will have the time of
166*a9fa9459Szrjtheir profiled children propagated to them, but they will appear
167*a9fa9459Szrjto be spontaneously invoked in the call graph listing, and will
168*a9fa9459Szrjnot have their time propagated further.
169*a9fa9459SzrjSimilarly, signal catchers, even though profiled, will appear
170*a9fa9459Szrjto be spontaneous (although for more obscure reasons).
171*a9fa9459SzrjAny profiled children of signal catchers should have their times
172*a9fa9459Szrjpropagated properly, unless the signal catcher was invoked during
173*a9fa9459Szrjthe execution of the profiling routine, in which case all is lost.
174*a9fa9459Szrj
175*a9fa9459SzrjThe profiled program must call @code{exit}(2)
176*a9fa9459Szrjor return normally for the profiling information to be saved
177*a9fa9459Szrjin the @file{gmon.out} file.
178*a9fa9459Szrj@c man end
179*a9fa9459Szrj
180*a9fa9459Szrj@c man begin FILES
181*a9fa9459Szrj@table @code
182*a9fa9459Szrj@item @file{a.out}
183*a9fa9459Szrjthe namelist and text space.
184*a9fa9459Szrj@item @file{gmon.out}
185*a9fa9459Szrjdynamic call graph and profile.
186*a9fa9459Szrj@item @file{gmon.sum}
187*a9fa9459Szrjsummarized dynamic call graph and profile.
188*a9fa9459Szrj@end table
189*a9fa9459Szrj@c man end
190*a9fa9459Szrj
191*a9fa9459Szrj@c man begin SEEALSO
192*a9fa9459Szrjmonitor(3), profil(2), cc(1), prof(1), and the Info entry for @file{gprof}.
193*a9fa9459Szrj
194*a9fa9459Szrj``An Execution Profiler for Modular Programs'',
195*a9fa9459Szrjby S. Graham, P. Kessler, M. McKusick;
196*a9fa9459SzrjSoftware - Practice and Experience,
197*a9fa9459SzrjVol. 13, pp. 671-685, 1983.
198*a9fa9459Szrj
199*a9fa9459Szrj``gprof: A Call Graph Execution Profiler'',
200*a9fa9459Szrjby S. Graham, P. Kessler, M. McKusick;
201*a9fa9459SzrjProceedings of the SIGPLAN '82 Symposium on Compiler Construction,
202*a9fa9459SzrjSIGPLAN Notices, Vol. 17, No  6, pp. 120-126, June 1982.
203*a9fa9459Szrj@c man end
204*a9fa9459Szrj@end ifset
205*a9fa9459Szrj
206*a9fa9459SzrjProfiling allows you to learn where your program spent its time and which
207*a9fa9459Szrjfunctions called which other functions while it was executing.  This
208*a9fa9459Szrjinformation can show you which pieces of your program are slower than you
209*a9fa9459Szrjexpected, and might be candidates for rewriting to make your program
210*a9fa9459Szrjexecute faster.  It can also tell you which functions are being called more
211*a9fa9459Szrjor less often than you expected.  This may help you spot bugs that had
212*a9fa9459Szrjotherwise been unnoticed.
213*a9fa9459Szrj
214*a9fa9459SzrjSince the profiler uses information collected during the actual execution
215*a9fa9459Szrjof your program, it can be used on programs that are too large or too
216*a9fa9459Szrjcomplex to analyze by reading the source.  However, how your program is run
217*a9fa9459Szrjwill affect the information that shows up in the profile data.  If you
218*a9fa9459Szrjdon't use some feature of your program while it is being profiled, no
219*a9fa9459Szrjprofile information will be generated for that feature.
220*a9fa9459Szrj
221*a9fa9459SzrjProfiling has several steps:
222*a9fa9459Szrj
223*a9fa9459Szrj@itemize @bullet
224*a9fa9459Szrj@item
225*a9fa9459SzrjYou must compile and link your program with profiling enabled.
226*a9fa9459Szrj@xref{Compiling, ,Compiling a Program for Profiling}.
227*a9fa9459Szrj
228*a9fa9459Szrj@item
229*a9fa9459SzrjYou must execute your program to generate a profile data file.
230*a9fa9459Szrj@xref{Executing, ,Executing the Program}.
231*a9fa9459Szrj
232*a9fa9459Szrj@item
233*a9fa9459SzrjYou must run @code{gprof} to analyze the profile data.
234*a9fa9459Szrj@xref{Invoking, ,@code{gprof} Command Summary}.
235*a9fa9459Szrj@end itemize
236*a9fa9459Szrj
237*a9fa9459SzrjThe next three chapters explain these steps in greater detail.
238*a9fa9459Szrj
239*a9fa9459Szrj@c man begin DESCRIPTION
240*a9fa9459Szrj
241*a9fa9459SzrjSeveral forms of output are available from the analysis.
242*a9fa9459Szrj
243*a9fa9459SzrjThe @dfn{flat profile} shows how much time your program spent in each function,
244*a9fa9459Szrjand how many times that function was called.  If you simply want to know
245*a9fa9459Szrjwhich functions burn most of the cycles, it is stated concisely here.
246*a9fa9459Szrj@xref{Flat Profile, ,The Flat Profile}.
247*a9fa9459Szrj
248*a9fa9459SzrjThe @dfn{call graph} shows, for each function, which functions called it, which
249*a9fa9459Szrjother functions it called, and how many times.  There is also an estimate
250*a9fa9459Szrjof how much time was spent in the subroutines of each function.  This can
251*a9fa9459Szrjsuggest places where you might try to eliminate function calls that use a
252*a9fa9459Szrjlot of time.  @xref{Call Graph, ,The Call Graph}.
253*a9fa9459Szrj
254*a9fa9459SzrjThe @dfn{annotated source} listing is a copy of the program's
255*a9fa9459Szrjsource code, labeled with the number of times each line of the
256*a9fa9459Szrjprogram was executed.  @xref{Annotated Source, ,The Annotated Source
257*a9fa9459SzrjListing}.
258*a9fa9459Szrj@c man end
259*a9fa9459Szrj
260*a9fa9459SzrjTo better understand how profiling works, you may wish to read
261*a9fa9459Szrja description of its implementation.
262*a9fa9459Szrj@xref{Implementation, ,Implementation of Profiling}.
263*a9fa9459Szrj
264*a9fa9459Szrj@node Compiling
265*a9fa9459Szrj@chapter Compiling a Program for Profiling
266*a9fa9459Szrj
267*a9fa9459SzrjThe first step in generating profile information for your program is
268*a9fa9459Szrjto compile and link it with profiling enabled.
269*a9fa9459Szrj
270*a9fa9459SzrjTo compile a source file for profiling, specify the @samp{-pg} option when
271*a9fa9459Szrjyou run the compiler.  (This is in addition to the options you normally
272*a9fa9459Szrjuse.)
273*a9fa9459Szrj
274*a9fa9459SzrjTo link the program for profiling, if you use a compiler such as @code{cc}
275*a9fa9459Szrjto do the linking, simply specify @samp{-pg} in addition to your usual
276*a9fa9459Szrjoptions.  The same option, @samp{-pg}, alters either compilation or linking
277*a9fa9459Szrjto do what is necessary for profiling.  Here are examples:
278*a9fa9459Szrj
279*a9fa9459Szrj@example
280*a9fa9459Szrjcc -g -c myprog.c utils.c -pg
281*a9fa9459Szrjcc -o myprog myprog.o utils.o -pg
282*a9fa9459Szrj@end example
283*a9fa9459Szrj
284*a9fa9459SzrjThe @samp{-pg} option also works with a command that both compiles and links:
285*a9fa9459Szrj
286*a9fa9459Szrj@example
287*a9fa9459Szrjcc -o myprog myprog.c utils.c -g -pg
288*a9fa9459Szrj@end example
289*a9fa9459Szrj
290*a9fa9459SzrjNote: The @samp{-pg} option must be part of your compilation options
291*a9fa9459Szrjas well as your link options.  If it is not then no call-graph data
292*a9fa9459Szrjwill be gathered and when you run @code{gprof} you will get an error
293*a9fa9459Szrjmessage like this:
294*a9fa9459Szrj
295*a9fa9459Szrj@example
296*a9fa9459Szrjgprof: gmon.out file is missing call-graph data
297*a9fa9459Szrj@end example
298*a9fa9459Szrj
299*a9fa9459SzrjIf you add the @samp{-Q} switch to suppress the printing of the call
300*a9fa9459Szrjgraph data you will still be able to see the time samples:
301*a9fa9459Szrj
302*a9fa9459Szrj@example
303*a9fa9459SzrjFlat profile:
304*a9fa9459Szrj
305*a9fa9459SzrjEach sample counts as 0.01 seconds.
306*a9fa9459Szrj  %   cumulative   self              self     total
307*a9fa9459Szrj time   seconds   seconds    calls  Ts/call  Ts/call  name
308*a9fa9459Szrj 44.12      0.07     0.07                             zazLoop
309*a9fa9459Szrj 35.29      0.14     0.06                             main
310*a9fa9459Szrj 20.59      0.17     0.04                             bazMillion
311*a9fa9459Szrj@end example
312*a9fa9459Szrj
313*a9fa9459SzrjIf you run the linker @code{ld} directly instead of through a compiler
314*a9fa9459Szrjsuch as @code{cc}, you may have to specify a profiling startup file
315*a9fa9459Szrj@file{gcrt0.o} as the first input file instead of the usual startup
316*a9fa9459Szrjfile @file{crt0.o}.  In addition, you would probably want to
317*a9fa9459Szrjspecify the profiling C library, @file{libc_p.a}, by writing
318*a9fa9459Szrj@samp{-lc_p} instead of the usual @samp{-lc}.  This is not absolutely
319*a9fa9459Szrjnecessary, but doing this gives you number-of-calls information for
320*a9fa9459Szrjstandard library functions such as @code{read} and @code{open}.  For
321*a9fa9459Szrjexample:
322*a9fa9459Szrj
323*a9fa9459Szrj@example
324*a9fa9459Szrjld -o myprog /lib/gcrt0.o myprog.o utils.o -lc_p
325*a9fa9459Szrj@end example
326*a9fa9459Szrj
327*a9fa9459SzrjIf you are running the program on a system which supports shared
328*a9fa9459Szrjlibraries you may run into problems with the profiling support code in
329*a9fa9459Szrja shared library being called before that library has been fully
330*a9fa9459Szrjinitialised.  This is usually detected by the program encountering a
331*a9fa9459Szrjsegmentation fault as soon as it is run.  The solution is to link
332*a9fa9459Szrjagainst a static version of the library containing the profiling
333*a9fa9459Szrjsupport code, which for @code{gcc} users can be done via the
334*a9fa9459Szrj@samp{-static} or @samp{-static-libgcc} command line option.  For
335*a9fa9459Szrjexample:
336*a9fa9459Szrj
337*a9fa9459Szrj@example
338*a9fa9459Szrjgcc -g -pg -static-libgcc myprog.c utils.c -o myprog
339*a9fa9459Szrj@end example
340*a9fa9459Szrj
341*a9fa9459SzrjIf you compile only some of the modules of the program with @samp{-pg}, you
342*a9fa9459Szrjcan still profile the program, but you won't get complete information about
343*a9fa9459Szrjthe modules that were compiled without @samp{-pg}.  The only information
344*a9fa9459Szrjyou get for the functions in those modules is the total time spent in them;
345*a9fa9459Szrjthere is no record of how many times they were called, or from where.  This
346*a9fa9459Szrjwill not affect the flat profile (except that the @code{calls} field for
347*a9fa9459Szrjthe functions will be blank), but will greatly reduce the usefulness of the
348*a9fa9459Szrjcall graph.
349*a9fa9459Szrj
350*a9fa9459SzrjIf you wish to perform line-by-line profiling you should use the
351*a9fa9459Szrj@code{gcov} tool instead of @code{gprof}.  See that tool's manual or
352*a9fa9459Szrjinfo pages for more details of how to do this.
353*a9fa9459Szrj
354*a9fa9459SzrjNote, older versions of @code{gcc} produce line-by-line profiling
355*a9fa9459Szrjinformation that works with @code{gprof} rather than @code{gcov} so
356*a9fa9459Szrjthere is still support for displaying this kind of information in
357*a9fa9459Szrj@code{gprof}. @xref{Line-by-line, ,Line-by-line Profiling}.
358*a9fa9459Szrj
359*a9fa9459SzrjIt also worth noting that @code{gcc} implements a
360*a9fa9459Szrj@samp{-finstrument-functions} command line option which will insert
361*a9fa9459Szrjcalls to special user supplied instrumentation routines at the entry
362*a9fa9459Szrjand exit of every function in their program.  This can be used to
363*a9fa9459Szrjimplement an alternative profiling scheme.
364*a9fa9459Szrj
365*a9fa9459Szrj@node Executing
366*a9fa9459Szrj@chapter Executing the Program
367*a9fa9459Szrj
368*a9fa9459SzrjOnce the program is compiled for profiling, you must run it in order to
369*a9fa9459Szrjgenerate the information that @code{gprof} needs.  Simply run the program
370*a9fa9459Szrjas usual, using the normal arguments, file names, etc.  The program should
371*a9fa9459Szrjrun normally, producing the same output as usual.  It will, however, run
372*a9fa9459Szrjsomewhat slower than normal because of the time spent collecting and
373*a9fa9459Szrjwriting the profile data.
374*a9fa9459Szrj
375*a9fa9459SzrjThe way you run the program---the arguments and input that you give
376*a9fa9459Szrjit---may have a dramatic effect on what the profile information shows.  The
377*a9fa9459Szrjprofile data will describe the parts of the program that were activated for
378*a9fa9459Szrjthe particular input you use.  For example, if the first command you give
379*a9fa9459Szrjto your program is to quit, the profile data will show the time used in
380*a9fa9459Szrjinitialization and in cleanup, but not much else.
381*a9fa9459Szrj
382*a9fa9459SzrjYour program will write the profile data into a file called @file{gmon.out}
383*a9fa9459Szrjjust before exiting.  If there is already a file called @file{gmon.out},
384*a9fa9459Szrjits contents are overwritten.  There is currently no way to tell the
385*a9fa9459Szrjprogram to write the profile data under a different name, but you can rename
386*a9fa9459Szrjthe file afterwards if you are concerned that it may be overwritten.
387*a9fa9459Szrj
388*a9fa9459SzrjIn order to write the @file{gmon.out} file properly, your program must exit
389*a9fa9459Szrjnormally: by returning from @code{main} or by calling @code{exit}.  Calling
390*a9fa9459Szrjthe low-level function @code{_exit} does not write the profile data, and
391*a9fa9459Szrjneither does abnormal termination due to an unhandled signal.
392*a9fa9459Szrj
393*a9fa9459SzrjThe @file{gmon.out} file is written in the program's @emph{current working
394*a9fa9459Szrjdirectory} at the time it exits.  This means that if your program calls
395*a9fa9459Szrj@code{chdir}, the @file{gmon.out} file will be left in the last directory
396*a9fa9459Szrjyour program @code{chdir}'d to.  If you don't have permission to write in
397*a9fa9459Szrjthis directory, the file is not written, and you will get an error message.
398*a9fa9459Szrj
399*a9fa9459SzrjOlder versions of the @sc{gnu} profiling library may also write a file
400*a9fa9459Szrjcalled @file{bb.out}.  This file, if present, contains an human-readable
401*a9fa9459Szrjlisting of the basic-block execution counts.  Unfortunately, the
402*a9fa9459Szrjappearance of a human-readable @file{bb.out} means the basic-block
403*a9fa9459Szrjcounts didn't get written into @file{gmon.out}.
404*a9fa9459SzrjThe Perl script @code{bbconv.pl}, included with the @code{gprof}
405*a9fa9459Szrjsource distribution, will convert a @file{bb.out} file into
406*a9fa9459Szrja format readable by @code{gprof}.  Invoke it like this:
407*a9fa9459Szrj
408*a9fa9459Szrj@smallexample
409*a9fa9459Szrjbbconv.pl < bb.out > @var{bh-data}
410*a9fa9459Szrj@end smallexample
411*a9fa9459Szrj
412*a9fa9459SzrjThis translates the information in @file{bb.out} into a form that
413*a9fa9459Szrj@code{gprof} can understand.  But you still need to tell @code{gprof}
414*a9fa9459Szrjabout the existence of this translated information.  To do that, include
415*a9fa9459Szrj@var{bb-data} on the @code{gprof} command line, @emph{along with
416*a9fa9459Szrj@file{gmon.out}}, like this:
417*a9fa9459Szrj
418*a9fa9459Szrj@smallexample
419*a9fa9459Szrjgprof @var{options} @var{executable-file} gmon.out @var{bb-data} [@var{yet-more-profile-data-files}@dots{}] [> @var{outfile}]
420*a9fa9459Szrj@end smallexample
421*a9fa9459Szrj
422*a9fa9459Szrj@node Invoking
423*a9fa9459Szrj@chapter @code{gprof} Command Summary
424*a9fa9459Szrj
425*a9fa9459SzrjAfter you have a profile data file @file{gmon.out}, you can run @code{gprof}
426*a9fa9459Szrjto interpret the information in it.  The @code{gprof} program prints a
427*a9fa9459Szrjflat profile and a call graph on standard output.  Typically you would
428*a9fa9459Szrjredirect the output of @code{gprof} into a file with @samp{>}.
429*a9fa9459Szrj
430*a9fa9459SzrjYou run @code{gprof} like this:
431*a9fa9459Szrj
432*a9fa9459Szrj@smallexample
433*a9fa9459Szrjgprof @var{options} [@var{executable-file} [@var{profile-data-files}@dots{}]] [> @var{outfile}]
434*a9fa9459Szrj@end smallexample
435*a9fa9459Szrj
436*a9fa9459Szrj@noindent
437*a9fa9459SzrjHere square-brackets indicate optional arguments.
438*a9fa9459Szrj
439*a9fa9459SzrjIf you omit the executable file name, the file @file{a.out} is used.  If
440*a9fa9459Szrjyou give no profile data file name, the file @file{gmon.out} is used.  If
441*a9fa9459Szrjany file is not in the proper format, or if the profile data file does not
442*a9fa9459Szrjappear to belong to the executable file, an error message is printed.
443*a9fa9459Szrj
444*a9fa9459SzrjYou can give more than one profile data file by entering all their names
445*a9fa9459Szrjafter the executable file name; then the statistics in all the data files
446*a9fa9459Szrjare summed together.
447*a9fa9459Szrj
448*a9fa9459SzrjThe order of these options does not matter.
449*a9fa9459Szrj
450*a9fa9459Szrj@menu
451*a9fa9459Szrj* Output Options::      Controlling @code{gprof}'s output style
452*a9fa9459Szrj* Analysis Options::    Controlling how @code{gprof} analyzes its data
453*a9fa9459Szrj* Miscellaneous Options::
454*a9fa9459Szrj* Deprecated Options::  Options you no longer need to use, but which
455*a9fa9459Szrj                            have been retained for compatibility
456*a9fa9459Szrj* Symspecs::            Specifying functions to include or exclude
457*a9fa9459Szrj@end menu
458*a9fa9459Szrj
459*a9fa9459Szrj@node Output Options
460*a9fa9459Szrj@section Output Options
461*a9fa9459Szrj
462*a9fa9459Szrj@c man begin OPTIONS
463*a9fa9459SzrjThese options specify which of several output formats
464*a9fa9459Szrj@code{gprof} should produce.
465*a9fa9459Szrj
466*a9fa9459SzrjMany of these options take an optional @dfn{symspec} to specify
467*a9fa9459Szrjfunctions to be included or excluded.  These options can be
468*a9fa9459Szrjspecified multiple times, with different symspecs, to include
469*a9fa9459Szrjor exclude sets of symbols.  @xref{Symspecs, ,Symspecs}.
470*a9fa9459Szrj
471*a9fa9459SzrjSpecifying any of these options overrides the default (@samp{-p -q}),
472*a9fa9459Szrjwhich prints a flat profile and call graph analysis
473*a9fa9459Szrjfor all functions.
474*a9fa9459Szrj
475*a9fa9459Szrj@table @code
476*a9fa9459Szrj
477*a9fa9459Szrj@item -A[@var{symspec}]
478*a9fa9459Szrj@itemx --annotated-source[=@var{symspec}]
479*a9fa9459SzrjThe @samp{-A} option causes @code{gprof} to print annotated source code.
480*a9fa9459SzrjIf @var{symspec} is specified, print output only for matching symbols.
481*a9fa9459Szrj@xref{Annotated Source, ,The Annotated Source Listing}.
482*a9fa9459Szrj
483*a9fa9459Szrj@item -b
484*a9fa9459Szrj@itemx --brief
485*a9fa9459SzrjIf the @samp{-b} option is given, @code{gprof} doesn't print the
486*a9fa9459Szrjverbose blurbs that try to explain the meaning of all of the fields in
487*a9fa9459Szrjthe tables.  This is useful if you intend to print out the output, or
488*a9fa9459Szrjare tired of seeing the blurbs.
489*a9fa9459Szrj
490*a9fa9459Szrj@item -C[@var{symspec}]
491*a9fa9459Szrj@itemx --exec-counts[=@var{symspec}]
492*a9fa9459SzrjThe @samp{-C} option causes @code{gprof} to
493*a9fa9459Szrjprint a tally of functions and the number of times each was called.
494*a9fa9459SzrjIf @var{symspec} is specified, print tally only for matching symbols.
495*a9fa9459Szrj
496*a9fa9459SzrjIf the profile data file contains basic-block count records, specifying
497*a9fa9459Szrjthe @samp{-l} option, along with @samp{-C}, will cause basic-block
498*a9fa9459Szrjexecution counts to be tallied and displayed.
499*a9fa9459Szrj
500*a9fa9459Szrj@item -i
501*a9fa9459Szrj@itemx --file-info
502*a9fa9459SzrjThe @samp{-i} option causes @code{gprof} to display summary information
503*a9fa9459Szrjabout the profile data file(s) and then exit.  The number of histogram,
504*a9fa9459Szrjcall graph, and basic-block count records is displayed.
505*a9fa9459Szrj
506*a9fa9459Szrj@item -I @var{dirs}
507*a9fa9459Szrj@itemx --directory-path=@var{dirs}
508*a9fa9459SzrjThe @samp{-I} option specifies a list of search directories in
509*a9fa9459Szrjwhich to find source files.  Environment variable @var{GPROF_PATH}
510*a9fa9459Szrjcan also be used to convey this information.
511*a9fa9459SzrjUsed mostly for annotated source output.
512*a9fa9459Szrj
513*a9fa9459Szrj@item -J[@var{symspec}]
514*a9fa9459Szrj@itemx --no-annotated-source[=@var{symspec}]
515*a9fa9459SzrjThe @samp{-J} option causes @code{gprof} not to
516*a9fa9459Szrjprint annotated source code.
517*a9fa9459SzrjIf @var{symspec} is specified, @code{gprof} prints annotated source,
518*a9fa9459Szrjbut excludes matching symbols.
519*a9fa9459Szrj
520*a9fa9459Szrj@item -L
521*a9fa9459Szrj@itemx --print-path
522*a9fa9459SzrjNormally, source filenames are printed with the path
523*a9fa9459Szrjcomponent suppressed.  The @samp{-L} option causes @code{gprof}
524*a9fa9459Szrjto print the full pathname of
525*a9fa9459Szrjsource filenames, which is determined
526*a9fa9459Szrjfrom symbolic debugging information in the image file
527*a9fa9459Szrjand is relative to the directory in which the compiler
528*a9fa9459Szrjwas invoked.
529*a9fa9459Szrj
530*a9fa9459Szrj@item -p[@var{symspec}]
531*a9fa9459Szrj@itemx --flat-profile[=@var{symspec}]
532*a9fa9459SzrjThe @samp{-p} option causes @code{gprof} to print a flat profile.
533*a9fa9459SzrjIf @var{symspec} is specified, print flat profile only for matching symbols.
534*a9fa9459Szrj@xref{Flat Profile, ,The Flat Profile}.
535*a9fa9459Szrj
536*a9fa9459Szrj@item -P[@var{symspec}]
537*a9fa9459Szrj@itemx --no-flat-profile[=@var{symspec}]
538*a9fa9459SzrjThe @samp{-P} option causes @code{gprof} to suppress printing a flat profile.
539*a9fa9459SzrjIf @var{symspec} is specified, @code{gprof} prints a flat profile,
540*a9fa9459Szrjbut excludes matching symbols.
541*a9fa9459Szrj
542*a9fa9459Szrj@item -q[@var{symspec}]
543*a9fa9459Szrj@itemx --graph[=@var{symspec}]
544*a9fa9459SzrjThe @samp{-q} option causes @code{gprof} to print the call graph analysis.
545*a9fa9459SzrjIf @var{symspec} is specified, print call graph only for matching symbols
546*a9fa9459Szrjand their children.
547*a9fa9459Szrj@xref{Call Graph, ,The Call Graph}.
548*a9fa9459Szrj
549*a9fa9459Szrj@item -Q[@var{symspec}]
550*a9fa9459Szrj@itemx --no-graph[=@var{symspec}]
551*a9fa9459SzrjThe @samp{-Q} option causes @code{gprof} to suppress printing the
552*a9fa9459Szrjcall graph.
553*a9fa9459SzrjIf @var{symspec} is specified, @code{gprof} prints a call graph,
554*a9fa9459Szrjbut excludes matching symbols.
555*a9fa9459Szrj
556*a9fa9459Szrj@item -t
557*a9fa9459Szrj@itemx --table-length=@var{num}
558*a9fa9459SzrjThe @samp{-t} option causes the @var{num} most active source lines in
559*a9fa9459Szrjeach source file to be listed when source annotation is enabled.  The
560*a9fa9459Szrjdefault is 10.
561*a9fa9459Szrj
562*a9fa9459Szrj@item -y
563*a9fa9459Szrj@itemx --separate-files
564*a9fa9459SzrjThis option affects annotated source output only.
565*a9fa9459SzrjNormally, @code{gprof} prints annotated source files
566*a9fa9459Szrjto standard-output.  If this option is specified,
567*a9fa9459Szrjannotated source for a file named @file{path/@var{filename}}
568*a9fa9459Szrjis generated in the file @file{@var{filename}-ann}.  If the underlying
569*a9fa9459Szrjfile system would truncate @file{@var{filename}-ann} so that it
570*a9fa9459Szrjoverwrites the original @file{@var{filename}}, @code{gprof} generates
571*a9fa9459Szrjannotated source in the file @file{@var{filename}.ann} instead (if the
572*a9fa9459Szrjoriginal file name has an extension, that extension is @emph{replaced}
573*a9fa9459Szrjwith @file{.ann}).
574*a9fa9459Szrj
575*a9fa9459Szrj@item -Z[@var{symspec}]
576*a9fa9459Szrj@itemx --no-exec-counts[=@var{symspec}]
577*a9fa9459SzrjThe @samp{-Z} option causes @code{gprof} not to
578*a9fa9459Szrjprint a tally of functions and the number of times each was called.
579*a9fa9459SzrjIf @var{symspec} is specified, print tally, but exclude matching symbols.
580*a9fa9459Szrj
581*a9fa9459Szrj@item -r
582*a9fa9459Szrj@itemx --function-ordering
583*a9fa9459SzrjThe @samp{--function-ordering} option causes @code{gprof} to print a
584*a9fa9459Szrjsuggested function ordering for the program based on profiling data.
585*a9fa9459SzrjThis option suggests an ordering which may improve paging, tlb and
586*a9fa9459Szrjcache behavior for the program on systems which support arbitrary
587*a9fa9459Szrjordering of functions in an executable.
588*a9fa9459Szrj
589*a9fa9459SzrjThe exact details of how to force the linker to place functions
590*a9fa9459Szrjin a particular order is system dependent and out of the scope of this
591*a9fa9459Szrjmanual.
592*a9fa9459Szrj
593*a9fa9459Szrj@item -R @var{map_file}
594*a9fa9459Szrj@itemx --file-ordering @var{map_file}
595*a9fa9459SzrjThe @samp{--file-ordering} option causes @code{gprof} to print a
596*a9fa9459Szrjsuggested .o link line ordering for the program based on profiling data.
597*a9fa9459SzrjThis option suggests an ordering which may improve paging, tlb and
598*a9fa9459Szrjcache behavior for the program on systems which do not support arbitrary
599*a9fa9459Szrjordering of functions in an executable.
600*a9fa9459Szrj
601*a9fa9459SzrjUse of the @samp{-a} argument is highly recommended with this option.
602*a9fa9459Szrj
603*a9fa9459SzrjThe @var{map_file} argument is a pathname to a file which provides
604*a9fa9459Szrjfunction name to object file mappings.  The format of the file is similar to
605*a9fa9459Szrjthe output of the program @code{nm}.
606*a9fa9459Szrj
607*a9fa9459Szrj@smallexample
608*a9fa9459Szrj@group
609*a9fa9459Szrjc-parse.o:00000000 T yyparse
610*a9fa9459Szrjc-parse.o:00000004 C yyerrflag
611*a9fa9459Szrjc-lang.o:00000000 T maybe_objc_method_name
612*a9fa9459Szrjc-lang.o:00000000 T print_lang_statistics
613*a9fa9459Szrjc-lang.o:00000000 T recognize_objc_keyword
614*a9fa9459Szrjc-decl.o:00000000 T print_lang_identifier
615*a9fa9459Szrjc-decl.o:00000000 T print_lang_type
616*a9fa9459Szrj@dots{}
617*a9fa9459Szrj
618*a9fa9459Szrj@end group
619*a9fa9459Szrj@end smallexample
620*a9fa9459Szrj
621*a9fa9459SzrjTo create a @var{map_file} with @sc{gnu} @code{nm}, type a command like
622*a9fa9459Szrj@kbd{nm --extern-only --defined-only -v --print-file-name program-name}.
623*a9fa9459Szrj
624*a9fa9459Szrj@item -T
625*a9fa9459Szrj@itemx --traditional
626*a9fa9459SzrjThe @samp{-T} option causes @code{gprof} to print its output in
627*a9fa9459Szrj``traditional'' BSD style.
628*a9fa9459Szrj
629*a9fa9459Szrj@item -w @var{width}
630*a9fa9459Szrj@itemx --width=@var{width}
631*a9fa9459SzrjSets width of output lines to @var{width}.
632*a9fa9459SzrjCurrently only used when printing the function index at the bottom
633*a9fa9459Szrjof the call graph.
634*a9fa9459Szrj
635*a9fa9459Szrj@item -x
636*a9fa9459Szrj@itemx --all-lines
637*a9fa9459SzrjThis option affects annotated source output only.
638*a9fa9459SzrjBy default, only the lines at the beginning of a basic-block
639*a9fa9459Szrjare annotated.  If this option is specified, every line in
640*a9fa9459Szrja basic-block is annotated by repeating the annotation for the
641*a9fa9459Szrjfirst line.  This behavior is similar to @code{tcov}'s @samp{-a}.
642*a9fa9459Szrj
643*a9fa9459Szrj@item --demangle[=@var{style}]
644*a9fa9459Szrj@itemx --no-demangle
645*a9fa9459SzrjThese options control whether C++ symbol names should be demangled when
646*a9fa9459Szrjprinting output.  The default is to demangle symbols.  The
647*a9fa9459Szrj@code{--no-demangle} option may be used to turn off demangling. Different
648*a9fa9459Szrjcompilers have different mangling styles.  The optional demangling style
649*a9fa9459Szrjargument can be used to choose an appropriate demangling style for your
650*a9fa9459Szrjcompiler.
651*a9fa9459Szrj@end table
652*a9fa9459Szrj
653*a9fa9459Szrj@node Analysis Options
654*a9fa9459Szrj@section Analysis Options
655*a9fa9459Szrj
656*a9fa9459Szrj@table @code
657*a9fa9459Szrj
658*a9fa9459Szrj@item -a
659*a9fa9459Szrj@itemx --no-static
660*a9fa9459SzrjThe @samp{-a} option causes @code{gprof} to suppress the printing of
661*a9fa9459Szrjstatically declared (private) functions.  (These are functions whose
662*a9fa9459Szrjnames are not listed as global, and which are not visible outside the
663*a9fa9459Szrjfile/function/block where they were defined.)  Time spent in these
664*a9fa9459Szrjfunctions, calls to/from them, etc., will all be attributed to the
665*a9fa9459Szrjfunction that was loaded directly before it in the executable file.
666*a9fa9459Szrj@c This is compatible with Unix @code{gprof}, but a bad idea.
667*a9fa9459SzrjThis option affects both the flat profile and the call graph.
668*a9fa9459Szrj
669*a9fa9459Szrj@item -c
670*a9fa9459Szrj@itemx --static-call-graph
671*a9fa9459SzrjThe @samp{-c} option causes the call graph of the program to be
672*a9fa9459Szrjaugmented by a heuristic which examines the text space of the object
673*a9fa9459Szrjfile and identifies function calls in the binary machine code.
674*a9fa9459SzrjSince normal call graph records are only generated when functions are
675*a9fa9459Szrjentered, this option identifies children that could have been called,
676*a9fa9459Szrjbut never were.  Calls to functions that were not compiled with
677*a9fa9459Szrjprofiling enabled are also identified, but only if symbol table
678*a9fa9459Szrjentries are present for them.
679*a9fa9459SzrjCalls to dynamic library routines are typically @emph{not} found
680*a9fa9459Szrjby this option.
681*a9fa9459SzrjParents or children identified via this heuristic
682*a9fa9459Szrjare indicated in the call graph with call counts of @samp{0}.
683*a9fa9459Szrj
684*a9fa9459Szrj@item -D
685*a9fa9459Szrj@itemx --ignore-non-functions
686*a9fa9459SzrjThe @samp{-D} option causes @code{gprof} to ignore symbols which
687*a9fa9459Szrjare not known to be functions.  This option will give more accurate
688*a9fa9459Szrjprofile data on systems where it is supported (Solaris and HPUX for
689*a9fa9459Szrjexample).
690*a9fa9459Szrj
691*a9fa9459Szrj@item -k @var{from}/@var{to}
692*a9fa9459SzrjThe @samp{-k} option allows you to delete from the call graph any arcs from
693*a9fa9459Szrjsymbols matching symspec @var{from} to those matching symspec @var{to}.
694*a9fa9459Szrj
695*a9fa9459Szrj@item -l
696*a9fa9459Szrj@itemx --line
697*a9fa9459SzrjThe @samp{-l} option enables line-by-line profiling, which causes
698*a9fa9459Szrjhistogram hits to be charged to individual source code lines,
699*a9fa9459Szrjinstead of functions.  This feature only works with programs compiled
700*a9fa9459Szrjby older versions of the @code{gcc} compiler.  Newer versions of
701*a9fa9459Szrj@code{gcc} are designed to work with the @code{gcov} tool instead.
702*a9fa9459Szrj
703*a9fa9459SzrjIf the program was compiled with basic-block counting enabled,
704*a9fa9459Szrjthis option will also identify how many times each line of
705*a9fa9459Szrjcode was executed.
706*a9fa9459SzrjWhile line-by-line profiling can help isolate where in a large function
707*a9fa9459Szrja program is spending its time, it also significantly increases
708*a9fa9459Szrjthe running time of @code{gprof}, and magnifies statistical
709*a9fa9459Szrjinaccuracies.
710*a9fa9459Szrj@xref{Sampling Error, ,Statistical Sampling Error}.
711*a9fa9459Szrj
712*a9fa9459Szrj@item --inline-file-names
713*a9fa9459SzrjThis option causes @code{gprof} to print the source file after each
714*a9fa9459Szrjsymbol in both the flat profile and the call graph. The full path to the
715*a9fa9459Szrjfile is printed if used with the @samp{-L} option.
716*a9fa9459Szrj
717*a9fa9459Szrj@item -m @var{num}
718*a9fa9459Szrj@itemx --min-count=@var{num}
719*a9fa9459SzrjThis option affects execution count output only.
720*a9fa9459SzrjSymbols that are executed less than @var{num} times are suppressed.
721*a9fa9459Szrj
722*a9fa9459Szrj@item -n@var{symspec}
723*a9fa9459Szrj@itemx --time=@var{symspec}
724*a9fa9459SzrjThe @samp{-n} option causes @code{gprof}, in its call graph analysis,
725*a9fa9459Szrjto only propagate times for symbols matching @var{symspec}.
726*a9fa9459Szrj
727*a9fa9459Szrj@item -N@var{symspec}
728*a9fa9459Szrj@itemx --no-time=@var{symspec}
729*a9fa9459SzrjThe @samp{-n} option causes @code{gprof}, in its call graph analysis,
730*a9fa9459Szrjnot to propagate times for symbols matching @var{symspec}.
731*a9fa9459Szrj
732*a9fa9459Szrj@item -S@var{filename}
733*a9fa9459Szrj@itemx --external-symbol-table=@var{filename}
734*a9fa9459SzrjThe @samp{-S} option causes @code{gprof} to read an external symbol table
735*a9fa9459Szrjfile, such as @file{/proc/kallsyms}, rather than read the symbol table
736*a9fa9459Szrjfrom the given object file (the default is @code{a.out}). This is useful
737*a9fa9459Szrjfor profiling kernel modules.
738*a9fa9459Szrj
739*a9fa9459Szrj@item -z
740*a9fa9459Szrj@itemx --display-unused-functions
741*a9fa9459SzrjIf you give the @samp{-z} option, @code{gprof} will mention all
742*a9fa9459Szrjfunctions in the flat profile, even those that were never called, and
743*a9fa9459Szrjthat had no time spent in them.  This is useful in conjunction with the
744*a9fa9459Szrj@samp{-c} option for discovering which routines were never called.
745*a9fa9459Szrj
746*a9fa9459Szrj@end table
747*a9fa9459Szrj
748*a9fa9459Szrj@node Miscellaneous Options
749*a9fa9459Szrj@section Miscellaneous Options
750*a9fa9459Szrj
751*a9fa9459Szrj@table @code
752*a9fa9459Szrj
753*a9fa9459Szrj@item -d[@var{num}]
754*a9fa9459Szrj@itemx --debug[=@var{num}]
755*a9fa9459SzrjThe @samp{-d @var{num}} option specifies debugging options.
756*a9fa9459SzrjIf @var{num} is not specified, enable all debugging.
757*a9fa9459Szrj@xref{Debugging, ,Debugging @code{gprof}}.
758*a9fa9459Szrj
759*a9fa9459Szrj@item -h
760*a9fa9459Szrj@itemx --help
761*a9fa9459SzrjThe @samp{-h} option prints command line usage.
762*a9fa9459Szrj
763*a9fa9459Szrj@item -O@var{name}
764*a9fa9459Szrj@itemx --file-format=@var{name}
765*a9fa9459SzrjSelects the format of the profile data files.  Recognized formats are
766*a9fa9459Szrj@samp{auto} (the default), @samp{bsd}, @samp{4.4bsd}, @samp{magic}, and
767*a9fa9459Szrj@samp{prof} (not yet supported).
768*a9fa9459Szrj
769*a9fa9459Szrj@item -s
770*a9fa9459Szrj@itemx --sum
771*a9fa9459SzrjThe @samp{-s} option causes @code{gprof} to summarize the information
772*a9fa9459Szrjin the profile data files it read in, and write out a profile data
773*a9fa9459Szrjfile called @file{gmon.sum}, which contains all the information from
774*a9fa9459Szrjthe profile data files that @code{gprof} read in.  The file @file{gmon.sum}
775*a9fa9459Szrjmay be one of the specified input files; the effect of this is to
776*a9fa9459Szrjmerge the data in the other input files into @file{gmon.sum}.
777*a9fa9459Szrj
778*a9fa9459SzrjEventually you can run @code{gprof} again without @samp{-s} to analyze the
779*a9fa9459Szrjcumulative data in the file @file{gmon.sum}.
780*a9fa9459Szrj
781*a9fa9459Szrj@item -v
782*a9fa9459Szrj@itemx --version
783*a9fa9459SzrjThe @samp{-v} flag causes @code{gprof} to print the current version
784*a9fa9459Szrjnumber, and then exit.
785*a9fa9459Szrj
786*a9fa9459Szrj@end table
787*a9fa9459Szrj
788*a9fa9459Szrj@node Deprecated Options
789*a9fa9459Szrj@section Deprecated Options
790*a9fa9459Szrj
791*a9fa9459SzrjThese options have been replaced with newer versions that use symspecs.
792*a9fa9459Szrj
793*a9fa9459Szrj@table @code
794*a9fa9459Szrj
795*a9fa9459Szrj@item -e @var{function_name}
796*a9fa9459SzrjThe @samp{-e @var{function}} option tells @code{gprof} to not print
797*a9fa9459Szrjinformation about the function @var{function_name} (and its
798*a9fa9459Szrjchildren@dots{}) in the call graph.  The function will still be listed
799*a9fa9459Szrjas a child of any functions that call it, but its index number will be
800*a9fa9459Szrjshown as @samp{[not printed]}.  More than one @samp{-e} option may be
801*a9fa9459Szrjgiven; only one @var{function_name} may be indicated with each @samp{-e}
802*a9fa9459Szrjoption.
803*a9fa9459Szrj
804*a9fa9459Szrj@item -E @var{function_name}
805*a9fa9459SzrjThe @code{-E @var{function}} option works like the @code{-e} option, but
806*a9fa9459Szrjtime spent in the function (and children who were not called from
807*a9fa9459Szrjanywhere else), will not be used to compute the percentages-of-time for
808*a9fa9459Szrjthe call graph.  More than one @samp{-E} option may be given; only one
809*a9fa9459Szrj@var{function_name} may be indicated with each @samp{-E} option.
810*a9fa9459Szrj
811*a9fa9459Szrj@item -f @var{function_name}
812*a9fa9459SzrjThe @samp{-f @var{function}} option causes @code{gprof} to limit the
813*a9fa9459Szrjcall graph to the function @var{function_name} and its children (and
814*a9fa9459Szrjtheir children@dots{}).  More than one @samp{-f} option may be given;
815*a9fa9459Szrjonly one @var{function_name} may be indicated with each @samp{-f}
816*a9fa9459Szrjoption.
817*a9fa9459Szrj
818*a9fa9459Szrj@item -F @var{function_name}
819*a9fa9459SzrjThe @samp{-F @var{function}} option works like the @code{-f} option, but
820*a9fa9459Szrjonly time spent in the function and its children (and their
821*a9fa9459Szrjchildren@dots{}) will be used to determine total-time and
822*a9fa9459Szrjpercentages-of-time for the call graph.  More than one @samp{-F} option
823*a9fa9459Szrjmay be given; only one @var{function_name} may be indicated with each
824*a9fa9459Szrj@samp{-F} option.  The @samp{-F} option overrides the @samp{-E} option.
825*a9fa9459Szrj
826*a9fa9459Szrj@end table
827*a9fa9459Szrj
828*a9fa9459Szrj@c man end
829*a9fa9459Szrj
830*a9fa9459SzrjNote that only one function can be specified with each @code{-e},
831*a9fa9459Szrj@code{-E}, @code{-f} or @code{-F} option.  To specify more than one
832*a9fa9459Szrjfunction, use multiple options.  For example, this command:
833*a9fa9459Szrj
834*a9fa9459Szrj@example
835*a9fa9459Szrjgprof -e boring -f foo -f bar myprogram > gprof.output
836*a9fa9459Szrj@end example
837*a9fa9459Szrj
838*a9fa9459Szrj@noindent
839*a9fa9459Szrjlists in the call graph all functions that were reached from either
840*a9fa9459Szrj@code{foo} or @code{bar} and were not reachable from @code{boring}.
841*a9fa9459Szrj
842*a9fa9459Szrj@node Symspecs
843*a9fa9459Szrj@section Symspecs
844*a9fa9459Szrj
845*a9fa9459SzrjMany of the output options allow functions to be included or excluded
846*a9fa9459Szrjusing @dfn{symspecs} (symbol specifications), which observe the
847*a9fa9459Szrjfollowing syntax:
848*a9fa9459Szrj
849*a9fa9459Szrj@example
850*a9fa9459Szrj  filename_containing_a_dot
851*a9fa9459Szrj| funcname_not_containing_a_dot
852*a9fa9459Szrj| linenumber
853*a9fa9459Szrj| ( [ any_filename ] `:' ( any_funcname | linenumber ) )
854*a9fa9459Szrj@end example
855*a9fa9459Szrj
856*a9fa9459SzrjHere are some sample symspecs:
857*a9fa9459Szrj
858*a9fa9459Szrj@table @samp
859*a9fa9459Szrj@item main.c
860*a9fa9459SzrjSelects everything in file @file{main.c}---the
861*a9fa9459Szrjdot in the string tells @code{gprof} to interpret
862*a9fa9459Szrjthe string as a filename, rather than as
863*a9fa9459Szrja function name.  To select a file whose
864*a9fa9459Szrjname does not contain a dot, a trailing colon
865*a9fa9459Szrjshould be specified.  For example, @samp{odd:} is
866*a9fa9459Szrjinterpreted as the file named @file{odd}.
867*a9fa9459Szrj
868*a9fa9459Szrj@item main
869*a9fa9459SzrjSelects all functions named @samp{main}.
870*a9fa9459Szrj
871*a9fa9459SzrjNote that there may be multiple instances of the same function name
872*a9fa9459Szrjbecause some of the definitions may be local (i.e., static).  Unless a
873*a9fa9459Szrjfunction name is unique in a program, you must use the colon notation
874*a9fa9459Szrjexplained below to specify a function from a specific source file.
875*a9fa9459Szrj
876*a9fa9459SzrjSometimes, function names contain dots.  In such cases, it is necessary
877*a9fa9459Szrjto add a leading colon to the name.  For example, @samp{:.mul} selects
878*a9fa9459Szrjfunction @samp{.mul}.
879*a9fa9459Szrj
880*a9fa9459SzrjIn some object file formats, symbols have a leading underscore.
881*a9fa9459Szrj@code{gprof} will normally not print these underscores.  When you name a
882*a9fa9459Szrjsymbol in a symspec, you should type it exactly as @code{gprof} prints
883*a9fa9459Szrjit in its output.  For example, if the compiler produces a symbol
884*a9fa9459Szrj@samp{_main} from your @code{main} function, @code{gprof} still prints
885*a9fa9459Szrjit as @samp{main} in its output, so you should use @samp{main} in
886*a9fa9459Szrjsymspecs.
887*a9fa9459Szrj
888*a9fa9459Szrj@item main.c:main
889*a9fa9459SzrjSelects function @samp{main} in file @file{main.c}.
890*a9fa9459Szrj
891*a9fa9459Szrj@item main.c:134
892*a9fa9459SzrjSelects line 134 in file @file{main.c}.
893*a9fa9459Szrj@end table
894*a9fa9459Szrj
895*a9fa9459Szrj@node Output
896*a9fa9459Szrj@chapter Interpreting @code{gprof}'s Output
897*a9fa9459Szrj
898*a9fa9459Szrj@code{gprof} can produce several different output styles, the
899*a9fa9459Szrjmost important of which are described below.  The simplest output
900*a9fa9459Szrjstyles (file information, execution count, and function and file ordering)
901*a9fa9459Szrjare not described here, but are documented with the respective options
902*a9fa9459Szrjthat trigger them.
903*a9fa9459Szrj@xref{Output Options, ,Output Options}.
904*a9fa9459Szrj
905*a9fa9459Szrj@menu
906*a9fa9459Szrj* Flat Profile::        The flat profile shows how much time was spent
907*a9fa9459Szrj                            executing directly in each function.
908*a9fa9459Szrj* Call Graph::          The call graph shows which functions called which
909*a9fa9459Szrj                            others, and how much time each function used
910*a9fa9459Szrj                            when its subroutine calls are included.
911*a9fa9459Szrj* Line-by-line::        @code{gprof} can analyze individual source code lines
912*a9fa9459Szrj* Annotated Source::    The annotated source listing displays source code
913*a9fa9459Szrj                            labeled with execution counts
914*a9fa9459Szrj@end menu
915*a9fa9459Szrj
916*a9fa9459Szrj
917*a9fa9459Szrj@node Flat Profile
918*a9fa9459Szrj@section The Flat Profile
919*a9fa9459Szrj@cindex flat profile
920*a9fa9459Szrj
921*a9fa9459SzrjThe @dfn{flat profile} shows the total amount of time your program
922*a9fa9459Szrjspent executing each function.  Unless the @samp{-z} option is given,
923*a9fa9459Szrjfunctions with no apparent time spent in them, and no apparent calls
924*a9fa9459Szrjto them, are not mentioned.  Note that if a function was not compiled
925*a9fa9459Szrjfor profiling, and didn't run long enough to show up on the program
926*a9fa9459Szrjcounter histogram, it will be indistinguishable from a function that
927*a9fa9459Szrjwas never called.
928*a9fa9459Szrj
929*a9fa9459SzrjThis is part of a flat profile for a small program:
930*a9fa9459Szrj
931*a9fa9459Szrj@smallexample
932*a9fa9459Szrj@group
933*a9fa9459SzrjFlat profile:
934*a9fa9459Szrj
935*a9fa9459SzrjEach sample counts as 0.01 seconds.
936*a9fa9459Szrj  %   cumulative   self              self     total
937*a9fa9459Szrj time   seconds   seconds    calls  ms/call  ms/call  name
938*a9fa9459Szrj 33.34      0.02     0.02     7208     0.00     0.00  open
939*a9fa9459Szrj 16.67      0.03     0.01      244     0.04     0.12  offtime
940*a9fa9459Szrj 16.67      0.04     0.01        8     1.25     1.25  memccpy
941*a9fa9459Szrj 16.67      0.05     0.01        7     1.43     1.43  write
942*a9fa9459Szrj 16.67      0.06     0.01                             mcount
943*a9fa9459Szrj  0.00      0.06     0.00      236     0.00     0.00  tzset
944*a9fa9459Szrj  0.00      0.06     0.00      192     0.00     0.00  tolower
945*a9fa9459Szrj  0.00      0.06     0.00       47     0.00     0.00  strlen
946*a9fa9459Szrj  0.00      0.06     0.00       45     0.00     0.00  strchr
947*a9fa9459Szrj  0.00      0.06     0.00        1     0.00    50.00  main
948*a9fa9459Szrj  0.00      0.06     0.00        1     0.00     0.00  memcpy
949*a9fa9459Szrj  0.00      0.06     0.00        1     0.00    10.11  print
950*a9fa9459Szrj  0.00      0.06     0.00        1     0.00     0.00  profil
951*a9fa9459Szrj  0.00      0.06     0.00        1     0.00    50.00  report
952*a9fa9459Szrj@dots{}
953*a9fa9459Szrj@end group
954*a9fa9459Szrj@end smallexample
955*a9fa9459Szrj
956*a9fa9459Szrj@noindent
957*a9fa9459SzrjThe functions are sorted first by decreasing run-time spent in them,
958*a9fa9459Szrjthen by decreasing number of calls, then alphabetically by name.  The
959*a9fa9459Szrjfunctions @samp{mcount} and @samp{profil} are part of the profiling
960*a9fa9459Szrjapparatus and appear in every flat profile; their time gives a measure of
961*a9fa9459Szrjthe amount of overhead due to profiling.
962*a9fa9459Szrj
963*a9fa9459SzrjJust before the column headers, a statement appears indicating
964*a9fa9459Szrjhow much time each sample counted as.
965*a9fa9459SzrjThis @dfn{sampling period} estimates the margin of error in each of the time
966*a9fa9459Szrjfigures.  A time figure that is not much larger than this is not
967*a9fa9459Szrjreliable.  In this example, each sample counted as 0.01 seconds,
968*a9fa9459Szrjsuggesting a 100 Hz sampling rate.
969*a9fa9459SzrjThe program's total execution time was 0.06
970*a9fa9459Szrjseconds, as indicated by the @samp{cumulative seconds} field.  Since
971*a9fa9459Szrjeach sample counted for 0.01 seconds, this means only six samples
972*a9fa9459Szrjwere taken during the run.  Two of the samples occurred while the
973*a9fa9459Szrjprogram was in the @samp{open} function, as indicated by the
974*a9fa9459Szrj@samp{self seconds} field.  Each of the other four samples
975*a9fa9459Szrjoccurred one each in @samp{offtime}, @samp{memccpy}, @samp{write},
976*a9fa9459Szrjand @samp{mcount}.
977*a9fa9459SzrjSince only six samples were taken, none of these values can
978*a9fa9459Szrjbe regarded as particularly reliable.
979*a9fa9459SzrjIn another run,
980*a9fa9459Szrjthe @samp{self seconds} field for
981*a9fa9459Szrj@samp{mcount} might well be @samp{0.00} or @samp{0.02}.
982*a9fa9459Szrj@xref{Sampling Error, ,Statistical Sampling Error},
983*a9fa9459Szrjfor a complete discussion.
984*a9fa9459Szrj
985*a9fa9459SzrjThe remaining functions in the listing (those whose
986*a9fa9459Szrj@samp{self seconds} field is @samp{0.00}) didn't appear
987*a9fa9459Szrjin the histogram samples at all.  However, the call graph
988*a9fa9459Szrjindicated that they were called, so therefore they are listed,
989*a9fa9459Szrjsorted in decreasing order by the @samp{calls} field.
990*a9fa9459SzrjClearly some time was spent executing these functions,
991*a9fa9459Szrjbut the paucity of histogram samples prevents any
992*a9fa9459Szrjdetermination of how much time each took.
993*a9fa9459Szrj
994*a9fa9459SzrjHere is what the fields in each line mean:
995*a9fa9459Szrj
996*a9fa9459Szrj@table @code
997*a9fa9459Szrj@item % time
998*a9fa9459SzrjThis is the percentage of the total execution time your program spent
999*a9fa9459Szrjin this function.  These should all add up to 100%.
1000*a9fa9459Szrj
1001*a9fa9459Szrj@item cumulative seconds
1002*a9fa9459SzrjThis is the cumulative total number of seconds the computer spent
1003*a9fa9459Szrjexecuting this functions, plus the time spent in all the functions
1004*a9fa9459Szrjabove this one in this table.
1005*a9fa9459Szrj
1006*a9fa9459Szrj@item self seconds
1007*a9fa9459SzrjThis is the number of seconds accounted for by this function alone.
1008*a9fa9459SzrjThe flat profile listing is sorted first by this number.
1009*a9fa9459Szrj
1010*a9fa9459Szrj@item calls
1011*a9fa9459SzrjThis is the total number of times the function was called.  If the
1012*a9fa9459Szrjfunction was never called, or the number of times it was called cannot
1013*a9fa9459Szrjbe determined (probably because the function was not compiled with
1014*a9fa9459Szrjprofiling enabled), the @dfn{calls} field is blank.
1015*a9fa9459Szrj
1016*a9fa9459Szrj@item self ms/call
1017*a9fa9459SzrjThis represents the average number of milliseconds spent in this
1018*a9fa9459Szrjfunction per call, if this function is profiled.  Otherwise, this field
1019*a9fa9459Szrjis blank for this function.
1020*a9fa9459Szrj
1021*a9fa9459Szrj@item total ms/call
1022*a9fa9459SzrjThis represents the average number of milliseconds spent in this
1023*a9fa9459Szrjfunction and its descendants per call, if this function is profiled.
1024*a9fa9459SzrjOtherwise, this field is blank for this function.
1025*a9fa9459SzrjThis is the only field in the flat profile that uses call graph analysis.
1026*a9fa9459Szrj
1027*a9fa9459Szrj@item name
1028*a9fa9459SzrjThis is the name of the function.   The flat profile is sorted by this
1029*a9fa9459Szrjfield alphabetically after the @dfn{self seconds} and @dfn{calls}
1030*a9fa9459Szrjfields are sorted.
1031*a9fa9459Szrj@end table
1032*a9fa9459Szrj
1033*a9fa9459Szrj@node Call Graph
1034*a9fa9459Szrj@section The Call Graph
1035*a9fa9459Szrj@cindex call graph
1036*a9fa9459Szrj
1037*a9fa9459SzrjThe @dfn{call graph} shows how much time was spent in each function
1038*a9fa9459Szrjand its children.  From this information, you can find functions that,
1039*a9fa9459Szrjwhile they themselves may not have used much time, called other
1040*a9fa9459Szrjfunctions that did use unusual amounts of time.
1041*a9fa9459Szrj
1042*a9fa9459SzrjHere is a sample call from a small program.  This call came from the
1043*a9fa9459Szrjsame @code{gprof} run as the flat profile example in the previous
1044*a9fa9459Szrjsection.
1045*a9fa9459Szrj
1046*a9fa9459Szrj@smallexample
1047*a9fa9459Szrj@group
1048*a9fa9459Szrjgranularity: each sample hit covers 2 byte(s) for 20.00% of 0.05 seconds
1049*a9fa9459Szrj
1050*a9fa9459Szrjindex % time    self  children    called     name
1051*a9fa9459Szrj                                                 <spontaneous>
1052*a9fa9459Szrj[1]    100.0    0.00    0.05                 start [1]
1053*a9fa9459Szrj                0.00    0.05       1/1           main [2]
1054*a9fa9459Szrj                0.00    0.00       1/2           on_exit [28]
1055*a9fa9459Szrj                0.00    0.00       1/1           exit [59]
1056*a9fa9459Szrj-----------------------------------------------
1057*a9fa9459Szrj                0.00    0.05       1/1           start [1]
1058*a9fa9459Szrj[2]    100.0    0.00    0.05       1         main [2]
1059*a9fa9459Szrj                0.00    0.05       1/1           report [3]
1060*a9fa9459Szrj-----------------------------------------------
1061*a9fa9459Szrj                0.00    0.05       1/1           main [2]
1062*a9fa9459Szrj[3]    100.0    0.00    0.05       1         report [3]
1063*a9fa9459Szrj                0.00    0.03       8/8           timelocal [6]
1064*a9fa9459Szrj                0.00    0.01       1/1           print [9]
1065*a9fa9459Szrj                0.00    0.01       9/9           fgets [12]
1066*a9fa9459Szrj                0.00    0.00      12/34          strncmp <cycle 1> [40]
1067*a9fa9459Szrj                0.00    0.00       8/8           lookup [20]
1068*a9fa9459Szrj                0.00    0.00       1/1           fopen [21]
1069*a9fa9459Szrj                0.00    0.00       8/8           chewtime [24]
1070*a9fa9459Szrj                0.00    0.00       8/16          skipspace [44]
1071*a9fa9459Szrj-----------------------------------------------
1072*a9fa9459Szrj[4]     59.8    0.01        0.02       8+472     <cycle 2 as a whole> [4]
1073*a9fa9459Szrj                0.01        0.02     244+260         offtime <cycle 2> [7]
1074*a9fa9459Szrj                0.00        0.00     236+1           tzset <cycle 2> [26]
1075*a9fa9459Szrj-----------------------------------------------
1076*a9fa9459Szrj@end group
1077*a9fa9459Szrj@end smallexample
1078*a9fa9459Szrj
1079*a9fa9459SzrjThe lines full of dashes divide this table into @dfn{entries}, one for each
1080*a9fa9459Szrjfunction.  Each entry has one or more lines.
1081*a9fa9459Szrj
1082*a9fa9459SzrjIn each entry, the primary line is the one that starts with an index number
1083*a9fa9459Szrjin square brackets.  The end of this line says which function the entry is
1084*a9fa9459Szrjfor.  The preceding lines in the entry describe the callers of this
1085*a9fa9459Szrjfunction and the following lines describe its subroutines (also called
1086*a9fa9459Szrj@dfn{children} when we speak of the call graph).
1087*a9fa9459Szrj
1088*a9fa9459SzrjThe entries are sorted by time spent in the function and its subroutines.
1089*a9fa9459Szrj
1090*a9fa9459SzrjThe internal profiling function @code{mcount} (@pxref{Flat Profile, ,The
1091*a9fa9459SzrjFlat Profile}) is never mentioned in the call graph.
1092*a9fa9459Szrj
1093*a9fa9459Szrj@menu
1094*a9fa9459Szrj* Primary::       Details of the primary line's contents.
1095*a9fa9459Szrj* Callers::       Details of caller-lines' contents.
1096*a9fa9459Szrj* Subroutines::   Details of subroutine-lines' contents.
1097*a9fa9459Szrj* Cycles::        When there are cycles of recursion,
1098*a9fa9459Szrj                   such as @code{a} calls @code{b} calls @code{a}@dots{}
1099*a9fa9459Szrj@end menu
1100*a9fa9459Szrj
1101*a9fa9459Szrj@node Primary
1102*a9fa9459Szrj@subsection The Primary Line
1103*a9fa9459Szrj
1104*a9fa9459SzrjThe @dfn{primary line} in a call graph entry is the line that
1105*a9fa9459Szrjdescribes the function which the entry is about and gives the overall
1106*a9fa9459Szrjstatistics for this function.
1107*a9fa9459Szrj
1108*a9fa9459SzrjFor reference, we repeat the primary line from the entry for function
1109*a9fa9459Szrj@code{report} in our main example, together with the heading line that
1110*a9fa9459Szrjshows the names of the fields:
1111*a9fa9459Szrj
1112*a9fa9459Szrj@smallexample
1113*a9fa9459Szrj@group
1114*a9fa9459Szrjindex  % time    self  children called     name
1115*a9fa9459Szrj@dots{}
1116*a9fa9459Szrj[3]    100.0    0.00    0.05       1         report [3]
1117*a9fa9459Szrj@end group
1118*a9fa9459Szrj@end smallexample
1119*a9fa9459Szrj
1120*a9fa9459SzrjHere is what the fields in the primary line mean:
1121*a9fa9459Szrj
1122*a9fa9459Szrj@table @code
1123*a9fa9459Szrj@item index
1124*a9fa9459SzrjEntries are numbered with consecutive integers.  Each function
1125*a9fa9459Szrjtherefore has an index number, which appears at the beginning of its
1126*a9fa9459Szrjprimary line.
1127*a9fa9459Szrj
1128*a9fa9459SzrjEach cross-reference to a function, as a caller or subroutine of
1129*a9fa9459Szrjanother, gives its index number as well as its name.  The index number
1130*a9fa9459Szrjguides you if you wish to look for the entry for that function.
1131*a9fa9459Szrj
1132*a9fa9459Szrj@item % time
1133*a9fa9459SzrjThis is the percentage of the total time that was spent in this
1134*a9fa9459Szrjfunction, including time spent in subroutines called from this
1135*a9fa9459Szrjfunction.
1136*a9fa9459Szrj
1137*a9fa9459SzrjThe time spent in this function is counted again for the callers of
1138*a9fa9459Szrjthis function.  Therefore, adding up these percentages is meaningless.
1139*a9fa9459Szrj
1140*a9fa9459Szrj@item self
1141*a9fa9459SzrjThis is the total amount of time spent in this function.  This
1142*a9fa9459Szrjshould be identical to the number printed in the @code{seconds} field
1143*a9fa9459Szrjfor this function in the flat profile.
1144*a9fa9459Szrj
1145*a9fa9459Szrj@item children
1146*a9fa9459SzrjThis is the total amount of time spent in the subroutine calls made by
1147*a9fa9459Szrjthis function.  This should be equal to the sum of all the @code{self}
1148*a9fa9459Szrjand @code{children} entries of the children listed directly below this
1149*a9fa9459Szrjfunction.
1150*a9fa9459Szrj
1151*a9fa9459Szrj@item called
1152*a9fa9459SzrjThis is the number of times the function was called.
1153*a9fa9459Szrj
1154*a9fa9459SzrjIf the function called itself recursively, there are two numbers,
1155*a9fa9459Szrjseparated by a @samp{+}.  The first number counts non-recursive calls,
1156*a9fa9459Szrjand the second counts recursive calls.
1157*a9fa9459Szrj
1158*a9fa9459SzrjIn the example above, the function @code{report} was called once from
1159*a9fa9459Szrj@code{main}.
1160*a9fa9459Szrj
1161*a9fa9459Szrj@item name
1162*a9fa9459SzrjThis is the name of the current function.  The index number is
1163*a9fa9459Szrjrepeated after it.
1164*a9fa9459Szrj
1165*a9fa9459SzrjIf the function is part of a cycle of recursion, the cycle number is
1166*a9fa9459Szrjprinted between the function's name and the index number
1167*a9fa9459Szrj(@pxref{Cycles, ,How Mutually Recursive Functions Are Described}).
1168*a9fa9459SzrjFor example, if function @code{gnurr} is part of
1169*a9fa9459Szrjcycle number one, and has index number twelve, its primary line would
1170*a9fa9459Szrjbe end like this:
1171*a9fa9459Szrj
1172*a9fa9459Szrj@example
1173*a9fa9459Szrjgnurr <cycle 1> [12]
1174*a9fa9459Szrj@end example
1175*a9fa9459Szrj@end table
1176*a9fa9459Szrj
1177*a9fa9459Szrj@node Callers
1178*a9fa9459Szrj@subsection Lines for a Function's Callers
1179*a9fa9459Szrj
1180*a9fa9459SzrjA function's entry has a line for each function it was called by.
1181*a9fa9459SzrjThese lines' fields correspond to the fields of the primary line, but
1182*a9fa9459Szrjtheir meanings are different because of the difference in context.
1183*a9fa9459Szrj
1184*a9fa9459SzrjFor reference, we repeat two lines from the entry for the function
1185*a9fa9459Szrj@code{report}, the primary line and one caller-line preceding it, together
1186*a9fa9459Szrjwith the heading line that shows the names of the fields:
1187*a9fa9459Szrj
1188*a9fa9459Szrj@smallexample
1189*a9fa9459Szrjindex  % time    self  children called     name
1190*a9fa9459Szrj@dots{}
1191*a9fa9459Szrj                0.00    0.05       1/1           main [2]
1192*a9fa9459Szrj[3]    100.0    0.00    0.05       1         report [3]
1193*a9fa9459Szrj@end smallexample
1194*a9fa9459Szrj
1195*a9fa9459SzrjHere are the meanings of the fields in the caller-line for @code{report}
1196*a9fa9459Szrjcalled from @code{main}:
1197*a9fa9459Szrj
1198*a9fa9459Szrj@table @code
1199*a9fa9459Szrj@item self
1200*a9fa9459SzrjAn estimate of the amount of time spent in @code{report} itself when it was
1201*a9fa9459Szrjcalled from @code{main}.
1202*a9fa9459Szrj
1203*a9fa9459Szrj@item children
1204*a9fa9459SzrjAn estimate of the amount of time spent in subroutines of @code{report}
1205*a9fa9459Szrjwhen @code{report} was called from @code{main}.
1206*a9fa9459Szrj
1207*a9fa9459SzrjThe sum of the @code{self} and @code{children} fields is an estimate
1208*a9fa9459Szrjof the amount of time spent within calls to @code{report} from @code{main}.
1209*a9fa9459Szrj
1210*a9fa9459Szrj@item called
1211*a9fa9459SzrjTwo numbers: the number of times @code{report} was called from @code{main},
1212*a9fa9459Szrjfollowed by the total number of non-recursive calls to @code{report} from
1213*a9fa9459Szrjall its callers.
1214*a9fa9459Szrj
1215*a9fa9459Szrj@item name and index number
1216*a9fa9459SzrjThe name of the caller of @code{report} to which this line applies,
1217*a9fa9459Szrjfollowed by the caller's index number.
1218*a9fa9459Szrj
1219*a9fa9459SzrjNot all functions have entries in the call graph; some
1220*a9fa9459Szrjoptions to @code{gprof} request the omission of certain functions.
1221*a9fa9459SzrjWhen a caller has no entry of its own, it still has caller-lines
1222*a9fa9459Szrjin the entries of the functions it calls.
1223*a9fa9459Szrj
1224*a9fa9459SzrjIf the caller is part of a recursion cycle, the cycle number is
1225*a9fa9459Szrjprinted between the name and the index number.
1226*a9fa9459Szrj@end table
1227*a9fa9459Szrj
1228*a9fa9459SzrjIf the identity of the callers of a function cannot be determined, a
1229*a9fa9459Szrjdummy caller-line is printed which has @samp{<spontaneous>} as the
1230*a9fa9459Szrj``caller's name'' and all other fields blank.  This can happen for
1231*a9fa9459Szrjsignal handlers.
1232*a9fa9459Szrj@c What if some calls have determinable callers' names but not all?
1233*a9fa9459Szrj@c FIXME - still relevant?
1234*a9fa9459Szrj
1235*a9fa9459Szrj@node Subroutines
1236*a9fa9459Szrj@subsection Lines for a Function's Subroutines
1237*a9fa9459Szrj
1238*a9fa9459SzrjA function's entry has a line for each of its subroutines---in other
1239*a9fa9459Szrjwords, a line for each other function that it called.  These lines'
1240*a9fa9459Szrjfields correspond to the fields of the primary line, but their meanings
1241*a9fa9459Szrjare different because of the difference in context.
1242*a9fa9459Szrj
1243*a9fa9459SzrjFor reference, we repeat two lines from the entry for the function
1244*a9fa9459Szrj@code{main}, the primary line and a line for a subroutine, together
1245*a9fa9459Szrjwith the heading line that shows the names of the fields:
1246*a9fa9459Szrj
1247*a9fa9459Szrj@smallexample
1248*a9fa9459Szrjindex  % time    self  children called     name
1249*a9fa9459Szrj@dots{}
1250*a9fa9459Szrj[2]    100.0    0.00    0.05       1         main [2]
1251*a9fa9459Szrj                0.00    0.05       1/1           report [3]
1252*a9fa9459Szrj@end smallexample
1253*a9fa9459Szrj
1254*a9fa9459SzrjHere are the meanings of the fields in the subroutine-line for @code{main}
1255*a9fa9459Szrjcalling @code{report}:
1256*a9fa9459Szrj
1257*a9fa9459Szrj@table @code
1258*a9fa9459Szrj@item self
1259*a9fa9459SzrjAn estimate of the amount of time spent directly within @code{report}
1260*a9fa9459Szrjwhen @code{report} was called from @code{main}.
1261*a9fa9459Szrj
1262*a9fa9459Szrj@item children
1263*a9fa9459SzrjAn estimate of the amount of time spent in subroutines of @code{report}
1264*a9fa9459Szrjwhen @code{report} was called from @code{main}.
1265*a9fa9459Szrj
1266*a9fa9459SzrjThe sum of the @code{self} and @code{children} fields is an estimate
1267*a9fa9459Szrjof the total time spent in calls to @code{report} from @code{main}.
1268*a9fa9459Szrj
1269*a9fa9459Szrj@item called
1270*a9fa9459SzrjTwo numbers, the number of calls to @code{report} from @code{main}
1271*a9fa9459Szrjfollowed by the total number of non-recursive calls to @code{report}.
1272*a9fa9459SzrjThis ratio is used to determine how much of @code{report}'s @code{self}
1273*a9fa9459Szrjand @code{children} time gets credited to @code{main}.
1274*a9fa9459Szrj@xref{Assumptions, ,Estimating @code{children} Times}.
1275*a9fa9459Szrj
1276*a9fa9459Szrj@item name
1277*a9fa9459SzrjThe name of the subroutine of @code{main} to which this line applies,
1278*a9fa9459Szrjfollowed by the subroutine's index number.
1279*a9fa9459Szrj
1280*a9fa9459SzrjIf the caller is part of a recursion cycle, the cycle number is
1281*a9fa9459Szrjprinted between the name and the index number.
1282*a9fa9459Szrj@end table
1283*a9fa9459Szrj
1284*a9fa9459Szrj@node Cycles
1285*a9fa9459Szrj@subsection How Mutually Recursive Functions Are Described
1286*a9fa9459Szrj@cindex cycle
1287*a9fa9459Szrj@cindex recursion cycle
1288*a9fa9459Szrj
1289*a9fa9459SzrjThe graph may be complicated by the presence of @dfn{cycles of
1290*a9fa9459Szrjrecursion} in the call graph.  A cycle exists if a function calls
1291*a9fa9459Szrjanother function that (directly or indirectly) calls (or appears to
1292*a9fa9459Szrjcall) the original function.  For example: if @code{a} calls @code{b},
1293*a9fa9459Szrjand @code{b} calls @code{a}, then @code{a} and @code{b} form a cycle.
1294*a9fa9459Szrj
1295*a9fa9459SzrjWhenever there are call paths both ways between a pair of functions, they
1296*a9fa9459Szrjbelong to the same cycle.  If @code{a} and @code{b} call each other and
1297*a9fa9459Szrj@code{b} and @code{c} call each other, all three make one cycle.  Note that
1298*a9fa9459Szrjeven if @code{b} only calls @code{a} if it was not called from @code{a},
1299*a9fa9459Szrj@code{gprof} cannot determine this, so @code{a} and @code{b} are still
1300*a9fa9459Szrjconsidered a cycle.
1301*a9fa9459Szrj
1302*a9fa9459SzrjThe cycles are numbered with consecutive integers.  When a function
1303*a9fa9459Szrjbelongs to a cycle, each time the function name appears in the call graph
1304*a9fa9459Szrjit is followed by @samp{<cycle @var{number}>}.
1305*a9fa9459Szrj
1306*a9fa9459SzrjThe reason cycles matter is that they make the time values in the call
1307*a9fa9459Szrjgraph paradoxical.  The ``time spent in children'' of @code{a} should
1308*a9fa9459Szrjinclude the time spent in its subroutine @code{b} and in @code{b}'s
1309*a9fa9459Szrjsubroutines---but one of @code{b}'s subroutines is @code{a}!  How much of
1310*a9fa9459Szrj@code{a}'s time should be included in the children of @code{a}, when
1311*a9fa9459Szrj@code{a} is indirectly recursive?
1312*a9fa9459Szrj
1313*a9fa9459SzrjThe way @code{gprof} resolves this paradox is by creating a single entry
1314*a9fa9459Szrjfor the cycle as a whole.  The primary line of this entry describes the
1315*a9fa9459Szrjtotal time spent directly in the functions of the cycle.  The
1316*a9fa9459Szrj``subroutines'' of the cycle are the individual functions of the cycle, and
1317*a9fa9459Szrjall other functions that were called directly by them.  The ``callers'' of
1318*a9fa9459Szrjthe cycle are the functions, outside the cycle, that called functions in
1319*a9fa9459Szrjthe cycle.
1320*a9fa9459Szrj
1321*a9fa9459SzrjHere is an example portion of a call graph which shows a cycle containing
1322*a9fa9459Szrjfunctions @code{a} and @code{b}.  The cycle was entered by a call to
1323*a9fa9459Szrj@code{a} from @code{main}; both @code{a} and @code{b} called @code{c}.
1324*a9fa9459Szrj
1325*a9fa9459Szrj@smallexample
1326*a9fa9459Szrjindex  % time    self  children called     name
1327*a9fa9459Szrj----------------------------------------
1328*a9fa9459Szrj                 1.77        0    1/1        main [2]
1329*a9fa9459Szrj[3]     91.71    1.77        0    1+5    <cycle 1 as a whole> [3]
1330*a9fa9459Szrj                 1.02        0    3          b <cycle 1> [4]
1331*a9fa9459Szrj                 0.75        0    2          a <cycle 1> [5]
1332*a9fa9459Szrj----------------------------------------
1333*a9fa9459Szrj                                  3          a <cycle 1> [5]
1334*a9fa9459Szrj[4]     52.85    1.02        0    0      b <cycle 1> [4]
1335*a9fa9459Szrj                                  2          a <cycle 1> [5]
1336*a9fa9459Szrj                    0        0    3/6        c [6]
1337*a9fa9459Szrj----------------------------------------
1338*a9fa9459Szrj                 1.77        0    1/1        main [2]
1339*a9fa9459Szrj                                  2          b <cycle 1> [4]
1340*a9fa9459Szrj[5]     38.86    0.75        0    1      a <cycle 1> [5]
1341*a9fa9459Szrj                                  3          b <cycle 1> [4]
1342*a9fa9459Szrj                    0        0    3/6        c [6]
1343*a9fa9459Szrj----------------------------------------
1344*a9fa9459Szrj@end smallexample
1345*a9fa9459Szrj
1346*a9fa9459Szrj@noindent
1347*a9fa9459Szrj(The entire call graph for this program contains in addition an entry for
1348*a9fa9459Szrj@code{main}, which calls @code{a}, and an entry for @code{c}, with callers
1349*a9fa9459Szrj@code{a} and @code{b}.)
1350*a9fa9459Szrj
1351*a9fa9459Szrj@smallexample
1352*a9fa9459Szrjindex  % time    self  children called     name
1353*a9fa9459Szrj                                             <spontaneous>
1354*a9fa9459Szrj[1]    100.00       0     1.93    0      start [1]
1355*a9fa9459Szrj                 0.16     1.77    1/1        main [2]
1356*a9fa9459Szrj----------------------------------------
1357*a9fa9459Szrj                 0.16     1.77    1/1        start [1]
1358*a9fa9459Szrj[2]    100.00    0.16     1.77    1      main [2]
1359*a9fa9459Szrj                 1.77        0    1/1        a <cycle 1> [5]
1360*a9fa9459Szrj----------------------------------------
1361*a9fa9459Szrj                 1.77        0    1/1        main [2]
1362*a9fa9459Szrj[3]     91.71    1.77        0    1+5    <cycle 1 as a whole> [3]
1363*a9fa9459Szrj                 1.02        0    3          b <cycle 1> [4]
1364*a9fa9459Szrj                 0.75        0    2          a <cycle 1> [5]
1365*a9fa9459Szrj                    0        0    6/6        c [6]
1366*a9fa9459Szrj----------------------------------------
1367*a9fa9459Szrj                                  3          a <cycle 1> [5]
1368*a9fa9459Szrj[4]     52.85    1.02        0    0      b <cycle 1> [4]
1369*a9fa9459Szrj                                  2          a <cycle 1> [5]
1370*a9fa9459Szrj                    0        0    3/6        c [6]
1371*a9fa9459Szrj----------------------------------------
1372*a9fa9459Szrj                 1.77        0    1/1        main [2]
1373*a9fa9459Szrj                                  2          b <cycle 1> [4]
1374*a9fa9459Szrj[5]     38.86    0.75        0    1      a <cycle 1> [5]
1375*a9fa9459Szrj                                  3          b <cycle 1> [4]
1376*a9fa9459Szrj                    0        0    3/6        c [6]
1377*a9fa9459Szrj----------------------------------------
1378*a9fa9459Szrj                    0        0    3/6        b <cycle 1> [4]
1379*a9fa9459Szrj                    0        0    3/6        a <cycle 1> [5]
1380*a9fa9459Szrj[6]      0.00       0        0    6      c [6]
1381*a9fa9459Szrj----------------------------------------
1382*a9fa9459Szrj@end smallexample
1383*a9fa9459Szrj
1384*a9fa9459SzrjThe @code{self} field of the cycle's primary line is the total time
1385*a9fa9459Szrjspent in all the functions of the cycle.  It equals the sum of the
1386*a9fa9459Szrj@code{self} fields for the individual functions in the cycle, found
1387*a9fa9459Szrjin the entry in the subroutine lines for these functions.
1388*a9fa9459Szrj
1389*a9fa9459SzrjThe @code{children} fields of the cycle's primary line and subroutine lines
1390*a9fa9459Szrjcount only subroutines outside the cycle.  Even though @code{a} calls
1391*a9fa9459Szrj@code{b}, the time spent in those calls to @code{b} is not counted in
1392*a9fa9459Szrj@code{a}'s @code{children} time.  Thus, we do not encounter the problem of
1393*a9fa9459Szrjwhat to do when the time in those calls to @code{b} includes indirect
1394*a9fa9459Szrjrecursive calls back to @code{a}.
1395*a9fa9459Szrj
1396*a9fa9459SzrjThe @code{children} field of a caller-line in the cycle's entry estimates
1397*a9fa9459Szrjthe amount of time spent @emph{in the whole cycle}, and its other
1398*a9fa9459Szrjsubroutines, on the times when that caller called a function in the cycle.
1399*a9fa9459Szrj
1400*a9fa9459SzrjThe @code{called} field in the primary line for the cycle has two numbers:
1401*a9fa9459Szrjfirst, the number of times functions in the cycle were called by functions
1402*a9fa9459Szrjoutside the cycle; second, the number of times they were called by
1403*a9fa9459Szrjfunctions in the cycle (including times when a function in the cycle calls
1404*a9fa9459Szrjitself).  This is a generalization of the usual split into non-recursive and
1405*a9fa9459Szrjrecursive calls.
1406*a9fa9459Szrj
1407*a9fa9459SzrjThe @code{called} field of a subroutine-line for a cycle member in the
1408*a9fa9459Szrjcycle's entry says how many time that function was called from functions in
1409*a9fa9459Szrjthe cycle.  The total of all these is the second number in the primary line's
1410*a9fa9459Szrj@code{called} field.
1411*a9fa9459Szrj
1412*a9fa9459SzrjIn the individual entry for a function in a cycle, the other functions in
1413*a9fa9459Szrjthe same cycle can appear as subroutines and as callers.  These lines show
1414*a9fa9459Szrjhow many times each function in the cycle called or was called from each other
1415*a9fa9459Szrjfunction in the cycle.  The @code{self} and @code{children} fields in these
1416*a9fa9459Szrjlines are blank because of the difficulty of defining meanings for them
1417*a9fa9459Szrjwhen recursion is going on.
1418*a9fa9459Szrj
1419*a9fa9459Szrj@node Line-by-line
1420*a9fa9459Szrj@section Line-by-line Profiling
1421*a9fa9459Szrj
1422*a9fa9459Szrj@code{gprof}'s @samp{-l} option causes the program to perform
1423*a9fa9459Szrj@dfn{line-by-line} profiling.  In this mode, histogram
1424*a9fa9459Szrjsamples are assigned not to functions, but to individual
1425*a9fa9459Szrjlines of source code.  This only works with programs compiled with
1426*a9fa9459Szrjolder versions of the @code{gcc} compiler.  Newer versions of @code{gcc}
1427*a9fa9459Szrjuse a different program - @code{gcov} - to display line-by-line
1428*a9fa9459Szrjprofiling information.
1429*a9fa9459Szrj
1430*a9fa9459SzrjWith the older versions of @code{gcc} the program usually has to be
1431*a9fa9459Szrjcompiled with a @samp{-g} option, in addition to @samp{-pg}, in order
1432*a9fa9459Szrjto generate debugging symbols for tracking source code lines.
1433*a9fa9459SzrjNote, in much older versions of @code{gcc} the program had to be
1434*a9fa9459Szrjcompiled with the @samp{-a} command line option as well.
1435*a9fa9459Szrj
1436*a9fa9459SzrjThe flat profile is the most useful output table
1437*a9fa9459Szrjin line-by-line mode.
1438*a9fa9459SzrjThe call graph isn't as useful as normal, since
1439*a9fa9459Szrjthe current version of @code{gprof} does not propagate
1440*a9fa9459Szrjcall graph arcs from source code lines to the enclosing function.
1441*a9fa9459SzrjThe call graph does, however, show each line of code
1442*a9fa9459Szrjthat called each function, along with a count.
1443*a9fa9459Szrj
1444*a9fa9459SzrjHere is a section of @code{gprof}'s output, without line-by-line profiling.
1445*a9fa9459SzrjNote that @code{ct_init} accounted for four histogram hits, and
1446*a9fa9459Szrj13327 calls to @code{init_block}.
1447*a9fa9459Szrj
1448*a9fa9459Szrj@smallexample
1449*a9fa9459SzrjFlat profile:
1450*a9fa9459Szrj
1451*a9fa9459SzrjEach sample counts as 0.01 seconds.
1452*a9fa9459Szrj  %   cumulative   self              self     total
1453*a9fa9459Szrj time   seconds   seconds    calls  us/call  us/call  name
1454*a9fa9459Szrj 30.77      0.13     0.04     6335     6.31     6.31  ct_init
1455*a9fa9459Szrj
1456*a9fa9459Szrj
1457*a9fa9459Szrj		     Call graph (explanation follows)
1458*a9fa9459Szrj
1459*a9fa9459Szrj
1460*a9fa9459Szrjgranularity: each sample hit covers 4 byte(s) for 7.69% of 0.13 seconds
1461*a9fa9459Szrj
1462*a9fa9459Szrjindex % time    self  children    called     name
1463*a9fa9459Szrj
1464*a9fa9459Szrj                0.00    0.00       1/13496       name_too_long
1465*a9fa9459Szrj                0.00    0.00      40/13496       deflate
1466*a9fa9459Szrj                0.00    0.00     128/13496       deflate_fast
1467*a9fa9459Szrj                0.00    0.00   13327/13496       ct_init
1468*a9fa9459Szrj[7]      0.0    0.00    0.00   13496         init_block
1469*a9fa9459Szrj
1470*a9fa9459Szrj@end smallexample
1471*a9fa9459Szrj
1472*a9fa9459SzrjNow let's look at some of @code{gprof}'s output from the same program run,
1473*a9fa9459Szrjthis time with line-by-line profiling enabled.  Note that @code{ct_init}'s
1474*a9fa9459Szrjfour histogram hits are broken down into four lines of source code---one hit
1475*a9fa9459Szrjoccurred on each of lines 349, 351, 382 and 385.  In the call graph,
1476*a9fa9459Szrjnote how
1477*a9fa9459Szrj@code{ct_init}'s 13327 calls to @code{init_block} are broken down
1478*a9fa9459Szrjinto one call from line 396, 3071 calls from line 384, 3730 calls
1479*a9fa9459Szrjfrom line 385, and 6525 calls from 387.
1480*a9fa9459Szrj
1481*a9fa9459Szrj@smallexample
1482*a9fa9459SzrjFlat profile:
1483*a9fa9459Szrj
1484*a9fa9459SzrjEach sample counts as 0.01 seconds.
1485*a9fa9459Szrj  %   cumulative   self
1486*a9fa9459Szrj time   seconds   seconds    calls  name
1487*a9fa9459Szrj  7.69      0.10     0.01           ct_init (trees.c:349)
1488*a9fa9459Szrj  7.69      0.11     0.01           ct_init (trees.c:351)
1489*a9fa9459Szrj  7.69      0.12     0.01           ct_init (trees.c:382)
1490*a9fa9459Szrj  7.69      0.13     0.01           ct_init (trees.c:385)
1491*a9fa9459Szrj
1492*a9fa9459Szrj
1493*a9fa9459Szrj		     Call graph (explanation follows)
1494*a9fa9459Szrj
1495*a9fa9459Szrj
1496*a9fa9459Szrjgranularity: each sample hit covers 4 byte(s) for 7.69% of 0.13 seconds
1497*a9fa9459Szrj
1498*a9fa9459Szrj  % time    self  children    called     name
1499*a9fa9459Szrj
1500*a9fa9459Szrj            0.00    0.00       1/13496       name_too_long (gzip.c:1440)
1501*a9fa9459Szrj            0.00    0.00       1/13496       deflate (deflate.c:763)
1502*a9fa9459Szrj            0.00    0.00       1/13496       ct_init (trees.c:396)
1503*a9fa9459Szrj            0.00    0.00       2/13496       deflate (deflate.c:727)
1504*a9fa9459Szrj            0.00    0.00       4/13496       deflate (deflate.c:686)
1505*a9fa9459Szrj            0.00    0.00       5/13496       deflate (deflate.c:675)
1506*a9fa9459Szrj            0.00    0.00      12/13496       deflate (deflate.c:679)
1507*a9fa9459Szrj            0.00    0.00      16/13496       deflate (deflate.c:730)
1508*a9fa9459Szrj            0.00    0.00     128/13496       deflate_fast (deflate.c:654)
1509*a9fa9459Szrj            0.00    0.00    3071/13496       ct_init (trees.c:384)
1510*a9fa9459Szrj            0.00    0.00    3730/13496       ct_init (trees.c:385)
1511*a9fa9459Szrj            0.00    0.00    6525/13496       ct_init (trees.c:387)
1512*a9fa9459Szrj[6]  0.0    0.00    0.00   13496         init_block (trees.c:408)
1513*a9fa9459Szrj
1514*a9fa9459Szrj@end smallexample
1515*a9fa9459Szrj
1516*a9fa9459Szrj
1517*a9fa9459Szrj@node Annotated Source
1518*a9fa9459Szrj@section The Annotated Source Listing
1519*a9fa9459Szrj
1520*a9fa9459Szrj@code{gprof}'s @samp{-A} option triggers an annotated source listing,
1521*a9fa9459Szrjwhich lists the program's source code, each function labeled with the
1522*a9fa9459Szrjnumber of times it was called.  You may also need to specify the
1523*a9fa9459Szrj@samp{-I} option, if @code{gprof} can't find the source code files.
1524*a9fa9459Szrj
1525*a9fa9459SzrjWith older versions of @code{gcc} compiling with @samp{gcc @dots{} -g
1526*a9fa9459Szrj-pg -a} augments your program with basic-block counting code, in
1527*a9fa9459Szrjaddition to function counting code.  This enables @code{gprof} to
1528*a9fa9459Szrjdetermine how many times each line of code was executed.  With newer
1529*a9fa9459Szrjversions of @code{gcc} support for displaying basic-block counts is
1530*a9fa9459Szrjprovided by the @code{gcov} program.
1531*a9fa9459Szrj
1532*a9fa9459SzrjFor example, consider the following function, taken from gzip,
1533*a9fa9459Szrjwith line numbers added:
1534*a9fa9459Szrj
1535*a9fa9459Szrj@smallexample
1536*a9fa9459Szrj 1 ulg updcrc(s, n)
1537*a9fa9459Szrj 2     uch *s;
1538*a9fa9459Szrj 3     unsigned n;
1539*a9fa9459Szrj 4 @{
1540*a9fa9459Szrj 5     register ulg c;
1541*a9fa9459Szrj 6
1542*a9fa9459Szrj 7     static ulg crc = (ulg)0xffffffffL;
1543*a9fa9459Szrj 8
1544*a9fa9459Szrj 9     if (s == NULL) @{
1545*a9fa9459Szrj10         c = 0xffffffffL;
1546*a9fa9459Szrj11     @} else @{
1547*a9fa9459Szrj12         c = crc;
1548*a9fa9459Szrj13         if (n) do @{
1549*a9fa9459Szrj14             c = crc_32_tab[...];
1550*a9fa9459Szrj15         @} while (--n);
1551*a9fa9459Szrj16     @}
1552*a9fa9459Szrj17     crc = c;
1553*a9fa9459Szrj18     return c ^ 0xffffffffL;
1554*a9fa9459Szrj19 @}
1555*a9fa9459Szrj
1556*a9fa9459Szrj@end smallexample
1557*a9fa9459Szrj
1558*a9fa9459Szrj@code{updcrc} has at least five basic-blocks.
1559*a9fa9459SzrjOne is the function itself.  The
1560*a9fa9459Szrj@code{if} statement on line 9 generates two more basic-blocks, one
1561*a9fa9459Szrjfor each branch of the @code{if}.  A fourth basic-block results from
1562*a9fa9459Szrjthe @code{if} on line 13, and the contents of the @code{do} loop form
1563*a9fa9459Szrjthe fifth basic-block.  The compiler may also generate additional
1564*a9fa9459Szrjbasic-blocks to handle various special cases.
1565*a9fa9459Szrj
1566*a9fa9459SzrjA program augmented for basic-block counting can be analyzed with
1567*a9fa9459Szrj@samp{gprof -l -A}.
1568*a9fa9459SzrjThe @samp{-x} option is also helpful,
1569*a9fa9459Szrjto ensure that each line of code is labeled at least once.
1570*a9fa9459SzrjHere is @code{updcrc}'s
1571*a9fa9459Szrjannotated source listing for a sample @code{gzip} run:
1572*a9fa9459Szrj
1573*a9fa9459Szrj@smallexample
1574*a9fa9459Szrj                ulg updcrc(s, n)
1575*a9fa9459Szrj                    uch *s;
1576*a9fa9459Szrj                    unsigned n;
1577*a9fa9459Szrj            2 ->@{
1578*a9fa9459Szrj                    register ulg c;
1579*a9fa9459Szrj
1580*a9fa9459Szrj                    static ulg crc = (ulg)0xffffffffL;
1581*a9fa9459Szrj
1582*a9fa9459Szrj            2 ->    if (s == NULL) @{
1583*a9fa9459Szrj            1 ->        c = 0xffffffffL;
1584*a9fa9459Szrj            1 ->    @} else @{
1585*a9fa9459Szrj            1 ->        c = crc;
1586*a9fa9459Szrj            1 ->        if (n) do @{
1587*a9fa9459Szrj        26312 ->            c = crc_32_tab[...];
1588*a9fa9459Szrj26312,1,26311 ->        @} while (--n);
1589*a9fa9459Szrj                    @}
1590*a9fa9459Szrj            2 ->    crc = c;
1591*a9fa9459Szrj            2 ->    return c ^ 0xffffffffL;
1592*a9fa9459Szrj            2 ->@}
1593*a9fa9459Szrj@end smallexample
1594*a9fa9459Szrj
1595*a9fa9459SzrjIn this example, the function was called twice, passing once through
1596*a9fa9459Szrjeach branch of the @code{if} statement.  The body of the @code{do}
1597*a9fa9459Szrjloop was executed a total of 26312 times.  Note how the @code{while}
1598*a9fa9459Szrjstatement is annotated.  It began execution 26312 times, once for
1599*a9fa9459Szrjeach iteration through the loop.  One of those times (the last time)
1600*a9fa9459Szrjit exited, while it branched back to the beginning of the loop 26311 times.
1601*a9fa9459Szrj
1602*a9fa9459Szrj@node Inaccuracy
1603*a9fa9459Szrj@chapter Inaccuracy of @code{gprof} Output
1604*a9fa9459Szrj
1605*a9fa9459Szrj@menu
1606*a9fa9459Szrj* Sampling Error::      Statistical margins of error
1607*a9fa9459Szrj* Assumptions::         Estimating children times
1608*a9fa9459Szrj@end menu
1609*a9fa9459Szrj
1610*a9fa9459Szrj@node Sampling Error
1611*a9fa9459Szrj@section Statistical Sampling Error
1612*a9fa9459Szrj
1613*a9fa9459SzrjThe run-time figures that @code{gprof} gives you are based on a sampling
1614*a9fa9459Szrjprocess, so they are subject to statistical inaccuracy.  If a function runs
1615*a9fa9459Szrjonly a small amount of time, so that on the average the sampling process
1616*a9fa9459Szrjought to catch that function in the act only once, there is a pretty good
1617*a9fa9459Szrjchance it will actually find that function zero times, or twice.
1618*a9fa9459Szrj
1619*a9fa9459SzrjBy contrast, the number-of-calls and basic-block figures are derived
1620*a9fa9459Szrjby counting, not sampling.  They are completely accurate and will not
1621*a9fa9459Szrjvary from run to run if your program is deterministic and single
1622*a9fa9459Szrjthreaded.  In multi-threaded applications, or single threaded
1623*a9fa9459Szrjapplications that link with multi-threaded libraries, the counts are
1624*a9fa9459Szrjonly deterministic if the counting function is thread-safe.  (Note:
1625*a9fa9459Szrjbeware that the mcount counting function in glibc is @emph{not}
1626*a9fa9459Szrjthread-safe).  @xref{Implementation, ,Implementation of Profiling}.
1627*a9fa9459Szrj
1628*a9fa9459SzrjThe @dfn{sampling period} that is printed at the beginning of the flat
1629*a9fa9459Szrjprofile says how often samples are taken.  The rule of thumb is that a
1630*a9fa9459Szrjrun-time figure is accurate if it is considerably bigger than the sampling
1631*a9fa9459Szrjperiod.
1632*a9fa9459Szrj
1633*a9fa9459SzrjThe actual amount of error can be predicted.
1634*a9fa9459SzrjFor @var{n} samples, the @emph{expected} error
1635*a9fa9459Szrjis the square-root of @var{n}.  For example,
1636*a9fa9459Szrjif the sampling period is 0.01 seconds and @code{foo}'s run-time is 1 second,
1637*a9fa9459Szrj@var{n} is 100 samples (1 second/0.01 seconds), sqrt(@var{n}) is 10 samples, so
1638*a9fa9459Szrjthe expected error in @code{foo}'s run-time is 0.1 seconds (10*0.01 seconds),
1639*a9fa9459Szrjor ten percent of the observed value.
1640*a9fa9459SzrjAgain, if the sampling period is 0.01 seconds and @code{bar}'s run-time is
1641*a9fa9459Szrj100 seconds, @var{n} is 10000 samples, sqrt(@var{n}) is 100 samples, so
1642*a9fa9459Szrjthe expected error in @code{bar}'s run-time is 1 second,
1643*a9fa9459Szrjor one percent of the observed value.
1644*a9fa9459SzrjIt is likely to
1645*a9fa9459Szrjvary this much @emph{on the average} from one profiling run to the next.
1646*a9fa9459Szrj(@emph{Sometimes} it will vary more.)
1647*a9fa9459Szrj
1648*a9fa9459SzrjThis does not mean that a small run-time figure is devoid of information.
1649*a9fa9459SzrjIf the program's @emph{total} run-time is large, a small run-time for one
1650*a9fa9459Szrjfunction does tell you that that function used an insignificant fraction of
1651*a9fa9459Szrjthe whole program's time.  Usually this means it is not worth optimizing.
1652*a9fa9459Szrj
1653*a9fa9459SzrjOne way to get more accuracy is to give your program more (but similar)
1654*a9fa9459Szrjinput data so it will take longer.  Another way is to combine the data from
1655*a9fa9459Szrjseveral runs, using the @samp{-s} option of @code{gprof}.  Here is how:
1656*a9fa9459Szrj
1657*a9fa9459Szrj@enumerate
1658*a9fa9459Szrj@item
1659*a9fa9459SzrjRun your program once.
1660*a9fa9459Szrj
1661*a9fa9459Szrj@item
1662*a9fa9459SzrjIssue the command @samp{mv gmon.out gmon.sum}.
1663*a9fa9459Szrj
1664*a9fa9459Szrj@item
1665*a9fa9459SzrjRun your program again, the same as before.
1666*a9fa9459Szrj
1667*a9fa9459Szrj@item
1668*a9fa9459SzrjMerge the new data in @file{gmon.out} into @file{gmon.sum} with this command:
1669*a9fa9459Szrj
1670*a9fa9459Szrj@example
1671*a9fa9459Szrjgprof -s @var{executable-file} gmon.out gmon.sum
1672*a9fa9459Szrj@end example
1673*a9fa9459Szrj
1674*a9fa9459Szrj@item
1675*a9fa9459SzrjRepeat the last two steps as often as you wish.
1676*a9fa9459Szrj
1677*a9fa9459Szrj@item
1678*a9fa9459SzrjAnalyze the cumulative data using this command:
1679*a9fa9459Szrj
1680*a9fa9459Szrj@example
1681*a9fa9459Szrjgprof @var{executable-file} gmon.sum > @var{output-file}
1682*a9fa9459Szrj@end example
1683*a9fa9459Szrj@end enumerate
1684*a9fa9459Szrj
1685*a9fa9459Szrj@node Assumptions
1686*a9fa9459Szrj@section Estimating @code{children} Times
1687*a9fa9459Szrj
1688*a9fa9459SzrjSome of the figures in the call graph are estimates---for example, the
1689*a9fa9459Szrj@code{children} time values and all the time figures in caller and
1690*a9fa9459Szrjsubroutine lines.
1691*a9fa9459Szrj
1692*a9fa9459SzrjThere is no direct information about these measurements in the profile
1693*a9fa9459Szrjdata itself.  Instead, @code{gprof} estimates them by making an assumption
1694*a9fa9459Szrjabout your program that might or might not be true.
1695*a9fa9459Szrj
1696*a9fa9459SzrjThe assumption made is that the average time spent in each call to any
1697*a9fa9459Szrjfunction @code{foo} is not correlated with who called @code{foo}.  If
1698*a9fa9459Szrj@code{foo} used 5 seconds in all, and 2/5 of the calls to @code{foo} came
1699*a9fa9459Szrjfrom @code{a}, then @code{foo} contributes 2 seconds to @code{a}'s
1700*a9fa9459Szrj@code{children} time, by assumption.
1701*a9fa9459Szrj
1702*a9fa9459SzrjThis assumption is usually true enough, but for some programs it is far
1703*a9fa9459Szrjfrom true.  Suppose that @code{foo} returns very quickly when its argument
1704*a9fa9459Szrjis zero; suppose that @code{a} always passes zero as an argument, while
1705*a9fa9459Szrjother callers of @code{foo} pass other arguments.  In this program, all the
1706*a9fa9459Szrjtime spent in @code{foo} is in the calls from callers other than @code{a}.
1707*a9fa9459SzrjBut @code{gprof} has no way of knowing this; it will blindly and
1708*a9fa9459Szrjincorrectly charge 2 seconds of time in @code{foo} to the children of
1709*a9fa9459Szrj@code{a}.
1710*a9fa9459Szrj
1711*a9fa9459Szrj@c FIXME - has this been fixed?
1712*a9fa9459SzrjWe hope some day to put more complete data into @file{gmon.out}, so that
1713*a9fa9459Szrjthis assumption is no longer needed, if we can figure out how.  For the
1714*a9fa9459Szrjnovice, the estimated figures are usually more useful than misleading.
1715*a9fa9459Szrj
1716*a9fa9459Szrj@node How do I?
1717*a9fa9459Szrj@chapter Answers to Common Questions
1718*a9fa9459Szrj
1719*a9fa9459Szrj@table @asis
1720*a9fa9459Szrj@item How can I get more exact information about hot spots in my program?
1721*a9fa9459Szrj
1722*a9fa9459SzrjLooking at the per-line call counts only tells part of the story.
1723*a9fa9459SzrjBecause @code{gprof} can only report call times and counts by function,
1724*a9fa9459Szrjthe best way to get finer-grained information on where the program
1725*a9fa9459Szrjis spending its time is to re-factor large functions into sequences
1726*a9fa9459Szrjof calls to smaller ones.  Beware however that this can introduce
1727*a9fa9459Szrjartificial hot spots since compiling with @samp{-pg} adds a significant
1728*a9fa9459Szrjoverhead to function calls.  An alternative solution is to use a
1729*a9fa9459Szrjnon-intrusive profiler, e.g.@: oprofile.
1730*a9fa9459Szrj
1731*a9fa9459Szrj@item How do I find which lines in my program were executed the most times?
1732*a9fa9459Szrj
1733*a9fa9459SzrjUse the @code{gcov} program.
1734*a9fa9459Szrj
1735*a9fa9459Szrj@item How do I find which lines in my program called a particular function?
1736*a9fa9459Szrj
1737*a9fa9459SzrjUse @samp{gprof -l} and lookup the function in the call graph.
1738*a9fa9459SzrjThe callers will be broken down by function and line number.
1739*a9fa9459Szrj
1740*a9fa9459Szrj@item How do I analyze a program that runs for less than a second?
1741*a9fa9459Szrj
1742*a9fa9459SzrjTry using a shell script like this one:
1743*a9fa9459Szrj
1744*a9fa9459Szrj@example
1745*a9fa9459Szrjfor i in `seq 1 100`; do
1746*a9fa9459Szrj  fastprog
1747*a9fa9459Szrj  mv gmon.out gmon.out.$i
1748*a9fa9459Szrjdone
1749*a9fa9459Szrj
1750*a9fa9459Szrjgprof -s fastprog gmon.out.*
1751*a9fa9459Szrj
1752*a9fa9459Szrjgprof fastprog gmon.sum
1753*a9fa9459Szrj@end example
1754*a9fa9459Szrj
1755*a9fa9459SzrjIf your program is completely deterministic, all the call counts
1756*a9fa9459Szrjwill be simple multiples of 100 (i.e., a function called once in
1757*a9fa9459Szrjeach run will appear with a call count of 100).
1758*a9fa9459Szrj
1759*a9fa9459Szrj@end table
1760*a9fa9459Szrj
1761*a9fa9459Szrj@node Incompatibilities
1762*a9fa9459Szrj@chapter Incompatibilities with Unix @code{gprof}
1763*a9fa9459Szrj
1764*a9fa9459Szrj@sc{gnu} @code{gprof} and Berkeley Unix @code{gprof} use the same data
1765*a9fa9459Szrjfile @file{gmon.out}, and provide essentially the same information.  But
1766*a9fa9459Szrjthere are a few differences.
1767*a9fa9459Szrj
1768*a9fa9459Szrj@itemize @bullet
1769*a9fa9459Szrj@item
1770*a9fa9459Szrj@sc{gnu} @code{gprof} uses a new, generalized file format with support
1771*a9fa9459Szrjfor basic-block execution counts and non-realtime histograms.  A magic
1772*a9fa9459Szrjcookie and version number allows @code{gprof} to easily identify
1773*a9fa9459Szrjnew style files.  Old BSD-style files can still be read.
1774*a9fa9459Szrj@xref{File Format, ,Profiling Data File Format}.
1775*a9fa9459Szrj
1776*a9fa9459Szrj@item
1777*a9fa9459SzrjFor a recursive function, Unix @code{gprof} lists the function as a
1778*a9fa9459Szrjparent and as a child, with a @code{calls} field that lists the number
1779*a9fa9459Szrjof recursive calls.  @sc{gnu} @code{gprof} omits these lines and puts
1780*a9fa9459Szrjthe number of recursive calls in the primary line.
1781*a9fa9459Szrj
1782*a9fa9459Szrj@item
1783*a9fa9459SzrjWhen a function is suppressed from the call graph with @samp{-e}, @sc{gnu}
1784*a9fa9459Szrj@code{gprof} still lists it as a subroutine of functions that call it.
1785*a9fa9459Szrj
1786*a9fa9459Szrj@item
1787*a9fa9459Szrj@sc{gnu} @code{gprof} accepts the @samp{-k} with its argument
1788*a9fa9459Szrjin the form @samp{from/to}, instead of @samp{from to}.
1789*a9fa9459Szrj
1790*a9fa9459Szrj@item
1791*a9fa9459SzrjIn the annotated source listing,
1792*a9fa9459Szrjif there are multiple basic blocks on the same line,
1793*a9fa9459Szrj@sc{gnu} @code{gprof} prints all of their counts, separated by commas.
1794*a9fa9459Szrj
1795*a9fa9459Szrj@ignore - it does this now
1796*a9fa9459Szrj@item
1797*a9fa9459SzrjThe function names printed in @sc{gnu} @code{gprof} output do not include
1798*a9fa9459Szrjthe leading underscores that are added internally to the front of all
1799*a9fa9459SzrjC identifiers on many operating systems.
1800*a9fa9459Szrj@end ignore
1801*a9fa9459Szrj
1802*a9fa9459Szrj@item
1803*a9fa9459SzrjThe blurbs, field widths, and output formats are different.  @sc{gnu}
1804*a9fa9459Szrj@code{gprof} prints blurbs after the tables, so that you can see the
1805*a9fa9459Szrjtables without skipping the blurbs.
1806*a9fa9459Szrj@end itemize
1807*a9fa9459Szrj
1808*a9fa9459Szrj@node Details
1809*a9fa9459Szrj@chapter Details of Profiling
1810*a9fa9459Szrj
1811*a9fa9459Szrj@menu
1812*a9fa9459Szrj* Implementation::      How a program collects profiling information
1813*a9fa9459Szrj* File Format::         Format of @samp{gmon.out} files
1814*a9fa9459Szrj* Internals::           @code{gprof}'s internal operation
1815*a9fa9459Szrj* Debugging::           Using @code{gprof}'s @samp{-d} option
1816*a9fa9459Szrj@end menu
1817*a9fa9459Szrj
1818*a9fa9459Szrj@node Implementation
1819*a9fa9459Szrj@section Implementation of Profiling
1820*a9fa9459Szrj
1821*a9fa9459SzrjProfiling works by changing how every function in your program is compiled
1822*a9fa9459Szrjso that when it is called, it will stash away some information about where
1823*a9fa9459Szrjit was called from.  From this, the profiler can figure out what function
1824*a9fa9459Szrjcalled it, and can count how many times it was called.  This change is made
1825*a9fa9459Szrjby the compiler when your program is compiled with the @samp{-pg} option,
1826*a9fa9459Szrjwhich causes every function to call @code{mcount}
1827*a9fa9459Szrj(or @code{_mcount}, or @code{__mcount}, depending on the OS and compiler)
1828*a9fa9459Szrjas one of its first operations.
1829*a9fa9459Szrj
1830*a9fa9459SzrjThe @code{mcount} routine, included in the profiling library,
1831*a9fa9459Szrjis responsible for recording in an in-memory call graph table
1832*a9fa9459Szrjboth its parent routine (the child) and its parent's parent.  This is
1833*a9fa9459Szrjtypically done by examining the stack frame to find both
1834*a9fa9459Szrjthe address of the child, and the return address in the original parent.
1835*a9fa9459SzrjSince this is a very machine-dependent operation, @code{mcount}
1836*a9fa9459Szrjitself is typically a short assembly-language stub routine
1837*a9fa9459Szrjthat extracts the required
1838*a9fa9459Szrjinformation, and then calls @code{__mcount_internal}
1839*a9fa9459Szrj(a normal C function) with two arguments---@code{frompc} and @code{selfpc}.
1840*a9fa9459Szrj@code{__mcount_internal} is responsible for maintaining
1841*a9fa9459Szrjthe in-memory call graph, which records @code{frompc}, @code{selfpc},
1842*a9fa9459Szrjand the number of times each of these call arcs was traversed.
1843*a9fa9459Szrj
1844*a9fa9459SzrjGCC Version 2 provides a magical function (@code{__builtin_return_address}),
1845*a9fa9459Szrjwhich allows a generic @code{mcount} function to extract the
1846*a9fa9459Szrjrequired information from the stack frame.  However, on some
1847*a9fa9459Szrjarchitectures, most notably the SPARC, using this builtin can be
1848*a9fa9459Szrjvery computationally expensive, and an assembly language version
1849*a9fa9459Szrjof @code{mcount} is used for performance reasons.
1850*a9fa9459Szrj
1851*a9fa9459SzrjNumber-of-calls information for library routines is collected by using a
1852*a9fa9459Szrjspecial version of the C library.  The programs in it are the same as in
1853*a9fa9459Szrjthe usual C library, but they were compiled with @samp{-pg}.  If you
1854*a9fa9459Szrjlink your program with @samp{gcc @dots{} -pg}, it automatically uses the
1855*a9fa9459Szrjprofiling version of the library.
1856*a9fa9459Szrj
1857*a9fa9459SzrjProfiling also involves watching your program as it runs, and keeping a
1858*a9fa9459Szrjhistogram of where the program counter happens to be every now and then.
1859*a9fa9459SzrjTypically the program counter is looked at around 100 times per second of
1860*a9fa9459Szrjrun time, but the exact frequency may vary from system to system.
1861*a9fa9459Szrj
1862*a9fa9459SzrjThis is done is one of two ways.  Most UNIX-like operating systems
1863*a9fa9459Szrjprovide a @code{profil()} system call, which registers a memory
1864*a9fa9459Szrjarray with the kernel, along with a scale
1865*a9fa9459Szrjfactor that determines how the program's address space maps
1866*a9fa9459Szrjinto the array.
1867*a9fa9459SzrjTypical scaling values cause every 2 to 8 bytes of address space
1868*a9fa9459Szrjto map into a single array slot.
1869*a9fa9459SzrjOn every tick of the system clock
1870*a9fa9459Szrj(assuming the profiled program is running), the value of the
1871*a9fa9459Szrjprogram counter is examined and the corresponding slot in
1872*a9fa9459Szrjthe memory array is incremented.  Since this is done in the kernel,
1873*a9fa9459Szrjwhich had to interrupt the process anyway to handle the clock
1874*a9fa9459Szrjinterrupt, very little additional system overhead is required.
1875*a9fa9459Szrj
1876*a9fa9459SzrjHowever, some operating systems, most notably Linux 2.0 (and earlier),
1877*a9fa9459Szrjdo not provide a @code{profil()} system call.  On such a system,
1878*a9fa9459Szrjarrangements are made for the kernel to periodically deliver
1879*a9fa9459Szrja signal to the process (typically via @code{setitimer()}),
1880*a9fa9459Szrjwhich then performs the same operation of examining the
1881*a9fa9459Szrjprogram counter and incrementing a slot in the memory array.
1882*a9fa9459SzrjSince this method requires a signal to be delivered to
1883*a9fa9459Szrjuser space every time a sample is taken, it uses considerably
1884*a9fa9459Szrjmore overhead than kernel-based profiling.  Also, due to the
1885*a9fa9459Szrjadded delay required to deliver the signal, this method is
1886*a9fa9459Szrjless accurate as well.
1887*a9fa9459Szrj
1888*a9fa9459SzrjA special startup routine allocates memory for the histogram and
1889*a9fa9459Szrjeither calls @code{profil()} or sets up
1890*a9fa9459Szrja clock signal handler.
1891*a9fa9459SzrjThis routine (@code{monstartup}) can be invoked in several ways.
1892*a9fa9459SzrjOn Linux systems, a special profiling startup file @code{gcrt0.o},
1893*a9fa9459Szrjwhich invokes @code{monstartup} before @code{main},
1894*a9fa9459Szrjis used instead of the default @code{crt0.o}.
1895*a9fa9459SzrjUse of this special startup file is one of the effects
1896*a9fa9459Szrjof using @samp{gcc @dots{} -pg} to link.
1897*a9fa9459SzrjOn SPARC systems, no special startup files are used.
1898*a9fa9459SzrjRather, the @code{mcount} routine, when it is invoked for
1899*a9fa9459Szrjthe first time (typically when @code{main} is called),
1900*a9fa9459Szrjcalls @code{monstartup}.
1901*a9fa9459Szrj
1902*a9fa9459SzrjIf the compiler's @samp{-a} option was used, basic-block counting
1903*a9fa9459Szrjis also enabled.  Each object file is then compiled with a static array
1904*a9fa9459Szrjof counts, initially zero.
1905*a9fa9459SzrjIn the executable code, every time a new basic-block begins
1906*a9fa9459Szrj(i.e., when an @code{if} statement appears), an extra instruction
1907*a9fa9459Szrjis inserted to increment the corresponding count in the array.
1908*a9fa9459SzrjAt compile time, a paired array was constructed that recorded
1909*a9fa9459Szrjthe starting address of each basic-block.  Taken together,
1910*a9fa9459Szrjthe two arrays record the starting address of every basic-block,
1911*a9fa9459Szrjalong with the number of times it was executed.
1912*a9fa9459Szrj
1913*a9fa9459SzrjThe profiling library also includes a function (@code{mcleanup}) which is
1914*a9fa9459Szrjtypically registered using @code{atexit()} to be called as the
1915*a9fa9459Szrjprogram exits, and is responsible for writing the file @file{gmon.out}.
1916*a9fa9459SzrjProfiling is turned off, various headers are output, and the histogram
1917*a9fa9459Szrjis written, followed by the call-graph arcs and the basic-block counts.
1918*a9fa9459Szrj
1919*a9fa9459SzrjThe output from @code{gprof} gives no indication of parts of your program that
1920*a9fa9459Szrjare limited by I/O or swapping bandwidth.  This is because samples of the
1921*a9fa9459Szrjprogram counter are taken at fixed intervals of the program's run time.
1922*a9fa9459SzrjTherefore, the
1923*a9fa9459Szrjtime measurements in @code{gprof} output say nothing about time that your
1924*a9fa9459Szrjprogram was not running.  For example, a part of the program that creates
1925*a9fa9459Szrjso much data that it cannot all fit in physical memory at once may run very
1926*a9fa9459Szrjslowly due to thrashing, but @code{gprof} will say it uses little time.  On
1927*a9fa9459Szrjthe other hand, sampling by run time has the advantage that the amount of
1928*a9fa9459Szrjload due to other users won't directly affect the output you get.
1929*a9fa9459Szrj
1930*a9fa9459Szrj@node File Format
1931*a9fa9459Szrj@section Profiling Data File Format
1932*a9fa9459Szrj
1933*a9fa9459SzrjThe old BSD-derived file format used for profile data does not contain a
1934*a9fa9459Szrjmagic cookie that allows to check whether a data file really is a
1935*a9fa9459Szrj@code{gprof} file.  Furthermore, it does not provide a version number, thus
1936*a9fa9459Szrjrendering changes to the file format almost impossible.  @sc{gnu} @code{gprof}
1937*a9fa9459Szrjuses a new file format that provides these features.  For backward
1938*a9fa9459Szrjcompatibility, @sc{gnu} @code{gprof} continues to support the old BSD-derived
1939*a9fa9459Szrjformat, but not all features are supported with it.  For example,
1940*a9fa9459Szrjbasic-block execution counts cannot be accommodated by the old file
1941*a9fa9459Szrjformat.
1942*a9fa9459Szrj
1943*a9fa9459SzrjThe new file format is defined in header file @file{gmon_out.h}.  It
1944*a9fa9459Szrjconsists of a header containing the magic cookie and a version number,
1945*a9fa9459Szrjas well as some spare bytes available for future extensions.  All data
1946*a9fa9459Szrjin a profile data file is in the native format of the target for which
1947*a9fa9459Szrjthe profile was collected.  @sc{gnu} @code{gprof} adapts automatically
1948*a9fa9459Szrjto the byte-order in use.
1949*a9fa9459Szrj
1950*a9fa9459SzrjIn the new file format, the header is followed by a sequence of
1951*a9fa9459Szrjrecords.  Currently, there are three different record types: histogram
1952*a9fa9459Szrjrecords, call-graph arc records, and basic-block execution count
1953*a9fa9459Szrjrecords.  Each file can contain any number of each record type.  When
1954*a9fa9459Szrjreading a file, @sc{gnu} @code{gprof} will ensure records of the same type are
1955*a9fa9459Szrjcompatible with each other and compute the union of all records.  For
1956*a9fa9459Szrjexample, for basic-block execution counts, the union is simply the sum
1957*a9fa9459Szrjof all execution counts for each basic-block.
1958*a9fa9459Szrj
1959*a9fa9459Szrj@subsection Histogram Records
1960*a9fa9459Szrj
1961*a9fa9459SzrjHistogram records consist of a header that is followed by an array of
1962*a9fa9459Szrjbins.  The header contains the text-segment range that the histogram
1963*a9fa9459Szrjspans, the size of the histogram in bytes (unlike in the old BSD
1964*a9fa9459Szrjformat, this does not include the size of the header), the rate of the
1965*a9fa9459Szrjprofiling clock, and the physical dimension that the bin counts
1966*a9fa9459Szrjrepresent after being scaled by the profiling clock rate.  The
1967*a9fa9459Szrjphysical dimension is specified in two parts: a long name of up to 15
1968*a9fa9459Szrjcharacters and a single character abbreviation.  For example, a
1969*a9fa9459Szrjhistogram representing real-time would specify the long name as
1970*a9fa9459Szrj``seconds'' and the abbreviation as ``s''.  This feature is useful for
1971*a9fa9459Szrjarchitectures that support performance monitor hardware (which,
1972*a9fa9459Szrjfortunately, is becoming increasingly common).  For example, under DEC
1973*a9fa9459SzrjOSF/1, the ``uprofile'' command can be used to produce a histogram of,
1974*a9fa9459Szrjsay, instruction cache misses.  In this case, the dimension in the
1975*a9fa9459Szrjhistogram header could be set to ``i-cache misses'' and the abbreviation
1976*a9fa9459Szrjcould be set to ``1'' (because it is simply a count, not a physical
1977*a9fa9459Szrjdimension).  Also, the profiling rate would have to be set to 1 in
1978*a9fa9459Szrjthis case.
1979*a9fa9459Szrj
1980*a9fa9459SzrjHistogram bins are 16-bit numbers and each bin represent an equal
1981*a9fa9459Szrjamount of text-space.  For example, if the text-segment is one
1982*a9fa9459Szrjthousand bytes long and if there are ten bins in the histogram, each
1983*a9fa9459Szrjbin represents one hundred bytes.
1984*a9fa9459Szrj
1985*a9fa9459Szrj
1986*a9fa9459Szrj@subsection Call-Graph Records
1987*a9fa9459Szrj
1988*a9fa9459SzrjCall-graph records have a format that is identical to the one used in
1989*a9fa9459Szrjthe BSD-derived file format.  It consists of an arc in the call graph
1990*a9fa9459Szrjand a count indicating the number of times the arc was traversed
1991*a9fa9459Szrjduring program execution.  Arcs are specified by a pair of addresses:
1992*a9fa9459Szrjthe first must be within caller's function and the second must be
1993*a9fa9459Szrjwithin the callee's function.  When performing profiling at the
1994*a9fa9459Szrjfunction level, these addresses can point anywhere within the
1995*a9fa9459Szrjrespective function.  However, when profiling at the line-level, it is
1996*a9fa9459Szrjbetter if the addresses are as close to the call-site/entry-point as
1997*a9fa9459Szrjpossible.  This will ensure that the line-level call-graph is able to
1998*a9fa9459Szrjidentify exactly which line of source code performed calls to a
1999*a9fa9459Szrjfunction.
2000*a9fa9459Szrj
2001*a9fa9459Szrj@subsection Basic-Block Execution Count Records
2002*a9fa9459Szrj
2003*a9fa9459SzrjBasic-block execution count records consist of a header followed by a
2004*a9fa9459Szrjsequence of address/count pairs.  The header simply specifies the
2005*a9fa9459Szrjlength of the sequence.  In an address/count pair, the address
2006*a9fa9459Szrjidentifies a basic-block and the count specifies the number of times
2007*a9fa9459Szrjthat basic-block was executed.  Any address within the basic-address can
2008*a9fa9459Szrjbe used.
2009*a9fa9459Szrj
2010*a9fa9459Szrj@node Internals
2011*a9fa9459Szrj@section @code{gprof}'s Internal Operation
2012*a9fa9459Szrj
2013*a9fa9459SzrjLike most programs, @code{gprof} begins by processing its options.
2014*a9fa9459SzrjDuring this stage, it may building its symspec list
2015*a9fa9459Szrj(@code{sym_ids.c:@-sym_id_add}), if
2016*a9fa9459Szrjoptions are specified which use symspecs.
2017*a9fa9459Szrj@code{gprof} maintains a single linked list of symspecs,
2018*a9fa9459Szrjwhich will eventually get turned into 12 symbol tables,
2019*a9fa9459Szrjorganized into six include/exclude pairs---one
2020*a9fa9459Szrjpair each for the flat profile (INCL_FLAT/EXCL_FLAT),
2021*a9fa9459Szrjthe call graph arcs (INCL_ARCS/EXCL_ARCS),
2022*a9fa9459Szrjprinting in the call graph (INCL_GRAPH/EXCL_GRAPH),
2023*a9fa9459Szrjtiming propagation in the call graph (INCL_TIME/EXCL_TIME),
2024*a9fa9459Szrjthe annotated source listing (INCL_ANNO/EXCL_ANNO),
2025*a9fa9459Szrjand the execution count listing (INCL_EXEC/EXCL_EXEC).
2026*a9fa9459Szrj
2027*a9fa9459SzrjAfter option processing, @code{gprof} finishes
2028*a9fa9459Szrjbuilding the symspec list by adding all the symspecs in
2029*a9fa9459Szrj@code{default_excluded_list} to the exclude lists
2030*a9fa9459SzrjEXCL_TIME and EXCL_GRAPH, and if line-by-line profiling is specified,
2031*a9fa9459SzrjEXCL_FLAT as well.
2032*a9fa9459SzrjThese default excludes are not added to EXCL_ANNO, EXCL_ARCS, and EXCL_EXEC.
2033*a9fa9459Szrj
2034*a9fa9459SzrjNext, the BFD library is called to open the object file,
2035*a9fa9459Szrjverify that it is an object file,
2036*a9fa9459Szrjand read its symbol table (@code{core.c:@-core_init}),
2037*a9fa9459Szrjusing @code{bfd_canonicalize_symtab} after mallocing
2038*a9fa9459Szrjan appropriately sized array of symbols.  At this point,
2039*a9fa9459Szrjfunction mappings are read (if the @samp{--file-ordering} option
2040*a9fa9459Szrjhas been specified), and the core text space is read into
2041*a9fa9459Szrjmemory (if the @samp{-c} option was given).
2042*a9fa9459Szrj
2043*a9fa9459Szrj@code{gprof}'s own symbol table, an array of Sym structures,
2044*a9fa9459Szrjis now built.
2045*a9fa9459SzrjThis is done in one of two ways, by one of two routines, depending
2046*a9fa9459Szrjon whether line-by-line profiling (@samp{-l} option) has been
2047*a9fa9459Szrjenabled.
2048*a9fa9459SzrjFor normal profiling, the BFD canonical symbol table is scanned.
2049*a9fa9459SzrjFor line-by-line profiling, every
2050*a9fa9459Szrjtext space address is examined, and a new symbol table entry
2051*a9fa9459Szrjgets created every time the line number changes.
2052*a9fa9459SzrjIn either case, two passes are made through the symbol
2053*a9fa9459Szrjtable---one to count the size of the symbol table required,
2054*a9fa9459Szrjand the other to actually read the symbols.  In between the
2055*a9fa9459Szrjtwo passes, a single array of type @code{Sym} is created of
2056*a9fa9459Szrjthe appropriate length.
2057*a9fa9459SzrjFinally, @code{symtab.c:@-symtab_finalize}
2058*a9fa9459Szrjis called to sort the symbol table and remove duplicate entries
2059*a9fa9459Szrj(entries with the same memory address).
2060*a9fa9459Szrj
2061*a9fa9459SzrjThe symbol table must be a contiguous array for two reasons.
2062*a9fa9459SzrjFirst, the @code{qsort} library function (which sorts an array)
2063*a9fa9459Szrjwill be used to sort the symbol table.
2064*a9fa9459SzrjAlso, the symbol lookup routine (@code{symtab.c:@-sym_lookup}),
2065*a9fa9459Szrjwhich finds symbols
2066*a9fa9459Szrjbased on memory address, uses a binary search algorithm
2067*a9fa9459Szrjwhich requires the symbol table to be a sorted array.
2068*a9fa9459SzrjFunction symbols are indicated with an @code{is_func} flag.
2069*a9fa9459SzrjLine number symbols have no special flags set.
2070*a9fa9459SzrjAdditionally, a symbol can have an @code{is_static} flag
2071*a9fa9459Szrjto indicate that it is a local symbol.
2072*a9fa9459Szrj
2073*a9fa9459SzrjWith the symbol table read, the symspecs can now be translated
2074*a9fa9459Szrjinto Syms (@code{sym_ids.c:@-sym_id_parse}).  Remember that a single
2075*a9fa9459Szrjsymspec can match multiple symbols.
2076*a9fa9459SzrjAn array of symbol tables
2077*a9fa9459Szrj(@code{syms}) is created, each entry of which is a symbol table
2078*a9fa9459Szrjof Syms to be included or excluded from a particular listing.
2079*a9fa9459SzrjThe master symbol table and the symspecs are examined by nested
2080*a9fa9459Szrjloops, and every symbol that matches a symspec is inserted
2081*a9fa9459Szrjinto the appropriate syms table.  This is done twice, once to
2082*a9fa9459Szrjcount the size of each required symbol table, and again to build
2083*a9fa9459Szrjthe tables, which have been malloced between passes.
2084*a9fa9459SzrjFrom now on, to determine whether a symbol is on an include
2085*a9fa9459Szrjor exclude symspec list, @code{gprof} simply uses its
2086*a9fa9459Szrjstandard symbol lookup routine on the appropriate table
2087*a9fa9459Szrjin the @code{syms} array.
2088*a9fa9459Szrj
2089*a9fa9459SzrjNow the profile data file(s) themselves are read
2090*a9fa9459Szrj(@code{gmon_io.c:@-gmon_out_read}),
2091*a9fa9459Szrjfirst by checking for a new-style @samp{gmon.out} header,
2092*a9fa9459Szrjthen assuming this is an old-style BSD @samp{gmon.out}
2093*a9fa9459Szrjif the magic number test failed.
2094*a9fa9459Szrj
2095*a9fa9459SzrjNew-style histogram records are read by @code{hist.c:@-hist_read_rec}.
2096*a9fa9459SzrjFor the first histogram record, allocate a memory array to hold
2097*a9fa9459Szrjall the bins, and read them in.
2098*a9fa9459SzrjWhen multiple profile data files (or files with multiple histogram
2099*a9fa9459Szrjrecords) are read, the memory ranges of each pair of histogram records
2100*a9fa9459Szrjmust be either equal, or non-overlapping.  For each pair of histogram
2101*a9fa9459Szrjrecords, the resolution (memory region size divided by the number of
2102*a9fa9459Szrjbins) must be the same.  The time unit must be the same for all
2103*a9fa9459Szrjhistogram records. If the above containts are met, all histograms
2104*a9fa9459Szrjfor the same memory range are merged.
2105*a9fa9459Szrj
2106*a9fa9459SzrjAs each call graph record is read (@code{call_graph.c:@-cg_read_rec}),
2107*a9fa9459Szrjthe parent and child addresses
2108*a9fa9459Szrjare matched to symbol table entries, and a call graph arc is
2109*a9fa9459Szrjcreated by @code{cg_arcs.c:@-arc_add}, unless the arc fails a symspec
2110*a9fa9459Szrjcheck against INCL_ARCS/EXCL_ARCS.  As each arc is added,
2111*a9fa9459Szrja linked list is maintained of the parent's child arcs, and of the child's
2112*a9fa9459Szrjparent arcs.
2113*a9fa9459SzrjBoth the child's call count and the arc's call count are
2114*a9fa9459Szrjincremented by the record's call count.
2115*a9fa9459Szrj
2116*a9fa9459SzrjBasic-block records are read (@code{basic_blocks.c:@-bb_read_rec}),
2117*a9fa9459Szrjbut only if line-by-line profiling has been selected.
2118*a9fa9459SzrjEach basic-block address is matched to a corresponding line
2119*a9fa9459Szrjsymbol in the symbol table, and an entry made in the symbol's
2120*a9fa9459Szrjbb_addr and bb_calls arrays.  Again, if multiple basic-block
2121*a9fa9459Szrjrecords are present for the same address, the call counts
2122*a9fa9459Szrjare cumulative.
2123*a9fa9459Szrj
2124*a9fa9459SzrjA gmon.sum file is dumped, if requested (@code{gmon_io.c:@-gmon_out_write}).
2125*a9fa9459Szrj
2126*a9fa9459SzrjIf histograms were present in the data files, assign them to symbols
2127*a9fa9459Szrj(@code{hist.c:@-hist_assign_samples}) by iterating over all the sample
2128*a9fa9459Szrjbins and assigning them to symbols.  Since the symbol table
2129*a9fa9459Szrjis sorted in order of ascending memory addresses, we can
2130*a9fa9459Szrjsimple follow along in the symbol table as we make our pass
2131*a9fa9459Szrjover the sample bins.
2132*a9fa9459SzrjThis step includes a symspec check against INCL_FLAT/EXCL_FLAT.
2133*a9fa9459SzrjDepending on the histogram
2134*a9fa9459Szrjscale factor, a sample bin may span multiple symbols,
2135*a9fa9459Szrjin which case a fraction of the sample count is allocated
2136*a9fa9459Szrjto each symbol, proportional to the degree of overlap.
2137*a9fa9459SzrjThis effect is rare for normal profiling, but overlaps
2138*a9fa9459Szrjare more common during line-by-line profiling, and can
2139*a9fa9459Szrjcause each of two adjacent lines to be credited with half
2140*a9fa9459Szrja hit, for example.
2141*a9fa9459Szrj
2142*a9fa9459SzrjIf call graph data is present, @code{cg_arcs.c:@-cg_assemble} is called.
2143*a9fa9459SzrjFirst, if @samp{-c} was specified, a machine-dependent
2144*a9fa9459Szrjroutine (@code{find_call}) scans through each symbol's machine code,
2145*a9fa9459Szrjlooking for subroutine call instructions, and adding them
2146*a9fa9459Szrjto the call graph with a zero call count.
2147*a9fa9459SzrjA topological sort is performed by depth-first numbering
2148*a9fa9459Szrjall the symbols (@code{cg_dfn.c:@-cg_dfn}), so that
2149*a9fa9459Szrjchildren are always numbered less than their parents,
2150*a9fa9459Szrjthen making a array of pointers into the symbol table and sorting it into
2151*a9fa9459Szrjnumerical order, which is reverse topological
2152*a9fa9459Szrjorder (children appear before parents).
2153*a9fa9459SzrjCycles are also detected at this point, all members
2154*a9fa9459Szrjof which are assigned the same topological number.
2155*a9fa9459SzrjTwo passes are now made through this sorted array of symbol pointers.
2156*a9fa9459SzrjThe first pass, from end to beginning (parents to children),
2157*a9fa9459Szrjcomputes the fraction of child time to propagate to each parent
2158*a9fa9459Szrjand a print flag.
2159*a9fa9459SzrjThe print flag reflects symspec handling of INCL_GRAPH/EXCL_GRAPH,
2160*a9fa9459Szrjwith a parent's include or exclude (print or no print) property
2161*a9fa9459Szrjbeing propagated to its children, unless they themselves explicitly appear
2162*a9fa9459Szrjin INCL_GRAPH or EXCL_GRAPH.
2163*a9fa9459SzrjA second pass, from beginning to end (children to parents) actually
2164*a9fa9459Szrjpropagates the timings along the call graph, subject
2165*a9fa9459Szrjto a check against INCL_TIME/EXCL_TIME.
2166*a9fa9459SzrjWith the print flag, fractions, and timings now stored in the symbol
2167*a9fa9459Szrjstructures, the topological sort array is now discarded, and a
2168*a9fa9459Szrjnew array of pointers is assembled, this time sorted by propagated time.
2169*a9fa9459Szrj
2170*a9fa9459SzrjFinally, print the various outputs the user requested, which is now fairly
2171*a9fa9459Szrjstraightforward.  The call graph (@code{cg_print.c:@-cg_print}) and
2172*a9fa9459Szrjflat profile (@code{hist.c:@-hist_print}) are regurgitations of values
2173*a9fa9459Szrjalready computed.  The annotated source listing
2174*a9fa9459Szrj(@code{basic_blocks.c:@-print_annotated_source}) uses basic-block
2175*a9fa9459Szrjinformation, if present, to label each line of code with call counts,
2176*a9fa9459Szrjotherwise only the function call counts are presented.
2177*a9fa9459Szrj
2178*a9fa9459SzrjThe function ordering code is marginally well documented
2179*a9fa9459Szrjin the source code itself (@code{cg_print.c}).  Basically,
2180*a9fa9459Szrjthe functions with the most use and the most parents are
2181*a9fa9459Szrjplaced first, followed by other functions with the most use,
2182*a9fa9459Szrjfollowed by lower use functions, followed by unused functions
2183*a9fa9459Szrjat the end.
2184*a9fa9459Szrj
2185*a9fa9459Szrj@node Debugging
2186*a9fa9459Szrj@section Debugging @code{gprof}
2187*a9fa9459Szrj
2188*a9fa9459SzrjIf @code{gprof} was compiled with debugging enabled,
2189*a9fa9459Szrjthe @samp{-d} option triggers debugging output
2190*a9fa9459Szrj(to stdout) which can be helpful in understanding its operation.
2191*a9fa9459SzrjThe debugging number specified is interpreted as a sum of the following
2192*a9fa9459Szrjoptions:
2193*a9fa9459Szrj
2194*a9fa9459Szrj@table @asis
2195*a9fa9459Szrj@item 2 - Topological sort
2196*a9fa9459SzrjMonitor depth-first numbering of symbols during call graph analysis
2197*a9fa9459Szrj@item 4 - Cycles
2198*a9fa9459SzrjShows symbols as they are identified as cycle heads
2199*a9fa9459Szrj@item 16 - Tallying
2200*a9fa9459SzrjAs the call graph arcs are read, show each arc and how
2201*a9fa9459Szrjthe total calls to each function are tallied
2202*a9fa9459Szrj@item 32 - Call graph arc sorting
2203*a9fa9459SzrjDetails sorting individual parents/children within each call graph entry
2204*a9fa9459Szrj@item 64 - Reading histogram and call graph records
2205*a9fa9459SzrjShows address ranges of histograms as they are read, and each
2206*a9fa9459Szrjcall graph arc
2207*a9fa9459Szrj@item 128 - Symbol table
2208*a9fa9459SzrjReading, classifying, and sorting the symbol table from the object file.
2209*a9fa9459SzrjFor line-by-line profiling (@samp{-l} option), also shows line numbers
2210*a9fa9459Szrjbeing assigned to memory addresses.
2211*a9fa9459Szrj@item 256 - Static call graph
2212*a9fa9459SzrjTrace operation of @samp{-c} option
2213*a9fa9459Szrj@item 512 - Symbol table and arc table lookups
2214*a9fa9459SzrjDetail operation of lookup routines
2215*a9fa9459Szrj@item 1024 - Call graph propagation
2216*a9fa9459SzrjShows how function times are propagated along the call graph
2217*a9fa9459Szrj@item 2048 - Basic-blocks
2218*a9fa9459SzrjShows basic-block records as they are read from profile data
2219*a9fa9459Szrj(only meaningful with @samp{-l} option)
2220*a9fa9459Szrj@item 4096 - Symspecs
2221*a9fa9459SzrjShows symspec-to-symbol pattern matching operation
2222*a9fa9459Szrj@item 8192 - Annotate source
2223*a9fa9459SzrjTracks operation of @samp{-A} option
2224*a9fa9459Szrj@end table
2225*a9fa9459Szrj
2226*a9fa9459Szrj@node GNU Free Documentation License
2227*a9fa9459Szrj@appendix GNU Free Documentation License
2228*a9fa9459Szrj@include fdl.texi
2229*a9fa9459Szrj
2230*a9fa9459Szrj@bye
2231*a9fa9459Szrj
2232*a9fa9459SzrjNEEDS AN INDEX
2233*a9fa9459Szrj
2234*a9fa9459Szrj-T - "traditional BSD style": How is it different?  Should the
2235*a9fa9459Szrjdifferences be documented?
2236*a9fa9459Szrj
2237*a9fa9459Szrjexample flat file adds up to 100.01%...
2238*a9fa9459Szrj
2239*a9fa9459Szrjnote: time estimates now only go out to one decimal place (0.0), where
2240*a9fa9459Szrjthey used to extend two (78.67).
2241