GPROF(1) User Commands GPROF(1)

NAME


gprof - display call-graph profile data

SYNOPSIS


gprof [-abcCDlsz] [-e function-name] [-E function-name]
[-f function-name] [-F function-name]
[image-file [profile-file...]]
[-n number of functions]


DESCRIPTION


The gprof utility produces an execution profile of a program. The
effect of called routines is incorporated in the profile of each
caller. The profile data is taken from the call graph profile file
that is created by programs compiled with the -xpg option of cc(1),
or by the -pg option with other compilers, or by setting the
LD_PROFILE environment variable for shared objects. See ld.so.1(1).
These compiler options also link in versions of the library routines
which are compiled for profiling. The symbol table in the executable
image file image-file (a.out by default) is read and correlated with
the call graph profile file profile-file (gmon.out by default).


First, execution times for each routine are propagated along the
edges of the call graph. Cycles are discovered, and calls into a
cycle are made to share the time of the cycle. The first listing
shows the functions sorted according to the time they represent,
including the time of their call graph descendants. Below each
function entry is shown its (direct) call-graph children and how
their times are propagated to this function. A similar display above
the function shows how this function's time and the time of its
descendants are propagated to its (direct) call-graph parents.


Cycles are also shown, with an entry for the cycle as a whole and a
listing of the members of the cycle and their contributions to the
time and call counts of the cycle.


Next, a flat profile is given. This listing gives the total
execution times and call counts for each of the functions in the
program, sorted by decreasing time. Finally, an index is given, which
shows the correspondence between function names and call-graph
profile index numbers.


A single function may be split into subfunctions for profiling by
means of the MARK macro. See prof(7).


Beware of quantization errors. The granularity of the sampling is
shown, but remains statistical at best. It is assumed that the time
for each execution of a function can be expressed by the total time
for the function divided by the number of times the function is
called. Thus the time propagated along the call-graph arcs to
parents of that function is directly proportional to the number of
times that arc is traversed.


The profiled program must call exit(2) or return normally for the
profiling information to be saved in the gmon.out file.

OPTIONS


The following options are supported:

-a
Suppress printing statically declared functions.
If this option is given, all relevant information
about the static function (for instance, time
samples, calls to other functions, calls from
other functions) belongs to the function loaded
just before the static function in the a.out file.


-b
Brief. Suppress descriptions of each field in the
profile.


-c
Discover the static call-graph of the program by a
heuristic which examines the text space of the
object file. Static-only parents or children are
indicated with call counts of 0. Note that for
dynamically linked executables, the linked shared
objects' text segments are not examined.


-C
Demangle symbol names before printing them out.


-D
Produce a profile file gmon.sum that represents
the difference of the profile information in all
specified profile files. This summary profile
file may be given to subsequent executions of
gprof (also with -D) to summarize profile data
across several runs of an a.out file. See also
the -s option.

As an example, suppose function A calls function B
n times in profile file gmon.sum, and m times in
profile file gmon.out. With -D, a new gmon.sum
file will be created showing the number of calls
from A to B as n-m.


-efunction-name
Suppress printing the graph profile entry for
routine function-name and all its descendants
(unless they have other ancestors that are not
suppressed). More than one -e option may be
given. Only one function-name may be given with
each -e option.


-Efunction-name
Suppress printing the graph profile entry for
routine function-name (and its descendants) as -e,
below, and also exclude the time spent in
function-name (and its descendants) from the total
and percentage time computations. More than one -E
option may be given. For example:

-E mcount -E mcleanup

is the default.


-ffunction-name
Print the graph profile entry only for routine
function-name and its descendants. More than one
-f option may be given. Only one function-name
may be given with each -f option.


-Ffunction-name
Print the graph profile entry only for routine
function-name and its descendants (as -f, below)
and also use only the times of the printed
routines in total time and percentage
computations. More than one -F option may be
given. Only one function-name may be given with
each -F option. The -F option overrides the -E
option.


-l
Suppress the reporting of graph profile entries
for all local symbols. This option would be the
equivalent of placing all of the local symbols for
the specified executable image on the -E exclusion
list.


-n
Limits the size of flat and graph profile listings
to the top n offending functions.


-s
Produce a profile file gmon.sum which represents
the sum of the profile information in all of the
specified profile files. This summary profile
file may be given to subsequent executions of
gprof (also with -s) to accumulate profile data
across several runs of an a.out file. See also
the -D option.


-z
Display routines which have zero usage (as
indicated by call counts and accumulated time).
This is useful in conjunction with the -c option
for discovering which routines were never called.
Note that this has restricted use for dynamically
linked executables, since shared object text space
will not be examined by the -c option.


ENVIRONMENT VARIABLES


PROFDIR
If this environment variable contains a value, place
profiling output within that directory, in a file named
pid.programname. pid is the process ID and programname is
the name of the program being profiled, as determined by
removing any path prefix from the argv[0] with which the
program was called. If the variable contains a null value,
no profiling output is produced. Otherwise, profiling
output is placed in the file gmon.out.


FILES


a.out
executable file containing namelist


gmon.out
dynamic call-graph and profile


gmon.sum
summarized dynamic call-graph and
profile


$PROFDIR/pid.programname


SEE ALSO


cc(1), ld.so.1(1), exit(2), pcsample(2), profil(2), malloc(3C),
monitor(3C), malloc(3MALLOC), attributes(7), prof(7)


Graham, S.L., Kessler, P.B., McKusick, M.K., gprof: A Call Graph
Execution Profiler Proceedings of the SIGPLAN '82 Symposium on
Compiler Construction, SIGPLAN Notices, Vol. 17, No. 6, pp. 120-126,
June 1982.


Linker and Libraries Guide

NOTES


If the executable image has been stripped and does not have the
.symtab symbol table, gprof reads the global dynamic symbol tables
.dynsym and .SUNW_ldynsym, if present. The symbols in the dynamic
symbol tables are a subset of the symbols that are found in .symtab.
The .dynsym symbol table contains the global symbols used by the
runtime linker. .SUNW_ldynsym augments the information in .dynsym
with local function symbols. In the case where .dynsym is found and
.SUNW_ldynsym is not, only the information for the global symbols is
available. Without local symbols, the behavior is as described for
the -a option.


LD_LIBRARY_PATH must not contain /usr/lib as a component when
compiling a program for profiling. If LD_LIBRARY_PATH contains
/usr/lib, the program will not be linked correctly with the profiling
versions of the system libraries in /usr/lib/libp.


The times reported in successive identical runs may show variances
because of varying cache-hit ratios that result from sharing the
cache with other processes. Even if a program seems to be the only
one using the machine, hidden background or asynchronous processes
may blur the data. In rare cases, the clock ticks initiating
recording of the program counter may beat with loops in a program,
grossly distorting measurements. Call counts are always recorded
precisely, however.


Only programs that call exit or return from main are guaranteed to
produce a profile file, unless a final call to monitor is explicitly
coded.


Functions such as mcount(), _mcount(), moncontrol(), _moncontrol(),
monitor(), and _monitor() may appear in the gprof report. These
functions are part of the profiling implementation and thus account
for some amount of the runtime overhead. Since these functions are
not present in an unprofiled application, time accumulated and call
counts for these functions may be ignored when evaluating the
performance of an application.

64-bit profiling
64-bit profiling may be used freely with dynamically linked
executables, and profiling information is collected for the shared
objects if the objects are compiled for profiling. Care must be
applied to interpret the profile output, since it is possible for
symbols from different shared objects to have the same name. If name
duplication occurs in the profile output, the module id prefix before
the symbol name in the symbol index listing can be used to identify
the appropriate module for the symbol.


When using the -s or -Doption to sum multiple profile files, care
must be taken not to mix 32-bit profile files with 64-bit profile
files.

32-bit profiling
32-bit profiling may be used with dynamically linked executables, but
care must be applied. In 32-bit profiling, shared objects cannot be
profiled with gprof. Thus, when a profiled, dynamically linked
program is executed, only the main portion of the image is sampled.
This means that all time spent outside of the main object, that is,
time spent in a shared object, will not be included in the profile
summary; the total time reported for the program may be less than the
total time used by the program.


Because the time spent in a shared object cannot be accounted for,
the use of shared objects should be minimized whenever a program is
profiled with gprof. If desired, the program should be linked to the
profiled version of a library (or to the standard archive version if
no profiling version is available), instead of the shared object to
get profile information on the functions of a library. Versions of
profiled libraries may be supplied with the system in the
/usr/lib/libp directory. Refer to compiler driver documentation on
profiling.


Consider an extreme case. A profiled program dynamically linked with
the shared C library spends 100 units of time in some libc routine,
say, malloc(). Suppose malloc() is called only from routine B and B
consumes only 1 unit of time. Suppose further that routine A consumes
10 units of time, more than any other routine in the main (profiled)
portion of the image. In this case, gprof will conclude that most of
the time is being spent in A and almost no time is being spent in B.
From this it will be almost impossible to tell that the greatest
improvement can be made by looking at routine B and not routine A.
The value of the profiler in this case is severely degraded; the
solution is to use archives as much as possible for profiling.

BUGS


Parents which are not themselves profiled will have the time of their
profiled children propagated to them, but they will appear to be
spontaneously invoked in the call-graph listing, and will not have
their time propagated further. Similarly, signal catchers, even
though profiled, will appear to be spontaneous (although for more
obscure reasons). Any profiled children of signal catchers should
have their times propagated properly, unless the signal catcher was
invoked during the execution of the profiling routine, in which case
all is lost.

April 10, 2023 GPROF(1)

tribblix@gmail.com :: GitHub :: Privacy