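[Posted]: 2015-08-21 08:34:30
[Question]:
Even though I set the -pg flag when compiling my code, I cannot see the full details in the gprof output generated from gmon.out.
Specifically, I am not getting the calls, self Ts/call, and total Ts/call columns.
This is the output of gprof deepflow-static gmon.out: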
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 52.72      7.18     7.18                             sor_coupled
 34.07     11.82     4.64                             compute_data_and_match
  2.28     12.13     0.31                             convolve_vert
  2.28     12.44     0.31                             compute_smoothness
  2.06     12.72     0.28                             convolve_horiz
  1.47     12.92     0.20                             color_image_resize_vert
  1.03     13.06     0.14                             sub_laplacian
  0.88     13.18     0.12                             image_resize_vert
  0.88     13.30     0.12                             image_warp
  0.73     13.40     0.10                             color_image_resize_horiz
  0.59     13.48     0.08                             image_resize_horiz
  0.18     13.51     0.03                             frexp
  0.15     13.53     0.02                             compute_one_level
  0.15     13.55     0.02                             fwrite
  0.15     13.57     0.02                             memcpy
  0.11     13.58     0.02                             __floorf_sse41
  0.07     13.59     0.01                             brk
  0.07     13.60     0.01                             color_image_png_load
  0.07     13.61     0.01                             get_derivatives
  0.07     13.62     0.01                             png_read_filter_row_paeth_multibyte_pixel

 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone. This is the major sort for this
           listing.

calls      the number of times this function was invoked, if
           this function is profiled, else blank.

 self      the average number of milliseconds spent in this
ms/call    function per call, if this function is profiled,
           else blank.

 total     the average number of milliseconds spent in this
ms/call    function and its descendents per call, if this
           function is profiled, else blank.

name       the name of the function. This is the minor sort
           for this listing. The index shows the location of
           the function in the gprof listing. If the index is
           in parenthesis it shows where it would appear in
           the gprof listing if it were to be printed.

Copyright (C) 2012 Free Software Foundation, Inc.
Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.
Here is my Makefile:
CC=gcc
CFLAGS=-Wall -g -O3
LDFLAGS=-g -Wall -O3
LIBFLAGS=-lm -ljpeg -lpng -pg
LIBAFLAGS=-static /usr/lib/x86_64-linux-gnu/libjpeg.a /usr/local/lib/libpng16.a /usr/lib/x86_64-linux-gnu/libz.a /usr/lib/x86_64-linux-gnu/libm.a

all: deepflow-static

deepflow: deepflow.o image.o io.o opticalflow_aux.o opticalflow.o solver.o
	$(CC) $(LDFLAGS) $(LIBFLAGS) -o $@ $^

deepflow-static: deepflow.o image.o io.o opticalflow_aux.o opticalflow.o solver.o
	$(CC) $(LIBFLAGS) -o $@ $^ $(LIBAFLAGS)

%.o: %.c
	$(CC) -o $@ $(CFLAGS) -c $+

clean:
	rm -f *.o deepflow
Please help me figure out what is missing. Thanks in advance.
[Comments]:
-
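As BlunT correctly pointed out, I had not added the -pg option to CFLAGS in the Makefile, so the object files were compiled without profiling instrumentation. That is what caused the incomplete output.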
-
I'm curious: how did you find out about gprof? Did a teacher recommend it?
-
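@MikeDunlavey I wanted to identify the compute-intensive parts of my program. I had used the perf tool before, but my advisor suggested gprof because it is better suited to this task. That is when I started using gprof.
-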
OK, thanks. You may want to pass the second answer on this post along to your advisor. There are better ways to find out how to speed up your program.
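
The fix described in the first comment can be sketched as the fragment below. It is a minimal sketch, not the asker's final Makefile: it assumes the only change needed is adding -pg to CFLAGS, so that every object file is compiled with call-counting instrumentation, while -pg stays in LIBFLAGS so the link step still pulls in the profiling startup code. With -pg present only at link time, gprof still receives the timing samples but has no call counts, which matches the blank calls and Ts/call columns in the question.

```makefile
CC=gcc
# -pg added here: each .c file is now compiled with profiling hooks,
# which is what lets gprof fill in the calls and per-call columns.
CFLAGS=-Wall -g -O3 -pg
LDFLAGS=-g -Wall -O3
# -pg kept here unchanged so the link step is still profiling-enabled.
LIBFLAGS=-lm -ljpeg -lpng -pg

%.o: %.c
	$(CC) -o $@ $(CFLAGS) -c $+
```

After the change, the existing object files have to be rebuilt (for example with make clean followed by make), and the program has to be run again to produce a fresh gmon.out before rerunning gprof deepflow-static gmon.out. Functions that the compiler inlines at -O3 may still not appear as separate entries.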