FS#2815 - Profiling support for codecs and plugins

Attached to Project: Rockbox
Opened by Brandon Low (lostlogic) - Thursday, 08 December 2005, 21:38 GMT
Last edited by Brandon Low (lostlogic) - Wednesday, 18 January 2006, 22:31 GMT
Task Type Patches
Category
Status Closed
Assigned To No-one
Operating System
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

This patch adds profiling functions to Rockbox.

It currently supports profiling most of the codecs. I
think only alac and faad had to be turned off, due to
ICEs (internal compiler errors) when compiled with
profiling.

Profiling sessions start and end with codecs, and since
codecs share space in RAM, a profiling session across
multiple codecs wouldn't make sense, so when profiling,
ensure that your playlist includes files of the same
codec only for proper results.

Plugin profiling should also be possible, but I have
not tested this yet. The same type of modifications
that I made in the Makefiles for codecs should enable
it for plugins, but each individual plugin would need
to start and stop profiling and initialize the profile
functions (see wav2wv for example initialization). I
had to at least initialize profiling in wav2wv because
it shares compiled objects with the WavPack codec.

Profiling is enabled by selecting a developer build and
the profiling option. Profiling and DEBUG are
mutually exclusive (enforced in the configure script)
because of the -fomit-frame-pointer flag. Note that
profiling turns on -fomit-frame-pointer in dev builds,
where it is normally off.

More information about the design and status is here:
http://forums.rockbox.org/index.php?topic=2039.0

Also included in the patch is Java 1.5 source for a
parsing program. I know that Java is probably not the
preferred language for anything here in the Rockbox
world, but it's currently my most fluent language, so
when I needed to parse a bunch of files and symbols
into a usable format, Java is how I did it.

Example command line for the parser:
  java ProfileReader ../../profile.out \
      ../../build-firmware/apps/codecs/mpa.map \
      ../../build-firmware/libmad.a

Example output files at:
http://lostlogicx.com/transfer/profile.out
http://lostlogicx.com/transfer/profile.parsed.out

Closed by  Brandon Low (lostlogic)
Wednesday, 18 January 2006, 22:31 GMT
Reason for closing:  Fixed
Comment by Brandon Low (lostlogic) - Saturday, 10 December 2005, 20:38 GMT

Updated the attached patch with a bug fix to the Java
parser, and added a profile output comparator, which takes
two parser output files and compares them (changes in number
of ticks / number of calls).
Comment by Brandon Low (lostlogic) - Saturday, 17 December 2005, 16:40 GMT

Uploaded a new version of the patch with minor bug fixes,
and optimizations in the profile.c file. I was forced to
use a little bit of assembly to optimize the functions in
there, because GCC isn't smart enough to use the subqi and
addqi features of m68k. I also ran through the rest of the
compiler output and made some adjustments to my C which let
GCC be smarter.

The minor bug that was fixed was probably never an issue,
but the thread switching functions were not being handled
just right, in case a function being profiled called yield
internally.
Comment by Brandon Low (lostlogic) - Saturday, 17 December 2005, 18:29 GMT

Gah, another version -- I introduced a bug in the last
version with my optimizations.
Comment by Brandon Low (lostlogic) - Sunday, 18 December 2005, 03:43 GMT

VERSION SPEW TODAY: This latest patch fixes a bug in the
parsing code, and adds the object file name that the symbol
was found in to the output. No changes to the core
profiling code here. Yes, the parsing code is still in Java 1.5.
Comment by Brandon Low (lostlogic) - Sunday, 18 December 2005, 20:26 GMT

NEW VERSION TODAY TOO!

Latest patch should reduce the impact of number of calls on
perceived number of ticks by deactivating timing on the
second instruction into the profiling call and reactivating
it as the last instruction before return. This also makes
profiling a bit faster: the tradeoff is slightly more time in
profile_timer_tick but a lot less time in
profile_func_enter and profile_func_exit.
Another loss here is that we lose the total time spent on
profiling thread counter.
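
The idea of pausing timing inside the profiling calls can be modeled with a small sketch. This is a toy Python model, not the m68k code from profile.c: the class name, the `timing_active` flag, and the method names are assumptions made for illustration, and the real patch does this at the instruction level.

```python
class ProfilerModel:
    """Toy model: a timer tick is charged to the running function
    only while timing is active (all names here are hypothetical)."""

    def __init__(self):
        self.stack = []            # functions currently executing, innermost last
        self.calls = {}            # function name -> number of calls
        self.ticks = {}            # function name -> ticks charged
        self.timing_active = True

    def timer_tick(self):
        # Simulated periodic timer: skip the tick while bookkeeping is running.
        if self.timing_active and self.stack:
            name = self.stack[-1]
            self.ticks[name] = self.ticks.get(name, 0) + 1

    def func_enter(self, name):
        self.timing_active = False   # pause timing as early as possible
        self.calls[name] = self.calls.get(name, 0) + 1
        self.stack.append(name)
        self.timing_active = True    # resume as the very last step

    def func_exit(self):
        self.timing_active = False
        self.stack.pop()
        self.timing_active = True
```

In this model a tick that arrives while `func_enter` or `func_exit` is doing its bookkeeping is simply dropped, so functions that are called very often do not accumulate ticks that really belong to the profiler.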

To go with this, I've changed the profile_comparator.pl to
display percent change in number of ticks by default instead
of the change in number of calls, because if you are
comparing profiling results, they should really have the
same number of calls to each function in the runs being
compared. To regain the previous behavior of comparing
ticks and calls, just pass a third argument (any value) to
the perl call.
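
As a rough illustration of the default comparison, here is a minimal Python sketch of a percent change in ticks per symbol. It is not the code from profile_comparator.pl: the function name and the in-memory layout (symbol mapped to a `(calls, ticks)` pair) are assumptions, since the parsed file format is not described here.

```python
def percent_change_in_ticks(before, after):
    """Return {symbol: percent change in ticks} for symbols present in both runs.

    before/after are hypothetical dicts mapping symbol -> (calls, ticks).
    A symbol with zero ticks in the first run maps to None, because a
    percent change from zero is undefined.
    """
    changes = {}
    for symbol in sorted(set(before) & set(after)):
        old_ticks = before[symbol][1]
        new_ticks = after[symbol][1]
        if old_ticks == 0:
            changes[symbol] = None
        else:
            changes[symbol] = 100.0 * (new_ticks - old_ticks) / old_ticks
    return changes
```

A negative value means the second run spent fewer ticks in that function than the first.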

I also fixed a bug where the Java parser would throw a
NullPointerException if it couldn't find some symbols. Now it
just displays their address, and nothing for their object, if
that happens.
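
The fallback for unknown symbols might look like the following Python sketch. The function name, the symbol table layout (address mapped to a `(name, object file)` pair), and the hex formatting are all assumptions for illustration; the real parser builds its table from the map and library files.

```python
def describe_address(address, symbols):
    """Return (label, object_file) for an address.

    symbols is a hypothetical dict mapping address -> (name, object_file).
    When the address is unknown, fall back to the address itself as the
    label and an empty object file instead of failing.
    """
    entry = symbols.get(address)
    if entry is None:
        return ("0x%08x" % address, "")
    return entry
```

For example, with `symbols = {0x1000: ("decode_frame", "frame.o")}` (made-up names), looking up `0x2000` gives `("0x00002000", "")` rather than an error.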
Comment by Brandon Low (lostlogic) - Sunday, 18 December 2005, 20:59 GMT

Bah, that version has a minor bug that results in a bad row
being output, which is pointed to by indices[0]. I'll hunt it
down later or tomorrow. It doesn't seem to hurt the
profiling results though.
Comment by Brandon Low (lostlogic) - Monday, 19 December 2005, 19:50 GMT

Updated patch: fixes the bug mentioned in my previous
comment. addq.l on a short is not generally a good idea.
The profile comparator now also outputs total calls and
ticks with %change in ticks.
Comment by Brandon Low (lostlogic) - Wednesday, 21 December 2005, 23:17 GMT

Minor update. Just a change to the profile comparator perl
script.
Comment by Brandon Low (lostlogic) - Thursday, 05 January 2006, 15:30 GMT

Attaching a new file for the first version of a profile
output parsing program in Perl. Once I've tested it a bit,
I'll post a new patchfile with it included.
Comment by Brandon Low (lostlogic) - Thursday, 05 January 2006, 15:33 GMT

Oh, usage for the Perl parser is:
  ./profile_reader.pl profile.out mapfile lib/objectfiles
  mapfile lib/objectfiles [012]
The last argument tells the parser how to sort the output:
0 for calls, 1 for ticks, 2 for symbol name.

e.g.:
  ./profile_reader.pl ../../profile.out \
      ../../build-firmware/apps/codecs/vorbis.map \
      ../../build-firmware/libTremor.a 0
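
The three sort modes can be sketched in Python as below. This is not the Perl from the script: the function name and the row layout `(symbol, calls, ticks)` are hypothetical, and the sort direction (largest first for calls and ticks, alphabetical for names) is an assumption, since the usage text does not state it.

```python
def sort_rows(rows, mode):
    """Sort profile rows the way the last argument selects.

    rows is a hypothetical list of (symbol, calls, ticks) tuples.
    mode: 0 = by calls, 1 = by ticks, 2 = by symbol name.
    """
    if mode == 0:
        return sorted(rows, key=lambda row: row[1], reverse=True)
    if mode == 1:
        return sorted(rows, key=lambda row: row[2], reverse=True)
    if mode == 2:
        return sorted(rows, key=lambda row: row[0])
    raise ValueError("mode must be 0, 1, or 2")
```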


It is important that you parse your profiling output before
doing anything (a new build for instance) that would change
the map or library files that match it.
Comment by Brandon Low (lostlogic) - Thursday, 05 January 2006, 17:48 GMT

Here's an updated patch file. To make the patch easier to
read, I've dropped the use of the NO_PROF_ATTR within the
Tremor codec. Because of this, I will not delete the
previous version which has those changes, in case they are
wanted.

This drops the Java ProfileReader code and includes the new
Perl profile_reader code (updated from the one I posted
earlier) and updated profile_comparator code. Both of these
scripts now include USAGE methods.
Comment by Brandon Low (lostlogic) - Friday, 06 January 2006, 05:42 GMT

At Lear's suggestion, here's a new patch with the
profile_reader.pl updated to allow optional printing by
percent instead of by count. profile_comparator.pl still
only accepts count style, not percent.
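
Printing by percent instead of by count amounts to dividing each value by the total. A minimal Python sketch, assuming a hypothetical dict of per-symbol counts (calls or ticks); the function name and the choice to report 0.0 for an empty profile are illustrative, not taken from profile_reader.pl:

```python
def as_percentages(counts):
    """Convert {symbol: count} into {symbol: percent of total}.

    counts is a hypothetical dict of per-symbol calls or ticks.
    If the total is zero, every symbol is reported as 0.0 percent.
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * value / total for name, value in counts.items()}
```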
Comment by Brandon Low (lostlogic) - Tuesday, 17 January 2006, 21:43 GMT

Fixed the things that preglow mentioned so far toward
getting this committed.

Outstanding questions:
* Should it be enabled for all codecs from the patch?
* Is the license of the gcc file that I based the profiling
on compatible?
Comment by Brandon Low (lostlogic) - Wednesday, 18 January 2006, 03:25 GMT

Updated the patch so it applies with amiconn's latest changes.
Comment by Brandon Low (lostlogic) - Wednesday, 18 January 2006, 22:31 GMT

in CVS.
