Rockbox

FS#6734 - Compile Rockbox using ARM THUMB code

Attached to Project: Rockbox
Opened by Daniel Ankers (dan_a) - Monday, 05 March 2007, 01:33 GMT
Last edited by Rafaël Carré (funman) - Friday, 11 June 2010, 05:37 GMT
Task Type Patches
Category Operating System/Drivers
Status Closed
Assigned To No-one
Operating System Another
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

This makes certain parts of Rockbox compile using the more space-efficient THUMB instruction set on the ARM architecture.
Whether the overall effect will be positive or negative is difficult to estimate, so it is worth trying it to see what works: there may be benefits to battery life or to the boost rate. In general, non-performance-critical code which does not contain large loops or ARM assembly will be better off as THUMB code.

Enabling the interworking between normal ARM and THUMB increases the size of the Rockbox binary, but moving code to THUMB decreases it. Some code will not work as THUMB for reasons which are not known at the moment.

If you get a warning about libgcc not being compiled with interworking support, then you need to re-run the rockboxdev.sh script.

Closed by  Rafaël Carré (funman)
Friday, 11 June 2010, 05:37 GMT
Reason for closing:  Accepted
Additional comments about closing:  r26760
Comment by Chris (decayed.cell) - Monday, 05 March 2007, 08:52 GMT
Synced to R12617. Does Binutils need to be rebuilt with interworking support?
Comment by Daniel Ankers (dan_a) - Monday, 05 March 2007, 09:04 GMT
As long as you have versions of libgcc with interworking enabled, that is all that is required.
Comment by Dave Hooper (stripwax) - Tuesday, 22 April 2008, 22:20 GMT
Doesn't get past the Apple logo on boot with the iPod 5G 64MB build for me, so I'm guessing the set of code which does/doesn't work is still unknown at this point?
Comment by Rafaël Carré (funman) - Friday, 28 May 2010, 21:55 GMT
Attempt at detecting C files which can't be built with -mthumb because they contain inline asm (supposedly we have no thumb asm)

To use it, just edit the Makefile and replace CC=...gcc or CC=ccache ....gcc with CC=/path/to/script.py ....gcc

Only tested (r26353) with eabi gcc because the other one I have doesn't have a libgcc built with -mthumb-interwork

For fuzev1 I need to build with -mlong-calls => ram usage 64kB smaller
On clipv1 no need for -mlong-calls (because memory is only 2MB?) => ram usage 84kB smaller

The script simply runs gcc with -E and greps the output for 'asm' to reject files

This alone isn't enough so I added regexp code to remove unused static inline functions and grep for 'asm' again
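
A minimal sketch of the first check described above (preprocess with -E, then search the output for 'asm'). This is not the attached script: the names contains_asm, preprocess and ASM_RE are hypothetical, and the whole-word regexp is an assumption that is slightly stricter than a plain grep for 'asm'. The regexp step that strips unused static inline functions is not reproduced here.

```python
import re
import subprocess

# Hypothetical pattern: asm, __asm and __asm__ as whole identifiers.
# A plain grep for 'asm' would also match names such as 'plasma'.
ASM_RE = re.compile(r"\b(?:__asm__|__asm|asm)\b")


def contains_asm(preprocessed):
    """Return True if preprocessed C source mentions an asm keyword."""
    return ASM_RE.search(preprocessed) is not None


def preprocess(compiler, args):
    """Run the compiler with -E and return the preprocessed source."""
    cmd = [compiler, "-E"]
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the file name that followed '-o'; drop it.
            skip_next = False
            continue
        if arg == "-o":
            # Without '-o', -E writes the preprocessed text to stdout.
            skip_next = True
            continue
        cmd.append(arg)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

A file for which contains_asm(preprocess(...)) is true would then be built without -mthumb.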

With this version the Clip doesn't even boot (black screen), but before I removed the static inline functions it booted and I could play chopper or display the cube demo (which uses greylib) just fine.

codecs crashed immediately with an undef instruction in a HW register (TIMER register fwiw)


To disable the removal of static inline functions (and build far fewer files with -mthumb), just make the "remove_static_inline" function return its input immediately; perhaps this version has less chance of touching sensitive files
Comment by Rafaël Carré (funman) - Saturday, 29 May 2010, 17:15 GMT
memset for arm has a return which can't return to thumb code on armv4 (memset is only one example)

it works on armv5, so after r26386, Clipv2 runs just fine built with thumb:
binsize 94kB smaller
ram usage 93kB smaller

Note that apps/codecs/nsf.c doesn't build (ICE)
Also I didn't have to use -mlong-calls on Clipv2 although it has 8MB of ram, perhaps due to the update to gcc 4.4.4
Comment by Rafaël Carré (funman) - Saturday, 29 May 2010, 18:03 GMT
Diff between 2 test_codec runs on clipv2

All codecs are a bit slower except AAC and AC3 which are a bit faster.

Difference is not significant though, I guess because most performance code is in asm
Comment by Rafaël Carré (funman) - Saturday, 29 May 2010, 20:40 GMT
Much simpler script, which turns out not to be slower (build time takes 50% longer without ccache)

Just try to build anything that ends in .o with -mthumb
if that fails try again without it
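
The try-then-fall-back idea above can be sketched as follows. This is an illustration, not the attached script: wants_thumb and compile_with_fallback are hypothetical names, and the sketch assumes the compiler is passed as the first argument (as in CC=/path/to/script.py ....gcc) and that object files are named with -o.

```python
import subprocess


def wants_thumb(args):
    """True when the command writes an object file (-o something.o)."""
    for flag, value in zip(args, args[1:]):
        if flag == "-o" and value.endswith(".o"):
            return True
    return False


def compile_with_fallback(cmd, run=subprocess.call):
    """Try cmd with -mthumb first; if that fails, rerun it unchanged.

    cmd is the full command line: [compiler, arg, arg, ...].
    run is injectable so the logic can be exercised without a compiler.
    Returns the exit status of the last command that was run.
    """
    if wants_thumb(cmd[1:]):
        thumb_cmd = [cmd[0], "-mthumb"] + cmd[1:]
        if run(thumb_cmd) == 0:
            return 0
    # Either not an object file, or the thumb build failed.
    return run(cmd)
```

A wrapper script would call compile_with_fallback(sys.argv[1:]) and exit with its return value. Every file that cannot be built as thumb is compiled twice, which is one reason the build takes longer without ccache.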
Comment by Rafaël Carré (funman) - Sunday, 30 May 2010, 12:01 GMT
I tested codec speeds (wma/mp3/mpc/vorbis/cook/aac on fuzev1, and also ac3/a52 on clipv2): decode time needed as a percentage of ARM decode time (using eabi):

Clipv2 (armv5t) range from 95.13% (ac3) to 102.08% (cook), going through 96.82% (aac) and 100.18% (mp3)
aac / ac3 are faster, others are slower

Fuzev1 (armv4t - arm9tdmi) range from 99.82% (wma) to 101.22% (aac)
mp3 / wma are faster, others are slower
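
For reference, a small sketch of how such percentages are read: thumb decode time divided by ARM decode time, so anything below 100% is faster. The function names are hypothetical; the figures in the loop are the Clipv2 values quoted above.

```python
def percent_of_arm(thumb_time, arm_time):
    """Thumb decode time as a percentage of the ARM decode time."""
    if arm_time <= 0:
        raise ValueError("arm_time must be positive")
    return 100.0 * thumb_time / arm_time


def verdict(percent):
    """Below 100% means the thumb build decoded faster."""
    if percent < 100.0:
        return "faster"
    if percent > 100.0:
        return "slower"
    return "same"


# Clipv2 percentages as reported in this comment.
for codec, pct in [("ac3", 95.13), ("aac", 96.82), ("mp3", 100.18), ("cook", 102.08)]:
    print(f"{codec}: {pct:.2f}% -> {verdict(pct)}")
```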
Comment by Rafaël Carré (funman) - Saturday, 05 June 2010, 21:44 GMT
100+kB binsize cut on gigabeats
