Rockbox

  • Status Closed
  • Percent Complete 100%
  • Task Type Patches
  • Category Operating System/Drivers
  • Assigned To No-one
  • Operating System Another
  • Severity Low
  • Priority Very Low
  • Reported Version
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by Daniel Ankers - 2007-03-05
Last edited by Rafaël Carré - 2010-06-11

FS#6734 - Compile Rockbox using ARM THUMB code

This makes certain parts of Rockbox compile using the more space-efficient thumb code instruction set on the ARM architecture.
Whether the overall effect will be positive or negative is difficult to estimate, so it is worth trying to see what works; there may be benefits to battery life or to the boost rate. In general, non-performance-critical code that does not contain large loops or ARM assembly will be better off as thumb code.

Enabling the interworking between normal ARM and THUMB increases the size of the Rockbox binary, but moving code to THUMB decreases it. Some code will not work as THUMB for reasons which are not known at the moment.

If you get a warning about libgcc not being compiled with interworking support, then you need to re-run the rockboxdev.sh script.

Closed by  Rafaël Carré
2010-06-11 05:37
Reason for closing:  Accepted
Additional comments about closing:  r26760

Chris commented on 2007-03-05 08:52

Synced to R12617. Does Binutils need to be rebuilt with interworking support?

Daniel Ankers commented on 2007-03-05 09:04

As long as you have versions of libgcc with interworking enabled, that is all that is required.

Dave Hooper commented on 2008-04-22 22:20

Doesn’t get past the Apple logo on boot with the iPod 5G 64MB build, for me, so I'm guessing that the set of code which does/doesn’t work is unknown at this point?

Rafaël Carré commented on 2010-05-28 21:55

Attempt at detecting C files which can’t be built with -mthumb because they contain inline asm (supposedly we have no thumb asm)

To use it, just edit the Makefile and replace CC=…gcc or CC=ccache ….gcc with CC=/path/to/script.py ….gcc

Only tested (r26353) with eabi gcc because the other one I have doesn’t have a libgcc built with -mthumb-interwork

For fuzev1 I need to build with -mlong-calls ⇒ ram usage 64kB smaller
On clipv1 no need for -mlong-calls (because memory is only 2MB?) ⇒ ram usage 84kB smaller

The script simply runs gcc with -E and greps the output for ‘asm’ to reject files
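
A minimal sketch of that check, not the attached script itself. The helper names (`contains_inline_asm`, `preprocess`, `can_try_thumb`) are hypothetical, and matching `asm` as a whole word (plus the GNU spellings `__asm` and `__asm__`) is an assumption; a plain grep for ‘asm’ would also match identifiers that merely contain those letters.

```python
import re
import subprocess
from typing import Sequence

# Assumption: match the asm keyword and its GNU spellings as whole words only.
ASM_PATTERN = re.compile(r"\b(?:asm|__asm|__asm__)\b")


def contains_inline_asm(preprocessed_source: str) -> bool:
    """Return True if the preprocessed C text mentions an asm keyword."""
    return ASM_PATTERN.search(preprocessed_source) is not None


def preprocess(compiler: str, flags: Sequence[str], source_file: str) -> str:
    """Run the compiler with -E and return the preprocessed text."""
    result = subprocess.run(
        [compiler, "-E", *flags, source_file],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def can_try_thumb(compiler: str, flags: Sequence[str], source_file: str) -> bool:
    """Hypothetical helper: a file is a -mthumb candidate if no asm is found."""
    return not contains_inline_asm(preprocess(compiler, flags, source_file))
```

Only `contains_inline_asm` is pure; the other two helpers need a working compiler on the path and the same preprocessor flags as the real build, otherwise the preprocessed output may differ from what actually gets compiled.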

This alone isn’t enough so I added regexp code to remove unused static inline functions and grep for ‘asm’ again

With this version the Clip doesn’t even boot (black screen), but before removing the static inline functions it booted and I could play chopper or display the cube demo (which uses greylib) just fine.

codecs crashed immediately with an undef instruction in a HW register (TIMER register fwiw)

To disable the removal of static inline functions (and build far fewer files with -mthumb), just make the “remove_static_inline” function return its input immediately; perhaps this version has less chance of touching sensitive files

Rafaël Carré commented on 2010-05-29 17:15

memset for arm has a return which can’t return to thumb code on armv4 (memset is only one example)

it works on armv5, so after r26386, Clipv2 runs just fine built with thumb:
binsize 94kB smaller
ram usage 93kB smaller

Note that apps/codecs/nsf.c doesn’t build (ICE)
Also I didn’t have to use -mlong-calls on Clipv2 although it has 8MB of ram, perhaps due to the update to gcc 4.4.4

Rafaël Carré commented on 2010-05-29 18:03

Diff between 2 test_codec runs on clipv2

All codecs are a bit slower except AAC and AC3 which are a bit faster.

Difference is not significant though, I guess because most performance code is in asm

Rafaël Carré commented on 2010-05-29 20:40

Much simpler script, which turns out not to be slower (the build takes 50% longer without ccache)

Just try to build anything that ends in .o with -mthumb
if that fails try again without it
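
The try-then-fall-back idea can be sketched roughly as below. This is an illustration under stated assumptions, not the attached script: the function names are hypothetical, "ends in .o" is interpreted as "the `-o` target ends in `.o`", and `-mthumb` is appended at the end of the command so that a `ccache gcc …` prefix is left untouched.

```python
import subprocess


def output_is_object(command):
    """Return True if the command has a '-o <file>.o' pair (assumed meaning of 'ends in .o')."""
    return any(
        flag == "-o" and value.endswith(".o")
        for flag, value in zip(command, command[1:])
    )


def compile_with_thumb_fallback(command, run=subprocess.call):
    """Run a compiler command, trying -mthumb first for object-file builds.

    `command` is the full argument list, compiler included.
    `run` executes a command and returns its exit status; it can be replaced in tests.
    """
    if output_is_object(command):
        # First attempt: the same command with -mthumb appended.
        if run([*command, "-mthumb"]) == 0:
            return 0
    # Not an object build, or the Thumb attempt failed: run the command unchanged.
    return run(command)
```

Used as a `CC` wrapper, the script would pass its own arguments (everything after the script name) as `command` and exit with the returned status. The failed first attempt is where the extra build time comes from.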

Rafaël Carré commented on 2010-05-30 12:01

I tested codec speeds (wma/mp3/mpc/vorbis/cook/aac on fuzev1, and also ac3/a52 on clipv2): decode time needed as a percentage of ARM decode time (using eabi):

Clipv2 (armv5t) ranges from 95.13% (ac3) to 102.08% (cook), going through 96.82% (aac) and 100.18% (mp3)
aac / ac3 are faster, others are slower

Fuzev1 (armv4t - arm9tdmi) ranges from 99.82% (wma) to 101.22% (aac)
mp3 / wma are faster, others are slower
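
To read these figures: a value below 100% means the thumb build decodes faster than the ARM build. The small sketch below only restates the numbers quoted in this comment; codecs for which no figure was given (for example mp3 on Fuzev1) are left out rather than guessed, and the function names are hypothetical.

```python
# Thumb decode time as a percentage of ARM decode time, as quoted above.
thumb_vs_arm_percent = {
    "clipv2": {"ac3": 95.13, "aac": 96.82, "mp3": 100.18, "cook": 102.08},
    "fuzev1": {"wma": 99.82, "aac": 101.22},
}


def faster_codecs(results):
    """Per target, list the codecs whose thumb decode time is below 100% of ARM."""
    return {
        target: sorted(codec for codec, pct in codecs.items() if pct < 100.0)
        for target, codecs in results.items()
    }


def time_change_percent(pct):
    """Change in decode time relative to ARM, in percent (negative means faster)."""
    return round(pct - 100.0, 2)
```

For instance, `time_change_percent(95.13)` gives a decode time about 4.87% shorter for ac3 on Clipv2, while cook at 102.08% takes about 2.08% longer.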

Rafaël Carré commented on 2010-06-05 21:44

100+kB binsize cut on gigabeats
