FS#12431 - SH gcc 4.6.2 with link-time optimization, for Archos targets
Opened by Boris Gjenero (dreamlayers) - Thursday, 08 December 2011, 05:56 GMT
I'm now able to build a working copy of Rockbox r31177 for my Archos Recorder V2 using binutils 2.21.1 and gcc 4.6.2, with -Os -flto. The main advantages are a 7 kb decrease in binary size and memory use, and automatic discarding of unused code. The main disadvantage is much slower linking. I don't know if this is worth it.
The new binutils is needed because a linker plugin is needed to enable link time optimization of object files stored in library archives, like libfirmware.a. Linker plugin support is automatically detected by gcc, so there's no need for -fuse-linker-plugin.
The attached gcc patch is based on the current gcc-4.0.3-rockbox-1.diff by Jens Arnold (amiconn). I still need to investigate whether the workaround in gcc/config/sh/sh.h is actually needed. Including it shouldn't cause any problems. You can find info about it in IRC logs around this date: http://www.rockbox.org/irc/rockbox-20060427.txt
The attached Rockbox patch changes rockboxdev.sh to build this toolchain, changes configure to add -flto for gcc 4.6.0 and above, and makes various other changes so Rockbox builds properly. The gcc patch can't be automatically downloaded by rockboxdev.sh, so put it in the download directory, which is /tmp/rbdev-dl by default. Note that configure will only use -Os if it finds "rockbox" in the sh-elf-gcc version string, so if you want to try an unpatched gcc, you need to edit configure or the generated Makefile.
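To make the two checks above concrete, here is a rough sketch of the decision logic in C. This is only an illustration: the real checks live in the configure shell script, and the function names and the assumption that the version number arrives as a plain "major.minor.patch" string are mine, not taken from the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if a version number such as "4.6.2"
 * is 4.6.0 or newer, i.e. the case where -flto would be added. */
bool gcc_supports_lto(const char *version)
{
    int major = 0, minor = 0;

    /* Assumes the string starts with "major.minor"; anything else is rejected. */
    if (sscanf(version, "%d.%d", &major, &minor) != 2)
        return false;
    return major > 4 || (major == 4 && minor >= 6);
}

/* Hypothetical helper: true if the full compiler version string
 * mentions "rockbox", i.e. the case where -Os would be used. */
bool is_rockbox_gcc(const char *version_string)
{
    return strstr(version_string, "rockbox") != NULL;
}
```

With an unpatched gcc the second check fails, which is why -Os has to be added by hand in that case.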
Most of the code changes simply add __attribute__((used)) to stuff that gcc -flto would otherwise throw away. When C code is only referenced by assembler code, gcc will throw it away. This even happens for references from inline assembler in the same C file. Functions in apps/plugins/lib/gcc-support.c were also getting discarded, resulting in "defined in discarded section" errors.
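A minimal sketch of the kind of change described above. The names are hypothetical and not taken from the Rockbox tree. The assumption is that tick_handler() is normally reached only from an assembler stub, so gcc sees no C-level reference to it.

```c
#include <stdint.h>

/* Hypothetical counter updated by the handler below. */
static volatile uint32_t tick_count;

/* Assumed to be called only from assembler code. With -flto, gcc sees
 * no reference from C and may discard the function; the "used"
 * attribute tells gcc to emit it anyway. */
__attribute__((used)) void tick_handler(void)
{
    tick_count++;
}

/* Ordinary C accessor, so the effect of the handler can be observed. */
uint32_t get_tick_count(void)
{
    return tick_count;
}
```

The attribute only stops gcc from throwing the function away. It does not change how the function is called or what it does.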
Link time optimization shuffles code around, and then divides it into several large assembler files. (Note how in rockbox.map, instead of the normal .o files, you see a bunch of .ltrans.o files.) Code from the same source file may end up in different assembler files. This is why the "bsr _UIE" couldn't reach UIE(), and why .global is needed for _start_thread and _UIE4.
Various little notes: I see no improvement with GLOBAL_LDFLAGS=-fwhole-program, so gcc must be detecting that properly. Adding -ffunction-sections -Wl,--gc-sections is also not helpful. The patch doesn't fix some warnings added by using gcc 4.6.2, but there are only a few, and they should be easy to deal with. It also doesn't make changes needed for -flto for other targets. Without -flto, gcc 4.6.2 generates a binary that's 3 kb bigger than the gcc 4.0.3 binary.