Rockbox

  • Status Closed
  • Percent Complete 100%
  • Task Type Bugs
  • Category Infrastructure → Build environment
  • Assigned To No-one
  • Operating System Another
  • Severity Medium
  • Priority Very Low
  • Reported Version Rbutil git
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by Barracuda72 - 2018-12-13
Last edited by speachy - 2020-10-20

FS#13164 - Rockbox commit ce0b31d87 fails to build due to overlapping sections in rockbox.elf

I am trying to build Rockbox for the Sony NWZ-380 from the Git repo, commit ce0b31d87db3c4c1c1bfb535c50770d33e9c4aaf.
arm-elf-eabi-gcc version 5.4.0, ARM binutils 2.30.
Everything goes smoothly until the very last step, where this happens:

… LD duke3d.ovl
LD bmp.ovl
LD jpeg.ovl
LD png.ovl
LD ppm.ovl
LD gif.ovl
LD rockbox.elf
/usr/libexec/gcc/arm-elf-eabi/ld: section .ARM.exidx VMA [00000000600a9548,00000000600a954f] overlaps section .bss VMA [00000000600a7f20,00000000600ff773]

I’ve been using Rockbox on my NWZ-384 for a couple of years now, and there were no problems building it in the past. I was also very glad that the NWZ-380 port reached stable status in 3.14,
but now something seems to be broken (nothing terrible, it seems; it probably just requires some adjustments to the linker scripts, but still).

Closed by  speachy
2020-10-20 21:37
Reason for closing:  Invalid
Additional comments about closing:  

I'm going to close this, as we don't support building with non-bundled toolchains, and we specifically patched our toolchain to disable the exception generation that resulted in this overlap.

I fixed it by adding:
.ARM.exidx : {
    __exidx_start = .;
    *(.ARM.exidx* .gnu.linkonce.armexidx.*)
    __exidx_end = .;
} >DRAM

to the linker script ram.link generated at build time. It seems that libgcc's implementation of udivmoddi4 requires exception handling info that is stored here.
It's impossible to build recent GCC versions without exception support (I've tried many, from 4.9 to 8.2), at least not without some patching,
so I think this hack should be included in Rockbox. This section is only 8 bytes long, so for the 6 MB firmware it shouldn't really matter.
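
The 8-byte figure can be checked against the addresses in the linker error quoted in the report. Below is a minimal Python sketch, assuming that ld prints each VMA range as inclusive hexadecimal bounds; the helper names (parse_sections, size, overlaps) are made up for illustration and are not part of Rockbox or binutils.

```python
import re

# Linker error copied from the build log in this report.
LD_ERROR = (
    "section .ARM.exidx VMA [00000000600a9548,00000000600a954f] "
    "overlaps section .bss VMA [00000000600a7f20,00000000600ff773]"
)

# Matches "section NAME VMA [START,END]"; bounds are assumed to be
# hexadecimal and inclusive.
SECTION_RE = re.compile(r"section (\S+) VMA \[([0-9a-fA-F]+),([0-9a-fA-F]+)\]")


def parse_sections(message):
    """Return {name: (start, end)} for each section named in the message."""
    return {
        name: (int(start, 16), int(end, 16))
        for name, start, end in SECTION_RE.findall(message)
    }


def size(bounds):
    """Size in bytes of an inclusive [start, end] address range."""
    start, end = bounds
    return end - start + 1


def overlaps(a, b):
    """True if two inclusive address ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


sections = parse_sections(LD_ERROR)
exidx = sections[".ARM.exidx"]
bss = sections[".bss"]

print(f".ARM.exidx size: {size(exidx)} bytes")  # 8 bytes
print(f"overlaps .bss: {overlaps(exidx, bss)}")  # True
```

The .ARM.exidx range falls entirely inside the .bss range, which is consistent with the section having no placement of its own in the linker script until the fragment above is added.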

Admin

(blowing the dust off of this a bit)

Rockbox is only built and tested with the toolchains generated by tools/rockboxdev.sh. Our ARM toolchain is still at 4.4.4 and is patched to disable exceptions, and building with anything newer is at your own risk.

I'm in the process of trying to drag everything forward to gcc 4.9.4, and that's uncovered multiple bugs (and at least one target still crashes at startup!)

Anyway,

I'm in the process of trying to drag everything forward to gcc 4.9.4

Why? The current GCC version is 9.3, and 10.1 is soon to be released. Is there something that keeps Rockbox hardwired to such ancient fossils?

I use GCC 9.2.0 to build Rockbox; with the little hack mentioned above everything is fine. I use my player 3-4 hours a day and can't find any problems whatsoever.

I know that Rockbox supports many targets, but aren't all of them ARM anyway? Shouldn't it be fine to migrate to newer GCC versions?

Admin

The simple answer is that there is code in rockbox that is silently mis-compiled or mis-optimized by newer GCC versions. This is nearly always due to latent "ambiguities" or outright bugs in the code, but older GCC versions didn't trip over them or had less-aggressive optimizations.

Right now the only known remaining issue is that the PP-based targets (eg most of the iPods) simply don't work when built with 4.9.4, panicking during the early startup code. The culprit appears to be the low-level threading or locking code, written in asm. It's not clear if this is PP-specific, or afflicts all armv4t targets, but newer armv5/armv6 targets appear to be fine.

And then there's outright bugs in GCC – the sh targets are officially stuck on GCC 4.0 due to a major compiler bug that screws up jump tables over a certain size – that bug might have been finally fixed in the gcc 4.5 timeframe, but I haven't heard back from anyone willing (and able) to test my experimental 4.9-based builds.

Putting aside bugs, GCC also changes how some things work – some time during the 4.1→4.9 transition the asm register allocation/calling conventions on MIPS targets subtly changed, and caused major threading problems leading to a ton of crashes. Something similar is probably what is causing PP/armv4t to break so badly.

…And these sorts of things are par for the course on bare-metal applications that have to get very creative with code to make things fit or perform acceptably.

Finally, gcc 4.9.4 is the newest toolchain used by any of the existing rockbox targets (ie mips and hosted linux) and the oldest gcc version that still compiles with modern GCC 10. Once this migration is sorted, then we can look to the future, which I'm sure will uncover more exciting bugs and other misbehaviors…

sh1 targets: basically all the archos players
m68k: iriver h100, h120, h300, iaudio x5, m5
arm7tdmi: all PP based players, most notably all iPods up to 5.5G / Nano1G; e200, c200, mrobe100, sa9200, iriver h10, and probably some more I forgot.

the sh targets are officially stuck on GCC 4.0 due to a major compiler bug that screws up jump tables over a certain size

Are you talking about this one?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4516

Reported as fixed, it seems. Though it mentions GCC 3.x, not 4.x.

but I haven't heard back from anyone willing (and able) to test my experimental 4.9-based builds.

Doesn't it mean there's just no users? If so, why support this target / arch in newer Rockbox versions and increase technical debt?

Once this migration is sorted, then we can look to the future, which I'm sure will uncover more exciting bugs and other misbehaviors…

The only problem that I see is that you could be doing the same work twice, if not more. All this "register allocation/calling conventions" stuff has probably changed in more ways than one in recent years.
(Actually, it's what this bug is all about - newer GCC versions ship with static libraries that require exception support.)
At the same time, some bugs may have been fixed (and don't require workarounds anymore), some others were introduced, and so on and so forth.
I'm just afraid that you may end up re-writing code that you just wrote…

Anyway, thank you for your work on Rockbox and thank you again for spending your valuable time to explain in detail what's happening here!

Admin
Reported as fixed, it seems. Though it mentions GCC 3.x, not 4.x.

It was very much still a problem well after that PR was marked as fixed. :)

I haven't been able to find definitive proof that the bug rockbox triggered was ever fixed (or not), but there were other PRs that implied it was working. I figured it would be simpler to just bump the toolchain, fix the build warnings/problems, and see what happens on real hardware… which leads me to…

Doesn't it mean there's just no users? If so, why support this target / arch in newer Rockbox versions and increase technical debt?

I've already proposed dropping all archos targets in the near future, which means sh toolchain and HWCODEC support (and likely a great deal of other technical debt) can get ejected. But that said, I was able to relatively easily respin the sh toolchain and fix the compile/link problems that 4.9 uncovered, so my thinking is that if it works, great; if not, it's not worth the trouble to properly fix given the ancient nature of those targets.

Actually, it's what this bug is all about - newer GCC versions ship with static libraries that require exception support.

…unless you compile libgcc with -fno-exceptions :) But in all seriousness, the reason exceptions were disabled is because the various low-level platform code would have to be partially re-written to handle those exceptions properly. Simply adding the linker section will allow things to link, but it's the runtime behaviour that is more concerning.

I'm just afraid that you may end up re-writing code that you just wrote…

It's a real concern, but I don't think it very likely on a significant scale, at least not based on my own experience taking multiple baremetal ARM codebases from early-GCC4 through GCC8 (driven mostly by the need to have GCC automatically generate the most compact code possible). Granted, none had the same level of technical baggage or overall complexity as Rockbox. :)

(But back to the PP crash, it's looking like the problem is in the multi-cpu threading code unique to PP targets, and not an arm toolchain bug per se. Probably something that the asm code relies on is getting optimized away.)

Admin

Some PP targets (mini2g) are working fine now with the updated toolchain, but at least one other (ipodcolor) still has issues. Progress is slow.

Admin

We're now using GCC 4.9.4 (and -Os) for all targets, including ARM.
