Rockbox

Tasklist

FS#12302 - Add backtrace to the panic screen on ARM

Attached to Project: Rockbox
Opened by Marcin Bukat (MarcinBukat) - Friday, 30 September 2011, 19:38 GMT
Last edited by Marcin Bukat (MarcinBukat) - Saturday, 14 April 2012, 16:21 GMT
Task Type Patches
Category User Interface
Status Closed
Assigned To No-one
Operating System All players
Severity Low
Priority Normal
Reported Version Release 3.9
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Port (rather trivial) of unwarminder http://www.mcternan.me.uk/ArmStackUnwinding/
This gives backtrace in Panic screen in addition to PC value
This task depends upon

Closed by  Marcin Bukat (MarcinBukat)
Saturday, 14 April 2012, 16:21 GMT
Reason for closing:  Accepted
Additional comments about closing:  commited as b4eab599513324dcaffa4c5693345ae11f3f9725
Comment by Thomas Martitz (kugel.) - Thursday, 06 October 2011, 10:11 GMT
I'm in favor! Does this also work for explicit panicf() calls?

I think it would be nicer to compile the unwinder as a separate lib and link to it later, perhaps with an compile time switch.
Comment by Marcin Bukat (MarcinBukat) - Thursday, 20 October 2011, 10:19 GMT
1) Changed get_sp() to work for all modes
2) added backtrace() to panicf()
Comment by Boris Gjenero (dreamlayers) - Friday, 02 December 2011, 17:17 GMT
I just did a panicf() in r31107 at the start of opendir_cached() in firmware/common/dircache.c. This is what I got:
backtrace start
A: 0x000C0000
backtrace end
It makes no sense. According to rockbox.map, the address is in .bss from apps/tagtree.o. Also, there can't only be one function on the stack. I don't expect memory to be trashed at this point. Rockbox would continue to function normally if it wasn't for the panicf(). A second attempt gives the same result.

I would love to see a working backtrace. For example, a backtrace could maybe help me figure out  FS#12423 .

Edit: Oops, I forgot about -fomit-frame-pointer. However, adding -fno-omit-frame-pointer after the -O just gives me an empty backtrace instead. I'm unable to test without -O because due to unrelated compile errors which happen even in unmodified svn. I'm running r31107 on a 5G iPod with 32MB RAM.
Comment by Thomas Martitz (kugel.) - Friday, 02 December 2011, 20:28 GMT
IIUC the frame pointer isnt needed. wodz noted at some point that there's a bug with panicf().

Try to make an swi, that should result in UIE being called with backtrace.
Comment by Boris Gjenero (dreamlayers) - Saturday, 03 December 2011, 06:13 GMT
I made a data abort and got a good backtrace. I'm very impressed that it works even with -fomit-frame-pointer!

I'm not sure what's wrong in panicf(). Passing the return address (LR) as PC and the current SP within panicf() as SP seems wrong. That's a PC value from the function that called panicf(), and an SP value that has been decreased by panicf(). Shouldn't both correspond to the same moment? However, I haven't been able to make it work based on this assumption.
Comment by Marcin Bukat (MarcinBukat) - Saturday, 03 December 2011, 10:00 GMT
I am aware that unwinder doesn't work when called from panicf() and I don't know why. I tried a few different approaches to get consistent PC,SP values but noon was working. I am fiddling now with skyeye simulator and this may help to debug unwinder. Kugel mentioned about possibility to debug this on ARM RaaA but he seems to be busy with other work and I don't have any ARM RaaA capable hardware.
Comment by Boris Gjenero (dreamlayers) - Saturday, 03 December 2011, 19:12 GMT
You could write memory and SP to a file, and then experiment with unwarminder on a computer. There is debug output which can be enabled, and you can use a debugger.
Comment by Marcin Bukat (MarcinBukat) - Monday, 05 December 2011, 12:20 GMT
This seems to finally fix backtrace from panicf().
Little explanation as the approach is hacky (IMO) and maybe someone comes with better idea. The main problem is that we need matching SP,PC input values for unwinder AND PC should be after panicf(). I don't know if this is the problem with panicf() itself being noreturn or unwinder shouldn't enter itself. The situation is further complicated by the fact that panicf() is variadic function. So to get consistent SP,PC I created naked function wrapper around panicf() just to save SP before entering actual panicf() (this should be the same as just after returning) and I save it to r4 register as this one doesn't hold valid parameters to function being wrapped. Then wrapper simply branch to actual panicf(). At the very beginning of panicf() I copy r4 to some local variable and get PC just after the function with gcc intrinsic function __builtin_return_address(). Although this seems to work I see potential problem that compiler might reorganize the code during optimization and trash our hackly saved in r4 SP value.
Comment by Marcin Bukat (MarcinBukat) - Monday, 05 December 2011, 14:45 GMT
The binsize impact is about 5.5kB for regular arm build (tested on rk27xx build) and about 4.3kB for thumb. This is with -O flag as is used by default on arm builds. Side note: why exactly do we use -O instead of -Os?
Comment by Boris Gjenero (dreamlayers) - Monday, 05 December 2011, 15:24 GMT
unwarminder_v3.diff works for me where v2 didn't work before. Thanks for the fix!

5.5kb is fine on targets with plenty of memory. This should be included conditionally, based on MEMORYSIZE, maybe for 8 and above.

I don't know why -O was chosen. When there's plenty of memory, such as 32MB on the 5G iPod, energy savings from more efficient code are larger than energy savings from enlarging the buffer via -Os. For best results, optimization settings could be adapted to how often code is used. For example, it's possible via "#pragma GCC optimize" and the optimize function attribute. (Of course, adding those throughout Rockbox would be a significant amount of additional work.)
Comment by Marcin Bukat (MarcinBukat) - Monday, 05 December 2011, 15:36 GMT
But the fact is that -Os produce faster code than -O
Comment by Michael Chicoine (mc2739) - Tuesday, 06 December 2011, 13:13 GMT
I have tried the latest patch on e200 and after forcing a stack overflow I noticed that the pc was truncated and sp was not displayed due to screen width.

The attached patch to firmware/target/arm/unwarminder/client.h formats the output to fit the e200 screen. Further optimizations may be needed for other small screen targets (i.e.clip).
Comment by Marcin Bukat (MarcinBukat) - Tuesday, 06 December 2011, 19:10 GMT
Client side needs fine tuning for sure. Your change increase line count though which I think isn't the best choice.
Comment by amaury pouly (pamaury) - Saturday, 10 December 2011, 12:13 GMT
I've tried it on the fuze+ and it works except for the display of sp/pc which is too long for one line. I also noticed that the backtracer doesn't play well with gotos which is not too surprizing given how it works. I've started implementing some symbol map code to print the functions name. I hope to have working code soon.
Comment by amaury pouly (pamaury) - Thursday, 15 December 2011, 17:43 GMT
Here is a patch which adds a symbol table. It requires an external program to read rockbox.map and then write a symtable.c file which get compiled in. After 3 iterations of build map > compile > link, you hopefully get a binary read to backtrace with symbol names. This is only one of the technique I've explored. It's a bit tricky because it requires some makefile modifications and if you don't iterate enough times, you will get a bad symbol table. I have ideas for more robust approaches but this one is also the most general: with the external program you can include virtually any information based on rockbox.map.

Loading...