Rockbox

  • Status Closed
  • Percent Complete
    100%
  • Task Type Bugs
  • Category Music playback
  • Assigned To No-one
  • Operating System Another
  • Severity High
  • Priority Very Low
  • Reported Version Daily build (which?)
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by JeanLouisBiasini - 2011-12-05
Last edited by Buschel - 2011-12-11

FS#12429 - Fuze+: playback failures (data aborts/undef instr/etc) with several codecs

Several problems occur with these formats on the Fuze+ port. The problem has been identified as player-specific, as it doesn’t occur on the Clip+ and has been seen on 2 different Fuze+ units.

The problems are quite random, which makes them difficult to track down precisely. Nevertheless, 3 different phases can be observed in the behaviour of the Fuze+ with mpc files:

- Normal behaviour (files can be selected and played; playlists also play without problems)
- Non-playing behaviour (files are skipped one after the other until the end of the dynamic playlist). Nothing consistent was found on how to go from normal to non-playing behaviour or the reverse.
- Buggy behaviour (you can actually play and skip between files, but after playing one file the player will either hang or panic at the moment it loads the next one):
Two different sets of values have been observed so far (the backtrace remains the same):
Data abort at 63E57B28
FSR 0x8 (domain 0, fault 8)
address 0x64000000
backtrace start
A: 0x63E56AE8
A: 0x60040D28
A: 0x6003D87C
A: 0x6003DB50
backtrace end

or:

Data abort at 63E57B24
FSR 0x8 (domain 0, fault 8)
address 0x64000001
backtrace start
A: 0x63E56AE8
A: 0x60040D28
A: 0x6003D87C
A: 0x6003DB50
backtrace end
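
The FSR line in the dumps above can be unpacked by hand. A minimal sketch, assuming the classic ARMv5-style Fault Status Register layout (fault status in bits 3:0, domain in bits 7:4), which is consistent with the "domain 0, fault 8" printed next to 0x8. The struct and function names are made up for illustration and are not Rockbox code:

```c
#include <stdint.h>
#include <stdio.h>

/* Decoded view of a Fault Status Register value (assumed ARMv5-style layout). */
struct fsr_fields {
    unsigned status;  /* bits [3:0]: fault status code */
    unsigned domain;  /* bits [7:4]: domain of the faulting access */
};

/* Split a raw FSR value into its status and domain fields. */
static struct fsr_fields decode_fsr(uint32_t fsr)
{
    struct fsr_fields f;
    f.status = fsr & 0xFu;
    f.domain = (fsr >> 4) & 0xFu;
    return f;
}

/* Print the fields in the same shape as the panic screen quoted above. */
static void print_fsr(uint32_t fsr)
{
    struct fsr_fields f = decode_fsr(fsr);
    printf("FSR 0x%X (domain %u, fault %u)\n",
           (unsigned)fsr, f.domain, f.status);
}
```

Calling print_fsr(0x8) reproduces the FSR line of the dumps above; what each status code means has to be looked up in the ARM documentation for the core in question.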

Several tests were made so far to establish that:

1) The problem is not file-specific, as normal, non-playing and buggy behaviour all occur on the same sets of files. Different tag encodings, and even wiping all the tags, didn’t help either.

2) The problem is not related to the database, as not building it and/or erasing all DB files doesn’t get rid of the buggy behaviour.

3) The problem is not related to specific settings or, if it is, it is related to the default settings. Recompiling and reinstalling Rockbox from scratch doesn’t solve the buggy behaviour.

4) Nothing consistent was found about how the player goes from normal to non-playing or the reverse. However, some consistent (systematic) ways to go from the non-playing to the buggy phase were found:
- Reinitializing the database (whether there is no DB or an up-to-date one is already present). But the player will then be back in the non-playing phase after reboot.
- Playing a big directory with a lot of files: after skipping 20 to 50 of them, the player will eventually start playing one of them (a random one; nothing consistent here either).
- Deactivating the directory cache option (Settings > System > Disk). This is the only way that persists even after reboot.

5) The backtrace of the panic occurring in buggy behaviour was fed into the ./utils/analysis/find_addr.pl tool, with the following result:
https://gist.github.com/1434466

6) The FLAC format has been seen with the same non-playing behaviour, but the buggy phase gave no backtrace:
Data abort at 63E71120
FSR 0x1 (domain 0, fault 1)
address 0x00004D6B

Closed by  Buschel
2011-12-11 16:34
Reason for closing:  Fixed
Additional comments about closing:  

The MPC sv7 crash is fixed with r31211, flac crash is fixed with r31207 (enlarged MAX_FRAMESIZE).

The undefined instruction and the dircache influence will be handled in separate tasks.

find_address 0x63E71120 1 (flac’s data abort) returns:
https://gist.github.com/1434486

The revision for all those tests is 31145.

Yesterday evening I had a lockup issue several times with my iPod nano 2G as well. When playing back mpc and changing the volume, the device stopped playback, the time position was stuck, and no pause was signalled. Only a restart solved the problem. This was with r30907 (and not with r31055 as claimed before I edited this comment).

Do you use other formats as well? To me this does not sound like a mpc or flac related issue, but like a general code playback issue.

I’m also using mp3 and ogg (quality 8 and 7) and never had the problem with them. The volume doesn’t seem related to this problem (it always happens at the end or beginning of a file). The main issue is the file skipping and the panic when files finally get played. Have you experienced this?

But I agree it is not directly related to the codec, since this issue doesn’t seem to appear on other devices… The backtrace clearly points to the codec, but there must be another problem there…

I did not experience issues after skipping yet. Such problems could point to buffering… Another guess: I know that mpc is not very error resilient. If the data is corrupted in RAM (e.g. some overwritten or incomplete data segment which holds audio file data) mpc might crash. mp3 is more stable, I cannot judge ogg and flac though…

I’m not yet experienced enough with programming to know about memory, but there are some clues that could point to a problem in memory handling. I sometimes have an issue even with mp3 and ogg: “undefined instruction” when reaching the end of a song. Furthermore, the fact that the database is not related to the problem, but that initializing it has some effect on it, is quite strange!

And I noticed today that mpc files play okay in a small directory but will end with a data abort in a big directory full of files (2.6 GB, 313 files).

Argh, just to be clear: they play OK sometimes in a small folder, then we are back again in buggy or skippy mode. But the fact that the problems are different in big and small folders is also a clue pointing to memory, isn’t it?

Do you have those issues with both enabled and disabled directory cache? For testing please change this setting and perform a clean restart.

As written in the bug description, deactivating the directory cache gets rid of the file skipping problem but not of the data abort.

Someone on the forum thread for the Fuze+ port seems to have the same issue with “m4a (aac hev2, encoded by Nero AAC, two-pass mode), and ogg/vorbis (encoded by the version of aoTuv included in the latest FreAc)”.

Can you place a link to the relevant forum thread?

It would be nice to have .elf files, but I guess the MPC codec I built from r31155 matches the reported addresses.

The 0x63E57B28 and 0x63E57B24 data aborts are where values are loaded from r->buff[] in mpc_bits_read() at:
ret = (r->buff[0] | (r->buff[-1] << 8)) >> r->count;
It is found in mpc_bits_reader.h and inlined right after the first mpc_demux_fill() call in the d->si.stream_version <= 7 branch of mpc_demux_decode_inner(), which is inlined within mpc_demux_decode(). The start of the backtrace is the call of mpc_demux_decode() from codec_run(). It’s interesting that the failing data accesses are just past the end of RAM.
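
To illustrate why that expression can fault: it reads both buff[0] and buff[-1], so if the read pointer has walked to the end of the memory it is allowed to touch, the load itself aborts. The sketch below is a hypothetical bounds-checked variant of that one load. read_window_checked and its parameters are invented for this example and are not part of libmpcdec or Rockbox, and it makes no claim about how the crash was actually fixed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical guarded version of the 16-bit window load quoted above.
 * [start, end) is the valid buffer, pos plays the role of r->buff and
 * count the role of r->count (bits to shift out, 0..7).
 * Returns 0 and stores the window in *out, or -1 if pos[0] or pos[-1]
 * would lie outside the buffer.
 */
static int read_window_checked(const uint8_t *start, const uint8_t *end,
                               const uint8_t *pos, unsigned count,
                               uint32_t *out)
{
    /* pos[-1] needs pos > start; pos[0] needs pos < end. */
    if (count > 7 || pos <= start || pos >= end)
        return -1;

    *out = ((uint32_t)pos[0] | ((uint32_t)pos[-1] << 8)) >> count;
    return 0;
}
```

With a check like this, a reader that runs off the end of its data reports an error to the caller instead of triggering a data abort.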

The FLAC data abort at 0x63E71120 is weird. It is past the end of .bss, and it doesn’t seem like FLAC uses codec_malloc().

Regarding mpc, I can confirm that the bug seems to appear only with sv7. So far sv8 seems to be unaffected. Regarding the flac one, it’s also weird that it leaves no backtrace. As I said, the problem is also that the bug is not always there… For the moment I don’t have problems with flac anymore… I will try different compression levels to be sure…

Oh, and another thing I noticed: when a bug occurs on mpc, it SEEMS (I’m not absolutely sure) that on reaching the end of the file, the progress bar shows more remaining space than the countdown needs to reach its end… Another thing is that when the player just hangs rather than panicking, although the bug occurs at the end of the file, it shows a WPS with the progress bar at about 1/6 of the file.
For a link to the forum: http://forums.rockbox.org/index.php/topic,26284.msg186784.html#msg186784 and the following posts by the same user.

Regarding the flac one, it’s also weird that it leave no backtrace.

The backtrace algorithm examines both the stack and code. If the code jumps somewhere crazy, I don’t expect to get a backtrace. There, FLAC is executing code from past the end of .bss. I don’t see anything that would put code there, so it seems PC shouldn’t be in that area. There is probably some memory corruption. (BTW. Stack corruption can also cause backtrace to fail.)

Why is this high severity? It only affects the unsupported fuze+ target, no?

OK, the skipping bug (files getting skipped one after the other) is back, and it does affect sv8 as well! I can confirm that deactivating the directory cache gets rid of it and reactivating the directory cache brings it back. No idea why it came back suddenly, but it was just after a reinstallation. Perhaps I should open a separate bug report for this skipping behaviour, since it doesn’t seem related?

Why is this high severity? It only affects the unsupported fuze+ target, no?
Severity is high, because the impact of the failure itself is high. We could lower the priority of course.

I have experienced the undefined instruction, which comes up very often, with the new backtrace patch:
Undefined instruction at 0000AC44
backtrace start
pc: 0x0000AC44
sp: 0x00005350
backtrace end

find_addr.pl returns:
jean-louis@debian:~/Bureau/rockbox-devtree/rockbox/buidl$ ../utils/analysis/find_addr.pl 0x0000AC44 1
/home/jean-louis/Bureau/rockbox-devtree/rockbox/buidl/firmware/libfirmware.a(thread.o) -> threads

jean-louis@debian:~/Bureau/rockbox-devtree/rockbox/buidl$ ../utils/analysis/find_addr.pl 0x00005350 1
/home/jean-louis/Bureau/rockbox-devtree/rockbox/buidl/apps/codec_thread.o ->

Short update on this: The MPC sv7 crash is fixed with r31211, flac crash is fixed with r31207 (enlarged MAX_FRAMESIZE).
