This is the bug/patch tracker for Rockbox. Click here for more information.
Quick links: Bugs · Patches · Rockbox frontpage
FS#11863 - Playback / buffer freezes
Attached to Project:
Rockbox
Opened by Andree Buschmann (Buschel) - Thursday, 06 January 2011, 10:42 GMT+2
Last edited by Andree Buschmann (Buschel) - Wednesday, 06 April 2011, 21:59 GMT+2
Opened by Andree Buschmann (Buschel) - Thursday, 06 January 2011, 10:42 GMT+2
Last edited by Andree Buschmann (Buschel) - Wednesday, 06 April 2011, 21:59 GMT+2
|
DetailsExtracted from
On iPod nano 1G and iPod Video's the following were issues observed: 1) Device does not fully start (no menu visible, background image loaded) 2) Playback stopps or does not start (WPS playtime is 0:00) 3) "data abort" on white screen Except (1) and (3) the iPod's UI responds. In situation (2) loading mpegplayer or shut down via longpress Play/Pause is not possible. In general plugins (tested with test_fps and test_mem) can be loaded and performed normally. On iPod Video this happened with builds which based on r28915, r28940 and r28960. So, before the ATA but after some kernel/thread/priority changes. I could provoke this failure twice when generating the DB during playback. But these failure also happened randomly during standard playback (mp3 and mpc in my case). As long as we do not know what is happening I will assume this is not specific for PP502x processors. |
This task depends upon
Closed by Andree Buschmann (Buschel)
Wednesday, 06 April 2011, 21:59 GMT+2
Reason for closing: Fixed
Additional comments about closing: All relevant issues have been fixed (also see 'related tasks') in the past weeks.
Wednesday, 06 April 2011, 21:59 GMT+2
Reason for closing: Fixed
Additional comments about closing: All relevant issues have been fixed (also see 'related tasks') in the past weeks.
A) r28992
B) r28992 w/o r28877
C) r28992 w/o r28877 r28928 r28923 r28918 r28909 r28901
Test case is playing a large playlist from my DB, starting with the same track for all builds.
A) locked up after about 30 minutes (during 1st or 2nd rebuffering after playback started)
I will focus on a test of A) now to get more confidence regarding influence of r28877.
I had a freeze after reboot from emergency USB connection (those aren't uncommon though). Whether it was that is questionable since it played a little bit before freezing and those tend to happen almost immediate upon booting. The freeze described in the previous paragraph was the firmware load after paperclipping from post-emergency USB freeze.
See if the attached patch reveals anything since if priority changes more often, any circular waits might have a greater occurance. But, I don't see how plugging would affect that.
FYI: I've got the H10 plugged again and it seems alright so far.
This is getting weird... Hopefully we will find a way to reproduce this. There definately _is_ something going wrong....
Most noticeable problem was ffw not working: Disk spinning forever, no responding UI, no sound, buffering section in debug menu showing negative value. Only savior was menu+center. A suspicious buffering related thing.
Don't remeber from what revision it happened. But after few hundreds of revision it gets mysteriously fixed!
And guess what? That happens again with some update and gets fixed again with some update after that. * 5 or more
That ruthless fucker which whispered 0dBFS screaming direct to my ear while I was alone on a bed at dark night might not be little piece of buggy codes but gcc and linker bugs that screw up some 'chosen' revision.
BTW I use iPod Video 80G and enjoyed r28831. I haven't encountered any problems with that.
Edit: Also interesting -> http://www.rockbox.org/irc/log-20110113#22:56:55 "the problem" is a strange playback issues as well.
FS#11865for an issue cause by r28652 / r28608 which seems to be related (and is the one discussed in the IRC conversation mentioned).AFAICT, nothing like this appears to be happening on my SD devices.
There is a bug for sure that appears alignment-related.
It behaves as if dircache is disabled although setting says it is enabled.
I transferred 100 songs and unplgged and got a data abort (at 5xxxx).
It seems flac cuesheet (or skipping track) doesn't work too. (Not sure it is caused by this)
(Load single flac -> 'Browse cuesheet' to change track -> Playback stops as soon as I change track (w or w/o full scale noise), and UI freezes.)
Edit: This patch is not working correct. It has negative side-effects on my simulation and iPod Video.
The test case which always failed for me on iPod Video target and simulation works fine on simulation with this patch. It also survived some intense seeking/skipping. I will further test on target.
Peter, can you describe the issues you had in detail? And can you explain what your patch does and what the failure (code-wise) was?
Edit: change .diff to .patch because IE is ok with viewing that
Some questions so far:
1) Is the patch by Peter Edwards included as well or did you find another way?
2) If you should submit those changes: Can you split this to several independent commits? E.g.
a) changing variables / structs (e.g. "data_counters" to "dc" or "h->id" to "handle_id")
b) removing mutexes
c) the logical changes?
If there should occur any side effects we can then better bisect and track down any issues.
Thanks for your work!
1) No, though I did find one spot so far where > should be >= to avoid the said situation. It's the one I know of ATM. Rebuffering check needed to be >= as well since file position is 0-based and then account for seeking to EOF.
2) a) b) c) Not completely possible since things like changing the locking scheme and fixing up problems need changing things for thread safety. Perhaps the minor things mentioned in 1) could be separated from thread stuff.
Not calling find_handle after a queue_send call was not correct but the need largely obviated (and there's no need to worry if a compaction-triggering event won't be in the queue before the sender's message).
handle_id is, for example, to cache the id through possible buffer compactions because h->id would no longer be valid thus lost entirely.Logical changes are rather minor except moving rebuffer logic almost entirely onto the buffer thread, which avoids having to re-find the handle after each queue_send and passes fewer requests to buffering.
Naturally, I'll remove whatever becomes obsolete as a consequence of tigtening things up.
If it's not a cacheline issue perhaps there could there be a stray drive interrupt?
I wrote a huge and confuse (but very detailed) forum post on what happens with my device, here: http://forums.rockbox.org/index.php/topic,27260.msg178068.html#msg178068
Note: There are bugs that exist in SVN that I will not bother to fix without at least a minor renovation of the codec format since it's almost impossible to fix them otherwise but that's a bit more extreme than I would like to get at this time. They have to do with depending upon the presence of a codec to signal seeks and skips at a time when it's quite impossible to have one handy.
Said existing bugs are the only way playback seems to break for me but with those it's stable enough to stop and restart to recover. I haven't been able to actually crash anything skipping/seeking on an otherwise stable target.
I'll try patching that against my trunk and when I'm done I'll tell you the results.
There are still issues when codecs fail for some reason. Not much change there except maybe the specifics of what goes on. Seeking/skipping at the end of the last track still doesn't work as it should (by seeking/skipping) but now the playback won't go dead.
------------
In the technical side, my plan is to divide codec operation into several exported entrypoints:
1) The init code which does the main crt0/codec initialization and returns. If it creates threads, it must do it here or notify of any failures. If it fails this, 2) won't be called but 3) will.
2) The decoding code which returns after every track completes, possibly subdivided:
a) seeking
b) decoding
3) Cleanup code (most important if it creates threads internally or uses any other system resources)
The codec will stay resident and ready for more tracks until 3) is called. It must reset itself if it previously errored in 2) for any reason.
All this places the presence of the codec under playback control and not codec control, which lead to the mentioned difficulties. Right now, the entrypoint cannot be called again without a full restart. Other side effects include simpler and more consistent codec implementations and the ability to decide precisely when it is running or not; it can more easily listen to its message queue at all times.
(Sorry for the ramble! :)
It seems much more stable. It played three tracks I chose with the file browser, at the end of the last one I let it playing (shuffle enabled) and it played 8 more tracks without hanging, clicks or other strange noises.
It was at the end of the 8th music, I had to go quickly check something and let the iPod playing, when I came back, it was with the screen turned off, no sound on the speakers and when I pressed the scrollwheel the screen didn't turn on, so in conclusion the player was hung again. I did a Menu+Select reset and it's playing again. Now let's see for how much time.
All the tracks the iPod has been playing are OGG tracks encoded at an average of 160 kbps and an average duration of 4 minutes.
Anyways, the playback seems more stable. I needed to change of language and Rockbox usually pauses playback for 0,5s while changing language (it also happens sometimes on the themes), this time it paused and sound didn't come on until I went to the WPS. There, it acted like it had been muted: playback had paused at aprox. 0:58s and it resumed at 1:05s. I changed of language again, and this time it kept playing after the short (0,5s) interruption.
In conclusion:
Your improvements seem to have a good effect. So I think your plan is the way to follow in order to have a more consistent code and stable playback :)
For now, Rockbox is able to play at least 8 tracks without hanging which is a pretty good mark, taking into account that before it had times that I could only listen to a minute and half of audio.
Program received signal SIGSEGV, Segmentation fault.
0x0044aa54 in pcmbuf_monitor_track_change (monitor=false) at /home/Steve/rockbox/apps/pcmbuf.c:496
496 read_end_chunk->end_of_track = monitor;
(gdb) print read_end_chunk
$1 = (struct chunkdesc *) 0x0
(gdb) print last_chunksize
$2 = 34304
(gdb) bt
#0 0x0044aa54 in pcmbuf_monitor_track_change (monitor=false) at /home/Steve/rockbox/apps/pcmbuf.c:496
#1 0x0044ab1d in pcmbuf_start_track_change (auto_skip=false) at /home/Steve/rockbox/apps/pcmbuf.c:548
#2 0x0044e6e5 in audio_check_new_track () at /home/Steve/rockbox/apps/playback.c:1727
#3 0x0044ef35 in audio_thread () at /home/Steve/rockbox/apps/playback.c:2199
#4 0x0047149c in start_thread () at /home/Steve/rockbox/firmware/target/hosted/thread-win32.c:53
#5 0x7c80b729 in KERNEL32!GetModuleFileNameA () from /cygdrive/c/WINDOWS/system32/kernel32.dll
#6 0x7c82ffb1 in KERNEL32!ConvertThreadToFiber () from /cygdrive/c/WINDOWS/system32/kernel32.dll
Cannot access memory at address 0xbaadf011
In my opinion we should switch off ATA_DMA on Trunk and for a 3.8.1 until the underlying issue is solved.
http://forums.rockbox.org/index.php?topic=14226.15
"Thirdly, there is a very serious silicon bug in the cache on the PP5003. If you ever flush the cache (or a region of it), it can get into a state where the wrong cache line gets written randomly somewhere in external memory. And, because the two CPUs on the 5003 are not cache-coherent, you end up flushing the cache a lot if you try and run stuff on both CPUs simultaneously. (Which you need to do, because the two clocks are locked together, so if you've got 90MHz worth of work to do, it uses much less power to run both CPUs at 45MHz than it does to do it all on one CPU and have the other doing 90MHz worth of nothing.) There is a workaround, which is only available because (bizarrely but fortunately) the 5003 memory-maps the entire internal logic of the cache controller, so you can go and rewrite all the tags and dirty bits to point to somewhere innocuous. Do not issue a cache flush, on either CPU, without following it up (running from IRAM) with the workaround sequence: arbitrary code or data will be overwritten, and your program will randomly crash. (Like, *cough*, the Karma 1.0 firmware did until we fixed it in 1.2.) This workaround is in the application code, not Ecos itself, and so does not appear in the ecos-last-rhepl diff."
I also remember reading from the same developer that the PP5003 is internally more a PP5020 than a PP5002. Hope that helps.
FS#11830(http://www.rockbox.org/tracker/task/11830?getfile=23052) not committed - only v06? Guessing v07 would be harmless now that ATA DMA is disabled ?rockbox-28525.zip rockbox-28953.zip rockbox-29248.zip
rockbox-28700.zip rockbox-29056.zip rockbox-r28435.zip
rockbox-28885.zip rockbox-29106.zip rockbox-r29510.zip
those are the builds im starting with, if anyone has better suggestions i can build tomorow while at work. hopefully we can track it down to at least a few dozen changes instead of ~1100
FS#11764). So, there seems to be evidence that 3.7 shows the error as well. InFS#11775it is stated that 3.6 does not show this effect. I also had local software builds which definately showed this lockup and noise issue, but they did only once. I never were able to reproduce...Anyway, the cache has seemed reliable with the likes of mpegplayer, fft and DMA audio playback, and loading plugins in general which very much rely on the cache coherency routines in places. I had a tester plugin that would ping-pong stuff between cores that ran for extended periods without trouble.
I started having very similar issues when I went from 3.7.1 to 3.8 Data Aborts, Splash screen hung at "Scanning" etc. Once I CTR+Menu enough times to get it playing, it would put out faint clicking sounds or no sound at all (out of the headphone jack.) Sometime the playback would stop/start irratically as well. I have treid the 3-11-2011 daily build as well; same results. I have an 80GB Video.
In my case, using the player on battery with headphones seems to work normally, although I generally dont use my player that way. I own it specifically to use with the OEM Mini (BMW) iPod interface. Previous verisons have had varying levels of support for the OEM control functions, but I could always at least hear the sound and the car would keep the player charged...