- Status Closed
- Percent Complete
- Task Type Bugs
- Category Music playback
- Assigned To No-one
- Operating System iPod Nano 2G
- Severity High
- Priority Very Low
- Reported Version Release 3.8.1
- Due in Version Undecided
-
Due Date
Undecided
- Votes
- Private
FS#12093 - Playback hanging after codec/playback rework
I experienced several hangups during playback in the middle of tracks. Those hangups appeared nearly each 1h of playback. Seek fwd/rwd or play/pause does not solve the issue. Skipping forward lets the next track play normally. The track that formerly hung up can be played normally after a skip.
Player: iPod nano 2G
Tracks: mpc / mp3
Hangups occured with mpc only, but that might just be an effect of my music collection which mostly consists of mpc.
Thread states:
codec *R 16 16 11% (⇒ always *R, not alternating *R/T)
audio T 15 15 57%
buffering T 15 15 25%
I will attach the .config file later.
Closed by MikeS
2011-05-09 21:20
Reason for closing: Fixed
Additional comments about closing: Warning: Undefined array key "typography" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 371 Warning: Undefined array key "camelcase" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 407
2011-05-09 21:20
Reason for closing: Fixed
Additional comments about closing: Warning: Undefined array key "typography" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 371 Warning: Undefined array key "camelcase" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 407
Reporter says the last patch provided
fixes the problem.
Loading...
Available keyboard shortcuts
- Alt + ⇧ Shift + l Login Dialog / Logout
- Alt + ⇧ Shift + a Add new task
- Alt + ⇧ Shift + m My searches
- Alt + ⇧ Shift + t focus taskid search
Tasklist
- o open selected task
- j move cursor down
- k move cursor up
Task Details
- n Next task
- p Previous task
- Alt + ⇧ Shift + e ↵ Enter Edit this task
- Alt + ⇧ Shift + w watch task
- Alt + ⇧ Shift + y Close Task
Task Editing
- Alt + ⇧ Shift + s save task
Then, I guess you're the guy to ask if anything is potentially wrong in the MPC codec because of the change?
Anyway, I thought about it and it might be a good idea to keep moving nonstop (if you hadn't done that when bringing the backlight up) on the scrollwheel just to be absolutely sure about the codec state. If it really is always *R, the thread is trapped in a loop somewhere of some type that is external to playback itself (not even buffering loops waiting for a handle's data should produce that since they sleep) but obviously yields. Perhaps I forgot about something too. :)
You said it can be played normally after a skip. Is it a reproducable pattern of hang in middle→skip→skip back to track→plays normally? Is the track completely buffered?
I'll probably have more questions too.
Another thing that comes to mind is stack space. Perhaps some silent overflow now exists. How tight is MPC on the codec stack?
Here is the .config file.
As far as I can tell from what I have seen in the buffering screen the playback was stuck when playing the last buffered track. Is it possible that this hangup occurs when the next tracks should be buffered?
Happened again on my iPod nano 2G. If I move the finger on the scrollwheel while I am in the "OS stack" screen I can see that "codec" very shortly switches to "S", but most of the time shows "*R".
OS stack debug screen:
S 16 16 10% scroll
B 18 18 33% usb
T/*R 16 16 17% backlight → *R only very shortly when scrolling the wheel
S 18 18 26% power
S 16 16 30% nand
B 20 20 48% dircache
T 20 20 8% tagcache
T 20 20 34% playlist cachectrl
*R/S 16 16 11% codec → S only very shortly when scrolling the wheel
T 15 15 57% audio
B 16 16 12% voice
T 15 15 25% buffering
buffering thread debug screen:
pcm 0/529.200
alloc 29.313.567/29.313.568
real 76.019/29.313.568
usefl 54.187/29.313.568
Doesn't this again show that the hangup might have happened when trying to buffer?
Regarding your other questions I can say that "hang in middle"→skip to next track→skip back to track→"plays normally" worked twice for me. I am pretty sure that when skipping to next track re-buffering took place. "hang in middle"→"seek fwd"→"seek rwd" did not get the device back to playback state, the time position and the bar were updated properly though. "hang in middle"→play/pause did not work too, the WPS kept displaying the "Play"-Icon. So, I guess the state of the playback engine did not change.
Edit: FWIW, my iPod Video did not lock up so far (using r29792).
Thank you for obtaining these details. Very brief "S" along with responsiveness to the audio thread's quit request indicates strongly that it's stuck in prep_bufdata in buffering.c. However, some points to corruption of the playback state if seeking won't recover it and play/pause won't change the engine state as seen from the WPS (see how simple audio_on_pause() really is and how could that not work?).
ETA:
To answer your question, yes, rebuffering seems quite certain. I'm not sure why. A queue overflow is possible and must not happen. alloc: full, real: low and useful: low looks interesting though.
Handle count, track count and watermark in the buffering screen might have been useful too.
To me right now from that set of quantities and a fair amount of consideration, it's looking like a case of a reset audio handle (to force rebuffer for the codec's data to a new position) + failure to shrink the buffer in order to get metadata out of the way so new data can be added.
Good to see I could help you with these details. I will try to give some more details about handles, track count, etc. next time – I didn't use my iPod since the last hangup.
Give this one a try and see what happens.
Lockup after 3h with the patch from above:
buffering thread debug screen:
pcm 0/529.200
alloc 29.406.639/29.406.640
real 62.115/29.406.640
usefl 22.147/29.406.640
data_rem 623.147
track count 6
handlecount 16
pcmbufdesc 0/21
watermark 21.500
OS stack debug screen:
*R 16 16 53% main
S 16 16 10% scroll
B 18 18 4% usb
T/*R 16 16 17% backlight → *R only very shortly when scrolling the wheel
S 18 18 26% power
S 16 16 36% nand
B 20 20 48% dircache
T 20 20 8% tagcache
T 20 20 34% playlist cachectrl
*R/S 16 16 11% codec → S only very shortly when scrolling the wheel
T 15 15 57% audio
B 16 16 12% voice
T 15 15 25% buffering
Just happened again:
buffering thread debug screen:
pcm 0/529.200
alloc 29.406.639/29.406.640
real 25.703.947/29.406.640
usefl 35.227/29.406.640
data_rem 6.103.683
track count 4
handlecount 8
pcmbufdesc 0/21
watermark 21.125
It seems like you can force the issue when using the buffering thread debug screen and skipping until there is only a minor buffer left (usefl ) and then the playback continue until it would need to rebuffer.
Edit 1: I could reproduce this issue 4-5 time today. The buffering screen always looks very similar: pcm 0, alloc and real nearly filled, usefl nearly empty, pcmbufdesc 0.
Edit 2: It observed that the watermark is set to 0 during rebuffering. Can this cause any unwanted condition in the pcm buffer handling (like "Decode when PCM buffer is below watermark. watermark is =0, we are fine.")?
It occurred to me that the codec couldn't be only stuck waiting for data, it must be exiting the buffering call then making no further progress to see *R, since waiting for data only calls sleep() but outside that calls yield().
In both cases, the usefl is above the watermark, but did it just add a small amount of data or quit just before scheduled rebuffering? (Rhetorical atm; I know you might not know that since you'd have to see it at the right time)
The numbers in the buffering screen don't change but are stuck at those values?
When you said, "…then the playback continue until it would need to rebuffer." Does that mean it quits again at the next rebuffer?
One moment! You're saying that you have recipe? Skip enough to force rebuffer→ wait for normal rebuffer after that = sometimes stalls. ??
Now it locks or you mean just stalls without pcm output? I wouldn't call any state that can be exited normally a "lockup".
With this one, it also sends BUFFER_EVENT_REBUFFER to clear old tracks when there isn't enough space even if the handle is the last one where before if the last handle was the one being rebuffered, no event was sent. To not clear it could lead to a buffer jam, unable to shrink the contents, if previous tracks exist that aren't moveable. Please reports!
On my fuze I can reproduce this pretty easily (meaning pretty much at will) with a mixed bunch of test files, the one at the key point being flac. It nearly drains the buffer, then buffering for that last one on the buffer stops prematurely. usefl is quite close to but greater than the watermark.
Next thing I'll do is to check for queue overflows which are really a very bad thing here should they happen. Every message is precious.
ETA:
The premature stop is what is expected if the buffering didn't yet have enough room to complete and collides with ridx or the next handle. Buffer above the watermark won't trigger a proper clear of old handles yet the request cannot complete because of those handles. It was probably due for a refill but a buffer seek caused a handle reset instead of sending the low buffer event, which after that refill for the handle leaves it too full to rebuffer, then it never gets low enough again to rebuffer inside playback. The codec state being anything other than S has me wondering about there being just one more ingredient.
Haha…well a case of the dumb overcame me and I don't want that responsible for misleading anyone. It's possible to see *R if it timed out in the sleep() call but hadn't yet been scheduled. derp.
Yep, splashed it out nice and cheap and it really looks like a buffer jam with the codec thread spinning around waiting for data that can't be loaded. I'm checking all this with plain SVN, FYI. The previous patch might cure it but I'm not yet sure it should be fixed exactly that way.
Sounds like you have a clue what is going on. I will wait for a patch to test or any support request from your side.
Sure do!
I think the real cause of the regression is the removal of sending BUFFER_EVENT_BUFFER_LOW when processing Q_START_FILL. There is no seek-related problem, at least in how my testing proceeded, since I checked if it actually went through those cases and it didn't. The position must be in range but not the extent of the data needed. That patch won't quite take care of things universally.
Really, those low buffer events tend to be a nuissance and I don't want it to fire them (risking queue overflows) just to make sure we beat it hard enough.
Try this one out. I wasn't able to reproduce any more on the Fuze (or was getting very unlucky
.
This really isn't quite proper but should tell if it's the right thing overall. It also removes buf_request_buffer_handle as a buffering API since that function can be considered a kludge because simply asking for data should be what requests buffering automatically if needed. Also ups the plugin API version.
I tested a build with this patch today and experienced no issues anymore. My nano ran for ~6h. I will start an overnight run in a few hours.
Results of the overnight endurance test ⇒ works! :o)
I would like to have this one submitted to have stable playback in svn. You can then further work on smarter solution. What do you think?
Ok, guess I'll just go with this for now (in a few minutes) and think about a nicer, more universal implementation for later. My time is limited at the moment anyway.