Rockbox

  • Status Closed
  • Percent Complete
    100%
  • Task Type Bugs
  • Category Music playback
  • Assigned To No-one
  • Operating System iPod Nano 2G
  • Severity High
  • Priority Very Low
  • Reported Version Release 3.8.1
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by Buschel - 2011-05-03
Last edited by MikeS - 2011-05-09

FS#12093 - Playback hanging after codec/playback rework

I experienced several hangups during playback in the middle of tracks. Those hangups appeared nearly each 1h of playback. Seek fwd/rwd or play/pause does not solve the issue. Skipping forward lets the next track play normally. The track that formerly hung up can be played normally after a skip.

Player: iPod nano 2G
Tracks: mpc / mp3

Hangups occured with mpc only, but that might just be an effect of my music collection which mostly consists of mpc.

Thread states:
codec *R 16 16 11% (⇒ always *R, not alternating *R/T)
audio T 15 15 57%
buffering T 15 15 25%

I will attach the .config file later.

Closed by  MikeS
2011-05-09 21:20
Reason for closing:  Fixed
Additional comments about closing:  

Reporter says the last patch provided fixes the problem.

MikeS commented on 2011-05-03 06:17

Then, I guess you're the guy to ask if anything is potentially wrong in the MPC codec because of the change? :-)

Anyway, I thought about it and it might be a good idea to keep moving nonstop (if you hadn't done that when bringing the backlight up) on the scrollwheel just to be absolutely sure about the codec state. If it really is always *R, the thread is trapped in a loop somewhere of some type that is external to playback itself (not even buffering loops waiting for a handle's data should produce that since they sleep) but obviously yields. Perhaps I forgot about something too. :)

You said it can be played normally after a skip. Is it a reproducable pattern of hang in middle→skip→skip back to track→plays normally? Is the track completely buffered?

I'll probably have more questions too.

MikeS commented on 2011-05-03 06:24

Another thing that comes to mind is stack space. Perhaps some silent overflow now exists. How tight is MPC on the codec stack?

Here is the .config file.

As far as I can tell from what I have seen in the buffering screen the playback was stuck when playing the last buffered track. Is it possible that this hangup occurs when the next tracks should be buffered?

Happened again on my iPod nano 2G. If I move the finger on the scrollwheel while I am in the "OS stack" screen I can see that "codec" very shortly switches to "S", but most of the time shows "*R".

OS stack debug screen:

S 16 16 10% scroll
B 18 18 33% usb
T/*R 16 16 17% backlight → *R only very shortly when scrolling the wheel
S 18 18 26% power
S 16 16 30% nand
B 20 20 48% dircache
T 20 20 8% tagcache
T 20 20 34% playlist cachectrl
*R/S 16 16 11% codec → S only very shortly when scrolling the wheel
T 15 15 57% audio
B 16 16 12% voice
T 15 15 25% buffering

buffering thread debug screen:

pcm 0/529.200
alloc 29.313.567/29.313.568
real 76.019/29.313.568
usefl 54.187/29.313.568

Doesn't this again show that the hangup might have happened when trying to buffer?

Regarding your other questions I can say that "hang in middle"→skip to next track→skip back to track→"plays normally" worked twice for me. I am pretty sure that when skipping to next track re-buffering took place. "hang in middle"→"seek fwd"→"seek rwd" did not get the device back to playback state, the time position and the bar were updated properly though. "hang in middle"→play/pause did not work too, the WPS kept displaying the "Play"-Icon. So, I guess the state of the playback engine did not change.

Edit: FWIW, my iPod Video did not lock up so far (using r29792).

MikeS commented on 2011-05-05 02:23

Thank you for obtaining these details. Very brief "S" along with responsiveness to the audio thread's quit request indicates strongly that it's stuck in prep_bufdata in buffering.c. However, some points to corruption of the playback state if seeking won't recover it and play/pause won't change the engine state as seen from the WPS (see how simple audio_on_pause() really is and how could that not work?).

ETA:
To answer your question, yes, rebuffering seems quite certain. I'm not sure why. A queue overflow is possible and must not happen. alloc: full, real: low and useful: low looks interesting though.
Handle count, track count and watermark in the buffering screen might have been useful too.

MikeS commented on 2011-05-05 05:37

To me right now from that set of quantities and a fair amount of consideration, it's looking like a case of a reset audio handle (to force rebuffer for the codec's data to a new position) + failure to shrink the buffer in order to get metadata out of the way so new data can be added.

Good to see I could help you with these details. I will try to give some more details about handles, track count, etc. next time – I didn't use my iPod since the last hangup.

MikeS commented on 2011-05-06 01:40

Give this one a try and see what happens.

Lockup after 3h with the patch from above:

buffering thread debug screen:

pcm 0/529.200
alloc 29.406.639/29.406.640
real 62.115/29.406.640
usefl 22.147/29.406.640
data_rem 623.147
track count 6
handlecount 16
pcmbufdesc 0/21
watermark 21.500

OS stack debug screen:

*R 16 16 53% main
S 16 16 10% scroll
B 18 18 4% usb
T/*R 16 16 17% backlight → *R only very shortly when scrolling the wheel
S 18 18 26% power
S 16 16 36% nand
B 20 20 48% dircache
T 20 20 8% tagcache
T 20 20 34% playlist cachectrl
*R/S 16 16 11% codec → S only very shortly when scrolling the wheel
T 15 15 57% audio
B 16 16 12% voice
T 15 15 25% buffering

Just happened again:

buffering thread debug screen:

pcm 0/529.200
alloc 29.406.639/29.406.640
real 25.703.947/29.406.640
usefl 35.227/29.406.640
data_rem 6.103.683
track count 4
handlecount 8
pcmbufdesc 0/21
watermark 21.125

It seems like you can force the issue when using the buffering thread debug screen and skipping until there is only a minor buffer left (usefl ) and then the playback continue until it would need to rebuffer.

Edit 1: I could reproduce this issue 4-5 time today. The buffering screen always looks very similar: pcm 0, alloc and real nearly filled, usefl nearly empty, pcmbufdesc 0.

Edit 2: It observed that the watermark is set to 0 during rebuffering. Can this cause any unwanted condition in the pcm buffer handling (like "Decode when PCM buffer is below watermark. watermark is =0, we are fine.")?

MikeS commented on 2011-05-06 16:14

It occurred to me that the codec couldn't be only stuck waiting for data, it must be exiting the buffering call then making no further progress to see *R, since waiting for data only calls sleep() but outside that calls yield().

In both cases, the usefl is above the watermark, but did it just add a small amount of data or quit just before scheduled rebuffering? (Rhetorical atm; I know you might not know that since you'd have to see it at the right time)

The numbers in the buffering screen don't change but are stuck at those values?

When you said, "…then the playback continue until it would need to rebuffer." Does that mean it quits again at the next rebuffer?

but did it just add a small amount of data or quit just before scheduled rebuffering?
usefl indeed sometimes changed (going up by several 100 KB) before rebuffering – from few 10 KB to some 100 KB's.
The numbers in the buffering screen don't change but are stuck at those values?
If the playback hangs the numbers in buffering screen do not change at all.
Does that mean it quits again at the next rebuffer?
No. Bad english :/ I meant to say that you just skip until usefl is so small, that any further skip would immediately result in rebuffering. Then you just wait until the playback engine wants to rebuffer. In some cases it will lock up, in most cases it will rebuffer normally. I just have the impression that you do not need to wait to play the whole buffered data, you can just skip and wait for rebuffering. This saves a lot of time for reproduction.
MikeS commented on 2011-05-07 00:00

One moment! You're saying that you have recipe? Skip enough to force rebuffer→ wait for normal rebuffer after that = sometimes stalls. ??

Now it locks or you mean just stalls without pcm output? I wouldn't call any state that can be exited normally a "lockup".

MikeS commented on 2011-05-07 02:20

With this one, it also sends BUFFER_EVENT_REBUFFER to clear old tracks when there isn't enough space even if the handle is the last one where before if the last handle was the one being rebuffered, no event was sent. To not clear it could lead to a buffer jam, unable to shrink the contents, if previous tracks exist that aren't moveable. Please reports!

MikeS commented on 2011-05-07 05:40

On my fuze I can reproduce this pretty easily (meaning pretty much at will) with a mixed bunch of test files, the one at the key point being flac. It nearly drains the buffer, then buffering for that last one on the buffer stops prematurely. usefl is quite close to but greater than the watermark.

Next thing I'll do is to check for queue overflows which are really a very bad thing here should they happen. Every message is precious.

ETA:

The premature stop is what is expected if the buffering didn't yet have enough room to complete and collides with ridx or the next handle. Buffer above the watermark won't trigger a proper clear of old handles yet the request cannot complete because of those handles. It was probably due for a refill but a buffer seek caused a handle reset instead of sending the low buffer event, which after that refill for the handle leaves it too full to rebuffer, then it never gets low enough again to rebuffer inside playback. The codec state being anything other than S has me wondering about there being just one more ingredient.

MikeS commented on 2011-05-07 06:21

Haha…well a case of the dumb overcame me and I don't want that responsible for misleading anyone. It's possible to see *R if it timed out in the sleep() call but hadn't yet been scheduled. derp.

MikeS commented on 2011-05-07 06:41

Yep, splashed it out nice and cheap and it really looks like a buffer jam with the codec thread spinning around waiting for data that can't be loaded. I'm checking all this with plain SVN, FYI. The previous patch might cure it but I'm not yet sure it should be fixed exactly that way.

Sounds like you have a clue what is going on. I will wait for a patch to test or any support request from your side.

MikeS commented on 2011-05-07 07:25

Sure do! :-) I think the real cause of the regression is the removal of sending BUFFER_EVENT_BUFFER_LOW when processing Q_START_FILL. There is no seek-related problem, at least in how my testing proceeded, since I checked if it actually went through those cases and it didn't. The position must be in range but not the extent of the data needed. That patch won't quite take care of things universally.

Really, those low buffer events tend to be a nuissance and I don't want it to fire them (risking queue overflows) just to make sure we beat it hard enough.

MikeS commented on 2011-05-08 07:15

Try this one out. I wasn't able to reproduce any more on the Fuze (or was getting very unlucky :-).

This really isn't quite proper but should tell if it's the right thing overall. It also removes buf_request_buffer_handle as a buffering API since that function can be considered a kludge because simply asking for data should be what requests buffering automatically if needed. Also ups the plugin API version.

I tested a build with this patch today and experienced no issues anymore. My nano ran for ~6h. I will start an overnight run in a few hours.

Results of the overnight endurance test ⇒ works! :o)

I would like to have this one submitted to have stable playback in svn. You can then further work on smarter solution. What do you think?

MikeS commented on 2011-05-09 20:31

Ok, guess I'll just go with this for now (in a few minutes) and think about a nicer, more universal implementation for later. My time is limited at the moment anyway.

Loading...

Available keyboard shortcuts

Tasklist

Task Details

Task Editing