FS#12310 - Crash when inserting USB while playback (since r30097)

Attached to Project: Rockbox
Opened by Andree Buschmann (Buschel) - Monday, 03 October 2011, 11:09 GMT
Last edited by Michael Sparmann (TheSeven) - Sunday, 06 November 2011, 00:57 GMT
Task Type Bugs
Category Operating System/Drivers
Status Closed
Assigned To No-one
Operating System iPod Nano 2G
Severity High
Priority Urgent
Reported Version Daily build (which?)
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No


When connecting USB during playback, the device either crashes or locks up. This does not happen when connecting USB after a clean boot.

Closed by  Michael Sparmann (TheSeven)
Sunday, 06 November 2011, 00:57 GMT
Reason for closing:  Fixed
Additional comments about closing:  Fixed in r30906
Comment by Andree Buschmann (Buschel) - Monday, 03 October 2011, 11:41 GMT
Just in case my WPS has an effect on this. (72.4 KiB)
Comment by Andree Buschmann (Buschel) - Sunday, 16 October 2011, 14:13 GMT
Still happening with clean svn r30761. I attached the config.cfg just in case this might help.

Edit: This does not happen with my iPod Video.
Comment by Fred Bauer (freddyb) - Friday, 21 October 2011, 06:13 GMT
Does this help any?

edit - (Now in SVN)
Comment by Andree Buschmann (Buschel) - Friday, 21 October 2011, 08:10 GMT
Sorry to say it does not fix my problem.
Comment by Andree Buschmann (Buschel) - Friday, 21 October 2011, 10:02 GMT
Finally tracked my issue down to the change that introduced it: r30097 (pcm mixer).

Some more detail: USB connects and works fine when plugged while playback is not active (e.g. while in paused state or directly after startup).
Comment by Fred Bauer (freddyb) - Friday, 21 October 2011, 14:17 GMT
In apps/dsp.c the buflib move callback is commented as TODO:
Comment by Michael Sevakis (MikeS) - Friday, 21 October 2011, 14:42 GMT
If I had my way, I'd change the way buflib works and dispense with the callback nonsense altogether. People like to make things harder on themselves around here for reasons I cannot fathom.
Comment by Andree Buschmann (Buschel) - Friday, 21 October 2011, 16:29 GMT
I am sure my USB issue has *nothing* to do with buflib. This issue came in exactly with r30097, which is *way* before buflib. From the change itself I assume it is something connected to the target-specific interrupt stuff, which would explain why my iPod Video is not affected.
Comment by Michael Sevakis (MikeS) - Saturday, 22 October 2011, 00:36 GMT
r30097? If that's the case does using keyclick or voice and hitting a key before connecting USB make the clean boot NOT work?

But, in any case I still don't like buflib. ;-)

(Really, it's making me not want to work on anything at all.)

Comment by Andree Buschmann (Buschel) - Saturday, 22 October 2011, 06:51 GMT
Just tested with keyclick. What I did exactly:

1) Boot
2) Activate keyclick
3) Navigate in the menu (keyclicks were audible)
4) Connect USB

Result: works :/

Mike, I am helpless here. Fact is that r30096 connects USB fine when inserted during playback, r30097 locks up.
Comment by Michael Sevakis (MikeS) - Saturday, 22 October 2011, 12:39 GMT
After any PCM activity, silence plays from the private, static mixer buffers for 3 seconds after the last channel becomes inactive. Try to plug in USB fewer than 3 seconds after the last keyclick. If you already did, then it's not the mixer itself.
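
The 3-second window described above can be sketched roughly as follows. This is only an illustration of the timing rule, not the mixer's actual code: `struct mix_state`, `pcm_still_running` and `MIX_IDLE_TIMEOUT` are hypothetical names, and `HZ` is assumed to be 100 ticks per second.

```c
#include <stdbool.h>

#define HZ 100                     /* ticks per second; assumed for this sketch */
#define MIX_IDLE_TIMEOUT (3 * HZ)  /* 3 seconds of silence after the last channel */

/* Hypothetical, simplified view of the mixer state. */
struct mix_state {
    int  active_channels;   /* channels currently producing audio */
    long last_active_tick;  /* tick at which the last channel went inactive */
};

/* True while the PCM hardware is still being fed, either with real audio
 * or with silence during the idle window. */
static bool pcm_still_running(const struct mix_state *m, long now)
{
    if (m->active_channels > 0)
        return true;
    return (now - m->last_active_tick) < MIX_IDLE_TIMEOUT;
}
```

Under these assumptions, plugging in USB while `pcm_still_running()` would return true exercises the same "PCM active" path as plugging in during playback.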

Edit: Delete confused comment about wrong device.
Comment by Andree Buschmann (Buschel) - Sunday, 23 October 2011, 07:38 GMT
Mike, if I insert USB very shortly after the last audible click the nano2G locks up. I retested with my former use case (insert USB several seconds after last click) -> working fine.

This means the mixer is in charge, right?
Comment by Michael Sevakis (MikeS) - Sunday, 23 October 2011, 12:51 GMT
I would say that means the PCM driver and the USB driver for that particular hardware are currently incompatible. The mixer itself is only using private resources and no buffers shared with USB. Try removing clean_dcache in INT_DMA since I assume that double buffer is actually in IRAM as it says. Why does that driver have a double buffer inside and why isn't it using range operations instead?
Comment by Andree Buschmann (Buschel) - Sunday, 23 October 2011, 13:42 GMT
Removing clean_dcache() from INT_DMA() in pcm-s5l8700.c does not help.
Comment by Michael Sevakis (MikeS) - Sunday, 23 October 2011, 20:45 GMT
I'm sort of clueless then. I don't own that piece of hardware and cannot start narrowing down why it has a problem with PCM running in the background while USB is connected while others apparently do not.
Comment by Andree Buschmann (Buschel) - Monday, 24 October 2011, 04:51 GMT
I'll try to point TheSeven to this.
Comment by Michael Sparmann (TheSeven) - Monday, 24 October 2011, 06:58 GMT
Going to look into this later today.
The internal double buffering / buffer splitting is necessary because a) the pcmbuf is too slow to pass new data in time if the DMA controller runs dry and b) the pcmbuf starts overwriting the data that's being played currently if we request a block in advance. If we had a way to keep pcmbuf blocks locked after requesting the next one, I'd happily get rid of this. Same is true for audio sources like mpegplayer btw.
IRAM on nano2g/classic has dcache enabled (it's NOT that fast!), so the clean_dcache call is necessary, and IIRC the ARM940T core on that target doesn't support range operations.
Comment by Michael Sevakis (MikeS) - Monday, 24 October 2011, 09:58 GMT
That sounds quite odd. The pcmbuf code was hardly even as complex as the DMA code and should finish in sub-microsecond time (equally true for mpegplayer), and now double buffering is done in the mixer. How many samples are there in the hardware's IIS FIFO? Perhaps you have an interrupt latency problem. I had to take great care on i.MX31 in that regard or had similar problems that easily cause channel swaps (only 4 stereo pairs of samples in FIFO).
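
To put a number on the latency concern, here is a small sketch of how long a FIFO of a given depth lasts before it runs dry. The function name `fifo_drain_us` is hypothetical, and the 44.1 kHz rate used in the note below is an assumption, not a figure from this thread.

```c
#include <stdint.h>

/* Microseconds until a FIFO holding `frames` stereo sample pairs runs dry
 * when drained at `rate_hz` frames per second (rounded down). */
static uint32_t fifo_drain_us(uint32_t frames, uint32_t rate_hz)
{
    return (uint32_t)(((uint64_t)frames * 1000000u) / rate_hz);
}
```

With 4 stereo pairs at an assumed 44100 Hz, `fifo_drain_us(4, 44100)` gives 90, i.e. the refill interrupt has to be serviced within roughly 90 microseconds.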
Comment by Michael Sparmann (TheSeven) - Monday, 24 October 2011, 16:47 GMT
We have no accurate datasheet, but it behaves like it has no buffer at all, i.e. plays back directly what it receives from the DMAC. So the DMAC will only trigger the interrupt when we've already run out of data. I see no way around using auto-reloading DMA on these devices.
Comment by Michael Sevakis (MikeS) - Tuesday, 25 October 2011, 09:51 GMT
Aha, compulsory! Basically you need the next buffer set up already so it can autostart it without delay, then you get an interrupt to send the future buffer and repeat the cycle? From the construction of the driver, it looks like it goes empty every cycle, which could still cause dropouts or artifacts if in fact no hardware FIFO exists. It's not truly parallel double-buffered like the mixer.

Who knows, there could be a FIFO watermark setting somewhere. It's not uncommon.

True that it seems unlikely to be the problem with this particular bug.
Comment by Michael Sparmann (TheSeven) - Tuesday, 25 October 2011, 16:59 GMT
What we're doing at the moment is splitting every PCM chunk into two parts: a large one and a small one. The small one needs to be big enough to keep the DMAC happy, and as small as possible to reduce the necessary double buffer size and copying overhead.
When we get a PCM chunk, we send the large part to the DMAC immediately, copy the small part to the double buffer, and enqueue another DMA transfer for the small part. Once the DMAC has eaten the large part, we get an interrupt, request the next PCM chunk, and while the small chunk is still playing, enqueue the next large part, but don't do anything with the small part yet. Once the DMAC has eaten the previous small part, we copy the new small part around and enqueue it, and so on.
I think that's pretty much dropout-proof, as long as the pcmbuf doesn't get awfully slow.
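
The split scheme described above can be sketched as a small host-side simulation. This is not the actual pcm-s5l8700.c driver code: `TAIL_BYTES`, `struct pcm_sim`, `fetch_chunk` and `pcm_sim_run` are hypothetical names, the pcmbuf is replaced by one flat byte stream, and every chunk is assumed to be longer than `TAIL_BYTES`.

```c
#include <stddef.h>
#include <string.h>

#define TAIL_BYTES 4  /* hypothetical size of the small part */

struct xfer { const unsigned char *addr; size_t len; };

struct pcm_sim {
    const unsigned char *src;       /* stand-in for the pcmbuf */
    size_t src_len, src_pos;
    size_t chunk_len;               /* size of each chunk handed out */
    struct xfer active, queued;     /* queued auto-starts when active ends */
    struct xfer next_small;         /* small part fetched but not yet copied */
    unsigned char tail[TAIL_BYTES]; /* driver-side double buffer */
    int active_is_large;
};

/* Hand out the next chunk, split into a large head and a small tail. */
static int fetch_chunk(struct pcm_sim *s, struct xfer *large, struct xfer *small)
{
    size_t left = s->src_len - s->src_pos;
    size_t len = left < s->chunk_len ? left : s->chunk_len;
    if (len == 0)
        return 0;
    small->len = len < TAIL_BYTES ? len : TAIL_BYTES;
    large->len = len - small->len;
    large->addr = s->src + s->src_pos;
    small->addr = large->addr + large->len;
    s->src_pos += len;
    return 1;
}

/* "Play" the whole stream into out[]; returns the number of bytes played. */
static size_t pcm_sim_run(struct pcm_sim *s, unsigned char *out)
{
    struct xfer large, small;
    size_t n = 0;

    if (!fetch_chunk(s, &large, &small))
        return 0;
    memcpy(s->tail, small.addr, small.len);  /* copy small part aside */
    s->active = large;                       /* large part plays in place */
    s->queued.addr = s->tail;
    s->queued.len = small.len;
    s->active_is_large = 1;

    while (s->active.len) {
        int finished_large = s->active_is_large;

        /* The DMAC consumes the active transfer. */
        memcpy(out + n, s->active.addr, s->active.len);
        n += s->active.len;

        /* Completion interrupt: the queued transfer has auto-started. */
        s->active = s->queued;
        s->queued.len = 0;
        s->active_is_large = !finished_large;

        if (finished_large) {
            /* Small part is playing from tail[]: fetch the next chunk and
             * queue only its large part; tail[] must not be touched yet. */
            if (fetch_chunk(s, &large, &s->next_small))
                s->queued = large;
            else
                s->next_small.len = 0;
        } else {
            /* Previous small part is done, so tail[] is free again. */
            s->queued.len = s->next_small.len;
            if (s->next_small.len) {
                memcpy(s->tail, s->next_small.addr, s->next_small.len);
                s->queued.addr = s->tail;
            }
        }
    }
    return n;
}
```

The point of the sketch is the ordering: the chunk's memory is only needed while its large part plays, and tail[] is only overwritten after its previous contents have finished playing, so the source is free to reuse a chunk as soon as the next one has been requested.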
Comment by Michael Sparmann (TheSeven) - Friday, 28 October 2011, 20:52 GMT
Hm, I can't find any hardware relationship between USB and PCM. They use different DMA controllers and different IRQs, and we don't do IRQ stacking. My suspicion is that terminating audio playback due to the USB connection event is doing something odd, or that there's memory corruption involved.
Comment by Michael Sevakis (MikeS) - Saturday, 29 October 2011, 00:02 GMT
You could force a PCM stop early by kludging in a mixer_reset call somewhere before USB gets active in order to see if it clears the issue (not advocated as an _actual_ solution, mind you). That will force-stop the actual PCM activity and stop all channels. But, that really does little more than just waiting 3 seconds then plugging.

Other than playing silence for 3 seconds after the last channel activity, all playback happenings should be identical to prior to said revision.

Is there possibly an alignment issue with the buffers in USB and the mixer? I think a cache cleaning conflict was ruled out.
Comment by Michael Sevakis (MikeS) - Saturday, 05 November 2011, 21:39 GMT
Just FYI: Keyclick while USB is connected seems to be fine on the targets I own. I suppose it should be expected to work.

So, still no idea huh?