FS#12150 - Fully-functional audio mixer

Attached to Project: Rockbox
Opened by Michael Sevakis (MikeS) - Monday, 06 June 2011, 23:35 GMT
Last edited by Michael Sevakis (MikeS) - Thursday, 30 June 2011, 07:20 GMT
Task Type Patches
Category Music playback
Status Closed
Assigned To Michael Sevakis (MikeS)
Operating System SW-codec
Severity Low
Priority Normal
Reported Version Release 3.8.1
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No


Implements a multi-channel mixer with fully-independent playback and volume control on each channel.

Long-standing bugs addressed:
Voice and keyclick do not work while playback is paused -- fixed
Fade on stop/paused does not work on all outputs - fixed

Voice and keyclick are as responsive to user input while playing music as when not, which was not previously the case. Additionally, voicing in the audio menus where pcmbuf is running with low-latency is stable.

PCM interface for each channel is largely identical to current callback mechanism. The main difference is the function call names and the need for a channel ID.

Mixing is done during the mixer's main PCM callback, which calls the callback for all active channels. Double buffering of short mix frames ensures there is enough time to compute the final downmix before hardware FIFO's run out of data. One minor change to PCM drivers is required to implement it so that the next frame may be produced while the current plays.

Silence plays for 3 seconds before PCM is actually stopped after all channels report 'no more' in order to prevent popping for keyclick and voice output. Recording workaround in order to be sure PCM is stopped for half-duplex PCM drivers is a bit messy right now but works.

Playing PCM from plugins while music is playing should be possible so long as their callbacks aren't very demanding. Same goes for the possibility of voicing/click/beep whle recording where full-duplex PCM is implementable.

Yes, as usual it all takes a good bit more RAM to have something nice but it appears a worthwhile trade.
This task depends upon

Closed by  Michael Sevakis (MikeS)
Thursday, 30 June 2011, 07:20 GMT
Reason for closing:  Accepted
Additional comments about closing:  Committed in r30097
Comment by Michael Sevakis (MikeS) - Monday, 06 June 2011, 23:49 GMT
I forgot to mention, I implemented the fade rather cheaply for the moment and it still blocks inside the playback engine and the channel volume is faded on the playback thread....still thinking about how better to go about it. It's short though. Plan to try to optimize mixing code for coldfire, not just armv6. ARMv4, don't know...there isn't a huge savings instruction-wise.
Comment by Steve Bavin (pondlife) - Tuesday, 07 June 2011, 13:43 GMT
Hi Michael,

Good work - been hoping somebody (including myself) would find time for this over the past 4 years or so.

Have tested on a Windows sim, seems to work in general although voice output is very broken up. It sounds like every other buffer is silence, or maybe it's just not keeping up? It makes no difference whether music is playing or not.

(I got a hunk error on patching firmware/target/arm/s5l8702/pcm-s5l8702.c , but that obviously doesn't affect a sim build.)
Comment by Michael Sevakis (MikeS) - Tuesday, 07 June 2011, 14:25 GMT
Possibly because I kept the voice PCM frame count down to two frames unless otherwise something required more (it's running very low latency). Try upping it in voice_thread.c to see what works.
Comment by Steve Bavin (pondlife) - Tuesday, 07 June 2011, 15:25 GMT
13 seems fine, 12 or less is not.
Comment by Michael Sevakis (MikeS) - Tuesday, 07 June 2011, 16:05 GMT
13! That's huge. My own linux sims running in a VM work fine with 2. Some targets might need a few more when stressed but even slow ones do alright with two. I guess that number will vary with build.
Comment by Steve Bavin (pondlife) - Tuesday, 07 June 2011, 16:09 GMT
Indeed. My PC isn't particularly slow either - a 3GHz quad-core thingummy... could there be some yielding missing?
Comment by Steve Bavin (pondlife) - Tuesday, 07 June 2011, 16:11 GMT
I should add this is in Windows XP, no VM involved.
Comment by Michael Sevakis (MikeS) - Tuesday, 07 June 2011, 17:26 GMT
Wow. Maybe there is but I would expect that to affect all builds negatively. Maybe something with the win32 fiber threads? It really should have no issue if a PP5002 doesn't, which is about the worst-case target for doing this (bad cache, slow IRAM, no actual hw DMA used). However, there is breakup if entering a heavy plugin and the "loading" splash is speaking or you keep scrolling heavily to the end of a menu but otherwise all has been well. Will have a look at the w32 threading and check if I see anything that could reduce responsiveness.
Comment by Michael Sevakis (MikeS) - Tuesday, 07 June 2011, 19:04 GMT
I just tried a win32 sim. I see what you mean. It's not just the voice but the keyclick too.
Comment by Steve Bavin (pondlife) - Tuesday, 07 June 2011, 19:17 GMT
Yes, but most noticable with voice - especially talk clips...
Comment by Michael Sevakis (MikeS) - Tuesday, 07 June 2011, 19:51 GMT
It looks to me like the audio buffers provided by the system are large and they get filled in bursts, so "PCM time" runs unevenly on the source side of things while audio playback tends to have enough data to always fill, leading to the bursty playback of channels that have little latency-- it eats all the available frames in a channel then plays silence.

Solution: Need to think how to create the buffers gradually regardless of size. :)
Comment by Michael Sevakis (MikeS) - Wednesday, 08 June 2011, 19:18 GMT
I'm quite near saying to heck with it for any app builds for now since this thing needs steady clocking of its frames and there's no way I'm going to be able to do anything with anything other than with SDL. Also, I get different results with the sdl.dll that I made with mingw vs. the one straight off the sdl website. They are very different sized libs (~1MB/~300KB) and former gives audio buffers of 44112 bytes and the second 8192 on my system but both are still unacceptably useless to use conventionally. Expecting 64K to be filling on android sure won't work and I haven't the facilities to work on it. Maemo? Who knows.

So, for now. Not gonna happen except for native targets where we do things (closer to) right instead of making ridiculous assumptions/lazy garbage interfaces.

Maybe a timer to run the mixer would suffice. Still would like to avoid this blowing-up into stupid on account of someone else's.
Comment by Michael Sevakis (MikeS) - Wednesday, 08 June 2011, 22:18 GMT
Well, I can get nice results if I add delays inside the audio callback to pace things out rather than slurping all the data in a fast and unregulated loop.
Comment by Michael Sevakis (MikeS) - Thursday, 09 June 2011, 04:29 GMT
Give this a whirl. Works for me for Linux and native Windows sims.
Comment by alex wallis (alexwallis646) - Thursday, 09 June 2011, 23:00 GMT
so could this patch actually be run on target?
Comment by Michael Sevakis (MikeS) - Friday, 10 June 2011, 00:19 GMT

If I did it right adding the extra call to the target PCM driver or it's one class of target that I have and know is right because I tested it, then yes, it can be run on target.

As an app or sim, things are being a bit sticky but the sim seems to be doing alright for me; I need Steve to check it out but he's gone off and lost intrest apparently.
Comment by Steve Bavin (pondlife) - Friday, 10 June 2011, 06:00 GMT
Hi Mike, I'm very much still interested but away from home for a couple of days. Will test as soon as I get back.
Comment by Michael Sevakis (MikeS) - Friday, 10 June 2011, 17:55 GMT
Sorry, I was just doing a bit of cynical trolling. :-)

Still, I would very much appreciate testing testing testing (thank you in advance!) with the other app builds too and samsung-based iPods/Meizu (why do those drivers double-buffer at a low level anyway?). Wouldn't mind getting this in for 3.9 to get rid of some ancient limitations and bugs but I guess time is running short.
Comment by Michael Sevakis (MikeS) - Sunday, 12 June 2011, 06:49 GMT
Correct use of ALIGN_CHANNEL on the function parameters (not channel data) when first starting channel and remove code for an impossible case when unpausing (channel in paused state cannot have reached 0 bytes remaining, so no need to care about it).

ETA: Forgot to mention that it takes care of voice dropping under heavier load which I never bothered to address before. Just needed one more voice frame to queue and a higher minimum priority for the voice thread.
Comment by Steve Bavin (pondlife) - Sunday, 12 June 2011, 16:56 GMT
Hi Mike,

22z seems to work ok for me, patched against r29995. (Well, nearly r29995, I had to modify configure to remove the -fvisibility=hidden option because of my old gcc, but that's a bit OT here.)
Comment by Thomas Martitz (kugel.) - Sunday, 12 June 2011, 17:56 GMT
Doesn't work on android/my phone. When resuming playback it just creates a constant noise which remains even after stopping playback (killing the app makes the noise go away). The same happens without the changes to pcm-android.c.
Comment by Michael Sevakis (MikeS) - Monday, 13 June 2011, 02:28 GMT
@Steve: Great! Sounds like that's one down.

@Thomas: That seems very odd. Can you give more details? The changes to pcm-android.c are quite necessary, esp. the provision for callback after a new frame (without that it _won't_ work at all). Just to rule one part out, didn't you say the mutex addition did work without issues? Anything that looks wrong there to you?
Comment by Michael Sevakis (MikeS) - Monday, 13 June 2011, 02:33 GMT I forgot to readd the pcm_play_dma_started_callback calls in pcm-android.c after resolving the conflicts with the new version. I'll get a patch up there ASAP.
Comment by Michael Sevakis (MikeS) - Monday, 13 June 2011, 02:42 GMT
Ok, this might work better (or not, we see) :-)

@Thomas: It might need some correction to not call pcm_play_dma_started_callback() when pcm_play_get_more_callback() says "no more" since it shouldn't call that unless more is expected.
Comment by Steve Bavin (pondlife) - Monday, 13 June 2011, 08:17 GMT
Hi Mike,

23 looks ok too.

It might be another Cygwin-ism, but patch always fails on ocm-s5l8702.c...
patching file firmware/target/arm/s5l8702/pcm-s5l8702.c
Hunk #1 FAILED at 113.
1 out of 1 hunk FAILED -- saving rejects to file firmware/target/arm/s5l8702/pcm-s5l8702.c.rej
Comment by Thomas Martitz (kugel.) - Monday, 13 June 2011, 11:10 GMT
The last patch works on android. playing music works, voice works, both together work and voice while paused also works. Great! :)
Comment by Michael Sevakis (MikeS) - Monday, 13 June 2011, 18:08 GMT
Thanks for all the help so far.

@Steve: I'm making the patches from SVN. I don't understand what's going on there to make the patching fail. ??

@Thomas: Excellent. Need to check keyclick, track skip beep and fade on stop/pause since those are also changed with this.
Comment by Michael Sevakis (MikeS) - Friday, 17 June 2011, 06:38 GMT
Sync to r30009.
Comment by Michael Sevakis (MikeS) - Sunday, 19 June 2011, 07:02 GMT
Throw in some optimizing for ARMv5 and lower. Move pcm_mixer files to /firmware.
Comment by Michael Sevakis (MikeS) - Sunday, 19 June 2011, 13:00 GMT
Add Coldfire optimization.
Comment by Steve Bavin (pondlife) - Tuesday, 21 June 2011, 07:22 GMT
Hopefully this can be committed soon, to give plenty of testing before the next release. A good feature for a 4.0 release!

Would it be worth doing a battery benchmark comparison for this patch? (I'm not volunteering as I don't have a working device... just saying...)
Comment by Michael Sevakis (MikeS) - Tuesday, 21 June 2011, 09:09 GMT
Perhaps it would be. It does do a bit more work but I've not seen any obvious decline over weeks of use. It might take some load off portal player at least. I'd like to work out to some reasonable satisfaction the CPU-optimized bits before judging battery life.
Comment by Steve Bavin (pondlife) - Tuesday, 21 June 2011, 10:44 GMT
Ignore my comment about not being able to build - had some silly pcm_mixer.h files lying around in apps..
Comment by Przemysław Hołubowski (p.h.) - Tuesday, 21 June 2011, 21:39 GMT
I get undefined reference to `pcm_play_buffer_callback' in function `fiq_playback' when I make for iriver H10 20GB. Make fails. I have patched clean rev. 30048.
Comment by Michael Sevakis (MikeS) - Wednesday, 22 June 2011, 01:03 GMT
Oops. This should fix that up. I renamed that function but missed pcm-pp.c.

Also, this has better coldfire assembly (seems like it should still be a bit better :P ) and divides CPU optimization into some *.c includes to include the necessary inline functions.
Comment by Michael Sevakis (MikeS) - Wednesday, 22 June 2011, 22:47 GMT
I did discover a couple glitches when using the patch.

1) Easily explainable. On coldfire I got a voice stack overflow one time. Since Coldfire uses any current stack as the interrupt stack, the voice stack is very close to the limit and the patched code will use more stack in the interrupt, perhaps the voice stack should be increased a few bytes. This won't be easily duplicated but can happen.

2) Clip (v1), the keyclick comes out wrong. I don't know why (maybe something with thumb interwork?) Fuze (v2), which uses the same PCM driver works perfectly. With current SVN the situation is a bit better but it's still wrong and 32 samples are missing from the beginning of the keyclick when not playing music. With the patch, doing a USB plug/unplug actually changes the tone of the keyclick until rebooting. Yep, there's some crazy going on here.
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 03:27 GMT
Add a few more voice stack bytes for coldfire (just a little bit). Fix the keyclick issue on clip (asm constraints weren't right).
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 07:08 GMT
Optimize beep generating in general and for ARM and Coldfire. Hopfully the last needed version of this before a commit. If any untested PCM driver get busted, then it will have to be fixed after that. :-)
Comment by Thomas Martitz (kugel.) - Thursday, 23 June 2011, 08:17 GMT
I'm not sure if it's a good idea to go into ASM at this early stage. Not even the C version had lots of testing yet. Also it's not proven that the ASM is really worth it.
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 09:07 GMT
Since the ASM is there, written by myself free of charge, anyone can easily switch it back and forth to check it out. :-) I wouldn't consider more than a month of working this out against feedback to be "early".

If you can cut instruction count in loops by 2/3 by using special CPU features the compiler can't use and mind pipeline requirements better, experience with those SoC it's worth it and almost impossible to botch. The most doubt should be ARMv4 where the fewest saving are made, however, the constructs used are proven in the DSP code. Clipping is fairly free on coldfire and armv6 but expensive on armv4.
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 11:50 GMT
FYI: Quick check on PP5002, iPod 3G (about the nastiest case there is since a sick number of callbacks are made due to the short FIFO without DMA) is worth an increase of about .7MHz in the buffering thread screen with everything set the same way on the same file. Voice reliability during playback (especially when buffering as well) is far, far better (SVN drops and skips, patch doesn't).
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 12:42 GMT
Also decided to test e200 (PP5024) under similar conditions as 3G; it's worth about -0.3MHz:
forest.spc (SVN / 30z.patch): 54.8/54.5

Numbers for 3G:
The Noise of Carpet.mp3: (SVN / 30z.patch): 39.4/40.1
Comment by Nils Wallménius (nls) - Thursday, 23 June 2011, 19:51 GMT
maybe the new pcm-mixer-<cpu>.c files should go in the target tree?
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 21:04 GMT
Hmmm...I was indecisive about it for some reason. I had them there, them moved them to follow the dsp.c pattern. If it's best to do /target in this case, then it's no problem.
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 21:24 GMT
I was considering merging the mixer and direct PCM into one another to better implement "stealing" the audio device, making it tighter and efficiently implement the same functions pcm.c has like peak calculation but on individual channels. Do that now or work it out afterwords? I'm avoid it for now because merging it now would just make the changes more complex in one hit.
Comment by Michael Sevakis (MikeS) - Thursday, 23 June 2011, 22:45 GMT
Coldfire test on x5 (the fraction is all over the place so it's hard to tell exacly):
forest.spc (SVN / 30z.patch): 103.2 / 103.5
Comment by Michael Sevakis (MikeS) - Saturday, 25 June 2011, 00:08 GMT
Ought I just commit this now that we've branched for 3.9 some days ago and isn't terribly invasive to things in the trunk? The limitations in PCM addressed by it have been an annoyance for years, spuring discussions in IRC about what to do and now here's a fix (ala LinusN's request for a "mixer" several years ago) that is also general enough to be useful in other ways. I would also like to get a move on finishing-up  FS#12153 , which besides giving more stable positioning information, is a better basic approach for lowish-latency DSP but gets rid of hacky things in pcmbuf in a way that works better if this patch is committed first.

Comment by Michael Sevakis (MikeS) - Saturday, 25 June 2011, 02:26 GMT
Put the asm in the target subdirectory. Do the second callback with less overhead and add pcm-internal.h to declare items shared with the low-level and mid-level code only that aren't part of the normal pcm API for app use; pcm.h is getting too messy with internal stuff.
Comment by MichaelGiacomelli (saratoga) - Saturday, 25 June 2011, 02:52 GMT
Is mix_samples likely to be called with non-unity gain? If so, making all our ARMv5 devices use the ARMv4 loop just because they don't have the saturated add and halfword pack seems like a bad idea.
Comment by Michael Sevakis (MikeS) - Saturday, 25 June 2011, 03:10 GMT
I wouldn't say it's unlikely since it's the case when mixing voice or fading on stop/pause with other channels playing.

Of course I want to use the best one the target can use. If they can use the armv6 code with explicit saturation, I guess that's ok. I'm just not sure what's what with the various arm v5 targets. Is that all it's missing is the saturated add and halfword pack? Saturation with or without gain cut on one or more mixdowns is needed.
Comment by MichaelGiacomelli (saratoga) - Saturday, 25 June 2011, 04:03 GMT
I think thats all, V5 has the packed multiply instructions. There is a chart here showing each flavor:

Although if this is only going to be used on fading and mixing voice, saving a few cycles on multiplication may not be a huge deal.
Comment by Michael Sevakis (MikeS) - Saturday, 25 June 2011, 05:05 GMT
I did intend for plugins to use this so they can play sound with playback which is as simple as adding a new channel enum and using it just like the core does. I'm sure unity would be the typical case however the faster it is the better IMO.

From the chart, the instructions in the ARMv6 version that can be used on v5 are 5E instructions:

Might be good, but not using it for this right now:

I have debated using a lower resolution for the amplitude to accomodate 16x16 multiplies.
Comment by Michael Sevakis (MikeS) - Saturday, 25 June 2011, 08:04 GMT
Serving customer version 5. I repeat: serving customer version 5. x-@
Comment by PurlingNayuki (yzflcyq) - Saturday, 25 June 2011, 15:03 GMT
May I can use it to swap the channel. Wonderful work.
Comment by Andree Buschmann (Buschel) - Sunday, 26 June 2011, 08:25 GMT
Tried this patch (32z) on win32 sim and iPod Video target. It works for the sim (including voice during pause), but on the iPod Video I could not get voice to work at all.
Comment by Michael Sevakis (MikeS) - Sunday, 26 June 2011, 10:33 GMT
H10-5GB - check
H10-20GB - check
e200 - check
iPod Video - ORLY?

I give up!
Comment by Andree Buschmann (Buschel) - Sunday, 26 June 2011, 17:26 GMT
My fault, I used the wrong voice-file.. Sorry for that!

Result: Works like charm as well.

Edit: Test on iPod nano 2G was good, too.

Don't give up! :)
Comment by Michael Sevakis (MikeS) - Monday, 27 June 2011, 05:01 GMT
I'll take !@#$^%! for 9000, Alex. :)

Thanks for testing.

Comment by Michael Sevakis (MikeS) - Tuesday, 28 June 2011, 23:18 GMT
I plan on committing very shortly, within 24 hours most likely unless there's a chorus of "don't do it!" I just need to make sure I'm around to handle red and yellow. 22 days in the tracker is more than enough for this sort of simple thing.

Now, I'll be sure to describe here and elsewhere, such as a mailing list, what is expected in the PCM driver in case any of the changes are discovered to have been incorrect. It's basically the trivial (for the most part) addition of pcm_play_dma_started_callback in one place in each driver.

pcm_play_dma_started callback is expected to be called (it's really inlined) when:
1) A new buffer has been obtained via pcm_play_get_more_callback ... AND
2) After DMA transfer has begun (or if no DMA, the hardware FIFO has just been filled) ... AND
3) Only in the driver's audio callback or ISR, NOT when first starting PCM in pcm_play_dma_start ... AND
3) NOT if pcm_play_get_more_callback returned no buffer (size = 0)