Rockbox

Tasklist

FS#10199 - Dynamic range compression / hard limiting

Attached to Project: Rockbox
Opened by Jeffrey Goode (Blue_Dude) - Monday, 11 May 2009, 13:47 GMT
Last edited by Jeffrey Goode (Blue_Dude) - Wednesday, 19 August 2009, 14:44 GMT
Task Type Patches
Category Music playback
Status Closed
Assigned To No-one
Operating System SW-codec
Severity Low
Priority Normal
Reported Version Version 3.2
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

This is a test of the interface to make sure it doesn't conflict or interfere with other Rockbox functions. This adds a new menu item under Sound Settings named Compression. It is intended to choose the amount of gain applied to the playback buffer before further processing. So far though, it only sets a couple of static variables.
This task depends upon

Closed by  Jeffrey Goode (Blue_Dude)
Wednesday, 19 August 2009, 14:44 GMT
Reason for closing:  Accepted
Additional comments about closing:  Committed as r22394
Comment by Delyan Kratunov (archivator) - Tuesday, 12 May 2009, 11:01 GMT
Could you please elaborate? What you're doing is a dynamic range compressor, right? Why the pre-amp then? As far as I know, DRC is meant to only reduce gain above a certain threshold. The way I see it, there's no need to work on the entire stream, just the parts that need adjusting (minding the attack/release phases, of course).
Comment by Jeffrey Goode (Blue_Dude) - Tuesday, 12 May 2009, 13:02 GMT
The pre-amp could easily be considered post-amp, or make-up gain. By smashing down the peaks, the whole waveform becomes quieter, but more sonically even from a loudness standpoint. The make-up gain brings it back up to a listenable level. Or higher. Preamping means that when doing the math, (if I'm forced to do integer math), I have more bits to work with and the arithmetic will be a little more precise, that's all. Given the choice, when doing integer math I'd rather multiply first, then divide, than vice versa.

The use I had in mind was in a noisy environment like a car, when listening to relatively quiet material. The quiet parts will sound louder (it's just straight amplification), but the peaks that would clip will be adjusted (negative gain, with smoothing) to keep their shape, but under the clipping threshold. Strictly speaking, this isn't compression. It's hard limiting. But it has the advantage of working at a wide range of gains without a lot of processing power required and without distorting the waveform. It just works. Crank it to "11" and it will play just as loud as the hardware will accept and not distort. The quiet parts will be that much louder, but the peaks will retain their shape at just under maximum allowable gain. The overall dynamic range is that much smaller, therefore "compressed", for lack of a better term. Maybe I should just call it "pre-amp with limiting" or something. Suggestions?
Comment by Delyan Kratunov (archivator) - Tuesday, 12 May 2009, 18:06 GMT
That makes perfect sense, thanks for the explanation.

As far as the name is concerned, I'd go with "limited amplification" or something of the sort (moving the emphasis on amplification and not compression). To me, "dynamic range compressor" is reserved for a slightly different algorithm, albeit having the same effect (compressed dynamic range). Calling your approach that could be misleading.

Of course, naming should be the least of your troubles and can be brought to the discussion at a much later time. Right before committing sounds about right :)

Good luck!
Comment by Jeffrey Goode (Blue_Dude) - Monday, 25 May 2009, 19:55 GMT
This is a cleaned up version of the interface test. Functionality is still forthcoming.
Comment by Jeffrey Goode (Blue_Dude) - Friday, 10 July 2009, 23:49 GMT
Here's a first attempt at a limiter function. It does work, but it needs tweaking. It's currently set up to reduce gain in .6 dB steps, which in my opinion is too coarse. Finer steps are certainly possible, at the expense of compute time for each sample. This could be mitigated by replacing the current linear peak finding algorithm with some kind of sort-type algorithm, halving and halving again until arriving at the correct figure. The worst case for arriving at the correct figure with maximum dB steps (0.0375 dB) is around 9, vs the current 19. This precision would come at the expense of RAM because it involves much larger tables. I'm sure there's a happy medium in there somewhere.
Comment by Jeffrey Goode (Blue_Dude) - Monday, 13 July 2009, 18:13 GMT
Here's a much better debugged version. The sort algorithm still isn't there, but it has a larger table and linear interpolation between table values to smooth the output considerably.
Comment by alex wallis (alexwallis646) - Wednesday, 15 July 2009, 15:23 GMT
Hi.
I am not a developer, I am an interested user.

I tried your patch out today on my h140, as I am going on holiday for a week, so |I decided to just test it rather than risk going on holiday with an untried patch as part of my build.

I found very serious problems with it, so I am glad I tried it out before going on holiday.

I found that during file playback, the sound becomes extremely choppy and broken up, every few seconds playback pauses for around one or two seconds.

More seriously though, with this patch applied, the hard disc on my 140 does not spindown, and remains spun up even though it is set to spin down after three seconds in my config file.

I am certain its your patch causing the issues unfortunately as I did a fresh build of the latest svn code as of this morning with only your patch applied and all the other patches I usually use removed and the issues continued.

So I think unfortunately this patch needs serious work.

I will test new versions though when they come out.
Comment by alex wallis (alexwallis646) - Wednesday, 15 July 2009, 15:26 GMT
I also tried including your patch as part of the set I normally use on my player, and as soon as it was removed and I used just my normal patches the above issues stopped completely so its certainly your patch as I tried builds with just your patch, and also with my normal patches I use plus your patch. each time as soon as your patch was removed the issues stopped completely.
Comment by Jeffrey Goode (Blue_Dude) - Wednesday, 15 July 2009, 17:32 GMT
Hm. My patch does require quite a bit of RAM, and it's possible it's cutting into whatever buffer is set aside for your file cache, or whatever it is that keeps your disk from spinning. That might be the cause of your choppy audio AND your disk spinning, since if there's a input buffer underrun, the audio would cut off until it pulls more from the disk. I don't know enough about the h140 to say.

I have been running it successfully on a Sansa E280, but that's flash based and doesn't have the spinup issues you're having. I don't have a hard disk based DAP and don't even know where to start addressing this problem.
Comment by alex wallis (alexwallis646) - Wednesday, 15 July 2009, 17:51 GMT
I have the dircache option enabled if that's any help, as that really reduces disc spinups.

Maybe ask on the dev list or irc, i'm sure people will give you advice about what to do to solve these issues.
Comment by Magnus Holmgren (learman) - Wednesday, 15 July 2009, 21:51 GMT
It does sound like playback not being realtime (the buffering thread wouldn't get enough time to fill the buffers then, hence the disk doesn't spin down).

Having looked a little on the code, it does seem to add quite a bit of processing, and lots of memory accesses during that processing (the attack buffer and what not). On the H140, "normal" memory accesses are quite slow, so the difference between having audio processing buffers in IRAM and normal RAM is pretty significant (the difference is much smaller on ARM targets). Don't know if having all "important" buffers and tables in IRAM would make playback realtime though.

The DSP at least used to have the samples in a certain format (in terms of fractional bits) after input_samples had been called. If that's still the case, couldn't get_peak_value be simplified a bit?

dsp_flush_attack_buffer seem to allocate about 2 KB on the stack. Looks much to me, but I don't know if it could cause problems.

Finally, one way to avoid clipping is to limit samples like this:

if (sample < -0.5)
sample = tanh((sample + 0.5) / (1-0.5)) * (1-0.5) - 0.5;
else if (sample > 0.5)
sample = tanh((sample - 0.5) / (1-0.5)) * (1-0.5) + 0.5;

This would avoid the need for an attack buffer at least. :)
Comment by Jeffrey Goode (Blue_Dude) - Thursday, 16 July 2009, 02:32 GMT
I don't know enough about the architecture to even implement an IRAM buffer vs a normal RAM buffer, so I can't address that kind of performance issue.

input_samples just reconfigures the input buffer pointers to the internal 32 bit interleaved format. It doesn't set frac_bits or even look at the actual buffer contents. The actual fixed point format varies with the codec feeding samples to the buffer. That's why I had to scale the samples in get_peak_value before comparison. Simplifying get_peak_value is pretty high on my list of priorities, mostly because with heavily clipped samples it has to go through a *lot* of iterations per sample pair to come up with a valid number, and because get_peak_value has to be called for every sample. I will probably incorporate it into the calling routine directly to avoid calling overhead. I also need to cut down on those iterations. I'm working on that now.

dsp_flush_attack_buffer is only called in one specific circumstance: when playing out at the end of the playlist. It does allocate a bit less than 2KB of space in the PCM buffer, then immediately plays it and releases it. It shouldn't affect memory allocation during normal playback.

If I follow your math code snippet, you suggest that if the signal if halfway to max, start throttling gain automatically? I wonder if the per sample math involved would be less computationally intensive than what we have already...
Comment by Magnus Holmgren (learman) - Thursday, 16 July 2009, 07:13 GMT
Look at some of the other buffers in dsp.c. It's just a question of placing IBSS_ATTR (or ICONST_ATTR) in the right places. Note, however, that there might not be enough free IRAM for your needs; you'll get a link error in that case.

As for sample format, I'm pretty sure most codecs output the data in the same format. But maybe not all...

The tanh stuff simplifies a lot, but making a fast implementation of it would require some work too, no doubt.
Comment by Jeffrey Goode (Blue_Dude) - Thursday, 16 July 2009, 13:37 GMT
Ah, I wondered what IBSS_ATTR did. I'll try that and test compile for a h140 target to see if it works. I think the only things that need to be there are the high turnover values: attack buffer and peak buffer.

I've optimized the code some so that get_peak_value is called half as often and runs faster when it is. I still have to implement the divide and conquer table seek that should make it faster still.

I'm going to check out a separate branch on my system to explore your tanh idea. It will definitely prevent clipping but I don't know yet if the sound quality will suffer for it. There's a really fast exp(x) function that would be suitable for calculating tanh, but it remains to be seen if it'll be fast enough. If it works out, I'll open another FS ticket and post it there.
Comment by Jeffrey Goode (Blue_Dude) - Thursday, 16 July 2009, 19:26 GMT
OK, try this one out. It's further optimized for compute speed, finally including the divide and conquer table seek algorithm in get_peak_value. The get_peak_value function doesn't get called at all unless the sample is clipped, and when it does it only gets called once per sample pair. I've also put in IRAM attribute tags to place the attack buffer, the peak buffer and the gain tables into IRAM. It does compile and link for the h120/h140 so I don't think I overdid it. It may fail on targets with less IRAM though, whichever those are.

If this still plays slower than real time on h120/h140 targets there's not a whole lot left I can do except recommend that you not use it. :( Let me know how it goes.
Comment by alex wallis (alexwallis646) - Friday, 17 July 2009, 02:32 GMT
Hi.
Well good and bad news, I have done a bit of listening with this latest patch applied, and playback seems fine.

However when I have just tried skipping through a folder of eleven tracks, the hard drive does seem to be spun up a bit more than usual, in this particular playlist, the drive usually only spins up once, but this time when I was manually skipping through, the drive seemed to spin up every other track.

I am not sure I would feel comfortable running this patch on my build yet, as obviously hard drives do break eventually, so the less disk activity I can get the better.

Also, I am not sure what this means exactly, but having this patch applied reduces the buffer size down from 28.6 megabytes to 28.3 megabytes.

If the extra disc spinups could be sorted I would be interested in running this patch or seeing it committed.
Comment by Jeffrey Goode (Blue_Dude) - Saturday, 18 July 2009, 17:02 GMT
A proof-of-concept patch for learman's tanh algorithm is at  FS#10449 .
Comment by Jeffrey Goode (Blue_Dude) - Wednesday, 12 August 2009, 20:13 GMT
Here's an optimized version. It doesn't bog down the system when the limiter isn't active and the process runs as fast as I can make it.
Comment by Jeffrey Goode (Blue_Dude) - Thursday, 13 August 2009, 17:53 GMT
The last version hung the interface on the target if you tried to turn the limiter off (preamp 0dB) while playing. The audio would keep playing just fine but the interface would freeze for 30 seconds or so while waiting for room in the PCM buffer to open up. This version eliminates the freeze but you lose about 300 samples of audio when turning off the limiter while playing. This is probably imperceptible and seems a decent compromise since the situation doesn't come up very often.

Loading...