FS#10974 - Adapt dct32 to mpc codec

Attached to Project: Rockbox
Opened by Andree Buschmann (Buschel) - Saturday, 06 February 2010, 19:15 GMT
Last edited by Andree Buschmann (Buschel) - Sunday, 07 February 2010, 14:09 GMT
Task Type Patches
Category Codecs
Status Closed
Assigned To No-one
Operating System All players
Severity Low
Priority Normal
Reported Version Release 3.4
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No


mpc's calc_next_V() uses a special dct32 implementation that has the disadvantage of possible internal overflows. To avoid such overflows, pre- and post-scaling is applied, which costs a few tenths of a MHz in decoding speed.
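
For illustration, a toy sketch of the pre-/post-scaling idea (this is not the actual calc_next_V() code; GUARD_BITS and sum4_scaled() are hypothetical names and values). Each input is shifted right before accumulation so the intermediate sum cannot overflow, and the result is shifted back afterwards. The extra shifts cost cycles and the dropped low bits cost precision:

```c
#include <stdint.h>

#define GUARD_BITS 2  /* hypothetical headroom, not the value used by mpc */

/* Sum four samples. Without pre-scaling, the intermediate sum of two large
 * positive samples can overflow int32_t even if the final result is small. */
static int32_t sum4_scaled(const int32_t *x)
{
    int32_t acc = 0;
    int i;

    for (i = 0; i < 4; i++)
        acc += x[i] >> GUARD_BITS;  /* pre-scale: low bits are lost here */

    /* post-scale: restore the original magnitude (shift done on unsigned
     * to avoid undefined behaviour for negative values) */
    return (int32_t)((uint32_t)acc << GUARD_BITS);
}
```

With inputs {5, 5, 5, 5} the exact sum is 20, but the scaled version returns 16, which shows the precision lost to the pre-scaling.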

As mpc generally uses the same filterbanks and DCTs as mp3, I have adapted the libmad dct32 implementation to mpc. The C version is about 0.3 MHz faster than mpc in svn. Interestingly, the asm'ed version is about 1.2-1.4 MHz slower than the C version of dct32.

The asm'ed dct32 could be optimized further by using s0.31 format for the coefficients and dropping one bit of precision in the result of the multiplication (just keep the upper dword of the 64-bit result), or even by dropping 4 bits of precision when using the s3.28 coefficients as they are defined now (mpc even drops 5 bits, yet there is no change in the output). Doing so would save 0.2-0.4 MHz but would still not reach the speed of the C version of dct32.
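
A minimal sketch of the two multiplication variants discussed here, written in plain C rather than asm (mul_full() and mul_hi() are illustrative names, not functions from the patch). mul_full() shifts the 64-bit product by the number of fraction bits; mul_hi() keeps only the upper dword and shifts left to restore the scale, which drops 1 bit for s0.31 coefficients and 4 bits for s3.28 coefficients:

```c
#include <stdint.h>

/* Full precision: 64-bit product shifted right by the coefficient's fraction
 * bits (31 for s0.31, 28 for s3.28). Assumes an arithmetic right shift for
 * negative values, which is what common compilers implement. */
static inline int32_t mul_full(int32_t x, int32_t c, int frac_bits)
{
    return (int32_t)(((int64_t)x * c) >> frac_bits);
}

/* Reduced precision: keep only the upper dword of the 64-bit product, then
 * shift left by (32 - frac_bits) to get back to the sample scale. The low
 * (32 - frac_bits) bits of the result are always zero. */
static inline int32_t mul_hi(int32_t x, int32_t c, int frac_bits)
{
    int32_t hi = (int32_t)(((int64_t)x * c) >> 32);
    return (int32_t)((uint32_t)hi << (32 - frac_bits));
}
```

For example, multiplying 1000 by 0.5 gives 500 with either function in s0.31, while in s3.28 mul_full() gives 500 and mul_hi() gives 496 because the lowest 4 bits are lost.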

I am especially interested in tests with non-ARM targets. Is the overall volume correct or is it clipped or too low?

Closed by  Andree Buschmann (Buschel)
Sunday, 07 February 2010, 14:09 GMT
Reason for closing:  Accepted
Additional comments about closing:  submitted with r24544
Comment by Andree Buschmann (Buschel) - Saturday, 06 February 2010, 21:18 GMT
Next step:
- moved mirroring N/2 to N output into mpc_dct32()
- removed old calc_new_v() function
- removed asm'ed dct32 (for now)
- removed old (unused) OPTIMIZE_SPEED option that affected output accuracy
- moved mpc_dct32() to IRAM on targets with large IRAM
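
For readers unfamiliar with the mirroring step, here is a sketch of the textbook MPEG-1 symmetry that derives 64 matrixing outputs from 32 DCT outputs. It assumes X[j] = sum over k of S[k]*cos(j*(2k+1)*pi/64) and V[i] = sum over k of S[k]*cos((16+i)*(2k+1)*pi/64). The function name and the buffer layout are assumptions for illustration; the ordering and signs used inside mpc_dct32() in the patch may differ:

```c
#include <stdint.h>

/* Expand 32 DCT-II outputs X[0..31] into 64 matrixing outputs V[0..63]
 * using cos((64 -/+ j) * a) = -cos(j * a) for a = (2k+1)*pi/64. */
static void expand_dct32_to_v64(const int32_t X[32], int32_t V[64])
{
    int i;

    for (i = 0; i < 16; i++)
        V[i] = X[16 + i];       /* V[0..15]  =  X[16..31] */

    V[16] = 0;                  /* cos((2k+1)*pi/2) is zero for all k */

    for (i = 17; i <= 48; i++)
        V[i] = -X[48 - i];      /* V[17..48] = -X[31..0], mirrored */

    for (i = 49; i < 64; i++)
        V[i] = -X[i - 48];      /* V[49..63] = -X[1..15] */
}
```

Doing this expansion inside the DCT routine lets the final butterfly stage write each value to its mirrored positions directly instead of running a separate copy loop afterwards.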

Tested on iPod 5G and PCSim; works fine. Decoding speed on my test sample went from 23.3 MHz (svn) to 22.1 MHz, so the speed-up is about 5%.

Please test on other targets -- especially on Coldfire.
Comment by Andree Buschmann (Buschel) - Sunday, 07 February 2010, 14:04 GMT
Last patch version:
- use costab with max precision (s0.31)

Comparing the decoding output of svn against this patched version, there is a maximum difference of +/- 1 in sample value at 16-bit precision. This is expected.
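
A comparison like this can be sketched as follows, assuming both decoder outputs are available as 16-bit PCM buffers of equal length (max_abs_diff() is a hypothetical helper, not part of the patch):

```c
#include <stddef.h>
#include <stdint.h>

/* Return the largest absolute per-sample difference between two
 * 16-bit PCM buffers of n samples each. */
static int max_abs_diff(const int16_t *a, const int16_t *b, size_t n)
{
    int max = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];  /* widen first to avoid overflow */
        if (d < 0)
            d = -d;
        if (d > max)
            max = d;
    }
    return max;
}
```

A return value of 1 corresponds to the +/- 1 difference reported above.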

Speed up on ARM (iPod 5.5G): 23.3 -> 22.2 MHz (+5%)
Speed up on Coldfire (M5): 22.4 -> 20.0 MHz (+12%)
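
The percentages above appear to be consistent with the ratio of the old to the new MHz figure, minus one. A small sketch of that calculation (speedup_percent() is an illustrative name, and the formula is inferred from the numbers rather than stated in the task):

```c
/* Speed-up in percent, given the MHz needed before and after the change.
 * Lower MHz means faster decoding, so the old value is the numerator. */
static double speedup_percent(double mhz_before, double mhz_after)
{
    return (mhz_before / mhz_after - 1.0) * 100.0;
}
```

With 22.4 and 20.0 this gives about 12%, and with 23.3 and 22.2 about 5%.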