FS#6705 - Small ARM optimizations to MPA codec
Opened by Tomasz Malesinski (tmal) - Tuesday, 27 February 2007, 23:38 GMT
Last edited by Dave Chapman (linuxstb) - Saturday, 28 July 2007, 15:21 GMT
Here are some optimizations that let me play MP3 files in real time on the iriver iFP. The iFP benefits mostly from moving things into IRAM, since its external RAM is very slow (16-bit, 70 ns, no bursts). The patch needs some cleanup, but I won't be working on it in the next few weeks, so I'm putting it here in case someone wants to look at it.
One warning: there is a bug somewhere in the Makefiles, and sometimes make has to be run twice to recompile and relink the codec.
The biggest change is a different dct32 routine. It executes more instructions than the old one, but it's smaller, so it fits in IRAM and uses less cache. Compared to my last patch (
The rest of the changes involve using ldm and stm instructions. Some give only very minor gains.
With this patch the mpegplayer plugin does not compile on the iPod because IRAM overflows.
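To illustrate the ldm/stm idea, here is a rough C sketch. It is not taken from the patch, and copy_words_x4 is a made-up name. Reading four words into locals before storing them gives an ARM compiler the chance to use one ldm and one stm per iteration instead of four ldr/str pairs; whether it actually does so depends on the compiler and options, which is presumably why hand-written assembly is the safer route.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: copy `count` 32-bit words,
 * four per iteration. Grouping the four loads and the four stores lets
 * an ARM compiler emit a single ldm and a single stm for each group,
 * but nothing in C guarantees that it will. */
static void copy_words_x4(int32_t *dst, const int32_t *src, size_t count)
{
    size_t i = 0;

    for (; i + 4 <= count; i += 4) {
        int32_t a = src[i];
        int32_t b = src[i + 1];
        int32_t c = src[i + 2];
        int32_t d = src[i + 3];

        dst[i]     = a;
        dst[i + 1] = b;
        dst[i + 2] = c;
        dst[i + 3] = d;
    }

    /* Leftover words (count not a multiple of 4), one at a time. */
    for (; i < count; i++)
        dst[i] = src[i];
}
```

On a core like the ARM7TDMI a multi-word transfer pays the non-sequential access cost once per group rather than once per word, which matters most on slow memory without bursts.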
Some ideas for further improvements:
- integrate III_overlap into III_imdct_l.
- at the cost of accuracy, we could try changing the D coefficients (used in synth_full) to use at most 24 bits. ARM multiplies faster when some of the most significant bytes (bytes, not bits) of the last operand are 0.
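To make the 24-bit idea more concrete, here is a small C sketch of the multiplier timing it relies on. It assumes an ARM7TDMI-style multiplier, where a signed multiply terminates early when the upper bytes of the last operand (Rs) are all zeros or all ones. The function name mul_internal_cycles is made up for illustration, and the returned value is only the internal-cycle count "m" from the core's timing tables, not the total instruction time.

```c
#include <stdint.h>

/* Hypothetical model, assuming ARM7TDMI-style early termination:
 * returns the number of internal multiplier cycles ("m") for a signed
 * multiply whose last operand (Rs) is `rs`.
 *   bits 31..8  all 0 or all 1 -> 1
 *   bits 31..16 all 0 or all 1 -> 2
 *   bits 31..24 all 0 or all 1 -> 3
 *   otherwise                  -> 4
 * Range checks are used instead of shifts so that the result does not
 * depend on how the compiler shifts negative values. */
static int mul_internal_cycles(int32_t rs)
{
    if (rs >= -(INT32_C(1) << 8) && rs < (INT32_C(1) << 8))
        return 1;
    if (rs >= -(INT32_C(1) << 16) && rs < (INT32_C(1) << 16))
        return 2;
    if (rs >= -(INT32_C(1) << 24) && rs < (INT32_C(1) << 24))
        return 3;
    return 4;
}
```

Under this model a coefficient such as 0x0FFFFFFF costs 4 internal cycles, while a value that fits in 24 bits, such as 0x00FFFFFF, costs 3, so the saving would be one cycle per multiply in synth_full.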
Saturday, 28 July 2007, 15:21 GMT
Reason for closing: Accepted
Additional comments about closing: Finally committed to SVN - thanks.