FS#11502 - Optimize coldfire asm for fixmul in codecs

Attached to Project: Rockbox
Opened by Andree Buschmann (Buschel) - Wednesday, 28 July 2010, 05:52 GMT
Last edited by Andree Buschmann (Buschel) - Wednesday, 28 July 2010, 18:18 GMT
Task Type Patches
Category Codecs
Status Closed
Assigned To Andree Buschmann (Buschel)
Operating System Coldfire-based
Severity Low
Priority Normal
Reported Version Release 3.6
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Use faster fixmul16() by Nils Wallménius in libatrac. Needs to be tested on coldfire target before submission.

Question:
Can "fixmul14" (MPC_MULTIPLY in libmusepack/mpcdec_math.h and MUL_R in libfaad/fixed.h) and "fixmul15" (MULT31_SHIFT15 in lib/asm_mcf5249.h and libtremor/asm_mcf5249.h) be optimized as well?

Closed by  Andree Buschmann (Buschel)
Wednesday, 28 July 2010, 18:18 GMT
Reason for closing:  Accepted
Additional comments about closing:  Submitted with r27596
Comment by Nils Wallménius (nls) - Wednesday, 28 July 2010, 10:17 GMT
This fixmul16 (which I just adapted from libwma, btw) exploits the fact that the result is shifted by 16 bits. Since the other functions you mention use different shifts, they can't be done this way. I see no way to improve the codeclib and tremor MULT31_SHIFT15. The MPC_MULTIPLY could be improved slightly to use one register less, but it would still need as many cycles. On a side note, MPC_MULTIPLY_EX could also be improved a bit. I'll see if I can find some motivation for this :)
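To illustrate why the 16-bit shift is special, here is a minimal portable C sketch. It is not the Coldfire asm from the patch, and the function names are hypothetical. It assumes the product is available as two 32-bit halves: with a shift of 16 the result is simply the low halfword of the high word joined to the high halfword of the low word, while shifts such as 14 or 15 need general shifts on both halves. That halfword alignment is presumably what the asm takes advantage of, though the thread does not spell this out.

```c
#include <stdint.h>

/* Reference 16.16 fixed-point multiply via a 64-bit product.
 * Assumes arithmetic right shift of negative values (true for GCC). */
static int32_t fixmul16_ref(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * y) >> 16);
}

/* Same result assembled from the two 32-bit halves of the product.
 * With a 16-bit shift, each half contributes exactly one halfword. */
static int32_t fixmul16_halves(int32_t x, int32_t y)
{
    int64_t  p  = (int64_t)x * y;
    uint32_t lo = (uint32_t)p;                    /* bits 0..31  */
    uint32_t hi = (uint32_t)((uint64_t)p >> 32);  /* bits 32..63 */
    return (int32_t)((hi << 16) | (lo >> 16));    /* bits 16..47 */
}

/* General case (shift must be in 1..31): both halves need a
 * variable shift, so the halfword shortcut no longer applies. */
static int32_t fixmul_halves(int32_t x, int32_t y, unsigned shift)
{
    int64_t  p  = (int64_t)x * y;
    uint32_t lo = (uint32_t)p;
    uint32_t hi = (uint32_t)((uint64_t)p >> 32);
    return (int32_t)((hi << (32 - shift)) | (lo >> shift));
}
```

In this model, `fixmul_halves(x, y, 14)` and `fixmul_halves(x, y, 15)` correspond to the "fixmul14" and "fixmul15" cases from the question.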
Comment by Andree Buschmann (Buschel) - Wednesday, 28 July 2010, 10:39 GMT
Thanks for answering the question and of course I am looking forward to MPC_MULTIPLY_EX optimizations :o)

Btw, just to ensure proper functionality: can you briefly test and approve the above patch?
Comment by Nils Wallménius (nls) - Wednesday, 28 July 2010, 13:39 GMT
Yes, this is OK. Speedup of ~0.5% on my sample file.
Comment by Nils Wallménius (nls) - Wednesday, 28 July 2010, 14:19 GMT
In libmpc, AFAICS, MPC_MULTIPLY_EX is only used with d->SCF_shift values, which come from find_shift(), and that always returns values in the 0-31 range. So the conditional branch in MPC_MULTIPLY_EX will trigger only for shifts of 31, in which case t1 will be shifted right by 0 (unmodified).
If the conditional branch were removed, t1 would be shifted left by 0 and x would be shifted right by 31 and then OR'ed with t1, so only the least significant bit could change.
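The shift-of-31 argument can be checked with a small C model. This is only a sketch of the two cases as described in the comment above, not the actual MPC_MULTIPLY_EX macro. The function names are hypothetical, and `t1` and `x` stand for the two intermediate values mentioned there.

```c
#include <stdint.h>

/* Shift of 31 with the conditional branch taken:
 * t1 is shifted right by 0, i.e. returned unmodified. */
static uint32_t shift31_with_branch(uint32_t t1, uint32_t x)
{
    (void)x;            /* x does not contribute in this case */
    return t1 >> 0;
}

/* Shift of 31 with the branch removed:
 * t1 is shifted left by 0 and OR'ed with x shifted right by 31. */
static uint32_t shift31_without_branch(uint32_t t1, uint32_t x)
{
    return (t1 << 0) | (x >> 31);
}
```

Since `x >> 31` is either 0 or 1 for an unsigned 32-bit value, the OR can only set bit 0, so the two variants agree on every bit above the least significant one.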
