Rockbox

  • Status Closed
  • Percent Complete 100%
  • Task Type Patches
  • Category Codecs
  • Assigned To Buschel
  • Operating System Coldfire-based
  • Severity Low
  • Priority Very Low
  • Reported Version Release 3.6
  • Due in Version Undecided
  • Due Date Undecided
Attached to Project: Rockbox
Opened by Buschel - 2010-07-28
Last edited by Buschel - 2010-07-28

FS#11502 - Optimize coldfire asm for fixmul in codecs

Use the faster fixmul16() by Nils Wallménius in libatrac. Needs to be tested on a Coldfire target before submission.

Question:
Can “fixmul14” (MPC_MULTIPLY in libmusepack/mpcdec_math.h and MUL_R in libfaad/fixed.h) and “fixmul15” (MULT31_SHIFT15 in lib/asm_mcf5249.h and libtremor/asm_mcf5249.h) be optimized as well?

Closed by Buschel
2010-07-28 18:18
Reason for closing: Accepted
Additional comments about closing: Submitted with r27596

nls commented on 2010-07-28 10:17

This fixmul16 (which I just adapted from libwma, btw) exploits the fact that the result is shifted by 16 bits. Since the other functions you mention use different shifts, they can't be done this way. I see no way to improve the codeclib and tremor MULT31_SHIFT15. MPC_MULTIPLY could be improved slightly to use one register less, but it would still need as many cycles. On a side note, MPC_MULTIPLY_EX could also be improved a bit. I'll see if I can find some motivation for this :)
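For readers unfamiliar with the trick, here is a rough portable C sketch of why a shift of exactly 16 is convenient. It is not the Coldfire assembly from the patch; the function names fixmul16_ref and fixmul16_split are made up for illustration. With a 16-bit shift, the result is simply the low half of the upper product word joined with the high half of the lower product word, which is not the case for shifts of 14 or 15.

```c
#include <stdint.h>

/* Reference: 64-bit product, then take bits 16..47 of it.
 * Working on the unsigned bit pattern avoids relying on the
 * behaviour of >> for negative signed values. */
static int32_t fixmul16_ref(int32_t x, int32_t y)
{
    int64_t p = (int64_t)x * y;
    return (int32_t)(uint32_t)((uint64_t)p >> 16);
}

/* Same result built from the two 32-bit halves of the product.
 * Because the shift is 16, the halves line up on a 16-bit boundary. */
static int32_t fixmul16_split(int32_t x, int32_t y)
{
    int64_t p = (int64_t)x * y;
    uint32_t hi = (uint32_t)((uint64_t)p >> 32); /* upper product word */
    uint32_t lo = (uint32_t)p;                   /* lower product word */
    return (int32_t)((hi << 16) | (lo >> 16));
}
```

Both functions assume the usual two's-complement conversion from uint32_t to int32_t, which holds on the compilers Rockbox targets.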

Thanks for answering the question; of course I am looking forward to the MPC_MULTIPLY_EX optimizations :o)

Btw, just to ensure proper functionality: could you briefly test and approve the above patch?

nls commented on 2010-07-28 13:39

Yes, this is OK. Speedup of ~0.5% on my sample file.

nls commented on 2010-07-28 14:19

In libmpc, AFAICS, MPC_MULTIPLY_EX is only used with d->SCF_shift values, which come from find_shift(). That function always returns values in the 0-31 range, so the conditional branch in MPC_MULTIPLY_EX will trigger only for shifts of 31, in which case t1 will be shifted right by 0 (unmodified).
If the conditional branch were removed, t1 would be shifted left by 0, and x would be shifted right by 31 and then OR'ed with t1, so only the least significant bit could change.
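
The reasoning above can be played through with a small portable C model. This is only a sketch under assumptions, not the actual MPC_MULTIPLY_EX macro: it assumes t1 holds the product shifted right by 31 and x holds the low 32 bits of the product, and the names mul_ex_ref, mul_ex_split and keep_branch are hypothetical. Rounding behaviour of the real multiply-accumulate unit is not modelled.

```c
#include <stdint.h>

/* Reference: low 32 bits of (x * y) >> shift, valid for 0 <= shift <= 31. */
static int32_t mul_ex_ref(int32_t x, int32_t y, unsigned shift)
{
    int64_t p = (int64_t)x * y;
    return (int32_t)(uint32_t)((uint64_t)p >> shift);
}

/* Model of the split computation, valid for 0 <= shift <= 31.
 * keep_branch != 0 models the existing conditional branch,
 * keep_branch == 0 models the code with the branch removed. */
static int32_t mul_ex_split(int32_t x, int32_t y, unsigned shift,
                            int keep_branch)
{
    int64_t p = (int64_t)x * y;
    uint32_t t1 = (uint32_t)((uint64_t)p >> 31); /* assumed: product >> 31 */
    uint32_t lo = (uint32_t)p;                   /* assumed: low product word */

    if (keep_branch && shift >= 31)
        return (int32_t)t1;                      /* shift == 31: t1 >> 0 */

    /* shift < 31, or shift == 31 with the branch removed:
     * t1 << (31 - shift) OR'ed with the low word shifted right. */
    return (int32_t)((t1 << (31 - shift)) | (lo >> shift));
}
```

In this model the bit OR'ed in for a shift of 31 is bit 31 of the product, which is already bit 0 of t1, so both variants agree; whether that also holds on the real hardware depends on details this sketch leaves out.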
