- Status Closed
- Percent Complete
- Task Type Patches
- Category Codecs
- Assigned To Buschel
- Operating System Coldfire-based
- Severity Low
- Priority Very Low
- Reported Version Release 3.6
- Due in Version Undecided
- Due Date Undecided
- Votes
- Private
FS#11502 - Optimize coldfire asm for fixmul in codecs
Use faster fixmul16() by Nils Wallménius in libatrac. Needs to be tested on coldfire target before submission.
Question:
Can “fixmul14” (MPC_MULTIPLY in libmusepack/mpcdec_math.h and MUL_R in libfaad/fixed.h) and “fixmul15” (MULT31_SHIFT15 in lib/asm_mcf5249.h and libtremor/asm_mcf5249.h) be optimized as well?
Closed by Buschel
2010-07-28 18:18
Reason for closing: Accepted
Additional comments about closing:
Submitted with r27596
This fixmul16 (which I just adapted from libwma, btw) exploits the fact that the result is shifted by 16 bits. Since the other functions you mention use different shifts, they can't be done this way. I see no way to improve the codeclib and tremor MULT31_SHIFT15. MPC_MULTIPLY could be improved slightly to use one register less, but it would still need as many cycles. On a side note, MPC_MULTIPLY_EX could also be improved a bit. I'll see if I can find some motivation for this :)
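One way to read the point about the 16-bit shift is the portable C sketch below. It is not the Coldfire asm from the patch; the function names `fixmul16_ref`, `fixmul_halves` and `fixmul16_halves` are made up for illustration. It assumes the product is available as two 32-bit halves: with a shift of exactly 16 the result is just the low halfword of the upper half next to the high halfword of the lower half, while shifts of 14 or 15 need real variable-position shifts of both halves.

```c
#include <stdint.h>

/* Reference: multiply two 16.16 fixed-point numbers via the 64-bit product.
 * Assumes the usual arithmetic right shift for negative values. */
static int32_t fixmul16_ref(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * y) >> 16);
}

/* Hypothetical helper: rebuild (x * y) >> s from the two 32-bit halves
 * of the product. Valid for s in 1..31 only. */
static int32_t fixmul_halves(int32_t x, int32_t y, unsigned s)
{
    uint64_t p  = (uint64_t)((int64_t)x * y);
    uint32_t hi = (uint32_t)(p >> 32);   /* upper 32 bits of the product */
    uint32_t lo = (uint32_t)p;           /* lower 32 bits of the product */
    return (int32_t)((hi << (32 - s)) | (lo >> s));
}

/* For s == 16 both shifts are by a whole halfword, so the combination
 * amounts to moving halfwords around instead of shifting. */
static int32_t fixmul16_halves(int32_t x, int32_t y)
{
    return fixmul_halves(x, y, 16);
}
```

For example, `fixmul16_halves(0x18000, 0x20000)` (1.5 times 2.0 in 16.16) gives `0x30000`, the same as `fixmul16_ref`.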
Thanks for answering the question, and of course I am looking forward to the MPC_MULTIPLY_EX optimizations :o)
Btw, just to ensure proper functionality: can you briefly test and approve the above patch?
Yes, this is OK. Speedup of ~0.5% on my sample file.
In libmpc, AFAICS, MPC_MULTIPLY_EX is only used with d->SCF_shift values, which come from find_shift(). That function always returns values in the 0-31 range, so the conditional branch in MPC_MULTIPLY_EX will trigger only for shifts of 31, in which case t1 is shifted right by 0 (i.e. left unmodified).
If the conditional branch were removed, t1 would be shifted left by 0, and x would be shifted right by 31 and then OR'ed with t1, so only the least significant bit could change.
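The argument above can be checked with a small C model. This is a hypothetical sketch, not the actual MPC_MULTIPLY_EX macro: `t1` stands in for the upper part of the product and `x` for the register holding the lower part, as named in the comment, and the function names and the exact shape of the branch are assumptions.

```c
#include <stdint.h>

/* Hypothetical model of the variant WITH the conditional branch:
 * for shifts of 31 or more, t1 is only shifted right by (shift - 31). */
static uint32_t combine_with_branch(uint32_t t1, uint32_t x, unsigned shift)
{
    if (shift >= 31)
        return t1 >> (shift - 31);       /* shift == 31: t1 unmodified */
    return (t1 << (31 - shift)) | (x >> shift);
}

/* Hypothetical model of the variant WITHOUT the branch.
 * Only valid for shift in 0..31, the range find_shift() is said to return. */
static uint32_t combine_no_branch(uint32_t t1, uint32_t x, unsigned shift)
{
    return (t1 << (31 - shift)) | (x >> shift);
}
```

At a shift of 31, `x >> 31` is either 0 or 1, so the two models can differ in bit 0 at most; for shifts below 31 they are identical by construction.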