Rockbox

This is the bug/patch tracker for Rockbox.


FS#8750 - add some ARM assembler for dsp-routines

Attached to Project: Rockbox
Opened by Andree Buschmann (Buschel) - Monday, 17 March 2008, 13:39 GMT+2
Last edited by Nils Wallménius (nls) - Wednesday, 19 March 2008, 18:25 GMT+2
Task Type Patches
Category Music playback
Status Closed
Assigned To No-one
Player type PortalPlayer-based
Severity Low
Priority Normal
Reported Version current build
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Private No

Details

Adds ARM assembler for the dsp-routines sample_output_mono() and sample_output_stereo(). Saves roughly 0.6MHz during 44.1kHz stereo playback.

Closed by  Nils Wallménius (nls)
Wednesday, 19 March 2008, 18:25 GMT+2
Reason for closing:  Accepted
Comment by Andree Buschmann (Buschel) - Monday, 17 March 2008, 15:17 GMT+2
New version:
- minor change after discussion in IRC (other solution for "bx lr")
- added ARM assembler channels_process_sound_chan_karaoke() -- can be further optimized, but has nearly no influence on CPU-load
Comment by Andree Buschmann (Buschel) - Monday, 17 March 2008, 15:17 GMT+2
Now with attached patch.
Comment by Andree Buschmann (Buschel) - Monday, 17 March 2008, 21:51 GMT+2
Using faster clipping (suggested by preglow). Total speed up for stereo-signals is +22% now.
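
The comment does not spell out the clipping method, so the following is only a sketch of one common low-branch way to saturate a 32-bit sample to 16 bits in C. The helper name clip16_sketch() is hypothetical and is not taken from the patch. It assumes a compiler (such as GCC on ARM) where narrowing to int16_t wraps and right-shifting a negative value is an arithmetic shift.

```c
#include <stdint.h>

/* Saturate a 32-bit sample to the signed 16-bit range.
 * Hypothetical helper for illustration; not the code from the patch. */
static inline int32_t clip16_sketch(int32_t sample)
{
    /* A value that survives the round trip through int16_t is in range. */
    if ((int16_t)sample != sample) {
        /* sample >> 31 is 0 for positive and -1 for negative values
         * (arithmetic shift assumed), so the XOR yields 0x7fff or -0x8000. */
        sample = 0x7fff ^ (sample >> 31);
    }
    return sample;
}
```

A single comparison decides whether clipping is needed, and the sign bit selects the limit, instead of two separate range comparisons.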
Comment by Andree Buschmann (Buschel) - Monday, 17 March 2008, 22:44 GMT+2
Final version tonight, added ARM asm for channels_process_sound_chan_mono().
Comment by Andree Buschmann (Buschel) - Monday, 17 March 2008, 22:45 GMT+2
Now with attached patch.
Comment by Andree Buschmann (Buschel) - Tuesday, 18 March 2008, 12:43 GMT+2
Next patch version for further speed up. Changes:

- store 2 halfword samples via packing into 1 word
- process 2 samples in each loop -> use of ldm and stm possible

ToDo: Odd sample counts are not handled properly. On odd counts the routines process one additional, unneeded sample. This should have no ill effects, because the extra access still hits a valid buffer address (odd counts are always smaller than SAMPLE_BUF_COUNT/2).
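
The two changes and the odd-count behaviour can be sketched in C roughly as follows. This is an illustration, not the assembler from the patch: the names pack_two_samples(), clamp16() and output_mono_sketch() are hypothetical, scaling and dithering are left out, and the low halfword is assumed to hold the left channel (little-endian layout).

```c
#include <stdint.h>
#include <stddef.h>

/* Pack two 16-bit samples into one 32-bit word, first sample in the
 * low halfword (little-endian memory order is an assumption). */
static inline uint32_t pack_two_samples(int16_t first, int16_t second)
{
    return (uint32_t)(uint16_t)first | ((uint32_t)(uint16_t)second << 16);
}

/* Plain clamp to the 16-bit range, kept simple for the sketch. */
static inline int16_t clamp16(int32_t s)
{
    if (s > 32767)  return 32767;
    if (s < -32768) return -32768;
    return (int16_t)s;
}

/* Hypothetical mono output: each source sample is duplicated into the
 * left and right halfword of one output word. Two samples are handled
 * per loop iteration, which is what allows paired loads and stores
 * (ldm/stm) in assembler. */
static void output_mono_sketch(const int32_t *src, uint32_t *dst, size_t count)
{
    /* An odd count is rounded up, so one extra sample is processed.
     * Both buffers must have room for it. */
    size_t pairs = (count + 1) / 2;

    while (pairs--) {
        int16_t a = clamp16(src[0]);
        int16_t b = clamp16(src[1]);
        dst[0] = pack_two_samples(a, a);
        dst[1] = pack_two_samples(b, b);
        src += 2;
        dst += 2;
    }
}
```

Calling output_mono_sketch() with a count of 3 writes four output words, which mirrors the ToDo: the extra sample is harmless only as long as the buffers are large enough.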

Speed of this patch version vs. the C code:
- sound processing mono: +76%
- sound processing karaoke: +77%
- playback mono: +83%
- playback stereo: +41%
Comment by Andree Buschmann (Buschel) - Tuesday, 18 March 2008, 21:26 GMT+2
Next patch after review via IRC.

- within sound processing mono/karaoke the division by 2 is now done before adding/subtracting, to avoid possible overflow
- added a note regarding the behaviour with odd sample counts (tested and showed no negative effect)
- only perform yield() in dsp.c once per tick, not every 128 samples
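
To illustrate the overflow point from the first item: adding or subtracting two full-range 32-bit samples can exceed 32 bits, whereas halving each channel first keeps the result in range at the cost of one bit of precision. A minimal sketch in C, with hypothetical function names and assuming an arithmetic right shift for negative values (as on GCC/ARM):

```c
#include <stdint.h>

/* Mono downmix: halve each channel first, so the sum always fits
 * in 32 bits. Hypothetical name; for illustration only. */
static inline int32_t mono_mix_sketch(int32_t l, int32_t r)
{
    return (l >> 1) + (r >> 1);
}

/* Karaoke-style difference of the two halved channels; the same
 * ordering keeps the subtraction within 32 bits. */
static inline int32_t karaoke_diff_sketch(int32_t l, int32_t r)
{
    return (l >> 1) - (r >> 1);
}
```

With the naive order, (l + r) >> 1, two samples near INT32_MAX would overflow before the shift is applied.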
