Rockbox

FS#11759 - Rearrange libmad synthesis memory accesses for ARM

Attached to Project: Rockbox
Opened by MichaelGiacomelli (saratoga) - Monday, 15 November 2010, 05:55 GMT+2
Task Type Patches
Category Codecs
Status New
Assigned To No-one
Player Type All players
Severity Low
Priority Normal
Reported Version Release 3.6
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Private No

Details

Work in progress patch. Currently decodes audio but with some glitches. Has a small mountain of debug code included.

The basic idea is to rearrange the D filter coefficients in the synthesis filter so that pairs of them are used sequentially. This is not easy because the taps need to be loaded in the seemingly random order needed by the audio samples. However, this rearrangement seems to be possible:

0 1 2 3 4 5 6 7 (original sequence)
0 2 1 3 4 6 5 7 (new sequence)

The complication is that the code assumes that it can start a new filter at any offset, even odd ones, which means each and every filter needs to be rewritten 4 times, once for each of the 4 possible alignments. This patch does that.
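For illustration only (this is not code from the patch): the reordering listed above swaps the middle two entries of every group of four. A minimal sketch, assuming the table length is a multiple of four; the helper name reorder_taps is hypothetical and says nothing about how D_sort.dat was actually generated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy src to dst so that every group of four
 * entries 0 1 2 3 is stored as 0 2 1 3. n must be a multiple of 4. */
static void reorder_taps(const int32_t *src, int32_t *dst, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        dst[i + 0] = src[i + 0];
        dst[i + 1] = src[i + 2]; /* swapped with the next entry */
        dst[i + 2] = src[i + 1];
        dst[i + 3] = src[i + 3];
    }
}
```

Applying the same swap twice restores the original order, so the same routine can be used to check a reordered table against the original one.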

Once I'm certain that it works, I intend to convert the D coefficients to packed 16 bit values, then use packed 16 bit multiply instructions on ARMv5E+. This should lead to a small speedup on ARMv4 (just because ldm instructions can be used instead of ldr) and a very large speedup on ARM9E and ARM11 (because packed multiplies are tremendously faster and much easier to pipeline).
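To make the packing idea concrete, here is a portable C model, not taken from the patch. It assumes two signed 16 bit coefficients share one 32 bit word and that a 32 bit sample is multiplied by one half of that word, keeping the top 32 bits of the 48 bit product, which is how the ARMv5E SMLAWB/SMLAWT instructions behave. Which instructions the final ASM will use is not stated in this task, and the function names are made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two signed 16 bit coefficients into one word:
 * lo goes into bits 0..15, hi into bits 16..31. */
static uint32_t pack_pair(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Model of acc + ((x * bottom half) >> 16). Relies on arithmetic right
 * shift and two's complement narrowing, as on gcc and clang. */
static int32_t mla_w_bottom(int32_t acc, int32_t x, uint32_t packed)
{
    int16_t c = (int16_t)(packed & 0xffffu);
    return acc + (int32_t)(((int64_t)x * c) >> 16);
}

/* Same, but using the top half of the packed word. */
static int32_t mla_w_top(int32_t acc, int32_t x, uint32_t packed)
{
    int16_t c = (int16_t)(packed >> 16);
    return acc + (int32_t)(((int64_t)x * c) >> 16);
}
```

One 32 bit load then supplies two coefficients, which is where the saving over separate ldr instructions would come from.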
   libmad_synth_realign_v01.patch (51.5 KiB)
 ../apps/codecs/libmad/D_sort.dat |  544 ++++++++++++++++++++++++++++++++++++++
 ../apps/codecs/libmad/synth.c    |  551 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 1037 insertions(+), 58 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Tuesday, 16 November 2010, 03:20 GMT+2
Corrected a bug in one of the filters.
   libmad_synth_realign_v02.patch (54.6 KiB)
 apps/codecs/libmad/D_sort.dat |  592 ++++++++++++++++++++++++++++++++++++++++++
 apps/codecs/libmad/synth.c    |  571 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 1106 insertions(+), 57 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Tuesday, 16 November 2010, 03:31 GMT+2
Above patch is confirmed to produce bit-for-bit identical output to SVN using lame_128k.mp3.
Comment by MichaelGiacomelli (saratoga) - Tuesday, 16 November 2010, 03:38 GMT+2
Above patch without debug code.
   libmad_synth_realign_v03.patch (44 KiB)
 apps/codecs/libmad/D_sort.dat |  592 ++++++++++++++++++++++++++++++++++++++++++
 apps/codecs/libmad/synth.c    |  448 +++++++++++++++++++++++++------
 2 files changed, 946 insertions(+), 94 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Wednesday, 17 November 2010, 21:35 GMT+2
Converted to use 16 bit D coefficients. The C code has an RMS error of 1.3 PCM levels and a peak error of 8 levels for lame_128k.mp3. This seems more than acceptable.

Edit: Note that volume is off in that patch. I'll correct this later.
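For reference, error figures of this kind can be computed along the following lines. This is a sketch under assumptions, not the tool used for the numbers above: it assumes both decodes are available as equal-length buffers of 16 bit PCM, and the names pcm_error and compare_pcm are hypothetical. It returns the mean squared error; the RMS figure is its square root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type: mean squared error and peak absolute error,
 * both in PCM levels. RMS error is the square root of mean_sq. */
struct pcm_error {
    double mean_sq;
    int peak;
};

/* Compare a test decode against a reference decode, sample by sample. */
static struct pcm_error compare_pcm(const int16_t *ref, const int16_t *test,
                                    size_t n)
{
    struct pcm_error e = {0.0, 0};
    double sum_sq = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        int d = (int)test[i] - (int)ref[i];
        if (abs(d) > e.peak)
            e.peak = abs(d);
        sum_sq += (double)d * (double)d;
    }
    e.mean_sq = sum_sq / (double)n;
    return e;
}
```

A peak of 0 corresponds to bit-identical output, so the same routine covers the exact comparison against SVN as well.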
   libmad_synth_realign_v04.patch (46.3 KiB)
 apps/codecs/libmad/D_sort.dat |  592 ++++++++++++++++++++++++++++++++++++++++++
 apps/codecs/libmad/synth.c    |  498 +++++++++++++++++++++++++++--------
 2 files changed, 982 insertions(+), 108 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Saturday, 04 December 2010, 22:21 GMT+2
Thought of a better way to rearrange the D coefficients. This one is both much simpler and should give significantly better performance on ARM11. In this version the D coefficients are split into two tables, D_even and D_odd, which unsurprisingly contain the even and odd coefficients from the old table. The dewindowing code is then unrolled and rearranged to accommodate the new even and odd tables.

As a result, all memory accesses are now fully sequential, each D coefficient can be packed into a 32 bit pair, and all windowed sample data are used to generate 2 samples each time they are loaded.
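A rough sketch of the even/odd split, for illustration only. It assumes a flat table of even length and uses made-up helper names (split_even_odd, dot); the real D table layout and the contents of the .dat files in the patch are not reproduced here. The point is that a stride-2 walk over the old table becomes a plain sequential walk over one of the new ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split an interleaved table d[0..n-1] into its even- and odd-indexed
 * entries. n must be even; even[] and odd[] each receive n/2 entries. */
static void split_even_odd(const int32_t *d, size_t n,
                           int32_t *even, int32_t *odd)
{
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; i++) {
        even[i] = d[2 * i];
        odd[i]  = d[2 * i + 1];
    }
}

/* Sequential multiply-accumulate over n samples and n coefficients. */
static int64_t dot(const int32_t *x, const int32_t *c, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int64_t)x[i] * c[i];
    return acc;
}
```

With the tables split this way, the same loaded samples x can be passed to dot() once with the even table and once with the odd table, which mirrors the idea of producing 2 outputs per load.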

TODO:
Remove about 50KB of debug code from that patch.
Write ASM version.
   libmad_synth_realign_v06.patch (92.6 KiB)
 apps/codecs/libmad/layer3.c   |   50 +-
 apps/codecs/libmad/D_odd.dat  |  307 +++++++++++++
 apps/codecs/libmad/D_sort.dat |  592 ++++++++++++++++++++++++++
 apps/codecs/libmad/D_even.dat |  307 +++++++++++++
 apps/codecs/libmad/synth.c    |  928 ++++++++++++++++++++++++++++++++++++------
 5 files changed, 2046 insertions(+), 138 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Sunday, 05 December 2010, 05:27 GMT+2
Above but with a lot of bugs fixed. Output should be correct now.
   libmad_synth_realign_v07.patch (91.7 KiB)
 apps/codecs/libmad/D_odd.dat  |  307 ++++++++++++
 apps/codecs/libmad/D_sort.dat |  592 ++++++++++++++++++++++++
 apps/codecs/libmad/D_even.dat |  307 ++++++++++++
 apps/codecs/libmad/synth.c    | 1036 +++++++++++++++++++++++++++++++++++++-----
 4 files changed, 2128 insertions(+), 114 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Tuesday, 07 December 2010, 17:38 GMT+2
Overlooked some code in the above patch. Now fixed.

   libmad_synth_realign_v08.patch (92.9 KiB)
 apps/codecs/libmad/D_odd.dat  |  307 +++++++++++
 apps/codecs/libmad/D_sort.dat |  592 +++++++++++++++++++++++
 apps/codecs/libmad/D_even.dat |  307 +++++++++++
 apps/codecs/libmad/synth.c    | 1077 +++++++++++++++++++++++++++++++++++++-----
 4 files changed, 2169 insertions(+), 114 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Thursday, 09 December 2010, 22:08 GMT+2
Finally converted all filters to use the new even/odd coefficients. Removed old 'sorted' coefficients introduced in the original patch. Output is identical to SVN.
   libmad_synth_realign_v09.patch (64.7 KiB)
 apps/codecs/libmad/D_odd.dat  |  307 ++++++++++++
 apps/codecs/libmad/D_even.dat |  307 ++++++++++++
 apps/codecs/libmad/synth.c    | 1044 +++++++++++++++++++++++++++++++++++++-----
 3 files changed, 1542 insertions(+), 116 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Saturday, 11 December 2010, 23:46 GMT+2
* Delete a lot of debug code
* Reintroduce macros for code that won't be moved into the .S file
   libmad_synth_realign_v10.patch (62.9 KiB)
 apps/codecs/libmad/D_odd.dat  |  307 ++++++++++++
 apps/codecs/libmad/D_even.dat |  307 ++++++++++++
 apps/codecs/libmad/synth.c    | 1042 +++++++++++++++++++++++++++++++++++++-----
 3 files changed, 1540 insertions(+), 116 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Monday, 13 December 2010, 01:53 GMT+2
* Introduce ASM code for the 4 macro functions that won't be included in the .S file.
   libmad_synth_realign_v11.patch (69.3 KiB)
 apps/codecs/libmad/D_odd.dat  |  307 +++++++++++
 apps/codecs/libmad/D_even.dat |  307 +++++++++++
 apps/codecs/libmad/synth.c    | 1162 +++++++++++++++++++++++++++++++++++++-----
 3 files changed, 1658 insertions(+), 118 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Tuesday, 14 December 2010, 04:56 GMT+2
* Clean up most of the debug and dead code
* Finish reordering the body of the for loop

Pretty much all that's left is actually converting the core of each loop to ASM.
   libmad_synth_realign_v12.patch (59.8 KiB)
 apps/codecs/libmad/D_odd.dat  |  307 ++++++++++++++
 apps/codecs/libmad/D_even.dat |  307 ++++++++++++++
 apps/codecs/libmad/synth.c    |  886 ++++++++++++++++++++++++++++++++++++------
 3 files changed, 1375 insertions(+), 125 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Sunday, 19 December 2010, 23:24 GMT+2
* Rearranged arrays in memory to consolidate pointers and save 2 registers
* Wrote the first half of the first sb_sample function in assembly
   libmad_synth_realign_v13.patch (65.4 KiB)
 apps/codecs/libmad/D_odd.dat            |  307 ++++++++++
 apps/codecs/libmad/synth_full_arm_v5e.S |  140 ++++
 apps/codecs/libmad/D_even.dat           |  307 ++++++++++
 apps/codecs/libmad/synth.c              |  903 +++++++++++++++++++++++++++-----
 4 files changed, 1532 insertions(+), 125 deletions(-)

Comment by MichaelGiacomelli (saratoga) - Friday, 08 April 2011, 04:11 GMT+2
Added a simple test file to try debugging the asm code. Not sure why it currently crashes on decode, probably a dumb mistake somewhere.
   libmad_synth_realign_v14.patch (72.2 KiB)
 apps/codecs/libmad/D_odd.dat              |  307 +++++++++
 apps/codecs/libmad/D_even.dat             |  307 +++++++++
 apps/codecs/libmad/SOURCES                |    1 
 apps/codecs/libmad/synth.c                |  939 ++++++++++++++++++++++++++----
 apps/codecs/libmad/synth_full_arm_test2.S |  231 +++++++
 5 files changed, 1660 insertions(+), 125 deletions(-)
