FS#11235 - libmad asm tweaks for ARM9 and above

Attached to Project: Rockbox
Opened by MichaelGiacomelli (saratoga) - Sunday, 02 May 2010, 06:30 GMT
Last edited by MichaelGiacomelli (saratoga) - Monday, 29 November 2010, 22:41 GMT
Task Type: Patches
Category: Codecs
Status: Closed
Assigned To: No-one
Operating System: All players
Severity: Low
Priority: Normal
Reported Version: Release 3.4
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No


libmad is full of code that doesn't take pipeline interlocks on ARM9 and above into account. This patch fixes some of it and saves about 200 kHz.
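
As a rough illustration of the kind of interlock meant here, the sketch below counts cycles for a toy instruction stream under one simplified assumption: every instruction issues in one cycle, and an instruction that reads a register written by the load immediately before it pays one extra stall cycle. The `Insn` type, the `count_cycles` helper, and the four-instruction sequences are hypothetical and are not taken from libmad or from the patch. Real ARM9 timings also include other result delays that this model ignores.

```python
from typing import NamedTuple, Tuple


class Insn(NamedTuple):
    text: str               # assembly-like text, for display only
    dest: str               # register written by the instruction
    srcs: Tuple[str, ...]   # registers read by the instruction
    is_load: bool = False   # True for ldr-style instructions


def count_cycles(program):
    """Count cycles in a toy model (an assumption, not a real ARM9 timing
    table): one cycle per instruction, plus one stall cycle whenever an
    instruction reads the register written by the load directly before it."""
    cycles = 0
    prev = None
    for insn in program:
        cycles += 1
        if prev is not None and prev.is_load and prev.dest in insn.srcs:
            cycles += 1  # load-use stall
        prev = insn
    return cycles


# Each load is consumed by the very next instruction.
naive = [
    Insn("ldr r0, [r8]", "r0", ("r8",), is_load=True),
    Insn("add r2, r2, r0", "r2", ("r2", "r0")),
    Insn("ldr r1, [r9]", "r1", ("r9",), is_load=True),
    Insn("add r3, r3, r1", "r3", ("r3", "r1")),
]

# Same work, but both loads are issued before either result is used.
scheduled = [
    Insn("ldr r0, [r8]", "r0", ("r8",), is_load=True),
    Insn("ldr r1, [r9]", "r1", ("r9",), is_load=True),
    Insn("add r2, r2, r0", "r2", ("r2", "r0")),
    Insn("add r3, r3, r1", "r3", ("r3", "r1")),
]

if __name__ == "__main__":
    print("naive:    ", count_cycles(naive), "cycles")
    print("scheduled:", count_cycles(scheduled), "cycles")
```

In this model the reordered sequence does the same work with no stall cycles, while the naive order pays one stall per load. A core without a load-use interlock would see no difference between the two orderings.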

Additionally, I noticed that some of the ASM defines (PROD_ODDBACK_A, etc.) are never used. Judging by the SVN logs, they probably should never have been committed. Does anyone know anything about them?

Closed by MichaelGiacomelli (saratoga)
Monday, 29 November 2010, 22:41 GMT
Reason for closing: Accepted
Additional comments about closing: Committed in r28624 and r28710. The ASM is now fairly close to optimal for ARM9. Further improvements will require algorithmic changes; see FS#11759.
Comment by Andree Buschmann (Buschel) - Wednesday, 12 May 2010, 06:18 GMT
Merged against r25938.

Edit: Tested on PP502x: 0.1 MHz slower. It sounds like we need separate compile options for ARMv4 and below versus above ARMv4.
Comment by Dave Hooper (stripwax) - Wednesday, 19 May 2010, 13:12 GMT
What if you use r12 instead of r5? Does that improve anything on PP502x?
Comment by Andree Buschmann (Buschel) - Monday, 24 May 2010, 11:44 GMT
Dave, interestingly, decoding is exactly as fast as SVN when using r12 instead of r5.
Comment by MichaelGiacomelli (saratoga) - Saturday, 20 November 2010, 21:38 GMT
Committed libmadasmv3.patch.


The new patch applies the above scheduling to all of synth_full_arm.S. It saves about 2 MHz on ARM9 but still needs some cleanup and testing.

Edit: Verified on amsv1, amsv2, and nano2g: a 2 MHz speedup on each.