Rockbox

  • Status Closed
  • Percent Complete 0%
  • Task Type Patches
  • Assigned To hohensoh
  • Severity Low
  • Priority Very Low
  • Due in Version Undecided
  • Due Date Undecided
Attached to Project: Rockbox
Opened by amiconn - 2004-03-02
Last edited by hohensoh - 2004-04-05

FS#2039 - 20% faster bitswap, turbocharged copy_read_sectors etc.

This patch applies to 4 files:

bitswap.S:
- 20 % faster (from 18 to 15 clock cycles per loop)
- corrected alignment to longword, which saves space

ata.c:
- new, turbo-charged copy_read_sectors in assembler

Speed figures (from clock cycle counting, taking
pipeline stalls into account):

            | word-aligned      | unaligned
------------+-------------------+------------------
C original  | 2.02 MB/s (100 %) | 1.71 MB/s (100 %)
[IDC]Dragon | 3.42 MB/s (169 %) | 2.22 MB/s (130 %)
new version | 4.76 MB/s (236 %) | 3.33 MB/s (195 %)
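
The percentages in the table are each routine's throughput relative to the C original in the same column. A minimal sketch of that arithmetic (the function name relative_percent is made up for illustration and is not part of the patch):

```c
/* Throughput (or any rate) relative to a baseline, as a percentage
   rounded to the nearest integer. Hypothetical helper, not from ata.c. */
int relative_percent(double value, double baseline)
{
    return (int)(value * 100.0 / baseline + 0.5);
}
```

For example, relative_percent(4.76, 2.02) gives the 236 % shown for the new version with word-aligned data. The same ratio applied to cycle counts, relative_percent(18, 15), gives 120, which matches the 20 % figure quoted for bitswap.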

If there are wait states, the speed differences may be
smaller, but the ranking of the routines should be
preserved. If there are memory wait states, my routine
for unaligned data should gain even more over the
others, since it writes words, not bytes.
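
To illustrate the "words, not bytes" idea for an unaligned buffer, here is a rough sketch in portable C. It is not the assembler from the patch. The function names, the store_pair helper, and the byte order (low byte of each 16-bit word stored first) are assumptions made for the example. The idea: write one leading byte so the destination becomes even, then pair the leftover byte of each word with the first byte of the next word so every further write is a full 16-bit store, and finish with one trailing byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stands in for one 16-bit store on the target.
   memcpy keeps the sketch portable and free of aliasing problems. */
static void store_pair(uint8_t *dst, uint8_t first, uint8_t second)
{
    uint8_t pair[2] = { first, second };
    memcpy(dst, pair, sizeof pair);
}

/* Reference version: two byte writes per word. */
void copy_words_bytewise(uint8_t *dst, const uint16_t *src, size_t wordcount)
{
    for (size_t i = 0; i < wordcount; i++) {
        *dst++ = (uint8_t)(src[i] & 0xff);
        *dst++ = (uint8_t)(src[i] >> 8);
    }
}

/* Word-pairing version, intended for an odd destination address:
   one byte write, wordcount - 1 paired stores, one byte write. */
void copy_words_unaligned(uint8_t *dst, const uint16_t *src, size_t wordcount)
{
    if (wordcount == 0)
        return;

    uint16_t w = src[0];
    *dst++ = (uint8_t)(w & 0xff);        /* leading byte; dst is now even */
    uint8_t carry = (uint8_t)(w >> 8);   /* second byte waits for a partner */

    for (size_t i = 1; i < wordcount; i++) {
        w = src[i];
        store_pair(dst, carry, (uint8_t)(w & 0xff));
        dst += 2;
        carry = (uint8_t)(w >> 8);
    }
    *dst = carry;                        /* trailing byte */
}
```

Both functions produce the same bytes in the destination. The second one needs roughly half as many write accesses, which is where memory wait states would make a difference.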

This is not enabled by default. If you want it compiled
in, you have to comment out "#define PREFER_C".
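
A sketch of how such a compile-time switch typically works. Only the PREFER_C name comes from the patch description. The copy_variant function and the layout are invented for illustration, and the real ata.c may be arranged differently.

```c
#define PREFER_C   /* comment this line out to build the assembler variant */

#ifdef PREFER_C
/* Default build: the plain C routine is compiled in. */
const char *copy_variant(void) { return "C"; }
#else
/* PREFER_C commented out: the assembler routine is compiled in instead. */
const char *copy_variant(void) { return "assembler"; }
#endif
```

With the #define in place the C branch is selected, so the faster code stays out of the build until someone removes or comments out that line.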

descramble.S:
- corrected to the desired longword alignment, saves space

lcd.c:
- removed the now-unnecessary variable "oldlevel" from
lcd_write_data

The patch is against the current Rockbox source (2004-03-02).

Closed by  hohensoh
2004-04-05 08:36
Reason for closing:  Accepted
Additional comments about closing:  


Although the fast read is disabled by default, I think we can
close this as a patch.

Project Manager

Nice work! I'll review and test it ASAP.

Alignment and bitswap changes committed (14% increase in the real
world).
ATA code is pending; I measured a real-world speedup of
18%/34% for aligned/misaligned.

New version of the ata.c patch. It doesn't include bitswap and the other
fixes this time since they are already committed. The new
ata.c patch contains:

- further improved read speed for both aligned/unaligned (only
  about 3% this time)
- shorter code (6 instructions fewer) to save IRAM
- scratches one register less than my old code
- alternative, only slightly optimized (versus C version)
  routine to test with slow drives that have problems with the
  turbo version. This one is very short.

Look in the ata.c header area for an additional #define.

Corrected bug within alternative assembler routine.

