• Status Closed
  • Percent Complete
  • Task Type Patches
  • Category Drivers
  • Assigned To No-one
  • Operating System All players
  • Severity Low
  • Priority Very Low
  • Reported Version Release 3.4
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by kugel. - 2009-11-23
Last edited by funman - 2010-06-23

FS#10805 - AMS Sansas: Vastly increase SD driver performance

This patch greatly improves SD driver performance, at the cost of memory.

The memory cost is zero for 8MB targets, because the sd buffer is used to IRAM, which is nearly unused as of now (this isn’t visible in the RAM usage which rockbox-info.txt gives). It’s about 10k for 2MB targets (additionally the buffer cannot be in IRAM on those).

I’ve made some analysis on different settings: increasing the aligned_buffer, moving it to iram, not using it for properly aligned transfers. The outcome of the analysis is visible here:

It turns out that a big ( aligned_buffer at IRAM gives a huge (~250%) transfer rate increase over SVN. A 32*SECTOR_SIZE buffer gives good performance too, but is noticable slower. Using the passed buffer (if word aligned) directly together with cache coherency functions is very fast too, but only increases performance on aligned transfers.

So, I decided to use the fastest setting on 8MB Samsas (64*SECTOR_SIZE buffer at IRAM).
on 2MB samsas I used a combo of 32*SECTOR_SIZE and using the passed buffer directly (this saves 15k RAM, at the cost of speed, code complexity and messing with the caches).

There’s 2 problems: a) This separates the way the driver works for the Samsas (not so much code wise, but logic wise), I don’t really like that.
b) the code path for 2MB targets gives data and prefetch aborts on my clip, but works fine on my fuze. I have no idea why!

I honestly think the 2MB targets, being flash devices, should also use the fastest setting. That would cost 25k over SVN (this patch adds 10k anyway, so it’s effectively only 15k more RAM), but I think the unified code paths and better performance are worth it IMO.

Closed by  funman
2010-06-23 04:34
Reason for closing:  Accepted
Additional comments about closing:   Warning: Undefined array key "typography" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 371 Warning: Undefined array key "camelcase" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 407


What do you actually gain from the better performance? As in, for a user just listening to music, what is this increased performance going to show up as?

Faster buffering is nice, both because theres less overhead and because the flash spends less time powered up.

And of course once we have USB it will make that a lot better.

Did you measure impact on battery life ?

No, but I doubt it’s noticeable. On my fuze, rebuffering takes ~1s with it, instead of ~2. With 128k mp3 it would rebuffer every 4 (and a few seconds) minutes, so the time spend on rebuffering is negible in either case. So, battery life could be incresed by a few minutes if at all.

Apart from buffering, presumably this would improve user experience for loading large files, large playlists, pictureflow, etc ?

Picture flow is definitely improved. I can’t notice any “?” covers when scrolling very quickly anymore (those “?” appear for a) no album art for a particular album or b) didn’t load the aa from the disk yet).

Does this patch works with the latest changes in the AMS SD code?

topik commented on 2009-12-14 13:37

It doesn’t work against r23989 (anymore). 5 of 8 hunks fail when patching. I’d like to try this patch so please make it work with the current state of ata_sd_as3525.c.

Sync to r24003.

This boots up just fine but panics when I try to play:

Dir entry 9 in sector 3 is not free! 62 00 69 00

I don’t think that just syncing is a good idea, since there were serious changes in the SD code recently.

PL081 DMA controller doc says:
“You must align the source and destination addresses to the source and destination width”

We use 8 words width so buffer needs to be aligned on 32 bytes.

I just tried something like this on Clip+, and also defining STORAGE_WANTS_ALIGN, but got crashes in codecs, while mpegplayer worked fine.

Idea: perhaps the buffers are not up-aligned and last cache line gets used anyway?

My try at aligning up and down the buffers on 8*4 = 32.

32 is the DMA requirement and the cacheline size, so both up & down cachelines aren’t reused after we cleaned/dumped them.

usb_storage not complete.

calls to read()/write() aren’t aligned because they use provided buffer directly.

test_disk still doesn’t work, not sure why..

I forgot the 127 sectors limit of MCI_DATA_LENGTH

test_disk still works when forcing copy to uncached buffer, else it doesn’t

test_disk passes but playback is buggy especially for lossless


Available keyboard shortcuts


Task Details

Task Editing