FS#10805 - AMS Sansas: Vastly increase SD driver performance
Attached to Project: Rockbox
Opened by Thomas Martitz (kugel.) - Monday, 23 November 2009, 02:23 GMT+2
Last edited by Rafaël Carré (funman) - Wednesday, 23 June 2010, 06:34 GMT+2
Details
This patch greatly improves SD driver performance, at the cost of memory.

The memory cost is zero for 8MB targets, because the SD buffer is moved to IRAM, which is nearly unused as of now (this isn't visible in the RAM usage that rockbox-info.txt reports). It's about 10k for 2MB targets (additionally, the buffer cannot be in IRAM on those).

I've analysed several settings: increasing the aligned_buffer, moving it to IRAM, and not using it for properly aligned transfers. The outcome of the analysis is visible here: http://www.alice-dsl.net/simonemartitz/rockbox/fuze_sd_performance.pdf

It turns out that a big aligned_buffer in IRAM gives a huge (~250%) transfer rate increase over SVN. A 32*SECTOR_SIZE buffer gives good performance too, but is noticeably slower. Using the passed buffer directly (if it is word aligned) together with the cache coherency functions is very fast too, but only increases performance on aligned transfers.

So I decided to use the fastest setting on 8MB Sansas (64*SECTOR_SIZE buffer in IRAM). On 2MB Sansas I used a combination of a 32*SECTOR_SIZE buffer and using the passed buffer directly (this saves 15k RAM, at the cost of speed, code complexity and messing with the caches).

There are 2 problems:
a) This separates the way the driver works for the Sansas (not so much code-wise, but logic-wise). I don't really like that.
b) The code path for 2MB targets gives data and prefetch aborts on my Clip, but works fine on my Fuze. I have no idea why!

I honestly think the 2MB targets, being flash devices, should also use the fastest setting. That would cost 25k over SVN (this patch adds 10k anyway, so it's effectively only 15k more RAM), but I think the unified code paths and better performance are worth it.
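For readers following the description, here is a minimal sketch of the 2MB "combo" path: transfer straight into the caller's buffer when it is word aligned, otherwise go through the aligned bounce buffer in chunks. This is not the patch's code. `sketch_read_sectors`, `fake_dma_read`, `bounce` and `BOUNCE_SECTORS` are hypothetical names, the DMA transfer is simulated with memcpy from an in-memory card image, and a 512-byte SECTOR_SIZE is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE    512  /* assumed sector size in bytes */
#define BOUNCE_SECTORS 32   /* 32 as on 2MB targets; the 8MB setting would be 64 */

/* Bounce buffer (hypothetical name); the real aligned_buffer may live in IRAM. */
static uint32_t bounce[BOUNCE_SECTORS * SECTOR_SIZE / sizeof(uint32_t)];

/* Hypothetical stand-in for a DMA read: copies from an in-memory card image. */
static void fake_dma_read(const unsigned char *card, unsigned start,
                          unsigned count, void *dst)
{
    memcpy(dst, card + (size_t)start * SECTOR_SIZE,
           (size_t)count * SECTOR_SIZE);
}

/* Reads `count` sectors into buf; returns how many transfers were issued. */
static unsigned sketch_read_sectors(const unsigned char *card, unsigned start,
                                    unsigned count, void *buf)
{
    unsigned char *out = buf;
    unsigned transfers = 0;

    assert(card != NULL && buf != NULL);

    if (count > 0 && ((uintptr_t)buf & 3) == 0) {
        /* Word aligned: transfer straight into the caller's buffer,
         * counted as one transfer here. Real code would need cache
         * maintenance around this step. */
        fake_dma_read(card, start, count, buf);
        return 1;
    }

    /* Unaligned: go through the bounce buffer, one chunk at a time. */
    while (count > 0) {
        unsigned n = count < BOUNCE_SECTORS ? count : BOUNCE_SECTORS;

        fake_dma_read(card, start, n, bounce);
        memcpy(out, bounce, (size_t)n * SECTOR_SIZE);
        out += (size_t)n * SECTOR_SIZE;
        start += n;
        count -= n;
        transfers++;
    }
    return transfers;
}
```

With these assumed values, a 64-sector buffer is 32 KiB and a 32-sector buffer is 16 KiB, which is in line with the roughly 15k saving quoted above. A bigger buffer means fewer, larger transfers for unaligned requests.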
Closed by Rafaël Carré (funman)
Wednesday, 23 June 2010, 06:34 GMT+2
Reason for closing: Accepted
Additional comments about closing: r27074
And of course once we have USB it will make that a lot better.
This boots up just fine but panics when I try to play:
Dir entry 9 in sector 3 is not free! 62 00 69 00
"You must align the source and destination addresses to the source and destination width"
We use 8 words width so buffer needs to be aligned on 32 bytes.
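To spell out the arithmetic: 8 words of 4 bytes each is 32 bytes. A rough sketch of declaring and checking such a buffer follows. `DMA_BURST_WORDS`, `DMA_ALIGN`, `dma_buf`, `is_dma_aligned` and `dma_align_up` are made-up names for illustration, and C11 `alignas` is used here as an assumption in place of whatever attribute the driver actually uses.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_BURST_WORDS 8                              /* "8 words width" */
#define DMA_ALIGN (DMA_BURST_WORDS * sizeof(uint32_t)) /* 8 * 4 = 32 bytes */

static_assert(DMA_ALIGN == 32, "8 words of 4 bytes should be 32 bytes");

/* Example buffer placed on a 32-byte boundary (hypothetical name). */
static alignas(32) unsigned char dma_buf[512];

/* True if p sits on a DMA_ALIGN boundary. */
static bool is_dma_aligned(const void *p)
{
    return ((uintptr_t)p % DMA_ALIGN) == 0;
}

/* Rounds an address up to the next DMA_ALIGN boundary. */
static uintptr_t dma_align_up(uintptr_t addr)
{
    return (addr + DMA_ALIGN - 1) & ~(uintptr_t)(DMA_ALIGN - 1);
}
```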
I just tried something like this on Clip+, and also defining STORAGE_WANTS_ALIGN, but got crashes in codecs, while mpegplayer worked fine.
Idea: perhaps the buffers are not up-aligned and last cache line gets used anyway?
32 is the DMA requirement and the cacheline size, so both up & down cachelines aren't reused after we cleaned/dumped them.
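As an illustration of the point about the first and last cache lines, the sketch below computes which 32-byte lines a buffer touches and whether it shares any of them with neighbouring data. The helper names (`line_down`, `line_up`, `lines_touched`, `owns_its_lines`) are hypothetical, and the 32-byte line size is taken from the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 32  /* cache line size as stated in the comment above */

/* Rounds an address down to the start of its cache line. */
static uintptr_t line_down(uintptr_t a)
{
    return a & ~(uintptr_t)(CACHELINE - 1);
}

/* Rounds an address up to the next cache line boundary. */
static uintptr_t line_up(uintptr_t a)
{
    return line_down(a + CACHELINE - 1);
}

/* Number of cache lines touched by the range [addr, addr + len). */
static size_t lines_touched(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;
    return (line_up(addr + len) - line_down(addr)) / CACHELINE;
}

/* True if the buffer shares no cache line with neighbouring data,
 * i.e. both its start and its end fall on line boundaries. */
static bool owns_its_lines(uintptr_t addr, size_t len)
{
    assert(len > 0);
    return addr % CACHELINE == 0 && len % CACHELINE == 0;
}
```

A 512-byte buffer starting 4 bytes past a line boundary touches 17 lines instead of 16, and both its first and last lines are shared with whatever sits next to it. That is the case where cleaning or invalidating around a DMA transfer can affect unrelated data.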
usb_storage not complete.
calls to read()/write() aren't aligned because they use provided buffer directly.
test_disk still doesn't work, not sure why..
test_disk still works when forcing copy to uncached buffer, else it doesn't