Rockbox

Tasklist

FS#10015 - Use chained transfer descriptor to speed up USB transfers on PP and iMX31

Attached to Project: Rockbox
Opened by Frank Gevaerts (fg) - Thursday, 12 March 2009, 19:09 GMT
Last edited by Frank Gevaerts (fg) - Sunday, 29 March 2009, 19:59 GMT
Task Type Patches
Category Drivers
Status Closed
Assigned To No-one
Operating System All players
Severity Low
Priority Normal
Reported Version Version 3.1
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Use chained transfer descriptors to allow larger blocks of data in a single transfer. This speeds up USB transfers significantly.
This task depends upon

Closed by  Frank Gevaerts (fg)
Sunday, 29 March 2009, 19:59 GMT
Reason for closing:  Accepted
Additional comments about closing:  Committed as r20570
Comment by Frank Gevaerts (fg) - Thursday, 12 March 2009, 21:11 GMT
This crashes on gigabeat S for some reason. Taking out some error handling (that I've never seen triggered) helps. The problem is that this also takes size of transfered data, which doesn't matter for storage, but it does for e.g. serial. This means that this patch, while good for testing, isn't ready for commit.
Comment by Boris Gjenero (dreamlayers) - Thursday, 12 March 2009, 22:08 GMT
It seems like each transfer descriptor can handle 20K. Why are they being limited to 16K?
Comment by Frank Gevaerts (fg) - Thursday, 12 March 2009, 22:13 GMT
They can handle 20K, if the data is 4K-aligned. If not, you can only get 16K (or actually, you can get whatever fits. Just assuming 16K is safe and simple)
Comment by Przemysław Hołubowski (p.h.) - Friday, 13 March 2009, 09:37 GMT
I have tested the patch on H10-20 with 60GB HDD using UDMA1 mode.

USB wite (PC->H10) speed is 9.4MB/s (about 1.5MB/s faster then using OF) without the patch and about 11.2MB/s with the patch, and USB read speed is 8MB/s (OF attains 14-18MB/s) no matter the patch is applied or not.
Comment by Tom (frameshift) - Saturday, 14 March 2009, 02:27 GMT
Tested on my e250v1.
3.1mb/s write, 6.1mb/s read without patch.
3.3mb/s write, 6.1mb/s read with patch.
Comment by MichaelGiacomelli (saratoga) - Saturday, 14 March 2009, 04:46 GMT
I tried chained-tds-2.diff on my PC that had disconnects with Rockbox's USB mode and I saw a pretty big improvement. With this patch I didn't get a disconnect until over 5GB had been transfered. Without it I typically got one within a couple hundred MB, with 500MB being relatively rare.
Comment by Andree Buschmann (Buschel) - Saturday, 14 March 2009, 13:29 GMT
Tested chained-tds-2.diff with r20323 and  FS#9708  v0.6 (UDMA1 and UDMA4) on my 5.5G.

Keeping the CPU at 80MHz I get:
PIO4: 8100 KB/s write, 4700 KB/s read
UDMA1: 7100 KB/s write, 9400 KB/s read
UDMA4: 7100 KB/s write, 11300 KB/s read

When setting the CPU to 100MHz I get:
UDMA1: 8100 KB/s write, 9400 KB/s read
UDMA4: 8100 KB/s write, 11300 KB/s read

Edit: Oops, corrected read/write.
Comment by Frank Gevaerts (fg) - Sunday, 29 March 2009, 16:44 GMT
This one adds proper length handling, but it gets data aborts on gigabeat S at 0x59a18 (see the attached elf file for context)

Edit: I forgot the buffer size change to usb_storage in this one. It needs that too of course to get proper speed benefits
Comment by Toni (ahellmann) - Sunday, 29 March 2009, 19:06 GMT
The data abort seems to happen while accessing td->reserved in:

while(td!=QH_NEXT_TERMINATE)
{
length += ((td->reserved & DTD_RESERVED_LENGTH_MASK) -
((td->size_ioc_sts & DTD_PACKET_SIZE) >> DTD_LENGTH_BIT_POS));

td=td->next_td_ptr;
}

By the way, do 'transfer_descriptor' and 'queue_head' need to be byte packed? If the related struct members are accessed often, there might be a performance loss.
Comment by Frank Gevaerts (fg) - Sunday, 29 March 2009, 19:31 GMT
I don't really see how this can happen, as td->reserved should be nicely aligned. I'll check if it never jumps to an invalid location.

transfer_descriptor and queue_head are used directly by the hardware, so they need to be precisely packed and aligned. They only consist of 4-byte int fields though, so I don't think it should make any difference
Comment by Frank Gevaerts (fg) - Sunday, 29 March 2009, 19:57 GMT
It was a bad pointer. Fixed now (and committed)

Loading...