|
Rockbox mail archiveSubject: Re: Various bug fixes and improvements - where to put?Re: Various bug fixes and improvements - where to put?
From: Jens Arnold <arnold-j_at_t-online.de>
Date: Tue, 02 Mar 2004 09:01:31 +0100 Hello [IDC]Dragon On 02.03.2004, you wrote: >> as I have done various patches and improvements on different >> low-level routines in Rockbox I would like to know where I >> can put these in order to be revised and eventually >> integrated into the main cvs. > The patch tracker is the official way to post patches. I know > it's difficult for the core developers to follow users' > ... > I'm very interested in your patches and like to see them > there. Ok, will file patches then (not now, has to wait until the evening). >> - corrected memory waste in "bitswap.s" and "descramble.s" >> (simple .align issue) > Wasting 2 bytes, or what? The alignment may be intentional, > because the SH fetches two 16-bit instruction in one 32-bit > bus cycle. The execution time may depend on this pairing. Not 2 bytes, but more like 12 bytes (maximum) for each align. I know well from the SH1 manual that memory load/store instructions in IRAM should be long aligned in order to not cause contention with instruction fetch. The problem is with the syntax of the .align pseudo-instruction: ".align <n>" does not align to <n> bytes but to 2^<n> bytes, so the ".align 4" which is used there aligns to 16 bytes and not 4. For longword alignment ".align 2" must be used. This is _confirmed_ by assembler output of compiler (gcc -S xxx.c), internet gcc tips and by comparing final Rockbox size. >> - 20% faster bitswap > How that? (OK, I'll wait for your patch posting) You will see my new loop... >> - new turbo-charged copy_read_sectors() in "ata.c", with >> speed >> relations as follows: >> | word-aligned | unaligned >> -------------+-------------------+-------------------- >> C original | 2.02 MB/s (100 %) | 1.71 MB/s (100 %) >> [IDC]Dragon | 3.42 MB/s (169 %) | 2.22 MB/s (130 %) >> new version | 4.76 MB/s (236 %) | 3.33 MB/s (195 %) > Which "C original" is that? The current routines have already > been optimized on C level by me (look into the cvs history of > ata.c), so the gain could be even more drastic if you compare > with the 2.1 code. This is against _current_ "ata.c" code ("loop compiles to...) > assembler code. Again. I'm interested in what you did. A bug > (?) in the inline assembler prevented me from further > unrolling the copy loop. I did not manage to do stores with an > immediate offset, so I had to use r0 for that, and this is > only good for one level of unrolling. I _do_ actually use stores with immediate offsets! You have to obey the constraints of these inctructions: - the source register must always be r0 - the offset can _only_ be positive (no sign extension) and very small. - with the inline assembler syntax, you have to give the offset already shifted according to data width, e.g. if you want to store a word 2 bytes ahead, you have to write "mov.w r0,_at_(2,r<n>)" If you use "mov.w r0,_at_(1,r<n>)" you get an assemble error. >> - the take-out of cli() / sti() and replacement of these by >> set_irq_level() made disk loading in the video player >> extremely >> slow. >> I don't understand what the intention in replacing cli() / >> sti() by set_irq_level() really is. The comment says that >> cli() / sti() are not safe. > I didn't understand this either, and it seems I have not yet > compiled this into the version I'm running, else I may have > noticed the drag. They should be inline functions. > For the LCD code, we can take out the interrupt disable (like > my comment says), I only put this in to be safe for the I know your comment, but I wanted to retain your safety measure. Please let me know what you think about my idea about nesting cli() / sti(). Regards, Jens _______________________________________________ http://cool.haxx.se/mailman/listinfo/rockbox Received on 2004-03-02 Page template was last modified "Tue Sep 7 00:00:02 2021" The Rockbox Crew -- Privacy Policy |