Rockbox mail archiveSubject: Re: Various bug fixes and improvements - where to put?
Re: Various bug fixes and improvements - where to put?
From: [IDC]Dragon <idc-dragon_at_gmx.de>
Date: Tue, 2 Mar 2004 08:06:34 +0100 (MET)
> as I have done various patches and improvements on different
> low-level routines in Rockbox I would like to know where I can
> put these in order to be revised and eventually integrated into
> the main cvs.
> (Yes I know that there is a patch tracker, but it seems to me
> that the patches stay there a rather long time.)
The patch tracker is the official way to post patches. I know it's difficult
for the core developers to follow users' creativity, but we don't want
something like first and second class patches. But feel free to post here to draw
more attention to your patch(es).
I'm very interested in your patches and like to see them there.
> In detail I did the following:
> - corrected memory waste in "bitswap.s" and "descramble.s"
> (simple .align issue)
Wasting 2 bytes, or what?
The alignment may be intentional, because the SH fetches two 16-bit
instruction in one 32-bit bus cycle. The execution time may depend on this pairing.
> - 20% faster bitswap
How that? (OK, I'll wait for your patch posting)
> - new turbo-charged copy_read_sectors() in "ata.c", with speed
> relations as follows:
> | word-aligned | unaligned
> C original | 2.02 MB/s (100 %) | 1.71 MB/s (100 %)
> [IDC]Dragon | 3.42 MB/s (169 %) | 2.22 MB/s (130 %)
> new version | 4.76 MB/s (236 %) | 3.33 MB/s (195 %)
Which "C original" is that? The current routines have already been optimized
on C level by me (look into the cvs history of ata.c), so the gain could be
even more drastic if you compare with the 2.1 code.
> (These values were calculated from the machine clock cycles
> not taking memory or i/o wait states into account, but the
> relations should be correct)
Hmm, this may be misleading. The waitstates for ATA and memory are always
there, we optimize only on the code overhead to move the data. So in fact I
expext less gain, and it's not proportional to the # of instructions, because of
this base load.
I measured about 2MB/s aligned and 1.3MB/s misaligned for my assembler code.
Again. I'm interested in what you did. A bug (?) in the inline assembler
prevented me from further unrolling the copy loop. I did not manage to do stores
with an immediate offset, so I had to use r0 for that, and this is only good
for one level of unrolling.
> - the take-out of cli() / sti() and replacement of these by
> set_irq_level() made disk loading in the video player extremely
> I don't understand what the intention in replacing cli() / sti()
> by set_irq_level() really is. The comment says that cli() / sti()
> are not safe.
I didn't understand this either, and it seems I have not yet compiled this
into the version I'm running, else I may have noticed the drag.
They should be inline functions.
For the LCD code, we can take out the interrupt disable (like my comment
says), I only put this in to be safe for the future, in case anybody will mess
with the same port within an interrupt. This is currently not the case, and
won't be very likely either.
Just leave a mental note and the code comment in there.
Thanks for your work on this!
-- GMX ProMail (250 MB Mailbox, 50 FreeSMS, Virenschutz, 2,99 EUR/Monat...) jetzt 3 Monate GRATIS + 3x DER SPIEGEL +++ http://www.gmx.net/derspiegel +++ _______________________________________________ http://cool.haxx.se/mailman/listinfo/rockboxReceived on 2004-03-02