Rockbox mail archive

Subject: Re: Possible optimizations for coldfire
From: Jens Arnold <arnold-j_at_t-online.de>
Date: Thu, 20 Apr 2006 00:55:55 +0200

On 19.04.2006, RaeNye wrote:

> 1. Currently each PutPixel costs 2 mem reads, 4 mem writes and
> some shifts (all because of the 18-bit thingy). Would

What do you mean by PutPixel? There is no such function, and drawing a pixel requires just one 16-bit read and one 16-bit write on all 16-bit colour targets. The X5 framebuffer is 16-bit.

> double-buffering help? i.e., keep another LCD buffer (in DRAM)
> representing the state of the LCD /now/;
> whenever lcd_update() is called, we compare the 16-bit
> (32-bit?) pixel value and only update it and the hardware if
> necessary. Obvious con: memory.
> How can this be profiled, btw?

I would expect this to be slower, for several reasons. First, you would have to read 2 values from RAM instead of one, and since there's definitely not enough IRAM for 2 buffers, at least one would be in slooow SDRAM.

My idea was to add burst reading with movem, although that probably won't help much since the framebuffer is currently in IRAM. That might change later, if we want other stuff that's even more time critical to reside in IRAM, and then burst reading really pays off - cf. the H300 lcd driver.

> 2. I might be terribly wrong, but can we shut down all GUI
> code when the backlight is off?
> Yes, this means WPS refreshes, scrolling, etc.

No, we can't do so in general. We could perhaps do that in certain modules like the wps, but these modules then have to be made lcd status aware. They would need to redraw everything whenever the light goes on.

> 3. GNU's memcpy() and memset() are not using all possible
> registers (i.e. movem.l does only 16 bytes writes instead of
> the possible 48). It also spends so much time on alignment
> (which is not necessary for movem.l, IIRC).

Rockbox' memcpy() and memset() for coldfire are not gnu - they are our own, coded by me. There are numerous reasons why they are designed as they are designed. The alignment isn't a waste at all, it's a huge speed boost.
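To illustrate the point about alignment, here is a minimal sketch in portable C. It is not the Rockbox routine, which the message only describes in general terms. The idea is to copy single bytes until the destination sits on a 16-byte line boundary, move the bulk in whole 16-byte lines, and finish the tail bytewise. The names `copy_line_aligned` and `LINE_SIZE` are made up for this example, only the destination is aligned (a simplifying assumption), and plain C cannot force `movem.l` bursts, so a fixed-size `memcpy` merely stands in for the line transfer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 16u  /* line burst size quoted in the message */

/* Hypothetical helper, not the Rockbox memcpy(): align the destination
 * first, then copy whole 16-byte lines, then the remaining tail. */
static void *copy_line_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: single bytes until d is on a line boundary (or n runs out). */
    while (n > 0 && ((uintptr_t)d % LINE_SIZE) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Bulk: one full line per iteration. On the target this is where a
     * movem.l based burst would go; a constant-size memcpy stands in. */
    while (n >= LINE_SIZE) {
        memcpy(d, s, LINE_SIZE);
        d += LINE_SIZE;
        s += LINE_SIZE;
        n -= LINE_SIZE;
    }

    /* Tail: leftover bytes that do not fill a line. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

The head loop costs at most 15 byte copies, which is small next to the bulk loop for any reasonably sized buffer.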
Check the coldfire manual about the memory controller and burst access. Thing is, while the coldfire doesn't strictly enforce alignment, there are severe performance penalties involved with unaligned accesses. The memory controller can do 1 (byte), 2 (word), 4 (longword) and 16 byte (line) bursts, but the access must be aligned at <size>-boundaries to be a burst access.

Let's start with a line burst (16 bytes). Misalign that by 4 bytes, and you're at longword burst level. Speed penalty: a *factor* of 2.5! Misalign by 2 bytes, and you're down to word accesses: another factor of 2. Misalign by a byte, and you are at a level where accesses go byte - word - byte. Another factor of 1.5...

> The iAudio firmware contains smaller and faster versions,
> which *I* cannot contribute to Rockbox -- as that would be
> considered code theft -- but you can write as you only heard
> the general idea :)

Smaller - sure, our versions are quite large (at least memcpy and memmove). Faster - I strongly doubt that, see above.

Regards, Jens

Received on 2006-04-20
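The penalty figures quoted in the message compound. A small sketch in C, assuming the three factors (2.5, 2 and 1.5) are taken at face value and simply multiply; the function names are made up for illustration and nothing here is measured on hardware.

```c
#include <stdint.h>

/* Alignment class of an address, following the rule from the message:
 * an access must sit on a <size> boundary to be a burst of that size.
 * Class 1 stands for the odd-address "byte - word - byte" pattern. */
static unsigned alignment_class(uintptr_t addr)
{
    if (addr % 16 == 0) return 16;  /* line */
    if (addr % 4 == 0)  return 4;   /* longword */
    if (addr % 2 == 0)  return 2;   /* word */
    return 1;                       /* odd address */
}

/* Cumulative slowdown relative to a line burst, compounding the quoted
 * factors: 2.5 (line -> longword), 2 (-> word), 1.5 (-> odd address). */
static double slowdown_vs_line(uintptr_t addr)
{
    switch (alignment_class(addr)) {
    case 16: return 1.0;
    case 4:  return 2.5;
    case 2:  return 2.5 * 2.0;
    default: return 2.5 * 2.0 * 1.5;
    }
}
```

Under these assumptions an access stream starting at an odd address would be roughly 7.5 times slower than a line-aligned one, which is the case for spending a few instructions on alignment up front.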