


Rockbox mail archive

Subject: RE: Re: time to sleep?
From: Sven Karlsson (svenka_at_it.kth.se)
Date: 2002-08-22


Hi again,

>and every instruction from the RAM anyway. We have no cache.

I know that perfectly well. In fact, caches have little to do with this.

However, "register" might/will help you anyway. Look at the following
snippet:

 int i;
 int *ptr;

 ...

 i=*ptr; /* statement 1 */
 *(ptr++)=5; /* statement 2 */
 i++; /* statement 3 */
 foo(); /* statement 4 */
 *(ptr+2)=6; /* statement 5 */
 i++; /* statement 6 */

Now this is somewhat synthetic, but the same end result occurs in normal
code. The problem here is that the compiler can never know where ptr points,
or rather it can only deduce where ptr points in some cases. Here ptr
might point to i, which is known as aliasing. Furthermore, you don't even
know whether ptr points to ptr itself!
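
To make the aliasing case concrete, here is a minimal sketch in C. The
function name run_statements and its aliased flag are made up for this
illustration and are not part of the snippet above; the function simply runs
statements 1-3 with ptr aimed either at i itself or at a separate buffer.

```c
/* Hypothetical helper (not from the original snippet): runs statements 1-3
   with ptr pointing either at i itself (aliased) or at a separate buffer. */
int run_statements(int aliased)
{
    int buf[4] = { 0, 0, 0, 0 };
    int i = 0;
    int *ptr = aliased ? &i : buf;  /* taking &i is what makes aliasing possible */

    i = *ptr;       /* statement 1 */
    *(ptr++) = 5;   /* statement 2: overwrites i when ptr == &i */
    i++;            /* statement 3 */

    return i;       /* 1 when ptr pointed at buf, 6 when ptr pointed at i */
}
```

The two calls give different results for the same three statements, which is
why a compiler that cannot rule out the aliased case has to go through memory.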

This means that the compiler will write i and ptr back to memory whenever it
fears that they will be accessed through ptr, so the code generated for
statements 1-6 might be (in pseudo mc680x0 assembly code):

 move.l ptr(a6),a2 ; load ptr
 move.l (a2),d2 ; read whatever ptr points to
 move.l d2,i(a6) ; flush i since we are going to write via ptr
 moveq.l #5,d3 ; load constant (move.l #5,(a2)+ might be faster when
               ; the register pressure is high)
 move.l d3,(a2)+ ; write via ptr, post-increment
 move.l a2,ptr(a6) ; flush ptr
 move.l i(a6),d2 ; reload i, the write might have changed it
 addq.l #1,d2 ; increment i
 move.l d2,i(a6) ; we don't know if there are any side effects with regards to
                 ; i and ptr in foo so both must be flushed!
 bsr foo ; call foo assume callee save
 move.l ptr(a6),a2 ; reload ptr
 moveq.l #6,d3 ; load constant
 move.l d3,8(a2) ; write
 move.l i(a6),d2 ; reload i
 addq.l #1,d2 ; increment i

Now if the "register" keyword had been used in this case, the
compiler could do better since it would have known that there is no
aliasing:

 move.l ptr(a6),a2 ; load ptr (unless it already is in a register)
 move.l (a2),d2 ; read whatever ptr points to
 moveq.l #5,d3 ; load constant (move.l #5,(a2)+ might be faster when
               ; the register pressure is high)
 move.l d3,(a2)+
 addq.l #1,d2 ; increment i
 bsr foo ; call foo assuming callee save
 moveq.l #6,d3 ; load constant
 move.l d3,8(a2) ; write
 addq.l #1,d2 ; increment i

So the net outcome is shorter code, better register usage, and fewer memory
accesses, all of which help reduce power usage.
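
For reference, this is roughly what the register variant of the snippet looks
like in C. It is only a sketch: run_with_register, its buf parameter and the
body of foo are assumptions added so the fragment compiles on its own. The
point is that C does not allow the address of a register object to be taken,
so nothing can alias i or ptr.

```c
/* Hypothetical stand-in for foo() from the snippet; it only counts calls. */
static int foo_calls;

void foo(void)
{
    foo_calls++;
}

/* Hypothetical wrapper around statements 1-6; buf must hold at least 4 ints. */
int run_with_register(int *buf)
{
    register int i;
    register int *ptr = buf;

    i = *ptr;         /* statement 1 */
    *(ptr++) = 5;     /* statement 2: writes buf[0], ptr now points at buf[1] */
    i++;              /* statement 3 */
    foo();            /* statement 4: cannot reach i or ptr */
    *(ptr + 2) = 6;   /* statement 5: writes buf[3] */
    i++;              /* statement 6 */

    /* Writing &i or &ptr anywhere in this function would be rejected by a
       C compiler, which is what rules out aliasing of these two objects. */
    return i;
}
```

Whether the generated code actually improves still depends on the compiler
and target, so it is worth checking the assembly output before relying on it.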

Naturally, this is also compiler dependent and I'm not all that familiar
with the SH port of GCC. My experience, however, is that this really pays off
on mc680x0, SPARC, MIPS and even, although to a lesser degree, on x86.

Most people working on x86 probably won't see this, as there are so few
registers available and the memory-register operations are so efficient that
the compilers tend to use memory a lot.

Anyway, I'll keep my big mouth shut for a while now.

Best regards
 Sven



Page was last modified "Jan 10 2012" The Rockbox Crew