Rockbox mail archive
Subject: RE: Re: time to sleep?
From: Sven Karlsson (svenka_at_it.kth.se)
Date: 2002-08-22
Hi again,
>and every instruction from the RAM anyway. We have no cache.
I know that perfectly well. In fact, caches have little to do with this.
However, "register" might/will help you anyway. Look at the following
snippet:
int i;
int *ptr;
...
i=*ptr; /* statement 1 */
*(ptr++)=5; /* statement 2 */
i++; /* statement 3 */
foo(); /* statement 4 */
*(ptr+2)=6; /* statement 5 */
i++; /* statement 6 */
Now this is somewhat synthetic, but the same end result occurs in normal
code. The problem here is that the compiler can never know where ptr points,
or rather it can only deduce where ptr points in some cases. Here ptr
might point to i, which is known as aliasing. Furthermore, the compiler
doesn't even know whether ptr points to ptr itself!
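To make the aliasing case concrete, here is a minimal sketch of statements 1-3, assuming i and ptr are file-scope variables so that ptr is allowed to point at i. The function name statements_1_to_3 is made up for illustration and is not part of the original snippet.

```c
/* Assumption: i and ptr are globals, so ptr may legally hold &i. */
int i;
int *ptr;

/* Hypothetical wrapper around statements 1-3 of the snippet. */
int statements_1_to_3(void)
{
    i = *ptr;       /* statement 1 */
    *(ptr++) = 5;   /* statement 2: overwrites i when ptr == &i */
    i++;            /* statement 3: must see the value stored above */
    return i;
}
```

If ptr points at some other int holding 10, the function returns 11. If ptr points at i itself, statement 2 overwrites i and the function returns 6, so i cannot simply be kept in a register across the store.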
This means that the compiler will write i and ptr back to memory whenever it
fears that they will be accessed through ptr, so the code generated for
statements 1-6 might be (in pseudo mc680x0 assembly code):
move.l ptr(a6),a2 ; load ptr
move.l (a2),d2 ; read whatever ptr points to
move.l d2,i(a6) ; flush i since we are going to write via ptr
moveq.l #5,d3 ; load constant (move.l #5,(a2)+ might be faster when
; the register pressure is high)
move.l d3,(a2)+ ; write via ptr, then post-increment
move.l i(a6),d2 ; reload i since the write might have changed it
addq.l #1,d2 ; increment i
; we don't know if there are any side effects with regards to
; i and ptr in foo so both must be flushed!
move.l d2,i(a6) ; flush i
move.l a2,ptr(a6) ; flush ptr
bsr foo ; call foo assuming callee save
move.l ptr(a6),a2 ; reload ptr
moveq.l #6,d3 ; load constant
move.l d3,8(a2) ; write
move.l i(a6),d2 ; reload i
addq.l #1,d2 ; increment i
Now if the "register" keyword had been used in this case, the
compiler could do better since it would have known that there is no
aliasing:
move.l ptr(a6),a2 ; load ptr (unless it already is in a register)
move.l (a2),d2 ; read whatever ptr points to
moveq.l #5,d3 ; load constant (move.l #5,(a2)+ might be faster when
; the register pressure is high)
move.l d3,(a2)+
addq.l #1,d2 ; increment i
bsr foo ; call foo assuming callee save
moveq.l #6,d3 ; load constant
move.l d3,8(a2) ; write
addq.l #1,d2 ; increment i
So the net outcome is shorter code, better register usage, and fewer memory
accesses, all of which help reduce power usage.
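For reference, this is roughly what the "register" variant of the snippet could look like in C. It is only a sketch: the function name run_with_register, the buf parameter, and the empty foo are placeholders I made up so the fragment compiles on its own.

```c
/* Placeholder for the call in statement 4; the real foo is not shown. */
static void foo(void)
{
}

/* Hypothetical wrapper; buf must have room for at least 4 ints. */
int run_with_register(int *buf)
{
    register int i;           /* &i is not allowed, so nothing can alias i */
    register int *ptr = buf;  /* &ptr is not allowed either */

    i = *ptr;        /* statement 1 */
    *(ptr++) = 5;    /* statement 2 */
    i++;             /* statement 3 */
    foo();           /* statement 4 */
    *(ptr + 2) = 6;  /* statement 5 */
    i++;             /* statement 6 */
    return i;
}
```

Since C forbids taking the address of a "register" variable, neither the stores through ptr nor the call to foo can touch i or ptr, which is what lets the compiler skip the flushes and reloads. Whether a given compiler actually generates better code from this is something you would have to check in its output.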
Naturally, this is also compiler dependent and I'm not all that familiar
with the SH port of GCC. My experience is however that this really pays off on
mc680x0, SPARC, MIPS and even, although to a lesser degree, on x86.
Most people working on x86 probably won't see this, as there are so few
registers available and the memory-register operations are so efficient that
the compilers tend to use memory a lot.
Anyway, I'll keep my big mouth shut for a while now.
Best regards
Sven