Rockbox mail archive

Subject: Findings regarding PP5020 IRAM

From: Jens Arnold <jens_at_jens-arnold.net>
Date: Sun, 21 Feb 2010 23:24:58 +0100

Hi all!

I already mentioned this info in IRC, but I also want to send it
to the list for those who aren't IRC regulars and are not
reading the logs either.

As many of you may know, a lot of devices (including all iPods
up to the iPod Video) use a SoC of the PortalPlayer family, or
PP for short, which exists in 3 flavours among the supported
targets: PP5002, PP5020 and PP5022.

All 3 flavours have many things in common, among those
- dual ARM7TDMI core, specced up to 80..100MHz
- 8KB unified, 4-way set associative cache per core
- 96KB (PP5002 and PP5020) or 128KB (PP5022) built-in, static
 RAM (called IRAM for short) which is specced as zero waitstate

What we knew so far is that the PP5002 has a broken cache:
cache accesses always include one waitstate, i.e. cache
access is only half as fast as it could be. Even so, most
codecs can be made almost as fast as on PP5022 by tuning their
IRAM utilisation.

For quite some time I wondered why codecs running on PP5020 are
always a few percent slower (usually around 5%) than on both
PP5022 and PP5002 (when properly tuned). Today I decided to
investigate this issue, and here's what I found:

The IRAM on PP5020 is *not* always zero waitstate!

Apparently PP5020 IRAM consists of 4 banks of 24KB each, with
the following access characteristics:

- bank 0 is zero waitstate for the CPU, but one waitstate for
 the COP
- bank 1 is zero waitstate for the COP, but one waitstate for
 the CPU
- banks 2 and 3 are one waitstate for both cores
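
The bank behaviour listed above can be written down as a small
lookup. This is only a sketch: the bank size and waitstate values
come from the measurements in this mail, but the function and enum
names are made up for illustration, and it assumes (unverified) that
the four banks sit contiguously in ascending address order with
bank 0 at the start of IRAM.

```c
#include <assert.h>
#include <stddef.h>

/* From the mail: PP5020 IRAM is 4 banks of 24KB each (96KB total). */
#define IRAM_BANK_SIZE  (24u * 1024u)
#define IRAM_BANK_COUNT 4u
#define IRAM_SIZE       (IRAM_BANK_SIZE * IRAM_BANK_COUNT)

/* Hypothetical names for the two cores (main CPU and coprocessor). */
enum pp_core { CORE_CPU, CORE_COP };

/* ASSUMPTION: banks are contiguous in ascending address order,
 * with bank 0 starting at IRAM offset 0. */
static unsigned iram_bank(size_t offset)
{
    assert(offset < IRAM_SIZE);
    return (unsigned)(offset / IRAM_BANK_SIZE);
}

/* Waitstates per access, as observed in the mail:
 * bank 0 is fast for the CPU only, bank 1 is fast for the COP only,
 * banks 2 and 3 cost one waitstate on both cores. */
static unsigned iram_waitstates(enum pp_core core, size_t offset)
{
    unsigned bank = iram_bank(offset);

    if (bank == 0)
        return core == CORE_CPU ? 0 : 1;
    if (bank == 1)
        return core == CORE_COP ? 0 : 1;
    return 1;
}
```

For example, under that layout assumption a buffer at IRAM offset 0
would be zero waitstate for code running on the CPU, but cost one
waitstate per access for code running on the COP.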

This means that IRAM usage in rockbox needs to be reconsidered
for best performance on PP5020.

As a rule of thumb I think that small arrays which are accessed
often in a tight loop are better off in DRAM, as the cache
doesn't have waitstates, while larger arrays (which don't fit
completely into the cache together with their code) should go
into IRAM. I guess there is no strict rule - each array in each
codec will need tuning on target regarding this PP5020 flaw.
Giving up IRAM on PP5020 completely would be silly, though: a
quick check with the APE codec has shown that doing so slows
codecs down.
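
To make the rule of thumb concrete, here is a minimal sketch of it
as a first-guess helper. The names are hypothetical and this is not
existing Rockbox code. The only hard number in it is the 8KB unified
cache per core mentioned earlier; the result is just a starting point
for on-target benchmarking, not a replacement for it.

```c
#include <stddef.h>

/* From the mail: 8KB unified (code + data) cache per core. */
#define PP_CACHE_SIZE (8u * 1024u)

/* Hypothetical placement choices for a codec array on PP5020. */
enum placement { PLACE_DRAM, PLACE_IRAM };

/* First guess only: if the array and the hot loop that uses it fit
 * into the cache together, prefer DRAM (cached accesses have no
 * waitstates on PP5020); otherwise prefer IRAM. */
static enum placement suggest_placement(size_t array_bytes,
                                        size_t hot_code_bytes)
{
    /* Written as two comparisons to avoid overflow in the sum. */
    if (array_bytes <= PP_CACHE_SIZE &&
        hot_code_bytes <= PP_CACHE_SIZE - array_bytes)
        return PLACE_DRAM;
    return PLACE_IRAM;
}
```

A 512 byte table used by a 2KB inner loop would come out as DRAM,
while a 16KB buffer would come out as IRAM regardless of the loop
size. Which IRAM bank the array lands in, and which core touches it,
still matters and is not modelled here.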

Regards, Jens
Received on 2010-02-21

