Rockbox - Rockbox mail archive

From: Tat Tang <tat_tang_at_yahoo.com>
Date: Thu, 18 Mar 2004 18:28:29 -0800 (PST)

> It would mean the Chinese version would always lag
> behind the "western" Rockbox...

We are in agreement. I was thinking about two boxed
products, one European and one Chinese, both built
from a single source.

> 1) Run-time configuration. The user actually
> chooses if he wants to use single-byte strings or
> utf-8 strings. This could make the code rather
> complex.

Happily, I think it's straightforward. Just change
unicode2iso:

    if (single_byte_mode) {
        iso[isoloc] = unicode[x];
    } else {
        /* do utf8 translation */
    }
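
To make the two branches concrete, here is a rough sketch of
what such a conversion could look like. It is an illustration
only: the function name unicode2bytes, its signature, and the
'?' fallback for characters that do not fit in one byte are
all assumptions, not the existing unicode2iso. It handles
plain 16-bit code units and does not treat surrogate pairs.

```c
#include <stddef.h>

/* Hypothetical sketch, not the real Rockbox unicode2iso.
 * Converts 16-bit code units to single-byte text or to UTF-8.
 * Returns the number of bytes written, excluding the final NUL. */
size_t unicode2bytes(const unsigned short *unicode, size_t count,
                     unsigned char *out, size_t outsize,
                     int single_byte_mode)
{
    size_t o = 0;
    size_t x;

    if (outsize == 0)
        return 0;

    for (x = 0; x < count; x++) {
        unsigned short c = unicode[x];
        size_t need;

        /* Bytes this character needs in the chosen mode. */
        if (single_byte_mode)
            need = 1;
        else
            need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;

        /* Stop early, always leaving room for the NUL. */
        if (o + need >= outsize)
            break;

        if (single_byte_mode) {
            /* Assumed fallback: '?' for anything above 0xFF. */
            out[o++] = (c < 0x100) ? (unsigned char)c : '?';
        } else if (c < 0x80) {
            out[o++] = (unsigned char)c;
        } else if (c < 0x800) {
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xE0 | (c >> 12));
            out[o++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

The mode is passed as a parameter here so that the caller
can choose it per string rather than once at startup.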

Instead of making this a run-time configuration, it
would be nicer to do it on the fly. Otherwise browsing
directories would be painful. Of course that also
raises the issue of being able to define a single byte
font and a "utf8" font.

> 2) Compile-time configuration. Simpler code and
> smaller size.

If there are two binaries with different sizes, will
the larger binary be less functional?

A Unicode-to-Big5 translation table is 55K. Being
initialized data, it counts toward the 200K limit. It
might be compressible to ~20K, or we could drop the
less frequent characters and compress it to maybe 10K.
Either way, it's significant.

Someone suggested loading it from disk. I need to go
understand the initialization code better.

We will need a different table for Simp. Chinese, two
for Japanese and one for Korean. So if we can't load
from disk we'd have to do some compile-time
configuration anyway.
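
As a rough idea of what loading from disk could look like,
here is a minimal sketch. Everything in it is an assumption:
the function name load_codepage_table, the flat file layout
(16-bit entries in the machine's native byte order), and the
use of stdio, which is only there to keep the example
portable. The point is that the table goes into a buffer
filled at run time instead of being initialized data in the
binary.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical loader: reads up to max_entries 16-bit table
 * entries from path into buf. Entries are assumed to be stored
 * in native byte order. Returns the number of entries read,
 * or 0 if the file cannot be opened. */
size_t load_codepage_table(const char *path,
                           unsigned short *buf,
                           size_t max_entries)
{
    FILE *f = fopen(path, "rb");
    size_t n;

    if (f == NULL)
        return 0;

    n = fread(buf, sizeof buf[0], max_entries, f);
    fclose(f);
    return n;
}
```

With something like this, one table file per encoding
(Trad. Chinese, Simp. Chinese, and so on) could be selected
at run time by passing a different path.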

> I do not see Rockbox moving entirely to Unicode,
> since that requires the presence (and use) of a
> font cache, which is a big burden for all western
> languages that don't need one.

I disagree. font.c has been modified so that if the
font is smaller than MAX_FONT_SIZE, it is loaded
completely and there are no further disk reads to
fetch characters. If the font is larger than
MAX_FONT_SIZE, the same memory area is reused as the
cache.
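
The decision described above can be sketched roughly as
follows. This is not the code in font.c. The struct, the
function name font_choose_mode, the value given to
MAX_FONT_SIZE, and the fixed glyph size are all assumptions
made for illustration.

```c
#include <stddef.h>

#define MAX_FONT_SIZE 9000  /* assumed value, illustration only */

enum font_mode { FONT_RESIDENT, FONT_CACHED };

/* One memory area serves both purposes. */
struct font_buffer {
    unsigned char mem[MAX_FONT_SIZE];
    enum font_mode mode;
    size_t cache_slots;  /* glyphs that fit when used as a cache */
};

/* Hypothetical helper: decide how the buffer is used for a font
 * file of font_file_size bytes with glyph_size bytes per glyph. */
enum font_mode font_choose_mode(struct font_buffer *fb,
                                size_t font_file_size,
                                size_t glyph_size)
{
    if (font_file_size < sizeof fb->mem) {
        /* Small font: load it whole, no further disk reads. */
        fb->mode = FONT_RESIDENT;
        fb->cache_slots = 0;
    } else {
        /* Large font: reuse the same memory as a glyph cache. */
        fb->mode = FONT_CACHED;
        fb->cache_slots = glyph_size ? sizeof fb->mem / glyph_size : 0;
    }
    return fb->mode;
}
```

In this sketch a small western font takes the first branch,
so no cache slots are set up for it.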

> By the way, please call symbols in the code utf8
> instead of big5.

Okay, I will call it utf8, though (IIRC) strictly
speaking, utf8 uses 1, 2 or 3 bytes for a 16-bit
character.

> Surely the code itself does not depend on the
> character set, but rather the encoding?

A strict interpretation of the encoding standards
would involve different code. But for practical
purposes, I think Trad. and Simplified Chinese can use
the same code. I haven't looked closely at Japanese or
Korean. Mind you, no-one has asked for Japanese or
Korean...

-- Oh, and accents are showing up perfectly
on the Rockbox site :)

Tat...

Received on 2004-03-19
