Rockbox mail archive
Subject: Re: Different charsets, loadable fonts, unicode ....
From: Alex Gitelman <alex_at_fg-soup.com>
Date: Mon, 15 Jul 2002 09:42:37 -0700
GH> 1. Unicode support. Can rockbox run with just unicode,
GH> or are we going to have to support code pages? Alex
GH> said that he had to use 3 different fonts in order to properly
GH> handle the ID3 tags. Also, apparently they weren't encoded
GH> in unicode, but I haven't got a clear answer on that.
Some more background info.
Before Unicode there were several different encodings for Cyrillic
that mapped chars to the range from 0x80 to 0xff. So viewing Russian
text was always a problem. Typically win1251 would be used
in MS Windows systems and koi8 in Unix systems. So whatever text you
show, you must know its encoding.
I am not using 3 different fonts in my patch. I have fonts in
different encodings and their encoding flag is not always available.
I use only one font, but transcoding works as follows:
To get the bitmap for char C:
1. Convert its code using the encoding table from the source encoding to
win1251. If there is no entry for this char, then no transcoding happens
in this step. So the table only has 64 real entries for Cyrillic chars
and the rest are set to -1.
2. Look up the Encoding flag in the BDF. If there is a glyph with an
encoding matching the char, then use this glyph.
3. If no glyph with the appropriate encoding is present, then use the
char offset in the BDF.
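
The three steps above can be sketched roughly as follows. This is only
an illustration in Python, not the code from the patch: the names
make_table, transcode and find_glyph are hypothetical, the glyph list is
a stand-in for parsed BDF data, and "char offset" is assumed to mean the
glyph's position in the BDF file.

```python
NO_ENTRY = -1  # marker for "no transcoding entry", as in step 1


def make_table(mapping):
    """Build a 256-entry table; slots without a mapping stay at NO_ENTRY."""
    table = [NO_ENTRY] * 256
    for src, dst in mapping.items():
        table[src] = dst
    return table


def transcode(code, table):
    """Step 1: map a source code to the target encoding, or leave it as is."""
    target = table[code] if 0 <= code < len(table) else NO_ENTRY
    return code if target == NO_ENTRY else target


def find_glyph(code, table, glyphs):
    """glyphs is a list of (encoding or None, bitmap) pairs in file order."""
    code = transcode(code, table)
    # Step 2: prefer a glyph whose encoding value matches the char.
    for encoding, bitmap in glyphs:
        if encoding == code:
            return bitmap
    # Step 3: otherwise fall back to the glyph at that offset (assumption).
    if 0 <= code < len(glyphs):
        return glyphs[code][1]
    return None
```

For example, with a one-entry koi8 table, make_table({0xE1: 0xC0}),
the koi8 code 0xE1 (capital Cyrillic A) is looked up as win1251 0xC0,
while plain ASCII codes pass through step 1 unchanged.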
At this point the font gets compiled to AJF.
This font is loaded by Rockbox and ID3 tags look just fine after that.
All of them, created by different people and different software.
That leads me to conclude that ID3 tags are not Unicode, or conversion
happens somewhere else. No idea.
It didn't work for filenames though, because they are UTF16. So they
need to be converted one more time. In that place (unicode.c) I just
hardcoded, for now, the empirical fact that win1251 + 0xB0 = UTF16 for
Eventually I propose using a unicode target encoding in step 1 instead
of win1251. Then if I compile a Cyrillic font I specify that it goes to
page 4 and give it an encoding map, say koi8->utf16. I end up with font
system_04.ajf. system.ajf will be for page 0. The page 0 font may be
compiled from a different BDF or from the same BDF (but with different
So pages 0 and 4 have font support. Other pages will be displayed
as '?' unless a font is added.
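
A rough sketch of the page idea, in Python for brevity. It assumes a
"page" is the high byte of a 16-bit code and that the file suffix is a
two-digit hex page number; only the names system.ajf and system_04.ajf
come from the proposal, the function names are made up for illustration.

```python
def page_of(codepoint):
    """Page number = high byte of a 16-bit code (assumption)."""
    return codepoint >> 8


def font_file_for_page(page):
    """Page 0 uses system.ajf; other pages get a suffix (assumed hex)."""
    return "system.ajf" if page == 0 else "system_%02x.ajf" % page


def displayable(text, loaded_pages):
    """Replace chars whose page has no font loaded with '?'."""
    return "".join(
        ch if page_of(ord(ch)) in loaded_pages else "?" for ch in text
    )
```

With fonts for pages 0 and 4 loaded, Latin and Cyrillic text is kept
and anything from another page shows up as '?'.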
Now the issue is non-unicode text. So we just give it a non-unicode
transcoding map. This will not take much space: 255 bytes at most per
code page. Then we specify in preferences which map to use.
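
As an illustration of selecting a map by preference, here is a small
Python sketch. It builds the maps from Python's own cp1251 and koi8_r
codecs instead of a compact per-code-page table, so it does not show the
255-byte layout; the preference names and function names are
hypothetical.

```python
def build_map(codec):
    """Map bytes 0x80..0xff to Unicode code points using a Python codec."""
    table = {}
    for b in range(0x80, 0x100):
        try:
            table[b] = ord(bytes([b]).decode(codec))
        except UnicodeDecodeError:
            pass  # byte is undefined in this code page; leave it unmapped
    return table


# Preference value -> transcoding map (names are illustrative only).
MAPS = {"win1251": build_map("cp1251"), "koi8": build_map("koi8_r")}


def to_unicode(data, preference):
    """Transcode 8-bit text; ASCII and unmapped bytes pass through as is."""
    table = MAPS[preference]
    return "".join(chr(table.get(b, b)) for b in data)
```

The same Cyrillic letter then comes out as one Unicode code point
whichever map is chosen: byte 0xC0 under "win1251" and byte 0xE1 under
"koi8" both give U+0410.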
Received on 2002-07-15