Rockbox mail archive
Subject: Re: [ rockbox-Patches-920256 ] Traditional/Simplified Chinese patch
From: Tat Tang <tat_tang_at_yahoo.com>
Date: Tue, 23 Mar 2004 15:26:21 -0800 (PST)

> Is support for both big5, gb and unicode necessary?

Well, GB2312 is used for Simplified Chinese characters, as written in China. Big5 is used for Traditional Chinese characters, as written in Taiwan and Hong Kong. These are separate and distinct character sets. For example, the character for "electricity" is written differently in the two scripts. The simplified version has no mapping in Big5 and the traditional version has no mapping in GB2312. Even where a character appears in both scripts, it maps to different code points: for example, the character for "you" maps to 0xC4E3 in GB2312 and 0xA741 in Big5.

Yes, both Big5 and GB2312 need to be supported.

> Those lookup tables use 15-30 KB code space. Are all
> three code tables regularly used for file names and
> ID3 tags?

I'm finding that filenames are getting stored as Unicode and id3 tags (courtesy of freedb) are coming through as Big5 or GB2312.

From the documentation that I have, GB2312 defines 3755 frequently used Hanzi and Big5 defines 5401 frequently used Hanzi. So there is scope to reduce the size of these tables; however, some users are keen to read lyrics/books while listening to music. It may be possible to provide both full and reduced mappings, so that table size can be traded off against mp3 buffering. Is there a rule of thumb for calculating minimum buffer space?

> If we need all three, can these tables be moved
> outside the main code somehow? Maybe loaded at boot?

The tables come in pairs: there is a Unicode->Big5 lookup and a Unicode->GB2312 lookup. It's only necessary to maintain a single lookup at any one time. A particular "document" will be Big5 or GB2312 in the same way that a document will be Greek or Russian.

Loading the required table at boot makes a lot of sense. It also allows the user to switch on the fly between Traditional, Simplified and possibly other MBCS languages. This would be combined with your previous suggestion of having a switch between single and multi-byte mode.
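To make the "single lookup" idea concrete, here is a rough sketch in C, not the code from the patch. The names (struct mb_map, mb_lookup, select_charset, mb_convert) and the sorted-array layout are assumptions made up for illustration, and the tables are tiny stand-ins for the real ones, which would be loaded from disk. Only the "you" code points come from the example above; the others are well-known values for "middle" and the two forms of "electricity".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one Unicode code point and its two-byte code. */
struct mb_map {
    uint16_t unicode; /* UCS-2 code point */
    uint16_t mbcs;    /* code in the target charset */
};

#define TBL_LEN(t) (sizeof(t) / sizeof((t)[0]))

/* Illustrative tables, sorted by Unicode value for binary search. */
const struct mb_map to_gb2312[] = {
    { 0x4E2D, 0xD6D0 }, /* "middle" */
    { 0x4F60, 0xC4E3 }, /* "you" */
    { 0x7535, 0xB5E7 }, /* "electricity", simplified form */
};
const struct mb_map to_big5[] = {
    { 0x4E2D, 0xA4A4 }, /* "middle" */
    { 0x4F60, 0xA741 }, /* "you" */
    { 0x96FB, 0xB971 }, /* "electricity", traditional form */
};

/* Binary search; returns 0 when the charset has no mapping. */
uint16_t mb_lookup(const struct mb_map *tbl, size_t n, uint16_t ucs)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tbl[mid].unicode == ucs)
            return tbl[mid].mbcs;
        if (tbl[mid].unicode < ucs)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

/* Only one table is active at a time; switching charsets swaps it. */
static const struct mb_map *active_tbl;
static size_t active_len;

void select_charset(const struct mb_map *tbl, size_t n)
{
    assert(tbl != NULL);
    active_tbl = tbl;
    active_len = n;
}

/* Convert through whichever table is currently selected. */
uint16_t mb_convert(uint16_t ucs)
{
    return active_tbl ? mb_lookup(active_tbl, active_len, ucs) : 0;
}
```

With these stand-in tables, looking up U+4F60 gives 0xC4E3 through the GB2312 table and 0xA741 through the Big5 table, while the simplified "electricity" (U+7535) finds nothing in the Big5 table. Switching charset is just a call to select_charset with the other table, which is why only one table needs to sit in memory.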
So, the proposed boot sequence is something like this:

1. Boot into single byte mode.
2. If configured for single byte mode, then finished.
3. Otherwise, malloc a suitable buffer and load the required conversion table.
4. Switch into multi-byte mode.

Switching between single and multi-byte mode at this level means that single-byte mode users don't incur the extra memory overhead. It will require rebooting the firmware.

When in multi-byte mode, there will be an ability to switch into Western languages (i.e. soft switch into single byte mode). There would also be the ability to switch between multi-byte charsets.

The initial thought was to provide a browse charsets menu; however, it seems that a change in charset would always result in a font change. This seems clumsy, and it would be nicer if the user selects a font and the required charset is automatically loaded. This requires changing the font format to include a charset. Does this sound like a reasonable change?

> gives a total of nine recorder builds!
> Those files cannot be loaded by the Archos firmware.
> Did you test the code with Rockbox burned in flash?

I hear you. And I understand the flash rom has a size limit. I guess loading the conversion table at boot solves everything.

Thanks for the feedback. And let me know how you feel about the proposed changes.

Tat..

Received on 2004-03-24