Rockbox Unicode Guide
Character Encoding
All typed text has a “character encoding” (sometimes referred to as a page code, or code page). For English and other Latin-based languages, text is usually encoded in the ISO-8859-1 page code. Here are some examples of other page codes:
- Simplified Chinese GB2312
- Traditional Chinese Big5
- Korean KSX1001
- Japanese Shift_JIS
- Hebrew ISO-8859-8
- Arabic CP1256 (Windows-1256)
- Cyrillic CP1251 (Windows-1251)
- Greek ISO-8859-7
- Unicode UTF-8
Note: a language may have more than one page code, and Rockbox may support one of a language's page codes but not the rest.
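To see why the page code matters, here is a minimal Python sketch. Python and its codec names (such as `iso8859_7`) are my choice for illustration and have nothing to do with how Rockbox is implemented. It reads the same single byte under several of the single-byte page codes listed above.

```python
# The same byte value, read under different single-byte page codes.
# The codec names are Python's standard-library names for these page codes.
PAGE_CODES = {
    "ISO-8859-1": "iso8859_1",
    "ISO-8859-7": "iso8859_7",
    "ISO-8859-8": "iso8859_8",
    "CP1251": "cp1251",
    "CP1256": "cp1256",
}


def interpretations(raw):
    """Return {page code name: decoded text} for the given bytes."""
    return {name: raw.decode(codec) for name, codec in PAGE_CODES.items()}


if __name__ == "__main__":
    for name, text in interpretations(b"\xe1").items():
        print(f"{name:<11} {text}")
```

Each page code turns the byte 0xE1 into a different letter (Latin, Greek, Hebrew, Cyrillic or Arabic), which is why a player has to be told which page code a tag was written in.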
What is Unicode?
Unicode, as defined by Wikipedia, is “an industry standard whose goal is to provide the means by which text of all forms and languages can be encoded for use by computers.” To put it simply, Unicode supports multiple languages, eliminating the need to use a different page code for every language. The use of Unicode is explained in more detail below.
ID3 Situations
I have non-Unicode ID3s in "insert 1 foreign language"
Tags are usually encoded in the default page code of your PC's OS. This is the case for most tags, except those created very recently by Mp3tag and other tagging programs that encode tags in Unicode by default. When your tag is encoded this way, and not in Unicode, you can only display it on Rockbox by selecting the appropriate "default page code" and "font".
Note: In Windows XP, the default page code settings can be found at “Control Panel/Regional and Language Settings/Advanced tab/Language for non-Unicode programs”.
For example, say you have an ID3 tag with Chinese info, encoded in the OS's default page code (in this case Chinese). This tag would show as garbage on Rockbox unless you choose Chinese as your default page code and Unifont as your font. With those settings, Chinese songs and English songs (ISO-8859-1) will all display properly.
Drawbacks of using Unifont
The problem with Unifont is that it might not be optimized for the WPS of your choice. The 6+12x13 font, added on Feb. 12 2006, supports Japanese and Korean characters and tends to be compatible with a larger number of WPSs. It is also smaller than Unifont and allows more characters to fit on the screen. You may wish to use 6+12x13 instead of Unifont if you only need support for Japanese and Korean characters.
I have non-Unicode ID3s in "insert 2 or more foreign language(s)"
If your ID3 tags are in more than one foreign language, the above solution isn't perfect. Say that in addition to Chinese ID3s, you also have Arabic, Greek, Hebrew, Korean or Japanese ID3s (any combination of two or more of these languages applies). For my example, I'll choose Arabic as the second foreign language. If you're using Windows, having ID3s in two of these languages means that you need to switch your default page code depending on which language you want to display. To display an Arabic-encoded ID3, I'd have to choose Arabic as my default page code; to display a Chinese-encoded ID3, I'd have to choose Chinese. With Arabic set as the default page code, Arabic songs display fine, but Chinese songs show as garbage. The same is true for Rockbox: if your default page code on Rockbox is Chinese, Chinese ID3s will show fine, but Arabic ID3s will show as garbage.
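The "garbage" effect is easy to reproduce. This is a rough Python sketch, not Rockbox code: the function name `display_as` is made up for this example, and `gb2312` and `cp1256` are Python's codec names for the Chinese and Arabic page codes.

```python
def display_as(text, tag_page_code, player_page_code):
    """Encode text in one page code, then decode it in another, the way a
    player set to the wrong default page code would read the tag."""
    raw = text.encode(tag_page_code)
    # errors="replace" substitutes a placeholder character for any byte
    # sequence that has no meaning in the player's page code.
    return raw.decode(player_page_code, errors="replace")


chinese = "音乐"    # "music", as stored in a GB2312 tag
arabic = "موسيقى"   # "music", as stored in a CP1256 tag

if __name__ == "__main__":
    print(display_as(chinese, "gb2312", "gb2312"))  # right page code
    print(display_as(chinese, "gb2312", "cp1256"))  # wrong page code
    print(display_as(arabic, "cp1256", "gb2312"))   # wrong page code
```

Only the first call gives back the original title. The other two print unrelated characters, because the same bytes are being read with the wrong table.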
There is a solution! Enter Unicode. Rather than encoding each tag in its language's own page code, such as encoding an Arabic tag in the Arabic page code or a Chinese tag in the Chinese page code, we encode ALL tags in Unicode, regardless of language. This way, you do not need to tell Rockbox (or Windows, or any other OS) which page code to use. Simply choose the font "Unifont" in Rockbox, and all the tags will show with no problem! You would then be able to play an Arabic song, followed by a Chinese one, then Greek, then Korean and so on, and all the tags would show properly, without even changing Rockbox's default page code!
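Here is a small Python sketch of the same idea, under the assumption that Python's `cp1256`, `gb2312`, `iso8859_7` and `euc_kr` codecs stand in for the Arabic, Chinese, Greek and Korean page codes. The helper names are invented for this example. It checks which encodings can hold a mixed playlist of titles.

```python
# "music" in Arabic, Chinese, Greek and Korean
TITLES = ["موسيقى", "音乐", "μουσική", "음악"]
LEGACY_PAGE_CODES = ["cp1256", "gb2312", "iso8859_7", "euc_kr"]


def can_encode(text, codec):
    """True if every character of text exists in the given encoding."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def covers_all(codec, titles=TITLES):
    """True if one encoding can store every title in the list."""
    return all(can_encode(title, codec) for title in titles)


if __name__ == "__main__":
    for codec in LEGACY_PAGE_CODES + ["utf-8"]:
        print(f"{codec:<10} covers all titles: {covers_all(codec)}")
```

None of the four legacy page codes can store all four titles, while UTF-8 stores every one of them, so a single setting works for the whole playlist.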
Drawbacks of using Unicode
The drawback of using Unicode-encoded ID3s is that not all PC MP3 players and DAPs support Unicode.
Solution for people who don’t like Unifont
The problem with displaying tags in foreign languages is that you have to use Unifont, which may not appeal to everyone. One solution is to use 6+12x13, which as mentioned above works well if you only need Japanese and Korean characters. A fairly large number of WPSs are compatible with the 6+12x13 font. Alternatively, you can create two different .cfg files on Rockbox. Make one for English music (or anything using ISO-8859-1) which has your favorite font (Snap, Chicago etc...) and your favorite WPS. Then have a second .cfg for "international" or "world" music, using Unifont and an appropriate WPS of your choice. Check also the list of Rockbox Unicode fonts.
stevenyu from Mistic River brought to my attention an H100 WPS optimized for use with Unifont. Go to the WpsGallery and scroll down to FrejBon's Uniskin. Hopefully we'll have more WPSs optimized for Unifont in the future.
Notice that the current song in the screenshots shows in Japanese, and the next song that shows at the bottom is in Korean.
Solution for playing MP3s with Unicode tags on PC
It seems that in Windows, Winamp does not support Unicode. Real Player doesn't support it either. However, Foobar 2000, Windows Media Player and iTunes all support Unicode. Even displaying Unicode ID3 tags in Explorer is supported. For Linux, I have read that Xmms doesn't handle Unicode very well (I haven't tried it myself), and that Rhythmbox and Amarok display Unicode fine.
I personally have been using Winamp for YEARS, almost a decade. I have Arabic, Chinese and Japanese music in my collection, and having to change the default page code and restart every time I decide to play different music is a serious pain. Actually, I stopped listening to most of my non-English music because of this. I will stop using Winamp for now, until they start supporting Unicode. Here are a bunch of useful links for you:
MP3 players supporting Unicode
MP3 tagging software supporting Unicode
- Mp3tag - Probably the best at the moment. Allows you to choose between Unicode or ISO-8859-1. To convert current tags to Unicode, simply make sure tag writing is set to Unicode, select all, and save the tags again.
- Unitagger - A simple Unicode tagging program.
- ID3-TagIT 3 - My personal favorite ID3 tagger. Might have fewer options than Mp3tag when it comes to Unicode.
- ID3iconv - A Java command-line tool. May be useful to some.
- foobar2000 - This player also allows tag editing, and can convert ISO-8859-1 encoded tags to Unicode.
- Music Tag Editor 1.2 and Mp3 Tag Assistant Professional 2.6 - Two programs you may or may not like. Just try them!
- EasyTag (≥ v1.99.9) - Supports Unicode tagging using id3lib. Linux/GTK+
Note: It won't hurt to use multiple tagging programs if none of them have all the features you want. Most likely I'll use Mp3tag for converting to Unicode, and continue to use ID3-TagIT 3 for tagging.
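For the curious, converting a tag to Unicode boils down to two steps: decode the old bytes with the page code they were written in, then write the text back in a Unicode encoding. The Python sketch below is only an illustration of that idea and is not how Mp3tag or any other tagger listed above is implemented. The function names are invented here, and the frame header (frame ID, size, flags) is left out. It relies on the ID3v2.3 convention that a text frame body starts with an encoding byte, where 0x01 means Unicode with a byte order mark.

```python
def recode_tag_text(raw, source_page_code):
    """Decode tag bytes written in a legacy page code into Unicode text."""
    return raw.decode(source_page_code)


def id3v2_unicode_payload(text):
    """Build the body of an ID3v2.3 text frame marked as Unicode:
    one encoding byte (0x01) followed by UTF-16 text with a byte order mark.
    The frame header is deliberately not included in this sketch."""
    return b"\x01" + text.encode("utf-16")


# Stand-in for the title bytes of an old tag written on a Cyrillic system.
legacy_bytes = "Музыка".encode("cp1251")

text = recode_tag_text(legacy_bytes, "cp1251")
payload = id3v2_unicode_payload(text)
```

The catch is that you have to know the source page code. If you decode with the wrong one, the garbage gets written into the Unicode tag permanently.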
You can Help! If you know other MP3 players, or tagging tools (for Windows or Linux) that support Unicode, let us know!
Using Unicode in WPS
Other than using Unicode for ID3 tags, you can also use it to customize your WPS. For example, you can have the words "Battery", "Next Song" or "Playing Now" written in Korean, Greek, Russian and so on. This makes it possible to localize the way your player looks. All you have to do is make sure your .wps file is saved and encoded in Unicode (UTF-8 or UTF-16), and that you use Unifont in Rockbox.
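If you generate or edit the file with a script rather than a text editor, the same rule applies: write it out as UTF-8. A minimal Python sketch follows. The file name `example.wps`, the `save_wps` helper and the label lines are placeholders made up for this example; a real WPS would also contain WPS tags.

```python
import os
import tempfile


def save_wps(path, lines):
    """Write the given lines to path, encoded as UTF-8.
    Python's "utf-8" codec does not add a byte order mark."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for line in lines:
            f.write(line + "\n")


# Placeholder labels only: "Battery" in Russian, "Next song" in Korean,
# "Now playing" in Greek.
LABELS = ["Батарея", "다음 곡", "Παίζει τώρα"]

# Written to a temporary folder here; on a real player the file would go
# wherever your WPS files live.
wps_path = os.path.join(tempfile.mkdtemp(), "example.wps")
save_wps(wps_path, LABELS)
```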
Text editors supporting Unicode
- Notepad - Maybe the simplest tool to use. Notepad in Windows 2000 and Windows XP supports encoding in UTF-8. After you click "Save As", make sure you select "UTF-8" as your encoding type at the bottom of the Save As window.
- Unicode and Multilingual Editors and Word Processors for Windows - A good site with links to various Editors supporting Unicode.
- Notepad2 - A simple Notepad alternative with support for syntax highlighting.
- Programmer's Notepad - A more advanced text editor.
- Notepad++ - Another advanced text editor.
- SC UniPad - A Unicode text editor featuring an on-screen soft keyboard.
- Vim & gVim - Another advanced text editor.
- PSPad - Yet another text editor that does a whole lot of things.
Note: Notepad in Windows 95, 98 and ME does not support Unicode.
RayyanFairaq - 10 Dec. 2005
Comments
SinjoPark: How about Vim?
JeongTaekIn: I say a good word for Notepad2.
MarcoenHirschberg: T-Matsuo added Unicode support to Winamp. Can someone verify that it works? (RayyanFairaq WARNING: This patch changes Winamp's UI into Japanese! Plus I don't think it fully supports Unicode, because it wasn't able to display a Unicode encoded tag with Arabic info, which Windows and iTunes had no problems reading. So this may only be a solution for CJK)
JeongTaekIn: It has a few problems. There is also a Winamp patch from Korea with better Unicode support than the Japanese one. Here And Here.
Copyright © 1999-2008 by the contributing authors.