• Status: Closed
• Percent Complete
• Task Type: Patches
• Category: User Interface → Language
• Assigned To: No-one
• Operating System
• Severity: Low
• Priority: Very Low
• Reported Version
• Due in Version: Undecided
• Due Date: Undecided
• Votes
• Private
Attached to Project: Rockbox
Opened by phaedrus961 - 2005-08-24
Last edited by marcoen - 2005-12-06

FS#2649 - Unicode patch

This patch is by no means complete, but I wanted to
upload it so that others can test/fix/improve etc. It
was originally written by Marcoen Hirschberg and has
been updated by me. There are still many things which
need implementing/fixing, but for me it is quite
usable. The utf8gen script is used to convert the lang
files to utf8.

Things that currently work:
1) Display of unicode filenames, id3 and vorbis tags.
2) Writing of unicode filenames.
3) Font caching.*
4) Basic BiDi support.
5) Selectable codepage conversions (no CJK yet).

Things that don’t work:
1) Player doesn’t display unicode strings properly.
2) Doesn’t compile for V2 or FM Recorders due to 200k
size limit (hopefully this will change when the code
cleanup is complete).
3) Virtual keyboard does funny things with chars above
4) Text viewer only supports utf8.

There are probably some more things I forgot.

To use this, you will need an ISO-10646 font like those
ISO-8859-1 fonts will also work, but you will only get
latin1.

*Font caching was taken from the Chinese patch, which
was written by Tat Tang and updated by Tenry Fu and
myself. Although it works fine, at least one font
crashes my archos and I don’t yet know why. Strangely,
it works fine in the sim.

Jens suggested loadable codepage conversion tables
and I think this is a good idea, or rather necessary if
we want conversions for CJK. I have some ideas how to
do this, but currently no time to do it.

Closed by  marcoen
2005-12-06 15:30
Reason for closing:  

Very cool, looking forward to this getting more work (Might
unicode be an additional goal for 2.(5+1) besides iriver
support?). I’ll gladly test this, but unassigning myself
from it since I have absolutely no idea about the
implementation/code/rockbox details of it.

Updated patch to work with utf8 id3 tags.

Updated to latest CVS and added cjk codepage conversions.
Codepage tables are now loaded from disk to reduce binary size.

Current patch is now close to completion, I hope. Most
everything is now working. The Player and the virtual
keyboard now handle utf8 strings correctly. I’ve added a
feature to save a list of loaded glyphs at shutdown and
reload them at boot. I’ve also added a perl script to
convert any bdf font to iso10646 encoding using the mapping
files from

Changed code so that the glyph list is saved in LRU order.

Finally fixed the crash on archos :) The patch description
is now way out of date. Only thing remaining to be done is
conversions for the text viewer. Everything else is working
flawlessly for me.

Code cleanup and optimization.

A few more optimizations. Also (somewhat) fixed the half
screen problem in the text viewer.

Fixed a bug with fonts that have very large glyphs and began
adapting the text viewer to utf8.

Updated to latest cvs.

Added Arabic joining and updated to latest cvs

Anonymous Submitter commented on 2005-11-19 02:49

Why are you using this formula to count the number of UTF-8
bytes for a UCS character?

——– firmware/common/unicode.c, Line 93 ——–

  if (ucs > 0x7F)
      while (ucs >> (6*tail + 2))

——– end of code ——–

It makes U-100 - U-7FF get stored in 3 bytes instead of 2
and U-3FFF - U-FFFF in 4 bytes instead of 3.

Maybe this will be better? :)

——– start of code ——–

      while (ucs >> (5*tail + 6))

——– end of code ——–
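
To make the difference between the two shifts easier to check, here is a minimal sketch in C. It is not the code from firmware/common/unicode.c: the function names are hypothetical, and it assumes that `tail` starts at 0 and that the loop body is simply `tail++`, since only the loop conditions are quoted above. The encoder at the end is likewise only an illustration of how the tail count would be used.

```c
#include <stdint.h>

/* Tail (continuation) byte count using the shift quoted from the patch.
   ASSUMPTION: tail starts at 0 and the loop body is tail++. */
int tail_count_quoted(uint32_t ucs)
{
    int tail = 0;
    if (ucs > 0x7F)
        /* widen to 64 bits so a shift of 32 is well defined */
        while ((uint64_t)ucs >> (6 * tail + 2))
            tail++;
    return tail;
}

/* Tail byte count using the shift suggested in the comment.
   A sequence with n tail bytes carries 5*n + 6 payload bits
   (11, 16, 21, 26, 31), so we stop once ucs fits in that many bits. */
int tail_count_suggested(uint32_t ucs)
{
    int tail = 0;
    if (ucs > 0x7F)
        while ((uint64_t)ucs >> (5 * tail + 6))
            tail++;
    return tail;
}

/* Hypothetical encoder built on the suggested count.
   Writes the UTF-8 bytes to out (at least 6 bytes) and returns the length. */
int utf8_encode(uint32_t ucs, unsigned char *out)
{
    int tail = tail_count_suggested(ucs);
    int i;

    if (tail == 0) {
        out[0] = (unsigned char)ucs;      /* plain ASCII */
        return 1;
    }

    /* lead byte: tail+1 one-bits, a zero bit, then the top payload bits */
    out[0] = (unsigned char)((0xFFu << (7 - tail)) | (ucs >> (6 * tail)));

    /* each tail byte is 10xxxxxx carrying the next 6 payload bits */
    for (i = 1; i <= tail; i++)
        out[i] = (unsigned char)(0x80 | ((ucs >> (6 * (tail - i))) & 0x3F));

    return tail + 1;
}
```

Under those assumptions, the quoted shift gives two tail bytes (three bytes total) for U-100, while the suggested shift gives one tail byte up to U-7FF and two up to U-FFFF, which matches the standard UTF-8 ranges.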

tfact commented on 2005-11-21 02:07

Updated to latest cvs.

tfact commented on 2005-11-21 05:20

Updated to latest cvs
and merged remote keyboard at

