Rockbox

  • Status Closed
  • Percent Complete
    100%
  • Task Type Bugs
  • Category Plugins
  • Assigned To No-one
  • Operating System Sansa e200
  • Severity Low
  • Priority Very Low
  • Reported Version Daily build (which?)
  • Due in Version Version 3.1
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by zephyr - 2008-09-09
Last edited by teru - 2009-07-13

FS#9387 - display error for Asian text

The viewer decodes text files incorrectly when they mix one-byte and two-byte characters. For example, when the encoding is set to GB2312, it always reads two bytes to form one character. This is not always correct: if the current character is an ASCII character, only one byte should be read.

viewer.c → get_ucs() (current)

  if ((prefs.encoding == SJIS && *str > 0xA0 && *str < 0xE0) || prefs.encoding < SJIS)
      return (unsigned char*)str+1;
  else
      return (unsigned char*)str+2;

I built a private build for my Sansa with the following change to make decoding correct for GB2312. It works well:

  if ((prefs.encoding == SJIS && *str > 0xA0 && *str < 0xE0) || 
       (prefs.encoding < SJIS) ||
        (prefs.encoding == GB2312 && *str <= 0x7F))
      return (unsigned char*)str+1;
  else
      return (unsigned char*)str+2;

Could someone work out a complete solution that also covers the other encoding schemes?

Closed by  teru
2009-07-13 13:03
Reason for closing:  Accepted
Additional comments about closing:

Committed in r21743. thanks!

GB2312 codepoints are always 2 bytes according to http://en.wikipedia.org/wiki/GB2312

What you describe seems to be EUC-CN as far as I can tell.

I think so. Two bytes are used to represent every character not found in ASCII.
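As a minimal sketch of the rule described above (ASCII takes one byte, everything else takes two), the following C code steps through EUC-CN text. The helper names `euc_cn_char_len` and `euc_cn_count_chars` are hypothetical and are not part of Rockbox; the sketch also assumes that any byte with the high bit set starts a two-byte character, without validating the trail byte.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Rockbox: returns how many bytes the
 * character starting at p occupies in EUC-CN encoded text. */
static size_t euc_cn_char_len(const unsigned char *p)
{
    /* ASCII range: one byte. Any byte with the high bit set is assumed
     * to be the lead byte of a two-byte GB2312 character. */
    return (*p < 0x80) ? 1 : 2;
}

/* Counts the characters in a NUL-terminated EUC-CN string. */
static size_t euc_cn_count_chars(const unsigned char *s)
{
    size_t n = 0;

    while (*s) {
        size_t len = euc_cn_char_len(s);

        /* Stop at a truncated two-byte sequence instead of reading
         * past the terminator. */
        if (len == 2 && s[1] == '\0')
            break;

        s += len;
        n++;
    }
    return n;
}
```

For a buffer such as `'a', 0xD6, 0xD0, 'b', 0xCE, 0xC4`, the pointer advances by 1, 2, 1, 2 bytes, so ASCII letters between Chinese characters no longer shift the decoding by one byte.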

Yuan, I confirmed that when the text includes ASCII, as you pointed out, get_ucs() returns the wrong position for the next character.

Because your correction does not work for KS X 1001 and Big-5, I created a patch file.
Please check it.
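One way to extend the one-byte check to every double-byte encoding, rather than to GB2312 alone, is sketched below. This is not the attached patch: the `enum encoding` values and the `char_width` function are hypothetical names chosen for illustration. The only properties taken from the original code are that single-byte encodings compare less than `SJIS` and that Shift JIS lead bytes between 0xA0 and 0xE0 (exclusive) are one-byte characters.

```c
/* Hypothetical enum for illustration only; the real Rockbox codepage
 * list differs. The original condition implies that single-byte
 * encodings sort before SJIS. */
enum encoding { ISO_8859_1, SJIS, GB2312, KSX1001, BIG5 };

/* Returns the number of bytes occupied by the character whose first
 * byte is lead, under the given encoding. */
static int char_width(enum encoding enc, unsigned char lead)
{
    /* Single-byte encodings always use one byte. */
    if (enc < SJIS)
        return 1;

    /* ASCII is one byte in every double-byte encoding listed here. */
    if (lead < 0x80)
        return 1;

    /* Shift JIS half-width katakana are single bytes. */
    if (enc == SJIS && lead > 0xA0 && lead < 0xE0)
        return 1;

    /* Otherwise assume a two-byte character (trail byte not validated). */
    return 2;
}
```

With this shape, the ASCII test no longer depends on the encoding, so KS X 1001 and Big-5 text with embedded ASCII advances by one byte in the same way as GB2312 text.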

I corrected my patch file.
Please check it.

I updated my patch file.

Please use this patch when you apply the patch files (FS#8445, FS#9546, FS#9853, FS#9855, FS#9892, FS#9893, FS#9898, FS#9902) for the Text viewer plugin.

Please apply the patches in this order: FS#9855, FS#9892, FS#9893, FS#9898, FS#9902, FS#9853, FS#9546, FS#8445, and then this task's patch.

If you are not applying those patch files, you do not need to apply this patch.

Synced to r21316.

Where's the patch?

Sorry, I forgot to upload my patch.
