- Status Closed
- Percent Complete
- Task Type Bugs
- Category Plugins
- Assigned To No-one
- Operating System Sansa e200
- Severity Low
- Priority Very Low
- Reported Version Daily build (which?)
- Due in Version Version 3.1
- Due Date Undecided
- Votes
- Private
FS#9387 - display error for Asian text
The viewer decodes text files incorrectly when they mix one-byte and two-byte characters. For example, when the encoding is set to GB2312, it always reads two bytes to form one character. This is not always true: if the current character is an ASCII character, only one byte should be read.
viewer.c → get_ucs() (current)
if ((prefs.encoding == SJIS && *str > 0xA0 && *str < 0xE0) || prefs.encoding < SJIS) return (unsigned char*)str+1; else return (unsigned char*)str+2;
I just made a private build for my Sansa with the following change, which makes decoding correct for GB2312. It works well:
if ((prefs.encoding == SJIS && *str > 0xA0 && *str < 0xE0) || (prefs.encoding < SJIS) || (prefs.encoding == GB2312 && *str <= 0x7F)) return (unsigned char*)str+1; else return (unsigned char*)str+2;
Could someone work out a complete solution that also covers the other encoding schemes?
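As an illustration of what a more general rule might look like, here is a minimal sketch. The enum and the function name are hypothetical and are not the plugin's real identifiers. It assumes that in GB2312 (EUC-CN), KS X 1001 and Big-5 text any byte below 0x80 is a single-byte ASCII character, and that in SJIS the bytes 0xA1 to 0xDF are single-byte half-width katakana. It does not check whether a trailing byte is actually present in the buffer.

```c
#include <stddef.h>

/* Hypothetical encoding tags for this sketch only; not the plugin's enum. */
enum sketch_encoding {
    ENC_SINGLE_BYTE,   /* any one-byte code page */
    ENC_SJIS,
    ENC_GB2312,
    ENC_KSX1001,
    ENC_BIG5
};

/* Return how many bytes the character starting at str occupies (1 or 2). */
static size_t sketch_char_len(enum sketch_encoding enc, const unsigned char *str)
{
    unsigned char b = *str;

    /* One-byte code pages, and ASCII in every multi-byte encoding. */
    if (enc == ENC_SINGLE_BYTE || b < 0x80)
        return 1;

    /* SJIS half-width katakana (0xA1..0xDF) are also one byte. */
    if (enc == ENC_SJIS && b > 0xA0 && b < 0xE0)
        return 1;

    /* Assumption: everything else is the lead byte of a two-byte character. */
    return 2;
}
```

A caller would advance its pointer by the returned length, for example `str += sketch_char_len(enc, str);`.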
Closed by teru
2009-07-13 13:03
Reason for closing: Accepted
Additional comments about closing:
Committed in r21743. thanks!
GB2312 codepoints are always 2 bytes according to http://en.wikipedia.org/wiki/GB2312
What you describe seems to be EUC-CN as far as I can tell.
I think so. Two bytes are used to represent every character not found in ASCII.
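To make the effect concrete, here is a small sketch that counts the characters in an EUC-CN buffer in two ways: by always consuming two bytes, as described in the report, and by consuming one byte for ASCII and two otherwise. The function names are hypothetical, and the second function assumes that every byte of 0x80 or above starts a two-byte character.

```c
#include <stddef.h>

/* Count characters by always consuming two bytes per character. */
static size_t count_fixed_two(const unsigned char *buf, size_t len)
{
    size_t n = 0, i = 0;

    (void)buf; /* the byte values are never inspected in this variant */
    while (i < len) {
        i += 2;
        n++;
    }
    return n;
}

/* Count characters, consuming one byte for ASCII and two otherwise. */
static size_t count_euc_cn(const unsigned char *buf, size_t len)
{
    size_t n = 0, i = 0;

    while (i < len) {
        /* A lone lead byte at the very end is counted as one character. */
        i += (buf[i] < 0x80 || i + 1 >= len) ? 1 : 2;
        n++;
    }
    return n;
}
```

For the four bytes `'A', 0xD6, 0xD0, 'B'` (an ASCII letter, one two-byte character, another ASCII letter), the first function pairs `'A'` with 0xD6 and 0xD0 with `'B'`, so every character after the first ASCII byte is misaligned. The second function finds the three characters that are actually there.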
Yuan, I confirmed that when the text includes ASCII, as you pointed out, get_ucs() computes the wrong position for the next character.
However, your correction does not work for KS X 1001 or Big-5, so I have created a patch file.
Please check it.
I have corrected my patch file.
Please check it.
I have updated my patch file.
Please use this patch when you apply the patch files (FS#8445, FS#9546, FS#9853, FS#9855, FS#9892, FS#9893, FS#9898, FS#9902) for the Text viewer plugin. Please apply the patches in this order: FS#9855, FS#9892, FS#9893, FS#9898, FS#9902, FS#9853, FS#9546, FS#8445, and then this task's patch. If you do not apply those patch files, this patch does not need to be applied.
Synced to r21316.
Where’s the patch?
Sorry, I forgot to upload my patch.