Rockbox mail archive

Subject: Conversion script for German-English and Spanish-German dictionaries

From: Zeno Gantner <zeno.gantner_at_web.de>
Date: 2006-06-04

Hello everybody,

At the moment, I do not use Rockbox, but it certainly looks cool ;-).

However, I am interested in free dictionaries and getting them running on
every possible platform.

I wrote a small script that can be used to convert files from the format used
by Frank Richter's Linux dictionary program, ding, to the Rockbox dictionary
format.
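
For readers who want a picture of what such a conversion involves, here is a minimal sketch in Python. It is not the script announced in this mail. It assumes that a ding line has the form "headword :: translation", that comment lines start with "#", and that the Rockbox dictionary source is one "word<TAB>description" entry per line. The function name ding_to_rockbox is made up for the illustration.

```python
def ding_to_rockbox(lines):
    """Turn ding-style lines into tab-separated 'word<TAB>description' lines."""
    out = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        # Assumed ding layout: headword and translation separated by '::'.
        if "::" not in line:
            continue
        word, _, description = line.partition("::")
        out.append(f"{word.strip()}\t{description.strip()}")
    return out


# Hypothetical sample input, not taken from any of the word lists.
sample = ["# comment", "Haus {n} :: house", "", "casa {f} :: Haus {n}"]
converted = ding_to_rockbox(sample)
for entry in converted:
    print(entry)
```

Lines without a "::" separator are dropped silently here; a real converter would probably want to report them instead.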

I wrote it primarily to be able to convert the Spanish-German word list that I
am maintaining, but there are also other word lists it can be applied to:

http://www.studentendorf-vauban.de/~zeno/es-de
http://wftp.tu-chemnitz.de/pub/Local/urz/ding/de-en/de-en.txt.gz
http://download.ferheng.org/german-turkish.txt
(German-Turkish, German-Kurdish, Turkish-Kurdish, English-Kurdish, Swedish-Sorani, ...)

The word lists mentioned are all distributable under the GPL, so it should be no
problem to package them together with Rockbox.

I still have some questions, and there is one thing I am not satisfied with concerning
the conversion script:
1. http://www.rockbox.org/twiki/bin/view/Main/RockboxDictionary says all words should
    be lower case. I have added a command line parameter called --lowercase to accomplish
    this (be sure to run the program in an environment where a suitable locale is set).
    I wonder, however, whether this is really necessary. Isn't it possible to display
    the words as they should be displayed, and to have a case-insensitive search?
2. The same page says the file must be sorted alphabetically. What is the exact sort
    ordering that is expected? There are several possibilities, e.g. proper alphanumerical
    ordering vs. ordering by ASCII value of the characters.
3. (the one thing I am not satisfied with)
    The lower case conversion does not seem to work for UTF-8 files. I also tried Perl's utf8
    pragma, but it didn't help. If anyone has experience with Perl and UTF-8, please step in ;-)
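
The three points above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the word list is invented, the helper name matches is hypothetical, and nothing here says which ordering Rockbox actually expects. It shows that lower-casing has to happen on decoded text rather than on raw UTF-8 bytes, how a case-insensitive comparison could keep the original spelling for display, and what ordering by character value does to non-ASCII letters.

```python
# Invented sample words, not taken from any of the word lists.
words = ["Zebra", "Äpfel", "apfel", "Über", "zoo"]

# Lower-casing works on decoded text (str), not on raw UTF-8 bytes.
raw = "Äpfel".encode("utf-8")
assert raw.lower() == raw                      # bytes.lower() only touches ASCII letters
assert raw.decode("utf-8").lower() == "äpfel"  # decoded text is lower-cased correctly

lowered = [w.lower() for w in words]


def matches(query, headword):
    """Case-insensitive comparison that leaves the stored headword untouched."""
    return query.casefold() == headword.casefold()


# Ordering by code point (a plain sort): non-ASCII letters land after 'z'.
by_codepoint = sorted(lowered)

# Ordering by UTF-8 byte value; for valid UTF-8 this matches code point order.
by_bytes = sorted(lowered, key=lambda w: w.encode("utf-8"))

print(by_codepoint)
```

A locale-aware ordering (for example via locale.strxfrm) would instead be expected to place "äpfel" next to "apfel", but its result depends on the locales installed on the machine, so it is left out of the sketch.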

If you like the conversion script, feel free to add it to your CVS; it is under GPL 2 or later.

If you have any suggestions or questions, just send me an email.

With kind regards,
  Zeno

Received on Sun Jun 4 19:56:23 2006
