Difference: RockboxDictionary (r24 vs. r23)

Dictionary

Note: this is still a work in progress, developers only.

Todo
The most interesting xxx2rdf converter would be one for the Dict format, as there are a lot of free dictionaries available in that format.

Creating a dictionary file

1. Download the Prolog version of the WordNet dictionary here: http://wordnetcode.princeton.edu/3.0/WNprolog-3.0.tar.gz
2. Extract wn_g.pl and wn_s.pl from it.
3. Run "make" in the svn tools directory.
4. Put wn2rdf.pl and the rdf2binary tool from the tools directory in a directory with wn_g.pl and wn_s.pl, and execute wn2rdf.pl.
5. Execute the rdf2binary tool; it will output dict.desc and dict.index.
6. Copy dict.desc and dict.index to

The rockbox dictionary format

The input format for rdf2binary is very simple at the moment: one line per word, starting with the word, then a tab, then the description. When creating these files, be aware that the entries must be in alphabetical order and all words must be in lowercase.

The binary format

The binary format used for the index is pretty simple; the struct looks like this:
struct {
    char word[WORDLEN];  /* the indexed word */
    long offset;         /* offset into dict.desc where the description is stored */
};
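To make the two formats concrete, here is a minimal sketch in C of how one input line (word, tab, description) could be turned into an index record, and how a sorted array of such records could be searched. This is not the actual rdf2binary or plugin code. The struct name index_entry, the function names parse_rdf_line and lookup_offset, and the WORDLEN value of 32 are assumptions made for illustration, and the binary search is only a guess at why the input has to be in alphabetical order.

```c
#include <assert.h>
#include <string.h>

#define WORDLEN 32 /* assumed value; the real define is in rdf2binary and the plugin */

/* Same layout as the struct above; the name is hypothetical. */
struct index_entry {
    char word[WORDLEN];
    long offset;
};

/* Split one input line of the form "word<TAB>description" and fill an
 * index entry. desc_offset is where the description would start in
 * dict.desc. Returns a pointer to the description inside line, or NULL
 * if the line has no tab or the word is empty or too long. */
const char *parse_rdf_line(const char *line, long desc_offset,
                           struct index_entry *out)
{
    const char *tab;
    size_t len;

    assert(line != NULL && out != NULL);
    tab = strchr(line, '\t');
    if (tab == NULL)
        return NULL;
    len = (size_t)(tab - line);
    if (len == 0 || len >= WORDLEN)
        return NULL;
    memset(out->word, 0, WORDLEN);   /* zero padding keeps the word terminated */
    memcpy(out->word, line, len);
    out->offset = desc_offset;
    return tab + 1;
}

/* Binary search over entries sorted alphabetically by word.
 * Returns the stored description offset, or -1 if the word is absent. */
long lookup_offset(const struct index_entry *index, size_t count,
                   const char *word)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strncmp(word, index[mid].word, WORDLEN);
        if (cmp == 0)
            return index[mid].offset;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;
}
```

A caller would run parse_rdf_line over each input line in order, advancing desc_offset by the length of each description plus its line terminator, and later pass a lowercase query word to lookup_offset. Because the records have a fixed size, the same search could in principle seek directly within dict.index instead of loading it into memory.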
WORDLEN is a define in both the rdf2binary tool and the plugin. The offset is the position in dict.desc where the description is stored.

The improved binary format

This is still an idea under construction, but the new format would be just one file containing:
After that there should be the index data:
And then just plain text description data, one description per line.

The hash binary format

Header:
Offset table:
Hash table:
When searching for a word with hash X, the plugin looks up the offsets for X and X+1 in the offset table. It reads the data between those offsets and looks for the word being searched for. It's just a hash table with chaining.

Sources for dictionary files

There is nearly everything needed for a German<->English single-word translator at http://dict.tu-chemnitz.de

For those who don't have the tools for compiling their own dictionary files, you can download them from http://www.rockbox.dreamhosters.com/dict.zip (6.6MB). If you want to download the two parts separately, get http://www.rockbox.dreamhosters.com/dict.desc (17MB) and http://www.rockbox.dreamhosters.com/dict.index (5.0MB).

-- PeterOlson - 12 Nov 2006

r24 - 19 Feb 2010 - 18:28:46 - AndrewEngelbrecht
Revision r24 - 19 Feb 2010 - 18:28 - AndrewEngelbrecht
Revision r23 - 09 Feb 2009 - 19:35 - MaurusCuelenaere
Copyright © by the contributing authors.