HOWTO convert a wikipedia dump into the .wwa format

(for use with freqmod's and AdamGashlin's mediawiki viewer, FS#4755 – Wikipedia)


This is my first attempt at writing a guide, and in a foreign language too. So please be patient :-) and help to improve and complete it! You may contact me at simon at jso.be


Requirements:

- You will need to find a precompiled build for your device which includes the mww patch

OR

- be able to compile the patched version on your own. See "Detailed instructions on compiling" for help with this. It's really not difficult!

Prebuilt binaries:

If you know of an available build, please tell me. I will provide the links here.


Patching the source:

Get the latest patch from here: FS#4755 – Wikipedia

Unzip it, apply it to the current Rockbox source, compile Rockbox for your player and copy the build to the player.

You will now find a file .rockbox/rocks/viewers/mww.rock on your device.

Player-specific:

The patch is iPod-focused at the moment, so you might have to adapt the mww.diff file in a text editor before applying it. (Maybe someday this won't be necessary anymore.)

iAudio X5:

I had to replace BUTTON_MENU with BUTTON_MODE|BUTTON_REL in the mww.diff file.

Sansa E280:

Replace BUTTON_MENU with BUTTON_POWER in the mww.diff file. (Thanks framo!)
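
The button replacement above can also be done from the command line instead of a text editor. This is only a minimal sketch: the function name remap_menu_button is made up for this example, and it assumes that every occurrence of BUTTON_MENU in the patch should be replaced, as described above. Please look over the result before patching.

```shell
# Hypothetical helper (not part of the patch): replace BUTTON_MENU in a
# patch file with the button name for another player.
# The new name must not contain '/' or '&', which are special to sed.
remap_menu_button() {
    patch_file="$1"
    new_button="$2"
    # Keep the untouched patch as mww.diff.orig, then rewrite the patch.
    cp "$patch_file" "$patch_file.orig" &&
    sed "s/BUTTON_MENU/$new_button/g" "$patch_file.orig" > "$patch_file"
}
```

For the iAudio X5 this would be called as remap_menu_button mww.diff 'BUTTON_MODE|BUTTON_REL' (keep the quotes, otherwise the shell treats | as a pipe).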



Getting the converter software:

Get the converter.tar.gz from FS#4755 – Wikipedia

The archive is a gzipped tar file, so on my system I unpack it in 2 steps:

gzip -d converter.tar.gz

tar -xvf converter.tar
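
If you prefer a single step, the two commands above can be chained with a pipe. A minimal sketch, assuming your tar accepts the -C option (GNU and BSD tar do); the function name extract_tgz is made up for this example:

```shell
# Hypothetical helper: unpack a .tar.gz archive into a folder in one go.
# Usage: extract_tgz archive.tar.gz [destination_folder]
extract_tgz() {
    archive="$1"
    dest="${2:-.}"          # default: the current folder
    # gzip writes the plain tar stream to stdout, tar reads it from stdin.
    gzip -dc "$archive" | tar -xf - -C "$dest"
}
```

Unlike gzip -d, this leaves converter.tar.gz in place and does not create an intermediate converter.tar file.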

You won't have to compile the utilities; binaries for Linux and Windows are included. Do the Linux binaries run in a Mac console?

Getting the wikipedia dump:

Go to download.wikimedia.org

You want the xxwiki file, not the ...books, ...wiktionary or ...quote files! Choose the correct wiki by downloading the file beginning with your language's 2 letter code:

dewiki -> German

enwiki -> English

I suggest saving it in the new 'converter' folder.

Go and do what you have always been waiting for. Your computer will be busy for a looong time getting some zeros and ones...

The file is compressed. Unpacking it takes a while too and gives you a giant .xml file.
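
Unpacking the dump could look like the sketch below. It assumes the dump ends in .bz2 or .gz (check what you actually downloaded) and that bzip2 or gzip is installed; the function name unpack_dump is made up for this example.

```shell
# Hypothetical helper: unpack a compressed dump next to the original file.
# Usage: unpack_dump enwiki-20120314-pages-articles.xml.bz2
unpack_dump() {
    packed="$1"
    case "$packed" in
        *.bz2) bzip2 -dc "$packed" > "${packed%.bz2}" ;;
        *.gz)  gzip  -dc "$packed" > "${packed%.gz}" ;;
        *)     echo "unknown compression: $packed" >&2; return 1 ;;
    esac
}
```

The compressed file is kept, so make sure there is enough free disk space for both the download and the unpacked .xml file.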

The conversion process:

Go to the 'converter' folder and run the following 2 commands:

./xmlconv your_wikipedia_file.xml output_prefix

./btcreate output_prefix.wwt output_prefix.wwr output_prefix.wwi

I suggest using the format xxwiki-dumpdate for output_prefix.

Just to be sure: for an English wiki dump created on 14 March 2012, the commands look like this:

./xmlconv enwiki-20120314-pages-articles.xml enwiki-20120314

./btcreate enwiki-20120314.wwt enwiki-20120314.wwr enwiki-20120314.wwi

Obviously, if you didn't save the dump to the 'converter' folder, you will have to add the path to the dump's file name, or to the commands if you run them from another folder.
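
The two conversion commands and the suggested xxwiki-dumpdate prefix can be wrapped up as in the sketch below. The function names dump_prefix and convert_dump are made up for this example, and the sketch assumes the dump is named like xxwiki-dumpdate-pages-articles.xml and that you run it from the 'converter' folder.

```shell
# Hypothetical helper: derive "enwiki-20120314" from
# "path/to/enwiki-20120314-pages-articles.xml".
dump_prefix() {
    base=$(basename "$1")
    base=${base%.xml}               # drop the extension
    base=${base%-pages-articles}    # drop the dump type, if present
    printf '%s\n' "$base"
}

# Hypothetical helper: run both conversion steps for one dump.
# btcreate only runs if xmlconv finished without an error.
convert_dump() {
    dump="$1"
    prefix=$(dump_prefix "$dump")
    ./xmlconv "$dump" "$prefix" &&
    ./btcreate "$prefix.wwt" "$prefix.wwr" "$prefix.wwi"
}
```

With these defined, convert_dump ../downloads/enwiki-20120314-pages-articles.xml runs the same two commands as above and writes the output files into the current folder.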

In some cases (BIG dumps) there will be multiple, numbered .wwa files.

Upload the files:

To start Wikipedia, 'play' the .wwi file, which will then load the articles from the .wwa files.

If you have Horstschäfer's 'New stardict plugin' installed too, you might want to put the files in the /dicts folder. Then the dump will be available through the 'Dictionaries' button in the Root Menu.

You don't need the .wwr, .wws and .wwt files on the player.
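
Copying only the needed files could be done like this. It is a sketch under assumptions: the function name copy_wiki_files is made up, and it assumes that all .wwa files (numbered or not) start with the same output_prefix as the .wwi file.

```shell
# Hypothetical helper: copy the .wwi file and all matching .wwa files
# to a folder on the player, leaving .wwr, .wws and .wwt behind.
# Usage: copy_wiki_files output_prefix destination_folder
copy_wiki_files() {
    prefix="$1"
    dest="$2"
    mkdir -p "$dest" &&
    cp "$prefix.wwi" "$prefix"*.wwa "$dest"/
}
```

For example copy_wiki_files enwiki-20120314 /media/player/dicts, where /media/player stands for wherever your player is mounted.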





author musician72

simon at jso.be

version 0.1

date 06.10.07