Rockbox mail archive
Subject: XML for language files
From: Jonas Häggqvist <rasher_at_rasher.dk>
Date: Thu, 20 Sep 2007 02:46:25 +0200

After looking at genlang (and running away in terror), handling the Danish translation for a while and committing quite a few language patches, I started wondering about the source language file format that Rockbox uses (that is, not the binary format used on the player).

The current format was introduced/discussed [1] in June 2006, where Daniel Stenberg proposed the current format in relation to the langv2 rework and a small-scale flamewar about using an XML format erupted. In the end, I think Daniel got tired of arguing without getting anywhere and implemented the current format.

There are no major problems with it, but I still think moving it gently to XML would have some benefits:

- genlang should become a lot simpler to modify since all the parsing would be done externally.
- Similarly, the actual code should also be simpler, since it just consists of walking the tree and comparing node/attribute values.
- The parser would be more robust to subtle syntax differences between english.lang and the translations. It'd either work as expected, or break loudly rather than producing unexpected results, possibly silently. At least it'd be harder to write a valid XML file that still broke genlang.
- It'd be somewhat easier to produce one-off scripts in whatever language to edit language files since you don't have to write a parser first.
- It should be fairly easy to write out a schema file and create an SVN pre-commit hook that validates it and rejects the commit if it's broken.
- My own pet-project, the online translator [2] would also benefit for many of the above reasons :)

I've already written a langv2toxml script (in Perl, using XML::LibXML which is rather nice) which produces output which looks reasonable to me [3]. The actual syntax is open for debate of course (and please do comment on it if you think it could be better), but I've tried modelling it very closely to the current syntax, which does work pretty well.
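To make the "walking the tree and comparing node/attribute values" point a bit more concrete, here is a rough sketch of what a one-off script could look like once the parsing is left to an XML library. It is written in Python with the standard library rather than Perl and XML::LibXML, and it is not the langv2toxml script. The element and attribute names (`language`, `phrase`, `string`, `id`, `target`) and the sample phrases are assumptions made up for illustration; they are not the syntax shown in [3].

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, loosely modelled on the current
# <phrase>/<source>/<dest> blocks. Sample data is for illustration only.
ENGLISH = """\
<language name="english">
  <phrase id="LANG_SET_BOOL_YES">
    <source><string target="*">Yes</string></source>
    <dest><string target="*">Yes</string></dest>
  </phrase>
  <phrase id="LANG_SET_BOOL_NO">
    <source><string target="*">No</string></source>
    <dest><string target="*">No</string></dest>
  </phrase>
</language>
"""

DANISH = """\
<language name="dansk">
  <phrase id="LANG_SET_BOOL_YES">
    <source><string target="*">Yes</string></source>
    <dest><string target="*">Ja</string></dest>
  </phrase>
</language>
"""


def load_phrases(xml_text):
    """Map phrase id -> {target: translated string} by walking the tree."""
    # ET.fromstring raises ET.ParseError on malformed input, so a broken
    # file fails loudly instead of being half-parsed.
    root = ET.fromstring(xml_text)
    phrases = {}
    for phrase in root.iter("phrase"):
        dest = {
            s.get("target"): (s.text or "")
            for s in phrase.findall("./dest/string")
        }
        phrases[phrase.get("id")] = dest
    return phrases


def missing_phrases(english_xml, translation_xml):
    """Return ids present in the English file but absent from the translation."""
    english = load_phrases(english_xml)
    translated = load_phrases(translation_xml)
    return [pid for pid in english if pid not in translated]


if __name__ == "__main__":
    print(missing_phrases(ENGLISH, DANISH))
```

The script itself contains no parsing code: a missing closing tag makes the library raise an error up front, and the comparison is just dictionary lookups on ids and attributes.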
Of course, there are drawbacks as well:

- The file is somewhat harder to read and modify. I won't dispute that this is true, but I don't think it's really a huge difference. This is probably the most important problem.
- The files are about 30% larger. I don't think this is really a problem.
- Other things that I want you to tell me, because I can't think of more.

I admit I'm probably biased from having already written one parser (used on my website) and wondering why I couldn't just use an XML parser. It wasn't *that* hard, but I know the parser is making some assumptions about the file that might not hold true after a translator has had his dirty hands on it.

Maybe I just need to take a step back and breathe deeply, but if no one objects too much, I'll create a new genlang to work with whatever XML schema is worked out (and also have a go at creating an XML schema file to validate against).

Remember, I'm not talking about the format used on the player, but the source format - I'm not *that* insane.

[1] http://www.rockbox.org/mail/archive/rockbox-archive-2005-06/0363.shtml
[2] http://rasher.dk/rockbox/translate/
[3] http://rasher.dk/rockbox/wallisertitsch.xml

-- 
Jonas Häggqvist
rasher(at)rasher(dot)dk

Received on 2007-09-20