Rockbox mail archive
Subject: roadmap to blind Rockboxing (need an audio amateur/speaker)
From: [IDC]Dragon (idc-dragon_at_gmx.de)
There have been some understandable requests for operating Rockbox blindly.
This is of course useful to handicapped people, but also in a car.
As a developer, I have the following picture in mind:
Speech synthesis is not an easy option; it is way too complex. But we already
can play mp3 audio. Recently I've added playing from memory, mainly for my
video plugin, but also with an eye on UI support. My suggestion would be to make
little mp3 clips for the language IDs, then have some script tie them all
together to one big (bitswapped) file which includes an index. This could even
be a part of the build process. This file has to fit into the mp3 buffer,
about 1.6 MByte on regular 2 MB boxes. Currently we have 335 entries; if each
takes a one-second clip, that means we can use up to 40 kBit/s. More if we take
into account that not all language IDs are needed. Voice is mono by nature; 22
or 16 kHz should be sufficient.
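
As a rough check of the bitrate figure above, here is a small Python sketch. It assumes that "1.6 MByte" means binary megabytes and that every clip lasts exactly one second; it ignores the space taken by the index. The function name is made up for illustration.

```python
# Rough bitrate budget for the voice file, using the figures from the text.
BUFFER_BYTES = 1.6 * 1024 * 1024   # about 1.6 MByte of mp3 buffer (binary MB assumed)
ENTRIES = 335                      # current number of language IDs
CLIP_SECONDS = 1.0                 # assumed average clip length


def max_bitrate(buffer_bytes, entries, clip_seconds):
    """Highest bitrate (bit/s) at which all clips still fit into the buffer."""
    total_seconds = entries * clip_seconds
    return buffer_bytes * 8 / total_seconds


budget = max_bitrate(BUFFER_BYTES, ENTRIES, CLIP_SECONDS)
print(f"{budget / 1000:.1f} kbit/s")
```

With these assumptions the result lands at roughly 40 kbit/s, which matches the estimate. Shorter clips or fewer voiced IDs raise the usable bitrate accordingly.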
OK, once the big file is there, it needs to be loaded into the mp3 buffer on
startup and every time you stop your music, because the normal playback
trashes it. Loading this takes less than a second (excluding spinup) with my
recent speedups. The UI can talk only while we're not playing (or
recording); that is the limitation.
I would implement the changes to the menu/screen code to play what's under
the cursor, if somebody else does the authoring of the clips. I am no audio
amateur (don't even have something that deserves the name microphone), and you
don't want to hear _me_ talking out of your box. This needs some stamina and
reproducibility, since more clips will be needed as Rockbox evolves.
Some help on the scripting would also be appreciated.
Any takers, can we form a team?
We need nicely trimmed clips of what's in "english.lang". I suggest naming
the clips like the LANG_xx ID, e.g. "LANG_DELETE.mp3". This makes it easy for
a script. They should not contain any ID tags, don't waste space on this. I
suggest keeping the original .wav's, so we can re-encode with a sample rate /
bit rate that just fits. The clips should have a little fade in/out, to avoid
clicks, but no silence at the beginning or end, to avoid latencies. Maybe there
is some kind of push-to-talk tool that records and does the trimming and
fading more or less automatically?
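
The naming scheme above makes the packing step easy to script. The following Python sketch shows one possible way to do it. The file layout (entry count, then offset/size pairs, then the audio data) is purely hypothetical and not an agreed format, and it assumes that "bitswapped" means reversing the bit order within each byte. The list of IDs is passed in by the caller, in the order the firmware would expect.

```python
import struct
from pathlib import Path

# Lookup table that reverses the bit order of a byte (0b00000001 -> 0b10000000).
# Assumption: this is what "bitswapped" refers to.
BITSWAP = bytes(int(f"{b:08b}"[::-1], 2) for b in range(256))


def build_voice_file(clip_dir, lang_ids, out_path):
    """Pack LANG_xx.mp3 clips into one file: entry count, index, then audio data.

    Hypothetical layout (all little-endian 32-bit values):
      count, then one (offset, size) pair per ID; missing clips get (0, 0).
    """
    clips = []
    for lang_id in lang_ids:
        path = Path(clip_dir) / f"{lang_id}.mp3"
        clips.append(path.read_bytes() if path.exists() else b"")

    header_size = 4 + 8 * len(clips)
    index = bytearray(struct.pack("<I", len(clips)))
    offset = header_size
    for data in clips:
        if data:
            index += struct.pack("<II", offset, len(data))
            offset += len(data)
        else:
            index += struct.pack("<II", 0, 0)

    # Only the audio payload is bit-swapped; the index stays readable as-is.
    payload = b"".join(clips).translate(BITSWAP)
    Path(out_path).write_bytes(bytes(index) + payload)
    return offset  # total file size in bytes
```

A call such as `build_voice_file("clips", ["LANG_DELETE"], "voice.bin")` returns the total size, which can then be compared against the buffer limit. IDs without a clip simply get an empty index entry, so clips can be added gradually.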
Plus, this would be a good time to merge the "classic" talkbox patch, if we
want this in the main cvs.
And I have in mind to delay the file opening for recording until the buffer
runs full for the first time. This delays the disk spinup and makes the
internal mic useable for clips that are shorter than the buffer.
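
The deferred file opening could look roughly like the sketch below. It is written in Python only to show the idea; the class and method names are hypothetical and this is not the actual recording code. Data is collected in RAM, and the output file is opened only when the buffer fills up for the first time or when recording stops.

```python
class DeferredRecorder:
    """Buffer recorded data in RAM; open the output file as late as possible."""

    def __init__(self, path, buffer_size):
        self.path = path
        self.buffer_size = buffer_size
        self.buffer = bytearray()
        self.file = None  # not opened yet, so no disk access so far

    def write(self, chunk):
        """Add recorded data; touch the disk only once the buffer is full."""
        self.buffer += chunk
        if len(self.buffer) >= self.buffer_size:
            self._flush()

    def _flush(self):
        if self.file is None:
            self.file = open(self.path, "wb")  # first disk access happens here
        self.file.write(self.buffer)
        self.buffer.clear()

    def stop(self):
        """Write out what is left; for short clips this is the first disk access."""
        if self.buffer or self.file is None:
            self._flush()
        self.file.close()
        self.file = None
```

For a clip shorter than the buffer, `write()` never opens the file, so the disk is first accessed in `stop()`, after the recording itself has finished.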
(Phew, when will I ever complete all my other stuff: the JPEG viewer, the
all-in-one video converter tool, my car stereo integration, remote control
Page was last modified "Jan 10 2012" The Rockbox Crew