Rockbox mail archive

Subject: a solution for voice clips (talkbox/UI)
From: [IDC]Dragon (idc-dragon_at_gmx.de)
Date: 2004-02-28


Hello,

The topic of blind usage brought me into the field of text-to-speech
converters.

I learned about the Microsoft Speech API (SAPI), which makes it very easy to
program talking applications, although its voices sound pretty crappy. I have
written a first command line program. All it does is "speak" the text of the
first command line argument into a wav file. This can be useful for making
Talkbox clips in an automated way, recursively across the whole disk if you
like, provided somebody makes a Perl script for it (hint, wink).
My program is here:
http://joerg.hohensohn.bei.t-online.de/archos/speech/speak
I don't know if this runs on a plain standard Windows box; you may need to
install a SAPI runtime:
http://www.microsoft.com/speech/download/sdk51/
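The recursive run hinted at above could be sketched roughly like this (in
Python rather than Perl). This is only a sketch under assumptions: the mail
only says that the program speaks its first argument into a wav file, so
passing the output file as a second argument is hypothetical, and so are the
executable name speak.exe and the wav file names chosen here.

```python
import os
import subprocess


def plan_clips(root):
    """Yield (text_to_speak, wav_path) for every directory and file below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk in a predictable order
        for name in dirnames:
            # Placeholder name for the clip that announces a directory.
            yield name, os.path.join(dirpath, name, "_dirname.wav")
        for name in sorted(filenames):
            if name.lower().endswith((".wav", ".talk")):
                continue  # skip clips left over from an earlier run
            text = os.path.splitext(name)[0]  # speak the name without extension
            yield text, os.path.join(dirpath, name + ".wav")


def speak_all(root, speak_exe="speak.exe", dry_run=True):
    """Run the speech program once per clip, or only print the commands."""
    # Collect the plan first so freshly written wav files cannot disturb the walk.
    for text, wav_path in list(plan_clips(root)):
        # ASSUMPTION: the output file is given as the second argument.
        cmd = [speak_exe, text, wav_path]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling speak_all("E:\\") with the default dry_run=True only prints the
commands, which is a cheap way to check the plan before anything is written.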

The best speech is done by AT&T Natural Voices. I have no program which uses
it, but there's an interactive web demo:
http://www.naturalvoices.att.com/demos/
Just an idea: would it be possible to abuse this demo and harvest the wav
files it generates in an automated way? I have no web programming skills, so
maybe somebody else can look behind the scenes.

First I played with the eval version of TextAloud. It talks better, but is
not made to be scripted. With a trick I got it to render all Rockbox language
strings as wav files. The program remembers "articles" to be spoken, so I
made many little articles from the strings. In the program directory there is
a last-settings file, username_TAExport.txt, which I modified like this:
||||||TextAloud MP3 Start of Article||||||
LANG_SOUND_SETTINGS|Mary|1
Sound Settings
||||||TextAloud MP3 End of Article||||||
||||||TextAloud MP3 Start of Article||||||
LANG_GENERAL_SETTINGS|Mary|1
General Settings
||||||TextAloud MP3 End of Article||||||
and so forth. When playing this into separate files I get a bunch of wavs,
which lame can then nicely encode. The lame integrated in TextAloud is too
restricted: it won't do VBR and has no quality settings. So now I have all
the UI clips. I have placed the first shot here:
http://joerg.hohensohn.bei.t-online.de/archos/speech/ui_clips
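Writing the article file by hand gets tedious for all language strings, so
here is a minimal sketch of how it might be generated. The markers, the voice
name "Mary" and the trailing "1" are copied from the excerpt above; what the
"1" means is not known here, and the function name, the input format (pairs of
ID and text) and plain "\n" line endings are assumptions for illustration.

```python
START = "||||||TextAloud MP3 Start of Article||||||"
END = "||||||TextAloud MP3 End of Article||||||"


def build_export(strings, voice="Mary", flag="1"):
    """Return TAExport-style text with one article per (lang_id, text) pair."""
    lines = []
    for lang_id, text in strings:
        # Each article: start marker, title line, text to speak, end marker.
        lines += [START, f"{lang_id}|{voice}|{flag}", text, END]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [
        ("LANG_SOUND_SETTINGS", "Sound Settings"),
        ("LANG_GENERAL_SETTINGS", "General Settings"),
    ]
    print(build_export(demo), end="")
```

Using the language ID as the article title is what makes the resulting wav
files easy to match back to their strings.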

A drawback is that TextAloud generates a lot of pause (silence) around the
actual clip, which I'd like to remove. Lame has no option for that, and I
haven't yet found a tool which does it on a batch of files.
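In the absence of a ready-made tool, batch trimming could be sketched with
nothing but the Python standard library. This is not an existing utility, just
an illustration: it assumes 16-bit PCM wav files, treats every sample whose
absolute value is at or below a threshold as silence, and the threshold of 500
as well as all function names are arbitrary choices that would need tuning.

```python
import array
import glob
import os
import sys
import wave


def trim_bounds(samples, channels, threshold):
    """Return (first_frame, end_frame) of the audible part, (0, 0) if all silent."""
    loud = [i // channels for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return 0, 0
    return loud[0], loud[-1] + 1


def trim_wav(src, dst, threshold=500):
    """Copy src to dst without the leading and trailing silence."""
    with wave.open(src, "rb") as w:
        params = w.getparams()
        if params.sampwidth != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        samples = array.array("h", w.readframes(params.nframes))
    if sys.byteorder == "big":
        samples.byteswap()  # wav sample data is little-endian
    start, end = trim_bounds(samples, params.nchannels, threshold)
    kept = samples[start * params.nchannels:end * params.nchannels]
    if sys.byteorder == "big":
        kept.byteswap()
    with wave.open(dst, "wb") as w:
        w.setparams(params)  # the frame count in the header is fixed up on close
        w.writeframes(kept.tobytes())


def trim_all(pattern, out_dir, threshold=500):
    """Trim every wav matching pattern into out_dir, keeping the file names."""
    os.makedirs(out_dir, exist_ok=True)
    for src in glob.glob(pattern):
        trim_wav(src, os.path.join(out_dir, os.path.basename(src)), threshold)
```

Something like trim_all("ui_clips/*.wav", "ui_clips_trimmed") would then
process a whole directory before the files go to lame.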

So far,
Jörg




Page was last modified "Jan 10 2012" The Rockbox Crew