Rockbox mail archive
Subject: a solution for voice clips (talkbox/UI)
From: [IDC]Dragon <idc-dragon_at_gmx.de>
Date: Sat, 28 Feb 2004 15:17:17 +0100 (MET)

Hello,

the topic of blind usage brought me into the field of text-to-speech converters. I learned about the Microsoft Speech API (SAPI), which makes it very easy to program talking applications, but sounds pretty crappy.

I have made a first command line program. All it does is "speak" the text of the first command line argument into a wav file. This can be useful to make Talkbox clips in an automated way, recursively across the whole disk if you like and if somebody makes a Perl script for it (hint, wink). My program is here:
http://joerg.hohensohn.bei.t-online.de/archos/speech/speak

I don't know if this runs on a plain standard Windows box; maybe you need to install some SAPI runtime:
http://www.microsoft.com/speech/download/sdk51/

The best speech is done by AT&T Natural Voices. I have no program which uses it, but there's an interactive web demo:
http://www.naturalvoices.att.com/demos/

Just an idea: would it be possible to abuse this demo and harvest the wav files it generates, in an automated way? I have no web programming skills, maybe somebody else can look behind the scenes.

First I played with the eval version of TextAloud. It can talk better, but is not made to be scripted. With a trick I convinced it to give me all Rockbox language strings as wav files. The program remembers "articles" to be spoken, so I made many little articles from the strings. In the program directory there will be a last-settings file username_TAExport.txt, which I modified like this:

    ||||||TextAloud MP3 Start of Article||||||
    LANG_SOUND_SETTINGS|Mary|1
    Sound Settings
    ||||||TextAloud MP3 End of Article||||||
    ||||||TextAloud MP3 Start of Article||||||
    LANG_GENERAL_SETTINGS|Mary|1
    General Settings
    ||||||TextAloud MP3 End of Article||||||

and so forth. When playing this into separate files I get a bunch of wav files, which lame can then nicely encode. The lame integrated into TextAloud is too restricted: it won't do VBR, and no quality settings are available.
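Making the many little articles by hand gets tedious, so here is a rough sketch of how the export file could be generated from a list of string IDs and phrases. It is written in Python and rests on assumptions: the line layout (marker, then "ID|voice|1", then the phrase, then the end marker, each on its own line) is inferred from the two sample articles above, the meaning of the trailing "1" is not known, and the function name make_articles and its parameters are made up for illustration.

```python
START = "||||||TextAloud MP3 Start of Article||||||"
END = "||||||TextAloud MP3 End of Article||||||"


def make_articles(strings, voice="Mary", field3=1):
    """Build export text with one article per (string_id, phrase) pair.

    Assumption: each article is four lines, as in the sample above.
    'field3' is the trailing value after the voice name; its meaning
    is unknown here, so it is simply passed through.
    """
    lines = []
    for string_id, phrase in strings:
        lines.append(START)
        lines.append(f"{string_id}|{voice}|{field3}")
        lines.append(phrase)
        lines.append(END)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    ui_strings = [
        ("LANG_SOUND_SETTINGS", "Sound Settings"),
        ("LANG_GENERAL_SETTINGS", "General Settings"),
    ]
    # Print to stdout; redirect into the TAExport file by hand.
    print(make_articles(ui_strings), end="")
```

The output would be redirected into the username_TAExport.txt file while TextAloud is closed. Whether TextAloud accepts a file produced this way unchanged is untested.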
So I have all the UI clips. I have placed the first shot here:
http://joerg.hohensohn.bei.t-online.de/archos/speech/ui_clips

A drawback is that TextAloud generates a lot of silence around the actual clip, which I'd like to remove. Lame has no option for that, and I haven't yet found a tool which does it on a batch of files.

So far,
Jörg

Received on 2004-02-28
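The silence around the clips could in principle be cut off with a small script run over all the wav files before encoding. The following Python sketch uses only the standard library. It assumes the clips are 16-bit mono PCM (the actual TextAloud output format is not stated above), and the amplitude threshold of 500 is an arbitrary guess that would need tuning by ear. The names trim_silence and trim_wav are made up for illustration.

```python
import array
import glob
import sys
import wave


def trim_silence(samples, threshold=500):
    """Drop leading and trailing samples whose magnitude is <= threshold."""
    first, last = 0, len(samples)
    while first < last and abs(samples[first]) <= threshold:
        first += 1
    while last > first and abs(samples[last - 1]) <= threshold:
        last -= 1
    return samples[first:last]


def trim_wav(src, dst, threshold=500):
    """Copy src to dst with silence removed from both ends.

    Assumption: 16-bit mono PCM; anything else is rejected.
    """
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        if params.sampwidth != 2 or params.nchannels != 1:
            raise ValueError("this sketch handles 16-bit mono only")
        samples = array.array("h")
        samples.frombytes(reader.readframes(params.nframes))
    if sys.byteorder == "big":
        samples.byteswap()  # wav data is little-endian
    trimmed = trim_silence(samples, threshold)
    if sys.byteorder == "big":
        trimmed.byteswap()
    with wave.open(dst, "wb") as writer:
        writer.setparams(params)  # frame count is corrected on close
        writer.writeframes(trimmed.tobytes())


if __name__ == "__main__":
    # Batch mode: writes trimmed_<name>.wav next to each input file.
    for path in glob.glob("*.wav"):
        if not path.startswith("trimmed_"):
            trim_wav(path, "trimmed_" + path)
```

A fixed threshold is the crudest possible detector; very quiet consonants at the start or end of a phrase may get clipped, so leaving a few milliseconds of margin on each side might sound better.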