Rockbox mail archive
Subject: Re: Voice patches
From: Stéphane Doyon <s.doyon_at_videotron.ca>
Date: Sun, 07 Oct 2007 23:43:57 -0400 (EDT)

On Sat, 6 Oct 2007, Daniel Weck wrote:

> On 6 Oct 2007, at 04:21, Stéphane Doyon wrote:
>> -Increase playback speed without affecting pitch. I'm not starting from
>> scratch here, I have something more or less working, only the integration
>> into Rockbox is still somewhat rough.
>
> As part of my programming work in the context of self-voicing user
> interfaces,

What sort of work would that be exactly? That's OT though I guess, so
perhaps privately...

> I have used 2 open-source implementations (GPL I think) of
> "timescale modification" algorithms.

Could you give me references? I'm aware of Soundtouch.

> Stéphane, what algorithm have you been playing with ? (WSOLA ?)

Err what's that? Looking it up. Hmm yes, that's it :-). Gives you an idea
of what a strong theoretical background I have on this subject :-).

What I'm using now is a previously unreleased implementation by Nicolas
Pitre <nico_at_cam.org>. I've helped him with a bit of tuning and a bug
report or two, but the credit really goes to him. He implemented that from
scratch. He worked from a good understanding of the algorithm, although I'm
pretty sure he did not read that paper :-).

We both used that implementation for a few years on our Linux'ified iPAQ
H3600 handhelds, to speed up talking books. Those have a StrongArm
processor running at 206MHz, which is relatively modest. Since I was
familiar with Nicolas's implementation and I knew it did not require too
much CPU power, I naturally used that when trying to speed up playback on
Rockbox. Nicolas said he was happy to release his implementation under the
GPL, so integrating it into Rockbox officially is quite possible.

My current integration of that implementation into Rockbox is rather rough.
I do intend to post it as a patch soon. Then I'd need help from some dsp.c
guru to fit it in properly.
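For readers who, like me, had to look up WSOLA: here is a minimal sketch in C of the general idea (overlap-add with a waveform-similarity search). This is NOT Nicolas's implementation, which is not shown in this message, and it is not Soundtouch's code either. The function name time_stretch, the constants SEQ, OVL and SEEK, and the choice of mono float samples at 44.1 kHz are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative constants, assuming 44.1 kHz mono input (hypothetical values). */
#define SEQ  882   /* length of each copied sequence, about 20 ms */
#define OVL  220   /* crossfade length between sequences, about 5 ms */
#define SEEK 441   /* how far ahead to search for a good splice point */

/* Find the offset in win[0..SEEK) whose next OVL samples best match mid,
 * using a plain (unnormalized) cross-correlation. */
static size_t best_offset(const float *win, const float *mid)
{
    size_t best = 0;
    float best_corr = 0.0f;
    for (size_t off = 0; off < SEEK; off++) {
        float corr = 0.0f;
        for (size_t i = 0; i < OVL; i++)
            corr += win[off + i] * mid[i];
        if (off == 0 || corr > best_corr) {
            best_corr = corr;
            best = off;
        }
    }
    return best;
}

/* Change duration by `speed` (2.0 = twice as fast) without resampling,
 * so the pitch is left alone. Returns the number of samples written. */
size_t time_stretch(const float *in, size_t in_len,
                    float *out, size_t out_cap, double speed)
{
    float mid[OVL];      /* tail of the previous sequence, to be crossfaded */
    size_t out_len = 0;
    double pos = 0.0;    /* nominal read position in the input */

    assert(in != NULL && out != NULL);
    if (speed <= 0.0 || in_len < SEEK + SEQ)
        return 0;

    memcpy(mid, in, OVL * sizeof(float));   /* prime with the first samples */

    while ((size_t)pos + SEEK + SEQ <= in_len &&
           out_len + (SEQ - OVL) <= out_cap) {
        const float *win = in + (size_t)pos;
        const float *seq = win + best_offset(win, mid);

        /* Linear crossfade from the previous tail into the new sequence. */
        for (size_t i = 0; i < OVL; i++) {
            float w = (float)i / OVL;
            out[out_len++] = mid[i] * (1.0f - w) + seq[i] * w;
        }
        /* Copy the middle of the sequence unchanged. */
        memcpy(out + out_len, seq + OVL, (SEQ - 2 * OVL) * sizeof(float));
        out_len += SEQ - 2 * OVL;
        /* Keep the tail for the next crossfade. */
        memcpy(mid, seq + SEQ - OVL, OVL * sizeof(float));

        /* Skipping ahead faster than we emit is what shortens playback. */
        pos += speed * (SEQ - OVL);
    }
    return out_len;
}
```

Each pass emits SEQ - OVL samples while moving speed * (SEQ - OVL) samples through the input, so the output comes out at roughly in_len / speed samples. Note that the search needs SEEK + SEQ input samples available before it can emit anything, which is why this family of algorithms wants a working buffer rather than sample-by-sample processing.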
One difficulty is that this algorithm needs a relatively large sound buffer
to work on, a latency of about 0.1s, and this would be the first Rockbox
DSP effect to have this kind of requirement AFAICT.

The good news is it works for me on an X5 and an e200. Audio books are
typically lower bitrate and mono, and those I can speed up to a factor 3
without problem. High bitrate music can also be accelerated to some degree,
just not as much. My rough integration causes latency in the UI when
speeding up high bitrate files to near the max capacity of the CPU. But for
audio books, it works well.

I haven't measured the effect on my battery life, but I've been using this
for a while and I know that qualitatively, the effect is not disastrous.

I have not done a thorough comparison of this implementation vs Soundtouch,
or other implementations. I did do a quick subjective comparison with
Soundtouch: for speeding up speech, Nicolas's algorithm appeared to sound a
bit superior, while for slowing down music, Soundtouch was better. I must
admit that this works well enough for me that I am not really motivated to
further investigate alternatives.

>> -See if we could implement DAISY, or a subset of it, by preprocessing the
>> DAISY on the host machine and coming up with a cue sheet for each DAISY
>> level. We'd need to have a playlist backed by multiple cue sheets and the
>> ability to switch between them. I'm not entirely sure this makes sense,
>> would need to look into it more seriously.
>
> Right. I think that would be a good start, as it would be easy to implement
> using existing Rockbox features.
>
> However, I would like to offer blind/visually-impaired users the convenience
> of using their standard Daisy 2.02 or even 3.0 DTBs (Digital Talking Books),
> without conversion.

I imagine that would be more convenient, and probably acceptable as long as
things like XML parsing happen in a plugin.
Although personally I don't see preprocessing on the host as a big
limitation because it would be near instantaneous (unlike video conversion
for example).

> 2) text support: well, I think this is unnecessary to start with. When we've

I agree.

> 3) the ability to play Daisy books from a plugin: my test implementations so

I'm not sure about this one. Why play from a plugin? Or is this just a step
in your development strategy? Admittedly I am not familiar with the newer
versions of the DAISY standard and may be missing some issues.

-- 
Stéphane Doyon <s.doyon_at_videotron.ca>
http://pages.videotron.com/sdoyon/

Received on 2007-10-08