Rockbox mail archive

Subject: Re: Voice patches

From: Andrew Free <andrew_at_schjelderup.org>
Date: Sun, 7 Oct 2007 20:59:26 -0700

How do I get off this list?

On Oct 7, 2007, at 8:43 PM, Stéphane Doyon wrote:

> On Sat, 6 Oct 2007, Daniel Weck wrote:
>> On 6 Oct 2007, at 04:21, Stéphane Doyon wrote:
>>> -Increase playback speed without affecting pitch. I'm not starting from
>>> scratch here, I have something more or less working, only the
>>> integration into Rockbox is still somewhat rough.
>>
>> As part of my programming work in the context of self-voicing user
>> interfaces,
>
> What sort of work would that be exactly? That's OT though I guess,
> so perhaps privately...
>
>> I have used 2 open-source implementations (GPL I think) of
>> "timescale modification" algorithms.
>
> Could you give me references? I'm aware of Soundtouch.
>
>> Stéphane, what algorithm have you been playing with ? (WSOLA ?)
>
> Err what's that? Looking it up. Hmm yes, that's it :-). Gives you
> an idea of what a strong theoretical background I have on this
> subject :-).
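
For readers who, like the poster, have not met the term: WSOLA (waveform similarity overlap-add) cuts the input into overlapping windowed frames, reads them at a faster or slower rate than it writes them, and nudges each read position to the spot that best matches what would naturally have followed the previous frame. The following is only a minimal textbook-style sketch of that idea, not the unreleased implementation discussed in this thread and not Rockbox code. The names `wsola`, `frame`, `tolerance` and `zero_crossing_rate` are made up for illustration, and the 8 kHz test tone is an arbitrary assumption.

```python
import math

def wsola(samples, speed, frame=256, tolerance=64):
    """Time-stretch mono samples by `speed` (>1 is faster) without shifting pitch."""
    hop = frame // 2                      # synthesis hop: frames overlap by 50%
    # Periodic Hann window: copies shifted by `hop` sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame) for i in range(frame)]
    last = len(samples) - frame - hop     # last start that still has a continuation
    out = []
    prev = 0
    k = 0
    while round(k * hop * speed) <= last:
        target = round(k * hop * speed)   # nominal read position for output frame k
        pos = target
        if k > 0:
            # The segment that would naturally follow the previously chosen frame.
            ref = samples[prev + hop:prev + hop + frame]
            lo = max(0, target - tolerance)
            hi = min(last, target + tolerance)
            best = -math.inf
            # Pick the candidate near `target` with the highest cross-correlation.
            for cand in range(lo, hi + 1):
                score = sum(a * b for a, b in zip(ref, samples[cand:cand + frame]))
                if score > best:
                    best, pos = score, cand
        start = k * hop
        out.extend([0.0] * (start + frame - len(out)))
        for i in range(frame):            # windowed overlap-add into the output
            out[start + i] += samples[pos + i] * window[i]
        prev = pos
        k += 1
    return out

def zero_crossing_rate(x):
    """Sign changes per sample, used here as a crude pitch indicator."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0) / len(x)

# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
fast = wsola(tone, speed=2.0)
print(len(tone), len(fast))
print(zero_crossing_rate(tone), zero_crossing_rate(fast))
```

The output is roughly half as long as the input while the zero-crossing rate stays about the same, which is the "faster but same pitch" behaviour described above. The brute-force correlation search is written for clarity; a real embedded implementation would need something far cheaper.
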
>
> What I'm using now is a previously unreleased implementation by
> Nicolas Pitre <nico_at_cam.org>. I've helped him with a bit of tuning
> and a bug report or two, but the credit really goes to him. He
> implemented that from scratch. He worked from a good understanding
> of the algorithm, although I'm pretty sure he did not read that
> paper :-). We both used that implementation for a few years on our
> Linux'ified iPAQ H3600 handhelds, to speed up talking books. Those
> have a StrongArm processor running at 206MHz, which is relatively
> modest. Since I was familiar with Nicolas's implementation and I
> knew it did not require too much CPU power, I naturally used that
> when trying to speed up playback on Rockbox. Nicolas said he was
> happy to release his implementation under the GPL, so integrating
> it into Rockbox officially is quite possible.
>
> My current integration of that implementation into Rockbox is
> rather rough. I do intend to post it as a patch soon. Then I'd need
> help from some dsp.c guru to fit it in properly. One difficulty is
> that this algorithm needs a relatively large sound buffer to work
> on, a latency of about 0.1s, and this would be the first Rockbox
> DSP effect to have this kind of requirement AFAICT.
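
To put the 0.1s figure in perspective, here is a quick back-of-the-envelope calculation of how much PCM data such a buffer would hold. The sample rates, channel counts and the 16-bit sample width are assumptions chosen for illustration; the thread does not say what format the buffer uses internally, and `latency_buffer` is a made-up helper name.

```python
def latency_buffer(sample_rate_hz, channels, bytes_per_sample, latency_s=0.1):
    """Return (frames, bytes) needed to hold latency_s seconds of PCM audio."""
    frames = round(sample_rate_hz * latency_s)
    return frames, frames * channels * bytes_per_sample

# Assumed formats: 16-bit samples (2 bytes), typical music vs. audio book rates.
for label, rate, channels in [("stereo music", 44100, 2), ("mono audio book", 22050, 1)]:
    frames, size = latency_buffer(rate, channels, 2)
    print(f"{label}: {frames} frames, {size} bytes")
```

Under these assumptions the buffer is on the order of a few kilobytes to a few tens of kilobytes, so the difficulty mentioned above is more about the delay it introduces in the DSP chain than about raw memory.
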
>
> The good news is it works for me on an X5 and an e200. Audio books
> are typically lower bitrate and mono, and those I can speed up to a
> factor 3 without problem. High bitrate music can also be
> accelerated to some degree, just not as much. My rough integration
> causes latency in the UI when speeding up high bitrate files to
> near the max capacity of the CPU. But for audio books, it works
> well. I haven't measured the effect on my battery life, but I've
> been using this for a while and I know that qualitatively, the
> effect is not disastrous.
>
> I have not done a thorough comparison of this implementation vs
> Soundtouch, or other implementations. I did do a quick subjective
> comparison with Soundtouch: for speeding up speech, Nicolas's
> algorithm appeared to sound a bit superior, while for slowing down
> music, Soundtouch was better. I must admit that this works well
> enough for me that I am not really motivated to further investigate
> alternatives.
>
>>> -See if we could implement DAISY, or a subset of it, by preprocessing the
>>> DAISY on the host machine and coming up with a cue sheet for each DAISY
>>> level. We'd need to have a playlist backed by multiple cue sheets and the
>>> ability to switch between them. I'm not entirely sure this makes sense,
>>> would need to look into it more seriously.
>>
>> Right. I think that would be a good start, as it would be easy to
>> implement using existing Rockbox features.
>>
>> However, I would like to offer blind/visually-impaired users the
>> convenience of using their standard Daisy 2.02 or even 3.0 DTBs
>> (Digital Talking Books), without conversion.
>
> I imagine that would be more convenient, and probably acceptable as
> long as things like XML parsing happen in a plugin. Although
> personally I don't see preprocessing on the host as a big
> limitation because it would be near instantaneous (unlike video
> conversion for example).
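
The host-side preprocessing idea could look roughly like the sketch below: given a book's headings with their nesting level and start time, write one cue sheet per level that keeps only the headings at that depth or shallower. This is a guess at the shape of such a tool, not an existing one. Parsing the actual DAISY files is left out entirely, and the `headings` list, the file name `book.mp3` and the function names are hypothetical. It also assumes a single audio file, whereas a real book may span several.

```python
def cue_time(seconds):
    """Format seconds as a cue sheet MM:SS:FF timestamp (75 frames per second)."""
    frames = round(seconds * 75)
    return f"{frames // 4500:02d}:{frames // 75 % 60:02d}:{frames % 75:02d}"

def cue_sheet_for_level(audio_file, headings, max_level):
    """Build a cue sheet holding only headings whose level is <= max_level."""
    lines = [f'FILE "{audio_file}" MP3']
    selected = [h for h in headings if h[0] <= max_level]
    for number, (level, title, start) in enumerate(selected, start=1):
        lines.append(f"  TRACK {number:02d} AUDIO")
        lines.append(f'    TITLE "{title}"')
        lines.append(f"    INDEX 01 {cue_time(start)}")
    return "\n".join(lines) + "\n"

# Hypothetical navigation data: (level, title, start time in seconds).
headings = [
    (1, "Chapter 1", 0.0),
    (2, "Section 1.1", 95.5),
    (1, "Chapter 2", 600.0),
]

# One cue sheet per navigation level, as suggested in the quoted proposal.
for level in (1, 2):
    print(f"--- level {level} ---")
    print(cue_sheet_for_level("book.mp3", headings, level))
```

Switching DAISY levels on the player would then amount to switching which of these cue sheets is active for the current file. Note that cue sheets allow at most 99 tracks, which this sketch does not check.
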
>
>> 2) text support: well, I think this is unnecessary to start with.
>> When we've
>
> I agree.
>
>> 3) the ability to play Daisy books from a plugin: my test
>> implementations so
>
> I'm not sure about this one. Why play from a plugin? Or is this
> just a step in your development strategy?
>
> Admittedly I am not familiar with the newer versions of the DAISY
> standard and may be missing some issues.
>
> --
> Stéphane Doyon
> <s.doyon_at_videotron.ca>
> http://pages.videotron.com/sdoyon/
Received on 2007-10-08

