Rockbox mail archive

Subject: Re: Voice patches

From: Daniel Weck <daniel.weck_at_gmail.com>
Date: Sat, 6 Oct 2007 08:22:44 +0100

Hi Stéphane !

Thank you for your detailed proposal.
I hope some of your patches will make it to the trunk.

Please read on for comments about Daisy playback, and a remark about
'playback speed modification without pitch alteration'.

On 6 Oct 2007, at 04:21, Stéphane Doyon wrote:
> -Increase playback speed without affecting pitch. I'm not starting from
> scratch here, I have something more or less working, only the integration
> into Rockbox is still somewhat rough.

As part of my programming work in the context of self-voicing user
interfaces, I have used 2 open-source implementations (GPL I think)
of "timescale modification" algorithms.

They are both processor-intensive, but the memory requirements
are reasonable. Technically, the lack of malloc/free in Rockbox
should not be a problem because "static" buffers can be used everywhere.
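To illustrate the "static" buffer idea, here is a minimal sketch of a
fixed-size pool that hands out word-aligned chunks without malloc/free.
All names (arena_alloc, arena_reset, ARENA_SIZE) and the 64 KB budget are
hypothetical and chosen for illustration; none of this is Rockbox API or
code from either timescale-modification library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and sizes: an illustration, not Rockbox API. */
#define ARENA_SIZE  (64 * 1024)  /* assumed budget; tune per target */
#define ARENA_ALIGN 4u           /* 32-bit word alignment */

/* One statically allocated pool; the union forces word alignment. */
static union {
    uint32_t force_align;
    unsigned char bytes[ARENA_SIZE];
} arena;
static size_t arena_used;

/* Hand out the next `size` bytes, rounded up to ARENA_ALIGN.
 * Returns NULL when the pool cannot satisfy the request. */
void *arena_alloc(size_t size)
{
    size_t rounded = (size + (ARENA_ALIGN - 1)) & ~(size_t)(ARENA_ALIGN - 1);
    void *p;

    if (rounded < size || rounded > ARENA_SIZE - arena_used)
        return NULL;
    p = arena.bytes + arena_used;
    arena_used += rounded;
    return p;
}

/* Release everything at once, e.g. when playback of a file ends. */
void arena_reset(void)
{
    arena_used = 0;
}
```

A pool like this never frees individual chunks, which fits working
buffers whose sizes are known when playback starts.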

The main problem would be porting to a totally different audio
playback architecture.

Stéphane, what algorithm have you been playing with? (WSOLA?)

> -See if we could implement DAISY, or a subset of it, by preprocessing the
> DAISY on the host machine and coming up with a cue sheet for each DAISY
> level. We'd need to have a playlist backed by multiple cue sheets and the
> ability to switch between them. I'm not entirely sure this makes sense,
> would need to look into it more seriously.

Right. I think that would be a good start, as it would be easy to
implement using existing Rockbox features.
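
As a rough sketch of the "one cue sheet per Daisy level" idea, assuming
the host-side preprocessing produces, for each level, a sorted array of
cue start times in milliseconds. The struct and function names are
hypothetical and do not correspond to Rockbox's existing cue sheet code.

```c
#include <stddef.h>

/* Hypothetical layout: one sorted list of cue start times per level. */
struct cue_level {
    const long *start_ms;  /* ascending cue start times, in ms */
    size_t count;          /* number of cues at this level */
};

/* Start of the first cue strictly after pos_ms, or -1 if none. */
long cue_next(const struct cue_level *level, long pos_ms)
{
    size_t i;

    for (i = 0; i < level->count; i++)
        if (level->start_ms[i] > pos_ms)
            return level->start_ms[i];
    return -1;
}

/* Start of the last cue strictly before pos_ms, or -1 if none. */
long cue_prev(const struct cue_level *level, long pos_ms)
{
    size_t i = level->count;

    while (i > 0) {
        i--;
        if (level->start_ms[i] < pos_ms)
            return level->start_ms[i];
    }
    return -1;
}
```

Switching levels then amounts to passing a different cue_level to the
same skip functions while keeping the current playback position.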

However, I would like to offer blind/visually-impaired users the
convenience of using their standard Daisy 2.02 or even 3.0 DTBs
(Digital Talking Books), without conversion. To achieve that, we need:

1) a very lightweight XML parser (low on memory and processor
requirements), with support for UTF-8 and other encodings.
=> Well, I have finished porting the Expat SAX parser to Rockbox,
using a modified dynamic memory allocation library (DBestFit). It
works wonderfully well on my Toshiba Gigabeat simulator, but
unfortunately crashes on the real device. I think it's due to memory
alignment issues (ARM, 32 bits / 4 bytes), although I have triple-
checked and can't find where the problem is. So, work in progress...

2) text support: well, I think this is unnecessary to start with.
When we've successfully implemented an audio-only book player,
targets like the Gigabeat will be strong candidates for experimenting
with text rendering. Until then, we should focus on audio-only books
and support for Daisy navigation (NCX, NCC, skippable / escapable
items, etc.).

3) the ability to play Daisy books from a plugin: my test
implementations so far are plugins that do not require any changes in
the Rockbox core (I have simply configured the viewers list to open
XML files with my plugin, when selected via the standard Rockbox file-
browser). This is great, but I am having problems playing audio files
from my plugin (my code instantiates a new codec, using the
test_codec.c example). I have a first implementation of clip-begin /
clip-end functionality, using seek control and progress feedback, but I
haven't had time to work on this much, so the code isn't working yet
(I was focusing on the malloc/free issues mentioned above).
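
For the clip-begin / clip-end handling in item 3, a minimal sketch of the
two pieces involved: parsing a clip time and deciding what to do with the
elapsed time reported by the codec. It assumes the "npt=12.345s" form
that Daisy 2.02 SMIL files use for these attributes, as far as I know;
other clock formats are not handled. The function names are hypothetical,
and integer arithmetic is used because many targets lack an FPU.

```c
#include <string.h>

/* Parse "npt=<seconds>[.<fraction>][s]" into milliseconds.
 * Returns -1 on malformed input. Fraction digits beyond the third
 * are ignored; very large values are not checked for overflow. */
long clip_to_ms(const char *value)
{
    long ms = 0;
    const char *p;

    if (strncmp(value, "npt=", 4) != 0)
        return -1;
    p = value + 4;
    if (*p < '0' || *p > '9')
        return -1;
    while (*p >= '0' && *p <= '9')
        ms = ms * 10 + (*p++ - '0');
    ms *= 1000;
    if (*p == '.') {
        long scale = 100;
        p++;
        while (*p >= '0' && *p <= '9') {
            ms += (*p++ - '0') * scale;  /* scale reaches 0 after 3 digits */
            scale /= 10;
        }
    }
    if (*p == 's')
        p++;
    return (*p == '\0') ? ms : -1;
}

enum clip_state { CLIP_BEFORE, CLIP_INSIDE, CLIP_DONE };

/* Compare the codec's elapsed time against the clip bounds:
 * CLIP_BEFORE means seek to begin_ms, CLIP_DONE means move on. */
enum clip_state clip_check(long elapsed_ms, long begin_ms, long end_ms)
{
    if (elapsed_ms < begin_ms)
        return CLIP_BEFORE;
    if (elapsed_ms >= end_ms)
        return CLIP_DONE;
    return CLIP_INSIDE;
}
```

The plugin would call clip_check from its progress callback, seeking
when it sees CLIP_BEFORE and loading the next clip on CLIP_DONE.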

I'd like to invite Rockbox hackers and developers to contribute here
and there ;-)

Kind regards, Daniel.
Received on 2007-10-06

