Rockbox mail archive
Subject: Re: Voice patches
From: Daniel Dalton <daniel.dalton47_at_gmail.com>
Date: Sat, 06 Oct 2007 18:38:13 +1000
Wow lots of good ideas. See my comments below.
On 6/10/2007 1:21 PM, Stéphane Doyon wrote:
> On Thu, 4 Oct 2007, Daniel Dalton wrote:
>> On 28/09/2007 1:03 AM, Stéphane Doyon wrote:
>>> I have a lot more ideas on making Rockbox's voice interface more usable,
>> Could you tell us some of your ideas?
> Well, I have some 15 or 20 patches languishing in the tracker, so as I
> said before, I don't think there's much point adding more until I see
> progress on some of this.
What else isn't accessible in rockbox?
Apart from having to listen to everything being spelled in the database?
> P7563, P6325 and P6171 should be ready for inclusion IMO.
> And kbd-accessible.diff from P6324 as well.
> I'm happy with P7774 and P7775 too, although perhaps a bit wider testing
> would be a good idea.
Would it help if I tried to get some included on irc?
I am not a very good coder myself (as you probably know) but I got a
couple of my voicing patches accepted by asking on irc.
> I could use advice on P6239: I need to enqueue multiple thumbnails (.talk
> clips). I suppose it wouldn't be acceptable to just declare several more
> buffers. I'd rather not make this part more complex by adding another
> thread and reading files from another context. Right now I'm using a
> hack... but I'd welcome suggestions.
Sorry can't help you there.
> Depending on that, talk_file and playlist_catalog from P6323 are ready,
> as well as P6240: improved feedback in bookmark selection. I've had
> several positive comments about that one: it's useful, people like
> it. Could use testing on HWCODEC, if there are any HWCODEC voice users
> left around.
> P7653 also is good, although perhaps the setting I added in there is
> overkill and should always be on.
I think the setting is good.
I can't see it being too much of a problem.
> P6331 and P7777 are also ready.
> Still I guess that's a lame answer to your question. About other ideas...
> -Of course there's the low-level issues I'd like to work on. Ability to
> speak while paused,
I asked about that, as you might have seen, a couple of weeks ago on this
But I am not exactly sure what needs to be done.
If I could help you out with this in any way, let me know.
> interrupting voice quickly while music is playing,
What exactly do you mean by this?
A button to kill the voice?
> cleaner interface between talk.c and playback.c. A quick one would be to
> have mp3_play_stop() do the same as voice_stop(), that'd eliminate some
> stuttering. And what voice_stop() does should perhaps be a shutup() call
> from playlist_start(). But there's a lot of higher level stuff we can do
> without depending on this.
So what's the actual problem with it now?
> -I'm considering implementing an alternate quick screen for blind users:
> something to put a few key functions closer at hand. We would like a
> quick way to have the time spoken, the Rockbox info menu is just too
> far. I'd like to put in a function that temporarily overrides file and
> dir .talk clips, for those times where you can't make out what the
> synthesizer was saying in that .talk clip, because it's a weird band name
> that you haven't heard before, and you need it spelled out just this
So that would be in the quick screen thing?
> Also a quick way to adjust volumes when browsing files or
Is this sort of thing likely to be implemented into rockbox?
I am not saying at all it is not a good idea. Personally I think it is a
great idea but most of the devs are sighted and may think it is a waste
of a button.
Also I am working on a long press of rec to say the time and a short
press to go to the radio.
It actually says battery level and time. What should I do if voice menus
is off? Just make it do nothing? Or do you think your idea is better and
we should forget about mine?
Mine is just about done so I will hopefully put it on the tracker soon.
> Perhaps a hot key to toggle tracklock / study mode (P6188).
Actually I already implemented a button press for x5 and h300 in your
track lock patch. See my last comment.
Or do you mean in that quick screen?
> -Voice memo recorder functionality: the recording screen is meant more
> for elaborate music recording jobs. I'd like a context where I can be
> sure what's going on with little or no feedback. There's the issue of
> voice being disabled during recording, but beyond that, I'd like to be
> able to record a quick memo on my player without having to take out my
> earphones (and put them away again afterwards). It could be as simple as
> a button context where it records only while you hold the RECORD
> key. Anyway avoid using one button for start/pause, and one button for
> STOP and exit, so in case you're not confident your keypress went in, you
> can always press it a second time. And then find ways to facilitate
> managing a collection of memos: it's nice having the date in the
> filename, but it's not manageable when spelled. Add a context menu
> function to allow on the spot recording of a .talk clip associated to a
> particular file or dir. Eventually perhaps implement cut&paste editing of
> an audiofile, as long as it's uncompressed.
That's a good idea. What about a beep function for the recording screen?
Or is that too difficult? I was browsing the mails from a while ago and
I think you said something like that.
I tried to add it; it compiled fine but made no beep. Can't you record and
play a beep?
> -Infrastructure to load secondary voice files. Use that to make plugins
> talk, without having to increase the size of the main voice file.
Obviously voicing all the useful plugins for blind users and just using
the normal voice file would not work. The voice file would become very
But I see two problems with this:
1. A lot of voice files. Remember one for every language and every player.
2. What about if we want to use strings for plugins that are already in
the main voice file?
We will need to have duplicates of the same strings.
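
One way around the duplicate-strings problem, as a minimal sketch only: look a string up in the plugin's own voice table first and fall back to the main voice file when it is not there. Everything below (struct voice_entry, resolve_clip, the string ids) is made up for illustration and is not how Rockbox's voice files are actually laid out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a string id and the clip number that voices it. */
struct voice_entry {
    const char *id;
    int clip;
};

/* Linear search; returns the clip number or -1 if the id is not in the table. */
static int lookup(const struct voice_entry *tbl, size_t n, const char *id)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(tbl[i].id, id) == 0)
            return tbl[i].clip;
    return -1;
}

/* Try the plugin's secondary table first, then fall back to the main table.
 * *from_main is set to 1 only when the clip was found in the main table. */
int resolve_clip(const struct voice_entry *plugin_tbl, size_t plugin_n,
                 const struct voice_entry *main_tbl, size_t main_n,
                 const char *id, int *from_main)
{
    int clip = lookup(plugin_tbl, plugin_n, id);
    *from_main = 0;
    if (clip < 0) {
        clip = lookup(main_tbl, main_n, id);
        *from_main = (clip >= 0);
    }
    return clip;
}
```

With a fallback like this a plugin voice file would only need to carry the strings that are unique to the plugin.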
> -Increase playback speed without affecting pitch. I'm not starting from
> scratch here, I have something more or less working, only the integration
> into Rockbox is still somewhat rough.
I remember you saying that one to me. Sounds like a good idea.
> -Make some basic WPS info accessible, at least time position and track
> duration. Perhaps in a list like the id3 screen.
Do you just mean some very basic info, not like your other patch that
voices the whole id3 screen?
> -Coarse navigation function: I once had an entire audiobook in a single
> track, lasting 24hours. It would be nice to be able to jump by some
> coarse increments like 5mins, 30mins, 3hours, then jump to within 10secs
> of the end of the track, and perhaps some proportional jumps like 10% of
> track... Part of the problem is that fastforward/rewind gives no progress
> feedback at all for blind users, because of the issue with pausing. But
> beyond that, a tool to move around big tracks might be useful.
Sounds very useful.
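
A rough sketch of how the jump targets described above could be computed, assuming positions are kept in milliseconds. The names (coarse_jump, enum jump_kind) are hypothetical and not taken from the Rockbox source; the only point is that every kind of jump ends up clamped inside the track.

```c
#include <stdint.h>

/* Hypothetical jump kinds, for illustration only. */
enum jump_kind {
    JUMP_RELATIVE_MS,  /* amount = signed offset in milliseconds */
    JUMP_PERCENT,      /* amount = signed percentage of the track length */
    JUMP_NEAR_END      /* amount = milliseconds before the end of the track */
};

/* Returns the new position in ms, clamped to the range [0, length_ms]. */
long coarse_jump(long pos_ms, long length_ms, enum jump_kind kind, long amount)
{
    long target;

    switch (kind) {
    case JUMP_RELATIVE_MS:
        target = pos_ms + amount;
        break;
    case JUMP_PERCENT:
        /* 64-bit intermediate so a long track times a percentage cannot overflow */
        target = pos_ms + (long)((int64_t)length_ms * amount / 100);
        break;
    case JUMP_NEAR_END:
        target = length_ms - amount;
        break;
    default:
        target = pos_ms;
        break;
    }

    if (target < 0)
        target = 0;
    if (target > length_ms)
        target = length_ms;
    return target;
}
```

For the 24-hour audiobook case, length_ms would be 86400000; a 10% jump moves 8640000 ms, and JUMP_NEAR_END with 10000 lands 10 seconds before the end.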
> -A kind of voice database: idea in part from Mario Lang. The metadata
> spoken in the database browser or id3 screen is always spelled, which is
> really slower. What if we preprocessed all files on the host computer,
> extracted all metadata tag text, have them all spoken similarly to what
> we do for .talk clips, and put all that in some sort of mini-database:
> perhaps just a big blob with some kind of hash index, or perhaps even
> each entry into a separate file. The idea is to have the voiced audio
> data indexed by the string that is being spoken. These could then be
> loaded on demand and cached. We could try to mitigate the disk access
> delays by keeping stats as to the most often used tags and preload those
> in one pass. I'd need to experiment to find out whether this would be
> feasible. If it does work then it would be useful beyond metadata
> tags. OTOH, if the espeak plugin works and assuming the response time is
> good, perhaps this is really not needed.
Espeak will take a bit to get working. First the licensing needs to be
Second it doesn't work on a lot of players.
So I think it will still be a while unfortunately.
Anyway, do you mean it would play a talk clip? If so, would this require
another script to generate the clips?
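
To make the "big blob with some kind of hash index" idea a bit more concrete, here is a minimal sketch under several assumptions: the index is built on the host computer, sorted by hash, and each entry says where one pre-spoken clip sits in the blob. FNV-1a is just one convenient hash, not something proposed in this thread, and struct clip_entry, tag_hash and find_clip are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry: where the clip for one tag string lives in the blob. */
struct clip_entry {
    uint32_t hash;    /* FNV-1a hash of the tag text */
    uint32_t offset;  /* byte offset of the clip inside the blob */
    uint32_t length;  /* clip length in bytes */
};

/* 32-bit FNV-1a hash of a NUL-terminated string. */
uint32_t tag_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Binary search over an index sorted by ascending hash.
 * Returns NULL when no clip exists, in which case the caller could fall
 * back to spelling. Hash collisions are not handled in this sketch. */
const struct clip_entry *find_clip(const struct clip_entry *index,
                                   size_t count, const char *tag)
{
    uint32_t h = tag_hash(tag);
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (index[mid].hash < h)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < count && index[lo].hash == h)
        return &index[lo];
    return NULL;
}
```

A real version would also need to store or compare the tag text to rule out collisions, and the host-side script that generates the clips would have to use the same hash.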
> -See if we could implement DAISY, or a subset of it, by preprocessing the
> DAISY on the host machine and coming up with a cue sheet for each DAISY
> level. We'd need to have a playlist backed by multiple cue sheets and the
> ability to switch between them. I'm not entirely sure this makes sense,
> would need to look into it more seriously.
> -Spontaneous battery level warning, say at 50% and 20%.
speak it automatically?
I also wanted rockbox to say charging when I insert the cable. Not sure
what code to look at though.
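
For the spontaneous battery warning, a small sketch of the threshold logic only, assuming something polls the battery percentage now and then. battery_warning is a hypothetical helper, not an existing Rockbox function; the 50% and 20% values are the ones suggested above.

```c
/* Returns the threshold (in percent) that should be announced, or 0 if
 * nothing needs to be spoken. A warning fires only when the level falls
 * from above a threshold to at or below it, so it is not repeated on
 * every poll and never fires while the level is rising (charging). */
int battery_warning(int prev_percent, int cur_percent)
{
    static const int thresholds[] = { 20, 50 };  /* lowest first */
    unsigned i;

    for (i = 0; i < sizeof thresholds / sizeof thresholds[0]; i++) {
        if (prev_percent > thresholds[i] && cur_percent <= thresholds[i])
            return thresholds[i];
    }
    return 0;
}
```

Checking the lowest threshold first means that a big drop between two polls, say from 60% to 15%, announces the more urgent 20% warning.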
> -Make some plugins talk, at least those called from core.
I was going to do a bit of this as well.
So would you be happy if I tried to implement the simpler ideas?
Oh yeah, and what about putting the info screen into a list?
Do you know how you will do that? I would be happy to help with it.
-- 
Daniel Dalton
http://members.iinet.net.au/~ddalton/
daniel.dalton47_at_gmail.com
Received on 2007-10-06