Rockbox mail archive
Subject: Re: Voice patches
From: Stéphane Doyon <s.doyon_at_videotron.ca>
Date: Fri, 05 Oct 2007 23:21:30 -0400 (EDT)

On Thu, 4 Oct 2007, Daniel Dalton wrote:

> On 28/09/2007 1:03 AM, Stéphane Doyon wrote:
>> I have a lot more ideas on making Rockbox's voice interface more usable,
>
> Could you tell us some of your ideas?

Well, I have some 15 or 20 patches languishing in the tracker, so as I said before, I don't think there's much point adding more until I see progress on some of this.

P7563, P6325 and P6171 should be ready for inclusion IMO. And kbd-accessible.diff from P6324 as well. I'm happy with P7774 and P7775 too, although perhaps a bit wider testing would be a good idea.

I could use advice on P6239: I need to enqueue multiple thumbnails (.talk clips). I suppose it wouldn't be acceptable to just declare several more buffers. I'd rather not make this part more complex by adding another thread and reading files from another context. Right now I'm using a hack... but I'd welcome suggestions.

Depending on that, talk_file and playlist_catalog from P6323 are ready, as well as P6240: improved feedback in bookmark selection. I've had several positive comments about that one: it's useful, people like it. Could use testing on HWCODEC, if there are any HWCODEC voice users left around.

P7653 also is good, although perhaps the setting I added in there is overkill and should always be on. P6331 and P7777 are also ready.

Still I guess that's a lame answer to your question. About other ideas...

-Of course there's the low-level issues I'd like to work on. Ability to speak while paused, interrupting voice quickly while music is playing, cleaner interface between talk.c and playback.c. A quick one would be to have mp3_play_stop() do the same as voice_stop(), that'd eliminate some stuttering. And what voice_stop() does should perhaps be a shutup() call from playlist_start(). But there's a lot of higher level stuff we can do without depending on this.
-I'm considering implementing an alternate quick screen for blind users: something to put a few key functions closer at hand. We would like a quick way to have the time spoken; the Rockbox info menu is just too far. I'd like to put in a function that temporarily overrides file and dir .talk clips, for those times where you can't make out what the synthesizer was saying in that .talk clip, because it's a weird band name that you haven't heard before, and you need it spelled out just this once. Also a quick way to adjust volumes when browsing files or menus. Perhaps a hot key to toggle tracklock / study mode (P6188).
-Voice memo recorder functionality: the recording screen is meant more for elaborate music recording jobs. I'd like a context where I can be sure what's going on with little or no feedback. There's the issue of voice being disabled during recording, but beyond that, I'd like to be able to record a quick memo on my player without having to take out my earphones (and put them away again afterwards). It could be as simple as a button context where it records only while you hold the RECORD key. Anyway avoid using one button for start/pause, and one button for STOP and exit, so in case you're not confident your keypress went in, you can always press it a second time. And then find ways to facilitate managing a collection of memos: it's nice having the date in the filename, but it's not manageable when spelled. Add a context menu function to allow on the spot recording of a .talk clip associated to a particular file or dir. Eventually perhaps implement cut&paste editing of an audio file, as long as it's uncompressed.
-Infrastructure to load secondary voice files. Use that to make plugins talk, without having to increase the size of the main voice file.
-Increase playback speed without affecting pitch. I'm not starting from scratch here, I have something more or less working, only the integration into Rockbox is still somewhat rough.
-Make some basic WPS info accessible, at least time position and track duration. Perhaps in a list like the id3 screen.
-Coarse navigation function: I once had an entire audiobook in a single track, lasting 24 hours. It would be nice to be able to jump by some coarse increments like 5 mins, 30 mins, 3 hours, then jump to within 10 secs of the end of the track, and perhaps some proportional jumps like 10% of track... Part of the problem is that fastforward/rewind gives no progress feedback at all for blind users, because of the issue with pausing. But beyond that, a tool to move around big tracks might be useful.
-A kind of voice database: idea in part from Mario Lang. The metadata spoken in the database browser or id3 screen is always spelled, which is much slower. What if we preprocessed all files on the host computer, extracted all metadata tag text, had them all spoken similarly to what we do for .talk clips, and put all that in some sort of mini-database: perhaps just a big blob with some kind of hash index, or perhaps even each entry in a separate file. The idea is to have the voiced audio data indexed by the string that is being spoken. These could then be loaded on demand and cached. We could try to mitigate the disk access delays by keeping stats as to the most often used tags and preloading those in one pass. I'd need to experiment to find out whether this would be feasible. If it does work then it would be useful beyond metadata tags. OTOH, if the espeak plugin works and assuming the response time is good, perhaps this is really not needed.
-See if we could implement DAISY, or a subset of it, by preprocessing the DAISY on the host machine and coming up with a cue sheet for each DAISY level. We'd need to have a playlist backed by multiple cue sheets and the ability to switch between them. I'm not entirely sure this makes sense, would need to look into it more seriously.
-Spontaneous battery level warning, say at 50% and 20%.
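To make the coarse navigation idea above a bit more concrete, here is a rough sketch of the arithmetic only, assuming track positions are tracked in milliseconds. The names coarse_seek and enum coarse_jump are made up for illustration and are not existing Rockbox functions; how the result would be handed to playback and spoken back to the user is left out.

```c
/* Hypothetical jump kinds for illustration; none of these exist in Rockbox. */
enum coarse_jump {
    JUMP_5MIN,
    JUMP_30MIN,
    JUMP_3HOURS,
    JUMP_10PERCENT, /* proportional: one tenth of the track length */
    JUMP_NEAR_END   /* land 10 seconds before the end of the track */
};

/*
 * Return the new position (in ms) after a coarse jump from pos_ms.
 * direction >= 0 jumps forward, direction < 0 jumps backward;
 * it is ignored for JUMP_NEAR_END. The result is clamped to
 * the range [0, length_ms].
 */
long coarse_seek(long pos_ms, long length_ms,
                 enum coarse_jump kind, int direction)
{
    long step;

    switch (kind) {
    case JUMP_5MIN:      step = 5L * 60 * 1000;      break;
    case JUMP_30MIN:     step = 30L * 60 * 1000;     break;
    case JUMP_3HOURS:    step = 3L * 60 * 60 * 1000; break;
    case JUMP_10PERCENT: step = length_ms / 10;      break;
    case JUMP_NEAR_END: {
        /* Absolute target; tracks shorter than 10 s go to the start. */
        long near_end = length_ms - 10L * 1000;
        return near_end > 0 ? near_end : 0;
    }
    default:
        return pos_ms; /* unknown kind: stay where we are */
    }

    long target = pos_ms + (direction >= 0 ? step : -step);
    if (target < 0)
        target = 0;
    if (target > length_ms)
        target = length_ms;
    return target;
}
```

Clamping rather than refusing an out-of-range jump means a repeated keypress near either end of the track is harmless, which seems to fit the goal of working with little or no feedback.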
-Make some plugins talk, at least those called from core.

And more...

--
Stéphane Doyon <s.doyon_at_videotron.ca>
http://pages.videotron.com/sdoyon/

Received on 2007-10-06