This is the bug/patch tracker for Rockbox. Click here for more information.
Quick links: Bugs · Patches · Rockbox frontpage
FS#8894 - Speeding playback up/down without affecting pitch
Attached to Project:
Rockbox
Opened by Stephane Doyon (sdoyon) - Tuesday, 15 April 2008, 03:23 GMT+2
Last edited by Steve Bavin (pondlife) - Friday, 12 June 2009, 09:21 GMT+2
Opened by Stephane Doyon (sdoyon) - Tuesday, 15 April 2008, 03:23 GMT+2
Last edited by Steve Bavin (pondlife) - Friday, 12 June 2009, 09:21 GMT+2
|
DetailsSpeeding playback up/down without affecting pitch
aka time scaling. The good news is: this actually works. The bad news: it still needs work. I've been using this on my players for many months now. Life has made it so that my Rockbox spare time has been reduced to very little, unfortunately. And this works just well enough that I have not felt a pressing need to complete this. Therefore I am putting this up in its current state so that it can be useful to others and in the hope that someone will pick it up and give it the love it needs. This work it based on a previously unreleased implementation by Nicolas Pitre <nico@cam.org>. So the credit for this mostly goes to him. It's loosely based on the WSOLA algorithm. Nicolas implemented this from scratch, working from a good understanding of the general algorithm. My contributions: I helped with a bit of tuning and a bug report or two, and I started on this half-baked integration into Rockbox which is still pretty rough. Nicolas and I both used this implementation for a few years on our Linux'ified iPAQ H3600 handhelds, to speed up talking books. Those have a StrongArm processor running at 206MHz, which is relatively modest and does not support floating point operations. Nicolas released the code to me under GPL, with the explicit understanding that I would post it here for integration into Rockbox. He is not himself a Rockbox user at this time. This patch has been tested on X5 and E200. It works well enough for speeding up audio books (which are typically lower bit rate and mono). I cannot stress enough how tremendously useful this feature is to me. Slowing down speech works, but intelligibility is not much improved. Music can also be sped up or slowed, but with significant distortion. I can speed up low bit rate speech to about a factor 3, although in actual use one would normally use a factor of 1.6 to 2. Speeding up high bit rate music runs out of CPU at a somewhat lower factor. Since I was familiar with Nicolas's implementation and I knew it did not require too much CPU power, I naturally used that when trying to speed up playback on Rockbox. Nowadays there are other implementations that could potentially be used. The only one I have actually tried is Soundtouch, and that comparison was admittedly done in haste. My findings were that for speeding up speech, Nicolas's algorithm appeared to sound somewhat better (less clicking distortion), while for slowing down music, Soundtouch was better. Since my goal is to speed up audio books, and since this implementation works well enough for me, I am not really motivated to further investigate alternatives. The main difficulty in integrating this algorithm into Rockbox is that it needs a relatively large sound buffer to work on, a latency of about 0.1s, and this would be the first Rockbox DSP effect to have this kind of requirement AFAICT. Also the implementation was meant to process larger chunks at a time, and I do not have a very accurate estimate of the required input buffer size for the algorithm, and so I am feeding it larger chunks than absolutely necessary. Some latency can be felt in the UI: little or none for low bitrate files, but pretty bad for high bit rate files. A better integration with dsp.c and better buffering estimations would presumably prevent this. I haven't measured the effect on my battery life. Subjectively, it doesn't feel disastrous, but I imagine it could be improved. I've bypassed the IRAM buffer that was too small for my needs. It should be easy to add logic to use the IRAM buffer at least when time scaling is not in effect. I've also left a bunch of debugging macros in there. The algorithm has several tunables that trade quality for CPU utilization. I imagine some DSP gurus might like to tinker with these and with the code. I have played with this a bit and I think the current quality level is (subjectively) just about right for speech. Another interesting feature to add would be a true pitch shift function: combining this time scaling function with the sampling rate alteration effect (what Rockbox currently calls pitch, to produce an effect that shifts pitch without affecting speed, or that allows controlling both speed and pitch independently. I imagine musicians would find that useful. I hope this will make other speech listeners as happy as it's made me. |
This task depends upon
Closed by Steve Bavin (pondlife)
Friday, 12 June 2009, 09:21 GMT+2
Reason for closing: Accepted
Additional comments about closing: Thanks!
Friday, 12 June 2009, 09:21 GMT+2
Reason for closing: Accepted
Additional comments about closing: Thanks!
Separate patch because this one is likely to go stale quickly...
EDIT: I know you said it expects that but why? It could keep it's own history buffer and interact with main dsp in smaller chunks, no?
Nice work! I'd be very interested in working on this and getting it committed, and would like to start by understanding the algorithm properly. Could you (or Nicolas) point me to a site (or book) which might act as a gentle introduction to the algorithm? I've tried googling WSOLA, but nothing remotely beginner-level appears.
p.s. Apologies for spelling your name wrong in CREDITS...
>Could you (or Nicolas) point me to a site (or book) which
>might act as a gentle introduction to the algorithm?
Indeed it's not exactly obvious. Unfortunately I'm not aware of any good
explanation/introduction to this.
There's this:
http://en.wikipedia.org/wiki/Audio_timescale-pitch_modification#Time_domain
but it's far from detailed.
Let me try to introduce the general idea in simple terms.
Roughly the idea is to pretend that your sound wave is a series of
repeating sequences, that repeat at whatever the fundamental frequency of
your sound is, and assuming that consecutive sequences are similar to one
another. On each iteration of the algorithm, we take a bit of the audio
we are about to process, and lookahead through the upcoming samples to
find a similar sequence. (You can lookup autocorrelation.) We then
proceed to clip out (sort of) one of the sequences, to speed up the
sound, or duplicate one, to slow it down.
How long the compared sequence is and how far ahead we look, depends on
the speed up factor, and in this implementation things are bounded by the
rough assumption that our fundamental frequency is higher than 100Hz. You
could do fancier stuff I'm sure. The best correlation is found by
searching for the offset at which we get the minimum of the sum of the
square of the deltas between corresponding samples. That's the costly
part. We actually have to skip a lot of those samples, and amazingly it
still works pretty well. There are possible variants for this step as
well of course, and you can tune quality vs processing time.
Now we don't actually clip parts out, or just repeat them, since that
would click like crazy, and of course our alignment and estimation of the
frequency are very rough. We do a kind of crossfading. When speeding up,
we take two consecutive sequences, and replace them with one that is the
result of mixing the two: giving most weigh to the first sequence at the
beginning and shifting the weight to the second sequence as we go. So the
beginning matches the beginning of the first sequence and the end matches
the end of the second.
Now I wrote this in 5mins, I'm no expert, and I'm only trying to
introduce the general idea here, so please be tolerant of this
explanation :-).
The part of the code that does what I just explained should be somewhat
understandable. However I admit some of the buffer management around it
is still somewhat obscure to me.
>p.s. Apologies for spelling your name wrong in CREDITS...
:-) No worries. And at least YOU had put in the accented E ;-) !
>Do the buffers have to be so large that they require removal from IRAM?
>EDIT: I know you said it expects that but why? It could keep it's own
>history buffer and interact with main dsp in smaller chunks, no?
So the autocorrelation phase needs a large chunk of input to operate
on. This isn't just history exactly: we're considering what chunk to clip
out and crossfade, and we can't start outputting it until we've decided.
The current implementation will work fine if you feed it tiny inputs: it
will buffer them internally. Empirically (not sure how to calculate
precisely) the minimum required buffer size is 3524 samples
(3524*4bytes). Some memory copies are of course involved in maintaining
that buffer, so this isn't terribly efficient.
A limitation of the current implementation is that when it has decided on
a correlation, it outputs an entire "frame" in one call. With some work,
it's conceivable the implementation could be made to split that operation
across successive calls. Alternatively the output could be buffered again
(with some more memory copies).
The current implementation has an output buffer of 4096 samples.
So we could feed it smaller chunks, but then the dsp/playback stack must
be ready for multiple iterations with 0 output. And we could possibly
have it emit smaller output, but then again you'd have several iterations
with 0 input and as much output as the rest of the pipeline can
handle. And the whole thing would probably be less efficient.
I'm not used to dealing with IRAM, so you'll have to educate me.
Also I'm not a DSP guru and I'm not particularly familiar with the
hardware either.
The autocorrelation is the costly step, as we look at the samples several
times. If someone were able to arrange this so we could do that step from
IRAM, I imagine there would be a huge gain. But I'm not sure how big a
buffer that would require, and how much IRAM we have left. In any case
this task is beyond me given my current availability.
Even if the required buffer is too large for IRAM in the general case, it
might perhaps be worth trying to make it work for cases where the audio
was downsampled to 22.05KHz or lower, as presumably the buffer
requirements might be more manageable, and this may be the case often
enough for speech.
So assuming autocorrelation cannot be done from IRAM, because it requires
too much space or because no one figures out the code, it means that this
costly step is done from ordinary RAM. I thought that the other DSP
effects (at least those I use) probably had a relatively smaller cost
compare to this one. I had speculated that in that case, the benefits of
IRAM were perhaps not so interesting, since they might be offset by
multiple calls and memory copies around the time scaling code.
The one big TODO that I left hanging however is to make sure to use the
existing IRAM buffer whenever time scaling is not in effect.
I've been hoping for an independent pitch-shifting plugin for a while (I'm a musician), I hope this will be implemented!
The version of this patched you emailed me, (it may already be up
here) works well for me.
I tested it on some music and it didn't lock up the player like older
versions did. Worked very well and the playback did speed up and slow
down. Some problems:
- Sometimes the player was slow to react when I held up/down in the
speed screen for a few seconds. As in it would start speeding right up
and I don't think the voice or the player could keep up.
- When the music was very fast, voice sometimes cut in and out, but
was still intelligible.
One thing I would suggest, do you think making the screen act like the
pitch one would be a good idea? So you could hit select and the speed
would go back to the default (100).
But looks nice from a quick test.
Thanks.
the IRAM resampling/dsp buffers when speed is not being altered.
Daniel Dalton wrote:
>The version of this patched you emailed me, (it may already be up
>here) works well for me.
Here it is.
>I tested it on some music and it didn't lock up the player like older
>versions did.
Erm well I didn't fix anything. This thing may stress buffering
somewhat, and I believe there have been a few fixes in that area in
the past weeks. (Mind you I'M not saying this code is perfect and
wouldn't ever freeze your player... but personally I still see
occasional freezes both with and without speed alteration.)
>Worked very well and the playback did speed up and slow
>down.
Good! At least one person tried it :-).
What player was this?
>Some problems:
>- Sometimes the player was slow to react when I held up/down in the
>speed screen for a few seconds. As in it would start speeding right up
>and I don't think the voice or the player could keep up.
Yes. That's what happens when you speed up high bitrate files beyond
what your player's CPU can handle. Ideally the range of the setting
would be clamped to something that doesn't let you shoot yourself in
the foot too much.
>One thing I would suggest, do you think making the screen act like the
>pitch one would be a good idea? So you could hit select and the speed
>would go back to the default (100).
You can do this by pressing CONTEXT, probably LONG SELECT. This is one
of the main reasons I added that feature to reset a setting to its
default.
Thanks for testing and reporting!
I tested this patch with e200, and find 2 bugs.
1. Using bass or treble function in sound menu with this patch, freezing occur.
2. Using crossfeed function with this patch and press next track, freezing occur.
Thank you for this great work!
With latest build(18093), this patch cause freezing problem when using with other sound option(crossfeed, bass, eq, etc..).
I think this works almost perfectly without using with other sound option(except pitch. pitch works with this patch well).
I'm using e200 and h120. Both of two players, this problem occurs same.
Oh.. I compiled v.18093 with this patch only(compiled with cygwin).
On h120, everything was okay except equalizer. When I use "Low shelf filter" or "High shelf filter", my h120 freezed. "Peak filter" had no trouble with this patch.
I think this freezing could caused by bug of shelving filter. http://www.rockbox.org/tracker/task/8867?histring=noise
I'll try with e200 another day.
To avoid unexpected freezing on h120, I managed sound_menu.c and eq_menu.c to disable eq option automatically when use this patch. But this is not profer solution.
Could you please release a single patch/diff that matches a more recent source revision?
The content has been split into different patches, relevant to different revisions.
Maybe (probably) there's something I'm doing wrong, but the patches just cannot be applied altogether.
Help?
Thanks in advance.
This way, further updated to the patch will be applied to the relevant branch/tag and when necessary it will be possible to do forward-backward integration to the main release branch.
Anyways, it will be much easier for newcomers (like myself) to get the right codebase, that holds all the "information" a patch requires.
?
What do you think about the tag-per-patch option?
This is a familiar best-practice for these kind of things.
I bought Gigabeat s30 and made a build to use this patch. And tested.
Like e200 and h120, freezing bug was occur, too. On gigabeat s, bug occured with EQ and crossfeed. I think, on coldfire cpu, when use EQ and this patch freezing bug happen. And on arm cpu, freezing problem happen when use this patch with EQ, crossfeed, bass, treble.
And according to your comment
(I can speed up low bit rate speech to about a factor 3, although in actual
use one would normally use a factor of 1.6 to 2. Speeding up high bit
rate music runs out of CPU at a somewhat lower factor.), this patch can fully used like Linux'ified iPAQ H3600 when change and use this source with gigabeat f or s. To change this patch, I saw this patch. But I couldn't find "factor" because of my lack of C language. How can I change this?
One more. To avoid freezing problem, I sligtly changed this patch(attached). When enable eq or crossfeed, sound speed option will change to 100 automatically.
I've still not had any freezing, tried on both H300 and Gigabeat F (and the simulator).
To me, the word freezing implies that the device locks up completely and has to be hardware rebooted (reset switch) - is this what you're seeing? Certainly at higher speeds, use of EQ might cause it to run out of CPU, when the audio playback will gap - is this what you're experiencing?
Most useful would be if you could attach an example audio file and your configuration in case this is specific to a particular setup/codec/encoding.
When I use crossfeed or EQ, my device was stop updating screen and a while later, device was shutdown. There wasn't need to hardware reboot(reset switch) but shutdowned automatically. This is my problem.
I currently use gigabeat s, so I think this is not run out of cpu problem. And this problem does not show on simulator.
I experienced that problem, as below listed player.
H100/H300 - When use low and high shelving filter in EQ, device goes to stop and shutdown.
E200 - When use bass/treble/crossfeed/EQ, device goes to stop and shutdown.
Gigabeat S - When use crossfeed/EQ, device goes to stop and shutdown.
I use 128k VBR MP3s most of the time, I don't think that would be the issue.
Again, if you could post your H300 config.cfg file here, I'll copy it onto my device and see if I can repro.
At that point, I think, you experienced this problem, too.
Currently, I have gigabeat s only. So I cannot upload h100/300 and e200's cfg file.
I use this patch with other various patch. I'll try test again and re-post with official build + tdspatch.
Problem occured with gigabeat s but not shutdowned automatically.
with crossfeed, device was halted without any massage.
with EQ, device was freezing with "Data abort at 000457AC" massages. (00043F24, 00043E70, ...)
I attached my config file.
1) It uses a lot of data space (even on Archos devices. This is probably down to the static buffers (outbuf and overlap_buffer). A better solution would be to allocate these from the audio buffer (probably requiring a reboot to enable/disable at the moment, maybe a seperate enable option too).
2) It crashes in some cases - see above. I'm not able to repro this reliably though (even with Lee's config.cfg), so it's probably something horrible!
Attempted to apply the 080808 patch to release 19195 and had the mods in english.lang rejected, so I added those manually to the file. Target is an iPod Mini 1G.
Good news - the item shows up on the menu.
Bad news - when I select it and change the playback rate, the iPod locks up and needs a reset.
If I disable the EQ, it seems to work OK - the EQ seems to kill it.
If I turn on crossfeed, I get a "data abort at 0004b30c (0)" and it crashes.
I also noticed that the pitch option vanished from the menu as a result of applying the patch - not sure if that was intentional or not.
This might indicate a problem with initialization, no?
Compiled and built for my iPod mini 1g and it seems to work OK.
Not sure if it's my memory playing tricks, but the output sounds more distorted to me than old versions (on H300 sim).
The other thing I need to check is how far tdspeed_apply() recurses. Does anyone know if a sim has the same size stack as the corresponding target.?
if (dsp->tdspeed_active && (samples = tdspeed_doit(tmp, samples)) <=0)
break;
- it handles any required audio rebuffering
- it takes care of any alignment
Briefly tried on sim, but not on any device, so please test and let me know what's broken!
(a) someone told buffering.c about this (it should really be in buffering.c, not playback.c) and
(b) we could ask buffering.c if a given handle is still valid
An alternative is to use buffer_alloc(), but that would mean that a reboot would be needed for any enabling of timestretching - which I feel is sucky with the current config. I guess there's always the option of another config option to "Enable timestretching" so you could pre-allocate the buffer, but that also seems like settings bloat to me.
For now, I can see it works ok on my H300, as long as I'm careful which plugins I use... ;-)
Going forward I'd like to integrate timestretch into the pitch screen - so perhaps we don't want to persist the speed setting anyway... Any opinions from audiobook readers?
Finally - might this be useful for audible seeking (eventually)?
Rather, I'd change the pitch screen to have 3 modes (it has 2 currently, but those only change how the speed increases), which in includes the no-pitch timestrech one.
Currently the pitch setting is set back to 100% at boot, but the timestretch percentage is persisted. I merely meant that the timestretch percentage would also be set to 100% at boot for consistency.
It's a bit odd at the moment with the two timestretch settings, but hopefully will feel more natural if/when the percentage is controlled from the pitch screen.
Please test and report back!
From a menu organisation standpoint, perhaps having the timestretch settings in a submenu called "Time Stretching" (kinda like the EQ menu) would seem a little less unusual.
FWIW, the way I use this, I appreciate that it's persistent: turn on
player, press play, audio book continues at whatever speed I was using
last time. The pitch setting is, I believe, currently not persisted
across reboots.
Stephane - No problem, I've not done much apart from tidy up. This is really Nicolas Pitre/Stephane Doyon/Bryan VanDyke's work - I certainly don't understand the algorithm... ;-)
I'm not sure whether "Time Stretch" is a better or worse term than "Speed". IMHO, "Speed" is simpler, but "Time Stretch" implies you're going to hear artifacts. The pedant in me says that a higher stretch percentage would actually slow down playback, and I need to resolve that inner conflict, so might go back to "Speed" - or maybe there's a better term?
Another outstanding issue is that when you first enable the feature, I find that the "Please reboot" splash doesn't behave as it should. It should splash for 2 seconds (or until a key is pressed), then return to the menu. It appears to return to the menu immediately then draws the splash notice on top of it, with no timeout. I copied the code from the Dircache enable setting, which works fine. Anyone got any idea what I did wrong?
The reboot oddness seems only to happen to me when going from a false-to-true state. If going from true-to-true it seems to act correctly. Maybe that can help somebody track it down.
Ah - just noticed that enabling cuesheet has the same oddness - so it's probably just a bug in splash or the menu code.
Maybe this is ready for commit, then onto integration with the pitch screen... what say you??
So the act of changing the variable verses setting it to a state it's already in, changes the way the splash/menu code behaves. Maybe that can help somebody narrow down where the bug could be in the splash/menu code.
That being said, after a day of use the code works fine on my H10 5g.
He had dithering/EQ/crossfeed all disabled, FWIW. I can't repro on either my H300 or a sim. :/
When "speeding up" the amount of input data required is larger than the output buffer - I wonder if something (in the codec handling) is assuming that this won't happen? It should be transparent, just resulting in more requests for data.
130% stretch and 130% pitch gave data abort at 0004b478 (which is dircache_get_entry in dircache.o) while in the wps.
70% stretch and 150% pitch and 70%stretch and 70% pitch both gave the data abort in strncpy as mentioned above
when trying to enter the filebrowser, i once got a hard freeze.
with dircache disabled or time stretch disabled i could not reproduce either of those.
I think this supports the theory that the dircache is corrupted by timestretch in some way.
edit: oh and the new strings should depend on the "swcodec" feature in the lang files.
edit2: a few nitpicks: the patch contains commented out lines of code and #if 0'ed blocks that i think should be removed
big_sample_buf = buffer_alloc(big_sample_buf_count * sizeof(int32_t));
big_resample_buf = buffer_alloc(big_sample_buf_count * RESAMPLE_RATIO * sizeof(int32_t));
instead of:
big_sample_buf = buffer_alloc(big_sample_buf_count);
big_resample_buf = buffer_alloc(big_sample_buf_count * RESAMPLE_RATIO);
big_sample_buf = (int32_t*)buffer_alloc(big_sample_buf_count * sizeof(int32_t));
big_resample_buf = (int32_t*)buffer_alloc(big_sample_buf_count * RESAMPLE_RATIO * sizeof(int32_t));
For some reason the very first time I started it up, the mp3 playback was garbled (just noise), but subsequent replays seem to be OK and I can't reproduce - could just be an odd holdover from the earlier build on the system. If it happens again, I'll see if I can get some additional info (though I'm not sure how I'd do that to be honest).
I tweaked the code to include the proper casts in Bryan's last update as well (is it just the two places in dsp.c, one for each line?). I don't imagine that has any significant bearing on the function of the patch, but at least his CS teacher won't smack either of us upside the head. :-)
I started a CS degree, but most of what I did was FORTRAN, x86 assembly, and Ada (pre-95). I taught myself C years before college, no doubt I'd have missed the casts entirely. :-)
Buffer allocation seems sane to me, but more testing is always good...
As for suggestions.
If MIN_FACTOR was increased to 50 the buffer requirement for timestretch could be reduced from 3x to 2x. This would also carry over to the resample stuff which has a 4x on top of that.
Get rid of small_(re)sample_buf and big_(re)sample_buf and buffer_alloc directly to (re)sample_buf. They're redundant and made more sense when the memory wasn't being allocated as start up.
I honestly can't say I understand the rest of the code that well though; feel free to update.
256 samples = 1024 bytes, maybe that's where I got confused before?
There was a mistaken assumption that resample needs to be 4x sample size. It only needs to be 4x the size the chunk being fed to it. Normally that is the sample buffer. But with timestretch that is the timestretch out buffer. This patch breaks up the timestretch out buffer into smaller chunk and feeds those down the line.
I left the 1 to 4 ratio in, but I think the resample buffer could be reduced some more.
I also have permission from both Stephane Doyon and Nicolas Pitre to change the license to GPL v2 or later. I'll take the liberty of quoting Nicolas:
>> Nico: are you OK with GPL v2 or later for the speed scaling?
> Sure, no problem.
>> For your reference, the current patch is at
>> http://www.rockbox.org/tracker/task/8894
>> Hmmm... Well, this implementation is not the latest (meaning best performing) version. I reworked the code a bit to factor out the interesting tunables,
>> and added alternative autocorelator implementations to play with. Also tweaked the code to get same quality results as the soundtouch implementation.
>> See attached my latest version if interested.
I think we would be interested - attached is the updated source.
Could the speed setting to be made specific to a playlist/bookmark (like shuffle is)? Flipping from spoken work (audiobook) to music shouldn't require changing the speed back to 100.
Sal - I didn;t know you could tie shuffle to a specific bookmark/playlist; please can you point out the setting for me? I personally have several .cfg files for switching settings- e.g. car.cfg / dj.cfg / book.cfg...
090304 - binary: +2696, RAM: +2800
090306 - binary: +2948, RAM + 3056
I've rejigged the chunk handling code in dsp.c to reduce this down to binary: +2748, RAM: +2848, in 090306b. Bryan, please could you double-check?
(IMHO, it's much more important to minimise impact for non-timestretch users than it is to save RAM for those who'll use the new functionality.)
I had another Idea to save some more memory. Get rid of the alloc for the big sample buffer and use the small re-sample buffer in its place. They're both 1024 samples (4k). Plus the added benefit of keeping part of the buffering in IRAM.
Delta is now binary: +2736, RAM: +2832.
Firstly once it is merged into the pitch screen which I believe is the current plan, perhaps the pitch screen should be renamed as it will be dealing with not only pitch but speed as well.
Next, if this patch is merged into the pitch screen, will the pitch screen be altered so that it can adjust pitch but without adjusting speed?
I think if possible we should be given 3 options, to either adjust pitch or speed, or do as the existing pitch screen does and adjust both at the same time.
The pitch screen changes are a seperate task, that'll be a whole new FlySpray entry.
It might just be a "speed screen" which runs alongside the pitch screen. But I like the idea of a third mode inside the pitch screen (pitch / semitone / stretch). When in stretch mode, the left/right buttons would control speed (i.e. just modifiying timestretch speed) and up/down would control pitch (modifying both timestretch and sample rate).
Maybe you'd like to write the patch :-)
In file menu.c in the do_menu() function about line 623 the function do_setting_from_menu is called. When this function returns true, the original parent menu is redrawn. (The function returns true when the variable in question changes.) Lower in do_menu function, the menucallback with ACTION_EXIT_MENUITEM is called. (which is where the splash screen is displayed) And finally the parent menu is redrawn again.
so what happens is
change setting menu
CHANGE
redraw
callback with splash.
change setting menu
NO CHANGE
callback with splash
redraw
FS#9994(nice co-incidental numbering there).On the e200 you toggle pitchscreen modes with REC, I thought you are going to press it once more to get to the timestretch. and that pitch_increase(); would handle all at once, just depending on the mode.
The mode selection is the same, this is just a third mode that you toggle through if timestretch is enabled (and available). Try it and see!
Have a go if you like, but you need to maintain the beatmatching/nudge facility, as well as the semitone mode, so I thought a third mode was more in keeping. Personally, I don't much like the colour and EQ screens, but each to his own - I use the pitch screen a lot more, so it's probably familiarity too.
A bigger issue is that kugel reported two problems on IRC:
1) Use of timestretch is causing data aborts for him on target (a Sansa of some kind, I didn't ask which)
2) This causes 64-bit sims to segfault (right back to the earliest patches). I've gone for an over-the-top removal of long, int and long long here - replaced with int32_t and intt64_t. This seems to make no difference here, but I can only test on a 32-bit sim.
I can't repro data aborts still - could someone who's seeing them attach their config.cfg in case this is relevant. Also, does it happen with any particular file format?
I don't get a crash with your .cfg file (on H300). I don't have the Tango WPS either, but doubt that's particularly relevant. Could you perhaps try resetting your settings to default, then checking to see if it crashes? If not, then enable things (crossfade, crossfeed, EQ) one at a time to see if there's a particular feature that timestretch disagrees with?
Please use the latest SVN build - here's a resynced patch (no changes aside from resync).
[jhenderson@krikkit build]$ make
CC lang_core.c
/home/jhenderson/Downloads/rockbox-svn/rockbox/apps/lang/english.lang:12471:1: warning: dest before line lacks quotes ()!
/home/jhenderson/Downloads/rockbox-svn/rockbox/apps/lang/english.lang:12471:1: warning: source before line lacks quotes ()!
/home/jhenderson/Downloads/rockbox-svn/rockbox/apps/lang/english.lang:12471:1: warning: voice before line lacks quotes ()!
/home/jhenderson/Downloads/rockbox-svn/rockbox/apps/lang/english.lang:12471:1: warning: empty dest before line in non-deprecated phrase!
/home/jhenderson/Downloads/rockbox-svn/rockbox/apps/lang/english.lang:12471:1: warning: unknown user!
In file included from /home/jhenderson/Downloads/rockbox-svn/rockbox/build/lang/lang_core.c:5:
/home/jhenderson/Downloads/rockbox-svn/rockbox/build/lang/lang.h:513: error: syntax error before ‘,’ token
make: *** [/home/jhenderson/Downloads/rockbox-svn/rockbox/build/lang/lang_core.o] Error 1
Jim
Summary:
patch > 090309 work about 10% of the time
090309 works very good
I only have and tested on a Sansa e200
Does the data abort code have a meaning?
I'm not familiar with data aborts - I guess this means that data alignment is incorrect, or we're attempting to access an invalid address. I suppose you could compare the address given with a map for your build, but I've no idea if this will give useful info. I suspect that my own target (H300) just doesn't give data aborts easily, if at all.
Data abort experts - this is your chance to help!
Still no fix for data aborts... nor any input or help from users of affected targets (hinty hint hint).
My h120(coldfire target) works so nice with this patch, but my gigabeat s(arm target) and ipod 5.5(arm target) have data aborts problem.
Does this one also help with data aborts? I compared the 090309 and 090316 patches, and put some consts back in.
I'm not sure why, but this gives some build warnings:
/home/Steve/rockbox/apps/dsp.c: In function `sample_output_new_format':
/home/Steve/rockbox/apps/dsp.c:608: warning: initialization from incompatible pointer type
/home/Steve/rockbox/apps/dsp.c:609: warning: initialization from incompatible pointer type
/home/Steve/rockbox/apps/dsp.c: In function `resampler_new_delta':
/home/Steve/rockbox/apps/dsp.c:727: warning: assignment from incompatible pointer type
/home/Steve/rockbox/apps/dsp.c:729: warning: assignment from incompatible pointer type
The parameters look identical to me...
Sorry I haven't been around lately, got busy with other things. I've just rebuilt with the patch you just posted and the latest svn build (21117) and got a data abort at 000096C0 when changing tracks. It doesn't happen every time, though, but is eventually reproducible.
I also got a data abort (after a reboot) at 000093CC when attempting to adjust the playback speed. This also doesn't happen every time.
I've built with debug enabled, but am unsure as to what information would be helpful. If you can point me in the right direction, I'll provide whatever I can.
Thanks for the report. gevaerts has been doing some tests and he's found that the Data Aborts go back at least to the 090309 patch.
He earlier reported that 090305 worked ok ( http://www.rockbox.org/irc/log-20090528#20:20:50 ) but that result is suspect as later tests failed to fail when expected.
It would be great if you could test the 090305 patch and report back here.
Will give that a shot and report back. Building it now against r21117 (looks like everything patched OK with the 090305 patch, a few offsets wrong and a reject on the CREDITS file, obviously no big deal). Will run with that build for a while tomorrow and see how it looks.
Basically my hope is that if we can work out exactly which patches are working and which are not, I'll find what I did that broke it.
I assume, realistically, it would be one of my changes at fault, and that the basic dsp is working. Of course, it might be that there's a problem throughout but that only some patches are overwriting memory in such a way as to cause instant problems - so best to run each one for a while if it doesn't fail immediately.
I'll do that later today.
#0 0x0000000000409edf in find_handle (handle_id=8)
at /home/jhenderson/Downloads/rockbox-svn/rockbox/apps/buffering.c:381
#1 0x000000000040bc49 in prep_bufdata (handle_id=8, size=0x7fffeaf2f008,
guardbuf_limit=true)
at /home/jhenderson/Downloads/rockbox-svn/rockbox/apps/buffering.c:1107
#2 0x000000000040c00d in bufgetdata (handle_id=8, size=0, data=0x7fffeaf2f040)
at /home/jhenderson/Downloads/rockbox-svn/rockbox/apps/buffering.c:1196
#3 0x0000000000447a50 in bufgetid3 (handle_id=8)
at /home/jhenderson/Downloads/rockbox-svn/rockbox/apps/playback.c:342
#4 0x00000000004483ec in set_filebuf_watermark ()
at /home/jhenderson/Downloads/rockbox-svn/rockbox/apps/playback.c:870
#5 0x000000000044a8fe in audio_thread ()
at /home/jhenderson/Downloads/rockbox-svn/rockbox/apps/playback.c:2420
#6 0x000000000046131d in runthread (data=0x86d3c0)
at /home/jhenderson/Downloads/rockbox-svn/rockbox/uisimulator/sdl/thread-sdl.c:468
#7 0x00007ffff7b564b7 in ?? () from /usr/lib64/libSDL-1.2.so.0
#8 0x00007ffff7b9a4f9 in ?? () from /usr/lib64/libSDL-1.2.so.0
#9 0x00007ffff792d070 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff76a010d in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
I was hoping to duplicate the data abort in the simulator to see if I could get any more useful information. It does seem that buffering is possibly the problem that causes the data abort (the last time I got the abort with this build on the real device I was in the debug menu watching the buffer usage and flicked through the tracks very quickly with the scroll dial).
It could be that this segfault in the sim is entirely unrelated, but I wanted to report it in case it was.
Goven that the 2 data aborts I've been abl;e to check are also in find_handle(), I'd say this is indeed the same issue.
Can you reproduce this with the sim? If so, which simulator - and could you attach your config.cfg here - I'd love to be able to repro it.
When the initial calls happen: settings_apply -> dsp_timestretch_enable -> tdspeed_setup -> tdspeed_init -> (allocations)
tdspeed_init isn't called in tdspeed_setup because tdspeed_percent is 100.
As a test I set tdspeed_percent to 101 right before tdspeed_setup in dsp_timestretch_enable.
dsp.c line 319 insert
AUDIO_DSP.tdspeed_percent = 101;
right before
tdspeed_setup(&AUDIO_DSP);
That seems to take care of the aborts in the Simulator and on my H10.
Bryan
I don't understand the C <-> ASM parameter passing, so perhaps somebody could confirm that there's no ASM change needed changing "int32_t *src[]" to "const int32_t *src[]". This affects the following:
dsp_cf.S:
- dsp_upsample
- sample_output_stereo
- sample_output_mono
dsp_arm.S:
- sample_output_stereo
- sample_output_mono
I did miss something - the manual. Could anyone who can build manuals help out? I'll try to write up some explanations later, but I'm not good at clear explanation!
And - biengo, I need your real name for the credits.
I see biengo's name is Benedikt Goos - will update CREDITS in the next version, but still could do with some help on the MAS manual problem.
I have to admit that I initially didn't like the move of the speed scale to the pitch menu, but the more I use it, the more I like it there instead. I wonder if it might be useful to change the "Pitch" menu item to read "Pitch/Speed" when Timestretch is enabled just so it's more obvious where to go to change the speed.
Second thing is that I really like the way the pitch/speed tie together on the iPod when adjusting the pitch (having the speed adjust to offset the pitch speedup is very nice). Might be useful to add a similar option to the semitone adjustment screen as well (makes it easier to adjust for people using the combination to do transposition).
But these are a couple of minor enhancements, the core works very nicely. Thanks again for all your hard work to get this working. :-)
Some further comments about the manual part:
- I think the button table part should also be divided into swcodec and masf, otherwise it will mention timestretch for the Archoses too
- the WPS button table could mention the timestretch function in the explanation of the pitchscreen/timestretch shortcut.
*) tested an e200, OndioFM, FM-Recorder and Player manual
Here's a commit candidate with MASCODEC replaced.and the button table also dependent on masf/swcodec.
Jim - I don't want to add timestretching to the first two modes because of the artifacts, I did think about a 4th mode (like the 3rd but using semitones for up/down), but decided it was overkill.
I intend to commit this as soon as the freeze is done. The main downside remains the binsize, and that's unlikely to improve much until after commit (when hopefully some people with optimisation clue will get involved).
also, A suggestion for the future, I think it would be good to have three modes, one to adjust both pitch and speed as the ordinary pitch screen currently does, this time stretch mode which just changes the speed, but then a third mode to adjust pitch without changing the speed.
Just some thoughts, what do other people think?
I certainly think at least for consistency pitch should be displayed under sound settings if time stretch is going to be there. As at the moment as far as I know it is only shown in the wps context menu and nowhere else.
The Sound Settings option only enables you to use timestretching as this uses up a little memory. The only place you can actually set the speed is in the pitch screen.
There is indeed a way to change pitch only as you suggest.- perhaps you could read the manual sections that the patch updates and see if we need to reword them to be clearer?
you said
"The Sound Settings option only enables you to use timestretching as this uses up a little memory. The only place you can actually set the speed is in
the pitch screen."
What exactly did you mean by that? I saw the option under sound settings and saw it could be adjusted so I assumed it was for setting the speed. What exactly is the difference between using the time stretch option found under sound settings and the pitch screen setting? I will try to review the manual, though I have problems building them under cygwin.
If you are using the latest patch you should only have a single Timestretch setting under Sound Settings, and all this should offer is Yes or No. To use timestretch you must set this option to Yes, then reboot once to allow the memory to be allocated.
Then you use the pitch screen to actually set the speed percentage and/or pitch - although I don't think that is voiced at the moment.
Basically, with timestretch enabled, the pitch screen now has three modes - percentage, semitone and timestretch. This means you need to press the mode button (whatever that is on your target) twice after you've entered the pitch screen. Once in timestretch mode, the left/right buttons (or the equivalent) will decrease/increase speed, while the up/down buttons will increase/decrease pitch whilst maintaining speed.
I think though that perhaps the time stretch option should be renamed to something like use time stretch rather than just plain time stretch.
Also, why isn't the pitch screen accessible from sound settings? as in order to access it, you have to use the wps context menu which isn't very easy for a speech interface user, as you have to leave file playback running in order to keep the wps up, and be able to use speech. Because of course if you stop playback your out of the wps, and if you pause playback you can't use the speech. So I think it would be good if the pitch screen could be made accessible from the sound settings menu to avoid having to fight with hearing the speech over a file playing whenever you want to change speed or pitch.
Perhaps the pitch menu could be renamed to something like rate adjustment as rate can refer to pitch and speed.
So just wondering is there a mode that can be used to adjust pitch and have the speed go up or down as the pitch changes? as the original pitch screen used to do?
Steve, with regards to this - the fourth mode is what I was thinking of. Reason being that if you're doing transposition, the semitone adjustment is useful for that, but on its own it adjusts the speed as well, so offsetting that with timestretch makes sense. The way it is now, if someone wanted to do this, they'd have to use semitone adjustment to get the pitch adjustment, and then use the linked settings to make the adjustment manually. Really a matter of convenience (not something I'm likely to use it for myself, just thinking of the particular use case in question).
I made sure to update the build on my player by deleting the rockbox folder, compiling a new version from source and copying it into the players root.
I also compiled a new voice file from scratch and copied it into the lang folder in the rockbox folder on my player.
the problem I am experiencing is that I go into the time stretch option under sound settings, and the speech says something like time stretch no.
I go to change it to yes, and my player just freezes.
I have to hard reset it. I am using an h140.
Almost certainly it is some other patch causing problems on the player as I run a large number of sdoyons other patches, however I didn't experience problems with the player locking up when he sent me his last set of revised patches.
I will unapply all of his other patches, apply only this version of the patch, recompile and let you know if the player continues freezing.
that when I try to change the setting from no to yes the player is not actually freezing, when I move the joystick to select yes I get no speech feedback, and the player appears to lock up and won't do anything unless I press the play button.
Does anyone have any ideas what might be going on here? as the patch appears to apply with no problems.
I don't really want to have to abandon any of sdoyons other patches as they make life a lot easier when using rockbox, particularly for a speech user.
If I were to upload his set of patches, plus my cfg file, would it be possible for someone to please take a look at it and see what might be going on when I have this latest speed patch in a build with them?
as obviously I can't really tell due to the lack of speech feedback what's happening.
PS: What are the other patches, and why can't there be in SVN (yet)?
To be honest I am not completely sure what some of the patches he has created do, as he has never sent me a detailed explanation about a lot of them.
However, I believe a lot of them fix issues with the speech interface due to there names, one I know creates an option to switch the player off with the hold switch.
Another allows a visually impaired user to get information about elapsed and remaining time on a playlist, as well as on an individual track. It also gives information about average track bit rate and a lot of other information, information that normally is not accessible.
It has been suggested in the past that this would be better off as a plugin rather than in the core. However I feel this functionality would benefit a lot of rockbox users not just visually impaired ones as it means you do not have to look at the screen when wanting to know how long you have left on a given track.
Another one of his patches makes the player beep when you cycle to the end or beginning of a list, it means you can scroll through a list extremely quickly and do not have to wait for the speech to catch up.
His other patches make a lot more types of characters voiced, and also I believe make things such as the playlist catalogue voice propperly.
Another one makes the player beep and give a message when the charger is removed or inserted.
But to be honest I am not really sure what some of them are for.
Some of them are not on the patch tracker, I thought about uploading them but I decided not to as he has not given me permission to open a new task for them, but I would love for some of them to be committed as I hate having to ask him to resync every few months.
If I were to get permission to upload them as well as descriptions from him about what they do I would be happy to upload them.
If you might be able to take a look and advise me what is going on when I have the speed patch incorporated into a build with his other patches I would appreciate it. I can e mail them to people privately if they would like in order to avoid cluttering up the tracker.