This is the bug/patch tracker for Rockbox. Click here for more information.
Quick links: Bugs · Patches · Rockbox frontpage
FS#8768 - Instrument tuner plugin
Attached to Project:
Rockbox
Opened by Lechner Michael (smoking_gnu) - Thursday, 20 March 2008, 20:40 GMT+2
Last edited by Alexander Levin (fml2) - Sunday, 20 September 2009, 18:31 GMT+2
Opened by Lechner Michael (smoking_gnu) - Thursday, 20 March 2008, 20:40 GMT+2
Last edited by Alexander Levin (fml2) - Sunday, 20 September 2009, 18:31 GMT+2
|
DetailsThis plugin is an instrument tuner.
Currently, there's no real GUI, it just prints the note you're playing again and again until you press power. Here's the c file: |
This task depends upon
Closed by Alexander Levin (fml2)
Sunday, 20 September 2009, 18:31 GMT+2
Reason for closing: Accepted
Additional comments about closing: The work in this task has been committed in r22663, r22685 and r22753.
Sunday, 20 September 2009, 18:31 GMT+2
Reason for closing: Accepted
Additional comments about closing: The work in this task has been committed in r22663, r22685 and r22753.
Should be more readable now; no functional changes.
Furthermore, it uses quite a lot of floating point arithmetics, which we aren't too fond of. But I think we can let that pass, since it's only a plugin.
Good work!
Identifiers
-----
We do not want mixed case identifiers.
Variables and function names should be all lower case.
Struct and enum names should be all lower case.
Preprocessor symbols and enum constants should be all upper case.
Another nitpick from CONTRIBUTING though,
"Keep lines below 80 columns length."
schmittTriggered=0;
:-)
Well, I added a little bar that shows how 'wrong' your pitch is.
I also corrected that guideline violation, and if I find another one, I'll simply give up ;)
If the played note is loud enough, the tuner is now quite usable.
I'd be very grateful if someone actually tried my plugin and tell me any bugs/suggestions...
Could anyone else with an player with a built-in mic try this for me?
The sample buffer should be 32-bit aligned BTW.
32-bit aligned would be a signed int, right?
Also, the reason for the plugin not working on sansas has to do with the fact that they cannot record with a 44,1K sample rate - I'm working on it.
print_int_xy(0,160,(int) *rb->rec_freq_sampr);
print_int_xy(0,200,(int) SAMPLE_RATE);
and both are reported as 22050 on the sansa e280 not 88200 as mentioned in the comments?
Does the actual tuning work too? Does the GUI display correctly on the sansa's screen?
@MikeS: The sample buffer is now 32-bit aligned -> int32_t.
The GUI does display correctly - and looks good
If recording works, it should display "callback" every second or so. If not, recording does not work at all.
tc is always 0 hence freq is 0
If I exit pitch and go to recording, the mic is not responsive. If I then exit
recording and restart it, the mic responds.
This suggests something in the code is preventing the mc responding to input
There are some plugins that use recording, but they just use some functions and do not explain what they do exactly...
I know for sure that I'm doing several things wrong, but nobody actually helps me fixing those problems.
So until I get some more information, I'm afraid I can't help you.
It seems to be working on most targets, and I don't own a sansa to "reverse engineer" the recording api.
Thanks for testing it nevertheless!
rb->sound_set(SOUND_MIC_GAIN, 39);
rb->audio_set_recording_gain(39,39, AUDIO_GAIN_MIC);
Im not convinced that the sound_set()ine is actually doing anything. I think the gain
is being set by the audio_set_recording_gain()
recording now works although the pitch indication jumps around a bit. maybe some averaging would help.
I can't work on the plugin right now because cygwin doesn't work anymore and I can't get the toolchain installed on AndLinux.
The 'jumpiness' of the tuner is due to the fact that the Schmitt trigger algorithm only works reliably when the tone is quite loud.
So if you play a loud note, it shouldn't jump so much. I intend to fix that when I can compile again.
Detection doesn't work too well using built-in microphone though;
[either that or my guitar is way off-tune ;-) ]
I have removed rb->sound_set(SOUND_MIC_GAIN, 39);
gain is set at 39 instead of 100 as per as3514.c
rb->audio_set_recording_gain(39,39, AUDIO_GAIN_MIC);
I added some code in display_freq() that prevents the red bar set at -32 cents when freq=0 (no signal)
if ((int)freq == 0) /* prevents red bar at -32 cents when freq= 0*/
{ ldf = 0;
}
enum plugin_status plugin_start(const struct plugin_api* api, const void* parameter)
I would be eager to hear if it works for everyone else. I've only tested it on my h120 but I imagine it'll work on other platforms either out of the box or with very little modification.
Some details: I used the Yin pitch detection algorithm, adapted from the implementation in the Aubio sound processing library. This improved accuracy, sensitivity and latency significantly over the Schmitt algorithm. Now I'm getting a little over two seconds of latency when no sound is playing (which could probably be eliminated by implementing a simple magnitude detection routine on the input stream) and between .7 and 1.2 seconds when playing a sound, with higher frequencies being worked out faster. I think speed could be improved by a factor of three to four by converting to fixed-point math. I made an aborted effort at that but I think my 16.16 data type had insufficient precision. I may try again with a different division of bits, or even go up to 32.32 for the real brute force approach.
The Aubio library also includes an FFT version of Yin, which would reduce the complexity from O(n^2) to O(nlogn), significantly reducing latency. I made an initial foray into that code but it's pretty convoluted, with lots of data structures that are more complex than we need for this application, as well as dependence on an external FFT library. So that's more than I really want to figure out, but it might be a good project for someone else who wants to see this plugin improve.
I also made the drawing routines a little more general-purpose, making them properly detect whether they were being compiled for a color, greyscale or B&W screen.
I agree that the plugin should have a different name, though I don't think pitch_detector is really the best option. This is really designed to be an instrument tuner first and foremost, mimicking the functionality of a digital tuner that you'd buy at a music store. I think I'd prefer it to be called "tuner" (which might be ambiguous to someone seeing it in the list of apps) or instrument_tuner (which might be inconveniently long).
Has anyone else had a chance to try it out on another platform? It would be nice to know if it'll actually compile and work on other devices, and whether my impression that it's a better algorithm holds out in other people's experience.
Overlaid text: I'll look into it. That's mostly stuff that was already there when I came to it, but it doesn't seem like it should be too hard to fix.
Second, I implemented a volume check so that it'll only do the grunt work of detecting the pitch if the input is louder than a given threshold. This eliminates the longer delay when no pitch is being played, and makes it respond more more quickly to a button press to exit the plugin. Overall It feels WAY more snappy now.
With the fixed point math there are occasional glitches in the pitch detection. I'm not sure what causes them, but I would guess it has to do with the reduced precision. I have some ideas of how to improve that, though -- I don't think the fixed point code is at its optimum setup for the values involved. I'm much more familiar with the algorithm now, so I have a better idea of what range of values its going to be dealing with.
Still left to do:
1. Convert the rest of the floating point to fixed point.
2. Optimize the cpu_boost and backlight so they only come on when it hears a sound above the volume threshold
3. Improve the recording code to work simultaneously with the main loop
4. Add a menu system where you can control mic volume, volume threshold, Yin threshold
5. Fix the text overlapping problem
6. Fix the octave-off problem
7. Improve the display
8. Eliminate glitches in the detection code
1. Done. There's not more floating point in the plugin.
2. Done. In combination with the previously-implemented volume threshold it should be much nicer to your battery life now.
3. Not done.
4. Not done.
5. Done. I pretty much fixed it by eliminating the latency readout. That was pretty much just for debugging anyway.
6. Done. The octave problem exists because the recording routines were providing a stereo stream but I was reading it as if it were mono. On a related note, it'll now detect down a further octave from where it did before.
7. Tweaked just a little, but could still use some changes. In particular I'd love there to be a large readout of what not is being played, rather than the current block-below-letter method. Anyone know anything about using bitmaps and how to include them in the build?
8. Done, I think. It seems quite usable at this point, at least on my h120. Again, I'm eager to hear of other people's experiences on other devices.
Unfortunately it still won't detect low enough to tune the lowest string on a guitar, so I'll probably up the sample size a bit. It may be too much to ask to have it be able to tune an electric bass, though. On the other hand, the sample size could be tweaked through a menu system (up to a maximum based on static buffer allocation) so that lower instruments could be accommodated.
One problem currently is that it isn't precisely correct when detecting a reference tone. That A440 on Wikipedia shows up as somewhere between 337 and 440, usually 4 to 8 cents flat. This may be a rounding or precision problem, or there may be something else at play that I'm not aware of. One possible solution is just to do some calibration and introduce a fudge factor. Part of the uncertainty on this, of course, is wither the same thing happens on other platforms. If anyone gets a chance, please post your results of detecting an A440 reference tone.
So here's what still needs to be done:
3. Improve the recording code to work simultaneously with the main loop.
4. Add a menu system where you can control recording volume, recording source, volume threshold, Yin threshold, and sample size (lowest detectable note).
7. Improve the display.
9. Lower the lowest detectable note.
10. Calibrate it to properly detect a reference tone.
I fixed this by commented out the line rb->backlight_off();
Is this a problem on any other platforms?
3. Still not done
4. Done mostly. Recording source still isn't selectable but everything else is.
7. No further progress
9. Done. The default setting will now detect a note as low as 43Hz. It's also adjustable so you can get lower latency if you're only tuning higher-pitched instruments.
10. Not done. I'm still hoping to hear if other people see the same frequency discrepancy when detecting reference tones like the one linked above.
I also did some other improvements:
- I've been trying to get things to be programmed in a more generalized way. I've been able to get rid of all the platform-specific #ifdefs without compromizing compatability. Through the use of he appropriate functions and defines I've been able to eliminate special cases (apart from color-vs-grey-vs-b&w) and improve the ability of the plugin to play nice with the hardware.
- I adapted a binary log function to replace the one being used before, resulting in better precision of various calculations.
- The plugin saves its settings between sessions.
- The backlight now correctly obeys the backlight timeout period.
Still remaining to be done:
3. Improve the recording code to work simultaneously with the main loop.
4. Add the ability to select a recording source via the menu.
7. Improve the display with a larger note readout.
10. Calibrate it to properly detect a reference tone.
11. Allow adjustment of some settings with arrow buttons (without having to go to the menu).
The indicated tuning was -17cents(872Hz)
so, it indicates flat, but also indicates an octave high
I checked back with the previous diff 26th August
on the sansa that produces -17cents(436Hz) ie correct octave
I think what happened is that #define SAMPLE_RATE HW_SAMPR_DEFAULT is being set as 44100
rather than 22050 as required. If I set explicitly to 22050 then the latest diff gives -17cents(436Hz)
freq=fp_mul(freq,float2fixed(1.00986796f));
sansa now reads spot on :)
the factor 1.00986796 represents +17 cents as follows:
1 semitone = 2^(1/12)
1 cent = 1semitone^(1/100)
17cents= 1cent^17
A table of appropriate values for +/- x cents could be added to the menu system
How wrong is it with an 880Hz tone, or let's say 300Hz?
Also needed would be an update to the manual, of course....
I can confirm that it works well on the sansa e200 with 0 cents offset at 440Hz
Another question: is the #if in the file SOURCES correct? Don't we have some players that are capable of recording but can't run this plugin because they do everything in a dedicated chip (i.e. non SW_CODEC targets)?
I can confirm that it works well on the sansa e200 with 0 cents offset at 440Hz
so maybe something like this would be better?
#if CONFIG_CODEC==SWCODEC && HAVE_RECORDING && (HAVE_LINE_IN || HAVE_MIC_IN)
(I'm fine with typedef'ing simple data-types, but not really with structs)
Otherwise, it looks really nice.
1. The default threshold volume is set too high IMO (not a problem at all)
2. On Sansa e200 and with 19-Nimbus, the cent scale is drawn so that it looks like an underscore line for the numbers
3. It would be nice if pressing LEFT while in the plugin's menu would return to the tuner (same as the first menu entry)
But you've done an awesome job!
Thomas Martitz: I can understand hesitance about the typedef. I did it that way because I wanted to protect myself from accidentally performing an integer multiply or divide on a fixed variable. That would, of course, produce incorrect results, so I wrapped it all in a struct so the compiler would keep track of it for me. I'm sure it prevented quite a few bugs in the development process. I'm a big believer in doing things like this (probably because I'm mainly an object oriented programmer). If there's another way to accomplish that protection I'd love to know about it and I'd be happy to use it.
Is it really against the coding guidelines? I guess I haven't read them in a while, but I'm not sure what's wrong with it. I did it that way to isolate the implementation details of the type -- so I didn't have to say "struct fixed" every time I declared a fixed-point variable -- so I could treat it just like a normal data type as much as possible. It caused the annoyance of having to use macros in place of operators (oh, if only I could use operator overloading!), but it also made it much easier to change the internal representation at a later stage if desired (for instance, switching to a 32.32 fixed point representation).
All that said, it would certainly be easy to switch it from a struct to a plain signed integer representation. I would just fear that in doing that we'd be opening the door to future uncaught instances of improper operator usage. Am I being too anal?
Alexander Levin:
Would we only commit for h120 and e200 because those are the only ones that have been tested? Surely there are people out there who could test it on other platforms if asked (say, a simple test routine of starting the plugin, comparing the output when listening to the a440 test tone and testing the menu a bit).
And your subsequent message:
1. I wasn't sure how to deal with the default volume level and threshold. I didn't have anything better to choose, so I just set it to maximum volume and an arbitrary number for threshold. I welcome other ideas for default numbers for those.
2. Perhaps the text should be moved up a few pixels. That wouldn't be problematic at all, and may become irrelevant if my large-letter readout idea ever becomes a reality.
3. Yes, I agree. I guess that should probably be addressed before committing (from an obviousness-of-controls standpoint).
to 1: It's not a problem at all. I just noted that the threshold was too high for my guitar, but that daoesn't have to mean anything
to 2: why not compute the Y dynamically based on the font size?
I'm also working on the plugin -- trying to fix the GUI, i.e. make it independent of the font size. We should cooperate (on IRC?) to minimize merge work.
What hours are good for you on IRC? I'm kind of limited as to when I can work on this stuff -- mostly the morning and late evening (California time).
What I'm working on right now is to add in a larger readout of the note that's being played (using bitmaps) and adding in a couple gui-related menu items (whether to display flats or sharps for accidentals and an option to display (or not) the Hz readout (it'll probably be turned off by default).
Another thing that I just recently thought of that this plugin really should be able to do is to change the pitch calibration. A first step would be to have a menu item to adjust where A is (instead of 440Hz), but it would also be cool to be able to adjust up or down some number of steps (good for tuning non-concert-pitch instruments like clarinets, trumpets, saxophones, etc).
But while I indeed created the plugin and did some basic proof-of-concept stuff, it was actually David Johnston who did most of the hard work: therefore I think his name should be added to theCREDITS too.
Note that this patch won't run without the image files I'm going to include in my next comment.
Oh, and one more thing. I haven't tested this patch on platforms other than the h120 but I've included bitmaps for all the screen sizes I think are possible. Some of them may need some tweaking in terms of size or something like that. Let me know how it works for you.
I also tweaked it so it'll once again display the UI in the simulator, though the pitch detection still won't work. It just displays a dummy frequency.
$ make
CC apps/plugins/pitch_detector.c
/cygdrive/c/sansa/rockbox-21932/apps/plugins/pitch_detector.c:219: warning: excess elements in struc
t initializer
/cygdrive/c/sansa/rockbox-21932/apps/plugins/pitch_detector.c:219: warning: (near initialization for
'note_bitmaps[0]')
/cygdrive/c/sansa/rockbox-21932/apps/plugins/pitch_detector.c: In function 'init_everything':
/cygdrive/c/sansa/rockbox-21932/apps/plugins/pitch_detector.c:1130: error: 'struct picture' has no m
ember named 'slide_height'
make: *** [/cygdrive/c/sansa/rockbox-21932/build/apps/plugins/pitch_detector.o] Error 1
Any thoughts?
I made small improvements:
1. The lower horizontal line was drawn too low so that it wasn't visible
2. The last vertical line wasn't drawn (but the zero line was drawn twice)
3. The zero label was placed a bit to the right. I now differentiate label and bar x-coord.
Attached is the patch that fixes this.
I can see how the bar would be kind of big on any screen that's taller than it is wide. Perhaps the bar height should be relative to the minimum of the height and width? Here's a patch for that. I tried it out in the e200 sim and it looks pretty good. What's missing, though, is transparency on those note letters. Or maybe the whole screen should just be blanked out white instead? Any opinions?
Ive attached the resulting file, which gives me the errors I noted before.
IMO we don't necessarily need transparent bitmaps for the notes. On Sansa, a dark letter on white background looks quite OK.
@Steve: the file you attached compiles just fine here.
Im up and running now
I was downloading the code from daily build http://www.rockbox.org/dl.cgi?bin=source
Although the changelog showed the pitch detector changes for eg r22685 the 7zip source didnt contain the required files...
so I was manually downloading and adding pitch_detector.c and the compile failed no doubt because the bitmap interface
of the toolchain wasnt up to date. Not sure why the daily build is having this problem....
I downloaded this time from svn and all is well
Im up and running now
I was downloading the code from daily build http://www.rockbox.org/dl.cgi?bin=source
Although the changelog showed the pitch detector changes for eg r22685 the 7zip source didnt contain the required files...
so I was manually downloading and adding pitch_detector.c and the compile failed no doubt because the bitmap interface
of the toolchain wasnt up to date. Not sure why the daily build is having this problem....
I downloaded this time from svn and all is well
This is the first diff ive submitted , hopefully ive done it correctly!
1. Variable names. The names "reference_table", "ref", "xx" and the struct member "reference" are not good IMO. They are too general and do not tell at all what they mean. Also, some comments about these tables would do no harm.
2. I'd rename the menu entry "Reference Frequency" to "Frequency of A" (or something like that) so that it's clear what the setting means.
3. Maybe this is because of the sim (I have not tried it on target), but when I set the reference frequency to 435, the displayed note is pure A with the displayed frequency 445 (it's fixed in the sim). When I set the ref. freq. to 445 then the displayed note is G# (with the raw frequency still being 445). So I think there might be some problems in the implementation.
Once these issues are resolved I'll be happy to commit the patch!
1. Variable names. Ive used more meaningful names, and added comments.
2. Renamed the menu entry "Reference Frequency" to "Frequency of A (Hz)"
3. On the Sansa E200 it appears to work correctly. I tested as follows using the NCH tone generator www.nch.com.au/tonegen
input tone 435 Hz
Frequency of A set to 435 Hz indicated tone A: 1 cents (435.2Hz)
Frequency of A set to 445 Hz indicated tone A:-38 cents(435.17Hz)
input tone 445 Hz
Frequency of A set to 435Hz indicated tone A:41 cents (445.28Hz)
Frequency of A set to 445Hz indicated tone A:2 cents(445.23Hz)
Would be good if this could be confirmed on another target
Another question: wouldn't it be more simple to multiply the given frequency by a factor to normalise it to 440, and then proceed as if the frequency of A were 440? The factor would be (440/freqA). You already have this table, but as (1/x). What do you think?
My implementation required 5 fp_add(lfreq[ ],lfreq_A) and 1 fp_mul(freqs[ ],freq_A[ ])
The suggested implementation requires 1 fp_add(lfreq,lfreq_A) and 3 fp_mul(freq,freq_A[ ])
since fp_mul is significantly slower than fp_add , I think this may run slower
anyway, I have attached a patch for this method, which gives similar results as pitch_detector10_diff when tested using NCH tone generator on sansa e200
I have not tried this patch as I don't have the 'infrastructure'. Could you please test it?
Of much more concern here is the precision of the calculations. In general divides should be avoided or else performed after any relevant multiplies. Adds and subtractions don't really affect precision. In using a baked-in multiplier just remember that it'll reduce precision, especially if the multiplier is between -1 and 1. So we want to come up with a method with the least possible math, especially multiplications by small values.
My idea (implemented in the patch 11a) is to leave 95% of the algorithm as is and only to scale the frequency in the beginning. So, for example, if the reference frequency is set to be 435 then we'd multiply the input frequency by (400/435). If the input frequency is 435 (easy to understand example) then the result of this (i.e. the scaled input frequency) will be 400. The rest of the algorithm (finding out the note and the error) works as if the input frequency were 400, i.e. your original algorithm is used. Only in the end do we display the original input frequency, i.e. 435.
And yes, indication of octave would be good. How would you propose to do this? I'd display it as a smaller bitmap (as an index) right to the displayed note. But it would be a separate patch.
I think we should close this task after we've committed the change with the variable frequency of A and do other changes (e.g. displaying the octave) in a separate task. This one has become quite long and untrackable.
Regarding the octave display, I would just display the octave right after the note, e.g "A2" or "D#4". That would require some new bitmaps though...
Isn't the tone a a defacto standard tone for a reason? I *maybe* see a point if its octave is configurable, but as it seems to always show a (without indication of the octave) it's pointless as well.
fml: your example of 400Hz is one of the examples which should really be avoided with fixed point math. The precision is really bad with so small (400/435) factors.
Clearly, we need to be sophisticated!
As to variable frequency of A: of course I doubt that this feature will be used much. Serious musicians for whom this feature might be of interest have another means of tuning their instruments. But if we can have it -- why not?
IMO, the frequencies and the respective logs for the scale should be put in a struct as well.
I actually can't get it to register on my guitar at all either. Actually, it'll occasionally get what seems to be the right note but I don't know if it's just a fluke. Mostly it seems to just display randomness. I would guess that the algorithm is getting overwhelmed by overtones. I'm sure a stringed instrument is going to produce more complex waveforms than a wind instrument. As far as other instruments go, though, I'm able to get accurate detection on recorder, flute, bagpipe, whistling, and singing. I'm surprised that the guitar would be so problematic. I've even tried varying the algorithm settings but to no avail.
How about populating the menu items for the frequency of A on the fly? It's a bit of an academic question, the current code is OK. I just wonder how this could be done.
The stability may be due to the 22KHz recording rate acting as a 10Khz LPF, so the LPF may well be useful on other platforms that record at 44KHz
I would suggest the LPF function be switchable
Re low pass filter: Maybe that indicates that it would generally work better with a lower sample rate? I admit I haven't experimented with that. Since it's only likely to be used for sampling things below 2 or 3kHz (I tried it with my garklein recorder, an instrument that's only ~15cm long, and its A is at 1760 Hz) it could maybe even be knocked down to 11Hz sample rate. A menu option is probably a good idea.
Of course, a halved sample rate will also give a huge boost to the speed of the algorithm and could reduce its memory footprint.
In a separate patch, I plan to use set_bool for boolean settings (instead set_option). I didn't want to mix these changes into this patch.