PyTuner is really quite simple. Here is the outline of the algorithm behind it:
- Record audio data with pymedia.
- Compute the FFT of the audio data – now we have the strength of each frequency in the sample.
- Find peaks in the result of the FFT. This means looking for the dominant frequencies.
- Find the lowest frequency that has a peak, ignoring the zero frequency.
- Find the nearest ‘desired note’ (the notes we should tune to) to that frequency.
- Mark the difference between the actual frequency and the desired note frequency.
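The last three steps of the outline can be sketched as follows. This is only an illustration, not PyTuner's actual code: the `DESIRED_NOTES` table (standard guitar tuning), the `min_freq` cutoff used to skip the zero frequency, and both function names are my own assumptions.

```python
# Hypothetical 'desired notes': standard guitar tuning, in Hz (an assumption).
DESIRED_NOTES = {
    "E2": 82.41, "A2": 110.00, "D3": 146.83,
    "G3": 196.00, "B3": 246.94, "E4": 329.63,
}

def lowest_peak(peak_freqs, min_freq=20.0):
    """Return the lowest peak frequency, skipping anything near the zero frequency.

    min_freq is an assumed cutoff; returns None if no peak is left.
    """
    candidates = [f for f in peak_freqs if f >= min_freq]
    return min(candidates) if candidates else None

def nearest_desired_note(freq, desired=DESIRED_NOTES):
    """Return (name, target_freq, difference_in_hz) for the closest desired note."""
    name = min(desired, key=lambda n: abs(desired[n] - freq))
    return name, desired[name], freq - desired[name]

peaks = [0.0, 108.3, 216.9, 325.1]  # made-up peak frequencies, in Hz
fundamental = lowest_peak(peaks)
name, target, diff = nearest_desired_note(fundamental)
print(name, target, round(diff, 2))  # A2 110.0 -1.7
```

A negative difference means the string is flat relative to the desired note, a positive one means it is sharp.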
Some of these steps deserve more detail. First, to find peaks, I used a simple algorithm: I computed the average strength of the frequencies in the sample, and kept every frequency whose strength was more than two standard deviations above that average. This yields an array where each frequency is marked with either 0 or 1. (Instead of 1, you can mark it with its original strength.) For each run of consecutive frequencies marked with 1, I computed the average index of the run, and that was the peak location. So, for example, if the data was:
0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0
The run of 1s covers indices 3 to 7 (counting from 0), so the peak would be at the frequency matching index 5.
(If you used the original strength instead of 0 and 1, you can apply a weighted average to get a more accurate result.)
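Here is a minimal sketch of that peak finder, using the weighted-average variant. It is not the original PyTuner code: the names `find_peaks` and `weighted_center` are made up for illustration, and the input is assumed to be a plain list of non-negative FFT strengths.

```python
import math

def weighted_center(run):
    """Strength-weighted average index of a run of (index, strength) pairs."""
    total = sum(s for _, s in run)
    return sum(i * s for i, s in run) / total

def find_peaks(strengths, num_std=2.0):
    """Return the (possibly fractional) indices of the peaks in strengths."""
    n = len(strengths)
    mean = sum(strengths) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in strengths) / n)
    threshold = mean + num_std * std

    peaks = []
    run = []  # (index, strength) pairs of the current above-threshold run
    for i, s in enumerate(strengths):
        if s > threshold:
            run.append((i, s))
        elif run:
            peaks.append(weighted_center(run))
            run = []
    if run:  # a run that reaches the end of the data
        peaks.append(weighted_center(run))
    return peaks

# A flat spectrum with one bump around index 41.
strengths = [1.0] * 100
strengths[40:43] = [20.0, 40.0, 20.0]
print(find_peaks(strengths))  # [41.0]
```

With a symmetric bump the weighted average lands on the middle index. With an asymmetric bump it shifts towards the stronger side, which is where the extra accuracy comes from.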
To find the actual note of a given frequency, I used the fact that adjacent notes (half-tones) are separated by a ratio of 2^(1/12), the twelfth root of 2. So, if our base note is A, at 110 Hz, to compute the note we would use the following code snippet:
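import math
note_names = "A A# B C C# D D# E F F# G G#".split()
# Number of half-tones between note_freq and base_freq, rounded to the nearest whole note
note_index = int(round(math.log(note_freq / base_freq, 2 ** (1.0 / 12))))
note_name = note_names[note_index % 12]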
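As a rough sketch, the same calculation can be wrapped in a function that also reports how far the frequency is from the nearest note. The function name `describe_note` and the choice to express the offset in fractions of a half-tone are my own assumptions, not part of PyTuner.

```python
import math

NOTE_NAMES = "A A# B C C# D D# E F F# G G#".split()

def describe_note(note_freq, base_freq=110.0):
    """Return (note_name, offset), with offset in half-tones, roughly -0.5 to 0.5.

    12 * log2(f / base) is the same as log(f / base) in base 2 ** (1 / 12).
    """
    half_tones = 12 * math.log(note_freq / base_freq, 2)
    nearest = int(round(half_tones))
    # Python's % always returns a non-negative result here, so notes
    # below the base frequency map to the right name as well.
    return NOTE_NAMES[nearest % 12], half_tones - nearest

print(describe_note(440.0))  # 'A', offset close to 0
print(describe_note(82.41))  # 'E', five half-tones below the base A
print(describe_note(113.0))  # 'A', with a positive offset (slightly sharp)
```

A positive offset means the frequency is above the nearest note, and a negative one means it is below.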
While writing PyTuner, at first I implemented the FFT on my own, just to know how to do it once. I found it helped me to better understand the nature of the algorithm, and its output. I find the practice of implementing various algorithms on your own, for educational purposes, quite healthy. I always like to know how things work ‘behind the scenes’.
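For anyone who wants to try the same exercise, here is a minimal recursive radix-2 (Cooley-Tukey) FFT in pure Python. This is not the implementation I wrote for PyTuner, only a sketch of the textbook algorithm, and it assumes the number of samples is a power of two.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 FFT. len(samples) must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("number of samples must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # FFT of the even-indexed samples
    odd = fft(samples[1::2])   # FFT of the odd-indexed samples
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result

# A pure tone that completes 3 cycles over 32 samples should peak at bin 3.
n = 32
tone = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
strengths = [abs(c) for c in fft(tone)[: n // 2]]
print(strengths.index(max(strengths)))  # 3
```

Bin k corresponds to the frequency k * sample_rate / n, so the bin index only turns into Hz once you know the sample rate of the recording.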