I wrote the wavetable oscillator code from scratch as I wrote this series of articles, using just the thought process I gave. Hence, no references to other articles on wavetable oscillators. The concept of a phase accumulator is fairly obvious; one of the first books I recall reading about it was Musical Applications of Microprocessors, by Hal Chamberlin, about 30 years ago (I have an autographed first edition), though the book didn’t explore the concept of multiple tables per oscillator to control aliasing.
These end notes are an opportunity to touch on various aspects of the oscillator.
First, a few examples. Here’s a pulse-width modulation example, using a sine-wave instance of an oscillator to modulate the pulse width of another oscillator:
Three detuned and summed sawtooth oscillators:
But we aren’t limited to computing our wavetables. Here, I recorded myself singing “aahh…” (originally, 110 Hz, the second A below middle C). I cut one cycle of that in an audio editor and stretched (resampled) it to 2048 samples, and did an FFT of that. From there it was easy to create higher octaves by eliminating harmonics in the upper octave and doing an inverse FFT to make the next higher wavetable—the same process as we did with the computed waves. Here it is, swept 20 Hz to 20 kHz (but keep in mind that the typical useful range is in the lower octaves):
Because I wanted to show the harmonic series of sine waves building a sawtooth in the earlier articles, I didn’t mention that sawtooth waves traditionally ramp up (remember the charging capacitor in the analog oscillator?). Creating this wave from sines makes it ramp down—often called an inverted sawtooth. It really doesn’t make a difference in the audio range; in the sub-audio range for a control signal, you might prefer one or the other depending on the effect you’re after. But either way, it’s trivial to build the up-ramping sawtooth, and I did that in the final version just for fun. I left out the 20-40 Hz octave in the tutorial to make a point, but I’m including it here—a non-inverted sawtooth sweep, and you’ll notice that the lowest octave is brighter with a true 20 Hz table:
More on low-end brightness
Though we don’t often listen to the audio output of an oscillator in the sub-audio frequency range, the harmonic content is apparent for the brighter waveforms such as sawtooth. Here’s what it sounds like when we sweep from audio range down to 1 Hz with our wavetables built for the audio range:
Note how the harmonic content drops as the ticks slow.
Because aliasing becomes less of a factor as the frequency drops (the sample resolution for a single cycle increases the lower we go), we could switch to a more straightforward sawtooth for sub-audio. Or, because our oscillator allows a variable table size for each subtable, we could simply use a long wavetable to handle all sub-audio. Here’s the same oscillator, but with a 32768-sample sawtooth table to handle everything below 20 Hz:
My goal with the accompanying code was to make it simple and understandable. It’s pretty quick, though—the full audio sweep takes about 0.9 seconds to generate 1000 seconds of audio on my computer. And there are certainly ways to make it faster, though in general a little bit faster makes it a lot less readable, and often requires some loss of generality. For instance, we could make oscillator table selection faster by mandating octave spacing, then using a fast-log2 floating-point trick to get to the proper table in a single step—but that would put a limitation on the oscillator, the relevant code would become unreadable to many people, and in practice the speed gain is small.
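To make that tradeoff concrete, here is a minimal Python sketch of the two selection strategies. It is not the code that accompanies this series. The function names, the 44.1 kHz sample rate, and the 40 Hz top frequency for the lowest table are all assumptions for illustration, and Python's `math.frexp` stands in for the floating-point exponent trick.

```python
import math

SAMPLE_RATE = 44100.0  # assumed sample rate


def make_top_increments(lowest_top_hz, num_tables):
    """Top frequency of each table as a phase increment, one octave apart."""
    base = lowest_top_hz / SAMPLE_RATE
    return [base * (2 ** i) for i in range(num_tables)]


def select_linear(tops, inc):
    """General case: walk up until a table's top increment covers inc."""
    idx = 0
    while idx < len(tops) - 1 and tops[idx] < inc:
        idx += 1
    return idx


def select_octave(tops, inc):
    """Octave-spaced tables only: read the index from the binary exponent."""
    # frexp returns ratio as mantissa * 2**exponent, mantissa in [0.5, 1)
    mantissa, exponent = math.frexp(inc / tops[0])
    # ceil(log2(ratio)) equals exponent, except at exact powers of two
    idx = exponent - 1 if mantissa == 0.5 else exponent
    # Clamp so very low and very high increments use the end tables
    return min(max(idx, 0), len(tops) - 1)
```

The linear search works for any table spacing. The exponent version only works when the tables are exactly an octave apart, which is the loss of generality mentioned above.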
Some people might prefer a fixed-point phase accumulator. The math is fast and precise, and you can get a free wrap-around when you increment. But the double-precision floating point phase accumulator of this oscillator is extremely precise, is plenty fast, and the code is very easy to follow.
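For comparison, here is a rough sketch of what a fixed-point phase accumulator looks like. The 32-bit width and the function names are assumptions for illustration, not part of the oscillator presented in this series. Python integers don't overflow, so the wrap is written as an explicit mask; in C, an unsigned 32-bit add wraps on its own.

```python
PHASE_BITS = 32                      # assumed accumulator width
PHASE_MASK = (1 << PHASE_BITS) - 1


def fixed_increment(freq_hz, sample_rate):
    """Frequency as a 32-bit fraction of a cycle per sample."""
    return round(freq_hz / sample_rate * (1 << PHASE_BITS)) & PHASE_MASK


def advance_fixed(phase, inc):
    """Add and wrap. The mask emulates unsigned 32-bit overflow."""
    return (phase + inc) & PHASE_MASK


def table_index(phase, table_bits):
    """The top bits of the phase index a table of 2**table_bits samples."""
    return phase >> (PHASE_BITS - table_bits)
```

The bits below the table index are the fractional position between samples, which is what linear interpolation would use.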
If you search the web for wavetable oscillators, you’ll find terrific educational papers and forum threads discussing various types of interpolation and their relative performance. It may leave you wondering why you should trust the lowly and relatively pathetic linear interpolation presented here. Good point.
It’s about table size. If your table is big enough, you don’t need any interpolation at all—truncation is fine. But linear interpolation is only a bit more costly, and gives a boost that’s probably worth the effort. Higher forms of interpolation require more sample points in the calculation (for instance, needing two points on each side of the point to be calculated, instead of one point on each side for linear interpolation). They are great for squeezing better quality out of a small table, but you need to ask yourself why you are using the small tables. If you’re implementing this on a modern computer, what is the point of increasing your calculations so that you can use a 512-sample table instead of a 2048-sample table—on a system that has gigabytes of free memory?
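If you want to play with the table size versus interpolation tradeoff yourself, the sketch below compares truncation and linear interpolation on a single-cycle sine table. It is a simplified stand-in, not the oscillator code from this series; the function names are assumptions, and the phase is expected to be in the range 0 to 1 (not including 1).

```python
import math


def make_sine(n):
    """One cycle of a sine wave in n samples."""
    return [math.sin(2 * math.pi * i / n) for i in range(n)]


def lookup_truncate(table, phase):
    """Zero-order lookup: drop the fractional part of the position."""
    return table[int(phase * len(table))]


def lookup_linear(table, phase):
    """First-order lookup: blend the two samples around the position."""
    n = len(table)
    pos = phase * n
    i = int(pos)
    frac = pos - i
    a = table[i]
    b = table[(i + 1) % n]  # wrap at the end of the table
    return a + frac * (b - a)


def max_error(lookup, table, steps=10007):
    """Worst-case difference from the ideal sine over one cycle."""
    worst = 0.0
    for k in range(steps):
        phase = k / steps
        err = abs(lookup(table, phase) - math.sin(2 * math.pi * phase))
        worst = max(worst, err)
    return worst
```

Calling `max_error(lookup_truncate, make_sine(2048))` and `max_error(lookup_linear, make_sine(512))` gives a feel for how the two approaches trade off. A sine is the gentlest case; tables with many harmonics will show larger errors, so treat the results as relative.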
Constant versus variable table size
If we use a constant subtable size, then each subtable, as it moves up to cover the next octave with half of the harmonic content, becomes oversampled by another factor of 2. The top octave will always be a sine wave, and at the 2048 samples we’re using for our subtables, it’s extremely oversampled—way more than we need or will be able to discern.
And while I said that we want to be at least 2 x oversampled at our lowest subtable for linear interpolation to have good fidelity, it’s unlikely that we’ll hear the difference in the lowest octave. The reason is that low frequency wavetables for a sawtooth will be very crowded in the high harmonics—half of them are in the top octave of our hearing, spaced closely. So, we could go to 1024-sample tables for 40 Hz and up, or 2048 for 20 Hz and up (as in the “Saw sweep 2048 lin 20-20k” example above), and you probably won’t hear the difference. All of the higher tables will still be increasingly oversampled. Here’s the 1024 equivalent of the 2048 example in Part 3 of the wavetable oscillator series:
In a nutshell, we might suspect that it will be easier to hear the benefits of oversampling in the upper octaves, but at the same time the minimum table size to hold their required harmonic content is smaller.
We could just as easily go to a constant oversampling ratio by dropping the subtable length by a factor of two as we go to the next octave’s subtable. It’s easy enough to make all of these things variable in your code, so that you can play with tradeoffs.
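To put numbers on the two schemes, here is a small sketch. It uses a simplified model that I'm assuming for illustration: a table needs two samples per cycle of its highest harmonic, so the oversampling ratio is the table length divided by twice the harmonic count, and each octave up has half the harmonics of the one below. The figure of 512 harmonics in the lowest table is chosen only to give round numbers.

```python
def oversampling_ratio(table_len, num_harmonics):
    """Table length relative to the minimum of 2 samples per top harmonic."""
    return table_len / (2 * num_harmonics)


def fixed_size_ratios(table_len, lowest_harmonics):
    """Same length for every table: the ratio doubles with each octave."""
    ratios = []
    harmonics = lowest_harmonics
    while harmonics >= 1:
        ratios.append(oversampling_ratio(table_len, harmonics))
        harmonics //= 2  # next octave up holds half the harmonics
    return ratios


def constant_ratio_lengths(lowest_harmonics, ratio):
    """Same ratio for every table: the length halves with each octave."""
    lengths = []
    harmonics = lowest_harmonics
    while harmonics >= 1:
        lengths.append(2 * harmonics * ratio)
        harmonics //= 2
    return lengths
```

With these assumptions, `fixed_size_ratios(2048, 512)` runs from 2 x at the bottom to 1024 x for the sine table at the top, while `constant_ratio_lengths(512, 2)` shrinks from 2048 samples down to 4.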
Lesser interpolation experiments
Linear interpolation is referred to as “first order” interpolation. Truncation is “zero order” interpolation, implying none. So let’s hear how awful the oscillator would sound with no interpolation:
Um, that’s not as bad as expected—or should we not have expected it to be bad? Remember, we’re using fixed table sizes so far, which makes the higher, more sensitive tables progressively more oversampled. And oversampling helps improve any interpolation, even none.
Let’s explore variable table sizes, to keep a constant oversampling ratio per octave. We’ll start with the worst-case of 1 x oversampling, and with truncation:
As expected, the aliasing is bad, and gets worse in the higher octaves. Here it is again, with linear interpolation:
Even though 1 x oversampling is still inadequate, the improvement with linear interpolation is obvious. Let’s try again, but with 4 x oversampling; first with no interpolation (and let’s focus on the higher octaves):
Now with linear interpolation:
You can still hear a little aliasing at the top, but 4 x is adequate for much of the range. We might consider 8 x, but a problem with this approach is that we’re wasting a lot of table space for the lower octaves, and not achieving the objective of saving memory by using variable tables. Instead of keeping a constant oversampling ratio, we can go with a variable table size, but impose a minimum length, so that just the top octaves are progressively more oversampled, where we need it most.
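Here is a rough sketch of that idea: a constant oversampling ratio with a floor on the table length. The model is an assumption for illustration (two samples per cycle of the highest harmonic, harmonics halving with each octave, 512 harmonics in the lowest table), and the function name is hypothetical.

```python
def lengths_with_minimum(lowest_harmonics, ratio, min_len):
    """Return (length, effective oversampling ratio) for each octave's table."""
    tables = []
    harmonics = lowest_harmonics
    while harmonics >= 1:
        ideal = 2 * harmonics * ratio        # length for the requested ratio
        length = max(ideal, min_len)         # never shrink below the floor
        tables.append((length, length / (2 * harmonics)))
        harmonics //= 2                      # next octave up
    return tables
```

With `lengths_with_minimum(512, 1, 64)`, the lower tables stay at 1 x while the tables that hit the 64-sample floor pick up progressively more oversampling, ending at 32 x for the top sine table. Total memory is a fraction of what ten fixed 2048-sample tables would use.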
Here are constant 1 x oversampled tables, but with a minimum of 64 samples so that the top octaves are oversampled progressively higher; first truncated, then with linear interpolation:
A couple of things we’ve learned from these exercises: The upper octaves need progressively more oversampling to sound as good as the lower octaves—at least for a waveform like the sawtooth. And linear interpolation is a noticeable improvement over truncation.
But I hope you’ve also realized that truncation is a viable alternative, as long as the table size is big enough. And the “big enough” qualification is not difficult if you’re using fixed table sizes, which yield progressively higher oversampling in higher octaves.
While variable table size is helpful in memory-restricted environments, it’s not very important in most modern computing environments. I included the ability to handle variable table sizes within an oscillator for educational experimentation, mainly.
Note: Some people are thinking…if truncation might be acceptable, why not use rounding instead? The answer is…there is no difference in noise/distortion between truncation and rounding—use whichever is most convenient (which may depend on the computer language you use or rounding mode of the processor). If you doubt that truncation and rounding are equivalent for our use, consider that rounding is equivalent to adding 0.5 and truncating; this means that rounding in our lookup table is equivalent to truncation except for a phase shift of half a sample—something we can’t hear (because there’s no reference to make it matter, but if it really bothers you, you could always create the wavetables pre-shifted by half a sample!).
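If you'd like to see the equivalence rather than take it on faith, this small sketch renders the same table twice: once with rounding, and once with truncation but starting half a sample later. The `render` function is a hypothetical helper, the position is kept in units of table samples, and the increment is chosen as an exact binary fraction so that floating-point rounding doesn't muddy the comparison.

```python
def render(table, start_pos, inc, count, rounding):
    """Step through a table without interpolation, by rounding or truncating."""
    n = len(table)
    out = []
    pos = start_pos
    for _ in range(count):
        idx = int(pos + 0.5) if rounding else int(pos)
        out.append(table[idx % n])  # rounding can land on n, so wrap the index
        pos += inc
        if pos >= n:
            pos -= n
    return out


table = list(range(16))  # a ramp, so each output value identifies its index
rounded = render(table, 0.0, 3.25, 40, rounding=True)
truncated = render(table, 0.5, 3.25, 40, rounding=False)
```

The two output lists are identical, which is the half-sample phase shift described above.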
Setting oscillator frequency
We set the pitch by setting the increment that gets added to our phasor for every new sample lookup. An increment of 0.5 steps half-way through the table. For a single-cycle table, that would correspond to a frequency of half the sample rate—22.05 kHz, half of our 44.1 kHz sample rate. Another way to say that is that the increment for a frequency of 22.05 kHz is 22050 / 44100, or 0.5.
So, the increment for any frequency is that frequency divided by the sample rate. For 440 Hz, it’s 440 / 44100, or 0.00997732426303855.
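In code, that amounts to a one-line division plus a wrap when the phase passes 1. This is a minimal sketch with hypothetical function names, not the oscillator from the series:

```python
def phase_increment(freq_hz, sample_rate):
    """Fraction of a cycle to advance per sample."""
    return freq_hz / sample_rate


def advance(phase, inc):
    """Step the phase and wrap it back into the range 0 to 1."""
    phase += inc
    if phase >= 1.0:
        phase -= 1.0
    return phase
```

For example, `phase_increment(22050, 44100)` is 0.5, and `advance(0.75, 0.5)` wraps around to 0.25.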
On the subject of frequency, one simple and useful addition to the oscillator might be the ability to adjust the phase, for quadrature and phase modulation effects.
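One way such a phase adjustment might look is sketched below: an offset, in cycles, is added to the phase just before the table lookup, leaving the accumulator itself untouched. The names and the truncating lookup are assumptions for illustration. An offset of 0.25 on a sine table gives the quadrature (cosine) output, and varying the offset per sample would give phase modulation.

```python
import math

TABLE_LEN = 2048
SINE = [math.sin(2 * math.pi * i / TABLE_LEN) for i in range(TABLE_LEN)]


def read_with_offset(phase, offset):
    """Look up the table at phase + offset (both in cycles), with truncation."""
    pos = (phase + offset) % 1.0           # wrap into one cycle
    index = int(pos * TABLE_LEN) % TABLE_LEN  # guard against pos rounding up to 1.0
    return SINE[index]
```

Because the offset is applied only at lookup time, two outputs with different offsets can share one phase accumulator and stay locked together.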
OK, I hear better than I thought. 15k loud and clear, 16k is plenty clear too, but I could tell it wasn’t quite as loud. 17k is getting pretty tough, needing plenty of volume. But in this case I was listening to naked sine waves at pretty decent volume—I stand by my reasoning that it would be awfully tough to hear those aliased harmonics—even if we were playing ridiculously high notes with no synth filter, at high volume, and soloed. But, if you disagree, just go with closer wavetable spacing!