WaveTableOsc optimized

The wave table oscillator developed here in 2012 is pretty lightweight, but I never took a close look at optimization at the time. An efficient design is the number one optimization, and it already had that. I was curious how much minor code tweaks would help.

Free wrap

For instance, to avoid added complexity in the original article, I didn’t employ a common trick of extending the wave table an extra sample in order to avoid the need for a wrap-around test in the linear interpolation. A rewrite was a good opportunity to add it, and check on the improvement. The code change was trivial—add one to the allocation size for each table, and duplicate the first sample to the last when filling them, in AddWaveTable. Then remove the bounds check in the linear interpolation, saving a test and branch for every sample generated.

// old
float samp0 = waveTable->waveTable[intPart];
if (++intPart >= waveTable->waveTableLen)
    intPart = 0;
float samp1 = waveTable->waveTable[intPart];

// new
float samp0 = waveTable->waveTable[intPart];
float samp1 = waveTable->waveTable[intPart + 1];

Factor

And, I spotted another optimization opportunity. The oscillator needs to select the appropriate wavetable each time it calculates the next sample (GetOutput), but the choice of wavetable only changes when the frequency changes. So, the selection should be moved to SetFrequency. In general use, it should make no difference—an analog-like synth would usually have continuous modulation of the oscillator, for instance. But moving the calculation to SetFrequency would be a substantial win for steady tones.

I’ll go through the changes and their contributions, but a little explanation first. Unlike most of my DSP classes that provide a new sample on each Process call, the oscillator requires two calls—GetOutput and UpdatePhase—every time. It was just a design choice, I could have also provided a Process call that does both. In addition, SetFrequency might also be called every time, if the modulation is constant. For a steady tone, we’d call SetFrequency once, then the other two calls repeatedly for each sample. For a sweep, all three would be called for each sample. Of course I could trivially add a Process call that couples the existing UpdatePhase and GetOutput.

The wave table selection code iterates from lowest to highest, until it finds the right table, so it’s slower for higher frequencies. So, we’d expect an improvement in moving that code to SetFrequency for the steady tone case, and that the amount of improvement would be greater for high tone settings. We’d expect the case of a sweep to give no change in performance, since it still requires the wave table selection on each sample.

// old
inline void WaveTableOsc::setFrequency(double inc) {
  phaseInc = inc;
}

// new: the added code was simply repositioned from GetOutput
void SetFrequency(double inc) {
  mPhaseInc = inc;

  // update the current wave table selector
  int curWaveTable = 0;
  while ((mPhaseInc >= mWaveTables[curWaveTable].topFreq) && (curWaveTable < (mNumWaveTables - 1))) {
    ++curWaveTable;
  }
  mCurWaveTable = curWaveTable;
}

After moving the table selection from GetOutput, implementing the “one extra sample” optimization gave about a 20% to the steady-tone case (SetFrequency not included) on its own. Unexpectedly, changing the variable temp from double to float gave a similar individual improvement. For steady tones, these changes together yielded a 30% improvement to the loop (SetFrequency not included), and 60% for high tones.

At first test, the sweep case did not see the improvement, as the biggest change simply moved the calculation between the two routines called on each sample. I was disappointed, because I expected at least a small improvement from the other changes, but instruction pipelines can be fickle. The minor tweak of using a temporary variable to calculate mCurWaveTable changed that—yielding a a 24% improvement. This is in a loop that sweeps a sawtooth exponentially through the audible range (SetFrequency, GetOutput, UpdatePhase, calculate the next frequency with a multiply, loop overhead). Basically, all the same code is executed, but apparently it worked out better for the processor. So, a great improvement overall.

Another goal of the change was to put everything in a single header file—because I like that. GetOutput was formerly not inline, so there was a small improvement moving it. I also changed the case of the function names (setFrequency to SetFrequency) to align with the way I currently write code, so it’s not a drop-in replacement for the old version.

I tried one other improvement. The wave table selection “while” loop takes longer, of course, as frequency moves to higher tables. I tried to special-case octave tables by using clever manipulation of a floating point number, essentially getting a cheap log conversion. The advantage would be a single calculation for any frequency, instead of an iterative search. It ended up not being worth it for the case of octave tables, the iterative solution was more efficient that I expected, and more importantly it’s far more general.

Using picobench to benchmark, the new code is about 25% faster for the case of a 20-20kHz sawtooth sweep. Not a huge change, because the code was already very efficient. But as part of single-header-file rewrite without increasing complexity, that’s a pretty good bonus.

WaveTableOsc.h

This entry was posted in Source Code, Wavetable Oscillators. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *