## Wavetable signal to noise ratio

In our wavetable series, we discussed what size our wavetables needed to be in order to give us an appropriate number of harmonics. But since we interpolated between adjacent table entries, the table size also dictates the signal to noise ratio of playback. A bigger (and therefore more oversampled) table will give lower interpolation error—less noise. We use “signal to noise ratio”—SNR for short—as our metric for audio.

SNR has a precise definition—it’s the RMS value of the signal divided by the RMS value of the noise, and we usually express the ratio in dB. We’ll confine this article to sine tables, because they are useful and the ear is relatively sensitive to the purity of a sine wave.

### Calculating SNR

To calculate SNR, we need to know what part of the sampled audio is signal and what part is noise. Since we’re generating the audio, it’s pretty easy to know each. For example, if we interpolate our audio from a sine table, the signal part is the best precision sine calculations we can make, and the noise is that minus the wavetable-interpolated version. RMS is root-mean-square: square all the samples, take the average, then take the square root of that value. We do that for both the signal and the noise, sample by sample, and divide the two—and convert to dB. The greatest error between the samples will be somewhere in the middle. Picking halfway for the sine is a good guess, but we can easily take more measurements and see.
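Written out as formulas, the calculation might look like the following. The symbols are chosen here for illustration: s[n] is the reference signal, e[n] is the error (interpolated value minus reference), and N is the number of measurements.

```latex
\mathrm{RMS}(x) = \sqrt{\frac{1}{N}\sum_{n=0}^{N-1} x[n]^{2}},
\qquad
\mathrm{SNR}_{\mathrm{dB}} = 20\log_{10}\frac{\mathrm{RMS}(s)}{\mathrm{RMS}(e)}
```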

It ends up that the table size can be relatively small for an excellent SNR with linear interpolation. This shouldn’t be surprising, since a sine wave is smooth and therefore the error of drawing a line between two points gets small quickly with table size. A 512 sample table is ample for direct use in audio. It yields a 97 dB SNR. While some might think that’s fine for 16-bit audio but not so impressive for 24-bit, a closer look reveals just how good that SNR is.

Keep in mind, this is a ratio of signal to noise. While the noise floor is -97 dB compared with the signal, that’s not the same as saying we have a noise floor of -97 dB. The noise floor is -97 dB when the signal is 0 dB (actually, this is RMS, so a full-code sine wave is -3 dB and the noise is -100 dB). But people don’t record and listen to sine waves at the loudest possible volume. When the signal is -30 dB, the noise floor is -127 dB. When the signal is disabled, the noise floor is non-existent.

However, if that’s still not good enough for you, every doubling of the table size yields a 12 dB improvement.
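The per-doubling figure can be reasoned out from the standard error bound for linear interpolation. This is a sketch, assuming a unit-amplitude sine stored as one cycle in N table entries, so the spacing between entries is h = 2π/N and the second derivative never exceeds 1 in magnitude:

```latex
|e|_{\max} \le \frac{h^{2}}{8}\,\max|f''| = \frac{\pi^{2}}{2N^{2}},
\qquad
\frac{|e|_{\max}(N)}{|e|_{\max}(2N)} = 4
\;\Rightarrow\;
20\log_{10} 4 \approx 12.04\ \mathrm{dB}
```

The error shrinks with the square of the table size, so each doubling cuts it to a quarter.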

### Code

Here’s a simple C++ example that calculates the SNR of a sine table. Set the tableSize variable to check different table sizes (typically a power of 2, but not enforced). The span variable is the number of measurements from one table entry to the next. You can copy and paste, and execute, this code in an online compiler (search for “execute c++ online” for many options).

```
#include <iostream>
#include <cmath>
#if !defined M_PI
const double M_PI = 3.14159265358979323846;
#endif
using namespace std;

int main(void) {
    const long tableSize = 512;
    const long span = 4;    // measurements from one table entry to the next
    const long len = tableSize * span;
    double sigPower = 0;
    double errPower = 0;
    // current and next table entries; these must persist across loop iterations
    double sin0 = 0, sin1 = 0;
    for (long idx = 0; idx < len; idx++) {
        long idxMod = idx % span;

        // signal: the best-precision sine we can calculate
        double sig = sin((double)idx / len * 2 * M_PI);

        // at each table entry, fetch the two points to interpolate between
        if (!idxMod) {
            sin0 = sig;
            sin1 = sin((double)(idx + span) / len * 2 * M_PI);
        }

        // noise: the linearly interpolated value minus the signal
        double err = (sin1 - sin0) * idxMod / span + sin0 - sig;

        sigPower += sig * sig;
        errPower += err * err;
    }
    // convert the accumulated sums of squares to RMS values
    sigPower = sqrt(sigPower / len);
    errPower = sqrt(errPower / len);

    cout << "Table size: " << tableSize << endl;
    cout << "Signal: " << 20 * log10(sigPower) << " dB RMS" << endl;
    cout << "Noise:  " << 20 * log10(errPower) << " dB RMS" << endl;
    cout << "SNR:    " << 20 * log10(sigPower / errPower) << " dB" << endl;
}
```

### Quantifying the benefit of interpolation

This is a good opportunity to explore what linear interpolation buys us. Just change the error calculation line to “double err = sin0 - sig;”, and set span to a larger number, like 32, to get more readings between samples. Without linear interpolation, the SNR of a 512-sample table is about 43 dB, down from 97 dB, and we gain only 6 dB per table doubling.

You can extend this comparison to other interpolation methods, but it’s clear that linear interpolation is sufficient for a sine table.
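To see both cases side by side, the measurement can be wrapped in a function and swept over table sizes. This is a minimal sketch; `sineTableSnrDb` and `printSnrSweep` are names made up for this illustration, and the `interpolate` flag switches between linear interpolation and simply using the nearest lower table entry.

```cpp
#include <cmath>
#include <iostream>

// Hypothetical helper: SNR in dB of a one-cycle sine table with tableSize
// entries, measured at span points per entry, with or without linear
// interpolation between adjacent entries.
double sineTableSnrDb(long tableSize, long span, bool interpolate) {
    const double twoPi = 2.0 * std::acos(-1.0);
    const long len = tableSize * span;
    double sigSum = 0.0, errSum = 0.0;
    for (long entry = 0; entry < tableSize; entry++) {
        const double s0 = std::sin(twoPi * entry / tableSize);        // this table entry
        const double s1 = std::sin(twoPi * (entry + 1) / tableSize);  // next table entry
        for (long k = 0; k < span; k++) {
            const double sig = std::sin(twoPi * (entry * span + k) / len);
            const double frac = static_cast<double>(k) / span;
            const double approx = interpolate ? s0 + (s1 - s0) * frac : s0;
            const double err = approx - sig;
            sigSum += sig * sig;
            errSum += err * err;
        }
    }
    // ratio of mean powers; 10*log10 of power equals 20*log10 of RMS
    return 10.0 * std::log10(sigSum / errSum);
}

// Print both figures for a few power-of-two table sizes.
void printSnrSweep() {
    for (long size = 256; size <= 4096; size *= 2) {
        std::cout << size << ": "
                  << sineTableSnrDb(size, 32, false) << " dB without interpolation, "
                  << sineTableSnrDb(size, 32, true) << " dB with linear interpolation\n";
    }
}
```

Calling `printSnrSweep()` from `main` lists the two SNR figures for each size, which makes the difference in slope per doubling easy to read off.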

### Extending to other waveforms

OK, how about other waveforms? A sawtooth wave is not as smooth as a sine, so as you might expect, it will take a larger table to yield a high SNR number. Looking at it another way, the sawtooth is made up of a sine fundamental plus higher harmonics. The next harmonic is at half the amplitude, which alone would contribute half the signal and half the noise, but it’s also double the frequency—the equivalent of half the sine table size and therefore 12 dB worse than if the fundamental is taken alone. It’s a little more complicated than just summing up the errors of the component sines, though, because positive and negative errors can cancel.

But the measurement technique is basically the same as with the sine example. The signal would be a high-resolution bandlimited sawtooth (not a naive sawtooth), and noise would be the difference between that and the interpolated values from your bandlimited sawtooth table. Left to you as an exercise, but you may be surprised at the poor numbers of a 2048 or 4096 sample table in the low octaves (where there is no oversampling). But again, the noise only occurs when you have signal, particularly when you have a bright waveform, and remains that far below it at any amplitude. It’s still hard to hear the noise through the signal!

Checking the SNR of a wavetable generated by our wavetable oscillator code is a straightforward extension of the sine table code. For a wavetable of size 2048 and a given number of harmonics, for instance, create a table of size 2048 times span. Then subtract each entry of the wavetable, our “signal”, from the corresponding interpolated value for the “noise”. For instance, if tableSize is 2048 and span is 8, create a table of 16384 samples. For each signal sample n, from 0 to 16383, compare it with the linearly interpolated value between span points (compare samples 0-7 with the corresponding linear interpolation of samples 0 and 8, etc., using modulo or counters).

It’s more code than I want to put up in this article, especially if I want to give a lot of options or waves and interpolations, but it’s easy. You might want to make a class that lets you specify a waveform, including number of harmonics and wavetable size, which creates the waveform. Create a function to do the linear interpolation (“lerp”) and possibly others (make it a class in that case); input the wavetable and span, output the computed signal and noise numbers. Then main simply makes the call to build the waveform, and another call to analyze it, and displays the results.
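As a starting point, here is a stripped-down sketch of that idea, without the class structure. The function names are invented for this example, and the sawtooth is built by plain additive synthesis (harmonic k at amplitude 1/k), which is an assumption and not necessarily how the wavetable oscillator code builds its tables.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: one cycle of a bandlimited sawtooth, len samples long,
// built by summing sine harmonics 1..numHarmonics at amplitude 1/k.
std::vector<double> makeBandlimitedSaw(long len, int numHarmonics) {
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> wave(len, 0.0);
    for (int k = 1; k <= numHarmonics; k++)
        for (long n = 0; n < len; n++)
            wave[n] += std::sin(twoPi * k * n / len) / k;
    return wave;
}

// SNR in dB of a tableSize-entry wavetable read with linear interpolation.
// hiRes must hold tableSize * span samples; every span-th sample stands in
// for a wavetable entry, and the samples in between are the reference signal.
double lerpSnrDb(const std::vector<double>& hiRes, long tableSize, long span) {
    const long len = tableSize * span;
    double sigSum = 0.0, errSum = 0.0;
    for (long n = 0; n < len; n++) {
        const long base = n - n % span;         // table entry at or below n
        const long next = (base + span) % len;  // following entry, wrapping at the end
        const double frac = static_cast<double>(n % span) / span;
        const double interp = hiRes[base] + (hiRes[next] - hiRes[base]) * frac;
        const double err = interp - hiRes[n];
        sigSum += hiRes[n] * hiRes[n];
        errSum += err * err;
    }
    return 10.0 * std::log10(sigSum / errSum);
}
```

For the 2048-entry, span 8 case described above, the call would be `lerpSnrDb(makeBandlimitedSaw(2048 * 8, numHarmonics), 2048, 8)`. With a single harmonic it reduces to the sine table measurement, which is a handy sanity check.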

## Sampling theory, the best explanation you’ve ever heard—End notes

A few words before moving on to other topics…

We’ve looked at why digital samples represent ideal impulses, and why any point between samples represents a value of zero. And, as a result, audio samples don’t represent the audio itself, but a modulated version of the audio.

Why is it helpful to understand these points?

### Critical sampling

First, it gives clearer and more intuitive answers for why digital audio behaves certain ways than do more typical explanations. For instance, it makes this puzzle trivial:

People ask why the sample rate needs to be double the highest signal frequency that we want to preserve. Often the reply is that it needs to be just above double the highest frequency of interest, to avoid aliasing. But why? And how much higher? At this point, someone mentions something about wagon wheels turning the wrong way in movies. Or shows a graph with two sine waves of different frequencies intersecting the same sample points. So unsatisfying.

If you consider that the signal is amplitude modulated in the digitization process, you need only see that the sidebands would start overlapping at exactly half the sample rate. To keep them from overlapping, all frequencies must be below half the sample rate, giving each cycle more than two samples.

### Multistage conversion

And integer sample rate conversion choices are easier to make. Especially for multistage conversion. We often use multistage conversion to improve efficiency. Like performing 8x upsampling as three 2x stages. If that sounds like three times the work, it isn’t, because the higher relative cutoffs of the filters make for fewer coefficients, balancing out with the total operations for 8x. But we can do more than break even by optimizing each stage—the earlier stages can be a bit sloppy as long as everything is tidy by the last stage’s output. Somewhat like doing a big cleanup on a house in multiple passes versus one.

Perhaps this is a good place to note that you might see chatroom posts where someone says that instead of inserting zeros and filtering, they prefer to use a polyphase filter. There is no difference—a polyphase filter in this case is simply an algorithm that upsamples and filters. Any seasoned programmer will notice that there is no need to explicitly place zeros between samples, then run all samples through an FIR, because the zero samples result in a zero product; optimizing the code to skip the zero operations results in a polyphase filter.

### Optimization example

An understanding of why we need to filter rate conversions can help us optimize DSP processes. For example, someone posted a question on a DSP board recently. They were using EQ filters designed by the bilinear transform, which have a pinching effect near half the sample rate (due to the frequency warping of the transform, which squeezes the rest of the analog frequency axis in below the Nyquist frequency). They didn’t need additional frequency headroom per se—the filters are linear—but they wanted to oversample by 2x to avoid the shape distortion of peaking EQ filters. (Note there are methods to reduce or avoid such distortion of the filter shape, but this is a good example.)

Let’s say we’re using a 48 kHz sample rate. Typically we’d raise the sample rate to 96k by inserting zeros every other sample and then lowpass filtering below 24k. Then we’d do our processing (EQ filtering). Finally, we’d take it back to 48k by lowpass filtering below 24k and discarding every other sample. But in this case, our processing step is linear (linear EQ filters), so it doesn’t create new frequencies. That means we can skip one of the lowpass filtering stages. It doesn’t matter whether we lowpass filter before or after the EQ processing, but we don’t need both. That’s a substantial savings.

### Another example

Let’s say we create an audio effect such as an amp simulator, which has a non-linear process that requires running at a higher sample rate to reduce audible aliasing. We run our initial linear processes, such as tone controls, then upsample and run our non-linear processes (an overdriven tube simulation!). But in this case we conclude with a speaker cabinet simulator, which is a linear process (via convolution or other type of filtering). Guitar and bass cabinets use large speakers (typically 8” and up, often 10” or 12” for guitar), with frequency responses that drop steeply above around 5 kHz. Understanding how the downsampling process works, we might choose to eliminate the downsampling filter stage altogether, as superfluous, or at least save cycles with a simplified filter with relaxed requirements.


## Sampling theory, the best explanation you’ve ever heard—Part 3

We look at what Pulse Amplitude Modulation added to our analog source audio.

Earlier, we noted that the PAM signal represents the source signal plus some additional high frequency content that we need to remove with a lowpass filter before we listen back.

Again, PAM is amplitude modulation of the source signal with a pulse train. Mathematically, we know precisely what amplitude modulation produces—the sums and differences of every frequency component between the two input signals. That is, if you multiply a 100 Hz sine wave by a 6 Hz sine wave, the result is the sum of 106 Hz and 94 Hz sine waves. For signals with more frequency components, there are more sums and differences in the result.
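For the two-sine example, this is just the product-to-sum identity from trigonometry. Written with t in seconds, the components come out as cosines at half amplitude, which doesn't change the frequencies involved:

```latex
\sin(2\pi \cdot 100\,t)\,\sin(2\pi \cdot 6\,t)
  = \tfrac{1}{2}\left[\cos(2\pi \cdot 94\,t) - \cos(2\pi \cdot 106\,t)\right]
```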

To answer our question, “What got added?”, we need to understand the frequency content of a pulse train. One way to know that would be to use a Fourier Transform on the pulse train. But I want to use intuitive reasoning to eliminate as much math as possible. Fortunately, I already know what the extra frequency content is—it’s the spectral images in sampled systems, as described in classic DSP textbooks. That coupled with knowledge of amplitude modulation tips me off that we’ll need a frequency component at 0 Hz (DC—we need that to keep our original source band), at the sample rate, and at every integer multiple of the sample rate. Through infinity.

OK, we’ll lighten up on the infinity requirement. We can’t produce a perfect impulse in the analog world anyway. And we don’t need to. However, once in the digital domain, samples represent perfect impulses. While their values may have deviated slightly from a perfect representation of the analog signal, due to sampling time jitter and quantization, any math we do to them is “perfect” (again, subject to quantization and any other approximations). In the digital realm, the images do go to infinity.

Indeed, as you add cosine waves of 0, 1, 2, 3, 4…times the sample rate, the result gets closer and closer to the shape of an impulse. (Cosine instead of sine so that the peaks of the different frequencies line up.)
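A few lines of code are enough to try this. In the sketch below, `cosineSum` is a name invented for this example, and time `t` is measured in sample periods, so `t = 0` and `t = 1` are sample instants and `t = 0.5` is midway between them.

```cpp
#include <cmath>

// Sum of unit-amplitude cosines at 0, 1, 2, ... maxMultiple times the sample
// rate, evaluated at time t (in sample periods).
double cosineSum(int maxMultiple, double t) {
    const double twoPi = 2.0 * std::acos(-1.0);
    double sum = 0.0;
    for (int k = 0; k <= maxMultiple; k++)
        sum += std::cos(twoPi * k * t);
    return sum;
}
```

At a sample instant every cosine is at its peak, so `cosineSum(8, 0.0)` is 9 and the value keeps growing as terms are added. Midway between sample instants the terms alternate in sign and cancel, so the sum never exceeds 1 in magnitude. The spike gets taller and narrower while the space between stays small.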

And that means we’ll have a copy of the source signal mirrored around 0 Hz, around the sample rate, twice the sample rate, three times the sample rate…to infinity. (In both directions, but we can ignore negative frequencies—for real signals, the negative spectrum mirrors the positive.)

### What we’ve learned

1. Individual digital samples are impulses. Not bandlimited impulses, ideal ones.

Bothered that ideal impulses are impossible? Only in the physical world. There, we accept limitations. For instance, gather together infinity of something. Anything—I’ll wait. Meanwhile, in the mathematical world, infinity fits easily on this page: ∞

2. We know what lies between samples—virtual zero samples.

Think there’s really a continuous wave, implied, between samples? If so, you probably think it’s because samples represent a bandlimited impulse. No—you’re getting confused with what will come out of the DAC’s lowpass filter later, when we play back audio.

3. Audio samples don’t represent the source audio. They represent a modulated version of the audio. We modulated the audio to ensure points #1 and #2.

This is a frequency-domain observation that follows from the first two points, which are time domain. If you understand this point, you’ll never be confused about sample rate conversion.


## Sampling theory, the best explanation you’ve ever heard—Part 2

### Discrete time

For many, discrete time and digital sampling are synonymous, because most people have little experience with discrete time analog. But perhaps you’ve used an old-style analog delay stompbox, with “bucket brigade” delay chips. Discrete time goes back a lot farther, though. When we talk of the sampling theorem, attributed to people like Nyquist, Shannon, and others, it applies to discrete time signals, not digital signals in particular.

The origins of discrete time theory are in communications. A single wire can support multiple simultaneous telegraph messages, if you synchronize a commutator between sender and receiver and slice time into sections to interleave the messages—this is called Time Division Multiplexing, or TDM. Following later with voice, using TDM to fit multiple voice calls on a line, it was found that the sampling rate had to be around 3500-4300 Hz for satisfactory results.

Traveling over a wire, analog signals can’t be “discrete” per se–there is always something being sent, no gaps in time. But the signal information is discrete, sending zero in between, and that leaves room to interleave other signals in TDM.

The most common method of making an analog signal discrete in this way is through Pulse Amplitude Modulation, or PAM. This means we multiply the source signal continuously with a pulse train of unit amplitude.

While the benefit of PAM for analog communications is that we can interleave multiple signals, for digital, the benefit is that we don’t need to store the “blank” (zero) space between samples. For digital sampling, we simply measure the height of each impulse of the PAM result, and encode it as a number. Pulse Amplitude Modulation and encoding—we call the combined process Pulse Code Modulation. Now you know what PCM means.

### Impulses, really

Some might look at that last diagram and think, “But I’ve seen this process depicted as a staircase wave before, not spiky impulses.” In fact, measuring voltage quickly and with precision, which we must do for the encoding step, is not easy. Fortunately, we intend to discard the PAM waveform anyway, and keep just the digital values. We don’t need to maintain the empty spaces between impulses, since our objective is not time division multiplexing analog signals. So, we perform a “sample and hold” process on the source signal, which charges a capacitor at the moment of sampling and stretches the voltage value out, allowing a more leisurely measurement.

This results only in a small shift in time, functionally identical to instantaneous sampling—digital samples represent impulses, not a staircase. If you have a sample value of 0.73, think of it as an impulse of height 0.73 units.

The step of digitizing the analog PAM signal introduces quantization, and therefore quantization error. But it’s important to understand that issues related to aliasing are not a property of the digital domain—aliasing is a property of discrete time systems, so is inherent in the analog PAM signal as well. That’s why we took this detour—I believe I can explain aliasing to you in a simpler way, from the analog perspective.

Next: We’ll look at exactly what frequency content is added by the PAM (and therefore PCM) process, in Part 3

## Sampling theory, the best explanation you’ve ever heard—Part 1

I’ll start by giving away secrets first:

1. Individual digital samples are impulses. Not bandlimited impulses, ideal ones.
2. We know what lies between samples—virtual zero samples.
3. Audio samples don’t represent the source audio. They represent a modulated version of the audio. We modulated the audio to ensure points #1 and #2.

Well, not secrets, but many smart people—people who’ve done DSP programming for years—don’t know these points. They have other beliefs that have served them well, but have left gaps.

### Let’s see why

Analog audio, to digital for processing and storage, and back to analog

Component details—first the analog-to-digital converter (ADC)

The digital-to-analog converter (DAC)

Analog to digital conversion, and conversion back to analog are symmetrical processes—not surprising.

But we can make another important observation: We know that the bandlimiting lowpass filter of the ADC is there as a precaution, to ensure that the source signal is limited to frequencies below half the sample rate. But we have an identical filter at the output of the DAC—why do we need that, after eliminating the higher frequencies at the beginning of the ADC? The answer is that conversion to discrete time adds high frequency components not in the original signal.

Stop and think about this—it’s key to understanding digital audio. It means that the digital audio samples do not represent the spectrum of the bandlimited analog signal—the samples represent the spectrum of the bandlimited analog signal and additional higher frequencies.

To understand the nature of the higher frequencies added in the sampling process, it helps to look at the origins of sampling.

Next: We explore the origins of sampling in Part 2


## Sampling theory, the best explanation you’ve ever heard—Prologue

I’ve been working on a new video, with the goal of giving the best explanation of digital sampling you’ve ever heard. The catch is I started on it three years ago. I’m not that slow, it’s just that I’ve been busy with projects, so time passes between working on it. And each time I get back to it, I rewrite it from scratch. You see, I want to give you a solid, useful theoretical framework that will both help you understand and help you make good decisions, with the goal of being intuitive. So, each time I have a little different viewpoint to try.

The video is still a few months off—I’m still busy for the next few months—but I’m going to present the idea first as a collection of short articles. And in the end, if I like it, the articles will serve as the script for the video.

I believe this is a unique explanation of sampling. Please follow it from the start, even if you feel you know the subject well. This isn’t something I read or was taught, but came from thinking about a way to explain it without descending into either mathematical symbolism or hand waving.

Next up: My first crack at the best explanation of sampling theory you’ve ever heard.

## Amp simulation oversampling

In tandem with our last article on Guitar amp simulation, this article gives a step by step view of the sampling and rate conversion processes, with a look at the frequency spectrum.

### From guitar to digital

The first two charts embody the initial sampling of the analog signal. It’s done in one step, from your analog-to-digital converter, but it’s a two-part process. First, a lowpass filter clears everything from half the sample rate up—something we must do to avoid higher frequencies aliasing into our audio band when sampled.

Then the signal is digitized. This creates repeating images of the positive and negative frequencies that extend upward without end. After this, we’ll look only at frequencies between 0 Hz and the sampling frequency, but it’s important to understand that these images are there, nonetheless.

If we don’t need more frequency headroom, we can do our signal processing on the samples we have. In fact, we want to do as much as we can at the lower sample rate. In the case of a guitar amp, we would process any tone controls that come before the “tube”, and other things like DC blocking and noise gating. And we can do our (non-saturating) gain stage here (assuming floating point).

### Higher rate processing

After initial processing and gain, it’s time for saturation. For this non-linear process, we need frequency headroom. The first step is to increase the rate by inserting zero-magnitude samples. Though I suggested starting with 8x upsampling in the guitar amp article, this exercise shows 4x, in order to better accommodate page constraints. We place three zero-samples between each existing sample to bring it up 4x. This only raises the sample rate—we have four samples in the same period that we used to have one. Since the spectrum is not altered, the aliases are still there.

Part two of the sample rate conversion process is to use a lowpass filter to clear everything above our original audio band. (In reality, we optimize the zero insertion and filtering into a single process, to take advantage of multiplies by zero that we can skip. Note we usually use a linear-phase FIR for this step, in order to preserve the wave shape for the saturator.) Now we see our headroom.

After our saturation stage, we’ve created new harmonics. As long as they are at a sufficiently low level by the time they reach the sample rate minus our final audio bandwidth, the aliased version won’t pollute our audio band.

### Back to normal

Done with our more expensive high-rate signal processing, we can drop back to our original sample rate for the rest. The first step is to run our lowpass filter again, to clear everything above our audio band.

Part two is the downsampling process—we simply keep one sample, discard three, and repeat. Why did we bother calculating them? Because we needed them ahead of the lowpass filter. But, here also, we can optimize these two down-conversion steps into one, and save needless calculation.

### Final processing

From here, we handle other linear processes—any filtering and tone controls that follow the tube stage, effects such as spring reverb, and finally the speaker cabinet simulation. In the end, we send it to our digital-to-analog converter, which itself has a lowpass filter to remove the aliased copies.

And enjoy listening.

## Guitar amp simulation

In this article, I’ll sketch a basic guitar amp simulator. For one, questions on the topic come up often, and also, it will be a good example of a typical use of working at a higher sample rate.

The most basic guitar amp simulator has gain, with saturation, tone controls, and a speaker cabinet simulator. Because saturation is a non-linear process, the results are different whether the tone controls come before or after—more on this later. The speaker cabinet, of course, comes last, and is an important part of the tone. Gain with saturation (the overdriven “tube”) is the tricky part—we’ll start there.

### Gain with saturation

Gain is simply a multiply. But we like to overdrive guitar amps. That means gain and some form of clipping or softer limiting—that’s what generates the overdrive distortion harmonics we want. Typically, we’d ease into the clipping, more like tube saturation behaves, but the more overdrive gain you use, the closer it gets to hard clipping, as more of the signal is at the hard limit.

That’s where we hit our first DSP problem. Clipping creates harmonics, and there’s no way to say, “Please, only generate harmonics below half the sample rate.” We will have aliasing. The added harmonics fall off in intensity (in much the same way as harmonics of a rectangular wave do), as they extend higher in frequency, but at typical digital audio sample rates, the Nyquist Frequency comes too soon. Aliased images extend back down into the audio range. We can’t filter them out, because they mix with harmonics we want to keep. And because the periods of the guitar notes played aren’t likely to be an integer multiple of the sample period, the aliased harmonics are out of tune. Worse, if you bend a guitar note up, the aliased harmonics bend down—that’s where aliasing becomes painfully apparent.
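The downward bend is easy to see with a little arithmetic. Here is a small sketch; `aliasedFrequency` is a name invented for this example, and it assumes a non-negative input frequency.

```cpp
#include <cmath>

// Frequency (Hz) at which a component at freq appears after sampling at
// sampleRate: fold it back into the range 0..sampleRate/2.
// Assumes freq >= 0.
double aliasedFrequency(double freq, double sampleRate) {
    const double wrapped = std::fmod(freq, sampleRate);  // 0 <= wrapped < sampleRate
    return wrapped > sampleRate / 2 ? sampleRate - wrapped : wrapped;
}
```

At a 44.1 kHz sample rate, a harmonic at 25 kHz shows up at 19.1 kHz. Bend the note so that harmonic rises to 26 kHz and the alias drops to 18.1 kHz.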

Can we calculate clipping overtones and create just the ones we want? Can we analyze the input and output and remove frequencies that we don’t want? That is not an easy task (left as an exercise for the reader!).

### The oversampled solution

To mitigate the aliasing issue, the most practical solution is to give ourselves more frequency headroom before generating distortion harmonics with our saturation stage. If the sample rate is high enough, aliased images are spread far enough apart that the tones extending into our audio range from above are diminished to the point they are obscured by the din of our magnificent, thick, overdriven guitar sound. And when we play delicately or with little overdrive gain, so is the aliasing lessened.

How much headroom do we need? I’d like to leave that to your situation and experimentation—mostly. But since I started doing this in the days when every DSP cycle was a precious commodity, I can tell you that—for 44.1 kHz sample rate, typically the worst case that we care about—the minimum acceptable oversampling factor is 6x. 8x is a good place to start, and will be adequate for many uses, at a reasonable cost. (Do multistage oversampling for efficiency…but that’s another story.)

Raising the sample rate “8x” sounds like we’ll have eight times the bandwidth, but it’s better than that. At our original 44.1 kHz sample rate, we have a usable bandwidth of about 20 kHz (allowing for the conversion filters), and little additional frequency headroom. If we go past 24.1 kHz (44.1 kHz - 20 kHz), aliasing extends below 20 kHz. But by raising the rate to 8 times the original, we can go up to 352.8 kHz - 20 kHz, or 332.8 kHz, before aliasing extends below 20 kHz. That’s 312.8 kHz of headroom above the audio band instead of 4.1 kHz, more than 75 times our original headroom.

The idea is that we upsample the signal (removing anything not in our original audio band), run it through our tube simulator (clipper/saturator), then drop the sample rate back to the original rate (part of this process is again removing frequencies higher than our original audio band).

### How much gain?

Real guitar amps have a lot of gain. Don’t think that if your overdrive is adjustable from 0-2 gain factor, you’ll get some good overdrive distortion. That’s only a maximum of 6 dB. More like 60 dB (0-1024)! Maybe up to something like 90 dB for modern, screaming high-gain amps. That’s equivalent to a shift of 15 bits, so I hope you’re using something better than a 16-bit converter (or tracks) to input your raw guitar sound.
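The conversions behind those numbers are short enough to keep around as helpers. The function names below are invented for this example.

```cpp
#include <cmath>

// Convert a gain in dB to a linear multiplier, and back.
double dbToGain(double db) { return std::pow(10.0, db / 20.0); }
double gainToDb(double gain) { return 20.0 * std::log10(gain); }

// Each bit of left shift doubles the amplitude, which is about 6.02 dB.
double dbToBits(double db) { return db / gainToDb(2.0); }
```

A gain factor of 2 is about 6.02 dB, a factor of 1024 is about 60.2 dB, and 90 dB works out to just under 15 bits of shift.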

### The tube

My intent here is not to guide you down the road of yet another guitar amp simulator plugin, but to give an example of a task that needed more frequency headroom, requiring processing at a higher sample rate. But it’s worth going into just a bit of detail on the tube (saturation) element. Again, we’re talking about “first approximations”—a hard clipper, or a soft one.

A hard clipper is trivial. If a sample is greater than 1, change it to 1. If less than -1, change it to -1.

```
if (samp > 1.0)
    samp = 1.0;
else if (samp < -1.0)
    samp = -1.0;
```

Here’s the transfer function—for input sample values along the x axis, the output is on the y axis. For input { 0.5, 0.8, 1.0, 1.3, 4.2 }, the output is { 0.5, 0.8, 1.0, 1.0, 1.0 }.

But for this use we’d probably like a softer clip—a transfer function that eases in to the limit. samp = samp > 1 ? 1 : (samp < -1 ? -1 : samp * (2 - fabs(samp))); This is just a compact way of saying,

```
if (samp > 1.0)
    samp = 1.0;
else if (samp < -1.0)
    samp = -1.0;
else
    samp = samp * (2 - fabs(samp));
```

This is a very mild, symmetrical x-squared curve. It’s just an example, but you’ll find it produces useful results. You could try a curve that stays straighter, then curves quicker near +/-1. Or a curve that’s not symmetrical for positive and negative excursions. The harder the curve, the closer we get to our original hard-clipping.
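Putting the pieces together, a gain stage followed by the clipper might be sketched like this. The function names and the idea of specifying the drive in dB are choices made for this illustration only.

```cpp
#include <cmath>

// Hard clipper: limit the sample to the range -1..1.
double hardClip(double samp) {
    if (samp > 1.0) return 1.0;
    if (samp < -1.0) return -1.0;
    return samp;
}

// Soft clipper: the x-squared curve inside -1..1, hard limit outside.
double softClip(double samp) {
    if (samp > 1.0) return 1.0;
    if (samp < -1.0) return -1.0;
    return samp * (2.0 - std::fabs(samp));
}

// Hypothetical overdrive stage: apply gain (given in dB), then soft clip.
double overdrive(double samp, double gainDb) {
    return softClip(samp * std::pow(10.0, gainDb / 20.0));
}
```

With no gain, an input of 0.5 comes out of the soft clipper as 0.75. With 60 dB of gain, even an input of 0.01 is driven to 10 and lands on the hard limit of 1.0, which is why heavy overdrive ends up close to hard clipping regardless of the curve.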

One detail worth noting: If you’ve spent much time looking at discussions of this sort of non-linear transfer function on the web, usenet, or mailing lists, invariably someone points out that we know exactly what order of harmonics will be created, and therefore how much oversampling we might need, based on the polynomial degree. This is wrong—in fact we’re only using the polynomial between the bounds of x from -1 to 1, and substituting a hard clip for all other points. For a use such as this, the input is driven far into the hard clip region, so the polynomial degree is no longer relevant.

### Tone controls

We know IIR filters well, so this part is easy. Typically, a guitar amp might have bass, mid, and treble controls. One catch is that if you want to sound like a particular vintage amp, they used passive filters that result in control interaction. That is, not active circuitry that isolates the components from each other. So, adjusting the bass band might affect the mid filtering as well. But that’s fairly easy to adjust for.

### More about overdrive and tone

Something worth noting. If you really want to scream with gain, you need to make some architectural adjustments. If you’ve played much with raw distortion pedals with different instruments, you’ve probably noticed that you get a “flatulent” sound with heavy clipping of bass. And high-gain distortion of signals with a lot of highs can sound pretty shrill. Guitar solos really scream when they have a lot of midrange distortion. So, if you really want to go for a screaming lead with loads of distortion without it falling apart into useless grunge, one thing you can do is roll off the lows and highs before the tube sim (distortion) stage. But then the result lacks body—compensate by boosting the lows and highs back up, after the distortion stage. The point here is that you have three choices of where to put EQ for your amp tone: before the tube, after, or both.

Some of your tone choices can be a property of the amp model—not everything needs to be a knob for the user to control. These choices are what give a particular guitar amp its characteristics. A Fender Twin does not sound like a Marshall Plexi. The reason that well-known guitar amps are the basis of DSP-based simulation is that the choices of their designers have withstood the test of time. Countless competitors faded from memory, often because their choices were not as compelling.

### Cabinet

Similarly, guitar speakers gained familiar configurations not because the industry got together and chose, but because these are the ones that worked out well.

One characteristic of speakers for guitar amps is that they are not full range—you’ll find no tweeters in guitar cabinets. The highs of the large speakers used (most often 10″ and 12″) drop off very quickly. A clean guitar tone doesn’t have strong high frequency harmonics, and the fundamental of the highest note on a guitar is typically only a little above 1 kHz. Overdrive distortion creates powerful high frequency harmonics, but we really don’t want to hear them up very high, where they are extremely harsh and fatiguing to listen to.

The first approximation of a speaker cabinet is simply a lowpass filter, set to maybe 5 kHz. I didn’t say it would be a great cabinet, but it would start to sound like a real guitar amp combo. The next step might be to approximate the response of a real speaker cabinet, miked, with multiple filters.
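A minimal sketch of that first approximation, assuming a simple one-pole lowpass and a 44.1 kHz sample rate (both assumptions; the function name `cabinet_lowpass` is hypothetical). A one-pole filter only rolls off at 6 dB per octave, so a steeper filter would likely be closer to a real speaker.

```python
import math


def cabinet_lowpass(samples, cutoff_hz=5000.0, sample_rate=44100.0):
    """One-pole lowpass as a very rough stand-in for a speaker cabinet."""
    # Pole position for the requested cutoff; the gain term keeps DC at unity.
    pole = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    gain = 1.0 - pole
    y = 0.0
    out = []
    for x in samples:
        y = gain * x + pole * y
        out.append(y)
    return out


# A steady (DC) input passes through; an input alternating at the
# Nyquist frequency is attenuated.
dc = cabinet_lowpass([1.0] * 1000)
nyquist = cabinet_lowpass([1.0 if n % 2 == 0 else -1.0 for n in range(1000)])
print(dc[-1], abs(nyquist[-1]))
```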

But if you’re serious, a better start might be to generate impulse responses of various cabinets (4 x 12″, 2 x 10″, etc.), miked at typical positions (center, edge, close, far), with selected mics (dynamic, condenser). Then use convolution to recreate the responses.
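The convolution step can be sketched directly from its definition. This is a plain, unoptimized form for illustration; the function name and the tiny impulse response are placeholders, not measured cabinet data, and a practical implementation would more likely use FFT-based (partitioned) convolution for impulse responses of realistic length.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: each input sample adds a scaled,
    delayed copy of the impulse response to the output."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


# Placeholder values only: a real cabinet impulse response would be
# thousands of samples long.
fake_cabinet_ir = [1.0, 0.5]
print(convolve([1.0, 2.0, 3.0], fake_cabinet_ir))
```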

### Followup

To moderate the size of this article, I’ll follow with images depicting the oversampling process in the next article.


It’s been asked many times, so it’s worth an article explaining the conventions used on this site for transfer functions, and why they may differ from what you see elsewhere.

People run into this most often with biquads: I use a (a0, a1, a2) in the numerator (defining zeros), and b in the denominator (defining poles). Many references, including Wikipedia, use the opposite convention. Why am I so contrary?

OK, I go back a few years, back when the only way to find out about digital signal processing was to buy a college textbook and try to make sense of it. In the beginning, there was no convention (I’m not sure there is now, other than de facto—please let me know if some technical society or other has put down the word). When I first went to write for the public, I took a survey of my bookshelf. Most had a on top. And that choice made sense to me—a then b, from top to bottom, and also left to right for the direct form I structure. Finally, it makes sense that FIRs would have only a coefficients. Having only b coefficients seems odd, if you were to specialize the general transfer function.

My only guess for the preference for the other way, initially at least, is that when using the canonical (minimum number of delay elements) direct form II, the result would be ab left to right.

So, I wrote a number of articles over the years using that convention. And over the years it seems that b-on-top has become dominant. (I recall seeing a-on-top on another major internet resource recently, and I see that one site even uses c and d—I assume to avoid the conflict altogether.) Free time is not something I have a lot of, so I don’t intend to edit all my articles, diagrams, and widgets to change the convention. I’m not saying it will never happen, but not for now.

It’s easy enough to notice which convention is used if you’re aware that there is no 100% agreement. Biquads (and higher order IIRs) are typically normalized to the output, making b0 (in my case) unity, so there is no coefficient needed. If you see a set of a and b coefficients, and a0 is missing and b0 is not, then the order is swapped relative to mine.

Finally, there is one other place that you can get caught with filter coefficients. When deriving a difference equation (y(n) = a0x(n) + a1x(n-1) + a2x(n-2) – b1y(n-1) – b2y(n-2)), sometimes people roll the minus signs for the feedback part into the respective coefficients. This damages the mathematical purity of it all a tad, but makes sense in a computer implementation. I don’t merge the minus signs—and fortunately, I don’t think most internet sources do either.
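For concreteness, here is a small sketch of that difference equation as a direct form I biquad, using this site's naming (a for the feedforward terms, b for the feedback terms, minus signs kept outside the coefficients). The class name `Biquad` and its layout are assumptions for illustration.

```python
class Biquad:
    """Direct form I biquad:
    y(n) = a0*x(n) + a1*x(n-1) + a2*x(n-2) - b1*y(n-1) - b2*y(n-2)
    """

    def __init__(self, a0, a1, a2, b1, b2):
        self.a0, self.a1, self.a2 = a0, a1, a2
        self.b1, self.b2 = b1, b2
        self.x1 = self.x2 = 0.0  # previous two inputs
        self.y1 = self.y2 = 0.0  # previous two outputs

    def process(self, x):
        # The feedback terms are subtracted here, so b1 and b2 are
        # stored without the minus sign rolled in.
        y = (self.a0 * x + self.a1 * self.x1 + self.a2 * self.x2
             - self.b1 * self.y1 - self.b2 * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


# Impulse response of a one-pole lowpass written as a biquad.
f = Biquad(a0=0.5, a1=0.0, a2=0.0, b1=-0.5, b2=0.0)
impulse_response = [f.process(x) for x in (1.0, 0.0, 0.0, 0.0)]
print(impulse_response)
```

If a source has merged the minus signs into its feedback coefficients, negate them before passing them in as `b1` and `b2`.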

For the record, I’ll consult my bookshelf and see if I can come up with that original survey I mentioned earlier:

### a on top

Theory and Application of Digital Signal Processing, Rabiner and Gold, 1975
Principles of Digital Audio—Second Edition, Pohlmann, 1989*
Digital Signal Processing—A Practical Approach, Ifeachor and Jervis, 1993
Digital Audio Signal Processing, Zölzer, 1997
Digital Signal Processing—An Anthology: Chapter 2, An Introduction to Digital Filter Theory, Julius O. Smith, 1983

* The topic doesn’t appear in the first edition. Also, although the text uses a for the top (forward path of the difference equations, actually), a diagram of the direct form II structure shows the opposite. Since Pohlmann is consistent, otherwise, it seems the diagram was taken from somewhere else.

### b on top

I thought I had a few—can’t find any at the moment. Perhaps TI and Motorola application notes?

### N on top, D on bottom (Numerator and Denominator!)

Multirate Digital Signal Processing, Crochiere and Rabiner, 1983

### Other (these don’t use indexed coefficients)

Digital Signal Analysis, Stearns, 1975
Musical Applications of Microprocessors, Chamberlin, 1980

### Final notes

Again, the textbook survey is to show why I made the choice then, not to support why it should be that way now. Be aware that you can’t assume that a given author is consistent over time. Rabiner’s books use different conventions, with different co-authors, as noted. Julius O. Smith’s detailed 1983 article has a on top, while his vast web resources use b on top. In DAFX: Digital Audio Effects (like Anthology above, a collection from multiple authors), Zölzer has b on top. It’s always a good idea to pay attention—there is nothing magical about the coefficient naming, it’s simply a style consideration.

## Filter frequency response grapher

Here’s a tool that plots frequency response from filter coefficients.

[Interactive widget: Hz, Plot, Max, and Range controls, with input fields for a coefficients (zeros) and b coefficients (poles)]

The coefficients fields are tolerant of input format. Most characters that don’t look like numbers are treated as separators. So, you can enter coefficients separated by spaces or commas, or on different lines, separated by returns. That makes it easier to copy and paste coefficients from online filter calculators. They also ignore numbers that are followed by “=” or “ =”, so that “a0 = 0.1234” is seen as “0.1234”. Click the chart to accept coefficient changes, or change one of the controls.
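The tolerant parsing described above might look something like the sketch below. It is not the widget's actual code; the function name `parse_coefficients` and the regular expression are assumptions that simply follow the stated rules: anything that doesn't look like a number is a separator, and a number followed by “=” or “ =” is treated as part of a label.

```python
import re

# Optional sign, digits with an optional decimal part, optional exponent.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def parse_coefficients(text):
    """Pull coefficient values out of loosely formatted text."""
    values = []
    for match in _NUMBER.finditer(text):
        following = text[match.end():match.end() + 2]
        if following.startswith("=") or following == " =":
            continue  # part of a label, such as the "0" in "a0 ="
        values.append(float(match.group()))
    return values


print(parse_coefficients("a0 = 0.1234"))
print(parse_coefficients("1, -1.5\n0.75"))
```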

Important: This tool does not assume that the filter coefficients are normalized to y0. So, in most cases you’ll need to insert a “1” as the first pole coefficient, the b0 term.

If there are no pole coefficients, it’s an FIR filter—all zeros.
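The calculation behind a plot like this is the transfer function evaluated on the unit circle: the sum of the a terms divided by the sum of the b terms, each multiplied by e^(-jωk). The sketch below is one way to write it, not the widget's code; the function name `magnitude_db` is hypothetical. As with the tool, the b list is expected to include the leading b0 term, and an empty b list means an FIR filter.

```python
import cmath
import math


def magnitude_db(a, b, freq_hz, sample_rate):
    """Magnitude response in dB at one frequency.

    a: zero (numerator) coefficients a0, a1, ...
    b: pole (denominator) coefficients b0, b1, ...; empty for an FIR.
    """
    w = 2.0 * math.pi * freq_hz / sample_rate
    num = sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
    den = sum(bk * cmath.exp(-1j * w * k) for k, bk in enumerate(b)) if b else 1.0
    mag = abs(num / den)
    # A zero exactly on the unit circle gives no output at all.
    return 20.0 * math.log10(mag) if mag > 0.0 else float("-inf")


# One-pole lowpass: a = [0.5], b = [1, -0.5] (note the explicit b0 = 1).
for f in (0.0, 11025.0, 22050.0):
    print(f, magnitude_db([0.5], [1.0, -0.5], f, 44100.0))
```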

Again, the convention of this website is that coefficients corresponding to zeros (left side of a direct form I) are the a coefficients, and poles the b coefficients. It’s usually easy to see because most IIR filter calculators normalize the output. So, if you are missing a0, it probably means that a and b are swapped with respect to this site’s convention—just paste them in the opposite coefficients fields (and remember to use a 1 for the missing coefficient). Also, negative signs at the summation for the feedback terms (b) are not rolled into the coefficients.
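As a small sketch of that swap, assuming the source uses the b-on-top convention and omits the leading denominator coefficient (the function name `to_site_convention` and its `feedback_negated` flag are hypothetical):

```python
def to_site_convention(numerator, denominator, feedback_negated=False):
    """Map b-on-top coefficients to this site's a (zeros) and b (poles).

    numerator:   the source's b0, b1, b2 (feedforward terms)
    denominator: the source's a1, a2 (feedback terms, leading 1 omitted)
    feedback_negated: set True if the source rolled the minus signs
        of the feedback terms into its coefficients.
    """
    a = list(numerator)  # zeros go in the a fields
    feedback = [-c for c in denominator] if feedback_negated else list(denominator)
    b = [1.0] + feedback  # poles go in the b fields, with the missing 1 restored
    return a, b


a, b = to_site_convention([0.2, 0.4, 0.2], [-0.5, 0.3])
print("a coefficients (zeros):", a)
print("b coefficients (poles):", b)
```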
