Sampling theory, the best explanation you’ve ever heard—End notes

A few words before moving on to other topics…

We’ve looked at why digital samples represent ideal impulses, and why any point between samples represents a value of zero. And, as a result, audio samples don’t represent the audio itself, but a modulated version of the audio.

Why is it helpful to understand these points?

Critical sampling

First, it gives clearer and more intuitive answers for why digital audio behaves the way it does than more typical explanations do. For instance, it makes this puzzle trivial:

People ask why the sample rate needs to be double the frequency of the highest signal frequency that we want to preserve. Often the reply is that it needs to be just above double the highest frequency of interest, to avoid aliasing. But why? And how much higher? At this point, someone mentions something about wagon wheels turning the wrong way in movies. Or shows a graph with two sine waves of different frequencies intersecting the same sample points. So unsatisfying.

If you consider that the signal is amplitude modulated in the digitization process, you need only see that the sidebands would start overlapping at exactly half the sample rate. To keep them from overlapping, all frequencies must be below half the sample rate, giving each cycle more than two samples.
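As a rough sketch of that reasoning in code: the baseband occupies 0 to f_max, and the first image is the same band mirrored around the sample rate, so its lower edge sits at fs - f_max. The function names here are made up for illustration.

```c
#include <stdbool.h>

/* Lower edge of the first spectral image: the copy of the band
   0..f_max that is mirrored around the sample rate fs (all in Hz). */
double first_image_lower_edge(double fs, double f_max)
{
    return fs - f_max;
}

/* The baseband (0..f_max) and the first image touch or overlap
   once f_max reaches half the sample rate. */
bool sidebands_overlap(double fs, double f_max)
{
    return first_image_lower_edge(fs, f_max) <= f_max;
}
```

With fs at 48 kHz, a 20 kHz band leaves the image starting at 28 kHz, clear of the audio. Push the band to 24 kHz and the two meet.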

Multistage conversion

And integer sample rate conversion choices are easier to make. Especially for multistage conversion. We often use multistage conversion to improve efficiency. Like performing 8x upsampling as three 2x stages. If that sounds like three times the work, it isn’t, because the higher relative cutoffs of the filters make for fewer coefficients, balancing out with the total operations for 8x. But we can do more than break even by optimizing each stage—the earlier stages can be a bit sloppy as long as everything is tidy by the last stage’s output. Somewhat like doing a big cleanup on a house in multiple passes versus one.

Perhaps this is a good place to note that you might see chatroom posts where someone says that instead of inserting zeros and filtering, they prefer to use a polyphase filter. There is no difference—a polyphase filter in this case is simply an algorithm that upsamples and filters. Any seasoned programmer will notice that there is no need to explicitly place zeros between samples, then run all samples through an FIR, because the zero samples result in a zero product; optimizing the code to skip the zero operations results in a polyphase filter.
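To make that equivalence concrete, here is a minimal sketch of both versions side by side. The function names and buffer layout are assumptions for illustration, not code from any particular library; samples before the start of the input are treated as zero.

```c
#include <stddef.h>

/* Reference: conceptually insert L-1 zeros after each input sample,
   then run an FIR over the result. out must hold n_in * L samples. */
void upsample_naive(const double *in, size_t n_in, const double *h,
                    size_t n_taps, size_t L, double *out)
{
    size_t n_out = n_in * L;
    for (size_t n = 0; n < n_out; n++) {
        double acc = 0.0;
        for (size_t k = 0; k < n_taps && k <= n; k++) {
            size_t m = n - k;   /* index into the zero-stuffed signal */
            double x = (m % L == 0) ? in[m / L] : 0.0;
            acc += h[k] * x;    /* most of these products are zero */
        }
        out[n] = acc;
    }
}

/* Polyphase form: same output, but only the taps that line up with a
   real input sample are ever touched. */
void upsample_polyphase(const double *in, size_t n_in, const double *h,
                        size_t n_taps, size_t L, double *out)
{
    for (size_t i = 0; i < n_in; i++) {
        for (size_t p = 0; p < L; p++) {   /* p selects the sub-filter */
            double acc = 0.0;
            /* taps p, p+L, p+2L, ... meet in[i], in[i-1], in[i-2], ... */
            for (size_t k = p, j = 0; k < n_taps && j <= i; k += L, j++)
                acc += h[k] * in[i - j];
            out[i * L + p] = acc;
        }
    }
}
```

Both produce the same samples. The second simply does about 1/L of the multiplies, which is the whole point.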

Optimization example

An understanding of why we need to filter rate conversions can help us optimize DSP processes. For example, someone posted a question on a DSP board recently. They were using EQ filters designed by the bilinear transform, which have a pinching effect near half the sample rate (due to zeros at the Nyquist frequency). They didn’t need additional frequency headroom per se—the filters are linear—but they wanted to oversample by 2x to avoid the shape distortion of peaking EQ filters. (Note there are methods to reduce or avoid such distortion of the filter shape, but this is a good example.)

Let’s say we’re using a 48 kHz sample rate. Typically we’d raise the sample rate to 96k by inserting zeros every other sample and then lowpass filtering below 24k. Then we’d do our processing (EQ filtering). Finally, we’d take it back to 48k by lowpass filtering below 24k and discarding every other sample. But in this case, our processing step is linear (linear EQ filters), so it doesn’t create new frequencies. That means we can skip one of the lowpass filtering stages. It doesn’t matter whether we lowpass filter before or after the EQ processing, but we don’t need both. That’s a substantial savings.

Another example

Let’s say we create an audio effect such as an amp simulator, which has a non-linear process that requires running at a higher sample rate to reduce audible aliasing. We run our initial linear processes, such as tone controls, then upsample and run our non-linear processes (an overdriven tube simulation!). But in this case we conclude with a speaker cabinet simulator, which is a linear process (via convolution or other type of filtering). Guitar and bass cabinets use large speakers (typically 8” and up, often 10” or 12” for guitar), with frequency responses that drop steeply above around 5 kHz. Understanding how the downsampling process works, we might choose to eliminate the downsampling filter stage altogether, as superfluous, or at least save cycles with a simplified filter with relaxed requirements.

Posted in Digital Audio, Sampling Theory | Tagged | 2 Comments

Sampling theory, the best explanation you’ve ever heard—Part 3

We look at what Pulse Amplitude Modulation added to our analog source audio.

What did PAM add?

Earlier, we noted that the PAM signal represents the source signal plus some additional high frequency content that we need to remove with a lowpass filter before we listen back.

Again, PAM is amplitude modulation of the source signal with a pulse train. Mathematically, we know precisely what amplitude modulation produces—the sums and differences of every frequency component between the two input signals. That is, if you multiply a 100 Hz sine wave by a 6 Hz sine wave, the result is the sum of 106 Hz and 94 Hz sine waves. For signals with more frequency components, there are more sums and differences in the result.

To answer our question, “What got added?”, we need to understand the frequency content of a pulse train. One way to know that would be to use a Fourier Transform on the pulse train. But I want to use intuitive reasoning to eliminate as much math as possible. Fortunately, I already know what the extra frequency content is—it’s the spectral images in sampled systems, as described in classic DSP textbooks. That coupled with knowledge of amplitude modulation tips me off that we’ll need a frequency component at 0 Hz (DC—we need that to keep our original source band), at the sample rate, and at every integer multiple of the sample rate. Through infinity.

OK, we’ll lighten up on the infinity requirement. We can’t produce a perfect impulse in the analog world anyway. And we don’t need to. However, once in the digital domain, samples represent perfect impulses. While their values may have deviated slightly from a perfect representation of the analog signal, due to sampling time jitter and quantization, any math we do to them is “perfect” (again, subject to quantization and any other approximations). In the digital realm, the images do go to infinity.

Indeed, as you add cosine waves of 0, 1, 2, 3, 4…times the sample rate, the result gets closer and closer to the shape of an impulse. (Cosine instead of sine so that the peaks of the different frequencies line up.)

And that means we’ll have a copy of the source signal mirrored around 0 Hz, around the sample rate, twice the sample rate, three times the sample rate…to infinity. (In both directions, but we can ignore negative frequencies—for real signals, the negative spectrum mirrors the positive.)

What we’ve learned

Revisiting my “secrets”, with added comments:

1. Individual digital samples are impulses. Not bandlimited impulses, ideal ones.

Bothered that ideal impulses are impossible? Only in the physical world. There, we accept limitations. For instance, gather together infinity of something. Anything—I’ll wait. Meanwhile, in the mathematical world, infinity fits easily on this page: ∞

2. We know what lies between samples—virtual zero samples.

Think there’s really a continuous wave, implied, between samples? If so, you probably think it’s because samples represent a bandlimited impulse. No—you’re getting confused with what will come out of the DAC’s lowpass filter later, when we play back audio.

3. Audio samples don’t represent the source audio. They represent a modulated version of the audio. We modulated the audio to ensure points #1 and #2.

This is a frequency-domain observation that follows from the first two points, which are time domain. If you understand this point, you’ll never be confused about sample rate conversion.

Posted in Digital Audio, Sampling Theory | Tagged | 5 Comments

Sampling theory, the best explanation you’ve ever heard—Part 2

In this article, we explore the origins of sampling.

Discrete time

For many, discrete time and digital sampling are synonymous, because most people have little experience with discrete time analog. But perhaps you’ve used an old-style analog delay stompbox, with “bucket brigade” delay chips. Discrete time goes back a lot farther, though. When we talk of the sampling theorem, attributed to people like Nyquist, Shannon, and others, it applies to discrete time signals, not digital signals in particular.

The origins of discrete time theory are in communications. A single wire can support multiple simultaneous telegraph messages, if you synchronize a commutator between sender and receiver and slice time into sections to interleave the messages—this is called Time Division Multiplexing, or TDM. Following later with voice, using TDM to fit multiple voice calls on a line, it was found that the sampling rate had to be around 3500-4300 Hz for satisfactory results.

Traveling over a wire, analog signals can’t be “discrete” per se–there is always something being sent, no gaps in time. But the signal information is discrete, sending zero in between, and that leaves room to interleave other signals in TDM.

The most common method of making an analog signal discrete in this way is through Pulse Amplitude Modulation, or PAM. This means we multiply the source signal continuously with a pulse train of unit amplitude.

While the benefit of PAM for analog communications is that we can interleave multiple signals, for digital, the benefit is that we don’t need to store the “blank” (zero) space between samples. For digital sampling, we simply measure the height of each impulse of the PAM result, and encode it as a number. Pulse Amplitude Modulation and encoding—we call the combined process Pulse Code Modulation. Now you know what PCM means.

Impulses, really

Some might look at that last diagram and think, “But I’ve seen this process depicted as a staircase wave before, not spiky impulses.” In fact, measuring voltage quickly and with precision, which we must do for the encoding step, is not easy. Fortunately, we intend to discard the PAM waveform anyway, and keep just the digital values. We don’t need to maintain the empty spaces between impulses, since our objective is not time division multiplexing analog signals. So, we perform a “sample and hold” process on the source signal, which charges a capacitor at the moment of sampling and stretches the voltage value out, allowing a more leisurely measurement.

This results only in a small shift in time, functionally identical to instantaneous sampling—digital samples represent impulses, not a staircase. If you have a sample value of 0.73, think of it as an impulse of height 0.73 units.

The step of digitizing the analog PAM signal introduces quantization, and therefore quantization error. But it’s important to understand that issues related to aliasing are not a property of the digital domain—aliasing is a property of discrete time systems, so is inherent in the analog PAM signal as well. That’s why we took this detour—I believe I can explain aliasing to you in a simpler way, from the analog perspective.

Next: We’ll look at exactly what frequency content is added by the PAM (and therefore PCM) process, in Part 3

Posted in Digital Audio, Sampling Theory | Tagged | Leave a comment

Sampling theory, the best explanation you’ve ever heard—Part 1

I’ll start by giving away secrets first:

  1. Individual digital samples are impulses. Not bandlimited impulses, ideal ones.
  2. We know what lies between samples—virtual zero samples.
  3. Audio samples don’t represent the source audio. They represent a modulated version of the audio. We modulated the audio to ensure points #1 and #2.

Well, not secrets, but many smart people—people who’ve done DSP programming for years—don’t know these points. They have other beliefs that have served them well, but have left gaps.

Let’s see why

Analog audio, to digital for processing and storage, and back to analog

Component details—first the analog-to-digital converter (ADC)

The digital-to-analog converter (DAC)

Analog to digital conversion, and conversion back to analog are symmetrical processes—not surprising.

But we can make another important observation: We know that the bandlimiting lowpass filter of the ADC is there as a precaution, to ensure that the source signal is limited to frequencies below half the sample rate. But we have an identical filter at the output of the DAC—why do we need that, after eliminating the higher frequencies at the beginning of the ADC? The answer is that conversion to discrete time adds high frequency components not in the original signal.

Stop and think about this—it’s key to understanding digital audio. It means that the digital audio samples do not represent the spectrum of the bandlimited analog signal—the samples represent the spectrum of the bandlimited analog signal and additional higher frequencies.

To understand the nature of the higher frequencies added in the sampling process, it helps to look at the origins of sampling.

Next: We explore the origins of sampling in Part 2

Posted in Digital Audio, Sampling Theory | Tagged | 4 Comments

Sampling theory, the best explanation you’ve ever heard—Prologue

I’ve been working on a new video, with the goal of giving the best explanation of digital sampling you’ve ever heard. The catch is I started on it three years ago. I’m not that slow, it’s just that I’ve been busy with projects, so time passes between working on it. And each time I get back to it, I rewrite it from scratch. You see, I want to give you a solid, useful theoretical framework that will both help you understand and help you make good decisions, with the goal of being intuitive. So, each time I have a little different viewpoint to try.

The video is still a few months off—I’m still busy for the next few months—but I’m going to present the idea first as a collection of short articles. And in the end, if I like it, the articles will serve as the script for the video.

I believe this is a unique explanation of sampling. Please follow it from the start, even if you feel you know the subject well. This isn’t something I read or was taught, but came from thinking about a way to explain it without descending into either mathematical symbolism or hand waving.

Next up: My first crack at the best explanation of sampling theory you’ve ever heard.

Posted in Digital Audio, Sampling Theory | Tagged | Leave a comment

Amp simulation oversampling

In tandem with our last article on Guitar amp simulation, this article gives a step by step view of the sampling and rate conversion processes, with a look at the frequency spectrum.

From guitar to digital

The first two charts embody the initial sampling of the analog signal. It’s done in one step, from your analog-to-digital converter, but it’s a two-part process. First, a lowpass filter clears everything from half the sample rate up—something we must do to avoid higher frequencies aliasing into our audio band when sampled.

Then the signal is digitized. This creates repeating images of the positive and negative frequencies that extend upward without end. After this, we’ll look only at frequencies between 0 Hz and the sampling frequency, but it’s important to understand that these images are there, nonetheless.

If we don’t need more frequency headroom, we can do our signal processing on the samples we have. In fact, we want to do as much as we can at the lower sample rate. In the case of a guitar amp, we would process any tone controls that come before the “tube”, and other things like DC blocking and noise gating. And we can do our (non-saturating) gain stage here (assuming floating point).

Higher rate processing

After initial processing and gain, it’s time for saturation. For this non-linear process, we need frequency headroom. The first step is to increase the rate by inserting zero-magnitude samples. Though I suggested starting with 8x upsampling in the guitar amp article, this exercise shows 4x, in order to better accommodate page constraints. We place three zero-samples between each existing sample to bring it up 4x. This only raises the sample rate—we have four samples in the same period that we used to have one. Since the spectrum is not altered, the aliases are still there.

Part two of the sample rate conversion process is to use a lowpass filter to clear everything above our original audio band. (In reality, we optimize the zero insertion and filtering into a single process, to take advantage of multiplies by zero that we can skip. Note we usually use a linear-phase FIR for this step, in order to preserve the wave shape for the saturator.) Now we see our headroom.

After our saturation stage, we’ve created new harmonics. As long as they are at a sufficiently low level by the time they reach the sample rate minus our final audio bandwidth, the aliased version won’t pollute our audio band.

Back to normal

Done with our more expensive high-rate signal processing, we can drop back to our original sample rate for the rest. The first step is to run our lowpass filter again, to clear everything above our audio band.

Part two is the downsampling process—we simply keep one sample, discard three, and repeat. Why did we bother calculating them? Because we needed them ahead of the lowpass filter. But, here also, we can optimize these two down-conversion steps into one, and save needless calculation.
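One way to picture that optimization, as a minimal sketch: evaluate the FIR only at the output positions we intend to keep. The function name and the choice to treat samples before the start of the buffer as zero are assumptions for illustration.

```c
#include <stddef.h>

/* Lowpass-filter and downsample by M in one pass: the FIR is evaluated
   only for the output samples we keep. out must hold n_in / M samples.
   Returns the number of samples written. */
size_t decimate_fir(const double *in, size_t n_in, const double *h,
                    size_t n_taps, size_t M, double *out)
{
    size_t n_out = n_in / M;
    for (size_t i = 0; i < n_out; i++) {
        size_t n = i * M;              /* position at the high rate */
        double acc = 0.0;
        for (size_t k = 0; k < n_taps && k <= n; k++)
            acc += h[k] * in[n - k];
        out[i] = acc;
    }
    return n_out;
}
```

Every high-rate input sample still contributes to the result, which is why we needed to calculate them. We just never compute the three filter outputs out of four that would be thrown away.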

Final processing

From here, we handle other linear processes—any filtering and tone controls that follow the tube stage, effects such as spring reverb, and finally the speaker cabinet simulation. In the end, we send it to our digital-to-analog converter, which itself has a lowpass filter to remove the aliased copies.

And enjoy listening.

Posted in Digital Audio, Effects, Guitar Amp Simulation, Sample Rate Conversion | 4 Comments

Guitar amp simulation

In this article, I’ll sketch a basic guitar amp simulator. For one, questions on the topic come up often, and also, it will be a good example of a typical use of working at a higher sample rate.

The most basic guitar amp simulator has gain, with saturation, tone controls, and a speaker cabinet simulator. Because saturation is a non-linear process, the results are different whether the tone controls come before or after—more on this later. The speaker cabinet, of course, comes last, and is an important part of the tone. Gain with saturation (the overdriven “tube”) is the tricky part—we’ll start there.

Gain with saturation

Gain is simply a multiply. But we like to overdrive guitar amps. That means gain and some form of clipping or softer limiting—that’s what generates the overdrive distortion harmonics we want. Typically, we’d ease into the clipping, more like tube saturation behaves, but the more overdrive gain you use, the closer it gets to hard clipping, as more of the signal is at the hard limit.

That’s where we hit our first DSP problem. Clipping creates harmonics, and there’s no way to say, “Please, only generate harmonics below half the sample rate.” We will have aliasing. The added harmonics fall off in intensity (in much the same way as harmonics of a rectangular wave do), as they extend higher in frequency, but at typical digital audio sample rates, the Nyquist Frequency comes too soon. Aliased images extend back down into the audio range. We can’t filter them out, because they mix with harmonics we want to keep. And because the guitar notes played aren’t likely to be an integer multiple of the sample period, the aliased harmonics are out of tune. Worse, if you bend a guitar note up, the aliased harmonics bend down—that’s where aliasing becomes painfully apparent.

Can we calculate clipping overtones and create just the ones we want? Can we analyze the input and output and remove frequencies that we don’t want? That is not an easy task (left as an exercise for the reader!).

The oversampled solution

To mitigate the aliasing issue, the most practical solution is to give ourselves more frequency headroom before generating distortion harmonics with our saturation stage. If the sample rate is high enough, aliased images are spread far enough apart that the tones extending into our audio range from above are diminished to the point they are obscured by the din of our magnificent, thick, overdriven guitar sound. And when we play delicately or with little overdrive gain, so is the aliasing lessened.

How much headroom?

How much headroom do we need? I’d like to leave that to your situation and experimentation—mostly. But since I started doing this in the days when every DSP cycle was a precious commodity, I can tell you that—for 44.1 kHz sample rate, typically the worst case that we care about—the minimum acceptable oversampling factor is 6x. 8x is a good place to start, and will be adequate for many uses, at a reasonable cost. (Do multistage oversampling for efficiency…but that’s another story.)

Raising the sample rate “8x” sounds like we’ll have eight times the bandwidth, but it’s better than that. At our original 44.1 kHz sample rate, we have a usable bandwidth of about 20 kHz (allowing for the conversion filters), and little additional frequency headroom. If we go past 24.1 kHz (44.1 kHz – 20 kHz), aliasing extends below 20 kHz. But by raising the rate to 8 times the original, we have headroom of 352.8 kHz – 20 kHz, or 332.8 kHz. That’s more than 80 times our original headroom.
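The arithmetic is easy to play with for other rates and factors. A tiny sketch, with a hypothetical function name:

```c
/* Highest frequency (Hz) a component can reach before its alias folds
   back below the top of the audio band. */
double alias_free_limit(double fs, int oversample, double audio_band)
{
    return fs * oversample - audio_band;
}
```

For 44.1 kHz and a 20 kHz audio band, a factor of 1 gives 24100 Hz and a factor of 8 gives 332800 Hz, matching the figures above.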

The idea is that we upsample the signal (removing anything not in our original audio band), run it through our tube simulator (clipper/saturator), then drop the sample rate back to the original rate (part of this process is again removing frequencies higher than our original audio band).

How much gain?

Real guitar amps have a lot of gain. Don’t think that if your overdrive is adjustable from 0-2 gain factor, you’ll get some good overdrive distortion. That’s only a maximum of 6 dB. More like 60 dB (0-1024)! Maybe up to something like 90 dB for modern, screaming high-gain amps. That’s equivalent to a shift of 15 bits, so I hope you’re using something better than a 16-bit converter (or tracks) to input your raw guitar sound.

The tube

My intent here is not to guide you down the road of yet another guitar amp simulator plugin, but to give an example of a task that needed more frequency headroom, requiring processing at a higher sample rate. But it’s worth going into just a bit of detail on the tube (saturation) element. Again, we’re talking about “first approximations”—a hard clipper, or a soft one.

A hard clipper is trivial. If a sample is greater than 1, change it to 1. If less than -1, change it to -1.

if (samp > 1.0)
  samp = 1.0;
else if (samp < -1.0)
  samp = -1.0;

Here’s the transfer function—for input sample values along the x axis, the output is on the y axis. For input { 0.5, 0.8, 1.0, 1.3, 4.2 }, the output is { 0.5, 0.8, 1.0, 1.0, 1.0 }.

But for this use we’d probably like a softer clip—a transfer function that eases into the limit.

samp = samp > 1 ? 1 : (samp <= -1 ? -1 : samp * (2 - fabs(samp)));

This is just a compact way of saying,

if (samp > 1.0)
  samp = 1.0;
else if (samp < -1.0)
  samp = -1.0;
else
  samp = samp * (2 - fabs(samp));

This is a very mild, symmetrical x-squared curve. It’s just an example, but you’ll find it produces useful results. You could try a curve that stays straighter, then curves quicker near +/-1. Or a curve that’s not symmetrical for positive and negative excursions. The harder the curve, the closer we get to our original hard-clipping.
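For experimenting, it can help to wrap the clipper in a function and drive a buffer through it. This is only a sketch; the names and the in-place buffer handling are assumptions, not a recommended plugin structure.

```c
#include <math.h>
#include <stddef.h>

/* Soft clipper described above: x * (2 - |x|) inside -1..1,
   hard limit outside. */
double soft_clip(double x)
{
    if (x > 1.0)
        return 1.0;
    if (x < -1.0)
        return -1.0;
    return x * (2.0 - fabs(x));
}

/* Apply drive gain, then the clipper, to a buffer in place. */
void saturate(double *buf, size_t n, double gain)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = soft_clip(buf[i] * gain);
}
```

An input of 0.5 comes out as 0.75, and the curve meets the hard limit smoothly at exactly 1.0. In a real amp sim this would run at the oversampled rate.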

One detail worth noting: If you’ve spent much time looking at discussions of this sort of non-linear transfer function on the web, usenet, or mailing lists, invariably someone points out that we know exactly what order of harmonics will be created, and therefore how much oversampling we might need, based on the polynomial degree. This is wrong—in fact we’re only using the polynomial between the bounds of x from -1 to 1, and substituting a hard clip for all other points. For a use such as this, the input is driven far into the hard clip region, so the polynomial degree is no longer relevant.

Tone controls

We know IIR filters well, so this part is easy. Typically, a guitar amp might have bass, mid, and treble controls. One catch is that if you want to sound like a particular vintage amp, they used passive filters that result in control interaction. That is, not active circuitry that isolates the components from each other. So, adjusting the bass band might affect the mid filtering as well. But that’s fairly easy to adjust for.

More about overdrive and tone

Something worth noting. If you really want to scream with gain, you need to make some architectural adjustments. If you’ve played much with raw distortion pedals with different instruments, you’ve probably noticed that you get a “flatulent” sound with heavy clipping of bass. And high-gain distortion of signals with a lot of highs can sound pretty shrill. Guitar solos really scream when they have a lot of midrange distortion. So, if you really want to go for a screaming lead with loads of distortion without it falling apart into useless grunge, one thing you can do is roll off the lows and highs before the tube sim (distortion) stage. But then the result lacks body—compensate by boosting the lows and highs back up, after the distortion stage. The point here is that you have three choices of where to put EQ for your amp tone: before the tube, after, or both.

Some of your tone choices can be a property of the amp model—not everything needs to be a knob for the user to control. These choices are what give a particular guitar amp its characteristics. A Fender Twin does not sound like a Marshall Plexi. The reason that well-known guitar amps are the basis of DSP-based simulation is that the choices of their designers have withstood the test of time. Countless competitors faded from memory, often because their choices were not as compelling.


Similarly, guitar speakers gained familiar configurations not because the industry got together and chose, but because these are the ones that worked out well.

One characteristic of speakers for guitar amps is that they are not full range—you’ll find no tweeters in guitar cabinets. The highs of the large speakers used (most often 10″ and 12″) drop off very quickly. A clean guitar tone doesn’t have strong high frequency harmonics, and the highest note on a guitar typically has a fundamental not far above 1 kHz. Overdrive distortion creates powerful high frequency harmonics, but we really don’t want to hear them up very high—extremely harsh and fatiguing to listen to.

The first approximation of a speaker cabinet is simply a lowpass filter, set to maybe 5 kHz. I didn’t say it would be a great cabinet, but it would start to sound like a real guitar amp combo. The next step might be to approximate the response of a real speaker cabinet, miked, with multiple filters.

But if you’re serious, a better start might be to generate impulse responses of various cabinets (4 x 12″, 2 x 10″, etc.), miked at typical positions (center, edge, close, far), with selected mics (dynamic, condenser). Then use convolution to recreate the responses.
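Convolution itself is straightforward, as this minimal sketch shows. The function name is an assumption. Measured cabinet responses can run to thousands of samples, so real-time code commonly uses FFT-based convolution instead of this direct loop.

```c
#include <stddef.h>

/* Direct convolution of input x (n_x samples) with impulse response ir
   (n_ir samples). out must hold n_x + n_ir - 1 samples. */
void convolve(const double *x, size_t n_x, const double *ir, size_t n_ir,
              double *out)
{
    for (size_t n = 0; n < n_x + n_ir - 1; n++) {
        double acc = 0.0;
        for (size_t k = 0; k < n_ir; k++) {
            if (k > n)
                break;                 /* would read before the start of x */
            if (n - k < n_x)
                acc += ir[k] * x[n - k];
        }
        out[n] = acc;
    }
}
```

Each input sample launches a scaled copy of the impulse response, and the output is the sum of all those overlapping copies.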


To moderate the size of this article, I’ll follow with images depicting the oversampling process in the next article.

Posted in Aliasing, Digital Audio, Sample Rate Conversion | 3 Comments

About coefficient conventions

It’s been asked many times, so it’s worth an article explaining the conventions used on this site for transfer functions, and why they may differ from what you see elsewhere.

People run into this most often with biquads: I use a (a0, a1, a2) in the numerator (defining zeros), and b in the denominator (defining poles). Many references, including Wikipedia, use the opposite convention. Why am I so contrary?

OK, I go back a few years, back when the only way to find out about digital signal processing was to buy a college textbook and try to make sense of it. In the beginning, there was no convention (I’m not sure there is now, other than de facto—please let me know if some technical society or other has put down the word). When I first went to write for the public, I took a survey of my bookshelf. Most had a on top. And that choice made sense to me—a then b, from top to bottom, and also left to right for the direct form I structure. Finally, it makes sense that FIRs would have only a coefficients. Having only b coefficients seems odd, if you were to specialize the general transfer function.

My only guess for the opposite preference, initially at least, is that when using the canonical (minimum number of delay elements) direct form II, the result would be ab left to right.

So, I wrote a number of articles over the years using that convention. And over the years it seems that b-on-top has become dominant. (I recall seeing a-on-top on another major internet resource recently, and I see that one site even uses c and d—I assume to avoid the conflict altogether.) Free time is not something I have a lot of, so I don’t intend to edit all my articles, diagrams, and widgets to change the convention. I’m not saying it will never happen, but not for now.

It’s easy enough to notice which convention is used if you’re aware that there is no 100% agreement. Biquads (and higher order IIRs) are typically normalized to the output, making b0 (in my case) unity, so there is no coefficient needed. If you see a set of a and b coefficients, and a0 is missing and b0 is not, then the order is swapped relative to mine.

Finally, there is one other place that you can get caught with filter coefficients. When deriving a difference equation (y(n) = a0*x(n) + a1*x(n-1) + a2*x(n-2) - b1*y(n-1) - b2*y(n-2)), sometimes people roll the minus signs for the feedback part into the respective coefficients. This damages the mathematical purity of it all a tad, but makes sense in a computer implementation. I don’t merge the minus signs—and fortunately, I don’t think most internet sources do either.
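Here is how that difference equation might look as a direct form I biquad in C, using this site's naming (a for zeros, b for poles, b0 normalized to 1) and keeping the minus signs out of the coefficients. The struct layout and function name are assumptions for the sketch.

```c
typedef struct {
    double a0, a1, a2;   /* feedforward (zeros), this site's naming */
    double b1, b2;       /* feedback (poles), b0 normalized to 1 */
    double x1, x2;       /* previous two inputs */
    double y1, y2;       /* previous two outputs */
} Biquad;

/* Direct form I:
   y(n) = a0 x(n) + a1 x(n-1) + a2 x(n-2) - b1 y(n-1) - b2 y(n-2) */
double biquad_process(Biquad *f, double x)
{
    double y = f->a0 * x + f->a1 * f->x1 + f->a2 * f->x2
             - f->b1 * f->y1 - f->b2 * f->y2;
    f->x2 = f->x1;
    f->x1 = x;
    f->y2 = f->y1;
    f->y1 = y;
    return y;
}
```

If your coefficients come from a source that rolls the minus signs in, flip the sign of b1 and b2 before using them here, or change the two subtractions to additions.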

For the record, I’ll consult my bookshelf and see if I can come up with that original survey I mentioned earlier:

a on top

Theory and Application of Digital Signal Processing, Rabiner and Gold, 1975
Principles of Digital Audio—Second Edition, Pohlmann, 1989*
Advanced Digital Audio, Pohlmann, 1991
Digital Signal Processing—A Practical Approach, Ifeachor and Jervis, 1993
Digital Audio Signal Processing, Zölzer, 1997
Digital Signal Processing—An Anthology: Chapter 2, An Introduction to Digital Filter Theory, Julius O. Smith, 1983

* The topic doesn’t appear in the first edition. Also, although the text uses a for the top (forward path of the difference equations, actually), a diagram of the direct form II structure shows the opposite. Since Pohlmann is consistent, otherwise, it seems the diagram was taken from somewhere else.

b on top

I thought I had a few—can’t find any at the moment. Perhaps TI and Motorola application notes?

N on top, D on bottom (Numerator and Denominator!)

Multirate Digital Signal Processing, Crochiere and Rabiner, 1983

Other (these don’t use indexed coefficients)

Digital Signal Analysis, Stearns, 1975
Musical Applications of Microprocessors, Chamberlin, 1980

Final notes

Again, the textbook survey is to show why I made the choice then, not to support why it should be that way now. Be aware that you can’t assume that a given author is consistent over time. Rabiner’s books use different conventions, with different co-authors, as noted. Julius O. Smith’s detailed 1983 article has a on top, while his vast web resources use b on top. In DAFX: Digital Audio Effects (like Anthology above, a collection from multiple authors), Zölzer has b on top. It’s always a good idea to pay attention—there is nothing magical about the coefficient naming, it’s simply a style consideration.


Filter frequency response grapher

Here’s a tool that plots frequency response from filter coefficients.

a coefficients (zeros)
b coefficients (poles)

The coefficients fields are tolerant of input format. Most characters that don’t look like numbers are treated as separators. So, you can enter coefficients separated by spaces or commas, or on different lines, separated by returns. That makes it easier to copy and paste coefficients from online filter calculators. They also ignore numbers that are followed by “=” or “ =”, so that “a0 = 0.1234” is seen as “0.1234”. Click the chart to accept coefficient changes, or change one of the controls.
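For the curious, the parsing behavior described above could be approximated along these lines. This is only a sketch of the idea in Python, not the widget's actual code; the function name and the regular expression are my own assumptions.

```python
import re

# Matches an optionally signed decimal number, with optional exponent.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def parse_coefs(text):
    """Pull coefficients out of loosely formatted text.

    Anything that doesn't look like a number acts as a separator.
    A number immediately followed by "=" or " =" is treated as part
    of a label (the 0 in "a0 = 0.1234") and skipped.
    """
    coefs = []
    for match in _NUMBER.finditer(text):
        rest = text[match.end():]
        if rest.startswith("=") or rest.startswith(" ="):
            continue  # part of a name such as "a0", not a coefficient
        coefs.append(float(match.group()))
    return coefs
```

With this sketch, `parse_coefs("a0 = 0.1234, a1 = 0.5")` yields the two values and ignores the digits inside the labels.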

Important: This tool does not assume that the filter coefficients are normalized to y0. So, in most cases you’ll need to insert a “1” as the first pole coefficient, the b0 term.

If there are no pole coefficients, it’s an FIR filter—all zeros.

Again, the convention of this website is that coefficients corresponding to zeros (left side of a direct form I) are the a coefficients, and poles the b coefficients. It’s usually easy to see because most IIR filter calculators normalize the output. So, if you are missing a0, it probably means that a and b are swapped with respect to this site’s convention—just paste them in the opposite coefficients fields (and remember to use a 1 for the missing coefficient). Also, negative signs at the summation for the feedback terms (b) are not rolled into the coefficients.


Evaluating filter frequency response

A question that pops up for many DSP-ers working with IIR and FIR filters, I think, is how to look at a filter’s frequency and phase response. For many, maybe they’ve calculated filter coefficients with something like the biquad calculator on this site, or maybe they’ve used MATLAB, Octave, or Python (with the scipy library) and functions like freqz to compute and plot responses. But what if you want to code your own, perhaps to plot within a plugin written in C++?

You can find methods of calculating biquads, for instance, but here we’ll discuss a general solution. Fortunately, the general solution is easier to understand than starting with an equation that may have been optimized for a specific task, such as plotting biquad response.

Plotting an impulse response

One way we could approach it is to plot the impulse response of the filter. That works for any linear, time-invariant process, and a fixed filter qualifies. One problem is that we don’t know how long the impulse response might be, for an arbitrary filter. IIR (Infinite Impulse Response) filters can have a very long impulse response, as the name implies. We can feed a 1.0 sample followed by 0.0 samples to obtain the impulse response of the filter. While we don’t know how long it will be, we could take a long impulse response, perhaps windowing it, use an FFT to convert it to the frequency domain, and get a pretty good picture. But it’s not perfect.
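As a rough sketch of the impulse response approach in Python with numpy: the function name, the 1024-sample truncation length, and the choice to skip windowing are all assumptions made for illustration. The coefficients are borrowed from the biquad example later in this article.

```python
import numpy as np


def impulse_response(zeros, poles, length):
    """Feed a 1.0 followed by 0.0 samples through the filter.

    Uses this site's convention: zeros are the a coefficients,
    poles are the b coefficients with b0 first.
    """
    x = np.zeros(length)
    x[0] = 1.0
    y = np.zeros(length)
    for n in range(length):
        acc = 0.0
        for k, a in enumerate(zeros):
            if n - k >= 0:
                acc += a * x[n - k]
        for k in range(1, len(poles)):
            if n - k >= 0:
                acc -= poles[k] * y[n - k]
        y[n] = acc / poles[0]
    return y


zeros = [0.2513643668578741, 0.5027287337157482, 0.2513643668578741]
poles = [1.0, -0.17123074520885395, 0.1766882126403502]

h = impulse_response(zeros, poles, 1024)  # truncation length is a guess
mag = np.abs(np.fft.rfft(h))              # 0 Hz up to half the sample rate
```

This particular filter decays quickly, so truncating at 1024 samples loses almost nothing. A filter with poles close to the unit circle would need a much longer response, which is the weakness of this approach.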

For an FIR (Finite Impulse Response) filter, though, the results are precise. And the impulse response is equal to the coefficients themselves. So:

For the FIR, we simply run the coefficients through an FFT, and take the absolute value of the complex result to get the magnitude response.

(The FFT requires a power-of-2 length, so we’d need to append zeros to fill, or use a DFT. But we probably want to append zeros anyway, to get more frequency points out for our graph.)
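In Python with numpy, the FIR case might look like the following sketch. The three-tap filter is a made-up example, and the FFT size of 256 is an arbitrary choice; the `n` argument of `np.fft.rfft` takes care of appending the zeros.

```python
import numpy as np

fir = [0.25, 0.5, 0.25]  # hypothetical 3-tap lowpass, for illustration only
fft_size = 256           # power of 2; larger gives more frequency points

# n=fft_size zero-pads the coefficients to the FFT length
spectrum = np.fft.rfft(fir, n=fft_size)
fir_mag = np.abs(spectrum)  # fft_size/2 + 1 points, 0 to half the sample rate
```

For this filter the magnitude is 1.0 at 0 Hz and falls to zero at half the sample rate.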

Plotting the filter precisely

Let’s look for a more precise way to plot an arbitrary filter’s response, which might be IIR. Fortunately, if we have the filter coefficients, we have everything we need, because we have the filter’s transfer function, from which we can calculate a response for any frequency.

The transfer function of an IIR filter is given by

\(H(z)=\frac{a_0z^{0}+a_1z^{-1}+a_2z^{-2}}{b_0z^{0}+b_1z^{-1}+b_2z^{-2}}\)

\(z^0\) is 1, of course, as is any value raised to the power of 0. And for normalized biquads, b0 is always 1, but I’ll leave it here for generality; you’ll see why soon.

To translate that to a frequency response, we substitute \(e^{j\omega}\) for z, where ω is 2π*freq, with freq being the normalized frequency, or frequency/samplerate:

\(H(e^{j\omega})=\frac{a_0e^{-0j\omega}+a_1e^{-1j\omega}+a_2e^{-2j\omega}}{b_0e^{-0j\omega}+b_1e^{-1j\omega}+b_2e^{-2j\omega}}\)

Again, \(e^{-0j\omega}\) is simply 1.0, but left so you can see the pattern. Here it is restated using summations of an arbitrary number of poles and zeros:

\(H(e^{j\omega})=\frac{\sum_{n=0}^{M}a_ne^{-nj\omega}}{\sum_{n=0}^{N}b_ne^{-nj\omega}}\)

For any angular frequency, ω, we can solve \(H(e^{j\omega})\). A normalized frequency of 0.5 is half the sample rate, so we probably want to step it from 0 to 0.5 (ω from 0 to π) for however many points we want to evaluate and plot.

Coding it

From that last equation, we can see that a single FOR loop will handle the top or the bottom coefficient sets. Here, we’ll code that into a function that can evaluate either zeros (a terms) or poles (b terms). We’ll refer to this as our direct evaluation function, since it evaluates the coefficients directly (as opposed to evaluating an impulse response).

You’ve probably noticed the j, meaning an imaginary part of a complex number—the output will be complex. That’s OK, the output of an FFT is complex too, and we know how to get magnitude and phase from it already.

Some languages support complex arithmetic, and have no problem evaluating “e**(-2*j*0.5)”—either directly, or with an “exp” (exponential) function. It’s pretty easy in Python, for instance. (Something like, coef[idx] * math.e**(-idx * w * 1j), as the variable idx steps through the coefficients array.)

For languages that don’t, we can use Euler’s formula, \(e^{jx} = \cos(x) + j\sin(x)\); that is, the real part is the cosine of the argument, and the imaginary part is the sine of it.

(Remember, j is the same as i. Electrical engineers already used i to symbolize current, so they diverged from physicists and used j. Programming languages often use j as well, maybe because i is a commonly used index variable.)

So, we create our function, run it on the numerator coefficients for a given frequency, run it again on the denominator coefficients, and divide the two. The result will be complex—taking the absolute value gives us the magnitude response at that frequency.
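Here is a sketch of how that might look without any complex number support, using Euler’s formula and carrying the real and imaginary parts separately. It is written in Python only for consistency with the rest of this article; the function names are hypothetical, and the same arithmetic should translate directly to C or C++.

```python
import math


def coefs_eval_real(coefs, w):
    """Sum of coefs[n] * e^(-j*n*w), returned as (real, imag).

    By Euler's formula, e^(-j*x) = cos(x) - j*sin(x).
    """
    re = 0.0
    im = 0.0
    for idx, c in enumerate(coefs):
        re += c * math.cos(idx * w)
        im -= c * math.sin(idx * w)
    return re, im


def response_at(zeros, poles, w):
    """Magnitude and phase (radians) at angular frequency w."""
    n_re, n_im = coefs_eval_real(zeros, w)
    d_re, d_im = coefs_eval_real(poles, w)
    # complex division: (n_re + j*n_im) / (d_re + j*d_im)
    den = d_re * d_re + d_im * d_im
    h_re = (n_re * d_re + n_im * d_im) / den
    h_im = (n_im * d_re - n_re * d_im) / den
    mag = math.sqrt(h_re * h_re + h_im * h_im)
    phase = math.atan2(h_im, h_re)
    return mag, phase
```

Note that the division will fail if the denominator evaluates to exactly zero, which can only happen for a filter with a pole on the unit circle at that frequency.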

Revisiting the FIR

Since we already had a precise method of looking at FIR response via the FFT/DFT, let’s compare the two methods to see how similar they are.

To use our new method for the case of an FIR, we note that the denominator is simply 1, so there is no denominator to evaluate, no need for division. So:

For the FIR, we simply run the coefficients through our evaluation function, and take the absolute value of the complex result to get the magnitude response.

Does that sound familiar? It’s the same process we outlined using the FFT.

And back to IIR

OK, we just showed that our new evaluation function and the FFT are equivalent. (There is a difference—our evaluation function can check the response at an arbitrary frequency, whereas the FFT frequency spacing is defined by the FFT size, but we’ll set that aside for the moment. For a given frequency, the two produce identical results.)

Now, if the direct evaluation function and the FFT give the same results, for the same frequency point, and the numerator and denominator are evaluated by the same function, by extension we could also get a precise evaluation by substituting an FFT process for both the numerator and denominator, and dividing the two as before. Note that we’re no longer talking about the FFT of the impulse response, but the coefficients themselves. That means we no longer have the problem of getting the response of an impulse that can ring out for an unknown time—we have a known number of coefficients to run through the FFT.
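A minimal numpy sketch of that idea, assuming the biquad coefficients from the example later in this article and an arbitrary FFT size of 128:

```python
import numpy as np

zeros = [0.2513643668578741, 0.5027287337157482, 0.2513643668578741]
poles = [1.0, -0.17123074520885395, 0.1766882126403502]

fft_size = 128  # arbitrary; sets the frequency spacing

# FFT of the coefficients themselves (zero-padded), not of an impulse response
H = np.fft.rfft(zeros, n=fft_size) / np.fft.rfft(poles, n=fft_size)

mag_db = 20 * np.log10(np.maximum(np.abs(H), 1e-10))  # floor at -200 dB
phase = np.angle(H)                                    # radians
```

Bin k of the result corresponds to ω = 2πk/fft_size, so the 65 points here run from 0 to π in equal steps.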

Which is better?

In general, the answer is our direct evaluation method. Why? We can decide exactly where we want to evaluate each point. That means that we can just as easily plot with log frequency as we can linear.
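As an illustration of choosing the evaluation points freely, here is a sketch that evaluates a filter at log-spaced frequencies. The 44.1 kHz sample rate, the 20 Hz to 20 kHz range, and the point count are assumptions chosen for the example, and the short evaluation helper is a compact variant that accepts an array of frequencies.

```python
import numpy as np


def coefs_eval(coefs, w):
    # direct evaluation; w may be a single value or a numpy array
    return sum(c * np.exp(-1j * k * w) for k, c in enumerate(coefs))


zeros = [0.2513643668578741, 0.5027287337157482, 0.2513643668578741]
poles = [1.0, -0.17123074520885395, 0.1766882126403502]

sample_rate = 44100.0                             # assumed for illustration
freqs_hz = np.geomspace(20.0, 20000.0, num=100)   # log-spaced frequencies
w = 2 * np.pi * freqs_hz / sample_rate            # angular frequency, 0..pi

H = coefs_eval(zeros, w) / coefs_eval(poles, w)
mag_db = 20 * np.log10(np.maximum(np.abs(H), 1e-10))
```

Getting the same log spacing from an FFT would mean computing a large transform and then picking or interpolating bins.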

But, there may be times that the FFT is more suitable—it is extremely efficient for power-of-2 lengths. (And don’t forget that we can use a real FFT—the upper half of the general FFT results would mirror the lower half and not be needed.)

An implementation

We probably want to evaluate ω from 0 to π, corresponding to a range of half the sample rate. So, we’d call the evaluation function with the numerator coefficients and with the denominator coefficients, for every ω that we want to know (spacing can be linear or log), and divide the two. For frequency response, we’d take the absolute value (equivalently, the square root of the sum of the squared real and imaginary parts) of each complex result to obtain magnitude, and the arc tangent of the imaginary part divided by the real part to obtain phase (specifically, we use the atan2 function, which takes quadrants into account). Note that this is the same conversion we use for FFT results, as you can see in my article, A gentle introduction to the FFT.

\(magnitude:=\left |H \right |=abs(H)=\sqrt{H.real^2+H.imag^2}\)
\(phase := atan2(H.imag,H.real)\)

For now, I’ll leave you with some Python code, as it’s cleaner and leaner than a C or C++ implementation. It will make it easier to transfer to any language you might want (Python can be quite compact and elegant—I’m going for easy to understand and translate with this code). Here’s the direct evaluation routine corresponding to the summation part of the equation (you’ll also need to “import numpy” to have e available—also available in the math library, but we’ll use numpy later, so we’ll stick with numpy alone):

import numpy as np

# direct evaluation of coefficients at a given angular frequency
def coefsEval(coefs, w):
    res = 0
    idx = 0
    for x in coefs:
        res += x * np.e**(-idx * 1j * w)
        idx += 1
    return res

Again, we call this with the coefficients for each frequency of interest. Once for the numerator coefficients (the a coefficients on this website, corresponding to zeros), once for the denominator coefficients (b, for the poles; and don’t forget that if there is no b0, the case for a normalized filter, insert a 1.0 in its place). Divide the first result by the second. Use abs (or equivalent) for magnitude and atan2 for phase on the result. Repeat for every frequency of interest.

Here’s a Python function that evaluates numerator and denominator coefficients at an arbitrary number of points from 0 to π radians, with equal spacing, returning arrays of magnitude (in dB) and phase (in radians, between +/- π):

# filter response, evaluated at numPoints from 0-pi, inclusive
def filterEval(zeros, poles, numPoints):
    magdB = np.empty(0)
    phase = np.empty(0)
    for jdx in range(0, numPoints):
        w = jdx * np.pi / (numPoints - 1)
        resZeros = coefsEval(zeros, w)
        resPoles = coefsEval(poles, w)

        # output magnitude in dB, phase in radians
        Hw = resZeros / resPoles
        mag = abs(Hw)
        if mag == 0:
            mag = 0.0000000001  # limit to -200 dB for log
        magdB = np.append(magdB, 20 * np.log10(mag))
        # np.arctan2 is numpy's atan2 (quadrant-aware arc tangent)
        phase = np.append(phase, np.arctan2(Hw.imag, Hw.real))
    return (magdB, phase)

Here’s an example of evaluating biquad coefficients at 64 evenly spaced frequencies from 0 Hz to half the sample rate (these coefficients are right out of the biquad calculator on this website—don’t forget to include b0 = 1.0):

zeros = [ 0.2513643668578741, 0.5027287337157482, 0.2513643668578741 ]
poles = [ 1.0, -0.17123074520885395, 0.1766882126403502 ]

(magdB, phase) = filterEval(zeros, poles, 64)

for x in magdB:
    print(x)

for x in phase:
    print(x)

Next up, a JavaScript widget to plot magnitude and phase of arbitrary filter coefficients.

Extra credit

The direct evaluation function performs a Fourier analysis at a frequency of interest. For better understanding, reconcile it with the discrete Fourier transform described in A gentle introduction to the FFT. In that article, I describe probing the signal with cosine and sine waves to obtain the response at a given frequency. Look again at Euler’s formula, which shows that \(e^{jx}\) is cosine (real part) and sine (imaginary part), which the article alludes to under the section “Getting complex”. You should understand that the direct evaluation function presented here could be used to produce a DFT (given complete evaluation of the signals at appropriately spaced frequencies). The main difference is that for this analysis, we need not do a complete and reversible transform; we need only analyze frequency response values that we want to graph.
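If you’d like to check your understanding, here is one possible sketch of the exercise in Python: a compact form of the direct evaluation, applied to an arbitrary made-up signal at the N equally spaced frequencies a DFT uses, then compared with numpy’s FFT.

```python
import numpy as np


def coefs_eval(coefs, w):
    # direct evaluation at angular frequency w
    return sum(c * np.exp(-1j * k * w) for k, c in enumerate(coefs))


# arbitrary test signal, for illustration only
signal = [1.0, 2.0, 0.5, -1.0, 0.0, 0.25, -0.5, 3.0]
N = len(signal)

# a DFT is the direct evaluation at w = 2*pi*k/N for k = 0..N-1
dft = np.array([coefs_eval(signal, 2 * np.pi * k / N) for k in range(N)])

matches_fft = np.allclose(dft, np.fft.fft(signal))
```

The two agree to within rounding error; the FFT simply gets there with far fewer operations when you need every bin.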
