Recently, I’ve had lengthy discussions on the topic of dither with a couple of different people—of opposite views. One believes that everything should be dithered, including truncations to 24-bit. The other feels that dither is a waste of time even for 16-bit, and is needed only for shorter word lengths, rarely used today.
This got me thinking about how to give perspective on the truncation distortion and dither levels we’re talking about for 16-bit and 24-bit files. The usual demonstration would be to find or create a recording that is not unrealistic, yet produces noticeable truncation distortion at the bit levels we’re interested in—which is mainly near the floor of 16-bit and 24-bit samples.
But it occurred to me that people don’t have a good idea of how loud the distortion levels are by themselves, and that should be the place to start. I thought of a way to generate signals of those levels with no distortion of their own.
Perfect audio
Are you ready to listen to perfect digital audio files? By perfect I mean that they are exactly what they are designed to be. If I’d tried for low-level sine waves, for instance, they would be crude, distorted approximations. Instead, I created square waves at precise bit levels to avoid quantization errors, synchronized to the sample rate to avoid aliasing. The sample values are exact—they are not values that are “close” to expected values—and the harmonic content is representative of truncation effects at those levels.
The signal begins with the LSB set (we’ll call this “+1”), then the next sample is negated (-1). This repeats (+1, -1, +1, -1, …) for a short duration, then each half-cycle lengthens by one sample (+1, +1, -1, -1, +1, +1, -1, -1, …) for a short duration, and the process continues, lengthening each half-cycle by one more sample each time. This is a classic “divide by n” oscillator. You’ll notice that the pitch resolution is very poor at the beginning, as the second tone is an octave down from the first (which is at Nyquist and will be swallowed by your reconstruction filter), but it gets better and better as the pitch drops and each additional sample becomes a smaller percentage of the period. The waveform amplitude is not the smallest possible (toggling between +1 and 0 would do that), but it is representative of the amplitude of the smallest truncation effects for bipolar signals.
So, the sound starts dropping at discrete frequencies, and moves increasingly towards a smooth frequency sweep.
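For readers who want to experiment, here is a minimal sketch of the stepped square-wave sequence described above. It is not the code used to make the files in this article. The function name, the parameters, and the choice of a fixed number of cycles per step are all assumptions for illustration (the text only says each step lasts "a short duration"), and the samples are plain integers in a 24-bit word rather than a finished audio file.

```python
def divide_by_n_sweep(bit_level, word_length=24, max_half_period=8, cycles_per_step=4):
    """Return integer samples for a stepped square-wave sweep.

    bit_level: which bit carries the "+1" value (word_length means the LSB).
    max_half_period, cycles_per_step: illustrative assumptions, not values
    taken from the article.
    """
    # "+1" at the chosen bit, expressed in LSBs of the full word.
    amplitude = 1 << (word_length - bit_level)
    samples = []
    for half_period in range(1, max_half_period + 1):
        # One cycle: half_period samples high, then half_period samples low.
        one_cycle = [amplitude] * half_period + [-amplitude] * half_period
        samples.extend(one_cycle * cycles_per_step)
    return samples


if __name__ == "__main__":
    # First two steps at the 24th bit: +1, -1, ... then +1, +1, -1, -1, ...
    print(divide_by_n_sweep(24, max_half_period=2, cycles_per_step=2))
```

Because every value is an exact integer multiple of the chosen bit and every cycle is a whole number of samples, nothing in the sequence needs rounding, which is the point of using square waves here.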
Listening tests
Listen to these files on a quality monitoring system. First, the generated test signal toggling at the level of the fifth bit, moderately loud, so that you can become familiar with what you’ll be listening for at the lower levels:
Sweep at the 5th bit level (-24.1 dB)
All files are 24-bit. The difference is that the +1 level for the 24-bit version is the 24th bit, for the 16-bit version it’s the 16th bit, and for the 5-bit version it’s the 5th bit (making it 2048 times, or 66 dB, louder than the 16-bit version).
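The quoted levels follow from simple arithmetic, sketched below. The helper name is hypothetical, and the convention is inferred from the figures above: a square wave toggling at bit 1 would reach full scale (0 dB), and each bit further down is one halving of amplitude.

```python
import math


def bit_level_db(bit_level):
    """Level, in dB relative to full scale, of a +/-1 toggle at the given bit."""
    # Bit b sits (b - 1) halvings below full scale; each halving is about 6.02 dB.
    return -20 * math.log10(2) * (bit_level - 1)


if __name__ == "__main__":
    for bit in (5, 16, 24):
        print(f"bit {bit:2d}: {bit_level_db(bit):.1f} dB")

    # Ratio between the 5th-bit and 16th-bit sweeps: 11 bits apart.
    ratio = 2 ** (16 - 5)
    print(ratio, f"{20 * math.log10(ratio):.1f} dB")
```

The loop reproduces the -24.1, -90.3, and -138.5 dB figures attached to the three files, and the last line gives the 2048x (about 66 dB) difference between the 5th-bit and 16th-bit sweeps.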
Now for 16-bit. Start by setting your monitoring level to as loud as you would normally play music, by running a song through it. Crank it up, but don’t hurt your ears. Then play this file by itself:
Sweep at the 16th bit level (-90.3 dB)
On a quality system at a relatively high monitoring level, you’ll have no problem hearing this sweep. After listening a few times, you might want to play a song again to remind yourself of the relative level of the test signal. Think about how difficult it would be to hear the test signal during the chunking guitars of a rock tune, but how it might be heard in the fading of quiet piano notes in classical music. (Yes, truncation noise can result in more of a tearing sound than the tone of the test sweep, but the relative levels are still representative.)
Now, 24-bit. Warning: Do NOT try to compensate by cranking your volume level. You are NOT going to hear this signal anyway, and you risk making a mistake and assaulting your ears horrifically, or blowing speakers. No matter how great you think your 24-bit converters are, they and all of your other gear generate noise at levels higher than the 24th bit.
Sweep at the 24th bit level (-138.5 dB)
If you want to try other levels, use a sample editor (such as the free Audacity, or your favorite DAW) to boost the level 2x, or 6.02 dB, for every additional bit that you want to move up.
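If you would rather do the boost in code than in an editor, the same idea can be sketched on raw integer sample values. The function names here are hypothetical, and the sketch assumes the boosted values still fit in the word length (no clipping check is made).

```python
import math


def raise_by_bits(samples, bits):
    """Move integer samples up by `bits` bit positions (a gain of 2**bits)."""
    return [s * (1 << bits) for s in samples]


def gain_db(bits):
    """Gain in dB for moving up `bits` bit positions."""
    return 20 * math.log10(2 ** bits)


if __name__ == "__main__":
    lsb_toggle = [1, -1, 1, -1]          # toggling at the 24th bit
    print(raise_by_bits(lsb_toggle, 8))  # the same toggle at the 16th bit
    print(f"{gain_db(1):.2f} dB per bit, {gain_db(8):.1f} dB for 8 bits")
```

Moving the 24th-bit toggle up by 8 bits gives values of +256 and -256, which is the 16th-bit level inside a 24-bit word.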
Think about the levels, and at what bit levels you’d care to conceal truncation distortion with dither, and I’ll comment further in a future article.
Don’t forget noise-shaped dither, though, which can *increase* the dynamic range of 16-bit audio, up to 120 dB.
http://people.xiph.org/~xiphmont/demo/neil-young.html
Not forgetting it—and I’d say that it increases subjective, or perceived dynamic range—but this article is about getting a feel for the distortion levels of a non-dithered signal…
I’m not sure how, but I’m positive I heard the -138.5 dB waveform. I listened to the files on my iMac 3.4 GHz Quad-Core Intel Core i7 running OS X 10.8.4, with Audio Device settings on 96 kHz, 32-bit float, 0.0 dB, all factory OEM parts. I listened through my Shure SRH840 circumaural closed-back headphones. Static was incredibly low despite the lack of equipment I used to listen. Am I nuts? Was I simply hearing a resonant change in the static?
These are the questions that haunt me… Nah! I already know I’m crazy.
It’s possible that there is some non-linear aspect of the iMac output, including the sample rate conversion, but it would have to be large in order to hear anything.