Killing Technology is an ongoing series that dives into the science behind the music, taking the classic 1987 Voivod album as its inspiration.
In the first part of this series, I gave a very brief introduction to sampling and more specifically, how samples go from a set of points in time to the brutal beauty of Bolt Thrower (say that three times fast). With this knowledge in mind, I’d like to cover the concepts of downsampling, upsampling, and oversampling since they are often confused by many audiophiles and metal heads alike.
Downsampling is simply the process of reducing the sample rate of the original signal. Let’s say you originally recorded Bolt Thrower’s new album (a miracle in itself) at 24-bit/96kHz in your home studio and wanted to share its glory with the rest of the world, which has standardized on the CD. What do you do? You downsample, which means you take the original 96kHz waveform and run it through a process that re-samples it at 44.1kHz. Strictly speaking, going from 24 bits to 16 bits is a separate step (a bit depth reduction), but the two usually happen together when preparing a CD master, so I’ll treat them as one conversion here.
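So the first question that springs to mind is: doesn’t downsampling reduce the overall fidelity, since we are throwing away samples and bits? Not audibly, provided the target sample rate keeps the Nyquist frequency (half the sample rate) above the range of human hearing. In other words, after converting from 24-bit/96kHz to 16-bit/44.1kHz we can still reconstruct frequencies up to 22.05kHz, slightly above the roughly 20kHz upper limit of human hearing. Anything above 22.05kHz in the original has to be filtered out before resampling so that it doesn’t alias, but you couldn’t hear it anyway.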
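To put numbers on that, here is a minimal Python sketch of the arithmetic. The function names are my own, chosen just for illustration, and this is only the bookkeeping, not an actual resampler.

```python
from fractions import Fraction

def nyquist_hz(sample_rate_hz):
    """Highest frequency a sample rate can represent: half the rate."""
    return sample_rate_hz / 2

def resample_ratio(source_rate_hz, target_rate_hz):
    """Target rate divided by source rate, reduced to lowest terms."""
    return Fraction(target_rate_hz, source_rate_hz)

if __name__ == "__main__":
    print(nyquist_hz(96_000))              # 48000.0
    print(nyquist_hz(44_100))              # 22050.0
    print(resample_ratio(96_000, 44_100))  # 147/320
```

The 147/320 ratio is why going from 96kHz to 44.1kHz is not as simple as dropping every other sample: a resampler has to produce 147 output samples for every 320 input samples.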
But what about the quantization noise introduced by taking all those 24-bit samples and representing them as 16-bit values? Won’t that affect the music’s overall fidelity? Not really. Through a technique called dithering, we can keep the noise introduced by the bit depth reduction at levels not audible by the human ear. Though it sounds a bit unintuitive, dithering actually adds a small amount of random noise to the signal before it is rounded to 16 bits. This makes the quantization errors random rather than tied to the music, so instead of distortion you get a faint, steady hiss, and on average the 16-bit samples track the original waveform. Still with me? Suffice it to say, you will be hard pressed to tell the difference between the original 24-bit/96kHz recording and the dithered down 16-bit/44.1kHz release.
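If that still sounds like black magic, here is a small Python sketch that shows the effect. It is a toy under stated assumptions, not how any particular mastering tool works: the test tone, its level (0.4 of one 16-bit step), the triangular dither, and all the names are my own choices for illustration.

```python
import math
import random

SAMPLE_RATE = 44_100   # one second of audio
TONE_HZ = 441          # arbitrary test tone
PEAK_STEPS = 0.4       # peak level, in units of one 16-bit step (assumed)

def quantize(value):
    """Round to the nearest whole 16-bit step."""
    return math.floor(value + 0.5)

def tpdf_noise(rng):
    """Triangular dither spanning +/-1 step: the sum of two uniform draws."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def reduce_bit_depth(samples, dither, seed=1987):
    """Quantize every sample, optionally adding dither first."""
    rng = random.Random(seed)
    out = []
    for s in samples:
        noise = tpdf_noise(rng) if dither else 0.0
        out.append(quantize(s + noise))
    return out

def tone_gain(original, converted):
    """How much of the original tone survives: 1.0 is all of it, 0.0 is none."""
    num = sum(o * c for o, c in zip(original, converted))
    den = sum(o * o for o in original)
    return num / den

tone = [PEAK_STEPS * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
plain = reduce_bit_depth(tone, dither=False)
dithered = reduce_bit_depth(tone, dither=True)

print(tone_gain(tone, plain))     # 0.0: every sample rounded to silence
print(tone_gain(tone, dithered))  # close to 1.0: the tone survives under the hiss
```

Without dither, a tone quieter than half a step simply vanishes. With dither, the same tone is still there on average, buried in a little noise, which is exactly the trade you want.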
Many high-end studios record in 24-bit since, within a studio context, there are some real advantages to be had. First, 24-bit samples can represent a wider dynamic range, theoretically about 144dB instead of 16-bit’s 96dB. Second, and even more importantly, recording at a higher bit depth yields a lower noise floor, that is, less intrinsic noise from the recording system itself. Both of these properties give the engineer more “breathing” room during the recording process, since low-level detail such as note decay or very quiet passages may sound odd or distorted with only 16 bits to work with. Couple that with a complex production chain that includes dynamic range compression and brickwall limiting, and the more headroom you have the better! Hear for yourself.
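Those dynamic range figures come from a simple rule of thumb: each bit is worth roughly 6dB. A quick sketch of the calculation (the function name is mine, and this is the theoretical ceiling, not what any real converter achieves):

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

if __name__ == "__main__":
    print(round(dynamic_range_db(16), 1))  # 96.3
    print(round(dynamic_range_db(24), 1))  # 144.5
```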
However, by the time the music hits your ears, all of the above is moot. The fact is that how the engineer mixed and mastered the music matters far more than the sample rate or bit depth used to record it. That’s part of the reason why my partner in crime is constantly chastising hi-res distributors like HDTracks for not clearly stating the provenance of their masters. For example, I would much rather have a 16-bit/44.1kHz FDR master sourced from the original analog tapes than a 24-bit/96kHz compressed nightmare sourced from the remastered brickwalled CD.
As you have probably already surmised, upsampling and oversampling are the exact opposite of downsampling. Simply put, both terms describe the process of increasing the sample rate of the original signal. The only real difference is that oversampling implies a sample rate above what is needed to satisfy the Nyquist rate for the original signal. But the fact is, they are mathematically equivalent.
Please note, upsampling does not add any new information to the original signal. I will repeat (it is a very important concept), upsampling does not add any information to the original signal. So if you rip a CD, which by definition is 16-bit/44.1kHz, into 24-bit/96kHz or even 24-bit/192kHz FLAC, all you’ve done is waste hard disk space. You can’t put back what’s not there.
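Here is a toy Python sketch of why that is. Assumptions up front: the helper names are hypothetical, the handful of sample values is made up, and the linear interpolation is a crude stand-in for the band-limited filter a real resampler would use. The point survives the simplification: everything in the “hi-res” version is computed from the CD samples, so throwing the extra data away gets you the CD back exactly.

```python
def pad_16_to_24(sample):
    """Widen a 16-bit sample to 24 bits by appending eight zero bits."""
    return sample << 8

def truncate_24_to_16(sample):
    """Drop the eight low bits again."""
    return sample >> 8

def upsample_2x_linear(samples):
    """Double the sample rate by inserting the midpoint between neighbours."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) / 2)
    out.append(samples[-1])
    return out

def decimate_2x(samples):
    """Keep every other sample, halving the sample rate."""
    return samples[::2]

cd_samples = [0, 1200, -345, 32767, -32768, 17]  # made-up 16-bit values

widened = [pad_16_to_24(s) for s in cd_samples]
assert [truncate_24_to_16(s) for s in widened] == cd_samples

doubled = upsample_2x_linear(cd_samples)
assert decimate_2x(doubled) == cd_samples
```

Both round trips come back bit for bit identical, which is another way of saying the bigger file never held anything the CD rip didn’t.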
So why offer these higher sampling rates at all if dithering down to 16-bit/44.1kHz is “good” enough? There are those who feel that, given today’s storage and playback capabilities, there is really no need to add more quantization noise as part of the downsampling process. Moreover, some argue that higher sampling rates have intrinsic benefits during digital-to-analog conversion at playback, such as being more resilient to jitter. Whether or not these differences are really audible is another story entirely, but suffice it to say that with the inevitable death of the CD, the whole process of dithering down these days seems superfluous anyway.
Finally, when you needle drop at a higher sampling rate than the venerable CD, you are not upsampling, since you are converting analog to digital, not digital to digital. All you are doing is deciding what the initial sampling rate is, and therefore what Nyquist frequency you want to be able to reproduce in the digital domain. Some folks like to needle drop at 24-bit/96kHz while others seem to think 24-bit/192kHz is a good idea (it’s not). But the reason they choose a higher sampling rate is that a cartridge can produce frequencies way past the 22.05kHz boundary (half of the CD’s 44.1kHz sample rate). Can you hear frequencies above that? Not on your life, but there is an argument to be made that if you are preserving vinyl as a digital archive, you want to be able to reconstruct the original analog waveform as closely as possible.
In my next article in this series, I will throw samples completely out the window (literally), as I will be covering every audiophile’s worst nightmare, the MP3. Or is it? Stay tuned.