Sampling Theory
The answer to this question is given by the Nyquist Sampling Theorem, which states that to represent a signal, the sampling rate (or sampling frequency, not to be confused with the frequency content of the sound, as it frequently is!) needs to be at least twice the highest frequency contained in the signal. For example, look back at our time-frequency picture from Section 2.2. It looks like it only contains frequencies up to 8000 Hz. If this were the case, then we would need to sample the sound at a rate of 16000 Hz (16kHz) in order to reproduce the sound. That is, we would need to take sound bites (bytes?!) 16000 times a second.
In the next chapter, when we talk about representing sounds in the frequency domain (as a combination of various amounts of frequency components, which change over time) rather than in the time domain (as a numerical list of sample values of amplitudes), we'll learn a lot more about the ramifications of the Nyquist theorem for digital sound. But for our current purposes, it's a good idea to remember that since the human ear only responds to sounds up to about 20,000 Hz, we need to sample sounds at least 40,000 times a second, or at a rate of 40,000 Hz, to represent these sounds.
You may be wondering why we even need to represent sonic frequencies that high (when the piano, for instance, only goes up to the high 4,000 Hz (or 4kHz) range). The answer is timbral, or in particular, spectral. Remember that we saw in Section 1.4 that those higher frequencies fill out the descriptive sonic information.
Just to review: we measure frequency in cycles per second (CPS), or Hertz (Hz). The frequency range of human hearing is usually given as 20Hz to 20,000Hz (abbreviated as 20kHz), meaning that we can hear sounds in that range. Knowing that, if we decide that the highest frequency we're interested in is 20kHz, then according to the Nyquist Theorem, we need a sampling rate of at least twice that frequency, or 40kHz.
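The arithmetic here is simple enough to write down as a tiny sketch. The function name `minimum_sampling_rate` is our own invention for illustration, not part of any audio library.

```python
def minimum_sampling_rate(highest_frequency_hz):
    """Smallest sampling rate (in Hz) the Nyquist Theorem allows
    for a signal whose highest frequency is highest_frequency_hz."""
    return 2 * highest_frequency_hz


# The two examples from the text.
print(minimum_sampling_rate(8000))    # 16000
print(minimum_sampling_rate(20000))   # 40000
```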
Undersampling: What happens if we sample too slowly for the frequencies we're trying to represent? We take samples (black dots) of a sinewave (in blue) at a certain interval (the sample rate). If the sinewave is changing too quickly (its frequency is too high), then we can't grab enough information to reconstruct the waveform from our samples. The result is that the high frequency waveform masquerades as a lower frequency waveform (how sneaky!). Put another way, the higher frequency is aliased to a lower frequency.
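We can watch this masquerade happen with a few lines of code. This is only a sketch under assumptions of our own: a sampling rate of 10 samples per second and waves at 7 Hz and 3 Hz are hypothetical numbers chosen to keep things small, and we use cosine waves so that the two sets of samples match exactly (with sine waves the slow impostor comes out flipped upside down, but at the same 3 Hz).

```python
import math


def sample_cosine(frequency_hz, sampling_rate_hz, num_samples):
    """Return num_samples samples of a cosine wave at the given frequency."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(num_samples)
    ]


SAMPLING_RATE = 10  # hypothetical: 10 samples per second, so Nyquist is 5 Hz

fast = sample_cosine(7, SAMPLING_RATE, 20)  # 7 Hz is above 5 Hz: too fast
slow = sample_cosine(3, SAMPLING_RATE, 20)  # 3 Hz is safely below 5 Hz

# Compare the two lists sample by sample.
largest_difference = max(abs(a - b) for a, b in zip(fast, slow))
print(largest_difference)  # only floating-point rounding error remains
```

Once the samples are taken, nothing in the numbers tells us whether they came from the 7 Hz wave or the 3 Hz one.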
Picture of an undersampled waveform. This sound was sampled 512 times per second. This was way too slow.
This is the same soundfile as above, but now sampled 44100 times per second (44.1kHz). Much better...
Aliasing
The most common standard sampling rate for digital audio (the one used for CDs) is 44.1kHz, giving us a Nyquist Frequency (defined as half the sampling rate) of 22.05kHz. If we use lower sampling rates, for example, 20kHz, we can't represent a sound whose frequency is above 10kHz. In fact, if we try, we'll get usually undesirable artifacts, called foldover or aliasing, in the signal. In other words: if the sinewave is changing too quickly, we will get the same set of samples that we would have obtained had we been taking samples from a sinewave of lower frequency! As we said before, the higher frequency contributions now act as impostors of lower frequency information, so there are extra, unanticipated, and new low frequency contributions to the sound.
Sometimes we can use this in cool, interesting, and funkadelic ways, and other times (like when the NSA is listening to your phone (not that we're paranoid or anything, but we think they are right now, or maybe not, so we better be a little careful, not that we have anything to hide, except for that one little thing a few years ago), or you are trying to make a beautifully faithful reproduction of an exquisite sound) it just messes up the original sound. So in a sense, these low frequency impostors are aliases for the higher frequencies, and we say that the result of our undersampling is an aliased waveform at a lower frequency.
Foldover aliasing: this picture shows what happens when we sweep a sinewave up past the Nyquist frequency. It's a picture in the frequency domain (which we haven't talked about much yet), so what you're seeing is the amplitude of specific component frequencies over time. The x axis is frequency, the z axis is amplitude, and the y axis is time (read back to front).
As the sinewave (starting at 0 Hz and ending at 44100 Hz over 10 seconds) sweeps up into frequencies above the Nyquist frequency, an aliased wave is reflected below the Nyquist frequency of 22050 Hz. The soundfile can be heard below.
Chirpng: A 10 second sound file sweeping a sine wave from 0 Hz to 44,100 Hz. Notice that the sound seems to disappear after it reaches the Nyquist frequency of 22050 Hz, but then it wraps around as aliased sound back into the audible domain.
Anti-aliasing Filters
Fortunately it's fairly easy to avoid aliasing: we simply make sure that the signal we're recording doesn't contain any frequencies above the Nyquist Frequency. To accomplish this task, we use an anti-aliasing filter on the signal. Audio filtering is a technique that allows us to selectively keep or throw out certain frequencies in a sound, just as light filters (like ones you might use on a camera) only allow certain frequencies of light (colors) to pass. For now, just remember that a filter lets us color a sound by changing its frequency content. We'll talk a lot more about filters in this book.
An anti-aliasing filter is a lowpass filter. It's called a lowpass filter because it only allows frequencies below a certain cutoff frequency to pass. Anything above the cutoff frequency gets thrown away. By setting the cutoff frequency of the lowpass filter to the Nyquist frequency, we can throw out the offending frequencies, those high enough to cause aliasing, while retaining all of the lower frequencies that we want to record.
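To make the idea of a lowpass filter a little more concrete, here is a rough digital sketch. It uses one common textbook design (a "windowed-sinc" filter with a Hamming window), which we picked for illustration and which is not necessarily what any particular recorder uses; in real recording hardware the anti-aliasing filter sits in front of the converter. The function names, the 101-tap length, and the 10kHz cutoff (the Nyquist frequency we would need for a 20kHz sampling rate) are all assumptions of ours.

```python
import math


def lowpass_kernel(cutoff_hz, sampling_rate_hz, num_taps=101):
    """Windowed-sinc lowpass filter coefficients (Hamming window)."""
    fc = cutoff_hz / sampling_rate_hz      # cutoff in cycles per sample
    middle = (num_taps - 1) / 2
    kernel = []
    for n in range(num_taps):
        x = n - middle
        # Ideal lowpass response, with the x == 0 case handled separately.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        kernel.append(ideal * window)
    total = sum(kernel)
    return [k / total for k in kernel]     # gain of exactly 1 at 0 Hz


def apply_filter(signal, kernel):
    """Convolve signal with kernel, keeping only fully overlapped outputs."""
    taps = len(kernel)
    return [
        sum(kernel[k] * signal[i + taps - 1 - k] for k in range(taps))
        for i in range(len(signal) - taps + 1)
    ]


def sine(frequency_hz, sampling_rate_hz, num_samples):
    return [
        math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(num_samples)
    ]


def peak(signal):
    return max(abs(s) for s in signal)


RATE = 44100
kernel = lowpass_kernel(10000, RATE)       # hypothetical 10kHz cutoff

passed = peak(apply_filter(sine(1000, RATE, 1000), kernel))
rejected = peak(apply_filter(sine(18000, RATE, 1000), kernel))
print(round(passed, 3))    # the 1kHz tone keeps almost all of its amplitude
print(round(rejected, 3))  # the 18kHz tone is almost entirely thrown away
```

A practical filter can't cut off perfectly sharply at the cutoff frequency, so frequencies just above the cutoff are only partly reduced. That is one reason real systems leave a little room between the highest frequency they care about and the Nyquist frequency.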
Anti-aliasing filters are a standard component in all digital sound recording, so aliasing is not usually a serious concern to the average user or computer musician (it is, however, a serious concern for audio designers). But, because many of the sounds in computer music are not recorded, but created digitally inside the computer itself, it's important to fully understand aliasing and the Nyquist Theorem. There's nothing to stop us from using a computer to create sounds with frequencies well above the Nyquist frequency. And while the computer has no problem dealing with such sounds as data, as soon as we mere humans want to actually hear them (as opposed to just conceptualizing or imagining them), we need to deal with the physical realities of aliasing, the Nyquist Theorem, and the digital to analog conversion process. Of course, it's also possible to exploit these physical limitations in creative ways . . .
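If we do ask the computer for a sinewave above the Nyquist frequency, we can predict where it will land. The sketch below folds a requested frequency back into the range between 0 Hz and the Nyquist frequency; the function name `aliased_frequency` and the 30,000 Hz example are ours, chosen for illustration.

```python
def aliased_frequency(frequency_hz, sampling_rate_hz):
    """Frequency actually heard when a sinewave at frequency_hz
    is represented at sampling_rate_hz."""
    nyquist = sampling_rate_hz / 2
    # Sampling can't tell apart frequencies one full sampling rate apart.
    folded = frequency_hz % sampling_rate_hz
    # Anything left above Nyquist is reflected (folded over) below it.
    if folded > nyquist:
        folded = sampling_rate_hz - folded
    return folded


print(aliased_frequency(10000, 44100))  # 10000: below Nyquist, unchanged
print(aliased_frequency(30000, 44100))  # 14100: folded back below 22050
print(aliased_frequency(44100, 44100))  # 0: the very top of the sweep
```

This matches the sweep described earlier: as the requested frequency climbs from 22050 Hz to 44100 Hz, the frequency we hear slides back down from 22050 Hz to 0 Hz.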