Sound is defined by variations in air pressure over time. A rapidly vibrating object creates waves in the surrounding air. Concentric spherical waves move away from the source much as circular ripples on a lake move away from the spot where a stone lands. At some point the waves may reach an ear or a microphone, where the air pressure is perceived to rise and fall over time. If the source creates a simple vibration, the observer will perceive a sine wave. If the source creates more complex vibration patterns, or there are several simultaneous sources of vibration, then all the waves combine and the observer perceives only a single complex waveform. At any single moment the perceived air pressure may be recorded as a single value above or below the average air pressure. From this complex waveform the hearing apparatus and the nervous system perform a Fourier-like analysis and break the complex waveform back into its constituent simple waveforms. Thus the original sources are resolved in the brain.
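The idea of breaking a combined waveform back into its simple components can be illustrated with a naive discrete Fourier transform. This is only a sketch: the 400Hz rate, the two tones at 50Hz and 120Hz, and the function name dft_magnitudes are illustrative assumptions, and the ear certainly does not run this code.

```python
import math

def dft_magnitudes(samples):
    """Return the magnitude of each frequency bin of a naive discrete Fourier transform."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        mags.append(math.hypot(re, im) / n)
    return mags

RATE = 400  # samples per second (illustrative value)

# One second of two simple tones (50Hz and a quieter 120Hz) combined into one waveform.
combined = [
    math.sin(2 * math.pi * 50 * i / RATE) + 0.5 * math.sin(2 * math.pi * 120 * i / RATE)
    for i in range(RATE)
]

mags = dft_magnitudes(combined)
# With exactly one second of data, bin k corresponds to k Hz.
peaks = sorted(sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2])
print(peaks)  # [50, 120]
```

The two strongest bins land on the two original tone frequencies, even though only their sum was "heard".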
In a microphone the variation of pressure over time is transformed into a variation of voltage over time. The variation of voltage may be transmitted over a single pair of wires. A ground wire serves as a baseline while a second wire varies in voltage in proportion to the original variation in air pressure. This forms an analog waveform in the voltage. This analog voltage waveform may be sent to a codec (coder/decoder) chip. Sometimes this chip is also referred to as a DSP (Digital Signal Processor) or a PCM (pulse code modulation) chip. A codec chip on a sound card allows for both ADC (analog to digital conversion) and DAC (digital to analog conversion). The process of ADC is known as sampling. Recall that the variations in voltage over time simply represent the variations in air pressure over time. Also recall that at any single moment the perceived air pressure may be recorded as a single value above or below the average air pressure. This is exactly what the ADC does. It records the voltage level relative to the average value at a single moment. It then waits a fixed amount of time and then records the voltage level again. It continues to repeat this process, recording pairs of values, voltage vs. time. Each pair of values represents air pressure versus time and may be digitally stored on a hard disk or CD-ROM. This data may eventually be sent to a DAC, which reconstructs the voltage waveform from the recorded data. This voltage waveform is amplified and the power moves the driver in a loudspeaker, which reproduces the original variation in air pressure. Hence the analog sound is reproduced from a digital source.
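The sampling loop described above (read a level, wait a fixed interval, read again) can be sketched in Python. The function names sample and tone are hypothetical, and a 1kHz sine stands in for the microphone voltage.

```python
import math

def sample(signal, rate_hz, duration_s):
    """Record (time, level) pairs by reading signal(t) at fixed intervals."""
    count = int(round(rate_hz * duration_s))
    interval = 1.0 / rate_hz
    return [(i * interval, signal(i * interval)) for i in range(count)]

def tone(t):
    """Stand-in for the analog voltage: a 1kHz sine wave."""
    return math.sin(2 * math.pi * 1000 * t)

# One millisecond of the tone sampled at the telephone rate of 8kHz gives 8 pairs.
pairs = sample(tone, 8000, 0.001)
for t, level in pairs:
    print(f"{t:.6f}s  {level:+.3f}")
```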
There are two values that must be defined for sampling to occur. These parameters are the sample size and the sampling rate. The latter parameter defines how long the ADC device must wait between taking voltage samples. Thus the device is told to record a certain number of voltage samples per second. This is measured in Hertz (Hz), samples per second, or kilohertz (kHz), thousands of samples per second. Digital phone lines sample the voltage at 8kHz. CD quality sound is sampled at a rate of 44.1kHz. Audio DATs (Digital Audio Tapes) record at 48kHz. The higher the sampling rate, the better the sound quality. However, if too many samples are taken the amount of data recorded can become unmanageable. In addition, very expensive DSP chips are required to handle the high bandwidth of very high sample rates. Hence a compromise between sound quality and data bandwidth is maintained.
The sampling rate must be high enough to capture the highest tone you wish to sample. Simple tones or notes are simple sine waves of a given frequency, which is also designated in Hertz. Hence the A above middle C on the piano is defined as 440Hz. The human ear can perceive tones as low as 20Hz and as high as 20kHz. According to Nyquist's Sampling Theorem the sampling rate must be more than twice the frequency of the highest tone to be captured, or aliasing occurs and artifacts are created in the recording. Therefore, to reproduce tones as high as the human ear can hear (20kHz), the CD quality sampling rate standard was set at just over twice the 20kHz value at 44.1kHz. The exact value of 44.1kHz was chosen because it is a multiple of the rates used by many video formats, allowing for easy synching of digital audio with digital video. When recording, a low pass filter must be used to cut off frequencies above 20kHz to prevent aliasing.
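A short sketch of what aliasing means in practice: a tone above half the sampling rate produces exactly the same samples as a lower "folded" tone. The helper names alias_frequency and sampled_cos are hypothetical.

```python
import math

def alias_frequency(tone_hz, rate_hz):
    """Apparent frequency of a tone after sampling at rate_hz (frequency folding)."""
    folded = tone_hz % rate_hz
    return min(folded, rate_hz - folded)

def sampled_cos(freq_hz, rate_hz, count):
    """Sample a cosine of the given frequency at the given rate."""
    return [math.cos(2 * math.pi * freq_hz * i / rate_hz) for i in range(count)]

print(alias_frequency(20000, 44100))  # 20000: below half the rate, captured as is
print(alias_frequency(30000, 44100))  # 14100: folds back into the audible range
print(alias_frequency(5000, 8000))    # 3000

# A 5kHz tone sampled at 8kHz yields the same numbers as a 3kHz tone.
high = sampled_cos(5000, 8000, 16)
low = sampled_cos(3000, 8000, 16)
same = all(abs(a - b) < 1e-9 for a, b in zip(high, low))
print(same)  # True
```

Once the samples are identical, nothing downstream can tell the two tones apart, which is why the filtering has to happen before the ADC.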
Each time the ADC samples the voltage the measurement must be recorded as a discrete value. This process is known as quantizing. The sample size determines the precision with which the voltage is measured as well as the dynamic range (see below) of the recording. Sample size is measured in bits. Each bit can store two values, a zero or a one. In standard digital telephone lines the sample size is 8 bits and the sample rate is 8kHz. This means that each sample can have one of 256 values (two to the power of eight). A sample size of 16 bits and a sample rate of 44.1kHz is used in CD quality recordings. This means that each value must be rounded to one of 65,536 values (two to the power of sixteen). The more bits that are used to measure each sample, the more precise the recording becomes. However, the quantized values can never be as precise as the original analog value. Some precision is lost, and this is known as quantizing error or quantizing distortion. A compromise must be made between the precision of the sample and the data storage and bandwidth requirements.
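Quantizing and the resulting quantizing error can be sketched as follows. This assumes levels normalized to the range -1.0 to 1.0 and signed integer codes; real converters differ in detail, and the function names are hypothetical.

```python
def quantize(level, bits):
    """Round a level in [-1.0, 1.0] to the nearest signed integer code of the given size."""
    steps = 2 ** bits
    max_code = steps // 2 - 1
    min_code = -(steps // 2)
    code = round(level * max_code)
    return max(min_code, min(max_code, code))

def dequantize(code, bits):
    """Map an integer code back to an approximate level."""
    return code / (2 ** bits // 2 - 1)

level = 0.3
for bits in (8, 16):
    code = quantize(level, bits)
    error = abs(dequantize(code, bits) - level)
    print(bits, code, error)

error_8 = abs(dequantize(quantize(level, 8), 8) - level)
error_16 = abs(dequantize(quantize(level, 16), 16) - level)
```

For the same input level, the 16-bit error comes out a few hundred times smaller than the 8-bit error.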
The dynamic range of a recording is merely the difference between the faintest and loudest sound that can be recorded. The maximum dynamic range of a signal is the sample size times 6dB. So a recording with an 8-bit sample size has a dynamic range of 48dB. 16-bit sample sizes yield a dynamic range of 96dB.
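The 6dB-per-bit rule is a rounding of 20 x log10(2), about 6.02dB, since each extra bit doubles the number of levels. A minimal check (the function name is hypothetical):

```python
import math

def dynamic_range_db(bits):
    """Ratio between the largest and smallest representable level, in decibels."""
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(1), 2))   # 6.02
print(round(dynamic_range_db(8), 1))   # 48.2
print(round(dynamic_range_db(16), 1))  # 96.3
```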
Let's calculate the data storage requirements per hour of digital telephone and CD quality audio respectively. Voice grade recording is done at 8K samples per second in 8 bit (one byte) samples on a single track. This gives you a reasonable reproduction of sound up to around 4kHz. 60 minutes of voice grade recording takes up about 27.5MB:
(8,000 samples/sec) x (8 bits/sample) x (1 channel) x (3,600 sec) = 230,400,000 bits
230,400,000 bits x (1 byte/8 bits) x (1MB/1,048,576 bytes) = 27.5MB
CD audio grade recording is done at 44.1K samples per second in 16 bit (two bytes) samples on two tracks (stereo). This gives you a reasonable reproduction of sounds up to around 20kHz. 60 minutes of CD audio grade recording takes up about 606MB:
(44,100 samples/sec) x (16 bits/sample) x (2 channels) x (3,600 sec) = 5,080,320,000 bits
5,080,320,000 bits x (1 byte/8 bits) x (1MB/1,048,576 bytes) = 605.6MB
It is no coincidence that the data capacity of the CD-ROM is just enough to store a little over one hour of CD quality audio.
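The per-hour storage figures follow directly from rate x sample size x channels x time, and can be checked with a few lines of arithmetic. The function name storage_bytes is hypothetical, and 1MB is taken as 1,048,576 bytes.

```python
def storage_bytes(rate_hz, bits, channels, seconds):
    """Uncompressed storage needed for a recording, in bytes."""
    return rate_hz * bits * channels * seconds // 8

HOUR = 3600
MB = 1024 * 1024

voice = storage_bytes(8000, 8, 1, HOUR)
cd = storage_bytes(44100, 16, 2, HOUR)

print(voice, round(voice / MB, 1))  # 28800000 27.5
print(cd, round(cd / MB, 1))        # 635040000 605.6
```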
It is an interesting fact that both the motor and sensory nervous systems work on a logarithmic and not a linear scale. For example, one group of motor neurons is fired in coordination to lift a given weight. To lift twice that weight two groups of motor neurons are fired. To lift four times the original weight three groups of motor neurons are fired. Each time the weight is doubled only one more neuron group is recruited. This is a logarithmic relationship. As a consequence you have much more delicate and precise control when dealing with light objects, but you barely notice large changes when you're holding heavy objects. For example, you can carefully adjust your muscle movements to cut vegetables, but when you're holding a stack of logs and someone adds one more log you barely feel the difference. This is an efficient use of neurons.
Your sensory system also responds logarithmically to stimuli. When sounds are faint you can perceive the slightest change in volume. But when sounds become very loud you can only perceive large changes in volume. So why is all this important? Well, remember all that data you need to record digitally? What if instead of recording it linearly you recorded it logarithmically, so that slight changes at low volumes were recorded with more precision than small changes at high volumes? That's what we do in a data compression process known as companding. One form of companding is called μ-law, and it allows you to record 12 bits or even 16 bits of data in an 8-bit sample size! There is a similar system called A-law, but this is used mainly in Europe.
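The effect of companding can be sketched with the continuous μ-law curve, y = sign(x) ln(1 + μ|x|) / ln(1 + μ) with μ = 255. This is an illustration only: telephone codecs use a piecewise-linear approximation of this curve, and the helper names and the simple round-to-127 quantizer below are assumptions made for the example.

```python
import math

MU = 255  # compression parameter commonly used for mu-law

def compress(x):
    """Map a level in [-1, 1] onto the logarithmic mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress()."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def to_8bit(v):
    """Crude 8-bit quantizer for the example: round to a code in -127..127."""
    return round(v * 127)

def from_8bit(code):
    return code / 127

quiet = 0.01  # a faint sound, 1% of full scale

linear_err = abs(from_8bit(to_8bit(quiet)) - quiet)
companded_err = abs(expand(from_8bit(to_8bit(compress(quiet)))) - quiet)

print(linear_err)     # roughly 0.002
print(companded_err)  # roughly 0.00001
```

With the same 8 bits, the faint sound is stored far more accurately after companding, at the cost of coarser steps near full volume.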
Under UNIX three device files access the DSP chip: /dev/dsp, /dev/dspW and /dev/audio. Each is a symbolic link to /dev/dspN, /dev/dspWN and /dev/audioN respectively, where N is some number greater than or equal to zero. Digital data written to these devices is played in an analog format. Analog input is converted to digital. To play an audio file named sound.wav you merely need to type cat sound.wav > /dev/dsp. To record a file type cat /dev/dsp > sound.wav. Alternately you may use dd, as in dd if=/dev/dsp of=/sound.wav bs=4 count=1 (note the bs=4 denotes four bytes for stereo 16-bit audio). These device files respond to the normal ioctl(), open(), close(), read() and write() system calls. Some codecs can record and play back simultaneously. This is called full duplex operation. Others cannot record and play back simultaneously. They are said to operate in half duplex.
The only difference between these files is that /dev/dsp uses 8-bit linear encoding while /dev/audio uses μ-law encoding. The default data format is 8kHz/unsigned/mono for both. /dev/dspW uses 16-bit signed little endian encoding. Intel x86 based computers and PC sound cards use little endian encoding. Little endian computers store the least significant byte of a multi-byte value first, while big endian computers store the most significant byte first. 16-bit audio can be safely stored and accessed as a 16-bit (short) integer. However, one needs to be careful about the endianness of RISC computers.
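The two sample layouts mentioned here can be illustrated with Python's standard struct module. The sketch builds a buffer of unsigned 8-bit mono samples at 8kHz and shows how one 16-bit sample is laid out in little and big endian byte order. The function name tone_u8 and the 440Hz test tone are illustrative assumptions, and nothing is written to a device.

```python
import math
import struct

def tone_u8(freq_hz, seconds, rate_hz=8000):
    """Unsigned 8-bit mono samples: silence sits at 128, peaks near 1 and 255."""
    n = int(rate_hz * seconds)
    return bytes(
        128 + round(127 * math.sin(2 * math.pi * freq_hz * i / rate_hz))
        for i in range(n)
    )

buffer = tone_u8(440, 1.0)
print(len(buffer), buffer[0])  # 8000 128

# The same 16-bit sample value stored in both byte orders.
sample_value = 0x1234
little = struct.pack("<h", sample_value)  # least significant byte first
big = struct.pack(">h", sample_value)     # most significant byte first
print(little.hex(), big.hex())  # 3412 1234

# Reading little endian data with the wrong byte order gives a different number.
wrong = struct.unpack(">h", little)[0]
print(hex(wrong))  # 0x3412
```

On a system that provides the device, writing such a buffer to /dev/dsp would be the programmatic counterpart of the cat command described earlier.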
The synthesizer device may be used to play back or create electronic music and/or sound effects. There are two major categories of synthesizers on most PC sound cards: FM synthesizers and Wave Table synthesizers. The first category simulates the timbre (sound quality) of various real musical instruments by combining various artificially created FM (Frequency Modulated) waveforms like sine waves and triangle waves. The basic waveforms are generated by simple oscillators and filters. The process of combining them is called modulation. Much guesswork and black magic is used to produce an acceptable approximation of an authentic musical instrument. However, in recent years progress has been made by using complex mathematical models called physical models to calculate the actual sound of melodic instruments. While the timbre of isolated notes is not as realistic as the digital samples used in wave tables, the overall dynamic behavior can be more persuasive.
The FM synthesizer interface is composed of up to 36 cells. These include 18 modulator cells and 18 carrier cells. Each carrier cell can generate a base waveform. Each modulator cell can modulate (modify) a base carrier cell. This allows for 18 simultaneous channels called voices. A voice is merely a single note. If all 18 channels are used, each voice can generate a note that is composed of two operators: the carrier and the modulator. Think of each operator as a mini-synthesizer that can be combined in series with another operator to create a complex sound. In fact modulated waveforms can be multiplied or added to each other. If fewer than 18 voices are used then some operators are freed to combine with other channels. Up to four operators can be combined to create a single voice. In addition, three additional channels may be reserved for built in percussion instruments.
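A two-operator voice, one modulator feeding one carrier, can be sketched with the textbook formula sin(2π fc t + I sin(2π fm t)). This is a minimal sketch under stated assumptions: the function name, the modulation index parameter and the frequencies are illustrative, and the formula is not taken from any particular chip's documentation.

```python
import math

def fm_voice(carrier_hz, modulator_hz, index, rate_hz, count):
    """Generate samples of a carrier whose phase is varied by a modulator operator."""
    out = []
    for i in range(count):
        t = i / rate_hz
        mod = index * math.sin(2 * math.pi * modulator_hz * t)  # modulator operator
        out.append(math.sin(2 * math.pi * carrier_hz * t + mod))  # carrier operator
    return out

# With index 0 the modulator has no effect and the voice is a plain sine wave.
plain = fm_voice(440, 880, 0.0, 8000, 64)
# A larger index adds sidebands and so changes the timbre of the note.
bright = fm_voice(440, 880, 2.0, 8000, 64)
```

Changing only the index and the modulator frequency alters the timbre while the carrier still sets the pitch, which is the sense in which each operator acts as a mini-synthesizer.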
Three data structures hold the data for FM synthesis. The FM Parameter Data Structure holds the rhythm information and percussion instrument settings. The FM Note Data Structure is used to assign a note frequency and an octave to each voice. The FM Voice Data Structure is the most complex and assigns all the voice properties. It can be used to define a voice's attack, decay, sustain and release parameters. It can also add vibrato, higher level harmonics, chorus, feedback, delay and many other parameters. A set of predefined parameter settings can be saved in a patch file. One simply loads a patch file to create a predefined sound. Most users just use a set of standard patch files which include 128 melodic and 47 percussive instruments.
Yamaha has been a pioneer in PC FM synthesizers. The Yamaha OPL2 is the FM synthesizer chip used in most older Creative Labs Sound Blaster 1.x/2.x cards and AdLib cards. It has a maximum two operator capacity per voice and allows for nine simultaneous voices. The OPL2 does not allow for realistic timbres. Yamaha's OPL3 chip has a four operator per voice capacity and allows for up to 18 stereo voices, yielding a slightly improved timbre. It can generate stereo tones at 44.1kHz. Both chips include five percussion instruments.
Wave Table synthesizers are the second major category of synthesizers. Wave Table synthesizers are vastly superior to FM synthesizers. They play back pre-recorded digital samples of real musical instruments, and the effect is fairly realistic. Many cards now include wave tables, such as the Sound Blaster 32 series, the Sound Blaster 64 series and the Gravis Ultrasound GF1. The Yamaha SW60XG and DB50XG Wave Table cards as well as the Creative Labs Wave Blaster series are Wave Table add on cards. Wave Table add on cards are connected to MIDI ports on the sound card as external synthesizers.
In Wave Table synthesis a desired pitch is achieved by taking a prerecorded sample of one pitch and playing back the recording faster or slower to match the desired pitch. This is similar to spinning a record player faster or slower, making the pitch rise and fall respectively. However, with digitally recorded tones the sample rate is already set to 44.1kHz. Increasing the pitch increases the sample rate and decreasing the pitch decreases the sample rate. In order to play the tone at a different pitch at 44.1kHz the tone must be re-sampled by interpolating values. This is normally achieved by linearly interpolating a value based on the two proximal sample values. However, this leads to harmonic distortion. Use of patented E-mu technology in the EMU8000 ASIC allows for up to four data points to be interpolated as a curved function like a Taylor polynomial. Up to 32-bit precision is used to calculate the final 16-bit sample. This new technique yields much greater fidelity.
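The simple two-point linear interpolation described above can be sketched as follows. The function name resample_linear and its ratio parameter are hypothetical. A ratio above 1 raises the pitch and a ratio below 1 lowers it, while the output is still meant to be played at the original sample rate. The four-point E-mu method is not shown.

```python
def resample_linear(samples, ratio):
    """Step through samples `ratio` positions at a time, interpolating between neighbors."""
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        # Weighted average of the two proximal sample values.
        out.append(samples[i] * (1 - frac) + nxt * frac)
        pos += ratio
    return out

ramp = [0, 10, 20, 30]
print(resample_linear(ramp, 0.5))  # [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
print(resample_linear(ramp, 1.5))  # [0.0, 15.0, 30.0]
print(resample_linear(ramp, 2.0))  # [0.0, 20.0]
```

A straight ramp is reproduced exactly; for curved waveforms the straight-line guess between samples is what introduces the distortion mentioned in the text.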
The Sound Blaster 32 series Wave Table sports 32 voices while the Sound Blaster 64 series boasts 64 simultaneous voices. In the latter, 32 voices are in hardware while additional voices may be downloaded as "Sound Fonts." Up to 28MB of additional RAM may be added to a sound card so that additional sound samples can be loaded and played. Custom sound fonts can be created with the use of Vienna 2.0 SF Studio software.
Under UNIX the /dev/sequencer file is used to access and control internal and external synthesizers. Up to fifteen synthesizers and sixteen MIDI ports can be addressed at one time. Alternately /dev/music (formerly /dev/sequencer2) may be used to access both MIDI devices and synthesizers in one uniform format. This allows for device independent programming at the cost of some precision afforded by the /dev/sequencer file format. Both /dev/sequencer and /dev/music accept formatted command input to control the device. These device files may not be used to play audio files. Instead /dev/midi, /dev/dsp or /dev/audio are used for this purpose. A little-known fact is that the /dev/dmfm0 file allows raw access to the FM synthesizer at a register level.
MIDI (Musical Instrument Digital Interface) is a serial-like port used to connect synthesizers, drum kits, stage props and stage lighting equipment. A baud rate of 31.25K is used over a 5 pin DIN connector. 16 channels are available. The messages carry no timing information: each byte is sent as soon as possible, so packets of data cannot be time stamped. Older cards use a 6850 UART MIDI device. The newer Roland MPU-401 and compatible cards have an "intelligent mode" allowing for some advanced controls.
MIDI signals are sent over the cable as one status byte plus one or more data bytes. The signal can be used to encode music. For example, if a keyboard synthesizer is connected, a status byte is sent including a channel number and then two data bytes are sent. The first describes the key number and the second quantifies how hard the key was pressed. Therefore the pitch and volume are transmitted and can be played or stored in a MIDI file. MIDI files can contain musical works but contain no information about the instrument itself. Therefore the file may be recorded on a keyboard but played back with the timbre of any instrument desired.
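The three-byte message described above (a status byte with the channel number, then key number and key velocity) can be sketched like this. The helper names are hypothetical. The value 0x90 is the standard MIDI "note on" status, and key 60 is middle C by MIDI convention.

```python
NOTE_ON = 0x90  # status nibble for "note on"; the low nibble carries the channel

def note_on(channel, key, velocity):
    """Build a note-on message: one status byte followed by two data bytes."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= key <= 127 and 0 <= velocity <= 127):
        raise ValueError("data bytes must be 0-127")
    return bytes([NOTE_ON | channel, key, velocity])

def decode(message):
    """Split a three-byte message back into (status, channel, key, velocity)."""
    status, key, velocity = message
    return status & 0xF0, status & 0x0F, key, velocity

msg = note_on(0, 60, 100)  # middle C on the first channel, pressed fairly hard
print(msg.hex())           # 903c64
print(decode(msg))         # (144, 0, 60, 100)
```

Note that nothing in the message names an instrument, which is why the same data can be played back with any timbre.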
Under UNIX the MIDI interface is accessed via the device file /dev/midi, which is a symbolic link to /dev/midi0N, where N is some number starting with zero. A MIDI audio file may be redirected as standard input to /dev/midi to be played. In most ways /dev/midi works like /dev/tty. A little-known file, /dev/dmmidi0, allows raw read() / write() access to the MIDI port for such things as lighting control.
Mixers are used to select the input source and to control the recording and playback volume levels. The input source can be the CD-input, line-input or microphone-input. In addition, digital audio and synthesizer sources may be accepted and mixed with these other sources. The microphone-input is generally the default input.
Under UNIX a mixer is accessed by the device file /dev/mixer, which is generally a symbolic link to /dev/mixerN, where N is some number starting with zero. The mixer addresses up to thirty-one different channels. Each channel is assigned a number from 0-30, and some channels have mnemonic names referring to the device it controls. Some channels are in stereo. Each channel can be assigned a value from 0-100 to control volume.
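For stereo channels the two 0-100 levels have to travel together as one value. The text does not say how they are encoded; the sketch below assumes the layout commonly described for this mixer interface, with the left level in the low byte and the right level in the next byte. Treat that layout and the function names as assumptions, and note that no device is opened here.

```python
def pack_levels(left, right):
    """Pack two 0-100 channel levels into one integer (assumed: left in the low byte)."""
    for level in (left, right):
        if not 0 <= level <= 100:
            raise ValueError("level must be 0-100")
    return left | (right << 8)

def unpack_levels(value):
    """Recover (left, right) from a packed value."""
    return value & 0xFF, (value >> 8) & 0xFF

packed = pack_levels(75, 50)
print(packed)                 # 12875
print(unpack_levels(packed))  # (75, 50)
```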
When in trouble, try to look up the IRQ, DMA channels and I/O ports for your sound card. For most Sound Blaster compatible cards IRQ5, DMA channels 1 and 5 and I/O ports 0x220 and 0x330 are used. Under UNIX you can find this information by running "cat /dev/sndstat". Under DOS try "msd". In Windows look in your control panel.
Copyright (c) 1989 - 2008 Net Express All Rights Reserved.