Glossary

“Active” device—one that requires external power to function.

Aliasing—sampling rates that are too low to map a waveform accurately prohibit the faithful restoration of signals. A fault known as aliasing occurs when too few samples cause a device to interpret the voltage data as a waveform different from the one originally sampled.
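
As a rough illustration of the fault described above, the following sketch estimates which frequency a device would read back when a sine wave is sampled too slowly. It assumes an ideal sampler with no anti-aliasing filter, and the function name `alias_frequency` is hypothetical.

```python
def alias_frequency(signal_hz, sample_rate_hz):
    """Return the frequency a sampled sine wave appears to have."""
    # Fold the frequency into the range 0..sample_rate, then reflect
    # anything above half the sample rate back down.
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


# A 30 kHz tone sampled at 44.1 kHz is read back as a 14.1 kHz tone.
print(alias_frequency(30_000, 44_100))
# A 1 kHz tone lies well below half the sample rate and is unchanged.
print(alias_frequency(1_000, 44_100))
```

Any frequency below half the sample rate passes through unchanged, which is why converters filter out higher frequencies before sampling.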

Amplitude—a measure of the change of air pressure in a soundwave above normal (compression) and below normal (rarefaction). In other words, it is a measure of the strength of a sound without reference to its frequency. We perceive amplitude as loudness and express it in decibels (dB) of sound pressure level (SPL).

Analog audio—the representation of a signal by continuously variable and measurable physical quantities, such as pressure or voltage.

Analog-to-digital converter (ADC)—converts analog signals to digital code using pulse code modulation (PCM).

Audio volume—in relation to the measurement of loudness, “audio volume” refers to a subjective combination of level, frequency, content, and duration.

Bit—an abbreviation of the expression “binary digit.” Binary means something based on or made up of two things, and in digital audio systems, these two things are the numbers 0 and 1.

Bit depth—stipulates how many binary digits (bits, each either 0 or 1) are used to represent each sample of a waveform.

Bit rate—indicates how many bits are transmitted per unit of time in digital audio.
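
For uncompressed PCM audio, the bit rate can be estimated by multiplying the sample rate, the bit depth, and the number of channels. The sketch below assumes that relationship (it does not apply to files reduced by a codec), and `pcm_bit_rate` is a hypothetical helper name.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second for uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


# Stereo audio at 44.1 kHz and 16 bits (the compact disc format).
cd_bits_per_second = pcm_bit_rate(44_100, 16, 2)
print(cd_bits_per_second)        # 1411200 bits per second
print(cd_bits_per_second // 8)   # 176400 bytes per second (8 bits per byte)
```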

Byte—a group of eight digits, each digit being either 0 or 1.

Codec—an abbreviation of coder/decoder, a codec is a software application using algorithms to encode a digital signal into another format, often to reduce the size of the file (hence the term compression). Once encoded, the file must be decoded to re-create the original audio. If no information is lost in the process, the codec is called lossless, but if information has been removed, the codec is a lossy one.

Coloration—an audible change in the quality (timbre) of a sound.

Comb filtering—the short delays between two or more microphones used to capture a complex waveform can cause comb filtering, a set of mathematically related (and regularly recurring) cancellations and reinforcements in which the summed wave that results from the inadvertent cutting and boosting of frequencies resembles the teeth of a comb.
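
The regular spacing of the cancellations can be sketched numerically. Assuming two copies of the same signal are summed at equal level with a fixed delay between them, the cancellations fall at odd multiples of half the reciprocal of the delay. The function name `comb_notches` and the 1 ms delay are illustrative assumptions.

```python
def comb_notches(delay_s, max_hz):
    """Frequencies (Hz) cancelled when a signal is summed with a delayed copy."""
    notches = []
    k = 0
    while True:
        # Cancellation occurs where the delay equals an odd number of half cycles.
        frequency = (2 * k + 1) / (2 * delay_s)
        if frequency > max_hz:
            break
        notches.append(frequency)
        k += 1
    return notches


# A 1 ms delay (roughly 34 cm of extra path length) up to 5 kHz.
for hz in comb_notches(0.001, 5_000):
    print(round(hz))
```

With a 1 ms delay the cancellations land near 500, 1,500, 2,500, 3,500, and 4,500 Hz, the evenly spaced "teeth" of the comb.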

Complex soundwave—a waveform composed of a collection of sine waves at integer multiples of the fundamental frequency, that is, a complex set of frequencies arranged in a harmonic or overtone series above the lowest frequency of the spectrum.

Condenser microphone—a mic that operates electrostatically. Its capsule consists of a movable diaphragm and a fixed backplate, which form the two electrodes of a capacitor (previously called a condenser; hence, the name) that has been given a constant charge of DC voltage by an external power source. As soundwaves strike the diaphragm, the distance between the two surfaces changes, and this movement causes the charge-carrying ability (capacitance) of the structure to fluctuate around its fixed value. The resulting variation in voltage creates an electrical current that corresponds to the acoustic soundwave.

Convolution—the blending or convolving of one signal with another. Convolution is the method used to create reverb plugins based on impulse responses.
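
A minimal sketch of the operation itself, written as direct (time-domain) convolution of a dry signal with an impulse response. Reverb plugins typically use faster methods, so this is only an illustration, and the sample values are made up.

```python
def convolve(dry, impulse_response):
    """Direct convolution of a dry signal with an impulse response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, sample in enumerate(dry):
        for j, tap in enumerate(impulse_response):
            # Every input sample triggers a scaled copy of the impulse response.
            out[i + j] += sample * tap
    return out


dry = [1.0, 0.5]          # two samples of a dry signal
ir = [1.0, 0.0, 0.25]     # direct sound plus one quieter reflection
print(convolve(dry, ir))  # [1.0, 0.5, 0.25, 0.125]
```

The output is longer than the input because the reflection in the impulse response continues after the dry signal has ended, which is the reverb tail.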

Critical distance (reverberation radius)—the distance from a sound source to that point in an enclosed space where the direct and reverberant fields are equal in level; that is, the total energy of one equals the other. Physicists have determined that the level of direct sound drops by 6.0 dB for every doubling of distance in a truly free field (that is, outdoors; the drop is somewhat smaller in an enclosed space) and that the level of reverberated sound remains more or less constant everywhere in a room. The ratio of direct to reverberated sound is 1:1 at the critical distance (see Figure Glossary.1).

Figure Glossary.1 Critical distance.

The critical distance may be found in any room by at least two methods: (1) place a microphone relatively far from a sound source and then move a second mic increasingly closer to the source until the difference in level between the two microphones is less than 3.0 dB; (2) either during a rehearsal or by situating a boom box at the performers’ location (set between stations to produce white noise), recordists use an SPL meter to measure the SPL close to the source (approximately 30 centimeters or a foot away) and then double the distance and measure again, a point at which, according to the Inverse Square Law, the level will have decreased between 4.0 and 6.0 dB. After making note of the new level, they double the distance and take another measurement, repeating this procedure until the SPL stops dropping. By moving back to the area where the level began to remain constant, they find the critical distance.

At distances less than a third of the reverberation radius, the direct sound will be at least 10.0 dB stronger than the reverberated sound; hence, reverberation does not play a prominent role in the sound captured by a microphone. Conversely, at distances three times that of the critical distance, the direct sound is at least 10.0 dB weaker than the reverberated sound, and a microphone will primarily capture reverberation.
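
The figures above can be checked with a small sketch. It assumes the idealized model described in this entry: the direct sound falls by 6.0 dB for each doubling of distance, and the reverberant level is constant throughout the room. The function name is hypothetical.

```python
import math


def direct_to_reverb_db(distance, critical_distance):
    """Level of direct sound relative to reverberant sound, in dB."""
    # The two fields are equal (0 dB) at the critical distance.
    return 20 * math.log10(critical_distance / distance)


critical = 3.0  # metres, an arbitrary example value
for distance in (1.0, 3.0, 9.0):
    print(distance, round(direct_to_reverb_db(distance, critical), 1))
```

Under these assumptions the ratio is about +9.5 dB at one third of the critical distance, 0 dB at the critical distance, and about -9.5 dB at three times the critical distance, close to the 10.0 dB figures quoted above.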

Damping—a method of controlling the way frequencies die away or roll off in a reverb tail created during digital reflection simulation.

dBFS (dB Full Scale)—audio level in decibels referenced to digital full scale, that is, referenced to the clipping point (“full scale”) in a digital audio system. 0.0 dBFS represents the maximum level a signal may attain before it incurs clipping.
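
As a sketch of how a sample value maps onto this scale, the following assumes samples are stored as floating-point numbers in which 1.0 is full scale (a common convention, but an assumption here). The helper name `to_dbfs` is hypothetical.

```python
import math


def to_dbfs(amplitude, full_scale=1.0):
    """Convert a sample amplitude to a level in dB relative to full scale."""
    if amplitude == 0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(abs(amplitude) / full_scale)


print(to_dbfs(1.0))            # 0.0, the clipping point
print(round(to_dbfs(0.5), 2))  # -6.02, half of full scale
```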

dBTP (dB True Peak)—maximum inter-sample peak level of an audio signal in decibels referenced to digital full scale, that is, referenced to the clipping point (“full scale”) in a digital audio system. 0.0 dBTP represents the maximum level a signal may attain before it incurs clipping.

Decibel (dB)—one tenth of a bel. Named after Alexander Graham Bell, a bel expresses the logarithmic relationship between any two powers. In acoustics, large changes in measurable physical parameters (pressure, power, voltage) correspond to relatively small changes in perceived loudness. Thus, linear scales, because of the huge numbers involved, do not correspond very well to the perceived sound, so a logarithmic scale is used to bring the numerical representation of perceived loudness and the numerical representation of the actual physical change into line with each other. Logarithms are a simple way of expressing parameters that vary by enormous amounts with smaller numbers (in other words, a large measurement range is scaled down to a much smaller and more easily usable range).
Because the human ear accommodates a large range of loudness, it is convenient to express loudness logarithmically in factors of ten. The entire range of loudness can be expressed on a scale of about 120.0 dB (0.0 dB is defined as the threshold of hearing), and within this logarithmic scale, increasing the intensity of sound by a factor of 10 raises its level by 10.0 dB, increasing it by a factor of 100 raises the level by 20.0 dB, and increasing it by a factor of 1,000 raises the level by 30.0 dB, and so on. Hence, the term decibel does not represent a physical value. It is a relative measurement based on the internationally accepted standard that 20 micropascals of air pressure equals 0.0 dB (20 micropascals of air pressure at 1,000 Hz is the threshold of hearing for most people).
Since the term decibel expresses a ratio and not a physical value, it can be applied to things other than loudness. In amplifiers, for example, a 200 watt amp is 1 bel or 10 decibels more powerful than a 20 watt amp, but it is important to understand that even though a 200 watt amp puts out ten times more electrical power than a 20 watt amp, it does not generate ten times more loudness, for a ten-fold change of electrical power is only perceived by the human ear as a 10.0 dB change of loudness. In other words, the underlying scales are different, and one scale should not be equated directly with the other. This can be shown in a graph, where the vertical axis represents dB and the horizontal axis represents electrical power (the curved line in Figure Glossary.2 is the logarithmic contour).
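
The power ratios mentioned in this entry can be reproduced with a short sketch. It uses the standard relationship of ten times the common logarithm of a power ratio; the function name is hypothetical.

```python
import math


def power_ratio_db(power, reference_power):
    """Level difference in decibels between two powers."""
    return 10 * math.log10(power / reference_power)


# The amplifier example: 200 watts compared with 20 watts.
print(round(power_ratio_db(200, 20), 1))

# Factors of 10, 100, and 1,000 in intensity.
for factor in (10, 100, 1000):
    print(factor, round(power_ratio_db(factor, 1), 1))
```

The results are 10, 20, and 30 dB for factors of 10, 100, and 1,000, matching the figures given above.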

Diaphragm—the thin membrane in a microphone capsule that moves in reaction to soundwaves. In the early days of capacitor microphones, diaphragms were made from PVC (polyvinyl chloride, such as in the M7 capsule designed by Georg Neumann in 1932), but now they are usually made from PET (polyethylene terephthalate), which is lighter, thus providing a more responsive capsule with greater sensitivity and articulation. Manufacturers fashion these materials into thin sheets coated with a gold surface so that the diaphragm may be charged to create a capacitive effect (hence, the term “metal film”). In the most expensive capsules, the gold is evaporated onto the membrane in a vacuum chamber to ensure uniform coverage. The more economical process of sputtering or spraying gold onto the membrane, the process used on less expensive microphones, can result in an uneven coat that causes membrane imbalance and inconsistencies in the capsule’s response. In ribbon mics, the diaphragm consists of a thin strip of corrugated aluminum.

Diffuse or reverberant sound field—the area in a room in which reflections from the walls, ceiling, floor, etc. predominate (that is, the ensemble of reflections in an enclosed space). In other words, the sounds arrive at the listening position/microphone randomly from all directions and the direct sound no longer dominates. These reflections have their high-frequency content attenuated by surface absorption, as well as by the air, and reach the listener/microphone at oblique angles of incidence, which causes further high-frequency loss. This is also the field in which direct sound travels as a plane wave (in plane waves, intensity decreases in a linear relationship to distance traveled, that is, the Inverse Square Law no longer applies).

Figure Glossary.2 Logarithmic changes of power and perception in decibels.

Digital audio—the use of a series of discrete binary numbers (0 or 1) to represent the changing voltage in an analog signal.

Digital-to-analog converter (DAC)—converts digital code to an analog signal (voltage), so that non-digital systems can use the information.

Direct or free sound field—sound arriving perpendicularly to the listening position or the diaphragm of a microphone without reflections (a purely direct field can exist only where sound propagation is undisturbed, such as in an open space free from all reflections). The direct path of sound is the shortest route from the sound source to the listening position. It is also the area in which soundwaves propagate spherically and the Inverse Square Law applies. This field ends where the sound pressure level ceases to fall by 6.0 dB for every doubling of the distance.

Distance factor—an indication of how far recordists can locate a directional microphone from a sound source and have it exhibit the same ratio of direct-to-reverberant sound pickup as an omnidirectional microphone.

Dither—the technique of adding specially constructed noise to a signal before its bit depth is reduced to a lower level. It alleviates the negative effects of quantization by replacing nonrandom distortion with a far more pleasing random noise spectrum. One of the commonly used types of dither is TPDF (triangular probability density function, which uses white noise with a flat frequency spectrum), but devices can also add noise containing a greater amount of high-frequency content (called blue noise). The process involving blue noise is known as colored/shaped noise dithering or noise shaping, and it concentrates the noise in less audible frequencies (generally those above 15–16 kHz), while reducing the level of the noise in the frequency range humans hear best (between 2 and 5 kHz and around 12 kHz).
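
A minimal sketch of TPDF dithering, assuming samples lie between -1.0 and 1.0 and that the dither is formed by summing two uniform random values (which yields a triangular distribution spanning plus or minus one quantization step). The function name is hypothetical, and clipping at the ends of the scale is ignored.

```python
import random


def quantize_with_tpdf(sample, bits, rng):
    """Reduce a sample in -1.0..1.0 to `bits` bits after adding TPDF dither."""
    step = 2.0 / (2 ** bits)  # size of one quantization step
    # Two uniform values summed give a triangular probability density.
    dither = (rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)) * step
    # Round the dithered sample to the nearest step of the coarser scale.
    return round((sample + dither) / step) * step


rng = random.Random(0)  # seeded only so the sketch is repeatable
print([quantize_with_tpdf(0.3, 8, rng) for _ in range(4)])
```

Because the noise is added before rounding, the same input does not always land on the same step, which is what turns the nonrandom rounding distortion into random noise.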

Ducking—the technique of dropping one signal below another. It is frequently used in voiceovers to place the main signal in the background while the announcer speaks.

Dynamic microphone—these microphones operate on the principle of electromagnetic induction. A light diaphragm connected to a finely wrapped coil of wire suspended in a magnetic field moves within that magnetic field to induce an electrical current proportional to the displacement velocity of the diaphragm. Dynamic microphones are also called velocity or moving-coil microphones.

Dynamic range—the difference between the softest and loudest sound a system can produce.

Early reflections—the first reflections to arrive at a listening position within 80 ms of the direct sound.

Equal loudness curves—these curves or contours, originally established by the researchers Harvey Fletcher and Wilden A. Munson in the 1930s (and refined by later researchers), show how loudness affects the way humans hear various frequencies. Figure Glossary.3 demonstrates that people exhibit the greatest sensitivity to frequencies around 4 kHz and the least sensitivity at either end of the spectrum, particularly in the lower part of the hearing range. In other words, for listeners to perceive a 50 Hz sound in the same way they perceive a 1 kHz sound at 40.0 dB, the level of the signal has to be increased to 70.0 dB (in the chart, follow the 40 phon contour from 1 kHz up to the 70.0 dB level and the frequency below this point is roughly 50 Hz).

Figure Glossary.3 Equal loudness curves (contours).

Fast Fourier transform (FFT)—a type of mathematical analysis, first developed by Jean Baptiste Joseph Fourier in the early nineteenth century, that allows data from one domain to be transformed into another domain. Computers perform the Fourier transform (that is, the mathematical calculations for it) at a very high speed, and modern spectrograms rely on what is known as the “fast Fourier transform” to plot frequency against amplitude in real time so that the visual representation of a signal changes as rapidly as the signal itself.
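
To illustrate the transformation from the time domain to the frequency domain, here is a small sketch of one well-known FFT variant, the recursive radix-2 Cooley-Tukey algorithm. The choice of algorithm is an assumption for illustration (the text does not name one), and the input length must be a power of two.

```python
import cmath
import math


def fft(samples):
    """Radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of the even-indexed samples
    odd = fft(samples[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# One second of a 3 Hz sine wave sampled 16 times per second.
n = 16
signal = [math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
magnitudes = [abs(value) for value in fft(signal)]
peak_bin = max(range(n // 2), key=lambda k: magnitudes[k])
print(peak_bin)  # 3: the energy appears at 3 Hz
```

Plotting `magnitudes` against bin number gives the frequency-against-amplitude view that a spectrogram updates in real time.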

Filter—any device that alters the frequency spectrum of a signal by allowing some frequencies to pass, while attenuating others. Filters change the balance between the various sine waves that constitute a complex waveform.

Free sound field—see Direct or free sound field.

Frequency—a measure of how often (“frequently”) an event repeats itself. A sound source which vibrates back and forth 1,000 times per second has a frequency of 1,000 cycles per second (cps). Frequency is now stated in hertz (Hz) instead of cps (named after Heinrich Hertz, a German pioneer in research on the transmission of radio waves).

Frequency response—the range of frequencies that an audio device will reproduce at an equal level (within a tolerance, such as ±3.0 dB). It is a way of understanding how a microphone responds to sound sources and is usually expressed in graph form, where the horizontal axis represents frequency and the vertical axis amplitude (in dB).

Fundamental—the lowest frequency in a complex waveform. The fundamental is perceived as the pitch of a note.

Harmonics—see Overtone series.

Headroom—the difference between the average or nominal level of a signal (in EBU terms, this is the target loudness) and the point at which the signal clips (0.0 dBFS in digital systems).

Hertz (Hz)—the term used to designate frequency in cycles per second and named after the German physicist Heinrich Hertz. It was adopted as the international standard in 1960.

Impulse response (IR)—the reverberation characteristics of an ambient space. An IR is recorded using a short burst of sound (for example, a starter pistol) or a full-range frequency sweep played through loudspeakers to excite the air molecules in a room. After the sound of the stimulus has been removed from the recording (through a process known as deconvolution), the room’s impulse response or reverb tail can be added to a dry signal.

Inverse Square Law—in the direct or free field (that is, in a field free from reflections), soundwaves radiate in all directions from a source in ever-expanding spheres, and as the surface areas of these spheres increase over distance, the intensity of the sound decreases in relation to the area the soundwaves spread across (see Figure Glossary.4). The Inverse Square Law states that the intensity of a sound is inversely proportional to the square of the distance from the source. In other words, for every doubling of the distance, the intensity falls to one quarter and the sound pressure reduces by half, a decrease of 6.0 dB (note that this principle applies only in the direct or free field; in enclosed spaces, the actual decrease is somewhat less than 6.0 dB).

Figure Glossary.4 Spherical propagation of soundwaves in an open space.
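
The relationship can be sketched numerically. The example assumes a point source in a field free of reflections, and the function name is hypothetical.

```python
import math


def free_field_change(near_distance, far_distance):
    """Change in intensity, pressure, and level when moving away from a source."""
    intensity_ratio = (near_distance / far_distance) ** 2  # inverse square
    pressure_ratio = near_distance / far_distance          # inverse distance
    level_change_db = 10 * math.log10(intensity_ratio)
    return intensity_ratio, pressure_ratio, level_change_db


# Doubling the distance from 1 metre to 2 metres.
intensity, pressure, level = free_field_change(1.0, 2.0)
print(intensity, pressure, round(level, 2))  # 0.25 0.5 -6.02
```

Each further doubling (2 to 4 metres, 4 to 8 metres) removes another 6 dB, which is why the level falls quickly close to a source and slowly far from it.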

K-weighting—a filter that approximates human hearing by de-emphasizing low frequencies (to make them less loud) and emphasizing higher frequencies (to make them louder) (see Figure Glossary.5).

Figure Glossary.5 K-weighting.

Late reflections—see Diffuse or reverberant sound field.

Line level—refers to the average voltage level of an audio signal. In professional signal processing components, it is usually +4.0 dBu (dBu is the signal level expressed in decibels referenced to a voltage of 0.775 volts).

LKFS (Loudness, K-weighted, referenced to digital Full Scale)—loudness level on an absolute digital scale. It is analogous to dBFS, for one unit of LKFS equals one dB. This terminology is used by the International Telecommunication Union and the Advanced Television Systems Committee (USA); it is identical to LUFS.

Logarithmic—instead of dealing with a number itself, the number is represented by its logarithm (often abbreviated as log). The common log of a number is the power to which the number 10 must be raised to obtain that number; for example, 10 to the power of 2 (10²) equals 100, thus the log of 100 is 2. In a logarithmic scale, distances are proportional to the logs of the numbers represented, but in a linear scale the distances are proportional to the numbers themselves.

Lossless—a codec for reducing the size of a file that preserves the original data during coding and decoding; that is, no information is lost in the process.

Lossy—a codec that removes information from an audio signal in order to reduce the size of the file. Principles of psychoacoustics are used to identify parts of the signal that humans cannot hear well, and the codec discards less audible components, which has a detrimental effect on sound quality.

Loudness—a perceptual quantity: the magnitude of the physiological effect produced when a sound stimulates the ear. This physiological reaction is measured by meters employing an algorithm developed by the International Telecommunication Union (ITU) designed to approximate the human perception of level.

LRA (Loudness Range)—originally developed by TC Electronic, it is the overall range of the material from the softest part of a signal to the loudest part, given in LU. To prevent extreme events from skewing the reading, the top 5% and the lowest 10% of the total loudness range are excluded from the measurement (for example, a single gunshot or a long passage of silence in a movie would otherwise result in a loudness range that is far too broad).
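
The trimming described above can be sketched as a percentile calculation. This is a simplified illustration only: it assumes a list of loudness readings is already available, uses a nearest-rank percentile, and omits the gating that standardized loudness meters apply. The function name and the readings are hypothetical.

```python
def trimmed_loudness_range(readings_lufs):
    """Spread between the 10th and 95th percentiles of the readings, in LU."""
    ordered = sorted(readings_lufs)
    last = len(ordered) - 1
    low = ordered[round(0.10 * last)]   # ignore the lowest 10%
    high = ordered[round(0.95 * last)]  # ignore the top 5%
    return high - low


# Hypothetical readings: material between -30 and -18 LUFS,
# plus one near-silent passage and one very loud event.
readings = [-60.0] + [-30.0 + 0.5 * i for i in range(25)] + [-3.0]
print(max(readings) - min(readings))     # 57.0, the untrimmed range
print(trimmed_loudness_range(readings))  # 11.0
```

The two outliers stretch the untrimmed range to 57 LU, while the trimmed figure of 11 LU reflects the bulk of the material.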

LU (Loudness Unit)—a relative unit of loudness referenced to something other than digital full scale. It employs K-weighting and is analogous to dB, for one LU equals one dB. This terminology was established by the International Telecommunication Union (ITU).

LUFS (Loudness Unit, referenced to digital Full Scale)—loudness level on an absolute digital scale. It is analogous to dBFS, for one unit of LUFS equals one dB. LUFS employs K-weighting and is identical to LKFS. This terminology is used by the European Broadcasting Union (EBU).

Maximum sound pressure level—the maximum sound pressure level a microphone will accept, while producing harmonic distortion of 0.5% at 1,000 Hz.

Near field—the sound field immediately adjacent to a source where direct sound energy dominates. For microphones, this is the distance within which reflected sound remains minimal.

Noise floor (self-noise)—the internal noise level generated by a device or system (for example a microphone in the absence of soundwaves striking the diaphragm). The noise comes from the resistance of the coil or ribbon in electromagnetic mics and from the thermal noise of the resistors, as well as the electrical noise of the pre-amp, in electrostatic mics. It is expressed in dB (lower numbers are better).

Normalization—a method of adjusting loudness so that listening levels are more consistent for audiences.

Nyquist Theory—between 1924 and 1928, Harry Nyquist discovered that an analog signal can be recreated accurately only if measurements are taken at a rate equal to or greater than twice the highest frequency in the signal. The maximum frequency a digital system can represent is about half the sampling rate.

Overtone series—the frequencies above the fundamental in a complex waveform, that is, a collection of sine waves at integer multiples of the fundamental frequency. This series of frequencies gives notes their tonal color or timbre.

“Passive” device—one that does not require external power to function.

Periodic soundwave—a waveform that repeats its shape. All waveforms with pitch are periodic.

Phase—the starting position of a periodic wave in relation to a complete cycle.

Phon—a unit used to relate perceived loudness to the actual sound pressure level of a signal. Phons describe the psychological effect of loudness. The concept of the phon is part of the system known as the equal loudness curves or contours, a system in which the threshold of hearing (0.0 dB) for a 1 kHz sine wave (pure tone) is equated to 0 phons.

Plane soundwave—when waves propagating spherically reach the point at which the surfaces of the spheres become almost flat, the intensity of the waves decreases in a linear fashion, more or less uniformly. At this distance, the Inverse Square Law no longer applies, because the total area of the plane changes very little as the waves travel forward.

PLR (Peak-to-loudness ratio)—the difference between a signal’s maximum true-peak level and its integrated or average loudness.

Polar coordinate graph—a graphing technique used for plotting the directional sensitivity patterns of microphones. Concentric circles represent the sensitivity in terms of dB, and the plotted lines show the amount of attenuation that occurs for specific frequencies arriving from various angles.

Polar patterns—the polar response of a microphone indicates its sensitivity to sounds arriving from any location around the diaphragm.

Pre-delay—the time gap between the arrival of the first wavefront at a listening position in an enclosed space and the arrival of the first reflection from a nearby surface (also known as the initial-time-delay gap).

Pressure-gradient transducer—a microphone operating on differences in pressure from soundwaves arriving on both sides of a single diaphragm or on the outer surfaces of two diaphragms joined together (but separated by a backplate).

Pressure transducer—microphone designers clamp a single circular diaphragm inside a completely enclosed casing so that only the front face is exposed to the sound field. Sounds arriving from all directions exert equal force on the diaphragm, and because the diaphragm responds identically to every pressure fluctuation on its surface, these microphones exhibit a nondirectional, that is, an omnidirectional (360°), response pattern.

Proximity effect—the discernible increase in the low-frequency response of pressure-gradient microphones (cardioids and ribbons) as sound sources move closer to the diaphragm.

Pulse code modulation (PCM)—invented by Alec Reeves in the late 1930s, PCM has become the standard method for digitally encoding analog waveforms (the technique is used in both WAV and AIFF). It has three components: sampling, quantizing, and encoding.

Quantization—when the voltage measurement at a sample falls between two of the integers in a scale based on bit depth, quantization rounds (quantizes) the measurement to the closest step of the scale.

Quantization noise—the rounding of measurements taken in the sampling process introduces errors into the system (heard as nonrandom noise), and the size of the error depends on the number of steps the scale contains: a 2-bit scale has 4 possible steps (2²), a 3-bit scale 8 steps (2³), a 4-bit scale 16 steps (2⁴), an 8-bit scale 256 steps (2⁸), a 16-bit scale 65,536 steps (2¹⁶), and a 24-bit scale 16,777,216 steps (2²⁴). Scales based on higher numbers of bits, then, because they have more finely graded steps, reduce the size of the rounding error and, hence, the amount of noise in the system. In both 16- and 24-bit scales, the rounding error is so small that the noise introduced by quantization is quite faint.
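
The step counts and the shrinking rounding error can be reproduced with a short sketch. It assumes samples lie between -1.0 and 1.0 and that the steps are evenly spaced; the function names are hypothetical.

```python
def quantization_steps(bits):
    """Number of steps available on a scale with the given bit depth."""
    return 2 ** bits


def quantize(value, bits):
    """Round a value in -1.0..1.0 to the nearest step of the scale."""
    step = 2.0 / quantization_steps(bits)
    return round(value / step) * step


for bits in (2, 3, 4, 8, 16, 24):
    error = abs(quantize(0.3, bits) - 0.3)
    print(bits, quantization_steps(bits), error)
```

The rounding error can never exceed half a step, so each added bit halves the largest possible error.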

Resolution—an indication of the sound quality of digital audio based on sample rate and bit depth. Today “high-resolution audio” has a bit depth of at least 24 and a sample rate of 88.2 or 96 kHz or higher. The greater the “resolution” of the system, the more accurately it can represent waveforms.

Reverberation—see Diffuse or reverberant sound field.

Reverberation radius—see Critical distance.

Reverberation time (RT)—after a sound source has stopped emitting soundwaves, the time required for the reverberant field to decrease to one-millionth of its original strength, a reduction of 60.0 dB.

Ribbon microphone—these microphones operate on the principle of electromagnetic induction. A thin strip of corrugated aluminum (the diaphragm) is suspended in a magnetic field so that both sides engage with the sound source. These microphones induce an electrical current proportional to the velocity of displacement.

Sample peak—the peak level of a signal that occurs at sampling points.

Sampling—the process of measuring the voltage of an electrical audio signal at a regular interval so that the measurements can later be outputted as binary numbers.

Self-noise—see Noise floor.

Sensitivity—the ratio between the electrical output level of a microphone and the sound pressure level on the diaphragm. Usually expressed in dB, it is a measurement of the output produced when a mic is subjected to a standardized sound pressure level (that is, it indicates how much signal any given SPL produces).

Signal-to-noise ratio (SNR)—the ratio between the useful signal produced by a device and its inherent noise when the signal is removed, expressed in dB (higher numbers are better). A signal-to-noise ratio of 47.0 dB means that the noise floor is 47.0 dB below the signal.

Sine wave—a periodic waveform consisting of a single frequency. A sine wave has pitch but lacks the timbral quality associated with the complex waveforms produced by musical instruments and voices.

Sound pressure level (SPL)—soundwaves cause the air pressure at any given point in a wave’s cycle to vary above (compression) or below (rarefaction) barometric pressure. This variation in pressure quantifies the strength of a sound and is called sound pressure (this is what a microphone measures). When expressed on a decibel scale, it is called sound pressure level (20 micro-pascals of air pressure is 0.0 dB on the scale, and this corresponds to the threshold of hearing at 1,000 Hz for a normal human ear).

Spherical soundwave—close to a small sound source (such as the human voice), waves propagate spherically; that is, they travel away from the source in spheres that continuously increase in diameter. These waves decrease in intensity quite rapidly, falling by 6.0 dB for every doubling of the distance in a field free of reflections (the Inverse Square Law).

Transducer—a device that converts one form of energy to another (verb: transduce).

Transient—any sudden and brief fluctuation in a signal or sound that disturbs its steady-state nature. Transients generally are of a much higher amplitude than the average level and often cause devices to overload. The initial peak in energy at the beginning of a waveform (the “attack”) is called an onset transient (examples: a word which starts with a consonant, the hammer of a piano striking the strings, a rim shot on a snare drum).

Transient response—a measure of the ability of a device to handle and faithfully reproduce sudden fluctuations. In microphones, it is a measure of how quickly a diaphragm responds to abrupt changes in sound pressure (lighter diaphragms respond more quickly).

True peak—the undetected peak level of a signal that occurs between sampling points.
