2

The Aesthetic and Artistic Elements of Sound in Audio Recordings

The audio recording process has given the creative artist tools to shape perceived sound (the perceived parameters of sound) very finely, through direct control of the physical dimensions of sound. This control goes well beyond what was available to composers and performers before modern recording technology. Controlling sound in new ways has led to new artistic elements in music, to a redefinition of the musician, to new characteristics of sound, and to new dimensions in music. While our discussion is focused on musical applications, all of these artistic elements of music also function as artistic elements in other areas of audio (such as broadcast media, film, and multimedia).

A new creative artist has evolved. This person uses the tools of recording technology as sound resources for the creation (or recreation) of an artistic product. This person may be a performer or composer in the traditional sense, or this person may be one of the new musicians: a producer, or sound engineer, or any of the host of other, related job titles. Throughout this book, these people are referred to as recordists.

Through its detailed control of sound, the audio recording medium has resources for creative expression that are not possible acoustically. Sounds are created, altered, and combined in ways that are beyond the reality of live, acoustic performance. New creative ideas and new additions to our musical language have emerged as a result of recording techniques and technologies.

Creative ideas are defined by these aesthetic and artistic elements. The artistic elements are the aspects of sound that comprise or characterize creative ideas (or entire works of art, pieces of music). Study of the artistic elements will allow us to understand individual musical ideas and the larger musical event, and to recognize how those ideas and sound events contribute to the entire piece of music. Discussion will emphasize the artistic elements that are unique to recorded music, especially music created through the use of modern recording techniques and technologies.

As we have learned from the previous chapter, the artistic elements of sound are the mind/brain’s interpretation of the perceived parameters of sound. Sound, as it is perceived and understood by the human mind, becomes the resource for creative and artistic expression. The perceived parameters of sound are utilized as the artistic elements of sound to create and ensure the communication of meaningful (musical) messages.

The Art of Recording occurs when the parameters of sound are perceived as a resource for artistic expression. Recording becomes an art when it is used to shape the substance of sound and music. These materials that allow for artistic expression will be understood through a study of their component parts: the artistic elements of sound.

The States of Sound and the Aesthetic/Artistic Elements

After the perception of sound, the recorded material is understood as being comprised of sound elements that are interpreted by the mind/brain, and thus communicate artistic ideas. The aesthetic/artistic elements are directly related to specific perceived parameters of sound, just as the perceived parameters of sound were directly related to specific physical dimensions of sound.

As will be remembered, sound in audio recording is in three states: physical dimensions, perceived parameters, and artistic elements.

The artistic elements are used by the recordist to shape music (sound), resulting in artistic expression. The perceived parameters translate into the artistic elements:

Table 2-1

The Perceived Parameters and the Aesthetic/Artistic Elements of Sound

| Perceived Parameters | Aesthetic/Artistic Elements |
| --- | --- |
| Pitch | Pitch Levels and Relationships |
| Loudness | Dynamic Levels and Relationships |
| Duration | Rhythmic Patterns and Rate of Activity |
| Timbre (perceived overall quality) | Sound Sources and Sound Quality |
| Space (perceived characteristics) | Spatial Properties |

The audio production process allows for considerable variation and a very refined control of ALL of the artistic elements of sound. All of the artistic elements of sound can be accurately and precisely controlled through many states of variation, in ways that were previously possible ONLY with pitch on traditional musical instruments.

Table 2-2

The States of Sound in Audio Recording

| Physical Dimensions (Acoustic State) | Perceived Parameters (Psychoacoustic Conception) | Artistic/Aesthetic Elements (Resources for Artistic Expression) |
| --- | --- | --- |
| Frequency | Pitch | Pitch Levels and Relationships: melodic lines, chords, register, range, tonal organization, pitch density, pitch areas, vibrato |
| Amplitude | Loudness | Dynamic Levels and Relationships: dynamic contour, accents, tremolo, musical balance |
| Time | Duration (time perception) | Rhythmic Patterns and Rates of Activities: tempo, time, patterns of durations |
| Timbre (comprised of physical components: dynamic envelope, spectrum and spectral envelope) | Timbre (perceived as overall quality) | Sound Sources and Sound Quality: sound sources, groupings of sound sources, instrumentation, performance intensity, performance techniques |
| Space (comprised of physical components created by the interaction of the sound source and the environment, and their relationship to a microphone) | Space (perception of the sound source as it interacts with the environment, and perception of the physical relationship of the sound source and the listener) | Spatial Properties: stereo location, surround location, phantom images, moving sources, distance location, sound stage dimensions, imaging, environmental characteristics, perceived performance environment, space within space |

Pitch Levels and Relationships

Pitch level relationships present most of the significant information in music. The artistic message of most of today’s music is communicated (to a large extent) by pitch relationships. The listener has been trained, by the music heard throughout their life, to focus on this element to obtain the most significant musical information. The other artistic elements often support pitch patterns and relationships.

Pitch is the most precisely controlled artistic element in traditional music. The use of pitch relationships and pitch levels in music is more sophisticated than the use of the other artistic elements. Complex relationships of pitch patterns and levels are common in music.

Information about the artistic element of pitch levels and relationships will be related to:

1. The relative dominance of certain pitch levels,

2. The relative register placement of pitch levels and patterns, or

3. Pitch relationships: patterns of successive intervals, relationships of those patterns, and relationships of simultaneous intervals.

Traditional Uses of Pitch

The aesthetic/artistic element of pitch levels and relationships is broken into the component parts: melodic lines, chords, tonal organization, register, range, pitch density, pitch areas, and tonal speech inflection.

A series of successive, related pitches creates melodic lines. Melodic lines are perceived as a sequence of intervals that appear in a specific ordering and that have rhythmic characteristics. The melodic line is often the primary carrier of the artistic message of a piece of music.

The ordering of intervals, coupled with or independent from rhythm, creates patterns. Pattern perception is central to how humans perceive objects and events. These basic principles relate to all of the components of the artistic elements. Melodic lines are organized by patterns of intervals (short melodic ideas, riffs, or motives), supported by corresponding rhythmic patterns. The complexity of the patterns, the ways in which the patterns are repeated, and the ways in which the patterns are modified provide the melodic line with its unique character.

Two or more simultaneously sounding pitches create chords. In much of our music, chords are based on superimposing, or stacking, the intervals of a third (intervals containing three and four semitones, most commonly). Chords comprised of three pitches, combining two intervals of a third, are called triads. Continued stacking of thirds results in seventh, ninth, eleventh, and thirteenth chords.
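The stacking of thirds described above can be sketched in a few lines of code. This is a minimal illustration added for clarity, not a method from the text: it assumes pitches are written as MIDI note numbers (60 = middle C), and the function name `stack_thirds` is hypothetical.

```python
# Interval sizes in semitones, as given in the text:
# thirds most commonly contain three or four semitones.
MINOR_THIRD = 3
MAJOR_THIRD = 4


def stack_thirds(root, thirds):
    """Return the pitches of a chord built by stacking thirds above `root`.

    `root` is a MIDI note number (an assumption of this sketch; 60 = middle C).
    `thirds` lists the size in semitones of each stacked third, lowest first.
    """
    pitches = [root]
    for interval in thirds:
        if interval not in (MINOR_THIRD, MAJOR_THIRD):
            raise ValueError("each stacked interval must be 3 or 4 semitones")
        # Each new pitch sits a third above the previous one.
        pitches.append(pitches[-1] + interval)
    return pitches


# Two stacked thirds give a triad (three pitches).
c_major_triad = stack_thirds(60, [MAJOR_THIRD, MINOR_THIRD])  # C, E, G
a_minor_triad = stack_thirds(57, [MINOR_THIRD, MAJOR_THIRD])  # A, C, E

# One more stacked third gives a seventh chord (four pitches).
g_seventh = stack_thirds(55, [MAJOR_THIRD, MINOR_THIRD, MINOR_THIRD])  # G, B, D, F
```

Continuing to append thirds to the list yields the ninth, eleventh, and thirteenth chords mentioned above.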

The movement from one chord to another, or harmonic progression, is the most stylized of all the components of the artistic elements. Harmonic progression is the pattern created by successive chords, as based on the lowest note (the root) of the triads (or more complex chords). These patterns of chord progressions have become established as having general principles that occur consistently in certain types of music. Certain types of music will have stylized chord progressions (progressions that occur most frequently), other types of music will have quite different movement between chords, and perhaps emphasize more complex chord types. The patterns of the harmonic progression create harmony.

Harmony is one of the primary components that support the melodic line. The chords in the harmonic progression reinforce pitches of the melody. The speed and direction of the melodic line is often supported by the speed at which chords are changed, and the patterns created by the changing chords: harmonic rhythm.

The expectations of harmonic progression create a sequence of chords, which will present areas of tension and areas of repose within the musical composition. The tendencies of harmonic motion do much to shape the momentum of a piece of music, and can greatly enhance the character of the melodic line and musical message. Performers utilize the psychological tendencies of harmonic progression, exploiting its directional and dramatic tendencies. The expectations of harmonic movement and the psychological characteristics of harmonic progression have become important aspects of musical expression and musical performance.

The melodic and harmonic pitch materials are related through tonal organization. Certain pitch materials are emphasized over others, in varying degrees, in nearly all music. This emphasis creates systems of tonal organization in which a hierarchy of pitch levels exists. A hierarchy will most often place one pitch in a predominant role, with all other pitches having functions of varying importance in supporting the primary pitch. The primary pitch, or tonal center, becomes a reference level, to which all pitch material is related, and around which pitch patterns are organized.

Many tonal organization systems exist. These systems tend to vary significantly by cultures, with most cultures using several different, but related systems. The major and minor tonal organization systems of Western music are examples of different, but related systems, as are the whole-tone and pentatonic systems of Eastern Asia. The reader should consult appropriate music theory texts for more detailed information on tonal organization, as necessary.

The New Pitch Concerns of Audio Production

Certain components of pitch levels and relationships have become more prominent in musical contexts (and other areas of audio) because of the new treatments of pitch relationships in music recordings. The components of range, register, pitch density, and pitch area can be more closely controlled in recorded music than in live (unamplified) performance. These components are more important in recorded music because they are precisely controllable by the technology, and they have been controlled to support and enhance the musical material.

Range is the span of pitches of a sound source (any instrument or voice). Range is the area of pitches that encompasses the highest note possible (or present in a certain musical example) to the lowest note possible (or present) of a particular sound source.

A register is a portion of a sound source’s range. A register will have a unique character (such as a unique timbre, or some other determining factor) that will differentiate it from all other areas of the same range. It is a small area within the source’s range that is unique in some way. Ranges are often divided into many registers; registers may encompass a very small group of successive pitches, up to a considerable portion of the source’s range.

A pitch area is a portion of any range (or of a register) that may or may not exhibit characteristics that are unique from other areas. Instead, it is a defined area between an upper and a lower pitch-level, in which a specific activity or sound exists.

Pitch density is the relative amount and register placement of simultaneously sounding pitch material, throughout the hearing range or within a specific pitch area. It is the amount and placement of pitch material in the composite musical texture (the overall sound of the piece of music), and is defined by its boundaries of highest and lowest sounding pitches.

With pitch density, sound sources are assigned (or perceived as occupying) a certain pitch area within the entire listening range (or the smaller pitch range used for a certain piece of music). Thus, certain pitch areas will have more activity than other pitch areas; certain sound sources will be present only in certain pitch areas, and other sources present only in other pitch areas; some sources may share pitch areas, and cause more activity to be present in those portions of the range; some pitch areas may be void of activity. Many possible variations exist.
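The distribution of activity across pitch areas can be pictured with a small sketch. It is only an illustration under stated assumptions: pitch areas are approximated as frequency spans in hertz, the source names and their spans are hypothetical values chosen for the example, and the function name `activity_by_band` is not from the text.

```python
def activity_by_band(source_areas, band_edges_hz):
    """List which sound sources are active in each frequency band.

    `source_areas` maps a source name to its (low_hz, high_hz) pitch area.
    `band_edges_hz` is an ascending list of band boundaries.
    """
    result = []
    for low_edge, high_edge in zip(band_edges_hz, band_edges_hz[1:]):
        # A source is active in a band when its pitch area overlaps the band.
        active = [name for name, (low, high) in source_areas.items()
                  if low < high_edge and high > low_edge]
        result.append(((low_edge, high_edge), active))
    return result


# Hypothetical pitch areas for three sources (values are illustrative only).
sources = {
    "bass": (40, 250),
    "vocal": (200, 1000),
    "cymbal": (3000, 7000),
}
bands = [20, 100, 500, 2000, 8000, 20000]

for band, active in activity_by_band(sources, bands):
    print(band, active)
```

In this made-up texture, the bass and vocal share the 100 to 500 Hz band (more activity there), while the band above 8000 Hz is void of activity.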

Pitch density is a component of pitch-level relationships, and is directly related to traditional concerns of orchestration and instrumentation, with many new twists. Pitch density is a much more specific concern in recorded music because it is controllable in very fine increments. Traditional orchestration was concerned, basically, with the selection of instruments, and with the placement of the musical parts (performed by those assigned instruments and their sound qualities) against one another.

With the controls of signal processing (especially equalization), sound synthesis, and multitrack recording, the register placement of sound sources and their interaction with the other sound sources take on many more dimensions. Each sound source occupies a pitch area; the acoustic energy within the pitch area of a timbre’s spectrum is distributed in ways that are unique to each sound source. The spectrum of each sound source is an individual pitch density, and the pitch density of the overall program (or musical texture) is the composite of all of the simultaneous pitch information from all sound sources.

Sound sources, and musical ideas, are often delineated by the pitch area they occupy within the composite pitch density. Sound sources are more easily perceived as being separate entities and individual ideas when they occupy their own pitch area in the composite pitch density of the musical texture. This area can be large or quite small, and still be effective.

Sounds that do not have a well-defined pitch quality occupy a pitch area. These types of sounds are noise-like, in that they cannot be perceived as being at a specific pitch. Such sounds may, however, have unique pitch characteristics.

Many sounds cannot be recognized as having a specific pitch, yet have a number of frequencies that dominate their spectrum. Cymbals and drums easily fall into this category. Cymbals are easily perceived as sounding higher or lower than one another. Yet a specific pitch cannot be assigned to the sound source.

We perceive these sounds as occupying a pitch area. We perceive a pitch-type quality based on (1) the register placement of the area of the highest concentration of pitch information (at the highest amplitude level) present in the sound, and (2) the relative density (closeness of the spacing of pitch levels) of the pitch information (spectral components). We are able to identify the approximate area of pitches in which this concentration of spectral energy occurs, and are thus able to relate that area to other sounds.

Pitch areas are defined as the range spanned by the lowest and highest dominant frequencies around the area of the spectral activity. This range is called the bandwidth of the pitch area. Many sounds will have several pitch areas where concentrated amounts of spectral energy occur, with one range dominating and others less prominent. The size of the bandwidth and the density of spectral information (the number of frequencies within the bandwidth and the spacing of those frequencies) define the sound quality of the pitch area.
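As a rough illustration of bandwidth and density, the sketch below estimates a single pitch area from a list of spectral components. Several things here are assumptions rather than part of the text: the spectrum is given as (frequency, amplitude) pairs, a component counts as dominant when it reaches half the strongest amplitude, the cymbal-like values are invented for the example, and the function name `pitch_area` is hypothetical. It treats only the dominant pitch area; a sound with several pitch areas would need each region handled separately.

```python
def pitch_area(spectrum, threshold_ratio=0.5):
    """Estimate one pitch area from (frequency_hz, amplitude) pairs.

    A component is treated as dominant when its amplitude is at least
    `threshold_ratio` times the strongest amplitude. The threshold is an
    assumption of this sketch; the text does not specify one.
    """
    peak = max(amplitude for _, amplitude in spectrum)
    dominant = sorted(freq for freq, amplitude in spectrum
                      if amplitude >= threshold_ratio * peak)
    low, high = dominant[0], dominant[-1]
    return {
        "low_hz": low,
        "high_hz": high,
        "bandwidth_hz": high - low,          # span of the pitch area
        "component_count": len(dominant),    # a simple measure of density
    }


# Invented cymbal-like spectrum: no single pitch, energy concentrated in one region.
cymbal = [(800, 0.1), (3100, 0.6), (4200, 1.0),
          (5300, 0.7), (6900, 0.55), (9500, 0.2)]
area = pitch_area(cymbal)
```

Comparing the `low_hz` and `high_hz` values of two such sounds is one way to express why one cymbal is heard as higher or lower than another even though neither has a specific pitch.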

Dynamic Levels and Relationships

Dynamic levels and relationships have traditionally been used in musical contexts for expressive or dramatic purposes. Expressive changes in dynamic levels and the relationships of those changes have most often been used to support the motion of melodic lines, to enhance the sense of direction in harmonic motion, or to emphasize a particular musical idea. A change of dynamic level, in and of itself, can produce a dramatic musical event, and is a common musical occurrence. Changes in dynamic level can be gradual or sudden, subtle or extreme.

Dynamics have traditionally been described by analogy: louder than, softer than, very loud (fortissimo), soft (piano), medium loud (mezzo forte), etc. The artistic element of dynamics in a piece of music is judged in relation to context. Dynamic levels are gauged in relation to (1) the overall, reference dynamic level of the piece of music, (2) the sounds occurring simultaneously with a sound source in question, and (3) the sounds that immediately follow and precede a particular sound.

The components of dynamic levels and relationships in audio recording are dynamic contour (with gradual and abrupt changes in dynamic level), emphasis/de-emphasis accents (abrupt changes in dynamic level), musical balance (gradual and abrupt changes in dynamic levels), and dynamic speech inflections.

Traditional Uses of Dynamics

It is common for the most important musical idea/sound source in a piece of music to be given prominence in one way or another. Making that sound the loudest is an easy way of achieving this prominence (though not always the most elegant). Arranging sounds by relating dynamic levels to the importance of the musical part is very common, and a very natural association of loudness and the center of one’s attention.

Gradual changes in dynamic levels can be important. The crescendo (gradual increasing in loudness) can be used to support the motion of a melodic line (for instance), or it might be used on a sustained pitch as a musical gesture itself. Likewise a diminuendo or decrescendo (a gradual decrease in loudness) may be used in the same ways.

Rapid, slight alterations or changes in dynamic level for expressive purposes are often present in live performances. This is called tremolo, and is used primarily to add interest and substance to a sustained sound. Tremolo and vibrato are often confused. Vibrato is a rapid, slight variation of the pitch of a sound; it, also, is used to enhance the sound quality of the sound source. At times, performers may not be able to control their sound well enough to control tremolo and vibrato alterations; in these instances, tremolo and vibrato may detract from the source’s sound quality, rather than contribute to it.
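Because tremolo and vibrato are so often confused, a short sketch may help separate them. It is an illustration under simplifying assumptions: the sound is a plain sine tone, the modulation is sinusoidal, and the function names, the 6 Hz rate, and the depth values are hypothetical choices rather than figures from the text.

```python
import math


def tremolo_sample(t, freq_hz=440.0, rate_hz=6.0, depth=0.2):
    """Sine tone whose amplitude (loudness) varies; the frequency stays fixed."""
    amplitude = 1.0 + depth * math.sin(2 * math.pi * rate_hz * t)
    return amplitude * math.sin(2 * math.pi * freq_hz * t)


def vibrato_sample(t, freq_hz=440.0, rate_hz=6.0, deviation_hz=5.0):
    """Sine tone whose frequency (pitch) varies; the amplitude stays fixed."""
    # The phase is the integral of the instantaneous frequency
    # freq_hz + deviation_hz * sin(2*pi*rate_hz*t).
    phase = (2 * math.pi * freq_hz * t
             - (deviation_hz / rate_hz) * math.cos(2 * math.pi * rate_hz * t))
    return math.sin(phase)


# One second of each signal at an assumed sample rate of 8000 Hz.
SAMPLE_RATE = 8000
tremolo = [tremolo_sample(n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
vibrato = [vibrato_sample(n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
```

In the tremolo signal the peak level rises and falls six times per second while the pitch is constant; in the vibrato signal the peak level never changes while the pitch drifts slightly above and below 440 Hz.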

To support a musical idea or to create a sense of drama, musical ideas are often brought to the listener’s attention by dynamic emphasis accents and attenuation accents. A shift in dynamic level that brings the listener’s attention to a musical idea is an accent. Accents are most often emphasis accents, which increase the dynamic level of the sound to achieve the desired result. De-emphasis (or attenuation) accents are much more difficult to achieve successfully; they draw the listener’s attention to a musical idea, or a sound source, by a decrease in the dynamic level of the sound. Attenuation accents are often unsuccessful because the listener has a natural tendency to move attention away from softer sounds; these accents are most easily accomplished in sparse musical textures, where little else is going on to draw the listener’s attention away from the material being accented.

New Concepts of Dynamic Levels and Relationships

Changes in dynamic levels over time comprise dynamic contours. Dynamic contours can be perceived for individual sounds, individual sound sources, individual musical ideas comprised of a number of sound sources, and the overall piece of music. Dynamic contours are perceived at many different perspectives (level of detail). At their extremes, they exist as the smallest changes within the spectral envelope of a single sound source, and as great changes in the overall dynamic level of a recording.

The interaction of the dynamic contours of all sound sources in a piece of music creates musical balance. Musical balance is the interrelationships of the dynamic levels of each sound source, to one another and to the entire musical texture. The dynamic level of a particular sound source in relation to another sound source is a comparison of two parts of the musical balance.

Dynamic contours and musical balance have been used in supportive roles in most traditional music. At times dynamic level changes have been used for their own dramatic impact on the music (as discussed with crescendo and diminuendo, above), but most often they are used to assist the effectiveness of another artistic element. The mixing process easily alters musical balance. Recordists exercise great control over this artistic element.

The dynamic levels and relationships of a performance may be significantly different in the final recording. The recording process has very precise control over the dynamic levels of a sound source in the musical balance of the final recording. An instrument may have an audible dynamic level in the musical balance of a recording that is very different from the dynamic level at which the instrument was originally performed. The timbre of the instrument will exhibit the dynamic levels at which it was performed (perceived performance intensity), but its relative dynamic level in relation to the other musical parts might be significantly altered by the mix. For example, an instrument may be recorded playing a passage loudly, and end up in the final musical balance (mix) at a very soft dynamic level; the timbre of the instrument will indicate that the passage was performed very loudly, yet the actual dynamic level will be quite soft in relation to the overall musical texture, and to the other instruments of the texture.

Many clear examples of this are found in The Beatles’ recording of “Penny Lane.” Listening carefully to the flutes, piccolo, and piccolo trumpet parts throughout the song, one will find many instances where the loudness levels of the performances are not reflected in the actual loudness levels of the instruments in the recording. Among many instances of conflicting level and timbre cues, we hear moderately loud flutes that were performed softly; loudly played piccolo sounds at a soft level in the mix; and a piccolo trumpet appearing at a softer level in the mix than in the performance. Other instruments and voices in the song also have inconsistent musical balance and performance intensity information.
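The separation between performed intensity and level in the mix can be illustrated numerically. The sketch below is an added illustration, not a description of any particular console or workstation: it assumes a sound is represented as a list of sample values, and the function name `apply_gain` and the sample values are hypothetical.

```python
def apply_gain(samples, gain_db):
    """Scale every sample by a gain given in decibels."""
    # A change of gain_db decibels corresponds to an amplitude factor
    # of 10 ** (gain_db / 20).
    factor = 10.0 ** (gain_db / 20.0)
    return [sample * factor for sample in samples]


# A few samples standing in for a loudly performed passage.
loud_performance = [0.0, 0.9, -0.7, 0.4, -0.2]

# The same passage placed 20 dB lower in the musical balance.
soft_in_mix = apply_gain(loud_performance, -20.0)
```

Every sample is scaled by the same factor, so the shape of the waveform, which carries the timbre of the loud performance, is unchanged; only its level in the mix differs.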

The reader is encouraged to take the time now to perform the musical balance and performance intensity Exercise 2-1 at the end of this chapter.

The dynamic level of a sound source in relation to other sound sources, and musical balance as a whole, are quite different and distinct from the perceived distance of one sound source from another. Yet these two occurrences are often confused, and are the source of much common, misleading terminology used by recordists. Significant differences are present between a softly generated sound that is close to the listener and a loudly performed sound that is at a great distance from the listener, even when the two sounds have precisely the same sound pressure level (SPL) or perceived loudness level. Within the recording process, loudness levels can be controlled independently of the loudness level at which the sound was performed, and independently of the distance of the sound source from the original receptor and from the perceived listening location of the final recording. Dynamics must not be confused with distance. Dynamic levels, themselves, do not define distance location.
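A small calculation shows how a soft, close sound and a loud, distant sound can arrive at the same sound pressure level. The sketch assumes an idealized point source in a free field, where level falls by 20·log10 of the distance ratio (about 6 dB per doubling of distance); real rooms add reflections that this ignores. The function name and the example levels are hypothetical.

```python
import math


def spl_at_distance(spl_at_reference_db, distance_m, reference_distance_m=1.0):
    """SPL of an idealized point source in a free field (inverse-distance law)."""
    if distance_m <= 0 or reference_distance_m <= 0:
        raise ValueError("distances must be positive")
    return spl_at_reference_db - 20.0 * math.log10(distance_m / reference_distance_m)


# A softly generated sound (70 dB SPL at 1 m) heard from 1 m away.
soft_and_close = spl_at_distance(70.0, 1.0)

# A loudly performed sound (90 dB SPL at 1 m) heard from 10 m away.
loud_and_far = spl_at_distance(90.0, 10.0)
```

Both values come out at 70 dB SPL, yet the two sounds would not be mistaken for one another: the timbre of each still reflects how it was performed, and the distant sound carries different environmental cues. Level alone does not place a sound at a distance.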

Rhythmic Patterns and Rates of Activities

Durations of sounds (the length of time in which the sound exists) combine to create musical rhythm. Rhythm is based on the perception of a steadily recurring, underlying pulse. The pulse does not need to be strongly audible to be perceived. The underlying pulse (or metric grid) is easily recognized by humans as the strongest, common proportion of duration (note value) heard in the music.

The rate of the pulses of the metric grid is the tempo of a piece of music. Tempo is measured in metronome markings (pulses per minute, abbreviated “M.M.”). Tempo, in a larger sense, can be the rate of activity of any large or small aspect of the piece of music (or of some other aspect of audio, for example the tempo of a dialogue).
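Since a metronome marking counts pulses per minute, the length of one pulse follows directly from it. The short sketch below is an added illustration with hypothetical function names; it also expresses a duration as a proportion of the pulse, anticipating the way durations are perceived in relation to the metric grid.

```python
def pulse_duration_seconds(pulses_per_minute):
    """Length of one pulse of the metric grid, in seconds."""
    if pulses_per_minute <= 0:
        raise ValueError("tempo must be positive")
    return 60.0 / pulses_per_minute


def duration_seconds(pulses_per_minute, proportion_of_pulse):
    """Length of a sound lasting a given proportion of one pulse."""
    return pulse_duration_seconds(pulses_per_minute) * proportion_of_pulse


# At M.M. 120, each pulse lasts half a second.
pulse = pulse_duration_seconds(120)

# A sound lasting half a pulse at the same tempo lasts a quarter of a second.
half_pulse = duration_seconds(120, 0.5)
```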

Durations of sound are perceived proportionally in relation to the pulse of the metric grid. The human mind will organize the durations into groups of durations, or rhythmic patterns. In the same ways that we perceive patterns of pitches, we perceive patterns of durations. Pattern perception is transferable to all of the components of all of the artistic elements, and is the traditional way in which we perceive pitch and rhythmic relationships.

Rhythmic patterns are the durations of or between soundings of any artistic element. Rhythmic patterns might be created by the pulsing of a single percussion sound; in this way rhythmic patterns would be created by the durations between the starts of successive occurrences of the same sound source. Rhythmic patterns comprised of the durations of successive, single pitches (perhaps including some silences) create melody. Rhythmic patterns of the durations of successive chords (groups of pitches) create harmonic rhythm. Extending this, rhythm can be transferred in the same way to ALL artistic elements. For example, it is possible to have rhythms of sound location (as has become a common mixing technique for percussion sounds); it is likewise possible to have timbre melodies, or rhythms applied to patterns of identifiable timbres (this is often used for drum solos).

Sound Sources and Sound Quality

The selection, modification, or creation of sound sources is an important aesthetic and artistic element of audio recording. The sound quality of the sound sources (the timbre of the source) plays a central role in the presentation of musical ideas, and has become an increasingly significant form of musical expression.

The sound quality of a sound source may cause a musical part to stand out from others, or to blend into an ensemble. Sound quality alone can convey tension or repose, and give direction to a musical idea. Sound quality can add dramatic or extra-musical meaning or significance to a musical idea. Finally, the timbral quality of a sound source can, itself, be a primary musical idea, capable of conveying a meaningful musical message.

Until recently, composers used the sound quality of a sound source (1) to assist in delineating and differentiating musical ideas (making them easier to distinguish from one another), (2) to enhance the expression of a musical idea by the careful selection of the appropriate musical instrument to perform a particular musical idea, or (3) to create a composite timbre (or texture) of the ensemble, thereby forming a characteristic, overall sound quality.

Performers have always used the characteristic timbres of their instruments or voices to enhance musical interpretation. This activity has been greatly refined by the resources of recording and sound reinforcement technology. Performers now have greater flexibility in shaping the timbre of their instruments for creative expression. Of equally great importance, after the performance has been captured, the recording process allows for the opportunity to return to the performance for further (perhaps extensive) modifications of sound quality.

The selection of a sound source to represent (present) a particular musical idea is critical to the successful presentation of the idea. The act of selecting a sound source is among the most important decisions composers (and producers) make. The options for selecting sound sources are (1) to choose a particular instrumentation, (2) to modify the sound quality of an existing instrument or performance, or (3) to create, or synthesize, a sound source to meet the specific need of the musical idea.

The selection of instrumentation was once merely a matter of deciding which generic instrument of those available would perform a certain musical line. The selection of instrumentation has now become very specific and much more important. The performance that exists as a music recording may virtually live forever and be heard by countless people. This is very different from the typical, live music performance of the past that existed for only a passing moment and was heard by only those people present.

Today, the selection of instrumentation is often so specific as to be a selection of a particular performer playing a particular model of an instrument. Generally, composers and producers are very much aware of the sound quality they want for a particular musical idea. The performer, the way the performer can develop a musical idea through their own personal performance techniques, and their ability to use sound quality for musical expression are all considerations in the selection of instrumentation.

Vocalists are commonly sought for the sound quality of their voice and their abilities to perform in particular singing styles. The vocal line of most songs is the focal point that carries the weight of musical expression. Vocalists make great use of performance techniques to enhance and develop their sound quality, as well as to support the drama and meaning of the text.

Performance techniques vary greatly between instruments, musical styles, performers, and functions of a musical idea. The most suitable performance techniques will be those that achieve the desired musical results, when the sound sources are finally combined. One performance technique consideration must be singled out for special attention—the intensity level of a performance.

As touched on in the above discussion with musical balance, a performance on a musical instrument will take place at a particular intensity level. This perceived performance intensity is comprised of loudness, energy exerted, performance technique, and the expressive qualities of the performance. Each performance at a different intensity level results in a different characteristic timbre of that instrument, at that loudness level. The same sound source will thus have different timbres, at different loudness levels (and at different pitch-levels), through performance intensity.

Along with the timbre (sound quality) and loudness level, performance intensity can communicate a sense of drama and an artistically sensitive presentation of the music to the listener. Through performance intensity, louder sounds might be more urgent, more intense; softer sounds might be cause for relaxation of musical motion. The exact reverse is equally possible. The expressive qualities of music are contained in performance intensity cues.

Modifying a sound source is a common way of creating a desired sound quality. Instruments, voices, or any other sound may be modified (while being recorded, or afterwards) to achieve a desired sound quality. Most often, this takes the form of making detailed modifications to a particular instrument so it best presents the musical idea. The final sound quality will still have some (perhaps many, perhaps only a few) characteristic qualities of the original sound.

The extensive modification of an existing sound source, to the point where the characteristic qualities of the original sound are lost, is actually the creation of a sound source. The creation of new sound qualities (or inventing timbres) has become an important feature in many types or pieces of music. The recording process easily allows for the creation of new sound sources, with new sound qualities.

Sound qualities are created by either extensively modifying an existing sound through sound sampling technologies, or by synthesizing a waveform. Sound synthesis techniques allow precise control over these two processes, and are having a widespread impact on recording practice and musical styles. Many specific technologies and techniques exist for synthesizing and sampling sounds; all have unique sound qualities and unique ways of allowing the user to modify or synthesize a sound source.

This increased emphasis on sound quality and timbre has produced a new sense of the importance of sound quality in communicating, as well as enhancing, the musical message. Sound quality has become a central element in a number of the primary decisions of recording music, as well as in the creation of music through the recording process. In making these primary decisions, sound quality is conceptualized as an object. The sound is thought of as a complete and individual entity, capable of being pulled out of time and out of context.

In this way, sound quality is approached as a sound object. This important concept will be explored in detail later in Chapter 4, “Listening and Evaluating Sound for the Audio Professional.”

The entire, composite sound of the music may also be conceptualized as a single entity, or overall quality, comprised of any number of small, individualized sound sources and musical ideas. This sound quality of the overall sound, or entire program, is called texture. Texture is perceived by the characteristics of its global sound quality.

Texture will nearly always be comprised of any number or types of individual sounds. Texture is perceived as an overall character, made up of the states and activities of all sounds and musical ideas. Pitch-register placements, rate of activities, dynamic contours, and spatial properties are all potentially important factors in defining a texture by the states or activities of its component parts.

Spatial Properties: Stereo and Surround Sound

The spatial properties of sound have traditionally not been used in musical contexts. The only exceptions are the location effects of the antiphonal ensembles of certain Renaissance composers and those in certain drama-related works of the nineteenth century, such as the 1837 Requiem by Hector Berlioz (with its brass ensembles stationed at the corners of the church, performing against the orchestra and choir on stage).

The spatial properties of sound play an important role in communicating the artistic message of recorded music. The roles of spatial properties of sound are many. Spatial properties may be used in supportive roles to enhance the character or effectiveness of musical ideas (large and small), to differentiate one sound source from another, to provide dramatic impact, to alter reality, or to reinforce reality by providing a performance space for the music. Further, spatial properties may be used as the primary idea of an artistic gesture. The spatial property of environmental characteristics even fuses with the timbre of the sound source to add a new dimension to its sound quality. Other possibilities certainly exist.

The number and types of roles that spatial location may play in music have yet to be exhausted or defined. The recent adoption of surround sound has further multiplied the possibilities.

All of the components of the spatial properties are under very precise and independent control. All of the spatial properties may be in many markedly different and fully audible states. Further, gradual and continuously variable change between those states is possible and common.

The spatial properties of sound that are of primary concern to recorded music (sound) are:

1. The stereo location of the sound source on the horizontal plane of the stereo array,
2. The distance of the sound source from the listener,
3. The perceived characteristics of the sound source’s physical environment, and finally
4. The surround location of sound sources on the lateral plane 360° around the listener.

The perceived elevation of a sound source is not consistently reproducible in widely used playback systems, and has not yet become a resource for artistic expression.

Two-Channel Stereo

The first three spatial properties are realized through stereophonic sound reproduction. The spatial qualities of stereo are perceived as relationships of location and distance cues and relationships of sound sources. These create a perception of a sound stage contained within the perceived performance environment of the recording.

While surround sound is becoming more prevalent, two-channel sound reproduction remains the standard of the music recording industry, with monophonic capabilities still considered for AM broadcast and television sound applications. The two-channel array of stereo sound attempts to reproduce all spatial cues through two separate sound locations (loudspeakers), each with more-or-less independent content (channel). With the two channels, it is possible to create the illusion of sound location at a loudspeaker, in between the two loudspeakers, or slightly outside the boundaries of the loudspeaker array; location is limited to the area slightly beyond that covered by the stereo array, and to the horizontal plane. The characteristics of the sound source’s environment and distance from the listener are created in much more subtle ways by stereo, but can be stunning nonetheless.

A setting is created by the two-channel playback format for the reproduction of a recorded or created performance (complete with spatial cues). This establishes a conceptual and physical environment within which the recording will be reproduced more-or-less accurately.

The reproduced recording presents an illusion of a live performance. This performance will be perceived as having existed in reality, in a real physical space, as the listener will conceive of this activity in relation to their own physical reality. The recording will appear to be contained in a single, perceived physical environment. Within this perceived space is an area that comprises the sound stage.

Sound Stage and Imaging

The sound stage is the perceived area within which all sound sources are located. It has an apparent physical size of width and depth. The sound sources of the recording will be grouped by the mind to occupy a single area. It is possible for different sound sources to occupy significantly different locations within the sound stage but still be grouped into the illusion of a single performance.

Imaging is the lateral location and distance placement of the individual sound sources within the sound stage. Imaging provides depth and width to the sound stage. The perceived locations and relationships of the sound sources create imaging, as all sources appear to exist at a certain lateral and distance location within the stereo array.

Figure 2-1 Sound stage and the perceived performance environment.

Stereo Location

The stereo (lateral) location of a sound source is the perceived placement of the sound source in relation to the stereo array. Sound sources may be perceived at any lateral location within, or slightly beyond, the stereo array.

Phantom images are sound sources that are perceived to be sounding at locations where a physical sound source does not exist. Imaging relies on phantom imaging to create lateral localization cues for sound sources. Through the use of phantom images, sound sources may be perceived at any physical location within the stereo loudspeaker array, and up to 15° beyond the loudspeaker array. Stage width (sometimes called stereo spread) is the width of the entire sound stage. It is the area between the extreme left and right source images, and marks the sound stage boundaries.
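One common way a phantom image is placed between two loudspeakers is by sending the same signal to both channels at different levels. The sketch below uses a sine/cosine constant-power pan law, which is an assumption brought in for illustration and is not described in the text; the function name and the -1.0 to +1.0 position scale are likewise hypothetical. It covers only positions between the loudspeakers and does not model images beyond the array.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0].

    -1.0 is the left loudspeaker, 0.0 is center, +1.0 is the right
    loudspeaker. This scale is an illustrative assumption.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    # Map the position onto an angle from 0 to pi/2.
    angle = (position + 1.0) * math.pi / 4.0
    # cos^2 + sin^2 = 1, so the summed power of the two channels is constant.
    return math.cos(angle), math.sin(angle)


if __name__ == "__main__":
    for pos in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(pos)
        power = left ** 2 + right ** 2
        print(f"pan {pos:+.1f}: L={left:.3f} R={right:.3f} power={power:.3f}")
```

With equal gains in both channels the image is heard at the center, where no loudspeaker exists; unequal gains pull the image toward the louder channel.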

Phantom images not only provide the illusion of the location of a sound source, they also create the illusion of the physical size (width) of the source. Two types of phantom images exist: the spread image and the point source.

A point source phantom image occupies a focused, precise point in the sound stage. The listener can close their eyes and point to a very precise point of little area where the source is heard to originate. Point sources exist at a specific point in space: they are narrow in width and precisely located in the sound stage.

Figure 2-2 Sound stage and imaging, with phantom images of various sizes.

The spread image appears to occupy an area. It is a phantom image that has a size that extends between two audible boundaries. The potential size of the spread image varies considerably; it might be slightly wider than a point source, or it may occupy the entire stereo array. The spread image is defined by its boundaries; it will be perceived to occupy an area between two points or edges. At times, a spread image may appear to have a hole in the middle, where it might occupy two more-or-less equal areas, one on either side of the stereo array.

The perceived lateral location of sound sources can be altered to provide the illusion of moving sources. Moving sound sources may be either point sources or spread images. Point sources and narrow spread images that change location most closely resemble our real life experiences of moving objects.

Many interesting examples of phantom images can be found on The Beatles’ album Abbey Road. An apparent example of a spread image with a hole in the middle is the tambourine in the first chorus of “She Came in Through the Bathroom Window.” The lead vocal in “You Never Give Me Your Money” begins the song as a point source. The image soon becomes a spread image that gradually grows wider, ultimately occupying a significant amount of the sound stage (this is partly due to the gradual addition and varying of environmental cues, which will be discussed shortly). In the second section of the work, the new lead vocal sound gradually moves from the right to the left side of the sound stage, while maintaining a spread image of moderate size.

Distance Location

Two categories of distance cues shape recorded music: (1) the distance of the listener to the sound stage, and (2) the distance of each sound source from the listener.

Both of these distances rely on a perception that the entire recording emanates from a single, global environment. This perceived performance environment establishes a reference location of the listener, from which all judgments of distance can be calculated.

The stage-to-listener distance establishes the front edge of the sound stage with respect to the listener and determines the level of intimacy of the music/recording. This is the distance between the grouped sources that make up the sound stage and the perceived position of the audience/listener. This stage-to-listener distance places the sound stage within the overall environment of the recording and provides a location for the listener.

The depth of sound stage is the area occupied by the distance of all sound sources. The boundaries of the depth of the sound stage are the perceived nearest and the perceived furthest sound sources (with the depths created by their environments, discussed below). The perceived distances of sound sources within the sound stage may be extreme; they may provide the illusion of great depth and a large area, or the exact opposite.
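The boundary definitions for stage width and depth of sound stage can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name, field names, angle convention, distance units, and all values below are hypothetical and are not measurements of any recording.

```python
from dataclasses import dataclass


@dataclass
class SourceImage:
    """A perceived phantom image (all fields are illustrative assumptions)."""
    name: str
    left_edge: float   # degrees; negative values are left of center
    right_edge: float  # degrees; equals left_edge for a point source
    distance: float    # perceived distance from the listener, arbitrary units


def stage_width(images):
    """Return the extreme left and right image edges of the sound stage."""
    return (min(i.left_edge for i in images),
            max(i.right_edge for i in images))


def stage_depth(images):
    """Return the perceived nearest and furthest source distances."""
    return (min(i.distance for i in images),
            max(i.distance for i in images))


# Hypothetical sound stage with one point source and two spread images.
images = [
    SourceImage("source A", -5.0, -5.0, 2.0),    # point source, near
    SourceImage("source B", -30.0, -10.0, 6.0),  # spread image, far left
    SourceImage("source C", 10.0, 25.0, 4.0),    # spread image, right
]

if __name__ == "__main__":
    print("stage width boundaries:", stage_width(images))
    print("stage depth boundaries:", stage_depth(images))
```

The front edge of the sound stage corresponds to the nearest source, so the first value returned by stage_depth also stands in for the stage-to-listener distance in this simplified model.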

Stage-to-listener and depth of sound stage distance cues have different levels of importance in different applications. Depth of sound stage cues tend to be emphasized over stage-to-listener distance cues in many multitrack recordings; in those recordings, the cues of the distance of the source from the listener are often exploited for dramatic effect and/or to support musical ideas. In contrast, stage-to-listener distance cues are often carefully calculated in classical and some jazz recordings (especially those utilizing standardized stereo microphone techniques); in those recordings the stage-to-listener distance will not change and has been carefully selected to represent the most appropriate vantage point (the ideal seat) from which the music is to be heard.

Turning again to Abbey Road, the distance cues of the various instruments of “Golden Slumbers” give the work and its companion “Carry That Weight” much space between the nearest and the furthest sources. The orchestral string and brass instruments are at some distance from the listener and give significant depth to the sound stage, while the piano brings the front edge of the sound stage very near the listener. Remembering that timbral detail is the primary determinant of distance location will help in accurately hearing these cues.

Environmental Characteristics

Matching a sound source to an environment with suitable sound and selecting the environment of the sound stage (the perceived performance environment) have become important parts of music recording. Environmental characteristics have the potential to significantly impact music and the quality of the recording.

Environmental characteristics fuse with the sound source to create a single sonic impression. The host environment of each sound source shapes its overall timbre/sound quality; this is also true for the overall program (shaped by its perceived performance environment). Environmental characteristics contribute greatly to sound quality and also play an important role in the recording’s sense of space. The characteristics provide a space for the sound sources to perform in, they supply some distance information that may be significant, and they contribute to the perceived depth of the sound stage.

The sound characteristics of the host environments of sound sources and the complete sound stage are precisely controllable. Each sound source has the potential to be assigned environmental characteristics that are different from the other sound sources. The recording process allows the potential for each sound source to be given a different environment, and for the characteristics of those environments to be varied as desired. Further, each source may occupy any distance from the listener within the applied host environment.

The perceived performance environment (or the environment of the sound stage) is the overall environment where the performance (recording) is heard as taking place. This environment binds all the individual spaces together into a single performance area.

The environment of the sound stage and an individual environment for each sound source (or groups of sound sources) often co-exist in the same music recording. This places the individual sound sources with their individual environments within the overall, perceived performance environment of the recording. The illusion of space within space is thus created, with the following potential perceptions:

1. That physical spaces may exist side-by-side,
2. That one physical space may exist within another physical space (where often a space with the sound qualities of a physically large room may be perceived to exist within a smaller physical space), and
3. That sounds may exist at various distances within the same host environments.

Any number of environments and associated stage-depth distance cues may occur simultaneously, and coexist within the same sound stage. The environments and associated distances are conceptually bound by the spatial impression of the perceived performance environment. These outer walls of the overall program establish a reference (subliminally, if not aurally) for the comparison of the sound sources.

Perhaps oddly, the overall space that serves as a reference, and that is perceived by the listener as being the space within which all activities occur, will often have the sound characteristics of an environment that is significantly smaller than the spaces it appears to contain. Such cues that send conflicting messages between our life experiences and the perceived musical occurrence are readily accepted by the listener and can be used to great artistic advantage. This is a very common space within space relationship.

Space within space will at times be coupled with distance cues to accentuate the different environments (spaces) of the sound sources, though often, this illusion is created solely by the environmental characteristics of the different spaces of each sound source.

“Here Comes the Sun,” also from Abbey Road, provides some clear examples of space within space. Environments clearly exist side by side from the song’s opening into the first verse. The guitar has an environment all to itself in the left channel, the electronic keyboard countermelody and Moog synthesizer glissando have similar environments distinctly different from the others, and the right channel voice has a very different third environment. The parts are held together by the notion that they all exist within a single performance space (perceived performance environment). The entry of additional instruments quickly adds numerous additional environments and enhances the sound stage. As the vocal lines are added, however, they appear to be within the same environment, though at distinctly different distances from the listener. The notion of spaces within spaces is also apparent in the drum parts; the trap set seems to occupy an area, with its characteristic environment, within which low toms in a larger space are contained.

Surround Sound

Music recordings are now being made reasonably widely in surround sound. Enough activity and interest are present that it is necessary for us to seriously explore this format now, but with some reservation. While some talented people have been working in this new format and some striking recordings have been made, few consistent uses of the unique sound qualities of surround have emerged. This section will discuss the most prevalent aesthetic and artistic elements currently found in surround music recordings, and will explore some potential applications. Without doubt, the artistic elements of surround will be further defined by recordists over the next few years; great changes and advances are likely, as the medium is just beginning to be explored in music production.

Listening to a stereo recording, we find ourselves observing a performance. We are viewing the activity as an outsider. And while we may get consumed by or immersed in the music, we are outside of the experience of the performance itself and are looking in. With surround sound, we can find ourselves enveloped by the music. We can be surrounded by the sound, and thereby contained within the space of the recording; we are no longer outside observers, but inside observers if not participants (at least in our perception of the experience). Now the listener can be enveloped by the sound (and become part of the space of the recording) or they might be oriented by the production techniques to observe a piece of music as a 360° panorama of sound. This aspect of surround sound has great potential to make a profound impact on music. Location and environmental characteristics will be approached differently for surround recordings, and distance cues will also take on new dimensions.

Surround Location

The sound stage of surround sound is vastly more complicated than stereo. Imaging takes on many strikingly new and different dimensions. With independent channels surrounding the listener, the potential exists for the sound stage to be extended enormously. This also places the listener in a listening position that is strikingly different from stereo.

As discussed, stereo is based on a single sound stage between two speakers. Five-channel surround (the format used for evaluations herein, discussed in detail in Chapter 9) provides the opportunity for as many as 26 possible combinations of speakers. This changes phantom image placement, width, and stability greatly.

The phantom images of stereo exist between two loudspeakers, and up to 15° beyond. Phantom imaging is more complex in surround. In surround there are five primary phantom image locations existing between adjacent pairs of speakers. These images tend to be the most stable and reliable between systems and playback environments.

Many secondary phantom images are possible as well. These can appear between speaker pairs that are not adjacent. These images contain inconsistencies in spectral information and are less stable. Implied are different distance locations for these images, as the trajectories between the pairs of speakers are closer to the listener position. These closer locations do not materialize in actual practice. The distance location of these images is actually pushed away somewhat by the diminished timbral clarity of these images.

When we consider locations caused by various groupings of three or four loudspeakers, placement options for phantom images get even more complex.
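The text does not spell out how the figure of 26 combinations is counted. One counting that reproduces it, offered here as an assumption, is every grouping of two or more of the five speakers. The sketch below enumerates those groupings and separates the adjacent pairs (the primary phantom image locations) from the non-adjacent pairs (the secondary ones). The speaker labels and their ring order are illustrative assumptions.

```python
from itertools import combinations

# Speaker labels in ring order around the listener (illustrative labels).
SPEAKERS = ["L", "C", "R", "Rs", "Ls"]


def is_adjacent(a, b):
    """True when two speakers are neighbors in the ring around the listener."""
    i, j = SPEAKERS.index(a), SPEAKERS.index(b)
    return (i - j) % len(SPEAKERS) in (1, len(SPEAKERS) - 1)


# Every grouping of two, three, four, or all five speakers.
groupings = {
    size: list(combinations(SPEAKERS, size))
    for size in range(2, len(SPEAKERS) + 1)
}

adjacent_pairs = [p for p in groupings[2] if is_adjacent(*p)]
non_adjacent_pairs = [p for p in groupings[2] if not is_adjacent(*p)]
total = sum(len(g) for g in groupings.values())

if __name__ == "__main__":
    print("adjacent pairs (primary):", adjacent_pairs)
    print("non-adjacent pairs (secondary):", non_adjacent_pairs)
    for size, groups in groupings.items():
        print(f"groupings of {size}: {len(groups)}")
    print("total groupings of two or more speakers:", total)
```

Under this counting there are 10 pairs (5 adjacent and 5 non-adjacent), 10 groupings of three, 5 groupings of four, and 1 grouping of all five speakers, for a total of 26.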

Phantom images can be of greatly different sizes in surround. They can range from completely surrounding the listener with a spread image of enormous size, to a small and precisely defined point source. Point sources and narrow spread images are common in current music productions, especially in the front sound field. The center channel has changed imaging on the traditional front sound stage tremendously. Images are often more defined in their locations and narrower in size.

Figure 2-3 Phantom images in pairs of surround speakers.

Distance in Surround

Distance cues, of course, remain a product of timbral detail, with some reliance on environmental cues. The enhanced presence of ambience causes surround to more readily draw the listener into making inaccurate judgments of distance cues. Sounds are often perceived as further away than is accurate, largely because of an awareness of extra or enhanced environment information. The listener is drawn toward ambience and away from an awareness of timbral detail.

The depth of sound stage is extended all around the listener as well. The listener can perceive distance in all directions, and these cues can be present in surround recordings. This provides for creative opportunities not possible in stereo, but should be approached with reservation.

First, we know listeners accept sound stages of great depth in the front sound field. They easily imagine they are viewing something with proportions out of their physical confines. Listeners are not prepared to perceive sounds from the side and rear in the same way. When presented with similar materials from side and rear locations, listeners can be reluctant to place sounds in or behind walls (even after they have been observed and recognized at those locations). The same cues presented in a musically different way to envelop the listener in the space may allow this greater depth to be perceived.

Second, phantom images from the side and rear are inherently filled with phase anomalies of the listening room. This can cause a lack of timbral definition and detail, and distort distance cues. Further, it is also common for surround systems to give different timbral qualities to any instrument panned across its different speaker locations. These timbre changes often translate into distance changes and blur distance location imaging.

Finally, involuntary head movement contributes to our localization of sound sources. While this instinct has allowed our species to survive and evolve, it may well lead to a sense of apprehension in the listener. When presented with sounds from behind, the fight or flight instinct can be triggered, and thus distract the listener or create discomfort.

Environmental Characteristics and Surround Sound

Environmental characteristics can be directed to the listener from every direction in a very natural manner. This will immerse the listener in the cues, and provide the life-like experience of being present in the space where the recording was made. Spaciousness can be presented by both two-channel and five-channel systems to portray a sense of space, but only surround systems can provide the sensation of being there within the performance.

At present, environmental characteristics are mostly used in ways similar to stereo recordings. The characteristics of the perceived performance environment and of individual sound sources are crafted to shape the musicality of the recordings. One exception is fully or partially immersing the listener in the ambience of a source’s host environment, while localizing the source in a specific location elsewhere (usually in the front sound field).

The inherent qualities of environmental characteristics remain unchanged, except for changes in the direction(s) of the arrival of reflected sound. This itself is a great difference, as the fusing of environmental characteristics and direct sound can become challenged, with a variety of results such as enlarged images, unnatural effects (perhaps pleasing), distracting reflections, and many more. This is especially apparent when environment cues are sent to only a few channels and do not surround the listener; this can lead to many different illusions. The environmental cues of individual sound sources may be perceived as separated from the direct sound, may be used to enhance imaging and space within space illusions, and many other alternatives exist for this new dimension. This new set of illusions makes the perceived performance environment’s tendency to bind all of the spaces of the individual sound sources together even more important. The perceived performance environment will continue to provide an important context for the recording and a critical point of reference.

The album On Air by Alan Parsons provides some clear examples of some of these concepts. The final track, “Blue Blue Sky,” surrounds the listener with a gradually moving lead vocal. The vocalist moves from the left surround, behind the listener through the right surround and the right channel before arriving at the center. The source image remains largely consistent in size and allows us to appreciate how the movement of the most important sound source, and its temporary placement behind the head of the listener, can impact the musical idea. The listener is observing this activity around them more than being enveloped by it; the gentle nature of the musical line allows the listener to feel comfortable with this movement and the choices of location. The texture and sound stage change entirely at 1:43. Here instrumentation changes and a chorus of vocals enters, and the listener becomes very effectively enveloped by the surround sound stage. Listeners can easily perceive themselves not only in the recording space, but also within the group of performers.

Conclusion

With the recording process, it is possible for any of the artistic elements of sound to be varied in considerable detail. In so doing, all artistic elements can be shaped for artistic purposes and used to create musical ideas. As all elements of sound can be varied by roughly equal amounts, it is possible for any element to play an important role in a piece of music. We commonly see this practice in today’s music productions.

The artistic elements are used in very traditional roles in certain musical works and types of recording productions, and in very new ways in other works. These new ways the artistic elements are used tend to emphasize aspects of sound that cannot be controlled in acoustic performances. The aesthetic/artistic elements unique to audio recording (especially sound quality and spatial properties) are commonly used to support and shape musical ideas. Different musical relationships and sound properties can exist in audio recordings than in acoustic music. Knowing and controlling these elements gives the recordist the opportunity to contribute to the creative process and the act of making music.

The potentials of the artistic elements to convey the musical message, the musical message itself, and the characteristics and limitations of the listener are explored in the following chapter.

Exercises

The following exercise should be practiced until you are comfortable with the material covered.


Exercise 2-1

General Musical Balance and Performance Intensity Observations.

1. Listen carefully to “Penny Lane” by The Beatles. Follow the flutes, piccolo, and piccolo trumpet parts to observe the conflicting levels/cues cited in the discussion above.
2. In succeeding hearings, find other instances where musical balance is at a different loudness than the performance intensity information of the instruments’ sound qualities.
3. Listen again, while focusing attention on a specific instrument or voice you know well; follow that sound source carefully throughout the song, to make some general observations of performance intensity cues.
4. Listen again and note the actual loudness of that instrument/voice in relation to the other sound sources.
5. Finally, listen again for how these relationships change between major sections of the song (i.e., between verse and chorus).
