CHAPTER 16

The Non-Environment Control Room

Lack of consistency in current recordings. The concept of the Non-Environment. Discussion of criticisms. Spaciousness – on the recording or in the reproduction. The origins of stereophony. Perception variables. Benefits of the Non-Environment approach. Comparison with stereo microphone techniques. Traditional standards. The origin of the Non-Environment concept.

In this chapter, we will discuss an example of a control room philosophy which is generally representative of the types of stereo control rooms which use hard, reflective front walls and maximally absorbent rear walls. Chapter 17 will put forward an alternative concept, that of the Live-End, Dead-End (LEDE) control room, which uses an absorbent front half and a reflective/diffusive rear half of the room. That chapter is written by a specialist in LEDE control room design, in order to keep a balanced viewpoint from the perspective of the book. By placing these two chapters together, readers will be able to judge for themselves the pros and cons. What is more, as both systems are in use in highly sophisticated professional studios, it should become apparent why the subject of control room acoustics can be so controversial, with partisan supporters of the differing philosophies all ostensibly seeking more or less the same objectives via some very different means.

What follows in this chapter is an updated version of a paper entitled ‘A proposal for a more perceptually uniform control room for stereophonic music recording studios’ which was presented in 1997 to the 103rd Convention of the Audio Engineering Society, in New York, generally outlining the philosophy behind the Non-Environment rooms.14

This chapter also discusses the evidence that many of the roots of monitoring inconsistency lie not only in some erroneous and outdated beliefs relating to reverberation time, but also in attempts to extract more from stereophony than it can simultaneously supply.

16.1 Introduction

Judging by the wildly different tonal balances on many recordings for sale to the public, the state of room-to-room compatibility of the listening conditions in recording studio control rooms still appears to fall far short of what a professional industry should by now be achieving.

Digital recording has brought its share of problems, but one problem that we must credit it with removing is the variability inherent within the recording media. We may still have differences in the sonic performance of A to D and D to A converters, but the consistency of the digits which leave the recording studios and arrive in people’s homes is now largely guaranteed. The vagaries of analogue transfer to (and recovery from) vinyl disc or magnetic tape have largely been consigned to the past, so the excuses for much of the previous variability can no longer be used in defence of the current situation. Some variability or inconsistency in the spectral balance of recordings is no doubt within a range of artistic interpretations by recording staff, because it can be dependent upon what they feel is appropriate for any given track, but careful listening to much of the available recorded material will soon reveal that most of the variability is not intentional. The source of the problem certainly lies, to a large degree, in the variability of monitoring conditions in the control rooms of the recording studios in which the music is mixed.

There is a general over-reliance on inexpensive, close-field monitoring loudspeakers, and one of the reasons for the existence of this state of affairs is the lack of faith in the widely disparate range of combinations of large studio monitors and different control room acoustics. As the differing points of view as to what is ‘right’ have continued to exist amongst many experienced designers of both the loudspeakers and the rooms, it is little wonder that the recording staff have continued to show uncertainty about what they feel comfortable with. They have, in many cases, opted for highly personalised solutions that work for them, individually, on the most usual types of music that they record.

Studio designers are all aware of the problems inherent in the use of small, close-field monitors, such as the inability to produce the lower frequencies at appropriate levels, or the inconsistency in room positioning, and hence the uncertainty about how the room modes will, or will not, be driven. There are also the attendant problems of reflexions from the mixing console and other related equipment. What is more, it is true that except for a very few types of commonly used small monitors, their ability to resolve fine detail is sadly lacking. As a result of this, noises, bad edits, the operation of gates, the clashing of phase distorted effects artefacts (which often lead to undue harshness in the sound) and a host of other problems, all too frequently add themselves to the irregularities of spectral balance which occur due to entire octaves of the musical frequency range being left unmonitored.

16.2 Sources of Uncertainty

We cannot, and nor should we, be too dictatorial about all that occurs in an artistic industry. There needs to be a range of concepts of what is right in order to accommodate the individualities of different producers, musicians and listeners alike. All should be free to make their decisions for themselves, but those decisions can only be valid as long as they are aware of the decisions that they are making, and that they are not being led into them by ill-conceived control room monitoring conditions. In many cases, the designs have been based on a long held belief that a control room should possess a ‘reverberation time’ (RT) which approximates to an average of domestic listening conditions. RT is, of course, not an accurate concept for use in most small rooms, and certainly is not applicable to highly absorbent rooms, but nonetheless, RT60, T60 or whatever other decay description has been used as a design goal has frequently had some domestic point of reference.

In turn, many other design principles have been based on aspects of stereo perception in the rooms in which the commercially available recordings will eventually be played. This shows admirable concern for the record buying public who keep the music industry alive, but it is also apparent that much of the good intent has been misguided. In trying to take into account the extraordinary number of factors that are involved in the domestic listening process, and especially in the light of the fact that some requirements for the optimal reproduction of different types of music and recording techniques are mutually exclusive, the efforts have frequently not achieved their goals. By trying to take into account so many of the variables in the reproduction end of the chain, the production/record end has itself suffered a lack of certainty that has unfortunately served to introduce even more uncertainty into the reproduction end of the chain.

At the 8th International Conference of the AES, in 1990, Floyd Toole presented a paper ‘Loudspeakers and Rooms for Stereophonic Reproduction’.1 The abstract began thus:

Stereophonic reproduction attempts to reconstruct, in the minds of listeners, replicas of the timbral and spacial effects of acoustical events that have occurred at earlier times and other places. It matters not whether the ‘live’ event consisted of musicians in a natural acoustical environment, or a multi-track creation monitored in a control room. In all cases, musicians and production personnel presumably heard a stereophonic reproduction that met their artistic and technical expectations. Assuming that the necessary information has been preserved in the recording, a replication can be successful only to the extent that the loudspeakers are capable of reproducing the appropriate sounds, and that the listening rooms are capable of conveying those sounds to the ears of listeners. Variations in loudspeakers and rooms create many difficulties in achieving this goal. Although it has been traditional to consider the loudspeakers and room as separate entities, this approach is no longer justified. The loudspeakers, room and listener comprise a system within which the sounds and spacial illusions of stereo are decoded, and they must be considered together.

The above first part of Toole’s abstract is a lucid, concise, and powerful summing-up of a complex situation in real life. The last sentence of the above quotation is very significant: ‘The loudspeakers, room and listener comprise a system within which the sounds and spacial illusions of stereo are decoded …’ Decoded! If they are being decoded, then in the production process they must have been, in some way, considered to have been encoded. No encode/decode process can be expected to work optimally unless the decoder can track the encode process, and in order to do this, the encode process must be known, but in reality, the monitoring (encode) conditions during mixing are rarely known to the listener at home.

Recordings are not sold to the public with instructions about the type of loudspeaker on which, or the size, shape, or other property of room in which, they should be auditioned. There is also absolutely nothing expected from the foreseeable future that will be likely to reduce the range of domestic listening equipment and conditions. Furthermore, music is not the most important thing in the lives of the majority of people, so it will continue to be normal for people to buy houses not for the acoustics of their rooms, but for a multitude of other priorities, and then find appropriate rooms in which to listen to the music of their choice. Different loudspeakers will suit different rooms, different types of music, different recording techniques and media, different budgets, and different personal tastes, relating to precisely what people like to hear in order to achieve the most enjoyment that they can from the music of their choice during its reproduction. There is, therefore, adequate justification in having available a good choice of reproduction equipment.

Taking many of these variables into account, there will be the audiophiles who choose their system and listening conditions with great care, and who may well enjoy optimising them for their favourite types of music, recording techniques and storage media. The most appropriate loudspeakers and listening conditions for rock music recorded on an analogue system, will, in all probability, be different from those most suited to the enjoyment of middle/side, stereo microphone technique recordings of digitally recorded classical music. However, even within the latter, quite highly defined set of recording conditions, there will be a wide range of available and appropriate reproduction systems and environments, and all will sound different. There is an argument for the case that different recording styles should be mixed in control rooms optimised for their own specific characteristics. However, the variation within the sub-groups is so great that to standardise on one arbitrarily chosen type of loudspeaker, at a given distance and at a given level, will only transfer many of the vagaries of the reproduction environment into the production end of the chain. This is because the decision made as to which equipment and conditions to use for the production process may be largely based on the results on similar equipment in the reproduction end of the chain, and this could lead to very volatile standards as fashions change.

What is proposed here is an abandonment of the attempts in the control rooms to try to accommodate so many of the variables in the reproduction systems, and to concentrate on a streamlined version of the more fundamental aspects of the production and quality control needs. This will allow recordings to be monitored in more detail and with more consistency. It will also, with the knowledge and skill of the recording staff, make it easier to predict what the results will sound like in different listening environments. This approach would appear to be far more realistic than attempting to mimic one set of variables with an average of another set of variables.

16.3 Removing a Variable

In 1994, a paper was presented to the UK Institute of Acoustics2 (IOA) entitled ‘Control Room Reverberation is Unwanted Noise’. The paper put forward the concept of the Non-Environment rooms, which sought to provide monitoring conditions as close as could be achieved to free-field conditions. These rooms can reduce the decay time of reflexions and modal energy to such low levels that the perception of many recording defects becomes much easier. The paper also contained a discussion of the majority of the other widely used control room acoustic control philosophies. It noted the fact that, due to the sensitivity of human hearing systems, most attempts at producing optimised decay conditions for music monitoring had yielded control rooms that sounded subjectively very different, and tended to lead to different musical conclusions when mixing the same piece of music in different rooms.

Figures 16.1 and 16.2 show the general concept of the Non-Environment approach.3,4 It can be seen that the side walls, the rear wall and the ceiling are made as acoustically dead as possible to as low a frequency as possible. The front wall is hard, heavy and reflective, and the floor is also hard. These two surfaces, together with the hard surfaces of any equipment that may be facing the listener, provide a degree of acoustic life for sounds produced within the room, to alleviate any sense of being in an anechoic chamber. The loudspeakers are mounted flush in the solid front wall, and so are not actually in the room, but form a part of one of its perimeter surfaces. The front wall provides a large baffle against which the loudspeakers can push, thus aiding the efficiency and uniformity (flatness) of the low frequency radiation. The flush mounting also removes any response irregularities caused by cabinet edge diffractions, or by path length differences between the direct waves and the low frequency reflected waves from the wall behind free-standing loudspeakers.
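The path-length effect described above, for a free-standing loudspeaker in front of a wall, can be illustrated with a little arithmetic. What follows is only a minimal sketch, assuming an idealised, perfectly reflecting wall, a source that radiates omnidirectionally at low frequencies, and a listener on axis at some distance. The function name, the assumed speed of sound (343 m/s) and the example distances are illustrative choices and do not come from the text.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def first_notch_hz(distance_to_wall_m: float, c: float = SPEED_OF_SOUND) -> float:
    """First cancellation frequency for a source placed distance_to_wall_m
    in front of an ideal reflecting wall (hypothetical helper).

    The wave reflected from the wall travels an extra 2 * d compared with
    the direct wave. Cancellation occurs when that extra path equals half
    a wavelength: 2 * d = wavelength / 2, hence f = c / (4 * d).
    """
    if distance_to_wall_m <= 0:
        raise ValueError("distance must be positive")
    return c / (4.0 * distance_to_wall_m)


if __name__ == "__main__":
    for d in (0.25, 0.5, 1.0):  # hypothetical distances in metres
        print(f"{d:.2f} m from wall: first notch near {first_notch_hz(d):.0f} Hz")
```

As the distance to the wall shrinks, the first cancellation moves upwards in frequency. With the loudspeaker flush in the wall there is no such path difference, so under these idealised assumptions the notch does not arise.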

Figure 16.1: Plan of Non-Environment control room. Shaded areas are wide-band absorbers.

Figure 16.2: Side elevation of Non-Environment control room: (a) Horizontal rear absorbers; (b) Vertical rear absorbers.

Except for the floor and any equipment placed within the room, the monitors face something approximating a hemi-anechoic chamber. The acoustic conditions provided by the room are thus dependent upon whether a sound is produced within the room, or from one of its boundaries. In the two cases the overall decay characteristics of the room would be very different. From the monitoring direction, the reflexion problems from recording equipment can be dealt with by angling the equipment such that reflexions pass away from the listener and into an absorbent surface. If this cannot be done directly, then the offending surfaces can be protected, either by an absorbent shield, or by a streamlining device that will deflect the incident waves around or away from the object such that they will not reach the listening position. The aim in these rooms is principally to monitor the output from the loudspeakers, and nothing more.

By means of these techniques, rooms can be built which can achieve very high degrees of room-to-room compatibility by virtue of their relative absence of monitoring acoustics.

Figure 16.3: Frequency response function of monitor loudspeaker at 2 m in a small Non-Environment room.

The studio designer Tom Hidley has been pursuing techniques of controlling the room modes down to frequencies as low as 10 Hz for his Hidley Infrasound rooms,5 but some of the processes involved in achieving this very low frequency absorption lend themselves to the control of the more ‘audible’ low frequencies in much smaller rooms. This would seem to be important, because so many of the control rooms currently in use around the world are in the 25–35 m² region, and it has typically been this range of room size which has suffered so badly from inter-room incompatibility. Whilst the small Non-Environment rooms of different shapes and sizes have different ambient characteristics for general speech and noises produced within the rooms (due to the different nature of the reflective materials and the different reflexion times in different sizes of rooms), they all have remarkably common monitoring characteristics. These characteristics are essentially those of loudspeakers, modified by whatever small ambient aberrations remain.

Figure 16.3 shows the frequency response function of one monitor loudspeaker at a distance of 2 m in a relatively small Non-Environment room. Figure 16.4 shows a measurement of a similar loudspeaker at 3 m in a large Non-Environment room. The two plots are remarkably similar considering the different room sizes. Figures 16.5 and 16.6 show the step responses of the monitors in the small and large rooms respectively.

In brief, the rooms are made highly absorbent at the mid and high frequencies by the use of conventional fibrous absorbent materials. Low-mid absorption is achieved as the waves pass through arrays of fibrous-lined ducts, formed by solid waveguide panels. The lowest frequencies are addressed by means of air-damped constrained-layer panel absorbers and ‘deadsheet’ membrane absorbers, which effectively line the room with a heavy, acoustically-dead, semi-limp bag. The overall control is provided by the whole system of absorption, but for the purposes of this discussion, the concept can be likened to an anechoic chamber, with one wall replaced by a hard wall in which the loudspeakers are mounted (effectively a hemi-anechoic chamber if not for the hard floor). The floor, in rare cases, may have openings at the front and rear of the room for the utilisation of under-floor absorption. This can go some way to reducing the floor induced response disturbances.

Figure 16.4: Frequency response of similar monitor loudspeaker to that shown in Figure 16.3 at 3 m in a large Non-Environment room.

Irrespective of size or shape, such a termination will be highly uniform compared to that provided by more conventional rooms. The low frequencies may vary somewhat with room size and proportions, but with suitably effective absorption they are likely to do so to a considerably smaller degree than with most other current control room designs. What is more, the response perturbances caused by the rooms are likely to be at lower relative levels to the direct sound than is the case with most other rooms. This was one of the main benefits being proposed in the IOA paper.2 Due to the reduction of room artefacts, the lower level details in the recorded music can be more readily perceived, and any unwanted aspects of the recordings, such as the audible operation of gates, can be dealt with before they become embarrassingly evident to the more discerning members of the recorded-music-buying public.

Figure 16.5: Step response of monitor loudspeaker at 2 m in a small Non-Environment room.

Figure 16.6: Step response of monitor loudspeaker at 3 m in a large Non-Environment room.

16.4 Limitations – Real and Imaginary

Over the years, a number of criticisms have circulated about the room concept being discussed here. Some of these comments have had substantial grounds to support them, such as the abandonment of the concept of having a ‘domestic’ decay time, but others have been based on misconceived theories. Examples of the latter type are claims that the lack of modal support will produce rooms that are subjectively lacking in bass, and that an over-dead monitoring acoustic will lead to the excessive use of reverberation when mixing. The lack of modal support would only produce bass-light mixes if the decay time at the middle and high frequencies remained typical of more conventional control rooms. This was the case with some of the control rooms of the 1970s and early 1980s, where the excessive use of ‘bass traps’ was incorporated into rooms which still possessed significant decay times at higher frequencies. The Non-Environment rooms, however, are all-trapped, not just bass-trapped. A person who is used to working in a more lively room may initially be unaccustomed to the low decay time, but it is usually rapidly adjusted to, and the clarity and impact that the low frequencies possess is a revelation. If a recording is considered to be sounding too dry, then it is because a dry sound is what is on the recording medium. If it is considered desirable to remedy this, then either the mixes can be given more reverberation according to taste, or the recording acoustics or microphone location can perhaps be changed. Despite the criticism sometimes heard that working in low decay time environments leads to excessively reverberant mixes, experience has clearly shown that it is not so.

The fact is that when artificial reverberation is added, even in a totally dead room, it is unlikely to become excessive when played in a more reverberant control room because the decay responses of any reasonable control room will be short by comparison to the reverberation effects that are usually applied to recordings or mixes. One thing that is often noticed, though, is just how clearly the reverberation tails in the recordings can be heard in very low decay time rooms. It is very useful to be able to monitor these carefully, because synthetic reverberation can produce some undesirable decay tail artefacts, which all too frequently go unnoticed in many control rooms. In low decay time rooms, the sound of the rooms in which the microphones were placed, or the different effects processors that had been used on a recording, become clearly recognisable to a degree that is normally only detected on headphones. What is more, every different conventional control room will produce different perceived ratios of short-lived versus resonant, quasi-steady-state sounds, certainly beyond the critical distance. The transient/percussive/consonant sounds tend to fall off at 6 dB per doubling of distance from the near-field of the source, because their life is too short to excite much modal resonance, but the quasi-steady-state signals may be supported by the different modal decay characteristics of the different rooms. This is a fact of life in all but the most acoustically dead monitoring conditions. It therefore suggests that relatively dead monitoring acoustics will allow more consistency in the perception of the transient versus steady-state balance of the sounds.
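The figure of 6 dB per doubling of distance quoted above is the inverse-square law expressed in decibels. The following is a minimal sketch, assuming an idealised point source in free-field conditions with no room support at all; the function name and the example distances are illustrative assumptions.

```python
import math


def level_change_db(r_near_m: float, r_far_m: float) -> float:
    """Change in direct-sound level, in dB, when moving from r_near_m to
    r_far_m away from a point source in free-field conditions.

    Sound pressure is inversely proportional to distance, so the change
    is -20 * log10(r_far / r_near). Negative values mean a quieter sound.
    """
    if r_near_m <= 0 or r_far_m <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(r_far_m / r_near_m)


if __name__ == "__main__":
    for r in (1.0, 2.0, 4.0, 8.0):  # hypothetical distances in metres
        print(f"{r:.0f} m: {level_change_db(1.0, r):+.1f} dB relative to 1 m")
```

Each doubling of distance gives a drop of close to 6 dB. Quasi-steady-state sounds that are supported by room modes would not follow this simple law, which is the source of the room-dependent balance discussed above.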

16.5 Spacial Anomalies

Three of the more substantial criticisms of the low decay time monitoring conditions are that they lack a sense of spaciousness; they are not representative of ‘normal’ listening conditions; and that in the smaller rooms of this type they fail to support an adequately wide area of stereo imaging. The last of the three points will be dealt with in some detail in Sections 16.7 and 16.8, but let us first consider the question of spaciousness. An accurate rendering of spaciousness can only be achieved by multiple, lateral reflexions, arriving from the directions and with the inherent delays that are appropriate to the performance space, whether that space was real, or imaginary. A less accurate sense of spaciousness, which is perhaps a more realistic goal, can still only be achieved by reflexions coming from a direction other than that of the stereo loudspeakers. Any sense of spaciousness (in the enveloping sense) is not therefore inherent in a conventional stereo recording. It will be dependent in its nature upon the reproduction acoustics. It can never be truly representatively ‘monitored’ at the time of mixing. Introducing an arbitrary set of reflexions into the mixing environment tends only to confuse matters. Surround sound helps us to tackle this problem somewhat more reasonably.

Spaciousness and the perception of detail tend to be mutually exclusive. This is true whether it is in the performance space, the microphone technique or the reproduction chain. Orchestral conductors hear more detail from their rostra than the audiences hear from the seats in the auditoria. The conductors need to hear the detail to be able to do their job, but most audiences like to hear the all-enveloping sound from the auditoria because it pleases them. Distant stereo microphone arrangements, such as spaced omnis, produce a greater richness of sound than close, multi-microphone techniques, but the latter can produce more fine detail, and perhaps more dynamic impact. The choice of which technique to use will be a creative decision by the people responsible for the recording. For critical monitoring, however, where the same compromises exist, it would seem that experienced personnel could far more realistically achieve their aims in rooms in which they could hear the fine detail, and then interpret how things would sound in a more reflective space. This would seem preferable to monitoring in rooms in which they could hear a spacious sound, but could only guess at what problems may lurk in the low level detail, masked by the ‘spacious’ reflexions of the room. In any case, it would not be too difficult a task to introduce suitable reflectors into a relatively acoustically dead room for a final and more spacious auditioning of the end result; once, that is, any problems in the finer details had been monitored and resolved.

The criticism about not being representative of domestic listening conditions would appear to be irrelevant. To date, all too many rooms that do attempt such domestic commonality often fail to produce the intended compatibility in the end result. Averages in themselves need not be representative. The average of the integers from 1 to 10 does not represent, even within ±20%, more than 2 of the 10 integers (5 and 6). The majority of the integers would not be closely represented by the average. The worldwide range of domestic listening conditions is far too wide for any ‘average’ control room to represent. Motor cars and headphones, which now form a large part of the international listening environment, are also not represented by any average room. In fact, none of the normal arguments for control room specifications have much relevance for cars, headphones, or a wide range of domestic loudspeaker listening. What this seems to suggest is that we ought to know more about what is actually on the storage medium. This needs to be known (and recorded in a more predictable manner) in order for the disparate reproduction systems to be able to make more reliable attempts to decode the intentions of the recording personnel. The effect of possible reproduction environments must be deduced from the audiological and psychoacoustic cues in the recording, and how they will relate to the various listening conditions. In other words, the recordings should allow the maximum to be reliably extracted from them, without bias to any particular set of reproduction conditions, unless, that is, the recordings are being made for some highly specific purpose, such as television commercials or big screen cinema.
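The averaging argument in the paragraph above can be checked directly. The short sketch below simply counts how many of the integers from 1 to 10 lie within ±20% of their average; the variable names are illustrative.

```python
from statistics import mean

values = list(range(1, 11))    # the integers 1 to 10
average = mean(values)         # arithmetic mean of the ten integers
tolerance = 0.20 * average     # a band of plus or minus 20% of the average

# Keep only the integers that the average 'represents' within the band.
represented = [v for v in values if abs(v - average) <= tolerance]

print(f"average = {average}, represented = {represented}")
```

Only 5 and 6 fall inside the band, so eight of the ten values are poorly represented by the average, as the text states.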

16.6 Solutions

By taking the control room acoustics out of the recording chain, the emphasis of the burden of monitoring uniformity shifts on to the loudspeakers. As loudspeaker performance has been converging faster than room performance, this simplifies the task of producing more compatible control room monitoring. Furthermore, much of the effort in loudspeaker design research has been involved with the amelioration of the problems caused by a typical loudspeaker/room interface. Loudspeakers designed for monitoring in Non-Environment rooms can concentrate on the optimisation of axial response performance, with less emphasis needing to be placed on the directivity problems well off-axis. This removes many compromises from the design process.

It is often the constraints of producing loudspeakers with a smooth, wide-angle directivity over a wide frequency range which restrict the choice of drivers in a monitor system. The fewer restraints that there are on driver choice, the easier it is to choose drivers for their sonic neutrality, low non-linear distortions, achievable SPL and many other parameters that the usual need for off-axis directivity control frequently does much to compromise. Simpler monitoring systems of excellent ability to reveal fine detail, and work at high SPLs, can be more reasonably priced. This can therefore spread the availability of more neutral monitoring conditions to a greater proportion of the industry. An affordable means of achieving a more uniform performance from the middle order of recording studios would be likely to have more effect on the recording industry’s overall output than would be achieved by seeking to refine, even further, the upper echelon of elite studios, though that work should continue for its own valid reasons. One of the great benefits of Non-Environment rooms is that the techniques are not expensive, and apply with only minimal changes to control rooms of all sizes.

In the rooms being described here, phase responses become very important, because the absence of reflexions in the overall sound allows the detection of phase characteristics which even a single lateral reflexion can render inaudible. Many of these phase products, which are at the root of the harshness of many modern recordings, often go unnoticed, and hence also go uncorrected when using low resolution monitors in conventional rooms. With the absence of room characteristics in the monitor chain, the use of high-resolution monitor systems makes it much easier than is currently usual to achieve not only the desired timbral balance of individual instruments, but also the desired balance between the instruments. It also makes evident any non-linear distortions and the effect of any poor acoustics in the original recording spaces. The degree of openness and spaciousness contained within the recording, such as characteristics of ‘transparency’ and ‘depth’, can also be more easily assessed. (For further discussion of the audibility of phase, see Chapter 13.)

16.7 Stereo Imaging Constraints

Let us now turn to the other major point that has been raised in relation to these control rooms: their stereo imaging. Figure 16.7 shows a typical stereo perception area from a pair of loudspeakers situated in an anechoic chamber. The area is a function of geometry, so its actual size is determined by the distances between and from the loudspeakers, at least up to a point where the inter-channel delays become so great as to make stereo perception impossible. In large rooms with, say, 4 m between the loudspeakers and 5 m to the mixing position, the area available for stereo perception is sufficiently large to cover all the persons likely to be working behind the central 3 m or so of the mixing console. As the above dimensions decrease, so does the area of good stereo localisation. In very small rooms, and at close listening distances, the area of true stereo perception is perhaps only large enough for one person to appreciate comfortably. However, this should perhaps not be seen as a limitation of the room, but a clearer than normal demonstration of how two-loudspeaker stereo should behave. If clarity must be traded for spaciousness, then perhaps what we really need is ambient surround channels, rather than compromised two-loudspeaker stereo.
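Because the stereo perception area is a matter of geometry, the effect of room size can be sketched numerically. The following is a minimal illustration. The large-room dimensions (4 m between loudspeakers, 5 m to the listener) are those mentioned above, but the small-room dimensions and the 1 m sideways movement are hypothetical values chosen for the example, and the function names are illustrative. The sketch does not attempt to reproduce the specific positions marked in Figure 16.7.

```python
import math


def path_lengths(spacing_m: float, distance_m: float, lateral_offset_m: float):
    """Distances from a listener to the left and right loudspeakers.

    The loudspeakers sit at (-spacing/2, 0) and (+spacing/2, 0). The
    listener is at (lateral_offset, distance), so an offset of zero is
    the central listening position.
    """
    half = spacing_m / 2.0
    left = math.hypot(lateral_offset_m + half, distance_m)
    right = math.hypot(lateral_offset_m - half, distance_m)
    return left, right


def far_to_near_ratio(spacing_m: float, distance_m: float, lateral_offset_m: float) -> float:
    """Ratio of the longer path to the shorter path (1.0 when central)."""
    left, right = path_lengths(spacing_m, distance_m, lateral_offset_m)
    return max(left, right) / min(left, right)


if __name__ == "__main__":
    offset = 1.0  # hypothetical sideways movement in metres
    large = far_to_near_ratio(4.0, 5.0, offset)   # dimensions from the text
    small = far_to_near_ratio(2.0, 1.5, offset)   # assumed small-room layout
    print(f"large room: far/near path ratio = {large:.2f}")
    print(f"small room: far/near path ratio = {small:.2f}")
```

For the same sideways movement, the path lengths to the two loudspeakers diverge far more in the small layout than in the large one, which is why the usable stereo area shrinks with the room dimensions.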

Figure 16.7: Typical stereo perception areas for large and small rooms. Moving from A to G in either situation would not significantly change the relative path length between the large room monitors, C and D, or the small room monitors, E and F. However, in the small room, moving to position B would make the relative distance to the monitors E and F 2:3. On the other hand, moving to position B would only change the relative distances to C and D by about 17%, quite different from the 50% in the case of the small room. Large rooms, therefore, tend to produce a wider and deeper listening area for critical use.


16.8 The Concept of Stereo as Currently Used

If we look back at the early history of stereo, there were two significant attempts at the reproduction of a ‘solid’ sound; ‘stereos’ being the Greek word for solid: a wall of sound, in other words. The first experiments relevant to the development of current stereophonic sound recording and reproduction took place in the 1930s, by Snow, Fletcher and Steinberg at Bell Laboratories in the USA, and by Alan Blumlein at what was to become EMI in the UK. The Bell scientists worked towards the reproduction of the originally recorded wavefront on a macro scale in the listening area, by using multiple spaced microphones and multiple loudspeakers. Blumlein, realising that a two-channel distribution system was all that would be commercially practicable in the then foreseeable future, considered the Bell proposals to be too much to ask of a domestically realisable system. He therefore opted for the implementation of a system relying on a set of psychoacoustic criteria that could reproduce, in the area of a stereo seat (Blumlein’s own words), a realistic frontal sound stage using only a two-channel record/reproduce process.

The work at Bell Laboratories envisaged the likelihood of the use of at least three loudspeakers for reproduction, which they quite rightly considered superior ‘by eliminating the recession of the centre-stage position, and in reducing the differences in localisation for various observing positions’. In the 1970s, 1980s and 1990s Michael Gerzon6 put forward much work on new proposals for the three-speaker reproduction of stereo, some of which were totally compatible with two-channel recording systems. Although these more advanced proposals of Gerzon’s would cover a considerable listening area, the early proposals of Bell Laboratories were still quite narrow, subtending an angle of only 35° at the listening position, so this was aimed at reproduction in larger spaces, such as in cinemas, where the listeners could be at some distance from the loudspeakers.

At EMI, Blumlein’s aim was only to produce acoustical signals in a limited space around the head of one listener in a ‘stereo seat’. This was intended to form an accurate virtual image of the source, by means of reproduction via two loudspeakers subtending an angle of 60° at the listening position. Blumlein’s system constitutes the basis of what is now the well-established procedure known as Intensity Stereo, which inferred that simple level differences between the spaced loudspeakers would create both the necessary level and phase differences at the ears of the listener to produce a stereo image. This only occurs if each ear hears both loudspeakers, which is one reason why the stereo perception via headphones of a loudspeaker-derived mix can be so different, as no such inter-aural cross-talk exists with headphones. Shufflers can go some way to resolving this headphone problem, but they can also introduce problems of their own, such as position dependent frequency responses.

It was indeed possible for Blumlein’s system to produce stable images between the loudspeakers by choosing suitable level differences between the left and right loudspeakers. (The concept of ‘left and right’ is important here, because the effect is an aspect of human perception: the image supporting ability is not an inherent property of a pair of loudspeakers. The failure to fully appreciate this was one of the reasons for the failure of the many quadrophonic systems of the 1970s, where the assumption was often made, wrongly, that panning between a front/back pair of loudspeakers, on one side of the head only, would produce an analogous effect, which it does not.) The Intensity Stereo system is the one that the pan-pots of most mixing consoles employ, and which must surely be used in over 99% of all current recording processes. It is the implementation of Bauer’s Stereophonic Law of Sines.
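Bauer's relationship is usually written as sin θI / sin θA = (L − R) / (L + R), where θA is half the angle subtended by the loudspeaker pair, θI is the predicted image angle, and L and R are the loudspeaker signal amplitudes. The sketch below evaluates it for a few pan positions. It is illustrative only: the function names are hypothetical, the sine/cosine (constant-power) pan law is an assumption about the pan-pot rather than something stated in the text, and the law itself is a low-frequency approximation for a centrally seated, forward-facing listener.

```python
import math


def constant_power_pan(position):
    """Left/right gains for a pan position from -1.0 (left) to +1.0 (right).

    Assumed sine/cosine pan law; real consoles differ in detail.
    """
    angle = (position + 1.0) * math.pi / 4.0  # 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def image_angle_deg(gain_left, gain_right, half_angle_deg=30.0):
    """Image angle predicted by the stereophonic law of sines.

    Positive angles lie towards the left loudspeaker. half_angle_deg is half
    the angle subtended by the loudspeakers (30 for a 60-degree layout).
    """
    ratio = (gain_left - gain_right) / (gain_left + gain_right)
    sine = ratio * math.sin(math.radians(half_angle_deg))
    return math.degrees(math.asin(sine))


if __name__ == "__main__":
    for position in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(position)
        angle = image_angle_deg(left, right)
        print(f"pan {position:+.1f}: image at {angle:+.1f} degrees")
```

Equal gains place the image on the centre line, and a signal in one loudspeaker only places it at that loudspeaker. Nothing in the calculation refers to the room, which is consistent with the argument that image position is defined by the signal and the geometry alone.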

There is nothing limiting in the way that Non-Environment rooms present the stereo images, as the images perform exactly as one would expect them to perform, according to the way that the Intensity Stereo system was envisaged and implemented. (Incidentally, the Intensity Stereo referred to here has nothing to do with the psychoacoustic theories claiming intensity differences to be the key factor in localisation; here it merely relates to the level differences at the loudspeakers.)

Much work has been done in control room design to try to expand the area in which stable stereo imaging can be achieved, and the provision of certain lateral reflexions can serve to reinforce stereo localisation. Davis referred to ‘Haas Kickers’7 which are strong reflexions appearing after a suitably reflexion free period, and which help to maintain imaging. However, in many such ways, the means of supporting a wider stereo listening area are not the development of the concept of Intensity Stereo, but are psychoacoustic ‘tricks’ to help to extract more than the system inherently is capable of supporting. If a property is not inherent in the recording, then perhaps the enhancement techniques are best left for the final listening rooms, and not the control rooms. The problem in this is that the techniques tend to come at the price of compromises that must be made in other areas of monitoring. This latter point can be disturbing, as in the term ‘control room monitoring’ the words ‘monitoring’ and ‘control’ imply some sort of reference to a standard, which can hardly be the case if varying techniques are used to support the insupportable. What is more, if the control and monitoring are not defined at the recording stage, then to what standards do the domestic equipment manufacturers design their sound reproduction products? In the ‘Studio Monitoring System’ and ‘Control Room’ surely we must aim at some sort of tighter reference if the present unacceptably large range of end-product sound qualities is to be brought to a more repeatable equilibrium. Having said that, the current situation does provide a great deal of work for the ‘Mastering’ industry.

16.9 Conflicts and Definitions

There are a number of factors in studio monitoring which directly contradict domestic hi-fi requirements. Studio monitors are usually required to show up flaws and problems in the sound. They have an analytical requirement that is not normally necessary when listening to music solely for pleasure. Control rooms are for quality control as well as for the assessment of compatibility with the outside world. They are also, of course, used as creative environments, and that is a further aspect that makes its own demands. However, in almost all cases, the quality control function is degraded when attempts are made to imitate arbitrary domestic conditions, or to support artificially the stereo image stability over a wider area than was ever envisaged when the concept was formulated. It would thus seem that a very logical way to control the ‘encode’ side of the recording process is in rooms which simplify, to the greatest extent, the accurate monitoring of the signal which is being captured by the recording medium. Once there is a more reliable definition of the encode side of the system, it gives the manufacturers of, and the listeners to, domestic music reproduction equipment a better reference from which to make their own choices and decisions about how to get their desired ‘best’ out of the recording. The wider the tolerances are at the ‘encode’ side of the system, then the less consistent will be the ability of the reproduction systems to decode faithfully what the artistes and producers intended the listeners to hear. Arbitrarily designed control rooms do not aid the search for better standards of reproduction, because they are dependent upon far too many variables.

Toole highlighted the above point very forcibly in Section 2.4 of reference 1: Reflexions and Absorption of Sound – Effects in Time and Space. This is not a simple subject, because:

1.  The sounds radiated from loudspeakers in different directions are not the same;

2.  The frequency dependent absorption properties of reflecting surfaces are not the same;

3.  Listeners respond differently to sounds of different frequency;

4.  Listeners respond differently to sounds of different temporal structure, e.g. impulsive or sustained;

5.  Listeners respond differently to sound arriving at different times relative to the direct sound;

6.  Listeners respond differently to sounds arriving from different directions;

7.  Listeners respond differently to sounds in the presence of reverberation;

8.  Listeners have many different perceptual responses; and

9.  All of the preceding interact with each other and, to some extent, with the recording that is being auditioned.

That these interrelationships exist in domestic situations is incontrovertible, but surely all efforts should be made to remove as many of them as possible from the control rooms. The Non-Environment approach goes a long way towards achieving the lowest realistic number of room related variables.

Most domestic listeners want to hear music in a way that is pleasing. This is a valid requirement as they are seeking enjoyment, and they are at liberty to manipulate the above variables to suit their own requirements. However, what is pleasing should not be confused with what is on the recording medium. Stereo spaciousness can be very pleasing, but its presence in a domestic environment, or if created in a control room of any given design, is by no means necessarily an inherent property of what is on the recording. The use of early reflexions and reverberation can increase the stereo listening area, enhance the stereo listening pleasure, and extend it beyond the normal ‘stereo seat’ position,8 but such techniques often compromise the detection of fine detail in low level signals, which in a monitoring situation risks allowing problems to pass by unnoticed.

In most truly professional studios, control rooms have already tended towards being less reflective than domestic listening rooms, undoubtedly because of a number of the abovementioned reasons. Many professional recording personnel tend to prefer a more direct sound, even when listening for pleasure, as reported by Flindell et al.9 In the paper ‘Subjective Evaluation of Preferred Loudspeaker Directivity’ they noticed that when their listening test results were separated into groups of naïve and professional listeners, the preferences of the two groups were very different. A few of the professional listeners even preferred frequency contoured reflected energy, which mimicked the conditions frequently encountered with the more directional loudspeakers in many control rooms. Many of the naïve listeners strongly favoured the spaciousness and extra high frequencies in the reflected sound, which were more typical of omni-directional (or multi-directional) loudspeakers in conventional rooms. No doubt there is a considerable degree of conditioning influencing the results for the professional listeners. Spending much time working in the conditions in which they do perhaps makes them more accustomed to hearing direct sounds. On the other hand, it is equally possible that as they are accustomed to listening for detail, such habits travel home with them.

The record and reproduce (studio and home) ends of the recording process have always been making their different demands, and it does not logically follow that the listening environment should be the same in both situations. Again quoting from Toole’s paper1:

Strong reflected or diffused sounds from behind can seriously impair the clarity of the virtual sound images between the loudspeakers. Even at what appear to be safe distances the same can be true if reflecting or diffusing surfaces are large. A simple test is to reproduce monophonic pink noise at equal levels through both loudspeakers. For a listener on the axis of symmetry, the result should be a compact auditory image midway between the loudspeakers. Moving the head slightly to the left and right should reveal a symmetrical brightening, as the acoustical cross-talk interference is changed, and the stereo axis should ‘lock in’ with great precision. Start close to the loudspeaker and then move further away. It would seem a fundamental (minimum?) requirement that one should be able to find a stereo axis, and hear a clear centre image, in any position where critical judgements are made.

If the new generation of cross-talk cancelling binaural and three-dimensional simulation systems is to be truly successful, a ‘clean’ acoustical path to the ears may be an absolute necessity. If a listening room garbles the crosstalk itself, it will most certainly garble the cancellation.
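The pink-noise check described in the quotation above needs a test signal that is identical in both channels. As a rough sketch of how such a file might be prepared, the following uses only the Python standard library. The Voss-McCartney style generator gives an approximately pink spectrum rather than a calibrated one, and the file name, duration, sample rate and peak level are all arbitrary assumptions.

```python
import random
import struct
import wave


def pink_noise(n_samples, n_rows=16, seed=0):
    """Approximately pink noise in [-1, 1] (Voss-McCartney style).

    Several random sources are summed; source k is refreshed every
    2**(k + 1) samples, so slower sources add low-frequency energy.
    """
    rng = random.Random(seed)
    rows = [rng.uniform(-1.0, 1.0) for _ in range(n_rows)]
    total = sum(rows)
    out = []
    for i in range(1, n_samples + 1):
        k = (i & -i).bit_length() - 1  # index of the lowest set bit of i
        if k < n_rows:
            total -= rows[k]
            rows[k] = rng.uniform(-1.0, 1.0)
            total += rows[k]
        out.append(total / n_rows)
    return out


def dual_mono_frames(samples, peak=0.5):
    """16-bit interleaved stereo frames with the same sample in both channels."""
    scale = peak * 32767.0 / max(abs(s) for s in samples)
    frames = bytearray()
    for s in samples:
        value = int(round(s * scale))
        frames += struct.pack("<hh", value, value)  # left, right
    return bytes(frames)


def write_wav(path, frames, sample_rate=48000):
    """Write pre-packed 16-bit stereo frames to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


if __name__ == "__main__":
    noise = pink_noise(10 * 48000)  # ten seconds at 48 kHz
    write_wav("pink_mono.wav", dual_mono_frames(noise))  # hypothetical file name
```

Because both channels carry exactly the same samples, any broadening or displacement of the central image heard on playback is attributable to the loudspeakers and the room rather than to the signal.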

In a paper in the Journal of the Audio Engineering Society10 in 1986, Jack Wrightson wrote:

The problem in the context of studio monitoring is that, regardless of the conditions, the room/monitor-loudspeaker combination places its imprint on all that transpires. For this reason a control room should be neutral, it should add as few sonic colourations as possible to the sound generated by the monitor loudspeakers. In this context, poorly designed loudspeakers should exhibit their flaws; well-designed loudspeakers should demonstrate their assets. The aural purpose of a control room is to provide the best possible free-air representation of the signals carried by the studio’s audio system.

Surely, the above conditions are most ideally met by the Non-Environment rooms of the type being discussed here. In this type of room, the conditions for neutrality and room-to-room compatibility would seem to be considerably greater than for any other concept of control room currently on offer. The number of variables in Toole’s list in the previous paragraphs is significantly reduced.

1.  Off-axis anomalies play little part in the proceedings.

2.  Loudspeaker design is simplified.

3.  As most reputable monitors have reasonably flat on-axis responses, the perceived difference when mixing with different monitors should be less than is all too often currently the case.

4.  Reduced room decay time prevents the masking of low-level detail, an important factor in the ‘quality control’ process.

5.  Reduced room decay minimises confusing timbre colouration caused by the room.

6.  Reduced room reflexions enable precise stereo imaging, albeit over an area which is a function of room size and monitoring geometry.

7.  Reduced room reflexions allow the detection of unwanted phase anomalies which can result from the over use, or inappropriate use, of effects processors.

8.  Minimising room effects allows the various persons in the room to perceive the same musical balance between the instruments.

9.  Reduced room effect allows the clearer perception of the ambience of the recording spaces or the use of artificial reverberation effects, and hence it is easier to judge their appropriateness, or otherwise, to the recording.

10.  Reduced room effect gives a greater possibility of working in other rooms of similar nature on a single recording project, even if the rooms are physically quite different, with a minimum of acclimatisation to the new location.

If the greatest price that must be paid for these advantages is a more restricted stable stereo imaging area in the smaller rooms, then it would seem to be a small price. When a mix is being built up, the desired timbre of an instrument can sometimes need to be changed in order to avoid masking by other instruments as they are introduced. Similarly, the optimum balance between the individual instruments can change. Just about the only thing that is usually static during the build up of a mix is the position of the instruments in the stereo panorama. Even in the smallest rooms where the stereo imaging will be true over perhaps only the space of one seat, then that seat will almost certainly always be available for occasional reference. Nothing about the imaging will suddenly change due to the dynamics of the mixing process.

In very small rooms, however, one should also consider the fact that the monitoring loudspeakers are often forced into positions where they cannot possibly subtend an angle of 60°, or less, at the monitoring position. This in itself will degrade the stereo imaging stability, irrespective of the type of room in which they are being used. Nevertheless, to subtend an angle of less than 60° in a small room would be likely to put any mixing personnel, other than the person on the centre line, outside of the loudspeaker pair, i.e. to the left of the left loudspeaker, for example. This situation would be less desirable, overall, than the less stable imaging produced by the greater subtended angle created by the wider spacing of the loudspeakers. Any comparisons of the stereo imaging in rooms of different design concepts should always take into account any differences in subtended loudspeaker angles, or the comparisons would be irrelevant.

Obviously, for the studios involved in the production of radio dramas or the like, where much more movement of the sound images is likely, the order of monitoring priorities may be somewhat different. In those cases, the greater use of dynamic panning, plus the possibility of having more people involved in the mixing process, may lead to a requirement for a large listening area over which the stereo sound stage is more stable. Perhaps this would take priority over the need for more absolute knowledge of the timbre of the sounds. However, the title of the paper on which this chapter is based did state ‘… for Stereophonic Music Recording Studios’.

If one is recording music for retail sale, and one must make monitoring compromises, then surely it is better that if there is one thing likely to be less easy to constantly monitor it should be the one thing which is least likely to vary. As not much tends to change in the static panning of a mix, then the stereo panorama is a good candidate for less constant monitoring. It should also be remembered that in larger rooms, the small ‘sweet-spot’ problem does not exist, and in the low decay time rooms the true imaging is better than in many other rooms, with all their attendant individual characteristics. Non-Environment rooms show stereo as it is recorded. If stereo is not enough, over two loudspeakers, then it is the format that should be criticised, not the rooms that show its failings. Surround sound systems are addressing this limitation to good effect.

Also noted in Toole’s paper1 were studies by Kuhl and Plantz11 and Kishinaga et al.12 Kuhl and Plantz, using only professional sound engineers as listeners, found that for dance and popular music, plus voice and radio drama, the preferences were for monitoring that was essentially the direct sound from the loudspeakers. At home, the majority of these same listeners, if listening to symphonic music, preferred a more reflective environment. Kishinaga et al. concluded from their investigations ‘that in designing a listening room, optimum arrangement of absorbing and reflecting materials differs depending on the purpose of listening’. Recording/quality control and listening for enjoyment are very different purposes. Toole went on to say ‘some recordings are clearly better matched to certain styles of reproduction than others. The situation (standardised listening conditions) would appear to be far from resolved’.

Indeed so, at the ‘decode’ or reproduction end at least, where tastes and preferences lead to different conditions for maximum enjoyment of the music. However, if these same variables are allowed to affect the encode process in the studio control room, then it only leads to chaos in trying to decode-to-taste any set of non-standard encodings. Again quoting from Toole: ‘In studio monitoring the general rule is to provide listeners with a sound-field that is predominantly direct. In these conditions, the principal impression of direction, image size and space are those that can be provided by the stereo signal itself’.

Surely, this is all that we can aim for in the studios. If we concentrate on what is on the recording medium, then the provision of a more consistently monitored product will allow the music-buying public to optimise their own listening conditions to suit their own pockets and preferences. Trying to guess what these conditions may be does nothing but harm to the encode process, and leads to absurd magnifications of the problems at the decode end. This being the case, the monitoring of the stereo in the Non-Environment rooms, without any enhancement or embellishments for greater enjoyment, would seem to be ideally suited to the production of recordings to a more consistent standard of reference, which should in turn make life easier for mastering facilities and the manufacturers of domestic equipment. Whatever that equipment may seek to achieve, its design and production would be made much easier without the often unintentional variability of the recorded material, affected as it is by the vagaries of current control room monitoring.

16.10 A Parallel Issue

In 1986, Stanley Lipshitz published a paper13 on the subject of the spaciousness and airiness of different techniques of recording using spaced microphone techniques. The following quotations are taken from that paper, and many parallels can be drawn between the lack of detail and false spaciousness of spaced microphone techniques, and the loss of detail perception associated with the false spaciousness which results from anything less than ‘direct’ monitoring.

On perceived spaciousness:

I believe that spaced-microphone techniques are fundamentally flawed, although highly regarded in some quarters, and that coincident-microphone recordings are the correct way to go. The ‘air’ and ‘depth’ so valued in spaced-microphone recordings are shown to be largely the artefacts of phasiness due to the microphone spacing, and not acoustic ambience at all.
I shall try to make a strong case for the use of single-point (i.e. coincident) stereophonic microphone techniques in preference to widely spaced microphone configurations.
I am aware that I am treading on dangerous ground here, in that an aesthetic judgement is called for when attempting to rate stereophonic recordings as ‘good’ or ‘bad’. Often it is the case that the more ‘ethereal’ the sound images appear, then the better the system is appreciated. Such systems can be regarded, however, only as attempts at pseudo-stereophony.
I consider such blurring to be a defect, although I will admit that some people like soft-focus (photographic) lenses.

On stereo reproduction:

The problem of freeing the listener from the ‘stereo seat’ by enlarging the region within which the image remains reasonably free from distortion is, in my view, a reproduction related question rather than one bearing directly upon the recording technique.
If more than two transmission channels are available, one can do much better.
For such reproduction systems (for example Ambisonics) an acoustically dead listening room would be preferable. It is my belief that as more sophisticated reproduction systems become available, the correct trend will be toward more anechoic listening environments.

On the psychoacoustics of stereo:

Of primary concern is the fact that the ear on the side of the earlier loudspeaker need not receive the louder signal, and indeed at low frequencies does not! So the inter-aural level differences produced at low frequencies do not always reinforce the image produced by impulsive sounds. Sometimes, the low frequency image pulls in the opposite direction from the image of the transient, broadening and smearing the overall image.
So we must consider stereo hearing as distinct from natural hearing, and actually quite unnatural – it is in fact an artificial creation.

And, on the impact of modern recording technology:

The last few years have seen a dramatic improvement in our ability to accurately record, distribute and reproduce musical signals, and the benefits of this digital technology are now available to consumers in their homes.
What is on the master tapes is now laid bare without the masking effects of the earlier technology, and what the consumer can now hear is frequently unpleasant.
I feel that the source material (not referring to electronic music here) is now the weakest link in the chain from the artist to the listener, and that improvement here requires an enlightened reassessment of what goes on in the process of capturing the original sound and reproducing it through two loudspeakers.

All of the above quotations from Stanley Lipshitz would seem to point to the need for detailed and direct monitoring as the only means of ‘hearing into’ what is really on the recording medium, and that spaciousness should, as discussed elsewhere in this chapter, be an aspect of the final reproduction environment. For detailed monitoring, it would appear that spaciousness and the resolution of fine detail are largely mutually exclusive. It should be recognised, however, that the authors (Newell and Holland) may possess a sensory bias towards the more detailed types of monitoring, as they admit to having a general dislike for soft focus photography; but so, it would seem, does Stanley Lipshitz.

So ended the AES conference presentation, but perhaps before closing this chapter it would be informative to look at how the concept initially developed. Firstly, however, this must be set in the context of the circumstances of the time.

16.11 Prior Art and Established Ideas

The traditional approach to control room design had been to achieve a smooth frequency response in a room with a decay time that was deemed representative of domestic, end-user, listening rooms. Apart from the standard decay time range relating to these outside-world conditions, there were two other proclaimed reasons for the choice. The first was the desired compatibility of mixes when work was transferred between different control rooms, and the second reason was the perceived general comfort of the personnel working in a ‘typical’ ambience.

Nevertheless, different specifications were drawn up which sometimes were excessively broad. The European Broadcasting Union (EBU) specifications, for example, are so broad as to be all but worthless, at least for serious music recording purposes. The fact is that the political problems incurred in trying to settle on a specification that 28 countries could agree to are something from which only ‘men in grey suits’ could be expected to gain any satisfaction. Such specifications would be all but worthless if applied to commercial music control rooms because many rooms can be built, entirely within the specifications, which are all very subjectively dissimilar. This does nothing to address the control room-to-control room compatibility problem, and if control rooms can be so dissimilar to each other, how can they all relate to some common concept of the outside world?

Furthermore, even if rooms are built to an acceptably tight standard, then this still offers little overall chance of achieving any significant level of compatibility as long as the choice of position for the monitor loudspeakers remains entirely arbitrary. Inevitably, if one moves a loudspeaker to different positions in any room other than an anechoic chamber, the cumulative response of the direct and reflected sounds will be different for each position. This renders flush mounting of the monitor loudspeakers almost mandatory if compatibility from room-to-room is desired. At least if one mounts the loudspeakers flush in a wall there is little risk that people will unwittingly move them from their preferred location, and hence perhaps also move the performance of the room/loudspeaker combination out of its desired range.

The compatibility issue, both between rooms and to the outside world, is well addressed by the Non-Environment concept of control room design. The principle behind the idea is that any variables that do exist between rooms are acting upon a very small amount of room ambience. There is therefore less subjective variability in the room response because there is so little room response for the variables to act on, at least from the direction from which the monitors ‘speak’.

This concept moved many such rooms outside the more accepted, legacy-standard parameters for room design, but experience has shown that working in this way has not produced mixes that subsequently sound ‘wrong’ when played in rooms that are more conventional. (At least not beyond any subjective or objective ‘wrongness’ inherent in the loudspeakers and/or the rooms in which they are being auditioned.) Surely, this should be the case. When a mix is done in a room which is ‘somewhere within’ the accepted specifications, it can sound either similar, better, or worse in a domestic room, depending upon how the control room’s own inherent built-in misbalances coincide with the idiosyncrasies of the rooms in which the music is later heard. Ideally, a control room should add nothing to the recorded sound, or its monitoring conditions can only be considered arbitrary.

Much of this of course is enshrined in the philosophy of ‘near-field’ monitors, where loudspeakers of various types are used at close ranges. By this means, the characteristic sounds of a clutch of loudspeakers can become well known to a great number of recordists. If these are used at close range, particularly inside the critical distance (beyond which the direct sound sinks below the reflected sound), then even if such loudspeakers are not particularly well liked, at least they would be well known. Under such circumstances, room-to-room transfers of work can be judged with minimal variation. Unfortunately, such monitors frequently lack good transient accuracy, rarely do they have the transparency available from the better, bigger systems, and almost universally they fail to be able to monitor the full frequency range of the recordings. They usually also lack the necessary dynamic range to be able to assess properly low-level details in the presence of peak level sounds. Nonetheless, there is little in the concept of the close-field monitoring approach which in any way conflicts with the philosophy of the ‘Non-Environmentalists’. What the Non-Environment approach offers is the possibility of enjoying the positive aspects of the close-field approach without its restrictions and limitations. Effectively, it extends the size of the close-field to that of the entire room. In many cases, the critical distance lies outside the boundaries of the rooms. This is especially so in the case of small rooms, and this is a very important point.
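The way in which the critical distance grows as the room decay is reduced can be estimated with the common Sabine-based approximation Dc ≈ 0.057 √(QV/RT60), with V in cubic metres, RT60 in seconds and Q the loudspeaker directivity factor. The sketch below is an indication only: the approximation assumes a diffuse reverberant field, which a heavily absorbed room does not really possess, and the room volume, decay times and Q values are hypothetical figures chosen for illustration.

```python
import math


def critical_distance(volume_m3, rt60_s, directivity_q=1.0):
    """Approximate critical distance in metres (Sabine-based estimate).

    volume_m3     -- room volume in cubic metres
    rt60_s        -- decay time in seconds
    directivity_q -- directivity factor of the source (1.0 = omnidirectional)
    """
    return 0.057 * math.sqrt(directivity_q * volume_m3 / rt60_s)


if __name__ == "__main__":
    volume = 5.0 * 4.0 * 3.0  # hypothetical small control room, 60 cubic metres
    for rt60, q in ((0.4, 1.0), (0.4, 5.0), (0.1, 5.0)):
        dc = critical_distance(volume, rt60, q)
        print(f"RT60 {rt60:.1f} s, Q {q:.0f}: critical distance about {dc:.2f} m")
```

With these assumed figures, a fairly directional source in a room with a 0.1 s decay has an estimated critical distance of about 3 m, which is of the same order as the dimensions of the room itself. This is the sense in which the whole room can behave as a close-field.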

When people look for convenient buildings in which to site studios, they are often forced into compromises. Access to the toilets may be a problem, for example, requiring a corridor where the control room would ideally be sited. Therefore, control rooms do not always take priority in any absolute sense, and they must often be sited less than optimally. To try to get a standardisation of size and shape is totally impossible. As we have discussed previously, different rooms of conventional design will have differently perceived acoustic characteristics if the sizes, shapes and materials of construction are not identical. Hence, if standardisation is impossible, and the variability in room geometry is inevitable, the control rooms can never be considered to be performing to any tight reference standard unless the monitoring acoustics can be reduced to almost zero, because it is only zero which can be multiplied or divided by a range of variable factors without changing the result.

The Non-Environment approach advocated here adequately addresses the above problems, and it does so in rooms which remain comfortable for people to work in for hour after hour, week after week. In practice, the much-vaunted greater working comfort of the more traditional control rooms never became an issue, and furthermore, reliable monitoring added its own comfort factor. Until this option (and its close relatives) became available, acousticians had not been getting anywhere near as far as the industry needed in terms of achieving these compatibility levels. The whole close-field monitoring concept was born out of the recording staff taking the problem into their own hands, and dealing with it in the only way that they knew how: sitting close to the loudspeakers. The fact that they chose to suffer the limitations and privations of such monitoring techniques, rather than work with endless room-to-room ‘correction factors’, was a strong statement of their dissatisfaction with the general state of monitoring affairs.

16.12 The Zero Option – the Origins of the Philosophy

In fact, it was the compatibility problems that prompted Tom Hidley’s early retirement at the end of the 1970s. Until then, it was widely held that the one-third octave pressure amplitude response in the room was the strongest governing factor in the assessment of a room’s ‘rightness’, despite the fact that it took almost no notice of the phase response, and hence the transient response. The whole ‘room voicing’ concept by means of graphic equalisers is now thoroughly discredited and passé, but Tom Hidley, amongst many others, learnt that one the hard way.

It was a chance set of circumstances in the early 1980s that drew him towards the Non-Environment concept. A previous client of his called him just after his ‘retirement’ and asked if he could make some changes to an older room. One current catchphrase at that time was that rooms were ‘over-trapped’. The bass was not being perceived in the correct balance with the mid and high frequencies. Nobody seemed to be able to clearly define exactly what was wrong, because the spectrum analysers were reading flat, but it was a feeling that had gained some widespread acceptance. After making some modifications to the room, one of its subsequent users was Stevie Wonder, who, of course, is blind. It was noticed that when he was speaking about the sound, he kept pointing to the loudspeakers, but the places to which he was pointing were not consistent with the actual loudspeaker locations. Realising that something was amiss here, Tom Hidley thought that there must still be an excess of reflexions. He decided to increase the room’s absorption, even though this was totally contrary to the suggestion that they were already ‘over-trapped’. Nevertheless, the move was greeted with general approval, so he went back into his self-imposed retirement with quite a lot to think about.

A couple of years later, he was talked into visiting a company in Japan who wanted two studios of similar style to some of his earlier rooms in Tokyo. Hidley was a reluctant participant in all of this, because the compatibility impasse had still not been resolved. He eventually agreed that he would design one control room as the client wanted, but that the other would be designed to his own, latest, but still untested thinking. However, he imposed the additional condition: whichever of the two finished control rooms was least liked by the musicians and engineers would have to be demolished, and rebuilt according to the design of the preferred room. The ‘winning’ room was that of his first Non-Environment design, but it was radical and different, and was not immediately familiar to all its users. Nevertheless, it was also in Tokyo that designers such as Sam Toyoshima and Shozo Kinoshita were operating, and Toyoshima was also a strong advocate of very low decay times, hard front walls, flush mounted monitors and maximally absorbent rear walls (see also Section 13.1.5). Japan was therefore fertile ground for this sort of concept, which was almost diametrically opposed to the LEDE concept that was sweeping through the USA and was beginning to gain some acceptance in Europe. This was also the time of Tom Hidley’s first use of the vertically aligned Kinoshita monitors, as shown in Figure 13.9.

Hidley soon came out of his retirement to produce more rooms, encouraged by what he had heard in Japan, but many of these rooms were monstrous by the standards of the day, as the low frequency aspects of the absorption techniques were very space consuming (see Figure 13.7). In the meantime, the author had been trying to achieve consistency, in his own way, by pointing rather directional monitors at an absorbent back wall, but the problem remained of what to do about the variation in the omni-directional low frequencies between rooms of different shapes and sizes. A means for more effective low frequency control was badly needed.

Coincidentally, Tom Hidley was also looking for more effective low frequency absorption, but this time for the infrasonic region. He was designing his ‘10 Hz’ control rooms for BOP TV in Bophuthatswana. Some joint research was therefore proposed that culminated in the one-tenth scale modelling work by Luis Soares at the ISVR. During the course of this work, Keith Holland was also coming to the end of his doctoral research on mid-range horns, and there came a time when both he and Luis Soares needed more measurements from real studios.

A weekend trip to London was arranged, in order to visit five different control rooms, each having a different designer and design philosophy. There was only one of Hidley’s Non-Environment control rooms in London at that time (1989), and neither the author, Holland nor Soares had seen or heard one before. After the five control rooms had been visited, measured and listened to, including a then current Newell design, the consensus was unanimous that the Non-Environment room at Nomis was closest to what all three people present agreed that a control room should be. Luis Soares went back to his one-tenth scale modelling of the absorption systems, but it was not going well. (The scaling of complex interactive systems is notoriously difficult.) What was needed was a full-scale model, but there was no room at the ISVR (Institute of Sound and Vibration Research) where such a model could be built.

A flexible client for a studio design was subsequently found in Liverpool, UK, and in the new year of 1991 a full-scale model was begun, in which Holland and Soares could continue their experimental development work prior to the studio opening for business. (Purely coincidentally, the studio was called The Lab, as it was built in an old chemical analysis laboratory.) The control room was built to the same general principles as Hidley’s Non-Environment room in London, though the shape was different, the size was different, the building shell was different, the materials of construction were different and the monitor system was different, although conceptually compatible to a large degree with the Kinoshitas in Nomis.

When everything was finished, as shown in Figure 16.8, Holland and Soares went to Liverpool with a car full of test equipment from the ISVR. Once all had been confirmed to be performing as intended, it was time to play some well-known CDs. Everybody in the room was stunned by the detail which could be heard in the music, but what was more, those in the room who knew Hidley’s Non-Environment rooms were hearing something that they recognised. The degree of compatibility between Nomis and The Lab was remarkable, and yet there was hardly anything the same in the construction materials or the equipment. What was clear was that it was the concept that was giving rise to the compatibility, not the details: hard front wall with built-in monitors, hard floor and all other surfaces maximally absorbent, which could be trimmed to taste to add any desirable extra life.

Figure 16.8: A complete room. ‘The Lab’, at the Liverpool Music House (UK), in 1991. In this case, the floor is wooden parquet blocks.

The control rooms were still rather large, though; The Lab being about 48 m² and Nomis much larger (see Figure 18.1). The pressing problem was how to make the concept work in smaller rooms, which could not afford the loss of space for the absorbers. The breakthrough came with the use of membrane absorbers in front of panel absorbers, which provided reasonably useful wideband absorption in a depth of only about 30 cm, although 80 to 100 cm was necessary if more ideal results were required. It was like lining the surfaces of the rooms with a limp bag. Although there were some differences between the large and small rooms, the concept proved to be sufficiently robust to retain the compatibility in the music mixing. However, it was not until about ten rooms had been built that it could be reasonably assumed that good compatibility could be achieved from a wide range of shapes and sizes. This was very much an acid test, as the first ten control rooms referred to ranged from 80 m² down to 12 m² (240 m³ to 30 m³), although the very small room was not surrounded by a heavy, rigid containment shell, and therefore some of the low frequencies escaped to the corridor and lounge areas. The finished small room is shown in Figure 16.9.

The first obstacle faced was that the rooms did not sound quite like the previously accepted ‘professional’ rooms, but eventually people began to realise that the mixes were translating well to the world at large. The usual problem of taking something home to listen to it, then going back to the studio and remixing it, was becoming a thing of the past. Three years later, people were moving between the control rooms with absolute confidence and only minor compatibility problems. Since then, the number of rooms using this concept has multiplied, both for music mixing and in cinema control rooms for the mixing of soundtracks, as described in Chapter 21. At the other extreme of the size range, Figures 16.10 and 16.11 show a control room constructed in a space of only 17 m² in a reinforced concrete ‘bunker’, which was a very different situation to the starting conditions for the room shown in Figure 16.9. However, there was an available ceiling height of around 4 m, and so the primary low-frequency absorption was due to the sandwich structure of the ‘box’ (the basic construction as shown in Figure 5.5) and the 1 m, or more, of absorption in the ceiling.

It was the success of the compatibility of these rooms, both between themselves and with the outside world, and not only on conventional home systems, but also on headphones and in cars, which prompted the writing of the AES paper on which the first part of this chapter was based. It was ‘The Zero Option’: if something is going to be acted upon by a variable multiplier, then the smaller that something is, the smaller will be the variation. If a room is going to impose its character on the sound of the monitors, then the smaller the room sound is to begin with, the less it can affect the monitoring, even if it is variable from room to room. It is the concept of close-field monitoring, but with big monitor performance. Some construction and performance details can be found in Appendix 1, which follows Chapter 26.
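The arithmetic behind the Zero Option can be sketched with a deliberately simple toy model, which is an illustration only and not a method from the AES paper. It assumes that what is heard is the direct sound (normalised to 1.0) plus a room contribution, and that each room scales that contribution by its own factor. The function name, the scaling factors and the contribution values are all hypothetical.

```python
def monitoring_spread(room_contribution, room_factors):
    """Spread of the perceived result across a set of rooms.

    Toy model: perceived = 1.0 (direct sound) + factor * room_contribution,
    where each room applies its own factor to the same nominal contribution.
    """
    results = [1.0 + factor * room_contribution for factor in room_factors]
    return max(results) - min(results)


if __name__ == "__main__":
    # Hypothetical factors representing rooms of differing shape and size.
    factors = [0.5, 1.0, 2.0]
    # Progressively smaller room contributions relative to the direct sound.
    for contribution in (0.4, 0.04, 0.0):
        spread = monitoring_spread(contribution, factors)
        print(f"room contribution {contribution:.2f} -> spread {spread:.3f}")
```

In this model the room-to-room spread is proportional to the room contribution, so reducing the contribution tenfold reduces the spread tenfold, and a contribution of zero leaves no variation at all, however different the rooms may be.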

Figure 16.9: A very small Non-Environment control room: Noites Longas, Redondos, Portugal. The room was built in a shell of only 3 m × 4 m, with a sloping roof, but the acoustic situation was helped by the fact that the containment walls were not very rigid. The low frequency isolation to the lounge, behind the control room, was not considered to be very important. The principal performing room is shown in Figure 5.23. Both rooms were floated, and situated on the top floor of a house.

Of course, the rooms had their critics. ‘Boxes full of Rockwool … and the end of acoustic design’ was the comment of one designer in the international recording press. ‘Like every other neat theory … I guarantee that it is an oversimplification’, was the reaction of another. Such scepticism is perhaps healthy, as it is only out of debate that ideas are usually developed. Nevertheless, all this is of no interest to the musicians and recording staff who neither know nor care about room acoustics. They just want rooms which make their daily work easier, more consistent, more rewarding, and which can help them to relax into their work with the confidence that hearing is believing. In the opinion of the author, the Non-Environment approach can most consistently deliver these things.

Figure 16.10: (a) and (b) Two views of a control room under construction in a space of only 17 m², but with around 4 m height, surrounded by a concrete isolation bunker.

Figure 16.11: The finished control room. Neo Musicbox, Aranda de Duero, Spain, 2008. (See also front cover.)

16.13 Summary

If recording personnel are choosing to work in the close-field of small monitors, despite suffering the limitations of small loudspeaker boxes, then this raises the question of why the close-field of the large monitors should not be extended by reducing the room decay time from the monitoring direction.

This can be done, and without unduly deadening the room ambience from the direction of the people within it.

Highly damped rooms also exhibit less spatial variation of the sound, which is normally caused by the room modes.

Domestic listening conditions are very variable. It would seem unwise to allow the concept of average listening conditions to influence unduly the control room acoustics – the result of such influence is usually a lack of certainty in the mixing environment.

If conditions cannot be standardised at the mixing stage, then how can hi-fi equipment be made to match the reproduction response to that of the production environment?

The Non-Environment rooms have maximally absorbent rear walls, side walls and ceilings.

The monitor loudspeakers are flush mounted in a hard front wall, which is reflective to the sounds of people in the room, but from which the musical sounds from the loudspeakers cannot reflect, because they are propagating away from it.

The floors are generally hard.

Such rooms, by the absence of their monitoring ambience, are very consistent in the monitoring character from room-to-room.

The rooms should not be confused with the ‘bass trapped’ rooms of the 1970s and 1980s, which could lead to bass-heavy mixes. The Non-Environment rooms are ‘all-trapped’.

Spatial sensations created in a control room, which are not an inherent part of the recording, cannot be considered part of a true monitoring process. Any spaciousness enhancement should be left to the final reproduction room acoustics.

Spaciousness and the perception of detail may be mutually exclusive. For critical monitoring it is probably better to aim for the more accurate perception of detail.

In Non-Environment rooms, loudspeaker design is simplified because less importance needs to be attached to off-axis responses.

Characteristics such as ‘transparency’ and ‘depth’ can also be more easily assessed without the confusion of room ambience.

Stereo, as we currently know it, was conceived as a sensation for a ‘stereo seat’. Control rooms that compromise their clarity of monitoring in order to try to broaden the stereo perception area are trying to get something from two-loudspeaker stereo that it was never intended to provide.

Surround sound systems are the solution to more spacious stereo.

The low decay-time control rooms tend to achieve the lowest number of room-related variables of any of the current control room philosophies.

There is a lot of learned opinion that leans towards lower room decay times for more accurate monitoring conditions.

References

1  Toole, Floyd E., ‘Loudspeakers and Rooms for Stereophonic Sound Reproduction’, Proceedings of the AES Eighth International Conference, Washington DC (1990)

2  Newell, P. R., Holland, K. R. and Hidley, T., ‘Control Room Reverberation is Unwanted Noise’, Proceedings of the Institute of Acoustics, Vol. 16, Part 4, pp. 365–73 (‘Reproduced Sound 10’ Conference, Windermere, UK, 1994.)

3  Newell, Philip, ‘The Non-Environment Control Room’, Studio Sound, Vol. 33. No. 11, pp. 22–9 (November 1991)

4  Newell, Philip, Studio Monitoring Design, Focal Press, Oxford, UK (1995)

5  Stark, Eric, ‘The Hidley Infrasound Era’, Studio Sound, Vol. 37, No. 12, pp. 52–6 (December 1995)

6  Gerzon, Michael, ‘Three Channels, The Future of Stereo?’, Studio Sound, Vol. 32, No. 6, pp. 112–21 (June 1990)

7  Davis, Don and Davis, Carolyne, Sound System Engineering, 2nd Edn, Howard Sams, Indianapolis, IN, USA (1987)

8  Moulton, D., Ferralli, M., Hembrock, S. and Pezzo, M., ‘The Localization of Phantom Images in an Omni-Directional Stereophonic Loudspeaker System’, AES 81st Convention, Preprint No. 2371 (1986)

9  Flindell, I. H., McKenzie, A. R., Negishi, H., Jewitt, M. and Ward, P., ‘Subjective Evaluations of Preferred Loudspeaker Directivity’, AES 90th Convention, Preprint No. 3076, p. 6, Paris (1991)

10  Wrightson, Jack, ‘Psychoacoustic Consideration in the Design of Studio Control Rooms’, Journal of the Audio Engineering Society, Vol. 34, No. 10, pp. 789–95 (1986)

11  Kuhl, W. and Plantz, R., ‘The Significance of the Diffuse Sound Radiated from Loudspeakers for the Subjective Hearing Event’, Acustica, Vol. 40, pp. 182–90 (July 1978)

12  Kishinaga, S., Shimizu, Y., Ando, S. and Yomaguchi, K., ‘On the Acoustic Design of Listening Rooms’, presented at the 64th Convention of the Audio Engineering Society, Preprint No. 1524 (November 1979)

13  Lipshitz, Stanley P., ‘Stereo Microphone Techniques … Are the Purists Wrong?’, Journal of the Audio Engineering Society, Vol. 34, No. 9, pp. 716–35 (September 1986)

14  Newell, Philip R. and Holland, Keith R., ‘A Proposal for a More Perceptually Uniform Control Room for Stereophonic Music Recording Studios’, presented at the 103rd Convention of the Audio Engineering Society, New York, Preprint No. 4580 (1997). [Reference 14 can be found at the beginning of the chapter.]
