Chapter 15: Studio monitoring: the principal objectives

Chapter 15

Studio Monitoring: The Principal Objectives

The functions of a monitor system. Basic points of reference. Differences in perception. Different balances of compromises. Desirable assets. Loudspeakers as information channels. Close field monitoring. The NS10M.

15.1 The Forces at Work

From the previous chapter it should now be apparent that we are not dealing with perfection when we talk about music reproduction via loudspeakers. The late Richard Heyser (inventor of the Time Delay Spectrometry measurement system, and one of the audio giants of the 20th century) said that ‘In order to fully enjoy the intended illusion of a recording, it is necessary to willingly suspend one’s belief in reality. All recording and reproduction via two loudspeakers is illusory’.

The classical concept of studio monitoring has been ‘The closest approach to the original sound’, as the Acoustical Manufacturing Company stated in their advertising literature of the 1940s. David Moulton¹ pointed out that the loudspeaker itself, without formal recognition, has become the predominant musical instrument of our time. Music is being created on loudspeakers for playback by loudspeakers. Often there is no reality. Neither is there an original sound to attempt to make the ‘closest approach to’. Given the disparity between many studio-monitoring conditions, this must render the whole situation to be somewhat arbitrary.

It is also hard to get away from the fact that both the sound recording and sound reproduction industries are businesses to a greater degree than ever before. The truly professional ‘irrespective of cost’ or ‘absolute attention to detail’ approach to either one is now extremely rare. Reference was made in Chapter 1 to the building of The Townhouse studio in London in 1978, which in its day had the best of everything. Few studios could survive today if they were built at such real (i.e. inflation adjusted) costs.

Today, also more than ever before, the cost of advertising by the manufacturers who supply the professional industry, together with the cost of attending a large number of trade shows around the world, means that the sale price of much equipment must support all of these extra burdens. In fact, the cost of components represents less than 15% of the sale price of much of today’s professional recording equipment, far less than was the case in 1980. The upshot is that just because something seems to be expensive does not necessarily mean that it contains expensive components. The pursuit of true excellence is now in the hands of a very small proportion of the people involved in the music industries. The majority are in pursuit of profits, and little else. Nevertheless, the principal aim of this chapter is to take the purist viewpoint as far as possible, and to try to make clear divisions between whether the compromises or shortfalls are based on the limitations of the science and technology, or on the reality of a market-led world.

The main functions of studio monitor loudspeaker performance are directed towards enabling the users to detect things that may become problematical on the best domestic systems; to give the users the widest possible scope for artistic interpretation of what is being recorded; and to be capable of producing the desired atmosphere in the creative environment. Somewhat surprisingly to many people, many well-used studio monitor systems do not rate particularly highly on the fidelity ratings used by many hi-fi reviewers and magazines. To lay persons this may seem disturbing, but they must appreciate that the function of such systems is not primarily to create wonderful sounds in the studios, but to enable the studio to be used to produce recordings that sound wonderful when played at home by the public. Although it will always be very worthwhile to aim for maximising the fidelity of a monitoring system, other requirements, such as large size to enable reliable production of high sound pressure levels at low frequencies, ruggedness in daily use, ease of servicing, consistency of performance, and many other factors, will also be high on the list of design priorities. A super-fidelity system that fails twice a week is of little use in studios.

That studio monitor loudspeaker systems are a class unto themselves is borne out by the fact that so few loudspeakers straddle the professional/domestic divide. Monitoring loudspeakers are rarely found in other environments, not because of price, as many ‘high end’ hi-fi systems cost more than many comparable monitor systems, but because they are highly specialised, highly evolved tools. They are only a means to an end. Indeed, one reason why many good monitor systems do not automatically become expensive domestic hi-fi is that they can often be unpleasant to listen to on poor quality programme; they are too ruthlessly accurate, but it is their job to be so.

Conversely, the small monitors used in many home studio set ups, and which may be pleasant enough for domestic listening, often do not have the accuracy of resolution to make obvious during the recording process any build up of aggressive sounds, due, for example, to excessive signal processing artefacts. They have been designed to sound good, which is not much use when problem hunting. The result may often be that a recording which sounds acceptable on such loudspeakers may be less pleasant to listen to when heard on somebody’s domestic high-end hi-fi. Discs should not be released by a professional industry if they contain such badly monitored programmes. People who work on poor or ‘easy to listen to’ monitors are merely burying their heads in the sand, taking a lazy way out, and leaving the record buying public to suffer the results of their failings; all because of a lack of professionalism being shown in the recording studios.

15.2 Where is the Reference?

A significant problem exists, because no one combination of loudspeakers, amplifiers, crossovers and rooms can be optimised for the most lifelike sensory illusions for all of the different styles of music or recording techniques. The lack of correlation between measured and perceived responses only serves to complicate matters further. There can also be complex convolutions of the different performance characteristics that manifest themselves in very programme dependent ways. It still cannot be said in general terms whether a ±1 dB error in the pressure amplitude response is any more or less important than one degree in phase accuracy, or than an extra 0.1% of non-linear distortion. Even within the harmonic distortion performance, no absolute, programme independent ratio exists in terms of the relative importance of the individual harmonics. In fact, harmonic distortion measurements may be of little use at all, because the overriding source of unpleasant non-linear distortion would appear to be the intermodulation products, for which no common standards of measurement currently exist.
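To see why the three errors mentioned above resist direct comparison, it can help to express each in its own natural terms. The following is only a minimal sketch: the function names are illustrative, and the choice of 1 kHz for the phase example is an assumption made here, not something taken from the text.

```python
import math

def db_to_pressure_ratio(db):
    """Convert a level difference in dB to a sound pressure (amplitude) ratio."""
    return 10 ** (db / 20)

def percent_to_db(percent):
    """Express a distortion percentage as a level relative to the fundamental, in dB."""
    return 20 * math.log10(percent / 100)

def phase_to_time_us(degrees, frequency_hz):
    """Time shift, in microseconds, equivalent to a phase shift at one frequency."""
    return degrees / 360 / frequency_hz * 1e6

# The three errors from the paragraph above, each in different terms.
print("+1 dB as a pressure ratio:       ", round(db_to_pressure_ratio(1.0), 3))
print("0.1% distortion in dB:           ", round(percent_to_db(0.1), 1))
# 1 kHz is an assumed example frequency.
print("1 degree at 1 kHz, microseconds: ", round(phase_to_time_us(1.0, 1000), 2))
```

The results are a ratio, a relative level and a time shift. No conversion puts them on one perceptual scale, which is the point being made.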

The aforementioned ‘closest approach to the original sound’ concept of a studio monitoring system has an inherently logical feel about it. It would seem to be self-evident that we should be able to set up an acoustic guitar, for example, in a good recording environment, listen to the recording in an acoustically neutral control room via a selection of loudspeakers, and choose the most realistic loudspeaker as a good monitor. The reality, however, is very different.

The different frequencies produced by the acoustic guitar radiate from many parts of the instrument. The highs tend to come from various parts of the strings. The lows come principally from the large surface of the body and the resonance hole. The ‘mids’ emanate from the complex vibrational behaviour of the body in a way that is quite spacially diffuse. What this means is that the sound field around the acoustic guitar is very complex, and what we perceive when standing in front of it is further dependent upon the reflexions in the room and the fact that we have two ears. Once we try to collect this sound via a microphone, the fact that the microphone has a directivity pattern quite unlike our two ears means that the electrical signal that it generates will not be representative of what we hear from a point in front of the guitar. Perhaps an obvious solution would be to use a pair of microphones in a dummy head, but such a collection system only sounds natural when the pinnae (outer ears) on the dummy head are identical to those of the listener, and when listening on headphones. This is because if we play back the recording via loudspeakers, the inter-aural cross correlation that is inherent in the two signals coming from the dummy head would be processed a second time when the sounds from the loudspeakers entered our own ears. It would thus pass through two sets of outer ears, which can be avoided only by using headphones.

Another problem with the sounds from the loudspeakers is that the spacial distribution of the sound source is entirely dependent upon frequency. The lows come out of the woofers and the highs come out of the tweeters. In fact, in many cases, all the high frequencies in a stereophonic loudspeaker reproduction are generated from two points, only a couple of centimetres across (the two tweeters), which is quite unlike the distributed high frequency generation by the instrument.

Furthermore, if the control room in which the loudspeaker is situated has the same characteristics as the recording room, the room characteristics will be heard in double dose; once through the microphone and once from the excitations by the loudspeaker. Clearly, this is not accurate monitoring. However, if we listen in an anechoic room, the lack of reflexions will clearly leave the loudspeakers as being the only sources of the sound, which again is quite unlike the natural ambient ‘surround’ sound heard when stood in front of the acoustic guitar in the recording room.

If we were to take the acoustic guitar into an anechoic chamber, place a high quality pair of measuring microphones in front of it, and monitor the sound via a pair of loudspeakers in another anechoic chamber, we could, perhaps, finally achieve a degree of realism via the loudspeakers. Under these circumstances, the loudspeakers with the widest frequency responses (pressure amplitude responses), the most accurate phase responses, the lowest non-linear distortion content and the best impulse responses would probably show themselves to be the most realistic in their reproduction of the original sound. Nevertheless, if the loudspeaker directivity was taken into account, then in a more lively control room they could give rise to reflexions with strange frequency balances, which would render the overall perception to be much less natural.

Even in the anechoic circumstances, however, two people may not agree about which loudspeakers were most accurate. The fact that the sound fields from the loudspeakers are different to the sound fields from the actual instruments means that the shapes of the pinnae of the listeners (as explained in Chapter 2) will ensure that the reception of the different sound fields will create different responses in the ear canals of the different listeners. In other words, what reaches the eardrum can depend on the frequency distribution of the source in spacial terms, as well as the frequency distribution in terms of the pressure amplitude in the vicinity of the outer ear. In a test carried out in a studio in London in the 1980s,² two respected recording engineers failed to agree by 3 dB in the ‘6 kHz upwards’ range about the similarity of a monitor system to a live acoustic instrument in the studio. The question was not about what sounded right, but simply what high frequency level sounded most similar to the live source.

During research at the Institute of Sound and Vibration Research in Southampton University in 1989,³ a wide range of listeners took part in listening tests which were directed towards identifying characteristic sounds from mid-range horn loudspeakers. The question was asked for each of nine different sounds, ‘To which of the four reference loudspeakers does the test sample loudspeaker sound most similar?’ The tests were blind, and the sounds were a selection containing recordings of both transient and steady state nature. On some sounds agreement was almost unanimous, but on one particular sound and one particular sample, out of the first 11 listeners, two people thought that the sample sounded more like reference A, one like B, five like C, two like D, and one person thought that it did not sound like any of them. These were mainly people with some experience in listening critically to music, yet they could not agree on which loudspeaker sounded most similar to which. Had the question been reversed, the test loudspeaker could have been the reference sound. In that case, the question could have been asked, ‘Which of the four loudspeakers is most similar to the reference sound?’ The results would have been equally diverse.

Concurrently with the above tests, research was being done for Canon, who were assessing the value of a concept for a wide image stereo system.³ The test set-up was designed to detect spacial preferences by feeding artificial reflexions via combinations of fourteen loudspeakers selected via a six position switch. Many participants complained about switch malfunctions when no change was heard. When the results were published, it was revealed that the switch ‘faults’ were non-existent. The ‘faults’ always occurred in the same places for the same listeners when the tests were repeated, but the ‘faulty’ switch positions were different for each person. The implication was that people not only had some very well defined spacial image preferences, but that the sensations of the images differed greatly from person to person, also.

In the first of the two ISVR tests described above, the anechoic conditions and the relative uniformity of the directions of the different sounds meant that the differences perceived were in the domains of frequency and phase. In the second case, all the loudspeakers used for the simulated reflexions were nominally identical, so the differences perceived were in the spacial and temporal domains. Another outcome of the first of the tests was that some of the loudspeakers deemed to sound similar in the overall analysis showed harmonic distortion differences as great as 20 to 1. In other cases, loudspeakers with very similar distortion characteristics were considered to sound very different.

Loudspeaker systems as they exist today are far from perfect. Figure 15.1 shows a series of pressure amplitude response (frequency response) plots for a group of well-respected loudspeakers that are commonly used in recording studios. From these plots alone, and taking into account the fact that the loudspeakers all have different directivity characteristics, it should be apparent that their reflexions should sound different, and it is not hard to appreciate that they do sound different. It should also be easy to appreciate that because the response differences are so widely distributed, traditional equalisation systems could not adjust them all to a flat response.

Figure 15.1 Pressure amplitude responses of four loudspeakers, all ostensibly designed to perform more or less the same task

In the light of what has been presented in this chapter so far, it should be apparent that we do not have loudspeakers that exhibit perfect responses in any of their characteristics. Neither do we have a sufficient degree of perceptual consistency from one person to another to form an ‘expert listener’ panel to select a ‘best’ loudspeaker. Even if we did, we would no doubt get different results when the loudspeakers were auditioned in different rooms and on various types of music. The choice of studio monitoring systems therefore tends to depend to a considerable degree on the ability of the users to get the desired results when using them.

In an interview in EQ magazine in June 1993, George Massenburg, one of the world’s pre-eminent recording engineers and producers, said ‘I believe that there are no ultimate reference monitor systems, and no “golden ears” to tell you that there are. The standards may depend on the circumstances. For an individual, a monitor either works or it doesn’t. . . . Much may be lost when one relies on an outsider’s judgement and recommendations’. Also, it is worth repeating (from Section 2.4.1) that about a hundred years earlier, Baron Rayleigh, one of the true giants of acoustical research, stated ‘The sensation of sound is a thing sui generis, not comparable to our other sensations. Directly or indirectly, all questions connected with this subject must come for decision to the ear, as the organ of hearing, and from it there can be no appeal.’ Rayleigh summed up the situation so beautifully before loudspeakers had even been invented!

As Toole and Olive wrote,⁴ ‘A loudspeaker isn’t good until it sounds good. The traditional problem has been: how do we know what is good? Whose opinion do we trust?’ Although they then go on to describe listening tests which when well devised and controlled can yield very consistent results, they also acknowledge that when considering anything but the finest of loudspeakers in ideal listening circumstances (which may be very different from the ideal working circumstances in a control room), assessments may be based rather more on the balance of positive and negative attributes as they are revealed by different kinds of music, than on absolute responses.

15.3 Different Needs

There is a fundamental difference between classical and rock music recording. It is generally true to say that reproduction of classical and acoustic music in the control room seeks to emulate the original acoustic performance. Conversely, the live performance of rock, and especially electronic music, seeks to emulate the original studio recording. In classical music, classically recorded, the original performance exists as a real entity in time and space. Much popular music, however, never exists as a single integrated performance in any one time and place until the final mix is first heard in the control room. It is created on loudspeakers for performance by loudspeakers, and so no natural sound balance ever exists. The concept of the ‘real’ reference is therefore lost.

Even in the recording of acoustic instruments for rock music, a recorded bass drum sound, for example, rarely sounds like the actual drum. The recording of bass drums has now become highly stylised, perhaps to comply with what a recorded bass drum is now expected to sound like. Even with a listener’s head inside the bass drum, one could not hope to hear the ‘acoustic’ version of the recording. Notwithstanding the risk of hearing damage, the gross distortions and compressions that the ear would superimpose on the sound at such high SPLs would make any comparison of the real versus recorded sound invalid. And even that is not taking into consideration the change in the acoustic properties of the bass drum when a human head was placed inside it as opposed to a microphone. Much modern music therefore simply has no acoustic counterpart, even though it may be made up entirely of acoustic instruments.

Consequently, in the field of modern music recording, we are rarely, if ever, dealing with absolutes in terms of loudspeaker performance. We tend to be dealing largely with illusionary perspectives. Given that all loudspeakers are far from theoretical perfection, our choice of loudspeakers may be dictated by the most important aspects of their individual characteristics relative to any particular music or instrumentation. This is somewhat reminiscent of the discussion in Chapter 6, relating to the suitability of halls for orchestral recordings also sometimes being favourable to specific styles of music or instrumentation. As we are dealing with music, whose purpose is to bring pleasure to its listeners, anything that enables or inspires the people involved in the recording to reach new emotional heights must be valid. In fact, if we can for a moment split the recording process from the mixing process, there are some studios which are liked for their inspirational capacity during the recordings, but which during mixing would lead to unrepresentative results. Nonetheless, if they can lift a musical performance, then they have a valid function, but such idiosyncratic studios must be considered to be more of a performance stage (control room included) than a general reference for what is ‘right’ about a balanced mix.

15.4 What is Right?

Traditionally, the pressure amplitude response has been the most important aspect of loudspeaker performance. It has been relatively easy to measure, and it tends to relate well to the perception of a natural sound. Unfortunately, as can be seen from Figure 15.1, even maintaining a response accuracy of ±3 dB in an anechoic chamber is not easy, and, as any recording engineer knows, being able to adjust a graphic equaliser by ±3 dB over the full frequency range can result in a very wide range of different timbres. From this it can be correctly deduced that all the loudspeakers of Figure 15.1 sound different, even on-axis in an anechoic chamber. It is now also quite a widespread view amongst mastering engineers that a flat frequency response is not their prime concern when choosing a monitor system.⁵ Low distortion and transparency are highly valued attributes, together with a good transient response. Frequency response, it would seem, as long as it does not deviate excessively from a gently changing curve, is something to which the ear adapts quite easily. In fact, as the frequency balance is so variable from one home to another, the provision of tone controls to make some desired compensations is almost universal on domestic reproduction equipment. Nevertheless, a reasonably flat frequency response is a necessary part of good studio monitor systems if we are to avoid deviating too far from a common reference point.

The transient response of a loudspeaker is something which had not really received due attention until the 1980s. This will be discussed in more detail in Chapters 19 and 20, but essentially, the time response of a system (which defines its transient response) is determined jointly by the pressure amplitude response and the phase response, which together form the complex frequency response. The time response and the frequency response are related by the Fourier Transform (see Glossary).
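The link between amplitude response, phase response and time response can be sketched numerically. The example below is a minimal illustration rather than a measurement method. It builds two hypothetical systems with the same perfectly flat amplitude response, one with a linear phase response (a pure delay) and one with an added phase error that grows with frequency, and obtains each time response by an inverse discrete Fourier transform. The bin count, the delay and the form of the phase error are all assumptions chosen for illustration.

```python
import cmath
import math

def impulse_response(magnitude, phase):
    """Inverse DFT of a frequency response supplied as magnitude and phase per bin."""
    n = len(magnitude)
    spectrum = [m * cmath.exp(1j * p) for m, p in zip(magnitude, phase)]
    response = []
    for t in range(n):
        total = sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n))
        response.append(total.real / n)
    return response

N = 64       # number of frequency bins (illustrative)
DELAY = 8    # pure delay in samples (illustrative)

# Signed bin index, so that the phase is odd-symmetric and the time response is real.
signed_bin = [k if k <= N // 2 else k - N for k in range(N)]

flat_magnitude = [1.0] * N
linear_phase = [-2 * math.pi * k * DELAY / N for k in signed_bin]
# Same magnitude, plus a hypothetical phase error that grows with frequency.
dispersive_phase = [p - math.pi * k * abs(k) / N
                    for p, k in zip(linear_phase, signed_bin)]

clean = impulse_response(flat_magnitude, linear_phase)
smeared = impulse_response(flat_magnitude, dispersive_phase)

print("peak, linear phase:    ", round(max(abs(x) for x in clean), 3))
print("peak, dispersive phase:", round(max(abs(x) for x in smeared), 3))
print("energy in each case:   ", round(sum(x * x for x in clean), 3),
      round(sum(x * x for x in smeared), 3))
```

With linear phase the response is a single delayed impulse. With the phase error the same energy is spread out in time and the peak is lower, although a plot of the amplitude response alone would show no difference between the two systems.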

Over 150 years ago, Ohm, and later Helmholtz, carried out experiments that appeared to show that the ear was ‘phase deaf’. Their experiments had been carried out on sine waves, which are very steady in nature, and not representative of normal musical programmes. Nonetheless, quite unbelievably, their work is still quoted in some circles as implying the relative unimportance of phase responses in terms of audibility.

Phase responses have an enormous bearing upon the waveforms of transient signals, changing considerably the way in which the shock of the transient excites the ear. Again, an argument had been put forward implying that if the waveform distortion changed the envelope only within the integration time of the ear, then the effect would not be perceived, but once again, our perception systems seem to be far more subtle than often previously believed. The concept of integration time states that if the ear samples sounds in periods of time of length x, then it would not matter whether ten units of a sound occurred simultaneously, or whether each was separated by a period of one-tenth x, or even any combination in between. The ear would still perceive ten sound units within that time window of length x, so the perception of any combination would be the same in all cases. In Figure 11.29, the two waveforms shown differ only in the phase relationship of the component frequencies. If the whole figure represented waveforms of less than the integration time of the ear, then classic theory predicts that they should sound the same, but they do not. The difference is more easily perceived in highly damped (non-reflective) listening conditions, but it does tend to disappear in conditions that are more reverberant. In Professor Manfred Schroeder’s own words ‘In listening to the two waveforms an astonishingly large difference is perceived. . . . Too bad Ohm and Helmholtz were not able to listen to (AM and QFM) signals. They might have been very hesitant to formulate their phase law’.⁶
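The AM and QFM comparison can be reproduced in outline. The sketch below assumes the usual construction: a carrier with two equal sidebands, where the QFM (quasi-frequency modulation) signal differs from the AM signal only by a 90 degree shift of the carrier phase. The window length, the bin frequencies and the modulation depth are illustrative choices, not values taken from Figure 11.29.

```python
import cmath
import math

N = 512            # samples in the analysis window (assumed)
CARRIER_BIN = 64   # carrier frequency, in cycles per window (assumed)
MOD_BIN = 4        # modulation frequency, in cycles per window (assumed)
DEPTH = 1.0        # modulation depth (assumed)

def three_tone(carrier_phase):
    """Carrier plus two sidebands; only the phase of the carrier is varied."""
    out = []
    for n in range(N):
        t = 2 * math.pi * n / N
        carrier = math.cos(CARRIER_BIN * t + carrier_phase)
        lower = 0.5 * DEPTH * math.cos((CARRIER_BIN - MOD_BIN) * t)
        upper = 0.5 * DEPTH * math.cos((CARRIER_BIN + MOD_BIN) * t)
        out.append(carrier + lower + upper)
    return out

def bin_magnitude(signal, k):
    """Magnitude of one DFT bin, normalised by the window length."""
    acc = sum(x * cmath.exp(-2j * math.pi * k * n / len(signal))
              for n, x in enumerate(signal))
    return abs(acc) / len(signal)

def rms(signal):
    return math.sqrt(sum(x * x for x in signal) / len(signal))

am = three_tone(0.0)            # carrier in phase with the sidebands
qfm = three_tone(math.pi / 2)   # carrier shifted by 90 degrees

for k in (CARRIER_BIN - MOD_BIN, CARRIER_BIN, CARRIER_BIN + MOD_BIN):
    print("bin", k, round(bin_magnitude(am, k), 4), round(bin_magnitude(qfm, k), 4))

print("crest factor, AM: ", round(max(abs(x) for x in am) / rms(am), 3))
print("crest factor, QFM:", round(max(abs(x) for x in qfm) / rms(qfm), 3))
```

The two signals have identical component magnitudes and identical RMS levels, yet their envelopes, and hence their peak-to-RMS ratios, differ markedly. An analysis that discards phase cannot distinguish them.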

Non-linear distortions should also be minimised in an ideal monitor system, because the natural timbre of an instrument can be either enhanced or degraded by non-linear distortions. One reason why valve amplifiers should not be used to drive monitor systems is that the typical second harmonic distortion that they often produce can be quite musical in nature, but that could mislead the recording personnel into thinking that they had a musical sounding recording when they in fact did not. They would merely be listening to it through a musical sounding amplifier. [A more in-depth discussion of the audibility of non-linear distortion will be found in Chapter 19.] So, although a flat frequency response and a good phase response (and hence a good transient response), low distortion, and an off-axis response which does not cause unduly coloured reflexions are all desirable attributes of a monitor loudspeaker, they nonetheless do not define the whole sonic character of a loudspeaker.

An interesting proposal was put forward by Watkinson and Salter in 1999,⁷ which addressed a hitherto undefined aspect of loudspeaker performance. They proposed treating the loudspeaker as an information channel of finite capacity that can actually be measured as an equivalent bit-rate. It would seem obvious that when listening to audio data compression codecs, for the purpose of assessing their audibility, a judgement of the degree of audibility could only be made when listening through loudspeakers of the highest resolution. When listening via a telephone, for example, it would be impossible to hear whether the signal being transmitted was 24-bit linear, 12-bit linear, or coded by AC3, MP3, or any other reasonable data compression codec.

Conversely, if an audio signal was passed through a loudspeaker via a system for finely reducing a bit rate, then the loudspeakers that rendered the smallest bit rate reductions to be audible would be those with the highest resolution. A possible scale of resolution could be drawn up for different loudspeakers. The tests would need to be carried out in stereo, because some spacial aspects of codec signal degradation are not evident in mono. This concept would seem to be of value, because one reason why so many bad recordings are for sale in the shops is that they have become aggressive sounding due to the build up of signal processing artefacts which were not clearly audible on the loudspeakers in use during the mixing stage. In fact, one reason for the boom in the work for mastering houses has been the increase in the proportion of studios that currently use totally inadequate monitoring systems. It also goes some way to explain why the mastering engineers are valuing transparency, low distortion and a good transient response, as being more important than a dead flat amplitude response.
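The idea of a reproduction chain as a channel of finite capacity can be illustrated with the Shannon-Hartley formula. This is emphatically not the measurement procedure proposed by Watkinson and Salter, which is not described here. It is only a rough sketch of why a narrow, noisy channel such as a telephone cannot reveal the difference between two high bit-rate signals. The telephone bandwidth, the signal-to-noise ratio and the 48 kHz sample rate are assumed figures.

```python
import math

def channel_capacity_bps(bandwidth_hz, snr_db):
    """Shannon-Hartley capacity of a noisy, band-limited channel in bits per second."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

def pcm_bit_rate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Raw bit rate of a linear PCM stream."""
    return sample_rate_hz * bits_per_sample * channels

# Assumed telephone channel: about 3.1 kHz of bandwidth and a 40 dB signal-to-noise ratio.
telephone = channel_capacity_bps(3100, 40)

# Assumed 48 kHz sample rate for both linear PCM signals.
pcm_24 = pcm_bit_rate_bps(48000, 24)
pcm_12 = pcm_bit_rate_bps(48000, 12)

print("telephone capacity, bit/s:", round(telephone))
print("24-bit PCM, bit/s:        ", pcm_24)
print("12-bit PCM, bit/s:        ", pcm_12)
print("channel can carry either in full:", telephone >= pcm_12)
```

Under these assumptions both signals far exceed what the channel can convey, so whatever distinguishes them is lost before it reaches the listener. A loudspeaker of low resolution can be thought of, loosely, in the same way.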

Essentially, a good monitor system is one on which great recordings sound great, average recordings sound average, and bad recordings sound bad. This is not necessarily in agreement with the requirements for domestic hi-fi systems, whose prime function is to please the listeners. Really, nobody in their right minds would want to hear an unintentionally bad sound for the purposes of seeking enjoyment. In domestic circumstances, if a bad recording sounds less bad than it really is, whilst a good recording still sounds good, then for the purpose of enjoyment such a loudspeaker system would be well received. A studio monitor loudspeaker system, however, does not exist for the purposes of giving enjoyment. It exists to tell the recording personnel if they are achieving their desired objectives on the recording medium itself, and not only when heard via the monitors.

The aforementioned Watkinson and Salter presentation also addressed the fact that most rectangular box loudspeakers produced diffraction artefacts (as discussed in Chapters 4 and 11) which had a tendency to mask some of the loss of spacial sensation that could result from the use of signal processing devices with less than adequate electronic signal paths. This again adds to the argument that the principal monitoring system in a control room should be flush mounted.

In general, and only in general, Table 15.1 outlines some of the performance compromise differences that have developed over the years between the needs of the rock/pop and classical/acoustic recording worlds.

Table 15.1 Areas of design parameter priorities on two-loudspeaker stereo perception for rock/pop and classical/acoustic recording. Generalisations

Rock	Classical
High output SPL capability, e.g. 120 dB @ 3 m.	Lower maximum output SPL, e.g. 105 dB @ 3 m.
More tolerant of time/phase distortion if no acoustic source exists.	Minimum time/phase distortion for good localisation and natural timbre.
Extended low frequency performance at high SPLs, holding the response relatively flat as far down as possible.	Can tolerate generally lower low frequency SPLs and tailing-off responses, as long as smooth roll-offs exist in terms of both amplitude and phase.
Require relatively dead rooms with minimal lateral reflexions in order to support strong phantom, amplitude-panned images. Axial response more important than off-axis response.	Require rooms with lateral reflexions, hence need to have smooth, even, directivity in order to achieve due sense of spaciousness.
Whilst harmonic distortion should remain as low as can be achieved, this should not be done at the expense of transient headroom. The stylised, highly transient sounds of electronic music will often be more audibly tolerant of harmonic distortion than of amplitude limiting.	Require low harmonic distortion because the more steady sounds and subtle ambient information of classical music, along with the lower general levels of transients, make harmonic distortion perception much more of a problem than high level transient headroom.
A second set of smaller loudspeakers is considered de rigueur for overall musical balance assessment, and also as a reference to how the recordings may be expected to sound in cars, at home, and on the radio.	No such similar need has arisen, and the reference to a smaller set of (usually) poorer quality loudspeakers has not evolved.
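The SPL figures in the first row of Table 15.1 can be put into perspective with a little arithmetic. The sketch below assumes free-field, point-source (inverse square law) behaviour, which real control rooms only approximate, and the function names are illustrative.

```python
import math

def spl_at_distance(spl_db, ref_distance_m, new_distance_m):
    """Free-field level at a new distance, assuming inverse square law behaviour."""
    return spl_db + 20 * math.log10(ref_distance_m / new_distance_m)

def power_ratio(db_difference):
    """Ratio of acoustic (or amplifier) power corresponding to a level difference."""
    return 10 ** (db_difference / 10)

# Example figures from Table 15.1, referred back from 3 m to 1 m.
rock_at_1m = spl_at_distance(120, 3, 1)
classical_at_1m = spl_at_distance(105, 3, 1)
extra_power = power_ratio(120 - 105)

print("rock example at 1 m, dB SPL:     ", round(rock_at_1m, 1))
print("classical example at 1 m, dB SPL:", round(classical_at_1m, 1))
print("power ratio for the 15 dB gap:   ", round(extra_power, 1))
```

On these assumptions, the 15 dB difference between the two examples corresponds to roughly thirty times the power, which goes some way to explaining why the two classes of monitor tend to be engineered so differently.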

15.5 Close-Field Monitoring

It became apparent in the 1960s that the sound as mixed in recording studio control rooms did not always travel well to the then new transistor radios. As it was via such radios that most people decided which singles to buy, the success or failure of a single was seen, rightly or wrongly, to be dependent upon how it sounded on the radio. Whether an extra 2 dB on the level of the lead guitar would really make the difference between the single reaching No. 1 or only No. 18 is a moot point, but the competition between the record producers, and their inherent insecurity, drove the studios to provide a typical radio loudspeaker, either in or on the mixing console. By the early to mid 1970s, the Auratone Sound Cube had become a de facto reference, and it was fairly generally agreed that vocal and reverberation levels were easier to judge on the small loudspeakers. Judge them, that is, in terms of how they would typically sound in domestic circumstances. In those days, most control rooms were still put together on an ad hoc basis, and only rarely were they designed from scratch. Somewhat perversely, we have returned to a situation whereby the vast majority of control rooms are once again not properly designed, so it is little wonder that close field monitoring is still in widespread use.

Undoubtedly, the close field monitors, with their restricted frequency range, do give a good representation of typical domestic reproduction, which takes place largely on loudspeakers of a generally similar size or smaller. Listening close to or within the critical distance also helps to remove a degree of room-to-room variability. Unfortunately, though, this does not tell the whole story about the recording, and boosts applied to a recording at 70 Hz whilst monitoring via loudspeakers that roll off at 50 Hz may result in gross effects at 35 Hz. These can be deemed totally undesirable when the music is heard on a full range, truly high fidelity domestic system. Once again, in many cases, the mastering engineers are now expected to save the day.
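The out-of-band risk described above can be illustrated with a rough numerical sketch. It is not a model of any particular equaliser or loudspeaker. It assumes a standard analogue-prototype peaking equaliser with a 6 dB boost and a Q of 0.7, and it stands in for the small monitor with a second-order Butterworth high-pass response that is 3 dB down at 50 Hz. The boost amount, the Q, the filter order and the function names are all illustrative assumptions and are not taken from the text.

```python
import math


def peaking_gain_db(freq_hz, centre_hz, boost_db, q):
    """Gain in dB at freq_hz of an analogue-prototype peaking equaliser."""
    a = 10 ** (boost_db / 40.0)       # amplitude factor; gain at centre is a**2
    s = 1j * (freq_hz / centre_hz)    # normalised complex frequency
    numerator = s * s + s * (a / q) + 1
    denominator = s * s + s / (a * q) + 1
    return 20 * math.log10(abs(numerator / denominator))


def highpass_gain_db(freq_hz, corner_hz, order):
    """Butterworth high-pass magnitude in dB (assumed stand-in for the
    low frequency roll-off of a small loudspeaker)."""
    ratio = (freq_hz / corner_hz) ** (2 * order)
    return 10 * math.log10(ratio / (1 + ratio))


# Assumed illustrative values: 6 dB boost centred on 70 Hz, Q of 0.7,
# monitor 3 dB down at 50 Hz with a second-order roll-off.
BOOST_DB = 6.0
CENTRE_HZ = 70.0
Q = 0.7
MONITOR_CORNER_HZ = 50.0
MONITOR_ORDER = 2

for f in (35.0, 50.0, 70.0):
    on_recording = peaking_gain_db(f, CENTRE_HZ, BOOST_DB, Q)
    monitor = highpass_gain_db(f, MONITOR_CORNER_HZ, MONITOR_ORDER)
    print(f"{f:5.1f} Hz: boost on recording {on_recording:+5.2f} dB, "
          f"monitor response {monitor:+5.2f} dB, "
          f"heard via monitor {on_recording + monitor:+5.2f} dB")
```

Under these assumptions, the boost still adds level at 35 Hz, while the monitor's own roll-off at that frequency is larger than the added level, so the change can pass unnoticed in the close field. A full range system with flat response to 35 Hz would reproduce the added level in full.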

It would seem to be incumbent upon a professional industry to deliver a well-balanced product to the marketplace, one that yields results appropriate to whatever system it is played on. It does not seem professional to deliver products that are exposed as lacking by the purchasers of audiophile systems. In other words, the quality heard from a recording should be proportionate to the quality of the system on which it is played. All too often, however, due to inadequate monitoring systems in the studios, the audiophiles in their homes are the first to realise just how awful some recordings are.

No studio can reasonably call itself professional unless it can provide the ability to monitor to a high degree of resolution over the great majority of the audio bandwidth. Close field monitoring is therefore a useful adjunct to, but not a substitute for, a high quality full range system, although the precise choice of whether the close field system should be of high or moderate quality depends to some extent on the type of music being recorded and its intended market. Few people monitor classical recordings through a pair of NS10Ms, yet some respected rock/pop producers and engineers claim to be able to mix solely by reference to them.

15.6 Why the NS10M?

Probably no other loudspeaker has seen such widespread use in the field of contemporary music recording as the Yamaha NS10M and its derivative, the NS10M Studio. The original NS10M was conceived as a domestic hi-fi ‘bookshelf’ loudspeaker, but it was a commercial failure as such. In the early 1980s, however, it was seized upon by the music recording world. As a result of this, and after many early problems due to insufficient power handling and excessive high frequency response for studio use, Yamaha launched the more specialised NS10M Studio, which continued in production until late 2001. The loudspeaker was small enough to carry around from studio to studio, which is largely what producers and engineers did with them in the early days, yet it packed enough punch when connected to an adequate amplifier to reasonably replicate the typical sound character of a studio monitor of the day. Its use in the close field helped to standardise things even more, by removing the room-to-room variability to a large degree. At the same time, the frequency range resembled quite closely that of typical mid-priced hi-fi systems, which is little wonder, because that was the purpose for which it was originally designed.

The general consensus seemed to be that the NS10M had a hard sounding mid-range, a distortion performance that was not particularly good (though this is not borne out by measurements), a low frequency response that was adequate (whilst not being by any means ‘extended’), and an output capability that suited the close field monitoring requirements of most studios. The NS10M can certainly deliver a low frequency punch that in its early days was not common from a loudspeaker of such a small size. It is a loudspeaker unable to resolve fine detail, and the transparency of its sound does nothing to impress people. Yet, despite all of these criticisms, these loudspeakers have been used to get a musical balance on pop and rock music by probably more people than any other type of loudspeaker to date. For many people, they simply work for the assessment of the relative instrument balance on electric music, though they would tend to listen to other systems to judge the individual timbres of the instruments.

It has been extremely difficult to find users of NS10Ms who could explain what characteristics they possessed which rendered them so widely accepted. Some very highly respected engineers and producers have used them to create some outstanding mixes over the years, without ever having been able to fully explain why. However, in Sections 19.10 and 19.11 we shall attempt a denouement.

15.7 General Needs

Notwithstanding the idiosyncrasies of close field monitoring, in general the main monitor systems do show a tendency towards greater acceptance in line with improving objective performance measurement. Of course, the value for money aspect distorts the direct comparison of acceptance in terms of numbers sold, but the trend is reasonably clear. Direct comparisons are also made difficult by the tendency for some studio designers to favour certain brands and models of loudspeakers, and thus the market itself is not necessarily deciding things. This is totally understandable because if a designer is trying to guarantee a performance specification of a control room as a whole, then the imposition of an arbitrary monitor loudspeaker by the client is not likely to be acceptable. Unfortunately, the current situation can still be summed up by another quote from Toole and Olive⁴: ‘We make accurate technical measurements that we have difficulty in correlating with listener evaluations, and then compound the problem by making subjective evaluations that are unreliable. It is very difficult to make progress under such circumstances’. Vested interests in the marketing of the manufactured products also do little to help the search for true fidelity. Ultimately, we must use our experience and trust our own ears, but again this will be discussed much further in the later sections of Chapter 19.

15.8 Summary

Loudspeakers are a long way from perfection. Their imperfections group together in different ways, making some loudspeakers suitable for some specific purposes and other loudspeakers more suitable for other purposes.

The sound fields radiated by loudspeakers in no way represent the sound fields of the acoustic instruments whose sound they might be reproducing.

Subjective preference differences are also an obstacle to finding a standard reference loudspeaker.

Time domain responses are now accepted as having much more importance than was previously thought to be the case.

A monitor system can also be thought of as an information channel of finite capacity, and can be measured as such.

Rock and classical music recordists have tended to choose different criteria as being most important for their specific monitoring needs, but that is not to say that convergence is not possible when more is understood.

Relying only on small, close field monitors is a risky practice.

References

1 Moulton, David, ‘The Creation of Musical Sounds for Playback Through Loudspeakers’, presented at the Audio Engineering Society 8th International Conference, ‘The Sound of Audio’, Washington DC (1990)

2 Newell, Philip, Studio Monitoring Design, Focal Press, Oxford, UK, Chapter 22 (1995)

3 Newell, Philip, Studio Monitoring Design, Focal Press, Oxford, UK, Chapter 5, pp. 65–8 (1995)

4 Toole, F. E. and Olive, S. E., ‘Subjective Evaluation’ in: Borwick, J. (ed.) Loudspeaker and Headphone Handbook, 3rd Edn, Focal Press, Oxford, UK and Boston, USA, Chapter 13 (2001)

5 Newell, Philip, Project Studios, Focal Press, Oxford, UK, Chapter 8 (2001)

6 Schroeder, Manfred R., ‘Models of Hearing’, The Proceedings of the Institute of Electrical and Electronic Engineers (IEEE), Vol. 63, No. 9, pp. 1332–50 (September 1975)

7 Watkinson, J. and Salter, R., ‘Modelling and Measuring the Loudspeaker as an Information Channel’, presented to the ‘Reproduced Sound 15 Conference’ of the Institute of Acoustics, Stratford-on-Avon, UK (November 1999)

Bibliography

Borwick, John, Loudspeaker and Headphone Handbook, 3rd Edn, Focal Press, Oxford, UK (2001)

Colloms, Martin, High Performance Loudspeakers, 5th Edn, John Wiley & Sons, Chichester, UK (1997)

Eargle, John M., Loudspeaker Handbook, Chapman & Hall, New York, USA and London, UK (1997)

Newell, Philip, Studio Monitoring Design, Focal Press, Oxford, UK (1995)
