The human ear is able to perform very useful signal processing on incoming signals. For example, there are auditory mechanisms for making sense of target signals despite noisy environments. Fine frequency and intensity differences can be measured by the ear. Von Helmholtz [4] proposed that the auditory nerve processes sound tonotopically; that is, by having different nerve bundles be sensitive to different frequencies. This notion of the auditory system as a sophisticated filter bank persists today and is the basis of much auditory research.

While the anatomy (i.e., the structure) of the auditory system in most animal species is fairly well understood, we still have a long way to go toward comprehending the physiology (i.e., the function of the components). There are a great many similarities among the auditory systems of many animals, including humans. Thus, in this chapter we survey the physiological knowledge garnered from animal studies. In the next chapter we survey psychophysical studies on human subjects; in succeeding chapters we use this knowledge to try to erect plausible models of how the human auditory system perceives the pitch of speech and music and how it perceives speech.


The neocortex is that large part of the human brain that ultimately determines the nature of sensory input such as auditory stimuli. Here we trace the pathways that lead from the outer ear to this cortical percept. Figure 14.1 is a diagrammatic sketch of this pathway (see footnote 1).

Figure 14.1 shows the ascending pathways from the right cochlea to the cortex. Notice that there are both right and left versions of the intermediate neural nuclei. The right cochlea shown at the bottom right is part of the peripheral auditory system and its fibers innervate (excite) the right cochlear nucleus, but we see that there are pathways from there to both the right and left way stations. The left ear follows a comparable path (not shown to avoid confusion). Fibers that follow these ascending pathways are called afferent (feedforward).

There are also feedback mechanisms in the auditory system. Figure 14.2 shows half of the descending paths: those ending at the right cochlea. We see that there are efferent (feedback) neurons from most of the way stations that eventually terminate in the periphery.


FIGURE 14.1 Auditory pathways linking the right ear to the brain.


FIGURE 14.2 Descending pathways linking the brain to the right ear.

TABLE 14.1 Cells in the Auditory Nuclei of the Monkey (a)

Central Auditory Nucleus                      Number of Cells
Cochlear nuclei                                        88,000
Superior olivary complex                               34,000
Nuclei of lateral lemniscus                            38,000
Inferior colliculus                                   392,000
Medial geniculate body (pars principalis)             364,000
Auditory cortex                                    10,000,000

(a) From [12].

There are approximately 30,000 auditory neurons associated with each cochlea in humans. About 1000 of these neurons connect to around 20,000 outer hair cells. The 3500 inner hair cells connect to the roughly 29,000 neurons remaining. Table 14.1 lists the number of cells in the auditory nuclei of the monkey. The numbers in this table suggest that what remains to be learned vastly exceeds what is presently known.
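The counts quoted above imply very different wiring for the two hair-cell types. The short sketch below only does the arithmetic on the approximate figures from the text; the function and constant names are hypothetical and chosen for illustration.

```python
# Approximate per-cochlea counts quoted in the text (human).
TOTAL_NEURONS = 30_000
OHC_NEURONS = 1_000        # neurons connecting to outer hair cells
OUTER_HAIR_CELLS = 20_000
INNER_HAIR_CELLS = 3_500


def innervation_ratios():
    """Return convergence/divergence ratios implied by the quoted counts."""
    ihc_neurons = TOTAL_NEURONS - OHC_NEURONS  # the "roughly 29,000 remaining"
    return {
        "ihc_neurons": ihc_neurons,
        # Several neurons attach to each inner hair cell.
        "neurons_per_ihc": ihc_neurons / INNER_HAIR_CELLS,
        # Each outer-hair-cell neuron is shared among many outer hair cells.
        "ohcs_per_neuron": OUTER_HAIR_CELLS / OHC_NEURONS,
        "fraction_of_neurons_on_ihc": ihc_neurons / TOTAL_NEURONS,
    }


if __name__ == "__main__":
    for name, value in innervation_ratios().items():
        print(f"{name}: {value:.3g}")
```

On these rounded figures, each inner hair cell is served by roughly eight neurons, whereas each neuron on the outer-hair-cell side is shared among about twenty cells.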

Although the anatomy of the auditory path is fairly well understood, the physiology is still only partly understood. Many measurements on cats and other mammals have been made at the periphery, the nerve bundle leading from the inner ear into the cochlear nucleus. The cochlear nucleus (the next stage of neural processing beyond the peripheral auditory nerve) is only partially understood. There is a much greater variety of functions in this neural nucleus than there is in the auditory nerve, and if this pattern continues as we work our way up through all the pathways, it will be many years before sufficient physiological information is available so that scientists can propose a plausible model of auditory function. For these reasons, most of this chapter focuses on the peripheral auditory system; in particular, it focuses on the inner ear, which contains the cochlea.


Figure 14.3 shows the three components of the peripheral auditory system: the outer, middle, and inner ears. The input is an acoustic signal and the output is a collection of neural spikes that enter the brain, as indicated in Figs. 14.1 and 14.2.

The auditory canal is an acoustic tube that transmits sound to the eardrum, where acoustic energy is transduced to vibrational mechanical energy of the middle ear. The middle ear consists of three very small bones (ossicles); the malleus is attached to the other side of the eardrum and its vibration is transmitted through the incus to the stapes. The stapes motion impinges on the oval window of the inner ear. The oval window is a flexible membrane, and its motion sets the fluid within the cochlea in motion. This motion is transmitted to the basilar membrane within the cochlea. The final transducing medium is the collection of hair cells sitting atop the basilar membrane that implement the transformation to the neural spikes of the auditory nerve bundle. The semicircular canals (not shown in the figure) are part of the vestibular system that controls the sense of balance and are not part of the organs of hearing.


FIGURE 14.3 The peripheral auditory system. Boundaries between outer, middle, and inner ears are approximate.

As seen in Fig. 14.3, the shape of the cochlea resembles that of a snail, but we can better picture what happens by looking at Fig. 14.4. In this figure we have unwound the snail. Where the stapes impinges on the oval window is called the base; the far end (deep inside the snail) is the apex. Near the base, the basilar membrane (BM) is relatively narrow and stiff; near the apex it is wider and less stiff, with the result that high frequencies excite the basal portion and vibrations die out as they approach the apex. At low frequencies, vibrations begin at the base but reach peak amplitude further down, as seen in the figure. It is important to realize that high- and low-frequency disturbances created by stapes motion arrive at their respective peak basilar membrane points nearly simultaneously, because wave propagation is predominantly a fluid phenomenon and these traveling waves are appreciably faster than wave propagation on an isolated basilar membrane. This leads to the supposition that the BM action is akin to a filter bank, and in Chapter 19 we shall see that much research activity has centered around this concept. Figure 14.5 is a pictorial representation of BM activity.
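The text describes the place-to-frequency mapping only qualitatively. As a rough illustration, the sketch below uses Greenwood's empirical place-frequency function with commonly quoted constants for the human cochlea (A = 165.4, a = 2.1, k = 0.88). Neither the function nor the constants come from this chapter; they are an outside assumption used to show how a filter bank's center frequencies might be laid out from apex to base.

```python
def greenwood_cf(x_from_apex, A=165.4, a=2.1, k=0.88):
    """Approximate characteristic frequency (Hz) at a basilar membrane place.

    x_from_apex is the fractional distance along the membrane:
    0.0 is the apex (low frequencies), 1.0 is the base (high frequencies).
    The constants are commonly quoted human values, assumed here.
    """
    if not 0.0 <= x_from_apex <= 1.0:
        raise ValueError("x_from_apex must lie in [0, 1]")
    return A * (10 ** (a * x_from_apex) - k)


if __name__ == "__main__":
    # Center frequencies of a small, evenly spaced bank of places.
    for i in range(6):
        x = i / 5
        print(f"place {x:.1f} (from apex): {greenwood_cf(x):8.1f} Hz")
```

Equal steps in place correspond to roughly equal ratios in frequency over most of the range, which is one reason filter-bank models of the BM often use logarithmically spaced channels.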

The importance of the BM derives from the location, on the BM, of the auditory transducers, the hair cells. Motion of the hairs, or stereocilia, causes firing of the auditory nerves that innervate (connect to) the hair cells, and it is the spikes produced by the auditory neurons that relay all auditory information to higher brain centers.


Figure 14.6 shows how stereocilia motion (caused by BM motion) leads to neural spiking of the auditory nerve that is connected to the corresponding hair cell. To follow this activity, we need to make a slight detour in our discussion. Every cell is enclosed by a membrane that can permit molecules to flow from inside to outside of the cell (and vice versa). This chemical flow is controlled by molecules that are embedded in the cell membrane. Many of the chemicals both inside and outside the cell are charged, so this molecular flow can change the voltage potentials of the cells. Thus, in the case shown in Fig. 14.6, it is the flow of potassium and sodium ions that creates the voltages needed to trigger neural spikes (action potentials).


FIGURE 14.4 Simplified model of the cochlea.


FIGURE 14.5 Pictorial representation of activity along the basilar membrane. From [3].


FIGURE 14.6 Neural spiking produced by hair-cell stereocilia motion.

In Fig. 14.6, depolarization is equivalent to making the voltage difference from inside to outside the cell more positive, because of the flow of positively charged sodium ions from outside to inside. For this flow to occur requires the presence of a neurotransmitter that opens channels in the auditory fiber. The precise mechanism of spike generation involves an understanding of the Hodgkin–Huxley model of neural firing [11], which is beyond the scope of this brief synopsis.


Many experiments have been performed on the auditory nerves of small mammals. Cats have been the mammal of choice in most cases because the auditory system of the cat closely resembles that of the human. The properties that have been evaluated include spontaneous spike generation, adaptation, tuning, synchrony, and various nonlinear effects.


FIGURE 14.7 Sketch of the inner and outer hair cells in a cross section of the cochlea.

Figure 14.7 is a drawing of some of the anatomy inside the cochlea. We see that three rows of outer hair cells and a single row of inner hair cells sit on the basilar membrane. The stereocilia of the hair cells impinge on the tectorial membrane, and the resultant forces open channels in the hair cells that eventually can cause spiking in the cochlear (auditory) nerve bundle shown in the lower left of the figure. Figure 14.8 is a schematic diagram of the innervation patterns; notice the three rows of outer hair cells (OHCs) and the one row of inner hair cells (IHCs), but also notice that approximately 90% of the afferent (ascending) auditory neurons come from this inner single row, whereas most of the efferent (descending) neurons go to the three outer rows.

Physiological measurements have uncovered some general properties of auditory nerves, including adaptation, tuning, synchrony, and nonlinearity (including masking).


FIGURE 14.8 Schematic diagram of inner (IHC) and outer (OHC) hair cells. From [13].


FIGURE 14.9 Adaptation by an auditory nerve. From [7].

Adaptation: When a stimulus is suddenly applied, the spike rate of an auditory nerve fiber rapidly increases. If the stimulus remains (such as a steady tone), the rate decreases in an exponential manner to a steady value. Figure 14.9 shows a poststimulus time histogram for an afferent auditory nerve innervating an inner hair cell.

The poststimulus time histogram is obtained by the use of the following procedure: the tone is applied to the ear and the time of each spike, referred to the initiation of the tone burst, is measured. The tone is applied many times so that a histogram of spike-arrival times is obtained. We see from the figure that in the absence of the stimulus the neuron produces spikes at a small but discernible rate. This is called the spontaneous rate. At the beginning of the tone burst the neuron responds with many spikes, but after approximately 20 ms a steady rate is achieved. When the tone is removed, the spike rate decreases to slightly lower than the spontaneous rate for a short time before resuming its normal spontaneous rate. Therefore, Fig. 14.9 shows us that the neuron is more responsive to changes than to steady inputs.
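The histogram procedure just described can be sketched in a few lines. This is a minimal illustration, not the method used to produce Fig. 14.9; the function name, the bin width, and the spike times in the usage example are all made up.

```python
def poststimulus_time_histogram(trials, bin_width, t_start, t_stop):
    """Average firing rate per time bin across repeated tone presentations.

    trials    : list of spike-time lists, one list per presentation, with
                each time measured relative to the onset of the tone burst
    bin_width : width of each histogram bin (same time unit as the spikes)
    Returns a list of rates in spikes per time unit per presentation.
    """
    if not trials:
        raise ValueError("need at least one presentation")
    n_bins = int(round((t_stop - t_start) / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = int((t - t_start) // bin_width)
            if 0 <= idx < n_bins:  # ignore spikes outside the window
                counts[idx] += 1
    # Normalize the counts so the result does not depend on how many
    # presentations were made or on the bin width.
    return [c / (len(trials) * bin_width) for c in counts]


if __name__ == "__main__":
    # Two hypothetical presentations, times in milliseconds after onset.
    trials = [[1.0, 2.5, 12.0], [3.0, 27.0]]
    print(poststimulus_time_histogram(trials, 5.0, 0.0, 30.0))
```

With real data, a burst of spikes in the first bins followed by a lower plateau would reproduce the adaptation pattern described above.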

Tuning: We have mentioned that the action of the basilar membrane resembles that of a bank of tuned filters and that the tuning frequency is a function of the position on the BM. We also know that different hair cells lie on different parts of the BM. Thus, it is no surprise that auditory nerves have tuning properties. Figure 14.10 shows an example.

Figure 14.10 is obtained by the use of the following procedure: for a given frequency, a 50-ms tone burst is applied every 100 ms. The sound level is gradually increased until the spike discharge rate is increased by one spike/s, at which time the sound pressure level (or SPL, as defined in Chapter 13) is recorded. This procedure is repeated for all frequencies to create the curves shown in the figure. The lowest values of these curves correspond to the frequency at which the nerve is most sensitive and therefore to its resonant frequency. Thus, these neural tuning curves look like the inverse of a bandpass filter tuning curve.
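The threshold-tracking procedure can be sketched as a loop over frequency and level. The fiber model below (`toy_fiber_rate`) is invented purely so the example runs; its spontaneous rate, threshold slopes, and 2-kHz characteristic frequency are hypothetical and are not data from Fig. 14.10.

```python
import math

SPONT_RATE = 5.0  # spikes/s; hypothetical spontaneous rate


def toy_fiber_rate(freq_hz, level_db, cf_hz=2000.0):
    """Invented rate model: threshold rises faster above CF than below it."""
    octaves = math.log2(freq_hz / cf_hz)
    slope = 60.0 if octaves > 0 else 25.0  # dB per octave, made-up values
    threshold_db = 10.0 + slope * abs(octaves)
    drive = max(0.0, level_db - threshold_db)
    return SPONT_RATE + 4.0 * drive


def threshold_level(freq_hz, rate_fn, criterion=1.0, step_db=1.0, max_db=100.0):
    """Raise the level until the rate exceeds spontaneous by `criterion`."""
    level = 0.0
    while level <= max_db:
        if rate_fn(freq_hz, level) >= SPONT_RATE + criterion:
            return level
        level += step_db
    return None  # criterion never reached within the tested range


def tuning_curve(freqs, rate_fn):
    """Map each test frequency to the level at which the criterion is met."""
    return {f: threshold_level(f, rate_fn) for f in freqs}


if __name__ == "__main__":
    curve = tuning_curve([500.0, 1000.0, 2000.0, 4000.0], toy_fiber_rate)
    cf = min(curve, key=curve.get)  # most sensitive frequency
    print(curve, "CF =", cf)
```

The minimum of the resulting curve marks the frequency at which the toy fiber is most sensitive, mirroring how the characteristic frequency is read off a measured tuning curve.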

The results obtained here can be contrasted with von Bekesy's [2] measurements on basilar membranes. Treated as bandpass filters, the neural measurements result in appreciably narrower filters than von Bekesy's BM filters. For some time, physiologists tried to erect models to explain this discrepancy, but recent, more refined measurements on BM motion indicate agreement between BM and neuron. This allows us to advance the hypothesis that the primary component of an auditory fiber's tuning properties is the basilar membrane motion and that the curves of Fig. 14.10 are determined mainly by the BM motion and the position of the hair cell that the neuron innervates. There is, however, evidence that the hair bundles (stereocilia) of the hair cells also contribute to the tuning curves [5].


FIGURE 14.10 Tuning curves of six auditory nerve fibers. From [6].

Also notice the shape of the tuning curves. The left side of the resonance curve is a relatively slow function of frequency, whereas the right side has a much steeper slope. This fact also correlates well with the BM response to tones. For example, Fig. 14.5A shows the BM response to a 1000-Hz tone; the envelope of BM displacement is seen to be the approximate inverse to the tuning curves of Fig. 14.10.

Synchrony: Again we apply a tone and perform a measurement on a cat's auditory neuron. This time we measure the histogram of time intervals between adjacent spikes. When this measurement is repeated many times, interval histograms such as those shown in Fig. 14.11 are created.


FIGURE 14.11 Interval histograms showing periodic distributions of interspike intervals to pure tones of different frequencies. The stimulus frequency is indicated in each graph. The intensity of all stimuli is 80 dB SPL, and the tone duration is 21 s. The responses to 10 stimuli constitute the sample on which each histogram is based. (The abscissas plot time in milliseconds between successive neural discharges. Dots below the time axes indicate integral values of the period of each frequency employed.) From [9].

Notice that the time between peaks in both histograms is the inverse of the frequency. This indicates that spikes tend to occur in synchrony with the applied stimulus. There is still a probabilistic component to this phenomenon, as indicated by the finite width of the peaks and the small but observable noise floor – some small percentage of the spikes do not occur in synchrony with the signal.
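To see why the peaks of an interval histogram fall at multiples of the stimulus period, consider a deliberately simplified simulation: on each cycle of the tone the fiber either fires near a fixed phase or skips the cycle. The firing probability, jitter, and function names are assumptions for illustration, and the model ignores refractoriness and spontaneous activity.

```python
import random


def simulate_phase_locked_spikes(freq_hz, duration_s, p_fire=0.3,
                                 jitter_s=5e-5, seed=0):
    """Toy spike train: at most one spike per cycle, near a fixed phase."""
    rng = random.Random(seed)
    period = 1.0 / freq_hz
    spikes = []
    for k in range(int(duration_s * freq_hz)):
        if rng.random() < p_fire:  # the fiber skips most cycles
            spikes.append(k * period + rng.gauss(0.0, jitter_s))
    return sorted(spikes)


def interspike_intervals(spike_times):
    """Time differences between adjacent spikes."""
    return [b - a for a, b in zip(spike_times, spike_times[1:])]


def interval_histogram(intervals, bin_width, max_interval):
    """Count the intervals that fall in each bin of width bin_width."""
    n_bins = int(round(max_interval / bin_width))
    counts = [0] * n_bins
    for iv in intervals:
        idx = int(iv / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


if __name__ == "__main__":
    spikes = simulate_phase_locked_spikes(freq_hz=1000.0, duration_s=1.0)
    isis = interspike_intervals(spikes)
    # Bins of 0.1 ms over 0-10 ms; peaks are expected near 1 ms, 2 ms, ...
    print(interval_histogram(isis, 1e-4, 1e-2))
```

Because a spike can occur only near a fixed phase of some cycle, every interval is close to an integer number of periods; the jitter sets the width of each peak.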

Phase locking is another way to describe synchrony. Experimentally, it has been shown that neurons fire in phase with the stimulus primarily at low frequencies. In the cat's auditory nerve, phase locking gradually diminishes above 1 kHz and does not exist for frequencies beyond 5 kHz.

Nonlinearities: There are several phenomena that can be traced to the nonlinear behavior of the cochlea: saturation, two-tone suppression, masking by noise, and combination tones.

Saturation: The number of spikes that a nerve fiber can generate in a given time is limited by the biology of the fiber. It is also true that different fibers have varying properties. For example, the spontaneous rate can vary over many decibels. This can lead to an interesting result, shown in Fig. 14.12. Part D is the input power spectrum of the vowel in the word bet. Each point in parts A, B, and C is the normalized rate of a fiber at a given characteristic frequency (CF); these are high spontaneous rate fibers (and therefore low threshold fibers). As the input is increased, more fibers saturate so that as one progresses from A to C, the system loses its ability to represent the spectrum. In contrast, the low spontaneous fibers shown in parts E and F remain unsaturated for the same intensity inputs and thus continue to create a plausible image of the spectrum. It is currently of interest to auditory scientists to study the way in which more central auditory neurons make use of this diversity to yield a large dynamic range (100 dB) from the firings of individual neurons with dynamic ranges varying from 20 dB to 40 dB.

Two-tone suppression: Figure 14.9 shows adaptation to a tone by a nerve fiber. We see that after approximately 20 ms, the firing rate is at a steady state. If, now, a new tone is applied without removing the old tone, then, depending on the parameters, the old tone can be suppressed. Figure 14.13 shows this effect. After a short interval the old tone is strongly suppressed, but then the firing rate increases to a new steady state that is lower than the previous steady state. When the new tone is removed, the firing rate increases suddenly and then adapts to the steady state. The specific result depends on the parameters, including frequency of the masker (suppressing tone), amplitude of the masker, frequency and amplitude of the signal, the time relation between the signal and masker, the characteristic frequency of the fiber, and the threshold of the fiber. Figure 14.14 shows the shaded areas where a tone of a frequency different than the probe tone suppresses the probe tone by 20% or more. Outside the shaded areas but inside the tuning curve, the addition of a second tone causes an increase in the firing rate.

Masking of a tone by noise: If a tone plus noise is presented to a fiber, the noise has a suppressing effect on the response of the fiber to the tone. Figure 14.15 shows the firing rate of an auditory nerve fiber as a function of the tone intensity, both with and without additive wideband masking noise. As the tone intensity increases toward saturation, we see that the fiber fires at a lower rate when noise is present. The tone frequency and fiber CF were both 2.9 kHz. The noise band was 2.5-4 kHz.


FIGURE 14.12 Effect of input intensity on the rates of high spontaneous rate fibers (A, B, C) and low spontaneous rate fibers (D, E, F). From [10].


FIGURE 14.13 Suppression of a continuous tone by a tone burst. From [7].


FIGURE 14.14 Suppression regions of an auditory nerve fiber. From [1].

Combination tones: If a fiber is excited by two tones, a combination tone may appear that was not present in the stimulus. Thus, it is possible to excite a fiber with the combination tone 2f1 - f2 when both f1 and f2 are far from the CF of the fiber. For example, if the two applied tones are f1 = 1.0 kHz and f2 = 1.1 kHz, then the combination tones 0.7 kHz (4f1 - 3f2), 0.8 kHz (3f1 - 2f2), and 0.9 kHz (2f1 - f2) will also be able to excite the appropriately tuned fiber.
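The combination-tone frequencies in this example all have the form (n + 1)f1 - n·f2. The small helper below (a hypothetical function, written only to check the arithmetic) lists them for any pair of tones with f1 below f2.

```python
def combination_tones(f1, f2, max_order=3):
    """Return [(label, frequency)] for (n + 1)*f1 - n*f2, n = 1..max_order.

    Frequencies are in the same unit as f1 and f2; non-positive results
    are dropped because they do not correspond to an audible tone.
    """
    tones = []
    for n in range(1, max_order + 1):
        freq = (n + 1) * f1 - n * f2
        if freq > 0:
            second = "f2" if n == 1 else f"{n}f2"
            tones.append((f"{n + 1}f1 - {second}", freq))
    return tones


if __name__ == "__main__":
    # The example from the text, expressed in Hz to keep the arithmetic exact.
    for label, freq in combination_tones(1000, 1100):
        print(f"{label}: {freq} Hz")
```

For f1 = 1000 Hz and f2 = 1100 Hz this yields 900, 800, and 700 Hz, matching the values quoted in the text.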


FIGURE 14.15 Rate vs. intensity with and without additive noise. From [8].

  1. The complete mammalian auditory system contains many millions of neurons, but the peripheral auditory system of humans contains approximately 30,000 neurons and is the best understood component.
  2. The auditory system consists of both ascending and descending fibers. Thus, there is feedback at most levels.
  3. The outer ear terminates at the eardrum and affects the acoustics much like that of an acoustic tube.
  4. The middle ear performs mechanical impedance transformation, from the malleus (driven by the eardrum) to the stapes (which drives the inner ear fluids).
  5. The basilar membrane behaves like a bank of mechanical tuned circuits, over the complete range of auditory signals.
  6. BM motion is transmitted to the stereocilia of the hair cells, and this leads to the firings of peripheral auditory neurons.
  7. Auditory nerves adapt to stimuli, spiking vigorously at the beginning of a new input and then continuing to spike in the steady state at a reduced rate.
  8. Each auditory nerve has a best or characteristic frequency, which is a function of its position on the basilar membrane.
  9. For low frequencies (below 5 kHz) spikes tend to synchronize with periodic stimuli. This is called phase locking.
  10. Various nonlinearities exist in the auditory system, leading to such phenomena as the limited dynamic range of a nerve, masking effects, and combination tones.

Figure 14.16 is a summary block diagram of many of the connections in the system.


FIGURE 14.16 Conceptual block diagram of the peripheral auditory system. From [3].


  14.1 Assume that the ear canal in a typical person is 3 cm long and of cylindrical shape with a diameter of 0.8 cm. Also, assume that the eardrum behaves like a solid, inflexible wall. Describe how the acoustical properties of the eardrum affect the frequency response of the overall system. How do these properties affect neural spiking for a 100-Hz pure tone stimulus and for a 3-kHz stimulus?
  14.2 The dynamic range of a normal human ear is approximately 100 dB, but the measured dynamic range of many neurons is approximately 20-30 dB. How does the auditory system manage such a high dynamic range with such restricted elements?
  14.3 In response to pure sinusoidal tones, a specific auditory nerve will spike at different rates, depending on the stimulus frequency. Various theories of the mechanism have been advanced. Discuss these theories and present supporting empirical evidence.
  14.4 Describe the sequence of events leading to auditory nerve spiking when an acoustic pressure wave arrives at the outer ear.
  14.5 The frequency response in the peripheral auditory system is tonotopic (center frequency changes with place). Explain how this comes about.
  14.6 Present a heuristic justification for the statement that the basilar membrane behaves like a bank of bandpass filters.
  14.7 Refractory times in auditory nerves (intervals immediately following a spike during which the nerve is incapable of firing again) are at least several milliseconds. Explain how the auditory system is capable of responding to high-frequency stimuli (5 kHz or higher).
  14.8 Many people are deaf because their hair cells have been destroyed. Can you think of a way in which some hearing could be restored?
  14.9 Devise one or more thought experiments that demonstrate that listeners can hear frequencies that are not present in the stimulus.


  1. Arthur, R. M., Pfeifer, R. R., and Suga, N., “Properties of ‘two-tone inhibition’ in primary auditory neurones,” J. Physiol. 212: 593-609, 1971.
  2. von Bekesy, G., Experiments in Hearing, McGraw–Hill, New York, 1960.
  3. Frishkopf, L. S., Class notes, Quantitative Physiology II–Sensory Systems, Massachusetts Institute of Technology, 1989.
  4. von Helmholtz, H., On the Sensation of Tone as a Physiological Basis for the Study of Music, 4th ed., A. J. Ellis, trans., Dover, New York, 1954; orig. German, 1862.
  5. Hudspeth, A. J., “The cellular basis of hearing: the biophysics of hair cells,” Science 230: 745-752, 1985.
  6. Kiang, N. Y.-S., and Moxon, E. C., “Tails of tuning curves of auditory nerve fibers,” J. Acoust. Soc. Am. 55: 620-630, 1974.
  7. Pickles, J., An Introduction to the Physiology of Hearing, 2nd ed., Academic Press, New York/London, 1988.
  8. Rhode, W. S., Geisler, C. D., and Kennedy, D. T., “Auditory nerve fiber responses to wide-band noise and tone combinations,” J. Neurophysiol. 41: 692-704, 1978.
  9. Rose, J. E., Brugge, J. F., Anderson, D. J., and Hind, J. E., “Phase-locked response to low-frequency tones in single auditory-nerve fibers of the squirrel monkey,” J. Neurophysiol. 30: 262-286, 1967.
  10. Sachs, M. B., and Young, E. D., “Encoding of steady state vowels in the auditory nerve: representation in terms of discharge rate,” J. Acoust. Soc. Am. 66: 470-479, 1979.
  11. Shepherd, G. M., Neurobiology, Oxford Univ. Press, London/New York, pp. 101-119, 1988.
  12. Tobias, J. V., ed., Foundations of Modern Auditory Theory II, Academic Press, New York/London, 1972.
  13. Yost, W. A., and Nielsen, D. W., Fundamentals of Hearing – An Introduction, Holt, Rinehart & Winston, New York, 1977.

Footnote 1: Figures 14.1 and 14.2 omit many details of the auditory pathways. Only the most studied nuclei are shown.
