Von Kempelen spent 20 years building his speech synthesizer. He used the most viable method of implementation for his time (~1780): mechanical devices. In the first half of the 20th century, Fant and others built speech synthesizers from analog electronic components. When the digital computer arrived, speech researchers recognized its potential for speech-processing tasks, but only recently has computational power become great enough, and its cost low enough, that even complex algorithms can be implemented cheaply and in real time. Advances in speech processing therefore owe much to advancing computer technology; in addition, this progress has depended on the mathematical discipline of digital signal processing (also called discrete-time signal processing).

The connection between speech and digital signal processing is straightforward. Speech depends greatly on filtering, both in production and perception. The vocal tract is a complicated arrangement of acoustic tubes; understanding the behavior of vocal tracts relies on physical models of these acoustic tubes. We shall see in Chapters 10, 11, and 12 that digital models of tubes are based on digital signal processing (DSP) concepts. Also, the auditory system was recognized, more than a century ago, to have properties akin to a filter bank that analyzes the spectral characteristics of the speech signal.

It therefore is desirable to include material from the DSP field, with emphasis on the filtering properties of DSP algorithms. The fundamentals of DSP are briefly reviewed and then applied to the theory and design of digital filters, with emphasis on those elements that connect to our description of speech and music coding.

In this section we discuss the mathematical properties of the *z* transform. In Chapter 7 we will discuss the mathematical properties of the discrete Fourier transform. These transforms are the mathematical bridges that connect the time and frequency properties of discrete-time signals, just as the Laplace transform bridges the time–frequency properties of continuous signals. We start with a sequence *x(n)*, defined for all *n*. Define the *z* transform of *x(n)* as

$$\mathcal{X}(z) = \sum_{n=-\infty}^{\infty} x(n)\, z^{-n} \tag{6.1}$$

where *z* is a complex variable and $\mathcal{X}(z)$ is a function of a complex variable.

Although Eq. 6.1 makes no explicit reference to time, in many practical cases *x(n)* is derived by sampling a continuous signal at equally spaced time intervals.

In dealing with physical systems, it is convenient to assume that sequences begin at *n* = 0, so that *x(n)* is undefined for negative values of *n* and we have another definition of the *z* transform:

$$X(z) = \sum_{n=0}^{\infty} x(n)\, z^{-n} \tag{6.2}$$

We will be dealing with Eq. 6.2 unless otherwise noted. We call $\mathcal{X}(z)$ the two-sided *z* transform and refer to *X(z)* as simply the *z* transform, or, for the sake of clarity, the one-sided *z* transform.

Note that the *z* transform is a *linear* operation; that is, the *z* transform of a weighted sum is the weighted sum of the *z* transforms of the individual terms of the sum. This can easily be seen by inspection of Eq. 6.1 or Eq. 6.2. Also, the *z* transform of a *delayed* sequence *x(n – m)* is the *z* transform of the original sequence multiplied by *z*^{–m}. (The proof is left as an exercise.) These properties will prove to be extremely useful.
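The linearity and delay properties can be checked numerically for finite sequences. The following is a minimal sketch rather than part of the original development; the helper names `z_transform` and `delay` and the test point `z` are our own illustrative choices, and the delayed sequence is assumed to be zero for negative *n*.

```python
def z_transform(x, z):
    """One-sided z transform of the finite sequence x(0), x(1), ... at the point z."""
    return sum(xn * z ** (-n) for n, xn in enumerate(x))


def delay(x, m):
    """Delay x by m samples, assuming x(n) = 0 for n < 0 (hypothetical helper)."""
    return [0.0] * m + list(x)


x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 4.0]
z = 0.8 + 0.6j  # arbitrary nonzero test point in the z plane

# Linearity: the transform of 2x + 3y equals 2X(z) + 3Y(z).
w = [2 * a + 3 * b for a, b in zip(x, y)]
lin_err = abs(z_transform(w, z) - (2 * z_transform(x, z) + 3 * z_transform(y, z)))

# Delay: the transform of x(n - m) equals z^(-m) X(z).
m = 3
delay_err = abs(z_transform(delay(x, m), z) - z ** (-m) * z_transform(x, z))

print(lin_err, delay_err)
```

Both errors should be at the level of floating-point rounding for any nonzero `z`.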

Equation 6.2 is invertible; that is, we can find the sequence *x(n)*, given the function *X(z)*. To show this, multiply Eq. 6.2 by *z*^{k–1} and perform a closed line integration on both sides of the equation. If the integration path is within the region of convergence of the infinite series, then the summation and integration can be interchanged, yielding

$$\oint X(z)\, z^{k-1}\, dz = \sum_{n=0}^{\infty} x(n) \oint z^{k-n-1}\, dz \tag{6.3}$$

But (stated without proof; see Exercise 6.6),

$$\oint z^{k-n-1}\, dz = \begin{cases} 2\pi j, & n = k \\ 0, & n \neq k \end{cases} \tag{6.4}$$

From Eqs. 6.3 and 6.4 we have

$$x(k) = \frac{1}{2\pi j} \oint X(z)\, z^{k-1}\, dz \tag{6.5}$$

In Eqs. 6.3, 6.4, and 6.5, the integration path must enclose the origin.

For many practical problems, this integration never need be explicitly done; rather, the inverse transform can often be computed by inspection, with the use of the linearity and delay properties described in Section 6.2 (see Exercise 6.3).
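Although the contour integral is rarely needed in practice, it can be approximated numerically, which gives a quick check of the inversion formula. The sketch below is our own illustration under stated assumptions: the integration path is the unit circle, the sequence is finite (so the unit circle lies in the region of convergence), and the names `inverse_z` and `num_points` are hypothetical.

```python
import cmath
import math


def z_transform(x, z):
    """One-sided z transform of a finite sequence at the point z."""
    return sum(xn * z ** (-n) for n, xn in enumerate(x))


def inverse_z(X, k, num_points=64):
    """Approximate (1 / (2*pi*j)) * closed integral of X(z) z^(k-1) dz on the unit circle."""
    total = 0j
    step = 2 * math.pi / num_points
    for i in range(num_points):
        z = cmath.exp(1j * step * i)   # point on the unit circle
        dz = 1j * z * step             # dz = j e^{j theta} d theta
        total += X(z) * z ** (k - 1) * dz
    return total / (2j * math.pi)


x = [1.0, -2.0, 0.5, 3.0]
recovered = [inverse_z(lambda z: z_transform(x, z), k).real for k in range(len(x))]
print(recovered)
```

For a finite sequence shorter than `num_points`, the uniform sum over the circle reproduces the samples up to rounding error.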

The discrete convolution theorem is the defining equation of linear discrete systems. The mathematical statement is

$$y(n) = \sum_{m=0}^{n} h(m)\, x(n-m) \tag{6.6}$$

where *x(n)* is the input, *y(n)* is the output, and *h(n)* is the response of the linear system to a unit pulse; the unit pulse is the sequence that equals 1 for *n* = 0 and is zero for all other *n*.

The unit-pulse response function *h(n)* is associated with the time-domain behavior of the system; knowledge of *h(n)* allows one, in principle, to find the response to any arbitrary input signal. Similarly, the system response to a discrete-time complex exponential can serve as a defining function in the frequency domain. Let the complex exponential be *e*^{jωn}. From Eq. 6.6, the steady-state response (as *n* → ∞) is

$$y(n) = \sum_{m=0}^{\infty} h(m)\, e^{j\omega (n-m)} = e^{j\omega n}\, H(e^{j\omega}) \tag{6.7}$$

where $H(e^{j\omega}) = \sum_{m=0}^{\infty} h(m)\, e^{-j\omega m}$ is seen to be the *z* transform of *h(n)* evaluated on the unit circle, that is, the frequency response of the system.

If *X(z)* is the *z* transform of *x(n)* and *H(z)* is the *z* transform of *h(n)*, then it can be shown that

$$Y(z) = X(z)\, H(z) \tag{6.8}$$

(See Exercise 6.1.)

Equation 6.6 is a temporal relation between two time functions; Eq. 6.8 is the equivalent relation in the complex *z* domain. If the value of *z* is restricted to lie on the unit circle in the *z* plane, then the *z* transform reduces to the Fourier transform of the sequence.
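The equivalence between convolution in time and multiplication in the *z* domain is easy to confirm for short sequences. The following sketch is ours, not the text's; it assumes finite sequences that start at *n* = 0, and `convolve` is a hypothetical helper that implements the convolution sum directly.

```python
import cmath


def convolve(x, h):
    """Linear convolution y(n) = sum over m of h(m) x(n - m) for finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for m in range(len(h)):
            if 0 <= n - m < len(x):
                y[n] += h[m] * x[n - m]
    return y


def z_transform(x, z):
    """One-sided z transform of a finite sequence at the point z."""
    return sum(xn * z ** (-n) for n, xn in enumerate(x))


x = [1.0, 2.0, 3.0]
h = [1.0, -0.5]
y = convolve(x, h)

# Compare Y(z) with X(z) H(z) at a general point and at a point on the unit circle.
z_general = 1.1 * cmath.exp(0.7j)
z_circle = cmath.exp(0.3j)
err_general = abs(z_transform(y, z_general) - z_transform(x, z_general) * z_transform(h, z_general))
err_circle = abs(z_transform(y, z_circle) - z_transform(x, z_circle) * z_transform(h, z_circle))
print(y, err_general, err_circle)
```

The second comparison, taken on the unit circle, is the frequency-domain form of the same statement.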

Stated physically, convolution in the time domain leads to multiplication in the frequency domain. Furthermore, multiplication in the time domain leads to convolution in the frequency domain. We will state the result without proof.

Integration is on the unit circle.

If a continuous function of time *x(t)* is sampled every *T* seconds to produce the sequence *x(nT)*, the resultant frequency response is *periodic*, with period 1/*T*. If the original signal is band limited so that it contains no frequencies greater than 1/(2*T*), we get the pictures of Figs. 6.1 and 6.2.

No information is lost in going from Fig. 6.1 to Fig. 6.2. In fact, the original, continuous signal can be recovered by low-pass filtering with a filter of bandwidth 1/(2*T*); this is a statement of the well-known sampling theorem.

As a consequence of the well-known time-frequency trade-off, a sample, which has zero width in time, will have an infinite width in frequency; so, therefore, will a sequence of samples. The fact that each period in Fig. 6.2 is an exact duplicate of the original response of Fig. 6.1 can be shown in a number of ways.

First, since the samples represent the product of two signals (the original signal and a set of unity height samples *T* seconds apart), the resulting frequency response of the product is the convolution of the two frequency responses and leads directly to Fig. 6.2.

Second, Fig. 6.1 has its counterpart in the *z* plane. If the *z* transform of a sequence is evaluated on the unit circle in the *z* plane, then *z* = *e*^{jθ}. Let θ in Fig. 6.3 be ω*T*; the *z* transform becomes

$$X(e^{j\omega T}) = \sum_{n=0}^{\infty} x(nT)\, e^{-j\omega nT} \tag{6.10}$$

One complete path around the unit circle corresponds to ω*T* traveling from 0 to 2π. When ω*T* = 2π*f*_{r}*T* = 2π, *f*_{r} = 1/*T*, so the frequency response repeats at intervals of 1/*T*.
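The periodicity of the sampled spectrum can be illustrated numerically. The sketch below is our own example with assumed values (an 8-kHz sampling rate and a 1-kHz cosine); it shows that a component shifted in frequency by 1/*T* yields the same samples, and that the transform on the unit circle repeats when ω increases by 2π/*T*.

```python
import cmath
import math

T = 1.0 / 8000.0           # assumed sampling interval (8-kHz sampling rate)
f = 1000.0                 # assumed signal frequency in Hz
f_shifted = f + 1.0 / T    # same frequency shifted by one spectral period

samples = [math.cos(2 * math.pi * f * n * T) for n in range(16)]
samples_shifted = [math.cos(2 * math.pi * f_shifted * n * T) for n in range(16)]
max_diff = max(abs(a - b) for a, b in zip(samples, samples_shifted))


def spectrum(x, omega):
    """z transform of the finite sequence x evaluated at z = exp(j * omega * T)."""
    z = cmath.exp(1j * omega * T)
    return sum(xn * z ** (-n) for n, xn in enumerate(x))


omega = 2 * math.pi * 1500.0
period_err = abs(spectrum(samples, omega) - spectrum(samples, omega + 2 * math.pi / T))
print(max_diff, period_err)
```

Both differences are zero up to rounding, which is why a signal must be band limited to 1/(2*T*) before sampling if it is to be recovered unambiguously.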

Much of the material in this book involves analysis and synthesis. Analysis often consists of studying the frequency components of signals; this is best done through filters, or through transform methods. In the analog world, filters are composed of resistors, capacitors, and inductors or their electronic equivalents. In the digital world we generally do filtering with computer programs. The signals we process are digital; they are quantized samples, or more simply, numbers.

Synthesis often consists of modeling physical devices, such as violins or human vocal tracts. In many cases the models may be approximated with linear systems. Filtering in the analog domain is mainly expressed mathematically through systems of linear differential equations. Modeling of systems in which space as well as time are parameters requires systems of partial differential equations. In the digital domain, systems of linear difference equations are needed to mathematically describe one-dimensional linear time-invariant devices. The description of more complicated systems such as acoustic tubes involves computer algorithms that have been labeled “digital wave guides” by Smith [8]. In many systems of interest, synthesis models obey some form of the wave equation, with solutions consisting of traveling or standing waves. Such models can be implemented by computer by incorporating relatively long delay elements into the algorithms. The following sections deal with both types of algorithms.

A simple example is the first-order equation:

$$y(n) = K\, y(n-1) + x(n) \tag{6.11}$$

A solution to this equation may be obtained by use of the *z* transform. Given the previously described properties of linearity and delay, we can show the *z*-transform solution. Let *X(z)* be the *z* transform of *x(n)* and let *Y(z)* be the *z* transform of *y(n)*; taking the *z* transform of Eq. 6.11, we find

$$Y(z) = K \left[ z^{-1}\, Y(z) + y(-1) \right] + X(z) \tag{6.12}$$

Solving for *Y(z)* yields

$$Y(z) = \frac{X(z) + K\, y(-1)}{1 - K z^{-1}} \tag{6.13}$$

where *y*(–1) can be interpreted as an initial condition of *y(n)*. For example, if *y*(–1) = 0 (the system is initially at rest), then

$$Y(z) = H(z)\, X(z) \tag{6.14}$$

where

$$H(z) = \frac{1}{1 - K z^{-1}} \tag{6.15}$$

Here *H(z)* is the *transfer function* related to the difference equation 6.11. The transfer function is defined as the ratio of the *z* transform of the output to the *z* transform of the input. Since the *z* transform of a unit pulse is unity, the transfer function can also be defined as the *z* transform of the output when the input is a unit pulse.

The frequency response of this first-order system can be studied from the geometry in the *z* plane. First, note from Eq. 6.13 that knowing the input *X(z)* anywhere in the *z* plane, plus knowledge of the number *y*(–1), allows determination of the *z* transform *Y(z)* of the output anywhere in the *z* plane. Of special interest is the response of the system to a steady-state sinusoid. For mathematical brevity we use as the input the complex exponential *x(n)* = *e*^{jnθ}. The resulting response of the system can be found by evaluating *H(z)* on the unit circle, that is, at *z* = *e*^{jθ}.

If *x(n)* consists of samples from an analog signal, θ = ω*T*, so Eqs. 6.16 and 6.17 are direct functions of frequency.

From Fig. 6.4 and Eqs. 6.16 and 6.17 we see that the smallest magnitude of the vector from the pole to the unit circle occurs for θ = 0, so this value of θ, and hence of frequency, corresponds to the maximum value of the frequency-response magnitude. In general, as the path on the unit circle gets close to a pole, the magnitude of the frequency response increases. If we move the pole to some angle φ, near the unit circle, then, as θ approaches φ, the magnitude function peaks; this is resonance.
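The geometric argument can be checked with a few lines of code. This is a sketch under assumptions of our own: we take the first-order system to be *y(n)* = *Ky(n* – 1) + *x(n)* with a real pole at *z* = *K*, and we pick *K* = 0.9 purely for illustration. The magnitude of the frequency response should then be the reciprocal of the distance from the pole to the point *e*^{jθ} on the unit circle.

```python
import cmath
import math

K = 0.9  # assumed pole position on the positive real axis, inside the unit circle


def H(z):
    """Transfer function 1 / (1 - K z^-1) of the assumed first-order system."""
    return 1.0 / (1.0 - K / z)


thetas = [math.pi * i / 100 for i in range(101)]                 # 0 to pi
mags = [abs(H(cmath.exp(1j * t))) for t in thetas]               # |H| on the unit circle
dists = [abs(cmath.exp(1j * t) - K) for t in thetas]             # pole-to-circle distance

peak_theta = thetas[mags.index(max(mags))]
print(peak_theta, mags[0], mags[-1])
```

With the pole on the positive real axis, the peak falls at θ = 0, where the distance to the pole is 1 – *K*.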

Figure 6.5 depicts such a situation.

We notice that the pole position is a complex number in the *z* plane. What is the difference equation in the time domain that could lead to Fig. 6.5? Let us try

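which leads to the geometry of Fig. 6.5. However, notice that *y(n)*, the inverse *z* transform, is a sequence of *complex* numbers. The determination of resonance based on sequences of real numbers requires a second-order difference equation:

$$y(n) = A\, y(n-1) + B\, y(n-2) + x(n) \tag{6.21}$$

The *z* transform of Eq. 6.21, for a system initially at rest, is

$$Y(z) = A z^{-1}\, Y(z) + B z^{-2}\, Y(z) + X(z) \tag{6.22}$$

Solving for *Y(z)*, we find

$$Y(z) = \frac{X(z)}{1 - A z^{-1} - B z^{-2}} \tag{6.23}$$

The geometry in the complex *z* plane is shown in Fig. 6.6.

If *A* = 2*r* cos θ and *B* = –*r*^{2}, the denominator of Eq. 6.23 becomes

$$1 - 2r\cos\theta\; z^{-1} + r^{2} z^{-2} \tag{6.24}$$

and this results in the roots

$$z = r\, e^{\pm j\theta} \tag{6.25}$$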
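A short simulation makes the second-order resonance concrete. The sketch below is our own and rests on stated assumptions: the difference equation is taken as *y(n)* = *Ay(n* – 1) + *By(n* – 2) + *x(n)*, and the values *r* = 0.95 and a pole angle of 60° are chosen only for illustration. It checks that the poles lie at radius *r* and angle ±θ, that the unit-pulse response is real, and that the magnitude response peaks near the pole angle.

```python
import cmath
import math

r, theta0 = 0.95, math.pi / 3                 # assumed pole radius and angle
A, B = 2 * r * math.cos(theta0), -r * r

# Poles are the roots of z^2 - A z - B = 0.
disc = cmath.sqrt(A * A + 4 * B)
poles = [(A + disc) / 2, (A - disc) / 2]


def impulse_response(num):
    """Run y(n) = A y(n-1) + B y(n-2) + x(n) with a unit pulse input, starting at rest."""
    h = []
    for n in range(num):
        x_n = 1.0 if n == 0 else 0.0
        y1 = h[n - 1] if n >= 1 else 0.0
        y2 = h[n - 2] if n >= 2 else 0.0
        h.append(A * y1 + B * y2 + x_n)
    return h


h = impulse_response(50)
# Closed form for a complex-conjugate pole pair: r^n sin((n+1) theta) / sin(theta).
closed = [r ** n * math.sin((n + 1) * theta0) / math.sin(theta0) for n in range(50)]


def H(z):
    return 1.0 / (1.0 - A / z - B / (z * z))


grid = [math.pi * i / 1000 for i in range(1001)]
peak = max(grid, key=lambda t: abs(H(cmath.exp(1j * t))))
print(poles, peak)
```

The peak sits slightly below the pole angle because the conjugate pole also contributes to the magnitude; the offset shrinks as *r* approaches 1.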

Another form of resonance can be obtained from the configuration shown in Fig. 6.7.

Figure 6.8 shows the *z*-transform equivalence of Fig. 6.7.

*H*_{1}(*z*) is the transfer function of the difference equation,

and *H*_{2} (*z*) is the transfer function of the equation

Explicit expressions for *H*_{1}(*z*) and *H*_{2}(*z*) are easily obtained:

In the *z* plane, *H*_{1}(*z*) is represented by *M* zeros spaced uniformly around the unit circle. If *M* is even (e.g., 12) and the minus sign in Eq. 6.28 is used, the zeros are as shown in Fig. 6.9. Thus, for example, the zeros at 60° can be cancelled by designing *H*_{2}(*z*) to have two poles at the same angles, as shown in Fig. 6.10. Cascading the two *z* transforms yields the pole-zero plot of Fig. 6.11.

Figure 6.12 shows the magnitude of the frequency response.
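The pole-zero cancellation can be tried out in code. The sketch below is our own illustration and makes explicit assumptions about the two sections: *H*_{1}(*z*) = 1 – *z*^{–M} with *M* = 12 (zeros every 30° on the unit circle), and *H*_{2}(*z*) = 1/(1 – 2 cos θ *z*^{–1} + *z*^{–2}) with θ = 60° (poles on the unit circle at ±60°). If the cancellation works, the cascade has a unit-pulse response of finite length *M*, a zero at 30°, and a finite gain at 60°.

```python
import cmath
import math

M = 12                       # assumed comb length
theta = math.pi / 3          # assumed resonator angle (60 degrees)
c = 2 * math.cos(theta)


def cascade_impulse_response(num):
    """Unit-pulse response of the comb w(n) = x(n) - x(n-M) followed by
    the resonator y(n) = c y(n-1) - y(n-2) + w(n)."""
    x = [1.0] + [0.0] * (num - 1)
    w = [x[n] - (x[n - M] if n >= M else 0.0) for n in range(num)]
    y = []
    for n in range(num):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(c * y1 - y2 + w[n])
    return y


h = cascade_impulse_response(60)
tail = max(abs(v) for v in h[M:])   # response after the first M samples


def magnitude(angle):
    """Magnitude response computed from the first M samples of h."""
    return abs(sum(h[n] * cmath.exp(-1j * angle * n) for n in range(M)))


mag_30 = magnitude(math.pi / 6)     # at an uncancelled zero of the comb
mag_60 = magnitude(math.pi / 3)     # at the cancelled pole-zero pair
print(tail, mag_30, mag_60)
```

Because the poles lie exactly on the unit circle, the cancellation depends on exact arithmetic; in this short run the residual is at rounding level, but a practical design would need to consider how coefficient quantization affects it.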

In the preceding sections, we have seen how the *z* transform allows us to analyze a time-domain system, defined in terms of a difference equation, and predict its frequency response by evaluating the *z* transform on the unit circle. Because of the simple relationship between the position of the poles and zeros in the *z* plane and the behavior of the overall *z* transform, we can quickly get an approximate idea of the frequencies at which our system will have peaks in its gain (resonances, resulting from poles close to the unit circle), or minima (anti-resonances, arising from zeros near or on the unit circle).

Most new systems for the processing of speech and music are now digital, and as such are based on the fundamental mathematical tools briefly reviewed in this chapter. Filters typically perform linear convolutions, and filter responses are generally specified in terms of their *z* transforms, which have their time-domain equivalence in terms of difference equations. First- and second-order systems form the basis of much discussion about linear discrete-time systems, and understanding the basics of the effect of transforming a continuous time signal into a sequence of numbers (sampling) is fundamental to this work.

For further reading, there is a wide range of reference texts on the subject of DSP, including [4], [1], [7], [5], and [6]. Of particular historic interest is Chapter 5, by W. Hurewicz, in [2]. Reference [3] describes the fundamental mathematics behind the *z* transform.

**6.1** Prove that if *y(n)* is the convolution of two sequences *x(n)* and *h(n)*, then the *z* transform *Y(z)* is the product of the *z* transforms *X(z)* and *H(z)*.

**6.2** Prove that the *z* transform of a delayed sequence *x(n – M)* is *z*^{–M}*X(z)*, where *X(z)* is the *z* transform of the undelayed sequence. What conditions are needed to make the proof correct?

**6.3** Let *W(z)* = 2*X(z)* + 3*z*^{–3}*Y(z)*. Find the inverse *z* transform of *W(z)* by inspection of the right-hand side, that is, without using contour integrals.

**6.4** Let *H(z)* = *Y(z)*/*X(z)* = (1 – *z*^{–1})/(1 – 0.5*z*^{–1}). Use algebra to generate an expression with *Y(z)* on the left-hand side and then use inspection to generate a corresponding difference equation.

**6.5** Figure 6.13 is a *z* transform illustration of a digital filter. Choose θ in *H*_{2}(*z*) to cancel the lowest frequency pair of *H*_{1}(*z*). Plot the magnitude of the resulting frequency response of *H*_{1}(*z*)*H*_{2}(*z*) on the unit circle.

**6.6** Try to prove the famous Cauchy integral equation (Eq. 6.4). If you have trouble, find a book on functions of a complex variable, study the proof, and write it up.

**6.7** Let *x(n)* = *K*^{n}, *n* = 0, 1, 2, 3, .... Find the *z* transform of *x(n)*. Given *X(z)*, use the inversion theorem Eq. 6.5 to compute *x(n)*.

**6.8** What is the solution to the difference equation *y(n)* = *Ky(n* – 1) with initial condition *y*(0) = 1? Next, find the region of convergence of the *z* transform of your solution *y(n)*. *K* ≤ 1.

**6.9** A finite sequence *h(n)* is shown in Fig. 6.14.

- (a) Find *H(e*^{jω}*)*, and assuming that *H(e*^{jω}*)* = *e*^{–0.5jω(N – 1)}*R*(ω), sketch *R*(ω) in the interval 0 ≤ ω ≤ 2π.
- (b) This impulse response can be implemented by the structure shown in Fig. 6.15. Determine *a*, *b*, and *c* for *N* odd and for *N* even.

A Hanning window *w(n)* is applied to *h(n)* to form the product *f(n)*, where *w(n)* = 1/2 – 1/2 cos(2π*n*/*N*).

- (c) Sketch the new function *f(n)* versus *n*.
- (d) Show how to implement a filter with impulse response *f(n)* by the frequency-sampling structure shown in part (b).

**6.10** Figure 6.16 shows the pole patterns of three *z* transforms, *H*_{1}(*z*), *H*_{2}(*z*), and *H*_{3}(*z*). Determine the conditions for which the corresponding sequences *h*_{1}(*n*), *h*_{2}(*n*), and *h*_{3}(*n*) can be stable.

**6.11** Find the magnitude of *H(z)* on the unit circle at ω = 0, π/4, π/8, and 7π/8 for the network described by the difference equation *y(n)* – *by(n* – 1) = *x(n)* – *ax(n* – 1) with initial conditions *x(n)* = 0 and *y(n)* = 0, for *a* = 2 and *b* = 1/2.

**6.12** Find the linear convolution of the two sequences *x(n)* and *h(n)* shown in Fig. 6.17.

**6.13** Consider the equation

with *y*(–1) = 0 and *x(n)* = 0 for *n* ≤ –1.

- (a) Obtain an explicit solution for *H(z)*, the network transfer function.
- (b) Draw the digital network.
- (c) Does this network have all poles, all zeros, or both poles and zeros?
- (d) Show the positions of the network zeros, if any, in the complex *z* plane.
- (e) Is the network stable?
- (f) Sketch *y(n)* versus *n* if *x(n)* is a unit sample.

1. Gold, B., and Rader, C. M., *Digital Processing of Signals*, McGraw–Hill, New York, 1969.
2. James, H. M., Nichols, N. B., and Phillips, R. S., *Theory of Servomechanisms*, McGraw–Hill, New York, 1947.
3. Knopp, K., *Theory of Functions I*, Dover Publications, New York, 1945.
4. Kuo, F. F., and Kaiser, J. F., *System Analysis by Digital Computer*, John Wiley, New York, 1966.
5. Oppenheim, A., and Schafer, R., *Discrete-Time Signal Processing (3rd Ed.)*, Prentice–Hall, Englewood Cliffs, N.J., 2009.
6. Oppenheim, A., and Schafer, R., *Digital Signal Processing*, Prentice–Hall, Englewood Cliffs, N.J., 1975.
7. Rabiner, L., and Gold, B., *Theory and Application of Digital Signal Processing*, Prentice–Hall, Englewood Cliffs, N.J., 1975.
8. Smith, J. O., “Physical Modeling Using Digital Waveguides,” *Comp. Music J.* **16(4)**: 74–91, Winter 1992.
