14

Signal Space

 

     Rodger E. Ziemer

14.1 Introduction

14.2 Fundamentals

14.3 Application of Signal Space Representation to Signal Detection

14.4 Application of Signal Space Representation to Parameter Estimation

14.5 Miscellaneous Applications

Wavelet Transforms

Mean-Square Estimation: The Orthogonality Principle

Volterra Adaptive Lattice Filtering

References

Further Reading

14.1 Introduction

 

The concept of signal space has its roots in the mathematical theory of inner product spaces known as Hilbert spaces (Stakgold, 1967). Many books on linear systems touch on the subject of signal spaces in the context of Fourier series and transforms (Ziemer et al., 1998). The power of signal space concepts in communication theory lies in the representation of signal detection and estimation problems in geometrical terms, which provides much insight into signaling techniques and communication system design. Apparently the first person to exploit the power of signal space concepts in communication theory was the Russian Kotel'nikov (1968), who presented his doctoral dissertation in January 1947. Wozencraft and Jacobs (1965) expanded on this approach, and their work is still widely referenced today. Arthurs and Dym (1962) made use of signal space concepts in the performance analysis of several digital modulation schemes. A one-chapter summary of the use of signal space methods in signal detection and estimation is provided in Ziemer and Tranter (2009). Another application of signal space concepts is in signal and image compression; wavelet theory (Rioul and Vetterli, 1991) is currently finding use in these application areas. Finally, the application of signal space concepts to nonlinear filtering is discussed. In the next section, the fundamentals of generalized vector spaces are summarized, followed by an overview of several applications to signal representations.

14.2 Fundamentals

 

A linear space or vector space (signal space) (Stakgold, 1967) is a collection of elements (called vectors) x, y, z, …, for which the following axioms are satisfied:

  1. To every pair of vectors x and y there corresponds a vector x + y with the properties:

    1. x + y = y + x

    2. x + (y + z) = (x + y) + z

    3. There exists a unique element 0 such that x + 0 = x, for every x

    4. To every x, there exists a unique vector labeled −x such that x + (−x) = 0

  2. For all vectors x and y, and all numbers (scalars) α and β, the following associative and distributive rules hold:

    1. α(βx) = (αβ)x

    2. (α + β)x = αx + βx

    3. α(x + y) = αx + αy

    4. 1x = x where 1 is the identity element

A vector x is said to be a linear combination of the vectors x1, x2, …, xk in a vector space if there exist numbers α1, α2, …, αk such that

x = \sum_{i=1}^{k} \alpha_i x_i \qquad (14.1)

The vectors x1, x2, …, xk are said to be linearly dependent (or form a dependent set) if there exist numbers α1, α2, …, αk not all zero, such that

\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_k x_k = 0 \qquad (14.2)

If Equation 14.2 can be satisfied only for α1 = α2 = … = αk = 0, the vectors are linearly independent.

One is tempted to define independence for an infinite set of vectors by using an infinite-sum version of Equation 14.2. This does not work in general; one needs the notion of convergence, which is based on the concept of distance between vectors.
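For vectors in ordinary Euclidean space, the test implied by Equation 14.2 can be carried out numerically by comparing the rank of the matrix whose rows are the vectors with the number of vectors. A minimal sketch in Python follows (the specific vectors are arbitrary and chosen only for illustration):

```python
import numpy as np

# Three vectors in R^3; the third is a linear combination of the first two,
# so the set is linearly dependent (Equation 14.2 has a nontrivial solution).
x1 = np.array([1.0, 0.0, 2.0])
x2 = np.array([0.0, 1.0, -1.0])
x3 = 2.0 * x1 - 3.0 * x2          # dependent by construction

A = np.vstack([x1, x2, x3])
rank = np.linalg.matrix_rank(A)
print("independent" if rank == A.shape[0] else "dependent")  # -> dependent
```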

With the idea of linear independence firmly in mind, the concept of dimension of a vector space readily follows. A vector space is n-dimensional if it possesses a set of n independent vectors, but every set of n + 1 vectors is a dependent set. If for every positive integer k, a set of k independent vectors in the space can be found, the space is said to be infinite dimensional. A basis for a vector space means a finite set of vectors e1, e2, …, en with the following attributes:

  1. They are linearly independent

  2. Every vector x in the space can be written as a linear combination of the basis vectors; that is

    x = \sum_{i=1}^{n} \xi_i e_i \qquad (14.3)

It can be proved that the representation (14.3) is unique and that if the space is n-dimensional, any set of n independent vectors e1, e2, …, en forms a basis.

The next concept to be developed is that of a metric space. In addition to the addition of vectors and multiplication of vectors by scalars, as is true of ordinary three-dimensional vectors, it is important to have the notions of length and direction of a vector imposed. In other words, a metric structure must be added to the algebraic structure already defined. A collection of elements x, y, z, … in a space will be called a metric space if to each pair of elements x, y there corresponds a real number d(x, y) satisfying the properties:

  1. d(x, y) = d(y, x)

  2. d(x, y) ≥ 0 with equality if and only if x = y

  3. d(x, z) ≤ d(x, y)+ d(y, z) (called the triangle inequality)

The function d(x, y) is called a metric (or distance function). Note that the definition of d(x, y) does not require that the elements be vectors; there may not be any way of adding elements or multiplying them by scalars as required for a vector space.

With the definition of a metric, one can now discuss the idea of convergence of a sequence {xk} of elements in the space. Note that d(x, xk) is a sequence of real numbers. Therefore, it is sensible to write that

\lim_{k \to \infty} x_k = x \qquad (14.4)

if the sequence of numbers d(x, xk) converges to 0 in the ordinary sense of convergence of sequences of real numbers. If

\lim_{m,p \to \infty} d(x_m, x_p) = 0 \qquad (14.5)

the sequence {xk} is said to be a Cauchy sequence. It can be shown that if a sequence {xk} converges, it is a Cauchy sequence. The converse is not necessarily true, for the limit may have carelessly been excluded from the space. If the converse is true, then the metric space is said to be complete.

The next vector space concept to be defined is that of length or norm of a vector. A normed vector space (or linear space) is a vector space in which a real-valued function ‖x‖ (known as the norm of x) is defined, with the properties

  1. ‖x‖ ≥ 0 with equality if and only if x = 0.

  2. ‖αx‖ = |α| ‖x‖.

  3. ‖x1 + x2‖ ≤ ‖x1‖ + ‖x2‖.

A normed vector space is automatically a metric space if the metric is defined as

d(x, y) = \| x - y \| \qquad (14.6)

which is called the natural metric for the space. A normed vector space may be viewed either as a linear space, a metric space, or both. Its elements may be interpreted as vectors or points, a case in point being ordinary three-dimensional geometry wherein we can identify points as vectors emanating from the origin to the point in question.

The structure of a normed vector space will now be refined further with the definition of the notion of angle between two vectors. In particular, it will be possible to tell whether two vectors are perpendicular. The notion of angle between two vectors will be obtained by defining the inner product (also known as a scalar or dot product). In general, an inner product in a vector space is a complex-valued function of ordered pairs x, y with the properties

  1. 〈x, y〉 = 〈y, x〉* (the asterisk denotes complex conjugate)

  2. 〈αx, y〉 = α〈x, y〉

  3. 〈x1 + x2, y〉 = 〈x1, y〉 + 〈x2, y〉

  4. 〈x, x〉 ≥ 0 with equality if and only if x = 0

From the first two properties, it follows that

\langle x, \alpha y \rangle = \alpha^* \langle x, y \rangle \qquad (14.7)

Also, Schwarz's inequality can be proved and is given by

|\langle x, y \rangle| \le \langle x, x \rangle^{1/2} \langle y, y \rangle^{1/2} \qquad (14.8)

with equality if and only if x = αy (or x = 0 or y = 0). The real, nonnegative quantity 〈x, x〉^{1/2} satisfies all the properties of a norm. Therefore, it is adopted as the definition of the norm, and Schwarz's inequality assumes the form

|\langle x, y \rangle| \le \|x\| \, \|y\| \qquad (14.9)

The natural metric in the space is given by Equation 14.6. An inner product space, which is complete in its natural metric, is called a Hilbert space.

Example 14.1

Consider the space of all complex-valued functions x(t) defined on a ≤ t ≤ b for which the integral

E_x = \int_a^b |x(t)|^2 \, dt \qquad (14.10)

exists (i.e., the space of all finite-energy signals in the interval [a,b]). The inner product may be defined as

\langle x, y \rangle = \int_a^b x(t) y^*(t) \, dt \qquad (14.11)

The natural norm is

\|x\| = \left[ \int_a^b |x(t)|^2 \, dt \right]^{1/2} \qquad (14.12)

and the metric is

d(x, y) = \|x - y\| = \left[ \int_a^b |x(t) - y(t)|^2 \, dt \right]^{1/2} \qquad (14.13)

respectively. Schwarz's inequality becomes

\left| \int_a^b x(t) y^*(t) \, dt \right| \le \left[ \int_a^b |x(t)|^2 \, dt \right]^{1/2} \left[ \int_a^b |y(t)|^2 \, dt \right]^{1/2} \qquad (14.14)

It can be shown that this space is complete and, hence, is a Hilbert space.
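Once the signals are sampled, the quantities of Example 14.1 are easily approximated by Riemann sums. The following sketch assumes two arbitrary illustrative signals on [a, b] = [0, 1] and checks Schwarz's inequality (14.14) numerically:

```python
import numpy as np

a, b, N = 0.0, 1.0, 2000
t = np.linspace(a, b, N, endpoint=False)
dt = (b - a) / N

# Two illustrative finite-energy signals on [a, b] (the choices are arbitrary).
x = np.exp(2j * np.pi * t)            # complex exponential
y = np.where(t < 0.5, 1.0, -1.0)      # square pulse

inner = np.sum(x * np.conj(y)) * dt               # <x, y>,  Equation 14.11
norm_x = np.sqrt(np.sum(np.abs(x) ** 2) * dt)     # ||x||,   Equation 14.12
norm_y = np.sqrt(np.sum(np.abs(y) ** 2) * dt)
dist = np.sqrt(np.sum(np.abs(x - y) ** 2) * dt)   # d(x, y), Equation 14.13

print(abs(inner) <= norm_x * norm_y + 1e-12)      # Schwarz, Equation 14.14 -> True
```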

An additional requirement that can be imposed on a Hilbert space is separability, which, roughly speaking, restricts the number of elements in the space. A Hilbert space H is separable if there exists a countable (i.e., can be put in one-to-one correspondence with the positive integers) set of elements (f1, f2, …, fn, …) whose finite linear combinations are such that for any element f in H and ε > 0 there exist an index N and constants α1, α2, …, αN such that

\left\| f - \sum_{k=1}^{N} \alpha_k f_k \right\| < \varepsilon \qquad (14.15)

The set (f1, f2, …, fn, …) is called a spanning set. The discussions from here on are limited to separable Hilbert spaces.

Any finite-dimensional Hilbert space En is separable. In fact, there exists a set of n vectors (f1, f2, …, fn) such that each vector x in En has the representation

x = \sum_{k=1}^{n} \alpha_k f_k \qquad (14.16)

It can be shown that the spaces consisting of square-integrable functions on the intervals [a, b], (−∞, b], [a, ∞), and (−∞, ∞) are all separable, where a and b are finite, and spanning sets exist for each of these spaces. For example, a spanning set for the space of square-integrable functions on [a, b] is the set (1, t, t², …), which is clearly countable. For such function spaces, convergence is in the sense of convergence in the mean defined as

\lim_{k \to \infty} \int |x_k(t) - x(t)|^2 \, dt = 0 \qquad (14.17)

Similarly, the ideas of independence and basis sets apply to Hilbert spaces in infinite-dimensional form.

It is necessary to distinguish between the concepts of a basis and a spanning set consisting of independent vectors. As ∊ is reduced in Equation 14.15, it is expected that N must be increased, and it may also be necessary to change the previously found coefficients α1, α2, …, αN. Hence, there might not exist a fixed sequence of constants ξ1, ξ2, …, ξn, … with the property

x = \sum_{k=1}^{\infty} \xi_k f_k \qquad (14.18)

as would be required if the set {fk} were a basis. For example, on the space of square-integrable functions on [−1, 1], the independent set f0 = 1, f1 = t, f2 = t², … is a spanning set, but not a basis, since there are many square-integrable functions on [−1, 1] that cannot be expanded in a series like Equation 14.18 (an example is |t|). Odd as it may seem at first, such an expansion does become possible if the powers of t in this spanning set are regrouped into the set of polynomials known as the Legendre polynomials, Pk(t).

Two vectors x, y are orthogonal or perpendicular if 〈x, y〉 = 0. A finite or countably infinite set of vectors {φ1, φ2, …, φk, …} is said to be an orthogonal set if 〈φi, φj〉 = 0, ij. A proper orthogonal set is an orthogonal set none of whose elements is the zero vector. A proper orthogonal set is an independent set. A set is orthonormal if

\langle \phi_i, \phi_j \rangle = \begin{cases} 0, & i \ne j \\ 1, & i = j \end{cases} \qquad (14.19)

An important concept is that of a linear manifold in a Hilbert space. A set M is said to be a linear manifold if, for x and y belonging to M, so does αx + βy for arbitrary complex numbers α and β; thus, M is itself a linear space. If a linear manifold is a closed set (i.e., every Cauchy sequence of its elements has a limit in the manifold), it is called a closed linear manifold and is itself a Hilbert space. In three-dimensional Euclidean space, linear manifolds are simply lines and planes containing the origin. In a finite-dimensional space, every linear manifold is necessarily closed.

Let M be a linear manifold, closed or not. Consider the set M⊥ of all vectors that are orthogonal to every vector in M. It is a linear manifold that can be shown to be closed. If M is closed, M and M⊥ are known as orthogonal complements. Given a closed linear manifold M, each vector in the space can be decomposed in a unique manner as a sum xp + z, where xp is in M and z is in M⊥.

Given an infinite orthonormal set {φ1, φ2, …, φk, …} in the space of all square-integrable functions on the interval [a,b], let {an} be a sequence of complex numbers. The Riesz–Fischer theorem tells us how to represent an element in the space and states:

  1. If

    \sum_{n=1}^{\infty} |a_n|^2 \qquad (14.20)

    diverges, then

    \sum_{n=1}^{\infty} a_n \phi_n \qquad (14.21)

    diverges.

  2. If Equation 14.20 converges, then Equation 14.21 also converges to some element g in the space and

    a_n = \langle g, \phi_n \rangle \qquad (14.22)

The next question that arises is how to construct an orthonormal set from an independent set {e1, e2, …, en}. A way of doing this is known as the Gram–Schmidt procedure. The construction is as follows:

  1. Pick a vector from the set {e1, e2, …, en}, say e1. Let

    \phi_1 = \frac{e_1}{\langle e_1, e_1 \rangle^{1/2}} \qquad (14.23)

  2. Remove from a second vector in the set, say e2, its projection on φ1. This yields

    g_2 = e_2 - \langle e_2, \phi_1 \rangle \phi_1 \qquad (14.24)

    The vector g2 is a linear combination of e1 and e2, and is orthogonal to φ1. To normalize it, form

    \phi_2 = \frac{g_2}{\langle g_2, g_2 \rangle^{1/2}} \qquad (14.25)

  3. Pick another vector from the set, say e3, and form

    g_3 = e_3 - \langle e_3, \phi_2 \rangle \phi_2 - \langle e_3, \phi_1 \rangle \phi_1 \qquad (14.26)

    Normalize g3 in a manner similar to that used for g2.

  4. Continue until all vectors in the set {ek} have been used.

Note that the sets {ek}, {gk}, and {φk} all generate the same linear manifold. (Also, if {ek} is not an independent set, the construction will result in 0 at one or more steps.)
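A sketch of the Gram–Schmidt steps of Equations 14.23 through 14.26, applied to sampled waveforms with inner products approximated by Riemann sums, follows; the input signals are arbitrary choices for illustration, and a dependent input is included to show that it produces a (numerically) zero residual:

```python
import numpy as np

def gram_schmidt(signals, dt):
    """Orthonormalize a list of sampled waveforms (Equations 14.23-14.26)."""
    phis = []
    for e in signals:
        g = e.astype(complex)
        for phi in phis:                       # remove projections on earlier phi's
            g = g - (np.sum(g * np.conj(phi)) * dt) * phi
        norm = np.sqrt(np.sum(np.abs(g) ** 2) * dt)
        if norm > 1e-10:                       # dependent inputs produce ~0 here
            phis.append(g / norm)
    return phis

T, N = 1.0, 1000
t = np.linspace(0.0, T, N, endpoint=False)
dt = T / N
s1 = np.ones_like(t)                           # illustrative signals
s2 = np.where(t < 0.5, 1.0, -1.0)
s3 = s1 + 0.5 * s2                             # dependent on s1 and s2

phis = gram_schmidt([s1, s2, s3], dt)
print(len(phis))                                      # -> 2 basis functions
print(abs(np.sum(phis[0] * np.conj(phis[1])) * dt))   # ~0: orthogonal
```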

A basis consisting of orthonormal vectors is known as an orthonormal basis. If the basis vectors are not normalized, it is simply an orthogonal basis.

Example 14.2

Consider the interval [−1, 1] and the independent set e0 = 1, e1 = t, …, ek = t^k, …. The Gram–Schmidt procedure applied to this set without normalization, but with the requirement that all orthogonal functions take on the value 1 at t = 1, gives the set of Legendre polynomials, which is

\psi_0(t) = 1, \quad \psi_1(t) = t, \quad \psi_2(t) = \tfrac{1}{2}(3t^2 - 1), \quad \psi_3(t) = \tfrac{1}{2}(5t^3 - 3t), \ \ldots \qquad (14.27)
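The orthogonality of the polynomials in Equation 14.27 over [−1, 1] is easily verified numerically; a minimal sketch:

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 4000, endpoint=False)
dt = t[1] - t[0]

psi = [np.ones_like(t),                # psi_0
       t,                              # psi_1
       0.5 * (3 * t**2 - 1),           # psi_2
       0.5 * (5 * t**3 - 3 * t)]       # psi_3

# Pairwise inner products on [-1, 1]; the off-diagonal entries are ~0.
G = np.array([[np.sum(pi * pj) * dt for pj in psi] for pi in psi])
print(np.round(G, 3))
# Diagonal entries are 2/(2k+1): 2, 0.667, 0.4, 0.286; each psi_k(1) = 1.
```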

It is next desired to approximate an arbitrary vector x in a Hilbert space in terms of a linear combination of the independent set {e1, e2, …, ek}, where k ≤ n if the space is an n-dimensional Euclidean space, and k is an arbitrary integer if the space is infinite dimensional. First, the orthonormal set {φk} is constructed from {ek}. The unique, best approximation to x is the Fourier sum

\sum_{i=1}^{k} \langle x, \phi_i \rangle \phi_i \qquad (14.28)

which is geometrically the projection of x onto the linear manifold generated by {φk}, or equivalently, the sum of the projections along the individual axes defined by φ1, φ2, …, φk. The square of the distance between x and its projection is

\left\| x - \sum_{i=1}^{k} \langle x, \phi_i \rangle \phi_i \right\|^2 = \|x\|^2 - \sum_{i=1}^{k} |\langle x, \phi_i \rangle|^2 \qquad (14.29)

Since the left-hand side is nonnegative, Equation 14.29 gives Bessel's inequality, which is

\|x\|^2 \ge \sum_{i=1}^{k} |\langle x, \phi_i \rangle|^2 \qquad (14.30)

A convenient feature of the Fourier sum is as follows: If another vector φk+1 is added to the orthonormal approximating set of vectors, the best approximation now becomes

\sum_{i=1}^{k+1} \langle x, \phi_i \rangle \phi_i \qquad (14.31)

Thus, an additional term is added to the series expansion without changing previously computed coefficients, which makes the extension to a countably infinite orthonormal approximating set simple to envision. In the case of an infinite orthonormal approximating set, Bessel's inequality (14.30) now has an infinite limit on the sum. Does the approximating sum (14.31) converge to x? The answer is that convergence can be guaranteed only if the set {φk} is extensive enough, that is, if it is a basis or a complete orthonormal set. In such cases, Equation 14.30 becomes an equality. In fact, a number of equivalent criteria can be stated to determine whether an orthonormal set {φk} is a basis or not (Stakgold, 1967). These are

  1. In finite n-dimensional Euclidean space, {φk} has exactly n elements for completeness

  2. For every x in the space of square-integrable functions

    x = \sum_i \langle x, \phi_i \rangle \phi_i \qquad (14.32)

  3. For every x in the space of square-integrable functions

    \|x\|^2 = \sum_i |\langle x, \phi_i \rangle|^2 \qquad (14.33)

    (known as Parseval's equality)

  4. The only x in the space of square-integrable functions for which all the Fourier coefficients vanish is the 0 function

  5. There exists no function φ(t) in the space of square-integrable functions such that {φ, φ1, φ2, …, φk, …} is an orthonormal set

Examples of complete orthonormal sets of trigonometric functions over the interval [0,T] are as follows: (1) The complex exponentials with frequencies equal to the harmonics of the fundamental frequency ω0 = 2π/T, or

\frac{1}{\sqrt{T}}, \ \frac{\exp(j\omega_0 t)}{\sqrt{T}}, \ \frac{\exp(-j\omega_0 t)}{\sqrt{T}}, \ \frac{\exp(j2\omega_0 t)}{\sqrt{T}}, \ \frac{\exp(-j2\omega_0 t)}{\sqrt{T}}, \ \ldots \qquad (14.34)

The factor T^{1/2} in the denominator is necessary to normalize the functions and is often not included in the definition of the complex exponential Fourier series. (2) The sines and cosines with frequencies equal to harmonics of the fundamental frequency ω0 = 2π/T, or

\frac{1}{\sqrt{T}}, \ \frac{\cos(\omega_0 t)}{\sqrt{T/2}}, \ \frac{\sin(\omega_0 t)}{\sqrt{T/2}}, \ \frac{\cos(2\omega_0 t)}{\sqrt{T/2}}, \ \frac{\sin(2\omega_0 t)}{\sqrt{T/2}}, \ \ldots \qquad (14.35)

Note that if any function is left out of these sets, the basis is incomplete.
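As an illustration of the Fourier sum (14.28) and Parseval's equality (14.33), the sketch below projects an arbitrary illustrative square wave onto a truncated version of the orthonormal set (14.35); the partial sum of squared coefficients approaches the signal energy as more harmonics are included, in accordance with Bessel's inequality (14.30):

```python
import numpy as np

T, N = 1.0, 4000
t = np.linspace(0.0, T, N, endpoint=False)
dt = T / N
w0 = 2 * np.pi / T

x = np.where(t < T / 2, 1.0, -1.0)     # illustrative square wave on [0, T]

# Orthonormal trigonometric set of Equation 14.35, truncated at harmonic K.
K = 25
phis = [np.ones_like(t) / np.sqrt(T)]
for k in range(1, K + 1):
    phis.append(np.cos(k * w0 * t) / np.sqrt(T / 2))
    phis.append(np.sin(k * w0 * t) / np.sqrt(T / 2))

coeffs = [np.sum(x * phi) * dt for phi in phis]        # <x, phi_i>
x_hat = sum(c * phi for c, phi in zip(coeffs, phis))   # Fourier sum, Eq. 14.28

energy = np.sum(x ** 2) * dt
print(energy, sum(c ** 2 for c in coeffs))     # Parseval: sum approaches ||x||^2
print(np.sqrt(np.sum((x - x_hat) ** 2) * dt))  # residual norm shrinks as K grows
```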

14.3 Application of Signal Space Representation to Signal Detection

 

The M-ary signal detection problem is as follows. Given M signals s1(t), s2(t), …, sM(t), defined over 0 ≤ t ≤ T, one is chosen at random each T-second interval and sent through a channel that adds white Gaussian noise of power spectral density N0/2. The challenge is to design a receiver that decides which signal was sent during each T-second interval with minimum probability of error.

An approach to this problem, as expanded upon in greater detail by Wozencraft and Jacobs (1965, Chapter 4) and Ziemer and Tranter (2009, Chapter 10), is to construct a linear manifold, called the signal space, using the Gram–Schmidt procedure on the M signals. Suppose that this results in the orthonormal basis set {φ1, φ2, …, φK}, where K ≤ M (the number of basis functions may be less than the number of signals because the signals are not necessarily linearly independent). The received signal plus noise is represented in this signal space as vectors with coordinates (note that they depend on the signal transmitted)

Z_{ij} = A_{ij} + N_j, \quad j = 1, 2, \ldots, K; \ i = 1, 2, \ldots, M \qquad (14.36)

where

A_{ij} = \int_0^T s_i(t) \phi_j(t) \, dt \qquad (14.37)

The numbers Zij are the components of the received data vector, and the space of all such vectors is called the observation space. They may be produced by a bank of matched filters or correlators, with one filter matched to each orthonormal function.

An apparent problem with this approach is that not all possible noise waveforms added to the signal can be represented as vectors in this K-dimensional observation space. The part of the noise that is represented is

\hat{n}(t) = \sum_{j=1}^{K} N_j \phi_j(t) \qquad (14.38)

where

N_j = \int_0^T n(t) \phi_j(t) \, dt \qquad (14.39)

In terms of Hilbert space terminology, Equation 14.38 is the projection of the noise waveform onto the observation space (i.e., a linear manifold). The unrepresented part of the noise is

n_\perp(t) = n(t) - \hat{n}(t) \qquad (14.40)

and is the part of the noise that must be represented in the orthogonal complement of the observation space. The question is whether the decision process will be harmed by ignoring this part of the noise. It can be shown that n⊥(t) is uncorrelated with n̂(t). Thus, since the noise is Gaussian, they are statistically independent, and n⊥(t) has no bearing on the decision process; nothing is lost by ignoring it.

The decision process can be shown to reduce to choosing the signal sl(t) that minimizes the distance to the data vector; that is,

d(\mathbf{z}, \mathbf{s}_l) = \|\mathbf{z} - \mathbf{s}_l\| = \left[ \sum_{j=1}^{K} (Z_{ij} - A_{lj})^2 \right]^{1/2} = \text{minimum}, \quad l = 1, 2, \ldots, M \qquad (14.41)

where

z(t) = \sum_{j=1}^{K} Z_{ij} \phi_j(t) \qquad (14.42)

and

s_l(t) = \sum_{j=1}^{K} A_{lj} \phi_j(t) \qquad (14.43)

Thus, the signal detection problem is reduced to a geometrical one, where the observation space is subdivided into decision regions in order to make a decision. Each decision region is constructed to ensure that each observation point included in it is closer to the chosen signal point than to any other.
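The following sketch illustrates the minimum-distance receiver of Equations 14.36 through 14.41 for an assumed, illustrative signal set: the orthonormal basis is taken as two sinusoids on [0, T], the M = 4 signals are specified directly by their coordinates A_ij, and the correlator outputs are computed from a noisy received waveform:

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 1.0, 1000
t = np.linspace(0.0, T, N, endpoint=False)
dt = T / N

# Two orthonormal basis functions on [0, T] (illustrative choice).
phi = np.array([np.sqrt(2 / T) * np.cos(2 * np.pi * t / T),
                np.sqrt(2 / T) * np.sin(2 * np.pi * t / T)])

# M = 4 signals given by their coordinates A[i, j] (QPSK-like, illustrative).
A = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
signals = A @ phi                      # s_i(t) = sum_j A_ij phi_j(t), Eq. 14.43

i_sent = 2
N0 = 0.5                               # white noise of PSD N0/2, sampled at dt
received = signals[i_sent] + rng.normal(0.0, np.sqrt(N0 / (2 * dt)), N)

# Correlator bank: Z_j = integral of received(t) * phi_j(t) dt (Eqs. 14.36-14.39).
Z = np.array([np.sum(received * p) * dt for p in phi])

# Minimum-distance decision, Equation 14.41.
decision = np.argmin(np.linalg.norm(A - Z, axis=1))
print(i_sent, decision)
```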

14.4 Application of Signal Space Representation to Parameter Estimation

 

The procedure used in applying signal space concepts to estimation is similar to that used for signal detection. Consider the observed waveform consisting of additive signal and noise of the form

y(t) = s(t, A) + n(t), \quad 0 \le t \le T \qquad (14.44)

where A is a parameter to be estimated and the noise is white as before. Let {φ1, φ2, …, φk, …} be a complete orthonormal basis set. The observed waveform can be represented as

y(t) = \sum_{j=1}^{\infty} S_j(A) \phi_j(t) + \sum_{j=1}^{\infty} N_j \phi_j(t) \qquad (14.45)

where

S_j(A) = \langle s, \phi_j \rangle = \int_0^T s(t, A) \phi_j^*(t) \, dt \qquad (14.46)

and Nj is defined by Equation 14.39. Hence, an estimate can be made on the basis of the set of coefficients

Z_j = S_j(A) + N_j \qquad (14.47)

or on the basis of a vector in the signal space with these coordinates. A reasonable criterion for estimating A is to maximize the likelihood ratio, or a monotonic function thereof. Its logarithm, for n(t) Gaussian, can be shown to reduce to

l(A) = \lim_{K \to \infty} L_K(A) = \lim_{K \to \infty} \left[ \frac{2}{N_0} \sum_{k=1}^{K} Z_k S_k(A) - \frac{1}{N_0} \sum_{k=1}^{K} S_k^2(A) \right] \qquad (14.48)

In the limit as K → ∞, this becomes

l(A) = \frac{2}{N_0} \int_0^T z(t) s(t, A) \, dt - \frac{1}{N_0} \int_0^T s^2(t, A) \, dt \qquad (14.49)

A necessary condition for the value of A that maximizes Equation 14.49 is

\frac{\partial l(A)}{\partial A} = \frac{2}{N_0} \int_0^T [z(t) - s(t, A)] \frac{\partial s(t, A)}{\partial A} \, dt \, \bigg|_{A = \hat{A}} = 0 \qquad (14.50)

The value of A that maximizes Equation 14.49, denoted Â, is called the maximum likelihood estimate.
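As a concrete special case (the signal model and all numerical values below are assumptions made for illustration), let s(t, A) = A s0(t) for a known pulse s0(t). The log-likelihood of Equation 14.49 is then quadratic in A, and Equation 14.50 yields the closed form Â = ∫ z(t) s0(t) dt / ∫ s0²(t) dt; the sketch below compares this with a grid search over A:

```python
import numpy as np

rng = np.random.default_rng(7)
T, N = 1.0, 2000
t = np.linspace(0.0, T, N, endpoint=False)
dt = T / N
N0 = 0.05

s0 = np.sin(2 * np.pi * t / T)                 # known pulse shape (illustrative)
A_true = 1.3
z = A_true * s0 + rng.normal(0.0, np.sqrt(N0 / (2 * dt)), N)  # observed waveform

def log_like(A):
    """Equation 14.49 with s(t, A) = A * s0(t)."""
    s = A * s0
    return (2 / N0) * np.sum(z * s) * dt - (1 / N0) * np.sum(s ** 2) * dt

grid = np.linspace(0.0, 3.0, 601)
A_ml_grid = grid[np.argmax([log_like(A) for A in grid])]

# Closed form from setting Equation 14.50 to zero for this linear-in-A model.
A_ml = np.sum(z * s0) * dt / (np.sum(s0 ** 2) * dt)
print(A_true, A_ml_grid, A_ml)
```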

14.5 Miscellaneous Applications

 

14.5.1 Wavelet Transforms

Wavelet transforms can be continuous time or discrete time. They find applications in speech and image compression, signal and image classification, and pattern recognition.

The continuous-time wavelet transform of a signal x(t) takes the form (Rioul and Vetterli, 1991)

W_x(\tau, a) = \int_{-\infty}^{\infty} x(t) h_{a,\tau}^*(t) \, dt \qquad (14.51)

where

h_{a,\tau}(t) = \frac{1}{\sqrt{a}} \, h\!\left( \frac{t - \tau}{a} \right) \qquad (14.52)

are basis functions called wavelets. Thus, the wavelets defined in Equation 14.52 are scaled and translated versions of the basic wavelet prototype h(t) (also known as the mother wavelet), and the wavelet transform is seen to be a convolution of the conjugate of a wavelet with the signal x(t). Substituting Equation 14.52 into Equation 14.51 yields

W_x(\tau, a) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t) \, h^*\!\left( \frac{t - \tau}{a} \right) dt \qquad (14.53)

Note that h(t/a) is contracted if a < 1 and expanded if a > 1. Thus, an interpretation of Equation 14.53 is that as a increases, the function h(t/a) becomes spread out over time and takes the long-term behavior of x(t) into account; as a decreases the short-time behavior of x(t) is taken into account. A change of variables in Equation 14.53 gives

W_x(\tau, a) = \sqrt{a} \int_{-\infty}^{\infty} x(at) \, h^*\!\left( t - \frac{\tau}{a} \right) dt \qquad (14.54)

Now the interpretation of Equation 14.54 is that as the scale increases (a > 1), an increasingly contracted version of the signal is seen through a constant-length sifting function h(t). This is only the barest of introductions to wavelets, and the reader is urged to consult the references to learn more about wavelets and their applications, particularly their discrete-time implementation.
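A direct numerical evaluation of Equation 14.53 is sketched below; the Mexican-hat mother wavelet h(t) = (1 − t²)exp(−t²/2) and the test signal are assumptions made for illustration only:

```python
import numpy as np

def mother(t):
    """Mexican-hat wavelet (an illustrative choice of h(t), not mandated by the text)."""
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt(x, t, scales, shifts):
    """Direct evaluation of Equation 14.53 on sampled data (h is real, so h* = h)."""
    dt = t[1] - t[0]
    W = np.zeros((len(scales), len(shifts)))
    for i, a in enumerate(scales):
        for j, tau in enumerate(shifts):
            W[i, j] = np.sum(x * mother((t - tau) / a)) * dt / np.sqrt(a)
    return W

t = np.linspace(0.0, 10.0, 2000)
x = np.sin(2 * np.pi * 1.0 * t) + np.sin(2 * np.pi * 4.0 * t) * (t > 5)  # test signal
scales = np.linspace(0.1, 2.0, 40)
shifts = np.linspace(0.0, 10.0, 100)
W = cwt(x, t, scales, shifts)
print(W.shape)   # (40, 100) map; |W_x(tau, a)| shows where each component lives
```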

14.5.2 Mean-Square Estimation: The Orthogonality Principle

Given n random variables X1, X2, …, Xn, it is desired to find n constants a1, a2, …, an such that when another random variable S is estimated by the sum

\hat{S} = \sum_{i=1}^{n} a_i X_i \qquad (14.55)

then the mean-square error (MSE)

\mathrm{MSE} = E\left\{ \left| S - \sum_{i=1}^{n} a_i X_i \right|^2 \right\} \qquad (14.56)

is a minimum, where E{ } denotes expectation or statistical average. It is shown in (Papoulis and Pillai, 2001) that the MSE is minimized when the error is orthogonal to the data, or when

E\left\{ \left[ S - \sum_{i=1}^{n} a_i X_i \right] X_j^* \right\} = 0, \quad j = 1, 2, \ldots, n \qquad (14.57)

This is known as the orthogonality principle or projection theorem and can be interpreted as stating that the MSE is minimized when the error vector is orthogonal to the subspace (linear manifold) spanned by the vectors X1, X2, …, Xn. The projection theorem has many applications including filtering of noisy signals, known as Wiener filtering.
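A sketch of the orthogonality principle in action follows; the linear data model and the use of sample averages in place of expectations are assumptions made for illustration. The coefficients are obtained by solving the normal equations E{X Xᵀ} a = E{S X} implied by Equation 14.57, and the resulting error is checked to be (nearly) orthogonal to each Xj:

```python
import numpy as np

rng = np.random.default_rng(0)
n, num_samples = 3, 100_000

# Illustrative model: S is a linear function of the data plus independent noise.
X = rng.normal(size=(num_samples, n))
S = X @ np.array([0.5, -1.0, 2.0]) + 0.3 * rng.normal(size=num_samples)

# Normal equations implied by Equation 14.57: E{X X^T} a = E{S X}.
R = (X.T @ X) / num_samples              # sample correlation matrix of the data
r = (X.T @ S) / num_samples              # sample cross-correlation with S
a = np.linalg.solve(R, r)

err = S - X @ a
print(a)                                  # close to [0.5, -1.0, 2.0]
print((X.T @ err) / num_samples)          # ~0: error orthogonal to each X_j
```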

14.5.3 Volterra Adaptive Lattice Filtering

Volterra filters are nonlinear filters which, in discrete time, can be characterized by input–output relationships of the form (Mathews, 1991)

y[n] = \sum_{m_1=0}^{N-1} h_1[m_1] \, x[n - m_1] + \sum_{m_1=0}^{N-1} \sum_{m_2=0}^{N-1} h_2[m_1, m_2] \, x[n - m_1] \, x[n - m_2] + \cdots + \sum_{m_1=0}^{N-1} \cdots \sum_{m_p=0}^{N-1} h_p[m_1, m_2, \ldots, m_p] \, x[n - m_1] \cdots x[n - m_p] \qquad (14.58)

where x[n] is the input and y[n] is the output of the system at discrete time instant n. An important area for their application is inverse filtering, or equalization, of nonlinear communications channels (Benedetto and Biglieri, 1983), wherein some type of feedback adjustment algorithm is used to adapt the Volterra kernels hp[m1, m2, …, mp] (Haykin, 1996).

Clearly, Equation 14.58 rapidly becomes complex as the number of terms included is increased. Some simplification is obtained by assuming that hp[m1, m2, …, mp] is symmetric in its indices. Further simplification may be obtained if the channel being equalized has odd symmetry, which is often the case, since the even-order kernels then vanish. As an example, assume both of these conditions hold, let the series (14.58) be truncated at the third-order sum, and let N = 3. Then, the Volterra expansion for this case becomes

y[n] = h_1[0] x[n] + h_1[1] x[n-1] + h_1[2] x[n-2] + h_3[0,0,0] x^3[n] + h_3[0,0,1] x^2[n] x[n-1] + h_3[0,0,2] x^2[n] x[n-2] + h_3[0,1,1] x[n] x^2[n-1] + h_3[0,1,2] x[n] x[n-1] x[n-2] + h_3[0,2,2] x[n] x^2[n-2] + h_3[1,1,1] x^3[n-1] + h_3[1,1,2] x^2[n-1] x[n-2] + h_3[1,2,2] x[n-1] x^2[n-2] + h_3[2,2,2] x^3[n-2] \qquad (14.59)
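A direct evaluation of the truncated expansion (14.59) is sketched below; the kernel values are made-up numbers for illustration (in an equalizer they would be adjusted adaptively):

```python
import numpy as np

# Illustrative kernel values for the truncated expansion of Equation 14.59
# (the numbers are arbitrary; in practice they would be adapted).
h1 = {0: 1.0, 1: -0.4, 2: 0.1}
h3 = {(0, 0, 0): 0.05, (0, 0, 1): -0.02, (0, 0, 2): 0.01,
      (0, 1, 1): 0.02,  (0, 1, 2): -0.01, (0, 2, 2): 0.005,
      (1, 1, 1): -0.03, (1, 1, 2): 0.01,  (1, 2, 2): -0.005,
      (2, 2, 2): 0.002}

def volterra3(x):
    """Direct evaluation of Equation 14.59 for each output sample."""
    y = np.zeros_like(x)
    for n in range(2, len(x)):
        d = [x[n], x[n - 1], x[n - 2]]                # current and delayed inputs
        y[n] = sum(c * d[m] for m, c in h1.items())   # linear terms
        y[n] += sum(c * d[i] * d[j] * d[k]            # third-order terms
                    for (i, j, k), c in h3.items())
    return y

x = np.sin(2 * np.pi * 0.05 * np.arange(50))
print(volterra3(x)[:5])
```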

The terms can be grouped in the following manner:

\left[ \begin{array}{ccc}
x[n] & x[n-1] & x[n-2] \\
x^2[n] & x^2[n-1] & x^2[n-2] \\
x^3[n] & x^3[n-1] & x^3[n-2] \\
x^2[n]\,x[n-1] & x^2[n]\,x[n-2] & \\
x[n]\,x^2[n-1] & x[n]\,x^2[n-2] & \\
\text{Col. 0} & \text{Col. 1} & \text{Col. 2}
\end{array} \right] \qquad (14.60)

The basic idea for forming the lattice Volterra filter is to obtain a Gram–Schmidt orthogonal decomposition of the three columns, denoted x_0^b[n], x_1^b[n], x_2^b[n]. Let b_0[n], b_1[n], b_2[n] represent the corresponding orthogonal basis set. Then, any linear combination of x_0^b[n], x_1^b[n], x_2^b[n] can be equivalently written as another linear combination of b_0[n], b_1[n], b_2[n]. Let d[n] be the desired signal, and let its estimate be

\hat{d}[n] = (\mathbf{k}_0^d)^T \mathbf{b}_0[n] + (\mathbf{k}_1^d)^T \mathbf{b}_1[n] + (\mathbf{k}_2^d)^T \mathbf{b}_2[n] \qquad (14.61)

A big advantage of the lattice structure is that, because of the orthogonality of b_0[n], b_1[n], b_2[n], each coefficient vector k_i^d can be computed solely from the joint statistics of d[n] and b_i[n].

References

Arthurs, E. and Dym, H. 1962. On the optimum detection of digital signals in the presence of white Gaussian noise—A geometric approach and a study of three basic data transmission systems. IRE Trans. Commun. Syst., CS-10(Dec.): 336–372.

Benedetto, S. and Biglieri, E. 1983. Nonlinear equalization of digital satellite channels. IEEE J. Sel. Areas in Commun., SAC-1(Jan.): 57–62.

Haykin, S. 1996. Adaptive Filter Theory, 3rd ed. Englewood Cliffs, NJ: Prentice Hall.

Kotel'nikov, V.A. 1968. The Theory of Optimum Noise Immunity (trans. R.A. Silverman). New York: Dover.

Mathews, V. 1991. Adaptive polynomial filters. IEEE Signal Proc. Mag., 8(Jul.): 10–26.

Papoulis, A. and Pillai, S.U. 2001. Probability, Random Variables, and Stochastic Processes, 3rd ed. New York: McGraw–Hill.

Rioul, O. and Vetterli, M. 1991. Wavelets in signal processing. IEEE Signal Proc. Mag., 8(Oct.):14–38.

Stakgold, I. 1967. Boundary Value Problems of Mathematical Physics, Vol. 1, London: Macmillan, Collier–Macmillan Ltd.

Wozencraft, J.M. and Jacobs, I.M. 1965. Principles of Communication Engineering, New York: Wiley. (Available from Prospect Heights, IL: Waveland Press.)

Ziemer, R.E., Tranter, W.H., and Fannin, D.R. 1998. Signals and Systems: Continuous and Discrete, 4th ed. New York: Macmillan.

Ziemer, R.E. and Tranter, W.H. 2009. Principles of Communications: Systems, Modulation, and Noise, 6th ed. Boston, MA: Houghton Mifflin.

Further Reading

Very readable expositions of signal space concepts as applied to signal detection and estimation are found in Chapter 4 of the classic book by Wozencraft and Jacobs. The paper by Arthurs and Dym listed in the references is also worth reading. A treatment of signal space concepts and applications to signal detection and parameter estimation is given in Chapter 10 of the book by Ziemer and Tranter. For the mathematical theory behind signal space concepts, the book by Stakgold is recommended. For those interested in wavelet transforms and their relationship to signal spaces, the April 1992 issue of the IEEE Signal Processing Magazine provides a tutorial article on wavelet transforms. A tutorial treatment on adaptive Volterra filters is provided in the paper by Mathews.
