11.6.2    UMP Detection with Both Composite Hypotheses

We now consider the more general case where both hypotheses are composite. The UMP optimization problem can be stated as:

$$\text{Maximize } P_D(\tilde\delta;\theta) \text{ for all } \theta \in \Lambda_1, \quad \text{subject to } \sup_{\theta\in\Lambda_0} P_F(\tilde\delta;\theta) \le \alpha.$$

(11.21)

If a UMP test $\tilde\delta_{\text{UMP}}$ exists, then it must satisfy the following conditions. First,

$$\sup_{\theta_0\in\Lambda_0} P_F(\tilde\delta_{\text{UMP}};\theta_0) \le \alpha.$$

(11.22)

Second, for any $\tilde\delta \in \tilde\Delta$ that satisfies $\sup_{\theta_0\in\Lambda_0} P_F(\tilde\delta;\theta_0) \le \alpha$, we must have

$$P_D(\tilde\delta;\theta_1) \le P_D(\tilde\delta_{\text{UMP}};\theta_1) \quad \text{for all } \theta_1 \in \Lambda_1.$$

(11.23)

The following example illustrates a case where a UMP solution can be found. Also see Exercises 11.9.12 and 11.9.13.

Example 11.6.4. Testing Between Two One-Sided Composite Signals in Gaussian Noise. This is an extension of Example 11.6.1 in which the observation is $Y = \theta + Z$, with $Z \sim \mathcal{N}(0, \sigma^2)$, and we are testing

$$H_0: \theta \in \Lambda_0 = [0,1] \quad\text{versus}\quad H_1: \theta \in \Lambda_1 = (1,\infty).$$

For fixed $\theta_0 \in \Lambda_0$ and $\theta_1 \in \Lambda_1$, $L_{\theta_0,\theta_1}(y)$ has no point masses under $P_{\theta_0}$ or $P_{\theta_1}$, and therefore $\tilde\delta_{\text{NP}}(y;\theta_0,\theta_1)$ is a deterministic LRT:

$$\delta_{\text{NP}}(y;\theta_0,\theta_1) = \begin{cases} 1 & \text{if } L_{\theta_0,\theta_1}(y) \ge \eta(\theta_0,\theta_1) \\ 0 & \text{if } L_{\theta_0,\theta_1}(y) < \eta(\theta_0,\theta_1) \end{cases} \;=\; \begin{cases} 1 & \text{if } y \ge \eta'(\theta_0,\theta_1) \\ 0 & \text{if } y < \eta'(\theta_0,\theta_1) \end{cases}$$

where η′(θ0, θ1) is given by

$$\eta'(\theta_0,\theta_1) = \frac{\sigma^2 \log \eta(\theta_0,\theta_1)}{\theta_1 - \theta_0} + \frac{\theta_0 + \theta_1}{2}.$$

Now in order to set the threshold η′ to meet the constraint on PF given in (11.21), we first compute:

$$P_F(\delta_{\eta'};\theta_0) = P_{\theta_0}\{Y \ge \eta'\} = Q\!\left(\frac{\eta' - \theta_0}{\sigma}\right)$$

and note that this probability is an increasing function of $\theta_0$. Therefore

$$\sup_{\theta_0\in[0,1]} P_F(\delta_{\eta'};\theta_0) = Q\!\left(\frac{\eta' - 1}{\sigma}\right)$$

and we can meet the PF constraint with equality by setting η′ such that:

$$Q\!\left(\frac{\eta' - 1}{\sigma}\right) = \alpha \;\Longleftrightarrow\; \eta'_\alpha = \sigma Q^{-1}(\alpha) + 1.$$

Note that $\eta'_\alpha$ is independent of $\theta_0$ and $\theta_1$. Define the test

$$\delta_{\eta'_\alpha}(y) = \begin{cases} 1 & \text{if } y \ge \eta'_\alpha \\ 0 & \text{if } y < \eta'_\alpha. \end{cases}$$

We will now establish that $\delta_{\eta'_\alpha}$ is a UMP test, by showing that conditions (11.22) and (11.23) hold. By construction,

$$\sup_{\theta_0\in[0,1]} P_F(\delta_{\eta'_\alpha};\theta_0) = P_F(\delta_{\eta'_\alpha};1) = \alpha$$

and so (11.22) holds. Also, $\delta_{\eta'_\alpha}$ is an $\alpha$-level N-P test between the simple hypotheses $H_0: \theta = 1$ and $H_1: \theta = \theta_1$, and being independent of $\theta_1$, it is an $\alpha$-level N-P test between these hypotheses for all $\theta_1 \in (1,\infty)$. Now, consider any test $\tilde\delta \in \tilde\Delta$ that satisfies $\sup_{\theta\in[0,1]} P_F(\tilde\delta;\theta) \le \alpha$. Then clearly it is also true that $P_F(\tilde\delta;1) \le \alpha$. This means that $\tilde\delta$ is an $\alpha$-level test for testing the simple hypotheses $H_0: \theta = 1$ versus $H_1: \theta = \theta_1$, and it cannot be more powerful than $\delta_{\eta'_\alpha}$, i.e.,

$$P_D(\tilde\delta;\theta_1) \le P_D(\delta_{\eta'_\alpha};\theta_1) \quad \text{for all } \theta_1 \in (1,\infty).$$

Therefore (11.23) holds and we have:

$$\delta_{\text{UMP}}(y) = \delta_{\eta'_\alpha}(y) = \begin{cases} 1 & \text{if } y \ge \sigma Q^{-1}(\alpha) + 1 \\ 0 & \text{if } y < \sigma Q^{-1}(\alpha) + 1. \end{cases}$$

Again, while the test $\delta_{\text{UMP}}$ is independent of $\theta_1$, the performance of the test in terms of $P_D$ depends on $\theta_1$. In particular,

$$P_D(\delta_{\text{UMP}};\theta_1) = P_{\theta_1}\{Y \ge \sigma Q^{-1}(\alpha) + 1\} = Q\!\left(Q^{-1}(\alpha) - \frac{\theta_1 - 1}{\sigma}\right).$$
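The closed-form threshold and power above are easy to verify numerically. Below is a minimal Python sketch (the values $\sigma = 1$ and $\alpha = 0.05$ are illustrative choices, not part of the example) that sets $\eta'_\alpha = \sigma Q^{-1}(\alpha) + 1$ and evaluates $P_F$ over $\Lambda_0$ and $P_D$ over $\Lambda_1$, using scipy's `norm.sf` for $Q$ and `norm.isf` for $Q^{-1}$.

```python
# A minimal numerical check of Example 11.6.4. The values sigma = 1 and
# alpha = 0.05 are illustrative choices, not part of the example.
import numpy as np
from scipy.stats import norm

sigma, alpha = 1.0, 0.05
eta = sigma * norm.isf(alpha) + 1.0        # eta'_alpha = sigma * Q^{-1}(alpha) + 1

# False-alarm probability over Lambda_0 = [0, 1]; the worst case is theta0 = 1.
for theta0 in [0.0, 0.5, 1.0]:
    pf = norm.sf((eta - theta0) / sigma)   # P_F = Q((eta' - theta0) / sigma)
    print(f"theta0 = {theta0:.1f}: PF = {pf:.4f}")

# Power of the UMP test for several theta1 in Lambda_1 = (1, infinity).
for theta1 in [1.5, 2.0, 3.0]:
    pd = norm.sf(norm.isf(alpha) - (theta1 - 1.0) / sigma)
    print(f"theta1 = {theta1:.1f}: PD = {pd:.4f}")
```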

11.6.3    Generalized Likelihood Ratio (GLR) Detection

While it is always desirable to have a UMP solution to the composite hypothesis testing problem, such solutions rarely exist in practice, especially in situations where both hypotheses are composite. One approach to generating a good test when UMP solutions do not exist is through the use of a “GLR” defined by

$$T_{\text{GLR}}(y) = \frac{\sup_{\theta_1\in\Lambda_1} p_{\theta_1}(y)}{\sup_{\theta_0\in\Lambda_0} p_{\theta_0}(y)}.$$

It is important to note that the maximizations over θ0 and θ1 have to be performed for each realization of the observation y, and so this test statistic is considerably more complex than the LRT. Also, the result of the maximization may not produce a PDF (or PMF) in the numerator or denominator. We can use the statistic $T_{\text{GLR}}(y)$ to produce a test, which is called the “generalized likelihood ratio test (GLRT)”:

$$\tilde\delta_{\text{GLRT}}(y) = \begin{cases} 1 & \text{if } T_{\text{GLR}}(y) > \eta \\ 1 \text{ w.p. } \gamma & \text{if } T_{\text{GLR}}(y) = \eta \\ 0 & \text{if } T_{\text{GLR}}(y) < \eta. \end{cases}$$

The use of the GLRT can be justified via an asymptotic analysis with a sequence of independent and identically distributed (i.i.d.) observations under each hypothesis, where it can be shown to have certain optimality properties. The maximization in the numerator and denominator of $T_{\text{GLR}}(y)$ can also be justified from the viewpoint of maximum likelihood parameter estimation [2].

Example 11.6.5. Detection of One-Sided Composite Signal in Cauchy Noise (continued). This problem was introduced in Example 11.6.3. The conditional PDF is given by

$$p_\theta(y) = \frac{1}{\pi[1 + (y-\theta)^2]}$$

and we are testing H0 : θ = 0 against the one-sided composite hypothesis H1 : θ > 0. As we saw in Example 11.6.3, there is no UMP solution to this problem. The GLR statistic is given by

$$T_{\text{GLR}}(y) = \frac{\sup_{\theta>0} p_\theta(y)}{p_0(y)}$$

with

$$\sup_{\theta>0} p_\theta(y) = \sup_{\theta>0} \frac{1}{\pi[1 + (y-\theta)^2]} = \begin{cases} \dfrac{1}{\pi} & \text{if } y \ge 0 \\[4pt] \dfrac{1}{\pi(1+y^2)} & \text{if } y < 0. \end{cases}$$

Thus

$$T_{\text{GLR}}(y) = \begin{cases} 1 + y^2 & \text{if } y \ge 0 \\ 1 & \text{if } y < 0. \end{cases}$$

To find an α-level test we need to evaluate P0{TGLR(Y) ≥ η}. Clearly

$$P_0\{T_{\text{GLR}}(Y) \ge \eta\} = 1 \quad \text{for } 0 \le \eta \le 1.$$

For η > 1,

$$P_0\{T_{\text{GLR}}(Y) \ge \eta\} = \int_{\sqrt{\eta-1}}^{\infty} \frac{1}{\pi}\,\frac{1}{1+y^2}\, dy = 0.5 - \frac{\tan^{-1}(\sqrt{\eta-1})}{\pi}.$$

There is a point of discontinuity in $P_0\{T_{\text{GLR}}(Y) \ge \eta\}$ at $\eta = 1$, as the value drops from 1 on the left to 0.5 on the right. For α ∈ (0.5, 1], we would need to randomize to meet the $P_F$ constraint with equality. For α ∈ (0, 0.5], which would be more relevant in practice, the GLRT is a deterministic test:

$$\delta_{\text{GLRT}}(y) = \begin{cases} 1 & \text{if } T_{\text{GLR}}(y) \ge \eta_\alpha \\ 0 & \text{if } T_{\text{GLR}}(y) < \eta_\alpha \end{cases}$$

where

$$\eta_\alpha = [\tan(\pi(0.5-\alpha))]^2 + 1.$$
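Since the GLRT threshold here is in closed form, a quick Monte Carlo check is possible. The sketch below (with the illustrative choice $\alpha = 0.1$) draws standard Cauchy samples under $H_0$, computes $T_{\text{GLR}}$, and confirms that the empirical false-alarm rate matches $\alpha$.

```python
# A minimal Monte Carlo check of the Cauchy GLRT. The value alpha = 0.1
# is an illustrative choice.
import numpy as np

alpha = 0.1
eta_alpha = np.tan(np.pi * (0.5 - alpha))**2 + 1.0

rng = np.random.default_rng(0)
y = rng.standard_cauchy(1_000_000)      # samples under H0 (theta = 0)
t = np.where(y >= 0, 1.0 + y**2, 1.0)   # T_GLR(y)
print("target alpha:  ", alpha)
print("empirical P_F: ", np.mean(t >= eta_alpha))
```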

11.6.4    Locally Most Powerful (LMP) Detection

Another approach to finding good detectors in cases where UMP tests do not exist is via a local optimization approach, which works when only one of the hypotheses is composite. Consider the scenario where Y ~ Pθ, we are interested in testing H0 : θ = θ0 versus H1 : θ > θ0, and there is no UMP solution. Also, suppose that θ takes values close to θ0 under H1; this might occur in practice in the detection of weak signals with unknown amplitude in noise.

Fix $\theta > \theta_0$ and let $\tilde\delta_\theta$ be an $\alpha$-level N-P test between $\theta$ and $\theta_0$, so that $P_D(\tilde\delta_\theta;\theta_0) = P_F(\tilde\delta_\theta;\theta_0) = \alpha$. Then, assuming that $P_D(\tilde\delta_\theta;\theta)$ is differentiable with respect to $\theta$, we can write the Taylor series approximation:

$$P_D(\tilde\delta_\theta;\theta) \approx P_D(\tilde\delta_\theta;\theta_0) + (\theta-\theta_0)\,\frac{\partial}{\partial\theta}P_D(\tilde\delta_\theta;\theta)\Big|_{\theta=\theta_0} = \alpha + (\theta-\theta_0)\,\frac{\partial}{\partial\theta}P_D(\tilde\delta_\theta;\theta)\Big|_{\theta=\theta_0}.$$

The locally optimal criterion can be described as:

$$\text{Maximize } \frac{\partial}{\partial\theta}P_D(\tilde\delta;\theta)\Big|_{\theta=\theta_0} \quad \text{subject to} \quad P_F(\tilde\delta;\theta_0) \le \alpha$$

(11.24)

the idea being that maximizing PD should be approximately the same as maximizing the slope of PD at θ = θ0 for values of θ close to θ0. Now

$$P_D(\tilde\delta;\theta) = \int_\Gamma \mathbb{I}\{\tilde\delta(y) = 1\}\, p_\theta(y)\, \mu(dy).$$

Assuming that $p_\theta(y)$ is differentiable in $\theta$,

$$\frac{\partial}{\partial\theta}P_D(\tilde\delta;\theta)\Big|_{\theta=\theta_0} = \int_\Gamma \mathbb{I}\{\tilde\delta(y) = 1\}\, \frac{\partial}{\partial\theta}p_\theta(y)\Big|_{\theta=\theta_0}\, \mu(dy).$$

Therefore, the solution to the locally optimal detection problem of (11.24) can be seen as being equivalent to N-P testing between $p_{\theta_0}(y)$ and

$$\frac{\partial}{\partial\theta}p_\theta(y)\Big|_{\theta=\theta_0}.$$

Even though the latter quantity is not necessarily a PDF (or PMF), the steps that we followed in deriving the N-P solution in Section 11.4 can be repeated to show that the solution to (11.24) has the form:

$$\tilde\delta_{\text{LMP}}(y) = \begin{cases} 1 & \text{if } T_{\text{lo}}(y) > \eta \\ 1 \text{ w.p. } \gamma & \text{if } T_{\text{lo}}(y) = \eta \\ 0 & \text{if } T_{\text{lo}}(y) < \eta \end{cases}$$

where

$$T_{\text{lo}}(y) = \frac{\frac{\partial}{\partial\theta}p_\theta(y)\big|_{\theta=\theta_0}}{p_{\theta_0}(y)}.$$

Example 11.6.6. Detection of One-Sided Composite Signal in Cauchy Noise (continued). This problem was introduced in Example 11.6.3, and we saw that there was no UMP solution. We studied the GLRT in Example 11.6.5, and now we examine the LMP solution.

$$p_\theta(y) = \frac{1}{\pi[1 + (y-\theta)^2]}, \qquad \frac{\partial}{\partial\theta}p_\theta(y)\Big|_{\theta=0} = \frac{2y}{\pi(1+y^2)^2}.$$

Thus

$$T_{\text{lo}}(y) = \frac{2y}{1+y^2}$$

and

$$\tilde\delta_{\text{LMP}}(y) = \begin{cases} 1 & \text{if } T_{\text{lo}}(y) \ge \eta \\ 0 & \text{if } T_{\text{lo}}(y) < \eta. \end{cases}$$

Randomization is not needed since $T_{\text{lo}}(Y)$ does not have point masses under $P_0$.
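When the distribution of $T_{\text{lo}}(Y)$ under $P_0$ is awkward to invert analytically, the $\alpha$-level threshold can be approximated by Monte Carlo. The following sketch ($\alpha = 0.1$ and $\theta = 0.2$ are illustrative choices) sets $\eta$ from the empirical $(1-\alpha)$-quantile of $T_{\text{lo}}(Y)$ under $P_0$ and estimates the power at a small $\theta$, the regime the LMP test is designed for.

```python
# A minimal Monte Carlo sketch for setting the LMP threshold. The values
# alpha = 0.1 and theta = 0.2 are illustrative choices.
import numpy as np

alpha = 0.1
rng = np.random.default_rng(1)

y0 = rng.standard_cauchy(1_000_000)     # samples under P_0
t0 = 2.0 * y0 / (1.0 + y0**2)           # T_lo(Y) under H0
eta = np.quantile(t0, 1.0 - alpha)      # empirical alpha-level threshold
print("eta:", eta, " empirical P_F:", np.mean(t0 >= eta))

# Power at a small theta, the regime the LMP test is designed for.
theta = 0.2
y1 = theta + rng.standard_cauchy(1_000_000)
t1 = 2.0 * y1 / (1.0 + y1**2)
print("empirical P_D at theta = 0.2:", np.mean(t1 >= eta))
```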

11.7    Binary Detection with Vector Observations

In the detection problems we have studied so far, we did not make any explicit assumptions about the observation space, although the examples were restricted to scalar observations. The theory that we have developed applies equally to scalar and vector observations. Nevertheless, it is useful to study the case of vector observations in more detail as such a study reveals aspects of detector structures that are useful in applications.

Consider the detection problem:

$$H_0: Y \sim p_0(y) \quad\text{versus}\quad H_1: Y \sim p_1(y)$$

where $Y = [Y_1\ Y_2\ \cdots\ Y_n]$ and $y = [y_1\ y_2\ \cdots\ y_n]$. The optimum detector for this problem, no matter which criterion (Bayes, Neyman-Pearson, minimax) we choose, is of the form

$$\tilde\delta_{\text{OPT}}(y) = \begin{cases} 1 & \text{if } \log L(y) > \eta \\ 1 \text{ w.p. } \gamma & \text{if } \log L(y) = \eta \\ 0 & \text{if } \log L(y) < \eta \end{cases}$$

(11.25)

where $L(y) = p_1(y)/p_0(y)$ is the likelihood ratio, and taking the log of $L(y)$ does not affect the structure of the test since log is a strictly increasing function. The threshold $\eta$ and randomization parameter $\gamma$ are chosen based on the criterion used for detection. Of course, in the Bayesian setting, $\eta = \log\tau$, with $\tau$ given in (11.11), and $\gamma = 0$.

11.7.1    Conditionally Independent Observations

Consider the special case where the observations are (conditionally) independent under each hypothesis. In this case

$$p_j(y) = \prod_{k=1}^n p_{j,k}(y_k)$$

and the log likelihood ratio in (11.25) can be written as

$$\log L(y) = \sum_{k=1}^n \log L_k(y_k)$$

where $L_k(y_k) = p_{1,k}(y_k)/p_{0,k}(y_k)$.

Example 11.7.1. Deterministic signals in i.i.d. noise. Here, the hypotheses are given by:

$$H_0: Y = s_0 + Z \quad\text{versus}\quad H_1: Y = s_1 + Z$$

where s0 and s1 are deterministic vectors (signals) and Z1, Z2, …, Zn are i.i.d. random variables with zero mean and density given by pZ. Hence, the log likelihood ratio in (11.25) can be written as:

$$\log L(y) = \sum_{k=1}^n \log \frac{p_Z(y_k - s_{1,k})}{p_Z(y_k - s_{0,k})}.$$

A special case of this example is one where Z is a vector of i.i.d. $\mathcal{N}(0, \sigma^2)$ random variables, in which case (based on the more general result derived in the following section) we can show that the optimum detector structure is of the form:

$$\delta_{\text{OPT}}(y) = \begin{cases} 1 & \text{if } (s_1 - s_0)^\top y \ge \eta \\ 0 & \text{if } (s_1 - s_0)^\top y < \eta. \end{cases}$$
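The log likelihood ratio for deterministic signals in i.i.d. noise is a sum of per-sample terms, and in the Gaussian case it reduces to an affine function of the correlation statistic $(s_1 - s_0)^\top y$. The following sketch (the signals, noise level, and dimension are illustrative choices) computes both quantities for one realization; since each is an affine function of the other, thresholding either gives the same test.

```python
# A minimal sketch of the log likelihood ratio for deterministic signals in
# i.i.d. noise. The signals, noise level, and dimension are illustrative.
import numpy as np
from scipy.stats import norm

def log_lr(y, s0, s1, log_pz):
    # log L(y) = sum_k [ log pZ(y_k - s1_k) - log pZ(y_k - s0_k) ]
    return np.sum(log_pz(y - s1) - log_pz(y - s0))

sigma = 1.0
log_pz = lambda z: norm.logpdf(z, scale=sigma)  # Gaussian noise density

rng = np.random.default_rng(2)
n = 8
s0, s1 = np.zeros(n), np.ones(n)
y = s1 + sigma * rng.standard_normal(n)         # one realization under H1

# For i.i.d. Gaussian noise, log L(y) is an affine function of the
# correlation statistic (s1 - s0)^T y, so thresholding either one
# gives the same test.
print("log L(y):    ", log_lr(y, s0, s1, log_pz))
print("(s1-s0)^T y: ", (s1 - s0) @ y)
```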

11.7.2    Deterministic Signals in Correlated Gaussian Noise

In general, the detection problem with vector observations that are conditionally dependent, given the hypothesis, does not admit any special structure beyond what is described in (11.25). However, in some special cases, we can simplify the expression for the log likelihood ratio to obtain some more insight into the detector structure. In this section, we consider the example of detecting deterministic signals in correlated Gaussian noise, for which the hypotheses are described by:

$$H_0: Y = s_0 + Z \quad\text{versus}\quad H_1: Y = s_1 + Z$$

with $s_0$ and $s_1$ being deterministic signals as in Example 11.7.1, and $Z$ a Gaussian vector with zero mean and covariance matrix $\Sigma$, denoted by $Z \sim \mathcal{N}(0, \Sigma)$. In this case

$$p_j(y) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\!\left\{-\frac12 (y - s_j)^\top \Sigma^{-1} (y - s_j)\right\}.$$

where ∣Σ∣ is the absolute value of the determinant of Σ. Therefore

$$\log L(y) \;\triangleq\; \log\frac{p_1(y)}{p_0(y)} = (s_1 - s_0)^\top \Sigma^{-1}\left(y - \frac{s_1 + s_0}{2}\right).$$

Since log L(y) does not have any point masses under either hypothesis, the optimum detector is deterministic and has the form:

$$\delta_{\text{OPT}}(y) = \begin{cases} 1 & \text{if } T(y) \ge \eta \\ 0 & \text{if } T(y) < \eta \end{cases}$$

where $T(y) = (s_1 - s_0)^\top \Sigma^{-1} y$ and $\eta$ is chosen based on the detection criterion. In the special case of Bayesian detection,

$$\eta = \log\tau + \frac12 (s_1 - s_0)^\top \Sigma^{-1}(s_1 + s_0)$$

with τ given in (11.11).

If we define the pseudosignal by

$$\tilde{s} \;\triangleq\; \Sigma^{-1}(s_1 - s_0)$$

then the test statistic T(y) can be written as:

$$T(y) = \tilde{s}^\top y = \sum_{k=1}^n \tilde{s}_k y_k.$$

We see that the optimum detector is a correlation detector or matched filter [2].

Note that $T(Y)$ is linear in $Y$ and hence has a Gaussian PDF under both $H_0$ and $H_1$. In particular,

$$E_j[T(Y)] = \tilde{s}^\top s_j \;\triangleq\; \tilde\mu_j$$

and

$$\mathrm{Var}_j[T(Y)] = \mathrm{Var}(\tilde{s}^\top Z) = \tilde{s}^\top \Sigma\, \tilde{s} = \tilde\mu_1 - \tilde\mu_0 \;\triangleq\; d^2$$

where $d$ is called the Mahalanobis distance between the signals $s_1$ and $s_0$.

Based on the above characterization of T(y), we can conclude that the problem of deterministic signal detection in correlated Gaussian noise is equivalent to the following detection problem involving the scalar observation T(y):

$$H_0: T(y) \sim \mathcal{N}(\tilde\mu_0, d^2) \quad\text{versus}\quad H_1: T(y) \sim \mathcal{N}(\tilde\mu_1, d^2).$$
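The reduction to a scalar Gaussian problem makes the design of an $\alpha$-level N-P test mechanical: the threshold on $T(y)$ is $\tilde\mu_0 + d\, Q^{-1}(\alpha)$ and the power is $Q(Q^{-1}(\alpha) - d)$. The sketch below ($\Sigma$, $s_0$, $s_1$, and $\alpha$ are illustrative choices) computes the pseudosignal, the Mahalanobis distance, and these quantities.

```python
# A minimal sketch of the correlation detector for deterministic signals in
# correlated Gaussian noise. Sigma, s0, s1, and alpha are illustrative choices.
import numpy as np
from scipy.stats import norm

alpha = 0.05
s0 = np.zeros(3)
s1 = np.ones(3)
Sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])

s_tilde = np.linalg.solve(Sigma, s1 - s0)  # pseudosignal Sigma^{-1} (s1 - s0)
d2 = s_tilde @ (s1 - s0)                   # squared Mahalanobis distance
d = np.sqrt(d2)

# Equivalent scalar problem: T(Y) ~ N(mu0, d^2) vs. N(mu1, d^2).
mu0, mu1 = s_tilde @ s0, s_tilde @ s1
eta = mu0 + d * norm.isf(alpha)            # alpha-level N-P threshold on T(y)
pd = norm.sf(norm.isf(alpha) - d)          # P_D = Q(Q^{-1}(alpha) - d)
print(f"d = {d:.3f}, threshold = {eta:.3f}, PD = {pd:.4f}")

# The statistic itself is a correlation: T(y) = s_tilde @ y.
```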

11.7.3    Gaussian Signals in Gaussian Noise

In this section we consider another important example involving dependent observations, that of detecting Gaussian signals in Gaussian noise. The hypotheses are described by:

$$H_0: Y = S_0 + Z \quad\text{versus}\quad H_1: Y = S_1 + Z$$

where S0, S1, and Z are jointly Gaussian random vectors. It is easy to see that this problem is equivalent to the following detection problem:

$$H_0: Y \sim \mathcal{N}(\mu_0, \Sigma_0) \quad\text{versus}\quad H_1: Y \sim \mathcal{N}(\mu_1, \Sigma_1)$$

(11.26)

for some vectors μ0, μ1, and covariance matrices Σ0 and Σ1. Note that

$$p_j(y) = \frac{1}{\sqrt{(2\pi)^n |\Sigma_j|}} \exp\!\left\{-\frac12 (y - \mu_j)^\top \Sigma_j^{-1} (y - \mu_j)\right\}$$

and therefore the log likelihood ratio is given by:

$$\log L(y) = -\frac12\, y^\top\!\left(\Sigma_1^{-1} - \Sigma_0^{-1}\right) y + \left(\mu_1^\top \Sigma_1^{-1} - \mu_0^\top \Sigma_0^{-1}\right) y + \frac12\!\left[\log\frac{|\Sigma_0|}{|\Sigma_1|} + \mu_0^\top \Sigma_0^{-1}\mu_0 - \mu_1^\top \Sigma_1^{-1}\mu_1\right].$$

Thus, the optimum detector in general involves both a quadratic term and a linear term in y. If $\Sigma_0 = \Sigma_1$ and $\mu_0 \ne \mu_1$, then the quadratic term vanishes and we have the detector structure we saw earlier for the detection of deterministic signals in Gaussian noise. If $\mu_0 = \mu_1 = 0$ and $\Sigma_1 \ne \Sigma_0$, then the linear term vanishes and we have a purely quadratic detector.

Example 11.7.2. Signaling over Rayleigh Fading Channel with Random Phase. The following detection problem arises in the context of wireless communication systems, when the carrier phase is not known at the receiver:

$$H_0: Y = Z \quad\text{versus}\quad H_1: Y = \begin{bmatrix} A\cos\phi \\ A\sin\phi \end{bmatrix} + Z$$

(11.27)

where $Z \sim \mathcal{N}(0, \sigma^2 I)$, $A$ is the fading amplitude, which is Rayleigh distributed, and $\phi$ is the random phase, which is uniformly distributed on $[0, 2\pi]$. The PDF of $A$ is given by:

$$p_A(a) = \frac{a}{\nu^2} \exp\!\left[-\frac{a^2}{2\nu^2}\right] \mathbb{I}\{a \ge 0\}$$

If we define the fading signal vector $S$ to have components $S_1 = A\cos\phi$ and $S_2 = A\sin\phi$, then it is not difficult to show that $S_1$ and $S_2$ are independent $\mathcal{N}(0, \nu^2)$ random variables. Thus the hypothesis test of (11.27) reduces to:

$$H_0: Y \sim \mathcal{N}(0, \sigma^2 I) \quad\text{versus}\quad H_1: Y \sim \mathcal{N}(0, (\sigma^2 + \nu^2) I).$$

This is a special case of (11.26) with $\mu_0 = \mu_1 = 0$, $\Sigma_0 = \sigma^2 I$, and $\Sigma_1 = (\sigma^2 + \nu^2) I$. Thus the log likelihood ratio has the form:

$$\log L(y) = (\text{constant})\, y^\top y + (\text{constant})$$

from which we can conclude that the optimum detector is of the form:

$$\delta_{\text{OPT}}(y) = \begin{cases} 1 & \text{if } y^\top y \ge \eta \\ 0 & \text{if } y^\top y < \eta. \end{cases}$$

The test statistic $Y^\top Y = Y_1^2 + Y_2^2$ has an exponential distribution with mean $2\sigma^2$ under $H_0$, and an exponential distribution with mean $2(\sigma^2 + \nu^2)$ under $H_1$. Thus, if we are interested in N-P detection, for an $\alpha$-level test we can set $P_0\{Y^\top Y \ge \eta_\alpha\} = \alpha$ by setting

$$\exp\!\left[-\frac{\eta_\alpha}{2\sigma^2}\right] = \alpha \;\Longleftrightarrow\; \eta_\alpha = -2\sigma^2 \log\alpha.$$

The corresponding power of the test is given by:

$$P_D(\delta_{\text{OPT}}) = P_1\{Y^\top Y \ge \eta_\alpha\} = \exp\!\left[-\frac{\eta_\alpha}{2(\sigma^2 + \nu^2)}\right] = \alpha^{\sigma^2/(\sigma^2 + \nu^2)}.$$
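The exact expressions for $P_F$ and $P_D$ of this energy detector can be checked by simulation, using the fact established above that the fading signal components are i.i.d. $\mathcal{N}(0, \nu^2)$. In the sketch below, $\sigma$, $\nu$, and $\alpha$ are illustrative choices.

```python
# A minimal Monte Carlo check of the energy detector in Example 11.7.2.
# The values sigma, nu, and alpha are illustrative choices.
import numpy as np

sigma, nu, alpha = 1.0, 1.0, 0.1
eta = -2.0 * sigma**2 * np.log(alpha)        # eta_alpha = -2 sigma^2 log(alpha)

rng = np.random.default_rng(3)
m = 1_000_000
y0 = sigma * rng.standard_normal((m, 2))         # H0: Y = Z
s = nu * rng.standard_normal((m, 2))             # S1, S2 i.i.d. N(0, nu^2)
y1 = s + sigma * rng.standard_normal((m, 2))     # H1: Y = S + Z

pf = np.mean(np.sum(y0**2, axis=1) >= eta)
pd = np.mean(np.sum(y1**2, axis=1) >= eta)
print("empirical P_F:", pf, " target:", alpha)
print("empirical P_D:", pd, " predicted:", alpha**(sigma**2 / (sigma**2 + nu**2)))
```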

11.8    Summary and Further Reading

This chapter covered the fundamentals of detection theory, with an emphasis on binary detection problems. In Section 11.1, we provided a general statistical decision theory framework for detection problems. In Sections 11.2 through 11.4, we introduced the three basic formulations for the binary detection problem: Bayesian, minimax, and Neyman-Pearson. We saw that in all cases the optimum detection rule is an LRT with possible randomization. In Sections 11.5 and 11.6, we studied composite detection problems where the distributions of the observations are not completely specified. In particular, we saw that Bayesian composite detection can be reduced to an equivalent simple detection problem. The Neyman-Pearson version of the composite detection problem is more interesting, and we studied various approaches to it, including UMP detection, GLR detection, and LMP detection. Finally, we examined the detection problem with vector observations in more detail, and discussed optimum detector structures for observations that are conditionally independent or conditionally dependent, given each hypothesis.

This chapter was inspired by the textbook on detection and estimation theory by Poor [2]. While we focused almost exclusively on binary detection problems, the extension to M-ary detection is straightforward, at least in the Bayesian setting (see Exercise 11.9.6). More details on M-ary detection can be found in the books by Van Trees [3], Levy [4], and Kay [5]. An alternative formulation of the detection problem with incompletely specified distributions is the robust formulation of Huber [6]. Other extensions of detection theory include sequential detection [7] and quickest change detection [8], where observations are taken sequentially in time and decisions about the hypothesis need to be made online. Asymptotic performance analysis and design of detection procedures for a large number of observations, using tools from large deviations theory, has been an active area of research (see, e.g., [9]). Finally, distributed sensor networks have generated interesting new directions for research in detection theory [10].

Acknowledgments

The writing of this chapter was supported in part by the U.S. National Science Foundation, under grant CCF-0830169, through the University of Illinois at Urbana-Champaign. The author would also like to thank Taposh Banerjee for help with the figures.

11.9    Exercises

Exercise 11.9.1. Consider the binary statistical decision theory problem for which $\mathcal{S} = \mathcal{D} = \{0, 1\}$. Suppose the cost function is given by

$$C(i,j) = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } j = 0,\ i = 1 \\ 10 & \text{if } j = 1,\ i = 0 \end{cases}$$

The observation Y takes values in the set Γ = {a, b, c} and the conditional p.m.f.’s of Y are:

$$p_0(a) = p_0(b) = 0.5, \qquad p_1(a) = p_1(b) = 0.25, \quad p_1(c) = 0.5$$

1.  Is there a best decision rule based on conditional risks?

2.  Find Bayes (for equal priors) and minimax rules within the set of deterministic decision rules.

3.  Now consider the set of randomized decision rules. Find a Bayes rule (for equal priors). Also construct a randomized rule whose maximum risk is smaller than that of the minimax rule of part 2.

Exercise 11.9.2. For the binary hypothesis testing problem, with $C_{0,0} < C_{1,0}$ and $C_{1,1} < C_{0,1}$, show there is no “best” rule based on conditional risks, except in the trivial case where $p_0(y)$ and $p_1(y)$ have disjoint supports.

Exercise 11.9.3. Let $\mathcal{S} = \{0, 1\}$ and $\mathcal{D} = \{0, 1, e\}$. This would correspond to binary communication with erasures. Now suppose

$$p_j(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[-\frac{(y - (-1)^{j+1})^2}{2\sigma^2}\right], \quad j = 0, 1, \quad -\infty < y < \infty.$$

That is, Y has distribution $\mathcal{N}(-1, \sigma^2)$ when the state is 0, and Y has distribution $\mathcal{N}(1, \sigma^2)$ when the state is 1. Assume a cost structure

$$C_{i,j} = \begin{cases} 0 & \text{if } i = 0, j = 0 \text{ or } i = 1, j = 1 \\ 1 & \text{if } i = 1, j = 0 \text{ or } i = 0, j = 1 \\ c & \text{if } i = e \end{cases}$$

Furthermore, assume that the two states are equally likely.

1.  First assume that c < 0.5. Show that the Bayes rule for this problem has the form:

$$\delta_B(y) = \begin{cases} 0 & y \le -t \\ e & -t < y < t \\ 1 & y \ge t \end{cases}$$

Also give an expression for t in terms of the parameters of the problem.

2.  Now find δB(y) when c ≥ 0.5.

Exercise 11.9.4. Consider the binary detection problem with

$$p_1(y) = \begin{cases} 1/4 & \text{if } y \in [0,4] \\ 0 & \text{otherwise} \end{cases}$$

and

$$p_0(y) = \begin{cases} (y+3)/18 & \text{if } y \in [-3,3] \\ 0 & \text{otherwise} \end{cases}$$

1.  Find a Bayes rule for uniform costs and equal priors and the corresponding minimum Bayes risk.

2.  Find a minimax rule for uniform costs, and the corresponding minimax risk.

Exercise 11.9.5. For Exercise 11.9.2 above, find the minimum Bayes risk function $V(\pi_0)$, and then find a minimax rule in the set of randomized decision rules using $V(\pi_0)$.

Exercise 11.9.6. In this chapter, we formulated and solved the general Bayesian binary detection problem. We may generalize this formulation to M-ary detection (M > 2) as follows:

•  $\mathcal{S} = \{0, \ldots, M-1\}$, with the a priori probability of state $j$ being $\pi_j$.

•  $\mathcal{D} = \{0, \ldots, M-1\}$

•  C(i, j) = Cij ≥ 0, for i, j = 0, …, M − 1.

•  $\mathcal{Y}$, the observation space, being continuous/discrete with conditional density (PDF/PMF) $p_j(y)$, $j = 0, \ldots, M-1$.

•  $\delta \in \Delta$, where $\delta$ partitions $\mathcal{Y}$ into $M$ regions $\mathcal{Y}_0, \ldots, \mathcal{Y}_{M-1}$, with $\delta(y) = i$ when $y \in \mathcal{Y}_i$.

Find $\delta_B(y)$ by specifying the Bayes decision regions $\mathcal{Y}_i$, $i = 0, \ldots, M-1$. Simplify as much as possible.

Exercise 11.9.7. Consider the 5-ary detection problem in which the hypotheses are given by

$$H_j: Y = (j-2) + Z, \quad j = 0, 1, 2, 3, 4,$$

where $Z \sim \mathcal{N}(0, 1)$. Assume that the hypotheses are equally likely.

1.  Find the decision rule with minimum probability of error (i.e., Bayes rule with uniform costs).

2.  Also find the corresponding minimum Bayes risk.

Hint: Find the probability of correct decision making first.

Exercise 11.9.8. Consider the binary detection problem with

$$p_0(y) = \frac12 e^{-|y|} \quad\text{and}\quad p_1(y) = e^{-2|y|}, \quad y \in \mathbb{R}$$

1.  Find the Bayes rule for equal priors and a cost structure of the form C00 = C11 = 0, C10 = 1, and C01 = 2.

2.  Find the Bayes risk for the Bayes rule of part 1. (Note that the costs are not uniform.)

3.  Find a Neyman-Pearson rule for α = 1/4.

4.  Find the probability of detection for the rule of part 3.

Exercise 11.9.9. Consider the detection problem for which Γ = {0, 1, 2, …} and the PMF’s of the observations under the two hypotheses are:

$$p_0(y) = (1-\beta_0)\beta_0^{\,y}, \quad y = 0, 1, 2, \ldots$$

and

$$p_1(y) = (1-\beta_1)\beta_1^{\,y}, \quad y = 0, 1, 2, \ldots$$

Assume that 0 < β0 < β1 < 1.

1.  Find the Bayes rule for uniform costs and equal priors.

2.  Find the Neyman-Pearson rule with false-alarm probability α ∈ (0, 1). Also find the corresponding probability of detection as a function of α.

Exercise 11.9.10. Consider a binary detection problem, where the goal is to minimize the following risk measure

$$\rho(\tilde\delta) = [P_F(\tilde\delta)]^2 + P_M(\tilde\delta)$$

1.  Show that the optimal solution is a (possibly randomized) likelihood-ratio test.

2.  Find the optimal solution for the observation model

$$p_0(y) = \begin{cases} 1 & \text{if } y \in [0,1] \\ 0 & \text{otherwise} \end{cases}$$

and

$$p_1(y) = \begin{cases} 2y & \text{if } y \in [0,1] \\ 0 & \text{otherwise} \end{cases}$$

Exercise 11.9.11. Consider the detection problem where L(y) has no point masses under either hypothesis. Let δη denote the likelihood ratio test:

$$\delta_\eta(y) = \begin{cases} 1 & \text{if } L(y) \ge \eta \\ 0 & \text{if } L(y) < \eta. \end{cases}$$

As discussed in Section 11.4.2, a plot of PD(δη) versus PF(δη) for various values of η is called the ROC. This plot is a concave function with the point (0, 0) corresponding to η = ∞, and the point (1, 1) corresponding to η = 0. Prove the following properties of ROC’s:

1.  PD(δη) ≥ PF(δη) for all η. (Hint: consider cases η ≤ 1 and η > 1 separately.)

2.  The slope of the ROC at a particular point is equal to the value of the threshold η required to achieve the PD and PF at that point, i.e.,

$$\frac{dP_D}{dP_F} = \eta.$$

(Hint: Use the fact that L(Y) has a density under each hypothesis.)

Exercise 11.9.12. Consider the following composite detection problem with Λ = ℝ:

$$H_0: \theta \le \tilde\theta \quad\text{versus}\quad H_1: \theta > \tilde\theta$$

where $\tilde\theta$ is a fixed real number. Now suppose that for each fixed $\theta_0 \le \tilde\theta$ and each fixed $\theta_1 > \tilde\theta$, we have

$$\frac{p_{\theta_1}(y)}{p_{\theta_0}(y)} = g_{\theta_0,\theta_1}(T(y))$$

where the function $T$ does not depend on $\theta_1$ or $\theta_0$, and the function $g_{\theta_0,\theta_1}$ is strictly increasing in its argument.

Show that for any level α, a UMP test between H0 and H1 exists.

Exercise 11.9.13. Consider the composite binary detection problem in which

$$p_\theta(y) = \begin{cases} \theta e^{-\theta y} & \text{if } y \ge 0 \\ 0 & \text{if } y < 0 \end{cases}$$

1.  For α ∈ (0, 1), show that a UMP test of level α exists for testing the hypotheses

$$H_0: \theta \in \Lambda_0 = [1,2] \quad\text{versus}\quad H_1: \theta \in \Lambda_1 = (2,\infty).$$

Find this UMP test as a function of α.

2.  Find the structure of the generalized likelihood ratio test.

Exercise 11.9.14. (UMP testing with Laplacian Observations) Consider the composite binary detection problem in which

$$p_\theta(y) = \frac12 e^{-|y-\theta|}, \quad y \in \mathbb{R},$$

and we are testing:

$$H_0: \theta = 0 \quad\text{versus}\quad H_1: \theta > 0$$

1.  Does a UMP test exist? If so, find it for level α and derive its power PD. If not, find the generalized likelihood ratio test for level α.

2.  Find a locally most powerful α-level test and derive its power PD.

Exercise 11.9.15. Consider the detection problem:

$$H_0: Y = \begin{bmatrix} -a \\ 0 \end{bmatrix} + Z \quad\text{versus}\quad H_1: Y = \begin{bmatrix} a \\ 0 \end{bmatrix} + Z$$

where $Z \sim \mathcal{N}(0, \Sigma)$ with

$$\Sigma = \begin{bmatrix} 1 & \rho \\ \rho & 1+\rho^2 \end{bmatrix}.$$

Assume that a > 0 and ρ ∈ (0, 1).

1.  For equal priors show that the minimum-probability-of-error detector is given by

$$\delta_B(y) = \begin{cases} 1 & \text{if } y_1 - b y_2 \ge \tau \\ 0 & \text{if } y_1 - b y_2 < \tau \end{cases}$$

where $b = \rho/(1+\rho^2)$ and $\tau = 0$.

2.  Determine the minimum probability of error.

3.  Consider the test of part 1 in the limit as ρ → 0. Explain why the dependence on $y_2$ goes away in this limit.

4.  Now suppose the observations $Y \sim \mathcal{N}([a\ 0], \Sigma)$, with ρ = 1 but a being an unknown parameter, and we wish to test between the hypotheses:

$$H_0: 0 < a < 1 \quad\text{versus}\quad H_1: a > 1.$$

Show that a UMP test exists for this problem, and find the UMP test of level α ∈ (0, 1).

Exercise 11.9.16. Consider the detection problem with n-dimensional observations:

$$H_0: Y = Z \quad\text{versus}\quad H_1: Y = s + Z$$

where the components of Z are zero mean correlated random variables with

$$E[Z_k Z_\ell] = \sigma^2 \rho^{|k-\ell|}, \quad \text{for all } 1 \le k, \ell \le n$$

where ∣ρ∣ < 1.

1.  Show that the N-P test for this problem has the form:

$$\delta_\eta(y) = \begin{cases} 1 & \text{if } \sum_{k=1}^n b_k x_k \ge \eta \\ 0 & \text{if } \sum_{k=1}^n b_k x_k < \eta \end{cases}$$

where $b_1 = s_1/\sigma$, $x_1 = y_1/\sigma$, and

$$b_k = \frac{s_k - \rho s_{k-1}}{\sigma\sqrt{1-\rho^2}}, \qquad x_k = \frac{y_k - \rho y_{k-1}}{\sigma\sqrt{1-\rho^2}}, \quad k = 2, \ldots, n.$$

Hint: Note that $\Sigma_Z^{-1} = A/(\sigma^2(1-\rho^2))$, where $A$ is a tridiagonal matrix with main diagonal $(1,\ 1+\rho^2,\ 1+\rho^2,\ \ldots,\ 1+\rho^2,\ 1)$ and superdiagonal and subdiagonal entries all equal to $-\rho$.

2.  Find the α-level N-P test, δηα.

3.  Find the ROC for the above detector, i.e., find PD(δηα) as a function of α.

Exercise 11.9.17. Consider the composite detection problem with two-dimensional observations:

$$H_0: Y = Z \quad\text{versus}\quad H_1: Y = \theta s + Z$$

where $Z_1$ and $Z_2$ are independent $\mathcal{N}(0, 1)$ random variables, and $s_1 = 1$ and $s_2 = -1$.

The parameter θ is a deterministic but unknown parameter that takes one of two possible values +1 or −1.

1.  Is there a UMP test for this problem? If so, find it for level α. If not, explain why not.

2.  Show that an α-level GLRT for this problem is given by:

$$\delta_{\text{GLRT}}(y) = \begin{cases} 1 & \text{if } |y_1 - y_2| \ge \eta_\alpha \\ 0 & \text{otherwise} \end{cases}$$

with $\eta_\alpha = \sqrt{2}\, Q^{-1}(\alpha/2)$.

3.  Give a clear argument to establish that the probability of detection for the GLRT of part 2 is independent of θ.

4.  Now find the probability of detection for the GLRT as a function of ηα.

References

[1]  Ferguson, T.S., Mathematical Statistics: A Decision Theoretic Approach. Academic Press, 1967.

[2]  Poor, H.V., An Introduction to Signal Detection and Estimation, second edition. Springer-Verlag, 1994.

[3]  Van Trees, H.L., Detection, Estimation and Modulation Theory, Part 1. Wiley, 1968.

[4]  Levy, B.C., Principles of Signal Detection and Parameter Estimation. Springer-Verlag, 2008.

[5]  Kay, S.M., Fundamentals of Statistical Signal Processing: Detection Theory. Prentice Hall, 1998.

[6]  Huber, P.J., Robust Statistics. Wiley, 1981.

[7]  Wald, A., Sequential Analysis. Wiley, 1947.

[8]  Poor, H.V. and Hadjiliadis, O., Quickest Detection. Cambridge University Press, 2009.

[9]  Dembo, A. and Zeitouni, O., Large Deviations Techniques and Applications, Second Edition. Springer-Verlag, 1998.

[10]  Varshney, P.K., Distributed Detection and Data Fusion. Springer-Verlag, 1997.

1As will be the convention in the rest of the chapter, we denote random variables by uppercase letters and their corresponding realizations by lowercase letters. In particular, a realization of Y is denoted by y.

2This condition typically holds for continuous observations when p0(y) and p1(y) are PDF’s with the same support, but not necessarily even in this case.
