11 Fundamentals of Detection Theory

Venugopal V. Veeravalli^‡

^‡ University of Illinois at Urbana-Champaign, USA

Detection problems arise in a number of engineering applications such as radar, communications, surveillance, and image analysis. In the basic setting of the problem, the goal is to detect the presence or absence of a signal in noise. This chapter will provide the mathematical and statistical foundations for solving such problems.

11.1.1 Statistical Decision Theory Framework

Detection problems fall under the umbrella of statistical decision theory [1], where the goal is to make a right (optimal) choice from a set of alternatives in a noisy environment. There are five basic ingredients in a typical decision theory problem.

• $S$ : The set of states (of nature). For detection problems, the number of states is finite, i.e., ∣ $S$ ∣ = M < ∞. For binary detection problems, which are prevalent in applications, $S$ = {0, 1}. We denote a typical state for detection problems by the variable j, i.e., j ∈ $S$ .

• $D$ : The set of decisions or actions. This set is the set of decisions about the state. Elements in $D$ would typically correspond to elements in $S$ . In some applications such as communications with erasure, the set $D$ could have larger cardinality than the set $S$ . We denote a typical decision by the variable i, i.e., i ∈ $D$ .

• C(i, j) or C_i,j: The cost function between decisions and states, C : $D$ × $S$ → ℝ⁺. In order to be able to talk about optimizing the decision, we need to quantify the cost incurred from each decision. The cost function C serves this purpose. An example of a cost function that is relevant in many applications is the uniform cost function, for which

$C_{i,j} = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } i \neq j \end{cases} .$

(11.1)

• $Y$ : The set of observations. The decision about the state is not made blindly but based on some random observation¹ Y taking values in $Y$ .

• Δ: The set of decision rules or tests. Since the decisions are based on the observations, we need to have mappings from the observation set to the decision set. These are the decision rules, i.e., δ ∈ Δ, δ : $Y$ ↦ $D$ .

Detection problems are also referred to as hypothesis testing problems, with the understanding that each element of $S$ corresponds to a hypothesis about the nature of the observations. The hypothesis corresponding to state j is denoted by H_j.

11.1.2 Probabilistic Structure for Observation Space

We associate with $Y$ a sigma-algebra $G$ of subsets of $Y$ to which we assign probabilities. The pair ( $Y$ , $G$ ) is the observation space. In the applications of interest in this chapter, we will almost exclusively have $Y$ = ℝⁿ, or $Y$ = {γ₁, γ₂, …}, a countable set. In the case that $Y$ = ℝⁿ, we take $G$ to be the smallest sigma-algebra containing all the n-dimensional rectangles in ℝⁿ, i.e., the Borel sigma-algebra $B$ ⁿ. In the case when $Y$ = {γ₁, γ₂, …}, we take $G$ to be the power set of $Y$ , i.e., 2^$Y$.

For $Y$ = ℝⁿ, we assume that probabilities can be assigned using an n-dimensional PDF. For $Y$ = {γ₁, γ₂, …}, probabilities can be assigned using a PMF. We will use the term density for both PDFs and PMFs. We denote this density function by p, and use a common notation for the probability measure as in [2]:

For A ∈ $G$ ,

$P(A) = \int_{y \in A} p(y) \, μ(dy) = \begin{cases} \int_{y \in A} p(y) \, dy & \text{for } Y = ℝ^{n} \\ \sum_{γ_{i} \in A} p(γ_{i}) & \text{for } Y = \{γ_{1}, γ_{2}, \dots\} \end{cases}$

(11.2)

Let g be a function on $Y$ . Then the expected value of the random variable g(Y) is given by

$E[g(Y)] = \int_{Y} p(y) g(y) \, μ(dy) = \begin{cases} \int_{Y} p(y) g(y) \, dy & \text{for } Y = ℝ^{n} \\ \sum_{γ_{i} \in Y} p(γ_{i}) g(γ_{i}) & \text{for } Y = \{γ_{1}, γ_{2}, \dots\} \end{cases}$

(11.3)

11.1.3 Conditional Density and Conditional Risk

In order to make a decision about the state j based on the observation Y, we need to know how Y depends on j statistically. Typically, we assume that the conditional density (PDF/PMF) of Y conditioned on the state being j (which we denote by p_j(y)) is available for each j ∈ $S$ . In case the state is modeled as random variable J (see below), p_j(y) is the usual conditional density p_Y∣J(y∣j), but otherwise we can think of the set {p_j(y), j ∈ $S$ } as simply an indexed set of densities, with p_j being the density for Y that corresponds to the state being j.

Table 11.1: Decision rules and conditional risks for Example 11.1.1.

The cost associated with a decision rule δ ∈ Δ is a random quantity (because Y is random) given by C(δ(Y), j). Therefore, to order decision rules according to their “merit” we use the quantity

$R_{j} (δ) = E_{j} [C (δ (Y), j)] = \int C (δ (y), j) p_{j} (y) μ (d y) .$

which we call the conditional risk associated with δ when the state is j.

The conditional risk function can be used to obtain a (partial) ordering of the decision rules in Δ, in the following sense.

Definition 11.1.1. A decision rule δ is better than decision rule δ′ if

$R_{j} (δ) \leq R_{j} (δ^{'}), \forall j \in S$

and

$R_{j}(δ) < R_{j}(δ^{'}) \quad \text{for at least one } j \in S$

Sometimes it may be possible to find a decision rule δ* ∈ Δ which is better than any other δ ∈ Δ. In this case, the statistical decision problem is solved. Unfortunately, this usually happens only for trivial cases as in the following example.

Example 11.1.1. Suppose $S$ = $D$ = {0, 1} with the uniform cost function as in (11.1). Furthermore suppose the observation Y takes values in the set $Y$ = {a, b, c} and the conditional p.m.f.’s of Y are:

$p_{0} (a) = 1, p_{0} (b) = p_{0} (c) = 0, p_{1} (a) = 0, p_{1} (b) = p_{1} (c) = 0.5.$

Then it is easy to see that we have the conditional risks for the eight possible decision rules depicted in Table 11.1. Clearly, δ₄ is the best rule according to Definition 11.1.1, but this happens only because the conditional PMFs p₀ and p₁ have disjoint supports (see Exercise 2).

□
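The conditional risks in Example 11.1.1 can be recomputed by brute-force enumeration. The following Python sketch does this under uniform costs. The indexing of the rules (δ₁ through δ₈ listing the decisions on a, b, c in binary counting order) is an assumption made for illustration, chosen because it agrees with the statement that δ₄ is the best rule; the names `conditional_risks`, `RULES`, and `RISKS` are likewise illustrative.

```python
from itertools import product

# Observation alphabet and conditional PMFs from Example 11.1.1.
OBS = ("a", "b", "c")
P0 = {"a": 1.0, "b": 0.0, "c": 0.0}
P1 = {"a": 0.0, "b": 0.5, "c": 0.5}


def conditional_risks(rule, p0, p1):
    """Return (R0, R1) under uniform costs for a rule mapping y -> {0, 1}."""
    r0 = sum(p0[y] for y in OBS if rule[y] == 1)  # deciding 1 when the state is 0
    r1 = sum(p1[y] for y in OBS if rule[y] == 0)  # deciding 0 when the state is 1
    return r0, r1


# Assumed indexing: rule k lists (delta(a), delta(b), delta(c)) in binary counting order.
RULES = [dict(zip(OBS, bits)) for bits in product((0, 1), repeat=3)]
RISKS = [conditional_risks(rule, P0, P1) for rule in RULES]

for k, (rule, (r0, r1)) in enumerate(zip(RULES, RISKS), start=1):
    decisions = tuple(rule[y] for y in OBS)
    print(f"delta_{k}: {decisions}  R0={r0:.2f}  R1={r1:.2f}")
```

Under this indexing, exactly one rule (decide 0 on a, decide 1 on b and c) has both conditional risks equal to zero, which is possible only because the supports of p₀ and p₁ are disjoint.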

Since conditional risks cannot be used directly in finding optimal solutions to statistical decision making problems except in trivial cases, there are two general approaches for finding optimal decision rules: Bayesian and minimax.

11.1.4 Bayesian Approach

Here we assume that we are given an a priori probability distribution on the set of states $S$ . The state is then denoted by a random variable J with PMF {π_j , j ∈ $S$ } (since, for detection problems, the state space is finite). Now we introduce the average risk or Bayes risk associated with a decision rule δ, which is given by

$r (δ) = E [R_{J} (δ)] = \sum_{j \in S} π_{j} R_{j} (δ)$

(11.4)

We can then obtain an ordering on the δ’s by using this Bayes risk. In particular, we choose the decision rule δ_B that has minimum Bayes risk, i.e.,

$δ_{B} = \arg \min_{δ \in Δ} r (δ) .$

The decision rule δ_B is called Bayes rule.

11.1.5 Minimax Approach

What if we are not given a prior distribution on the set $S$ ? We could postulate a distribution on $S$ (for example, a uniform distribution) and use the Bayesian approach. On the other hand, one may want to guarantee a certain level of performance for all choices of state. In this case, we use a minimax approach. The goal of the minimax approach is to find the decision rule δ_m that has the best worst case cost:

$δ_{m} = \arg \min_{δ \in Δ} \max_{j \in S} R_{j} (δ) .$

The decision rule δ_m is called the minimax rule.

In addition to Bayes and minimax approaches there are other criteria and techniques that are specific to special classes of decision-making problems. For example, in binary hypothesis testing, a third approach called the Neyman-Pearson approach (see Section 11.4) is often used in practice.

11.1.6 Randomized Decision Rules

Even though this might seem counter-intuitive at first, it is sometimes possible to get a better decision rule by randomly choosing between a set of deterministic decision rules.

Definition 11.1.2. A randomized decision rule δ̃ is described by

$\tilde{δ}(y) = δ_{ℓ}(y) \quad \text{with probability } β_{ℓ}, \quad ℓ = 1, \dots, L$

for some L and some {β_ℓ}, with β_ℓ > 0 and ∑_ℓ β_ℓ = 1.

Table 11.2: Decision rules, and Bayes and minimax risks for Example 11.1.2.

The set Δ̃ of randomized decision rules obviously contains the set Δ, and thus optimizing over Δ̃ will necessarily result in at least as good a decision rule as that obtained by optimizing over Δ.

Theorem 11.1.1. Randomization does not improve Bayes rules:

$\min_{δ \in Δ} r (δ) = \min_{\tilde{δ} \in \tilde{Δ}} r (\tilde{δ}) .$

Proof: Since Δ ⊂ Δ̃, it is clear that the right-hand side (RHS) is less than or equal to the left-hand side (LHS). To prove the reverse inequality, suppose δ̃ chooses δ_ℓ with probability β_ℓ, ℓ = 1, …, L. Then

$r (\tilde{δ}) = \sum_{l = 1}^{L} β_{l} r (δ_{l}) \geq \sum_{l = 1}^{L} β_{l} \min_{δ \in Δ} r (δ) = \min_{δ \in Δ} r (δ) .$

Taking the minimum over δ̃ ∈ Δ̃ on the LHS gives us the desired inequality.

■

However, as we see in the following example, randomization could result in a better minimax rule. We will also see later in Section 11.4 that randomization can yield better Neyman-Pearson rules for binary detection problems.

Example 11.1.2. Consider the same setup as in Example 11.1.1 with the following conditional PMF’s.

$p_{0} (a) = p_{0} (b) = 0.5, p_{0} (c) = 0, p_{1} (a) = 0, p_{1} (b) = p_{1} (c) = 0.5.$

We can compute the conditional risks for the eight possible decision rules as shown in Table 11.2. Clearly there is no “best” rule based on conditional risks alone in this case. Now consider finding a Bayes rule for priors π₀ = π₁ = 0.5. It is clear from the table that δ₂ and δ₄ are both Bayes rules. Also, δ₂, δ₃, δ₄ and δ₆ are all minimax rules with minimax risk equal to 0.5. Finally, randomizing between δ₂ and δ₄ with equal probability results in a rule with minimax risk equal to 0.25. Thus, we see that randomization can improve minimax rules.

□
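The claims in Example 11.1.2 can be checked numerically. The sketch below enumerates the eight deterministic rules, computes their Bayes risks for equal priors and their worst-case risks, and then mixes two rules. As before, the rule indexing (binary counting order of the decisions on a, b, c) is an assumption for illustration, and the helper names are hypothetical.

```python
from itertools import product

OBS = ("a", "b", "c")
P0 = {"a": 0.5, "b": 0.5, "c": 0.0}
P1 = {"a": 0.0, "b": 0.5, "c": 0.5}
PRIOR0 = 0.5  # equal priors


def risks(rule):
    """Conditional risks (R0, R1) under uniform costs."""
    r0 = sum(P0[y] for y in OBS if rule[y] == 1)
    r1 = sum(P1[y] for y in OBS if rule[y] == 0)
    return r0, r1


# Assumed indexing: rule k lists (delta(a), delta(b), delta(c)) in binary counting order.
RULES = [dict(zip(OBS, bits)) for bits in product((0, 1), repeat=3)]
TABLE = []
for k, rule in enumerate(RULES, start=1):
    r0, r1 = risks(rule)
    bayes = PRIOR0 * r0 + (1 - PRIOR0) * r1  # Bayes risk (11.4)
    TABLE.append((k, r0, r1, bayes, max(r0, r1)))

best_bayes = min(row[3] for row in TABLE)
best_worst = min(row[4] for row in TABLE)
BAYES_RULES = [row[0] for row in TABLE if row[3] == best_bayes]
MINIMAX_RULES = [row[0] for row in TABLE if row[4] == best_worst]


def mix(k1, k2, beta):
    """Conditional risks when rule k1 is used w.p. beta and rule k2 otherwise."""
    _, a0, a1, _, _ = TABLE[k1 - 1]
    _, b0, b1, _, _ = TABLE[k2 - 1]
    return beta * a0 + (1 - beta) * b0, beta * a1 + (1 - beta) * b1


print(BAYES_RULES, MINIMAX_RULES, mix(2, 4, 0.5))
# [2, 4] [2, 3, 4, 6] (0.25, 0.25)
```

The mixed rule has both conditional risks equal to 0.25, so its worst-case risk is half that of the best deterministic rule, while its Bayes risk equals that of δ₂ and δ₄, in line with Theorem 11.1.1.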

11.1.7 General Method for Finding Bayes Rules

In the Bayesian framework, we can define the a posteriori probability π(j∣y) of the state j, given observation y. By Bayes' rule, with p(y) = ∑_j π_j p_j(y) denoting the marginal density of Y:

$π (j | y) = \frac{p_{j} (y) π_{j}}{p (y)} .$

(11.5)

We can write the Bayes risk of (11.4) in terms of π(j∣y) as:

$r (δ) = E [E [C (δ (Y), J) | Y]] = \int_{y \in Y} [\sum_{j \in S} π (j | y) C (δ (y), j)] p (y) μ (d y) .$

(11.6)

Define the a posteriori cost of decision i ∈ $D$ , given observation y, by

$C (i | y) ≜ \sum_{j \in S} π (j | y) C (i, j) .$

(11.7)

Then it is easy to see that minimizing r(δ) in (11.6) is equivalent to minimizing C(δ(y)∣y) for each y. Thus

$δ_{B} (y) = \arg \min_{i \in D} C (i | y) .$

(11.8)
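The recipe in (11.5), (11.7), and (11.8) translates directly into a few lines of code. The following Python sketch is one possible implementation for a finite set of states; the function names, the `cost[i][j]` layout, and the tie-breaking choice are assumptions made for illustration.

```python
def posterior(y, densities, priors):
    """Posterior probabilities pi(j|y) as in (11.5)."""
    joint = [pi_j * p_j(y) for pi_j, p_j in zip(priors, densities)]
    p_y = sum(joint)  # marginal density p(y)
    if p_y == 0:
        raise ValueError("y has zero probability under every state")
    return [w / p_y for w in joint]


def bayes_decision(y, densities, priors, cost):
    """Decision minimizing the a posteriori cost C(i|y) of (11.7); cost[i][j] = C(i, j)."""
    post = posterior(y, densities, priors)
    post_cost = [sum(c * q for c, q in zip(row, post)) for row in cost]
    # min() returns the smallest index on ties, which is one of several valid Bayes choices.
    return min(range(len(cost)), key=lambda i: post_cost[i])


# Illustration with the PMFs of Example 11.1.2, uniform costs, and equal priors.
P0 = {"a": 0.5, "b": 0.5, "c": 0.0}
P1 = {"a": 0.0, "b": 0.5, "c": 0.5}
UNIFORM_COST = [[0, 1], [1, 0]]
DECISIONS = {
    y: bayes_decision(y, [P0.get, P1.get], [0.5, 0.5], UNIFORM_COST) for y in "abc"
}
print(DECISIONS)  # {'a': 0, 'b': 0, 'c': 1}
```

At y = b the two a posteriori costs are equal, so either decision gives a Bayes rule; this sketch happens to pick 0 there.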

11.2 Bayesian Binary Detection

We now study the special case of binary detection (hypothesis testing) in more detail. Here $S$ = $D$ = {0, 1}, and hence any deterministic decision rule δ partitions the observation space into disjoint sets $Y_{0}$ and $Y_{1}$ , corresponding to decision δ(y) = 0 and δ(y) = 1, respectively. The conditional risks for a decision rule δ can be written as:

$R_{j} (δ) = C_{0, j} P_{j} (Y_{0}) + C_{1, j} P_{j} (Y_{1}), j = 0, 1.$

Assumption 11.2.1. The cost of a correct decision about the state is strictly smaller than that of a wrong decision:

$C_{0, 0} < C_{1, 0}, C_{1, 1} < C_{0, 1} .$

Using (11.8), we can find a Bayes decision rule for binary detection as:

$δ_{B}(y) = \arg \min_{i \in \{0, 1\}} C(i | y) = \begin{cases} 1 & \text{if } C(1 | y) \leq C(0 | y) \\ 0 & \text{if } C(1 | y) > C(0 | y) \end{cases} .$

Clearly, the Bayes solution need not be unique since the average risk is the same whether we assign the decision of “0” or “1” to observations for which C(1∣y) = C(0∣y). Using (11.7) and (11.5), we obtain:

$δ_{B}(y) = \begin{cases} 1 & \text{if } π(1 | y) [C_{0,1} - C_{1,1}] \geq π(0 | y) [C_{1,0} - C_{0,0}] \\ 0 & \text{otherwise} \end{cases}$

(11.9)

$= \begin{cases} 1 & \text{if } \frac{p_{1}(y)}{p_{0}(y)} \geq \frac{π_{0}}{π_{1}} \frac{C_{1,0} - C_{0,0}}{C_{0,1} - C_{1,1}} \\ 0 & \text{otherwise} \end{cases} .$

(11.10)

11.2.1 Likelihood Ratio Test

Definition 11.2.1. The likelihood ratio is given by

$L (y) = \frac{p_{1} (y)}{p_{0} (y)}, y \in Y$

with the understanding that $\frac{0}{0} = 0$ , and $\frac{x}{0} = \infty$ , for x > 0.

If we further define the threshold τ by:

$τ = \frac{π_{0}}{π_{1}} \frac{C_{1, 0} - C_{0, 0}}{C_{0, 1} - C_{1, 1}}$

(11.11)

then we can write

$δ_{B}(y) = \begin{cases} 1 & \text{if } L(y) \geq τ \\ 0 & \text{otherwise} \end{cases} .$

Thus Bayes rule is a likelihood ratio test (LRT).

11.2.2 Uniform Costs

For uniform costs (see (11.1)), C_0,0 = C_1,1 = 0, and C_0,1 = C_1,0 = 1. Therefore, the threshold for the LRT simplifies to $τ = \frac{π_{0}}{π_{1}}$ in this case. We can also see from (11.9) that

$δ_{B}(y) = \begin{cases} 1 & \text{if } π(1 | y) \geq π(0 | y) \\ 0 & \text{otherwise} \end{cases} .$

Thus, for uniform costs, Bayes rule is a maximum a posteriori probability (MAP) rule. Furthermore, for uniform costs, the Bayes risk of a decision rule δ is given by

$r (δ) = π_{0} P_{0} (Y_{1}) + π_{1} P_{1} (Y_{0}) .$

The RHS is the average probability of error, denoted by P_e. Thus, for uniform costs, Bayes rule is also a minimum probability of error (MPE) rule.

Finally if we have uniform costs and equal priors (i.e., π₀ = π₁ = 0.5), then

$δ_{B}(y) = \begin{cases} 1 & \text{if } p_{1}(y) \geq p_{0}(y) \\ 0 & \text{otherwise} \end{cases}$

and Bayes rule is a maximum likelihood (ML) decision rule.

11.2.3 Examples

Example 11.2.1. Signal Detection in Gaussian Noise. This detection problem arises in a number of engineering applications, including radar and digital communications, and can be described by the hypothesis test:

$H_{0} : Y = μ_{0} + Z \quad \text{versus} \quad H_{1} : Y = μ_{1} + Z$

where the constants µ₀ and µ₁ represent deterministic signals, and Z is a zero mean Gaussian random variable with variance σ², denoted by Z ~ $N$ (0, σ²). Without loss of generality, we may assume that µ₁ > µ₀.

The conditional PDFs are given by:

$p_{j} (y) = \frac{1}{\sqrt{2 π σ^{2}}} \exp [- \frac{{(y - μ_{j})}^{2}}{2 σ^{2}}], j = 0, 1$

and the likelihood ratio is given by:

$L (y) = \frac{p_{1} (y)}{p_{0} (y)} = \exp [\frac{μ_{1} - μ_{0}}{σ^{2}} (y - \frac{μ_{1} + μ_{0}}{2})] .$

It is easy to show that comparing L(y) to τ of (11.11) is equivalent to comparing y with τ′, where

$τ^{'} = \frac{σ^{2}}{μ_{1} - μ_{0}} \log τ + \frac{μ_{1} + μ_{0}}{2} .$

Thus Bayes rule is equivalent to a threshold test on the observation y:

$δ_{B}(y) = \begin{cases} 1 & \text{if } y \geq τ^{'} \\ 0 & \text{if } y < τ^{'} \end{cases} .$

(11.12)

For uniform costs and equal priors, τ = 1 and $τ^{'} = \frac{μ_{1} + μ_{0}}{2}$ . Furthermore,

$r (δ_{B}) = P_{e} (δ_{B}) = 0.5 P_{0} (Y_{1}) + 0.5 P_{1} (Y_{0})$

where

$P_{0}(Y_{1}) = P_{0}\{Y \geq τ^{'}\} = 1 - Φ\left(\frac{τ^{'} - μ_{0}}{σ}\right) = 1 - Φ\left(\frac{μ_{1} - μ_{0}}{2σ}\right) = Q\left(\frac{μ_{1} - μ_{0}}{2σ}\right)$

and

$P_{1}(Y_{0}) = P_{1}\{Y < τ^{'}\} = Φ\left(\frac{τ^{'} - μ_{1}}{σ}\right) = Φ\left(\frac{μ_{0} - μ_{1}}{2σ}\right) = Q\left(\frac{μ_{1} - μ_{0}}{2σ}\right)$

where Φ is the CDF of an $N$ (0, 1) random variable,

$Φ (x) = \int_{- \infty}^{x} \frac{1}{\sqrt{2 π}} e^{- t^{2} / 2} d t$

(11.13)

and Q is the complement of Φ, i.e., Q(x) = 1 − Φ(x) = Φ(− x), for x ∈ ℝ. Thus

$r (δ_{B}) = P_{e} (δ_{B}) = Q (\frac{μ_{1} - μ_{0}}{2 σ}) .$

□
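The quantities in Example 11.2.1 are easy to evaluate numerically. The sketch below uses the standard identity Q(x) = ½ erfc(x/√2) together with the threshold τ′ and the error probabilities derived above; uniform costs are assumed, and the function names and the test values of µ₀, µ₁, σ are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def observation_threshold(mu0, mu1, sigma, tau):
    """Threshold tau' on y that is equivalent to comparing L(y) with tau > 0."""
    return sigma**2 / (mu1 - mu0) * math.log(tau) + 0.5 * (mu1 + mu0)


def bayes_risk_uniform(mu0, mu1, sigma, prior0):
    """Bayes risk (error probability) of the LRT for uniform costs, prior0 in (0, 1)."""
    tau = prior0 / (1.0 - prior0)
    t = observation_threshold(mu0, mu1, sigma, tau)
    p_false_alarm = q_function((t - mu0) / sigma)  # P0{Y >= tau'}
    p_miss = 1.0 - q_function((t - mu1) / sigma)   # P1{Y < tau'}
    return prior0 * p_false_alarm + (1.0 - prior0) * p_miss


# Equal priors: tau' is the midpoint and the risk is Q((mu1 - mu0) / (2 sigma)).
print(observation_threshold(0.0, 2.0, 1.0, 1.0))  # 1.0
print(bayes_risk_uniform(0.0, 2.0, 1.0, 0.5), q_function(1.0))
```

For unequal priors the same functions apply; the threshold simply shifts toward the mean of the less likely hypothesis.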

Example 11.2.2. Discrete Observations. Consider the detection problem of Example 11.1.2 with uniform costs, equal priors and

$p_{0} (a) = p_{0} (b) = 0.5, p_{0} (c) = 0, p_{1} (a) = 0, p_{1} (b) = p_{1} (c) = 0.5.$

The likelihood ratio is given by

$L(y) = \begin{cases} 0 & \text{if } y = a \\ 1 & \text{if } y = b \\ \infty & \text{if } y = c \end{cases} .$

With uniform costs and equal priors, the threshold τ = 1. Therefore

$δ_{B}(y) = \begin{cases} 1 & \text{if } L(y) \geq 1 \\ 0 & \text{if } L(y) < 1 \end{cases} = \begin{cases} 1 & \text{if } y = b, c \\ 0 & \text{if } y = a \end{cases} .$

This rule is nothing but δ₄ of Example 11.1.2. Note that if we had chosen 0 when L(y) = τ, then we would have obtained δ₂, which is also a Bayes rule.

□

11.3 Binary Minimax Detection

Recall from Section 11.1.5 that the minimax decision rule δ_m minimizes the worst case risk:

$δ_{m} = \arg \min_{δ \in Δ} R_{\max} (δ) .$

where R_max(δ) = max{R₀(δ), R₁(δ)}.

11.3.1 Bayes Risk Line and Minimum Risk Curve

We find δ_m indirectly by using the solution to Bayesian detection problem as follows. Since the prior on the states is not specified in the minimax setting, we allow the prior π₀ (= 1 − π₁) to be a variable over which we can optimize. We begin with the following definitions.

Definition 11.3.1. Bayes Risk Line. For any δ ∈ Δ,

$r (π_{0}; δ) = π_{0} R_{0} (δ) + (1 - π_{0}) R_{1} (δ) .$

Figure 11.1: Bayes risk lines and minimum risk curve.

Definition 11.3.2. Bayes Minimum Risk Curve.

$V (π_{0}) = \min_{δ \in Δ} r (π_{0}; δ) = r (π_{0}; δ_{B, π_{0}}), π_{0} \in [0, 1]$

where $δ_{B, π_{0}}$ is a Bayes rule for prior π₀.

Bayes risk lines and the minimum risk curve are illustrated in Figure 11.1. The following result states some useful properties of V(π₀).

Lemma 11.3.1. V is a concave (continuous) function on [0, 1] with V(0) = C_1,1 and V(1) = C_0,0.

Proof: The minimum of concave functions is concave; therefore, the concavity of V follows from the fact that each of the risk lines r(π₀; δ) is linear (and hence concave) in π₀. As for the end point properties,

$V(0) = \min_{δ \in Δ} R_{1}(δ) = \min_{δ \in Δ} [C_{0,1} P_{1}(Y_{0}) + C_{1,1} P_{1}(Y_{1})] = C_{1,1}$

where the minimizing rule is δ*(y) = 1, for all y ∈ $Y$ . Similarly V(1) = C_0,0.

■

We can write V(π₀) in terms of the likelihood ratio L(y) and threshold τ as:

$\begin{aligned} V(π_{0}) = {} & π_{0} [C_{1,0} P_{0}\{L(Y) \geq τ\} + C_{0,0} P_{0}\{L(Y) < τ\}] \\ & + (1 - π_{0}) [C_{1,1} P_{1}\{L(Y) \geq τ\} + C_{0,1} P_{1}\{L(Y) < τ\}] . \end{aligned}$

If L(y) has no point masses² under P₀ or P₁, then V is differentiable in π₀ (since τ is differentiable in π₀).

Figure 11.2: Minimax (equalizer) rule when V is differentiable at $π_{0}^{(m)}$ .

11.3.2 Equalizer Rule

Let us first consider the case where V is indeed differentiable for all π₀. Then V(π₀) achieves its maximum value at either the end points π₀ = 0 or π₀ = 1 or within the interior π₀ ∈ (0, 1). If we assume uniform costs, then V(0) = V(1) = 0, and the maximum cannot be attained at the end points. Therefore, we further restrict our analysis to the case of uniform costs (the more general setting is considered in [2]).

Theorem 11.3.1. If C_0,0 = C_1,1 = 0 and V is differentiable on [0, 1], then

$δ_{m} = δ_{B, π_{0}^{m}}$

where $π_{0}^{m} = \arg \max_{π_{0}} V (π_{0})$ , obtained by solving dV (π₀)/dπ₀ = 0, i.e., δ_m is a Bayes rule for the worst case prior. Furthermore, δ_m is a Bayes equalizer rule, i.e., R₀(δ_m) = R₁(δ_m). Note that randomization cannot improve the minimax rule in this case.

Proof: The proof follows from Figure 11.2 using the following steps:

1. For any δ ∈ Δ, the risk line r(π₀; δ) cannot fall below V(π₀) at any π₀; it can at most touch V.

2. For fixed $π_{0}^{(1)}$ , the risk line $r (π_{0}; δ_{B, π_{0}^{(1)}})$ is tangent to V at $π_{0} = π_{0}^{(1)}$ .

3. Any rule with risk line that is not tangential to V cannot be minimax because one can always find a rule with risk line that has the same slope and is tangential to V with smaller R_max.

4. Among all Bayes rules, the one that has R₀ = R₁ is minimax.

Figure 11.3: Minimax rule when V is not differentiable at $π_{0}^{(m)}$ .

Since the tangent to V at any fixed prior π₀ is unique and corresponds to a deterministic Bayes rule for that prior, randomization cannot yield a better minimax rule.

■

If V is not differentiable for all π₀, then the arguments given in the proof of Theorem 11.3.1 can still be used as long as V is differentiable at its maximum, and the minimax rule is still the unique Bayes rule for the worst case prior. If V is not differentiable at its maximum, then we have the scenario depicted in Figure 11.3. Note that δ⁻ and δ⁺ are deterministic Bayes rules with the same Bayes risk $V (π_{0}^{m})$ , and since they are likelihood ratio tests with δ⁻ having a larger risk under P₀,

$δ^{-}(y) = \begin{cases} 1 & \text{if } L(y) \geq τ(π_{0}^{m}) \\ 0 & \text{if } L(y) < τ(π_{0}^{m}) \end{cases}, \quad δ^{+}(y) = \begin{cases} 1 & \text{if } L(y) > τ(π_{0}^{m}) \\ 0 & \text{if } L(y) \leq τ(π_{0}^{m}) \end{cases}$

where $τ (π_{0}^{m}) = π_{0}^{m} / (1 - π_{0}^{m})$ . For δ⁻ and δ⁺ to be different, L(Y) must have a point mass at $τ (π_{0}^{m})$ , i.e., $P_{j}\{L(Y) = τ(π_{0}^{m})\} \neq 0$ , for j = 0, 1. This also implies that V is not differentiable at $π_{0}^{m}$ . Also, if δ⁻ and δ⁺ are different, then neither of them can be an equalizer rule.

Finding the minimax rule within the set of deterministic rules Δ is challenging in this case, since step 3 in the proof of Theorem 11.3.1 does not hold, and it is possible for a rule whose risk line is not tangential to V to be minimax within Δ. We may need to resort to brute force enumeration to find minimax rules within Δ as we did in Example 11.1.2. Fortunately we can circumvent this problem by allowing for randomized decision rules.

It should be clear from Figure 11.3 that if an equalizer rule exists in Δ̃, which is tangential to V at $π_{0}^{m}$ , then it must be minimax within the class Δ̃. Now, consider

$\tilde{δ}_{B, π_{0}^{m}} = \begin{cases} δ^{-} & \text{with probability } q \\ δ^{+} & \text{with probability } 1 - q \end{cases}$

The conditional risks of this randomized decision rule are given by

$\begin{array}{l} R_{0} ({\tilde{δ}}_{B, π_{0}^{m}}) = q R_{0} (δ^{-}) + (1 - q) R_{0} (δ^{+}) \\ R_{1} ({\tilde{δ}}_{B, π_{0}^{m}}) = q R_{1} (δ^{-}) + (1 - q) R_{1} (δ^{+}) \end{array}$

Thus, setting

$q = \frac{R_{1} (δ^{+}) - R_{0} (δ^{+})}{(R_{1} (δ^{+}) - R_{0} (δ^{+})) + (R_{0} (δ^{-}) - R_{1} (δ^{-}))} ≜ q_{m}$

(11.14)

produces an equalizer rule.

Theorem 11.3.2. If C_0,0 = C_1,1 = 0 and V is not differentiable at its maximum, then the minimax solution within the set of randomized decision rules Δ̃ is given by the equalizer rule:

$\tilde{δ}_{m}(y) = \tilde{δ}_{B, π_{0}^{m}}(y) = \begin{cases} 1 & \text{if } L(y) > τ(π_{0}^{m}) \\ 1 \text{ w.p. } q_{m} & \text{if } L(y) = τ(π_{0}^{m}) \\ 0 & \text{if } L(y) < τ(π_{0}^{m}) \end{cases}$

where $π_{0}^{m} = \arg \max_{π_{0}} V (π_{0})$ and q_m is given in (11.14).

11.3.3 Examples

Example 11.3.1. Signal Detection in Gaussian Noise (continued). In this example we study the minimax solution to the detection problem described in Example 11.2.1. We assume uniform costs. We can compute the minimum Bayes risk curve as:

$\begin{aligned} V(π_{0}) &= π_{0} P_{0}\{Y \geq τ^{'}\} + (1 - π_{0}) P_{1}\{Y < τ^{'}\} \\ &= π_{0} Q\left(\frac{τ^{'} - μ_{0}}{σ}\right) + (1 - π_{0}) Φ\left(\frac{τ^{'} - μ_{1}}{σ}\right) \end{aligned}$

with

$τ^{'} = \frac{σ^{2}}{μ_{1} - μ_{0}} \log (\frac{π_{0}}{1 - π_{0}}) + \frac{μ_{1} + μ_{0}}{2} .$

Clearly V is a differentiable function, and therefore the deterministic equalizer rule is minimax. We can solve for the equalizer rule without explicitly maximizing V. In particular, if we denote the LRT with threshold τ′ (see (11.12)) by δ_τ′, then

$R_{0} (δ_{τ^{'}}) = Q (\frac{τ^{'} - μ_{0}}{σ}), R_{1} (δ_{τ^{'}}) = Φ (\frac{τ^{'} - μ_{1}}{σ}) = Q (\frac{μ_{1} - τ^{'}}{σ}) .$

Setting R₀(δ_τ′) = R₁(δ_τ′) yields

${τ^{'}}_{m} = \frac{μ_{1} + μ_{0}}{2}$

from which we can conclude that τ_m = 1 and $π_{0}^{m} = 0.5$ .

Thus the minimax decision rule is given by

$δ_{m}(y) = δ_{B, 0.5}(y) = \begin{cases} 1 & \text{if } y \geq \frac{μ_{1} + μ_{0}}{2} \\ 0 & \text{otherwise} \end{cases}$

and the minimax risk is given by

$r (δ_{m}) = V (0.5) = Q (\frac{μ_{1} - μ_{0}}{2 σ}) .$

□
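The equalizer condition R₀ = R₁ in Example 11.3.1 can also be solved numerically, which is useful as a sanity check on the closed-form midpoint threshold. The sketch below applies bisection to the gap R₀ − R₁, which is decreasing in τ′; it assumes uniform costs and µ₁ > µ₀, and the function names and numerical values are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def equalizer_threshold(mu0, mu1, sigma, tol=1e-12):
    """Find tau' in [mu0, mu1] with R0 = R1 by bisection (requires mu1 > mu0)."""

    def gap(t):
        r0 = q_function((t - mu0) / sigma)  # false-alarm probability
        r1 = q_function((mu1 - t) / sigma)  # miss probability
        return r0 - r1                      # positive at mu0, negative at mu1

    lo, hi = mu0, mu1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid  # R0 still exceeds R1: move the threshold up
        else:
            hi = mid
    return 0.5 * (lo + hi)


T_M = equalizer_threshold(-1.0, 3.0, 2.0)
MINIMAX_RISK = q_function((T_M - (-1.0)) / 2.0)
print(T_M, MINIMAX_RISK)
```

For these values the numerical threshold agrees with (µ₁ + µ₀)/2 = 1 and the risk with Q((µ₁ − µ₀)/(2σ)) = Q(1), up to the bisection tolerance.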

Example 11.3.2. Discrete Observations (continued). In this example, we study the minimax solution to the detection problem described in Example 11.2.2. Recall that L(a) = 0, L(b) = 1, and L(c) = ∞. Assuming uniform costs, Bayes rules for prior π₀ (randomized and deterministic) are given by:

$\tilde{δ}_{B, π_{0}}(y) = \begin{cases} 1 & \text{if } L(y) > τ(π_{0}) \\ 1 \text{ w.p. } q & \text{if } L(y) = τ(π_{0}) \\ 0 & \text{if } L(y) < τ(π_{0}) \end{cases}$

(11.15)

where τ(π₀) = π₀/(1 − π₀) and q ∈ [0, 1].

For π₀ ∈ (0, 0.5), τ(π₀) ∈ (0, 1), and thus all the Bayes rules in (11.15) collapse to the single deterministic rule:

$δ^{-}(y) = \begin{cases} 1 & \text{if } y = b, c \\ 0 & \text{if } y = a \end{cases} .$

Similarly, for π₀ ∈ (0.5, 1), τ(π₀) ∈ (1, ∞), and thus all the Bayes rules in (11.15) collapse to the single deterministic rule:

$δ^{+}(y) = \begin{cases} 1 & \text{if } y = c \\ 0 & \text{if } y = a, b \end{cases} .$

For π₀ = 0.5, the following set of randomized decision rules are all Bayes rules:

$\tilde{δ}_{B, 0.5}(y) = \begin{cases} 1 & \text{if } y = c \\ 1 \text{ w.p. } q & \text{if } y = b \\ 0 & \text{if } y = a \end{cases}$

Figure 11.4: Minimax rule for Example 11.3.2.

and these rules can be obtained by randomizing between δ⁺ and δ⁻. From the above discussion it is clear that the minimum Bayes risk curve V is as shown in Figure 11.4, with the worst case prior $π_{0}^{m} = 0.5$ . Furthermore, it is easy to check that R₁(δ⁻) = R₀(δ⁺) = 0, and R₀(δ⁻) = R₁(δ⁺) = 0.5. Therefore, from (11.14), q_m = 0.5, and the minimax decision rule is given by:

$\tilde{δ}_{m}(y) = \begin{cases} 1 & \text{if } y = c \\ 1 \text{ w.p. } 0.5 & \text{if } y = b \\ 0 & \text{if } y = a \end{cases}$

with minimax risk r(δ̃_m) = V(0.5) = 0.25.

It is interesting to note that δ₂ and δ₄ in Example 11.1.2 are the same as δ⁺ and δ⁻, respectively, and that randomizing between these rules with equal probability is indeed the minimax solution within Δ̃.

□
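The numbers in Example 11.3.2 can be reproduced with a short computation. The sketch below evaluates V(π₀) as the lower envelope of the risk lines of all deterministic rules (which suffices by Theorem 11.1.1), locates its maximum on a grid, and evaluates q_m from (11.14). The grid resolution and all function and variable names are assumptions made for illustration.

```python
from itertools import product

OBS = ("a", "b", "c")
P0 = {"a": 0.5, "b": 0.5, "c": 0.0}
P1 = {"a": 0.0, "b": 0.5, "c": 0.5}


def risks(rule):
    """Conditional risks (R0, R1) under uniform costs."""
    r0 = sum(P0[y] for y in OBS if rule[y] == 1)
    r1 = sum(P1[y] for y in OBS if rule[y] == 0)
    return r0, r1


ALL_RISKS = [risks(dict(zip(OBS, bits))) for bits in product((0, 1), repeat=3)]


def min_risk_curve(prior0):
    """V(pi0): lower envelope of the risk lines of all deterministic rules."""
    return min(prior0 * r0 + (1 - prior0) * r1 for r0, r1 in ALL_RISKS)


GRID = [k / 100 for k in range(101)]
WORST_PRIOR = max(GRID, key=min_risk_curve)

# The two deterministic Bayes rules at the worst-case prior.
R0_MINUS, R1_MINUS = risks({"a": 0, "b": 1, "c": 1})  # delta^-
R0_PLUS, R1_PLUS = risks({"a": 0, "b": 0, "c": 1})    # delta^+

# Equalizing probability q_m from (11.14).
Q_M = (R1_PLUS - R0_PLUS) / ((R1_PLUS - R0_PLUS) + (R0_MINUS - R1_MINUS))
MINIMAX_RISK = Q_M * R0_MINUS + (1 - Q_M) * R0_PLUS

print(WORST_PRIOR, Q_M, MINIMAX_RISK)  # 0.5 0.5 0.25
```

The grid search is only a convenience here; since V is piecewise linear, its maximum could equally be found from the intersection of the two active risk lines.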

11.4 Binary Neyman-Pearson Detection

For binary detection problems without a prior on the state, a commonly used alternative to the minimax formulation is the Neyman-Pearson formulation, which is based on trading off the following two types of error probabilities:

$\begin{aligned} \text{Probability of False Alarm} &≜ P_{F}(\tilde{δ}) = P_{0}\{\tilde{δ}(Y) = 1\} \\ \text{Probability of Miss} &≜ P_{M}(\tilde{δ}) = P_{1}\{\tilde{δ}(Y) = 0\} \end{aligned}$

(11.16)

The goal is to minimize P_M subject to the constraint P_F ≤ α, for α ∈ (0, 1).

Figure 11.5: Risk line and Bayesian minimum risk curve for uniform costs.

An alternative measure of performance that is commonly used in radar and surveillance applications is:

$\text{Probability of Detection} ≜ P_{D}(\tilde{δ}) = P_{1}\{\tilde{δ}(Y) = 1\} = 1 - P_{M}(\tilde{δ}) .$

P_D(δ̃) is also called the power of the decision rule δ̃. The Neyman-Pearson (N-P) problem is generally stated in terms of P_D and P_F as:

$\tilde{δ}_{NP} = \arg \max_{\tilde{δ} \in \tilde{Δ} : P_{F}(\tilde{δ}) \leq α} P_{D}(\tilde{δ}) \quad \text{for } α \in (0, 1) .$

(11.17)

Note that unlike the Bayesian and minimax optimization problems, which are formulated in terms of conditional risks, the N-P optimization problem is stated in terms of conditional error probabilities. In particular, we are implicitly assuming uniform costs, which means P_D(δ̃) = 1 ‒ R₁(δ̃) and P_F(δ̃) = R₀(δ̃), and the N-P optimization is to minimize R₁(δ̃) subject to R₀(δ̃) ≤ α.

11.4.1 Solution to the N-P Optimization Problem

To solve the N-P optimization problem, we once again resort to Bayesian risk lines and the minimum risk curve V(π₀) with uniform costs. As depicted in Figure 11.5, the risk line r(π₀; δ̃) for any rule δ̃ ∈ Δ̃ lies above the concave function V(π₀), and intersects the π₀ = 0 line at level P_M(δ̃) and the π₀ = 1 line at level P_F(δ̃). Among all decision rules with risk lines that intersect the π₀ = 1 line at a level less than or equal to α, we are interested in the one which has the smallest intersection with the π₀ = 0 line. As in the solution to the minimax problem, let us first consider the case where V is differentiable for all π₀. Then it is clear that the decision rule that solves the N-P problem has a risk line that is tangential to V and intersects the π₀ = 1 line at a level exactly equal to α. Such a rule is a deterministic Bayes rule (LRT) that compares the likelihood ratio L(y) to a threshold η that satisfies the P_F constraint.

Figure 11.6: N-P optimization when V is not differentiable for all π₀ ∈ [0, 1].

Theorem 11.4.1. If V is differentiable on [0, 1], then

$\tilde{δ}_{NP}(y) = δ_{η}(y) = \begin{cases} 1 & \text{if } L(y) \geq η \\ 0 & \text{otherwise} \end{cases}$

where η is chosen so that P₀{L(Y) ≥ η} = α.

Now consider the case where V is not differentiable, and we have the scenario depicted in Figure 11.6. The decision rule δ⁺ is the deterministic LRT that has the largest value of P_F satisfying the constraint P_F ≤ α, and the decision rule δ⁻ is the other deterministic LRT for the same prior. By randomizing between δ⁺ and δ⁻ we can produce a decision rule that has P_F = α, and is hence a solution to (11.17).

Theorem 11.4.2. If V is not differentiable for all π₀ ∈ [0, 1], then

$\tilde{δ}_{NP}(y) = \tilde{δ}_{η, γ}(y) = \begin{cases} 1 & \text{if } L(y) > η \\ 1 \text{ w.p. } γ & \text{if } L(y) = η \\ 0 & \text{if } L(y) < η \end{cases}$

where η and γ are chosen so that P₀{L(Y) > η} + γP₀{L(Y) = η} = α.

Figure 11.7: Complementary CDF of the likelihood ratio L(Y).

11.4.2 N-P Rule and Receiver Operating Characteristic

The procedure for finding the parameters η and γ of the Neyman-Pearson solution is illustrated in Figure 11.7, where we plot P₀{L(Y) > η} as a function of η. As seen in Figure 11.7, P₀{L(Y) > η} is a right-continuous function of η. Given the P_F constraint α, we first choose η(α) as:

$η(α) = \min \{η \geq 0 : P_{0}\{L(Y) > η\} \leq α\} .$

If P₀{L(Y) > η(α)} = α, then we do not need to randomize and we can set γ(α) = 0. If P₀{L(Y) > η(α)} < α, then we pick γ(α) so that

$α = P_{0}\{L(Y) > η(α)\} + γ(α) P_{0}\{L(Y) = η(α)\}$

which implies that

$γ(α) = \frac{α - P_{0}\{L(Y) > η(α)\}}{P_{0}\{L(Y) = η(α)\}}$

The probability of detection (power) of δ̃_NP for P_F level α can be computed as:

$P_{D}(\tilde{δ}_{NP}) = P_{1}\{L(Y) > η(α)\} + γ(α) P_{1}\{L(Y) = η(α)\} .$

A plot of P_D(δ̃_NP) versus P_F(δ̃_NP) = α is called the receiver operating characteristic (ROC) of the Neyman-Pearson decision rule (see Figure 11.8). Some properties of the ROC are discussed in Exercise 11. In particular, the ROC is a concave function that lies above the 45° line, i.e., P_D(δ̃_NP) ≥ P_F(δ̃_NP).

Figure 11.8: Receiver operating characteristic (ROC).

11.4.3 Examples

Example 11.4.1. Signal Detection in Gaussian Noise (continued). In this example we study the N-P solution to the detection problem described in Example 11.2.1. As in the Bayesian setting of this problem, we can simplify the form of the LRT by noting that

$L(y) > η \Leftrightarrow y > η^{'} = \frac{σ^{2}}{μ_{1} - μ_{0}} \log η + \frac{μ_{1} + μ_{0}}{2}$

Thus

$\tilde{δ}_{NP}(y) = \begin{cases} 1 & \text{if } y > η^{'} \\ 1 \text{ w.p. } γ & \text{if } y = η^{'} \\ 0 & \text{if } y < η^{'} \end{cases} .$

Randomization is not needed since P₀{Y = η′} = P₁{Y = η′} = 0 for all η′ ∈ ℝ, and therefore

$\tilde{δ}_{NP}(y) = δ_{η^{'}}(y) = \begin{cases} 1 & \text{if } y \geq η^{'} \\ 0 & \text{if } y < η^{'} \end{cases} .$

Now

$P_{F}(δ_{η^{'}}) = P_{0}\{Y \geq η^{'}\} = Q\left(\frac{η^{'} - μ_{0}}{σ}\right) .$

Therefore, we can meet a P_F constraint of α by setting

$η^{'} (α) = σ Q^{- 1} (α) + μ_{0} .$

Figure 11.9: ROC for Example 11.4.1.

The power of δ_η′ is given by:

$P_{D}(δ_{η^{'}}) = P_{1}\{Y \geq η^{'}\} = Q\left(\frac{η^{'}(α) - μ_{1}}{σ}\right) = Q(Q^{-1}(α) - ρ)$

where ρ = (μ₁ − μ₀)/σ is a measure of the signal-to-noise ratio (SNR). The ROC is plotted in Figure 11.9. As ρ increases, P_D increases for a given level of P_F.

□
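The ROC of Example 11.4.1 can be evaluated with the Python standard library. The sketch below uses `statistics.NormalDist` for Φ and its inverse, with Q⁻¹(α) = Φ⁻¹(1 − α); the function names and the sample values of α and ρ are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def q_function(x):
    """Q(x) = 1 - Phi(x)."""
    return 1.0 - STD_NORMAL.cdf(x)


def q_inverse(alpha):
    """Inverse of Q on (0, 1), using Q^{-1}(alpha) = Phi^{-1}(1 - alpha)."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)


def np_threshold(alpha, mu0, sigma):
    """Observation threshold eta'(alpha) that meets P_F = alpha."""
    return sigma * q_inverse(alpha) + mu0


def roc_point(alpha, rho):
    """Power P_D of the N-P rule at false-alarm level alpha and SNR parameter rho."""
    return q_function(q_inverse(alpha) - rho)


for rho in (0.0, 1.0, 2.0):
    curve = [(a, round(roc_point(a, rho), 4)) for a in (0.01, 0.1, 0.5)]
    print(f"rho={rho}: {curve}")
```

With ρ = 0 the two hypotheses are indistinguishable and the ROC collapses to the 45° line P_D = α; larger ρ pushes the curve toward the upper-left corner.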

Example 11.4.2. Discrete Observations (continued). In this example we study the N-P solution to the detection problem described in Example 11.2.2. Since L(a) = 0, L(b) = 1, and L(c) = ∞, we have that

$P_{0}\{L(Y) > η\} = \begin{cases} 0.5 & \text{if } η \in [0, 1) \\ 0 & \text{if } η \in [1, \infty) \end{cases} .$

Thus, for α ∈ (0, 0.5), η(α) = 1 and $γ (α) = \frac{α - 0}{0.5} = 2 α$ , which yields

${\tilde{δ}}_{NP}(y) = \begin{cases} 1 & \text{if } y = c \\ 1 \text{ w.p. } 2α & \text{if } y = b \\ 0 & \text{if } y = a \end{cases}$

and

$P_{D} ({\tilde{δ}}_{NP}) = p_{1} (c) + 2 α p_{1} (b) = 0.5 + α .$

Figure 11.10: ROC for Example 11.4.2.

For α ∈ [0.5, 1), η(α) = 0 and $γ (α) = \frac{α - 0.5}{0.5} = 2 α - 1$ , which yields

${\tilde{δ}}_{NP}(y) = \begin{cases} 1 & \text{if } y = c, b \\ 1 \text{ w.p. } 2α - 1 & \text{if } y = a \end{cases}$

and P_D(δ̃_NP) = 1. The ROC is plotted in Figure 11.10.
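The construction of η(α), γ(α), and P_D for a finite observation alphabet can be sketched as follows. The PMFs used here, p₀ = (0.5, 0.5, 0) and p₁ = (0, 0.5, 0.5) on {a, b, c}, are an assumption inferred from the likelihood ratios and probabilities quoted in this example, since Example 11.2.2 itself is not reproduced here. The helper names are hypothetical.

```python
import math


def likelihood_ratio(p0, p1, y):
    """L(y) = p1(y) / p0(y), with L(y) = inf when p0(y) = 0 < p1(y)."""
    if p0[y] == 0:
        # A point with p0 = p1 = 0 never occurs; its ratio is set to 0 by convention.
        return math.inf if p1[y] > 0 else 0.0
    return p1[y] / p0[y]


def np_test(p0, p1, alpha):
    """Return (eta, gamma, P_D) of the randomized N-P test for 0 < alpha < 1."""
    lr = {y: likelihood_ratio(p0, p1, y) for y in p0}
    # eta(alpha) is the smallest threshold with P0{L(Y) > eta} <= alpha;
    # it is enough to search over the distinct likelihood-ratio values.
    for eta in sorted(set(lr.values())):
        tail0 = sum(p0[y] for y in p0 if lr[y] > eta)
        if tail0 <= alpha:
            mass0 = sum(p0[y] for y in p0 if lr[y] == eta)
            gamma = (alpha - tail0) / mass0 if mass0 > 0 else 0.0
            p_d = sum(p1[y] for y in p1 if lr[y] > eta)
            p_d += gamma * sum(p1[y] for y in p1 if lr[y] == eta)
            return eta, gamma, p_d
    raise ValueError("alpha must lie in (0, 1)")


# Assumed PMFs, consistent with L(a) = 0, L(b) = 1, L(c) = inf.
P0 = {"a": 0.5, "b": 0.5, "c": 0.0}
P1 = {"a": 0.0, "b": 0.5, "c": 0.5}

if __name__ == "__main__":
    for alpha in (0.25, 0.75):
        print(alpha, np_test(P0, P1, alpha))
```

Under these assumed PMFs, `np_test(P0, P1, 0.25)` gives η = 1, γ = 0.5, and P_D = 0.75, which matches η(α) = 1, γ(α) = 2α, and P_D = 0.5 + α above.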

□

11.5 Bayesian Composite Detection

So far we have assumed that conditional densities p₀ and p₁ are specified completely. Under this assumption, we saw that all three formulations of the binary detection problem (Bayes, minimax, Neyman-Pearson) led to the same solution structure, LRT, which is a comparison of the likelihood ratio L(y) to an appropriately chosen threshold. We now study the situation where p₀ and p₁ are not specified explicitly, but we are told that they come from a parametrized family of densities {p_θ, θ ∈ Λ}, with Λ being a discrete set or a subset of a Euclidean space. The hypothesis H_j corresponds to θ ∈ Λ_j, j = 0, 1, and Λ₀ ∪ Λ₁ = Λ, Λ₀ ∩ Λ₁ = ∅.

We can consider composite binary detection (hypothesis testing) as a statistical decision theory problem where the set of states $S$ = Λ is nonbinary, but the set of decisions $D$ = {0, 1} is still binary, and the cost function relating the decisions and states is of the form:

$C(i, θ) = C_{i, j} \quad \text{for all } θ \in Λ_{j}, \quad i, j = 0, 1.$

(11.18)

In this section we consider a Bayesian formulation of the problem, where we assume that the state θ is a realization of a random variable Θ with prior PDF (PMF) given by π(θ). From (11.8), we immediately have that the Bayes rule for composite hypothesis testing is given by:

$δ_{B}(y) = \arg \min_{i \in D} C(i | y)$

where, using the notation introduced in (11.2),

$C(i | y) = \int_{θ \in Λ} C(i, θ) p(θ | y) μ(d θ), \quad \text{with } p(θ | y) = \frac{p_{θ}(y) π(θ)}{p(y)} .$

Using (11.18), we can expand C(i∣y) as:

$C (i | y) = C_{i, 0} \int_{θ \in Λ_{0}} p (θ | y) μ (d θ) + C_{i, 1} \int_{θ \in Λ_{1}} p (θ | y) μ (d θ)$

from which we can easily conclude that:

$C (1 | y) \leq C (0 | y) \Leftrightarrow \frac{\int_{θ \in Λ_{1}} p_{θ} (y) π (θ) μ (d θ)}{\int_{θ \in Λ_{0}} p_{θ} (y) π (θ) μ (d θ)} \geq \frac{C_{1, 0} - C_{0, 0}}{C_{0, 1} - C_{1, 1}} .$

(11.19)

Now, if we define the priors on the hypotheses as

$π_{j} ≜ \int_{θ \in Λ_{j}} π (θ) μ (d θ), j = 0, 1.$

(11.20)

and the conditional densities for the hypotheses as

$p (y | Λ_{j}) ≜ \frac{1}{π_{j}} \int_{θ \in Λ_{j}} p_{θ} (y) π (θ) μ (d θ)$

then we can see that

$C (1 | y) \leq C (0 | y) \Leftrightarrow L (y) \geq τ$

with τ as defined in (11.11) and L(y) = p(y∣Λ₁)/p(y∣Λ₀).
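To spell out this step: by the definitions of π_j and p(y∣Λ_j), each integral in (11.19) equals π_j p(y∣Λ_j). Assuming that (11.11), which lies outside this section, defines τ as the usual Bayes threshold π₀(C_{1,0} − C_{0,0})/(π₁(C_{0,1} − C_{1,1})), condition (11.19) becomes

$\frac{π_{1} p(y | Λ_{1})}{π_{0} p(y | Λ_{0})} \geq \frac{C_{1, 0} - C_{0, 0}}{C_{0, 1} - C_{1, 1}} \Leftrightarrow L(y) \geq \frac{π_{0} (C_{1, 0} - C_{0, 0})}{π_{1} (C_{0, 1} - C_{1, 1})} = τ .$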

Therefore, we can conclude that the Bayes rule for composite detection is simply an LRT for the (simple) binary detection problem:

$H_{0} : Y \sim p(y | Λ_{0}) \quad \text{versus} \quad H_{1} : Y \sim p(y | Λ_{1})$

with priors π₀ and π₁ as defined in (11.20).

Example 11.5.1. Consider the composite detection problem in which Λ = [0, ∞), Λ₀ = [0, 1), and Λ₁ = [1, ∞), with uniform costs, and

$p_{θ}(y) = θ e^{- θ y} I_{\{y \geq 0\}}, \quad π(θ) = e^{- θ} I_{\{θ \geq 0\}}$

where $I$ is the indicator function. To compute the Bayes rule for this problem, we first compute

$\int_{θ \in Λ_{1}} p_{θ} (y) π (θ) μ (d θ) = \int_{1}^{\infty} θ e^{- θ (y + 1)} d θ = \frac{(y + 2) e^{- (y + 1)}}{{(y + 1)}^{2}}$

and

$\int_{θ \in Λ_{0}} p_{θ} (y) π (θ) μ (d θ) = \int_{0}^{1} θ e^{- θ (y + 1)} d θ = \frac{1 - (y + 2) e^{- (y + 1)}}{{(y + 1)}^{2}} .$

Then, from (11.19), we get that

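$δ_{B}(y) = \begin{cases} 1 & \text{if } (y + 2) \geq 0.5 e^{(y + 1)} \\ 0 & \text{otherwise} \end{cases}$

which can be simplified to

$δ_{B}(y) = \begin{cases} 1 & \text{if } 0 \leq y \leq τ^{'} \\ 0 & \text{if } y > τ^{'} \end{cases}$

where τ′ is the positive solution to the transcendental equation $y + 2 = 0.5 e^{y + 1}$. The solution is unique on y ≥ 0 because the left side grows linearly while the right side grows exponentially, and the left side is the larger of the two at y = 0.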
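Since τ′ has no closed form, it has to be found numerically. The sketch below uses plain bisection on g(y) = (y + 2) − 0.5e^{y+1}; the bracketing interval [0, 2] and the function names are choices made here for illustration, not part of the original example.

```python
import math


def g(y):
    """g(y) = (y + 2) - 0.5 * exp(y + 1); the Bayes rule decides 1 when g(y) >= 0."""
    return (y + 2.0) - 0.5 * math.exp(y + 1.0)


def solve_tau(lo=0.0, hi=2.0, tol=1e-12):
    """Bisection for the root of g on [lo, hi], assuming g(lo) > 0 > g(hi)."""
    if not (g(lo) > 0.0 > g(hi)):
        raise ValueError("g must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)


def bayes_rule(y, tau):
    """Decide H1 (return 1) when 0 <= y <= tau, otherwise H0 (return 0)."""
    return 1 if 0.0 <= y <= tau else 0


if __name__ == "__main__":
    tau = solve_tau()
    print(tau, bayes_rule(0.5, tau), bayes_rule(1.0, tau))
```

The root comes out at about 0.68, so small observations favor H₁ (θ ≥ 1, a faster-decaying exponential) and larger ones favor H₀.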

□

11.6 Neyman-Pearson Composite Detection

We now consider the more interesting setting for the composite detection problem where there is no prior on the state. A common way to pose the optimization problem in this setting is a generalization of the Neyman-Pearson formulation (see (11.16)). We define the probabilities of false alarm and detection of a test δ̃ ∈ Δ̃ by:

$\begin{array}{l} P_{F}(\tilde{δ}; θ) = P_{θ}\{\tilde{δ}(Y) = 1\}, \quad θ \in Λ_{0} \\ P_{D}(\tilde{δ}; θ) = P_{θ}\{\tilde{δ}(Y) = 1\}, \quad θ \in Λ_{1} \end{array}$

The goal in uniformly most powerful (UMP) detection is to constrain P_F(δ̃; θ) ≤ α for all θ ∈ Λ₀ and, simultaneously, to maximize P_D(δ̃; θ) for every θ ∈ Λ₁. If such a test exists, it is called a UMP test.

11.6.1 UMP Detection with One Composite Hypothesis

We begin by studying the special case where only H₁ is composite, i.e., Λ₀ is the singleton set equal to {θ₀}. The UMP optimization problem can be stated as:

$\text{Maximize } P_{D}(\tilde{δ}; θ) \text{ for all } θ \in Λ_{1}, \quad \text{subject to } P_{F}(\tilde{δ}; θ_{0}) \leq α .$

For fixed θ₁ ∈ Λ₁, we can compute the likelihood ratio as

$L_{θ_{1}} (y) = \frac{p_{θ_{1}} (y)}{p_{θ_{0}} (y)}$

and the corresponding Neyman-Pearson test is given by (see Theorem 11.4.2)

${\tilde{δ}}_{NP}(y; θ_{1}) = \begin{cases} 1 & \text{if } L_{θ_{1}}(y) > η_{α}(θ_{1}) \\ 1 \text{ w.p. } γ_{α}(θ_{1}) & \text{if } L_{θ_{1}}(y) = η_{α}(θ_{1}) \\ 0 & \text{if } L_{θ_{1}}(y) < η_{α}(θ_{1}) \end{cases}$

with η_α(θ₁) and γ_α(θ₁) satisfying

$P_{θ_{0}}\{L_{θ_{1}}(Y) > η_{α}(θ_{1})\} + γ_{α}(θ_{1}) P_{θ_{0}}\{L_{θ_{1}}(Y) = η_{α}(θ_{1})\} = α .$

Now, if it turns out that δ̃_NP(y; θ₁) is independent of θ₁, then that test is UMP since it is the N-P solution for all θ₁ ∈ Λ₁. Otherwise, no UMP solution exists. In the following, we provide some illustrative examples.

Example 11.6.1. Detection of One-Sided Composite Signal in Gaussian Noise. This detection problem arises in communications and radar applications where the signal amplitude is unknown but the phase is known. The two hypotheses are described by:

$H_{0} : Y = Z versus H_{1} : Y = θ + Z$

where θ > 0 is an unknown parameter (signal amplitude), and Z ~ $N$ (0, σ²). This is a composite detection problem with θ₀ = 0, and Λ₁ = (0, ∞).

For fixed θ > 0, L_θ(y) = p_θ(y)/p₀(y) has no point masses under P₀ or P_θ, and therefore δ̃_NP(y; θ) is a deterministic LRT:

$δ_{NP}(y; θ) = \begin{cases} 1 & \text{if } L_{θ}(y) \geq η(θ) \\ 0 & \text{if } L_{θ}(y) < η(θ) \end{cases} = \begin{cases} 1 & \text{if } y \geq η^{'}(θ) \\ 0 & \text{if } y < η^{'}(θ) \end{cases}$

where (see Example 11.4.1) η′(θ) is given by

$η^{'} (θ) = \frac{σ^{2} \log η (θ)}{θ} + \frac{θ}{2} .$

For an α-level test we need to find η′_α(θ) such that P₀{Y ≥ η′_α(θ)} = α. Exploiting the fact that Y ~ $N$ (0, σ²) under P₀, we get

$Q (\frac{{η^{'}}_{α} (θ)}{σ}) = α \Rightarrow {η^{'}}_{α} (θ) = σ Q^{- 1} (α) .$

Note that η′_α(θ) is independent of θ, and therefore the UMP solution is given by:

$δ_{UMP}(y) = \begin{cases} 1 & \text{if } y \geq σ Q^{-1}(α) \\ 0 & \text{if } y < σ Q^{-1}(α) \end{cases} .$

Note that while the test δ_UMP is independent of θ, its performance in terms of P_D depends strongly on θ. In particular,

$P_{D}(δ_{UMP}; θ) = P_{θ}\{Y \geq σ Q^{-1}(α)\} = Q\left(Q^{-1}(α) - θ / σ\right) .$

□

Example 11.6.2. Detection of Two-Sided Composite Signal in Gaussian Noise. This detection problem arises in communications and radar applications where the signal amplitude and phase are both unknown. The two hypotheses are as described in Example 11.6.1, except that θ ∈ ℝ, i.e., θ can be both positive and negative. There is no UMP test for this problem. This can be seen as follows.

First consider θ = 1. Then following the same steps as in Example 11.6.1, we can show that the α-level N-P test is given by:

$δ_{NP}(y; 1) = \begin{cases} 1 & \text{if } y \geq σ Q^{-1}(α) \\ 0 & \text{if } y < σ Q^{-1}(α) \end{cases} .$

Now, consider θ = −1. Then, it is not difficult to see that L₋₁(y) ≥ η iff y ≤ η′ in this case. Therefore the α-level N-P test is given by:

$δ_{NP}(y; -1) = \begin{cases} 1 & \text{if } y \leq σ Φ^{-1}(α) \\ 0 & \text{if } y > σ Φ^{-1}(α) \end{cases} .$

Since the most powerful tests for θ = −1 and θ = 1 are not the same, there is no uniformly most powerful test.
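A quick numerical check makes the conflict concrete. The sketch below evaluates the power of the right-tailed test (optimal for θ = 1) and of the left-tailed test (optimal for θ = −1) at both parameter values. The function names and the choices σ = 1, α = 0.1 are assumptions for illustration only.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def power_right(theta, alpha, sigma=1.0):
    """P_theta{Y >= sigma * Q^{-1}(alpha)}: power of the test that is N-P for theta > 0."""
    threshold = sigma * _STD.inv_cdf(1.0 - alpha)  # sigma * Q^{-1}(alpha)
    return 1.0 - NormalDist(theta, sigma).cdf(threshold)


def power_left(theta, alpha, sigma=1.0):
    """P_theta{Y <= sigma * Phi^{-1}(alpha)}: power of the test that is N-P for theta < 0."""
    threshold = sigma * _STD.inv_cdf(alpha)  # sigma * Phi^{-1}(alpha)
    return NormalDist(theta, sigma).cdf(threshold)


if __name__ == "__main__":
    alpha = 0.1  # hypothetical false-alarm level
    for theta in (1.0, -1.0):
        print(theta, power_right(theta, alpha), power_left(theta, alpha))
```

Both tests have false-alarm probability α at θ = 0, but each one has power below α on the side of the parameter space it was not designed for, so neither can be uniformly most powerful over θ ∈ ℝ.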

Example 11.6.3. Detection of One-Sided Composite Signal in Cauchy Noise. From Examples 11.6.1 and 11.6.2, we may be tempted to conclude that for problems involving signal detection in noise, UMP tests exist as long as H₁ is one-sided. To see that this is not true in general, we consider the example where the noise has a Cauchy distribution, i.e.,

$p_{θ} (y) = \frac{1}{π [1 + {(y - θ)}^{2}]}$

and we are testing H₀ : θ = 0 against the one-sided composite hypothesis H₁ : θ > 0. Then

$L_{θ} (y) = \frac{1 + y^{2}}{1 + {(y - θ)}^{2}} .$

It is easy to check that the α-level N-P tests for θ = 1 and θ = 2 are different, and hence there is no UMP solution.
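The check can be carried out numerically. For η > 1, the set {y : L_θ(y) > η} is the interval between the roots of (η − 1)y² − 2ηθy + (η − 1 + ηθ²) = 0, and its P₀-probability follows from the standard Cauchy CDF, 1/2 + arctan(y)/π. The sketch below bisects on η to obtain an α-level rejection interval. It assumes α is small enough that η(α) > 1 (for θ = 1 and θ = 2 this holds for α < 0.25), and the function names and the value α = 0.1 are illustrative.

```python
import math


def cauchy_cdf(y):
    """CDF of the standard Cauchy distribution (the distribution of Y under H0)."""
    return 0.5 + math.atan(y) / math.pi


def rejection_interval(theta, eta):
    """Endpoints of {y : L_theta(y) > eta} for eta > 1 (an interval between two roots)."""
    disc = eta * theta ** 2 - (eta - 1.0) ** 2
    root = math.sqrt(max(disc, 0.0))  # guard against tiny negative rounding error
    return (eta * theta - root) / (eta - 1.0), (eta * theta + root) / (eta - 1.0)


def false_alarm(theta, eta):
    """P0{L_theta(Y) > eta} for eta > 1."""
    lo, hi = rejection_interval(theta, eta)
    return cauchy_cdf(hi) - cauchy_cdf(lo)


def np_interval(theta, alpha, iters=200):
    """Rejection interval of the alpha-level N-P test, assuming eta(alpha) > 1."""
    eta_lo = 1.0 + 1e-9
    # Largest value of L_theta, where the two roots of the quadratic coincide.
    eta_hi = 0.5 * ((2.0 + theta ** 2) + math.sqrt((2.0 + theta ** 2) ** 2 - 4.0))
    for _ in range(iters):
        mid = 0.5 * (eta_lo + eta_hi)
        if false_alarm(theta, mid) > alpha:
            eta_lo = mid  # region too large: raise the threshold
        else:
            eta_hi = mid
    return rejection_interval(theta, 0.5 * (eta_lo + eta_hi))


if __name__ == "__main__":
    alpha = 0.1  # hypothetical level
    print(np_interval(1.0, alpha))
    print(np_interval(2.0, alpha))
```

The two printed intervals both have P₀-probability α but different endpoints, which is the sense in which the N-P tests for θ = 1 and θ = 2 differ. Unlike the Gaussian case, the rejection region is a bounded interval whose location moves with θ.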

□
