Advanced Topics in Estimation Theory


In the previous chapters, we discussed various classes of estimators, which attain certain optimality criteria, like minimum variance unbiased estimators (MVUE), asymptotic optimality of maximum likelihood estimators (MLEs), minimum mean–squared–error (MSE) equivariant estimators, Bayesian estimators, etc. In this chapter, we present additional criteria of optimality derived from the general statistical decision theory. We start with the game theoretic criterion of minimaxity and present some results on minimax estimators. We then proceed to discuss minimum risk equivariant and standard estimators. We discuss the notion of admissibility and present some results of Stein on the inadmissibility of some classical estimators. These examples lead to the so–called Stein–type and Shrinkage estimators.


Given a class $\mathcal{D}$ of estimators, the risk function associated with each $d \in \mathcal{D}$ is $R(d, \theta)$, $\theta \in \Theta$. The maximal risk associated with $d$ is $R^*(d) = \sup_{\theta \in \Theta} R(d, \theta)$. If in $\mathcal{D}$ there is an estimator $d^*$ that minimizes $R^*(d)$ then $d^*$ is called a minimax estimator. That is,

$$R^*(d^*) = \inf_{d \in \mathcal{D}} \sup_{\theta \in \Theta} R(d, \theta).$$

A minimax estimator may not exist in $\mathcal{D}$. We start with some simple results.

Lemma 9.1.1. Let $\mathcal{F} = \{F(x;\theta), \theta \in \Theta\}$ be a family of distribution functions and $\mathcal{D}$ a class of estimators of $\theta$. Suppose that $d^* \in \mathcal{D}$, that $d^*$ is a Bayes estimator relative to some prior distribution $H^*(\theta)$, and that the risk function $R(d^*, \theta)$ does not depend on $\theta$. Then $d^*$ is a minimax estimator.

Proof. Since R(d*, θ) = ρ* for all θ in Θ, and d* is Bayes against H*(θ) we have

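$$\rho^* = \int R(d^*, \theta)\, dH^*(\theta) = \inf_{d \in \mathcal{D}} \int R(d, \theta)\, dH^*(\theta) \le \inf_{d \in \mathcal{D}} \sup_{\theta \in \Theta} R(d, \theta). \tag{9.1.1}$$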

On the other hand, since ρ* = R(d*, θ) for all θ

$$\rho^* = \sup_{\theta \in \Theta} R(d^*, \theta) \ge \inf_{d \in \mathcal{D}} \sup_{\theta \in \Theta} R(d, \theta). \tag{9.1.2}$$

From (9.1.1) and (9.1.2), we obtain that

$$\sup_{\theta \in \Theta} R(d^*, \theta) = \inf_{d \in \mathcal{D}} \sup_{\theta \in \Theta} R(d, \theta). \tag{9.1.3}$$

This means that d* is minimax.        QED
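A small numerical illustration of Lemma 9.1.1 may help. The sketch below assumes a binomial model X ∼ B(n, θ) with squared-error loss and the Beta(√n/2, √n/2) prior; these choices are illustrative assumptions, not part of the lemma. The Bayes estimator for that prior is (X + √n/2)/(n + √n), and the helper `risk` (a hypothetical name) evaluates the exact MSE of any linear estimator aX + b.

```python
import math

def risk(n, theta, a, b):
    """Exact MSE of the linear estimator d(X) = a*X + b when X ~ Binomial(n, theta)."""
    mean = a * n * theta + b
    var = a * a * n * theta * (1.0 - theta)
    return var + (mean - theta) ** 2

n = 25
root = math.sqrt(n)

# Bayes estimator for the Beta(sqrt(n)/2, sqrt(n)/2) prior: (X + sqrt(n)/2) / (n + sqrt(n))
a_star = 1.0 / (n + root)
b_star = (root / 2.0) / (n + root)

grid = [i / 100.0 for i in range(101)]
bayes_risks = [risk(n, t, a_star, b_star) for t in grid]
mle_risks = [risk(n, t, 1.0 / n, 0.0) for t in grid]

# Algebraically the Bayes risk function equals 1 / (4 (1 + sqrt(n))^2) for every theta
constant_risk = 1.0 / (4.0 * (1.0 + root) ** 2)

print("Bayes estimator: min/max risk over grid:", min(bayes_risks), max(bayes_risks))
print("closed-form constant risk:", constant_risk)
print("maximal risk of the MLE X/n:", max(mle_risks))
```

Under these assumptions the Bayes estimator has a flat risk function, so the lemma gives its minimaxity; the MLE X/n has a larger maximal risk, 1/(4n), although its risk is smaller near θ = 0 and θ = 1.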

Lemma 9.1.1 can be generalized by proving that if there exists a sequence of Bayes estimators with prior risks converging to ρ*, where ρ* is a constant risk of d*, then d* is minimax. We obtain this result as a corollary of the following lemma.

Lemma 9.1.2. Let $\{H_k; k \ge 1\}$ be a sequence of prior distributions on $\Theta$ and let $\{\hat\theta_k; k \ge 1\}$ be the corresponding sequence of Bayes estimators with prior risks $\rho(\hat\theta_k, H_k)$. If there exists an estimator $d^*$ for which

$$\sup_{\theta \in \Theta} R(d^*, \theta) \le \limsup_{k \to \infty} \rho(\hat\theta_k, H_k) \tag{9.1.4}$$

then d* is minimax.

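Proof. If $d^*$ is not a minimax estimator, there exists an estimator $\tilde\theta$ such that

$$\sup_{\theta \in \Theta} R(\tilde\theta, \theta) < \sup_{\theta \in \Theta} R(d^*, \theta). \tag{9.1.5}$$

Moreover, for each $k \ge 1$, since $\hat\theta_k$ is Bayes,

$$\rho(\hat\theta_k, H_k) = \int R(\hat\theta_k, \theta)\, dH_k(\theta) \le \int R(\tilde\theta, \theta)\, dH_k(\theta) \le \sup_{\theta \in \Theta} R(\tilde\theta, \theta). \tag{9.1.6}$$

But (9.1.5) in conjunction with (9.1.6) gives $\limsup_{k \to \infty} \rho(\hat\theta_k, H_k) < \sup_{\theta} R(d^*, \theta)$, which contradicts (9.1.4). Hence, $d^*$ is minimax.        QED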
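As a worked illustration of Lemma 9.1.2, consider the sample mean of n i.i.d. N(θ, 1) observations under squared-error loss. The priors N(0, k) used below are an illustrative assumption chosen for convenience; any sequence of priors whose Bayes risks approach 1/n would serve.

```latex
% Sketch: minimaxity of the sample mean via Lemma 9.1.2 (assumed priors H_k = N(0,k)).
\begin{align*}
  \bar X \mid \theta &\sim N(\theta, 1/n), \qquad \theta \sim H_k = N(0, k), \\
  \hat\theta_k &= E\{\theta \mid \bar X\} = \frac{nk}{1 + nk}\,\bar X, \\
  \rho(\hat\theta_k, H_k) &= \operatorname{Var}\{\theta \mid \bar X\}
      = \frac{k}{1 + nk} \;\longrightarrow\; \frac{1}{n} \quad (k \to \infty), \\
  \sup_{\theta} R(\bar X, \theta) &= \frac{1}{n}
      = \lim_{k \to \infty} \rho(\hat\theta_k, H_k).
\end{align*}
```

The risk of the sample mean is constant and equal to the limit of the prior risks, so the condition of the lemma holds and the sample mean is minimax in this setting.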


In Section 5.7.1, we discussed the structure of models that admit equivariant estimators with respect to certain groups of transformations. In this section, we return to this subject and investigate minimum risk, Bayes and minimax equivariant estimators. The statistical model under consideration is specified by a sample space $\mathcal{X}$ and a family of distribution functions $\mathcal{F} = \{F(x;\theta); \theta \in \Theta\}$. Let $\mathcal{G}$ be a group of transformations that preserves the structure of the model, i.e., $g\mathcal{X} = \mathcal{X}$ for all $g \in \mathcal{G}$, and the induced group $\bar{\mathcal{G}}$ of transformations on $\Theta$ has the property that $\bar g\Theta = \Theta$ for all $\bar g \in \bar{\mathcal{G}}$. An equivariant estimator $\hat\theta(X)$ of $\theta$ was defined as one which satisfies the structural property that $\hat\theta(gX) = \bar g\hat\theta(X)$ for all $g \in \mathcal{G}$.

In cases of various orbits of $\mathcal{G}$ in $\Theta$, we may index the orbits by a parameter, say $\omega(\theta)$. The risk function of an equivariant estimator $\hat\theta(X)$ is then $R(\hat\theta, \omega(\theta))$. Bayes equivariant estimators can be considered. These are equivariant estimators that minimize the prior risk associated with $\hat\theta$, relative to a prior distribution $H(\theta)$. We assume that $\omega(\theta)$ is a function of $\theta$ for which the following prior risk exists, namely,

$$\rho(\hat\theta, H) = \int_\Theta R(\hat\theta, \omega(\theta))\, dH(\theta) = \int R(\hat\theta, \omega)\, dK(\omega), \tag{9.2.1}$$

where $K(\omega)$ is the prior distribution of $\omega(\theta)$, induced by $H(\theta)$. Let $U(X)$ be a maximal invariant statistic with respect to $\mathcal{G}$. Its distribution depends on $\theta$ only through $\omega(\theta)$. Suppose that $g(u;\omega)$ is the probability density function (p.d.f.) of $U(X)$ under $\omega$. Let $k(\omega \mid U)$ be the posterior p.d.f. of $\omega$ given $U(X)$. The prior risk of $\hat\theta$ can then be written as

$$\rho(\hat\theta, H) = E_U\{E_{\omega \mid U}\{R(\hat\theta, \omega)\}\}, \tag{9.2.2}$$

where $E_{\omega \mid U}\{R(\hat\theta, \omega)\}$ is the posterior risk of $\hat\theta$, given $U(X)$. An equivariant estimator $\hat\theta_K$ is Bayes against $K(\omega)$ if it minimizes $E_{\omega \mid U}\{R(\hat\theta, \omega)\}$.

As discussed earlier, the Bayes equivariant estimators are relevant only if there are different orbits of $\mathcal{G}$ in $\Theta$. Another approach to the estimation problem, if there are no minimum risk equivariant estimators, is to derive formally the Bayes estimators with respect to invariant prior measures (like the Jeffreys improper priors). Such an approach to the above problem of estimating variance components was employed by Tiao and Tan (1965) and by Portnoy (1971). We now discuss formal Bayes estimators more carefully.

9.2.1 Formal Bayes Estimators for Invariant Priors

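Formal Bayes estimators with respect to invariant priors are estimators that minimize the expected risk, when the prior distribution used is improper. In this section, we are concerned with invariant prior measures, such as the Jeffreys noninformative prior $h(\theta)\,d\theta \propto |I(\theta)|^{1/2}\,d\theta$. With such improper priors, the minimum risk estimators can often be formally derived as in Section 5.7. The resulting estimators are called formal Bayes estimators. For example, if $\mathcal{F} = \{F(x;\theta); -\infty < \theta < \infty\}$ is a family of location parameter distributions (of the translation type), i.e., the p.d.f.s are $f(x;\theta) = \phi(x - \theta)$, then, for the group $\mathcal{G}$ of real translations, the Jeffreys invariant prior is $h(\theta)\,d\theta \propto d\theta$. If the loss function is $L(\hat\theta, \theta) = (\hat\theta - \theta)^2$, the formal Bayes estimator is

$$\hat\theta(X_1, \ldots, X_n) = \frac{\int_{-\infty}^{\infty} \theta \prod_{i=1}^n \phi(X_i - \theta)\, d\theta}{\int_{-\infty}^{\infty} \prod_{i=1}^n \phi(X_i - \theta)\, d\theta}. \tag{9.2.3}$$

Making the transformation $Y = X_{(1)} - \theta$, where $X_{(1)} \le \cdots \le X_{(n)}$, we obtain

$$\hat\theta(X_1, \ldots, X_n) = X_{(1)} - \frac{\int_{-\infty}^{\infty} y\,\phi(y) \prod_{i=2}^n \phi(X_{(i)} - X_{(1)} + y)\, dy}{\int_{-\infty}^{\infty} \phi(y) \prod_{i=2}^n \phi(X_{(i)} - X_{(1)} + y)\, dy}. \tag{9.2.4}$$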

This is the Pitman estimator (5.7.10).
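The ratio of integrals defining the Pitman estimator can be approximated numerically. The sketch below is one possible way to do so, assuming a user-supplied log-density `log_phi` (known up to an additive constant) and a simple Riemann sum on a uniform grid; the function name `pitman_location`, the grid width, and the sample values are illustrative assumptions.

```python
import numpy as np

def pitman_location(x, log_phi, half_width=20.0, num=40001):
    """Approximate the Pitman estimator by a Riemann sum over y = x_(1) - theta."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.linspace(-half_width, half_width, num)
    gaps = x[1:] - x[0]                       # x_(i) - x_(1), i = 2, ..., n
    # log of phi(y) * prod_{i>=2} phi(x_(i) - x_(1) + y), up to an additive constant
    log_w = log_phi(y) + np.sum(log_phi(gaps[:, None] + y[None, :]), axis=0)
    w = np.exp(log_w - np.max(log_w))         # rescale for numerical stability
    return x[0] - np.sum(y * w) / np.sum(w)   # grid spacing cancels in the ratio

def log_normal(u):
    return -0.5 * u**2                        # standard normal log-density up to a constant

sample = np.array([0.3, -1.2, 2.5, 0.9, 1.1])
estimate = pitman_location(sample, log_normal)
shifted = pitman_location(sample + 2.0, log_normal)

print("Pitman estimate:", estimate)
print("sample mean:    ", sample.mean())
print("shift check (should be about 2):", shifted - estimate)
```

For the standard normal density the Pitman estimator reduces to the sample mean, which gives a convenient check of the numerical approximation; the second call illustrates translation equivariance.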

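When $\mathcal{F}$ is a family of location and scale parameters, with p.d.f.s

$$f(x; \mu, \sigma) = \frac{1}{\sigma}\,\phi\!\left(\frac{x - \mu}{\sigma}\right), \quad -\infty < \mu < \infty,\ 0 < \sigma < \infty, \tag{9.2.5}$$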

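we consider the group $\mathcal{G}$ of real affine transformations $\mathcal{G} = \{[\alpha, \beta]; -\infty < \alpha < \infty, 0 < \beta < \infty\}$. The Fisher information matrix of $(\mu, \sigma)$ for $\mathcal{F}$ is, if it exists,

$$I(\mu, \sigma) = \frac{1}{\sigma^2} \begin{pmatrix} I_{11} & I_{12} \\ I_{21} & I_{22} \end{pmatrix}, \tag{9.2.6}$$

where, for $u$ distributed with p.d.f. $\phi$,

$$I_{11} = V\left\{\frac{\phi'(u)}{\phi(u)}\right\}, \tag{9.2.7}$$

$$I_{12} = I_{21} = \operatorname{cov}\left(\frac{\phi'(u)}{\phi(u)},\ u\,\frac{\phi'(u)}{\phi(u)}\right), \tag{9.2.8}$$

and

$$I_{22} = V\left\{u\,\frac{\phi'(u)}{\phi(u)}\right\}. \tag{9.2.9}$$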

Accordingly, $|I(\mu, \sigma)| \propto 1/\sigma^4$ and the Jeffreys invariant prior is

$$h(\mu, \sigma)\, d\mu\, d\sigma \propto \frac{d\mu\, d\sigma}{\sigma^2}. \tag{9.2.10}$$

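If the invariant loss function for estimating $\mu$ is $L(\hat\mu, \mu, \sigma) = (\hat\mu - \mu)^2/\sigma^2$, then the formal Bayes estimator of $\mu$ is

$$\hat\mu = \frac{E_{\mu,\sigma \mid \mathbf{X}}\{\mu/\sigma^2\}}{E_{\mu,\sigma \mid \mathbf{X}}\{1/\sigma^2\}} = \frac{\int_{-\infty}^{\infty} \int_0^{\infty} \mu\, \sigma^{-(n+4)} \prod_{i=1}^n \phi\!\left(\frac{x_i - \mu}{\sigma}\right) d\sigma\, d\mu}{\int_{-\infty}^{\infty} \int_0^{\infty} \sigma^{-(n+4)} \prod_{i=1}^n \phi\!\left(\frac{x_i - \mu}{\sigma}\right) d\sigma\, d\mu}. \tag{9.2.11}$$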

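Let $y_1 \le \cdots \le y_n$ represent a realization of the order statistics $X_{(1)} \le \cdots \le X_{(n)}$. Consider the change of variables

$$u = \frac{y_1 - \mu}{\sigma}, \quad v = \frac{y_2 - y_1}{\sigma}, \quad z_i = \frac{y_i - y_1}{y_2 - y_1},\ i = 3, \ldots, n; \tag{9.2.12}$$

then the formal Bayes estimator of $\mu$ is

$$\hat\mu = y_1 - (y_2 - y_1)\, \frac{\int_{-\infty}^{\infty} u\,\phi(u) \int_0^{\infty} v^{n} \phi(u + v) \prod_{i=3}^n \phi(u + v z_i)\, dv\, du}{\int_{-\infty}^{\infty} \phi(u) \int_0^{\infty} v^{n+1} \phi(u + v) \prod_{i=3}^n \phi(u + v z_i)\, dv\, du}. \tag{9.2.13}$$

For estimating $\sigma$, we consider the invariant loss function $L(\hat\sigma, \sigma) = (\hat\sigma - \sigma)^2/\sigma^2$. The formal Bayes estimator is then

$$\hat\sigma = \frac{E_{\mu,\sigma \mid \mathbf{X}}\{1/\sigma\}}{E_{\mu,\sigma \mid \mathbf{X}}\{1/\sigma^2\}} = (y_2 - y_1)\, \frac{\int_{-\infty}^{\infty} \phi(u) \int_0^{\infty} v^{n} \phi(u + v) \prod_{i=3}^n \phi(u + v z_i)\, dv\, du}{\int_{-\infty}^{\infty} \phi(u) \int_0^{\infty} v^{n+1} \phi(u + v) \prod_{i=3}^n \phi(u + v z_i)\, dv\, du}. \tag{9.2.14}$$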

Note that the formal Bayes estimator (9.2.13) is equivalent to the Pitman estimator, for a location parameter family (with known scale parameter). The estimator (9.2.14) is the Pitman estimator for a scale parameter family.

Formal Bayes estimation can also be used when the model has parameters that are invariant with respect to the group of transformations $\mathcal{G}$. In the variance components model discussed in Example 9.3, the variance ratio $\rho = \tau^2/\sigma^2$ is such an invariant parameter. These parameters are also called nuisance parameters for the transformation model.

9.2.2 Equivariant Estimators Based on Structural Distributions

Fraser (1968) introduced structural distributions of parameters in cases of invariance structures, when all the parameters of the model can be transformed by the transformations in $\mathcal{G}$. Fraser's approach does not require the assignment of a prior distribution to the unknown parameters. This approach is based on changing the variables of integration from those representing the observable random variables to those representing the parameters. We start the explanation by considering real parameter families. More specifically, let $\mathcal{F} = \{F(x;\theta); \theta \in \Theta\}$ be a family of distributions, where $\Theta$ is an interval on the real line. Let $\mathcal{G}$ be a group of one–to–one transformations, preserving the structure of the model. For simplicity of presentation, we assume that the distribution functions of $\mathcal{F}$ are absolutely continuous and the transformations in $\mathcal{G}$ can be represented as functions over $\mathcal{X} \times \Theta$. Choose in $\Theta$ a standard or reference point $e$ and let $U$ be a random variable having the distribution $F(u;e)$, which is the standard distribution. Let $\phi(u)$ be the p.d.f. of the standard distribution. The structural model assumes that if a random variable $X$ has a distribution function $F(x;\theta)$, where $\theta = \bar g e$, $g \in \mathcal{G}$, then $X = gU$. Thus, the structural model can be expressed in the formula

$$X = G(U, \theta), \quad \theta \in \Theta. \tag{9.2.15}$$

Assume that G(u, θ) is differentiable with respect to u and θ. Furthermore, let

$$U = G^{-1}(X, \theta). \tag{9.2.16}$$

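The function $G(u, \theta)$ satisfies the equivariance condition that

$$g\,G(u, \theta) = G(u, \bar g\theta), \quad \text{all } g \in \mathcal{G},$$

with an invariant inverse; i.e.,

$$G^{-1}(gx, \bar g\theta) = G^{-1}(x, \theta), \quad \text{all } g \in \mathcal{G}.$$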

We consider now the variation of u as a function of θ for a fixed value of x. Writing the probability element of U at u in the form

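$$\phi(u)\, du = \phi(G^{-1}(x, \theta)) \left| \frac{\partial}{\partial\theta} G^{-1}(x, \theta) \right| d\theta, \tag{9.2.17}$$

we obtain for every fixed $x$ a distribution function for $\theta$, over $\Theta$, with p.d.f.

$$k(\theta, x) = \phi(G^{-1}(x, \theta))\, m(\theta, x), \tag{9.2.18}$$

where $m(\theta, x) = \left| \frac{\partial}{\partial\theta} G^{-1}(x, \theta) \right|$. The distribution function corresponding to $k(\theta, x)$ is called the structural distribution of $\theta$ given $X = x$. Let $L(\hat\theta(x), \theta)$ be an invariant loss function. The structural risk of $\hat\theta(x)$ is the expectation

$$R(\hat\theta(x)) = \int_\Theta L(\hat\theta(x), \theta)\, k(\theta, x)\, d\theta. \tag{9.2.19}$$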

An estimator $\hat\theta_0(x)$ is called a minimum risk structural estimator if it minimizes $R(\hat\theta(x))$. The p.d.f. (9.2.18) corresponds to one observation on $X$. Suppose that a sample of $n$ independent identically distributed (i.i.d.) random variables $X_1, \ldots, X_n$ is represented by the point $\mathbf{x} = (x_1, \ldots, x_n)$. As before, $\theta$ is a real parameter. Let $V(\mathbf{X})$ be a maximal invariant statistic with respect to $\mathcal{G}$. The distribution of $V(\mathbf{X})$ is independent of $\theta$. (We assume that $\Theta$ has one orbit of $\mathcal{G}$.) Let $k(v)$ be the joint p.d.f. of the maximal invariant $V(\mathbf{X})$. Let $u_1 = G^{-1}(x_1, \theta)$ and let $\phi(u \mid v)$ be the conditional p.d.f. of the standard variable $U = [\theta]^{-1}X$, given $V = v$. The conditional p.d.f. of $\theta$, for a given $\mathbf{x}$, is then, as in (9.2.18),

$$k(\theta, \mathbf{x}) = \phi(G^{-1}(x_1, \theta) \mid v(\mathbf{x}))\, m(\theta, x_1). \tag{9.2.20}$$

If the model depends on a vector θ of parameters we make the appropriate generalizations as will be illustrated in Example 9.4.
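Two one-observation cases may make the construction of the structural p.d.f. (9.2.18) more concrete. The worked steps below assume the translation model $G(u, \theta) = u + \theta$ and the scale model $G(u, \theta) = \theta u$ with $\theta > 0$; both are standard illustrations chosen here for convenience rather than taken from the examples of this chapter.

```latex
% Sketch: structural densities for a location model and a scale model.
\begin{align*}
  \text{Location: } & G(u,\theta) = u + \theta, \quad G^{-1}(x,\theta) = x - \theta, \quad
     m(\theta, x) = \left|\frac{\partial}{\partial\theta}(x - \theta)\right| = 1, \\
  & k(\theta, x) = \phi(x - \theta), \qquad -\infty < \theta < \infty; \\[4pt]
  \text{Scale: } & G(u,\theta) = \theta u, \quad G^{-1}(x,\theta) = \frac{x}{\theta}, \quad
     m(\theta, x) = \left|\frac{\partial}{\partial\theta}\frac{x}{\theta}\right| = \frac{|x|}{\theta^2}, \\
  & k(\theta, x) = \phi\!\left(\frac{x}{\theta}\right)\frac{|x|}{\theta^2}, \qquad 0 < \theta < \infty.
\end{align*}
```

In the location case the structural p.d.f. coincides with the formal posterior under the prior $d\theta$, and in the scale case with the formal posterior under the prior $d\theta/\theta$, which is consistent with the agreement between structural and formal Bayes estimators noted in this section.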

We conclude the present section with some comments concerning minimax properties of formal Bayes and structural estimators. Girshick and Savage (1951) proved that if all equivariant estimators in the location parameter model have finite risk, then the Pitman estimator (9.2.4) is minimax. Generally, if a formal Bayes estimator with respect to an invariant prior measure (such as the Jeffreys priors) and an invariant loss function is an equivariant estimator, and if the parameter space $\Theta$ has only one orbit of $\mathcal{G}$, then the risk function of the formal Bayes estimator is constant over $\Theta$. Moreover, if this formal Bayes estimator can be obtained as a limit of a sequence of proper Bayes estimators, or if there exists a sequence of proper Bayes estimators and the lower limit of their prior risks is not smaller than the risk of the formal Bayes estimator, then the formal Bayes estimator is minimax. Several theorems are available concerning the minimax nature of the minimum risk equivariant estimators. The most famous is the Hunt–Stein Theorem (Zacks, 1971; p. 346).


9.3.1 Some Basic Results

The class of all estimators can be classified according to the given risk function into two subclasses: admissible and inadmissible ones.

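Definition. An estimator $\hat\theta_1(x)$ is called inadmissible with respect to a risk function $R(\hat\theta, \theta)$ if there exists another estimator $\hat\theta_2(x)$ for which

$$\begin{aligned} &\text{(i)}\quad R(\hat\theta_2, \theta) \le R(\hat\theta_1, \theta) \quad \text{for all } \theta, \\ &\text{(ii)}\quad R(\hat\theta_2, \theta') < R(\hat\theta_1, \theta') \quad \text{for some } \theta'. \end{aligned} \tag{9.3.1}$$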

From the decision theoretic point of view inadmissible estimators are inferior. It is often not an easy matter to prove that a certain estimator is admissible. On the other hand, several examples exist of the inadmissibility of some commonly used estimators. A few examples will be provided later in this section. We start, however, with a simple and important lemma.

Lemma 9.3.1 (Blyth, 1951). If the risk function $R(\hat\theta, \theta)$ is continuous in $\theta$ for each $\hat\theta$, and if the prior distribution $H(\theta)$ has a positive p.d.f. at all $\theta$, then the Bayes estimator $\hat\theta_H(x)$ is admissible.

Proof. By negation, if $\hat\theta_H(x)$ is inadmissible then there exists another estimator $\hat\theta^*(x)$ for which (9.3.1) holds. Let $\theta^*$ be a point at which the strict inequality (ii) of (9.3.1) holds. Since $R(\hat\theta, \theta)$ is continuous in $\theta$ for each $\hat\theta$, there exists a neighborhood $N(\theta^*)$ around $\theta^*$ over which the inequality (ii) holds for all $\theta \in N(\theta^*)$. Since $h(\theta) > 0$ for all $\theta$, $P_H\{N(\theta^*)\} > 0$. Finally, from inequality (i) we obtain that

$$\int R(\hat\theta^*, \theta)\, h(\theta)\, d\theta < \int R(\hat\theta_H, \theta)\, h(\theta)\, d\theta. \tag{9.3.2}$$

The left–hand side of (9.3.2) is the prior risk of $\hat\theta^*$ and the right–hand side is the prior risk of $\hat\theta_H$. But this result contradicts the assumption that $\hat\theta_H$ is Bayes with respect to $H(\theta)$.        QED

All the examples, given in Chapter 8, of proper Bayes estimators illustrate admissible estimators. Improper Bayes estimators are not necessarily admissible. For example, in the $N(\mu, \sigma^2)$ case, when both parameters are unknown, the formal Bayes estimator of $\sigma^2$ with respect to the Jeffreys improper prior $h(\sigma^2)\, d\sigma^2 \propto d\sigma^2/\sigma^2$ is $Q/(n - 3)$, where $Q = \sum (X_i - \bar X)^2$. This estimator is, however, inadmissible, since $Q/(n + 1)$ has a smaller MSE for all $\sigma^2$. There are also admissible estimators that are not Bayes. For example, the sample mean $\bar X$ from a normal distribution $N(\theta, 1)$ is an admissible estimator with respect to a squared–error loss. However, $\bar X$ is not a proper Bayes estimator. It is a limit (as $k \to \infty$) of the Bayes estimators derived in Section 8.4, $\hat\theta_k = \bar X \left(1 + \frac{1}{nk}\right)^{-1}$. $\bar X$ is also an improper Bayes estimator with respect to the Jeffreys improper prior $h(\theta)\, d\theta \propto d\theta$. Indeed, for such an improper prior

$$\hat\theta = \frac{\int_{-\infty}^{\infty} \theta \exp\left\{-\frac{n}{2}(\bar X - \theta)^2\right\} d\theta}{\int_{-\infty}^{\infty} \exp\left\{-\frac{n}{2}(\bar X - \theta)^2\right\} d\theta} = \bar X. \tag{9.3.3}$$

The previous lemma cannot establish the admissibility of the sample mean $\bar X$. We provide here several lemmas that can be used.

Lemma 9.3.2. Assume that the MSE of an estimator $\hat\theta_1$ attains the Cramér–Rao lower bound (under the proper regularity conditions) for all $\theta$, $-\infty < \theta < \infty$, which is

$$C_1(\theta) = B_1^2(\theta) + \frac{(1 + B_1'(\theta))^2}{I(\theta)}, \tag{9.3.4}$$

where $B_1(\theta)$ is the bias of $\hat\theta_1$. Moreover, if for any estimator $\hat\theta_2$ having a Cramér–Rao lower bound $C_2(\theta)$, the inequality $C_2(\theta) \le C_1(\theta)$ for all $\theta$ implies that $B_2(\theta) = B_1(\theta)$ for all $\theta$, then $\hat\theta_1$ is admissible.

Proof. If $\hat\theta_1$ is inadmissible, there exists an estimator $\hat\theta_2$ such that

$$R(\hat\theta_2, \theta) \le R(\hat\theta_1, \theta), \quad \text{for all } \theta,$$

with a strict inequality at some $\theta'$. Since $R(\hat\theta_1, \theta) = C_1(\theta)$ for all $\theta$, we have

$$C_2(\theta) \le R(\hat\theta_2, \theta) \le R(\hat\theta_1, \theta) = C_1(\theta) \tag{9.3.5}$$

for all $\theta$. But, according to the hypothesis, (9.3.5) implies that $B_1(\theta) = B_2(\theta)$ for all $\theta$. Hence, $C_1(\theta) = C_2(\theta)$ for all $\theta$. But this contradicts the assumption that $R(\hat\theta_2, \theta') < R(\hat\theta_1, \theta')$. Hence, $\hat\theta_1$ is admissible.        QED

Lemma 9.3.2 can be applied to prove that, in the case of a sample from $N(0, \sigma^2)$, $\hat\sigma^2 = \frac{1}{n+2} \sum_{i=1}^n X_i^2$ is an admissible estimator of $\sigma^2$. (The MVUE and the MLE are inadmissible!) In such an application, we have to show that the hypotheses of Lemma 9.3.2 are satisfied. In the $N(0, \sigma^2)$ case, this requires lengthy and tedious computations (Zacks, 1971, p. 373). Lemma 9.3.2 is also useful for proving the following lemma (Girshick and Savage, 1951).

Lemma 9.3.3. Let $X$ be a one–parameter exponential type random variable, with p.d.f.

$$f(x; \psi) = h(x) \exp\{\psi x - K(\psi)\},$$

$-\infty < \psi < \infty$. Then $\hat\mu = X$ is an admissible estimator of its expectation $\mu(\psi) = K'(\psi)$, for the quadratic loss function $(\hat\mu - \mu)^2/\sigma^2(\psi)$, where $\sigma^2(\psi) = K''(\psi)$ is the variance of $X$.

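Proof. The proof of the present lemma is based on the following points. First, $X$ is an unbiased estimator of $\mu(\psi)$. Since the distribution of $X$ is of the exponential type, its variance $\sigma^2(\psi)$ is equal to the Cramér–Rao lower bound, i.e.,

$$\sigma^2(\psi) = \frac{(\mu'(\psi))^2}{I(\psi)} = \frac{(\sigma^2(\psi))^2}{I(\psi)}. \tag{9.3.6}$$

This implies that $I(\psi) = \sigma^2(\psi)$, which can also be derived directly. If $\tilde\mu(X)$ is any other estimator of $\mu(\psi)$ satisfying the Cramér–Rao regularity condition with variance $D^2(\psi)$, such that

$$E_\psi\{(\tilde\mu(X) - \mu(\psi))^2\} \le \sigma^2(\psi), \quad \text{all } -\infty < \psi < \infty, \tag{9.3.7}$$

then from the Cramér–Rao inequality

$$B^2(\psi) + \frac{(\sigma^2(\psi) + B'(\psi))^2}{\sigma^2(\psi)} \le B^2(\psi) + D^2(\psi) \le \sigma^2(\psi), \tag{9.3.8}$$

where $B(\psi)$ is the bias function of $\tilde\mu(X)$. Thus, we arrive at the inequality

$$B^2(\psi)\sigma^2(\psi) + 2B'(\psi)\sigma^2(\psi) + (B'(\psi))^2 \le 0, \tag{9.3.9}$$

all $-\infty < \psi < \infty$. This implies that

$$B'(\psi) \le 0 \quad \text{and} \quad B^2(\psi) + 2B'(\psi) \le 0, \tag{9.3.10}$$

for all $-\infty < \psi < \infty$. From (9.3.10), we obtain that either $B(\psi) = 0$ for all $\psi$ or

$$\frac{d}{d\psi}\left\{\frac{1}{B(\psi)}\right\} = -\frac{B'(\psi)}{B^2(\psi)} \ge \frac{1}{2}, \tag{9.3.11}$$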

for all $\psi$ such that $B(\psi) \ne 0$. Since $B(\psi)$ is a decreasing function, either $B(\psi) = 0$ for all $\psi \ge \psi_0$ or $B(\psi) \ne 0$ for all $\psi \ge \psi_0$. Let $G(\psi)$ be a function defined so that $G(\psi_0) = 1/B(\psi_0)$ and $G'(\psi) = 1/2$ for all $\psi \ge \psi_0$; i.e.,

$$G(\psi) = \frac{1}{B(\psi_0)} + \frac{1}{2}(\psi - \psi_0), \quad \psi \ge \psi_0. \tag{9.3.12}$$

Since $1/B(\psi)$ is an increasing function and $\frac{d}{d\psi}\{1/B(\psi)\} \ge 1/2$, it is always above $G(\psi)$ on $\psi \ge \psi_0$. It follows that

$$\lim_{\psi \to \infty} \frac{1}{B(\psi)} = \infty, \quad \text{or} \quad \lim_{\psi \to \infty} B(\psi) = 0. \tag{9.3.13}$$

In a similar manner, we can show that $\lim_{\psi \to -\infty} 1/B(\psi) = -\infty$ or $\lim_{\psi \to -\infty} B(\psi) = 0$. This implies that $B(\psi) = 0$ for all $\psi$. Finally, since the bias function of $\hat\mu(X) = X$ is also identically zero, we obtain from the previous lemma that $\hat\mu(X)$ is admissible.        QED

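Karlin (1958) established sufficient conditions for the admissibility of a linear estimator

$$\hat\theta_{\lambda,\gamma} = \frac{1}{1 + \lambda}(T + \gamma\lambda) \tag{9.3.14}$$

of the expected value of $T$ in the one parameter exponential family, with p.d.f.

$$f(x; \theta) = \beta(\theta)\, e^{\theta T(x)}, \quad \underline\theta \le \theta \le \bar\theta.$$

Theorem 9.3.1 (Karlin). Let $X$ have a one–parameter exponential distribution, with $\underline\theta \le \theta \le \bar\theta$. Sufficient conditions for the admissibility of (9.3.14) as an estimator of $E_\theta\{T(X)\}$, under squared–error loss, are

$$\lim_{\theta^* \to \bar\theta} \int_{\theta_0}^{\theta^*} \frac{e^{-\gamma\lambda\theta}}{[\beta(\theta)]^{\lambda}}\, d\theta = \infty$$

and

$$\lim_{\theta^* \to \underline\theta} \int_{\theta^*}^{\theta_0} \frac{e^{-\gamma\lambda\theta}}{[\beta(\theta)]^{\lambda}}\, d\theta = \infty,$$

where $\underline\theta < \theta_0 < \bar\theta$.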

For a proof of this theorem, see Lehmann and Casella (1998, p. 331).

A considerable amount of research was conducted on the question of the admissibility of formal or generalized Bayes estimators. Some of the important results will be discussed later. We address ourselves here to the question of the admissibility of equivariant estimators of the location parameter in the one–dimensional case. We have seen that the minimum risk equivariant estimator of a location parameter $\theta$, when finite risk equivariant estimators exist, is the Pitman estimator

$$\hat\theta(\mathbf{X}) = X_{(1)} - \frac{\int_{-\infty}^{\infty} u\,\phi(u) \prod_{i=2}^n \phi(X_{(i)} - X_{(1)} + u)\, du}{\int_{-\infty}^{\infty} \phi(u) \prod_{i=2}^n \phi(X_{(i)} - X_{(1)} + u)\, du}.$$

The question is whether this estimator is admissible. Let $\mathbf{Y} = (X_{(2)} - X_{(1)}, \ldots, X_{(n)} - X_{(1)})$ denote the maximal invariant statistic and let $f(x \mid \mathbf{y})$ be the conditional p.d.f. of $X_{(1)}$, when $\theta = 0$, given $\mathbf{Y} = \mathbf{y}$. Stein (1959) proved the following.

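Theorem 9.3.2. If $\hat\theta(\mathbf{X})$ is the Pitman estimator and

$$E\left\{\left[E\left\{\left[X_{(1)} - E\{X_{(1)} \mid \mathbf{Y}\}\right]^2 \mid \mathbf{Y}\right\}\right]^{3/2}\right\} < \infty, \tag{9.3.15}$$

where the expectations are computed under $\theta = 0$, then $\hat\theta(\mathbf{X})$ is an admissible estimator of $\theta$ with respect to the squared–error loss.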

We omit the proof of this theorem, which can be found in Stein’s paper (1959) or in Zacks (1971, pp. 388–393). The admissibility of the Pitman estimator of a two–dimensional location parameter was proven later by James and Stein (1960). The Pitman estimator is not admissible, however, if the location parameter is a vector of order p ≥ 3. This result, first established by Stein (1956) and by James and Stein (1960), will be discussed in the next section.

The Pitman estimator is a formal Bayes estimator. It is admissible in the real parameter case. The question is under what conditions formal Bayes estimators in general are admissible. Zidek (1970) established sufficient conditions for the admissibility of formal Bayes estimators having a bounded risk.

9.3.2 The Inadmissibility of Some Commonly Used Estimators

In this section, we discuss a few well–known examples of MLEs or best equivariant estimators that are inadmissible. The first example, developed by Stein (1956) and James and Stein (1960), established the inadmissibility of the MLE of the normal mean vector $\theta$, in the $N(\theta, I)$ model, when the dimension of $\theta$ is $p \ge 3$. The loss function considered is the squared–error loss, $L(\hat\theta, \theta) = \|\hat\theta - \theta\|^2$. This example opened a whole area of research and led to the development of a new type of estimator of a location vector, called the Stein estimators. Another example that will be presented establishes the inadmissibility of the best equivariant estimator of the variance of a normal distribution when the mean is unknown. This result is also due to Stein (1964). Other related results will be mentioned too.

I. The Inadmissibility of the MLE in the N(θ, I) Case, With p ≥ 3

Let $X$ be a random vector of $p$ components, with $p \ge 3$. Furthermore, assume that $X \sim N(\theta, I)$. The assumption that the covariance matrix of $X$ is $I$ is not a restrictive one, since if $X \sim N(\theta, V)$, with a known $V$, we can consider the case of $Y = C^{-1}X$, where $V = CC'$. Obviously, $Y \sim N(\eta, I)$ where $\eta = C^{-1}\theta$. Without loss of generality, we also assume that the sample size is $n = 1$. The MLE of $\theta$ is $X$ itself. Consider the squared–error loss function $L(\hat\theta, \theta) = \|\hat\theta - \theta\|^2$. Since $X$ is unbiased, the risk of the MLE is $R^* = p$ for all $\theta$. We now show an estimator that has a risk function smaller than $p$ for all $\theta$, and whose risk is close to 2 when $\theta$ is close to zero. The estimator suggested by Stein is

$$\hat\theta = \left(1 - \frac{p - 2}{X'X}\right) X. \tag{9.3.16}$$

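This estimator is called the James–Stein estimator. The risk function of (9.3.16) is

$$R(\hat\theta, \theta) = E_\theta\{\|X - \theta\|^2\} - 2(p - 2)\, E_\theta\left\{\frac{X'(X - \theta)}{X'X}\right\} + (p - 2)^2\, E_\theta\left\{\frac{1}{X'X}\right\}. \tag{9.3.17}$$

The first term on the RHS of (9.3.17) is $p$. We notice that $X'X \sim \chi^2[p; \tfrac{1}{2}\theta'\theta]$. Accordingly,

$$E_\theta\left\{\frac{1}{X'X}\right\} = E_\theta\left\{E\left\{\frac{1}{\chi^2[p + 2J]} \,\Big|\, J\right\}\right\} = E_\theta\left\{\frac{1}{p - 2 + 2J}\right\}, \tag{9.3.18}$$

where $J \sim P(\tfrac{1}{2}\theta'\theta)$. We turn now to the second term on the RHS of (9.3.17). Let $U = X'\theta/\|\theta\|$ and $V = \|X - U\theta/\|\theta\|\|^2$. Note that $U \sim N(\|\theta\|, 1)$ is independent of $V$ and $V \sim \chi^2[p - 1]$. Indeed, we can write

$$V = X'\left(I - \frac{\theta\theta'}{\|\theta\|^2}\right) X, \tag{9.3.19}$$

where $A = (I - \theta\theta'/\|\theta\|^2)$ is an idempotent matrix of rank $p - 1$. Hence, $V \sim \chi^2[p - 1]$. Moreover, $A\theta/\|\theta\| = 0$. Hence, $U$ and $V$ are independent. Furthermore,

$$\|X\|^2 = \left\|\frac{\theta\theta'}{\|\theta\|^2} X\right\|^2 + \|AX\|^2 = U^2 + V. \tag{9.3.20}$$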

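We let $W = \|X\|^2$, and derive the p.d.f. of $U/W$. This is needed, since the second term on the RHS of (9.3.17) is $-2(p - 2)[1 - \|\theta\| E_\theta\{U/W\}]$. Since $U$ and $V$ are independent, their joint p.d.f. is

$$f(u, v \mid \theta) = \frac{1}{\sqrt{2\pi}\, 2^{\frac{p-1}{2}}\, \Gamma\!\left(\frac{p-1}{2}\right)}\, v^{\frac{p-3}{2}} \exp\left\{-\frac{1}{2}(u - \|\theta\|)^2 - \frac{1}{2}v\right\}. \tag{9.3.21}$$

Thus, the joint p.d.f. of $U$ and $W$ is

$$g(u, w \mid \theta) = \frac{1}{\sqrt{2\pi}\, 2^{\frac{p-1}{2}}\, \Gamma\!\left(\frac{p-1}{2}\right)}\, (w - u^2)^{\frac{p-3}{2}} \exp\left\{-\frac{1}{2}\|\theta\|^2 + \|\theta\| u - \frac{1}{2}w\right\}, \tag{9.3.22}$$

$0 \le u^2 \le w < \infty$. The p.d.f. of $R = U/W$ is then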

(9.3.23) numbered Display Equation


(9.3.24) numbered Display Equation

By making the change of variables to t = inline.jpg and expanding inline.jpg we obtain, after some manipulations,

$$E_\theta\left\{\frac{U}{W}\right\} = \frac{1}{\|\theta\|}\, E_\theta\left\{\frac{2J}{p - 2 + 2J}\right\}, \tag{9.3.25}$$

where $J \sim P(\tfrac{1}{2}\|\theta\|^2)$. From (9.3.17), (9.3.18) and (9.3.25), we obtain

$$R(\hat\theta, \theta) = p - (p - 2)^2\, E_\theta\left\{\frac{1}{p - 2 + 2J}\right\}. \tag{9.3.26}$$

Note that when $\theta = 0$, $P_0[J = 0] = 1$ and $R(\hat\theta, 0) = 2$. On the other hand, $\lim_{\|\theta\| \to \infty} R(\hat\theta, \theta) = p$. The estimator $\hat\theta$ given by (9.3.16) has smaller risk than the MLE for all $\theta$ values. In the above development, there is nothing to tell us whether (9.3.16) is itself admissible. Note that (9.3.16) is not an equivariant estimator with respect to the group of real affine transformations, but it is equivariant with respect to the group of orthogonal transformations (rotations). If the vector $X$ has a known covariance matrix $V$, the estimator (9.3.16) should be modified to

$$\hat\theta = \left(1 - \frac{p - 2}{X'V^{-1}X}\right) X. \tag{9.3.27}$$

This estimator is equivariant with respect to the group $\mathcal{G}$ of nonsingular transformations $X \to AX$. Indeed, the covariance matrix of $Y = AX$ is $\Sigma = AVA'$. Therefore, $Y'\Sigma^{-1}Y = X'V^{-1}X$ for every $A \in \mathcal{G}$.
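The risk of the James–Stein estimator can be checked by simulation. The sketch below assumes the form $p - (p-2)^2 E\{1/(p - 2 + 2J)\}$, $J \sim P(\|\theta\|^2/2)$, for the risk and compares it with a Monte Carlo estimate for $p = 5$; the function names, the seed, and the number of replications are illustrative choices. The dimension $p = 5$ is used rather than $p = 3$ so that the simulated loss has finite variance.

```python
import numpy as np

def james_stein(x):
    """Apply (1 - (p - 2) / x'x) x to each row of x."""
    p = x.shape[-1]
    s = np.sum(x * x, axis=-1, keepdims=True)
    return (1.0 - (p - 2) / s) * x

def mc_risk(estimator, theta, reps=200_000, seed=0):
    """Monte Carlo estimate of E ||estimator(X) - theta||^2 for X ~ N(theta, I)."""
    rng = np.random.default_rng(seed)
    x = theta + rng.standard_normal((reps, theta.size))
    return float(np.mean(np.sum((estimator(x) - theta) ** 2, axis=1)))

def exact_js_risk(theta, terms=400):
    """p - (p - 2)^2 E{1 / (p - 2 + 2J)}, J ~ Poisson(||theta||^2 / 2), by a truncated series."""
    p = theta.size
    lam = 0.5 * float(theta @ theta)
    weight = np.exp(-lam)          # Poisson probability of J = 0
    total = 0.0
    for j in range(terms):
        total += weight / (p - 2 + 2 * j)
        weight *= lam / (j + 1)    # recursion to the probability of J = j + 1
    return p - (p - 2) ** 2 * total

theta0 = np.zeros(5)
theta1 = np.ones(5)
mc0, mc1 = mc_risk(james_stein, theta0), mc_risk(james_stein, theta1)
ex0, ex1 = exact_js_risk(theta0), exact_js_risk(theta1)

print("theta = 0:      Monte Carlo", mc0, " series", ex0)
print("theta = (1,..): Monte Carlo", mc1, " series", ex1)
```

At the origin the series gives exactly 2, and both values stay below the constant risk $p = 5$ of the MLE.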

Baranchick (1973) showed, in a manner similar to the above, that in the usual multiple regression model with normal distributions the commonly used MLEs of the regression coefficients are inadmissible. More specifically, let $X_1, \ldots, X_n$ be a sample of $n$ i.i.d. $(p + 1)$-dimensional random vectors, having a multinormal distribution $N(\theta, \Sigma)$. Consider the regression of $Y = X_1$ on $Z = (X_2, \ldots, X_{p+1})'$. If we consider the partition $\theta' = (\eta, \zeta')$ and

Unnumbered Display Equation

then the regression of Y on Z is given by

Unnumbered Display Equation

where α = ηβζ and β = V−1C. The problem is to estimate the vector of regression coefficients β. The least–squares estimators (LSE) is inline.jpg, where inline.jpg. Y1, …, Yn and Z1, …, Zn are the sample statistics corresponding to X1, …, Xn.

Consider the loss function

(9.3.28) numbered Display Equation

With respect to this loss function Baranchick proved that the estimators

(9.3.29) numbered Display Equation

have risk functions smaller than those of the LSEs (MLEs) $\hat\alpha$ and $\hat\beta$, at all the parameter values, provided $c \in \left(0, \frac{2(p - 2)}{n - p + 2}\right)$ and $p \ge 3$, $n \ge p + 2$. $R^2$ is the squared–multiple correlation coefficient given by

Unnumbered Display Equation

The proof is very technical and is omitted. The above results of Stein and Baranchick on the inadmissibility of the MLEs can be obtained from the following theorem of Cohen (1966), which characterizes all the admissible linear estimators of the mean vector of multinormal distributions. The theorem provides only the conditions for the admissibility of the estimators and, contrary to the results of Stein and Baranchick, it does not construct alternative estimators.

Theorem 9.3.3 (Cohen, 1966). Let $X \sim N(\theta, I)$ where the dimension of $X$ is $p$. Let $\hat\theta = AX$ be an estimator of $\theta$, where $A$ is a $p \times p$ matrix of known coefficients. Then $\hat\theta$ is admissible with respect to the squared–error loss $\|\hat\theta - \theta\|^2$ if and only if $A$ is symmetric and its eigenvalues $\alpha_i$ $(i = 1, \ldots, p)$ satisfy the inequality

$$0 \le \alpha_i \le 1, \quad \text{for all } i = 1, \ldots, p, \tag{9.3.30}$$

with equality to 1 for at most two of the eigenvalues.

For a proof of the theorem, see Cohen (1966) or Zacks (1971, pp. 406–408). Note that for the MLE of $\theta$ the matrix $A$ is $I$, and all the eigenvalues are equal to 1. Thus, if $p \ge 3$, $X$ is an inadmissible estimator. If we shrink the MLE towards the origin and consider the estimator $\hat\theta_\lambda = \lambda X$ with $0 < \lambda < 1$, then the resulting estimator is admissible for any dimension $p$. Indeed, $\hat\theta_\lambda$ is actually the Bayes estimator (8.4.31) with $A_1 = V = I$, $\Sigma = \tau^2 I$ and $A_2 = 0$. In this case, the Bayes estimator is $\hat\theta_\tau = \frac{\tau^2}{1 + \tau^2} X$, where $0 < \tau < \infty$. We set $\lambda = \tau^2/(1 + \tau^2)$. According to Lemma 9.3.1, this proper Bayes estimator is admissible. In Section 9.3.3, we will discuss more meaningful adjustments of the MLE to obtain admissible estimators of $\theta$.
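The eigenvalue condition of Theorem 9.3.3 is easy to check numerically. The sketch below assumes the condition reads: A symmetric, all eigenvalues in [0, 1], and at most two eigenvalues equal to 1. The function name `is_admissible_linear` and the tolerance are illustrative assumptions.

```python
import numpy as np

def is_admissible_linear(a, tol=1e-10):
    """Check the symmetric / eigenvalue condition for the linear estimator AX."""
    a = np.asarray(a, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("A must be a square matrix")
    if not np.allclose(a, a.T, atol=tol):
        return False                              # A must be symmetric
    eig = np.linalg.eigvalsh(a)                   # real eigenvalues of a symmetric matrix
    in_range = bool(np.all((eig >= -tol) & (eig <= 1.0 + tol)))
    num_ones = int(np.sum(np.abs(eig - 1.0) <= tol))
    return in_range and num_ones <= 2

mle_p2 = is_admissible_linear(np.eye(2))               # MLE, p = 2
mle_p3 = is_admissible_linear(np.eye(3))               # MLE, p = 3
shrunk = is_admissible_linear(0.5 * np.eye(5))         # lambda * X with lambda = 0.5
partial = is_admissible_linear(np.diag([1.0, 1.0, 0.5]))
skewed = is_admissible_linear(np.array([[0.5, 0.2], [0.0, 0.5]]))

print(mle_p2, mle_p3, shrunk, partial, skewed)
```

Under the stated reading of the theorem, the identity matrix passes the check for p = 2 and fails it for p = 3, while any multiple λI with 0 < λ < 1 passes in every dimension.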

II. The Inadmissibility of the Best Equivariant Estimators of the Scale Parameter When the Location Parameter is Unknown

Consider first the problem of estimating the variance of a normal distribution $N(\mu, \sigma^2)$ when the mean $\mu$ is unknown. Let $X_1, \ldots, X_n$ be i.i.d. random variables having such a distribution. Let $(\bar X, Q)$ be the minimal sufficient statistic, $\bar X = \frac{1}{n} \sum_{i=1}^n X_i$ and $Q = \sum_{i=1}^n (X_i - \bar X)^2$. We have seen that the minimum risk equivariant estimator, with respect to the quadratic loss $L(\hat\sigma^2, \sigma^2) = (\hat\sigma^2 - \sigma^2)^2/\sigma^4$, is $\hat\sigma_0^2 = Q/(n + 1)$. Stein (1964) showed that this estimator is, however, inadmissible! The estimator

$$\hat\sigma_1^2 = \min\left\{\frac{Q}{n + 1},\ \frac{Q + n\bar X^2}{n + 2}\right\} \tag{9.3.31}$$

has uniformly smaller risk function. We present here Stein’s proof of this inadmissibility.

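Let $S = Q + n\bar X^2$. Obviously, $S \sim \sigma^2\chi^2[n; n\mu^2/2\sigma^2] \sim \sigma^2\chi^2[n + 2J]$ where $J \sim P(n\mu^2/2\sigma^2)$. Consider the scale equivariant estimators that are functions of $(Q, S)$. Their structure is $f(Q, S) = S\phi\left(\frac{Q}{S}\right)$. Moreover, the conditional distribution of $Q/S$ given $J$ is the beta distribution $\beta\left(\frac{n - 1}{2}, \frac{1}{2} + J\right)$. Furthermore, given $J$, $Q/S$ and $S$ are conditionally independent. Note that for $\hat\sigma_0^2$ we use the function $\phi_0\left(\frac{Q}{S}\right) = \frac{Q}{(n + 1)S}$. Consider the estimator

$$\hat\sigma_1^2 = S \min\left\{\frac{1}{n + 1} \cdot \frac{Q}{S},\ \frac{1}{n + 2}\right\}. \tag{9.3.32}$$

Here, $\phi_1\left(\frac{Q}{S}\right) = \min\left\{\frac{Q}{(n + 1)S}, \frac{1}{n + 2}\right\}$. The risk function, for the quadratic loss $L(\hat\sigma^2, \sigma^2) = (\hat\sigma^2 - \sigma^2)^2/\sigma^4$ is, for any function $\phi\left(\frac{Q}{S}\right)$,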

(9.3.33) numbered Display Equation

where inline.jpg and inline.jpg. Let W = Q/S. Then,

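$$R(\phi) = E\left\{(n + 2J)(n + 2J + 2)\left[\phi(W) - \frac{1}{n + 2J + 2}\right]^2 + \frac{2}{n + 2J + 2}\right\}. \tag{9.3.34}$$

We can also write,

$$\left[\phi_0(W) - \frac{1}{n + 2 + 2J}\right]^2 = [\phi_0(W) - \phi_1(W)]^2 + \left[\phi_1(W) - \frac{1}{n + 2 + 2J}\right]^2 + 2[\phi_0(W) - \phi_1(W)]\left[\phi_1(W) - \frac{1}{n + 2 + 2J}\right]. \tag{9.3.35}$$

We notice that $\phi_1(W) \le \phi_0(W)$. Moreover, if $W \le \frac{n + 1}{n + 2}$ then $\phi_1(W) = \phi_0(W)$, and the first and third terms on the RHS of (9.3.35) are zero. Otherwise,

$$\phi_0(W) > \phi_1(W) = \frac{1}{n + 2} \ge \frac{1}{n + 2 + 2J}.$$

Hence,

$$\left[\phi_0(W) - \frac{1}{n + 2 + 2J}\right]^2 \ge \left[\phi_1(W) - \frac{1}{n + 2 + 2J}\right]^2 \tag{9.3.36}$$

for all $J$ and $W$, with strict inequality on a $(J, W)$ set having positive probability. From (9.3.34) and (9.3.36) we obtain that $R(\phi_1) < R(\phi_0)$. This proves that $\hat\sigma_0^2$ is inadmissible.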
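A simulation can illustrate the size of the improvement. The sketch below assumes that the best equivariant estimator is $Q/(n + 1)$ and that Stein's competitor is $\min\{Q/(n + 1), (Q + n\bar X^2)/(n + 2)\}$, and it estimates both risks under the loss $(\hat\sigma^2 - \sigma^2)^2/\sigma^4$ from the same simulated samples. The function name, the seed, and the values n = 5, μ = 0 are illustrative assumptions.

```python
import numpy as np

def stein_variance_risks(n, mu, sigma=1.0, reps=200_000, seed=1):
    """Monte Carlo risks of Q/(n+1) and of min{Q/(n+1), (Q + n*xbar^2)/(n+2)}."""
    rng = np.random.default_rng(seed)
    x = mu + sigma * rng.standard_normal((reps, n))
    xbar = x.mean(axis=1)
    q = np.sum((x - xbar[:, None]) ** 2, axis=1)
    s = q + n * xbar**2
    est0 = q / (n + 1)                        # best equivariant estimator
    est1 = np.minimum(est0, s / (n + 2))      # Stein's truncated estimator

    def scaled_risk(est):
        return float(np.mean((est - sigma**2) ** 2) / sigma**4)

    never_larger = bool(np.all(est1 <= est0))
    return scaled_risk(est0), scaled_risk(est1), never_larger

risk0, risk1, never_larger = stein_variance_risks(n=5, mu=0.0)
print("risk of Q/(n+1):         ", risk0)   # theory: 2 / (n + 1)
print("risk of Stein's estimator:", risk1)
print("Stein's estimate never exceeds Q/(n+1):", never_larger)
```

The gain is largest when μ/σ is near zero and fades as |μ|/σ grows, since the truncation is then rarely active; the improvement is modest in size even where it is largest.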

The above method of Stein can be used to prove the inadmissibility of equivariant estimators of the variance parameters also in other normal models. See for example, Klotz, Milton and Zacks (1969) for a proof of the inadmissibility of equivariant estimators of the variance components in Model II of ANOVA.

Brown (1968) studied the question of the admissibility of the minimum risk (best) equivariant estimators of the scale parameter σ in the general location and scale parameter model, with p.d.f.s f(x;μ, σ) = inline.jpg. The loss functions considered are invariant bowl–shaped functions, L(δ). These are functions that are nonincreasing for δ ≤ δ0 and nondecreasing for δ > δ0 for some δ0. Given the order statistic X(1) ≤ … ≤ X(n) of the sample, let inline.jpg = inline.jpg and Zi = (X(i)inline.jpg)/S, i = 3, …, n. Z = (Z3, …, Zn)′ is a maximal invariant with respect to the group inline of real affine transformations. The best equivariant estimator of ω = σk is of the form inline.jpg0 = inline0(Z)Sk, where inline0(Z) is an optimally chosen function. Brown proved that the estimator

(9.3.37) numbered Display Equation

where $K(Z)$ is an appropriately chosen function, and $\phi_1(Z) < \phi_0(Z)$, has uniformly smaller risk than $\hat\omega_0$. This established the inadmissibility of the best equivariant estimator, when the location parameter is unknown, for general families of distributions and loss functions. Arnold (1970) provided a similar result in the special case of the family of shifted exponential distributions, i.e., $f(x; \mu, \sigma) = I\{x \ge \mu\}\, \frac{1}{\sigma} \exp\left\{-\frac{x - \mu}{\sigma}\right\}$.

Brewster and Zidek (1974) showed that in certain cases one can refine Brown's approach by constructing a sequence of improving estimators converging to a generalized Bayes estimator. The risk function of this estimator does not exceed that of the best equivariant estimators. In the normal case $N(\mu, \sigma^2)$, this estimator is of the form $\phi(Z)Q$, where

$$\phi(z) = \frac{E\{Q \mid Z \le z\}}{E\{Q^2 \mid Z \le z\}}, \tag{9.3.38}$$

with $Z = \sqrt{n}\,|\bar X|/\sqrt{Q}$. The conditional expectations in (9.3.38) are computed with $\mu = 0$ and $\sigma = 1$. Brewster and Zidek (1974) provided a general group theoretic framework for deriving such estimators in the general case.

9.3.3 Minimax and Admissible Estimators of the Location Parameter

In Section 9.3.2, we presented the James–Stein proof that the MLE of the location parameter vector in the $N(\theta, I)$ case with dimension $p \ge 3$ is inadmissible. It was shown that the estimator (9.3.16) is uniformly better than the MLE. The estimator (9.3.16) is, however, also inadmissible. Several studies have been published on the question of adjusting estimator (9.3.16) to obtain minimax estimators. In particular, see Berger and Bock (1976). Baranchick (1970) showed that a family of minimax estimators of $\theta$ is given by

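$$\hat\theta_\phi = \left(1 - \frac{\phi(S)}{S}\right) X, \tag{9.3.39}$$

where $S = X'X$ and $\phi(S)$ is a function satisfying the conditions:

$$\begin{aligned} &\text{(i)}\quad 0 \le \phi(S) \le 2(p - 2); \\ &\text{(ii)}\quad \phi(S) \text{ is nondecreasing in } S. \end{aligned} \tag{9.3.40}$$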

If the model is $N(\theta, \sigma^2 I)$ with known $\sigma^2$ then the above result holds with $S = X'X/\sigma^2$. If $\sigma^2$ is unknown and $\hat\sigma^2$ is an estimator of $\sigma^2$ having a distribution like $\sigma^2 \cdot \chi^2[\nu]/(\nu + 2)$ then we substitute in (9.3.39) $S = X'X/\hat\sigma^2$. The minimaxity of (9.3.39) is established by proving that its risk function, for the squared–error loss, does not exceed the constant risk, $R^* = p$, of the MLE $X$. Note that the MLE, $X$, is also minimax. In addition, (9.3.39) can be improved by

$$\hat\theta_\phi^+ = \left(1 - \frac{\phi(S)}{S}\right)^+ X, \tag{9.3.41}$$

where a+ = max(a, 0). These estimators are not necessarily admissible. Admissible and minimax estimators of θ similar to (9.3.39) were derived by Strawderman (1972) for cases of known σ2 and p ≥ 5. These estimators are

(9.3.42) numbered Display Equation

where inline.jpg for p = 5 and 0 ≤ a ≤ 1 for p ≥ 6, also

Unnumbered Display Equation

in which G(x;ν) = P{G(1, ν) ≤ x}. In Example 9.9, we show that θa(X) are generalized Bayes estimators for the squared–error loss and the hyper–prior model

(9.3.43) numbered Display Equation

Unnumbered Display Equation

Note that if a < 1 then G(λ) is a proper prior and inline.jpg(X) is a proper Bayes estimator. For a proof of the admissibility for the more general case of inline.jpg see Lin (1974).

9.3.4 The Relationship of Empirical Bayes and Stein–Type Estimators of the Location Parameter in the Normal Case

Efron and Morris (1972a, 1972b, 1973) show the connection between the estimator (9.3.16) and empirical Bayes estimation of the mean vector of a multinormal distribution. Ghosh (1992) presents a comprehensive comparison of empirical Bayes and Stein–type estimators, in the case where X ∼ N(θ, I). See also the studies of Lindley and Smith (1972), Casella (1985), and Morris (1983).

Recall that in parametric empirical Bayes procedures, one estimates the unknown prior parameters from a model of the predictive distribution of X, and substitutes the estimators in the formulae of the Bayesian estimators. On the other hand, in hierarchical Bayesian procedures one assigns specific hyper–prior distributions for the unknown parameters of the prior distributions. The two approaches may sometimes result in similar estimators.

Starting with the simple model of p–variate normal X | θ ∼ N(θ, I) and θ ∼ N(0, τ²I), the Bayes estimator of θ, for the squared–error loss L(θ̂, θ) = ||θ̂ – θ||², is θ̂_B = (1 – B)X, where B = 1/(1 + τ²). The predictive distribution of X is N(0, B⁻¹I). Thus, the predictive distribution of X′X is like that of B⁻¹χ²[p]. Since E{1/χ²[p]} = 1/(p – 2) for p ≥ 3, (p – 2)/X′X is a predictive–unbiased estimator of B. Substituting this estimator for B in θ̂_B yields the parametric empirical Bayes estimator

(9.3.44) θ̂_EB = (1 – (p – 2)/X′X) X
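
The substitution described above, replacing B by (p – 2)/X′X in (1 – B)X, can be sketched in a few lines of Python. This is only an illustration under the stated model; the function name james_stein and the sample vector are hypothetical and do not come from the text.

```python
def james_stein(x):
    """Empirical Bayes shrinkage of x toward the origin: (1 - (p - 2)/x'x) * x."""
    p = len(x)
    if p < 3:
        raise ValueError("the estimator requires dimension p >= 3")
    s = sum(v * v for v in x)  # S = X'X
    if s == 0.0:
        raise ValueError("X'X is zero, so the shrinkage factor is undefined")
    b_hat = (p - 2) / s  # predictive-unbiased estimate of B = 1/(1 + tau^2)
    return [(1.0 - b_hat) * v for v in x]


# Hypothetical observation of dimension p = 5.
x = [2.0, -1.0, 0.5, 3.0, 1.5]
theta_eb = james_stein(x)
```

Note that the factor 1 – (p – 2)/X′X becomes negative when X′X < p – 2, which is what the positive–part modification a+ = max(a, 0) in (9.3.41) is meant to prevent.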

The estimator θ̂_EB derived here is identical with the James–Stein estimator (9.3.16). If we change the Bayesian model so that θ ∼ N(μ1, τ²I), with μ known and 1 denoting the p–vector of ones, then the Bayesian estimator is θ̂_B = (1 – B)X + Bμ1, and the corresponding empirical Bayes estimator is

(9.3.45) θ̂_EB = X – [(p – 2)/((X – μ1)′(X – μ1))] (X – μ1)

If both μ and τ are unknown, the resulting empirical Bayes estimator is

(9.3.46) numbered Display Equation

where inline.jpg.

Ghosh (1992) showed that the predictive risk function of inline.jpg, namely, the trace of the MSE matrix inline.jpg is

(9.3.47) numbered Display Equation

Thus, inline.jpg has smaller predictive risk than the MLE inline.jpgML = X if p ≥ 4.

The Stein–type estimator of the parametric vector β in the linear model XAβ + inline is

(9.3.48) numbered Display Equation

where inline.jpg is the LSE and

(9.3.49) numbered Display Equation

It is interesting to compare this estimator of β with the ridge regression estimator (5.4.3), in which we substitute for the optimal k value the estimator pσ²/inline.jpginline.jpg. The ridge regression estimator then takes the form

(9.3.50) numbered Display Equation

There is some analogy but the estimators are obviously different. A comprehensive study of the properties of the Stein–type estimators for various linear models is presented in the book of Judge and Bock (1978).


Example 9.1. Let X be a binomial B(n, θ) random variable, where n is known and 0 < θ < 1. If we let θ have a prior beta distribution, i.e., θ ∼ β(ν1, ν2), then the posterior distribution of θ given X is the beta distribution β(ν1 + X, ν2 + n – X). Consider the linear estimator inline.jpg. The MSE of inline.jpgα, β is

Unnumbered Display Equation

We can choose α0 and β0 so that R(inline.jpgα0, β0, θ) = (β0)2. For this purpose, we set the equations

Unnumbered Display Equation

The two roots are

Unnumbered Display Equation

With these constants, we obtain the estimator

Unnumbered Display Equation

with constant risk

Unnumbered Display Equation

We show now that θ* is a minimax estimator of θ for a squared–error loss by specifying a prior beta distribution for which θ* is Bayes.

The Bayes estimator for the prior β (ν1, ν2) is

Unnumbered Display Equation

In particular, if inline.jpg then inline.jpg. This proves that θ* is minimax.

Finally, we compare the MSE of this minimax estimator with the variance of the MVUE, X/n, which is also an MLE. The variance of inline.jpg = X/n is θ(1 – θ)/n, which assumes its maximal value of 1/(4n) at θ = 1/2. This value is larger than R(θ*, θ). Thus, we know that around θ = 1/2 the minimax estimator has a smaller MSE than the MVUE. Actually, by solving the quadratic equation

Unnumbered Display Equation

we obtain the two limits of the interval around θ = 1/2 over which the minimax estimator is better. These limits are given by

Unnumbered Display Equation                inline
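
The comparison in Example 9.1 can be checked numerically. The sketch below assumes the standard closed forms for this problem, since the display equations are not reproduced here: the constant–risk estimator θ* = (X + √n/2)/(n + √n), its risk 1/(4(1 + √n)²), and the crossover points 1/2 ± √(1 + 2√n)/(2(1 + √n)). The function names and the choice n = 25 are illustrative only.

```python
from math import comb, sqrt


def mse(estimator, n, theta):
    """Exact MSE of estimator(x) under X ~ B(n, theta), summing over the p.m.f."""
    return sum(
        comb(n, x) * theta**x * (1 - theta) ** (n - x) * (estimator(x) - theta) ** 2
        for x in range(n + 1)
    )


def minimax_estimator(n):
    """Assumed closed form of the constant-risk linear estimator."""
    r = sqrt(n)
    return lambda x: (x + r / 2) / (n + r)


def crossover_points(n):
    """Roots of theta(1 - theta)/n = 1/(4(1 + sqrt(n))^2)."""
    r = sqrt(n)
    half_width = sqrt(1 + 2 * r) / (2 * (1 + r))
    return 0.5 - half_width, 0.5 + half_width


n = 25
theta_star = minimax_estimator(n)
constant_risk = 1 / (4 * (1 + sqrt(n)) ** 2)
risks = [mse(theta_star, n, t) for t in (0.1, 0.3, 0.5, 0.9)]
lower, upper = crossover_points(n)
```

Between lower and upper the minimax estimator has the smaller MSE; outside this interval the MVUE X/n does better.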

Example 9.2.

A. Let X1, …, Xn be i.i.d. random variables, distributed like N(θ, 1), −∞ < θ < ∞. The MVUE (or MLE), inline.jpg, has a constant variance n⁻¹. Thus, if our loss function is the squared–error, inline.jpg is minimax. Indeed, the estimators inline.jpg are Bayes with respect to the prior N(0, k) distributions. The risks of these Bayesian estimators are

Unnumbered Display Equation

But ρ(inline.jpgk, k) → n⁻¹ as k → ∞. This proves that inline.jpg is minimax.
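
The limiting argument of part A can be illustrated numerically. The sketch assumes the usual normal–normal posterior mean, a·X̄ with a = nk/(1 + nk), as the Bayes estimator for the N(0, k) prior, and computes its prior risk as a²/n + (1 – a)²k. The function names and the values of n and k are illustrative.

```python
def bayes_weight(n, k):
    """Weight a of the sample mean in the posterior mean under the N(0, k) prior."""
    return n * k / (1 + n * k)


def prior_risk(a, n, k):
    """Prior risk of a * xbar: variance a^2/n plus expected squared bias (1 - a)^2 * k."""
    return a * a / n + (1 - a) ** 2 * k


n = 10
ks = [1, 10, 100, 1000, 10000]
# For the Bayes weight this reduces to k / (1 + n k), which increases to 1/n.
risks = [prior_risk(bayes_weight(n, k), n, k) for k in ks]
```

The prior risks increase with k and approach 1/n, the constant risk of the sample mean, without ever exceeding it.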

B. Consider the problem of estimating the common mean, μ, of two normal distributions, which was discussed in Example 5.24. We can show that (X + Y)/2 is a minimax estimator for the symmetric loss function (μ̂ − μ)²/(σ² max(1, ρ)). If the loss function is (μ̂ − μ)²/σ², then the minimax estimator is inline.jpg, regardless of inline.jpg. This is due to the large risk when ρ → ∞ (see details in Zacks (1971, p. 291)).        inline

Example 9.3. Consider the problem of estimating the variance components in the Model II of analysis of variance. We have k blocks of n observations on the random variables, which are represented by the linear model

Unnumbered Display Equation

where eij are i.i.d. N(0, σ2); a1, …, ak are i.i.d. random variables distributed like N(0, τ2), independently of {eij}. In Example 3.3, we have established that a minimal sufficient statistic is inline.jpg. This minimal sufficient statistic can be represented by inline.jpg, where

Unnumbered Display Equation


Unnumbered Display Equation

ρ = τ²/σ² is the variance ratio, inline.jpg = 1, 2, 3 are three independent chi–squared random variables. Consider the group inline of real affine transformations, inline = {[α, β]; −∞ < α < ∞, 0 < β < ∞}, and the quadratic loss function L(θ̂, θ) = (θ̂ − θ)²/θ². We notice that all the parameter points (μ, σ², τ²) such that τ²/σ² = ρ belong to the same orbit. The values of ρ, 0 < ρ < ∞, index the various possible orbits in the parameter space. The maximal invariant reduction of T* is

Unnumbered Display Equation

Thus, every equivariant estimator of σ² is of the form

Unnumbered Display Equation

where U = Qa/Qe, inline(U) and f(U) are chosen functions. Note that the distribution of U depends only on ρ. Indeed, inline.jpg. The risk function of an equivariant estimator inline.jpg is (Zacks, 1970)

Unnumbered Display Equation

If K(ρ) is any prior distribution of the variance ratio ρ, the prior risk EK{R(f, ρ)} is minimized by choosing f(U) to minimize the posterior expectation given U, i.e.,

Unnumbered Display Equation

The function fK(U) that minimizes this posterior expectation is

Unnumbered Display Equation

The Bayes equivariant estimator of σ2 is obtained by substituting fK(u) in inline.jpg. For more specific results, see Zacks (1970b).        inline

Example 9.4. Let X1, …, Xn be i.i.d. random variables having a location and scale parameter exponential distribution, i.e.,

Unnumbered Display Equation

–∞ < μ < ∞, 0 < σ < ∞.

A minimal sufficient statistic is (X(1), Sn), where X(1) ≤ … ≤ X(n) and Sn = inline.jpg. We can derive the structural distribution on the basis of the minimal sufficient statistic. Recall that X(1) and Sn are independent and

Unnumbered Display Equation

where G(1) ≤ … ≤ G(n) is an order statistic from a standard exponential distribution G(1, 1), corresponding to μ = 0, σ = 1.

The group of transformations under consideration is inline = {[a, b]; −∞ < a < ∞, 0 < b < ∞}. The standard point is the vector (G(1), SG), where G(1) = (X(1) − μ)/σ and SG = Sn/σ. The Jacobian of this transformation is J(X(1), Sn, μ, σ) = inline.jpg. Moreover, the p.d.f. of (G(1), SG) is

Unnumbered Display Equation

Hence, the structural distribution of (μ, σ) given (X(1), Sn) has the p.d.f.

Unnumbered Display Equation

for −∞ < μ ≤ X(1), 0 < σ < ∞.

The minimum risk structural estimators in the present example are obtained in the following manner. Let L(μ̂, μ, σ) = (μ̂ − μ)²/σ² be the loss function for estimating μ. Then the minimum risk estimator is the μ–expectation. This is given by

Unnumbered Display Equation

where inline.jpg.

It is interesting to notice that while the MLE of μ is X(1), the minimum risk structural estimator might be considerably smaller, but close to the Pitman estimator.

The minimum risk structural estimator of σ, for the loss function L(σ̂, σ) = (σ̂ − σ)²/σ², is given by

Unnumbered Display Equation

One can show that inline.jpg is also the minimum risk equivariant estimator of σ.        inline

Example 9.5. A minimax estimator might be inadmissible. We show such a case in the present example. Let X1, …, Xn be i.i.d. random variables having a normal distribution, like N(μ, σ²), where both μ and σ² are unknown. The objective is to estimate σ² with the quadratic loss L(σ̂², σ²) = (σ̂² − σ²)²/σ⁴. The best equivariant estimator with respect to the group inline.jpg of real affine transformations is σ̂² = Q/(n + 1), where Q = Σ(Xi − X̄n)², the sum extending over i = 1, …, n. This estimator has a constant risk 2/(n + 1). Thus, σ̂² is minimax. However, σ̂² is dominated uniformly by the estimator (9.3.31) and is thus inadmissible.        inline
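
The constant risk in Example 9.5 can be verified from the first two moments of a chi–squared variable. The sketch assumes Q/σ² ∼ χ²[n – 1], so that the risk of cQ under the loss above is c²(n – 1)(n + 1) – 2c(n – 1) + 1, free of (μ, σ²). The function name scaled_risk and the value n = 10 are illustrative; the code does not address the dominance by (9.3.31).

```python
def scaled_risk(c, n):
    """Risk of c*Q under loss (est - sigma^2)^2 / sigma^4, with Q/sigma^2 ~ chi2[n - 1]."""
    m1 = n - 1               # E{chi2[n - 1]}
    m2 = (n - 1) * (n + 1)   # E{(chi2[n - 1])^2}
    return c * c * m2 - 2 * c * m1 + 1


n = 10
risk_best = scaled_risk(1 / (n + 1), n)      # divisor n + 1: 2/(n + 1)
risk_mle = scaled_risk(1 / n, n)             # divisor n: (2n - 1)/n^2
risk_unbiased = scaled_risk(1 / (n - 1), n)  # divisor n - 1: 2/(n - 1)
```

Among multiples of Q, the divisor n + 1 gives the smallest risk, below both the MLE (divisor n) and the unbiased estimator (divisor n – 1).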

Example 9.6. In continuation of Example 9.5, given the minimal sufficient statistics (X̄n, Q), the Bayes estimator of σ², with respect to the squared–error loss, and the prior assumption that μ and σ² are independent, μ ∼ N(0, τ²) and 1/(2σ²) ∼ G(λ, ν), is

Unnumbered Display Equation

This estimator is admissible since h(μ, σ2) > 0 for all (μ, σ2) and the risk function is continuous in (μ, σ2).        inline

Example 9.7. Let X1, …, Xn be i.i.d. random variables, having the location parameter exponential density f(x; μ) = exp{−(x − μ)} I{x ≥ μ}. The minimal sufficient statistic is X(1) = min(X1, …, Xn). Moreover, X(1) ∼ μ + G(n, 1). Thus, the UMVU estimator of μ is X(1) − 1/n. According to (9.3.15), this estimator is admissible. Indeed, by Basu’s Theorem, the invariant statistic Y = (X(2) − X(1), X(3) − X(2), …, X(n) − X(n – 1)) is independent of X(1). Thus

Unnumbered Display Equation


Unnumbered Display Equation        inline
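
A small simulation illustrates the unbiasedness of X(1) − 1/n in Example 9.7. This is only a sketch: the values of μ and n, the number of replications, and the seed are arbitrary choices, and the code checks unbiasedness only, not admissibility.

```python
import random


def umvu_location(sample):
    """Sample minimum corrected by its bias 1/n."""
    return min(sample) - 1.0 / len(sample)


def simulate(mu, n, reps, seed=0):
    """Average of the estimator over simulated samples of mu + standard exponential."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [mu + rng.expovariate(1.0) for _ in range(n)]
        total += umvu_location(sample)
    return total / reps


mu, n = 2.0, 5
average = simulate(mu, n, reps=20000)
```

Since X(1) − μ is exponential with mean 1/n, the simulated average should be close to μ, with a standard error of about (1/n)/√reps.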

Example 9.8. Let Y ∼ N(θ, I), where θ = Hβ,

Unnumbered Display Equation

and β′ = (β1, β2, β3, β4). Note that inline.jpg is an orthogonal matrix. The LSE of β is inline.jpg, and the LSE of θ is θ̂ = H(H′H)⁻¹H′Y = Y. The eigenvalues of H(H′H)⁻¹H′ are αi = 1, for i = 1, …, 4. Thus, according to Theorem 9.3.3, θ̂ is admissible.        inline

Example 9.9. Let X ∼ N(θ, I), where X is a p–dimensional vector. We derive the generalized Bayes estimators of θ, for squared–error loss, and the Bayesian hyper–prior (9.3.43). This hyper–prior is

Unnumbered Display Equation


Unnumbered Display Equation

The joint distribution of (X′, θ′)′, given λ, is

Unnumbered Display Equation

Hence, the conditional distribution of θ, given (X, λ) is

Unnumbered Display Equation

The marginal distribution of X, given λ, is inline.jpg. Thus, the density of the posterior distribution of λ, given X, is

Unnumbered Display Equation

where S = X′X. It follows that the generalized Bayes estimator of θ, given X, is

Unnumbered Display Equation


Unnumbered Display Equation        inline


Section 9.1

9.1.1 Consider a finite population of N units. M units have the value x = 1 and the rest have the value x = 0. A random sample of size n is drawn without replacement. Let X be the number of sample units having the value x = 1. The conditional distribution of X, given M, n is H(N, M, n). Consider the problem of estimating the parameter P = M/N, with a squared–error loss. Show that the linear estimator inline.jpg, with inline.jpg and inline.jpg has a constant risk.

9.1.2 Let inline.jpg be a family of prior distributions on Θ. The Bayes risk of H inline inline.jpg is ρ(H) = ρ (inline.jpgH, H), where inline.jpgH is a Bayesian estimator of θ, with respect to H. H* is called least–favorable in inline.jpg if ρ(H*) = inline.jpg. Prove that if H is a prior distribution in inline.jpg such that ρ(H) = inline.jpg then

(i) inline.jpgH is minimax for inline.jpg;
(ii) H is least favorable in inline.jpg.

9.1.3 Let X1 and X2 be independent random variables, X1 | θ1 ∼ B(n, θ1) and X2 | θ2 ∼ B(n, θ2). We wish to estimate δ = θ2 − θ1.

(i) Show that for the squared error loss, the risk function R(inline.jpg, θ1, θ2) of

Unnumbered Display Equation

attains its supremum for all points (θ1, θ2) such that θ1 + θ2 = 1.

(ii) Apply the result of Problem 2 to show that inline.jpg(X1, X2) is minimax.

9.1.4 Prove that if inline.jpg(X) is a minimax estimator over Θ1, where Θ1 ⊂ Θ and inline.jpg, then inline.jpg is minimax over Θ.

9.1.5 Let X1, …, Xn be a random sample (i.i.d.) from a distribution with mean θ and variance σ² = 1. It is known that |θ| ≤ M, 0 < M < ∞. Consider the linear estimator θ̂a,b = aX̄n + b, where X̄n is the sample mean, with 0 ≤ a ≤ 1.

(i) Derive the risk function of θ̂a,b.
(ii) Show that

Unnumbered Display Equation

(iii) Show that θ̂a* = a*X̄n, with a* = M²/(M² + 1/n), is minimax.

Section 9.2

9.2.1 Consider Problem 4, Section 8.3. Determine the Bayes estimator for μ, δ = μ − η and σ² with respect to the improper prior h(μ, η, σ²) specified there and the invariant loss functions

Unnumbered Display Equation

respectively, and show that the Bayes estimators are equivariant with respect to inline = {[α, β]; −∞ < α < ∞, 0 < β < ∞}.

9.2.2 Consider Problem 4, Section 8.2. Determine the Bayes equivariant estimator of the variance ratio ρ with respect to the improper prior distribution specified in the problem, the group inline = {[α, β];−∞ < α < ∞, 0 < β < ∞} and the squared–error loss function for ρ.

9.2.3 Let X1, …, Xn be i.i.d. random variables having a common rectangular distribution R(θ, θ + 1), −∞ < θ < ∞. Determine the minimum MSE equivariant estimator of θ with a squared–error loss L(θ̂, θ) = (θ̂ − θ)² and the group inline of translations.

9.2.4 Let X1, …, Xn be i.i.d. random variables having a scale parameter distribution, i.e.,

Unnumbered Display Equation

(i) Show that the Pitman estimator of σ is the same as the Formal Bayes estimator (9.2.14).
(ii) Derive the Pitman estimator of inline.jpg when inline (x) = exp{−x} I{x ≥ 0}.

Section 9.3

9.3.1 Minimax estimators are not always admissible. However, prove that the minimax estimator of θ in the B(n, θ) case, with squared–error loss function, is admissible.

9.3.2 Let X ∼ B(n, θ). Show that θ̂ = X/n is an admissible estimator of θ

(i) for the squared–error loss;
(ii) for the quadratic loss (θ̂ − θ)²/(θ(1 – θ)).

9.3.3 Let X be a discrete random variable having a uniform distribution on {0, 1, …, θ}, where the parameter space is Θ = {0, 1, 2, …}. For estimating θ, consider the loss function L(inline.jpg, θ) = θ (inline.jpgθ)2.

(i) Derive the Bayesian estimator inline.jpgH, where H is a discrete prior distribution on Θ.
(ii) What is the Bayesian estimator of θ, when h(θ) = inline.jpg?
(iii) Show that inline.jpg = X is a Bayesian estimator only for the prior distribution concentrated on θ = 0, i.e., PH{θ = 0} = 1.
(iv) Compare the risk function of inline.jpg with the risk function of inline.jpg1 = max(1, X).
(v) Show that an estimator inline.jpgm = m is Bayesian against the prior Hm s.t. PHm{θ = m} = 1, m = 0, 2, ….

9.3.4 Let X be a random variable (r.v.) with mean θ and variance σ2, 0 < σ2 < ∞. Show that inline.jpga, b = aX + b is an inadmissible estimator of θ, for the squared–error loss function, whenever

(i) a > 1, or
(ii) a < 0, or
(iii) a = 1 and b ≠ 0.

9.3.5 Let X1, …, Xn be i.i.d. random variables having a normal distribution with mean zero and variance σ2 = 1/inline.

(i) Derive the Bayes estimator of σ2, for the squared–error loss, and gamma prior, i.e., inline.jpg.
(ii) Show that the risk of the Bayes estimator is finite if n + n0 > 4.
(iii) Use Karlin’s Theorem (Theorem 9.3.1) to establish the admissibility of this Bayes estimator, when n + n0 > 4.

9.3.6 Let X ∼ B(n, θ), 0 < θ < 1. For which values of λ and γ is the linear estimator

Unnumbered Display Equation

admissible?

9.3.7 Prove that if an estimator has a constant risk, and is admissible then it is minimax.

9.3.8 Prove that if an estimator is unique minimax then it is admissible.

9.3.9 Suppose that inline = {F(x;θ), θ inline Θ} is invariant under a group of transformations inline. If Θ has only one orbit with respect to inline.jpg (transitive) then the minimum risk equivariant estimator is minimax and admissible.

9.3.10 Show that any unique Bayesian estimator is admissible.

9.3.11 Let X ∼ N(θ, I) where the dimension of X is p > 2. Show that the risk function of inline.jpgc = inline.jpg, for the squared–error loss, is

Unnumbered Display Equation

For which values of c does inline.jpgc dominate inline.jpg = X? (i.e., the risk of inline.jpgc is uniformly smaller than that of X).

9.3.12 Show that the James–Stein estimator (9.3.16) is dominated by

Unnumbered Display Equation

where a+ = max(a, 0).



(i) The variance of inline.jpg. Thus, the MSE of inline.jpg is

Unnumbered Display Equation

Now, for inline.jpg and inline.jpg we get

Unnumbered Display Equation

This is a constant risk (independent of P).



Unnumbered Display Equation

Let w = θ1 + θ2. The function g(w) = 2w − w² attains its maximum at w = 1. Thus, R(inline.jpg, θ1, θ2) attains its supremum on {(θ1, θ2): θ1 + θ2 = 1}, which is R*(δ) = inline.jpg.

(ii) Let θ1, θ2 be priorly independent, having the same prior distribution beta (a, a). Then, the Bayes estimator for this prior and squared–error loss is

Unnumbered Display Equation

Let inline.jpg then the Bayes estimator δB(X1, X2) is equal to inline.jpg(X1, X2). Hence,

Unnumbered Display Equation

Thus, inline.jpg(X1, X2) is minimax.


(i) inline.jpga, b = ainline.jpgn + b. Hence,

Unnumbered Display Equation

(ii) Since |θ | ≤ M < ∞,

Unnumbered Display Equation

(iii) For b = 0,

Unnumbered Display Equation

The value of a that minimizes this supremum is a* = M²/(M² + 1/n). Thus, if a = a* and b = 0

Unnumbered Display Equation

Thus, θ* = a*X̄n is minimax.
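
The minimization in part (iii) can be checked numerically. The sketch assumes that the supremum of the risk of aX̄n over |θ| ≤ M is a²/n + (1 – a)²M², attained at θ = ±M, and compares the closed–form minimizer with a grid search over 0 ≤ a ≤ 1. The function names and the values n = 8, M = 1.5 are illustrative.

```python
def sup_risk(a, n, m):
    """Supremum over |theta| <= m of the MSE of a * xbar when the variance is 1."""
    return a * a / n + (1 - a) ** 2 * m * m


def best_weight(n, m):
    """Closed-form minimiser a* = M^2 / (M^2 + 1/n) of the supremum risk."""
    return m * m / (m * m + 1 / n)


n, m = 8, 1.5
a_star = best_weight(n, m)
grid = [i / 1000 for i in range(1001)]
a_grid = min(grid, key=lambda a: sup_risk(a, n, m))  # brute-force check
min_sup_risk = sup_risk(a_star, n, m)                # equals M^2 / (1 + n M^2)
```

The minimized supremum M²/(1 + nM²) is smaller than 1/n, the risk of the sample mean itself.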


Θ = {0, 1, 2, …}, L(θ̂, θ) = θ(θ̂ − θ)².

(i) inline.jpg. Let h(θ) be a discrete p.d.f. on Θ. The posterior p.d.f. of θ, given X = x, is

Unnumbered Display Equation

The posterior risk is

Unnumbered Display Equation

Thus, the Bayes estimator is the integer part of

Unnumbered Display Equation

(ii) If inline.jpg, the Bayesian estimator is the integer part of

Unnumbered Display Equation

Let p(i;τ) and P(i, τ) denote, respectively, the p.d.f. and c.d.f. of the Poisson distribution with mean τ. We have,

Unnumbered Display Equation


Unnumbered Display Equation


Unnumbered Display Equation

Note that BH(x) > x for all x ≥ 1. Indeed,

Unnumbered Display Equation

for all x ≥ 1.

(iii) If P{θ = 0} = 1 then obviously P{X = 0} = 1 and BH(X) = 0 = X. On the other hand, suppose that for some j > 0, P{θ = j} > 0, then P{X = j} = inline.jpg and, with probability inline.jpg. Thus, BH(X) = X = 0 if, and only if, P{θ = 0} = 1.
(iv) Let inline.jpg = X. The risk is then

Unnumbered Display Equation

Note that inline.jpg(θ, 0) = 0. Consider the estimator inline.jpg1 = max(1, X). In this case, R(inline.jpg1, 0) = 1. We can show that R(inline.jpg1, θ) < R(inline.jpg, θ) for all θ ≥ 1. However, since R(inline.jpg, 0) < R(inline.jpg1, 0), inline.jpg1 is not better than inline.jpg.

