9 Introduction to Particle Filtering: The Next Stage in Tracking

Tracking development has progressed slowly in terms of its ability to track under more stressing conditions. Rudolph Kalman’s¹ innovative paper on linear filtering and prediction problems led the way to provide one of the first adaptive and optimal approaches for realistic target tracking. As a result of the importance of this work, Kalman was awarded the Kyoto Prize ($340,000), considered to be the Japanese equivalent to the Nobel Prize.² Since that time, the Kalman filter has become the cornerstone for most of the major tracking developments to date.^{3, 4, 5, 6 and 7} Kalman filters represent the class of optimal Bayesian estimators that operate under conditions involving linear-Gaussian uncertainty. As long as the target dynamics operates within a polynomial motion model and sensor measurements are related linearly to the target state with Gaussian noise, the approach is an optimal estimator. Although the success of the Kalman filter led to numerous military and civilian applications, it has its problems⁸: to name a few, there is marginal stability of the numerical solution of the Riccati equation, small round-off errors accumulate and degrade performance, and conversion of non-Cartesian sensor measurements introduces coupling biases within the Cartesian Kalman state. This did not stop researchers from developing methods to expand Kalman filter operating bounds to the larger nonlinear problem classes.

The extended Kalman filter (EKF) has provided the means to adapt these linear principles to conditions involving nonlinear dynamics or nonlinear measurements by employing partial derivatives as a means to linearize the systems. The Kalman filter retains its basic form, but partial derivatives are applied either to the state transition matrix or the measurement matrix to develop a minimum variance estimate of the target state. In essence, this provides a “piecewise linear” application to the nonlinear tracking problem.

Still certain problem classes are not addressed by these methods; the most prominent problem involves conditions where sensor measurements provide incomplete observations (such as range-only, bearing-only, or time difference of arrival tracking) or where tracking models drastically change (for instance, road-following and move-stop-move tracking). Recently, problems of this class have been successfully addressed by techniques such as the unscented * (transform) Kalman filter (UKF) and particle filtering. While the EKF provides an approximation to optimal nonlinear estimation, the UKF has been shown to provide good performance in propagating the state mean and covariance through small-scale nonlinear transformations. The basic premise is that a set of sigma points (generally 2n + 1, n being the state space size) represent the target state estimate probability density—one particle for the mean and 2n particles to bound the covariance of the track estimate. However, these samples are not randomly selected but they are deterministic. Yet they are sufficient to capture third-order moments. Other techniques also exist that extend the Kalman filter to nonlinear dynamic conditions, such as the modified gain Kalman filter, the Gauss–Hermite quadrature filter—a version similar to the UKF.

Particle filtering extends the notions of the unscented transform, using sequential Monte Carlo sum (SMS) methods to represent the state estimate for nonlinear non-Gaussian problem classes. The full potential of particle filtering has not been explored. While many of the concepts were introduced in the 1960s and 1970s, interest did not develop until recently.⁹ This chapter explores some of the concepts that lead to the development of particle filters, and the details behind their implementation. An example is provided to demonstrate some of the strengths and weaknesses of this approach. Particle filtering is an important contribution to the world of dynamic target estimation, but like the other approaches, it is designed to satisfy a specific class of tracking problems. The Kalman filter provides an optimal solution for linear-Gaussian problem classes, whereas the particle filter is designed to address nonlinear non-Gaussian conditions, but it is not an optimal filter. Still, it represents a major step toward satisfying the needs of working in nonlinear tracking environments.

9.2 Target State Filtering Problem

The target state model can be represented by a discrete-time stochastic model

\begin{matrix} x_{k + 1} = f_{k} (x_{k}, u_{k}, v_{k}) & (9.1) \end{matrix}

$\begin{matrix} x_{k + 1} = f_{k} (x_{k}, u_{k}, v_{k}) & (9.1) \end{matrix}$

where u_k represents the external deterministic control, v_k represents the random process noise that aids in capturing any mismatch of the target model with the true state dynamics, and the time index k represents current time, whereas k + 1 provides a prediction to the next time step. The function f(⋅) indicates the possible nonlinear relationship of the target dynamic behavior. Likewise, the measurement equation could also be nonlinear as indicated by h(⋅) in the following equation:

\begin{matrix} z_{k} = h_{k} (x_{k}, u_{k}, w_{k}) & (9.2) \end{matrix}

$\begin{matrix} z_{k} = h_{k} (x_{k}, u_{k}, w_{k}) & (9.2) \end{matrix}$

where w_k represents the measurement noise.

We are interested in generating filtered track estimates of x_k from a sequence of sensor measurements. Using the measurement history, Z^k ≡ {z_i, I = 1, 2, …,k}, we can develop the posterior probability density function (pdf), p(x_k|Z^k). The posterior pdf provides the means to determine the conditional predicted target state, based on Bayes’ theorem.

9.2.1 Chapman–Komolgorov Equation

Given the posterior pdf p(x_k|Z^k), we want to obtain a recursive relationship relating the prior pdf at time k + 1. The Chapman–Komolgorov equation represents a one-step transition density for a Markov random sequence,¹⁰ an optimal estimator. As Z^k represents the cumulative information that leads to the current state estimate, the posterior density summarizes the target past in a probabilistic sense. Starting with Bayes’ theorem, we can write,

p (x_{k + 1} | Z^{k + 1}) = p (x_{k + 1} | Z^{k}, z_{k + 1}) = \frac{p (z_{k + 1} | x_{k + 1}, Z^{k}) p (x_{k + 1} | Z^{k})}{p (z_{k + 1} | Z^{k})}

$p (x_{k + 1} | Z^{k + 1}) = p (x_{k + 1} | Z^{k}, z_{k + 1}) = \frac{p (z_{k + 1} | x_{k + 1}, Z^{k}) p (x_{k + 1} | Z^{k})}{p (z_{k + 1} | Z^{k})}$

or more compactly as

\begin{matrix} p (x_{k + 1} | Z^{k + 1}) = \frac{1}{c} p (z_{k + 1} | x_{k + 1} | Z^{k}) & (9.3) \end{matrix}

$\begin{matrix} p (x_{k + 1} | Z^{k + 1}) = \frac{1}{c} p (z_{k + 1} | x_{k + 1} | Z^{k}) & (9.3) \end{matrix}$

where

c = p (z_{k + 1} | Z^{k + 1}) = \int p (z_{k + 1} | x_{k + 1}) p (x_{k + 1} | Z^{k}) {dx}_{k}

$c = p (z_{k + 1} | Z^{k + 1}) = \int p (z_{k + 1} | x_{k + 1}) p (x_{k + 1} | Z^{k}) {dx}_{k}$

The first step is to show the relationship between the prediction pdf p(x_k+1|Z^k) and the prior update p(x_k|Z^k). That is, we can expand the second term of Equation 9.3 as

p (x_{k + 1} | Z^{k}) = \frac{p (x_{k + 1}, Z^{k})}{p (Z^{k})} = \int \frac{p (x_{k + 1}, x_{k}, Z^{k})}{p (Z^{k})} {dx}_{k} = \int \frac{p (x_{k + 1} | x_{k}, Z^{k}) p (x_{k}, Z^{k})}{p (Z^{k})} {dx}_{k}

$p (x_{k + 1} | Z^{k}) = \frac{p (x_{k + 1}, Z^{k})}{p (Z^{k})} = \int \frac{p (x_{k + 1}, x_{k}, Z^{k})}{p (Z^{k})} {dx}_{k} = \int \frac{p (x_{k + 1} | x_{k}, Z^{k}) p (x_{k}, Z^{k})}{p (Z^{k})} {dx}_{k}$

or simply recognizing the joint pdf as p(x_k, Z^k) = p(x_k|Z^k)p(Z^k) (assuming the process is Markov), we get

p (x_{k + 1} | Z^{k}) = \int p (x_{k + 1} | x_{k}) p (x_{k} | Z^{k}) {dx}_{k}

$p (x_{k + 1} | Z^{k}) = \int p (x_{k + 1} | x_{k}) p (x_{k} | Z^{k}) {dx}_{k}$

Equation 9.3 becomes the Chapman–Komolgorov equation:

\begin{matrix} p (x_{k + 1} | Z^{k + 1}) = \frac{1}{c} p (z_{k + 1} | x_{k + 1}) \int p (x_{k + 1} | x_{k}) p (x_{k} | Z^{k}) {dx}_{k} & (9.4) \end{matrix}

$\begin{matrix} p (x_{k + 1} | Z^{k + 1}) = \frac{1}{c} p (z_{k + 1} | x_{k + 1}) \int p (x_{k + 1} | x_{k}) p (x_{k} | Z^{k}) {dx}_{k} & (9.4) \end{matrix}$

Note that the observation likelihood p(z_k+1|x_k+1) can be taken outside the integral of the Chapman—Komolgorov (Equation 9.4). The terms inside the integral include the dynamic transition of the target to the next state prediction p(x_k+1|x_k), and the prior state p(x_k|Z^k). The conditional pdf p(x_k|Z^k) has the property of being an information state. The optimal estimator then consists of a recursive formulation of the conditional pdf. In general, the process is computationally intensive.

The Chapman–Komolgorov equation forms the basis for developing track estimation. If the system is linear, and all noises are Gaussian, then the functional recursion becomes the recursion for the conditional mean and covariance (Kalman filter), in which case the covariance estimation errors (accuracy) are independent of measurements and can be computed off-line.

For the linear case, the target dynamic equation can generally be written as

x_{k + 1} = F_{k} x_{k} + G_{k} u_{k} + v_{k}

$x_{k + 1} = F_{k} x_{k} + G_{k} u_{k} + v_{k}$

where v_k is assumed be a zero-mean white Gaussian noise with covariance Q_k. The mean and covariance of the one-step prediction track state estimate then become, respectively,

{\hat{x}}_{k + 1 | k} = F_{k} {\hat{x}}_{k | k} + G_{k} u_{k}

${\hat{x}}_{k + 1 | k} = F_{k} {\hat{x}}_{k | k} + G_{k} u_{k}$

and

P_{k + 1 | k} = F_{k} P_{k | k} F_{k}^{'} + Q_{k}

$P_{k + 1 | k} = F_{k} P_{k | k} F_{k}^{'} + Q_{k}$

while the measurement equation can be written as

z_{k} = H_{k} x_{k} + w_{k}

$z_{k} = H_{k} x_{k} + w_{k}$

where w_k is the measurement noise generally assumed to be zero-mean white Gaussian with covariance R.

The nonlinear case, however, shows the accuracy to be measurement dependent. The EKF represents an attempt to apply Kalman filtering principles to conditions of nonlinearity, either in the target dynamic or the measurement model (or both) by applying a first-order Taylor series expansion. In essence, it attempts to linearize the nonlinear functions of the state or measurement models. Taking Equations 9.1 and 9.2, we can expand the state estimate to include the first-order derivative and higher-order terms (HOT) as (ignoring the external control)

{\hat{x}}_{k + 1 | k} = f (k, {\hat{x}}_{k | k}) + f_{X} (k) (x_{k} - {\hat{x}}_{k | k}) + HOT + v_{k}

${\hat{x}}_{k + 1 | k} = f (k, {\hat{x}}_{k | k}) + f_{X} (k) (x_{k} - {\hat{x}}_{k | k}) + HOT + v_{k}$

where f_X(k) represents the Jacobian matrix

\begin{matrix} f_{X} (k) \equiv & {\begin{matrix} [\nabla_{X} f {(x, k)}^{'}] \end{matrix}}^{'} |_{x = x_{k | k}} \\ = & (\begin{matrix} \frac{{\partial f}_{1}}{{\partial x}_{1}} & \cdot \cdot \cdot & \frac{{\partial f}_{1}}{{\partial x}_{n}} \\ ⋮ & ⋮ & ⋮ \\ \frac{{\partial f}_{m}}{{\partial x}_{1}} & \cdot \cdot \cdot & \frac{{\partial f}_{m}}{{\partial x}_{n}} \end{matrix}) = F (k) \end{matrix}

$\begin{matrix} f_{X} (k) \equiv & {\begin{matrix} [\nabla_{X} f {(x, k)}^{'}] \end{matrix}}^{'} |_{x = x_{k | k}} \\ = & (\begin{matrix} \frac{{\partial f}_{1}}{{\partial x}_{1}} & \cdot \cdot \cdot & \frac{{\partial f}_{1}}{{\partial x}_{n}} \\ ⋮ & ⋮ & ⋮ \\ \frac{{\partial f}_{m}}{{\partial x}_{1}} & \cdot \cdot \cdot & \frac{{\partial f}_{m}}{{\partial x}_{n}} \end{matrix}) = F (k) \end{matrix}$

Then the prediction state and the associated covariance become, respectively,

{\hat{x}}_{k + 1 | k} = f (k, {\hat{x}}_{k | k})

${\hat{x}}_{k + 1 | k} = f (k, {\hat{x}}_{k | k})$

and

P_{k + 1 | k} = F (k) P_{k | k} F {(k)}^{'} + Q_{k}

$P_{k + 1 | k} = F (k) P_{k | k} F {(k)}^{'} + Q_{k}$

Likewise, nonlinearities in the measurement prediction can be written as

z_{k + 1} = h (k + 1, {\hat{x}}_{k + 1 | k}) + h_{X} (k + 1) (x_{k + 1} - {\hat{x}}_{k + 1 | k}) + HOT + w_{k}

$z_{k + 1} = h (k + 1, {\hat{x}}_{k + 1 | k}) + h_{X} (k + 1) (x_{k + 1} - {\hat{x}}_{k + 1 | k}) + HOT + w_{k}$

where

h_{X} (k + 1) \equiv {[\nabla_{X} h {(x, k + 1)}^{'}]}^{'} |_{x = x_{k + 1 | k}} = H (k + 1)

$h_{X} (k + 1) \equiv {[\nabla_{X} h {(x, k + 1)}^{'}]}^{'} |_{x = x_{k + 1 | k}} = H (k + 1)$

The prediction measurement and the prediction measurement covariance become, respectively,

{\hat{z}}_{k + 1 | k} = h (x_{k + 1}, {\hat{x}}_{k + 1 | k})

${\hat{z}}_{k + 1 | k} = h (x_{k + 1}, {\hat{x}}_{k + 1 | k})$

and

S_{k + 1} = H (k + 1) P_{k + 1 | k} H {(k + 1)}^{'} + R_{k + 1}

$S_{k + 1} = H (k + 1) P_{k + 1 | k} H {(k + 1)}^{'} + R_{k + 1}$

A complete description can be found in Refs 5 and 11. Some of the key issues with using an EKF include the need to evaluate a Jacobian matrix for the state and measurement equations to approximate nonlinear calculations, and covariance computations are no longer decoupled from the state estimate calculations.

Limitations with the EKF could involve introducing (1) biases through the linearization of the nonlinear conditions, (2) reduced accuracy in the prediction state, or (3) complications due to human error in calculating Jacobian matrices. In addition, there is no guarantee of reduced error even if we include second-order terms from the Taylor series expansion. Finally, extensive simulations may be required to test the consistency of the filter.

As the EKF uses the first-order terms as an approximation, errors will “creep” into the system if neglected higher-order terms begin to dominate.

Note that the EKF always approximates the p(x_k|Z^k) as Gaussian, which is often considered to be piecewise continuous between updates. As such, the principles of the EKF assume that the posterior densities hold for the Gaussian $p (x_{k} | Z^{k}) = N (x_{k}, {\hat{x}}_{k | k}, P_{k | k})$ $p (x_{k} | Z^{k}) = N (x_{k}, {\hat{x}}_{k | k}, P_{k | k})$ .

The approach provides a fast and simple method to apply higher-order moments to solve non-Gaussian posterior densities, as long as the nonlinearity is not too severe. Other methods, such as the UKF, extend these techniques to include accuracy up to third order.*

Particle filters represent the next level, designed to characterize non-Gaussian process noise in the target state or measurements through a sequence of particles (point masses) of the density that estimates the state in terms of a sequential Monte Carlo process. Before we proceed, we need to examine the notion of importance sampling.

9.2.2 Monte Carlo Integration and Importance Sampling

Monte Carlo integration involves developing a numerical integration for a multidimensional integral that represents a particular function, which draws from samples of x over a specified set of dimensions, n_x.

\begin{matrix} I = \int g (x) dx & where x \in ℜ^{n_{x}} \end{matrix}

$\begin{matrix} I = \int g (x) dx & where x \in ℜ^{n_{x}} \end{matrix}$

We need to factor g(x) into two functions, g(x) = f(x)π(x), such that π(x) represents a pdf with probabilistic properties π(x) ≥ 0 and ∫π(x)dx = 1. Then we can use these properties to approximate the integral

\begin{matrix} I = \int f (x) π (x) dx & (9.5) \end{matrix}

$\begin{matrix} I = \int f (x) π (x) dx & (9.5) \end{matrix}$

in terms of its sample moments. The sample mean would be expressed as $E [I_{N}] = (1 / N) Σ_{i = 1}^{N} f (x^{i})$ $E [I_{N}] = (1 / N) Σ_{i = 1}^{N} f (x^{i})$ by sampling N independent samples from f(x), where xⁱ is the ith sample. The variance (second central moment) of f(x) is σ² = ∫(f(x) − I)²π(x)dx. In the limiting sense, convergence occurs based on the central limit theorem showing that the characteristic function approaches a normal distribution, that is, $lim_{N \to \infty} \sqrt{N} (E [I_{N}] - I) \sim N (0, σ^{2})$ $lim_{N \to \infty} \sqrt{N} (E [I_{N}] - I) \sim N (0, σ^{2})$ .

For the Bayesian tracking problem, the pdf π(x) represents the posterior density. Ideally, we want to generate samples from this density π(x); in reality, we may know little about the density. However, suppose we can only approximate π(x) with a similar density q(x), termed an importance density. We want to ensure that a sample drawn from q(x) will have a good chance of coming from the true posterior pdf π(x). An acceptable distribution would contain the true pdf, but it need not fully contain the entire pdf as long as the samples rejected (outside of q(x)) would have a high probability of not affecting the target state. Ideally, a good approximation density would be the one proportional to the true posterior. If the ratio π(x)/q(x) is bounded by some constant, say M, then we can accept every sample drawn with probability 1.

We can rewrite Equation 9.5 in terms of this approximation as

I = \int f (x) π (x) dx = \int f (x) \frac{π (x)}{q (x)} q (x) dx

$I = \int f (x) π (x) dx = \int f (x) \frac{π (x)}{q (x)} q (x) dx$

To illustrate the problem, consider the pdf π(x) in Figure 9.1 to be a Gaussian mixture that represents the true target conditions. Then, assume that the true pdf has the form:

π(x) = 0.5N(x; 0, 1.0) + 0.5N(x; 1.25, 0.25)

where each Gaussian density has a uniform probability of acceptance.* We can let the initial importance function be a uniform density that expands the original true pdf to have a size, say U(–2, 2). Because the importance density supports (covers) the interval containing most of the true pdf π(x), samples from this estimate will eventually converge to π(x) (central limit theorem). If, however, we could use the function q *(x) in Figure 9.1, convergence would be more rapid and the ratio $\tilde{w} (x^{i})$ $\tilde{w} (x^{i})$ ∝ (π(xⁱ)/Mq(xⁱ)) would ensure a sample q(x) has a probability of acceptance as the ratio of the higher curve to the lower density.¹²

Images

FIGURE 9.1
Conceptual example of importance sampling.

This would give a sample mean estimate as

\begin{matrix} E [I_{N}] = \frac{1}{N} Σ_{i = 1}^{N} f (x^{i}) \tilde{w} (x^{i}) & (9.6) \end{matrix}

$\begin{matrix} E [I_{N}] = \frac{1}{N} Σ_{i = 1}^{N} f (x^{i}) \tilde{w} (x^{i}) & (9.6) \end{matrix}$

where $\tilde{w} (x^{i}) = π (x^{i}) / q (x^{i})$ $\tilde{w} (x^{i}) = π (x^{i}) / q (x^{i})$ represents the importance sampling weights. Either q(xⁱ) or q^*(xⁱ) would be adequate, but q^*(xⁱ) would ensure a quicker convergence. However, it is customary to use a uniform density when initiating a particle filter. Note that the density q(x) need not sum to 1, but it must have a positive finite solution. To compensate for this, we can normalize $\tilde{w} (x^{i})$ $\tilde{w} (x^{i})$ by simply replacing Equation 9.6 with

E [I_{N}] = \frac{1}{N} Σ_{i = 1}^{N} f (x^{i}) {\frac{\tilde{w} (x^{i})}{(1 / N) Σ_{j = 1}^{N} \tilde{w} (x^{j})}} = \frac{1}{N} Σ_{i = 1}^{N} f (x^{i}) w (x^{i})

$E [I_{N}] = \frac{1}{N} Σ_{i = 1}^{N} f (x^{i}) {\frac{\tilde{w} (x^{i})}{(1 / N) Σ_{j = 1}^{N} \tilde{w} (x^{j})}} = \frac{1}{N} Σ_{i = 1}^{N} f (x^{i}) w (x^{i})$

where $\tilde{w} (x^{i})$ $\tilde{w} (x^{i})$ represents importance sampling weights that support (cover) the posterior density of the Bayesian track, π(x).

If we compare results between the true and the estimated mixtures, as long as the importance function supports the posterior density, the means will be relatively consistent. Similarly, the sampled variance approaches the desired posteriori density, π(x), as¹³

lim_{N \to \infty} \sqrt{N} (E [I_{N}] - I) \to N (0, E {[f (x) - I]}^{2} w (x))

$lim_{N \to \infty} \sqrt{N} (E [I_{N}] - I) \to N (0, E {[f (x) - I]}^{2} w (x))$

which is consistent with sampling the original distribution: $σ_{fn} \sim σ_{I} / \sqrt{N}$ $σ_{fn} \sim σ_{I} / \sqrt{N}$ .

9.3 Particle Filter

The particle filter represents a set of hypothesized states for the target through a finite set of particles as a belief of the true target state.^{9,14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27 and 28} Ideally, we would like to construct a posterior density based on the target history that propagates a representative set of particles through the current time. This posterior probability p(X_k|Z^k) would be represented by a set of samples (particles) drawn independently from this density. We could denote this random measure (collected set) as ${X_{k}^{j}, w_{k}^{j}}_{j = 1}^{N}$ ${X_{k}^{j}, w_{k}^{j}}_{j = 1}^{N}$ where the particles (support points) are given by ${X_{k}^{j}}_{j = 1}^{N}$ ${X_{k}^{j}}_{j = 1}^{N}$ , along with a set of weights, ${w_{k}^{j}}$ ${w_{k}^{j}}$ , that determine each particle’s relative importance to the target’s trajectory. The weights are normalized by $Σ_{j = 1}^{N} w_{k}^{j} = 1$ $Σ_{j = 1}^{N} w_{k}^{j} = 1$ . As in the standard tracking problem, we could use this random measure to generate a new update ${X_{k + 1}^{j}}_{j = 1}^{N}$ ${X_{k + 1}^{j}}_{j = 1}^{N}$ by applying the next subsequent measurement z_k+1 to the current measure ${X_{k}^{j}}_{j = 1}^{N}$ ${X_{k}^{j}}_{j = 1}^{N}$ .

The true target track density can be represented by the random measure as an approximation to the posterior density by a discrete sum of point masses*:

\begin{matrix} p (X_{k + 1} | Z^{k + 1}) \approx Σ_{j = 1}^{N} w_{k + 1}^{j} δ (X_{k + 1} - X_{k + 1}^{j}) * & (9.7) \end{matrix}

$\begin{matrix} p (X_{k + 1} | Z^{k + 1}) \approx Σ_{j = 1}^{N} w_{k + 1}^{j} δ (X_{k + 1} - X_{k + 1}^{j}) * & (9.7) \end{matrix}$

Unfortunately, as we saw in Section 9.2, little may be known of this density. We will have to rely on developing a set of particles from a more convenient density—the importance density^† q(X_k+1|Z^k+1). If we sample the particles ${x_{k}^{j}, w_{k}^{j}}_{j = 1}^{N}$ ${x_{k}^{j}, w_{k}^{j}}_{j = 1}^{N}$ from this density rather than the true posterior, we need to modify the weights accordingly,

\begin{matrix} w_{k + 1}^{j} \propto \frac{p (X_{k + 1}^{j} | Z^{k + 1})}{q (X_{k + 1}^{j} | Z^{k + 1})} & (9.8) \end{matrix}

$\begin{matrix} w_{k + 1}^{j} \propto \frac{p (X_{k + 1}^{j} | Z^{k + 1})}{q (X_{k + 1}^{j} | Z^{k + 1})} & (9.8) \end{matrix}$

for the posterior density as

\begin{matrix} p (X_{k + 1} | Z^{k + 1}) \approx Σ_{j = 1}^{N} \frac{p (X_{k + 1}^{j} | Z^{k + 1})}{q (X_{k + 1}^{j} | Z^{k + 1})} δ (X_{k + 1} - X_{k + 1}^{j}) & (9.9) \end{matrix}

$\begin{matrix} p (X_{k + 1} | Z^{k + 1}) \approx Σ_{j = 1}^{N} \frac{p (X_{k + 1}^{j} | Z^{k + 1})}{q (X_{k + 1}^{j} | Z^{k + 1})} δ (X_{k + 1} - X_{k + 1}^{j}) & (9.9) \end{matrix}$

However, to be effective for target tracking, we need to develop a recursive expression for the weights in Equation 9.8 that represent a product of the current state and the previous history.

Thus, for the denominator of the weights in Equation 9.8, we want the recursive relationship:

\begin{matrix} q (X_{k + 1} | Z^{k + 1}) = q (x_{k + 1} | X_{k}, Z^{k + 1}) q (X_{k} | Z^{k}) & (9.10) \end{matrix}

$\begin{matrix} q (X_{k + 1} | Z^{k + 1}) = q (x_{k + 1} | X_{k}, Z^{k + 1}) q (X_{k} | Z^{k}) & (9.10) \end{matrix}$

For the numerator, we want to adapt the pdf correspondingly as

\begin{matrix} p (X_{k + 1}^{j} | Z^{k + 1}) = \frac{p (z_{k + 1} | X_{k + 1}^{j}, Z^{k}) p (X_{k + 1}^{j} | Z^{k})}{p (z_{k + 1} | Z^{k})} & (9.11) \end{matrix}

$\begin{matrix} p (X_{k + 1}^{j} | Z^{k + 1}) = \frac{p (z_{k + 1} | X_{k + 1}^{j}, Z^{k}) p (X_{k + 1}^{j} | Z^{k})}{p (z_{k + 1} | Z^{k})} & (9.11) \end{matrix}$

where the target trajectory history becomes, using the Markov assumption

p (X_{k + 1}^{j} | Z^{k}) = p (x_{k + 1}^{j} | X_{k}^{j}, Z^{k}) p (X_{k}^{j} | Z^{k})

$p (X_{k + 1}^{j} | Z^{k}) = p (x_{k + 1}^{j} | X_{k}^{j}, Z^{k}) p (X_{k}^{j} | Z^{k})$

Then the pdf (Equation 9.11) becomes

\begin{matrix} p (X_{k + 1}^{j} | Z^{k + 1}) = & \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{p (z_{k + 1} | Z^{k})} p (X_{k}^{j} | Z^{k}) \\ = & \frac{1}{c} p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j}) p (X_{k}^{j} | Z^{k}) \end{matrix}

$\begin{matrix} p (X_{k + 1}^{j} | Z^{k + 1}) = & \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{p (z_{k + 1} | Z^{k})} p (X_{k}^{j} | Z^{k}) \\ = & \frac{1}{c} p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j}) p (X_{k}^{j} | Z^{k}) \end{matrix}$

with c being the normalization constant based on the measurement history.

Thus, the weights in Equation 9.8 can now be rewritten as

\begin{matrix} w_{k + 1}^{j} = & \frac{1}{c} \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{q (x_{k + 1}^{j} | X_{k}^{j}, Z^{k + 1})} \frac{p (X_{j}^{i} | Z^{k})}{q (X_{k}^{j} | Z^{k})} \\ = & \frac{1}{c} \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{q (x_{k + 1}^{j} | X_{k}^{j}, Z^{k + 1})} w_{k}^{j} \end{matrix}

$\begin{matrix} w_{k + 1}^{j} = & \frac{1}{c} \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{q (x_{k + 1}^{j} | X_{k}^{j}, Z^{k + 1})} \frac{p (X_{j}^{i} | Z^{k})}{q (X_{k}^{j} | Z^{k})} \\ = & \frac{1}{c} \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{q (x_{k + 1}^{j} | X_{k}^{j}, Z^{k + 1})} w_{k}^{j} \end{matrix}$

By the Markov property, we need only take the previous sample particles drawn from $q (x_{k}^{j} | Z^{k})$ $q (x_{k}^{j} | Z^{k})$ and apply the current measurement z_k+1. We can augment the weights to deal with the last state $x_{k}^{j}$ $x_{k}^{j}$ and the current measurement z_k+1 as

\begin{matrix} w_{k + 1}^{j} \propto \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{q (x_{k + 1}^{j} | x_{k}^{j}, z_{k + 1})} w_{k}^{j} & (9.12) \end{matrix}

$\begin{matrix} w_{k + 1}^{j} \propto \frac{p (z_{k + 1} | x_{k + 1}^{j}) p (x_{k + 1}^{j} | x_{k}^{j})}{q (x_{k + 1}^{j} | x_{k}^{j}, z_{k + 1})} w_{k}^{j} & (9.12) \end{matrix}$

We now have a good approximation to the posterior density of the true target state, Equation 9.7, as

\begin{matrix} p (x_{k + 1} | z^{k + 1}) \approx Σ_{j = 1}^{N} w_{k + 1}^{j} δ (x_{k + 1} - x_{k + 1}^{j}) & (9.13) \end{matrix}

$\begin{matrix} p (x_{k + 1} | z^{k + 1}) \approx Σ_{j = 1}^{N} w_{k + 1}^{j} δ (x_{k + 1} - x_{k + 1}^{j}) & (9.13) \end{matrix}$

using the weights given in Equation 9.12. If the importance function provides an effectively bounded support, that is, q(x) = {x:p(x) > 0}, then as N → ∞, Equation 9.13 converges to the true pdf Equation 9.7.

9.4 Resampling

Ideally, the development thus far should be sufficient to provide an effective particle filtering system. However, it turns out that over time the variances of the weights increase significantly. This will often result in only a small percentage of particles gaining sufficient weight insuring that they continue to be selected over again. Depending on the quality of the match of the importance function, any substantive deviation from the true pdf will provide conditions for divergence. If the measurements appear in the tail of the prior, this will lead to potential failure of the particle filter. Eventually, a single particle will dominate while the remaining particles will develop negligible weights, leading to an ineffective representation for the pdf estimate.

Depending on the degree of dimensionality, this degeneracy effect may take only a small number of updates to occur. One concern with higher dimensionality is that it will become more difficult to select a suitable q(x). A large number of particles are necessary to properly converge, but in general, convergence is not guaranteed. As such, another concern involves inefficiency. Evaluating each of N samples when most of them have negligible weights is computationally inefficient.

A balance is necessary to maintain a sufficiently large particle set for effective importance sampling, while at the same time, managing the effects of degeneracy that severely decrease the number of effective particles.

Resampling leads to an effective solution by recreating a set of new particles whenever the need arises. Examining Figure 9.2, we can get a better perspective of the problem. Step A shows a set of 12 uniformly generated samples ${x_{k + 1}^{j}, N^{- 1}}$ ${x_{k + 1}^{j}, N^{- 1}}$ that approximates the prediction density p(x_k+1|Z^k). These samples provide a uniformly weighted random measure of the particle set derived from the previously updated estimate of the target trajectory. As a new measurement is received, z_k+1 (step B), we determine the likelihood function, p(z_k+1|x_k+1), to compute a new set of particle weights. These weights, $w_{k + 1}^{j}$ $w_{k + 1}^{j}$ , from Equation 9.12, provide a new weighted random measure, ${x_{k + 1}^{j}, w_{k + 1}^{j}}$ ${x_{k + 1}^{j}, w_{k + 1}^{j}}$ (step C), that gives more value to those particles that match the measurement, effectively selecting the most suitable particles—update function, p(x_k+1|Z^k+1).

At this step, we can see where the key concern resides regarding a practical implementation for the particle filter. Assuming that the five most heavily weighted particles dominate the particle set, all the particle diversity lies in two compact regions.

Repeating the cycle for the next time index will only increase the possibility for track failure, leading to the same particles being chosen over again. This problem is referred to as particle impoverishment. The impact is the effect that the track system is trying to predict target dynamics with very small process noise, and could cause divergence between the true path and the predicted estimate.

Images

FIGURE 9.2
Graphical representation of particle filter process cycle.

Resampling/update (step D) focuses on representing the random measure from the most representative particles with a new set ${x_{k + 1}^{j *}, N^{- 1}}$ ${x_{k + 1}^{j *}, N^{- 1}}$ , providing a uniformly resampled estimate of the update density. This brings “new life” to the particle set, insuring that the sample size is large enough to maintain diversity. For instance, in Figure 9.2, the larger weighted particles are used to resample the original set of 12 particles. Note that this “regenerates” the particle set, clustering them around the location of the heavily weighted particles.

To prevent the particle size from becoming too small, we need to develop a metric that assesses when the sample set is beginning to contain too many particles of insufficient weight, before the set becomes too small. An effective metric discussed in Ristic et al.⁹ and devised by Liu et al.²⁹ sets an effectiveness condition that can be compared to a threshold N_THRESH. This metric

1 < N_{Eff} = \frac{1}{Σ_{j = 1}^{N} {(w_{k + 1}^{j})}^{2}} < N_{THRESH} < N

$1 < N_{Eff} = \frac{1}{Σ_{j = 1}^{N} {(w_{k + 1}^{j})}^{2}} < N_{THRESH} < N$

sets the resampling step into place by regenerating N samples over the span of a number of higher-valued importance weights before their number becomes too restrictive, while eliminating samples with very low weights. A common threshold is to set N_THRESH = (2/3)N. The new sample set generates N uniformly weighted samples, providing a “reapproximation” of the updated function, p(x_k+1|Z^k+1). We add track process noise to the resampled set to reintroduce some diversity to the particle set. This new random measure ${x_{k + 1}^{j}, N^{- 1}}$ ${x_{k + 1}^{j}, N^{- 1}}$ approximates the new predictive density p(x_k+1|Z^k+1), and the cycle repeats (step D). This particle filtering cycle is referred to as a forward sequential importance sampling and can be represented by the following sequence:

\underset{(Step A)}{Sampling} \to \underset{(Step B)}{importance weights} \to \underset{(Step C)}{selection} \to \underset{(Step D)}{resampling / propagation}

$\underset{(Step A)}{Sampling} \to \underset{(Step B)}{importance weights} \to \underset{(Step C)}{selection} \to \underset{(Step D)}{resampling / propagation}$

According to Ristic et al.,⁹ resampling improves against degeneracy, but it also introduces another key problem—sample impoverishment.

9.5 Markov Chain Monte Carlo

Sample impoverishment involves the lack of particle diversity with repeated applications of the recycling step. Over time, selected particles will begin to dominate, gaining heavier weights, and could produce multiple generations of offspring through the selection and resampling process. This continued selection of the same set of particles causes a lack of diversity among the particles. In the extreme case, we could see a single particle giving rise to the entire next generation of particles. This is most pronounced when process noise is too small to contain any deviations in actual target motion.

Markov Chain Monte Carlo (MCMC) techniques are an effective counter to this problem, having been instrumental in solving statistical physics problems in the 1950s,³⁰ mainly involving spatial analysis.

The MCMC approach relies on the generation of samples based on the transition process that impacts the target state space through the Markov chain. There are a number of advantages when one considers the discrete-time Markov Chain process. The primary property is the conditional effect of containing the entire history of the distribution of the target state in the previous state. That is,

p (x_{k + 1}^{j} | x_{k}^{j}) = p (x_{k + 1}^{j} | x_{k}^{j}, x_{k - 1}^{j}, ..., x_{0}^{j})

$p (x_{k + 1}^{j} | x_{k}^{j}) = p (x_{k + 1}^{j} | x_{k}^{j}, x_{k - 1}^{j}, ..., x_{0}^{j})$

for each particle.

To converge to a stationary distribution, the Markov chain needs to satisfy the following³¹: (1) It must be irreducible. That is, every state is reachable from every other state (probabilistic connectivity). From the tracking problem, every state defines a dynamic model—linear acceleration models, turning models, move-stop-move models, etc. (2) The chain needs to be aperiodic, which prevents the states from oscillating periodically. (3) The chain must be positive recurrent. The discrete-time Markov chain is guaranteed to return to any state infinitely often. That is, no state becomes prohibited during the track process. Finally, an irreducible, aperiodic Markov chain in the limiting condition always exists and is independent of the initial probability distribution.³²

As an example, consider the two-state problem,¹¹ where

T = [p_{ij}] = [\begin{matrix} 0.9 & 0.1 \\ 0.33 & 0.67 \end{matrix}]

$T = [p_{ij}] = [\begin{matrix} 0.9 & 0.1 \\ 0.33 & 0.67 \end{matrix}]$

Here the term p_ij is the Markov probability of transitioning from state i to state j. This state problem can be interpreted as in Figure 9.3.

The rationale is that targets will travel between two points in the shortest time (p₁₁ = 0.9). The maneuver probability is based on an expected time within a turn. The transition probability from nonmaneuver to maneuver mode should occur relatively infrequently (p₁₂ = 0.1) and the return to a nonmaneuver state should occur as timely as possible (p₂₁ = 0.33).

On the basis of the properties of the Markov chain, the probability of starting from any initial point will stabilize to the distribution p(x) = (0.767, 0.233) and the target distribution will represent the invariant (balanced) density of the chain. The transition kernel

T (X^{*} | X) p (X) = T (X | X^{*}) p (X^{*})

$T (X^{*} | X) p (X) = T (X | X^{*}) p (X^{*})$

satisfies the balance condition, meaning that transition from one state to another is feasible.

A number of MCMC methods have been designed over the years. The Metropolis–Hastings (M-H) is the most popular and widely used.³⁰ Gibbs sampling is another, being a special case of M-H, that is, it represents the simplest of the Markov chain simulation algorithms. The Gibbs is most effective when we can sample directly from the conditional posterior distribution. The M-H is effective when we cannot sample directly from these distributions.³³

Images

FIGURE 9.3
Markov state transition.

9.6 Metropolis–Hastings

Recall that resampling creates conditions of sample impoverishment due to the selection of the same particles over repeated cycles.⁹ To prevent this problem, each cycle needs to undergo a test to determine whether the current state model is sufficient, $x_{k}^{i}$ $x_{k}^{i}$ , or that a possible change of state has taken place, $x_{k}^{i *}$ $x_{k}^{i *}$ , according to the proposed distribution, $T (x_{k}^{i *} | x_{k}^{j})$ $T (x_{k}^{i *} | x_{k}^{j})$ .

The test involves first drawing the new sample $x_{k}^{i *}$ $x_{k}^{i *}$ from $T (x_{k}^{i *} | x_{k}^{i})$ $T (x_{k}^{i *} | x_{k}^{i})$ and accepting the new state $x_{k + 1}^{i} = x_{k}^{i *}$ $x_{k + 1}^{i} = x_{k}^{i *}$ with probability⁹

\begin{matrix} α = min {1, \frac{p (X_{k}^{j *} | Z^{k}) T (x_{k}^{j} | x_{k}^{j *})}{p (X_{k}^{j} | Z^{k}) T (x_{k}^{j *} | x_{k}^{j})}} & (9.14) \end{matrix}

$\begin{matrix} α = min {1, \frac{p (X_{k}^{j *} | Z^{k}) T (x_{k}^{j} | x_{k}^{j *})}{p (X_{k}^{j} | Z^{k}) T (x_{k}^{j *} | x_{k}^{j})}} & (9.14) \end{matrix}$

only if α ≥ u, where u is a uniform random number between 0 and 1. The particles are excited based on either the representation of the zero-mean Gaussian density of the new state $T (x_{k}^{j *} | x_{k}^{j})$ $T (x_{k}^{j *} | x_{k}^{j})$ or the current state $T (x_{k}^{j *} | x_{k}^{j})$ $T (x_{k}^{j *} | x_{k}^{j})$ .

The impact of the M-H process is to “allow” the particles to jump according to the transition matrix, preventing particles from becoming too attached to the current target state model. This has a similar effect of genetic algorithms that use “natural selection” to search for an acceptable solution. Here we search for the most appropriate state.

Figure 9.4 provides a description of the M-H process. We are expanding Figure 9.2 for ease of description, with the M-H–modified process (step D*) replacing the “step D” from the figure, whenever the M-H test has passed.

Images

FIGURE 9.4
Graphical representation of Metropolis–Hastings in particle filter process cycle.

In this case, let the box shown in step D be the uniformly selected random value to be used to test against each particle shown. Let the “dot” {•} refer to the acceptance value α. When α ≥ u, we accept the new state * $x_{k + 1}^{i} = {(x_{k}^{i})}_{T 2}$ $x_{k + 1}^{i} = {(x_{k}^{i})}_{T 2}$ whenever the “dot” lies above the box u. Then, only those particles exceeding the randomly selected u will be given the new state and accepted for update in T2; all other particles are rejected, and maintain the old state, T1.

The M-H process greatly impacts the efficiency of the particle filter. If the jump size is set too small, convergence will be slow; if the jump size is set too large, the particles are rejected and the algorithm maintains its current state.

9.7 Particle Filtering Example

We will consider a bearings-only tracking example for a target state with sampling time T, and operating with a dynamic state in two dimensions: 2D position and 2D velocity. The sampling period T is defined as T = (t_k+1 − t_k); the target state is represented as $x = f (x, \dot{x}, y, \dot{y})$ $x = f (x, \dot{x}, y, \dot{y})$ , where the position and velocity terms, respectively, are

x (k + 1) = x (k) + \dot{x} (k) dt

$x (k + 1) = x (k) + \dot{x} (k) dt$

and

\dot{x} (k + 1) = \dot{x} (k) + v (k)

$\dot{x} (k + 1) = \dot{x} (k) + v (k)$

where v(k) is the zero-mean Gaussian process noise.

Sensor observations are based on nonlinear angle-only measurements^†

\begin{matrix} z (k + 1) = {tan}^{- 1} [\frac{(y (k) - y_{s})}{(x (k) - x_{s})}] + w (k) & (9.15) \end{matrix}

$\begin{matrix} z (k + 1) = {tan}^{- 1} [\frac{(y (k) - y_{s})}{(x (k) - x_{s})}] + w (k) & (9.15) \end{matrix}$

where (x_s, y_s) is the sensor location and w(k) is the zero-mean Gaussian measurement noise.

Scenario settings include two cases: using either one or two sensors. For the case with two sensors, only one sensor can provide measurements at a time, collected independently.

The particle filter is initialized by a uniform distribution of particles of a size allowed to subsume the true target pdf. These initial particles are fed to the prediction step that compares each particle to the bearing measurement statistically. That is, a normal pdf value is used with a mean difference between the sensor measurement and the arctangent of the estimated target position and a standard deviation based on the measurement error.

An inherent difficulty with using the normal pdf as a comparison of the target estimate in terms of the sensor bearing angle measurement is dealing with outlier measurements. Whenever the particle set exceeds three standard deviations, the statistical values produce very small weights. Often these weights are close to zero. When this effect occurs, a new set of particles needs to be regenerated from the previous time index to attempt to provide a new particle set representatively closer to the bearing measurement. This is attempted up to a number of times, until an appropriate set of weights are determined, or the measurement is rejected (at the last attempt, currently set as the 20th) and the target propagates based on its dynamic model.

Next, the particle set is resampled based on either the surviving weighted particles or a new uniform weighting.

Finally, a M-H sampling algorithm is used when conditions indicate a possibility that the true target path deviates significantly from the estimated target dynamic model. We use the measurement z(k + 1) to calculate the weights and resample the particles at time k. A move is performed, if necessary. We choose the new dynamic model by calculating $T (x_{k}^{i *} | x_{k}^{i}) = N [x^{*} (k), σ^{2}]$ $T (x_{k}^{i *} | x_{k}^{i}) = N [x^{*} (k), σ^{2}]$ for both dimensions, x(k) and y(k). The dynamic model change is accepted based on Equation 9.14 as follows:

α = min {1, \frac{T (x_{k}^{i *} | x_{k}^{i})}{T (x_{k}^{i} | x_{k}^{i *})}}

$α = min {1, \frac{T (x_{k}^{i *} | x_{k}^{i})}{T (x_{k}^{i} | x_{k}^{i *})}}$

This could continue up to N steps. Currently N is set to 1.

9.8 Provide a Set of Performance Evaluations

Figure 9.5 shows a particle filtering bearing-only scenario for both one and two sensors. A target track is initiated for a target heading southeast and abruptly turns southwest approximately half-way through the scenario. The sensors are bearing-only, providing accurate angle information on the target, but contain no range observations. As such, the single-sensor example must rely on the accuracy of particle filter target dynamic estimates to represent true target dynamics, whereas, for the two-sensor case, target estimates can take advantage of sensor diversity to build a better track estimate.

The particle filter is a standard bootstrap filter * that uses a Gaussian pdf (importance density) to choose the particle weights (Equation 9.12) by comparing the predicted particle estimate with the sensor measurement (Equation 9.15). The filter then tests to see if the weights are close to zero (outside the range of the measurement). If this occurs, the algorithm attempts to try to randomly readjust the estimate back to the measurement (attempting this up to 20 times) from the previous time step, which if reached, the new weights are applied to the update. If not, the dynamic model propagates the particles. Regardless, the particles are updated according to Equation 9.13. The weights are then used to repropagate the surviving particles after resampling. Finally, a comparison is made between the resampled particles and an update based on the dynamic model using the M-H (Equation 9.14). The resultant particle set is used to update the filter.

Sample tracking results are shown in Figure 9.6 for both the single-sensor and the two-sensor conditions.

Note the track results for the single-sensor case. In a number of instances, the track estimate tends to veer from the true target track, but particle spread brings it back into line when track angle estimates become incongruous with bearing measurements. This is quite evident during the maneuver, where the M-H was necessary to adjust track estimates to meet new dynamic conditions. For the two-sensor case, however, the algorithm can characterize angular mismatch much sooner due to the geographic diversity of the two sensors, keeping particle expansion in check. The only way to maintain a significantly smaller track error in the single-sensor is to develop more accurate dynamic model error.

Images

FIGURE 9.5
Particle filtering scenario involving a single sensor and two sensors.

Images

FIGURE 9.6
Sample performance involving a single sensor and two sensors.

For each test, we applied 100 Monte Carlo runs. We show the effects of different sized particle sets ranging from 100 to 1000 particles in increments of 100. Four different sensor bearing errors, σ_z, are explored that distinctly affect the particle weights, and ultimately the dispersion of the particles over time through the normal pdf, N(z−arctan(y/x), σ_z). The arctan(y/x) provides the mean-target estimate conversion into measurement space. The bearing errors (STD) are measured in radian terms as [0.001, 0.003, 0.005, 0.010], corresponding to angular errors, in degrees, of [0.34, 1.03, 1.72, 3.44], respectively, giving 3σ accuracy bounds corresponding, in kilometers, to approximately [0.54, 1.62, 2.70, 5.40] errors when the target is nearest to sensor 1, early in the track (about 90 km) and bounds twice as large at the time the target enters the maneuver (approximately 180 km from sensor 1).

One problem with target tracking, in general, and in particular particle filtering, is building the target dynamic model to meet a wide variety of target track conditions. A wide particle dispersal pattern will provide a greater chance for particles to “fall” within the bearing measurement range, but causes a wider variation on the target track estimate. A tighter pattern shows a more “stable” track estimate, but could cause track divergence when all particles fall outside the beam. This later condition corresponds to particles falling outside the 3σ Gaussian limits of the bearing measurement, and occurs many times for the first bearing measurement error (0.001 rad).

By using this approach, the first bearing measurement error (0.001 rad) enforces very tight constraints on the track estimate. In many cases, this constrained smaller error causes the track estimate to diverge from the true target track in part due to the algorithm setting and in part due to this use of the 3σ range, which sometimes favors the larger sensor error cases. For small errors, this forces the algorithm to spend a greater deal of time trying to catch up to the current measurement. It also points out the effects of when the algorithm’s track estimate “tries to catch up” to the target (sometimes succeeding), or when it “loses” the target. Both of these cases cause the particle density to increase its “search” space. In the former case, this slows the algorithm down for a period of time until it does catch up; in the later case, the algorithm attempts to spend a great deal of time with each particle (up to 20 tries) to search for the target measurement. The remaining three bearing measurement errors pose no problem for the particle filter.

Four parametric conditions are examined: (1) normalized run time is the time to complete a track run, averaged over the 100 Monte Carlo runs. This is obviously machine-dependent and has not been optimized for performance. However, it still provides a reasonable estimate of overall performance. (2) Tracking rate is the percentage of time that the true target position is within 3σ of at least one particle. This indicator tells us that the particle filter has not deviated so much as to cause irreversible track divergence. (3) The average number of effective particles provides the average number of particles within 3σ of the true target position. Tracking rate shows that the filter error contains the true target, whereas the average number of effective particles gives an estimate of particle spread by telling us how many of the particles are clustered near the true target location. (4) Root mean square (RMS) tracking error provides a quantitative estimate of how close the particles actually are to the true target position. We have modified the RMS conditions in this case to show only those particles within the 3σ distance to the true target position. This is because under conditions where the target estimate begins to diverge (see Figure 9.6), using the total number of particles may show RMS divergence many times larger than as indicated, giving a misleading estimate of performance. This parameter, together with the average number of effective particles, provides a good quality indicator of the ability of the particle track estimate to remain “close” to the true target position. Figures 9.7 through 9.10 provide these results.

Images

FIGURE 9.7
Normalized run time for single-sensor and two-sensor cases.

Images

FIGURE 9.8
Tracking rate for single-sensor and two-sensor cases (when at least one particle is within 3σ of the target).

Images

FIGURE 9.9
Average number of effective particles for single-sensor and two-sensor cases (average number of particles within 3σ of the target).

Images

FIGURE 9.10
Root mean square tracking error for single-sensor and two-sensor cases (when the particles are within 3σ error).

Figure 9.7 provides the normalized run time for the single-sensor and two-sensor cases. For the two-sensor case, note that normalized run time performance is consistently good for the higher bearing errors, corresponding to good track estimate. For the smallest error (0.34°), however, we have a high rate of track divergence causing the particles to often fall outside the bearing range, which gives high run times (as much as three times processing speed). For the bearing errors in the single-sensor case, processing speeds are dominated by the algorithm’s “tests” to ensure that the particles stay within bounded measurement limits (see Figure 9.6). As the particle density increases, processing speeds increase linearly.

The tracking rate is shown in Figure 9.8. For both single- and two-sensor cases, the small bearing error gives poor performance, due to a higher rate of track divergence. However, for each of the other bearing errors, high track rate can be seen to converge, starting with a particle set of between 200 and 300 particles, giving a strong indication that the track will be maintained.

Figure 9.9 shows the average number of effective particles that fall within 3σ of the target for both cases. Again, the small (0.34°) beam pointing error shows that track divergence has difficulty maintaining the track giving the poorest performance, whereas the other three bearing errors tend to converge quicker for increasing numbers of particles. As expected, the two-sensor case shows rapid convergence, demonstrating that for particle densities as low as 200, at least 50% of the particles are near the true target position; however, for the single-sensor condition, good performance requires a density of at least 400 particles to ensure this 50% condition.

Finally, Figure 9.10 shows the RMS tracking error only for those particles within 3σ of the target. For the single-sensor case, tracking errors are relatively small, except during the target-turn, which occurred around time index 50. While the smaller pointing error (0.34°) shows small RMS errors, this is to be expected as this occurs only for those particles that fall within 3σ of the target—close to the true target position. However, together with the average number of effective particles (Figure 9.9), we see that only 30% of these particles on average actually remain close to the true target. For larger bearing errors, a greater number of particles (and a larger spread) fall within the acceptable error bound giving rise to somewhat higher RMS error—at least 50% of the particles being within the 3σ bound. The two-sensor case shows a significant improvement over the single-sensor case, as expected.

9.9 Summary

This chapter has provided an overview of the development of particle filtering from recognition of the limits of the Kalman filter that led to the EKF, to the conception of particle filtering. Discussion of the development led to the use of importance sampling to characterize acceptable density support functions that reasonably encompass the true target pdf even though the target pdf may not be known. Problems inherent to particle filter development and use were also discussed. Finally, an example scenario was provided that shows some of the overall conditions expected in particle filter tracking along with metrics that help to assess tracking performance. Many articles presenting particle filters also provide excellent pseudocode adequate for developing particle filter simulations. For instance, as Ristic et al.⁹ builds the theoretical development of the particle filter, sample pseudocode descriptions are provided, with each subsequent section building upon the previous pseudocode descriptions.

Additional efforts continue to expand the capabilities of the particle filters. For instance, in an upcoming IEEE issue³⁴ methods are provided to apply particle filters to dynamic Bayesian networks to address nonlinearity and hybrid state conditions in sequential inferencing for complex dynamic systems.

Additionally, particle filtering plays a role in fault diagnostics for autonomous operation of mobile robots,³⁵ and recent developments have provided reconfigurable adaptation of a suite of particle filters for domain-specific processing,³⁶ just to name a few examples.

References

1. R. Kalman, A new approach to linear filtering and prediction problems, Transactions of AME, Journal of Basic Engineering, 82, 35–45, 1960.

2. E. Brookner, Tracking and Kalman Filtering Made Easy, Wiley, New York, NY, 1998.

3. Y. Bar-Shalom (Ed.) Multitarget-Multisensor Tracking: Advanced Applications, Vol. I, Artech House, Norwood, MA, 1990.

4. Y. Bar-Shalom (Ed.) Multitarget-Multisensor Tracking: Applications and Advances, Vol. II, Artech House, Norwood, MA, 1992.

5. Y. Bar-Shalom and W. Blair, Eds., Multitarget-Multisensor Tracking: Applications and Advances, Vol. III, Artech House, Norwood, MA, 2000.

6. S. Blackman, Multiple-Target Tracking with Radar Applications, Artech House, Norwood, MA, 1986.

7. S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems, Artech House, Norwood, MA, 1999.

8. M. Grewal and A. Andrews, Kalman Filtering: Theory and Practice, Prentice Hall, Upper Saddle River, NJ, 1993.

9. B. Ristic, S. Arulampalam, and N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, Boston, MA, 2004.

10. H. Stark and J. Woods, Probability, Random Processes, and Estimation Theory for Engineers, Prentice Hall, Upper Saddle River, NJ, 1986.

11. Y. Bar-Shalom and X.R. Li, Multitarget Multisensor Tracking: Principles and Techniques, YBS Publishing, Storrs, CT, 1995.

12. A. Gelman, J. Carlin, H. Stern, and D. Rubin, Bayesian Data Analysis, Chapman and Hall/CRC, Boca Raton, FL, 2004.

13. E. Dudewicz and S. Mishra, Modern Mathematical Statistics, Wiley Series in Probability & Mathematical Statistics, Wiley, New York, NY, 1988. [The work is discussed in Chapter 6.3, Central Limit Theorem, Limit Laws, Applications.]

14. M. Sanjeev Arulampalam, S. Maskell, N. Gordon, and T. Clapp, A tutorial on particle filters for online non-linear/non-Gaussian Bayesian tracking, IEEE Transactions on Signal Processing, 50(2), 174–188, 2002.

15. C. Andrieu, A. Doucet, S. Singh, and V. Tadic, Particle methods for change detection, system identification, and control, Proceedings of the IEEE, Special Issue on Sequential State Estimation, 92(3), 423–438, 2002.

16. C. Kwok, D. Fox, and M. Meila, Real-time particle filters, Proceedings of the IEEE, Special Issue on Sequential State Estimation, 92(3), 469–484, 2002.

17. C. Agate and K. Sullivan, Road-constrained target tracking and identification using a particle filter, Signal and Data Processing of Small Targets 2003, Proceedings of SPIE, Vol. 5204, 2003.

18. M. Rutten, N. Gordon, S. Maskell, Particle-based track-before-detect in Rayleigh noise, Signal and Data Processing of Small Targets 2004, Proceedings of SPIE, Vol. 5428, 2004.

19. M. Ulmke, On multitarget track extraction and maintenance using sequential Monte Carlo methods, Signal and Data Processing of Small Targets 2005, Proceedings of SPIE, Vol. 5913, 2005.

20. E. Blasch, A. Rice, and C. Yang, Nonlinear track evaluation using absolute and relative metrics, Signal and Data Processing of Small Targets 2006, Proceedings of SPIE, Vol. 6236, 2006.

21. H. Kamel and W. Badawy, Comparison between smoothing and auxiliary particle filter in tracking a maneuvering target in a multiple sensor network, Acquisition, Tracking, and Pointing XX, SPIE, Vol. 5810, 2005.

22. T. Clemons III and K.C. Chang, Estimation filters for missile tracking with airborne laser, Acquisition, Tracking, and Pointing XX, SPIE, Vol. 6238, 2006.

23. F. Gustafsson, F. Gunnarsson, N. Bergman, U. Forssell, J. Jansson, R. Karlsson, and P.J. Nordlund, Particle filters for position, navigation, and tracking, IEEE Transactions on Signal Processing, Special Issue on Monte Carlo Methods for Statistical Signal Processing, 50(2), 425–437, 2002.

24. D. Crisan and A. Doucet, A survey of convergence results on particle filtering methods for practitioners, IEEE Transactions on Signal Processing, 50(3), 736–746, 2002.

25. S. Arulampalam and B. Ristic, Comparison of the particle filter with range-parameterized and modified polar EKFs for angle-only tracking, Signal and Data Processing of Small Targets 2000, Proceedings of SPIE, Vol. 4048, 2000.

26. K. Ezal and C. Agate, Tracking and interception of ground-based RF sources using autonomous guided munitions with passive-bearings-only sensors and tracking algorithms, Acquisition, Tracking, and Pointing XVIII, Proceedings of SPIE, Vol. 5430, 2004.

27. A. Marrs, S. Maskell, and Y. Bar-Shalom, Expected likelihood for tracking in clutter with particle filters, Signal and Data Processing of Small Targets 2002, Proceedings of SPIE, Vol. 4728, 2002.

28. R. Van der Merwe, A. Doucet, N. De Freitas, and E. Wan, The unscented particle filter, Technical Report, CUED/F-INFENG/TR 380, Cambridge University Engineering Department, August 2000.

29. J. Liu and R. Chen, Sequential Monte Carlo methods for dynamical systems, Journal of the American Statistical Association, 93, 1032–1044, 1998.

30. C. Andrieu, N. Freitas, A. Doucet, and M. Jordan, Introduction to Markov Chain Monte Carlo for Machine Learning, Kluwer Academic Publishers, The Netherlands, 2001.

31. W. Gilks, S. Richardson, and D. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman and Hall, London, UK, 1996.

32. W. Steward, Introduction to Numerical Solutions of Markov Chains, Princeton University Press, Princeton, NJ, 1994.

33. A. Gelman, J. Carlin, H. Stern, and D. Rubin, Bayesian Data Analysis, Chapman and Hall/CRC, Boca Raton, FL 2004.

34. H. Chen and K.C. Chang, K-nearest neighbor particle filters for dynamic hybrid Bayesian networks, IEEE Transactions on Aerospace and Electronic Systems, 44(2), 2008.

35. N. de Freitas, R. Dearden, F. Hutter, R. Morales-Menendez, J. Mutch, and D. Poole, Diagnosis by a waiter and a Mars explorer, Proceedings of the IEEE, Special Issue on Sequential State Estimation, 92(3), 455–468, 2002.

36. S. Hong, J. Lee, A. Athalye, P, Djuric, and W.D. Cho, Design methodology for domain specific parameterizable particle filter realizations, IEEE Transactions on Circuits and Systems I: Regular Papers, 54(9), 1987–2000, 2007.

* The author’s affiliation with the MITRE Corporation is provided only for identification purposes and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions, or viewpoints expressed by the author.

* The unscented Kalman filter is described in detail in Chapter 15. As such, it will not be discussed here.

* See Chapter 15 for a more detailed discussion.

* N(x; m, σ²) represents a Gaussian density on x, with a mean m and a standard deviation σ.

* $δ (X_{k + 1} - X_{k + 1}^{j})$ $δ (X_{k + 1} - X_{k + 1}^{j})$ is the Dirac delta function.

† Also known as sequential importance sampling.

* T1 and T2 refer to the Markov transition states. In this case, when α ≥ u, we “move” the particle to the new state T2, otherwise the particle remains in its present state, T1.

† The model is only valid for small bearing errors where noise is additive, but is a good representation to demonstrate the effects of nonlinear measurements.

* Also known as a sequential importance resampling filter.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 9 Introduction to Particle Filtering: The Next Stage in Tracking

Create new playlist

Sign In

Sign Up

Table of Contents for
9 Introduction to Particle Filtering: The Next Stage in Tracking