This chapter provides an overview of the techniques introduced so far when a random sample of landmark configurations is available in two dimensions. In this chapter we make particular use of complex numbers, which lead to simple methodology in this important case. Although most of the material for this chapter has been described using general matrix notation in previous chapters, it is often much simpler to use complex vectors in the 2D case. This chapter is designed to be largely self-contained for planar shape analysis using complex notation.
Consider two centred configurations $y = (y_1, \ldots, y_k)^T$ and $w = (w_1, \ldots, w_k)^T$, both in $\mathbb{C}^k$, with $y^*1_k = 0 = w^*1_k$, where $y^*$ denotes the transpose of the complex conjugate of $y$. In order to compare the configurations in shape we need to establish a measure of distance between the two shapes.
A suitable procedure is to match w to y using the similarity transformations; the differences between the fitted and observed y then indicate the magnitude of the difference in shape between w and y. Consider the complex regression equation
$$y = (a + ib)1_k + \beta e^{i\theta} w + \epsilon = [1_k, w]A + \epsilon = X_D A + \epsilon,$$
where $A = (A_1, A_2)^T = (a + ib, \beta e^{i\theta})^T$ are the 2 × 1 complex parameters with translation a + ib, scale β > 0 and rotation 0 ≤ θ < 2π; ε is a k × 1 complex error vector; and $X_D = [1_k, w]$ is the k × 2 ‘design matrix’. To carry out the registration we could estimate A by minimizing the least squares objective function, the sum of square errors
$$D^2(y, w) = \epsilon^*\epsilon = (y - X_D A)^*(y - X_D A).$$
The full Procrustes registration of w on y is obtained by estimating A with $\hat{A}$, where
$$\hat{A} = \arg\inf_A\, (y - X_D A)^*(y - X_D A).$$
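Definition 8.1 The full Procrustes fit (registration) of w onto y is:
$$w^P = X_D\hat{A} = (\hat{a} + i\hat{b})1_k + \hat{\beta} e^{i\hat{\theta}} w,$$
where $\hat{\beta}, \hat{\theta}, \hat{a}, \hat{b}$ are chosen to minimize
$$D^2(y, w) = \| y - \beta e^{i\theta} w - (a + ib)1_k \|^2.$$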
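Result 8.1 The full Procrustes fit has matching parameters
$$\hat{a} + i\hat{b} = 0, \qquad \hat{\theta} = \arg(w^*y) = -\arg(y^*w), \qquad \hat{\beta} = \frac{(w^*y\, y^*w)^{1/2}}{w^*w}.$$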
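Proof: We wish to minimize (over β, θ, a, b) the expression
$$D^2(y, w) = \| y - \beta e^{i\theta} w - (a + ib)1_k \|^2 = \| y - \beta e^{i\theta} w \|^2 + k(a^2 + b^2)$$
(remember y and w are centred). Clearly, the minimizing a and b are zero. Let $y^*w = \gamma e^{i\phi}$ (γ ≥ 0) and then
$$\| y - \beta e^{i\theta} w \|^2 = y^*y + \beta^2 w^*w - y^*w\,\beta e^{i\theta} - w^*y\,\beta e^{-i\theta} = y^*y + \beta^2 w^*w - 2\beta\gamma\cos(\theta + \phi).$$
So to minimize $\| y - \beta e^{i\theta} w \|^2$ over θ we need to maximize 2βγcos(θ + ϕ). Clearly, a solution for θ is $\hat{\theta} = -\phi = -\arg(y^*w)$. To find the minimizing scale we set θ = $\hat{\theta}$ and solve
$$\frac{\partial D^2}{\partial \beta} = 2\beta\, w^*w - 2\gamma = 0,$$
where $\gamma = |y^*w|$. Hence,
$$\hat{\beta} = \frac{\gamma}{w^*w} = \frac{(w^*y\, y^*w)^{1/2}}{w^*w},$$
as required.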
The solution is the standard least squares solution (but with complex variables) and we can write the solution in the familiar form
$$\hat{A} = (X_D^* X_D)^{-1} X_D^* y.$$
Note that the full Procrustes fit of w onto y is given explicitly by:
$$w^P = X_D\hat{A} = \frac{(w^*y)\, w}{w^*w}.$$
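The residual vector is given by:
$$r = y - X_D\hat{A} = \left[ I_k - X_D (X_D^* X_D)^{-1} X_D^* \right] y = (I_k - H_{hat})\, y,$$
where $H_{hat}$ is the ‘hat’ matrix for $X_D$, that is
$$H_{hat} = X_D (X_D^* X_D)^{-1} X_D^*.$$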
The minimized value of the objective function is:
$$D^2(y, w) = r^*r = y^*y - \frac{y^*w\, w^*y}{w^*w}. \tag{8.7}$$
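The regression above can be checked numerically. The following is a minimal base R sketch, assuming two small simulated configurations (the data, the helper `cprod` and all variable names are illustrative and do not come from the text). It solves the complex normal equations for A, forms the hat matrix, and compares the residual sum of squares with the closed form y*y − (y*w)(w*y)/(w*w).

```r
# Illustrative data only: two simulated planar configurations with k = 5 landmarks
set.seed(1)
k <- 5
centre <- function(z) z - mean(z)              # remove the centroid
w <- centre(complex(real = rnorm(k), imaginary = rnorm(k)))
# y is a rotated, rescaled copy of w plus a little noise, then centred
y <- centre(1.5 * exp(1i * 0.7) * w +
            complex(real = rnorm(k, sd = 0.1), imaginary = rnorm(k, sd = 0.1)))

cprod <- function(u, v) sum(Conj(u) * v)       # u*v: conjugate transpose of u times v

# Design matrix X_D = [1_k, w] and complex least squares estimate of A
XD   <- cbind(rep(1 + 0i, k), w)
XDs  <- Conj(t(XD))                            # X_D*
Ahat <- solve(XDs %*% XD, XDs %*% y)

# Hat matrix, residual vector and minimized objective
Hhat <- XD %*% solve(XDs %*% XD) %*% XDs
r    <- as.vector((diag(k) - Hhat) %*% y)
D2_regression <- Re(cprod(r, r))

# Closed form for the minimized objective
D2_closed <- Re(cprod(y, y)) - Mod(cprod(y, w))^2 / Re(cprod(w, w))

# Closed-form matching parameters (scale and rotation)
beta_hat  <- Mod(cprod(w, y)) / Re(cprod(w, w))
theta_hat <- Arg(cprod(w, y))
```

In this sketch the first element of `Ahat` (the translation) is zero up to rounding because both configurations are centred, and the second element should agree with `beta_hat * exp(1i * theta_hat)`.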
Now this expression is not symmetric in y and w unless $y^*y = w^*w$. A convenient standardization is to take the configurations to be unit size, that is
$$\sqrt{y^*y} = \sqrt{w^*w} = 1.$$
So, if we include standardization, then we obtain a suitable measure of shape distance.
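Definition 8.2 The full Procrustes distance between complex configurations w and y is given by:
$$d_F(w, y) = \inf_{\beta, \theta, a, b} \left\| \frac{y}{\|y\|} - \frac{w}{\|w\|}\,\beta e^{i\theta} - (a + ib)1_k \right\| = \left\{ 1 - \frac{y^*w\, w^*y}{w^*w\, y^*y} \right\}^{1/2}.$$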
The expression for the distance follows from Equation (8.7).
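As an illustration of the distance, here is a short base R sketch, assuming two hand-made quadrilaterals as data (the function name `full_procrustes_dist` and the helper `cprod` are hypothetical, not functions from any package). It evaluates the closed form {1 − (y*w)(w*y)/(w*w y*y)}^{1/2} and checks that the result is symmetric and unchanged by a similarity transformation.

```r
cprod <- function(u, v) sum(Conj(u) * v)    # u*v: conjugate transpose of u times v

# Full Procrustes distance between two planar configurations (complex k-vectors)
full_procrustes_dist <- function(w, y) {
  w <- w - mean(w)                          # centre both configurations
  y <- y - mean(y)
  rho2 <- Mod(cprod(y, w))^2 / (Re(cprod(w, w)) * Re(cprod(y, y)))
  sqrt(max(0, 1 - rho2))                    # guard against tiny negative rounding error
}

# Illustrative configurations: a square and a slightly perturbed square
w <- c(0 + 0i, 1 + 0i, 1 + 1i, 0 + 1i)
y <- c(0 + 0i, 1.1 + 0i, 1 + 0.9i, 0 + 1i)

d_wy <- full_procrustes_dist(w, y)
d_yw <- full_procrustes_dist(y, w)          # same value: the distance is symmetric

# Translating, rotating and rescaling w leaves the distance to y unchanged
w_moved <- (2 - 3i) + 4 * exp(1i * pi / 3) * w
d_moved <- full_procrustes_dist(w_moved, y)
```

Because the function centres and standardizes internally, raw digitized coordinates can be passed in directly.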
Note that the full Procrustes fit of w onto y is actually obtained by complex linear regression of y on w, which is a simpler procedure than working with minimization over rotation matrices as in Chapter 3.
This is not the only choice of distance between shapes, and further choices of distance were considered in Chapter 3. However, the full Procrustes distance is natural from a statistical point of view, obtained from a least squares criterion and optimizing over the full set of similarity parameters. The squared full Procrustes distance naturally appears exponentiated in the density for many simple probability distributions for shape, as we shall see in Chapter 10.
Consider the situation where a random sample of configurations $w_1, \ldots, w_n$ is available and we wish to estimate a population mean shape, such as the population full Procrustes mean.
Definition 8.3 The full Procrustes estimate of mean shape $[\hat{\mu}]$ is obtained by minimizing (over μ) the sum of square full Procrustes distances from each $w_i$ to an unknown unit size mean configuration μ, that is
$$[\hat{\mu}] = \arg\inf_{\mu} \sum_{i=1}^{n} d_F^2(w_i, \mu),$$
as previously seen in Equation (6.11). Again we assume that the configurations $w_1, \ldots, w_n$ have been centred, so that $w_i^* 1_k = 0$.
Result 8.2 (Kent 1994) The full Procrustes mean shape $[\hat{\mu}]$ can be found as the eigenvector corresponding to the largest eigenvalue of the complex sum of squares and products matrix
$$S = \sum_{i=1}^{n} \frac{w_i w_i^*}{w_i^* w_i} = \sum_{i=1}^{n} z_i z_i^*,$$
where the $z_i = w_i / \|w_i\|$, i = 1, …, n, are the pre-shapes.
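Proof: We wish to minimize
$$\sum_{i=1}^{n} d_F^2(w_i, \mu) = \sum_{i=1}^{n} \left( 1 - \frac{\mu^* w_i w_i^* \mu}{w_i^* w_i\, \mu^*\mu} \right) = n - \frac{\mu^* S \mu}{\mu^*\mu}.$$
Therefore,
$$\hat{\mu} = \arg\sup_{\|\mu\| = 1} \mu^* S \mu.$$
Hence, $\hat{\mu}$ is given by the complex eigenvector corresponding to the largest eigenvalue of S [using e.g. Mardia et al. 1979, Equation (A.9.11)]. All rotations of $\hat{\mu}$ are also solutions, but these all correspond to the same shape $[\hat{\mu}]$.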
The eigenvector is unique (up to a rotation) provided there is a single largest eigenvalue of S (which is the case for most practical datasets). We shall see in Section 10.2 that the solution corresponds to the MLE of modal shape under the complex Bingham model. Note that in this special case we do not need to use the iterative GPA algorithm of Section 7.3, and this is a further indication of the 2D case being special using complex arithmetic. In order to estimate the full Procrustes mean shape with this method we can use the R command:
procGPA(data,eigen2d=TRUE)
and examples are given in Section 8.4.
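To make the eigenvector construction concrete, the following is a minimal base R sketch, assuming simulated data built from a hypothetical template (it is not a description of how `procGPA` is implemented internally). It forms the pre-shapes, builds the complex sum of squares and products matrix S, and takes the leading eigenvector as the full Procrustes mean.

```r
# Illustrative data: n noisy, similarity-transformed copies of a hypothetical template
set.seed(2)
k <- 6; n <- 30
template <- complex(real      = c(-1, 1, 0.2, 0, -0.2, 0),
                    imaginary = c(0, 0, 0.3, 0.8, 0.3, -0.9))
W <- sapply(seq_len(n), function(i) {
  noise <- complex(real = rnorm(k, sd = 0.02), imaginary = rnorm(k, sd = 0.02))
  runif(1, 0.5, 2) * exp(1i * runif(1, 0, 2 * pi)) * (template + noise) +
    complex(real = rnorm(1), imaginary = rnorm(1))
})                                   # k x n complex matrix, one column per configuration

# Pre-shapes: centre each column and scale it to unit size
Z <- sweep(W, 2, colMeans(W))
Z <- sweep(Z, 2, sqrt(colSums(Mod(Z)^2)), "/")

# Complex sum of squares and products matrix S = sum_i z_i z_i*
S <- Z %*% Conj(t(Z))

# S is Hermitian, so eigen() returns real eigenvalues in decreasing order
e <- eigen(S, symmetric = TRUE)
mu_hat <- e$vectors[, 1]             # unit size, defined up to an arbitrary rotation
```

No iteration is needed: a single eigendecomposition of a k × k matrix gives the estimate, whatever the sample size n.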
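The full Procrustes fits or full Procrustes coordinates of $w_1, \ldots, w_n$ are:
$$w_i^P = \frac{(w_i^* \hat{\mu})\, w_i}{w_i^* w_i}, \quad i = 1, \ldots, n,$$
where each $w_i^P$ is the full Procrustes fit of $w_i$ onto $\hat{\mu}$. The full Procrustes mean shape can also be obtained by taking the arithmetic mean of the full Procrustes coordinates, that is $\frac{1}{n}\sum_{i=1}^{n} w_i^P$ has the same shape as the Procrustes mean shape $[\hat{\mu}]$ (see Result 8.2). The Procrustes residuals are calculated as:
$$r_i = w_i^P - \left[ \frac{1}{n} \sum_{j=1}^{n} w_j^P \right], \quad i = 1, \ldots, n, \tag{8.13}$$
and the Procrustes residuals are useful for investigating shape variability (see Section 7.7).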
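The Procrustes coordinates and residuals can be computed in a few lines. This is a base R sketch on simulated data (the template and all variable names are assumptions made for illustration): each centred configuration is fitted onto the mean, the fits are averaged, and the residuals are the deviations from that average.

```r
# Illustrative data: rotated, noisy copies of a hypothetical template
set.seed(3)
k <- 5; n <- 20
template <- complex(real = c(0, 2, 3, 1, -1), imaginary = c(0, 0, 2, 3, 1))
W <- sapply(seq_len(n), function(i)
  exp(1i * runif(1, 0, 2 * pi)) *
    (template + complex(real = rnorm(k, sd = 0.05), imaginary = rnorm(k, sd = 0.05))))
W <- sweep(W, 2, colMeans(W))                     # centred configurations w_i

# Full Procrustes mean from the leading eigenvector of S
Z <- sweep(W, 2, sqrt(colSums(Mod(W)^2)), "/")    # pre-shapes z_i
mu_hat <- eigen(Z %*% Conj(t(Z)), symmetric = TRUE)$vectors[, 1]

# Full Procrustes coordinates: w_i^P = (w_i* mu_hat) w_i / (w_i* w_i)
b  <- as.vector(Conj(t(W)) %*% mu_hat) / colSums(Mod(W)^2)
WP <- sweep(W, 2, b, "*")                         # column i is w_i^P

# Arithmetic mean of the Procrustes coordinates and the Procrustes residuals
wbar <- rowMeans(WP)
R <- WP - wbar                                    # column i is r_i
```

Since each fit equals z_i z_i* applied to the mean, the average of the fits is S times the mean divided by n, that is, a positive real multiple (λ₁/n, with λ₁ the largest eigenvalue of S) of `mu_hat`. This is why the arithmetic mean of the Procrustes coordinates has the same shape as the eigenvector estimate.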
An alternative equivalent procedure to working with centred configurations would be to work with the Helmertized landmarks $Hw_i$, where H is the sub-Helmert matrix given in Equation (2.10). This procedure was originally used by Kent (1991, 1992, 1994). The least squares estimate of shape is the leading eigenvector $\hat{\mu}_H$ of $HSH^T$. Note that $H^T\hat{\mu}_H$ is identical to $\hat{\mu}$, up to an arbitrary rotation.
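The equivalence of the two routes can be checked numerically. The sketch below is base R on simulated data; `helmert_sub` is a hypothetical helper that builds the sub-Helmert matrix with row j equal to (h_j, …, h_j, −j h_j, 0, …, 0) and h_j = −{j(j + 1)}^{−1/2}, which is assumed here to match Equation (2.10).

```r
# Sub-Helmert matrix: (k - 1) x k, rows orthonormal and orthogonal to 1_k
helmert_sub <- function(k) {
  H <- matrix(0, k - 1, k)
  for (j in seq_len(k - 1)) {
    hj <- -1 / sqrt(j * (j + 1))
    H[j, seq_len(j)] <- hj            # h_j repeated j times
    H[j, j + 1] <- -j * hj            # followed by -j * h_j, then zeros
  }
  H
}

# Illustrative data: raw (uncentred) noisy copies of a hypothetical template
set.seed(5)
k <- 5; n <- 20
template <- complex(real = c(0, 2, 3, 1, -1), imaginary = c(0, 0, 2, 3, 1))
W <- sapply(seq_len(n), function(i)
  (3 + 2i) + template +
    complex(real = rnorm(k, sd = 0.05), imaginary = rnorm(k, sd = 0.05)))

# Route 1: centred pre-shapes
Z <- sweep(W, 2, colMeans(W))
Z <- sweep(Z, 2, sqrt(colSums(Mod(Z)^2)), "/")
mu_hat <- eigen(Z %*% Conj(t(Z)), symmetric = TRUE)$vectors[, 1]

# Route 2: Helmertized landmarks H w_i (translation is removed by H)
H  <- helmert_sub(k)
ZH <- H %*% W
ZH <- sweep(ZH, 2, sqrt(colSums(Mod(ZH)^2)), "/")
mu_H <- eigen(ZH %*% Conj(t(ZH)), symmetric = TRUE)$vectors[, 1]
mu_back <- as.vector(t(H) %*% mu_H)   # map back to k centred landmarks

# Two unit vectors agree up to rotation when |mu_hat* mu_back| = 1
agreement <- Mod(sum(Conj(mu_hat) * mu_back))
```

A value of `agreement` equal to 1 (up to rounding) indicates that the two estimates differ only by a rotation.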
Definition 8.4 To obtain an overall measure of shape variability we consider the root mean square RMS($d_F$) of full Procrustes distance from each configuration to the full Procrustes mean $\hat{\mu}$,
$$\mathrm{RMS}(d_F) = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} d_F^2(w_i, \hat{\mu}) }.$$
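A short base R sketch of this measure follows, again on simulated data with illustrative names. It computes the squared distances from the pre-shapes directly and, as a cross-check, uses the identity from the proof of Result 8.2 that the minimized sum of squared distances equals n minus the largest eigenvalue of S.

```r
# Illustrative data: noisy copies of a hypothetical template
set.seed(4)
k <- 5; n <- 25
template <- complex(real = c(0, 2, 3, 1, -1), imaginary = c(0, 0, 2, 3, 1))
W <- sapply(seq_len(n), function(i)
  template + complex(real = rnorm(k, sd = 0.1), imaginary = rnorm(k, sd = 0.1)))

# Pre-shapes and full Procrustes mean
Z <- sweep(W, 2, colMeans(W))
Z <- sweep(Z, 2, sqrt(colSums(Mod(Z)^2)), "/")
e <- eigen(Z %*% Conj(t(Z)), symmetric = TRUE)
mu_hat <- e$vectors[, 1]

# Squared full Procrustes distances: d_F^2(w_i, mu_hat) = 1 - |z_i* mu_hat|^2
dF2 <- 1 - Mod(as.vector(Conj(t(Z)) %*% mu_hat))^2
rms_dF <- sqrt(mean(dF2))

# Cross-check: the sum of squared distances equals n - lambda_1
rms_from_eigenvalue <- sqrt(1 - e$values[1] / n)
```

The eigenvalue shortcut means RMS(d_F) is available as a by-product of the eigendecomposition used to find the mean.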
Example 8.1 In Figure 8.1 we see the raw digitized data from the female and male gorilla skulls from the dataset described in Section 1.4.8. The landmarks were recorded by a digitizer and are registered so that opisthion is at the origin and the line from opisthion to basion is horizontal. There are k = 8 landmarks in m = 2 dimensions. In Figure 8.2 and Figure 8.3 we also see full Procrustes fits of the females and males separately. For each sex the landmarks match up quite closely because the shape variability is small. The full Procrustes mean for each sex is found from the dominant eigenvector of the complex sum of squares and products matrix for each sex. In Figure 8.4 we see the full Procrustes registration of the female average shape and the male average shape. It is also of interest to assess whether there is a significant average shape difference between the sexes and, if so, to describe the difference. We consider methods for testing for average shape differences in Chapter 9. The full Procrustes distance dF between the mean shapes is 0.059, and the within-sample RMS(dF) is 0.044 for females and 0.050 for males. We see later in Section 9.1.2 that the difference in mean shapes between the sexes is statistically significant.
In R we can make explicit use of the complex eigenvector solution to the Procrustes mean by using the option eigen2d=TRUE in the function procGPA. This method can be much faster than using the generalized Procrustes algorithm of Section 7.3 if the number of observations n is large.
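library(shapes)  # provides procGPA, riemdist and the apes dataset
data(apes)
# complex eigenvector solution (2D data only)
ans1<-procGPA(apes$x,eigen2d=TRUE,tol1=1e-10)
# iterative generalized Procrustes algorithm
ans2<-procGPA(apes$x,eigen2d=FALSE)
# Riemannian distance between the two mean shape estimates
riemdist(ans1$mshape,ans2$mshape)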
[1] 1.724934e-07
As seen above, the complex eigenvector method and GPA give almost identical results. The rotation and scale of both estimates are arbitrary, and the complex eigenvector has centroid size 1, and the first two coordinates are horizontal.
After having obtained an average configuration we often wish to examine the structure of shape variability in a sample, using tangent space PCA (see Section 7.7). In practice one uses real coordinates in the tangent space to carry out further analysis, as the complex covariance structure (which is isotropic at each landmark) is very restrictive.
We denote the real vectors of the tangent coordinates as $v_i$, i = 1, …, n. These could be the Procrustes residuals $r_i$ of Equation (8.13) or another choice of tangent coordinates which we introduced in Chapter 3.
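One way to pass from complex residuals to real tangent coordinates is sketched below in base R, on simulated data. The stacking order (real parts followed by imaginary parts) and the divisor n in the covariance matrix are assumptions made for this illustration, and the variable names are hypothetical.

```r
# Illustrative data: noisy copies of a hypothetical template
set.seed(6)
k <- 5; n <- 40
template <- complex(real = c(0, 2, 3, 1, -1), imaginary = c(0, 0, 2, 3, 1))
W <- sapply(seq_len(n), function(i)
  template + complex(real = rnorm(k, sd = 0.05), imaginary = rnorm(k, sd = 0.05)))
W <- sweep(W, 2, colMeans(W))                     # centred configurations

# Full Procrustes mean, coordinates and residuals
Z <- sweep(W, 2, sqrt(colSums(Mod(W)^2)), "/")
mu_hat <- eigen(Z %*% Conj(t(Z)), symmetric = TRUE)$vectors[, 1]
b  <- as.vector(Conj(t(W)) %*% mu_hat) / colSums(Mod(W)^2)
WP <- sweep(W, 2, b, "*")
R  <- WP - rowMeans(WP)                           # complex residuals r_i, k x n

# Real tangent coordinates v_i of length 2k: real parts, then imaginary parts
V <- rbind(Re(R), Im(R))                          # 2k x n, columns have mean zero

# Sample covariance matrix (divisor n assumed) and its eigenstructure
Sv <- V %*% t(V) / n
pc <- eigen(Sv, symmetric = TRUE)
prop_var <- pc$values / sum(pc$values)            # proportion of variability per PC
pc_sd <- sqrt(pmax(pc$values, 0))                 # square roots of the eigenvalues
```

In this sketch the eigenvalues tied to location and rotation are zero up to rounding, whereas the one tied to scale is only close to zero, because full Procrustes residuals satisfy the size constraint to second order rather than exactly.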
Example 8.2 Consider the mouse vertebral data described in Section 1.4.1. There are k = 6 landmarks in m = 2 dimensions. The analysis here is similar to Kent (1994). In Figure 8.5 we have a plot of the Procrustes mean shape obtained from the dominant eigenvector of the complex sum of squares and products matrix. The full Procrustes mean shape is centred, with unit size, and rotated so that the line joining the two farthest apart landmarks is horizontal. Hence, the mean shape has coordinates $(-0.51 - 0.14i,\ 0.51 - 0.14i,\ 0.09 + 0.15i,\ 0.01 + 0.42i,\ -0.07 + 0.16i,\ -0.03 - 0.45i)^T$.
In order to examine the structure of variability we examine the eigenstructure of the sample covariance matrix Sv of the Procrustes residuals. The square roots of the eigenvalues of Sv are:
Hence, the first two PCs explain 69% and 10% of the variability, respectively. The last four eigenvalues are zero due to the four constraints for location, rotation and scale. For each PC, shapes at 6 standard deviations away from the mean are calculated. In Figure 8.5 we see the mean shape with these unit vectors drawn for the first two PCs. The vectors of the first and second PCs in the figure are given by:
There appears to be a high dependence between certain landmarks, as indicated by the fact that the first PC explains such a large proportion of the variability. The first PC highlights a shift downwards and inwards for landmarks 1 and 2, balanced by an upwards movement for landmark 4. At the same time landmarks 3 and 5 move inwards slightly whereas there is little movement in landmark 6. The second PC is not distinguishable from PC3 and PC4 due to similar eigenvalues. If we had chosen to display the PCs in a different manner (e.g. relative to landmarks 1 and 2 as in Bookstein coordinates), then our interpretation would be different. In particular we display the PCs in Figure 8.6 where the mean and a figure at 3 standard deviations along the PC have been registered relative to landmarks 1 and 2. Our interpretation would be that the first PC includes the movement of landmarks 3, 4, 5 and 6 upwards relative to points 1 and 2. Landmark 4 shows the largest movement, followed by 3 and 5 together and landmark 6 shows the smallest movement. Both of these interpretations are correct, as they are describing the same PCs.