12
Spatial Statistics

12.1 Introduction

Spatial statistics is a part of applied statistics and is concerned with modelling and analysis of spatial data. By spatial data we mean data where, in addition to the (primary) phenomenon of interest, the relative spatial locations of observations are also recorded because these may be important for the interpretation of data. This is of primary importance in earth‐related sciences such as geography, geology, hydrology, ecology, and environmental sciences, but also in other scientific disciplines concerned with spatial variations and patterns such as astrophysics, economics, agriculture, forestry, and epidemiology, and, at a microscopic scale, medical and health research. Spatial statistics uses nearly all methods described in the first eleven chapters of this book and also multivariate analysis and Bayesian methods, neither of which is discussed in this book. We therefore restrict ourselves in this chapter to a few basic principles and give hints for further reading for other important methods. As a consequence of this, the list of references is relatively long.

We restrict our attention to continuous characteristics and to Gaussian distributions and analyse examples with the program package R as we did in other chapters of this book. Those who prefer SAS and understand a bit of German are referred to the procedures in 6/61 of Rasch et al. (2008).

6/61/0000 Spatial Statistics – Introduction
6/61/1010 Estimation of the covariance function of a random variable with constant trend
6/61/1020 Estimation of the semi‐variogram of a random variable
6/61/1021 Estimation of the parameter of an exponential semi‐variogram model
6/61/1022 Estimation of the parameter of a spherical semi‐variogram model
6/61/1030 Definition of increments and of the generalised covariance function for non‐steady state random variables
6/61/1031 Estimation of the generalised covariance function for non‐steady state random variables
6/61/1100 Modelling spatial dependencies between two variables
6/61/2000 Spatial prediction – survey
6/61/2010 Prediction of stationary random variables using the covariance function
6/61/2020 Prediction of stationary random variables using the semi‐variogram
6/61/2030 Prediction of non‐steady state random variables
6/61/2040 Spatial prediction: Co‐kriging
6/61/2050 Prediction of probabilities
6/61/2051 Prediction of exceedance probabilities
6/61/2052 Hermitean prediction.

We restrict ourselves to one‐ and two‐dimensional regions D ⊂ R¹ and D ⊂ R², respectively. However, D ⊂ R³ (oil and mineral prospection, 3D imaging) is also possible. In some fields such as Bayesian data analysis, design and simulation one even requires spaces D of dimension >3; this pertains, in particular, to the design and analysis of computer experiments with a moderate to large number of input variables.

Points in D ⊂ R² are written as s^T = (x₁, x₂). In geostatistics the coordinates x₁, x₂ are often the Gauss–Krüger coordinates, which are based on meridional zones defined by degrees (°) of longitude on the surface of the earth; see Krüger (1912). The surface is subdivided into meridional zones, each 3° of longitude wide, running from the North Pole to the South Pole parallel to its central meridian. The degrees of the central meridian of each meridional zone, counted from 0° eastwards, are mapped to code numbers by dividing them by three, as shown in Table 12.1.

Table 12.1 Gauss–Krüger code numbers.

Central meridian	…	6° W	3° W	0°	3° E	6° E	…
Degrees east from 0°	…	354°	357°	0°	3°	6°	…
Code number	…	118	119	0	1	2	…
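The mapping in Table 12.1 can be sketched in a few lines of base R. The function names gk_code_number and gk_central_meridian are hypothetical helpers introduced here for illustration only; they are not part of any package. The second helper assumes that each 3° zone extends 1.5° on either side of its central meridian.

```r
# Code number of a Gauss-Krueger meridional zone, following Table 12.1:
# the central meridian (degrees east of 0, a multiple of 3) divided by 3.
# Hypothetical helper, for illustration only.
gk_code_number <- function(central_meridian_east) {
  deg <- central_meridian_east %% 360      # maps e.g. -6 (6 degrees W) to 354
  stopifnot(all(deg %% 3 == 0))            # central meridians are multiples of 3
  deg / 3
}

# Central meridian (degrees east) of the 3-degree zone containing a longitude,
# assuming each zone reaches 1.5 degrees to either side of its central meridian.
gk_central_meridian <- function(lon_east) {
  (3 * round((lon_east %% 360) / 3)) %% 360
}

gk_code_number(c(354, 357, 0, 3, 6))       # the five columns of Table 12.1
lon_rostock <- 12 + 11 / 60 + 41 / 3600    # E 12 deg 11 min 41 sec
gk_code_number(gk_central_meridian(lon_rostock))
```

For the Rostock longitude this gives the zone with central meridian 12° and code number 4, which is consistent with the leading digit of the easting quoted in the text under the usual convention that the easting is prefixed by the code number.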

The meridional zone is conformally mapped on a cylinder barrel with the axis in the equatorial plane and a radius equal to the curvature radius of the meridian. Its origin is the intersection of the central meridian and the equator. From the origin the coordinates of the points on the surface of the earth are defined like in a usual Cartesian coordinate system, positive to the east by the so‐called easting (x₁), and to the north by the so‐called grid north (x₂). The coordinates on the earth can be transformed to the Gauss–Krüger coordinates via https://www.koordinaten‐umrechner.de. As an example, we give the Gauss–Krüger coordinates of the (first author's) house in Feldrain 73 in Rostock, Germany.

Degrees, minutes, seconds: E 12° 11′ 41″; N 54° 06′ 18″

Easting: 4507572; grid north 59922353.

First we must select a spatial model for the observations (variables) at the points in D ⊂ R²; their realisations are our observations.

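The general model for observations y(s) at locations s ∈ D, with trend function μ(s) and random error e(s) with E[e(s)] = 0, is

$$ y(s) = \mu(s) + e(s), \quad s \in D, \tag{12.1} $$

with side conditions for weak stationarity (second order stationarity):

$$ E[y(s)] = \mu \quad \text{for all } s \in D, \tag{12.2} $$

$$ \operatorname{cov}[y(s), y(s+h)] = C(h) \quad \text{for all } s,\ s+h \in D. \tag{12.3} $$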

Formula (12.3) leads for h = 0 to var(y(s))= C(0).

If we further assume that C(h) = C(∥h∥) with the Euclidean norm ∥h∥ of h, then C is called isotropic, otherwise anisotropic.

The positions of observation sites s ∈ D can be fixed in advance (as, e.g. the position of wind power stations in an area) or may be random.

Examples where observation points occur randomly include meteor strikes in a given area. Possibly the oldest example of this kind is the mapping of clusters of cholera cases in the London epidemic of 1854 (Snow 1855). Here, randomly means that we assume that a point is equally likely to occur at any location and that the position of a point is not affected by any other point.

Further, the observation points may not be fixed but determined by the scientist (monitoring). In this case, besides problems of analysis, design problems also exist by selecting the optimal observation points in an area. For this, readers are referred to Müller (2007).

Comprehensive treatments of the whole field of spatial statistics are given in Ripley (1988), Cressie (1993), and Gaetan and Guyon (2010).

Basically, there are four classes of problems which spatial statistics is concerned with: point pattern analysis, geostatistical data analysis, areal/lattice data analysis and spatial interaction analysis. These sub‐problems are treated in overview papers such as: Pilz (2010), Mase (2010), Kazianka and Pilz (2010a), Diggle (2010), and Spöck and Pilz (2010). We discuss mainly geostatistical data analysis with some hints to areal data analysis.

For a good overview on software for different problem areas of spatial data analysis with R we recommend the book by Bivand et al. (2013), for the important issue of simulation of spatial models we refer to Lantuéjoul (2002) and Gaetan and Guyon (2010). An overview of methodology and software for interfacing spatial data analysis and geographic information system (GIS) for visualising spatial data is given in Pilz (2009).

Due to the fact that in some fields of application of spatial statistics special methods have been applied, and further because the theory and applications are still in development, we cannot give a closed presentation of the field. We describe basic methods used in geostatistics using Euclidean distances and then give examples. For point pattern analysis, areal/lattice data analysis, and spatial interaction analysis the reader is referred to the books mentioned above.

12.2 Geostatistics

In geostatistics D is a continuous subspace of R² or R³ and the random variable (field) is observed at n > 2 fixed sites s₁, s₂, … , s_n ∈ D. Typical examples include rainfall data, data on soil characteristics (porosity, humidity, etc.), oil and mineral exploration data, air quality, and groundwater data. In this chapter only one characteristic observation variable is measured per observation point on a line or plane. Multivariate geostatistics dealing with observation vectors per observation point is described in detail in Wackernagel (2010).

The concept of stationarity is key in the analysis of spatial and/or temporal variation: roughly speaking, stationarity means that the statistical characteristics (e.g. mean and variance) of the random variable of interest do not change over the considered area. However, testing for stationarity is not possible. For spatial prediction the performance of a stationary and a non‐stationary model could be compared through assessment of the accuracy of predictions.

In this chapter we assume that the random vectors y^T = [y(s₁), y(s₂), … , y(s_n)] follow an n‐dimensional normal (Gaussian) distribution for any collection of spatial locations {s₁, s₂, … , s_n} ⊂ D and any n ≥ 1. In the literature the collection of random variables {y(s) : s ∈ D} is then usually termed a Gaussian random field (GRF) over D. For other types of random fields and a detailed explanation of their mathematical structure and most important properties we refer to Cressie and Wikle (2011), where also extensions to so‐called spatio‐temporal random fields are considered. Often non‐normal random variables can be transformed to approximate normality by a so‐called Box–Cox transformation, which generalises the normal case by including an additional parameter λ. For positive observations y this transformation is given by:

$$ y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1ex] \ln y, & \lambda = 0. \end{cases} \tag{12.4} $$
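As a small illustration, the Box–Cox transformation and its inverse can be written in base R as follows. This is a minimal sketch assuming the standard form (y^λ − 1)/λ for λ ≠ 0 and ln y for λ = 0, applied to positive data; the function names box_cox and box_cox_inverse are hypothetical and not taken from geoR or MASS.

```r
# Box-Cox transformation for positive data (standard form, assumed here).
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Inverse transformation, mapping values back to the original scale.
box_cox_inverse <- function(z, lambda) {
  if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)
}

y <- c(0.5, 1, 2, 10)                       # illustrative positive values
z <- box_cox(y, lambda = 0.5)               # square root type transformation
all.equal(box_cox_inverse(z, lambda = 0.5), y)

# For lambda close to 0 the transformation approaches the logarithm.
max(abs(box_cox(y, lambda = 1e-6) - log(y)))
```

The last line illustrates why λ = 0 is defined as the logarithm: it is the limit of the general form as λ tends to zero.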

A GRF is completely determined by its expectation (trend function)

$$ \mu(s) = E[y(s)], \quad s \in D, $$

and covariance function C(s_i − s_j) = cov[y(s_i), y(s_j)], i, j = 1, … , n.

Contrary to traditional statistics, in a geostatistical setting we usually observe only one realisation of y at a finite number of locations s₁, s₂, … , s_n. Therefore, the distribution underlying the random field cannot be inferred without imposing further assumptions. The simplest assumption is that of (strict) stationarity, which means that the normal distributions do not change when all positions are translated by the same (lag) vector h and this implies that (12.2) and (12.3) are valid.

Often (in areal/lattice data analysis) measurements are not related to points but to areas A_i, i = 1, … , n. In such cases, in place of distances between points, measures of spatial proximity w_ij between two areas A_i and A_j are used and represented in a square n × n matrix W = (w_ij).

According to Bailey and Gatrell (1995) some possible criteria for determining proximities might be:

w_ij = 1 if A_j shares a common boundary with A_i and w_ij = 0 else.
w_ij = 1 if the centroid of A_j is one of the k nearest centroids to that of A_i and w_ij = 0 else.
w_ij = d_ij^γ if the inter‐centroid distance d_ij < δ (δ > 0, γ < 0); and w_ij = 0 else.
w_ij = l_ij / l_i, where l_ij is the length of the common boundary between A_i and A_j and l_i is the perimeter of A_i.

All diagonal elements w_ii are set to 0. Note that the spatial proximity matrix W need not be symmetric.
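Two of these criteria can be sketched in base R. The centroids below are made‐up illustrative values, and knn_weights and distance_weights are hypothetical helper names; the distance‐based rule assumes the form w_ij = d_ij^γ for d_ij < δ.

```r
# Hypothetical centroids of n = 5 areas (illustrative values only)
centroids <- cbind(x = c(0, 1, 2, 0, 5), y = c(0, 0, 1, 3, 5))
d <- unname(as.matrix(dist(centroids)))    # inter-centroid distances d_ij

# w_ij = 1 if the centroid of A_j is one of the k nearest to that of A_i
knn_weights <- function(d, k) {
  W <- matrix(0, nrow(d), ncol(d))
  for (i in seq_len(nrow(d))) {
    others  <- setdiff(seq_len(ncol(d)), i)
    nearest <- others[order(d[i, others])][seq_len(k)]
    W[i, nearest] <- 1
  }
  W
}

# w_ij = d_ij^gamma if d_ij < delta (delta > 0, gamma < 0), else 0
distance_weights <- function(d, delta, gamma) {
  W <- ifelse(d < delta, d^gamma, 0)
  diag(W) <- 0                             # diagonal elements are set to 0
  W
}

W_knn  <- knn_weights(d, k = 2)
W_dist <- distance_weights(d, delta = 3, gamma = -1)
isSymmetric(W_knn)     # nearest-neighbour proximity need not be symmetric
isSymmetric(W_dist)    # distance-based proximity is symmetric
```

In this example the isolated fifth area counts areas 3 and 4 among its nearest neighbours, but no area counts the fifth among its own, which is why the nearest‐neighbour matrix is not symmetric.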

For more proximity measures we refer to Bailey and Gatrell (1995) and any other publications on areal spatial analysis like Anselin and Griffith (1988).

12.2.1 Semi‐variogram Function

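From now on, we focus on geostatistics and assume second order stationarity, i.e. (12.2) and (12.3) hold, and additionally, we assume isotropy. In geostatistics it is common to use the so‐called semi‐variogram function

$$ \gamma(h) = \tfrac{1}{2} \operatorname{var}[y(s+h) - y(s)]. \tag{12.5} $$

Observing that under the assumption of stationarity it holds that

$$ E[y(s+h) - y(s)] = 0, \quad \text{so that} \quad 2\gamma(h) = E\{[y(s+h) - y(s)]^2\}, \tag{12.6} $$

we can simply use the empirical moment estimate to estimate the semi‐variogram according to

$$ \hat{\gamma}(h) = \frac{1}{2 N(h)} \sum_{(i,j):\ s_i - s_j = h} [y(s_i) - y(s_j)]^2, \tag{12.7} $$

where N(h) denotes the number of pairs of sampling locations separated from each other by the lag vector h.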
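The moment estimate (12.7) can be sketched in base R for the omnidirectional case, where pairs are grouped into lag‐distance classes rather than matched on an exact lag vector. The function name empirical_semivariogram and the tiny data set are hypothetical and serve only to make the arithmetic visible; in practice the geoR function variog is used, as in the examples of this chapter.

```r
# Omnidirectional empirical semi-variogram in base R (illustrative sketch).
# coords: n x 2 matrix of locations, y: n observations,
# breaks: boundaries of the lag-distance classes.
empirical_semivariogram <- function(coords, y, breaks) {
  d     <- as.matrix(dist(coords))                 # pairwise distances
  sq    <- outer(y, y, function(a, b) (a - b)^2)   # squared differences
  upper <- upper.tri(d)                            # count each pair once
  bin   <- cut(d[upper], breaks = breaks, include.lowest = TRUE)
  npairs <- as.vector(table(bin))
  data.frame(
    lag    = as.vector(tapply(d[upper], bin, mean)),
    npairs = npairs,
    gamma  = as.vector(tapply(sq[upper], bin, sum)) / (2 * npairs)
  )
}

# Three points on a line with made-up observations
coords <- cbind(c(0, 1, 2), c(0, 0, 0))
y      <- c(1, 3, 2)
empirical_semivariogram(coords, y, breaks = c(0.5, 1.5, 2.5))
# lag 1: two pairs, (4 + 1) / (2 * 2) = 1.25; lag 2: one pair, 1 / 2 = 0.5
```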

We call γ̂(h) the estimated semi‐variogram function or the sample semi‐variogram function. Since a vector h is uniquely determined by its length and its direction, it is customary to form lag distance classes along given directions; usually this is done for the four main directions 0°, 45°, 90°, and 135°.

Then, under second order stationarity,

$$ \operatorname{var}[y(s+h) - y(s)] = 2\,[C(0) - C(h)] = 2\gamma(h). \tag{12.8} $$

This is called the variogram function. Dividing by two leads to the semi‐variogram function

$$ \gamma(h) = C(0) - C(h) \tag{12.9} $$

as a special case of (12.5).

For ‘classical’ estimation methods for the trend function and variogram parameters see Mase (2010), for Bayesian approaches we refer to Banerjee et al. (2014), Kazianka and Pilz (2010a) and Pilz et al. (2012). For non‐stationary variogram modelling we refer to the review provided by Sampson et al. (2001) and Schabenberger and Gotway (2005).

Theoretically, at zero separation distance (lag = 0), the semi‐variogram value γ(0) is zero, because var[y(s) − y(s)] = 0. However, at an infinitesimally small separation distance, the semi‐variogram function often exhibits a so‐called nugget effect, which is some value greater than zero. For example, if the semi‐variogram model intercepts the vertical axis at 3, then the nugget effect is 3. Furthermore, the semi‐variogram model often approaches an asymptotic value as the lag increases; this value is called the sill. The nugget effect can be attributed to measurement errors or spatial sources of variation at distances smaller than the sampling interval (or both). Measurement error occurs because of the error inherent in measurement devices. Natural phenomena can vary spatially over a range of scales. Variation at microscales smaller than the sampling distances will appear as part of the nugget effect. Before collecting data, it is important to gain some understanding of the scales of spatial variation.

Problem 12.1

Calculate estimates of the semi‐variogram function and show graphs.

Solution

We use the R package geoR.

Example 12.1

From the library(geoR) we use a simulated data set s100 of 100 geodata in R². We use a part of the text of http://leg.ufpr.br/geoR/geoRdoc/geoRintro.html.

 > library(geoR)
> data(s100)
> summary(s100)
Number of data points: 100 
Coordinates summary
        Coord.X    Coord.Y
min 0.005638006 0.01091027
max 0.983920544 0.99124979
Distance summary
        min         max 
0.007640962 1.278175109 
Data summary
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-1.1676955  0.2729882  1.1045936  0.9307179  1.6101707  2.8678969 
Other elements in the geodata object are
[1] "cov.model" "nugget"    "cov.pars"  "kappa"     "lambda"

The function plot.geodata shows a 2 × 2 display (as given in Figure 12.1) with data locations (top plots) and data versus coordinates (bottom plots). For an object of the class "geodata" the plot is produced by the command: plot(s100)

Figure 12.1 Exploratory spatial data analysis of dataset s100.

 > plot(s100)

Empirical semi‐variograms are calculated using the function variog. Theoretical and empirical semi‐variograms can be plotted and visually compared. The most important theoretical variogram models will be given in the sequel. For example, Figure 12.2 shows the theoretical variogram model (exponential model) used to simulate the data s100 and the estimated variogram values at different lags.

Figure 12.2 Exponential semi‐variogram model underlying the dataset s100.

 > bin1 <- variog(s100, uvec = seq(0,1,l=11))
variog: computing omnidirectional variogram
> plot(bin1)
> lines.variomodel(cov.model = "exp", cov.pars = c(1,0.3),nugget = 0,   max.dist = 1,  lwd = 3)
> legend(0.4,0.3,legend=c("exponential model"),lty=1,lwd = 3)

Directional variograms can also be computed by the function variog using the arguments direction and tolerance. For example, to compute a variogram for the direction 60° with the default tolerance angle (22.5°) the command would be:

 > vario60 <- variog(s100, max.dist = 1, direction=pi/3)
> plot(vario60)
> title(main = expression(paste("directional, angle = ", 60 * degree)))

and the plot is shown on the left panel of Figure 12.3.

Figure 12.3 Empirical directional variograms for dataset s100.

For a quick computation in the four main directions we can use the function variog4 and the corresponding plot is shown on the right panel of Figure 12.3.

 > vario.4 <- variog4(s100, max.dist = 1)
> plot(vario.4, lwd=2)

The semi‐variogram as the graph of the semi‐variogram function is often non‐linear and can be estimated by the methods of Section 8.1.2 using ∥h∥ as regressor from the estimates of the semi‐variogram function as regressands. For this we have to select an appropriate regression model. However, we may only use semi‐variogram models which are conditionally negative semidefinite, which means that

$$ \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \gamma(s_i - s_j) \le 0 $$

must hold for any collection of locations {s₁, s₂, … , s_n} ⊂ D, n ≥ 1, and any vector a^T = (a₁, … , a_n) ∈ Rⁿ such that a₁ + … + a_n = 0. This implies that

$$ \operatorname{var}\Big[ \sum_{i=1}^{n} a_i y(s_i) \Big] = - \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \gamma(s_i - s_j) \ge 0, $$

due to (12.9), for any such linear combination of observations.

This non‐negative definiteness is just what a covariance function C(·) is supposed to accomplish.
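The condition can be checked numerically for a given model. The following base R sketch uses randomly generated locations and an exponential semi‐variogram without nugget, γ(h) = c[1 − exp(−h/ρ)], with assumed values c = 1 and ρ = 0.3; all names are illustrative.

```r
set.seed(1)
n      <- 20
coords <- cbind(runif(n), runif(n))          # random locations in the unit square
h      <- as.matrix(dist(coords))            # pairwise Euclidean distances

# Exponential semi-variogram without nugget (assumed parameter values)
gamma_exp <- function(h, c = 1, rho = 0.3) c * (1 - exp(-h / rho))

G <- gamma_exp(h)                            # semi-variogram matrix
K <- 1 - G                                   # covariance matrix, C(h) = C(0) - gamma(h)

a <- rnorm(n)
a <- a - mean(a)                             # coefficients summing to zero

q <- drop(t(a) %*% G %*% a)                  # quadratic form in gamma
v <- drop(t(a) %*% K %*% a)                  # variance of the linear combination
c(quadratic_form = q, variance = v)          # q equals -v up to rounding
```

The quadratic form is non‐positive and equals minus the variance of the linear combination, as stated above.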

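The following parametric semi‐variogram models are in use (writing h for ∥h∥). Each is given for h > 0, with γ(0) = 0, nugget c₀ ≥ 0, partial sill c > 0 and range parameter ρ > 0.

Exponential model:

$$ \gamma(h) = c_0 + c\,[1 - \exp(-h/\rho)]. \tag{12.10} $$

Spherical model:

$$ \gamma(h) = \begin{cases} c_0 + c \left[ \dfrac{3}{2} \dfrac{h}{a} - \dfrac{1}{2} \left( \dfrac{h}{a} \right)^{3} \right], & 0 < h \le a, \\[1ex] c_0 + c, & h > a, \end{cases} \tag{12.11} $$

with range a > 0.

Gaussian model:

$$ \gamma(h) = c_0 + c\,[1 - \exp(-(h/\rho)^2)]. \tag{12.12} $$

Bessel model:

$$ \gamma(h) = c_0 + c \left[ 1 - \frac{h}{\rho} K_1\!\left( \frac{h}{\rho} \right) \right]. \tag{12.13} $$

The function K₁(·) is the modified Bessel function of the second kind and first order.

Matern model:

$$ \gamma(h) = c_0 + c \left[ 1 - \frac{1}{2^{\nu - 1} \Gamma(\nu)} \left( \frac{h}{\rho} \right)^{\nu} K_{\nu}\!\left( \frac{h}{\rho} \right) \right], \quad \nu > 0. \tag{12.14} $$

K_ν(·) is the modified Bessel function of the second kind and order ν and Γ(·) the gamma function.

Power model:

$$ \gamma(h) = c_0 + c\,h^{\alpha}, \quad 0 < \alpha < 2. \tag{12.15} $$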

We remark that the exponential model and the Bessel model are special cases of the Matern model when we choose ν = 1/2 and ν = 1, respectively. Further, the Gaussian model appears as a limiting case of the Matern model when ν approaches infinity. This parameter ν, also called smoothness parameter, is crucial for determining the ‘roughness’ of the realisations of the underlying random field. The larger ν the smoother the observed realisations (paths for D ⊂ R¹ and surfaces for D ⊂ R², respectively) will be.
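The special cases can be verified numerically with the base R function besselK. The sketch below assumes the common parametrisation of the Matern correlation, (h/ρ)^ν K_ν(h/ρ) / (2^(ν−1) Γ(ν)), so that the semi‐variogram is c₀ + c[1 − correlation]; the function name matern_cor is hypothetical.

```r
# Matern correlation function in the parametrisation assumed above
matern_cor <- function(h, rho, nu) {
  x <- h / rho
  ifelse(x == 0, 1, x^nu * besselK(x, nu) / (2^(nu - 1) * gamma(nu)))
}

h   <- seq(0.05, 2, by = 0.05)
rho <- 0.3

# nu = 1/2 reproduces the exponential correlation exp(-h / rho)
max(abs(matern_cor(h, rho, nu = 0.5) - exp(-h / rho)))

# nu = 3/2 has the closed form (1 + h / rho) * exp(-h / rho)
max(abs(matern_cor(h, rho, nu = 1.5) - (1 + h / rho) * exp(-h / rho)))

# Larger nu gives a flatter correlation near h = 0, i.e. smoother realisations
sapply(c(0.5, 1, 1.5, 2.5), function(nu) matern_cor(0.05, rho, nu))
```

Both maximum differences are at the level of rounding error. For ν = 1 the expression reduces to (h/ρ) K₁(h/ρ) by construction, which is the Bessel model.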

Problem 12.2

Calculate the estimate of the semi‐variogram function.

Figure 12.4 Variogram cloud and omnidirectional variogram of elevation data.

Solution

We use the R package geoR.

Example 12.2

Surface elevation data are taken from Davis (2002). This is an object of the class geodata, which is a list with the following elements: coords x–y coordinates (multiples of 50 ft) and data elevations (multiples of 10 ft). The data are available in the R package geoR.

 > library(geoR)
> data(elevation)
> summary(elevation)
Number of data points: 52 

Coordinates summary
      x   y
min 0.2 0.0
max 6.3 6.2

Distance summary
     min      max 
0.200000 8.275869 

Data summary
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
690.0000 787.5000 830.0000 827.0769 873.0000 960.0000

> par(mfrow=c(1,2))
> plot(variog(elevation, option="cloud"), xlab="h",  ylab=expression(gamma(h)))
variog: computing omnidirectional variogram
> plot(variog(elevation, uvec=seq(0, 8, by = 0.5)), xlab="h",  ylab=expression(gamma(h)))
variog: computing omnidirectional variogram

On the left of Figure 12.4 the empirical variogram of the elevation data is shown with the option ‘cloud’. This shows the squared differences between the observed values for all pairs of spatial locations. On the right the sample variogram with lag classes of width 0.5 is given.

Figure 12.5 Empirical semi‐variogram (circles) and the fitted theoretical semi‐variogram.

12.2.2 Semi‐variogram Parameter Estimation

The unknown parameters occurring in the semi‐variogram models (12.10)–(12.14) can be estimated by the least squares method of Section 8.2.1.1 or in the case of GRF the maximum likelihood method could be applied.

Problem 12.3

Show how the parameters of a Matern semi‐variogram can be estimated.

Solution

Parameter estimation using semi‐variogram based methods with the R package geoR proceeds in two steps. First the empirical (sample) semi‐variogram is calculated and plotted.

Example 12.3

The data set s100, loaded with data(s100), can be found in geoR.

 > library(geoR)
> data(s100)
> v.s100<-variog(s100,max.dist=1)
variog: computing omnidirectional variogram
> plot(v.s100)
> lines.variomodel(seq(0,1,l=100),cov.pars=c(0.9,0.2),cov.model="mat",
     kap=1.5, nug=0.2)

Note: In "geoR", the smoothness parameter of the Matern variogram is denoted by kappa (kap, for short), the nugget variance by nug and the cov.pars stands for the vector of variogram parameters indicating the partial sill c and the range (correlation radius).

Then we use the function variofit with the empirical semi‐variogram as argument and the Matern model (12.14) for the semi‐variogram function (Figure 12.5).

 > variofit(v.s100,ini=c(0.9,0.2),cov.model="mat",kap=1.5,nug=0.2)
variofit: covariance model used is matern
variofit: weights used: npairs
variofit: minimisation function used: optim 
variofit: model parameters estimated by WLS (weighted least squares):
covariance model is: matern with fixed kappa = 1.5
parameter estimates:
tausq sigmasq phi
0.3036 0.9000 0.2942
Practical Range with cor=0.05 for asymptotic range: 1.395627
variofit: minimised weighted sum of squares = 23.6572

Note: In the "variofit" function the nugget variance c₀ is denoted by tausq (short for tausquared), sigmasq (short for sigmasquared) stands for the (partial) sill value c and phi denotes the correlation radius.
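The weighted least squares step reported by variofit can be imitated in base R with optim. The sketch below is not the geoR implementation: it fits an exponential model (rather than the Matern model of the example) to made‐up binned values that were generated without noise from known parameters, and it uses the numbers of pairs as weights, as in the variofit output above. All names and numbers are illustrative assumptions.

```r
# Exponential semi-variogram with nugget c0, partial sill c and range parameter rho
gamma_exp <- function(h, c0, c, rho) c0 + c * (1 - exp(-h / rho))

# Hypothetical binned estimates, generated without noise from known parameters
lag    <- seq(0.05, 0.95, by = 0.1)
npairs <- c(40, 110, 170, 220, 250, 260, 250, 220, 170, 110)
truth  <- c(c0 = 0.1, c = 0.9, rho = 0.25)
gam    <- gamma_exp(lag, truth["c0"], truth["c"], truth["rho"])

# Weighted sum of squares with weights equal to the number of pairs per bin
wls_loss <- function(par) {
  sum(npairs * (gam - gamma_exp(lag, par[1], par[2], par[3]))^2)
}

fit <- optim(c(0.2, 1, 0.2), wls_loss, method = "L-BFGS-B",
             lower = c(0, 1e-6, 1e-6))     # keep all parameters non-negative
rbind(estimate = fit$par, truth = truth)
```

Because the artificial values follow the model exactly, the estimates should be close to the generating parameters; with real sample semi‐variograms the fit depends on the initial values and on the chosen lag classes.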

12.2.3 Kriging

In geostatistics kriging is a method of interpolation to predict values of the variable of interest in D at positions where no measurement has been made. It is based on the master's thesis of Krige (1951) and the mathematical foundation by Matheron (1963). Observed values are modelled by a GRF. Under suitable assumptions, kriging gives the best linear unbiased prediction (BLUP) of values not observed in the corresponding area. The basic idea of kriging is to predict the value of a function at a given point by computing a weighted average or linear combination of the observed values of the function in the neighbourhood of the point. The methods used are a kind of regression analysis in two dimensions. Basically, we distinguish between ordinary and universal kriging: ordinary kriging assumes a constant trend function, E(y(s)) = const for all s in D, whereas universal kriging assumes a non‐constant trend. Placing the problem in a stochastic framework permits a precise definition of optimality for predictions based on the random variables of which the measurements are realisations. One criterion imposed is that the predictor be unbiased, that is, that on average the difference between the predicted value and the actual value is zero. Another optimality criterion is that the prediction variance be minimised. This variance (kriging variance) is defined as the expectation of the squared difference between predicted and actual values. The kriging predictor minimises this variance. This minimisation is performed algebraically and results in a set of equations known as the kriging equations; in ordinary kriging we call them the ordinary kriging equations (OK system).

For ordinary kriging, we must make the following assumptions already made above:

y_i; i = 1, … , n are normally distributed.
Second order stationarity, meaning, in particular, that the y_i; i = 1, … , n all have the same constant mean and variance.

We restrict ourselves to observation points in D ⊂ R² and the univariate case with one character measured at each of n points in D. In the case that in (12.1) μ(s) is known we speak about simple kriging. This is in many practical situations an unrealistic assumption. Therefore, we describe the so‐called ordinary kriging where μ(s) is assumed to be unknown and constant.

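The kriging prediction at an unobserved location s₀ ∈ D is then given by a linear combination

$$ \hat{y}(s_0) = \sum_{i=1}^{n} \lambda_i y(s_i) = \lambda^T y, \tag{12.16} $$

where the weights in λ^T = (λ₁, … , λ_n) are determined as solutions of the OK system of the observations

$$ \begin{pmatrix} G & 1_n \\ 1_n^T & 0 \end{pmatrix} \begin{pmatrix} \lambda \\ \lambda_0 \end{pmatrix} = \begin{pmatrix} \gamma_0 \\ 1 \end{pmatrix}. $$

Here G = (γ(‖s_i − s_j‖))_{i, j = 1, … , n} is the (n × n)‐semi‐variogram matrix of the observations, 1_n is the n‐vector of ones,

$$ \gamma_0 = (\gamma(\|s_0 - s_1\|), \dots, \gamma(\|s_0 - s_n\|))^T $$

is the vector of semi‐variogram values between s₀ and the observation sites, and λ₀ is the Lagrange multiplier, which is necessary due to the minimisation under the equality constraint λ₁ + … + λ_n = 1, which, in turn, results from the unbiasedness condition E[ŷ(s₀)] = μ. The kriging variance then becomes σ²(s₀) = var[ŷ(s₀) − y(s₀)] = λ₀ + λ^Tγ₀. We illustrate this with the following toy example where n = 4.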

Problem 12.4

Measurements y(x₁, x₂) are taken at four locations:

y(10; 20) = 40
y(30; 280) = 130
y(250; 130) = 90
y(360; 120) = 160.

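Predict the value y(180; 120) by ordinary kriging. Use as covariance function

$$ C(h) = 2000 \exp(-h/250) $$

and the distance h = √((x₁ − x₁′)² + (x₂ − x₂′)²) between two locations (x₁, x₂) and (x₁′, x₂′).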

Solution and Example

First we initialise the vectors of coordinates, the observation vector and c₀

 > x_1=c(10,30,250,360)
> x_2=c(20,280,130,120)
> Z=c(40,130,90,160)
> c_0=2000

Further we need some functions to calculate G, the semi‐variogram matrix and γ₀:

 > f=function(x,y){
  # semi-variogram matrix G with entries gamma(h) = C(0) - C(h)
  n=length(x)
  G=matrix(0,nrow=n,ncol=n)
  for(i in 1:n){
  for(j in 1:n){
  G[i,j]=c_0-(2000*exp(-(sqrt((x[i]-x[j])^2+(y[i]-y[j])^2)/250)))
  }
  }
  G
  }
> G=f(x_1,x_2)
> G
         [,1]     [,2]      [,3]      [,4]
[1,]    0.000 1295.259 1304.3323 1533.6761
[2,] 1295.259    0.000 1310.6009 1538.7534
[3,] 1304.332 1310.601    0.0000  714.2622
[4,] 1533.676 1538.753  714.2622    0.0000
> Gamma_0=function(x,y){
  # semi-variogram values between the prediction point (180, 120) and the data sites
  s=c(0,0,0,0)
  for(i in 1:4){
  s[i]=c_0-(2000*exp(-(sqrt((180-x[i])^2+(120-y[i])^2)/250)))
  }
  s
  }

The two sides of the equations of the OK system are obtained by

 > gamma_0=Gamma_0(x_1,x_2)
> rS=c(gamma_0,1)
> rS 
[1] 1091.3326 1168.1651 492.7234 1026.4955 1.0000
> OK=matrix(1,nrow=5,ncol=5)
> OK[5,5]=0
> for(i in 1:4){
  for(j in 1:4){
  OK[i,j]=G[i,j]
  }
  }
> OK
 [,1] [,2] [,3] [,4] [,5]
[1,] 0.000 1295.259 1304.3323 1533.6761 1
[2,] 1295.259 0.000 1310.6009 1538.7534 1
[3,] 1304.332 1310.601 0.0000 714.2622 1
[4,] 1533.676 1538.753 714.2622 0.0000 1
[5,] 1.000 1.000 1.0000 1.0000 0

Now we solve the OK system for λ and λ₀:

 > lambda=solve(OK)%*%rS

and the estimate becomes:

 > estimate =sum(lambda [-5]%*%Z)
> estimate
 [1] 86.58756

the kriging variance is

 > sigmasq<- t(lambda[-5])%*%gamma_0+lambda[5]
> sigmasq
         [,1]
[1,] 754.7532
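The step‐by‐step session above can be condensed into a small general function. This is a base R sketch with hypothetical names (ordinary_kriging, gam), not a geoR function; it solves the same OK system for an arbitrary prediction point and an arbitrary semi‐variogram function supplied by the user.

```r
# Ordinary kriging by solving the OK system (illustrative sketch).
# coords: n x 2 matrix of sites, y: observations, s0: prediction point,
# semivariogram: function of distance with semivariogram(0) = 0.
ordinary_kriging <- function(coords, y, s0, semivariogram) {
  n      <- nrow(coords)
  G      <- semivariogram(as.matrix(dist(coords)))            # n x n matrix G
  gamma0 <- semivariogram(sqrt(colSums((t(coords) - s0)^2)))  # values towards s0
  A      <- rbind(cbind(G, 1), c(rep(1, n), 0))               # OK system matrix
  sol    <- solve(A, c(gamma0, 1))
  lambda <- sol[1:n]                                          # kriging weights
  list(prediction = sum(lambda * y),
       variance   = sum(lambda * gamma0) + sol[n + 1],        # lambda' gamma0 + lambda0
       weights    = lambda)
}

# Data of Problem 12.4
coords <- cbind(c(10, 30, 250, 360), c(20, 280, 130, 120))
y      <- c(40, 130, 90, 160)
gam    <- function(h) 2000 - 2000 * exp(-h / 250)

ok <- ordinary_kriging(coords, y, s0 = c(180, 120), semivariogram = gam)
ok$prediction
ok$variance
sum(ok$weights)          # the weights sum to one
```

Since the function solves the same system as the session above, its prediction and variance for the point (180; 120) should agree with the values obtained there. At an observation site the predictor returns the observed value with kriging variance zero, because this model has no nugget.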

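Alternatively, the kriging prediction (12.16) can be written as

$$ \hat{y}(s_0) = \hat{\mu} + k_0^T K^{-1} (y - \hat{\mu} 1_n). \tag{12.17} $$

In (12.17), k₀ = (C(‖s₀ − s₁‖), … , C(‖s₀ − s_n‖))^T is the vector of covariances between y(s₀) and the observations, 1_n is the n‐vector of ones, K is the covariance matrix K = (C(‖s_i − s_j‖))_{i, j = 1, … , n}, and

$$ \hat{\mu} = (1_n^T K^{-1} 1_n)^{-1} 1_n^T K^{-1} y \tag{12.18} $$

is the generalised least squares estimator of the unknown expectation μ. If μ(s) is no longer assumed to be an unknown constant then it can be more generally modelled as a linear regression setup

$$ \mu(s) = \beta_0 + \beta_1 f_1(s) + \dots + \beta_r f_r(s) \tag{12.19} $$

with given regression functions f₁, … , f_r and unknown regression coefficients β₀, β₁, … , β_r.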

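The case of ordinary kriging just considered corresponds to the special case where β₁ = … = β_r = 0. In the case of (12.19) with non‐constant functions f₁, … , f_r we speak of universal kriging. In this case the OK predictor (12.17) is replaced by the universal kriging predictor (UK predictor)

$$ \hat{y}_{UK}(s_0) = f(s_0)^T \hat{\beta} + k_0^T K^{-1} (y - F \hat{\beta}), \tag{12.20} $$

where f(s₀)^T = (1, f₁(s₀), … , f_r(s₀)), F is the (spatial) n × (r + 1) design matrix

$$ F = \begin{pmatrix} 1 & f_1(s_1) & \cdots & f_r(s_1) \\ \vdots & \vdots & & \vdots \\ 1 & f_1(s_n) & \cdots & f_r(s_n) \end{pmatrix} \tag{12.21} $$

and β̂ is the generalised least squares estimator

$$ \hat{\beta} = (F^T K^{-1} F)^{-1} F^T K^{-1} y \tag{12.22} $$

with the covariance matrix K and the vector k₀ as defined before.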
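The covariance form of the predictor can be sketched in base R as follows. The function universal_kriging and the two trend helpers are hypothetical names; the predictor is coded as trend value at s₀ plus k₀ᵀK⁻¹ times the generalised least squares residuals, and the data are those of Problem 12.4 with the covariance function C(h) = 2000 exp(−h/250). With a constant trend the function reduces to ordinary kriging.

```r
# Universal kriging in covariance form (illustrative sketch).
# trend: function returning the design matrix for a matrix of locations.
universal_kriging <- function(coords, y, s0, covfun, trend) {
  Fmat <- trend(coords)                                 # n x (r + 1) design matrix
  f0   <- drop(trend(matrix(s0, nrow = 1)))             # trend functions at s0
  K    <- covfun(as.matrix(dist(coords)))               # covariance matrix
  k0   <- covfun(sqrt(colSums((t(coords) - s0)^2)))     # covariances with y(s0)
  KinvF <- solve(K, Fmat)
  beta  <- solve(t(Fmat) %*% KinvF, t(KinvF) %*% y)     # GLS estimator
  resid <- y - drop(Fmat %*% beta)
  list(prediction = sum(f0 * beta) + sum(k0 * solve(K, resid)),
       beta = drop(beta))
}

constant_trend <- function(s) matrix(1, nrow(s), 1)     # ordinary kriging
linear_trend   <- function(s) cbind(1, s[, 1], s[, 2])  # f1 = x1, f2 = x2

coords <- cbind(c(10, 30, 250, 360), c(20, 280, 130, 120))
y      <- c(40, 130, 90, 160)
covfun <- function(h) 2000 * exp(-h / 250)
s0     <- c(180, 120)

uk_const <- universal_kriging(coords, y, s0, covfun, constant_trend)
uk_const$prediction      # same predictor as the OK system, written with covariances

# Observations that are exactly linear in the coordinates are reproduced exactly
y_lin  <- 1 + 2 * coords[, 1] + 3 * coords[, 2]
uk_lin <- universal_kriging(coords, y_lin, s0, covfun, linear_trend)
uk_lin$prediction        # 1 + 2 * 180 + 3 * 120 = 721
```

The second call illustrates the role of the trend: when the data follow the regression functions exactly, the residual term vanishes and the prediction is the trend value at s₀.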

Since the GRF is completely defined by the trend function μ(s) and the covariance function C(·), we also can estimate the regression parameters β^T = (β₀, β₁, … , β_r) and the covariance parameters θ^T = (c₀, c, ρ, ν) jointly using a likelihood approach. Since the GRF assumption implies that the observation vector follows an n‐dimensional normal distribution N_n(Fβ, K), where K = K(θ), the log‐likelihood function thus reads

$$ \ell(\beta, \theta) = -\frac{n}{2} \ln(2\pi) - \frac{1}{2} \ln \det K(\theta) - \frac{1}{2} (y - F\beta)^T K(\theta)^{-1} (y - F\beta). $$

For any given parameter vector θ this function is maximised with respect to β by the realisation of β̂ in (12.22). Then the covariance parameter vector can be estimated by maximising the so‐called profile log‐likelihood function with respect to θ. This can be done using the R package geoR, which then also determines the predicted UK values according to (12.20).
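The profile log‐likelihood can be sketched in base R. The function below is a hypothetical illustration, not the geoR implementation: it assumes an exponential covariance with nugget, θ = (c₀, c, ρ), inserts the generalised least squares estimate of β for the given θ, and returns the Gaussian log‐likelihood. The locations and observations are simulated purely for illustration.

```r
# Profile log-likelihood of theta = (c0, c, rho) for an exponential covariance
# model with nugget (assumed model; illustrative sketch only).
profile_loglik <- function(theta, coords, y, Fmat) {
  h <- as.matrix(dist(coords))
  K <- theta[2] * exp(-h / theta[3]) + theta[1] * diag(nrow(h))
  beta   <- solve(t(Fmat) %*% solve(K, Fmat), t(Fmat) %*% solve(K, y))  # GLS
  r      <- y - drop(Fmat %*% beta)
  logdet <- as.numeric(determinant(K, logarithm = TRUE)$modulus)
  -0.5 * (length(y) * log(2 * pi) + logdet + sum(r * solve(K, r)))
}

set.seed(2)
n      <- 30
coords <- cbind(runif(n), runif(n))       # made-up locations
y      <- rnorm(n, mean = 1)              # made-up observations
Fmat   <- matrix(1, n, 1)                 # constant trend

# Profile over the range parameter rho with nugget and partial sill held fixed
rho_grid <- seq(0.05, 0.5, by = 0.05)
ll <- sapply(rho_grid, function(rho)
  profile_loglik(c(0.2, 0.8, rho), coords, y, Fmat))
rho_grid[which.max(ll)]                   # grid value with the largest profile log-likelihood
```

In practice all components of θ are optimised jointly, for instance by passing the negative of this function to optim; in geoR this is done by likfit.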

The function proflik(·) in the R package geoR allows us to visualise profile likelihoods. This function requires a likfit(·) object and sequences for the variogram parameters to be plotted (sill or range sequences).

Problem 12.5

Show how to calculate (kriging)‐predicted values.

Solution and Example

We use from Paulo J. Ribeiro Jr. a part of his ‘geoR solution’ article (https://www.stat.washington.edu/peter/591/geoR_sln.html).

Turning to the parameter estimation, we fit the model with a constant mean, isotropic exponential correlation function and allowing for estimating the variance of the conditionally independent observations (nugget variance).

 > library(geoR)
> data(s100)
> s100.ml <- likfit(s100, ini = c(1, 0.15))
> s100.ml
likfit: estimated model parameters:
    beta    tausq  sigmasq      phi 
"0.7766" "0.0000" "0.7517" "0.1827" 
Practical Range with cor=0.05 for asymptotic range: 0.5473814
likfit: maximised log-likelihood = -83.57
> summary(s100.ml)
Summary of the parameter estimation
- - - - - - - - - - --  - - - - - - -- - 
Estimation method: maximum likelihood 
Parameters of the mean component (trend):
  beta 
0.7766 

Parameters of the spatial component:
   correlation function: exponential
      (estimated) variance parameter sigmasq (partial sill) =  0.7517
      (estimated) cor. fct. parameter phi (range parameter)  =  0.1827
   anisotropy parameters:
      (fixed) anisotropy angle = 0  ( 0 degrees )
      (fixed) anisotropy ratio = 1
Parameter of the error component:
      (estimated) nugget =  0
Transformation parameter:
      (fixed) Box-Cox parameter = 1 (no transformation)
Practical Range with cor=0.05 for asymptotic range: 0.5473814
Maximised Likelihood:
   log.L n.params      AIC      BIC 
"-83.57"      "4"  "175.1"  "185.6" 
non spatial model:
   log.L n.params      AIC      BIC 
"-125.8"      "2"  "255.6"  "260.8" 
Call:
likfit(geodata = s100, ini.cov.pars = c(1, 0.15)).

In the output above AIC stands for the Akaike criterion (Akaike 1973) and BIC for the Schwarz criterion (Schwarz 1978); both are model choice criteria that take account of the number of parameters in a model. In the above output, tausq is the nugget variance, sigmasq the (partial) sill and phi the correlation radius.

Finally, we start the spatial prediction defining a grid of points. The kriging function by default performs ordinary kriging. It minimally requires the data, prediction locations and estimated model parameters.

 > s100.gr <- expand.grid((0:100)/100, (0:100)/100)
> s100.kc <- krige.conv(s100, locations = s100.gr, krige =  krige.control(obj.model = s100.ml))
krige.conv: model with constant mean
krige.conv: Kriging performed using global neighbourhood
> names(s100.kc)
[1] "predict"      "krige.var"    "beta.est"     "distribution" "message"
[6] "call"

If the locations form a grid the predictions can be visualised as image or contour. The plots in Figure 12.6 show the contours of the predicted values (left) and the image of the respective standard errors (right).

Figure 12.6 Predicted values and standard errors for s100.

 > par(mfrow = c(1, 2), mar = c(3.5, 3.5, 0.5, 0.5))
> image(s100.kc, col = gray(seq(1, 0.2, l = 21)))
> contour(s100.kc, nlevels = 11, add = TRUE)
> image(s100.kc, val = sqrt(krige.var), col = gray(seq(1, 0.2, l = 21)))

It is worth noticing that under the Gaussian model the prediction errors depend only on the coordinates of the data, but not on their values (except through model parameter estimates).

 > image(s100.kc, val = sqrt(krige.var), col = gray(seq(1, 0.2, l = 21)),  coords.data = TRUE)

The plot in Figure 12.7 shows the kriging standard deviations, the darker areas exhibit higher values. The black dots visualise the locations (coordinates) of the observations.

Figure 12.7 Association between standard deviations and data locations.

If, instead of the default ‘ordinary kriging’, we want to include a non‐constant trend, then we have to specify this through the trend.spatial function. For more information on this see the section details in the documentation for likfit(.).

12.2.4 Trans‐Gaussian Kriging

Now we consider the case of non‐normal random fields. It is then often possible to transform the observation such that the transformed data are realisations of a normal distribution. A striking example of this is to apply the Box–Cox transformation (12.4). This transformation depends on an additional parameter λ. We can find an appropriate value of λ with the function boxcox(.) in the package geoR. The function boxcox(.) gives a graphical summary of the log‐likelihood function depending on λ and indicates a realised 95% confidence interval for λ. For λ we use an easy to interpret value from the set {λ = 1 (no transformation), λ = 0.5 (square root transformation), λ = 0 (logarithmic transformation), λ = −0.5 (inverse square root transformation), λ = −1 (reciprocal transformation)}. Kriging with Box–Cox transformed data is usually called trans‐Gaussian kriging in the literature. This type of kriging and Bayesian extensions thereof were introduced by De Oliveira et al. (1997). Spöck and Pilz (2015) deal with the sampling design problem for optimal prediction with non‐normal random fields and give an example of the optimal placement of monitoring stations for an existing rainfall monitoring network in Upper Austria.

The R package gstat also implements trans‐Gaussian kriging; the boxcox(.) function used with it is found in the package MASS.

For illustration we now use the Meuse data set meuse.all from the package gstat. This data set gives locations and topsoil heavy metal concentrations (ppm), along with a number of soil and landscape variables, collected in a flood plain of the river Meuse, near the village of Stein. Heavy metal concentrations are bulk sampled from an area of approximately 15 × 15 m.

Example 12.1

In this example we use gstat and data meuse.all from that package. This package and the data are well described in Bivand et al. (2013).

In this data set we use the zinc observations.

For graphical representation (see Figure 12.8) we also use the R package sp.

> library(gstat)
> data(meuse.all)
> library(sp)

First, we look for an appropriate Box–Cox transformation.

> library(MASS)
> boxcox(zinc ~ 1, data = meuse.all)

Figure 12.8 Determining a 95% confidence interval for Box–Cox parameter.

The plot suggests choosing λ = 0 for simplicity, i.e. working with a logarithmic transformation of the observed data.
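How such a confidence interval for λ arises can be sketched in base R. The code below is an illustration under stated assumptions, not the code of boxcox(.): it assumes the usual Box–Cox form (y^λ − 1)/λ (log y for λ = 0) with a constant mean, uses simulated log‐normal data in place of the zinc values, and the helper name boxcox_loglik is hypothetical. The interval collects all λ whose profile log‐likelihood lies within half the 95% quantile of the χ² distribution with one degree of freedom of the maximum.

```r
# Profile log-likelihood of the Box-Cox parameter for a constant-mean model
# (up to an additive constant); helper name is hypothetical
boxcox_loglik <- function(y, lambda) {
  n <- length(y)
  z <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  -n / 2 * log(sum((z - mean(z))^2) / n) + (lambda - 1) * sum(log(y))
}

set.seed(1)
y <- rlnorm(150, meanlog = 6, sdlog = 0.7)   # simulated stand-in for positive data

lambdas <- seq(-2, 2, by = 0.01)
ll      <- sapply(lambdas, function(l) boxcox_loglik(y, l))

# Approximate 95% confidence interval from the likelihood-ratio cutoff
cutoff <- max(ll) - qchisq(0.95, df = 1) / 2
ci     <- range(lambdas[ll >= cutoff])

# Easy-to-interpret candidate values that fall inside the interval
candidates <- c(-1, -0.5, 0, 0.5, 1)
inside     <- candidates[candidates >= ci[1] & candidates <= ci[2]]

print(ci)
print(inside)
```

If more than one candidate lies inside the interval, the choice between them is a matter of interpretability rather than of fit.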

Before we can use the kriging function krige we call

> coordinates(meuse.all) <- c("x", "y")

and calculate an empirical semi‐variogram

> empv1 <- variogram(log(zinc) ~ 1, meuse.all)

and fit it to the spherical semi‐variogram model (12.11)

> v <- vgm(0.6, "Sph", 870, 0.05)
> m1.zinc <- fit.variogram(empv1, v)
> m1.zinc
    model         psill            range
1   Nug           0.0506           0.000
2   Sph           0.5906           896.965
> plot(empv1,model=m1.zinc, main="spherical variogram fit of log(zinc)")

As can be seen from the output above and from the graph in Figure 12.9, the fitted spherical semi‐variogram model has a nugget variance of 0.0506, a partial sill of 0.5906 (i.e. an overall sill of 0.0506 + 0.5906 = 0.6412) and a range of a = 896.965.
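To see how the three fitted parameters shape the curve, the fitted model can be evaluated directly. This is a minimal base R sketch, assuming that (12.11) has the usual form of the spherical model (nugget plus partial sill times 1.5 h/a − 0.5 (h/a)³ for 0 < h < a, and the overall sill beyond the range); the function name sph_semivariogram and the chosen distances are hypothetical.

```r
# Spherical semi-variogram with nugget (assumed standard form)
sph_semivariogram <- function(h, nugget, psill, a) {
  rising <- psill * (1.5 * h / a - 0.5 * (h / a)^3)   # increasing part for h < a
  gamma  <- nugget + ifelse(h < a, rising, psill)     # flat at the overall sill for h >= a
  ifelse(h == 0, 0, gamma)                            # semi-variogram is zero at lag 0
}

# Fitted values taken from the fit.variogram output
h <- c(0, 100, 500, 896.965, 1500)
g <- sph_semivariogram(h, nugget = 0.0506, psill = 0.5906, a = 896.965)

print(data.frame(distance = h, gamma = g))
```

At and beyond the range the function returns the overall sill, the sum of the nugget and the partial sill.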

Figure 12.9 Empirical and fitted semi‐variogram model for Meuse zinc data.

> data(meuse.grid)
> gridded(meuse.grid) <- ~x+y
> summary(meuse.grid)
Object of class SpatialPixelsDataFrame
Coordinates:
      min        max
x  178440     181560
y  329600     333760
Is projected: NA 
proj4string : [NA]
Number of points: 3103
Grid attributes:
  cellcentre.offset cellsize cells.dim
x            178460       40        78
y            329620       40       104

Next we compute the predictions. The results are visualised in Figure 12.10.

Figure 12.10 OK‐predicted values of zinc data (on a log‐scale).

Figure 12.11 OK variances of predicted zinc values.

> z <- krige(log(zinc) ~ 1, meuse.all, meuse.grid, model = m1.zinc)
[using ordinary kriging]
> z["var1.pred"]
Object of class SpatialPixelsDataFrame
Object of class SpatialPixels
Grid topology:
  cellcentre.offset cellsize cells.dim
x            178460       40        78
y            329620       40       104
SpatialPoints:
               x        y
1         181180   333740
2         181140   333700
………………………………..
3102   179180   329620
3103   179220   329620
Coordinate Reference System (CRS) arguments: NA 
Data summary:
     var1.pred    
 Min.   : 4.777
 1st Qu.: 5.238
 Median : 5.573
 Mean   : 5.707
 3rd Qu.: 6.172
 Max.   : 7.440

Next we find the predicted variances; these are plotted in Figure 12.11.

 > z["var1.var"]
Object of class SpatialPixelsDataFrame
Object of class SpatialPixels
Data summary:
                  var1.var      
 Min.   : 0.08549  
 1st Qu.: 0.13728  
 Median : 0.16218  
 Mean   : 0.18533  
 3rd Qu.: 0.21161  
 Max.   : 0.50028

The commands for plotting the predicted values and variances, respectively, read as follows:

 > spplot(z["var1.pred"], main = "ordinary kriging predictions of log(zinc)")
> spplot(z["var1.var"], main = "ordinary kriging variances of log(zinc)")
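Since the predictions and variances above are on the log scale, one may want to express them in ppm. The following base R sketch shows one simple possibility under stated assumptions: it treats the prediction distribution on the log scale as Gaussian with the kriging prediction as mean and the kriging variance as variance, ignores the uncertainty of the estimated variogram parameters, and uses the fact that quantiles are preserved under the strictly increasing exponential function. The two input numbers are illustrative values of the order seen in the summaries above, not the values of one particular grid cell.

```r
# Illustrative log-scale prediction and kriging variance for one hypothetical grid cell
pred_log <- 5.573
var_log  <- 0.162

# Approximate 95% prediction interval on the log scale
z975   <- qnorm(0.975)
ci_log <- pred_log + c(-1, 1) * z975 * sqrt(var_log)

# Back-transformation: exp() maps the median and the interval limits to the ppm scale
median_ppm <- exp(pred_log)
ci_ppm     <- exp(ci_log)

print(median_ppm)
print(ci_ppm)
```

Note that exp(pred_log) corresponds to the median, not the mean, of the back‐transformed prediction distribution; an unbiased predictor of the mean on the original scale requires an additional correction term.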

12.3 Special Problems and Outlook

We now briefly outline more recent developments which go beyond the traditional kriging and trans‐Gaussian kriging considered in Sections 12.2.3 and 12.2.4.

12.3.1 Generalised Linear Models in Geostatistics

In the same way as we had extended linear regression models to generalised linear regression models in Chapter 11, we may take the step from spatial linear models to so‐called generalised linear geostatistical models (GLGMs). This allows us to model spatial random variables [with observations (realisations)] following a distribution from the exponential family. Spatial modelling and prediction for such observations is implemented in the R package geoRglm and is well described in Diggle and Ribeiro (2007), where in particular worked‐through examples of binomially and Poisson distributed environmental and public health data are considered. Extensions to hierarchical Bayesian GLGMs can be found in Banerjee et al. (2014).
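The structure of such a model can be illustrated by simulating from it. The sketch below is written in base R and does not use geoRglm: it assumes a Poisson response with log link, a constant linear predictor, and a latent zero‐mean Gaussian random field with exponential covariance function. All parameter values and object names are assumptions chosen for illustration.

```r
set.seed(123)
n      <- 50
coords <- cbind(runif(n), runif(n))              # simulated locations in the unit square

# Latent stationary Gaussian random field Z with mean zero (assumed exponential covariance)
Sigma <- 0.5 * exp(-as.matrix(dist(coords)) / 0.2)
Z     <- drop(t(chol(Sigma)) %*% rnorm(n))       # Z ~ N(0, Sigma) via Cholesky factor

# Log link: the conditional mean depends on the latent field
beta0 <- 1                                       # assumed intercept
mu    <- exp(beta0 + Z)

# Conditionally independent Poisson counts given the latent field
y <- rpois(n, lambda = mu)

print(head(data.frame(x = coords[, 1], y_coord = coords[, 2], mu = mu, count = y)))
```

Spatial dependence between the counts arises only through the latent field Z, since the counts are conditionally independent given Z.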

12.3.2 Copula Based Geostatistical Prediction

The GLGM framework does not allow distributions outside the exponential family. In particular, we cannot use it for heavy‐tailed or extreme value distributions, whose tail probabilities decay much more slowly than those of the normal distribution. A prominent example of such a distribution is the generalised extreme value distribution with distribution function

$$F(y; \mu, \sigma, \tau) = \exp\left\{-\left[1 + \tau\,\frac{y - \mu}{\sigma}\right]^{-1/\tau}\right\}, \qquad 1 + \tau\,\frac{y - \mu}{\sigma} > 0,$$

with location, scale, and shape parameters μ, σ, and τ, respectively. Observations from non‐Gaussian, skewed or heavy‐tailed distributions are dealt with in the R package intamap. A general framework for handling such data is provided by copulas, introduced by Sklar (1959), which are distribution functions on the unit cube [0, 1]ⁿ with uniformly distributed margins. Copulas are invariant under strictly increasing transformations of the marginals; thus, frequently applied data transformations (e.g. square root and log transformations) do not change the copula. The relation between two locations separated by the lag‐vector h is characterised by the bivariate distribution with distribution function

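$$F_h(y_1, y_2) = C_h\bigl(F(y_1), F(y_2)\bigr), \qquad (12.23)$$

where F denotes the marginal distribution function, assumed to be the same at both locations.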

The copula C_h thus becomes a function of the separating vector h. Spatial copulas have been introduced by Bardossy (2006) and Kazianka and Pilz (2010b). Spatial copulas describe the spatial dependence over the whole range of quantiles for a given separating vector h, not only the mean dependence, as the variogram does. The main difference to GLGMs is that these only model the means μ_i = E(y(s_i)), i = 1, … , n, by relating them through some link function to a latent process Z(·), a stationary GRF with mean zero and given covariance function. The spatial copula, however, allows us to build a complete multivariate distribution of the random variables whose realisations are the observed values [y(s₁), … , y(s_n)]. Recently, so‐called vine copulas have been developed, which extend the concept of the bivariate copula C_h in (12.23) to higher dimensions; see e.g. Gräler (2014). The corresponding R packages copula, spcopula and VineCopula allow flexible spatial data modelling. A Matlab toolbox for copula‐based spatial analysis is given in Kazianka (2013).
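The invariance of the copula under strictly increasing transformations of the marginals can be illustrated empirically. The following base R sketch does not use the packages copula, spcopula or VineCopula: it relies on the fact that rank‐based pseudo‐observations, from which an empirical copula is built, depend on the data only through their ranks. The simulated data and the helper name pseudo_obs are assumptions introduced for illustration.

```r
set.seed(7)
n  <- 50
y1 <- rlnorm(n)                        # simulated positive, skewed variable
y2 <- y1 * rlnorm(n, sdlog = 0.5)      # second variable, dependent on the first

# Rank-based pseudo-observations in (0, 1); helper name is hypothetical
pseudo_obs <- function(v) rank(v) / (length(v) + 1)

u_raw   <- cbind(pseudo_obs(y1), pseudo_obs(y2))
u_trans <- cbind(pseudo_obs(log(y1)), pseudo_obs(sqrt(y2)))   # log and square root transforms

# The pseudo-observations, and hence the empirical copula, are unchanged
same_copula_sample <- isTRUE(all.equal(u_raw, u_trans))

# Rank correlation (Kendall's tau) is unchanged as well
tau_raw   <- cor(y1, y2, method = "kendall")
tau_trans <- cor(log(y1), sqrt(y2), method = "kendall")

print(same_copula_sample)
print(c(tau_raw = tau_raw, tau_trans = tau_trans))
```

By contrast, the ordinary (Pearson) correlation coefficient and the variogram generally change under such transformations.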

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory, Tsaghkadzor, Armenia, USSR, September 2–8, 1971 (ed. B.N. Petrov and F. Csáki), 267–281. Budapest: Akadémiai Kiadó.
Anselin, L. and Griffith, D.A. (1988). Do spatial effects really matter in regression analysis? Reg. Sci. 65: 11–34.
Bailey, T.C. and Gatrell, T. (1995). Interactive Spatial Data Analysis. London: Longman Scientific & Technical.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2014). Hierarchical Modeling and Analysis for Spatial Data, 2e. Boca Raton, Florida: CRC Press/Chapman & Hall.
Bardossy, A. (2006). Copula‐based geostatistical models for groundwater quality parameters. Water Resour. Res. 42: W11416.
Bivand, R.S., Pebesma, E.J., and Gomez‐Rubio, V. (2013). Applied Spatial Data Analysis with R, 2e. Berlin: Springer.
Cressie, N.A.C. (1993). Statistics for Spatial Data. New York: Wiley.
Cressie, N.A.C. and Wikle, C.K. (2011). Statistics of Spatio‐Temporal Data. New York: Wiley.
Davis, J.C. (2002). Statistics and Data Analysis in Geology, 3e. New York: Wiley.
De Oliveira, V., Kedem, B., and Short, D.A. (1997). Bayesian prediction of transformed Gaussian random fields. J. Am. Stat. Assoc. 92: 1422–1433.
Diggle, P.J. (2010). Spatial point pattern. In: International Encyclopedia of Statistical Science, Volumes I, II, III (ed. M. Lovric), 1361–1363. Berlin: Springer.
Diggle, P. and Ribeiro, P. (2007). Model‐Based Geostatistics. New York: Springer.
Gaetan, C. and Guyon, H. (2010). Spatial Statistics and Modeling. New York: Springer.
Gräler, B. (2014). Modelling skewed spatial random fields through the spatial vine copula. Spatial Stat. 10: 87–102.
Kazianka, H. (2013). spatialCopula: a Matlab toolbox for copula‐based spatial analysis. Stochastic Environ. Res. Risk Assess. 27: 121–135.
Kazianka, H. and Pilz, J. (2010a). Model‐based Geostatistics. In: International Encyclopedia of Statistical Science, Volumes I, II, III (ed. M. Lovric), 833–836. Berlin: Springer.
Kazianka, H. and Pilz, J. (2010b). Copula‐based geostatistical modeling of continuous and discrete data including covariates. Stochastic Environ. Res. Risk Assess. 24: 661–673.
Krige, D.G. (1951). A statistical approach to some mine valuations and allied problems at the Witwatersrand. Master's thesis, University of Witwatersrand, Johannesburg, South Africa.
Krüger, L. (1912). Konforme Abbildung des Erdellipsoids in die Ebene. Veröff. Kgl. Preuß. Geod. Inst. Nr. 51.
Lantuéjoul, C. (2002). Geostatistical Simulation. Models and Algorithms. Berlin: Springer.
Mase, S. (2010). Geostatistics and kriging predictors. In: International Encyclopedia of Statistical Science, Volumes I, II, III (ed. M. Lovric), 609–612. Berlin: Springer.
Matheron, G. (1963). Principles of Geostatistics. Econ. Geol. 58: 1246–1266.
Müller, W. (2007). Collecting Spatial Data, 3e. Heidelberg: Springer.
Pilz, J. (ed.) (2009). Interfacing Geostatistics and GIS. Berlin‐Heidelberg: Springer.
Pilz, J. (2010). Spatial statistics. In: International Encyclopedia of Statistical Science, Volumes I, II, III (ed. M. Lovric), 1363–1368. Berlin: Springer.
Pilz, J., Kazianka, H., and Spöck, G. (2012). Some advances in Bayesian spatial prediction and sampling design. Spatial Stat. 1: 65–81.
Rasch, D., Herrendörfer, G., Bock, J. et al. (eds.) (2008). Verfahrensbibliothek Versuchsplanung und ‐ auswertung, 2e. München Wien: R. Oldenbourg.
Ripley, B.D. (1988). Statistical Inference for Spatial Processes. Cambridge, UK: Cambridge University Press.
Sampson, P.D., Damien, D., and Guttorp, P. (2001). Advances in Modelling and Inference. London: Academic Press.
Schabenberger, O. and Gotway, C.A. (2005). Statistical Methods for Spatial Data Analysis. Boca Raton: Chapman & Hall/CRC Press.
Schwarz, G.E. (1978). Estimating the dimension of a model. Ann. Stat. 6: 461–464.
Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statistique Univ. Paris 8: 229–231.
Snow, J. (1855). On the Mode of Communication of Cholera. London: John Churchill.
Spöck, G. and Pilz, J. (2010). Analysis of areal and spatial interaction data. In: International Encyclopedia of Statistical Science, Volumes I, II, III (ed. M. Lovric), 35–39. Berlin: Springer.
Spöck, G. and Pilz, J. (2015). Taking account of covariance estimation uncertainty in spatial sampling design for prediction with trans‐Gaussian random fields. Front. Environ. Sci. 3: 39, 1–39, 22.
Wackernagel, H. (2010). Multivariate Geostatistics. Heidelberg‐Berlin: Springer.
