Chapter 10
Spatial Panels

10.1 Spatial Correlation

If the cross‐sectional dimension of a dataset has any form of ordering, or if a distance is defined over each pair of observations (here: spatial units), one can use spatial methods to account for the possibility that correlation be stronger between “nearby” ones. The most commonly used definitions of proximity are either distance‐ or neighborhood‐related. Neighborhood depends on the spatial units being arranged in a topological space on a regular or irregular grid, an example of the latter being state or regional borders in geography.¹ On the subject, see Anselin (1988, Ch. 3).

This subject is most relevant in nonrandom samples such as countries within a geographical region, or regions within one country; but spatial methods can also be employed wherever some kind of distance between observations is defined, be it in a geographic space or perhaps in an economic, demographic, or psychological one. Hence spatial methods, although more common in the former context, can be relevant in random samples too, e.g., in household surveys.

10.1.1 Visual Assessment

Correlation in bidimensional space can be multifaceted, and in some ways more complicated to assess than correlation in time, which has a single dimension and often an obvious direction. Therefore, preliminary data analysis based on visual assessments, while always important and perhaps underutilized in econometric practice (Kleiber and Zeileis, 2008), is all the more useful in a spatial context. In the first part of this section we present an example of visual assessment of spatial correlation drawing on R's map plotting facilities; next, we proceed to formal statistical tests.

Example 10‐1 Visual assessment of spatial correlation – `HousePricesUS` data set

Visualizing data on a choropleth map is often the first step toward assessing the correlation of data in a geographical space. Plotting statistical maps is a complex subject that is out of the scope of the present book and is made easier by a number of dedicated packages: below we provide an example of plotting maps with ggplot2, adapting the example in package fiftystater (Murphy, 2016) for displaying the growth of house prices indices in the USA between 1980 (=100) and 2000 (darker is lower):

 data("HousePricesUS", package="pder")
library("ggplot2")
data("fifty_states", package = "fiftystater")
houses00 <- subset(HousePricesUS, year == 2000)
houses00$name <- tolower(houses00$name)
p <- ggplot(houses00, aes(map_id = name)) +
    geom_map(aes(fill = price), map = fifty_states) +
    expand_limits(x = fifty_states$long, y = fifty_states$lat) +
    coord_map() +
    scale_x_continuous(breaks = NULL) +
    scale_y_continuous(breaks = NULL) +
    labs(x = "", y = "") +
    theme(legend.position = "bottom",
          panel.background = element_blank()) +
    theme(legend.text = element_text(size = 6),
          legend.title= element_text(size = 8),
          axis.title = element_text(size = 8))
p <- p + scale_fill_gradient2(low = "grey30", high = "grey5")
p

Figure 10.1 Growth of house price indexes in the USA between 1980 and 2000.

Clusters of low‐growth regions stand out by their darker color, and clusters of high‐growth regions by their lighter one (Figure 10.1): in general, shades are distributed nonrandomly, with nearby states tending to behave similarly. Formal testing for spatial correlation is likely to corroborate this first impression.

10.1.2 Testing for Spatial Dependence

One first issue when confronted with spatially referenced data is to determine whether spatial dependence exists, i.e., whether “nearby” units (according to the chosen metric) are more correlated than distant ones. The raw data are tested for spatial dependence in order to inform and justify the use of spatial estimation methods; then, after estimation, the residuals are tested again to determine whether the model has been able to effectively account for the spatial features of the process at hand.

10.1.2.1 CD(p) Tests for Local Cross‐sectional Dependence

A very flexible way of assessing whether dependence in the cross‐section of a panel dataset is spatially related goes through a particularization of the CD test for general cross‐sectional dependence described in Chapter 4. The latter is in principle completely a‐spatial, being based on a scaled average of the pairwise correlation coefficients between observations (or residuals). Still, the CD statistic can be restricted to those pairs of observations satisfying a given criterion: most frequently contiguity‐based neighborhood, but possibly also that the distance between them be under a given cutoff level.

The local variant of the CD test, called the $CD(p)$ test (Pesaran, 2004), takes into account an appropriate subset of neighboring cross‐sectional units to check the null of no cross‐sectional dependence against the alternative of local cross‐sectional dependence, i.e., dependence between neighbors only. To do so, the pairs of neighboring units are selected by means of a binary proximity matrix, in which zeros correspond to pairs of observations that are not neighbors. The latter is used for discarding the correlation coefficients relative to pairs of observations that are not neighbors in computing the CD statistic. The test is then defined as:

$$CD(p) = \sqrt{\frac{1}{\sum_{i=1}^{N-1}\sum_{j=i+1}^{N} [w(p)]_{ij}}} \left( \sum_{i=1}^{N-1}\sum_{j=i+1}^{N} [w(p)]_{ij} \sqrt{T_{ij}}\, \hat{\rho}_{ij} \right)$$

where $[w(p)]_{ij}$ is the $(i,j)$‐th element of the $p$‐th order proximity matrix, so that if any pair $h, k$ are not neighbors, $[w(p)]_{hk} = 0$ and $\hat{\rho}_{hk}$ is eliminated from the summation; and $T_{ij}$ is the number of time series observations in common between individuals $i$ and $j$ ($T_{ij} = T$ if the panel is balanced).²

The same procedure can be applied to the LM and SCLM tests described in section 4.3.1. The local version of either test can be computed by supplying an $N \times N$ matrix (of any type coercible to logical), providing information on whether any pair of observations are neighbors or not, to the w argument of pcdtest. If w is supplied, only neighboring pairs will be used in computing the test; else, w will default to NULL, and all observations will be used. The matrix need not actually be binary, so commonly used “row‐standardized” matrices can be employed as well: it is enough that neighboring pairs correspond to nonzero elements in w³.
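The statistic described above can be sketched in a few lines of base R for a balanced panel. This is only an illustration under stated assumptions: the data are simulated, the ring‐shaped neighborhood matrix is hypothetical, and `cd_local` is an illustrative helper rather than a function of plm (in practice pcdtest should be used).

```r
# Local CD(p) statistic for a balanced panel stored as a T x N matrix.
# 'cd_local' is a hypothetical helper, not part of plm.
cd_local <- function(x, w) {
  n_time <- nrow(x)
  rho <- cor(x)                          # pairwise correlation coefficients
  keep <- upper.tri(rho) & (w != 0)      # neighboring pairs only, i < j
  sqrt(1 / sum(keep)) * sum(sqrt(n_time) * rho[keep])
}

set.seed(1)
n <- 10
n_time <- 30

# Hypothetical ring: unit i borders units i - 1 and i + 1
w <- matrix(0, n, n)
ring <- cbind(1:n, c(2:n, 1))
w[ring] <- 1
w[ring[, 2:1]] <- 1

x <- matrix(rnorm(n * n_time), n_time, n)   # no dependence by construction
cd_stat <- cd_local(x, w)
p_value <- 2 * pnorm(-abs(cd_stat))         # standard normal under the null

# With all pairs selected, the statistic reduces to the global CD test
cd_global <- cd_local(x, matrix(1, n, n))
```

Since only the zero/nonzero pattern of `w` is used, a row‐standardized matrix would select the same pairs and give the same value.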

10.1.2.2 The Randomized W Test

The $CD(p)$ test is flexible and well behaved in small samples; moreover, it does not suffer from the biggest drawback of its global sibling, which does not have any power under zero‐mean dependence and therefore cannot be employed, for example, on cross‐sectionally demeaned data (or, equivalently, on the residuals of a model containing time fixed effects). Nevertheless, it does not tolerate serial correlation and can be sensitive to non‐spatial types of dependence. In fact, if cross‐sectional dependence of the non‐spatial type is present and a $CD(p)$ test is performed, it will be based on a subset of spatially related pairs from a population of correlated ones; it is therefore likely to yield a false positive result (a type I error) favoring spatial dependence.

The idea underlying the $CD(p)$ test, that not all pairs of observations are correlated but only those in a specific spatial relationship are, and that the latter are identified through the $W$ matrix, gives rise to another testing procedure that is remarkably robust to all the above confounding features. The RW test of Millo (2017a) employs a permutation procedure to produce a large number of randomized neighborhood matrices and then compares the $CD(p)$ statistic under the true spatial ordering with the population of those under the randomized ones. If spatial dependence is absent, the observations must be exchangeable in the cross‐section: then, the true $CD(p)$ will not take an extreme value with respect to the randomization‐based ones, and the null hypothesis of no spatial dependence will not be rejected. As usual, the share of randomized statistics more extreme than the true one will be the pseudo‐$p$ of the test. In the majority of situations, the alternative hypothesis is of positive spatial dependence. In this case a one‐tailed test will be appropriate. Given a panel‐indexed vector $z$, call $CD_r(z)$ the randomized statistic from the $r$‐th draw, with $r = 1, \dots, R$; and $CD(z)$ the one under the true $W$. If the alternative is positive spatial dependence, the pseudo‐$p$ of the one‐tailed test is then

$$p = \frac{1}{R+1}\left[1 + \sum_{r=1}^{R} \mathbb{1}\big(CD_r(z) \geq CD(z)\big)\right] \tag{10.1}$$

where $\mathbb{1}(\cdot)$ is the indicator function. The null of no spatial dependence in $z$ would be rejected at, say, 5% significance if $p < 0.05$, meaning that the actual value is more extreme than the 95th quantile of the distribution of randomized values.

Negative spatial autocorrelation is less common in empirical practice but can be relevant, e.g., in the description of competitive processes (see Griffith and Arbia, 2010; Elhorst and Zigova, 2014). In this case it may happen that the distribution of randomized statistics is shifted in the opposite direction by positive global dependence, so that the value of the true test statistic is less extreme, and the one‐tailed procedure would not work. A two‐tailed test is then needed, which is easily accomplished by taking absolute values and cross‐sectionally demeaning the data so that the average of the factors, and hence the average global correlation, is re‐centered on zero:

$$p = \frac{1}{R+1}\left[1 + \sum_{r=1}^{R} \mathbb{1}\big(|CD_r(\tilde{z})| \geq |CD(\tilde{z})|\big)\right] \tag{10.2}$$

where $\tilde{z}$ denotes the cross‐sectionally demeaned data. To take heed of possible asymmetries in the (re‐centered) distribution of randomized statistics, one can go the safest way employing the asymmetric version of the test:

$$p = 2 \min\left\{ \frac{1}{R+1}\left[1 + \sum_{r=1}^{R} \mathbb{1}\big(CD_r(\tilde{z}) \geq CD(\tilde{z})\big)\right],\; \frac{1}{R+1}\left[1 + \sum_{r=1}^{R} \mathbb{1}\big(CD_r(\tilde{z}) \leq CD(\tilde{z})\big)\right] \right\} \tag{10.3}$$
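The randomization idea can be sketched in base R as follows. This is a rough illustration, not the splm implementation: the helper names (`cd_local`, `cd_permuted`, `rw_sketch`), the ring‐shaped neighborhood, and the simulated data are all assumptions, and counting the true ordering among the `n_draws + 1` orderings is one plausible convention for the pseudo‐$p$.

```r
# Hypothetical helpers sketching the randomization idea; not the splm code.
cd_local <- function(x, w) {
  rho <- cor(x)                              # x is a T x N matrix
  keep <- upper.tri(rho) & (w != 0)          # neighboring pairs only
  sqrt(nrow(x) / sum(keep)) * sum(rho[keep])
}

# Statistics under n_draws random relabelings of the spatial units
cd_permuted <- function(x, w, n_draws) {
  replicate(n_draws, {
    perm <- sample(ncol(x))
    cd_local(x, w[perm, perm])               # same graph, scrambled ordering
  })
}

rw_sketch <- function(x, w, n_draws = 199) {
  share <- function(hits) (1 + sum(hits)) / (n_draws + 1)
  cd_true <- cd_local(x, w)
  cd_rand <- cd_permuted(x, w, n_draws)
  xc <- x - rowMeans(x)                      # demean each period's cross-section
  cdc_true <- cd_local(xc, w)
  cdc_rand <- cd_permuted(xc, w, n_draws)
  c(one_tailed = share(cd_rand >= cd_true),
    two_tailed = share(abs(cdc_rand) >= abs(cdc_true)),
    asymmetric = min(1, 2 * min(share(cdc_rand >= cdc_true),
                                share(cdc_rand <= cdc_true))))
}

set.seed(10)
n <- 20
n_time <- 40
w <- matrix(0, n, n)
ring <- cbind(1:n, c(2:n, 1))
w[ring] <- 1
w[ring[, 2:1]] <- 1

# Simulated data with positive spatial dependence along the ring
a_inv <- solve(diag(n) - 0.6 * w / rowSums(w))
x <- matrix(rnorm(n * n_time), n_time, n) %*% t(a_inv)
pvals <- rw_sketch(x, w)
```

Permuting rows and columns of `w` jointly keeps the neighborhood graph intact while breaking its link with the actual units, which is what exchangeability under the null allows.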

Example 10‐2 Spatial dependence – `HousePricesUS` data set

In their analysis of the income elasticity of house prices across continental US states, Holly et al. (2010) employ CD tests to assess cross‐sectional dependence in the raw data and in the residuals from the various regression models they estimate. Below we present an assessment of spatial dependence in the house prices index from their dataset employing a binary neighborhood matrix. As was the case for the a‐spatial CD test, if analyzing raw data, then the data.frame must be pre‐transformed into a pdata.frame, so the testing function can find the appropriate indices:

 data("usaw49", package="pder")
library("plm")
php <- pdata.frame(HousePricesUS)
pcdtest(php$price, w = usaw49)

Pesaran CD test for local cross-sectional dependence
in panels

data:  php$price
z = 37, p-value <2e-16
alternative hypothesis: cross-sectional dependence

The local test finds a strong, statistically very significant average correlation between neighboring pairs. There is little doubt that the original data are correlated in the cross section; it remains to be ascertained whether said correlation is truly spatial or due to common factor influence in the process originating the data. An RW test will determine whether there is any spatial correlation proper left after controlling for cross‐sectional correlation:

 library("splm")
rwtest(php$price, w = usaw49, replications = 999)

Randomized W test for spatial correlation of order 1

data:  formula
p-value = 0.002
alternative hypothesis: twosided

The spatial correlation according to the “true” neighborhood matrix is the most extreme in the distribution of statistics obtained from drawing 999 more random orderings next to the original one, leaving little doubt about the presence of a spatial component in the process generating the data. The same is found when analyzing the explanatory variable, income. The question then becomes, after estimating the model, whether there is any spatial correlation remaining in the residuals after explaining house prices through income, or whether the spatial structure of income effectively explained away that in the dependent variable, house prices. Holly et al. (2010) estimate a common correlated effects (CCE) model of house prices vs income in order to control for unobservable common factors perturbing the relationship of interest. CCE will effectively “defactor” the model residuals, so that any purely spatial process will be detectable by a $CD(p)$ test of the residuals without the confounding effect of the common factors. The same goes for the RW test, but the latter controls for factor structures by itself, so that it can be applied without defactoring as well. Residuals from a pmg object are a regular pseries so that pcdtest and rwtest can be directly applied:

mgmod <- pmg(log(price) ~ log(income), data = HousePricesUS)
ccemgmod <- pmg(log(price) ~ log(income), data = HousePricesUS, model = "cmg")
pcdtest(resid(ccemgmod), w = usaw49)

Pesaran CD test for local cross-sectional dependence
in panels

data:  resid(ccemgmod)
z = 28, p-value <2e-16
alternative hypothesis: cross-sectional dependence

rwtest(resid(mgmod), w = usaw49, replications = 999)

Randomized W test for spatial correlation of order 1

data:  formula
p-value = 0.002
alternative hypothesis: twosided

Any way we look at it, substantial spatial dependence is still present in the model residuals even after controlling for cross‐sectional common factors.

10.2 Spatial Lags

The basic tool of spatial econometrics is the definition of a spatial lag. Given an observation $x_i$ and a distance metric, the spatial lag of that observation is usually defined as some kind of weighted average of the observations that are considered “near” to it according to the given metric: $\sum_{j \neq i} w_{ij} x_j$. Either a distance or a neighborhood matrix is commonly employed to provide the weights. In the neighborhood case, for each pair of observations $(i, j)$, the matrix $W$ will have an element $w_{ij} = 1$ if the two are neighbors, i.e., if they share a common border like Germany and Austria (first‐order neighborhood) or if there are at most $p - 1$ other observations separating them ($p$‐th order neighborhood), so that Italy and Germany are second‐order neighbors; and $w_{ij} = 0$ otherwise. In the distance‐based case, the generic element $w_{ij}$ will be dependent on some inverse function of the distance $d_{ij}$ between them, usually the reciprocal: $w_{ij} = 1/d_{ij}$. It is customary to set a cutoff point at some distance $\bar{d}$ beyond which one does not expect any influence to be present so that $w_{ij} = 0$ if $d_{ij} > \bar{d}$⁴. In both cases, it is customary to standardize $W$ so that the rows sum to one: $\sum_j w_{ij} = 1$ for each $i$. Then, for each $i$, $\sum_j w_{ij} x_j$ will contain, respectively, the simple average of values in neighboring locations or a distance‐weighted average of all $x_j$ for which $d_{ij} \leq \bar{d}$.
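As a small illustration of these definitions, the following base R sketch builds first‐ and second‐order neighborhood matrices, row‐standardizes the former, and computes a spatial lag. The “map” of five regions on a line and the values of `x` are made up for the purpose.

```r
# Toy map: five regions on a line, each bordering the next (hypothetical data)
n <- 5
w1 <- matrix(0, n, n, dimnames = list(LETTERS[1:n], LETTERS[1:n]))
w1[abs(row(w1) - col(w1)) == 1] <- 1        # first-order contiguity

# Second-order neighborhood as defined in the text:
# at most one other unit separating the pair
w2 <- ((w1 + w1 %*% w1) > 0) * 1
diag(w2) <- 0                               # no unit is its own neighbor

# Row-standardize so that each row sums to one
w1_std <- w1 / rowSums(w1)

x <- c(A = 10, B = 20, C = 30, D = 40, E = 50)
wx <- drop(w1_std %*% x)    # spatial lag: average of x over the neighbors
```

Here region C, bordering B and D, gets a spatial lag of (20 + 40) / 2 = 30, while the end region A has B as its only neighbor, so that its lag is simply 20.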

In all of the following, we will refer to the simpler neighborhood‐based definition of proximity. All techniques illustrated in this chapter are nevertheless applicable as well in the case of distance‐based weights. The spatial weights matrix can be based on definitions of distance not based on geographical position but defined instead in some other kind of space, e.g., one whose dimensions correspond to some set of economic, demographic, or psychological characteristics. The technical aspects of estimation do not vary with respect to the case of geographical distance, or neighborhood, as long as the fundamental hypothesis of exogeneity of $W$ holds. One desirable feature of geographic space is that it is exogenous, unlike, e.g., bilateral (contemporaneous) trade‐based weights in a model of international commerce, which would be generated inside the same economic system to be modeled.

It is important to recall that the hypothesis of exogenous and time‐invariant $W$ will be maintained throughout this chapter. Spatial lags in a panel setting can be written compactly in vector form stacking observations by time first, in the now‐standard notation, as $(I_T \otimes W) x$. The concept of spatial lag has some analogies with the familiar time lag but also important differences, the most important one being that while time is directed, space is generally not; hence the idea of predeterminedness and the fact that usually (although not always) the past is expected to influence the future but not vice versa do not apply. Dependence in space is usually circular, and the influence from “nearby” observations gives rise to feedback effects that importantly affect estimation. In particular, as will be clear in the following, a spatial lag of the dependent variable is endogenous by construction, and a model including it will require more sophisticated techniques than (ordinary or generalized) least squares in order to be consistently estimated.

10.2.1 Spatially Lagged Regressors

Suppose that the need to account for space in the specification has been established either a priori, in the economic model, or because spatial dependence has been detected in the data or in the residuals of an estimated model.

One first way to consider the influence of neighboring spatial units is to take into account spatial lags of the explanatory variables. The economic meaning of spatially lagged regressors is to account for explicit spatial influences from relevant explanatory variables in nearby spatial units. Spatial lags can easily be added to the specification and, provided $X$ was exogenous to begin with, pose no additional problem in estimation of this model.

As a first example of augmenting a model with a spatial lag, let us consider the case of a spatially lagged regressor $Wx$ representing (if $W$ is row‐standardized) the average of $x$ at neighboring locations.

Example 10‐3 Spatially lagged explanatory variables – `Cigarette` data set

Baltagi and Griffin (2001) consider demand for cigarettes across 46 US states over the years 1963‐1992 in the framework of the rational addiction model. Next to the original dynamic model, static versions have become a ubiquitous example in papers and textbooks. The demand for cigarettes (sales) is estimated as a function of real per capita income (ndi/cpi), cigarette price in the given state (price), and minimum price in neighboring states (pimin), this last term accounting for cross‐border smuggling. The model is then estimated by fixed effects through plm; we use coeftest for a compact representation of the estimation output.

 library("plm")
library("splm")
data("Cigar", package = "plm")
fm <- log(sales) ~ log(price) + log(pimin) + log(ndi / cpi)
femod <- plm(fm, Cigar)
library("lmtest")
coeftest(femod)

t test of coefficients:

             Estimate Std. Error t value Pr(>|t|)
log(price)    -0.7513     0.0462   -16.3   <2e-16 ***
log(pimin)     0.4946     0.0456    10.8   <2e-16 ***
log(ndi/cpi)   0.6801     0.0368    18.5   <2e-16 ***
‐‐‐
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

A natural application of the spatial lag operator in this context is to substitute pimin with an average of the prices in neighboring states, to account for the smuggling effect across all borders. The spatial lag operator, when applied to price using a binary contiguity, row‐standardized matrix, produces exactly this average price. We read in the relevant matrix, standardize it, and check that the row sums are actually all 1:

 data("usaw46", package = "pder")
wcig <- usaw46 / apply(usaw46, 1, sum)
summary(apply(wcig, 1, sum))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1       1       1       1       1       1

In a cross‐sectional setting, spatial lags are very easy to construct as $Wx$. In a panel setting, every cross‐section has to be premultiplied by $W$; or, equivalently, a larger block‐diagonal neighborhood matrix $I_T \otimes W$ has to be employed. Let us construct a spatial (panel) lag of the price variable. Remembering that panel data are usually ordered by state, year with the first being the “slow” index, we can proceed by making a reordered copy of Cigar, extracting the variable price and lagging it through premultiplication by $I_T \otimes W$; then adding it to the dataset:

 cig <- Cigar[order(Cigar$year, Cigar$state),]
wp <- kronecker(diag(1, 30), wcig) %*% cig$price
Cigar$wp <- wp[order(cig$state, cig$year)]

or, much faster although less intuitive, by reversing the Kronecker product:

 Cigar$wp <- kronecker(wcig, diag(1,30)) %*% Cigar$price
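
That the two routes give the same lag can be checked on a toy panel. The sketch below uses made‐up data (four units on a line, three periods) instead of Cigar; the variable names are illustrative.

```r
set.seed(42)
n <- 4
n_time <- 3

# Hypothetical row-standardized contiguity matrix for four units on a line
w <- matrix(0, n, n)
w[abs(row(w) - col(w)) == 1] <- 1
w <- w / rowSums(w)

# Toy panel sorted by unit, then time (unit is the "slow" index)
panel <- expand.grid(time = 1:n_time, unit = 1:n)
panel$x <- rnorm(nrow(panel))

# Route 1: re-sort by time, lag with I_T (x) W, restore the original order
by_time <- order(panel$time, panel$unit)
lag_sorted <- drop(kronecker(diag(n_time), w) %*% panel$x[by_time])
lag1 <- numeric(nrow(panel))
lag1[by_time] <- lag_sorted

# Route 2: keep the unit-major order and use W (x) I_T instead
lag2 <- drop(kronecker(w, diag(n_time)) %*% panel$x)

same_lag <- isTRUE(all.equal(lag1, lag2))
```

Reversing the Kronecker product simply matches the block structure of the matrix to the ordering of the data, so no re-sorting is needed.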

Now wp is a regular regressor, which we can add to the specification in lieu of pimin, redoing all the previous steps, estimating the alternative model to appreciate the difference:

fm2 <- update(fm, . ~ . - log(pimin) + log(wp))
femod2 <- plm(fm2, Cigar)
coeftest(femod2)

t test of coefficients:

             Estimate Std. Error t value Pr(>|t|)
log(price)    -0.8292     0.0528   -15.7   <2e-16 ***
log(ndi/cpi)   0.6294     0.0371    17.0   <2e-16 ***
log(wp)        0.5874     0.0537    10.9   <2e-16 ***
‐‐‐
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

To automate the tedious construction of spatial panel lags, a function slag is provided, needing a pseries and either a proximity matrix or an equivalent listw object to represent the spatial ordering of observations. The slag operator can be employed directly in formulae:

 lwcig <- mat2listw(wcig)
fm3 <- update(fm, . ~ . - log(pimin) +
                    log(slag(price, listw = lwcig)))

The somewhat cumbersome syntax deriving from the need to specify $W$ can be avoided, e.g., by defining a small convenience function where the given matrix is hardwired, as follows (the output is the same):

 wx <- function(x) slag(x, listw = lwcig)
fm3.alt <- update(fm, . ~ . - log(pimin) + log(wx(price)))

As it turns out, substituting the minimum price in neighboring states with the average price of neighbors has little effect on the model results.

10.2.2 Spatially Lagged Dependent Variables

A more direct, although much more problematic, way of incorporating spatial structure in an econometric model is through inclusion of spatial lags of the dependent variable. The model is then:

$$y = \lambda (I_T \otimes W) y + X\beta + \varepsilon$$

where $W$ is the spatial weights matrix of known constants whose diagonal elements are set to zero, and $\lambda$ is the corresponding spatial parameter.

This is called the spatial lag model proper. From a theoretical viewpoint, it is appropriate whenever one expects the outcome of one observation to influence the outcomes of neighboring ones, such as, e.g., for the spreading of a disease, where one unit being positive has a direct effect on the likelihood of neighboring units to be so too.

Another example is if (within‐period) strategic interaction is expected to happen, e.g., each country takes the tax rates of neighbors into account in setting its own and may react within the same time period, as in Franzese and Hays (2006). In this case, one might expect positive spatial correlation. In a microeconomic setting, a case where the effect of a spatial lag term could be expected to turn out positive is copycat behavior, e.g., when buying a product sparks imitation, thereby raising the propensity of neighbors to follow suit. A negative spatial lag can instead be consistent with the idea of free riding: if one can reap advantage from the actions of neighbors through some kind of externality, then this will lower his or her own effort: an example is labor market training in the European Union, where trained labor can easily commute across borders (Franzese and Hays, 2008).

Spatial‐lag‐type dependence has been evocatively termed “substantial” (Franzese and Hays, 2007) as opposed to spatial error dependence, which in the same context is described as “nuisance,” to be controlled for the sake of precision in estimation but devoid of theoretical meaning. This is not necessarily true, as spatial error dependence can have substantial meaning too, for example in the context of economic shock diffusion (see e.g. Holly et al., 2010), and can be a subject of the analysis in its own right.

The spatial lag process, and by extension the model with a spatial lag plus regressors, is universally known by the acronym SAR, for “spatially autoregressive.” The term $(I_T \otimes W) y$ is inherently endogenous; in a reduced form, the model becomes nonlinear:

$$y = \left[I_{NT} - \lambda (I_T \otimes W)\right]^{-1} (X\beta + \varepsilon)$$

so that maximum likelihood estimation (ML) is called for. Only as a very first approximation, it can be of interest to estimate the so‐called “spatial OLS”.

10.2.2.1 Spatial OLS

Ordinary least squares estimation is consistent, under the usual exogeneity conditions on $X$, for models with spatially lagged regressors, in which case it is also efficient provided that the standard hypotheses of homoscedasticity and uncorrelated errors hold; in fact, adding $WX$ may eliminate the spatial correlation in the error terms and effectively make OLS the efficient estimator. Even in the case of the spatial error model, OLS remains consistent, although inefficient, for $\beta$.

As a first approximation, and in cases where ML and GM are problematic (dynamic panels above all), the so‐called spatial‐OLS method has been advocated: adding the spatial lag of the dependent variable to the right‐hand side regressors. This solution is in general not advisable because $Wy$ is endogenous by construction, and therefore the estimator is biased and inconsistent; yet simulation studies have shown how the magnitude of the bias can be limited in real‐world cases, to the point of making this computationally simple solution relatively viable in some applied settings (see Franzese and Hays, 2007).
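The endogeneity of the spatial lag can be illustrated with a small Monte Carlo experiment in base R. This is only a sketch under arbitrary assumptions (a cross‐section of 100 units on a hypothetical ring, true spatial parameter 0.6, one regressor), not a reproduction of the cited simulation studies.

```r
set.seed(123)
n <- 100
lambda <- 0.6
beta <- 1

# Hypothetical ring-shaped neighborhood, row-standardized
w <- matrix(0, n, n)
ring <- cbind(1:n, c(2:n, 1))
w[ring] <- 1
w[ring[, 2:1]] <- 1
w <- w / rowSums(w)

a_inv <- solve(diag(n) - lambda * w)

# One simulated cross-section, estimated by spatial OLS
one_draw <- function() {
  x <- rnorm(n)
  y <- drop(a_inv %*% (beta * x + rnorm(n)))   # reduced form of the SAR model
  wy <- drop(w %*% y)                          # spatial lag, correlated with the error
  unname(coef(lm(y ~ wy + x))["wy"])
}

lambda_ols <- replicate(200, one_draw())
mean_lambda_ols <- mean(lambda_ols)   # to be compared with the true lambda
```

In this design the spatial lag is positively correlated with the error term, so the average spatial‐OLS estimate should exceed the true value; how much it does depends on the strength of the regressors relative to the noise.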

10.2.2.2 ML Estimation of the SAR Model

An appropriate way to estimate a SAR model, provided the errors are i.i.d. normal, is by ML. Let us start from the cross‐sectional case where $W$ is $N \times N$ and $y$ is a vector of length $N$. Denoting $A = I - \lambda W$, the model becomes $Ay = X\beta + \varepsilon$ so that $\varepsilon = Ay - X\beta$. Expressing the usual likelihood function of the linear model in terms of the transformed $y$ requires adding the Jacobian of the transformation, i.e., the determinant of $A$, therefore the log‐likelihood becomes:

$$\ln L = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln\sigma^2 + \ln|A| - \frac{1}{2\sigma^2}(Ay - X\beta)^\top (Ay - X\beta)$$

and this likelihood is to be optimized with respect to $\lambda$ and $(\beta, \sigma^2)$, efficient optimization strategies having been outlined in the seminal book of Anselin (1988). The pure‐SAR panel case, pooling the data without any individual feature, just substitutes $I_T \otimes W$ for $W$ and the stacked panel data of length $NT$ for $y$ and $X$, so that it could be estimated with the lagsarlm function from package spdep (moved to package spatialreg in later releases). Nevertheless, it is always preferable for computational reasons to resort to specific methods for spatial panels when available.
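A bare‐bones version of this optimization can be sketched in base R by concentrating the regression coefficients and the error variance out of the likelihood, so that only a one‐dimensional search over the spatial parameter remains. The simulated data, the ring‐shaped matrix, and the function name `sar_loglik` are assumptions made for illustration; dedicated routines are far more efficient and also deliver standard errors.

```r
set.seed(7)
n <- 100

# Hypothetical ring-shaped neighborhood, row-standardized
w <- matrix(0, n, n)
ring <- cbind(1:n, c(2:n, 1))
w[ring] <- 1
w[ring[, 2:1]] <- 1
w <- w / rowSums(w)

# Simulate a cross-sectional SAR process with spatial parameter 0.5
x <- cbind(1, rnorm(n))
y <- drop(solve(diag(n) - 0.5 * w, x %*% c(1, 2) + rnorm(n)))

# Log-likelihood concentrated with respect to beta and sigma^2
sar_loglik <- function(lam) {
  a <- diag(n) - lam * w
  res <- lm.fit(x, drop(a %*% y))$residuals     # OLS of Ay on X for given lam
  sigma2 <- sum(res^2) / n                      # ML estimate of the variance
  log_det <- as.numeric(determinant(a, logarithm = TRUE)$modulus)  # Jacobian
  -n / 2 * log(2 * pi * sigma2) + log_det - n / 2
}

fit <- optimize(sar_loglik, interval = c(-0.99, 0.99), maximum = TRUE)
lambda_hat <- fit$maximum
beta_hat <- lm.fit(x, drop((diag(n) - lambda_hat * w) %*% y))$coefficients
```

The log‐determinant has to be recomputed at every step, which is the costly part in large samples and the reason why the literature pays so much attention to efficient strategies for the Jacobian term.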

Example 10‐4 Spatial lag model – `HousePricesUS` data set

In the house prices application of Holly et al. (2010), the authors estimate a SAR model of defactored residuals to assess the presence and the degree of spatial correlation net of the influence of common factors. We replicate their analysis by estimating a pure‐SAR model (no regressors but an intercept) of the residuals from the model.⁵ With respect to common‐factor robust testing (see previous example), this has the additional advantage of explicitly estimating a spatially autoregressive coefficient measuring the intensity of the spatial effect.

The CCE residuals from pmg (or, equivalently, pcce) are a regular pseries; it is easy to make a dataframe in plm‐compliant format by stacking the individual and time indexes next to the residuals themselves, stripped of their panel attributes through the as.numeric.pseries converter function. The function spreml can then be used for estimation, specifying lag=TRUE and errors='ols' for a pure SAR model without any panel features:

 e <- resid(ccemgmod)
edat <- data.frame(ind = attr(e, "index")[[1]],
                   tind = attr(e, "index")[[2]],  e = as.numeric(e))
sarmod.e <- spreml(e ~ 1, data = edat, w = usaw49, lag = TRUE, errors = "ols")
summary(sarmod.e)$ARCoefTable
       Estimate Std. Error t-value   Pr(>|t|)
lambda   0.6498    0.02037    31.9 2.685e-223

The spatial correlation in the (defactored) residuals is estimated at 0.65 and is statistically significant at any conventional significance level. The simpler spatial‐OLS model can be easily estimated with the help of the slag function:

 library("lmtest")
coeftest(plm(e ~ slag(e, listw = usaw49) - 1, data = edat, model = "p"))

t test of coefficients:

                        Estimate Std. Error t value
slag(e, listw = usaw49)   0.8831     0.0251    35.2
                        Pr(>|t|)
slag(e, listw = usaw49)   <2e-16 ***
‐‐‐
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The bias in the SAR coefficient is evident, yet this simple procedure can be enough for detecting a problem or as a very first assessment.

10.2.3 Spatially Correlated Errors

The other main specification in the literature, the spatial error, is instead appropriate when one expects the innovation relative to one observation to influence the outcomes of neighboring ones, as would be the case for an economic shock of some kind to a given region (fully) influencing the relevant dependent variable in that region and also propagating – with distance‐decaying intensity – toward nearby ones; or for a location‐related measurement error, by its nature affecting nearby observations in a similar way. Another reason for spatially correlated errors is misspecification resulting from the omission of a spatially correlated variable. This specification is called SEM, for “spatial error model”.

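The model is then the familiar linear model with regressors:

$$y = X\beta + u$$

where $u$ is a vector of spatially autocorrelated idiosyncratic errors that follows a spatial autoregressive process of the form

$$u = \rho (I_T \otimes W) u + \varepsilon$$

with $\rho$ as the spatial autoregressive parameter, $W$ the spatial weights matrix and $\varepsilon \sim N(0, \sigma^2 I)$. As can be seen, the SEM model is nothing but a linear model with a SAR process in the errors instead of in the response. The likelihood for the cross‐sectional SEM model is:

$$\ln L = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln\sigma^2 + \ln|B| - \frac{1}{2\sigma^2}(y - X\beta)^\top B^\top B\, (y - X\beta)$$

where $B = I - \rho W$. As for the SAR case, pooling the data is accomplished by substituting the stacked panel data for $y$ and $X$, and the extended proximity matrix $I_T \otimes W$ for $W$.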
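For the spatial error case, the same concentration device applies, with the difference that for a given value of the spatial parameter the regression coefficients are obtained by OLS on spatially filtered data (i.e., by GLS). The following base R sketch relies on simulated data and a hypothetical ring‐shaped matrix; `sem_loglik` is an illustrative name, not a library function.

```r
set.seed(11)
n <- 100

# Hypothetical ring-shaped neighborhood, row-standardized
w <- matrix(0, n, n)
ring <- cbind(1:n, c(2:n, 1))
w[ring] <- 1
w[ring[, 2:1]] <- 1
w <- w / rowSums(w)

# Linear model with a spatial autoregressive error (true rho = 0.5)
x <- cbind(1, rnorm(n))
u <- drop(solve(diag(n) - 0.5 * w, rnorm(n)))
y <- drop(x %*% c(1, 2)) + u

# Log-likelihood concentrated with respect to beta and sigma^2
sem_loglik <- function(r) {
  b <- diag(n) - r * w
  fit <- lm.fit(b %*% x, drop(b %*% y))    # GLS as OLS on filtered data
  sigma2 <- sum(fit$residuals^2) / n
  log_det <- as.numeric(determinant(b, logarithm = TRUE)$modulus)
  -n / 2 * log(2 * pi * sigma2) + log_det - n / 2
}

opt <- optimize(sem_loglik, interval = c(-0.99, 0.99), maximum = TRUE)
rho_hat <- opt$maximum
b_hat <- diag(n) - rho_hat * w
beta_ml <- lm.fit(b_hat %*% x, drop(b_hat %*% y))$coefficients
beta_ols <- lm.fit(x, y)$coefficients      # consistent but inefficient benchmark
```

Comparing `beta_ml` with `beta_ols` on repeated samples would show both centered on the true coefficients, the former with smaller dispersion.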

It is typical in the literature to estimate either of the two specifications, SAR or SEM, although in principle they can be combined. The subject of choosing between the spatial lag and the spatial error models by means of diagnostic testing will be treated in the following; it should nevertheless be borne in mind that the specification of one or the other spatial model should always be informed by the a priori beliefs of the researcher and the economic model she postulates for the phenomenon at hand. In fact, while some empirical cases happen to be sufficiently clear‐cut for an exclusively data‐driven decision to be taken, most of the time model uncertainty – regarding the specification of regressors, of the neighborhood structure (the $W$ matrix), or that of the spatial process in either response or error – is so pervasive that one can hardly rely on statistical procedures alone in order to conduct a specification search.

Nevertheless, from a diagnostic rather than modeling viewpoint, a general result is that the omission of a spatially correlated relevant regressor would show up as spatially correlated errors, and the same would happen for the omission of a spatially lagged dependent variable; much as would happen in time series data with omitted dynamics showing up in residual autocorrelation. Generality stops here, though, because while the symptoms of either neglected spatial lag or error processes are similar, the consequences on the properties of estimators are different already. In fact, an omitted spatial lag renders the estimator inconsistent, while an omitted spatial process in the error merely results in inefficiency and invalid inference.
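The different consequences of the two kinds of neglected spatial process can be seen in a small simulation. The sketch below, in base R, uses made‐up cross‐sectional data on a hypothetical ring of 100 units and estimates a plain OLS regression that ignores space under either data generating process; all names and parameter values are illustrative assumptions.

```r
set.seed(99)
n <- 100
beta <- 1

# Hypothetical ring-shaped neighborhood, row-standardized
w <- matrix(0, n, n)
ring <- cbind(1:n, c(2:n, 1))
w[ring] <- 1
w[ring[, 2:1]] <- 1
w <- w / rowSums(w)
filter_inv <- solve(diag(n) - 0.6 * w)   # (I - 0.6 W)^{-1}, used for lag and error

# OLS slope from a regression ignoring the spatial structure
ols_slope <- function(dgp) {
  x <- rnorm(n)
  e <- rnorm(n)
  y <- switch(dgp,
              sar = drop(filter_inv %*% (beta * x + e)),  # spatial lag in y
              sem = beta * x + drop(filter_inv %*% e))    # spatial process in errors
  unname(coef(lm(y ~ x))["x"])
}

slope_sar <- replicate(300, ols_slope("sar"))
slope_sem <- replicate(300, ols_slope("sem"))
avg_slopes <- c(sar = mean(slope_sar), sem = mean(slope_sem))  # compare with beta
```

Under the spatial error process the average slope stays close to the true value, while omitting the spatial lag shifts it systematically; the usual OLS standard errors are unreliable in both cases.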

Example 10‐5 Spatial error – `RiceFarms` data set

The RiceFarms dataset contains observations from 171 rice farms in Indonesia, observed over six growing seasons, three wet and three dry, between 1975 and 1983. The farms are located in six different villages of the Chimanuk River basin in West Java. According to Druska and Horrace (2004), two villages are in flatlands on the north coast of the island, three in the highlands (600‐1100 m) in the central part of West Java, and the last is in the center of the island with an average altitude of 375 meters. Roads and more in general proximity to big cities are extremely heterogeneous.

In this geographical setting, one can expect both village‐level heterogeneity and spatial correlation between farms belonging to the same village. Spatial dependence is easier to justify for the error terms, due to spillovers across neighboring farms in idiosyncratic factors and climate conditions; more difficult to find reasons for the inclusion of a spatial lag of the dependent variable, as it seems less realistic for the outcome in one farm to influence those of neighbors.⁶

With respect to the original analysis, our production frontier equation will relate rice output to three inputs only: seed, labor hours totlabor, and land size, all in logs.

A contiguity matrix riceww is provided, where for each farm all other farms from the same village are defined as neighbors. The SEM panel model is then explicitly augmented with village fixed effects and time fixed effects to account for the influence of the different growing seasons. It is estimated through the spreml function, setting the lag to FALSE and the errors to 'sem':

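library("plm")     # pdata.frame()
library("splm")    # spreml(), RiceFarms, riceww
library("spdep")   # mat2listw()

data("RiceFarms", package = "splm")
data("riceww", package = "splm")
# listw version of the weights matrix, used by spml() in later examples
ricelw <- mat2listw(riceww)
Rice <- pdata.frame(RiceFarms, index = "id")

# production function with village (region) and season (time) dummies
riceprod <- log(goutput) ~ log(seed) + log(totlabor) +
    log(size) + region + time

# pooled SEM: no spatial lag, spatially autocorrelated errors
rice.sem <- spreml(riceprod, data = Rice, w = riceww,
                   lag = FALSE, errors = "sem")

summary(rice.sem)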
ML panel with, spatial error correlation

Call:
spreml(formula = riceprod, data = Rice, w = riceww, lag = FALSE,
    errors = "sem")

Residuals:
    Min.  1st Qu.   Median  3rd Qu.     Max.
-1.06858 -0.23300  0.00581  0.23481  1.48962

Error variance parameters:
    Estimate Std. Error t-value Pr(>|t|)
rho   0.5627     0.0518    10.9   <2e-16 ***

Coefficients:
                  Estimate Std. Error t-value Pr(>|t|)
(Intercept)        5.85413    0.19469   30.07  < 2e-16 ***
log(seed)          0.16626    0.02475    6.72  1.8e-11 ***
log(totlabor)      0.24822    0.02758    9.00  < 2e-16 ***
log(size)          0.59776    0.02800   21.35  < 2e-16 ***
regionlangan      -0.09779    0.09137   -1.07    0.285
regiongunungwangi -0.14048    0.08422   -1.67    0.095 .
regionmalausma    -0.11865    0.08650   -1.37    0.170
regionsukaambit    0.00723    0.09372    0.08    0.938
regionciwangi     -0.01381    0.08465   -0.16    0.870
time2             -0.04745    0.07885   -0.60    0.547
time3             -0.18551    0.07886   -2.35    0.019 *
time4             -0.34722    0.07883   -4.40  1.1e-05 ***
time5              0.15818    0.07886    2.01    0.045 *
time6              0.13805    0.07881    1.75    0.080 .
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Somewhat surprisingly, the village fixed effects turn out to be all but unimportant. Conversely, spatial error correlation between farms belonging to the same village (estimated by the coefficient $\rho$, reported as rho in the output) is substantial and highly significant.
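The neighborhood definition used in this example (all farms of the same village are neighbors) is easy to reproduce. The sketch below uses a hypothetical toy vector of village labels, not the RiceFarms data, and row-standardizes the result; row-standardization is a common convention that we assume here for illustration.

```r
# Hypothetical toy data: 7 farms in 3 villages (not the RiceFarms data)
village <- c("a", "a", "a", "b", "b", "c", "c")

# Binary contiguity: 1 if two distinct farms belong to the same village
C <- outer(village, village, "==") * 1
diag(C) <- 0

# Row-standardize so that each row sums to one
W <- C / rowSums(C)
round(W, 2)
```

The resulting matrix is block-diagonal by village, which is why a spatial error term and village effects compete in explaining within-village similarity.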

10.3 Individual Heterogeneity in Spatial Panels

Cross‐sectional spatial specifications are readily extended to the case of a pooled panel dataset, as above, but in the case of spatial panels, just as in the general case, it becomes of primary interest to model heterogeneity and persistence at the individual level. Again, the most popular device is the inclusion of individual, time‐invariant effects in the model, and again the crucial distinction is whether said effects can be assumed independent from the model regressors or not. From a statistical viewpoint, the approach detailed in the previous chapters when speaking of non‐spatial panels is still valid, but there are also specific considerations to be made for spatial applications. For example, as the random effects hypothesis is considered consistent with sampling individuals from a potentially infinite population, some (Elhorst and Fréret (2009) for example) have dismissed its plausibility in spatial econometric contexts, where sampling most typically takes place over a fixed set of countries or regions.

Spatial methods are nevertheless of interest also in contexts much akin to random sampling. For one, applications on survey data can be devised where individual units are located in some non-geographic space, defined by their attributes and a distance function. Among geographically referenced data proper, random samples of firms or households can likewise be located and recorded as points in the landscape (Bell and Bockstael, 2000). In this sense, the RiceFarms dataset is a good candidate for random effects: many locations with similar characteristics, plausibly drawn from the same distribution, are grouped in a way that naturally defines a neighborhood, although latitude and longitude information is lacking. Another case, this time with data located as points in geographical space, is that of the ever more popular spatial applications from experimental contexts in the life sciences, of which we will see an example later in the chapter.

Moreover, from a computational viewpoint random effects turn out to be a more general case with respect to fixed effects.

10.3.1 Random versus Fixed Effects

As detailed in the previous chapters and recalled above, unobserved individual heterogeneity is dealt with in different ways depending on the statistical properties of the individual effects, the crucial distinction becoming whether one can assume them to be uncorrelated with the regressors or not. If uncorrelated, then individual effects can be considered as a component of the error term. If not, then the latter strategy leads to inconsistency; the individual effects will have to be estimated or, more frequently, eliminated by first differencing or time‐demeaning the data. In the spatial setting, the standard solution to the fixed effects case has long been time‐demeaning: in the framework of Elhorst (2003), fixed effects estimation of spatial panel models is accomplished as pooled ML estimation on time‐demeaned data. Nevertheless, Elhorst's procedure has been questioned by Anselin et al. (2008) because time‐demeaning alters the properties of the joint distribution of errors, introducing serial dependence. As it turns out, despite the misspecification of the likelihood, the only parameter affected is the variance of the error term, the other estimators remaining consistent.⁷

To solve the problem, Lee and Yu (2010a, 3.2) suggest either a different orthonormal transformation of the data, or an ex‐post correction of the estimated variance (see also Lee and Yu, 2012). For all this, ML estimation of spatial panel models with individual fixed effects is encompassed by the ML estimator for the pooled case, after a suitable transformation of the data and, in the case one uses the simpler within transformation, an appropriate ex‐post correction of the error variance estimate.
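The effect of time-demeaning on the error distribution can be checked numerically. The following base R sketch, with a hypothetical panel of i.i.d. unit-variance errors, shows that the within transformation induces serial dependence and that the naive variance estimate computed on demeaned data is too small by a factor $(T-1)/T$. The rescaling by $T/(T-1)$ is shown only in the spirit of the ex-post correction mentioned above; the exact correction applied by a given package may differ.

```r
T_ <- 6                                 # number of time periods (assumed)
Q <- diag(T_) - matrix(1 / T_, T_, T_)  # time-demeaning (within) matrix

# For i.i.d. unit-variance errors, the covariance of the demeaned errors is Q:
# its off-diagonal entries equal -1/T, i.e., demeaning induces serial dependence
Q[1:2, 1:2]

# Q is idempotent with trace (hence rank) T - 1
rank_Q <- sum(diag(Q))

# Monte Carlo check on a hypothetical panel of n individuals
set.seed(1)
n <- 5000
eps <- matrix(rnorm(n * T_), n, T_)       # true error variance is 1
eps_w <- eps - rowMeans(eps)              # within transformation, row by row
s2_naive <- sum(eps_w^2) / (n * T_)       # close to (T - 1) / T
s2_corrected <- s2_naive * T_ / (T_ - 1)  # rescaled estimate
c(naive = s2_naive, corrected = s2_corrected)
```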

Example 10‐6 Spatial fixed effects – `RiceFarms` data set

Spatial fixed effects panels can be estimated through the general wrapper function for maximum likelihood estimation, spml for “spatial panel by maximum likelihood”, leaving the model argument at the default value of 'within' (in the case model is either of 'random' or 'pooling', the lower level function spreml seen in previous examples is called; while here a special infrastructure is used). The spatial structure in the error can be 'none' or either of 'b' or 'kkp', which makes a difference only in the random effects case. A spatial lag can be included setting lag to TRUE. Village fixed effects must be omitted here because of collinearity, while time ones can be implicitly added to estimation by specifying effect='twoways', again consistently with the syntax of plm.

riceprod0 <- update(riceprod, . ~ . - region - time)
semfemod <- spml(riceprod0, Rice, listw = ricelw,
                 lag = FALSE, spatial.error = "b")
summary(semfemod)
Spatial panel fixed effects error model


Call:
spml(formula = riceprod0, data = Rice, listw = ricelw, lag = FALSE,
    spatial.error = "b")

Residuals:
   Min. 1st Qu.  Median 3rd Qu.    Max.
-1.0195 -0.2105  0.0222  0.2127  1.3298

Spatial error parameter:
    Estimate Std. Error t-value Pr(>|t|)
rho   0.7913     0.0249    31.8   <2e-16 ***

Coefficients:
              Estimate Std. Error t-value Pr(>|t|)
log(seed)       0.1342     0.0226    5.94  2.8e-09 ***
log(totlabor)   0.2505     0.0267    9.38  < 2e-16 ***
log(size)       0.5419     0.0273   19.84  < 2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

A Hausman-type test will determine whether the individual effects are to be treated as fixed or can be assumed uncorrelated with the regressors, in which case a more efficient random effects specification can be employed:

 Rice <- pdata.frame(RiceFarms, index = "id")
sphtest(riceprod0, Rice, listw = ricelw)

Hausman test for spatial models

data:  x
chisq = 2.6, df = 3, p-value = 0.4
alternative hypothesis: one model is inconsistent

Since the random effects hypothesis is not rejected, random effects methods are in order.
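For readers who want to see the mechanics behind a Hausman-type statistic, the classic quadratic form can be sketched in a few lines of base R. The function `hausman()` and all the numbers below are hypothetical and for illustration only; this is not the code used by sphtest, whose internals may differ.

```r
# Classic Hausman quadratic form (hypothetical helper, not from splm)
hausman <- function(b_fe, b_re, V_fe, V_re) {
  d <- b_fe - b_re                              # difference of estimates
  stat <- drop(t(d) %*% solve(V_fe - V_re, d))  # d' (V_fe - V_re)^{-1} d
  df <- length(d)
  c(chisq = stat, df = df,
    p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Made-up coefficient vectors and (diagonal) covariance matrices
b_fe <- c(0.13, 0.25, 0.54)
b_re <- c(0.15, 0.26, 0.57)
V_fe <- diag(c(0.030, 0.032, 0.033)^2)
V_re <- diag(c(0.024, 0.027, 0.028)^2)

res <- hausman(b_fe, b_re, V_fe, V_re)
res
```

A large statistic (small p-value) would speak against the random effects assumption; a small one, as in the example above, is compatible with it.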

10.3.2 Spatial Panel Models with Error Components

While fixed effects estimation of spatial panels can be performed in the framework of the pooled spatial models, after transforming out the individual effects by a within transformation, treating the individual effects as random introduces substantial complications in the specification of the likelihood.

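We consider a general static panel model that includes a spatial lag of the dependent variable and spatial autoregressive disturbances:

$$y = \lambda (I_T \otimes W) y + X\beta + u$$

where observations are stacked by time period, $W$ is the $n \times n$ spatial weights matrix and $\lambda$ the spatial lag coefficient. The disturbance vector is the sum of two terms:

$$u = (\iota_T \otimes I_n)\mu + \varepsilon$$

$\mu$ being the individual effect, $\iota_T$ a $T \times 1$ vector of ones and $\varepsilon$ a vector of spatially autocorrelated idiosyncratic errors that follow a spatial autoregressive process of the form

$$\varepsilon = \rho (I_T \otimes W)\varepsilon + \nu$$

with $\rho$ as the spatial autoregressive parameter, $W$ the spatial weights matrix and $\nu_{it} \sim IID(0, \sigma^2_\nu)$. The spatial weights matrices in the lag and the error term can differ (see the following). $I_n - \rho W$ is assumed non-singular.

10.3.2.1 Spatial Panels with Independent Random Effects

In a random effects specification, the unobserved individual effects are assumed uncorrelated with the other explanatory variables in the model and can therefore be safely treated as components of the error term.⁸ In this case, $\mu_i \sim IID(0, \sigma^2_\mu)$, and the error term can be rewritten as:

$$\varepsilon = (I_T \otimes B^{-1})\nu$$

where $B = I_n - \rho W$. As a consequence, the composite error term becomes

$$u = (\iota_T \otimes I_n)\mu + (I_T \otimes B^{-1})\nu$$

and its variance-covariance matrix is:

$$\Omega_u = \sigma^2_\mu (\iota_T \iota_T^\top \otimes I_n) + \sigma^2_\nu [I_T \otimes (B^\top B)^{-1}] \tag{10.4}$$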

In deriving several Lagrange multiplier (LM) tests, Baltagi et al. (2003b) consider a panel data regression model that is a special case of the model presented above in that it does not include a spatial lag of the dependent variable. Elhorst (2003), Elhorst and Fréret (2009) define a taxonomy for spatial panel data models both under the fixed and the random effects assumptions. Following the typical distinction made in cross‐sectional models, they define the fixed as well as the random effects panel data versions of the spatial error and spatial lag models. However, unlike Case (1991), they do not consider a model including both the spatial lag of the dependent variable and a spatially autocorrelated error term. Therefore, the models reviewed in Elhorst (2003), Elhorst and Fréret (2009) can also be seen as special cases of this more general specification.

Following the treatment in Millo (2014), on which this part of the chapter is based, we label the combined model containing both a spatial lag and a spatial error process SAREM. (This is also often called SARAR, because of the two spatial autoregressive processes, one in the response and one in the errors.) If a random individual effect is also part of the composite error term, then we will add the suffix RE. Although SAR and SEM, combined with either FE or RE, are by far the most popular specifications, the literature has also dealt with different types of spatial diffusion processes in the errors other than the autoregressive one, most notably the spatial moving average.⁹ We do not consider them here.

10.3.2.2 Spatially Correlated Random Effects

A different specification for the disturbances was considered in Kapoor et al. (2007). They assume that spatial correlation applies to both the individual effects and the remainder error components. Although the two data-generating processes look similar, they do imply different spatial spillover mechanisms governed by a different structure of the implied variance-covariance matrix. In this case, commonly referred to as KKP, the disturbance term $u$ follows a first-order spatial autoregressive process of the form:

$$u = \rho (I_T \otimes W) u + \varepsilon$$

where the innovation $\varepsilon$ now has the error component structure

$$\varepsilon = (\iota_T \otimes I_n)\mu + \nu$$

It follows that, writing $B = I_n - \rho W$, the variance-covariance matrix of $u$ is:

$$\Omega_u = (I_T \otimes B^{-1})\, \Omega_\varepsilon\, (I_T \otimes (B^{-1})^\top) \tag{10.5}$$

where $\Omega_\varepsilon = \sigma^2_\mu (\iota_T \iota_T^\top \otimes I_n) + \sigma^2_\nu I_{nT}$ is the typical variance-covariance matrix of a one-way error component model. The variance matrix in (10.5) is simpler than the one in (10.4), and therefore its inverse is easier to calculate, as will be discussed below. As Baltagi et al. (2013) observe, the economic meaning of the two models is also different: in the first model only the time-varying components diffuse spatially; in the second, spatial spillovers too have a permanent component. Lee and Yu (2012, 2.4) illustrate the difference between this latter specification and SEMRE through the likelihood of the between model. We label this latter alternative specification SEM2RE, and its extension to including a spatial lag (see Mutl and Pfaffermayr, 2011) SAREM2RE.
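The difference between the two covariance structures can be made concrete on a tiny example. The sketch below builds both matrices with Kronecker products in base R, assuming data stacked by time period, a hypothetical four-unit ring weights matrix and made-up parameter values. It also checks that the KKP-type matrix has the compact form $(\sigma^2_\mu \iota_T\iota_T^\top + \sigma^2_\nu I_T) \otimes (B^\top B)^{-1}$, which is what makes its inverse easier to handle.

```r
n <- 4; T_ <- 3                       # tiny hypothetical panel
rho <- 0.5; s2_mu <- 0.4; s2_nu <- 1  # made-up parameter values

# Hypothetical row-standardized ring weights matrix
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5

B <- diag(n) - rho * W
BtB_inv <- solve(crossprod(B))        # (B'B)^{-1}
J_T <- matrix(1, T_, T_)              # iota_T iota_T'

# Only the idiosyncratic error is spatially correlated
Omega_b <- s2_mu * kronecker(J_T, diag(n)) +
  s2_nu * kronecker(diag(T_), BtB_inv)

# KKP: the whole error-component term is spatially filtered
Omega_eps <- s2_mu * kronecker(J_T, diag(n)) + s2_nu * diag(n * T_)
filt <- kronecker(diag(T_), solve(B))
Omega_kkp <- filt %*% Omega_eps %*% t(filt)

# Compact Kronecker form of the KKP covariance
Omega_kkp2 <- kronecker(s2_mu * J_T + s2_nu * diag(T_), BtB_inv)
max(abs(Omega_kkp - Omega_b))         # nonzero whenever rho != 0
```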

10.3.3 Estimation

To review the theory of maximum likelihood estimation of spatial panel models with random effects, we will start from models with a spatially lagged dependent variable, spatial error correlation, and a general covariance structure for the error, as described by Anselin (1988), without any panel structure (although it must be noted that in his book Anselin (1988) already considered a SEM panel with random effects, deriving the model likelihood, as a special case). Following Millo (2014), we will introduce random effects as just one particular type of error covariance structure, thus comprising spatial panels in Anselin's general framework.¹⁰

10.3.3.1 Spatial Models with a General Error Covariance

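Maximum likelihood estimation with a general error covariance matrix has been outlined in Magnus (1978) (see also Anselin et al., 2008). If the error is distributed as $u \sim N(0, \Omega)$, then the log-likelihood is

$$\ln L = -\frac{n}{2}\ln 2\pi - \frac{1}{2}\ln|\Omega| - \frac{1}{2}u^\top \Omega^{-1} u$$

where $n$ is the number of observations.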

Particularizing this likelihood w.r.t. the case at hand, and adding a spatial filter if needed, provides a general framework for ML estimation of the models of interest. Anselin (1988), the classic reference on spatial econometric model estimation by ML, outlines the general procedure for a model with spatial lag, spatial errors, and possibly nonspherical residuals as follows. Let us restrict the analysis, for the moment, to one cross-section and let our model be:

$$y = \lambda W_1 y + X\beta + u, \qquad u = \rho W_2 u + \eta \tag{10.6}$$

with $\eta \sim N(0, \Omega)$ and, in general, $\Omega \neq \sigma^2 I$. Two special cases of this general model are often found in applied literature: if $\rho = 0$ one has the spatial autoregressive (SAR) model, while if $\lambda = 0$, the spatial (autoregressive) error (SEM) model. Both usually include the hypothesis of spherical remainder errors: $\Omega = \sigma^2 I$. Introducing the now-standard simplifying notation $A = I - \lambda W_1$, $B = I - \rho W_2$, the model becomes:

$$Ay = X\beta + u, \qquad Bu = \eta$$

where $W_1, W_2$ are potentially different spatial weights matrices.¹¹ If there exists $\Omega^{-1/2}$ such that $e = \Omega^{-1/2}\eta$ and $e \sim N(0, I)$, and $B$ is invertible, then $u = B^{-1}\Omega^{1/2}e$ and the model (10.6) can be written as

$$Ay = X\beta + B^{-1}\Omega^{1/2}e$$

or, equivalently,

$$\Omega^{-1/2}B(Ay - X\beta) = e$$

with $e$ a "well-behaved" error.

Still following Anselin, making the estimator operational requires the transformation from the unobservable $e$ to observables. Expressing the likelihood function in terms of $y$ requires calculating the Jacobian of the transformation $J = \det(\partial e / \partial y) = |\Omega^{-1/2}BA| = |\Omega^{-1/2}||B||A|$. These determinants are to be added to the log-likelihood, which becomes

$$\ln L = -\frac{n}{2}\ln 2\pi - \frac{1}{2}\ln|\Omega| + \ln|B| + \ln|A| - \frac{1}{2}e^\top e$$

where the difference w.r.t. the usual likelihood of the classic linear model is given by the terms of the Jacobian.¹² The likelihood is thus a function of $\beta$, $\lambda$, $\rho$, and parameters in $\Omega$.

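It will be convenient for our purposes, and without loss of generality, to scale the overall errors' covariance writing it as $B^{-1}\Omega(B^{-1})^\top = \sigma^2_e\Sigma$ (the latter expression is in fact more general, as it does not constrain the heteroscedastic error term to be spatially lagged, through premultiplication by $B^{-1}$, in its entirety. In our case, only the error covariance of the KKP-type (SEM2RE) specification can be separated into a heteroscedastic error term and a spatial filter and therefore straightforwardly written as $B^{-1}\Omega(B^{-1})^\top$, while the more common SEMRE specification, with spatially uncorrelated random effects, cannot). This likelihood can be concentrated w.r.t. $\beta$ and the error variance $\sigma^2_e$, by substituting $e = (\sigma^2_e\Sigma)^{-1/2}(Ay - X\beta)$, which gives

$$\ln L = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2_e - \frac{1}{2}\ln|\Sigma| + \ln|A| - \frac{1}{2\sigma^2_e}(Ay - X\beta)^\top\Sigma^{-1}(Ay - X\beta) \tag{10.7}$$

and a closed-form GLS solution for $\beta$ and $\sigma^2_e$ is available for any given set of spatial and other covariance parameters

$$\hat\beta = (X^\top\Sigma^{-1}X)^{-1}X^\top\Sigma^{-1}Ay, \qquad \hat\sigma^2_e = \frac{(Ay - X\hat\beta)^\top\Sigma^{-1}(Ay - X\hat\beta)}{n} \tag{10.8}$$

so that a two-step procedure is possible that alternates optimization of the concentrated likelihood and GLS estimation. From here on, we explicitly consider the (balanced) panel structure of the data: $n$ individuals observed over $T$ time periods.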
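The interplay between the GLS step and the concentrated likelihood can be sketched for the simplest case, a cross-sectional SAR model with spherical errors (so that $\Sigma = I$ and the GLS step reduces to OLS on the spatially filtered response). The code below is a minimal base R illustration on simulated data, with a hypothetical ring weights matrix and function names of our own (`gls_step`, `conc_loglik`); it is not the implementation used in splm.

```r
set.seed(123)
n <- 400
# Hypothetical row-standardized ring weights matrix (symmetric)
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5

X <- cbind(1, rnorm(n))
beta <- c(1, 2); lambda <- 0.5
y <- solve(diag(n) - lambda * W, X %*% beta + rnorm(n))

omega <- eigen(W, symmetric = TRUE, only.values = TRUE)$values
Wy <- W %*% y

# GLS step (OLS here, since Sigma = I): beta and sigma^2 for a given lambda
gls_step <- function(lam) {
  Ay <- y - lam * Wy                        # A y = (I - lambda W) y
  b <- solve(crossprod(X), crossprod(X, Ay))
  res <- Ay - X %*% b
  list(beta = drop(b), sigma2 = sum(res^2) / n)
}

# Concentrated log-likelihood in lambda (additive constants dropped);
# ln|A| is computed from the eigenvalues of W
conc_loglik <- function(lam) {
  sum(log(1 - lam * omega)) - n / 2 * log(gls_step(lam)$sigma2)
}

opt <- optimize(conc_loglik, interval = c(-0.99, 0.99), maximum = TRUE)
lambda_hat <- opt$maximum
fit <- gls_step(lambda_hat)                 # final GLS step
c(lambda = lambda_hat, fit$beta, sigma2 = fit$sigma2)
```

Computing the log-determinant from precomputed eigenvalues avoids refactorizing $A$ at every evaluation, which is the step that dominates the cost in larger samples.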

10.3.3.2 General Maximum Likelihood Framework

Building on the framework from Anselin (1988) outlined above, and explicitly particularizing and operationalizing it with respect to a number of possible error covariance structures, all specifications outlined above can be estimated without the need to pre-transform the data, as has been customary in the literature since Elhorst (2003). Random effects will instead be considered as one feature of the errors' covariance, just like spatial (or, later on in the chapter, serial) correlation (see Millo, 2014). Considering the spatial dependence features together with all the other sources of heteroscedasticity and correlation, instead of separating them clearly as done in the original Anselin framework, has the advantage of keeping some components of the error term (most notably, the random effects) out of the spatial dependence, which can remain a feature of the idiosyncratic error only, in accordance with most applications in the literature; but it also has some clear computational disadvantages, as will be discussed below. We will also consider the alternative specification where the individual effects are lagged together with the idiosyncratic errors, as in Kapoor et al. (2007), which one can straightforwardly express in terms of Anselin's original expression $B^{-1}\Omega(B^{-1})^\top$, also extending the structure of $\Omega$ to include serial correlation. This latter will turn out to be easier to compute, especially on large examples.

First we will discuss the combination of a spatial lag with any error covariance structure; then we will review the most significant among the latter; lastly we will give an example of operationalization through the use of analytical expressions for the inverse and determinant of the error covariance matrix $\Sigma$.

Optimization will generally be subject to box constraints according to the following rules: the spatial lag and spatial error coefficients $\lambda$ and $\rho$ will be bounded between $1/\omega_{min}$ and 1, where $\omega_{min}$ is the smallest characteristic root of $W$;¹³ the serial correlation coefficient $\psi$ will be constrained to the usual stationarity condition $|\psi| < 1$ and the variance ratio of the random effects $\phi$ to be non-negative.
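The admissible interval for a spatial coefficient follows directly from the spectrum of the weights matrix. As a small sketch (hypothetical ten-unit ring weights matrix, base R only), the lower bound can be computed as follows:

```r
n <- 10
# Hypothetical row-standardized ring weights matrix
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5

# Characteristic roots of W; Re() guards against complex roots that can
# arise with non-symmetric weights matrices
omega <- eigen(W, only.values = TRUE)$values
omega_min <- min(Re(omega))

bounds <- c(lower = 1 / omega_min, upper = 1)
bounds

# Inside the bounds the spatial filter I - rho * W is non-singular
det_inside <- det(diag(n) - 0.9 * W)
```

Bounds of this kind can then be passed to a box-constrained optimizer, for instance through the lower and upper arguments of optim with method "L-BFGS-B".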

Spatial Lag

Although both the SAR and the SEM specifications are popular in the literature, estimation generally focuses on one effect only, and there are few applications allowing for both of them to be present in the estimated model, one notable exception being the pioneering work of Case (1991). It is nevertheless straightforward, at least as far as expressing the likelihood is concerned, to combine a spatial lag with any error structure, including spatial dependence ones.

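The general likelihood for the spatial lag panel model combined with any error covariance structure is a panel version of (10.7):

$$\ln L = -\frac{nT}{2}\ln 2\pi - \frac{nT}{2}\ln\sigma^2_e - \frac{1}{2}\ln|\Sigma| + T\ln|A| - \frac{1}{2\sigma^2_e}[(I_T \otimes A)y - X\beta]^\top\Sigma^{-1}[(I_T \otimes A)y - X\beta] \tag{10.9}$$

where $A = I_n - \lambda W$, so that $\ln|I_T \otimes A| = T\ln|A|$.

The usual iterative procedure à la Oberhofer and Kmenta (1974) can be employed to obtain the maximum likelihood estimates. Starting from an initial value for the spatial lag parameter and the error covariance parameters, we obtain estimates for $\beta$ and $\sigma^2_e$ from the first-order conditions:

$$\hat\beta = (X^\top\Sigma^{-1}X)^{-1}X^\top\Sigma^{-1}(I_T \otimes A)y, \qquad \hat\sigma^2_e = \frac{[(I_T \otimes A)y - X\hat\beta]^\top\Sigma^{-1}[(I_T \otimes A)y - X\hat\beta]}{nT} \tag{10.10}$$

The likelihood can be concentrated and maximized with respect to the parameters in $A$ and $\Sigma$. The estimated values thereof are then used to update the expression for $A$ and $\Sigma^{-1}$. These steps are then repeated until convergence. In other words, for a specific $\Sigma$ the estimation can be operationalized by a two-step iterative procedure that alternates between GLS (for $\beta$ and $\sigma^2_e$) and concentrated likelihood (for the remaining parameters) until convergence.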

This general scheme can be applied to the random effects case, where it provides a simple and effective equivalent to the usual partial time‐demeaning procedure, as well as to all the more complicated error covariance specifications discussed in the following.

For example, the spatial autoregressive model with random effects can be written as a combination of spatial filtering on the regressand and a random effects structure in the errors:

$$(I_T \otimes A)y = X\beta + u, \qquad u = (\iota_T \otimes I_n)\mu + \nu$$

with $A = I_n - \lambda W$; hence it can be estimated by "plugging into" the general likelihood (10.9) the particular scaled error covariance $\Sigma = \phi(\iota_T\iota_T^\top \otimes I_n) + I_{nT}$, characterized by one parameter: $\phi = \sigma^2_\mu / \sigma^2_\nu$, the ratio of the variance of the individual effect over that of the idiosyncratic error.

Example 10‐7 Spatial lag and RE – `RiceFarms` data set

In the following, a random effects SAR version of the RiceFarms model is estimated, for the sake of illustration and comparison. This specification has little economic underpinning, there not being many reasons why the output of one farm should depend on the output of neighboring ones (here, farms from the same village).

 sarremod.ml <- spml(riceprod0, Rice, listw = ricelw,
                    model = "random", lag = TRUE, spatial.error = "none")
summary(sarremod.ml)
ML panel with spatial lag, random effects

Call:
spreml(formula = formula, data = data, index = index, w = listw2mat(listw),
    w2 = listw2mat(listw2), lag = lag, errors = errors, cl = cl)

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.59    2.52    2.81    2.78    3.05    4.02

Error variance parameters:
    Estimate Std. Error t-value Pr(>|t|)
phi   0.3690     0.0701    5.27  1.4e-07 ***

Spatial autoregressive coefficient:
       Estimate Std. Error t-value Pr(>|t|)
lambda   0.4132     0.0268    15.4   <2e-16 ***

Coefficients:
              Estimate Std. Error t-value Pr(>|t|)
(Intercept)     2.7731     0.1834   15.12  < 2e-16 ***
log(seed)       0.1415     0.0253    5.58  2.4e-08 ***
log(totlabor)   0.2740     0.0280    9.78  < 2e-16 ***
log(size)       0.5231     0.0295   17.72  < 2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Nevertheless, the SAR parameter turns out significant and relatively large in magnitude. This will prove to be a consequence of the model specification, more precisely of neglecting the "true" source of spatial dependence: the SEM term. More on this in the next examples. Individual effects are in turn detected, as witnessed by the significant variance ratio parameter $\phi$, although $\sigma^2_\mu$ is estimated at little over one third of $\sigma^2_\nu$.

Error Structures

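As already discussed, the spatial error, random effects model gives rise to two possible specifications, depending on the interaction between the spatial autoregressive effect and the individual error components: the specification first analyzed by Anselin (1988) where only the idiosyncratic error is spatially correlated:

$$y = X\beta + u, \qquad u = (\iota_T \otimes I_n)\mu + \varepsilon, \qquad \varepsilon = \rho(I_T \otimes W)\varepsilon + \nu$$

with the scaled errors' covariance (denoting $J_T = \iota_T\iota_T^\top$ and $B = I_n - \rho W$):

$$\Sigma = \phi(J_T \otimes I_n) + I_T \otimes (B^\top B)^{-1}$$

and that of Kapoor et al. (2007) where the same spatial process applies both to the individual and the idiosyncratic error component:

$$y = X\beta + u, \qquad u = \rho(I_T \otimes W)u + \varepsilon, \qquad \varepsilon = (\iota_T \otimes I_n)\mu + \nu$$

where the scaled errors' covariance is:

$$\Sigma = (\phi J_T + I_T) \otimes (B^\top B)^{-1}$$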

Example 10‐8 Baltagi or KKP random effects SEM – `RiceFarms` data set

In the following, both the SEMRE and the SEM2RE (KKP) models are estimated, again, on the RiceFarms dataset.

 semremod.ml <- spml(riceprod0, Rice, listw = ricelw,
                    model = "random", lag = FALSE, spatial.error = "b")
summary(semremod.ml)
ML panel with, random effects, spatial error correlation

Call:
spreml(formula = formula, data = data, index = index, w = listw2mat(listw),
    w2 = listw2mat(listw2), lag = lag, errors = errors, cl = cl)

Residuals:
   Min. 1st Qu.  Median 3rd Qu.    Max.
-1.1858 -0.2563  0.0119  0.2476  1.3683

Error variance parameters:
    Estimate Std. Error t-value Pr(>|t|)
phi   0.2955     0.0565    5.23  1.7e-07 ***
rho   0.7748     0.0271   28.57  < 2e-16 ***

Coefficients:
              Estimate Std. Error t-value Pr(>|t|)
(Intercept)     5.6983     0.1797   31.70  < 2e-16 ***
log(seed)       0.1520     0.0235    6.47  9.8e-11 ***
log(totlabor)   0.2562     0.0272    9.42  < 2e-16 ***
log(size)       0.5757     0.0275   20.96  < 2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

sem2remod.ml <- spml(riceprod0, Rice, listw = ricelw,
                    model = "random", lag = FALSE, spatial.error = "kkp")
summary(sem2remod.ml)
ML panel with, spatial RE (KKP), spatial error correlation

Call:
spreml(formula = formula, data = data, index = index, w = listw2mat(listw),
    w2 = listw2mat(listw2), lag = lag, errors = errors, cl = cl)

Residuals:
   Min. 1st Qu.  Median 3rd Qu.    Max.
-1.1855 -0.2563  0.0119  0.2478  1.3703

Error variance parameters:
    Estimate Std. Error t-value Pr(>|t|)
phi   0.2959     0.0569     5.2    2e-07 ***
rho   0.7686     0.0277    27.8   <2e-16 ***

Coefficients:
              Estimate Std. Error t-value Pr(>|t|)
(Intercept)     5.6986     0.1864   30.57  < 2e-16 ***
log(seed)       0.1518     0.0236    6.44  1.2e-10 ***
log(totlabor)   0.2564     0.0273    9.41  < 2e-16 ***
log(size)       0.5763     0.0275   20.94  < 2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The differences are minimal. Random effects are significant, albeit weak in magnitude; while in accordance with the original work of Druska and Horrace (2004), very strong spatial error correlation is detected. The limited importance of the RE component makes the distinction between the two specifications scarcely relevant.

10.3.3.3 Generalized Moments Estimation

The computational intensity of ML estimation, which in the simpler models is related mostly to the need to recompute the determinants at each optimization step, has long been a limiting factor in practical applications. At the end of the 20th century, samples of cross-sectional size in the hundreds were the practical maximum for the simple SAR or SEM models, both because of the difficulty of obtaining a result at all and because any result obtained could be numerically unreliable owing to precision problems (Kelejian and Prucha, 1999; Bell and Bockstael, 2000). Today, much more powerful computers have extended the scope of ML methods; on the other hand, the increasing availability of GIS data has brought forward a new generation of estimation problems of ever increasing size (see the early survey and examples in Bell and Bockstael, 2000).

This has prompted researchers to explore alternative estimation strategies. Kelejian and Prucha (1999) proposed the generalized moments (GM) method, which, despite being asymptotically equivalent to ML under normality of the errors, is consistent irrespective of the latter; computationally, moreover, it does not require the numerically cumbersome calculation of the determinants.

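The GM estimator for the cross-sectional SEM model $u = \rho W u + e$ (see also Bell and Bockstael, 2000) is based on the following three moments of the error term:

$$E\left[\tfrac{1}{n}e^\top e\right] = \sigma^2, \qquad E\left[\tfrac{1}{n}\bar e^\top \bar e\right] = \sigma^2\,\tfrac{1}{n}\mathrm{tr}(W^\top W), \qquad E\left[\tfrac{1}{n}\bar e^\top e\right] = 0 \tag{10.11}$$

where $\bar e = We$.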

The estimation strategy is based on the idea of estimating the spatial autoregressive coefficient $\rho$ from the residuals of a consistent estimator (here, OLS) and then using it in a feasible GLS analysis. With respect to maximum likelihood, the GM estimator has the additional advantage of not relying on a normality assumption for the errors. One drawback is that standard errors are not available for the $\rho$ parameter.
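The three steps (consistent residuals, moment-based estimation of the spatial coefficient, feasible GLS) can be sketched in base R for a cross-sectional SEM. The code below is an illustration on simulated data, assuming a hypothetical row-standardized ring weights matrix and an unweighted nonlinear least squares fit of the three sample moments; the function name `gm_obj` is ours and this is not the spgm implementation.

```r
set.seed(2024)
n <- 1000
# Hypothetical row-standardized ring weights matrix
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5

x <- rnorm(n)
rho <- 0.6
u <- solve(diag(n) - rho * W, rnorm(n))  # SEM errors: u = rho W u + e
y <- 1 + 2 * x + u

# Step 1: residuals from a consistent estimator (OLS)
uh <- resid(lm(y ~ x))
ub <- drop(W %*% uh)                     # W u
ubb <- drop(W %*% ub)                    # W W u
trWW <- sum(W^2)                         # tr(W'W)

# Step 2: choose (rho, sigma^2) so that the three sample moments hold
gm_obj <- function(par) {
  r <- par[1]; s2 <- par[2]
  e <- uh - r * ub                       # implied innovations
  eb <- ub - r * ubb                     # their spatial lag
  g <- c(sum(e^2) / n - s2,
         sum(eb^2) / n - s2 * trWW / n,
         sum(eb * e) / n)
  sum(g^2)
}
gm <- optim(c(0, var(uh)), gm_obj, method = "L-BFGS-B",
            lower = c(-0.99, 1e-6), upper = c(0.99, Inf))
rho_gm <- gm$par[1]

# Step 3: feasible GLS through a spatial Cochrane-Orcutt transformation
ys <- y - rho_gm * drop(W %*% y)
xs <- x - rho_gm * drop(W %*% x)
ones_s <- rep(1 - rho_gm, n)             # filtered intercept (row sums of W are 1)
fgls <- lm(ys ~ 0 + ones_s + xs)
b_fgls <- unname(coef(fgls)["xs"])
c(rho = rho_gm, slope = b_fgls)
```

No determinant of $I - \rho W$ is ever evaluated, which is the computational advantage over ML mentioned in the text.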

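The Kelejian and Prucha (1999) GM estimator was first extended to the panel case by Druska and Horrace (2004), then by Kapoor et al. (2007), who estimated the model with spatially correlated random effects described above, a specification which, after them, is known as KKP. In order to perform feasible GLS, one now needs consistent estimates of the spatial autoregressive parameter and of the two variance components of the composite error, $\sigma^2_\nu$ and $\sigma^2_1 = \sigma^2_\nu + T\sigma^2_\mu$. The estimator à la KKP estimates them based on six moment conditions, using the OLS residuals $\hat u$, which are still consistent in this setting:

$$E\begin{bmatrix} \frac{1}{n(T-1)}\varepsilon^\top Q_0 \varepsilon \\ \frac{1}{n(T-1)}\bar\varepsilon^\top Q_0 \bar\varepsilon \\ \frac{1}{n(T-1)}\bar\varepsilon^\top Q_0 \varepsilon \\ \frac{1}{n}\varepsilon^\top Q_1 \varepsilon \\ \frac{1}{n}\bar\varepsilon^\top Q_1 \bar\varepsilon \\ \frac{1}{n}\bar\varepsilon^\top Q_1 \varepsilon \end{bmatrix} = \begin{bmatrix} \sigma^2_\nu \\ \sigma^2_\nu \frac{1}{n}\mathrm{tr}(W^\top W) \\ 0 \\ \sigma^2_1 \\ \sigma^2_1 \frac{1}{n}\mathrm{tr}(W^\top W) \\ 0 \end{bmatrix} \tag{10.12}$$

where $\varepsilon = u - \rho\bar u$, $\bar\varepsilon = \bar u - \rho\bar{\bar u}$, $\bar u = (I_T \otimes W)u$, and $\bar{\bar u} = (I_T \otimes W)\bar u$; and $Q_0 = (I_T - \iota_T\iota_T^\top/T) \otimes I_n$ and $Q_1 = (\iota_T\iota_T^\top/T) \otimes I_n$ are, respectively, a time-demeaning and a time-averaging matrix.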

The moment conditions are now redundant and can be employed in different ways. The simplest is to consider only the first three moment conditions. The second way is to employ all six moments in estimating the three unknown parameters, weighing them through a covariance matrix calculated under the assumption of normally distributed errors. The third and last proceeds like the second, using all available moments but employs a simplified weighting matrix.

GM methods have been extended to the other relevant specifications in spatial econometrics. Spatial fixed effects models can also be estimated in this framework, through a modification of the KKP procedure suggested by Mutl and Pfaffermayr (2011) and consisting in replacing the OLS residuals, inconsistent under the fixed effects assumption, with spatial 2SLS within residuals; the spatial parameter is estimated by an adaptation of the simplified KKP procedure (first three moment conditions only) and used in a spatial Cochrane‐Orcutt transformation of the within‐transformed variables. The GM method has also been extended to the SAR and SAREM models, so that now any combination of spatial lag and error, with individual effects of either random or fixed type, can be estimated through this numerically very efficient method (see Millo and Piras, 2012).

Example 10‐9 Spatial GM – `RiceFarms` data set

The function spgm (for “spatial panel by GM”) is the general wrapper for GM estimation in splm, and the counterpart to spml. The model is specified as either 'within' or 'random', as usual; analogously, a SAR term is added by setting lag to TRUE; differently from spml, whether to include a SEM term is a binary choice because only KKP‐type random effects are available: hence the spatial.error argument can only be TRUE or FALSE.

 semremod.gm <- spgm(riceprod0, Rice, listw = ricelw,
                    lag = FALSE, spatial.error = TRUE)
summary(semremod.gm)
Spatial panel fixed effects GM model


Call:
spgm(formula = riceprod0, data = Rice, listw = ricelw, lag = FALSE,
    spatial.error = TRUE)

Residuals:
     Min.   1st Qu.    Median   3rd Qu.      Max.
-0.841575 -0.163147  0.000527  0.167523  1.355049

Estimated spatial coefficient, variance components and theta:
          Estimate
rho         0.7807
sigma^2_v   0.0801

Coefficients:
              Estimate Std. Error t-value Pr(>|t|)
log(seed)       0.1346     0.0227    5.94  2.9e-09 ***
log(totlabor)   0.2508     0.0268    9.36  < 2e-16 ***
log(size)       0.5418     0.0274   19.77  < 2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comparing the GM results above with the previous ML example, one can see that there is no substantial difference between the estimated coefficients, despite the moderate size of the sample. Lastly, as observed, the GM method does not provide an estimate of dispersion for $\rho$; hence no significance testing is possible on this parameter.

10.3.4 Testing

10.3.4.1 LM Tests for Random Effects and Spatial Errors

Requiring only the estimation of the restricted specification, Lagrange multiplier (LM) tests in the tradition of Breusch and Pagan (1980) are particularly appealing in a spatial random effects setting because of the computational difficulties related to ML estimation of encompassing models.

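Baltagi et al. (2003b) derived joint, marginal and conditional tests for all combinations of random effects and spatial correlation. Starting from the random effects model with SEM errors (SEMRE), the error term can be written as:

$$u = (\iota_T \otimes I_n)\mu + \varepsilon, \qquad \varepsilon = \rho(I_T \otimes W)\varepsilon + \nu \tag{10.13}$$

and the (unscaled) variance covariance matrix of the errors as:

$$\Omega_u = \sigma^2_\mu(J_T \otimes I_n) + \sigma^2_\nu[I_T \otimes (B^\top B)^{-1}] \tag{10.14}$$

where $J_T = \iota_T\iota_T^\top$ and $B = I_n - \rho W$.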

The hypotheses under consideration are:

$H_0^a$: $\rho = \sigma^2_\mu = 0$, under the alternative that at least one component is not zero
$H_0^b$: $\sigma^2_\mu = 0$, assuming no spatial correlation ($\rho = 0$), under the one-sided alternative that the variance component is greater than zero
$H_0^c$: $\rho = 0$, assuming no random effects ($\sigma^2_\mu = 0$), under the two-sided alternative that the spatial autocorrelation coefficient is different from zero
$H_0^d$: $\rho = 0$, assuming the possible existence of random effects ($\sigma^2_\mu \geq 0$), under the two-sided alternative that the spatial autocorrelation coefficient is different from zero
$H_0^e$: $\sigma^2_\mu = 0$, assuming the possible existence of spatial autocorrelation ($\rho$ may or may not be zero) and the one-sided alternative that the variance component is greater than zero

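The joint LM test for the first hypothesis of no random effects and no spatial autocorrelation ($H_0^a$: $\rho = \sigma^2_\mu = 0$) is given by:

$$LM_J = \frac{nT}{2(T-1)}G^2 + \frac{n^2T}{b}H^2 \tag{10.15}$$

where $G = \tilde u^\top(J_T \otimes I_n)\tilde u / \tilde u^\top\tilde u - 1$, $H = \tilde u^\top[I_T \otimes (W + W^\top)/2]\tilde u / \tilde u^\top\tilde u$, $b = \mathrm{tr}[(W + W^\top)^2]/2 = \mathrm{tr}(W^2 + W^\top W)$ and $\tilde u$ denotes OLS residuals; $J_T = \iota_T\iota_T^\top$ is a $T \times T$ matrix of ones. The marginal LM test for random effects assuming no spatial correlation is given by:

$$LM_1 = \sqrt{\frac{nT}{2(T-1)}}\,G \tag{10.16}$$

An alternative standardized version with better finite sample properties can be obtained by centering and scaling the one-sided LM statistic:

$$SLM_1 = \frac{LM_1 - E(LM_1)}{\sqrt{Var(LM_1)}} \tag{10.17}$$

Analogously, the marginal LM test of no spatial autocorrelation assuming no random effects is given by:

$$LM_2 = \sqrt{\frac{n^2T}{b}}\,H \tag{10.18}$$

which also admits a standardized form with better properties:

$$SLM_2 = \frac{LM_2 - E(LM_2)}{\sqrt{Var(LM_2)}} \tag{10.19}$$

$SLM_1$ and $SLM_2$ are asymptotically distributed as standard normal as $n \to \infty$ for fixed $T$, under $H_0^b$ ($\sigma^2_\mu = 0$) and $H_0^c$ ($\rho = 0$) respectively. Based on $LM_1$ and $LM_2$, a one-sided joint test statistic for $H_0^a$ can be derived as:

$$LM_H = \frac{LM_1 + LM_2}{\sqrt{2}} \tag{10.20}$$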

which is asymptotically distributed as a standard normal. In practical applications can turn out negative, especially when the random effects variance is small, and the same applies to when the spatial autocorrelation coefficient is small. A test for the joint null hypothesis can therefore be based on the following decision rule:

Under the null the test statistic has a mixed ‐distribution given by:

(10.21)
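As a rough illustration of the marginal and joint statistics discussed above, the following base R sketch computes them from pooled OLS residuals on simulated data. It is not the `bsktest` implementation: the ring-shaped weights matrix, the simulated data, and the helper `pmix` are assumptions made for the example, and the observations are assumed to be stacked by period (all units for the first period, then the second, and so on). The formulas follow the standard statement of the Baltagi, Song and Koh statistics.

```r
## Sketch only: LM1, LM2, LM-J and LM-H computed from pooled OLS residuals.
## Assumption: observations are stacked by period (units 1..N for t = 1, then t = 2, ...).
set.seed(1)
N  <- 20   # spatial units
TT <- 5    # time periods

## Hypothetical row-standardized "ring" weights: two neighbours per unit
W <- matrix(0, N, N)
for (i in 1:N) {
  W[i, (i %% N) + 1]       <- 0.5   # next unit on the ring
  W[i, ((i - 2) %% N) + 1] <- 0.5   # previous unit on the ring
}

## Simulated data with individual effects but no spatial correlation
x  <- rnorm(N * TT)
mu <- rep(rnorm(N), times = TT)     # unit i keeps the same effect in every period
y  <- 1 + 0.5 * x + mu + rnorm(N * TT)
u  <- resid(lm(y ~ x))              # pooled OLS residuals

## Quadratic forms in the residuals
uu <- sum(u^2)
G  <- sum(u * (kronecker(matrix(1, TT, TT), diag(N)) %*% u)) / uu - 1
H  <- sum(u * (kronecker(diag(TT), (W + t(W)) / 2) %*% u)) / uu
b  <- sum(diag(W %*% W + crossprod(W)))   # tr(W^2 + W'W)

LM1 <- sqrt(N * TT / (2 * (TT - 1))) * G  # random effects, one-sided
LM2 <- sqrt(N^2 * TT / b) * H             # spatial error, one-sided
LMJ <- LM1^2 + LM2^2                      # two-sided joint statistic
LMH <- (LM1 + LM2) / sqrt(2)              # one-sided joint statistic

## Decision rule: only positive components contribute
chi2m <- max(LM1, 0)^2 + max(LM2, 0)^2

## Upper-tail probability under the mixture 1/4 chi2(0) + 1/2 chi2(1) + 1/4 chi2(2)
pmix <- function(q) {
  if (q <= 0) return(1)
  0.5 * pchisq(q, df = 1, lower.tail = FALSE) +
    0.25 * pchisq(q, df = 2, lower.tail = FALSE)
}

c(LM1 = LM1, LM2 = LM2, LMJ = LMJ, LMH = LMH, p.mixed = pmix(chi2m))
```

Because the simulated errors contain individual effects only, one would expect `LM1` to be large and `LM2` to be unremarkable; the mixed p-value reflects whichever components are positive.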

When using $LM_2$, one is assuming that random regional effects do not exist. However, especially when the random effect variance is actually large, this may lead to incorrect inference. For this reason Baltagi et al. (2003b) derived a conditional LM test for spatial autocorrelation allowing for the random effects variance to be non-zero. The expression for the test assumes the following form:

(10.22)

where . Also, , and .

Contrary to the previous tests, which use OLS residuals, here the residuals come from the ML estimation of a one-way error component model, which makes the implementation slightly more complicated. A one-sided test is simply obtained by taking the square root of (10.22). The resulting test statistic is asymptotically distributed as a standard normal. Similarly, when using $LM_1$, one is assuming no spatial error correlation. This assumption may lead to incorrect inference, particularly when $\rho$ is not close to zero. A conditional LM test allowing for spatial error correlation can be derived as:

(10.23)

where and . A one‐sided test can be defined by taking the square root of 10.23 based on ML residuals. The test statistic is again asymptotically normally distributed.

Example 10‐10 BSK tests – `RiceFarms` data set

In the RiceFarms case, it is easy to assume the presence of farm individual effects, perhaps representing parcel quality, farmer's ability, or other time-invariant idiosyncrasies. In the following we test for either random farm effects or spatial correlation in the remainder errors, drawing on the specification from the previous examples (i.e., controlling for village and time fixed effects).

The main function to perform the joint, marginal, and conditional tests for random effects and spatial error correlation is bsktest. It will take a pair of formula, data arguments, plus a listw object representing the spatial ordering and the test to be performed.

The joint test (test = 'LMH') is of little use, because it will reject in the presence of either effect, giving no further directions:

 bsktest(riceprod, data = Rice, listw = ricelw, test = "LMH")

Baltagi, Song and Koh LM-H one-sided joint test

data:  log(goutput) ~ log(seed) + log(totlabor) + log(size) + region +     time
LM-H = 310, p-value <2e-16
alternative hypothesis: Random Regional Effects and Spatial autocorrelation

More interestingly, the conditional test for random farm effects allowing for spatial error correlation (test = 'CLMmu') does in turn reject:

 bsktest(riceprod, data = Rice, listw = ricelw, test = "CLMmu")

Baltagi, Song and Koh LM*- mu conditional LM test
(assuming lambda may or may not be = 0)

data:  log(goutput) ~ log(seed) + log(totlabor) + log(size) + region +     time
LM*-mu = 11, p-value <2e-16
alternative hypothesis: Random regional effects

as does the conditional spatial test, allowing for random effects:

 bsktest(riceprod, data = Rice, listw = ricelw, test = "CLMlambda")

Baltagi, Song and Koh LM*-lambda conditional LM test
(assuming sigma^2_mu >= 0)

data:  log(goutput) ~ log(seed) + log(totlabor) + log(size) + region +     time
LM*-lambda = 21, p-value <2e-16
alternative hypothesis: Spatial autocorrelation

Since both conditional tests reject, a comprehensive specification including both random effects and spatial error correlation is appropriate.

10.3.4.2 Testing for Spatial Lag vs Error

If a researcher has a strong reason to expect a spatial data-generating process to be of the SAR (or, respectively, SEM) kind, then her only problem is to determine whether said spatial effect is present. Then she can either proceed general to specific, estimating the SAR (SEM) model and assessing the significance of the spatial coefficient, or specific to general, testing from the non-spatial model toward the spatial alternative. In a ML framework, the optimal LM tests for one effect assuming the other out are called marginal. They depend on the above hypothesis and will be inconsistent if it is violated; in case only the “other” effect is actually present, they will usually yield a type I error.

As outlined above, although empirical practice has mostly concentrated on either the SAR or the SEM model, estimation of SAREM models containing both a spatial lag and a spatial error is possible. Therefore, if the researcher does not have a strong prior in favor of either, an empirical strategy can be to start from the most general SAREM specification, together with the appropriate kind of individual heterogeneity, and let the data tell us which of the two spatial processes – if any and if not both – did actually generate the observed sample, by looking at the significance diagnostic for either spatial coefficient.

One drawback of this strategy is its computational demands and lesser stability than estimating the simpler models; another is that it does not allow the inclusion of a full set of spatially lagged regressors, a specification approach that has become increasingly popular in recent years.

Lagrange multiplier tests for SAR (SEM) can be either of the conditional type, allowing for the presence of SEM (SAR) tout court, or of the locally robust type, allowing for a limited deviation from zero of the SEM (SAR) coefficient. The former are optimal under the standard assumptions of the ML framework detailed above, and provided the general SAREM model holds; and they require residuals from the restricted SEM (SAR) model. The second kind have suboptimal statistical properties with respect to the optimal conditional tests, and under the above hypotheses on the data‐generating process, they are not guaranteed to hold if misspecification is “too far away,” i.e., if the SAR (SEM) coefficient is of sizable magnitude (and how far is far, i.e., whether 0.1 or 0.4 is tolerable, is an empirical question); moreover the currently available robust LM tests have been developed in a cross‐sectional framework and do not explicitly incorporate panel features. On the other hand, they are computationally simpler being based on the residuals of the non‐spatial model, and they allow including spatially lagged regressors; hence their remarkable success in applied practice.

Marginal vs Locally Robust LM Tests

The original LM tests for either spatial lag or error (Burridge, 1980; Anselin, 1988) were derived in a cross-sectional context, as tests for, respectively, $H_0: \lambda = 0$ vs $H_A: \lambda \neq 0$ assuming $\rho = 0$ (henceforth $LM_\lambda$); and $H_0: \rho = 0$ vs $H_A: \rho \neq 0$ assuming $\lambda = 0$ (henceforth $LM_\rho$); i.e., both can only be employed assuming that the “other” effect is not present. Otherwise, each test has power against the “wrong” alternative as well; therefore, these procedures are of limited value in the model selection process.

Based on the general local robustness framework of Bera and Yoon (1993), in a cross-sectional context, Anselin et al. (1996) derived robust LM statistics for $\lambda = 0$ allowing for $\rho \neq 0$ (henceforth $RLM_\lambda$) and, respectively, for $\rho = 0$ allowing for $\lambda \neq 0$ (henceforth $RLM_\rho$). These procedures have since been successfully employed in specification searches to discriminate between SAR and SEM models, as formalized in Florax et al. (2003).

Marginal Spatial LM Tests

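In the context of pooled cross sections, without allowing for any correlation feature across either time or cross section, any cross-sectional test can be straightforwardly applied to the pooled dataset. The LM tests of Anselin et al. (1996) ($LM_\lambda$ and $LM_\rho$) are simply rewritten for the pooled dataset, stacked by cross section, drawing on an enlarged version of the weights matrix obtained by replicating the cross-sectional $W$ over the main diagonal, so that $W_{NT} = I_T \otimes W$ (see Anselin et al., 2008). The pooled $LM_\lambda$ test becomes:

$$LM_\lambda = \frac{\left[e'(I_T \otimes W)\,y / \hat{\sigma}^2\right]^2}{J} \tag{10.24}$$

where $e$ are the OLS residuals and

$$J = \frac{1}{\hat{\sigma}^2}\left[\left((I_T \otimes W)X\hat{\beta}\right)'\left(I_{NT} - X(X'X)^{-1}X'\right)(I_T \otimes W)X\hat{\beta} + T\,T_W\,\hat{\sigma}^2\right]$$

and

$$T_W = \mathrm{tr}(WW + W'W)$$

(Elhorst, 2010, Formulae 11 to 13). In turn, the pooled $LM_\rho$ test is:

$$LM_\rho = \frac{\left[e'(I_T \otimes W)\,e / \hat{\sigma}^2\right]^2}{T\,T_W} \tag{10.25}$$

Locally Robust Spatial LM Tests

The robust LM tests of Anselin et al. (1996) can in turn be straightforwardly adapted to the (pooled) panel case, as per Elhorst (2014, Ch. 2.3):

$$RLM_\lambda = \frac{\left[e'(I_T \otimes W)\,y / \hat{\sigma}^2 - e'(I_T \otimes W)\,e / \hat{\sigma}^2\right]^2}{J - T\,T_W}$$

$$RLM_\rho = \frac{\left[e'(I_T \otimes W)\,e / \hat{\sigma}^2 - (T\,T_W / J)\; e'(I_T \otimes W)\,y / \hat{\sigma}^2\right]^2}{T\,T_W\left[1 - T\,T_W / J\right]}$$

using, again, the OLS residuals (Elhorst, 2010, Formulae 14-15).

Moreover, according to Bera et al. (2009), the LM test for the joint null hypothesis $H_0: \lambda = \rho = 0$ versus $\lambda \neq 0$ or $\rho \neq 0$ is equal to the sum of the marginal test for one effect and the locally robust test for the other:

$$LM_{\lambda\rho} = LM_\lambda + RLM_\rho = LM_\rho + RLM_\lambda$$

so that the RLM tests can also be obtained indirectly by subtracting the marginal test for the “other” effect from the joint test.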
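The pooled marginal and locally robust statistics can be sketched in base R as follows. This is an illustration on simulated data, not the `slmtest` code: the ring-shaped weights matrix, the sample sizes, and the SEM data-generating process with coefficient 0.6 are assumptions, and the formulas are the pooled versions of the cross-sectional statistics of Anselin et al. (1996) as commonly stated. The last lines check numerically that the two decompositions of the joint statistic coincide.

```r
## Sketch only: pooled LM and robust LM tests for spatial lag and spatial error.
## Assumption: data stacked by period, so the pooled weights are I_T (x) W.
set.seed(42)
N  <- 25   # spatial units
TT <- 4    # time periods

## Hypothetical row-standardized ring weights
W <- matrix(0, N, N)
for (i in 1:N) {
  W[i, (i %% N) + 1]       <- 0.5
  W[i, ((i - 2) %% N) + 1] <- 0.5
}
Wp <- kronecker(diag(TT), W)               # I_T (x) W

## Simulated pooled data with spatially autocorrelated errors (SEM, rho = 0.6)
X   <- cbind(1, rnorm(N * TT))
eps <- solve(diag(N * TT) - 0.6 * Wp, rnorm(N * TT))
y   <- drop(X %*% c(1, 0.5) + eps)

## Pooled OLS
fit  <- lm.fit(X, y)
e    <- fit$residuals
bhat <- fit$coefficients
s2   <- sum(e^2) / (N * TT)                # ML-type variance estimate

## Building blocks
TW  <- sum(diag(W %*% W + crossprod(W)))   # tr(WW + W'W)
WXb <- Wp %*% X %*% bhat
M   <- diag(N * TT) - X %*% solve(crossprod(X), t(X))
J   <- (drop(crossprod(WXb, M %*% WXb)) + TT * TW * s2) / s2
dl  <- sum(e * (Wp %*% y)) / s2            # score-type term for the lag
de  <- sum(e * (Wp %*% e)) / s2            # score-type term for the error

## Marginal and locally robust statistics
LMlag  <- dl^2 / J
LMerr  <- de^2 / (TT * TW)
RLMlag <- (dl - de)^2 / (J - TT * TW)
RLMerr <- (de - (TT * TW / J) * dl)^2 / (TT * TW * (1 - TT * TW / J))

stats <- c(lml = LMlag, lme = LMerr, rlml = RLMlag, rlme = RLMerr)
rbind(statistic = stats,
      p.value   = pchisq(stats, df = 1, lower.tail = FALSE))

## Joint statistic obtained in two equivalent ways
c(LMlag + RLMerr, LMerr + RLMlag)
```

Each statistic is referred to a $\chi^2$ distribution with one degree of freedom; the equality of the two sums holds algebraically for any dataset, which makes it a convenient check on an implementation.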

The slmtest function, specifying test = 'lml' ('lme') will perform either the marginal test for SAR (SEM) assuming no SEM (SAR) component in the data‐generating process or the locally robust version if specifying test = 'rlml' ('rlme').

Example 10‐11 Robust LM tests for SAR or SEM – `RiceFarms` data set

As we have seen, in the rice farms example there are reasons for assuming out a spatial lag model from the beginning. Nevertheless, it is sensible to check this assumption. The standard versions of the (pooled) LM test for SAR (SEM), as observed, are not robust to the presence of a SEM (SAR) term, i.e., of the “other” effect. Robust LM tests instead allow for “local” deviations from zero of the “other” parameter. Of course, the extent of the tolerated deviation is uncertain; still, from this example the difference between the false positive given by the non-robust SAR test (test = 'lml') and the locally robust counterpart (test = 'rlml') clearly stands out:

 local.rob.LM <- matrix(ncol = 4, nrow = 2)
tests <- c("lml", "lme", "rlml", "rlme")
dimnames(local.rob.LM) <- list(c("LM test", "p-value"),
                               tests)
for(i in tests) {
    ## run each test once, then store statistic and p-value
    mytest <- slmtest(riceprod, data = Rice,
                      listw = ricelw, test = i)
    local.rob.LM[1, i] <- mytest$statistic
    local.rob.LM[2, i] <- mytest$p.value
    }
round(local.rob.LM, 4)
          lml   lme   rlml  rlme
LM test 39.28 244.8 0.1654 205.7
p-value  0.00   0.0 0.6842   0.0

The robust test favors the SEM model over the SAR.

It should be kept in mind, moreover, that none of the above procedures allow for individual effects; one approximate solution is to demean the data. The Within function, which can be used directly in the model formula, subtracts from each observation the mean over time of the corresponding individual, thus eliminating any individual effect, of either random or fixed type:

 local.rob.LMw <- matrix(ncol = 4, nrow = 2)
## same specification, with the continuous variables demeaned by individual
wriceprod <- Within(log(goutput)) ~ Within(log(seed)) +
    Within(log(totlabor)) + Within(log(size)) +
    region + time
dimnames(local.rob.LMw) <- list(c("LM test", "p-value"),
                               c("lml", "lme", "rlml", "rlme"))
for(i in c("lml", "lme", "rlml", "rlme")) {
    ## run each test once, then store statistic and p-value
    mytest <- slmtest(wriceprod, data = Rice,
                      listw = ricelw, test = i)
    local.rob.LMw[1, i] <- mytest$statistic
    local.rob.LMw[2, i] <- mytest$p.value
    }
round(local.rob.LMw, 4)
          lml   lme  rlml  rlme
LM test 125.2 604.3 1.538 480.6
p-value   0.0   0.0 0.215   0.0

The result is unchanged, but now we are more confident in it because we have controlled, although in an ad hoc way, for individual effects.
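What the demeaning step does can be seen on a toy panel in base R. The sketch below is only an illustration of the within transformation: the data frame `d`, its column names, and the helper `within_tr` are hypothetical and are not part of plm, whose `Within` function performs the analogous operation on panel series.

```r
## Sketch only: the within (individual-demeaning) transformation in base R.
set.seed(7)

## Hypothetical long-format panel: 3 individuals observed over 4 periods
d <- data.frame(id = rep(1:3, each = 4), time = rep(1:4, times = 3))
d$alpha <- rep(c(-2, 0, 5), each = 4)          # time-invariant individual effects
d$x <- rnorm(12)
d$y <- d$alpha + 2 * d$x + rnorm(12, sd = 0.1)

## Subtract from each value the mean over time of the same individual
within_tr <- function(v, id) v - ave(v, id)

d$yw <- within_tr(d$y, d$id)
d$xw <- within_tr(d$x, d$id)

## The individual effects drop out: demeaned series average zero by individual
tapply(d$yw, d$id, mean)

## Slope estimated on the demeaned data (no intercept needed)
slope <- unname(coef(lm(yw ~ xw - 1, data = d)))
slope
```

Since `alpha` is constant within each individual, it is removed together with the individual means, whatever its values; this is why the transformation handles effects of either random or fixed type.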

Likelihood‐Based Tests

Given that estimation of the full SAREM model is possible (see the extensive discussion in Millo, 2014), one could directly employ the encompassing model as a specification device, relying on the Wald restriction tests from the general model as an alternative specification strategy instead of looking at the RLM tests. This strategy has the drawback of being computationally more intensive but also some important advantages: the Wald z-tests for significance of $\lambda$ and $\rho$ are optimal; there is no need for robustification, as the “other” spatial effect is explicitly accounted for in the model, as can be random individual effects; lastly, estimation of the encompassing model also provides the magnitudes of the spatial coefficients together with the significance level of the zero-restriction tests so that their substantial importance can be assessed.

As usual, two kinds of tests are possible from the estimated encompassing model: Wald‐type tests, requiring only an estimate of the latter, and likelihood ratio tests, requiring both the encompassing and the restricted.

Wald Tests

Wald-type tests are z-tests for significance of the relevant parameter in the encompassing model. Thus, from ML estimates of the general SAREM-RE model,

$$z_\lambda = \frac{\hat{\lambda}}{\widehat{\mathrm{se}}(\hat{\lambda})} \overset{a}{\sim} N(0, 1) \quad \text{under } H_0: \lambda = 0$$

and symmetrically for $\rho$. Importantly, the test can be made conditional to (i.e., valid in the presence of) individual random effects by including them in the specification. As observed, fixed individual effects can be eliminated through data transformation in two ways, both familiar from the spatial panel literature: either through time-demeaning (within transformation) (Elhorst, 2003) or by forward orthogonal deviations (Lee and Yu, 2010a). The former induces residual serial correlation, which can nevertheless be considered (i.e., estimated out) in the encompassing model; while the latter preserves the features of the original errors covariance matrix (Debarsy and Ertur, 2010, p. 7).

Example 10‐12 Wald tests for SEM vs SAR – `RiceFarms` data set

In the following we estimate the full model in order to test whether it is possible to simplify it, in a general-to-specific fashion. The spml function is the highest-level wrapper for spatial panel estimation by maximum likelihood, allowing for either fixed, random, or no effects (in the random or none cases, it calls spreml internally). Its syntax is mostly consistent with that of plm. We select model='random' and spatial.error='b' for “Baltagi,” which selects the SEMRE specification ('kkp' would estimate the SEM2RE).

 saremremod <- spml(riceprod, data = Rice, listw = ricelw, lag = TRUE,
                   model = "random", spatial.error = "b")
summary(saremremod)
ML panel with spatial lag, random effects, spatial error correlation

Call:
spreml(formula = formula, data = data, index = index, w = listw2mat(listw),
    w2 = listw2mat(listw2), lag = lag, errors = errors, cl = cl)

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -1.154  -0.321  -0.076  -0.090   0.149   1.351

Error variance parameters:
    Estimate Std. Error t-value Pr(>|t|)
phi   0.2967     0.0568    5.23  1.7e-07 ***
rho   0.6281     0.0790    7.96  1.8e-15 ***

Spatial autoregressive coefficient:
       Estimate Std. Error t-value Pr(>|t|)
lambda  -0.0134     0.1755   -0.08     0.94

Coefficients:
                  Estimate Std. Error t-value Pr(>|t|)
(Intercept)         5.9684     0.1957   30.50  < 2e-16 ***
log(seed)           0.1531     0.0235    6.51  7.6e-11 ***
log(totlabor)       0.2492     0.0271    9.19  < 2e-16 ***
log(size)           0.5784     0.0274   21.08  < 2e-16 ***
regionlangan       -0.0926     0.1051   -0.88    0.378
regiongunungwangi  -0.1567     0.0969   -1.62    0.106
regionmalausma     -0.1572     0.0995   -1.58    0.114
regionsukaambit    -0.0243     0.1078   -0.23    0.822
regionciwangi      -0.0267     0.0973   -0.27    0.784
time2              -0.0612     0.0813   -0.75    0.452
time3              -0.1911     0.0813   -2.35    0.019 *
time4              -0.3650     0.0813   -4.49  7.1e-06 ***
time5               0.1626     0.0813    2.00    0.045 *
time6               0.1325     0.0813    1.63    0.103
‐‐‐
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the estimation results, we gather that the SEMRE is the better specification: the estimated SAR term is not significant, while the SEM coefficient is of considerable magnitude and highly significant.
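The Wald z-tests behind this conclusion can be retraced by hand from the coefficient tables above. The short sketch below uses only the estimates and standard errors as printed (rounded to four digits), so the results agree with the reported t-values only up to rounding; the variable names are illustrative.

```r
## Sketch only: Wald z-tests recomputed from the printed (rounded) estimates.
lambda_hat <- -0.0134; se_lambda <- 0.1755   # spatial lag coefficient
rho_hat    <-  0.6281; se_rho    <- 0.0790   # spatial error coefficient

z_lambda <- lambda_hat / se_lambda
z_rho    <- rho_hat / se_rho

## Two-sided p-values from the standard normal distribution
p_lambda <- 2 * pnorm(-abs(z_lambda))
p_rho    <- 2 * pnorm(-abs(z_rho))

round(c(z_lambda = z_lambda, p_lambda = p_lambda, z_rho = z_rho), 2)
```

The ratio for the spatial lag is close to zero, while the one for the spatial error is close to eight, in line with the table.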

LR Tests

Likelihood ratio tests are based on the likelihoods from the general and the restricted model. The test statistic is a simple transform of the difference in likelihoods:

$$LR = 2\left[\ell(\hat{\theta}) - \ell(\tilde{\theta})\right] \overset{a}{\sim} \chi^2_m$$

where $\hat{\theta}$ is the full vector of ML parameter estimates from the unrestricted model and $\tilde{\theta}$ from the restricted one, and $m$ the number of restrictions. Thus,

$$LR_\lambda = 2\left[\ell(\hat{\theta}) - \ell(\tilde{\theta}_{\lambda = 0})\right] \overset{a}{\sim} \chi^2_1$$

and symmetrically for $\rho$. Again, including random effects in the estimated models makes the test conditional to these effects, while fixed effects can be transformed out as detailed in the previous paragraph but always keeping in mind the effects of the transformation on the error properties.

Example 10‐13 LR tests for SEM vs SAR – `RiceFarms` data set

The restriction test for the SAR term is performed as:

 ll1 <- saremremod$logLik
ll0 <- spml(riceprod, data = Rice, listw = ricelw, lag = FALSE,
                   model = "random", spatial.error = "b")$logLik
LR <- 2 * (ll1 - ll0)
pLR <- pchisq(LR, df = 1, lower.tail = FALSE)
pLR
[1] 0.9121

The p-value from the LR spatial lag test is very high, and not unlike the (asymptotically equivalent) result from the Wald restriction test in the previous example.

10.4 Serial and Spatial Correlation

It is possible to generalize the structure of the errors further by introducing serial correlation in the remainder of the error term, together with spatial correlation and random effects. Baltagi et al. (2007) do so in the context of the Anselin SEMRE model, specifying the model errors as the sum of an individual, time-invariant component and an idiosyncratic one that is spatially autocorrelated, as above, but also has serial correlation in the remainder:

(10.26)

where the innovation of the serially correlated remainder is i.i.d. The combination of this more general error structure, termed SEMSRRE because of the addition of Serially autoRegressive errors, with a spatially lagged dependent variable and the estimation of the most general model can still be dealt with in the general ML framework outlined above.

10.4.1 Maximum Likelihood Estimation

The model combining spatial and serial correlation with individual effects can be estimated by maximum likelihood, through an extension of the framework outlined in the previous sections of this chapter.

10.4.1.1 Serial and Spatial Correlation in the Random Effects Model

Generalizing the structure of the errors further by introducing serial correlation in the remainder of the error term, together with spatial correlation and random effects, Baltagi et al. (2007) derived a number of conditional and marginal LM tests for the different effects, possibly allowing for the presence of the other ones. Based on their work, Millo (2014) extended the model to include a SAR term. The errors of the SAREM model are specified as in the previous paragraph, so that the full model is:

To derive the likelihood, Baltagi et al. (2007) suggest a Prais‐Winsten transformation of the model with random effects and spatial autocorrelation. Following their simplifying notation, define: with:

(10.27)

then the expression for the scaled error covariance matrix can be written as

While in principle the inverse and determinant of the error covariance matrix can be calculated by brute force, in practice it is convenient, and often necessary, to rely on simplified analytical expressions to reduce the computational burden and extend the range of feasible sample sizes. Baltagi et al. (2007) derived expressions for the inverse and determinant of the error covariance matrix:

where and . They can be plugged in the general likelihood (10.9) to estimate the model.
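The role of the Prais-Winsten transformation in this derivation can be illustrated on the time dimension alone. The sketch below builds the covariance matrix of a stationary AR(1) process with unit innovation variance and checks that the Prais-Winsten matrix whitens it, which yields its inverse and determinant in closed form. The values of `psi` and `TT` are arbitrary, and the scaling is an assumption: the notation of Baltagi et al. (2007) may normalize the matrix differently.

```r
## Sketch only: AR(1) covariance matrix and the Prais-Winsten transformation.
psi <- 0.4   # hypothetical serial correlation coefficient
TT  <- 5     # number of periods

## Covariance of a stationary AR(1) process with unit innovation variance
V <- toeplitz(psi^(0:(TT - 1))) / (1 - psi^2)

## Prais-Winsten matrix: rescales the first observation,
## quasi-differences all the others
C <- diag(TT)
C[1, 1] <- sqrt(1 - psi^2)
C[cbind(2:TT, 1:(TT - 1))] <- -psi

## C whitens V, so inverse and determinant need no numerical inversion
whitened <- C %*% V %*% t(C)
all.equal(whitened, diag(TT))
all.equal(solve(V), crossprod(C))      # V^{-1} = C'C
all.equal(det(V), 1 / (1 - psi^2))     # det(V) = 1 / (1 - psi^2)
```

The same idea carries over to the full model, where the transformation is applied to the time dimension of the Kronecker-structured covariance.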

10.4.1.2 Serial and Spatial Correlation with KKP‐Type Effects

As an alternative to the SEMSRRE specification, Millo (2014) presents an extension of the SEM2RE errors a la Kapoor et al. (2007) to serial correlation in the remainder errors. As in the SEM2RE case, the random effects are spatially lagged together with the idiosyncratic ones, while the remainder errors in turn are serially correlated:

This alternative specification assumes that individual effects follow the same spatial diffusion process as the idiosyncratic errors do. By analogy, it is termed SEM2SRRE. Just as in the SEM2RE case, the error covariance is then again a Kronecker product of a time component and a spatial component (see Section 10.3.3.1), which simplifies computations considerably. In fact, the (scaled) error covariance for this model is:

and, by the properties of Kronecker products, its inverse is

so that there is no need for the numerically demanding and unstable inversion of .¹⁴
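The computational point can be checked numerically on a small case. In the sketch below the time factor `SigT`, the ring-shaped weights, and the parameter values `rho`, `psi` and `phi` are placeholders chosen for illustration; they are not meant to reproduce the exact expression of the model's covariance. The only property used is that the inverse of a Kronecker product is the Kronecker product of the inverses.

```r
## Sketch only: inverse of a Kronecker-structured error covariance.
N <- 6; TT <- 4
rho <- 0.5; psi <- 0.3; phi <- 0.8     # hypothetical parameter values

## Hypothetical row-standardized ring weights and spatial filter B = I - rho W
W <- matrix(0, N, N)
for (i in 1:N) {
  W[i, (i %% N) + 1]       <- 0.5
  W[i, ((i - 2) %% N) + 1] <- 0.5
}
B <- diag(N) - rho * W

## Placeholder T x T factor: individual effects plus AR(1) remainder
SigT <- phi * matrix(1, TT, TT) + toeplitz(psi^(0:(TT - 1))) / (1 - psi^2)

## Covariance with the same spatial structure (B'B)^{-1} for every component
Omega <- kronecker(SigT, solve(crossprod(B)))

## Brute force: invert the full NT x NT matrix
inv_direct <- solve(Omega)

## Kronecker shortcut: invert only the T x T factor; the spatial factor
## (B'B)^{-1} has inverse B'B, which needs no inversion at all
inv_kron <- kronecker(solve(SigT), crossprod(B))

all.equal(inv_direct, inv_kron)
```

With realistic sample sizes the difference is substantial: the shortcut inverts a $T \times T$ matrix instead of an $NT \times NT$ one.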

Example 10‐14 Serial and spatial correlation – `EvapoTransp` data set

Mountains are a crucial source of water for public, agricultural and hydropower use. Obojes et al. (2015) explore the effect of vegetation composition and structure on water balance on some high-elevation grasslands in the Alps, in order to infer the potential to influence the water balance of mountain areas through land management. In particular, they evaluate the consequences of the abandonment of mountain areas, which leads to the proliferation of tall grasses and dwarf shrubs and therefore affects the water balance. Of the different components of the water balance, evapotranspiration (ET) is the one most influenced by vegetation. They repeatedly measure the water balance of soil monoliths in deep seepage collectors in four experimental sites over three study areas, two in the French Alps, one in Switzerland, and one in Austria.

Persistence in both time and space is apparent in the data and attributed to small‐scale features of the particular observation context. In order to account for these, they estimate a panel model with both spatial and serial correlation, using a distance‐based, row‐standardized weights matrix.

We replicate the results from the Austrian site, based on the original data, available as EvapoTransp, and weights matrix etw. There are 5 repeated measurements over 86 observation units.

 data("EvapoTransp", package = "pder")
data("etw", package = "pder")
evapo <- et ~ prec + meansmd + potet + infil + biomass + plantcover +
  softforbs + tallgrass + diversity + matgram + dwarfshrubs + legumes
semsr.evapo <- spreml(evapo, data=EvapoTransp, w=etw,
                      lag=FALSE, errors="semsr")
summary(semsr.evapo)
ML panel with, AR(1) serial correlation, spatial error correlation

Call:
spreml(formula = evapo, data = EvapoTransp, w = etw, lag = FALSE,
    errors = "semsr")

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -2.260  -0.500   0.021  -0.047   0.420   2.373

Error variance parameters:
    Estimate Std. Error t-value Pr(>|t|)
psi   0.1665     0.0482    3.45  0.00056 ***
rho   0.8665     0.0246   35.29  < 2e-16 ***

Coefficients:
             Estimate Std. Error t-value Pr(>|t|)
(Intercept)  0.866041   0.562326    1.54   0.1235
prec        -0.129636   0.154338   -0.84   0.4009
meansmd      0.018968   0.004452    4.26  2.0e-05 ***
potet        0.551144   0.335828    1.64   0.1008
infil        0.023513   0.021876    1.07   0.2824
biomass      0.002335   0.000305    7.65  1.9e-14 ***
plantcover   0.019174   0.110332    0.17   0.8620
softforbs    0.132359   0.041463    3.19   0.0014 **
tallgrass    0.174540   0.054099    3.23   0.0013 **
diversity    0.040775   0.035790    1.14   0.2546
matgram     -0.029814   0.033040   -0.90   0.3669
dwarfshrubs  0.098405   0.054127    1.82   0.0691 .
legumes     -0.016304   0.005591   -2.92   0.0035 **
‐‐‐
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Although simple OLS would be consistent in this setting, the spatially and serially correlated ML model improves the precision of the estimates and leads to substantially different results. For example, in the spatial-serial error model, the coefficient on precipitation (prec) shrinks from about -0.21 to -0.13 and is not significant any more, with respect to what would result from OLS (reported below); analogously for potential evapotranspiration (potet).

 library("lmtest")
coeftest(plm(evapo, EvapoTransp, model="pooling"))

t test of coefficients:

             Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.076993   0.168586    6.39  4.5e-10 ***
prec        -0.207954   0.037563   -5.54  5.5e-08 ***
meansmd      0.022749   0.006489    3.51   0.0005 ***
potet        0.767941   0.088380    8.69  < 2e-16 ***
infil        0.055648   0.031030    1.79   0.0736 .
biomass      0.000104   0.000389    0.27   0.7900
plantcover   0.044657   0.157801    0.28   0.7773
softforbs    0.104305   0.058300    1.79   0.0743 .
tallgrass    0.173013   0.078668    2.20   0.0284 *
diversity    0.016214   0.051333    0.32   0.7523
matgram     -0.069537   0.049001   -1.42   0.1566
dwarfshrubs  0.071451   0.077135    0.93   0.3548
legumes     -0.019447   0.008115   -2.40   0.0170 *
‐‐‐
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The same happens for any specification omitting the spatial error term, like random effects or the serially correlated errors model, which can be obtained setting errors to 'sr' (output not reported).

Controlling for spatial error correlation seems the key feature here, witness the large $\rho$; nevertheless, despite the low magnitude of the serial correlation coefficient, omitting time persistence would still lead to substantially different results, namely to a false positive for the significance test on dwarfshrubs:

 coeftest(spreml(evapo, EvapoTransp, w=etw, errors="sem"))

z test of coefficients:

             Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.062540   0.566049    1.88  0.06050 .
prec        -0.155642   0.153350   -1.01  0.31013
meansmd      0.017929   0.003962    4.52  6.0e-06 ***
potet        0.532574   0.327493    1.63  0.10390
infil        0.022110   0.019396    1.14  0.25430
biomass      0.002312   0.000286    8.08  6.3e-16 ***
plantcover   0.016307   0.097288    0.17  0.86689
softforbs    0.131949   0.036543    3.61  0.00031 ***
tallgrass    0.176606   0.047692    3.70  0.00021 ***
diversity    0.038389   0.031549    1.22  0.22369
matgram     -0.031006   0.029147   -1.06  0.28743
dwarfshrubs  0.104405   0.047821    2.18  0.02902 *
legumes     -0.016654   0.004929   -3.38  0.00073 ***
‐‐‐
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

10.4.2 Testing

Testing for either effect in the context of the spatially and serially correlated model with individual heterogeneity is performed within the same maximum likelihood framework used for estimation.

10.4.2.1 Tests for Random Effects, Spatial, and Serial Error Correlation

Baltagi et al. (2007) derive the joint, marginal, and conditional LM tests for the SEMRE model with serial correlation.

They consider all possible combinations of joint, marginal and conditional tests:

the joint test for $\rho = \psi = \sigma^2_\mu = 0$ (J)
the marginal tests for $\rho$, $\psi$, and $\sigma^2_\mu$, assuming in turn that the other two are zero (M.1-3)
the joint tests for any combination of two of the parameters assuming the third one is zero (M.4-6)
the marginal tests for $\rho$, $\psi$, and $\sigma^2_\mu$, assuming in turn that the other two may or may not be zero (C.1-3)
the joint tests for any combination of two of the parameters assuming the third one may or may not be zero (C.4-6)

M.1-3 are well-established testing procedures in the literature (as observed in Baltagi et al., 2007). M.1 (test for $\rho = 0$) is the LM test for spatial error correlation derived by Anselin (1988) in the context of a pooled model with no serial correlation or individual effects. On the other hand, M.2 (test for $\psi = 0$) is analogous, for large $T$, to the well-known Breusch (1978), Godfrey (1978) serial correlation test. Finally, M.3 is simply the Breusch and Pagan (1980) random effects test.

Baltagi et al. (2007, Appendix A.3) show that the test statistic for the joint hypothesis M.4 ($\rho = \psi = 0$) assuming no random effects is simply the sum of the marginal tests M.1 ($\rho = 0$) and M.2 ($\psi = 0$). Additionally, M.5 is the Baltagi et al. joint test outlined in section 10.3.4.1; and M.6 is the joint test for random individual effects and serial correlation derived in Baltagi and Li (1995) (see section 4.3.2).

As a result of the previous discussion, we only consider the three‐way joint test J and the one‐way conditional tests C.1‐3.

The corresponding null hypotheses are:

$\rho = \psi = \sigma^2_\mu = 0$, under the alternative that at least one component is not zero (J)
$\rho = 0$, assuming $\psi \neq 0$, $\sigma^2_\mu > 0$: test for spatial correlation, allowing for serial correlation and random individual effects (C.1)
$\psi = 0$, assuming $\rho \neq 0$, $\sigma^2_\mu > 0$: test for serial correlation, allowing for spatial correlation and random individual effects (C.2)
$\sigma^2_\mu = 0$, assuming $\rho \neq 0$, $\psi \neq 0$: test for random individual effects, allowing for spatial and serial correlation (C.3)

The joint LM test for is given by:

(10.28)

where, , , , , is a matrix with bidiagonal elements equal to one and denotes OLS residuals. Under , is distributed as .

The conditional C.1 test for gives rise to the following statistic, asymptotically distributed as under :

(10.29)

where

is the score vector (evaluated at the null), a vector of ML residuals obtained from the estimation of the model with individual error components and serial correlation, , and has been defined above.

The conditional C.2 test for is based on the following statistic, asymptotically distributed as under the null:

(10.30)

where is the corresponding element of the information matrix,¹⁵

(10.31)

with the score evaluated at the null and the vector of ML residuals from the estimation of a panel model with individual error components and spatial error correlation, but no serial correlation. Both and assume the same expression as before, while as usual .

The conditional C.3 test for is based on the following statistic:

(10.32)

where

is the score evaluated at the null, is the corresponding element of the information matrix¹⁶ and is the vector of estimated residuals from the ML estimation of a panel model with spatially and serially correlated errors but no individual error components. The test statistic is asymptotically distributed as under :

Example 10‐15 Conditional BSJK tests – `RiceFarms` data set

We now address the issue of serial correlation in the remainder errors of the RiceFarms model. In other words, we check whether persistence characteristics in the output of an individual farm have effectively been accounted for by the individual effects, which in previous examples have proved significant, statistical evidence favoring the random hypothesis. On the spatial side, there has been ample evidence of spatial effects of the SEM type. For all this, on one hand a joint test is guaranteed to reject; on the other, tests for each single “effect” (spatial or serial correlation, or individual effects) will have to account for the possible presence of one or both of the others.

The tests are performed specifying a formula and a data.frame, the spatial weights listw to be employed and the test. Although we are here particularly interested in the 'C.2' test, for the sake of comparison we perform both the joint test 'J' and all three conditional tests. As observed, the joint test will use the OLS residuals, while the others use those from the appropriate restricted specification, e.g., the C.2 test uses those from a SEMRE model.

 bsjk.LM <- matrix(ncol = 4, nrow = 2)
tests <- c("J", paste("C", 1:3, sep = "."))
dimnames(bsjk.LM) <- list(c("LM test", "p-value"),
                               tests)
for(i in tests) {
    mytest <- bsjktest(riceprod, data = RiceFarms, index = "id",
                       listw = ricelw, test = i)
    bsjk.LM[1, i] <- mytest$statistic
    bsjk.LM[2, i] <- mytest$p.value
    }
round(bsjk.LM, 6)
            J   C.1       C.2  C.3
LM test 319.5 371.5 11.894431 75.8
p-value   0.0   0.0  0.000563  0.0

All tests reject the respective null hypotheses: the joint and the C.1 (spatial effects) most forcefully and then the C.3 (random effects). The C.2 test rejects less forcefully; still it provides evidence for some serial correlation in the remainder errors of the rice production equation after controlling for spatial and individual random effects.
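The strength of the rejection for the serial correlation test can be retraced from the reported statistic. The sketch below assumes that the C.2 statistic is referred to a $\chi^2$ distribution with one degree of freedom, an assumption that is consistent with the p-value printed in the table above.

```r
## Sketch only: p-value of the C.2 statistic, assuming a chi-square(1) reference.
C2_stat <- 11.894431
p_C2 <- pchisq(C2_stat, df = 1, lower.tail = FALSE)
round(p_C2, 6)

## The corresponding 5% critical value, for comparison with the statistic
qchisq(0.95, df = 1)
```

The statistic exceeds the 5% critical value by a wide margin, yet it is an order of magnitude smaller than the other three.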

Example 10‐16 Spatial and serial correlation – `RiceFarms` data set

The result from the C.2 test, although not very sharp, warrants an investigation into the serial correlation issue. To this end, we estimate the full SEMSRRE model, displaying only the significance table for the error components. The t‐statistics are expected to closely mimic the results of the asymptotically equivalent LM tests; still, there is more information to be extracted from the encompassing model, i.e., the magnitudes, and hence the substantive importance, of the estimated parameters.

semsrre.rice <- spreml(riceprod, data = Rice,
                       w = riceww, lag = FALSE, errors = "semsrre")
round(summary(semsrre.rice)$ErrCompTable, 6)
    Estimate Std. Error t-value Pr(>|t|)
phi   0.2500    0.05904   4.234 0.000023
psi   0.1250    0.04092   3.054 0.002259
rho   0.6136    0.04617  13.291 0.000000

Spatial error correlation is confirmed as the statistically strongest effect, with the now familiar large coefficient. Both individual effects and serial correlation play minor roles: the variance ratio of the random effects over the idiosyncratic errors is about one fourth; the estimated serial correlation coefficient is but 0.13.
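The claim that t‐statistics mimic the asymptotically equivalent LM tests can be illustrated for the serial correlation parameter by squaring its t‐ratio and setting it beside the C.2 statistic. This is a sketch in base R using only the numbers printed above; it assumes both statistics are compared with a chi-squared distribution with one degree of freedom, and the variable names are illustrative.

```r
# Values copied from the ErrCompTable and from the BSJK test table
psi_hat <- 0.1250      # estimated serial correlation coefficient
psi_se  <- 0.04092     # its standard error
lm_c2   <- 11.894431   # conditional LM statistic for serial correlation (C.2)

# Squared t-ratio: a Wald statistic for H0: psi = 0
wald_psi <- (psi_hat / psi_se)^2

# Assumption: both statistics are referred to a chi-squared(1) distribution
crit_1pct <- qchisq(0.99, df = 1)

c(Wald = wald_psi, LM = lm_c2, critical = crit_1pct)
```

The two statistics differ in this sample (roughly 9.3 against 11.9), but both exceed the 1% critical value, so they point to the same conclusion.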

10.4.2.2 Spatial Lag vs Error in the Serially Correlated Model

Testing for spatial lag vs spatial error in a model allowing for random effects and/or serially correlated errors can be done via the Wald approach, from the encompassing specification.¹⁷

Example 10‐17 Spatial and serial correlation – `EvapoTransp` data set

In the evapotranspiration example, spatial correlation in the errors comes as a consequence of the particular experimental environment: in other words, it is imposed over the alternative of spatial lag dependence as a theoretical a priori of the researchers. In fact, it seems difficult to come up with reasons why the outcome, actual evapotranspiration, at one site should influence that of neighboring sites, while it is quite natural to expect that measurement errors at nearby sites be correlated. As a statistical check, we estimate the encompassing specification with SAR, SEM, and serial error correlation, reporting only the relevant coefficient tables for the error variance parameters and the spatial lag coefficient:

saremsrre.evapo <- spreml(evapo, data = EvapoTransp,
                          w = etw, lag = TRUE, errors = "semsr")
summary(saremsrre.evapo)$ARCoefTable
       Estimate Std. Error t-value Pr(>|t|)
lambda   -0.322     0.2804  -1.148   0.2508
round(summary(saremsrre.evapo)$ErrCompTable, 6)
    Estimate Std. Error t-value Pr(>|t|)
psi   0.1679    0.04820   3.483 0.000496
rho   0.9000    0.02902  31.006 0.000000

The statistical evidence from estimation backs the a priori considerations: unlike the spatial error coefficient rho, the estimate of the spatial lag coefficient lambda is not significant.
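The Wald check on the spatial lag can be retraced by hand from the printed coefficient table. The sketch below uses base R and assumes a standard normal reference distribution for the t‐ratio (equivalently, chi-squared with one degree of freedom for its square); the variable names are illustrative.

```r
# Values copied from the ARCoefTable above
lambda_hat <- -0.322    # estimated spatial lag coefficient
lambda_se  <- 0.2804    # its standard error

# t-ratio and two-sided p-value, assuming asymptotic normality
z_lambda <- lambda_hat / lambda_se
p_lambda <- 2 * pnorm(-abs(z_lambda))

# Equivalent Wald form: squared ratio against chi-squared(1)
wald_lambda <- z_lambda^2
p_wald <- pchisq(wald_lambda, df = 1, lower.tail = FALSE)

round(c(z = z_lambda, p = p_lambda, Wald = wald_lambda, p_Wald = p_wald), 4)
```

Both forms give the same p-value, close to the 0.2508 reported in the table, so the null of no spatial lag is not rejected at conventional levels.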

Notes

3 A row‐standardized proximity matrix is one transformed so that all rows sum to 1. The very comprehensive package spdep for spatial dependence analysis (see Bivand, 2008) contains features for creating, lagging, and manipulating neighbor list objects of class nb, which can be readily converted to and from proximity matrices by means of the nb2mat function. Higher orders of the test can be obtained by lagging the corresponding nb objects through nblag.
