Chapter 2
The Error Component Model

The error component model is relevant when the slopes, i.e., the marginal effects of the covariates on the response, are the same for all the individuals, the intercepts being a priori different. Note that for some authors, the error component model is a synonym for the “random‐effects model” as opposed to the “fixed‐effects model.” In this chapter, these two estimators are analyzed as two different ways of treating the individual component of the error term within the same error component model (assuming, respectively, no correlation and correlation with the regressors).

This is the landmark model of panel data econometrics, and this chapter presents the main results about it.

2.1 Notations and Hypotheses

2.1.1 Notations

For the observation of individual $n$ at period $t$, we can write the model to be estimated, denoting by $y_{nt}$ the response, $x_{nt}$ the vector of $K$ covariates, $\epsilon_{nt}$ the error, $\alpha$ the intercept, and $\beta$ the vector of parameters associated to the covariates:

$$y_{nt} = \alpha + \beta^\top x_{nt} + \epsilon_{nt} \tag{2.1}$$

It is sometimes easier to store the intercept and the slopes in the same vector of coefficients. Denoting by $\gamma^\top = (\alpha, \beta^\top)$ this vector and $z_{nt}^\top = (1, x_{nt}^\top)$ the associated vector of covariates, the model can then be written:

$$y_{nt} = \gamma^\top z_{nt} + \epsilon_{nt} \tag{2.2}$$

For the error component model, the error is the sum of two effects:

the first, $\eta_n$, is the individual effect for individual $n$,
the second, $\nu_{nt}$, is the residual effect, also called the idiosyncratic effect.

$$\epsilon_{nt} = \eta_n + \nu_{nt} \tag{2.3}$$

For the whole sample, we denote by $y$ the vector containing the response and $X$ the matrix of covariates, storing the observations ordered by individual first and then by period. From now on we suppose that the panel is balanced, which means that we have the same number of observations ($T$) for all the individuals ($N$). In this case, $y$ is a vector of length $NT$ and $X$ a matrix of dimension $NT \times K$.

Denoting by $j$ a vector of ones of length $NT$, we get:

$$y = \alpha j + X\beta + \epsilon \tag{2.4}$$

When we want to use the extended vector of coefficients, we denote $Z = (j, X)$, and the model to be estimated is:

$$y = Z\gamma + \epsilon \tag{2.5}$$

2.1.2 Some Useful Transformations

Panel data econometricians usually break the total variation up into the sum of intra‐individual and inter‐individual variations. These two variations can easily be obtained by transforming the data using different transformation matrices, which can be written using Kronecker products.

The Kronecker product of two matrices, denoted $A \otimes C$, is the matrix obtained by multiplying each element of $A$ by $C$.

$I_l$ denotes the identity matrix of dimension $l$, $j_l$ is a vector of ones of length $l$ and $J_l = j_l j_l^\top$ is a matrix of ones of dimension $l \times l$.

The inter‐individual (or between) transformation is obtained by using a transformation matrix denoted by $B$, which is defined by:

$$B = I_N \otimes \frac{1}{T} J_T$$

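For example, we have, for $N = 2$ and $T = 2$:

$$B = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 \\ 0 & 0 & 1/2 & 1/2 \end{pmatrix}$$

We then have, denoting by $\bar{x}_{n\cdot}$ the mean of $x$ for individual $n$:

$$(Bx)^\top = (\bar{x}_{1\cdot}, \bar{x}_{1\cdot}, \bar{x}_{2\cdot}, \bar{x}_{2\cdot})$$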

To get the intra‐individual (or within) transformation, we use a transformation matrix $W$ defined as:

$$W = I_{NT} - B$$

These two matrices have very important properties:

they are symmetric, so we have $B^\top = B$ and $W^\top = W$,
they are idempotent, which means that $BB = B$ and $WW = W$. For example, for the between transformation, if we apply it twice to $x$, we obtain $BBx = B(Bx)$. One computes the individual means of a vector that already contains individual means; the vector is therefore unchanged, so that $B(Bx) = Bx$, and the same reasoning applies to $W$,
they perform a decomposition of a vector, which means that $Wx + Bx = x$, as $W = I - B$ and therefore $W + B = I$,
they are orthogonal: $W^\top B = 0$. Indeed, as the two matrices are symmetric and using the result that $BB = B$, we have: $W^\top B = WB = (I - B)B = B - B = 0$. $W(Bx)$ consists in taking the deviations from individual means of the individual means and is therefore equal to 0 irrespective of $x$.

$B$ and $W$ therefore perform an orthogonal decomposition of a vector $x$; this means that pre‐multiplying $x$ by each of the two matrices, we obtain two vectors that sum to $x$ and for which the inner product is 0.
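The properties of the two transformation matrices can be checked numerically. The following is a minimal base R sketch, assuming the between matrix is built as the Kronecker product of an identity matrix with an averaging block and the within matrix as its complement; the panel dimensions and the vector `x` are illustrative values, not taken from the text.

```r
# Between (B) and within (W) transformation matrices for a small balanced panel.
# N_ind and T_per are illustrative assumptions.
N_ind <- 3; T_per <- 4
J_T <- matrix(1, T_per, T_per)              # T x T matrix of ones
B <- kronecker(diag(N_ind), J_T / T_per)    # computes individual means
W <- diag(N_ind * T_per) - B                # deviations from individual means

set.seed(1)
x <- rnorm(N_ind * T_per)                   # ordered by individual, then period
id <- rep(seq_len(N_ind), each = T_per)

# B x reproduces the individual means, W x the deviations from them
all.equal(as.vector(B %*% x), ave(x, id))
all.equal(as.vector(W %*% x), x - ave(x, id))

# Symmetry, idempotency, decomposition and orthogonality
all.equal(B, t(B)); all.equal(W, t(W))
all.equal(B %*% B, B); all.equal(W %*% W, W)
all.equal(as.vector(B %*% x + W %*% x), x)
max(abs(W %*% B))                           # numerically zero
```

Building the full matrices is only practical for small panels; on real data the same transformations are usually computed group-wise, as `ave()` does here.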

2.1.3 Hypotheses Concerning the Errors

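$\epsilon$ is the sum of a vector $\nu$ of length $NT$ containing the idiosyncratic part of the error and of the individual effect $\eta$, which is a vector of length $N$ for which each element is repeated $T$ times. This can be written in matrix form:

$$\epsilon = (I_N \otimes j_T)\eta + \nu \tag{2.6}$$

The estimated model is defined by estimated parameters $\hat{\gamma}$ and by a vector of residuals $\hat{\epsilon}$.

$$y = \hat{\alpha} j + X\hat{\beta} + \hat{\epsilon} \tag{2.7}$$

$$y = Z\hat{\gamma} + \hat{\epsilon} \tag{2.8}$$

Subtracting 2.5 from 2.8 allows us to write the residuals as a function of the errors:

$$\hat{\epsilon} = \epsilon - Z(\hat{\gamma} - \gamma) \tag{2.9}$$

To get a similar expression in terms of $\hat{\beta}$ and $X$, we use 2.4 and 2.7:

$$\hat{\epsilon} = \epsilon - (\hat{\alpha} - \alpha) j - X(\hat{\beta} - \beta)$$

The mean of this expression is, denoting $\bar{J} = jj^\top / (NT)$:

$$\bar{J}\hat{\epsilon} = \bar{J}\epsilon - (\hat{\alpha} - \alpha) j - \bar{J}X(\hat{\beta} - \beta)$$

In a linear model with an intercept, $\bar{J}\hat{\epsilon}$, which is the average of the residuals, is 0. Subtracting the second of the two previous equations from the first, we get:

$$\hat{\epsilon} = (I - \bar{J})\epsilon - (I - \bar{J})X(\hat{\beta} - \beta) \tag{2.10}$$

with $\bar{J}$ a matrix that, post‐multiplied by a vector, returns a vector of the same length containing the overall mean. $I - \bar{J}$, post‐multiplied by a vector, returns the vector in deviations from the overall mean.

Expressions 2.9 and 2.10 will be used throughout this chapter to analyze the properties of the estimators.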

The following hypotheses are made concerning the errors:

the expected values of the two components of the error are supposed to be 0; anyway, their means can't be identified if there is an intercept in the model,
the individual effects are homoscedastic and mutually uncorrelated,
the idiosyncratic part of the error is also homoscedastic and uncorrelated,
the two components of the errors are uncorrelated.

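In this case, the covariance matrix of the errors depends only on the variance of the two components of the errors, i.e., the two parameters $\sigma_\eta^2$ and $\sigma_\nu^2$. Concerning the variance and covariances of the errors, we then have:

for the variance of one error: $E(\epsilon_{nt}^2) = \sigma_\eta^2 + \sigma_\nu^2$,
for the covariance of two errors of the same individual for two different periods: $E(\epsilon_{nt}\epsilon_{ns}) = \sigma_\eta^2$ for $t \neq s$,
for the covariance of two errors of two different individuals (belonging to the same period or not): $E(\epsilon_{nt}\epsilon_{ms}) = 0$ for $n \neq m$.

For a given individual $n$, the covariance matrix of the vector of errors $\epsilon_n$ for this individual is:

$$\Omega_{nn} = E(\epsilon_n \epsilon_n^\top) = \sigma_\nu^2 I_T + \sigma_\eta^2 J_T \tag{2.11}$$

For the whole sample, we have $\epsilon^\top = (\epsilon_1^\top, \ldots, \epsilon_N^\top)$, and the covariance matrix is a square matrix of dimension $NT$ that contains submatrices $E(\epsilon_n \epsilon_m^\top)$. For $n = m$, this submatrix is given by 2.11; for $n \neq m$, this is a 0 matrix given the hypothesis of no correlation between the errors of two different individuals. The covariance matrix of the errors is then a block‐diagonal matrix, the blocks being the matrix given by equation 2.11. This matrix can then be expressed as a Kronecker product:

$$\Omega = I_N \otimes \left(\sigma_\nu^2 I_T + \sigma_\eta^2 J_T\right) = \sigma_\nu^2 I_{NT} + \sigma_\eta^2 (I_N \otimes J_T)$$

This matrix can also be usefully expressed in terms of the within and between transformation matrices described in subsection 2.1.2. In fact, $I_N \otimes J_T = TB$ and $I_{NT} = W + B$. Introducing these two matrices in the expression of $\Omega$, we get:

$$\Omega = \sigma_\nu^2 (W + B) + T\sigma_\eta^2 B$$

which finally implies, denoting $\sigma_\iota^2 = \sigma_\nu^2 + T\sigma_\eta^2$:

$$\Omega = \sigma_\nu^2 W + \sigma_\iota^2 B \tag{2.12}$$

Finally, throughout this chapter, we suppose that both components of the errors are uncorrelated with the covariates: $E(\eta \mid x) = E(\nu \mid x) = 0$.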
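The two expressions of the covariance matrix of the errors (block‐diagonal, and as a combination of the within and between matrices) can be compared numerically. A minimal base R sketch follows; the variances and panel dimensions are arbitrary illustrative values, and `s2_iota` is a hypothetical name for the variance of the individual component plus $T$ times the individual variance.

```r
# Covariance matrix of the errors of the error component model.
# All numeric values are illustrative assumptions.
N_ind <- 3; T_per <- 4
s2_nu <- 0.5; s2_eta <- 2
J_T <- matrix(1, T_per, T_per)
B <- kronecker(diag(N_ind), J_T / T_per)
W <- diag(N_ind * T_per) - B

# Block-diagonal form: one (s2_nu * I_T + s2_eta * J_T) block per individual
Omega_block <- kronecker(diag(N_ind), s2_nu * diag(T_per) + s2_eta * J_T)

# Equivalent within/between form, with s2_iota = s2_nu + T * s2_eta
s2_iota <- s2_nu + T_per * s2_eta
Omega_wb <- s2_nu * W + s2_iota * B

all.equal(Omega_block, Omega_wb)   # the two expressions coincide
Omega_block[1:4, 1:4]              # block of the first individual
```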

2.2 Ordinary Least Squares Estimators

The variability in a panel has two components:

the between or inter‐individual variability, which is the variability of the panel's variables measured in individual means, i.e., $\bar{x}_{n\cdot}$ or, in matrix form, $Bx$,
the within or intra‐individual variability, which is the variability of the panel's variables measured in deviation from individual means, i.e., $x_{nt} - \bar{x}_{n\cdot}$ or, in matrix form, $Wx$.

Three estimations by ordinary least squares can then be performed: the first one on raw data, the second one on the individual means of the data (between model), and the last one on the deviations from individual means (within model).

2.2.1 Ordinary Least Squares on the Raw Data: The Pooling Model

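The model to be estimated is $y = \alpha j + X\beta + \epsilon = Z\gamma + \epsilon$. Using the second formulation, the sum of squared residuals can be written:

$$(y - Z\gamma)^\top (y - Z\gamma)$$

and the first‐order conditions for a minimum are (up to the $-2$ multiplicative factor):

$$Z^\top (y - Z\hat{\gamma}) = Z^\top \hat{\epsilon} = 0 \tag{2.13}$$

The first column of $Z$ is a vector of ones associated to $\alpha$, which is the first element of $\gamma$. Therefore, dividing the first element of this vector by the number of observations leads to:

$$\bar{y} - \hat{\alpha} - \hat{\beta}^\top \bar{x} = 0 \tag{2.14}$$

This is the well‐known result that the mean of the sample, i.e., $(\bar{x}, \bar{y})$, is on the regression line of the ordinary least squares estimator. The other first‐order conditions imply that $X^\top \hat{\epsilon} = 0$, which can be rewritten, the average residual being equal to 0:

$$\frac{1}{NT}\sum_{n=1}^{N}\sum_{t=1}^{T} (x_{nt} - \bar{x})\hat{\epsilon}_{nt} = 0 \tag{2.15}$$

which means that the sample covariances between the residuals and the covariates are 0. Solving 2.13, we get the ordinary least squares estimator for the whole vector of coefficients:

$$\hat{\gamma}_{OLS} = (Z^\top Z)^{-1} Z^\top y \tag{2.16}$$

Substituting $Z\gamma + \epsilon$ for $y$ in 2.16,

$$\hat{\gamma}_{OLS} = \gamma + (Z^\top Z)^{-1} Z^\top \epsilon \tag{2.17}$$

To get the estimator of the slopes, one splits $Z$ in $(j, X)$ and $\gamma^\top$ in $(\alpha, \beta^\top)$:

$$Z^\top Z = \begin{pmatrix} j^\top j & j^\top X \\ X^\top j & X^\top X \end{pmatrix}, \qquad Z^\top y = \begin{pmatrix} j^\top y \\ X^\top y \end{pmatrix}$$

The formula for the inverse of a partitioned matrix is given by:

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}^{-1} = \begin{pmatrix} A_{11}^{-1}\left(I + A_{12}F_2A_{21}A_{11}^{-1}\right) & -A_{11}^{-1}A_{12}F_2 \\ -F_2A_{21}A_{11}^{-1} & F_2 \end{pmatrix} \tag{2.18}$$

with $F_2 = \left(A_{22} - A_{21}A_{11}^{-1}A_{12}\right)^{-1}$. The upper left block may also be written:

$$F_1 = \left(A_{11} - A_{12}A_{22}^{-1}A_{21}\right)^{-1}$$

We have here:

$$F_2 = \left(X^\top X - X^\top j (j^\top j)^{-1} j^\top X\right)^{-1} = \left(X^\top (I - \bar{J}) X\right)^{-1}$$

with $\bar{J} = j(j^\top j)^{-1}j^\top = jj^\top/(NT)$. $\bar{J}x$ returns a vector of length $NT$ for which all the elements are the vector mean $\bar{x}$. One can easily check that this matrix is idempotent. We then have:

$$\hat{\beta}_{OLS} = \left(X^\top (I - \bar{J}) X\right)^{-1} X^\top (I - \bar{J}) y \tag{2.19}$$

which is a formula similar to 2.16, but with variables pre‐multiplied by $I - \bar{J}$, this transformation removing the overall mean of every variable. For the intercept $\hat{\alpha}$, we find the same expression as 2.14. In order to analyze the characteristics of the OLS estimator, we substitute $\alpha j + X\beta + \epsilon$ for $y$ in 2.19; since $(I - \bar{J})j = 0$:

$$\hat{\beta}_{OLS} = \beta + \left(X^\top (I - \bar{J}) X\right)^{-1} X^\top (I - \bar{J}) \epsilon$$

The estimator is then unbiased ($E(\hat{\beta}_{OLS}) = \beta$) if $E\left(X^\top (I - \bar{J})\epsilon\right) = 0$, i.e., if the theoretical covariances between the covariates and the errors are all 0. This result is directly linked with expression 2.13, which indicates that the OLS estimator is computed so that empirical covariances between the residuals and the covariates are all 0. The estimator is consistent if $\text{plim}\, \hat{\beta}_{OLS} = \beta$. This expression is:

$$\text{plim}\, \hat{\beta}_{OLS} = \beta + \text{plim}\left(\frac{1}{NT} X^\top (I - \bar{J}) X\right)^{-1} \text{plim}\, \frac{1}{NT} X^\top (I - \bar{J}) \epsilon$$

The first term is the population covariance matrix of the covariates and the second one the population covariance vector of the covariates and the errors. The estimator is therefore consistent if the covariance matrix of the covariates exists, is not 0, and if the covariances between the covariates and the errors are all 0. The variance of the OLS estimator is given by:

$$V(\hat{\gamma}_{OLS}) = E\left[(\hat{\gamma}_{OLS} - \gamma)(\hat{\gamma}_{OLS} - \gamma)^\top\right] = (Z^\top Z)^{-1} Z^\top \Omega Z (Z^\top Z)^{-1} \tag{2.20}$$

Note that for the error component model, the covariance matrix of the errors doesn't reduce to a scalar times the identity matrix because of the correlation induced by the individual effects. Therefore, the variance of the OLS estimator doesn't reduce to $\sigma^2 (Z^\top Z)^{-1}$, and using this expression in tests will lead to biased inference.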

In conclusion, the OLS estimator, even if it is unbiased and consistent, has two limitations:

the first one is that the usual estimator of the variance is not correct and should be replaced by a more complex expression,
the second is that, in this context, OLS is not the best linear unbiased estimator, which means that there exist other linear unbiased estimators that are more efficient.
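The first limitation can be made concrete by comparing the usual variance formula with the sandwich form that accounts for the individual effects. This is a minimal base R sketch on simulated covariates, assuming the two variance components are known; all numeric values and object names are illustrative.

```r
# Naive versus correct variance of pooled OLS under error-component errors.
set.seed(42)
N_ind <- 50; T_per <- 5
s2_nu <- 1; s2_eta <- 2                       # illustrative, treated as known
id <- rep(seq_len(N_ind), each = T_per)
x <- rnorm(N_ind)[id] + rnorm(N_ind * T_per)  # between and within variation
Z <- cbind(1, x)

# Block-diagonal covariance matrix of the errors
Omega <- kronecker(diag(N_ind),
                   s2_nu * diag(T_per) + s2_eta * matrix(1, T_per, T_per))
ZtZ_inv <- solve(crossprod(Z))

V_naive <- (s2_nu + s2_eta) * ZtZ_inv                        # scalar-variance formula
V_sandwich <- ZtZ_inv %*% t(Z) %*% Omega %*% Z %*% ZtZ_inv   # accounts for Omega

sqrt(diag(V_naive))      # standard errors a naive OLS routine would use
sqrt(diag(V_sandwich))   # standard errors implied by the error component structure
```

In practice the variance components are unknown, so the sandwich form has to be evaluated with estimates; the sketch only shows that the two formulas disagree.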

2.2.2 The between Estimator

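The between estimator is the OLS estimator applied to the model pre‐multiplied by $B$, i.e., the model in individual means:

$$By = BZ\gamma + B\epsilon = \alpha j + BX\beta + (I_N \otimes j_T)\eta + B\nu$$

Note that the items of the model that don't exhibit intra‐individual variations are unaffected by this transformation. This is the case of the column of ones associated to the intercept, of the matrix $I_N \otimes j_T$ associated to the individual effects and also of some covariates with no intra‐individual variation such as, for example, gender in a sample of individuals. Note also that the $NT$ observations of this model are in fact $N$ distinct observations of individual means repeated $T$ times. Using, as in the case of the OLS estimator, the formula of the inverse of a partitioned matrix, the between estimator is:

$$\hat{\beta}_B = \left(X^\top (B - \bar{J}) X\right)^{-1} X^\top (B - \bar{J}) y \tag{2.21}$$

$B - \bar{J}$ is a matrix that transforms a variable in its individual means in deviation from the overall mean. The variance of $\hat{\beta}_B$ is obtained by replacing $y$ by $\alpha j + X\beta + \epsilon$:

$$V(\hat{\beta}_B) = \left(X^\top (B - \bar{J}) X\right)^{-1} X^\top (B - \bar{J})\, \Omega\, (B - \bar{J}) X \left(X^\top (B - \bar{J}) X\right)^{-1}$$

The expression of $\Omega$ given by 2.12 implies that $(B - \bar{J})\Omega(B - \bar{J}) = \sigma_\iota^2 (B - \bar{J})$. Consequently, the expression of the variance of the between estimator is simply:

$$V(\hat{\beta}_B) = \sigma_\iota^2 \left(X^\top (B - \bar{J}) X\right)^{-1} \tag{2.22}$$

For the full vector of the coefficients (including the intercept $\alpha$), the between estimator and its variance are:

$$\hat{\gamma}_B = (Z^\top B Z)^{-1} Z^\top B y \tag{2.23}$$

$$V(\hat{\gamma}_B) = \sigma_\iota^2 (Z^\top B Z)^{-1} \tag{2.24}$$

To estimate $\sigma_\iota^2$, we use the deviance of the between model: $q_B = \hat{\epsilon}^\top B \hat{\epsilon}$. Using 2.23 and 2.9:

$$q_B = \epsilon^\top M_B\, \epsilon, \qquad M_B = B - BZ(Z^\top B Z)^{-1} Z^\top B$$

The matrix $M_B$ is idempotent, and its trace is, using the property that the trace is invariant under cyclical permutations: $\text{tr}(M_B) = \text{tr}(B) - \text{tr}\left((Z^\top B Z)^{-1} Z^\top B Z\right) = N - K - 1$. As $M_B W = 0$ and $M_B B = M_B$, we then have $E(q_B) = \text{tr}(M_B \Omega) = \sigma_\iota^2\, \text{tr}(M_B)$ and therefore $E(q_B) = (N - K - 1)\sigma_\iota^2$. The unbiased estimator of $\sigma_\iota^2$ is then $q_B / (N - K - 1)$. The one returned by an OLS program run on the $NT$ repeated observations is $q_B / (NT - K - 1)$, and the covariance matrix of the coefficients should then be multiplied by $(NT - K - 1)/(N - K - 1)$.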
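The degrees‐of‐freedom issue of the between model can be illustrated on simulated data. The sketch below, in base R, assumes one covariate (`K <- 1`) and compares the regression on repeated individual means with the regression on the distinct individual means; all names and values are illustrative.

```r
# Between estimator: repeated individual means versus distinct individual means.
set.seed(7)
N_ind <- 40; T_per <- 5; K <- 1
id <- rep(seq_len(N_ind), each = T_per)
x <- rnorm(N_ind)[id] + rnorm(N_ind * T_per)
y <- 1 + 2 * x + rnorm(N_ind)[id] + rnorm(N_ind * T_per)

# Between model on the NT rows of repeated individual means
fit_rep <- lm(ave(y, id) ~ ave(x, id))
q_B <- sum(resid(fit_rep)^2)                    # deviance of the between model
s2_iota_ols <- q_B / (N_ind * T_per - K - 1)    # what an OLS routine reports
s2_iota <- q_B / (N_ind - K - 1)                # unbiased estimator

# Same regression on the N distinct individual means
y_bar <- as.vector(tapply(y, id, mean))
x_bar <- as.vector(tapply(x, id, mean))
fit_N <- lm(y_bar ~ x_bar)

all.equal(unname(coef(fit_rep)), unname(coef(fit_N)))   # same coefficients
all.equal(s2_iota, T_per * summary(fit_N)$sigma^2)      # same variance estimate
# Correction factor for the covariance matrix reported on repeated means
(N_ind * T_per - K - 1) / (N_ind - K - 1)
```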

2.2.3 The within Estimator

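The within estimator is obtained by applying the OLS estimator to the model pre‐multiplied by the $W$ matrix:

$$Wy = WX\beta + W\nu$$

The within transformation removes the vector of ones associated to the intercept and the matrix $I_N \otimes j_T$ associated to the vector of individual effects. It also removes covariates that don't exhibit intra‐individual variation. Applying OLS to the transformed model leads to the within estimator:

$$\hat{\beta}_W = (X^\top W X)^{-1} X^\top W y \tag{2.25}$$

The variance of $\hat{\beta}_W$ is:

$$V(\hat{\beta}_W) = (X^\top W X)^{-1} X^\top W \Omega W X (X^\top W X)^{-1}$$

Using 2.12, the covariance matrix of the transformed errors is $V(W\epsilon) = W\Omega W = \sigma_\nu^2 W$. The within transformation therefore induces a correlation among the errors of the model. The variance of the within estimator reduces to:

$$V(\hat{\beta}_W) = \sigma_\nu^2 (X^\top W X)^{-1} \tag{2.26}$$

We then have, in spite of this correlation, the standard expression of the variance. In order to estimate $\sigma_\nu^2$, one uses the deviance of the within estimator: $q_W = \hat{\epsilon}^\top W \hat{\epsilon}$. Using 2.25 and 2.10:

$$q_W = \epsilon^\top M_W\, \epsilon, \qquad M_W = W - WX(X^\top W X)^{-1} X^\top W$$

The matrix $M_W$ is idempotent and its trace is $\text{tr}(M_W) = \text{tr}(W) - K = N(T - 1) - K$. We then have $E(q_W) = \left(N(T - 1) - K\right)\sigma_\nu^2$. The unbiased estimator of $\sigma_\nu^2$ is then $q_W / \left(N(T - 1) - K\right)$, and the one returned by an OLS program run on the transformed data is $q_W / (NT - K)$. The covariance matrix of the coefficients should then be multiplied by $(NT - K) / \left(N(T - 1) - K\right)$.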

The within model is also called the “fixed‐effects model” or the least‐squares dummy variable model, because it can be obtained as a linear model in which the individual effects are estimated and then taken as fixed parameters. This model can be written:

$$y = X\beta + (I_N \otimes j_T)\eta + \nu$$

where $\eta$ is now a vector of parameters to be estimated. There are therefore $N + K$ parameters to estimate in this model.¹ The estimation of this model is computationally feasible if $N$ is not too large. In a micro panel of large size, the estimation becomes problematic.

The equivalence between both models may be established using the Frisch‐Waugh theorem or using the formula of the inverse of a partitioned matrix. The Frisch‐Waugh theorem states that it is equivalent to regress $y$ on a set of covariates $(X_1, X_2)$ or to regress the residuals of a regression of $y$ on $X_2$ on the residuals of a regression of $X_1$ on $X_2$. The application of the Frisch‐Waugh theorem in this context consists in regressing each variable on $D = I_N \otimes j_T$ and getting the residuals. Here, for each variable $z$, the residual is $z - D\hat{\delta}$. The first‐order condition of the sum of squared residuals minimization is $D^\top (z - D\hat{\delta}) = 0$. $D$ being a matrix that selects the individuals, we finally get for every individual, denoting by $\hat{\delta}_n$ the $n$‐th element of $\hat{\delta}$:

$$\sum_{t=1}^{T} (z_{nt} - \hat{\delta}_n) = 0$$

Consequently, we have $\hat{\delta}_n = \bar{z}_{n\cdot}$ and the residuals are the deviations of the variable from its individual means. Therefore, the Frisch‐Waugh theorem implies that the fixed effects model can be estimated by applying the OLS estimator to the model transformed in deviations from the individual means, i.e., by regressing $Wy$ on $WX$.

With the within coefficients in hand, specific intercepts for every individual in the sample can then be computed:

$$\hat{\alpha} + \hat{\eta}_n = \bar{y}_{n\cdot} - \hat{\beta}_W^\top \bar{x}_{n\cdot}$$

where $\bar{x}_{n\cdot}$ is the vector of individual means of the covariates for individual $n$.

If one wants to define individual effects with 0 mean in the sample, a general intercept can be computed: $\hat{\alpha} = \bar{y} - \hat{\beta}_W^\top \bar{x}$, $\bar{x}$ being the overall mean of the covariates. We then have for every individual in the sample:

$$\hat{\eta}_n = (\bar{y}_{n\cdot} - \bar{y}) - \hat{\beta}_W^\top (\bar{x}_{n\cdot} - \bar{x})$$
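The equivalence between the within estimator and the least‐squares dummy variable estimator, and the recovery of the individual intercepts, can be checked on simulated data. A minimal base R sketch follows; the data-generating values and all object names are illustrative assumptions.

```r
# Within estimator versus least-squares dummy variables on simulated data.
set.seed(123)
N_ind <- 30; T_per <- 6
id <- rep(seq_len(N_ind), each = T_per)
eta <- rnorm(N_ind)                        # individual effects
x <- 0.5 * eta[id] + rnorm(N_ind * T_per)  # covariate correlated with the effects
y <- 1 + 2 * x + eta[id] + rnorm(N_ind * T_per)

# Within: OLS on deviations from individual means (no intercept)
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
beta_within <- unname(coef(lm(y_w ~ x_w - 1)))

# LSDV: one dummy per individual, no overall intercept
lsdv <- lm(y ~ x + factor(id) - 1)
beta_lsdv <- unname(coef(lsdv)["x"])

# Individual intercepts recovered from the within slope
intercepts <- tapply(y, id, mean) - beta_within * tapply(x, id, mean)

all.equal(beta_within, beta_lsdv)
all.equal(unname(as.vector(intercepts)), unname(coef(lsdv)[-1]))
```

The dummy variable regression estimates one extra parameter per individual, which is why the demeaning route is preferred when the number of individuals is large.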

Example 2‐1 within estimator – `TobinQ` data set

To illustrate the estimation of the estimators seen in this chapter, we use the TobinQ dataset of the pder package. These data concern 188 American firms for 35 years (from 1951 to 1985).

library(plm)  # provides plm(), pdata.frame(), pdim(), fixef() and ercomp() used below
data("TobinQ", package = "pder")

Schaller (1990) wishes to test Tobin (1969)'s theory of investment. In this model, the main variable that explains investment is the ratio between the value of the firm and the replacement cost of its physical capital, this ratio being called “Tobin Q”. If the financial market is perfect, the value of the firm equals the present value of its future profits. If the Tobin Q is greater than 1, the profitability of investment is greater than its cost, so the investment is worthwhile. The response is therefore the rate of investment (investment divided by the capital stock) and the covariate is Tobin Q.

The plm package provides the plm function to estimate linear models on panel data. Its main arguments are:

formula, the symbolic description of the model,
data, the data.frame, which can be either an ordinary data.frame or a pdata.frame; in the first case, the index may be added to indicate the individual and time index,
model, the estimator one wants to compute: 'within', 'between', 'pooling' (which is the OLS estimator) and 'random' (which is the GLS estimator that will be presented in the next section).

We first create a pdata.frame using the pdata.frame function. The index argument can be:

a character vector of length two indicating the individual and time index,
a character vector of length one indicating the individual index (in this case, it is assumed that there is no time index in the data),
an integer indicating the number of individuals (only for a balanced panel with observations first ordered by individuals and then by period),
NULL, the default: in this case, it is assumed that the first two columns of the data.frame contain the individual and the time index.

These different possibilities are illustrated below, the first two columns of TobinQ containing the individual and the time index.

 pTobinQ <- pdata.frame(TobinQ)
pTobinQa <- pdata.frame(TobinQ, index = 188)
pTobinQb <- pdata.frame(TobinQ, index = c('cusip'))
pTobinQc <- pdata.frame(TobinQ, index = c('cusip', 'year'))

The pdim function can be used to inspect the individual and time dimensions of the data. It has a method for pdata.frame objects (without any further argument) and for data.frame. In the latter case, the index argument can be set; if not, it is once more assumed that the first two columns of the data.frame contain the individual and the time index.

 pdim(pTobinQ)
Balanced Panel: n = 188, T = 35, N = 6580

 pdim(TobinQ, index = 'cusip')
pdim(TobinQ)

A pdata.frame has an index attribute, which is a data.frame that contains the index. It can be extracted using the index function:

 head(index(pTobinQ))
  cusip year
2  2824 1951
3  2824 1952
4  2824 1953
5  2824 1954
6  2824 1955
7  2824 1956

We then estimate the three models we have described:

Qeq <- ikn ~ qn
Q.pooling <- plm(Qeq, pTobinQ, model = "pooling")
Q.within <- update(Q.pooling, model = "within")
Q.between <- update(Q.pooling, model = "between")

Either simple or extended printing of the results is obtained as usual with R applying the print.plm or summary.plm methods to the object containing the fitted model. For example, for the within estimator, we get:

 Q.within

Model Formula: ikn ~ qn

Coefficients:
     qn
0.00379
summary(Q.within)
Oneway (individual) effect Within Model

Call:
plm(formula = Qeq, data = pTobinQ, model = "within")

Balanced Panel: n = 188, T = 35, N = 6580

Residuals:
    Min.  1st Qu.   Median  3rd Qu.     Max.
-0.21631 -0.04525 -0.00849  0.03365  0.61844

Coefficients:
   Estimate Std. Error t-value Pr(>|t|)
qn 0.003792   0.000173      22   <2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    36.7
Residual Sum of Squares: 34.1
R-Squared:      0.0702
Adj. R-Squared: 0.0428
F-statistic: 482.412 on 1 and 6391 DF, p-value: <2e-16

For the within estimator, the fixef.plm method computes the individual effects. Three flavors of fixed effects may be obtained depending on the value of the type argument:

'level', the default value, returns the individual intercepts, i.e., $\hat{\alpha} + \hat{\eta}_n$,
'dfirst' returns the individual effects in deviations from the first individual; $\hat{\alpha}$ is in this case the intercept for the first individual,
'dmean' returns the individual effects in deviations from their mean; in this case, $\hat{\alpha}$ is the average of the individual intercepts.

 head(fixef(Q.within))
  2824   6284   9158  13716  17372  19411
0.1453 0.1281 0.2581 0.1100 0.1267 0.1695
head(fixef(Q.within, type = "dfirst"))
    6284     9158    13716    17372    19411    19519
-0.01723  0.11279 -0.03528 -0.01856  0.02420 -0.01038
head(fixef(Q.within, type = "dmean"))
     2824      6284      9158     13716     17372     19411
-0.014213 -0.031448  0.098581 -0.049492 -0.032778  0.009986

We then illustrate the equivalence of the within estimator and the least‐squares dummy variables estimator. For the latter, we use the lm function with the cusip variable, which is the individual index, entered as a factor. The default behavior of lm is to remove the first level of the factor. The fixed effects are then equal to those obtained using the fixef.plm function with the argument type equal to 'dfirst'.

head(coef(lm(ikn ~ qn + factor(cusip), pTobinQ)))
       (Intercept)                 qn  factor(cusip)6284
          0.145290           0.003792          -0.017235
 factor(cusip)9158 factor(cusip)13716 factor(cusip)17372
          0.112794          -0.035279          -0.018564

2.3 The Generalized Least Squares Estimator

The within estimator is a regression on data that have been transformed so that the individual effects vanish (they are, so to say, “transformed out”), while the least squares dummy variables considers the individual effects as parameters to be estimated (they are “estimated out”); both give identical estimates of the slopes. On the contrary, the GLS estimator considers the individual effects as random draws from a specific distribution and seeks to estimate the parameters of this distribution in order to obtain efficient estimators of the slopes.

2.3.1 Presentation of the GLS Estimator

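When the errors are not correlated with the covariates but are characterized by a non‐scalar covariance matrix $\Omega$, the efficient estimator is the generalized least squares estimator:

$$\hat{\gamma}_{GLS} = (Z^\top \Omega^{-1} Z)^{-1} Z^\top \Omega^{-1} y \tag{2.27}$$

In order to compute the variance of $\hat{\gamma}_{GLS}$, we substitute as previously $Z\gamma + \epsilon$ for $y$. We then have:

$$\hat{\gamma}_{GLS} - \gamma = (Z^\top \Omega^{-1} Z)^{-1} Z^\top \Omega^{-1} \epsilon$$

Using a reasoning similar to 2.20, we obtain the variance of the estimator:

$$V(\hat{\gamma}_{GLS}) = (Z^\top \Omega^{-1} Z)^{-1} \tag{2.28}$$

The hypothesis we have made concerning the errors implies that the covariance matrix of the errors is given by 2.12: $\Omega = \sigma_\nu^2 W + \sigma_\iota^2 B$, which is a linear combination of two idempotent and orthogonal matrices. $\Omega$ depends only on two parameters: the variances of the two components of the error terms ($\sigma_\eta^2$ and $\sigma_\nu^2$). We have shown, in subsection 2.1.2, that these two matrices are idempotent ($WW = W$ and $BB = B$) and orthogonal ($WB = 0$). The expression of powers of $\Omega$ is then particularly simple:

$$\Omega^r = \sigma_\nu^{2r} W + \sigma_\iota^{2r} B \tag{2.29}$$

which can be easily checked, for example for $r = 2$. This result can also be extended to negative integers and to rationals; we then have, for $r = -1$:

$$\Omega^{-1} = \frac{1}{\sigma_\nu^2} W + \frac{1}{\sigma_\iota^2} B$$

and the GLS estimator of the error component model and its variance are then:

$$\hat{\gamma}_{GLS} = \left(\frac{1}{\sigma_\nu^2} Z^\top W Z + \frac{1}{\sigma_\iota^2} Z^\top B Z\right)^{-1} \left(\frac{1}{\sigma_\nu^2} Z^\top W y + \frac{1}{\sigma_\iota^2} Z^\top B y\right) \tag{2.30}$$

$$V(\hat{\gamma}_{GLS}) = \left(\frac{1}{\sigma_\nu^2} Z^\top W Z + \frac{1}{\sigma_\iota^2} Z^\top B Z\right)^{-1} \tag{2.31}$$

For the vector of slopes, we obtain:

$$\hat{\beta}_{GLS} = \left(\frac{1}{\sigma_\nu^2} X^\top W X + \frac{1}{\sigma_\iota^2} X^\top (B - \bar{J}) X\right)^{-1} \left(\frac{1}{\sigma_\nu^2} X^\top W y + \frac{1}{\sigma_\iota^2} X^\top (B - \bar{J}) y\right) \tag{2.32}$$

$$V(\hat{\beta}_{GLS}) = \left(\frac{1}{\sigma_\nu^2} X^\top W X + \frac{1}{\sigma_\iota^2} X^\top (B - \bar{J}) X\right)^{-1} \tag{2.33}$$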

This estimator is called the random effects model, as opposed to the fixed effects model. This results from the fact that, as observed, in this case, the individual effects are considered as random deviates, the parameters of whose distribution we seek to estimate.
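The simple form of the powers of the covariance matrix, on which the GLS formulas rely, can be verified numerically. The base R sketch below assumes the within/between expression of the covariance matrix from subsection 2.1.3; the variances, dimensions and the helper name `omega_power` are illustrative.

```r
# Powers of the error covariance matrix through its within/between decomposition.
N_ind <- 3; T_per <- 4
s2_nu <- 0.5; s2_eta <- 2                      # illustrative variances
s2_iota <- s2_nu + T_per * s2_eta
B <- kronecker(diag(N_ind), matrix(1 / T_per, T_per, T_per))
W <- diag(N_ind * T_per) - B
Omega <- s2_nu * W + s2_iota * B

# Hypothetical helper: r-th power obtained by raising the two variances to r
omega_power <- function(r) s2_nu^r * W + s2_iota^r * B

all.equal(omega_power(2), Omega %*% Omega)     # square
all.equal(omega_power(-1), solve(Omega))       # inverse
C <- omega_power(-0.5)                         # inverse square root
all.equal(t(C) %*% C, solve(Omega))
```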

The dimension of the matrix $\Omega$ is given by the size of the sample. If the sample is large, it is therefore not practical to compute the estimator according to the matrix formula 2.27. A more efficient way is to apply OLS on suitably pre‐transformed data. To this end, one has to compute the matrix $C$ such that $C^\top C = \Omega^{-1}$ and then use this matrix to transform all the variables of the model. Denoting $y^* = Cy$ and $Z^* = CZ$ the transformed variables, the estimation by OLS on transformed variables gives:

$$\hat{\gamma} = (Z^{*\top} Z^*)^{-1} Z^{*\top} y^* = (Z^\top C^\top C Z)^{-1} Z^\top C^\top C y = (Z^\top \Omega^{-1} Z)^{-1} Z^\top \Omega^{-1} y$$

which is the GLS given by 2.30. The expression of the matrix $C$ is obtained using equation 2.29 for $r = -0.5$:

$$C = \Omega^{-0.5} = \frac{1}{\sigma_\nu} W + \frac{1}{\sigma_\iota} B$$

This transformation consists in a linear combination of the between and within transformations with weights depending on the variances of the two error components. In fact, pre‐multiplying the variables by $\sigma_\nu C$ (which is equivalent to premultiplication by $C$ and simplifies notation), the weights become respectively $\sigma_\nu / \sigma_\iota$ and 1. The transformed variable is therefore:

$$z^*_{nt} = (z_{nt} - \bar{z}_{n\cdot}) + \frac{\sigma_\nu}{\sigma_\iota}\, \bar{z}_{n\cdot} = z_{nt} - \theta\, \bar{z}_{n\cdot}$$

with, denoting $\phi = \sigma_\nu / \sigma_\iota$:

$$\theta = 1 - \phi = 1 - \frac{\sigma_\nu}{\sqrt{\sigma_\nu^2 + T\sigma_\eta^2}}$$

As will be explained in detail below, the importance of the individual effects in the composite errors, measured by their share of the total variance, determines how close the estimator will be to either the within or the pooled OLS, which are obtained as special cases, respectively, when the variance of the individual effects dominates ($\theta \rightarrow 1$) or vanishes ($\theta \rightarrow 0$).
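The equivalence between GLS computed with the full covariance matrix and OLS on partially demeaned data can be illustrated as follows. This is a minimal base R sketch on simulated data, assuming the two variance components are known; `qd` is a hypothetical helper that removes a share `theta` of the individual mean.

```r
# GLS with the full covariance matrix versus OLS on quasi-demeaned data.
set.seed(2024)
N_ind <- 25; T_per <- 4
s2_nu <- 1; s2_eta <- 1.5                     # treated as known here
id <- rep(seq_len(N_ind), each = T_per)
x <- rnorm(N_ind * T_per)
y <- 1 + 2 * x + sqrt(s2_eta) * rnorm(N_ind)[id] +
  sqrt(s2_nu) * rnorm(N_ind * T_per)
Z <- cbind(1, x)

# Direct GLS: requires the NT x NT covariance matrix and its inverse
Omega <- kronecker(diag(N_ind),
                   s2_nu * diag(T_per) + s2_eta * matrix(1, T_per, T_per))
Oi <- solve(Omega)
gamma_gls <- solve(t(Z) %*% Oi %*% Z, t(Z) %*% Oi %*% y)

# OLS on transformed data: each variable minus theta times its individual mean
theta <- 1 - sqrt(s2_nu / (s2_nu + T_per * s2_eta))
qd <- function(v) v - theta * ave(v, id)
gamma_qd <- coef(lm(qd(y) ~ qd(Z[, 1]) + qd(x) - 1))

all.equal(as.vector(gamma_gls), unname(gamma_qd))
```

The column of ones is transformed as well (it becomes a constant equal to one minus `theta`), which is why the regression is run without an automatic intercept.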

2.3.2 Estimation of the Variances of the Components of the Error

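In order to make the estimator operational, residuals from consistent estimators are used to estimate the unknown parameters $\sigma_\nu^2$ and $\sigma_\eta^2$ (and hence $\theta$). The estimator obtained is then called the feasible generalized least squares estimator.

Consider the errors of the model $\epsilon_{nt}$, their individual mean $\bar{\epsilon}_{n\cdot}$ and their deviations from these individual means $\epsilon_{nt} - \bar{\epsilon}_{n\cdot}$. By hypothesis, we have: $V(\epsilon_{nt}) = \sigma_\nu^2 + \sigma_\eta^2$. For the individual means, we get:

$$\bar{\epsilon}_{n\cdot} = \eta_n + \frac{1}{T}\sum_{t=1}^{T} \nu_{nt}, \qquad V(\bar{\epsilon}_{n\cdot}) = \sigma_\eta^2 + \frac{1}{T}\sigma_\nu^2 = \frac{\sigma_\iota^2}{T}$$

The variance of the deviation from the individual means is easily obtained by isolating terms in $\nu_{nt}$:

$$\epsilon_{nt} - \bar{\epsilon}_{n\cdot} = \nu_{nt} - \frac{1}{T}\sum_{s=1}^{T} \nu_{ns} = \left(1 - \frac{1}{T}\right)\nu_{nt} - \frac{1}{T}\sum_{s \neq t} \nu_{ns}$$

the sum then contains $T - 1$ terms. The variance is:

$$V(\epsilon_{nt} - \bar{\epsilon}_{n\cdot}) = \left(1 - \frac{1}{T}\right)^2 \sigma_\nu^2 + \frac{T - 1}{T^2}\sigma_\nu^2$$

which finally leads to:

$$V(\epsilon_{nt} - \bar{\epsilon}_{n\cdot}) = \frac{T - 1}{T}\sigma_\nu^2$$

If $\epsilon$ were known, natural estimators of these two variances $\sigma_\iota^2$ and $\sigma_\nu^2$ would be:

$$\hat{\sigma}_\iota^2 = T\, \frac{\sum_{n=1}^{N} \bar{\epsilon}_{n\cdot}^2}{N} = \frac{\epsilon^\top B \epsilon}{N} \tag{2.34}$$

$$\hat{\sigma}_\nu^2 = \frac{\sum_{n=1}^{N}\sum_{t=1}^{T} (\epsilon_{nt} - \bar{\epsilon}_{n\cdot})^2}{N(T - 1)} = \frac{\epsilon^\top W \epsilon}{N(T - 1)} \tag{2.35}$$

i.e., estimators based on the norm of the errors transformed using the between and within matrices. Of course, the errors are unknown, but consistent estimation of the variances may be obtained by replacing the errors with residuals obtained from a consistent estimation of the model. Among the numerous estimators available, the one proposed by Wallace and Hussain (1969) is particularly simple as it consists in using the OLS residuals to write the sample counterpart of equations 2.34 and 2.35:

$$\hat{\sigma}_\iota^2 = \frac{\hat{\epsilon}_{OLS}^\top B\, \hat{\epsilon}_{OLS}}{N}, \qquad \hat{\sigma}_\nu^2 = \frac{\hat{\epsilon}_{OLS}^\top W\, \hat{\epsilon}_{OLS}}{N(T - 1)}$$

The estimated variance of the individual effects can then be obtained:

$$\hat{\sigma}_\eta^2 = \frac{\hat{\sigma}_\iota^2 - \hat{\sigma}_\nu^2}{T}$$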

The estimator of Amemiya (1971) is based on the estimation of the within model. We first compute the overall intercept

$$\hat{\alpha} = \bar{y} - \hat{\beta}_W^\top \bar{x}$$

and then compute the residuals $\hat{\epsilon}_W$:

$$\hat{\epsilon}_W = y - \hat{\alpha} j - X\hat{\beta}_W$$

These residuals are then used to compute the two quadratic forms.

$$\hat{\sigma}_\iota^2 = \frac{\hat{\epsilon}_W^\top B\, \hat{\epsilon}_W}{N}, \qquad \hat{\sigma}_\nu^2 = \frac{\hat{\epsilon}_W^\top W\, \hat{\epsilon}_W}{N(T - 1)}$$

Note that the latter is just the deviance of the within estimation divided by $N(T - 1)$. Note also that the variance of the individual effect is overestimated if the model contains some time‐invariant variables, which disappear with the within transformation.

In this case, Hausman and Taylor (1981) proposed the following adjustment: $\hat{\epsilon}_W$ are regressed on all the time‐invariant variables in the model and the residuals of this regression are substituted for $\hat{\epsilon}_W$ in the computation of the quadratic forms. This will reduce the estimate of $\sigma_\iota^2$ and leave unchanged the estimate of $\sigma_\nu^2$, so that the estimate of $\sigma_\eta^2$ will also decrease.

For the Swamy and Arora (1972) estimator, the within and the between models are estimated. The residuals of the between model are used for the first quadratic form and those of the within model for the second one.

$$\hat{\sigma}_\iota^2 = \frac{\hat{\epsilon}_B^\top B\, \hat{\epsilon}_B}{N - K - 1}, \qquad \hat{\sigma}_\nu^2 = \frac{\hat{\epsilon}_W^\top W\, \hat{\epsilon}_W}{N(T - 1) - K}$$

Note that Swamy and Arora (1972) use the degrees of freedom of both regressions for the estimation of the variances, i.e., the number of estimated parameters is subtracted from the number of observations. Note also that $\hat{\epsilon}_B$ and $\hat{\epsilon}_W$ are the residuals of the between and within regressions computed on the transformed data, so that the numerators of the two quadratic forms are the deviances of the two regressions.

For all these estimators, $\sigma_\eta^2$ is not directly estimated but obtained by subtracting $\hat{\sigma}_\nu^2$ from $\hat{\sigma}_\iota^2$ and dividing by $T$. In small samples, it can therefore be negative, and in this case it is set to 0.

On the contrary, for the Nerlove (1971) estimator, $\sigma_\eta^2$ is estimated by computing the empirical variance of the fixed effects of the within model, while the estimate of $\sigma_\nu^2$ is obtained by dividing the quadratic form of the within residuals by the number of observations.
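The logic of estimating the two variances from within and between residuals, in the spirit of Swamy and Arora (1972), can be sketched in base R. This is a simplified illustration on simulated balanced data with one covariate, not the code of any package; the true variances used to generate the data (1 for the individual effect, 0.25 for the idiosyncratic error) and all object names are assumptions.

```r
# Variance components from within and between residuals (balanced panel).
set.seed(99)
N_ind <- 200; T_per <- 5; K <- 1
id <- rep(seq_len(N_ind), each = T_per)
x <- rnorm(N_ind)[id] + rnorm(N_ind * T_per)
y <- 1 + 2 * x + rnorm(N_ind, sd = 1)[id] + rnorm(N_ind * T_per, sd = 0.5)

# Within regression: deviance divided by N(T - 1) - K
fit_w <- lm(I(y - ave(y, id)) ~ I(x - ave(x, id)) - 1)
s2_nu_hat <- sum(resid(fit_w)^2) / (N_ind * (T_per - 1) - K)

# Between regression on repeated individual means: deviance divided by N - K - 1
fit_b <- lm(ave(y, id) ~ ave(x, id))
s2_iota_hat <- sum(resid(fit_b)^2) / (N_ind - K - 1)

# Individual variance obtained by difference and truncated at zero
s2_eta_hat <- max(0, (s2_iota_hat - s2_nu_hat) / T_per)
theta_hat <- 1 - sqrt(s2_nu_hat / s2_iota_hat)
c(s2_nu = s2_nu_hat, s2_eta = s2_eta_hat, theta = theta_hat)
```

The estimates vary from one simulated sample to another; with this many individuals they should fall reasonably close to the values used to generate the data.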

Example 2‐2 random effects model – `TobinQ` data set

The random effects model is obtained by setting model to 'random'. Specific arguments indicate how the variances are estimated.

random.method is one of 'walhus' for Wallace and Hussain (1969), 'swar' for Swamy and Arora (1972), 'amemiya' for Amemiya (1971), 'ht' for Hausman and Taylor (1981) and 'nerlove' for Nerlove (1971),
random.models is an alternative to the random.method argument: it is a character vector of length 1 or 2 that indicates which preliminary estimations are performed in order to estimate the variances; for example, c("within", "between") uses the within residuals to estimate $\sigma_\nu^2$ and the between residuals to estimate $\sigma_\iota^2$, while c("pooling") or c("pooling", "pooling") uses the pooling residuals for the estimation of both variances,
random.dfcor is a numeric vector of length 2; it indicates the denominator of each of the two quadratic forms. If the value is:
- 0, the number of observations is used,
- 1, the denominators of the theoretical formulas are used,
- 2, the number of estimated parameters is subtracted.

The following two commands estimate the same Swamy and Arora (1972) model :

 Q.swar <- plm(Qeq, pTobinQ, model = "random", random.method = "swar")
Q.swar2 <- plm(Qeq, pTobinQ, model = "random",
               random.models = c("within", "between"),
               random.dfcor = c(2, 2))
summary(Q.swar)
Oneway (individual) effect Random Effect Model
   (Swamy-Arora's transformation)

Call:
plm(formula = Qeq, data = pTobinQ, model = "random", random.method = "swar")

Balanced Panel: n = 188, T = 35, N = 6580

Effects:
                  var std.dev share
idiosyncratic 0.00533 0.07303  0.73
individual    0.00202 0.04493  0.27
theta: 0.735

Residuals:
   Min. 1st Qu.  Median 3rd Qu.    Max.
-0.2330 -0.0475 -0.0103  0.0336  0.6211

Coefficients:
            Estimate Std. Error t-value Pr(>|t|)
(Intercept) 0.159327   0.003425    46.5   <2e-16 ***
qn          0.003862   0.000168    22.9   <2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    37.9
Residual Sum of Squares: 35.1
R-Squared:      0.0742
Adj. R-Squared: 0.074
F-statistic: 526.854 on 1 and 6578 DF, p-value: <2e-16

The results indicate that the share of the individual effect in the total variance is about one fourth. The parameter called theta is the part of the individual mean that is removed from each variable for the GLS estimator. It can be written as $\theta = 1 - \sigma_\nu / \sqrt{\sigma_\nu^2 + T\sigma_\eta^2}$ and is here equal to 0.735. This high value is due to the large time dimension of this panel ($T = 35$). This implies that the GLS estimator is closer to the within estimator ($\theta = 1$) than to the OLS estimator ($\theta = 0$).
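The reported values can be recomputed by hand from the printed variance components. A short base R sketch, using the rounded variances shown in the summary output above (so the result is only accurate to about three decimals):

```r
# Share of the individual effect and theta from the printed variance components.
s2_nu  <- 0.00533   # idiosyncratic variance, as printed in the summary
s2_eta <- 0.00202   # individual variance, as printed in the summary
T_per  <- 35        # number of periods

share_id <- s2_eta / (s2_eta + s2_nu)
theta <- 1 - sqrt(s2_nu / (s2_nu + T_per * s2_eta))
round(c(share = share_id, theta = theta), 3)
```

With the same variances but a short panel (say 5 periods), the same formula gives a much smaller value, which illustrates the role of the time dimension.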

The part of the result that deals with the estimation of the two components of the error may also be obtained by applying the ercomp function either to the GLS fitted model or using a formula – data interface:

 ercomp(Qeq, pTobinQ)
ercomp(Q.swar)

We then compare the results obtained with the 4 estimation methods we've presented:

Q.walhus <- update(Q.swar, random.method = "walhus")
Q.amemiya <- update(Q.swar, random.method = "amemiya")
Q.nerlove <- update(Q.swar, random.method = "nerlove")
Q.models <- list(swar = Q.swar, walhus = Q.walhus,
                 amemiya = Q.amemiya, nerlove = Q.nerlove)
sapply(Q.models, function(x) ercomp(x)$theta)
   swar.id  walhus.id amemiya.id nerlove.id
    0.7351     0.7351     0.7361     0.7489
sapply(Q.models, coef)
                swar   walhus  amemiya  nerlove
(Intercept) 0.159327 0.159327 0.159328 0.159344
qn          0.003862 0.003862 0.003862 0.003855

The first sapply command extracts from the ercomp object the theta element, indicating the proportion of the individual mean that is removed from the variables. These are very close to each other, and consequently, the estimated coefficients for the 4 models are almost identical.

2.4 Comparison of the Estimators

We have four different estimators of the same model : the between and the within estimators use only one source of the variance of the sample, while the OLS and the GLS estimators use both.

Note first that, if the hypothesis that the errors and the covariates are uncorrelated is true, all these estimators are unbiased and consistent, which means that they should give similar results, at least in large samples.

We'll first analyze the relations between these estimators; we'll then compare their variances; and finally we'll analyze in which circumstances we should use fixed or random effects.

2.4.1 Relations between the Estimators

We can expect the OLS and GLS estimators to give intermediate results between the within and the between estimators as they use both sources of variance. From equation 2.32, the GLS estimator can be written:

Using 2.21 and 2.25, the GLS estimator can then be expressed as a weighted average of the within and the between estimators.

A similar result applies to the OLS estimator, which is the GLS estimator for $\phi = 1$.

For the OLS estimator, the weights are very intuitive because they are just the shares of the intra‐ and the inter‐individual variances of the covariates. For the GLS estimator, the weights depend not only on the shares of the variance of the covariates but also on the variance of the errors, which determines the $\phi$ parameter. The GLS estimator will always give less weight to the between variation than the OLS estimator, as $\phi$ is lower than 1. It leads to two special cases:

$\phi \rightarrow 0$; this means that the standard deviation of the idiosyncratic error is “small” compared to that of the individual effect. In this case, the GLS estimator converges to the within estimator,
$\phi \rightarrow 1$; this means that the standard deviation of the individual effect is “small” compared to that of the idiosyncratic error. In this case, the GLS estimator converges to the OLS estimator.
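With a single covariate, the weighted average takes a simple form. The following is a sketch in notation assumed here to mirror the R objects of Example 2‐3 rather than the matrix notation of the text: $S_W$ and $S_B$ stand for the within and between sums of squares of the covariate (`SxxW` and `SxxB`), and $\phi$ for the parameter computed there as `phi`.

```latex
\begin{align*}
\hat\beta_{gls} &= w(\phi)\,\hat\beta_{W} + \bigl(1 - w(\phi)\bigr)\,\hat\beta_{B},
  & w(\phi) &= \frac{S_W}{S_W + \phi^{2} S_B},\\
\hat\beta_{ols} &= w(1)\,\hat\beta_{W} + \bigl(1 - w(1)\bigr)\,\hat\beta_{B},
  & w(1) &= \frac{S_W}{S_W + S_B},\\
\lim_{\phi \to 0} w(\phi) &= 1,
  & w(\phi) &\ge w(1) \quad \text{for } 0 \le \phi \le 1 .
\end{align*}
```

Since $w(\phi)$ decreases as $\phi$ increases, the within estimator never receives less weight in GLS than in OLS, and the two special cases correspond to the two ends of the interval $[0, 1]$.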

The relation between the estimators can also be illustrated by the fact that the OLS and the GLS can be obtained by stacking the within and between transformations of the model:²

(2.36)

The covariance matrix of the errors of this stacked model is:

(2.37)

Applying OLS to 2.36, we get:

which is the OLS estimator,

while applying GLS to 2.36 yields the GLS estimator of equation 2.30.
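The stacking argument can be checked numerically. The following is a minimal sketch in base R on simulated data; the sample sizes, the coefficients, and the value of `phi` are arbitrary assumptions, not taken from the TobinQ example. OLS on the stacked within and between transformations reproduces the pooled OLS slope, and multiplying the between block by `phi` before stacking reproduces the weighted average of the within and between slopes.

```r
# Simulated balanced panel (illustrative values only)
set.seed(1)
N <- 50; Tp <- 6
id  <- rep(seq_len(N), each = Tp)     # ordered by individual, then period
eta <- rnorm(N)[id]                   # individual effect, repeated Tp times
x   <- 0.5 * eta + rnorm(N * Tp)
y   <- 1 + 2 * x + eta + rnorm(N * Tp)

# Within transformation: deviations from the individual means
xW <- x - ave(x, id); yW <- y - ave(y, id)
# Between transformation: individual means in deviation from the overall mean
xB <- ave(x, id) - mean(x); yB <- ave(y, id) - mean(y)

# OLS on the stacked data (no intercept) versus pooled OLS on the raw data
ys <- c(yW, yB); xs <- c(xW, xB)
b_stack <- coef(lm(ys ~ 0 + xs))[["xs"]]
b_ols   <- coef(lm(y ~ x))[["x"]]

# Rescaling the between block by phi amounts to GLS on the stacked model
phi <- 0.3                            # arbitrary illustrative value
ys_phi <- c(yW, phi * yB); xs_phi <- c(xW, phi * xB)
b_gls_stack <- coef(lm(ys_phi ~ 0 + xs_phi))[["xs_phi"]]

# The same estimate as a weighted average of the within and between slopes
bW <- sum(xW * yW) / sum(xW^2)
bB <- sum(xB * yB) / sum(xB^2)
w  <- sum(xW^2) / (sum(xW^2) + phi^2 * sum(xB^2))
b_gls_avg <- w * bW + (1 - w) * bB

c(stacked = b_stack, pooled = b_ols, gls_stacked = b_gls_stack, gls_average = b_gls_avg)
```

The first two values coincide, as do the last two, because the within and between transformed series are orthogonal to each other.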

2.4.2 Comparison of the Variances

From equation 2.33, the variance of the GLS estimator can be written:

(2.38)

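Comparing this expression with the variance of the within estimator, the difference between the variance of the within estimator and that of the GLS estimator is a positive definite matrix, and the GLS estimator is therefore more efficient than the within estimator. Similarly, equation 2.22 gives the variance of the between estimator, and the difference between this variance and that of the GLS estimator is also a positive definite matrix.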
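The comparison can be sketched as follows. The notation is an assumption made for this sketch: $X$ is the matrix of covariates without the intercept, $W$ and $\bar{B}$ are the within and between (in deviation from the overall mean) transformation matrices, $\sigma_\nu^2$ is the variance of the idiosyncratic error, $\sigma_\iota^2$ is the variance of the error of the between model multiplied by $T$, and $\phi = \sigma_\nu / \sigma_\iota$.

```latex
\begin{align*}
\mathrm{V}(\hat\beta_{gls}) &= \sigma_\nu^{2}\bigl(X^\top W X + \phi^{2} X^\top \bar{B} X\bigr)^{-1}, \\
\mathrm{V}(\hat\beta_{W})   &= \sigma_\nu^{2}\bigl(X^\top W X\bigr)^{-1}, \qquad
\mathrm{V}(\hat\beta_{B})    = \sigma_\iota^{2}\bigl(X^\top \bar{B} X\bigr)^{-1}
                             = \frac{\sigma_\nu^{2}}{\phi^{2}}\bigl(X^\top \bar{B} X\bigr)^{-1}, \\
\mathrm{V}(\hat\beta_{gls})^{-1} - \mathrm{V}(\hat\beta_{W})^{-1}
  &= \frac{\phi^{2}}{\sigma_\nu^{2}}\, X^\top \bar{B} X, \qquad
\mathrm{V}(\hat\beta_{gls})^{-1} - \mathrm{V}(\hat\beta_{B})^{-1}
   = \frac{1}{\sigma_\nu^{2}}\, X^\top W X .
\end{align*}
```

Both differences are positive definite when the covariates have within and between variation, so the precision of the GLS estimator exceeds that of each of the two other estimators.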

2.4.3 Fixed vs Random Effects

The individual effects are not fixed or random by nature. Within the same framework (the individual effects model), they are treated as either a vector of constant parameters or the realization of random deviates for the purpose of estimation, depending on their probabilistic structure and, in particular, on their correlation with the explanatory variables.

In a micro‐panel, the random effects approach is appealing, as we work on a sample with numerous individuals who are randomly drawn from a very large population. There is no interest in estimating the individual effects, and the random effect approach is more appropriate, given the way the sample was obtained.

On the contrary, in a macro‐panel, the sample is fixed or quasi‐fixed and almost exhaustive (think of the countries of the world or the large enterprises of a country). In this case, the estimation of the individual effects may be an interesting result, and the fixed effects approach seems relevant.

In any case, the main argument for choosing one of the two approaches is the possibility of correlation between some covariates and the individual effects. If we maintain the hypothesis that the idiosyncratic error is uncorrelated with the covariates, two situations can occur:

the individual effects are not correlated with the covariates; in this case, both estimators are consistent, but the random effects estimator is more efficient than the fixed effects estimator,
the individual effects are correlated with the covariates; in this case, only the fixed effects method gives consistent estimates as, with the within transformation, the individual effects vanish.
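These two situations can be illustrated with a small simulation. This is a minimal sketch in base R under assumed values (true slope of 1, unit variances, and the true `phi` plugged into the GLS formula instead of an estimate); `rho` is a hypothetical parameter that controls how strongly the covariate loads on the individual effect.

```r
set.seed(42)
N <- 500; Tp <- 5; beta <- 1         # illustrative sizes and true slope
id  <- rep(seq_len(N), each = Tp)    # ordered by individual, then period
eta <- rnorm(N)                      # individual effects, variance 1
phi <- 1 / sqrt(1 + Tp)              # true phi when both error variances equal 1

slopes <- function(rho) {
  x <- rho * eta[id] + rnorm(N * Tp)            # correlated with eta if rho != 0
  y <- beta * x + eta[id] + rnorm(N * Tp)
  xW <- x - ave(x, id);       yW <- y - ave(y, id)
  xB <- ave(x, id) - mean(x); yB <- ave(y, id) - mean(y)
  c(within = sum(xW * yW) / sum(xW^2),
    gls    = (sum(xW * yW) + phi^2 * sum(xB * yB)) /
             (sum(xW^2) + phi^2 * sum(xB^2)),
    ols    = (sum(xW * yW) + sum(xB * yB)) / (sum(xW^2) + sum(xB^2)))
}

no_corr <- slopes(rho = 0)   # individual effects uncorrelated with x
corr    <- slopes(rho = 1)   # individual effects correlated with x
print(round(rbind(no_corr, corr), 3))
```

Without correlation, the three estimates should all be close to the true slope. With correlation, only the within estimate should remain close to it, the GLS and OLS estimates being pulled toward the between estimate.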

Example 2‐3 comparison of the estimators – `TobinQ` data set

The following command extracts the coefficient of qn and its standard deviation for the four estimators (we consider only the Swamy and Arora (1972) method for the GLS estimator, as all the random effects models give very similar results).

 sapply(list(pooling = Q.pooling, within = Q.within,
            between = Q.between, swar = Q.swar),
       function(x) coef(summary(x))["qn", c("Estimate", "Std. Error")])
             pooling    within   between      swar
Estimate   0.0043920 0.0037919 0.0051847 0.0038622
Std. Error 0.0001529 0.0001726 0.0007491 0.0001683

The OLS and GLS estimators are in the interval defined by the within and between estimators, and the GLS estimator is closer to the within estimator than OLS.

Looking at the standard deviations, OLS seems to be the most efficient model, but remember that the standard formula for computing the variance of the OLS estimator is biased if individual effects are present. The standard deviation for the GLS estimator (1.683E‐04) is slightly lower than for the within estimator (1.726E‐04) and much lower than for the between estimator (7.491E‐04).
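With a single covariate, the precision (inverse variance) of the GLS estimator is approximately the sum of the precisions of the within and the between estimators. The sketch below checks this on the standard errors reported above; the relation is only approximate here because each fitted model relies on its own estimates of the error variances.

```r
# Standard errors of the qn coefficient, copied from the output above
se_within  <- 0.0001726
se_between <- 0.0007491
se_swar    <- 0.0001683

# Standard error implied by adding the within and between precisions
se_combined <- 1 / sqrt(1 / se_within^2 + 1 / se_between^2)

# Compare the implied value with the one reported for the GLS estimator
c(combined = se_combined, reported = se_swar)
```

The implied value is close to the reported one and, by construction, lower than both the within and the between standard errors.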

The formal relation between the different estimators is then illustrated by computing the shares of the variances for the covariate qn. For this purpose, we'll extract this series from the pdata.frame. Unlike a column extracted from a data.frame, the result is not a plain numeric vector but a pseries object, which inherits the index attribute from the pdata.frame it has been extracted from. The summary.pseries method applied to this object indicates the variance structure of the series:

 summary(pTobinQ$qn)
total sum of squares: 314300
     id    time
0.43081 0.09393

We can use the Within and the Between functions with this series in order to compute its within and between transformations, and then the weights of the within and the between estimators in the OLS estimator.

 SxxW <- sum(Within(pTobinQ$qn) ^ 2)
SxxB <- sum((Between(pTobinQ$qn) - mean(pTobinQ$qn)) ^ 2)
SxxTot <- sum( (pTobinQ$qn - mean(pTobinQ$qn)) ^ 2)
pondW <- SxxW / SxxTot
pondW
[1] 0.5692
pondW * coef(Q.within)[["qn"]] +
  (1 - pondW) * coef(Q.between)[["qn"]]
[1] 0.004392

The weight of the within estimator is 57%. The OLS estimator (0.0044) is then about halfway between the between estimator (0.0052) and the within estimator (0.0038). To get the GLS estimator, we first estimate the $\phi$ parameter using the residuals of the within and the between estimators:

 T <- 35
N <- 188
smxt2 <- deviance(Q.between) * T / (N - 2)
sidios2 <- deviance(Q.within) / (N * (T - 1) - 1)
phi <- sqrt(sidios2 / smxt2)

The weights of the within and the between estimators in the GLS estimator are then computed:

 pondW <- SxxW / (SxxW + phi^2 * SxxB)
pondW
[1] 0.9496
pondW * coef(Q.within)[["qn"]] +
  (1 - pondW) * coef(Q.between)[["qn"]]
[1] 0.003862

The weight of the within estimator (0.95) is much larger for the GLS estimator than for the OLS estimator. This is mainly due to the fact that T is large (35 years). The GLS estimator (0.0039) is therefore very close to the within estimator (0.0038).

2.4.4 Some Simple Linear Model Examples

Even if they are of limited practical interest, given that relevant econometric models usually contain several covariates, simple linear models have a great pedagogical value, as they enable the graphical representation of the sample and estimators using regression lines. They are for this reason very useful to illustrate the relationship between the estimators. We'll use successively four data sets.

Example 2‐4 simple linear model – `ForeignTrade` data set

The first one, called ForeignTrade, has been used by Kinal and Lahiri (1993) to construct a full model of external exchange for developing countries, which will be presented in detail in chapter 6. For now, we'll simply analyze the link between the imports (imports) and the national product (gnp). Both variables are measured in log and per capita.

The following commands create a pdata.frame, extract the covariate, and apply to it the summary.pseries method, which computes the decomposition of its variance. We then use the ercomp function in order to compute the variances of the error components. Finally, to estimate all the models, we first create a vector containing the names of the models, and we then use the sapply function in order to extract the coefficient from these fitted models.

 data("ForeignTrade", package = "pder")
FT <- pdata.frame(ForeignTrade)
summary(FT$gnp)
total sum of squares: 4111
      id     time
0.982480 0.007638
ercomp(imports ~ gnp, FT)
                 var std.dev share
idiosyncratic 0.0863  0.2938  0.07
individual    1.0779  1.0382  0.93
theta: 0.942
models <- c("within", "random", "pooling", "between")
sapply(models, function(x) coef(plm(imports ~ gnp, FT, model = x))["gnp"])
 within.gnp  random.gnp pooling.gnp between.gnp
    0.90236     0.76816     0.06366     0.04871

For this model, the variance of the covariate and of the error is almost entirely due to the inter‐individual variation (respectively 98 and 93%). In this case, the GLS estimator consists in removing 94% of the individual mean and is therefore almost identical to the within model. Concerning the OLS estimator, which takes into account almost all the inter‐individual variation, it is very close to the between estimator. Finally, the first two models give results that are very different from the last two models and return a much higher elasticity. Figure 2.1 indicates that there is a strong negative correlation between the individual effects and the covariate. In this case, the estimators that do not control for the individual effects are biased downward. This is the case for the OLS and the between estimators, and to a much lesser extent for the GLS estimator, which uses only a very small part of the inter‐individual variation.

Figure 2.1 Imports in terms of the national product for the ForeignTrade data.

Example 2‐5 simple linear model – `TurkishBanks` data set

The TurkishBanks data were used by El‐Gamal and Inanoglu (2005) to analyze production costs of banks. The only covariate is the production, and both variables are in logs. Computing as before, we get:

 data("TurkishBanks", package = "pder")
TurkishBanks <- na.omit(TurkishBanks)
TB <- pdata.frame(TurkishBanks)
summary(log(TB$output))
total sum of squares: 2692
     id    time
0.84730 0.01255
ercomp(log(cost) ~ log(output), TB)
                var std.dev share
idiosyncratic 0.329   0.574   0.6
individual    0.216   0.464   0.4
theta:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.619   0.651   0.651   0.647   0.651   0.651
sapply(models, function(x)
       coef(plm(log(cost) ~ log(output), TB, model = x))["log(output)"])
 within.log(output)  random.log(output) pooling.log(output)
             0.5064              0.6471              0.8007
between.log(output)
             0.8531

The variation of the covariate is mainly inter‐individual (85%), but for the error, the share of the individual effect and that of the idiosyncratic effect are similar (40% and 60%). The OLS and the between estimators are therefore very close. The GLS estimator is about halfway between the OLS and the within estimators because the transformation removes about 65% of the individual mean. Figure 2.2 indicates that the individual effects are positively correlated with the covariate, and consequently, the between, the OLS, and to a lesser extent the GLS estimators are upward‐biased.

Figure 2.2 Cost in terms of output for the TurkishBanks data.

Example 2‐6 simple linear model – `TexasElectr` data set

The TexasElectr data are used by Kumbhakar (1996) and Horrace and Schmidt (1996) and concern the production cost of electric firms in Texas. We first define the cost as being the sum of labor expense explab, capital expense expcap, and fuel expense expfuel. The same computations are then done as above.

 data("TexasElectr", package = "pder")
TexasElectr$cost <- with(TexasElectr, explab + expfuel + expcap)
TE <- pdata.frame(TexasElectr)
summary(log(TE$output))
total sum of squares: 113.5
    id   time
0.8234 0.1685
ercomp(log(cost) ~ log(output), TE)
                  var std.dev share
idiosyncratic 0.10681 0.32681  0.99
individual    0.00109 0.03299  0.01
theta: 0.0808
sapply(models, function(x)
       coef(plm(log(cost) ~ log(output), TE, model = x))["log(output)"])
 within.log(output)  random.log(output) pooling.log(output)
             2.6325              1.2260              1.1804
between.log(output)
             0.8689

The variation of the covariate is mainly inter‐individual (82%); yet this is not the case for the error, for which the idiosyncratic share is very important: therefore, only a very small part of the individual mean is removed while applying the GLS estimator. The GLS and OLS estimators are therefore almost equal. The within estimator is much higher because the individual effects and the covariate are negatively correlated (see Figure 2.3).

Figure 2.3 Cost and output for the TexasElectr data set.

Example 2‐7 simple linear model – `DemocracyIncome25` data set

The last dataset is DemocracyIncome25, used by Acemoglu, Johnson, Robinson, and Yared (2008). This dataset deals with 25 countries, observed over seven 25‐year periods between 1850 and 2000. The authors analyze the dynamic causal relationship between wealth and democracy. Their analysis will be reproduced in detail in chapter 7. For now, we'll simply analyze the relationship between democracy (democracy) and wealth (income) lagged one period.

 data("DemocracyIncome25", package = "pder")
DI <- pdata.frame(DemocracyIncome25)
summary(lag(DI$income))
total sum of squares: 135
    id   time
0.4298 0.4891
ercomp(democracy ~ lag(income), DI)
                 var std.dev share
idiosyncratic 0.0586  0.2422  0.79
individual    0.0155  0.1243  0.21
theta: 0.378
sapply(models, function(x)
       coef(plm(democracy ~ lag(income), DI, model = x))["lag(income)"])
 within.lag(income)  random.lag(income) pooling.lag(income)
             0.1870              0.2101              0.2309
between.lag(income)
             0.2892

Figure 2.4 Democracy and lagged income for the data DemocracyIncome25.

The shares of the inter‐individual variation for the covariate and for the error are rather small (43 and 21%). About 38% of the individual mean is removed from the variables in order to compute the GLS estimator (theta is 0.378). Finally, Figure 2.4 shows that there is no obvious correlation between the individual effects and the covariate; consequently, the four estimators are rather close to each other.

2.5 The Two‐ways Error Components Model

The two‐ways error component model is obtained by adding a time effect, which is common to all the individuals in a given period, to the model.

2.5.1 Error Components in the Two‐ways Model

We make for the time effects the same hypotheses that we made for the individual effects:

the time effects have a zero mean and are homoscedastic, with a common variance,
the time effects are mutually uncorrelated,
the time effects are uncorrelated with the individual effects and the idiosyncratic terms.

With these hypotheses, the covariance matrix of the errors becomes:

As for the individual error component model, we write this covariance matrix as a linear combination of idempotent and mutually orthogonal matrices. To this aim, we write:

Three matrices compute, as before, the individual means, the time means, and the overall mean, respectively. Finally, the within matrix now produces deviations from both the individual and the time means, the overall mean being added back because it is subtracted twice:

With these notations, we get:

It can be easily checked that these matrices are idempotent. However, they are not all mutually orthogonal, as the product of the matrix of individual means and the matrix of time means is not zero. This product computes the time means of the individual means, which results in the overall mean. For this reason, we use instead two matrices that return respectively the individual and the time means in deviations from the overall mean. We finally obtain:
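The properties described above can be checked numerically on a small balanced panel. This is a minimal sketch in base R; the object names, the dimensions, and the three variance values are assumptions made for the illustration, and the observations are ordered by individual first and then by period, as in the rest of the chapter.

```r
N <- 4; Tp <- 3                                         # small illustrative panel
I_NT   <- diag(N * Tp)
B_id   <- kronecker(diag(N), matrix(1 / Tp, Tp, Tp))    # individual means
B_time <- kronecker(matrix(1 / N, N, N), diag(Tp))      # time means
J_bar  <- matrix(1 / (N * Tp), N * Tp, N * Tp)          # overall mean
W       <- I_NT - B_id - B_time + J_bar                 # two-ways within matrix
Bb_id   <- B_id - J_bar      # individual means in deviation from overall mean
Bb_time <- B_time - J_bar    # time means in deviation from overall mean

near_zero  <- function(M) max(abs(M)) < 1e-12
idempotent <- function(M) near_zero(M %*% M - M)

mats <- list(W = W, Bb_id = Bb_id, Bb_time = Bb_time, J_bar = J_bar)
all_idempotent <- all(sapply(mats, idempotent))
pairs <- combn(names(mats), 2)
all_orthogonal <- all(apply(pairs, 2, function(p)
  near_zero(mats[[p[1]]] %*% mats[[p[2]]])))
product_is_overall_mean <- near_zero(B_id %*% B_time - J_bar)

# Covariance matrix of the errors, written directly and as a linear
# combination of the four orthogonal matrices (arbitrary variances)
s2_idios <- 1; s2_id <- 2; s2_time <- 0.5
Omega <- s2_idios * I_NT + Tp * s2_id * B_id + N * s2_time * B_time
Omega_spectral <- s2_idios * W +
  (s2_idios + Tp * s2_id) * Bb_id +
  (s2_idios + N * s2_time) * Bb_time +
  (s2_idios + Tp * s2_id + N * s2_time) * J_bar
decomposition_holds <- near_zero(Omega - Omega_spectral)

c(all_idempotent, all_orthogonal, product_is_overall_mean, decomposition_holds)
```

Writing the covariance matrix in this form makes any power of it easy to compute, since it is enough to raise the four scalar coefficients to that power.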

2.5.2 Fixed and Random Effects Models

As for the individual effects model, the two‐ways fixed effects model can be obtained in two different ways:

by estimating by OLS the model that includes individual and time dummies,
by estimating by OLS the model where all the variables have been transformed in deviations from the individual and the time means.

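For the GLS model, the variables are pre‐multiplied by the inverse square root of the covariance matrix of the errors or, more simply, by this matrix rescaled by the standard deviation of the idiosyncratic error: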

Collecting terms, we obtain the following expression for the transformed data:

with:
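A sketch of the resulting transformation is given below, in notation assumed for this illustration: $z_{nt}$ is any variable of the model, $\bar{z}_{n.}$, $\bar{z}_{.t}$, and $\bar{z}$ are its individual, time, and overall means, and $\sigma_\nu^2$, $\sigma_\eta^2$, and $\sigma_\mu^2$ are the variances of the idiosyncratic, individual, and time components. The three coefficients correspond to the id, time, and total rows of the theta element returned by ercomp.

```latex
\begin{align*}
z^{*}_{nt} &= z_{nt} - \theta_\eta\, \bar{z}_{n.} - \theta_\mu\, \bar{z}_{.t} + \theta_2\, \bar{z}, \\
\theta_\eta &= 1 - \frac{\sigma_\nu}{\sqrt{\sigma_\nu^{2} + T \sigma_\eta^{2}}}, \qquad
\theta_\mu   = 1 - \frac{\sigma_\nu}{\sqrt{\sigma_\nu^{2} + N \sigma_\mu^{2}}}, \\
\theta_2    &= \theta_\eta + \theta_\mu
              + \frac{\sigma_\nu}{\sqrt{\sigma_\nu^{2} + T \sigma_\eta^{2} + N \sigma_\mu^{2}}} - 1 .
\end{align*}
```

When the variance of the time effects is zero, $\theta_\mu = 0$ and $\theta_2 = 0$, and the expression reduces to the transformation of the individual effects model.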

Example 2‐8 two‐ways effect model – `TobinQ` data set

We've previously stored the four random effect models in a list called Q.models. The two‐ways effect model is obtained by setting the effect argument to 'twoways'.

 Q.models2 <- lapply(Q.models, function(x) update(x, effect = "twoways"))
sapply(Q.models2, function(x) sqrt(ercomp(x)$sigma2))
         swar  walhus amemiya nerlove
idios 0.06970 0.06970 0.06969 0.06850
id    0.04508 0.04508 0.04573 0.04735
time  0.02093 0.02093 0.02170 0.02262
sapply(Q.models2, function(x) ercomp(x)$theta)
      swar   walhus amemiya nerlove
id    0.7472 0.7472 0.7505  0.7624
time  0.764  0.764  0.772   0.7843
total 0.6863 0.6863 0.6933  0.7085

The first sapply command extracts the standard deviations of the three components of the error. As for the individual effects model, the estimates of the variance components are very similar. The standard deviation of individual effects is more than twice the one of time effects. The second command extracts the theta parameters. About 75% of the individual and time means are removed from the variables.
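The theta parameters can be recomputed from the standard deviations of the three components. The following sketch does so for the swar column, using the panel dimensions of the TobinQ data (188 firms, 35 years); the formulas are the usual ones for the two‐ways random effects transformation and are assumed here to be those applied by ercomp.

```r
# Standard deviations reported above for the swar column
s_idios <- 0.06970
s_id    <- 0.04508
s_time  <- 0.02093
N <- 188; Tp <- 35                    # dimensions of the TobinQ panel

# Ratios of the idiosyncratic standard deviation to the three composite ones
phi_id   <- s_idios / sqrt(s_idios^2 + Tp * s_id^2)
phi_time <- s_idios / sqrt(s_idios^2 + N * s_time^2)
phi_tot  <- s_idios / sqrt(s_idios^2 + Tp * s_id^2 + N * s_time^2)

theta <- c(id    = 1 - phi_id,
           time  = 1 - phi_time,
           total = 1 - phi_id - phi_time + phi_tot)
round(theta, 3)
```

The three values agree with the id, time, and total rows of the swar column to about three decimal places, the small differences coming from the rounding of the reported standard deviations.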

2.6 Estimation of a Wage Equation

Example 2‐9 multiple linear model – `UnionWage` data set

The estimation of a wage function is an important subject in econometrics, especially in panel data econometrics, the main covariate of interest being generally education. We use here the UnionWage dataset used by Vella and Verbeek (1998), who investigated the impact of union negotiations on wages and the potential endogeneity of this covariate. The data concern 545 men observed during 8 years, from 1980 to 1987.

 data("UnionWage", package = "pglm")
pdim(UnionWage)
Balanced Panel: n = 545, T = 8, N = 4360

The response, wage, is the log of the hourly wage. The covariates are: whether wages are set during negotiations with unions union, the number of years of education school, the number of years of experience exper and its square, the community com, which identifies black black and Hispanic hisp workers, whether one lives in a rural area rural, the marital status married, having a health problem health, the region region, and the activity sector sector.

The within and OLS models are estimated, with and without occupation dummies.

 UnionWage$exper2 <- with(UnionWage, exper ^ 2)
wages.within1 <- plm(wage ~ union + school + exper + exper2 +
                         com + rural + married + health +
                         region + sector, UnionWage)
wages.within2 <- plm(wage ~ union + school + exper + exper2 +
                         com + rural + married + health +
                         region + sector + occ, UnionWage)
wages.pooling1 <- update(wages.within1, model = "pooling")
wages.pooling2 <- update(wages.within2, model = "pooling")

Estimation results are presented in Table 2.1, using the stargazer library (Hlavac, 2013). We use several possibilities offered by the library to improve the appearance of the table:

the omit argument is used to omit two sets of coefficients corresponding to the region and sector factors; omit.labels indicates how the information about these covariates will be included in the table,
the F statistic and the adjusted R² are removed from the output using omit.stat,
customized names for the response and the covariates are provided with dep.var.labels and covariate.labels,
column.labels and column.separate are used to indicate the method of estimation used for the first two and the last two models.

Table 2.1 Wage Equation.

Dependent variable:				log of hourly wage
	pooling estimation		within estimation
	(1)	(2)	(3)	(4)
union membership	0.176	0.146	0.080	0.079
	(0.017)	(0.017)	(0.020)	(0.019)
education years	0.078	0.090
	(0.005)	(0.005)
experience years	0.070	0.076	0.111	0.112
	(0.010)	(0.010)	(0.009)	(0.008)
experience years squared
	(0.001)	(0.001)	(0.001)	(0.001)
black
	(0.023)	(0.023)
hispanic
	(0.022)	(0.022)
rural residence			0.048	0.050
	(0.019)	(0.018)	(0.029)	(0.029)
married	0.102	0.110	0.038	0.040
	(0.015)	(0.015)	(0.018)	(0.018)
health problems	0.035	0.058	0.010	0.017
	(0.054)	(0.054)	(0.047)	(0.047)
Intercept	0.273	0.039
	(0.091)	(0.076)
region dummies	Yes	Yes	Yes	Yes
sector dummies	Yes	Yes	Yes	Yes
occupation dummies	Yes	No	Yes	No
Observations	4,360	4,360	4,360	4,360
R²	0.278	0.264	0.192	0.190

Note: *p<0.1; **p<0.05; ***p<0.01

library("stargazer")
stargazer(wages.pooling2, wages.pooling1, wages.within2, wages.within1,
          omit = c("region", "sector", "occ"),
          omit.labels = c("region dummies", "sector dummies",
                          "occupation dummies"),
          column.labels = c("pooling estimation", "within estimation"),
          column.separate = c(2, 2),
          dep.var.labels = "log of hourly wage",
          covariate.labels = c("union membership", "education years",
                               "experience years", "experience years squared",
                               "black", "hispanic", "rural residence",
                               "married", "health problems",
                               "Intercept"),
          omit.stat = c("adj.rsq", "f"),
          title = "Wage equation",
          label = "tab:wagesresult",
          no.space = TRUE
)

Table 2.1 exactly matches the results presented in Vella and Verbeek (1998) in columns (2), (1), (3), and (4).

Looking at the results, we see that the union premium is about 18% with the OLS model and falls to 8% for the within model. This indicates that the individual effects are strongly positively correlated with union membership. The return to education is about 8%: one more year of education is associated with a wage about 8% higher. This is a consistent result only if the education level is uncorrelated with the individual effects. If there is any correlation, the only consistent model is the within model; unfortunately, the within transformation eliminates all the time‐invariant covariates (education and community), as shown by the empty cells of columns (3) and (4) in Table 2.1.

This example illustrates the main concern about panel data econometrics, the correlation between some covariates and the two components of the error term:

if there is no correlation, use GLS, which gives consistent and efficient estimators and allows estimating the coefficients for time‐invariant covariates;
if there is correlation only with the individual component of the error, use the within model; it provides consistent estimates, but the effect of time‐invariant covariates cannot be estimated;
if there is correlation between any covariates and both components of the error term, none of the models we have presented are consistent. Vella and Verbeek (1998) argued that the endogeneity of union membership is not limited to the time‐invariant part of the error. In this case, all the models presented, including the within model, are inconsistent, and the authors propose a more sophisticated estimation procedure in order to obtain a consistent estimator.
