For unbalanced panels, the number of observations for each individual is now individual specific and denoted by $T_n$. We'll denote by $O = \sum_{n=1}^{N} T_n$ the total number of observations. Compared to the balanced panel case, three complications appear.
The model to be estimated can be written: $$y_{nt} = \alpha + \beta^\top x_{nt} + \eta_n + \nu_{nt}$$
The fixed effects model may be estimated by regressing $y$ on $X$ and the matrix of individual dummies $D$. As in the balanced panel case, the Frisch-Waugh theorem makes it possible to avoid the explicit estimation of the fixed effects. The estimation of $\beta$ may be obtained by regressing $y$ and $X$ on $D$ in a first stage, computing the residuals, and then regressing in the second stage the residuals of $y$ on those of $X$. As in the balanced panel case, these residuals are just the individual within transformation, i.e., $y_{nt} - \bar{y}_n$ or, in matrix form, $Wy$ with $W = I - D(D^\top D)^{-1}D^\top$, and the fixed effects model is simply obtained by regressing $Wy$ on $WX$.
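To make the procedure concrete, here is a minimal Python sketch of the within estimator on an unbalanced panel; the function name and the data layout (a vector of individual identifiers) are illustrative, not the book's:

```python
import numpy as np

def within_ols(y, X, ids):
    """Fixed effects (within) estimator on an unbalanced panel.

    y   : (O,) response, with O the total number of observations
    X   : (O, K) covariates
    ids : (O,) individual identifier of each observation
    """
    y_w = np.empty_like(y, dtype=float)
    X_w = np.empty_like(X, dtype=float)
    # Within transformation: subtract each individual's own mean,
    # computed over its T_n observations.
    for i in np.unique(ids):
        rows = ids == i
        y_w[rows] = y[rows] - y[rows].mean()
        X_w[rows] = X[rows] - X[rows].mean(axis=0)
    # OLS on the transformed data is the fixed effects estimator.
    beta, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return beta
```

The only change relative to the balanced case is that each individual mean is computed over that individual's own $T_n$ observations.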
For the GLS model, the covariance matrix of the errors is: $$\Omega = \mathrm{E}\left(\epsilon\epsilon^\top\right) = \sigma_\nu^2 I_O + \sigma_\eta^2 D D^\top$$
The GLS estimator writes: $$\hat\beta = \left(X^\top \Omega^{-1} X\right)^{-1} X^\top \Omega^{-1} y$$
$DD^\top$ is a block-diagonal matrix that contains square matrices of ones $J_{T_n}$ of dimension $T_n$. For balanced panels, $T_n = T$ and $DD^\top = I_N \otimes J_T$. Pre-multiplying a vector by $DD^\top$ returns the sum of the values of the variable for each individual. $\Omega$ is also a block-diagonal matrix, with blocks of the form: $$\Omega_n = \sigma_\nu^2 I_{T_n} + \sigma_\eta^2 J_{T_n} = \sigma_\nu^2 W_{T_n} + \sigma_{1n}^2 B_{T_n}$$ with $\sigma_{1n}^2 = \sigma_\nu^2 + T_n \sigma_\eta^2$, $B_{T_n} = J_{T_n}/T_n$ the between matrix, and $W_{T_n} = I_{T_n} - B_{T_n}$ the within matrix.
The inverse of a block-diagonal matrix being equal to a block-diagonal matrix for which the blocks are the inverses of those of the initial matrix, it is sufficient to calculate the inverse of $\Omega_n$. As it is a linear combination of two idempotent and orthogonal matrices, the general formula for any power of $\Omega_n$ is: $$\Omega_n^r = \sigma_\nu^{2r} W_{T_n} + \sigma_{1n}^{2r} B_{T_n}$$ In particular, the inverse is: $$\Omega_n^{-1} = \frac{1}{\sigma_\nu^2} W_{T_n} + \frac{1}{\sigma_{1n}^2} B_{T_n}$$
which can also be written as $\frac{1}{\sigma_\nu^2}\left(W_{T_n} + \phi_n^2 B_{T_n}\right)$, with: $$\phi_n^2 = \frac{\sigma_\nu^2}{\sigma_{1n}^2} = \frac{\sigma_\nu^2}{\sigma_\nu^2 + T_n \sigma_\eta^2}$$
The GLS estimator may then be obtained by applying OLS on variables that have been transformed by pre-multiplying them by $\Omega^{-1/2}$ or, equivalently, by $\sigma_\nu \Omega^{-1/2}$ (which will simplify notation): $$\sigma_\nu \Omega_n^{-1/2} = W_{T_n} + \phi_n B_{T_n} = I_{T_n} - \left(1 - \phi_n\right) B_{T_n}$$
As in the balanced case, the transformed data can be expressed as quasi-differences, $y_{nt} - \theta_n \bar{y}_n$, with: $$\theta_n = 1 - \phi_n = 1 - \frac{\sigma_\nu}{\sqrt{\sigma_\nu^2 + T_n \sigma_\eta^2}}$$
the only difference being that now, the proportion of the individual mean that is removed is not a constant, as it depends on the number of observations for each individual.
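A sketch of the corresponding quasi-demeaning, assuming the variance components have already been estimated (names and data layout are again illustrative):

```python
import numpy as np

def gls_transform(v, ids, sigma_nu2, sigma_eta2):
    """Quasi-difference a variable for unbalanced one-way GLS.

    For each individual n with T_n observations, subtracts
    theta_n = 1 - sigma_nu / sqrt(sigma_nu^2 + T_n * sigma_eta^2)
    times the individual mean.
    """
    out = np.empty_like(v, dtype=float)
    for i in np.unique(ids):
        rows = ids == i
        T_n = rows.sum()
        theta_n = 1 - np.sqrt(sigma_nu2 / (sigma_nu2 + T_n * sigma_eta2))
        out[rows] = v[rows] - theta_n * v[rows].mean()
    return out
```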
For the two-ways error component model, we have: $$\epsilon_{nt} = \eta_n + \mu_t + \nu_{nt}$$ or, in matrix form: $$\epsilon = D_1 \eta + D_2 \mu + \nu$$
where $D_1$ and $D_2$ are matrices of respectively individual and time dummies. Pre-multiplying a vector by $D_1^\top$ and $D_2^\top$ returns, respectively, the individual and time sums of the variable.
$D_1^\top D_1$ and $D_2^\top D_2$ are two diagonal matrices that contain the number of observations for each individual and each time period. Pre-multiplying a vector by $\left(D_1^\top D_1\right)^{-1} D_1^\top$ or by $\left(D_2^\top D_2\right)^{-1} D_2^\top$ returns, respectively, the individual and the time-series means. Finally, $D_1^\top D_2$ is a matrix of ones and zeros, which indicates whether an observation for a specific individual and time period is present or not.
To help visualize these matrices, we consider a panel with 3 individuals and 4 periods; the panel is unbalanced, as the first individual is not observed in the third and fourth periods, and the third one is not observed in the first period. A small numerical illustration is given below.
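The following numpy sketch builds the dummy matrices for this 3-individual, 4-period example and displays the cross-products just described:

```python
import numpy as np

# Unbalanced panel: individual 1 observed in periods 1-2,
# individual 2 in periods 1-4, individual 3 in periods 2-4.
ind = np.array([1, 1, 2, 2, 2, 2, 3, 3, 3])
per = np.array([1, 2, 1, 2, 3, 4, 2, 3, 4])

D1 = (ind[:, None] == np.unique(ind)).astype(int)  # individual dummies (O x N)
D2 = (per[:, None] == np.unique(per)).astype(int)  # time dummies (O x T)

print(D1.T @ D1)  # diagonal: observations per individual (2, 4, 3)
print(D2.T @ D2)  # diagonal: observations per period (2, 3, 2, 2)
print(D1.T @ D2)  # N x T incidence matrix of ones and zeros
```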
The fixed effects model can be estimated by regressing $y$ on $X$ and the two matrices of dummies $D_1$ and $D_2$ associated with the effects vectors $\eta$ and $\mu$.
The application of the Frisch-Waugh theorem implies that the estimation can be performed by regressing $y$, $X$, and $D_2$ on $D_1$ in a first stage and then, in a second stage, by regressing the residuals of $y$ on those of $X$ and $D_2$, which means regressing $W_1 y$ on $W_1 X$ and $W_1 D_2$, with $W_1 = I - D_1\left(D_1^\top D_1\right)^{-1} D_1^\top$.
Applying the same theorem again, one can regress $W_1 y$ and $W_1 X$ on $W_1 D_2$ in the first stage, and the residuals of $W_1 y$ on those of $W_1 X$ in the second stage.
Residuals of a regression on $W_1 D_2$ are obtained by pre-multiplying the variables by the matrix: $$I - W_1 D_2 \left(D_2^\top W_1 D_2\right)^{-} D_2^\top W_1$$
where, for any matrix $A$, $A^{-}$ is the generalized inverse of $A$. Finally, the two-ways error component fixed effects model may be obtained by applying to $y$ and every column of $X$ the double-within transformation which, for unbalanced panels, consists in multiplying any data vector by the following matrix: $$W = W_1 - W_1 D_2 \left(D_2^\top W_1 D_2\right)^{-} D_2^\top W_1$$
Therefore, the two-ways fixed effects model is still easy to compute even if the panel is unbalanced: all that is required is OLS estimation and the computation of deviations from the individual means. One proceeds as follows: the within transformation is first performed on the response, the covariates, and the time dummies; preliminary linear regressions on the transformed time dummies are then performed before the final estimation, for which there are $K$ covariates. A sketch of this computation follows the next remark.
Note that no specific matrix computation is required and that, in particular, the matrix of individual dummies, which is often very large ($O \times N$), need not be stored during the estimation.
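The sketch announced above: a minimal Python implementation of this two-step procedure, assuming the same data layout as before (the $O \times N$ matrix of individual dummies is never formed):

```python
import numpy as np

def twoway_within(y, X, ind, per):
    """Two-way fixed effects on an unbalanced panel: within-transform
    on individuals, partial out the transformed time dummies, then OLS.
    The O x N matrix of individual dummies is never formed."""
    D2 = (per[:, None] == np.unique(per)).astype(float)  # time dummies (O x T)

    def demean(v):  # individual within transformation
        out = v.astype(float)
        for i in np.unique(ind):
            rows = ind == i
            out[rows] -= out[rows].mean(axis=0)
        return out

    yw, Xw, D2w = demean(y), demean(X), demean(D2)
    # Frisch-Waugh: partial the transformed time dummies out of y and X;
    # lstsq copes with the rank deficiency of D2w (generalized inverse).
    yd = yw - D2w @ np.linalg.lstsq(D2w, yw, rcond=None)[0]
    Xd = Xw - D2w @ np.linalg.lstsq(D2w, Xw, rcond=None)[0]
    return np.linalg.lstsq(Xd, yd, rcond=None)[0]
```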
The variance matrix of the errors is: $$\Omega = \mathrm{E}\left(\epsilon\epsilon^\top\right) = \sigma_\nu^2 I_O + \sigma_\eta^2 D_1 D_1^\top + \sigma_\mu^2 D_2 D_2^\top$$
Denote $\Omega_\eta = \sigma_\nu^2 I_O + \sigma_\eta^2 D_1 D_1^\top$ the covariance matrix of the errors of the individual one-way error components model. We then have: $$\Omega = \Omega_\eta + \sigma_\mu^2 D_2 D_2^\top$$
$\Omega_\eta$ is block-diagonal, with blocks: $\sigma_\nu^2 W_{T_n} + \sigma_{1n}^2 B_{T_n}$. $W_{T_n}$ and $B_{T_n}$ being idempotent and orthogonal, the matrix $\Omega_\eta^{-1/2}$ (defined so that $\Omega_\eta^{-1/2}\Omega_\eta^{-1/2} = \Omega_\eta^{-1}$) is also a block-diagonal matrix with blocks: $\frac{1}{\sigma_\nu} W_{T_n} + \frac{1}{\sigma_{1n}} B_{T_n}$. We then have: $$\Omega = \Omega_\eta^{1/2}\left[I_O + \sigma_\mu^2\, \Omega_\eta^{-1/2} D_2 D_2^\top \Omega_\eta^{-1/2}\right]\Omega_\eta^{1/2}$$ for which the inverse is: $$\Omega^{-1} = \Omega_\eta^{-1/2}\left[I_O + \sigma_\mu^2\, \Omega_\eta^{-1/2} D_2 D_2^\top \Omega_\eta^{-1/2}\right]^{-1}\Omega_\eta^{-1/2}$$
We then apply the Woodbury identity, $\left(A + BCB^\top\right)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + B^\top A^{-1} B\right)^{-1}B^\top A^{-1}$, to the matrix in brackets. Finally, we have: $$\Omega^{-1} = \Omega_\eta^{-1} - \Omega_\eta^{-1} D_2 \left[\frac{1}{\sigma_\mu^2} I_T + D_2^\top \Omega_\eta^{-1} D_2\right]^{-1} D_2^\top \Omega_\eta^{-1}$$
and the GLS estimator is: $$\hat\beta = \left(X^\top \Omega^{-1} X\right)^{-1} X^\top \Omega^{-1} y$$
Let $\tilde{X}$ and $\tilde{D}_2$ be the matrices of the covariates and of the time dummies measured in quasi-differences from the individual means, i.e., pre-multiplied by $\sigma_\nu \Omega_\eta^{-1/2}$. We then have: $$X^\top \Omega^{-1} X = \frac{1}{\sigma_\nu^2}\left(\tilde{X}^\top \tilde{X} - \tilde{X}^\top \tilde{D}_2\left[\frac{\sigma_\nu^2}{\sigma_\mu^2} I_T + \tilde{D}_2^\top \tilde{D}_2\right]^{-1} \tilde{D}_2^\top \tilde{X}\right)$$
and a similar expression for $X^\top \Omega^{-1} y$. With the two matrices $\tilde{X}$ and $\tilde{D}_2$ in hand, the computation of the estimator only requires products of these matrices and the inversion of a $T \times T$ matrix. These are reasonable computational tasks: note especially that the matrix of individual dummies need not be stored, and that the dimension of the matrix that has to be inverted is $T \times T$, not $O \times O$ or $N \times N$; at least for micro-panels, $T$ is relatively small. Note also that the computation of the GLS estimator requires explicit matrix operations: it can no longer be obtained as a series of linear regressions on transformed data.
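As a numerical sanity check of this inversion strategy, the following sketch builds the two-way $\Omega$ for the small example above and verifies that the Woodbury-based formula, which only inverts a $T \times T$ matrix, reproduces the direct inverse (the variance values are arbitrary):

```python
import numpy as np

# Small unbalanced two-way panel (from the example above): build D1, D2.
ind = np.array([1, 1, 2, 2, 2, 2, 3, 3, 3])
per = np.array([1, 2, 1, 2, 3, 4, 2, 3, 4])
D1 = (ind[:, None] == np.unique(ind)).astype(float)
D2 = (per[:, None] == np.unique(per)).astype(float)
O, T = D2.shape

s_nu2, s_eta2, s_mu2 = 1.0, 0.5, 0.3
Omega_eta = s_nu2 * np.eye(O) + s_eta2 * D1 @ D1.T   # one-way component
Omega = Omega_eta + s_mu2 * D2 @ D2.T                # full two-way matrix

# Woodbury: only a T x T matrix has to be inverted
# (Omega_eta is block-diagonal, so its inverse is cheap in practice).
A_inv = np.linalg.inv(Omega_eta)
core = np.linalg.inv(np.eye(T) / s_mu2 + D2.T @ A_inv @ D2)
Omega_inv = A_inv - A_inv @ D2 @ core @ D2.T @ A_inv

print(np.allclose(Omega_inv, np.linalg.inv(Omega)))  # True
```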
Remember that, in the balanced panel case, we used the result that natural estimators of $\sigma_\nu^2$ and $\sigma_\eta^2$ were based on the within and between quadratic forms of the errors. Feasible estimates were obtained by replacing the errors by the residuals from a consistent estimation. For the balanced case, $N(T-1)$ and $N$ were natural denominators. This is no longer the case when the panel is unbalanced, as $T_n$ is not the same for all individuals (and the number of individuals observed, $N_t$, is not the same for all time periods).
The strategy used here consists in computing the expected values of the quadratic forms in order to obtain unbiased estimators of the variance components.
Different estimators are obtained using different preliminary models to obtain the residuals, among the numerous possible choices previously seen in chapter 2.
The model and its estimation are: $$y = \alpha \iota + X\beta + \epsilon, \qquad \hat{y} = \hat\alpha \iota + X\hat\beta$$ The intercept can be removed by pre-multiplying every element of the model by $I_O - J_O/O$ (with $J_O$ the $O \times O$ matrix of ones), which subtracts from every variable its overall mean and therefore removes the intercept, as $\left(I_O - J_O/O\right)\iota = 0$.
Subtracting the expression of the model and that of its estimation, we get: $$\hat\epsilon = \epsilon - X\left(\hat\beta - \beta\right)$$ with $y$ and $X$ now measured in deviations from their overall means. The three estimators we use (OLS, within, and between) can be seen as GLS estimators of this model, with a weighting matrix $A$ being equal, respectively, to $I$, $W$, and $B$: $$\hat\beta_A = \left(X^\top A X\right)^{-1} X^\top A y$$
Using the two previous expressions, we get $\hat\epsilon = M_A \epsilon$, with: $$M_A = I - X\left(X^\top A X\right)^{-1} X^\top A$$ $M_A$ is the matrix that transforms the error vector into the residuals vector. Note that it is not a symmetric matrix, at least unless $A = I$ (which corresponds to the pooling model). The quadratic form of the residuals with a matrix $A$ is: $$q_A = \hat\epsilon^\top A \hat\epsilon = \epsilon^\top M_A^\top A M_A \epsilon$$
$q_A$ being a scalar, it is also equal to its trace: $$q_A = \mathrm{tr}\left(\epsilon^\top M_A^\top A M_A \epsilon\right)$$ Using the cyclic property of the trace operator, we get: $$q_A = \mathrm{tr}\left(M_A^\top A M_A \epsilon\epsilon^\top\right)$$ from which, taking expectations, we obtain: $$\mathrm{E}\left(q_A\right) = \mathrm{tr}\left(M_A^\top A M_A \Omega\right)$$ with $\Omega = \mathrm{E}\left(\epsilon\epsilon^\top\right)$.
Finally, replacing $\Omega$ by its expression, we get an expected value that is linear in the variance components: $$\mathrm{E}\left(q_A\right) = \sigma_\nu^2\, \mathrm{tr}\left(M_A^\top A M_A\right) + \sigma_\eta^2\, \mathrm{tr}\left(M_A^\top A M_A D_1 D_1^\top\right) + \sigma_\mu^2\, \mathrm{tr}\left(M_A^\top A M_A D_2 D_2^\top\right)$$
The most common estimators are obtained by considering the quadratic forms with the within, between-individual, and between-time matrices. Equating each of the three quadratic forms to its expected value yields a system of three linear equations in $\sigma_\nu^2$, $\sigma_\eta^2$, and $\sigma_\mu^2$, whose coefficients are the traces above; these can be evaluated using standard results on the projection matrices involved.
The estimator is obtained by equating each quadratic form to its expected value and solving the resulting system for the variance components, as in the sketch below.
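A compact illustration for the one-way case (the two-way case adds a third quadratic form and a third trace column): the expected-value coefficients are computed as traces, and the resulting linear system is solved for the variance components. The simulation setup is ours:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated unbalanced one-way panel: sigma_eta^2 = 1, sigma_nu^2 = 0.49.
ind = np.repeat(np.arange(20), rng.integers(3, 8, size=20))
O = ind.size
D = (ind[:, None] == np.unique(ind)).astype(float)
X = np.column_stack([np.ones(O), rng.normal(size=O)])
y = X @ np.array([1.0, 2.0]) + D @ rng.normal(size=20) + 0.7 * rng.normal(size=O)

# Residuals of the preliminary (pooled OLS) estimation: e = M eps.
M = np.eye(O) - X @ np.linalg.solve(X.T @ X, X.T)
e = M @ y

B = D @ np.linalg.solve(D.T @ D, D.T)  # between-individual matrix
W = np.eye(O) - B                      # within matrix
DDt = D @ D.T

# E(e'Ae) = s_nu2 tr(M'AM) + s_eta2 tr(M'AM DD'): equate and solve.
coefs, q = [], []
for A in (W, B):
    Phi = M.T @ A @ M
    coefs.append([np.trace(Phi), np.trace(Phi @ DDt)])
    q.append(e @ A @ e)
s_nu2, s_eta2 = np.linalg.solve(np.array(coefs), np.array(q))
print(s_nu2, s_eta2)  # unbiased estimates of 0.49 and 1.0
```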
The matrices corresponding to the three most common estimators are presented in Figure 3.1.
Very often in economics, the phenomenon under investigation is not well described by a single equation but by a system of equations. This is particularly the case in the field of the micro-econometrics of consumption or production. For example, the behavior of a producer is described by a minimum cost equation along with equations of factor demand. In this case, there are two advantages in considering the whole system of equations rather than each equation in isolation.
Linear restrictions on the vector of coefficients to be estimated can be represented using a restriction matrix $R$ and a numeric vector $q$: $$R\beta = q$$ For example, if the sum of the first two coefficients must equal 1 and the first and third ones should be equal, the joint restrictions can be written as: $$\begin{pmatrix} 1 & 1 & 0 & \cdots & 0 \\ 1 & 0 & -1 & \cdots & 0 \end{pmatrix} \beta = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
To estimate the constrained OLS estimator, we write the Lagrangian: $$\mathcal{L} = \left(y - X\beta\right)^\top\left(y - X\beta\right) + 2\lambda^\top\left(R\beta - q\right)$$ with $\lambda$ the vector of Lagrange multipliers associated with the different constraints. The Lagrangian can also be written as: $$\mathcal{L} = y^\top y - 2\beta^\top X^\top y + \beta^\top X^\top X \beta + 2\lambda^\top\left(R\beta - q\right)$$
The first-order conditions become: $$X^\top X \beta + R^\top \lambda = X^\top y, \qquad R\beta = q$$ which can also be written in matrix form: $$\begin{pmatrix} X^\top X & R^\top \\ R & 0 \end{pmatrix} \begin{pmatrix} \beta \\ \lambda \end{pmatrix} = \begin{pmatrix} X^\top y \\ q \end{pmatrix}$$
The constrained OLS estimator can be obtained using the formula for the inverse of a partitioned matrix (see equation 2.18), applied here with the blocks $X^\top X$, $R^\top$, $R$, and $0$. The constrained estimator is then: $$\hat\beta_c = \hat\beta + \left(X^\top X\right)^{-1} R^\top \left[R \left(X^\top X\right)^{-1} R^\top\right]^{-1} \left(q - R\hat\beta\right)$$
The unconstrained estimator being $\hat\beta = \left(X^\top X\right)^{-1} X^\top y$, we finally get: $$\hat\beta_c - \hat\beta = -\left(X^\top X\right)^{-1} R^\top \left[R \left(X^\top X\right)^{-1} R^\top\right]^{-1} \left(R\hat\beta - q\right)$$ The difference between the constrained and the unconstrained estimators is then a linear combination of $R\hat\beta - q$, the amount by which the unconstrained estimator fails to satisfy the linear constraints.
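Both routes, the partitioned linear system of first-order conditions and the closed-form correction, are easy to check numerically; the data-generating process below is ours:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 200, 3
X = rng.normal(size=(n, K))
y = X @ np.array([0.6, 0.4, 0.6]) + rng.normal(size=n)

# Restrictions: b1 + b2 = 1 and b1 = b3, i.e., R b = q.
R = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, -1.0]])
q = np.array([1.0, 0.0])
J = R.shape[0]

# Solve the first-order conditions as one partitioned linear system.
K_mat = np.block([[X.T @ X, R.T],
                  [R, np.zeros((J, J))]])
rhs = np.concatenate([X.T @ y, q])
beta_c = np.linalg.solve(K_mat, rhs)[:K]

# Closed form: correct the unconstrained estimator for the violation R b - q.
XtX_inv = np.linalg.inv(X.T @ X)
beta_u = XtX_inv @ X.T @ y
beta_c2 = beta_u - XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T,
                                                   R @ beta_u - q)
print(np.allclose(beta_c, beta_c2), R @ beta_c - q)  # True, ~[0, 0]
```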
We consider a system of $L$ equations denoted $y_l = X_l \beta_l + \epsilon_l$, with $l = 1, \ldots, L$. In matrix form, the system can be written as follows: $$\begin{pmatrix} y_1 \\ \vdots \\ y_L \end{pmatrix} = \begin{pmatrix} X_1 & & 0 \\ & \ddots & \\ 0 & & X_L \end{pmatrix} \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_L \end{pmatrix} + \begin{pmatrix} \epsilon_1 \\ \vdots \\ \epsilon_L \end{pmatrix}$$
The covariance matrix of the errors of the system is $\Omega = \mathrm{E}\left(\epsilon\epsilon^\top\right)$. We suppose that the errors of two equations $l$ and $m$ for the same observation are correlated and that the covariance, denoted by $\sigma_{lm}$, is constant, so that $\mathrm{E}\left(\epsilon_l \epsilon_m^\top\right) = \sigma_{lm} I$. Denoting by $\Sigma$ the $L \times L$ matrix of inter-equation covariances, we have: $$\Omega = \Sigma \otimes I$$
Because of the inter-equation correlations, the efficient estimator is the GLS estimator: $\hat\beta = \left(X^\top \Omega^{-1} X\right)^{-1} X^\top \Omega^{-1} y$. This estimator, first proposed by Zellner (1962), is known by the acronym SUR, for seemingly unrelated regression. It can be obtained by applying OLS on transformed data, each variable being pre-multiplied by $\Omega^{-1/2}$. This matrix is simply $\Sigma^{-1/2} \otimes I$. Denoting by $\sigma^{lm}$ the elements of $\Sigma^{-1/2}$, the transformed response and covariates are: $$y_l^* = \sum_{m=1}^{L} \sigma^{lm} y_m, \qquad X_l^* = \sum_{m=1}^{L} \sigma^{lm} X_m$$
$\Sigma$ is a matrix that contains $L(L+1)/2$ unknown parameters, which can be estimated using the residuals of a consistent but inefficient preliminary estimator, like OLS. The efficient estimator is then obtained by estimating each equation by OLS, estimating $\Sigma$ from the resulting residuals, and applying GLS with this estimate.
$\Sigma^{-1/2}$ can conveniently be computed using the Cholesky decomposition, i.e., computing the lower-triangular matrix $C$ such that $\Sigma = C C^\top$; $C^{-1}$ is then a valid square root of $\Sigma^{-1}$, as $C^{-\top} C^{-1} = \Sigma^{-1}$.
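A minimal sketch of this two-step procedure (OLS per equation, then FGLS on the stacked system); for clarity it forms the Kronecker product explicitly rather than transforming the variables with $\Sigma^{-1/2}$:

```python
import numpy as np

def sur(ys, Xs):
    """Two-step SUR (Zellner 1962): OLS per equation, then FGLS.

    ys : list of L response vectors (each of length n)
    Xs : list of L covariate matrices
    """
    L, n = len(ys), ys[0].size
    # Step 1: OLS equation by equation, keep the residuals.
    res = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        for y, X in zip(ys, Xs)
    ])
    Sigma = res.T @ res / n              # estimated inter-equation covariances
    # Step 2: GLS on the stacked system, Omega^{-1} = Sigma^{-1} kron I.
    X = np.zeros((L * n, sum(Xk.shape[1] for Xk in Xs)))
    col = 0
    for k, Xk in enumerate(Xs):          # block-diagonal stacked covariates
        X[k * n:(k + 1) * n, col:col + Xk.shape[1]] = Xk
        col += Xk.shape[1]
    y = np.concatenate(ys)
    Oinv = np.kron(np.linalg.inv(Sigma), np.eye(n))
    return np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)
```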
Applying the SUR estimator to panel data is straightforward when only the between or the within variability of the data is taken into account. In this case, one just has to apply the above formula using the variables in individual means (between-SUR) or in deviations from individual means (within-SUR). Taking into account both sources of variability requires more attention and leads to the SUR error component model proposed by Avery (1977) and Baltagi (1980). The errors of the model then present two sources of correlation: the correlation between the errors of different equations for the same observation, and the correlation, within a given equation, between the errors of different observations of the same individual.
Every observation is now characterized by three indexes: $y_{lnt}$ is the observation of $y$ for equation $l$, individual $n$, and period $t$. The observations are first ordered by equation, then by individual. Denoting $\epsilon_{ln}$ the error vector for equation $l$ and individual $n$, one gets: $$\epsilon_{ln} = \eta_{ln}\, \iota + \nu_{ln}$$
The errors concerning different individuals being uncorrelated, the covariance matrix for two equations $l$ and $m$ and all individuals is: $$\mathrm{E}\left(\epsilon_l \epsilon_m^\top\right) = \sigma_{\eta lm} \left(I_N \otimes J_T\right) + \sigma_{\nu lm}\, I_{NT}$$
Finally, for the whole system of equations, denoting $\Sigma_\eta$ and $\Sigma_\nu$ the two matrices of dimension $L \times L$ containing the parameters $\sigma_{\eta lm}$ and $\sigma_{\nu lm}$, the covariance matrix of the errors is: $$\Omega = \Sigma_\eta \otimes \left(I_N \otimes J_T\right) + \Sigma_\nu \otimes I_{NT}$$
The SUR error component model may be obtained by applying OLS on transformed data, every variable being pre-multiplied by $\Omega^{-1/2}$. Writing $\Sigma_1 = \Sigma_\nu + T \Sigma_\eta$, the two required matrix square roots, $\Sigma_\nu^{-1/2}$ and $\Sigma_1^{-1/2}$, may be estimated using the Cholesky decompositions of $\Sigma_\nu$ and $\Sigma_1$ (see Kinal and Lahiri, 1990).
The two error covariance matrices being unknown, the error-component SUR estimator is obtained by first estimating every equation with a consistent preliminary estimator, then estimating $\Sigma_\eta$ and $\Sigma_\nu$ from the residuals, and finally applying GLS using these estimates.
Different choices of preliminary estimates lead to different SUR error component estimators. For example, Baltagi (1980) used the method of Amemiya (1971), while Avery (1977) chose that of Swamy and Arora (1972).
An alternative to the GLS estimator presented in the previous chapter is the maximum likelihood estimator. Contrary to the GLS estimator, the parameters are not estimated sequentially (first the variance components and then $\beta$) but simultaneously.
In order to write the likelihood of the model, the distribution of the errors must be perfectly characterized; compared with the GLS model, we must then add a hypothesis concerning the distribution of the two components of the error term, the individual and the idiosyncratic effects: we'll suppose that they are both normally distributed. The likelihood is the joint density for the whole sample, which is the product of the individual densities in the case of a random sample. This is not the case here, as the observations of a given individual are correlated because of the common individual effect. The model to be estimated is then: $$y_{nt} = \alpha + \beta^\top x_{nt} + \eta_n + \nu_{nt}$$
with $\eta_n \sim N\left(0, \sigma_\eta^2\right)$ and $\nu_{nt} \sim N\left(0, \sigma_\nu^2\right)$. For a given value of the individual effect, $\eta_n$, the density for $y_{nt}$ is: $$f\left(y_{nt} \mid \eta_n\right) = \frac{1}{\sigma_\nu \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y_{nt} - \alpha - \beta^\top x_{nt} - \eta_n}{\sigma_\nu}\right)^2}$$
For a given value of $\eta_n$, the distribution of $y_n$ is that of a vector of independent random deviates, and the joint conditional density is therefore the product of the individual densities: $$f\left(y_n \mid \eta_n\right) = \prod_{t=1}^{T_n} f\left(y_{nt} \mid \eta_n\right)$$
The unconditional distribution is obtained by integrating out the individual effect $\eta_n$, which means that the mean value of the density is computed over all possible values of $\eta_n$: $$f\left(y_n\right) = \int_{-\infty}^{+\infty} f\left(y_n \mid \eta\right) f(\eta)\, d\eta$$
The integrand is, up to a multiplicative constant, a Gaussian density in $\eta$; completing the square in the exponent and integrating $\eta$ out, the joint density for an individual is finally: $$f\left(y_n\right) = \left(2\pi\right)^{-T_n/2} \left|\Omega_n\right|^{-1/2} e^{-\frac{1}{2}\, \epsilon_n^\top \Omega_n^{-1} \epsilon_n}$$ with $\epsilon_n = y_n - \alpha\iota - X_n\beta$ and $\Omega_n = \sigma_\nu^2 I_{T_n} + \sigma_\eta^2 J_{T_n}$.
The contribution of the $n$-th individual to the log-likelihood function is simply the logarithm of the joint density: $$\ln L_n = -\frac{T_n}{2}\ln 2\pi - \frac{1}{2}\ln\left|\Omega_n\right| - \frac{1}{2}\, \epsilon_n^\top \Omega_n^{-1} \epsilon_n$$ The log-likelihood function is then obtained by summing over all the individuals of the panel: $$\ln L = \sum_{n=1}^{N} \ln L_n$$
or, more simply in the special case of a balanced panel, denoting $\phi^2 = \sigma_\nu^2 / \left(\sigma_\nu^2 + T \sigma_\eta^2\right)$: $$\ln L = -\frac{NT}{2}\ln\left(2\pi\sigma_\nu^2\right) + \frac{N}{2}\ln\phi^2 - \frac{1}{2\sigma_\nu^2}\sum_{n=1}^{N} \epsilon_n^\top\left(W_T + \phi^2 B_T\right)\epsilon_n$$ Note also that $\left|\Omega_n\right| = \sigma_\nu^{2T}/\phi^2$ in this case.
The first derivatives of the log-likelihood with respect to $\beta$, $\sigma_\nu^2$, and $\phi^2$ are then obtained (equations 3.9, 3.10, and 3.11).
Solving 3.9, we obtain the GLS estimator of $\beta$ for a given value of the transformation parameter $\phi$ (equation 3.12).
The estimator of $\sigma_\nu^2$ is simply obtained, using 3.10, as the residual variance of the model estimated on the transformed data (equation 3.13).
Finally, using 3.11 and 3.13, the transformation parameter $\phi$ is obtained (equation 3.14).
The estimation can be performed iteratively. Starting from an estimator of $\beta$ (for example the within estimator), we calculate $\phi$ using the formula given by 3.14. We then transform the response and the covariates using this estimator of $\phi$ and compute a second estimation of $\beta$ using 3.12. These computations are repeated until the convergence of $\beta$ and $\phi$. $\sigma_\nu^2$ is then estimated using 3.13.
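Since the intermediate formulas 3.9 to 3.14 are not reproduced here, the sketch below takes the direct route instead: it codes the unbalanced one-way log-likelihood itself (using $\ln|\Omega_n| = (T_n - 1)\ln\sigma_\nu^2 + \ln\sigma_{1n}^2$ and the within/between split of the quadratic form) and leaves the maximization to a generic optimizer; the iterative scheme of the text should converge to the same values. All names are ours, and scipy is assumed to be available:

```python
import numpy as np
from scipy.optimize import minimize

def re_loglik(params, y, X, ids):
    """Log-likelihood of the one-way random effects model
    (normal eta and nu), valid for unbalanced panels."""
    K = X.shape[1]
    beta = params[:K]
    # Parameterize the variances in logs to keep them positive.
    s_nu2, s_eta2 = np.exp(params[K]), np.exp(params[K + 1])
    ll = 0.0
    for i in np.unique(ids):
        e = y[ids == i] - X[ids == i] @ beta
        T_n = e.size
        s1_2 = s_nu2 + T_n * s_eta2
        ebar = e.mean()
        within = ((e - ebar) ** 2).sum()
        # ln|Omega_n| = (T_n - 1) ln s_nu2 + ln s1_2, and
        # e' Omega_n^{-1} e = within / s_nu2 + T_n ebar^2 / s1_2.
        ll -= 0.5 * (T_n * np.log(2 * np.pi) + (T_n - 1) * np.log(s_nu2)
                     + np.log(s1_2) + within / s_nu2 + T_n * ebar ** 2 / s1_2)
    return ll

# p0 stacks starting values for beta and the two log-variances,
# e.g. taken from the within estimator:
# fit = minimize(lambda p: -re_loglik(p, y, X, ids), p0, method="BFGS")
```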
The nested random effects model is relevant when the individuals can be put together in different groups. For example, with a panel of firms, groups may be constituted by regions or production sectors.
In this chapter, we'll restrict ourselves to panels where each individual belongs to exactly one group and where the panel is balanced within each group. The number of individuals and the length of the time series may however differ from one group to another. This is why this model, presented in Baltagi et al. (2001), is called the unbalanced nested error component model, even if its unbalancedness must be understood in the very restrictive sense we've just described.
Three effects will now be considered: the usual individual and idiosyncratic effects, but also a new one that represents the group effects, $\lambda$. Denoting by $D_g$ the matrix of group dummies, the errors write: $$\epsilon = D_1 \eta + D_g \lambda + \nu$$ $\Omega$ is block-diagonal with $G$ (the number of groups) blocks of the following shape, for a group $g$ containing $N_g$ individuals observed over $T_g$ periods: $$\Omega_g = \sigma_\nu^2 I_{N_g T_g} + \sigma_\eta^2 \left(I_{N_g} \otimes J_{T_g}\right) + \sigma_\lambda^2 J_{N_g T_g}$$
Rewriting $I_{N_g T_g}$ and the two matrices of ones in terms of within and between projections, this can be expressed as a linear combination of three symmetric, idempotent, and orthogonal matrices which sum to $I_{N_g T_g}$, with $\sigma_{1g}^2 = \sigma_\nu^2 + T_g \sigma_\eta^2$ and $\sigma_{2g}^2 = \sigma_{1g}^2 + N_g T_g \sigma_\lambda^2$: $$\Omega_g = \sigma_\nu^2 W_g + \sigma_{1g}^2 B_{1g} + \sigma_{2g}^2 B_{2g}$$ where: $$W_g = I_{N_g T_g} - I_{N_g} \otimes \bar{J}_{T_g}, \qquad B_{1g} = I_{N_g} \otimes \bar{J}_{T_g} - \bar{J}_{N_g T_g}, \qquad B_{2g} = \bar{J}_{N_g T_g}$$ with $\bar{J}$ denoting a matrix of ones divided by its dimension.
This expression makes it easy to find the expression for $\Omega_g^{-1/2}$: $$\sigma_\nu \Omega_g^{-1/2} = W_g + \frac{\sigma_\nu}{\sigma_{1g}} B_{1g} + \frac{\sigma_\nu}{\sigma_{2g}} B_{2g}$$ which finally writes: $$y_{gnt}^* = y_{gnt} - \theta_{1g}\, \bar{y}_{gn.} - \theta_{2g}\, \bar{y}_{g..}$$ with $\theta_{1g} = 1 - \sigma_\nu / \sigma_{1g}$ and $\theta_{2g} = \sigma_\nu / \sigma_{1g} - \sigma_\nu / \sigma_{2g}$.
The model can therefore be estimated by OLS on transformed variables from which part of the individual and the group means (respectively $\theta_{1g}$ and $\theta_{2g}$) have been subtracted, as in the sketch below.
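A sketch of this transformation, taking the group-specific parameters $\theta_{1g}$ and $\theta_{2g}$ as given (e.g., computed from estimated variance components as above); names and data layout are illustrative:

```python
import numpy as np

def nested_transform(v, ind, grp, theta1, theta2):
    """Quasi-difference a variable for the nested error components model.

    theta1[g] and theta2[g] are the group-specific transformation
    parameters, computed elsewhere from the estimated variances; part
    of the individual mean and of the group mean is subtracted.
    """
    out = v.astype(float)
    ind_mean = {i: v[ind == i].mean() for i in np.unique(ind)}
    grp_mean = {g: v[grp == g].mean() for g in np.unique(grp)}
    for k in range(v.size):
        g = grp[k]
        out[k] -= theta1[g] * ind_mean[ind[k]] + theta2[g] * grp_mean[g]
    return out
```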
We proceed along the lines of section 3.1.3. Using the residuals from a preliminary estimation, denoted $\hat\epsilon$, we compute a quadratic form $q_A = \hat\epsilon^\top A \hat\epsilon$ with a matrix $A$.
Replacing $\Omega$ by its expression, we obtain, as previously, an expected value $\mathrm{E}\left(q_A\right)$ that is linear in the three variance components $\sigma_\nu^2$, $\sigma_\eta^2$, and $\sigma_\lambda^2$.
The most popular estimators are obtained by computing the three quadratic forms with the within-individual, between-individual, and between-group matrices. Equating each quadratic form to its expected value yields a system of three linear equations, whose coefficients are obtained from standard results on the traces of the projection matrices involved; solving this system gives unbiased estimators of the three variance components.
Baltagi et al. (2001) have proposed variants of the Amemiya (1971) estimator (where the within estimator is used for the three quadratic forms), of the Wallace and Hussain (1969) estimator (the OLS estimator is used for the three quadratic forms), and of the Swamy and Arora (1972) estimator (the within, between-individual, and between-group estimators are used respectively for the within, between-individual, and between-group quadratic forms). The detailed formulas are presented in Figure 3.2.