Chapter 9
Count Data and Limited Dependent Variables

It is often the case in economics that the dependent variable is not continuous, so that OLS estimation is not appropriate. On the one hand, the response may be a count, i.e., it takes only non-negative integer values. In this case, the most commonly used specifications are the Poisson and the NegBin models. On the other hand, the response may be a limited dependent variable. In this case, one can assume that there exists a continuous unobservable variable called $y^*$. The value of $y^*$ is not observed for some part of its domain, or is not observed at all. The different cases are depicted in Figure 9.1:

Figure 9.1a presents the case of a binomial variable ($y$), which indicates the position of $y^*$ relative to a threshold $\mu$,
Figure 9.1b presents the case of an ordinal variable ($y$), which indicates the position of $y^*$ relative to two thresholds $\mu_1$ and $\mu_2$,
Figure 9.1c presents the case of a left-truncated variable at $\mu$; to the right of $\mu$, we have $y = y^*$; observations characterized by $y^* < \mu$ are simply not available,
Figure 9.1d presents the case of a left-censored variable at $\mu$; as for the truncated case, one observes, to the right of $\mu$, $y = y^*$. The sample contains observations for which $y^* < \mu$, but the corresponding values of $y^*$ are unobserved.

Figure 9.1: Limited dependent variable. Four panels of bell-shaped curves: binomial response (top left), ordinal model (top right), truncated response (bottom left), and censored response (bottom right).

Some of these models belong to a broad category called “generalized linear models”. More specifically, this concerns:

the binomial model and especially two particular cases, the logit and the probit models,
the Poisson model.

The NegBin model is also a generalized linear model if its supplementary parameter is fixed rather than estimated.

In a cross‐section context, both base R and several packages provide the relevant estimators, using the maximum likelihood method:

probit, logit and Poisson models can be fitted using the glm function,
the NegBin model can be estimated using the glm.nb function of the MASS package,
the ordinal model can be fitted using the polr function of this same package,
the censored model can be estimated using the tobit function of the AER package or the censReg function of the censReg package,
the truncated model can be fitted using the truncreg function of the truncreg package.

The pglm package provides similar estimators for panel data. It enables the estimation of binomial and Poisson models and, for convenience, also of NegBin and ordinal models, even if, strictly speaking, these last two are not proper generalized linear models.

The pldv function of the plm package provides panel estimators for the case where the response is either truncated or censored.

These models are often estimated by maximum likelihood, which requires strong hypotheses concerning the distribution of the response. When these hypotheses are not valid, the estimator is, except in very special cases, no longer consistent.

This is a very general drawback of maximum likelihood estimators, but there is also another drawback that is specific to panel data. In linear models, individual effects can be removed using an appropriate transformation (within or first differences) or can be directly estimated. This is not the case for most of the models presented in this chapter: the individual effects cannot be removed, and their estimation leads to the incidental parameter problem.

When $N \to +\infty$ for fixed $T$, in the linear model, the individual effects are not consistently estimated, as the number of parameters to be estimated grows with $N$ while the variance of their estimators remains constant. Nevertheless, the estimator of the vector of parameters of interest $\beta$ is consistent.

Unlike in the linear case, for most of the models reviewed in this chapter, when the individual effects are estimated, their inconsistency “contaminates” the estimation of $\beta$, which becomes inconsistent as well¹. This incidental parameter problem leads to abandoning the fixed effects models where the fixed effects are estimated in favor of three alternatives²:

the random effects model, which is always usable: one first writes the probabilities conditional on the individual effects and then computes the unconditional probabilities by integrating out the individual effects, making a hypothesis about their distribution,
a fixed effects model, which uses the notion of sufficient statistic: for example, in a logit model, the probability of being unemployed at period $t$ depends on the individual effect, and so does the probability of a given total number of periods of unemployment. By contrast, the ratio of these two probabilities, which is the probability of being unemployed in period $t$ given the total number of periods for which the individual is unemployed, does not contain the individual effect. This technique, which is not available for all the models reviewed, makes it possible, like the within transformation for linear models, to get rid of the individual effects,
for censored or truncated responses, the linear model can be consistently applied if some observations are removed from the sample beforehand (one then speaks of a trimmed estimator).

In the next sections, we will present the three categories of models previously cited: binomial and ordinal models, truncated and censored models, and count data models. For each of these three sections, we will first briefly describe the estimators used with cross‐sectional data. We will then present the estimators appropriate for panel data. We will finally reproduce different empirical examples of these models.

9.1 Binomial and Ordinal Models

9.1.1 Introduction

9.1.1.1 The Binomial Model

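We consider a model for which the response is binomial, and we denote, without loss of generality, its two possible values by 0 and 1. We then define a latent variable $y^*$ that is continuous on the real line and is unobserved. The latent variable is linked to the observable binomial variable $y$ by the following rule of observation:

$$y = \begin{cases} 0 & \text{if } y^* \le \mu \\ 1 & \text{if } y^* > \mu \end{cases}$$

The value of the latent variable is the sum of a linear combination of the covariates and an error term, $y^* = \beta^\top x + \epsilon$. Without loss of generality, if $x$ includes an intercept, we set $\mu = 0$.

The variance of $\epsilon$ is not identified; it can therefore be set to 1 or to any other arbitrary value. Probabilities for the two possible values of the response are then:

$$P(y = 0 \mid x) = P(\epsilon \le -\beta^\top x), \qquad P(y = 1 \mid x) = P(\epsilon > -\beta^\top x)$$

Denoting by $F$ the cumulative distribution function of $\epsilon$, we then have:

$$P(y = 0 \mid x) = F(-\beta^\top x), \qquad P(y = 1 \mid x) = 1 - F(-\beta^\top x) = F(\beta^\top x)$$

the last expression being valid if the density of $\epsilon$ is symmetric. Denoting $q = 2y - 1$, which equals $-1$ for $y = 0$ and $+1$ for $y = 1$, the probability of the outcome can be expressed in a compact form:

$$P(y \mid x) = F(q \, \beta^\top x) \tag{9.1}$$

Two distributions are often used: the normal distribution:

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}t^2} \, dt$$

which leads to the probit model, and the logistic distribution:

$$\Lambda(z) = \frac{e^{z}}{1 + e^{z}}$$

which leads to the logit model.

For a sample of size $N$, the log-likelihood function is obtained by summing the logs of (9.1) for all the observations:

$$\ln L = \sum_{n=1}^{N} \ln F(q_n \, \beta^\top x_n)$$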
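
As a minimal sketch of this log-likelihood, the base R code below simulates a logit model, maximizes the sum of the logarithms of $F(q\,\beta^\top x)$ with `optim`, and compares the result with `glm`. The simulated data and all object names (`X`, `beta_true`, `loglik`, `fit_ml`) are assumptions made for illustration only.

```r
# Simulated data (illustrative assumption, not one of the chapter's data sets)
set.seed(1)
N <- 1000
x <- rnorm(N)
X <- cbind(1, x)                              # covariates, including an intercept
beta_true <- c(-0.5, 1)
ystar <- drop(X %*% beta_true) + rlogis(N)    # latent variable with logistic errors
y <- as.numeric(ystar > 0)                    # rule of observation with threshold 0
q <- 2 * y - 1                                # -1 when y = 0, +1 when y = 1

# Log-likelihood: sum over observations of log F(q * x'beta), F being logistic
loglik <- function(beta) sum(plogis(q * drop(X %*% beta), log.p = TRUE))

# fnscale = -1 asks optim to maximize instead of minimize
fit_ml  <- optim(c(0, 0), loglik, method = "BFGS", control = list(fnscale = -1))
fit_glm <- glm(y ~ x, family = binomial(link = "logit"))

print(rbind(optim = fit_ml$par, glm = unname(coef(fit_glm))))
```

Both rows should agree up to the numerical tolerance of the optimizer; replacing `plogis` by `pnorm` and `rlogis` by `rnorm` gives the corresponding probit sketch.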

9.1.1.2 Ordered Models

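An ordered model is a model for which the response can take $J$ distinct values (with $J > 2$). The construction of the model is very similar to that of the binomial model. We consider a latent variable, equal as before to the sum of a linear combination of the covariates and an error:

$$y^* = \beta^\top x + \epsilon$$

Denoting by $\mu$ a vector of parameters, with $\mu_0 = -\infty$ and $\mu_J = +\infty$, the rule of observation for the different values of $y$ is then:

$$y = j \quad \text{if} \quad \mu_{j-1} < y^* \le \mu_j, \qquad j = 1, \ldots, J$$

Denoting by $F$ the cumulative distribution function of $\epsilon$, the probability for a given value of $y$ is:

$$P(y = j \mid x) = F(\mu_j - \beta^\top x) - F(\mu_{j-1} - \beta^\top x)$$

The probability of the outcome can be written:

$$P(y \mid x) = F(\mu_{y} - \beta^\top x) - F(\mu_{y-1} - \beta^\top x) \tag{9.2}$$

For a sample of size $N$, the log-likelihood function is obtained by summing the logarithms of (9.2) for all the observations:

$$\ln L = \sum_{n=1}^{N} \ln \left[ F(\mu_{y_n} - \beta^\top x_n) - F(\mu_{y_n - 1} - \beta^\top x_n) \right]$$

As for the binomial model, the most common choices for the distribution of $\epsilon$ are the normal and the logistic distributions, which lead respectively to the ordered probit and logit models.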
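
The outcome probabilities of an ordered model can be sketched in a few lines of base R. The function name `ordered_probs` and the numerical values of the thresholds and of the linear predictor are hypothetical; the only assumption about the model is that each probability is the difference of the distribution function evaluated at two consecutive thresholds, the outer ones being infinite.

```r
# Probabilities of the J outcomes for a linear predictor xb and J - 1 finite thresholds
ordered_probs <- function(xb, mu, cdf = pnorm) {
  cuts <- c(-Inf, mu, Inf)      # outer thresholds set to -Inf and +Inf
  diff(cdf(cuts - xb))          # F(mu_j - xb) - F(mu_{j-1} - xb), j = 1, ..., J
}

p_probit <- ordered_probs(xb = 0.4, mu = c(-1, 0, 1.5))                # ordered probit
p_logit  <- ordered_probs(xb = 0.4, mu = c(-1, 0, 1.5), cdf = plogis)  # ordered logit
print(rbind(probit = p_probit, logit = p_logit))
```

Each row contains four probabilities that sum to one; the log-likelihood contribution of an observation is the logarithm of the entry corresponding to its observed outcome.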

9.1.2 The Random Effects Model

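For panel data, we now have repeated observations of $y$ for the same individuals. The latent variable is then defined by:

$$y^*_{nt} = \beta^\top x_{nt} + \epsilon_{nt}$$

We assume as usual that the error can be written as the sum of an individual effect $\eta_n$ and an idiosyncratic term $\nu_{nt}$. Two observations for the same individual are then correlated because of the common term $\eta_n$. If the vector $x$ contains an intercept, we can suppose, without loss of generality, that $E(\eta) = 0$.

9.1.2.1 The Binomial Model

For a given value of $\eta_n$, the probability of the outcome for individual $n$ at period $t$ is defined as before:

$$P(y_{nt} \mid x_{nt}, \eta_n) = F\left(q_{nt} (\beta^\top x_{nt} + \eta_n)\right)$$

Denoting $y_n = (y_{n1}, \ldots, y_{nT})$ and $X_n = (x_{n1}, \ldots, x_{nT})$, the joint probability for all the periods for individual $n$ is:

$$P(y_n \mid X_n, \eta_n) = \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \eta_n)\right)$$

The unconditional probability is obtained by integrating out $\eta_n$ from this expression. Assuming that the distribution of $\eta$ is normal with a standard deviation of $\sigma_\eta$, we obtain:

$$P(y_n \mid X_n) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\,\sigma_\eta} e^{-\frac{1}{2}\left(\frac{\eta}{\sigma_\eta}\right)^2} \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \eta)\right) d\eta$$

With the change of variable:

$$v = \frac{\eta}{\sqrt{2}\,\sigma_\eta}$$

we obtain

$$P(y_n \mid X_n) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-v^2} \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \sqrt{2}\,\sigma_\eta v)\right) dv$$

There is no closed form for this integral, but it can be efficiently approximated numerically using Gauss-Hermite quadrature. This method consists in evaluating the function for different values of $v$ (denoted $v_r$) and computing a linear combination of these evaluations, with weights denoted by $w_r$. For a fixed number of evaluations $R$, the values of $v_r$ and $w_r$ are tabulated.

$$P(y_n \mid X_n) \approx \frac{1}{\sqrt{\pi}} \sum_{r=1}^{R} w_r \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \sqrt{2}\,\sigma_\eta v_r)\right) \tag{9.3}$$

and the log-likelihood function is obtained by summing over all the individuals the logarithm of (9.3).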
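
The quadrature step can be illustrated with a short base R sketch. It is not the code used by pglm: the nodes and weights are computed here with the Golub-Welsch eigenvalue method instead of being read from a table, and the function names (`gauss_hermite`, `re_logit_prob`) as well as the numerical values are hypothetical. The approximation for one individual is compared with a direct numerical integration over the normally distributed individual effect.

```r
# Gauss-Hermite nodes and weights for the weight function exp(-v^2) (Golub-Welsch)
gauss_hermite <- function(R) {
  k <- seq_len(R - 1)
  J <- matrix(0, R, R)
  J[cbind(k, k + 1)] <- sqrt(k / 2)     # symmetric tridiagonal Jacobi matrix
  J[cbind(k + 1, k)] <- sqrt(k / 2)
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

# Approximate probability of the sequence y for one individual (random effects logit)
re_logit_prob <- function(y, X, beta, sigma, R = 20) {
  gh <- gauss_hermite(R)
  q  <- 2 * y - 1
  xb <- drop(X %*% beta)
  # joint conditional probability evaluated at each node
  cond <- sapply(gh$nodes, function(v) prod(plogis(q * (xb + sqrt(2) * sigma * v))))
  sum(gh$weights * cond) / sqrt(pi)
}

# Illustrative values: three periods, an intercept and one covariate
y <- c(1, 0, 1)
X <- cbind(1, c(0.2, -1, 0.5))
beta <- c(0.3, 1)
sigma <- 0.8

p_gh <- re_logit_prob(y, X, beta, sigma, R = 20)
# Same probability by direct numerical integration over the individual effect
p_num <- integrate(function(eta) sapply(eta, function(e)
  prod(plogis((2 * y - 1) * (drop(X %*% beta) + e))) * dnorm(e, sd = sigma)),
  lower = -Inf, upper = Inf)$value
print(c(gauss_hermite = p_gh, integrate = p_num))
```

The two numbers should be very close; increasing `R` improves the quadrature at the cost of more evaluations, which is the trade-off controlled by the number of evaluation points in the estimation.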

Example 9‐1 random effects logit model – `Reelection` data set

Brender and Drazen (2008) studied the influence of fiscal policy on the reelection of politicians. It is often suggested that, just before elections, politicians implement more expansionary fiscal policies, i.e., they reduce taxes or increase public spending. A panel of 75 countries is used, with a number of observations per country varying from 1 to 16. A subsample of these data is also considered, for which the incumbent is a candidate in the next election (for the other observations, reelection means that the incumbent political party wins the election). This subsample can be selected using the dummy variable narrow. The response is reelect: it equals 1 in case of reelection and 0 otherwise. The two main covariates are ddefterm and ddefey. Both variables measure the change in the ratio of government balance (budget surplus) to GDP. The first one is the difference between the two years prior to the elections and the two previous years. The second one is the difference between the election year and the previous year. Control variables include the growth rate of GDP gdppc and dummies for developing countries dev, for new democracies nd and for majoritarian electoral systems maj. The Reelection data set is available in the pder package.

 data("Reelection", package = "pder")

We first estimate the logit and probit models with the glm function. This function uses the same arguments as lm, and a supplementary one called family, which indicates the distribution of the response, in our case the binomial distribution. The link between the parameter of the distribution and the linear predictor is indicated with the link argument. The family argument can be either a character string (here 'binomial'), the name of a function (here binomial) or a function call (here binomial()). The last possibility is the only one that makes it possible to use a link other than the default one. The logit model is obtained with link = 'logit' (the default), the probit model with link = 'probit'. The four following commands all compute the logit model:

 elect.l <- glm(reelect ~ ddefterm + ddefey + gdppc + dev + nd + maj,
          data = Reelection, family = "binomial", subset = narrow)
l2 <- update(elect.l, family = binomial)
l3 <- update(elect.l, family = binomial())
l4 <- update(elect.l, family = binomial(link = 'logit'))

while only the following command allows the estimation of the probit model:

 elect.p <- update(elect.l, family = binomial(link = 'probit'))

The syntax of pglm is similar to that of glm. As for plm, there are different ways of describing the structure of the sample:

by providing a pdata.frame to the data argument,
by providing a data.frame and using the index argument,
by only providing a data.frame if the first two columns of the data contain the individual and the time indexes (which is the case for the Reelection data set).

The logit and probit random effects models are estimated below:

 library("pglm")
elect.pl <- pglm(reelect ~ ddefterm + ddefey + gdppc + dev + nd + maj,
                Reelection, family = binomial(link = 'logit'),
                subset = narrow)
elect.pp <- update(elect.pl, family = binomial(link = 'probit'))

Estimation results are presented using the screenreg function of the texreg package:

 library("texreg")
screenreg(list(logit = elect.l, probit = elect.p,
               plogit = elect.pl, pprobit = elect.pp),
          digits = 3)

===================================================================
                logit        probit        plogit       pprobit
-------------------------------------------------------------------
(Intercept)       -1.328 **    -0.822 ***    -1.537 **    -0.942 **
                  (0.410)      (0.248)       (0.489)      (0.294)
ddefterm          14.413        8.381        14.086        8.223
                  (7.746)      (4.685)       (8.211)      (4.853)
ddefey            14.171 *      8.555 *      13.793 *      8.339
                  (6.660)      (4.039)       (6.998)      (4.257)
gdppc             17.017 *     10.652 *      19.380 *     12.076 **
                  (6.911)      (4.198)       (7.618)      (4.602)
dev                0.822 *      0.504 *       0.893 *      0.541 *
                  (0.358)      (0.218)       (0.430)      (0.258)
nd                 0.683        0.425         0.810        0.495
                  (0.380)      (0.232)       (0.439)      (0.264)
maj                0.768 *      0.472 *       0.847 *      0.515 *
                  (0.314)      (0.192)       (0.381)      (0.230)
sigma                                         0.841 *     -0.518 *
                                             (0.346)      (0.205)
-------------------------------------------------------------------
AIC              343.708      343.851
BIC              368.497      368.640
Log Likelihood  -164.854     -164.926      -163.435     -163.434
Deviance         329.708      329.851
Num. obs.        255          255           255          255
===================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

The probability of being reelected is larger in developing and newly democratic countries and for majoritarian electoral systems. The growth rate of GDP also has the predicted positive effect on the probability of being reelected. The coefficients of the two fiscal policy covariates, which measure changes in the budget surplus, are positive, which means that expansionary fiscal policies before elections do not have a systematic positive effect on the probability of the incumbent being reelected. On the contrary, the results indicate that voters tend to sanction such policies.

9.1.2.2 Ordered Models

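The line of reasoning is very similar to that of binomial models. Denote by $P_{nt}(\eta)$ the probability of the observed outcome $y_{nt}$ for a given value $\eta$ of the individual effect, $\mu_{y_{nt}-1}$ and $\mu_{y_{nt}}$ being the two thresholds that bound this outcome:

$$P_{nt}(\eta) = F(\mu_{y_{nt}} - \beta^\top x_{nt} - \eta) - F(\mu_{y_{nt}-1} - \beta^\top x_{nt} - \eta)$$

The joint probability for an individual for a given value of the individual effect is:

$$P(y_n \mid X_n, \eta_n) = \prod_{t=1}^{T} P_{nt}(\eta_n)$$

Assuming a normal distribution for the individual effects, with standard deviation $\sigma_\eta$, the unconditional probability is:

$$P(y_n \mid X_n) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\,\sigma_\eta} e^{-\frac{1}{2}\left(\frac{\eta}{\sigma_\eta}\right)^2} \prod_{t=1}^{T} P_{nt}(\eta) \, d\eta$$

Using the same change of variable as previously, $v = \eta / (\sqrt{2}\,\sigma_\eta)$, we obtain:

$$P(y_n \mid X_n) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-v^2} \prod_{t=1}^{T} P_{nt}(\sqrt{2}\,\sigma_\eta v) \, dv$$

which can be approximated using Gauss-Hermite quadrature, with $R$ evaluation points $v_r$ and weights $w_r$:

$$P(y_n \mid X_n) \approx \frac{1}{\sqrt{\pi}} \sum_{r=1}^{R} w_r \prod_{t=1}^{T} P_{nt}(\sqrt{2}\,\sigma_\eta v_r)$$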

Example 9‐2 random effects ordered model – `Fairness` data set

Raux et al. (2009) analyze the perceived fairness of different methods of demand rationing using a survey in which individuals had to indicate their opinion on an ordinal scale concerning different rationing modes for parking places and for fast train seats. The response is answer and takes integer values from 0 (very unfair) to 3 (very fair). The main covariate is a factor, rule, indicating the rationing mode: peak-load pricing (peak), administrative rule (admin), random allocation (lottery), additive supply (addsupply), queuing (queuing), moral rule (moral), and compensation rule (compensation). The other covariates are dummies indicating whether the rationing is recurring (recurring), whether the individual has a diploma (education) and whether the individual has a driving license (driving). The Fairness data set is available in the pglm package.

 data("Fairness", package = "pglm")

We first use the polr function from the MASS package to estimate the ordered probit and logit models. We restrict our attention to the rationing of parking places.

 library("MASS")
parking.ol <- polr(answer ~ recurring + driving + education + rule,
                   data = Fairness, subset = good == "parking",
                   Hess = TRUE, method = "logistic")
parking.op <- update(parking.ol, method = "probit")

The “link” is indicated with the method argument and we set the Hess argument to TRUE so that the Hessian, which is necessary to calculate the standard errors of the coefficients, is computed.

We then estimate the random effects ordered models using pglm. The following details should be noted:

the family argument is used, as for glm, with an additional ordinal function whose link argument, set to either 'probit' or 'logit', selects the ordered probit or the ordered logit model,
the number of evaluations for the Gauss-Hermite quadrature method is indicated with the argument R,
the index argument is mandatory here, as the second column of Fairness is not the time index.

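 parking.opp <- pglm(as.numeric(answer) ~ recurring + driving + education + rule,
                    data = Fairness, subset = good == 'parking',
                    family = ordinal(link = 'probit'), R = 10, index = 'id',
                    model = "random")
parking.olp <- update(parking.opp, family = ordinal(link = 'logit'))

Results of the four models are presented using the screenreg function. In the output as reproduced here, the pologit and poprobit columns are identical, which indicates that it was generated with the probit link for both panel models; the pologit column should therefore not be read as logit estimates: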

 library("texreg")
screenreg(list(ologit = parking.ol, oprobit = parking.op,
               pologit = parking.olp, poprobit = parking.opp),
          digits = 3)

============================================================================
                  ologit         oprobit        pologit        poprobit
----------------------------------------------------------------------------
recurringyes         -0.120         -0.070         -0.077         -0.077
                     (0.075)        (0.044)        (0.059)        (0.059)
drivingno             0.413 ***      0.237 ***      0.255 **       0.255 **
                     (0.101)        (0.060)        (0.080)        (0.080)
educationno          -0.480 ***     -0.280 ***     -0.309 **      -0.309 **
                     (0.138)        (0.079)        (0.105)        (0.105)
ruleadmin            -0.133         -0.061         -0.066         -0.066
                     (0.144)        (0.086)        (0.088)        (0.088)
rulelottery           0.330 *        0.217 *        0.238 **       0.238 **
                     (0.141)        (0.085)        (0.086)        (0.086)
ruleaddsupply         1.892 ***      1.141 ***      1.221 ***      1.221 ***
                     (0.143)        (0.083)        (0.085)        (0.085)
rulequeuing           2.973 ***      1.731 ***      1.848 ***      1.848 ***
                     (0.152)        (0.086)        (0.089)        (0.089)
rulemoral             4.597 ***      2.656 ***      2.837 ***      2.837 ***
                     (0.166)        (0.093)        (0.098)        (0.098)
rulecompensation      4.231 ***      2.458 ***      2.622 ***      2.622 ***
                     (0.162)        (0.091)        (0.096)        (0.096)
(Intercept)                                        -0.269 ***     -0.269 ***
                                                   (0.072)        (0.072)
mu_1                                                1.019 ***      1.019 ***
                                                   (0.038)        (0.038)
mu_2                                                2.515 ***      2.515 ***
                                                   (0.059)        (0.059)
sigma                                               0.529 ***      0.529 ***
                                                   (0.050)        (0.050)
----------------------------------------------------------------------------
AIC                5482.722       5490.689
BIC                5553.360       5561.326
Log Likelihood    -2729.361      -2733.344      -2705.814      -2705.814
Deviance           5458.722       5466.689
Num. obs.          2661           2661           2661           2661
============================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

9.1.3 The Conditional Logit Model

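The random effects model is consistent only if the individual effects are uncorrelated with the covariates. If this is not the case, the conditional logit model can be used. It is well known in the statistics literature and was introduced in panel data econometrics by Chamberlain (1980).

The general presentation of this model is quite complex, but the intuition can be grasped using the special case where $T = 2$. We denote by $y_{n1}$ and $y_{n2}$ the two observations of the response for individual $n$. Only the individuals for which $y_{n1} + y_{n2} = 1$ can be used to estimate the conditional logit model (more generally, only individuals for which $0 < \sum_t y_{nt} < T$ may be used).

For a given period $t$, the probabilities for the two values of $y_{nt}$ are:

$$P(y_{nt} = 1 \mid x_{nt}, \eta_n) = \frac{e^{\beta^\top x_{nt} + \eta_n}}{1 + e^{\beta^\top x_{nt} + \eta_n}}, \qquad P(y_{nt} = 0 \mid x_{nt}, \eta_n) = \frac{1}{1 + e^{\beta^\top x_{nt} + \eta_n}}$$

or more generally:

$$P(y_{nt} \mid x_{nt}, \eta_n) = \frac{e^{y_{nt}(\beta^\top x_{nt} + \eta_n)}}{1 + e^{\beta^\top x_{nt} + \eta_n}}$$

If the idiosyncratic components of the errors are i.i.d., the joint probability for two observations is simply the product of $P(y_{n1} \mid x_{n1}, \eta_n)$ and $P(y_{n2} \mid x_{n2}, \eta_n)$:

$$P(y_{n1}, y_{n2} \mid X_n, \eta_n) = \frac{e^{y_{n1}(\beta^\top x_{n1} + \eta_n)} \, e^{y_{n2}(\beta^\top x_{n2} + \eta_n)}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right)\left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)}$$

or also, as one and only one of the two responses equals 1:

$$P(y_{n1}, y_{n2} \mid X_n, \eta_n) = \frac{e^{\eta_n} \, e^{y_{n1}\beta^\top x_{n1} + y_{n2}\beta^\top x_{n2}}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right)\left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)} \tag{9.4}$$

The probability that $y_{n1} + y_{n2} = 1$ is equal to the sum of the probabilities of:

$y_{n1} = 1$ and $y_{n2} = 0$, which is $\dfrac{e^{\beta^\top x_{n1} + \eta_n}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right)\left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)}$,
$y_{n1} = 0$ and $y_{n2} = 1$, which is $\dfrac{e^{\beta^\top x_{n2} + \eta_n}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right)\left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)}$.

which is therefore:

$$P(y_{n1} + y_{n2} = 1 \mid X_n, \eta_n) = \frac{e^{\eta_n}\left(e^{\beta^\top x_{n1}} + e^{\beta^\top x_{n2}}\right)}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right)\left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)} \tag{9.5}$$

Dividing (9.4) by (9.5), one finally obtains the joint probability of $y_{n1}$ and $y_{n2}$ given their sum:

$$P(y_{n1}, y_{n2} \mid X_n, y_{n1} + y_{n2} = 1) = \frac{e^{y_{n1}\beta^\top x_{n1} + y_{n2}\beta^\top x_{n2}}}{e^{\beta^\top x_{n1}} + e^{\beta^\top x_{n2}}} \tag{9.6}$$

This conditional probability is free of the individual effect, and the likelihood that uses this expression can therefore be considered as a fixed effects logit model. Note that there is no similar estimator for the probit model.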
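
For two periods, the conditional probability of observing 1 in the second period, given that the two responses sum to 1, reduces to a logistic function of the difference of the linear predictors, so the conditional logit can be sketched as an ordinary logit without intercept on the differenced covariate, using only the individuals whose response changes. The simulation below is a hedged illustration in base R: the data-generating process, the sample size and all object names are assumptions, and the individual effects are deliberately correlated with the covariate.

```r
set.seed(42)
N <- 5000
eta <- rnorm(N)                        # individual effects
x1 <- 0.8 * eta + rnorm(N)             # covariate correlated with the effects
x2 <- 0.8 * eta + rnorm(N)
beta <- 1
y1 <- as.numeric(beta * x1 + eta + rlogis(N) > 0)
y2 <- as.numeric(beta * x2 + eta + rlogis(N) > 0)

# Pooled logit, which ignores the individual effects
pooled <- glm(y ~ x, family = binomial,
              data = data.frame(y = c(y1, y2), x = c(x1, x2)))

# Conditional logit for T = 2: keep individuals with y1 + y2 = 1 and fit a
# logit of y2 on the differenced covariate, without intercept
sw <- (y1 + y2) == 1
switchers <- data.frame(y = y2[sw], dx = (x2 - x1)[sw])
cond <- glm(y ~ 0 + dx, family = binomial, data = switchers)

print(c(pooled = unname(coef(pooled)["x"]), conditional = unname(coef(cond))))
```

In this setup the pooled coefficient should be pulled away from the value of 1 used in the simulation, while the conditional estimate should stay close to it, at the price of discarding the individuals whose response never changes.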

Example 9‐3 conditional logit model – `MagazinePrices` data set

Cecchetti (1986) analyzes price changes, with an application to magazines. His analysis is replicated (and criticized) by Willis (2006). Price changes are costly for two reasons:

changing prices induces administrative costs,
in a monopolistic competition context, increasing prices will lead to a loss of customers.

For these two reasons, there is a difference between the optimal price of a good for a given period and the actual price . A price change will occur only if the gap between the two becomes greater than a given threshold. More formally, the price will change if:

is then the minimum relative gap between the optimal and the actual price that would result in a price change. If the price changes, given the infrequency of price changes, the enterprise will set its new price above the optimal price, the relative difference being equal to .

Denote the last period when the price of good has changed. For this period, we have:

If the price doesn't change in period , we have . Replacing in the previous equation, we have:

(9.7)

In the context of a simple monopolistic competition model, the demand function for the firm and its cost function are:

where is the demand faced by the whole industry, the factor price index, and the average price in the industry.

Substituting the expression of demand in the cost function, writing the profit function, and setting to zero the first derivative of profit with respect to price, we obtain the following price function:

Writing the same price function for the period when the last price change occurred and subtracting both equations, we get:

Finally, denoting by the time since the last price change, assuming an identical variation of the average price of the industry and of the inputs and denoting by the demand variation for the whole industry since the last price change of enterprise :

Adding an error term to this expression and inserting it in equation (9.7), we obtain:

is a specific term for enterprise at period , which represents the price change policy. The probability of a price change can then be written:

where is the cumulative density of , assumed to be logistic.

Cecchetti (1986) assumes that this specific term can be considered constant for 3 consecutive years. In this case, as the period of observation is 27 years, there are 9 different effects for each magazine. We present below the results of 3 estimations that replicate Table 1 of Willis (2006).

We successively estimate a simple logit, a logit with magazine fixed effects in which the effects are estimated (and which therefore suffers from the incidental parameter problem), and a conditional logit model (using the clogit function of the survival package) in which three-year magazine fixed effects are removed. The MagazinePrices data set is available in the pder package.

 data("MagazinePrices", package = "pder")
logitS <- glm(change ~ length + cuminf + cumsales, data = MagazinePrices,
              subset = included == 1, family = binomial(link = 'logit'))
logitD <- glm(change ~ length + cuminf + cumsales + magazine,
              data = MagazinePrices,
              subset = included == 1, family = binomial(link = 'logit'))
library("survival")
logitC <- clogit(change ~ length + cuminf + cumsales + strata(id),
                 data = MagazinePrices,
                 subset = included == 1)
library("texreg")
screenreg(list(logit = logitS, "FE logit" = logitD,
               "cond. logit" = logitC), omit.coef = "magazine")

=====================================================
                logit        FE logit     cond. logit
-----------------------------------------------------
(Intercept)       -1.90 ***    -1.18 **
                  (0.14)       (0.42)
length            -0.10 **     -0.07 *       1.02 ***
                  (0.03)       (0.03)       (0.28)
cuminf             6.93 ***     8.83 ***    19.20 *
                  (1.12)       (1.25)       (7.51)
cumsales          -0.36        -1.14         7.60 *
                  (0.98)       (1.06)       (3.46)
-----------------------------------------------------
AIC             1008.90      1028.35       173.44
BIC             1028.63      1230.62
Log Likelihood  -500.45      -473.18
Deviance        1000.90       946.35
Num. obs.       1026         1026         1026
R^2                                          0.20
Max. R^2                                     0.32
Num. events                                213
Missings                                     0
=====================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

Note that the coefficient of the length of the period since the last price change has the expected positive sign and is significant only for the conditional logit model.

9.2 Censored or Truncated Dependent Variable

9.2.1 Introduction

It's often the case in economics that the response is only observed on a certain range of values; we then say that the dependent variable is truncated. For example:

if the response is a proportion, it is necessarily left-truncated at 0 and right-truncated at 1,
consumption of a good is necessarily non-negative and therefore left-truncated at 0,
the demand for a sports event is necessarily lower than or equal to the number of seats in the stadium and is therefore right-truncated at this capacity.

From now on, we will consider the most common case, which is a left truncation at 0, but the models we will present easily extend to the case of left and/or right truncations at any value.

As usual, we will assume that the dependent variable can be represented by a latent variable $y^*$ that equals the sum of a linear combination of different covariates and an error term:

$$y^* = \beta^\top x + \epsilon$$

The observed response $y$ equals $y^*$ if it is not in the truncated zone (i.e., here, if it's strictly positive) and equals the truncation point (here, 0) otherwise.

$$y = \begin{cases} y^* & \text{if } y^* > 0 \\ 0 & \text{if } y^* \le 0 \end{cases} \tag{9.8}$$

Two kinds of samples can be used to estimate this model:

a sample is truncated when only observations for which $y^* > 0$ are available (we therefore don't even know the values of the covariates for observations for which $y^*$ is in the truncation zone),
a sample is censored when it consists of observations for which $y^*$ is either inside or outside the truncation zone.

This latter case is particularly important in econometrics and leads to a model which is called the tobit model (Tobin, 1958). From now on, we'll refer to the truncated model when the first kind of sample is used and to the censored model for the second kind of sample.

We'll first analyze why applying a linear regression to a censored or a truncated sample leads to inconsistent estimators. We'll then present a non-parametric method that leads, by removing some specific observations, to a consistent estimator while making minimal hypotheses on the model errors. We'll conclude this section with the maximum likelihood estimator, which relies on the much stronger hypotheses of homoscedasticity and normality.

9.2.2 The Ordinary Least Squares Estimator

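Let $f$ be the density of the distribution of $\epsilon$, which is supposed, without loss of generality as long as the equation contains an intercept, to have a zero expected value. We then have:

$$E(y^* \mid x) = \beta^\top x$$

If $y^*$ were observed, OLS would be a consistent estimator for $\beta$. This is not the case when we only observe the truncated variable $y$. On the truncated sample, we have $y^* > 0$, or $\epsilon > -\beta^\top x$. The distribution of $y^*$ for the sample is then that of $y^* \mid y^* > 0$, depicted by the dotted line in Figure 9.2.

Figure 9.2: Distribution of $y^*$ (solid line), of $y^* \mid y^* > 0$ (dotted line) and of $y^* \mid y^* > 0, \; y^* < 2\beta^\top x$ (dashed line). The shaded areas under the solid curve are $P(y^* < 0)$ (left) and $P(y^* > 2\beta^\top x)$ (right); the horizontal axis is also expressed in terms of $\epsilon$.

The distribution of $\epsilon$ is not symmetric around 0, and its expected value is positive, because the left side of the distribution, corresponding to values of $\epsilon < -\beta^\top x$, is truncated. We therefore have:

$$E(y \mid x, y > 0) = \beta^\top x + E(\epsilon \mid \epsilon > -\beta^\top x)$$

which is, for a normal distribution with standard deviation $\sigma$:

$$E(y \mid x, y > 0) = \beta^\top x + \sigma \frac{\phi(\beta^\top x / \sigma)}{\Phi(\beta^\top x / \sigma)} = \beta^\top x + \sigma \lambda\left(\frac{\beta^\top x}{\sigma}\right)$$

or, subtracting $\beta^\top x$:

$$E(\epsilon \mid x, y > 0) = \sigma \lambda\left(\frac{\beta^\top x}{\sigma}\right)$$

$\lambda(z) = \phi(z) / \Phi(z)$ is known as the inverse Mills ratio and is a decreasing function of its argument. Computing the derivative with respect to one covariate $x_k$, we obtain:

$$\frac{\partial E(\epsilon \mid x, y > 0)}{\partial x_k} = -\beta_k \, \lambda\left(\frac{\beta^\top x}{\sigma}\right) \left[ \frac{\beta^\top x}{\sigma} + \lambda\left(\frac{\beta^\top x}{\sigma}\right) \right]$$

which is negative if $\beta_k > 0$, as $\lambda(\beta^\top x / \sigma)$ is the average of $\epsilon / \sigma$ for $\epsilon / \sigma > -\beta^\top x / \sigma$ and is therefore greater than $-\beta^\top x / \sigma$. The OLS estimator computed on the truncated sample is therefore biased toward zero.

For the censored sample, we have $y = 0$ for censored observations. We then have:

$$E(y \mid x) = \left[1 - F(-\beta^\top x)\right] \left[\beta^\top x + E(\epsilon \mid \epsilon > -\beta^\top x)\right] = \Phi\left(\frac{\beta^\top x}{\sigma}\right) \beta^\top x + \sigma \phi\left(\frac{\beta^\top x}{\sigma}\right)$$

where the last expression holds for a normal distribution. Subtracting $\beta^\top x$, we obtain the expected value of the error of the censored model:

$$E(y \mid x) - \beta^\top x = -\left[1 - \Phi\left(\frac{\beta^\top x}{\sigma}\right)\right] \beta^\top x + \sigma \phi\left(\frac{\beta^\top x}{\sigma}\right)$$

Computing once again the derivative with respect to a covariate $x_k$, and using $\phi'(z) = -z\phi(z)$, we have:

$$\frac{\partial \left[E(y \mid x) - \beta^\top x\right]}{\partial x_k} = -\beta_k \left[1 - \Phi\left(\frac{\beta^\top x}{\sigma}\right)\right]$$

which, as previously, has the opposite sign of $\beta_k$, implying that the OLS estimator on the censored sample is also biased toward zero.

The bias of the OLS estimator on censored and truncated samples is illustrated in Figure 9.3.

Figure 9.3: OLS bias for the censored and the truncated samples. Three panels (whole, censored, and truncated samples), each with two ascending lines; the two lines coincide for the whole sample and intersect for the censored and the truncated samples.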
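
The attenuation of the OLS slope can be checked with a small simulation in base R. This is only a sketch: the data-generating process (a slope of 1, normal errors with standard deviation 2, a uniformly distributed covariate) and the object names are assumptions chosen for illustration.

```r
set.seed(123)
N <- 5000
x <- runif(N, -2, 4)
ystar <- 1 + x + rnorm(N, sd = 2)     # latent variable, true slope equal to 1
y <- pmax(ystar, 0)                   # response censored at 0

whole     <- coef(lm(ystar ~ x))                 # infeasible: uses the latent variable
censored  <- coef(lm(y ~ x))                     # zeros kept in the sample
truncated <- coef(lm(y ~ x, subset = y > 0))     # zeros dropped from the sample

print(rbind(whole, censored, truncated))
```

Under these assumptions, the slope estimated on the whole sample should be close to 1, while the slopes obtained on the censored and on the truncated samples should both be noticeably smaller.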

9.2.3 The Symmetrical Trimmed Estimator

The OLS estimator is inconsistent because the truncation leads to an asymmetric distribution for the errors, for which the expected values depends on . Powell (1986) proposes to restore the symmetry by removing some observations.

9.2.3.1 Truncated Sample

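In the case of the truncated sample, observations for which $y < 0$, or $\epsilon < -\beta^\top x$, are missing. The symmetry may be restored by removing from the right side of the distribution the observations for which $\epsilon > \beta^\top x$, or $y > 2\beta^\top x$. The resulting distributions of $\epsilon$ and $y$ are depicted by the dashed line in Figure 9.2. In this case, we have:

$$\mathrm{E}(y \mid x, 0 < y < 2\beta^\top x) = \beta^\top x$$

A consistent estimator may be obtained using the normal equations and restricting the sample to observations for which $y_n < 2\beta^\top x_n$. Denoting by $\mathbf{1}(z)$ the function that is equal to 1 if $z$ is true and 0 otherwise, we have:

$$\sum_{n=1}^N \mathbf{1}(y_n < 2\beta^\top x_n)\,(y_n - \beta^\top x_n)\, x_n = 0 \tag{9.9}$$

These first-order conditions may be obtained by minimizing the function:

$$\sum_{n=1}^N \left(y_n - \max\left(\tfrac{1}{2} y_n, \beta^\top x_n\right)\right)^2 \tag{9.10}$$

In this case, all the observations for which $\beta^\top x_n \leq 0$ and those for which $y_n > 2\beta^\top x_n$ contribute $\tfrac{1}{4} y_n^2$ to the objective function and nothing to the first-order conditions. This contribution to the objective function ensures that spurious solutions of the first-order conditions like $\beta = 0$ are excluded.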

9.2.3.2 Censored Sample

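In the case of the censored sample, symmetry is restored by replacing $y$ by $2\beta^\top x$ when $y > 2\beta^\top x$ (as $y^*$ is replaced by 0 when $y^* < 0$). We then have:

$$\sum_{n=1}^N \mathbf{1}(\beta^\top x_n > 0)\left(\min(y_n, 2\beta^\top x_n) - \beta^\top x_n\right) x_n = 0 \tag{9.11}$$

These first-order conditions may be obtained by minimizing the following function:

$$\sum_{n=1}^N \left(y_n - \max\left(\tfrac{1}{2} y_n, \beta^\top x_n\right)\right)^2 + \sum_{n=1}^N \mathbf{1}(y_n > 2\beta^\top x_n)\left[\left(\tfrac{1}{2} y_n\right)^2 - \max(0, \beta^\top x_n)^2\right] \tag{9.12}$$

Observations for which $\beta^\top x_n < 0$ now contribute $\tfrac{1}{2} y_n^2$ to the objective function and nothing to the first-order conditions.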
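A minimal R sketch of the symmetrically censored least squares idea on simulated data follows. The objective coded below is the form usually attributed to Powell (1986) and is an assumption here, as are the simulated design (intercept and slope equal to 1, standard normal errors), the choice of `optim` with the Nelder-Mead method started at the OLS estimate, and all object names.

```r
set.seed(2)
N <- 5000
x <- rnorm(N)
y <- pmax(1 + x + rnorm(N), 0)          # response left-censored at 0
X <- cbind(1, x)

# Assumed objective:
# sum (y - max(y/2, xb))^2 + sum 1(y > 2 xb) * ((y/2)^2 - max(0, xb)^2)
scls_obj <- function(b) {
  xb <- drop(X %*% b)
  sum((y - pmax(y / 2, xb))^2) +
    sum((y > 2 * xb) * ((y / 2)^2 - pmax(0, xb)^2))
}

b_ols  <- unname(coef(lm(y ~ x)))       # biased starting values
b_scls <- optim(b_ols, scls_obj, method = "Nelder-Mead")$par
rbind(ols = b_ols, scls = b_scls)
```

The objective is piecewise quadratic and not smooth everywhere, which is why a derivative-free optimizer is used in this sketch; other choices are possible.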

9.2.4 The Maximum Likelihood Estimator

If we can assume that the errors are normal and homoscedastic, a more efficient estimator is the maximum likelihood estimator.

9.2.4.1 Truncated Sample

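The maximum likelihood estimator for a truncated sample has been proposed by Hausman and Wise (1976). The density of the distribution of $y^*$ is normal, with expected value equal to $\beta^\top x$ and standard deviation $\sigma$. We then have:

$$f(y^* \mid x) = \frac{1}{\sigma}\, \phi\!\left(\frac{y^* - \beta^\top x}{\sigma}\right)$$

The probability of $y^*$ being negative is: $\mathrm{P}(y^* < 0 \mid x) = \Phi(-\beta^\top x / \sigma) = 1 - \Phi(\beta^\top x / \sigma)$.

The density of the distribution of $y$, denoted $f(y \mid x, y > 0)$, is the zero left-truncated distribution of $y^*$. We then have:

$$f(y \mid x, y > 0) = \frac{\frac{1}{\sigma}\, \phi\!\left(\frac{y - \beta^\top x}{\sigma}\right)}{\Phi\!\left(\frac{\beta^\top x}{\sigma}\right)} \tag{9.13}$$

The log-likelihood function is obtained by summing the logarithms of the density (9.13) for the observations in the sample:

$$\ln L = \sum_{n=1}^N \left[\ln \phi\!\left(\frac{y_n - \beta^\top x_n}{\sigma}\right) - \ln \sigma - \ln \Phi\!\left(\frac{\beta^\top x_n}{\sigma}\right)\right] \tag{9.14}$$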
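As an illustration, the truncated-normal log-likelihood can be coded and maximized directly. This is a sketch on simulated data, assuming a normal latent model with intercept and slope equal to 1 and $\sigma = 1$; the parametrization in $\ln \sigma$, the averaging of the log-likelihood, and the object names are choices made here, not the text's.

```r
set.seed(3)
N <- 5000
x <- rnorm(N)
ystar <- 1 + x + rnorm(N)
keep <- ystar > 0                       # y* <= 0 never enters the sample
y <- ystar[keep]
X <- cbind(1, x[keep])

# theta = (beta_0, beta_1, log sigma); log sigma keeps sigma positive
negll_trunc <- function(theta) {
  xb <- drop(X %*% theta[1:2])
  s <- exp(theta[3])
  # normal density divided by P(y* > 0), averaged to keep gradients small
  -mean(dnorm(y, mean = xb, sd = s, log = TRUE) -
          pnorm(xb / s, log.p = TRUE))
}

start <- c(unname(coef(lm(y ~ X[, 2]))), 0)   # OLS start, sigma = 1
fit_trunc <- optim(start, negll_trunc, method = "BFGS")
est_trunc <- c(fit_trunc$par[1:2], exp(fit_trunc$par[3]))
round(est_trunc, 2)                     # intercept, slope, sigma
```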

9.2.4.2 Censored Sample

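When the sample is censored, the distribution of $y$ is a mix of a discrete and a continuous distribution. An observation for which $y_n = 0$ enters the log-likelihood function as:

$$\ln \mathrm{P}(y_n = 0 \mid x_n) = \ln\left[1 - \Phi\!\left(\frac{\beta^\top x_n}{\sigma}\right)\right]$$

while for a positive observation, the contribution to the likelihood is the truncated normal density:

$$\frac{\frac{1}{\sigma}\, \phi\!\left(\frac{y_n - \beta^\top x_n}{\sigma}\right)}{\Phi\!\left(\frac{\beta^\top x_n}{\sigma}\right)}$$

times the probability that $y^*_n$ is positive: $\Phi(\beta^\top x_n / \sigma)$. We finally get the log-likelihood function (9.15):

$$\ln L = \sum_{n:\, y_n = 0} \ln\left[1 - \Phi\!\left(\frac{\beta^\top x_n}{\sigma}\right)\right] + \sum_{n:\, y_n > 0} \left[\ln \phi\!\left(\frac{y_n - \beta^\top x_n}{\sigma}\right) - \ln \sigma\right] \tag{9.15}$$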
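The censored (tobit) log-likelihood can be sketched in the same way. The simulated design (true intercept, slope and $\sigma$ all equal to 1) and the object names are assumptions made for this illustration.

```r
set.seed(4)
N <- 5000
x <- rnorm(N)
y <- pmax(1 + x + rnorm(N), 0)          # left-censored at 0
X <- cbind(1, x)

# theta = (beta_0, beta_1, log sigma)
negll_cens <- function(theta) {
  xb <- drop(X %*% theta[1:2])
  s <- exp(theta[3])
  ll <- ifelse(y > 0,
               dnorm(y, mean = xb, sd = s, log = TRUE),  # density part
               pnorm(-xb / s, log.p = TRUE))             # P(y* <= 0)
  -mean(ll)
}

start <- c(unname(coef(lm(y ~ x))), 0)
fit_cens <- optim(start, negll_cens, method = "BFGS")
est_cens <- c(fit_cens$par[1:2], exp(fit_cens$par[3]))
round(est_cens, 2)                      # intercept, slope, sigma
```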

9.2.5 Fixed Effects Model

Honoré (1992) proposed a symmetrical trimmed estimator that is an extension of Powell (1986)'s estimator to panel data. For now, we consider a panel with only two observations for every individual and one covariate.

The only hypothesis made concerning the errors and is that they are identically distributed. The symmetry hypothesis, which was required for the Powell (1986) estimator to be consistent, is not necessary here.

9.2.5.1 Truncated Sample

For the truncated model, only observations for which are available. Figure 9.4 ³ presents the distribution of and .

Distribution of y*n1 (left) and of y*n2 (right) for βTΔxn > 0 (a) and βTΔxn < 0 (b), each depicted by a two-peak curve centering at βTxn1 + ηn (for y*n1) and βTxn2 + ηn (for y*n2). — Figure 9.4Distribution of and of .

With the hypotheses we've made, these two distributions only differ by their position, being centered on and on . Because of the truncation, the two distributions conditioned to the fact that the observation is in the sample (), to the values of the covariates () and to that of the individual effect () are more substantially different. If (Figure 9.4a), the truncated part of the distribution of is larger than the one of . However, identical distributions can be obtained by truncating not at 0 (which is the selection rule of the sample) but at . In the case where (Figure 9.4b), is similarly truncated at .

We then obtain two identical conditional distributions for:

and in the case when ,
and in the case when ,

More generally, the observations that should be removed to restore symmetry are those for which or . This situation is depicted in Figure 9.5. When (9.5a), the joint distribution of is symmetric around the line which is the line with intercept . Truncating at and , we obtain two symmetric zones and . The probability of having in zones or is the same. This result leads to a first‐moment condition:

(9.16)

Moreover, by symmetry, in Figure 9.5a:

the vertical distance between in zone on the line is ,
the horizontal distance between in and the line is ,

which can be written as a second‐moment condition:

(9.17)

For a sample of size , truncated as previously described, the sample analogues of the two moment conditions (9.16) and (9.17) are:

(9.18)

(9.19)

Equations (9.18) and (9.19) are respectively the first-order conditions of the LAD and of the least squares estimators. These first-order conditions may be obtained by minimizing:

with:

If , we obtain the trimmed least squares estimator; if , we obtain the trimmed least absolute deviations estimator. Only the observations for which are included in the first‐order conditions, the presence of in the objective function excluding trivial solutions.

Figure 9.5 Symmetry of the distribution of .


9.2.5.2 Censored Sample

For the censored sample, observations for which are available, the observation rule for being:

From Figure 9.5, we can see that not only and are symmetrical but also defined by and defined by .

Therefore, to restore symmetry for the censored sample, we have to get rid of the zone for which and (the dotted zone on Figure 9.5).

The symmetry between and leads to the following moment condition:

(9.20)

Moreover, for:

in , the vertical distance to the limit of the zone is ,
in , the horizontal distance to the limit of the zone is

which translates into the following moment condition:

(9.21)

Using (9.16 and 9.20), we obtain:

(9.22)

and using (9.17 and 9.21), we obtain:

(9.23)

The sample analogues to (9.22) are the first‐order conditions of the following function:

(9.24)

which is the trimmed LAD estimator on the censored sample.

Finally, the sample equivalents of (9.23) are the first-order conditions of the following function:

(9.25)

which is the trimmed least squares estimator for the censored sample. The trimmed LAD and least squares estimators have been extended to the case where the dependent variable is two‐sided censored or truncated by Alan et al. (2013).

Example 9‐4 trimmed tobit model – `LateBudgets` data set

Andersen et al. (2012) study the late adoption of budgets. They use a panel of American states for the 1988‐2007 period, for which the date of budget adoption has been collected so that late budget situations can be detected and, in this case, the number of days from the legal limit date can be computed. Among the factors that may explain late budgets, the authors use:

a shock to the fiscal climate, which is proxied by the annual change of unemployment rate unempdiff,
divided control over the state government: splitbranch is a dummy indicating that both chambers are controlled by a different party than the governor's and splitleg is a dummy indicating that the two chambers are controlled by different parties.
variables linked to the cost of a late budget: elecyear is a dummy for election years, deadline is a factor with levels ("none", "soft", "hard") that indicates if there is a legal date for the end of legislative works,
shutdown indicates whether the state law dictates a shutdown of state government activities in the event of a late budget, supmaj that budget adoption requires a super‐majority,
different covariates indicating political and legislative context: the fact that the governor is newly elected newgov, the number of years since the incumbent governor took office govexp, a dummy for a democrat governor demgov, a dummy indicating that the governor is subject to a binding term limit lameduck, a 1‐to‐5 scale for full‐ vs. part‐time legislatures, where 1 corresponds to a part‐time “citizen” legislature, and 5 corresponds to a full‐time professional legislature fulltimeleg, a dummy that indicates that the state law does not allow a budget deficit to be carried over to the next fiscal year nocarry,
several social and demographic covariates: population pop, the percentage of African Americans black, of college graduates graduate, of people older than 65 years elderly, of children between 5 and 17 years old kids, and the response rate in the 1990 US census censusresp, which is used as a proxy for social capital.

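In order to investigate whether a change in the unemployment rate has an asymmetric effect on budget adoption, two non-negative variables are created: unemprise, which equals the change in the unemployment rate when it rises and 0 otherwise, and unempfall, which equals the size of the decrease when it falls and 0 otherwise.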

library("plm")  # provides pldv(), used below
data("LateBudgets", package = "pder")
# censor the response at 0: budgets adopted on time have 0 days of delay
LateBudgets$dayslatepos <- pmax(LateBudgets$dayslate, 0)
# divided government: either kind of split control
LateBudgets$divgov <- with(LateBudgets,
                           factor(splitbranch == "yes" |
                                  splitleg == "yes",
                                  labels = c("no", "yes")))
# positive and negative parts of the change in unemployment
LateBudgets$unemprise <- pmax(LateBudgets$unempdiff, 0)
LateBudgets$unempfall <- - pmin(LateBudgets$unempdiff, 0)
form <- dayslatepos ~ unemprise + unempfall + divgov + elecyear +
    pop + fulltimeleg + shutdown + censusresp + endbalance + kids +
    elderly + demgov + lameduck + newgov + govexp + nocarry +
    supmaj + black + graduate

The model is estimated using the pldv function, which has a model argument with a default value of 'fd' (for first‐difference), which in this context is the fixed effects model of Honoré (1992). Two supplementary arguments can also be specified:

objfun indicates whether one wants to minimize the sum of the least squares of the residuals ('lsq', the default value), or the sum of the absolute values of the residuals ('lad'),
sample indicates if the sample is censored ('censored', the default value) or truncated ('truncated').

 FEtobit <- pldv(form, LateBudgets)
summary(FEtobit)
Oneway (individual) effect First-Difference Model

Call:
pldv(formula = form, data = LateBudgets)

Unbalanced Panel: n = 48, T = 2-20, N = 730
Observations used in estimation: 682

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -107.8   -15.1     5.2     7.6    26.4   168.5

Coefficients:
            Estimate Std. Error t-value Pr(>|t|)
unemprise      9.042     10.944    0.83    0.409
unempfall    -31.641      6.887   -4.59  5.2e-06 ***
divgovyes     19.793      8.767    2.26    0.024 *
elecyear     -24.505     10.190   -2.40    0.016 *
pop           -0.683      2.512   -0.27    0.786
endbalance    -3.856     62.829   -0.06    0.951
kids           0.774      4.547    0.17    0.865
elderly       60.880      2.669   22.81  < 2e-16 ***
demgovyes     -6.371      6.770   -0.94    0.347
lameduckyes  -22.032      4.043   -5.45  7.1e-08 ***
newgovyes      5.606     10.532    0.53    0.595
govexp         3.395     38.894    0.09    0.930
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    328000
Residual Sum of Squares: 996000
R-Squared:      0.0255
Adj. R-Squared: 0.00953
F-statistic: -40.8339 on 11 and 670 DF, p-value: 1

As can be seen from the results, the economic situation influences the timing of budget adoption. The effect is asymmetric: the coefficient of unempfall is negative, large, and highly significant, whereas the coefficient of unemprise is positive but smaller in absolute value and not statistically significant. Divided control over the government (measured by divgov) has a significantly positive effect on late budget adoptions.

9.2.6 The Random Effects Model

The trimmed estimator has two useful features: it is robust to non-normality and heteroscedasticity, and it is robust to correlation between the individual effects and the covariates, the individual effects being wiped out by the first-difference transformation. However, if the errors are normal and homoscedastic and if the individual effects are also normal and uncorrelated with the covariates, the maximum likelihood estimator is consistent and more efficient.

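For panel data with individual effects, the latent variable is written:

$$y^*_{nt} = \beta^\top x_{nt} + \eta_n + \nu_{nt}$$

where $\eta_n$ is the individual effect and $\nu_{nt}$ the idiosyncratic error.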

9.2.6.1 Truncated Sample

The density of is:

The joint density of is, assuming the independence of the errors:

(9.26)

Assuming that the distribution of individual effects is normal with a standard deviation equal to , the unconditional joint density is obtained by integrating out (9.26) for the individual effects:

(9.27)

Using the change of variable , we obtain:

(9.28)

which can be approximated by the Gauss‐Hermite quadrature method:

(9.29)
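To make the quadrature step concrete, the sketch below approximates an expectation over a normal individual effect, $\mathrm{E}[g(\eta)]$ with $\eta \sim N(0, \sigma_\eta^2)$, by a weighted sum over a few nodes, using the change of variable $\eta = \sqrt{2}\, \sigma_\eta t$. The nodes and weights are computed with the Golub-Welsch eigenvalue method, which is a choice made for this illustration and not necessarily what the estimation routines do internally; the function names `gauss_hermite` and `gh_expect` are hypothetical.

```r
# Nodes and weights for integrals of the form int exp(-t^2) f(t) dt
# (Golub-Welsch method; n_nodes must be at least 2).
gauss_hermite <- function(n_nodes) {
  off <- sqrt(seq_len(n_nodes - 1) / 2)
  J <- matrix(0, n_nodes, n_nodes)
  J[cbind(1:(n_nodes - 1), 2:n_nodes)] <- off   # super-diagonal
  J[cbind(2:n_nodes, 1:(n_nodes - 1))] <- off   # sub-diagonal
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

# E[g(eta)] for eta ~ N(0, sigma^2), using eta = sqrt(2) * sigma * t
gh_expect <- function(g, sigma, n_nodes = 20) {
  gh <- gauss_hermite(n_nodes)
  sum(gh$weights * g(sqrt(2) * sigma * gh$nodes)) / sqrt(pi)
}

gh_expect(function(eta) eta^2, sigma = 2)   # compare with sigma^2 = 4
gh_expect(exp, sigma = 1)                   # compare with exp(1 / 2)
```

In the random effects likelihood, $g$ would be the product over periods of the conditional densities (or probabilities) of one individual, evaluated at a given value of the individual effect.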

The log‐likelihood function for the truncated model is then simply obtained by summing the logarithms of (9.29) for all individuals:

(9.30)

9.2.6.2 Censored Sample

In this case, the conditional distribution of is either given by a probability or by a density:

Using a similar reasoning as for the truncated model, individual contributes to the likelihood with a product of probabilities and/or densities:

(9.31)

The log‐likelihood function for the censored sample is obtained by summing over all the individuals the logarithm of (9.31):

(9.32)

Example 9‐5 random effects censored model – `Donor` data set

Landry et al. (2012) study the dynamics of donor behavior toward public utility organizations, and more specifically toward the “Center for Natural Hazards Research at East Carolina University” (ECU). A first door-to-door campaign was conducted in 2004. During this campaign, two kinds of treatment were used: a standard “simply ask for money” treatment, called VCM, and a lottery treatment in which potential donors can receive a gift. The second campaign took place in 2006. Some of the donors of the first campaign were solicited, and three treatments were used, described in the factor variable treatment with three levels: "vcm" for a “simply ask for money” treatment, "sgift" if a small gift (a bookmark) was given to the potential donors, and "lgift" if a large gift (a book) was given. The main objective of the article is to study whether people who initially give to charities are more willing to give again than others. The response is the amount of the gift; it is therefore left-censored at 0. In the article, the authors present results of linear regressions with solicitors' fixed effects. In the online appendix, the same equations are estimated using a random effects tobit model. Two equations are estimated, both employing treatment and a dummy for previous donors prcontr as explanatory variables, the second adding an interaction term between the two.

data("Donor", package = "pder")
library("plm")
library("texreg")
# linear models with solicitor fixed effects
T3.1 <- plm(donation ~ treatment + prcontr, Donor, index = "id")
T3.2 <- plm(donation ~ treatment * prcontr - prcontr, Donor, index = "id")
# random effects tobit models
T5.A <- pldv(donation ~ treatment + prcontr, Donor, index = "id",
             model = "random", method = "bfgs")
T5.B <- pldv(donation ~ treatment * prcontr - prcontr, Donor, index = "id",
             model = "random", method = "bfgs")
screenreg(list(OLS = T3.1, Tobit = T5.A, OLS = T3.2, Tobit = T5.B),
          reorder.coef = c(1:3, 7:9, 4:6))

=============================================================================
                           OLS         Tobit         OLS         Tobit
-----------------------------------------------------------------------------
treatmentsgift               -0.41         2.36         0.06         3.53
                             (0.61)       (1.86)       (0.66)       (2.04)
treatmentlgift                1.79 **      6.36 ***     2.07 **      7.66 ***
                             (0.64)       (1.93)       (0.68)       (2.08)
prcontryes                    1.29 *       5.74 **
                             (0.59)       (1.79)
treatmentvcm:prcontryes                                 3.14 **     10.78 **
                                                       (1.13)       (3.41)
treatmentsgift:prcontryes                               0.20         4.47
                                                       (0.95)       (2.88)
treatmentlgift:prcontryes                               1.05         3.23
                                                       (1.00)       (3.01)
(Intercept)                              -15.16 ***                -16.05 ***
                                          (1.89)                    (1.97)
sd.nu                                     16.40 ***                 16.36 ***
                                          (0.80)                    (0.80)
sd.eta                                     4.05 ***                  3.92 ***
                                          (1.11)                    (1.10)
-----------------------------------------------------------------------------
R^2                           0.02                      0.02
Adj. R^2                     -0.02                     -0.01
Num. obs.                  1039         1039         1039         1039
Log Likelihood                         -1498.38                  -1496.84
=============================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

The average gift (including censored observations) is 2.5 dollars. The first column indicates that previous donors give on average 1.3 dollars more. A large gift increases the donation by 1.8 dollars, while a small gift has no significant effect on the donation. The third column distinguishes the treatment effect for previous donors and the others. For the vcm treatment, the gift of previous donors is much larger (by about 3 dollars). On the contrary, there is no significant difference between previous donors and other people when a gift is proposed by solicitors. The random effects tobit models are presented in columns 2 and 4. The results are very similar but more difficult to interpret, as the expected value of the response for the tobit model is:

$$\mathrm{E}(y \mid x) = \Phi\!\left(\frac{\beta^\top x}{\sigma}\right) \beta^\top x + \sigma\, \phi\!\left(\frac{\beta^\top x}{\sigma}\right)$$

For example, for someone who didn't give previously and who received the vcm treatment, $\beta^\top x = -15.16$ (column 2). With $\sigma = 16.40$, we obtain an expected donation of 1.57. For someone who made a donation previously and who also received the vcm treatment, we have $\beta^\top x = -15.16 + 5.74 = -9.42$, and the expected donation is 2.88. The effect for previous donors is therefore equal to $2.88 - 1.57 = 1.31$, which is very close to the linear regression coefficient.
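The two expected donations can be recomputed from the table with the usual mean of a normal variable censored at 0. The sketch below assumes that the relevant coefficients are those of column 2 (intercept -15.16, prcontryes 5.74) and that sd.nu = 16.40 alone plays the role of the standard deviation; this reading is an assumption, retained because it reproduces the figures 1.57 and 2.88 quoted in the text. The function name `tobit_mean` is hypothetical.

```r
# Mean of y = max(y*, 0) when y* ~ N(xb, sigma^2):
# pnorm(xb / sigma) * xb + sigma * dnorm(xb / sigma)
tobit_mean <- function(xb, sigma) {
  z <- xb / sigma
  pnorm(z) * xb + sigma * dnorm(z)
}

sigma  <- 16.40                              # sd.nu, column 2
e_new  <- tobit_mean(-15.16, sigma)          # vcm treatment, no previous gift
e_prev <- tobit_mean(-15.16 + 5.74, sigma)   # vcm treatment, previous donor
round(c(new = e_new, previous = e_prev, difference = e_prev - e_new), 2)
```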

9.3 Count Data

We now consider the case where the response is a count. We will first briefly review the estimation of count data models in a cross‐sectional context, and then we will describe specific estimators for panel data.

9.3.1 Introduction

The two most widely used models when the response is a count are the Poisson and the NegBin models.

9.3.1.1 The Poisson Model

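We first suppose that the response follows a Poisson distribution of parameter $\lambda$ (which is the mean and the variance of the variable). Under this distributional assumption, the probability of observing a value $y$ is:

$$\mathrm{P}(y) = \frac{e^{-\lambda} \lambda^{y}}{y!}$$

Using the logarithmic link, the Poisson parameter is the exponential of the linear predictor:

$$\lambda_n = e^{\beta^\top x_n}$$

which leads to the following probability for observation $n$:

$$\mathrm{P}(y_n \mid x_n) = \frac{e^{-e^{\beta^\top x_n}}\, e^{y_n \beta^\top x_n}}{y_n!}$$

Taking the logarithm of this probability and summing over all individuals, we obtain the following log-likelihood function:

$$\ln L = \sum_{n=1}^N \left[-e^{\beta^\top x_n} + y_n \beta^\top x_n - \ln y_n!\right]$$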
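As a check on this log-likelihood, the sketch below codes it directly (sum of $-e^{\beta^\top x_n} + y_n \beta^\top x_n - \ln y_n!$, here averaged over observations) and compares the maximizer with the estimate returned by `glm`. The simulated data (intercept 0.5, slope 0.3) and the object names are assumptions made for the example.

```r
set.seed(5)
N <- 2000
x <- rnorm(N)
y <- rpois(N, exp(0.5 + 0.3 * x))
X <- cbind(1, x)

# minus the average Poisson log-likelihood with a logarithmic link
negll_pois <- function(b) {
  xb <- drop(X %*% b)
  -mean(-exp(xb) + y * xb - lgamma(y + 1))   # lgamma(y + 1) = log(y!)
}

b_optim <- optim(c(0, 0), negll_pois, method = "BFGS",
                 control = list(reltol = 1e-12))$par
b_glm <- unname(coef(glm(y ~ x, family = poisson)))
cbind(optim = b_optim, glm = b_glm)
```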

9.3.1.2 The NegBin Model

Count data often exhibit excess dispersion, i.e., the variance is greater than the mean. In this case, the NegBin model is more appropriate than the Poisson model.

Suppose that is a random variable that follows a Poisson distribution of parameter (with in the case of a logarithmic link), being a random variable.

The conditional probability of is:

Let us now suppose that follows a gamma distribution. If contains an intercept, the mean of is not identified and therefore a one‐parameter distribution, which imposes a unit mean, is chosen.

Integrating out this conditional probability using the density of , we obtain:

To understand the meaning of , the first two moments of are computed. For a given value of , we have, as for the Poisson model: . The unconditional mean is , because the expected value of equals 1.

To compute the unconditional variance, the variance decomposition formula is applied:

A general formula for is:

For , we get the Negbin1 model, with and . In this case, the variance is proportional to the mean.

For , we obtain the Negbin2 model, with and ; here, the variance is a quadratic function of the mean.
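The overdispersion generated by the Poisson-gamma mixture can be checked by simulation. In the sketch below, the multiplicative heterogeneity term is drawn from a gamma distribution with shape and rate both equal to `delta`, so that its mean is 1 and its variance is `1 / delta`; under this assumption the variance decomposition gives a variance of the count equal to `lambda + lambda^2 / delta`. The names `lambda`, `delta` and `nu` are labels chosen for this illustration.

```r
set.seed(6)
N <- 200000
lambda <- 3                                   # Poisson mean given nu = 1
delta <- 2                                    # gamma parameter
nu <- rgamma(N, shape = delta, rate = delta)  # E(nu) = 1, V(nu) = 1 / delta
y <- rpois(N, lambda * nu)                    # Poisson conditional on nu

c(mean = mean(y),
  variance = var(y),
  theoretical_variance = lambda + lambda^2 / delta)
```

The sample mean should stay close to `lambda`, while the sample variance should be clearly larger and close to the theoretical value, which is quadratic in the mean for a fixed `delta`.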

9.3.2 Fixed Effects Model

Fixed effects Poisson and NegBin models are proposed by Hausman et al. (1984).

9.3.2.1 The Poisson Model

The fixed effects Poisson model is very specific, as it doesn't suffer from the incidental parameter problem and can therefore be obtained either by estimating the individual effects or by using a sufficient statistic⁴.

In a panel context, the Poisson parameter for individual in period is written:

which means that the individual effect is multiplicative. For a given value of the individual effect, the probability of observing is:

Let be the sum of all the values of the response for individual and the sum of the Poisson parameters. A sum of Poisson variables follows a Poisson distribution with parameter equal to the sum of the parameters of the summed variables. We therefore have:

(9.33)

Let be the vector of values of for individual . We then have:

(9.34)

Applying Bayes' theorem, we obtain:

i.e., the joint probability of the components of is the product of the conditional probability of given and the marginal distribution of . This conditional probability is:

which implies:

(9.35)

As for the logit model, the individual sum of the responses is a sufficient statistic, which means that conditioning on it eliminates the individual effects. Taking the logarithm of this expression and summing over all individuals, we obtain the within Poisson model:

(9.36)

or:

(9.37)

As stated previously, the Poisson model is not affected by the incidental parameter problem, as the same estimator may be obtained by estimating the individual effects. To show this result, we take the logarithm of the joint probability for the observations of for individual (equation 9.34), in order to obtain the log‐likelihood function:

(9.38)

The first-order condition for to maximize the log-likelihood function is:

which implies that: .

Introducing this expression in (9.38) and summing over all , we obtain the concentrated log‐likelihood function:

(9.39)

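The two log-likelihood functions (9.37) and (9.39) are identical up to terms that do not depend on $\beta$; they therefore lead to the same estimators of $\beta$. Moreover, if a logarithmic link is chosen, the individual effect cancels out of the share of period $t$ in the sum of the Poisson parameters of individual $n$, which is equal to $e^{\beta^\top x_{nt}} / \sum_s e^{\beta^\top x_{ns}}$. The likelihood is in this case proportional to:

$$\prod_{n} \prod_{t} \left(\frac{e^{\beta^\top x_{nt}}}{\sum_s e^{\beta^\top x_{ns}}}\right)^{y_{nt}}$$

which is similar to the likelihood of a multinomial logit model for which individuals must choose one among $T$ mutually exclusive alternatives. The difference is that in this latter model $y_{nt}$ is either equal to 0 or to 1 and $\sum_t y_{nt} = 1$, whereas in our context each $y_{nt}$ is a natural integer.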
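The equivalence between the two approaches can be verified numerically. The sketch below simulates a small panel (one covariate, true slope 0.5, individual effects correlated with the covariate; all of this is assumed for illustration), maximizes the conditional log-likelihood $\sum_n \sum_t y_{nt} \ln\left(e^{\beta x_{nt}} / \sum_s e^{\beta x_{ns}}\right)$, and compares the result with a Poisson regression that estimates one dummy per individual.

```r
set.seed(7)
n <- 200
Tn <- 5
id <- rep(seq_len(n), each = Tn)
alpha <- rep(rnorm(n, sd = 0.5), each = Tn)   # individual effects
x <- rnorm(n * Tn) + alpha                    # correlated with the effects
y <- rpois(n * Tn, exp(1 + 0.5 * x + alpha))

# individuals whose responses are all 0 carry no information on beta
keep <- ave(y, id, FUN = sum) > 0
id <- id[keep]; x <- x[keep]; y <- y[keep]

# conditional log-likelihood (terms free of beta are dropped)
cond_ll <- function(b) {
  denom <- ave(exp(b * x), id, FUN = sum)     # sum over s of exp(b * x_ns)
  sum(y * (b * x - log(denom)))
}
b_cond <- optimize(cond_ll, c(-5, 5), maximum = TRUE, tol = 1e-10)$maximum

# same slope when the individual effects are estimated as dummies
b_dummy <- coef(glm(y ~ x + factor(id), family = poisson))[["x"]]
c(conditional = b_cond, dummies = b_dummy)
```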

9.3.2.2 Negbin Model

Hausman et al. (1984) also propose a fixed effects NegBin model. We just present below without demonstration the joint probability for individual :

(9.40)

9.3.3 Random Effects Models

9.3.3.1 The Poisson Model

Hausman et al. (1984) also proposed a between and a random effects Poisson model, integrating out the relevant probabilities (9.33 and 9.34 respectively). A gamma distribution hypothesis is made for the individual effects, with the following density:

with

the gamma function. The expected value and the variance of are respectively:

If the model contains an intercept, the expected value is not identified and we can then suppose, without restriction, that it is equal to 1, which implies . We then obtain a gamma distribution with one parameter (denoted ):

Integrating out the conditional probabilities (9.33 and 9.34), we obtain the unconditional probabilities for the between and the random effects models:

which leads to the following log‐likelihood functions:

(9.41)

(9.42)

9.3.3.2 The NegBin Model

In addition to the Poisson model, Hausman et al. (1984) also proposed between and random effects NegBin models. We just present below without demonstration the joint probability for individual .

(9.43)

(9.44)

Example 9‐6 fixed effects NegBin model – `GiantsShoulders` data set

Furman and Stern (2011) assess the impact of a scientific institution, a biological resource center, whose objective is to certify and disseminate knowledge, on knowledge accumulation. More specifically, they are interested in the ATCC (American Type Culture Collection), which collects, certifies, and distributes biological organisms. The authors are interested in the citations of publications for which the results are hosted by the ATCC, and they try to estimate the causal effect of ATCC hosting. There is an obvious selection problem, because it is natural to think that some of the best pieces of research will end up being hosted by the ATCC and that they would be heavily cited because of their quality even if they were not hosted by the ATCC.

In order to identify the causal effect of ACTT hosting on knowledge dissemination, the authors use two strategies:

the first is that there is often a long lag between publication and hosting, and this lag is mostly exogenous,
the second consists in matching every hosted article to a similar (same journal, date, and subject) non‐hosted article.

The GiantsShoulders data set is available in the pder package.

 data("GiantsShoulders", package = "pder")
head(GiantsShoulders)
  pair article brc pubyear brcyear year citations
1  184    1184 yes    1983    1994 1983         0
2  184    1184 yes    1983    1994 1984        31
3  184    1184 yes    1983    1994 1985        89
4  184    1184 yes    1983    1994 1986       105
5  184    1184 yes    1983    1994 1987        84
6  184    1184 yes    1983    1994 1988        75

The response is citations, the annual number of citations of the article. Each article is identified by the variable article and by the pair of articles it belongs to, pair. For each pair, one article is hosted by the ATCC and the other is not, which is indicated by the variable brc. Years of observation, publication, and hosting are indicated by the variables year, pubyear, and brcyear.

Figure 1 in Furman and Stern (2011), reproduced here in Figure 9.6, presents the average number of citations for hosted and non‐hosted articles as a function of publication age. It is computed using the dplyr and the ggplot2 packages.

 library("dplyr")
library("ggplot2")
GiantsShoulders <- mutate(GiantsShoulders, age = year - pubyear)
cityear <- summarise(group_by(GiantsShoulders, brc, age),
                     cit = mean(citations, na.rm = TRUE))
ggplot(cityear, aes(age, cit)) + geom_line(aes(lty = brc)) +
    geom_point(aes(shape = brc)) + scale_x_continuous(limits = c(0, 20))

Figure 9.6Average annual citations by age, BRC versus control articles.

As can be seen, the number of citations increases during the first years, reaches a maximum at about the third or fourth year, and then decreases. Figure 9.6 also shows that hosted articles are much more cited than non-hosted articles.

To estimate the marginal causal effect of the hosting institution, two covariates are constructed for hosted articles:

window is 1 around the hosting date, more precisely for a three‐year period centered on the hosting year,
post_brc is 1 for articles hosted for more than a year.

To reproduce the results exactly, we use annual fixed effects for years after 1979 and 5‐year effects for the 1970‐74 and 1975‐79 periods. We also introduce fixed effects for the age of the articles (omitting the 31 years age dummy).

 GiantsShoulders <- mutate(GiantsShoulders,
                          window = as.numeric( (brc == "yes") &
                                               abs(brcyear - year) <= 1),
                          post_brc = as.numeric( (brc == "yes") &
                                                 year - brcyear > 1),
                          age = year - pubyear)
GiantsShoulders$age[GiantsShoulders$age == 31] <- 0
GiantsShoulders$year[GiantsShoulders$year %in% 1970:1974] <- 1970
GiantsShoulders$year[GiantsShoulders$year %in% 1975:1979] <- 1975

Two linear models are estimated first. The first one (t3c1) contains only age fixed effects, and the second one (t3c2) adds pair and year fixed effects. The results are similar; the selection effect of hosting is about 50% more citations, and the marginal effect is 35% for the hosting period and about 50% for later years.

The fixed effects NegBin model is then estimated, using alternatively pair (t3c3) and article (t3c4) fixed effects. The table below reports the second linear model and the two NegBin models.

 library("pglm")
# linear models on log(1 + citations)
t3c1 <- lm(log(1 + citations) ~ brc + window + post_brc + factor(age),
           data = GiantsShoulders)
t3c2 <- update(t3c1, . ~ . + factor(pair) + factor(year))
# fixed effects NegBin models, with pair and then article effects
t3c3 <- pglm(citations ~ brc + window + post_brc + factor(age) + factor(year),
             data = GiantsShoulders, index = "pair",
             effect = "individual", model = "within", family = negbin)
t3c4 <- pglm(citations ~ window + post_brc + factor(age) + factor(year),
             data = GiantsShoulders, index = "article",
             effect = "individual", model = "within", family = negbin)
screenreg(list(t3c2, t3c3, t3c4),
          custom.model.names = c("ols: age/year/pair-FE",
                                 "NB:age/year/pair-FE",
                                 "NB: age/year/article-FE"),
          omit.coef = "(factor)|(Intercept)", digits = 3)

=================================================================================
                ols: age/year/pair-FE  NB:age/year/pair-FE  NB: age/year/article-FE
---------------------------------------------------------------------------------
brcyes             0.501 ***                0.752 ***
                  (0.057)                  (0.073)
window             0.385 ***                0.352 ***           0.565 ***
                  (0.074)                  (0.082)             (0.065)
post_brc           0.535 ***                0.538 ***           0.810 ***
                  (0.063)                  (0.079)             (0.056)
---------------------------------------------------------------------------------
R^2                0.538
Adj. R^2           0.522
Num. obs.       4857                     4857                4857
RMSE               0.829
Log Likelihood                         -10759.180           -9632.404
=================================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

9.4 More Empirical Examples

Charness and Villeval (2009) investigate the difference in behavior between senior and junior workers in terms of risk aversion, competition, and cooperation. They conduct an experiment during which every participant can invest his or her initial endowment in a public good game; the amount invested is the response of their econometric analysis. This variable is left- (null contribution) and right- (contribution of the full endowment) censored. As the participants are observed during 16 periods, they use a random effects tobit model. The Seniors data set is available in package pder.

Michalopoulos and Papaioannou (2016) explore the consequences of ethnic partitioning, which is one aspect of the “scramble for Africa” during which European countries partitioned Africa without caring much about the boundaries of ethnic groups. Their pseudo‐panel consists of 825 ethnic groups belonging to 49 countries. The authors estimate a Negbin model, where the response is the number of conflicts in an ethnicity‐country homeland, the major covariate being a dummy for partitioned ethnic areas. They introduce country fixed effects and estimate a specification where the fixed effects are estimated and not wiped out using a sufficient statistic. Partitioned ethnicities experience an increase of 57% in political violence compared to other areas. The data are available in package pder as ScrambleAfrica.

Bardhan and Mookherjee (2010) analyze the political determinants of land reform in West Bengal, India. More specifically, they use yearly data on 89 villages for the 1978-1998 period. The response is the percentage of land or of households affected by land reform; it is highly censored, as it is 0 for more than 80% of the sample. The main covariate is the presence of a left-wing coalition at the head of the local government. The authors use the trimmed least absolute deviations estimator of Honoré (1992) and don't find any significant effect of the left-wing government variable on the strength of land reform. The LandReform data are available in package pder.

Brandts and Cooper (2006) analyze how financial incentives can be used to overcome a history of coordination failure. For this purpose, they conduct an experiment where “firms,” composed of four “employees,” have an output that is related to the lowest level of effort implemented by the employees. The response is either the individual effort or the lowest effort within the firm and, as the same employees/firms are observed during 30 different rounds, panel data techniques are used. The level of effort being ordinal, the authors use ordered probit models, with firm random effects when the analysis is at the firm level and nested random effects when it is at the employee level. Their dataset is available in the pder package as CoordFailure.

Farber et al. (2016) conducted an audit study to analyze the determinants of callbacks to job applications. They sent four fake resumes for each of 1,118 job openings; the response is a dummy indicating a callback, the covariates being the unemployment spell duration, the age, and whether the worker has held a low‐level interim job. They estimate a random effects and a conditional logit model with job opening effects. The Callbacks data are to be found in the pder package.

Bazzi (2017) investigates the influence of income on migration at the household and at the village level. In the first case, the response is a dummy that indicates whether a person in the household migrated during the given year; in the second case, it is the percentage of the population of the village that has migrated. The main covariates are rainfall, rice price shock, wealth at the household level and, at the village level, indicators of the shape of the wealth distribution. The author uses a conditional logit at the household level and the two‐sided trimmed least absolute deviations estimator (see Alan et al., 2013) at the village level. The IncomeMigrationV (village level) and IncomeMigrationH (household level) datasets are also included in the pder package.

Vella and Verbeek (1998) estimate the union premium for young men. In a first step, they estimate a dynamic random effects probit model for union membership. The UnionWage dataset is available in the pglm package.

Hausman et al. (1986) and Cincera (1997) study the dynamic relationship between patents and R&D using yearly panels of firms. They fit different count data models, including conditional Poisson and NegBin models. The data sets they used are available as PatentsRDUS and PatentsRD, respectively, in the pglm package.

Notes
