Chapter 9
Count Data and Limited Dependent Variables

In economics, the dependent variable is often not continuous, so that OLS estimation is not appropriate. On the one hand, the response may be a count, i.e., it takes only non‐negative integer values. In this case, the most commonly used specifications are the Poisson and the NegBin models. On the other hand, the response may be a limited dependent variable. In this case, one can assume that there exists a continuous latent (unobserved) variable, denoted $y^*$. The value of $y^*$ is either not observed on some part of its domain or not observed at all. The different cases are depicted in Figure 9.1:

  • Figure 9.1a presents the case of a binomial variable $y$, which indicates the position of $y^*$ relative to a single threshold,
  • Figure 9.1b presents the case of an ordinal variable $y$, which indicates the position of $y^*$ relative to two thresholds,
  • Figure 9.1c presents the case of a left‐truncated variable: to the right of the truncation point, we have $y = y^*$, while observations for which $y^*$ lies below the truncation point are simply not available,
  • Figure 9.1d presents the case of a left‐censored variable: as in the truncated case, one observes $y = y^*$ to the right of the censoring point. The sample also contains observations for which $y^*$ lies below that point, but the corresponding values of $y^*$ are unobserved.

[Figure: four bell‐shaped curves labeled binomial response (top left), ordinal model (top right), truncated response (bottom left), and censored response (bottom right).]

Figure 9.1 Limited dependent variable.

Some of these models belong to a broad category called “generalized linear models”. More specifically, this concerns:

  • the binomial model and especially two particular cases, the logit and the probit models,
  • the Poisson model.

The NegBin model is also a generalized linear model if its supplementary parameter is held fixed rather than estimated.

In a cross‐section context, both base R and several packages provide the relevant estimators, using the maximum likelihood method:

  • probit, logit and Poisson models can be fitted using the glm function,
  • the NegBin model can be estimated using the glm.nb function of the MASS package,
  • the ordinal model can be fitted using the polr function of this same package,
  • the censored model can be estimated using the tobit function of the AER package or the censReg function of the censReg package,
  • the truncated model can be fitted using the truncreg function of the truncreg package.

The pglm package provides similar estimators for panel data. It estimates binomial and Poisson models and, for convenience, NegBin and ordinal models as well, even though, strictly speaking, the last two are not proper generalized linear models.

The pldv function of the plm package provides panel estimators for the case where the response is either truncated or censored.

These models are usually estimated by maximum likelihood, which requires strong assumptions about the distribution of the response. When these assumptions do not hold, the estimator is, except in very special cases, no longer consistent.

This is a general drawback of maximum likelihood estimators, but there is another one that is specific to panel data. In linear models, individual effects can be removed using an appropriate transformation (within or first differences) or can be directly estimated. This is not the case for most of the models presented in this chapter: the individual effects cannot be removed, and their estimation leads to the incidental parameter problem.

When $N \to +\infty$ for fixed $T$, the estimator of the individual effects of the linear model is not consistent, as the number of parameters to be estimated grows with $N$ while the variance of each estimated effect remains constant. Nevertheless, the estimator of the vector of parameters of interest $\beta$ is consistent.

Unlike in the linear case, for most of the models reviewed in this chapter, when the individual effects are estimated, their inconsistency “contaminates” the estimator of $\beta$, which becomes inconsistent as well. This incidental parameter problem leads to abandoning the fixed effects models in which the individual effects are estimated, in favor of three alternatives:

  • the random effects model, which is always usable: one first writes the probabilities conditional on the individual effects and then computes the unconditional probabilities by integrating out the individual effects, under a hypothesis about their distribution,
  • a fixed effects model, which uses the notion of sufficient statistic: for example, in a logit model, the probability of being unemployed in period $t$ depends on the individual effect, and so does the probability of the total number of periods of unemployment. By contrast, the ratio of these two probabilities, which is the probability of being unemployed in period $t$ given the total number of periods for which the individual is unemployed, does not contain the individual effect. This technique, which is not available for all the models reviewed, makes it possible, like the within transformation for linear models, to get rid of the individual effects,
  • for censored or truncated responses, the linear model can be consistently applied if some observations are removed from the sample beforehand (one then speaks of a trimmed estimator).

In the next sections, we will present the three categories of models previously cited: binomial and ordinal models, truncated and censored models, and count data models. For each of these three sections, we will first briefly describe the estimators used with cross‐sectional data. We will then present the estimators appropriate for panel data. We will finally reproduce different empirical examples of these models.

9.1 Binomial and Ordinal Models

9.1.1 Introduction

9.1.1.1 The Binomial Model

We consider a model for which the response is binomial, and we denote, without loss of generality, the two possible values by 0 and 1. We then define a latent variable $y^*$ that is continuous on the real line and unobserved. The latent variable is linked to the observable binomial variable $y$ by the following rule of observation:

$$y = \begin{cases} 0 & \text{if } y^* \leq \mu \\ 1 & \text{if } y^* > \mu \end{cases}$$

The value of the latent variable is the sum of a linear combination of the covariates and an error term. Without loss of generality, if $x$ includes an intercept, we set $\mu = 0$.

$$y^* = \beta^\top x + \epsilon$$

The variance of $\epsilon$ is not identified; it can therefore be set to 1 or to any other arbitrary value. Probabilities for the two possible values of the response are then:

$$\mathrm{P}(y = 0 \mid x) = \mathrm{P}(\epsilon \leq -\beta^\top x), \qquad \mathrm{P}(y = 1 \mid x) = \mathrm{P}(\epsilon > -\beta^\top x)$$

Denoting by $F$ the cumulative distribution function of $\epsilon$, we then have:

$$\mathrm{P}(y = 0 \mid x) = F(-\beta^\top x), \qquad \mathrm{P}(y = 1 \mid x) = 1 - F(-\beta^\top x) = F(\beta^\top x)$$

the last expression being valid if the density of $\epsilon$ is symmetric. Denoting $q = 2y - 1$, which equals $-1$ for $y = 0$ and $+1$ for $y = 1$, the probability of the outcome can be expressed in a compact form:

$$\mathrm{P}(y \mid x) = F(q \, \beta^\top x) \qquad (9.1)$$

Two distributions are often used: the normal distribution:

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} t^2} dt$$

which leads to the probit model, and the logistic distribution:

$$\Lambda(z) = \frac{e^z}{1 + e^z}$$

which leads to the logit model.

For a sample of size $N$, the log‐likelihood function is obtained by summing the logs of (9.1) for all the observations:

$$\ln L = \sum_{n=1}^{N} \ln F(q_n \, \beta^\top x_n)$$
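The log‐likelihood just described can be sketched numerically. The following is a minimal illustration in Python using only the standard library (it is not the R implementation in glm); it assumes the usual sign coding $q = 2y - 1$ and a symmetric error distribution, and the function names and toy data are hypothetical.

```python
import math
from statistics import NormalDist

def logistic_cdf(z):
    """Logistic cumulative distribution function (logit model)."""
    return 1.0 / (1.0 + math.exp(-z))

def binomial_loglik(beta, X, y, cdf):
    """Sum over observations of ln F(q * beta'x), with q = 2y - 1."""
    total = 0.0
    for x_n, y_n in zip(X, y):
        q = 2 * y_n - 1                                  # +1 if y = 1, -1 if y = 0
        index = sum(b * x for b, x in zip(beta, x_n))    # linear predictor beta'x
        total += math.log(cdf(q * index))
    return total

# Hypothetical toy data: the first column is the intercept.
X = [(1.0, 0.5), (1.0, -1.2), (1.0, 2.0), (1.0, 0.1)]
y = [1, 0, 1, 0]
beta = (0.2, 0.8)

probit_ll = binomial_loglik(beta, X, y, NormalDist().cdf)   # normal errors
logit_ll = binomial_loglik(beta, X, y, logistic_cdf)        # logistic errors
```

Passing the distribution function as an argument makes the only difference between the probit and the logit explicit: the choice of $F$.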

9.1.1.2 Ordered Models

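An ordered model is a model for which the response can take $J$ distinct ordered values (with $J \geq 3$). The construction of the model is very similar to that of the binomial model. We consider a latent variable equal, as before, to the sum of a linear combination of the covariates and an error:

$$y^* = \beta^\top x + \epsilon$$

Denoting by $\mu$ a vector of parameters, with $\mu_0 = -\infty$ and $\mu_J = +\infty$, the rule of observation for the different values of $y$ is then:

$$y = j \quad \text{if} \quad \mu_{j-1} < y^* \leq \mu_j, \qquad j = 1, \ldots, J$$

Denoting by $F$ the cumulative distribution function of $\epsilon$, the probability for a given value $j$ of $y$ is:

$$\mathrm{P}(y = j \mid x) = F(\mu_j - \beta^\top x) - F(\mu_{j-1} - \beta^\top x)$$

The probability of the outcome can be written:

$$\mathrm{P}(y \mid x) = \sum_{j=1}^{J} \mathbf{1}(y = j) \left[ F(\mu_j - \beta^\top x) - F(\mu_{j-1} - \beta^\top x) \right] \qquad (9.2)$$

For a sample of size $N$, the log‐likelihood function is obtained by summing the logarithms of (9.2) for all the observations:

$$\ln L = \sum_{n=1}^{N} \ln \sum_{j=1}^{J} \mathbf{1}(y_n = j) \left[ F(\mu_j - \beta^\top x_n) - F(\mu_{j-1} - \beta^\top x_n) \right]$$

As for the binomial model, the most common choices for the distribution of $\epsilon$ are the normal and the logistic distributions, which lead respectively to the ordered probit and logit models.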
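As a small illustration of how the thresholds partition the latent variable, the sketch below computes ordered probit probabilities in Python (standard library only). The function names, the coding of the response as 1 to $J$, and the numerical values are assumptions made for the example.

```python
import math
from statistics import NormalDist

def ordered_probs(index, cutpoints, cdf=NormalDist().cdf):
    """Return [P(y = 1), ..., P(y = J)] for a linear index beta'x."""
    # Pad the J - 1 interior thresholds with mu_0 = -inf and mu_J = +inf.
    mu = [-math.inf] + list(cutpoints) + [math.inf]

    def F(m):
        if math.isinf(m):
            return 0.0 if m < 0 else 1.0
        return cdf(m - index)

    return [F(mu[j]) - F(mu[j - 1]) for j in range(1, len(mu))]

def ordered_loglik(indexes, ys, cutpoints):
    """Sum of ln P(y_n) over observations, with y_n coded 1..J."""
    return sum(math.log(ordered_probs(i, cutpoints)[y - 1])
               for i, y in zip(indexes, ys))

# Hypothetical values: J = 3 categories, thresholds at -1 and 1.
probs = ordered_probs(index=0.3, cutpoints=(-1.0, 1.0))
ll = ordered_loglik([0.3, -0.8, 1.5], [2, 1, 3], (-1.0, 1.0))
```

Whatever the value of the index, the $J$ probabilities sum to one, since consecutive terms of the form $F(\mu_j - \beta^\top x)$ cancel.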

9.1.2 The Random Effects Model

For panel data, we now have repeated observations of $y$ for the same individuals. The latent variable is then defined by:

$$y^*_{nt} = \beta^\top x_{nt} + \epsilon_{nt}$$

We assume as usual that the error can be written as the sum of an individual effect $\eta_n$ and an idiosyncratic term $\nu_{nt}$: $\epsilon_{nt} = \eta_n + \nu_{nt}$. Two observations for the same individual are then correlated because of the common term $\eta_n$. If the $x$ vector contains an intercept, we can suppose, without loss of generality, that $\mathrm{E}(\eta_n) = 0$.

9.1.2.1 The Binomial Model

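For a given value of $\eta_n$, the probability of the outcome for individual $n$ at period $t$ is defined as before, with $q_{nt} = 2 y_{nt} - 1$:

$$\mathrm{P}(y_{nt} \mid x_{nt}, \eta_n) = F\left(q_{nt} (\beta^\top x_{nt} + \eta_n)\right)$$

Denoting $y_n = (y_{n1}, \ldots, y_{nT})$, the joint probability for all the periods for individual $n$ is:

$$\mathrm{P}(y_n \mid x_n, \eta_n) = \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \eta_n)\right)$$

The unconditional probability is obtained by integrating this expression over $\eta_n$. Assuming that the distribution of $\eta_n$ is normal with a standard deviation of $\sigma_\eta$, we obtain:

$$\mathrm{P}(y_n \mid x_n) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi} \sigma_\eta} e^{-\frac{1}{2} \left(\frac{\eta}{\sigma_\eta}\right)^2} \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \eta)\right) d\eta$$

With the change of variable:

$$v = \frac{\eta}{\sqrt{2} \sigma_\eta}$$

we obtain

$$\mathrm{P}(y_n \mid x_n) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-v^2} \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \sqrt{2} \sigma_\eta v)\right) dv$$

There is no closed form for this integral, but it can be approximated numerically and efficiently using Gauss‐Hermite quadrature. This method consists in evaluating the function for different values of $v$ (denoted $v_r$) and computing a linear combination of these evaluations, with weights denoted by $w_r$. For a fixed number of evaluations $R$, the values of $v_r$ and $w_r$ are tabulated. The approximation is:

$$\mathrm{P}(y_n \mid x_n) \approx \frac{1}{\sqrt{\pi}} \sum_{r=1}^{R} w_r \prod_{t=1}^{T} F\left(q_{nt} (\beta^\top x_{nt} + \sqrt{2} \sigma_\eta v_r)\right) \qquad (9.3)$$

and the log‐likelihood function is obtained by summing over all the individuals the logarithm of (9.3).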
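The quadrature step can be sketched as follows in Python (standard library only). This is an illustration under stated assumptions, not the pglm implementation: it uses a probit link, normally distributed individual effects, and the tabulated 3‐point Gauss‐Hermite rule (nodes $0, \pm\sqrt{3/2}$), whereas practical work would use more points. Function names and toy values are hypothetical.

```python
import math
from statistics import NormalDist

# Tabulated 3-point Gauss-Hermite rule for the weight function exp(-v^2).
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6,
              2 * math.sqrt(math.pi) / 3,
              math.sqrt(math.pi) / 6)

def re_probit_prob(beta, X_n, y_n, sigma_eta):
    """Approximate the unconditional P(y_n | x_n) for one individual."""
    cdf = NormalDist().cdf
    total = 0.0
    for v_r, w_r in zip(GH_NODES, GH_WEIGHTS):
        eta = math.sqrt(2.0) * sigma_eta * v_r      # undo the change of variable
        joint = 1.0
        for x_t, y_t in zip(X_n, y_n):
            q = 2 * y_t - 1
            index = sum(b * x for b, x in zip(beta, x_t))
            joint *= cdf(q * (index + eta))         # conditional probability at this node
        total += w_r * joint
    return total / math.sqrt(math.pi)

# Hypothetical individual observed over T = 2 periods (intercept + one covariate).
X_n = [(1.0, 0.2), (1.0, -0.5)]
p_10 = re_probit_prob((0.1, 0.7), X_n, (1, 0), sigma_eta=0.8)
```

Because the weights sum to $\sqrt{\pi}$, the approximated probabilities of the four possible outcome vectors still sum to one, and setting `sigma_eta` to 0 gives back the product of the cross‐sectional probit probabilities.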

9.1.2.2 Ordered Models

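The line of reasoning is very similar to that of binomial models. The joint probability for an individual $n$, for a given value of the individual effect, is:

$$\mathrm{P}(y_n \mid x_n, \eta_n) = \prod_{t=1}^{T} \left[ F(\mu_{y_{nt}} - \beta^\top x_{nt} - \eta_n) - F(\mu_{y_{nt} - 1} - \beta^\top x_{nt} - \eta_n) \right]$$

Assuming a normal distribution for the individual effects, with standard deviation $\sigma_\eta$, the unconditional probability is:

$$\mathrm{P}(y_n \mid x_n) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi} \sigma_\eta} e^{-\frac{1}{2} \left(\frac{\eta}{\sigma_\eta}\right)^2} \prod_{t=1}^{T} \left[ F(\mu_{y_{nt}} - \beta^\top x_{nt} - \eta) - F(\mu_{y_{nt} - 1} - \beta^\top x_{nt} - \eta) \right] d\eta$$

Using the same change of variable as previously, $v = \eta / (\sqrt{2} \sigma_\eta)$, we obtain:

$$\mathrm{P}(y_n \mid x_n) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-v^2} \prod_{t=1}^{T} \left[ F(\mu_{y_{nt}} - \beta^\top x_{nt} - \sqrt{2} \sigma_\eta v) - F(\mu_{y_{nt} - 1} - \beta^\top x_{nt} - \sqrt{2} \sigma_\eta v) \right] dv$$

which can be approximated using Gauss‐Hermite quadrature, with nodes $v_r$ and weights $w_r$:

$$\mathrm{P}(y_n \mid x_n) \approx \frac{1}{\sqrt{\pi}} \sum_{r=1}^{R} w_r \prod_{t=1}^{T} \left[ F(\mu_{y_{nt}} - \beta^\top x_{nt} - \sqrt{2} \sigma_\eta v_r) - F(\mu_{y_{nt} - 1} - \beta^\top x_{nt} - \sqrt{2} \sigma_\eta v_r) \right]$$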

9.1.3 The Conditional Logit Model

The random effects estimator is consistent only if the individual effects are uncorrelated with the covariates. If this is not the case, the conditional logit model can be used. It is well known in the statistics literature and was introduced in panel data econometrics by Chamberlain (1980).

The general presentation of this model is quite complex, but the intuition of it can be perceived using the special case where images. We denote images. Only the individuals for which images can be used to estimate the conditional logit model (more generally, only individuals for which images may be used).

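For a given period $t$, the probabilities for the two values of $y_{nt}$ are:

$$\mathrm{P}(y_{nt} = 1 \mid x_{nt}, \eta_n) = \frac{e^{\beta^\top x_{nt} + \eta_n}}{1 + e^{\beta^\top x_{nt} + \eta_n}}, \qquad \mathrm{P}(y_{nt} = 0 \mid x_{nt}, \eta_n) = \frac{1}{1 + e^{\beta^\top x_{nt} + \eta_n}}$$

or more generally:

$$\mathrm{P}(y_{nt} \mid x_{nt}, \eta_n) = \frac{e^{y_{nt} (\beta^\top x_{nt} + \eta_n)}}{1 + e^{\beta^\top x_{nt} + \eta_n}}$$

If the idiosyncratic components of the errors are i.i.d., the joint probability for two observations is simply the product of $\mathrm{P}(y_{n1})$ and $\mathrm{P}(y_{n2})$:

$$\mathrm{P}(y_{n1}, y_{n2} \mid x_n, \eta_n) = \frac{e^{y_{n1} (\beta^\top x_{n1} + \eta_n)} \, e^{y_{n2} (\beta^\top x_{n2} + \eta_n)}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right) \left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)}$$

or also, as one and only one of the two $y_{nt}$ equals 1:

$$\mathrm{P}(y_{n1}, y_{n2} \mid x_n, \eta_n) = \frac{e^{\eta_n} \, e^{y_{n1} \beta^\top x_{n1} + y_{n2} \beta^\top x_{n2}}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right) \left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)} \qquad (9.4)$$

The probability that $y_{n1} + y_{n2} = 1$ is equal to the sum of the probabilities of:

  • $y_{n1} = 1$ and $y_{n2} = 0$, which is $\frac{e^{\beta^\top x_{n1} + \eta_n}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right) \left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)}$,
  • $y_{n1} = 0$ and $y_{n2} = 1$, which is $\frac{e^{\beta^\top x_{n2} + \eta_n}}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right) \left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)}$.

which is therefore:

$$\mathrm{P}(y_{n1} + y_{n2} = 1 \mid x_n, \eta_n) = \frac{e^{\eta_n} \left(e^{\beta^\top x_{n1}} + e^{\beta^\top x_{n2}}\right)}{\left(1 + e^{\beta^\top x_{n1} + \eta_n}\right) \left(1 + e^{\beta^\top x_{n2} + \eta_n}\right)} \qquad (9.5)$$

Dividing (9.4) by (9.5), one finally obtains the joint probability of $y_{n1}$ and $y_{n2}$ given their sum:

$$\mathrm{P}(y_{n1}, y_{n2} \mid x_n, y_{n1} + y_{n2} = 1) = \frac{e^{y_{n1} \beta^\top x_{n1} + y_{n2} \beta^\top x_{n2}}}{e^{\beta^\top x_{n1}} + e^{\beta^\top x_{n2}}} \qquad (9.6)$$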

This conditional probability is free of the individual effect and the likelihood that uses this expression can therefore be considered as a fixed effects logit model. Note that there is no similar estimator for the probit model.
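A quick numerical check of this property, written as a Python sketch with hypothetical values for the two linear predictors: the probability conditional on the sum is computed from the unconditional logit probabilities for two very different individual effects and compared with the standard closed form, which involves only the linear predictors.

```python
import math

def logit_prob(y, index):
    """P(y) = exp(y * index) / (1 + exp(index)) for y in {0, 1}."""
    return math.exp(y * index) / (1.0 + math.exp(index))

def conditional_prob(y1, y2, bx1, bx2, eta):
    """P(y1, y2 | y1 + y2 = 1), built from the unconditional probabilities."""
    joint = logit_prob(y1, bx1 + eta) * logit_prob(y2, bx2 + eta)
    p_sum_is_one = (logit_prob(1, bx1 + eta) * logit_prob(0, bx2 + eta)
                    + logit_prob(0, bx1 + eta) * logit_prob(1, bx2 + eta))
    return joint / p_sum_is_one

# Hypothetical linear predictors beta'x for the two periods.
bx1, bx2 = 0.3, -0.7
p_low = conditional_prob(1, 0, bx1, bx2, eta=-2.0)
p_high = conditional_prob(1, 0, bx1, bx2, eta=3.0)

# Closed form without the individual effect.
closed_form = math.exp(bx1) / (math.exp(bx1) + math.exp(bx2))
```

The two conditional probabilities coincide with each other and with the closed form, although the unconditional probabilities differ widely between the two values of the individual effect.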

9.2 Censored or Truncated Dependent Variable

9.2.1 Introduction

It's often the case in economics that the response is only observed on a certain range of values; we then say that the dependent variable is truncated. For example:

  • if the response is a proportion, it is necessarily left‐truncated at 0 and right‐truncated at 1,
  • consumption of a good is necessarily non‐negative and therefore left‐truncated at 0,
  • the demand for a sports event is necessarily lower than or equal to the number of seats in the stadium and is therefore right‐truncated at this capacity.

From now on, we will consider the most common case, which is left truncation at 0, but the models we will present easily extend to left and/or right truncation at any value.

As usual, we will assume that the dependent variable can be represented by a latent variable $y^*$ that equals the sum of a linear combination of covariates and an error term.

$$y^* = \beta^\top x + \epsilon$$

The observed response $y$ equals $y^*$ if it is not in the truncated zone (i.e., here, if it's strictly positive) and equals the truncation point (here, 0) otherwise.

$$y = \begin{cases} y^* & \text{if } y^* > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (9.8)$$

Two kinds of samples can be used to estimate this model:

  • a sample is truncated when only observations for which $y^* > 0$ are available (we therefore don't even know the values of the covariates $x$ for observations for which $y^*$ is in the truncation zone),
  • a sample is censored when it consists of observations for which $y^*$ is either inside or outside the truncation zone.

This latter case is particularly important in econometrics and leads to a model which is called the tobit model (Tobin, 1958). From now on, we'll refer to the truncated model when the first kind of sample is used and to the censored model for the second kind of sample.

We'll first analyze why applying a linear regression to a censored or a truncated model leads to inconsistent estimators. We'll then present a non‐parametric method that leads, removing some specific observations, to a consistent estimator while making minimal hypotheses on the model errors. We'll conclude this section with the maximum likelihood estimator, which relies on the much stronger hypothesis of homoscedasticity and normal distribution.

9.2.2 The Ordinary Least Squares Estimator

Let $f$ be the density of the distribution of $\epsilon$, which is assumed, without loss of generality as long as the equation contains an intercept, to have zero expected value. We then have:

$$\mathrm{E}(y^* \mid x) = \beta^\top x + \int_{-\infty}^{+\infty} \epsilon f(\epsilon) \, d\epsilon = \beta^\top x$$

If $y^*$ were observed, OLS would be a consistent estimator of $\beta$. This is not the case when we only observe the truncated variable $y$. On the truncated sample, we have $y^* > 0$, or $\epsilon > -\beta^\top x$. The density of $\epsilon$ for the sample is then $f(\epsilon) / \left(1 - F(-\beta^\top x)\right)$, where $F$ is the cumulative distribution function of $\epsilon$; it is depicted by the dotted line in Figure 9.2.

[Figure: three bell‐shaped curves for $y^*$ (solid), $y^* \mid y^* > 0$ (dotted), and $y^* \mid y^* > 0$ and $y^* < 2\beta^\top x$ (dashed); the shaded tails of the solid curve are labeled $\mathrm{P}(y^* < 0)$ (left) and $\mathrm{P}(y^* > 2\beta^\top x)$ (right); the horizontal axis is labeled $\epsilon$.]

Figure 9.2Distribution of images and images.

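The distribution of $\epsilon$ conditional on $\epsilon > -\beta^\top x$ is not symmetric around 0, and its expected value is positive, because the left side of the distribution, corresponding to values of $\epsilon < -\beta^\top x$, is truncated. We therefore have:

$$\mathrm{E}(y \mid x, y^* > 0) = \beta^\top x + \mathrm{E}(\epsilon \mid \epsilon > -\beta^\top x) > \beta^\top x$$

which is, for a normal distribution with standard deviation $\sigma$:

$$\mathrm{E}(y \mid x, y^* > 0) = \beta^\top x + \sigma \frac{\phi(\beta^\top x / \sigma)}{\Phi(\beta^\top x / \sigma)} = \beta^\top x + \sigma \lambda\left(\frac{\beta^\top x}{\sigma}\right)$$

or, subtracting $\beta^\top x$:

$$\mathrm{E}(\epsilon \mid x, y^* > 0) = \sigma \lambda\left(\frac{\beta^\top x}{\sigma}\right)$$

$\lambda(z) = \phi(z) / \Phi(z)$ is known as the inverse Mills ratio and is a decreasing function of its argument. Computing the derivative with respect to one covariate $x_k$, and denoting $z = \beta^\top x / \sigma$, we obtain:

$$\frac{\partial \mathrm{E}(\epsilon \mid x, y^* > 0)}{\partial x_k} = -\beta_k \lambda(z) \left[\lambda(z) + z\right]$$

which is negative if $\beta_k > 0$, as $\lambda(z)$ is the average of $\epsilon / \sigma$ for $\epsilon / \sigma > -z$ and is therefore greater than $-z$. The OLS estimator computed on the truncated sample is therefore downward biased in absolute value.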

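For the censored sample, we have $y = 0$ for censored observations. We then have, with $z = \beta^\top x / \sigma$:

$$\mathrm{E}(y \mid x) = \left[1 - F(-\beta^\top x)\right] \mathrm{E}(y \mid x, y^* > 0) = \Phi(z) \, \beta^\top x + \sigma \phi(z)$$

where the last expression holds for a normal distribution. Subtracting $\beta^\top x$, we obtain the expected value of the error of the censored model:

$$\mathrm{E}(y \mid x) - \beta^\top x = \sigma \phi(z) - \left[1 - \Phi(z)\right] \beta^\top x$$

Computing once again the derivative with respect to a covariate $x_k$, we have:

$$\frac{\partial \left[\mathrm{E}(y \mid x) - \beta^\top x\right]}{\partial x_k} = -\beta_k \left[1 - \Phi(z)\right]$$

which, as previously, has the opposite sign of $\beta_k$, implying that the OLS estimator on the censored sample is also downward biased in absolute value.

The bias of the OLS estimator on censored and truncated samples is illustrated in Figure 9.3.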

[Figure: three panels (whole, censored, and truncated samples), each with data points and two upward‐sloping lines. The two lines coincide for the whole sample and intersect for the censored and the truncated samples.]

Figure 9.3 OLS bias for the censored and the truncated samples.
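The size of the expected error in both samples can be tabulated with a short Python sketch (standard library only). It assumes normal errors with standard deviation `sigma` and uses the standard expressions for the mean of a normal variable truncated or censored at 0; the function names and the grid of values are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()   # standard normal distribution

def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z)."""
    return STD.pdf(z) / STD.cdf(z)

def truncated_mean(bx, sigma):
    """E(y | x, y* > 0): mean of the response in the truncated sample."""
    return bx + sigma * inverse_mills(bx / sigma)

def censored_mean(bx, sigma):
    """E(y | x) when negative values of y* are recorded as 0."""
    z = bx / sigma
    return STD.cdf(z) * bx + sigma * STD.pdf(z)

# Expected error (mean of y minus beta'x) for a grid of linear predictors.
sigma = 1.0
bias_table = [(bx,
               truncated_mean(bx, sigma) - bx,
               censored_mean(bx, sigma) - bx)
              for bx in (-1.0, 0.0, 1.0, 2.0)]
```

In both columns the expected error is positive and shrinks as the linear predictor grows, which is the negative correlation between error and covariate (for a positive coefficient) behind the attenuation of the OLS slope.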

9.2.3 The Symmetrical Trimmed Estimator

The OLS estimator is inconsistent because the truncation leads to an asymmetric distribution of the errors, whose expected value depends on $x$. Powell (1986) proposes to restore the symmetry by removing some observations.

9.2.3.1 Truncated Sample

In the case of the truncated sample, observations for which $y^* < 0$, or $\epsilon < -\beta^\top x$, are missing. Symmetry may be restored by removing, from the right side of the distribution, the observations for which $y > 2\beta^\top x$, or $\epsilon > \beta^\top x$. The resulting distribution, conditional on $0 < y^* < 2\beta^\top x$, is depicted by the dashed line in Figure 9.2. In this case, provided that the distribution of $\epsilon$ is symmetric, we have:

$$\mathrm{E}(\epsilon \mid -\beta^\top x < \epsilon < \beta^\top x) = 0$$

A consistent estimator may be obtained using the normal equations and restricting the sample to observations for which $y_n < 2\beta^\top x_n$. Denoting by $\mathbf{1}(z)$ the function that is equal to 1 if $z$ is true and 0 otherwise, we have:

$$\sum_{n=1}^{N} \mathbf{1}(y_n < 2\beta^\top x_n) \, (y_n - \beta^\top x_n) \, x_n = 0 \qquad (9.9)$$

These first‐order conditions may be obtained by minimizing the function:

$$\sum_{n=1}^{N} \left( y_n - \max\left(\tfrac{1}{2} y_n, \beta^\top x_n\right) \right)^2 \qquad (9.10)$$

In this case, all the observations for which $\beta^\top x_n \leq 0$ and those for which $y_n \geq 2\beta^\top x_n$ have a weight equal to $\frac{1}{4} y_n^2$ in the objective function and a zero weight in the first‐order conditions. The weight in the objective function ensures that spurious solutions of the first‐order conditions, like $\beta = 0$, are excluded.

9.2.3.2 Censored Sample

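In the case of the censored sample, symmetry is restored by replacing $y_n$ by $2\beta^\top x_n$ when $y_n > 2\beta^\top x_n$ (just as $y^*_n$ is replaced by 0 when $y^*_n < 0$). We then have:

$$\sum_{n=1}^{N} \mathbf{1}(\beta^\top x_n > 0) \left( \min(y_n, 2\beta^\top x_n) - \beta^\top x_n \right) x_n = 0 \qquad (9.11)$$

These first‐order conditions may be obtained by minimizing the following function:

$$\sum_{n=1}^{N} \left( y_n - \max\left(\tfrac{1}{2} y_n, \beta^\top x_n\right) \right)^2 + \sum_{n=1}^{N} \mathbf{1}(y_n > 2\beta^\top x_n) \left[ \left(\tfrac{1}{2} y_n\right)^2 - \max(0, \beta^\top x_n)^2 \right] \qquad (9.12)$$

Observations for which $\beta^\top x_n < 0$ now have a weight equal to $\frac{1}{2} y_n^2$ in the objective function and a zero weight in the first‐order conditions.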

9.2.4 The Maximum Likelihood Estimator

If we can assume that the errors are normal and homoscedastic, a more efficient estimator is the maximum likelihood estimator.

9.2.4.1 Truncated Sample

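The maximum likelihood estimator for a truncated sample was proposed by Hausman and Wise (1976). The distribution of $y^*$ is normal, with expected value equal to $\beta^\top x$ and standard deviation $\sigma$. We then have:

$$f(y^*) = \frac{1}{\sigma} \phi\left(\frac{y^* - \beta^\top x}{\sigma}\right) = \frac{1}{\sqrt{2\pi} \sigma} e^{-\frac{1}{2} \left(\frac{y^* - \beta^\top x}{\sigma}\right)^2}$$

The probability of $y^*$ being negative is: $\mathrm{P}(y^* < 0) = \Phi(-\beta^\top x / \sigma) = 1 - \Phi(\beta^\top x / \sigma)$.

The density of the distribution of $y$, denoted $g$, is that of $y^*$ left‐truncated at zero. We then have:

$$g(y) = \frac{\frac{1}{\sigma} \phi\left(\frac{y - \beta^\top x}{\sigma}\right)}{\Phi\left(\frac{\beta^\top x}{\sigma}\right)} \qquad (9.13)$$

The log‐likelihood function is obtained by summing the logarithms of the density (9.13) for the $N$ observations in the sample:

$$\ln L = -\frac{N}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{n=1}^{N} (y_n - \beta^\top x_n)^2 - \sum_{n=1}^{N} \ln \Phi\left(\frac{\beta^\top x_n}{\sigma}\right) \qquad (9.14)$$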

9.2.4.2 Censored Sample

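When the sample is censored, the distribution of $y$ is a mix of a discrete and a continuous distribution. An observation for which $y = 0$ enters the log‐likelihood function as:

$$\ln \mathrm{P}(y = 0 \mid x) = \ln \left[ 1 - \Phi\left(\frac{\beta^\top x}{\sigma}\right) \right]$$

while for a positive observation, the contribution to the likelihood is the truncated normal density:

$$\frac{\frac{1}{\sigma} \phi\left(\frac{y - \beta^\top x}{\sigma}\right)}{\Phi\left(\frac{\beta^\top x}{\sigma}\right)}$$

times the probability that $y^*$ is positive: $\Phi(\beta^\top x / \sigma)$. We finally get the log‐likelihood function:

$$\ln L = \sum_{n: \, y_n = 0} \ln \left[ 1 - \Phi\left(\frac{\beta^\top x_n}{\sigma}\right) \right] + \sum_{n: \, y_n > 0} \ln \left[ \frac{1}{\sigma} \phi\left(\frac{y_n - \beta^\top x_n}{\sigma}\right) \right] \qquad (9.15)$$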
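The two likelihoods can be compared with a minimal Python sketch (standard library only). It assumes normal, homoscedastic errors and truncation or censoring at 0; it is an illustration of the contributions described above, not the code of the truncreg, AER, or censReg packages, and all names and toy data are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()   # standard normal distribution

def truncated_contrib(y, bx, sigma):
    """Log density of y in a sample truncated at 0 (requires y > 0)."""
    return (math.log(STD.pdf((y - bx) / sigma) / sigma)
            - math.log(STD.cdf(bx / sigma)))

def censored_contrib(y, bx, sigma):
    """Log-likelihood contribution in a sample censored at 0."""
    if y == 0:
        return math.log(STD.cdf(-bx / sigma))          # P(y* <= 0)
    return math.log(STD.pdf((y - bx) / sigma) / sigma)  # normal density

def loglik(contrib, beta, sigma, X, y):
    """Sum the contributions over the sample."""
    return sum(contrib(y_n, sum(b * x for b, x in zip(beta, x_n)), sigma)
               for x_n, y_n in zip(X, y))

# Hypothetical censored sample: intercept + one covariate.
X = [(1.0, 0.4), (1.0, -1.0), (1.0, 1.3), (1.0, 0.2)]
y = [1.5, 0.0, 2.1, 0.0]
tobit_ll = loglik(censored_contrib, (0.3, 0.9), 1.2, X, y)

# The truncated sample keeps only the positive observations.
X_pos = [x for x, v in zip(X, y) if v > 0]
y_pos = [v for v in y if v > 0]
trunc_ll = loglik(truncated_contrib, (0.3, 0.9), 1.2, X_pos, y_pos)
```

For a positive observation, the censored contribution equals the truncated one plus the log of the probability of being positive, which is the decomposition used in the text.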

9.2.5 Fixed Effects Model

Honoré (1992) proposed a symmetrical trimmed estimator that extends the estimator of Powell (1986) to panel data. For now, we consider a panel with only two observations for every individual and one covariate.

equation

The only hypothesis made concerning the errors $\nu_{n1}$ and $\nu_{n2}$ is that they are identically distributed. The symmetry hypothesis, which was required for the Powell (1986) estimator to be consistent, is not necessary here.

9.2.5.1 Truncated Sample

For the truncated model, only observations for which $y_{nt} > 0$ are available. Figure 9.4 presents the distribution of $y^*_{n1}$ and $y^*_{n2}$.

[Figure: distribution of $y^*_{n1}$ (left) and of $y^*_{n2}$ (right) for $\beta^\top \Delta x_n > 0$ (a) and $\beta^\top \Delta x_n < 0$ (b), centered at $\beta^\top x_{n1} + \eta_n$ (for $y^*_{n1}$) and $\beta^\top x_{n2} + \eta_n$ (for $y^*_{n2}$).]

Figure 9.4 Distribution of $y^*_{n1}$ and of $y^*_{n2}$.

With the hypotheses we've made, these two distributions only differ by their position, $y^*_{n1}$ being centered on $\beta^\top x_{n1} + \eta_n$ and $y^*_{n2}$ on $\beta^\top x_{n2} + \eta_n$. Because of the truncation, the two distributions conditional on the observation being in the sample ($y_{nt} > 0$), on the values of the covariates ($x_{n1}, x_{n2}$), and on the individual effect ($\eta_n$) differ more substantially. Denote $\Delta x_n = x_{n2} - x_{n1}$. If $\beta^\top \Delta x_n > 0$ (Figure 9.4a), the truncated part of the distribution of $y^*_{n1}$ is larger than that of $y^*_{n2}$. However, identical distributions can be obtained by truncating $y^*_{n2}$ not at 0 (which is the selection rule of the sample) but at $\beta^\top \Delta x_n$. In the case where $\beta^\top \Delta x_n < 0$ (Figure 9.4b), $y^*_{n1}$ is similarly truncated at $-\beta^\top \Delta x_n$.

We then obtain two identical conditional distributions for:

  • $y_{n1}$ and $y_{n2} - \beta^\top \Delta x_n$ in the case when $\beta^\top \Delta x_n > 0$,
  • $y_{n1} + \beta^\top \Delta x_n$ and $y_{n2}$ in the case when $\beta^\top \Delta x_n < 0$.

More generally, the observations that should be removed to restore symmetry are those for which images or images. This situation is depicted in Figure 9.5. When images (9.5a), the joint distribution of images is symmetric around the images line which is the images line with intercept images. Truncating at images and images, we obtain two symmetric zones images and images. The probability of having images in zones images or images is the same. This result leads to a first‐moment condition:

Moreover, by symmetry, in Figure 9.5a:

  • the vertical distance between images in zone images on the images line is images,
  • the horizontal distance between images in images and the images line is images,

which can be written as a second‐moment condition:

For a sample of size images, truncated as previously described, the sample analogues of the two moment conditions (9.16) and (9.17) are:

Equations (9.18) and (9.19) are, respectively, the first‐order conditions of the least absolute deviations (LAD) and of the least squares estimators. These first‐order conditions may be obtained by maximizing:

equation

with:

equation

If images, we obtain the trimmed least squares estimator; if images, we obtain the trimmed least absolute deviations estimator. Only the observations for which images are included in the first‐order conditions, the presence of images in the objective function excluding trivial solutions.

Figure 9.5 Symmetry of the distribution of $(y_{n1}, y_{n2})$.

9.2.5.2 Censored Sample

For the censored sample, observations for which images are available, the observation rule for images being:

equation

From Figure 9.5, we can see that not only images and images are symmetrical but also images defined by images and images defined by images.

Therefore, to restore symmetry for the censored sample, we have to get rid of the zone for which images and images (the dotted zone on Figure 9.5).

The symmetry between images and images leads to the following moment condition:

Moreover, for:

  • images in images, the vertical distance to the limit of the zone is images,
  • images in images, the horizontal distance to the limit of the zone is images

which translates into the following moment condition:

Using (9.16) and (9.20), we obtain:

and using (9.17) and (9.21), we obtain:

The sample analogues to (9.22) are the first‐order conditions of the following function:

(9.24)equation

which is the trimmed LAD estimator on the censored sample.

Finally, the sample analogues of (9.23) are the first‐order conditions of the following function:

(9.25)equation

which is the trimmed least squares estimator for the censored sample. The trimmed LAD and least squares estimators have been extended to the case where the dependent variable is two‐sided censored or truncated by Alan et al. (2013).

9.2.6 The Random Effects Model

The trimmed estimator has two useful features: on the one hand, it is robust to non‐normality and heteroscedasticity; on the other hand, it is robust to correlation between the individual effects and the covariates, the individual effects being wiped out by the first‐difference transformation. However, if the errors are normal and homoscedastic and if the individual effects are also normal and uncorrelated with the covariates, the maximum likelihood estimator is consistent and more efficient.

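For panel data with individual effects, the latent variable is written:

$$y^*_{nt} = \beta^\top x_{nt} + \eta_n + \nu_{nt}$$

9.2.6.1 Truncated Sample

Denoting by $\sigma_\nu$ the standard deviation of $\nu_{nt}$, the density of $y_{nt}$, for a given value of the individual effect, is:

$$f(y_{nt} \mid x_{nt}, \eta_n) = \frac{\frac{1}{\sigma_\nu} \phi\left(\frac{y_{nt} - \beta^\top x_{nt} - \eta_n}{\sigma_\nu}\right)}{\Phi\left(\frac{\beta^\top x_{nt} + \eta_n}{\sigma_\nu}\right)}$$

The joint density of $y_n = (y_{n1}, \ldots, y_{nT})$ is, assuming the independence of the errors:

$$f(y_n \mid x_n, \eta_n) = \prod_{t=1}^{T} f(y_{nt} \mid x_{nt}, \eta_n) \qquad (9.26)$$

Assuming that the distribution of individual effects is normal with a standard deviation equal to $\sigma_\eta$, the unconditional joint density is obtained by integrating (9.26) over the individual effects:

$$f(y_n \mid x_n) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi} \sigma_\eta} e^{-\frac{1}{2} \left(\frac{\eta}{\sigma_\eta}\right)^2} f(y_n \mid x_n, \eta) \, d\eta \qquad (9.27)$$

Using the change of variable $v = \eta / (\sqrt{2} \sigma_\eta)$, we obtain:

$$f(y_n \mid x_n) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-v^2} f(y_n \mid x_n, \sqrt{2} \sigma_\eta v) \, dv \qquad (9.28)$$

which can be approximated by the Gauss‐Hermite quadrature method, with nodes $v_r$ and weights $w_r$:

$$f(y_n \mid x_n) \approx \frac{1}{\sqrt{\pi}} \sum_{r=1}^{R} w_r \, f(y_n \mid x_n, \sqrt{2} \sigma_\eta v_r) \qquad (9.29)$$

The log‐likelihood function for the truncated model is then simply obtained by summing the logarithms of (9.29) for all individuals:

$$\ln L = \sum_{n=1}^{N} \ln \left[ \frac{1}{\sqrt{\pi}} \sum_{r=1}^{R} w_r \, f(y_n \mid x_n, \sqrt{2} \sigma_\eta v_r) \right] \qquad (9.30)$$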

9.2.6.2 Censored Sample

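In this case, the conditional distribution of $y_{nt}$ is either given by a probability or by a density, where $\sigma_\nu$ is the standard deviation of the idiosyncratic error:

$$\mathrm{P}(y_{nt} = 0 \mid x_{nt}, \eta_n) = 1 - \Phi\left(\frac{\beta^\top x_{nt} + \eta_n}{\sigma_\nu}\right), \qquad f(y_{nt} \mid x_{nt}, \eta_n) = \frac{1}{\sigma_\nu} \phi\left(\frac{y_{nt} - \beta^\top x_{nt} - \eta_n}{\sigma_\nu}\right) \ \text{for } y_{nt} > 0$$

Using a reasoning similar to that of the truncated model, individual $n$ contributes to the likelihood with a product of probabilities and/or densities, integrated over the normal distribution of the individual effects (with standard deviation $\sigma_\eta$) and approximated by Gauss‐Hermite quadrature with nodes $v_r$ and weights $w_r$. Denoting $\eta_r = \sqrt{2} \sigma_\eta v_r$:

$$L_n \approx \frac{1}{\sqrt{\pi}} \sum_{r=1}^{R} w_r \prod_{t=1}^{T} \left[ 1 - \Phi\left(\frac{\beta^\top x_{nt} + \eta_r}{\sigma_\nu}\right) \right]^{\mathbf{1}(y_{nt} = 0)} \left[ \frac{1}{\sigma_\nu} \phi\left(\frac{y_{nt} - \beta^\top x_{nt} - \eta_r}{\sigma_\nu}\right) \right]^{\mathbf{1}(y_{nt} > 0)} \qquad (9.31)$$

The log‐likelihood function for the censored sample is obtained by summing over all the individuals the logarithm of (9.31):

$$\ln L = \sum_{n=1}^{N} \ln L_n \qquad (9.32)$$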

9.3 Count Data

We now consider the case where the response is a count. We will first briefly review the estimation of count data models in a cross‐sectional context, and then we will describe specific estimators for panel data.

9.3.1 Introduction

The two most widely used models when the response is a count are the Poisson and the NegBin models.

9.3.1.1 The Poisson Model

We first suppose that the response follows a Poisson distribution of parameter $\lambda$ (which is both the mean and the variance of the variable). Under this distributional assumption, the probability of observing a value $y$ is:

$$\mathrm{P}(y) = \frac{e^{-\lambda} \lambda^{y}}{y!}$$

Using the logarithmic link, the Poisson parameter is the exponential of the linear predictor:

$$\lambda_n = e^{\beta^\top x_n}$$

which leads to the following probability for observation $n$:

$$\mathrm{P}(y_n \mid x_n) = \frac{e^{-e^{\beta^\top x_n}} \, e^{y_n \beta^\top x_n}}{y_n!}$$

Taking the logarithm of this probability and summing over all individuals, we obtain the following log‐likelihood function:

$$\ln L = \sum_{n=1}^{N} \left[ -e^{\beta^\top x_n} + y_n \beta^\top x_n - \ln y_n! \right]$$

9.3.1.2 The NegBin Model

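Count data often exhibit overdispersion, i.e., the variance is greater than the mean. In this case, the NegBin model is more appropriate than the Poisson model.

Suppose that $y_n$ is a random variable that follows a Poisson distribution of parameter $\alpha_n \lambda_n$ (with $\lambda_n = e^{\beta^\top x_n}$ in the case of a logarithmic link), $\alpha_n$ being a random variable.

The conditional probability of $y_n$ is:

$$\mathrm{P}(y_n \mid x_n, \alpha_n) = \frac{e^{-\alpha_n \lambda_n} (\alpha_n \lambda_n)^{y_n}}{y_n!}$$

Let us now suppose that $\alpha_n$ follows a gamma distribution. If the linear predictor $\beta^\top x_n$ contains an intercept, the mean of $\alpha_n$ is not identified, and therefore a one‐parameter distribution, which imposes a unit mean, is chosen.

$$g(\alpha) = \frac{\delta^\delta}{\Gamma(\delta)} \alpha^{\delta - 1} e^{-\delta \alpha}$$

Integrating out this conditional probability using the density of $\alpha_n$, we obtain:

$$\mathrm{P}(y_n \mid x_n) = \int_{0}^{+\infty} \frac{e^{-\alpha \lambda_n} (\alpha \lambda_n)^{y_n}}{y_n!} \, g(\alpha) \, d\alpha$$
$$\mathrm{P}(y_n \mid x_n) = \frac{\Gamma(y_n + \delta)}{\Gamma(y_n + 1) \Gamma(\delta)} \left( \frac{\lambda_n}{\lambda_n + \delta} \right)^{y_n} \left( \frac{\delta}{\lambda_n + \delta} \right)^{\delta}$$

To understand the meaning of $\delta$, the first two moments of $y_n$ are computed. For a given value of $\alpha_n$, we have, as for the Poisson model: $\mathrm{E}(y_n \mid x_n, \alpha_n) = \mathrm{V}(y_n \mid x_n, \alpha_n) = \alpha_n \lambda_n$. The unconditional mean is $\mathrm{E}(y_n \mid x_n) = \lambda_n$, because the expected value of $\alpha_n$ equals 1.

To compute the unconditional variance, the variance decomposition formula is applied, using $\mathrm{V}(\alpha_n) = 1 / \delta$:

$$\mathrm{V}(y_n \mid x_n) = \mathrm{E}_\alpha\left[\mathrm{V}(y_n \mid x_n, \alpha_n)\right] + \mathrm{V}_\alpha\left[\mathrm{E}(y_n \mid x_n, \alpha_n)\right] = \lambda_n + \frac{\lambda_n^2}{\delta}$$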

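A general formula for $\delta$, where $\nu$ is a positive parameter, is:

$$\delta_n = \frac{\lambda_n^{2 - k}}{\nu}$$

For $k = 1$, we get the NegBin1 model, with $\delta_n = \lambda_n / \nu$ and $\mathrm{V}(y_n \mid x_n) = (1 + \nu) \lambda_n$. In this case, the variance is proportional to the mean.

For $k = 2$, we obtain the NegBin2 model, with $\delta_n = 1 / \nu$ and $\mathrm{V}(y_n \mid x_n) = \lambda_n + \nu \lambda_n^2$; here, the variance is a quadratic function of the mean.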
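The overdispersion implied by the gamma mixture can be checked numerically. The Python sketch below (standard library only) assumes the parametrization in which the gamma effect has unit mean and shape `delta`, so that its variance is `1 / delta`; under that assumption the mean of the count is `lam` and its variance is `lam + lam**2 / delta`. Names and values are illustrative.

```python
import math

def negbin_pmf(y, lam, delta):
    """Poisson-gamma mixture with a unit-mean gamma effect of shape delta."""
    log_p = (math.lgamma(y + delta) - math.lgamma(y + 1) - math.lgamma(delta)
             + y * math.log(lam / (lam + delta))
             + delta * math.log(delta / (lam + delta)))
    return math.exp(log_p)

def moments(lam, delta, y_max=500):
    """Mean and variance computed by direct summation of the probabilities."""
    probs = [negbin_pmf(y, lam, delta) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

lam, delta = 3.0, 2.0               # hypothetical values
mean, var = moments(lam, delta)
implied_var = lam + lam ** 2 / delta
```

As `delta` grows, the gamma effect concentrates around 1 and the variance approaches the Poisson value `lam`.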

9.3.2 Fixed Effects Model

Fixed effects Poisson and NegBin models were proposed by Hausman et al. (1984).

9.3.2.1 The Poisson Model

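The fixed effects Poisson model is special, as it doesn't suffer from the incidental parameter problem and can therefore be obtained either by estimating the individual effects or by using a sufficient statistic.

In a panel context, the Poisson parameter for individual $n$ in period $t$ is written as the product:

$$\alpha_n \lambda_{nt}$$

which means that the individual effect is multiplicative. For a given value of the individual effect, the probability of observing $y_{nt}$ is:

$$\mathrm{P}(y_{nt} \mid x_{nt}, \alpha_n) = \frac{e^{-\alpha_n \lambda_{nt}} (\alpha_n \lambda_{nt})^{y_{nt}}}{y_{nt}!}$$

Let $Y_n = \sum_t y_{nt}$ be the sum of all the values of the response for individual $n$ and $\Lambda_n = \sum_t \lambda_{nt}$ the sum of the Poisson parameters. A sum of independent Poisson variables follows a Poisson distribution with parameter equal to the sum of the parameters of the summed variables. We therefore have:

$$\mathrm{P}(Y_n \mid x_n, \alpha_n) = \frac{e^{-\alpha_n \Lambda_n} (\alpha_n \Lambda_n)^{Y_n}}{Y_n!} \qquad (9.33)$$

Let $y_n$ be the vector of values of $y_{nt}$ for individual $n$. We then have:

$$\mathrm{P}(y_n \mid x_n, \alpha_n) = \prod_{t=1}^{T} \frac{e^{-\alpha_n \lambda_{nt}} (\alpha_n \lambda_{nt})^{y_{nt}}}{y_{nt}!} = e^{-\alpha_n \Lambda_n} \alpha_n^{Y_n} \prod_{t=1}^{T} \frac{\lambda_{nt}^{y_{nt}}}{y_{nt}!} \qquad (9.34)$$

Applying Bayes' theorem, we obtain:

$$\mathrm{P}(y_n \mid x_n, \alpha_n) = \mathrm{P}(y_n \mid x_n, \alpha_n, Y_n) \, \mathrm{P}(Y_n \mid x_n, \alpha_n)$$

i.e., the joint probability of the components of $y_n$ is the product of the conditional probability of $y_n$ given $Y_n$ and the marginal probability of $Y_n$. This conditional probability is:

$$\mathrm{P}(y_n \mid x_n, \alpha_n, Y_n) = \frac{\mathrm{P}(y_n \mid x_n, \alpha_n)}{\mathrm{P}(Y_n \mid x_n, \alpha_n)}$$

which implies:

$$\mathrm{P}(y_n \mid x_n, Y_n) = \frac{Y_n!}{\Lambda_n^{Y_n}} \prod_{t=1}^{T} \frac{\lambda_{nt}^{y_{nt}}}{y_{nt}!} \qquad (9.35)$$

As for the logit model, $Y_n$ is a sufficient statistic, which means that conditioning on it gets rid of the individual effects. Taking the logarithm of this expression and summing over all individuals, we obtain the within Poisson model:

$$\ln L = \sum_{n=1}^{N} \left[ \ln Y_n! - Y_n \ln \Lambda_n + \sum_{t=1}^{T} \left( y_{nt} \ln \lambda_{nt} - \ln y_{nt}! \right) \right] \qquad (9.36)$$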

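or, with $Y_n = \sum_t y_{nt}$ and $\Lambda_n = \sum_t \lambda_{nt}$:

$$\ln L = \sum_{n=1}^{N} \left[ \ln Y_n! - \sum_{t=1}^{T} \ln y_{nt}! + \sum_{t=1}^{T} y_{nt} \ln \frac{\lambda_{nt}}{\Lambda_n} \right] \qquad (9.37)$$

As stated previously, the Poisson model is not affected by the incidental parameter problem, as the same estimator may be obtained by estimating the individual effects. To show this result, we take the logarithm of the joint probability for the $T$ observations of $y_{nt}$ for individual $n$ (equation 9.34), in order to obtain the log‐likelihood function:

$$\ln L_n = -\alpha_n \Lambda_n + Y_n \ln \alpha_n + \sum_{t=1}^{T} \left( y_{nt} \ln \lambda_{nt} - \ln y_{nt}! \right) \qquad (9.38)$$

The first‐order condition for $\alpha_n$ to maximize the log‐likelihood function is:

$$\frac{\partial \ln L_n}{\partial \alpha_n} = -\Lambda_n + \frac{Y_n}{\alpha_n} = 0$$

which implies that: $\hat{\alpha}_n = Y_n / \Lambda_n$.

Introducing this expression in (9.38) and summing over all $n$, we obtain the concentrated log‐likelihood function:

$$\ln L = \sum_{n=1}^{N} \left[ -Y_n + Y_n \ln Y_n - Y_n \ln \Lambda_n + \sum_{t=1}^{T} \left( y_{nt} \ln \lambda_{nt} - \ln y_{nt}! \right) \right] \qquad (9.39)$$

The two log‐likelihood functions (9.37) and (9.39) differ only by terms that do not depend on $\beta$ (the two likelihoods are proportional); they therefore lead to the same estimator of $\beta$. Moreover, if a logarithmic link is chosen, we have: $\lambda_{nt} = e^{\beta^\top x_{nt}}$. The likelihood is in this case proportional to:

$$\prod_{n=1}^{N} \prod_{t=1}^{T} \left( \frac{e^{\beta^\top x_{nt}}}{\sum_{s=1}^{T} e^{\beta^\top x_{ns}}} \right)^{y_{nt}}$$

which is similar to the likelihood of a multinomial logit model for which $N$ individuals must choose one among $T$ mutually exclusive alternatives. The difference is that in this latter model $y_{nt}$ is either equal to 0 or to 1, and $\sum_t y_{nt} = 1$, whereas in our context each $y_{nt}$ is a non‐negative integer.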
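The equivalence between the two routes can be verified numerically. The following Python sketch (standard library only) assumes a logarithmic link and a multiplicative individual effect; it computes the log‐likelihood conditional on each individual's total count and the log‐likelihood concentrated with respect to the individual effects, and checks that their difference does not depend on the coefficient. Function names and the toy panel are hypothetical.

```python
import math

def lam(beta, x):
    """Poisson parameter exp(beta'x) under the logarithmic link."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))

def conditional_loglik(beta, panel):
    """Log-likelihood conditional on each individual's total count."""
    total = 0.0
    for X_n, y_n in panel:
        lams = [lam(beta, x) for x in X_n]
        big_lam, big_y = sum(lams), sum(y_n)
        total += math.lgamma(big_y + 1)                       # ln Y_n!
        for l, y in zip(lams, y_n):
            total += y * math.log(l / big_lam) - math.lgamma(y + 1)
    return total

def concentrated_loglik(beta, panel):
    """Poisson log-likelihood with each effect replaced by Y_n / Lambda_n."""
    total = 0.0
    for X_n, y_n in panel:
        lams = [lam(beta, x) for x in X_n]
        alpha = sum(y_n) / sum(lams)                          # requires Y_n > 0
        for l, y in zip(lams, y_n):
            total += -alpha * l + y * math.log(alpha * l) - math.lgamma(y + 1)
    return total

# Hypothetical panel: two individuals, three periods, one covariate, no intercept.
panel = [([(0.2,), (1.0,), (-0.4,)], [1, 3, 0]),
         ([(0.5,), (0.1,), (0.9,)], [2, 0, 4])]

gap_a = conditional_loglik((0.3,), panel) - concentrated_loglik((0.3,), panel)
gap_b = conditional_loglik((-1.1,), panel) - concentrated_loglik((-1.1,), panel)
```

Since the gap between the two functions is the same constant for any coefficient value, both are maximized by the same estimate.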

9.3.2.2 The NegBin Model

Hausman et al. (1984) also propose a fixed effects NegBin model. We simply present below, without proof, the joint probability for individual $n$:

(9.40)equation

9.3.3 Random Effects Models

9.3.3.1 The Poisson Model

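Hausman et al. (1984) also proposed a between and a random effects Poisson model, obtained by integrating out the individual effects in the relevant probabilities ((9.33) and (9.34) respectively). A gamma distribution hypothesis is made for the individual effects, with the following density:

$$f(\alpha) = \frac{b^a}{\Gamma(a)} \alpha^{a - 1} e^{-b \alpha}$$

with

$$\Gamma(z) = \int_{0}^{+\infty} t^{z - 1} e^{-t} \, dt$$

the gamma function. The expected value and the variance of $\alpha$ are respectively:

$$\mathrm{E}(\alpha) = \frac{a}{b}, \qquad \mathrm{V}(\alpha) = \frac{a}{b^2}$$

If the model contains an intercept, the expected value is not identified and we can then suppose, without restriction, that it is equal to 1, which implies $a = b$. We then obtain a gamma distribution with one parameter (denoted $\delta$):

$$f(\alpha) = \frac{\delta^\delta}{\Gamma(\delta)} \alpha^{\delta - 1} e^{-\delta \alpha}$$

Integrating out the conditional probabilities (9.33) and (9.34), we obtain the unconditional probabilities for the between and the random effects models, where $Y_n = \sum_t y_{nt}$ and $\Lambda_n = \sum_t \lambda_{nt}$:

$$\mathrm{P}(Y_n \mid x_n) = \frac{\Gamma(Y_n + \delta)}{\Gamma(Y_n + 1) \Gamma(\delta)} \left( \frac{\Lambda_n}{\Lambda_n + \delta} \right)^{Y_n} \left( \frac{\delta}{\Lambda_n + \delta} \right)^{\delta}$$
$$\mathrm{P}(y_n \mid x_n) = \left[ \prod_{t=1}^{T} \frac{\lambda_{nt}^{y_{nt}}}{y_{nt}!} \right] \frac{\Gamma(Y_n + \delta)}{\Gamma(\delta)} \frac{\delta^\delta}{(\Lambda_n + \delta)^{Y_n + \delta}}$$

which leads to the following log‐likelihood functions:

$$\ln L = \sum_{n=1}^{N} \left[ \ln \Gamma(Y_n + \delta) - \ln \Gamma(Y_n + 1) - \ln \Gamma(\delta) + Y_n \ln \Lambda_n + \delta \ln \delta - (Y_n + \delta) \ln (\Lambda_n + \delta) \right] \qquad (9.41)$$
$$\ln L = \sum_{n=1}^{N} \left[ \sum_{t=1}^{T} \left( y_{nt} \ln \lambda_{nt} - \ln y_{nt}! \right) + \ln \Gamma(Y_n + \delta) - \ln \Gamma(\delta) + \delta \ln \delta - (Y_n + \delta) \ln (\Lambda_n + \delta) \right] \qquad (9.42)$$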
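As a sanity check on the random effects probability, the Python sketch below (standard library only) evaluates the closed form obtained when a unit‐mean gamma effect of shape `delta` is integrated out of the product of Poisson probabilities, and verifies that the probabilities of all count vectors add up to one. The parametrization, function name, and values are assumptions made for the illustration.

```python
import math

def re_poisson_prob(y_n, lams, delta):
    """P(y_n | x_n) after integrating out a unit-mean gamma effect of shape delta."""
    big_y, big_lam = sum(y_n), sum(lams)
    # Product over periods of lambda_nt^y_nt / y_nt!, on the log scale.
    log_p = sum(y * math.log(l) - math.lgamma(y + 1) for y, l in zip(y_n, lams))
    # Term coming from the integral over the gamma density.
    log_p += (math.lgamma(big_y + delta) - math.lgamma(delta)
              + delta * math.log(delta)
              - (big_y + delta) * math.log(big_lam + delta))
    return math.exp(log_p)

# Hypothetical individual with T = 2 periods.
lams, delta = (1.5, 0.8), 2.0
total = sum(re_poisson_prob((y1, y2), lams, delta)
            for y1 in range(80) for y2 in range(80))
```

The sum is truncated at 79 for each period, which is harmless here because the tail probabilities are negligible for these parameter values.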

9.3.3.2 The NegBin Model

In addition to the Poisson model, Hausman et al. (1984) also proposed between and random effects NegBin models. We simply present below, without proof, the joint probability for individual $n$.

(9.43)equation
(9.44)equation

9.4 More Empirical Examples

Charness and Villeval (2009) investigate the difference in behavior between senior and junior workers in terms of risk aversion, competition, and cooperation. They conduct an experiment in which every participant can invest his or her initial endowment in a public good game; the amount invested is the response of their econometric analysis. This variable is both left‐censored (null contribution) and right‐censored (contribution of the full endowment). As the participants are observed during 16 periods, they use a random effects tobit model. The Seniors data set is available in package pder.

Michalopoulos and Papaioannou (2016) explore the consequences of ethnic partitioning, which is one aspect of the “scramble for Africa” during which European countries partitioned Africa without caring much about the boundaries of ethnic groups. Their pseudo‐panel consists of 825 ethnic groups belonging to 49 countries. The authors estimate a Negbin model, where the response is the number of conflicts in an ethnicity‐country homeland, the major covariate being a dummy for partitioned ethnic areas. They introduce country fixed effects and estimate a specification where the fixed effects are estimated and not wiped out using a sufficient statistic. Partitioned ethnicities experience an increase of 57% in political violence compared to other areas. The data are available in package pder as ScrambleAfrica.

Bardhan and Mookherjee (2010) analyze the political determinants of land reform in West Bengal, India. More specifically, they use yearly data on 89 villages for the 1978‐1998 period. The response is the percentage of land or of households affected by land reform; it is highly censored, as it is 0 for more than 80% of the sample. The main covariate is the presence of a left‐wing coalition at the head of the local government. The authors use the trimmed least absolute deviation estimator of Honoré (1992) and do not find any significant effect of the left‐wing government variable on the strength of land reform. The LandReform data are available in package pder.

Brandts and Cooper (2006) analyze how financial incentives can be used to overcome a history of coordination failure. For this purpose, they conduct an experiment where “firms,” composed of four “employees,” have an output that is related to the lowest level of effort implemented by the employees. The individual or the lowest firm level effort is the response and, as the same employees/firms are observed during 30 different rounds, panel data techniques are used. The level of effort being ordinal, the authors use ordered probit models, with firm random effects when the analysis is at the firm level and nested random effects when it is at the employee level. Their dataset is available in the pder package as CoordFailure.

Farber et al. (2016) conducted an audit study to analyze the determinants of callbacks to job applications. They sent four fake resumes for 1,118 job openings and the response is a dummy indicating a callback, the covariates being the unemployment spell duration, the age, and the fact that the worker has held a low level interim job. They estimate a random effects and a conditional logit model with job opening effects. The Callbacks data are to be found in the pder package.

Bazzi (2017) investigates the influence of income on migration at the household and at the village level. In the first case, the response is a dummy that indicates whether a person in the household migrated during the given year; in the second case, it is the percentage of the population of the village that has migrated. The main covariates are rainfall, rice price shock, and wealth at the household level and, at the village level, indicators of the shape of the wealth distribution. The author uses a conditional logit at the household level and the two‐sided trimmed least absolute deviations estimator (see Alan et al., 2013) at the village level. The IncomeMigrationV (village level) and IncomeMigrationH (household level) datasets are also included in the pder package.

Vella and Verbeek (1998) estimate the union premium for young men. In a first step, they estimate a dynamic random effects probit model for union membership. The UnionWage dataset is available in the pglm package.

Hausman et al. (1986) and Cincera (1997) study the dynamic relationship between patents and R&D using yearly panels of firms. They fit different count data models, including conditional Poisson and Negbin models. The data sets they used are available as PatentsRDUS and PatentsRD in the pglm package.
