4.2. Poisson Models for Count Data with Two Observations Per Individual

When there are only two observations per individual, we saw previously that a linear or logistic fixed effects analysis could be done using simplified methods with conventional software. This is also true for count data. In fact, a fixed effects Poisson regression model can be estimated with an ordinary logistic regression program.

For the patent data, let's ignore the intervening years and focus only on 1975 and 1979. Let yi1 be the patent count for firm i in 1975 and yi2 the patent count in 1979. We assume that each of these variables has a Poisson distribution with parameter λit. That is, the probability that yit = r is given by

   Pr(yit = r) = λit^r exp(−λit) / r!,    r = 0, 1, 2, . . .     (4.1)

Why a Poisson distribution? Well, the Poisson distribution is perhaps the simplest probability distribution that is appropriate for count data. It can be derived from a stochastic process in which it is assumed that events (in this case patents) cannot occur simultaneously, and that events are independent (Cameron and Trivedi 1998). That is, the occurrence of an event neither raises nor lowers the probability of future events. Note that we are not assuming that there is a single Poisson distribution for the entire sample. Instead, each firm's patent count is drawn from a different Poisson distribution whose parameter λit varies across both firms and time.

An important property of the Poisson distribution is that its mean and variance are equal, and both are equal to the Poisson parameter:

   E(yit) = var(yit) = λit     (4.2)

Next, we let λit be a log-linear function of the predictor variables:

   log λit = μt + βxit + γzi + αi     (4.3)

As in earlier chapters, xit represents the time-varying predictor variables, zi denotes the time-invariant predictors, and αi denotes the unobserved fixed effects. The vector xit includes the logged R & D expenditures in the current year t and in each of the five preceding years.

Our goal is to estimate the parameters in equation (4.3). To do this, we shall use conditional maximum likelihood, the same method used in chapter 3 to estimate the fixed effects logistic regression model. Consider the distribution of yi2 conditional on the total event count for the two time periods combined, denoted by wi = yi1 + yi2. It can be shown that yi2|wi ~ B (pi, wi).

That is, conditional on the total count wi, the 1979 count yi2 has a binomial distribution with parameters pi and wi, where

   pi = λi2 / (λi1 + λi2)     (4.4)

It follows, after a bit of algebra, that

   log[pi / (1 − pi)] = (μ2 − μ1) + β(xi2 − xi1)     (4.5)

Thus, we have converted our Poisson regression model into a logistic regression model in which the predictor variables are difference scores for the original predictors. Note that, as in earlier applications, both αi and γzi drop out of equation (4.5).
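
The "bit of algebra" is brief enough to sketch here. This derivation is not spelled out in the original text, but it uses only the standard assumption that, given the λit, yi1 and yi2 are independent Poisson variables, so their sum wi is itself Poisson with parameter λi1 + λi2. Then

   Pr(yi2 = r | wi = w) = Pr(yi2 = r) Pr(yi1 = w − r) / Pr(wi = w)
                        = [λi2^r e^−λi2 / r!] [λi1^(w−r) e^−λi1 / (w − r)!] / [(λi1 + λi2)^w e^−(λi1+λi2) / w!]
                        = [w! / (r! (w − r)!)] pi^r (1 − pi)^(w−r),

which is the binomial probability with pi as in equation (4.4). Taking the logit of pi and substituting equation (4.3) for log λi1 and log λi2 gives

   log[pi / (1 − pi)] = log(λi2 / λi1) = (μ2 − μ1) + β(xi2 − xi1) + γ(zi − zi) + (αi − αi),

which reduces to equation (4.5) because the zi and αi terms cancel, as noted above.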

To implement this conditional approach in SAS, we may use any SAS procedure that does logistic regression for grouped data, including LOGISTIC, GENMOD, CATMOD and PROBIT. Here's how to do it in GENMOD. First, we create a new data set that contains the total count for each firm and the difference scores for the research and development variables:

DATA patents;
   SET my.patents;
   total=pat75+pat79;
   rd_0=logr79-logr75;
   rd_1=logr78-logr74;
   rd_2=logr77-logr73;
   rd_3=logr76-logr72;
   rd_4=logr75-logr71;
   rd_5=logr74-logr70;
RUN;

RD_0 is the difference score for the same years in which the patents were counted, whereas RD_1 through RD_5 are difference scores for lags of one to five years.

Let's first estimate a model with no covariates:

PROC GENMOD DATA=patents;
   MODEL pat79/total = / DIST=B;
RUN;

Note that the dependent variable is expressed with the events/trials syntax, which tells SAS that PAT79 events occurred out of a possible TOTAL. As in chapter 3, DIST=B specifies that PAT79 has a binomial distribution whose default link function is logit (i.e., logistic). Results are shown in Output 4.1.
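
As an aside, the same intercept-only model could presumably be fit with any of the other procedures mentioned earlier. For example, PROC LOGISTIC accepts the identical events/trials syntax; the call below is my own illustration, not taken from the original text:

PROC LOGISTIC DATA=patents;
   MODEL pat79/total = ;   /* intercept-only model, events/trials syntax */
RUN;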

Output 4.1. Conditional Poisson Regression Model for Patents, with No Covariates

Model Information
  Data Set                       WORK.PATENTS
  Distribution                   Binomial
  Link Function                  Logit
  Response Variable (Events)     pat79
  Response Variable (Trials)     total
  Number of Observations Read    346
  Number of Observations Used    300
  Number of Events               11107
  Number of Trials               23865
  Number of Invalid Responses    46

Criteria For Assessing Goodness Of Fit
  Criterion              DF         Value    Value/DF
  Deviance              299     1001.3656      3.3490
  Scaled Deviance       299     1001.3656      3.3490
  Pearson Chi-Square    299      938.2458      3.1379
  Scaled Pearson X2     299      938.2458      3.1379
  Log Likelihood               −16484.8031

Analysis Of Parameter Estimates
  Parameter   DF   Estimate   Standard    Wald 95% Confidence    Chi-Square   Pr > ChiSq
                                 Error           Limits
  Intercept    1    −0.1386      0.0130    −0.1640    −0.1131        114.03       <.0001
  Scale        0     1.0000      0.0000     1.0000     1.0000

Under "Model Information" we see that 46 firms had invalid response values. These are firms that had 0 patents in both 1975 and 1979, so their total for the two years was also 0. Of course, the binomial distribution is undefined when the number of trials is 0, which is why these firms are excluded. This points out a more general characteristic of Poisson regression that extends to the next section as well. Whenever you condition on the total count, those cases that have a total count of 0 are effectively removed from the likelihood function. If the total is 0, then each component must also be 0, leaving no within-individual variability to analyze.

In the next panel, "Criteria For Assessing Goodness Of Fit," we see that both the deviance and the Pearson chi-square statistics are more than three times their degrees of freedom. For a model with a good fit, these statistics should be close to their degrees of freedom. However, because many of the expected counts generated by this model are small (near 0), the chi-square distribution may not be a good approximation, so it's probably not a good idea to compute a p-value. Nevertheless, the magnitude of these ratios suggests a problem with overdispersion, about which I'll have more to say as the chapter progresses.

Finally, we get to the "Analysis Of Parameter Estimates." The only estimate is the intercept, with a value of −0.1386. What does this tell us? Well, if m1 is the mean number of patents in year 1 and m2 is the mean for year 2, the intercept is log(m2/m1). If the mean number of patents were exactly the same in both years, the intercept would be 0. The fact that it's negative indicates that the mean went down over time. More specifically, if we calculate 100[exp(−0.1386) − 1] = −12.9%, we get the percentage change in the mean from 1975 to 1979, a decrease of about 12.9%. Furthermore, because the chi-square for the intercept is so large, we can reject the null hypothesis that the means for the two years are the same. In effect, what we have here is the count-data analog of the paired-comparisons t-test discussed in chapter 1, or the McNemar test for dichotomous variables discussed in chapter 3.
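
As a quick arithmetic check (my own illustration, not part of the original example), the intercept can be reproduced directly from the event and trial counts in Output 4.1: the number of events is the 1979 total, and the number of trials minus the number of events is the 1975 total.

DATA _NULL_;
   pat79_sum = 11107;                         /* events in Output 4.1 (1979 patents)  */
   pat75_sum = 23865 - 11107;                 /* trials minus events (1975 patents)   */
   intercept = LOG(pat79_sum / pat75_sum);    /* log(m2/m1), about -0.1386            */
   pct_change = 100 * (EXP(intercept) - 1);   /* about -12.9 percent                  */
   PUT intercept= pct_change=;
RUN;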

Now let's add the lagged variables for research and development expenditures as covariates, with results shown in Output 4.2. Again we see that the ratio of the goodness-of-fit chi-square statistics to their degrees of freedom is above 3, suggesting that we really need to do something about overdispersion. But let's postpone that issue for a moment. Examining the parameter estimates and their associated statistics, we see that RD_0, the contemporaneous measure of research and development expenditures, has a highly significant effect on the patent count, with a coefficient of .5214. To interpret this, keep in mind that both the dependent variable (expected number of patents) and the independent variable (research and development expenditures) are logged (see equation (4.3)). In that case, we can say that a 1% increase in R & D expenditures is associated with a .52% increase in the expected number of patents in the same year, controlling for the lagged R & D measures. The effects of the lagged measures are much smaller.
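
The original text does not repeat the GENMOD call for this model; presumably it is the earlier call with the difference scores added to the MODEL statement, along these lines:

PROC GENMOD DATA=patents;
   MODEL pat79/total = rd_0-rd_5 / DIST=B;   /* difference scores for current and lagged R & D */
RUN;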

Output 4.2. Conditional Poisson Regression Model for Patents, with Covariates

Criteria For Assessing Goodness Of Fit
  Criterion              DF        Value    Value/DF
  Deviance              293     949.3031      3.2399
  Scaled Deviance       293     949.3031      3.2399
  Pearson Chi-Square    293     890.2903      3.0385
  Scaled Pearson X2     293     890.2903      3.0385
  Log Likelihood              −16458.7718

Analysis Of Parameter Estimates
  Parameter   DF   Estimate   Standard    Wald 95% Confidence    Chi-Square   Pr > ChiSq
                                 Error           Limits
  Intercept    1    −0.2225      0.0178    −0.2573    −0.1876        156.50       <.0001
  rd_0         1     0.5214      0.0844     0.3561     0.6868         38.19       <.0001
  rd_1         1    −0.2067      0.1129    −0.4280     0.0146          3.35       0.0671
  rd_2         1    −0.1179      0.1110    −0.3355     0.0996          1.13       0.2880
  rd_3         1     0.0601      0.0958    −0.1277     0.2478          0.39       0.5305
  rd_4         1     0.1806      0.0900     0.0042     0.3569          4.03       0.0448
  rd_5         1    −0.0932      0.0690    −0.2284     0.0420          1.83       0.1765
  Scale        0     1.0000      0.0000     1.0000     1.0000
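
Before turning to overdispersion, here is a quick numerical illustration of the elasticity interpretation given above (my own check, not in the original text). Because equation (4.3) is linear in logged R & D, multiplying R & D expenditures by 1.01 multiplies the expected patent count by 1.01 raised to the power of the coefficient:

DATA _NULL_;
   factor = 1.01 ** 0.5214;     /* coefficient of rd_0 from Output 4.2                  */
   pct = 100 * (factor - 1);    /* about 0.52 percent more expected patents             */
   PUT factor= pct=;
RUN;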

Now let's attend to the overdispersion problem. The big danger with overdispersion is that the standard errors may be underestimated, leading to chi-squares that are too large and p-values that are too low. There are several possible solutions to this problem, one of which is to formulate and estimate a model that directly builds in the overdispersion. One such model is the negative binomial model that will be discussed later in this chapter. But a simpler, though less elegant, approach is to correct the standard errors and chi-squares based on the goodness-of-fit ratios that alerted us to the problem. In PROC GENMOD, this is accomplished by using the DSCALE or PSCALE options on the MODEL statement. For example,

PROC GENMOD DATA=patents;
   MODEL pat79/total = rd_0-rd_5 / DIST=B DSCALE;
RUN;

The DSCALE option uses the deviance chi-square to make the adjustment, while PSCALE uses the Pearson chi-square. The adjustment is very simple: the scale factor is the square root of the ratio of the chi-square statistic to its degrees of freedom. In Output 4.3 this number is reported as the "Scale" parameter in the last line. All standard errors are multiplied by the scale factor, which in turn deflates the chi-squares and inflates the p-values, as shown in Output 4.3.

Output 4.3. Conditional Poisson Regression Model with Overdispersion Adjustment

Criteria For Assessing Goodness Of Fit
  Criterion              DF        Value    Value/DF
  Deviance              293     949.3031      3.2399
  Scaled Deviance       293     293.0000      1.0000
  Pearson Chi-Square    293     890.2903      3.0385
  Scaled Pearson X2     293     274.7858      0.9378
  Log Likelihood               −5079.9585

Analysis Of Parameter Estimates
  Parameter   DF   Estimate   Standard    Wald 95% Confidence    Chi-Square   Pr > ChiSq
                                 Error           Limits
  Intercept    1    −0.2225      0.0320    −0.2852    −0.1597         48.30       <.0001
  rd_0         1     0.5214      0.1519     0.2238     0.8191         11.79       0.0006
  rd_1         1    −0.2067      0.2032    −0.6050     0.1916          1.03       0.3091
  rd_2         1    −0.1179      0.1998    −0.5095     0.2736          0.35       0.5550
  rd_3         1     0.0601      0.1724    −0.2779     0.3980          0.12       0.7275
  rd_4         1     0.1806      0.1620    −0.1369     0.4980          1.24       0.2649
  rd_5         1    −0.0932      0.1241    −0.3365     0.1501          0.56       0.4527
  Scale        0     1.8000      0.0000     1.8000     1.8000

When this is done for the patent data, we find that only RD_0 retains its statistical significance, and even for this variable the chi-square is greatly reduced. Note also that the coefficients are not modified at all by this overdispersion correction. Other approaches to overdispersion—such as estimating a negative binomial model—might produce different coefficient estimates.
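
To see exactly what the DSCALE adjustment did, you can reproduce the scale factor and the adjusted statistics for RD_0 by hand from the numbers in Outputs 4.2 and 4.3 (an illustrative check, not part of the original example):

DATA _NULL_;
   scale = SQRT(949.3031 / 293);        /* deviance / DF from Output 4.2, about 1.80  */
   se_rd0 = 0.0844 * scale;             /* unadjusted SE of rd_0 times the scale      */
   chisq_rd0 = (0.5214 / se_rd0)**2;    /* adjusted Wald chi-square, about 11.79      */
   PUT scale= se_rd0= chisq_rd0=;
RUN;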

It's also possible to include predictor variables that do not vary with time, although the interpretation of their coefficients is not always straightforward. Output 4.4, for example, shows results for a model that includes the dummy variable SCIENCE for the science sector and the variable LOGSIZE for firm size (logged), while deleting the nonsignificant lagged R & D measures. Neither variable approaches statistical significance. Their coefficients can be interpreted as measuring interactions between each variable and time. Like all interactions, these coefficients can be interpreted in two ways. For example, the coefficient of .0275 for SCIENCE represents the difference between the coefficient for SCIENCE in 1979 and the coefficient in 1975. The fact that it is far from statistically significant suggests that this variable has the same effect in both years. Alternatively, we can interpret .0275 as the increment in the effect of time for firms in the science sector, relative to those not in the science sector. Again, because it is far from significant, we may conclude that the rate of change in the number of patents from 1975 through 1979 is essentially the same for the two sectors.
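
The code for this model is not shown in the original text. Presumably it is along the following lines, with the time-invariant variables entered directly (not as difference scores, since they cannot be differenced) and an overdispersion correction retained, given that Output 4.4 reports a scale parameter of about 1.80. It assumes SCIENCE and LOGSIZE are carried into the work data set by the earlier DATA step:

PROC GENMOD DATA=patents;
   /* hypothetical call; not shown in the original text */
   MODEL pat79/total = rd_0 science logsize / DIST=B DSCALE;
RUN;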

Output 4.4. Conditional Poisson Model with Time-Invariant Covariates

Analysis Of Parameter Estimates
  Parameter   DF   Estimate   Standard    Wald 95% Confidence    Chi-Square   Pr > ChiSq
                                 Error           Limits
  Intercept    1    −0.3335      0.1100    −0.5490    −0.1180          9.20       0.0024
  rd_0         1     0.3770      0.1025     0.1761     0.5778         13.53       0.0002
  science      1     0.0275      0.0482    −0.0670     0.1219          0.33       0.5686
  logsize      1     0.0161      0.0146    −0.0125     0.0448          1.22       0.2700
  Scale        0     1.7961      0.0000     1.7961     1.7961
