4 Applications

In this section we review the substantive contributions of the DCDP literature to three main areas of labor economics: (i) labor supply (female and male), (ii) job search and (iii) human capital.

4.1 Labor supply

The literature on dynamic labor supply models can be usefully divided into that on females and males. This is because the two literatures have emphasized different aspects of behavior. A key feature of female labor supply is that a large percentage of women (particularly married women) do not work during significant portions of their life cycle. The central role of the decision of whether or not to work has made the DCDP approach more common in the study of female labor supply than in the literature on males.

The literature on women has also emphasized the relationship between participation and human capital accumulation, while tending to ignore saving. This is no accident, because, as Eckstein and Wolpin (1989b) note, it is very difficult computationally to handle participation, human capital and saving simultaneously.70 The literature has also striven to model how fertility, marriage and participation decisions interact.

In contrast, the literature on males has emphasized the continuous choice of hours of work and savings, with participation usually taken as given. Given an assumption of interior solutions, most papers on dynamics of male labor supply have worked with the first order conditions of agents’ optimization problems, rather than using the DCDP approach.71 Nevertheless, at the end of this section, we review an empirical paper on male labor supply (Imai and Keane, 2004) that adapts the DCDP approach to the case of continuous choices of labor supply and consumption.

4.1.1 Female labor supply

As we have already noted in the previous discussion, the prevalence of nonparticipation creates a problem for the analysis of labor supply decisions given that a person’s market wage rate is not usually observed for nonparticipants. The classic paper by Heckman (1974) developed a method for estimating a labor supply function (with continuous hours and nonparticipation) when wages are only observed for workers. In his framework, the labor supply function is estimated jointly with a wage offer function by maximum likelihood.

The possibility of nonparticipation raises several additional issues. First, participating in the labor market may entail a fixed time and/or money cost (Cogan, 1981). Second, nonparticipation may lead to a lack of skill appreciation. Thus, the literature on female labor supply has allowed work experience, as a measure of human capital accumulated on the job, to affect wage offers (e.g., Weiss and Gronau, 1981; Eckstein and Wolpin, 1989b). Third, the fact that there is heterogeneity in the extent to which women participate over their lifetimes raises the question of the extent to which that heterogeneity is due to permanent (unobserved) differences in preferences for work or to the influence of past work decisions on participation that arise through transitory taste shocks (Heckman and Willis, 1977). Fourth, nonparticipation implies a potentially central role for marriage and fertility decisions.

4.1.2 Mincer’s (1962) life cycle model

The earliest paper on labor supply of women to adopt formally a life cycle perspective was Mincer (1962). A married woman’s labor supply is based on the permanent income of her husband, as well as her market wage and the couple’s tastes for market work, home work and children. Given this framework, the observed variation over the life-cycle in a woman’s work hours is merely the result of the allocation of work hours to periods when market wages are high relative to the value of home time (i.e., intertemporal substitution).

Based on this framework, Mincer (1962) hypothesized that a transitory change in husband’s income, which has no significant effect on his permanent income, should have no impact on a woman’s labor supply. Mincer provided some informal evidence on this hypothesis using data from the 1950 Survey of Consumer Expenditures. Taking 6,766 married white women, he stratified them into 12 groups based on husband’s education and age and on the presence of young children. He then subdivided each group into households where the husband worked all year vs. those where the husband had a spell of unemployment. Mincer found that women had a higher participation rate if the husband had experienced an unemployment spell. Based on this evidence that women do respond to transitory changes in husband’s income, Mincer concluded that a simple life-cycle model (with perfect foresight and no constraints on borrowing) could not adequately describe the data.

Two points are worth noting. First, Mincer (1962) uses households where the husband works all year as a “control group” for similar households (in terms of education, age and children) where the husband experiences an unemployment spell, with unemployment as the “treatment.” Thus, one possible explanation for Mincer’s finding is that the treatment and control groups differ in unobserved ways, and that women in the treatment group would have worked more regardless. Second, there are alternative explanations that are consistent with a life-cycle model. For instance, depending on the stochastic process for husband’s income, unemployment shocks may induce long lived reductions in earnings. It is also possible that leisure time of the husband and wife are nonseparable in utility or that unemployed husbands may contribute to home production and/or child care. In either case, unemployment of the husband may reduce the value of home time for the wife.

4.1.3 Non-full solution methods of estimation

The modern structural literature on the estimation of life cycle models of female labor supply begins with Heckman and MaCurdy (1980).72 They adopt the utility function


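$$U_{it} = \alpha_{it} C_{it}^{\eta} + \beta_{it} (H_{\max} - h_{it})^{\gamma}, \qquad 0 < \eta < 1, \; 0 < \gamma < 1$$     (87)


where $C_{it}$ is household $i$’s consumption at $t$, $h_{it}$ the wife’s hours of work, $H_{\max}$ maximum available hours in the period, and $\alpha_{it}$ and $\beta_{it}$ are taste-shifters. Leisure is given by $L_{it} = H_{\max} - h_{it}$. Households have perfect foresight about future preferences and wages. The household maximizes its discounted flow of utility over the finite horizon, $t = 0, \ldots, T$,


$$V_i = \sum_{t=0}^{T} \frac{1}{(1+\rho)^{t}} U_{it}$$     (88)


where $\rho$ is the household’s subjective rate of time preference. The household faces the lifetime budget constraint


$$A_{i0} + \sum_{t=0}^{T} \frac{w_{it} h_{it}}{(1+r)^{t}} = \sum_{t=0}^{T} \frac{C_{it}}{(1+r)^{t}}$$     (89)


where $A_{i0}$ is the household’s initial assets, $w_{it}$ is the wife’s wage and $r$ is the (constant) rate of interest. Assuming an interior solution, the first-order conditions for all $t$ are

$$\frac{\partial U_{it}}{\partial C_{it}} = \left[\frac{1+\rho}{1+r}\right]^{t} \lambda_i$$     (90)

$$\frac{\partial U_{it}}{\partial L_{it}} = \left[\frac{1+\rho}{1+r}\right]^{t} \lambda_i w_{it}$$     (91)

where $\lambda_i$ is the marginal utility of wealth at $t = 0$. Using the utility function specification (87), (91) becomes


$$\gamma \beta_{it} L_{it}^{\gamma - 1} = \left[\frac{1+\rho}{1+r}\right]^{t} \lambda_i w_{it}$$     (92)


Taking logs and rearranging yields the Frisch demand function for leisure,


$$\ln L_{it} = \frac{1}{\gamma - 1} \left\{ \ln w_{it} + \ln \lambda_i - \ln \beta_{it} - \ln \gamma + t \ln\left[\frac{1+\rho}{1+r}\right] \right\}$$     (93)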


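To deal with corner solutions, Heckman and MaCurdy (1980) note that a woman will choose not to work if the marginal utility of leisure, evaluated at zero hours of work, exceeds the marginal value of working, that is, if

$$\left.\frac{\partial U_{it}}{\partial L_{it}}\right|_{L_{it} = H_{\max}} > \left[\frac{1+\rho}{1+r}\right]^{t} \lambda_i w_{it}$$     (94)

$$\gamma \beta_{it} H_{\max}^{\gamma - 1} > \left[\frac{1+\rho}{1+r}\right]^{t} \lambda_i w_{it}$$     (95)

Taking logs and rearranging, we can express this participation condition as a reservation wage condition, namely


$$h_{it} > 0 \quad \text{iff} \quad \ln w_{it} > \ln \gamma + \ln \beta_{it} - \ln \lambda_i - t \ln\left[\frac{1+\rho}{1+r}\right] - (1 - \gamma) \ln H_{\max}$$     (96)


Notice that if the household has a lower level of lifetime wealth, and hence a higher value of $\lambda_i$, the reservation wage is correspondingly reduced.

To obtain an estimable model, Heckman and MaCurdy (1980) assume functional forms for the taste shifter $\beta_{it}$ and for the wage offer function, namely

$$\ln \beta_{it} = Z_{it}\phi + \mu_{1i} + \varepsilon_{1it}$$     (97)

$$\ln w_{it} = X_{it}\theta + \mu_{2i} + \varepsilon_{2it}$$     (98)

where $Z_{it}$ and $X_{it}$ are vectors of observables that affect the taste for leisure and market productivity, $\mu_{1i}$ and $\mu_{2i}$ are individual permanent components of the taste for leisure and market productivity and $\varepsilon_{1it}$ and $\varepsilon_{2it}$ are respective transitory shocks. Substituting (97) and (98) into (93) and (96), we obtain reduced form equations for (i) leisure conditional on participation and (ii) the participation decision rule:

$$\ln L_{it} = f_i + \frac{1}{\gamma - 1} \left\{ X_{it}\theta - Z_{it}\phi + t \ln\left[\frac{1+\rho}{1+r}\right] \right\} + \frac{\varepsilon_{2it} - \varepsilon_{1it}}{\gamma - 1}$$     (99)

$$h_{it} > 0 \quad \text{iff} \quad \frac{\varepsilon_{2it} - \varepsilon_{1it}}{\gamma - 1} < \ln H_{\max} - f_i - \frac{1}{\gamma - 1} \left\{ X_{it}\theta - Z_{it}\phi + t \ln\left[\frac{1+\rho}{1+r}\right] \right\}$$     (100)

where $f_i = \frac{1}{\gamma - 1}\left( \ln \lambda_i + \mu_{2i} - \mu_{1i} - \ln \gamma \right)$ is an individual-specific fixed effect which subsumes the marginal utility of wealth term $\lambda_i$ as well as the individual permanent components of tastes for work and productivity.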

Under the assumptions of the model (i.e., perfect foresight, no borrowing constraints) this fixed effect is time invariant, capturing everything from periods outside of period $t$ relevant for the woman’s labor supply decision at time $t$. For example, in this model it is not necessary to explicitly include the current or future earnings of a married woman’s husband, which are captured through $\lambda_i$. In principle, it is not even necessary to control explicitly for whether a woman is married, as the woman’s marriage history is also built into $\lambda_i$. For instance, a single woman is assumed to anticipate the earnings potential of any husband she will eventually marry. Marriage can only enter the model because it shifts tastes for work, not because it alters lifetime wealth.

To estimate the model, Heckman and MaCurdy (1980) assume that the stochastic terms $\varepsilon_{1it}$ and $\varepsilon_{2it}$ are jointly normal and serially uncorrelated.73 The hours and participation Eqs (99) and (100) are estimated jointly with the wage Eq. (98) by maximum likelihood. The data consist of 30 to 65 year old continuously married white women from the 1968-75 waves of the PSID. There are 672 women who meet the selection criteria, but to estimate the fixed effects $f_i$, only women who work at least once can be used, leaving 452.74

The variables included in the wage equation, $X_{it}$, are potential experience (i.e., age - education - 6) and its square along with the local unemployment rate. Because only time varying covariates can be included due to the presence of the fixed effect in the wage equation, education, for example, is not included. The variables included as taste shifters, $Z_{it}$, are the total number of children, the number of children less than 6, the wife’s age, a measure of the number of hours the husband is unemployed, “other” household income, and an indicator for whether the husband is retired or disabled.75

The results of the estimation are mostly standard. Tastes for home time are increasing in the number of children and especially the number less than 6. Both “other” income and the husband’s hours of unemployment are statistically insignificant, which Heckman and MaCurdy (1980) interpret as evidence that supports the life-cycle model and that contradicts Mincer (1962). But interestingly, the estimate of $\gamma$ bumps up against its lower bound of zero. This implies a Frisch elasticity of leisure of $1/(\gamma - 1) = -1$. Converting to a Frisch labor supply elasticity, and noting that mean hours worked in the sample is about 1300, we have that


$$\frac{\partial \ln h_{it}}{\partial \ln w_{it}} = -\frac{L_{it}}{h_{it}} \frac{\partial \ln L_{it}}{\partial \ln w_{it}} = \frac{L_{it}}{h_{it}} \cdot \frac{1}{1 - \gamma} = \frac{7460}{1300} \approx 5.7$$     (101)


which is certainly a large value.

In a subsequent paper, Heckman and MaCurdy (1982) acknowledged that their choice of functional form had implicitly constrained the elasticity of substitution for leisure, and also for hours, to be large. Specifically, if the Frisch elasticity for leisure is $1/(\gamma - 1)$ and we impose $0 < \gamma < 1$, then the elasticity must range from −1 to $-\infty$. Then, for example, if leisure takes up at least two thirds of available time, (101) implies that the Frisch elasticity of labor supply must be at least 2.
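The conversion from a leisure elasticity to an hours elasticity can be checked numerically. The sketch below assumes only that hours and leisure exhaust a fixed time endowment, so that the hours elasticity equals minus the leisure-to-hours ratio times the leisure elasticity. The function name and the endowment of 8,760 annual hours are illustrative assumptions, not values taken from the papers.

```python
def frisch_hours_elasticity(leisure_elasticity, hours, time_endowment):
    """Convert a Frisch elasticity of leisure into one of hours worked.

    With h = H - L, d ln h / d ln w = -(L / h) * d ln L / d ln w.
    """
    if not 0 < hours < time_endowment:
        raise ValueError("hours must lie strictly between 0 and the endowment")
    leisure = time_endowment - hours
    return -(leisure / hours) * leisure_elasticity


# Leisure takes two thirds of available time and the leisure elasticity
# sits at its bound of -1: the hours elasticity is 2.
bound_case = frisch_hours_elasticity(-1.0, hours=1.0, time_endowment=3.0)

# Assumed endowment of 8,760 hours per year (24 * 365) with mean hours of 1,300.
annual_case = frisch_hours_elasticity(-1.0, hours=1300.0, time_endowment=8760.0)
print(bound_case, round(annual_case, 2))
```

The first call reproduces the "at least 2" bound stated in the text; the second shows how quickly the implied hours elasticity grows when hours are a small share of the assumed endowment.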

Heckman and MaCurdy (1982) report new results based on an additively separable CRRA utility function,


$$U_{it} = \alpha_{it} \eta^{-1} C_{it}^{\eta} + \beta_{it} \gamma^{-1} (H_{\max} - h_{it})^{\gamma}, \qquad \eta < 1, \; \gamma < 1$$     (102)


Adopting (102) does not change anything important in terms of the estimating equations, the only difference being that the constant term $\ln \gamma$ drops out of the equation for $\ln L_{it}$. But now the constraint on $\gamma$ is only that it be less than one. In fact, Heckman and MaCurdy (1982) estimate $\gamma = -1.44$, which implies a Frisch elasticity of leisure of $1/(\gamma - 1) = -0.41$. Interestingly, this still implies a large value of the Frisch elasticity of labor supply equal to 2.35.

The change in the utility specification has some impact on the other parameter estimates. The impact of children on tastes for work becomes larger. The coefficient on income of other household members becomes quantitatively much larger, but is only significant at the 20% level. Heckman and MaCurdy (1982) interpret this result as being “less favorable toward the permanent income hypothesis.” Husband unemployment hours also becomes marginally significant and negative, implying that husband time at home increases the wife’s tastes for work.

Heckman and MaCurdy (1980) conduct a second stage estimation where they regress the fixed effects on various determinants of lifetime wealth. Given estimates of the fixed effects, $f_i$, and given an estimate of $\gamma$ and the wage equation fixed effects $\mu_{2i}$, we can back out estimates of $\ln \lambda_i - \mu_{1i}$. Thus, it is possible to isolate only a composite of the marginal utility of wealth minus the fixed effect in tastes for leisure. It turns out that this composite is reduced by wife’s education. We would expect education to increase lifetime wealth (thus reducing $\lambda_i$) by increasing both own and potential husband’s earnings. But the effect of education on tastes for leisure $\mu_{1i}$ is an empirical question. The result implies either that education increases taste for leisure, or, if it reduces it, that this effect is outweighed by the income effect.

The Heckman and MaCurdy (1980, 1982) papers, as well as earlier work in a static framework by Heckman (1974), do not accommodate fixed costs of work. Within a static model, Cogan (1981) argued that ignoring fixed costs can lead to severe bias in estimates of female labor supply functions. To see the problem, consider the simple quasi-linear utility function given by


$$U = C + \beta \frac{(H_{\max} - h)^{1+\gamma}}{1+\gamma} = (wh + N - F) + \beta \frac{(H_{\max} - h)^{1+\gamma}}{1+\gamma}, \qquad \gamma < 0$$     (103)


where $N$ represents non-labor income and $F$ represents fixed costs of working (e.g., child care costs). The equation for optimal hours conditional on working is simply


$$h^{*} = H_{\max} - \left(\frac{w}{\beta}\right)^{1/\gamma}$$     (104)


In the absence of fixed costs the reservation wage $w_R$ is

$$h^{*} > 0 \;\Rightarrow\; H_{\max} - \left(\frac{w}{\beta}\right)^{1/\gamma} > 0$$     (105)

$$\Rightarrow\; w > \beta H_{\max}^{\gamma} \equiv w_R$$

However, as Cogan (1981) points out, it is not appropriate to use marginal conditions to determine the participation decision rule in the presence of fixed costs. Instead, it is necessary to compare the utilities conditional on working and not working, that is,

$$U(h^{*}) = (wh^{*} + N - F) + \beta \frac{(H_{\max} - h^{*})^{1+\gamma}}{1+\gamma}$$     (106)

$$U(0) = N + \beta \frac{H_{\max}^{1+\gamma}}{1+\gamma}$$

Thus, the decision rule for whether to work (whether $U(h^{*}) > U(0)$) can be expressed as


$$h = h^{*} > 0 \quad \text{iff} \quad h^{*} > \frac{F}{w} + \frac{1}{w} \frac{\beta}{1+\gamma} \left[ H_{\max}^{1+\gamma} - (H_{\max} - h^{*})^{1+\gamma} \right] \equiv h_R > 0$$     (107)


It is instructive to compare (105), which simply says that the person begins to work when desired hours are positive, with (107), which says a person will begin to work only when optimal hours cross a positive threshold value $h_R$, which Cogan (1981) refers to as reservation hours. Inspection of the right hand side of the inequality in (107) provides intuition for the threshold value; optimal hours conditional on working must be high enough to cover fixed costs plus an additional term which equals the monetized value of the lost utility from leisure.

Thus, as Cogan (1981) describes, in the presence of fixed costs of work the labor supply function is discontinuous, jumping from zero to the reservation hours level when the reservation wage is reached. The specifications assumed in Heckman (1974) and Heckman and MaCurdy (1980, 1982) are not consistent with such behavior. Another key point is that both costs of working ($F$) and tastes for work ($\beta$) enter the participation equation, while only $\beta$ enters the labor supply equation. Hence, it is possible that a variable like young children could affect fixed costs of working but not tastes for work, that is, that the presence of young children could affect the participation decision but not labor supply conditional on participating.
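The jump in hours at the reservation wage can be illustrated with a small numerical sketch. It uses a quasi-linear utility with a logarithmic leisure term, U = wh + N - F·1[h > 0] + β ln(H - h), chosen here purely for simplicity; this functional form and all parameter values are assumptions for illustration and are not Cogan's specification or estimates.

```python
import math

H_MAX = 100.0   # time endowment (assumed units: hours per week)
BETA = 500.0    # weight on leisure in utility (assumed value)


def desired_hours(wage):
    """Interior optimum from the first-order condition wage = BETA / (H_MAX - h)."""
    return H_MAX - BETA / wage


def labor_supply(wage, fixed_cost):
    """Hours chosen by comparing utility when working with utility when not working."""
    h_star = desired_hours(wage)
    if h_star <= 0.0:
        return 0.0
    # Non-labor income enters both options identically, so it cancels out.
    gain = wage * h_star - fixed_cost + BETA * math.log((H_MAX - h_star) / H_MAX)
    return h_star if gain > 0.0 else 0.0


# Columns: wage, hours without fixed costs, hours with a fixed cost of 50.
for wage in (4.0, 5.5, 7.0, 8.0, 9.0):
    print(wage, round(labor_supply(wage, 0.0), 1), round(labor_supply(wage, 50.0), 1))
```

Without fixed costs, hours rise smoothly from zero once the wage passes β/H = 5. With the fixed cost, hours stay at zero until the wage is high enough that desired hours cover the cost, and then jump directly to a substantial positive level.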

To estimate labor supply behavior in the presence of fixed costs, Cogan (1981) proposes to jointly estimate a labor supply function as in (104), a reservation hours function as in (107) and an offer wage function. This is in contrast to Heckman’s approach of jointly estimating a labor supply function (104), a participation equation based on marginal conditions as in (105) and an offer wage function.

Cogan (1981) compares both approaches using data on married women aged 30 to 34 taken from the 1967 National Longitudinal Survey of Mature Women. In this sample, 898 wives worked and 939 did not. The labor supply and reservation hours functions both include the wife’s education and age, number of young children, and husband’s earnings. Cogan estimates that fixed costs are substantial (about 28% of average annual earnings), and that a young child raises fixed costs by about a third. He finds that ignoring fixed costs leads to severe overestimates of labor supply elasticities (conditional on work). Cogan’s labor supply function implies a Marshallian elasticity of 0.89 at the mean of the data, compared to 2.45 obtained using the Heckman (1974) approach. The Hicks elasticities are 0.93 vs. 2.64.

However, Cogan also shows that the elasticities are rather meaningless in this context. As he notes, a 10% increase in the offer wage to the average nonworking woman in the sample would not induce her to enter the labor market, but a 15% increase would induce her to jump to over 1300 hours. An additional 15% wage increase would “only” induce a further increase of 180 hours (or 13.6%).76

An important aspect of Cogan (1981) is that he pays close attention to how the model fits the distribution of hours. This is quite unusual in the static literature, where the focus tends to be on estimating elasticities rather than simulating behavior.77 Cogan finds that the model without fixed costs cannot explain the fact that few people are observed to work very few hours. Indeed, the model without fixed costs has to predict a large fraction of women working few hours to be able to fit the large fraction of women who do not work. As Cogan describes, this leads to a flattening of the labor supply function, which exaggerates wage elasticities (see Cogan, 1981, Fig. 2). The model with fixed costs provides a much better fit to the data and does not have this problem.

Kimmel and Kniesner (1998) extend the Heckman and MaCurdy (1980, 1982) analysis to include fixed costs of work. That is, they estimate a labor supply equation analogous to (99) jointly with a participation decision rule and an offer wage function, namely

$$\ln h_{it} = f_{hi} + e_F \ln w_{it} + \alpha_h Z_{it} + \varepsilon_{hit}$$     (108)

$$P(h_{it} > 0) = \Phi\left(f_{pi} + \beta \ln w_{it} + \alpha_p Z_{it}\right)$$     (109)

The first equation is the Frisch labor supply function where the fixed effect $f_{hi}$ captures the marginal utility of initial assets along with any fixed effects in tastes for work. The second equation gives the probability of participation, where $\Phi$ is the cumulative standard normal. The fixed effect $f_{pi}$ captures not just the marginal utility of wealth and tastes for work, but also individual heterogeneity in the fixed costs of work.

As in Cogan (1981), the existence of fixed costs breaks the tight link between the parameters in the participation and labor supply equations. Thus, there is no necessary relationship between the parameters $e_F$ and $\alpha_h$ in (108) and the parameters $\beta$ and $\alpha_p$ in (109). In this framework $e_F$ is the conventional Frisch elasticity of labor supply conditional on employment. But we can also introduce a Frisch participation elasticity given by


$$e_P = \frac{\partial \ln P(h_{it} > 0)}{\partial \ln w_{it}} = \beta \frac{\phi(\cdot)}{\Phi(\cdot)}$$     (110)


where $\phi$ is the standard normal density.

Kimmel and Kniesner (1998) estimate this model using data on 2428 women from the Survey of Income and Program Participation (SIPP), 68% of whom are married. The tri-annual interview information was collected from May 1983 to April 1986, giving 9 periods of data. The variables included in the vector of controls $Z_{it}$ are marital status, children, education and a quadratic in time. The model is estimated in two stages, where in the first stage predicted wages are constructed for workers and nonworkers by estimating the wage equation using Heckman’s (1979) two-step procedure. The use of predicted wages serves three purposes: (i) to deal with measurement error, (ii) to fill in missing wages and (iii) to deal with possible endogeneity of wages (which would arise if women with high unobserved tastes for work also tend to have high wages). The variables that appear in the wage equation but not in $Z_{it}$ are race and a quadratic in age (potential experience).

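The estimates imply a Frisch elasticity of 0.66 for employed women, and a Frisch participation elasticity of 2.39. Average hours of the entire population are given by $\bar{H} = P \bar{h}$, where $\bar{h}$ is average hours of the employed and $P$ is the fraction employed. Thus we have that


$$\frac{\partial \ln \bar{H}}{\partial \ln w} = \frac{\partial \ln \bar{h}}{\partial \ln w} + \frac{\partial \ln P}{\partial \ln w} = 0.66 + 2.39 = 3.05$$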


Thus, the participation elasticity is much larger than the hours elasticity. This result provides some justification for models of female labor supply that focus primarily on the participation decision (see below).
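Because average hours in the population are the product of the employment rate and average hours of the employed, the two elasticities simply add. The following sketch checks this with a finite-difference derivative; the constant-elasticity functional forms, the base wage and the scale constants are assumptions used only to make the check concrete.

```python
import math


def total_hours_elasticity(e_hours, e_participation):
    """Elasticity of population-average hours when average hours = P * h_bar."""
    return e_hours + e_participation


def numerical_total_elasticity(e_hours, e_participation, wage=10.0, step=1e-6):
    """Finite-difference d ln(P * h_bar) / d ln w under constant-elasticity forms."""
    def log_avg_hours(log_w):
        w = math.exp(log_w)
        participation = 0.5 * (w / wage) ** e_participation   # P(w), 0.5 at the base wage
        hours_if_employed = 1500.0 * (w / wage) ** e_hours     # h_bar(w)
        return math.log(participation * hours_if_employed)

    x = math.log(wage)
    return (log_avg_hours(x + step) - log_avg_hours(x - step)) / (2.0 * step)


analytic = total_hours_elasticity(0.66, 2.39)
numeric = numerical_total_elasticity(0.66, 2.39)
print(round(analytic, 2), round(numeric, 2))
```

With the reported estimates, the participation margin accounts for roughly three quarters of the total response.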

Altug and Miller (1998) extend the life-cycle model of Heckman and MaCurdy (1980, 1982) to include human capital accumulation in the form of learning-by-doing. In addition, they incorporate fixed costs of work, state dependence in tastes for leisure, and aggregate shocks. The first step in Altug and Miller (1998) is to estimate the wage offer function, which takes the form


$$\ln w_{it} = \ln \omega_t + \ln \nu_i + Z_{it}\delta + \varepsilon_{it}$$     (111)


Here $Z_{it}$ is a vector containing work experience, lagged participation and hours, and other observable determinants of skill, $\nu_i$ is a time-invariant skill endowment of person $i$ and $\omega_t$ is a skill rental price (determined in equilibrium). In estimation, the $\ln \nu_i$ can be treated as individual fixed effects and the $\ln \omega_t$ as time dummies. A key assumption is that $\varepsilon_{it}$ reflects only measurement error (and not unobserved variation in skill). Given that assumption, no selection bias problem arises if we estimate (111) by OLS only using periods when women are working, provided we include fixed effects.

Altug and Miller (1998) estimate the wage offer function using PSID data from 1967 to 1985. They require that the women reside in a PSID household for at least 6 consecutive years and that they be employed for at least two years (so that the fixed effects, $\nu_i$, can be estimated). This gives a sample of 2169 women. The estimates imply that labor market experience, particularly recent experience, has a large effect on current wages. For instance, a person who worked the average level of hours for the past four years would have current offer wages about 25% higher than someone who had not worked. Interestingly, the lagged participation coefficients are negative while lagged hours coefficients are positive. The implication is that low levels of hours do not increase human capital: one has to work about 500 to 1000 hours to keep skill from depreciating.

The time dummies in the estimation are estimates of the rental price of skill. The rental price is estimated to be pro-cyclical, falling in the recession years of 1975 and 1980-1982 and rising in 1977, 1983 and 1985. Average wages among all women in the PSID sample are slightly more pro-cyclical than the estimated rental rates. This suggests a compositional effect whereby people with high $\nu_i$’s tend to enter during booms. This is consistent with the mild pro-cyclical bias in aggregate wage measures for males found by Keane et al. (1988).

Altug and Miller (1998) assume a current period utility function given by


$$U_{it} = \alpha_{it} \eta^{-1} C_{it}^{\eta} + d_{it}\left[ U_0(X_{it}) + U_1(Z_{it}, h_{it}) + \varepsilon_{1it} \right] + (1 - d_{it}) \varepsilon_{0it}$$     (112)


Here the first term is CRRA in consumption, $d_{it}$ is an indicator for positive hours, $U_0(\cdot)$ captures the fixed cost of work, $U_1(\cdot)$ is the disutility of labor, $X_{it}$ is a vector of demographic variables that reflect the fixed costs of working, $Z_{it}$ includes $X_{it}$ along with lagged hours of work that shift tastes for leisure hours and $\varepsilon_{1it}$ and $\varepsilon_{0it}$ are stochastic shocks to tastes for the work and nonwork options, respectively. These shocks can be interpreted as unobserved variation in the fixed cost of work and the value of home time. Additive separability and the distributional assumptions on $\varepsilon_{1it}$ and $\varepsilon_{0it}$ play a key role in the estimation procedure, as discussed below.

As in an earlier paper (Altug and Miller, 1990), it is assumed that markets are complete (that all idiosyncratic shocks are perfectly insurable). Given this assumption and the specification of the utility function, the marginal utility of consumption can be shown to be given by

$$\alpha_{it} C_{it}^{\eta - 1} = \eta_i \lambda_t$$     (113)

$$\ln C_{it} = \frac{1}{\eta - 1}\left( \ln \eta_i + \ln \lambda_t - \ln \alpha_{it} \right)$$     (114)

As seen in (113), perfect insurance implies that $\alpha_{it} C_{it}^{\eta - 1}$ can be decomposed into the product of an individual-specific component $\eta_i$, reflecting the marginal utility of wealth for individual $i$, and a time varying component $\lambda_t$, reflecting aggregate shocks. A person $i$ with a low $\eta_i$ has a relatively low marginal utility of wealth. But a person’s position in the wealth distribution is constant over time. The only source of uncertainty in the marginal utility of wealth over time is aggregate shocks that cause movements in $\lambda_t$.

To obtain an estimable equation, let $\ln \alpha_{it} = X_{it}\beta + \varepsilon_{cit}$, where $X_{it}$ and $\varepsilon_{cit}$ are observed and unobserved shifters of tastes for consumption, respectively. The consumption equation, (114), can be estimated by fixed effects (or in first differences), assuming the $X_{it}$ are exogenous. Altug and Miller (1998) include household size, children, age and region in $X_{it}$ and the $\lambda_t$ are estimated as time dummies. The equation is estimated on data from the PSID, which contains only food consumption. As we would expect, the estimated values of $\lambda_t$ are high in the recession years of 1975 and 1980-1982.

In the final step, Altug and Miller (1998) estimate the first order condition for hours jointly with a participation condition which allows for fixed costs of work. The first order condition for hours is complex because the marginal utility of leisure is not equated to simply the current wage times the marginal utility of consumption. There is an additional term that arises because working today increases future wages and alters future disutilities from work. We refer to this term as the “expected future return to experience.”

Altug and Miller deal with this problem using a version of the Hotz and Miller (1993) estimation algorithm.78 To outline that procedure, first, given estimates of (111) and (114), they back out estimates of the individual effects $\nu_i$ and $\eta_i$. Second, they use nonparametric regression to estimate the probabilities of participation conditional on the state variables, that is, on the estimated values of $\nu_i$ and $\eta_i$, the work history, and a set of demographics (age, education, marital status, race, children and region).79 Third, they assume the $\varepsilon_{1it}$ and $\varepsilon_{0it}$ in (112) are iid extreme value shocks, noting that they are the only source of randomness in the current period payoffs from working vs. not working. As in Hotz and Miller (1993), the value functions at any state can be backed out from the conditional choice probabilities calculated in step 2. This allows one to express the “expected future return to experience” terms as a simple function of the conditional participation probabilities (and their derivatives with respect to $h_{it}$). In the final estimation step, the parameters left to be estimated are those associated with the fixed cost of work $U_0(X_{it})$ and the disutility of labor $U_1(Z_{it}, h_{it})$.
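The inversion at the heart of the third step can be sketched for a binary work/no-work choice. With iid type I extreme value shocks, the participation probability is a logit in the difference of choice-specific values, so that difference and the expected maximum can both be written in terms of the probability alone. This is a minimal illustration of that standard result with hypothetical function names; it is not Altug and Miller's estimator, which additionally uses nonparametric probability estimates and derivative terms.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of a standard type I extreme value draw


def choice_probability(v_work, v_home):
    """Logit probability of working implied by iid type I extreme value shocks."""
    return 1.0 / (1.0 + math.exp(v_home - v_work))


def value_difference(p_work):
    """Invert the choice probability to recover v_work - v_home."""
    return math.log(p_work / (1.0 - p_work))


def emax_relative_to_home(p_work):
    """E[max(v_work + e1, v_home + e0)] - v_home, written using only p_work."""
    return EULER_GAMMA - math.log(1.0 - p_work)


# Round trip: start from known values, form the probability, then invert it.
p = choice_probability(v_work=1.3, v_home=0.4)
recovered_difference = value_difference(p)
recovered_emax = 0.4 + emax_relative_to_home(p)
```

The round trip shows why estimated participation probabilities are enough to construct continuation values without solving the dynamic program, provided the extreme value and additive separability assumptions hold.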

It is important to understand the restrictions in this approach. There can be no stochastic variation in the marginal utility of leisure, because this additional source of randomness would preclude obtaining simple expressions for the expected future return to experience. Having actual productivity shocks instead of only measurement error in wages would have the same effect. And, consumption and leisure must be separable in utility, so that the stochastic term in tastes for consumption does not influence labor supply decisions. Thus, the extreme value error and additive separability assumptions are crucial.

So far, we have discussed approaches based on estimating the first-order condition for optimal labor supply. An alternative is the “life-cycle consistent” or “two-stage budgeting” approach, where one estimates labor supply equations that condition on the full income allocated to a period (MaCurdy, 1983). Using this approach, Blundell and Walker (1986) estimate a life-cycle consistent model of labor supply behavior of married couples. They use data on couples where both the husband and wife work, and the estimation of the labor supply function is done jointly with a probit equation for whether the wife works (to control for selection into the sample). In sharp contrast to Heckman and MaCurdy (1982) and Kimmel and Kniesner (1998), they obtain an (average) Frisch elasticity of labor supply for women of only 0.033. The Hicks elasticity is 0.009. Based on the figures in their paper, we calculate an income effect of −0.206 (at the mean of the data) and a Marshallian elasticity of −0.197.
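The three figures just quoted are linked by the Slutsky decomposition, in which the uncompensated (Marshallian) elasticity equals the compensated (Hicks) elasticity plus the income effect expressed in elasticity form. A minimal check, with hypothetical function names:

```python
def marshallian_elasticity(hicks_elasticity, income_effect):
    """Slutsky decomposition: uncompensated = compensated + income effect."""
    return hicks_elasticity + income_effect


def implied_income_effect(marshallian, hicks_elasticity):
    """Back out the income effect from the two wage elasticities."""
    return marshallian - hicks_elasticity


# Figures quoted in the text for Blundell and Walker (1986).
marshallian = marshallian_elasticity(0.009, -0.206)
print(round(marshallian, 3))
```

The same identity can be used in reverse when a paper reports the two wage elasticities but not the income effect.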

Blundell et al. (1998) apply this life-cycle consistent approach to married women from the UK Family Expenditure Survey, 1978 to 1992. UK tax rates were reduced substantially over the period, and the basic idea of the paper is to exploit this variation to help identify labor supply elasticities. As the authors describe, the decline in rates caused different cohorts to face different paths of tax rates. Relative wages for different education groups also changed markedly over this period.

The idea of the paper can be understood as follows. Imagine we group the data by cohort and education level. That is, for each education/cohort group we construct group means of hours and wages in each year. We then subtract group and time means from these quantities. The key assumption in Blundell et al. (1998) is that any residual variation in wages after taking out group and time means is exogenous. Their leading example of what might cause such residual variation in wages for a group is tax changes that affect groups differentially. Another source of variation would be exogenous technical change that affects groups differently. A further key assumption is that there are no shifts in labor supply behavior within any of the groups over time (e.g., tastes for leisure can vary by cohort/education level, but not within an education/cohort group over time). They also assume that taking out time means purges both hours and wages for all groups from the influence of aggregate shocks, a seemingly strong assumption as time effects (like the business cycle) may well affect different education/skill groups differently.

The simplest way to think about using the grouped data is to think of regressing the group mean of hours on the group mean of wages, after purging these means of group and time effects. An equivalent approach is to use the individual data and proceed in two steps. In the first step regress after-tax wages on time/group interaction dummies, and get the residuals from this regression. In the second step, regress hours on the after-tax wage, time and group dummies and wage residual. Note that we want the wage coefficient to be identified by wage variation within group over time. The wage equation residual captures other sources of wage variation, as the first stage wage equation controlled for time/group interactions.80
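The two-step procedure can be illustrated on simulated data. The sketch below is a stylized version under stated assumptions: the data-generating process, sample sizes and parameter values are invented for illustration, taxes and virtual income are omitted, and no selection correction is included. It is not a replication of the Blundell et al. (1998) estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
G, T, N_CELL = 8, 15, 200           # groups, years, individuals per cell (all assumed)
TRUE_BETA = 1.0                     # wage coefficient in the simulated hours equation

group = np.repeat(np.arange(G), T * N_CELL)
year = np.tile(np.repeat(np.arange(T), N_CELL), G)
cell = group * T + year

# Wages: group effect + time effect + exogenous group-by-time shifts + individual part.
cell_shift = rng.normal(0.0, 0.5, size=G * T)
u = rng.normal(0.0, 1.0, size=cell.size)
wage = 0.3 * group + 0.1 * year + cell_shift[cell] + u

# Hours: tastes are correlated with u, so individual wage variation is endogenous.
taste = 1.0 * u + rng.normal(0.0, 1.0, size=cell.size)
hours = TRUE_BETA * wage + 0.5 * group - 0.2 * year + taste


def dummies(codes, n, drop_first=False):
    """Indicator matrix for integer codes 0..n-1."""
    d = (codes[:, None] == np.arange(n)[None, :]).astype(float)
    return d[:, 1:] if drop_first else d


def ols(y, X):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


# Step 1: regress wages on group-by-time dummies (equivalent to removing cell means).
cell_mean = np.bincount(cell, weights=wage) / np.bincount(cell)
wage_resid = wage - cell_mean[cell]

# Step 2: regress hours on the wage, group and time dummies, and the step 1 residual.
controls = np.column_stack([dummies(group, G), dummies(year, T, drop_first=True)])
beta_naive = ols(hours, np.column_stack([wage, controls]))[0]
beta_two_step = ols(hours, np.column_stack([wage, wage_resid, controls]))[0]
print(round(beta_naive, 2), round(beta_two_step, 2))
```

In this simulation the regression without the residual is pulled away from the true coefficient by the correlation between individual wages and tastes, while including the first-stage residual leaves only the group-by-time wage variation to identify the wage coefficient. Cells are made large here so that cell means are precise.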

The authors also attempt to deal with possible compositional effects of changes in participation rates on the mean of the error term in the labor supply equation (e.g., a higher wage may induce women with higher tastes for leisure to enter the market) by including an inverse Mills’ ratio term that is a function of the group/time participation rate. The labor supply equation that Blundell et al. (1998) actually estimate has the form


$$h_{it} = \beta \ln\left[ w_{it}(1 - \tau_{it}) \right] + \gamma \left[ C_{it} - w_{it}(1 - \tau_{it}) h_{it} \right] + X_{it}\phi + d_g + d_t + \delta_w R_{wit} + \delta_\mu R_{\mu it} + M(P_{gt}) + e_{it}$$     (115)


where $\tau_{it}$ is the tax rate, the second term is “virtual” non-labor income, $X_{it}$ is a vector of demographic variables (for example, dummy variables for children of various ages), $d_g$ and $d_t$ are the group and time dummy variables, $R_{wit}$ and $R_{\mu it}$ are residuals from the first stage regressions of wages and virtual income on the group and time dummies, and $M(P_{gt})$ is the Mills’ ratio used to correct for nonparticipation. The authors estimate this hours function by OLS.

To implement this procedure Blundell et al. (1998) group the FES data into 2 education groups (legal minimum vs. additional education) and 4 cohorts (people born in 1930-1939, 1940-1949, 1950-1959 and 1960-1969), or 8 groups in total. They include only 20 to 50 year old women with employed husbands. This gives 24,626 women of whom 16,781 work. Note that only workers are used to estimate (115), although the full sample is used to estimate the Mills’ ratio. One detail is that 2970 of these women are within a few hours of a kink point in the tax schedule. Blundell et al. choose to drop these women from the data and construct an additional Mills’ ratio term to deal with the selection bias this creates. They find that the group/time interactions are highly significant in the wage and virtual income equations.

The estimates imply an uncompensated wage elasticity at the mean of the data of 0.17 and a compensated elasticity of 0.20. In a sensitivity test, the authors report results where, in the first stage, the over-identifying instruments are 5 parameters that describe the tax rules interacted with group dummies. This reduces the number of instruments relative to the case where the group dummies were fully interacted with time dummies. It also means that only variation in wages and virtual income specifically induced by tax changes is used to identify the labor supply elasticities. The estimates give an uncompensated elasticity of 0.18 and an essentially zero income effect. Thus, results are little affected.

4.1.4 DCDP models

The first paper to adopt a full solution approach to modeling female labor supply was Eckstein and Wolpin (1989b). The main focus of the paper is on how the decision to work today affects wages and tastes for work in the future. Thus, the paper focuses on three of the four issues central to the female labor supply literature: (i) fixed costs of working, (ii) human capital accumulation, and (iii) state dependence in tastes for work. To make estimation feasible (particularly given the 1989 computing technology) Eckstein and Wolpin (1989b) make some key simplifying assumptions. First, they ignore savings and assume a static budget constraint. Second, they ignore the choice of hours of work and treat labor supply as a discrete work/no-work decision.

This set of decisions is notable, as it illustrates well the different paths that the male and female life-cycle labor supply literatures have taken. The life-cycle literature on males has emphasized decisions about hours and savings, which Eckstein and Wolpin (1989b) ignore, while in most cases ignoring participation, human capital and state dependence, which they stress. This is not a value judgement on either literature, but simply an observation about what aspects of behavior researchers have found most essential to model in each case. The emphasis on participation, human capital and state dependence explains why the female labor supply literature came to the use of DCDP models several years earlier than the male labor supply literature, as these features are very difficult to handle using Euler equation methods.

A third simplifying assumption that Eckstein and Wolpin (1989b) make is that they do not model marriage or fertility. To avoid having to model fertility decisions, the paper looks only at women who were at least 39 years old in 1967 (and hence for the most part past child bearing age). The number of children affects the fixed costs of work, but it is treated as a predetermined variable. Marriage is taken as exogenously given. Including marriage and fertility as additional choice variables would not have been feasible given 1989 technology, but, as we will see, incorporating them as choice variables has been the main thrust of the subsequent literature.

Eckstein and Wolpin (1989b) assume a utility function for married woman image at age image given by


image     (116)


where image is an indicator for labor force participation, image is work experience (the sum of the lagged image’s), image is a vector of numbers of children in various age ranges (0-5 and 6-17) and image is the woman’s completed schooling. The budget constraint is specified as


image     (117)


where image is the wife’s wage (annual earnings) if she works and image is the annual income of the husband (assumed exogenous).81 The assumption that utility is linear in consumption has some important consequences. First, substitution of (117) into (116) makes clear that we cannot separately identify the fixed cost of work image and the monetary costs of children image from the disutility of work image and the effect of children on the disutility of work image. Thus, image and image are normalized to zero.

The second implication of this specification is that the model will exhibit no income effects on labor supply unless consumption and participation interact in the utility function. If image, then husband’s income will have no impact on the wife’s labor supply. A clear pattern in the data is that women with higher income husbands are less likely to work, which would imply that image. Thus, to fit the data, consumption and leisure must be complements in utility, although in general, a negative income effect and consumption/leisure complementarity are conceptually distinct phenomena.

Eckstein and Wolpin (1989b) assume a standard log earnings function (linear in schooling, quadratic in work experience) with both a stochastic productivity shock and measurement error. A key point is that there are no shocks to tastes for work, so the only stochastic components in the model are the productivity shocks and measurement error. This simplifies the solution to the dynamic programming problem.82 The solution takes the form of a sequence of reservation wages (contingent on age, work experience and other state variables). The decision rule for participation is simply to work if the offer wage exceeds the reservation wage, which is a deterministic function of the state. The measurement error accounts for cases where women are observed to make decisions that violate this condition.
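The reservation wage structure can be made concrete with a stripped-down example. The sketch below is not the Eckstein and Wolpin (1989b) specification: it assumes a per-period payoff equal to the wage when working and a constant `b` when not, a log wage that is linear in experience with a normal shock, no measurement error, and hypothetical parameter values. It solves the problem by backward recursion and returns the reservation wage for every period and experience level.

```python
from math import exp, log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def expected_max(mu, sigma, w_star):
    """E[max(w, w_star)] when ln(w) ~ N(mu, sigma^2)."""
    mean_w = exp(mu + 0.5 * sigma ** 2)
    if w_star <= 0.0:            # every wage offer is accepted
        return mean_w
    z = (log(w_star) - mu) / sigma
    # reject with probability PHI(z); otherwise collect the (lognormal) wage
    return w_star * PHI(z) + mean_w * (1.0 - PHI(z - sigma))

def reservation_wages(T=20, b=1.0, delta=0.95, b0=-0.5, b1=0.03, sigma=0.4):
    """table[t-1][x] = reservation wage in period t with experience x."""
    ev_next = [0.0] * (T + 2)    # expected value next period, by experience
    table = []
    for t in range(T, 0, -1):    # backward recursion from the last period
        ev_now = [0.0] * (T + 2)
        row = []
        for x in range(t):       # experience in period t is at most t - 1
            c_home = delta * ev_next[x]       # continuation value if not working
            c_work = delta * ev_next[x + 1]   # continuation value if working
            w_star = b + c_home - c_work      # work iff wage >= w_star
            ev_now[x] = c_work + expected_max(b0 + b1 * x, sigma, w_star)
            row.append(w_star)
        table.append(row)
        ev_next = ev_now
    table.reverse()
    return table

table = reservation_wages()
print(round(table[0][0], 3), table[-1][0])
```

Because working today raises future wage offers, the reservation wage in early periods lies below the static value `b`; in the final period there is no future return to experience and the two coincide.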

Eckstein and Wolpin (1989b) estimate the model by maximum likelihood using data on 318 white married women from the NLS Mature Women’s cohort. The NLS interviewed them 11 times in the 16 years from 1967 to 1982, making it difficult to construct complete employment histories for all the women. To be in the sample, the women had to have at least four consecutive valid years of data on labor force participation and have a spouse present in every interview from 1967 to 1982. The data set contained 3020 total observations, 53% of which were for working years. The discount factor is fixed at 0.952.

An interesting aspect of the estimates is that they show substantial selection bias in OLS wage equation estimates. The OLS schooling coefficient is 0.08, while the model estimate (which corrects for selection) is 0.05. The experience profile is initially less steep but also less strongly concave than implied by OLS. The estimates also imply that 85% of observed wage variation is measurement error.83

With regard to the utility function estimates, Eckstein and Wolpin (1989a,b) find that children (especially young children) negatively affect tastes for work, as expected. The impact of state dependence is imprecisely estimated, but it implies that experience reduces tastes for work. Schooling reduces tastes for work as well. However, both taste effects are clearly outweighed by the positive effects of experience and schooling on wage offers.

Eckstein and Wolpin (1989b) find that image; thus, as expected, husband income reduces the wife’s participation rate. To quantify the size of the income effect, they consider a woman at age 39 with 15 years of work experience, 12 years of schooling, no children and a husband with $10,000 in annual earnings (which is close to the mean in the data). The baseline prediction of the model is that she will work 5.9 years out of the 21 years through age 59, or 28% of the time. If husband’s earnings increase by 50%, the model predicts her participation rate will drop by half, to 14%. So the elasticity of the participation rate with respect to non-labor income is roughly −1.0. Converting this to an income effect, and noting that the mean wage in the data is $2.27 per hour and work is assumed to be 2000 hours per year, we obtain an income effect of −0.45.

Unfortunately, Eckstein and Wolpin (1989b) do not report a simulation of how an exogenous change in the wage rate (an increase in the intercept, the skill rental price, of the log wage function) would affect labor supply. However, as schooling is exogenous, and the effect of schooling on tastes for work is quantitatively small, we can approximate this using the estimated schooling coefficient. Consider the same representative woman described above, and assume her education level is increased from 12 to 16. An extra 4 years of schooling raises the wage rate roughly 22% at the mean of the data. The model predicts that this will cause her participation rate from age 39 to 59 to increase by 108%. Thus, the implied (uncompensated) elasticity of the participation rate with respect to the wage is roughly 5.0.
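The back-of-the-envelope calculations in the last two paragraphs can be checked directly. The snippet below uses only numbers quoted in the text. Two steps are assumptions rather than statements from the source: the income effect is obtained by scaling the income elasticity by the ratio of the wife's annual earnings to husband's income, and the 22% wage change is taken to be the exponentiated effect of four years of schooling at the selection-corrected coefficient of 0.05. Both reproduce the reported figures up to rounding.

```python
from math import exp

# Income effect
base_rate, new_rate = 0.28, 0.14          # participation before/after
income_change = 0.50                      # 50% rise in husband's earnings
income_elasticity = ((new_rate - base_rate) / base_rate) / income_change
wage, hours, husband_income = 2.27, 2000, 10_000
# assumed conversion: elasticity times (wife's earnings / husband's income)
income_effect = income_elasticity * wage * hours / husband_income

# Wage elasticity approximated through the schooling coefficient
schooling_coef = 0.05                     # model estimate of the schooling return
wage_change = exp(4 * schooling_coef) - 1 # 12 -> 16 years of schooling
participation_change = 1.08               # 108% increase
wage_elasticity = participation_change / wage_change

print(round(income_elasticity, 2), round(income_effect, 3))
print(round(wage_change, 3), round(wage_elasticity, 2))
```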

Finally, Eckstein and Wolpin (1989b) report a detailed description of how the model fits labor force participation rates, conditional on 28 experience and age cells. In general, the model provides a very good fit to the data. As we noted earlier, there are very few papers in the static labor literature, or the literature on dynamic models based on first order conditions, that examine model fit. In contrast, the careful examination of model fit in the DCDP literature has become standard practice. The focus of the former literature is on estimation of parameters or elasticities, while the focus of the DCDP literature is on model simulations under baseline vs. counterfactual scenarios. It is only natural to compare the simulated baseline data to the actual data. Keane and Wolpin (2009) argue that it ought to be the industry standard to assess model fit in all econometric models (including static models, nonstructural models, etc.).

The next paper in the DCDP literature on female labor supply did not appear until Van der Klaauw (1996), which extended Eckstein and Wolpin (1989b) to include marriage as a choice. Women have up to 4 options in each period, given by the cross product of work and marriage choices. Another extension is that Van der Klaauw (1996) models decisions starting from when a woman has left school (rather than age 39, as in Eckstein and Wolpin), which may be as young as 14. Obviously then, he cannot treat fertility as given. Thus, Van der Klaauw (1996) models the arrival of children as a stochastic process, where arrival probabilities depend upon the state variables (i.e., marital status, education, age and race). This is a common practice in DCDP modeling—that is, to take variables that one believes are endogenous, but which one does not wish to model explicitly as a choice (either for computational reasons or because they are not the main focus of the analysis), and treat them as being generated by a stochastic process that depends on the other state variables.84

The model is in many ways similar to Eckstein and Wolpin (1989b), again incorporating a static budget constraint and a utility function that is linear in consumption. Van der Klaauw specifies the utility function, conditional on the participation image and marriage choice image, as


image     (118)


Consumption is interacted with participation, as in Eckstein and Wolpin (1989b), which enables the model to explain why women work less if they have high income husbands. Tastes for marriage image are allowed to depend on demographics, children and lagged marriage. Marriage, image, is also interacted with consumption, image, thus letting marriage shift the marginal utility of consumption. The effects of demographics, children and lagged participation on tastes for work are captured by letting image and image depend on these variables. There is a separate taste shock for each of the mutually exclusive choices, image.

Recall that in Eckstein and Wolpin’s (1989b) model a woman received utility from total household consumption. Here, a woman is assumed to consume her own income plus a fraction of the husband’s income (which depends on her work status), so she receives utility from private consumption. A single woman has a probability each year of receiving a marriage offer. The potential husband is characterized by his mean wage, which depends on the woman’s characteristics (reflecting marriage market equilibrium) and a transitory wage draw.

It is worth noting that this is a search model of marriage only in a trivial sense. There is no match-specific component to the marriage. That is, a husband does not come with a permanent component to his earnings level, which could make him a “good draw” given the woman’s demographics. Nor is there any permanent component to the utility level he provides. Thus, the woman has no reason to decline a marriage offer in the hope of a better offer. Her only reason for systematic delay is that mean husband income is found to be increasing in the woman’s potential experience, and thus, her age. This setup substantially reduces the computational burden of estimation, as there is no “husband type” variable that must be included in the state space. But at the same time, the model is not informative about the effect of permanent differences in husband income on the wife’s labor supply, as all permanent differences are a deterministic function of the wife’s own characteristics.

The woman’s own wage offer function includes standard covariates, such as education, a quadratic in experience, race, age and region. It also includes a lagged participation indicator, which allows recent work experience to be relatively more important. An unusual aspect of the specification, however, is that it is specified in levels, with an additive error. This is also true of the husband’s wage function. The reason for adopting this specification is that, when these functions are substituted into the budget constraint to obtain the choice-specific consumption level and this in turn is substituted into the utility function, each of the 4 alternatives turns out to have an additive error that consists of the relevant image, plus a function of the female and male wage equation errors.

From a computational point of view, what enables handling the additional complexity of making marriage a choice is the assumption that these four additive choice-specific error terms, say image for image, image, are distributed iid extreme value. As we have previously discussed, this assumption leads to closed form solutions for the DP problem and for the likelihood function. As also noted, the cost of making the extreme value assumption is that (i) it is contrary to the evidence suggesting that wage errors are approximately log normal and (ii) it assumes that shocks are contemporaneously uncorrelated. This latter assumption is very strong given that the four composite errors contain common error components; for example, image and image have husband income shocks in common.85
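The closed forms referred to here are the standard ones for additive iid type I extreme value errors. The sketch below assumes errors with location 0 and scale 1 (so their mean is Euler's constant); the function names are illustrative. Given the deterministic part of each alternative's value, the expected maximum is Euler's constant plus the log of the sum of exponentiated values, and the choice probabilities take the multinomial logit form.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329  # mean of a standard type I extreme value draw

def emax_extreme_value(values):
    """E[max_j (values[j] + e_j)] for iid standard extreme value errors e_j."""
    m = max(values)  # subtract the maximum for numerical stability
    return EULER_GAMMA + m + log(sum(exp(v - m) for v in values))

def choice_probabilities(values):
    """Multinomial logit probability that each alternative is chosen."""
    m = max(values)
    weights = [exp(v - m) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

# Four alternatives (work/marriage combinations) with hypothetical values
v = [0.0, 0.4, -0.2, 0.9]
print(round(emax_extreme_value(v), 4))
print([round(p, 3) for p in choice_probabilities(v)])
```

In a backward recursion, `emax_extreme_value` would be evaluated at each state point with `values` equal to current payoffs plus discounted continuation values, so no numerical integration is required.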

The model is estimated on PSID data from 1968 to 1985. The sample includes 548 females aged 12 to 19 in 1968 (29 to 36 in 1985), so that complete work and marital histories can be constructed (avoiding the initial conditions problem that would arise for women who were older in 1968). The terminal period is set at age 45 to reduce computational burden. It is assumed that image if the woman worked at least 775 hours in a year, but, as in Eckstein and Wolpin (1989a,b), the work choice is assumed to entail 2000 hours of work regardless of actual hours. An approximation is necessary due to the binary nature of the work decision.

The model is estimated in stages. In the first stage, the “reduced form” model with the woman’s and the husband’s wage equations substituted into (118) is estimated. In the second stage, the wage equations are estimated using employment and marriage decision rules from the reduced form model to implement a selection correction. In the third stage, a minimum distance estimator (see Chamberlain (1984)) is used to recover the structural parameters.

The estimates of the wage equations are a bit difficult to compare to prior literature as they are in levels. For instance, they imply that a year of schooling raises a woman’s earnings by $1379 per year. As mean earnings in the data are $13,698 per year, this is roughly 10% at the mean of the data. A year of schooling also raises potential husband’s earnings by $1266 per year (vs. a mean of $19,800) or 6.4%. This suggests that an important part of the return to schooling for women comes through the marriage market.86 The utility function estimates imply that children reduce the utility from participation while lagged work increases the utility from participation.

Van der Klaauw (1996) presents a substantial amount of evidence on the fit of the model, showing that it provides a good fit to the proportion of women who are working and married conditional on years since leaving school, to marriage rates by age, and to the hazard functions for marriage and divorce. It also provides a good fit to the proportion of women making each of the 4 marital status/work choices conditional on work experience and age.

Van der Klaauw (1996) then uses the model to simulate the impact of exogenous $1000 increases in annual offer wages and husband offer wages. The $1000 wage increase leads to a 26% (i.e., 2.5 year) increase in work experience by age 35. As this is a 7.3% wage increase, this implies an uncompensated labor supply elasticity of roughly 3.6. It is notable, however, that this elasticity is not comparable to a conventional Marshallian elasticity that holds all else fixed. In particular, the wage increase causes a 1 year increase in average years to first marriage, and a 1.3 year decrease in average total years of marriage. The reduction in marriage is part of what induces the increase in labor supply.87

The next significant paper in the DCDP literature on female labor supply is Francesconi (2002), which extends Eckstein and Wolpin (1989a,b) by making fertility a choice and allowing for both full- and part-time work. Thus, women have 6 choices in each annual period (after age 40 only the 3 work options are available). Francesconi (2002) also allows full and part-time experience to have separate effects on wage offers.88 Thus, the model has three endogenous state variables: number of children, and part-time and full-time experience.

Marriage is taken to be exogenous and the model begins when a woman first gets married and ends at age 65. Women are assumed to make decisions based on the expected value of husband’s income. As in Van der Klaauw (1996), the husband’s mean income is purely a function of the woman’s characteristics (i.e., age at marriage, education, education/age of marriage interactions, age). As in Eckstein and Wolpin (1989a,b) women receive utility from total consumption of the household, net of fixed costs of work and costs of children. There is again a static budget constraint, with utility linear in consumption. Utility for woman image at age image, conditional on her part-time and full-time work and fertility choices (image, image, image), is given by


image     (119)


The tastes for part and full-time work, image and image, are allowed to be a function of the stock of children, image, work experience and schooling. Tastes for children vary stochastically over time, as captured by image. Consumption is interacted with all the choice variables in order to allow husband’s income to affect work and fertility decisions. Work and fertility decisions are interacted, which enables the model to capture the fact that women have lower participation rates during years that they have newborn children.

The stochastic terms in the model are the errors in the full and part-time log wage equations and the shock to tastes for children. There are no additional taste shocks. The errors are assumed to be distributed as joint normal. Thus, as in Eckstein and Wolpin (1989b), it is necessary to assume wages are measured with error to account for observations where women are observed to work at wages that are less than the reservation wage. Given that the model contains 6 choices and three error terms, the evaluation of the Emax function integrals is difficult. Thus, Francesconi (2002) uses a simulation method like that proposed in Keane and Wolpin (1994) to evaluate the Emax functions. However, the state space is small enough that he can simulate the Emax function at every point in the state space (there is no need to interpolate between points). The three dimensional choice probability integrals are also simulated.
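When the errors are normal there is no closed form for the Emax function, and it is approximated by averaging the maximum over random draws. The sketch below shows the basic Monte Carlo step at a single state point. It is a generic illustration, not Francesconi's implementation: `value_fn` is a hypothetical user-supplied function giving the value of alternative `j` for a vector of independent standard normal shocks (correlated shocks could be produced inside `value_fn` by applying a Cholesky factor).

```python
import random

def simulate_emax(value_fn, n_choices, n_shocks, n_draws=50_000, seed=0):
    """Monte Carlo estimate of E[max_j value_fn(j, eps)] over normal shocks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        eps = [rng.gauss(0.0, 1.0) for _ in range(n_shocks)]
        total += max(value_fn(j, eps) for j in range(n_choices))
    return total / n_draws

# Check against a known case: the expected maximum of two independent
# standard normals is 1/sqrt(pi), about 0.5642.
print(round(simulate_emax(lambda j, eps: eps[j], 2, 2), 3))
```

In a full solution this calculation is repeated at every point in the state space in every period of the backward recursion, which is why the size of the state space matters so much for feasibility.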

A point worth stressing is that Francesconi (2002) assumes that only the number of children, and not their ages, enters the state space. If children of different ages had different effects on labor supply, as we have previously noted, the size of the state space would grow astronomically. Francesconi can accommodate that newborns have a different effect on labor supply than older children, because newborns are treated as a current choice variable, and they do not enter the state (as they are no longer newborns in the next period). But allowing, e.g., the number of children aged 1 to 5 to have a different effect than the number of children aged 6-17, would greatly increase complexity.

Francesconi (2002) also follows Van der Klaauw (1996) in limiting the size of the state space by assuming husband’s mean income is purely a function of the woman’s characteristics. Thus, husband-specific characteristics (e.g., a husband skill endowment) need not be included in the state space. Further, it is assumed that husband’s earnings are realized only after the wife’s labor supply and fertility decisions are made. As a result, the effect of husband’s income on the wife’s behavior can only be identified to the extent that there are exclusion restrictions, such that certain characteristics of the wife enter the model only through their effect on the husband’s wage. In fact, the husband’s wage function includes the wife’s age, age at marriage and education/age of marriage interactions, and all of these variables are excluded from the wife’s wage function and from her taste parameters.

Finally, Francesconi (2002) also extends earlier DCDP models of female labor supply by following the procedure in Keane and Wolpin (1997) to allow for unobserved heterogeneity. Specifically, he allows for three discrete types of women in terms of their skill endowments (the intercepts in the offer wage functions) and in tastes for children (image and image).

The model is estimated on a sample of 765 white women from the NLS Young Women Survey who were interviewed 16 times over the 24 years from 1968 to 1991. To be included in the sample the woman must be at least 19 and be continuously married to the same spouse during the sample period.89 Part-time is defined as 500 to 1500 hours and full-time is defined as 1500+ hours. The discount factor is fixed at 0.952. In contrast to the multi-step procedure in Van der Klaauw (1996), the decision rules and wage offer functions are estimated jointly. There are separate wage offer functions for part-time and full-time work.

The estimates of the wage function imply that a year of schooling raises the full-time offer wage by 8.4% and the part-time offer wage by 7.6%, estimates that are intermediate between the Eckstein and Wolpin (1989b) and Van der Klaauw (1996) results. Full-time experience has a larger positive effect on full-time offer wages than part-time experience. Effects of experience on part-time offer wages are generally much smaller. Measurement error accounts for about 63% of the variance of observed wages. Evaluated at the mean of the data, an extra year of school raises mean husband wages by 11%. This is consistent with the finding of Van der Klaauw (1996) that a large part of the return to schooling for women comes through the marriage market rather than the labor market. The interaction terms between consumption and work and fertility (image) are all negative, which generates negative income effects on both labor supply and fertility. In addition, individuals of the type with a high skill endowment have relatively low tastes for children.

Francesconi (2002) reports results indicating that the model provides a good fit to all 6 annual choice options up to 24 years after marriage, which corresponds to age 47 on average (the last observed age in the NLS Young Women data he analyzed). He also fits a static model (i.e., a model with the discount factor set to 0) and finds that it too provides a good fit to the in-sample data. But the models differ dramatically in their out-of-sample predictions. The static model predicts that women’s labor supply will increase sharply after about age 47 and into their 60’s. The DCDP model implies that work will stay flat and then drop slowly in their 60’s. The latter prediction is much closer to what is observed in CPS data, which cover adult women of all ages.90 The static model explains low participation rates as resulting from the presence of children; when children leave the household, participation rates rise sharply. In the dynamic model, the return to human capital investment, that is, to working, falls as one approaches the terminal period, which counteracts the effect of children leaving.

Finally, Francesconi (2002) conducts a number of simulations of how permanent changes in wages would affect labor supply. For example, consider an average woman with 2 years of full-time work experience at the time of marriage. The baseline model simulation shows that she will work for 6.8 out of the 11 years from age 30 to 40. An increase in the log wage function intercept (which represents the rental price of skill) would increase offer wages at the mean of the data by roughly 10.5%, and it would increase full-time work by roughly 60%. This implies an elasticity of labor supply with respect to rental price of skill of roughly 5.6. However, this is somewhat of an exaggeration, as some of the increase in full-time work must come from reduced part-time work. Unfortunately, Francesconi (2002) does not report the decrease in part-time work that accompanies the increase in full-time work.

The last two papers on female labor supply described below are Keane and Wolpin (2007, 2010). In these papers, Keane and Wolpin utilize approximate solution methods developed in Keane and Wolpin (1994), and estimation methods developed in Keane and Wolpin (2001), to estimate a model of female life-cycle behavior that is considerably richer than previous models in the literature. Both marriage and fertility are treated as choices, and both full and part-time work options are available. Schooling is also a choice. An important feature of the data that is not accommodated in prior dynamic models is that a large fraction of single women with children participate in public welfare programs. Thus, welfare participation (when eligible) is also incorporated as a choice.

In the model, women begin making decisions at age 14, and the terminal period is age 65. The fertile period is assumed to last up until age 45, and during this period women have up to 36 choice options in each period. Afterwards they have up to 18 options.91 The decision period is assumed to be 6 months until age 45, which is a compromise between the length of a school semester and the child gestation period. After age 45, the decision period is one year (as the fraction of women who either attend school or have children after 45 is negligible). Given that behavior of girls as young as 14 is being modeled, it is essential to consider the role of parental co-residence and parental income support. Yet, as this is not a focal point of the model, the authors choose not to treat living with parents as a choice. Both the probability of co-residence and parental transfers are treated as stochastic processes that depend on a person’s state variables.

One fundamental difference from Van der Klaauw (1996) and Francesconi (2002) is that marriage is treated as a true search process. Each period a woman may receive a marriage offer that consists of: (1) the mean wage of the husband, and (2) a marriage quality draw (which captures nonpecuniary aspects of the match). The potential husband’s mean wage depends on the woman’s characteristics, such as her schooling and skill level, as well as a permanent component drawn from a distribution. Thus, a husband fixed effect becomes part of the state space. In this setup, a woman has an incentive to reject marriage offers while waiting for a husband with a high mean wage.

Another fundamental difference from prior work is that the model is non-stationary in the sense that the economic environment changes over time. Specifically, the welfare rules change over time and differ by state, so each cohort of women (as defined by the semi-annual period in which they reach age 14) in each state faces a different sequence of welfare rules. This creates a number of computational problems. First, each cohort of women in each state faces a different dynamic optimization problem (raising computational burden). Second, one must make an assumption about how women forecast future rules. Third, the rules are complex, making it difficult to characterize them.

Keane and Wolpin (2007, 2010) deal with these problems as follows. First, they develop a simple 5 parameter function that characterizes the welfare benefit rules in each state in each year quite accurately. Second, they assume women use a state-specific VAR in these 5 parameters to predict future rules. Third, they only use data from 5 large states, so as to reduce the number of DP problems that must be solved in estimation. This enables them to use the data from other states for out-of-sample validation.
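The forecasting assumption can be illustrated with a minimal vector autoregression. The source states only that women use a state-specific VAR in the 5 rule parameters; the one-lag form, the coefficient values and the function name below are assumptions made for illustration. Given this period's parameter vector, the forecast is iterated forward one period at a time.

```python
def var1_forecast(intercept, coef, current, horizon):
    """Iterate x[t+1] = intercept + coef @ x[t] and return the forecast path."""
    path, x = [], list(current)
    n = len(x)
    for _ in range(horizon):
        x = [intercept[i] + sum(coef[i][j] * x[j] for j in range(n))
             for i in range(n)]
        path.append(x)
    return path

# Hypothetical example: 5 benefit-rule parameters, each reverting halfway
# toward a long-run level of 2.0 every period, with no cross effects.
n = 5
coef = [[0.5 if i == j else 0.0 for j in range(n)] for i in range(n)]
path = var1_forecast([1.0] * n, coef, [0.0] * n, horizon=3)
print(path[-1])
```

In a model of this kind the current rule parameters enter the state space, and the forecast rule determines how agents form expectations over the rules they will face in future periods.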

Keane and Wolpin (2007, 2010) assume that a woman receives disutility from a variable that measures “non-leisure” time. This is a sum of work hours, a fixed time cost of work, time spent in school, time required to collect welfare, and time required to care for children.92 The authors estimate weights on the variables other than work hours to account for the fact that school time, child care time and time collecting welfare may entail more/less disutility than time spent working. A woman receives utility from consumption, which is assumed to be a share of total household income. Utility is quadratic in non-leisure time and linear in consumption. Similar to the previous papers we discussed, consumption is interacted with non-leisure time. The estimated coefficient is negative, implying that consumption and leisure are complements, inducing negative income effects on labor supply and fertility.

Additional interactions are introduced that allow marriage and children to shift the degree of complementarity between consumption and leisure. This would have been irrelevant in the papers discussed previously, as they do not try to explain labor supply, marriage and fertility choices jointly. The estimates imply that marriage and children both significantly reduce the degree of complementarity between consumption and leisure, but do not eliminate it.

Women also receive utility/disutility from children, pregnancy, marriage, school attendance and welfare participation. Utility is quadratic in number of children. The utility/disutility from pregnancy is a polynomial in age. As one would expect, this becomes a large negative for women as they approach 45, consistent with the greater risks associated with pregnancy at older ages. The disutility of welfare participation enables the model to explain the common phenomenon of nonparticipation by eligible women (see Moffitt (1983)). The utility function coefficient on each of the 5 choice variables (hours, pregnancy, marriage, school and welfare) consists of a constant plus a stochastic taste shock. This enables the model to generate a nonzero probability of any observed choice outcome.

The model allows for unobserved heterogeneity in the form of 6 types of women who differ in the preference parameters (constant terms) associated with the 5 choice variables (i.e., different tastes), and in the intercepts of the own and potential husband offer wage functions (i.e., different skills). The model includes observed heterogeneity as well; the heterogeneous skill and taste parameters differ across states and across ethnic groups (blacks, whites and Hispanics). Finally, the utility function includes interactions of indicators for full and part-time work, school and marriage with lagged values of these indicators, to capture state dependence in tastes for these choice options.93

The model is estimated using data from the National Longitudinal Survey of Youth 1979 cohort (NLSY79). The NLSY79 includes women aged 14 to 21 in 1979. The paper uses the data from the years 1979 to 1991. Thus, the women reach a maximum age of 33. The states used in estimation are California, Michigan, New York, North Carolina and Ohio. To be in the sample, a woman had to reside in the same state for the whole sample period, which screens out about 30%. This leaves data on approximately 2800 women.94 The annual discount factor is fixed at 0.93.

Estimates of the log wage function imply that (at the mean of the data) an additional year of school raises wages by 9.1%. And 84% of the variance of wages is attributed to measurement error (the true log wage standard deviation is 0.17). The experience coefficients imply that the first year of full-time work raises wages by 2.6%, and that the experience profile peaks at 36 years. In addition, lagged full-time work raises the current wage offer by 7%, while lagged part-time raises it by 3%. Black and Hispanic women have lower offer wages than white women (by 13% and 6%, respectively).
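The two measurement error figures above can be tied together with a short calculation. The sketch below assumes classical measurement error (additive in logs and independent of the true log wage) and reads the 84% as a share of the variance of observed log wages. Both are our assumptions, not statements taken from the paper.

```python
import math

# Figures quoted in the text.
true_sd = 0.17        # standard deviation of the true log wage
share_error = 0.84    # share of observed variance attributed to measurement error

# Assumption: observed variance = true variance + measurement error variance.
true_var = true_sd ** 2
total_var = true_var / (1.0 - share_error)
error_var = total_var - true_var

observed_sd = math.sqrt(total_var)   # implied s.d. of observed log wages
error_sd = math.sqrt(error_var)      # implied s.d. of the measurement error

print(round(observed_sd, 3), round(error_sd, 3))
```

Under these assumptions the implied standard deviation of observed log wages is about 0.43, of which roughly 0.39 is measurement error.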

In the husband offer wage function, the coefficient of the woman’s skill endowment (i.e., intercept in the woman’s wage function) is 1.95, implying a very high degree of assortative mating on skill. And each additional year of education for the woman raises the husband offer wage by 3%. Black and Hispanic women have much lower husband offer wages than whites (by 30% and 14%, respectively). The estimates imply that women receive 55% of total household income. So, just as in Van der Klaauw (1996) and Francesconi (2002), much of the return to schooling appears to emerge through the marriage market.

Keane and Wolpin (2007) provide a good deal of evidence on the fit of the model and assess how well it predicts behavior in the holdout state of Texas. The model performs reasonably well in these tests, including providing better predictions than some candidate competing nonstructural models.

As has been the focus of the labor supply literature, Keane and Wolpin (2010) estimate labor supply wage elasticities. Recall that the model has six types of women, which we can rank by skill level from type 1 (highest skill endowment) to type 6 (lowest). Type 6 women account for the majority of welfare participants. Keane and Wolpin (2010) report experiments where they increase the offer wage by 5% for each type separately. The wage elasticities are inversely related to skill level, ranging from only 0.6 for type 1 to 9.2 for type 6. Thus, the overall elasticity of 2.8 is deceptive with regard to the behavior of various subsets of the population.

For type 6 women, the 5% wage increase has a dramatic impact on all aspects of their behavior. For instance, for white women of type 6, the percent working at ages 22 to 29.5 increases from 34% to 50% (a 47% increase). But it is also notable that mean completed schooling increases from 11.5 to 12 years, the high school drop out rate drops from 42% to 24%, welfare participation drops from 25% to 20%, and incidence of out-of-wedlock teenage pregnancies drops from 3.4% to 2.8%. All of these behavioral changes (i.e., more education, fewer teenage pregnancies, less welfare participation) contribute to the increase in labor supply. In contrast, type 1 are already completing a high level of schooling, are rarely having children at young ages, are not participating in welfare, and are participating in the labor market at a high rate. Thus, in a sense there are fewer channels through which a wage increase can affect them. In summary, the results indicate that wage elasticities of labor supply for low skilled women are much greater than for high skilled women.
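As a rough check on magnitudes, the employment response quoted above for white women of type 6 can be converted into an implied participation elasticity. This is only back-of-the-envelope arithmetic on the figures in the text. The result need not coincide with the 9.2 reported for type 6 as a whole, which may be based on a different labor supply measure and on all ethnic groups.

```python
def pct_change(before, after):
    """Proportional change from `before` to `after`."""
    return (after - before) / before

wage_change = 0.05                           # the 5% offer wage increase
employment_change = pct_change(0.34, 0.50)   # share working at ages 22 to 29.5
implied_elasticity = employment_change / wage_change

print(f"{employment_change:.1%}", round(implied_elasticity, 1))
```

The ratio is about 9.4, the same order of magnitude as the elasticity reported for type 6 women.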

It is difficult to summarize the estimates of labor supply elasticities for women across the studies we have surveyed. Several of the non-DCDP studies we have examined calculate what might be called “short run” elasticities that hold work experience, marriage and fertility fixed. On the other hand, the DCDP models calculate “long run” elasticities that allow, depending on the study, some combination of experience, fertility, marriage and education to adjust to wage changes. Nevertheless, a reasonable assessment of the estimates from this literature is that the labor supply elasticity estimates for women are generally quite large. The DCDP models give uniformly large “long run” elasticities ranging from 2.8 to 5.6. The life-cycle models of Heckman and MaCurdy (1982) and Kimmel and Kniesner (1998) give large Frisch elasticities (2.35 to 3.05). The Marshallian elasticity of 0.89 obtained by Cogan (1981) in a static model is also quite large.95 Thus, 7 of the 9 studies obtain large female labor supply elasticities (of various types). Only the Blundell and Walker (1986) and Blundell et al. (1998) studies find small elasticities. This may be because these two studies consider the labor supply response of working women to wage changes, while the other 7 studies incorporate the participation margin.

The richness of the Keane and Wolpin (2010) model enables them to address a variety of substantive issues beyond calculating labor supply elasticities. These focus on (i) the factors that account for differences between blacks, whites and Hispanics in choice behavior and (ii) the effects of changing welfare rules. With respect to behavioral differences among minority and white women, the model estimates indicate that black women face a worse marriage market than do white women. The mean earnings of potential husbands, that is, the pool of men who make marriage offers, are 27 percent lower for black than for white women. In addition, unobservable traits of potential mates reduce the psychic value of getting married by $2,500 (in 1987 NY dollars) for black women relative to white women. The estimates also indicate that black women face poorer labor market opportunities. Wage offers are 12.5 percent lower for black women than for white women. In terms of preferences, the stigma attached to being on welfare is smaller for black women, although the difference, 290 dollars per six month period, does not seem that large. Black women differ little from white women in the disutility they attach to work (an extra 1000 hours of work is equivalent to a 117 dollar greater drop in consumption for black than for white women), but they are estimated to have a significantly greater preference for children (the birth of a child is equivalent to a greater increase in consumption by 1352 dollars for black women than for white women).

To assess the importance of labor market, marriage market and preference differences, Keane and Wolpin (2010) simulate behaviors of black women under alternative counterfactual scenarios. They find that equalizing marriage market opportunities between black and white women would reduce welfare participation of black women at, for example, ages 26-29 from 29.7 percent to 21.4 percent, thus closing 37 percent of the black-white gap. Equalizing labor market opportunities has a somewhat larger impact, reducing the gap by about 45 percent. However, these changes have opposite effects on employment. Improving marriage market opportunities of black women in this age group reduces their employment rate from 55.7 to 42.3 percent, and thus widens the black-white gap, while employment rates are essentially equalized when labor market opportunities are equalized. Both counterfactuals increase marriage rates in that age range, although directly operating on marriage market conditions has a much larger impact, reducing the marriage rate gap of 37 percentage points to only 10 percentage points. Along with this large increase in marriage rates, the mean number of teenage births increases slightly. On the other hand, the relatively small increase in marriage rates that accompanies the counterfactual improvement in the labor market leads to a fall in the mean number of teenage births by 13 percent and closes the black-white gap by 38 percent. Finally, improving the marriage market of black women reduces their completed schooling by a third of a year on average, widening the gap with white women, while improving their labor market opportunities increases their completed schooling by 0.2 years. Increasing welfare stigma of black women to that of white women, given the relatively small difference noted above, has only a small impact on behavior; the largest effect is to reduce welfare participation by 3.3 percentage points, significantly less than that exhibited for the other counterfactuals.

As these counterfactual experiments illustrate, none of these differences, when taken one at a time, can account for the racial differences in outcomes. Improving marriage market opportunities, by itself, reduces some of the gaps, but widens others. Improving labor market opportunities reduces all of the gaps, but considerably less so for demographic outcomes. And, welfare stigma accounts for little of the racial differences in behavior.

In another counterfactual experiment, Keane and Wolpin (2010) simulate the effect of eliminating welfare. Because welfare receipt is heavily concentrated among one of the six (unobserved) types, this experiment was performed only for women with the preferences and opportunities of the women of that type. For this type, 68.1 percent of black women and 24.6 percent of white women are receiving welfare at ages 26-29. That difference of 43.5 percentage points is eliminated in the experiment. But perhaps the most striking result is that eliminating welfare also essentially eliminates the employment gap, even though labor market opportunities are worse for black women. The original gap of 16.5 percentage points is reduced to 1.4 percentage points. Eliminating welfare also increases marriage rates more for black women (by 14.8 percentage points) than for white women (by 8.2 percentage points), reducing the original gap from 36.2 percentage points to 29.6 percentage points. The mean number of teenage births falls slightly, but by about the same amount for black and white women. A similar result is observed for the proportion of women of this type who do not graduate from high school.

As Keane and Wolpin (2010) conclude, there is no simple answer as to what causes the differences in the behavior of black and white women. The welfare system in place in the US until the major reform in 1996 differentially affected the labor market attachment of black women, but did not by itself account for much of the difference in marriage, fertility and schooling. The poorer marriage and labor market opportunities of black women both contributed importantly to the greater dependency of black women on welfare. Ultimately, it is the interaction of all of these factors, the welfare system, opportunities and preferences that jointly account for the large racial gaps in labor market and demographic outcomes.

In summary, the female labor supply literature has emphasized the connection between participation decisions and human capital, fertility and marriage. Those papers that have attempted to model fertility and/or marriage as choices have ignored savings behavior to achieve computational tractability. There is as of yet no model of female life cycle behavior that includes savings along with human capital, fertility and marriage. This is an important avenue for future research, although a difficult one, because it involves modeling interactions within a household in a dynamic framework.96

4.1.5 Male labor supply

As we have noted, DCDP models of female labor supply have ignored considerations of consumption smoothing through savings and borrowing behavior. This is in sharp contrast to the literature on male labor supply, which has made consumption smoothing a major focus, in conjunction with a continuous hours choice and, with few exceptions, has ignored human capital accumulation.

Indeed, rather than adopting the DCDP approach, the literature on males has generally used estimation methods that specifically seek to avoid having to solve the full dynamic programming problem. A notable example is the seminal work by MaCurdy (1981, 1983), who developed estimation methods using the Euler conditions of dynamic models of labor supply with savings. Shaw (1989) extended this approach to a model with labor supply, savings and human capital accumulation. Of course, the DCDP methodology does not preclude modeling all these aspects of behavior, but it is computationally burdensome. On the other hand, a limitation of Euler equation and other “non-DCDP” or “non-full solution” approaches is that, while they can deliver structural parameter estimates, they do not in general allow one to simulate behavioral responses to changes in policy (or the economic environment more generally).

To our knowledge only one paper, Imai and Keane (2004), has used DCDP methodology to estimate a labor supply model with assets, continuous hours and on-the-job human capital accumulation. We present a simplified version of the Imai and Keane (2004) model that captures the main points.

Assume that a worker’s human capital, denoted by image, evolves according to the simple human capital production function


image     (120)


The growth in human capital, in this formulation, is a constant fraction of hours worked. image is the person’s skill (or human capital) endowment at the time of labor force entry.97 A person’s wage at time image, is equal to the current stock of human capital times the (constant) rental price of human capital, image.98 Human capital is subject to a transitory productivity shock. Specifically,


image     (121)


The period-specific utility function is given by


image     (122)


where image is an age-varying parameter that shifts tastes for work. In contrast to the female labor supply literature, where utility is typically assumed to be linear in consumption, utility in (122) is CRRA in consumption. Given the emphasis of the male literature on savings, the CRRA is a more natural choice. Assets evolve according to


image     (123)


where image is the tax rate on labor income. Given this setup, the state space at image consists of image.

The individual is assumed to maximize the expected present discounted value of utility over a finite horizon. The value function at age image is then

image     (124)

image     (125)

where the expectation is taken over the future transitory productivity shocks and tastes for work conditional on the current state space. As is common in these types of models, we assume these stochastic terms are independent over time. In that case, we can replace image with image, that is, we can drop image and image from image in forming the expectation in (125). Note that image is simply the analog of the image function in the discrete choice problem already discussed.

As in the discrete choice case, the solution of the model consists of finding the image functions. Imai and Keane (2004) do that using a backsolving and approximation procedure similar to Keane and Wolpin (1994), adapted to continuous choice variables. In the terminal period, the value function is


image     (126)


In this simple static problem, and without a bequest motive, given image and image, the consumer chooses image and image to maximize utility subject to the budget constraint image.99

In principle, the backsolving procedure starts by calculating image for every possible state in image at which the worker might enter period image. The solution for image is given by the first-order condition


image     (127)


This equation can be solved numerically for the optimal image using an iterative search procedure. Once the optimal image is determined for each state point, the optimal image is found from the budget constraint. image is then found by substituting the optimal value of image and image into (126).
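The terminal-period step can be sketched in a few lines. The functional forms here are our assumptions, chosen only to be consistent with the verbal description: CRRA utility in consumption with marginal utility C^eta (eta < 0), disutility of hours with marginal disutility beta * h^gamma (gamma > 0), and a budget constraint in which consumption equals after-tax earnings plus available assets. The function name `solve_terminal` and all parameter values are hypothetical. The iterative search is a plain bisection on the first-order condition, standing in for whatever search routine is used in practice.

```python
def solve_terminal(w_net, assets, eta=-0.5, gamma=0.5, beta=1.0, tol=1e-10):
    """Hours and consumption in the last period (illustrative forms only).

    w_net  : after-tax wage, w * (1 - tau)
    assets : resources available for consumption besides earnings (>= 0)
    """
    def foc(h):
        # Marginal utility of the extra consumption bought by one more hour,
        # minus the marginal disutility of that hour.
        c = w_net * h + assets
        return w_net * c ** eta - beta * h ** gamma

    # foc is strictly decreasing in h, so bracket the root and bisect.
    lo, hi = 0.0, 1.0
    while foc(hi) > 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    hours = 0.5 * (lo + hi)
    consumption = w_net * hours + assets   # from the budget constraint
    return hours, consumption

for a in (0.5, 2.0):
    h, c = solve_terminal(w_net=1.0, assets=a)
    print(f"assets={a}: hours={h:.4f}, consumption={c:.4f}")
```

With these forms the optimal hours fall as assets rise, which is the income effect that the CRRA specification is meant to capture.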

Although we need to calculate image only at the deterministic components of the state space, image, a problem arises, because the number of possible levels of human capital and assets at the start of period image is extremely large, if not infinite. Thus, it is not computationally feasible to literally solve for image for every possible state value. Imai and Keane (2004) therefore adopt the Keane and Wolpin (1994) approximation method discussed earlier, which involves solving for image at a finite (and relatively small) subset of the possible state points.

To implement that procedure, a regression is estimated as some flexible function of the state variables and used to predict or interpolate the value of image at any desired state point image, including, in particular, points that were not among those used to fit the regression. Thus, having fit this interpolating regression, we may proceed as if image is known for every possible state point in image. As before, denote the interpolating function that approximates image as image. We must assume that image is a smooth differentiable function of image and image (e.g., a polynomial) for the next step. For expositional convenience, let image be the following simple function,


image     (128)


and let image be the predicted value of image, where the image’s are estimated parameters.
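The interpolation step amounts to an ordinary least squares fit followed by prediction at arbitrary state points. The sketch below uses a quadratic polynomial in human capital and assets. That basis, the grid, and the stand-in function used to generate the values being fitted are all our assumptions for illustration and are not the specification in (128).

```python
import numpy as np

def design(K, A):
    """Regressors for a quadratic polynomial in human capital K and assets A."""
    K = np.asarray(K, dtype=float)
    A = np.asarray(A, dtype=float)
    return np.column_stack([np.ones_like(K), K, A, K ** 2, A ** 2, K * A])

def fit_interpolator(K, A, V):
    """Least squares estimates of the interpolating function parameters."""
    pi, *_ = np.linalg.lstsq(design(K, A), np.asarray(V, dtype=float), rcond=None)
    return pi

def predict(pi, K, A):
    """Predicted value at any state points, on or off the fitting grid."""
    return design(K, A) @ pi

# A small grid of state points at which the value is assumed to have been solved.
K_grid, A_grid = np.meshgrid(np.linspace(1.0, 5.0, 5), np.linspace(0.0, 10.0, 5))
K_pts, A_pts = K_grid.ravel(), A_grid.ravel()

# Hypothetical stand-in for the solved values (not a model solution).
V_pts = np.log(K_pts) + np.sqrt(1.0 + A_pts)

pi_hat = fit_interpolator(K_pts, A_pts, V_pts)
v_off_grid = predict(pi_hat, [2.5], [3.3])[0]   # a point not used in the fit
print(pi_hat.round(3), round(float(v_off_grid), 3))
```

Because the fitted function is a polynomial, it is smooth and differentiable in both state variables, which is what the next step of the backsolving procedure requires.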

As in the discrete choice setting, the next step of the backsolving process moves back to period image. Then, using the predicted values of image from the approximating function,


image     (129)


Upon substituting in the laws of motion for image and image, we get


image     (130)


Finding the optimal values of consumption and hours is now just like a static optimization problem. The first order conditions are given by

image     (131)

image

These two equations can be solved numerically for image and image at any given state point in image.100

Following the development for period image, the next step is to calculate the values of image at a subset of the state points. For a given value of image and image we can substitute the optimal image and image into image (130) and numerically integrate over the joint distribution of image and image. Given the values of image, we can then estimate the interpolating function at image, say


image     (132)


Using this interpolating function, we can write the (approximate) value functions at time image in an analogous fashion to (130). The only difference is in the interpolating function parameters. These steps are repeated until an approximate solution is obtained for every period back to image.

The approximate solution consists of the complete set of interpolating function parameters, the image’s for image. Given these estimated interpolating functions, it is possible to solve numerically the simple two equation system like (131) at each image to find the optimal choice of a worker at any point in the state space. In particular, using image, one can solve for optimal labor supply and consumption in period image, the first period of the working life. As previously noted, this is what first order conditions alone do not provide. Furthermore, by drawing values for the taste shocks and rental rates and repeatedly solving optimal labor supply and consumption over time, one can simulate entire career paths of workers. This enables one to simulate how changes in the economic environment, such as changes in tax rates, would affect the entire life-cycle path of labor supply and consumption, as one can re-solve the model and simulate career paths under different settings for the policy parameters.
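The full procedure (solve at a subset of state points, fit an interpolating regression, step back one period, repeat) can be sketched compactly. This is a stylized illustration under assumptions of our own, not the Imai and Keane (2004) implementation. Human capital grows by a fraction ALPHA of hours, the wage shock takes three equally likely values in place of numerical integration, there are no taste shocks, borrowing is ruled out, and the optimal choices are found by grid search instead of solving the first-order conditions. All names and parameter values are hypothetical.

```python
import numpy as np

# Illustrative parameters (assumptions, not estimates).
ETA, GAMMA, BETA = -0.5, 0.5, 1.0     # consumption curvature, hours curvature, taste for work
ALPHA, R_PRICE, RATE = 0.05, 1.0, 0.03  # human capital growth, rental price, interest rate
DELTA, T = 0.95, 5                    # discount factor, number of periods
SHOCKS = np.array([0.8, 1.0, 1.2])    # equally likely wage shocks
H = np.linspace(0.05, 2.0, 40)[:, None]   # hours grid (column)
S = np.linspace(0.05, 1.0, 40)[None, :]   # share of cash on hand consumed (row)

def utility(c, h):
    return c ** (1 + ETA) / (1 + ETA) - BETA * h ** (1 + GAMMA) / (1 + GAMMA)

def basis(K, A):
    """Regressors of the interpolating function (quadratic in logs)."""
    k, a = np.log(K), np.log(1.0 + A)
    return np.stack([np.ones_like(k), k, a, k * k, a * a, k * a], axis=-1)

def choice_values(K, A, eps, pi_next, tau):
    """Value and consumption for every (hours, share) pair at one state and shock."""
    cash = A + R_PRICE * K * eps * (1.0 - tau) * H   # cash on hand, one row per hours level
    if pi_next is None:                              # last period: consume everything
        c = cash + 0.0 * S
        return utility(c, H), c
    c = S * cash
    K_next = K * (1.0 + ALPHA * H) + 0.0 * S         # law of motion for human capital
    A_next = (1.0 + RATE) * (cash - c)               # law of motion for assets
    return utility(c, H) + DELTA * basis(K_next, A_next) @ pi_next, c

def emax(K, A, pi_next, tau):
    """Expected maximized value at state (K, A), averaging over the wage shocks."""
    return np.mean([choice_values(K, A, e, pi_next, tau)[0].max() for e in SHOCKS])

def solve(tau):
    """Backward recursion; pis[t] approximates Emax at the start of period t."""
    Kg, Ag = np.meshgrid(np.linspace(1.0, 3.5, 6), np.linspace(0.0, 12.0, 6))
    Kp, Ap = Kg.ravel(), Ag.ravel()
    pis = [None]                                     # nothing follows the last period
    for _ in range(T):
        targets = np.array([emax(k, a, pis[0], tau) for k, a in zip(Kp, Ap)])
        pi, *_ = np.linalg.lstsq(basis(Kp, Ap), targets, rcond=None)
        pis.insert(0, pi)
    return pis

def optimal_choice(K, A, eps, t, pis, tau):
    """Optimal hours and consumption in period t at a given state and shock."""
    v, c = choice_values(K, A, eps, pis[t + 1], tau)
    i, j = np.unravel_index(np.argmax(v), v.shape)
    return H[i, 0], c[i, j]

# Re-solve under two tax rates and compare first-period choices at one state.
pis_low, pis_high = solve(tau=0.2), solve(tau=0.3)
h_low, c_low = optimal_choice(2.0, 1.0, 1.0, 0, pis_low, 0.2)
h_high, c_high = optimal_choice(2.0, 1.0, 1.0, 0, pis_high, 0.3)
print(h_low, c_low, h_high, c_high)
```

The last lines illustrate why a full solution is useful: a change in a policy parameter requires re-solving the model, after which choices at any state and period can be recomputed. A polynomial fitted on a coarse grid can extrapolate poorly beyond the grid, so an applied implementation would need more care with the choice of state points and basis.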

Imai and Keane (2004) estimate their model using white males from the NLSY79. They choose this data set because of its fairly extensive asset data. The men in their sample are aged 20 to 36 and, as the focus of their paper is solely on labor supply, they are required to have finished school. Due to the computational burden of estimation they randomly choose 1000 men from the NLSY79 sample to use in estimation. People are observed for an average of 7.5 years, each starting from the age at school completion.

Notably, Imai and Keane (2004) allow for measurement error in observed hours, earnings and assets when constructing the likelihood function. As all outcomes are measured with error, construction of the likelihood is fairly simple. One can simulate career histories for each worker, and then form the likelihood of a worker’s observed history of hours, earnings and assets as the joint density of the set of measurement errors necessary to reconcile the observed history with the simulated data.101
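The logic of a likelihood built from measurement errors can be illustrated with a minimal sketch. We assume a single outcome (say, log hours) observed with additive normal measurement error of known standard deviation, and we average the error density over simulated histories. The function names and the normality assumption are ours. The actual construction in Imai and Keane (2004) involves hours, earnings and assets jointly and is not reproduced here.

```python
import math

def log_normal_pdf(x, sigma):
    """Log density of a mean-zero normal measurement error evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * (x / sigma) ** 2

def simulated_log_likelihood(observed, simulated_paths, sigma):
    """Log of the average, across simulated histories, of the density of the
    measurement errors needed to reconcile the observed history with each one."""
    log_terms = []
    for path in simulated_paths:
        log_terms.append(sum(log_normal_pdf(o - s, sigma)
                             for o, s in zip(observed, path)))
    peak = max(log_terms)   # subtract the maximum before exponentiating, for stability
    return peak + math.log(sum(math.exp(v - peak) for v in log_terms) / len(log_terms))

observed = [1.0, 2.0]
paths = [[1.1, 1.9], [0.5, 2.5], [1.0, 2.2]]   # hypothetical simulated histories
print(simulated_log_likelihood(observed, paths, sigma=0.3))
```

Because every outcome is measured with error, any observed history has positive density given any simulated history, so the likelihood is well defined however few simulations are drawn.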

Imai and Keane (2004) estimate that image. In a model without human capital, this would yield a Frisch elasticity of image, which implies a much higher willingness to substitute labor intertemporally than in almost all prior studies for men (see MaCurdy (1983) for an exception). Simulations of the model reveal that, even accounting for human capital effects, the estimate of image implies more elastic labor supply than in most prior work.

Imai and Keane (2004) explain their high estimate of intertemporal substitution based on the logic of Fig. 1. The figure presents a stylized (but fairly accurate) picture of how wages and hours move over the life cycle. Both wages and hours have a hump shape, but the hump in wages is much more pronounced. This apparently weak response of hours to wages leads conventional methods of estimating the intertemporal elasticity of substitution (which ignore the effect of working on the accumulation of human capital) to produce small values.

Figure 1 Hours, wages and price of time over the life-cycle. Note: HC denotes the return to an hour of work experience, in terms of increased present value of future wages. The opportunity cost of time is Wage + HC.

Indeed, Imai and Keane (2004) show that if they simulate data from their model and apply instrumental variable methods like those in MaCurdy (1981) and Altonji (1986) to estimate image, they obtain values of 0.325 (standard error = 0.256) and 0.476 (standard error = 0.182), respectively. Thus, the Imai and Keane (2004) model generates life-cycle histories that, when viewed through the lens of models that ignore human capital accumulation, imply similarly low co-movement between hours and wages to those obtained in most prior work. As further confirmation of this point, the authors report simple OLS regressions of hours changes on wage changes for both the NLSY79 data and the data simulated from their model. The estimates are −0.231 and −0.293, respectively. Thus, a negative correlation between hours changes and wage changes in the raw data is perfectly consistent with a high willingness to substitute labor intertemporally over the life cycle.

What reconciles these prima facie contradictory observations is the divergence between the opportunity cost of time and the wage in a model with returns to work experience. In particular, Imai and Keane (2004) estimate that from age 20 to 36 the mean of the opportunity cost of time increases by only 13%. In contrast, the mean wage rate increases by 90% in the actual data, and 86% in the simulated data. Thus, the wage increases about 6.5 times faster than the opportunity cost of time. These figures imply that conventional methods of calculating image will understate it by a factor of roughly 6.5.

This point is illustrated in Fig. 1 by the line labeled “Wage + HC,” which adds the wage and the return to an hour of work experience (in the form of higher future earnings) to obtain the opportunity cost of time. As the figure illustrates, the opportunity cost of time is much flatter over the life cycle than is the wage rate. Thus, hours appear to be much more responsive to changes in the opportunity cost of time than to changes in wages alone.

Imai and Keane (2004) use their model to simulate how workers of different ages would respond to a 2% temporary unanticipated annual wage increase. For a worker at age 20, hours increase only 0.6%. But the response grows steadily with age. At age 60 the increase in hours is nearly 4%, and at age 65 it is about 5.5%. The reason the effect of a temporary wage increase rises with age is that, as depicted in Fig. 1, as a person ages the current wage becomes a larger fraction of the opportunity cost of time. According to the estimates in Imai and Keane (2004), at age 20 the wage is less than half of the opportunity cost of time, but by age 40 the wage is 84% of the opportunity cost of time.

Unfortunately, the Imai and Keane (2004) simulations do not reveal what the model implies about how workers would respond to permanent tax changes. To fill this gap, Keane (2009a) uses the Imai-Keane model to simulate the impact of a permanent 5% tax rate increase (starting at age 20 and lasting through age 65) on labor supply over the entire working life. If the tax revenue is simply thrown away, the model implies that average hours of work (from ages 20 to 65) drop from 1992 per year to 1954 per year, a 2% drop. If the revenue is redistributed as a lump sum transfer, labor supply drops to 1861 hours per year, a 6.6% drop. The latter figure gives a reasonable approximation to the compensated elasticity with respect to permanent tax changes implied by the model (i.e., 6.6/5 = 1.32).
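The elasticity quoted in parentheses can be recovered from the hours figures. The sketch below takes the size of the tax change to be 5%, the value stated in the caption of Table 2 and the one consistent with an elasticity of 1.32. It also treats a 5% tax as a 5% fall in the after-tax wage, which is our simplifying assumption.

```python
baseline_hours = 1992.0      # average annual hours, ages 20 to 65, no tax change
hours_pure_tax = 1954.0      # revenue thrown away
hours_with_rebate = 1861.0   # revenue returned as a lump sum
tax_change = 0.05            # assumed 5% fall in the after-tax wage

drop_pure = (baseline_hours - hours_pure_tax) / baseline_hours
drop_rebate = (baseline_hours - hours_with_rebate) / baseline_hours

uncompensated = drop_pure / tax_change    # includes the income effect of the tax
compensated = drop_rebate / tax_change    # income effect offset by the rebate

print(f"{drop_pure:.1%} {drop_rebate:.1%}",
      round(uncompensated, 2), round(compensated, 2))
```

The gap between the two figures (roughly 0.38 versus 1.32) reflects the income effect: when the revenue is thrown away, workers are poorer and partly offset the substitution effect by working more.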

The effects of the tax, however, are very different at different ages. As seen in Table 2, tax effects on labor supply slowly rise from age 20 to about age 40. Starting in their 40’s, the effects on labor supply start to grow quite quickly, and by age 60 effects are substantial. Thus, in response to a permanent tax increase, workers not only reduce labor supply, but also shift their lifetime labor supply out of older ages towards younger ages.

Table 2 Effect of a 5% tax on earnings on labor supply by age.

Age                    Pure tax    Tax plus lump sum redistribution
20                     −0.7%       −3.2%
30                     −0.7%       −3.3%
40                     −0.9%       −4.2%
45                     −1.2%       −5.7%
50                     −2.1%       −8.7%
60                     −9.1%       −20.0%
20-65 (Total hours)    −2.0%       −6.6%

To our knowledge, there are only two papers besides Imai and Keane (2004) that have used full solution methods to estimate a life-cycle model that includes both human capital investment and savings, Keane and Wolpin (2001) and Van der Klaauw and Wolpin (2008). Neither of those papers, however, models the continuous choice of hours, although they allow for several discrete alternatives.102 The main focus of the Keane and Wolpin (2001) paper is on schooling choice (not labor supply), so we discuss it in a later section. But their paper is of interest here because it assumes a CRRA utility function in consumption, and so, like Imai-Keane, provides an estimate of the key preference parameter image, which governs income effects in labor supply and intertemporal substitution in consumption. Keane and Wolpin (2001) obtain image, which implies weaker income effects, and less curvature in consumption (i.e., higher willingness to substitute intertemporally), than much of the prior literature. Keane and Wolpin (2001, p. 1078) argue that the reason is that their work accommodates liquidity constraints, and that failure to do so may have led to a downward bias in estimates of image in prior work.103

Imai and Keane (2004) estimate that image. This implies a somewhat lower intertemporal elasticity of substitution in consumption than the Keane and Wolpin (2001) estimate of image (that is, image vs. −2.0). But their estimate of image still implies weaker income effects on labor supply, and a higher willingness to substitute consumption intertemporally, than much of the prior literature. Instead of liquidity constraints (as in Keane and Wolpin (2001)), the Imai and Keane (2004) model “explains” the fact that young workers do not borrow heavily against higher future earnings by assuming age effects in the marginal utility of consumption. Both models provide a good fit to asset data over the life-cycle. Finally, Keane (2009b) uses the Imai-Keane estimates of image and image to calibrate a simple two-period equilibrium model. He finds that welfare costs of labor income taxation are much larger than more conventional values of image and image would suggest.

In summary, although the literature that uses dynamic programming models to study life-cycle labor supply, asset accumulation and human capital investment for males is quite small, it has produced important results. Specifically, it finds that the intertemporal elasticities of substitution for both labor supply and consumption are quite a bit larger than implied by earlier work. This, in turn, implies that tax effects on labor supply for males may be larger than conventionally thought. Clearly more work is called for to investigate the robustness of these results to alternative model specifications and data sources.

4.2 Job search

Along with the dynamic labor force participation model, the estimation of models of job search (the transition from unemployment to employment) was among the first applications of the DCDP approach. The labor supply and job search literatures have, however, addressed different questions and followed distinct paths. To understand why that has been the case, recall that in the labor force participation model workers with the same characteristics, and thus the same level of productivity, are offered the same wage. That is, in the wage offer function, image, image is assumed to be a market-level (for example, competitively determined) skill rental price and image and image are worker characteristics.

In contrast, the job search literature starts from the assumption that firms may offer different wages (skill rental prices, image) to identical workers within a given labor market. Then, the wage offer received by a worker of given characteristics from a firm image is image, where image is the mean skill rental price in the labor market and where image reflects firm image’s idiosyncratic component of the skill rental price. In the basic model, the accepted job lasts “forever” and individuals are not subject to productivity shocks. Given this wage structure, once an individual accepts a wage offer from a firm with a given image, their skill rental price is fixed for as long as they work for that firm.104 The information set of the individual includes the distribution of image across firms, but not which firms are matched with particular values of image. Because there are more and less desirable firms, individuals have an incentive to engage in job search, that is, to look for a high wage firm. Job search is sequential. The difference in the labor force participation and job search models thus reflects the different assumptions made about the wage structure of the labor market.

The partial equilibrium search model has normally been used to understand a different phenomenon than has the labor supply model. Official labor force statistics distinguish among three mutually exclusive and exhaustive states: being employed, being unemployed and being out of the labor force. The distinction between the latter two states is based on whether an individual is actively seeking work. Both the labor force participation and search models consider only two states. In the labor force participation model, unemployment and out of the labor force are collapsed into one nonemployment state. In that model, it is assumed that a new wage offer is received every period and any individual will work at some offered wage. The search model conditions on individuals having already chosen unemployment over being out of the labor force and does not assume that a job offer necessarily arises each period. Because of this difference, labor force participation models have been applied to low frequency data based on the employment-nonemployment dichotomy, commonly at the annual level and often for women, while job search models have been applied to high frequency data, for example, at the weekly level, based on the employment-unemployment dichotomy.

The structural implementation of the standard partial equilibrium job search model was first considered by Wolpin (1987) and Van den Berg (1990), building upon a nonstructural literature that had begun a decade or more before.105 The nonstructural empirical literature was focused on the evaluation of the effect of UI programs, more specifically, on the estimation of the impact of unemployment benefits on the duration of unemployment and wages. The empirical approach in that literature was (and is still) based, loosely and in some ways incorrectly, on the sequential job search model first formalized by McCall (1970) and Mortensen (1970). The structural empirical literature, following the DCDP paradigm, is based on the explicit solution and estimation of the sequential model.

There have been many extensions and modifications of the standard model in the structural empirical literature. Within the standard framework, Stern (1989), extending the original contribution of Stigler (1961) to a sequential framework, allowed for simultaneous search, that is, for the submission of multiple job applications in a period. Blau (1991) dropped the assumption of wealth maximization, allowing for job offers to include not only a wage offer but also an hours offer. Ferrall (1997) incorporated all major features of the Canadian UI program. Gemici (2007) considered the joint husband and wife search-migration decision in an intra-household bargaining framework. Paserman (2008), adopting a behavioral approach, allowed for hyperbolic discounting.

The standard model has also been extended beyond the consideration of the single transition from unemployment to employment. Wolpin (1992) incorporated job-to-job transitions and both involuntary and voluntary transitions into unemployment.106 Rendon (2006) allowed for a savings decision in a setting where agents can also transit, both through quits and layoffs, from employment to unemployment.107 Both of these latter papers also allowed for wage growth with the accumulation of work experience, employer-specific (tenure) in the case of Rendon and both general and employer-specific in the case of Wolpin.

As noted, the standard job search model assumes that ex ante identical workers may receive different wage offers, or analogously, that the same unemployed worker may receive different offers over time. Diamond (1970) showed that with the assumptions of the standard job search model, in a game in which firms are aware of worker search strategies, the wage offer distribution will be degenerate at the worker’s reservation wage or outside option. This result led to the development of models in which wage dispersion could be rationalized as an equilibrium outcome, which in turn led to a structural empirical literature. The empirical literature has focused on two kinds of models: those based on a search-matching-bargaining approach to wage determination and those based on wage posting by firms (Albrecht and Axell, 1984; Burdett and Mortensen, 1998). These models not only rationalized wage dispersion but also allowed for quantification and policy analyses in an equilibrium setting, for example, of changes in UI benefits or changes in the level of the minimum wage.

The empirical implementation of equilibrium search models has become a major strand of the structural job search literature.108 Embedded within those models are different variants of the standard partial equilibrium search model and, in that sense, the development of the DCDP estimation approach was a critical precursor. However, given that the modeling has gone well beyond the partial equilibrium search model to which the DCDP approach has direct application, it would take us too far afield to provide a review of that literature. For such a review, we would refer the reader to the chapter by Mortensen and Pissarides in the Handbook of Labor Economics (Volume 3b, 1999) or the more recent survey by Eckstein and Van den Berg (2007).

In the rest of this section we review the structural empirical literature on the partial equilibrium job search model. Because the structural literature is explicitly connected to the theory, we first present the formal structure of the standard job search model and show how the nonstructural empirical literature can be interpreted in the context of the job search model. We then discuss conditions for identification and methods of estimation. Finally, we describe three empirical papers that have estimated extended versions of the standard model, and thus exemplify the nature of scientific progress in the structural literature, as we discussed in the introduction of this chapter, and we report empirical findings from counterfactual experiments in those papers.

4.2.1 The standard discrete-time job search model

In the discrete time formulation, an unemployed individual receives a job offer in each period with probability $q$. Wage offers are drawn from the known cumulative distribution function, $F(w)$. An accepted job offer (and its concomitant wage) is permanent. While unemployed an individual receives $b$, unemployment benefits (if eligible) net of the cost of search. The individual is assumed to maximize the present discounted value of net income. We consider both infinite and finite horizon models, which have somewhat different empirical implications and implications for the identification of model parameters.109

Infinite horizon model

The value of a wage offer of $w$, given a discount factor of $\delta$, is

$$W(w) = w + \delta w + \delta^2 w + \cdots \tag{133}$$

$$= \frac{w}{1-\delta}. \tag{134}$$

In any period, the value of continuing to search, $V$, either because an offer was rejected or an offer was not received, consists of the current period payoff, $b$, plus the discounted expected value of waiting another period. In that case, if an offer is received, with probability $q$, the individual chooses the maximum of the value of working at a wage $w$, $W(w)$, and the value of continuing to search, $V$. If no offer is received, which occurs with probability $1-q$, the individual must continue to search and receives $V$. Thus the alternative-specific value function, the Bellman equation, for the search choice is


$$V = b + \delta\left[\,q\,E\max\{W(w), V\} + (1-q)V\,\right]. \tag{135}$$


Rearranging yields


$$V(1-\delta) = b + \delta q\,E\max\{W(w) - V, 0\}, \tag{136}$$


which has a unique solution for $V$, as long as the cost of search is not so large as to make the right hand side negative.110 Defining $w^*$, the reservation wage, to be the wage offer that equates the value of search and the value of accepting the job, that is,


$$W(w^*) = \frac{w^*}{1-\delta} = V, \tag{137}$$


with a little further algebra, we obtain the following implicit equation for the reservation wage (which must have a unique solution given that $V$ does):


$$w^* = b + \frac{\delta q}{1-\delta}\int_{w^*}^{\infty}(w - w^*)\,dF(w). \tag{138}$$
111


Thus, the reservation wage is a function of $b$, $\delta$, $q$ and $F$:


$$w^* = w^*(b, \delta, q, F). \tag{139}$$
112


The individual accepts any wage offer that exceeds the reservation wage and declines offers otherwise.
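Because the reservation wage is defined only implicitly, it has to be found numerically in practice. The following is a minimal sketch, not a method taken from the papers reviewed here. It assumes the implicit equation has the form $w^* = b + \frac{\delta q}{1-\delta}E\max\{w - w^*, 0\}$, that wage offers are lognormal with hypothetical parameters `mu` and `sigma`, and that $b > 0$. The function names and the parameter values are illustrative assumptions.

```python
import math


def norm_cdf(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_surplus(w_star, mu, sigma):
    """E[max(w - w_star, 0)] when ln(w) ~ N(mu, sigma^2) (assumed offer distribution)."""
    z = (mu - math.log(w_star)) / sigma
    mean_offer = math.exp(mu + 0.5 * sigma ** 2)
    return mean_offer * norm_cdf(z + sigma) - w_star * norm_cdf(z)


def reservation_wage(b, delta, q, mu, sigma, tol=1e-10):
    """Solve w* = b + delta*q/(1-delta) * E[max(w - w*, 0)] by bisection (needs b > 0)."""
    k = delta * q / (1.0 - delta)

    def gap(w):
        # Strictly increasing in w, negative at w = b, positive for large w.
        return w - b - k * expected_surplus(w, mu, sigma)

    lo = b
    hi = b + k * math.exp(mu + 0.5 * sigma ** 2) + 1.0  # surplus never exceeds E[w]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative (made-up) parameter values.
w_star = reservation_wage(b=0.5, delta=0.95, q=0.3, mu=0.0, sigma=0.5)
print(f"reservation wage: {w_star:.4f}")
```

Bisection is used here because the gap between the two sides of the implicit equation is monotone in the candidate reservation wage, so a bracketing method cannot fail; any other root finder would serve equally well.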

Although the reservation wage is a deterministic function, the length of an unemployment spell is stochastic because the timing and level of wage offers are probabilistic. Thus, measures of the outcomes of search, such as the duration of unemployment spells and the level of accepted wages, are probabilistic. In particular, measuring duration so that a spell of length $t_u = 1, 2, \ldots$ ends with an accepted offer in period $t_u$, the survivor function, the probability that the duration of unemployment, $T_u$, is at least as large as some given length, is

$$\Pr(T_u \ge t_u) = [\,qF(w^*) + (1-q)\,]^{t_u-1} \tag{140}$$

$$= [\,1 - q(1-F(w^*))\,]^{t_u-1}. \tag{141}$$

The term inside the brackets in (140) is the probability of receiving an offer in a period and rejecting it (because it is below the reservation wage) plus the probability of not receiving an offer. The cdf, pdf and hazard function are:

$$\Pr(T_u \le t_u) = 1 - [\,1 - q(1-F(w^*))\,]^{t_u}, \tag{142}$$

$$\Pr(T_u = t_u) = [\,1 - q(1-F(w^*))\,]^{t_u-1}\,q(1-F(w^*)), \tag{143}$$

$$h = \Pr(T_u = t_u \mid T_u \ge t_u) = q(1-F(w^*)). \tag{144}$$

As seen, the survivor function, the cdf and the pdf can all be written as functions of the hazard rate, the exit rate from unemployment conditional on not having previously exited. From (144), it can be seen that the hazard rate is constant. Thus, in a homogeneous population, the infinite horizon search model implies the absence of duration dependence.

Given parameter values, mean duration is given by


$$E(T_u) = \sum_{t_u=1}^{\infty} t_u \Pr(T_u = t_u) = \frac{1}{q(1-F(w^*))}. \tag{145}$$


Notice that mean duration is simply one over the hazard rate.113 Likewise, the mean of the accepted wage is


$$E(w \mid w \ge w^*) = \frac{\int_{w^*}^{\infty} w\,dF(w)}{1-F(w^*)}, \tag{146}$$


which clearly is larger than the mean of the wage offer distribution.114
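The hazard, mean duration and mean accepted wage can be computed directly once a reservation wage is given. The sketch below is illustrative only. It assumes lognormal wage offers with hypothetical parameters `mu` and `sigma`, takes the reservation wage `w_star` as an input rather than solving for it, and assumes durations are counted from one, so that the spell ends in the period the offer is accepted.

```python
import math


def norm_cdf(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def search_outcomes(w_star, q, mu, sigma):
    """Hazard, mean duration, mean offer and mean accepted wage for lognormal offers."""
    z = (mu - math.log(w_star)) / sigma
    accept_prob = norm_cdf(z)                 # 1 - F(w*)
    hazard = q * accept_prob                  # constant exit rate
    mean_duration = 1.0 / hazard              # geometric durations starting at 1
    mean_offer = math.exp(mu + 0.5 * sigma ** 2)
    # Mean of the lognormal truncated from below at w*.
    mean_accepted = mean_offer * norm_cdf(z + sigma) / accept_prob
    return hazard, mean_duration, mean_offer, mean_accepted


# Illustrative (made-up) values; w_star is treated as known here.
h, dur, offer, accepted = search_outcomes(w_star=1.2, q=0.3, mu=0.0, sigma=0.5)
print(f"hazard={h:.4f}  mean duration={dur:.2f}  "
      f"mean offer={offer:.4f}  mean accepted wage={accepted:.4f}")
```

Because acceptance truncates the offer distribution from below, the mean accepted wage computed this way always exceeds the mean offer, whatever parameter values are chosen.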

In addition to implying a constant hazard rate, the infinite horizon model has predictions about the impact of changes in $b$, $q$ and $F$ on the reservation wage, on the hazard rate and on the moments of the accepted wage distribution. It thus is, in principle, possible to test the theory. Comparative static effects of the hazard rate (and thus mean duration) with respect to its arguments, where $f$ denotes the density of wage offers, $\mu$ the mean of the wage offer distribution and $\sigma$ a mean preserving spread, are (see Mortensen (1986)):

$$\frac{\partial h}{\partial b} = -q f(w^*)\frac{\partial w^*}{\partial b} < 0, \tag{147}$$

$$\frac{\partial h}{\partial q} = (1 - F(w^*)) - q f(w^*)\frac{\partial w^*}{\partial q} \gtrless 0, \tag{148}$$

$$\frac{\partial h}{\partial \mu} > 0, \tag{149}$$

$$\frac{\partial h}{\partial \sigma} < 0. \tag{150}$$

An increase in the level of unemployment compensation benefits increases the reservation wage and reduces the unemployment hazard rate (147). An increase in the offer probability has an ambiguous effect; it increases the reservation wage, which reduces the hazard rate, but also directly increases the hazard rate through the higher offer probability (148). It turns out that for a certain class of distributions (log concave), the latter effect dominates (Burdett and Ondrich, 1985). Increasing the mean of the wage offer distribution (149), $\mu$, increases the hazard rate; although an increase in $\mu$ increases the reservation wage, the increase is less than one for one. Finally, increasing the mean preserving spread of the distribution (150), $\sigma$, reduces the hazard because an increase in the mass of the right tail of the wage offer distribution increases the payoff to search, thus increasing the reservation wage. An additional set of implications follows about the mean of the accepted wage; anything that increases the reservation wage also increases the mean accepted wage.
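These comparative statics can be checked numerically for a particular parameterization. The sketch below is an illustration under stated assumptions, not a general proof. It assumes lognormal offers described by their mean (`mean`) and log standard deviation (`sigma`), so that raising `sigma` with `mean` held fixed acts as a mean preserving spread. All names and values are hypothetical, and the sign of the offer-probability effect is specific to the values chosen.

```python
import math


def norm_cdf(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hazard(b, delta, q, mean, sigma, tol=1e-10):
    """Hazard q * (1 - F(w*)) with lognormal offers that have the given mean."""
    mu = math.log(mean) - 0.5 * sigma ** 2   # keeps E[w] = mean when sigma changes
    k = delta * q / (1.0 - delta)

    def gap(w):
        z = (mu - math.log(w)) / sigma
        surplus = mean * norm_cdf(z + sigma) - w * norm_cdf(z)  # E[max(w' - w, 0)]
        return w - b - k * surplus

    lo, hi = b, b + k * mean + 1.0           # gap(lo) < 0 < gap(hi) when b > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    w_star = 0.5 * (lo + hi)
    return q * norm_cdf((mu - math.log(w_star)) / sigma)


# Baseline and one-at-a-time perturbations (all values are made up).
base = dict(b=0.5, delta=0.95, q=0.3, mean=math.exp(0.125), sigma=0.5)
h0 = hazard(**base)
for name, value in [("b", 0.6), ("q", 0.35), ("mean", 1.25), ("sigma", 0.6)]:
    h1 = hazard(**{**base, name: value})
    print(f"raising {name}: hazard {h0:.4f} -> {h1:.4f}")
```

Changing one argument at a time mirrors the partial derivatives in the text: the reservation wage is re-solved at each perturbed parameter vector before the hazard is recomputed.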

Finite horizon model

Spells of unemployment tend to be short (weeks or months) in relation to an individual’s life span. A finite lifetime would not seem, therefore, to be a reason to explore the implications of a finite horizon search model. On the other hand, it is reasonable to assume that individuals generally will not be able to self-finance extended periods of unemployment and that external borrowing is limited. One can think, then, of the finite horizon as corresponding to the maximal unemployment period that can be financed through internal and external funds, although we continue to assume that once a job is accepted it lasts forever, that is, the horizon is infinite subsequent to accepting a job. In addition to the previous notation, we denote by $T$ the end of the search horizon. To close the model, it is necessary to specify the value function if the terminal period is reached without having accepted a job. We assume that in that case the individual receives $b$ forever.115

Without going into the details, the reservation wage path can be shown to satisfy the following difference equation:

$$w^*_t = (1-\delta)b + \delta w^*_{t+1} + \delta q\int_{w^*_{t+1}}^{\infty}(w - w^*_{t+1})\,dF(w), \quad t < T, \tag{151}$$

$$w^*_T = b. \tag{152}$$

Notice that (151) reduces to the implicit reservation wage equation for the infinite horizon problem if $w^*_t = w^*_{t+1} = w^*$. Given a distributional assumption for wage offers, $F(w)$, the solution of the finite horizon reservation wage path can be obtained numerically by starting from period $T$ and working backwards.116

In the finite horizon case, the reservation wage is decreasing in the duration of the spell, $t$, rather than being constant as in the infinite horizon case. In addition, the reservation wage is bounded from below by $b$ (at $T$), and from above by the infinite horizon reservation wage $w^*$. The hazard rate is thus increasing in $t$. Thus, the longer the spell duration, the greater the exit rate. The important property of the finite horizon reservation wage is that it depends on the time left until the horizon is reached. The reservation wages are equal under two different horizons not when they have the same amount of time elapsed since beginning the spell of unemployment, but when they have the same amount of time left until the horizon is reached ($T - t$).
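The backward solution can be sketched in a few lines. The code below is an illustration under assumptions rather than the procedure used in any particular paper. It assumes the recursion $w^*_t = (1-\delta)b + \delta w^*_{t+1} + \delta q\,E\max\{w - w^*_{t+1}, 0\}$ with terminal value $w^*_T = b$, lognormal offers with hypothetical parameters `mu` and `sigma`, and made-up parameter values.

```python
import math


def norm_cdf(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_surplus(w_star, mu, sigma):
    """E[max(w - w_star, 0)] when ln(w) ~ N(mu, sigma^2) (assumed offer distribution)."""
    z = (mu - math.log(w_star)) / sigma
    return math.exp(mu + 0.5 * sigma ** 2) * norm_cdf(z + sigma) - w_star * norm_cdf(z)


def step_back(w_next, b, delta, q, mu, sigma):
    """One backward step: reservation wage today given next period's reservation wage."""
    return (1.0 - delta) * b + delta * w_next + delta * q * expected_surplus(w_next, mu, sigma)


def reservation_path(b, delta, q, mu, sigma, T):
    """Reservation wages for periods 1..T, starting from the terminal value w*_T = b."""
    path = [b]
    for _ in range(T - 1):
        path.insert(0, step_back(path[0], b, delta, q, mu, sigma))
    return path


def hazard_path(path, q, mu, sigma):
    """Period-specific exit rates q * (1 - F(w*_t))."""
    return [q * norm_cdf((mu - math.log(w)) / sigma) for w in path]


path = reservation_path(b=0.5, delta=0.95, q=0.3, mu=0.0, sigma=0.5, T=20)
hazards = hazard_path(path, q=0.3, mu=0.0, sigma=0.5)
for t in (1, 10, 20):
    print(f"t={t:2d}  reservation wage={path[t - 1]:.4f}  hazard={hazards[t - 1]:.4f}")
```

Iterating `step_back` many more times from the terminal value approximates the infinite horizon reservation wage, which is one way to see the upper bound described above.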

Nonstructural (parametric) approach to estimation

The early nonstructural approach to estimating the job search model was regression based. The primary concern of that literature, as well as the later literature based on hazard modeling, was the estimation of the impact of unemployment benefits on the duration of unemployment and post-unemployment wages.117 The regression (or hazard rate) specification in the nonstructural approach was motivated by the standard job search model. Classen (1977) provides a clear statement of the connection between the theory and the regression specification. The latter is given by

image     (153)

image     (154)

where image is spell duration, image is a measure of the post-unemployment wage, image is the weekly UI benefit amount and the image’s are “proxies” for a worker’s skill level, the cost of search and the job offer rate. As Classen notes, the determinants of both spell duration and the post-unemployment wage should be exactly the same as they are both optimal outcomes derived from the search model. Among the proxy variables used in Classen’s analysis are demographics, such as age, race and sex, and a measure of the wage on the job held prior to beginning the unemployment spell. Although not included in the Classen study, other variables often included in this type of specification are education, marital status, number of dependents and a measure of aggregate unemployment in the relevant labor market.

A test of the theory amounts to a test that benefits increase both expected duration and the mean accepted wage, that is, that the coefficients on the benefit amount in (153) and (154) are both positive. Any further test of the theory would involve specifying how proxy variables are related to the structural parameters, $q$, the offer probability, and $F(w)$, the wage offer distribution.

Classen is clear as to the purpose of including the pre-unemployment wage, namely as a proxy, most directly perhaps for the mean of the wage offer distribution. However, although the inclusion of that variable or, as is also common, of the replacement rate, the ratio of the benefit level to the pre-unemployment wage, was and continues to be standard in the nonstructural literature, the need for stating a rationale has been lost. As has been pointed out elsewhere (Wolpin, 1995), the inclusion of the pre-unemployment wage (or the replacement rate) cannot be justified by the standard search model, that is, given perfect measures of image and image, it would not have any impact on search outcomes. More importantly, however, given its ubiquitous use, is that its inclusion leads to “proxy variable bias,” as explained below.

Of course, the rationale for its inclusion is to avoid omitted variable bias. For example, suppose that the pre-unemployment wage is meant to proxy the mean of the wage offer distribution faced by individual image, image. Now, UI benefits are usually tied to the pre-unemployment wage, at least up to some limit. Thus, if some determinants of image are omitted (we do not have a perfect measure of image), variation in the benefit level will reflect, in part, the fact that those with higher pre-unemployment wages have higher image’s. In that case, a positive correlation between UI benefit levels and the pre-unemployment wage will lead to a negative bias in the effect of UI benefits on the duration of unemployment (recall that the higher is image, the greater the hazard rate).

Although omitted variable bias is well understood, the potential for bias introduced by using proxy variables is less well appreciated. The source of the problem is that, in the context of a search model, the pre-unemployment wage must have been the outcome of a search during a prior unemployment spell. To isolate the effect of using this proxy, assume that the duration of the prior unemployment spell was governed by the same behavioral process and fundamentals as the current spell. Such an assumption is consistent with a model in which there are exogenous layoffs and unemployment spells are renewal processes (the stochastic properties of all unemployment spells are the same).118 In particular, suppose that only the benefit level and the mean of the wage offer distribution vary (say, geographically) in the sample. Then, taking deviations from means (without renaming the variables, to conserve on notation), the duration equation is


image     (155)


Assuming image is unobserved, and thus omitted from the regression, the bias in the OLS estimator of image is given by


image     (156)


where image is the regression coefficient of image on image. Thus, if image and image are positively correlated and image as theory suggests, the bias in the estimated effect of UI benefits on duration will be negative.
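The direction of the omitted variable bias can be illustrated with a small simulation. This is a sketch under assumptions: the symbols are hypothetical stand-ins, since the original notation is not recoverable here. Duration is generated as `pi1 * b + pi2 * mu + u`, benefits are positively related to the omitted mean of the offer distribution through a made-up coefficient `rho`, and `pi2` is negative as the theory suggests.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_short_regression(n=100_000, pi1=1.0, pi2=-2.0, rho=0.5, seed=0):
    """Regress duration on benefits alone when the offer mean mu is omitted."""
    rng = random.Random(seed)
    benefits, durations = [], []
    for _ in range(n):
        mu = rng.gauss(0.0, 1.0)                 # unobserved mean of the offer distribution
        b = rho * mu + rng.gauss(0.0, 1.0)       # benefits positively related to mu
        d = pi1 * b + pi2 * mu + rng.gauss(0.0, 1.0)
        benefits.append(b)
        durations.append(d)
    return ols_slope(benefits, durations)


estimate = simulate_short_regression()
# Textbook omitted-variable formula: pi1 + pi2 * Cov(b, mu) / Var(b).
predicted = 1.0 + (-2.0) * 0.5 / (0.5 ** 2 + 1.0)
print(f"true effect=1.0  OLS estimate={estimate:.3f}  formula={predicted:.3f}")
```

With these made-up values the short regression understates the benefit effect, which is the negative bias described in the text.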

Now, because the pre-unemployment wage, image, is the result of a prior search, it will have the same arguments as (155), namely

image     (157)

image     (158)

where image and where image given the definition of image. To derive a regression equation that includes image and image, solve for image in (158) and substitute into (155), yielding

image     (159)

image     (160)

where image. We are interested in whether the OLS estimator for image is biased, that is, whether image. It is easiest to see whether this holds by explicitly deriving the bias expression. It is given by


image     (161)


The bias is zero if either imageimage or if image, either of which would be a fortuitous property of the sample.

What is the relationship between the biases from omitting image (omitted variable bias) versus including image (proxy variable bias)? It turns out that a sufficient condition for the bias from omitting the pre-unemployment wage to be smaller than from including it is that image, which only will hold if image, a violation of the theory. In general, the biases cannot be ordered and it is unclear which is the better strategy to follow to minimize the bias. The implicit (that is, without justification) assumption made by almost all researchers is that omitted variable bias is greater than proxy variable bias in this context. Moreover, if the benefit level varies with the pre-unemployment wage and we have good measures of the mean of the wage offer distribution, variation in the benefit level from this source is helpful in identifying the UI benefit effect. The variation in the pre-unemployment wage around the mean of the offer distribution that induces benefit variation is purely due to random draws from the wage offer distribution.

Researchers adopting the nonstructural approach have universally included the pre-unemployment wage, and thus, assumed that omitted variable bias is greater than proxy variable bias. There is a larger point reflected by this choice, namely the importance of theory in empirical work. Structural work requires that all variables be explicitly accounted for in the model. A similar standard for nonstructural work might have revealed the existence of the choice between omitted variable and proxy variable bias, a choice not explicitly acknowledged in the nonstructural literature.119

Structural approach to estimation

Identification: Identification is no less an important issue in structural empirical work than in nonstructural work. We consider identification of the standard search model parameters assuming we have data from a homogeneous population on durations of unemployment and on accepted wages.120

To establish identification, it is useful to rewrite the reservation wage implicit Eq. (138) as

$$w^* = b + \frac{\delta q}{1-\delta}\int_{w^*}^{\infty}(w - w^*)\,dF(w) \tag{162}$$

$$= b + \frac{\delta}{1-\delta}\,q(1 - F(w^*))\left[E(w \mid w \ge w^*) - w^*\right] \tag{163}$$

$$= b + \frac{\delta}{1-\delta}\,h\left[E(w \mid w \ge w^*) - w^*\right], \tag{164}$$

where the last equality uses (144). From the accepted wage data, note that a consistent estimator of the reservation wage is the lowest observed wage: $\hat{w}^* = \min_i\{w_i\}$.121 Then, given an estimate of $h$ from the duration data, and recognizing that from the accepted wage data, we can also obtain an estimate of $E(w \mid w \ge w^*)$, we can identify $b$ (which includes the unobserved cost of search) if we take $\delta$ as given. This result does not require a distributional assumption for wage offers.

We cannot, however, separate $q$ and $F$ without a distributional assumption. Although we know $F(w \mid w \ge w^*)$, given an estimate of $w^*$, we can recover the wage offer distribution, $F$, only if it is possible to recover the untruncated distribution from the truncated distribution. Obviously, that cannot be done without making a distributional assumption. Assuming that $F$ is recoverable from the accepted wage distribution, then from $h = q(1 - F(w^*))$, we can recover the offer probability, $q$.

Recoverability of the wage offer distribution is not always possible. It is useful for later reference to consider an example taken from Flinn and Heckman (1982). Assume that the wage offer distribution is Pareto, that is, having pdf


$$f(w) = \varphi w^{\gamma}, \quad 0 < c \le w < \infty, \quad \gamma < -2. \tag{165}$$


Notice that the support of the distribution is bounded from below by a constant $c$. The density of accepted wages is

$$f(w \mid w \ge w^*) = \frac{\varphi w^{\gamma}}{\int_{w^*}^{\infty}\varphi w^{\gamma}\,dw} \tag{166}$$

$$= \frac{-(\gamma+1)\,w^{\gamma}}{(w^*)^{\gamma+1}}. \tag{167}$$

Given an estimate of $w^*$, from the minimum observed wage, and of the conditional density, we can recover $\gamma$. However, there are many values of $\varphi$ that are consistent with the truncated distribution. We, thus, cannot identify $\varphi$, which means we cannot identify the wage offer distribution, $F$. As already noted, the consequence of this lack of identification is that we cannot separate $q$ and $F$, or, more specifically, $q$ and $\varphi$. To see that explicitly, write the reservation wage equation and the hazard function under the Pareto distribution,

$$w^* = b + \frac{\delta}{1-\delta}\,q\varphi\,\frac{(w^*)^{\gamma+2}}{(\gamma+1)(\gamma+2)}, \tag{168}$$

$$h = -q\varphi\,\frac{(w^*)^{\gamma+1}}{\gamma+1}. \tag{169}$$

Because $q$ and $\varphi$ only enter multiplicatively as $q\varphi$, it is impossible to separately identify them. Fortunately, most of the commonly used distributions for wage offer functions, for example, the lognormal distribution, are recoverable from the distribution of accepted wages. However, there is an important lesson to draw, namely that parametric assumptions do not always assure identification.
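The identification failure in the Pareto case can be made concrete numerically. The sketch below is illustrative and rests on assumptions: the offer density is $\varphi w^{\gamma}$ on $[c, \infty)$ with $\gamma < -2$, the names `phi` and `gamma` and all values are hypothetical, and $b > 0$. Two parameter vectors with different offer probabilities but the same product of offer probability and `phi` are compared.

```python
import math


def pareto_outcomes(b, delta, q, phi, gamma, tol=1e-12):
    """Reservation wage and hazard when the offer density is phi * w**gamma (gamma < -2)."""
    a = (delta / (1.0 - delta)) * q * phi / ((gamma + 1.0) * (gamma + 2.0))

    def gap(w):
        # Increasing in w because the exponent gamma + 2 is negative.
        return w - b - a * w ** (gamma + 2.0)

    lo, hi = b, b + a * b ** (gamma + 2.0) + 1.0   # gap(lo) < 0 < gap(hi) when b > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    w_star = 0.5 * (lo + hi)
    hazard = -q * phi * w_star ** (gamma + 1.0) / (gamma + 1.0)
    return w_star, hazard


def lower_bound(phi, gamma):
    """Support bound c implied by phi when the density integrates to one."""
    return (phi / -(gamma + 1.0)) ** (-1.0 / (gamma + 1.0))


# Two made-up parameter vectors with the same product q * phi = 0.2.
first = pareto_outcomes(b=0.5, delta=0.95, q=0.5, phi=0.4, gamma=-3.0)
second = pareto_outcomes(b=0.5, delta=0.95, q=0.25, phi=0.8, gamma=-3.0)
print("q=0.50, phi=0.4:", first)
print("q=0.25, phi=0.8:", second)
```

Both parameter vectors imply the same reservation wage and the same hazard, and the reservation wage lies above the support bound in each case, so data on durations and accepted wages cannot tell them apart.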

The analysis of identification in the finite horizon case is similar. The reservation wage at each period can be consistently estimated from the period-specific minimum observed wages. Analogous to (164), we can write the implicit reservation wage equation as


$$w^*_t = (1-\delta)b + \delta w^*_{t+1} + \delta h_{t+1}\left[E(w \mid w \ge w^*_{t+1}) - w^*_{t+1}\right], \tag{170}$$
122


where $h_{t+1} = q(1 - F(w^*_{t+1}))$ is the hazard rate at $t+1$. This equation must hold exactly at each time $t$. As long as there are durations of unemployment that extend through more than two periods, that is, given $w^*_t$, $w^*_{t+1}$ and $w^*_{t+2}$, $\delta$ and $b$ can be separately identified by solving the two difference equations for these two unknowns. Recall that this separation was impossible in the infinite horizon case. Moreover, if we have more than three periods of data, the model is rejectable. Identifiability of $q$ and $F$, however, still requires recoverability of the wage offer distribution.

Likelihood function: The likelihood function for the search model takes a form analogous to that for the binary labor force participation model, given data on unemployment durations and accepted wages. Consider the likelihood contribution of an individual solving an infinite horizon search model, who has a completed unemployment spell of length $d_i$ and accepted wage $w_{d_i}$:

$$L_i = [\,qF(w^*) + (1-q)\,]^{d_i-1} \times q\Pr(w_{d_i} > w^*, w_{d_i}) \tag{171}$$

$$= [\,qF(w^*) + (1-q)\,]^{d_i-1} \times q\Pr(w_{d_i} > w^* \mid w_{d_i})\,f(w_{d_i}). \tag{172}$$

Note that the reservation wage, the solution to the implicit equation (138), is a function of the model parameters $b$, $\delta$, $q$ and $F(w)$. The first (bracketed) term in the likelihood is the probability that in each of the $d_i - 1$ periods before period $d_i$ the individual received an offer and rejected it or did not receive an offer. The second term is the probability that the individual received an offer of $w_{d_i}$ in period $d_i$ and accepted it. Individuals who have incomplete unemployment spells would contribute only the first term to the likelihood.
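A log-likelihood of this kind can be sketched as follows. This is a minimal illustration under assumptions, not the estimator of any paper reviewed here: wage offers are lognormal with hypothetical parameters `mu` and `sigma`, a completed spell of length d contributes d - 1 periods without an accepted offer followed by one acceptance, an incomplete spell observed for d periods contributes d such periods, and the data layout (a list of `(duration, wage)` pairs with `None` for incomplete spells) is invented for the example.

```python
import math


def norm_cdf(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def reservation_wage(b, delta, q, mu, sigma, tol=1e-10):
    """Bisection solution of w* = b + delta*q/(1-delta) * E[max(w - w*, 0)] (b > 0)."""
    k = delta * q / (1.0 - delta)
    mean_offer = math.exp(mu + 0.5 * sigma ** 2)

    def gap(w):
        z = (mu - math.log(w)) / sigma
        return w - b - k * (mean_offer * norm_cdf(z + sigma) - w * norm_cdf(z))

    lo, hi = b, b + k * mean_offer + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def log_likelihood(spells, b, delta, q, mu, sigma):
    """Sum of log contributions; spells is a list of (duration, accepted wage or None)."""
    w_star = reservation_wage(b, delta, q, mu, sigma)
    accept_prob = norm_cdf((mu - math.log(w_star)) / sigma)   # 1 - F(w*)
    log_stay = math.log(1.0 - q * accept_prob)                # no accepted offer this period
    total = 0.0
    for duration, wage in spells:
        if wage is None:                       # incomplete spell: first term only
            total += duration * log_stay
        elif wage < w_star:                    # acceptance probability is zero
            return -math.inf
        else:
            log_density = (-math.log(wage * sigma * math.sqrt(2.0 * math.pi))
                           - 0.5 * ((math.log(wage) - mu) / sigma) ** 2)
            total += (duration - 1) * log_stay + math.log(q) + log_density
    return total


# Made-up data: two completed spells and one incomplete spell.
spells = [(3, 1.8), (1, 2.5), (6, None)]
print(log_likelihood(spells, b=0.5, delta=0.95, q=0.3, mu=0.0, sigma=0.5))
```

In estimation this function would be evaluated repeatedly inside an optimizer, with the reservation wage re-solved at every trial parameter vector.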

In the labor force participation model of Section 3.1.1, the probability of accepting an offer to work depended on both a random wage draw and a random taste draw. Thus, whether or not an individual worked conditional on a wage draw was probabilistic, because it depended on the taste draw that we do not observe. In the search model, however, for any given value of the reservation wage (or the parameters that determine it), whether or not an individual works conditional on a wage draw is deterministic, that is, its probability, $\Pr(w_{d_i} > w^* \mid w_{d_i})$ in (172), is either one or zero. In order that the likelihood not be degenerate, the reservation wage cannot exceed the lowest accepted wage in the sample; for this reason, the minimum observed wage in the sample is the maximum likelihood estimate of the reservation wage. That would not create an issue if observed wages were all reasonable. However, in most survey data sets the lowest reported wage is often quite small, less than one dollar or even only a few cents. Such outliers would potentially have an extreme effect on the estimates of the structural parameters. One remedy would be to trim the wage data, say by whatever percent led to a “reasonable” lowest wage. However, that would be essentially choosing the reservation wage by fiat. A second alternative would be to add another error to the model, for example, by allowing the cost of search to be stochastic, in which case the reservation wage would be stochastic.123 A very low accepted wage would be consistent with the individual having drawn a very high search cost.

Of course, adding another source of error does not deal with what is the likely root cause of the problem, which is that wages are not accurately reported.124 That “fact” has led researchers to directly allow for measurement error in the reported wage. Introducing measurement error not only accounts for a real feature of the data, it is also convenient in that it does not require any change in the solution of the search problem. The reservation wage is itself unaffected by the existence of measurement error; the implicit reservation wage equation is still given by (138). Taking into account the existence of measurement error, letting $w^r_{d_i}$ be the reported accepted wage and $w^T_{d_i}$ the true accepted wage, the likelihood function (172) becomes


$$\begin{aligned} L_i &= [\,qF(w^*) + (1-q)\,]^{d_i-1} \times q\Pr(w^T_{d_i} > w^*, w^r_{d_i}) \\ &= [\,qF(w^*) + (1-q)\,]^{d_i-1} \times q\int \Pr(w^T_{d_i} > w^* \mid w^r_{d_i}, w^T_{d_i})\,g(w^r_{d_i} \mid w^T_{d_i})\,f(w^T_{d_i})\,dw^T_{d_i} \\ &= [\,qF(w^*) + (1-q)\,]^{d_i-1} \times q\int \Pr(w^T_{d_i} > w^* \mid w^T_{d_i})\,g(w^r_{d_i} \mid w^T_{d_i})\,f(w^T_{d_i})\,dw^T_{d_i}, \end{aligned} \tag{173}$$


where $g(w^r_{d_i} \mid w^T_{d_i})$ is the distribution of the measurement error and where the third equality emphasizes the fact that it is the true wage only and not the reported wage that affects the acceptance probability. The most common assumption in the literature is that the measurement error is multiplicative, that is, proportional to the true wage.

The estimation of the finite horizon search problem when there are extreme low-wage outliers is particularly problematic. Recall that the reservation wage is declining with duration. Thus, if an outlier observation occurs at an early duration, the entire subsequent path of reservation wages must lie below the reservation wage at that early duration. The incorporation of measurement error is, therefore, critical for estimation. The analogous likelihood contribution for an individual for the finite horizon model, which takes into account that the reservation wage is duration dependent, is


$$L_i = \prod_{j=1}^{d_i-1}[\,qF(w^*_j) + (1-q)\,] \times q\int \Pr(w^T_{d_i} > w^*_{d_i} \mid w^T_{d_i})\,g(w^r_{d_i} \mid w^T_{d_i})\,f(w^T_{d_i})\,dw^T_{d_i}. \tag{174}$$


The estimation of the partial equilibrium search model involves the same iterative process as for the labor force participation model, namely numerically solving a dynamic programming problem at trial parameters and maximizing the likelihood function. The generality of the DCDP approach has allowed researchers considerable flexibility in modeling choices. Thus, researchers have adopted different distributional assumptions and have extended the standard search model in a number of directions that we have already mentioned.

As has been generally true in the DCDP literature, the theoretical models that serve as their foundation cannot directly be taken to the data. In the case of the standard search model, neither the infinite horizon nor the finite horizon model can fit the generally observed fact that the hazard rate out of unemployment declines with duration. Recall that the hazard rate is constant in the infinite horizon case and increasing in the finite horizon case. There are several ways to deal with this mismatch between the data and the models. In the infinite horizon model, introducing unobserved heterogeneity in model fundamentals, such as the cost of search, the offer probability and/or the wage offer distribution, can produce negative duration dependence in the population hazard while maintaining stationarity at the individual level. In the finite horizon case, allowing for time dependencies in model fundamentals, such as allowing offer probabilities to decline with duration, can create negative duration dependence, in this case at the individual level as well as at the population level.
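The way unobserved heterogeneity generates negative duration dependence can be seen in a two-type example. The sketch below is illustrative only: the type shares and the type-specific hazards are made-up numbers, and each type is assumed to face a constant hazard, as in the infinite horizon model, with durations counted from one.

```python
def population_hazard(shares, hazards, t):
    """Exit rate at duration t among all survivors, for types with constant hazards."""
    # Mass of each type still unemployed at the start of period t.
    survivors = [s * (1.0 - h) ** (t - 1) for s, h in zip(shares, hazards)]
    exits = [mass * h for mass, h in zip(survivors, hazards)]
    return sum(exits) / sum(survivors)


# Two hypothetical types: half the population exits quickly, half slowly.
shares = [0.5, 0.5]
hazards = [0.30, 0.05]
for t in (1, 5, 10, 20):
    print(f"t={t:2d}  population hazard={population_hazard(shares, hazards, t):.4f}")
```

Each type's hazard is flat, but high-hazard individuals leave unemployment first, so the mix of survivors shifts toward the low-hazard type and the population hazard falls with duration.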

Selected literature

Rather than do a comprehensive review of the contributions of DCDP modeling to the empirical (partial equilibrium) search literature, we illustrate the broad range of model specifications and the usefulness of the approach for policy evaluation with three examples.

Rendon (2006): In this first example, Rendon (2006) extends the standard finite horizon search model to allow for exogenous job loss (layoffs), for on-the-job search and for savings in the presence of borrowing constraints. Recall that the standard finite horizon search model imposes a terminal search period, with the putative rationale being that the individual cannot search indefinitely due to a limit on borrowing. However, because this limitation is not an explicit part of the model, the terminal value function (or equivalently, the terminal reservation wage) is not determined as part of the model. Rendon (2006) fills this lacuna in the structural empirical literature.125

Rendon, building on theoretical papers by Danforth (1979) and others and on the DCDP model of Wolpin (1992), considers a job search model with the following features:

1. Individuals maximize the present discounted value of lifetime utility. Flow utility is a CRRA function in consumption. Individuals are finitely lived and exogenously retire at a known time prior to the end of life. Time is discrete. Prior to retirement, the individual is in one of two employment states, unemployed or employed.
2. As in the standard search model, in each period of unemployment, the individual receives a job (wage) offer with a positive probability. If an offer is received, the individual makes an acceptance-rejection decision. The individual enters each period with some level of assets (positive or negative) and decides the level of assets to carry forward to the next period. Income while unemployed consists of the interest return (or payment, if assets are negative) on assets, and unemployment compensation benefits plus other family transfers minus the cost of search.
3. In each employment period, the individual faces a positive probability of layoff as well as a positive probability of receiving a wage offer from another employer. Regardless of whether an offer is received, the individual can decide to quit and become unemployed. As in the unemployment state, the individual decides on the level of assets to carry forward into the next period. Income while employed consists of the interest return (or payment, if assets are negative) on current assets plus the wage. Wages grow, starting at the initial accepted wage, deterministically with job tenure.
4. The level of assets that an individual holds in any period can be negative; the individual can carry debt, but the amount of debt cannot exceed the present value of the amount the individual can pay back with certainty (the Hakansson-Miller or “natural” borrowing limit). Because the individual can with some small probability be unemployed until the retirement date, the only certain income each period is the amount of unemployment income given by UI benefits and family transfers net of search costs.

In this model, individuals generally accumulate assets while employed as insurance against future unemployment spells and decumulate assets to finance search while unemployed. An individual’s reservation wage, as in the standard model, declines with duration as assets are run down. Individuals who start an unemployment spell (at the same life cycle point, say, due to a layoff) with higher assets (having randomly drawn a higher acceptable wage offer on the prior spell of unemployment) have higher reservation wages, longer unemployment spells and higher accepted wages. Thus, wages will not only be positively correlated across employment spells generated by job-to-job transitions, but also across unemployment spells separated by layoffs.

Unemployment spells are not only generated by exogenous layoffs, but also by voluntary quits into unemployment, even though wages on the job are non-stochastic. This behavior can arise when the offer probability is greater while unemployed than while employed. Consider an unemployed individual who, having not received frequent offers or only received offers at low wages, has drawn down assets to finance consumption, perhaps even hitting the borrowing constraint. The individual in this situation optimally accepts a low wage job. The individual, once employed, will begin to accumulate assets as insurance. At some time, the individual, having not received any higher wage offer from another firm, will have accumulated sufficient assets for it to be optimal to quit into unemployment, financing consumption with the accumulated assets, to take advantage of the higher job offer rate during unemployment.126

As seen from this discussion, the existence of voluntary quits requires a certain parameter configuration. Given data in which voluntary quits arise, this parameter configuration must be an outcome of the estimation. Most DCDP models, like this search model, have the characteristic that model predictions are parameter dependent. This characteristic does not imply that these models do not have rejectable restrictions. Indeed, DCDP models are generally highly restrictive. Recall that the standard finite horizon model was rejectable, not in the conventional way of testing comparative static predictions, but because only a few parameters determined the entire profile of reservation wages. Tests of DCDP models are best thought of as tests arising from cross-equation restrictions. Models like the one estimated in Rendon also have such cross-equation restrictions, but they do not have easily derived analytical forms. However, to the extent that those cross-equation restrictions are seriously violated, the model will not be able to fit the data very well. Tests of model fit are (imperfect) tests of the model’s implicit cross-equation restrictions.

In addition to the discrete state variable, job tenure, the search problem in Rendon has two continuous state variables, assets and the accepted wage. The latter is a state variable because the accepted wage is permanent (and wage growth is deterministic), which implies that the reservation wage for accepting an offer from another employer depends on the wage at the current employer. Rendon solves the DP model by discretizing assets and wages, a method we discussed above. However, given the fine discretization he used, it was not tractable to solve and estimate the model over his postulated 40.5-year working life on a quarterly basis, that is, for 162 quarters. To make it tractable, Rendon solved the model on a quarterly basis for the first 12.5 years, then on an annual basis for the next 8 years and finally on a biennial basis (one period every two years) for the next 20 years.127
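The saving from coarsening the time grid can be checked with simple arithmetic. The sketch below reads the final segment as one decision period every two years, which is an interpretation rather than something stated explicitly; the variable names are illustrative.

```python
# (length of segment in years, decision periods per year)
segments = [(12.5, 4), (8, 1), (20, 0.5)]

total_years = sum(years for years, _ in segments)

# Number of periods in which the DP problem must be solved
# under the mixed quarterly / annual / two-year grid.
periods_coarse = sum(int(years * per_year) for years, per_year in segments)

# Number of periods if the whole working life were quarterly.
periods_quarterly = int(total_years * 4)
```

Under this reading, the backward recursion runs over 68 periods rather than 162, while keeping the fine quarterly grid for the early years covered by the data.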

Rendon’s model is extremely parsimonious. It contains only 12 parameters that must account for all of the labor force transitions, wage observations and asset observations of a sample of white male high school graduates over the first 40 quarters subsequent to their graduation.128 The model is estimated by maximum likelihood. As is conventional in the DCDP literature, Rendon computes chi-square statistics for the match between the actual data and data generated by the model estimates for a wide range of statistics. The quality of the fit of the model is mixed.

The estimated model is used to perform a number of counterfactual exercises. In particular, Rendon considers the impact on labor market outcomes of relaxing the extent to which borrowing constraints are binding. In the quantitative experiment he performs, he finds that allowing agents to borrow up to 50 percent of the natural borrowing limit, as opposed to the estimated 10 percent in the baseline, would increase the duration of unemployment in the first period after graduating from high school by 12.5 percent and increase the accepted wage on that first job by one-third. Given a greater ability to borrow to finance unemployment spells, agents will hold fewer assets throughout their life cycle; in the experiment, asset holdings would be one-third less 10 years after graduation. Thus, Rendon finds that borrowing constraints importantly affect employment outcomes and asset accumulation.

Paserman (2008): The standard search model is based on the conventional assumption that agents use exponential discounting in weighing the current net cost of search and the future wage payoff from continuing to search. Paserman (2008), following a growing literature in which agents are assumed to have time-inconsistent preferences, specifies and estimates a search model with (quasi) hyperbolic discounting. In addition to allowing for present-biased preferences, Paserman extends the search model to include (i) a decision about search intensity, in essence, a choice about the per-period probability of receiving a job offer, (ii) an exogenous probability of layoff once employed and (iii) the receipt of unemployment benefits for a fixed period of time. The agent solves a finite horizon problem until the point at which unemployment benefits are exhausted and an infinite horizon problem from that point forward. Thus, the reservation wage and search intensity are constant after exhaustion, but are duration dependent during the period when the agent is still eligible for unemployment benefits.

To see the role of hyperbolic discounting, consider the simple discrete time finite horizon search model, abstracting from the additional extensions introduced by Paserman. With hyperbolic discounting, the value functions are:


image     (175)


In (175), as before, the terms are the value of searching in period $t$, the value of accepting a wage $w$, unemployment income net of the cost of search in period $t$, the offer probability, the “long-run” discount factor $\delta$ and the “short-run” discount factor $\beta$. Note that the value of accepting a wage at $t+2$ after searching in period $t+1$, as viewed at $t$, is discounted by $\delta$, that is, exponentially. Thus, it is as if the agent has two selves. The agent who is making a decision in the current period, the current self, is impatient, discounting the expected future payoff to search by $\beta\delta$. However, the future self, the self who will receive the benefit of the search and controls future decisions, discounts exponentially. In formulating (175), it was assumed that the current self is aware that when the next period is reached, the current self at that time will be impatient, in which case the agent is deemed sophisticated. This is in contrast to a naive agent, who instead would assume that in the next period the current self would no longer be impatient.

The reservation wage is, as before, the wage that equates the value of search and the value of employment, namely image. With a little algebra, we can write (175), analogous to (170), as


image     (176)


Obviously, the reservation wage equation with hyperbolic discounting is the same as that with exponential discounting if the short-run discount factor equals one. Further, given its recursive structure, it is also clear that the reservation wage path is lower in every period the greater the degree of impatience, that is, the smaller the short-run discount factor. Thus, the effect of present-bias in agent preferences is to make job acceptance occur sooner, leading to shorter durations of unemployment and lower accepted wages. The future self, however, would have preferred that the current self be more patient.
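The comparison between exponential and hyperbolic discounting can be illustrated numerically. The following is a minimal sketch of a sophisticated agent in a simple finite horizon search problem, not Paserman's model and not a reproduction of (175) or (176). It assumes notation and values that are purely illustrative: `beta` is the short-run and `delta` the long-run discount factor, `b` is unemployment income, `q` is the offer probability, an accepted job lasts forever at a constant wage, offers come from an equally weighted wage grid, and after `T` periods of search the agent receives `b` forever.

```python
import numpy as np


def reservation_wages(beta, delta=0.95, b=0.6, q=0.5, T=20):
    """Reservation wage path for a sophisticated quasi-hyperbolic searcher.

    All parameter values and the wage grid are hypothetical.
    """
    wages = np.linspace(0.5, 2.0, 151)             # assumed discrete wage offers
    probs = np.full(wages.size, 1.0 / wages.size)  # equally likely offers
    w_star = np.empty(T)
    # Long-run (delta-discounted) value of being unemployed once the
    # search horizon has ended: unemployment income b forever.
    cont = b / (1.0 - delta)
    for t in range(T - 1, -1, -1):
        # The current self is indifferent between accepting and searching when
        #   w + beta*delta*w/(1-delta) = b + beta*delta*cont
        w_star[t] = ((b + beta * delta * cont) * (1.0 - delta)
                     / (1.0 - delta + beta * delta))
        # Long-run value of entering period t unemployed, given that the
        # period-t self will apply the threshold w_star[t].
        reject = b + delta * cont
        offer_value = np.where(wages >= w_star[t], wages / (1.0 - delta), reject)
        cont = q * (probs @ offer_value) + (1.0 - q) * reject
    return w_star


exponential = reservation_wages(beta=1.0)
hyperbolic = reservation_wages(beta=0.5)
```

In this sketch the two paths coincide only in the final search period, where the reservation wage equals `b`; in every earlier period the present-biased agent accepts lower wages. Future selves are valued with `delta` alone, while their acceptance thresholds are computed with `beta`, which is what makes the agent sophisticated rather than naive.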

As noted, Paserman’s model is somewhat more complicated. In particular, he allows for a choice of search intensity, which affects the probability of receiving an offer and which is costly. In this setting, an agent has two instruments to minimize the current cost of search, the choice of search intensity and the choice of the acceptance wage. DellaVigna and Paserman (2005) show that with hyperbolic discounting (in an infinite horizon setting) agents will choose a lower search intensity and a lower reservation wage. Because a lower search intensity leads to a lower offer probability, and thus to longer spells, while a lower reservation wage leads to shorter spells, as in the case of the standard search model, the net effect on expected spell duration is, in general, ambiguous. However, DellaVigna and Paserman show that, as in the standard model, log concavity of the wage offer distribution is sufficient for the expected duration of unemployment to fall.

As in the standard finite horizon model, we can use (176) to consider identification. In the standard model, we noted that image and image could be separately identified with data on at least three periods that include accepted wages. However, identification is no longer possible with the addition of hyperbolic discounting; one cannot separately identify the structural parameters image and image from image and image. But, Paserman also has data on UI benefits. In his model, image is a composite of the level of UI benefits, image, and the value of search time, image.129 Thus the first term in (176) is image. Clearly, cross-section variation in image is sufficient to identify image, which implies that image and image are also identified.130

The model is estimated by maximum likelihood, with an extended version of the likelihood function (174) to account for unobserved heterogeneity and layoffs.131 The estimation method, as for all DCDP models, requires iterating between the solution of the DP problem and calculation of the likelihood function. As is standard in the DCDP literature, Paserman presents evidence on goodness-of-fit to evaluate the performance of the model.

The long-run discount factor, when not estimated to be at the boundary, could not be distinguished from one. The short-run discount factor was estimated to be 0.40 for a low-skilled sample, 0.48 for a medium-skilled sample and 0.89 for a high-skilled sample. The p-values for likelihood ratio tests of whether the short-run discount factor was equal to one were less than 0.01 for the first two samples and 0.08 for the third.

Paserman uses the estimates of the model to assess the impact of policy interventions on unemployment search outcomes and on welfare. Measuring welfare in a hyperbolic discounting model is, however, somewhat problematic as there are, in essence, two agents (selves). Paserman adopts as the welfare measure the exponentially discounted utility of the long-run self under the strategy chosen by the hyperbolic self. In an exponential discounting setting, because there is a single agent making optimal choices subject to constraints, any additional constraints must always reduce welfare. With hyperbolic discounting, this is not necessarily the case. Using this welfare measure, Paserman addresses the question of whether it is possible to design a policy that not only improves welfare, but also reduces unemployment duration and lowers government outlays.

Paserman finds that by imposing a fine, equal to the amount of unemployment benefits, on unemployed agents who do not meet a search effort threshold, it is possible to achieve all of these goals. Indeed, Paserman shows (numerically) that there exists a threshold level of search intensity at which unemployment durations fall and for which the increase in agent welfare and the savings in government outlays are maximized. This experiment implies that a program in which the search intensity of unemployed workers is monitored not only may reduce the cost of the UI system (subject to the cost of monitoring), but may also improve the welfare of those who are unemployed.

Ferrall (1997): As we have stressed, DCDP models have been used extensively for policy evaluation. In the present context, for example, most empirical applications of the DCDP approach to job search provide a quantitative assessment of the impact of changes in the UI system, such as altering benefits. Many of those models capture some, but not all, of the features of the UI system, often in a somewhat stylized fashion. It is reasonable to suppose that the closer a model mimics UI program rules, the better the model will be in evaluating policy changes. Ferrall (1997) structurally estimates a DCDP model of job search, which integrates all of the major features of the Canadian UI system.132

Ferrall studies the transition from school to work. In Canada, as in the US, the period of search for one’s first job after leaving school is usually not covered by the UI system. Although that spell of search unemployment is not insured, there is still the potential for the UI system, given its structure, to affect search behavior. To understand why, consider the structure of the UI system in Canada relevant during the time period studied by Ferrall. In that system:

1. An unemployed worker who is eligible for insurance must wait 2 weeks after becoming unemployed before collecting benefits.133
2. The benefit level depends on the previous wage through a fixed replacement rate (0.60 at the time). The insurable weekly wage is bounded from below by $106 and from above by $530. Thus benefits are $0 if the wage on the previously held job was less than $106, 0.6 times the wage if the wage is between the bounds and $318 for wages at or above $530.
3. To be eligible for UI benefits, a person must have worked on insurable jobs a certain number of weeks in the 52 weeks prior to becoming unemployed. The number of weeks depends on the regional unemployment rate and the person’s previous employment and UI history.
4. The number of weeks of benefits depends on the number of weeks worked on the previous job and on whether the individual is eligible for extended benefits, but is capped at 52.
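The benefit schedule in item 2 is a simple piecewise function of the previous wage. A minimal sketch follows; the function and argument names are illustrative, and the sketch covers only the weekly benefit level, not the waiting period, eligibility rules or benefit duration in items 1, 3 and 4.

```python
def weekly_ui_benefit(previous_wage, replacement_rate=0.60,
                      min_insurable=106.0, max_insurable=530.0):
    """Weekly UI benefit implied by the schedule in item 2.

    Wages below the lower bound earn no benefit; wages above the upper
    bound are treated as if they equaled the upper bound.
    """
    if previous_wage < min_insurable:
        return 0.0
    return replacement_rate * min(previous_wage, max_insurable)
```

Because the benefit is flat above the maximum insurable wage, a higher accepted wage raises future benefits only for workers whose wage would otherwise fall below that bound.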

There are two elements of the UI system that would affect search during an uninsured spell. First, because benefits are paid during insured spells, there is an incentive for individuals in an uninsured spell to become employed to be eligible for benefits during future unemployment spells. Thus the UI system reduces the reservation wage in an uninsured spell; further, the reservation wage will be lower the higher are benefits (Mortensen, 1976). On the other hand, because the level of benefits increases with the wage on the prior job, there will be an incentive for someone in an uninsured spell to wait for a higher wage offer, that is, to have a higher reservation wage. This incentive, however, only applies to individuals whose reservation wage would otherwise be below the maximum insurable wage, although the standard search model would no longer apply. Moreover, the magnitude of these incentive effects depends on the probability that an individual will be laid off from future jobs.

The model estimated by Ferrall, aside from the explicit incorporation of UI rules, differs from the standard single spell search model in a number of ways. The model allows for a search period during school, an initial uninsured spell after leaving school, the first job spell and a subsequent insured unemployment spell if a layoff occurs. The individual maximizes the expected present value of the log of consumption, where consumption equals the wage while working and, while unemployed, the sum of UI benefits net of the cost of search plus the value of home production. In each period of unemployment, the individual receives a job offer with some positive probability. However, a job offer comes not only with a wage offer, but also with a layoff rate. The wage offer distribution is Pareto and the layoff rate can take on a fixed number of values, randomly drawn.134 Individuals differ, according to their unobserved type, in their market skill level and in their value of home production.

The solution method is by backwards recursion, where the value function for the infinite horizon search problem when UI benefits are exhausted after a layoff occurs serves as the terminal value function for the insured unemployment spell at the time of benefit exhaustion. The model is solved backwards from there as a finite horizon problem until the beginning of the search period while in school. The estimation is by maximum likelihood. Ferrall provides evidence of model fit.

Ferrall performs a number of counterfactual experiments that vary the parameters of the UI system. The most extreme is the elimination of the UI system, an out-of-sample extrapolation that is only possible within the structural framework. The resulting impact on unemployment durations depends on geographic location and education.135 Recall from the earlier discussion that there was no unambiguous prediction of how reservation wages would be affected by such an experiment. Ferrall finds that for those with at most a high school education residing outside of the Atlantic region, reservation wages rise; the expected duration of unemployment after leaving school (including those who have no unemployment spell) is estimated to increase by about 50 percent. Similarly, for those with some college residing outside of the Atlantic region, the increase is 40 percent. However, there is almost no effect on the expected duration for those residing in the Atlantic region regardless of education.

4.3 Dynamic models of schooling and occupational choices

This section describes the use of DCDP models in labor economics to study schooling and occupational choice and to analyze the effects of policy interventions aimed at increasing skill investment, such as tuition and school attendance subsidies and student loan programs. We begin with a brief discussion of the foundational schooling and occupational choice models from the early literature, which tended to be either static models or life-cycle models without uncertainty. These first generation models were influential in shaping the questions addressed and models developed in the later DCDP literature. We then describe contributions to the more modern DCDP literature.

4.3.1 Foundational literature

One of the earliest discussions of the determinants of schooling and occupational choices is given by Walsh (1935), which describes a model in which individuals invest in education until the return on education equals the return on other possible investments.136 The paper also examines the empirical support for the model using data from a variety of sources.137 Walsh (1935) calculates the returns associated with different levels of schooling and with a subset of occupations (doctor, lawyer, engineer), adjusting for costs (tuition, room and board) and foregone earnings. He raises the potential problem of ability bias in comparing lifetime earnings streams of different education levels and different professions. After finding that the wage returns to being a college graduate and to being a lawyer greatly exceed costs, whereas the returns to receiving an M.A. and Ph.D. are lower than the cost, he attributes the difference to nonpecuniary benefits associated with working in academia.

Roy (1951), in a seminal paper, provides the modern framework for modeling occupational choice as an earnings maximization problem that he then uses to analyze the implications of self-selection into occupations for earnings distributions. The Roy model assumes that individuals are endowed with two different skills, drawn from a joint lognormal distribution. Each skill is productive in only one of two occupations, denoted by Roy as hunting and fishing. Skill is measured in units of output produced. Thus, an individual’s earnings in an occupation are the product of the price of a unit of occupation-specific output times the amount of skill (output production) embodied in the person; individuals choose to work in the occupation that maximizes their earnings. Roy (1951) did not apply the model to data, but showed that the structural parameters of the underlying model, the means, variances and covariances of the joint skill distribution, can be recovered from earnings data, even though earnings in a particular occupation are only observed for people who chose that occupation. The identification of these structural parameters derives from the theoretical formulation of the determination of earnings and from the distributional assumption.138
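The selection mechanism in the Roy model is easy to see in a small simulation. The sketch below is illustrative only: the means, variances, correlation and output prices are hypothetical values, and the function name is not from any source. It draws two log skills from a joint normal distribution, assigns each person to the occupation with the higher earnings, and compares mean log earnings among those who chose an occupation with the mean over the whole population.

```python
import numpy as np


def simulate_roy(n=200_000, mean=(0.0, 0.0), var=(1.0, 1.0), corr=0.3,
                 prices=(1.0, 1.0), seed=0):
    """Simulate occupational choice in a two-occupation Roy model.

    All parameter values are hypothetical.
    """
    rng = np.random.default_rng(seed)
    cov_12 = corr * np.sqrt(var[0] * var[1])
    cov = np.array([[var[0], cov_12], [cov_12, var[1]]])
    log_skill = rng.multivariate_normal(mean, cov, size=n)
    # Potential log earnings = log output price + log skill, by occupation.
    log_earn = np.log(prices) + log_skill
    # Each person works where earnings are highest (0 = hunting, 1 = fishing).
    choice = np.argmax(log_earn, axis=1)
    return log_earn, choice


log_earn, choice = simulate_roy()
for k, name in enumerate(("hunting", "fishing")):
    population_mean = log_earn[:, k].mean()
    observed_mean = log_earn[choice == k, k].mean()
    print(f"{name}: population {population_mean:.3f}, "
          f"observed among choosers {observed_mean:.3f}")
```

With these particular values, observed earnings in each occupation overstate what a randomly chosen person would earn there, which is why the structural parameters cannot be read directly off the observed earnings distributions.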

The literature started by Roy (1951) emphasized the importance of self-selection, skill heterogeneity and latent skills in understanding occupational choices and earnings. However, in Roy’s formulation, skills are treated as endowments. Another branch of the literature, associated with Mincer (1958), Becker (1964, 1967) and Ben-Porath (1967), evolved with the aim of understanding the human capital investment (or skill acquisition) decision and the implications for lifetime earnings of acquiring skills through schooling and job training investments. Mincer (1958) proposes a lifetime earnings model where the only cost of schooling is foregone earnings. In his model, all individuals are assumed to be ex ante identical, which implies a compensating earnings differential for individuals who spend more time in school. In equilibrium, everyone is indifferent between alternative schooling levels, because (discounted) lifetime earnings are the same, but there is an earnings premium to each additional year of schooling at every post-schooling age. By equating lifetime earnings for individuals with different levels of schooling, Mincer (1958) derives a log earnings equation that is linear in years of schooling. Mincer (1974) augments the schooling model with a model of on-the-job investment that leads to a log earnings function that is linear in schooling and quadratic in work experience. That equation has come to be known as the Mincer earnings function, which has had widespread application in empirical work.
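The step from equal lifetime earnings to a log-linear earnings equation can be sketched as follows. The simplifying assumptions are ours, for illustration: continuous discounting at rate $r$, earnings $Y(s)$ that are constant over the working life of a person with $s$ years of schooling, and a working life of length $n$ that does not depend on $s$.

```latex
\begin{align*}
V(s) &= Y(s)\int_{s}^{s+n} e^{-rt}\,dt
      = Y(s)\,e^{-rs}\,\frac{1-e^{-rn}}{r},\\
V(s) = V(0) &\;\Longrightarrow\; Y(s) = Y(0)\,e^{rs},\\
\ln Y(s) &= \ln Y(0) + r\,s .
\end{align*}
```

Under these assumptions the coefficient on schooling is the discount rate. If the retirement age rather than the length of working life is held fixed, an additional term appears that is small when the working life is long.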

Mincer’s schooling model is silent about which individuals invest in schooling. Becker (1967), in his Woytinsky lecture, specifies a human capital production function in which the marginal return from investing in human capital declines with investment due to an individual’s limited capacity. The marginal cost of investing in human capital, the cost of financing additional human capital investment, depends on access to funding (parental and governmental subsidies and loans to education) and is increasing in the level of investment as cheaper sources of capital are used first. The equilibrium level of human capital investment equates the marginal return to marginal cost (at the intersection of the downward sloping demand curve and the upward sloping supply curve). Becker’s model implies that the level of human capital investment will in general differ across people, because of differences they face in either supply or demand conditions. For example, a higher level of innate ability implies a higher demand curve, because higher ability is assumed to make human capital investments more productive.

Rosen (1977) translates this framework into a schooling choice model. The log of earnings is assumed to be a function of schooling and ability. Following Becker, schooling (time spent investing in human capital) increases the stock of human capital and thus earnings, but at a decreasing rate. The marginal return is the derivative of the log earnings function with respect to schooling. The marginal cost of schooling, the interest rate at which an individual can borrow, depends on family background. The optimal level of schooling is found by equating the marginal return to the marginal cost.

These previous authors model the human capital investment decision as a one-time decision. Ben-Porath (1967) extends the optimal human capital investment decision to a life cycle setting. The Ben-Porath (1967) model assumes that individuals choose a human capital investment profile to maximize discounted lifetime earnings. Human capital is produced at any age through the application of time (a fraction of an individual’s human capital stock) and purchased inputs, conditional on an individual’s ability and existing stock of human capital. The fraction of the human capital stock not used to produce additional human capital is used to produce earnings. Similar to Roy (1951), an individual’s earnings at any age are the product of a market determined price of a unit of human capital and the individual’s stock of human capital not used in investment at that age. Schooling, in this framework, is viewed as a period of full-time investment (no earnings) and on-the-job training as a time of partial investment. Given a finite lifetime, the optimal human capital investment profile, the fraction of time spent investing, declines with age. Thus, any period of full-time investment, that is, schooling, would come first.

Willis and Rosen (1979) empirically implement a model of schooling choice that combines the essential features of this early theoretical literature.139 The paper develops a two sector model where individuals decide whether or not to attend college, basing their decision on expected lifetime earnings with and without college, on financing capacities that differ by family background and on nonpecuniary benefits of education. The model incorporates two unobservable abilities, associated with high school and college level skills. Willis and Rosen (1979) find that the decision to attend college is strongly influenced by expected lifetime earnings gains and that family background is an important determinant of college-going decisions. In addition, they find comparative advantage to be an important feature of the labor market; that is, high school graduates have better prospects as high school graduates than would an average college graduate, and college graduates have better prospects as college graduates than would an average high school graduate.

4.3.2 DCDP models

The DCDP literature extends this earlier work on schooling and occupational choice to a dynamic setting, in which individuals face a sequential decision problem with uncertainty. It incorporates features from the earlier literature, allowing for worker heterogeneity, multiple skill types, latent skills, self-selection and comparative advantage. The literature can be broadly categorized into partial equilibrium approaches, which take skill prices as given, and market equilibrium approaches, where there is an explicit link between the prices paid to skill in the economy and aggregate skill quantities. Here, we first describe partial equilibrium models of schooling and occupational choices and then the more limited set of market equilibrium models. Subsequently, we consider DCDP models that have been developed for particular contexts, for example, to analyze the decision about college major or the decision to enter and exit the teaching profession.

Partial equilibrium models of schooling and occupational choice

Gotz and McCall (1984), one of the pioneering papers in the DCDP literature, as noted previously, develops an occupational choice model for the purpose of studying the retention decision of Air Force officers, in particular, how retention responds to compensation policy and to the retirement system. In the model, officers make a binary choice at each age about whether to stay or leave the Air Force so as to maximize the expected present value of pecuniary and nonpecuniary returns. There is a single taste shock that is realized each period and that affects the value of the military option. An officer who leaves joins the civilian labor force and earns a civilian wage. In addition to considering compensation and pension benefits, the model also explicitly accounts for the effects of the chance of promotion on the expected value of staying in the military. The probabilities of promotion and military and civilian earnings are treated as exogenous.140 The model also allows for persistent differences among individuals in their preference for military service (permanent unobserved heterogeneity).

The model parameters are estimated by maximum likelihood using data on officer employment histories from 1973 to 1977 as well as data from the Current Population Survey used to construct estimates of civilian earnings. As a way of validating the model, the estimated model is used to forecast retention rates for data not used in the estimation, which shows that the estimated model produces good out-of-sample forecasts. The fit of the dynamic retention model is also compared to that of two competing models, one that does not allow for unobserved permanent preference heterogeneity and a lifetime model without per-period shocks, where individuals know with certainty the year they will leave the military. The dynamic model that allows both for permanent unobserved heterogeneity and for per-period shocks provides the best fit to the data. The dynamic model is then used to assess the effects of a number of policy interventions of interest to the military, including (i) an increase in pay and allowances, (ii) the introduction of a bonus based on years of service completed, (iii) a decline in the value of the military retirement annuity, (iv) an increase in flight pay and (v) indexing pay to the CPI.141

Miller (1984), another pioneering paper in the DCDP literature, develops and estimates a matching model of occupational choice. The model assumes that the pay-off to a particular job within an occupation depends on a match-specific component and a random component. Individuals do not know the match-specific component prior to starting the job, but they observe their output. Beliefs about the quality of the match change with experience on the job through a Bayesian updating procedure. Jobs for which the expected return streams are identical are defined as being the same occupation. The model has implications for which jobs should be sampled first and for how long. For example, the notion of equalizing differences would imply that jobs with high informational content pay less on average in equilibrium and attract relatively inexperienced workers who quickly discover their personal match quality and leave in the event of a bad match. These types of jobs would include a large number of inexperienced workers in the process of learning about their match plus a small number of experienced, permanent workers. Jobs with lower informational benefits should have less turnover, pay more for new entrants and have a less variable wage distribution.
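The belief-updating step can be illustrated with the textbook normal-normal case. This is a sketch under assumptions chosen for illustration, not a description of Miller's exact specification: observed output is the unknown match quality plus normally distributed noise with known variance, and the prior belief about match quality is normal. The function name and numbers are hypothetical.

```python
def update_match_belief(prior_mean, prior_var, output, noise_var):
    """One step of normal-normal Bayesian updating of a match-quality belief."""
    # Weight placed on the new output signal relative to the prior.
    weight = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + weight * (output - prior_mean)
    post_var = prior_var * noise_var / (prior_var + noise_var)
    return post_mean, post_var


# Beliefs after three periods of observed output on the same job.
mean, var = 0.0, 1.0
for output in (2.0, 1.0, 1.5):
    mean, var = update_match_belief(mean, var, output, noise_var=1.0)
```

The posterior variance falls with every observation, so the information gained from an additional period on the job shrinks with tenure. In this setup, a job with a more diffuse prior or less noisy output has higher informational content.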

The dynamics in the model arise from the learning process, because remaining in a job provides not only an immediate payoff but also information about the future payoff in that job. The parameters of Miller’s (1984) job-matching model are estimated by maximum likelihood using data on job tenure and job changes, where the discrete time hazards of remaining in a given job or switching to a new job are derived from solving the dynamic programming model.142 The hazard function is assumed to also depend on the initial observed demographic characteristics of the individual. To capture unobservable heterogeneity, the hazard model depends on two unobserved states, following the approach of Heckman and Singer (1984). The Coleman-Rossi data set, which surveyed a sample of men about their entire work history, education and family background, is used to estimate the model. The empirical evidence supports the prediction that young inexperienced individuals receive low wages in part because they seek out different kinds of occupations, those with greater informational content, than do older individuals.

The preceding papers focused only on the occupational choice decision. The first DCDP model to combine schooling, working and occupational choices in a single framework is Keane and Wolpin (1997). To illustrate concretely the specification of a DCDP model of human capital accumulation, we next describe the structure of the Keane and Wolpin (1997) model in detail. As further discussed below, a number of papers in the recent DCDP literature extend the Keane and Wolpin (1997) modeling framework to incorporate additional features.

In the baseline model presented in Keane and Wolpin (1997), individuals make repeated choices over time, starting at age 16 and ending at age 65, about whether to participate in one of five different sectors of the economy: (i) work in a white-collar occupation, (ii) work in a blue-collar occupation, (iii) work in the military, (iv) attend school, or (v) engage in home production. There is a finite horizon during which individuals accumulate schooling and occupation-specific experience that affect future wage earning opportunities.

Denote the five choice alternatives in each time period by image where image. The first three alternatives image are the work alternatives, the 4th is the schooling alternative and the last is the home alternative. Let image if alternative image is chosen in time period image. image represents the reward (contemporaneous utility) from choosing alternative image, which captures all benefits and costs associated with that alternative.

The reward in a work sector is the wage, which is the product of the price paid per unit skill times the amount of skill accumulated in that occupational sector. Let image denote the rental price paid to skill in occupational sector image and image the occupation-specific skill units.


image     (177)


The technology for skill production depends on the number of years of schooling, image, and on occupation-specific work experience, image. The production function takes the form:


image     (178)


where image represents the endowment of skill at age 16. The log wage equation is:


image     (179)


The wage equation has the Mincer form of being linear in years of education and quadratic in experience but has the Ben-Porath (1967) and Griliches (1977) pricing equation interpretation.
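As a rough numerical illustration of this wage structure, the sketch below computes skill as the exponential of an index that is linear in schooling and quadratic in experience, and the wage as the rental price times skill. The function names, the sign convention on the quadratic term and the coefficient values are our own assumptions for illustration, not estimates from Keane and Wolpin (1997).

```python
import math


def skill(endowment, schooling, experience,
          b_school, b_exp, b_exp_sq, shock=0.0):
    """Occupation-specific skill units (hypothetical parameterization).

    The index is linear in years of schooling and quadratic in
    occupation-specific experience; b_exp_sq > 0 gives a concave profile.
    """
    index = (endowment + b_school * schooling
             + b_exp * experience - b_exp_sq * experience ** 2 + shock)
    return math.exp(index)


def log_wage(rental_price, endowment, schooling, experience,
             b_school, b_exp, b_exp_sq, shock=0.0):
    """Log wage = log rental price + log skill (the pricing-equation view)."""
    s = skill(endowment, schooling, experience,
              b_school, b_exp, b_exp_sq, shock)
    return math.log(rental_price) + math.log(s)


if __name__ == "__main__":
    # Illustrative values only: 12 years of schooling, 5 years of experience.
    lw = log_wage(10.0, 0.0, 12, 5, 0.1, 0.05, 0.001)
    print(round(lw, 4), round(math.exp(lw), 2))
```

Written this way, the constant of the Mincer regression is the log rental price plus the age-16 endowment, so the two are not separately identified from wage data alone without a normalization.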

If a person goes to school, the per period reward is:


image     (180)


where image and image are tuition costs, image is the endowed skill at age 16 and image is a random shock component. The home alternative has the associated nonpecuniary reward:


image     (181)


where image is the skill endowment and image the random shock component.

The initial conditions in the model are the highest grade completed at age 16 (image) along with the unobserved skill endowments in the different sectors. It is assumed that accumulated experience is zero for all alternatives in the first period. The shock components are assumed to be joint normally distributed and serially independent, conditional on the unobserved endowments:


image     (182)


The state vector at any image is described by


image     (183)


where


image     (184)


is the vector of age-16 endowments,


image     (185)


is the vector of work experience and schooling accumulated in the different sectors and


image     (186)


is the vector of shocks.

The value function at age image is the maximized value of the expected remaining lifetime utility, taken over all possible sequences of future choices, with respect to the choice at image,


image     (187)


The problem can be written recursively in Bellman equation form. For image, the alternative specific value function is


image     (188)


where the expectation is taken over the random shock components. In the last time period, image,


image     (189)


The value function is the maximum over the alternative specific value functions:


image     (190)


The state variables that evolve over time are the accumulated sector-specific experience and the years of completed schooling:

image     (191)

image     (192)
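The recursion and the laws of motion can be sketched for a deliberately small version of the problem. The code below is a toy with three alternatives (work, school, home) rather than five, a five-period horizon, hypothetical reward functions and Monte Carlo integration over standard normal shocks. Because the state space is tiny, the expected maximum is computed at every state point rather than approximated by interpolation.

```python
import numpy as np


def rewards(g, x, eps):
    """Per-period rewards for (work, school, home); all parameters hypothetical.

    g: years of schooling accumulated in the model, x: work experience,
    eps: array of shock draws with shape (n_draws, 3).
    """
    wage = np.exp(1.0 + 0.08 * g + 0.05 * x - 0.002 * x ** 2 + 0.3 * eps[:, 0])
    school = 1.0 - 0.5 * (g >= 2) + 0.5 * eps[:, 1]   # extra cost after 2 years
    home = 1.5 + 0.5 * eps[:, 2]
    return np.column_stack([wage, school, home])


def solve_emax(T=5, delta=0.95, n_draws=500, seed=0):
    """Backward induction; returns {(t, g, x): E max_k V_k(state, t)}."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n_draws, 3))   # same draws reused at every state
    emax = {}
    for t in range(T - 1, -1, -1):
        for g in range(t + 1):
            for x in range(t - g + 1):
                cont = np.zeros(3)            # terminal period: no continuation
                if t < T - 1:
                    # Laws of motion: work adds experience, school adds schooling,
                    # home leaves the state unchanged.
                    cont = delta * np.array([emax[t + 1, g, x + 1],
                                             emax[t + 1, g + 1, x],
                                             emax[t + 1, g, x]])
                values = rewards(g, x, eps) + cont   # alternative-specific values
                emax[t, g, x] = values.max(axis=1).mean()
    return emax


if __name__ == "__main__":
    emax = solve_emax()
    print(len(emax), round(emax[0, 0, 0], 3))
```

In the full model the number of state points grows quickly with the horizon and the number of sectors, which is why the expected maximum is approximated rather than computed at every point.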

The observed data are the sector choices that people make and their observed wages (for the sectors with pecuniary rewards) starting from age 16 and ending at age image (at most age 27 in the data):

image     (193)

image

It is assumed that individuals observe contemporaneous shocks image, but that the researcher does not. The observed state space (exclusive of the shocks) is


image     (194)


The likelihood is


image     (195)


where image denotes the vector of choices and wages at age a. The estimation proceeds by: (i) choosing an initial set of parameters, (ii) solving the dynamic programming problem numerically (by approximating the image functions as previously described), (iii) computing the likelihood, and (iv) iterating to maximize the likelihood until convergence.
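The four estimation steps can be sketched on a toy problem. To keep the example short we depart from the model in the text in several labeled ways: there are only two alternatives (home and work), the shocks are assumed to be type-I extreme value so that the expected maximum and the choice probabilities have closed forms, only one parameter (the value of home, `theta`) is estimated, and the iteration in step (iv) is replaced by a grid search. All names and values are hypothetical.

```python
import numpy as np

EULER = 0.5772156649  # mean of a standard type-I extreme value shock


def solve(theta, T=4, alpha=0.0, beta=0.3, delta=0.9):
    """Step (ii): backward induction. v[t, x] = (home, work) values net of shocks."""
    v = np.zeros((T, T, 2))
    emax_next = np.zeros(T + 1)               # zero continuation after period T-1
    for t in range(T - 1, -1, -1):
        emax = np.zeros(T + 1)
        for x in range(t + 1):                # experience cannot exceed t
            v[t, x, 0] = theta + delta * emax_next[x]
            v[t, x, 1] = alpha + beta * x + delta * emax_next[x + 1]
            emax[x] = EULER + np.logaddexp(v[t, x, 0], v[t, x, 1])
        emax_next = emax
    return v


def prob_work(v, t, x):
    return 1.0 / (1.0 + np.exp(v[t, x, 0] - v[t, x, 1]))


def simulate(theta, n, rng, T=4):
    """Simulate work histories (1 = work) from the model at a given theta."""
    v = solve(theta, T)
    data = np.zeros((n, T), dtype=int)
    for i in range(n):
        x = 0
        for t in range(T):
            data[i, t] = int(rng.random() < prob_work(v, t, x))
            x += data[i, t]
    return data


def log_likelihood(theta, data):
    """Step (iii): log likelihood of the observed choice sequences."""
    v = solve(theta, T=data.shape[1])
    ll = 0.0
    for row in data:
        x = 0
        for t, d in enumerate(row):
            p = prob_work(v, t, x)
            ll += np.log(p if d == 1 else 1.0 - p)
            x += d
    return ll


def estimate(data, grid):
    """Steps (i) and (iv), with a grid search standing in for an optimizer."""
    lls = [log_likelihood(theta, data) for theta in grid]
    return float(grid[int(np.argmax(lls))])


if __name__ == "__main__":
    data = simulate(0.5, 1000, np.random.default_rng(0))
    print(estimate(data, np.linspace(-1.0, 2.0, 31)))
```

The key feature shared with the full problem is that the dynamic program is re-solved at every trial parameter value, so the cost of estimation is driven by the cost of the solution step.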

The baseline model that Keane and Wolpin (1997) estimate also includes unobservable heterogeneity. Specifically, there are assumed to be 4 types of individuals with heterogeneous age 16 endowments, denoted by


image     (196)


The type of the individual is assumed to be known to individuals but unknown to the researcher. Unobservable heterogeneity introduces the potential for comparative advantage into the model in that some individuals persistently get higher rewards in certain sectors, but perhaps not others. Unobserved permanent endowment differences are necessary to fit the high degree of persistence in choices observed in the data.

In the model, the only observable initial condition that varies is schooling attained at age 16, image.143 If the shocks were serially correlated, then it would be problematic to condition the analysis on image, because image likely reflects prior schooling decisions that would be affected by earlier shocks. If the shocks are iid, however, then conditioning the analysis on image is not problematic. The maintained assumption is that the initial condition image is exogenous with respect to the shocks conditional on the unobserved type. Accounting for unobservable heterogeneity and for initial conditions, the likelihood is:


image     (197)


The type probability is estimated as a function of the initial schooling.
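The contribution of one individual to a likelihood of this finite-mixture form can be sketched as follows. We assume, purely for illustration, that the type probabilities follow a multinomial logit in initial schooling; the source states only that they depend on initial schooling. The type-conditional log likelihoods are taken as given inputs, and all names are hypothetical.

```python
import numpy as np


def type_probs(g16, gamma):
    """Type probabilities given initial schooling g16.

    gamma has shape (K, 2): an intercept and a slope for each of K types
    (assumed multinomial-logit form, not taken from the source).
    """
    index = gamma[:, 0] + gamma[:, 1] * g16
    weights = np.exp(index - index.max())     # subtract max for stability
    return weights / weights.sum()


def mixture_loglik(cond_loglik, g16, gamma):
    """Log of sum_k pi_k(g16) * L_k, computed with the log-sum-exp trick.

    cond_loglik[k] is the individual's log likelihood of choices and wages
    over all ages, conditional on being type k.
    """
    terms = np.log(type_probs(g16, gamma)) + np.asarray(cond_loglik, dtype=float)
    m = terms.max()
    return float(m + np.log(np.exp(terms - m).sum()))


if __name__ == "__main__":
    gamma = np.array([[0.0, 0.0], [1.0, -0.1], [-2.0, 0.2], [0.5, 0.0]])
    print(type_probs(10, gamma).round(3))
    print(round(mixture_loglik([-40.0, -35.0, -50.0, -38.0], 10, gamma), 3))
```

Working in logs matters in practice because the product of choice and wage densities over many ages underflows quickly when computed in levels.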

Estimation of the model’s parameters is based on eleven years of data on young white males from the NLSY79. The analysis subsample consists of 1373 observations on white males who were age 16 or less as of Oct. 1, 1977 and who are followed through 1988. Each time period in the model corresponds to one year in the data. Wages are measured as full-time equivalent wages, estimated as average weekly wages times 50. Parameter estimates are obtained by simulated maximum likelihood, as previously described.

Keane and Wolpin (1997) evaluate the goodness-of-fit of their baseline model (described previously) and of a few alternative model specifications that differ in their degree of parsimony to learn which features of the model are important to achieving a good fit. Their preferred model augments the baseline model to incorporate skill depreciation during periods of nonwork, job-finding costs, school reentry costs, and nonpecuniary components of work sector alternatives. As a way of validating the model, Keane and Wolpin (1997) also evaluate the fit of the model out-of-sample by predicting the choices of younger birth cohorts (using CPS data) that were not used in estimating the model.

A consistent empirical finding (Willis and Rosen, 1979; Heckman and Sedlacek, 1985) is that comparative advantage plays an important allocative role in the labor market. Workers self-select into occupations and into sectors based on their relative productivities. Keane and Wolpin (1997) find that comparative advantages determined by age 16 lead to large differences in school attainment and later labor market outcomes. Indeed, most of the variation in lifetime utility comes from inequality in skill and preference endowments at age 16, pointing to the importance of early influences in explaining lifetime inequality.

The estimated model is used to predict the effects of a $2000 (1987 dollars) college tuition subsidy on the college going rate. Under the preferred model specification, the subsidy increases the high school graduation rate by 3.5 percentage points and the college graduation rate by 8.5 percentage points. However, the main beneficiaries of the subsidy, in terms of lifetime utility, are individuals who would have gone to college without the subsidy.

A follow-up paper by Keane and Wolpin (2000) uses a similar framework to analyze the sources of black/white differentials in schooling attainment and earnings and to assess the impact of policies intended to close the racial gaps. Race enters the model in a number of ways, as a determinant of preference parameters, unobserved type probabilities, and wages. The paper finds that differences in initial age 16 skill endowments are the primary explanation for low schooling attainments of blacks relative to whites. This finding has important implications for policy. Keane and Wolpin (2000) implement a scheme to equalize the schooling distributions of black and white males through the combined use of a high school graduation bonus and a college graduation bonus. Although this policy, by design, closes the racial schooling gap, it has only a very small effect on the racial earnings gap due to differences in skill endowments at age 16.

An area of research that has received much attention in the nonstructural literature focuses on the effect of credit market constraints on college enrollment. The finding in that literature that tuition effects are inversely related to parental income has often been interpreted as evidence for the existence of borrowing constraints that have adverse consequences for college attendance (see, e.g., Kane, 1999, p. 63). A paper by Keane and Wolpin (2001) studies how borrowing constraints and parental transfers affect educational attainment by estimating a DCDP model of schooling, work and savings decisions of young men, using data from the NLSY79 cohort. The model allows for parents to provide transfers to youths, which the youths take as given and which vary depending on whether the youth chooses to go to college. Like the previous papers, the model incorporates unobserved heterogeneity (endowments at age 16). In the model, schooling and work are not mutually exclusive choices and youths can work full or part time while still attending school (full or part time). Youths may borrow up to a limit. Keane and Wolpin (2001) find that borrowing constraints are tight (financing college tuition through uncollateralized borrowing is not feasible in the model). In addition, consistent with the pattern found in the nonstructural literature, Keane and Wolpin report that a tuition increase generates a pattern of larger percentage declines in enrollment for youth whose parents have lower SES.

On the surface, it would appear that the inference drawn in the nonstructural literature, that borrowing constraints exist and limit college attendance of youths from less affluent families, is validated by the congruence of these two findings. However, when Keane and Wolpin simulate the impact of relaxing the borrowing constraint, by allowing youths to borrow the full tuition cost, they find that there is only a negligible increase in college attendance. Instead, allowing college attendees to borrow up to the full tuition amount leads to a reduction in their propensity to work while attending school and to an increase in their consumption. They therefore conclude that college attendance is not limited to any great extent by borrowing constraints, but rather primarily by age-16 endowments of pre-market skills and/or preferences.144

The finding that borrowing constraints are tight and yet relaxing them does not lead to increased college attendance has been controversial. However, it is consistent with earlier research by Cameron and Heckman (1998, 1999) that estimates a sequential model of school attendance decisions.145 That research finds a strong positive correlation between family income and college attendance, conditional on high school graduation, even after controlling for effects of dynamic selection on unobservables. After controlling for AFQT test score (interpreted as a proxy for the individual’s endowment at age 16), however, liquidity constraints no longer play a strong role in college attendance decisions.

In the previously described papers, log wages are specified as a linear function of the number of years of schooling. Belzil and Hansen (2002) estimate a DCDP model of schooling decisions with a focus on allowing the returns to different levels of schooling to vary. In particular, they model the wage equation as a spline in years of schooling with eight knots. Their model assumes that individuals make sequential decisions as to whether to attend school for up to 22 years, after which they enter the labor market.146 While in school, they receive parental transfers according to a parental transfer function that depends on accumulated years of schooling. After entering the labor market, individuals are employed with some probability and, if employed, receive a wage rate. Both the wage rate and the probability of employment depend on their schooling attainment and labor force experience.147 The model has three sources of uncertainty: a schooling preference shock, a wage shock and an employment shock. It also includes six unobserved types to capture unobservable heterogeneity in schooling ability and in market ability, and one of the goals of the paper is to recover the correlation between unobserved schooling and market ability.

The model is estimated by simulated maximum likelihood on a sample of white males from the NLSY79. The estimated parameters indicate that log wages are convex in years of schooling, with statistical tests rejecting the hypothesis of linearity. The log wage equation has estimated returns to schooling that are very low (1%) until 11th grade, increase to 3.7% in grade 12, and exceed 10% between grades 14 and 16. The estimated returns to schooling are substantially lower than corresponding OLS estimates. For a linear-in-schooling specification, one obtains an OLS estimate of 10%, in comparison with the structural model estimates of 2% on average up to grade 12 and 7% after grade 12.148 They also find that there is a strong positive correlation (0.28) between market ability and realized schooling, which would imply that estimated returns to schooling from wage regressions that do not control for the endogeneity of schooling will tend to be upwardly biased.

Sullivan (2010) develops a DCDP model that combines a model of labor force dynamics (as in Wolpin (1992) and Rendon (2006)) with a human capital model of schooling and occupational choice (as in Keane and Wolpin (1997)). The previous literature considered job search as a separate phenomenon from schooling and occupational choice, though the choices are clearly related. In Sullivan’s (2010) model, workers decide in each period whether to attend school and/or work in one of five occupations or neither work nor attend school. An individual who has not graduated from high school may also decide to earn a GED. An employed individual may stay at the current job or switch jobs either within the same occupation or with a change in occupation. Human capital accumulated through work experience is both firm- and occupation-specific. Individuals have heterogeneous skill endowments and preferences for employment in different occupations. Wage offers include a match-specific component, reflecting worker-firm permanent match productivity, and an iid time varying shock. Search arises because of variation in worker-firm match productivity together with mobility costs. Model parameters are estimated by simulated maximum likelihood using data from the NLSY79.

The model estimates are used to perform a number of counterfactuals. Sullivan’s (2010) analysis finds that occupational and job mobility are critical determinants of life cycle wage growth, quantitatively more important than the accumulation of occupation-specific human capital. As in previous research, the results also indicate the importance of comparative advantage in understanding schooling and occupational choices. Sullivan also finds that unobservable heterogeneity plays a relatively smaller, though still substantial, role in explaining labor market outcomes than has been found, for example, by Keane and Wolpin (1997). He attributes 56% of the variation in lifetime utility to permanent heterogeneity, which compares to 90% found in Keane and Wolpin (1997).

General equilibrium models of schooling and occupational choice

Most of the literature on modeling occupational and schooling choices is partial equilibrium, taking skill prices as given. However, a few papers in the literature estimate general equilibrium models in which skill prices respond to changes in aggregate market demand and supply for skills.

The earliest paper to estimate a multi-sector general equilibrium model is Heckman and Sedlacek (1985). The paper is an extension of Roy (1951). Although static, and thus not a DCDP model, the paper serves as a link to the later general equilibrium models that fall within the DCDP paradigm. Specifically, Heckman and Sedlacek (1985) estimate a model of individuals’ decisions among three sectors: work in the manufacturing sector, work in the non-manufacturing sector or not work. In addition to specifying the micro-level supply-side sector choice model, the paper estimates an aggregate demand function for skill. The micro supply-side model and the aggregate demand models are used jointly to simulate the effects of price changes on employment levels and wages, such as an increase in the price of energy that predominantly affects labor demand in manufacturing.

Establishing the link between aggregate skill quantities and skill prices can be conceptually important for analyzing the effects of policy interventions. To illustrate, consider the impact of a college tuition subsidy on the fraction of people going to college. A tuition subsidy must act as a positive inducement to college attendance. In a general equilibrium framework, a college tuition subsidy that induces more people to go to college would also decrease the price paid to college skill given the increase in the aggregate quantity of college educated labor. For this reason, we would expect the predicted general equilibrium effect of a college tuition subsidy on college-going to be smaller than the predicted partial equilibrium effect. The quantitative significance of the supply effect on skill prices is an empirical question.

The papers described below develop and estimate general equilibrium models incorporating schooling and employment choices. The goals of these papers are to understand historical wage and employment patterns for workers of different skill levels and to analyze the effects of skill formation policies, such as tuition subsidies.

Wage inequality has increased at least since the 1980s, with low skill workers experiencing both absolute and relative declines in real wages as the economic returns to skill acquisition have risen. Heckman et al. (1998) (HLT) present the first general equilibrium model of schooling and job training choices, which they use to explore alternative explanations for observed wage patterns and to simulate the effects of college tuition subsidies. In the HLT model, individuals make decisions about whether to go to college, about post-school on-the-job training (human capital investments à la Ben-Porath) and about life-cycle savings. The model assumes that individuals decide whether or not to go to college and on their optimal life cycle consumption and human capital investment paths, assuming they work each period until the age of retirement.149 There are no credit constraints. The market wage for each skill type is the product of the skill rental price and the amount of accumulated skill. Individuals are heterogeneous in terms of initial skill endowments, captured by the observed AFQT test score.

The model is solved for overlapping generations of agents and estimated using both aggregate CPS data (from 1963 to 1993) and longitudinal data from the NLSY79. Cohorts make different choices because they face different (known) skill prices over their lifetimes. The model assumes a one-to-one correspondence between schooling groups (high school and college) and skill types, that is, that different skill types cum schooling classes are imperfectly substitutable. However, age groups within a given schooling group (high school or college) are perfect substitutes. Skill prices are determined in equilibrium. Equilibrium skill prices induce aggregate skill supplies that equate marginal revenue skill products to skill prices.

HLT calculate the partial equilibrium and general equilibrium impact of a 100 dollar increase in tuition on college enrollment. They find that the partial equilibrium response is a decline in enrollment of 1.6 percent. However, when they allow for skill prices to adjust to the reduction in college skill, that is for the increase in the relative price of college skill, they find that the decline in enrollment is less than 0.2 percent. Thus, the adjustment in the relative price of college to high school skill almost completely offsets the disincentive to acquire schooling. Presumably, a tuition subsidy of a similar magnitude can be expected to lead to only a negligible increase in college enrollment due to the fall in the relative college skill price.
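The logic of the partial versus general equilibrium comparison can be sketched with a stylized two-equation example. This is not the HLT model: we assume a logistic enrollment rule that responds to the college skill premium and to tuition, and a relative demand curve in which the premium falls with the relative supply of college skill. All functional forms and parameter values are hypothetical.

```python
import math


def premium(s, const=0.4, sigma=1.5):
    """College skill premium as a decreasing function of the college share s."""
    return const - math.log(s / (1.0 - s)) / sigma


def enrollment(p, tuition, a=-1.0, b=3.0, c=0.5):
    """Share choosing college given premium p and tuition (assumed logistic)."""
    return 1.0 / (1.0 + math.exp(-(a + b * p - c * tuition)))


def ge_enrollment(tuition, tol=1e-12):
    """Equilibrium share: the fixed point s = enrollment(premium(s), tuition).

    enrollment(premium(s), tuition) - s is strictly decreasing in s,
    so bisection finds the unique root.
    """
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if enrollment(premium(mid), tuition) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    base = ge_enrollment(1.0)                    # baseline equilibrium
    pe = enrollment(premium(base), 1.1)          # skill price held fixed
    ge = ge_enrollment(1.1)                      # skill price adjusts
    print("baseline share:", round(base, 4))
    print("partial equilibrium change:", round(pe - base, 4))
    print("general equilibrium change:", round(ge - base, 4))
```

In this sketch the general equilibrium decline is smaller than the partial equilibrium decline for any positive price response; how much smaller depends on the elasticity of the premium with respect to relative supply, which is the empirical question the estimated models address.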

Lee (2005) estimates an alternative formulation of a general equilibrium schooling and occupational choice model. The specification of the individual’s problem parallels that of Keane and Wolpin (1997). Specifically, in each period an individual decides whether to attend school, work in one of two occupations, blue collar or white collar jobs, or do neither. Individuals are heterogeneous in skill and preference endowments and are subject to idiosyncratic time-varying shocks. A critical difference between the model of Lee (2005) and that of HLT (1998) is that in Lee’s model, an individual’s skill type is not equated with their schooling. Schooling augments both white and blue-collar skill, though differentially, and it is the aggregate levels of the occupation-specific skills that enter as inputs into the aggregate production function. In Lee’s (2005) model, occupations are not perfectly substitutable, but education types are perfectly substitutable within occupation and age groups. This difference between the HLT (1998) and Lee (2005) models has important consequences for the relationship between partial and general equilibrium effects of tuition policies.

Lee (2005) estimates the model using simulated method of moments applied to CPS data on schooling, occupational choice, employment and cohort size, under an assumption that individuals have perfect foresight about future skill prices. The estimated model is then used to investigate how cohort size affects skill prices and wages and also to evaluate the effects of a college tuition subsidy. As in HLT, Lee evaluates the partial and general equilibrium impacts of a 100 dollar increase in tuition on college enrollment. The partial equilibrium effect ranges from 1.2 to 1.9 percent depending on age and gender, similar in magnitude to that in HLT. However, the general equilibrium effect is found to be only about 10 percent lower, in sharp contrast to the result in HLT. The reason, supported by simulations performed by Lee, is that workers can respond to changes in the relative price of blue- and white-collar skill by switching sectors. A tuition increase that reduces college enrollment, and thus increases the relative white-collar skill price, induces some blue-collar workers to switch to white-collar jobs, mitigating the rise in the white-collar skill price.

A recent paper by Lee and Wolpin (2010) develops and estimates a general equilibrium model to explain the evolution of wages and employment over the last 30 years, including gender differentials in employment and earnings, which were not considered in the previous two studies that focused only on males.150 Specifically, the study aims to account for changes in wage inequality (both overall and within demographic groups), increases in relative wages and employment of women, and a shift that has occurred over time in employment from the goods to the service producing sector. There is an extremely large, mostly nonstructural, literature that considers each of these major labor market changes as separate phenomena.151 Lee and Wolpin (2010) develop a comprehensive framework which includes many of the factors considered to be potential explanations for these major labor market changes.152

The model estimated in the paper has two production sectors, corresponding to goods and services. Aggregate production depends on three skill types (white-, pink- and blue-collar) and on capital. There are time-varying neutral and non-neutral technological changes as well as combined aggregate productivity and relative product price shocks. The goods-to-service product price and the price of capital evolve exogenously.

In the model, men and women age 16-65 can choose to work in any of six sector-occupations (pink collar, white collar or blue collar in either the goods or service sectors), to attend school or to stay home. Each period, individuals receive wage offers from each sector-occupation that depend on schooling attainment and accumulated experience in each sector-occupation. There are also nonpecuniary payoffs and preference shocks to the different options. To capture lower labor force participation rates of women during child-bearing ages, the value of the home choice is assumed to depend on the number of preschool age children in the household. It is also allowed to vary over time to reflect technological improvements that are thought to have occurred in the home sector. In addition, there is a cost of transiting between sector-occupations, which can be interpreted as labor market frictions.153 The population at any point in time consists of overlapping generations of both sexes. Unobservable heterogeneity is incorporated by including five unobserved types of individuals who differ in sector-specific endowments and in preferences for the home and school options.

Skill prices are equated to marginal revenue products evaluated at aggregate skill amounts. The paper also develops a belief consistent forecast rule for future skill prices, as an approximation of a rational expectations equilibrium. Model parameter estimates are obtained by simulated method of moments, matching the model’s predicted levels of wages, employment and school enrollment to data from the CPS, BLS and NLSY79.

Lee and Wolpin (2010) use the estimated model to assess the relative contribution of changing technology, preferences and exogenous forcing variables (the goods to service product price, the price of capital, fertility) as explanations of the previously described major labor market changes. This is done by using the model to simulate labor force outcomes under hypothetical scenarios relative to a baseline economy. The key findings from the analysis are that (i) neutral technological change best accounts for service sector employment growth, (ii) skill-biased technological change best explains the rise in the college wage premium and the increase in overall wage inequality, (iii) the combination of neutral and biased technological change accounts for the declining gender gap and increased female labor force participation, and (iv) changes over time in fertility and in the valuation of the home sector can account for female-male relative wage and employment growth. The study concludes that a competitive general equilibrium model of the labor market provides a comprehensive framework for analyzing the determinants of wage and employment changes over the last 30 years and that both demand and supply side factors are required to account for the major labor market changes.

4.3.3 The use of DCDP models in related contexts

The previously discussed studies focused on schooling and occupational choice decisions. We next describe a DCDP literature that develops models to study how marriage decisions interact with labor force decisions, the operation of particular occupational labor markets, various behaviors of adolescent youth and the effects of job training on training program participants.

Marriage and career decisions

Gould (2008) estimates a DCDP model of marriage and career decisions of young men age 16-35 using data from the NLSY79 with the aim of exploring the extent to which schooling and employment choices are influenced by marriage market considerations. Individuals choose among four sector options: schooling, white-collar work, blue-collar work and home. In addition, men face potential marriage opportunities that are conditional on their demographic characteristics and on their current and previous marriage, schooling and employment decisions. Based on the available opportunities, they decide among marriage states. Specifically, there is some probability of receiving a marriage offer (from a woman of given type) and men decide whether to accept the offer. Married men also face an exogenous probability of having their marriage terminated by their wife. The model incorporates four unobserved types of individuals to allow for unobservable heterogeneity.

Model parameters are estimated by simulated maximum likelihood.154 The estimated model is used to study how young men’s career choices would change if there were no marriage market returns to career decisions, that is, by shutting down marriage within the model. Simulation results show that the marriage market significantly affects men’s schooling and labor market decisions. Without marriage, men work less, study less, and relatively more often choose blue-collar over white-collar work. Another simulation examines the effects of changing divorce costs on men’s choices. A decrease in divorce costs leads men to take fewer measures to guard against a marital break-up; they invest less in education and relatively more often choose blue-collar over white-collar work. Overall, Gould (2008) finds that the private returns to human capital investment include significant returns in the marriage market.

Occupational labor markets

We next describe three studies that use DCDP models to analyze the operation of a particular occupational labor market. Sauer (1998, 2004) studies life-cycle career choices of law school graduates following graduation from the University of Michigan Law School and how these choices are affected by financing options and loan forgiveness programs. Stinebrickner (2001) studies the decisions by certified elementary and secondary teachers to stay in or exit from the teaching sector.

Sauer (1998) estimates a model of a law school graduate’s choices among five employment sectors that differ in pecuniary and nonpecuniary returns, in promotion and dismissal probabilities, and in the extent to which human capital is transferable across sectors. The possible employment sectors are nonprofit, elite private law firm, non-elite private law firm, separate business and sole proprietor. Attorneys choose among the different sectors taking into account effects of current choices on future job opportunities and wage offers, which depend on endogenously accumulated sector-specific work experience. Within private law firms, lawyers also have the opportunity for promotion from an associate to a partner position. The model includes unobserved types that are assumed to be known to both the worker and to the firm, but not to the researcher. The model is estimated by simulated maximum likelihood.

An interesting aspect of the model is that it generates sector-specific non-monotonic hazards in duration of employment, as observed in the data, through a mechanism that is different from that of the classical job-matching model (Jovanovic, 1979). In Sauer’s (1998) model, the ability of the worker and the quality of the match are known from the beginning, and non-monotonic hazards arise because of self-selection. In particular, high ability lawyers face higher probabilities of promotion at private law firms and stay at these firms when they get promoted. Low ability lawyers initially also work at private firms even though they have a low probability of getting promoted, because their experience pays off later in the form of higher-paying jobs in other sectors. The self-selection mechanism has implications for effects of policy interventions in the market for lawyers, such as programs that forgive loans if a lawyer enters the nonprofit sector. Simulations using the estimated model indicate that a loan forgiveness program induces low ability types to enter the nonprofit sector earlier but is relatively ineffective in attracting high ability lawyers.

A follow-up paper by Sauer (2004) extends his previous DCDP model to incorporate educational financing decisions. The study’s goal is to measure the effects of short-term parental cash transfers and family background on educational borrowing and in-school work decisions, and ultimately on earnings after graduation, and also to better understand effects of policies such as tuition tax credits and loan forgiveness programs on these decisions and on post-graduation outcomes. The model assumes that individuals maximize their expected present value of lifetime utility by making decisions on the level of educational indebtedness, whether to work while in school and the type of post-graduation employment. Total financial resources during law school come from five possible sources: parental cash transfers, initial assets, stochastic unobserved assets, educational debt, and stochastic labor income. Post-graduation job market choices are modeled analogously to Sauer (1998). The model also includes three unobserved types to capture unobservable heterogeneity, where the type probabilities depend on family background variables (that include whether the father was an attorney) and whether the individual has an Ivy League BA.
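To make the role of the type probabilities concrete, the following is a minimal sketch of how the probability of belonging to each of three unobserved types could be made to depend on observed covariates. The multinomial logit form, the function name `type_probabilities` and the coefficient layout are illustrative assumptions, not the specification reported in Sauer (2004).

```python
import math

def type_probabilities(father_attorney, ivy_league_ba, coefs):
    """Return probabilities of three unobserved types (hypothetical logit form).

    father_attorney, ivy_league_ba: 0/1 indicators.
    coefs: two (intercept, b_father, b_ivy) tuples, one for type 2 and one
           for type 3; type 1 is the base category with index zero.
    """
    indices = [0.0] + [
        c0 + c_father * father_attorney + c_ivy * ivy_league_ba
        for (c0, c_father, c_ivy) in coefs
    ]
    # Subtract the maximum index before exponentiating for numerical stability.
    top = max(indices)
    weights = [math.exp(v - top) for v in indices]
    total = sum(weights)
    return [w / total for w in weights]
```

In estimation, an individual's likelihood contribution would be a weighted average of type-specific likelihoods, with weights of this kind evaluated at the individual's covariates.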

Model parameters are estimated by simulated maximum likelihood allowing for classification error. The estimated model is used to examine the effects on students’ borrowing, work while in school and subsequent employment choices of a loan forgiveness program that grants an annual subsidy equal to an individual’s debt obligation for individuals who take jobs in the nonprofit sector within the first 10 years after graduation. Simulations of the model with and without such a program indicate that the program increases borrowing and reduces work while in school. However, the loan forgiveness program has essentially no effect on the choice of first job, with the same types of individuals being most likely to enter the nonprofit sector.155 The main difference is that they enter that sector with more debt. The effect of the loan forgiveness program on participation in the nonprofit sector is similar to that found in Sauer (1998), except that allowing individuals to change their borrowing behavior now substantially increases the cost of providing the program.

Stinebrickner (2001) develops and estimates a DCDP model to study the decision of certified elementary and secondary teachers to remain in the teaching sector, to exit into the nonteaching sector or to leave the labor force. Certified teachers often leave the teaching sector within two to nine years following certification. This high turnover is of particular concern because certification requirements and wage rigidities in the teacher labor market make it difficult for the market to adapt to fluctuations in teacher demand.

In Stinebrickner’s (2001) model, certified teachers receive wage offers in each period in both the teaching and nonteaching sectors. They decide whether to work in the teaching sector, in the nonteaching sector or to not work. The model incorporates marital status and number of children, which are assumed to evolve exogenously. Also, wage offers in both the teaching and nonteaching sectors are allowed to depend on the individual’s SAT score, interpreted as a measure of academic ability. Model parameters are estimated by simulated maximum likelihood using data on 450 certified teachers from the National Longitudinal Survey of the High School Class of 1972.

A key result from the analysis is that the primary cause of leaving decisions by teachers is not the relative attractiveness of nonteaching occupations but rather the decision not to work, which for women is strongly influenced by changes in marital status and in the number of children. For teachers with a high SAT score, though, relatively better options in the nonteaching sector, which has a larger earnings premium for skills measured by the SAT, are a factor influencing their decision to leave the teaching sector. Model simulations indicate that teacher labor supply is responsive to changes in teaching wage offers. Increasing the teacher wage by 20 percent increases the proportion of person-years spent in teaching from 0.5 to 0.8, with a greater response among teachers with higher SAT scores.

Schooling-related choices

There have been a few applications of DCDP models to analyze youth behaviors while in school, for example, the decisions by youths to work while in school, to drop out of school, to enroll in college or to major in a particular subject in college. Eckstein and Wolpin (1999) use a DCDP model to study the determinants of school dropout decisions and to analyze whether working while in school is detrimental to school performance. In the model, youths choose among various work-school combination alternatives so as to maximize expected lifetime utility. Youths who attend high school accumulate credits towards graduation and receive grades reflecting their performance. In each period, they also receive random wage offers for either part-time or full-time employment, which they can either accept or reject. The wage offers depend on their skill endowments, educational attainment and previous labor market experience. Working potentially reduces school performance, as measured by course grades, and thus may increase the probability of failing to progress. The model also incorporates unmeasured heterogeneity at the time of entering high school, in preferences, abilities and in the expected value assigned to receiving a high school diploma. The model is estimated by simulated maximum likelihood using white males from the NLSY79.

Determining the impact of work on high school performance has been the subject of a substantial economics, sociology, and psychology literature.156 Eckstein and Wolpin (1999) find that working while in school reduces academic performance, but the quantitative effect is small. A hypothetical policy that forces youths to stay in high school for five years without working or until they graduate increases the percentage of high school graduates by only 2 percentage points, but increases the average number of years of high school completed by dropouts by one year. As in other studies, initial traits at the time of starting high school are found to be major determinants of dropout behavior. Youths of the types with lower school ability and/or motivation, a lower expected value of a high school diploma, a higher value placed on leisure time, higher skills in jobs that do not require a high school diploma, and a lower consumption value of attending school tend to drop out of school. The implication is that youth labor policies that do not alter the traits that youths bring to high school will be relatively ineffective in improving school outcomes.

Arcidiacono (2005) uses a DCDP model to study how changing the admission and financial aid rules at colleges affects the future earnings of individuals. Specifically, he develops and estimates a behavioral model of decisions about where to submit college applications, which school to attend and what field to study. In the model, individuals make application decisions based on their expectation of the probability of acceptance, the application cost, the expected financial aid conditional on acceptance, and an expectation of how well they will like a particular college and major combination. Schools make admissions and financial aid decisions; but rather than specifying and structurally estimating the school optimization problem, it is assumed that the school’s maximization problem leads to a logit probability of a particular student being admitted to a school conditional on the quality of the school and the individual’s own ability. School quality is measured by the average math and verbal SAT at the school.157
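The reduced-form admission rule can be illustrated with a minimal sketch of a logit probability that depends on school quality and individual ability. The linear index, the parameter names and the default coefficient values are hypothetical placeholders chosen only for illustration; they are not estimates from Arcidiacono (2005).

```python
import math

def admission_probability(school_quality, ability,
                          b0=0.0, b_quality=-1.0, b_ability=1.0):
    """Logit probability of admission (illustrative functional form).

    school_quality: e.g. a standardized average SAT score at the school.
    ability: a standardized measure of the applicant's own ability.
    b0, b_quality, b_ability: hypothetical coefficients; the signs assume
        that higher-quality schools are harder to enter and that more able
        applicants are more likely to be admitted.
    """
    index = b0 + b_quality * school_quality + b_ability * ability
    return 1.0 / (1.0 + math.exp(-index))
```

Under this kind of specification, a counterfactual change in admission rules amounts to evaluating the same function with a different set of coefficients for the affected group.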

Conditional on the offered financial aid and acceptance set, individuals decide what school to attend and what field to study. They also have the option of not attending school and going directly to the labor market. After college, individuals enter the labor market and their expected utility is equated to the log of the expected present value of lifetime earnings. The model is estimated using panel data on high school graduates from a single cohort (the National Longitudinal Study of the Class of 1972). Parameters are estimated by simulated maximum likelihood.158
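As a small worked illustration of the post-college payoff described above, the sketch below computes the log of the present value of a stream of expected earnings. The function name, the annual timing and the discount factor of 0.95 are assumptions made for the example, not values taken from the paper.

```python
import math

def log_expected_present_value(expected_earnings, discount_factor=0.95):
    """Log of the discounted sum of expected per-period earnings.

    expected_earnings: sequence of expected earnings, one entry per period,
        starting with the first period in the labor market.
    discount_factor: hypothetical per-period discount factor.
    """
    present_value = sum(
        (discount_factor ** t) * earnings
        for t, earnings in enumerate(expected_earnings)
    )
    return math.log(present_value)
```

Because the present value is linear in earnings, discounting the expected earnings period by period gives the same result as taking the expectation of the discounted sum.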

The estimated model is used to examine (i) the effects of affirmative action on college-going decisions of African American students and on their labor market outcomes and (ii) the reasons for large earnings and ability differences across college majors. With regard to affirmative action, Arcidiacono (2005) simulates how African American educational choices would change if they faced white admission and financial aid rules. Past research has shown that racial preference in the admissions process is a practice mainly at top-tier institutions. Model simulations show that removing racial advantages in financial aid substantially reduces the number of African Americans who attend college and that removing advantages in admission reduces the number attending top-tier schools. However, even though such policies affect the college choice decision, they do not do much to alter lifetime earnings, which are in large part determined by initial endowments, in line with Keane and Wolpin’s (1997) earlier finding.

Second, Arcidiacono (2005) uses the model to examine the reasons for large earnings and ability differences across college majors, in particular the high earnings premiums for natural science and business majors. He finds that monetary premia for certain majors cannot explain ability sorting across majors. Instead, almost all of the sorting occurs because of differing preferences for majors (and the jobs associated with those majors) by initial abilities. Differences in math ability are shown to be an especially important factor explaining both labor market returns and sorting across majors.

Job training

Cohen-Goldner and Eckstein (2008) use a DCDP model to study the impact of a job training program on labor mobility and human capital accumulation. Their data consist of a short panel of observations on 419 prime-age male immigrants in Israel who came from the former Soviet Union. Many of these immigrants were highly skilled upon arrival in Israel, but some of their skills were not directly transferable to the Israeli labor market. A typical pattern in the data is that immigrants start out as unemployed, move to blue collar jobs and then gradually move into white collar jobs. The government offers these immigrants a language course and job training courses to facilitate their employment transition, with a requirement that they pass a test in the Hebrew language to participate in training. One of the goals of Cohen-Goldner and Eckstein (2008) is to study the effects of these local training courses on labor market outcomes.
