Taxes, Deadweight Loss and Intertemporal Female Labor Supply: Evidence from Panel Data

Very few existing studies have estimated female labor supply elasticities using a U.S. panel data set, though cross-sectional studies abound. Moreover, most existing studies use a static framework. I attempt to fill this gap in the literature by estimating a lifecycle-consistent specification with taxes, in a limited dependent variable framework, on a panel of married females from the PSID. Both parametric random effects and semiparametric fixed effects methods are applied. I find evidence of larger substitution effects than those found in the female labor supply literature with taxes, suggesting considerable distortionary effects from income taxation. The uncompensated wage elasticity is estimated at 1.4, which is comparable to estimates found in other labor supply studies with taxes. The income effect in a lifecycle-consistent specification is negative and very small. The estimate of the compensated elasticity for females in the sample is 1.4 using the random effects estimator and 1.35 using the semiparametric fixed effects estimator. I estimate the exact deadweight loss from taxes and find that the deadweight loss from a 20% increase in the marginal tax rate is about 30% of tax revenue collected, evaluated at the sample mean. The deadweight loss from taxation of the wife’s labor income from 1980 to 1987 for a median household is estimated to be 57% of tax revenue, as opposed to 49% under a switch to a revenue-neutral proportional tax system. Finally, the intertemporal preference parameters are estimated using GMM.


Introduction
Due to a sustained increase in female labor force participation and the proliferation of tax and transfer policies targeted at women's labor market behavior, the estimation of female labor supply elasticities is an important focus of research in public finance and labor economics. There is a very large literature on female labor supply, but little consensus on elasticity estimates. For example, the estimates of compensated wage elasticities range from -0.12 (Nakamura and Nakamura, 1981) to as large as 15.35 (Dooley, 1982). However, there is little disagreement that female elasticities are much larger than those for males, implying a bigger deadweight loss from taxation (Hausman, 1981a; Triest, 1990; Kimmel and Kniesner, 1998). In this paper, I estimate female labor supply elasticities in the presence of taxes in a lifecycle-consistent framework, using a panel of married females from the Panel Study of Income Dynamics (PSID), and then calculate the economic cost of taxation.
The paper makes five contributions to the literature. First, I empirically estimate a lifecycle-consistent secondary earner female labor supply model without imposing separability in the intertemporal budget constraint. I fully take into account the nonlinearities in the budget set induced by the tax system by using the differentiable budget constraint methodology proposed by MaCurdy, Green, and Paarsch (1990). 1 Second, I am one of the first to apply semiparametric techniques to account for fixed effects in the context of female labor supply on a U.S. panel dataset. Third, this is one of the first attempts to estimate a panel selection-bias-corrected wage equation in a Type III Tobit framework. Fourth, I use the labor supply parameter estimates to estimate exact deadweight loss and simulate the impact of alternative tax regimes on female labor supply. Finally, I estimate the parameters of the intertemporal Euler equation and the intertemporal labor supply elasticity.
1 I do not use maximum likelihood with a kinked budget set implemented in the path-breaking works of Hausman (1981a) and Burtless and Hausman (1978). This method has been the subject of much controversy and debate over the last decades. Besides imposing theoretical restrictions at the kink points (MaCurdy et al., 1990), this method is susceptible to nonlinear measurement error problems, and also to errors in specifying the budget set. Moreover, Saez (2000) argues that there is a lack of clustering of individuals, except at the first kink point of the tax system, a situation that is not particularly conducive to maximum likelihood estimation.
I have five primary findings. First, the labor supply estimates are sensitive to the method used to predict wages for non-workers. Second, I estimate a compensated wage elasticity of 1.4, which translates into a sizeable substitution effect. Third, my estimates of the uncompensated wage elasticity at the intensive margin are between 0.8 and 0.9; on the participation margin, the estimated elasticity is 0.7, somewhat lower. Thus, I find evidence of substantial response on both margins. The large estimated substitution effect has implications for tax policy, and creates potential for large efficiency costs of income taxation. Fourth, the exact deadweight loss (at the sample mean) from a 20% tax imposition in most specifications is close to 30% of the tax revenue raised. My simulations indicate that the tax system that existed between 1980 and 1987 could have resulted in a deadweight loss of 57% of tax revenue collected, for the median married household. On the other hand, a revenue-neutral switch to proportional taxation would reduce deadweight loss from taxation to 49% of tax revenue. Finally, my estimate of the Frisch elasticity of 1.5 is very close to the compensated elasticity, which suggests that intratemporal elasticities provide a good approximation for evaluation of the intertemporal impact of an anticipated wage change.
The analysis proceeds in three steps. First, I estimate a selectivity-bias-corrected wage equation to predict wages for non-workers in the sample. I use both a standard probit and a Type III Tobit model that uses more information by specifying a censored selection variable based on hours. 2 Next, I estimate a lifecycle-consistent labor supply equation based on two-stage budgeting, treating the after-tax wage and virtual income as endogenous.
I use the parametric random effects estimator and semiparametric estimators proposed by Honore (1992) and Kyriazidou (1997). Finally, I use the intratemporal labor supply parameter estimates to recover the intertemporal preference parameters. Specifically, I follow Blundell, Meghir, and Neves (1993) and Ziliak and Kniesner (1999) and parameterize a monotonic transformation of the indirect utility function, allowing the underlying preference parameters to vary with demographics. All the variables in the model are treated as endogenous, and the parameters of a log transformation of the Euler equation are estimated using lagged values of economic and demographic variables in the information set as instruments. I apply iterative GMM to estimate the intertemporal preference parameters efficiently. This step enables me to recover the Frisch elasticity.
This paper is organized as follows. Section 2 presents the theoretical framework for married female labor supply in the presence of joint nonlinear capital and labor income taxation, and outlines the reasons for the use of two-stage budgeting. Section 3 outlines the econometric specification. Section 4 provides a brief description of the data and the construction of the key variables: wage, income, assets, and taxes. Section 5 describes the estimation of the selection-bias-corrected wage equation. The estimation of the labor supply equation is discussed in Section 6. The results on deadweight loss are presented in Section 7. The estimation of the intertemporal preference parameters is discussed in Section 8. A brief conclusion follows.

Background and Theoretical Framework
Most female labor supply studies accounting for taxes have adopted a static framework. Even though I use a dynamic framework to study the effect of taxes, the static model serves as a useful starting point to describe the theory.

The Static Model with Taxes
Let $U(C_t, H_t, X_t)$ be a strictly quasiconcave utility function in which $C_t$ is consumption in period $t$, $H_t$ is hours worked, and $X_t$ is a vector of exogenous taste shifters.
In a standard static labor supply model with taxes, the consumer maximizes the utility function in period $t$, $U(C_t, H_t, X_t)$, subject to the budget constraint $C_t = W_t H_t + y_t - T(I_t)$, with $I_t = W_t H_t + y_t - D_t - E_t$ (1), where $W_t$ is the gross wage, $y_t$ is unearned income, $I_t$ is the taxable income of the individual, $D_t$ is tax deductions, $E_t$ is exemptions, and $T(\cdot)$ is a function determining tax liability. Graduated tax rate and bracket structures create a piecewise-linear budget set with kinks at the points where the marginal tax rate changes. Figure 1 presents the budget set for a typical individual under a hypothetical progressive income tax with three tax brackets. Because the tax rate the individual faces is a function of hours, the after-tax wage and virtual income are functions of hours worked and need to be treated as endogenous in the estimation.
There are three approaches to the estimation of the static model. One approach to estimating the labor supply function is to linearize the budget constraint at observed hours (Hall, 1973). To account for the fact that taxes are endogenous, one can use an appropriate instrumental variable procedure to recover the parameters of the labor supply function. A more efficient approach is to use maximum likelihood to estimate the labor supply function while fully accounting for the probability of observing an individual on the budget segments or the kinks (Burtless and Hausman, 1978; Hausman, 1980; Hausman, 1981a; Hausman, 1985a; Hausman, 1985b). The maximum likelihood approach imposes the requirement of exact knowledge of the budget set and all the tax parameters and is particularly vulnerable to measurement error in income. A third approach is to construct a smooth and differentiable budget constraint, as proposed by MaCurdy, Green and Paarsch (1990) and used, among others, by Ziliak and Kniesner (1999), Blomquist (1996), and Aaronson and French (2002). This provides a reasonable approximation of the tax schedule facing the individual and facilitates the use of instrumental variable methods to estimate the labor supply function.
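The linearization at observed hours can be illustrated with a minimal sketch. The three-bracket schedule, the bracket thresholds, and the function names below are hypothetical and are not taken from the paper; deductions and exemptions are set to zero for simplicity. Virtual income is computed as the intercept of the budget segment on which the individual locates, extended back to zero hours.

```python
# Hypothetical progressive schedule: (lower bound of bracket, marginal rate).
# These thresholds and rates are illustrative assumptions, not the paper's values.
BRACKETS = [(0.0, 0.10), (20_000.0, 0.20), (50_000.0, 0.30)]


def marginal_rate(taxable_income):
    """Marginal tax rate on the bracket containing taxable_income."""
    rate = BRACKETS[0][1]
    for lower, bracket_rate in BRACKETS:
        if taxable_income >= lower:
            rate = bracket_rate
    return rate


def tax_liability(taxable_income):
    """Total tax owed, summed over the brackets below taxable_income."""
    tax = 0.0
    for k, (lower, rate) in enumerate(BRACKETS):
        upper = BRACKETS[k + 1][0] if k + 1 < len(BRACKETS) else float("inf")
        if taxable_income > lower:
            tax += rate * (min(taxable_income, upper) - lower)
    return tax


def linearize_budget(gross_wage, hours, unearned_income):
    """Return (after-tax wage, virtual income) at the observed hours."""
    taxable_income = gross_wage * hours + unearned_income
    tau = marginal_rate(taxable_income)
    net_wage = gross_wage * (1.0 - tau)
    consumption = taxable_income - tax_liability(taxable_income)
    # Intercept of the current budget segment extended to zero hours.
    virtual_income = consumption - net_wage * hours
    return net_wage, virtual_income


net_wage, virtual_income = linearize_budget(15.0, 2000.0, 10_000.0)
print(net_wage, virtual_income)
```

Because both outputs depend on the bracket reached at the chosen hours, they move with hours, which is why the text treats them as endogenous.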
The static model imposes myopic behavior on the part of consumers and assumes perfectly constrained capital markets. This means that if the labor supply decision is taken in a multi-period setting then static regressions can confuse shifts of wage profiles with movements along wage profiles, and, thus, yield parameter estimates that lack precise economic meaning. The elasticities derived from the static specification can be placed in an intertemporal setting but are economically meaningful only under strong assumptions of either complete myopia or perfectly constrained capital markets (Blundell and MaCurdy, 1999). I relax both of these restrictions by estimating a dynamic model with taxes.

The Dynamic Model with Taxes
The consumer maximizes the expected present discounted value of utility subject to an asset accumulation constraint. In this problem, $r_t$ is the stochastic rate of return, $E_t$ is the expectation operator conditional on the information set $\Omega_{t-1}$, $A_t$ represents assets in period $t$, and $I_t$ is adjusted gross income. Noting that $C_t$ and $H_t$ appear in the period $t+1$ asset accumulation constraint, one can write the Lagrangean for the maximization problem and derive the first-order conditions, in which the marginal tax rate appears.
If the tax function is nonlinear, from (9) and (10) and using the implicit function theorem, one gets the $\lambda$-constant consumption and hours demand functions (MaCurdy, 1980; MaCurdy, 1981). Estimating the $\lambda$-constant labor supply function is a key method for empirical implementation in panel data, where $\lambda$ is treated as an individual-specific fixed effect. In the presence of joint nonlinear labor and capital income taxation, this function depends on $\lambda$ and current as well as future prices and tax rates. In this scenario, $\lambda$, the marginal utility of wealth, ceases to be a sufficient statistic for extra-period information. Also, one can see from (9) and (10) that future taxes enter the opportunity cost of today's consumption and leisure, because capital income on $A_t$ is taxed in period $t+1$. Thus, there is intertemporal non-separability in the budget constraint induced by joint nonlinear capital and labor income taxation that creates a link between today's actions and tomorrow's relative prices. Note that if joint labor and capital income taxation is ignored, the portion in the curly brackets in (9) and (10) disappears, giving consumption and hours demand functions that have no between-period dependence, and $\lambda$ becomes a valid sufficient statistic for extra-period information.
In that case, one gets an empirically tractable labor supply function. Indeed, with intertemporal separability of the budget set, (9) and (10) collapse to the usual within-period marginal rate of substitution condition. Even though non-separability in the intertemporal budget constraint does not change the relative allocation within a period, one can no longer use the $\lambda$-constant labor supply function to get the levels of demand.
As an alternative, I use two-stage budgeting, proposed by Gorman (1959), to solve the consumer's problem. In particular, Blomquist (1985) showed that in the presence of nonlinear taxation, the $\lambda$-constant labor supply function fails to account for the nonseparabilities in the budget constraint, but two-stage budgeting continues to be valid. In the first stage, the consumer allocates total expenditure across periods to equate the marginal utility of wealth. In the second stage, she takes the allocation of wealth between periods as given and allocates between consumption and hours, as in a standard static intratemporal problem. I choose to condition on full income, as defined in Blundell and MaCurdy (1999), where $B_t$ is unearned non-asset income in period $t$. This measure can be derived as follows.
We can write full income in terms of $L$, the total time endowment, $B_t$, the unearned non-asset income, and the other components defined before, and then rearrange. In this framework, the consumer's problem in the first stage of two-stage budgeting is written in terms of the full income measure, which captures the capability to transfer funds between periods through $\Delta A_t$, a component that is ignored in the static model.
In a labor supply model with taxes and a piecewise-linear budget constraint, the after-tax wage is the slope of the budget segment on which the individual locates, and is the appropriate price of leisure. In this setup, the coefficient on the after-tax wage can be interpreted as the Marshallian elasticity, and compensated elasticities can be derived using the Slutsky equation (Blundell and MaCurdy, 1999).
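As a sketch of how the Slutsky equation is used here, suppose hours are linear in the after-tax wage and virtual income, with slope coefficients `wage_coef` and `income_coef`. The compensated wage slope is then the uncompensated slope minus hours times the income slope. The function name and all numerical values below are hypothetical and are not estimates from the paper.

```python
def labor_supply_elasticities(wage_coef, income_coef, net_wage, virtual_income, hours):
    """Elasticities implied by a linear hours equation, evaluated at one point.

    wage_coef   : change in hours per unit change in the after-tax wage
    income_coef : change in hours per unit change in virtual income
    """
    uncompensated = wage_coef * net_wage / hours
    income = income_coef * virtual_income / hours
    # Slutsky equation: compensated slope = uncompensated slope - hours * income slope.
    compensated_slope = wage_coef - hours * income_coef
    compensated = compensated_slope * net_wage / hours
    return {
        "uncompensated": uncompensated,
        "income": income,
        "compensated": compensated,
    }


# Hypothetical evaluation point (not the paper's sample means).
result = labor_supply_elasticities(
    wage_coef=100.0, income_coef=-0.01, net_wage=8.0,
    virtual_income=20_000.0, hours=1_000.0,
)
print(result)
```

With leisure a normal good the income coefficient is negative, so the compensated elasticity exceeds the uncompensated one; a non-negative compensated slope is the Slutsky restriction.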

Econometric Specification
I assume a linear specification of the lifecycle-consistent labor supply function for hours worked (Hausman, 1981c). In the labor supply equation (19), $X_{it}$ is a vector of exogenous taste shifters, which consists of children and other demographic factors. 5 In the context of female labor supply,
4 The virtual income is computed from the marginal tax rate, the adjusted gross income $I_t$, and $T(I_t)$, the actual taxes calculated at $I_t$ by integrating the marginal tax function.
5 Mroz (1987)

Data
The Panel Study of Income Dynamics (PSID) began in 1968 and is a longitudinal study of a representative sample of U.S. individuals (men, women, and children) and the family units in which they reside. The sample consists of wives. Women belonging to the Survey of Economic Opportunity (SEO) subsample were excluded from the analysis sample.
The final sample consists of 6,288 person-years of wives from the 1980-1987 PSID, for women who were in the sample for at least two years. Table 1 gives the descriptive statistics on selected variables.

Wages
The PSID contains more than one measure of the wage rate. One measure can be formed by dividing annual real earnings by annual hours worked. This measure has been found in the literature to induce division bias in labor supply estimates, yielding parameter estimates inconsistent with theory (Ziliak and Kniesner, 1999). 7 I use a self-reported measure of the hourly wage, following the methodology in Ziliak and Kniesner (1999), that does not require dividing annual labor income by annual hours and is relatively free of division bias.

Taxes
I use NBER's TAXSIM calculator to compute the marginal tax rate for each individual given her filing status, number of exemptions, and itemization status. The marginal tax rate function is smoothed using a cubic polynomial. 8 This marginal tax function is integrated to infer the actual taxes paid for every individual. The details of the construction of the differentiable budget constraint using the smooth marginal tax rate function are described in the appendix.
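The integration step can be sketched as follows, assuming the cubic marginal tax rate polynomial has already been fitted. The coefficients are hypothetical (income measured in thousands of dollars) and the function names are illustrative; they are not the paper's fitted values or code.

```python
def integrate_polynomial(coefs):
    """Antiderivative coefficients, with the constant of integration set to zero.

    coefs are ordered from the constant term upward.
    """
    return [0.0] + [c / (power + 1) for power, c in enumerate(coefs)]


def evaluate_polynomial(coefs, x):
    """Horner evaluation of a polynomial with coefs ordered from the constant term upward."""
    result = 0.0
    for c in reversed(coefs):
        result = result * x + c
    return result


# Hypothetical smoothed marginal tax rate: tau(I) = a0 + a1*I + a2*I^2 + a3*I^3,
# with I in thousands of dollars. These values are assumptions for illustration.
MTR_COEFS = [0.05, 0.006, -0.00004, 0.0000001]
TAX_COEFS = integrate_polynomial(MTR_COEFS)


def marginal_tax_rate(income):
    return evaluate_polynomial(MTR_COEFS, income)


def tax_paid(income):
    """Taxes implied by integrating the smooth marginal rate from 0 to income."""
    return evaluate_polynomial(TAX_COEFS, income)


print(marginal_tax_rate(50.0), tax_paid(50.0))
```

Because the tax function is the exact antiderivative of the smoothed marginal rate, the implied budget constraint is differentiable everywhere, which is the property the instrumental variable approach relies on.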

Income
Non-labor income of the wife is estimated as the difference between family money income and the labor income of the wife. Thus, wives' non-labor income includes labor income of their husbands. The total labor income of the household is defined as the sum of head's and wife's labor income. The adjusted gross income for the purposes of calculating the marginal tax rate is computed as the sum of total labor income of the household and the capital income. This paper assumes that the wife, as the secondary earner in the household, chooses her labor supply after the husband has made his decision. In a household with head and wife filing joint returns, the marginal tax rate faced by the wife is the tax on the joint labor and capital income of the household.
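The income definitions in this paragraph amount to simple accounting identities, sketched below. The function and argument names are hypothetical, and the dollar amounts are made-up illustrations rather than PSID values.

```python
def household_income_measures(family_money_income, wife_labor_income,
                              head_labor_income, capital_income):
    """Income measures as described in the text, for one household-year."""
    # The wife's non-labor income includes the husband's labor income.
    wife_nonlabor_income = family_money_income - wife_labor_income
    total_labor_income = head_labor_income + wife_labor_income
    # Income concept used to look up the marginal tax rate on a joint return.
    adjusted_gross_income = total_labor_income + capital_income
    return {
        "wife_nonlabor_income": wife_nonlabor_income,
        "total_labor_income": total_labor_income,
        "adjusted_gross_income": adjusted_gross_income,
    }


measures = household_income_measures(
    family_money_income=60_000.0, wife_labor_income=15_000.0,
    head_labor_income=40_000.0, capital_income=3_000.0,
)
print(measures)
```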

Assets
I use the methodology adopted by Ziliak and Kniesner (1999) to construct the asset measures for all the individuals in the sample. Assets consist of liquid and illiquid assets. Liquid assets are the capitalized rent, interest, and dividend income, which are regularly collected by the PSID. The illiquid assets are computed from the difference between the house value and the remaining principal amount. The full income measure uses the asset variable as an input.
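A minimal sketch of the asset construction follows. Capitalizing an income flow requires a rate of return; the single `capitalization_rate` argument used here is an assumption made for illustration, since the text does not state the rate or rates applied. All names and amounts are hypothetical.

```python
def liquid_assets(rent, interest, dividends, capitalization_rate):
    """Capitalized value of asset income: the stock that would generate the
    observed flow at the assumed rate of return."""
    if capitalization_rate <= 0:
        raise ValueError("capitalization_rate must be positive")
    return (rent + interest + dividends) / capitalization_rate


def illiquid_assets(house_value, remaining_principal):
    """Home equity: house value net of the remaining mortgage principal."""
    return house_value - remaining_principal


def total_assets(rent, interest, dividends, capitalization_rate,
                 house_value, remaining_principal):
    return (liquid_assets(rent, interest, dividends, capitalization_rate)
            + illiquid_assets(house_value, remaining_principal))


print(total_assets(500.0, 300.0, 200.0, 0.05, 120_000.0, 80_000.0))
```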

Estimation of the Wage Equation
The hours equation in (19) is estimated over the full sample of wives, and has hourly wage as an explanatory variable. Wages are observed only for women who work.
Following the literature, I calculate wages for women out of the labor force as follows. 9 To estimate the wage equation, I write a two-equation system in which $W_{it}$ is the gross wage rate of the individual, $\alpha_i$ and $\xi_i$ are individual-specific fixed effects, and $u_{it}$ and $a_{it}$ are mean-zero and homoscedastic error terms, respectively. Let $x_i$ represent the vector of all the leads and lags of $x_{it}$, i.e., $x_i = (x_{i1}, x_{i2}, \ldots, x_{iT})$. This approach allows for arbitrary correlation between $u_{it}$ and $x_i$. $d^*_{it}$ is a latent variable that determines labor force participation.
In a standard two-step selectivity-bias-correction procedure, the selection equation is estimated using probit in the first step, and the wage equation is estimated with the inverse Mills ratio as a regressor in the second step (Heckman, 1980). This is a Type II Tobit representation of the wage equation. However, there is no a priori reason to ignore the information on hours of work. A Type III Tobit model uses all available information on hours. In this framework, the selection equation (24) is written with a censored dependent variable. I correct for selection bias in the wage equation using a correlated random effects approach suggested by Wooldridge (1995). 10 In this framework, the wage equation has the following form.
10 Kyriazidou (1997) uses differencing under a conditional exchangeability assumption on the error term to eliminate the fixed effect as well as the selection bias term from the wage equation. Under certain regularity conditions, a two-step method gets rid of both the effects. In the first step, φ in the participation equation is estimated using conditional fixed effects logit or any other consistent estimator. In the second step, ψ is estimated semiparametrically, using a kernel-weighted least squares on first differences. Using this estimator, one can get all the slope parameters but the constant is not identified. Therefore, I do not use this method to correct for selection bias to get the wage prediction because I need the constant to get the correct wage level.
One term in this equation arises due to possible selection bias. I use the Type III Tobit because it uses more information than a Type II Tobit model, which assumes that the selection indicator is unobserved. An estimate of $\upsilon_{it}$ comes from the Tobit residuals obtained by estimating (24). The estimates are presented in Table 2. Column 1 presents the results using the Type II Tobit and column 2 has the results for the Type III Tobit. Some differences in signs and magnitudes of coefficients are noticeable between columns 1 and 2. Age after 45 has a positive effect using the Type II Tobit, while it has a negative effect in the Type III Tobit. High school education has a much stronger effect using Type III. Bad health has a negative effect on the wage using the Type III Tobit but a positive effect using the Type II Tobit. Table 2 clearly shows that the results using the two models are quite different. 13
11 In several studies, the wage equation for females, estimated on a sample of workers, has been found to be relatively insensitive to selection bias (Heckman, 1980; Hausman, 1981). So, I first tested for selection bias in the sample based on a procedure applicable to panel data, suggested in Wooldridge (1995). The coefficient on the estimated residuals is statistically significant with a t-statistic of 4.9. Thus, I rejected the null hypothesis that the estimates of the wage equation on the working sample will not be affected by selection bias. This test entails estimating the second-stage wage equation with fixed effects. The coefficient on the residuals follows a t-distribution under the null.
12 I also estimated a wage equation with a quartic in age, education and all nonredundant interactions. The results did not change.
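For reference, the inverse Mills ratio that enters the second step of the standard two-step (Type II Tobit) correction described above can be sketched as follows. This is a generic illustration using only the standard library; the function names are hypothetical, and the probit index passed in would come from a first-step selection equation that is not estimated here.

```python
import math


def standard_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def standard_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(probit_index):
    """Selection-correction regressor for participants: pdf(index) / cdf(index)."""
    return standard_normal_pdf(probit_index) / standard_normal_cdf(probit_index)


# A lower participation index implies a larger correction term.
for index in (-1.0, 0.0, 1.0):
    print(index, round(inverse_mills_ratio(index), 4))
```

The Type III Tobit variant discussed in the text replaces this regressor with residuals from a censored hours equation, which is why it uses more of the available information.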

Identification
Estimation of the hours equation is plagued by problems of endogeneity. First, in a nonlinear budget set framework, the marginal tax rate is endogenous to the choice of hours of work. Second, as noted by Eissa (1995), the marginal tax rate is a nonlinear function of income and family size, and may be correlated with underlying tastes for work which also may be correlated with income and family size. Third, the gross wage itself may be endogenous, because it may be correlated with unobserved tastes for work.
Most previous studies on taxes and labor supply have used cross-sectional variation in marginal tax rates to identify the effect of taxes. Because the marginal tax rate depends on choices that are correlated with labor supply, there may be insufficient independent cross-sectional variation in tax rates to identify the tax effect on labor supply. Plausibly exogenous time-series variation generated by the tax reforms in the 1980s is relatively underexploited in studies that employ nonlinear budget set techniques. There were two major tax policy reforms in the 1980s: the Economic Recovery Tax Act of 1981 (ERTA) and the Tax Reform Act of 1986 (TRA86). Specifically, Eissa (1995) estimated a reduced-form specification in a difference-in-difference framework using cross-sectional and time-series variation in marginal tax rates from TRA86. However, this study did not explicitly take into account the nonlinear budget set faced by individuals. Eissa (1995) used the TRA86 in a natural experiment framework to identify the labor supply effects of taxes on wives married to husbands at or above the 99th percentile of the income distribution. She used two different control groups: wives married to husbands between the 75th and 80th percentiles of the income distribution, and those married to husbands between the 90th and 95th percentiles of the income distribution. 14 Using panel data, this paper circumvents these problems by using both cross-sectional and time-series variation in tax rates over the 1980s to identify the effect of taxes on labor supply.
13 I interpret the coefficients assuming that the individual effects are correlated only with the lags and leads of time-varying regressors. If this restriction is not imposed, the coefficients on time-invariant regressors are not separately identified from the individual-specific effect (Wooldridge, 1995).
To get a sense of the time-series variation in the 1980s, Figure 2 graphs the federal marginal tax rates by real adjusted gross income for a household filing married jointly, with two children and no age exemption from 1980 to 1987. It can be seen that there was substantial variation in marginal tax rates during this period.
One pitfall of the natural experiment approach is that the behavioral parameters are not identified. In particular, the best one can do is to identify a weighted substitution-income effect (Blundell and MaCurdy, 1999). Therefore, I follow a structural approach where I explicitly seek to identify the parameters of the indirect utility function (20) by estimating the labor supply equation (19). I treat both the after-tax wage and virtual full income as endogenous. Also, the virtual income is constructed using a lump-sum transfer that depends on the marginal tax rate, which is endogenous. I instrument for them using the net wage and virtual income constructed from a synthetic tax rate. 15 The identifying assumption is that this synthetic marginal tax rate is correlated with the observed marginal tax rate but is uncorrelated with hours of work. Four types of synthetic tax rates were used to construct the instruments for the after-tax wage rate. The synthetic virtual income is constructed by replacing the corresponding marginal tax rates and taxes with their synthetic counterparts and using a synthetic twice-lagged value of full income in place of contemporaneous full income.

Estimation
The hours equation (19)  synthetic because I removed the supposedly endogenous components from this measure of full income, e.g., taxes paid and transfer income. 18 I test for the existence of weak instruments by doing a partial F-test on the instruments in the reduced form regression. The null of weak instruments is consistently rejected. The partial F-statistic on instruments in the after tax wage regression is 241.83 and that in Cv Y regression is 527.67. Both these statistics pass the rule of thumb tests presented in Bound, Jaeger and Baker (1994) and Stock, Wright and Yogo (2002). As a note of caution, both these results may not apply to nonlinear models. 19 In nonlinear models, differencing to get rid of the fixed effects is not a viable alternative. A sufficient statistic for Logit and Poisson models is available. However, for the Tobit model a sufficient statistic to condition out the i ϑ s is not available. The only way to deal with fixed effects is to estimate them as Table 4 presents estimates of the labor supply parameters treating 8 years of data on married females as independent cross sections and ignoring the individual-specific effects. Table 4 uses the predicted wage from the Type III Tobit model. The Amemiya Generalized

Cross Section Results
Least Squares (AGLS) estimator of Newey (1987) is used to estimate the Tobit hours model with instrumental variables. In addition to after-tax wage and virtual full income, other controls include the number of children between one and two years old, number of children between three and five years old, number of children between six and thirteen years old, a dummy for self-reported bad health, and a quartic in age.
The results are quite consistent with theory. In particular, the Slutsky restrictions are satisfied in each year. The uncompensated wage elasticity is estimated between 1.8 and 2.5.
These estimates are close to the higher end of results found in the existing literature, which primarily uses a Type II Tobit model to predict wages. 20 The income elasticity ranges from -0.21 to -0.31. Leisure appears as a normal good in each cross-section. The compensated elasticities are between 1.8 to 2.5. On the intensive margin, compensated elasticities range from 1.2 to 1.8. 21 parameters. In datasets where the time dimension is fixed, these parameters grow with sample size, giving rise to incidental parameters problem as found by Neyman and Scott (1948). The inconsistency induced in estimating the i ϑ s, due to incidental parameters problem is transmitted to γ due to the nonlinearities.
However, the amount of inconsistency decreases as ∞ → T so that at least for larger panels the amount of inconsistency will not be large. This approach has been adopted in Heckman and MaCurdy (1980) and Jakubson (1988) to estimate labor supply functions. 20 At the intensive margin, these elasticities are between 1.2 to 1.8. The elasticities on the participation margin range from 0.6 to 1. 21 These estimates can be compared with static specifications estimated in the literature on estimation of female labor supply tax effects in a nonlinear budget set environment, using the PSID. Hausman (1981) estimated wage and income elasticities for females working full time of 1 and -0.5 respectively. Triest (1990) estimated uncompensated wage elasticities of 0.9 and 1.12 for one error and two error model respectively, using Maximum Likelihood. The income elasticities estimated here are much closer to the estimates of Triest (1990), whose estimates were between -0.15 to -0.31. Rosen (1976) estimated an uncompensated elasticity of

Random Effects Estimation with Instrumental Variables
In column 1, of Table 5 I present results for the pooled Tobit model

Semiparametric Fixed Effects Estimation Results
The incidental parameters problem can induce inconsistency in the estimates of parameters of interest in a parametric fixed effect model. The maximum likelihood estimation of Tobit models is also susceptible to biases from heteroscedasticity and nonnormality of the true error distributions. Some semiparametric estimators deal with both 2.3 and an income elasticity of -0.42. Hausman and Ruud (1986) estimated wage and income elasticities of 0.76 and -0.36, respectively. Thus, my estimates obtained from estimating a static model using the crosssections are qualitatively similar to the previous literature. I also attempted to replicate Triest (1990) for the 1983 cross-section. A 95% confidence band around my parameter estimates substantially overlaps the confidence band around estimates in Triest (1990). So using cross-section in a static framework, I am broadly able to replicate the estimates found elsewhere. 22 See Appendix 2 for details of estimation 23 The coefficient on children between 1-2 years old ranges from -303 to -346. This translates into a reduction of 140 annual hours of work for each additional child between 1-3 years of age. An additional child between 3-5 years old reduces labor supply by about 100 hours in the richest specification. A child between 6-13 years of age reduces hours worked by much smaller 36 hours. As expected, I find evidence of an effect on labor supply that is declining in the age of children. Conditional on the number of children in these three age ranges, the effect of total number of children is not statistically significant. On average, bad health status reduces labor 24 of these problems by using a trimming mechanism to restore the symmetry of the error distribution that is spoiled by censoring or truncation, and then use a least squares or a least absolute deviation estimator. The distribution of the error term is left unspecified.
In this paper, I apply the trimmed least squares (TLS) estimator and trimmed least absolute deviation (TLAD) estimators proposed in Honore (1992). 24 The estimates are presented in column 3 and 4 of Table 5. The coefficient on after-tax wage is estimated at 344.6 using TLS estimator and 360.2 using TLAD estimator. Leisure is estimated as a normal good. 25 The implied compensated elasticities are 1.04 and 1.11 from TLS and TLAD, respectively. There is no obvious way to deal with endogeneity in Honore's estimator. One of the shortcomings of this estimator is that regressors are treated as strictly exogenous. 26 Nevertheless, if the endogeneity of net wage and virtual income operates only through the time invariant individual specific effect, the estimates will be consistent. Again, the semiparametric estimates confirm the finding of a strong wage effect.

Semiparametric Selection Corrected Hours Equation using Kyriazidou (1997)
Female labor supply estimates have been found to be sensitive not only to the estimates of the wage equation and the choice of instrument set but also to the type of Tobit model. In particular, selection-bias-corrected or truncated specifications are known to yield results different from censored specifications (Killingsworth and Heckman, 1986; Triest, 1990). I therefore also estimate a selection-bias-corrected specification. In the following estimation procedure, I do not assume any distribution for the individual effect or the time-varying error term.
I experiment with family size and a union dummy as potential exclusion restrictions. The results were similar using either. 30
supply by 110 hours. I include health as a control, as most other studies do, e.g., Triest (1990). I recognize that it may be endogenous in a labor supply equation.
24 The estimation of TLS was carried out using a GAUSS program available from Honore's website.
25 The coefficients on children entered significantly and with the expected negative signs. The wage coefficient was quite robust to different specifications and to the addition of time and region dummies in the model.
26 An instrumental variable estimator of this type is Honore and Hu (2001).
27 The details of the estimation procedure are available from the author on request. A STATA program to implement the estimator is also available.
28 I have a choice between the conditional maximum likelihood approach (Chamberlain, 1984) and the semiparametric maximum score estimator (Manski, 1987). While the conditional ML approach assumes a distribution for the error term and is √n-consistent and asymptotically normal, the semiparametric estimator converges at a rate slower than √n. Chamberlain (1992) shows that with bounded support of the variables in the selection equation, identification of the selection equation parameters is possible only in the logistic case. Even if we assume unbounded support, √n-consistency is achieved only in the logistic case.
29 This method is also adopted in Charlier, Melenberg and van Soest (1997).
30 The estimation
The results for the hours equation are presented in columns 5 and 6 of Table 5. Column 5 presents the parameter estimates without instrumental variables for the net wage and virtual full income; the selection-bias-corrected labor supply equation is estimated by weighted least squares, using the kernel density estimate of the selection index as weights. The uncompensated wage elasticity is estimated at 1.24. The estimate of the income elasticity is -0.002. Column 6 presents estimates using instrumental variables for the net wage and virtual income. The uncompensated wage effect is about 451. The coefficient on virtual full income is -0.0002 and statistically insignificant. The Slutsky condition is satisfied and leisure appears as a normal good. The compensated elasticity is estimated at 1.35. The estimates are moderately sensitive to the bandwidth constant h. 31 The estimated net wage and full income coefficients are close to the estimates from the random effects Tobit model with instrumental variables. This suggests that the random effects Tobit model may not be misspecified.
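The step from coefficient estimates to elasticities can be sketched as follows. This is a minimal illustration, assuming a linear hours equation h = alpha*w + beta*y + other terms, so that the uncompensated wage elasticity is alpha*w/h and the compensated wage effect follows from the Slutsky equation as alpha - h*beta. The coefficients 451 and -0.0002 are the column 6 values quoted above; the evaluation point (a net wage of 4.00 dollars and 1,336 annual hours) is hypothetical and is not the sample mean used in the paper.

```python
def wage_elasticities(alpha, beta, wage, hours):
    """Return (uncompensated, compensated) wage elasticities of labor supply.

    Assumes a linear hours equation h = alpha*w + beta*y + other terms.
    Slutsky equation for labor supply: compensated dh/dw = alpha - hours*beta.
    """
    uncompensated = alpha * wage / hours
    compensated = (alpha - hours * beta) * wage / hours
    return uncompensated, compensated


# alpha and beta are the reported coefficients; wage and hours are
# hypothetical evaluation values chosen only for illustration.
unc, comp = wage_elasticities(alpha=451.0, beta=-0.0002, wage=4.0, hours=1336.0)

# The Slutsky condition requires a non-negative compensated wage effect.
slutsky_ok = (451.0 - 1336.0 * (-0.0002)) >= 0.0
print(round(unc, 3), round(comp, 3), slutsky_ok)
```

Because the income coefficient is so small, the two elasticities are nearly identical at any plausible evaluation point, which is the pattern reported in the text.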

Sensitivity to Instruments, Functional form, and Wage Measure
In the results discussed so far, I have used the husband's last-dollar marginal tax rate as an instrument for the wife's observed marginal tax rate. The estimates will be inconsistent if the husband's last-dollar marginal tax rate is itself endogenous. Therefore, I estimated the labor supply elasticities using the maximum state marginal tax rate and the maximum federal plus state marginal tax rate as instruments for the observed marginal tax rate. These instruments were constructed by running the sample through NBER's TAXSIM, assigning an adjusted gross income of $350,000 to every individual. I also constructed a more synthetic version of this instrument by running a hypothetical individual who is married with two children, filing jointly, and with no age exemption. This addresses the concern that family size may be endogenous with the marginal tax rate. The results presented in Table 6 are fairly robust to the use of these instruments. Neither the parametric nor the semiparametric results changed by much.
insignificant. The signs on other variables like children and health were as expected. Children and bad health had a negative effect on the probability of labor force participation. Overall, I find theoretically consistent estimates in the selection equation.
31 I experimented with a grid of possible values. However, I present the results for the bandwidth constant that gave me the best bias-standard error tradeoff for the estimates. Over a range of bandwidth constants, I found that point estimates declined with the value of the bandwidth while the precision increased. I subjectively
Many researchers have used predicted wages only for non-workers, while using observed gross wages for workers in their estimation. I examine the sensitivity of the estimates to this methodology. The results are presented in Table 7. I find that the results are sensitive to using this method. The estimated compensated elasticities are now lower. Also, the parameter estimates are not robust across estimators. The estimated compensated elasticity from the random effects estimator with instrumental variables is 0.67. Using the semiparametric TLS and TLAD estimators, the estimates of the compensated elasticities go down to about 0.4. This may be due to the endogeneity of the observed gross wage. Kyriazidou's estimator did not yield statistically significant estimates for a broad range of the bandwidth constant h. However, leisure is still estimated as a normal good and the Slutsky condition is satisfied across all estimators.
I also checked the sensitivity of the results with respect to functional form assumptions for the labor supply function. In particular, I used a quadratic labor supply function. The top panel of Table 8 presents the comparison using predicted wages from the wage equation for everybody in the sample. The estimated elasticities are about 20% larger than those estimated with a linear labor supply function. The compensated elasticity increased from 1.67 to 2.03. The bottom panel uses predicted wages only for non-workers. These results did not change by much. The compensated elasticities under both functional forms are close to 1.
chose the value that yielded sufficient precision. The estimates remain practically unchanged for bandwidth constants higher than 5. The results did not vary with the choice of the order of the kernel.

Deadweight Loss from Taxes
I now use the parameter estimates to estimate the deadweight loss from taxes. In nonlinear budget set estimation with taxes, Triest (1990) found that taxes reduce labor supply by 30% in the censored specifications he estimated. Hausman (1981) found that the tax system reduces labor supply by 8.5%. He found a mean deadweight loss of 28.7% of tax revenue for husbands. For wives, his estimates of deadweight loss range from 4.6% at a mean wage of 2.11 dollars to 35.7% at a mean wage of 5.79 dollars, so the estimated deadweight loss rises with the market wage rate. For wives he found a deadweight loss of 18.4% of tax revenue. Hausman (1981) noted that the tax treatment of married persons creates substantial deadweight loss for working wives. A wife who worked full time at $4 an hour and whose husband's income in 1975 was $10,000 would face a deadweight loss of 58.1% of tax revenue. From these calculations, it appears that the deadweight loss for wives is very strongly related to the husband's income. For a wife filing jointly, the higher the husband's income, the higher the marginal tax rate and the higher the potential for deadweight loss, which rises with the square of the tax rate.
For the linear labor supply equation (19), the compensating variation can be written in closed form (Hausman, 1981). The exact deadweight loss is then calculated by subtracting the tax revenue from the compensating variation.
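One closed form consistent with a linear hours equation can be sketched as follows. The notation is an assumption made for illustration and may differ from that of equation (19): h is hours, w the after-tax wage, y virtual income, Zγ the remaining regressors, T the tax revenue, and subscripts 0 and 1 denote values before and after the tax change.

```latex
\begin{align*}
h &= \alpha w + \beta y + Z\gamma, \\
v(w, y) &= e^{\beta w}\left( y + \frac{\alpha}{\beta}\, w - \frac{\alpha}{\beta^{2}} + \frac{Z\gamma}{\beta} \right), \\
e(w, u) &= e^{-\beta w}\, u - \frac{\alpha}{\beta}\, w + \frac{\alpha}{\beta^{2}} - \frac{Z\gamma}{\beta}, \\
CV &= e(w_1, u_0) - y_0, \qquad u_0 = v(w_0, y_0), \\
DWL &= CV - T .
\end{align*}
```

As a check, Roy's identity applied to this indirect utility function, h = (∂v/∂w)/(∂v/∂y), returns the linear hours equation in the first line.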
First, I use estimates from the random-effects instrumental variables estimator presented in Table 5 to calculate the exact deadweight loss from an across-the-board 20% tax imposition. 32 This gives an estimate of the magnitude of the potential distortion caused by taxes. From a 20% tax imposition, I estimate a compensating variation (CV) of 625 dollars. 33 The change in consumer surplus (CS) is marginally higher than the compensating variation, as expected from theory. Indeed, the difference between CV and CS is extremely small, contrary to what was found in Hausman (1981). In this paper, it is small because of the inconsequential size of the income effect. The magnitude of the deadweight loss as a fraction of tax revenue raised gives a measure of the economic costs of taxation (Hausman, 1981). 34
32 Specifically, I measure the deadweight loss from a 20% increase in the current price of leisure. This follows standard practice in this literature.
33 This means that a consumer would need to be compensated $625 to keep her at the same level of utility as before a 20% tax imposition. At the sample means of hours, income and other explanatory variables, it implies a deadweight loss of 186.57 dollars.
34 The validity of the exact deadweight loss calculations using the indirect utility function approach depends upon the existence of the said utility function. Vartia's (1983) method does not rely on the existence of such an indirect utility function and calculates the required compensation by numerically solving the first-order differential equation obtained from Roy's identity. To check the sensitivity of the results, I also applied Vartia's approach. The results were almost identical.
Next, I compare the efficiency of the existing progressive tax system with that of a proportional tax system which yields the same tax revenue. Because deadweight loss is proportional to the square of the tax rate, a progressive tax system is likely to create substantially higher deadweight loss than a proportional tax system. 35 Column 1 of Table 9 presents estimates of the effect of the existing tax system on the deadweight loss. I find that deadweight loss varies a great deal in the sample. Due to the presence of outliers, I focus on the median estimates. Under the progressive tax system, the median compensating variation is $4951. Thus the median household would need to be compensated $4951 to achieve the pre-tax level of utility under the progressive tax system. The deadweight loss for the median household is $1588, about 57% of tax revenue.
Column 2 of Table 9 shows the calculations for a new tax regime in which the same revenue is collected using a simpler, proportional tax structure. My simulations suggest that this switch would lead to efficiency gains. The compensating variation for the median household is $4821. This implies a deadweight loss of $1458, i.e., 49% of tax revenue.
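The exact deadweight loss calculation used in this section, and the numerical check in the spirit of Vartia (1983), can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for Table 9: the hours equation is taken to be h = alpha*w + beta*y + s, the closed form is the indirect utility function implied by that equation, the numerical route integrates dy/dw = -h(w, y) with an ordinary Runge-Kutta scheme rather than Vartia's own algorithm, and tax revenue is evaluated at compensated hours. Only alpha and beta are taken from the reported estimates; the wage, income, intercept term and tax rate are hypothetical.

```python
import math


def hours(w, y, alpha, beta, s):
    """Linear labor supply: h = alpha*w + beta*y + s (s collects other terms)."""
    return alpha * w + beta * y + s


def cv_closed_form(w0, w1, y0, alpha, beta, s):
    """CV from v(w, y) = exp(beta*w) * (y + (alpha/beta)*w - alpha/beta**2 + s/beta)."""
    k = -alpha / beta**2 + s / beta
    u0 = math.exp(beta * w0) * (y0 + (alpha / beta) * w0 + k)
    y1 = math.exp(-beta * w1) * u0 - (alpha / beta) * w1 - k
    return y1 - y0


def cv_numeric(w0, w1, y0, alpha, beta, s, steps=1000):
    """Integrate dy/dw = -h(w, y) along the indifference curve with classical RK4."""
    dw = (w1 - w0) / steps
    w, y = w0, y0

    def slope(w_, y_):
        return -hours(w_, y_, alpha, beta, s)

    for _ in range(steps):
        k1 = slope(w, y)
        k2 = slope(w + dw / 2, y + dw * k1 / 2)
        k3 = slope(w + dw / 2, y + dw * k2 / 2)
        k4 = slope(w + dw, y + dw * k3)
        y += dw * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        w += dw
    return y - y0


# alpha, beta: reported coefficients. Everything else is hypothetical.
alpha, beta, s = 451.0, -0.0002, -951.0
w0, y0, tax_rate = 5.0, 20000.0, 0.20
w1 = (1.0 - tax_rate) * w0

cv_exact = cv_closed_form(w0, w1, y0, alpha, beta, s)
cv_ode = cv_numeric(w0, w1, y0, alpha, beta, s)
# Assumption: revenue is evaluated at compensated hours.
revenue = tax_rate * w0 * hours(w1, y0 + cv_exact, alpha, beta, s)
dwl = cv_exact - revenue
print(cv_exact, cv_ode, revenue, dwl)
```

With an income coefficient this close to zero, the two routes to the compensating variation should agree almost exactly, which mirrors the statement that the Vartia results were almost identical.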

Intertemporal Preference Parameters
So far in the paper, I have concentrated on within-period elasticities. It is well known that an anticipated or evolutionary wage change with constant marginal utility of wealth (λ) induces a larger labor supply response than Hicksian or Marshallian elasticities imply (MaCurdy, 1981; Browning, 1985). This is because, with λ constant, a change in wages is only a movement along the given life-cycle profile, known at the beginning of the lifetime, and hence there is no wealth effect associated with a λ-constant wage change. In contrast, income- or utility-constant wage changes contain a wealth effect. Therefore, λ-constant (also known as Frisch) elasticities are important for analysis of the effect of an anticipated change in the wage (Browning, 1985). The Frisch elasticity can be obtained from intertemporal preference parameters.
35 I calculated the actual tax paid on the wife's labor income by assuming that she makes the decision to work after the decision of the husband, based on a secondary earner model. I obtained an estimate of the taxes paid by running the income of every household through TAXSIM, first before adding the wife's labor income and then after adding it. The difference between the taxes paid in the two runs gives the taxes paid on the wife's income.
Intertemporal preference parameters are estimated following the approach used in Ziliak and Kniesner (1999) and Blundell, Meghir and Neves (1993). A Box-Cox transformation is applied to the indirect utility function (20), with σ_t the intertemporal preference parameter, which is allowed to vary with a set of demographics, X_t. From the maximization of the consumer's lifetime expected utility, an Euler equation for the marginal utility of wealth is obtained, in which r_t is the after-tax interest rate and ε_t is the forecast error. This equation represents the first stage of the two-stage budgeting procedure and intuitively states that the consumer chooses consumption and labor supply such that the marginal utility of wealth in period t equals the discounted value of the marginal utility of wealth in period t + 1.
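For orientation, a standard way of writing these objects is sketched below. The specific functional form, the linear index for σ_t, the discount rate ρ, and the timing of the interest rate are assumptions made for illustration and may not match the paper's own equations; V_t denotes the within-period indirect utility, y_t full income, and λ_t the marginal utility of wealth.

```latex
\begin{align*}
U_t &= \frac{V_t^{\,1+\sigma_t} - 1}{1+\sigma_t}, \qquad \sigma_t = X_t'\phi, \\
\lambda_t &= \frac{\partial U_t}{\partial y_t} = V_t^{\sigma_t}\, \frac{\partial V_t}{\partial y_t}, \\
\lambda_t &= E_t\!\left[ \frac{1+r_{t+1}}{1+\rho}\, \lambda_{t+1} \right], \\
\ln \lambda_t &= \sigma_t \ln V_t + \ln \frac{\partial V_t}{\partial y_t} .
\end{align*}
```

Under these assumptions, the last line shows why the log of the Euler equation is linear in σ_t ln V_t, which is what makes the preference parameters estimable from first differences.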
Taking logs of (36), first differencing, and substituting for the marginal utility of wealth λ_t obtained by differentiating (20) yields the estimating equation (37). Like Ziliak and Kniesner (1999) and Blundell et al. (1993), I estimate equation (37) using GMM. I allow the error to be serially correlated and use the Newey-West correction to the covariance matrix. I use iterative GMM, where the variance-covariance matrix is re-estimated at each iteration.
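The iteration described above can be sketched as follows. This is a minimal illustration, not the estimator applied to equation (37): a linear moment condition E[z(y - θx)] = 0 with one parameter and two instruments stands in for the paper's estimating equation, the data are simulated, and all function names are hypothetical. The weighting matrix is the inverse of a Newey-West (Bartlett kernel) covariance of the moments and is re-estimated at every iteration.

```python
import random


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def newey_west(g, lags):
    """Newey-West long-run covariance of the two-column moment series g."""
    n = len(g)

    def autocov(l):
        return [[sum(g[t][a] * g[t - l][b] for t in range(l, n)) / n
                 for b in range(2)] for a in range(2)]

    s = autocov(0)
    for l in range(1, lags + 1):
        weight = 1.0 - l / (lags + 1.0)  # Bartlett kernel weight
        c = autocov(l)
        for a in range(2):
            for b in range(2):
                s[a][b] += weight * (c[a][b] + c[b][a])
    return s


def iterated_gmm(y, x, z, lags=2, tol=1e-10, max_iter=50):
    """Iterated GMM for y = theta*x + e with instruments z (two columns)."""
    n = len(y)
    zx = [sum(z[t][j] * x[t] for t in range(n)) / n for j in range(2)]
    zy = [sum(z[t][j] * y[t] for t in range(n)) / n for j in range(2)]
    zz = [[sum(z[t][a] * z[t][b] for t in range(n)) / n for b in range(2)]
          for a in range(2)]
    weight = inv2(zz)  # first step: two-stage least squares weighting
    theta = None
    for _ in range(max_iter):
        wzx = [weight[a][0] * zx[0] + weight[a][1] * zx[1] for a in range(2)]
        new_theta = ((wzx[0] * zy[0] + wzx[1] * zy[1])
                     / (wzx[0] * zx[0] + wzx[1] * zx[1]))
        converged = theta is not None and abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
        # Re-estimate the moment covariance at the current parameter value.
        g = [[z[t][j] * (y[t] - theta * x[t]) for j in range(2)]
             for t in range(n)]
        weight = inv2(newey_west(g, lags))
    return theta


# Simulated data with an endogenous regressor; the true coefficient is 2.
random.seed(0)
z, x, y = [], [], []
for _ in range(2000):
    z1, z2, v = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    e = 0.5 * v + random.gauss(0, 1)  # error correlated with x through v
    xi = z1 + 0.5 * z2 + v
    z.append((z1, z2))
    x.append(xi)
    y.append(2.0 * xi + e)

theta_hat = iterated_gmm(y, x, z)
print(theta_hat)
```

In the paper's application the moment function is nonlinear in the preference parameters and the instruments are lagged values of the regressors, so the closed-form update in each iteration would be replaced by a numerical minimization.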
In the intertemporal labor supply model I estimate, I allow the intertemporal preference parameters to vary with children, assets, and health. I treat all the regressors as endogenous and follow Ziliak and Kniesner (1999) and Blundell, Meghir and Neves (1993) in using as instruments the lagged values of the gross wage, after-tax wage, virtual full income, assets, virtual assets, self-reported bad health, the net after-tax real interest rate, age, interactions between education and age, a home ownership dummy, and the number of children. I use the third and fourth lags of all these variables as instruments. The results of the estimation of the intertemporal preference parameters are presented in Table 10. I find that assets have a positive and economically significant effect on intertemporal preferences. Higher assets increase intertemporal substitution. I find that young children reduce intertemporal substitution. Older children and the total number of children have a negative but statistically insignificant effect on intertemporal substitution. The effect of poor health on intertemporal substitution is negative and significant. The common discount rate is estimated to be 0.05. 36 Even though the signs of the estimates of the intertemporal preference parameters are as expected, they imply an intertemporal labor supply elasticity that is very close to the compensated elasticity, i.e., 1.5. The deadweight loss calculations do not take into account general equilibrium effects. I also do not model endogenous human capital formation, which is an area of active research (Heckman, Lochner and Taber, 1998). These are possible extensions to be explored in future work.

Conclusion
which in this case is 13. I cannot reject the null that the orthogonality conditions are satisfied.
Eissa (1995) Participation: 0.4-0.6 Hours: 0.8-1
Note: Dependent variable is annual hours of work. t statistics are reported in parentheses. Other controls include a quartic in age, family size and year dummies. Column (5) does not include family size and the quartic in age but includes interactions of year dummies with the children variables and health. The estimates are based on 6462 person-years of observations. For the trimmed least squares and trimmed least absolute deviation estimators, the optimization of the objective function was carried out using Powell's algorithm. Parameter estimates were almost identical for a number of different starting values. Family size was used as the exclusion restriction in all specifications. Note that the semiparametric estimator in column 5 is the estimator proposed by Kyriazidou (1997), and column 6 presents results from the instrumental variable extension of this estimator by Charlier, Melenberg and van Soest (1997).

Construction of Differentiable Budget Constraint
Marginal tax rates are calculated using NBER TAXSIM. We constructed a grid of adjusted gross incomes (AGI) from $50 to $350,000. We then used TAXSIM to generate marginal tax rates at every level of AGI in $50 intervals, conditional on tax status defined by any combination of marital status, number of dependents, and age exemption. The individual's marginal tax rate τ was obtained from this grid.
Because the budget set can have multiple kinks, and hence multiple points of nondifferentiability, depending on the tax status of the individual, we constructed a differentiable budget constraint using a method suggested by MaCurdy, Green, and Paarsch (1990) and implemented by Ziliak and Kniesner (1999) to smooth the budget set around the kink points. First, a lower bound and an upper bound were found from the TAXSIM grid, below and above which the marginal tax rates do not change. Thus we explicitly accounted for the first kink and the last kink in the budget set. The individual's budget sets around all the remaining kink points were approximated using a cubic polynomial. Because NBER TAXSIM does not report taxable income, we fitted a cubic polynomial of tax rates on adjusted gross income. Because the fitted marginal rate is a smooth and continuously differentiable function of adjusted gross income, we can integrate the function back to obtain total tax payments. Using the coefficients from the polynomial regression, we were able to get the implicit tax rates at income levels corresponding to each contribution level in $50 intervals.
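The smoothing step can be sketched as follows. This is a simplified illustration under stated assumptions: the bracket schedule is made up rather than taken from TAXSIM, the grid stops at $120,000 instead of $350,000 to keep the example fast, and a single cubic is fitted over the whole grid, whereas the procedure described above treats the first and last kinks separately. All function names are hypothetical.

```python
def marginal_rate(agi):
    """Hypothetical statutory schedule, made up for illustration only."""
    for upper, rate in [(20_000, 0.11), (40_000, 0.18), (80_000, 0.28)]:
        if agi <= upper:
            return rate
    return 0.38


def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_cubic(xs, ys):
    """Least squares regression of y on 1, x, x^2, x^3 via the normal equations."""
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(4)] for i in range(4)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(4)]
    return solve(xtx, xty)


def smooth_rate(c, x):
    """Fitted (differentiable) marginal tax rate at scaled income x."""
    return c[0] + c[1] * x + c[2] * x ** 2 + c[3] * x ** 3


def smooth_tax(c, x):
    """Integral of the fitted marginal rate from 0 to scaled income x."""
    return c[0] * x + c[1] * x ** 2 / 2 + c[2] * x ** 3 / 3 + c[3] * x ** 4 / 4


SCALE = 100_000.0                        # rescale AGI for numerical stability
grid = [50 * i for i in range(1, 2401)]  # AGI grid in $50 steps
xs = [g / SCALE for g in grid]
ys = [marginal_rate(g) for g in grid]
coef = fit_cubic(xs, ys)

exact_tax_top = 50 * sum(ys)             # tax under the kinked schedule
smooth_tax_top = SCALE * smooth_tax(coef, grid[-1] / SCALE)
print(exact_tax_top, smooth_tax_top)
```

Because the regression includes an intercept, the fitted rates average to the statutory rates over the grid, so the integrated tax at the top of the grid stays close to the tax implied by the kinked schedule even though individual rates are smoothed.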
Here τ_st is the average state tax rate and τ_fica is the payroll tax rate.
37 A similar strategy has been applied in Cunningham and Engelhardt (2002).

APPENDIX 2

Random Effects Tobit with Instrumental Variables
In standard random effects Tobit specifications, the following assumptions are made with regard to labor supply equation (20).
X_it contains the two instruments (first-dollar after-tax wage rate and first-dollar virtual full income) and all the other exogenous variables in the model. Next, I estimate the residuals from the above regression.
38 These are obtained using the method suggested by Smith and Blundell (1986). This approach has been extended to panel data by Vella and Verbeek (1999).
39 I tested for the null of pooled OLS against random effects. The Breusch-Pagan test statistic soundly rejected the null of appropriateness of pooled OLS on this sample.
to the regression made statistically indistinguishable difference to the estimates. So, I report results conditioning on just u_1it and u_2it.
41 A panel data application of Newey (1987) can be found in Schineller (1999). This procedure involves estimating a reduced form of the hours equation. From a consistent (but not efficient) estimate of (α, β) and the reduced form estimates, Newey's procedure uses minimum distance to recover asymptotically efficient estimates of the structural parameters. The procedure is also known as Amemiya Generalized Least Squares (AGLS), after the original suggestion of this approach by Amemiya (1978).