Health Care Expenditure and Income in the OECD Reconsidered: Evidence from Panel Data

This paper reconsiders the long-run economic relationship between health care expenditure and income using a panel of 20 OECD countries observed over the period 1971-2004. In particular, the paper studies the non-stationarity and cointegration properties between health care spending and income. This is done in a panel data context controlling for both cross-section dependence and unobserved heterogeneity. Cross-section dependence is modelled through a common factor model and through spatial dependence. Heterogeneity is handled through fixed effects in a panel homogeneous model and through a panel heterogeneous model. Our findings suggest that health care is a necessity rather than a luxury, with an elasticity much smaller than that estimated in previous studies.


Introduction
Health care expenditure in the OECD varies substantially over time and across countries. From 1970 to 2004, per capita health expenditure increased markedly in the OECD, at an average annual rate of 11.5 per cent. This temporal dynamic has been characterized by large differences across countries, leading to marked geographical heterogeneity in the level of spending. For example, a snapshot in 2004 shows that the US, with an average of $6,037, has the highest amount of health expenditure, followed by Switzerland ($4,045), Norway ($4,103), and Germany ($3,169). At the other end, the countries that devote the fewest resources to health care are Turkey and Mexico, with an average per capita expenditure of $562 and $655, respectively. As a share of Gross Domestic Product (GDP), health care spending in the OECD has almost doubled over this period, increasing from 4.9 per cent in 1970 to 8.8 per cent in 2004. However, there is substantial heterogeneity across these OECD countries. While several countries continued to experience an increase in their share in the 80s and 90s, others experienced modest declines, possibly associated with reforms aimed at limiting the rise in health care spending as a proportion of GDP. The share of health care spending in GDP ranged between 2.5 and 7.0 per cent in the 70s, compared with 5.5 and 15.2 per cent in 2004.
Since the work by Kleiman (1974) and Newhouse (1977), income has been identified as the most important factor explaining differences across countries in the level and growth of health care expenditure. Earlier research therefore focused on measuring the size of the income elasticity of health care, and on its policy implications for the financing and distribution of health care resources. Advocates of the view that health care is a luxury good argued that it is a commodity much like any other and is best left to market forces. Advocates of the view that health care is a necessity, on the other hand, often support more government intervention in the health care sector (see Culyer, 1988; and Di Matteo, 2003). We review the empirical literature for the OECD countries in the next section. Several empirical studies pointed to the possible non-stationarity of health care spending and income, which in turn cast doubt on prior inference on the income elasticity obtained from potentially spurious regressions. This literature focused on studying the time series properties of health expenditure and income, and on assessing whether there exists a long-run relationship between them.
A number of non-income determinants of health care spending have been identified in the literature. For example, the age structure of the population has traditionally been flagged as an important factor in explaining variations of health care expenditure across countries (Leu, 1986; Culyer, 1988). Indicators such as the share of young (e.g., under 15 years) and old people (e.g., above 65 or 75 years) over the active or total population have been included in regression models explaining per-capita health spending. Nevertheless, little evidence exists of a significant effect of these variables (Leu, 1986; Hitiris and Posnett, 1992; Di Matteo and Di Matteo, 1998; Grossman, 1972). Another determinant of health expenditure is the extent to which health care expenditure is financed by the government, though only a few empirical studies support its effect on health care spending (Leu, 1986; Culyer, 1988; Hitiris and Posnett, 1992).
Microeconomic theory emphasizes the role of real prices for health care services in determining the demand for health care (Grossman, 1972). A positive effect of relative prices on health spending would support the so-called Baumol (1967) cost disease theory that productivity in the health sector is low relative to other sectors. Hence, prices for health services will rise relative to other prices because wages in low productivity sectors must keep up with wages in high productivity sectors. However, there is no empirical consensus on the effect of real prices on health care spending. See Hartwig (2008) and Okunade et al. (2004), who report a positive and statistically significant effect, and Gerdtham et al. (1992) and Murthy and Ukpolo (1994), who report an insignificant effect. Yet there are skeptics who do not recommend the use of price indexes in health care, especially across countries that provide health care at no cost or at very low cost; see Berndt et al. (2000). In fact, Hartwig (2008, p.6) argues that "..we have to recognize that medical care price indices can probably not be relied on as deflators or explanatory variables." Given the paucity of data on prices across the OECD, the diverse national schemes of price regulation, and the problems with measuring the quality of health care when constructing a medical price index, we decided not to use this variable in our empirical analysis (see Section 5).
Since the work by Newhouse (1992), technological progress has been seen as an important driver of health care expenditure. However, very few studies have attempted to study the relationship between technological progress and health care expenditure, owing to the difficulty of finding an appropriate proxy for changes in medical care technology. A number of proxies have been considered in the literature, such as surgical procedures and the number of specific medical equipment (Baker and Wheeler, 2000; Weil, 1995); R&D spending specific to health care (Okunade and Murthy, 2002); and life expectancy and infant mortality (Dregen and Reimers, 2005). Other papers have proxied the effect of technical change by adding a time index (Gerdtham and Lothgren, 2000), or time-specific intercepts (Di Matteo, 2004), to the regression specification.
To summarize, while income has been recognized as an important determinant of health care spending, there is still no consensus on which other factors may be associated with the remaining, largely unexplained, variation in per capita health expenditure. Some attribute this failure to identify other non-income determinants to the limited availability of health care data at the macro level; others blame the weakness of the econometric methods used, or the informal economic theory used to model per capita medical care expenditure (Wilson, 1999).
This paper studies the long-run economic relationship between health care expenditure and income in the OECD countries, ultimately assessing whether health care is a luxury or a necessity. Using a panel of 20 OECD countries followed over the period 1971-2004, we investigate the non-stationarity and cointegration properties of health care spending and income. The dynamics of health expenditure and income and their relationship are investigated by estimating a heterogeneous panel model with cross-sectionally correlated errors. Initially, a factor structure is included in the econometric specification with the intent of synthesizing the effects of shocks that may hit health spending and that are not directly measurable by the econometrician, such as advances in medical care technology, policy shifts, new diseases, and shifts in preferences and expectations by users of health services. The factor structure can capture any contemporaneous correlation that arises from the common response of countries to such unanticipated events. We then model cross section dependence by assuming that the regression errors follow a spatial autoregressive process. Indeed, consumption of health care resources in a single country may be related to unobservable general population characteristics of neighbouring countries. Another explanation for the geographical concentration of health spending is the diffusion of technology across countries (see for example Skinner and Staiger, 2005). A very recent strand of literature has recognized that cross section dependence is an important characteristic of health data, and has tried to incorporate it in its models (Jewell et al., 2003; Freeman, 2003; Carrion-i-Silvestre, 2005; Wang and Rettenmaier, 2006; Chou, 2007). We also check the robustness of our results by including in the regression specification variables recognized by the literature as playing an important role, such as government expenditure on health and the age structure.
The aim is to assess the income elasticity more accurately, controlling for various alternative forms of cross section dependence, as well as non-income determinants of health expenditure.
The plan of the paper is as follows. Section 2 summarizes the prior empirical results on this topic. Section 3 introduces the econometric methods adopted. Section 4 presents the data. Section 5 summarizes our empirical results and points to some of the limitations of our study. Section 6 gives our concluding remarks.

Income elasticity in the OECD
This short review summarizes some of the existing studies that have used panel data sets to measure the relationship between health care spending and income in the OECD.
We start with Gerdtham et al. (1992), who estimated a regression for health care spending as a function of GDP and a number of other variables, including institutional and socio-demographic factors. Using data on 20 OECD countries over the period 1960 to 1987, they estimated an income elasticity larger than one, thus finding that health care is a luxury good. This finding is in line with previous results based on a single cross section (e.g., Kleiman, 1974; Newhouse, 1977; and Leu, 1986). Using the same data, Hitiris and Posnett (1992) estimated a regression model for health care expenditure and income, controlling for unobserved heterogeneity by adding country-specific effects. They measured an income elasticity close to one, thus questioning the luxury attribute of health care raised by Gerdtham et al. (1992). As observed by Hansen and King (1996), one limitation of the above studies is that they ignored the possibility of non-stationarity in health data and income. Using the same data set as Gerdtham et al. (1992), Hansen and King computed Dickey-Fuller statistics for health care spending, GDP, and residuals from a regression of GDP on health expenditure, for each country separately. While detecting non-stationarity for health care spending and GDP for the majority of OECD countries, they did not find evidence of cointegration among the variables. Using data on 24 OECD countries observed over the period 1960 to 1991, Blomqvist and Carter (1997) computed the Phillips and Perron t-ratios for health care spending and GDP and for regression residuals. The authors conclude that their results cast doubt on pooling and on the notion of an elasticity larger than one.
McCoskey and Selden (1998) revisited the work by Hansen and King (1996), applying for the first time non-stationarity tests that exploit the panel nature of the OECD data. The low power of the country-by-country tests employed in previous studies is one of the major motivations for the use of panel unit root tests. Specifically, McCoskey and Selden (1998) computed the tests by Im, Pesaran, and Shin (2003), and rejected the joint hypothesis of a unit root in all countries for both health care spending and income, though observing that results are sensitive to the inclusion of a time trend in the augmented Dickey-Fuller equation. Using data on 21 countries followed over the years 1960-1997, Lothgren (2000, 2002) computed the Im, Pesaran, and Shin (2003) test and the panel version of the Kwiatkowski and Phillips (1992) test, with linear trends, and concluded in favour of non-stationarity and cointegration between health care spending and GDP (see also Okunade and Karakus, 2001). Similar results have been obtained by Dregen and Reimers (2005) for the years 1975-2001. These authors estimated the relationship between health care expenditure and GDP controlling for non-income determinants and a proxy of technical progress, concluding that health care expenditure is not a luxury good.
The above results have been criticized by Jewell et al. (2003), who emphasized the importance of controlling for structural breaks and cross section dependence. To mitigate cross section dependence, they included time-specific effects common to all countries. The authors detected one or two breaks for most OECD countries, and found that health care spending and GDP are stationary once these breaks are taken into account. Similar results have been obtained by Carrion-i-Silvestre (2005), who based his inference on the bootstrap distribution of stationarity tests in order to render the analysis robust to the presence of cross section dependence. Hartwig (2008) reviews this literature and concludes: "Unfortunately, given that the available time series are rather short, which lowers the power of the tests, and that the number of competing tests is huge (and growing), some uncertainty is likely to remain with respect to the properties of the time series analyzed in this field of research."
In the next section, we review a number of methods to study the long-run relation between health care expenditure and income. Our regression specification incorporates global shocks, spatial spillovers and unobserved heterogeneity across countries. Blomqvist and Carter (1997, p.226) argued that their most important finding is that ".. pooling restrictions are of very doubtful validity. Even allowing for different country intercepts, there is considerable evidence, albeit somewhat questionable, against the hypotheses of equal income elasticities and a common trend reflecting technological progress." In response to this statement, we consider the following linear heterogeneous panel regression model:

The econometric model
$$h_{it} = \alpha_i + d_t + \beta_i' x_{it} + u_{it}, \qquad i = 1, \ldots, N; \; t = 1, \ldots, T, \qquad (1)$$
where $h_{it}$ indicates real per-capita health care expenditure in the $i$th country at time $t$; $x_{it}$ is a $k \times 1$ vector of regressors including income, public expenditure on health, and the age structure; $\alpha_i$ is a country-specific intercept, $d_t$ is a time dummy, and $u_{it}$ is the error term. All variables in (1) are expressed in natural logarithms.
In this paper we consider two alternative ways of incorporating cross section dependence in equation (1). The first model assumes that the errors have the following multifactor structure
$$u_{it} = \gamma_i' f_t + \varepsilon_{it}, \qquad (2)$$
in which $f_t$ is the $m \times 1$ vector of unobserved common effects, $\gamma_i$ is the corresponding vector of factor loadings, and $\varepsilon_{it}$ is a country-specific error assumed to be independently distributed. From (2), correlation arises because the response to common external forces or perturbations is similar, though not identical, across countries. Notice that common factors induce a correlation between pairs of statistical units that does not depend on how close they are in geographical space. In model (1), we allow $x_{it}$ to be correlated with the unobserved effects $f_t$. Therefore, common factors can affect health expenditure not only directly via the factor structure (2), but also indirectly by affecting the regressors.
The second model we consider for the error term $u_{it}$ is the following spatial autoregressive process:
$$u_{it} = \rho \, \bar{u}_{it} + \varepsilon_{it}, \qquad (3)$$
where
$$\bar{u}_{it} = \sum_{j=1}^{N} s_{ij} u_{jt},$$
with $\rho$ being the spatial autoregressive coefficient and $s_{ij}$ the generic $(i, j)$th element of an $N \times N$ spatial weights matrix $S$ (Anselin, 1988). In our empirical work we adopt weights based on the inverse of the distance, expressed in kilometers, across countries.
[...] (Alexander, 1993), childhood cancer (Gatrell and Whitelegg, 1993), and asthma (Hsiao, 2000). We remark that most of these works detect very localised forms of concentration of diseases. However, environmental factors as well as the diet and lifestyle of populations may affect the incidence of certain pathologies at a larger scale, thus producing significant geographical patterns of diseases also at the national level (Haining, 2003).
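The inverse-distance weighting and the spatial autoregressive error process (3) described in this section can be illustrated with a minimal sketch. It assumes planar coordinates in kilometres and row-normalised weights; neither choice is stated in the text, and the function names and the symbol `rho` for the spatial coefficient are hypothetical.

```python
import numpy as np

def inverse_distance_weights(coords_km):
    """Row-normalised inverse-distance weights (row-normalisation is an assumption)."""
    coords = np.asarray(coords_km, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))      # pairwise Euclidean distances
    with np.errstate(divide="ignore"):
        w = 1.0 / dist                           # diagonal becomes inf here
    np.fill_diagonal(w, 0.0)                     # no country is its own neighbour
    return w / w.sum(axis=1, keepdims=True)

def simulate_sar_errors(S, rho, eps):
    """Solve u = rho * S u + eps for u, i.e. u = (I - rho S)^{-1} eps."""
    n = S.shape[0]
    return np.linalg.solve(np.eye(n) - rho * S, eps)
```

With `rho = 0` the errors reduce to the idiosyncratic shocks, while a nonzero `rho` spreads each shock to other countries in proportion to their weights.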
The estimation and testing approach to equation (1) with multifactor errors (2) is based on the Common Correlated Effects (CCE) method advanced by Pesaran (2006). This method augments equation (1) with cross section averages of the dependent variable and of the regressors, which act as proxies for the unobserved common factors:
$$h_{it} = \alpha_i + d_t + \beta_i' x_{it} + g_i' \bar{z}_t + e_{it},$$
where $\bar{z}_t = (\bar{h}_t, \bar{x}_t')'$, with $\bar{h}_t$ and $\bar{x}_t$ being the cross section averages of the dependent variable and regressors respectively, and $e_{it}$ is the regression error. In our analysis we compute the CCE Pooled (CCEP) estimator for the average of the slope coefficients (Pesaran, 2006). Heterogeneity is captured by the individual-specific fixed effects, $\alpha_i$, the time dummies, $d_t$, and the loadings, $g_i$. For comparison purposes, we also compute the Fixed Effects (FE) estimator with period dummies. We observe that the CCEP and the FE frameworks differ in that the latter assumes that $g_i$ is zero and that the $\beta_i$'s are the same. The estimation and testing strategy for equation (1) with fixed effects and spatially correlated errors (3) is based on maximum likelihood estimation (spatial MLE) techniques.
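As a rough illustration of the pooled CCE idea, the sketch below partials out an intercept and the cross section averages of the dependent variable and regressors from each country's data, then pools the resulting moments. It assumes a balanced panel and omits the time dummies; the function name `ccep` and the array layout are hypothetical, and this is not presented as the authors' implementation.

```python
import numpy as np

def ccep(y, X):
    """Pooled CCE slope sketch. y: (N, T) array; X: (N, T, k) array."""
    N, T, k = X.shape
    # Cross section averages of y and X proxy the unobserved common factors.
    zbar = np.column_stack([y.mean(axis=0), X.mean(axis=0)])   # (T, 1 + k)
    H = np.column_stack([np.ones(T), zbar])                    # intercept + averages
    M = np.eye(T) - H @ np.linalg.pinv(H)                      # annihilator of H
    A = np.zeros((k, k))
    b = np.zeros(k)
    for i in range(N):
        MX = M @ X[i]
        A += MX.T @ MX
        b += MX.T @ y[i]          # M is symmetric and idempotent
    return np.linalg.solve(A, b)
```

Setting the loadings on the averages to zero (dropping `zbar` from `H`) would leave only the country intercepts, which corresponds to the FE case without period dummies.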

Testing for unit roots
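Consider the $p$th order augmented Dickey-Fuller regression
$$\Delta q_{it} = a_i + b_i q_{i,t-1} + \sum_{j=1}^{p} c_{ij} \Delta q_{i,t-j} + u_{it}, \qquad (6)$$
where $q_{it}$ is either the logarithm of real per-capita health spending, the logarithm of the $j$th regressor $x_{j,it}$, or regression residuals from equation (1); $a_i$ is a country-specific intercept and the $c_{ij}$ are the coefficients on the lagged differences. The $u_{it}$ are errors that we assume to have a single factor structure, where the idiosyncratic component follows a spatial autoregressive process as in (3). When testing for unit roots, the null hypothesis is
$$H_0: b_i = 0 \quad \text{for all } i, \qquad (7)$$
against the alternative that
$$H_1: b_i < 0 \ \text{ for } i = 1, \ldots, N_1; \qquad b_i = 0 \ \text{ for } i = N_1 + 1, \ldots, N, \qquad (8)$$
where $N_1$ is such that $N_1/N$ is nonzero and tends to a fixed constant as $N$ goes to infinity. Pesaran (2007) proposes to test (7) against (8) by computing the simple average of the t-ratios of the ordinary least squares estimates of $b_i$ in equation (10), namely,
$$CIPS = \frac{1}{N} \sum_{i=1}^{N} \tilde{t}_i, \qquad (9)$$
where $\tilde{t}_i$ is the ordinary least squares t-ratio of $b_i$ in the following Dickey-Fuller regression augmented with the cross section averages $\bar{q}_{t-1}$ and $\Delta \bar{q}_{t-j}$, for $j = 0, \ldots, p$,
$$\Delta q_{it} = a_i + b_i q_{i,t-1} + \sum_{j=1}^{p} c_{ij} \Delta q_{i,t-j} + g_i' \bar{z}_t + \varepsilon_{it}, \qquad (10)$$
where $\bar{z}_t = (\bar{q}_{t-1}, \Delta \bar{q}_t, \Delta \bar{q}_{t-1}, \ldots, \Delta \bar{q}_{t-p})'$. The critical values for the CIPS tests are given in Tables 2(a)-2(c) in Pesaran (2007).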
The CIPS test has been designed for testing the unit root hypothesis when the variable under study has a factor structure. However, Monte Carlo experiments have indicated that this test is also robust to the presence of other sources of cross section dependence, such as the spatial autoregressive process (3) (see Baltagi et al., 2007). As a robustness check, we also calculate the panel unit root tests proposed by Im, Pesaran and Shin (2003) (IPS) and Breitung (2000), which do not account for cross section dependence in the data. The IPS statistic is given by (9), where $\tilde{t}_i$ is based on model (6) rather than (10) (i.e., the original model not augmented with the cross section averages; see Baltagi (2008, p.278)). The Breitung (2000) statistic is a modification of the augmented Dickey-Fuller statistic from (6) that has more power than IPS if individual-specific trends are included; see Baltagi (2008, p.280).
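The cross-sectionally augmented procedure described above can be sketched as follows: for each country, run the Dickey-Fuller regression augmented with the lagged cross section average and current and lagged differences of that average, take the t-ratio on the lagged level, and average across countries. The sketch assumes an intercept-only specification and a balanced panel; the function names are hypothetical, and critical values still have to be taken from Pesaran (2007).

```python
import numpy as np

def cadf_tstat(q_i, qbar, p=0):
    """t-ratio on the lagged level in a cross-sectionally augmented DF regression."""
    dq, dqbar = np.diff(q_i), np.diff(qbar)      # dq[s] = q[s + 1] - q[s]
    s = np.arange(p, len(dq))                    # usable observations
    cols = [np.ones(len(s)), q_i[s], qbar[s], dqbar[s]]
    for j in range(1, p + 1):                    # lagged differences, own and average
        cols += [dq[s - j], dqbar[s - j]]
    Z = np.column_stack(cols)
    y = dq[s]
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    sigma2 = resid @ resid / (len(y) - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    return coef[1] / np.sqrt(cov[1, 1])

def cips(q, p=0):
    """Simple average of the individual CADF t-ratios. q: (N, T) array."""
    qbar = q.mean(axis=0)
    return float(np.mean([cadf_tstat(q_i, qbar, p) for q_i in q]))
```

Dropping `qbar[s]`, `dqbar[s]` and the lagged `dqbar` terms from the regressor list gives the unaugmented t-ratios that underlie the IPS average.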

Cross section dependence tests
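We now briefly review some statistics of cross section dependence that we use in our empirical work. A statistic that captures the overall amount of cross section dependence in the data, at a descriptive level, is the following average pairwise correlation coefficient
$$\bar{\rho} = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \hat{\rho}_{ij}, \qquad (11)$$
where $\hat{\rho}_{ij}$ is given by
$$\hat{\rho}_{ij} = \frac{\sum_{t=1}^{T} q_{it} q_{jt}}{\left( \sum_{t=1}^{T} q_{it}^2 \right)^{1/2} \left( \sum_{t=1}^{T} q_{jt}^2 \right)^{1/2}},$$
and $q_{it}$ are regression residuals from equations (1) or (6). We also consider a diagnostic test of cross section independence based on the above pairwise correlation coefficients. In particular, we consider the $CD_{LM}$ test based on the Lagrange Multiplier statistic (Frees, 1995)
$$CD_{LM} = \sqrt{\frac{1}{N(N-1)}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left( T \hat{\rho}_{ij}^2 - 1 \right). \qquad (12)$$
Under the null hypothesis of no cross section dependence, $CD_{LM}$ tends to a $N(0, 1)$ with $T \to \infty$ and then $N \to \infty$.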
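The two cross section dependence statistics can be computed in a few lines. The sketch below assumes the usual uncentred pairwise correlation of residuals and the scaled Lagrange Multiplier form, which is the version consistent with a standard normal limit as T and then N grow; the function name `pairwise_cd` is hypothetical.

```python
import numpy as np

def pairwise_cd(resid):
    """Average pairwise correlation and scaled LM statistic. resid: (N, T) array."""
    N, T = resid.shape
    norms = np.sqrt((resid ** 2).sum(axis=1))
    rho = (resid @ resid.T) / np.outer(norms, norms)   # all pairwise correlations
    r = rho[np.triu_indices(N, k=1)]                   # keep pairs with i < j
    rho_bar = 2.0 / (N * (N - 1)) * r.sum()
    cd_lm = np.sqrt(1.0 / (N * (N - 1))) * (T * r ** 2 - 1.0).sum()
    return rho_bar, cd_lm
```

Because the LM form squares each correlation, positive and negative pairwise correlations do not cancel, unlike in the simple average.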
In our empirical study we also test for spatial correlation. In particular, we compute the Moran's I test statistic (13) of Kelejian and Prucha (2001), in which $s_{ij}$, $i, j = 1, \ldots, N$, are the spatial weights. The Moran's I is asymptotically normally distributed as $N$ goes to infinity, for fixed $T$. Spatial statistics such as the Moran's I differ from the CD statistic (12) since they exploit information on the spatial ordering of the data, giving more importance to countries that are closer to each other.

Data description
Our analysis uses annual data on 20 OECD countries from 1971 to 2004 (T = 34), gathered from the OECD Health Data Set 2007. We collected information on per-capita total health care expenditure and per-capita income, measured at GDP purchasing power parity and expressed in US dollars. We also gathered data for the following variables that have been identified by the literature as having a role in determining health care expenditure: public expenditure on health care, computed as government expenditure over total health care expenditure; and the dependency rates for old and young people, defined as the population aged 65 and over divided by the population aged 15-64, and the population aged 0-14 divided by the population aged 15-64, respectively. All variables are expressed in natural logarithms. As shown in Table 1, the sample of 20 countries and 34 years decreases by a few observations when public expenditure on health care and the age structure are added to the regression.
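Because all variables enter in logarithms, the slope on income can be read directly as an elasticity. A short sketch of the reasoning, using a simplified bivariate version of the regression in which $y_{it}$ denotes log per-capita income (a symbol introduced here only for illustration) and $\beta$ is a common slope:

```latex
h_{it} = \alpha_i + \beta\, y_{it} + u_{it}
\quad\Longrightarrow\quad
\frac{\partial h_{it}}{\partial y_{it}} = \beta ,
\qquad
h_{it} - y_{it} = \alpha_i + (\beta - 1)\, y_{it} + u_{it} .
```

The second expression is the log of the share of health spending in income. Holding the other terms fixed, the share rises with income when $\beta > 1$ (luxury) and falls when $\beta < 1$ (necessity).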
After a preliminary exploratory data analysis, our empirical study is structured as follows: we first check whether our variables are non-stationary; we then estimate the income elasticity controlling for a set of regressors and for unobserved common factors; finally, we test whether our variables form a cointegrating set and therefore whether they are linked in the long run.
Table 2 reports the average correlation coefficient and the $CD_{LM}$ tests for the first differences of the logarithm of all the variables, regressed on a country-specific intercept. Results indicate the presence of cross section correlation between pairs of countries for all variables. The Moran's I statistic suggests the presence of geographical concentration of health care spending and its determinants. These tests therefore show that pairs of countries in our data set are correlated with each other for all variables and that, in some cases, the correlation displays a spatial pattern. These two sources of correlation will be taken into account when studying the time series properties of the variables as well as when estimating the health care spending equation. Failure to do so may lead to misleading inference, particularly if the source of cross section dependence is correlated with the regressors (Andrews, 2005).
Table 3 reports the results of panel unit root tests which do not account for cross-country dependence. The first column of Table 3 reports the Im, Pesaran and Shin (2003) W tbar statistic for the logarithm of our variables when the ADF regression has an intercept only. Interestingly, the panel unit root hypothesis is rejected for all the variables considered, with the exception of the public and old people variables. The second column of Table 3 reports the Im, Pesaran and Shin (2003) statistic for the logarithm of our variables when the ADF regression has an intercept and a linear time trend. In this case, only income and old people do not reject the null of a panel unit root.
The third column of Table 3 shows results for the Breitung (2000) t-statistic for the intercept and trend case [4]. In all cases, the lag order p was selected using the SIC criterion. Results from the Breitung t-statistic do not reject the null of a panel unit root for any variable [5].
[4] Note that the null hypotheses for the IPS and Breitung tests are the same. However, the alternatives are not. The IPS statistic has an alternative described in (8), where some fraction of the countries are stationary while others are not. Breitung (2000) shows that his test exhibits better power than the IPS test in the presence of country-specific trends.
[5] We have also computed other first generation panel unit root tests: the Levin, Lin and Chu (2002) test and the Maddala and Wu (1999) Fisher-type tests, all of which reject the null of a panel unit root for health care expenditure and income in the case of individual effects and a linear trend. The Hadri (2000) test, on the other hand, which reverses the null and the alternative, rejects the null of stationarity for all variables.
Table 4 shows the CIPS statistics for the logarithm of our variables. We report these results for lag orders p = 0, 1, 2, 3. As we can see from the table, most of the variables are non-stationary both when adding an intercept only and when including an intercept and a linear trend. On the other hand, they are stationary when the unit root tests are applied to the first differences of these variables. Given the sizeable amount of cross country dependence detected by the tests reported in Table 2, we believe that the CIPS unit root tests give more reliable inference than those that do not account for cross section dependence, and we conclude that the variables under study are non-stationary.
In order to check the sensitivity of our panel unit root results, we run these tests again, now removing one country at a time from the sample. Table 5 reports the CIPS statistics for the variables health care expenditure, income, and public expenditure in the intercept and trend case. By and large, these results show that if we drop any country from the analysis, the results of the CIPS tests are similar to those reported in Table 4. The variable that exhibits the most sensitivity is income.
Table 6 shows results from FE, spatial MLE, and CCEP estimation when income is the only variable included in the regression (Panel A), as well as when public expenditure and dependency rates are added (Panel B). If we focus on the FE estimates (column (I), Panels A and B), the income elasticity is smaller than one, suggesting that health care is a necessity. One interesting point to observe is that if we omit the time dummies, the FE estimate of the income elasticity becomes larger than one. The time dummies, however, are significant and we chose to include them.
One reason for this decrease in the parameter is that including period effects might reduce the amount of cross section dependence present in the data. The variables public expenditure and dependency rate for old people are not significant in either regression, thus confirming similar findings in previous studies on the OECD countries (Hitiris and Posnett, 1992). On the other hand, the dependency rate for young people has a significant and positive influence on health care expenditure in the regression reported in Panel B.
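The sensitivity of the FE estimate to period effects can be illustrated with a small sketch of the within estimator for a balanced panel with a single regressor. With `time_effects=True` the data are demeaned by both country and period, which is equivalent to including time dummies; with `time_effects=False` only country means are removed. The function name and interface are hypothetical.

```python
import numpy as np

def within_fe(y, x, time_effects=True):
    """Within (fixed effects) slope for a balanced panel. y, x: (N, T) arrays."""
    def demean(a):
        out = a - a.mean(axis=1, keepdims=True)           # remove country means
        if time_effects:
            out = out - out.mean(axis=0, keepdims=True)   # remove period means
        return out
    yd, xd = demean(y), demean(x)
    return float((xd * yd).sum() / (xd ** 2).sum())
```

When a common trend drives both the regressor and the dependent variable, the one-way estimate absorbs part of that trend into the slope, whereas the two-way estimate does not.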
For the MLE accounting for spatial correlation (column (II), Panels A and B), the parameter estimates for the income elasticity are close to their non-spatial FE counterparts. However, the estimates for the other control variables are different, with the old people variable becoming significant. Interestingly, once one controls for the period effects, the estimated spatial coefficients are negative, ranging from -0.41 to -0.46. These may be capturing the indirect effects on health care spending of unobservable neighbouring variables, such as environmental risks, which are difficult to measure. The negative sign of the spatial coefficient may also be attributed to the presence of unobserved common factors that affect health spending and that are not captured by the time dummies, ultimately resulting in a biased estimate of the spatial effect. In this sense, one may argue that this model is too simplistic to represent the phenomenon under study.
The CCEP estimates (column (III), Panels A and B) give the lowest estimates of the income elasticity, especially when we control for non-income variables. These results corroborate the hypothesis that health care is a necessity good. Given the sizeable amount of correlation across countries detected in our exploratory data analysis, we believe that the CCEP approach, which incorporates the effect of unobservable common factors, is more appropriate for estimating equation (1). Table 6 also reports the $CD_{LM}$ and Moran's I statistics applied to the residuals of the CCEP, spatial MLE and FE regressions. These indicate the presence of a general form of cross section dependence.
Figure 1 shows the distributions of country-specific estimates of income elasticities in the sub-periods 1971-1987 and 1988-2004, by OLS and by CCE. Notice that OLS elasticities take only positive values and are widely dispersed around their mean, which is larger than one in both sub-periods. Conversely, CCE elasticities take negative values for some countries, and are much more concentrated around their mean value, which is smaller than one.
Table 7 reports the CIPS panel unit root tests on the residuals from the estimated equations reported in Table 6. The CCEP residuals from the first and second regressions are stationary for p = 0, 1, 2, suggesting the existence of a long-run economic relationship between health expenditure and income whether or not one controls for public expenditure and dependency rates. In contrast, for the FE regressions, we do not reject the unit root hypothesis in the residuals for p = 0, 1, 2, 3, whether or not we control for public expenditure and dependency rates. Hence, there is a marked difference between the CCEP and the FE (non-spatial and spatial) results. Table 8 reports the error correction models attached to the CCEP estimation, which has shown a cointegration relation between the variables.
The coefficient attached to $h_{i,t-1} - b' x_{i,t-1}$ measures the speed of adjustment of health care spending to a deviation from the long-run equilibrium relation between expenditure and its determinants. As expected, this coefficient is negative and significant in both regressions. Notice that short-run changes in public expenditure and dependency rates do not seem to have significant effects on health expenditure.

Concluding remarks
This paper investigated the long-run economic relationship between health care expenditure and income in the OECD countries. Using a panel of 20 OECD countries followed over 34 years, we have studied the non-stationarity and cointegration properties of health care expenditure and GDP, ultimately measuring the income elasticity of health care. This paper contributes to the literature by adopting tests that allow one to control explicitly for cross-country dependence and unobserved heterogeneity. Our analysis indicates that health care expenditure and most of its determinants are non-stationary, and that they are linked in the long run. Our results show that health care is a necessity rather than a luxury, with an elasticity much smaller than that estimated in other OECD studies. As for non-income determinants, our analysis indicates a role for the percentage of young people in explaining health expenditure variations.

Notes: $\bar{\rho}$, $CD_{LM}$ and $I$ are computed as in (11), (12), and (13), respectively. All variables (expressed in first differences) have been regressed on a country-specific intercept. The superscript marker indicates that the coefficient is significant at the 5% level.
Notes: we only report the intercept and trend case. The superscript marker indicates that the test is significant at the 5% level.
Figure 1: Kernel density of country-specific OLS and CCE income elasticities at the beginning and at the end of the sample period.
Notes: the superscript marker indicates that the coefficient is significant at the 5% level.