Panel Data Inference Under Spatial Dependence

This paper focuses on inference based on the usual panel data estimators of a one-way error component regression model when the true specification is a spatial error component model. Among the estimators considered, are pooled OLS, random and fixed effects, maximum likelihood under normality, etc. The spatial effects capture the cross-section dependence, and the usual panel data estimators ignore this dependence. Two popular forms of spatial autocorrelation are considered, namely, spatial auto-regressive random effects (SAR-RE) and spatial moving average random effects (SMA-RE). We show that when the spatial coefficients are large, test of hypothesis based on the usual panel data estimators that ignore spatial dependence can lead to misleading inference.


Introduction
This paper considers spatial dependence across panels as a simple way of capturing cross-section dependence among countries, regions, or neighbors. The structure of the dependence can be related to location and distance, both in a geographic space as well as a more general economic or social network space, see Anselin (1988) and Anselin, Le Gallo and Jayet (2008). Following Kapoor, Kelejian and Prucha (2007) and Fingleton (2008a), we focus on two spatial error processes: the spatial autoregressive (SAR) and the spatial moving average (SMA) random effects model, namely SAR-RE and SMA-RE. Under the assumption that the true model is SAR-RE or SMA-RE, inference based on the usual panel data estimators including pooled OLS, random and fixed effects, Maximum Likelihood under normality, etc., is investigated using Monte Carlo experiments. In panel data analysis, the one-way error component model is popular for capturing heterogeneity among the crosssectional units, see Wallace and Hussain (1969), Amemiya (1971), Swamy and Arora (1972), to mention only a few. These random effects estimators are programed using standard software like EViews and Stata. Under the assumption of no cross-section dependence, the small sample efficiency of these estimators as well as inference based upon them is studied, for example, by Maddala and Mount (1973), Baltagi (1981) and Moulton (1980), to mention only a few. This paper introduces cross-section dependence through spatial autoregressive and moving average error component models, and re-examine inference based on these estimators. We show that when the spatial coefficients are large, test of hypothesis based on the usual panel data estimators that ignore spatial dependence can lead to misleading inference. In a similar spirit, it is worth noting that Baltagi, Bresson and Pirotte (2007) studied the performance of panel unit root tests under spatial dependence of RE-SAR and RE-SMA types. They find that for the random effects SAR model for example, panel unit root tests that ignore this spatial correlation, will yield over-sized unit root tests of up to 20% at the 5% significance level. The plan of this paper is as follows: Section 2 presents the model, while Section 3 describes the Monte Carlo design. Section 4 contains the Monte Carlo results, and the last section concludes.
where Z it = [1, X it ] is 1 × (k + 1), δ ′ = [γ, β ′ ] is 1 × (k + 1). Kapoor, Kelejian and Prucha (2007) proposed a SAR random effects model, hereafter SAR-RE. In their specification, the disturbance term u t itself follows a SAR process and the remainder term ε t follows an error component structure W N is an (N × N) known spatial weights matrix which has zero diagonal elements and is usually row-normalized. Also, µ is an N × 1 vector of individual effects which are assumed to be IID(0, σ 2 µ ), and v t is an N × 1 vector of remainder effects which are assumed to be IID(0, σ 2 v ). µ and v t are independent of each other and among themselves. Combining (2) and (3), we obtain the SAR-RE specification for the (N × 1) error vector u t at time t : where I N is an identity matrix of dimension N × N and B N = (I N − ρW N ).
The matrix B N is assumed to be non-singular, and the row and column sums of the matrix W N are bounded uniformly in absolute value. For the full (NT × 1) vector of disturbances, we have: where ι T is a vector of ones of dimension T ×1. The corresponding (N T × N T ) covariance matrix is given by: where Ω ε = σ 2 µ (J T ⊗ I N ) + σ 2 v I NT = σ 2 v Q 0,N + σ 2 1 Q 1,N and Q 0,N = I T − J T ⊗ I N (8) with J T = ι T ι ′ T , J T = J T /T and σ 2 1 = σ 2 v +T σ 2 µ , see Baltagi (2008). Fingleton (2008a) extended this model to the spatial moving average random effects specification, hereafter SMA-RE. In that case, the disturbance term u t in (2) follows a SMA process where D N = (I N + λW N ), and ε t follows an error component structure as in (3). So, the full SMA-RE (NT × 1) vector of disturbances is given by: and the corresponding (NT × NT ) covariance matrix is given by: Regression models containing spatially correlated disturbance terms based on the SAR or SMA models are typically estimated using Maximum Likelihood (ML), where the likelihood function corresponds to the normal distribution. However, this can be computationally demanding for large N. Kelejian and Prucha (1999) suggested a generalized moments (GM) estimation method for the SAR model in a cross-section setting, and Fingleton (2008b) extended this generalized moments estimator to the SMA model. Kapoor, Kelejian and Prucha (2007) generalized this GM procedure from cross-section to panel data and derived its large sample properties when T is fixed and N → ∞. Kapoor, et al. (2007) proposed three generalized moments (GM) estimators of ρ, σ 2 v and σ 2 1 = σ 2 v + T σ 2 µ based on the following six moment conditions: Under the random effects specification (5), the OLS estimator of δ is consistent. Using δ OLS one gets a consistent estimator of the disturbances ε = y − Z δ OLS . The GM estimators of σ 2 1 , σ 2 ν and ρ are the solution of the sample counterpart of the six equations given above. Kapoor, et al. (2007) suggest three GM estimators. The first involves only the first three moments which do not involve σ 2 1 and yield estimates of ρ and σ 2 ν . The fourth moment condition is then used to solve for σ 2 1 given estimates of ρ and σ 2 ν . The second GM estimator is based upon weighing the moment equations by the inverse of a properly normalized variance-covariance matrix of the sample moments evaluated at the true parameter values. A simple version of this weighting matrix is derived under normality of the disturbances. The third GM estimator is motivated by computational considerations and replaces a component of the weighting matrix for the second GM estimator by an identity matrix. Kapoor, et al. (2007) perform Monte Carlo experiments comparing ML and these three GM estimation methods. They find that on average, the RMSE of ML estimator and their weighted GM estimators are quite similar 1 . Fingleton (2008a) extended this GM estimator to the SMA panel data model with random effects (11). The moment conditions for SMA-RE are similar to those derived by Kapoor, et al. (2007).
The spatial feasible GLS estimator of δ, hereafter S-FGLS, is then obtained by replacing ρ, σ 2 v and σ 2 1 by their GM estimators. More precisely, we have: with where for SMA-RE. If ρ = λ = 0, (1) is reduced to the usual one-way error component model. Thus, the vector of disturbances (5) is reduced to: with Z µ = ι T ⊗ I N . Following (23), the covariance matrix of u is where with θ 2 = σ 2 v /σ 2 1 . In order to estimate the parameters of the one-way error component model, we can use several estimators: Ordinary Least Squares (OLS), Within (W), Feasible Generalized Least Squares (FGLS) and Maximum Likelihood (ML) under normality of the disturbances, see Breusch (1987). Here, we are interested in inference using these estimators under the assumption that the true model is SAR-RE or SMA-RE.
The usual estimated covariance matrices of the OLS, Within, FGLS and ML estimators from a typical regression package are: For the FGLS estimator, we need estimates of the variance components σ 2 µ and σ 2 v . We can use the following estimators: Amemiya (1971), Wallace and Hussain (1969), Swamy and Arora (1972), Nerlove (1971a), Henderson III (1953) and Minque (Rao (1970, 1972). Moreover, under normality of the disturbances, we can apply the Maximum Likelihood estimator considered by Breusch (1987).

Monte Carlo Design
In this section, we consider the small sample performance of usual estimators of an error component model with spatially autocorrelated residuals. The data generating process (DGP) consider two specifications on the remainder errors, namely SAR, given in (2), and SMA, given in (10). More formally: with u t = ρW N u t + ε t for SAR λW N ε t + ε t for SMA with ρ, λ = 0.2, 0.8.
Moreover, the remainder term ε t follows an error component structure: . Throughout the experiment the parameters were set at γ = 5 and β = 0.5. The x it explanatory variable is generated as in Nerlove (1971b) with: where ω it is a random variable uniformly distributed on the interval [−0.5, 0.5] and x i0 = 5 + 10ω i0 . The first 20 period observations were discarded to minimize the effect of initial values. For the error component (32), three cases for the residuals variances are considered: For the spatial weights matrices, we use regular and irregular lattices structures, as in Anselin and Moreno (2003) and in Kelejian and Prucha (1999).
For the regular spatial case, we use two weight matrices which essentially differ in their degree of sparseness. Following Kelejian and Prucha (1999), the weight matrices are labelled as "j ahead and j behind" with the non-zero elements being 1/2j, j = 1 and 5 (see Figures 1 and 2 in the Appendix). Note that, as j increase, the value of non-zero elements 1/2j decreases and, this is turn may reduce the amount of spatial correlation. For the irregular spatial case, we take the spatial groupings of the largest French administrative communes (see Baltagi, Bresson and Pirotte (2007)). These spatial weight matrices may represent high-order contiguity relationships. We consider the structures of 1 (see Figure 3 in the Appendix) and 3-nearest neighborhoods. 2 The one or three-order contiguity matrices reflect the fact that neighborhood j may be one of the 1 or 3-nearest neighborhoods to i, but j may have some other 1 or 3-nearest neighborhoods not including i. We consider several individual and time dimensions N = 50, T = (5, 10). We evaluate the efficiency of thirteen estimators under spatial autocorrelation of the disturbances. First, we consider nine estimators of the one-way error component model which ignore spatial dependence: • Ordinary Least Squares (OLS).
Second, we consider four estimators which take into account cross-section spatial dependence: • The spatial True GLS SAR-RE estimator where σ 2 µ , σ 2 ν and ρ are known.
For all experiments, 1000 replications are performed. To see how inference based on these estimates of β perform, we focus on the simple test for H 0 : β = 0.4, 0.5, 0.6 and 0.8 (when the true β = 0.5). We also calculate the variances of the estimators of β using the following formulas: • Empirical variances from 1000 replications: (35) • Computed variances. For OLS, this is given by (26); for Within by (27); for FGLS by (28); for ML by (29); and for S-FGLS by (20). For SAR-RE, Ω u is given by (6), while for SMA-RE, it is given by (12). For each estimator, we average the computed variances of β over the 1000 replications.
• True variances: 8 For equations (40) and (41), Ω u is given by (6) for a SAR-RE and by (12) for a SMA-RE. Equations (36) to (41) are true formulas. However, if an estimate of Ω is used, we average these variances of β over the 1000 replications.
4 Inference Using the Usual One-way Error Component Estimators 4.1 Size and power Table 1 gives the percentage of rejections of H 0 : β = 0.4, 0.5, 0.6 and 0.8 (when the true β = 0.5), for the case of N = 50, T = (5, 10), σ 2 µ , σ 2 v = (0.8, 0.2) , (ρ = λ = 0.2, 0.8), and a W(1,1) matrix, one neighbor ahead and one neighbor behind, see Figure 1 in the Appendix. When the true model is SAR-RE, i.e., the first half of Table 1, and for T = 5 and ρ = 0.2, the OLS estimator rejects the null hypothesis H 0 : β = 0.5, when it is true, in 12.2% of the cases at the 5% significance level. This means that the test is over-sized. This gets worse when the spatial coefficient ρ increases to 0.8. In this case, the size of the test becomes 15.6%. When the true model is SMA-RE, i.e., the bottom half of Table 1, and for T = 5 and λ = 0.2, the OLS estimator rejects the null hypothesis H 0 : β = 0.5, when it is true, in 12.3% of the cases at the 5% significance level. Once again, the test is over-sized, and it gets slightly worse when the spatial coefficient λ increases to 0.8. The size of the test becomes 12.8%. The fixed effects (within) and random effects (FGLS) as well as ML estimators have a size of 6.9% to 10.2%, when the true model is SAR-RE, and T = 5 and ρ = 0.2. However, this gets worse when the spatial coefficient ρ increases to 0.8. In this case, the size of these tests varies between 28.2% and 35.3%. When the true model is SMA-RE, and T = 5 and λ = 0.2, the size of the fixed effects (within) and random effects (FGLS) as well as ML estimators varies between 6.8% to 9.9%. This gets worse when the spatial coefficient λ increases to 0.8. The size of these tests varies between 12.2% to 17.1%. The size of the true GLS whether it is SAR-RE or SMA-RE is never statistically different from 5% for all cases considered in Table 1. This is also true for FGLS SAR-RE and FGLS SMA-RE except for FGLS SAR-RE when λ = 0.8 (resp. FGLS SMA-RE when ρ = 0.8) in Table 1. Table 2 shows that the true variances of all the β considered, are well estimated by their empirical counterparts using 1000 replications. For exam-ple, the OLS true variance is 3.738 for T = 5 and ρ = 0.2, when the true model is SAR-RE. The empirical estimate of this using 1000 replications is 3.805. However, the usual regression package under-estimates this variance and yields 2.385. This under-estimation of the true variance of OLS under spatial dependence leads to the over-sizing of the test of H 0 : β = 0.5, in Table 1. This gets worse when the spatial coefficient ρ increases to 0.8. The true variance of OLS increases to 17.468 and its empirical estimate is 16.777, while the usual regression package under-estimates this variance and yields 9.7386. This undermines inference based on OLS estimates which ignore heterogeneity and spatial dependence present in this model. 3 The same is true for fixed effects (within) and random effects (FGLS) as well as ML estimators which deal with heterogeneity but ignore spatial dependence. The true variances are of the order of 0.89 to 0.92 for T = 5 and ρ = 0.2, when the true model is SAR-RE, and their empirical estimates using 1000 replications vary between 0.83 and 0.85. However, the usual regression package estimates will under-estimate these variances and yield magnitudes between 0.56 and 0.71. This gets worse when the spatial coefficient ρ increases to 0.8. The true variances of fixed effects (within) and random effects (FGLS) as well as ML estimators increase in range to 10.60 and 11.09 and their empirical counterparts vary between 9.49 and 10.00, while the usual regression package under-estimates these variances and yield estimates in the range of 2.40 to 3.09. This seriously undermines inference based on these panel estimates which account for heterogeneity but ignore the spatial dependence present in this model. Note that for the FGLS SAR-RE and FGLS SMA-RE, the computed variances tend to over-estimate the true variances in Table 2.
Tables 1 and 2 give also the results when we double T from 5 to 10, holding N = 50 constant. Basically, we get the same results for over-sizing of the test for H 0 : β = 0.4, 0.5, 0.6 and 0.8 (when the true β = 0.5), and under-estimation of the true variances for all the estimators considered. Of course, the power now increases drastically with T especially for β = 0.4, and 0.6.
Tables 3 and 5 replicate the results in Table 1 on size and power of H 0 , only now the heterogeneity in the individual effects is reduced from 80% of the total variance to 50% in Table 3, and 20% in Table 5. The results for size and power are the same, but the magnitudes are different. Similarly, Tables 4 and 6 replicate the results in Table 2 for the true and estimated variances, only now the heterogeneity in the individual effects is reduced from 80% of the total variance to 50% in Table 4, and 20% in Table 6. The results for the under-estimation of the true-variances are the same, but the magnitudes are different.

Spatial weighted matrix effect
Tables 7, 9 and 11 replicate the results in Tables 1, 3 and 5, only now for a weight matrix W(5,5) that allows 5 neighbors ahead and 5 neighbors behind, see Figure 2 in the Appendix. Again, the results for size and power are the same, but the magnitudes are different. Similarly, Tables 8, 10 and 12 replicate the results in Table 2, 4 and 6, only now for a weight matrix W(5,5) that allows 5 neighbors ahead and 5 neighbors behind. The results for the under-estimation of the true-variances are the same, but the magnitudes are different. Tables 13, 15 and 17 replicate the results in Tables 1, 3 and 5, only now for an asymmetric one-order contiguity matrix, from french communes, see Figure  3 in the Appendix. Again, the results for size and power are the same, but the magnitudes are different. Similarly, Tables 14, 16 and 18 replicate the results in Table 2, 4 and 6, only now for an asymmetric one-order contiguity matrix, from french communes. The results for the under-estimation of the true-variances are the same, but the magnitudes are different. 4 Table 19 checks the performance of the Hausman (1978) test when the true model is SAR-RE or SMA-RE. This test is based on the contrast between the within estimator and the Swamy and Arora (1972) feasible GLS RE esti-mator. 5 For all experiments performed, the null hypothesis of no correlation between the individual effects and the regressor is satisfied. Table 19 therefore checks the sensitivity of the usual Hausman test that ignores the spatial correlation of the SAR or SMA type. The results indicate that the Hausman test yields size not significantly different from 5% for all experiments considered when ρ = λ = 0.2, i.e., weak spatial correlation. One exception is the asymmetric one-order contiguity matrix which is over-sized yielding at worst 7.2% rather than 5%. Things get better when T increases from 5 to 10. However, when ρ = λ = 0.8, i.e., when the spatial correlation effect is larger, the size of the Hausman test is distorted. It varies from an under-sizing of 1.5% for SAR-RE, T = 10 and an asymmetric one-order contiguity matrix, to over-sizing by as much as 14% for SAR-RE, T = 5 and a W(5,5) matrix. For the SMA-RE model, this distortion in size for the Hausman test varies from 3.1% to 9.4% rather than 5%.

Conclusions
We showed that when the spatial coefficients are large, test of hypothesis based on the usual panel data estimators that ignore spatial dependence can lead to misleading inference. This can be explained by the fact that the variances of these misspecified estimators under-estimate their true variances for the spatial panel models considered. These results are robust to doubling T , various spatial weight matrices, with one or five neighbors ahead and behind, as well as for asymmetric spatial weight matrices based on french communes.            (Fingleton (2008a)) 11.9 6.0 11.2 48.0 32.9 6.0 32.6 98.9 10.6 6.9 11.6 42.0 23.6 5.8 24.0 92.8