Testing for Heteroskedasticity and Spatial Correlation in a Random Effects Panel Data Model

A panel data regression model with heteroskedastic as well as spatially correlated disturbancesis considered, and a joint LM test for homoskedasticity and no spatial correlation is derived. In addition, a conditional LM test for no spatial correlation given heteroskedasticity, as well as a conditional LM test for homoskedasticity given spatial correlation, are also derived. These LM tests are compared with marginal LM tests that ignore heteroskedasticity in testing for spatial correlation, or spatial correlation in testing for homoskedasticity. Monte Carlo results show that these LM tests as well as their LR counterparts perform well even for small N and T. However, misleading inference can occur when using marginal rather than joint or conditional LM tests when spatial correlation or heteroskedasticity is present.


Introduction
The standard error component panel data model assumes that the disturbances have homoskedastic variances and no spatial correlation, see Hsiao (2003) and Baltagi (2005). These may be restrictive assumptions for a lot of panel data applications. For example, the cross-sectional units may be varying in size and as a result may exhibit heteroskedasticity. Also, for trade flows across a panel of countries, there may be spatial effects affecting this trade depending on the distance between these countries. The standard error components model has been extended to take into account spatial correlation by Anselin (1988), Baltagi, Song and Koh (2003), and Kapoor, Kelejian and Prucha (2007), to mention a few. This model has also been generalized to take into account heteroskedasticity by Mazodier and Trognon (1978), Baltagi and Griffin (1988), Li and Stengos (1994), Lejeune (1996), Holly and Gardiol (2000), Roy (2002) and Baltagi, Bresson and Pirotte (2006) to mention a few. For a review of these papers, see Baltagi (2005). However, these strands of literature are almost separate in the panel data error components literature. When one deals with heteroskedasticity, spatial correlation is ignored, and when one deals with spatial correlation, heteroskedasticity is ignored.
LM tests for spatial models are surveyed in Anselin (1988Anselin ( , 2001 and Anselin and Bera (1998), to mention a few. For a joint test of the absence of spatial correlation and random effects in a panel data model, see Baltagi, Song and Koh (2003). However, these tests ignore the heteroskedasticity in the disturbances. On the other hand, Holly and Gardiol (2000) derived an LM statistic which tests for homoskedasticity of the disturbances in the context of a one-way random effects panel data model. However, this LM test ignores the spatial correlation in the disturbances. This paper extends the Holly and Gardiol (2000) model to allow for spatial correlation in the remainder disturbances. It derives a joint LM test for homoskedasticity and no spatial correlation. The 2 restricted model is the standard random effects error component model. It also derives a conditional LM test for no spatial correlation given heteroskedasticity, as well as, conditional LM test for homoskedasticity given spatial correlation. The paper then contrasts these LM tests with marginal LM tests that ignore heteroskedasticity in testing for spatial correlation, or spatial correlation in testing for homoskedasticity, again in the context of a random effects panel data model. These LM tests are computationally simple. Monte Carlo results show that misleading inference can occur when using marginal rather than joint or conditional LM tests when spatial correlation or heteroskedasticity is present. It is important to note that this paper does not consider alternative forms of spatial lag dependence. It also does not allow for endogeneity of the regressors and requires the normality asssumption to derive the LM tests. The rest of the paper is organized as follows: Section 2 reviews the general heteroskedastic one-way error component model with spatially autocorrelated residual disturbances. Section 3 derives the joint and conditional LM tests described above. Section 4 performs Monte Carlo simulations comparing the size and power of these LM tests along with their LR counterparts. Section 5 concludes.

The Model
Consider the following panel data regression model y ti = X ′ ti β + u ti , i = 1, · · · , N ; t = 1, · · · T, (2.1) where y ti is the observation on the ith region for the tth time period. X ti denotes the k × 1 vector of observations on the non-stochastic regressors and u ti is the regression disturbance. In vector form, the disturbance vector of (2.1) is assumed to have random region effects as well as spatially autocorrelated residual disturbances, see Anselin (1988): where u ′ = (u t1 , . . . , u tN ), ε ′ t = (ε t1 , . . . , ε tN ) and µ ′ = (µ 1 , . . . , µ N ) are assumed independent and normally distributed according to: zero. f i is a (p × 1) vector of strictly exogenous regressors which determine the heteroskedasticity of the individual specific effects. θ is(p × 1) vector of parameters, and λ is the scalar spatial autoregressive coefficient with |λ| < 1. W is a known N × N spatial weight matrix whose diagonal elements are zero. W also satisfies the condition that (I N − λW ) is nonsingular for all |λ| < 1.
v ′ t = (v t1 , . . . , v tN ), where v ti is i.i.d. over i and t and is assumed to be N (0, σ 2 v ). The v ti process is also independent of the µ i process. One can rewrite (2.3) as (2.5) where B = I N − λW and I N is an identity matrix of dimension N. The model (2.1) can be rewritten in matrix notation as where y is now of dimension N T × 1, X is N T × k, β is k × 1 and u is N T × 1. X is assumed to be of full column rank and its elements are assumed to be asymptotically bounded in absolute value.
Equation(2.2) can be written in vector form as: where v ′ = (ν ′ 1 , . . . , ν ′ T ), ι T is a vector of ones of dimension T , I N is an identity matrix of dimension T and ⊗ denotes the kronecker product. Under these assumptions, the variance covariance matrix of u can be written as where J T is a matrix of ones of dimension T and F is (N ×p) matrix of regressors that determine the heteroskedasticity. diag(h(F θ)) denotes a diagonal (N × N ) matrix with its ith diagonal element 4 being the ith element of the (N × 1) vector h(F θ). Note that the computational difficulty in dealing with this Ω is only hampered by the inversion of (B ′ B) . Smirnov (2005) has designed an algorithm for computing the information matrix without storing (B ′ B) −1 . He developed a sparse version of the conjugate gradient method for which he reports good numerical stability and modest computational requirements.

Joint LM Test
In this subsection, we derive the joint LM test for testing for no heteroskedasticity and no spatial correlation in a random effects panel data model. The null hypothesis is given by H a o : θ 1 = · · · = θ p = 0 and λ = 0| σ 2 µ > 0, σ 2 v > 0. The log-likelihood function under normality of the disturbances is given by where ϕ ′ = σ 2 v , σ 2 µ , λ, θ 1 , . . . , θ p . The information matrix is block-diagonal between β and ϕ. Since H a 0 involves only ϕ, the part of the information due to β is ignored, see Breusch and Pagan (1980). In order to obtain the joint LM statistic, we need D(ϕ) = (∂L/∂ϕ) and the information matrix J(ϕ) = E[−∂ 2 L/∂ϕ∂ϕ ′ ] evaluated at the restricted ML estimator ϕ. Under the null hypothesis H a 0 , the variance-covariance matrix reduces to It is the familiar form of the random effects error component model with no spatial correlation or heteroskedasticity, see Baltagi (2005).

Its inverse is given by
The score D ϕ , and information matrix J a (ϕ) ,under the null hypothesis H a 0 are derived in Appendix 1. The resulting LM statistic for H a 0 is given by: ., u iT ), i.e., the residuals are re-ordered according to time for each individual. F = I N −J N F , and σ 2 1 and σ 2 v are the restricted maximum likelihood estimates of σ 2 1 and σ 2 v given by Under the null hypothesis H a 0 , LM λθ should be asymptotically distributed as χ 2 p+1 . Although we do not explicitly derive the asymptotic distribution of our test statistics, they are likely to hold under a similar set of primitive assumptions developed by Kelejian and Prucha (2001). Also, note that for large N , tricks for computing the trace(W 2 ) can be approximated as in Barry and Pace (1999) to any desired accuracy using an O(N ) algorithm.

Conditional LM Tests
The joint LM test derived in the previous section is useful especially when one does not reject the null hypothesis H a 0 . However, if the null hypotheses is rejected, one can not infer whether the presence of heteroskedasticity, or the presence of spatial correlation, or both factors caused this rejection. Alternatively, one can derive two conditional LM tests. The first one tests for the absence of spatial correlation assuming that heteroskedasticity of the individual effects might be present.
The second one tests for homoskedasticity assuming that spatial correlation might be present. All in the context of a random effects panel data model.
For the first conditional LM test, the null hypothesis is given by H b 0 : λ = 0 (assuming at least one element of θ is not zero). Under the null hypothesis H b 0 , the variance-covariance matrix in (2.8) 6 reduces to Replacing J T by TJ T and I T by (E T +J T ), where E T = I T −J T , andJ T = J T /T , see Wansbeek and Kapteyn (1982), one gets: This also implies that Appendix 2 derives the score D ϕ and information matrix J b (ϕ) under H b 0 . The resulting conditional LM statistic for H b 0 is given by withũ = y − Xβ M LE denoting the restricted maximum likelihood residuals under H b 0 , i.e., under a random effects panel data model with no spatial correlation but possible heteroskedasticity, , where θ, σ 2 µ and σ 2 v are the restricted MLE of θ, σ 2 µ and σ 2 v , under H b 0 . Under the null hypothesis H b 0 , LM λ|θ should be asymptotically distributed as χ 2 1 .
For the second conditional LM test, the null hypothesis is given by: H c 0 : θ = 0 (assuming λ may not be zero). Under the null hypothesis H c 0 , the variance-covariance matrix in (2.8) reduces to Replacing J T by TJ T and I T by (E T +J T ), where E T = I T −J T , andJ T = J T /T , see Wansbeek and Kapteyn (1982), one gets: where φ = σ 2 µ /σ 2 v . This also implies that where Z = T φI N + (B ′ B) −1 −1 . Appendix 3 derives the score D ϕ and information matrix J c (ϕ) under H c 0 , i.e., under a random effects panel data model with no heteroskedasticity but possible spatial correlation. The resulting conditional LM statistic for H c 0 is given by

Monte Carlo Results
The experimental design for the Monte Carlo simulations is based on the format extensively used in earlier studies in the spatial regression model by Anselin and Rey (1991) and Anselin and Florax (1995), and in the heteroskedastic panel data model by Roy (2002).
The model is set as follows: where α = 5 and β = 0.5. x it is generated by a similar method to that of Nerlove (1971). In fact, The initial values x i0 are chosen as (5 + 10z i0 ). For the disturbances, andx i. denoting the individual mean of x it . Denoting the expected variance of µ i byσ 2 µ i and following Roy (2002), we fixσ 2 µ i + σ 2 ν = 20. We let σ 2 ν take the values 4 and 16. For each fixed value 8 of σ 2 ν , θ is assigned values 0, 1, 2 and 3 with θ = 0 denoting the homoskedastic individual specific error. For a fixed value of σ 2 ν , we obtain the values ofσ 2 µ i and using (4.2), we get the values for σ 2 µ for each θ value considered. Then we obtain the values of σ 2 µ i for each σ 2 µ under the four different θ values considered. The matrix W is a rook or queen type weight matrix, and the rows of this matrix are standardized so that they sum to one. The spatial autocorrelation factor λ is varied over the positive range from 0 to 0.9 by increments of 0.1. Two values for N = 25 and 49, and three values for T = 5, 7 and 12 are chosen. For each experiment, 1000 replications are performed. We chose small values of N and T to demonstrate that the size of these tests work well even in small samples. Of course for larger samples, the computational difficulty increases, but the asymptotics should even be better behaved. Table 1 gives the empirical size of the joint LM and LR tests for the null hypothesis H a 0 : θ = λ = 0 at the 5% significance level when N = 25, 49 and T = 5, 7, 12 for both the queen and rook weight matrices. Additionally, we ran a limited set of experiments for N = 100 and T = 7 with a Rook design to show that the power improves as N gets large. These results are not shown here to save space but are available upon request from the authors. Table 2 gives the empirical size of the conditional LM and LR tests for the null hypothesis H b 0 : λ = 0 (given θ = 0) and Tables 3a, 3b give the empirical size of the conditional LM and LR tests for the null hypothesis H c 0 : θ = 0 (given λ = 0). As we can see from these Tables, the empirical size is not significantly different from 5% for a Bernoulli with 1000 replications and probability of success of 0.05. Table 4 gives the empirical size of the marginal LM and LR tests for the null hypothesis H d 0 : λ = 0 (given θ = 0) and Tables 5a, 5b, give the empirical size of the marginal LM and LR tests for the null hypothesis H e 0 : θ = 0 (given λ = 0) . As we can see from the Tables, the empirical size of the LM and LR tests are severely undersized for σ 2 ν = 16, and high values of λ, and this empirical size is significantly different from 5% for a Bernoulli with 1000 replications and probability of success of 0.05.
The power of these tests for various experiments are not shown here to save space and are available as Tables A1-A60 upon request from the authors. Figure 1 plots the empirical power for the joint LM and LR test for the null hypothesis H a 0 : θ = λ = 0, for N = 25 and 49, and T = 12. This is done for various values of λ, when σ 2 ν takes the values 4 and 16 and θ = 0, 1, 2, 3. This is done for a Rook weight matrix and an exponential form of heteroskedasticity. The power of the joint  LM and LR tests climbs to one quickly as soon as λ exceeds 0.3 for various values of θ. Also, this power increases with θ for 0 < λ < 0.3. The power also increases with N. Figure 2 repeats these power plots for the case of quadratic heteroskedasticity. Again, the same phenomenon is observed for this alternative form of heteroskedascity. Figure 3 plots the empirical power for the marginal and conditional LM and LR tests for the null hypothesis H b 0 : λ = 0 (given θ = 0), for N = 25 and 49, and T = 12. This is done for various values of λ, when σ 2 ν take the value 16 and θ = 3. This is done for a Rook weight matrix and a quadratic form of heteroskedasticity. As clear from Figure 3, the conditional LM and LR tests perform better than their marginal counterparts for θ = 0. However, the power of all these tests is close to one for λ > 0.3. Figure 4 plots the empirical power for the marginal and conditional LM and LR tests for the null hypothesis H c 0 : θ = 0 (given λ = 0), for N = 25 and 49, and T = 12. This is done for various values of θ, when σ 2 ν take the value 16 and λ = 0.9. This is done for a Rook weight matrix and an exponential form of heteroskedasticity. As clear from Figure 4, the conditional LM and LR tests perform better than their marginal counterparts for λ = 0, and this power is increasing with θ. These figures give a flavour of the power performance of these tests for a subset of the experiments. Of course, more plots can be given, but we refrain from doing so because of space limitation.        Figure 1. Frequency of rejections for H a 0 : λ = 0 and θ = 0 N=25, 49 T=12, Exponential heteroskedasticity, weight matrix is ROOK. Figure 2. Frequency of rejections for H a 0 : λ = 0 and θ = 0 N=25, 49 T=12, Quadratic heteroskedasticity, weight matrix is ROOK. N=25, λ = 0.9 N=49, λ = 0.9

Conclusion
This paper considered a panel data regression model with heteroskedasticity in the individual effects and spatial correlation in the remainder error. Testing for heteroskedasticity ignoring the spatial correlation as well as testing for spatial correlation ignoring heteroskedasticity is shown to lead to misleading results. The paper derived joint and conditional LM tests for heteroskedasticity and spatial correlation and studied their performance using Monte Carlo experiments. This paper generalized the Holly and Gardiol (2000) paper by allowing for spatial correlation in the remainder disturbances. The paper does not consider alternative forms of spatial lag dependence and this should be the subject of future research. In addition, it is important to point out that the asymptotic distribution of our test statistics were not explicitly derived in the paper but that they are likely to hold under a similar set of primitive assumptions developed by Kelejian and Prucha (2001). The results in the paper should be tempered by the fact that the N = 25, 49 used in our Monte Carlo experiments may be small for a typical micro panel. Larger N = 100 did improve the performance of these tests whose critical values were based on their large sample distributions. Although the computational difficulty increases with N, one can use computational tricks given by Barry and Pace (1999) and Smirnov (2005).

Acknowledgement
This work was supported by KOSEF(R01-2006-000-10563-0). We would like to thank the guest editor Manfred M. Fischer and two anonymous referees for helpful comments and suggestions. We dedicate this paper in memory of our colleague and co-author Seuck Heun Song who passed away March, 2008.

Appendix 1
This appendix derives the joint LM test for testing for no heteroskedasticity and no spatial correlation in a random effects panel data model, i.e., H a o : θ 1 = · · · = θ p = 0 and λ = 0| σ 2 µ > 0, σ 2 v > 0. The variance-covariance matrix of the disturbances in the unrestricted model is given by (2.8) and can be written as where J T is a matrix of ones of dimension T and F is (N ×p) matrix of regressors that determine the heteroskedasticity. diag(h(F θ)) denotes a diagonal (N × N ) matrix with its ith diagonal element being the ith element of the (N × 1) vector h(F θ). The log-likelihood function under normality of the disturbances is given by where ϕ ′ = σ 2 v , σ 2 µ , λ, θ 1 , . . . , θ p . The information matrix is block-diagonal between β and ϕ. Since H a 0 involves only ϕ, the part of the information due to β is ignored, see Breusch and Pagan (1980). In order to obtain the joint LM statistic, we need D(θ) = (∂L/∂θ) and the information matrix J(θ) = E[−∂ 2 L/∂θ∂θ ′ ] evaluated at the restricted ML estimator ϕ. Under the null hypothesis H a 0 , the variance-covariance matrix reduces to Ω 0 = σ 2 µ (I N ⊗ J T ) + σ 2 ǫ (I N ⊗ I T ). It is the familiar form of the random effects error component model with no spatial correlation or heteroskedasticity, see Baltagi (2005). Its inverse is given by Hartley and Rao (1967) or Hemmerle and Hartley (1973) give a general useful formula to derive the scores: Under the null hypothesis H a 0 , we get where h (1) (0) denote the first derivative of h evaluated at zero, and F k is the kth column of F . This Anselin (1988, p.164). Substituting (A.5) into (A.4), we get This uses the fact that tr(W ) = tr(W ′ ) = 0. From the second score equation, one gets that . Note that σ 2 1 is used to simplify the last score equation. S is an (N × 1) vector with typical element S i =ũ ′ iJ Tũi and F = I N −J N F . Here,ũ ′ i = ( u i1 , .., u iT ), i.e., the residuals are re-ordered according to time for each individual. Therefore, the score with respect to each element of ϕ, evaluated at the resticted MLE is given by For the information matrix, it is useful to use the formula given by Harville(1977): for r,s=1,2,..,p+3. Under the null hypothesis H a 0 ,we get: Therefore, the information matrix under the null hypothesis H a 0 is given by where b = tr(W 2 + W ′ W ). Since D ′ ϕ = (0, 0, D( λ), D( θ)) , and J a (ϕ) is a block diagonal matrix with respect to ϕ 1 = σ 2 v , σ 2 µ , θ and λ, we only need the lower (p + 1) × (p + 1) block of the inverse of J a (ϕ). Let And using the partitioned inverse formula, we get where F = (I N −J N )F. The resulting LM statistic for H a 0 is given by Under the null hypothesis, LM λθ should be asymptotically distributed as χ 2 p+1 .

Appendix 2
This appendix derives the conditional LM test for testing no spatial correlation assuming there might be heteroskedasticity, i.e., H b 0 : λ = 0 (assuming some elements of θ may not be zero). Under the null hypothesis H b 0 , the variance-covariance matrix in (A.1) reduces to Replacing J T by TJ T and I T by (E T +J T ), where E T = I T −J T , andJ T = J T /T , see Wansbeek and Kapteyn (1982), one gets: This also implies that In this case: Using (A.4), we get Therefore, the score with respect to each element of ϕ, evaluated at the resticted MLE under H b 0 is given by , under a random effects panel data model with no spatial correlation but possible heteroskedasticity, Also, the elements of the information matrix for this model are given by where r,z and s are (N × 1) vectors with typical elements: r i = h (1) (f ′ i θ)/g i , z i = h (1) (f ′ i θ)/g 2 i and s i = h(f ′ i θ)h (1) (f ′ i θ) /g 2 i , respectively. Therefore, the information matrix under the null hypothesis H b 0 is given by: Since D ′ ϕ = (0, 0, D( λ), 0) , and J b (ϕ) is a block diagonal matrix with respect to ϕ 1 = σ 2 v , σ 2 µ , θ and λ, the resulting conditional LM statistic for H b 0 is given by Under the null hypothesis H b 0 , LM λ|θ should be asymptotically distributed as χ 2 1 .

Appendix 3
This appendix derives the conditional LM test for testing for homoskedasticity assuming there might be spatial correlation, i.e H c 0 : θ = 0 (assuming λ may not be zero). Under the null hypothesis H c 0 , the variance-covariance matrix in (A.1) reduces to Replacing J T by TJ T and I T by (E T +J T ), where E T = I T −J T , andJ T = J T /T , see Wansbeek and Kapteyn (1982), one gets: (A.16) where φ = σ 2 µ /σ 2 v . This also implies that where Z = T φI N + (B ′ B) −1 −1 . In this case: Using (A.4), we get where d 1 is an (N × 1) vector with its typical element being the diagonal element of the matrix Z.
Therefore, the score with respect to each element of ϕ, evaluated at the resticted MLE under H c 0 is given by