Testing for Random Effects and Spatial Lag Dependence in Panel Data Models

This paper derives a joint Lagrange Multiplier (LM) test which simultaneously tests for the absence of spatial lag dependence and random individual effects in a panel data regression model. It turns out that this LM statistic is the sum of two standard LM statistics. The first one tests for the absence of spatial lag dependence ignoring the random individual effects, and the second one tests for the absence of random individual effects ignoring the spatial lag dependence. This paper also derives two conditional LM tests. The first one tests for the absence of random individual effects without ignoring the possible presence of spatial lag dependence. The second one tests for the absence of spatial lag dependence without ignoring the possible presence of random individual effects.


Introduction
Spatial models deal with correlation across spatial units, usually in a cross-section setting, see Anselin (1988a). Panel data models allow the researcher to control for heterogeneity across these units, see Baltagi (2005). Spatial panel models can control for both heterogeneity and spatial correlation, see Baltagi, Song and Koh (2003). Testing for spatial dependence has been extensively studied by Anselin (1988a, 1988b, 2001) and Anselin and Bera (1998), to mention a few. Baltagi, Song and Koh (2003) considered the problem of jointly testing for random region effects in the panel as well as spatial correlation across these regions. However, that study allowed for spatial correlation only in the remainder error term. This paper generalizes the Baltagi, Song and Koh (2003) approach to allow for spatial lag dependence of the autoregressive kind in the dependent variable rather than the error term.

In fact, this paper derives a joint LM test which simultaneously tests for the absence of spatial lag dependence and random individual effects in a panel data regression model. It turns out that this LM statistic is the sum of two standard LM statistics. The first LM tests for the absence of spatial lag dependence ignoring the random individual effects. This is the standard LM test derived in Anselin (1988b) for cross-section data. The second LM tests for the absence of random individual effects ignoring the spatial lag dependence. This is the standard LM test derived in Breusch and Pagan (1980) for panel data. This paper also derives two conditional LM tests. The first one tests for the absence of random individual effects without ignoring the possible presence of spatial lag dependence. The second one tests for the absence of spatial lag dependence without ignoring the possible presence of random individual effects. These tests should provide useful diagnostics for applied researchers working in this area.

The model and test statistics
Consider a panel data regression model with spatial lag dependence:

$$y_t = \rho W y_t + X_t \beta + u_t, \qquad i = 1, \ldots, N; \; t = 1, \ldots, T, \tag{1}$$

where $y_t' = (y_{t1}, \ldots, y_{tN})$ is a vector of observations on the dependent variable for $N$ regions or households at time $t = 1, \ldots, T$. $\rho$ is a scalar spatial autoregressive coefficient and $W$ is a known $N \times N$ spatial weight matrix whose diagonal elements are zero. $W$ also satisfies the condition that $(I_N - \rho W)$ is non-singular for all $|\rho| < 1$, where $I_N$ is an identity matrix of dimension $N$. $X_t$ is an $N \times k$ matrix of observations on $k$ explanatory variables at time $t$. $u_t' = (u_{t1}, \ldots, u_{tN})$ is a vector of disturbances following an error component model:

$$u_t = \mu + \nu_t, \tag{2}$$

where $\mu' = (\mu_1, \ldots, \mu_N)$ and $\mu_i$ is i.i.d. over $i$ and is assumed to be $N(0, \sigma_\mu^2)$; $\nu_t' = (\nu_{t1}, \ldots, \nu_{tN})$ and $\nu_{ti}$ is i.i.d. over $t$ and $i$ and is assumed to be $N(0, \sigma_\nu^2)$. The $\{\mu_i\}$ process is also independent of the $\{\nu_{ti}\}$ process.
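The data-generating process described above can be sketched in a few lines. This is only an illustration under stated assumptions: the ring-contiguity weight matrix, the parameter values, and the helper names `ring_weights` and `simulate_panel` are hypothetical and do not come from the paper. The sketch draws $y_t$ from the reduced form $y_t = (I_N - \rho W)^{-1}(X_t\beta + \mu + \nu_t)$ and stacks the observations with $t$ as the slow index and $i$ as the fast index.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights (assumed design): neighbors are the two adjacent units on a ring."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W

def simulate_panel(W, T, beta, rho, sigma_mu, sigma_nu, rng):
    """Draw stacked (y, X) from y_t = rho W y_t + X_t beta + mu + nu_t (t slow, i fast)."""
    N, k = W.shape[0], len(beta)
    A_inv = np.linalg.inv(np.eye(N) - rho * W)      # (I_N - rho W)^{-1}
    mu = sigma_mu * rng.standard_normal(N)          # individual effects, constant over t
    X = rng.standard_normal((T, N, k))
    y = np.empty((T, N))
    for t in range(T):
        u_t = mu + sigma_nu * rng.standard_normal(N)
        y[t] = A_inv @ (X[t] @ beta + u_t)          # reduced form of the spatial lag model
    return y.reshape(T * N), X.reshape(T * N, k)

rng = np.random.default_rng(0)
N, T, rho = 6, 4, 0.4
W = ring_weights(N)
beta = np.array([1.0, -0.5])
y, X = simulate_panel(W, T, beta, rho, sigma_mu=1.0, sigma_nu=1.0, rng=rng)

# Spatial lag computed period by period, and via the Kronecker product (I_T ⊗ W).
Y = y.reshape(T, N)
lag_by_period = (Y @ W.T).reshape(T * N)            # row t of Y @ W.T is (W y_t)'
lag_kron = np.kron(np.eye(T), W) @ y
```

The two ways of forming the spatial lag agree, which is the ordering convention the stacked form of the model relies on.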

Equation (1) can be rewritten in matrix notation as

$$y = \rho (I_T \otimes W) y + X\beta + u, \tag{3}$$

where $y$ is of dimension $NT \times 1$, $X$ is $NT \times k$, $\beta$ is $k \times 1$ and $u$ is $NT \times 1$. The observations are ordered with $t$ being the slow running index and $i$ the fast running index, i.e., $y' = (y_{11}, \ldots, y_{1N}, \ldots, y_{T1}, \ldots, y_{TN})$. $X$ is assumed to be of full column rank and its elements are assumed to be asymptotically bounded in absolute value. Equation (2) can also be written in vector form as

$$u = (\iota_T \otimes I_N)\mu + \nu, \tag{4}$$

where $\nu' = (\nu_1', \ldots, \nu_T')$, $\iota_T$ is a vector of ones of dimension $T$, $I_N$ is an identity matrix of dimension $N$, and $\otimes$ denotes the Kronecker product. Under these assumptions, the variance-covariance matrix for $u$ can be written as

$$\Omega = \sigma_\mu^2 (J_T \otimes I_N) + \sigma_\nu^2 I_{NT}, \tag{5}$$

where $J_T$ is a matrix of ones of dimension $T$.
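Under the normality assumption, the log-likelihood function of equation (1) is given by

$$L = -\frac{NT}{2}\ln 2\pi - \frac{1}{2}\ln|\Omega| + T\sum_{i=1}^{N}\ln(1 - \rho\omega_i) - \frac{1}{2}u'\Omega^{-1}u, \tag{6}$$

where $u = y - \rho(I_T \otimes W)y - X\beta$ and the $\omega_i$'s are the eigenvalues of $W$. Using the notation in Baltagi (2005), we can write $\Omega = \sigma_\nu^2\Sigma$, where $\Sigma = Q + \phi^{-2}P$, $P = \bar{J}_T \otimes I_N$, $\bar{J}_T = \iota_T\iota_T'/T$, $Q = I_{TN} - P$, $\phi^2 = \sigma_\nu^2/\sigma_1^2$ and $\sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2$. From this it follows that $\ln|\Omega| = NT\ln\sigma_\nu^2 - N\ln\phi^2$. The log-likelihood function in (6) can be rewritten as

$$L = -\frac{NT}{2}\ln 2\pi\sigma_\nu^2 + \frac{N}{2}\ln\phi^2 + T\sum_{i=1}^{N}\ln(1 - \rho\omega_i) - \frac{1}{2\sigma_\nu^2}u'\Sigma^{-1}u, \tag{7}$$

with $\Sigma^{-1} = Q + \phi^2 P$, and one can estimate this model using maximum likelihood, see Anselin (1988a). This paper derives a joint LM test for the absence of spatial lag dependence as well as random effects.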
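The decomposition of $\Omega$ and the two equivalent forms of the log-likelihood can be checked numerically. The following is a minimal sketch, assuming a small hypothetical ring-contiguity $W$ and arbitrary parameter values chosen only for illustration. It verifies that $\sigma_\mu^2(J_T \otimes I_N) + \sigma_\nu^2 I_{NT} = \sigma_\nu^2(Q + \phi^{-2}P)$, that $\ln|\Omega| = NT\ln\sigma_\nu^2 - N\ln\phi^2$, and that the log-likelihood written in terms of $\Omega$ equals the one written in terms of $\Sigma^{-1} = Q + \phi^2 P$.

```python
import numpy as np

N, T = 5, 3
sigma_mu2, sigma_nu2, rho = 0.8, 1.5, 0.3     # illustrative values, not from the paper

# Hypothetical ring-contiguity weights: row-normalized, zero diagonal.
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5

I_NT = np.eye(N * T)
J_T = np.ones((T, T))
P = np.kron(J_T / T, np.eye(N))               # averages each unit over time
Q = I_NT - P                                  # deviations from unit means

Omega = sigma_mu2 * np.kron(J_T, np.eye(N)) + sigma_nu2 * I_NT
sigma_12 = T * sigma_mu2 + sigma_nu2
phi2 = sigma_nu2 / sigma_12
Sigma = Q + P / phi2
Sigma_inv = Q + phi2 * P

logdet_Omega = np.linalg.slogdet(Omega)[1]
logdet_formula = N * T * np.log(sigma_nu2) - N * np.log(phi2)

# Evaluate both forms of the log-likelihood at an arbitrary disturbance vector u.
rng = np.random.default_rng(1)
u = rng.standard_normal(N * T)
# T * ln|I_N - rho W| equals T * sum_i ln(1 - rho * omega_i)
jacobian = T * np.linalg.slogdet(np.eye(N) - rho * W)[1]
L_omega = (-0.5 * N * T * np.log(2 * np.pi) - 0.5 * logdet_Omega
           + jacobian - 0.5 * u @ np.linalg.solve(Omega, u))
L_sigma = (-0.5 * N * T * np.log(2 * np.pi * sigma_nu2) + 0.5 * N * np.log(phi2)
           + jacobian - 0.5 * (u @ Sigma_inv @ u) / sigma_nu2)
```

Working with $Q$ and $P$ avoids inverting the $NT \times NT$ matrix $\Omega$ directly, which is the practical reason for the rewritten form.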
The null hypothesis is $H_0^a$: $\rho = \sigma_\mu^2 = 0$, and the alternative $H_1^a$ is that at least one component is not zero. This generalizes the LM test derived in Anselin (1988b) for the absence of spatial lag dependence $H_0^b$: $\rho = 0$ (assuming no random effects, i.e., $\sigma_\mu^2 = 0$), and the Breusch and Pagan (1980) LM test for the absence of random effects $H_0^c$: $\sigma_\mu^2 = 0$ (assuming no spatial lag dependence, i.e., $\rho = 0$). We also derive two conditional LM tests, one for $H_0^d$: $\rho = 0$ (assuming the possible existence of random effects, i.e., $\sigma_\mu^2 > 0$), and the other one for $H_0^e$: $\sigma_\mu^2 = 0$ (assuming the possible existence of spatial lag dependence, i.e., $\rho$ may be different from zero). All the proofs are given in the Appendix to the paper.

Joint LM test for $H_0^a$: $\rho = \sigma_\mu^2 = 0$
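The joint LM test statistic for testing $H_0^a$: $\rho = \sigma_\mu^2 = 0$ is given by

$$LM_J = \frac{R^2}{B} + \frac{NT}{2(T-1)}G^2 = LM_\rho + LM_\mu, \tag{8}$$

where $LM_\rho = R^2/B$ and $LM_\mu = NTG^2/2(T-1)$, with

$$R = \frac{\tilde{u}'(I_T \otimes W)y}{\tilde{\sigma}_\nu^2}, \qquad G = \frac{\tilde{u}'(J_T \otimes I_N)\tilde{u}}{\tilde{u}'\tilde{u}} - 1,$$

$$B = \frac{[(I_T \otimes W)X\tilde{\beta}]'[I_{NT} - X(X'X)^{-1}X'][(I_T \otimes W)X\tilde{\beta}]}{\tilde{\sigma}_\nu^2} + T\,\mathrm{tr}(W^2 + W'W).$$

$\tilde{\beta}$ is the restricted MLE under $H_0^a$, which yields OLS, $\tilde{u}$ denotes the OLS residuals, and $\tilde{\sigma}_\nu^2 = \tilde{u}'\tilde{u}/NT$. $R$ is a generalization of a similar term defined in Anselin (1988b) for the LM test of no spatial dependence in the cross-section case. In fact, $R$ can be interpreted as $NT$ times the regression coefficient of $(I_T \otimes W)y$ on $\tilde{u}$.

Here, the joint LM test $LM_J$ is the sum of two LM test statistics. The first is $LM_\rho = R^2/B$, which is the LM test statistic for testing $H_0^b$: $\rho = 0$ assuming there are no random region effects, i.e., assuming $\sigma_\mu^2 = 0$, see Anselin (1988a). $LM_\rho$ is asymptotically distributed as $\chi_1^2$ under $H_0^b$. The second is $LM_\mu = NTG^2/2(T-1)$, which is the LM test statistic for testing $H_0^c$: $\sigma_\mu^2 = 0$ assuming there is no spatial lag dependence, i.e., assuming that $\rho = 0$, see Breusch and Pagan (1980). Since $LM_\rho$ and $LM_\mu$ are asymptotically independent, $LM_J$ is asymptotically distributed as $\chi_2^2$ under $H_0^a$. It is important to point out that the asymptotic distributions of our test statistics are not explicitly derived in the paper, but they are likely to hold under a set of primitive assumptions similar to those developed by Kelejian and Prucha (2001).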
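A minimal sketch of how the joint statistic could be computed from pooled OLS output follows. Several things here are assumptions rather than the paper's own code: the helper names `spatial_lag` and `joint_lm`, the simulated ring-contiguity design, and in particular the explicit form used for $B$, which is taken to be the panel analogue of the cross-section term, $[(I_T \otimes W)X\tilde{\beta}]'M[(I_T \otimes W)X\tilde{\beta}]/\tilde{\sigma}_\nu^2 + T\,\mathrm{tr}(W^2 + W'W)$ with $M = I_{NT} - X(X'X)^{-1}X'$. The terms $R = \tilde{u}'(I_T \otimes W)y/\tilde{\sigma}_\nu^2$ and $G = \tilde{u}'(J_T \otimes I_N)\tilde{u}/\tilde{u}'\tilde{u} - 1$ use the OLS residuals $\tilde{u}$.

```python
import numpy as np

def spatial_lag(v, W, N, T):
    """Apply (I_T ⊗ W) to a stacked NT-vector ordered with t slow and i fast."""
    return (v.reshape(T, N) @ W.T).reshape(N * T)

def joint_lm(y, X, W, N, T):
    """Return (LM_J, LM_rho, LM_mu) computed from pooled OLS residuals."""
    NT = N * T
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # restricted MLE under the joint null
    u = y - X @ beta                                   # OLS residuals
    s2 = u @ u / NT
    R = u @ spatial_lag(y, W, N, T) / s2               # score for rho at the null
    WXb = spatial_lag(X @ beta, W, N, T)
    M_WXb = WXb - X @ np.linalg.lstsq(X, WXb, rcond=None)[0]   # M (I_T ⊗ W) X beta
    B = WXb @ M_WXb / s2 + T * np.trace(W @ W + W.T @ W)       # assumed form of B
    unit_sums = u.reshape(T, N).sum(axis=0)            # sum of residuals over t, per unit
    G = unit_sums @ unit_sums / (u @ u) - 1.0          # u'(J_T ⊗ I_N)u / u'u - 1
    lm_rho = R**2 / B
    lm_mu = NT * G**2 / (2.0 * (T - 1))
    return lm_rho + lm_mu, lm_rho, lm_mu

# Simulated example with both spatial lag dependence and random effects present.
rng = np.random.default_rng(42)
N, T, rho = 30, 5, 0.5
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5        # hypothetical ring contiguity
X = np.column_stack([np.ones(N * T), rng.standard_normal(N * T)])
beta_true = np.array([1.0, 2.0])
u_true = np.tile(rng.standard_normal(N), T) + rng.standard_normal(N * T)  # mu_i + nu_ti
A_inv = np.linalg.inv(np.eye(N) - rho * W)
y = ((X @ beta_true + u_true).reshape(T, N) @ A_inv.T).reshape(N * T)

lm_j, lm_rho, lm_mu = joint_lm(y, X, W, N, T)
```

Under the joint null, `lm_j` would be compared with a $\chi_2^2$ critical value and each component with a $\chi_1^2$ critical value. The sketch never forms the $NT \times NT$ Kronecker product, which keeps it usable for larger panels.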

Conditional LM Test for $H_0^d$: $\rho = 0$ (assuming $\sigma_\mu^2 > 0$)
When one uses $LM_\rho$ defined in (8) to test $H_0^b$: $\rho = 0$, one implicitly assumes that the random region effects do not exist. This may lead to incorrect inference, especially when $\sigma_\mu^2$ is large. To overcome this problem, we derive a conditional LM test for no spatial lag dependence assuming the possible existence of random region effects. The null hypothesis is $H_0^d$: $\rho = 0$ (assuming $\sigma_\mu^2 > 0$), and the conditional LM test statistic is given by

$$LM_{\rho|\mu} = R_1^2/B_1. \tag{9}$$

$LM_{\rho|\mu}$ in (9) is of the same form as $LM_\rho$ in (8). However, $R_1$ and $B_1$ are now different from $R$ and $B$, and they are based on different restricted ML residuals, namely $\hat{u}$, those of a random effects panel data model with no spatial lag dependence, see Baltagi (2005), rather than the OLS residuals $\tilde{u}$.

Conditional LM Test for $H_0^e$: $\sigma_\mu^2 = 0$ (assuming $\rho$ may or may not be zero)
Similarly, if one uses $LM_\mu$ defined in (8) to test $H_0^c$: $\sigma_\mu^2 = 0$, one implicitly assumes that the spatial lag dependence does not exist. This may lead to incorrect inference, especially when $\rho$ is large. To overcome this problem, we derive a conditional LM test for no random region effects given the existence of spatial lag dependence. The null hypothesis is $H_0^e$: $\sigma_\mu^2 = 0$ (assuming $\rho$ may not be zero), and the conditional LM test statistic is given by

$$LM_{\mu|\rho} = \frac{NT}{2(T-1)}G_1^2, \tag{10}$$

where $G_1 = T\bar{u}'P\bar{u}/\bar{u}'\bar{u} - 1$ and $\bar{u}$ denotes the restricted maximum likelihood residuals under the null hypothesis $H_0^e$, i.e., under a spatial lag dependence panel data model with no random effects. Note that $LM_{\mu|\rho}$ in (10) is of the same form as $LM_\mu$ in (8). However, $G_1$ differs from $G$ in that they are based on different restricted ML residuals. The former is based on $\bar{u}_t = y_t - \bar{\rho}Wy_t - X_t\bar{\beta}$, where $\bar{\rho}$ and $\bar{\beta}$ are the MLE of $\rho$ and $\beta$ in a spatial lag panel data model with no random effects, while the latter is based on the OLS residuals $\tilde{u}$.
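The conditional test for no random effects needs the restricted ML residuals of a spatial lag model without random effects. The sketch below is one simple way to obtain them and is not the paper's procedure: it assumes a concentrated likelihood in $\rho$, maximized by a crude grid search, with $\bar{\beta}(\rho) = (X'X)^{-1}X'[y - \rho(I_T \otimes W)y]$. The function names, the grid, and the simulated ring-contiguity design are all hypothetical. Given the residuals $\bar{u}$, it computes $G_1 = \bar{u}'(J_T \otimes I_N)\bar{u}/\bar{u}'\bar{u} - 1$, which equals $T\bar{u}'P\bar{u}/\bar{u}'\bar{u} - 1$, and then $NTG_1^2/2(T-1)$.

```python
import numpy as np

def spatial_lag(v, W, N, T):
    """Apply (I_T ⊗ W) to a stacked NT-vector ordered with t slow and i fast."""
    return (v.reshape(T, N) @ W.T).reshape(N * T)

def lm_mu_given_rho(y, X, W, N, T, grid=np.linspace(-0.95, 0.95, 381)):
    """Return (LM statistic, rho_bar) using ML residuals of a spatial lag model without random effects."""
    NT = N * T
    Wy = spatial_lag(y, W, N, T)
    # Residuals from regressing y and Wy on X; for a given rho the ML residual is e0 - rho * e1.
    e0 = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    e1 = Wy - X @ np.linalg.lstsq(X, Wy, rcond=None)[0]
    best_ll, rho_bar = -np.inf, 0.0
    for rho in grid:                                   # crude grid search (assumption)
        sign, logdet = np.linalg.slogdet(np.eye(N) - rho * W)
        if sign <= 0:
            continue                                   # skip values where I_N - rho W is not admissible
        e = e0 - rho * e1
        ll = T * logdet - 0.5 * NT * np.log(e @ e / NT)   # concentrated log-likelihood, up to a constant
        if ll > best_ll:
            best_ll, rho_bar = ll, rho
    u_bar = e0 - rho_bar * e1                          # y - rho_bar (I_T ⊗ W) y - X beta_bar
    unit_sums = u_bar.reshape(T, N).sum(axis=0)
    G1 = unit_sums @ unit_sums / (u_bar @ u_bar) - 1.0
    return NT * G1**2 / (2.0 * (T - 1)), rho_bar

# Simulated example (hypothetical design) with rho = 0.4 and random effects present.
rng = np.random.default_rng(7)
N, T, rho = 30, 5, 0.4
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5
X = np.column_stack([np.ones(N * T), rng.standard_normal(N * T)])
u_true = np.tile(rng.standard_normal(N), T) + rng.standard_normal(N * T)
A_inv = np.linalg.inv(np.eye(N) - rho * W)
y = ((X @ np.array([1.0, 2.0]) + u_true).reshape(T, N) @ A_inv.T).reshape(N * T)

lm_cond, rho_bar = lm_mu_given_rho(y, X, W, N, T)
```

A grid search is used only to keep the sketch dependency-free; a scalar optimizer over the same concentrated likelihood would serve the same purpose with finer resolution.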

The first-order and second-order derivatives
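From the log-likelihood function given in (6), with $u = y - \rho(I_T \otimes W)y - X\beta$, one can obtain the score equations as follows:

$$\frac{\partial L}{\partial \beta} = X'\Omega^{-1}u,$$

$$\frac{\partial L}{\partial \rho} = -T\sum_{i=1}^{N}\frac{\omega_i}{1 - \rho\omega_i} + u'\Omega^{-1}(I_T \otimes W)y,$$

$$\frac{\partial L}{\partial \sigma_\mu^2} = -\frac{1}{2}\mathrm{tr}\left[\Omega^{-1}(\iota_T\iota_T' \otimes I_N)\right] + \frac{1}{2}u'\Omega^{-1}(\iota_T\iota_T' \otimes I_N)\Omega^{-1}u,$$

$$\frac{\partial L}{\partial \sigma_\nu^2} = -\frac{1}{2}\mathrm{tr}\left[\Omega^{-1}\right] + \frac{1}{2}u'\Omega^{-1}\Omega^{-1}u,$$

with $\iota_T$ denoting a vector of ones of dimension $T$.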

The second-order derivatives are given by

Joint Test
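Under the null hypothesis $H_0^a$: $\rho = \sigma_\mu^2 = 0$, equation (1) becomes a regression model with no spatial lag dependence or random region effects. The variance-covariance matrix reduces to $\sigma_\nu^2 I_{NT}$ and the restricted MLE of $\beta$ is $\tilde{\beta}_{OLS}$, so that $\tilde{u} = y - X\tilde{\beta}_{OLS}$ are the OLS residuals and $\tilde{\sigma}_\nu^2 = \tilde{u}'\tilde{u}/NT$. This is clear from the score equations evaluated under $H_0^a$: $\rho = \sigma_\mu^2 = 0$:

$$\frac{\partial L}{\partial \beta} = \frac{X'u}{\sigma_\nu^2} = 0, \qquad \frac{\partial L}{\partial \sigma_\nu^2} = -\frac{NT}{2\sigma_\nu^2} + \frac{u'u}{2\sigma_\nu^4} = 0.$$

Therefore, the score with respect to $\theta' = (\rho, \sigma_\mu^2, \sigma_\nu^2, \beta')$, evaluated under the null hypothesis $H_0^a$: $\rho = \sigma_\mu^2 = 0$, is given by

$$\tilde{D} = \left(R, \; \frac{NT}{2\tilde{\sigma}_\nu^2}G, \; 0, \; 0'\right)',$$

where $R = \tilde{u}'(I_T \otimes W)y/\tilde{\sigma}_\nu^2$, using $\mathrm{tr}[W] = 0$, and $G = \tilde{u}'(J_T \otimes I_N)\tilde{u}/\tilde{u}'\tilde{u} - 1$. $R$ is a generalization of a similar term defined in Anselin (1988b) for the LM test of no spatial dependence in the cross-section case. In fact, $R$ can be interpreted as $NT$ times the regression coefficient of $(I_T \otimes W)y$ on $\tilde{u}$.

Let $\tilde{J}$ denote the information matrix evaluated under $H_0^a$, partitioned conformably with $\theta$. Using partitioned inversion, the upper $2 \times 2$ block of the inverse matrix $\tilde{J}^{-1}$ is diagonal, with elements $B^{-1}$ and $2\tilde{\sigma}_\nu^4/NT(T-1)$, where $B = \tilde{J}_{\rho\rho} - \tilde{J}_{\rho\beta}\tilde{J}_{\beta\beta}^{-1}\tilde{J}_{\beta\rho}$. See Anselin and Bera (1998) for a similar $B$ term in the cross-section case.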
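The partitioned-inversion step can be checked numerically. The sketch below is illustrative only: the explicit elements of the information matrix under the joint null are assumptions obtained by taking expectations of the second-order derivatives when $\Omega = \sigma_\nu^2 I_{NT}$ and $\rho = 0$, and the design, parameter values, and variable names are hypothetical. It builds the matrix in the order $(\rho, \sigma_\mu^2, \sigma_\nu^2, \beta')$ and confirms that the upper $2 \times 2$ block of its inverse is diagonal with elements $1/B$ and $2\sigma_\nu^4/NT(T-1)$.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, k = 8, 3, 2
NT = N * T

# Hypothetical ring-contiguity weights: row-normalized, zero diagonal.
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5

X = rng.standard_normal((NT, k))
beta = np.array([1.0, -1.0])     # stands in for the OLS estimate
s2 = 1.7                         # stands in for the estimate of sigma_nu^2

WXb = ((X @ beta).reshape(T, N) @ W.T).reshape(NT)     # (I_T ⊗ W) X beta
trace_term = T * np.trace(W @ W + W.T @ W)

# Assumed information matrix under the joint null, order: rho, sigma_mu^2, sigma_nu^2, beta.
J = np.zeros((3 + k, 3 + k))
J[0, 0] = WXb @ WXb / s2 + trace_term
J[0, 3:] = J[3:, 0] = X.T @ WXb / s2
J[1, 1] = NT * T / (2 * s2**2)
J[1, 2] = J[2, 1] = NT / (2 * s2**2)
J[2, 2] = NT / (2 * s2**2)
J[3:, 3:] = X.T @ X / s2

upper_block = np.linalg.inv(J)[:2, :2]

# Closed form implied by partitioned inversion.
M_WXb = WXb - X @ np.linalg.solve(X.T @ X, X.T @ WXb)
B = WXb @ M_WXb / s2 + trace_term
expected = np.diag([1.0 / B, 2 * s2**2 / (NT * (T - 1))])
```

The zero off-diagonal element of the block is what makes the joint statistic split into two additive components.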
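Therefore, the joint LM statistic for $H_0^a$ is given by

$$LM_J = \tilde{D}'\tilde{J}^{-1}\tilde{D} = \frac{R^2}{B} + \frac{NT}{2(T-1)}G^2.$$

Conditional LM Test for $H_0^d$: $\rho = 0$ (assuming $\sigma_\mu^2 > 0$)
This section derives the conditional LM test for no spatial lag dependence given the existence of random region effects. The null hypothesis is $H_0^d$: $\rho = 0$ (assuming $\sigma_\mu^2 > 0$). Under the null, with $u = y - X\beta$, the score equations are given by

$$\frac{\partial L}{\partial \rho} = u'\Omega^{-1}(I_T \otimes W)y, \qquad \frac{\partial L}{\partial \beta} = X'\Omega^{-1}u,$$

$$\frac{\partial L}{\partial \sigma_\mu^2} = -\frac{1}{2}\mathrm{tr}\left[\Omega^{-1}(J_T \otimes I_N)\right] + \frac{1}{2}u'\Omega^{-1}(J_T \otimes I_N)\Omega^{-1}u, \qquad \frac{\partial L}{\partial \sigma_\nu^2} = -\frac{1}{2}\mathrm{tr}\left[\Omega^{-1}\right] + \frac{1}{2}u'\Omega^{-1}\Omega^{-1}u,$$

using $\mathrm{tr}[W] = 0$. Under the null hypothesis $H_0^d$, there is no spatial lag dependence and the variance-covariance matrix is $\Omega = \sigma_\mu^2 (J_T \otimes I_N) + \sigma_\nu^2 I_{NT}$. It is the familiar form of the one-way error component model, see Baltagi (2005). The restricted MLE of $\beta$, $\sigma_\mu^2$ and $\sigma_\nu^2$ are those based on MLE of a random effects panel data model with no spatial lag dependence. These are denoted by $\hat{\beta}$, $\hat{\sigma}_\mu^2$ and $\hat{\sigma}_\nu^2$, respectively. The corresponding restricted MLE residuals are denoted by $\hat{u}$. In fact, $\hat{\sigma}_1^2 = \hat{u}'P\hat{u}/N$ and $\hat{\sigma}_\nu^2 = \hat{u}'Q\hat{u}/N(T-1)$. Therefore, the score with respect to $\theta' = (\rho, \sigma_\mu^2, \sigma_\nu^2, \beta')$, evaluated under the null hypothesis $H_0^d$, is given by

$$\hat{D} = (R_1, \; 0, \; 0, \; 0')', \qquad R_1 = \hat{u}'\hat{\Omega}^{-1}(I_T \otimes W)y.$$

Let $\hat{J}$ denote the information matrix evaluated under $H_0^d$, partitioned conformably with $\rho$ and the remaining parameters $(\sigma_\mu^2, \sigma_\nu^2, \beta')$. Using partitioned inversion, we know that the upper $1 \times 1$ element of the inverse matrix $\hat{J}^{-1}$ is given by $\hat{J}^{11} = (\hat{J}_{11} - \hat{J}_{12}\hat{J}_{22}^{-1}\hat{J}_{21})^{-1}$. Here $B_1 = \hat{J}_{11} - \hat{J}_{12}\hat{J}_{22}^{-1}\hat{J}_{21}$. Therefore, the LM statistic for $H_0^d$ is given by

$$LM_{\rho|\mu} = \hat{D}'\hat{J}^{-1}\hat{D} = R_1 B_1^{-1} R_1 = R_1^2/B_1.$$

This is of the same form as $LM_\rho$ for testing $H_0^b$: $\rho = 0$ (assuming no random effects, i.e., $\sigma_\mu^2 = 0$). However, $R_1$ and $B_1$ are now different from $R$ and $B$. In fact, they are based on different restricted ML residuals, namely $\hat{u}$, those of a random effects panel data model with no spatial lag dependence, see Baltagi (2005), rather than the OLS residuals $\tilde{u}$.

Conditional LM Test for $H_0^e$: $\sigma_\mu^2 = 0$ (assuming $\rho$ may or may not be zero)
This section derives the conditional LM test for no random region effects given the existence of spatial lag dependence. The null hypothesis is $H_0^e$: $\sigma_\mu^2 = 0$ (assuming that $\rho$ may not be zero).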
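Under the null, with $u = y - \rho(I_T \otimes W)y - X\beta$, the score equations are given by

$$\frac{\partial L}{\partial \beta} = \frac{X'u}{\sigma_\nu^2}, \qquad \frac{\partial L}{\partial \rho} = -T\sum_{i=1}^{N}\frac{\omega_i}{1 - \rho\omega_i} + \frac{u'(I_T \otimes W)y}{\sigma_\nu^2},$$

$$\frac{\partial L}{\partial \sigma_\nu^2} = -\frac{NT}{2\sigma_\nu^2} + \frac{u'u}{2\sigma_\nu^4}, \qquad \frac{\partial L}{\partial \sigma_\mu^2} = -\frac{NT}{2\sigma_\nu^2} + \frac{u'(J_T \otimes I_N)u}{2\sigma_\nu^4}.$$

Under the null hypothesis $H_0^e$, the variance-covariance matrix reduces to $\sigma_\nu^2 I_{NT}$ and the restricted MLE of $\beta$ and $\rho$ are in fact the MLE of a spatial lag model with no random effects, see Anselin (1988a). These are denoted by $\bar{\beta}$ and $\bar{\rho}$. Here, $\bar{\sigma}_\nu^2 = \bar{u}'\bar{u}/NT$, with $\bar{u} = y - \bar{\rho}(I_T \otimes W)y - X\bar{\beta}$. Therefore, the score with respect to $\theta' = (\sigma_\mu^2, \beta', \sigma_\nu^2, \rho)$, evaluated under the null hypothesis $H_0^e$, is given by

$$\bar{D} = \left(\frac{NT}{2\bar{\sigma}_\nu^2}G_1, \; 0', \; 0, \; 0\right)',$$

where $G_1 = T\bar{u}'P\bar{u}/\bar{u}'\bar{u} - 1$.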