Testing for Heteroskedasticity and Serial Correlation in a Random Effects Panel Data Model

This paper considers a panel data regression model with heteroskedastic as well as serially correlated disturbances, and derives a joint LM test for homoskedasticity and no first-order serial correlation. The restricted model is the standard random individual error component model. It also derives a conditional LM test for homoskedasticity given serial correlation, as well as a conditional LM test for no first-order serial correlation given heteroskedasticity, all in the context of a random effects panel data model. Monte Carlo results show that these tests, along with their likelihood ratio alternatives, have good size and power under various forms of heteroskedasticity, including exponential and quadratic functional forms.


Introduction
The standard error component panel data model assumes that the disturbances have homoskedastic variances and constant serial correlation through the random individual effects, see Hsiao (2003) and Baltagi (2005). These may be restrictive assumptions for many panel data applications. For example, the cross-sectional units may vary in size and as a result may exhibit heteroskedasticity. Also, for the investment behavior of firms, for example, an unobserved shock this period may affect the behavioral relationship for at least the next few periods. In fact, the standard error components model has been extended to take into account serial correlation by Lillard and Willis (1978), Baltagi and Li (1995), Galbraith and Zinde-Walsh (1995), Bera, Sosa-Escudero and Yoon (2001) and Hong and Kao (2004), to mention a few. This model has also been generalized to take into account heteroskedasticity by Mazodier and Trognon (1978), Baltagi and Griffin (1988), Li and Stengos (1994), Lejeune (1996), Holly and Gardiol (2000), Roy (2002) and Baltagi, Bresson and Pirotte (2006), to mention a few. For a review of these papers, see Baltagi (2005). However, these two strands are almost separate in the panel data error components literature. When one deals with heteroskedasticity, serial correlation is ignored, and when one deals with serial correlation, heteroskedasticity is ignored. Exceptions are robust estimation of the variance-covariance matrix of the reported estimates. Baltagi and Li (1995), for example, derived a Lagrange Multiplier (LM) test which jointly tests for the presence of serial correlation as well as random individual effects, assuming homoskedasticity of the disturbances. Holly and Gardiol (2000), in turn, derived an LM statistic which tests for homoskedasticity of the disturbances in the context of a one-way random effects panel data model. The latter LM test assumes no serial correlation in the remainder disturbances.
This paper extends the Holly and Gardiol (2000) model to allow for first order serial correlation in the remainder disturbances as described in Baltagi and Li (1995). It derives a joint LM test for homoskedasticity and no first order serial correlation.

The restricted model is the standard random effects error component model. It also derives a conditional LM test for homoskedasticity given serial correlation, as well as a conditional LM test for no first-order serial correlation given heteroskedasticity. Monte Carlo results show that these tests, along with their likelihood ratio alternatives, have good size and power under various forms of heteroskedasticity, including exponential and quadratic functional forms.

The Model
Consider the following panel data regression model:

$$y_{it} = x_{it}'\beta + u_{it}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T, \qquad (1)$$

where $y_{it}$ is the observation on the dependent variable for the $i$th individual at the $t$th time period, and $x_{it}$ denotes the $k \times 1$ vector of observations on the nonstochastic regressors. The regression disturbances of (1) are assumed to follow a one-way error component model

$$u_{it} = \mu_i + \nu_{it},$$

where $\mu_i$ denote the random individual effects, which are assumed to be normally and independently distributed with mean 0 and variance

$$\sigma^2_{\mu_i} = h(z_i'\alpha).$$

The function $h(\cdot)$ is an arbitrary non-indexed (strictly) positive twice continuously differentiable function, see Breusch and Pagan (1979). $\alpha$ is a $p \times 1$ vector of unrestricted parameters and $z_i$ is a $p \times 1$ vector of strictly exogenous regressors which determine the heteroskedasticity of the individual specific effects. The first element of $z_i$ is one, and without loss of generality, $h(\alpha_1) = \sigma^2_\mu$. Therefore, when the model is homoskedastic with $\alpha_2 = \alpha_3 = \cdots = \alpha_p = 0$, this model reduces to the standard random effects model, as in Holly and Gardiol (2000). In addition, we allow the remainder disturbances to follow an AR(1) process: $\nu_{it} = \rho\nu_{i,t-1} + \epsilon_{it}$, with $|\rho| < 1$ and $\epsilon_{it} \sim IIN(0, \sigma^2)$, as described in Baltagi and Li (1995). The $\mu_i$'s are independent of the $\nu_{it}$'s, and $\nu_{i,0} \sim N(0, \sigma^2/(1-\rho^2))$.
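The model considered generalizes the one-way error component model to allow for heteroskedastic individual effects à la Holly and Gardiol (2000) and for first-order serially correlated remainder disturbances à la Baltagi and Li (1995). The model (1) can be rewritten in matrix notation as

$$y = X\beta + u, \qquad (2)$$

where $y$ is of dimension $NT \times 1$, $X$ is $NT \times k$, $\beta$ is $k \times 1$ and $u$ is $NT \times 1$. $X$ is assumed to be of full column rank. The disturbance in equation (2) can be written in vector form as

$$u = (I_N \otimes \iota_T)\mu + \nu, \qquad (3)$$

where $\iota_T$ is a vector of ones of dimension $T$, $I_N$ is an identity matrix of dimension $N$, $\mu' = (\mu_1, \cdots, \mu_N)$ and $\nu' = (\nu_{11}, \cdots, \nu_{1T}, \cdots, \nu_{N1}, \cdots, \nu_{NT})$. Under these assumptions, the variance-covariance matrix of $u$ can be written as

$$\Omega = \mathrm{diag}[h(z_i'\alpha)] \otimes J_T + I_N \otimes V, \qquad (4)$$

where $J_T$ is a matrix of ones of dimension $T$, $\mathrm{diag}[h(z_i'\alpha)]$ is a diagonal matrix of dimension $N \times N$, and $V$ is the familiar AR(1) covariance matrix. It is well established that the matrix

$$C = \begin{bmatrix} (1-\rho^2)^{1/2} & 0 & \cdots & 0 & 0 \\ -\rho & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & -\rho & 1 \end{bmatrix} \qquad (5)$$

transforms the usual AR(1) model into serially uncorrelated disturbances. For panel data, this has to be applied for $N$ individuals, see Baltagi and Li (1995). The transformed regression disturbances are given by

$$u^* = (I_N \otimes C)u = (1-\rho)(I_N \otimes \iota_T^\delta)\mu + (I_N \otimes C)\nu, \qquad (6)$$

where $\iota_T^\delta = (\delta, \iota_{T-1}')'$ and $\delta = \sqrt{(1+\rho)/(1-\rho)}$, so that $C\iota_T = (1-\rho)\iota_T^\delta$. Therefore, the variance-covariance matrix of the transformed model is given by

$$\Omega^* = E(u^*u^{*\prime}) = \mathrm{diag}[h(z_i'\alpha)](1-\rho)^2 \otimes \iota_T^\delta\iota_T^{\delta\prime} + \sigma^2 (I_N \otimes I_T). \qquad (7)$$

If we let $d^2 = \iota_T^{\delta\prime}\iota_T^\delta = \delta^2 + T - 1$, replace $\iota_T^\delta\iota_T^{\delta\prime}$ by $d^2 \bar J_T^\delta$ with $\bar J_T^\delta = \iota_T^\delta\iota_T^{\delta\prime}/d^2$, replace $I_T$ by $E_T^\delta + \bar J_T^\delta$ with $E_T^\delta = I_T - \bar J_T^\delta$, and collect like terms, we get

$$\Omega^* = \mathrm{diag}[\lambda_i^2] \otimes \bar J_T^\delta + \sigma^2 (I_N \otimes E_T^\delta), \qquad (8)$$

where $\lambda_i^2 = d^2(1-\rho)^2 h(z_i'\alpha) + \sigma^2$, from which it is easy to infer, see Wansbeek and Kapteyn (1982) and Baltagi and Li (1995), that

$$\Omega^{*r} = \mathrm{diag}[(\lambda_i^2)^r] \otimes \bar J_T^\delta + (\sigma^2)^r (I_N \otimes E_T^\delta), \qquad (9)$$

where $r$ is an arbitrary scalar. $r = -1$ obtains the inverse, while $r = -\frac{1}{2}$ obtains $\Omega^{*-1/2}$. In addition, one gets, see Magnus (1982),

$$|\Omega^*| = (\sigma^2)^{N(T-1)} \prod_{i=1}^{N} \lambda_i^2. \qquad (10)$$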
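The transformation step can be checked numerically. The following is a minimal sketch, assuming the standard Prais-Winsten form of the transformation matrix $C$ and $\delta = \sqrt{(1+\rho)/(1-\rho)}$; the function names, the values of $N$, $T$, $\rho$, $\sigma^2$ and the vector `h` (standing in for $h(z_i'\alpha)$) are illustrative choices, not taken from the paper.

```python
import numpy as np

def ar1_cov(T, rho, sigma2):
    """AR(1) covariance V with entries sigma2 * rho**|t-s| / (1 - rho**2)."""
    lag = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return sigma2 * rho ** lag / (1.0 - rho ** 2)

def prais_winsten(T, rho):
    """Assumed form of C: sqrt(1 - rho^2) in the top-left corner, quasi-differences below."""
    C = np.eye(T)
    C[0, 0] = np.sqrt(1.0 - rho ** 2)
    C[np.arange(1, T), np.arange(T - 1)] = -rho
    return C

# Illustrative values (hypothetical, not from the paper)
N, T, rho, sigma2 = 3, 5, 0.4, 2.0
h = np.array([0.5, 1.0, 2.5])            # stands in for h(z_i' alpha)

V = ar1_cov(T, rho, sigma2)
C = prais_winsten(T, rho)
whitened = C @ V @ C.T                    # should equal sigma2 * I_T

delta = np.sqrt((1.0 + rho) / (1.0 - rho))
iota_delta = np.r_[delta, np.ones(T - 1)]
d2 = iota_delta @ iota_delta              # d^2 = delta^2 + T - 1

# Omega and its transform Omega* = (I_N kron C) Omega (I_N kron C')
omega = np.kron(np.diag(h), np.ones((T, T))) + np.kron(np.eye(N), V)
IC = np.kron(np.eye(N), C)
omega_star = IC @ omega @ IC.T

# Spectral form: diag(lambda_i^2) kron Jbar_delta + sigma2 * (I_N kron E_delta)
J_bar = np.outer(iota_delta, iota_delta) / d2
E = np.eye(T) - J_bar
lam2 = d2 * (1.0 - rho) ** 2 * h + sigma2
omega_star_spectral = np.kron(np.diag(lam2), J_bar) + sigma2 * np.kron(np.eye(N), E)

print(np.allclose(whitened, sigma2 * np.eye(T)))
print(np.allclose(omega_star, omega_star_spectral))
```

Both checks compare the brute-force Kronecker construction with the closed form, so a mismatch in either would point to a misread definition of $C$, $\delta$ or $\lambda_i^2$.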

Joint LM Test
In this section, we derive the joint LM test for no heteroskedasticity and no first-order serial correlation in a random effects panel data model. The null hypothesis is given by

$$H_0^a: \alpha_2 = \cdots = \alpha_p = 0 \text{ and } \rho = 0. \qquad (11)$$
The log-likelihood function under normality of the disturbances is given by

$$L(\beta, \theta) = \text{const} - \frac{1}{2}\log|\Omega| - \frac{1}{2}u'\Omega^{-1}u = \text{const} + \frac{N}{2}\log(1-\rho^2) - \frac{1}{2}\log|\Omega^*| - \frac{1}{2}u^{*\prime}\Omega^{*-1}u^*, \qquad (12)$$

where (12) uses the fact that $\Omega = E(uu')$ is related to $\Omega^*$ by $\Omega^* = (I_N \otimes C)\,\Omega\,(I_N \otimes C')$, and $\theta = (\sigma^2, \rho, \alpha')'$. Since the information matrix is block diagonal between the $\theta$ and $\beta$ parameters, the part of the information matrix corresponding to $\beta$ will be ignored in computing the LM statistic, see Breusch and Pagan (1980). Under the null hypothesis $H_0^a$, the variance-covariance matrix reduces to $\Omega_a = \sigma^2_\mu (I_N \otimes J_T) + \sigma^2 I_{NT}$. This is the familiar one-way random effects error component model, see Baltagi (2005).

Using general formulas on log-likelihood differentiation, see Hemmerle and Hartley (1973) and Harville (1977), Appendix 1 derives the scores of the likelihood evaluated at the restricted MLE under $H_0^a$, and from them the score vector (14). In these expressions, $Z$ is an $N \times p$ matrix of observations on the $p$ variables $z_k$, $k = 1, 2, \ldots, p$, each of dimension $N \times 1$, and the restricted estimates of $\alpha_1$ and $\sigma^2$ are the solutions of $\partial L/\partial\alpha_1|_{H_0} = 0$ and $\partial L/\partial\sigma^2|_{H_0} = 0$, respectively. In addition, using the results of Harville (1977), the information matrix for $\theta$ under $H_0^a$ is derived in Appendix 1 as (15). Using (14) and (15), the LM statistic for the hypothesis (11) is given by (16), which is expressed in terms of $f$, $g = (g_1, \cdots, g_N)'$ and $\tilde J^{\rho\rho}$, where $\tilde J^{\rho\rho}$ is given by (15). In (16), the second equality follows from the fact that $f'Z(Z'Z)^{-1}Z'\iota_N = 0$ and the last equality uses $f = g - \iota_N$ and $g'\iota_N = N$. Under the null hypothesis $H_0^a$, the LM statistic of (16) is asymptotically distributed as $\chi^2_p$.
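The two algebraic facts used to simplify the LM statistic can be illustrated numerically. This is a minimal sketch under stated assumptions: `Z` is a hypothetical design whose first column is $\iota_N$, and `g` is an arbitrary positive vector rescaled so that $g'\iota_N = N$; neither is the paper's actual $g_i$, whose definition is given in Appendix 1. The sketch only checks that $f'Z(Z'Z)^{-1}Z'\iota_N = 0$ and that $f'Z(Z'Z)^{-1}Z'f = g'Z(Z'Z)^{-1}Z'g - N$ when $f = g - \iota_N$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 40, 3

# Hypothetical Z: first column is a vector of ones, as stated for z_i in the model.
Z = np.column_stack([np.ones(N), rng.normal(size=(N, p - 1))])

# Hypothetical g, rescaled so that g' iota_N = N.
g = rng.uniform(0.5, 1.5, size=N)
g *= N / g.sum()
iota = np.ones(N)
f = g - iota

# Projection matrix P = Z (Z'Z)^{-1} Z'; P @ iota = iota because iota is a column of Z.
P = Z @ np.linalg.solve(Z.T @ Z, Z.T)

cross = f @ P @ iota          # f' P iota = f' iota = g' iota - N = 0
lhs = f @ P @ f               # quadratic form in f
rhs = g @ P @ g - N           # same quantity written in terms of g

print(abs(cross) < 1e-10, np.isclose(lhs, rhs))
```

The identity holds for any `Z` containing a constant column, which is why the normalization of the first element of $z_i$ to one matters for the final form of the statistic.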

Conditional LM Tests
The joint LM test derived in the previous section is especially useful when one does not reject the null hypothesis $H_0^a$. However, if the null hypothesis is rejected, one cannot infer whether the presence of heteroskedasticity, the presence of serial correlation, or both caused this rejection. In this section, we derive two conditional LM tests. The first one tests for the absence of first-order serial correlation assuming that heteroskedasticity of the individual effects might be present. The second one tests for homoskedasticity assuming that first-order serial correlation might be present. Both are derived in the context of a random effects panel data model.
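For the first conditional LM test, the null hypothesis is given by

$$H_0^b: \rho = 0 \quad (\text{assuming some elements of } \alpha \text{ may not be zero}).$$

Under $H_0^b$, the variance-covariance matrix of the disturbances is given by

$$\Omega_b = \mathrm{diag}[h(z_i'\alpha)] \otimes J_T + \sigma^2 (I_N \otimes I_T).$$

Replacing $J_T$ by $T\bar J_T$ and $I_T$ by $E_T + \bar J_T$, where $\bar J_T = J_T/T$ and $E_T = I_T - \bar J_T$, and collecting like terms, one gets, see Wansbeek and Kapteyn (1982),

$$\Omega_b = \mathrm{diag}[w_i^2] \otimes \bar J_T + \sigma^2 (I_N \otimes E_T),$$

where $w_i^2 = T h(z_i'\alpha) + \sigma^2$. This also implies that

$$\Omega_b^{-1} = \mathrm{diag}[1/w_i^2] \otimes \bar J_T + \frac{1}{\sigma^2}(I_N \otimes E_T).$$

Using the general formula of Hemmerle and Hartley (1973), Appendix 2 derives the scores, where $\hat u = y - X\hat\beta_{MLE}$ denotes the restricted MLE residuals under $H_0^b$. Also, $\hat w_i^2 = T h(z_i'\hat\alpha) + \hat\sigma^2$, where $\hat\alpha$ and $\hat\sigma^2$ are the restricted MLE of $\alpha$ and $\sigma^2$ under $H_0^b$. Since the scores with respect to $\sigma^2$ and $\alpha$ vanish at the restricted MLE, the only nonzero element of the score vector under $H_0^b$ is the score with respect to $\rho$, denoted $\hat D(\rho)$. Appendix 2 also derives the information matrix with respect to $\theta = (\sigma^2, \rho, \alpha')'$ under $H_0^b$. Therefore, the resulting LM test statistic for testing $H_0^b: \rho = 0$ (assuming some elements of $\alpha$ may not be zero) is

$$LM_b = \hat D(\rho)^2\, \hat J_b^{\rho\rho}(\hat\theta),$$

where $\hat J_b^{\rho\rho}(\hat\theta)$ is the element of the inverse of the information matrix corresponding to $\rho$ evaluated under $H_0^b$. Under the null hypothesis, $LM_b$ is asymptotically distributed as $\chi^2_1$.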
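The Wansbeek and Kapteyn (1982) device of replacing $J_T$ by $T\bar J_T$ and $I_T$ by $E_T + \bar J_T$ makes powers of the covariance matrix trivial to compute. The sketch below checks this for the covariance structure under $H_0^b$, assuming $\Omega_b = \mathrm{diag}[h(z_i'\alpha)] \otimes J_T + \sigma^2 I_{NT}$; the values in `h`, and the helper name `omega_b_power`, are hypothetical.

```python
import numpy as np

# Illustrative values (hypothetical, not from the paper)
N, T, sigma2 = 4, 3, 1.5
h = np.array([0.5, 1.0, 2.0, 3.5])        # stands in for h(z_i' alpha)

J_T = np.ones((T, T))
J_bar = J_T / T                            # averaging matrix
E_T = np.eye(T) - J_bar                    # deviations-from-mean matrix
w2 = T * h + sigma2                        # w_i^2 = T h(z_i' alpha) + sigma^2

# Brute-force covariance matrix under rho = 0
omega_b = np.kron(np.diag(h), J_T) + sigma2 * np.eye(N * T)

def omega_b_power(r):
    """Omega_b^r via the spectral form diag[(w_i^2)^r] kron Jbar + (sigma^2)^r (I kron E)."""
    return np.kron(np.diag(w2 ** r), J_bar) + sigma2 ** r * np.kron(np.eye(N), E_T)

inverse_ok = np.allclose(omega_b_power(-1.0) @ omega_b, np.eye(N * T))
half = omega_b_power(-0.5)
sqrt_ok = np.allclose(half @ omega_b @ half, np.eye(N * T))
det_ok = np.isclose(np.linalg.det(omega_b), np.prod(w2) * sigma2 ** (N * (T - 1)))

print(inverse_ok, sqrt_ok, det_ok)
```

Because $\bar J_T$ and $E_T$ are idempotent and orthogonal to each other, any power acts separately on the two coefficients, which is what makes the scores and the information matrix tractable.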
The second conditional LM test considers the null hypothesis

$$H_0^c: \alpha_2 = \cdots = \alpha_p = 0 \quad (\text{given } \sigma^2_\mu > 0 \text{ and } \rho > 0).$$

The variance-covariance matrix of the disturbances under $H_0^c$ is given by

$$\Omega_c = \sigma^2_\mu (I_N \otimes J_T) + I_N \otimes V,$$

where $V = \sigma^2\Sigma$ and $\Sigma = \frac{1}{1-\rho^2}R$, where $R$ is the usual AR(1) correlation matrix. Denote by $F = \partial R/\partial\rho$. Using the general formula of Hemmerle and Hartley (1973), Appendix 3 derives the scores under $H_0^c$, where $\hat u = y - X\hat\beta_{MLE}$ denotes the restricted maximum likelihood residuals under the null hypothesis $H_0^c$. Also, $\hat\rho$, $\hat\sigma^2$ and $\hat\alpha_1$ are the restricted ML estimates of $\rho$, $\sigma^2$ and $\alpha_1$ under $H_0^c$. Note that for $k = 1$, $z_{i1} = 1$, and the corresponding score $\partial L/\partial\alpha_1$ vanishes at the restricted MLE. Appendix 3 also gives the resulting score vector under $H_0^c$ and derives the information matrix with respect to $\theta = (\sigma^2, \rho, \alpha')'$ under $H_0^c$. Therefore, the resulting LM test statistic for testing $H_0^c: \alpha_2 = \cdots = \alpha_p = 0$ (given $\sigma^2_\mu > 0$ and $\rho > 0$) reduces to $LM_c$, which has the familiar form of the LM test for heteroskedasticity of Breusch and Pagan (1979). However, this one uses the random effects MLE residuals rather than OLS residuals.
Under the null hypothesis $H_0^c$, $LM_c$ is asymptotically distributed as $\chi^2_{p-1}$.
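The matrix $F = \partial R/\partial\rho$ enters the score with respect to $\rho$. A minimal sketch of $R$ and $F$ follows, assuming the usual AR(1) correlation matrix with entries $\rho^{|t-s|}$; the function names are hypothetical, and the analytic derivative is compared with a central finite difference as a sanity check.

```python
import numpy as np

def ar1_corr(T, rho):
    """AR(1) correlation matrix R with entries rho**|t-s|."""
    lag = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return rho ** lag

def ar1_corr_deriv(T, rho):
    """F = dR/d rho, entrywise |t-s| * rho**(|t-s| - 1); the diagonal is constant, so zero."""
    lag = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return lag * rho ** np.maximum(lag - 1, 0)

T, rho, step = 5, 0.3, 1e-6
F = ar1_corr_deriv(T, rho)

# Central finite difference of R with respect to rho
F_numeric = (ar1_corr(T, rho + step) - ar1_corr(T, rho - step)) / (2.0 * step)
print(np.allclose(F, F_numeric, atol=1e-6))
```

At $\rho = 0$ the derivative reduces to a matrix with ones on the first off-diagonals and zeros elsewhere, which is the form that appears when the score for $\rho$ is evaluated under a null of no serial correlation.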

Monte Carlo Results
The design of the Monte Carlo experiments follows closely that of Baltagi et al. (2006) and Li and Stengos (1994). Consider the following simple regression model

$$y_{it} = \beta_0 + \beta_1 x_{it} + u_{it}, \quad u_{it} = \mu_i + \nu_{it}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T,$$

where $\beta_0 = 5$ and $\beta_1 = 0.5$. $x_{it}$ was generated from draws uniformly distributed on the interval $[0, 2]$. We choose $N = 50$, 100 and 200 and $T = 10$. For each $x_i$, we generate $T + 10$ observations and drop the first 10 observations in order to reduce the dependency on the initial values. In addition, $\nu_{it}$ follows a traditional AR(1) process, $\nu_{it} = \rho\nu_{i,t-1} + \epsilon_{it}$ with $\epsilon_{it} \sim IIN(0, \sigma^2)$. The autocorrelation coefficient $\rho$ varies from 0 to 0.5 in increments of 0.1.
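One replication of this design can be sketched as follows. This is an illustrative data-generating routine, not the authors' code: the function name `simulate_panel` and its arguments are hypothetical, the regressor recursion $x_{it} = w_{it} + 0.5\,w_{i,t-1}$ with $w_{it}$ uniform on $[0, 2]$ is an assumption (the exact formula is not given in the text above), and the individual effects are drawn homoskedastic for simplicity. The AR(1) start-up uses the stationary draw $\nu_{i,0} \sim N(0, \sigma^2/(1-\rho^2))$ stated in the model section.

```python
import numpy as np

def simulate_panel(N, T, rho, sigma2, sigma2_mu,
                   beta0=5.0, beta1=0.5, burn=10, seed=0):
    """Simulate one panel: y_it = beta0 + beta1 * x_it + mu_i + nu_it."""
    rng = np.random.default_rng(seed)

    # ASSUMPTION: x_it = w_it + 0.5 * w_{i,t-1}, w_it ~ U[0, 2].
    # T + burn values are built and the first `burn` are dropped.
    w = rng.uniform(0.0, 2.0, size=(N, T + burn + 1))
    x = (w[:, 1:] + 0.5 * w[:, :-1])[:, burn:]

    # AR(1) remainder disturbances with a stationary initial value.
    eps = rng.normal(0.0, np.sqrt(sigma2), size=(N, T))
    nu = np.empty((N, T))
    nu_prev = rng.normal(0.0, np.sqrt(sigma2 / (1.0 - rho ** 2)), size=N)
    for t in range(T):
        nu[:, t] = rho * nu_prev + eps[:, t]
        nu_prev = nu[:, t]

    # Homoskedastic individual effects (the alpha = 0 case).
    mu = rng.normal(0.0, np.sqrt(sigma2_mu), size=N)

    y = beta0 + beta1 * x + mu[:, None] + nu
    return y, x, mu, nu

y, x, mu, nu = simulate_panel(N=100, T=10, rho=0.4, sigma2=4.0, sigma2_mu=4.0)
print(y.shape, x.shape)
```

Drawing $\nu_{i,0}$ from its stationary distribution means no burn-in is needed for the disturbances themselves; the burn-in applies only to the regressor.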
For the individual heteroskedasticity, we adopt the Roy (2002) setup. More specifically, the variance of the individual effects is specified either as a quadratic function of $\bar x_{i.}$, denoted as quadratic heteroskedasticity, or as an exponential function of $\bar x_{i.}$, denoted as exponential heteroskedasticity, where $\bar x_{i.}$ is the individual mean of $x_{it}$. Denoting the expected variance of $\mu_i$ by $\bar\sigma^2_{\mu_i}$ and following Roy (2002) and Baltagi et al. (2006), we fix the expected total variance $\bar\sigma^2 = \bar\sigma^2_{\mu_i} + \sigma^2 = 8$ to make it comparable across the different data generating processes. We let $\sigma^2$ take the values 2, 4 and 6. For each fixed value of $\sigma^2$, $\alpha$ is assigned values 0, 1, 2 and 3, with $\alpha = 0$ denoting the homoskedastic individual specific error. For a fixed value of $\sigma^2$, we obtain a value of $\bar\sigma^2_{\mu_i} = (8 - \sigma^2)$ and, using a specific value of $\alpha$, we get the corresponding value for $\sigma^2_\mu$ from (34).

Table 1 gives the empirical size of the joint LM and LR tests for $H_0^a: \alpha = \rho = 0$ at the 5% significance level, when $N = 50$, 100 and 200 and $T = 10$. This is done for both quadratic and exponential heteroskedasticity, and for $\sigma^2 = 2$, 4, and 6. These correspond to cases where the percentage of the total variance due to the remainder errors is 25%, 50% and 75%, respectively. For 1000 replications, counts between 37 and 63 are not significantly different from 50 at the .05 level. Table 1 shows that, at the 5% level, the size of the joint LR and LM tests is not significantly different from 5%. Figures 1 and 2 give a sample of the power of the joint LM and LR tests for $N = 100$ and 200 and $T = 10$, for both quadratic and exponential heteroskedasticity, and for $\sigma^2 = 4$. This power is reasonably high as long as $\rho$ is larger than 0.2. For $\rho$ smaller than 0.2, the power increases with $\alpha$, and more so for exponential rather than quadratic heteroskedasticity. For fixed $\alpha$, $\rho$ and $\sigma^2$, this power increases as $N$ increases.

Table 2 gives the empirical size of the conditional LM and LR tests for the null hypothesis $H_0^b: \rho = 0$ (given $\alpha \neq 0$) at the 5% significance level, when $N = 50$, 100 and 200 and $T = 10$. This is done for both quadratic and exponential heteroskedasticity, and for $\sigma^2 = 2$, 4, and 6. The size of these conditional tests is not significantly different from 5% except in a few cases. For example, for exponential heteroskedasticity, $N = 50$, $\alpha = 1$, and $\sigma^2 = 6$, the sizes of the LM and LR tests were 7.7% and 7.4%, respectively. Figures 3 and 4 give a sample of the power of these conditional LM and LR tests for $N = 100$ and 200 and $T = 10$, for both quadratic and exponential heteroskedasticity, and for $\sigma^2 = 4$. This power is reasonably high as long as $\rho$ is larger than 0.2. For $\rho$ smaller than 0.2, the power increases with $N$, and is of about the same magnitude for both exponential and quadratic heteroskedasticity.
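The band of 37 to 63 rejections quoted for Table 1 can be reproduced with a short calculation. The sketch below assumes the band comes from a normal approximation to the binomial count of rejections with a two-sided 5% critical value of 1.96; the text does not state how the band was obtained, so this is one plausible reading, and the function name `size_band` is hypothetical.

```python
import math

def size_band(replications, nominal=0.05, z=1.96):
    """Range of rejection counts not significantly different from the nominal size."""
    expected = replications * nominal
    sd = math.sqrt(replications * nominal * (1.0 - nominal))
    lower = math.ceil(expected - z * sd)
    upper = math.floor(expected + z * sd)
    return lower, upper

lower, upper = size_band(1000)
print(lower, upper)

# Reported empirical sizes, converted to counts out of 1000 replications.
for size_pct in (7.7, 7.4, 7.6, 2.7):
    count = round(size_pct * 10)
    print(size_pct, lower <= count <= upper)
```

Under this reading, the reported sizes of 7.7%, 7.4%, 7.6% and 2.7% all fall outside the band, which matches their description in the text as the few cases where size differs significantly from 5%.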

Conditional Tests for $H_0^c$: $\alpha = 0$ (given $\rho \neq 0$)
Table 3 gives the empirical size of the conditional LM and LR tests for the null hypothesis $H_0^c: \alpha = 0$ (given $\rho \neq 0$) at the 5% significance level, when $N = 50$, 100 and 200 and $T = 10$. This is done for both quadratic and exponential heteroskedasticity, and for $\sigma^2 = 2$, 4, and 6. The size of these conditional tests is not significantly different from 5% except in a few cases. For example, for quadratic heteroskedasticity, $N = 50$, $\rho = 0.2$, and $\sigma^2 = 2$, the size of the LR test was 7.6% (oversized), while for exponential heteroskedasticity, $N = 50$, $\rho = 0.5$, and $\sigma^2 = 4$, the size of the LM test was 2.7% (undersized). Figures 5 and 6 give a sample of the power of these conditional LM and LR tests for $H_0^c: \alpha = 0$ (given $\rho \neq 0$) for $N = 100$ and 200 and $T = 10$, for both quadratic and exponential heteroskedasticity, and for $\sigma^2 = 2$ and 4. This power is low for $N = 100$ but improves for $N = 200$, especially as $\alpha$ increases, and more so for exponential rather than quadratic heteroskedasticity.

Conclusion
This paper simultaneously deals with heteroskedastic as well as serially correlated disturbances in the context of a panel data regression model. This is different from the standard econometrics literature, which usually deals with heteroskedasticity ignoring serial correlation or vice versa. Exceptions are robust estimation procedures which allow for a general variance-covariance matrix of the disturbances. The paper proposes a joint LM test for homoskedasticity and no first-order serial correlation, as well as a conditional LM test for homoskedasticity given serial correlation, and a conditional LM test for no first-order serial correlation given heteroskedasticity. Monte Carlo results show that these tests, along with their likelihood ratio alternatives, have good size and power under various forms of heteroskedasticity, including exponential and quadratic functional forms.

Wansbeek, T.J. and A. Kapteyn, 1982, A simple way to obtain the spectral decomposition of variance components models for balanced data, Communications in Statistics A11, 2105-2112.

Appendix 1
This appendix derives the joint LM test for testing $H_0^a: \alpha_2 = \cdots = \alpha_p = 0$ and $\rho = 0$. The variance-covariance matrix of the disturbances in (4) can be written as

$$\Omega = \mathrm{diag}[h(z_i'\alpha)] \otimes J_T + I_N \otimes V, \qquad (A.1)$$

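where $J_T$ is a matrix of ones of dimension $T$, $\mathrm{diag}[h(z_i'\alpha)]$ is a diagonal matrix of dimension $N \times N$, and $V$ is the familiar AR(1) covariance matrix. The log-likelihood function under normality of the disturbances is given by

$$L(\beta, \theta) = \text{const} - \frac{1}{2}\log|\Omega| - \frac{1}{2}u'\Omega^{-1}u,$$

where $\theta = (\sigma^2, \rho, \alpha')'$. The information matrix is block-diagonal between $\beta$ and $\theta$; since $H_0^a$ involves only $\theta$, the part of the information due to $\beta$ is ignored, see Baltagi (1995). In order to obtain the joint LM statistic, we need $D(\theta) = (\partial L/\partial\theta)$ and the information matrix $J(\theta) = E[-\partial^2 L/\partial\theta\partial\theta']$ evaluated at the restricted ML estimator $\tilde\theta$. Under the null hypothesis, the variance-covariance matrix reduces to $\Omega = \sigma^2_\mu (I_N \otimes J_T) + \sigma^2 (I_N \otimes I_T)$. It is the familiar form of the one-way error component model, see Baltagi (1995). Under the null hypothesis we obtain, in particular, $\Omega^{-1} = \frac{1}{\sigma_1^2}(I_N \otimes \bar J_T) + \frac{1}{\sigma^2}(I_N \otimes E_T)$, where $\sigma_1^2 = T\sigma^2_\mu + \sigma^2$, $\bar J_T = J_T/T$ and $E_T = I_T - \bar J_T$.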
Also, using the results of Harville (1977), we obtain the elements of the information matrix under the null hypothesis $H_0^a$. These can be collected in matrix form as $\tilde J_a(\theta)$. Using Searle (), the inverse of this partitioned matrix can be obtained, see (A.11). In the resulting simplification of the quadratic form in the scores, the fourth equality follows from the fact that the first column of $Z$ is $\iota_N$, and the last equality follows from the first-order condition in (A.6).
Therefore, the LM statistic for the hypothesis $H_0^a$ is obtained as in (A.14), where $\tilde J^{\rho\rho}$ is given by (A.9). The LM statistic of (A.14) has the familiar form used in testing for heteroskedasticity in Breusch and Pagan (1979). Under the null hypothesis, the LM statistic of (A.14) is asymptotically distributed as $\chi^2_p$.

Appendix 2
This appendix derives the conditional LM test for testing $H_0^b: \rho = 0$ (given $\alpha \neq 0$). The variance-covariance matrix of the disturbances is given by (A.1). Under $H_0^b$ we obtain the restricted covariance matrix and, from it, the quantities needed for the scores. Using the results of Hemmerle and Hartley (1973), we obtain the scores under the null hypothesis $H_0^b$, where $\hat\alpha$ is the ML estimator of $\alpha$ and $\hat\sigma^2$ is the solution of $D(\hat\sigma^2) = 0$ under $H_0^b$, and $h'(z_i'\hat\alpha)$ is the evaluated value of $\partial h(z_i'\alpha)/\partial (z_i'\alpha)$. Therefore, the partial derivatives under $H_0^b$ can be written in vector form, where $D(\hat\alpha) = (D(\hat\alpha_1), \cdots, D(\hat\alpha_p))'$. Also, using the results of Harville (1977), we obtain the information matrix under the null hypothesis $H_0^b$. Let $\hat W = \mathrm{diag}(\hat w_1^2, \cdots, \hat w_N^2)$ and $\hat H = \mathrm{diag}(h'(z_1'\hat\alpha), \cdots, h'(z_N'\hat\alpha))$; then the elements of the information matrix can be written in vector form, using (B.5). Thus, the information matrix with respect to $\theta = (\sigma^2, \rho, \alpha')'$ under $H_0^b$ can be written in vector form, and from it follows the LM test statistic for testing $H_0^b: \rho = 0$ (given $\alpha \neq 0$).

Appendix 3
This appendix derives the conditional LM test for testing

$$H_0^c: \alpha_2 = \cdots = \alpha_p = 0 \ (\text{given } \sigma^2_\mu > 0 \text{ and } \rho > 0) \quad \text{vs} \quad H_1^c: \text{not } H_0^c. \qquad (C.1)$$

The variance-covariance matrix of the disturbances is given by (A.1). Under $H_0^c$ we obtain the restricted covariance matrix in terms of $R$, where $R$ is the AR(1) correlation matrix. It is well established, see e.g. Kadiyala (1968), that the matrix $C$ transforms the usual AR(1) model into a serially uncorrelated regression with independent observations. Therefore, one can obtain the transformed covariance matrix $\Omega^*$ and its inverse $\Omega^{*-1}$. Since $\Omega$ is related to $\Omega^*$ by $\Omega^* = (I_N \otimes C)\,\Omega\,(I_N \otimes C')$, $\Omega^{-1}$ is given by

$$\Omega^{-1} = (I_N \otimes C')\,\Omega^{*-1}\,(I_N \otimes C),$$

which is simplified using $\iota_T^\delta = C\iota_T/(1-\rho)$ and $C'C = \Sigma^{-1}$.

2) Information Matrix
Also, using the formula of Harville (1977), we obtain