Prediction in a Generalized Spatial Panel Data Model with Serial Correlation

This paper considers the generalized spatial panel data model with serial correlation proposed by Lee and Yu (2012), which encompasses many of the spatial panel data models considered in the literature, and derives the best linear unbiased predictor (BLUP) for that model. This in turn provides the BLUP for several spatial panel models as special cases.


Introduction
Panel data have been used in forecasting gasoline demand across OECD countries, see Baltagi and Griffin (1997); residential electricity and natural-gas demand using a panel of American states, see Maddala, Trost, Li and Joutz (1997); world carbon dioxide emissions, see Schmalensee, Stoker and Judson (1998); growth rates of OECD countries, see Hoogstrate, Palm and Pfann (2000); cigarette sales using a panel of American states, see Baltagi and Li (2004); the impact of uncertainty on U.K. investment authorizations using a panel of U.K. industries, see Driver, Imai, Temple and Urga (2004); sales of state lottery tickets using panel data on postal (ZIP) codes, see Frees and Miller (2004); exchange rate determination using quarterly panel data on industrialized countries, see Rapach and Wohar (2004); migration to Germany from 18 source countries, see Brucker and Siliverstovs (2006); short-term forecasts of employment in a panel of 326 West German regional labor markets observed over the period 1987-2002, see Longhi and Nijkamp (2007); and annual growth rates of real gross regional product for a panel of Chinese regions, see Girardin and Kholodilin (2011), to mention a few. See Baltagi (2013) for a summary of selected empirical panel data forecasting applications.

Wansbeek and Kapteyn (1978), Lee and Griffiths (1979), and Taub (1979) were among the first contributions in econometrics to the problem of prediction in an error component panel data model. Baltagi and Li (1992) extended this prediction to the case of an error component panel model with serial correlation in the remainder disturbance term. Baltagi and Li (2004, 2006) extended it to the case of spatial autocorrelation in the remainder disturbance term, and Baltagi, Bresson and Pirotte (2012) carried out an extensive Monte Carlo study comparing forecasts in a spatial panel data model. See Baltagi (2013) for a recent survey in the Handbook of Forecasting.

This paper considers the generalized spatial panel data model with serial correlation proposed by Lee and Yu (2012), which encompasses many of the spatial panel data models considered in the literature, and derives the best linear unbiased predictor (BLUP) for that model. This in turn provides the BLUP for several spatial panel models as special cases.
Section 2 gives a brief description of the Lee and Yu (2012) generalized spatial panel data regression model with serial correlation, while Section 3 derives the best linear unbiased predictor (BLUP) for that model. The Lee and Yu (2012) model encompasses many of the spatial and panel regression models used in empirical economics. The BLUPs for these special cases are shown to follow easily from our BLUP derivation for the generalized model.

The Model
Lee and Yu (2012) considered the following generalized spatial panel data regression model with serial correlation, spatial autocorrelation and random effects:

$$y_{it} = x_{it}'\beta + u_{it}, \qquad i = 1, \ldots, N;\ t = 1, \ldots, T, \qquad (1)$$

where $y_{it}$ is the observation on the $i$th region for the $t$th time period, $x_{it}$ denotes the $k \times 1$ vector of observations on the nonstochastic regressors and $u_{it}$ is the regression disturbance. In vector form, the disturbance vector of Equation (1) is assumed to have random region effects, spatially autocorrelated residual disturbances and a first-order autoregressive remainder disturbance term. Here $u_t' = (u_{t1}, \ldots, u_{tN})$, and $\varepsilon_t$, $\nu_t$ and $e_t$ are similarly defined, while $\eta' = (\eta_1, \ldots, \eta_N)$ collects the region-specific terms. Equations (3) and (4) can also be rewritten in an equivalent form. Following Lee and Yu (2012), we employ the following assumptions:

Assumption 1: $W_1$, $W_2$, $M_1$ and $M_2$ are nonstochastic spatial weights matrices with zero diagonal elements.
Assumption 4: $W_1$, $W_2$, $M_1$ and $M_2$ are uniformly bounded in both row and column sums in absolute value.
Assumption 5: $N$ is large, whereas $T$ is finite.

As pointed out by Lee and Yu (2012), this model nests various spatial panel models in the literature, including the following:
1. When $\lambda_1 = 0$ and $\delta_1 = \delta_2 = 0$, the model reduces to the random effects spatial autoregressive (RE-SAR) model with serial correlation in the remainder disturbances considered by Baltagi, Song, Jung and Koh (2007).
4. When $\lambda_1 = \lambda_2 = 0$, $\delta_1 = 0$ and $\rho = 0$, the model reduces to the random effects spatial moving average (RE-SMA) model with no serial correlation described by Anselin, Le Gallo and Jayet (2008).

The model in Equation (1) can be rewritten in matrix notation as

$$y = X\beta + u, \qquad (8)$$

where $X$ is assumed to be of full column rank and its elements are assumed to be bounded in absolute value. The disturbance term can be written in vector form as in Equation (9), where $\nu' = (\nu_1', \ldots, \nu_T')$ and $u$ is similarly defined, $\iota_T$ is a vector of ones of dimension $T$, $I_T$ is an identity matrix of dimension $T$ and $\otimes$ denotes the Kronecker product.

Under the random effects model, Lee and Yu (2012) showed that the variance-covariance matrix of $u$ can be written as $\Omega$ in Equation (10), where $J_T$ is a matrix of ones of dimension $T$ and $E(\nu\nu')$ involves $V$, the familiar AR(1) variance-covariance matrix of dimension $T$ given in Equation (11).

One can easily verify that the Prais-Winsten transformation matrix $C$, as in Baltagi and Li (1992), removes the serial correlation implied by $V$. From Equation (9), the transformed spatial panel data regression disturbances are given by $u^* = (C \otimes I_N)u$, see Equation (13). Therefore, the variance-covariance matrix of the Prais-Winsten-transformed spatial panel data model is given by $\Omega^* = (C \otimes I_N)\,\Omega\,(C' \otimes I_N)$, where $C\iota_T = (1-\rho)\iota_T^{\alpha}$, $\iota_T^{\alpha\prime} = (\alpha, \iota_{T-1}')$ and $\alpha = \sqrt{(1+\rho)/(1-\rho)}$. Using $\bar{J}_T^{\alpha} = \iota_T^{\alpha}\iota_T^{\alpha\prime}/d^2$ with $d^2 = \iota_T^{\alpha\prime}\iota_T^{\alpha} = \alpha^2 + (T-1)$, and collecting like terms, see Baltagi and Li (1992), we get an equivalent expression for $\Omega^*$. Note that $\Omega$ in Equation (10) is related to $\Omega^*$ through this transformation. These expressions also define the $N \times N$ matrix $Z$ used below.

One can easily verify that Equation (17) is equivalent to the inverse of the variance-covariance matrix given by Lee and Yu (2012), see Magnus (1982). Under the assumption of normality, the log-likelihood function for this model is written in terms of $u^*$, given by Equation (13), and $\Omega^{*-1}$, given by Equation (16).

Assumption 7: Elements of the $N \times k$ matrix of regressors $X$ are nonstochastic and bounded, uniformly in $N$ and $T$. Also, under the asymptotic setting in Assumption 5, the limit of the normalized moment matrix of the regressors weighted by $\Omega^{-1}$ exists. The subscript 0 denotes the true value of $\phi$.

Under Assumptions 1-8, Lee and Yu (2012) establish consistency and asymptotic normality of the quasi-maximum likelihood estimator. They provide Matlab programs for these estimation methods. See also Millo (2014) for R programs performing maximum likelihood estimation of panel data models with random effects, a spatially lagged dependent variable and spatially and serially correlated errors. In this paper we are interested in prediction. This is taken up in the next section.
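The role of the Prais-Winsten transformation can be checked numerically. The sketch below is only an illustration: it assumes the textbook forms associated with Baltagi and Li (1992), namely an AR(1) covariance matrix with unit innovation variance, $V_{ts} = \rho^{|t-s|}/(1-\rho^2)$, and a matrix $C$ with $\sqrt{1-\rho^2}$ in its first diagonal position, ones elsewhere on the diagonal and $-\rho$ on the first subdiagonal. The function names are hypothetical, and the normalization used in the paper's Equation (11) may differ by a scale factor.

```python
import numpy as np

def ar1_covariance(rho: float, T: int) -> np.ndarray:
    """AR(1) covariance with unit innovation variance (assumed normalization)."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho**2)

def prais_winsten(rho: float, T: int) -> np.ndarray:
    """Prais-Winsten matrix: first row rescaled, remaining rows quasi-difference."""
    C = np.eye(T)
    C[0, 0] = np.sqrt(1.0 - rho**2)
    C[np.arange(1, T), np.arange(0, T - 1)] = -rho
    return C

rho, T = 0.5, 10
V = ar1_covariance(rho, T)
C = prais_winsten(rho, T)

# C V C' should be the identity, i.e. the transformed errors are white noise.
whitened = C @ V @ C.T

# C iota_T = (1 - rho) * iota_alpha, with iota_alpha = (alpha, 1, ..., 1)'.
alpha = np.sqrt((1.0 + rho) / (1.0 - rho))
iota_alpha = np.r_[alpha, np.ones(T - 1)]
d2 = iota_alpha @ iota_alpha  # equals alpha^2 + (T - 1)

print(np.allclose(whitened, np.eye(T)))
print(np.allclose(C @ np.ones(T), (1.0 - rho) * iota_alpha))
```

Setting `rho = 0.0` gives `C` equal to the identity and `alpha` equal to one, which is the simplification used in the special cases without serial correlation.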

BLUP
Goldberger (1962) showed that, for a known $\Omega$, the best linear unbiased predictor (BLUP) for the $i$th individual $s$ periods ahead, $y_{i,T+s}$, is given by

$$\hat{y}_{i,T+s} = x_{i,T+s}'\hat{\beta}_{GLS} + w'\Omega^{-1}\hat{u}_{GLS}, \qquad (20)$$

where $w = E(u\,u_{i,T+s})$ is the covariance between the future disturbance $u_{i,T+s}$ and the sample disturbances $u$. $\hat{\beta}_{GLS}$ is the GLS estimator of $\beta$ from Equation (8) based on the true $\Omega$. Also, $\hat{u}_{GLS} = y - X\hat{\beta}_{GLS}$ denotes the corresponding GLS residual vector. From Equation (9), $u_{i,T+s}$ can be rewritten as $u_{i,T+s} = \eta_i + \varepsilon_{i,T+s}$, which in turn can be expressed using $l_i'$, the $i$th row of $I_N$, and $\nu_{T+s}$, the $N \times 1$ vector of disturbances for the $(T+s)$th time period. Focusing on the last term of Equation (20), which we will call the Goldberger BLUP term, we get the two-term expression in Equation (21).

Consider the first term in Equation (21). Define $Z_1 = (A_1'A_1)Z$. Using Equation (13), the first term in Equation (21) can be expressed as in Equation (22), where $z_{1ik}$ is the $(i,k)$th element of $Z_1$ and $\hat{u}^*_{it}$ is the $it$th element of $\hat{u}^*_{GLS} = (C \otimes I_N)\hat{u}_{GLS}$.

Consider the second term in Equation (21). Notice that $\mu$ and $\nu_t$ are independent, and that $\Omega^{-1}$ in Equation (18) can be rewritten accordingly. Hence the second term in Equation (21) can be written as in Equation (25), where $z_{2ik}$ is the $(i,k)$th element of $Z_2$ and $\hat{u}^*_{GLS} = (C \otimes I_N)\hat{u}_{GLS}$. Combining Equations (22) and (25), one gets the Goldberger BLUP term in Equation (26).

Special case 1:
When $\lambda_1 = 0$ and $\delta_1 = \delta_2 = 0$, the model reduces to the random effects spatial autoregressive (RE-SAR) model with serial correlation considered by Baltagi, Song, Jung and Koh (2007). In this case, the Goldberger BLUP term given in Equation (26) reduces to Equation (27), where $z_{ik}$ and $g_{ik}$ are the $(i,k)$th elements of $Z$ and $(B_2'B_2)Z$, respectively. Equivalently, $g_{ik}$ can be written in an alternative form. This is Goldberger's BLUP extra term derived by Song and Jung (2002) for the random effects error component model with SAR correlation and serial correlation in the remainder disturbances.
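The Goldberger term $w'\Omega^{-1}\hat{u}$ can also be evaluated by brute force, which gives a useful numerical check on closed-form expressions. The sketch below rests on explicit assumptions that are not taken from the surviving text: the region component is $\eta = G_1\mu$ and the remainder is $\varepsilon_t = G_2\nu_t$ with $G_r = (I_N - \lambda_r W_r)^{-1}(I_N + \delta_r M_r)$, $\nu_t$ is a stationary AR(1) process with innovation variance $\sigma_e^2$, and the data are stacked with time as the slow index. The function names and toy weight matrix are hypothetical.

```python
import numpy as np

def spatial_filter(lam, W, delta, M):
    """Assumed SARMA-type filter G = (I - lam W)^{-1} (I + delta M)."""
    N = W.shape[0]
    return np.linalg.solve(np.eye(N) - lam * W, np.eye(N) + delta * M)

def goldberger_term(u_hat, i, s, sig2_mu, sig2_e, rho, G1, G2):
    """Brute-force w' Omega^{-1} u_hat for region i, s periods ahead.

    u_hat has shape (T, N); row t holds the residuals of all regions at time t.
    """
    T, N = u_hat.shape
    S1, S2 = G1 @ G1.T, G2 @ G2.T           # spatial covariance shapes
    t = np.arange(1, T + 1)
    V = rho ** np.abs(t[:, None] - t[None, :]) / (1.0 - rho**2)
    v_s = rho ** (T + s - t) / (1.0 - rho**2)  # cov of nu_t with nu_{T+s}
    Omega = sig2_mu * np.kron(np.ones((T, T)), S1) + sig2_e * np.kron(V, S2)
    w = sig2_mu * np.kron(np.ones(T), S1[:, i]) + sig2_e * np.kron(v_s, S2[:, i])
    return w @ np.linalg.solve(Omega, u_hat.reshape(-1))

# Toy example: N = 6 regions on a ring, T = 4 periods, simulated residuals.
rng = np.random.default_rng(0)
N, T = 6, 4
W = np.zeros((N, N))
for r in range(N):
    W[r, (r - 1) % N] = W[r, (r + 1) % N] = 0.5
u_hat = rng.normal(size=(T, N))

# No spatial or serial correlation: the plain random effects model.
G_id = spatial_filter(0.0, W, 0.0, W)
plain_re = goldberger_term(u_hat, 2, 1, 4.0, 1.0, 0.0, G_id, G_id)

# Special case 1 under the assumed structure: SAR remainder with AR(1).
G_sar = spatial_filter(0.4, W, 0.0, W)
re_sar_ar1 = goldberger_term(u_hat, 2, 1, 4.0, 1.0, 0.5, np.eye(N), G_sar)
print(plain_re, re_sar_ar1)
```

With all spatial and serial parameters set to zero, the brute-force value coincides with the familiar random effects correction $T\sigma_\mu^2/(T\sigma_\mu^2 + \sigma_e^2)$ times the time average of region $i$'s residuals. For realistic $N$ and $T$, the closed-form expressions are preferable because they avoid inverting an $NT \times NT$ matrix.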

Special case 2:
When $\lambda_1 = \lambda_2 = 0$ and $\delta_1 = \delta_2 = 0$, the model reduces to the random effects panel data model with AR(1) remainder error term and no spatial correlation considered by Baltagi and Li (1992). This model is special case 1, but with no spatial correlation. Substituting the corresponding terms into Equation (27), the Goldberger BLUP term given in Equation (26) reduces to an expression in which $l_{ik}$ is the $(i,k)$th element of $I_N$. When $s = 1$, it further reduces to Goldberger's BLUP extra term derived by Baltagi and Li (1992) for the random effects panel data model with AR(1) remainder error term and no spatial correlation.
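Two limiting cases give a quick sanity check for this special case. The sketch below assumes a stationary AR(1) remainder with innovation variance $\sigma_e^2$ and, because there is no spatial correlation, works with the $T$ residuals of a single region. It evaluates $w_i'\Omega_i^{-1}\hat{u}_i$ directly rather than using the closed form of Baltagi and Li (1992); the function name is hypothetical.

```python
import numpy as np

def blup_term_re_ar1(u_i, s, sig2_mu, sig2_e, rho):
    """Goldberger term for one region under random effects plus AR(1) errors."""
    T = u_i.size
    t = np.arange(1, T + 1)
    V = rho ** np.abs(t[:, None] - t[None, :]) / (1.0 - rho**2)
    Omega_i = sig2_mu * np.ones((T, T)) + sig2_e * V
    # Covariance of the sample disturbances with the disturbance at T + s.
    w_i = sig2_mu * np.ones(T) + sig2_e * rho ** (T + s - t) / (1.0 - rho**2)
    return w_i @ np.linalg.solve(Omega_i, u_i)

u_i = np.array([0.8, -0.3, 1.1, 0.4, -0.6])

# Pure AR(1), no region effect: the term is rho^s times the last residual.
pure_ar1 = blup_term_re_ar1(u_i, s=2, sig2_mu=0.0, sig2_e=1.0, rho=0.5)

# Pure random effects, no serial correlation: a shrunken residual mean.
pure_re = blup_term_re_ar1(u_i, s=1, sig2_mu=4.0, sig2_e=1.0, rho=0.0)
print(pure_ar1, pure_re)
```

In between these limits, the term mixes the last residual with the residual history of the region, with weights governed by $\rho$ and the variance components.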

Special case 3:
When $\lambda_1 = 0$, $\delta_1 = \delta_2 = 0$ and $\rho = 0$, the model reduces to the random effects spatial autoregressive (RE-SAR) model with no serial correlation considered by Anselin (1988). This is special case 1, but with no serial correlation. Note that $\rho = 0$ implies that $\alpha = 1$. Substituting these results into Equation (27), the Goldberger BLUP term given in Equation (26) reduces to an expression in which $c_{1ik}$ is the $(i,k)$th element of the matrix $C_1$ defined for this case and $\bar{u}_{k.} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_{kt}$. This is Goldberger's BLUP extra term derived by Baltagi and Li (2004, 2006) for the random effects error component model with SAR correlation in the remainder disturbances.

Special case 4:
When $\lambda_1 = \lambda_2 = 0$, $\delta_1 = 0$ and $\rho = 0$, the model reduces to the random effects spatial moving average (RE-SMA) model with no serial correlation described by Anselin, Le Gallo and Jayet (2008). Substituting the corresponding results into Equation (27), the Goldberger BLUP term given in Equation (26) reduces to an expression in which $c_{2ik}$ is the $(i,k)$th element of the matrix $C_2$ defined for this case and $\bar{u}_{k.} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_{kt}$. This is Goldberger's BLUP extra term derived by Baltagi and Li (2004, 2006) for the RE-SMA model with no serial correlation in the remainder disturbances.

Special case 5:
When $\lambda_1 = \lambda_2$, $\delta_1 = \delta_2 = 0$, $W_1 = W_2$ and $\rho = 0$, the model reduces to the spatial autoregressive random effects (SAR-RE) model with no serial correlation considered by Kapoor, Kelejian and Prucha (2007). Substituting the corresponding results into Equation (27), the Goldberger BLUP term given in Equation (26) reduces to an expression in which $l_i'$ is the $i$th row of $I_N$. This is Goldberger's BLUP extra term derived by Baltagi, Bresson and Pirotte (2012) for the SAR-RE model with no serial correlation in the remainder disturbances.

Special case 6:
When $\lambda_1 = \lambda_2 = 0$, $\delta_1 = \delta_2$, $M_1 = M_2$ and $\rho = 0$, the model reduces to the spatial moving average random effects (SMA-RE) model with no serial correlation considered by Fingleton. Substituting the corresponding results into Equation (27), the Goldberger BLUP term given in Equation (26) reduces to an expression in which $l_{ik}$ is the $(i,k)$th element of $I_N$ and $\bar{u}_{i.} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_{it}$. This is again equivalent to an expression in $l_i'$, the $i$th row of $I_N$. This is Goldberger's BLUP extra term derived by Baltagi, Bresson and Pirotte (2012) for the SMA-RE model with no serial correlation in the remainder disturbances, and it is the same as the one for the SAR-RE model with no serial correlation in the remainder disturbances considered in special case 5. Note, however, that the feasible predictor will be based on different estimates of the residuals and variance components once the model is estimated by maximum likelihood or generalized moments.

Special case 7:
When $\delta_1 = \delta_2 = 0$ and $\rho = 0$, the model reduces to the generalized random effects spatial autoregressive model with no serial correlation proposed by Baltagi, Egger and Pfaffermayr (2013). In this case, the relevant matrix is denoted by $C_3$. Substituting these results into Equation (27), the Goldberger BLUP term given in Equation (26) reduces to an expression in which $c_{3ik}$ is the $(i,k)$th element of $C_3$ and $\bar{u}_{k.} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_{kt}$.

Special case 8:
When $\lambda_1 = \lambda_2 = 0$, $\delta_1 = \delta_2 = 0$ and $\rho = 0$, the model reduces to the familiar random effects model without spatial or serial autocorrelation. Substituting the corresponding results into Equation (27), the Goldberger BLUP term given in Equation (26) reduces to an expression in which $l_{ik}$ is the $(i,k)$th element of $I_N$ and $\bar{u}_{i.} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_{it}$. This is again equivalent to an expression in $l_i'$, the $i$th row of $I_N$. This is Goldberger's BLUP extra term derived by Wansbeek and Kapteyn (1978), Lee and Griffiths (1979), and Taub (1979) for the random effects error component model. It is the same as the one for SAR or SMA correlation in the remainder disturbances in special cases 5 and 6, but with different estimates of the residuals and variance components once the model is estimated by maximum likelihood or generalized moments.

In order to make this forecast operational, $\hat{\beta}_{GLS}$ is replaced by its feasible GLS estimate and the variance components are replaced by their feasible estimates.

Monte Carlo Simulation
This section performs some Monte Carlo experiments to evaluate the performance of our proposed predictors for the random effects model with both serially correlated and spatially correlated disturbances. Baltagi, Bresson and Pirotte (2012) performed extensive Monte Carlo experiments to evaluate the performance of predictors for the random effects model with spatially correlated disturbances.

Following Baltagi, Bresson and Pirotte (2012), the data generating process starts with a simple panel data regression with random one-way error component disturbances. The variable $x_{it}$ was generated as $x_{it} = \delta_i + \xi_{it}$, where $\delta_i$ is a random variable uniformly distributed on the interval $[-7.5, 7.5]$ and $\xi_{it}$ is a random variable uniformly distributed on the interval $[-5, 5]$. We choose the same spatial weight matrix as Baltagi, Bresson and Pirotte (2012): the matrix $W$ is created such that its $i$th row has non-zero elements in positions $i-5, \ldots, i-1$ and $i+1, \ldots, i+5$. Therefore, the $i$th element of $u$ is directly related to the five elements immediately before it and the five elements immediately after it. This matrix is defined in a circular world, so that the non-zero elements in rows 1 and $N$ are, respectively, in positions $(2, 3, 4, 5, 6, N-4, N-3, N-2, N-1, N)$ and $(1, 2, 3, 4, 5, N-5, N-4, N-3, N-2, N-1)$. This matrix is row normalized so that all of its non-zero elements are equal to 1/10. As in Kapoor, Kelejian and Prucha (2007), this weighting matrix is referred to as "5 ahead and 5 behind".

The remainder disturbances $u_{it}$ were generated as a spatially correlated process with the following data generating processes (DGPs):
1. SAR: $\delta_1 = \delta_2 = 0$, while $\lambda_1$ and $\lambda_2$ take values (0, 0.2, 0.5, 0.8). These are reported in Tables 1-4 for $\rho$ = 0, 0.2, 0.5, 0.8.
2. SMA: $\lambda_1 = \lambda_2 = 0$, while $\delta_1$ and $\delta_2$ vary. These are reported in Tables 5-8 for $\rho$ = 0, 0.2, 0.5, 0.8.
3. SARMA: $\delta_1$, $\delta_2$, $\lambda_1$ and $\lambda_2$ take values (0.2, 0.5). These are reported in Tables 9-12 for $\rho$ = 0, 0.2, 0.5, 0.8.

The individual specific effect $\mu_i$ is a random variable distributed as $\mu_i \overset{iid}{\sim} N(0, 10)$. The remainder disturbances $\nu_{it}$ were generated as an AR(1) process with $\nu_{it} = \rho\nu_{i,t-1} + \varepsilon_{it}$, where $\varepsilon_{it}$ is a random variable distributed as $\varepsilon_{it} \overset{iid}{\sim} N(0, 10)$ and $\rho$ takes values (0, 0.2, 0.5, 0.8). Baltagi, Bresson and Pirotte (2012) considered several forecasts using panel data with spatial error correlation where the true data generating process was assumed to be a simple error component regression model with spatial remainder disturbances of the autoregressive or moving average type. Here, we extend this to the spatial autoregressive moving average type.

Predictions were made for only one period ahead. In order to depict the typical United States panel, the sample sizes $(N, T)$ in the different experiments were chosen as (49, 10). For each experiment, we perform 1,000 replications. For each replication we estimate the model using the first 10 years and forecast 1 year ahead. Following Baltagi, Bresson and Pirotte (2012), we report the sampling root mean square error (RMSE) of each of the predictors considered above, computed over $R = 1{,}000$ replications. Following Frees and Miller (2004) among others, we also summarize the accuracy of the forecasts using the mean absolute error (MAE); for example, Willmott and Matsuura (2005) show that MAE has advantages over RMSE. The RMSE and MAE of each of the predictors are reported in Tables 1-12.

The columns of these tables are labeled with the estimator used. The first column is OLS, the second column is the estimator for special case 1, which is a RE-SAR with AR(1) remainder error, the third column is the estimator for special case 2, which is a RE with AR(1) remainder error and no spatial correlation, etc. The last column is for the Generalized estimator for a SARMA with AR(1) remainder error. For each model except OLS, slope and variance component parameters are estimated using MLE. It is worth pointing out that MLE estimators from an incorrectly specified model may affect the properties of the forecast. OLS is consistent but not efficient, and it ignores the heterogeneity in the panel and the spatial correlation. We include it for applied researchers who ignore spatial correlation and heterogeneity in the panel. Its predictions do not use the Goldberger correction and perform badly in the Monte Carlo experiments, as the BLUP theory predicts.

Overall, forecasts one year ahead based on OLS, an estimator that ignores heterogeneity, spatial correlation and time autocorrelation, perform the worst in terms of RMSE in all tables. In tables with $\rho = 0$, predictors one year ahead based on estimators that do not correct for serial correlation perform well in terms of RMSE. As $\rho$ increases to 0.5 and 0.8, predictors one year ahead based on estimators that correct for serial correlation perform well in terms of RMSE. In Table 12, where the DGP is a SARMA with $\rho = 0.8$, the best RMSE is obtained by cases 1, 2, and the General predictor, all of which take care of serial correlation.

Predictors that account for time autocorrelation improve the forecast performance by a big margin. Predictors that account for spatial correlation improve the forecast, but by a smaller margin. These findings are consistent with those in Baltagi, Bresson and Pirotte (2012). This is true whether the true model is SAR, SMA or SARMA with AR(1) remainder error.

In Table 4, where the true model is SAR with $\rho = 0.8$, OLS has an RMSE of 8.242 for $\lambda_1 = \lambda_2 = 0.8$. Correcting for heterogeneity using a random effects estimator, as in case 8, only drops this to 6.686. If we do RE correcting for serial and spatial correlation, this forecast RMSE drops to 4.579 for case 1 (SAR-RE) and 4.573 for the General (SARMA-RE) estimator. Note that ignoring the spatial correlation and correcting only for serial correlation, as in case 2 (RE with AR(1)), already drops this forecast RMSE to 4.604. Correcting for spatial correlation without correcting for serial correlation over time, as in cases 3-7, drops this RMSE only to the 6.692 to 6.726 range.

The results are similar in Table 8, where the true model is SMA with $\rho = 0.8$. OLS has an RMSE of 6.195 for $\delta_1 = \delta_2 = 0.8$. Correcting for heterogeneity using a random effects estimator, as in case 8, only drops this to 4.871. If we do RE correcting for serial and spatial correlation, this forecast RMSE drops to 3.301 for case 1 (SAR-RE) and 3.300 for the General (SARMA-RE) estimator. Note that ignoring the spatial correlation and correcting only for serial correlation, as in case 2 (RE with AR(1)), already drops this forecast RMSE to 3.302. Correcting for spatial correlation without correcting for serial correlation over time, as in cases 3-7, drops this RMSE only to the 4.867 to 4.873 range.

The same thing happens for Table 12, where the true model is SARMA with $\rho = 0.8$. OLS has an RMSE of 7.041 for $\lambda_1 = \lambda_2 = 0.8$ and $\delta_1 = \delta_2 = 0.5$. Correcting for heterogeneity using a random effects estimator, as in case 8, only drops this to 5.598. If we do RE correcting for serial and spatial correlation, this forecast RMSE drops to 3.814 for case 1 (SAR-RE) and 3.810 for the General (SARMA-RE) estimator. Note that ignoring the spatial correlation and correcting only for serial correlation, as in case 2 (RE with AR(1)), already drops this forecast RMSE to 3.821. Correcting for spatial correlation without correcting for serial correlation over time, as in cases 3-7, drops this RMSE only to the 5.591 to 5.601 range. Results for MAE yield similar findings to those for RMSE in Tables 1-12.

Conclusion
This paper derives Goldberger's (1962) best linear unbiased predictor (BLUP) for the generalized spatial panel data model with serial correlation proposed by Lee and Yu (2012). Since the latter model encompasses many of the spatial panel data models considered in the literature, this in turn provides the BLUP for several spatial panel models as special cases. Extensions of this BLUP should be applied to dynamic spatial panel models, see Baltagi, Fingleton and Pirotte (2014), and to panel data models with a spatial lag, as well as higher order autoregressive and moving average processes, see Baltagi and Liu (2013a, 2013b). Furthermore, applied researchers may be interested in confidence intervals for serially dependent data, see Lahiri and Yang (2013) for an example. One might be interested in obtaining confidence intervals for $\hat{y}_{i,T+s}$. This leaves a potential research topic for the future.

Notes: N = 49, T = 10. 1,000 replications and 1 year ahead forecasting.
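The ingredients of the Monte Carlo design can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for the reported tables: the 10 in $N(0, 10)$ is read as a variance, the AR(1) process is started from its stationary distribution, and RMSE and MAE are taken as the usual square root of the mean squared forecast error and the mean absolute forecast error over all regions and replications. All function names are hypothetical.

```python
import numpy as np

def five_ahead_five_behind(N: int, k: int = 5) -> np.ndarray:
    """Circular 'k ahead and k behind' weight matrix, row normalized."""
    W = np.zeros((N, N))
    for i in range(N):
        for j in range(1, k + 1):
            W[i, (i + j) % N] = 1.0   # j positions ahead, wrapping around
            W[i, (i - j) % N] = 1.0   # j positions behind, wrapping around
    return W / W.sum(axis=1, keepdims=True)

def ar1_remainder(rng, N, T, rho, var_innov=10.0):
    """nu_it = rho * nu_i,t-1 + eps_it with eps_it ~ N(0, var_innov)."""
    sd = np.sqrt(var_innov)
    nu = np.empty((T, N))
    nu[0] = rng.normal(0.0, sd / np.sqrt(1.0 - rho**2), size=N)  # stationary start
    for t in range(1, T):
        nu[t] = rho * nu[t - 1] + rng.normal(0.0, sd, size=N)
    return nu

def regressor(rng, N, T):
    """x_it = delta_i + xi_it with uniform components as described in the text."""
    delta_i = rng.uniform(-7.5, 7.5, size=N)
    xi = rng.uniform(-5.0, 5.0, size=(T, N))
    return delta_i + xi   # delta_i is broadcast over the T rows

def rmse(errors) -> float:
    return float(np.sqrt(np.mean(np.square(errors))))

def mae(errors) -> float:
    return float(np.mean(np.abs(errors)))

rng = np.random.default_rng(2012)
N, T = 49, 11                     # 10 estimation years plus 1 forecast year
W = five_ahead_five_behind(N)
mu = rng.normal(0.0, np.sqrt(10.0), size=N)
nu = ar1_remainder(rng, N, T, rho=0.5)
x = regressor(rng, N, T)
print(np.flatnonzero(W[0]) + 1)   # 1-based positions of non-zero weights in row 1
```

Spatially correlated disturbances would then be obtained by applying the chosen SAR, SMA or SARMA filter built from `W` to `nu`, and the forecast errors for the held-out year would be passed to `rmse` and `mae`.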

Table 1: RMSE and MAE of Spatial Panel Data Predictors: SAR and ρ = 0
Table 2: RMSE and MAE of Spatial Panel Data Predictors: SAR and ρ = 0.2
Table 3: RMSE and MAE of Spatial Panel Data Predictors: SAR and ρ = 0.5
Table 4: RMSE and MAE of Spatial Panel Data Predictors: SAR and ρ = 0.8
Table 5: RMSE and MAE of Spatial Panel Data Predictors: SMA and ρ = 0
Table 6: RMSE and MAE of Spatial Panel Data Predictors: SMA and ρ = 0.2
Table 7: RMSE and MAE of Spatial Panel Data Predictors: SMA and ρ = 0.5
Table 8: RMSE and MAE of Spatial Panel Data Predictors: SMA and ρ = 0.8
Table 9: RMSE and MAE of Spatial Panel Data Predictors: SARMA and ρ = 0
Table 10: RMSE and MAE of Spatial Panel Data Predictors: SARMA and ρ = 0.2
Table 11: RMSE and MAE of Spatial Panel Data Predictors: SARMA and ρ = 0.5