Seemingly Unrelated Regressions with Spatial Error Components

This paper considers various estimators using panel data seemingly unrelated regressions (SUR) with spatial error correlation. The true data generating process is assumed to be SUR with spatial error of the autoregressive or moving average type. Moreover, the remainder term of the spatial process is assumed to follow an error component structure. Both maximum likelihood and generalized moments (GM) methods of estimation are used. Using Monte Carlo experiments, we check the performance of these estimators and their forecasts under misspecification of the spatial error process, various spatial weight matrices, and heterogeneous versus homogeneous panel data models.


Introduction
Since Zellner's (1962) seminal paper on seemingly unrelated regressions (SUR) analyzing multiple equations with correlated disturbances, various extensions have been proposed, for e.g., to deal with the serially correlated case, the nonlinear case, the misspecified case, and SUR with unequal number of observations, see Srivastava and Dwivedi (1979). 1 Of particular interest for this paper are the extensions of SUR to panel data utilizing the error component model, see Avery (1977), Baltagi (1980), Magnus (1982) and Prucha (1984) to mention a few. Some applications of SUR panel data with error components include Verbon (1980) who estimated a set of four labor demand equations, using data from the Netherlands on 18 industries over 10 semiannual periods covering the period 1972-79; Beierlein, Dunn and McConnon (1981) who estimated six equations describing the demand for electricity and natural gas in the northeastern United States using data on nine states comprising the Census Bureau's northeastern region of the USA for the period 1967-77; Brown, Kleidon and Marsh (1983) who studied the size-related anomalies in stock returns using a panel of 566 firms observed quarterly over the period June 1967to December 1975Howrey and Varian (1984) who estimated a system of demand equations for electricity by time of day. Their data were based on the records of 60 households whose electricity usage was recorded over a five-month period in 1976 by the Arizona Public Service Company; Sickles (1985) who modeled the technology and specific factor productivity growth in the US airline industry; Wan, Griffiths and Anderson (1992) who estimated production functions for rice, maize and wheat production using panel data on 28 regions of China over the period 1980-83; Baltagi, Griffin and Rich (1995) who estimated a SUR model consisting of a translog variable cost function and its corresponding input share equations for labor, fuel and material using panel data of 24 U.S. airlines over the period 1971-1986Egger and Pfaffermayr (2004) who used industry-level data of bilateral outward FDI stocks and exports of the U.S. and Germany to other countries between 1989 and 1999 to study the effects of distance as a common determinant of exports and foreign direct investment (FDI) in a three-factor New Trade Theory model; and more recently, Baltagi and Rich (2005) who estimated a SUR model consisting of a translog cost function and its corresponding in-1 put share equations for production workers, nonproduction workers, energy, materials, and capital utilizing the National Bureau of Economic Research (NBER) manufacturing productivity database file. The panel data covered 459 manufacturing industries at the SIC 4-digit level over the period 1959-1996. In addition, SUR models have been extended to allow for spatial autocorrelation, see Anselin (1988a,b). In fact, Anselin (1988a) derived a Lagrange Multiplier test for spatial autocorrelation in a SUR context. This paper extends Anselin's (1988a,b) SUR spatial model to the panel data case. This more general model allows for correlation across space, time and equations. It combines the simplicity of dealing with heterogeneity in the panel using an error component model and spatial correlation using a spatial autoregressive (SAR) or spatial moving average (SMA) disturbances. In this context, Wang and Kockelman (2007) derived the maximum likelihood estimator (under the normality assumption) of a SUR error component panel data model with SAR disturbances. They applied it to estimation of crash rates in 169 cities in China over the period 1999-2002. The next section presents the seemingly unrelated regressions (SUR) panel model with spatial correlated error components. Section 3 presents the various estimators considered including maximum likelihood and generalized moments (GM) methods. We propose extensions of the Kapoor, et al. (2007) GM method to deal with SUR panel with SAR error component structure. Also, extensions of the Fingleton (2008a) GM method and Wang and Kockelman (2007) maximum likelihood (ML) method to deal with SUR panel with SMA error component structure. Section 4 gives the Monte Carlo design. The true data generating process is assumed to be SUR with spatial error of the autoregressive (SAR) or moving average (SMA) type. Moreover, the remainder term of the spatial process is assumed to follow an error component structure. Section 5 gives the Monte Carlo results along with sensitivity checks of these results to misspecification of the spatial error process, various spatial weight matrices, heterogeneous versus homogeneous spatial and panel estimators, and their performance in out of sample prediction. Section 6 concludes. 2 2 SUR with spatially correlated error components Consider the set of M equations: where y j is (T N × 1), X j is (T N × k j ), β j is (k j × 1), and the (T N × 1) error vector ε j follows a spatial autoregressive (SAR) or a spatial moving average (SMA) process. Those processes can be expressed as : where I T is an identity matrix of order T , W jN is an (N × N) known spatial weights matrix, ρ j is the spatial autoregressive parameter and λ j is the spatial moving average parameter for equation j = 1, . . . , M . The diagonal elements of the spatial weight matrices W jN are zero. We assume that the matrices I N − ρ j W jN are non-singular, and that the row and column sums of the matrices W jN are bounded uniformly in absolute value for j = 1, . . . , M . The matrix of exogenous regressors X j has full column rank and its elements are uniformly bounded in absolute value. In contrast to much of the classical literature on panel data, we group the data by periods rather than units. This grouping is more convenient for modelling spatial correlation via (2). The disturbance term u j of the processes (2) follows an additive error components structure: where Z µ = ι T ⊗ I N , ι T is a (T × 1) vector of ones; µ j = µ 1j , . . . , µ Nj ′ and v j = (v 11j , ..., v N1j , ..., v 1T j , ..., v NT j ) ′ are random vectors with 0 means and covariance matrix for j and l = 1, 2, . . . , M , see Baltagi (1980). We note that the specification of u j corresponds to that of classical one-way error component model, see Baltagi (2008). In fact, if ρ j = 0 (resp. λ j = 0), ∀j = 1, 2, . . . , M , so that there is no spatial autocorrelation, then this reduces to the usual SUR panel model with error components suggested by Avery (1977) and Baltagi (1980). Following Baltagi (1980), the covariance matrix of u is given by where Ω jl is a typical submatrix of Ω u given by where (6) can also be written as with where J T = J T /T and σ 2 1 jl = σ 2 v jl + T σ 2 µ jl . The matrices Q 1 and Q 2 are symmetric, idempotent and orthogonal to each other. Furthermore, Q 1 + Q 2 = I T N , trQ 1 = N and trQ 2 = N (T − 1). Replacing Ω jl in (5) by its value, given in (6) we get where Σ u = Ω µ ⊗ J T + Ω v ⊗ I T . Alternatively, from (7) where Ω µ = σ 2 µ jl , Ω 1 = σ 2 1 jl and Ω v = σ 2 v jl , all of dimension (M × M ). Then, the inverse of that covariance matrix is given by or see Baltagi (1980). From (2) and (3), the spatial-RE specification of the (T N × 1) error vector ε j of equation j can be expressed as: with (14) is given by: or where A is a block-diagonal matrix defined as with typical block matrix A jj = I T ⊗ H Nj for j = 1, . . . , M . Following the properties of the matrices Ω u and A, we obtain the inverse covariance matrix of ε defined as or Consider the SUR spatial panel model given in (1) -(3). The true generalized least squares (GLS) estimator of β is given by with typical element of the jth equation The y * j and X * j can be viewed as the result of a spatial Cochrane-Orcutt type transformation of the original model. More specifically, premultiplication of (1) and (2) Stacking the set of M equations, we get with y * = A −1 y and X * = A −1 X. In light of the properties of (13), we can write Guided by the classical error component literature, we note that a convenient way of computing the GLS estimator β GLS is to further transform the model in (26) by premultiplying it by Ω −1/2 u . The GLS estimator of β is then identical to the OLS estimator of β computed from the resulting transformed model. Ω −1/2 v and Ω −1/2 1 can be obtained from a Cholesky decomposition of Ω v and Ω 1 , see Kinal and Lahiri (1990). We note that if ρ j = 0 (resp. λ j = 0), ∀j = 1, 2, . . . , M , so that there is no spatial autocorrelation, then the GLS estimator reduces to that proposed by Avery (1977) and Baltagi (1980) for the SUR panel data model.
Let ρ j (resp. λ j ), σ 2 1 jl and σ 2 v jl be estimators of ρ j (resp. λ j ), σ 2 1 jl and σ 2 v jl . The corresponding feasible GLS estimator of β, say β F GLS , is then obtained by replacing ρ j (resp. λ j ), σ 2 1 jl and σ 2 v jl by those estimators in the expression for the GLS estimator where X * = A −1 X and y * = A −1 y. This estimator can be easily computed as an OLS estimator on a transformed system of equations described above.
We propose a FGLS procedure that can be obtained in two steps : • Estimate each equation with SAR-RE (resp. SMA-RE) process using the GM spatial panel data estimator proposed by Kapoor, et al. (2007) (resp. Fingleton (2008a) to obtain consistent estimates of ρ j (resp. λ j ) for j = 1, . . . , M . We can also estimate ρ j (resp. λ j ) using the GM cross-section estimator proposed by Kelejian andPrucha (1999) (resp. Fingleton (2008b)). This computes the cross-sectional GM estimator for each equation with SAR disturbances (resp. SMA disturbances) for each time period and averages the estimates over time • Knowing the true disturbances u j , the analysis of variance estimates of Ω v and Ω 1 are given by . . . , u M ] is the N T ×M matrix of disturbances for all M equations, see Avery (1977) and Baltagi (1980). Using the consistent estimates of the residuals from step 1, one obtains consistent estimates of Ω v and Ω 1 .
The GM estimation method is computationally simple and yields consistent estimates under mild conditions given in Kapoor, et al. (2007). This was suggested as an alternative to the standard MLE (under normality of the disturbances) which is computationally demanding even for the single equation case. Under normality of the disturbances, the log-likelihood function is given by: 7 and basic mathematical manipulations result in the following: The parameters in (30) are intertwined, and the first order conditions of maximization are non-linear. However, the model can be estimated using a three-step method (see Wang and Kockelman (2007)) 2 : • First, β can be estimated using a feasible generalized least squares estimator (FGLS), conditional on Ω µ , Ω v and ρ (resp. λ), i.e., by maximizing the conditional likelihood L(β/ρ, Ω µ , Ω v ).
• Third, we maximize the concentrated log-likelihood function L(ρ/Ω µ , Ω v , β) over ρ (resp. λ). The optimized values of Ω µ , Ω v and β from the first two steps are plugged in the likelihood and the values of ρ are obtained by non-linear optimization. The estimated ρ (resp. λ) then re-enters the estimation of Ω µ , Ω v and β. This procedure is iterated until convergence.

Monte Carlo design
In this section, we consider the Monte Carlo design to study the small sample performance of several estimators of a SUR with spatial error components disturbances. The data generating process (DGP) considers two specifications on the remainder errors (2), namely SAR and SMA. We suppose that M = 2, then our spatial SUR specification is: or with β 0,1 = β 1,1 = β 0,2 = β 1,2 = 1, ι T N is an (T N ×1) vector of ones, (X 1 , X 2 ) are two explanatory variables. Following Baltagi, Egger and Pfaffermayr (2007), the DGP of x j,it , j = 1, 2, is defined by: with δ j,i ∼ iid.U (−7.5, 7.5) and ω j,it ∼ iid.U (−5, 5). The (2N T × 1) spatial-RE vector of the disturbances ε is: where the matrix A is defined by (19) with W 1 = W 2 = W N where W N is the spatial weight matrix defined by Kelejian and Prucha (1999). We use two weight matrices which essentially differ in their degree of sparseness. The weight matrices are labelled as "s ahead and s behind" with the non-zero elements being 1/(2s), s = 1 and 5. This row normalizes the weight matrices so that their elements sum to one. We generate the error components term as: The variance-covariance matrices Ω µ and Ω v are defined by: In order to generate the vector of disturbances (µ + v), we use the Cholesky decomposition 4 . We consider several individual and time dimensions N = (50, 100), T = (10, 20). For all experiments, 1000 replications are performed.
For each experiment, we consider the following 18 estimators: Homogeneous estimators (without spatial): 1. The pooled OLS equation by equation which ignores the individual heterogeneity, the spatial correlation and the correlation across equations.
2. The random effects (RE) estimator, equation by equation, which assumes that the µ i 's are iid(0, σ 2 µ ), and independent of the remainder disturbances v it 's. This estimator accounts for random individual effects but does not take into account the spatial autocorrelation nor the correlation across equations.
3. The fixed-effects (FE) estimator, equation by equation, which accounts for fixed individual effects but does not take into account the spatial autocorrelation and correlation across equations.
5. The SUR fixed effects (FE) estimator which ignores spatial autocorrelation but takes into account the correlation across equations.
Homogeneous estimators (with spatial): 8. The SUR-ML random effects (RE) estimator which takes into account the spatial autocorrelation of the SAR type.
9. The SUR-ML random effects (RE) estimator which takes into account the spatial autocorrelation of the SMA type.
10. The SUR-ML fixed effects (FE) estimator which takes into account the spatial autocorrelation of the SAR type.
11. The SUR-ML fixed effects (FE) estimator which takes into account the spatial autocorrelation of the SMA type.
12. The SUR-FGLS random effects (RE) estimator which takes into account the spatial autocorrelation of the SAR type, using the GM method.
In the first step, we estimate each equation with SAR-RE process using the GM spatial panel data estimator proposed by Kapoor, et al. (2007) to obtain consistent estimates of ρ j , j = 1, 2.
13. The SUR-FGLS random effects (RE) estimator which takes into account the spatial autocorrelation of the SMA type, using the GM method. In the first step, we estimate each equation with SMA-RE process using the GM spatial panel data estimator proposed by Fingleton (2008a) to obtain consistent estimates of λ j , j = 1, 2.
Heterogeneous estimator (without spatial): 14. The average heterogeneous OLS, equation by equation, to obtain a pooled estimator, see Pesaran and Smith (1995).
Heterogeneous estimators (with spatial): 15. The average heterogeneous SUR assuming a SAR specification on the remainder disturbances using Kelejian and Prucha (1999) GM approach to estimate ρ jt . This estimates cross-sectional GM-OLS with SAR disturbances for each time period and averages the estimates over time.
16. The average heterogeneous SUR assuming a SMA specification on the remainder disturbances using Fingleton (2008b) GM approach to estimate λ jt . This estimates cross-sectional GM-OLS with SMA disturbances for each time period and averages the estimates over time.
17. The SUR-FGLS random effects (RE) estimator which takes into account the spatial autocorrelation of the SAR type, using GM-Averagewithin residuals. In the first step, we estimate ρ j , j = 1, 2, using the GM cross-section estimator proposed by Kelejian and Prucha (1999). This computes the cross-sectional GM estimator for each equation with SAR disturbances for each time period and averages the estimates over time ρ j = 1/T T t=1 ρ jt , j = 1, 2. 18. The SUR-FGLS random effects (RE) estimator which takes into account the spatial autocorrelation of the SMA type, using GM-Averagewithin residuals. In the first step, we estimate λ j , j = 1, 2, using the GM cross-section estimator proposed by Fingleton (2008b). This computes the cross-sectional GM estimator for each equation with SMA disturbances for each time period and averages the estimates over time We focus on the estimates β 1,1 , β 1,2 , ρ 1 , λ 1 , ρ 2 , λ 2 , the standard errors σ β 1,1 , σ β 1,2 , and the variance components σ 2 Kapoor, et al. (2007), we adopt a measure of dispersion which is closely related to the standard measure of root mean square error (RMSE) defined as follows: where bias is the difference between the median and the true value of the parameter, and IQ is the interquantile range defined as c 1 − c 2 where c 1 is the 0.75 quantile and c 2 is the 0.25 quantile. Clearly, if the distribution is normal the median is the mean and, aside from a slight rounding error, IQ/1.35 is the standard deviation. In this case, the measure (36) reduces to the standard RMSE. Moreover, we check the prediction-performance of the 18 alternative estimators considered. Here, we use the usual RMSE criterion and compute the out of sample forecast errors for each predictor associated with the 18 estimators. An average RMSE is calculated across the N individuals at different forecasts horizons. Table 1 gives the RMSE for the various estimators considered when the true DGP is a SUR panel model with SAR-RE remainder disturbances. The sample size is (N, T ) = (50, 10), the weight matrix is W (1, 1), i.e., one neighbor behind and one neighbor ahead. The spatial coefficients are

RMSE performance of the estimators
Focusing on the RMSE of the slope coefficient of the first equation (β 1,1 ), we observe the following results: Not surprisingly, OLS and average OLS perform the worst because they ignore the spatial correlation, the individual heterogeneity and the cross-equation correlation. Taking into account only the cross-equation correlation by performing Zellner's SUR estimation ignoring the spatial effects and the individual heterogeneity reduces the RMSE from 0.02546 for OLS to 0.02234 for Zellner's SUR. Interestingly, if one performed RE or FE ignoring the spatial effects and the cross-equation correlation, the reduction in RMSE would have been even more (0.01776 and 0.02019, respectively). Correcting for both individual heterogeneity and cross-equation correlation by performing SUR-FE and SUR-FGLS RE reduces the RMSE further to 0.01769 and 0.01577, respectively.
Note also that SUR-RE leads to similar RMSE for feasible GLS and ML, respectively 0.01577 and 0.01593. Correcting for spatial correlation, individual heterogeneity and the cross-equation correlation by performing SUR-ML or SUR-FGLS SAR-RE yields the lowest RMSE of 0.01123 (for feasible GLS) and 0.01140 (for the corresponding ML). The RMSE for SUR SAR-FE using ML is 0.01308. If the wrong spatial structure was used in the estimation, i.e., SMA rather than SAR, the corresponding RMSE for SUR SMA-RE would be 0.01599 for ML and 0.01767 for the SUR-ML SMA-FE. Ignoring the individual effects but not the spatial correlation or the cross-equation correlation, by applying Average SUR SAR yield a RMSE of 0.01855. Interestingly, this RMSE remains almost the same had one misspecified the SAR process and performed Average SUR SMA. If we take into account the individual effects, the corresponding heterogeneous RMSE for SUR-FGLS SAR-RE (av.) is 0.01116 and 0.01213 for SUR-FGLS SMA-RE (av.).
Similar results are obtained had we focused on the slope coefficient of the second equation β 1,2 . Only the magnitudes of the RMSEs would have been different. For example, the RMSE of OLS is 0.02655, that of FE is 0.01566, that of RE is 0.01464. Zellner's SUR is 0.02332. SUR-FE is 0.01477 and SUR-FGLS RE is 0.01417 and 0.01432 for SUR-ML RE. The lowest RMSE is obtained for SUR SAR-RE (0.01230) whether FGLS or ML. Misspecifying the SAR process by a SMA process yields a RMSE of 0.01336 for SUR SMA-RE by FGLS and 0.01248 by ML. The corresponding RMSE for SUR-ML SMA-FE is 0.01512. The heterogeneous estimators yield a RMSE of 0.2685 for average OLS, 0.02146 for Average SUR SAR and 0.02081 for Average SUR SMA. The corresponding heterogeneous RMSE for SUR-FGLS SAR-RE (av.) is 0.01250 and 0.01232 for SUR-FGLS SMA-RE (av.). Table 2 gives the RMSE for the various estimators considered when the true DGP is a SUR panel model with SMA-RE remainder disturbances. The sample size is N = 50 and T = 10, the weight matrix is W (1, 1), i.e., one neighbor behind and one neighbor ahead. The spatial coefficients are Focusing on the RMSE of the slope coefficient of the first equation (β 1,1 ), we observe the following results: OLS and average OLS still perform the worst. Zellner's SUR (0.02136) performs better in terms of RMSE than OLS (0.02488) but worse than RE (0.01475), FE (0.01747), SUR-FE (0.01564), SUR-ML RE (0.01336) and SUR-FGLS RE (0.01343). Correcting for spatial correlation, individual heterogeneity and the cross-equation correlation by performing SUR-ML or SUR-FGLS SMA-RE yields the lowest RMSE of 0.01044 and 0.01025, respectively. If the wrong spatial structure was used in the estimation, i.e., SAR rather than SMA, the corresponding RMSE for SUR SAR-RE would be 0.01038 for FGLS, 0.01052 for ML. Similar results are obtained for the slope coefficient of the second equation β 1,2 . Table 3 gives the RMSE for the various estimators considered when the true DGP is a SUR panel model with SAR-RE remainder disturbances. All the parameters are the same as in Table 1, except the spatial coefficients which are now (ρ 1 , ρ 2 ) = (0.8, 0.5) rather than (0.5, 0.3) , implying higher spatial autocorrelation of the SAR type. Focusing on the RMSE of the slope coefficient of the first equation (β 1,1 ), we observe the same performance as in Table 1 but the magnitude of the RMSE almost doubles for some estimators.
For example, OLS and average OLS still perform the worst because they ignore the spatial correlation, the individual heterogeneity and the crossequation correlation. They yield RMSE that is now of the order of 0.0501 and 0.0505 rather than 0.0255 and 0.026, as in Table 1. Zellner's SUR estimation ignoring the spatial effects and the individual heterogeneity yields a RMSE of 0.0449 rather than 0.0223 as in Table 1. If one performed RE or FE ignoring the spatial effects and the cross-equation correlation, the RMSE would have been 0.0289 and 0.0345, rather than 0.0178 and 0.0202, as in Table 1. Correcting for both individual heterogeneity and cross-equation correlation by performing SUR-FE and SUR-FGLS RE yield RMSE of 0.0313 and 0.0273, rather than 0.0177 and 0.0158, as in Table 1.
SUR-RE leads to similar RMSE for feasible GLS and ML, 0.0273 and 0.0272 in Table 3 rather than 0.0159 and 0.0158, as in Table 1. Correcting for spatial correlation, individual heterogeneity and the cross-equation correlation by performing SUR-ML or SUR-FGLS SAR-RE yields the lowest RMSE of 0.0107 (for feasible GLS) and 0.0110 (for the corresponding ML). This is compared to 0.0112 and 0.0114 in Table 1. The RMSE for SUR SAR-FE using ML is 0.0116 in Table 3 compared to 0.0131 in Table 1. If the wrong spatial structure was used in the estimation, i.e., SMA rather than SAR, the corresponding RMSE for SUR SMA-RE would be 0.0125 for ML and 0.0138 for the SUR-ML SMA-FE in Table 3 compared to 0.0160 and 0.0177 in Table 1. Ignoring the individual effects but not the spatial correlation or the cross-equation correlation, by applying Average SUR SAR yield a RMSE of 0.0196 in Table 3 compared to 0.0186 in Table 1. Interestingly, this RMSE is very different from Average SUR SMA (the misspecified estimator) which is now 0.0222 in Table 3 rather than 0.0190 in Table 1. If we take into account the individual effects, the corresponding heterogeneous RMSE for SUR-FGLS SAR-RE (av.) is 0.0111 and 0.0148 for SUR-FGLS SMA-RE (av.) in Table 3 compared to 0.0112 and 0.0121 in Table 1.
Similar results are obtained had we focused on the slope coefficient of the second equation β 1,2 . Only the magnitudes of the RMSEs would have been different. For example, the RMSE of OLS is 0.0285 in Table 3 compared to 0.0266 in Table 1, that of FE is 0.0204 in Table 3 compared to 0.0157 in Table 1, that of RE is 0.0183 in Table 3 Table 3 compared to 0.0269 in Table 1. Average SUR SAR and Average SUR SMA yield RMSE of 0.0200 and 0.0207 in Table 3 compared to 0.0215 and 0.0208 in Table 1. The corresponding heterogeneous RMSE for SUR-FGLS SAR-RE (av.) and SUR-FGLS SMA-RE (av.) are 0.0121 and 0.0129 in Table 3 compared to 0.0125 and 0.0123 in Table 1. Table 4 gives the RMSE for the various estimators considered when the true DGP is a SUR panel model with SMA-RE remainder disturbances. Compared to Table 2, all the parameters are the same except the spatial coefficients which are now (λ 1 , λ 2 ) = (0.8, 0.5) rather than (0.5, 0.3) , implying higher spatial autocorrelation of the SMA type. Focusing on the RMSE of the slope coefficient of the first equation (β 1,1 ), we observe the following results: OLS and average OLS still perform the worst with RMSE of 0.0270 and 0.0274 in Table 4 compared to 0.0249 and 0.0251 in Table 2. Zellner's SUR yields a RMSE of 0.0236 in Table 4 compared to 0.0214 in Table 2. RE yields 0.0156 in Table 4 compared to 0.0148 in Table 2. FE yields 0.0189 in Table 4 compared to 0.0175 in Table 2. SUR-FE yields 0.0177 in Table  4 compared to 0.0156 in Table 2. SUR-ML RE yields 0.0151 in Table 4 compared to 0.0134 in Table 2. SUR-FGLS RE yields 0.0151 in Table 4 compared to 0.0134 in Table 2. Correcting for spatial correlation, individual heterogeneity and the cross-equation correlation by performing SUR-ML or SUR-FGLS SMA-RE yields the lowest RMSE of 0.0058 and 0.0061 in Table  4 compared to 0.0104 and 0.0103 in Table 2. If the wrong spatial structure was used in the estimation, i.e., SAR rather than SMA, the corresponding RMSE for SUR SAR-RE would be 0.0076 for FGLS, 0.0082 for ML in Table  4 compared to 0.0104 and 0.0105 in Table 2. Similar results are obtained for the slope coefficient of the second equation β 1,2 but the magnitudes of the RMSEs are higher. Table 5 gives the forecast RMSE results when the true DGP is a SUR panel model with SAR-RE remainder disturbances. The sample size is still N = 50, T = 10, and the weight matrix is W (1, 1). In general, for (ρ 1 , ρ 2 ) = (0.5, 0.3) with σ 2 µ 1 = σ 2 µ 2 = 1, σ 2 v 1 = σ 2 v 2 = 1 and ρ µ = ρ v = 0.5, the lowest forecast RMSE is that of SUR SAR-RE (ML, FGLS and FGLS (av.)). This is followed closely by SMA-RE. Misspecifying the SAR by an SMA in an error component model does not seem to affect the forecast performance as long as it is taken into account. However, the magnitudes of the RMSE in Table 5 (where the true DGP is a SAR-RE process) are higher than those in Table 6 (where the true DGP is a SMA-RE process). Once again, the forecast RMSE based on ML and FGLS are quite similar. Pooled OLS, SUR-FGLS, average heterogeneous OLS, average SUR SAR and average SUR SMA perform worse in terms of forecast RMSE than spatial/panel homogeneous estimators. This forecast performance is robust whether we are predicting one period, two periods or 5 periods ahead and is also reflected in the average over the five years. The gain in forecast performance is substantial once we account for RE or FE and is only slightly improved by additionally accounting for spatial autocorrelation.

Forecast Accuracy
Tables 7 and 8 lead to similar RMSE as those reported in Tables 5 and 6 except that the magnitudes of the RMSEs for the first equation are almost double for some forecasts. Compared to Tables 5 and 6, all the parameters are the same except for the spatial coefficients which are now higher (ρ 1 , ρ 2 ) = (λ 1 , λ 2 ) = (0.8, 0.5) rather than (0.5, 0.3) , implying higher spatial autocorrelation. In Table 7, when the true DGP is a SAR panel model with SAR-RE remainder disturbances, the average RMSE is around 2 for the first equation compared to 1 in Tables 5 and 6. OLS, SUR-FGLS, Average SUR SAR and Average SUR SMA continue to perform badly yielding the worst RMSE forecasts.

The spatial Weight Matrix effect
For the various estimators considered, Tables 9 and 10 report the RMSE results as Tables 1 and 2 except that the weight matrix is changed from a W (1, 1) to W (5, 5) , i.e., five neighbors behind and five neighbors ahead. Except for the magnitudes of the RMSE, the same rankings in terms of RMSE performance are exhibited as before.
For forecasts accuracy, Tables 11 and 12 report the forecast RMSE results as Tables 5 and 6 except that the weight matrix is now W (5, 5) rather than W (1, 1) . Except for the magnitudes of the forecast RMSE, the same rankings in terms of RMSE performance are exhibited as before. From our limited experiments, we conclude that our results are robust to the W matrices considered.

Stronger correlation across equations
In Table 13, we consider a set of experiments with higher correlation across equations. In particular, we let ρ µ = ρ v = 0.9 rather than ρ µ = ρ v = 0.5 as in Table 1. The sample size is still fixed at (N, T ) = (50, 10) , the weight matrix is W (1, 1), i.e., one neighbor behind and one neighbor ahead.; the spatial coefficients are (ρ 1 , ρ 2 ) = (0.5, 0.3) with σ 2 µ 1 = σ 2 µ 2 = 1, σ 2 v 1 = σ 2 v 2 = 1. Table  13 gives the RMSE for the various estimators considered when the true DGP is a SUR panel model with SAR-RE remainder disturbances. Focusing on the RMSE of the slope coefficient of the first equation (β 1,1 ), we observe that the estimators that correct for spatial correlation, individual heterogeneity and the cross-equation correlation continue to give the lowest RMSE. Comparing these results with those in Table 1, we find that the RMSE of the SUR-ML SAR-RE estimator is reduced from 0.01140 in Table 1 to 0.00614 in Table  13, while that of OLS increased from 0.02546 in Table 1 to 0.03048 in Table  13. The former takes into account the stronger cross-equation correlation, while the latter does not. In fact, the gain in RMSE, as we go from OLS to SUR-FGLS is more substantial in Table 13 than in Table 1. The former is a reduction of RMSE from 0.0255 to 0.0223, while the latter is a reduction of RMSE from 0.0305 to 0.0140. Similar comparisons substantiate this gain, with RMSE of SUR-FE falling from 0.0177 in Table 1 to 0.0089 in Table 13. Similar results are obtained had we focused on the slope coefficient of the second equation β 1,2 .
Does this gain in RMSE in the estimates translate into better RMSE forecasts? Table 14 gives the forecast RMSE results when the true DGP is a SUR panel model with SAR-RE remainder disturbances, generated by the corresponding estimates given in Table 13. Comparing the forecasts to those in Table 5 with weaker cross-equation dependence, we see that better estimates in terms of RMSE do translate into better RMSE forecasts for all the homogeneous estimators accounting for spatial effects and heterogeneity. However, the reduction in RMSE forecasts is not huge. This also is true for other estimators like FE, RE, SUR-FE, SUR-ML RE and SUR-FGLS RE, but it does not hold for SUR-FGLS for example.