Modelling and Testing for Structural Changes in Panel Cointegration Models with Common and Idiosyncratic Stochastic Trend

In this paper, we propose an estimation and testing framework for parameter instability in cointegrated panel regressions with common and idiosyncratic trends. We develop tests for structural change for the slope parameters under the null hypothesis of no structural break against the alternative hypothesis of (at least) one common change point, which is possibly unknown. The limiting distributions of the proposed test statistics are derived. Monte Carlo simulations examine size and power of the proposed tests.


Introduction
Estimation and testing for structural changes is an important research topic in time series econometrics. A recent Annals volume of the Journal of Econometrics entitled "Modelling structural breaks, long memory and stock market volatility" (edited by Anindya Banerjee and Giovanni Urga, 2005) and Perron (2006) offer the most recent comprehensive reviews on the topic. In contrast, the literature on the estimation and testing of structural changes in panel models is scarce, e.g., Han and Park (1989), Wolfson (1992, 1993), Joseph et al. (1997), Hansen (1999), Chiang et al. (2002), Kao (2001, 2002), Wachter and Tzavalis (2004) and Bai (2006). The estimation and testing for structural change in panels have many applications in economics. For example, fiscal/monetary policies may affect every unit in the economy (firms/regions), and stock market crashes in the US may cause a chain reaction in other stock markets around the world.
Despite the potential usefulness in economics, the econometric theory of the testing and estimation of structural changes in panels is still underdeveloped. This paper fills the gap in the literature by proposing an estimation and testing framework for parameter instability in cointegrated panel regressions. We derive tests for structural change for the slope parameters in panel cointegration models with cross-sectional dependence that is captured by the common stochastic trends. The tests are for the null hypothesis of no structural break against the alternative hypothesis of (at least) one common change point which is possibly unknown. This paper makes two contributions to the existing literature. First, we develop an asymptotic theory for the estimates of the parameters in the model. We consider both the case of observed and unobserved common shocks. Ordinary large panel asymptotic theory (Phillips and Moon, 1999; Kao, 1999) cannot be applied in our framework due to the strong cross-sectional dependence introduced by the common shocks.
We note that the limiting distributions of the common shocks coefficients are mixed normal, in contrast with the asymptotic normality found in the literature. Second, along similar lines as Andrews (1993), we derive the limiting distribution of a Wald-type test for the null hypothesis of no structural change at an unknown point in cointegrated panels where units are cross dependent. The tests we derive are based on functionals of the Wald-type statistic.
The organization of the paper is as follows. Section 2 introduces the model. Section 3 discusses asymptotics. The limiting distribution of the OLS estimator under the null of no structural change is established. Section 4 defines the test statistic. The limiting distributions of the proposed test are also derived. Section 5 discusses the local power. In Section 6 we report the finite sample properties, i.e., size and power, of our proposed tests. Section 7 provides concluding remarks. Some useful lemmas are given in Appendix A. In Appendix B we report the proofs of the main results in the paper.
We write the integral $\int_0^1 W(s)\,ds$ as $\int W$ when there is no ambiguity over limits. We define $\Omega^{1/2}$ to be any matrix such that $\Omega = (\Omega^{1/2})(\Omega^{1/2})'$.

Model and Assumptions
Consider the following panel model with common and idiosyncratic shocks, for $i = 1, \ldots, n$ and $t = 1, \ldots, T$, with a unit-specific individual effect. The slope parameters on the common shocks and on the individual-specific regressors are $R \times 1$ and $p \times 1$, respectively, $F_t = (F_{1t}, \ldots, F_{Rt})'$ is an $R \times 1$ vector of common stochastic trends, $x_{it}$ is a $p \times 1$ vector of observable $I(1)$ individual-specific regressors, and $u_{it}$ and the innovations to $F_t$ and $x_{it}$ are error terms. When the common shocks $F_t$ in (1) are not observable, we assume that $F_t$ can be estimated from a set of observable exogenous variables, $z_{it}$, that follow the factor structure in (4), where $\lambda_i$ is a vector of factor loadings and $e_{it}$ is the error term. It is important to point out that our model in (1) is a standard panel model with common slope coefficients, not a factor-loading model as in Bai (2004), for example. Similar to this paper, but not the same, is Stock and Watson (1999). In Stock and Watson's setup, $y_{it}$ in (1) (with $n = 1$) is the time series variable to be forecasted and $z_i = (z_{i1}, z_{i2}, \ldots, z_{iT})'$ is an $n$-dimensional multiple time series of candidate predictors.
The main aim of this paper is to develop test statistics for the constancy over time of the stacked slope vector, which collects the common shocks coefficients and the idiosyncratic shocks coefficients, when the change point is unknown. Considering the alternative hypothesis that there is only one change point $k$, three possible sets of alternative hypotheses can be considered as opposed to the null of no structural change: (1) only the common shocks coefficients may change, (2) only the idiosyncratic shocks coefficients may change, or (3) both sets of coefficients may be affected by the break.
Let the stacked slope vector be indexed by $t$. Given the null hypothesis that it is constant over $t$, the alternative $H_a$ could be defined as a one-time shift: the vector takes one value for $t = 1, \ldots, k$ and a different value for $t = k + 1, \ldots, T$. Note that testing for the constancy of the coefficient on the common factor, $F_t$, may have a different interpretation than the usual constancy of the slope parameter. This is the case especially when $F_t$ is not observed and has to be estimated, e.g., using the principal component estimator (see Bai, 2003, 2004; Bai and Ng, 2002, 2004). In this case, the estimated factor matrix, $\hat{F}$, is given by $T$ times the eigenvectors corresponding to the $R$ largest eigenvalues of the matrix $ZZ'$, where $Z = (z_1, z_2, \ldots, z_n)'$ is $T \times n$ with $z_i = (z_{i1}, z_{i2}, \ldots, z_{iT})'$. Since there is no guarantee that the $R$ largest eigenvalues will have the same order for each $t$, the corresponding eigenvectors will have different meanings over time. For example, in the term structure literature (see, e.g., Litterman and Scheinkman, 1991; Audrino et al., 2005), one usually uses a three-factor specification (level, slope and curvature) to explain the yield curves. The largest eigenvalue (and the corresponding eigenvector) for period $t$ may not be the same one in period $s$. This will make the common shocks coefficient non-constant. Thus, a non-constant coefficient may indicate instability in the factor structure and not merely lack of constancy of a slope parameter. Recently, Perignon and Villa (2006) provide some discussion on the stability of the latent factor structure of interest rates over time.
We need the following assumptions.
Assumption M1: Let $\omega_{it}$ stack $u_{it}$, the innovation $\varepsilon_t$ to the common trends, the innovation to the idiosyncratic regressors, and $e_{it}$. We assume that (a) $\omega_{it}$ is iid over $t$ and the invariance principle holds for the partial sums of $\omega_{it}$, so that for a given $i$,

Assumption M1(a) considers a framework in which no endogeneity of the regressors, serial dependence or contemporaneous correlation, other than that determined by the common shocks $F_t$, is allowed for. Extensions to allow for endogeneity of the regressors, serial correlation and weak cross-sectional dependence among the regression errors are straightforward. Assumption M1(a), therefore, is considered merely for the purpose of simplification. Assumption M1(b) is a standard requirement for factor analysis and it is needed when $F_t$ are not observable. Note that here we allow non-zero covariance between $\varepsilon_t$ and the innovation to the idiosyncratic regressors. Assumption M1(c) rules out cointegration among regressors. Assumption M1(d) is a standard requirement in the large panel factor literature. Assumption M2 is also standard. Assumption M3 states that the joint limit theory developed by Phillips and Moon (1999) holds for (5) and (6).
The following proposition is important for developing the asymptotics in this paper.
Proposition 1 states that the asymptotic magnitude of the cross term (7) below). The asymptotic mixed normality result in part (b) is also different from the distribution limit in equation (6), where asymptotic normality holds. This result is due to the shock $w_t$ being common to all units and $I(1)$.
We now turn to estimation of the slope parameters (under the null of no structural change).

Asymptotics of the Parameter Estimates Under the Null
In this section we provide asymptotics for the OLS estimator of model (1) under the null hypothesis of no structural change. We distinguish the case of $F_t$ observed from that where $F_t$ needs to be estimated.

$F_t$ is Observable
Consider the OLS estimator of the slope parameters. The following proposition characterizes its limiting distribution.
Proposition 2 Let Assumptions M1(a)-M1(d) and M3 hold. Then, as $(n, T) \to \infty$ it holds that

Proposition 2 states that the estimators of the common shocks coefficients and of the idiosyncratic shocks coefficients are asymptotically independent. This result is a consequence of Proposition 1. Note that the results in Proposition 2 have $\sqrt{n}T$ convergence, as in Phillips and Moon (1999) and Kao (1999). However, the limiting distribution of the estimated common shocks coefficients is different from that in the panel cointegration literature, where normality holds. The mixed normality found in our case is due to the shocks $w_t$ being nonstationary and common across units, which implies that the limiting moment matrix is random rather than constant, as it is in the standard panel cointegration case in (6).

$F_t$ is Unobservable
In order to estimate the slope parameters when $F_t$ is unobservable, we consider a two-step approach. First, we derive the estimator of the vector of common shocks, $\hat{F}_t$, using equation (4). We then plug this estimator into equation (1) to retrieve an estimate of the slope parameters.

Estimation of $F_t$
The common shocks $F_t$ can be estimated by the method of principal components (see, e.g., Bai (2004)). That is, $\hat{F}_t$ is found by solving a minimization problem subject to a normalization, where $z_{it}$ is given in (4). Let $F = (F_1, \ldots, F_T)'$ and let $Z = (z_1, z_2, \ldots, z_n)'$ be a $T \times n$ matrix with $z_i = (z_{i1}, z_{i2}, \ldots, z_{iT})'$. The estimator $\hat{F} = (\hat{F}_1, \ldots, \hat{F}_T)'$ is a $T \times R$ matrix given by $T$ times the eigenvectors corresponding to the $R$ largest eigenvalues of the $T \times T$ matrix $ZZ'$.
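The eigenvector construction just described can be sketched numerically. The following is a minimal illustration, not the authors' implementation (their programs were written in GAUSS): the function name `estimate_common_trends`, the simulated data and all dimensions are hypothetical. The only ingredient taken from the text is that $\hat{F}$ equals $T$ times the eigenvectors of $ZZ'$ associated with the $R$ largest eigenvalues; with orthonormal eigenvectors this scaling gives $\hat{F}'\hat{F}/T^2 = I_R$.

```python
import numpy as np

def estimate_common_trends(Z, R):
    """Principal-components estimate of the common trends.

    Z is a T x n data matrix; the result is the T x R matrix given by
    T times the eigenvectors of Z Z' for the R largest eigenvalues.
    """
    T = Z.shape[0]
    # eigh is appropriate because Z Z' is symmetric; eigenvalues come back ascending
    eigvals, eigvecs = np.linalg.eigh(Z @ Z.T)
    top = np.argsort(eigvals)[::-1][:R]   # indices of the R largest eigenvalues
    return T * eigvecs[:, top]

# Hypothetical simulated panel: R random-walk factors plus iid noise
rng = np.random.default_rng(0)
T, n, R = 50, 30, 2
F = np.cumsum(rng.standard_normal((T, R)), axis=0)   # I(1) common trends
loadings = rng.standard_normal((n, R))                # illustrative loadings
Z = F @ loadings.T + rng.standard_normal((T, n))

F_hat = estimate_common_trends(Z, R)
```

The estimate spans the factor space only up to an invertible transformation, so its columns should not be compared with those of `F` one by one.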
It is known that the solution to the above minimization problem is not unique, i.e., $\lambda_i$ and $F_t$ are not directly identifiable since they are identifiable only up to a transformation. Therefore, instead of estimating the factors $F_t$ (or the loadings $\lambda_i$), what one does by employing the principal component estimator is to estimate the space spanned by them up to an $R \times R$ transformation matrix, say $H$, thereby finding $HF_t$ instead of $F_t$. Therefore, computing the OLS estimator of the common shocks coefficient, for example, would result in estimating a version of it transformed by the inverse of $H$ rather than the coefficient itself. However, as far as testing is concerned, knowledge of $HF_t$ is the same as directly estimating $F_t$. Hence, for the purpose of notational simplicity, we assume $H$ to be an $R \times R$ identity matrix in this paper.
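A small numerical check may make the rotation argument concrete. This is a sketch under stated assumptions: the data, the coefficient vector `beta` and the matrix `H` are hypothetical, and the factor matrix stores $F_t'$ in its rows, so that replacing $F_t$ by $HF_t$ corresponds to post-multiplying by $H'$. Under that row convention the OLS coefficient on the rotated factors is the original coefficient premultiplied by the inverse of $H'$, while the fitted values are unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
T, R = 200, 2
F = np.cumsum(rng.standard_normal((T, R)), axis=0)   # rows are F_t'
beta = np.array([1.0, -0.5])                          # hypothetical coefficient
y = F @ beta + rng.standard_normal(T)

H = np.array([[2.0, 1.0],
              [0.5, 3.0]])                            # hypothetical invertible rotation
G = F @ H.T                                           # rows are (H F_t)'

b_F, *_ = np.linalg.lstsq(F, y, rcond=None)           # OLS on the true factors
b_G, *_ = np.linalg.lstsq(G, y, rcond=None)           # OLS on the rotated factors

# Same column space, hence identical fitted values
fitted_equal = np.allclose(F @ b_F, G @ b_G)
# The coefficient is transformed by the inverse of H' (row convention above)
coef_rotated = np.allclose(b_G, np.linalg.solve(H.T, b_F))
```

Because the fitted values and residuals coincide, any statistic built from them is unaffected by $H$, which is the sense in which the rotation is harmless for testing.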

Estimation of the Slope Parameters
The OLS estimator of the slope parameters is computed from where $\sigma_e^2 = \mathrm{Var}(e_{it})$ and the random variable $Q_B$ is defined as The following theorem characterizes the limiting distribution of the estimator when $F_t$ are not observable.
Theorem 1 Suppose Assumptions M1-M3 hold, with $n/T \to 0$ as $(n, T) \to \infty$. We get where

Note that the estimators of the common shocks coefficients and of the idiosyncratic shocks coefficients are asymptotically independent because the limiting moment matrix is asymptotically block diagonal, similarly to (9). The limiting distributions are essentially the same as those found in Proposition 2, the only difference with respect to (8) being the presence of an extra variance term in the limiting distribution of the estimated common shocks coefficients. This arises from the estimation error of the common shocks, $\hat{F}_t - F_t$.

Test Statistics
The asymptotic theory for the OLS estimator derived in Section 3 is used to derive the limiting distribution for the Wald-type statistic under the null hypothesis of no structural change. A variety of tests for a break based on the Wald statistic have been discussed in the literature, e.g., Andrews (1993), Andrews and Ploberger (1994).
In this section, we consider three statistics: the supremum of the Wald statistic, SupW; the average Wald statistic, AveW, and the logarithm of the Andrews-Ploberger exponential Wald statistic, ExpW.
Assumption PSE: (Partial Sample Estimation) $k/T \to r \in (0, 1)$ as $T$ and $k \to \infty$.

Assumption PSE states that the fraction of $T$ at which the change point occurs, $r$, is bounded away from zero and one. Therefore, the structural break will divide the sample into two subsamples, each of nontrivial size. This assumption follows an argument similar to that in Corollary 1 in Andrews (1993, p. 838).
Consider the following partial sample OLS estimators. Let $\hat{\sigma}_u^2$ and the accompanying variance estimator be consistent for their population counterparts under $H_0$. Define The following theorem characterizes the limiting distribution of the Wald test under the null.
Theorem 2 Suppose Assumptions M1-M3 and PSE hold, and that $n/T \to 0$ as $(n, T) \to \infty$. Then, under the null $H_0$ of no structural change where in this case $B(\cdot)$ is a $p$-dimensional standard Brownian motion. For a given $r$, $Q_R(r)$ and $Q_p(r)$ are independent such that where $BM(s)$ denotes a $p$-vector of independent Brownian processes on has a chi-squared distribution with $p$ degrees of freedom. However, $r$ cannot be $1/2$ since $s$ will be zero when

In order to obtain a test statistic whose critical values can be taken from the literature, e.g., Andrews (1993), Andrews and Ploberger (1994), we consider the following modification to the Wald test: It is clear that where Note that for a fixed $r$, $Q_R(r)$ and $Q_p(r)$ are independent and Hence we have the following corollary:

Corollary 1 Suppose Assumptions M1-M3 and PSE hold, and that $n/T \to 0$ as $(n, T) \to \infty$. Then, under the null $H_0$ of no structural change

The results in Theorem 2 and the rest of the paper continue to hold if we relax some of the restrictions contained in Assumption M1. In particular, assume that a multivariate invariance principle for $\omega_{it}$ holds. In this case, one can replace the OLS estimator by the fully modified (FM) estimator or dynamic OLS (DOLS), e.g., Phillips and Moon (1999) and Kao and Chiang (2000), to take account of the presence of serial correlation and endogeneity. This can be performed by replacing $\hat{\sigma}_u^2$ in (15) for the Wald test statistic with a consistent estimator of the long-run variance of $u_{it}$ conditional on $\varepsilon_t$.

Further, the results in Theorem 2 are for testing the stability of the whole slope vector. However, one can construct tests separately for the common shocks coefficients and for the idiosyncratic shocks coefficients using $Q_R(r)$ and $Q_p(r)$, since $Q_R(r)$ and $Q_p(r)$ are independent. Theorem 2 states that if one wants to test for the constancy of one block only, the limit involves only the corresponding component: $Q_R(r)$ for the common shocks coefficients and $Q_p(r)$ for the idiosyncratic shocks coefficients. Finally, Theorem 2 is valid for any consistent estimators of the two variances. To estimate $\sigma_u^2$, one could compute which is consistent under $H_0$.
To find a consistent estimator of the second variance, from equation (12) a possible choice From equation (13), we have where $\hat{\lambda}_i$ is a consistent estimate of $\lambda_i$ and $\hat{e}_{it}$ can be computed as Therefore, we can provide an estimate for the second variance as The following proposition characterizes the consistency of both variance estimators under $H_0$.
The limiting distribution for the Wald test is now used to test for the presence of a structural break.
Following Andrews (1993) and Andrews and Ploberger (1994), we consider three functionals of the Wald statistic $W(\cdot)$: and where $r$ represents the fraction of the sample trimmed away from the beginning and the end of the sample. Therefore, to carry out the test we only use data belonging to the sub-interval of the full sample. Critical values for SupW, AveW, and ExpW can be taken from Andrews (1993) and Andrews and Ploberger (1994) since $D(r)$ is $\chi^2_{R+p}$ for a fixed $r$. For example, when $r = 0.15$ and $R = p = 1$, the critical values at the 5% level for SupW, AveW, and ExpW are 11.79, 4.61, and 3.22 respectively.
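As an illustration of how the three functionals can be computed once a sequence of Wald statistics is available, the sketch below uses the standard discretized forms from Andrews (1993) and Andrews and Ploberger (1994): the maximum, the average, and the logarithm of the average of $\exp(W/2)$ over the trimmed range of candidate break dates. The function name `wald_functionals`, the indexing convention (entry $j$ of the input corresponds to the $j$-th candidate break date) and the rounding of the trimming bounds are assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def wald_functionals(wald, trim=0.15):
    """Return (SupW, AveW, ExpW) over the trimmed set of candidate break dates.

    wald : sequence of Wald statistics, one per candidate break date
    trim : fraction of the sample discarded at each end
    """
    wald = np.asarray(wald, dtype=float)
    T = wald.size
    lo = int(np.floor(trim * T))           # first retained index
    hi = int(np.ceil((1.0 - trim) * T))    # one past the last retained index
    w = wald[lo:hi]

    sup_w = w.max()
    ave_w = w.mean()
    # log of the mean of exp(W/2), computed stably by factoring out the maximum
    m = (w / 2.0).max()
    exp_w = m + np.log(np.mean(np.exp(w / 2.0 - m)))
    return sup_w, ave_w, exp_w

# Hypothetical input: a flat sequence with spikes inside the trimmed-away ends
wald_seq = np.full(100, 4.0)
wald_seq[0] = 100.0
wald_seq[-1] = 100.0
sup_w, ave_w, exp_w = wald_functionals(wald_seq, trim=0.15)
```

Each functional is then compared with its tabulated critical value for the relevant trimming fraction and number of restrictions.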

Local Asymptotic Power
In this section, we evaluate the power of the Wald statistic against local alternatives. We assume the following sequence of local alternatives, in which $g(\cdot)$ is an arbitrary function defined on the unit interval, with its two sub-elements being $R \times 1$ and $p \times 1$ respectively.
The properties of $g(t/T)$ are specified in the following assumption.
Assumption LP: (Local Power) The function $g(t/T)$ belongs to the class of Riemann integrable functions and as $(n, T) \to \infty$ and for all $k$:

In what follows, we derive the asymptotic behavior of the Wald statistic under the sequence of local alternatives (23). Model (1) can be rewritten as Similarly, when common shocks are replaced by their estimates we have be the OLS estimators under the local alternative (23), and let $\tilde{\sigma}_u^2$ and the accompanying variance estimator be consistent for their population counterparts under the local alternatives $H^{(nT)}_{jk}$, for $j = 1, 2$; the Wald statistics under the local alternative can be computed as The local asymptotic power for the Wald statistics is given in the following theorem. (23), where $D(r)$ is defined in Theorem 2.
The arguments in Theorem 3 also hold for the modified Wald test statistic. Theorem 3 indicates that the Wald statistics in (24) have nontrivial local power irrespective of the particular type of the structural change.
The theorem holds for any choice of the two variance estimators that is consistent under the local alternatives. To estimate the second variance we propose where the estimator it builds on is defined in equation (21) as $(n, T) \to \infty$

Monte Carlo Simulations
In this section we present the simulation results that are designed to assess the null rejection probabilities and the power properties of the SupW(k), AveW(k), and ExpW(k) statistics. To compare the performance of the proposed tests we conduct Monte Carlo experiments based on the following design. We assess the power of the test considering an alternative hypothesis of structural change in both the common shocks coefficients and the idiosyncratic shocks coefficients. The break is assumed to take place at 40% of the sample. To control for the break magnitude, we simulate model (1)-(4) assuming that, under $H_a$, the parameter vector equals its baseline value for $t < k$ and $(1 + c)$ times that value for $t \geq k$, where $c$ is a scalar that defines the percentage change in the parameter values. We set $c = 0.1$. When generating the DGP, the first 1,000 observations are discarded to avoid dependence on the initial conditions. All our results are based on sample sizes of $n \in \{20, 40, 60, 120, 240, 480\}$ and $T \in \{20, 40, 60, 120, 240, 480\}$ with 10,000 iterations. The size and power are evaluated at the 5% level. All programs are written in GAUSS.
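The break design can be sketched as follows. This is a hypothetical illustration in Python rather than the GAUSS programs used for the reported results: the function name `break_path`, the rounding rule used to turn the 40% break fraction into a break date $k$, and the baseline parameter values are assumptions. The sketch only builds the time path of the parameter vector, equal to its baseline value for $t < k$ and to $(1 + c)$ times that value for $t \geq k$.

```python
import numpy as np

def break_path(theta0, T, c=0.1, frac=0.4):
    """Time path of a parameter vector with one proportional break.

    Row t-1 of the returned T x d array holds the parameter at time t:
    theta0 for t < k and (1 + c) * theta0 for t >= k, with k = round(frac * T).
    """
    theta0 = np.atleast_1d(np.asarray(theta0, dtype=float))
    k = max(int(round(frac * T)), 1)      # break date (assumed rounding rule)
    path = np.tile(theta0, (T, 1))
    path[k - 1:] *= (1.0 + c)             # rows k-1, ..., T-1 are t = k, ..., T
    return k, path

# Hypothetical baseline values for a two-element parameter vector
k, path = break_path([1.0, 2.0], T=100, c=0.1, frac=0.4)
```

Setting `c=0.0` reproduces the null of no structural change, which is the configuration used to assess size.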
Those critical values were taken from Andrews (1993) and Andrews and Ploberger (1994). Table 2 gives the power of the test statistics. All tests show very good power properties. The power gain is substantial as $T$ increases and more moderate for increasing sizes of $n$. This result is consistent with the $\sqrt{n}T$ asymptotics of the three tests, as reported in the paper.

Conclusion
In this paper, we derive an asymptotic theory for testing for an unknown common change point in a cointegrated panel regression with common and idiosyncratic shocks. We develop the asymptotic theory for the cases of observable and unobservable common shocks and we derive the limiting distribution of the supre-
Lemma A.2 Under Assumptions M1 and M2, as $(n, T) \to \infty$ Proof. For part (a), note that Assumption M1 ensures that As far as terms II and III are concerned, application of the Cauchy-Schwarz inequality and of Lemma A.1(a) ensures that they are bounded by $(\hat{w}_t - w_t) u_{it} = I + II$. From Proposition 1 we have Applying the Cauchy-Schwarz inequality and Lemma A.1(a) to II leads to To prove (c) we note that For II.
Proof. To prove (a), note Equation (5) in Assumption M3 states that We know from Proposition 1 that 1 In order to prove (b), note that From equation (6) in Assumption M3 that We also know from Proposition 1 This proves part (b).
Lemma A.4 Under Assumptions M1-M3 it holds that, as $(n, T) \to \infty$ and where $Z_1 \sim N(0, I_R)$, and Proof. To prove part (a), note that From equation (5) We have We know from Proposition 1 that For II, the Cauchy-Schwarz inequality and Lemma A.1(b) lead to Therefore, as $(n, T) \to \infty$ To prove part (b), note that As far as a is concerned, we have and according to Lemma A.3(b) we have For II, we have Therefore, $II = o_p(1)$ and For b, we know from Bai (2004) that, as $(n, T) \to \infty$ and From Theorem 2 in Bai (2004) we know that for a given $t$ It is clear that where $B$ is the standard Brownian motion. It follows that Finally, consider the joint distribution of the elements in Any linear combination of these elements takes the form for some pair of weights. Let For a given $T$, it is also clear that every element of $\varsigma_{iT}$ is iid across $i$ conditional on $C$, the $\sigma$-algebra generated by $\{F_t\}$, which is an invariant $\sigma$-field. Without loss of generality, we assume $R = 1$. Thus Notice that $E$ and var are the conditional expectation and conditional variance respectively. It follows that conditional on $C$, Hence, we can use the MDS CLT to get where $Z \sim N(0, I_R)$ and the conditional second-moment matrix and $Z$ are independent. Thus, any linear combination of the two elements in the vector in (27) is asymptotically mixed normal, i.e., This proves (ii).
Consider (iii). Recall that we have, as $(n, T) \to \infty$ with As $T \to \infty$, and for all $n$, we have therefore, for $(n, T) \to \infty$ we have This proves the Lemma.
Lemma A.5 Let Assumptions M1-M3 and PSE hold. Then, as $(n, T) \to \infty$ with $\sqrt{n}/T \to 0$, it holds that (a) for all $r$ where the two limiting variables are independent standard normals of dimensions $R$ and $p$ respectively. Proof. Consider (a). Note Let us assume $p = 1$ for any $G_t \sim I(1)$. This proves (a).
Next we consider (b). Let $C$ be the $\sigma$-field generated by the $\{w_t\}$ and We begin with the sequential limit. We know that Hence, an MDS CLT, e.g., Corollary 3.1 of Hall and Heyde (1980), implies that This is because for a given $t$, as $n \to \infty$ by a CLT.

where $Z_1 \sim N(0, I_R)$ and Denote $(n, T)_{seq} \to \infty$ as the sequential limit, i.e., $T \to \infty$ first and $n \to \infty$ later. Thus, as $(n, T)_{seq} \to \infty$, We now show that the limiting distribution continues to hold in the joint limit, i.e., $(n, T) \to \infty$. Given the sequential limit results derived above, establishing the joint limit results is done by verifying the conditions (i)-(iv) in Theorem 3 in Phillips and Moon (1999). Conditions (i), (ii), and (iv) are obviously satisfied.
We only have to verify uniform integrability in (iii). Put in our context, the uniform integrability condition as $T \to \infty$. Thus the summand is iid across $i$ conditional on $C$ with mean zero and covariance proportional to $\int Q_i Q_i'$. Now we need to show that its squared norm is uniformly integrable in $T$ for all $i$.
by a continuous mapping theorem (CMT) and It follows that the squared norm is uniformly integrable. We then apply Theorem 3 in Phillips and Moon (1999) to Lemma A.3(a) ensures that, as $(n, T) \to \infty$ we have According to Lemma A.3(b), it is clear that conditional on $C$, the $\sigma$-algebra generated by where denotes the asymptotic covariance between $\frac{1}{\sqrt{n}T} \sum_{i=1}^{n} \sum_{t=1}^{T} w_t u_{it}$ and $\frac{1}{\sqrt{n}T} \sum_{i=1}^{n} \sum_{t=1}^{T} x_{it} u_{it}$. Combining the two results, we get conditional on $C$, Hence without conditioning on $C$ This proves the proposition.

B.3 Proof of Theorem 1
Proof. The proof is similar to that of Proposition 2. Recall We know from Lemma A.4 that Hence, using Lemma A.4 and steps similar to those in Proposition 2, we can show that This proves the theorem.

B.4 Proof of Theorem 2
Proof. Theorem 2 states two separate results that need to be proved: As far as equation (16) is concerned, consider the definitions of the partial sample estimators, $S_1(r)$, $S_2(r)$, $M_1(r)$ and $M_2(r)$.
Then, using Assumption PSE and in light of Lemma A.5 and the consistency of the two variance estimators, we have that, uniformly in $r$ and $M_2(r)^{-1} S_2(r)$

Then we note that where $I$ is the $(R + p) \times (R + p)$ identity matrix. Therefore, using equations (29) and (30), under $H_0$ we have Also, using Lemma A.5(c), it follows that and similarly for $S_2(r)$ and $M_2(r)$. Therefore we can write Letting $C$ be the sigma-field generated by the $\{F_t\}$, we have that, conditional on $C$: where $Z_1$ and $Z_2$ are two $R$-dimensional independent standard normals. $Z_1$ and $Z_2$ are independent since they arise from the presence of the stochastic increments $dB(r)$, which are independent across $r$. Therefore we also have where $Z$ has an $R$-dimensional standard normal distribution. Hence we have the following passages Since this result does not depend on $C$, i.e. it holds true for all the possible elements in the sigma-field $C$, we have that, unconditionally on $C$, $Q_R(r) \sim \chi^2_R$. after equation (25). Then under the local alternatives We are now ready also to prove consistency of the second variance estimator. Since and since