Recurrence and Ergodicity of Interacting Particle Systems

Preamble. An important technical result (Proposition 2.3) and its proof, present in the original submission, were erroneously omitted when this paper was published in 2000. The missing text, which should have appeared on page 243 directly before Section 3, is included in this erratum, together with a short account of its context. Simultaneously with the publication of this erratum, the electronic version of the paper in Vol. 116 No. 2 (2000) will be completed by insertion of the missing text. Readers of Probability Theory and Related Fields who have access to the electronic version of this erratum will also have access via a URL to the full, intact paper.


Introduction
We start with an example to explain the problem we address in this work.
Consider the (basic) voter model (ξ_t)_{t≥0} on Z^d. Think of each point in Z^d as being occupied by an individual that is capable of holding either of the opinions 0 and 1. After a rate-one exponential waiting time, a given individual chooses one of its 2d nearest neighbors at random and assumes that neighbor's opinion. All waiting times and choices of neighbors are made independently. The opinion of the voter at x at time t is given by ξ_t(x).
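The dynamics just described can be explored numerically. The following is a minimal sketch, not part of the original argument: it assumes d = 1 and replaces Z by a finite cycle Z/n_sites, so it only approximates the infinite system (a finite cycle eventually reaches consensus, which the infinite lattice does not). The function name and its parameters are hypothetical.

```python
import random

def simulate_voter_cycle(n_sites=50, theta=0.5, t_max=20.0, seed=0):
    """Simulate the basic voter model on the cycle Z/n_sites (finite stand-in for Z).

    Returns the configuration at time t_max and the number of opinion
    changes observed at site 0 up to that time.
    """
    rng = random.Random(seed)
    # Independent initial opinions; opinion 1 has probability theta.
    xi = [1 if rng.random() < theta else 0 for _ in range(n_sites)]
    t = 0.0
    flips_at_origin = 0
    while True:
        # Superposition of n_sites rate-one clocks: next ring after an Exp(n_sites) time.
        t += rng.expovariate(n_sites)
        if t > t_max:
            break
        i = rng.randrange(n_sites)                # site whose clock rang
        j = (i + rng.choice((-1, 1))) % n_sites   # uniformly chosen nearest neighbor
        if xi[i] != xi[j]:
            if i == 0:
                flips_at_origin += 1
            xi[i] = xi[j]                         # site i assumes the opinion of j
    return xi, flips_at_origin
```

Counting the changes at site 0 for growing t_max and n_sites gives a rough feel for the question studied here, although no finite simulation can decide whether the opinion changes infinitely often.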
Formally, we can define the voter model as the Markov process (ξ_t)_{t≥0} on X = {0, 1}^{Z^d}, equipped with the product topology, via its generator G. For x ∈ X and i, j, k ∈ Z^d let

\[ x^{i,j}(k) = \begin{cases} x(j), & k = i, \\ x(k), & k \neq i. \end{cases} \]

For F : X → R depending only on finitely many coordinates, we define GF by

\[ GF(x) = \frac{1}{2d} \sum_{i,j \in \mathbb{Z}^d,\ |i-j|=1} \bigl( F(x^{i,j}) - F(x) \bigr). \]

For a more detailed description and background information, see Chapter V of Liggett (1985). It is well known that the voter model clusters in dimension d ≤ 2. More precisely, if we start at time 0 with independent opinions, where opinion 1 has probability θ ∈ (0, 1), then

\[ \mathcal{L}[\xi_t] \Longrightarrow (1-\theta)\,\delta_0 + \theta\,\delta_1 \quad \text{as } t \to \infty. \tag{1.1} \]

Here, δ_0 and δ_1 are the unit masses on the states where all individuals have opinion 0, respectively 1, L denotes the law of a random variable, and ⟹ denotes weak convergence of probability measures. Note that since X carries the product topology, (1.1) is equivalent to convergence of the finite dimensional distributions. A question that arises naturally, given (1.1), is: Does the opinion at a given site change value infinitely often?
The question has been answered affirmatively by means of rather special arguments in Cox and Griffeath (1986). A simple argument that works for shift ergodic initial states was brought to our attention by Jeff Steif: Consider the events

\[ A_i = \{ \xi_t(i) = 1 \text{ for all sufficiently large } t \}, \quad i \in \mathbb{Z}^d. \]

For |i − j| = 1 it is easy to see that A_i = A_j a.s., hence a.s. A_i = A := ∩_j A_j. However, A is shift invariant and by ergodicity we have P[A] ∈ {0, 1}. Since θ < 1, clearly P[A] = 0. Now change the 1 in the definition of A_i into 0 to conclude that the opinion changes infinitely often. There are two drawbacks of this argument: (i) It works only for shift ergodic initial states. (ii) For many models it is hard to check whether A_i = A_j a.s. or not.
The aim of this work is to give a robust and simple abstract argument that can be applied to a large variety of models and for initial states that only need to have a global density. We do not assume translation invariance or even ergodicity. This argument relies only on the assumptions that a certain class of probability measures on the state space is preserved under the dynamics of (ξ t ), and that this class is "convergence determining" in an appropriate sense. In particular, the argument does not rely on quantitative estimates that make use of special features (or the dimension!) of the considered models.
For the voter model, we are able to prove a.s. alternation of types under more general conditions than were considered in Cox and Griffeath (1986). We also consider several related models, as well as a model of mutually catalytic branching recently introduced by Dawson and Perkins (1998).
Note that our focus lies on the situation where the model is not ergodic, i.e., the weak limit points are mixtures of the ergodic invariant measures. We show that the process gets "close" to any of the ergodic states (occurring in the mixture) at arbitrarily late times. This question has often been connected to the question of whether low-dimensional binary branching random walk, starting in a homogeneous Poisson field, populates a given site at arbitrarily late times. This is known to be true for d = 2 and false for d = 1. Note, however, that this is a fundamentally different question from the one we address, since branching random walk is ergodic. Namely, if d ≤ 2, the unit mass δ_0 on the empty configuration is the only invariant measure (with σ-finite intensity measure).
It turns out that the correct notion of convergence of random probability measures is crucial. We discuss the topological details and give the abstract statement in Section 2. In Section 3 we apply our result to:
• the multitype voter model,
• interacting Fleming-Viot processes,
• interacting Brownian motions,
• mutually catalytic branching super random walk.

Result
In this section we formulate and prove our abstract result.
Let X be a locally compact Polish space and denote by P(X) the space of probability measures on X equipped with the topology of weak convergence of probability measures. P(X) is again a locally compact Polish space (see, e.g., Kallenberg (1983)). Consider now a discrete time Markov process (ξ_n)_{n∈N_0} on X.
(We could consider a Feller process (ξ_t)_{t≥0} on X instead, but we choose the discrete time setting for the sake of generality.) Denote by (S(n))_{n∈N_0} its semigroup. That is, for µ ∈ P(X) and n ∈ N_0,

\[ \mu S(n) = P^{\mu}[\xi_n \in \,\cdot\,], \]

the law of ξ_n when ξ_0 has law µ. We want to describe the longtime behavior of ξ_n in terms of its possible limit points µ_θ, θ ∈ Θ, where Θ is an abstract set. (We do not assume that (µ_θ)_{θ∈Θ} necessarily exhausts the class of possible limit points.) In the example of the voter model, Θ = [0, 1] and µ_θ = (1 − θ)δ_0 + θδ_1.
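To make the semigroup notation concrete, here is a small sketch on a finite state space, where µS(n) is simply the row vector µ multiplied by the n-th power of the transition matrix. The four-state chain below is a hypothetical toy example (not one of the models of this paper); its two absorbing states play the role of the two consensus states, so that µS(n) converges to a mixture of two point masses, in analogy with µ_θ = (1 − θ)δ_0 + θδ_1.

```python
import numpy as np

def push_forward(mu, P, n):
    """Return mu S(n) = mu P^n for a row-stochastic transition matrix P."""
    mu = np.asarray(mu, dtype=float)
    P = np.asarray(P, dtype=float)
    return mu @ np.linalg.matrix_power(P, n)

# Toy chain on {0, 1, 2, 3}: symmetric nearest-neighbor steps, absorbed at 0 and 3.
P = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])
mu = np.array([0.0, 1.0, 0.0, 0.0])   # start in state 1

# After many steps the law is close to (2/3) * delta_0 + (1/3) * delta_3.
limit = push_forward(mu, P, 200)
```

As with µ_θ in the voter model, the limit law is a genuine mixture even though every single trajectory ends in one of the two absorbing states.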
Now we make the crucial definition: Clearly, D(µ_θ) is a convex set but it is in general not closed. For example, in the voter model π_θ S(n) ⟹ µ_θ as n → ∞, where π_θ is the product measure on {0, 1}^{Z^d} with intensity θ. We will see later that π_θ S(n) ∈ D(µ_θ), n ∈ N_0, but obviously µ_θ ∉ D(µ_θ) if θ ∈ (0, 1). Since D(µ_θ) is not compact we cannot hope for a nice description in terms of extremal elements. In spite of this, we give a mild sufficient condition for a set M_θ ⊂ P(X) to be a subset of D(µ_θ) that covers a wide range of the examples.
Assumption 1. M_θ ⊂ P(X) is invariant under the dynamics of (ξ_n), i.e., is open and µ_θ ∈ U, then for all µ ∈ M_θ,
Proof The proof is simple and is left as an exercise.

□
In the context of the voter model, for 0 < θ < 1 and with µ_θ = (1 − θ)µ_0 + θµ_1, we would like to argue that (A1) and (A2) guarantee that for any initial measure µ ∈ M_θ, the process ξ_t gets "close" to the constant configurations 0 and 1 at arbitrarily late times t. The meaning of (A1) is clear. However, condition (A2) is somewhat unusual, so we would like to discuss its interpretation and our reasons for choosing it.
There are basically three types of convergence that we might choose for (A2): convergence of the means, stochastic convergence and almost sure convergence. Since we consider convergence of random probability measures, convergence in L^1 and stochastic convergence coincide, and both are implied by almost sure convergence, while both imply convergence of the means. We illustrate the meaning of these concepts in the example of the voter model.
By convergence of the means, we mean the condition

\[ \mu S(n) \Longrightarrow \mu_\theta \quad \text{as } n \to \infty, \quad \text{for all } \mu \in M_\theta. \tag{2.2} \]

That is, M_θ is a subset of the domain of attraction of µ_θ. In the example of the voter model, we could set M_θ = {µ_θ}, in which case (2.2) would certainly hold. However, in this case we would have ξ_n ≡ ξ_0 a.s., so there would be no change of types at all. Hence, this notion is too rough for our purposes. By almost sure convergence, we mean the condition Liggett (1985)), it is possible to verify (2.3) for some classes M_θ. However, verification becomes rather difficult for more complicated models, so we do not adopt this notion of convergence. By stochastic convergence, we mean exactly (A2), which is a weaker condition than (2.3), but still strong enough for our purposes. For the voter model, to verify (A2), we only have to show that for all finite H ⊂ Z^d and ε > 0, This fact, for µ belonging to a large class M_θ, is easily proved using duality (see the proof of Theorem 1 below). Now we come back to the general situation. Let S_θ = supp(µ_θ) be the closed support of µ_θ. For a sequence (x_n)_{n∈N} in X let A((x_n)_{n∈N}) denote the set of accumulation points of (x_n)_{n∈N} in X.

Applications
The situation we have in mind is that of a "general" interacting particle system where a global variable, typically the density of particles, is preserved under the dynamics. The models we consider here have a number of features in common. The state space is X ⊂ V^G, equipped with the product topology, where the countably infinite Abelian group G (we expressly exclude the possibility that G is finite!) plays the role of the site space. V is the space of values that a local coordinate can assume. In the context of genealogical as the number of particles of type e ∈ E. When V is compact we can, in fact, take X = V^G, but for non-compact V, we need to impose growth conditions on the coordinates. In all cases, the interaction of the coordinates will be described in terms of an irreducible random walk kernel a(·, ·) on G. The continuous time transition kernel a_t is defined by

\[ a_t = e^{-t} \sum_{n=0}^{\infty} \frac{t^n}{n!}\, a^{(n)}, \]

where a^{(n)} is the n-step transition probability of a.
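The continuous time kernel can be computed numerically by truncating the Poisson series. The sketch below is only an illustration under stated assumptions: it uses the nearest-neighbor kernel on the finite cyclic group Z/n (the paper requires G to be infinite), and the function names and the truncation level n_terms are hypothetical choices.

```python
import math
import numpy as np

def cycle_kernel(n):
    """Nearest-neighbor random walk kernel a(g, h) on the cyclic group Z/n."""
    a = np.zeros((n, n))
    for g in range(n):
        a[g, (g + 1) % n] += 0.5
        a[g, (g - 1) % n] += 0.5
    return a

def continuous_time_kernel(a, t, n_terms=80):
    """Approximate a_t = exp(-t) * sum_n t^n / n! * a^(n) by a truncated series."""
    result = np.zeros_like(a)
    power = np.eye(a.shape[0])   # a^(0) is the identity
    weight = math.exp(-t)        # Poisson weight exp(-t) t^n / n! for n = 0
    for n in range(n_terms):
        result += weight * power
        power = power @ a        # next power a^(n+1)
        weight *= t / (n + 1)    # next Poisson weight
    return result
```

For moderate t the truncation error is negligible, so the rows of the result sum to one and the semigroup property a_s a_t = a_{s+t} holds up to rounding.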
For v ∈ V, we let the bold symbol v denote the constant configuration in X with v(g) = v for all g ∈ G. P always denotes the space of probability measures on a locally compact Polish space equipped with the weak topology.

The multitype voter model
Fix a positive integer c > 1, the number of types (opinions), let E = {1, . . . , c} be the space of types, and let V = {1_{{e}}, e ∈ E}. Let X = V^G, and define, for x ∈ X and g, g′, h ∈ G, We define the voter model (ξ_t)_{t≥0} to be the Markov process on X with generator G, where for F : X → R depending on only finitely many coordinates, Define the simplex and for θ ∈ Θ let M_θ be the collection of µ ∈ P(X) such that for all g ∈ G and e ∈ E, In the case that G = Z^d, the collection M_θ contains all translation invariant, shift ergodic µ ∈ P(X) satisfying ∫ µ(dx) x(0) = θ (see pp. 180-181 of Cox, Greven and Shiga (1995) for the case c = 2). For θ ∈ Θ define and note that S_θ = {e : e ∈ E and θ(e) > 0}. We assume that the symmetrized kernel â given by

\[ \hat{a}(g,h) = \tfrac{1}{2}\bigl( a(g,h) + a(h,g) \bigr) \]

is recurrent. It is well known that the voter model clusters in this situation. In particular, Theorem V.1.9 of Liggett (1985) implies that for all µ ∈ M_θ,
Proof It suffices to verify that (A1) and (A2) hold, in which case M_θ ⊂ D(µ_θ) is in the domain of stochastic attraction of µ_θ and our conclusion is justified by Proposition 2.3. To do this, we make use of duality (see Chapter V of Liggett (1985)), which we briefly describe. Let (η^g_t, g ∈ G)_{t≥0} be a system of rate one continuous time coalescing random walks on G, with step distribution a(g, h). For each g ∈ G, η^g_t is a random walk started at g. The random walks η^g_t run independently until two of them meet, at which time the walks (instantly) coalesce, and after that move together. A special case of the duality relation (see (V.1.7) of Liggett (1985)) connecting η_t and ξ_t is: for all x ∈ X, finite H ⊂ G and v ∈ V, Fix µ ∈ M_θ and t > 0. To verify (A1), we must show that for fixed g and e, For H ⊂ G, let τ_H be the first time at which all the random walks started in H have coalesced, Hence By the assumption that µ ∈ M_θ, the second term on the left side above tends to 0 as s → ∞.
The right side also tends to 0 as s → ∞, since G is infinite and a is irreducible, and since F_t(h) → 0 as |h| → ∞. (That is, for any sequence (G_n) of finite subsets of G such that G_n ↑ G as n → ∞, sup{F_t(h) : h ∈ G \ G_n} → 0 as n → ∞.) In order to show that (A2) holds, it suffices to show that for ε > 0 and finite H ⊂ G, The set E of types is finite, so it suffices to prove that for each fixed e ∈ E, (3.12) Choose an arbitrary g ∈ H. By (3.7), Since we have assumed that â is recurrent, P[τ_H > t] → 0 as t → ∞. Therefore, as t → ∞, on account of (3.4).
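The dual system of coalescing random walks used in the proof above is easy to simulate. The following sketch is an illustration under explicit assumptions: it uses nearest-neighbor walks on the finite cycle Z/n_sites rather than a general kernel a on an infinite group, and the function name and parameters are hypothetical. It returns the coalescence time τ_H of the walks started in H.

```python
import random

def coalescence_time(starts, n_sites, t_max, seed=0):
    """Simulate rate-one coalescing nearest-neighbor walks on Z/n_sites.

    Returns the first time at which all walks started in `starts` have
    merged into one, or None if this has not happened by time t_max.
    """
    rng = random.Random(seed)
    positions = set(starts)   # walks sharing a site have coalesced
    t = 0.0
    while len(positions) > 1:
        # Each surviving walk jumps at rate one, so the next jump
        # occurs after an exponential time with rate len(positions).
        t += rng.expovariate(len(positions))
        if t > t_max:
            return None
        g = rng.choice(sorted(positions))   # walk that jumps
        positions.remove(g)
        # Landing on an occupied site merges the two walks automatically.
        positions.add((g + rng.choice((-1, 1))) % n_sites)
    return t
```

On a finite cycle the walks always coalesce eventually; on an infinite group this corresponds to P[τ_H > t] → 0, which is where recurrence of the symmetrized kernel enters.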

Interacting diffusions
Here we consider a two-type genealogical model with migration and resampling. We suppose that at each site g ∈ G there is a large colony of individuals, and each individual must be one of two genealogical types, A or B. The frequency of type A at site g at time t is ξ_t(g). Hence E = {1, 2} and we identify P(E) with [0, 1] and let V = [0, 1]. Further we let (ξ_t)_{t≥0} be the Markov process with state space V^G and generator G, where, for suitable F : X → R, The migration kernel a is an irreducible random walk kernel on G, and the diffusion coefficient (or resampling function) ϱ is a function ϱ : [0, 1] → [0, ∞) that satisfies

ϱ(0) = ϱ(1) = 0, ϱ(r) > 0 for r ∈ (0, 1), ϱ is Lipschitz continuous. (3.14)
The ergodic theory of this process has been studied by Shiga (1980a,b) (for the case ϱ(r) = r(1 − r)), Notohara and Shiga (1980) and Cox and Greven (1994). As with the voter model, there is either coexistence or local extinction of one type, depending on whether the symmetrized kernel â defined in (3.1) is transient or recurrent. We assume here that â is recurrent. Let Θ = [0, 1], and for θ ∈ Θ let M_θ be the collection of µ ∈ P(X) such that for all g ∈ G, lim_{s→∞} ∫ µ(dx) ((a_s x)(g) − θ)^2 = 0.
Proof It suffices to verify that (A1) and (A2) hold, in which case M_θ ⊂ D(µ_θ) is in the domain of stochastic attraction of µ_θ and our conclusion is justified by Proposition 2.3. Fix µ ∈ M_θ and t > 0. To verify (A1) we must show that In order to compute the first and second moment we use Lemma 1 of Cox and Greven (1994): Now it is straightforward to check the formula It follows that (3.22) The first term on the right side of (3.22) tends to 0 as s → ∞ because µ ∈ M_θ. The second term on the right side of (3.22) is bounded above by ‖ϱ‖_∞ ∫_0^t â_{2(s+r)}(g, g) dr, and this also tends to 0 as s → ∞ (recall that |G| = ∞ and that a is irreducible, hence â_r(g, g) → 0 as r → ∞). We have thus established (3.18). In order to show that (A2) holds, it suffices to prove that for finite H ⊂ G and ε > 0, (3.23) We break the proof of (3.23) into two parts. First, we show that for any g ∈ G and ε > 0, Then we show that for any g, h ∈ G and ε > 0, It is easy to see that (3.24) and (3.25) imply (3.23). Let H ⊂ G be finite, let δ > 0, and define Since µ ∈ M_θ and H is finite, Chebyshev's inequality and (3.15) imply that lim_{t→∞} µ(Γ_t(δ)) = 1. (3.26) Suppose now that H = {g, h}. In the proof of Theorem 4 in Cox and Greven (1994), it is shown that for δ > 0, where q_t(δ, g, h) → 0 as t → ∞.
(The quantity q_t(δ, g, h) is the probability that two random walks starting from g and h, which move independently according to the kernel a_s, and coalesce at rate c whenever they occupy the same site, have not coalesced by time t. The constant c depends on g, h, ϱ and δ, but is strictly positive.) After a little rearrangement (using (3.19)), this inequality implies that By choosing t large enough so that q_t(δ, g, h) < δ^2, we have Assume now that 0 < δ < 1/4. For r ∈ [2δ, 1 − 2δ], r(1 − r) ≥ δ. Therefore, for large t, the last estimate implies that (3.31) On the other hand, On account of these estimates, A similar argument gives the inequality Combining (3.31), (3.33) and (3.34), we obtain that for x ∈ Γ_t(δ) Given ε > 0, we may choose δ > 0 small enough, and then t large enough so that q_t(δ, g, g) < δ^2, and for all x ∈ Γ_t(δ), In view of (3.26), (3.24) holds. To prove (3.25), suppose that |ξ_t(g) − ξ_t(h)| > δ. Then, it must be the case that at least one of ξ_t(g), ξ_t(h) belongs to the interval [δ, 1 − δ], or, one of ξ_t(g), ξ_t(h) is smaller than δ and the other larger than 1 − δ. In the latter case, For t large enough so that q_t(δ, g, h) < δ^2, and all x ∈ Γ_t(δ), (3.29) and Chebyshev's inequality imply On account of this estimate, (3.31) and (3.36), Given ε > 0, we may choose δ > 0 small enough so that the right side above is less than ε, and t large enough so that q_t(δ, g, h) < δ^2. We therefore obtain that, for all x ∈ Γ_t(δ), In view of (3.26), (3.25) holds.
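For intuition, the interacting diffusions of this section can be approximated by an Euler-Maruyama scheme. The sketch below rests on several assumptions that are not taken from the text: the sites form a finite cycle with nearest-neighbor migration, the resampling function is the Fisher-Wright choice ϱ(r) = r(1 − r), and the dynamics are written as dξ_t(g) = Σ_h a(g, h)(ξ_t(h) − ξ_t(g)) dt + sqrt(2ϱ(ξ_t(g))) dW_t(g), where the factor 2 is one common normalization and may differ from the generator used here. The function name and parameters are hypothetical.

```python
import numpy as np

def simulate_fisher_wright(x0, t_max=5.0, dt=1e-3, seed=0):
    """Euler-Maruyama sketch of interacting Fisher-Wright diffusions on a cycle.

    x0 is an array of type-A frequencies in [0, 1], one entry per site.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(int(t_max / dt)):
        # Migration drift: sum_h a(g, h) (x(h) - x(g)) for the nearest-neighbor kernel.
        drift = 0.5 * (np.roll(x, 1) + np.roll(x, -1)) - x
        # Resampling function rho(r) = r (1 - r); it vanishes at 0 and 1.
        rho = x * (1.0 - x)
        noise = np.sqrt(2.0 * rho * dt) * rng.standard_normal(x.size)
        # The discretization can overshoot the boundary, so clip back to [0, 1].
        x = np.clip(x + drift * dt + noise, 0.0, 1.0)
    return x
```

Because ϱ vanishes at the boundary, the constant configurations 0 and 1 are fixed points of the scheme, mirroring the two ergodic states that the true process approaches alternately.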

Interacting Fleming-Viot processes
Here we consider a generalization of the two allele (A and B, say) model of the last example to infinitely many alleles. The space E of alleles (or types) is now infinite. W.l.o.g. we assume E = [0, 1]. The interval [0, 1] is understood as an arbitrary labeling of the types. However, we need a measurable structure on E and thus equip it with the Borel σ-field B induced by the Euclidean metric on [0, 1]. Now ξ_t(g)(A) is the frequency at time t of individuals in the colony g ∈ G having a type that is in A ∈ B. Hence ξ_t(g) ∈ ∆_E := V := P(E, B) (the set of probability measures on (E, B)) and (ξ_t) is a Markov process with values in The process (ξ_t) is a model with migration and resampling. While the migration is just the one we introduced in the previous subsection, we must be more careful with the resampling: we can define (ξ_t) uniquely only for the so-called Fisher-Wright case We define (ξ_t) in terms of its generator G which is defined for certain polynomials F : X → R by In particular, the locally predominant type changes infinitely often.
Proof For fixed A ∈ B the process (ξ̃_t(g); g ∈ G)_{t≥0} = (ξ_t(g)(A); g ∈ G)_{t≥0} is just the process of interacting Fisher-Wright diffusions on [0, 1], that is, the process of interacting diffusions from the last example with diffusion coefficient ϱ(x) = x(1 − x). Hence the claim follows from Theorem 2.

Interacting Brownian motions
So far we have considered examples where the state space (at each site) was compact. We now turn to our first example of a non-compact state space.
Here we consider only one type, i.e., E = {1}. In the notation of the last few examples we have Θ = V = R and X ⊂ R^G is a Liggett-Spitzer space (see Liggett and Spitzer (1981)). More precisely, fix γ ∈ (0, ∞)^G with Σ_{g∈G} γ(g) < ∞ and with the property that sup_{g∈G} γ(g)^{−1} (γa)(g) < ∞. (3.43) Now define ‖x‖_γ = Σ_{g∈G} |x(g)| γ(g) and let

\[ X = \{ x \in \mathbb{R}^G : \|x\|_\gamma < \infty \}. \]

For example, if G = Z^d and a is the kernel of simple random walk then γ(x) = (1 + ‖x‖^2)^{−p}, x ∈ Z^d, fulfills the above assumption for p > d. Hence all x ∈ R^{Z^d} that do not grow faster than a polynomial are possible initial configurations. We define linearly interacting Brownian motions as the Markov process on X with generator We show that if the symmetrized kernel â is recurrent, then for µ ∈ M and moreover that M ⊂ D(µ_0), the domain of stochastic attraction of µ_0. (Of course, other subsets of D(µ_0) are conceivable.) To make precise sense of this statement let R̄ = R ∪ {±∞} be the two point compactification of the real line. The bold symbols −∞ and +∞ denote the elements in R̄^G with all components equal to −∞, respectively +∞.
In fact, even the stronger statement needed for (A2) is true or equivalently: for all ε > 0, K > 0 and H ⊂ G finite We give the simple proof of (3.49): First note that (ξ_t)_{t≥0} solves a system of stochastic differential equations where {(W_t(g))_{t≥0}, g ∈ G} is an independent family of standard Wiener processes. (This can be seen by an approximation procedure as in Shiga and Shimizu (1980), proof of Theorem 3.2.) Hence ξ_t can be written as From (3.51) we derive for x ∈ X the first and second moment: where G_t(g, h) is the Green function of the symmetrized kernel â. Since a is irreducible and â is recurrent, the weak ratio limit theorem (see, e.g., Spitzer's book, Proposition 1.5) implies Hence asymptotically the components are perfectly correlated while Since under P_x the field {ξ_t(g), g ∈ G} is Gaussian and E_x[ξ_t(g)] = (a_t x)(g) is tight as t → ∞ (w.r.t. µ) for all g ∈ G, this implies (3.49). Hence we have shown that (A2) holds. Assumption (A1), however, is an immediate consequence of (3.51). Thus we can apply Proposition 2.3 to get the following result: (3.56) □
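The passage from the Gaussian structure to (3.49) rests on an elementary density bound. The following is a sketch for a single site only, under the assumption (suggested by the discussion above, but not re-derived here) that the variance σ_t^2 := Var_x[ξ_t(g)] diverges as t → ∞ when â is recurrent. The symbols m_t and σ_t are introduced purely for this illustration.

```latex
% Under P_x, \xi_t(g) is Gaussian with mean m_t = (a_t x)(g) and variance \sigma_t^2.
% The N(m_t, \sigma_t^2) density is bounded by (2\pi\sigma_t^2)^{-1/2}, so for every K > 0
\[
  P_x\bigl[\, |\xi_t(g)| \le K \,\bigr]
  = \int_{-K}^{K} \frac{1}{\sqrt{2\pi\sigma_t^2}}
      \exp\!\Bigl( -\frac{(y - m_t)^2}{2\sigma_t^2} \Bigr)\, dy
  \le \frac{2K}{\sqrt{2\pi}\,\sigma_t}
  \xrightarrow[t \to \infty]{} 0 ,
\]
% and the bound holds uniformly in the mean m_t.
```

Treating finitely many sites at once additionally uses the asymptotic perfect correlation of the components noted above, so that all coordinates in H escape to the same infinity.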

Mutually catalytic branching super random walk
We now come to the example that mainly motivated our work. Consider a two-type "infinitesimal mass" (1998)). Formally we define (ξ_t)_{t≥0} as the Markov process on X with generator G given by The explicit construction of this process can be found in Dawson and Perkins (1998). Uniqueness in law is based on Mytnik's duality (see Mytnik (1996)). Dawson and Perkins investigate the longtime behavior of (ξ_t). They show that if d = 1 or d = 2 and ξ_0 ≡ θ ∈ (0, ∞)^2 then locally one type dies out (in probability) while the other type is locally constant but random. The question that was raised by Ed Perkins at the 1997 Vancouver Probability Meeting is whether it is always (i.e. as time passes) the same type that is locally predominant. From the above discussion the reader might by now guess the right answer. Here, however, we first want to give the result of Dawson and Perkins in detail.
Further let δ_v be the unit mass at the constant configuration in X with all components equal to v ∈ V. Finally define In order to apply our abstract argument we have to have an invariant class M_θ ⊂ D(µ_θ) in the domain of stochastic attraction of µ_θ. A large class M_θ with these features has been obtained by Cox, Klenke and Perkins (1999). They show in their Theorem 2 that

M_θ = { µ ∈ P(X) : C_µ < ∞, lim_{t→∞} ∫ µ(dx) ((a_t x)(i)(c) − θ(c))^2 = 0, i ∈ Z^d, c = 1, 2 }, (3.61)

where is convergence determining in the sense of (A2). In particular, we have It is simple to check invariance of M_θ (A1). In fact, for all T > 0, i ∈ Z^d and c = 1, 2, by Theorem 2.2 of Dawson and Perkins (1998), Hence µS(T) ∈ M_θ and M_θ fulfills assumption (A1). Now we present the main new result of this work which is an immediate consequence of Proposition 2.3 and Cox, Klenke and Perkins (1999). In particular, P_µ-a.s. the locally predominant type changes infinitely often.
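The mutually catalytic dynamics discussed in this section can also be explored with a crude Euler-Maruyama scheme. The sketch below is an illustration under loud assumptions: the sites form a finite cycle with nearest-neighbor migration instead of Z^d, and the two types u and v are taken to solve du_t(i) = Σ_j a(i, j)(u_t(j) − u_t(i)) dt + sqrt(γ u_t(i) v_t(i)) dW^1_t(i) and the analogous equation for v with independent noise W^2, which is the usual form of the Dawson and Perkins model but is not restated in the text above. The function name, the branching rate parameter gamma, and the step size are hypothetical.

```python
import numpy as np

def simulate_mutually_catalytic(u0, v0, gamma=1.0, t_max=1.0, dt=1e-3, seed=0):
    """Euler-Maruyama sketch of mutually catalytic branching on a cycle.

    u0 and v0 are non-negative arrays of local masses of type 1 and type 2.
    """
    rng = np.random.default_rng(seed)
    u = np.array(u0, dtype=float)
    v = np.array(v0, dtype=float)
    for _ in range(int(t_max / dt)):
        # Migration drift for the nearest-neighbor kernel on the cycle.
        drift_u = 0.5 * (np.roll(u, 1) + np.roll(u, -1)) - u
        drift_v = 0.5 * (np.roll(v, 1) + np.roll(v, -1)) - v
        # Each type branches at a rate proportional to the local mass of the other.
        sigma = np.sqrt(gamma * u * v * dt)
        u_new = u + drift_u * dt + sigma * rng.standard_normal(u.size)
        v_new = v + drift_v * dt + sigma * rng.standard_normal(v.size)
        # Euler steps can overshoot below zero; clip to keep masses non-negative.
        u, v = np.maximum(u_new, 0.0), np.maximum(v_new, 0.0)
    return u, v
```

If one type is absent the noise vanishes and the other type simply follows the deterministic migration flow, which conserves total mass. With both types present, one can record which type dominates at a fixed site over time, keeping in mind that a finite system with clipping is only a caricature of the infinite model.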