V.9 Ergodic Theorems

Vitaly Bergelson


Consider the sequence image, where z is a complex number of modulus 1. While for z ≠ 1 our sequence is not convergent, it is not hard to see that, on average, it exhibits quite regular behavior. Indeed, using the formula for the sum of a geometric progression, and assuming that z ≠ 1, we have, for any N > M ≥ 0,

Image

which implies that when N - M is large enough, the averages

Image

are small. More formally, we have

Image

This simple fact is a special, one-dimensional case of von Neumann’s ergodic theorem, which was the first mathematical statement to throw light on the so-called quasi-ergodic hypothesis in statistical mechanics and the kinetic theory of gases.

Von Neumann’s theorem concerns the average behavior of powers of UNITARY OPERATORS [III.50 §3.1] on HILBERT SPACES [III.37]. If U is such an operator defined on a Hilbert space image, then we can associate with U the U-invariant subspace imageinv that consists of all vectors fimage such that U f = f: that is, all vectors that are fixed by U. Let P be the ORTHOGONAL PROJECTION [III.50 §3.5] onto that subspace. Then von Neumann’s theorem asserts that, for every fimage,

Image

In other words, in a certain sense the averages

Image

converge to the orthogonal projection P. (This is not actually the theorem as formulated by VON NEUMANN [VI.91], but it is simpler to explain. He proved an equivalent statement about a continuous family of unitary operators (UT)Timage.)

Before we discuss various applications and refinements of von Neumann’s theorem, let us briefly comment on its proof. Von Neumann’s original proof used sophisticated machinery such as the spectral theory of one-parameter groups of unitary operators, obtained by Marshall Stone. Over the years many alternative proofs were offered, the simplest being a “geometric” proof due to RIESZ [VI.74], which we will describe below. To give the rough idea of von Neumann’s proof it is convenient to use the fact (which follows from the SPECTRAL THEOREM [III.50 §3.4]) that any unitary operator U on a Hilbert space image has a “functional model.” That is, we can realize the Hilbert space image as a function space, consisting of all (equivalence classes of) square-integrable functions with respect to some finite MEASURE [III.5 5], in such away that U becomes a multiplication operator Mimage (f) = imagef, where image is a complex-valued measurable function that satisfies |image (x) | = 1 for almost every x. It is not hard to see, after passing to such a functional model, that von Neumann’s theorem follows immediately from its one-dimensional case as expressed by formula (1). Note that in this case the orthogonal projection to the space of invariant elements takes a function f to the function g such that g(x) = f(x) if image(x) = 1 and g(x) = 0 otherwise.

Riesz’s proof is based on the observation that the orthogonal complement of the subspace imageinv of U-invariant vectors is spanned by the set of vectors of the form Ug - g. To see this, note first that if fimageinv, then

f, Ug〉 = 〈U–1 f, g) = 〈f, g〉,

from which it follows that 〈f, Ugg〉 = 0 and thus that f is orthogonal to Ugg. Conversely, if fimageinv, then 〈f, Uff〉 = 〈f, Uf〉 – 〈f, f〉. This is less than 0, by the CAUCHY–SCHWARZ INEQUALITY [V.19] and the fact that ||Uf|| = ||f|| but Uff. In particular, f is not orthogonal to Uff. Thus, imageinv is the orthogonal complement of the (closed) subspace of image generated by functions of the form Ugg.

Now the conclusion of von Neumann’s theorem holds trivially if fimageinv, since then P f = f and Unf = f for every n. On the other hand, if f = Ugg, then P f = 0. As for the averages, we know that Unf = Un+1gUng, from which it follows that imageUnf = UNgUMg. Since ||UNgUMg|| is at most 2||g|| for every M and N, we find that

Image

has norm at most 2||g||/(NM) and hence tends to 0. So the theorem is true in this case as well. It is straightforward to check that the set of functions for which the theorem holds is a closed linear subspace of image, and therefore the theorem is proved.

The reason that von Neumann’s theorem and other similar results are relevant to physics is that it is often possible to represent the evolution of the parameters associated with a physical system by a subset X ⊂ ℝd that has finite d-dimensional volume, together with a continuous family (Tτ)τimage of volume-preserving transformations from X to X. With each such transformation Tτ one can associate the unitary map Uτ, defined on L2 (X) (the Hilbert space of square-integrable functions on X) by the formula (Uτf)(x) = f(Tτx). The fact that these maps are unitary follows from the fact that the transformations Tτ preserve volume; also, it follows from the fact that the transformations Tτ depend continuously on τ that the maps Uτ do as well.

To simplify the discussion let us now “discretize” the situation. Instead of considering the continuous families (Tτ) and (Uτ) we shall fix a transformation T = Tτ0 (say, for τ0 = 1) and let U be the corresponding unitary operator. Assume that our volume-preserving transformation T is ergodic, which means that there is no proper subset AX of positive volume such that T(A) ⊂ A. This assumption can easily be shown to be equivalent to the fact that the only elements of L2(X) that satisfy Uf = f are the constant functions. It follows from von Neumann’s theorem that for any fL2 (X) the averages

Image

converge to a constant whose value is easy to find by performing term-by-term integration: it is equal to (∫ f dm)/vol(X). Since von Neumann’s theorem also tells us that limNM→∞ AN,M(f) is always a U-invariant function, we see that the assumption of ergodicity is a necessary and sufficient condition for the time average represented by NM→∞ AN,M(f) to equal the space average, (∫ f dm)/vol(X).

It is also possible to use von Neumann’s theorem to strengthen a classical theorem of POINCARÉ [VI.61], called Poincaré’s recurrence theorem. This result states that if X is a set of finite volume, as above, and A is a subset of X with nonzero volume, then “almost all points of A return infinitely often to A.” In other words, if we set à to be the set of all points xA such that TnxA for infinitely many n, then the measure of the set of points in A but not in à is 0. The main step in the proof of Poincaré’s theorem is to prove the same about the set A1, which consists of all points xA such that TnxA for some positive integer n. To see why this is true, let B be the set of all points in A but not in A1. The sets B, T–1B, T–2B,. . . all have the same measure, since T is volume preserving. (TnB is defined to be the set of all x such that TnxB.) Since X has finite volume, there must exist positive integers m and n such that the intersection of TmB and T(m+n) B has positive measure, and from this it follows that the measure of BTnB is also positive. But if xB then xA1, so TnxA and therefore TnxB, so this is a contradiction.

Now let us apply the von Neumann ergodic theorem with f equal to the characteristic function of a set A (that is, f(x) = 1 when xA and f(x) = 0 otherwise) and U defined in terms of T as before. Suppose also that the set X has volume 1 and write μ for the measure on X. Then one can check that 〈f, Unf〉 = μ(AT–nA). It follows that

Image

If we let NM tend to infinity, then AN,M f tends to a U-invariant function g. Since g is U-invariant, 〈f, g〉 = 〈Unf, g〉 for every n, and therefore 〈f, g〉 = 〈AN,M(f), g〉 for every N and M, and finally 〈f, g〉 = 〈g, g〉. By the Cauchy-Schwarz inequality, this is at least (∫ g(x) dμ)2 = (∫ f (x) dμ)2 = μ(A)2. Therefore, we deduce that

Image

If you choose two “random sets” of measure μ(A), then their intersection will typically be (μ(A))2, so the inequality above is saying that the average intersection of A with Tn A is at least as big as the “expected” intersection. This result, due to Khinchin, gives more precise information about the nature of Poincaré recurrence.

When a unitary operator is defined in terms of a measure-preserving transformation as above, it is natural to ask whether the averages converge not just in the sense of the L2-norm but also in the more classical sense of convergence almost everywhere. (For a related thought in a different context, see CARLESONS THEOREM [V.5].) The answer is that they do, as was shown by BIRKHOFF [VI.78] soon after he learned of von Neumann’s theorem. He proved that for each integrable function f one could find a function f* such that f*(Tx) = f*(x) for almost every x, and such that

Image

for almost every x. Suppose that the transformation T is ergodic, let AX be a set of positive measure, and let f (x) be the characteristic function of A. It follows from Birkhoff’s theorem that for almost every xX one has

Image

Since the expression

Image

describes the frequency of visits of Tnx to the set A, we see that in an ergodic system the images x, Tx, T2x, . . . of a typical point xA visit A with a frequency that equals the proportion of the space occupied by A.

The ergodic theorems of von Neumann and Birkhoff have been generalized over the years in many different directions. These far-reaching extensions of ergodic theorems, and more generally the ergodic method, have found impressive applications in such diverse fields as statistical mechanics, number theory, probability theory, harmonic analysis, and combinatorics.

Further Reading

Furstenberg, H. 1981. Recurrence in Ergodic Theory and Combinatorial Number Theory. M. B. Porter Lectures. Princeton, NJ: Princeton University Press.

Krengel, U. 1985. Ergodic Theorems, with a supplement by A. Brunel. De Gruyter Studies in Mathematics, volume 6. Berlin: Walter de Gruyter.

Mackey, G. W. 1974. Ergodic theory and its significance for statistical mechanics and probability theory. Advances in Mathematics 12:178–268.

 


The Fermat–Euler Theorem

See MODULAR ARITHMETIC [III.58]