Our theory in a certain sense bridges the positions of Einstein and Bohr, since the complete theory is quite objective and deterministic (“God does not play dice with the universe”), and yet on the subjective level, of assertions relative to observer states, it is probabilistic in the strong sense that there is no way for observers to make any predictions better than the limitations imposed by the uncertainty principle.15

In conclusion, we have seen that if we wish to adhere to objective descriptions then the principle of the psycho-physical parallelism requires that we should be able to consider some mechanical devices as representing observers. The situation is then that such devices must either cause the probabilistic discontinuities of Process 1 or must be transformed into the superpositions we have discussed. We are forced to abandon the former possibility since it leads to the situation that some physical systems would obey different laws from the rest, with no clear means for distinguishing between these two types of systems. We are thus led to our present theory which results from the complete abandonment of Process 1 as a basic process. Nevertheless, within the context of this theory, which is objectively deterministic, it develops that the probabilistic aspects of Process 1 reappear at the subjective level, as relative phenomena to observers.

One is thus free to build a conceptual model of the universe, which postulates only the existence of a universal wave function which obeys a linear wave equation. One then investigates the internal correlations in this wave function with the aim of deducing laws of physics, which are statements that take the form: Under the conditions C the property A of a subsystem of the universe (subset of the total collection of coordinates for the wave function) is correlated with the property B of another subsystem (with the manner of correlation being specified). For example, the classical mechanics of a system of massive particles becomes a law which expresses the correlation between the positions and momenta (approximate) of the particles at one time with those at another time.16 All statements about subsystems then become relative statements, i.e., statements about the subsystem relative to a prescribed state for the remainder (since this is generally the only way a subsystem even possesses a unique state), and all laws are correlation laws.co

The theory based on pure wave mechanics is a conceptually simple causal theory, which fully maintains the principle of the psycho-physical parallelism. It therefore forms a framework in which it is possible to discuss (in addition to ordinary phenomena) observation processes themselves, including the interrelationships of several observers, in a logical, unambiguous fashion. In addition, all of the correlation paradoxes, like that of Einstein, Rosen, and Podolsky,17 find easy explanation.

While our theory justifies the personal use of the probabilistic interpretation as an aid to making practical predictions, it forms a broader frame in which to understand the consistency of that interpretation. It transcends the probabilistic theory, however, in its ability to deal logically with questions of imperfect observation and approximate measurement.

Since this viewpoint will be applicable to all forms of quantum mechanics which maintain the superposition principle, it may prove a fruitful framework for the interpretation of new quantum formalisms. Field theories, particularly any which might be relativistic in the sense of general relativity, might benefit from this position, since one is free to construct formal (non-probabilistic) theories, and supply any possible statistical interpretations later. (This viewpoint avoids the necessity of considering anomalous probabilistic jumps scattered about space-time, and one can assert that field equations are satisfied everywhere and everywhen, then deduce any statistical assertions by the present method.)

By focusing attention upon questions of correlations, one may be able to deduce useful relations (correlation laws analogous to those of classical mechanics) for theories which at present do not possess known classical counterparts. Quantized fields do not generally possess pointwise independent field values, the values at one point of space-time being correlated with those at neighboring points of space-time in a manner, it is to be expected, approximating the behavior of their classical counterparts. If correlations are important in systems with only a finite number of degrees of freedom, how much more important they must be for systems of infinitely many coordinates.

Finally, aside from any possible practical advantages of the theory, it remains a matter of intellectual interest that the statistical assertions of the usual interpretation do not have the status of independent hypotheses, but are deducible (in the present sense) from the pure wave mechanics, which results from their omission.

APPENDIX I

We shall now supply the proofs of a number of assertions which have been made in the text.

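§1. Proof of Theorem 1

We now show that $\{X, Y, \ldots, Z\} > 0$ unless $X, Y, \ldots, Z$ are independent random variables. Abbreviate $P(x_i, y_j, \ldots, z_k)$ by $P_{ij\ldots k}$, and let

$$Q_{ij\ldots k} = \frac{P_{ij\ldots k}}{P_i P_j \cdots P_k}. \tag{1.1}$$

(Note that $P_i P_j \cdots P_k = 0$ implies that also $P_{ij\ldots k} = 0$.) Then always

$$P_{ij\ldots k} = P_i P_j \cdots P_k \, Q_{ij\ldots k}, \tag{1.2}$$

and we have

$$\{X, Y, \ldots, Z\} = \sum_{ij\ldots k} P_{ij\ldots k} \ln \frac{P_{ij\ldots k}}{P_i P_j \cdots P_k} = \sum_{ij\ldots k} P_i P_j \cdots P_k \, Q_{ij\ldots k} \ln Q_{ij\ldots k}. \tag{1.3}$$

Applying the inequality for $x \ge 0$:

$$x \ln x \ge x - 1 \quad (\text{with equality only for } x = 1) \tag{1.4}$$

(which is easily established by calculating the minimum of $x \ln x - (x - 1)$) to (1.3) we have:

$$P_i P_j \cdots P_k \, Q_{ij\ldots k} \ln Q_{ij\ldots k} \ge P_i P_j \cdots P_k \, (Q_{ij\ldots k} - 1). \tag{1.5}$$

Therefore we have for the sum:

$$\sum_{ij\ldots k} P_i P_j \cdots P_k \, Q_{ij\ldots k} \ln Q_{ij\ldots k} > \sum_{ij\ldots k} P_i P_j \cdots P_k \, Q_{ij\ldots k} - \sum_{ij\ldots k} P_i P_j \cdots P_k \tag{1.6}$$

unless all $Q_{ij\ldots k} = 1$. But $\sum_{ij\ldots k} P_i P_j \cdots P_k \, Q_{ij\ldots k} = \sum_{ij\ldots k} P_{ij\ldots k} = 1$, and $\sum_{ij\ldots k} P_i P_j \cdots P_k = 1$, so that the right side of (1.6) vanishes. The left side is, by (1.3), the correlation $\{X, Y, \ldots, Z\}$, and the condition that all of the $Q_{ij\ldots k}$ equal 1 is precisely the independence condition that $P_{ij\ldots k} = P_i P_j \cdots P_k$ for all $i, j, \ldots, k$. We have therefore proved that

$$\{X, Y, \ldots, Z\} > 0 \tag{1.7}$$

unless $X, Y, \ldots, Z$ are mutually independent.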
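As a numerical illustration of Theorem 1, the following minimal sketch evaluates the correlation of two random variables for one independent and one dependent joint distribution. The function name `correlation` and the two example tables are illustrative assumptions, not part of the text.

```python
import math

def correlation(joint):
    """Correlation {X, Y} = sum_ij P_ij ln(P_ij / (P_i P_j)) for a 2-D joint table."""
    row = [sum(r) for r in joint]            # marginal distribution P_i
    col = [sum(c) for c in zip(*joint)]      # marginal distribution P_j
    total = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:                        # terms with P_ij = 0 contribute nothing
                total += p * math.log(p / (row[i] * col[j]))
    return total

# Hypothetical example tables (assumptions chosen for illustration).
independent = [[0.06, 0.14], [0.24, 0.56]]   # outer product of (0.2, 0.8) and (0.3, 0.7)
dependent = [[0.4, 0.1], [0.1, 0.4]]         # same marginals (0.5, 0.5), but not a product

c_ind = correlation(independent)
c_dep = correlation(dependent)
print(c_ind, c_dep)
```

The independent table gives a correlation of zero up to rounding, while the dependent table gives a strictly positive value, as the theorem requires.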

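§2. Convex function inequalities

We shall now establish some basic inequalities which follow from the convexity of the function $x \ln x$.

LEMMA 1. $x_i \ge 0$, $P_i \ge 0$, $\sum_i P_i = 1$ imply

$$\Bigl(\sum_i P_i x_i\Bigr) \ln \Bigl(\sum_i P_i x_i\Bigr) \le \sum_i P_i x_i \ln x_i. \tag{2.1}$$

This property is usually taken as the definition of a convex function,18 but follows from the fact that the second derivative of $x \ln x$ is positive for all positive $x$, which is the elementary notion of convexity. There is also an immediate corollary for the continuous case:

COROLLARY 1. $g(x) \ge 0$, $P(x) \ge 0$, $\int P(x)\,dx = 1$ imply

$$\Bigl[\int P(x) g(x)\,dx\Bigr] \ln \Bigl[\int P(x) g(x)\,dx\Bigr] \le \int P(x) g(x) \ln g(x)\,dx. \tag{2.2}$$

We can now derive a more general and very useful inequality from Lemma 1:

LEMMA 2. $x_i \ge 0$, $a_i \ge 0$ (all $i$) imply

$$\Bigl(\sum_i x_i\Bigr) \ln \frac{\sum_i x_i}{\sum_i a_i} \le \sum_i x_i \ln \frac{x_i}{a_i}. \tag{2.3}$$

Proof. Let $P_i = a_i / \sum_i a_i$, so that $P_i \ge 0$ and $\sum_i P_i = 1$. Then by Lemma 1:

$$\Bigl(\sum_i P_i \frac{x_i}{a_i}\Bigr) \ln \Bigl(\sum_i P_i \frac{x_i}{a_i}\Bigr) \le \sum_i P_i \frac{x_i}{a_i} \ln \frac{x_i}{a_i}. \tag{2.4}$$

Substitution for $P_i$ yields:

$$\frac{\sum_i x_i}{\sum_i a_i} \ln \frac{\sum_i x_i}{\sum_i a_i} \le \sum_i \frac{x_i}{\sum_i a_i} \ln \frac{x_i}{a_i}, \tag{2.5}$$

which reduces to

$$\Bigl(\sum_i x_i\Bigr) \ln \frac{\sum_i x_i}{\sum_i a_i} \le \sum_i x_i \ln \frac{x_i}{a_i}, \tag{2.6}$$

and we have proved the lemma. □

We also mention the analogous result for the continuous case:

COROLLARY 2. $f(x) \ge 0$, $g(x) \ge 0$ (all $x$) imply

$$\Bigl[\int f(x)\,dx\Bigr] \ln \frac{\int f(x)\,dx}{\int g(x)\,dx} \le \int f(x) \ln \frac{f(x)}{g(x)}\,dx. \tag{2.7}$$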
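The inequality of Lemma 2 can be spot-checked numerically. The sketch below draws random positive sequences and compares the two sides; the helper name `log_sum_sides`, the sampling ranges, and the tolerance are assumptions made for illustration, and a random check is of course no substitute for the proof.

```python
import math
import random

def log_sum_sides(x, a):
    """Return (lhs, rhs) of Lemma 2: (sum x) ln(sum x / sum a) <= sum x_i ln(x_i / a_i)."""
    sx, sa = sum(x), sum(a)
    lhs = sx * math.log(sx / sa)
    rhs = sum(xi * math.log(xi / ai) for xi, ai in zip(x, a))
    return lhs, rhs

random.seed(0)
violations = 0
for _ in range(1000):
    n = random.randint(2, 6)
    x = [random.uniform(0.01, 5.0) for _ in range(n)]
    a = [random.uniform(0.01, 5.0) for _ in range(n)]
    lhs, rhs = log_sum_sides(x, a)
    if lhs > rhs + 1e-9:                 # small tolerance for floating-point rounding
        violations += 1

# Equality case: x_i proportional to a_i.
eq_lhs, eq_rhs = log_sum_sides([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(violations, eq_lhs, eq_rhs)
```

When the $x_i$ are proportional to the $a_i$ the two sides coincide, which is the equality case of the convexity argument.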

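§3. Refinement theorems

We now supply the proof for Theorems 2 and 4 of Chapter II, which concern the behavior of correlation and information upon refinement of the distributions. We suppose that the original (unrefined) distribution is $P_{ij\ldots k} = P(x_i, y_j, \ldots, z_k)$, and that the refined distribution is ${P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}$, where the original value $x_i$ for $X$ has been resolved into a number of values $x_i^{\mu_i}$, and similarly for $Y, \ldots, Z$. Then:

$$P_{ij\ldots k} = \sum_{\mu_i \nu_j \ldots \eta_k} {P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}, \qquad P_i = \sum_{\mu_i} {P'}_i^{\mu_i}, \quad \text{etc.} \tag{3.1}$$

Computing the new correlation $\{X, Y, \ldots, Z\}'$ for the refined distribution ${P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}$ we find:

$$\{X, Y, \ldots, Z\}' = \sum_{ij\ldots k} \sum_{\mu_i \nu_j \ldots \eta_k} {P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k} \ln \frac{{P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}}{{P'}_i^{\mu_i} {P'}_j^{\nu_j} \cdots {P'}_k^{\eta_k}}. \tag{3.2}$$

However, by Lemma 2, §2:

$$\sum_{\mu_i \nu_j \ldots \eta_k} {P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k} \ln \frac{{P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}}{{P'}_i^{\mu_i} {P'}_j^{\nu_j} \cdots {P'}_k^{\eta_k}} \ge \Bigl(\sum_{\mu_i \nu_j \ldots \eta_k} {P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}\Bigr) \ln \frac{\sum_{\mu_i \nu_j \ldots \eta_k} {P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}}{\sum_{\mu_i \nu_j \ldots \eta_k} {P'}_i^{\mu_i} {P'}_j^{\nu_j} \cdots {P'}_k^{\eta_k}}. \tag{3.3}$$

Substitution of (3.3) into (3.2), noting that $\sum_{\mu_i \nu_j \ldots \eta_k} {P'}_i^{\mu_i} {P'}_j^{\nu_j} \cdots {P'}_k^{\eta_k}$ is equal to $\bigl(\sum_{\mu_i} {P'}_i^{\mu_i}\bigr)\bigl(\sum_{\nu_j} {P'}_j^{\nu_j}\bigr) \cdots \bigl(\sum_{\eta_k} {P'}_k^{\eta_k}\bigr) = P_i P_j \cdots P_k$, leads to:

$$\{X, Y, \ldots, Z\}' \ge \sum_{ij\ldots k} P_{ij\ldots k} \ln \frac{P_{ij\ldots k}}{P_i P_j \cdots P_k} = \{X, Y, \ldots, Z\}, \tag{3.4}$$

and we have completed the proof of Theorem 2 (Chapter II), which asserts that refinement never decreases the correlation.19

We now consider the effect of refinement upon the relative information. We shall use the previous notation, and further assume that ${a'}_i^{\mu_i}, {b'}_j^{\nu_j}, \ldots, {c'}_k^{\eta_k}$ are the information measures for which we wish to compute the relative information of ${P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}$ and of $P_{ij\ldots k}$. The information measures for the unrefined distribution $P_{ij\ldots k}$ then satisfy the relations:

$$a_i = \sum_{\mu_i} {a'}_i^{\mu_i}, \qquad b_j = \sum_{\nu_j} {b'}_j^{\nu_j}, \qquad \ldots, \qquad c_k = \sum_{\eta_k} {c'}_k^{\eta_k}. \tag{3.5}$$

The relative information of the refined distribution is

$$I'_{XY\ldots Z} = \sum_{ij\ldots k} \sum_{\mu_i \nu_j \ldots \eta_k} {P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k} \ln \frac{{P'}_{ij\ldots k}^{\mu_i \nu_j \ldots \eta_k}}{{a'}_i^{\mu_i} {b'}_j^{\nu_j} \cdots {c'}_k^{\eta_k}}, \tag{3.6}$$

and by exactly the same procedure as we have just used for the correlation we arrive at the result:

$$I'_{XY\ldots Z} \ge \sum_{ij\ldots k} P_{ij\ldots k} \ln \frac{P_{ij\ldots k}}{a_i b_j \cdots c_k} = I_{XY\ldots Z}, \tag{3.7}$$

and we have proved that refinement never decreases the relative information (Theorem 4, Chapter II).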

It is interesting to note that the relation (3.4) for the behavior of correlation under refinement can be deduced from the behavior of relative information (3.7). This deduction is an immediate consequence of the fact that the correlation is a relative information—the information of the joint distribution relative to the product measure of the marginal distributions.
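The refinement property can be illustrated with a small numerical sketch: a fine-grained joint table is merged into a coarser one, and the correlation of each is computed. The function names `correlation` and `coarsen`, the 4×4 table, and the choice of groups are assumptions made for illustration only.

```python
import math

def correlation(joint):
    """Correlation {X, Y} = sum_ij P_ij ln(P_ij / (P_i P_j)) for a 2-D joint table."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    return sum(
        p * math.log(p / (row[i] * col[j]))
        for i, r in enumerate(joint)
        for j, p in enumerate(r)
        if p > 0
    )

def coarsen(joint, groups_x, groups_y):
    """Merge rows and columns of a joint table according to the given index groups."""
    return [
        [sum(joint[i][j] for i in gx for j in gy) for gy in groups_y]
        for gx in groups_x
    ]

# Hypothetical refined distribution over 4 x 4 values (entries sum to 1).
fine = [
    [0.10, 0.05, 0.02, 0.03],
    [0.05, 0.10, 0.03, 0.02],
    [0.02, 0.03, 0.15, 0.05],
    [0.03, 0.02, 0.05, 0.25],
]
# Unrefined distribution: values {0, 1} and {2, 3} are no longer distinguished.
coarse = coarsen(fine, [[0, 1], [2, 3]], [[0, 1], [2, 3]])

c_fine = correlation(fine)
c_coarse = correlation(coarse)
print(c_coarse, c_fine)
```

Read in the direction of the theorem, `coarse` plays the role of the original distribution and `fine` that of its refinement, so the second printed value should not be smaller than the first.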

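§4. Monotone decrease of information for stochastic processes

We consider a sequence of transition-probability matrices $T^n_{ij}$ ($\sum_j T^n_{ij} = 1$ for all $n, i$, and $0 \le T^n_{ij} \le 1$ for all $n, i, j$), and a sequence of measures $a^n_i$ ($a^n_i \ge 0$) having the property that

$$a^{n+1}_j = \sum_i a^n_i T^n_{ij}. \tag{4.1}$$

We further suppose that we have a sequence of probability distributions, $P^n_i$, such that

$$P^{n+1}_j = \sum_i P^n_i T^n_{ij}. \tag{4.2}$$

For each of these probability distributions the relative information $I^n$ (relative to the $a^n$ measure) is defined:

$$I^n = \sum_i P^n_i \ln \frac{P^n_i}{a^n_i}. \tag{4.3}$$

Under these circumstances we have the following theorem:

THEOREM. $I^{n+1} \le I^n$.

Proof. Expanding $I^{n+1}$ we get:

$$I^{n+1} = \sum_j P^{n+1}_j \ln \frac{P^{n+1}_j}{a^{n+1}_j} = \sum_j \Bigl(\sum_i P^n_i T^n_{ij}\Bigr) \ln \frac{\sum_i P^n_i T^n_{ij}}{\sum_i a^n_i T^n_{ij}}. \tag{4.4}$$

However, by Lemma 2 (§2, Appendix I) we have the inequality

$$\Bigl(\sum_i P^n_i T^n_{ij}\Bigr) \ln \frac{\sum_i P^n_i T^n_{ij}}{\sum_i a^n_i T^n_{ij}} \le \sum_i P^n_i T^n_{ij} \ln \frac{P^n_i T^n_{ij}}{a^n_i T^n_{ij}}. \tag{4.5}$$

Substitution of (4.5) into (4.4) yields:

$$I^{n+1} \le \sum_j \sum_i P^n_i T^n_{ij} \ln \frac{P^n_i}{a^n_i} = \sum_i P^n_i \Bigl(\sum_j T^n_{ij}\Bigr) \ln \frac{P^n_i}{a^n_i} = \sum_i P^n_i \ln \frac{P^n_i}{a^n_i} = I^n, \tag{4.6}$$

and the proof is completed. □

This proof can be successively specialized to the case where $T$ is stationary ($T^n_{ij} = T_{ij}$ for all $n$) and then to the case where $T$ is doubly-stochastic ($\sum_i T_{ij} = 1$ for all $j$):

COROLLARY 3. $T^n_{ij}$ is stationary ($T^n_{ij} = T_{ij}$, all $n$), and the measure $a_i$ is a stationary measure ($a_j = \sum_i a_i T_{ij}$), imply that the information, $I^n = \sum_i P^n_i \ln (P^n_i / a_i)$, is monotone decreasing. (As before, $P^{n+1}_j = \sum_i P^n_i T_{ij}$.)

Proof. Immediate consequence of preceding theorem. □

COROLLARY 4. $T_{ij}$ is doubly-stochastic ($\sum_i T_{ij} = 1$, all $j$) implies that the information relative to the uniform measure ($a_i = 1$, all $i$), $I^n = \sum_i P^n_i \ln P^n_i$, is monotone decreasing.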

Proof. For ai = 1 (all i) we have that ∑i aiTij = ∑i Tij = 1 = aj. Therefore the uniform measure is stationary in this case and the result follows from Corollary 3.

These results hold for the continuous case also, and may be easily verified by replacing the above summations by integrations, and by replacing Lemma 2 by its corollary. □
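The monotone decrease can be observed directly in a small simulation. The following sketch evolves a probability distribution and a measure under the same stationary transition matrix and records the relative information at each step. The matrix `T`, the starting distribution, the starting measure, and the function names are all hypothetical choices made for illustration.

```python
import math

def step(v, T):
    """One transition: v'_j = sum_i v_i T[i][j]."""
    return [sum(v[i] * T[i][j] for i in range(len(v))) for j in range(len(T[0]))]

def relative_information(p, a):
    """I = sum_i p_i ln(p_i / a_i); terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log(pi / ai) for pi, ai in zip(p, a) if pi > 0)

# Hypothetical transition-probability matrix (each row sums to 1).
T = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
]

p = [1.0, 0.0, 0.0]   # assumed initial probability distribution
a = [1.0, 1.0, 1.0]   # assumed initial measure; it evolves by the same rule as p

history = []
for _ in range(20):
    history.append(relative_information(p, a))
    p, a = step(p, T), step(a, T)

print(history[:5])
```

Because this `T` is not doubly stochastic, the uniform measure is not stationary, so the measure must be carried along with the distribution as in the theorem; the recorded values should never increase from one step to the next.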

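§5. Proof of special inequality for Chapter IV (1.7)

LEMMA. Given probability densities $P(r)$, $P_1(x)$, $P_2(r)$, with $P(r) = \int P_1(x) P_2(r - x\tau)\,dx$. Then $I_R \le I_X - \ln \tau$, where $I_X = \int P_1(x) \ln P_1(x)\,dx$ and $I_R = \int P(r) \ln P(r)\,dr$.

Proof. We first note that:

$$\int P_2(r - x\tau)\,dx = \int P_2(\omega)\,\frac{d\omega}{\tau} = \frac{1}{\tau} \qquad (\text{substitution } \omega = r - x\tau), \tag{5.1}$$

and that furthermore

$$\int P_2(r - x\tau)\,dr = 1. \tag{5.2}$$

We now define the density $\tilde{P}^r(x)$:

$$\tilde{P}^r(x) = \tau P_2(r - x\tau), \tag{5.3}$$

which is normalized, by (5.1). Then, according to §2, Corollary 1 of Appendix I, we have the relation:

$$\Bigl[\int \tilde{P}^r(x) P_1(x)\,dx\Bigr] \ln \Bigl[\int \tilde{P}^r(x) P_1(x)\,dx\Bigr] \le \int \tilde{P}^r(x) P_1(x) \ln P_1(x)\,dx. \tag{5.4}$$

Substitution from (5.3) gives

$$\Bigl[\tau \int P_2(r - x\tau) P_1(x)\,dx\Bigr] \ln \Bigl[\tau \int P_2(r - x\tau) P_1(x)\,dx\Bigr] \le \tau \int P_2(r - x\tau) P_1(x) \ln P_1(x)\,dx. \tag{5.5}$$

The relation $P(r) = \int P_1(x) P_2(r - x\tau)\,dx$, together with (5.5) then implies

$$P(r) \ln \tau P(r) \le \int P_2(r - x\tau) P_1(x) \ln P_1(x)\,dx, \tag{5.6}$$

which is the same as:

$$P(r) \ln P(r) \le \int P_2(r - x\tau) P_1(x) \ln P_1(x)\,dx - P(r) \ln \tau. \tag{5.7}$$

Integrating with respect to $r$, and interchanging the order of integration on the right side gives:

$$\int P(r) \ln P(r)\,dr \le \int \Bigl[\int P_2(r - x\tau)\,dr\Bigr] P_1(x) \ln P_1(x)\,dx - (\ln \tau) \int P(r)\,dr. \tag{5.8}$$

But using (5.2) and the fact that $\int P(r)\,dr = 1$ this means that

$$I_R \le I_X - \ln \tau, \tag{5.9}$$

and the proof of the lemma is complete. □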
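For a concrete instance of the lemma, one may take both $P_1$ and $P_2$ to be normal densities, in which case every quantity has a closed form. The sketch below relies on two standard facts that are assumptions relative to the text: the information $\int P \ln P$ of a normal density with standard deviation $\sigma$ is $-\ln(\sigma\sqrt{2\pi e})$, and $P(r)$ is then normal with variance $\tau^2\sigma_x^2 + \sigma_2^2$. The function names and parameter values are hypothetical.

```python
import math

def gaussian_information(sigma):
    """Information ∫ P ln P of a normal density with standard deviation sigma."""
    return -math.log(sigma * math.sqrt(2.0 * math.pi * math.e))

def lemma_sides(sigma_x, sigma_2, tau):
    """Return (I_R, I_X - ln tau) when P1 and P2 are normal densities.

    P(r) = ∫ P1(x) P2(r - x*tau) dx is the density of r = tau*x + w with
    x ~ P1 and w ~ P2 independent, hence normal with variance
    tau^2 * sigma_x^2 + sigma_2^2.
    """
    sigma_r = math.sqrt((tau * sigma_x) ** 2 + sigma_2 ** 2)
    i_r = gaussian_information(sigma_r)
    bound = gaussian_information(sigma_x) - math.log(tau)
    return i_r, bound

# Hypothetical parameter choices (sigma_x, sigma_2, tau).
cases = [(1.0, 0.5, 2.0), (0.3, 2.0, 0.1), (5.0, 0.01, 1.0)]
results = [lemma_sides(sx, s2, tau) for sx, s2, tau in cases]
for i_r, bound in results:
    print(i_r, bound)
```

In the last case the spread of $P_2$ is very small compared with $\tau\sigma_x$, and the two sides nearly coincide, which shows that the bound cannot be improved in general.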

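§6. Stationary point of $I_K + I_X$

We shall show that the information sum:

$$I_K + I_X = \int_{-\infty}^{\infty} \phi^*\phi(k) \ln \phi^*\phi(k)\,dk + \int_{-\infty}^{\infty} \psi^*\psi(x) \ln \psi^*\psi(x)\,dx, \tag{6.1}$$

where

$$\phi(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} \psi(x)\,dx,$$

is stationary for the functions:

$$\psi_0(x) = \Bigl(\frac{1}{2\pi\sigma_x^2}\Bigr)^{1/4} e^{-x^2/4\sigma_x^2}, \qquad \phi_0(k) = \Bigl(\frac{2\sigma_x^2}{\pi}\Bigr)^{1/4} e^{-k^2\sigma_x^2}, \tag{6.2}$$

with respect to variations of $\psi$, $\delta\psi$, which preserve the normalization:

$$\int_{-\infty}^{\infty} \delta(\psi^*\psi)\,dx = 0. \tag{6.3}$$

The variation $\delta\psi$ gives rise to a variation $\delta\phi$ of $\phi(k)$:

$$\delta\phi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} \delta\psi\,dx. \tag{6.4}$$

To avoid duplication of effort we first calculate the variation $\delta I_\xi$ for an arbitrary wave function $u(\xi)$. By definition,

$$I_\xi = \int_{-\infty}^{\infty} u^*u(\xi) \ln u^*u(\xi)\,d\xi, \tag{6.5}$$

so that

$$\delta I_\xi = \int_{-\infty}^{\infty} \bigl[u^*u\,\delta(\ln u^*u) + \delta(u^*u) \ln u^*u\bigr]\,d\xi = \int_{-\infty}^{\infty} (1 + \ln u^*u)(u^*\delta u + u\,\delta u^*)\,d\xi. \tag{6.6}$$

We now suppose that $u$ has the real form:

$$u(\xi) = a e^{-b\xi^2} = u^*(\xi), \tag{6.7}$$

and from (6.6) we get

$$\delta I_\xi = \int_{-\infty}^{\infty} (1 + \ln a^2 - 2b\xi^2)\,a e^{-b\xi^2}\,\delta u\,d\xi + \text{c.c.} \tag{6.8}$$

We now compute $\delta I_K$ for $\phi_0$ using (6.8), (6.2), and (6.4):

$$\delta I_K = \int_{-\infty}^{\infty} (1 + \ln a'^2 - 2b'k^2)\,a' e^{-b'k^2}\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} \delta\psi(x)\,dx\,dk + \text{c.c.}, \tag{6.9}$$

where

$$a' = \Bigl(\frac{2\sigma_x^2}{\pi}\Bigr)^{1/4}, \qquad b' = \sigma_x^2.$$

Interchanging the order of integration and performing the definite integration over $k$ we get:

$$\delta I_K = \int_{-\infty}^{\infty} \frac{a'}{\sqrt{2b'}} \Bigl(\ln a'^2 + \frac{x^2}{2b'}\Bigr) e^{-x^2/4b'}\,\delta\psi(x)\,dx + \text{c.c.}, \tag{6.10}$$

while application of (6.8) to $\psi_0$ gives

$$\delta I_X = \int_{-\infty}^{\infty} (1 + \ln a''^2 - 2b''x^2)\,a'' e^{-b''x^2}\,\delta\psi(x)\,dx + \text{c.c.}, \tag{6.11}$$

where

$$a'' = \Bigl(\frac{1}{2\pi\sigma_x^2}\Bigr)^{1/4}, \qquad b'' = \frac{1}{4\sigma_x^2}.$$

Adding (6.10) and (6.11), and substituting for $a'$, $b'$, $a''$, $b''$, yields:

$$\delta(I_K + I_X) = (1 - \ln \pi) \int_{-\infty}^{\infty} \Bigl(\frac{1}{2\pi\sigma_x^2}\Bigr)^{1/4} e^{-x^2/4\sigma_x^2}\,\delta\psi(x)\,dx + \text{c.c.} \tag{6.12}$$

But the integrand of (6.12) is simply $\psi_0(x)\,\delta\psi(x)$, so that

$$\delta(I_K + I_X) = (1 - \ln \pi) \int_{-\infty}^{\infty} \psi_0\,\delta\psi\,dx + \text{c.c.} \tag{6.13}$$

Since $\psi_0$ is real, $\psi_0\,\delta\psi + \text{c.c.} = \psi_0^*\,\delta\psi + \text{c.c.} = \psi_0^*\,\delta\psi + \psi_0\,\delta\psi^* = \delta(\psi^*\psi)$, so that

$$\delta(I_K + I_X) = (1 - \ln \pi) \int_{-\infty}^{\infty} \delta(\psi^*\psi)\,dx = 0, \tag{6.14}$$

due to the normality restriction (6.3), and the proof is completed.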
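A limited numerical companion to this result is to evaluate the information sum at the Gaussian itself. The sketch below assumes that the position and momentum distributions are normal with standard deviations related by $\sigma_k = 1/(2\sigma_x)$ (the relation for a Gaussian Fourier pair in units where $\hbar = 1$), and uses the closed form $-\ln(\sigma\sqrt{2\pi e})$ for the information of a normal density. It checks only that the value of $I_K + I_X$ is the same for every width; it does not test stationarity against non-Gaussian variations. The function names are hypothetical.

```python
import math

def gaussian_information(sigma):
    """Information ∫ P ln P of a normal density with standard deviation sigma."""
    return -math.log(sigma * math.sqrt(2.0 * math.pi * math.e))

def information_sum(sigma_x):
    """I_K + I_X for a Gaussian wave function of position spread sigma_x.

    Assumes the momentum-space spread is sigma_k = 1 / (2 * sigma_x).
    """
    sigma_k = 1.0 / (2.0 * sigma_x)
    return gaussian_information(sigma_x) + gaussian_information(sigma_k)

values = [information_sum(s) for s in (0.1, 1.0, 7.5)]
expected = -math.log(math.pi * math.e)   # closed form: -ln(pi * e)
print(values, expected)
```

Under these assumptions the sum equals $-\ln(\pi e)$ regardless of $\sigma_x$, since narrowing one distribution widens the other by exactly the compensating factor.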

APPENDIX II

REMARKS ON THE ROLE OF THEORETICAL PHYSICScp

There have been lately a number of new interpretations of quantum mechanics, most of which are equivalent in the sense that they predict the same results for all physical experiments. Since there is therefore no hope of deciding among them on the basis of physical experiments, we must turn elsewhere and inquire into the fundamental question of the nature and purpose of physical theories in general. Only after we have investigated and come to some sort of agreement upon these general questions, i.e., of the role of theories themselves, will we be able to put these alternative interpretations in their proper perspective.

Every theory can be divided into two separate parts: the formal part and the interpretive part. The formal part consists of a purely logico-mathematical structure, i.e., a collection of symbols together with rules for their manipulations, while the interpretive part consists of a set of “associations,” which are rules which put some of the elements of the formal part into correspondence with the perceived world. The essential point of a theory, then, is that it is a mathematical model, together with an isomorphism1 between the model and the world of experience (i.e., the sense perceptions of the individual, or the “real world,” depending upon one’s choice of epistemology).cq

The model nature is quite apparent in the newest theories, as in nuclear physics, and particularly in those fields outside of physics proper, such as the theory of games, various economic models, etc., where the degree of applicability of the models is still a matter of considerable doubt. However, when a theory is highly successful and becomes firmly established, the model tends to become identified with “reality” itself, and the model nature of the theory becomes obscured. The rise of classical physics offers an excellent example of this process. The constructs of classical physics are just as much fictions of our own minds as those of any other theory; we simply have a great deal more confidence in them. It must be deemed a mistake, therefore, to attribute any more “reality” here than elsewhere.cr

Once we have granted that any physical theory is essentially only a model for the world of experience, we must renounce all hope of finding anything like “the correct theory.” There is nothing which prevents any number of quite distinct models from being in correspondence with experience (i.e., all “correct”), and furthermore no way of ever verifying that any model is completely correct, simply because the totality of all experience is never accessible to us.

Two types of prediction can be distinguished: the prediction of phenomena already understood, in which the theory plays simply the role of a device for compactly summarizing known results (the aspect of most interest to the engineer), and the prediction of new phenomena and effects, unsuspected before the formulation of the theory. Our experience has shown that a theory often transcends the restricted field in which it was formulated. It is this phenomenon (which might be called the “inertia” of theories) which is of most interest to the theoretical physicist and supplies a greater motive to theory construction than that of aiding the engineer.

From the viewpoint of the first type of prediction we would say that the “best” theory is the one from which the most accurate predictions can be most easily deduced—two not necessarily compatible ideals. Classical physics, for example, permits deductions with far greater ease than the more accurate theories of relativity and quantum mechanics, and in such a case we must retain them all. It would be the worst sort of folly to advocate that the study of classical physics be completely dropped in favor of the newer theories. It can even happen that several quite distinct models can exist which are completely equivalent in their predictions, such that different ones are most applicable in different cases, a situation which seems to be realized in quantum mechanics today. It would seem foolish to attempt to reject all but one in such a situation, where it might be profitable to retain them all.

Nevertheless, we have a strong desire to construct a single all-embracing theory which would be applicable to the entire universe. From what stems this desire? The answer lies in the second type of prediction—the discovery of new phenomena—and involves the consideration of inductive inference and the factors which influence our confidence in a given theory (to be applicable outside of the field of its formulation). This is a difficult subject and one which is only beginning to be studied seriously. Certain main points are clear, however, for example, that our confidence increases with the number of successes of a theory. If a new theory replaces several older theories which deal with separate phenomena, i.e., a comprehensive theory of the previously diverse fields, then our confidence in the new theory is very much greater than the confidence in either of the older theories, since the range of success of the new theory is much greater than any of the older ones. It is therefore this factor of confidence which seems to be at the root of the desire for comprehensive theories.

A closely related criterion is simplicity—by which we refer to conceptual simplicity rather than ease in use, which is of paramount interest to the engineer. A good example of the distinction is the theory of general relativity which is conceptually quite simple, while enormously cumbersome in actual calculations. Conceptual simplicity, like comprehensiveness, has the property of increasing confidence in a theory. A theory containing many ad hoc constants and restrictions, or many independent hypotheses, in no way impresses us as much as one which is largely free of arbitrariness.

It is necessary to say a few words about a view which is sometimes expressed, the idea that a physical theory should contain no elements which do not correspond directly to observables. This position seems to be founded on the notion that the only purpose of a theory is to serve as a summary of known data, and overlooks the second major purpose, the discovery of totally new phenomena. The major motivation of this viewpoint appears to be the desire to construct perfectly “safe” theories which will never be open to contradiction. Strict adherence to such a philosophy would probably seriously stifle the progress of physics.

The critical examination of just what quantities are observable in a theory does, however, play a useful role, since it gives an insight into ways of modification of a theory when it becomes necessary. A good example of this process is the development of Special Relativity. Such successes of the positivist viewpoint, when used merely as a tool for deciding which modifications of a theory are possible, in no way justify its universal adoption as a general principle which all theories must satisfy.cs

In summary, a physical theory is a logical construct (model), consisting of symbols and rules for their manipulation, some of whose elements are associated with elements of the perceived world. The fundamental requirements of a theory are logical consistency and correctness. There is no reason why there cannot be any number of different theories satisfying these requirements, and further criteria such as usefulness, simplicity, comprehensiveness, pictorability, etc., must be resorted to in such cases to further restrict the number. Even so, it may be impossible to give a total ordering of the theories according to “goodness,” since different ones may rate highest according to the different criteria, and it may be most advantageous to retain more than one.ct

As a final note, we might comment upon the concept of causality. It should be clearly recognized that causality is a property of a model and not a property of the world of experience. The concept of causality only makes sense with reference to a theory in which there are logical dependencies among the elements. A theory contains relations of the form “A implies B,” which can be read as “A causes B,” while our experience, uninterpreted by any theory, gives nothing of the sort, but only a correlation between the event corresponding to B and that corresponding to A.

ao For further historical details, see the discussions of the long thesis in chapters 1 and 2 of this volume.

ap This is the von Neumann–Dirac formulation of quantum mechanics (von Neumann, 1955). See the discussion of these postulates in the conceptual introduction (chapter 3, pgs. 28–29). Everett later contrasted this with the Copenhagen interpretation. See for example pgs. 238–40. See also the discussion of Everett’s understanding of the Copenhagen interpretation in the conceptual introduction (chapter 3, pg. 32).

aq What follows are two versions of what is now known as the Wigner’s Friend story (Wigner, 1961). Everett uses these idealized thought experiments to argue for the inconsistency of the standard collapse formulation of quantum mechanics. See also Everett’s presentation in the short thesis (chapter 6, pg. 176) and the discussion of Everett’s understanding of the measurement problem in the conceptual introduction (chapter 3, pg. 30).

ar It is extremely hypothetical because such an experiment would be virtually impossible in practice due to environmental decoherence effects. Everett is concerned with a conceptual problem that has nothing to do with what measurements one might in fact be able to perform.

as This would be a hidden variable proposal like Bohmian mechanics. Everett has no quick argument against this proposal here. It is discussed later in this thesis (pg. 153). Everett’s main objection is that, since pure wave mechanics works fine without such an assumption, postulating hidden variables is unnecessary.

at Although Everett took his discussion of information theory to be an important part of his project, most of the material from this chapter was cut in writing the short thesis.

au This intuition gives Everett a way to tie the notion of a correlation to the formal notion of information since correlations in this sense provide information one might use to infer one state from another. But Everett’s use of standard epistemic language is misleading since the formal apparatus here refers to uninterpreted distributions and not to values. More specifically, information here is just a measure of the shape of a distribution. Similarly, correlation is just a measure of how the properties of two distributions covary.

av The correlation measure is robust under refinement in the sense that it never decreases when one considers a distribution over a more fine-grained set of physical distinctions. Strong correlations then remain strong under more precise physical specification. The monotonic behavior of the correlations under increasingly fine-grained partitions also supports Everett’s ultimate definition of the general correlation between variables as the limit of the correlations between the variables under increasingly fine-grained partitions.

aw This is a more general notion of relative information. Whereas the more general notion has the standard information measure as a special case (for finite distributions when the basic measure is uniform), it introduces a new parameter: the basic measure. Everett needs the more general notion to get an information measure for distributions over infinite sets that does not diverge on increasingly fine-grained partitions.

ax The claim here is that the more general notion of information introduced does not change the salient objective properties of the correlation measure.

ay This is Everett’s notion for distributions over infinite sets: The information of a distribution is the limit of information of distributions under increasingly fine-grained partitions.

az The information of a general distribution then is always relative to a set of basic measures on each of the variables. The idea is that there are choices of the basic measures for which the information for the distributions defined relative to the basic measures will be finite. In particular, Everett proposes using the uniform measure for distributions over finite sets and Lebesgue measure for distributions over infinite sets.

ba That the total information is constant means that the information-theoretic entropy is also constant for all classical systems. Since one might expect the thermodynamic entropy typically to increase over time for macroscopic systems, this example illustrates why the information-theoretic entropy cannot be taken to determine the thermodynamic entropy.

bb Everett starts with this statement of the second law of thermodynamics as an empirically supported postulate. He then seeks to translate this statement of the second law into a form that is compatible with the fact that the total information-theoretic entropy is constant.

bc The marginal distribution is the distribution of a proper subset of the total set of variables where one simply ignores any correlations outside the subset. Such distributions are typically not fully descriptive since one has thrown away correlation information.

bd What follows in the next few pages is an introduction of the basic notation and concepts of quantum mechanics. Everett favors the Schrödinger over the Heisenberg representation. Most of the orthodox quantum formalism is simply carried over into pure wave mechanics, but often with a different significance.

be Everett alternates in the text between calling this eigenfunction (Imagei and θj.

bf Relative states track the precise nature of the correlations between the subsystems of a composite system. For a countable dimensional space, the state Image relative to state η of S2 is the state one gets by expanding the total state of the composite system in a basis such that the state η occurs in precisely one term in the expansion and taking the rest of the renormalized term to be the relative state of S1. See the conceptual introduction (pg. 34) for a discussion of how Everett understood relative states and the role they played in pure wave mechanics.

bg The more spread out over the elements of the basis, the lower the information of the basis for the state; the more focused, the higher the information. The information of the basis for the state is then a measure of how close the state is to being an element of the basis. The information of an operator for the state is defined similarly, but with respect to the eigenvectors associated with the operator.

bh Everett likely means IA′, (ψ).

bi See the conceptual introduction (chapter 3, pg. 34) for a discussion of Everett’s notion of relative states.

bj For approximately correlating interactions, there are always relative system states that are approximate eigenstates of the measurement. Typically, in realistic interactions there also are relative states where neither the measurement pointer nor the observed quantity of the object system are approximately determinate and relative states where they are both approximately determinate but not well correlated. Consider, for example, the state of the pointer on the measuring device relative to the experimenter having tripped and fallen onto the measuring device while performing the measurement.

bk The prediction of pure wave mechanics is entirely straightforward. The question is whether one can reconcile the entangled state prediction with experience. On the face of it, this is simply a question of empirical adequacy. The goal is to show how to deduce the standard predictions of quantum mechanics for observers in pure wave mechanics or, more specifically, to show how to find the determinate measurement records we take ourselves to have in the structure described by pure wave mechanics. See the discussion of empirical adequacy in the second appendix (pg. 168).

bl Each of the following deductions identifies a property of the correlation structure of pure wave mechanics. These properties follow from the definition of a good measurement and the linearity of the dynamics. The significance of the properties for Everett’s project, however, depends on how one interprets Everett. On the bare theory, as a simple case, each of these properties describes the surefire dispositions of observers within pure wave mechanics. See the discussion of the bare theory in (Barrett, 1999).

bm For Everett a measurement is fully constituted by the correlation that is produced in the absolute state between the observer and the object system. This correlation, as he explains in the footnote here, leads to a postmeasurement observer for whom multiple relative states obtain. Everett then associates the relative states of the observer with different experiences. See the discussion of relative states in the conceptual introduction (chapter 3, pg. 34).

bn This is the beginning of Everett’s extended argument for the special status of the norm-squared measure of typicality in pure wave mechanics. A version of this argument can also be found in the short thesis, chapter 9 (pg. 188). Everett considered his discussion of typicality to be of central importance. See pgs. 273–75 and 295 for descriptions of the role of these considerations in pure wave mechanics. See also the discussion of empirical faithfulness in the conceptual introduction (pg. 53). Establishing the uniqueness of the norm-squared measure required a special set of background assumptions. Everett was well aware that there were many ways one might introduce a typicality measure over branches. Although the norm-squared measure is perhaps particularly natural, that there are other measures one might adopt indicates that there can be no way to deduce this measure from pure wave mechanics alone; rather, to argue for the uniqueness of the norm-squared measure, one must add something to the theory that constrains one’s selection. The question then becomes one of the naturalness of the principles one adds. The methodological problem is that when one knows what measure of typicality one wants, the principles one adds to get that measure invariably seem natural.

bo This is a search for an appropriate measure of typicality on the set of elements of the superposition. The constraints Everett imposes are that (1) the measure be a positive function over the elements of the superposition for each possible expansion of the state, (2) each such measure must depend only on the magnitude, not the phase, of the coefficients associated with the terms describing the elements on the particular expansion, and (3) the measures associated with different expansions of the absolute state must be related by a nesting requirement so that the measure assigned to a term in a coarser-grained expansion that represents a linear combination of individual terms in a finer-grained expansion is equal to the sum of the finer-grained measures on the individual terms. This last condition represents a constraint on how the measures associated with different expansions of the absolute state are related. Everett then argues that the norm-squared coefficient measure is the unique measure up to a multiplicative constant satisfying these requirements.

bp See also the discussion of the analogy between probability in pure wave mechanics and probability in statistical mechanics in the short thesis (pgs. 191–92). Everett further explained the parallel with statistical mechanics in his comments at the Xavier conference in 1961 (pg. 275). A disanalogy, however, is that whereas the probabilities in statistical mechanics might be thought of as epistemic, resulting from one not knowing the microstate of a system, it is unclear what the corresponding epistemic consideration might be in pure wave mechanics.

bq The original version of the long thesis reads here: “We choose for this measure the square amplitude of the coefficients of the superposition, a choice which we shall subsequently see is not as arbitrary as it appears” (Everett 1956, pg. 98). Everett later changed this to the stronger statement for the version included in the DeWitt and Graham anthology (DeWitt and Graham, 1973).

br See the discussion in fn. ea on pg. 193.

bs See also Everett’s discussion of this result in the short thesis (chapter 6) and Barrett (1999, limiting properties of the bare theory) for more details.

bt Everett likely means “. . . S1 to S3.”

bu This is Everett’s description of the EPR experiment (Einstein et al., 1935).

bv The stable relative configuration in the correlation structure represents the hydrogen atom as a classical object with a diachronic identity. For Everett, a complex object is fully determined by the internal correlations between its parts. This holds for microscopic objects like the hydrogen atom and by direct analogy for macroscopic systems like the cannonball below.

bw While correspondents complained that Everett did not address macrophysical phenomena (see, for example, pg. 228), this was in fact something he believed he had fully addressed. As explained here, Everett’s account of the quasi-classicality of branch states depended on the persistence of approximately determinate relative positions and approximately determinate relative momenta on each branch, for systems with large masses over short times. Indeed, it is the relative quasi-classical behavior of macrosystems that often allows one to identify the same branch at different times. On this view, classical laws describe the regular behavior of relative quasi-classical properties of macroscopic systems on each branch. See also pg. 158 and the discussion of classicality in the conceptual introduction (pgs. 49–50).

bx It is, in Everett’s words, the exclusion of the middle ground that does the work here. Such systems allow for sharp records.

by Since the dynamics is linear, there is a precise formal sense in which each element of the superposition can be thought of as following the dynamics separately, but this cannot be understood to preclude the possibility of interference between elements when predicted by the linear dynamics. It was essential to Everett’s understanding of pure wave mechanics that interference between branches always be possible, at least in principle. See for example pgs. 149–50.

bz See Everett’s footnote regarding the language difficulty (pg. 121). See Everett’s other discussions of reversibility and his discussions of interference between branches (pgs. 224, 240, 287, and 150).

ca The question of what one can affect and what one can know under the linear dynamics is somewhat more subtle than suggested by what Everett says here. Although there is a clear sense in which a relative observer cannot influence another element of the absolute state, he might at least in principle know the relative states associated with other elements of the absolute state by knowing something concerning the absolute state itself. See the discussion following pg. 73 and pg. 176, Wigner (1961), and the following footnote cb.

cb See Albert (1992, Ch. 8) for a description of what more one might know and Monton (1998) for further discussion of this point.

cc Everett is clear here that irreversible processes are not required for an interaction to count as a measurement—only an appropriate correlation between the pointer variable and the system being measured.

cd This is one of the central problems Everett starts with in his short thesis but does not discuss in detail there, chapter 9 (pg. 196). The point of this section is that, unlike Bohr’s interpretation, Everett’s relative-state interpretation provides compelling models for all correlating, or measurement-like, interactions.

ce Everett’s argument for the operational reality of all branches then was that the linear dynamics requires that it is always possible in principle that one might observe interference between branches.

cf Everett’s criticisms of the Copenhagen interpretation led to conflict with Bohr and the Copenhagen colleagues. Wheeler, as his adviser, tried to explain that Everett did not really mean to be attacking the orthodox interpretation. See chapter 12 (pg. 219) for Wheeler’s defense of Everett and the later exchange between Everett and Petersen in the discussions following pgs. 236 and 238. See also the discussion of Everett’s views in the conceptual introduction, chapter 3 (pg. 32).

cg In addition to wanting a theory that satisfies the minimal conditions of being logically consistent and empirically faithful, Everett explains in the second appendix that one might also want a theory that is comprehensive and pictorable. Such a theory would provide models for all physical interactions, including measurements, something the Copenhagen interpretation does not accomplish and explicitly denies as being a virtue.

ch Everett’s reference here originally read “I. E. Siegal.” Everett seems to have mistaken Norbert Wiener’s collaborator Armand Siegel for the mathematical physicist Irving Segal. Everett’s primary argument here against hidden variable theories is that hidden variables are not needed since pure wave mechanics is similarly consistent, empirically faithful, comprehensive, and pictorable, but is also simpler. See appendix II (pg. 168) for Everett’s discussion of theoretical virtues and theory selection.

ci Text read “Siegal.”

cj The GRW formulation of quantum mechanics is a recent example of such a theory (Ghirardi et al., 1986). Everett had no fundamental objection to this strategy. But since he believed that pure wave mechanics formed a satisfactory theory, he took the introduction of a stochastic dynamics to be unnecessary.

ck See also Everett’s discussion of Bopp’s theory in his letter to DeWitt in chapter 16 (pg. 256).

cl The brackets here are Everett’s own.

cm Although Everett took other strategies for addressing the quantum measurement problem seriously, he favored pure wave mechanics on the grounds that it satisfied the minimal conditions of consistency and empirical faithfulness and had the added virtues of being simple and comprehensive. See Appendix II (pg. 168).

cn The suggestion is that the mouse’s observation does not cause the universe to split. Rather, the observation sets up a correlation between the mouse’s measurement record and the world thereby splitting the mouse’s state into a set of relative states, typically one for each possible measurement outcome. See also pgs. 188–89.

co This is Everett’s summary of how pure wave mechanics explains the quasi-classical behavior of macrosystems. See also the earlier extended discussion (pgs. 134–37).

cp This is Everett’s extended discussion of the nature and cognitive status of physical theories. He explains in the first sentence of the section the primary reason for this discussion. Everett describes his understanding of physical theories as being essentially the same as Philipp Frank’s, chapter 17 (pg. 257). See also Everett’s discussion of the material in this section in his correspondence with DeWitt, chapter 16 (pg. 252).

cq This point concerns the proper relationship between a theory’s mathematical model and experience. Everett first describes the relationship as an isomorphism. In this description, the mathematical model of experience described by an empirically faithful theory and our actual experience would have precisely the same structure. He then suggests that the relationship between the model and experience is better characterized as a homomorphism. Here he seems to have in mind an isomorphism between a proper substructure of the model and a proper substructure of our representation of experience. There are two considerations involved in the homomorphism: (1) the theory is not required to explain all of our experience and (2) there may be parts of the model that are not interpreted as our experience. Consideration (1) allows pure wave mechanics to be a perfectly satisfactory physical theory without capturing all our experience—it need not, for example, say anything concerning the exchange rate between the dollar and the euro. Consideration (2) allows the formal model to contain more than is found in our experience. Everett identifies experience in the correlation model of pure wave mechanics by the memory sequences represented by the terms in an appropriate expansion of the absolute state; but, as he explains to DeWitt, not all of the memory sequences represented in the absolute state are directly relevant to our experience (chapter 16, pgs. 254–55). Rather, it is enough for Everett that one can find our experience represented in a typical term in the absolute state in the norm-squared sense of typical. In this sense pure wave mechanics is taken to be empirically faithful. See also the discussion of the empirical virtues of pure wave mechanics in Everett’s letter to DeWitt (chapter 16, pg. 252) and the discussion of Everett’s understanding of the status of theories in the conceptual introduction (chapter 3, pgs. 51–54).

cr Everett held that the nonempirical entities and structures of our best physical theories are to be regarded as fictions and that even the long-term success of a theory is not an indication of its descriptive truth. It was, consequently, the logical, empirical, and pragmatic virtues of theories that formed the proper basis for their evaluation. This view agrees well with Everett’s identification of his position with Philipp Frank’s operational view in chapter 17 (pg. 257). Everett did not opt for a simple-minded version of positivism since he held that taking a theory to be nothing more than a representation of experience missed the picturing and forward-looking aspects of empirical inquiry. See pg. 171.

cs A simple-minded positivist view does not acknowledge the broad collection of theoretical virtues that are properly relevant to the acceptance of a theory as summarized below. Further, as Everett has argued, a theory should take the risk of predicting future experience of new unexperienced types.

ct The picture is one of theory selection by means of a pragmatic and forward-looking cost–benefit analysis where one should be willing to take epistemic risks but where the ultimate descriptive truth of the theory and the metaphysical nature of the world are largely irrelevant. One consequently may wish to keep more than one theory out of a set of strictly incompatible theories. More specifically, whereas Everett took the relative-state interpretation to be the best option, he did not argue that one should simply reject other interpretations.

1 We use here the terminology of von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955.].

2 In the words of von Neumann ([J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955.], p. 418) : “. . . it is a fundamental requirement of the scientific viewpoint—the so-called principle of the psycho-physical parallelism—that it must be possible so to describe the extra-physical process of the subjective perception as if it were in reality in the physical world—i.e., to assign to its parts equivalent physical processes in the objective environment, in ordinary space.”

3 The theory originated by Claude E. Shannon [C. E. Shannon, W. Weaver, The Mathematical Theory of Communication. University of Illinois Press: 1949].

1 We regard it as undefined if $P(w_i, \ldots, x_j) = 0$. In this case $P(w_i, \ldots, x_j, y_k, \ldots, z_l)$ is necessarily zero also.

2 This definition corresponds to the negative of the entropy of a probability distribution as defined by Shannon [C. E. Shannon, W. Weaver, The Mathematical Theory of Communication. University of Illinois Press: 1949].

3 A good discussion of information is to be found in Shannon [C. E. Shannon, W. Weaver, The Mathematical Theory of Communication. University of Illinois Press: 1949], or Woodward [P. M. Woodward, Probability and Information Theory, with Applications to Radar. McGraw-Hill, New York: 1953]. Note, however, that in the theory of communication one defines the information of a state xi, which has a priori probability Pi, to be – ln Pi. We prefer, however, to regard information as a property of the distribution itself.
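
The contrast drawn in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `information` and `surprisal` are hypothetical, and the formula used for the information of a distribution is the negative of Shannon's entropy, $\sum_i P_i \ln P_i$, as the preceding note describes it.

```python
import math

def information(probs):
    """Information of a whole distribution, taken as the negative of its
    Shannon entropy: sum_i P_i ln P_i. Terms with P_i = 0 are skipped,
    since p ln p tends to 0 as p tends to 0."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be a probability distribution")
    return sum(p * math.log(p) for p in probs if p > 0)

def surprisal(p):
    """Communication-theory information of a single state with
    a priori probability p: -ln p."""
    return -math.log(p)

uniform = [0.25] * 4            # least informative distribution on four states
certain = [1.0, 0.0, 0.0, 0.0]  # a fully determined distribution

info_uniform = information(uniform)  # equals -ln 4
info_certain = information(certain)  # equals 0, the maximum for a discrete distribution
```

On this convention the information is a single number attached to the distribution, and it increases toward zero as the distribution sharpens, whereas `surprisal` is attached to one state at a time.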

4 A measure is a non-negative, countably additive set function, defined on some subsets of a given set. It is a probability measure if the measure of the entire set is unity. See Halmos [P. R. Halmos, Measure Theory. Van Nostrand, New York: 1950].

5 See Kelley [J. Kelley, General Topology. Van Nostrand, New York: 1955], p. 65.

6 See Feller [W. Feller, An Introduction to Probability Theory and its Applications. Wiley, New York: 1950], or Doob [J. L. Doob, Stochastic Processes. Wiley, New York: 1953].

7 A Markov process is a stochastic process whose future development depends only upon its present state and not on its past history.
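
A minimal sketch of this definition, assuming a two-state process with a made-up transition matrix `T` (the numbers and the helper name `step` are illustrative only): the distribution at the next step is computed from the present distribution alone, with no reference to earlier ones.

```python
def step(dist, T):
    """Advance a probability distribution one step.
    T[i][j] is the probability of moving from state i to state j,
    so the result depends only on the present distribution."""
    n = len(T)
    return [sum(dist[i] * T[i][j] for i in range(n)) for j in range(n)]

# Hypothetical transition probabilities; each row sums to 1.
T = [[0.9, 0.1],
     [0.5, 0.5]]

dist = [1.0, 0.0]   # start with certainty in state 0
history = [dist]
for _ in range(2):
    dist = step(dist, T)
    history.append(dist)
```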

8 See Khinchin [A. I. Khinchin, Mathematical Foundations of Statistical Mechanics. (Translated by George Gamow) Dover, New York: 1949], p. 15.

1 More rigorously, one considers only finite sums, then completes the resulting space to arrive at $H_1 \otimes H_2$.

2 In case Image = 0 (unnormalizable) then choose any function for the relative function. This ambiguity has no consequences of any importance to us. See in this connection the remarks on p. 165.

3 Except if Image = 0. There is still, of course, no dependence upon the basis.

4 Also called a statistical operator (von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955]).

5 A better, coordinate free representation of a mixture is in terms of the operator which the density matrix represents. For a mixture of states $\psi_n$ (not necessarily orthogonal) with weights $p_n$, the density operator is $\rho = \sum_n p_n [\psi_n]$, where $[\psi_n]$ stands for the projection operator on $\psi_n$.
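
A small worked example may help here. It assumes a two-dimensional space with orthonormal basis vectors written as columns, and a mixture of two non-orthogonal states $\psi_1 = (1, 0)$ and $\psi_2 = (1, 1)/\sqrt{2}$ with equal weights $p_1 = p_2 = \tfrac{1}{2}$; these particular states and weights are chosen only for illustration. With the density operator defined as $\rho = \sum_n p_n [\psi_n]$:

```latex
\rho \;=\; \tfrac{1}{2}\,[\psi_1] + \tfrac{1}{2}\,[\psi_2]
\;=\; \tfrac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
 + \tfrac{1}{2}\cdot\tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
\;=\; \begin{pmatrix} \tfrac{3}{4} & \tfrac{1}{4} \\[2pt] \tfrac{1}{4} & \tfrac{1}{4} \end{pmatrix},
\qquad
\operatorname{Trace}\rho = 1 .
```

The eigenvalues of this $\rho$ are $(2 \pm \sqrt{2})/4$, not $\tfrac{1}{2}$ and $\tfrac{1}{2}$, so the same operator also represents a mixture of two orthogonal states with unequal weights. The operator, rather than any particular list of states and weights, is what characterizes the mixture.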

6 The density matrix of a subsystem always has a pure discrete spectrum, if the composite system is in a state. To see this we note that the choice of any orthonormal basis in S2 leads to a discrete (i.e., denumerable) set of relative states in S1. The density matrix in S1 then represents this discrete mixture, Image. This means that the expectation of the identity,

7 See von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955], p. 296.

8 Cf. Chapter II, §7.

9 The relations $\{C, \tilde{B}\} \le \{\tilde{A}, \tilde{B}\} = \{S_1, S_2\}$ and $\{A, D\} \le \{S_1, S_2\}$ for all C on $S_1$, D on $S_2$, can be proved easily in a manner analogous to (2.27). These do not, however, necessarily imply the general relation (2.29).

10 If $U_t$ is the unitary operator generating the time dependence for the state function of the composite system $S = S_1 + S_2$, so that $\psi_t^S = U_t \psi_0^S$, then we shall say that $S_1$ and $S_2$ have not interacted during the time interval [0, t] if and only if $U_t$ is the direct product of two subsystem unitary operators, i.e., if $U_t = U_t^{S_1} \otimes U_t^{S_2}$.

11 Here H means the total Hamiltonian of S, not just an interaction part.

12 Actually, rather than referring to canonical operators $\tilde{A}$, $\tilde{B}$, which are not unique, we should refer to the bases of the canonical representation, $\{\xi_i\}$ in $S_1$ and $\{\eta_j\}$ in $S_2$, since any operators $\tilde{A} = \sum_i \lambda_i [\xi_i]$, $\tilde{B} = \sum_j \mu_j [\eta_j]$, with the completely arbitrary eigenvalues $\lambda_i$, $\mu_j$, are canonical. The limit then refers to the limit of the canonical bases, if it exists in some appropriate sense. However, we shall, for convenience, continue to represent the canonical bases by operators.

13 The maximum of {A, B} is $-I_A$ if A has only a discrete spectrum, and ∞ if it has a continuous spectrum.

14 von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955], p. 442.

15 See discussion of relative states, p. 99.

1 At this point we encounter a language difficulty. Whereas before the observation we had a single observer state, afterwards there were a number of different states for the observer, all occurring in a superposition. Each of these separate states is a state for an observer, so that we can speak of the different observers described by the different states. On the other hand, the same physical system is involved, and from this viewpoint it is the same observer, which is in different states for different elements of the superposition (i.e., has had different experiences in the separate elements of the superposition). In this situation we shall use the singular when we wish to emphasize that a single physical system is involved, and the plural when we wish to emphasize the different experiences for the separate elements of the superposition. (e.g., “The observer performs an observation of the quantity A, after which each of the observers of the resulting superposition has perceived an eigenvalue.”)bm

2 See Khinchin [A. I. Khinchin, Mathematical Foundations of Statistical Mechanics. (Translated by George Gamow) Dover, New York: 1949].

3 Cf. Chapter II, §6.

4 We assume that such transfers merely duplicate, but do not destroy, the original information.

5 Einstein [A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777, 1935.].

1 They are, of course, vacuously correct otherwise.

2 For any $\epsilon$ one can construct a complete orthonormal set of (one particle) states $\phi_{\mu,\nu}$, where the double index $\mu, \nu$ refers to the approximate position and momentum, and for which the expected position and momentum values run independently through sets of approximately uniform density, such that the position and momentum uncertainties, $\sigma_x$ and $\sigma_p$, satisfy $\sigma_x \le C\epsilon$ and $\sigma_p \le C(\hbar/2\epsilon)$ for each $\phi_{\mu,\nu}$, where C is a constant ∼ 60. The uncertainty product then satisfies $\sigma_x \sigma_p \le C^2(\hbar/2)$, about 3,600 times the minimum allowable, but still sufficiently low for macroscopic objects. This set can then be used as a basis for our decomposition into states where every body has a roughly defined position and momentum. For a more complete discussion of this set see von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955.], pp. 406–407.

3 Cf. Chapter III, §1.

4 See Chapter III, §2, particularly footnote 6, p. 106.

5 Since $\sum_i T_{ij} = \sum_i |(\eta_i, \phi_j)|^2 = \sum_i (\phi_j, [\eta_i]\phi_j) = (\phi_j, \sum_i [\eta_i]\,\phi_j) = (\phi_j, I\phi_j) = 1$, and similarly $\sum_j T_{ij} = 1$ because $T_{ij}$ is symmetric.
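
The sum rule in this footnote can be checked numerically. The sketch below is an illustration under assumptions: it takes $T_{ij} = |(\eta_i, \phi_j)|^2$ for two orthonormal bases (the symbol $\phi_j$ for the second basis is assumed), uses a two-dimensional real example with an arbitrary rotation angle, and the helper names `inner` and `transition_matrix` are hypothetical.

```python
import math

def inner(u, v):
    """Inner product (u, v), conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def transition_matrix(eta, phi):
    """T[i][j] = |(eta_i, phi_j)|^2 for two orthonormal bases eta and phi."""
    return [[abs(inner(e, p)) ** 2 for p in phi] for e in eta]

theta = 0.3  # arbitrary rotation angle, chosen only for illustration
eta = [(1.0, 0.0), (0.0, 1.0)]
phi = [(math.cos(theta), math.sin(theta)),
       (-math.sin(theta), math.cos(theta))]

T = transition_matrix(eta, phi)
row_sums = [sum(row) for row in T]                              # sum over j
col_sums = [sum(T[i][j] for i in range(2)) for j in range(2)]   # sum over i
```

Both `row_sums` and `col_sums` come out equal to 1 up to rounding, which is the doubly stochastic property the footnote derives from completeness of each basis.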

6 For another, more complete, discussion of this topic in the probabilistic interpretation see von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955.], Chapter V, §4.

7 See any textbook on statistical mechanics, such as ter Haar [D. ter Haar, Elements of Statistical Mechanics. Rinehart, New York, 1954.], Appendix I.

8 Cf. the discussion of Chapter II, §7. See also von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955.], Chapter V, §4.

9 See Bohm [D. Bohm, Quantum Theory. Prentice-Hall, New York: 1951.], p. 202.

10 Cf. von Neumann [J. von Neumann, Mathematical Foundations of Quantum Mechanics. (Translated by R. T. Beyer) Princeton University Press: 1955.], Chapter IV, §4.

11 Cf. §2, this chapter.

12 Bohm [D. Bohm, Quantum Theory. Prentice-Hall, New York: 1951.], p. 593.

13 This time is, strictly speaking, not well defined. The results, however, do not depend critically upon it.

14 As pointed out by Bohm [D. Bohm, Quantum Theory. Prentice-Hall, New York: 1951.], p. 604.

15 See Chapter III, §1.

1 Such as that of Einstein, Rosen, and Podolsky [A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777, 1935.], as well as the paradox of the introduction.

2 Cf. Appendix II.

3 Einstein [A. Einstein, in Albert Einstein, Philosopher-Scientist. The Library of Living Philosophers, Inc., Vol. 7, p. 665. Evanston: 1949.].

4 Bohm [D. Bohm, Phys. Rev. 84, 166, 1952 and 85, 180, 1952.].

5 Wiener and Siegel [N. Wiener, A. Siegel, Nuovo Cimento Suppl. 2, 982 (1955).].

6 For an example of this type of theory see Einstein and Rosen [A. Einstein, N. Rosen, Phys. Rev. 48, 73, 1935.].

7 A non-denumerable infinity, in fact, since the set I is uncountable!

8 Bopp [F. Bopp, Z. Naturforsch. 2a(4), 202, 1947; 7a, 82, 1952; 8a, 6, 1953.].

9 Schrödinger [E. Schrödinger, Brit. J. Phil. Sci. 3, 109, 233, 1952.].

10 Heisenberg [W. Heisenberg, in Niels Bohr and the Development of Physics. McGraw-Hill, p. 12, New York: 1955.].

11 Einstein [A. Einstein, in Albert Einstein, Philosopher-Scientist. The Library of Living Philosophers, Inc., Vol. 7, p. 665. Evanston, Ill.: 1949.].

12 For example, the paradox of Einstein, Rosen, and Podolsky [A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777, 1935.].

13 Address delivered at Palmer Physical Laboratory, Princeton, Spring, 1954.

14 See in this connection Chapter IV, particularly pp. 205, 206.

15 Cf. Chapter V, §2.

16 Cf. Chapter V, §2.

17 Einstein, Rosen, and Podolsky [A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777, 1935.].

18 See Hardy, Littlewood, and Pólya [G. H. Hardy, J. E. Littlewood, G. Pólya, Inequalities. Cambridge University Press: 1952.], p. 70.

19 Cf. Shannon [C. E. Shannon, W. Weaver, The Mathematical Theory of Communication. University of Illinois Press: 1949.], Appendix 7, where a quite similar theorem is proved.

1 By isomorphism we mean a mapping of some elements of the model into elements of the perceived world which has the property that the model is faithful, that is, if in the model a symbol A implies a symbol B, and A corresponds to the happening of an event in the perceived world, then the event corresponding to B must also obtain. The word homomorphism would be technically more correct, since there may not be a one-one correspondence between the model and the external world.