Reality is that which, when you stop believing in it, does not go away.
Philip K. Dick1
In Chapters 1 and 2 we developed the necessary mathematical apparatus and terminology that will be used throughout this book. Chapter 3 has provided some heuristics and gently led us to the threshold of quantum mechanics. It is now time to open the door, introduce the basic concepts and tools of the trade, and continue our journey to quantum computing.2
In Section 4.1 we spend a few words on the motivations behind quantum mechanics. We then introduce quantum states and how they are distinguishable from one another through observations. Section 4.2 describes observable physical quantities within the quantum framework. How observable quantities are measured is the topic of Section 4.3. The dynamics of quantum systems, i.e., their evolution in time, is the focus of Section 4.4. Finally, in Section 4.5, we revisit the tensor product and show how it describes the way in which larger quantum systems are assembled from smaller ones. In the process, we meet the crucial notion of entanglement, a feature of the quantum world that pops up again in the chapters ahead.
Why quantum mechanics? To answer this question, we have to hearken back in time to the dawn of the twentieth century. Classical mechanics still dominated the scene, with its double-pronged approach: particles and waves. Matter was considered to be ultimately composed of microscopic particles, and light was thought of as continuous electromagnetic waves propagating in space.
The dichotomy – particles versus waves – was proven false by several groundbreaking experiments. For instance, diffraction experiments show that a beam of subatomic particles hitting a crystal diffracts following a wave-like pattern, entirely similar to the diffraction pattern of light itself. By the mid-twenties, physicists had started associating waves with all known particles, the so-called matter waves (the first proposal was made by the French physicist Louis de Broglie in 1924 in his doctoral dissertation).
The photoelectric effect (observed by Hertz in 1887) showed that an atom hit by a beam of light may absorb it, causing some electrons to transition to a higher-energy orbital (i.e., farther from the nucleus). Later on, the absorbed energy may be released in the form of emitted light, causing the excited electrons to revert to a lower orbital. What the photoelectric effect revealed was that light–matter transactions always occur through discrete packets of energy, known as photons (the concept was introduced by Einstein in his seminal 1905 paper, as a way to account for the photoelectric effect). Photons act as genuine particles that can be absorbed and emitted, one at a time.
Further experimental evidence from many quarters accumulated over time, strongly suggesting that the old duality particle–wave theory must be replaced by a new theory of the microscopic world in which both matter and light manifest a particle-like and a wave-like behavior. Time was ripe for the conceptual framework of quantum mechanics.
In Chapter 3 we met a toy version of the double-slit experiment; as it turns out, this was an actual experiment, indeed an entire series of related experiments, the first one being carried out with light by the English polymath Thomas Young around 1801. Before we move on, it is worth our while to revisit it briefly, as it contains most of the main ingredients that make up quantum’s magic.
One shines light at a boundary with two slits that are very close to each other. The pattern of the light to the right of the boundary will have certain regions that are dark and certain others that are bright, as depicted in Figure 4.1.
The reason why there are regions on the screen with no light is that light waves are interfering with each other. Light is propagating as a single wave from its source;
the two slits cause this wave to split into two independent ones, which can then interfere with each other when reaching the screen. Some regions are going to be darker, others brighter, depending on whether the two waves are in phase (positive interference) or out of phase (negative interference). What would happen if we closed off one of the slits? In that case, there is no splitting and therefore no interference pattern whatsoever (Figure 4.2).
Two remarks on this seminal experiment are in order:
As we have already pointed out in Chapter 3, the double-slit experiment can be done with just one photon at a time. Rather than spotting brighter or darker regions on the screen, we are now looking at which regions are more or less likely for the single photon to land in. The same pattern can then be viewed as describing the probability for a certain region to get hit by the photon. The natural question then is: if there is a single photon, why would there be any interference pattern? Yet, experiments have shown that such a pattern is there. Our photon is a true chameleon: sometimes it behaves as a particle and sometimes as a wave, depending on how it is observed.
The double-slit experiment is not only about light: one can perform it equally well with electrons, protons, and even atomic nuclei, and they will all exhibit exactly the same interference behavior.3 Once again, this clearly indicates that the rigid distinction between waves and particles as a paradigm of description of the physical world is untenable at the quantum level.
In the rest of this section, we are going to introduce the basic mathematical description of a quantum physical system. We shall restrict ourselves to two simple examples, to illustrate the basic machinery:
a particle confined to a set of discrete positions on a line
a single-particle spin system
Consider a subatomic particle on a line; moreover, let us suppose that it can only be detected at one of the equally spaced points {x0, x1,…, xn−1}, where x1 = x0 + δx, x2 = x1 + δx,…, with δx some fixed increment.
In real life, a particle can of course occupy any of the points of the line, not just a finite subset thereof. However, if we followed this route, the state space of our system would be infinite dimensional, requiring a considerably larger mathematical apparatus than the one covered in the last chapters. Whereas such an apparatus is vital for quantum mechanics, it is not needed for an exposition of quantum computing.4 For our current exposition, we can thus assume that the set {x0, x1,…, xn−1} is composed of a great many points (n large) and that δx is tiny, thereby providing a reasonably good approximation of a continuous system.
We are now going to associate to the current state of the particle an n-dimensional complex column vector [c0, c1,…, cn−1]T.
The particle being at the point xi shall be denoted as |xi⟩, using the Dirac ket notation. (Do not worry about the funny symbol: it will be explained momentarily.) To each of these n basic states, we shall associate a column vector:
|x0⟩ ↦ [1, 0,…, 0]T, |x1⟩ ↦ [0, 1,…, 0]T, …, |xn−1⟩ ↦ [0, 0,…, 1]T.    (4.2)
Observe that these vectors form the canonical basis of ℂⁿ. From the standpoint of classical mechanics, the basic states in Equation (4.2) are all we shall ever need. Not so in quantum mechanics: experimental evidence testifies to the fact that the particle can be in a strange fuzzy blending of these states (think again of the double-slit!). To catch up with Nature, we shall make a bold leap by positing that all vectors in ℂⁿ represent a legitimate physical state of the particle.
What can all this possibly mean?
An arbitrary state, which we shall denote as |ψ⟩, will be a linear combination of |x0⟩, |x1⟩,…, |xn−1⟩, by suitable complex weights, c0, c1,…, cn−1, known as complex amplitudes,5
Thus, every state of our system can be represented by an element of ℂⁿ as
|ψ⟩ = c0|x0⟩ + c1|x1⟩ + ··· + cn−1|xn−1⟩.    (4.3)
We say that the state |ψ⟩ is a superposition of the basic states. |ψ⟩ represents the particle as being simultaneously in all the locations {x0, x1,…, xn−1}, or a blending of all the |xi⟩. There are, however, different possible blendings (much like in the recipe for baking an apple pie you can vary the proportions of the ingredients and obtain different flavors). The complex numbers c0, c1,…, cn−1 tell us precisely which superposition our particle is currently in. The modulus squared of the complex number ci, divided by the norm squared of |ψ⟩, will tell us the probability that, after observing the particle, we will detect it at the point xi:
p(xi) = |ci|² / (|c0|² + |c1|² + ··· + |cn−1|²).    (4.5)
Observe that each p(xi) is a real number with 0 ≤ p(xi) ≤ 1, as any genuine probability should be.
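Here is a minimal numeric sketch of this rule (in Python with NumPy; the language choice and the sample amplitudes are ours, not the book's):

```python
import numpy as np

# An arbitrary (unnormalized) ket for a particle on a 4-point grid.
ket = np.array([1 + 1j, 2 - 1j, 0, 3j])

# p(x_i) = |c_i|^2 divided by the norm squared of the whole ket.
p = np.abs(ket) ** 2 / np.linalg.norm(ket) ** 2

print(p)        # probability of detecting the particle at each point x_i
print(p.sum())  # sanity check: the probabilities add up to 1.0
```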
When |ψ⟩ is observed, we will find it in one of the basic states. We might write it as6
|ψ⟩ ⇝ |xi⟩.
Example 4.1.1 Let us assume that the particle can only be at the four points {x0, x1, x2, x3}. Thus, we are concerned with the state space ℂ⁴. Let us also assume that the state vector is now
We shall calculate the probability that our particle can be found at position x2. The norm of |ψ⟩ is given by
The probability is therefore
Exercise 4.1.1 Let us assume that the particle is confined to {x0, x1,…, x5} and the current state vector is
What is the likelihood of finding the particle at position x3?
Kets can be added: if
|ψ⟩ = c0|x0⟩ + c1|x1⟩ + ··· + cn−1|xn−1⟩
and
|ψ′⟩ = c′0|x0⟩ + c′1|x1⟩ + ··· + c′n−1|xn−1⟩,
then
|ψ⟩ + |ψ′⟩ = (c0 + c′0)|x0⟩ + (c1 + c′1)|x1⟩ + ··· + (cn−1 + c′n−1)|xn−1⟩.
Also, for a complex number c, we can scalar multiply a ket by c:
c|ψ⟩ = c·c0|x0⟩ + c·c1|x1⟩ + ··· + c·cn−1|xn−1⟩.
What happens if we add a ket to itself?
|ψ⟩ + |ψ⟩ = 2|ψ⟩ = 2c0|x0⟩ + 2c1|x1⟩ + ··· + 2cn−1|xn−1⟩.
The sum of the moduli squared is
|2c0|² + |2c1|² + ··· + |2cn−1|² = 4(|c0|² + |c1|² + ··· + |cn−1|²).
For the state 2|ψ⟩, the chance that the particle will be found in position xj is
p(xj) = |2cj|² / (4(|c0|² + ··· + |cn−1|²)) = |cj|² / (|c0|² + ··· + |cn−1|²).
In other words, the ket 2|ψ⟩ describes the same physical system as |ψ⟩. Notice that we could replace 2 with an arbitrary nonzero complex number c and get the same results. Geometrically, the vector |ψ⟩ and all its complex scalar multiples c|ψ⟩, i.e., the entire subspace generated by |ψ⟩, describe the same physical state. The length of |ψ⟩ does not matter as far as physics goes.
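A quick numeric check of this fact (our own illustration, in the same Python/NumPy style): rescaling a ket by any nonzero complex number leaves every probability untouched. Exercise 4.1.2 below asks you to prove the same thing symbolically.

```python
import numpy as np

def probabilities(ket):
    # p(x_i) = |c_i|^2 / (|c_0|^2 + ... + |c_{n-1}|^2)
    return np.abs(ket) ** 2 / np.linalg.norm(ket) ** 2

ket = np.array([1 + 1j, 2 - 1j, 3j])
c = 2 - 5j  # an arbitrary nonzero complex scalar

print(np.allclose(probabilities(ket), probabilities(c * ket)))  # True
```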
Exercise 4.1.2 Let |ψ⟩ be [c0, c1,…, cn−1]T. Check that multiplying |ψ⟩ by any nonzero complex number c will not alter the calculation of probabilities. (Hint: Factor out c in the ratio.)
Example 4.1.2 The vectors
differ by the factor 3 + i (verify it!), and are thus representatives of the same quantum state.
Exercise 4.1.3 Do the vectors [1 + i,2 − i]T and [2 + 2i, 1 − 2i]T represent the same state?
As we can multiply (or divide) a ket by any nonzero complex number and still have a representation of the same physical state, we may as well work with a normalized |ψ⟩, i.e.,
|ψ⟩ / √(|c0|² + |c1|² + ··· + |cn−1|²),
which has length 1.7
Example 4.1.3 The vector [2 − 3i, 1 + 2i]T has length given by
√(|2 − 3i|² + |1 + 2i|²) = √(13 + 5) = √18.
We can normalize it by simply dividing by its length:
(1/√18) [2 − 3i, 1 + 2i]T.
Exercise 4.1.4 Normalize the ket
Exercise 4.1.5 (a) Verify that the two state vectors and are each of length 1 in . (b) Find the vector on the unit ball of representing the superposition (addition) of these two states.
Given a normalized ket |ψ⟩, the denominator of Equation (4.5) is 1, and hence the equation reduces to
p(xi) = |ci|².
We are now done with our first motivating example. Let us move on to the second one. In order to talk about it, we need to introduce a property of subatomic particles called spin. As it turns out, spin will play a major role in our story, because it is the prototypical way to implement quantum bits of information, or qubits, which we shall encounter in Section 5.1.
What is spin? The Stern–Gerlach experiment (first performed in 1922) showed that an electron in the presence of a magnetic field will behave as if it were a charged spinning top: it will act as a small magnet and strive to align itself to the external field. The Stern–Gerlach experiment (as shown in Figure 4.3) consists of shooting a beam of electrons through a nonhomogeneous magnetic field oriented in a certain direction, say, vertically (z direction). As it happens, the field splits the beam into two streams, with opposite spin. Certain electrons will be found spinning one way, and certain others spinning the opposite way.
With respect to a classical spinning top, there are two striking differences:
First, the electron does not appear to have an internal structure: by itself, it is just a charged point. It acts as a spinning top, but it is no top! Spin is therefore a new property of the quantum world, with no classical analog.
Secondly, and quite surprisingly, all our electrons can be found either at the top of the screen or at the bottom, none in between. But we had not prepared the "spinning" electrons in any way before letting them interact with the magnetic field. Classically, one would have expected them to have different magnetic components along the vertical axis, and therefore to be pulled differently by the field. There should be some electrons in the middle of the screen. But there are none. Conclusion: when the spinning particle is measured in a given direction, it can only be found in one of two states: it spins either clockwise or anticlockwise (as shown in Figure 4.4).
For each given direction in space, there are only two basic spin states. For the vertical axis, these states have a name: spin up, written |↑⟩, and spin down, written |↓⟩. The generic state |ψ⟩ will then be a superposition of up and down, or
|ψ⟩ = c0|↑⟩ + c1|↓⟩.
Just like before, c0 is the amplitude of finding the particle in the up state, and similarly for c1.
Example 4.1.4 Consider a particle whose spin is described by the ket
The length of the ket is
Therefore, the probability of detecting the spin of the particle in the up direction is
The probability of detecting the spin of the particle in state down is
Exercise 4.1.6 Let the spinning electron’s current state be . Find the probability that it will be detected in the up state.
Exercise 4.1.7 Normalize the ket given in Equation (4.25).
In Chapter 2, the inner product was introduced as an abstract mathematical idea. This product turned a vector space into a space with a geometry: angles, orthogonality, and distance were added to the canvas. Let us now investigate its physical meaning. The inner product of the state space gives us a tool to compute complex numbers known as transition amplitudes, which in turn enable us to determine how likely it is that the state of the system before a specific measurement (the start state) will change to another (the end state) after the measurement has been carried out. Let |ψ⟩ and |φ⟩
be two normalized states. We can compute the transition amplitude between |ψ⟩ and |φ⟩ by the following recipe: |ψ⟩ will be our start state. The end state will be represented by a row vector whose coordinates are the complex conjugates of the coordinates of |φ⟩.
Such a state is called a bra, and will be denoted ⟨φ|.
To find the transition amplitude, we multiply them as matrices (notice that we put them side by side, forming a bra–ket, or bra(c)ket, i.e., their inner product):
⟨φ|ψ⟩.
We can represent the start state, the end state, and the amplitude of going from the first to the second as the decorated arrow
|ψ⟩ —⟨φ|ψ⟩→ |φ⟩.
This recipe is, of course, none other than the inner product of Section 2.4. What we have done is simply split the product into the bra–ket form. Although this is mathematically equivalent to our previous definition, it is quite handy for doing calculations, and moreover opens up an entirely new vista: it shifts the focus from states to state transitions.8
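In code, the recipe is a one-liner (a sketch of ours): conjugate the end state to obtain the bra, then take the matrix product.

```python
import numpy as np

def transition_amplitude(start, end):
    # <end|start>: np.vdot conjugates its first argument, which is exactly
    # the bra-ket recipe above. Both kets are assumed normalized.
    return np.vdot(end, start)

up = np.array([1, 0])
diag = np.array([1, 1]) / np.sqrt(2)
print(transition_amplitude(diag, up))  # amplitude for |diag> to end up as |up>
```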
Note: The transition amplitude between two states may be zero. In fact, that happens precisely when the two states are orthogonal to one another. This simple fact hints at the physical content of orthogonality: orthogonal states are as far apart as they can possibly be. We can think of them as mutually exclusive alternatives: for instance, an electron can be in an arbitrary superposition of spin up and down, but after we measure it in the z direction, it will always be either up or down, never both up and down. If our electron was already in the up state before the z direction measurement, it will never transition to the down state as a result of the measurement.
Assume that we are given a normalized start state |ψ⟩ and an orthonormal basis {|e0⟩, |e1⟩,…, |en−1⟩}, representing a maximal list of mutually exclusive end states associated with some specific measurement of the system. In other words, we know beforehand that the result of our measurement will necessarily be one or the other of the states in the basis, but never a superposition of any of them. We show in Section 4.3 that for every complete measurement of a quantum system there is an associated orthonormal basis of all its possible outcomes.
We can express |ψ⟩ in the basis as
|ψ⟩ = b0|e0⟩ + b1|e1⟩ + ··· + bn−1|en−1⟩.    (4.33)
We invite you to check that bi = ⟨ei|ψ⟩ and that |b0|2 + |b1|2 + ··· + |bn−1|2 = 1.
It is thus natural to read Equation (4.33) in the following way: each |bi|2 is the probability of ending up in the state |ei⟩ after a measurement has been made.
Exercise 4.1.8 Check that the set {|x0⟩, |x1⟩,…, |xn−1⟩} is an orthonormal basis for the state space of the particle on the line. Similarly, verify that {|↑⟩, |↓⟩} is an orthonormal basis of the one-particle spin system.
From now on, we shall use the row–column and the bra–ket notation introduced earlier interchangeably, as we deem fit.9
Let us work through a couple of examples together.
Example 4.1.5 Let us compute the bra corresponding to a given ket. It is quite easy: we take the complex conjugates of all the entries and list them in a row.
Example 4.1.6 Let us now compute the amplitude of the transition from a start state |ψ⟩ to an end state |φ⟩. We first need to write down the bra corresponding to the end state:
Now we can take their inner product:
Exercise 4.1.9 Calculate the bra corresponding to the ket .
Exercise 4.1.10 Calculate the amplitude of the transition from to
Observe that in the calculation of transition amplitudes via the inner product, the requirement that the representatives be normalized states can easily be removed: simply divide the hermitian product by the product of the lengths of the two vectors (or, equivalently, normalize your states first, and then compute their inner product). Here is an example.
Example 4.1.7 Let us calculate the amplitude of the transition from to . Both vectors have norm .
We can take their inner product first:
and then divide it by the product of their norm:
Equivalently, we can first normalize them, and then take their product:
The result is, of course, the same. We can concisely indicate it as
Let us pause one moment, and see where we are.
We have learned to associate a vector space with a quantum system. The dimension of this space reflects the number of basic states of the system.
States can be superposed, by adding their representing vectors.
A state is left unchanged if its representing vector is multiplied by a nonzero complex scalar.
The state space has a geometry, given by its inner product. This geometry has a physical meaning: it tells us the likelihood for a given state to transition into another one after being measured. States that are orthogonal to one another are mutually exclusive.
Before moving on to the next sections, we invite you to write a simple computer simulation.
Programming Drill 4.1.1 Write a program that simulates the first quantum system described in this section. The user should be able to specify how many points the particle can occupy (warning: keep the max number low, or you will fairly quickly run out of memory). The user will also specify a ket state vector by assigning its amplitudes. The program, when asked the likelihood of finding the particle at a given point, will perform the calculations described in Example 4.1.1. If the user enters two kets, the system will calculate the probability of transitioning from the first ket to the second, after an observation has been made.
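A possible solution sketch (ours; the book leaves the implementation to you) in Python/NumPy:

```python
import numpy as np

class ParticleOnALine:
    """Simulation of Programming Drill 4.1.1: a particle on an n-point grid."""

    def __init__(self, amplitudes):
        self.ket = np.asarray(amplitudes, dtype=complex)

    def probability_at(self, i):
        # Likelihood of detecting the particle at point x_i (Example 4.1.1).
        return abs(self.ket[i]) ** 2 / np.linalg.norm(self.ket) ** 2

    def transition_probability(self, other):
        # Probability of transitioning to the ket `other` after an
        # observation: modulus squared of the inner product of the
        # normalized states.
        a = self.ket / np.linalg.norm(self.ket)
        b = np.asarray(other, dtype=complex)
        b = b / np.linalg.norm(b)
        return abs(np.vdot(b, a)) ** 2

system = ParticleOnALine([1 + 1j, 0, 2, 3j])
print(system.probability_at(2))
print(system.transition_probability([0, 0, 1, 0]))
```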
Physics is, by and large, about observations: physical quantities like mass, momentum, velocity, etc., make sense only insofar as they can be observed in a quantifiable way. We can think of a physical system as specified by a double list: on the one hand, its state space, i.e., the collection of all the states it can possibly be found in (see the previous section), and on the other hand, the set of its observables, i.e., the physical quantities that can be observed in each state of the state space.
Each observable may be thought of as a specific question we pose to the system: if the system is currently in some given state , which values can we possibly observe?
In our quantum dictionary, we need to introduce the mathematical analog of an observable:
Postulate 4.2.1 To each physical observable there corresponds a hermitian operator.
Let us see what this postulate actually entails. First of all, an observable is a linear operator, which means that it maps states to states. If we apply the observable Ω to the state vector |ψ⟩, the resulting state is Ω|ψ⟩.
Example 4.2.1 Let |ψ⟩ be the start state in the two-dimensional spin state space. Now, let
This matrix acts as an operator on ℂ². Therefore, we can apply it to |ψ⟩. The result is the vector Ω|ψ⟩. Observe that |ψ⟩ and Ω|ψ⟩ are not scalar multiples of one another, and thus they do not represent the same state: Ω has modified the state of the system.
Secondly, as we already know from Chapter 2, the eigenvalues of a hermitian operator are all real. The physical meaning of this fact is established by the following:
Postulate 4.2.2 The eigenvalues of a hermitian operator Ω associated with a physical observable are the only possible values the observable can take as a result of measuring it on any given state. Furthermore, the eigenvectors of Ω form a basis for the state space.
As we have said before, observables can be thought of as legitimate questions we can pose to quantum systems. Each question admits a set of answers: the eigenvalues of the observable. We learn in the next section how to compute the likelihood that one specific answer will come up out of the entire set.
Before delving into the subtler properties of observables, let us mention some real-life ones. In the case of the first quantum system of Section 4.1, namely, the particle on the line, the most obvious observable is position. As we have stated already, each observable represents a specific question we pose to the quantum system. Position asks: "Where can the particle be found?" Which hermitian operator corresponds to position? We first describe how it acts on the basic states:
P|xi⟩ = xi|xi⟩.    (4.42)
In plain words, P acts as multiplication by position.
As the basic states form a basis, we can extend Equation (4.42) to arbitrary states:
P(c0|x0⟩ + c1|x1⟩ + ··· + cn−1|xn−1⟩) = c0x0|x0⟩ + c1x1|x1⟩ + ··· + cn−1xn−1|xn−1⟩.
Here is the matrix representation of the operator in the standard basis:
P is simply the diagonal matrix whose diagonal entries are the coordinates xi. Observe that P is trivially hermitian, its eigenvalues are the values xi, and its normalized eigenvectors are precisely the basic state vectors that we met at the beginning of Section 4.1: |x0⟩, |x1⟩,…, |xn−1⟩.
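Exercise 4.2.1 below asks you to verify this by hand; here is a numerical spot-check (ours, with made-up grid coordinates):

```python
import numpy as np

points = np.array([0.0, 0.5, 1.0, 1.5])  # x_0, ..., x_3 with delta x = 0.5
P = np.diag(points)                       # the position operator

# P is hermitian, and each basic state |x_i> is an eigenvector of P
# with eigenvalue x_i:
assert np.allclose(P, P.conj().T)
for i, x in enumerate(points):
    basic = np.zeros(len(points)); basic[i] = 1.0
    assert np.allclose(P @ basic, x * basic)
```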
Exercise 4.2.1 Verify the last statement. [Hint: Do it by brute force (start with a generic vector, multiply it by the position operator, and assume that the result vector is a scalar multiple of the original one. Conclude that it must be one of the basis vectors).]
There is a second natural question one may ask of our particle: What is your velocity? Actually, physicists ask a slightly different question: “What is your momentum?” where momentum is defined classically as velocity times mass. There is a quantum analog of this question, which is represented in our discrete model by the following operator (recall that δx is the increment of distance on the line):
In words, momentum is, up to the constant −i * ħ, the rate of change of the state vector from one point to the next.10
The constant ħ (pronounced h bar) that we have just met is a universal constant in quantum mechanics, known as the reduced Planck constant. Although it plays a fundamental role in modern physics (it is one of the universal constants of nature), for the purpose of the present discussion it can be safely ignored.
As it turns out, position and momentum are the most elementary questions we can ask of the particle: there are of course many more, such as energy, angular momentum, etc., but these two are in a sense the basic building blocks (most observables can be expressed in terms of position and momentum). We shall meet position and momentum again at the end of the next section.
Our second example of observables comes from the spin system. The typical question we might pose to such a system is: given a specific direction in space, in which way is the particle spinning? We can, for instance, ask: is the particle spinning up or down in the z direction? Left or right in the x direction? In or out in the y direction? The three spin operators corresponding to these questions are
Sz = (ħ/2) [1 0; 0 −1],  Sx = (ħ/2) [0 1; 1 0],  Sy = (ħ/2) [0 −i; i 0].
Each of the three spin operators comes equipped with its own orthonormal eigenbasis. We have already met up and down, the eigenbasis of Sz. Sx has eigenbasis {(|↑⟩ + |↓⟩)/√2, (|↑⟩ − |↓⟩)/√2}, or left and right, and Sy has {(|↑⟩ + i|↓⟩)/√2, (|↑⟩ − i|↓⟩)/√2}, or in and out.
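These claims are easy to check numerically (a sketch of ours; the variable names, and which eigenvector we call "left" versus "right", are our own conventions):

```python
import numpy as np

# The three spin operators, with the overall factor hbar/2 set to 1.
Sx = np.array([[0, 1], [1, 0]], dtype=complex)
Sy = np.array([[0, -1j], [1j, 0]])
Sz = np.array([[1, 0], [0, -1]], dtype=complex)

up, down = np.array([1, 0]), np.array([0, 1])    # eigenbasis of Sz
right = (up + down) / np.sqrt(2)                 # eigenbasis of Sx ...
left = (up - down) / np.sqrt(2)
inward = (up + 1j * down) / np.sqrt(2)           # ... and of Sy
outward = (up - 1j * down) / np.sqrt(2)

assert np.allclose(Sx @ right, right) and np.allclose(Sx @ left, -left)
assert np.allclose(Sy @ inward, inward) and np.allclose(Sy @ outward, -outward)
```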
Exercise 4.2.2 Consider a particle in initial spin up. Apply Sx to it and determine the probability that the resulting state is still spin up.
Reader Tip. The remainder of this section, although quite relevant for general quantum theory, is tangential to quantum computation, and can thus be safely skipped in a first reading (just take a look at the summary at the end of this section and proceed to Section 4.3).
We are going to make some calculations with the operators described before in a little while; first, though, we need a few additional facts on observables and their associated hermitian matrices.
Up to this point, the collection of physical observables on a given quantum system is just a set. However, even an informal acquaintance with elementary physics teaches us that observable quantities can be added, multiplied, or multiplied by a scalar to form other meaningful physical quantities: think of momentum as mass times velocity, work as force times displacement, or the total energy of a particle in motion as the sum of its kinetic and potential energies. We are thus naturally concerned with the following issue: to what extent can we manipulate quantum observables to obtain yet other observables?
Let us start our investigation with the first operation, namely, multiplying an observable by a scalar. There is no problem carrying this out: if we multiply a hermitian matrix by a real scalar (i.e., we multiply all its entries by it), the result is still hermitian.
Exercise 4.2.3 Verify the last statement.
Exercise 4.2.4 What about complex scalars? Try to find a hermitian matrix and a complex number such that their product fails to be hermitian.
Let us make the next move. What about the addition of two hermitian matrices? Suppose we are looking at two physical observables, represented respectively by the hermitian matrices Ω1 and Ω2. Again, no problem: the observable corresponding to their sum is represented by the operator Ω1 + Ω2, which happens to be hermitian.
Exercise 4.2.5 Check that the sum of two arbitrary hermitian matrices is hermitian.
From these two facts it ensues that the set of hermitian matrices of fixed dimension forms a real (but not a complex) vector space.
How about products? It is quite tempting to conclude that the product of two physical quantities, represented respectively by the hermitians Ω1 and Ω2, is an observable whose representative is the product (i.e., matrix composition) of Ω1 and Ω2. There are two substantial difficulties here. First, the order in which operators are applied to state vectors matters. Why? Well, simply because matrix multiplication, unlike multiplication of ordinary numbers or functions, is not, in general, a commutative operation.
Example 4.2.2 Let
Their product Ω2*Ω1 is equal to
whereas Ω1* Ω2 is equal to
Exercise 4.2.6 Let and . Verify that both are hermitian. Do they commute with respect to multiplication?
The second difficulty is just as serious: in general, the product of hermitian operators is not guaranteed to be hermitian. Let us now investigate in a more rigorous way what it takes for the product of two hermitian operators to be hermitian. Notice that we have
⟨Ω1 * Ω2 ψ, φ⟩ = ⟨Ω2 ψ, Ω1 φ⟩ = ⟨ψ, Ω2 * Ω1 φ⟩,
where the first equality comes from the fact that Ω1 is hermitian and the second equality comes from the fact that Ω2 is hermitian. For Ω1 * Ω2 to be hermitian, we would need that
⟨Ω1 * Ω2 ψ, φ⟩ = ⟨ψ, Ω1 * Ω2 φ⟩.
This in turn implies
Ω1 * Ω2 = Ω2 * Ω1,
or equivalently, that the operator
[Ω1, Ω2] = Ω1 * Ω2 − Ω2 * Ω1
must be the zero operator (i.e., the operator that sends every vector to the zero vector).
The operator [Ω1, Ω2] is so important that it deserves its own name; it is called the commutator of Ω1 and Ω2. We have just learned that if the commutator is zero then the product (in whichever order) is hermitian. We are going to meet the commutator again in a little while. Meanwhile, let us familiarize ourselves with the commutator through a simple and very important example.
Example 4.2.3 Let us calculate the commutators of the three spin matrices (we shall deliberately ignore the constant factor ħ/2). Carrying out the matrix products in both orders and subtracting, one finds, a bit more concisely,
[Sx, Sy] = 2i Sz,  [Sy, Sz] = 2i Sx,  [Sz, Sx] = 2i Sy.
As we have just seen, none of the commutators are zero. The spin operators do not commute with each other.
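Numerically (continuing the sketch above with the same Pauli-form matrices):

```python
import numpy as np

def commutator(A, B):
    return A @ B - B @ A

Sx = np.array([[0, 1], [1, 0]], dtype=complex)
Sy = np.array([[0, -1j], [1j, 0]])
Sz = np.array([[1, 0], [0, -1]], dtype=complex)

print(np.allclose(commutator(Sx, Sy), 2j * Sz))           # True: [Sx, Sy] = 2i Sz
print(np.allclose(commutator(Sx, Sy), np.zeros((2, 2))))  # False: no commuting
```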
Now it is your turn.
Exercise 4.2.7 Explicitly calculate the commutator of the operators of Example 4.2.2.
Note: A moment's thought shows that a hermitian operator always commutes with itself, and likewise with all its powers. Therefore, given a single hermitian Ω, we automatically get the entire algebra of polynomials in Ω, i.e., all operators of the form
a0 I + a1 Ω + a2 Ω² + ··· + ak Ωᵏ.
All such operators commute with one another.
Exercise 4.2.8 Show that the commutator of two hermitian matrices is a hermitian matrix.
If the commutator of two hermitian operators is zero, or equivalently, if the two operators commute, there is no difficulty in assigning their product (in whatever order) as the mathematical equivalent of the physical product of their associated observables. But what about the other cases, when the two operators do not commute? Heisenberg's uncertainty principle, which we are going to meet at the end of this section, will provide an answer.
There is yet another aspect of the association between observables and hermitian operators that can provide substantial physical insight: we know from Chapter 2 that hermitian operators are precisely those operators that behave well with respect to the inner product, i.e.,
⟨Ωψ, φ⟩ = ⟨ψ, Ωφ⟩
for each pair |ψ⟩, |φ⟩.
From this fact it immediately follows that ⟨Ωψ, ψ⟩ is a real number for each |ψ⟩, which we shall denote ⟨Ω⟩ψ (the subscript points to the fact that this quantity depends on the state vector). We can attach a physical meaning to the number ⟨Ω⟩ψ.
Postulate 4.2.3 ⟨Ω⟩ψ is the expected value of observing Ω repeatedly on the same state |ψ⟩.
This postulate states the following: suppose that
λ0, λ1,…
is the list of eigenvalues of Ω. Let us prepare our quantum system so that it is in state |ψ⟩ and let us observe the value of Ω. We are going to obtain one or another of the aforementioned eigenvalues. Now, let us start all over again many times, say, n times, and let us keep track of what was observed each time. At the end of our experiment, the eigenvalue λi has been seen pi times, where 0 ≤ pi ≤ n (in statistical jargon, its frequency is pi/n). Now perform the calculation
(λ0 p0 + λ1 p1 + ···) / n.
If n is sufficiently large, this number (known in statistics as the estimated expected value of Ω) will be very close to ⟨Ω⟩ψ.
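Both the inner-product formula and the frequency interpretation fit in a few lines (our sketch; the sample observable and state are placeholders of our choosing):

```python
import numpy as np

def expected_value(Omega, ket):
    # <Omega psi, psi> for a normalized ket; real when Omega is hermitian.
    return np.vdot(ket, Omega @ ket).real

Omega = np.array([[1, -1j], [1j, 2]])   # a hermitian observable
ket = np.array([1, 1j]) / np.sqrt(2)    # a normalized state

# Frequency interpretation: sample eigenvalues with their collapse
# probabilities and compare the empirical mean with <Omega>_psi.
rng = np.random.default_rng(0)
evals, evecs = np.linalg.eigh(Omega)
probs = np.abs(evecs.conj().T @ ket) ** 2
samples = rng.choice(evals, size=100_000, p=probs)
print(samples.mean(), expected_value(Omega, ket))  # the two numbers agree
```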
Example 4.2.4 Let us calculate the expected value of the position operator on an arbitrary normalized state vector: let
|ψ⟩ = c0|x0⟩ + c1|x1⟩ + ··· + cn−1|xn−1⟩
be our state vector and P the position operator. Then
⟨P⟩ψ = ⟨Pψ, ψ⟩ = x0|c0|² + x1|c1|² + ··· + xn−1|cn−1|²,
where
each |ci|² is the probability of detecting the particle at position xi.
In particular, if |ψ⟩ happens to be just |xi⟩, we simply get xi (verify it!). In other words, the expected value of position on any of its eigenvectors is the corresponding position xi on the line.
Example 4.2.5 Let and .
Let us calculate ⟨Ω⟩ψ:
The bra associated with |ψ⟩ is ⟨ψ|. The scalar product ⟨Ωψ, ψ⟩, i.e., the average value of Ω on |ψ⟩, is thus equal to
Exercise 4.2.9 Repeat the steps of the previous example where
and
We now know that the result of observing Ω repeatedly on a given state will be a certain frequency distribution on the set of its eigenvalues. In plain words, sooner or later we will encounter all its eigenvalues, some more frequently and some less. In the next section we compute the probability that a given eigenvalue of Ω will actually be observed on a given state. For now, we may be interested in knowing the spread of the distribution around its expected value, i.e., the variance of the distribution. A small variance will tell us that most of the eigenvalues are very close to the mean, whereas a large variance means just the opposite. We can define the variance in our framework in a few stages. First, we introduce the hermitian operator
Δψ(Ω) = Ω − ⟨Ω⟩ψ I
(I is the identity operator). The operator Δψ(Ω) acts on a generic vector |φ⟩ in the following fashion:
Δψ(Ω) |φ⟩ = Ω|φ⟩ − ⟨Ω⟩ψ |φ⟩.
So Δψ(Ω) just subtracts the mean from the result of Ω. What, then, is the mean of Δψ(Ω) itself on the normalized state |ψ⟩? A simple calculation shows that it is precisely zero: Δψ(Ω) is the demeaned version of Ω.
Exercise 4.2.10 Verify the last statement.
We can now define the variance of Ω at |ψ⟩ as the expectation value of Δψ(Ω) squared (i.e., the operator Δψ(Ω) composed with itself):
Varψ(Ω) = ⟨Δψ(Ω) * Δψ(Ω)⟩ψ.
Admittedly, the definition looks at first sight rather obscure, although it is not so bad if we remember the usual definition of the variance of a random variable X as
Var(X) = E((X − E(X))²),
where E is the expected value function. The best course is to turn to a simple example to get a concrete feel for it.
Example 4.2.6 Let Ω be a 2-by-2 diagonal matrix with real entries:
Ω = [λ1 0; 0 λ2],
and let
|ψ⟩ = [c1, c2]T.
Let us denote by μ (pronounced "mu") the mean of Ω on |ψ⟩:
μ = ⟨Ω⟩ψ = λ1|c1|² + λ2|c2|².
Now we calculate Δψ(Ω) * Δψ(Ω):
Δψ(Ω) * Δψ(Ω) = [(λ1 − μ)² 0; 0 (λ2 − μ)²].
Finally, we can compute the variance:
Varψ(Ω) = (λ1 − μ)² |c1|² + (λ2 − μ)² |c2|².
We are now able to see that if both λ1 and λ2 are very close to μ, the variance will be close to zero. Conversely, if either of the two eigenvalues is far from μ (it is immaterial whether above or below it, because we are taking squares), the variance will be a large real number. Conclusion: the variance does indeed inform us about the spread of the eigenvalues around their mean.
Our reader may still be a bit unsatisfied after this example: after all, what it shows is that the definition of variance given above works as it should in the case of diagonal matrices. Actually, it is a known fact that all hermitian matrices can be diagonalized by switching to a basis of eigenvectors, so the example is comprehensive enough to legitimize our definition.
Example 4.2.7 Let us calculate the variance of the operator described in Example 4.2.5:
We now compute Δψ(Ω) * Δψ(Ω):
Hence the variance is
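The definition translates directly into code (our sketch, reusing the placeholder observable and state from the earlier snippet):

```python
import numpy as np

def expected_value(Omega, ket):
    return np.vdot(ket, Omega @ ket).real

def variance(Omega, ket):
    # Var_psi(Omega) = <Delta * Delta>_psi, Delta = Omega - <Omega>_psi I.
    Delta = Omega - expected_value(Omega, ket) * np.eye(Omega.shape[0])
    return expected_value(Delta @ Delta, ket)

Omega = np.array([[1, -1j], [1j, 2]])
ket = np.array([1, 1j]) / np.sqrt(2)
print(expected_value(Omega, ket), variance(Omega, ket))
```

Together with a hermiticity check such as np.allclose(Omega, Omega.conj().T), these two functions are the heart of Programming Drill 4.2.1 below.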
Exercise 4.2.11 Calculate the variance of the position operator. Show that the variance of position on any of its eigenvectors is zero.
Exercise 4.2.12 Calculate the variance of Sz on a generic spin state. Show that the variance of Sz reaches a maximum on the state .
Note: The variance of the same hermitian varies from state to state: in particular, on an eigenvector of the operator the variance is zero, and the expected value is just the corresponding eigenvalue. We can say that an observable is sharp on its eigenvectors (there is no ambiguity in the outcome).
Exercise 4.2.13 Prove the preceding statement. (Hint: Work out some examples first.)
We have built all the machinery needed to introduce a fundamental theorem of quantum mechanics, known as Heisenberg's uncertainty principle. Let us begin with two observables, represented by the two hermitians Ω1 and Ω2, and a given state, say, |ψ⟩. We can compute the variances of Ω1 and Ω2 on |ψ⟩, obtaining Varψ(Ω1) and Varψ(Ω2). Do these two quantities relate in any way, and if so, how?
Let us see what the question actually means. We have two observables, and our hope would be to simultaneously minimize their variances, thereby getting a sharp outcome for both. If there were no correlation in the variances, we could expect a very sharp measure of both observables on some convenient state (such as a common eigenvector, if any such existed). Alas, this is not the case, as shown by the following.
Theorem 4.2.1 (Heisenberg's Uncertainty Principle). The product of the variances of two arbitrary hermitian operators on a given state is always greater than or equal to one-fourth the square of the expected value of their commutator. In formulas:
Varψ(Ω1) × Varψ(Ω2) ≥ (1/4) |⟨[Ω1, Ω2]⟩ψ|².
As promised, we have found our commutator once again. Heisenberg’s principle tells us that the commutator measures how good a simultaneous measure of two observables can possibly be. In particular, if the commutator happens to be zero (or equivalently, if the observables commute), there is no limit (at least in principle) to our accuracy. In quantum mechanics, however, there are plenty of operators that do not commute: in fact, we have seen that the directional spin operators provide one such example.
Exercise 4.2.14 Use the calculation of the commutator in Example 4.2.3 and Heisenberg’s principle to give an estimate of how accurate a simultaneous observation of spin in the z and x directions can be.
Another typical example, related to our first quantum system, is given by the pair position–momentum, which we have also met in the last section. So far, the state |ψ⟩ of the particle on the line has been described in terms of its position eigenbasis, i.e., the collection {|x0⟩, |x1⟩,…, |xn−1⟩}. But |ψ⟩ can be written in many other orthonormal bases, corresponding to different observables. One of these is the momentum eigenbasis. This basis comes up when we think of |ψ⟩ as a wave (a bit like a wave hovering over the line). We can thus decompose it into its basic frequencies, just as we can resolve a sound into its basic pure tones. These pure tones are precisely the elements of the momentum eigenbasis.
The image of |ψ⟩ in the position basis is as different as it can possibly be from the one associated with the momentum eigenbasis. The position eigenbasis is made of "peaks," i.e., vectors that are zero everywhere except at one point (Dirac's deltas, in math jargon). Therefore, |ψ⟩ is decomposed into a weighted sum of peaks. The momentum eigenbasis, on the other hand, is made of sinusoids, whose position is totally undetermined.
The commutator of the position–momentum pair captures well this inherent dissimilarity: it is not zero, and therefore our hope to keep the comforting traditional picture of a particle as a tiny billiard ball moving around in space is dashed. If we can pin down the particle position at a given point in time (i.e., if the variance of its position operator is very small), we are at a loss as to its momentum (i.e., the variance of its momentum operator is very big), and vice versa.
Let us sum up:
Observables are represented by hermitian operators. The result of an observation is always an eigenvalue of the hermitian.
The expression ⟨Ω⟩ψ = ⟨Ωψ, ψ⟩ represents the expected value of observing Ω on |ψ⟩.
Observables in general do not commute. This means that the order of observation matters. Moreover, if the commutator of two observables is not zero, there is an intrinsic limit to our capability of measuring their values simultaneously.
Programming Drill 4.2.1 Continue your simulation of a quantum system by adding observables to the picture: the user will input a square matrix of the appropriate size, and a ket vector. The program will verify that the matrix is hermitian, and if so, it will calculate the mean value and the variance of the observable on the given state.
The act of carrying out an observation on a given physical system is called measuring. Just as a single observable represents a specific question posed to the system, measuring is the process consisting of asking a specific question and receiving a definite answer.
In classical physics, we implicitly assumed that
the act of measuring would leave the system in whatever state it was already in, at least in principle; and
the result of a measurement on a well-defined state is predictable, i.e., if we know the state with absolute certainty, we can anticipate the value of the observable on that state.
Both these assumptions proved wrong, as research at the subatomic scale has repeatedly shown: systems do get perturbed and modified as a result of measuring them. Furthermore, only the probability of observing specific values can be calculated: measurement is an inherently nondeterministic process.
Let us briefly recapitulate what we know: an observable can only assume one of its eigenvalues as the result of an observation. So far, though, nothing tells us how frequently we are going to see a specific eigenvalue, say, λ. Moreover, our framework does not yet tell us what happens to the state vector if λ is actually observed. We need an additional postulate to handle concrete measurements:
Postulate 4.3.1 Let Ω be an observable and |ψ⟩ be a state. If the result of measuring Ω is the eigenvalue λ, the state after measurement will always be an eigenvector corresponding to λ.
Example 4.3.1 Let us go back to Example 4.2.1: it is easy to check that Ω has two eigenvalues, λ1 and λ2, with corresponding normalized eigenvectors |e1⟩ and |e2⟩.
Now, let us suppose that after an observation of Ω on |ψ⟩, the actual value observed is λ1. The system has "collapsed" from |ψ⟩ to |e1⟩.
Exercise 4.3.1 Find all the possible states the system described in Exercise 4.2.2 can transition into after a measurement has been carried out.
What is the probability that a normalized start state |ψ⟩ will transition to a specific eigenvector, say, |e⟩? We must go back to what we said in Section 4.1: the probability of the transition to the eigenvector is given by the modulus squared of the inner product of the two states: |⟨e|ψ⟩|². This expression has a simple meaning: it is the length squared of the projection of |ψ⟩ along |e⟩.
We are ready for a new insight into the real meaning of the quantity ⟨Ω⟩ψ of the last section: first, let us recall that the normalized eigenvectors {|e0⟩, |e1⟩,…, |en−1⟩} of Ω constitute an orthonormal basis of the state space. Therefore, we can express |ψ⟩ as a linear combination in this basis:
|ψ⟩ = c0|e0⟩ + c1|e1⟩ + ··· + cn−1|en−1⟩.
Now, let us compute the mean:
⟨Ω⟩ψ = λ0|c0|² + λ1|c1|² + ··· + λn−1|cn−1|².
(Verify this identity!)
As we can now see, ⟨Ω⟩ψ is precisely the mean value of the probability distribution
(λ0, p0), (λ1, p1),…, (λn−1, pn−1),
where each pi = |ci|² is the modulus squared of the amplitude of the collapse into the corresponding eigenvector.
Example 4.3.2 Let us go back to Example 4.3.1 and calculate the probabilities that our state vector will fall into one of the two eigenvectors:
Now, let us compute the mean value of the distribution:
which is precisely the value we obtained by directly calculating ⟨Ω⟩ψ.
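All of the above can be automated (a sketch of ours; it is also the core of Programming Drill 4.3.1 below): diagonalize the observable, project the state onto each eigenvector, and read off the probabilities and their mean.

```python
import numpy as np

def measurement_distribution(Omega, ket):
    # Possible outcomes are the eigenvalues; the probability of collapsing
    # onto eigenvector |e_i> is |<e_i|psi>|^2.
    ket = ket / np.linalg.norm(ket)
    evals, evecs = np.linalg.eigh(Omega)   # eigendecomposition for hermitians
    probs = np.abs(evecs.conj().T @ ket) ** 2
    return evals, probs

Omega = np.array([[1, -1j], [1j, 2]])
ket = np.array([1, 1j]) / np.sqrt(2)
evals, probs = measurement_distribution(Omega, ket)
print(evals, probs)
print(evals @ probs)   # the mean of the distribution, i.e., <Omega>_psi
```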
Exercise 4.3.2 Perform the same calculations as in the last example, using Exercise 4.3.1. Then draw the probability distribution of the eigenvalues as in the previous example.
Note: As a result of the foregoing discussion, an important fact emerges. Suppose we ask a specific question (i.e., we choose an observable) and perform a measurement once. We get an answer, say, λ, and the system transitions to the corresponding eigenvector. Now, let us ask the same question immediately thereafter. What is going to happen? The system will give exactly the same answer, and stay where it is. All right, you may say. But, what about changing the question? The following example will clarify matters.
Example 4.3.3 Until now we have dealt with measurements relative to only one observable. What if there were more than one observable involved? With each observable there is a different set of eigenvectors the system can possibly collapse to after a measurement has taken place. As it turns out, the answers we get will depend on which order we pose our questions, i.e., which observable we measure first.
There is an intriguing experiment that one can easily perform to see some of these ideas in action (and have some fun in the process). Suppose you shoot a beam of light. Light is a wave, and like all waves it vibrates during its journey (think of sea waves). There are two possibilities: either it vibrates along all possible planes orthogonal to its line of propagation, or only in a specific one. In the second case, we say that light is polarized.11 What kind of questions can we ask concerning polarization? We can fix a specific plane and ask: is the light vibrating along this plane, or orthogonally to it?
For our experiment we need thin plastic semitransparent polarization sheets (they are fairly easy to obtain). Polarization sheets do two things: once you orient them in a specific direction, they measure the polarization of light in the orthogonal basis corresponding to that direction (let us call it the vertical–horizontal basis), and then filter out those photons that collapsed to one of the elements of the basis (Figure 4.5).
What if we had two sheets? If the two sheets were oriented in the same direction, there would be no difference whatsoever (why? Because we are asking the same question; the photon will give the same exact answer once more). However, if we rotated the second sheet by 90°, then no light would pass through both sheets (Figure 4.6).
Placing the sheets orthogonal to each other ensures that the permitted half that passes through the left sheet is filtered out by the right sheet.
What happens if we add a third sheet? Placing a third sheet to the left or to the right of the other two sheets does not have any effect whatsoever: no light was permitted before, and none will be allowed through the additional sheet. However, placing the third sheet in between the other two at an angle, say, 45°, does have a remarkable effect (Figure 4.7).
Light will pass through all three sheets! How can this be? Let us see what is going on here. The left sheet measures all the light relative to the up–down basis. The polarized light in the vertical polarization state that goes through is then a superposition with respect to the diagonal measuring basis of the middle sheet. The middle sheet collapses the permitted half again, filters some, and passes some through. But what is passed through is now in a diagonal polarization state. When this light reaches the right sheet, it is again in a superposition of the vertical–horizontal basis, and so it must collapse once more. Notice that only one-eighth of the original light passes through all three sheets.
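A back-of-the-envelope check (our own, assuming ideal sheets and unpolarized incoming light): the first sheet passes half the light, and each subsequent collapse onto a basis rotated by 45° passes a further cos² 45° = 1/2 of what reaches it. The transmitted fraction is therefore
1/2 × 1/2 × 1/2 = 1/8,
in agreement with the one-eighth quoted above.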
A brief summary is in order:
The end state of the measurement of an observable is always one of its eigenvectors.
The probability for an initial state to collapse into an eigenvector of the observable is given by the length squared of the projection.
When we measure several observables, the order of measurements matters.
We have come a long way. We now have three of the main ingredients to cook up quantum dishes. We need one more: dynamics.
Programming Drill 4.3.1 Next step in the simulation: when the user enters an observable and a state vector, the program will return the list of eigenvalues of the observable, the mean value of the observable on the state, and the probability that the state will transition to each one of the eigenstates. Optional: plot the corresponding probability distribution.
Thus far, we have been concerned with static quantum systems, i.e., systems that do not evolve over time. To be sure, changes could still occur as a result of one or possibly many measurements, but the system itself was not time-dependent. In reality, of course, quantum systems do evolve over time, and we thus need to add a new hue to the canvas: quantum dynamics. Just as hermitian operators represent physical observables, unitary operators introduce dynamics into the quantum arena.
Postulate 4.4.1 The evolution of a quantum system (that is not a measurement) is given by a unitary operator or transformation.
That is, if U is a unitary matrix that represents a unitary operator and |ψ(t)⟩ represents the state of the system at time t, then
|ψ(t + 1)⟩ = U|ψ(t)⟩
will represent the system at time t + 1.
An important feature of unitary transformations is that they are closed under composition and inverse, i.e., the product of two arbitrary unitary matrices is unitary, and the inverse of a unitary transformation is also unitary. Finally, there is a multiplicative identity, namely, the identity operator itself (which is trivially unitary). In math jargon, one says that the set of unitary transformations constitutes a group of transformations with respect to composition.
Exercise 4.4.1 Verify that
are unitary matrices. Multiply them and verify that their product is also unitary.
We are now going to see how dynamics is determined by unitary transformations: assume we have a rule, U, that associates with each instant of time t0, t1,…, tn−1
a unitary matrix U[t0], U[t1],…, U[tn−1].
Let us start with an initial state vector |ψ⟩. We can apply U[t0] to |ψ⟩, then apply U[t1] to the result, and so forth. We will obtain a sequence of state vectors
U[t0]|ψ⟩, U[t1]U[t0]|ψ⟩, …, U[tn−1] ··· U[t1]U[t0]|ψ⟩.
Such a sequence is called the orbit12 of |ψ⟩ under the action of U[t0],…, U[tn−1] at the time clicks t0, t1,…, tn−1.
Observe that one can always go back, just like running a movie backward, simply by applying the inverses of U[t0], U[t1],…, U[tn−1] in reverse order: the evolution of a quantum system is symmetric with respect to time.
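A sketch (ours) of an orbit and of running the movie backward, with two made-up unitaries:

```python
import numpy as np

def orbit(ket, unitaries):
    # The successive states U[t_k] ... U[t_1] U[t_0] |psi>.
    states = [ket]
    for U in unitaries:
        states.append(U @ states[-1])
    return states

def rewind(ket, unitaries):
    # Apply the inverses (conjugate transposes) in reverse order.
    for U in reversed(unitaries):
        ket = U.conj().T @ ket
    return ket

U0 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # two sample unitaries
U1 = np.array([[0, 1], [1, 0]], dtype=complex)

start = np.array([1, 0], dtype=complex)
final = orbit(start, [U0, U1])[-1]
print(np.allclose(rewind(final, [U0, U1]), start))  # True: time-reversible
```

The same loop, fed with user-supplied matrices, is essentially Programming Drill 4.4.1 below.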
We can now preview how a quantum computation will look. A quantum computer shall be placed into an initial state |ψ⟩, and we shall then apply a sequence of unitary operators to the state. When we are done, we will measure the output and get a final state. The next chapters are largely devoted to working out these ideas in detail.
Here is an exercise for you on dynamics:
Exercise 4.4.2 Go back to Example 3.3.2 (quantum billiard ball), keep the same initial state vector [1, 0, 0, 0]T, but change the unitary map to
Determine the state of the system after three time steps. What is the chance of finding the quantum ball at point 3?
The reader may wonder how the sequence of unitary transformations is actually selected in real-life quantum mechanics. In other words, given a concrete quantum system, how is its dynamics determined? How does the system change? The answer lies in an equation known as the Schrödinger equation:13
(|ψ(t + δt)⟩ − |ψ(t)⟩) / δt = −(i/ħ) H |ψ(t)⟩.    (4.96)
A complete discussion of this fundamental equation goes beyond the scope of this introductory chapter. However, without going into technical details, we can at least convey its spirit. Classical mechanics taught physicists that the global energy of an isolated system is preserved throughout its evolution.14 Energy is an observable, and therefore for a concrete quantum system it is possible to write down a hermitian matrix representing it (this expression will of course vary from system to system). This observable is called the hamiltonian of the system, indicated by H in Equation (4.96).
The Schrödinger equation states that the rate of variation of the state vector with respect to time at the instant t is equal, up to the scalar factor −i/ħ, to the hamiltonian H applied to the state vector at that instant. By solving the equation with some initial conditions, one is able to determine the evolution of the system over time.
Time for a small recap:
Quantum dynamics is given by unitary transformations.
Unitary transformations are invertible; thus, all closed system dynamics are reversible in time (as long as no measurement is involved).
The concrete dynamics is given by the Schrödinger equation, which determines the evolution of a quantum system whenever its hamiltonian is specified.
Programming Drill 4.4.1 Add dynamics to your computer simulation of the particle on a grid: the user should input a number of time steps n, and a corresponding sequence of unitary matrices Un of the appropriate size. The program will then compute the state vector after the entire sequence Un has been applied.
The opening section of this chapter described a simple quantum system: a particle moving in a confined one-dimensional grid (the set of points {x0, x1,…, xn−1}). Now, let us suppose that we are dealing with two particles confined to the grid. We shall make the following assumption: the points on the grid that can be occupied by the first particle will be {x0, x1,…, xn−1}. The second particle can be at the points {y0, y1,…, ym−1}.
Can we lift the description we already have to this new setup? Yes. The details will keep us busy in this section.
Our answer will not be confined to the aforementioned system. Instead, it will provide us with a quantum version of a building block game, i.e., a way of assembling more complex quantum systems starting from simpler ones. This procedure lies at the very core of modern quantum physics: it enables physicists to model multiparticle quantum systems.15
We need one last expansion of our quantum dictionary: assembling quantum systems means tensoring the state space of their constituents.
Postulate 4.5.1 Assume we have two independent quantum systems Q and Q′, represented respectively by the vector spaces V and V′. The quantum system obtained by merging Q and Q′ will have the tensor product V ⊗ V′ as its state space.
Notice that the postulate above enables us to assemble as many systems as we like. The tensor product of vector spaces is associative, so we can progressively build larger and larger systems:
V ⊗ V′ ⊗ V″ ⊗ ···.
Let us go back to our example. To begin with, there are n × m possible basic states:
|x0⟩ ⊗ |y0⟩, meaning the first particle is at x0 and the second particle at y0.
|x0⟩ ⊗ |y1⟩, meaning the first particle is at x0 and the second particle at y1.
…
|x0⟩ ⊗ |ym−1⟩, meaning the first particle is at x0 and the second particle at ym−1.
|x1⟩ ⊗ |y0⟩, meaning the first particle is at x1 and the second particle at y0.
…
|xi⟩ ⊗ |yj⟩, meaning the first particle is at xi and the second particle at yj.
…
|xn−1⟩ ⊗ |ym−1⟩, meaning the first particle is at xn−1 and the second particle at ym−1.
Now, let us write the generic state vector as a superposition of the basic states:
|ψ⟩ = c0,0 |x0⟩ ⊗ |y0⟩ + c0,1 |x0⟩ ⊗ |y1⟩ + ··· + ci,j |xi⟩ ⊗ |yj⟩ + ··· + cn−1,m−1 |xn−1⟩ ⊗ |ym−1⟩,
which is a vector in the (n × m)-dimensional complex space ℂⁿ ⊗ ℂᵐ.
The modulus squared |ci,j|² of the quantum amplitude will give us the probability of finding the two particles at positions xi and yj, respectively, as shown by the following example.
Example 4.5.1 Assume n = 2 and m = 2 in the above. We are thus dealing with the state space ℂ² ⊗ ℂ², whose standard basis is
{|x0⟩ ⊗ |y0⟩, |x0⟩ ⊗ |y1⟩, |x1⟩ ⊗ |y0⟩, |x1⟩ ⊗ |y1⟩}.
Now, let us consider the state vector for the two-particle system given by
What is the probability of finding the first particle at location x1 and the second one at y1? We look at the last amplitude in the list given before, and use the same recipe as in the one-particle system:
Exercise 4.5.1 Redo the steps of the last example when n = m = 4 and c0,0 = c0,1 = ··· = c3,3 = 1 + i.
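Numerically, assembling systems is a Kronecker product; here is a sketch of ours for the 2 × 2 grid, with made-up amplitudes:

```python
import numpy as np

x0, x1 = np.array([1, 0]), np.array([0, 1])   # basis for the first particle
y0, y1 = np.array([1, 0]), np.array([0, 1])   # basis for the second particle

# Basic states of the assembled system: |x_i> (tensor) |y_j>.
basis = [np.kron(a, b) for a in (x0, x1) for b in (y0, y1)]

# A generic two-particle state with amplitudes c_{0,0}, c_{0,1}, c_{1,0}, c_{1,1}.
c = np.array([1 + 1j, 2, -1j, 1], dtype=complex)   # made-up amplitudes
psi = sum(ci * e for ci, e in zip(c, basis))

# Probability of finding particle 1 at x1 AND particle 2 at y1:
print(abs(c[3]) ** 2 / np.linalg.norm(psi) ** 2)
```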
The same machinery can be applied to any other quantum system. For instance, it is instructive to generalize our spin example of Section 4.1 to a system where many particles are involved. You can try yourself.
Exercise 4.5.2 Write down the generic state vector for the system of two particles with spin. Generalize it to a system with n particles (this is important: it will be the physical realization for quantum registers!).
Now that we are a bit familiar with quantum assemblage, we are ready for the final puzzling surprise of quantum mechanics: entanglement. Entanglement will force us to abandon one last comforting faith, namely, that assembled complex systems can be understood completely in terms of their constituents.
The basic states of the assembled system are just the tensor product of basic states of its constituents. It would be nice if each generic state vector could be rewritten as the tensor product of two states, one coming from the first quantum subsystem and the other one from the second. It turns out that this is not true, as is easily shown by this example.
Example 4.5.2 Let us work on the simplest nontrivial two-particle system: each particle is allowed only two points. Consider the state
|ψ⟩ = |x0⟩ ⊗ |y0⟩ + |x1⟩ ⊗ |y1⟩.
In order to clarify what is left out, we might write this as
|ψ⟩ = 1 |x0⟩ ⊗ |y0⟩ + 0 |x0⟩ ⊗ |y1⟩ + 0 |x1⟩ ⊗ |y0⟩ + 1 |x1⟩ ⊗ |y1⟩.    (4.104)
Let us see if we can write |ψ⟩ as the tensor product of two states coming from the two subsystems. Any vector representing the first particle on the line can be written as
c0|x0⟩ + c1|x1⟩.
Similarly, any vector representing the second particle on the line can be written as
c′0|y0⟩ + c′1|y1⟩.
Therefore, if |ψ⟩ came from the tensor product of the two subsystems, we would have
(c0|x0⟩ + c1|x1⟩) ⊗ (c′0|y0⟩ + c′1|y1⟩) = c0c′0 |x0⟩ ⊗ |y0⟩ + c0c′1 |x0⟩ ⊗ |y1⟩ + c1c′0 |x1⟩ ⊗ |y0⟩ + c1c′1 |x1⟩ ⊗ |y1⟩.
For our |ψ⟩ in Equation (4.104), this would imply that c0c′0 = c1c′1 = 1 and c0c′1 = c1c′0 = 0. However, these equations have no solution. We conclude that |ψ⟩ cannot be rewritten as a tensor product.
Let us go back to |ψ⟩ and see what it physically means. What would happen if we measured the first particle? A quick calculation shows that the first particle has a 50–50 chance of being found at position x0 or at x1. So, what if it is, in fact, found at position x0? Because the term |x0⟩ ⊗ |y1⟩ has a 0 coefficient, we know that there is no chance that the second particle will be found at position y1. We must then conclude that the second particle can only be found at position y0. Similarly, if the first particle is found at position x1, then the second particle must be at position y1. Notice that the situation is perfectly symmetrical with respect to the two particles, i.e., it would be the same if we measured the second one first. The individual states of the two particles are intimately related to one another, or entangled. The amazing side of this story is that the xi's can be light-years away from the yj's. Regardless of their actual distance in space, a measurement's outcome for one particle will always determine the measurement's outcome for the other one.
The state |ψ⟩ is in sharp contrast to other states like
Here, finding the first particle at a particular position does not provide any clue as to where the second particle will be found (check it!).
States that can be broken into the tensor product of states from the constituent subsystems (like the state just discussed) are called separable states, whereas states that are unbreakable (like our |ψ⟩) are referred to as entangled states.
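For two two-point particles, separability can be tested mechanically: arrange the four amplitudes in a 2 × 2 matrix; the state is a tensor product exactly when that matrix has rank 1, i.e., when c0,0·c1,1 − c0,1·c1,0 = 0. A sketch of ours:

```python
def is_separable(c00, c01, c10, c11, tol=1e-9):
    # Separable iff [[c00, c01], [c10, c11]] has rank 1,
    # i.e., iff its determinant c00*c11 - c01*c10 vanishes.
    return abs(c00 * c11 - c01 * c10) < tol

print(is_separable(1, 0, 0, 1))   # False: the entangled state of Eq. (4.104)
print(is_separable(1, 1, 1, 1))   # True: (|x0> + |x1>) tensor (|y0> + |y1>)
```

Exercise 4.5.3 can be settled the same way.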
Exercise 4.5.3 Assume the same scenario as in Example 4.5.2 and let
Is this state separable?
A clear physical case of entanglement is in order. We must revert to spin. Just as there are laws of conservation of momentum, angular momentum, energy-mass, and other physical properties, so too there is a law of conservation of total spin of a quantum system. This means that in an isolated system the total amount of spin must stay the same. Let us fix a specific direction, say, the vertical one (z axis), and the corresponding spin basis, up and down. Consider the case of a quantum system, such as a composite particle, whose total spin is zero. This particle might split up at some point in time into two other particles that do have spin (Figure 4.8).
The spin states of the two particles will now be entangled. The law of conservation of spin stipulates that because we began with a system of total spin zero, the sum of the spins of the two particles must cancel out. This amounts to the fact that if we measure the spin of the left particle along the z axis and we find it in state ↑L (where the subscript describes which particle we are dealing with), then the spin of the particle on the right must be ↓R. Similarly, if the state of the left particle is ↓L, then the spin of the right particle must be ↑R.
We can describe this within our notation. In terms of vector spaces, the basis that describes the left particle is BL = {↑L, ↓L} and the basis that describes the right particle is BR = {↑R, ↓R}. The basis elements of the entire system are
{↑L ⊗ ↑R, ↑L ⊗ ↓R, ↓L ⊗ ↑R, ↓L ⊗ ↓R}.
In such a vector space, our entangled particles can be described by
↑L ⊗ ↓R + ↓L ⊗ ↑R,
similar to Equation (4.104). As we said before, the combinations ↑L ⊗ ↑R and ↓L ⊗ ↓R cannot occur because of the law of conservation of spin. When one measures the left particle and it collapses to the state ↑L, then instantaneously the right particle will collapse to the state ↓R, even if the right particle is millions of light-years away.
How will entanglement arise in the tale that we are telling? We find in Chapter 6 that it plays a central role in algorithm design. It is also used extensively in Chapter 9 while discussing cryptography (Section 9.4) and teleportation (Section 9.5). Entanglement makes a final appearance in Chapter 11, in connection with decoherence.
What have we learned?
We can use the tensor product to build complex quantum systems out of simpler ones.
The new system cannot be analyzed simply in terms of states belonging to its subsystems. An entire set of new states has been created, which cannot be resolved into their constituents.
Programming Drill 4.5.1 Expand the simulation of the last sections by letting the user choose the number of particles.
References: There are many elementary introductions to quantum mechanics that are very readable. Here is a list of some of them: Chen (2003), Gillespie (1974), Martin (1982), Polkinghorne (2002), and White (1966).
Special mention must be made of the classic introduction by P.A.M. Dirac (1982). Seventy years after its first publication, it remains a classic that is worth reading.
For a more advanced and modern presentation see, e.g., Volume III of Feynman (1963), Hannabuss (1997), Sakurai (1994), or Sudbery (1986).
For a short history of the early development of quantum mechanics, see Gamow (1985).
1 The quotation is taken from Dick’s 1978 lecture How to build a Universe that does not fall apart two days later, freely available on the Web at http://deoxy.org/pkd_how2build.htm.
2 No attempt will be made to present the material in an exhaustive historical manner. The curious reader can refer to the references at the end of this chapter for a plethora of good, comprehensive introductions to quantum mechanics.
3 Such experiments have indeed been performed, only much later than Young’s original version of the double-slit experiment. We invite you to read about this fascinating slice of experimental physics in Rodgers (2002).
4 We mention in passing that in computer simulation one must always turn a continuous physical system (classical or quantum) into a discrete one: computers cannot deal with infinities.
5 This name comes from the fact that |ψ⟩ is indeed a (complex) wave when we study its time evolution, as we shall see at the end of Section 4.3. Waves are characterized by their amplitude (think of the intensity of a sound wave) – hence the name above – as well as by their frequency (in the case of sound waves, their pitch). As it turns out, the frequency of |ψ⟩ plays a key role in the particle's momentum. You can think of Equation (4.3) as describing |ψ⟩ as the overlap of n waves, the |xi⟩, each contributing with intensity ci.
6 The wiggly line is used throughout this chapter to denote the state of a quantum system before and after measurement.
7 In Section 3.3, we limited ourselves to normalized complex vectors. Now you see why!
8 This line of thought has been pursued by some researchers in the ambitious attempt to provide a satisfactory interpretation of quantum mechanics. For instance, Yakir Aharonov and his colleagues have in recent years proposed a model called the two-vector formalism, in which the single-vector description is replaced with the full bra–ket pair. The interested reader can consult Aharonov's recent book Quantum Paradoxes (Aharonov and Rohrlich, 2005).
9 An historical note is in order: the bra–ket notation, which is now ubiquitous in quantum mechanics, was introduced by the great physicist Paul A.M. Dirac around 1930.
10 The calculus-enabled reader will have easily recognized a one-step discrete version of the derivative in the momentum. Indeed, if δx goes to zero, momentum is precisely the derivative of |ψ⟩ with respect to position, times the scalar −i * ħ.
11 Polarization is a familiar phenomenon: fancy sun glasses are made on the basis of light polarization.
12 A small warning: one commonly thinks of an orbit as closed (a typical example is the orbit of the moon around the earth). In dynamics, this is not always the case: an orbit can be open or closed.
13 The version shown here is actually the discretized version of the original equation, which is a differential equation obtained from the above by letting δt become infinitesimal. It is this discretized version (or variants thereof) that is usually employed in computer simulation of quantum systems.
14 For instance, a stone dropped from a height falls in such a way that its kinetic energy plus its potential energy plus the energy dissipated by friction is constant.
15 By thinking of fields such as the electromagnetic field as systems composed of infinitely many particles, this procedure makes field theory amenable to the quantum approach.