Quantum mechanics is based on linear algebra. The general theory uses infinite dimensional vector spaces. Fortunately for us, to describe spin or polarization we need only finite dimensions, which makes things much easier. In fact, we need only a few tools. At the end of this chapter I have given a list. The rest of the chapter explains how to use these tools and what the calculations mean. There are many examples. It is important to work carefully through all of them. The mathematics introduced here is essential to everything that follows. Like much mathematics, it can seem complicated when it is first introduced, but it becomes almost second nature with practice. The actual computations only involve addition and multiplication of numbers, along with an occasional square root and trigonometric function.
We will be using Paul Dirac’s notation. Dirac was one of the founders of quantum mechanics, and his notation is used extensively throughout both quantum mechanics and quantum computing. It is not widely used outside these disciplines, which is surprising given how elegant and useful it is.
But first, we begin with a brief description of the numbers we will be using. These are real numbers—the standard decimal numbers with which we are all familiar. Practically every other book on quantum computation uses complex numbers—these involve the square root of negative one. So, let’s start by explaining why we are not going to be using them.
Real numbers are straightforward to use. Complex numbers are—well, more complicated. To talk about these numbers we would have to talk about their moduli and explain why we have to take conjugates. For what we are going to do, complex numbers are not needed and would only add another layer of difficulty. Why then, you ask, do all the other books use complex numbers? What can you do with complex numbers that you cannot do with real ones? Let’s briefly address these questions.
Recall that we measured the spin of an electron at various angles. These angles are all in one plane, but we live in a three-dimensional world. We compared measuring spin to using our quantum clock. We could only ask about directions given by the hand moving around the two-dimensional face. If we move to three dimensions, our analog would not be a clock face, but a globe with the hand at its center pointing to locations on the surface. We could ask, for example, if the hand is pointing to New York. The answer would be either that it is, or that it is pointing to the point diametrically opposite New York. The mathematical model for spin in three dimensions uses complex numbers. The computations involving qubits that we will look at, however, need to measure spin in only two dimensions. So, though our description using real numbers is not quite as encompassing as that using complex numbers, it is all that we need.
Finally, complex numbers provide an elegant way of connecting trigonometric and exponential functions. At the very end of the book we will look at Shor’s algorithm. This would be hard to explain without using complex numbers. But this algorithm also needs continued fractions, along with results from number theory and results about the speed of an algorithm for determining whether a number is prime. There would be a significant jump in the level of mathematical sophistication and knowledge needed if we were to describe Shor’s algorithm in full detail. Instead we will describe the basic ideas that underlie the algorithm, indicating how these fit together. Once again, our description will use only real numbers.
So, for what we are going to do, complex numbers are not needed. If, however, after reading this book, you want to continue studying quantum computation, they will be needed for more advanced topics.
Now that we have explained why we are going to stay with the real numbers, we begin our study of vectors and matrices.
A vector is just a list of numbers. The dimension of the vector is the number of numbers in the list. If the lists are written vertically, we call them column vectors or kets. If the lists are written horizontally, we call them row vectors or bras. The numbers that make up a vector are often called entries. To illustrate, here is a three-dimensional ket and a four-dimensional bra:

$$\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & -2 & 4 \end{bmatrix}.$$

The names bra and ket come from Paul Dirac. He also introduced notation for naming these two types of vectors: a ket with name $v$ is denoted by $|v\rangle$; a bra with name $w$ is denoted by $\langle w|$. So we might write

$$|v\rangle = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \qquad \langle w| = \begin{bmatrix} 1 & 0 & -2 & 4 \end{bmatrix}.$$

Later we will see why we use two different symbols to surround the name, and how the symbol tells us on which side the angled bracket goes. But, for now, the important thing is to remember that kets refer to columns (think of the repeated “k” sound) and that bras, as usual, have their entries arranged horizontally.
Vectors in two or three dimensions can be pictured as arrows. We will look at an example using $\begin{bmatrix} 3 \\ 1 \end{bmatrix}$. (In what follows we will often use kets for our examples, but if you like you can replace them with bras.) The first entry, 3 in this example, gives the change in the x-coordinate from the initial point to the terminal point. The second entry gives the change in the y-coordinate going from the initial point to the terminal point. We can draw this vector with any initial point: if we choose (a, b) as the coordinates of its initial point, then the coordinates of its terminal point will be (a + 3, b + 1). Notice that if the initial point is drawn at the origin, the terminal point has coordinates given by the entries of the vector. This is convenient, and we will often draw them in this position. Figure 2.1 shows the same ket drawn with different initial points.
Figure 2.1 Same ket drawn in different positions.
The length of a vector is, as might be expected, the distance from its initial point to its terminal point. This is the square root of the sum of squares of the entries. (This comes from the Pythagorean theorem.) We denote the length of a ket $|a\rangle$ by $\||a\rangle\|$, so for

$$|a\rangle = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$$

we have

$$\||a\rangle\| = \sqrt{3^2 + 1^2} = \sqrt{10}.$$
Vectors of length 1 are called unit vectors. Later we will see that qubits are represented by unit vectors.
We can multiply a vector by a number. (In linear algebra, numbers are often called scalars. Scalar multiplication just refers to multiplying by a number.) We do this by multiplying each of the entries by the given number. For example, multiplying the ket $\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ by the number $c$ gives

$$c\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} ca_1 \\ ca_2 \end{bmatrix}.$$
It is straightforward to check that multiplying a vector by a positive number $c$ multiplies its length by a factor of $c$. We can use this fact to get vectors of different lengths pointing in the same direction. In particular, we will often want to have a unit vector pointing in the direction given by a non-unit vector. Given any non-zero vector $|a\rangle$, its length is $\||a\rangle\|$. If we multiply $|a\rangle$ by the reciprocal of its length, we obtain a unit vector. For example, as we have already seen, if

$$|a\rangle = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad \text{then} \quad \||a\rangle\| = \sqrt{10}.$$

If we let

$$|u\rangle = \frac{1}{\sqrt{10}}\begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 3/\sqrt{10} \\ 1/\sqrt{10} \end{bmatrix}, \quad \text{then} \quad \||u\rangle\| = \sqrt{\frac{9}{10} + \frac{1}{10}} = 1.$$

Consequently, $|u\rangle$ is a unit vector that points in the same direction as $|a\rangle$.
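If you want to check such computations on a computer, here is a minimal sketch in Python using the numpy library (our own illustration, not part of the book's toolkit; the names `ket_a` and `unit_a` are ours):

```python
import numpy as np

# The ket |a> = [3, 1]^T, stored as a column vector.
ket_a = np.array([[3.0], [1.0]])

# Length: square root of the sum of the squares of the entries.
length = np.sqrt(np.sum(ket_a ** 2))   # sqrt(10)

# Multiplying by the reciprocal of the length gives a unit vector.
unit_a = ket_a / length

print(length)                           # 3.1622776601683795
print(np.sqrt(np.sum(unit_a ** 2)))     # 1.0
```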
Given two vectors of the same type (both bras or both kets) and the same dimension, we can add them to get a new vector of the same type and dimension. The first entry of this vector just comes from adding the first entries of the two vectors, the second entry from adding the two second entries, and so on. For example, if

$$|a\rangle = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \quad \text{and} \quad |b\rangle = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \text{then} \quad |a\rangle + |b\rangle = \begin{bmatrix} 4 \\ 3 \end{bmatrix}.$$
Vector addition can be pictured by what is often called the parallelogram law for vector addition. If the vector $|b\rangle$ is drawn so that its initial point is at the terminal point of $|a\rangle$, then the vector that goes from the initial point of $|a\rangle$ to the terminal point of $|b\rangle$ is $|a\rangle + |b\rangle$. This can be drawn giving a triangle.

We can interchange the roles of $|a\rangle$ and $|b\rangle$, drawing the initial point of $|a\rangle$ at the terminal point of $|b\rangle$. The vector that goes from the initial point of $|b\rangle$ to the terminal point of $|a\rangle$ is $|b\rangle + |a\rangle$. Again, this gives a triangle. But we know that $|a\rangle + |b\rangle = |b\rangle + |a\rangle$. So if we draw the triangle constructions for $|a\rangle + |b\rangle$ and $|b\rangle + |a\rangle$ where both the vectors have the same initial and terminal points, the two triangles connect to give us a parallelogram with the diagonal representing both $|a\rangle + |b\rangle$ and $|b\rangle + |a\rangle$. Figure 2.2 illustrates this where

$$|a\rangle = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \qquad |b\rangle = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \text{and consequently} \quad |a\rangle + |b\rangle = |b\rangle + |a\rangle = \begin{bmatrix} 4 \\ 3 \end{bmatrix}.$$
Figure 2.2 Parallelogram law for vector addition.
Figure 2.2 helps us visualize some basic properties of vector addition. One of the most important comes from the Pythagorean theorem. We know that if $a$, $b$, and $c$ represent the lengths of the three sides of a triangle, then $a^2 + b^2 = c^2$ if and only if the triangle is a right triangle. The picture then tells us that two vectors $|a\rangle$ and $|b\rangle$ are perpendicular if and only if

$$\||a\rangle\|^2 + \||b\rangle\|^2 = \||a\rangle + |b\rangle\|^2.$$
The word orthogonal means exactly the same thing as perpendicular, and it is the word that is usually used in linear algebra. We can restate our observation: Two vectors $|a\rangle$ and $|b\rangle$ are orthogonal if and only if

$$\||a\rangle\|^2 + \||b\rangle\|^2 = \||a\rangle + |b\rangle\|^2.$$
If we have a bra and a ket of the same dimension, we can multiply them, the bra on the left and the ket on the right, to obtain a number. This is done in the following way, where we suppose that both $\langle a|$ and $|b\rangle$ are $n$-dimensional:

$$\langle a| = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}, \qquad |b\rangle = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}.$$

We use concatenation to denote the product. This just means that we write down the terms side by side with no symbol between them. So the product is written $\langle a||b\rangle$. By squeezing the symbols even closer the vertical lines coincide and we get $\langle a|b\rangle$, which is the notation we will use. The definition of the bra-ket product is

$$\langle a|b\rangle = a_1b_1 + a_2b_2 + \cdots + a_nb_n.$$

The vertical lines of the bras and kets are “pushed together,” which helps us to remember that the bra has the vertical line on the right side and the ket has it on the left. The result consists of terms sandwiched between angle brackets. The names “bra” and “ket” come from “bracket,” which is almost the concatenation of the two names. Though this is a rather weak play on words, it does help us to remember that, for this product, the “bra” is to the left of the “ket.”
In linear algebra this product is often called the inner product or the dot product, but the bra-ket notation is the one used in quantum mechanics, and it is the one that we will use throughout the book.
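As an aside for readers who like to experiment, the bra-ket product is easy to compute by machine. Here is a minimal Python sketch of our own, with hypothetical vectors of our choosing:

```python
import numpy as np

# <a| as a row vector and |b> as a column vector of the same dimension.
bra_a = np.array([[1.0, 2.0, 0.0]])       # <a|
ket_b = np.array([[3.0], [-1.0], [4.0]])  # |b>

# The bra-ket product <a|b> = a1*b1 + a2*b2 + a3*b3.
product = (bra_a @ ket_b).item()
print(product)  # 1*3 + 2*(-1) + 0*4 = 1.0
```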
Now that we have defined the bra-ket product, let’s see what we can do with it. We start by revisiting lengths.
If we have a ket denoted by $|a\rangle$, then the bra $\langle a|$ with the same name is defined in the obvious way. They both have exactly the same entries, but for $|a\rangle$ they are arranged vertically, and for $\langle a|$ horizontally. Consequently,

$$\langle a|a\rangle = a_1^2 + a_2^2 + \cdots + a_n^2,$$

and so the length of $|a\rangle$ can be written succinctly as

$$\||a\rangle\| = \sqrt{\langle a|a\rangle}.$$
To illustrate, we return to the example where we found the length of $|a\rangle = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$:

$$\langle a|a\rangle = \begin{bmatrix} 3 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ 1 \end{bmatrix} = 9 + 1 = 10.$$

Then we take the square root to obtain $\||a\rangle\| = \sqrt{10}$.
Unit vectors are going to become very important in our study. To see whether a vector is unit (has length 1), we will repeatedly use the fact that a ket $|a\rangle$ is a unit vector if and only if $\langle a|a\rangle = 1$.
Another important concept is orthogonality. The bra-ket product can also tell us when two vectors are orthogonal.
The key result is: Two kets $|a\rangle$ and $|b\rangle$ are orthogonal if and only if $\langle a|b\rangle = 0$. We will look at a couple of examples and then give an explanation of why this result is true.
Since

$$\begin{bmatrix} 3 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = 3 \cdot 1 + 1 \cdot 2 = 5 \neq 0,$$

we know that $\begin{bmatrix} 3 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ are not orthogonal. Since

$$\begin{bmatrix} 1 & 2 \end{bmatrix}\begin{bmatrix} -2 \\ 1 \end{bmatrix} = 1 \cdot (-2) + 2 \cdot 1 = 0,$$

we know that $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} -2 \\ 1 \end{bmatrix}$ are orthogonal.
Why does this work? Here is an explanation for two-dimensional kets.
Let

$$|a\rangle = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \quad \text{and} \quad |b\rangle = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \quad \text{then} \quad |a\rangle + |b\rangle = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \end{bmatrix}.$$

We calculate the square of the length of $|a\rangle + |b\rangle$:

$$\||a\rangle + |b\rangle\|^2 = (a_1 + b_1)^2 + (a_2 + b_2)^2 = a_1^2 + a_2^2 + b_1^2 + b_2^2 + 2(a_1b_1 + a_2b_2).$$

Clearly this number equals $\||a\rangle\|^2 + \||b\rangle\|^2 = a_1^2 + a_2^2 + b_1^2 + b_2^2$ if and only if $a_1b_1 + a_2b_2 = 0$. Now recall our observation that two vectors $|a\rangle$ and $|b\rangle$ are orthogonal if and only if

$$\||a\rangle\|^2 + \||b\rangle\|^2 = \||a\rangle + |b\rangle\|^2.$$

We can restate this observation using our calculation for the square of the length of $|a\rangle + |b\rangle$ to say: Two vectors $|a\rangle$ and $|b\rangle$ are orthogonal if and only if

$$a_1b_1 + a_2b_2 = \langle a|b\rangle = 0.$$

Though we have shown this for two-dimensional kets, the same argument can be extended to kets of any dimension.
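For readers following along on a computer, here is a quick numerical check of this equivalence, a sketch of our own in Python:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([-2.0, 1.0])

# The bra-ket product <a|b> is zero ...
print(a @ b)  # 0.0

# ... exactly when the Pythagorean condition holds:
# ||a||^2 + ||b||^2 == ||a + b||^2.
print(np.isclose(a @ a + b @ b, (a + b) @ (a + b)))  # True
```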
The word “orthonormal” has two parts: ortho, from orthogonal, and normal, from normalized, which in this instance means unit. If we are working with two-dimensional kets, an orthonormal basis will consist of a set of two unit kets that are orthogonal to one another. In general, if we are working with n-dimensional kets, an orthonormal basis consists of a set of n unit kets that are mutually orthogonal.
We begin by looking at two-dimensional kets. The set of all two-dimensional vectors is denoted by $\mathbb{R}^2$. An orthonormal basis for $\mathbb{R}^2$ consists of a set containing two unit vectors $|b_1\rangle$ and $|b_2\rangle$ that are orthogonal. So, given a pair of kets, to check whether they form an orthonormal basis, we must check first to see if they are unit vectors, and then check whether they are orthogonal. We can check both of these conditions using bra-kets. We need $\langle b_1|b_1\rangle = 1$, $\langle b_2|b_2\rangle = 1$, and $\langle b_1|b_2\rangle = 0$.
The standard example, which is called the standard basis, is to take $|b_1\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $|b_2\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. It is straightforward to check that these bra-ket properties are satisfied. While the standard basis is a particularly easy basis to find, there are infinitely many other possible choices. Two of these are

$$\left\{ \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix} \right\} \quad \text{and} \quad \left\{ \begin{bmatrix} 1/2 \\ \sqrt{3}/2 \end{bmatrix}, \begin{bmatrix} \sqrt{3}/2 \\ -1/2 \end{bmatrix} \right\}.$$
In the last chapter we considered measuring the spin of a particle. We looked at spin measured in the vertical direction and in the horizontal direction. The mathematical model for measuring spin in the vertical direction will be given using the standard basis. Rotating the measuring apparatus will be described mathematically by choosing a new orthonormal basis. The three two-dimensional bases* that we have listed will all have important interpretations concerning spin, so instead of naming the vectors in the bases with letters we will use arrows, with the direction of the arrow related to the direction of spin. Here are the names we are going to use:

$$|{\uparrow}\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad |{\downarrow}\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad |{\rightarrow}\rangle = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix},$$
$$|{\leftarrow}\rangle = \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}, \quad |{\nearrow}\rangle = \begin{bmatrix} 1/2 \\ \sqrt{3}/2 \end{bmatrix}, \quad |{\searrow}\rangle = \begin{bmatrix} \sqrt{3}/2 \\ -1/2 \end{bmatrix}.$$
Our three bases can be written more succinctly as $\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$, $\{|{\rightarrow}\rangle, |{\leftarrow}\rangle\}$, and $\{|{\nearrow}\rangle, |{\searrow}\rangle\}$. Since these are orthonormal, we have the following bra-ket values:

$$\langle {\uparrow}|{\uparrow}\rangle = \langle {\downarrow}|{\downarrow}\rangle = 1, \qquad \langle {\uparrow}|{\downarrow}\rangle = \langle {\downarrow}|{\uparrow}\rangle = 0,$$
$$\langle {\rightarrow}|{\rightarrow}\rangle = \langle {\leftarrow}|{\leftarrow}\rangle = 1, \qquad \langle {\rightarrow}|{\leftarrow}\rangle = \langle {\leftarrow}|{\rightarrow}\rangle = 0,$$
$$\langle {\nearrow}|{\nearrow}\rangle = \langle {\searrow}|{\searrow}\rangle = 1, \qquad \langle {\nearrow}|{\searrow}\rangle = \langle {\searrow}|{\nearrow}\rangle = 0.$$
Given a ket and an orthonormal basis, we can express the ket as a weighted sum of the basis vectors. Although at this stage it is not clear that this is useful, we will see later that this is one of the basic ideas on which our mathematical model is based. We start by looking at two-dimensional examples.
Any vector in $\mathbb{R}^2$ can be written as a multiple of $|{\uparrow}\rangle$ plus a multiple of $|{\downarrow}\rangle$. This is equivalent to the rather obvious fact that for any numbers $c$ and $d$ the equation

$$\begin{bmatrix} c \\ d \end{bmatrix} = x\begin{bmatrix} 1 \\ 0 \end{bmatrix} + y\begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

has a solution. Clearly, this has a solution of $x = c$ and $y = d$, and this is the only solution.
Can any vector in $\mathbb{R}^2$ be written as a multiple of $|{\rightarrow}\rangle$ plus a multiple of $|{\leftarrow}\rangle$? Equivalently, does the following equation have a solution for any numbers $c$ and $d$?

$$\begin{bmatrix} c \\ d \end{bmatrix} = x|{\rightarrow}\rangle + y|{\leftarrow}\rangle = x\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} + y\begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}$$
How do we solve this? We could replace the kets with their two-dimensional column vectors and then solve the resulting system of two linear equations in two unknowns. But there is a far easier way of doing this using bras and kets.
First, take the equation and multiply both sides on the left by the bra $\langle {\rightarrow}|$. This gives us the following equation.

$$\langle {\rightarrow}|\begin{bmatrix} c \\ d \end{bmatrix} = \langle {\rightarrow}|\left( x|{\rightarrow}\rangle + y|{\leftarrow}\rangle \right)$$

Next, distribute the terms on the right side of the equation.

$$\langle {\rightarrow}|\begin{bmatrix} c \\ d \end{bmatrix} = x\langle {\rightarrow}|{\rightarrow}\rangle + y\langle {\rightarrow}|{\leftarrow}\rangle$$

We know both of the bra-kets on the right side. The first is 1. The second is 0. This immediately tells us that $x$ is equal to

$$\langle {\rightarrow}|\begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}\begin{bmatrix} c \\ d \end{bmatrix}.$$

So, we just need to evaluate this product. Consequently,

$$x = \frac{c}{\sqrt{2}} + \frac{d}{\sqrt{2}} = \frac{c + d}{\sqrt{2}}.$$
We can use exactly the same method to find $y$. We start with the same initial equation and multiply both sides on the left by the bra $\langle {\leftarrow}|$. This gives

$$y = \langle {\leftarrow}|\begin{bmatrix} c \\ d \end{bmatrix} = \frac{c - d}{\sqrt{2}}.$$

This means that we can write

$$\begin{bmatrix} c \\ d \end{bmatrix} = \frac{c + d}{\sqrt{2}}\,|{\rightarrow}\rangle + \frac{c - d}{\sqrt{2}}\,|{\leftarrow}\rangle.$$
The sum on the right consists of multiplying the basis vectors by certain scalars and then adding the resulting vectors. I described it earlier as a weighted sum of the basis vectors, but you have to be careful with this interpretation. There is no reason for the scalars to be positive. They can be negative. In our example, if $c$ were to equal $-3$ and $d$ were to equal $1$, both of the weights, $(c + d)/\sqrt{2}$ and $(c - d)/\sqrt{2}$, would be negative. For this reason the term linear combination of the basis vectors is used instead of weighted sum.
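As a quick sanity check of this decomposition, here is a Python sketch of our own; the names `right`, `left`, and `v` are our assumptions, not the book's notation:

```python
import numpy as np

# The orthonormal basis {|right>, |left>} from the text.
right = np.array([1, 1]) / np.sqrt(2)
left = np.array([1, -1]) / np.sqrt(2)

c, d = -3.0, 1.0
v = np.array([c, d])

# The coefficients are the bra-ket products <right|v> and <left|v>.
x = right @ v   # (c + d)/sqrt(2)
y = left @ v    # (c - d)/sqrt(2)

print(x, y)  # both negative for c = -3, d = 1
print(np.allclose(x * right + y * left, v))  # True: the combination recovers v
```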
Now let’s move to n dimensions. Suppose that we are given an $n$-dimensional ket $|v\rangle$ and an orthonormal basis $\{|b_1\rangle, |b_2\rangle, \ldots, |b_n\rangle\}$. Can we write $|v\rangle$ as a linear combination of the basis vectors? If so, is there a unique way of doing this? Equivalently, does the equation

$$|v\rangle = x_1|b_1\rangle + x_2|b_2\rangle + \cdots + x_n|b_n\rangle$$

have a unique solution? Again, the answer is yes. To see this we will show how to find the value for $x_i$. The calculation follows exactly the same method we used in two dimensions. Start by multiplying both sides of the equation by $\langle b_i|$. We know that $\langle b_i|b_j\rangle$ equals 0 if $i \neq j$ and equals 1 if $i = j$. So, after multiplying by the bra, the right side simplifies to just $x_i$, and we obtain that $\langle b_i|v\rangle = x_i$. This tells us that $x_1 = \langle b_1|v\rangle$, $x_2 = \langle b_2|v\rangle$, etc. Consequently, we can write $|v\rangle$ as a linear combination of the basis vectors:

$$|v\rangle = \langle b_1|v\rangle|b_1\rangle + \langle b_2|v\rangle|b_2\rangle + \cdots + \langle b_n|v\rangle|b_n\rangle.$$
At this stage, this all seems somewhat abstract, but it will all become clear in the next chapter. Different orthonormal bases correspond to choosing different orientations to measure spin. The numbers given by the bra-kets like $\langle b_i|v\rangle$ are called probability amplitudes. The square of $\langle b_i|v\rangle$ will give us the probability of $|v\rangle$ jumping to $|b_i\rangle$ when we measure it. This will all be explained, but understanding the equation written above is crucial to what follows.
An ordered basis is a basis in which the vectors have been given an order; that is, there is a first vector, a second vector, and so on. If $\{|b_1\rangle, |b_2\rangle, \ldots, |b_n\rangle\}$ is a basis, we will denote the ordered basis by $(|b_1\rangle, |b_2\rangle, \ldots, |b_n\rangle)$: we change the brackets from curly to round. For an example, we will look at $\mathbb{R}^2$. Recall that the standard basis is $\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$. Two sets are equal if they have the same elements; the order of the elements does not matter, so $\{|{\uparrow}\rangle, |{\downarrow}\rangle\} = \{|{\downarrow}\rangle, |{\uparrow}\rangle\}$. The two sets are identical.

However, for an ordered basis the order in which the basis vectors are given matters: $(|{\uparrow}\rangle, |{\downarrow}\rangle) \neq (|{\downarrow}\rangle, |{\uparrow}\rangle)$. The first vector in the ordered basis on the left is not equal to the first vector in the ordered basis on the right, so the two ordered bases are distinct.
The difference between unordered bases and ordered bases might seem rather pedantic, but it is not. We will see several examples where we have the same set of basis vectors in which the order is different. The permutation of the basis vectors will give us important information.
As an example, earlier we noted that the standard basis corresponds to measuring the spin of an electron in the vertical direction. The ordered basis $(|{\downarrow}\rangle, |{\uparrow}\rangle)$ will correspond to measuring the spin when the south magnet is on top of our measuring apparatus. If we flip the apparatus through $180°$ we will also flip the basis elements and use the ordered basis $(|{\uparrow}\rangle, |{\downarrow}\rangle)$.
Supposing that we have been given a ket $|v\rangle$ and an orthonormal basis $\{|b_1\rangle, \ldots, |b_n\rangle\}$, we know how to write $|v\rangle$ as a linear combination of the basis vectors. We end up with

$$|v\rangle = \langle b_1|v\rangle|b_1\rangle + \langle b_2|v\rangle|b_2\rangle + \cdots + \langle b_n|v\rangle|b_n\rangle.$$

To simplify things, we will write this as

$$|v\rangle = c_1|b_1\rangle + c_2|b_2\rangle + \cdots + c_n|b_n\rangle, \quad \text{where } c_i = \langle b_i|v\rangle.$$

There is a useful formula for the length of $|v\rangle$. It’s

$$\||v\rangle\| = \sqrt{c_1^2 + c_2^2 + \cdots + c_n^2}.$$

Let’s quickly see why this is true. We know that $\||v\rangle\| = \sqrt{\langle v|v\rangle}$. Using

$$\langle v| = c_1\langle b_1| + c_2\langle b_2| + \cdots + c_n\langle b_n|,$$

we obtain

$$\langle v|v\rangle = \left( c_1\langle b_1| + \cdots + c_n\langle b_n| \right)\left( c_1|b_1\rangle + \cdots + c_n|b_n\rangle \right).$$

The next step is to expand the product of the terms in the parentheses. This looks as though it is going to be messy, but it is not. We again use the facts that $\langle b_i|b_j\rangle$ equals 0 if $i \neq j$ and equals 1 if $i = j$. All the bra-ket products with different subscripts are 0. The only bra-kets that are nonzero are the ones where the same subscript is repeated, and these are all 1. Consequently, we end up with

$$\langle v|v\rangle = c_1^2 + c_2^2 + \cdots + c_n^2.$$
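Here is a small Python check of the length formula, using an orthonormal basis from earlier; again, the variable names are ours:

```python
import numpy as np

# An orthonormal basis for R^2 and an arbitrary ket |v>.
b1 = np.array([1, 1]) / np.sqrt(2)
b2 = np.array([1, -1]) / np.sqrt(2)
v = np.array([3.0, 1.0])

c1, c2 = b1 @ v, b2 @ v  # the coefficients <b1|v> and <b2|v>

# The length of |v> computed two ways: directly, and from the formula.
print(np.sqrt(v @ v))          # sqrt(10)
print(np.sqrt(c1**2 + c2**2))  # the same value
```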
Matrices are rectangular arrays of numbers. A matrix $M$ with $m$ rows and $n$ columns is called an $m \times n$ matrix. Here are a couple of examples:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}.$$

$A$ has two rows and three columns, so it is a $2 \times 3$ matrix. $B$ is a $3 \times 2$ matrix. We can think of bras and kets as being special types of matrices: bras have just one row, and kets have just one column.
The transpose of an $m \times n$ matrix $M$, denoted $M^T$, is the $n \times m$ matrix formed by interchanging the rows and the columns of $M$. The $i$th row of $M$ becomes the $i$th column of $M^T$, and the $j$th column of $M$ becomes the $j$th row of $M^T$. For our matrices $A$ and $B$ we have:

$$A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}, \qquad B^T = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}.$$
Column vectors can be considered as matrices with just one column, and row vectors can be considered as matrices with just one row. With this interpretation, the relation between bras and kets with the same name is given by

$$\langle a| = \left( |a\rangle \right)^T \quad \text{and} \quad |a\rangle = \left( \langle a| \right)^T.$$
Given a general matrix that has multiple rows and columns, we think of the rows as denoting bras and the columns as denoting kets. In our example, we can think of A as consisting of two bras stacked on one another or as three kets side by side. Similarly, B can be considered as three bras stacked on one another or as two kets side by side.
The product of the matrices $A$ and $B$ uses this idea. The product is denoted by $AB$. It’s calculated by thinking of $A$ as consisting of bras and $B$ of kets. (Remember that bras always come before kets.)

The product $AB$ is calculated as follows, where $\langle a_1|$ and $\langle a_2|$ are the rows of $A$, and $|b_1\rangle$ and $|b_2\rangle$ are the columns of $B$:

$$AB = \begin{bmatrix} \langle a_1|b_1\rangle & \langle a_1|b_2\rangle \\ \langle a_2|b_1\rangle & \langle a_2|b_2\rangle \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 2 \cdot 3 + 3 \cdot 5 & 1 \cdot 2 + 2 \cdot 4 + 3 \cdot 6 \\ 4 \cdot 1 + 5 \cdot 3 + 6 \cdot 5 & 4 \cdot 2 + 5 \cdot 4 + 6 \cdot 6 \end{bmatrix} = \begin{bmatrix} 22 & 28 \\ 49 & 64 \end{bmatrix}.$$

Notice that the dimension of the bras in $A$ is equal to the dimension of the kets in $B$. We need to have this in order for the bra-ket products to be defined. Also notice that $AB \neq BA$. In our example, $BA$ is a $3 \times 3$ matrix, so it is not even the same size as $AB$.
In general, given an $m \times r$ matrix $A$ and an $r \times n$ matrix $B$, write $A$ in terms of $r$-dimensional bras $\langle a_1|, \ldots, \langle a_m|$ and $B$ in terms of $r$-dimensional kets $|b_1\rangle, \ldots, |b_n\rangle$. The product $AB$ is the $m \times n$ matrix that has $\langle a_i|b_j\rangle$ as the entry in the $i$th row and $j$th column, that is,

$$AB = \begin{bmatrix} \langle a_1|b_1\rangle & \cdots & \langle a_1|b_n\rangle \\ \vdots & \ddots & \vdots \\ \langle a_m|b_1\rangle & \cdots & \langle a_m|b_n\rangle \end{bmatrix}.$$

Reversing the order of multiplication gives $BA$, but we cannot even begin the calculation if $m$ is not equal to $n$ because the bras and kets would have different dimensions. Even if $m$ is equal to $n$, and we can multiply them, we would end up with a matrix that has size $r \times r$. This is not equal to $AB$, which has size $m \times n$, if $n$ is not equal to $r$. Even in the case when $n$, $m$, and $r$ are all equal to one another, it is usually not the case that $AB$ will equal $BA$. We say that matrix multiplication is not commutative to indicate this fact.
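To see the bras-times-kets view of matrix multiplication in action, here is a short Python sketch of our own, using the example matrices above:

```python
import numpy as np

# A is 2x3 (two 3-dimensional bras); B is 3x2 (two 3-dimensional kets).
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# Entry (i, j) of AB is the bra-ket product of row i of A with column j of B.
AB = np.array([[A[i] @ B[:, j] for j in range(B.shape[1])]
               for i in range(A.shape[0])])

print(AB)                          # [[22 28] [49 64]]
print(np.array_equal(AB, A @ B))   # True: agrees with numpy's matrix product
```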
Matrices with the same number of rows as columns are called square matrices. The main diagonal of a square matrix consists of the elements on the diagonal going from the top left of the matrix to the bottom right. A square matrix that has all main diagonal entries equal to 1 and all other entries equal to 0 is called an identity matrix. The $n \times n$ identity matrix is denoted by $I_n$, or simply by $I$ when its size is clear from the context:

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
The identity matrix gets its name from the fact that multiplying matrices by the identity is analogous to multiplying numbers by 1. Suppose that $A$ is an $m \times n$ matrix. Then $I_mA = AI_n = A$.
Matrices give us a convenient way of doing computations that involve bras and kets. The next section shows how we will be using them.
Suppose that we are given a set of n-dimensional kets and we want to check to see if it is an orthonormal basis. First, we have to check that they are all unit vectors. Then we have to check that the vectors are mutually orthogonal to one another. We have seen how to check both of these conditions using bras and kets, but the calculation can be expressed simply using matrices.
We begin by forming the matrix whose columns are the kets,

$$M = \begin{bmatrix} |b_1\rangle & |b_2\rangle & \cdots & |b_n\rangle \end{bmatrix},$$

then take its transpose, $M^T$, whose rows are the corresponding bras. Then we take the product

$$M^TM = \begin{bmatrix} \langle b_1|b_1\rangle & \langle b_1|b_2\rangle & \cdots & \langle b_1|b_n\rangle \\ \langle b_2|b_1\rangle & \langle b_2|b_2\rangle & \cdots & \langle b_2|b_n\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle b_n|b_1\rangle & \langle b_n|b_2\rangle & \cdots & \langle b_n|b_n\rangle \end{bmatrix}.$$

Notice that the entries down the main diagonal are exactly what we need to calculate in order to find if the kets are unit. And the entries off the diagonal are what we have to calculate to see if the kets are mutually orthogonal. This means that the set of vectors is an orthonormal basis if and only if $M^TM = I$. This equation gives a succinct way of writing down everything that we need to check.
Though it is a concise expression, we still need to do all the calculations to find the entries. We need to calculate all the entries along the main diagonal in order to check that the vectors are unit. However, we don’t need to calculate the entries below the main diagonal. If $i \neq j$, then one of $\langle b_i|b_j\rangle$ and $\langle b_j|b_i\rangle$ will be above and the other below the main diagonal. These two bra-ket products are equal, and once we have calculated one we don’t need to calculate the other. So, after we have checked that all the main diagonal entries are 1, we just need to check that all the entries above (or below) the diagonal are 0.
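The whole test reduces to one matrix product, as this small Python sketch of ours illustrates:

```python
import numpy as np

# Candidate basis kets as the columns of M (here {|right>, |left>}).
M = np.column_stack([
    np.array([1, 1]) / np.sqrt(2),   # |b1>
    np.array([1, -1]) / np.sqrt(2),  # |b2>
])

# The kets form an orthonormal basis iff M^T M is the identity matrix.
print(np.allclose(M.T @ M, np.eye(2)))  # True
```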
Now that we have checked that $\{|b_1\rangle, \ldots, |b_n\rangle\}$ is an orthonormal basis, suppose that we are given a ket $|v\rangle$ and want to express it as a linear combination of the basis vectors. We know how to do this:

$$|v\rangle = \langle b_1|v\rangle|b_1\rangle + \langle b_2|v\rangle|b_2\rangle + \cdots + \langle b_n|v\rangle|b_n\rangle.$$

Everything can be calculated using the matrix $M^T$: the product $M^T|v\rangle$ is the column vector whose entries are exactly the coefficients $\langle b_1|v\rangle, \langle b_2|v\rangle, \ldots, \langle b_n|v\rangle$.
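And the coefficients come from a single matrix-vector product, as in this sketch (our own illustration, continuing the example above):

```python
import numpy as np

# The same basis matrix M as above; |v> is an arbitrary ket.
M = np.column_stack([
    np.array([1, 1]) / np.sqrt(2),
    np.array([1, -1]) / np.sqrt(2),
])
v = np.array([-3.0, 1.0])

coeffs = M.T @ v  # the bra-ket coefficients <b_i|v>
print(coeffs)
print(np.allclose(M @ coeffs, v))  # True: the linear combination rebuilds |v>
```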
This has been a long chapter in which much mathematical machinery has been introduced. But the mathematics has been building, and we now have a number of ways for performing calculations. Three key ideas that we will need later are summarized in the final section. (They are at the end of the chapter for easy reference.) Before we conclude we look at some naming conventions.
A square matrix $M$ that has real entries and has the property that $M^TM$ is equal to the identity matrix is called an orthogonal matrix.
As we saw in the last section, we can check whether we have an orthonormal basis by forming the matrix $M$ whose columns are the kets and then checking whether the resulting matrix is orthogonal. Orthogonal matrices will also be important when we look at quantum logic gates: these gates also correspond to orthogonal matrices.
Two important orthogonal matrices are

$$\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$
The first matrix corresponds to the ordered basis $(|{\rightarrow}\rangle, |{\leftarrow}\rangle)$, which we will meet in the next chapter, where we will see how it is connected to measuring spin in the horizontal direction. We will also meet exactly the same matrix later: it is the matrix corresponding to a special gate, called the Hadamard gate.
The second matrix corresponds to taking the standard basis for $\mathbb{R}^4$ and ordering it with the last two vectors interchanged. This matrix is associated with the CNOT gate. We will explain later exactly what gates are, but practically all of our quantum circuits will be composed of just these two types of gates. So, these orthogonal matrices are important!
(If we were working with complex numbers, the matrix entries could be complex numbers. Matrices with complex entries that correspond to orthogonal matrices are called unitary.** Real numbers are a subset of the complex numbers, so all orthogonal matrices are unitary. Practically every other book on quantum computing will call the matrices describing the CNOT gate and the Hadamard gate unitary, but we are calling them orthogonal. Both are correct.)
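Both matrices are easy to verify as orthogonal by machine. Here is a sketch of our own in Python, where we call them `H` and `CNOT` purely for convenience:

```python
import numpy as np

# The two orthogonal matrices from the text.
H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

# Both satisfy M^T M = I, so both are orthogonal matrices.
print(np.allclose(H.T @ H, np.eye(2)))        # True
print(np.allclose(CNOT.T @ CNOT, np.eye(4)))  # True
```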
Here is a list of three tasks that we will need to perform repeatedly. These are all easy to do. The methods for tackling each task are given.
The first task is to determine whether a set of $n$-dimensional kets $\{|b_1\rangle, \ldots, |b_n\rangle\}$ forms an orthonormal basis. To do this, first construct the matrix $M = \begin{bmatrix} |b_1\rangle & \cdots & |b_n\rangle \end{bmatrix}$. Then compute $M^TM$. If this is the identity matrix, we have an orthonormal basis. If it isn’t, we don’t.
The second task is to write a ket $|v\rangle$ as a linear combination of the orthonormal basis vectors. To do this, construct $M^T$. Then

$$M^T|v\rangle = \begin{bmatrix} \langle b_1|v\rangle \\ \vdots \\ \langle b_n|v\rangle \end{bmatrix} \quad \text{and} \quad |v\rangle = \langle b_1|v\rangle|b_1\rangle + \cdots + \langle b_n|v\rangle|b_n\rangle.$$
The third task is to find the length of a ket $|v\rangle$. To do this, use

$$\||v\rangle\| = \sqrt{\langle v|v\rangle}.$$
Now that we have the tools, we return to the study of spin.