8 
Quantum Algorithms

Popular descriptions of quantum algorithms describe them as being much faster than regular algorithms. This speedup, it is explained, comes from being able to put the input into a superposition of all possible inputs and then performing the algorithm on the superposition. Consequently, instead of running the algorithm on just one input, as you do classically, you can run the algorithm using “quantum parallelism” on all possible inputs at the same time. These descriptions often end at this point. But this leaves many unanswered questions. We seem to end up with many possible answers all superimposed on one another. If we make a measurement, won’t we get just one of these answers at random? There are far more likely to be wrong answers than right answers, so aren’t we more likely to end up with a wrong answer than with the right answer?

Clearly, there has to be more to quantum algorithms than just putting everything into a superposition of states. The real art of constructing these algorithms is being able to manipulate these superpositions so that when we make measurements we get a useful answer. In this chapter, we will look at three quantum algorithms and see how they tackle this problem. We will see that not every algorithm is susceptible to a quantum speedup. Quantum algorithms are not classical algorithms that have been sped up. Instead, they involve quantum ideas to see the problem in a new light; the algorithms work not by the use of brute force, but by ingenious ways of exploiting underlying patterns that can be seen from only the quantum viewpoint.

We will describe three algorithms in detail. All three are ingenious exploitations of underlying mathematical patterns. The level of difficulty increases as we move through the algorithms. Some mathematics books use a star to denote a difficult section and a double star to denote a very difficult section. The Deutsch-Jozsa algorithm probably deserves a star, and Simon’s algorithm a double star.

At the end of the chapter, we will talk a little about the properties that questions must have in order for a quantum algorithm to solve them faster than a classical one, and why such questions seem so hard to find! But first we must describe how the speed of algorithms is measured.

The Complexity Classes P and NP

Imagine that you are given the following problems. You are told that you are not allowed to use a calculator or computer but have to work them out using paper and pencil.

  • Find two whole numbers bigger than 1 whose product is equal to 35.
  • Find two whole numbers bigger than 1 whose product is equal to 187.
  • Find two whole numbers bigger than 1 whose product is equal to 2,407.
  • Find two whole numbers bigger than 1 whose product is equal to 88,631.

You won’t have much difficulty doing the first question, but each subsequent question is harder and will take more steps and consequently more time to solve. Before we analyze this in more detail, consider another four problems.

  • Multiply 7 by 5 and check that it equals 35.
  • Multiply 11 by 17 and check that it equals 187.
  • Multiply 29 by 83 and check that it equals 2407.
  • Multiply 337 by 263 and check that it equals 88,631.

These questions are undoubtedly easier than the first series. Again each subsequent question takes more time to solve than the previous one, but the amount of time is growing more slowly. Even the fourth question takes less than a minute to solve by hand.
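The contrast between the two lists can be sketched in a few lines of code. This is my own illustration, not from the text: `trial_division_steps` is a hypothetical helper that factors a number by trying successive divisors and counts how many trial divisions it performs, while checking a proposed factorization is a single multiplication.

```python
from math import isqrt

def trial_division_steps(m):
    """Factor m by trying divisors 2, 3, 4, ...; return ((p, q), divisions tried)."""
    steps = 0
    for d in range(2, isqrt(m) + 1):
        steps += 1
        if m % d == 0:
            return (d, m // d), steps
    return (m, 1), steps  # no divisor found: m is prime

# Finding the factors takes more and more divisions as the number grows...
_, easy = trial_division_steps(35)      # 4 divisions
_, hard = trial_division_steps(88631)   # 262 divisions
# ...while *checking* a proposed answer is a single multiplication:
check = (337 * 263 == 88631)            # True
```

The division count is only a crude cost model, but it captures the asymmetry between the two lists of problems: finding is slow, checking is fast.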

We will denote the number of digits of the input number by n, so in the first set of questions we start with n = 2 and go up to n = 5.

We will let t(n) denote the time, or equivalently the number of steps, needed to solve a question of input length n. Complexity looks at how t(n) grows as n grows. In particular, we ask whether we can find positive numbers k and p such that t(n) ≤ kn^p for every value of n. If we can, we say that the underlying problem can be solved in polynomial time. If, on the other hand, we can find a positive number k and a number b > 1 such that t(n) ≥ kb^n for every value of n, we say that the problem requires exponential time. Recall the basic fact concerning polynomial versus exponential growth: given enough time, something with exponential growth will grow much faster than something with polynomial growth.

In computer science, questions that can be solved in polynomial time are considered tractable; those that require exponential time are not. Problems that can be solved in polynomial time are regarded as easy; those that require exponential time are hard. In practice, it turns out that most polynomial time problems involve polynomials of small degree, so even if we don't currently have the computational power to solve a problem with a large value of n, we should have it in a few years. With an exponential time problem, on the other hand, once the size has grown beyond what we can currently tackle, even a slight further increase in n produces a problem that is much harder and unlikely to be solvable in the foreseeable future.
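To make the polynomial-versus-exponential contrast concrete, here is a small numeric sketch of my own, comparing a cubic cost model with a base-2 exponential one; the function names are illustrative, not standard.

```python
def poly_cost(n):
    """A polynomial-time cost model: n^3 steps."""
    return n ** 3

def expo_cost(n):
    """An exponential-time cost model: 2^n steps."""
    return 2 ** n

# For small n the polynomial can even be the larger of the two...
assert poly_cost(8) > expo_cost(8)              # 512 vs 256
# ...but the exponential soon leaves it far behind:
assert expo_cost(30) > 1000 * poly_cost(30)     # 2^30 vs 1000 * 27000
```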

Let’s look at our two sets of problems. The second set involves multiplying two numbers together, but this is easy to do. As n increases it does take more time, but it can be shown that this is a polynomial time problem. What about the first set of questions? If you tried tackling them, you probably believe that the amount of time needed is exponential in n, not polynomial, but is this the case? Everybody thinks so, but nobody has found a proof.

In 1991, RSA Laboratories posted a challenge. It listed large numbers, each of which was the product of two primes, and the challenge was to factor them. The numbers ranged from about 100 decimal digits long to more than 600 digits. You were, of course, allowed to use computers! There were prizes for the first person to factor each of them. The 100-digit numbers were factored relatively quickly, but the numbers with 300 or more digits still haven’t been factored.

If a problem can be solved in polynomial time we say it belongs to the complexity class P. So the problem that consists of multiplying two numbers together belongs to P. Suppose that instead of solving the problem, someone gives you the answer and you just have to check that the answer is correct. If this process of checking that an answer is correct takes polynomial time, then we say the problem belongs to complexity class NP.* The problem of factoring a large number into the product of two primes belongs to NP.

Clearly, checking that an answer is correct is easier than actually finding the answer, so every problem that is in P is also in NP. But what about the converse: does every NP problem belong to P? Is it true that every question whose answer can be checked in polynomial time can also be solved in polynomial time? You are probably saying to yourself, “Of course not!” Most people would agree that it seems extremely unlikely, but nobody has managed to prove that P is not equal to NP. The problem of factoring a large number into the product of two primes belongs to NP, and we don’t think it belongs to P, but nobody has been able to prove it.

The problem of whether NP is equal to P is one of the most important in computer science. In 2000, the Clay Mathematics Institute listed seven “Millennium Prize Problems,” each with a prize of a million dollars. The P versus NP problem is one of the seven.

Are Quantum Algorithms Faster Than Classical Ones?

Most quantum computer scientists believe that P is not equal to NP. They also think that there are problems that are in NP but not P, which a quantum computer can solve in polynomial time. This means that there are problems that a quantum computer can solve in polynomial time that a classical computer cannot. To prove this, however, involves the first step of showing that some problem belongs to NP but not to P, and as we have seen, nobody knows how to do this. So, how can we compare the speed of quantum algorithms to classical algorithms? There are two ways: one theoretical, the other practical. The theoretical way is to invent a new way of measuring complexity that makes it easier to construct proofs. The practical way is to construct quantum algorithms for solving important real-world problems in polynomial time that we believe, but have been unable to prove, do not belong to P.

An example of the second approach is Shor’s algorithm for factoring the product of two primes. Peter Shor constructed a quantum algorithm that works in polynomial time. We believe, but have been unable to prove, that a classical algorithm cannot do this in polynomial time. Why is that important? Well, as we shall see our Internet security depends on this. That said, in the rest of this chapter we will take the first approach—defining a new way of calculating complexity.

Query Complexity

All of the algorithms that we are going to look at in this chapter concern evaluating functions. The Deutsch and Deutsch-Jozsa algorithms consider functions that belong to two classes. We are given a function at random, and we have to determine which of the two classes the function belongs to. Simon’s algorithm concerns periodic functions of a special type. Again we are given one of these functions at random, and we have to determine the period.

When we run these algorithms we have to evaluate the functions. The query complexity counts the number of times that we have to evaluate the function to get our answer. The function is sometimes called a black box or an oracle. Instead of saying that we are evaluating the function, we say that we are querying the black box or the oracle. The point of this is that we don’t have to worry about how to write an algorithm that emulates the function, so we don’t have to calculate the number of steps that function takes to evaluate the input. We just keep track of the number of questions. This is much simpler. To illustrate, we begin with the most elementary example.

Deutsch’s Algorithm

David Deutsch is one of the founders of quantum computing. In 1985, he published a landmark paper that described quantum Turing machines and quantum computation.** This paper also includes the following algorithm—the first to show that a quantum algorithm could be faster than a classical one.

The problem concerns functions of just one variable. The input can be either 0 or 1. The output also just takes the values 0 or 1. There are four of these functions, which we will denote f_0, f_1, f_2, and f_3:

The function f_0 sends both inputs to 0; i.e., f_0(0) = 0 and f_0(1) = 0.

The function f_1 sends 0 to 0 and 1 to 1; i.e., f_1(0) = 0 and f_1(1) = 1.

The function f_2 sends 0 to 1 and 1 to 0; i.e., f_2(0) = 1 and f_2(1) = 0.

The function f_3 sends both inputs to 1; i.e., f_3(0) = 1 and f_3(1) = 1.

The functions f_0 and f_3 are called constant functions. The output is the same value for both inputs; the output is constant. A function is called balanced if it sends half its inputs to 0 and the other half to 1. Both f_1 and f_2 are balanced.

The question that Deutsch posed is this: Given one of these four functions at random, how many function evaluations do we have to make to determine whether the function is constant or balanced? It is important to understand what we are asking. We are not interested in which of the four functions we have, but solely in whether the given function is constant or not.

The classical analysis is as follows. We can evaluate our given function at either 0 or 1. Supposing that we choose to evaluate it by plugging in 0, then there are two possible outcomes: either we get 0 or we get 1. If we get 0, all we know is f(0) = 0. The function could be either f_0 or f_1. Since one is constant and the other is balanced, we are forced to evaluate our function again to decide between them. Classically, to answer the question we have to plug both 0 and 1 into the function. We need to make two function evaluations.
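The counting argument can be spelled out in code. This sketch is my own, not the book's: it lists the four functions and checks which of them remain consistent with the evaluations made so far.

```python
# The four one-bit functions f0..f3, indexed as in the text.
FUNCS = {
    0: {0: 0, 1: 0},  # f0: constant
    1: {0: 0, 1: 1},  # f1: balanced
    2: {0: 1, 1: 0},  # f2: balanced
    3: {0: 1, 1: 1},  # f3: constant
}

def consistent_after(queries):
    """Return the indices of functions that agree with the observed (input, output) pairs."""
    return {i for i, f in FUNCS.items()
            if all(f[x] == y for x, y in queries)}

# Seeing only f(0) = 0 leaves both f0 (constant) and f1 (balanced),
# so one classical evaluation cannot settle the question:
survivors = consistent_after([(0, 0)])        # {0, 1}
# A second evaluation always does:
settled = consistent_after([(0, 0), (1, 0)])  # {0}
```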

We now look at the quantum version of the question. First, we construct gates that correspond to the four functions. The following picture depicts the gates, where i can take on the numbers 0, 1, 2, or 3.

[Figure: the gate F_i, for i = 0, 1, 2, or 3. Two qubits enter: |x⟩ on the top wire and |y⟩ on the bottom wire. The top wire outputs |x⟩ unchanged; the bottom wire outputs |f_i(x) ⊕ y⟩.]

This says that:

If we input |0⟩|0⟩, it outputs |0⟩|f_i(0) ⊕ 0⟩.

If we input |0⟩|1⟩, it outputs |0⟩|f_i(0) ⊕ 1⟩.

If we input |1⟩|0⟩, it outputs |1⟩|f_i(1) ⊕ 0⟩.

If we input |1⟩|1⟩, it outputs |1⟩|f_i(1) ⊕ 1⟩.

Notice that for each i, one of f_i(0) ⊕ 0 and f_i(0) ⊕ 1 is equal to 0 and the other is equal to 1, and one of f_i(1) ⊕ 0 and f_i(1) ⊕ 1 is equal to 0 and the other is equal to 1. This means that the four outputs always give us the standard basis elements, telling us that the matrix representing our gate is orthogonal, and so we really do have a gate.

Though we enter two bits of information and get two bits as output, the information these gates give for the classical inputs |0⟩ and |1⟩ is exactly the same as the functions evaluated at 0 and 1. The top qubit is exactly what we entered, so that piece of output gives us no new information. The choice of |0⟩ or |1⟩ for the second input gives us the option of the second output being the function evaluated on the top input ket or the opposite answer. If we know one of these answers, we know the other.

The quantum computing question that corresponds to the classical question is: Given one of these four gates at random, how many times do you have to use the gate to determine whether the underlying function f_i is constant or whether it is balanced?

If we restrict to just entering |0⟩ and |1⟩ into the gate, the analysis is exactly the same as before. You have to use the gate twice. But David Deutsch showed that if we are allowed to input qubits containing superpositions of |0⟩ and |1⟩, the gate needs to be used only once. To show this, he used the following circuit.

[Figure: Deutsch's circuit. The inputs |0⟩ (top) and |1⟩ (bottom) each pass through a Hadamard gate and then enter the gate F_i. The top output passes through another Hadamard gate and is measured.]

The little meter symbol at the right end of the top wire means that we are going to measure this qubit. The lack of the meter symbol on the second wire tells us that we won’t be measuring the second output qubit. Let us see how this circuit works.

The qubits |0⟩ and |1⟩ are input. They go through the Hadamard gates, which put them in the state

(1/√2)(|0⟩ + |1⟩) ⊗ (1/√2)(|0⟩ − |1⟩).

These then go through the F_i gate. The state becomes

½(|0⟩|f(0)⟩ − |0⟩|f(0) ⊕ 1⟩ + |1⟩|f(1)⟩ − |1⟩|f(1) ⊕ 1⟩).

This can be rearranged to give:

½(|0⟩(|f(0)⟩ − |f(0) ⊕ 1⟩) + |1⟩(|f(1)⟩ − |f(1) ⊕ 1⟩)).

We now make the observation that |f(0)⟩ − |f(0) ⊕ 1⟩ is either |0⟩ − |1⟩ or |1⟩ − |0⟩, depending on whether f(0) is 0 or 1. But we can be clever about this and write

|f(0)⟩ − |f(0) ⊕ 1⟩ = (−1)^f(0) (|0⟩ − |1⟩).

We also deduce that

|f(1)⟩ − |f(1) ⊕ 1⟩ = (−1)^f(1) (|0⟩ − |1⟩).

The state of our qubits after passing through the F_i gate can then be written as

½((−1)^f(0) |0⟩(|0⟩ − |1⟩) + (−1)^f(1) |1⟩(|0⟩ − |1⟩)).

We can rearrange this to give

½((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩)(|0⟩ − |1⟩),

then

(1/√2)((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩) · (1/√2)(|0⟩ − |1⟩),

and finally

[(1/√2)((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩)] ⊗ [(1/√2)(|0⟩ − |1⟩)].

This shows that the two qubits are not entangled, and the top qubit has state

(1/√2)((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩).

Let’s examine this state for each of the four possibilities for f_i.

For f_0, we have f(0) = f(1) = 0, so the qubit is (1/√2)(|0⟩ + |1⟩).

For f_1, we have f(0) = 0 and f(1) = 1, so the qubit is (1/√2)(|0⟩ − |1⟩).

For f_2, we have f(0) = 1 and f(1) = 0, so the qubit is −(1/√2)(|0⟩ − |1⟩).

For f_3, we have f(0) = f(1) = 1, so the qubit is −(1/√2)(|0⟩ + |1⟩).

The next step in the circuit is to send our qubit through the Hadamard gate. This gate sends (1/√2)(|0⟩ + |1⟩) to |0⟩ and (1/√2)(|0⟩ − |1⟩) to |1⟩. So we know:

If f_i = f_0, the qubit is |0⟩.

If f_i = f_1, the qubit is |1⟩.

If f_i = f_2, the qubit is −|1⟩.

If f_i = f_3, the qubit is −|0⟩.

If we now measure the qubit in the standard basis, we will get 0 if i is either 0 or 3, and we will get 1 if i is either 1 or 2. Of course, f_0 and f_3 are the constant functions and f_1 and f_2 are the balanced ones. So, if after measuring we get 0, we know with certainty that the original function was constant. If we get 1, we know that the original function was balanced.

Consequently, we need to ask the oracle only one question versus two. For Deutsch’s problem there is therefore a slight speedup using a quantum algorithm. This algorithm has no real practical applications, but, as we noted earlier, it was the first example of proving that there are quantum algorithms faster than classical ones.
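For readers who like to verify the algebra, here is a small pure-Python simulation of the circuit. It is my own sketch (names like `deutsch` and `FUNCS` are illustrative), representing the two-qubit state as a length-4 amplitude vector over the basis |00⟩, |01⟩, |10⟩, |11⟩.

```python
from math import sqrt

# The four one-bit functions f0..f3, as dictionaries from input to output.
FUNCS = [{0: 0, 1: 0}, {0: 0, 1: 1}, {0: 1, 1: 0}, {0: 1, 1: 1}]
h = 1 / sqrt(2)

def apply(M, v):
    """Multiply a 4x4 matrix by a length-4 amplitude vector."""
    return [sum(M[r][c] * v[c] for c in range(4)) for r in range(4)]

def deutsch(f):
    # Start in |0>|1>, i.e. the basis state with index 1.
    v = [0.0, 1.0, 0.0, 0.0]
    # Hadamard on both qubits: H tensor H.
    HH = [[0.5, 0.5, 0.5, 0.5],
          [0.5, -0.5, 0.5, -0.5],
          [0.5, 0.5, -0.5, -0.5],
          [0.5, -0.5, -0.5, 0.5]]
    v = apply(HH, v)
    # The oracle: |x>|y> -> |x>|f(x) XOR y>, a permutation of the basis.
    w = [0.0] * 4
    for idx, amp in enumerate(v):
        x, y = divmod(idx, 2)
        w[2 * x + (f[x] ^ y)] += amp
    # Hadamard on the top qubit only: H tensor I.
    HI = [[h, 0, h, 0], [0, h, 0, h], [h, 0, -h, 0], [0, h, 0, -h]]
    v = apply(HI, w)
    p0 = v[0] ** 2 + v[1] ** 2   # probability of measuring 0 on the top qubit
    return "constant" if p0 > 0.5 else "balanced"
```

A single call, i.e. one use of the oracle, classifies f_0 and f_3 as constant and f_1 and f_2 as balanced, matching the case analysis above.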

We will look at two other quantum algorithms in detail. They both involve inputting a number of qubits and then sending each one through a Hadamard gate. We introduce a little more mathematics to help keep the description of many qubits in superposition from becoming too unwieldy.

The Kronecker Product of Hadamard Matrices

We know that the matrix for the Hadamard gate is given by

H = (1/√2) [ 1   1 ]
           [ 1  −1 ].

This tells us that

H|0⟩ = (1/√2)(|0⟩ + |1⟩),

and

H|1⟩ = (1/√2)(|0⟩ − |1⟩).

Suppose that we input two qubits and send both through Hadamard gates. The four basis vectors will be sent as follows:

|0⟩|0⟩ goes to

(1/√2)(|0⟩ + |1⟩) ⊗ (1/√2)(|0⟩ + |1⟩) = ½(|00⟩ + |01⟩ + |10⟩ + |11⟩).

|0⟩|1⟩ goes to

(1/√2)(|0⟩ + |1⟩) ⊗ (1/√2)(|0⟩ − |1⟩) = ½(|00⟩ − |01⟩ + |10⟩ − |11⟩).

|1⟩|0⟩ goes to

(1/√2)(|0⟩ − |1⟩) ⊗ (1/√2)(|0⟩ + |1⟩) = ½(|00⟩ + |01⟩ − |10⟩ − |11⟩).

|1⟩|1⟩ goes to

(1/√2)(|0⟩ − |1⟩) ⊗ (1/√2)(|0⟩ − |1⟩) = ½(|00⟩ − |01⟩ − |10⟩ + |11⟩).

Recall that we can write everything in terms of four-dimensional kets. The previous four statements are equivalent to saying:

[1, 0, 0, 0]^T goes to ½[1, 1, 1, 1]^T,

[0, 1, 0, 0]^T goes to ½[1, −1, 1, −1]^T,

[0, 0, 1, 0]^T goes to ½[1, 1, −1, −1]^T,

[0, 0, 0, 1]^T goes to ½[1, −1, −1, 1]^T,

where the kets are written as transposed column vectors.

This is a description of an orthonormal basis being sent to another orthonormal basis. So, we can write the matrix that corresponds to this. We call this new matrix H ⊗ H.

H ⊗ H = ½ [ 1   1   1   1 ]
          [ 1  −1   1  −1 ]
          [ 1   1  −1  −1 ]
          [ 1  −1  −1   1 ]

There is an underlying pattern to this matrix that involves H.

H ⊗ H = (1/√2) [ H   H ]
               [ H  −H ].

This pattern continues. The matrix that corresponds to inputting three qubits and sending all three through Hadamard gates can be written using H ⊗ H.

H ⊗ H ⊗ H = (1/√2) [ H⊗H   H⊗H ]
                   [ H⊗H  −H⊗H ]

As n increases these matrices quickly get large, but, writing H^{⊗n} for the n-fold product, it is always true that

H^{⊗n} = (1/√2) [ H^{⊗(n−1)}   H^{⊗(n−1)} ]
                [ H^{⊗(n−1)}  −H^{⊗(n−1)} ],

and this gives us a recursive formula that lets us quickly calculate them. These products of matrices, telling us how to act on tensor products, are called Kronecker products.

For Simon’s algorithm we are going to need to study these matrices in some detail, but for our next algorithm the key observation is that the entries in the top row of each of these matrices are all equal to one another; for H^{⊗n} they all equal 1/√(2^n).
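The recursion can be checked directly. The following sketch is my own (the names `kron` and `H_n` are illustrative); it builds the n-fold Kronecker product from the 2×2 Hadamard matrix using plain Python lists.

```python
from math import sqrt, isclose

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

h = 1 / sqrt(2)
H = [[h, h], [h, -h]]

def H_n(n):
    """The n-fold Kronecker product H (x) H (x) ... (x) H, built recursively."""
    M = H
    for _ in range(n - 1):
        M = kron(H, M)
    return M

# Every entry in the top row of H_n(n) equals 1 / sqrt(2^n):
top_row_3 = H_n(3)[0]   # eight entries, each 1/sqrt(8)
```

For n = 3 the matrix is 8 × 8 and every top-row entry is 1/√8, matching the observation above.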

The Deutsch-Jozsa Algorithm

Deutsch’s algorithm looked at functions of one variable. You were given one of these and had to determine whether it was a constant or balanced function. The Deutsch-Jozsa problem is a generalization of this.

We now have functions of n variables. The inputs for each of these variables, as before, can be either 0 or 1. The output is either 0 or 1. We are told that our function is either constant (all the inputs get sent to 0, or all the inputs get sent to 1) or balanced (half the inputs get sent to 0 and the other half to 1). If we are given one of these functions at random, how many function evaluations do we need to determine whether the function belongs to the constant group or to the balanced group?

To illustrate, we consider the case when n = 3. Our function takes three inputs, each of which can take two values. This means that there are 2³, or 8, possible inputs:

(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1).

Classically, suppose we evaluate f(0,0,0) and get the answer that f(0,0,0) = 1. We cannot deduce anything from this piece of information alone, so we ask for another function evaluation, say f(0,0,1). If we get f(0,0,1) = 0, then we are done. We know that the function cannot be constant, so it must be balanced. On the other hand, if we get f(0,0,1) = 1, we cannot deduce anything from the two pieces of information we have. In the worst possible scenario, we could get the same answer for the first four questions and still not be able to answer the question. For example, from the fact that f(0,0,0) = 1, f(0,0,1) = 1, f(0,1,0) = 1, and f(0,1,1) = 1 we cannot determine whether or not the function is balanced. We need to ask one more question. If the answer to the next question is also 1, then we know the function is constant. If the answer is 0, then we know the function is balanced.

This analysis works in general. Given a function of n variables, there will be 2^n possible input strings. In the best-case scenario we can obtain the answer with just two questions to the oracle, but in the worst case it will take us 2^(n−1) + 1 questions. Since n appears in an exponent, this count grows exponentially. In the worst case it requires an exponential number of inquiries to the oracle. The Deutsch-Jozsa algorithm is a quantum algorithm that requires just one question to the oracle, so the speedup is substantial!

The first step, as in all of these questions, is to describe the oracle. For each of the functions we need to construct an orthogonal matrix that captures the essence of the function. We just generalize our previous construction.

Given any function f that has n boolean inputs and just one boolean output, we construct the gate F given by the following circuit, where the slashes with n on the top lines indicate that we have n wires in parallel.

[Figure: the gate F. The top n wires carry |x_1⟩, …, |x_n⟩ through unchanged; the bottom wire takes |y⟩ in and gives |f(x_1, …, x_n) ⊕ y⟩ out.]

Remember that this circuit tells us what happens when each of the kets |x_1⟩, …, |x_n⟩, |y⟩ is either |0⟩ or |1⟩. The input consists of n + 1 kets, |x_1⟩, …, |x_n⟩ and |y⟩, where the first n correspond to the function variables. The output also consists of n + 1 kets, the first n of which are exactly the same as the input kets. The last output is the ket |y⟩ if f(x_1, …, x_n) = 0 and the ket of the other boolean value when f(x_1, …, x_n) = 1; in other words, the last output is |f(x_1, …, x_n) ⊕ y⟩.

The next step after describing how the black-box function works is to give the quantum circuit that incorporates this function. It is the natural generalization of the circuit used for Deutsch’s algorithm: all the top qubits pass through Hadamard gates on either side of the black box.

[Figure: the Deutsch-Jozsa circuit. The top n qubits, each starting as |0⟩, pass through Hadamard gates, then the gate F, then Hadamard gates again, and are measured. The bottom qubit starts as |1⟩ and passes through a Hadamard gate before entering F.]

As before, we will analyze what this circuit does step by step. We show the case when n = 2, just to make things look a little less messy on the page, but every step we do works in exactly the same way for every value of n.

Step 1. The Qubits Pass through the Hadamard Gates

The top n inputs are all |0⟩. For n = 2, this is |0⟩|0⟩. The following calculation shows what happens.

(H ⊗ H)|00⟩ = ½(|00⟩ + |01⟩ + |10⟩ + |11⟩)

It gives a superposition of all possible states; each of the basis kets has the same probability amplitude (½ in this case).

(This calculation works for every value of n. After the n qubits have passed through H^{⊗n} they are in a superposition of all possible states, each of which has the same probability amplitude: 1/√(2^n).)

The bottom input is just |1⟩. This becomes (1/√2)(|0⟩ − |1⟩) after passing through the Hadamard gate. At this stage, our three input qubits will be in the following state.

½(|00⟩ + |01⟩ + |10⟩ + |11⟩) ⊗ (1/√2)(|0⟩ − |1⟩). We will rewrite this as

(1/(2√2))(|00⟩(|0⟩ − |1⟩) + |01⟩(|0⟩ − |1⟩) + |10⟩(|0⟩ − |1⟩) + |11⟩(|0⟩ − |1⟩))

Step 2. The Qubits Pass through the F Gate

After passing through the F gate the qubits will be in the following state.

(1/(2√2))(|00⟩(|f(0,0)⟩ − |f(0,0) ⊕ 1⟩) + |01⟩(|f(0,1)⟩ − |f(0,1) ⊕ 1⟩) + |10⟩(|f(1,0)⟩ − |f(1,0) ⊕ 1⟩) + |11⟩(|f(1,1)⟩ − |f(1,1) ⊕ 1⟩))

We now use the fact that if a is either 0 or 1 we have the following

|a⟩ − |a ⊕ 1⟩ = (−1)^a (|0⟩ − |1⟩)

to rewrite the state as

½((−1)^f(0,0)|00⟩ + (−1)^f(0,1)|01⟩ + (−1)^f(1,0)|10⟩ + (−1)^f(1,1)|11⟩) ⊗ (1/√2)(|0⟩ − |1⟩)

As before, this shows that the bottom qubit is not entangled with the top qubits. We just look at the top two qubits. These are in the state:

½((−1)^f(0,0)|00⟩ + (−1)^f(0,1)|01⟩ + (−1)^f(1,0)|10⟩ + (−1)^f(1,1)|11⟩)

(The argument that we have used works for general n. At this stage you obtain a state that is a superposition of all basis kets. Each ket |x_1 x_2 … x_n⟩ is multiplied by (−1)^f(x_1, x_2, …, x_n)/√(2^n).)

Step 3. The Top Qubits Pass through the Hadamard Gates

The standard method is to convert our state to a column vector and then multiply by the appropriate Kronecker product of Hadamard matrices. This gives:

(H ⊗ H) ½[(−1)^f(0,0), (−1)^f(0,1), (−1)^f(1,0), (−1)^f(1,1)]^T.

However, we are not going to calculate all of the entries in the resulting column vector. We are just going to calculate the top entry. This entry comes from multiplying the bra corresponding to the top row of the matrix with the ket given by the column vector. We get

¼((−1)^f(0,0) + (−1)^f(0,1) + (−1)^f(1,0) + (−1)^f(1,1)).

This is the probability amplitude of the ket |00⟩. We calculate this amplitude for the possible functions.

If f is constant and sends everything to 0, the probability amplitude is 1.

If f is constant and sends everything to 1, the probability amplitude is −1.

For the balanced functions, the probability amplitude is 0.

Step 4. Measure the Top Qubits

When we measure the top qubits we will get one of 00, 01, 10, or 11. The question becomes “do we get 00?” If the function is constant, we will get 00 with probability 1. If the function is balanced, we get it with probability 0. So, if the result of measuring gives 00, we know our function was constant. If the result is not 00, then the function is balanced.

The analysis works for general n. Just before we measure the qubits, the probability amplitude of |00…0⟩ is

(1/2^n) Σ_x (−1)^f(x), where the sum runs over all 2^n input strings x.

As with n = 2, this number will be ±1 if f is constant and 0 if f is balanced. So, if every measurement gives 0, the function is constant. If at least one of the measurements is 1, the function is balanced.

Consequently, we can solve the Deutsch-Jozsa problem for any value of n with just one use of the circuit. We only need to ask the oracle one question. Recall that in the classical case, the worst case required 2^(n−1) + 1 questions, so the improvement is dramatic.
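The key amplitude is easy to check numerically. In this sketch of mine, `amp_all_zeros` (an illustrative name) evaluates the sum (1/2^n) Σ (−1)^f(x) that gives the amplitude of |00…0⟩ just before measurement.

```python
from itertools import product

def amp_all_zeros(f, n):
    """Amplitude of |00...0> before measurement: (1/2^n) * sum over x of (-1)^f(x)."""
    return sum((-1) ** f(x) for x in product((0, 1), repeat=n)) / 2 ** n

def constant0(x):
    return 0

def constant1(x):
    return 1

def balanced(x):
    return x[0]   # 0 on half the inputs, 1 on the other half

# The constant functions give amplitude +1 or -1, a balanced one gives 0,
# so a single measurement of the top qubits decides the question.
```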

Simon’s Algorithm

The two algorithms that we have seen so far have been unusual in that we get the final answer with certainty after just one query. Most quantum algorithms mix quantum and classical parts, involve more than one use of a quantum circuit, and involve probability. Simon’s algorithm contains all of these components. However, before we describe the algorithm, we need to discuss the problem being tackled, and before we can do that, we need to introduce a new way of adding binary strings.

Bitwise Addition of Strings Modulo 2

We defined ⊕ to be the exclusive or, XOR, or, equivalently, addition modulo 2. Recall

0 ⊕ 0 = 0,   0 ⊕ 1 = 1,   1 ⊕ 0 = 1,   1 ⊕ 1 = 0.

We extend this definition to the addition of binary strings of the same length by the formula:

s ⊕ t = (s_1 ⊕ t_1)(s_2 ⊕ t_2) … (s_n ⊕ t_n), where

s = s_1 s_2 … s_n and t = t_1 t_2 … t_n.

This is like doing addition in binary, but ignoring any carries. Here’s a concrete example of bitwise addition:

1101 ⊕ 0111 = 1010.
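In code, bitwise addition modulo 2 of equal-length strings is a one-line XOR; `bitwise_add` is my own illustrative helper, not a name from the text.

```python
def bitwise_add(s, t):
    """Bitwise addition modulo 2 (XOR) of two equal-length binary strings."""
    assert len(s) == len(t)
    return "".join(str(int(a) ^ int(b)) for a, b in zip(s, t))

# Addition in binary, ignoring all carries:
example = bitwise_add("1101", "0111")   # "1010"
```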

The Statement of Simon’s Problem

We have a function f that sends binary strings of length n to binary strings of length n. It has the property that there is some secret binary string s, such that f(x) = f(y) if and only if y = x or y = x ⊕ s. We don’t allow s to be the string consisting entirely of 0s; this forces pairs of distinct input strings to have the same output strings. The problem is to determine the secret string s. An example should make all of this clear.

We will take n = 3, so our function f will take binary strings of length 3 and give other binary strings of length 3. Suppose that the secret string is s = 110. Now

000 ⊕ 110 = 110,   001 ⊕ 110 = 111,   010 ⊕ 110 = 100,   011 ⊕ 110 = 101.

Consequently, for this value of s, we get the following pairings:

000 ↔ 110,   001 ↔ 111,   010 ↔ 100,   011 ↔ 101.

One function with this property (any assignment that gives paired inputs the same output, and distinct pairs distinct outputs, would do) is

f(000) = f(110) = 101,   f(001) = f(111) = 010,   f(010) = f(100) = 111,   f(011) = f(101) = 000.

Now, of course, we don’t know the function f or the secret string s: We want to find s. The question is how many function evaluations need to be made to determine this string?

We keep evaluating the function f on strings. We stop as soon as we get a repeated answer. Once we have found two input strings that give the same output, we can immediately calculate s.

For example, if we find that f(011) = f(101), then we know that

101 = 011 ⊕ s.

Using the fact that

011 ⊕ 011 = 000,

bitwise add 011 to the left of both sides of the equation to obtain

s = 011 ⊕ 101 = 110.

How many evaluations do we have to make using a classical algorithm? We have eight binary strings. It is possible to evaluate four of these and get four different answers, but with the fifth evaluation we are guaranteed to get a pair. In general, for strings of length n, there are 2^n binary strings and, in the worst case, we will need to make 2^(n−1) + 1 function evaluations to get a repeat. So, in the worst case, we will need to ask the oracle 2^(n−1) + 1 questions.
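The classical strategy (query until a collision, then XOR the colliding inputs) can be sketched as follows. This is my own illustration: `TABLE` is an invented example function with secret string 110, and `find_secret` is an illustrative name.

```python
def bitwise_add(s, t):
    """Bitwise addition modulo 2 (XOR) of two equal-length binary strings."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(s, t))

SECRET = "110"

# Build an example f with f(x) = f(x XOR 110): send each input to the
# smaller member of its pair, which is distinct across pairs.
TABLE = {}
for i in range(8):
    x = format(i, "03b")
    TABLE[x] = min(x, bitwise_add(x, SECRET))

def find_secret(f, n):
    """Query f on successive strings until two inputs collide; XOR them to get s."""
    seen = {}
    for i in range(2 ** n):
        x = format(i, "0{}b".format(n))
        y = f(x)
        if y in seen:
            return bitwise_add(seen[y], x)
        seen[y] = x
    return None
```

Here `find_secret(TABLE.get, 3)` recovers "110" on its fifth query, matching the worst-case count 2^(n−1) + 1 = 5.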

Before we look at the quantum algorithm, we need to look at the Kronecker product of Hadamard matrices in a little more detail.

The Dot Product and the Hadamard Matrix

Given two binary strings a = a_1 a_2 … a_n and b = b_1 b_2 … b_n of the same length, we define the dot product by

a · b = (a_1 × b_1 + a_2 × b_2 + ⋯ + a_n × b_n) mod 2, where × denotes our usual multiplication.

So, for example, if a = 110 and b = 011, then a · b = (1 × 0 + 1 × 1 + 0 × 1) mod 2 = 1. This operation can be thought of as multiplying corresponding terms of the sequences, then adding, and finally determining whether the sum is odd or even.

In computer science, we often start counting at 0, so instead of counting from 1 to 4 we count from 0 to 3. Also, we often use binary. The numbers 0, 1, 2, and 3 are represented in binary by 00, 01, 10, and 11. Given a 4 × 4 matrix, we will label both the rows and the columns with these numbers, as is shown here:

         00   01   10   11
    00 [  ·    ·    ·    · ]
    01 [  ·    ·    ·    · ]
    10 [  ·    ·    ·    · ]
    11 [  ·    ·    ·    · ]

The position of an entry in this matrix is given by listing both the row and the column in which it appears. If we make the entry in the ith row and jth column be i · j, we get the following matrix.

         00   01   10   11
    00 [  0    0    0    0 ]
    01 [  0    1    0    1 ]
    10 [  0    0    1    1 ]
    11 [  0    1    1    0 ]

Compare this matrix to H ⊗ H. Notice that the entries that are 1 in our dot product matrix are in exactly the same positions as the negative entries in H ⊗ H. Using the facts that (−1)^0 = 1 and (−1)^1 = −1, we can write

(entry of H ⊗ H in row i, column j) = ½ (−1)^(i · j).

This method of finding where the positive and negative entries lie holds in general; for example, if we want the entry of H ⊗ H ⊗ H that is in the row with number 101 and the column with number 111, we calculate the dot product 101 · 111 = (1 + 0 + 1) mod 2 = 0. This tells us that the entry will be positive.
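The entry rule can be verified mechanically. The sketch below is mine (and repeats the `kron` helper so the snippet is self-contained); it rebuilds the Kronecker powers of H and checks every entry against (−1)^(i·j)/√(2^n), computing the dot product from the binary expansions.

```python
from math import sqrt, isclose

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

h = 1 / sqrt(2)
H = [[h, h], [h, -h]]

def H_n(n):
    """The n-fold Kronecker product of H with itself."""
    M = H
    for _ in range(n - 1):
        M = kron(H, M)
    return M

def dot(i, j):
    """Dot product modulo 2 of the binary expansions of i and j."""
    return bin(i & j).count("1") % 2

n = 3
M = H_n(n)
ok = all(isclose(M[i][j], (-1) ** dot(i, j) / sqrt(2 ** n))
         for i in range(2 ** n) for j in range(2 ** n))
```

In particular the entry in row 101, column 111 of H ⊗ H ⊗ H comes out positive, as the dot product calculation predicts.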

Hadamard Matrices and Simon’s Problem

Now that we know how to find entries of Kronecker products of Hadamard matrices, we are going to use this knowledge to see what happens when we add two columns of one of these products. We are going to add two columns that are paired by the secret string s given in Simon’s problem. If one column is labeled by the string b, the other will have label b ⊕ s. We are going to add these two columns together.

To illustrate, we will work with strings of length 2, and suppose the secret string is 10. We will be adding column 00 to 10, or column 01 to 11.

Here is $H \otimes H$.

$$H \otimes H = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

Adding the 00 column to 10 gives:

$$\frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}.$$

Adding the 01 column to 11 gives:

$$\frac{1}{2}\begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix}.$$

Notice that some probability amplitudes are getting amplified and some are canceling. What exactly is going on here?

It is fairly easy to check that products and bitwise addition obey the usual law of exponents:

$$(-1)^{a \cdot b}(-1)^{a \cdot s} = (-1)^{a \cdot (b \oplus s)}.$$

This tells us that $(-1)^{a \cdot b}$ and $(-1)^{a \cdot (b \oplus s)}$ will be equal if $a \cdot s = 0$, and that $(-1)^{a \cdot b}$ and $(-1)^{a \cdot (b \oplus s)}$ will have opposite signs if $a \cdot s = 1$.

We can summarize this as:

$$(-1)^{a \cdot b} + (-1)^{a \cdot (b \oplus s)} = \begin{cases} 2(-1)^{a \cdot b} & \text{if } a \cdot s = 0, \\ 0 & \text{if } a \cdot s = 1. \end{cases}$$

This tells us that when we add the two columns given by b and $b \oplus s$, the entry in row a will be 0 if $a \cdot s = 1$ and will be either 2 or $-2$ if $a \cdot s = 0$. In general, the entries cancel in the rows labeled with strings that have a dot product of 1 with the string s.

Looking back at our example, the reason that the bottom two entries are 0 is that these rows have labels 10 and 11 and these both have a dot product of 1 with our secret string s. The nonzero entries occur in the rows with labels 00 and 01 and these both have a dot product of 0 with s.
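A short Python sketch (again using the un-normalized Hadamard matrix, with names of our choosing) confirms the cancellation pattern for the secret string s = 10:

```python
def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    m = len(B)
    n = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n)]
            for i in range(n)]

H = [[1, 1], [1, -1]]    # un-normalized Hadamard matrix
H2 = kron(H, H)          # H ⊗ H, rows/columns labeled 00, 01, 10, 11

s, b = 0b10, 0b00        # secret string 10; add column 00 to column 10
col_sum = [H2[row][b] + H2[row][b ^ s] for row in range(4)]
print(col_sum)           # [2, 2, 0, 0]

def dot(a, b):
    """Dot product mod 2 of two 2-bit labels."""
    return ((a >> 1) * (b >> 1) + (a & 1) * (b & 1)) % 2

for row in range(4):
    if dot(row, s) == 1:
        assert col_sum[row] == 0        # these rows cancel
    else:
        assert col_sum[row] in (2, -2)  # these rows amplify
```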

We now have the information needed to understand the quantum circuit for Simon’s problem. It is going to give us a string whose dot product with the secret string s is 0. It is going to do that by adding two columns of the Hadamard matrix. Let’s see how it works.

The Quantum Circuit for Simon’s Problem

The first thing is to construct the black box—the gate that acts like f. The following circuit gives the construction.

[Figure: the circuit that constructs the black box F from the function f.]

We can think of this as inputting two strings consisting of 0s and 1s, both of which have the same length. The top string is unchanged. The bottom string is the function evaluated on the top string added bitwise to the bottom string.

The following diagram gives the circuit for the algorithm.

[Figure: the quantum circuit for Simon’s algorithm.]

We will illustrate what happens in the case $n = 2$. Everything we do generalizes straightforwardly for general n.

The first step consists of the qubits in the top register passing through the Hadamard gates. This should now look familiar. The top two qubits are initially in state $|00\rangle$, while after passing through the Hadamard gates they will be in state

$$\frac{1}{2}\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle\right).$$

The bottom two qubits remain in state $|00\rangle$. So, at this stage the four qubits are in state:

$$\frac{1}{2}\left(|00\rangle \otimes |00\rangle + |01\rangle \otimes |00\rangle + |10\rangle \otimes |00\rangle + |11\rangle \otimes |00\rangle\right)$$

The next thing that happens is that the qubits pass through the F gate. This changes the state to:

$$\frac{1}{2}\left(|00\rangle \otimes |f(00)\rangle + |01\rangle \otimes |f(01)\rangle + |10\rangle \otimes |f(10)\rangle + |11\rangle \otimes |f(11)\rangle\right)$$

The top qubits now pass through the Hadamard gates, which changes the state to:

$$\frac{1}{4}\bigl[\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle\right) \otimes |f(00)\rangle + \left(|00\rangle - |01\rangle + |10\rangle - |11\rangle\right) \otimes |f(01)\rangle + \left(|00\rangle + |01\rangle - |10\rangle - |11\rangle\right) \otimes |f(10)\rangle + \left(|00\rangle - |01\rangle - |10\rangle + |11\rangle\right) \otimes |f(11)\rangle\bigr]$$

The pattern of + and − signs comes from the matrix for $H \otimes H$. We now rearrange the terms, collecting them according to the first two qubits, which results in the following:

$$\frac{1}{4}\bigl[|00\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle + |f(10)\rangle + |f(11)\rangle\right) + |01\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle + |f(10)\rangle - |f(11)\rangle\right) + |10\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle - |f(10)\rangle - |f(11)\rangle\right) + |11\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle - |f(10)\rangle + |f(11)\rangle\right)\bigr]$$

This way of writing the state has a couple of nice features. The first is that here, also, the pattern of + and − signs comes from the matrix for $H \otimes H$. The second is that the pairs of qubits to the left of the tensor product correspond to the row numbers.

Now we use the fact that we know that $f(b) = f(b \oplus s)$, so $|f(b)\rangle = |f(b \oplus s)\rangle$. We can simplify things, combining these terms, by adding their probability amplitudes. This corresponds to the column addition we just looked at. To illustrate, suppose that $s = 10$; then $f(00) = f(10)$ and $f(01) = f(11)$. If we plug these values into the state, we obtain:

$$\frac{1}{4}\bigl[|00\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle + |f(00)\rangle + |f(01)\rangle\right) + |01\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle + |f(00)\rangle - |f(01)\rangle\right) + |10\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle - |f(00)\rangle - |f(01)\rangle\right) + |11\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle - |f(00)\rangle + |f(01)\rangle\right)\bigr]$$

which simplifies to

$$\frac{1}{4}\bigl[|00\rangle \otimes \left(2|f(00)\rangle + 2|f(01)\rangle\right) + |01\rangle \otimes \left(2|f(00)\rangle - 2|f(01)\rangle\right) + |10\rangle \otimes 0 + |11\rangle \otimes 0\bigr].$$

The kets to the left of the tensor products are labeled with the row numbers of the matrix. The 0s on the right of the tensor products occur in the rows whose dot product with s is 1.

We can simplify the state to:

$$\frac{1}{2}\bigl[|00\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle\right) + |01\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle\right)\bigr].$$

When we measure the top two qubits we will get either 00 or 01, each with probability 1/2.

Though we have only looked at the relatively simple case of $n = 2$, everything we have done holds for every value of n. At the end of the process we will end up with one of the strings whose dot product with the secret string is 0. Each of these strings is equally likely.
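For readers who like to check such claims numerically, here is a small Python sketch that reproduces the $n = 2$ measurement probabilities by tracking amplitudes directly. The particular 2-to-1 function f below, with secret string s = 10, is an assumption chosen for illustration:

```python
s = 0b10
f = {0b00: 0, 0b10: 0, 0b01: 1, 0b11: 1}   # satisfies f(x) = f(x XOR s)

def dot(a, b):
    """Dot product mod 2 of two bit strings stored as integers."""
    return bin(a & b).count("1") % 2

# After the H gates, the F gate, and the H gates again, the amplitude
# of |y>|v> is (1/4) * sum of (-1)^(x.y) over all x with f(x) = v.
# The probability of measuring y in the top register sums over v.
prob = {}
for y in range(4):
    p = 0.0
    for v in set(f.values()):
        amp = sum((-1) ** dot(x, y) for x in f if f[x] == v) / 4
        p += amp ** 2
    prob[y] = p

print(prob)  # {0: 0.5, 1: 0.5, 2: 0.0, 3: 0.0}

# Every outcome y that can actually be measured satisfies y . s = 0.
for y, p in prob.items():
    if p > 0:
        assert dot(y, s) == 0
```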

You might be concerned that after all this work we still don’t know s. This is where the classical part of Simon’s algorithm comes in.

The Classical Part of Simon’s Algorithm

We start with an example with $n = 5$. We know that there is some secret number $s = s_1 s_2 s_3 s_4 s_5$. We are not allowing 00000, so there are $2^5 - 1 = 31$ possible choices for s. We are going to try to find it using Simon’s quantum circuit.

We run it and get 10100 as an answer. We know that the dot product of this with s gives 0. So,

$$1 \cdot s_1 + 0 \cdot s_2 + 1 \cdot s_3 + 0 \cdot s_4 + 0 \cdot s_5 \equiv 0 \pmod{2}.$$

This tells us that $s_1 + s_3$ is even. Since these digits are either 0 or 1, we deduce that $s_1 = s_3$.

We run the circuit again hoping that we don’t get 10100 again. (The probability of this happening is 1/16, so we are fairly safe.) We also hope that we don’t get 00000, which wouldn’t give us any new information. Suppose we get 00100. Then we know that

$$0 \cdot s_1 + 0 \cdot s_2 + 1 \cdot s_3 + 0 \cdot s_4 + 0 \cdot s_5 \equiv 0 \pmod{2}.$$

This shows that $s_3$ must be 0. From the first step, we can now deduce that $s_1$ must also be 0. We run the circuit again and get 11110. We know that

$$1 \cdot s_1 + 1 \cdot s_2 + 1 \cdot s_3 + 1 \cdot s_4 + 0 \cdot s_5 \equiv 0 \pmod{2},$$

which tells us that $s_2 = s_4$. Running the circuit again gives 00111, telling us that

$$0 \cdot s_1 + 0 \cdot s_2 + 1 \cdot s_3 + 1 \cdot s_4 + 1 \cdot s_5 \equiv 0 \pmod{2}.$$

Consequently, we must have $s_4 = s_5$ and, since $s_2 = s_4$, we have $s_2 = s_4 = s_5$.

We know that not all of the digits are 0, so we must have $s_2 = s_4 = s_5 = 1$, and consequently s must be 01011. For this example, we made four calls to the oracle.

At this point, there are a couple of questions that you might be asking. The first concerns the algorithm for finding s using the outputs of the quantum circuit. We have seen what to do in a specific case, but is there an algorithm—a step-by-step procedure—that tells you what to do in every case? The second question concerns how we are measuring the number of questions we are asking the oracle. When we looked at the classical algorithm, we took the worst possible case and saw that after $2^{n-1} + 1$ questions we would definitely have our answer. But when we come to the quantum algorithm, the worst possible case is much worse! We are getting an answer at random. The answer does have a dot product of 0 with s, but we could get the same answer more than once. We could run our quantum circuit $2^{n-1} + 1$ times and get a string of 0s every single time. It is unlikely, but it is possible. A string of 0s gives us no information, so it is possible that after $2^{n-1} + 1$ questions to the oracle we haven’t deduced anything at all about the secret number. We will address both of these concerns.

Each time we run the circuit, we get a number whose dot product with s is zero. This gives us a linear equation in the n unknowns. Running the circuit several times results in our obtaining several—a system of—equations. In the previous example, at each stage we got a new equation, but that new equation also gave us some new information. The technical term for this is that the equation is linearly independent of the previous equations. In order to calculate s we need a system of $n - 1$ linearly independent equations.

Algorithms for solving systems of equations are extremely well known. They are studied in courses like Linear Algebra and Matrix Theory and have numerous applications. They are so commonly needed that they are programmed into most scientific calculators. We won’t discuss them here apart from mentioning that the number of steps required to solve a system of n equations can be bounded above by a quadratic expression involving n. We say the system can be solved in quadratic time.
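To make the classical step concrete, here is a toy Python sketch that recovers s from the four strings in our $n = 5$ example. For simplicity it brute-forces all $2^n$ candidates rather than performing the Gaussian elimination one would use in practice; all names are ours:

```python
def solve_simon(rows, n):
    """Return the smallest nonzero n-bit string s whose dot product with
    every collected row is 0 (brute force; fine for small n)."""
    values = [int(r, 2) for r in rows]
    for s in range(1, 2 ** n):
        if all(bin(v & s).count("1") % 2 == 0 for v in values):
            return format(s, f"0{n}b")

# The four equations from the worked example above.
print(solve_simon(["10100", "00100", "11110", "00111"], 5))  # 01011
```

With the four linearly independent equations collected above, 01011 is the only nonzero solution, matching the hand calculation.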

The other question that we need to address is this: How many times do we need to run the quantum circuit? As we pointed out, in the worst-case scenario, we can keep running our qubits through the circuit and never get any useful information. However, this is extremely unlikely. We examine this idea in more detail in the next section.

Complexity Classes

In complexity theory, the main classification is between problems that take polynomial time to solve and those that need more than polynomial time. Polynomial time algorithms are regarded as being practical even for very large values of n, but non-polynomial time algorithms are regarded as being infeasible for large n.

Problems that classical algorithms can solve in polynomial time are denoted by P. Problems that quantum algorithms can solve in polynomial time are denoted by QP (sometimes it is denoted by EQP, for exact quantum polynomial time). Usually when we use these terms we are referring to the number of steps that an algorithm takes, but, remember, we defined a new way of measuring complexity—query complexity—that counts the number of questions we need to ask an oracle. We saw that the Deutsch-Jozsa problem was not in the class P, but belonged to QP for query complexity. (The constant function is a degree 0 polynomial.) This is sometimes described as saying that the Deutsch-Jozsa problem separates P and QP—it is a problem that belongs to QP but not to P for query complexity.

However, let’s recall the worst-case scenario for the classical algorithm. To make things more concrete, we will take $n = 10$. We are given a function that takes 10-bit inputs and told that it is either balanced or constant. We have to keep evaluating our function on specific inputs until we can deduce the answer. There are $2^{10} = 1024$ possible inputs. The worst-case scenario is when the function is balanced, but we get the same answer for the first 512 evaluations, and then on the 513th evaluation we get the other value. But how likely is this to happen?

If the function is balanced, for each input value we are equally likely to get either a 0 or a 1. This can be compared to tossing a fair coin and obtaining a head or a tail. How likely is it to toss a fair coin 512 times and get heads every time? The answer is $\left(\frac{1}{2}\right)^{512}$, which is less than 1 divided by a googol, where a googol is $10^{100}$. It is a minute number!
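Exact integer arithmetic confirms the comparison (a quick check, not from the text):

```python
# (1/2)^512 < 1/googol exactly when 2^512 > 10^100.
googol = 10 ** 100
assert 2 ** 512 > googol
print(len(str(2 ** 512)))  # 2^512 has 155 decimal digits; a googol has 101
```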

Suppose you were given a coin and asked whether it was fair or was double-headed. If you toss it once and get heads, you can’t really answer the question. But if you toss it ten times and it comes up heads every time, then you can be fairly sure that it is double-headed. Of course, you could be wrong, but in practice we are willing to accept being wrong as long as the probability of this happening is very small.

This is what we do for the bounded-error complexity classes. We choose some bound on the probability of getting an error that we think is acceptable. Then we look at algorithms that can answer the question within our bound for error.

Returning to the Deutsch-Jozsa example, suppose that we want at least a 99.9 percent success rate, or equivalently an error rate of less than 0.1 percent. If a function is balanced, the probability of evaluating the function 11 times and getting 0 every time is 0.00049 to five decimal places. Similarly, the probability of obtaining 1 every time is 0.00049. Consequently, the probability of obtaining the same answer 11 times in a row when the function is balanced is just less than 0.001. So if we are willing to accept a bound on the probability of error of 0.1 percent, we can choose to make at most 11 function evaluations. If during the process we get both a 0 and a 1, we can stop and know with certainty that our function is balanced. If all 11 evaluations are the same, we will say the function is constant. We could be wrong, but our error rate is less than our chosen bound. Notice that this argument works for any n. In every case, we need 11 function evaluations at most.
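The 11-evaluation procedure can be sketched in Python as follows; sampling distinct random inputs, the function names, and the two test functions are our assumptions, not from the text:

```python
import random

def classify(f, n, evaluations=11):
    """Declare a 0/1-valued function on n-bit inputs balanced or constant
    after at most `evaluations` queries, with a small chance of error."""
    inputs = random.sample(range(2 ** n), evaluations)
    seen = {f(inputs[0])}
    for x in inputs[1:]:
        seen.add(f(x))
        if len(seen) == 2:
            return "balanced"   # two different values seen: certain
    return "constant"           # all values agreed: almost certainly right

# Two illustrative functions on 10-bit inputs.
constant = lambda x: 1
balanced = lambda x: x & 1      # answer is the lowest bit: half 0s, half 1s

print(classify(constant, 10))   # constant
```

For a balanced function the chance of a wrong "constant" verdict is just under 0.001, as computed above.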

Problems that classical algorithms can solve in polynomial time with the probability of error within some bound are denoted BPP (for bounded-error probabilistic polynomial time). The Deutsch-Jozsa problem is in the class BPP.

One thing that you might be worried about is whether a problem could be in BPP for one bound on the probability of error, but not in the class BPP for a smaller bound. This doesn’t happen. If the problem is in the class BPP, it will be there for every choice of the bound.

We now return to Simon’s algorithm. We need to keep sending qubits through the circuit until we have $n - 1$ linearly independent equations. As we know, in the worst case this process can go on forever, so Simon’s algorithm is not in class QP. However, let’s choose a bound that we are willing to accept on the probability of making an error. Then we can calculate N so that $\frac{1}{2^N}$ is less than our bound.
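For instance, with the 0.1 percent bound from the Deutsch-Jozsa discussion, the calculation of N is one line (a sketch; the variable names are ours):

```python
import math

bound = 0.001                        # chosen bound on the probability of error
N = math.ceil(math.log2(1 / bound))  # smallest N with 1/2^N below the bound
assert (1 / 2) ** N < bound
print(N)  # 10
```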

We won’t prove this, but it can be shown that if we run the circuit $n - 1 + N$ times, the probability of the $n - 1 + N$ equations containing a system of $n - 1$ linearly independent equations is greater than $1 - \frac{1}{2^N}$.

We can now state Simon’s algorithm. First we decide on a bound on the probability of error and calculate the value N. Again, the number N does not depend on n. We can use the same value of N in each case. We run Simon’s circuit $n - 1 + N$ times. The number of queries is $n - 1 + N$, which, since N is fixed, is a linear function of n. We make the assumption that our system of $n - 1 + N$ equations contains $n - 1$ linearly independent equations. We could be wrong, but the probability of being wrong is less than the bound that we chose. Then we solve the system of $n - 1 + N$ equations using a classical algorithm. The time taken will be quadratic in $n - 1 + N$, but because N is a constant, this can be expressed as a quadratic in n.

The algorithm as a whole contains the quantum part that takes linear time added to the classical part that takes quadratic time, giving quadratic time overall. Problems that quantum algorithms can solve in polynomial time with the probability of error within some bound are denoted BQP (for bounded-error quantum polynomial time). Simon’s algorithm shows the problem belongs to BQP for query complexity.

We showed that the classical algorithm, in the worst case, took $2^{n-1} + 1$ function evaluations—this is exponential in n, not polynomial, so the problem definitely does not belong to P. It can also be shown that even if we allow a bound on the probability of error the algorithm is still exponential, so the problem does not belong to BPP. We say that Simon’s problem separates BPP and BQP for query complexity.

Quantum Algorithms

We started this chapter by describing how in many popular descriptions the speedup provided by quantum algorithms is said to come solely from quantum parallelism—the fact that we can put the input into a superposition that involves all the basis states. However, we have looked at three algorithms and have seen that though we need to use quantum parallelism, we need to do much more. We will briefly look at what is needed and why it is hard!

The three algorithms we have studied are the most elementary and considered standard, but as you have probably noticed they are by no means trivial. The dates when they were published tell an important story. David Deutsch published his algorithm in his landmark paper of 1985. This was the first quantum algorithm, and it showed that a quantum algorithm could be faster than a classical one. Deutsch and Jozsa published their generalization of Deutsch’s algorithm in 1992, seven years later. It might seem surprising that what seems to be a fairly straightforward generalization took so long to find, but it is important to realize that it is the modern notation and presentation that make the generalization seem to be the natural one. Deutsch’s paper doesn’t state the problem exactly the way it is stated here, and it doesn’t use the diagrams for quantum circuits that are now standard. That said, there was an incredibly productive period from 1993 to 1995 when many of the most important algorithms were discovered. Daniel Simon’s algorithm was published in this window, as were the algorithms by Peter Shor and Lov Grover that we will look at in the next chapter.

Orthogonal matrices represent quantum gates. Quantum circuits consist of combinations of gates. These correspond to multiplying orthogonal matrices, and since the product of orthogonal matrices results in an orthogonal matrix, any quantum circuit can be described by just one orthogonal matrix. As we have seen, an orthogonal matrix corresponds to a change of basis—a different way of viewing the problem. This is the key idea. Quantum computing gives us more ways of viewing a problem than classical computing does. But in order to be effective, there has to be a view that shows the correct answer separated from other possible incorrect answers. Problems that quantum computers can solve faster than classical computers need to have a structure that becomes visible only when we transform it using an orthogonal matrix.

The problems that we have looked at are clearly reverse-engineered. They are not important problems that people had been studying for years and that only now, viewed from the right quantum computing perspective, turn out to be simpler to solve. Rather, they are problems that are specially created using the structure of Kronecker products of Hadamard matrices. Of course, what we really want is not to reverse-engineer a problem, but to take a problem that is important and then construct a quantum algorithm that is faster than any known classical algorithm. This is what Peter Shor achieved in his 1994 landmark paper, in which he showed (among other things) how quantum computing could be used to break the codes that are currently used for Internet security. We will briefly discuss Shor’s algorithm in the next chapter, where we look at the impact of quantum computing.