8 
Quantum Algorithms

Popular descriptions of quantum algorithms describe them as being much faster than regular algorithms. This speedup, it is explained, comes from being able to put the input into a superposition of all possible inputs and then performing the algorithm on the superposition. Consequently, instead of running the algorithm on just one input, as you do classically, you can run the algorithm using “quantum parallelism” on all possible inputs at the same time. These descriptions often end at this point. But this leaves many unanswered questions. We seem to end up with many possible answers all superimposed on one another. If we make a measurement, won’t we get just one of these answers at random? There are far more likely to be wrong answers than right answers, so aren’t we more likely to end up with a wrong answer than with the right answer?

Clearly, there has to be more to quantum algorithms than just putting everything into a superposition of states. The real art of constructing these algorithms is being able to manipulate these superpositions so that when we make measurements we get a useful answer. In this chapter, we will look at three quantum algorithms and see how they tackle this problem. We will see that not every algorithm is susceptible to a quantum speedup. Quantum algorithms are not classical algorithms that have been sped up. Instead, they involve quantum ideas to see the problem in a new light; the algorithms work not by the use of brute force, but by ingenious ways of exploiting underlying patterns that can be seen from only the quantum viewpoint.

We will describe three algorithms in detail. All three are ingenious exploitations of underlying mathematical patterns. The level of difficulty increases as we move through the algorithms. Some mathematics books use a star to denote a difficult section and a double star to denote a very difficult section. The Deutsch-Jozsa algorithm probably deserves a star, and Simon’s algorithm a double star.

At the end of the chapter, we will talk a little about the properties that questions must have in order for a quantum algorithm to solve them faster than a classical one, and why such questions seem so hard to find! But first we must describe how the speed of algorithms is measured.

The Complexity Classes P and NP

Imagine that you are given the following problems. You are told that you are not allowed to use a calculator or computer but have to work them out using paper and pencil.

  • Find two whole numbers bigger than 1 whose product is equal to 35.
  • Find two whole numbers bigger than 1 whose product is equal to 187.
  • Find two whole numbers bigger than 1 whose product is equal to 2,407.
  • Find two whole numbers bigger than 1 whose product is equal to 88,631.

You won’t have much difficulty doing the first question, but each subsequent question is harder and will take more steps and consequently more time to solve. Before we analyze this in more detail, consider another four problems.

  • Multiply 7 by 5 and check that it equals 35.
  • Multiply 11 by 17 and check that it equals 187.
  • Multiply 29 by 83 and check that it equals 2407.
  • Multiply 337 by 263 and check that it equals 88,631.

These questions are undoubtedly easier than the first series. Again each subsequent question takes more time to solve than the previous one, but the amount of time is growing more slowly. Even the fourth question takes less than a minute to solve by hand.
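The contrast between the two lists can be sketched in a few lines of code. This is my own illustration, not from the text: `trial_division_steps` is a hypothetical helper that factors a number by trying successive divisors and counts how many trial divisions it performs, while checking a proposed factorization is a single multiplication.

```python
from math import isqrt

def trial_division_steps(m):
    """Factor m by trying divisors 2, 3, 4, ...; return ((p, q), divisions tried)."""
    steps = 0
    for d in range(2, isqrt(m) + 1):
        steps += 1
        if m % d == 0:
            return (d, m // d), steps
    return (m, 1), steps  # no divisor found: m is prime

# Finding the factors takes more and more divisions as the number grows...
_, easy = trial_division_steps(35)      # 4 divisions
_, hard = trial_division_steps(88631)   # 262 divisions
# ...while *checking* a proposed answer is a single multiplication:
check = (337 * 263 == 88631)            # True
```

The division count is only a crude cost model, but it captures the asymmetry between the two lists of problems: finding is slow, checking is fast.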

We will denote the number of digits of the input number by n, so in the first set of questions we start with n = 2 and go up to n = 5.

We will let t(n) denote the time, or equivalently the number of steps, needed to solve a question of input length n. Complexity looks at how t(n) grows as n grows. In particular, we ask whether we can find positive numbers k and p such that t(n) ≤ kn^p for every value of n. If we can, we say that the underlying problem can be solved in polynomial time. If, on the other hand, we can find a positive number k and a number b > 1 such that t(n) ≥ kb^n for every value of n, we say that the problem requires exponential time. Recall the basic fact concerning polynomial versus exponential growth: given enough time, something with exponential growth will grow much faster than something with polynomial growth.

In computer science, questions that can be solved in polynomial time are considered tractable; those that require exponential time are not. Problems that can be solved in polynomial time are regarded as easy; those that require exponential time are hard. In practice, it turns out that most polynomial time problems involve polynomials of small degree, so even if we don't currently have the computational power to solve a problem with a large value of n, we should have it in a few years. With an exponential time problem, on the other hand, once the size has grown beyond what we can currently tackle, even a slight further increase in n produces a problem that is much harder and unlikely to be solvable in the foreseeable future.
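To make the polynomial-versus-exponential contrast concrete, here is a small numeric sketch of my own, comparing a cubic cost model with a base-2 exponential one; the function names are illustrative, not standard.

```python
def poly_cost(n):
    """A polynomial-time cost model: n^3 steps."""
    return n ** 3

def expo_cost(n):
    """An exponential-time cost model: 2^n steps."""
    return 2 ** n

# For small n the polynomial can even be the larger of the two...
assert poly_cost(8) > expo_cost(8)              # 512 vs 256
# ...but the exponential soon leaves it far behind:
assert expo_cost(30) > 1000 * poly_cost(30)     # 2^30 vs 1000 * 27000
```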

Let’s look at our two sets of problems. The second set involves multiplying two numbers together, but this is easy to do. As n increases it does take more time, but it can be shown that this is a polynomial time problem. What about the first set of questions? If you tried tackling them, you probably believe that the amount of time needed is exponential in n, not polynomial, but is this the case? Everybody thinks so, but nobody has found a proof.

In 1991, RSA Laboratories posted a challenge. It listed large numbers, each of which was the product of two primes, and the challenge was to factor them. The numbers ranged from about 100 decimal digits long to more than 600 digits. You were, of course, allowed to use computers! There were prizes for the first person to factor each of them. The 100-digit numbers were factored relatively quickly, but the numbers with 300 or more digits still haven’t been factored.

If a problem can be solved in polynomial time we say it belongs to the complexity class P. So the problem that consists of multiplying two numbers together belongs to P. Suppose that instead of solving the problem, someone gives you the answer and you just have to check that the answer is correct. If this process of checking that an answer is correct takes polynomial time, then we say the problem belongs to complexity class NP.* The problem of factoring a large number into the product of two primes belongs to NP.

Clearly, checking that an answer is correct is easier than actually finding the answer, so every problem that is in P is also in NP. But what about the converse: does every NP problem belong to P? Is it true that every question whose answer can be checked in polynomial time can also be solved in polynomial time? You are probably saying to yourself, “Of course not!” Most people would agree that it seems extremely unlikely, but nobody has managed to prove that P is not equal to NP. The problem of factoring a large number into the product of two primes belongs to NP, and we don’t think it belongs to P, but nobody has been able to prove it.

The problem of whether NP is equal to P is one of the most important in computer science. In 2000, the Clay Mathematics Institute listed seven “Millennium Prize Problems,” each with a prize of a million dollars. The P versus NP problem is one of the seven.

Are Quantum Algorithms Faster Than Classical Ones?

Most quantum computer scientists believe that P is not equal to NP. They also think that there are problems that are in NP but not P, which a quantum computer can solve in polynomial time. This means that there are problems that a quantum computer can solve in polynomial time that a classical computer cannot. To prove this, however, involves the first step of showing that some problem belongs to NP but not to P, and as we have seen, nobody knows how to do this. So, how can we compare the speed of quantum algorithms to classical algorithms? There are two ways: one theoretical, the other practical. The theoretical way is to invent a new way of measuring complexity that makes it easier to construct proofs. The practical way is to construct quantum algorithms for solving important real-world problems in polynomial time that we believe, but have been unable to prove, do not belong to P.

An example of the second approach is Shor’s algorithm for factoring the product of two primes. Peter Shor constructed a quantum algorithm that works in polynomial time. We believe, but have been unable to prove, that a classical algorithm cannot do this in polynomial time. Why is that important? Well, as we shall see our Internet security depends on this. That said, in the rest of this chapter we will take the first approach—defining a new way of calculating complexity.

Query Complexity

All of the algorithms that we are going to look at in this chapter concern evaluating functions. The Deutsch and Deutsch-Jozsa algorithms consider functions that belong to two classes. We are given a function at random, and we have to determine which of the two classes the function belongs to. Simon’s algorithm concerns periodic functions of a special type. Again we are given one of these functions at random, and we have to determine the period.

When we run these algorithms we have to evaluate the functions. The query complexity counts the number of times that we have to evaluate the function to get our answer. The function is sometimes called a black box or an oracle. Instead of saying that we are evaluating the function, we say that we are querying the black box or the oracle. The point of this is that we don’t have to worry about how to write an algorithm that emulates the function, so we don’t have to calculate the number of steps that function takes to evaluate the input. We just keep track of the number of questions. This is much simpler. To illustrate, we begin with the most elementary example.

Deutsch’s Algorithm

David Deutsch is one of the founders of quantum computing. In 1985, he published a landmark paper that described quantum Turing machines and quantum computation.** This paper also includes the following algorithm—the first to show that a quantum algorithm could be faster than a classical one.

The problem concerns functions of just one variable. The input can be either 0 or 1. The output also just takes the values 0 or 1. There are four of these functions, which we will denote f_0, f_1, f_2, and f_3:

The function f_0 sends both inputs to 0; i.e., f_0(0) = 0 and f_0(1) = 0.

The function f_1 sends 0 to 0 and 1 to 1; i.e., f_1(0) = 0 and f_1(1) = 1.

The function f_2 sends 0 to 1 and 1 to 0; i.e., f_2(0) = 1 and f_2(1) = 0.

The function f_3 sends both inputs to 1; i.e., f_3(0) = 1 and f_3(1) = 1.

The functions f_0 and f_3 are called constant functions. The output is the same value for both inputs; the output is constant. A function is called balanced if it sends half its inputs to 0 and the other half to 1. Both f_1 and f_2 are balanced.

The question that Deutsch posed is this: Given one of these four functions at random, how many function evaluations do we have to make to determine whether the function is constant or balanced? It is important to understand what we are asking. We are not interested in which of the four functions we have, but solely in whether the given function is constant or not.

The classical analysis is as follows. We can evaluate our given function at either 0 or 1. Supposing that we choose to evaluate it by plugging in 0, then there are two possible outcomes: either we get 0 or we get 1. If we get 0, all we know is f(0) = 0. The function could be either f_0 or f_1. Since one is constant and the other is balanced, we are forced to evaluate our function again to decide between them. Classically, to answer the question we have to plug both 0 and 1 into the function. We need to make two function evaluations.
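The counting argument can be spelled out in code. This sketch is my own, not the book's: it lists the four functions and checks which of them remain consistent with the evaluations made so far.

```python
# The four one-bit functions f0..f3, indexed as in the text.
FUNCS = {
    0: {0: 0, 1: 0},  # f0: constant
    1: {0: 0, 1: 1},  # f1: balanced
    2: {0: 1, 1: 0},  # f2: balanced
    3: {0: 1, 1: 1},  # f3: constant
}

def consistent_after(queries):
    """Return the indices of functions that agree with the observed (input, output) pairs."""
    return {i for i, f in FUNCS.items()
            if all(f[x] == y for x, y in queries)}

# Seeing only f(0) = 0 leaves both f0 (constant) and f1 (balanced),
# so one classical evaluation cannot settle the question:
survivors = consistent_after([(0, 0)])        # {0, 1}
# A second evaluation always does:
settled = consistent_after([(0, 0), (1, 0)])  # {0}
```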

We now look at the quantum version of the question. First, we construct gates that correspond to the four functions. The following picture depicts the gates, where i can take on the numbers 0, 1, 2, or 3.

[Figure: the gate F_i, for i = 0, 1, 2, or 3. Two qubits enter: |x⟩ on the top wire and |y⟩ on the bottom wire. The top wire outputs |x⟩ unchanged; the bottom wire outputs |f_i(x) ⊕ y⟩.]

This says that:

If we input |0⟩|0⟩, it outputs |0⟩|f_i(0) ⊕ 0⟩.

If we input |0⟩|1⟩, it outputs |0⟩|f_i(0) ⊕ 1⟩.

If we input |1⟩|0⟩, it outputs |1⟩|f_i(1) ⊕ 0⟩.

If we input |1⟩|1⟩, it outputs |1⟩|f_i(1) ⊕ 1⟩.

Notice that for each i, one of f_i(0) ⊕ 0 and f_i(0) ⊕ 1 is equal to 0 and the other is equal to 1, and one of f_i(1) ⊕ 0 and f_i(1) ⊕ 1 is equal to 0 and the other is equal to 1. This means that the four outputs always give us the standard basis elements, telling us that the matrix representing our gate is orthogonal, and so we really do have a gate.

Though we enter two bits of information and get two bits as output, the information these gates give for the classical inputs |0⟩ and |1⟩ is exactly the same as the functions evaluated at 0 and 1. The top qubit is exactly what we entered, so that piece of output gives us no new information. The choice of |0⟩ or |1⟩ for the second input gives us the option of the second output being the function evaluated on the top input ket or the opposite answer. If we know one of these answers, we know the other.

The quantum computing question that corresponds to the classical question is: Given one of these four gates at random, how many times do you have to use the gate to determine whether the underlying function f_i is constant or whether it is balanced?

If we restrict to just entering |0⟩ and |1⟩ into the gate, the analysis is exactly the same as before. You have to use the gate twice. But David Deutsch showed that if we are allowed to input qubits containing superpositions of |0⟩ and |1⟩, the gate needs to be used only once. To show this, he used the following circuit.

[Figure: Deutsch's circuit. The inputs |0⟩ (top) and |1⟩ (bottom) each pass through a Hadamard gate and then enter the gate F_i. The top output passes through another Hadamard gate and is measured.]

The little meter symbol at the right end of the top wire means that we are going to measure this qubit. The lack of the meter symbol on the second wire tells us that we won’t be measuring the second output qubit. Let us see how this circuit works.

The qubits |0⟩ and |1⟩ are input. They go through the Hadamard gates, which put them in the state

(1/√2)(|0⟩ + |1⟩) ⊗ (1/√2)(|0⟩ − |1⟩).

These then go through the F_i gate. The state becomes

½(|0⟩|f(0)⟩ − |0⟩|f(0) ⊕ 1⟩ + |1⟩|f(1)⟩ − |1⟩|f(1) ⊕ 1⟩).

This can be rearranged to give:

½(|0⟩(|f(0)⟩ − |f(0) ⊕ 1⟩) + |1⟩(|f(1)⟩ − |f(1) ⊕ 1⟩)).

We now make the observation that |f(0)⟩ − |f(0) ⊕ 1⟩ is either |0⟩ − |1⟩ or |1⟩ − |0⟩, depending on whether f(0) is 0 or 1. But we can be clever about this and write

|f(0)⟩ − |f(0) ⊕ 1⟩ = (−1)^f(0) (|0⟩ − |1⟩).

We also deduce that

|f(1)⟩ − |f(1) ⊕ 1⟩ = (−1)^f(1) (|0⟩ − |1⟩).

The state of our qubits after passing through the F_i gate can then be written as

½((−1)^f(0) |0⟩(|0⟩ − |1⟩) + (−1)^f(1) |1⟩(|0⟩ − |1⟩)).

We can rearrange this to give

½((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩)(|0⟩ − |1⟩),

then

(1/√2)((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩) · (1/√2)(|0⟩ − |1⟩),

and finally

[(1/√2)((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩)] ⊗ [(1/√2)(|0⟩ − |1⟩)].

This shows that the two qubits are not entangled, and the top qubit has state

(1/√2)((−1)^f(0) |0⟩ + (−1)^f(1) |1⟩).

Let’s examine this state for each of the four possibilities for f_i.

For f_0, we have f(0) = f(1) = 0, so the qubit is (1/√2)(|0⟩ + |1⟩).

For f_1, we have f(0) = 0 and f(1) = 1, so the qubit is (1/√2)(|0⟩ − |1⟩).

For f_2, we have f(0) = 1 and f(1) = 0, so the qubit is −(1/√2)(|0⟩ − |1⟩).

For f_3, we have f(0) = f(1) = 1, so the qubit is −(1/√2)(|0⟩ + |1⟩).

The next step in the circuit is to send our qubit through the Hadamard gate. This gate sends (1/√2)(|0⟩ + |1⟩) to |0⟩ and (1/√2)(|0⟩ − |1⟩) to |1⟩. So we know:

If f_i = f_0, the qubit is |0⟩.

If f_i = f_1, the qubit is |1⟩.

If f_i = f_2, the qubit is −|1⟩.

If f_i = f_3, the qubit is −|0⟩.

If we now measure the qubit in the standard basis, we will get 0 if i is either 0 or 3, and we will get 1 if i is either 1 or 2. Of course, f_0 and f_3 are the constant functions and f_1 and f_2 are the balanced ones. So, if after measuring we get 0, we know with certainty that the original function was constant. If we get 1, we know that the original function was balanced.

Consequently, we need to ask the oracle only one question versus two. For Deutsch’s problem there is therefore a slight speedup using a quantum algorithm. This algorithm has no real practical applications, but, as we noted earlier, it was the first example of proving that there are quantum algorithms faster than classical ones.
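For readers who like to verify the algebra, here is a small pure-Python simulation of the circuit. It is my own sketch (names like `deutsch` and `FUNCS` are illustrative), representing the two-qubit state as a length-4 amplitude vector over the basis |00⟩, |01⟩, |10⟩, |11⟩.

```python
from math import sqrt

# The four one-bit functions f0..f3, as dictionaries from input to output.
FUNCS = [{0: 0, 1: 0}, {0: 0, 1: 1}, {0: 1, 1: 0}, {0: 1, 1: 1}]
h = 1 / sqrt(2)

def apply(M, v):
    """Multiply a 4x4 matrix by a length-4 amplitude vector."""
    return [sum(M[r][c] * v[c] for c in range(4)) for r in range(4)]

def deutsch(f):
    # Start in |0>|1>, i.e. the basis state with index 1.
    v = [0.0, 1.0, 0.0, 0.0]
    # Hadamard on both qubits: H tensor H.
    HH = [[0.5, 0.5, 0.5, 0.5],
          [0.5, -0.5, 0.5, -0.5],
          [0.5, 0.5, -0.5, -0.5],
          [0.5, -0.5, -0.5, 0.5]]
    v = apply(HH, v)
    # The oracle: |x>|y> -> |x>|f(x) XOR y>, a permutation of the basis.
    w = [0.0] * 4
    for idx, amp in enumerate(v):
        x, y = divmod(idx, 2)
        w[2 * x + (f[x] ^ y)] += amp
    # Hadamard on the top qubit only: H tensor I.
    HI = [[h, 0, h, 0], [0, h, 0, h], [h, 0, -h, 0], [0, h, 0, -h]]
    v = apply(HI, w)
    p0 = v[0] ** 2 + v[1] ** 2   # probability of measuring 0 on the top qubit
    return "constant" if p0 > 0.5 else "balanced"
```

A single call, i.e. one use of the oracle, classifies f_0 and f_3 as constant and f_1 and f_2 as balanced, matching the case analysis above.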

We will look at two other quantum algorithms in detail. They both involve inputting a number of qubits and then sending each one through a Hadamard gate. We introduce a little more mathematics to help keep the description of many qubits in superposition from becoming too unwieldy.

The Kronecker Product of Hadamard Matrices

We know that the matrix for the Hadamard gate is given by

H = (1/√2) [ 1   1 ]
           [ 1  −1 ].

This tells us that

H|0⟩ = (1/√2)(|0⟩ + |1⟩),

and

H|1⟩ = (1/√2)(|0⟩ − |1⟩).

Suppose that we input two qubits and send both through Hadamard gates. The four basis vectors will be sent as follows:

|0⟩|0⟩ goes to

(1/√2)(|0⟩ + |1⟩) ⊗ (1/√2)(|0⟩ + |1⟩) = ½(|00⟩ + |01⟩ + |10⟩ + |11⟩).

|0⟩|1⟩ goes to

(1/√2)(|0⟩ + |1⟩) ⊗ (1/√2)(|0⟩ − |1⟩) = ½(|00⟩ − |01⟩ + |10⟩ − |11⟩).

|1⟩|0⟩ goes to

(1/√2)(|0⟩ − |1⟩) ⊗ (1/√2)(|0⟩ + |1⟩) = ½(|00⟩ + |01⟩ − |10⟩ − |11⟩).

|1⟩|1⟩ goes to

(1/√2)(|0⟩ − |1⟩) ⊗ (1/√2)(|0⟩ − |1⟩) = ½(|00⟩ − |01⟩ − |10⟩ + |11⟩).

Recall that we can write everything in terms of four-dimensional kets. The previous four statements are equivalent to saying:

[1, 0, 0, 0]^T goes to ½[1, 1, 1, 1]^T,

[0, 1, 0, 0]^T goes to ½[1, −1, 1, −1]^T,

[0, 0, 1, 0]^T goes to ½[1, 1, −1, −1]^T,

[0, 0, 0, 1]^T goes to ½[1, −1, −1, 1]^T,

where the kets are written as transposed column vectors.

This is a description of an orthonormal basis being sent to another orthonormal basis. So, we can write the matrix that corresponds to this. We call this new matrix H ⊗ H.

H ⊗ H = ½ [ 1   1   1   1 ]
          [ 1  −1   1  −1 ]
          [ 1   1  −1  −1 ]
          [ 1  −1  −1   1 ]

There is an underlying pattern to this matrix that involves H.

H ⊗ H = (1/√2) [ H   H ]
               [ H  −H ].

This pattern continues. The matrix that corresponds to inputting three qubits and sending all three through Hadamard gates can be written using H ⊗ H.

H ⊗ H ⊗ H = (1/√2) [ H⊗H   H⊗H ]
                   [ H⊗H  −H⊗H ]

As n increases these matrices quickly get large, but, writing H^{⊗n} for the n-fold product, it is always true that

H^{⊗n} = (1/√2) [ H^{⊗(n−1)}   H^{⊗(n−1)} ]
                [ H^{⊗(n−1)}  −H^{⊗(n−1)} ],

and this gives us a recursive formula that lets us quickly calculate them. These products of matrices, telling us how to act on tensor products, are called Kronecker products.

For Simon’s algorithm we are going to need to study these matrices in some detail, but for our next algorithm the key observation is that the entries in the top row of each of these matrices are all equal to one another; for H^{⊗n} they all equal 1/√(2^n).
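The recursion can be checked directly. The following sketch is my own (the names `kron` and `H_n` are illustrative); it builds the n-fold Kronecker product from the 2×2 Hadamard matrix using plain Python lists.

```python
from math import sqrt, isclose

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

h = 1 / sqrt(2)
H = [[h, h], [h, -h]]

def H_n(n):
    """The n-fold Kronecker product H (x) H (x) ... (x) H, built recursively."""
    M = H
    for _ in range(n - 1):
        M = kron(H, M)
    return M

# Every entry in the top row of H_n(n) equals 1 / sqrt(2^n):
top_row_3 = H_n(3)[0]   # eight entries, each 1/sqrt(8)
```

For n = 3 the matrix is 8 × 8 and every top-row entry is 1/√8, matching the observation above.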

The Deutsch-Jozsa Algorithm

Deutsch’s algorithm looked at functions of one variable. You were given one of these and had to determine whether it was a constant or balanced function. The Deutsch-Jozsa problem is a generalization of this.

We now have functions of n variables. The inputs for each of these variables, as before, can be either 0 or 1. The output is either 0 or 1. We are told that our function is either constant (all the inputs get sent to 0, or all the inputs get sent to 1) or balanced (half the inputs get sent to 0 and the other half to 1). If we are given one of these functions at random, how many function evaluations do we need to determine whether the function belongs to the constant group or to the balanced group?

To illustrate, we consider the case when n = 3. Our function takes three inputs, each of which can take two values. This means that there are 2³, or 8, possible inputs:

(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1).

Classically, suppose we evaluate f(0,0,0) and get the answer that f(0,0,0) = 1. We cannot deduce anything from this piece of information alone, so we ask for another function evaluation, say f(0,0,1). If we get f(0,0,1) = 0, then we are done. We know that the function cannot be constant, so it must be balanced. On the other hand, if we get f(0,0,1) = 1, we cannot deduce anything from the two pieces of information we have. In the worst possible scenario, we could get the same answer for the first four questions and still not be able to answer the question. For example, from the fact that f(0,0,0) = 1, f(0,0,1) = 1, f(0,1,0) = 1, and f(0,1,1) = 1 we cannot determine whether or not the function is balanced. We need to ask one more question. If the answer to the next question is also 1, then we know the function is constant. If the answer is 0, then we know the function is balanced.

This analysis works in general. Given a function of n variables, there will be 2^n possible input strings. In the best-case scenario we can obtain the answer with just two questions to the oracle, but in the worst case it will take us 2^(n−1) + 1 questions. Since n appears in an exponent, this count grows exponentially. In the worst case it requires an exponential number of inquiries to the oracle. The Deutsch-Jozsa algorithm is a quantum algorithm that requires just one question to the oracle, so the speedup is substantial!

The first step, as in all of these questions, is to describe the oracle. For each of the functions we need to construct an orthogonal matrix that captures the essence of the function. We just generalize our previous construction.

Given any function f that has n boolean inputs and just one boolean output, we construct the gate F given by the following circuit, where the slashes with n on the top lines indicate that we have n wires in parallel.

[Figure: the gate F. The top n wires carry |x_1⟩, …, |x_n⟩ through unchanged; the bottom wire takes |y⟩ in and gives |f(x_1, …, x_n) ⊕ y⟩ out.]

Remember that this circuit tells us what happens when each of the kets |x_1⟩, …, |x_n⟩, |y⟩ is either |0⟩ or |1⟩. The input consists of n + 1 kets, |x_1⟩, …, |x_n⟩ and |y⟩, where the first n correspond to the function variables. The output also consists of n + 1 kets, the first n of which are exactly the same as the input kets. The last output is the ket |y⟩ if f(x_1, …, x_n) = 0 and the ket of the other boolean value when f(x_1, …, x_n) = 1; in other words, the last output is |f(x_1, …, x_n) ⊕ y⟩.

The next step after describing how the black-box function works is to give the quantum circuit that incorporates this function. It is the natural generalization of the circuit used for Deutsch’s algorithm: all the top qubits pass through Hadamard gates on either side of the black box.

[Figure: the Deutsch-Jozsa circuit. The top n qubits, each starting as |0⟩, pass through Hadamard gates, then the gate F, then Hadamard gates again, and are measured. The bottom qubit starts as |1⟩ and passes through a Hadamard gate before entering F.]

As before, we will analyze what this circuit does step by step. We show the case when n = 2, just to make things look a little less messy on the page, but every step we do works in exactly the same way for every value of n.

Step 1. The Qubits Pass through the Hadamard Gates

The top n inputs are all |0⟩. For n = 2, this is |0⟩|0⟩. The following calculation shows what happens.

(H ⊗ H)|00⟩ = ½(|00⟩ + |01⟩ + |10⟩ + |11⟩)

It gives a superposition of all possible states; each of the basis kets has the same probability amplitude (½ in this case).

(This calculation works for every value of n. After the n qubits have passed through H^{⊗n} they are in a superposition of all possible states, each of which has the same probability amplitude: 1/√(2^n).)

The bottom input is just |1⟩. This becomes (1/√2)(|0⟩ − |1⟩) after passing through the Hadamard gate. At this stage, our three input qubits will be in the following state.

½(|00⟩ + |01⟩ + |10⟩ + |11⟩) ⊗ (1/√2)(|0⟩ − |1⟩). We will rewrite this as

(1/(2√2))(|00⟩(|0⟩ − |1⟩) + |01⟩(|0⟩ − |1⟩) + |10⟩(|0⟩ − |1⟩) + |11⟩(|0⟩ − |1⟩))

Step 2. The Qubits Pass through the F Gate

After passing through the F gate the qubits will be in the following state.

(1/(2√2))(|00⟩(|f(0,0)⟩ − |f(0,0) ⊕ 1⟩) + |01⟩(|f(0,1)⟩ − |f(0,1) ⊕ 1⟩) + |10⟩(|f(1,0)⟩ − |f(1,0) ⊕ 1⟩) + |11⟩(|f(1,1)⟩ − |f(1,1) ⊕ 1⟩))

We now use the fact that if a is either 0 or 1 we have the following

|a⟩ − |a ⊕ 1⟩ = (−1)^a (|0⟩ − |1⟩)

to rewrite the state as

½((−1)^f(0,0)|00⟩ + (−1)^f(0,1)|01⟩ + (−1)^f(1,0)|10⟩ + (−1)^f(1,1)|11⟩) ⊗ (1/√2)(|0⟩ − |1⟩)

As before, this shows that the bottom qubit is not entangled with the top qubits. We just look at the top two qubits. These are in the state:

½((−1)^f(0,0)|00⟩ + (−1)^f(0,1)|01⟩ + (−1)^f(1,0)|10⟩ + (−1)^f(1,1)|11⟩)

(The argument that we have used works for general n. At this stage you obtain a state that is a superposition of all basis kets. Each ket |x_1 x_2 … x_n⟩ is multiplied by (−1)^f(x_1, x_2, …, x_n)/√(2^n).)

Step 3. The Top Qubits Pass through the Hadamard Gates

The standard method is to convert our state to a column vector and then multiply by the appropriate Kronecker product of Hadamard matrices. This gives:

(H ⊗ H) ½[(−1)^f(0,0), (−1)^f(0,1), (−1)^f(1,0), (−1)^f(1,1)]^T.

However, we are not going to calculate all of the entries in the resulting column vector. We are just going to calculate the top entry. This entry comes from multiplying the bra corresponding to the top row of the matrix with the ket given by the column vector. We get

¼((−1)^f(0,0) + (−1)^f(0,1) + (−1)^f(1,0) + (−1)^f(1,1)).

This is the probability amplitude of the ket |00⟩. We calculate this amplitude for the possible functions.

If f is constant and sends everything to 0, the probability amplitude is 1.

If f is constant and sends everything to 1, the probability amplitude is −1.

For the balanced functions, the probability amplitude is 0.

Step 4. Measure the Top Qubits

When we measure the top qubits we will get one of 00, 01, 10, or 11. The question becomes “do we get 00?” If the function is constant, we will get 00 with probability 1. If the function is balanced, we get it with probability 0. So, if the result of measuring gives 00, we know our function was constant. If the result is not 00, then the function is balanced.

The analysis works for general n. Just before we measure the qubits, the probability amplitude of |00…0⟩ is

(1/2^n) Σ_x (−1)^f(x), where the sum runs over all 2^n input strings x.

As with n = 2, this number will be ±1 if f is constant and 0 if f is balanced. So, if every measurement gives 0, the function is constant. If at least one of the measurements is 1, the function is balanced.

Consequently, we can solve the Deutsch-Jozsa problem for any value of n with just one use of the circuit. We only need to ask the oracle one question. Recall that in the classical case, the worst case required 2^(n−1) + 1 questions, so the improvement is dramatic.
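The key amplitude is easy to check numerically. In this sketch of mine, `amp_all_zeros` (an illustrative name) evaluates the sum (1/2^n) Σ (−1)^f(x) that gives the amplitude of |00…0⟩ just before measurement.

```python
from itertools import product

def amp_all_zeros(f, n):
    """Amplitude of |00...0> before measurement: (1/2^n) * sum over x of (-1)^f(x)."""
    return sum((-1) ** f(x) for x in product((0, 1), repeat=n)) / 2 ** n

def constant0(x):
    return 0

def constant1(x):
    return 1

def balanced(x):
    return x[0]   # 0 on half the inputs, 1 on the other half

# The constant functions give amplitude +1 or -1, a balanced one gives 0,
# so a single measurement of the top qubits decides the question.
```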

Simon’s Algorithm

The two algorithms that we have seen so far have been unusual in that we get the final answer with certainty after just one query. Most quantum algorithms mix quantum and classical parts, involve more than one use of a quantum circuit, and involve probability. Simon’s algorithm contains all of these components. However, before we describe the algorithm, we need to discuss the problem being tackled, and before we can do that, we need to introduce a new way of adding binary strings.

Bitwise Addition of Strings Modulo 2

We defined ⊕ to be the exclusive or, XOR, or, equivalently, addition modulo 2. Recall

0 ⊕ 0 = 0,   0 ⊕ 1 = 1,   1 ⊕ 0 = 1,   1 ⊕ 1 = 0.

We extend this definition to the addition of binary strings of the same length by the formula:

s ⊕ t = (s_1 ⊕ t_1)(s_2 ⊕ t_2) … (s_n ⊕ t_n), where

s = s_1 s_2 … s_n and t = t_1 t_2 … t_n.

This is like doing addition in binary, but ignoring any carries. Here’s a concrete example of bitwise addition:

1101 ⊕ 0111 = 1010.
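In code, bitwise addition modulo 2 of equal-length strings is a one-line XOR; `bitwise_add` is my own illustrative helper, not a name from the text.

```python
def bitwise_add(s, t):
    """Bitwise addition modulo 2 (XOR) of two equal-length binary strings."""
    assert len(s) == len(t)
    return "".join(str(int(a) ^ int(b)) for a, b in zip(s, t))

# Addition in binary, ignoring all carries:
example = bitwise_add("1101", "0111")   # "1010"
```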

The Statement of Simon’s Problem

We have a function f that sends binary strings of length n to binary strings of length n. It has the property that there is some secret binary string s, such that f(x) = f(y) if and only if y = x or y = x ⊕ s. We don’t allow s to be the string consisting entirely of 0s; this forces pairs of distinct input strings to have the same output strings. The problem is to determine the secret string s. An example should make all of this clear.

We will take n = 3, so our function f will take binary strings of length 3 and give other binary strings of length 3. Suppose that the secret string is s = 110. Now

000 ⊕ 110 = 110,   001 ⊕ 110 = 111,   010 ⊕ 110 = 100,   011 ⊕ 110 = 101.

Consequently, for this value of s, we get the following pairings:

000 ↔ 110,   001 ↔ 111,   010 ↔ 100,   011 ↔ 101.

One function with this property (any assignment that gives paired inputs the same output, and distinct pairs distinct outputs, would do) is

f(000) = f(110) = 101,   f(001) = f(111) = 010,   f(010) = f(100) = 111,   f(011) = f(101) = 000.

Now, of course, we don’t know the function f or the secret string s: We want to find s. The question is how many function evaluations need to be made to determine this string?

We keep evaluating the function f on strings. We stop as soon as we get a repeated answer. Once we have found two input strings that give the same output, we can immediately calculate s.

For example, if we find that f(011) = f(101), then we know that

101 = 011 ⊕ s.

Using the fact that

011 ⊕ 011 = 000,

bitwise add 011 to the left of both sides of the equation to obtain

s = 011 ⊕ 101 = 110.

How many evaluations do we have to make using a classical algorithm? We have eight binary strings. It is possible to evaluate four of these and get four different answers, but with the fifth evaluation we are guaranteed to get a pair. In general, for strings of length n, there are 2^n binary strings and, in the worst case, we will need to make 2^(n−1) + 1 function evaluations to get a repeat. So, in the worst case, we will need to ask the oracle 2^(n−1) + 1 questions.
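The classical strategy (query until a collision, then XOR the colliding inputs) can be sketched as follows. This is my own illustration: `TABLE` is an invented example function with secret string 110, and `find_secret` is an illustrative name.

```python
def bitwise_add(s, t):
    """Bitwise addition modulo 2 (XOR) of two equal-length binary strings."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(s, t))

SECRET = "110"

# Build an example f with f(x) = f(x XOR 110): send each input to the
# smaller member of its pair, which is distinct across pairs.
TABLE = {}
for i in range(8):
    x = format(i, "03b")
    TABLE[x] = min(x, bitwise_add(x, SECRET))

def find_secret(f, n):
    """Query f on successive strings until two inputs collide; XOR them to get s."""
    seen = {}
    for i in range(2 ** n):
        x = format(i, "0{}b".format(n))
        y = f(x)
        if y in seen:
            return bitwise_add(seen[y], x)
        seen[y] = x
    return None
```

Here `find_secret(TABLE.get, 3)` recovers "110" on its fifth query, matching the worst-case count 2^(n−1) + 1 = 5.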

Before we look at the quantum algorithm, we need to look at the Kronecker product of Hadamard matrices in a little more detail.

The Dot Product and the Hadamard Matrix

Given two binary strings a = a_1 a_2 … a_n and b = b_1 b_2 … b_n of the same length, we define the dot product by

a · b = (a_1 × b_1 + a_2 × b_2 + ⋯ + a_n × b_n) mod 2, where × denotes our usual multiplication.

So, for example, if a = 110 and b = 011, then a · b = (1 × 0 + 1 × 1 + 0 × 1) mod 2 = 1. This operation can be thought of as multiplying corresponding terms of the sequences, then adding, and finally determining whether the sum is odd or even.

In computer science, we often start counting at 0, so instead of counting from 1 to 4 we count from 0 to 3. Also, we often use binary. The numbers 0, 1, 2, and 3 are represented in binary by 00, 01, 10, and 11. Given a 4 × 4 matrix, we will label both the rows and the columns with these numbers, as is shown here:

         00   01   10   11
    00 [  ·    ·    ·    · ]
    01 [  ·    ·    ·    · ]
    10 [  ·    ·    ·    · ]
    11 [  ·    ·    ·    · ]

The position of an entry in this matrix is given by listing both the row and the column in which it appears. If we make the entry in the ith row and jth column be i · j, we get the following matrix.

         00   01   10   11
    00 [  0    0    0    0 ]
    01 [  0    1    0    1 ]
    10 [  0    0    1    1 ]
    11 [  0    1    1    0 ]

Compare this matrix to H ⊗ H. Notice that the entries that are 1 in our dot product matrix are in exactly the same positions as the negative entries in H ⊗ H. Using the facts that (−1)^0 = 1 and (−1)^1 = −1, we can write

(entry of H ⊗ H in row i, column j) = ½ (−1)^(i · j).

This method of finding where the positive and negative entries lie holds in general; for example, if we want the entry of H ⊗ H ⊗ H that is in the row with number 101 and the column with number 111, we calculate the dot product 101 · 111 = (1 + 0 + 1) mod 2 = 0. This tells us that the entry will be positive.
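The entry rule can be verified mechanically. The sketch below is mine (and repeats the `kron` helper so the snippet is self-contained); it rebuilds the Kronecker powers of H and checks every entry against (−1)^(i·j)/√(2^n), computing the dot product from the binary expansions.

```python
from math import sqrt, isclose

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

h = 1 / sqrt(2)
H = [[h, h], [h, -h]]

def H_n(n):
    """The n-fold Kronecker product of H with itself."""
    M = H
    for _ in range(n - 1):
        M = kron(H, M)
    return M

def dot(i, j):
    """Dot product modulo 2 of the binary expansions of i and j."""
    return bin(i & j).count("1") % 2

n = 3
M = H_n(n)
ok = all(isclose(M[i][j], (-1) ** dot(i, j) / sqrt(2 ** n))
         for i in range(2 ** n) for j in range(2 ** n))
```

In particular the entry in row 101, column 111 of H ⊗ H ⊗ H comes out positive, as the dot product calculation predicts.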

Hadamard Matrices and Simon’s Problem

Now that we know how to find entries of Kronecker products of Hadamard matrices, we are going to use this knowledge to see what happens when we add two columns of one of these products. We are going to add two columns that are paired by the secret string s given in Simon’s problem. If one column is labeled by the string b, the other will have label b ⊕ s. We are going to add these two columns together.

To illustrate, we will work with strings of length 2, and suppose the secret string is 10. We will be adding column 00 to 10, or column 01 to 11.

Here is $H \otimes H$.

$$H \otimes H = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

Adding the 00 column to 10 gives:

$$\frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}.$$

Adding the 01 column to 11 gives:

$$\frac{1}{2}\begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix}.$$

Notice that some probability amplitudes are getting amplified and some are canceling. What exactly is going on here?

It is fairly easy to check that products and bitwise addition obey the usual law of exponents:

$$(-1)^{a \cdot b}(-1)^{a \cdot s} = (-1)^{a \cdot (b \oplus s)}.$$

This tells us that $(-1)^{a \cdot b}$ and $(-1)^{a \cdot (b \oplus s)}$ will be equal if $a \cdot s = 0$, and that $(-1)^{a \cdot b}$ and $(-1)^{a \cdot (b \oplus s)}$ will have opposite signs if $a \cdot s = 1$.

We can summarize this as:

$$(-1)^{a \cdot b} + (-1)^{a \cdot (b \oplus s)} = \begin{cases} 2(-1)^{a \cdot b} & \text{if } a \cdot s = 0, \\ 0 & \text{if } a \cdot s = 1. \end{cases}$$

This tells us that when we add the two columns given by b and $b \oplus s$, the entry in row a will be 0 if $a \cdot s = 1$ and will be either 2 or $-2$ if $a \cdot s = 0$. In general, the entries cancel in the rows labeled with strings that have a dot product of 1 with the string s.

Looking back at our example, the reason that the bottom two entries are 0 is that these rows have labels 10 and 11 and these both have a dot product of 1 with our secret string s. The nonzero entries occur in the rows with labels 00 and 01 and these both have a dot product of 0 with s.
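A short Python sketch (again using the un-normalized Hadamard matrix, with names of our choosing) confirms the cancellation pattern for the secret string s = 10:

```python
def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    m = len(B)
    n = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n)]
            for i in range(n)]

H = [[1, 1], [1, -1]]    # un-normalized Hadamard matrix
H2 = kron(H, H)          # H ⊗ H, rows/columns labeled 00, 01, 10, 11

s, b = 0b10, 0b00        # secret string 10; add column 00 to column 10
col_sum = [H2[row][b] + H2[row][b ^ s] for row in range(4)]
print(col_sum)           # [2, 2, 0, 0]

def dot(a, b):
    """Dot product mod 2 of two 2-bit labels."""
    return ((a >> 1) * (b >> 1) + (a & 1) * (b & 1)) % 2

for row in range(4):
    if dot(row, s) == 1:
        assert col_sum[row] == 0        # these rows cancel
    else:
        assert col_sum[row] in (2, -2)  # these rows amplify
```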

We now have the information needed to understand the quantum circuit for Simon’s problem. It is going to give us a string whose dot product with the secret string s is 0. It is going to do that by adding two columns of the Hadamard matrix. Let’s see how it works.

The Quantum Circuit for Simon’s Problem

The first thing is to construct the black box—the gate that acts like f. The following circuit gives the construction.

[Figure: the circuit that constructs the black box F from the function f.]

We can think of this as inputting two strings consisting of 0s and 1s, both of which have the same length. The top string is unchanged. The bottom string is the function evaluated on the top string added bitwise to the bottom string.

The following diagram gives the circuit for the algorithm.

[Figure: the quantum circuit for Simon’s algorithm.]

We will illustrate what happens in the case $n = 2$. Everything we do generalizes straightforwardly for general n.

The first step consists of the qubits in the top register passing through the Hadamard gates. This should now look familiar. The top two qubits are initially in state $|00\rangle$, while after passing through the Hadamard gates they will be in state

$$\frac{1}{2}\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle\right).$$

The bottom two qubits remain in state $|00\rangle$. So, at this stage the four qubits are in state:

$$\frac{1}{2}\left(|00\rangle \otimes |00\rangle + |01\rangle \otimes |00\rangle + |10\rangle \otimes |00\rangle + |11\rangle \otimes |00\rangle\right)$$

The next thing that happens is that the qubits pass through the F gate. This changes the state to:

$$\frac{1}{2}\left(|00\rangle \otimes |f(00)\rangle + |01\rangle \otimes |f(01)\rangle + |10\rangle \otimes |f(10)\rangle + |11\rangle \otimes |f(11)\rangle\right)$$

The top qubits now pass through the Hadamard gates, which changes the state to:

$$\frac{1}{4}\bigl[\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle\right) \otimes |f(00)\rangle + \left(|00\rangle - |01\rangle + |10\rangle - |11\rangle\right) \otimes |f(01)\rangle + \left(|00\rangle + |01\rangle - |10\rangle - |11\rangle\right) \otimes |f(10)\rangle + \left(|00\rangle - |01\rangle - |10\rangle + |11\rangle\right) \otimes |f(11)\rangle\bigr]$$

The pattern of + and − signs comes from the matrix for $H \otimes H$. We now rearrange the terms, collecting them according to the first two qubits, which results in the following:

$$\frac{1}{4}\bigl[|00\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle + |f(10)\rangle + |f(11)\rangle\right) + |01\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle + |f(10)\rangle - |f(11)\rangle\right) + |10\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle - |f(10)\rangle - |f(11)\rangle\right) + |11\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle - |f(10)\rangle + |f(11)\rangle\right)\bigr]$$

This way of writing the state has a couple of nice features. The first is that here, also, the pattern of + and − signs comes from the matrix for $H \otimes H$. The second is that the pairs of qubits to the left of the tensor product correspond to the row numbers.

Now we use the fact that we know that $f(b) = f(b \oplus s)$, so $|f(b)\rangle = |f(b \oplus s)\rangle$. We can simplify things, combining these terms, by adding their probability amplitudes. This corresponds to the column addition we just looked at. To illustrate, suppose that $s = 10$; then $f(00) = f(10)$ and $f(01) = f(11)$. If we plug these values into the state, we obtain:

$$\frac{1}{4}\bigl[|00\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle + |f(00)\rangle + |f(01)\rangle\right) + |01\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle + |f(00)\rangle - |f(01)\rangle\right) + |10\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle - |f(00)\rangle - |f(01)\rangle\right) + |11\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle - |f(00)\rangle + |f(01)\rangle\right)\bigr]$$

which simplifies to

$$\frac{1}{4}\bigl[|00\rangle \otimes \left(2|f(00)\rangle + 2|f(01)\rangle\right) + |01\rangle \otimes \left(2|f(00)\rangle - 2|f(01)\rangle\right) + |10\rangle \otimes 0 + |11\rangle \otimes 0\bigr].$$

The kets to the left of the tensor products are labeled with the row numbers of the matrix. The 0s on the right of the tensor products occur in the rows whose dot product with s is 1.

We can simplify the state to:

$$\frac{1}{2}\bigl[|00\rangle \otimes \left(|f(00)\rangle + |f(01)\rangle\right) + |01\rangle \otimes \left(|f(00)\rangle - |f(01)\rangle\right)\bigr].$$

When we measure the top two qubits we will get either 00 or 01, each with probability 1/2.

Though we have only looked at the relatively simple case of $n = 2$, everything we have done holds for every value of n. At the end of the process we will end up with one of the strings whose dot product with the secret string is 0. Each of these strings is equally likely.
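For readers who like to check such claims numerically, here is a small Python sketch that reproduces the $n = 2$ measurement probabilities by tracking amplitudes directly. The particular 2-to-1 function f below, with secret string s = 10, is an assumption chosen for illustration:

```python
s = 0b10
f = {0b00: 0, 0b10: 0, 0b01: 1, 0b11: 1}   # satisfies f(x) = f(x XOR s)

def dot(a, b):
    """Dot product mod 2 of two bit strings stored as integers."""
    return bin(a & b).count("1") % 2

# After the H gates, the F gate, and the H gates again, the amplitude
# of |y>|v> is (1/4) * sum of (-1)^(x.y) over all x with f(x) = v.
# The probability of measuring y in the top register sums over v.
prob = {}
for y in range(4):
    p = 0.0
    for v in set(f.values()):
        amp = sum((-1) ** dot(x, y) for x in f if f[x] == v) / 4
        p += amp ** 2
    prob[y] = p

print(prob)  # {0: 0.5, 1: 0.5, 2: 0.0, 3: 0.0}

# Every outcome y that can actually be measured satisfies y . s = 0.
for y, p in prob.items():
    if p > 0:
        assert dot(y, s) == 0
```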

You might be concerned that after all this work we still don’t know s. This is where the classical part of Simon’s algorithm comes in.

The Classical Part of Simon’s Algorithm

We start with an example with $n = 5$. We know that there is some secret number $s = s_1 s_2 s_3 s_4 s_5$. We are not allowing 00000, so there are $2^5 - 1 = 31$ possible choices for s. We are going to try to find it using Simon’s quantum circuit.

We run it and get 10100 as an answer. We know that the dot product of this with s gives 0. So,

$$1 \cdot s_1 + 0 \cdot s_2 + 1 \cdot s_3 + 0 \cdot s_4 + 0 \cdot s_5 \equiv 0 \pmod{2}.$$

This tells us that $s_1 + s_3$ is even. Since these digits are either 0 or 1, we deduce that $s_1 = s_3$.

We run the circuit again hoping that we don’t get 10100 again. (The probability of this happening is 1/16, so we are fairly safe.) We also hope that we don’t get 00000, which wouldn’t give us any new information. Suppose we get 00100. Then we know that

$$0 \cdot s_1 + 0 \cdot s_2 + 1 \cdot s_3 + 0 \cdot s_4 + 0 \cdot s_5 \equiv 0 \pmod{2}.$$

This shows that $s_3$ must be 0. From the first step, we can now deduce that $s_1$ must also be 0. We run the circuit again and get 11110. We know that

$$1 \cdot s_1 + 1 \cdot s_2 + 1 \cdot s_3 + 1 \cdot s_4 + 0 \cdot s_5 \equiv 0 \pmod{2},$$

which tells us that $s_2 = s_4$. Running the circuit again gives 00111, telling us that

$$0 \cdot s_1 + 0 \cdot s_2 + 1 \cdot s_3 + 1 \cdot s_4 + 1 \cdot s_5 \equiv 0 \pmod{2}.$$

Consequently, we must have $s_4 = s_5$ and, since $s_2 = s_4$, we have $s_2 = s_4 = s_5$.

We know that not all of the digits are 0, so we must have $s_2 = s_4 = s_5 = 1$, and consequently s must be 01011. For this example, we made four calls to the oracle.

At this point, there are a couple of questions that you might be asking. The first concerns the algorithm for finding s using the outputs of the quantum circuit. We have seen what to do in a specific case, but is there an algorithm—a step-by-step procedure—that tells you what to do in every case? The second question concerns how we are measuring the number of questions we are asking the oracle. When we looked at the classical algorithm, we took the worst possible case and saw that after $2^{n-1} + 1$ questions we would definitely have our answer. But when we come to the quantum algorithm, the worst possible case is much worse! We are getting an answer at random. The answer does have a dot product of 0 with s, but we could get the same answer more than once. We could run our quantum circuit $2^{n-1} + 1$ times and get a string of 0s every single time. It is unlikely, but it is possible. A string of 0s gives us no information, so it is possible that after $2^{n-1} + 1$ questions to the oracle we haven’t deduced anything at all about the secret number. We will address both of these concerns.

Each time we run the circuit, we get a number whose dot product with s is zero. This gives us a linear equation in the n unknowns. Running the circuit several times results in our obtaining several—a system of—equations. In the previous example, at each stage we got a new equation, but that new equation also gave us some new information. The technical term for this is that the equation is linearly independent of the previous equations. In order to calculate s we need a system of $n - 1$ linearly independent equations.

Algorithms for solving systems of equations are extremely well known. They are studied in courses like Linear Algebra and Matrix Theory and have numerous applications. They are so commonly needed that they are programmed into most scientific calculators. We won’t discuss them here apart from mentioning that the number of steps required to solve a system of n equations can be bounded above by a quadratic expression involving n. We say the system can be solved in quadratic time.
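To make the classical step concrete, here is a toy Python sketch that recovers s from the four strings in our $n = 5$ example. For simplicity it brute-forces all $2^n$ candidates rather than performing the Gaussian elimination one would use in practice; all names are ours:

```python
def solve_simon(rows, n):
    """Return the smallest nonzero n-bit string s whose dot product with
    every collected row is 0 (brute force; fine for small n)."""
    values = [int(r, 2) for r in rows]
    for s in range(1, 2 ** n):
        if all(bin(v & s).count("1") % 2 == 0 for v in values):
            return format(s, f"0{n}b")

# The four equations from the worked example above.
print(solve_simon(["10100", "00100", "11110", "00111"], 5))  # 01011
```

With the four linearly independent equations collected above, 01011 is the only nonzero solution, matching the hand calculation.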

The other question that we need to address is this: How many times do we need to run the quantum circuit? As we pointed out, in the worst-case scenario, we can keep running our qubits through the circuit and never get any useful information. However, this is extremely unlikely. We examine this idea in more detail in the next section.

Complexity Classes

In complexity theory, the main classification is between problems that take polynomial time to solve and those that need more than polynomial time. Polynomial time algorithms are regarded as being practical even for very large values of n, but non-polynomial time algorithms are regarded as being infeasible for large n.

Problems that classical algorithms can solve in polynomial time are denoted by P. Problems that quantum algorithms can solve in polynomial time are denoted by QP (sometimes it is denoted by EQP, for exact quantum polynomial time). Usually when we use these terms we are referring to the number of steps that an algorithm takes, but, remember, we defined a new way of measuring complexity—query complexity—that counts the number of questions we need to ask an oracle. We saw that the Deutsch-Jozsa problem was not in the class P, but belonged to QP for query complexity. (The constant function is a degree 0 polynomial.) This is sometimes described as saying that the Deutsch-Jozsa problem separates P and QP—it is a problem that belongs to QP but not to P for query complexity.

However, let’s recall the worst-case scenario for the classical algorithm. To make things more concrete, we will take $n = 10$. We are given a function that takes 10-bit inputs and told that it is either balanced or constant. We have to keep evaluating our function on specific inputs until we can deduce the answer. There are $2^{10} = 1024$ possible inputs. The worst-case scenario is when the function is balanced, but we get the same answer for the first 512 evaluations, and then on the 513th evaluation we get the other value. But how likely is this to happen?

If the function is balanced, for each input value we are equally likely to get either a 0 or a 1. This can be compared to tossing a fair coin and obtaining a head or a tail. How likely is it to toss a fair coin 512 times and get heads every time? The answer is $\left(\frac{1}{2}\right)^{512}$, which is less than 1 divided by a googol, where a googol is $10^{100}$. It is a minute number!
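Exact integer arithmetic confirms the comparison (a quick check, not from the text):

```python
# (1/2)^512 < 1/googol exactly when 2^512 > 10^100.
googol = 10 ** 100
assert 2 ** 512 > googol
print(len(str(2 ** 512)))  # 2^512 has 155 decimal digits; a googol has 101
```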

Suppose you were given a coin and asked whether it was fair or was double-headed. If you toss it once and get heads, you can’t really answer the question. But if you toss it ten times and it comes up heads every time, then you can be fairly sure that it is double-headed. Of course, you could be wrong, but in practice we are willing to accept being wrong as long as the probability of this happening is very small.

This is what we do for the bounded-error complexity classes. We choose some bound on the probability of getting an error that we think is acceptable. Then we look at algorithms that can answer the question within our bound for error.

Returning to the Deutsch-Jozsa example, suppose that we want at least a 99.9 percent success rate, or equivalently an error rate of less than 0.1 percent. If a function is balanced, the probability of evaluating the function 11 times and getting 0 every time is 0.00049 to five decimal places. Similarly, the probability of obtaining 1 every time is 0.00049. Consequently, the probability of obtaining the same answer 11 times in a row when the function is balanced is just less than 0.001. So if we are willing to accept a bound on the probability of error of 0.1 percent, we can choose to make at most 11 function evaluations. If during the process we get both a 0 and a 1, we can stop and know with certainty that our function is balanced. If all 11 evaluations are the same, we will say the function is constant. We could be wrong, but our error rate is less than our chosen bound. Notice that this argument works for any n. In every case, we need 11 function evaluations at most.
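The 11-evaluation procedure can be sketched in Python as follows; sampling distinct random inputs, the function names, and the two test functions are our assumptions, not from the text:

```python
import random

def classify(f, n, evaluations=11):
    """Declare a 0/1-valued function on n-bit inputs balanced or constant
    after at most `evaluations` queries, with a small chance of error."""
    inputs = random.sample(range(2 ** n), evaluations)
    seen = {f(inputs[0])}
    for x in inputs[1:]:
        seen.add(f(x))
        if len(seen) == 2:
            return "balanced"   # two different values seen: certain
    return "constant"           # all values agreed: almost certainly right

# Two illustrative functions on 10-bit inputs.
constant = lambda x: 1
balanced = lambda x: x & 1      # answer is the lowest bit: half 0s, half 1s

print(classify(constant, 10))   # constant
```

For a balanced function the chance of a wrong "constant" verdict is just under 0.001, as computed above.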

Problems that classical algorithms can solve in polynomial time with the probability of error within some bound are denoted BPP (for bounded-error probabilistic polynomial time). The Deutsch-Jozsa problem is in the class BPP.

One thing that you might be worried about is whether a problem could be in BPP for one bound on the probability of error, but not in the class BPP for a smaller bound. This doesn’t happen. If the problem is in the class BPP, it will be there for every choice of the bound.

We now return to Simon’s algorithm. We need to keep sending qubits through the circuit until we have $n - 1$ linearly independent equations. As we know, in the worst case this process can go on forever, so Simon’s algorithm is not in class QP. However, let’s choose a bound that we are willing to accept on the probability of making an error. Then we can calculate N so that $\frac{1}{2^N}$ is less than our bound.
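For instance, with the 0.1 percent bound from the Deutsch-Jozsa discussion, the calculation of N is one line (a sketch; the variable names are ours):

```python
import math

bound = 0.001                        # chosen bound on the probability of error
N = math.ceil(math.log2(1 / bound))  # smallest N with 1/2^N below the bound
assert (1 / 2) ** N < bound
print(N)  # 10
```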

We won’t prove this, but it can be shown that if we run the circuit $n - 1 + N$ times, the probability of the $n - 1 + N$ equations containing a system of $n - 1$ linearly independent equations is greater than $1 - \frac{1}{2^N}$.

We can now state Simon’s algorithm. First we decide on a bound on the probability of error and calculate the value N. Again, the number N does not depend on n. We can use the same value of N in each case. We run Simon’s circuit $n - 1 + N$ times. The number of queries is $n - 1 + N$, which, since N is fixed, is a linear function of n. We make the assumption that our system of $n - 1 + N$ equations contains $n - 1$ linearly independent equations. We could be wrong, but the probability of being wrong is less than the bound that we chose. Then we solve the system of $n - 1 + N$ equations using a classical algorithm. The time taken will be quadratic in $n - 1 + N$, but because N is a constant, this can be expressed as a quadratic in n.

The algorithm as a whole contains the quantum part that takes linear time added to the classical part that takes quadratic time, giving quadratic time overall. Problems that quantum algorithms can solve in polynomial time with the probability of error within some bound are denoted BQP (for bounded-error quantum polynomial time). Simon’s algorithm shows the problem belongs to BQP for query complexity.

We showed that the classical algorithm, in the worst case, took $2^{n-1} + 1$ function evaluations—this is exponential in n, not polynomial, so the problem definitely does not belong to P. It can also be shown that even if we allow a bound on the probability of error the algorithm is still exponential, so the problem does not belong to BPP. We say that Simon’s problem separates BPP and BQP for query complexity.

Quantum Algorithms

We started this chapter by describing how in many popular descriptions the speedup provided by quantum algorithms is said to come solely from quantum parallelism—the fact that we can put the input into a superposition that involves all the basis states. However, we have looked at three algorithms and have seen that though we need to use quantum parallelism, we need to do much more. We will briefly look at what is needed and why it is hard!

The three algorithms we have studied are the most elementary and considered standard, but as you have probably noticed they are by no means trivial. The dates when they were published tell an important story. David Deutsch published his algorithm in his landmark paper of 1985. This was the first quantum algorithm, and it showed that a quantum algorithm could be faster than a classical one. Deutsch and Jozsa published their generalization of Deutsch’s algorithm in 1992, seven years later. It might seem surprising that what seems to be a fairly straightforward generalization took so long to find, but it is important to realize that it is the modern notation and presentation that make the generalization seem to be the natural one. Deutsch’s paper doesn’t state the problem exactly the way it is stated here, and it doesn’t use the diagrams for quantum circuits that are now standard. That said, there was an incredibly productive period from 1993 to 1995 when many of the most important algorithms were discovered. Daniel Simon’s algorithm was published in this window, as were the algorithms by Peter Shor and Lov Grover that we will look at in the next chapter.

Orthogonal matrices represent quantum gates. Quantum circuits consist of combinations of gates. These correspond to multiplying orthogonal matrices, and since the product of orthogonal matrices results in an orthogonal matrix, any quantum circuit can be described by just one orthogonal matrix. As we have seen, an orthogonal matrix corresponds to a change of basis—a different way of viewing the problem. This is the key idea. Quantum computing gives us more ways of viewing a problem than classical computing does. But in order to be effective, there has to be a view that shows the correct answer separated from other possible incorrect answers. Problems that quantum computers can solve faster than classical computers need to have a structure that becomes visible only when we transform it using an orthogonal matrix.

The problems that we have looked at are clearly reverse-engineered. They are not important problems that people had been studying for years and that only now, viewed from the right quantum computing perspective, turn out to be simpler to solve. Rather, they are problems that are specially created using the structure of Kronecker products of Hadamard matrices. Of course, what we really want is not to reverse-engineer a problem, but to take a problem that is important and then construct a quantum algorithm that is faster than any known classical algorithm. This is what Peter Shor achieved in his 1994 landmark paper, in which he showed (among other things) how quantum computing could be used to break the codes that are currently used for Internet security. We will briefly discuss Shor’s algorithm in the next chapter, where we look at the impact of quantum computing.