6

Who’s Out There?

It’s important to be able to keep secrets and have ways of recognizing changes to data. However, neither of these capabilities addresses what is perhaps the greatest peril in cyberspace. Every day, thousands of people are fooled because of how easily cyberspace facilitates impersonation. Use of cryptography alone cannot address this problem, but there are ways in which it can help.

Woof Woof

A famous 1993 cartoon from the New Yorker is often cited to illustrate the issue of anonymity on the internet.1 It features two dogs, one of which is sitting at a computer, paw on keyboard, looking down at his companion and barking: “On the internet, nobody knows you’re a dog.” It’s a great cartoon. Isn’t it funny (both “ha ha” funny and “peculiar” funny) that dogs could be using the internet without our knowing! However, the unexpected success of this cartoon is that, in an apparently harmless way, it captures a sinister truth.

Dogs, clever though they are at retrieving bits of sausage from under the sofa, lack any ability to wrestle with a keyboard. Much as we’d really love them to be, dogs are not on the internet. So, who is?

Consider this story: Chloe is twelve, and a big fan of a social media platform that allows users to easily record and share short video clips of themselves on their phone, dancing to short excerpts from their favorite artists. All her friends have accounts, and they post almost every day. Fortunately, Chloe is a sensible girl and has parents who are aware of some of the potential dangers of social media and the internet. They have advised her to share her videos only with people she regards as friends in the physical world. She does not permit “friends of friends” to see her work, because Chloe has been warned that her friends may not be as careful as she is about who can see videos. All seems good until Chloe’s account is audited one evening by her parents.

“Are you just sharing with your friends?” they ask.

“Yes,” replies Chloe.

“Are these all friends who you really know in real life?”

“Yes. I mean, almost,” says Chloe. “There’s this dog, and it’s really funny, and it has its own account, so I’m following it. The videos are great; do you want to see one?”

“In a moment, perhaps,” reply the interrogators, “but are you friends with this dog?”

“Well,” confesses Chloe, “I was following the dog, and the dog asked if it could follow me, so I said yes. I mean, it’s a dog, and its videos are really, really funny; this is my favorite one . . .”

Here’s the real truth behind the New Yorker cartoon. On the internet, not everyone thinks carefully enough about the implications of the fact that you’re not a dog.

The Need to Know Who

Think about the things you do each day in cyberspace. Before you do many of them, you have to, either explicitly or implicitly, answer a question such as: Do you have an existing account? Or: Have you registered? Indeed, even first accessing cyberspace typically requires answering such a question. Have you ever wondered why?

For one thing, most of what we do in cyberspace relates somehow to a commercial service. Cyberspace might be an intangible, abstract concept, but it's facilitated by equipment, networks, and services operated by human beings, which all cost money to provide. Commercial providers of these components often need to determine who's out there so at least they know where to send the bill.

Even seemingly free services in cyberspace must be paid for. We almost always have to register for free services, and then pay for them by submitting personal data and exposing ourselves to commercial advertising. These service providers need to know who’s out there so that they can correlate the data they’re collecting, and the ads they’re pitching, to the profile of the user of the service.2

Another reason for determining who is out there is that a great deal of information in cyberspace is intended for a restricted audience. Few workplaces would function smoothly if everyone always knew everything. In sensitive sectors of government and the military, strict control of access to information is particularly important. Hopefully, you have configured the privacy settings of your social media account to control who sees what. In cyberspace, wise data owners need to know who is out there before deciding whether to release data to them.

So, given that dogs such as wire-haired dachshunds are not in cyberspace, who is out there? The safety of many activities in cyberspace depends on how accurately we can conjure up an answer. The problem is that in cyberspace, getting an accurate answer is very hard. Chat rooms, social media networks, and online dating services would be much safer environments if we could reliably answer this most vexing of questions.

Human versus Machine

The process of entity authentication attempts to determine who is out there. The word entity is deliberately abstract. The different ways that entity authentication can be provided depend, at least in part, on whether the entity is controlled by the pulsing of a heart, or by a clock driving the processing of a microchip.

Let’s consider one way that entity authentication can be provided in the physical world. A traveler approaches a border control point. The immigration officer must determine whether to permit the traveler to enter the country. The traveler is asked for, and presents, a passport.

A passport is a document containing a range of physical security mechanisms. Modern passports feature holograms, special ink, and computer chips, and they include biometric information about the person to whom they are issued.3 These mechanisms are designed to make the passport hard to forge, and to bind the passport to the intended bearer. The passport is a token, which is the result of a fairly laborious administrative process, designed to make it hard for passports to be issued to the wrong people. The immigration officer is likely to admit the traveler if the passport seems to be valid and the traveler appears to be the person it was issued to. Importantly, it is the combination of both human and passport that the officer takes into consideration. Border control officers are just as unlikely to admit a traveler who cheerily announces their name but has no passport as they are to approve a person who submits a valid passport but is wearing a paper bag over their head.

In cyberspace, it is easy enough to create tokens that play a role similar to passports. You are undoubtedly familiar with presenting the likes of passwords, bank card numbers, or other security tokens in order to access services in cyberspace. These were almost certainly acquired after some administrative process, sometimes as minimal as supplying an email address, designed to link you to a particular service. In cyberspace it is relatively easy to present the token, but much harder to demonstrate the presence of the person to whom the token was issued. Alas, we’ve all got paper bags over our heads in cyberspace.

Of course, entity authentication is not always so important. Border control deals with admitting real human beings, with all their flesh and flaws, into the country. It matters who they are. For much of what we do in cyberspace, it matters less. A web retailer might love to know who is using its website so that it can profile visitor browsing behavior, but it can derive value from visitor data without accurately identifying every human being who views its web pages.

Note that precisely which entity matters is not always obvious in cyberspace. A mobile phone company wants to know where to send the bill. The entity the phone company wants to authenticate is thus the account holder, who is not necessarily the human using the phone, as is true when a parent buys a phone for a child. The mobile phone owner, on the other hand, does not want an opportunistic stranger who discovers a lost phone on the seat of a train to be able to use that phone. The entity that the owner is more concerned about is thus the person who uses the phone.4

Even more confusing is our own perspective on who is out there. We’re often under the impression that humans directly communicate with one another in cyberspace, with computers merely acting as humble facilitators. This is, however, largely an illusion.

Everything that happens in cyberspace is really an interaction involving computers, in most cases one computer communicating with another. The perception that human beings are at the end of any interaction in cyberspace is slightly dangerous, since often they are not. Even if your phone is in your hand, it is capable of doing all sorts of extraordinary things without asking you. Most of these are benign, even desirable, such as checking for updates or retrieving messages from a server. However, your phone certainly has the potential capability, if you’re not careful, to be used to clean out your bank account and wire the balance to a stranger.5

Even when humans are involved palpably in a digital communication, a problem arises from the fact that (for now, at least) human beings are not computers, and computers are not human beings.6 Every time you interact with cyberspace, you are not strictly at the end of the line of communication. Your computer is.

To see this, consider just about the simplest possible interaction with cyberspace. You’re sitting in front of your computer typing an email. You do so by forming your thoughts into words, and then transferring these words to your computer by pressing letters on a keyboard. You are certainly present during this interaction and are undoubtedly communicating directly with your own computer. What could possibly go wrong here? Surely, you are out there.

In most cases, everything will be fine, but much could go wrong. After you have pressed the symbols on your keyboard, your computer takes over. You, the human, are no longer part of the process. A whole sequence of invisible operations takes place, beginning with matching keyboard characters to digital codes, before this data is submitted for processing by applications running on the device. If your computer is working properly, then everything is rosy. But if your computer has been infected by malware (an undesirable program such as a computer virus), then your computer could do some things you do not intend. For example, your computer could store what you type and send this information to someone who is conducting surveillance of your activities.7 It could also suppress or alter the information you type, resulting in a different email being sent. You may well have been there, but it’s what your computer does that really matters.

The fact that a computer could behave differently from the way a human user expects, or indeed conduct tasks that the human user is unaware of, is something attackers often exploit. We cannot prevent this gap between humans and devices, so we have to manage it somehow. One method that you will undoubtedly have encountered is the captcha (a term that derives from the phrase “completely automated public Turing test to tell computers and humans apart”). Captchas are used to test the presence of a human by setting tasks that machines are currently less effective at, such as deciding which alphabetic characters are suggested by an almost illegible squiggle, or which of a series of photographs features a building that could plausibly be a shop.8

Love them or loathe them (the smart money is on the latter), the need for captchas is symptomatic of the gap between human and machine. At the very least, we need to keep this gap in mind whenever we’re wondering who is out there.

“Hello from the Other Side”

Try screaming “Hello, who’s out there?” into cyberspace. Even if you hear back a faint “It’s me,” what value can you possibly place in an answer from the void?

Any comprehensive answer has two important components: the first relating to identity, the second relating to time.

Just as discussed for physical security mechanisms, the only way to distinguish one entity from another in cyberspace is to equip the entity with a special capability distinguishing it from the crowd. There are many different ways this can be done, and these vary, depending on whether the entity you have in mind is a human or a computer.

Humans can be given a tangible object. To test presence, a human could be asked to demonstrate that they have this object. In cyberspace, objects can be things such as smartcards, tokens, and even phones, the possession of which represents evidence of the human’s presence. Of course, the biggest problem with using only possession of objects for entity authentication is that objects can be lost or stolen.

Humans are, themselves, objects. The field of biometrics9 is based on extracting characteristics of a human and using them to provide entity authentication. Biometrics vary in their effectiveness, but some have become well established. Air travelers and convicted criminals will be familiar with fingerprinting and automatic face recognition, both of which can also be deployed in cyberspace. Biometrics are less easily lost or stolen, at least directly from the humans they represent.10 However, biometrics are simply physical measurements converted into digital values. If the digital values are compromised in some way—for example, the database storing them is stolen—then serious problems arise. You’ve been asked to change your password many times, but what could you do if someone asked you to change your fingerprint?

By far the most common approach to entity authentication in cyberspace is to base it around the special capability of knowing something that others don’t. This technique can be used to authenticate either humans or computers. A significant advantage of the latter over the former is that computers tend not to have problems remembering complicated things, such as strong passwords or cryptographic keys. Most of the ways cryptography can be used to support entity authentication are based on secret knowledge being the distinguisher.

Because these different approaches to entity authentication all have their own strengths and drawbacks, it is not uncommon in cyberspace to require that multiple techniques be applied together. A classic example of two-factor entity authentication is to present both a bank card (tangible object) and a PIN (secret knowledge) to a point-of-sale terminal in a store. In this case it is really the presence of the bank card that’s being tested for, since the card contains a chip that stores the cryptographic keys used to protect the transaction. However, the knowledge of the PIN provides an extra layer of authentication by demonstrating that the human who knows the PIN is also present. Two-factor authentication thus attempts to authenticate two different entities at once: the card and the human owner. Unfortunately, banks are not so thorough when we buy things online without the use of a point-of-sale terminal.11 As a result, this type of card-not-present transaction is where most fraud happens.12

Note that entity authentication is not always explicitly about identification of who is out there. For some applications, it suffices to establish that whoever is out there is authorized13 to do something. For example, many cities now support payment for public transportation using "pay-as-you-go" smartcards that have preloaded value. The ticket reader on the train need only determine whether there is enough credit on the card to open the barrier and permit the traveler to pass. It is not strictly necessary to identify the traveler, although some systems may do this for other reasons, such as journey profiling.

The second component to answering our scream into cyberspace relates to time. If you shout “Hello, who’s out there?” into a deep, dark chasm and hear back “It’s me,” then is “me” a living, breathing person? Or might it be a recording? One of the challenges facing investigators of kidnappings is to determine whether hostages are still alive when videos of them are released. Since this issue can be just as important to the kidnappers, it used to be common practice for such videos to feature hostages holding newspapers to prove that the video was recorded after a displayed date.14

This inclusion of evidence of liveness can be just as important in cyberspace. In this aspect, biometrics have a built-in advantage over the likes of passwords. While a victim can be forced to reveal their password and then be thrown down a well, good biometric technology requires a response to “Who’s out there?” from a living body.

However, as previously discussed, entity authentication is more often required for a device than for a human. Since information from the past can easily be recorded and then replayed in cyberspace, what is really needed is that an answer to the question “Who’s out there?” include evidence that the response is genuinely new. This is often referred to as evidence of freshness, rather than of liveness. Intriguingly, as will be discussed shortly, cryptography can be used to provide evidence of freshness without explicitly incorporating clock-based time.

Strong entity authentication mechanisms should thus indicate freshness, as well as establish either identity or authorization (or both). However, the most common entity authentication mechanism that you use every day in cyberspace—supplying a password—does not do so. This is just one of many reasons why it is so flawed as a means of establishing who is out there.

Agonizing Passwords

In cyberspace, it seems impossible to do anything without supplying a password. Passwords have become the default means of offering some kind of evidence of who might be out there. When you log on to a website (or indeed a computer or an app), you typically need to supply a username and a password. Passwords are liked because they appear to be an easy means of providing entity authentication. Passwords are loathed because, in many ways, they’re not. Elizabeth Stobert has referred to the agony of passwords, and we all instinctively know what she means.15

The real reason you should be wary of passwords is that they are very weak entity authentication mechanisms. You are probably familiar with some of the criticisms of passwords, but it’s worth identifying their two most critical flaws.

First, passwords are relatively easy for someone else to acquire. An attacker could get hold of your password in many different ways. An attacker who happens to be physically nearby could simply watch you type a password into a computer—a process sometimes referred to as shoulder surfing—or perhaps obtain it from a note stuck on the wall of your office. But even if the attacker is not nearby, there are still plenty of options. Sometimes passwords are passed across a network such as the internet in the clear (in other words, unencrypted). A clever attacker thus simply needs to watch your communications in order to obtain your password.

An attacker could also try to guess your password, since passwords are rarely chosen well. The vast majority of passwords are either easily acquired personal information or simple modifications of dictionary words. Worse, many technologies come with well-known default passwords that a user is supposed to change at the earliest opportunity, but in practice users often don't know how to, don't care, or simply never bother.

Second, passwords have longevity. Since passwords are a bit of a nuisance to set up, you tend to use a password for an extended period of time. Indeed, for many applications you may have never changed your password. Since passwords don’t incorporate any notion of freshness, if someone else acquires your password, you’re potentially in a whole lot of cyber trouble.16

One password might be useful to an attacker, but many passwords form a treasure chest. One place where large numbers of passwords potentially exist is on the computer of someone who asks you for a password. For example, an online retailer might ask you to log in by supplying a password before you complete a purchase. This arrangement is convenient for them, since they can store your personal data (which could include payment-related data) and link your visits. This means that somewhere on the retailer’s computer system reside a whole bunch of passwords. The database containing these passwords presents a lucrative target for attackers to seek out. And they do, sometimes successfully.17

Fortunately, any respectable organization using passwords to authenticate its customers will not maintain a database containing passwords, thanks to cryptography.18 All the organization really needs is evidence that whoever is logging in knows their password. Cryptography enables the organization to verify this without requiring knowledge of the password itself. To do so, what is necessary is a database containing some means of checking whether supplied passwords are correct—in other words, a database of integrity checks for passwords.

Hash functions are candidates for this purpose. Here’s the idea: When you first create an account, you supply the organization with a username and password. The organization hashes this password and stores the hash next to your username in a database. Whenever you log in, you resupply the username and password. The organization hashes the offered password and then checks its database to see whether this hash matches the one next to your username. If it does, then it’s you.

For this purpose, a hash of a password is as good as a password. If the supplied password is not correct, then its hash will not match the one in the database. Importantly, however, anyone who manages to access the database will not learn the passwords from the hashes stored there. When passwords are managed in this way, nobody else knows your password—even the administrators of the password system. Notice that if you forget your password, nobody is able to retrieve it for you and you’re forced to reset it. Hash-protected passwords are like days. You can always start a new one, but you can never be given back one you’ve carelessly lost.
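For the programmatically curious, here is a minimal sketch of this idea in Python, using the standard hashlib library. The names and the in-memory "database" are purely illustrative, and a real system would add further protections (such as the deliberately slow hashing discussed shortly):

```python
import hashlib

# Illustrative in-memory "database" mapping usernames to password hashes.
accounts = {}

def register(username, password):
    # Store only a hash of the password, never the password itself.
    accounts[username] = hashlib.sha256(password.encode()).hexdigest()

def log_in(username, password):
    # Hash the offered password and compare it with the stored hash.
    offered = hashlib.sha256(password.encode()).hexdigest()
    return accounts.get(username) == offered

register("chloe", "correct horse battery staple")
print(log_in("chloe", "correct horse battery staple"))  # True
print(log_in("chloe", "guess"))                         # False
```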

Revenge of the Reference Book

Username—easy. Password—um . . . While this might be you trying to log on to one of your accounts in cyberspace, it is also the conundrum faced by an attacker. In the absence of anything better, at this point the attacker might as well just guess.

The beauty, and ultimate ugliness, of passwords is that they are memorable. The need for your brain to readily recall a password tends to place a limit on how complex a password can be. As noted earlier, the Oxford English Dictionary contains fewer than 300,000 words. Even allowing for clever morphs of words using other keyboard characters to replace letters, on a cryptographic scale of things there are only so many possible passwords that the attacker needs to try.

Here’s a real attack on passwords: The attacker establishes a list of candidate passwords. The obvious stuff, such as password, test, abc123, and justinbieber, can go at the beginning, followed by the 300,000 dictionary words, and then all their close relatives, such as ju5t1n81e8er. An attacker with nothing but guesswork to go on just starts firing away. However, an attacker who has much more usefully managed to access a database containing hashes of passwords starts hashing away. This latter, more powerful attacker is in a position of strength. All the attacker needs is for one of the candidate passwords on the list to hash to the same value as one of the hashes in the database. As soon as this happens, the attacker knows a username and password that will let them log on to the system. This is called a dictionary attack, since the list used is essentially a dictionary of passwords.
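To see how mechanical a dictionary attack is, here is a sketch in Python. It assumes the attacker has stolen a table of unsalted password hashes of the kind sketched earlier; the accounts and the candidate list are, of course, made up:

```python
import hashlib

# A stolen table of username -> password hash (unsalted SHA-256, as in the earlier sketch).
stolen_hashes = {"chloe": hashlib.sha256(b"ju5t1n81e8er").hexdigest()}

# The attacker's candidate list: obvious guesses first, then dictionary words and their morphs.
dictionary = ["password", "test", "abc123", "justinbieber", "ju5t1n81e8er"]

for username, stored_hash in stolen_hashes.items():
    for candidate in dictionary:
        if hashlib.sha256(candidate.encode()).hexdigest() == stored_hash:
            print(f"{username} logs in with {candidate}")
            break
```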

Dictionary attacks can't be prevented. Using passphrases instead of passwords can make them harder to carry out, but do you do that? Good on you if you answered yes. Passphrases are harder to remember and take longer to type, and the chances of making an error as we enter them are much greater than for passwords. Many ingenious methods have been proposed for choosing more complex passwords,19 but the majority of password users simply won't follow any advice that makes life even harder than it already is, regardless of the potential security reward. If you can't stop something from happening, the next best thing is to discourage it. This brings us to one of the most surprising uses of cryptography that you engage with on a daily basis.

Cryptography is a complete nuisance, as any computer system engineer will tell you (although nuisance is probably not the precise word they would use).20 Conducting cryptographic operations costs time and energy. If a systems engineer can get away without using cryptography in their system, then they certainly will do so. Security is the enemy of performance, they will tell you. Cryptography slows systems down. Now there’s an idea!

Most cryptographic algorithms are designed to be computed as quickly as possible. However, to protect against a dictionary attack there is a distinct advantage in using a cryptographic tractor rather than a Ferrari. Instead of using a normal hash function to store hashes of passwords, a deliberately "slow" hash function could be designed that takes, say, one second longer to compute than a typical password hash takes (an operation that would normally take a tiny fraction of a second). Slowing down the experience of logging in by one second would be barely noticeable to a system user. However, if an attacker had a password dictionary of 64 million passwords (dictionaries of this size are available for purchase on the internet), then deliberately delaying each hash computation by one second would increase the time required to perform a complete dictionary search by 64 million seconds, which is about two years. To proceed, the attacker would need to be either extremely patient or very determined.

Cryptographic algorithms designed for the purpose of behaving like slow hash functions are sometimes known as key-stretching algorithms.21 Organizations often deploy several layers of different key-stretching algorithms to protect their passwords, making life even harder for an attacker with a password dictionary. Use of these algorithms does not make passwords any stronger as a means of providing entity authentication, but key stretching helps to deter one of the most dangerous ways of defeating passwords.
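One widely used key-stretching technique is PBKDF2, which is available in Python's standard hashlib module. The sketch below is illustrative only: the iteration count is a tunable example, and the random salt is a standard extra ingredient not discussed above:

```python
import hashlib, hmac, os

def slow_hash(password, salt, iterations=600_000):
    # PBKDF2 repeats an underlying hash function many times, turning a
    # near-instant operation into a deliberately slow one.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)

salt = os.urandom(16)                     # random value stored alongside the hash
stored = slow_hash("ju5t1n81e8er", salt)

# Verification repeats the same slow computation and compares the results.
print(hmac.compare_digest(stored, slow_hash("ju5t1n81e8er", salt)))  # True

# At roughly one second per guess, a 64-million-entry dictionary costs
# 64_000_000 / (60 * 60 * 24 * 365) seconds, or about two years.
```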

Too Many Passwords

Passwords for this, passwords for that. Make sure this one is at least eight characters, make sure that one includes upper- and lowercase characters and at least one number or other symbol. It’s frustrating, isn’t it? Worse, a typical “top ten” internet safety tip is that you should make sure all your passwords are completely different.

There’s a good reason to have different passwords for each of the various websites and applications you log on to. Suppose, instead, that you use one password for everything. No matter how wonderful this password is, the security of your only password depends on how well the system with the poorest security looks after it. Your bank may well do an excellent job at password management, but does the website of the small campsite you booked online last year care quite so much about security?

Are your passwords all well chosen and distinct? Really? Truly? If you make claims to such password sainthood, then either you’re deceiving yourself or cryptography is helping you.

The best option for anyone struggling with a proliferation of passwords is to deploy a password manager. These come in many flavors, including both hardware and software versions, but the basic concept is the same. Password managers address the three core challenges of password use: choosing strong passwords, using a different one for each application, and remembering them all. A good password manager will generate strong passwords on your behalf, securely store them, and then automatically recall them whenever they are required.

Generating and recalling strong passwords is much simpler for a computer to do than for a human, since a computer is unburdened by cognitive biases and has almost flawless memory. A password manager securely stores all these strong passwords in a local database and then encrypts the database with a key. So far, so good.
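A minimal sketch of this core idea, assuming the third-party Python package cryptography (its Fernet cipher stands in for whatever encryption a real password manager uses), with illustrative names throughout:

```python
import json, secrets
from cryptography.fernet import Fernet   # pip install cryptography

def generate_password(nbytes=16):
    # Random passwords are easy for software to produce and painful for humans.
    return secrets.token_urlsafe(nbytes)

vault = {"bank.example": generate_password(), "campsite.example": generate_password()}

key = Fernet.generate_key()              # the key protecting the whole vault
locked = Fernet(key).encrypt(json.dumps(vault).encode())

# Later, the manager decrypts the vault and recalls a password on demand.
unlocked = json.loads(Fernet(key).decrypt(locked))
print(unlocked["bank.example"])
```

Where that vault key comes from, and how it gets tied to you, is the question we turn to next.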

There are two issues to resolve now. First, whenever you’re asked to supply one of your passwords, the key to decrypt the database is required. Where is this key? Second, the purpose of all these passwords is to provide entity authentication of you, the human user. A password manager consists of software running on your computer (and may also include a piece of hardware). How do these stored passwords get linked to you?

Different password managers deal with these two questions in different ways, but arguably the most common answer, to both, is to use a password—what else? You activate the password manager by entering a password, and the key to the database is computed from this password. A password to protect passwords; has anything really been gained?

To an extent, it has. The challenge of managing many passwords has been reduced to the difficulty of managing one password. This is a much easier problem to deal with. Yes, the password to the password manager should be strong. Yes, you need to be able to remember it. Yes, you must make sure this password is kept secure. But it’s only one password.

Of course, it is also a single point of failure. If your password manager is compromised, then everything is lost. Some password managers thus deploy stronger entity authentication techniques to link you to your passwords, including biometrics or the use of two-factor authentication. However they work, the bottom line is that password managers use encryption to make passwords simpler to manage, but they don’t really make the fundamental problems of passwords go away. Password managers treat the symptoms but not the cause.22

Masquerade on Masquerade

Whether you like it or not, passwords are not going away anytime soon as a means of providing entity authentication. Because passwords are entrenched as a security mechanism but also weak, they tend to be the point of vulnerability exploited by so many of the frauds in cyberspace.23

Recall that I previously mentioned several different ways an attacker could acquire your password, assuming you’re not using a state-of-the-art password manager. There is another, perhaps even more straightforward, technique: the attacker could simply ask you for it.

This strategy might seem unlikely to succeed, but phishing attacks work in precisely this way. A phishing attack is often launched under the cover of an official-looking email from the likes of your bank or system administrator, asking you to do something for security reasons, such as resetting your password. If you proceed, then in most cases you follow a web link taking you to a spoof website run by the fraudster, which will first ask you for your current password (a common requirement to enable a password reset). You type in your password and then . . . bye-bye password, or credit card number, or whatever other important security information the criminals were seeking.24

Getting hold of your password can be the start of endless mischief in cyberspace, since, from the perspective of any website or application relying on your password for entity authentication, you are your password. The fraudster can now do anything your password enables you to do.

Matters would be made even worse if you were subject to a phishing attack on the password you use for your password manager. We’d all like to imagine that the type of savvy person who deploys a password manager would not fall for such a trick. Suppose, however, that you receive an email purporting to be from the company who sold you the password manager technology, asking you to enter your password in order to activate an upgrade of the password manager software (a request that no reputable company would send you). You wouldn’t enter it, would you? If this catastrophe unfolds, then whoever is behind the phishing attack can now potentially do everything you can do in cyberspace.

It is worth reflecting on the underlying anatomy of this type of fraud. An attacker conducts a masquerade (as, say, your bank) in order to perpetrate another masquerade (as you). The core problem arises through your failure to authenticate the source of the original phishing email and/or any website you are subsequently directed to. You are possibly fooled by (weak) data integrity mechanisms that made the original email appear genuine (logos, use of appropriate language, plausibility of the request, etc.). The problem, in our cryptographic parlance, is that a semblance of data integrity is not sufficient to provide strong entity authentication. Because of your failure to ask “Who’s out there?” during the phishing attack itself, the next time one of the websites you visit asks “Who’s out there?” the answer might be you, even when you’re not.

A friend of mine opened a bank account in the US in the 1990s and was asked what password he wanted. To my friend’s great surprise, the teller wrote his response down in a notebook. Let’s be frank; nobody manages passwords like this anymore. At least, they shouldn’t! Thanks to cryptography, nobody other than you should ever know your passwords. You should enter passwords only when you are absolutely sure you’re talking to a legitimate service in cyberspace.

Perfect Passwords

I’ve been giving passwords, deservedly, quite a hard time. But now let’s take an entirely different approach. If we could reinvent the world, what would a perfect password look like?

A perfect password should be unpredictable, in order to be as resistant as possible to guessing and dictionary attacks. In other words, it should be randomly generated. A perfect password should be used only for logging in to one system and not shared across multiple applications. A decent password manager can facilitate both of these requirements. However, a perfect password should also be of no use to an attacker who acquires it, by whatever means (shoulder surfing, use of a keylogger, observing the network the password is sent over, etc.). Hmm . . . how could a perfect password be devised?

It’s certainly possible to take a step in this direction. No doubt you have occasionally, perhaps far too often, been asked to change your password. This is another of the frustrating aspects of password management. You have just succeeded in memorizing a complex password with all those funny characters, and then some well-meaning security expert recommends you change it? Bothersome maybe, but regular password change reduces the risk of some threats to passwords, such as dictionary attacks, as well as potentially limits the impact of an already compromised password (which you might not even be aware has been compromised).25 Use of a password manager can make regular password changes less painful, but it doesn’t make the process entirely painless. Nor does it create a perfect password, since a stolen password is still useful to an attacker until the next time it is changed.

Suppose an attacker observes a password. The only way to make this information completely useless to the attacker is to ensure that this password is never, ever used again. A perfect password must therefore not just be used to authenticate to only one system; it must be used only once to do so. Every time a perfect password is used, it must then be changed.

Fortunately, cryptography can be used to enable perfect passwords. Indeed, there is a very good chance you use a perfect password every time you authenticate to your online bank. It’s worth knowing how this idea works in practice.

The first important aspect of a perfect password is that it should be generated randomly. True randomness is difficult to achieve, since it commonly requires a physical process, such as the tossing of a coin or the rolling of dice. More practically, computers extract true randomness from white noise produced by, for example, the oscillations of a transistor. However, as previously observed, one of the fundamental properties of any good cryptographic algorithm is that its output appears to be randomly generated. A cryptographic algorithm can never produce true randomness, since there is one sense in which the output of a cryptographic algorithm is predictable. If you encrypt the same plaintext, using the same key, and the same encryption algorithm, you always get the same ciphertext. Similarly, if you hash the same data, using the same hash function, you always obtain the same hash output. By contrast, when you toss a coin, the outcome can never be predicted.

However, this predictability of a cryptographic computation does not matter if we make sure that the data input into a cryptographic algorithm is different every time. Different input to a cryptographic algorithm should result in different output. Ensuring different input each time the algorithm is computed thus means that the output of a cryptographic algorithm can be used as a perfect password.
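A quick illustration of this point, using the SHA-256 hash function from Python's standard library:

```python
import hashlib

print(hashlib.sha256(b"log-in attempt 41").hexdigest())
print(hashlib.sha256(b"log-in attempt 41").hexdigest())  # identical: same input, same output
print(hashlib.sha256(b"log-in attempt 42").hexdigest())  # one character changed, completely different output
```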

Here's the idea behind the use of perfect passwords to authenticate to your online bank. The technologies that banks use vary, but one fairly common approach is to issue customers a small device called a token.26 Some tokens just have a display screen, while others resemble a pocket calculator. Whatever form of device is used, the bank is really equipping you with a cryptographic algorithm and a key. The algorithm is common to all customers of the bank, but the key is unique to you. The bank maintains a database of all keys issued to customers.

When you authenticate to your bank, the token computes a perfect password by using the algorithm and the key. The token displays the password on the screen. You send this password to the bank, which then repeats the computation, using its own copy of the key. If the two outputs match, the bank is convinced that it’s you out there. More precisely, the bank is sure that whoever is out there must have access to the cryptographic key that was sent to you. If someone else has stolen your token, then there could be a problem, so many banks also incorporate other layers of authentication (for example, some tokens themselves ask “Who’s out there?” by requesting that you first enter a PIN).

The passwords produced by the token are cryptographically generated, hence random enough. The algorithm is normally specially designed for generating random passwords, but there is no reason why, at least in theory, it could not be an encryption or MAC algorithm. What is most important is that the input supplied to the algorithm is used for only one password computation. The next time the bank asks you to authenticate, the input to the algorithm should be different. In this way, every time you log in to the bank, the password will be different.

Note that the input to the algorithm on your token does not have to be secret. The only secret in this system is the key that you share with your bank. What must be true is that the input is known to both you and your bank, and this input changes each time the bank asks who is out there. What do you and your bank both know that changes every time you’re asked to authenticate—that changes every time?

Many tokens include a clock and work by using the current time as the input to the cryptographic algorithm. Token technologies without a keypad use this technique, typically computing a perfect password every thirty seconds or so and then displaying it on the screen. The bank customer sends the bank the current displayed password as evidence that they have the key (token) currently in their possession. Of course, clocks do stray over time, but the lag on an individual token can be monitored and compensated for by the bank.27
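Here is a sketch of a time-based token in Python, in the spirit of what was just described. The shared key, the thirty-second window, and the six-character truncation are illustrative choices (real deployments follow standards such as TOTP), and HMAC stands in for whatever algorithm a bank actually uses:

```python
import hashlib, hmac, time

SHARED_KEY = b"key issued to this customer and held by the bank"

def one_time_password(key, at=None):
    window = int((time.time() if at is None else at) // 30)  # current 30-second slot
    digest = hmac.new(key, str(window).encode(), hashlib.sha256).hexdigest()
    return digest[:6]                                        # short enough to type

# The token and the bank run the same computation on the same window;
# matching outputs convince the bank that whoever is out there holds the key.
now = time.time()
print(one_time_password(SHARED_KEY, now) == one_time_password(SHARED_KEY, now))  # True
```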

The time is just one example of a piece of nonsecret data that can be known at the same time by two entities in different corners of cyberspace. If it’s not possible to use clocks, one alternative is to maintain a notion of artificial time by using a counter. In this case, the bank and token each use the counter to keep track of the number of times that authentication using the token has occurred. The latest count is used as the shared nonsecret input to the algorithm. After each authentication attempt, both the bank and the token increment the counter so that they share a new nonsecret value.

This is also how many remote keyless entry systems for cars operate. The “bank” in this case is your car, and the “token” is your car key fob. Both the car and the key fob share a cryptographic algorithm and a secret key. They also each maintain a counter. Every time you press the button to open your car, the key fob computes and wirelessly transmits a perfect password to the car. The car verifies the correctness of the password, before releasing the door catch.28
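A counter-based sketch in the same spirit, simplified to its bones (a real rolling-code system also tolerates button presses the car never hears, and the car verifies codes rather than generating them); the names and the key are illustrative:

```python
import hashlib, hmac

class RollingCode:
    # Both the car and the key fob hold the same secret key and counter.
    def __init__(self, key):
        self.key, self.counter = key, 0

    def next_code(self):
        code = hmac.new(self.key, self.counter.to_bytes(8, "big"),
                        hashlib.sha256).hexdigest()[:8]
        self.counter += 1        # never reuse an input, so never reuse a password
        return code

fob = RollingCode(b"shared secret")
car = RollingCode(b"shared secret")
print(car.next_code() == fob.next_code())  # True: the press is accepted
```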

There is another way of facilitating perfect passwords that does not require the synchronizing of clocks or the maintenance of a synchronized counter. The flexibility offered by freedom from a need to synchronize is the reason that this alternative approach underpins not just some perfect password tokens, but also the way entity authentication is performed when you access your Wi-Fi, visit a secure website, or do many other things.

Digital Boomerangs

An indigenous hunter stealthily creeps up to the edge of a coastal swamp in eastern Australia. The distant ducks dabble unaware. The hunter reaches back and launches his boomerang. It curves around the far shore of the wetland before spinning back, low over the water. The ducks take flight, while the boomerang spirals back into the hunter’s hand.29 This scene might seem irrelevant to the discussion at hand, but cyberspace is constantly buzzing with digital boomerangs. Indeed, we couldn’t reliably do half the things we do in cyberspace without them.

To understand why, let’s consider our hunter once again. Suppose the hunter is blind (which makes boomerang throwing an even more dangerous sport than it already is). Further, let’s suppose that instead of seeking dinner, the hunter wishes to use the boomerang to learn something about his surrounding environment. This is precisely why we throw digital boomerangs into cyberspace.

Although our hunter can’t see the boomerang as it flies, one thing he can be fairly sure of is that the boomerang returning to him must be the same boomerang he threw (unless one of his friends decides to play an elaborate prank). However, if you launch some data into cyberspace and it later comes back to you, it is less clear that this is exactly the same data. The returning data could, for example, be a copy of identical data you sent sometime in the past. For this reason, normally we toss only freshly generated random numbers into cyberspace. Because these numbers are new and randomly chosen, it is extremely unlikely that a copy of them has ever been sent into cyberspace before. Just like the hunter, we are thus assured that when this random number returns, it must indeed be the random number we recently sent.

Despite his blindness, our hunter might be able to deduce information about the environment when the boomerang returns. Suppose some pungent melaleuca trees fringe the far shore of the swamp.30 As the boomerang flies low over these bushes, it picks up traces of their scent. The blind hunter might now be able to learn something about where the returned boomerang has been from its smell. Note, critically, that the hunter can make this deduction because the returning boomerang has been modified (in this case, ever so slightly) from the boomerang that was thrown.

Like the hunter, we are completely blind in cyberspace. When we send random numbers into cyberspace and they return, we have absolutely no idea where they have been. However, one advantage that data has over boomerangs is the ease with which it can be modified. If random numbers can be transformed in a manner that identifies who modified them, this information can be used to determine precisely where the returning random number has just been. In other words, digital boomerangs enable us to work out who is out there.31

This principle, often referred to as challenge-response, is readily implemented using cryptography. Going back to our online banking tokens, if the token has a keypad, then we can use challenge-response instead of relying on a system clock. In this case, the bank generates a fresh random number and sends it to the customer of the bank. This is the challenge. The real challenge the bank is issuing is: “Show me what you can do with this new randomly generated number.” The customer enters the challenge into their token, which then uses its key and cryptographic algorithm to compute a response displayed on the token screen. The customer sends the response back to the bank, which then checks whether the same response is obtained when the bank processes the challenge using the algorithm and its own copy of the customer’s key. The bank hurls a random number into cyberspace and gets back a modified version, transformed in a way that only the customer should be able to do. The digital boomerang returns and, crucially, the bank knows where it has just been.
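The whole exchange fits in a few lines of Python. This is a sketch with illustrative names, using HMAC as the keyed transformation and the standard secrets module to generate the challenge:

```python
import hashlib, hmac, secrets

CUSTOMER_KEY = b"key shared by the token and the bank"

def bank_issue_challenge():
    return secrets.token_hex(8)              # a fresh, randomly generated challenge

def token_respond(key, challenge):
    # The token transforms the challenge in a way only the key holder can.
    return hmac.new(key, challenge.encode(), hashlib.sha256).hexdigest()

def bank_verify(key, challenge, response):
    return hmac.compare_digest(token_respond(key, challenge), response)

challenge = bank_issue_challenge()
response = token_respond(CUSTOMER_KEY, challenge)
print(bank_verify(CUSTOMER_KEY, challenge, response))  # True: the boomerang came back
```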

The Importance of Challenge-Response

The principle of challenge-response is vital to security in cyberspace. Most practical processes involving cryptography feature some throwing of digital boomerangs.

Thus far I have mainly presented cryptography as a set of tools, which provide properties such as confidentiality, data integrity, and entity authentication. In practice, most uses of cryptography involve more than one party and more than one tool. Challenge-response provides a good example: the bank generates a challenge (almost certainly using a cryptographic random number generator) and sends it to the user, who enters it into the token. The token applies a cryptographic algorithm to the challenge in order to compute a response, which is sent back to the bank. The bank then recomputes the response locally and checks whether it matches the response received from the token.

Most cryptography occurs in a flurry of send this, do that, check this, encrypt that, send it back again, and so on. The whole caboodle is normally referred to as a cryptographic protocol, which dictates the precise procedure everyone needs to follow for the cryptographic tools used in the protocol to deliver the desired security. In fact, a cryptographic protocol is essentially a cryptographic algorithm whose operations are carried out by a number of different entities.

Many of the cryptographic protocols you regularly use incorporate challenge-response. For example, whenever you use your web browser to connect to a remote website where you intend to process sensitive data, such as when you make online purchases, access webmail, or conduct online banking, your web browser and the web server (the computer hosting the web page) are hopefully using a cryptographic protocol known as Transport Layer Security (TLS) to talk to one another.32 One of the first steps of TLS is that your web browser and the website each send a random number to the other.

However complex the rest of a cryptographic protocol is, the reason many protocols begin with sending a randomly generated challenge to elicit a response is that establishing who is out there is perhaps the most fundamental part of any security process in cyberspace. The TLS protocol negotiates choices of cryptographic algorithm and establishes keys that can be used to encrypt and protect the integrity of subsequent communications between your web browser and the website you’re visiting. Why bother doing this unless you can be sure of the identity of the website you’re trying to securely connect to?
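You can watch this happen from the client side with Python's standard ssl module. The sketch below opens a TLS connection to an illustrative host, lets the library handle the handshake (including the exchange of random values), and then reports who answered:

```python
import socket, ssl

hostname = "example.com"                  # illustrative destination
context = ssl.create_default_context()    # verifies the server's certificate

with socket.create_connection((hostname, 443)) as raw:
    with context.wrap_socket(raw, server_hostname=hostname) as tls:
        # The handshake has completed; the certificate identifies who is out there.
        print(tls.version())              # e.g., 'TLSv1.3'
        print(tls.getpeercert()["subject"])
```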

The security protocols used by Wi-Fi similarly determine keys for protecting the data flowing between a device and the network, but there is no point in going to this trouble if the device is not permitted to access the Wi-Fi network or if the Wi-Fi network is not genuine. Most cryptographic protocols begin with entity authentication, and most entity authentication commences with some sort of challenge-response.

Mister Nobody

Knock knock. Who’s there? Mister! Mister who? Mister nobody! As with most playground games, deep therein lies a hidden truth. Sometimes we want to respond to “Who’s out there?” in cyberspace with “I’m not telling you; mind your own business!”

The antithesis of entity authentication is anonymity. People desire anonymity in cyberspace for many reasons. It’s most natural to seize on negative motivations for anonymity, such as to conduct criminal activities or espionage. In many situations, however, anonymity is desirable for more constructive reasons. Citizens of despotic regimes might wish to remain anonymous when they criticize the government. Journalists might desire anonymity. More mundanely, someone browsing a website might wish to remain anonymous to prevent the website from recording their personal information, or to restrict the site owner’s ability to conduct user profiling and targeted advertising. Indeed, the concept of anonymity has been argued to be a human right that supports the broader and more fundamental right to personal privacy.33

You might imagine that anonymity is the default state of existence in cyberspace. After all, the entity authentication mechanisms that I’ve described thus far are all motivated by the apparent ease of masquerade in cyberspace. You can’t see who’s out there, so maybe you should implement some perfect passwords to find out? The truth is that it’s easy to be kind of anonymous in cyberspace. True anonymity is much harder to achieve.

You probably feel anonymous when you’re in cyberspace. It feels like you, the device you’re interacting with, and everything beyond is an unknowable void; nobody is with you, nobody can see you, nobody knows who you are. Indeed, you can easily pretend to be someone else. When the ticketing website annoyingly forces you to register with them, you smugly type “Mickey Mouse” into the name box and become a cartoon rodent. It’s not unlike the anonymity you feel when you get behind the wheel of your car: it’s just you, a box of bamboozling technology, and the open highway.

There is a negative side to this perception of anonymity. Many people experience a reduction in their natural reserve and their natural desire to conform to behavioral norms. Anonymity appears to unleash some less attractive aspects of personality, which are otherwise constrained.34 You may have experienced this phenomenon in your car, where a degree of anonymity has led you into conflict with other drivers in a way that doesn’t happen when you walk among other pedestrians on a busy city street. Car drivers hit their horns in situations where pedestrians apologize. In extreme cases, car drivers behave very badly, leading to incidents of road rage.

Apparent anonymity unleashes extraordinary demons in cyberspace. Our use of cyberspace for everyday communications has made a raft of societal ills easier to perpetuate, and has given them a much wider reach. Harassment through vitriolic remarks (trolling), cyber bullying, and cyber stalking is on the rise, partially facilitated by the perceived anonymity of cyberspace.35 Sometimes these acts are carried out by people who are known to their victims but whose inhibitions have been reduced in cyberspace. But people who are consciously trying to be anonymous in cyberspace often are guilty of the worst offenses. Just look at the extraordinary comments following articles in online newspapers and magazines. Some of the remarks are deeply disturbing, with the worst normally posted under aliases.

One car driver badly harassing another (for example, tailgating or driving in a dangerous fashion) might well think their apparent anonymity will protect them from prosecution, but that’s not necessarily the case. Cars, after all, have registration numbers that can be reported and traced, and roads are often watched by CCTV cameras that can be consulted during an investigation. Cyberspace is no different.

In fact, from an anonymity perspective, cyberspace is a considerably worse environment. Each device accesses the internet using a unique address, which acts as an identifier of the connection and sometimes the device itself. Infrastructure companies, such as mobile operators and internet service providers, often log network activity. Computing devices typically have a range of features that can be used to identify them on the basis of their specific hardware and software. Almost every action in cyberspace leaves a trace, and many of these can be used to unmask a casual attempt to remain anonymous.36

You must make an effort if you really want anonymity in cyberspace. Just as cryptography provides some of the strongest mechanisms for not being anonymous in cyberspace, it also enables some of the best methods for achieving anonymity.

Peeling Onions

The best-known technology for supporting anonymity in cyberspace is Tor. This tool does not provide perfect anonymity (whatever that might mean), but it provides sufficient anonymity to make it a technology of choice, not just for political dissidents and online black-market vendors, but also for ordinary users who have a need for privacy in cyberspace.37

Tor consists of a special web browser and a network of dedicated routers, which are essentially delivery centers. Routers are a standard component of the internet. Normal traffic (not using Tor) includes the unique internet addresses of both the sender of data and the intended receiver, and data travels from sender to receiver by being passed from one router to another until the destination is reached. This addressing information is not secret, so all these intermediate routers can easily see who is sending data and where it is going. Indeed, the whole point is for routers to be able to see addressing information; otherwise they don’t know where to direct the data on the next hop of its journey.

The challenge in providing anonymity is how to give routers enough information that they can keep passing the data toward the destination, without revealing the full information about who is sending what to whom. This sounds like a job for encryption, but if you just encrypt the addressing information, then nobody knows where the information needs to go. The Tor solution is both simple and ingenious.

Here’s the analogy: You’re a whistle-blower who wants to send a document to a journalist. You want to do this urgently and anonymously. You could seal the document in an envelope with the journalist’s address and call a courier, but then the courier has the potential to “de-anonymize” you, since they are aware of the address of both sender and receiver. To address this de-anonymization problem, Tor establishes a network of “safe houses.”

To deliver the document using Tor, you first randomly select three safe houses from the Tor network. You seal the document in an envelope with the journalist’s address. You then seal this envelope inside another envelope, addressed to the third safe house, which is then sealed in an envelope addressed to the second safe house, which is then sealed in an envelope addressed to the first safe house. Now you call the courier. The courier delivers this well-padded packet to the first safe house. Here the first envelope is removed, revealing the address of the second safe house. The first safe house now calls a new courier, who takes the parcel onward. A similar process happens at both the second and third safe houses. At the third safe house, the destination address is revealed, and the final courier delivers the remaining envelope to the journalist.

This scheme might sound contrived, but it is very effective. No safe house or courier is aware of both who sent the parcel and who is supposed to receive it. The first safe house and courier know where it came from, the third safe house and courier know where it is going, but nobody knows both these things. In Tor, the safe houses are routers, and the envelopes are layers of encryption. Data sent using Tor is encrypted in three layers, with each router stripping off one layer of encryption before passing it on. This process is sometimes referred to as onion routing, since it is analogous to a chef peeling off the layers of an onion.
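The layering itself is easy to demonstrate. The sketch below is a conceptual illustration only (Tor's real protocols are more involved), using the third-party cryptography package's Fernet cipher as a stand-in for the encryption negotiated with each router:

```python
from cryptography.fernet import Fernet   # pip install cryptography

# One key per relay (in Tor, these are negotiated with each router).
relay_keys = [Fernet.generate_key() for _ in range(3)]

message = b"document for the journalist"

# The sender wraps the innermost envelope first, so the first relay's
# layer ends up outermost.
onion = message
for key in reversed(relay_keys):
    onion = Fernet(key).encrypt(onion)

# Each relay peels off exactly one layer and passes the rest along.
for key in relay_keys:
    onion = Fernet(key).decrypt(onion)

print(onion)   # b'document for the journalist'
```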

Anonymity is a fascinating aspect of cyberspace, for the reasons previously discussed, and many more. However, there are many polarized views of anonymity in cyberspace. The negative aspects of anonymity38 have led some to regard it as one of the greatest scourges of cyberspace. Others see anonymity as the defining feature of cyberspace freedom.39 As I will discuss later in greater detail, because cryptography provides the best means to facilitate anonymity in cyberspace, cryptography itself is often either demonized or celebrated.

Who’s Who?

My analysis of who is out there has been a bit simplistic. I talked about the separation between human and computer, but reality is even more complex.

What is a human in cyberspace? Most people have many different personas in cyberspace. You, the human, are undoubtedly represented by different aliases and user accounts across the range of services you use in cyberspace. Some people even have different accounts with the same service. Which of these is the “real” you? All of them? Some of them?

Who else, apart from a human, can be out there in cyberspace? It can be a laptop, a phone, a token, a key, a network address. It could also be a web server, a network router, a computer program . . . The possibilities are almost endless.

Matters are going to get even more confusing in the future. The vast majority of humans are rarely far from their mobile phones, making the phone an attractive device on which to base authentication. Modern mobile phones are capable of securely storing keys and computing sophisticated cryptographic algorithms. Not only are humans carrying computers ever more reliably; the future could see humans becoming much more like computers. Advances in health monitoring make it likely that human bodies of the future could be implanted with small computing sensors. More ominously, like it or not, there are projects exploring how to connect human brains to cyberspace.40 Meanwhile, computers themselves are getting better at behaving like humans. Computers are already thinking more like humans as progress in artificial intelligence and in processing large data sets enables machines to anticipate, and even surpass, human decision-making. Advances in robotics make it likely that a cyber human of some shape and form might be with us soon.

What all this means for the future of entity authentication, heaven knows. Whatever technologies we end up using in future cyberspace, however, the core question of who is out there is not going to go away. Whoever asks this question needs to think carefully about who “who” is. Who do you need to know is out there? Human, token, account, key? And when you get your answer, who’s the “who” that replied? Similarly, when you’re asked who’s out there, who answers on your behalf? You or your phone? It would be wise to know, since, if you misplace your phone, you really should be aware of the extent to which you’ve also lost your “cyberself.”

Who is out there? The answer might be complicated, but, to be secure in cyberspace, we ought to know.