It’s 8 a.m. on a Monday morning. As usual, it’s not my alarm clock that wakes me up, but my dog Milou licking my face (a bit gross, I know). I give her a squeeze, push her to the side, and reach for my phone.
I check my WhatsApp and Facebook, scan my emails, and catch up on the latest news on CNN. Before taking Milou out, I put on my Fitbit to make sure that every step I take counts toward my daily goal of ten thousand steps. I’m tempted to tie my Fitbit to her collar, but resist. Don’t judge me. According to my friend Alice Moon’s research, I’m not the only one who cheats on their step count.
Milou and I go for a walk across campus where she can run around off leash and harass the early-riser students, who love the free cuddles.
After a quick shower at home, I grab my phone and wallet and head out to work. On the way, I pick up a matcha latte and a croissant from the deli around the corner. I arrive at the office shortly before 9 a.m.
In less than one hour, I have generated millions of digital footprints. On a server somewhere around the world, there are now digital records of the messages I have sent and received (incoming: cute pictures of my nieces; outgoing: cute pictures of Milou), and of the fact that I checked in with Facebook from my home location, spent about ten minutes reading five different articles on CNN’s website, took about two thousand steps walking across the Columbia University campus in Manhattan, and got an unhealthy deli breakfast.
In addition, the sensors in my phone have registered that there was physical activity starting from 8 a.m. and have tracked my GPS location continuously. The cameras on the corners of streets, outside the deli, and inside the elevators have collected visuals of me, telling them exactly where I have been at any point in time, whether I was alone or accompanied, and whether I looked happy or not (at 8 a.m., I rarely do).
Like the average person, you and I generate about six gigabytes of data every hour.1 Every hour! Just imagine how many USB drives you would need to save all the data that accumulates over the course of your lifespan. And that is just your data.
Worldwide, there are now an estimated 149 zettabytes of data (that is 149,000,000,000,000,000,000,000 bytes), with numbers doubling every year. If you were to store all this data on CD-ROMs and stack them on top of each other, you would reach far beyond the moon. In fact, some have suggested that today there are almost as many digital pieces of data as there are stars in the vast universe. Romantic, isn’t it?
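For the skeptics, here is a rough back-of-the-envelope check of the CD-ROM stack. It is only a sketch: the disc capacity (700 MB), the disc thickness (1.2 mm), and the average distance to the moon (about 384,400 km) are my own assumptions for illustration, and only the 149 zettabytes comes from the estimate above.

```python
# Back-of-the-envelope check of the CD-ROM stack.
# Every constant except the 149 zettabytes is an assumption for illustration.
ZETTABYTE = 10**21                  # bytes in one zettabyte (decimal definition)
TOTAL_BYTES = 149 * ZETTABYTE       # the worldwide estimate quoted in the text
CD_CAPACITY_BYTES = 700 * 10**6     # assumed: a standard 700 MB CD-ROM
CD_THICKNESS_M = 1.2e-3             # assumed: one disc is about 1.2 mm thick
MOON_DISTANCE_M = 384_400 * 1000    # assumed: average Earth-moon distance

discs = TOTAL_BYTES / CD_CAPACITY_BYTES
stack_height_m = discs * CD_THICKNESS_M
moon_distances = stack_height_m / MOON_DISTANCE_M

print(f"Discs needed: {discs:.2e}")
print(f"Stack height: {stack_height_m / 1000:.2e} km")
print(f"Earth-moon distances covered: {moon_distances:.0f}")
```

Under these assumptions, the stack would cover the distance to the moon several hundred times over, so "far beyond the moon" is, if anything, an understatement.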
Each of these data points represents a little puzzle piece of who we are. My Spotify playlist, for example, reveals that I love techno and Taylor Swift. My credit card history, that I enjoy traveling. And my GPS records, that I love to go for long walks in the park.
Individually, these pieces aren’t that meaningful. Just like in a puzzle, you start with a pile of disconnected chaos. But once you put the pieces together, you gradually begin to see the full picture and understand its meaning. The same is true for data. Once connected, our digital traces provide a rich picture of our personal habits, preferences, needs, and motivations. In short: our psychology.
The Internet Knows You Better Than Your Spouse Does
In 2015, the Financial Times ran an article with the provocative headline, “Facebook understands you better than your spouse.” Sounds like the opening to a dystopian science fiction novel? Nope. It’s the result of a real scientific study published by my former colleagues at the University of Cambridge.2
The research team led by Youyou Wu had built a series of machine learning models that could translate a person’s Facebook likes into personality profiles. The results were astonishing: after observing just ten likes from someone’s Facebook profile, their model was able to judge that person’s personality better than their work colleagues could. Sixty-five likes? It knew users better than their friends did. A hundred and twenty likes? Better than family members. And three hundred? Better than their spouse.
When my colleagues first told me about their findings, I was sure they had made a mistake (and so were they, initially). Clearly, there was a bug in the code.
But there wasn’t. My colleagues were right. Seven years later, I am still amazed by their findings. We call our spouses our “other halves” for a reason. They often have years of data on us. They plan, experience, and live life with us every day. And yet, with access to just about three hundred of your Facebook likes, a computer can know you just as well or even better.
This puts the snooping skills of my village neighbors to shame. Even the most curious among them probably knew far less about me than any semiskilled computer scientist or engineer with access to the right data. Today, a fifteen-year-old kid in the basement of their parents’ home could figure out more about me than all my village neighbors together.
But how do computers become such master snoopers? How do they make sense of the vast, unstructured sea of digital footprints to paint a picture of the person behind it? The simple answer is: they observe and learn (yup, big shocker, that’s why it’s called machine learning).
Let me illustrate this process with an example that has absolutely nothing to do with computers or algorithms. My main protagonists are chickens. Baby chicks, to be precise.
A Villager’s Guide to Machine Learning
Have you ever heard of sexing? Don’t worry: this topic is safe for work.
Chick sexing refers to the practice of distinguishing between female chicks (pullets) and male chicks (cockerels). Large commercial hatcheries use it to separate the high-value female chicks from the male chicks almost immediately after hatching.
While the female chicks are used for egg production, the male chicks are usually killed to reduce unnecessary cost for the hatchery (and suddenly becoming a vegetarian doesn’t sound so bad anymore).
The act of sexing is done by experienced personnel—sexers—who have to decide within a few seconds whether a chick is female or male by examining the vent in the chick’s rear (known as “vent sexing”).
As it turns out, this is not an easy task. The genitals of newborn chicks are almost indistinguishable by eye, and there are so many exceptions that it’s practically impossible for even the most experienced sexers to explain their decision-making process. After years of training, they simply know. But how do they learn to distinguish between male and female chicks in the first place? Trial and error.
Imagine you have just started as a sexer in a major hatchery. It’s day one and you are excited to start your new job. But there is no instruction manual, no fifty-page report or PowerPoint deck to introduce you to the wonders of chick sexing. Instead, you are teamed up with an experienced chick sexer who stands right next to you, quietly observing.
You pick up the first chick and examine its rear. Of course, you have no idea; it’s your first day at work and your experience with chick vents has been, um, limited.
You shrug and put the chick into the pullets bin. Your mentor says, “Yes.” Success. You pick up the next one and after a short examination put it in the cockerels bin. Your mentor says, “No.”
Your first day at work won’t feel very satisfying—your chance of making the right choice will likely hover just above 50 percent (that’s a coin flip). But after a couple of weeks of running through the trial-and-error game with your mentor, your brain will have been trained to accurately distinguish between male and female chicks. You have become a sexing master! Just like your mentor, the rules guiding your decision-making might be too complex to articulate, but you have nevertheless internalized them.
Computers learn in the same way: trial and error. You throw a lot of examples at them and give feedback on whether their predictions are right or wrong. Doing so will gradually allow the algorithm to learn how the input (a chick’s rear or a set of Facebook likes) is related to the output (a chick’s gender or a user’s personality traits).
Do your Facebook likes include content about Oscar Wilde, Leonardo da Vinci, and Plato? You’re probably intellectually curious and open-minded. Accounting, MyCalendar, and national law enforcement? Most likely organized and reliable.
The more data you have to run through the trial-and-error game, the better the computer will become at turning educated guesses into highly accurate predictions. That’s exactly what my colleagues did when they conducted their man-versus-machine experiment. They collected a large dataset of over eighteen thousand Facebook users, combined their likes with self-reported personality profiles, and wrote a few lines of code to automate the trial-and-error learning process (a trivial task that can be done with user-friendly commercial software today).
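If you want to see the trial-and-error game in action, here is a minimal sketch in Python. To be clear about what is made up: the users are simulated, the page names are simply borrowed from the examples above, and the rule that decides who counts as "organized" is a toy rule I invented for this illustration. The learning method is a simple perceptron, chosen because it mirrors the mentor's "Yes" and "No" feedback; it is not the model my colleagues used in their study.

```python
import random

# Pages used as inputs. The names echo the examples in the text, but the
# data below is simulated and has nothing to do with the actual study.
PAGES = ["Accounting", "MyCalendar", "Law enforcement",
         "Oscar Wilde", "Leonardo da Vinci", "Plato"]


def make_user(rng):
    """Simulate one user: a list of 0/1 likes plus an 'organized' label."""
    likes = [rng.randint(0, 1) for _ in PAGES]
    # Toy ground truth (an assumption made up for this sketch): a user is
    # "organized" if they like more of the first three pages than the last three.
    organized = 1 if sum(likes[:3]) > sum(likes[3:]) else 0
    return likes, organized


def predict(weights, bias, likes):
    """Guess 1 ('organized') if the weighted sum of likes is positive."""
    score = bias + sum(w * x for w, x in zip(weights, likes))
    return 1 if score > 0 else 0


def train(users, max_rounds=1000):
    """Perceptron learning: guess, get feedback, adjust only after a 'No'."""
    weights, bias = [0.0] * len(PAGES), 0.0
    for _ in range(max_rounds):
        mistakes = 0
        for likes, organized in users:
            error = organized - predict(weights, bias, likes)  # 0 means "Yes"
            if error != 0:  # the mentor says "No": nudge toward the right answer
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, likes)]
                bias += error
        if mistakes == 0:  # a full round without a single "No": stop training
            break
    return weights, bias


def accuracy(weights, bias, users):
    """Share of users whose label is guessed correctly."""
    hits = sum(predict(weights, bias, likes) == label for likes, label in users)
    return hits / len(users)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
training_users = [make_user(rng) for _ in range(200)]
new_users = [make_user(rng) for _ in range(200)]

untrained = accuracy([0.0] * len(PAGES), 0.0, new_users)
weights, bias = train(training_users)
trained = accuracy(weights, bias, new_users)

print(f"Before training: {untrained:.0%} correct on new users")
print(f"After training:  {trained:.0%} correct on new users")
```

Just like the rookie sexer, the program starts out clueless and never receives an explicit rule. It only hears "Yes" or "No" after each guess and adjusts how much weight it gives to each like. The two sets of simulated users are kept separate on purpose: the interesting question is not whether the program remembers people it has already seen, but whether it can judge people it has never met.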
In the next few chapters, I will take you on a journey through different types of digital footprints and show you how much they can reveal about who you are. Some of these footprints, such as your social media profiles, are created intentionally (chapter 2). Others, such as the GPS records extracted from your smartphone, are mere by-products of your interactions with technology (chapter 3).
But they all have one thing in common: they offer a fascinating window into your psychology—the aspects of your identity that define who you are beyond what is visible to the naked eye. We’ll venture into the worlds of political ideology, sexual orientation, socioeconomic status, mental health, cognitive ability, personal values, and more.
But most of our time together will be spent in the world of personality traits. It’s where most of the existing research lives, including my own. And it’s also the world that has received the most public attention and scrutiny (think Cambridge Analytica).
Because personality is such a popular destination, I want to get us all on the same page about what to expect there—a crash course for aspiring master snoopers, if you want (if you know all about personality, feel free to skip ahead to chapter 2).
The Big Five Personality Traits
Like most people, you probably have an intuitive concept of personality that guides your everyday behavior and social interactions.
In my school, it was clear that Vera was the party animal, while I was, well, the nerdy geek who went home at 11 p.m. when everybody else was still out dancing.
In my village, we all explained the butcher’s frequent outbursts of anger with his impulsive and irritable character. And at university, we all predicted that Anne would become a successful lawyer as a result of her competitive nature.
Although such lay theories of personality help us navigate our social world, they are often implicit and only loosely defined. You might not be able to fully explain why you think a particular person is irritable and you might not be consistent in your terminology. Sometimes you might label the same behavior impulsive, other times grumpy or angry.
In contrast to lay theories, scientific models of personality provide a structured approach to describing how people differ from one another in the ways they think, feel, and behave. Rather than accounting for the full complexity of someone’s identity, they provide pragmatic approximations of what most people are like.
The results of a personality test, for example, will tell you that my school friend Vera is highly extroverted. But you won’t know if Vera is the kind of person who goes to parties to talk to other people, or if she mostly goes there to dance. Or maybe both.
Scientific personality models sacrifice a high level of granularity for a high level of consistency and comparability. You won’t be able to understand all the nuances of who Vera is. But you will be able to directly compare her character to that of others.
The most popular scientific model of personality is the Big Five.3 You might also know it as the OCEAN model, named after the five personality traits it measures: openness to experience, conscientiousness, extroversion, agreeableness, and neuroticism. I will give you the chance to take a short personality test in a minute. It will allow you to learn about your own personality profile before we embark on our master snooping tour.
But let me first add a little bit more color to the five traits. You can also see a summary in table 1-1.
TABLE 1-1
The Big Five in a nutshell

Personality trait | Low | High
Openness | Practical; down-to-earth; traditional; conservative; preference for the familiar | Imaginative; curious; original/creative; appreciation for art, beauty, and aesthetics; open-minded

Source: Adapted from Gerald Matthews, Ian J. Deary, and Martha C. Whiteman, Personality Traits (Cambridge, UK: Cambridge University Press, 2003).
Openness to experience: The Picasso trait
Openness to experience (or openness) refers to the extent to which people prefer novelty over convention. People scoring high on openness are intellectually curious, sensitive to beauty, individualistic, imaginative, and unconventional. You might find them engaged in philosophical discussions, traveling the world, exploring new restaurants, visiting a museum, writing poetry, or painting.
People scoring low on openness, on the other hand, are down-to-earth and more conservative in the values they hold (including politics). They might not get excited by the idea of traveling to new and unknown places, but instead prefer to return to their all-time favorite all-inclusive hotel on the Riviera.
The Spanish painter, sculptor, printmaker, ceramicist, stage designer, poet, and playwright Pablo Picasso is an excellent example of an open-minded personality. Regarded as one of the most talented artists of his time, and one of the most inspiring and influential figures of the twentieth century, Picasso experimented with a wide variety of artistic styles over the course of his career and gave birth to more novel forms of artistic expression than any other artist at the time (e.g., the collage or the Cubist movement). The quote “Others have seen what is and asked why. I have seen what could be and asked why not” perfectly captures Picasso’s mix of intellectual curiosity, preference for novelty, and artistic interest, which also characterizes the personality trait of openness.
Conscientiousness: The Angela Merkel trait
Conscientiousness refers to the extent to which people prefer an organized or a flexible approach in life. It captures how we control, regulate, and direct our impulses.
People scoring high on conscientiousness are organized, reliable, perfectionistic, and efficient. They tend to be good at following rules, resisting temptation, and sticking to schedules. And they love order. Everything needs to be in the right place. Everything needs to be perfect.
In contrast, people scoring low on conscientiousness are more spontaneous, impulsive, careless, absent-minded, or disorganized. They don’t care as much about achievements and instead take a much more relaxed and spontaneous approach to life. They might wait until the last minute to study for an exam or plan their holiday on the way to the airport. And yes, they are the ones who regularly forget about their friends’ birthdays or their own wedding anniversary.
I’ve heard some people call conscientiousness the German trait. I assume that’s because it leaves you with the image of a meticulously organized and perfectionist person. The sort of person that organizes their socks according to color and stacks up books in alphabetical order. It’s why I call it the Angela Merkel trait—always perfectly prepared and dependable (I should say that, sadly, this characteristic doesn’t apply to all Germans).
Extroversion: The Lady Gaga trait
Extroversion refers to the extent to which people enjoy company and seek excitement and stimulation. It is marked by a pronounced engagement with the external world, versus being comfortable with one’s own company.
People scoring high on extroversion can be described as energetic, active, talkative, sociable, outgoing, and enthusiastic. They love people. Actually, more like LOVE people. You’ll most likely find them at social gatherings, trying to be the center of attention and entertaining the crowd. They are charming and usually full of energy and positive emotions (as they will gladly tell you).
Contrary to that, people scoring low on extroversion are more reserved, quiet, or withdrawn. They value their me-time and are much more introspective than their extroverted counterparts. Why waste your time and energy on other people when you can lose yourself in thoughts and daydreams?
The icon that best captures the essence of extroversion for me is the singer Lady Gaga (at least the public persona she portrays; I’ve sadly never met her). The eccentric singer is extremely outgoing and energetic. Her outfits are legendary. They are designed to attract as much attention as possible.
Agreeableness: The Mother Teresa trait
Agreeableness reflects people’s need for cooperation and social harmony. It provides insights into the ways in which we express our opinions and manage relationships.
People scoring high on agreeableness are generally trusting, soft-hearted, generous, and sympathetic. Because they are all about social harmony, they avoid confrontation whenever possible, try their best to not offend or insult, and are prepared to make personal sacrifices in the service of others (e.g., through donations or volunteering).
In contrast, people scoring low on agreeableness are more competitive, stubborn, self-confident, or aggressive. They have no problem speaking up when they don’t like something or when they believe something needs to be changed.
One of the agreeableness icons that has caught the public imagination is Mother Teresa, the selfless, generous, and caring nun who founded a religious congregation to help the most vulnerable members of society. A symbol for altruism and kindness. Her charity provided homes to patients dying of HIV/AIDS, leprosy, and tuberculosis, and today sponsors soup kitchens and mobile health clinics, and runs schools and orphanages.
Neuroticism: The Piglet trait
Finally, neuroticism (also known inversely as emotional stability) refers to the extent to which people experience negative emotions. It reflects the ease with which we cope with and respond to life’s demands.
People scoring high on neuroticism are anxious, nervous, and moody. They tend to get irritated by seemingly small challenges and worry a lot. Am I going to get sick? Am I going to get fired? Is it safe to use the subway?
On the flip side, people scoring low on neuroticism are more emotionally stable, optimistic, and self-confident. They are generally easygoing and don’t get stressed quickly. Missed the subway? Got a cryptic email from the boss? Having family over? Emotionally stable people keep calm and carry on.
The neuroticism icon is one of my favorite fictional characters: Piglet. Piglet is a young pig and one of Winnie the Pooh’s best friends. In the Disney cartoon, he stutters, is constantly nervous, and fears the wind and darkness. When he gets scared, his ears start to twitch. Most of the time, Piglet thinks about all the possible ways in which situations could go wrong. His mind then races with negative thoughts that jump from one worst-case scenario to the next.
What defines who we are is our particular combination of these personality traits. It’s what we call a personality profile.
Think back to Lady Gaga. I introduced her as the icon for extroversion, but she also scores high on openness. And these two characteristics aren’t independent; they influence each other. It’s unlikely, for example, that Lady Gaga would express her openness through shrill and unconventional outfits if she wasn’t also extroverted and interested in attracting attention.
Imagine someone who is open-minded but rather introverted. Got someone? The friend that immediately comes to mind for me is a woman I went to grad school with. She was highly open-minded but also extremely introverted. As you can imagine, flashy clothes weren’t exactly her thing. Instead, she loved going to the museum and devoured classic literature.
What’s Your Personality?
Let’s turn to you now. If you haven’t taken a Big Five test before, I recommend investing the next few minutes doing so. It will make the rest of the book much more relevant and engaging. You can visit this book’s official website, www.mindmasters.ai/mypersonality, or do a simpler paper-and-pencil version in appendix A at the end of the book.
As you respond to the questions and interpret your results, I want you to keep one thing in mind: there are no inherently good or bad traits. Scoring high or low on each of the five personality dimensions has its own unique advantages and disadvantages.
For example, you might be tempted to consider high agreeableness (the tendency to be trusting and caring) a good trait to have. Being friendly and trusting certainly has its advantages in some aspects of life (e.g., relationships and teamwork), but extremely agreeable people can also come across as overly gullible, opportunistic, and lacking a necessary level of assertiveness.
Although being somewhat disagreeable (i.e., critical and competitive) might not make you a lot of friends, it is important when you have to make difficult decisions or take the lead in a competitive environment. The same is true for neuroticism. Few people want to be seen as anxious and vulnerable. However, while being highly neurotic certainly poses challenges for people’s health, it is also often associated with great innovative potential and genius, especially when paired with high levels of openness. Just think of the founder of Apple, Steve Jobs. Jobs was known to be highly neurotic, and his slightly eccentric nature and ability to be in touch with his own emotions are what made him one of the most successful figures of the twenty-first century.
Anyway, give it a go.
Now that you know how machines learn and what your own personality profile looks like, we are ready to dive into research on how computers can predict such profiles without you ever having to touch a questionnaire. I’ve already told you that computers are better than colleagues, friends, and family members when it comes to predicting your personality from Facebook likes. How do they do that? And what other pieces of the personality puzzle do our social media profiles hold?