JUSTIN WAS IN a reflective mood. On 4 February 2018, in the living room of his home in Memphis, Tennessee, he sat watching the Super Bowl, eating M&Ms. Earlier that week he’d celebrated his 37th birthday, and now – as had become an annual tradition – he was brooding over what his life had become.
He knew he should be grateful, really. He had a perfectly comfortable life. A stable nine-to-five office job, a roof over his head and a family who loved him. But he’d always wanted something more. Growing up, he’d always believed he was destined for fame and fortune.
So how had he ended up being so … normal? ‘It was that boy-band,’ he thought to himself. The one he’d joined at 14. ‘If we’d been a hit, everything would have been different.’ But, for whatever reason, the band was a flop. Success had never quite happened for poor old Justin Timberlake.
Despondent, he opened another beer and imagined what might have been. On the screen, the Super Bowl commercials came to an end. Music started up for the big half-time show. And in a parallel universe – virtually identical to this one in all but one detail – another 37-year-old Justin Timberlake from Memphis took the stage.
Why is the real Justin Timberlake so successful? And why did the other Justin Timberlake fail? Some people (my 14-year-old self included)fn1 might argue that pop-star Justin’s success is deserved: his natural talent, his good looks, his dancing abilities and the artistic merit of his music made fame inevitable. But others might disagree. Perhaps they’d claim there is nothing particularly special about Timberlake, or any of the other pop superstars who are worshipped by legions of fans. Finding talented people who can sing and dance is easy – the stars are just the ones who got lucky.
There’s no way to know for sure, of course. Not without building a series of identical parallel worlds, releasing Timberlake into each and watching all the incarnations evolve, to see if he manages success every time. Unfortunately, creating an artificial multiverse is beyond most of us, but if you set your sights below Timberlake and consider less well-known musicians instead, it is still possible to explore the relative roles of luck and talent in the popularity of a hit record.
This was precisely the idea behind a famous experiment conducted by Matthew Salganik, Peter Dodds and Duncan Watts back in 2006 that created a series of digital worlds.1 The scientists built their own online music player, like a very crude version of Spotify, and filtered visitors off into a series of eight parallel musical websites, each identically seeded with the same 48 songs by undiscovered artists.
In what became known as the Music Lab,2 a total of 14,341 music fans were invited to log on to the player, listen to clips of each track, rate the songs, and download the music they liked best.
Just as on the real Spotify, visitors could see at a glance what music other people in their ‘world’ were listening to. Alongside the artist name and song title, participants saw a running total of how many times the track had already been downloaded within their world. All the counters started off at zero, and over time, as the numbers changed, the most popular songs in each of the eight parallel charts gradually became clear.
Meanwhile, to get some natural measure of the ‘true’ popularity of the records, the team also built a control world, where visitors’ choices couldn’t be influenced by others. There, the songs would appear in a random order on the page – either in a grid or in a list – but the download statistics were shielded from view.
The results were intriguing. All the worlds agreed that some songs were clear duds. Other songs were stand-out winners: they ended up being popular in every world, even the one where visitors couldn’t see the number of downloads. But in between sure-fire hits and absolute bombs, the artists could experience pretty much any level of success.
Take 52Metro, a Milwaukee punk band, whose song ‘Lockdown’ was wildly popular in one world, where it finished up at the very top of the chart, and yet completely bombed in another world, ranked 40th out of 48 tracks. Exactly the same song, up against exactly the same list of other songs; it was just that in this particular world, 52Metro never caught on.3 Success, sometimes, was a matter of luck.
Although the path to the top wasn’t set in stone, the researchers found that visitors were much more likely to download tracks they knew were liked by others. If a middling song got to the top of the charts early on by chance, its popularity could snowball. More downloads led to more downloads. Perceived popularity became real popularity, so that eventual success was just randomness magnified over time.
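That snowballing dynamic is easy to sketch in a few lines of code. The model below is purely illustrative – the ‘appeal’ values, visitor counts and choice rule are invented, not the Music Lab’s actual parameters – but it shows how the very same songs can produce different chart-toppers in parallel worlds once downloads feed back into visitors’ choices:

```python
import random

def simulate_world(appeal, n_visitors=2000, seed=0):
    """One 'world': each visitor picks a song with probability
    proportional to its intrinsic appeal plus its downloads so far."""
    rng = random.Random(seed)
    downloads = [0] * len(appeal)
    for _ in range(n_visitors):
        weights = [a + d for a, d in zip(appeal, downloads)]
        song = rng.choices(range(len(appeal)), weights=weights)[0]
        downloads[song] += 1
    return downloads

# The same 48 songs (fixed appeal) released into eight parallel worlds,
# each with its own independent stream of visitors.
song_rng = random.Random(42)
appeal = [song_rng.uniform(0.5, 1.5) for _ in range(48)]
charts = [simulate_world(appeal, seed=world) for world in range(8)]
winners = {max(range(48), key=chart.__getitem__) for chart in charts}
```

Run it and the download counts come out heavily skewed in every world; which song ends up on top typically varies from world to world, just as it did for 52Metro.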
There was a reason for these results. It’s a phenomenon known to psychologists as social proof. Whenever we haven’t got enough information to make decisions for ourselves, we have a habit of copying the behaviour of those around us. It’s why theatres sometimes secretly plant people in the audience to clap and cheer at the right times. As soon as we hear others clapping, we’re more likely to join in. When it comes to choosing music, it’s not that we necessarily have a preference for listening to the same songs as others, but that popularity is a quick way to insure yourself against disappointment. ‘People are faced with too many options,’ Salganik told LiveScience at the time. ‘Since you can’t listen to all of them, a natural short cut is to listen to what other people are listening to.’4
We use popularity as a proxy for quality in all forms of entertainment. For instance, a 2007 study looked into the impact of an appearance in the New York Times bestseller list on the public perception of a book. By exploiting the idiosyncrasies in the way the list is compiled, Alan Sorensen, the author of the study, tracked the success of books that should have been included on the basis of their actual sales, but – because of time lags and accidental omissions – weren’t, and compared them to those that did make it on to the list. He found a dramatic effect: just being on the list led to an increase in sales of 13–14 per cent on average, and a 57 per cent increase in sales for first-time authors.
The more platforms we use to see what’s popular – bestseller lists, Amazon rankings, Rotten Tomatoes scores, Spotify charts – the bigger the impact that social proof will have. The effect is amplified further when there are millions of options being hurled at us, plus marketing, celebrity, media hype and critical acclaim all demanding our attention.
All this means that sometimes terrible music can make it to the top. That’s not just me being cynical. During the 1990s, two British music producers – fully aware of this fact – were rumoured to have made a bet on who could get the worst song possible into the charts. Supposedly, the result of the wager was a girl group called Vanilla, whose debut song, ‘No way no way, mah na mah na’, was based on the famous Muppets ditty. It featured a performance that could only generously be described as singing, artwork that looked like it had been made in Microsoft Paint and a promotional video that had a good claim to being the worst ever shown.5 But Vanilla had some powerful allies. Thanks to a few magazine features and an appearance on BBC’s Top of the Pops, the song still managed to get to number 14 in the charts.fn2
Admittedly, the band’s success was short-lived. By their second single, their popularity was already waning. They never released a third. All of which does seem to suggest that social proof isn’t the only factor at play – as indeed a follow-up experiment from the Music Lab team showed.
The set-up to their second study was largely the same as the first. But this time, to test how far the perception of popularity became a self-fulfilling prophecy, the researchers added a twist. Once the charts had had the chance to stabilize in each world, they paused the experiment and flipped the billboard upside down. New visitors to the music player saw the chart topper listed at the bottom, while the flops at the bottom took on the appearance of the crème de la crème at the top.
Almost immediately, the total number of downloads by visitors dropped. Once the songs at the top weren’t appealing, people lost interest in the music on the website overall. The sharpest declines were in downloads for the turkeys, now at the top of the charts. Meanwhile, the good tracks languishing at the bottom did worse than when they were at the top, but still better than those that had previously been at the end of the list. If the scientists had let the experiment run on long enough, the very best songs would have recovered their popularity. Conclusion: the market isn’t locked into a particular state. Both luck and quality have a role to play.6
Back in reality – where there is only one world’s worth of data to go on – there’s a straightforward interpretation of the findings from the Music Lab experiments. Quality matters, and it’s not the same thing as popularity. That the very best songs recovered their popularity shows that some music is just inherently ‘better’. At one end of the spectrum, a sensational song by a fantastic artist should (at least in theory) be destined for success. But the catch is that the reverse doesn’t necessarily hold. Just because something is successful, that doesn’t mean it’s of a high quality.
Quite how you define quality is another matter altogether, which we’ll come on to in a moment. But for some, quality itself isn’t necessarily important. If you’re a record label, or a film producer, or a publishing house, the million-dollar question is: can you spot the guaranteed successes in advance? Can an algorithm pick out the hits?
Investing in movies is a risky business. Few films make money, most will barely break even, and flops are part of the territory.7 The stakes are high: when the costs of making a movie run into the tens or hundreds of millions, failure to predict the demand for the product can be catastrophically expensive.
That was a lesson learned the hard way by Disney with its film John Carter, released in 2012. The studio sank $350 million into making the movie, determined that it should sit alongside the likes of Toy Story and Finding Nemo as their next big franchise. Haven’t seen it? Me neither. The film failed to capture the public’s imagination and wound up making a loss of $200 million, resulting in the resignation of the head of Walt Disney Studios.8
The great and the good of Hollywood have always accepted that you just can’t accurately predict the commercial success of a movie. It’s the land of the gut-feel. Gambling on films that might bomb at the box office is just part of the job. In 1978, Jack Valenti, president and CEO of the Motion Picture Association of America, put it this way: ‘No one can tell you how a movie is going to do in the marketplace. Not until the film opens in a darkened theatre and sparks fly up between the screen and the audience.’9 Five years later, in 1983, William Goldman – the writer behind The Princess Bride and Butch Cassidy and the Sundance Kid – put it more succinctly: ‘Nobody knows anything.’10
But, as we’ve seen throughout this book, modern algorithms are routinely capable of predicting the seemingly unpredictable. Why should films be any different? You can measure the success of a movie, in revenue and in critical reception. You can measure all sorts of factors about the structure and features of a film: starring cast, genre, budget, running time, plot features and so on. So why not apply these same techniques to try and find the gems? To uncover which films are destined to triumph at the box office?
This has been the ambition of a number of recent scientific studies that aim to tap into the vast, rich depths of information collected and curated by websites like the Internet Movie Database (IMDb) or Rotten Tomatoes. And – perhaps unsurprisingly – there are a number of intriguing insights hidden within the data.
Take the study conducted by Sameet Sreenivasan in 2013.11 He realized that, by asking users to tag films with plot keywords, IMDb had created a staggeringly detailed catalogue of descriptors that could show how our taste in films has evolved over time. By the time of his study, IMDb had over 2 million films in its catalogue, spanning more than a century, each with multiple plot tags. Some keywords were high-level descriptions of the movie, like ‘organized-crime’ or ‘father-son-relationship’; others would be location-based, like ‘manhattan-new-york-city’, or about specific plot points, like ‘held-at-gunpoint’ or ‘tied-to-a-chair’.
On their own, the keywords showed that our interest in certain plot elements tends to come in bursts; think Second World War films or movies that tackle the subject of abortion. There’ll be a spate of releases on a similar topic in quick succession, and then a lull for a while. When considered together, the tags allowed Sreenivasan to come up with a score for the novelty of each film at the time of its release – a number between zero and one – that could be compared against box-office success.
If a particular plot point or key feature – like female nudity or organized crime – was a familiar aspect of earlier films, the keyword would earn the movie a low novelty score. But any original plot characteristics – like the introduction of martial arts in action films in the 1970s, say – would earn a high novelty score when the characteristic first appeared on the screen.
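A crude way to see how such a score might work – this is an illustrative proxy, not Sreenivasan’s actual measure – is to treat novelty as the fraction of a film’s keywords that have never appeared in any earlier film:

```python
def novelty(keywords, earlier_films):
    """Illustrative proxy for a novelty score: the share of a film's
    plot keywords that never appeared in any earlier film, from 0
    (entirely familiar) to 1 (entirely new)."""
    seen = set().union(*earlier_films) if earlier_films else set()
    fresh = [k for k in keywords if k not in seen]
    return len(fresh) / len(keywords)

# A toy 'history' of earlier films, tagged with IMDb-style keywords.
history = [
    {"organized-crime", "manhattan-new-york-city"},
    {"father-son-relationship", "held-at-gunpoint"},
]
print(novelty({"organized-crime", "held-at-gunpoint"}, history))  # 0.0: all familiar
print(novelty({"martial-arts", "held-at-gunpoint"}, history))     # 0.5: one new element
```

Sreenivasan’s real score also accounted for how recently and how often each keyword had been used, but the principle is the same: a number between zero and one that can be lined up against box-office takings.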
As it turns out, we have a complicated relationship with novelty. On average, the higher a film’s novelty score, the better it did at the box office. But only up to a point. Push past that threshold and there’s a precipice waiting: revenue fell off a cliff for anything that scored over 0.8. Sreenivasan’s study showed what social scientists had long suspected: we’re put off by the banal, but also hate the radically unfamiliar. The very best films sit in a narrow sweet spot between ‘new’ and ‘not too new’.
The novelty score might be a useful way to help studios avoid backing absolute stinkers, but it’s not much help if you want to know the fate of an individual film. For that, the work of a European team of researchers may be more useful. They discovered a connection between the number of edits made to a film’s Wikipedia page in the month leading up to its cinematic release and the eventual box-office takings.12 The edits were often made by people unconnected to the release – just typical movie fans contributing information to the page. More edits implied more buzz around a release, which in turn led to higher takings at the box office.
Their model had modest predictive power overall: out of 312 films in the study, they correctly forecast the revenue of 70 movies with an accuracy of 70 per cent or over. But the better a film did, and the more edits were made to the Wikipedia page, the more data the team had to go on and the more precise the predictions they made. The box-office takings of six high-earning films were correctly forecast to 99 per cent accuracy.
These studies are intellectually interesting, but a model that works only a month before a film’s release isn’t much use for investors. How about tackling the question head-on instead: take all the factors that are known earlier in the process – the genre, the celebrity status of the leading actors, the age guidance rating (PG, 12, etc.) – and use a machine-learning algorithm to predict whether a film will be a hit?
One famous study from 2005 did just that, using a neural network to try to predict the performance of films long before their release in the cinema.13 To make things as simple as possible, the authors did away with trying to forecast the revenue exactly, and instead tried to classify movies into one of nine categories, ranging from total flop to box-office smash hit. Unfortunately, even with that step to simplify the problem, the results left a lot to be desired. The neural network outperformed any statistical techniques that had been tried before, but still managed to classify the performance of a movie correctly only 36.9 per cent of the time on average. It was a little better in the top category – those earning over $200 million – correctly identifying those real blockbusters 47.3 per cent of the time. But investors beware. Around 10 per cent of the films picked out by the algorithm as destined to be hits went on to earn less than $20 million – which by Hollywood’s standards is a pitiful amount.
Other studies since have tried to improve on these predictions, but none has yet made a significant leap forward. All the evidence points in a single direction: until you have data on the early audience reaction, popularity is largely unpredictable. When it comes to picking the hits from the pile, Goldman was right. Nobody knows anything.
So predicting popularity is tricky. There’s no easy way to prise apart what we all like from why we like it. And that poses rather a problem for algorithms in the creative realm. Because if you can’t use popularity to tell you what’s ‘good’ then how can you measure quality?
This is important: if we want algorithms to have any kind of autonomy within the arts – either to create new works, or to give us meaningful insights into the art we create ourselves – we’re going to need some kind of measure of quality to go on. There has to be an objective way to point the algorithm in the right direction, a ‘ground truth’ that it can refer back to. Like an art analogy of ‘this cluster of cells is cancerous’ or ‘the defendant went on to commit a crime’. Without it, making progress is tricky. We can’t design an algorithm to compose or find a ‘good’ song if we can’t define what we mean by ‘good’.
Unfortunately, in trying to find an objective measure of quality, we come up against a deeply contentious philosophical question that dates back as far as Plato. One that has been the subject of debate for more than two millennia. How do you judge the aesthetic value of art?
Some philosophers – like Gottfried Leibniz – argue that if there are objects that we can all agree on as beautiful, say Michelangelo’s David or Mozart’s Lacrimosa, then there should be some definable, measurable, essence of beauty that makes one piece of art objectively better than another.
But on the other hand, it’s rather rare for everyone to agree. Other philosophers, such as David Hume, argue that beauty is in the eye of the beholder. Consider the work of Andy Warhol, for instance, which offers a powerful aesthetic experience to some, while others find it artistically indistinguishable from a tin of soup.
Others still, Immanuel Kant among them, have said the truth is something in between. That our judgements of beauty are not wholly subjective, nor can they be entirely objective. They are sensory, emotional and intellectual all at once – and, crucially, can change over time depending on the state of mind of the observer.
There is certainly some evidence to support this idea. Fans of Banksy might remember how he set up a stall in Central Park, New York, in 2013, anonymously selling original black-and-white spray-painted canvases for $60 each. The stall was tucked away in a row of others selling the usual touristy stuff, so the price tag must have seemed expensive to those passing by. It was several hours before someone decided to buy one. In total, the day’s takings were $420.14 But a year later, in an auction house in London, another buyer would deem the aesthetic value of the very same artwork great enough to tempt them to spend £68,000 (around $115,000 at the time) on a single canvas.15
Admittedly, Banksy isn’t popular with everyone. (Charlie Brooker – creator of Black Mirror – once described him as ‘a guffhead [whose] work looks dazzlingly clever to idiots’.)16 So you might argue this story is merely evidence of the fact that Banksy’s work doesn’t have inherent quality. It’s just popular hype (and social proof) that drives those eye-wateringly high prices. But our fickle aesthetic judgement has also been observed in respect of art forms that are of undeniably high quality.
My favourite example comes from an experiment conducted by the Washington Post in 2007.17 The paper asked the internationally renowned violinist Joshua Bell to add an extra concert to his schedule of sold-out symphony halls. Armed with his $3.5 million Stradivarius violin, Bell pitched up at the top of an escalator in a metro station in Washington DC during morning rush hour, put a hat on the ground to collect donations and performed for 43 minutes. As the Washington Post put it, here was one of ‘the finest classical musicians in the world, playing some of the most elegant music ever written on one of the most valuable violins ever made’. The result? Seven people stopped to listen for a while. Over a thousand more walked straight past. By the end of his performance, Bell had collected a measly $32.17 in his hat.
What we consider ‘good’ also changes. The appetite for certain types of classical music has been remarkably resilient to the passing of time, but the same can’t be said for other art forms. Armand Leroi, a professor of evolutionary biology at Imperial College London, has studied the evolution of pop music, and found clear evidence of our changing tastes in the analysis. ‘There’s an intrinsic boredom threshold in the population. There’s just a tension that builds as people need something new.’18
By way of an example, consider the drum machines and synthesizers that became fashionable in late-1980s pop – so fashionable that the diversity of music in the charts plummeted. ‘Everything sounds like early Madonna or something by Duran Duran,’ Leroi explains. ‘And so maybe you say, “OK. We’ve reached the pinnacle of pop. That’s where it is. The ultimate format has been found.”’ Except, of course, it hadn’t. Shortly afterwards, the musical diversity of the charts exploded again with the arrival of hip hop. Was there something special about hip hop that caused the change? I asked Leroi. ‘I don’t think so. It could have been something else, but it just happened to be hip hop. To which the American consumer responded and said, “Well, this is something new, give us more of it.”’
The point is this. Even if there are some objective criteria that make one artwork better than another, as long as context plays a role in our aesthetic appreciation of art, it’s not possible to create a tangible measure for aesthetic quality that works for all places in all times. Whatever statistical techniques, or artificial intelligence tricks, or machine-learning algorithms you deploy, trying to use numbers to latch on to the essence of artistic excellence is like clutching at smoke with your hands.
But an algorithm needs something to go on. So, once you take away popularity and inherent quality, you’re left with the only thing that can be quantified: a metric for similarity to whatever has gone before.
There’s still a great deal that can be done using measures of similarity. When it comes to building a recommendation engine, like the ones found in Netflix and Spotify, similarity is arguably the ideal measure. Both companies have a way to help users discover new films and songs, and, as subscription services, both have an incentive to accurately predict what users will enjoy. They can’t base their algorithms on what’s popular, or users would just get bombarded with suggestions for Justin Bieber and Peppa Pig The Movie. Nor can they base them on any kind of proxy for quality, such as critical reviews, because if they did the home page would be swamped by arthouse snooze-fests, when all people actually want to do is kick off their shoes after a long day at work and lose themselves in a crappy thriller or stare at Ryan Gosling for two hours.
Similarity, by contrast, allows the algorithm to put the focus squarely on the individual’s preferences. What do they listen to, what do they watch, what do they return to time and time again? From there, you can use IMDb or Wikipedia or music blogs or magazine articles to pull out a series of keywords for each song or artist or movie. Do that for the entire catalogue, and then it’s a simple step to find and recommend other songs and films with similar tags. Then, in addition, you can find other users who liked similar films and songs, see what other songs and films they enjoyed and recommend those to your user.
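That ‘simple step’ can be as basic as counting shared tags. The sketch below uses Jaccard similarity – the size of the overlap between two tag sets divided by the size of their union – on a made-up three-song catalogue. Real recommendation engines are far more elaborate, but the principle is the same:

```python
def jaccard(a, b):
    """Overlap between two tag sets: 1.0 means identical, 0.0 means nothing shared."""
    return len(a & b) / len(a | b)

# A toy catalogue: each song reduced to a set of keywords.
catalogue = {
    "song_a": {"indie", "guitar", "melancholy"},
    "song_b": {"indie", "guitar", "upbeat"},
    "song_c": {"electronic", "dance", "upbeat"},
}

def recommend(liked, catalogue, n=1):
    """Rank the rest of the catalogue by tag overlap with a song the user liked."""
    others = [title for title in catalogue if title != liked]
    ranked = sorted(others,
                    key=lambda t: jaccard(catalogue[liked], catalogue[t]),
                    reverse=True)
    return ranked[:n]

print(recommend("song_a", catalogue))  # ['song_b']
```

A fan of song_a gets pointed at song_b, which shares two of its three tags; song_c, with no tags in common, sinks to the bottom of the list.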
At no point is Spotify or Netflix trying to deliver the perfect song or film. They have little interest in perfection. Spotify Discover doesn’t promise to hunt out the one band on earth that is destined to align wholly and flawlessly with your taste and mood. The recommendation algorithms merely offer you songs and films that are good enough to insure you against disappointment. They’re giving you an inoffensive way of passing the time. Every now and then they will come up with something that you absolutely love, but it’s a bit like cold reading in that sense. You only need a strike every now and then to feel the serendipity of discovering new music. The engines don’t need to be right all the time.
Similarity works perfectly well for recommendation engines. But when you ask algorithms to create art without a pure measure for quality, that’s where things start to get interesting. Can an algorithm be creative if its only sense of art is what happened in the past?
In October 1997, an audience arrived at the University of Oregon to be treated to a rather unusual concert. A lone piano sat on the stage at the front. Then the pianist Winifred Kerner took her place at the keys, poised to play three short separate pieces.
One was a lesser-known keyboard composition penned by the master of the baroque, Johann Sebastian Bach. A second was composed in the style of Bach by Steve Larson, a professor of music at the university. And a third was composed by an algorithm, deliberately designed to imitate the style of Bach.
After hearing the three performances, the audience were asked to guess which was which. To Steve Larson’s dismay, the majority voted that his was the piece that had been composed by the computer. And to collective gasps of delighted horror, the audience were told the music they’d voted as genuine Bach was nothing more than the work of a machine.
Larson wasn’t happy. In an interview with the New York Times soon after the experiment, he said: ‘My admiration for [Bach’s] music is deep and cosmic. That people could be duped by a computer program was very disconcerting.’
He wasn’t alone in his discomfort. David Cope, the man who created the remarkable algorithm behind the computer composition, had seen this reaction before. ‘I [first] played what I called the “game” with individuals,’ he told me. ‘And when they got it wrong they got angry. They were mad enough at me for just bringing up the whole concept. Because creativity is considered a human endeavour.’19
This had certainly been the opinion of Douglas Hofstadter, the cognitive scientist and author who had organized the concert in the first place. Almost two decades earlier, in his 1979 Pulitzer Prize-winning book Gödel, Escher, Bach, Hofstadter had taken a firm stance on the matter:
Music is a language of emotions, and until programs have emotions as complex as ours, there is no way a program will write anything beautiful … To think that we might be able to command a pre-programmed ‘music box’ to bring forth pieces which Bach might have written is a grotesque and shameful mis-estimation of the depth of the human spirit.20
But after hearing the output of Cope’s algorithm – the so-called ‘Experiments in Musical Intelligence’ (EMI) – Hofstadter conceded that perhaps things weren’t quite so straightforward: ‘I find myself baffled and troubled by EMI,’ he confessed in the days following the University of Oregon experiment. ‘The only comfort I could take at this point comes from realizing that EMI doesn’t generate style on its own. It depends on mimicking prior composers. But that is still not all that much comfort. To my absolute devastation [perhaps] music is much less than I ever thought it was.’21
So which is it? Is aesthetic excellence the sole preserve of human endeavour? Or can an algorithm create art? And if an audience couldn’t distinguish EMI’s music from that of a great master, had this machine demonstrated the capacity for true creativity?
Let’s try and tackle those questions in turn, starting with the last one. To form an educated opinion, it’s worth pausing briefly to understand how the algorithm works.fn3 Something David Cope was generous enough to explain to me.
The first step in building the algorithm was to translate Bach’s music into something that can be understood by a machine: ‘You have to place into a database five representations of a single note: the on time, the duration, pitch, loudness and instrument.’ For each note in Bach’s back catalogue, Cope had to painstakingly enter these five numbers into a computer by hand. There were 371 Bach chorales alone, many harmonies, tens of thousands of notes, five numbers per note. It required a monumental effort from Cope: ‘For months, all I was doing every day was typing in numbers. But I’m a person who is nothing but obsessive.’
From there, Cope’s analysis took each beat in Bach’s music and examined what happened next. For every note that is played in a Bach chorale, Cope made a record of the next note. He stored everything together in a kind of dictionary – a bank in which the algorithm could look up a single chord and find an exhaustive list of all the different places Bach’s quill had sent the music next.
In that sense, EMI has some similarities to the predictive text algorithms you’ll find on your smartphone. Based on the sentences you’ve written in the past, the phone keeps a dictionary of the words you’re likely to want to type next and brings them up as suggestions as you’re writing.fn4
The final step was to let the machine loose. Cope would seed the system with an initial chord and instruct the algorithm to look it up in the dictionary to decide what to play next, by selecting the new chord at random from the list. Then the algorithm repeats the process – looking up each subsequent chord in the dictionary to choose the next notes to play. The result was an entirely original composition that sounds just like Bach himself.fn5
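Cope’s dictionary-and-lookup idea can be sketched as a simple Markov chain. The toy ‘chorales’ below are just chord symbols rather than Cope’s five-number note representations, but the mechanism – look up the current chord, pick at random from everything that ever followed it – is the same:

```python
import random
from collections import defaultdict

def build_dictionary(pieces):
    """Cope-style lookup table: for each chord, every chord that
    ever followed it anywhere in the source pieces."""
    table = defaultdict(list)
    for piece in pieces:
        for current, nxt in zip(piece, piece[1:]):
            table[current].append(nxt)
    return table

def compose(table, seed_chord, length, rng):
    """Seed the system with a chord, then repeatedly pick at random
    from the list of 'what came next' in the source material."""
    piece = [seed_chord]
    while len(piece) < length and piece[-1] in table:
        piece.append(rng.choice(table[piece[-1]]))
    return piece

# Two toy 'chorales' stand in for Bach's painstakingly hand-entered catalogue.
chorales = [["C", "F", "G", "C"], ["C", "G", "Am", "F", "C"]]
table = build_dictionary(chorales)
new_piece = compose(table, "C", 8, random.Random(1))
```

Every transition in the new piece is one the ‘composer’ of the source material actually wrote, yet the sequence as a whole need not appear anywhere in the originals – which is exactly the property that makes EMI’s output sound like Bach without copying him note for note.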
Or maybe it is Bach himself. That’s Cope’s view, anyway. ‘Bach created all of the chords. It’s like taking Parmesan cheese and putting it through the grater, and then trying to put it back together again. It would still turn out to be Parmesan cheese.’
Regardless of who deserves the ultimate credit, there’s one thing that is in no doubt. However beautiful EMI’s music may sound, it is based on a pure recombination of existing work. It’s mimicking the patterns found in Bach’s music, rather than actually composing any music itself.
More recently, other algorithms have been created that make aesthetically pleasing music that is a step on from pure recombination. One particularly successful approach has been genetic algorithms – another type of machine learning, which tries to exploit the way natural selection works. After all, if peacocks are anything to go by, evolution knows a thing or two about creating beauty.
The idea is simple. Within these algorithms, notes are treated like the DNA of music. It all starts with an initial population of ‘songs’ – each a random jumble of notes stitched together. Over many generations, the algorithm breeds from the songs, finding and rewarding ‘beautiful’ features within the music to breed ‘better’ and better compositions as time goes on. I say ‘beautiful’ and ‘better’, but – of course – as we already know, there’s no way to decide what either of those words means definitively. The algorithm can create poems and paintings as well as music, but – still – all it has to go on is a measure of similarity to whatever has gone before.
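Here’s the idea in miniature. Since there’s no true measure of beauty to reward, this sketch does exactly what the text describes and scores each candidate ‘song’ purely by its similarity to a past melody – here an ascending scale, standing in for ‘whatever has gone before’. All the numbers are invented for illustration:

```python
import random

TARGET = [60, 62, 64, 65, 67, 69, 71, 72]  # a 'past hit': an ascending C major scale

def fitness(song):
    """Similarity to what has gone before: negative distance to the target melody."""
    return -sum(abs(a - b) for a, b in zip(song, TARGET))

def evolve(generations=200, pop_size=30, seed=0):
    rng = random.Random(seed)
    # Initial population: random jumbles of MIDI note numbers.
    pop = [[rng.randint(55, 77) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # selection: keep the 'best'
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(TARGET))   # crossover: splice two parents
            child = mum[:cut] + dad[cut:]
            if rng.random() < 0.3:                # mutation: tweak one note
                child[rng.randrange(len(child))] = rng.randint(55, 77)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

Over the generations the population drifts ever closer to the target – which is precisely the limitation the text describes: the algorithm ‘breeds’ beauty only in the sense of breeding resemblance to what it has already been shown.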
And sometimes that’s all you need. If you’re looking for a background track for your website or your YouTube video that sounds generically like a folk song, you don’t care that it’s similar to all the best folk songs of the past. Really, you just want something that avoids copyright infringement without the hassle of having to compose it yourself. And if that’s what you’re after, there are a number of companies that can help. British startups Jukedeck and AI Music are already offering this kind of service, using algorithms that are capable of creating music. Some of that music will be useful. Some of it will be (sort of) original. Some of it will be beautiful, even. The algorithms are undoubtedly great imitators, just not very good innovators.
That’s not to do these algorithms a disservice. Most human-made music isn’t particularly innovative either. If you ask Armand Leroi, the evolutionary biologist who studied the cultural evolution of pop music, we’re a bit too misty-eyed about the inventive capacities of humans. Even the stand-out successes in the charts, he says, could be generated by a machine. Here’s his take on Pharrell Williams’ ‘Happy’, for example (something tells me he’s not a fan):
‘Happy, happy, happy, I’m so happy.’ I mean, really! It’s got about, like, five words in the lyrics. It’s about as robotic a song as you could possibly get, which panders to just the most base human desire for uplifting summer happy music. The most moronic and reductive song possible. And if that’s the level – well it’s not too hard.
Leroi doesn’t think much of the lyrical prowess of Adele either: ‘If you were to analyse any of the songs you would find no sentiment in there that couldn’t be created by a sad song generator.’
You may not agree (I’m not sure I do), but there is certainly an argument that much of human creativity – like the products of the ‘composing’ algorithms – is just a novel combination of pre-existing ideas. As Mark Twain put it:
There is no such thing as a new idea. It is impossible. We simply take a lot of old ideas and put them into a sort of mental kaleidoscope. We give them a turn and they make new and curious combinations. We keep on turning and making new combinations indefinitely; but they are the same old pieces of colored glass that have been in use through all the ages.22
Cope, meanwhile, has a very simple definition for creativity, which easily encapsulates what the algorithms can do: ‘Creativity is just finding an association between two things which ordinarily would not seem related.’
Perhaps. But I can’t help feeling that if EMI and algorithms like it are exhibiting creativity, then it’s a rather feeble form. Their music might be beautiful, but it is not profound. And try as I might, I can’t quite shake the feeling that seeing the output of these machines as art leaves us with a rather culturally impoverished view of the world. It’s cultural comfort food, maybe. But not art with a capital A.
In researching this chapter, I’ve come to realize that the source of my discomfort about algorithms making art lies in a different question. The real issue is not whether machines can be creative. They can. It is about what counts as art in the first place.
I’m a mathematician. I can trade in facts about false positives and absolute truths about accuracy and statistics with complete confidence. But in the artistic sphere I’d prefer to defer to Leo Tolstoy. Like him, I think that true art is about human connection; about communicating emotion. As he put it: ‘Art is not a handicraft, it is the transmission of feeling the artist has experienced.’23 If you agree with Tolstoy’s argument then there’s a reason why machines can’t produce true art. A reason expressed beautifully by Douglas Hofstadter, years before he encountered EMI:
A ‘program’ which could produce music … would have to wander around the world on its own, fighting its way through the maze of life and feeling every moment of it. It would have to understand the joy and loneliness of a chilly night wind, the longing for a cherished hand, the inaccessibility of a distant town, the heartbreak and regeneration after a human death. It would have to have known resignation and world-weariness, grief and despair, determination and victory, piety and awe. It would have had to commingle such opposites as hope and fear, anguish and jubilation, serenity and suspense. Part and parcel of it would have to be a sense of grace, humour, rhythm, a sense of the unexpected – and of course an exquisite awareness of the magic of fresh creation. Therein, and only therein, lie the sources of meaning in music.24
I might well be wrong here. Perhaps if algorithmic art takes on the appearance of being a genuine human creation – as EMI did – we’ll still value it, and bring our own meaning to it. After all, the long history of manufactured pop music seems to hint that humans can form an emotional reaction to something that has no more than the semblance of an authentic connection. And perhaps once these algorithmic artworks become more commonplace and we become aware that the art didn’t come from a human, we won’t be bothered by the one-way connection. After all, people form emotional relationships with objects that don’t love them back – like treasured childhood teddy bears or pet spiders.
But for me, true art can’t be created by accident. There are boundaries to the reach of algorithms. Limits to what can be quantified. Among all of the staggeringly impressive, mind-boggling things that data and statistics can tell me, how it feels to be human isn’t one of them.