Evaluating Creativity in Computers

Computational creativity tries to convince people not that computers are better, but that they can contribute to the world with more creativity. Computers add more stuff to like.

—Tony Veale25

Many scientists do not consider creativity in computers to be merely a philosophical question. From the earliest days of computer science, there have been researchers who study the ways in which computers can at least appear to be creative. A group of these launched the field of computational creativity research at the turn of the twenty-first century. Its proponents consider it a subfield of AI, and their work is influenced by currents in AI, computer science, and cognitive psychology, as well as in art, literature, and music. Several of its leading lights appear in this book, among them Simon Colton, Pablo Gervás, Anna Jordanous, Rafael Pérez y Pérez, Graeme Ritchie, Tony Veale, and Geraint Wiggins. In an influential paper, Colton and Wiggins define computational creativity as follows: “The philosophy, science and engineering of computational systems which by taking on particular responsibilities, exhibit behaviours that unbiased observers would deem to be creative.”26 There are two striking terms here—“responsibilities” and “unbiased.” According to this definition, the computer is not passive; it is not a support tool like Adobe Photoshop. It has creative responsibilities, such as taking aesthetic criteria into account and producing explanations and commentaries alongside the works it creates. In shaping this definition, Colton had in mind his own painting system, The Painting Fool. He asserts “that software has to earn itself a place at the table in the discussion of what creativity means.”27

Unbiased observers are those who do not begin with the assumption that computers can be creative and therefore need to be persuaded by hard evidence.

Proponents of computational creativity tend to prefer a top-down approach with grand overarching rules, equipping the system with handcrafted software and huge databases. They claim that this is how we think, using a toolbox of knowledge acquired from years of study and reflection. Poet Nick Montfort, for one, is critical of this approach. Another poet, Allison Parrish, disagrees with the belief that computers should produce work as close as possible to ours. She prefers to seek out “new ways for poetry to exist.”28

Geraint Wiggins and the Mind’s Chorus

Creativity is a value problem which needs to be solved.

—Geraint Wiggins29

Geraint Wiggins, a professor of computational creativity at Queen Mary University of London, is an important figure in the field. He studies creativity in music, largely adhering to the guidelines of computational creativity. To Wiggins, “Music is like mathematics because it need not have real-world meaning, like language or figurative art. Music can express emotions but can’t say ‘the glass is on the table.’”30

“From an early age, I have been interested in the use of computers for music,” Wiggins tells me.31 By the time he went to university in Edinburgh, he was already an accomplished French horn player and church organist. But his passion for computer science won out, and he completed a PhD in the subject in 1997. At that time Edinburgh had just instituted a PhD course in musical composition, and he decided to embark on a second PhD, thus achieving his wish to combine computer science and music.

Wiggins has strong opinions on creativity. He rejects the concept of genius, which, he believes, skews the creativity debate. He also disputes the notion of big-C and little-c creativities, seeing creativity as a continuum rather than two poles. The concept of big-C Creativity, he argues, was an invention of the Romantic era of the late eighteenth and early nineteenth centuries, when thinkers rebelled against the Age of Enlightenment, in which rational scientific thinking held sway. The Romantics asserted that science cannot explain everything and postulated some indefinable creative impulse, taking Beethoven’s genius as a prime example.

In recent years, Wiggins has speculated on what he calls “spontaneous creativity” or “creativity before consciousness”—in other words, unconscious thinking. He claims this is the basis of musical composition.32 Instead of what I refer to as intersecting lines of thought, he uses the phrase “the mind’s chorus,” referring to information stored in our memories that speaks to us—is retrieved—when we start working on a problem.

As an historic example of ideas apparently springing out of nowhere, he cites the letter Mozart supposedly wrote to his father in which he described how he composed not in real time, but as a burst, “all at once.”33 As noted earlier, scholars now doubt the letter’s authenticity. Nevertheless, the words are uncannily similar to Poincaré’s description of his experience when the solution to a problem that he had been laboring over suddenly emerged just as he was stepping up into an omnibus.

The powers of unconscious thought never cease to amaze, nor do the workings of the hidden layers of artificial neural networks. Both, in time, can be unraveled.

Graeme Ritchie’s Mathematical Criteria for Measuring the Creativity of a Computer Program

People have suggested that humor is AI-complete, so if you can do humor, you can do AI.

—Graeme Ritchie34

Ritchie’s rather inscrutable statement needs unpacking: the most difficult problems in the AI field are known as AI-complete, meaning that they are so difficult that solving them is equivalent to solving the central problem of AI—developing computers that are equal in intelligence to people. Graeme Ritchie, honorary senior fellow in computational linguistics at the University of Aberdeen, specializes in computational humor, a challenging field indeed.

Ritchie is interested not in human creativity or in machine creativity but in how to assess the degree of creativity a particular computer program possesses.35 He starts from the premise that there is no agreed definition of creativity, and that it is impossible to discuss undefined terms. Nor does he consider defining creativity in terms of the amount of novelty, value, or surprise its products possess to be much help.

He believes that the best way to assess a program’s creativity is by using human judgment. To this end he has devised a complex system for determining whether a particular program is creative by examining its products, such as a novel or a work of art, to which human judgment can be applied. The key point is the amount of information and material—data—that has been fed into the machine. The greater the amount of data the program needs, the smaller its actual creativity.

In other words, it all depends on the extent to which the program has been fine-tuned to create the desired result—that is, on how far the final product differs from the initial catalyst, which he dubs the inspiring set: the input data that sets the program in motion, such as musical notes, words, or visual images. The more fine-tuned the program is, the less creative it is, in his assessment.
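The flavor of this can be made concrete with a toy sketch of two ratio-style measures over a program's outputs and its inspiring set. The function names, the stand-in value function, and the thresholds below are illustrative assumptions, not Ritchie's published formulas.

```python
# Toy sketch in the spirit of Ritchie's ratio-based criteria.
# Names and thresholds are illustrative, not Ritchie's own.

def novelty_ratio(results, inspiring_set):
    """Proportion of outputs that do not simply reproduce the input data."""
    novel = [r for r in results if r not in inspiring_set]
    return len(novel) / len(results)

def quality_ratio(results, value, threshold=0.5):
    """Proportion of outputs a judge rates at or above a value threshold."""
    good = [r for r in results if value(r) >= threshold]
    return len(good) / len(results)

inspiring_set = {"ab", "ba"}                    # the data fed to the program
results = ["ab", "abba", "baab", "ba"]          # the program's outputs
value = lambda r: min(len(r) / 4, 1.0)          # stand-in for human judgment

print(novelty_ratio(results, inspiring_set))            # 0.5: half the outputs are new
print(quality_ratio(results, value, threshold=0.75))    # 0.5: half rate highly
```

On this toy reading, a program whose outputs mostly echo its inspiring set scores low on novelty however highly judges rate them, which captures Ritchie's point that heavy fine-tuning counts against creativity.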

His basic criterion is that the program has to create an artifact (a piece of writing, a work of art, or a piece of music) that can be assessed by people, a criterion which he claims places the computer program on the same level as a human artist, writer, or composer.

Ritchie has drawn up a list of highly mathematical criteria framed in equations to measure the level of creativity of any particular program. The best way to discuss them is to show them in action by describing the experiences of creativity researchers who have applied them.

Anna Jordanous, senior lecturer at the School of Computing at the University of Kent, has written at length on the difficulties inherent in using the methods currently available to evaluate whether an artifact is novel. She used Ritchie’s criteria to compare four jazz-improvisation systems, including one of her own, in which computers improvised alongside human musicians.36 Ritchie’s criteria, she found, “failed to guide my improvising system to more creative behaviour,” and she concluded that “using measures of creative output is not good enough for evaluating creativity.”37 In other words, measuring creativity does not help in evaluating or nurturing it.

Simon Colton, creator of The Painting Fool, suggests a “creative tripod”—skill, appreciation, and imagination—all of which must be present to give the “perception of creativity,” though this seems a little too theoretical to be practicable. In any case, few people except Colton actually use his tripod, Jordanous tells me.38

Tony Veale, creator of Scéalextric and Metaphor Magnet, among much else, appreciates the value of Ritchie’s criteria in evaluating software, but prefers to opt for crowd-sourced feedback from anonymous judges.39

Pablo Gervás, creator of poetic algorithms, assiduously applied Ritchie’s criteria to one of his computer poetry programs. He later met with Ritchie to discuss his results. Ritchie informed him that his criteria were only intended to be used “for a particular application”—that is, piecemeal.40

It seems that Ritchie’s criteria are only of use when the inspiring set, the initial information that generates the putative creative process, can be absolutely pinned down. But this is not the case in many computer programs that generate combinations of words. For this, it is simply too difficult to find a way to measure the value of novelty—which is precisely what Ritchie is trying to do with his criteria.41 We await further results.

Anna Jordanous’s Fourteen Components of Creativity

Creativity doesn’t need to be good.

—Anna Jordanous42

Anna Jordanous is trying to find a systematic way to evaluate whether an algorithm is creative and precisely how creative it is.43 “Creativity evaluation,” she writes, “has been described as the ‘Grand Challenge’ for computational creativity research.”44 To do this, she feels, we have to be clear as to what precisely creativity is.

Evaluating the creative systems developed by researchers into computational creativity is not easy. Jordanous began by asking, “What do people mean by the word creativity in academic discussions held across disciplines?”45 Working with Bill Keller, a language expert and colleague at the University of Kent, she took thirty digitized academic papers that examined the subject from various angles and applied language processing and statistical analysis to pick out the words most frequently associated with creativity.46 They then clustered similar terms and isolated fourteen components. This was not an exhaustive search. They were only looking for themes that popped out of academic discussions of creativity.
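Jordanous and Keller's actual study used large corpora and a statistical association measure; the following is only a toy illustration of the underlying idea, ranking words by how much more often they occur in papers about creativity than in general text. The tiny word lists are invented for the example.

```python
# Toy illustration of corpus comparison: which words are distinctive
# of a "creativity" corpus relative to a background corpus?
from collections import Counter

creativity_papers = ("novelty value originality process evaluation "
                     "novelty domain originality process").split()
background = "the of and process to in value a that is".split()

fg, bg = Counter(creativity_papers), Counter(background)
n_fg, n_bg = len(creativity_papers), len(background)

def relative_rate(word):
    # Ratio of relative frequencies, with add-one smoothing for unseen words.
    return ((fg[word] + 1) / n_fg) / ((bg[word] + 1) / n_bg)

ranked = sorted(fg, key=relative_rate, reverse=True)
print(ranked[:3])   # the words most distinctive of the creativity papers
```

Words like "novelty" and "originality" surface at the top because they are common in the creativity corpus and rare in ordinary prose; clustering such distinctive terms into themes is, in spirit, how the fourteen components were isolated.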

Jordanous and Keller’s fourteen components of creativity are as follows: “Active involvement & Persistence, Dealing with uncertainty, Domain competence, General intellect, Generating results, Independence & freedom, Intention & emotional involvement, Originality, Progression & development, Social interaction and communication, Spontaneity & subconscious processing, Thinking & evaluation, Value, and Variety, divergence & experimentation.”47 Each of these can be expounded and expanded on. Depending on the area, some are more important than others.

She then asked thirty-four people to apply these criteria to jazz improvisations carried out by computer systems playing alongside human musicians.48 Significantly, the judges ranked value—that is, the quality of the music—very near the bottom, twelfth out of fourteen, whereas active involvement and persistence—getting on with it, doing one’s best—was ranked fourth. The most important factor was social interaction and communication, the computer seeming to interact with the human musicians, followed by intention and emotional involvement, perhaps the computer seeming to be involved in what it was doing, then domain competence, skill at its instrument—all of which makes sense.
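A back-of-the-envelope sketch shows how such component ratings might be combined into a single score. The particular numbers and weights below are invented for illustration; Jordanous's method leaves the weighting to the evaluator's judgment of what matters in the domain.

```python
# Minimal sketch of component-based evaluation: judges score a system
# on each component, and components are weighted by how much they
# matter for the domain (here, jazz improvisation). All numbers invented.

scores = {   # average judge ratings on a 0-10 scale
    "social interaction and communication": 8.0,
    "intention and emotional involvement": 7.5,
    "domain competence": 7.0,
    "value": 4.0,
}
weights = {  # relative importance for jazz improvisation; sums to 1
    "social interaction and communication": 0.4,
    "intention and emotional involvement": 0.3,
    "domain competence": 0.2,
    "value": 0.1,
}

overall = sum(scores[c] * weights[c] for c in scores)
print(round(overall, 2))   # prints 7.25
```

Because value carries little weight here, a system can score well overall despite mediocre output quality, mirroring the judges' finding that process matters more than product.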

“It doesn’t really matter what the end results are in terms of creativity,” she says. “It doesn’t need to be good. In fact, there is a guy who plays a solo with just one note, a very high saxophone note, held for almost an entire chorus. You wouldn’t say it was technically, ‘Wow, amazing,’ but it was the way he played it.”49 The key thing is, a performance doesn’t need to be good to be creative. In this way, Jordanous could apply these criteria to work out what to focus on to make her system more creative. The key point is, she says, that “people in computational creativity have come around to the idea that process must be taken into account.” All fourteen components are to do with the process, with the activity of making the product. The quality of the end result is secondary.

Jordanous emphasizes that her fourteen evaluation guidelines free evaluators from being tied down to any one fixed definition of creativity. There is nothing wrong with evaluating, so long as the sample is large enough to eliminate differences of opinion over what is and is not creative.

Notes