13   Ahmed Elgammal’s Creative Adversarial Networks

I really believe that to advance AI you have to look at how people digest art.

—Ahmed Elgammal132

Egyptian-born computer scientist Ahmed Elgammal has a mission: to find a way for a machine to create new, original, and exciting artworks—not more of the same “in the style of” existing artworks, nor so far out as to be dismissed as bizarre, but artworks that stand comparison with works of the greatest contemporary artists.

Elgammal recalls with regret that when he finished high school he had to “choose between archaeology, art history, and computer science.”133 He decided to study computer science at the University of Alexandria, but “never gave up on his passion for art and history.” As a graduate student, he combined his interests by studying computer vision, the way in which computers understand images by processing them in ways analogous to human vision, as in convolutional neural networks (ConvNets), the networks behind DeepDream.

Elgammal was convinced that “art and engineering are two sides of the same coin that can push the boundaries of what machines can do beyond game playing or driving a car.”134 As a professor in the Department of Computer Science at Rutgers University, in 2014 he established the Art and Artificial Intelligence Laboratory, with the subtitle Advancing AI Technology in the Digital Humanities.

Initially the lab trained machines to distinguish different art styles—to tell the difference between, for example, Renaissance and baroque art. To do so, they used the WikiArt dataset, a huge online collection of 81,449 fine art paintings in twenty-seven different styles and forty-five different genres (including interiors, landscapes, and sculptures) by 1,119 artists, spanning the years 1400 to 2000—essentially, the entire canon of Western art. They supplemented this with further information, such as the sorts of brushstrokes used, and with data on visual similarities between works, which reveal which artists influenced which others.

Elgammal and his coworkers went on to explore creativity, which they sought to quantify by looking at art history and setting up an algorithm to score how creative each painting was. Their measure combined how novel a painting was relative to earlier artworks with how influential it was in sparking further works. The algorithm found that Picasso’s 1907 Les Demoiselles d’Avignon spurred a whole new movement, cubism. This confirmed the accuracy of their algorithm. Says Elgammal, results such as this meant that “many events in art history could be explained by visual aspects of art, more than semantic and social contexts,” contradicting what many art historians believe.135
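The published method is more sophisticated (it frames creativity as a network problem over visual-similarity links between paintings), but the core intuition—novelty relative to predecessors plus influence on successors—can be sketched in a few lines. The function below is a hypothetical illustration, not Elgammal’s actual algorithm:

```python
def creativity_score(similarity, dates, target):
    """Toy creativity score for painting `target` (a hypothetical sketch,
    not the published algorithm).

    similarity[i][j]: visual similarity in [0, 1] between paintings i and j.
    dates[i]: year painting i was made.

    A painting scores high when it looks unlike what came before it
    (novelty) and like what came after it (influence on later works).
    """
    earlier = [i for i, d in enumerate(dates) if d < dates[target]]
    later = [i for i, d in enumerate(dates) if d > dates[target]]
    if earlier:
        novelty = 1.0 - sum(similarity[target][i] for i in earlier) / len(earlier)
    else:
        novelty = 1.0  # nothing before it: maximally novel by default
    if later:
        influence = sum(similarity[target][i] for i in later) / len(later)
    else:
        influence = 0.0  # nothing after it yet: no measurable influence
    return novelty + influence

# Three paintings: a precursor (1900), a candidate (1907), a successor (1920).
# The candidate looks unlike the precursor but clearly shaped the successor,
# so it scores highly--the pattern the algorithm detected in Les Demoiselles.
similarity = [
    [1.0, 0.1, 0.2],
    [0.1, 1.0, 0.9],
    [0.2, 0.9, 1.0],
]
dates = [1900, 1907, 1920]
print(creativity_score(similarity, dates, 1))  # novelty 0.9 + influence 0.9 = 1.8
```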

Elgammal has become interested in how style evolves over time. He links this with what he sees as the current trend in AI to generate images from images we have already seen, such as training machines on thousands of images from art history. “That seemed silly to me,” he says, “because art is not just generating things that look like art.”136 He has in mind style transfer, which transforms an image into a certain style such as a work by Picasso, and GANs, where the generator tries to produce images that resemble those in the discriminator’s training set.

He argues that artists should abandon painting styles that have become stale and experiment with new styles that will arouse the viewer. He calls this the style’s “arousal potential.” This new art should be novel, surprising, complex, ambiguous, and puzzling. But if the work is too novel, viewers may find it repellent. The creative artist has to walk a fine line.

An example is the French artist Paul Cézanne who, in the late nineteenth and early twentieth centuries, moved on from the naturalism of impressionist art, with its fleeting effects of light and color, and developed a style based on abstract qualities and symbolic content, with cubes, geometric shapes, and experiments with perspective. Because Cézanne had moved away from impressionism, his works were criticized by the establishment—though younger artists, like Picasso, appreciated his mastery. Cézanne’s work was a stepping stone to cubism and twentieth-century art. In Elgammal’s analysis, Cézanne’s style was novel and thus creative, but not so extreme as to be repulsive. To illustrate what he means by repulsive, Elgammal cites DeepDream’s psychedelic imagery.

Elgammal and his group set to work to computerize their scheme and develop a way to bring about a style change that is novel and therefore creative. They formulated a variation on GANs that they called creative adversarial networks (CANs). Like GANs, CANs have two networks, a discriminator (D) and a generator (G). D is trained on the WikiArt dataset and learns to discriminate between art and nonart. To begin, G is untrained and produces pure noise from random points in its latent space.

G begins sending images to D, which D rejects as nonart. G gradually learns what D doesn’t like and adjusts its weights to generate ever more art-like images, as assessed by D. But D can also detect the style of an image—its “art-style classification” function. When it notices that the image fits a particular style, a function called style ambiguity kicks in, pushing the generator to produce works in styles that differ from all those in the WikiArt dataset—in other words, something altogether new and original.
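This pull in two directions can be sketched as a two-part generator objective: a standard GAN term rewarding images the discriminator accepts as art, plus a style-ambiguity term that is smallest when the discriminator cannot pin the image to any one known style. The Python below is a simplified stand-in for the idea (the real system trains deep networks; the function and its inputs are illustrative):

```python
import math

def can_generator_loss(d_art_prob, d_style_probs):
    """Sketch of a CAN-style generator objective for one generated image
    (a simplified stand-in, not the network's actual training code).

    d_art_prob:    discriminator's probability that the image is art.
    d_style_probs: discriminator's distribution over the K known styles.

    The first term rewards images the discriminator accepts as art; the
    second (style ambiguity) is the cross-entropy between the style
    distribution and the uniform distribution, which is minimized when
    every known style is judged equally (un)likely.
    """
    k = len(d_style_probs)
    art_term = -math.log(d_art_prob)  # look like art
    ambiguity_term = -sum(math.log(p) / k for p in d_style_probs)  # fit no one style
    return art_term + ambiguity_term

# Two images judged equally art-like: the one the discriminator confidently
# assigns to a known style incurs the larger loss, so the generator is
# steered toward art that belongs to no existing style.
confident = can_generator_loss(0.9, [0.97, 0.01, 0.01, 0.01])
ambiguous = can_generator_loss(0.9, [0.25, 0.25, 0.25, 0.25])
print(ambiguous < confident)  # True
```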

The images in figure 13.1 were created by a CAN with the style ambiguity function turned off. The images are recognizable as portraits, landscapes, and so on.

Figure 13.1

Images created by a CAN with the style ambiguity function turned off, 2017.

Conversely, turning the style ambiguity function on generates highly abstract images, as shown in figure 13.2.

Figure 13.2

Images created by a CAN with the style ambiguity function turned on, 2017.

The problem that Elgammal sets the network is to find a style that differs from all those in the training set. But the training set, the images it has been fed, encompasses the whole of Western art. In the end, without any human intervention, it settles on abstraction as the solution to the problem. There are two possible explanations. Either there is a bias in the data toward abstraction, or, as Elgammal puts it, “the machine has captured the trajectory of art history, which is towards abstraction.”137 He opts for the second. It seems that moving toward abstraction is natural for both human and machine artists.

Elgammal’s aim “was to do a visual Turing test,” to see whether a sample of viewers could tell whether a work of art was made by a machine or a human artist.138 Most people can distinguish machine art from human art. So what art would be best to set up in comparison with the works created by a CAN? Elgammal and his team chose twenty-five paintings by abstract expressionist artists produced between 1945 and 2007. Art lovers would probably not be familiar with most of them, and the works had no recognizable subject matter. The second set was twenty-five paintings by contemporary artists shown at Art Basel 2016. These could be considered the pinnacle of the modern art world as assessed by the art establishment. Elgammal and his team set the CAN works, the abstract expressionist works, and the Art Basel works before a panel of eighteen viewers drawn from Amazon Mechanical Turk, a crowdsourcing internet marketplace.

The results astounded Elgammal. The panel concluded that 53 percent of the CAN-generated artworks had been created by artists, as opposed to 85 percent of the abstract expressionists and 41 percent of the Art Basel collection. In other words, more than half the CAN works looked as if they were the products of human imagination to an admittedly nonexpert group of viewers.

But could the CAN products really be considered art? Elgammal asked a further twenty-one viewers how they felt when they interacted with the three sets of paintings. Did they feel that the painting was composed very intentionally?139 Did they see an “emergent structure”? Did the painting communicate with them? Were they inspired by the work? Elgammal expected the CAN products to rate much lower than the art produced by real artists. In fact, the CAN products rated higher. It seemed that the (again, nonexpert) panel really did consider the CAN images to be works of art.

The next question was whether the machine’s new art style was the result of turning on the style ambiguity function. Elgammal turned to a pool of art history students, well versed in novelty and aesthetics, and asked them to assess pairs of images produced by the CAN: one with the style ambiguity function turned off, which therefore produced images in the style of those in the WikiArt dataset (i.e., images that fit the canon of Western art), and one with it turned on. He asked them which images were more novel and which more aesthetically pleasing. More than half considered the CAN images produced with style ambiguity turned on to be more novel, and 60 percent found them also more aesthetically pleasing.

Elgammal’s work has received far more positive reviews in the media than has DeepDream.140 At the 2017 Frankfurt Book Fair, he was invited to exhibit artworks made by CAN. Artists admired the works, photographed them, and inquired about the artist. Elgammal told them that this was AI art and explained his methods. To his surprise, the artists did not respond negatively. Rather, they saw Elgammal’s machine as another artist with whom they could connect. “That’s the ultimate approval, the ultimate signal I can get,” says Elgammal. “An artist sees it and connects with it.”141

In the future, Elgammal would like to have artists collaborate with CAN, perhaps coming up with creative ideas that the machine can explore. “I want to look into what would have happened if Picasso had lived in the twenty-first century.” His aim is “to further our understanding of how humans create art, then to simulate that with a machine.” But for the moment he does not believe that machines can truly be creative.142

At present, Elgammal’s machine has “no semantic understanding of art beyond the concept of styles.”143 In other words, it doesn’t understand the subject matter and certainly could not discuss possible meanings of the abstract forms it produces: Are they Jungian archetypes, for example? But in the not-too-distant future, machines will be capable of scanning the web and learning key features of the world in which we live. In that way they will “learn” about subject matter, objects, hopes and dreams, love and hate.

Elgammal also applied the criteria for creativity proposed by the British computer scientist Simon Colton to his work. Among these are novelty and a system’s ability to assess its own creations.144 Elgammal concludes that his system satisfies both requirements. The interaction between the two networks in CAN forces it to explore creative space as it strives to deviate from the styles the discriminator has been fed while still producing works that the discriminator judges to be art.

“If the machine can be creative in the future it has to be its own judge,” he says.

Notes