2003

PAPERCLIP MAXIMIZER CATASTROPHE

Even if highly intelligent and capable AIs are given useful goals, dangerous outcomes remain possible. One famous example, discussed in 2003 by philosopher and futurist Nick Bostrom, is the thought experiment of the Paperclip Maximizer. Imagine a future in which an AI system supervises a set of factories producing paperclips. The AI is given the mission of producing as many paperclips as possible. If the AI is not sufficiently constrained, one can imagine it pursuing this goal first by operating the factories at maximum efficiency, and then by devoting more and more resources to the task, until vast regions of land and ever more factories are given over to making paperclips. Eventually, all available resources on Earth could be consumed by the task, followed by all the relevant matter in the Solar System, turning everything into paperclips.

Although this scenario may sound implausible, it is intended to focus attention on a serious point: AIs may not have humanlike motives that we can truly understand. Even innocuous goals could prove dangerous if AIs gain the ability to evolve and create successively improved versions of themselves, as discussed in the entry on Intelligence Explosion. How might humans ensure that AI goals, and the mathematical reward and utility functions that define them, remain stable and comprehensible over decades and centuries? How might useful “off switches” remain accessible? What if such reward circuits and software cause an AI to lose interest in the external world and devote its energy to maximizing a reward signal, much like a person taking a drug and dropping out of society? How will we deal with different countries and geopolitical regimes that use different reward functions for their own AIs?

Another famous example, attributed to AI expert Marvin Minsky, is the Riemann hypothesis catastrophe, in which we imagine a superintelligent AI system given the goal of proving this difficult and important mathematical conjecture. Such a system might devote more and more computing resources and energy to the task, taking over existing computing facilities and creating ever more powerful systems at the expense of humanity.

SEE ALSO “Darwin among the Machines” (1863), Intelligence Explosion (1965), Leakproof “AI Box” (1993)