ARTIFICIAL GENERAL INTELLIGENCE
AI is a hot topic. Every day, we hear about new AI systems that can perform tasks that previously required human intelligence. If computers are becoming more and more intelligent every day, how long will it be before they are as smart as people? What happens then? Will they take over the world? Will they treat us like pets? Will they try to exterminate us like they did in The Terminator? Will they read every textbook and user manual on the planet and take all our jobs?
The resurgence of interest in AI has been due to a series of important and high-profile applications. These achievements have put AI back on the map, and, while they provide many benefits, they have also spurred most of the current fears around AI. In a 2019 survey, 54 percent of Americans said they believe we will have AGI within ten years, and 80 percent of Americans said they believe that researchers need to manage the development of AI more carefully.1
Arnold Schwarzenegger’s eponymous killer robot in the first Terminator movie offers a terrifying scenario. “I’ll be back,” he says and soon returns by smashing into a police station with a car and then shooting most of the police officers. Although the AGI systems of film and books—like Schwarzenegger’s T-800—are scary, there is a huge gap between those and today’s narrow AI systems. The Terminator will likely remain forever in the realm of science fiction.2 First, the AI systems of 2020 are narrow AI systems; they can only perform one task. Second, they are not capable of commonsense reasoning based on general world knowledge or of other types of thinking, such as planning, imagination, and abstract reasoning.
AI SYSTEMS CAN ONLY PERFORM ONE TASK
Think about some of the tasks people do daily. We wake up and make breakfast, maybe watch the news or the weather, discuss current events and argue about who should do which chores, and then we drive to work. We might take a class to learn a new skill, joke with our coworkers, or read a user manual to learn a new technology. Later, we’ll need to decide where to go for dinner Saturday night, help the children with their homework, and read a book or watch TV.
Each one of these easy tasks has multiple subtasks that a human must perform to accomplish it. Take watching the news. This task requires subtasks involving capabilities found in narrow AI systems, such as speech recognition, image classification, facial recognition, motion detection and tracking, and information extraction. But it also requires many subtasks that are far beyond the capabilities of narrow AI, like determining how a news event will impact the viewer's life and those of their family, friends, and colleagues.
Taken further, a single human task like making breakfast for the family involves a much larger number of subtasks than watching the news. For example, you have to plan the menu and figure out what ingredients are needed, then determine whether they are available. If you don’t have to go to the store, you can start measuring and mixing the ingredients, following a recipe, and eventually working the stove. When the food is done, you have to determine the appropriate serving vessel (plates, bowls, platters, or mugs but not hats, cats, or parachutes) and silverware to set the table. Finally, you’ll have to clean up and wash the dishes. And each of these subtasks involves several microtasks that, in turn, involve planning, spatial awareness, coordination, and many other skills that are not easy to program or for an AI system to learn.
People are constantly doing what are referred to in AI as tasks, and each task is hierarchically composed of many other tasks. People are often not even aware of each of these microtasks. Part of the reason is that these tasks are so routine, and part is that there are so many of them.
In contrast, today’s narrow AI systems3 are severely limited. A system that can identify stoplights cannot identify pedestrians. A machine translation system cannot classify images, play games, or recognize handwriting. A system that can diagnose brain cancer from medical images cannot predict the risk of breast cancer from gene sequences. IBM’s DeepQA system can answer Jeopardy! questions but cannot answer reading comprehension questions. A system that can recognize human faces cannot distinguish a cat from a dog.4
Could we build AGI by creating a narrow AI program for every task that people do and then code some sort of master control program to figure out which program to use and when? It's unlikely. We can argue about whether it is even possible to define every task a person does. Even if we could do so, the bigger problem is that the control program itself would need to think and reason. People (and sci-fi AGI systems) encounter different environmental stimuli every day, and these stimuli change the nature of the tasks we do. The news is different every day. The weather can be unpredictable. Every day, we are presented with novel challenges by our children, our boss, our spouse, our friends, and our coworkers. We converse about the news, we adapt to the weather, and we deal with new challenges. A control program would need to apply commonsense reasoning to a great deal of knowledge about the world just to decide which program to invoke; in effect, the control program itself would need to be an AGI system.
AI SYSTEMS HAVE NO COMMON SENSE AND CANNOT THINK OR REASON
Few would disagree that commonsense reasoning is one of the most important—if not the most important—aspect of human-level intelligence. This type of reasoning manifests itself most prominently in our ability to understand natural language. We use commonsense reasoning to transform a string of words into a full understanding of a natural language text.
Science fiction robots with AGI capabilities like the Terminator and C-3PO of Star Wars understand natural language at a human level. They understand a wide variety of natural language utterances about various topics and use their understanding in performing multiple natural language tasks, such as carrying on a conversation, answering questions, and assessing sentiment. Most importantly, these science fiction AGI systems fully comprehend the nuances of natural language utterances that require commonsense reasoning to understand.
We can create personal assistants that can parrot back responses based on ELIZA-like patterns. These systems are tremendously useful and may, at times, appear to understand natural language. However, if you try to engage them in a conversation that requires them to understand and reason about current events or a myriad of other contexts, these systems will fail miserably. They do not understand language in any real sense. They only provide responses to commands and questions that have been anticipated by developers who have included ELIZA-like patterns and conventionally programmed rules.
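To make "ELIZA-like patterns" concrete, here is a minimal sketch of how that style of system works. The patterns and canned responses are invented for illustration; they are not any vendor's actual rules.

```python
import re

# A few hand-coded patterns of the kind a developer might anticipate
# (hypothetical examples). Anything outside them gets a canned fallback.
PATTERNS = [
    (re.compile(r"\bwhat.*weather\b", re.IGNORECASE),
     "Here is today's forecast: sunny with a high of 72."),
    (re.compile(r"\bset (?:an? )?alarm for (\d{1,2}(?::\d{2})?)\b", re.IGNORECASE),
     "Alarm set for {0}."),
    (re.compile(r"\bhow are you\b", re.IGNORECASE),
     "I'm doing well, thanks for asking!"),
]

def respond(utterance: str) -> str:
    """Return the response for the first matching pattern, else a fallback."""
    for pattern, template in PATTERNS:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Sorry, I can't help with that."

print(respond("What's the weather like?"))           # matches a canned pattern
print(respond("Set an alarm for 7:30"))               # fills a slot in a template
print(respond("Should I worry about the election?"))  # falls through: no understanding
```

The system never represents what weather or an alarm is; it only maps word patterns to prewritten strings, which is why anything the developers did not anticipate falls through to the fallback.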
We can also create social chatbots by applying deep learning to massive training tables of human chat interactions. However, these systems cannot converse at a human level because there is no way for them to gain commonsense knowledge just by looking for statistical patterns in large bodies of text.
Many AI researchers have studied the development of natural language processing capabilities in computers for the fifty-five to sixty years since Bert Green wrote BASEBALL and Joseph Weizenbaum wrote ELIZA. Our most prominent natural language processing systems are the personal assistants developed by major vendors who have the largest AI staffs in the world. Yet the best they can do is to have teams of people who are dedicated to analyzing user logs, thinking about what users might ask on Halloween, and hand coding ELIZA-like responses to user inputs.
None of the other natural language processing systems understand language either, including Watson DeepQA and the reading comprehension systems that reportedly read better than humans. These systems do not understand natural language in any real sense. They mostly match words in questions to those in documents. While it is surprising that these dumb strategies can produce intelligent-sounding results under specific circumstances, these systems are not reasoning based on commonsense knowledge of the world.
There have been many attempts to build systems that perform other types of thought, including planning, imagination, and abstract reasoning. None of these research projects has managed to come close to re-creating the reasoning capabilities found in people. In each case, the researchers created systems that can perform a single task and then argued that their systems had reasoning, planning, and imagination capabilities. A system that can perform just one type of planning task, for example, does not exhibit an AGI-level capability, and because of its narrow, task-specific nature, it will never develop into an AGI system.
BREAKING OUT OF THE NARROW AI BOX
Today’s AI systems are outstanding narrow AI systems. They are huge successes, as is evidenced by their ability to solve real-world problems. Narrow AI technology has rightfully gained the attention of the business world and governments. All that said, today’s narrow AI is not AGI and cannot evolve into AGI.
It is not hard to see why the supervised learning and reinforcement learning paradigms have had difficulties moving beyond narrow AI. For supervised learning, the goal is to learn a function that predicts an output from inputs. Supervised learning systems can succeed on inputs they did not see during training only if those inputs are similar to the ones they did see.
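As a toy illustration of this limitation, the sketch below fits a simple model (a stand-in for any supervised learner; the data are made up) to inputs drawn only from the interval 0 to 3 and then asks it about an input far outside that range.

```python
import numpy as np

# Toy supervised learning: learn y = f(x) from labeled examples.
# All training inputs lie in [0, 3]; the test input x = 10 does not.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 3, size=200)
y_train = np.sin(x_train) + rng.normal(0, 0.05, size=200)  # noisy labels

# Fit a cubic polynomial (standing in for any supervised learner).
model = np.poly1d(np.polyfit(x_train, y_train, deg=3))

print("near training data, x=1.5:", model(1.5), "  true:", np.sin(1.5))
print("far from training,  x=10 :", model(10.0), " true:", np.sin(10.0))
# Near the training range the prediction is close; far outside it, the
# learned function extrapolates wildly. It knows nothing beyond its data.
```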
The same is true for reinforcement learning, where the goal is to learn a function that can predict the optimal action for a given state. The function learned by a reinforcement learning system will only work for an unobserved state if it is similar to states observed during training. These techniques are unlikely ever to work well for learning the many dissimilar tasks that people are capable of learning.
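To make the "function from state to action" concrete, here is a sketch of tabular Q-learning on an invented five-state corridor. The table it learns covers only the states it visited during training and says nothing about any other task or environment.

```python
import random

# Tabular Q-learning on a toy five-state corridor: actions are move left (0)
# or right (1); the only reward is for reaching the rightmost state.
N_STATES, ACTIONS = 5, [0, 1]
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration

for _ in range(2000):                    # training episodes
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy action selection.
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        # Standard Q-learning update.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s_next, b)] for b in ACTIONS) - Q[(s, a)])
        s = s_next

# Learned policy: the greedy action for each non-terminal state (all point right).
print({s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)})
```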
Geoffrey Hinton has said that he has doubts that current paradigms, including supervised learning, reinforcement learning, and natural language processing, will ever lead to AGI. In a 2017 interview,5 Hinton suggested that getting to AGI will likely require throwing out the currently dominant supervised learning paradigm and will depend on the efforts of “some graduate student who is deeply suspicious of everything I have said.” Yann LeCun has also said that supervised learning and reinforcement learning will never lead to AGI because they cannot be used to create systems that have commonsense knowledge about the world.6
Some AI researchers are starting to speculate about new approaches. When we evaluate the viability of these new approaches, it is important to remember that enthusiasm for the narrow AI accomplishments should not translate into optimism about these new approaches, because the existing narrow AI approaches are a dead end in terms of building AGI systems.
Ben Goertzel, who is generally credited with coining the term AGI, likens the situation to flying machines. We were able to create blimps, airplanes, and other flying machines because we had a general theory of aerodynamics. We do not have an analogous theory for AGI.7 What we have are some vague ideas.
LEARNING LIKE PEOPLE
Many researchers describe human learning as compositional: We learn many building block skills that we then put together to learn new skills. People learn concepts, rules, and knowledge about the world that transfer over as we learn to do different tasks. In the first eighteen months of life, children learn prelinguistic concepts, such as object permanence and various intuitive physics concepts. Then they learn how language and numbers work and what characteristics make dogs different from elephants. These become building blocks that support the learning of various other skills. And when adults learn new skills, we tend to transfer what we have learned by acquiring one skill into learning another skill. Someone who learns how to write computer programs in the Java programming language will find it much easier to learn how to write programs in the Python programming language. As we get older, we also learn how to apply our episodic memory of life experiences. We learn how to abstract facts and beliefs from these experiences. We learn motor procedures like pouring a cup of coffee, walking, and swinging a golf club. We learn how to make use of sensory information. We learn etiquette, manners, morals, and rules.
These researchers argue that the key to commonsense AI reasoning is to build systems that learn compositionally like people. The idea is for systems to learn concepts and rules that can serve as building blocks that enable the system to learn higher-level concepts and higher-level rules.
A group of current and former MIT brain and cognitive science researchers, led by MIT professor Joshua Tenenbaum, suggest that the first step is for computers to acquire all the prelinguistic concepts and rules that people learn in the first year and a half of life.8 Their idea is to build computers that learn models of the world, such as intuitive physics, so that the computers can apply these models flexibly, like people do, rather than being limited to a specific task like today’s narrow AI systems. The second step, which children do from one and a half years to three years, is for computers to use this foundation to begin to learn language. The third step is for computers to learn almost everything else through language.9 This proposal has generated a great deal of debate in the research community.10 At this point, however, it is mostly a set of ideas with no evidence that it will lead to AGI.
Interestingly, many of the approaches to building commonsense reasoning into computer systems are turning the clock back and revisiting good, old-fashioned AI symbolic techniques.11 Some researchers suggest using symbolic techniques by themselves, and others suggest using a hybrid of symbolic techniques and deep learning. Gary Marcus has been a long-time proponent of the hybrid approach. In 1992, he and Steven Pinker showed that the best model of how children learn verb forms is a hybrid that uses symbolic rules for regular verbs and a neural network–like learning system for irregular verbs. He has since argued long and hard that deep learning by itself cannot result in AGI systems.12 In his book Rebooting AI,13 Marcus and his coauthor, NYU professor Ernest Davis, propose a hybrid model that combines symbolic reasoning with deep learning. Their specific recipe sounds reasonable, but it represents a massive, long-term effort with many potential pitfalls and dead ends.
My biggest concern about this approach is that progress in understanding how people represent commonsense knowledge has been glacial. Forty years ago, we had a long debate about the nature of the internal representations people use to answer questions like “What shape are a German Shepherd's ears?” We still do not know the answer, even though some of the top people in the fields of AI and cognitive science took part in the debate. Answering a question about the shape of a dog's ears is just a drop of water in an ocean of representational schemes and reasoning processes. Moreover, we do not even know whether these representational schemes and reasoning processes are innate or learned. Innateness has been an ongoing topic of academic debate for over fifty years, with no resolution in sight.
How long will it be before we know enough about how people think to make real progress toward AGI? At the current rate of progress, it appears we will need hundreds—maybe thousands—of years, and it may never happen.
DEEP LEARNING
Some researchers argue that while supervised and reinforcement learning per se are dead ends for building AGI systems, deep learning may yet take us to the promised land. OpenAI is a nonprofit that received well over a billion dollars in investment capital. OpenAI’s charter is to build AGI systems safely. Accordingly, OpenAI has three departments, only one of which is trying to develop AGI technology. The other two departments focus on AI safety and policy.
Cofounder and CTO Greg Brockman argues that the way forward is to keep exploring deep learning systems that have three properties: generality, competence, and scalability. Generality refers to finding architectural components that researchers can use for many different applications. An example is gradient descent, the algorithm used in most deep learning networks to adjust the weights so as to minimize the network's error. Competence refers to proven real-world capabilities, such as the deep learning networks that rendered forty years of computer vision research obsolete. Scalability refers to deep learning networks in which performance improves as the network becomes larger.
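As a minimal sketch of what gradient descent does, the loop below finds a single weight by repeatedly stepping downhill on a hand-rolled squared-error loss; real networks apply the same update rule to millions or billions of weights.

```python
# One-weight model y = w * x, trained by gradient descent on squared error.
def loss(w, xs, ys):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, xs, ys):
    # Derivative of the loss with respect to w.
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # data generated by y = 2x
w, learning_rate = 0.0, 0.05
for _ in range(200):
    w -= learning_rate * gradient(w, xs, ys)  # step against the gradient

print(round(w, 4), round(loss(w, xs, ys), 8))  # w converges to ~2.0, loss to ~0
```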
GPT-2 is a great example of scalability. It is ten times larger than the GPT system of a year prior, and researchers trained it on ten times the amount of data. GPT-2 showed a surprising ability to generate human-sounding—if not always coherent—text. OpenAI sees this as an emergent capability that resulted solely from making the network bigger. Brockman argues that by continuing deep learning research and then scaling systems that meet his three requirements, we will see more emergent capabilities, and these will eventually include commonsense reasoning and AGI.14
GPT-2 certainly demonstrated a massive ability to extract statistical regularities of its training text and perhaps the ability to memorize small segments of the text. However, it did not learn facts about the world or gain any ability to reason based on this world knowledge. At this stage of the game, I see absolutely no evidence that learning world knowledge and reasoning skills will emerge from this approach, and I see no logical rationale for believing it will happen. OpenAI has since released GPT-3, which is one hundred times larger than GPT-2. However, like GPT-2, it generates text with mostly incorrect facts.15 Even at one hundred times the size of its predecessor, it still is not acquiring world knowledge.
Deep learning pioneer Yoshua Bengio agrees with Tenenbaum and Marcus that compositional learning is critical. However, he argues that symbolic techniques are not necessary.16 Bengio has proposed novel deep learning architectures designed to break deep learning out of its narrow AI box. One goal is to learn higher-level building blocks that can help AI systems learn compositionally.17 These are interesting ideas for researchers to explore, but they are in the preliminary stages.18 Here again, the idea that these systems will magically learn world knowledge and reasoning is a leap of faith.
Yann LeCun also agrees that the ability to learn facts about the world and commonsense reasoning rules is a necessary step on the road to AGI. He has stated repeatedly that self-supervised learning is the way forward toward AGI.19 Recall that, while supervised learning requires large training tables of labeled data, self-supervised learning does not require humans to label data. Language models like GPT-2 are an example of self-supervised learning. By predicting the next word in a text, the text provides its own supervision, and this makes it possible to train massive networks on massive training sets. However, the goal of self-supervised learning for LeCun is not learning the task itself. Instead, he argues, if the task is complex enough, the self-supervised learning network will be forced to acquire world knowledge and reasoning rules. However, GPT-2 does not appear to have learned any world knowledge or reasoning rules. GPT-3 is a hundred times bigger, and it has certainly learned better statistics and probably memorized more text. However, it has not gained the ability to acquire world knowledge or reasoning rules.
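A minimal sketch of how text supplies its own labels is shown below. It uses trivial bigram counts rather than a neural network, and the tiny corpus is invented, but the principle is the same: every word's "label" is simply the word that follows it.

```python
from collections import Counter, defaultdict

# Self-supervised learning from raw text: no human labeling required.
corpus = "the cat sat on the mat the cat sat on the rug".split()

# Build (input word, next word) pairs directly from the text and count them.
# A bigram table is a drastically simplified stand-in for what a large
# language model learns.
counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    counts[current_word][next_word] += 1

def predict_next(word):
    """Predict the continuation seen most often in the training text."""
    return counts[word].most_common(1)[0][0] if word in counts else None

print(predict_next("the"))   # 'cat': a statistical regularity, nothing more
print(predict_next("sat"))   # 'on'
# Nothing here encodes what a cat or a mat is; only word co-occurrence counts.
```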
A related approach is to create tests that require commonsense knowledge and reasoning. If an AI system can learn to pass these tests, then it must have those capabilities. One problem with this approach is that it is difficult, if not impossible, to develop a test that definitively requires commonsense knowledge and reasoning. Most of the tests initially thought to meet this standard were later shown to be beatable with simple statistical approaches. Here again, the approach relies on deep learning systems magically acquiring world knowledge and reasoning skills. The idea that we can simply turn loose a deep learning algorithm on a training table and expect it to somehow learn commonsense knowledge and reasoning is wishful thinking.
MODELING THE HUMAN BRAIN
Another proposed approach to AGI is to understand the architecture of the physical human brain and model AI systems after it. After decades of research, we know only some very basic facts about how the physical brain processes information. For example, we know that the cortex statically and dynamically stores learned knowledge, that the basal ganglia process goals and subgoals and learn to select information through reinforcement learning, and that the limbic structures interface the brain with the body and generate motivations, emotions, and the values we place on things.
There is also a great deal of research underway to understand the human brain. The Human Brain Project is a massive, ten-year project that started in 2013 and is funded by the European Union.20 It employs over five hundred scientists at more than one hundred European universities, yet it has not solved the mysteries of the human mind; we are still in the early stages of understanding the whole brain. The human brain has roughly 86 billion neurons and as many as 1,000 trillion synapses. The most detailed map of a brain produced to date contains just 31,000 neurons, and it is of a rat brain.21
Understanding the human brain and modeling it in AI systems is a plausible approach to AGI. However, as compelling an idea as this is, no one has any concrete ideas about how to do this. We have no idea what algorithm is used to drive the synaptic changes in the brain during the learning process.
The idea of modeling the neurons in the brain has been in the proposal stage for over forty years. It has yet to gain any real traction partly because of the extremely slow progress in understanding the human brain and partly because we have no concrete method for modeling what we know about the human brain in AI programs. Here again, we are near the starting gate and have no evidence that this approach will succeed.
Ray Kurzweil, a technology futurist, has long argued that AGI will occur as a by-product of the trend toward bigger and faster computers. He popularized the idea of the singularity, which is the point in time that computers are smart enough to improve their own programming. Once that happens, his theory states, their intelligence will grow exponentially fast, and they will quickly attain a superhuman level of intelligence. Kurzweil predicted that the singularity would occur around 2045.
The human brain has about 86 billion neurons. Each neuron has connections to hundreds or thousands of other neurons. There may be as many as a quadrillion connections in the human brain. A computer that can simulate this number of connections and has enough processing power to compute results on these many connections may well be available by 2030. It is certainly possible that new technology, such as quantum computing, will move that date forward. The question is whether this inevitable increase in processing speed and power will lead to AGI.
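A rough back-of-envelope calculation shows how those numbers fit together; the per-neuron figure is an assumed upper-end value, not a measured one.

```python
# Rough check of the connection count cited above.
neurons = 86_000_000_000          # ~86 billion neurons
connections_per_neuron = 10_000   # assumed upper end of "hundreds or thousands"
total_connections = neurons * connections_per_neuron
print(f"{total_connections:.1e}")  # ~8.6e+14, on the order of a quadrillion (1e15)
```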
The first electronic computer, the ENIAC, was created in 1945, weighed 30 tons, could execute a little over 350 computer instructions per second, and occupied 1,800 square feet of space.22 In the 1970s, IBM mainframes could execute 750,000 instructions per second. Intel released its Stratix 10 chip in 2018. This chip is smaller than a hand and can execute 10 trillion instructions per second. The speed of today’s computers is already beyond belief, and computers will continue to get faster at a rapid pace.
That said, it is hard to imagine how processing power by itself can create AGI. If I turn on a computer from the 1970s with no programs loaded, turn on one of today’s computers with no programs loaded, or turn on a computer fifty years from now with no programs loaded, none of these computers will be capable of doing anything at all. If I load a word processing program on each of these computers, then each of them will be limited to performing word processing. Newer, more modern computers will be able to respond faster and process bigger documents, but they will still only be capable of word processing. The same will be true for the computers of the future.
Probably the most salient argument against the idea of computer power by itself as the road to AGI is the fast-thinking dog analogy.23 If we were to find a way to increase the speed of a dog’s brain, no matter how fast it gets, it would never beat a human at chess unless we also found a way to reprogram the dog’s brain so that it can understand chess. The same is true of computers. Faster computers by themselves will not result in AGI. As Steven Pinker said, “Sheer processing power is not a pixie dust that magically solves all your problems.”24 In the unlikely event that AGI ever becomes possible, the programming and learning algorithms will likely be complex enough to require extremely powerful computers. However, those programming and learning algorithms will be necessary; speed and power will not be sufficient.
WILL WE ACHIEVE AGI?
Researchers have a long history of being overly optimistic about the prospects for AGI. In 1955, John McCarthy and Marvin Minsky, two of the best-known figures in the history of AI, proposed a ten-person, two-month study to be held the following summer at Dartmouth College in Hanover, New Hampshire.25 Their proposal included this paragraph:
The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.
In 1967, Marvin Minsky said, “Within a generation, I am convinced, few compartments of intellect will remain outside the machine’s realm. The problem of creating ‘artificial intelligence’ will be substantially solved.”26
Over the years, there have been many surveys of AI researchers asking about the timeline for AGI. When researchers are flat out asked to give their best estimate of when AGI will appear, the results have been remarkably consistent. Over half of the sixty-seven AI researchers polled in a 1971 study said we would see AGI in twenty to fifty years.27 Those predictions will officially be proven wrong in 2021, when the fifty-year window closes. In a survey of ninety-five AI timeline predictions made between 1950 and 2012, the most common projected timeline for AGI was fifteen to twenty-five years.28 A 2018 survey found that researchers, on average, put a 50 percent chance on AGI arriving within the next forty-five years.29
Optimism was extremely high in the first rise of AI and very low during the first AI winter. Optimism was extremely high again in the late 1970s and early 1980s during the second rise of AI. By the end of the 1980s, optimism about AGI was again extremely low. Now, as I write this, we are on the third rise of AI, and optimism is high once again. However, the optimism is due to narrow AI. Will this time be different? Probably. Optimism will likely stay high because there are so many real-world applications for today’s narrow AI technology. Some people would argue that image recognition alone is changing our world by enabling positive innovations such as self-driving cars, as well as scary innovations such as surveillance. This optimism for narrow AI has naturally, but incorrectly, spilled over to optimism about prospects for AGI. As Oren Etzioni, the CEO of the Allen Institute for AI, said, “It reminds me of the metaphor of a kid who climbs up to the top of the tree and points at the moon, saying, ‘I’m on my way to the moon.’”30
I cannot say for sure that we will never develop AGI. What I can say is that we are only at the starting gate. Again.