It is said that how far we can see into the future depends on how far back we look into history. Let us first briefly review the history of the Internet and artificial intelligence.
Everyone has heard about the history of the Internet. It was born in a US military laboratory in the 1960s and was first used to transfer and share intelligence among several universities and research institutions. TCP/IP (Transmission Control Protocol/Internet Protocol) set a unified standard for computer network communication, and by the end of the 1980s scientists had proposed the concept of the World Wide Web, enabling the Internet to expand across the world. At that point, a broad and far-reaching information highway lay open before the world.
About twenty years ago, Marc Andreessen, a twenty-three-year-old, created the Netscape browser, which ignited the blaze of the mass Internet and opened the door to the commercial Internet. Microsoft began to worry that its software business would be upended by the Internet; meanwhile, young engineers at Sun Microsystems, chafing at the rigidity of established companies, set out to create a language that could run on any operating system, to break Microsoft's monopoly and open the door to Internet innovation, and so the Java programming language was born. Java greatly accelerated the development of Internet products.
At that time, there were hardly any Internet cafés in Beijing or Shanghai. In 1997, the year of Hong Kong's return to China, InfoHighWay (once a pacesetter of the Chinese Internet industry) had just started a nationwide network-access service; Zhang Xiaolong had just developed the Foxmail email program; and the National Informatization Conference was held that same year. Seen from the outside, the World Wide Web was just waking up. In technical circles, however, new technologies and new ideas were emerging in an endless stream, and commercial wars were already being waged.
At that time, I was still working for Infoseek, the US search-engine pioneer. On the front line, I felt the atmosphere of the Internet business wars and the American public's enthusiasm for the new technology wave, and I wondered whether China was ready for the new technological revolution. In 1998 I wrote the book Battle in Silicon Valley, detailing the struggles and innovations of Silicon Valley's geniuses. After finishing the book, I returned to China in 1999 and founded Baidu in a Beijing hotel.
Recalling how Netscape, Sun Microsystems, and Microsoft contended in the Internet field like the kingdoms of the Three Kingdoms period, I am still excited even today. At the time, people were guessing who the winner would be. Microsoft seemed invincible, as it could always digest new technologies. Netscape went through ups and downs and was eventually acquired by AOL, which in turn was acquired in 2015 by Verizon, the wireless giant; Verizon later also acquired Yahoo, a company that had been powerful for many years. Sun Microsystems was once enormously influential: in 2001 it had fifty thousand employees worldwide and a market capitalization of more than $200 billion. But when the Internet bubble burst, Sun fell from peak to trough within a year and was acquired by Oracle in 2009.
The development of the Internet greatly exceeded the expectations of most people at that time. New technology companies rose rapidly; Apple and Google eventually completed the counterattack against Microsoft by launching mobile operating systems. Marc Andreessen, the creator of the Netscape browser and the innovator I described at the beginning of Battle in Silicon Valley, became a name that hardly anyone born after 1990 recognizes.
But Andreessen did not leave the stage; he became the godfather of Silicon Valley venture capital. Internet technology is still triumphant. Yesterday we watched the industry's big players compete in every possible way; today we marvel that mobile Internet devices have surpassed PCs across the board. But we have inadvertently ignored a silently rising "ghost," artificial intelligence, of which the Internet is just one part of the body.
Artificial intelligence, born alongside the computer, emerged even earlier than the Internet. At the Dartmouth Conference in 1956, artificial intelligence was formally put on the agenda. At that time a computer was as big as a house and its computing power was low, so why did anyone dare to propose the concept of artificial intelligence? The reason lies in the intuition of scientists. Claude Shannon had already completed his three major theorems of communication, laying the foundation for computing and information technology. Marvin Minsky had built the first neural-network learning machine (he and his companions used three thousand vacuum tubes and an automatic pilot mechanism from a B-24 bomber to simulate a network of forty neurons), and soon afterward he finished the paper "Neural Nets and the Brain Model Problem." The paper was not taken seriously at the time, but it became a forerunner of later artificial-intelligence technology. As early as 1950, Alan Turing had proposed concepts such as the well-known Turing test, machine learning, genetic algorithms, and reinforcement learning.
Two years after Turing's death, John McCarthy formally proposed the concept of artificial intelligence at the Dartmouth meeting. The ten young scientists who took part in that conference later became leading figures in the field worldwide, and the discipline of artificial intelligence was launched. Yet many of their achievements were buried in the broader development of computing, for example, programs that could solve closed-form calculus problems, robots that could stack blocks, and so on.
The ideals were advanced, but the infrastructure was still in its infancy. The advance of artificial intelligence ran into two seemingly insurmountable bottlenecks. One was the algorithmic logic itself: the mathematical methods were not developed enough. The other was the lack of hardware computing power. For example, scientists worked day and night to summarize the rules of human grammar and design computer language models, yet the machines never managed to raise translation accuracy to a satisfactory level.
The link between the new technology and industry had not yet been forged, no exciting product applications had been invented, government and business investment shrank sharply, and artificial-intelligence research and development went through two troughs, in the mid-1970s and in the 1990s. The public paid little attention; to them, the fast-developing computer was already a magical, intelligent tool.
For ordinary people, the most familiar example of "artificial intelligence" was probably the arcade game. In the 1980s, game rooms appeared on the streets of small county towns across China. The arcade NPCs (nonplayer characters) could always be easily beaten by skilled players. This was not only a poor showcase for artificial intelligence; it also left the misconception that intelligence was simply something installed on a computer. That view did not change until the rise of the Internet and cloud computing.
In 2012, I noticed that deep learning was making breakthroughs in academia and in applications. For example, the accuracy of image recognition using deep learning suddenly improved far beyond any previous algorithm. I immediately realized that a new era was coming and that search would be transformed. In the past we searched with text; now we can search with voice and images as well. If I see a plant I don't recognize, I can take a photo, upload it, and search, and it will be identified correctly at once. The old way of searching with words could never identify such a plant. Beyond search, many things that once seemed impossible are now possible.
Speech recognition, image recognition, natural-language understanding, and user profiling draw on the most essential intellectual abilities of human beings. When computers acquire these capabilities, a new revolution will come. In the future, stenographers and simultaneous interpreters may be replaced by machines, because computers can do the job better. We may no longer need drivers, as cars will drive themselves more safely and efficiently. In business, workers will provide the best customer service with the help of smart customer-service assistants. Artificial intelligence has empowered people more than ever before. The industrial revolutions freed humankind from heavy labor: where people once had to do rough work such as moving stones themselves, machines now carry far bigger stones for us. With the arrival of the intelligent revolution, machines will help us accomplish much of our mental work. Over the next twenty to fifty years, we will keep seeing all kinds of changes and harvesting all sorts of surprises. This is a very natural process.
However, we must pay tribute to the pioneers of artificial intelligence.
Today, Baidu has a large and powerful team of artificial-intelligence researchers, many of whom have been engaged in machine-learning research since the 1990s. Some studied under famous mentors; some worked at large technology companies for many years. Today's research and development is the natural result of that long accumulation.
In the 1990s, only a few scientists, such as Geoffrey Hinton and Michael Jordan, persisted in exploring machine learning. Former Baidu chief scientist Andrew Ng studied under Jordan during the 1990s and later taught machine-learning theory to countless young people through online courses. Lin Yuanqing, former head of Baidu Research, and Xu Wei, a distinguished Baidu scientist and the first person in the world to build a language model using neural-network techniques, both worked more than ten years ago at the American laboratory of NEC Corporation (originally Nippon Electric Company), a center of deep-learning research. Artificial-intelligence experts who have worked there include Vladimir Vapnik, a member of the US National Academy of Engineering and co-inventor of the SVM (support vector machine); Yann LeCun, a leader of the deep-learning field who invented the convolutional neural network and now heads Facebook's artificial-intelligence lab; Léon Bottou, a key figure behind the stochastic gradient methods used in deep learning; and Yu Kai, founding director of Baidu's Deep Learning Lab.
Many of them have lived through several ebbs and flows of artificial-intelligence research. In short, early artificial-intelligence research was mostly rule based: people summed up rules and fed them into the computer, which could not derive such rules on its own. The more advanced approach that followed, machine learning, is based on statistics, allowing computers to find the most probable and appropriate models from large amounts of data along many possible paths.
In the last few years, artificial intelligence has become vibrant again thanks to an upgraded version of machine learning: deep learning, based on multilayer neural networks running on many chips. By connecting layer upon layer, the technology mimics the mesh-like connections among the vast number of neurons in the human brain; supplemented by carefully designed reward-and-punishment algorithms and big data, it trains computers to search efficiently for models and rules in the data, opening a new era of machine intelligence.
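To make "multilayer network" and "training" slightly more concrete, here is a minimal sketch, assuming nothing beyond Python and numpy, of a tiny two-layer network learning the toy XOR task; it is an illustration of the idea, not any production system, and every size and number in it is an arbitrary assumption.

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # targets (XOR)

W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))   # first layer of "neurons"
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))   # second layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # forward pass: two layers of weighted sums and nonlinearities
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: nudge every weight to reduce the error, the
    # "reward and punishment" that gradually shapes the network
    err = out - y
    g_out = err * out * (1 - out)
    g_h = (g_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ g_out;  b2 -= 0.5 * g_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ g_h;    b1 -= 0.5 * g_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))   # after training, typically close to [0, 1, 1, 0]

The point of the sketch is only that nobody writes the XOR rule into the program; the rule emerges in the weights as the data is repeatedly pushed through the layers.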
A few determined people kept the spark alive for artificial intelligence's return. In China, Baidu was one of the first companies to commit to artificial intelligence, and it did many things that other companies had not even heard of at the time. About seven years ago, Qi Lu and I talked about the tremendous progress of deep learning in the United States and resolved to make a major move into the field. In January 2013, I officially announced the establishment of IDL (the Institute of Deep Learning) at the Baidu annual meeting, the first corporate research institute in the world to carry "deep learning" in its name. I served as dean myself, not because my knowledge of deep learning was the deepest, but because doing so would signal the importance I attach to the field and help summon the scientists who had stuck with it for so many years.
This was the first time Baidu had ever set up a research institute. Our engineers are researchers, and research has always been closely integrated with practical applications. I believed that deep learning would have a huge impact on many areas in the near future, even areas outside Baidu's business scope. It was therefore necessary to create a dedicated space to attract talent, let researchers freely try out innovations, work in areas unfamiliar to Baidu, and explore the revolutionary path of artificial intelligence for all humankind.
If the enlightenment phase of artificial intelligence can be called the 1.0 era, then we have clearly entered the 2.0 era. Machine translation is a typical case. In the past, machine-translation methods were based on rules of vocabulary and grammar. Humans kept summarizing grammar rules for the machine, but the rules could never keep up with the variability of human language, especially its context. So machine translation kept making mistakes, such as rendering the Chinese phrase that means "It's you?!" (spoken with an astonished tone) as "How are you?"
Later came SMT (statistical machine translation), whose basic idea is to discover common patterns of word combination through statistical analysis of large parallel corpora (collections of texts paired with their translations) and thereby avoid strange phrase combinations. SMT already has the basic machinery of machine learning, with two stages, training and decoding: in the training stage, the computer builds a statistical translation model from the data, and that model is then used for translation; in the decoding stage, the system searches for the best translation using the estimated parameters and a given optimization objective.
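As a hedged illustration of that two-stage idea (a textbook formulation, not a description of any particular company's system), classical SMT is often written as a noisy-channel search:

\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} P(f \mid e)\,P(e)

where f is the source sentence and e a candidate translation; the translation model P(f \mid e) and the language model P(e) are what the training stage estimates from the corpora, and decoding is the search for the e that maximizes their product.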
SMT had been studied in industry for more than twenty years. For phrases or shorter sentences the translations work well, but for long sentences, especially between languages with very different structures such as Chinese and English, the results are far from satisfactory. More recently, the NMT (neural machine translation) approach has emerged. The core of NMT is a deep neural network with a vast number of nodes (neurons). A sentence in one language is converted into vectors and passed through the network layer by layer, transformed into a representation the computer can "understand"; after many layers of complex transformation, a translation in another language is generated.
But applying this model presupposes a very large amount of data; otherwise it is useless. Search engines such as Baidu and Google can discover and collect massive numbers of human translations from the Internet and feed this huge data set to the NMT system, which can then be trained and tuned into a translation mechanism more accurate than SMT. The more Chinese-English bilingual corpora we store, the better the results NMT achieves.
SMT used only local information: its processing unit was the phrase, a segment of the sentence. At the end of decoding, the translations of the individual phrases were stitched together, without making full use of global information. NMT uses global information: it first encodes the information of the entire sentence (rather as a human reads the whole sentence before translating) and then generates the translation from that encoding. This is its advantage, and the reason its output is more fluent.
For example, a very important step in translation is word-order adjustment. In Chinese we put all the attributives before the words they modify, while in English prepositional phrases come after the words they modify, and machines often confuse this order. NMT's advantage in learning word order is what gives its translations their fluency, especially for long sentences.
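To make the encode-then-decode idea concrete, the following is a minimal sketch of a sequence-to-sequence model written in Python with PyTorch. The vocabulary sizes, dimensions, and random "sentences" are illustrative assumptions; real NMT systems are trained on enormous parallel corpora and add refinements such as attention.

import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, DIM = 1000, 1000, 64          # toy sizes, assumed

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, DIM)    # vectorize source words
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
    def forward(self, src):                          # src: (batch, src_len)
        _, h = self.rnn(self.embed(src))
        return h                                     # encoding of the whole sentence

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TGT_VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.out = nn.Linear(DIM, TGT_VOCAB)
    def forward(self, tgt_in, h):                    # generate from the global encoding
        o, _ = self.rnn(self.embed(tgt_in), h)
        return self.out(o)                           # scores over the target vocabulary

encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, SRC_VOCAB, (2, 7))            # two fake source sentences
tgt_in = torch.randint(0, TGT_VOCAB, (2, 9))         # shifted target sentences
logits = decoder(tgt_in, encoder(src))               # shape (2, 9, TGT_VOCAB)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, TGT_VOCAB),
    torch.randint(0, TGT_VOCAB, (2 * 9,)))           # placeholder labels
loss.backward()                                      # training adjusts both networks together

The encoder reads the whole sentence before the decoder produces a single word, which is exactly the "global information" advantage described above.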
Traditional translation methods are not useless, though; each method has its strengths. Take idiom translation: idioms usually have customary renderings, free translations rather than literal ones, and have to be matched against corpus entries with the corresponding content. The needs of Internet users today are diverse, and translation spans many domains, such as spoken language, résumés, and news, so it is hard to meet every requirement with a single method. Baidu has therefore been combining traditional approaches, rule-based, example-based, and statistics-based, with NMT to advance its research.
In this machine-translation model, humans no longer need to hunt for voluminous language rules themselves; instead, they choose the mathematical methods, tune the parameters, and help the computer network find rules on its own. A person enters text in one language and the machine outputs another, with no need to worry about what happens in between. This is called end-to-end translation. It sounds magical, but in fact Bayesian methods and hidden Markov models from probability theory can both be used to attack such problems.
Take the Bayesian method in information distribution as an example. We can build a personality-feature model described by probabilities. For instance, one characteristic of the male-reader model might be that the probability of clicking on military news is 40 percent, while in the female-reader model it is 4 percent. Once a reader clicks on military news, the reader's gender probability can be inferred backward, and combined with other behavioral data, the reader's gender and other features can be judged after a comprehensive calculation. This is the magic of mathematics. Of course, the mathematical methods used by computer neural networks go far beyond this example.
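As a back-of-the-envelope illustration, and assuming, purely for the sake of the example, an equal prior of male and female readers, Bayes' rule turns that single click into evidence about gender:

P(\text{male} \mid \text{click}) = \frac{P(\text{click} \mid \text{male})\,P(\text{male})}{P(\text{click} \mid \text{male})\,P(\text{male}) + P(\text{click} \mid \text{female})\,P(\text{female})} = \frac{0.40 \times 0.5}{0.40 \times 0.5 + 0.04 \times 0.5} \approx 0.91

A single click on military news thus shifts the estimate to roughly a 91 percent chance that the reader is male, and every further behavior updates the estimate again.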
The very logic of artificial-intelligence methods such as machine translation dictates that the amount of data must be large enough. The Internet provides the massive data that scientists once could only dream of. The Internet was born to facilitate the exchange of information; the resulting information explosion in turn propelled the development of artificial-intelligence technology.
Take board games as an example. In 1952, Arthur Samuel developed a checkers program that could compete with amateur masters. The rules of checkers are relatively simple, and computers hold a clear advantage over humans, but chess is much harder. When Zhang Ya-Qin, former president of Baidu, headed Microsoft's research institute, he recruited Feng-hsiung Hsu, the Taiwanese computer talent who had developed the famous Deep Blue chess machine at IBM (International Business Machines Corporation). During the 1990s, Deep Blue was the most prominent representative of artificial intelligence, concentrating its "wisdom" in one supercomputer that used multiple CPUs (central processing units) and parallel computing; it defeated one human chess master after another and finally beat world chess champion Garry Kasparov in 1997. Shortly after that match, however, IBM announced Deep Blue's retirement. Zhang Ya-Qin said to Hsu, "You should build a Go machine and come back to me when it can defeat me," but Hsu never came back to him, even after Zhang left Microsoft.
Some of Deep Blue's bottlenecks were hard to overcome: it could handle the calculations of the chessboard, but it was powerless before the possibilities of the Go board, which are of an entirely different order of magnitude. Its model, based on the decision-tree algorithm and on exhausting every possibility, was beyond what computers could carry; however much the algorithm was optimized, the computational barrier could not be broken. The Eastern wisdom embodied in Go seemed beyond the reach of artificial intelligence, but a new era was coming.
The computer intelligence represented by Deep Blue seemed to have nothing to do with the Internet. But the development of cloud computing and big data finally brought artificial intelligence and the Internet together. The combination of these two complementary forces gives us a model of intelligence quite different from that of the Deep Blue era: multichip distributed computing, coupled with the big data accumulated by humankind and linked by new algorithms that go beyond the decision tree, embodies a combination of human intelligence and machine intelligence.
From 2016 to 2017, AlphaGo (a Go-playing program) was unbeatable at Go. Its "thinking" differs from that of both humans and Deep Blue. In short, it has absorbed the data of millions of human Go games. In more professional terms, the Monte Carlo tree-search algorithm and deep-learning-based pattern recognition account for AlphaGo's achievements, and the more important of the two is deep learning, which its predecessor Deep Blue did not possess.
According to the various accounts, AlphaGo does not reason about how to play; it learns from the games of master players (big data). It records every board position in all those games, trains on millions of positions as input, and uses a multilayer neural network to predict the next move a human master would make. Through ingenious network design and training, this multilayer neural network models the master players' "feel" for the current position, and the winning percentages from past games are already known. In actual play, the computer can record the game through visual recognition, compare the position with its data to find the same patterns, explore how different positions might develop, and choose a handful of high-quality candidate moves according to the winning percentages from past games, instead of trying every possible move. That greatly reduces the amount of computation and saves the system from "exhaustion." Like a human being, it does not exhaust every possible point but picks a few based on experience and feel. A human still has to calculate and compare which of the selected points is better; for the machine, that calculation is handed over to the Monte Carlo tree-search algorithm.
A vivid, if not entirely accurate, metaphor may help. Monte Carlo tree search is an improvement on earlier decision-tree algorithms, which, even after a high-quality candidate move has been chosen, still have to exhaust the possibilities for every subsequent choice, branching at each step and producing an exponential explosion in the number of possible paths.
The Monte Carlo method shows the subtlety of probability. Suppose that in a certain position the deep-learning network offers three candidate moves: A, B, and C. Taking these three points as root nodes, imagine three actual trees, each with countless branches. A Monte Carlo search does not exhaust every branch; instead it sends out three million ants from A, B, and C, one million from each point. The ants quickly climb toward the treetops (that is, Black and White play alternate moves until one side wins, generally within two hundred moves), and some ants reach the very top (that is, the outcome is decided). If an ant makes it all the way to the end, Black wins that playout; if it does not, White wins.
Suppose three hundred thousand of the million ants starting from point A reach the end, five hundred thousand from point B, and four hundred thousand from point C. The system concludes that choosing point B gives Black the higher winning percentage and plays B as the next move. This is a probability-sampling algorithm, and it cuts the amount of computation enormously compared with exhaustive enumeration.
Why send one million ants rather than one hundred thousand or ten million? The number is usually set by the computer's computing power and a rough estimate of the opponent. If one hundred thousand ants are enough to identify the move with the higher winning percentage, then that is all we send. The more ants sent at once, the more computing power is required.
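Here is a minimal sketch of the rollout idea just described, written in Python. It is plain Monte Carlo sampling over a few candidate moves, not the full AlphaGo algorithm (which also uses a value network and an exploration rule), and the game interface (legal_moves, play, winner) is a hypothetical placeholder.

import random

def rollout(state, player):
    # One "ant": play random moves until the game ends,
    # and report whether `player` won that playout.
    while state.winner() is None:
        state = state.play(random.choice(state.legal_moves()))
    return state.winner() == player

def choose_move(state, player, candidates, ants_per_move=1_000_000):
    # `candidates` are the few high-quality moves suggested by the
    # deep-learning network (the points A, B, and C in the text).
    win_rates = {}
    for move in candidates:
        wins = sum(rollout(state.play(move), player)
                   for _ in range(ants_per_move))
        win_rates[move] = wins / ants_per_move    # fraction of ants reaching the top
    return max(win_rates, key=win_rates.get)      # e.g. B, with 500,000 of 1,000,000

The parameter ants_per_move plays the role of the one million ants: raising it makes the estimated winning percentages more reliable but costs proportionally more computing power.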
CPU and GPU (graphics processing unit) chips carry out the neural-network calculations and the Monte Carlo tree search in parallel, simulating a massive number of final positions, something human computing power cannot match. And because deep learning models the master players, artificial intelligence appears to possess a human sense of the big picture, distilled from the data of millions of games between masters.
I believe smart readers, even those who do not know much mathematics, now have a basic grasp of how AlphaGo operates, although the actual algorithms and strategies are far more complicated than the description above. AlphaGo shows us the current level of artificial-intelligence and deep-learning technology. Many institutions and individuals are doing similar research and development, and, like the Eight Immortals crossing the sea, each shows its own special prowess.
Once recorded by the Internet as data, human behavior becomes an endless treasure that nourishes artificial intelligence of every kind and, in turn, helps humans. Machine translation, speech recognition, and image recognition are all built on the vast data the Internet provides, including users' click behavior. Why is the accuracy of Baidu's search engine unmatched by other search engines in China? Because Baidu has the largest amount of data, the most advanced algorithms, and the deepest accumulation. Every user click is in effect training the Baidu Brain behind the search engine, telling it which information matters most to users.
Whenever artificial intelligence hit a setback, people began to believe it was simply too hard for a machine to think like a human being, but the difficulty also pointed to an opportunity. After the 1990s, people realized that artificial intelligence does not have to think the way humans do, as long as it can solve human problems. When the linguist Noam Chomsky was asked, "Can machines think?" he answered by quoting the Dutch computer scientist Edsger Dijkstra: "Can submarines swim?" A submarine cannot swim the way a fish or a person does, but it has outstanding underwater abilities.
When we look back at history, and not just the history of the Internet, we find that the whole of human industrial development has been gestating artificial intelligence. Kevin Kelly said that the self-reciprocating motion of the steam-engine piston is a delicate piece of design, and that such self-regulation already contains an element of "evolution": the pursuit of automation is the evolutionary drive behind artificial intelligence.
For example, at the beginning of the industrial revolution, steam engines first appeared at coal mines and pits. Early steam engines were inefficient and energy hungry, so they could only be used where coal was especially abundant and cheap. Mining coal produced a great deal of water that had to be pumped out of the mine. With that demand and enough cheap energy at hand, the idea of the steam engine took shape. Once in service, the steam engine kept developing and eventually drove the industrial revolution. Artificial intelligence is similar: when you have enough data, the data becomes the fuel on which the artificial-intelligence engine runs.
Thanks are due to the development of the Internet and to the data records generated by all human activity, without which computers would lack the means to learn. Thanks are also due to the explorers of artificial intelligence, who are by no means all computer scientists. Some do biological research, some do engineering research, some study the automatic iterative optimization of mathematics and computer programs, and some redesign the collaborative architecture of computer chips. Their various research results have merged like rivers into the sea and finally become today's artificial intelligence.
The media's astonishment at AlphaGo in 2016 was in fact a delayed reaction. Back in 2007, Geoffrey Hinton, a giant of the field, had already sensed that the gale was rising ahead of the coming storm.
At that time, one of Hinton’s students, with the help of Google big data, applied Hinton’s earlier research findings to speech-recognition technology and achieved remarkable success. Hinton couldn’t help but sigh: “The past failure was only due to the lack of data and computing power.”
In the second decade of the twenty-first century, everything is ready for artificial intelligence, and a period of fierce competition is just beginning. Since 2015, artificial-intelligence entrepreneurship has kept heating up. According to an analysis of the artificial-intelligence industry by CB Insights, a US venture-capital data firm, investment in artificial intelligence exceeded $1 billion in the first quarter of 2016, and there were 121 investments in the second quarter, compared with twenty-one in the same period of 2011. From the second quarter of 2011 to the second quarter of 2016, investment in artificial intelligence exceeded $7.5 billion, of which more than $6 billion came after 2014.
The Wuzhen Index: Global Artificial Intelligence Development Report shows that in the first two quarters of 2016, more than sixty artificial-intelligence startups were founded in China, attracting $600 million of investment. In 2015, 202 investments were made in mainland China in the field of artificial intelligence, involving a total of $1 billion (about 6.8 billion yuan). The market is huge.
In 2016, Academician Tieniu Tan, vice president of the Chinese Academy of Sciences and vice chairman of the Chinese Association for Artificial Intelligence, said that the global artificial-intelligence market was worth $127 billion in 2015, was estimated to reach $165 billion in 2016, and was expected to exceed $200 billion by 2018.
China, the United States, and Britain are the most important regions for the development of artificial intelligence. The United States, the birthplace of the Internet and of artificial intelligence, has a unique talent advantage, a strong technology base, and huge research funding; it remains number one in the field. Besides Google, Facebook, Microsoft, Amazon, IBM, Apple, and other giants making large investments in artificial intelligence, there are nearly one hundred companies, large and small, focused on AI. For example, x.ai, which specializes in natural-language processing, attracted $340 million across three rounds of financing.
Even as its manufacturing industries shrink, the United Kingdom's long-established universities keep gathering talent in artificial intelligence. DeepMind Technologies, which developed AlphaGo, has benefited from graduates of UK universities.
Amazon launched the Alexa smart voice assistant and the Echo smart speaker to compete in the voice market with Apple, Google, and Microsoft. In June 2016, Amazon CEO Jeff Bezos revealed in an interview with the US technology journalist Walt Mossberg that Amazon's investment in key artificial-intelligence projects had been under way for four years: "The project team has more than 1,000 people, and what you see is only the tip of the iceberg."
In September 2016, Microsoft announced a new artificial-intelligence research-and-development group under Executive Vice President Harry Shum, who leads thousands of computer scientists and engineers in integrating artificial intelligence into the company's products, including Bing, the digital assistant Cortana, and its robotics projects. At the end of the year, Microsoft officially released a service for building chatbots and announced that it would provide cloud-computing services to OpenAI, the artificial-intelligence lab cofounded by Elon Musk and Sam Altman, president of the startup incubator Y Combinator.
Facebook also has its own artificial-intelligence lab and a team like Google Brain—Applied Machine Learning. The mission of the group is to promote artificial-intelligence technology in a variety of Facebook products. In the words of Mike Schroepfer, the company’s chief technology officer, “About one-fifth of Facebook’s engineers are now using machine-learning technology.”
Of course, AlphaGo's owner, Google (which acquired DeepMind Technologies in 2014), is not content with Go-playing prowess, and its investment in artificial intelligence has kept expanding. In 2012, Google had only two deep-learning projects; by the end of 2016 the number exceeded one thousand. Today Google applies deep learning in its search engine, Android, Gmail (its free webmail service), translation, maps, YouTube (its video site), and even unmanned vehicles.
China has a vast range of business-application scenarios, users, and data, as well as the largest pool of talent, and it has made rapid progress. Besides BAT (shorthand for the three major Chinese Internet companies: Baidu, Alibaba, and Tencent), Huawei Technologies, and other giants working intensively on artificial intelligence, many artificial-intelligence companies are emerging in vertical industries. At Internet forums throughout 2016, the heads of companies in e-commerce, social media, and search were all steering the conversation toward artificial intelligence and reporting achievements large and small.
In 2016, Baidu's speech-recognition accuracy reached 97 percent, and its face-recognition accuracy reached 99.7 percent. Through the cloudization of Baidu Brain, Baidu's platforms Tiansuan, Tianxiang, Tiangong, and Tianzhi have successively opened Baidu Brain's technology and capabilities to the whole of society.
The few who held their ground in machine learning more than a decade ago are now the most valuable talent. Since the rise of artificial intelligence, with algorithms increasingly open source, talent has become one of the two scarcest resources, alongside data.
The professional knowledge behind artificial intelligence is closely tied to basic disciplines such as mathematics and biology, and artificial-intelligence scientists must be among the best in those fields, which makes them especially rare. In China, fewer than two hundred doctoral and master's students graduate in this field each year, far short of what the many new startups demand. The situation abroad is similar: in 2015, Uber poached forty of the 140 researchers at Carnegie Mellon's Robotics Institute, causing an uproar in the industry.
The competition for talent does not stop there. AI companies are acutely sensitive to brain drain. In 2017 and 2018, many academic stars left their posts or started businesses, which reflects the reality of the burgeoning AI field: those who excel academically want room to realize their potential.
Baidu is the leading representative of China's artificial-intelligence industry, and a large amount of top talent has joined it: Wang Haifeng worked at Microsoft before joining Baidu; Andrew Ng came to Baidu from the United States; Zhang Ya-Qin came from Microsoft; Lin Yuanqing came from the NEC American lab, with its wealth of machine-learning experts; Jing Kun, a creator of the Xiaoice chatbot, came from Microsoft; and Qi Lu, the highest-ranking Chinese executive at a US technology giant and an authority on artificial-intelligence technology, gave up the post of Microsoft executive vice president to join Baidu. At the same time, many skilled people started out at Baidu before founding their own artificial-intelligence companies. Baidu is a microcosm of China's vitality in attracting and cultivating artificial-intelligence talent.
Many of the industry's super brains have come together to create an epoch-making Chinese brain. We have been through the PC era, are now in the era of the mobile Internet, and are about to enter the super-intelligent era of the Internet of Everything. A "super brain" environment, in which all kinds of gathered data are processed, is becoming possible, and Baidu is building such an environment. The aim is to let artificial intelligence permeate the lives of China and the world like water and electricity, and to push everything toward "informatization." Baidu Brain, for example, has its own eyes, ears, mouth, and cognitive decision-making ability. Overall it is the equivalent of a child, but in specific abilities such as translation, speech recognition, and image recognition it greatly surpasses human beings. We open these abilities to everyone to develop and explore all kinds of artificial-intelligence applications. Baidu Brain has become a tool for many developers and an operating system for artificial intelligence; it promotes the standardization of artificial intelligence and serves companies, entrepreneurs, and individuals across the board.
Therefore, we are eagerly calling for the establishment of China Brain, a national platform for deep-learning servers, algorithms, and application infrastructure. It will be a manifestation of the all-around upgrading of China’s competitiveness and a powerful accelerator for the Chinese renaissance.
Having discussed how human data nourishes artificial intelligence, I also need to say something about our users, the countless consumers who have supported Baidu and the development of the high-tech Internet.
Today, in addition to the influence of large companies such as Google, Microsoft, and the BAT group, the decentralization of the Internet and big data technologies enables small businesses, talented technicians, and even users to change the situation.
The secret of AOL's success was actually simple: discover what people want most and then offer it to them. Its success also owed much to strong marketing, which established a user-friendly corporate image. Its president, Jim Kimsey, played a key role: he knew nothing about computer technology, but as a businessman he kept his feet firmly planted among the public. "The center of my world is the consumer; we want to be the Coca-Cola of the Internet world."
I emphasized the importance of users in Battle in Silicon Valley. In the eyes of engineers, the "user" is defined strictly and rationally; the cycle of user requirements, development, and feedback is laid out rigorously in technical documentation. But the development of the Internet does not merely deliver technical services; it also provides a stage for thoughts and emotions. One might say the Internet has created a new kind of user, one defined by opinion.
Many of our programmers and engineers enjoy Baidu's relaxed environment for technical staff, which is simple and straightforward. Some technicians are not good at communication or complicated interactions; they would rather just build products. But users' emotions differ from engineers'. Engineers in the laboratory may never experience the events ordinary people live through, or their complex and volatile dealings and feelings. People in media and public relations understand users' emotions better. Colleagues in our public-relations department sometimes complain that technicians do not understand user psychology: when a problem arises, engineers tend to assume that modifying the code is all that needs to be done. But human emotions cannot be fixed like code, and this has troubled us deeply. One of the problems we must solve is how to bridge the gap between technicians, merchants, and ordinary users. That requires inspired product ideas and the humility to learn across boundaries.
Thinking about users' everyday needs is a task that must be sustained. But as far as the theme of this book is concerned, we are engineers after all, and we must never forget to meet users' needs with technology and numbers, using technology to differentiate data precisely and serve different users.
The trend of digitization has been discussed in books like Nicholas Negroponte’s Being Digital and Kevin Kelly’s Out of Control: The New Biology of Machines, Social Systems, and the Economic World and What Technology Wants, and it also always exists in the technician’s mind. We are surrounded by living data. Data often sounds dangerous. For example, will private data be sold? We will continue to talk about this topic later. Here, in simple terms, the data in artificial intelligence is by no means data like an ID-card number or a password. Today’s artificial intelligence focuses on finding the overall model of chaotic data, thereby optimizing production and service. Advances in translation, speech recognition, and image recognition are the best examples. These chaotic data will be of great value for human beings after AI sorts out their regular patterns, from daily speech recognition to credit-fraud prevention in the financial field to antiterrorism security at the national level.
Technology should adapt to users at all times. The product side responds directly to users' needs and should continuously optimize the technology's performance. We believe good artificial intelligence should be unobtrusive, not like a power source with unstable voltage; it should keep improving accuracy and refining product details. For example, some companies have good speech-recognition technology, but the overall design of their input method is not convenient enough, which hurts the user experience. Baidu, too, has unsuccessful products that need to be improved with users' help.
Data and technology are not cold; they will show humanness when combined with good artificial-intelligence methods.
After the anti-pornography campaign in Dongguan in early 2014, there was a striking wave of migration between Dongguan and other Chinese cities. A senior news editor told us that the map Baidu created with visualization technology transcended the news event itself; it gave him the feeling of looking down on the world like a god. The Baidu Migration Index reflects human movement through data-visualization techniques. The migrations of the digital age are only a small page in the epic of human migration over millions of years, but they are a historic page in the era of big data.
I think the map signifies a historic moment in the era of artificial intelligence, as smart-map technology illustrates human activity. Artificial intelligence doesn’t have humanity, but when combined with the developer’s creativity and ideas, it can provide a new perspective, even a different type of human compassion.
Both computers and the Internet are part of the artificial-intelligence body. Each data set is a record of human activity and humanity. Artificial intelligence thus finally emerges like the soul. It can be humane.
One philosopher said that human beings are a kind of ongoing existence. Baidu has accumulated a vast amount of map data which, supplemented by designers' wisdom and a range of sophisticated algorithms, can depict all kinds of human movement and reveal how people live on the road.
At its peak, Baidu Maps has handled 72 billion location requests in a single day, each one a record of human activity. The map data also shows that cities in central and western China are increasingly connected. The bustling traffic heat maps and their rhythms are like a city's pulse. The eye of the map has a very wide field of vision.
People of my generation have all heard Tong Ange's song: "In order to live, people rush around, but their fates are staggered." I hope that with the help of artificial intelligence, human trajectories will not merely be staggered but will intersect, merging like rivers and flowing on without end.
A young scientist at Baidu Big Data Lab who majored in biology studied the laws of fish-school movement at Princeton before he returned to China. Upon seeing the migration map, he said that Baidu let him know that human data can also be studied like a fish school, only with more convenient methods, so he decided to join Baidu. In 2016, he and his colleagues used the search data changes on Baidu Map to accurately predict the decline in iPhone sales. With the help of data, the Big Data Lab provides intelligent awareness for a variety of urban life and business operations.
In 2014, the Ministry of Transport proposed to deepen reform in the following ways: promote pragmatic innovation; accelerate development of four modes of transportation; accelerate the construction of a market- and enterprise-oriented industrial technical-innovation system that combines production, education, and research; and promote the transformation of scientific and technological achievements into transportation productivity. The Ministry of Transport made efforts to establish a multichannel and multimodal travel-information service system and initially establish an integrated traffic-information service platform to release real-time travel information to the society, solving problems such as poor travel information.
In this context, Baidu proposed the China Smart Transportation Cloud Service Platform Cooperation Plan and jointly established a cooperation platform with the Research Institute of Highway of the Ministry of Transport and the National Intelligent Transport Systems Center of Engineering and Technology. Relying on the Ministry of Transport’s key technology project, Research and Demonstration on Open Public Travel Information Service Based on Cloud Computing, this plan activates existing data, establishes a province-wide information-resource sharing and exchanging mechanism, promotes the sharing of service information between government and enterprises, and is open to the whole society.
The smart map can report the degree of road congestion based on users' moving speed and can intelligently route around roads restricted by odd-even license-plate rules. With virtual-reality technology, we can find our way on the map as we do on the road. Drawing on traffic big data, aided by algorithms, and responding to the needs of traffic-management departments, intelligent map systems can now offer solutions for easing urban congestion, greatly reducing the pressure on traffic-control departments.
The smart map's collection of geographic data has made many other smart projects possible. High-definition map technology with centimeter-level accuracy has been applied to the development of unmanned vehicles. At the World Internet Conference in 2016, an unmanned Baidu vehicle was publicly road-tested in Wuzhen. The route ran about 3.16 kilometers; the vehicle passed three traffic lights and made one U-turn, with pedestrians, cars, and electric bikes on the road and complex weather of rain, mist, and haze. The test bears comparison with the road tests conducted by our Silicon Valley peers in North America. It was a small step for unmanned vehicles, but it will surely prove a big step for artificial intelligence.
Artificial intelligence does not grow on trees; it is a natural result of the progress on computer-network technology and data-processing technology over the past decades and the data set of human beings. The intelligent development of Baidu Search and Baidu Map is a microcosm of this process.
There are plenty of news reports about robots in various mass media; some of them are just for fun. For example, news broke the other day about a robot attacking people in an exhibition. The truth is that an educational assistant robot fell from the stage and hit someone. Another news item concerns a cemetery equipping itself with robots to embolden grave guards, but such robots are actually just toys, more like a spoof. If we look back at history scientifically, we will find that artificial intelligence is neither a myth nor a joke, but a real product from the labor of mankind. We don’t need to fear or worship it.
In the field of artificial intelligence, scientists' descriptions of the technology are often straightforward and modest. Jun Wu, a former Google engineer, has recounted how in 2003 he and his colleagues worked to improve the accuracy of Google's keyword search. One of the main problems they tackled was deciding which sense of a word, among its synonyms and meanings, to search for in order to satisfy the user's intent. When search results are inaccurate, users keep searching with a synonym or click a result that is not ranked near the top. In doing so, users are in effect creating keyword collocations, and the system records those collocation relationships; the goal then becomes returning results faster and more accurately. Wu said:
The way we did it may sound less than high-tech. We sorted out the keyword collocations users had searched over the years, took one of the company's five largest data centers offline during the four-day US Independence Day holiday in 2003, and used those four days specifically to compute the matches for each keyword. It was really an exhaustive method: it solidified the word combinations users most often choose, so that the next time a user makes a similar search, the system can present the result faster and more accurately.
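A toy sketch of the precomputation Wu describes, in Python; the query log, keywords, and session structure here are entirely hypothetical, and a real system would sweep over billions of log entries.

import collections

# Hypothetical query log: each entry is the sequence of keywords one user
# tried in a session before finding what they wanted.
query_log = [
    ["jaguar", "jaguar car"],
    ["jaguar", "jaguar animal"],
    ["jaguar", "jaguar car"],
]

# Offline, "exhaustive" pass: count which refinement each keyword most
# often leads to, and freeze the winners into a lookup table.
counts = collections.defaultdict(collections.Counter)
for session in query_log:
    for first, refined in zip(session, session[1:]):
        counts[first][refined] += 1

collocations = {kw: c.most_common(1)[0][0] for kw, c in counts.items()}

# Online, the next time a user types "jaguar", the system already knows
# the most likely collocation and can rank results accordingly.
print(collocations["jaguar"])   # -> "jaguar car"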
In fact, the technical logic of machine translation is similar to the tactical, exhaustive method Wu describes for search. According to the New York Times, at a meeting of Google's translation team one Wednesday in June 2016, a Baidu paper published in a major journal on machine translation set off a discussion among the staff. Mike Schuster restored silence to the room: "Yes, Baidu has published a new paper. It feels like someone has seen through what we have done; the papers share similar structures and results." Baidu's BLEU scores (bilingual evaluation understudy, a measure of how closely machine translation matches human translation) had essentially matched what Google achieved in its internal tests in February and March. Quoc V. Le was not upset; he concluded it was a sign that Google was on the right track. "This system is very similar to ours," he said quietly.
Quoc V. Le was a PhD student of Andrew Ng's. He may not have known that the Baidu paper had nothing to do with Ng; it was done independently by Baidu's natural-language department. The New York Times account of the Chinese company is, of course, simplified. Ng, for his part, believes Chinese media should not always assume that foreign technology is more advanced. They tend to notice an advance only after the fact and then hail it as a breakthrough: many leading inventions in artificial intelligence were made by Chinese researchers but neglected by Chinese media at first; when foreign teams later produced the same results, those were celebrated as breakthroughs.
Baidu released an NMT-based translation system a year earlier, and Google launched a similar system in 2016. We can conclude that the fundamentals of the most advanced explorers in this field are similar; in the end, whoever has the deepest accumulation and the best optimization will win.
Today's artificial intelligence is different from that of the past: the emphasis has shifted from rules of thought to data and strategy. In the past, people tried to design perfect logic for computers, constantly abstracting human reasoning into rules and functions and feeding them in. Today's artificial intelligence rests mainly on advances in big data and algorithms. In other words, today's explosion of artificial intelligence is built on the Internet boom of the late 1990s. With the Internet, significant data is generated continuously, and note that this is not data users consciously enter, such as name, age, address, or hobbies, but data generated automatically as they use the Internet: every search and every click is data, and so is every trajectory of movement, such as your walking or driving routes.
China is already the world's leading manufacturing power; now it needs to upgrade its "soft power." Spirit and culture are soft power, and so are computation and data. Combining this soft power with traditional industry creates "smart+," which will be woven into our production and daily life in visible, tangible ways.
Instead of asking what Baidu wants to do, we’d better ask, “Why must Baidu do it?”
Every company has its own strategy and tactics. In 2013, the Chinese mobile-Internet entrepreneurial wave began, and many companies poured huge sums into that bottomless pit, a bold strategy in its own right. Baidu focused on the long term and on science. At the time, few people noticed that Baidu was working on artificial intelligence. Today artificial intelligence is famous throughout the world, and some marvel at the foresight and firmness of Baidu's strategy. Baidu recognized the nature of the Internet information industry early; once determined, it resolutely took its own path, regardless of outsiders' judgments. Baidu's broad positioning and key breakthroughs had already laid a foundation in artificial intelligence by the time the whole world began to pay attention.
We did not direct Baidu's artificial-intelligence technology toward spectacles such as playing Go or predicting the results of singing competitions. Instead, we focused on building internal strength and on turning artificial intelligence into practical services that improve human life. We apply deep learning not only in areas such as speech recognition, machine translation, and street-view house-number recognition but also wherever its success can significantly enhance the user experience.
In 2013, Baidu Navigation was the first in China to announce that navigation would be permanently free, bringing the country into the era of free navigation. Now we are opening the data interfaces of Baidu Maps for others to develop and use. Anyone can use the positioning technology and solutions Baidu Maps provides, at far lower cost than a traditional GPS tracker. Delivery companies can use the platform to plan optimal delivery routes, and game developers can build location-based games like Pokémon Go. We open Baidu Brain so that more people can use artificial eyes and ears in their own services. We open the deep-learning platform PaddlePaddle so that anyone who is interested can create artificial-intelligence services of their own. And we hope that nontechnical people, too, will learn to use data intelligence to optimize their work, improve themselves, and pursue their ideals.
Many college-entrance-examination candidates have used the Duer personal-assistant robot to help them choose a college and a major. In China, every path draws an enormous crowd; when I was a student, people described the college-entrance examination as "a thousand troops and ten thousand horses crossing a single-log bridge." Much as it does with map data, Duer analyzes the huge body of college-entrance-examination data, uses deep-learning technology to respond to and sense a student's hopes and anxieties, and tries to give an accurate answer. Here artificial intelligence records not a trajectory through physical space but the spiritual trajectory of a student's growth.
In the early 1990s, I went to the United States to study computer science. There were many young people like me then, traveling between China and the United States like migratory birds, hoping to change the world with code. If a data map had existed to record those transoceanic trajectories, it would have been fascinating. Now artificial-intelligence scientists have carried the fire back to China, and I believe the flame will burn even brighter here, because China has ample fuel: a large educated population and widespread use of computers and mobile devices. This wealth of data gives China a unique advantage in developing and applying deep-learning technologies. With that advantage, we are ready to create a legend like Silicon Valley's in the 1990s.
What Baidu is building is not just frontier research but also data infrastructure and a deep-learning development platform that gathers people's wisdom.
Before Trump was elected president of the United States, more than a hundred Silicon Valley leaders issued an open letter saying his election would be a disaster for innovation. That struck me deeply. If innovation in the United States really is affected, who will pick up the banner and lead the way? Could the world's center of innovation move from Silicon Valley to China?
Talented people are indeed coming to join us. Baidu has also set up a laboratory in Silicon Valley to be closer to America's skilled talent. The China Brain plan Baidu has proposed is comparable to any of the great national projects. Seventy years ago, top scientists returned to China from abroad and threw themselves into great projects with great enthusiasm. Will such glorious achievements reappear today?
Of course, the great projects of that era depended on national investment and industrial policy. After the Cold War ended, competitive pressure among nations eased and investment in cutting-edge technology fell sharply; Musk's rocket program, for instance, was in fact the result of the government's decision to transfer NASA (National Aeronautics and Space Administration) rocket technology and personnel to him. The Chinese government has strong determination and is investing heavily, and developing the artificial-intelligence industry is a matter of consensus. It is the best of times; it is also the most uncertain of times. Artificial intelligence is a way of adapting to uncertainty. Companies large and small are investing in artificial-intelligence research and development, bringing competition and diversity that should produce healthy interaction and growth.
Regarding that uncertainty, a White House report has already examined the impact of artificial intelligence on employment. The rapid rise of Silicon Valley and the decline of manufacturing in the American heartland have widened the country's divisions: some people have enjoyed the progress while others have been left behind. Baidu wants to be a gatherer of talent, and Chinese companies must strive to build an ark that carries everyone into the intelligent era.
Dr. Wang Haifeng, Baidu's chief technology officer, was elected an ACL (Association for Computational Linguistics) Fellow in November 2016. The youngest ACL Fellow, he is also the first Chinese scientist to have served as the association's president in its fifty-year history. The selection committee wrote of him: "Wang Haifeng has made outstanding achievements in machine translation, natural-language processing, and search-engine technology, in both academia and industry, and has made tremendous contributions to the development of ACL in Asia." At the beginning of 2017, Qi Lu, a well-known scientist and executive in artificial intelligence, joined Baidu. These moves signal the direction of international talent flows. Thousands of outstanding artificial-intelligence scientists in China are working together to create the future of humankind.
Not long ago, Amazon's cashierless store astonished shoppers. Behind this novel shopping experience lies the shadow of cashiers being laid off. Today, with online customer service being replaced by machine agents, stenography and transcription by speech recognition, and even cashiers, drivers, factory workers, copy clerks, and lawyers facing replacement by artificial intelligence, how shall we face the world? What support should governments and enterprises provide for workers? How should the economic and social structure adjust to the era of artificial intelligence? We want to listen to the needs of ordinary people. That is also the original intention behind this book by our artificial-intelligence team.
Peter Thiel is a venture-capital genius of Silicon Valley, as famous as Marc Andreessen, the creator of Mosaic and Netscape, two of the first widely used web browsers. Thiel is good at grasping technical trends and spotting dark horses; in 2016 he became famous again for accurately predicting Trump's election as president of the United States. In 2011 he said, "We wanted flying cars; instead, we got 140 characters." The 140-character Twitter was wildly popular, but Thiel saw clearly what was missing from the bustling Internet. He criticized people for letting the pace of progress slow, argued that hippie culture had displaced progressivism, and complained that venture capitalists were keen to invest in asset-light companies, mostly mobile-Internet firms such as Airbnb and Uber, without any clear plan or conviction about the future. He believes that in the Internet+ era, humans have made great progress at the level of bits and little at the level of atoms. He therefore invested decisively in rockets, anticancer drugs, and artificial intelligence.
I agree that mobile-Internet entrepreneurship has obscured the kind of progress we ought to pursue. Baidu must hold to its own direction and contribute to advancing humanity's core capabilities. Thiel has said that Americans in the early twentieth century were willing to try new things and dared to plan and carry out a decades-long project like the moon landing, whereas today no one makes such plans and venture capitalists merely look around for quick gains and instant gratification. Baidu is willing to imagine an intelligent world and to build it. It wants to make artificial intelligence a new operating system, not just for computers but for the world, while seriously contemplating and answering the challenges of artificial intelligence in advance, and ultimately making a different world. That is why I say we must make it happen!
The intelligent revolution is a benign revolution in production and lifestyle and also a revolution in our way of thinking. Huge opportunities and challenges coexist. Subsequently, I will discuss the specific aspects of the intelligent revolution; detail the breakthroughs made on deep learning, such as visual recognition, speech recognition, and natural-language processing; and depict the upcoming intelligent society along dimensions of manufacturing upgrading, unmanned driving, financial innovation, management revolution, and smart life. Furthermore, I will explore how people should cope with the development of artificial intelligence, and I will take the pulse of the intelligent revolution.