“Information technology grows exponentially, basically doubling every year. What used to fit in a building now fits in your pocket, and what fits in your pocket today will fit inside a blood cell in 25 years’ time.”
Ray Kurzweil, 2009
At the height of World War II, the codebreakers at Bletchley Park, working with Post Office engineer Tommy Flowers, had just developed the first programmable electronic digital computer, designed specifically to assist with the cryptanalysis of the Lorenz ciphers. The Lorenz rotor stream cipher machines were used by the German High Command throughout the war to send encrypted messages and dispatches. The Colossus Mark I became operational on 5th February 1944. An improved version (the Colossus Mark II) became operational on 1st June 1944, just days before the D-Day operations commenced.
The first programmable computing device that wasn’t a single-purpose machine was the Electronic Numerical Integrator and Computer (ENIAC). Originally built for the US Army to calculate artillery firing tables, it was completed in late 1945 and publicly unveiled in February 1946. By 1950, there were still only a handful of such computing machines on the planet. Nevertheless, computing had had its start.
How does that compare with today?
Nowadays, even an everyday gadget such as the sound module inside a musical greeting card1 has approximately 1,000 times the processing power of all the combined computing technology in the world at the end of World War II, and it costs just 10 cents per chip. Moore’s Law strikes again!
The average computer that you carry around in your pocket today has more processing power than the world’s biggest banks, corporations and airlines had back in the 1980s. The tablet computer that you use today would have cost US$30 million to US$40 million to build in equivalent computing power just two or three decades ago, and would have been classed as a supercomputer at the time. The smartphone that you probably have in your pocket is more powerful than all of the computers that NASA had in the 1970s during the Apollo project, and almost 3 million times more powerful than the Apollo guidance computer that Neil Armstrong, Buzz Aldrin and Michael Collins used to navigate their way to the lunar surface. The most powerful supercomputer in 1993, built by Fujitsu for Japan’s space agency at an approximate cost of US$34 million (1993 prices), is easily outstripped performance-wise by a smartphone like the Samsung Galaxy S6. The same smartphone is 30 to 40 times more powerful than all of the computers that Bank of America had in 1985.2 An Xbox 360 has about 100 times more processing power than the space shuttle’s first flight computer.
If you wear a smartwatch on your wrist, it likely has more processing power than a desktop computer dating back 15 years. The Raspberry Pi Zero computer, which costs just US$5 today, has the equivalent processing capability of the iPad 2 released in 2011. Vehicles like the Tesla Model S carry multiple central processing units (CPUs) and graphics processing units (GPUs), creating a combined computing platform greater than that of a 747 airliner3.
Within 30 years, the computing technology you carry in your pocket, or have embedded in your clothes, your home and even your body, will be more powerful than the most powerful supercomputer built today, and probably more powerful than all of the computers connected to the Internet in the year 1995.4
The early days of the Internet began as a project known as the Advanced Research Projects Agency Network (ARPANET), led by the Advanced Research Projects Agency (ARPA, later Defense Advanced Research Projects Agency, DARPA) and the academic community. The first ARPANET link was established between the University of California, Los Angeles (UCLA), and the Stanford Research Institute (SRI) at 22:30 on 29th October 1969.
“We set up a telephone connection between us and the guys at SRI. We typed the L and we asked on the phone, “Do you see the L?”
“Yes, we see the L,” came the response.
We typed the O, and asked, “Do you see the O?”
“Yes, we see the O.”
Then we typed the G, and the system crashed...”5
Prof. Leonard Kleinrock, UCLA, from an interview on the first
ARPANET packet-switching test in 1969
In parallel to the development of early computer networks, various computer manufacturers set about shrinking and personalising computer technology so that it could be used at home or in the office. Contrary to popular belief, IBM wasn’t the first company to create a personal computer (PC). In the mid-1970s, Steve Jobs and Steve Wozniak had been busy working on their own version of the personal computer. The result—the first Apple computer (retrospectively known as the Apple I)—actually preceded the IBM model6 by almost five years, and used a very different engineering approach. However, it wasn’t until Apple launched the Apple II that personal computing really became a “thing”.
Figure 3.2: An original Apple I computer designed by Jobs and Wozniak and released in 19767 (Credit: Bonhams New York)
Around the same time as Jobs and Wozniak’s development of the earliest form of PC, there was also a rapid downsizing of computers in the workplace. No longer did computers need to be room-filling behemoths separated into disk packs, printers, input devices and CPUs, and the corporate computing landscape was no longer the sole domain of the mainframe. Mainframes were making way for minicomputers, or what are now more commonly known as midrange systems.
The term “minicomputer” wasn’t an entirely accurate description of something still the size of a large fridge, but these machines were smaller, and often more powerful, than the early mainframes that preceded them. Digital Equipment Corporation (DEC) developed a series of Programmed Data Processor (PDP) minicomputers, starting with the PDP-1 and gaining significant traction by the time the PDP-11 was released. In the 1980s, Sun Microsystems, HP and other companies started to dominate the market for accounting and other core enterprise systems. However, the personal computer was about to revolutionise the workplace environment too, primarily as a result of emerging networking technology.
In 1979, Robert Metcalfe founded 3Com, expanding on the work Xerox Palo Alto Research Center (PARC) had done in the early 1970s on Ethernet local area network (LAN) technology. Initially, the software that could utilise these LAN-based protocols was limited to simple tasks like sharing files, printing or sending emails. This technology quickly developed into what became known as n-tier computing, allowing us to link many personal computers and application servers into very powerful office network systems. Companies like Oracle were born out of the need to build databases and software systems on these new architectures.
Metcalfe’s Law, named after 3Com’s founder, essentially states that as the number of connections (or nodes) in a network increases, the value of the network to its users grows roughly in proportion to the square of the number of users. It explains why social networks like Facebook and Twitter have grown so quickly in recent years. Understanding the effect of networks is essential to understanding our future. When we combine the laws of network growth with the growth of computers defined by Moore’s Law, we essentially see that the exponential growth of interconnected computers and devices is now unstoppable. During 2008, the number of “things” connected to the Internet exceeded the number of people on the planet,8 and the growth of global computing networks has only continued to accelerate.
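To make the arithmetic behind Metcalfe’s Law concrete, here is a minimal Python sketch (the node counts are arbitrary, chosen purely for illustration) that tabulates how the number of possible pairwise connections, n(n-1)/2, grows with the number of nodes n.

```python
# Illustrative sketch of Metcalfe's Law: the number of possible pairwise
# connections between n nodes grows with roughly the square of n.
def possible_links(n: int) -> int:
    """Number of distinct pairwise connections between n nodes."""
    return n * (n - 1) // 2

for n in (10, 100, 1_000, 1_000_000):
    print(f"{n:>9,} nodes -> {possible_links(n):>15,} possible connections")
```

Ten nodes yield 45 possible connections; a million nodes yield roughly half a trillion, which is the intuition behind the explosive value of large networks.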
Today, we network light bulbs, home thermostats, door locks, aircraft, vehicles, drones, robotic vacuum cleaners and many more appliances and gadgets. We are on the cusp of an explosion in interconnected, intelligent devices, sensors and nodes that promises to change the world as we know it. By 2020, 50 billion “things” will be connected to the Internet, and by 2030 we could be talking about as many as 100 trillion sensors, or more than 10,000 sensors for every human on the planet. These sensors will be generating feedback on everything from our heart rate to the charge in our electric vehicle, the pollution in the air around us, the sugar levels in our blood stream or even the condition of our daily feculence. They will drive an informed, measured future that extends life and makes the planet safer and cleaner.
For such a revolution in connectivity to truly change the future and fortunes of the planet, we’re going to have to enable network access to everyone, not just developed nations. How do we get there?
During the 2014 Occupy Hong Kong protests, one of the stars of the show was a then-new app called FireChat, which utilised a form of mesh networking technology. Basically, the app can use your phone’s WiFi or Bluetooth radio to communicate with other phones, even when the Internet and cellular networking services are down. Open Garden, the creators of FireChat, announced a partnership in Tahiti in October 2015 that would enable residents of the island to communicate with each other without needing a data plan or a connection to a cellular service.
Mesh networking promises to be the ultimate solution to network connectivity. In theory, every Internet-enabled device can become a node on a distributed network that not only enables that device to connect to the web, but also allows other devices to communicate via shared connections. In today’s web, you have access points, whether via Internet service provider (ISP) hardpoints or WiFi hotspots, that act as connections to the larger network that is the Internet. Mesh nodes are small radio transmitters that don’t just communicate with users of that node or access point, they also communicate with each other. In that way, if one of those nodes loses connectivity to the Internet backbone, it just shares connectivity with other nodes that it is in range of. This is a truly distributed network topology, no longer wholly reliant on connections to the Internet backbone at each access point.
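As a rough illustration of that idea, the following sketch (the node names and topology are hypothetical, invented for this example) treats a mesh as a simple graph and checks whether a device can reach the Internet by relaying through neighbours until it finds one with a backbone uplink.

```python
from collections import deque

# Hypothetical mesh: each node lists the neighbours it can reach by radio.
mesh = {
    "phone_a": ["phone_b", "cafe_router"],
    "phone_b": ["phone_a", "phone_c"],
    "phone_c": ["phone_b"],
    "cafe_router": ["phone_a"],   # this node has a backbone (ISP) uplink
}
gateways = {"cafe_router"}        # nodes with a direct Internet connection

def has_internet(node: str) -> bool:
    """A node is online if it can reach any gateway by hopping through peers."""
    seen, queue = {node}, deque([node])
    while queue:
        current = queue.popleft()
        if current in gateways:
            return True
        for neighbour in mesh.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False

print(has_internet("phone_c"))  # True: phone_c -> phone_b -> phone_a -> cafe_router
```

The point of the sketch is simply that no single node needs its own backbone connection, only a path to one.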
The implications of this are far-reaching, particularly in the rural areas of regions like Africa or countries like Indonesia, India and China where network connectivity is limited or non-existent. In theory, it means that every device with a small radio embedded into it, even in isolated areas, could become an Internet-access device. In addition to mesh networking technology, both Facebook and Google are working on technologies that will bring wireless Internet access to more than 2 billion unconnected people.
Facebook, in the shape of Internet.org, is prototyping a network of high-altitude, solar-powered drones, called Aquila, that use lasers to transmit data to small towers and dishes on the ground. The drones will stay aloft for months at a time and fly above commercial aircraft. Google is working on a similar project called Project Loon, but using high-altitude balloons instead.
Over the next 20 years, the greatest innovations will not be in the growth of networks, but in the way we use intelligent, networked computers embedded in every aspect of our daily lives. To take advantage of this, we’ll need a new paradigm of design, new software and new ways to interact with devices. This focus on design is apparent in the beauty of devices like the iPhone compared with the earliest mobile phones, in the specialisation of the large display screen in a Tesla, and in the absence of screens in devices like Amazon Echo. We are finding increasingly imaginative ways to build technology into the world around us.
In 1982, I started Year 9 at a select-entry secondary school in Melbourne, Australia. Melbourne High School was one of the first secondary schools in Australia to introduce computer science as a subject. Today, we see the likes of President Obama writing his first line of code, and kids learning to code via YouTube and Codecademy, but back in the early 1980s, coding was still a university endeavour. When I started learning to code at school, we used a computer inherited from Melbourne University that allowed us to program in BASIC, Pascal, COBOL and Fortran, but only using paper cards.
Figure 3.4: Computer programming in the 1970s was often done on punch cards rather than via a keyboard.
To code in those days, you had to hand write your code on paper, then transpose it one line at a time onto either pencil-marked (graphite) cards or punch cards. You would then feed the deck through a card stack reader, which read one card at a time and translated each pencil mark or punched hole into a letter, number or character that was then interpreted and compiled. The classic “Hello World” program would have required using four different cards.
I definitely fell into the geek squad at school. I still vividly remember the time I hacked into the school administration system to find the teachers’ records; I got a two-week time out from the computer room for that one. I would offer to code other kids’ assignments for them for a nominal fee. It wasn’t about the money, it was simply a test to see if I could get the same results or output using different versions of the program.
Around this time, my pal Dan Goldberg introduced me to my first Apple II computer, and not long after that I got my first VIC-20 microcomputer at home. A few years later, I convinced my father to invest in an IBM-compatible computer for the home. I’d gone from punching in programs on pencil-marked paper cards to keyboards and monochrome screens. Interfaces, especially when it came to games or graphics, were extremely primitive.
The Commodore VIC-20 microcomputer that I owned had about 4 KB of built-in RAM, a 16 KB expansion pack and a cassette tape deck for storing programs. I connected my VIC-20 up to an old black-and-white TV that my parents had lying around, even though it was capable of 16 vibrant colours. I recall buying VIC-20 hobby magazines and poring over lines of code, painstakingly typing these lines in so I could play a new game. This was how I learnt to code. By changing parameters, I was learning syntax and programming logic. When I graduated from secondary school, I had enough programming skill to go straight into a commercial programming position, allowing me to attend university part-time while I did what I loved every day—coding.
When Windows 3.0 and 3.11 came along, suddenly there was a graphical user interface that made using computers even simpler. You had standard controls and elements such as edit boxes, radio buttons and other design elements that gave you a great deal more flexibility compared with the old green-screen technology.
Computers were getting more and more powerful, but interfaces were getting easier to use at the same time. The first generation of computer interfaces was known only to engineers. The second generation allowed users to be trained to use specific programs without having to be programmers. Even with that progress, though, knowing one computer system or operating system did not translate into being able to operate or navigate another system you weren’t intimately familiar with.
It soon became possible to buy off-the-shelf software and put that disk or cartridge into your console or computer, and even if you had never used the software before, you had a fair chance of being able to navigate it. Today, we download apps to our phones or software to our laptops that take just minutes to figure out, instead of weeks of intensive training. YouTube and other web-based tools allowed my 12-year-old son to learn how to code “mods” in Java for the popular Minecraft9 game ecosystem in a few weeks.
Ultimately, as this trend continues, we’ll have immensely powerful computers embedded in the world around us that require no obvious interaction to operate beyond a response to a spoken word, or an action on our behalf. To illustrate, think about wearing a Fitbit device and the input that the computer in a wearable like that requires to be effective.
The introduction of multitouch was a big leap forward in interface design for personal devices. It allowed us to carry extremely powerful computers around in our pockets that didn’t require the additional hardware of a mouse or keyboard. It has been argued that multitouch degraded input accuracy but, at the same time, no one can deny that the simplicity of these devices means that even a two-year-old child can pick up an iPad and use it with ease.
The next phase of computing will see the way we use computers radically evolve. Input will be divided between direct input from an operator or user via a virtual keyboard, voice, touch or gestures; feedback from sensors that capture everything from biometrics, health data, geolocation and machine or device performance through to environmental data; and, finally, social, heuristic and behavioural analytics that start to anticipate and compare your behaviour. Input will be neither linear nor based on a single screen or interface.
If you carry a smartphone, a wearable fitness device or a smartwatch, your device is already capturing a ton of information about you and your movements each day. The internal accelerometer, combined with the global positioning system (GPS) chip, captures movement data; it’s precise enough to even calculate your steps and the change in altitude as you move up a flight of stairs. While devices like smartwatches and fitness bands capture your heart rate and act as a pedometer, the next generation of sensors will be able to capture far, far more.
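As a simplified illustration of how step counting from raw accelerometer data might work, here is a toy Python sketch; the sample readings and threshold are invented for the example, and real pedometer firmware uses far more sophisticated filtering and debouncing.

```python
import math

# Hypothetical accelerometer readings (x, y, z in m/s^2), sampled while walking.
samples = [(0.1, 0.2, 9.8), (0.3, 0.1, 11.5), (0.2, 0.0, 9.6),
           (0.1, 0.3, 12.1), (0.0, 0.2, 9.7), (0.2, 0.1, 11.8)]

def count_steps(readings, threshold=11.0):
    """Very naive step counter: count each time the acceleration magnitude
    rises above a threshold (a crude stand-in for real peak detection)."""
    steps, above = 0, False
    for x, y, z in readings:
        magnitude = math.sqrt(x * x + y * y + z * z)
        if magnitude > threshold and not above:
            steps += 1
            above = True
        elif magnitude <= threshold:
            above = False
    return steps

print(count_steps(samples))  # 3 steps with this toy data
```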
In 2014, Samsung announced a prototype wearable known as Simband that has half a dozen different sensors that can keep tabs on your daily steps, heart rate, blood flow and pressure, skin temperature, oxygen level and how much sweat you are producing—12 key data points in all. The Simband display looks similar to a heart rate monitor that you would see in an intensive care unit (ICU) but is worn on the wrist.
Figure 3.7: The Simband display with electrocardiogram (ECG or EKG10) and other feedback (Credit: Samsung)
In much the same way that GPS or navigation software can predict how traffic is going to impact your journey or travel time, over the next decade health sensors married with AI and algorithms will be able to sense developing cardiovascular disease, impending strokes, GI tract issues, liver function impairment or acute renal failure11 and even recommend or administer treatment that prevents a critical event while you seek more direct medical assistance.
Insurance companies that offer health and life insurance are starting to understand that these tools will dramatically reduce their risk in underwriting policies, as well as help policyholders (that’s us) manage their health better in concert with medical professionals. Insurance will no longer be about assessing your potential risk of heart disease, as much as it will be about monitoring your lifestyle and biometric data so that the risk of heart disease can be managed. The paper application forms that you fill out for insurance policies today will be pretty much useless compared with the data insurers can get from these types of sensor arrays. Besides, an application form won’t be able to help you actively manage diet, physical activity, etc., to reduce the ongoing risk of heart disease. It’s why organisations like John Hancock, the US insurance giant, are already giving discounts to policyholders who wear fitness trackers.12
With so much data being uploaded to the interwebs13 every second of every day, we have already gone well beyond the point where humans can effectively analyse the volume and breadth of data the world collects without the use of other computers. This will also dramatically change the way we view diagnosis.
You might recall that a few years ago IBM fielded a computer to compete on the game show Jeopardy! against two of its longtime champions. Watson, as the computer is known, won convincingly, defeating the previously undefeated human champions Ken Jennings and Brad Rutter.14 More recently, IBM Watson was board approved by the NY Genome Centre to act as a medical diagnostician.15 As far as we are aware, this is the first time a specific machine intelligence (MI) has been certified academically or professionally to practise medicine. It will certainly not be the last time.
What was the driver behind the medical certification? The team behind IBM Watson wondered whether Watson could learn to hypothesise on problems like diagnosing cancer or finding genetic markers for hereditary conditions if they gave it the right data. For months, the team at IBM fed over 20 years of medical journals on oncology, patient case studies and diagnosis methodologies into Watson’s data repository to test their theory.
In the peer-reviewed paper released by Baylor College of Medicine and IBM at the conclusion of the study, scientists demonstrated a possible new path for generating scientific questions that may be helpful in the long-term development of new, effective treatments for disease. In a matter of weeks, biologists and data scientists using Watson technology accurately identified proteins that modify the p53 protein structure16. The study noted that this feat would have taken researchers years to accomplish without Watson’s cognitive capabilities. Watson analysed 70,000 scientific articles on p53 to predict proteins that turn p53’s activity on or off. This automated analysis led the Baylor cancer researchers to identify six potential proteins to target for new research. These results are notable, considering that over the last 30 years scientists have averaged one similar target protein discovery per year. Watson outperformed the collective US-based cancer research effort, with its US$5 billion in funding, by 600 per cent.
Even more impressive, though, is that when Watson was fed data on a specific patient’s symptoms, it could accurately diagnose specific cancer types and the most effective treatment more than 90 per cent of the time.17 Why is this significant? Human doctors, oncology specialists with 20 years of medical experience, generally get it right just 50 per cent of the time. How is Watson able to consistently outperform human specialists in the field? Primarily, it is because of “his” ability to synthesise 20 years of research data in seconds with perfect recall.
The next obvious move is to allow doctors to use Watson to better diagnose patients, right? The hitch here was that doctors could only treat patients based on advice from a licensed diagnostician. That is why the NY Genome Centre sought and achieved board approval for Watson to be registered in New York as a licensed diagnostician.
“What Watson can do—he looks at all your medical records. He has been fed and taught by the best doctors in the world. And comes up with what are the probable diagnoses, percent confidence, why, rationale, diagnosis, odds, conflicts. I mean, that has just started to roll out in Southeast Asia, to a million patients. They will never see the Memorial Sloan Kettering Cancer Center, as you and I have here. [But] they will have access. I mean, that is a big deal.”
Ginni Rometty, chairman and CEO of IBM, during an
interview on Charlie Rose, April 2015
Now that we have established that Watson is more accurate at cancer diagnosis than a human doctor, my question to you is this: who would you rather have diagnose you if your GP suspected you might have the disease? Dr Watson or a “human”? You might argue that Watson probably doesn’t have a very good bedside manner, but that’s where understanding where this technology is taking us may radically change your view of the future of health care. By the way, did you notice that the CEO of IBM called Watson ‘he’? Just saying…
It’s very likely that sensors you carry on or inside your body in the future will be able to accurately assess changes in your health and diagnose a condition well before it becomes a problem. Soon computers will automatically assess your genetic make-up and flag known conditions for these algorithms or machine intelligences to look out for. By flagging certain anomalies, algorithms or intelligences like Watson could then recommend specific dietary changes, modifications required in your daily routine, like more sleep or more exercise, along with supplements or even personalised, DNA-specific medicines. Think of these machine intelligences in the role of a potential coach, much like a nutritionist, personal trainer or doctor. As wearable and ingestible medical devices progress, treatment will be administered automatically. In the case of diabetes, you could have your insulin levels maintained with an implant. Should the problem become more serious, the device could then flag the problem to your medical professional so, using his superior bedside manner, he could sit you down for a more “human” discussion.
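To illustrate the flag-recommend-escalate pattern described above, here is a deliberately simplistic Python sketch; the readings, thresholds and rules are hypothetical placeholders for the example, not clinical logic.

```python
# Toy illustration of the "flag, recommend, escalate" pattern described above.
# The readings and thresholds are invented for the example, not medical guidance.
reading = {"resting_heart_rate": 88, "sleep_hours": 5.2, "glucose_mmol_l": 7.9}

def coach(r):
    advice, escalate = [], False
    if r["sleep_hours"] < 6:
        advice.append("aim for more sleep tonight")
    if r["resting_heart_rate"] > 85:
        advice.append("schedule light exercise and re-check tomorrow")
    if r["glucose_mmol_l"] > 7.8:          # sustained high readings get escalated
        advice.append("flagging glucose trend to your medical professional")
        escalate = True
    return advice, escalate

recommendations, needs_doctor = coach(reading)
print(recommendations, needs_doctor)
```

A real system would of course learn these thresholds from your own baseline data rather than hard-code them.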
By 2020, medical data on individual patients will double every 73 days.18 We need technology to connect the dots, flag the outliers and recommend courses of action that doctors would have taken in the past. Avoiding emergency procedures for manageable conditions will become the norm, and the biggest costs may be associated with a subscription to the medical service and the devices you wear or ingest rather than with visits to the hospital or a doctor.
The show Breaking Bad dramatised the problems associated with the affordability of the US healthcare system by showing a high school teacher who had to turn to manufacturing illicit drugs in order to be able to afford cancer treatment. In the future, the divide in health care may not be between those with or without insurance, but perhaps between those with or without access to a healthcare AI and wearable medical tech. Smart societies will ensure that all of their citizens have access to this technology as it will dramatically reduce the cost burden of health care on society.
With the potential of 50 to 100 trillion sensors by 2030, the vast majority of inputs into the computer systems around us will be automated rather than direct. Whether it is sensors in our smartwatches, accelerometers in our smartphones, biometric readers, passive cameras or algorithms capturing behavioural data, the amount of data that comes from the world around us versus data that we input via a keyboard or screen will be in the ratio of 10,000:1 within just a decade. In other words, the way computers embedded in the world around us respond will be influenced more by what we do, what we say and how we act than by what we type or click on.
The future of computing is one that combines sensors and machine intelligence. Sensors will be the way we input data, and algorithms will be the synthesis of the data. Interfaces will simply provide the results that matter to us. We’ll do very little of the driving or input, at least in a conventional sense.
The trends in interface and experience design are now taking us in quite a different direction from how we have traditionally thought about software and interfaces to computers. In the future, there will be a significant departure from software applications themselves.
While output display has dramatically improved, input has not radically changed. We’ve gone from punch cards to keyboards, then we added a mouse, camera and microphone, and more recently we’ve enabled screens to be multitouch. However, most input is still predicated on the use of a QWERTY-style keyboard.
We moved from very simple text-based interfaces to increasingly complex interactions over time. The early computer displays were primitive monochrome screens. When we first started using web browsers and mobile phones, the interactions were, once again, quite primitive. With the iPhone, mobile apps came along and were much more interactive than the limited mobile web pages that preceded them. Our more recent move to smartwatches, smart glasses and the like has created a distributed approach to software. We can have an app on our phone, but the display and notifications associated with that app can now be instantiated on our smartwatch or smart glasses. It won’t be long before our office desk, our living room wall, our car dashboard and other environments all have embedded screens that enable interactions. We will overlay the real world with data, insights and context using augmented reality (AR) smart glasses and contact lenses too.
In the apps era, most businesses, such as banks and airlines, went for the bundling of ever more functionality, but as more and more capability is added, the propensity for “engagement rot”, as author Jared Spool calls it, becomes very high. The problem is that you cannot retain low-friction19 user experiences when you have an abundance of features; essentially, you get to a point where the features create complexity and confusion. Where is that point? Take, for example, special offers or discounts from retailers. If offers or deals are embedded in a banking app, at some point from a design perspective you are faced with the issue of whether it is a “deals” app or a banking app. The design decision is no longer clear because you have two competing, compelling use cases vying for the customer’s attention.
A longer-term view of the evolution of interface design, embedded computing and interaction science leads us to the inevitable conclusion that apps will become less and less important over time.
That summer, Google made an eight-pound prototype of a computer meant to be worn on the face. To Ive, then unaware of Google’s plans, “the obvious and right place” for such a thing was the wrist. When he later saw Google Glass, Ive said, it was evident to him that the face “was the wrong place.” [Tim Cook, Apple’s C.E.O.] said, “We always thought that glasses were not a smart move, from a point of view that people would not really want to wear them. They were intrusive, instead of pushing technology to the background, as we’ve always believed.”
Ian Parker on Jonathan Ive’s thinking
about wearable notification devices20
As context becomes critical to better engagement, functionality is already shifting away from apps. Whether on your smartwatch, smartphone, smart glasses or some other form of interface embedded in the world around us, the best advice and the best triggers for both an improved relationship and revenue-generating moments will be small, purpose-built chunks of experience.
Think about experiences where the software or technology is embedded in a customer’s life. Uber is a great example of this. The team behind Uber looked at the problem of moving people around and embedded an app in a user’s life in a fundamentally different way from how a taxi company had handled “trips” previously, and in doing so revolutionised personal journeys. It wasn’t just the app but the total experience that Uber designed. They redesigned the way drivers were recruited, the way Uber’s vehicles were dispatched (no radio), the way a passenger orders a car, the way you pay for your journey and a bunch of other innovations. Uber even allows its drivers to get a car lease or open a bank account when they sign up.
The total taxi market size in San Francisco prior to Uber was US$150 million annually. In early 2015, Uber CEO Travis Kalanick revealed it had ballooned to US$650 million, with Uber taking US$500 million in revenue.21 By building an experience, not just an app, Uber attracted a ton of new business that would never have gone to taxi companies. Uber didn’t build a better taxi, it didn’t iterate on the journey—it started from scratch across the entire experience. The effect on the traditional taxi companies? The San Francisco Examiner reported on 6th January 2016 that San Francisco’s Yellow Cab Co-Op had filed for bankruptcy.
The temptation to bundle more and more features and functionality is high. Look at Facebook and Facebook Messenger, and how Messenger has now been decoupled from Facebook. A controversial change for some, but one that recognises that messaging and interacting with your newsfeed are very different priorities that should not compete. Interactions are moving towards distinct experiences embedded in our day-to-day lives, not a bundled feature set in a software application.
Let me illustrate it another way.
In the twentieth century, people watched their favourite TV shows on a specific channel at a specific time. If you wanted to watch a show again, before the advent of videocassette recorders (VCRs), you had to wait for reruns. That’s not how our children consume content today. They choose a show they want to watch and then stream it on YouTube or Netflix on demand. There’s almost no differentiation between PewDiePie’s channel on YouTube and House of Cards on Netflix. In fact, some studies show that streamed content has already overtaken live TV in terms of viewing preferences.22
While you might still have apps on your phone, such as games or books that you are reading, behavioural and contextual content will inevitably become part of your personal, tailored content experience. The limitations today are simply contextualisation, bandwidth and predictive or location-based analytics. Combine those capabilities and it becomes less about apps and more about content that simply responds to your needs.
In 1997, Intel unveiled ASCI Red, the first supercomputer to sustain 1 teraflop performance. The system had 9,298 Pentium II chips that filled 72 computing cabinets. More recently, NVIDIA announced the Tegra X1, its first teraflop processor for mobile devices. We’re talking about a chip that can fit in a smartphone, vehicle,23 tablet or smartwatch and can execute 1,000,000,000,000 floating-point operations per second—the same as that supercomputer from 1997. To highlight how much technology has advanced in under two decades, consider this: to achieve 1 teraflop performance, ASCI Red occupied 1,600 square feet and consumed 500,000 watts of power, with another 500,000 watts needed to cool the room it occupied. By comparison, the Tegra X1 is the size of a thumbnail and draws under 10 watts of power.
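Using only the figures quoted above, a quick back-of-the-envelope calculation shows the scale of the efficiency gain in floating-point operations per watt:

```python
# Back-of-the-envelope comparison using only the figures quoted above.
asci_red_flops, asci_red_watts = 1e12, 500_000 + 500_000   # compute + cooling
tegra_x1_flops, tegra_x1_watts = 1e12, 10                  # stated upper bound

asci_efficiency = asci_red_flops / asci_red_watts    # ~1 million flops per watt
tegra_efficiency = tegra_x1_flops / tegra_x1_watts   # ~100 billion flops per watt

print(f"Improvement in flops per watt: ~{tegra_efficiency / asci_efficiency:,.0f}x")
# -> roughly a 100,000x improvement in energy efficiency for the same throughput
```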
One of the emerging platforms for such computing devices is obviously the in-car computing platform, which requires enough processing capability to enable autonomous driving, along with enhancements to in-car displays and dashboard visualisation. Over the next ten years, embedded computing in cars will grow exponentially. The Mercedes F015, launched at the 2015 Consumer Electronics Show (CES), is a conceptual example of where the car “space” might take us with self-drive technology. Cars that are a space for being entertained, for working, for playing and for socialising, instead of just for driving, will soon become the norm. An interactive lounge space, if you like. When you no longer need cars that require all-round windows to create visibility for driving, those windows can become integrated displays. I’ll talk about this more later.
When 1 teraflop chips (or more powerful computers) can be embedded in everyday spaces, it will be possible for everything to become an interactive display. This was well illustrated in a series of future concept videos produced by Corning themed A Day Made of Glass in which we see mirrors, tabletops, walls and cars becoming interactive devices complete with touchscreen interaction and contextual intelligence.
Figure 3.9: The Mercedes F015 uses internal space very differently from traditional cars. (Credit: Mercedes)
Figure 3.10: With cheap supercomputers on a chip, everything can become an interactive display. (Credit: Corning, A Day Made of Glass)
As computers become embedded all around us in our cars, homes, schools and workplaces, the concept of a “screen” and operating systems as we’ve known them will start to break down. For screens built into our mirrors or tabletops, we won’t have an app store that we use to download software, but we’ll undoubtedly have some ability to personalise. More importantly, these screens will speak to some sort of central AI or agent that will pull relevant information from our personal cloud to learn about us and then reflect that—from our appointment schedule through to breaking news in our fields of interest, or other relevant data that informs or advises us. These computers won’t just display relevant information, though. While the Samsung Simband has six different sensors that constantly collect relevant information about you, the computers embedded all around us in the future will be listening and learning 24/7.
Two recent computing platform developments illustrate the start of this interface paradigm shift. Amazon Echo and Jibo, an Indiegogo-backed start-up, have both recently entered the market as personal devices for the home. Both technologies are embedded in your home and can listen, learn and respond to cues from the world around them, in real time. Jibo goes so far as to position itself as your family’s personal assistant. These devices take the technologies behind Google Now, Apple’s Siri or Microsoft’s Cortana and embed them into our homes, with access to the almost infinite informational resources that the Internet provides.
It starts off fairly simply. You can ask Echo or Jibo things like “Will it rain tomorrow?”, “Is milk on my shopping list?” or “Remind me to book my hotel for our holiday next week”. Jibo takes this further, as it has mobility and a built-in camera that allows you to ask it to take a snapshot of your family, for example. Jibo will even use its display to show different personalities based on who in the family it is interacting with.
While these first-generation “home assistants” are currently limited to information requests, it won’t be long before we’ll be using technologies like these in our homes and offices to reliably manage our schedules, do our shopping and make day-to-day decisions. Within 20 years, these devices will be AIs that have enough basic intelligence to cater for any need we might have that can be executed or solved digitally, along with interfacing with our own personal dashboards/UIs, clouds and sensor networks to advise us on our physical health, financial well-being and many other areas that we used to consider the domain of human advisers.
Figure 3.11: Family robot Jibo is billed as a personal assistant and communications device for the home. (Credit: Jibo)
In December 2013, Time magazine ran a story entitled “Meet the Robot Telemarketer Who Denies She’s a Robot”24 describing a sales call that Washington Bureau Chief Michael Scherer of Time received. Scherer, sensing something was off, asked the robot if she was a person or a computer. She replied enthusiastically that she was real, with a charming laugh. But when Scherer asked, “What vegetable is found in tomato soup?” the robot said she didn’t understand the question. The robot called herself Samantha West.
The goal of algorithms like these is simply to pre-qualify the recipient of the call before transferring them to a human to close the sale. Voice recognition was an essential precursor to such algorithms. While today tools like Siri and Cortana recognise unaccented speech fairly well, there was a time when voice recognition was considered science fiction.
As early as 1932, scientists at Bell Labs were working on the problem of machine-based “speech perception”. By 1952, Bell had developed a system for recognising single spoken digits, but it was extremely limited. In 1969, however, John Pierce, one of Bell’s leading engineers, wrote an open letter to the Acoustical Society of America criticising speech recognition research at Bell and comparing it to “schemes for turning water into gasoline, extracting gold from the sea, curing cancer, or going to the moon”. Ironically, one month after Pierce published his open letter, Neil Armstrong landed on the moon. Regardless, Bell Labs still had its funding for speech recognition pulled soon after.
By 1993, speech recognition systems developed by Ray Kurzweil could recognise 20,000 words (uttered one word at a time), but accuracy was limited to about 10 per cent. In 1997, Bill Gates was pretty bullish on speech recognition, predicting that “In this 10-year time frame, I believe that we’ll not only be using the keyboard and the mouse to interact, but during that time we will have perfected speech recognition and speech output well enough that those will become a standard part of the interface.”25 In the year 2000, it was still a decade away.
The big breakthroughs came with the application of hidden Markov models and, later, deep learning neural networks, backed by better computer performance and bigger source databases. However, the models that we have today are limited because they still don’t learn language. These algorithms don’t learn language like a human; they identify a phrase through recognition, look it up in a database and then deliver an appropriate response.
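To illustrate the distinction, here is a minimal sketch of that “recognise, look up, respond” pattern (the phrases and responses are invented for the example). Note that an unmatched question simply falls through, much as it did for Samantha West.

```python
# Minimal sketch of the "recognise a phrase, look it up, reply" pattern
# described above -- pattern matching, not genuine language understanding.
canned_responses = {
    "will it rain tomorrow": "There is a 60% chance of rain tomorrow.",
    "what is on my shopping list": "You have milk and eggs on your list.",
}

def respond(recognised_text: str) -> str:
    key = recognised_text.lower().strip("?!. ")
    # No comprehension here: an unmatched phrase simply falls through.
    return canned_responses.get(key, "Sorry, I didn't understand the question.")

print(respond("Will it rain tomorrow?"))
print(respond("What vegetable is found in tomato soup?"))  # falls through
```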
Recognising speech and being able to carry on a conversation are two very different achievements. What would it take for a computer to fool a human into thinking it was a human, too?
In 1950, Alan Turing published a famous paper entitled “Computing Machinery and Intelligence”. In his paper, he asked not just if a computer or machine could be considered something that could “think”, but more specifically “Are there imaginable digital computers which would do well in the imitation game?”26 Turing proposed that this “test” of a machine’s intelligence—which he called the “imitation game”—be tested in a human-machine question and answer session. Turing went on in his paper to say that if you could not differentiate the computer or machine from a human within 5 minutes, then it was sufficiently human-like to have passed his test of basic machine intelligence or cognition. Researchers who have since added to Turing’s work classify the imitation game as one version or scenario of what is now more commonly known as the Turing Test.
An autonomous, self-driving car won’t need to pass the Turing Test to put a taxi driver out of work.
While computers are not yet at the point of regularly passing the Turing Test, we are getting closer to that point. On 7th June 2014, the Royal Society of London hosted a Turing Test competition. The competition, which occurred on the 60th anniversary of Turing’s death, included a Russian chatter bot named Eugene Goostman, which successfully managed to convince 33 per cent of its human judges that it was a 13-year-old Ukrainian who had learnt English as a second language. While some, such as Joshua Tenenbaum, a professor of Mathematical Psychology at MIT, have called the results of the competition “unimpressive”, it still shows that we are much closer to passing a computer off as a human than ever before.
Interactions like booking an airline ticket or changing a hotel reservation, resolving a problem with your bank, booking your car in for a service or finding out the results of a paternity test could all be adequately handled by machine intelligences in the very near term. In many instances, they already are. A human won’t differentiate the experience enough to justify the cost of a human-based call centre representative. In fact, my guess is that it won’t be long before you’ll have to agree to a charge if you want to speak to a “real” human. Many airlines and hotels already levy a phone service charge if you call rather than change a booking online. It’s pretty clear that human concierge services will become a premium-level service only for the most valuable customer relationships in the future. For the rest of us, the basic model of service will be AI based. But here’s the thing we should recognise—in that future, a human won’t actually provide a better level of service.
We might very well suspect that we’re talking to a computer in the future, but the interaction will be so good that we won’t be 100 per cent sure or we just won’t care. Fifteen years from now, machine interactions will be widespread and AI/MIs will be differentiated and identified as such because they’ll be better and faster at handling certain problems. For example, Uber could advertise its AI, self-driving cars as “The Safest Drivers in the World”, knowing that statistically an autonomous vehicle will be 20 times safer than a human out of the gate.
Key to this future is the need for AIs to learn language, to learn to converse. In an interview with the Guardian newspaper in May 2015, Professor Geoff Hinton, an expert in artificial neural networks, said Google is “on the brink of developing algorithms with the capacity for logic, natural conversation and even flirtation.” Google is currently working to encode thoughts as vectors described by a sequence of numbers. These “thought vectors” could endow AI systems with a human-like “common sense” within a decade, according to Hinton.
Some aspects of communication are likely to prove more challenging, Hinton predicted.
“Irony is going to be hard to get,” he said. “You have to be master of the literal first. But then, Americans don’t get irony either. Computers are going to reach the level of Americans before Brits...”
Professor Geoff Hinton, from an interview with
the Guardian newspaper, 21st May 2015
These types of algorithms, which allow for leaps in cognitive understanding for machines, have only been possible with the application of massive data processing and computing power.
Is the Turing Test or a machine that can mimic a human the required benchmark for human interactions with a computer? Not necessarily. First of all, we must recognise that we don’t need an MI to be completely human-equivalent for it to be disruptive to employment or our way of life.
To realise why a human-equivalent computer “brain” is not necessarily the critical goal, we need to understand the progression of AI through its three distinct phases:
• Machine Intelligence—rudimentary machine intelligence or cognition that replaces some element of human thinking, decision-making or processing for specific tasks; neural networks or algorithms that can make human-equivalent decisions for very specific functions, and perform better than humans on a benchmark basis. This does not prohibit the intelligence from having machine learning or cognition capabilities so that it can learn new tasks or process new information outside of its initial programming. In fact, many machine intelligences already have this capability. Examples include Google’s self-driving car, IBM Watson, high-frequency trading (HFT) algorithms and facial recognition software.
• Artificial General Intelligence—a human-equivalent machine intelligence that not only passes the Turing Test and responds as a human would, but can also make human-equivalent decisions. It will likely also process non-logical and informational cues such as emotion, tone of voice, facial expression and nuances that currently only a living intelligence can (can your dog tell if you are angry or sad?). Essentially, such an AI would be capable of successfully performing any intellectual task that a human being could.
• Hyperintelligence—a machine intelligence or collection of machine intelligences (what do you call a group of AIs?) that have surpassed human intelligence on an individual or collective basis such that they can understand and process concepts that a human cannot understand.
We simply don’t require full AI for machine intelligence to have a significant impact on employment patterns or to put at risk people employed in the service industry. We don’t need to wait another 10, 15 or 30 years to see this happen, and the Turing Test is fairly meaningless as a measure of the ability of machine intelligence to disrupt the way we live and work.
The fact is, machines don’t have to evolve exactly the same intelligence as humans to actually be considered intelligent. Using the same measures we apply to the animal kingdom, Watson may have already demonstrated intelligence far greater than many of the species on the planet today. Does a machine have to be as smart as a human or smarter than a human to be considered intelligent? No. In fact, at its core, we shouldn’t expect AIs to think like humans at all really. Why should machine intelligence evolve or progress so that it thinks exactly like us? It doesn’t have to, and it most likely won’t. Let me illustrate with two examples.
Between 2009 and 2013, machine-intelligent HFT algorithms accounted for between 49 and 73 per cent of all US equity trading volume, and 38 per cent in the European Union in 2014. On 6th May 2010, the Dow Jones suffered its largest intraday point loss up to that time, only to recover that loss within minutes. After a five-month investigation, the US Securities and Exchange Commission (SEC) and the Commodity Futures Trading Commission (CFTC) issued a joint report that concluded that HFT had contributed significantly to the volatility of the so-called “flash” crash. A large futures exchange, CME Group, said in its own investigation that HFT algorithms had probably stabilised the market and reduced the impact of the crash.
For an industry that has refined trading into a fine art over the last 100 years, HFT algorithms represent a significant departure from the trading rooms of Goldman Sachs, UBS and Credit Suisse. The algorithms themselves have departed significantly from typical human behaviour; very different behaviour and decision-making have been observed when analysing HFT trading patterns. What has led to this shift?
Perhaps it is the fact that HFT has neither the biases that human traders might have (for instance, staying in an asset class position longer than advised because the individual trader likes the stock or the industry) nor the same ethical basis for making a decision. While some might argue that Wall Street isn’t exactly the bastion of ethics, the fact is an HFT algorithm simply doesn’t have an ethical basis for a decision unless those skills have been programmed in.
Audi has been testing self-driving cars, two modified Audi RS7s that have a brain the size of a PS4 in the boot, on the racetrack. The race-ready Audis at this stage aren’t completely self-driving in that the engineers need to first drive them for a few laps so that the cars can learn the boundaries. The two cars are known as Ajay and Bobby,27 and interestingly they have both developed different driving styles despite identical hardware, software and mapping. Despite the huge amount of expertise on the Audi engineering team, they can’t readily explain why there is this apparent difference in driving styles.
We’re likely to see many different variations of “intelligence” in machine cognition that don’t fit a traditional human model or our expectations, but that will nonetheless be at times an improvement over traditional human decision-making and at other times simply a departure from a traditional human approach to critical thinking. Just because an intelligence that develops in a machine is different from that of a human doesn’t make it inferior or less intelligent.
People who are most concerned about AIs taking over the world or subjugating humans probably regard all AIs as super-IQ humans, with the same desires, ethics and violent and egotistical tendencies that we humans have. A super-intelligent version of us would indeed be scary. Yet there’s simply no reason to believe that artificial intelligence will exhibit human tendencies, biases and prejudices. In fact, the opposite is far more likely.
Within a few years, the AIs we have will not only be able to detect emotion and sentiment, they will also be able to detect when you are lying. At some point in time, we’ll probably hand over the process of electing government to an AI. Imagine how a truly clean, unbiased election process might work, especially if we were to use an AI to maximise representation of every eligible voter through optimal configuration of boundaries and districts. What about resource allocation and tackling problems like climate change? When an AI can model planetary climate science over millennia with pinpoint accuracy and give precise, verifiable impact estimates of the continued use of fossil fuels or of the impact of cow farts on greenhouse gas levels, for example, think about how that will affect resource allocation and the adoption of renewable energy.
Yes, AI does represent a danger to the status quo because it will probably be the purest form of common sense and logic. Anything that doesn’t pass the smell test today will be exposed rapidly in a world of AI. With machine learning in the mix, and the ability to hypothesise, very soon we’re going to have to justify poor human decision-making against the irrefutable logic of a machine with all of the facts and efficiency of thought that we, as humans, just can’t compete with. Within 15 years, humans will probably be banned from driving in some cities because self-driving cars will be demonstrably less risky. Insurers too will charge much more for human-driven vehicles.
We are going to need to learn that machine intelligence in its component form may still be highly differentiated from human intelligence, and will most certainly be disruptive long before human-equivalent AI is reached. Don’t think that just because we’re 20 to 30 years away from human-equivalence, all of this is theoretical. Machines have long been taking jobs away from humans; it started 200 years ago with the steam engine. Algorithms and robots are just the latest machines in a long line of industry-disrupting technologies.
____________
1 A typical musical greeting card like this has the ability to store a 3.5MB audio file (up to 300 seconds at 12 kHz or better audio quality).
2 In 1985, Bank of America had seven IBM 3033 mainframes in its San Francisco data centre, which had a combined processing capacity of 40 gigaflops. The Samsung Galaxy S6 has the equivalent of more than 1200 gigaflops of processing capability or 380 gigaflops across a multi-core architecture.
3 Excluding the in-flight entertainment system
4 I did the maths. In 1995, there were 45 million computers connected to the web. If they all had modern Pentium processors or equivalent, it would equate to 120 MHz x 45 million, or about 5.5 petahertz. If Moore’s Law continues (or equivalent) until 2045 to 2050, we would have a single chip with the same capability.
5 Gregory Gromov, “Roads and Crossroads of Internet History,” NetValley, 1995, http://history-of-internet.com/.
6 The IBM Personal Computer (Model 5150) was introduced to the world on 12th August 1981. It was quickly followed up with the launch of its IBM Machine Type number 5160, or what we now know as the IBM XT, on 8th March 1983. This model came with a dedicated Seagate 10 MB hard disk drive. For more than a decade thereafter, people talked about the dominant form of personal computer as “IBM Compatible”. That’s how strong IBM’s branding around “PC” became back then.
7 At the History of Science auction held at Bonhams New York on 22nd October 2014, one of the 50 original Apple-I computers (and one of only about 15 or so that are operational) was sold to The Henry Ford for a staggering US$905,000.
8 Cisco—Internet of Things (IoT)
9 Minecraft is a trademark owned by Mojang/Microsoft.
10 Globally, the term ECG is the most common abbreviation, from elektro-cardia-graph (literally “electric-heart-writing”), in which the Greek word for “heart”, cardia or kardia, is central to the acronym. The common US usage is EKG, which retains the original Greek “k” spelling (kardia) rather than the Latinised transliteration (cardio).
11 R.W. White, R. Harpaz, N.H. Shah, W. DuMouchel and E. Horvitz, “Toward enhanced pharmacovigilance using patient-generated data on the Internet,” Journal of Clinical Pharmacology & Therapeutics 96, no. 2 (August 2014): 239–46.
12 “All Things Considered,” NPR Radio, aired 8 April 2015.
13 Slang for “Internet”
14 John Markoff, “Computer wins on ‘Jeopardy!’: Trivial, It’s Not!” New York Times, 16 February 2011, http://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html.
15 Irana Ivanova, “IBM’s Watson joins Genome Center to cure cancer,” Crain’s New York Business, 19 March 2014, http://www.crainsnewyork.com/article/20140319/HEALTH_CARE/140319845/ibmswatson-joins-genome-center-to-cure-cancer.
16 p53 is often called a “tumour suppressor protein structure” because of its role in defending the body against the formation of cancer cells.
17 Ian Steadman, “IBM’s Watson is better at diagnosing cancer than human doctors,” Wired, 11 February 2013, http://www.wired.co.uk/news/archive/2013-02/11/ibm-watson-medical-doctor.
18 “IBM and Partners to Transform Personal Health with Watson and Open Cloud,” IBM Press Release, 13 April 2015, https://www-03.ibm.com/press/us/en/pressrelease/46580.wss.
19 Friction here refers to the user workload required to use the software. High friction requires multiple interactions, clicks or entries. Low friction requires minimal interaction building data models from previous interactions, external data, etc. Low friction interfaces also have optimal presentation of information so that readability and usability are high.
20 Ian Parker, “The Shape of Things to Come—How an Industrial Designer became Apple’s Greatest Product,” New Yorker, 23 February 2015, http://www.newyorker.com/magazine/2015/02/23/shape-things-come.
21 Henry Blodget, “Uber CEO Reveals Mind-Boggling Statistic That Skeptics Will Hate,” Business Insider, 19 January 2015.
22 Todd Spangler, “Streaming overtakes live TV among consumer viewing preferences,” Variety, 22 April 2015, http://variety.com/2015/digital/news/streaming-overtakes-live-tv-among-consumer-viewing-preferences-study-1201477318/.
23 Tesla uses Tegra chips in its cars.
24 “Meet the Robot Telemarketer Who Denies She’s a Robot,” Time, 13 December 2013, http://newsfeed.time.com/2013/12/10/meet-the-robot-telemarketer-who-denies-shes-a-robot/.
25 Taken from Bill Gates’ speech at the Microsoft Developers Conference on 1st October 1997
26 A. M. Turing, “Computing Machinery and Intelligence,” MIND: A Quarterly Review of Psychology and Philosophy vol. LIX, no. 236. (October 1950), http://mind.oxfordjournals.org/content/LIX/236/433.
27 Test Car A and Test Car B became Ajay and Bobby, respectively.