Chapter 8
IN THIS CHAPTER
Communicating in new ways
Sharing ideas
Employing multimedia
Improving human sensory perception
People interact with each other in myriad ways. In fact, few people realize just how many different ways communication occurs. When many people think about communication, they think about writing or talking. However, interaction can take many other forms, including eye contact, tonal quality, and even scent (see https://www.smithsonianmag.com/science-nature/the-truth-about-pheromones-100363955/). An example of the computer version of enhanced human interaction is the electronic nose, which relies on a combination of electronics, biochemistry, and artificial intelligence to perform its task and has been applied to a wide range of industrial and research applications (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3274163/). This chapter concentrates on more standard forms of communication, however, including body language. You get a better understanding of how AI can enhance human communication through means that are less costly than building your own electronic nose.
AI can also enhance the manner in which people exchange ideas. In some cases, AI provides entirely new methods of communication, but more often it offers a subtle (or sometimes not so subtle) way of enhancing existing methods. Humans rely on exchanging ideas to create new technologies, build on existing ones, and learn what they need to know to increase their knowledge. Ideas are abstract, which makes exchanging them particularly difficult at times, so AI can provide a needed bridge between people.
At one time, if someone wanted to store their knowledge to share with someone else, they generally relied on writing. In some cases, they could also augment their communication by using graphics of various types. However, only some people can use these two forms of media to gain new knowledge; many people require more, which is why online sources such as YouTube (https://www.youtube.com/) have become so popular. Interestingly enough, you can augment the power of multimedia, which is already substantial, by using AI, and this chapter tells you how.
The final section of this chapter helps you understand how an AI can give you almost superhuman sensory perception. Perhaps you really want that electronic nose after all; it provides a significant advantage in detecting scents far too faint for humans to smell. Imagine being able to smell at the same level that a dog does (a dog relies on 100 million aroma receptors, versus the roughly 1 million that humans possess). You can achieve this goal in two ways: by using monitors that a human accesses indirectly, and by directly stimulating human sensory perception.
Communication involving a developed language initially took place via the spoken rather than the written word. The only problem with spoken communication is that the two parties must be near enough to each other to talk. Consequently, written communication is superior in many respects because it allows time-delayed communication that doesn't require the two parties to ever see each other. The three main methods of human nonverbal communication rely on:
The written word
Symbols and icons, such as emoticons and emoji
Body language
The first two methods are direct abstractions of the spoken word. They aren't always easy to implement, but people have been doing so for thousands of years. The body-language component is the hardest to implement because you're trying to create an abstraction of a physical process. Writing helps convey body language using specific terminology, such as that described at https://writerswrite.co.za/cheat-sheets-for-writing-body-language/. However, the written word falls short, so people augment it with symbols, such as emoticons and emoji (read about their differences at https://www.britannica.com/demystified/whats-the-difference-between-emoji-and-emoticons). The following sections discuss these issues in more detail.
The introduction to this section discusses two new alphabets used in the computer age: emoticons (http://cool-smileys.com/text-emoticons) and emoji (https://emojipedia.org/). The online sites that catalog these two graphic alphabets list hundreds of them. For the most part, humans can interpret these iconic alphabets without too much trouble because they resemble facial expressions, but an application doesn't have the human sense of art, so computers often require an AI just to figure out what emotion a human is trying to convey with the little pictures. Fortunately, you can find standardized lists, such as the Unicode emoji chart at https://unicode.org/emoji/charts/full-emoji-list.html. Of course, a standardized list doesn't actually help with translation. The article at https://www.geek.com/tech/ai-trained-on-emoji-can-detect-social-media-sarcasm-1711313/ provides more details on how someone can train an AI to interpret and react to emoji (and, by extension, emoticons). You can see an example of this process at work at https://deepmoji.mit.edu/.
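If you want a feel for what emoji interpretation involves, the toy sketch below maps a handful of emoji to emotions with a hand-built lookup table. A real system such as DeepMoji learns these associations from millions of labeled messages using a deep neural network; the mapping here is purely illustrative.

# Toy emoji "interpreter": a tiny lookup standing in for a trained model.
# A real system such as DeepMoji learns these associations from labeled
# data rather than a hand-built table.

EMOJI_EMOTION = {
    "😀": "joy",
    "😢": "sadness",
    "😡": "anger",
    "🙄": "sarcasm/skepticism",
    "❤️": "affection",
}

def interpret(message: str) -> list[str]:
    """Return the emotions suggested by any emoji found in a message."""
    return [emotion for emoji, emotion in EMOJI_EMOTION.items()
            if emoji in message]

print(interpret("Great, another Monday 🙄"))   # ['sarcasm/skepticism']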
The emoticon is an older technology, and many people are trying their best to forget it (but likely won't succeed). The emoji, however, is new and exciting enough to warrant a movie (see https://www.amazon.com/exec/obidos/ASIN/B0746ZZR71/datacservip0f-20/). You can also rely on Google's AI to turn your selfies into emoji (see https://www.fastcodesign.com/90124964/exclusive-new-google-tool-uses-ai-to-create-custom-emoji-of-you-from-a-selfie). Just in case you really don't want to sift through the 2,666 official emoji that Unicode supports (or the 564 quadrillion emoji that Google's Allo, https://allo.google.com/, can generate), you can rely on Dango (https://play.google.com/store/apps/details?id=co.dango.emoji.gif&hl=en) to suggest an appropriate emoji to you (see https://www.technologyreview.com/s/601758/this-app-knows-just-the-right-emoji-for-any-occasion/).
The world has always had a problem with the lack of a common language. Yes, English has become more or less universal, but it's still far from completely so. Having someone translate between languages can be expensive, cumbersome, and error-prone, so translators, although necessary in many situations, aren't necessarily a great answer either. For those who lack the assistance of a translator, dealing with other languages can be quite difficult, which is where applications such as Google Translate (see Figure 8-1) come into play.
One of the things to note in Figure 8-1 is that Google Translate offers to detect the language for you automatically. What's interesting about this feature is that it works extremely well in most cases. Part of the credit goes to the Google Neural Machine Translation (GNMT) system, which can look at entire sentences to make sense of them and thereby provide better translations than applications that use individual phrases or words as the basis for a translation (see http://www.wired.co.uk/article/google-ai-language-create for details).
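To get a sense of what automatic language detection looks like in code, here's a minimal sketch using the third-party langdetect package (a port of Google's older language-detection library; install it with pip install langdetect). It illustrates only the detection step, not GNMT itself.

# Automatic language detection, sketched with the langdetect package.
# detect() returns the most likely language code; detect_langs() returns
# ranked candidates with probabilities.
from langdetect import detect, detect_langs

print(detect("Guten Morgen, wie geht es dir?"))   # 'de'
print(detect_langs("Bonjour tout le monde"))      # e.g., [fr:0.99...]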
A significant part of human communication occurs through body language, which is why the use of emoticons and emoji is important. However, people are also becoming more used to working directly with cameras to create videos and other forms of communication that involve no writing. In this case, a computer could listen to human input, parse it into tokens representing the speech, and then process those tokens to fulfill a request, similar to the manner in which Alexa, Google Home, and their ilk work.
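As a rough illustration of that listen-parse-process pipeline, the sketch below uses the third-party SpeechRecognition package (pip install SpeechRecognition; microphone capture also needs PyAudio). The weather request at the end is a hypothetical stand-in for real request fulfillment; Alexa and Google Home use far more sophisticated pipelines.

# A minimal listen -> tokenize -> act pipeline, assuming the
# SpeechRecognition package.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:            # requires PyAudio
    audio = recognizer.listen(source)      # capture spoken input

text = recognizer.recognize_google(audio)  # speech -> text via a web API
tokens = text.lower().split()              # crude tokenization

if "weather" in tokens:                    # process tokens to fulfill a request
    print("Fetching the forecast...")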
Of course, body language has many characteristics, but even if an AI can get just the major ones down, it can go a long way toward providing a correct body-language interpretation. In addition to body language, current AI implementations also take characteristics like tonal quality into consideration, which makes for an extremely complex AI that still doesn't come close to doing what the human brain does seemingly without effort.
An AI doesn’t have ideas because it lacks both intrapersonal intelligence and the ability to understand. However, an AI can enable humans to exchange ideas in a manner that creates a whole that is greater than the sum of its parts. In many cases, the AI isn’t performing any sort of exchange. The humans involved in the process perform the exchange by relying on the AI to augment the communication process. The following sections provide additional details about how this process occurs.
A human can exchange ideas with another human, but only as long as the two humans know about each other. The problem is that many experts in a particular field don’t actually know each other — at least, not well enough to communicate. An AI can perform research based on the flow of ideas that a human provides and then create connections with other humans who have that same (or similar) flow of ideas.
One of the ways in which this communication creation occurs is on social media sites such as LinkedIn (https://www.linkedin.com/), where the idea is to create connections between people based on a number of criteria. A person's network becomes the means by which the AI deep inside LinkedIn suggests other potential connections. Ultimately, the purpose of these connections from the user's perspective is to gain access to new human resources, make business contacts, make a sale, or perform other tasks that LinkedIn enables using the various connections.
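The sketch below shows the core "friend of a friend" idea behind such suggestions: score people you don't yet know by how many connections you share. LinkedIn's actual AI weighs far more signals; the tiny network here is hypothetical.

# A toy connection-suggestion heuristic: rank strangers by the number of
# mutual connections they share with you.
network = {
    "you": {"ann", "bob", "cia"},
    "ann": {"you", "bob", "dee"},
    "bob": {"you", "ann", "dee", "eli"},
    "cia": {"you", "eli"},
    "dee": {"ann", "bob"},
    "eli": {"bob", "cia"},
}

def suggest(person: str) -> list[tuple[str, int]]:
    """Rank non-connections by how many mutual connections they share."""
    direct = network[person]
    scores = {}
    for contact in direct:
        for candidate in network[contact]:
            if candidate != person and candidate not in direct:
                scores[candidate] = scores.get(candidate, 0) + 1
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(suggest("you"))   # dee and eli each share two connections with you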
To exchange ideas successfully, two humans need to communicate well. The only problem is that humans sometimes don't communicate well, and sometimes they don't communicate at all. The issue isn't just one of translating words but also of translating ideas. The societal and personal biases of individuals can preclude communication because an idea that works for one group may not translate at all for another group. For example, the laws in one country could make someone think in one way, while the laws in another country could lead the other person to think in an entirely different manner.
Theoretically, an AI could help communication between disparate groups in numerous ways. Language translation (assuming that the translation is accurate) is one of these methods, of course. However, an AI could also provide cues as to what is and isn't culturally acceptable by prescreening materials. Using categorization, an AI could suggest aids, such as alternative graphics, to help communication take place in a manner that works for both parties.
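As a toy illustration of prescreening, the sketch below flags draft text that contains terms from a per-audience sensitivity list. The region names and terms are hypothetical placeholders; a production system would rely on trained classifiers rather than hand-built word lists.

# A toy prescreening pass: flag terms known to be sensitive for a given
# audience before the material goes out. The lists are purely illustrative.
SENSITIVE_TERMS = {
    "region_a": {"slang_x", "gesture_y"},
    "region_b": {"idiom_z"},
}

def prescreen(text: str, audience: str) -> list[str]:
    """Return any flagged terms found in a draft for a target audience."""
    words = set(text.lower().split())
    return sorted(words & SENSITIVE_TERMS.get(audience, set()))

print(prescreen("avoid using slang_x in the opening slide", "region_a"))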
Humans often base ideas on trends. However, to visualize how the idea works, other parties in the exchange of ideas must also see those trends, and communicating using this sort of information is notoriously difficult. AI can perform various levels of data analysis and present the output graphically. The AI can analyze the data in more ways and faster than a human can so that the story the data tells is specifically the one you need it to tell. The data remains the same; the presentation and interpretation of the data change.
Studies show that humans relate better to graphical output than to tabular output, and graphical output definitely makes trends easier to see. As described at http://sphweb.bumc.bu.edu/otlt/mph-modules/bs/datapresentation/DataPresentation2.html, you generally use tabular data to present only specific information; graphics work best for showing trends. AI-driven applications can also make it easier to create the right sort of graphic output for a particular requirement. Not all humans see graphics in precisely the same way, so matching a graphic type to your audience is essential.
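To see the difference in practice, the short sketch below plots some hypothetical monthly sales figures along with a three-month rolling average using matplotlib (pip install matplotlib). The trend that hides in a table of numbers jumps out of the chart.

# Plot raw values and a rolling average to surface the underlying trend.
import matplotlib.pyplot as plt

sales = [102, 98, 110, 95, 120, 115, 130, 125, 140, 138]  # hypothetical data
window = 3
trend = [sum(sales[i:i + window]) / window
         for i in range(len(sales) - window + 1)]

plt.plot(sales, label="monthly sales")
plt.plot(range(window - 1, len(sales)), trend, label="3-month average")
plt.xlabel("month")
plt.ylabel("units")
plt.legend()
plt.show()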
Most people learn by using multiple senses and multiple approaches. A doorway to learning that works for one person may leave another completely mystified. Consequently, the more ways in which a person can communicate concepts and ideas, the more likely it is that other people will understand what the person is trying to communicate. Multimedia normally consists of sound, graphics, text, and animation, but some multimedia does more.
AI can help with multimedia in numerous ways. One of the most important is in the creation, or authoring, of the multimedia. You find AI in applications that help with everything from media development to media presentation. For example, when transforming the colors in an image, an AI can help you visualize the effects of those changes faster than trying one color combination at a time (the brute-force approach).
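Here's a minimal sketch of that idea using the Pillow imaging library (pip install Pillow): instead of adjusting one color setting at a time, it renders several saturation variants at once for side-by-side comparison. The stand-in image and filename pattern are hypothetical; an AI-assisted editor would go further and rank or pre-select the variants for you.

# Batch-preview several saturation levels rather than trying them one at
# a time by hand.
from PIL import Image, ImageEnhance

image = Image.new("RGB", (200, 200), (180, 90, 40))  # stand-in image

for factor in (0.25, 0.5, 1.0, 1.5, 2.0):
    variant = ImageEnhance.Color(image).enhance(factor)
    variant.save(f"preview_saturation_{factor}.png")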
After using multimedia to present ideas in more than one form, those receiving the ideas must process the information. A secondary use of AI relies on the use of neural networks to process the information in various ways. Categorizing the multimedia is an essential use of the technology today. However, in the future you can look forward to using AI to help in 3-D reconstruction of scenes based on 2-D pictures. Imagine police being able to walk through a virtual crime scene with every detail faithfully captured.
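As a taste of such categorization, the sketch below runs a pretrained ResNet-18 image classifier from torchvision (pip install torch torchvision; it assumes torchvision 0.13 or later) over a stand-in image tensor. Real systems would fine-tune such a model on their own media libraries.

# Categorize an image with a pretrained network from torchvision.
import torch
from torchvision import models

weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights).eval()
preprocess = weights.transforms()          # matching resize/crop/normalize

image = torch.rand(3, 224, 224)            # stand-in for a real photo
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    label_idx = model(batch).argmax(dim=1).item()
print(weights.meta["categories"][label_idx])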
People used to speculate that various kinds of multimedia would appear in new forms. For example, imagine a newspaper that provides Harry Potter-like dynamic displays. Most of the technology pieces are actually available today, but the issue is one of market. For a technology to become successful, it must have a market — that is, a means for paying for itself.
One way that AI truly excels at improving human interaction is by augmenting humans in one of two ways: by allowing them to use their native senses to work with augmented data, or by augmenting the native senses to do more. The following sections discuss both approaches to enhancing human sensing and thereby improving communication.
When performing various kinds of information gathering, humans often employ technologies that filter or shift the data spectrum with regard to color, sound, or smell. The human still uses native capabilities, but some technology changes the input such that it works with that native capability. One of the most common examples of spectrum shifting is astronomy, in which shifting and filtering light enables people to see astronomical objects, such as nebulae, in ways that the naked eye can't, thereby improving our understanding of the universe.
However, shifting and filtering colors, sounds, and smells manually can require a great deal of time, and the results can disappoint even when performed expertly, which is where AI comes into play. An AI can try various combinations far faster than a human and locate the potentially useful combinations with greater ease because an AI performs the task in a consistent manner.
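The sketch below mimics that consistency with a crude automated search: it tries every combination of a few brightness and contrast settings on a stand-in image and keeps the one that reveals the most detail, scored here (a simplistic assumption) by pixel variance. Pillow and numpy are assumed.

# A crude automated search over filter settings, standing in for the AI.
import numpy as np
from PIL import Image, ImageEnhance

image = Image.new("L", (64, 64))           # stand-in for a telescope image
image.putdata(np.random.randint(100, 130, 64 * 64).tolist())

best = None
for brightness in (0.5, 1.0, 1.5, 2.0):
    for contrast in (0.5, 1.0, 1.5, 2.0):
        candidate = ImageEnhance.Contrast(
            ImageEnhance.Brightness(image).enhance(brightness)
        ).enhance(contrast)
        score = float(np.asarray(candidate).var())  # detail proxy
        if best is None or score > best[0]:
            best = (score, brightness, contrast)

print("best variance %.1f at brightness %.1f, contrast %.1f" % best)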
As an alternative to using an external application to shift the data spectrum and somehow make that shifted data available for use by humans, you can augment human senses. In augmentation, a device, either external or implanted, enables a human to directly process sensory input in a new way. Many people view these new capabilities as the creation of cyborgs, as described at https://www.theatlantic.com/technology/archive/2017/10/cyborg-future-artificial-intelligence/543882/. The idea is an old one: use tools to make humans ever more effective at performing an array of tasks. In this scenario, humans receive two forms of augmentation: physical and intellectual.
Physical augmentation of human senses already takes place in many ways and is guaranteed to increase as humans become more receptive to various kinds of implants. For example, night vision glasses currently allow humans to see at night, with high-end models providing color vision controlled by a specially designed processor. In the future, eye augmentation/replacement could allow people to see any part of the spectrum as controlled by thought, so that people would see only that part of the spectrum needed to perform a specific task.
Intelligence Augmentation (IA) requires more intrusive measures but also promises to allow humans to exercise far greater capabilities. Unlike AI, IA keeps a human actor at the center of the processing. The human provides the creativity and intent that AI currently lacks. You can read a discussion of the differences between AI and IA at https://www.financialsense.com/contributors/guild/artificial-intelligence-vs-intelligence-augmentation-debate.