Chapter 19

How to Make a Better Human

Growing up, I remember watching reruns of a 1970s TV show about a NASA pilot who had a terrible crash. According to the plot, scientists said, “We can rebuild him; we have the technology,” and the pilot went on to become “the Six Million Dollar Man” (now the equivalent of about $40 million). With a bionic eye with exceptional zoom vision, a bionic arm, and two bionic legs, he could run 60 miles per hour and perform great feats. In the show he used his superhuman strength as a spy working for good.

It was one of the first mainstream shows that demonstrated what might be accomplished by seamlessly bringing together technology with humans. I’m not sure that it would have been as commercially viable if they had called the show The Six Million Dollar Cyborg, but that’s really what he was—a human–machine combination.

Today we have new possibilities with the resurgence and hype surrounding AI and ML. I’m sure if you are in the product management, product design, or innovation space, you’ve heard all kinds of prognostications about what might be possible. Here, I’d like to suggest what might be the most powerful combination: thinking about human computational styles and supporting them with ML-equipped experiences to create not the physical prowess of the Six Million Dollar Man, but the mental prowess that they never explored in the TV show.

Symbolic AI and the AI Winter

I’m not sure if you are aware, but as I write this we are in at least the second cycle of hype and promise surrounding AI. In the mid-twentieth century, Alan Turing posited that, mathematically, 0s and 1s could represent any type of mathematical deduction, suggesting that computers could perform formal reasoning. From there, scientists in neurobiology and information processing, noting that brain neurons either fire an action potential or don’t (effectively a one or a zero), began to wonder whether it might be possible to create an artificial brain capable of reasoning. Turing proposed what became known as the Turing test: essentially, if you could pass questions to some entity, that entity could provide answers back, and humans couldn’t distinguish the artificial system’s responses from those of a real human, then that system passed and could be considered AI.

From there, others, such as Herbert Simon, Allen Newell, and Marvin Minsky, looked at how intelligent behavior could be formally represented and how “expert systems” could be built up with an understanding of the world. Their artificial intelligence machines tackled some basic language tasks, games like checkers, and some analogical reasoning. There were bold predictions that within a generation the problem of AI would largely be solved.

Unfortunately, their approach showed promise in some fields but great limits in others, in part because it focused on symbolic processing: very high-level reasoning, logic, and problem solving. This symbolic approach to thinking did find success in other areas, including semantics, language, and cognitive science, but there the focus was much more on understanding human intelligence than on building generalized AI.

By the 1970s, money for AI in the academic world had dried up, and the field entered what was called the “AI winter”: the amazing promise of the 1950s had given way to the real limitations of the 1970s.

Artificial Neural Networks and Statistical Learning

Very different approaches to AI and the notion of creating an “artificial brain” started to be considered in the 1970s and 1980s. Scientists across the fields that compose cognitive science (psychology, linguistics, computer science), particularly David Rumelhart and James McClelland, approached the problem from a very different, “subsymbolic” direction. Rather than trying to build the representations used by humans, they hypothesized, we could instead build systems like brains: systems with many individual processors (like neurons) that could affect one another through inhibition or excitation (like neurons), and with “backpropagation” that changed the connections between the artificial neurons depending on whether the output of the system was correct.

This approach was radically different because: (a) it was a much more “brain-like” parallel distributed processing (PDP), in comparison to a series of computer commands; (b) it focused much more on statistical learning; and (c) the programmers didn’t explicitly provide the information structure, but rather sought to have the PDP learn through trial and error and adjust the weights between its artificial neurons itself.
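To make that learning loop concrete, here is a minimal sketch of a PDP-style network, assuming nothing beyond what is described above: a handful of artificial neurons whose connection weights are nudged by backpropagation whenever the output is wrong. The task (XOR), the layer sizes, and the learning rate are all illustrative choices, not anything the original researchers prescribed.

```python
import numpy as np

# A tiny "parallel distributed processing" network: 2 inputs, 4 hidden
# neurons, 1 output, learning XOR through trial and error. All sizes and
# the task itself are illustrative.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input-to-hidden connections
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden-to-output connections

def sigmoid(z):
    # Each artificial neuron "fires" on a soft scale between 0 and 1.
    return 1.0 / (1.0 + np.exp(-z))

for step in range(20000):
    # Forward pass: activation flows through all the units in parallel.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation: the error signal travels backward and nudges every
    # connection weight, strengthening or weakening it depending on
    # whether the output was correct.
    err_out = (out - y) * out * (1 - out)
    err_h = (err_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ err_out
    b2 -= 0.5 * err_out.sum(axis=0)
    W1 -= 0.5 * X.T @ err_h
    b1 -= 0.5 * err_h.sum(axis=0)

print(out.round(2))  # converges toward [[0], [1], [1], [0]]
```

Notice that no one tells the network how to represent XOR; the weights organize themselves through repeated correction, which is exactly the departure from the symbolic approach described above.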

These PDP models had interesting successes in natural language processing and perception. Unlike the symbolic efforts in the first wave, this group did not make assumptions about how these ML systems would represent the information. These systems are the underpinnings of frameworks like Google’s TensorFlow and Facebook’s PyTorch, and it is this type of parallel processing that is responsible for today’s self-driving cars and voice interfaces.

With the incredible resources available in mobile phones and the cloud, modern systems have computing power that Newell and Simon likely never even dreamed of. But while great strides have been made in natural language processing and image processing, these systems are still far from perfect, as shown in Figure 19-1.

There have been many breathless prognostications about the power of AI and its unstoppable intelligence. While these systems have been getting better, they are highly dependent on having the data available to train them and still have their limitations.

Figure 19-1. Less-than-perfect captions assigned by an ML algorithm

I Didn’t Say That, Siri!

You may have your own experiences with how voice commands can be incredibly powerful on the one hand but have significant limitations on the other. Their ability to recognize language at all is impressive: this is a hard problem, and they have shown real ability to solve it. We put these systems to the test, studying Apple Siri, Google Assistant, Amazon Alexa, Microsoft Cortana, and Hound. Using a Jeopardy!-like setup, we asked participants to create a command or question using the provided terms, designed to get an answer (e.g., “Cincinnati, tomorrow, weather,” for which participants might say, “Hey Siri, what is the weather tomorrow in Cincinnati?”).

To make a long story short, we found that these systems were quite good at answering questions about basic facts (e.g., the weather, or the capital of a country) but had real trouble with two very natural human abilities. First, humans can put ideas together easily (e.g., population, country with the Eiffel Tower, which we know is France). When we asked these systems a question like “What is the population of the country with the Eiffel Tower?” they generally produced the population of Paris or just gave an error. Second, we can gauge context. If we asked, “What is the weather in Cincinnati?” and followed up with, “How about the next day?” these systems were generally unable to follow the thread of the conversation.
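For readers who build such systems, here is a minimal sketch of what “following the thread” demands: dialogue state that persists across turns, so a follow-up overwrites only the slots it mentions. The slot names and the crude keyword matching are hypothetical stand-ins for real dialogue-state tracking.

```python
# A toy dialogue-state tracker: each turn updates only the slots it
# mentions, so "How about the next day?" inherits the earlier city and
# intent. Slot names and the keyword parsing are purely illustrative.
from datetime import date, timedelta

state = {"intent": None, "city": None, "day": None}

def update(utterance: str) -> dict:
    text = utterance.lower()
    if "weather" in text:
        state["intent"] = "weather"
    if "cincinnati" in text:
        state["city"] = "Cincinnati"
    if "tomorrow" in text:
        state["day"] = date.today() + timedelta(days=1)
    if "next day" in text and state["day"] is not None:
        state["day"] += timedelta(days=1)  # relative reference to the prior turn
    return dict(state)

print(update("What is the weather tomorrow in Cincinnati?"))
print(update("How about the next day?"))  # city and intent carry over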

In addition, we found that the people using these systems had a significant preference for the AI systems that responded in the most humanistic way, even if the system got something incorrect or was unable to answer (e.g., “I don’t know how to answer that yet”). Participants were most satisfied when the system addressed them in the way they addressed it.

But is Siri really smart? Intelligent? It can add a reminder and turn on music, but you can’t ask it if it is a good idea to purchase a certain car, or how to get out of an escape room. It has limited, ML-based answers. It is not “intelligent” in a way that would pass the Turing test.

The Six Minds and AI

Interestingly, the first wave of AI was known for its strength in performing analogies and reasoning (memory, decision making, and problem solving), while the more recent approach has been much more successful with voice and image recognition (vision, attention, language). And the systems that provide more human-like responses tend to be favored (emotion).

I hope you are seeing where I am headed. The current systems are starting to show the limitations of a brute-force, purely statistical, subsymbolic representation. While these systems are without a doubt amazingly powerful and fantastic for solving certain problems, no amount of faster chips or new training regimens will achieve the goals of AI sought in the 1950s.

If more speed isn’t the answer, what is? Some of the most prominent scientists in the ML and AI fields are suggesting we take another look at the human mind. If studying the individual and group neuron levels achieved this success in the perceptual realm, perhaps considering other levels of representation will provide even more success at the symbolic level with vision/attention, wayfinding and representations of space, language and semantics, memory, and decision making.

Just as with traditional product and service design, you might expect me to encourage those building AI systems to consider the representations they’re using as inputs and outputs, and to test representations at different symbolic levels (e.g., word level, semantic level) rather than purely perceptual levels (e.g., pixels, phonemes, sounds).
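As a toy illustration of what “different levels of representation” might mean in practice, the sketch below encodes the same sentence three ways. The tiny embedding table is a hand-made stand-in for learned semantic vectors; its values are invented for illustration.

```python
# The same input at three levels of representation.
sentence = "what is the weather tomorrow"

# Perceptual level: raw character codes, no structure assumed.
char_level = [ord(c) for c in sentence]

# Word level: discrete symbols the system can index and count.
word_level = sentence.split()

# Semantic level: hand-made vectors standing in for learned embeddings,
# where related concepts ("weather", "forecast") sit near each other.
embeddings = {
    "weather":  [0.9, 0.1, 0.0],
    "forecast": [0.8, 0.2, 0.1],
    "tomorrow": [0.0, 0.9, 0.3],
}
# Unknown words fall back to a zero vector in this toy version.
semantic_level = [embeddings.get(w, [0.0, 0.0, 0.0]) for w in word_level]

print(char_level[:8])
print(word_level)
print(semantic_level)
```

A system tested only at the first level has to rediscover words and meanings from scratch; testing at the other two levels lets you ask which representation actually helps the task.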

I Get By with a Little Help from My (AI) Friends

While AI and ML researchers seek to produce independently intelligent systems, it is very likely that more near-term successes can be achieved using AI and ML tools as cognitive support tools. We already have many of these right now on our mobile devices. We can remember things using reminders, translate street signs with our smartphones, get help with directions from mapping programs, and get encouragement to achieve goals with programs that count calories and help us save money, or get more sleep or exercise.

In our studies of today’s voice-activated systems, however, the biggest challenges we’ve seen are the differences between the language employed by users and that used by the system, and the mismatch between when assistance was provided and when it was needed. When building tools that augment customers’ or workers’ cognitive abilities so they can do things faster and more easily, the Six Minds can be an excellent framing of how ML and AI can support human endeavors:

Vision/Attention

AI tools, particularly with cameras, could easily help draw attention to the important parts of a scene. They could help bring relevant information into focus (e.g., which form elements are unfinished) or, if they know what you are seeking, highlight relevant words on a page or objects in a scene. Any number of possibilities come to mind. When entering a hotel room for the first time, people want to know where the light switches are, how to change the temperature, and where the outlets are to recharge their devices. Imagine looking through your glasses and having these things highlighted in your view.
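Here is a minimal sketch of that idea, assuming a hypothetical object detector whose output format I’ve invented: the experience keeps only the detections that match the guest’s current goals and highlights those.

```python
# Goal-driven highlighting: filter a (hypothetical) detector's output
# down to the objects the guest actually cares about right now.
detected = [
    {"label": "light switch", "box": (40, 120, 60, 140)},
    {"label": "thermostat",   "box": (300, 80, 340, 120)},
    {"label": "power outlet", "box": (500, 400, 520, 420)},
    {"label": "wall art",     "box": (200, 50, 400, 200)},
]

guest_goals = {"light switch", "thermostat", "power outlet"}

highlights = [d for d in detected if d["label"] in guest_goals]
for h in highlights:
    # In a real heads-up display this would draw an overlay at h["box"].
    print(f"highlight {h['label']} at {h['box']}")
```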

Wayfinding

Given the successes with lidar and automated cars, it seems likely that the type of heads-up display I just mentioned could also bring to your attention the highway exit you need to take, that tucked-away subway entrance, or the store you might be seeking in the mall. Much like in game playing, it could show two views: the immediate scene in front of you, and a bird’s-eye map of the area showing where you are in that space.

Memory/Language

We work with a number of major retailers and financial institutions that seek to provide personalization in their digital offerings. By gathering evidence through search terms, clickstreams, communications, and surveys, one could easily see the organization and terminology of a system being tailored to the individual. Video cameras are a good example: some customers might be just starting out and need a good camera for YouTube videos, while others might be seeking specific types of ENG (electronic news-gathering) cameras with 4:2:2 color, and so on. Neither group really wants to see the other’s offerings in their search results, and the language and detail each group needs would be very different.
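A minimal sketch of that tailoring follows. The cue words and the two vocabularies are invented; a real system would learn them from the search terms, clickstreams, and surveys mentioned above.

```python
# Infer a shopper's segment from their search terms, then choose the
# vocabulary to show them. Cue words and copy are illustrative only.
EXPERT_CUES = {"eng", "4:2:2", "codec", "xlr", "log profile"}

def infer_segment(search_terms: list[str]) -> str:
    hits = sum(1 for t in search_terms if t.lower() in EXPERT_CUES)
    return "professional" if hits >= 2 else "beginner"

COPY = {
    "beginner": "Great starter cameras for your first YouTube videos",
    "professional": "ENG bodies with 4:2:2 color and XLR audio inputs",
}

terms = ["ENG", "4:2:2", "shoulder mount"]
print(COPY[infer_segment(terms)])  # professional copy for this shopper
```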

Decision Making

I have discussed the fact that problem solving is really a process of breaking down large problems into their component parts and solving each of these subproblems. In each step, you have to make decisions about your next move. Buying a printer is a good example. A design studio might want a larger-format printer with very accurate colors. A law firm might want legal paper handling, good multiuser functionality, and the ability to bill the printing back to the client automatically. A parent with school-aged kids might want a quick, durable color printer all family members can use. By asking a little about the needs of the individual, and supporting each of the microdecisions that need to be made along the way (e.g., What is the price? How much is toner? Can it print on different sizes of paper? Do I need double-sided printing? What reviews are there from families?), the ML/AI might be able to intuit the types of goals the individual might have. The individual’s location in the problem space might suggest exactly what that person should and shouldn’t be presented with at this time.
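To make the microdecision idea concrete, here is a minimal sketch using the printer example. The catalog entries and attribute names are invented for illustration; the point is that each answered question narrows the problem space and hints at the shopper’s underlying goals.

```python
# Microdecision support: every answered question narrows the catalog,
# so the experience only surfaces what still matters. All data invented.
catalog = [
    {"name": "StudioJet Pro",   "wide_format": True,  "duplex": True,  "price": 2400},
    {"name": "OfficeMax 9000",  "wide_format": False, "duplex": True,  "price": 600},
    {"name": "FamilyPrint 200", "wide_format": False, "duplex": False, "price": 120},
]

def narrow(options, **answers):
    """Keep only printers consistent with every microdecision so far."""
    return [p for p in options
            if all(p.get(k) == v for k, v in answers.items())]

# The shopper has decided they need double-sided printing, not wide format.
remaining = narrow(catalog, duplex=True, wide_format=False)
print([p["name"] for p in remaining])  # ['OfficeMax 9000']
```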

Emotion

Perhaps one of the most interesting possibilities is that increasingly accurate systems for detecting facial expressions, movement, and speech patterns can ascertain the user’s emotional state. That state could then be used to moderate the amount of information presented on a screen or the words used (perhaps the user is overwhelmed and wants a simpler route to an answer).
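A minimal sketch of that moderation, assuming an emotion label arrives from some upstream facial-expression or speech-prosody model (the labels and the simplification rule are placeholders):

```python
# Adapt how much is presented based on a detected emotional state.
def present(answer_steps: list[str], emotion: str) -> list[str]:
    if emotion in {"frustrated", "overwhelmed"}:
        # Simplify: show only the single most direct step.
        return answer_steps[:1] + ["(Tap for the full walkthrough.)"]
    return answer_steps  # calm users get the complete detail

steps = ["Restart the router.", "Check the DSL light.", "Call support."]
print(present(steps, emotion="overwhelmed"))
```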

The possibilities are endless, but they all revolve around what the individual is trying to accomplish, how they think they can accomplish it, what they are looking for right now, the words they expect, how they believe they can interact with the system, and where they are looking. I hope that framing your problem in terms of the Six Minds will allow you and your team to exceed all previous attempts at satisfying your users with a brilliant experience. And I hope you can heighten every one of your users’ cognitive processes in reality, just as that fictional team of scientists augmented the physical capabilities of the Six Million Dollar Man.

Concrete Recommendations

  • Explore different ways of training AI systems explicitly for semantics (rather than skipping this level of representation).
  • Consider explicitly training AI systems in specific types of syntactic patterns that were less common in the findings you collected.
  • Think about how you want to augment cognition (directing attention, encouraging certain kinds of interactions, providing information persuasively, etc.).