Chapter 2. The Superpowers of Autonomous AI

Arthur C. Clarke said that “Any sufficiently advanced technology is indistinguishable from magic.” He was saying that new technology can revolutionize how we do things. When you apply it to the right problem or situation, it can seem like magic compared to previous solutions. The automobile, with the “horsepower” inside its combustion engine, must have seemed like magic compared to a horse-drawn carriage.

The Gemini and early Apollo space missions relied on human computers to chart the trajectory of the spacecraft and plan how to control the rocket into orbit. The IBM computer that performed those calculations on tiny transistors, thousands of times faster than the fastest humans, must have seemed like magic to many.

In the same way, the things that modern AI can do seem like magic compared to earlier methods machines have used to make decisions. Take the engineers at Siemens who successfully taught an AI to auto-calibrate CNC machines 30 times faster and more precisely than expert operators. As the spinning tools (they look like drill bits) cut the metal inside a CNC machine, friction causes the machine to lose accuracy, so after cutting a certain amount of metal, it needs to be recalibrated. To calibrate a machine, an expert operator performs a standard cutting operation such as a circle, takes measurements during the operation, then adjusts the machine to remove the error, repeating the process many times until the machine error is less than 2 microns. Sending expert operators out to manually calibrate machines this way is costly and time-consuming, but experts and programmers could not find a way to auto-calibrate machines using existing methods. The AI not only calibrated the machines 30 times faster than experts, it calibrated to superhuman accuracy and worked across many different types of machines and cutting operations.

Augmenting Human Intelligence

Of the first 100 brains that I designed, 65 were intended to help humans perform better at their jobs by recommending decisions in real time. In each of these cases, people turned to AI to help them make high-value decisions because they were looking for answers to a changing world, a changing workforce, or pressing problems. Though many fear that AI will replace human intelligence and take people’s jobs, I believe that the future of useful AI is a world where AI augments human intelligence, with humans using AI to help them make decisions that outperform either humans or AI alone. We see this in chess, where Crampton and Stephens demonstrated that teams of humans and AI playing together regularly beat both top chess players and sophisticated AI, and I believe that you will see this in the AI that you design, too.

How humans make decisions and acquire skills

Each of the limitations of machine automation above could be mitigated by more human-like decision-making. How do we know this is true? When machines don’t make good decisions, humans rush in to take over decision-making. I worked with a major airline to design an AI to better transfer bags between planes during layovers. If the bags are not transferred in time, they miss their next flight. Each group of bags that needs to transfer from one plane to another can be placed on one of the following transports:

  • A conveyor system can transport many bags at a time, but more slowly.

  • A system of carts drives across the tarmac to deliver the bags from the tail of one plane to the tail of another. This method is faster but can accommodate far fewer bags at a time.

Each of these transport mechanisms represents a strategy that airport managers can use to move a group of bags from one plane to another. If you don’t use the right strategy at the right time, bags get missed. This decision is pretty easy when things are going as planned in the airport. The automated scheduling system uses a single expert rule and an optimization routine: if the layover is greater than 45 minutes, put the bags on the conveyor; otherwise, put the bags on the carts.
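
Expressed as code, the rule is almost trivially simple. Here’s a minimal sketch in Python (the function and strategy names are mine, not the airline’s actual system):

```python
# A minimal sketch of the automated scheduler's threshold rule.
# The 45-minute cutoff comes from the rule described above; the
# function and strategy names are illustrative, not the airline's code.

def choose_transport(layover_minutes: float) -> str:
    """Pick a transport strategy for a group of transfer bags."""
    if layover_minutes > 45:
        return "conveyor"  # slower, but carries many bags at once
    return "carts"         # fast tail-to-tail delivery, far fewer bags
```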

The problem arises when things don’t go as planned. When flights are cancelled due to weather, crews get sick, or other unexpected events occur, the 45-minute rule breaks down. As flights get rerouted, some bags on the carts no longer need to rush and take up valuable space needed to get bags to rescheduled flights; other bags get placed on the conveyor and will miss their flights. When these irregular operations occur, humans rush in to make the nuanced decisions (that a threshold rule could never make well) about which strategy to use for each set of bags, based on predictions and schedule adjustments.

Figure_1-21.png
Figure 2-1. Two strategies for transferring bags during layovers

Humans act on what they perceive

We don’t know exactly how it works, but humans make decisions based on loops of perception and action. We perceive, then we act. Very rarely do we stop and make explicit calculations the way machines do when we control complex processes. For example, do you stop and calculate angles during your golf swing, or explicitly search through options while driving?

Humans build complex correlations into their intuition with practice

Many times during brain design workshops, people have told me that humans aren’t good at managing many variables. I disagree. We don’t think visually in more than three dimensions, and for most of us it is hard to calculate in multi-dimensional space, but we don’t use either of these methods to make complex decisions. We use our intuition. So, while it’s true that we don’t calculate well in many dimensions, we do learn to manage many variables while making decisions. It just takes time (a lot of time), feedback, repetition, and exploration of the task landscape to build the correlations between these many variables into our intuition. I have a friend who knits. She tells me that it is important to select the right kind of yarn to match the kind of garment you are making. She doesn’t use a lookup table to select the right type of yarn for each knitting project; she uses her intuition from previous knitting projects to make the decision.

Humans abstract to strategy for complex tasks

The AI researchers on my team had a strong hunch that the best way to train AI was through imitation. It’s a reasonable hypothesis; there are even special algorithms that learn by imitating the actions taken by experts. Imitation is indeed powerful, but every time I interviewed experts, they told me about strategy.

At the steel mill, experts filled whiteboards with the different strategies they use to coat steel evenly to the right thickness (this process, called galvanization, prevents rust). Some strategies work best when the steel is thick and narrow and the coating is thick. Other strategies work best when the steel is wide and thin and the coating is thin. Why was I hearing so often about strategies when talking to industrial experts about how they manage their processes? Because humans use strategies to make complex decisions and to teach each other.

Note

A strategy is a labeled course of action that describes what to do in a specific scenario.

Here are some examples of strategies from well-known games and industrial processes that I discuss in this book. Notice how the strategies (and the goals; see Chapter 4, “Setting Goals for Autonomous AI,” for more details) come in pairs. This is because most real and natural systems have a fuzzy trade-off: you know, yin and yang, both sides of the coin, playing the devil’s advocate. See how many phrases we have to describe this phenomenon?

Table 2-1. Strategies in decision-making

| Task | Strategy | When to use strategy |
| --- | --- | --- |
| Controlling the damper in an HVAC system | Close the damper to recycle air. | Energy is expensive and air is very hot or cold. |
| Controlling the damper in an HVAC system | Open the damper to freshen air. | Building occupancy is high, air quality is bad, or energy is cheap. |
| Transferring bags to destination plane during layover | Put the bags on the slower, high-bandwidth conveyor. | Longer layovers |
| Transferring bags to destination plane during layover | Deliver the bags tail to tail with carts. | Shorter layovers |
| Crushing rocks in a gyratory crusher | Choke the crusher by stuffing it full of rocks. | Rocks are large and hard. |
| Crushing rocks in a gyratory crusher | Regulate the crusher by keeping it less than ¾ full. | Rocks are small and soft. |
| Scoring the ball through the hoop in basketball | Shoot a layup. | Very close to the basket |
| Scoring the ball through the hoop in basketball | Shoot a jump shot. | Farther from the basket |

So, the experts were telling me that they use strategy to communicate and teach what to do, but the AI researchers and data scientists were telling me that it’s better to search for options. Enter the Dreyfus Model of Skill Acquisition!

Hubert Dreyfus (1929–2017) and his brother Stuart developed a model to describe how humans acquire skills. These expert systems and computer science pioneers developed their model of skill acquisition for the United States Air Force in 1980 to help fighter pilots improve their emergency response. There are many models that attempt to describe how humans learn, and this one has its critics, but I’ve not seen a better model for describing skill acquisition in a way that’s useful for designing Autonomous AI.

Figure_1-23.png
Figure 2-2. The Dreyfus Model of Skill Acquisition adapted for designing AI brains

Beginner

The first-stage learner, the beginner, learns the rules and goals of the game and begins to practice simple expert rules. The beginning chess player must first learn the goal of the game: you win when you capture (land on a square occupied by) your opponent’s king. Then the beginner learns the rules of the game: which movements each piece can make turn by turn (bishops can move any distance along diagonal unoccupied squares, rooks any distance along lateral unoccupied squares). Next, the beginner is usually given one or more strategies to practice. One common strategy is a point system that assigns a point value to each captured piece (see Figure 2-3). The beginner then practices many games, tallying the points accumulated with the pieces she captures. This system emphasizes a few mental shortcuts: queens are extremely important, so protect them; rooks are more important than knights and bishops. Why does this matter for designing AI? Autonomous AI learns by practicing, just like human beginners, so you can teach it by giving it rules to practice.

Figure_1-24.png
Figure 2-3. Point system for chess.
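
As a minimal sketch, the point system amounts to a simple material count. The values below follow the standard chess convention (pawn 1, knight 3, bishop 3, rook 5, queen 9); see Figure 2-3 for the exact system the beginner practices:

```python
# A sketch of the beginner's point system: tally the values of the
# pieces a player has captured. Values follow the standard convention;
# the exact system the beginner practices is shown in Figure 2-3.

PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_score(captured_pieces: list[str]) -> int:
    """Sum the point values of a player's captured pieces."""
    return sum(PIECE_VALUES[piece] for piece in captured_pieces)

print(material_score(["pawn", "knight", "rook"]))  # 9
```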

Advanced Beginner

As the advanced beginner practices, she develops these rules into fuzzy skills (concepts) by identifying exceptions to the rules under a variety of different conditions. When I first learned how to play Texas Hold ‘Em poker, I read a book by a World Poker Tour champion who provided the following expert rule for me to practice as a beginner: play top 10 hands only, fold everything else. His expert system defined the top 10 hands (AA, AK, etc.) and advised beginning by playing those hands only. Week after week, I did just that. I played only hands from the expert list and folded every other hand. Eventually, I developed preferences for playing hands that were not on the list. For example, depending on the flop (the three cards in the middle of the table that all players can combine with the cards they hold to form a poker hand) and the betting behavior of my opponents, I might gladly play a pair of eights even though it is not a top 10 hand.
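
Here’s a rough sketch of that progression in code: the beginner’s expert rule plus one exception learned through practice. The hand list beyond AA and AK and the exception’s condition are hypothetical illustrations, not the champion’s actual system:

```python
# The beginner's expert rule ("play top 10 hands only, fold everything
# else") plus one learned exception. The hand list beyond AA and AK and
# the exception's condition are hypothetical illustrations.

TOP_10_HANDS = {"AA", "KK", "QQ", "JJ", "TT", "99", "AK", "AQ", "AJ", "KQ"}

def should_play(hand: str, opponents_bet_weakly: bool = False) -> bool:
    """Beginner: follow the list. Advanced beginner: allow exceptions."""
    if hand in TOP_10_HANDS:
        return True
    # Learned exception: a pair of eights can be worth playing when
    # the opponents' betting behavior signals weakness.
    if hand == "88" and opponents_bet_weakly:
        return True
    return False
```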

Why does this matter for designing Autonomous AI? Autonomous AI identifies and learns exceptions to the rules during practice, much like humans do. Both human and AI brains expand the rules they are given for practice into skills and strategies that use both the rules and the exceptions to make decisions.

Competent

As the learner moves from advanced beginner to competent, they abstract up a level: from considering and calculating each move to deploying various learned strategies at various times. They learn strategies and practice when to use them. This is the stage where the chess student, after practicing the point system and gaining a feel for how the pieces move and interact, begins to learn known opening sequences as well as strategies for the midgame and for mating (the final phase of the game, where you capture the opponent’s king). The coach of a competent student gives more advice about which strategy to use for each board position than about which individual moves to make. During this phase, coaches and teachers are also keenly aware of checkpoints that suggest transitions between strategies.

Why does this matter for designing Autonomous AI? Autonomous AI can learn strategy. This means that we can teach multiple AI brains different strategies (one for each brain) and arrange those brains in a hierarchy that decides which strategy to use under which conditions, as in the sketch below. See “Autonomous AI improvises and strategizes” for more details.
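
Here’s a minimal sketch of that idea for chess: one function per strategy, and a selector that decides which strategy applies. The hand-written conditions stand in for what a trained selector brain would learn:

```python
# A sketch of a strategy hierarchy: each strategy is its own
# decision-maker, and a selector chooses which one acts. The phase
# thresholds are invented; a trained selector brain would learn them.

def opening_book(position):
    return "play a known opening sequence"

def midgame_plan(position):
    return "improve piece activity and king safety"

def mating_attack(position):
    return "drive the opponent's king into a corner"

def select_strategy(move_number: int, pieces_remaining: int):
    """Decide which strategy brain should make the next move."""
    if move_number <= 10:
        return opening_book
    if pieces_remaining <= 12:  # few pieces left: head for mate
        return mating_attack
    return midgame_plan
```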

Proficient

The proficient learner spends a huge amount of time building their catalogue of strategies and learning which conditions favor deploying each strategy. The proficient learner also begins to improvise on and across strategies. This phase of skill mastery requires a tremendous amount of practice time, and for humans it requires a significant amount of determination and dedication.

Why does this matter for designing Autonomous AI? In situations where we know many, many strategies, we can teach each of them to Autonomous AI and let the brain figure out the best situations to use each strategy.

Expert

The expert learner has built their strategies, and their preferences about when to use them, into their intuition. Many expert learners transcend their strategies and develop completely new paradigms of play that match their unique style. For example, the unknown author of the 15th-century Göttingen manuscript (the earliest known work devoted to modern chess) so preferred the playing style that favors building edifices, closed-center positions, and knights over bishops that the manuscript lays out the Queen’s Gambit, a strategy that invites opponents to play the aggressive, high-mobility style the Queen’s Gambit is designed to crush. Experts like this develop completely new chess paradigms to organize, abstract, and make sense of the large number of strategies that match their chess-playing personality. Some experts also become teachers by codifying new paradigms into building blocks of skill that beginners can practice.

There’s a new kind of AI in town

Each of the Autonomous AI applications that I cite in this book leverages an exciting new form of AI called Deep Reinforcement Learning (DRL). DRL allows optimization algorithms to exhibit useful human-like decision-making in some controlled situations. To design highly capable brains, it is important to acknowledge the capability of DRL but also to recognize that the principles of brain design transcend any particular technology, including DRL. There are two key components to DRL: reinforcement learning algorithms and deep neural networks.

Reinforcement Learning

In reinforcement learning, an AI brain acquires skill for a specific task through trial and error and by receiving feedback. The brain practices completing the task, gets feedback from a digital simulation (or from the real thing), and learns to take actions that lead to the best rewards.

Reinforcement Learning (RL) algorithms approximate learning by changing their behavior based on feedback. RL algorithms try actions and receive feedback in the form of a “reward”: an index of how successful the resulting actions were toward the goal of the task. These learners are like young children: infinitely curious (often well beyond their years) and persistent (to the point of being obstinate).

The algorithms do not apply reasoning to their exploration: they learn through massive trial and error. They efficiently explore large spaces and succeed at tasks by learning complex behaviors that achieve rewards under a huge variety of scenarios. Along the way, they often learn the fundamental dynamics of the task, which can let them succeed in scenarios for which they were not trained. For example, the brain I referenced above that Microsoft built for Siemens practiced so much (it made more than 100 million decisions in simulation) that it learned to successfully calibrate real-life machines, even on operations that it had never experienced in training.
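
To make the trial-and-error loop concrete, here’s a minimal Q-learning sketch on a toy task: walk along five positions and reach the goal at the end. This is generic textbook reinforcement learning, not the Siemens system:

```python
import random

# A minimal Q-learning loop: try an action, get a reward, update the
# value estimate, repeat. Toy task: start at position 0 and learn to
# reach position 4. Generic textbook RL, not any system in this chapter.

N_STATES, ACTIONS = 5, (-1, +1)           # actions: step left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2     # learning rate, discount, exploration

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # Explore occasionally; otherwise exploit the best known action.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        # Nudge the estimate toward reward plus discounted future value.
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state
```

After a few hundred episodes, the highest-valued action in every state points toward the goal: behavior learned entirely from feedback, with no model of the task programmed in.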

For details on how reinforcement learning algorithms work, I recommend Reinforcement Learning by Phil Winder, PhD (O’Reilly).

Figure_1-25.png
Figure 2-4. Reinforcement learning algorithms “learn” by trying actions, getting feedback and pursuing maximal future reward.

Neural Networks

A neural network is a system of interconnected nodes that imitates the neurons in our brains. Each node’s connections carry weights (values of importance) that are adjusted based on the effect each weight has on the output of the entire network. As the weights evolve, the network learns to represent complex relationships between its inputs and outputs. When there are multiple layers of these nodes in the network, this approach is called deep learning. So, a neural network devises (or learns) outputs by correlating success in a task with certain conditions and assigning more weight to the relevant nodes in the network.

Autonomous AI “stores” its learning in a neural network, which can build and respond to complex, non-linear relationships. The Universal Approximation Theorem assures us that, in theory, a large enough neural network can represent the input-output relationships needed to control any system, so with practice a brain can learn them. The reinforcement learning algorithm manipulates the neural network weights as the brain learns which actions lead to maximum future rewards for each environment state. To learn more about neural networks, I recommend Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron (O’Reilly).
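
To ground the idea, here’s a toy two-layer network in NumPy: a forward pass turns inputs into an output, and one gradient step adjusts each weight according to its effect on the error. This is a generic illustration of the mechanism, not any production architecture:

```python
import numpy as np

# A toy two-layer neural network. Weights connect the nodes; the
# forward pass maps inputs to an output; backpropagation adjusts each
# weight in proportion to its effect on the output error.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))   # input layer -> hidden layer weights
W2 = rng.normal(size=(4, 1))   # hidden layer -> output weights

x = np.array([[0.5, -1.0, 2.0]])   # one example input
target = np.array([[1.0]])         # desired output

h = np.tanh(x @ W1)    # hidden layer with non-linear activation
y = h @ W2             # network output
error = y - target

# One gradient step: weights that contributed more to the error move more.
grad_W2 = h.T @ error
grad_W1 = x.T @ ((error @ W2.T) * (1 - h**2))
W1 -= 0.1 * grad_W1
W2 -= 0.1 * grad_W2
```

Run in a loop over many examples, this same update gradually builds the complex input-output correlations described above.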

Autonomous AI makes more human-like decisions

I’m not claiming that AI brains can achieve parity with humans or complete human capability in any area. I’m saying that when an Autonomous AI is designed properly and makes full use of existing AI and automation components, it can radically outperform systems that calculate actions from known mathematical relationships, search and select actions using objective criteria, or look up actions from recorded human expertise. Let’s look more deeply at each of the attributes of human-like decision-making in specific situations.

Autonomous AI perceives, then acts

One unique quality that sets Autonomous AI apart from automated systems is its ability to learn from “what it sees and hears” and make supervisory decisions. For example, when human operators control kilns that cook limestone as the first step in the cement-making process, they use their senses. They supervise and control the process based on how the kiln flame looks: its shape, its color, and the haziness of the air around it. When an AI can do the same, this kind of “sensory perception” moves it toward higher-level executive functioning (but only in the context of machine and process control).

While automated systems cannot make supervisory decisions based on visual and auditory perception, Autonomous AI can. Deep Reinforcement Learning, which uses reinforcement learning algorithms to train neural networks, is the only control system technology that can take in visual, auditory, and categorical perception and act on that perception to control the system.

Autonomous AI learns and adapts when things change

I once met with a room full of Model Predictive Control (MPC) experts who quizzed me (it was perfectly friendly, I promise) about the benefits of DRL compared to MPC. After over 60 minutes of conversation, one of the researchers gave me an example that helped me explain. Have you ever borrowed a friend’s car and screeched the tires (accelerated too quickly) because the accelerator was touchier than yours? I drive an old Honda Pilot that has a really mushy gas pedal: you have to press the pedal far down to get it to accelerate. The relationship between how far you press the pedal and how much acceleration you get is called a gain. Touchy gas pedals have a high gain, and mushy gas pedals have a low gain. MPC control systems (and most other control systems) have a single behavior for the variable they control. Either way, a gain-based control system will never change its behavior. It will either be touchy or mushy.

Autonomous AI can learn to change its behavior based on the circumstances it is in. It can be a mushy gas pedal when it needs to be and a touchy gas pedal when it needs to be.
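
Here’s the difference in miniature: a classic controller applies one fixed gain everywhere, while a learned policy can respond differently depending on context. The gains and the context rule below are made-up numbers for illustration:

```python
# Fixed-gain control vs. context-dependent control. The gain values
# and the icy-road rule are made up for illustration.

def fixed_gain_throttle(speed_error: float) -> float:
    """Classic controller: one gain, always. Touchy or mushy, forever."""
    GAIN = 0.5
    return GAIN * speed_error

def adaptive_throttle(speed_error: float, on_ice: bool) -> float:
    """A learned policy can behave differently in different conditions."""
    gain = 0.1 if on_ice else 0.8  # gentle on ice, responsive on dry road
    return gain * speed_error
```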

Autonomous AI can spot patterns

Humans often use pattern recognition from previous experience to make decisions. Supply chain and logistics professionals have described to me many times how they decide how much of each product to make, and how to deliver it, by matching patterns. One of these companies makes watersports equipment like kayaks and canoes and built an AI to help them make better plans and schedules. They produce the boats and ship them to big-box stores like Walmart, Target, and Costco, who sell them. Here’s the challenge: each of those stores provides a forecast for how many boats of each type and color it wants, but these forecasts are never completely accurate and never will be. So the best way to deal with this is for planners to match patterns based on past experience. The patterns sound like these made-up examples:

  • Costco tends to ask for more canoes than they actually need during the off-season.

  • Target tends to underestimate the number of blue boats they need during peak season.

  • Walmart overestimates how many kayaks they need in urban areas but underestimates how many they need in rural areas.

Much like humans, AI brains can match patterns like the ones above, even complex patterns that involve many variables, as in the sketch below.
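
Here’s that pattern matching sketched in code: bias factors learned from past seasons adjust each store’s raw forecast. Every store, product, and number below is made up, like the example patterns above:

```python
# Forecast pattern matching, sketched: scale each raw forecast by a
# bias factor learned from past seasons. All names and numbers are
# made up, like the example patterns above.

LEARNED_BIAS = {
    ("Costco", "canoe", "off-season"): 0.85,  # tends to over-order
    ("Target", "blue kayak", "peak"):  1.20,  # tends to under-forecast
}

def adjusted_forecast(store: str, product: str, season: str, raw: float) -> float:
    """Apply the learned bias for this store/product/season, if any."""
    return raw * LEARNED_BIAS.get((store, product, season), 1.0)

print(adjusted_forecast("Costco", "canoe", "off-season", 200))  # 170.0
```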

Autonomous AI infers from experience

One of the key motivations for developing reinforcement learning algorithms was to handle situations where some system changes cannot be measured. Of course, key states in the environment still need to be measured, but Autonomous AI can infer and respond to some changes in the environment by building correlations with other state variables. Pepsi built an AI that does this. The AI, which performs like an expert operator making Cheetos™ snack foods, responds to the moisture level of the incoming corn and controls the manufacturing equipment differently based on how the process responds to changes in moisture, even though the moisture level itself is not measured.

Autonomous AI improvises and strategizes

Deep Reinforcement Learning is the only technology of any kind that has demonstrated the ability to learn strategy. Early chess-playing machines utilized programmed strategies, but brains learn strategy as they gain experience. The AlphaZero chess-playing AI learned and regularly used the 12 most common opening move sequences in chess on its own, without being taught.

If brains can learn chess strategies, they can also learn strategies to control high-value equipment and processes. For example, a piece of equipment called a gyratory crusher is commonly used to crush rocks as the first step in many mining processes. The goal is to crush as much rock as possible (measured in tons per hour) to the particle size that will fit through the holes of a large shaking sieve. Inside the crusher, a gyrating steel arm crushes the rocks against a steel cone, and the crushed rocks fall through the bottom of the funnel. If you stuff the crusher chock-full, the resulting compression forces crush even the largest, hardest rocks, but it takes more time for the rocks to move through the crusher. If you fill the crusher ⅔ to ¾ full, you can move more material through per hour, but the crusher doesn’t generate as much compressive force. The first control strategy efficiently crushes large, hard rocks; the second is perfect for increasing throughput when the rocks are softer and smaller. An Autonomous AI can learn to move between these two strategies to maximize throughput.

Figure_1-26.png
Figure 2-5. Choke the gyratory crusher when the rocks are large and hard. This ensures that they get crushed efficiently. Regulate the crusher when the rocks are softer and smaller. This maximizes the throughput of crushed rocks.
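
As a sketch, the two crusher strategies and a rule for moving between them might look like this. The size and hardness thresholds are invented; a trained brain would learn where the crossover actually lies:

```python
# Two control strategies for a gyratory crusher and a switching rule.
# The thresholds are invented for illustration; a trained brain would
# learn the real crossover from practice.

def choke(feed_rate: float) -> str:
    return "stuff the crusher full for maximum compressive force"

def regulate(feed_rate: float) -> str:
    return "hold the crusher at 2/3 to 3/4 full for maximum throughput"

def crusher_strategy(rock_size_mm: float, hardness_mohs: float):
    """Choose a strategy based on the incoming rock."""
    if rock_size_mm > 250 or hardness_mohs > 6:
        return choke      # large, hard rocks need compression
    return regulate       # small, soft rocks: prioritize throughput
```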

Autonomous AI can plan for the long-term future

In complex situations, the move that looks optimal in the short term can be shortsighted: not optimal in the long term. Let me give you an example: squirrels store up nuts and save them for later. Bears store energy as fat, not because they need it now but because they will need it later when they hibernate during the winter.

Researchers at Microsoft Bonsai experienced this forward thinking in an amusing way while teaching the Jaco robotic arm to grasp one block and stack it on another. The seven-jointed robot arm was folding in on itself while performing the task, so the researchers set a reward for maximizing the distance between the block and the shoulder of the robotic arm and a stiff penalty for bringing the block closer to the base and shoulder.

After more practice, the arm began to wind up, bringing the block closer to its shoulder (at great short-term penalty) in order to strike or throw the block. The agent had learned that it could achieve much greater rewards in the long term by winding up and throwing the block, more than enough to cancel out the penalty for winding up. Of course, what the researchers really intended was to reward the agent for keeping the robot wrist far from the shoulder. Once they changed the parameters of the reward to be more specific, the change disincentivized the throwing behavior and prompted the robot to learn the intended behavior of extending its arm.

You might be thinking: why not reward the AI for wrist distance from the beginning to incentivize this behavior? That’s a great question. Why did the brain learn to throw when it was penalized for winding up, and what might it have done without that penalty? We got lucky. Either way, if we tried this again, the AI would probably explore a lot of movements that don’t include a windup. It’s best to teach winding up explicitly as a skill.
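
For the curious, here’s a sketch of the two reward functions: the loophole-prone one and the corrected one. The distances and weights are invented for illustration:

```python
# The mis-specified reward vs. the intended one, sketched. Rewarding
# block-to-shoulder distance while penalizing inward motion still lets
# a long throw outweigh the windup penalty; rewarding wrist-to-shoulder
# distance removes the throwing payoff. All numbers are invented.

def loophole_reward(block_distance: float, moved_inward: bool) -> float:
    penalty = 5.0 if moved_inward else 0.0
    return block_distance - penalty   # a thrown block flies far: net win

def intended_reward(wrist_distance: float) -> float:
    return wrist_distance             # keep the arm extended; no payoff for throwing
```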

So far in this chapter, we’ve demonstrated that the human-like decision-making capabilities of Autonomous AI can outperform automated machine intelligence. The rest of this book will describe how to design Autonomous AI, but let’s finish off this chapter with a discussion of when to use Autonomous AI.

When should you use Autonomous AI?

You don’t need to replace every control system, optimization algorithm, and expert system with an AI brain, and you don’t need Autonomous AI for every application. Autonomous AI is an investment; here’s when you should consider it.

When the superpowers matter most

You now know how to compare the decision-making capabilities of brains with automated systems and humans. Now you need to decide whether or not this difference in performance matters. Almost every brain I’ve ever designed is for a system or process where a 1% improvement in the key performance indicators leads to over $1 million USD in savings or increased revenue for one facility. That matters.

When humans need to take over the decision-making process

When humans have to take over, it’s a surefire sign that you need more human-like decision-making. Here are a few final tips for when to use Autonomous AI:

  • Look for high-value decisions.

  • Look for decisions that require human intervention.

  • Look for complex decisions that require complex perception.

Table 2-2. Comparison of decision-making capabilities for machine intelligence and humans

| Capability | Control Theory (Math) | Optimization (Menus) | Expert Systems (Manuals) | Deep Reinforcement Learning | Autonomous AI | Humans |
| --- | --- | --- | --- | --- | --- | --- |
| Changes behavior | No | No | Limited | Yes | Yes | Yes |
| Fuzzy control | Limited | No | No | Yes | Yes | Yes |
| Non-linear control | Limited | No | No | Yes | Yes | Yes |
| Advanced perception | No | No | Limited | Yes | Yes | Yes |
| Understands concepts | Limited | No | Yes | Limited | Yes | Yes |
| Explores efficiently | No | Yes | No | Yes | Yes | Yes |
| Predictable and deterministic | Yes | Yes | Yes | No | Limited | Yes |
| Can make decisions across multiple types of tasks | No | No | No | No | No | Yes |
