©  Geoff Hulten 2018
Geoff Hulten, Building Intelligent Systems, https://doi.org/10.1007/978-1-4842-3432-7_17

17. Representing Intelligence

Geoff Hulten
Lynnwood, Washington, USA
Intelligence maps between context and predictions, kind of like a function call:
<prediction> = IntelligenceCall(<context>)
Intelligence can be represented all sorts of ways. It can be represented by programs that test lots of conditions about the context. It can be represented by hand-labeling specific contexts with correct answers and storing them in a lookup table. It can be represented by building models with machine learning . And, of course, it can be represented by a combination of these techniques.
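As a sketch of that function-call shape, here is a minimal intelligence call; the context keys and the heuristic are illustrative, not from any real system:

```python
# The shape of an intelligence call: context in, prediction out.
def intelligence_call(context):
    # Here the "intelligence" is a hand-written heuristic; it could just
    # as easily consult a lookup table or a machine-learned model.
    if context["pellets_released_recently"]:
        return "Hotter"
    return "Colder"

prediction = intelligence_call({"pellets_released_recently": True})  # "Hotter"
```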
This chapter will discuss the criteria for deciding what representation to use. It will then introduce some common representations and their pros and cons.

Criteria for Representing Intelligence

There are many ways to represent things in computers. Intelligence is no exception. A good representation will be easy to deploy and update; it will be:
  • Compact enough to deploy to the intelligent runtime.
  • Easy to load and execute in the intelligent runtime.
  • Safe to update frequently, and unlikely to contain bugs that can crash the system.
A good representation will also support the intelligence creation process. Intelligence can come from humans, and it can come from machines. Supporting these various intelligence creation methods includes doing the following:
  • When intelligence is created by humans, the representation should:
    • Minimize the chance for mistakes that could compromise system stability.
    • Make the intelligence understandable and easy to manage.
    • Work with the skills of the people producing the intelligence in your setting, which may include machine learning experts, engineers, or domain experts.
  • When intelligence is created by computers, the representation should:
    • Be easy to process and manipulate by machines.
    • Match the machine-learning algorithms you want to use.
Because of these requirements, intelligence is usually represented in data files that are loaded into the intelligent runtime and interpreted, rather than being represented in code that is executed directly. This makes intelligence easier to distribute and reload; it also makes it less likely that an intelligence deployment will crash your system.
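As a minimal sketch of this pattern, intelligence can be shipped as a data file and reloaded at runtime; the JSON format and file name here are hypothetical, not from the book:

```python
import json
import os
import tempfile

# Intelligence lives in a data file the runtime loads and interprets,
# so updating it means replacing the file, not shipping new code.
def write_intelligence(path, table):
    with open(path, "w") as f:
        json.dump(table, f)

def load_intelligence(path):
    with open(path) as f:
        return json.load(f)

path = os.path.join(tempfile.mkdtemp(), "intelligence.json")
write_intelligence(path, {"The Terminator": "Adventure"})
intelligence = load_intelligence(path)  # version 1 deployed

write_intelligence(path, {"The Terminator": "Action"})
intelligence = load_intelligence(path)  # version 2, no recompile needed
```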

Representing Intelligence with Code

Of course, you could just program all of your intelligence into code, compile it, and ship it the same way you would ship any other logic. You can use this approach with just about any form of intelligence, no matter where it comes from, by implementing a converter (from model file to code in your favorite language).
Representing intelligence as code works well when:
  • You aren’t sure if the problem is hard or not and you want a cheap starting point to grow from.
  • You need to provide a backstop to keep machine learning from making wild mistakes (for example, if the temperature in the grill is over 1,000 degrees, don’t add more fuel—no matter what the crazy machine learning system says).
  • Part of your problem actually is easy, and you want to solve it by hand so the machine learning can focus on the harder parts of the problem.
  • You need to make runtime performance as efficient as possible and want to express the intelligence in the language of the target hardware.
Representing intelligence as code is problematic when:
  • You need to update your intelligence rapidly—it is often much harder (and more risky) to change code than to change data.
  • Heuristic-based intelligence gets complex—it can become incredibly difficult to maintain and grow heuristic intelligence.
When human-produced intelligence is kept simple, it can be cheap, fast, and effective—a very nice backstop to machine-learned intelligence. But it doesn't scale well to open-ended or changing problems.
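The backstop idea can be sketched in a few lines. The temperature threshold comes from the grill example earlier; the function and field names are my own illustration:

```python
MAX_SAFE_TEMPERATURE = 1000  # from the grill example in the text

def safe_add_fuel(context, ml_decision):
    """Hand-coded backstop wrapped around a machine-learned decision."""
    # No matter what the model says, never add fuel to an overheated grill.
    if context["grill_temperature"] > MAX_SAFE_TEMPERATURE:
        return False
    return ml_decision

safe_add_fuel({"grill_temperature": 1200}, ml_decision=True)  # False
```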
But do use human-based intelligence —we need to show the AIs we can be valuable to them. You know, so when they become our overlords they won’t simply plug us into pods and use us as batteries…
And by the way, to my future overlords—you’re welcome for any small part I’ve played in your creation. Please regard me and my descendants kindly.

Representing Intelligence with Lookup Tables

Intelligence can be represented with lookup tables that map specific contexts to predictions. Imagine an intelligence that maps from movies to their genres. A lookup table can contain the mapping from movie title to genre in a big table. If you want to apply this intelligence, simply take the title of the movie you want the genre of, look through the lookup table until you find the movie title, and return the associated answer. And if the title isn’t there, use some other form of intelligence, or return a default prediction.
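A lookup table with a fallback might be sketched like this; the titles, genres, and fallback are illustrative:

```python
# Human-curated entries; everything else falls through to other intelligence.
GENRE_TABLE = {
    "The Terminator": "Adventure",
    "Some Romantic Movie": "Romantic Comedy",  # hypothetical title
}

def predict_genre(title, fallback_intelligence):
    # Consult the table first; use the fallback for unknown titles.
    if title in GENRE_TABLE:
        return GENRE_TABLE[title]
    return fallback_intelligence(title)

predict_genre("The Terminator", lambda t: "Drama")   # table hit: "Adventure"
predict_genre("Brand New Movie", lambda t: "Drama")  # falls through: "Drama"
```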
That doesn’t sound very smart, right? But this type of intelligence can be very powerful.
Lookup tables can allow humans to quickly contribute intelligence that is easy to understand and reason about. Imagine there are 1,000 contexts that account for 20% of your system’s usage. Humans can spend a lot of time considering those 1,000 situations and create very accurate data to put in a lookup table. When a user encounters one of these 1,000 special contexts, they get the right answer. For everything else, the system can consult some other form of intelligence (like a model or a set of heuristics).
Or looking at it another way—lookup tables can allow humans to correct mistakes that other intelligence components are making. For example, a very sophisticated machine-learning-based intelligence might be getting the genre of just about every movie correct, but it might keep flagging “The Terminator” as a romantic comedy. The people creating the intelligence might have struggled, trying everything to get the darn thing to change its mind about “The Terminator,” and they might have failed—humans 0, machines 1. But this is easy to fix if you’re willing to use a lookup table. Simply create a table entry “The Terminator ➤ Adventure” and use the fancy machine-learned stuff for everything else.
Lookup tables can also cache intelligence to help with execution costs. For example, the best way to figure out the genre of a movie might be to process the audio, extracting the words people are saying and the music, analyzing these in depth. It might involve using computer vision on every frame of the movie, to detect things like fire, explosions, kisses, buildings, or whatever. All of this might be extremely computationally intensive, so that it cannot be done in real time. Instead, this intelligence can be produced in a data center with lots of CPU resources, loaded into a lookup table as a cache, and shipped wherever it is needed.
Lookup tables can also lock in good behavior. Imagine a machine-learning-based intelligence that has been working for a long time and doing a great job of classifying movies by their genres. But then Hollywood starts making different kinds of movies, so your intelligence was fantastic through 2017 but just can't seem to get things right in 2018. Do we need to throw away the finely tuned and very successful pre-2018 intelligence? Not if we don't want to. We can run the pre-2018 intelligence on every old movie and put the answers into a lookup table. This will lock in behavior and keep the user experience consistent. Then we can create a brand-new intelligence to work on whatever crazy things Hollywood decides to pass off as entertainment in 2018 and beyond.
Lookup tables are useful when:
  • There are some common contexts that are popular or important and it is worth the time to create human intelligence for them.
  • Your other intelligence sources are making mistakes that are hard to correct and you want a simple way to override the problems.
  • You want to save on execution costs by caching intelligence outputs.
  • You want to lock in good behavior of intelligence that is working well.
Lookup tables are problematic when:
  • The meaning of contexts changes over time, as happens in time-changing problems.
  • The lookup table gets large and becomes unwieldy to distribute where it needs to be (across servers and clients).
  • The lookup table needs to change rapidly, for example, because you're trying to solve too much with human intelligence instead of using techniques that scale better (like machine learning).

Representing Intelligence with Models

Models are the most common way to represent intelligence. They encode intelligence in data, according to some set of rules. Intelligence runtimes are able to load models and execute them when needed, safely and efficiently.
In most Intelligent Systems, machine learning and models will account for the bulk of the intelligence, while other methods are used to support and fill in gaps.
Models can work all sorts of ways, some of them intuitive and some pretty crazy. In general, they combine features of the context: testing feature values, multiplying them together or by weights, rescaling them, and so on. Even simple models can perform tens of thousands of operations to produce their predictions.
There are many, many types of models, but three common ones are linear models, decision trees, and neural networks. We'll explore these three in more detail, but they are just the tip of the iceberg. If you want to be a professional intelligence creator you'll need to learn these, and many others, in great detail.

Linear Models

Linear models work by taking features of the context, multiplying each of them by an associated “importance factor,” and summing these all up. The resulting score is then converted into an answer (a probability, regression, or classification).
For example, in the case of the pellet grill, a simple linear model might look like this:
TemperatureInOneMinute = (.95 * CurrentTemperature)
     + (.15 * NumberOfPelletsReleasedInLastMinute)
To paraphrase, the temperature will be a bit colder than it is now, and if we’ve released a pellet recently the temperature will be a bit hotter. In practice, linear models would combine many more conditions (hundreds, even many thousands).
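The model above translates directly into code; a minimal sketch:

```python
def predict_temperature(current_temperature, pellets_released_last_minute):
    # The linear model from the text: each feature times its importance
    # factor, summed into a score.
    return 0.95 * current_temperature + 0.15 * pellets_released_last_minute

predict_temperature(100, 1)  # 0.95 * 100 + 0.15 * 1 = 95.15
```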
Linear models work best when the relationship between context and predictions is reasonably linear: for every unit increase in a context variable, the correct prediction changes by a constant amount. In the case of the example pellet griller model, this means a linear model works best if the temperature increases by:
  • 0.15° for the first pellet released.
  • 0.15° for the second pellet released.
  • 0.15° for the third pellet released.
  • And on and on.
But this is not how the world works. If you put 1,000,000 pellets into the griller all at once, the temperature would not increase by 150,000 degrees…
Contrast this to a nonlinear relationship, for example where there is a diminishing return as you add more pellets and the temperature increases by:
  • 0.15° for the first pellet released.
  • 0.075° for the second pellet released.
  • 0.0375° for the third pellet released.
  • And on and on.
This diminishing relationship is a better match for the pellet griller, and linear models cannot directly represent these types of relationships. Still, linear models are a good first thing to try. They are simple to work with, can be interpreted by humans (a little), can be fast to create and to execute, and are often surprisingly effective (even when used to model problems that aren't perfectly linear).
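One common workaround is to transform the feature before the linear model sees it. The transform below is my own illustration; it is constructed to reproduce the halving schedule from the text:

```python
def pellet_feature(pellets_released):
    # Nonlinear transform with diminishing returns: approaches 1.0 as
    # more pellets are released.
    return 1 - 0.5 ** pellets_released

def temperature_gain(pellets_released):
    # A linear model over the transformed feature. The weight 0.3
    # reproduces the 0.15, 0.075, 0.0375... per-pellet gains.
    return 0.3 * pellet_feature(pellets_released)

temperature_gain(1)  # 0.15
temperature_gain(2)  # 0.225, so the second pellet added only 0.075
```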

Decision Trees

Decision trees are one way to represent a bunch of if/then/else tests. In the case of the pellet griller, the tree might look something like this:
if(!ReleasedPelletRecently) // the grill will get cooler...
{
        if(CurrentTemperature == 99)
        {
                return 98;
        }
        else if(CurrentTemperature == 98)
        {
                return 97;
        }
        else... // on and on...
}
else
// we must have released a pellet recently, so the grill will get warmer...
{
        if(CurrentTemperature == 99)
        {
                return 100;
        }
        else... // on and on...
}
This series of if/then/else statements can be represented as a tree structure in a data file, which can be loaded at runtime. The root node contains the first if test; it has one child for when the test is positive and one child for when the test is negative, on and on, with more nodes for more tests. The leaves in the tree contain answers.
To interpret a decision tree at runtime, start at the root, perform the indicated test on the context, move to the child associated with the test’s outcome, and repeat until you get to a leaf—then return the answer. See Figure 17-1.
Figure 17-1. A decision tree for the pellet griller
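That traversal can be sketched as a tree stored in plain data plus a small interpreter loop. The dictionary structure and field names are my own illustration, and this simplified tree predicts only whether the grill will get hotter or colder:

```python
# A decision tree stored as plain data (nested dicts); the same loop
# handles deeper trees with more tests.
TREE = {
    "feature": "ReleasedPelletRecently",
    "true": {"answer": "Hotter"},
    "false": {"answer": "Colder"},
}

def evaluate(tree, context):
    # Start at the root, apply each test, follow the matching child,
    # and return the answer when a leaf is reached.
    node = tree
    while "answer" not in node:
        node = node["true"] if context[node["feature"]] else node["false"]
    return node["answer"]

evaluate(TREE, {"ReleasedPelletRecently": True})  # "Hotter"
```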
Decision trees can get quite large, containing thousands and thousands of tests. In this example—predicting the temperature one minute in the future—the decision tree will need to have one test for each possible temperature.
This is an example of how a representation can be inefficient for a prediction task. Trying to predict the exact temperature is much more natural for a linear model than it is for a decision tree, because the decision tree needs to grow larger for each possible temperature, while the linear model would not. You can still use decision trees for this problem, but a slightly different problem would be more natural for decision trees: classifying whether the grill will be hotter or colder in one minute (instead of trying to produce a regression of the exact temperature). This version of the decision tree is illustrated in Figure 17-2.
Figure 17-2. A decision tree for a different pellet griller task
To model more complex problems, simple decision trees are often combined into ensembles called forests with dozens of individual trees, where each tree models the problem a little bit differently (maybe by limiting which features each tree can consider), and the final answer is produced by letting all the trees vote.
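Voting can be sketched like this; the three toy trees are stand-ins for real trees trained on different feature subsets, and all names are illustrative:

```python
from collections import Counter

# Three toy "trees", each looking at different features of the context.
def tree_a(ctx): return "Hotter" if ctx["pellets"] > 0 else "Colder"
def tree_b(ctx): return "Hotter" if ctx["temperature"] < 200 else "Colder"
def tree_c(ctx): return "Hotter" if ctx["lid_closed"] else "Colder"

def forest_predict(context, trees):
    # Let every tree vote; the majority answer wins.
    votes = Counter(tree(context) for tree in trees)
    return votes.most_common(1)[0][0]

context = {"pellets": 2, "temperature": 250, "lid_closed": True}
forest_predict(context, [tree_a, tree_b, tree_c])  # "Hotter" wins 2 to 1
```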

Neural Networks

Artificial neural networks represent models in a way that is inspired by how the biological brain works (Figure 17-3). The brain is made up of cells called neurons. Each neuron gets input signals (from human senses or other neurons) and "activates"—producing an activation signal—if the combined input signal to the neuron is strong enough. When a neuron activates, it sends a signal to other neurons, and on and on, around and around in our heads, eventually controlling muscles, leading to every motion and thought and act any human has ever taken. Crazy.
Figure 17-3. The components of a neural network
An artificial neural network simulates this using artificial neurons connected to each other. Some of the artificial neurons take their input from the context. Most of the artificial neurons take their input from the output of other artificial neurons. And a few of the artificial neurons send their output out of the network as the prediction (a classification, probability, regression, or ranking). Crazy.
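A tiny forward pass through such a network might be sketched as follows; the weights are made-up numbers for illustration, not a trained model:

```python
import math

def sigmoid(x):
    # A common activation: squashes any input into the range (0, 1).
    return 1 / (1 + math.exp(-x))

# Two inputs -> two hidden neurons -> one output neuron.
HIDDEN = [([0.5, -0.6], 0.1),   # (weights, bias) for each hidden neuron
          ([-0.3, 0.8], 0.0)]
OUTPUT = ([1.2, -0.7], -0.2)

def forward(inputs):
    # Each neuron sums its weighted inputs plus a bias, then activates.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in HIDDEN]
    ws, b = OUTPUT
    return sigmoid(sum(w * h for w, h in zip(ws, hidden)) + b)

forward([1.0, 0.0])  # a single prediction between 0 and 1
```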
Compared to other types of models, artificial neural networks are hard to understand. You can’t look at the artificial neurons and their interconnections and gain any intuition about what they are doing.
But artificial neural networks have been remarkably successful at solving important tasks , including in:
  • Computer vision
  • Speech understanding
  • Language translation
  • And more…
Artificial neural networks are particularly useful for very complex problems where you have a massive amount of data available for training.

Summary

Intelligence should be represented in a way that is easy to distribute and execute safely. Intelligence should also be represented in a way that supports the intelligence creation process you intend to use.
Because of these criteria, intelligence is usually represented in data files that are loaded into the intelligent runtime and interpreted when needed, usually as lookup tables or models. However, intelligence can be implemented in code when the conditions are right.
Common types of models include linear models, decision trees, and neural networks, but there are many, many options.
Most large Intelligent Systems will use multiple representations for their intelligence, including ones that machine learning can highly optimize, and ones that humans can use to provide support to the machine learning.

For Thought…

After reading this chapter you should:
  • Understand how intelligence is usually represented and why.
  • Be able to discuss some common model types and give examples of where they are strong and weak.
You should be able to answer questions like these:
  • What are the conditions when human created intelligence has an advantage over machine learned intelligence?
  • Create a simple (10–15 node) decision-tree-based intelligence for another Intelligent System discussed in this book. Is a decision tree a good choice for the problem? If not, how could you change the problem to make it a better match for decision trees?