Chapter 17

AI-driven process change

Abstract

This chapter reviews artificial intelligence (AI) technologies and discusses how AI will increasingly be used in the automation of business processes. The chapter considers how AI is compatible with other automation software we have already learned to manage using a process change methodology.

Keywords

Artificial intelligence (AI); Knowledge-based systems; Neural network systems; Watson; AlphaGo; Natural language systems; Vision systems; Robotics; Decision management systems

The world of business is clearly going to continue to change very rapidly. New technologies will be introduced each year, new tastes will become popular with consumers, and new business models will be developed that will challenge whole industries to come up with solutions to the disruption caused. Organizations will need to continue to change their business processes to accommodate these ongoing changes. If I had to pick a single technological change that I thought would have the largest impact on process work in the next few years I would pick the widespread adoption of artificial intelligence (AI) techniques. I believe that business and the nature of human work will be slowly but profoundly altered as companies and consumers incorporate AI techniques in their daily processes. In this chapter I want to define AI and consider some of the ways in which it will drive process change.

In a sense, of course, AI is simply today’s cutting-edge computer technology. Thus, in a broad sense, all I am really saying is that organizations will continue to automate using the latest computing techniques. In earlier iterations, computer technologies took over the storage of data and most routine mathematical and bookkeeping calculations. Later, computers invaded the front office, replacing typewriters and offering automated spreadsheets for office workers. Other computer techniques have automated routine physical operations using robots, like those that assemble cars, and using software applications that have replaced most routine document processing work that humans formerly did. Computers have expanded their role from calculating to communication and now provide email and web services that enable constant, worldwide message flows and daily “meetings.” AI techniques will automate most tasks that currently require human analysis and decision-making skills and many operations that involve linguistic or fine motor skills.

Artificial Intelligence

If you read business publications you have already read articles on AI or on one of the more specific AI techniques, such as cognitive computing, process mining, machine learning, automated decision making, robotic process automation, natural language processing, speech recognition, or intelligent agents. Clearly, AI isn’t a single technology; instead, it includes a lot of different techniques that can be clustered in different ways to build different kinds of software applications. I’ll try to provide an overview of both the technology and its possible applications in this chapter, placing a special emphasis on how AI technologies will affect the work of business process analysts.

For the past 20 years I have primarily focused on business process change following the success of this book, which was first published in 2003. During the 1980s and 1990s, however, I spent most of my time writing, speaking, and consulting on AI. My consulting in the 1980s resulted from a book I had written in 1985, Expert Systems: AI for Business. That book described an earlier iteration of AI. Recently I have been impressed by the latest developments in AI and their potential to revolutionize business processes in the near future.

Artificial intelligence (AI) is a term chosen by a committee at a workshop at Dartmouth College in the summer of 1956 to describe the branch of computer science focused on building computers that showed human-like intelligence. AI researchers asked how they might get computers to see, to speak, to ask questions, to store human knowledge, to learn new things, and to understand the importance of ongoing events. They also asked how they might get computers to guide machines that could undertake manipulations that ranged from surgery and walking to assembling complex devices and driving cars. Most of the emphasis has been on getting computers to identify patterns and to respond to new or unpredictable situations—as a human does when he or she meets a new person and enters into a conversation to learn about the new person.

AI is often said to be subdivided into several branches, including knowledge representation, natural languages, and robotics. Since that first AI conference in 1956 AI has experienced three periods in which commercial groups became excited about the possibilities of using AI techniques for practical applications. The first was in the late 1950s, just after the launch of Sputnik. The Russian success stimulated the US government to become very interested in what Russian scientists were doing. The US military became excited about the possibility of using AI techniques to translate lots of Russian documents into English. After a few years of experimentation it became obvious that the then current state of language translation wasn’t up to the job, interest in AI died down, and funding dried up. This is not to say that research in computer science departments was discontinued, but only to say that there was no longer any interest in trying to develop commercial applications.

The second time people got very interested in AI was in the 1980s, when software applications called "expert" or "knowledge-based systems" seemed to promise that new software systems could be built that would capture and replicate the knowledge and analysis capabilities of human experts. This round of commercial AI activity was stimulated by a couple of applications built at Stanford that demonstrated human expertise. Dendral was a system that could infer molecular structures as a result of receiving information generated by a mass spectrometer. In effect, Dendral did something that had previously only been done by very skilled analytic chemists. (The systems that are used to analyze human genomes today are descendants of Dendral.) Mycin was a medical system that could analyze meningitis infections and prescribe treatments. Mycin did what only physicians who had specialized in meningitis diseases were normally capable of doing. Both systems proved in tests that they could perform as well as human experts at their selected tasks. On the basis of the results achieved by Dendral and Mycin, software companies were launched to create software tools designed to facilitate the development of other knowledge-based expert system applications. For several years large organizations invested in the technology and explored the uses of knowledge-based techniques.

Data, Information, and Knowledge

Words such as “data” and “information” are used in lots of different ways. To understand their meaning in a specific context you need to know how a specific speaker is using them. I try to use these words as they were often used in the mid-1980s in AI circles to make a point about where “big data” end and AI begins.

Here are my definitions:

  •  Data refer to specific items (e.g., x, y), such as names or numbers like 3.14159.
  •  Information refers to propositions (e.g., x = y, or x > y) that relate names or numbers. Thus the propositions that 3.14159 is an irrational number, or that 3.14159 is pi, are both examples of information.
  •  Knowledge refers to a rule or other statement (e.g., if x = y and n < m, then do a) that combines propositions to recommend specific actions. For example, if you want to calculate the circumference of a circle you multiply its diameter by 3.14159.
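
To make the distinction concrete, here is a minimal sketch in Python (the variable names and values are hypothetical, chosen only to mirror the pi example above): data are bare values, information is a proposition relating values, and knowledge is a rule that turns information into an action.

    # Data: bare items -- names and numbers with no stated relationships.
    pi_value = 3.14159
    diameter = 2.0

    # Information: propositions that relate items to one another.
    propositions = [
        ("3.14159", "is approximately", "pi"),
        ("3.14159", "approximates", "an irrational number"),
    ]

    # Knowledge: a rule that combines propositions to recommend an action:
    # "to get the circumference of a circle, multiply its diameter by pi."
    def circumference(d):
        return d * pi_value

    print(circumference(diameter))  # 6.28318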

Computer systems currently capture huge amounts of data. Some are captured on structured forms as discrete items, such as customer name, credit card number, items purchased, and amounts spent. In these cases some information is implied and captured as the data are entered. Thus we create information by associating the customer name with the credit card number. Most data, however, are unstructured. They are captured as textual documents (such as emails to companies, or blog entries), as verbal items (like recorded phone calls), or as visual items (such as video recordings of customers entering a building or cars arriving in a parking lot). All these data are being saved in databases. When you consider the bits and bytes involved in unstructured text documents and verbal or visual recordings it’s no wonder that captured data are growing rapidly.

In the very recent past, turning even a small part of the unstructured data being collected into information would have been very expensive. You would have needed an employee to physically scan videos of employees arriving to determine what time a given employee arrived on a given day. Recent breakthroughs in text-reading software and visual-scanning software, however, are making it possible to quickly convert huge amounts of data into information at a modest cost. Making data into information is in itself valuable. Data can be converted into information that humans can then scan looking for useful patterns. Data analytics software can do this even more rapidly and can often identify patterns that are more complex than humans would normally recognize.

What’s more exciting is the use of AI to automate certain types of actions. Knowing that employees tend to make a specific type of mistake on certain days is interesting, but knowing that taking a specific action will reduce the occurrence of the mistake is even more interesting. Knowledge, as I have already suggested, allows us to move from information to action. In the 1980s we explored the use of rule-based expert systems to deal with complex human decision making. The systems proved effective, but they also proved expensive to develop and very expensive to maintain. Today’s new generation of AI applications is based on a different approach: neural networks and deep learning algorithms. In essence, the application generalizes a pattern from a number of trial runs during which humans provide the right answers. Once the machine is trained to identify the pattern it tends to get even better at recognizing the pattern with additional practice, which it can undertake on its own. Learning is key! Because today’s AI applications are trained the way we train people, with examples and by reinforcing correct responses, they can be developed more quickly and they can continuously improve, eliminating the need for expensive maintenance cycles.

The essence of it is that we have huge amounts of data, and we now have tools that let us turn the data into information and still others that let us automate the examination of information and generate the knowledge needed to take appropriate decisions.

This is only one way to use these three words, but it’s how a lot of AI people commonly use them, and I find it a very useful way to think about all this.

Several interesting knowledge-based systems were developed in the 1980s and some proved quite valuable. Ultimately, however, most of the knowledge-based systems that were developed in the 1980s proved too difficult to update and maintain. Human experts in cutting-edge fields are constantly learning and modifying their ideas and practices. If expert software systems were to function as human experts they needed to change as rapidly as their human counterparts. The computers available in the 1980s—remember that the IBM PC was first introduced in 1981—simply weren’t powerful enough or fast enough to run most knowledge-based systems. In addition, the effort required to update expert systems proved to be too extensive to be practical. By the mid-1990s the interest in knowledge-based systems waned.

Interest in knowledge-based systems, however, did have an important consequence. It introduced commercial software people to a wide variety of new ideas and techniques ranging from object-oriented techniques, incremental development methodologies, and graphical user interfaces to the use of rules and various logic-based approaches to application design. These techniques flourished even while basic AI techniques receded into the background.

In the past few years AI has experienced another round of commercial interest, led by major successes in game-playing applications developed by AI groups that had been experimenting with the latest AI techniques.

IBM’s Watson Plays Jeopardy!

In the 1990s IBM created Deep Blue, an AI application specifically designed to play chess. It was the latest in a series of chess-playing programs that IBM developed, and in 1997, during its second challenge match with Garry Kasparov, the reigning world chess champion, Deep Blue won the match. (Deep Blue won two games, Kasparov won one, and three games were drawn.) Those who studied the software architecture of Deep Blue know that it depended on brute force, a term computer people use to refer to the fact that the system relied on its ability to search and evaluate millions of possibilities in a few minutes more than on its ability to reason. Specifically, Deep Blue used an approach that looked forward several moves for each reasonable “next move” and then chose the move that would yield the highest number of points. The fact that Deep Blue defeated a human grandmaster was impressive, but it didn’t immediately suggest any other applications, since the application was highly specialized to evaluate a chess board and select the next best chess move.
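
The brute-force lookahead idea can be sketched in a few lines of Python. This is not Deep Blue’s actual algorithm (which combined specialized hardware with many search refinements); it is a generic fixed-depth minimax search in which the hypothetical helpers legal_moves, apply_move, and evaluate stand in for move generation, move application, and a numeric scoring of a position.

    def best_move(board, depth, legal_moves, apply_move, evaluate):
        """Pick the move with the best score after looking `depth` plies ahead."""

        def search(b, d, maximizing):
            moves = legal_moves(b)
            if d == 0 or not moves:
                return evaluate(b)          # score the position when we stop looking ahead
            scores = [search(apply_move(b, m), d - 1, not maximizing) for m in moves]
            return max(scores) if maximizing else min(scores)

        # Try each of our moves, assume the opponent replies as well as possible,
        # and keep the move whose resulting position scores highest for us.
        return max(legal_moves(board),
                   key=lambda m: search(apply_move(board, m), depth - 1, False))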

As the new millennium began IBM was looking around for another challenging problem, and wanted to find one with more applications than chess. IBM also wanted to explore new techniques being developed in AI labs. In 2004 IBM began to consider developing an application that could play Jeopardy!. Jeopardy! is a very popular TV game that draws large viewing audiences and offers some real challenges for a computer. In Jeopardy! contestants are given “answers” and asked to come up with the “question” that would lead to such an answer. The “questions” and “answers” used on Jeopardy! are drawn from a broad base of general knowledge on topics such as history, literature, science, politics, geography, film, art, music, and pop culture. Moreover, the game format requires that the contestants be able to consider the “answers” provided, which are often subtle, ironic, or contain riddles, and generate responses within about 3 seconds.

In essence, a Jeopardy!-playing application posed two different problems: understanding natural language so as to be able to identify the right question and then searching a huge database of general information for an answer that fits the question. Searching a huge database quickly was a more or less physical problem, but “hearing” and then “understanding” spoken English, and finally determining which of several possible answers was the right match for the question being asked, were serious cognitive problems.

In 2007 IBM established a team of 15 people and gave them 5 years to solve the problem. The team in turn recruited a large staff of consultants from leading university AI labs and began work. The first version was ready in 2008, and in February of 2011 the software application Watson proved it could beat two of the best-known former Jeopardy! winners, Brad Rutter and Ken Jennings, in a widely watched TV match.

The key to Watson’s analytic functionality is DeepQA (Deep Question Answering), a massively parallel probabilistic architecture that uses and combines more than 100 different techniques (a mixture of knowledge-based and neural net techniques) to analyze natural language, identify sources, find and generate hypotheses, and then evaluate evidence and merge and rank hypotheses. In essence, DeepQA can perform thousands of simultaneous tasks in seconds to provide answers to questions. Given a specific query, Watson might decompose it and seek answers by activating hundreds or thousands of threads running in parallel.
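
The “generate many hypotheses, score the evidence in parallel, and rank” idea can be roughly sketched as follows. This is emphatically not IBM’s DeepQA code: generate_candidates and the scorers are hypothetical placeholders, the evidence scores are merged naively, and the real architecture combined more than 100 scoring techniques.

    from concurrent.futures import ThreadPoolExecutor

    def answer(question, generate_candidates, evidence_scorers):
        """Generate candidate answers, score each against many evidence sources
        in parallel, and return the candidates ranked by combined confidence."""
        candidates = generate_candidates(question)

        def combined_score(candidate):
            # Run every evidence scorer against this candidate in parallel threads.
            with ThreadPoolExecutor() as pool:
                scores = list(pool.map(lambda scorer: scorer(question, candidate),
                                       evidence_scorers))
            return sum(scores) / len(scores)    # naive merge of the evidence scores

        return sorted(candidates, key=combined_score, reverse=True)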

Watson maintained all its data in memory to help provide the speed it needed for Jeopardy!: it had 16 terabytes of RAM. It used 90 clustered IBM Power 750 servers, each with 32 cores running at 3.55 GHz. The entire system ran on Linux and operated at over 80 teraflops (i.e., 80 trillion floating-point operations per second).

To sum up: IBM demonstrated that AI-based natural language analysis and generation had reached the point where a system like Watson could understand open-ended questions and respond in real time. Watson examined Jeopardy! “answers,” defined what information was needed, accessed vast databases to find the needed information, and then generated an English response in under 3 seconds. It did it faster and better than two former human Jeopardy! winners and easily won the match.

Unlike Deep Blue, which was more or less abandoned once it had shown it could win chess matches, Watson is a more generic type of application. It includes elements that allow it to listen to and respond in English. Moreover, it is capable of examining a huge database to come up with responses to questions. Today, the latest version of Watson functions as a general purpose AI tool (some would prefer to call it an AI platform) and is being used by hundreds of developers to create new AI applications.

Fukoku Mutual Life Insurance Company in Tokyo (Japan), for example, worked with IBM’s Watson to develop an application to calculate payments for medical treatments. The system considers hospital stays, medical histories, and surgical procedures. If necessary the application has the ability to “read” unstructured text notes, and “scan” medical certificates and other photographic or visual documents to gather needed data. Development of the application cost 200 million yen. It is estimated that it will cost about 15 million yen a year to maintain. It will displace approximately 34 employees, saving the company about 140 million yen each year, and thus it will pay for itself in 2 years. The new business process using the Watson application will drastically reduce the time required to generate payments, and the company estimates that the new approach will increase its productivity by 30%.

Google’s AlphaGo

While IBM was working on its Jeopardy!-playing application, Google acquired its own AI group and that group decided to illustrate the power of recent AI developments with its own game-playing system. Go is an ancient board game that is played on a 19 × 19 matrix. The players alternate placing black or white “stones” on the points created by the intersecting lines. The goal of the game is to end up controlling the most space on the board. Play is defined by a very precise set of rules.

When IBM’s Deep Blue beat chess grandmaster Garry Kasparov in 1997, AI experts immediately began to think about how they could build a computer that could play and defeat a human Go player, since Go was the only game of strategy that everyone acknowledged was more difficult than chess. The difference can be illustrated by noting that the first move of a chess game offers 20 possibilities, whereas the first move in a Go game offers the first player a chance of placing a stone on any one of 361 intersections (Figure 17.1). The second player then responds by placing a stone on any one of the 360 remaining positions. A typical chess game lasts around 80 moves, while Go games can last for 150 turns. Both games have explicit moves and rules that theoretically would allow an analyst to create a branching diagram to explore all logical possibilities. In both cases, however, the combinations are so vast that exhaustive logical analysis is impossible. The number of possible game states in either game is greater than the number of atoms in the universe. (The search space for chess is generally said to be 10^47, whereas the search space for Go is generally held to be 10^170.)

Figure 17.1
Figure 17.1 Two people playing Go.

In October 2015 AlphaGo, a program developed by DeepMind (a subsidiary of Google), defeated Fan Hui, the European Go champion, five times in a five-game Go tournament. In March 2016 an improved version of AlphaGo played a tournament with the leading Go master in the world, Lee Sedol, in Seoul. AlphaGo won four games in a five-game tournament.

So, how does AlphaGo work? The first thing to say is that the core of AlphaGo was not developed as a software package to play Go. The basic neural net architecture used in AlphaGo was initially developed to play Atari software games. The Atari-playing program was designed to “look” at computer screens (matrices of pixels) and respond to them. When DeepMind subsequently decided to tackle the Go-playing problem, it simply re-purposed the Atari software package. The input that AlphaGo uses is a detailed 19 × 19 matrix of a Go board with all the stones that have been placed on it. The key point, however, is that the underlying AlphaGo platform is based on a generic software package designed to learn to play games; it’s not a specially developed Go-playing program.

AlphaGo largely depends on two deep neural nets. A neural network is an AI approach that depends on using various algorithms to analyze statistical patterns and determine which patterns are most likely to lead to a desired result.

As already noted, the basic unit being evaluated by AlphaGo is the entire Go board. Input for the neural network was a graphic representation of the entire 19 × 19 Go board with all of the black and white stones in place. In effect, AlphaGo “looks” at the actual board and state of play, and then uses that complete pattern as one unit. Winning games are boards with hundreds of stones in place. The unit that preceded the winning board was a board with all the final stones, save one, and so forth. A few years ago no computer would have been able to handle the amount of data that AlphaGo was manipulating to “consider” board states. (Much of IBM’s Watson’s usefulness is its ability to ask questions and provide answers in human language. This natural language facility isn’t really a part of the core ‘thought processes’ going on in Watson, but it adds a huge amount of utility to the overall application. In a similar way, the ability of AlphaGo to use images of actual Go boards with their pieces in place adds an immense amount of utility to AlphaGo when it’s presented as a Go-playing application.)

Note also that AlphaGo examined hundreds of thousands of Go games as it learned to identify likely next moves or board states that lead to a win. A few decades ago it would have been impossible to obtain detailed examples of good Go games. The games played in major tourneys have always been recorded, but most Go games were not documented. All that changed with the invention of the Internet and the Web. Today many Go players play with Go software in the cloud, and their moves are automatically captured. Similarly, many players exchange moves online, and many sites document games. Just as business and government organizations now have huge databases that they can mine for patterns, today’s Go applications are able to draw on huge databases of Go games, and the team that developed AlphaGo was able to draw on these databases when they initially trained AlphaGo using actual examples (i.e., supervised learning).

One key to understanding AlphaGo, and other deep neural network–based applications, is to understand the role of reinforcement learning. When we developed expert systems in the late 1980s and a system made a prediction that a human expert judged incorrect, the developers and the human expert spent days or even weeks poring over the hundreds of rules in the system to see where it went wrong. Then rules were changed and tests were run to see if specific rule changes would solve the problem. Making even a small change in a large expert system was a very labor-intensive and time-consuming job. AlphaGo, once it understood what a win meant, was able to play against a copy of itself and learn from every game it won. At the speed AlphaGo works, it can play a complete game with a copy of itself in a matter of seconds.

As already mentioned, AlphaGo defeated the leading European Go master in October 2015. In March 2016 it played the world Go champion. Predictably, the world Go champion studied AlphaGo’s October games to learn how AlphaGo played. Unfortunately for him, AlphaGo had played millions of additional games against a version of itself since October, and had significantly increased its ability to judge board states that lead to victory. Unlike the expert system development teams that were forced to figure out how their systems failed and then make specific improvements, the AlphaGo team simply put AlphaGo in learning mode and set it to playing games against a version of itself. Each time AlphaGo won it adjusted the connection weights of its network to develop better approximations of the patterns that lead to victory. (Every so often the version of AlphaGo that it was playing against would be updated so it was as strong as the winning version of AlphaGo. That would make subsequent games more challenging for AlphaGo and make the progress even more rapid.) AlphaGo is capable of playing a million Go games a day with itself when in reinforcement learning mode.
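
The self-play loop just described can be sketched at a very high level in Python. This is not DeepMind’s code: play_game and update_weights are hypothetical placeholders, and the real system trained separate policy and value networks with far more sophisticated reinforcement learning algorithms. The sketch only shows the shape of the loop: play a frozen copy of yourself, reinforce the weights behind winning moves, and periodically refresh the opponent.

    import copy

    def self_play_training(policy, play_game, update_weights,
                           games=1_000_000, sync_every=1000):
        """Improve `policy` by repeatedly playing it against a frozen copy of itself.

        `play_game(learner, opponent)` is assumed to return (game_record, winner);
        `update_weights(policy, game_record, winner)` is assumed to nudge the
        network's connection weights toward the moves that led to a win."""
        opponent = copy.deepcopy(policy)                  # frozen copy to play against
        for i in range(games):
            game_record, winner = play_game(policy, opponent)
            update_weights(policy, game_record, winner)   # reinforce winning patterns
            if (i + 1) % sync_every == 0:
                opponent = copy.deepcopy(policy)          # keep the opponent as strong as the learner
        return policy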

As impressive as AlphaGo’s October victory over Fan Hui was, it paled by comparison with AlphaGo’s win over the Go champion Lee Sedol in March of 2016. Fan Hui, the European Go champion, while a very good player, was only ranked a 2-dan professional (he was ranked 633rd best professional Go player in the world), while Lee was ranked a 9-dan professional and widely considered the strongest active player in the world. Experts, after examining the games that AlphaGo played against Fan Hui, were confident that Lee Sedol could easily defeat AlphaGo. (They informally ranked AlphaGo a 5-dan player.) In fact, when the match with Lee Sedol took place, 4 months after the match with Fan Hui, everyone was amazed at how much better AlphaGo was. What the professional Go players failed to realize was that in the course of 4 months AlphaGo had played millions of games with itself, constantly improving its play. It was as if a human expert had managed to accumulate several additional lifetimes of experience between the October and March matches. Lee Sedol, after he lost the second game, said that he was in shock and impressed that AlphaGo had played a near perfect game.

AlphaGo was designed to maximize the probability that it would win the game. Thus, if AlphaGo has to choose between a scenario where it will win by 20 points with an 80% probability and another where it will win by 2 points with a 99% probability, it will choose the second. This explains why AlphaGo combines very aggressive middlegame play with rather conservative play during the endgame. It may also explain the difficulties that Lee Sedol seemed to have when he reached the endgame and found many of the moves he wanted to make were already precluded.
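
A small hypothetical example makes the decision rule concrete: candidate moves are ranked by estimated probability of winning, not by expected margin of victory.

    # Hypothetical candidate moves: (label, estimated win probability, expected margin in points)
    candidates = [("aggressive line", 0.80, 20),
                  ("safe line", 0.99, 2)]

    # AlphaGo-style choice: maximize the probability of winning and ignore the margin.
    best = max(candidates, key=lambda c: c[1])
    print(best[0])   # -> "safe line": a 99% chance to win by 2 beats an 80% chance to win by 20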

To beat Lee Sedol, AlphaGo used 1920 processors and a further 280 GPUs—specialized chips capable of performing simple calculations in staggering quantities.

In spring 2017 AlphaGo was at it again, playing the Chinese grandmaster Ke Jie, and once again winning. Following that victory the AlphaGo team announced that their program would “retire” and that Google would focus on more pressing human problems. Their work on helping clinicians diagnose patient problems faster, for example, is getting a lot of attention.

What was impressive about these last games was not the wins, but the buzz around the innovations that AlphaGo had introduced into Go play. We are all becoming accustomed to the idea that AI systems can acquire vast amounts of knowledge and use that knowledge to solve problems. Many people, however, still imagine that the computer is doing something like a rapid search of a dictionary, looking up information as it is needed. In fact, AlphaGo learned to play Go by studying the games of human players. Then it improved its skills by playing millions of games against itself. In the process AlphaGo developed new insights into what worked and what didn’t work. AlphaGo has now begun to develop approaches (sequences of moves) that it uses over and over again in similar situations. Students of Go have noticed these characteristic sequences of moves, given them names, and are now beginning to study and copy them.

One of the sequences is being referred to as the “early 3-3 invasion.” (Roughly, this refers to a way to capture a corner of the board by playing on the point that is three spaces in from the two sides of the corner.) Corner play has been extensively studied by Go masters and, just as openings have been studied and catalogued in chess, experts tend to agree on what corner play works well and what is to be avoided. Thus grandmasters were shocked when AlphaGo introduced a new approach to corner play (a slight variation on an approach that was universally thought to be ineffective) and proceeded to use it several times, proving that it was powerful and useful. Indeed, following AlphaGo’s latest round of games Go masters are carefully studying a number of different, new move sequences that AlphaGo has introduced. More impressively, in games played shortly after his loss to AlphaGo, Ke Jie started using the early 3-3 invasion sequence himself.

All this may seem like trivial stuff, but the bottom line is that AlphaGo introduced serious innovations in its Go play. It isn’t just doing what human grandmasters have been doing; it’s going beyond them and introducing new ways of playing Go. In essence, AlphaGo is an innovative player! What this means for the rest of us is really important. It means that when Google develops a patient-diagnostic assistant, once that assistant has studied the data on thousands or millions of patients, it will begin to suggest insights that go beyond those currently achieved by human doctors.

The deep learning neural network technology that underlies today’s newest AI systems is considerably more powerful than the kinds of AI technologies we have used in the recent past. It can learn and it can generalize, try variations, and identify the variations that are even more powerful than those it was already using. These systems promise us not only automation of performance, but automation of innovation. This is both exciting and challenging. Organizations that move quickly and introduce these systems are going to be well placed to gain insights that will give them serious competitive advantages over their more staid competitors.

AI Technologies

Without being very explicit about it, in discussing these AI techniques we have already considered two broadly different approaches. One approach uses knowledge and logic in explicit ways, which means its reasoning can be checked. The other approach does not depend on explicit knowledge, but relies instead on the statistical analysis of patterns. The first set of techniques is usually termed knowledge-based or logic-based systems. The second set is usually referred to as machine learning or neural network systems. Both approaches are being used in today’s AI applications, although neural network systems predominate. We’ll consider each set of techniques in a bit more detail.

Knowledge-Based Approaches

Knowledge-based systems represent knowledge in explicit ways and use the knowledge so represented to reason about problems. Different knowledge-based systems use different kinds of logical inferencing techniques to manipulate the knowledge, reason about it, and draw conclusions.

To better understand the problem it’s important to have a basic idea of how a knowledge-based system was architected and created. A knowledge-based system traditionally consisted of three main elements: (1) a knowledge base, (2) an inference engine, and (3) working memory. In essence, early knowledge bases were composed of rules, each independent of all the others. Thus a single Mycin rule might be something like this:

  • If    the site of the culture is blood, and
  •          the morphology of the organism is rod, and
  •          the gram stain of the organism is gramneg, and
  •          the patient is a compromised-host,
  • Then there is suggestive evidence (0.6) that the identity of the organism is Pseudomonas aeruginosa.

An inference engine is an algorithm that responds to input by examining all the rules in the knowledge base to see if it can arrive at any conclusions. In essence, the inference engine relies on the principles of logic (e.g., if A = B, and B = C, then A = C). If it can reach any conclusions it stores them in working memory. Then the algorithm begins again, treating the information in working memory as new input, and checks to see if it can reach any further conclusions. At various points the application fires rules that ask the user questions and uses the answers, which it places in working memory, to drive still more analysis. To make things more complex, the rules are associated with confidence factors (e.g., 0.6) that allow the system to reach conclusions in which it is more or less confident. (Keep in mind that most of these rules were derived from human experts, and a lack of complete confidence is very typical of the knowledge used by many human experts.)

In the sample rule given above, if the inference engine sought to evaluate the rule it would consider one If clause at a time. It would begin by seeking to determine if the site of the culture was blood. If it could find this information in working memory it would assume it as a fact and proceed. If it didn’t find this fact it might ask the physician what the site of the culture was, and so forth. Without going into more detail it’s possible to see that an expert system depended on explicit statements of knowledge in the form of rules. These rules are complex and require careful testing. A large expert system might rely on a knowledge base with hundreds or even thousands of rules.
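
A toy forward-chaining engine in Python illustrates the three elements: a knowledge base of rules, a working memory of facts with confidence values, and an inference engine that keeps firing rules until nothing new can be concluded. The rule below is simplified from the Mycin example above, and the confidence arithmetic is a rough sketch rather than Mycin’s actual certainty-factor calculus.

    # Knowledge base: each rule is (required facts, concluded fact, confidence factor).
    rules = [
        (["site is blood", "morphology is rod", "gram stain is gramneg",
          "patient is compromised-host"],
         "organism is Pseudomonas aeruginosa", 0.6),
    ]

    def forward_chain(rules, working_memory):
        """Fire rules repeatedly until no new conclusions can be added.

        `working_memory` maps facts to confidence values (1.0 = certain)."""
        changed = True
        while changed:
            changed = False
            for premises, conclusion, cf in rules:
                if conclusion not in working_memory and all(p in working_memory for p in premises):
                    # Confidence of the conclusion: rule CF scaled by the weakest premise.
                    working_memory[conclusion] = cf * min(working_memory[p] for p in premises)
                    changed = True
        return working_memory

    facts = {"site is blood": 1.0, "morphology is rod": 1.0,
             "gram stain is gramneg": 1.0, "patient is compromised-host": 1.0}
    print(forward_chain(rules, facts))   # adds the Pseudomonas conclusion with confidence 0.6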

To build an expert system a developer needed to sit down with a human expert and work with the expert to elicit the rules. The expert and the developer would consider cases, examine scenarios, and systematically develop rules that an expert might use to analyze a case and prescribe one or more responses. Together they would estimate the confidence that each rule should express, and then they would test the resulting rule base against dozens of cases to refine it. The development of a knowledge base for a significant expert system was a very time-consuming and expensive process—just the opportunity costs of having a world-class human expert focus on the development of software, rather than on using his or her expertise to focus on problems the company actually faced, cost a great deal. The development of an expert system could take months or even years.

Unfortunately, once built, tested, and found to work, most expert systems began to degrade as knowledge of the particular domain continued to evolve and the knowledge base of rules became dated. Human experts are constantly attending conferences, discussing new cases and new technologies, and reading books and journals to stay up to date, while the new expert system was forced to wait until new rules could be added before it could use the latest knowledge. By the late 1980s most companies began to abandon the quest for expert systems as they found that maintaining the systems they had built was proving too expensive to justify the effort. Even more to the point, there weren’t that many human experts waiting to be turned into expert systems.

In essence, the rule-based systems developed in the 1980s were too fragile, limited, and too slow. The existing technology could capture the expertise, but it couldn’t automatically learn new things or update its knowledge. By the end of the 1980s most AI researchers in universities had stopped focusing on developing rule-based applications and had begun to explore new approaches that seemed to offer better chances for learning and more flexible ways of storing knowledge.

As an aside, when we discussed AI in the 1980s we often said that academic AI research was focused on a variety of techniques, including not only knowledge use but also natural language applications and robotics. At the time, however, there were no commercial examples of natural language or intelligent robotic applications.

As a second aside, as a consultant who specialized in teaching IT people about AI in the 1980s I can report on the profound change that the brief focus on AI in the 1980s produced. Commercial computing had begun in the 1960s when most large companies began to acquire mainframes to help with their data storage, bookkeeping, inventory, and payroll. The rapid growth of computing meant that many companies hired and trained programmers to use specific software languages (e.g., COBOL) and to develop specific types of applications (e.g., bank payroll systems). There weren’t any computer science courses in most colleges in the 1960s. Many of these people thought of computing rather narrowly. When they first began to learn about expert systems, many learned more about the underlying theory of computer science than they ever had before. Many were intrigued with the idea that computer systems didn’t need to follow a specific set of steps, but could use an inference engine to interrogate users and modify their activities as their working knowledge changed. Others found confidence techniques fascinating. Mycin, just like a human expert, usually didn’t decide that a patient had a specific type of infection. Instead, it concluded that the patient might have any of three or four different kinds of meningitis, with different degrees of confidence. Since some infections can quickly prove fatal, Mycin often recommended more than one drug to treat three or four different possibilities. Programmers had come to think that software systems always generated a correct answer and had to adjust to the fact that computers could also provide multiple estimates or guesses. The whole approach used to develop knowledge systems ended up fascinating IT developers. Instead of laying out a fixed path, knowledge engineers acquired rules one at a time, put them in a system’s knowledge base, and then tested the system to see if it could solve a problem. As they added knowledge the system became smarter. The idea of developing a system incrementally, and testing and revising it to improve the system, had a profound impact on IT development practices in the 1990s. Expert systems development was the ultimate example of Agile software development. Similarly, although more technical, the AI systems of the 1980s introduced software developers to the ideas behind object-oriented programming that led to extensive changes in how software is engineered today.

Although the interest in commercial AI faded in the late 1980s, and disappeared by the mid-1990s, the people who had done the exploration remained and went on to other jobs. (Most advanced computer games and a lot of sophisticated Internet and web techniques are applications of AI techniques.) The specific technologies that had been explored and commercialized in the course of that decade also remained. The companies that had developed software tools for expert systems development, for example, looked for other tasks they could assist with. Many of the expert systems–building tools were repurposed to assist business people who were focused on capturing business rules. Instead of trying to capture the rules used by human experts, rule-based tool vendors sought to position their tools to capture modest sets of rules that were used to describe business policies. These applications proved valuable in efforts to help organizations comply with laws and policy requirements of various kinds.

Other commercial developers saw an opportunity to help their organizations by providing tools to help extract patterns and advice from the large databases that organizations began to struggle with in the late 1990s. AI commercial activity in the 1980s provided most IT people with their first taste of AI techniques, and provided a clear understanding of some of the practical problems that AI systems would have to overcome if they were to prove commercially viable.

Neural Networks

In the early 1980s, when expert systems were all the rage, most of the attention in the AI world was focused on knowledge-based systems. There had been an early period of interest in neural network systems, but funding for the neural network approach had largely dried up when networks had failed to achieve results in tests of natural language processing. In the 1990s, with more powerful computers available, neural networks became the focus of most AI research.

Connectionist machines, adaptive systems, self-organizing systems, artificial neural systems, and statistically based mapping systems are all terms occasionally used to describe what we will refer to here as neural networks. Neural networks are said to be based on biological neural networks, but in fact they only resemble a biological network in a rather limited way.

Neural networks are systems of nodes, “neurons,” or processing elements that are arranged in multiple layers. The nodes are connected by links that at any given moment either pass information or don’t pass information from one node to the next. As information is passed certain connections become stronger. As specific outcomes or results are reinforced, pathways become stronger and the network comes to identify specific pathways with particular outcomes. This process is termed training. As the network is exposed to more data and reinforced it continues to modify its outcomes and is said to learn.

Figure 17.2 shows three process elements stacked to form a parallel structure or layer. Note that inputs may be distributed among processing elements and that each processing element produces at least one output.

Figure 17.2
Figure 17.2 Multiple processing elements forming a single layer.

Figure 17.3 pictures a multilayer network composed of 14 processing elements arranged in an input layer, two hidden layers, and an output layer. Inputs are shown feeding into processing elements in the first layer, each of which is connected to processing elements in the next layer. The final layer is called the output layer. Hidden layers are so termed because their outputs are internal to the network. This simple network has weighted connections running from the input-processing elements to the processing elements in the first hidden layer, from the first hidden layer to the second hidden layer, and from the second hidden layer to the output layer. In other words, each output from a processing element in one layer becomes the input for processing elements in the subsequent layer, and so on. Theoretically, any number of processing elements may be arranged into any number of layers. The limitation is simply the actual computing power available and the functionality of the net.

Figure 17.3
Figure 17.3 Multilayer neural network consisting of many interconnected processing elements.

We’ve already gone into more detail than any business manager will need to know. One way to think of neural networks is as a collection of algorithms, each particularly suited (but not restricted) to a different application domain. In neural network terminology the terms algorithm and network are often used interchangeably. The term network can be used in a generic way (i.e., not referring to any specific type of network) or it can be used to refer to a specific algorithm (or learning rule), such as a back propagation network.

Any book on neural networks at this point would begin considering various types of neural network algorithms and what each was best suited to analyze. For our purposes, suffice it to say that there are a lot of algorithms, that they involve very technical considerations, and that knowledge of these algorithms is growing very rapidly. Most business managers will never need to understand the specific algorithms, but most large companies will want an IT employee, or a consultant, who does understand them and can help figure out which ones are appropriate to the problems your company faces.

Neural networks are said to be intelligent because of their ability to “learn” in response to input data and to adjust the connection weights of multiple nodes throughout the network.
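
A minimal forward pass through a small multilayer network shows the mechanics the figures describe: inputs flow through weighted connections and layers of processing elements to produce outputs, and training consists of adjusting the weights so the outputs match known answers. This is an illustrative sketch with made-up layer sizes and random weights, not production code, and the training step itself (typically backpropagation) is omitted.

    import numpy as np

    rng = np.random.default_rng(0)

    # A tiny network: 4 inputs -> two hidden layers of 5 nodes each -> 1 output.
    layer_sizes = [4, 5, 5, 1]
    weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

    def forward(x, weights):
        """Pass inputs through each layer: a weighted sum followed by a nonlinearity."""
        for w in weights:
            x = np.tanh(x @ w)       # each layer's outputs become the next layer's inputs
        return x

    x = rng.normal(size=(1, 4))      # one example with 4 input values
    print(forward(x, weights))       # the untrained network's output

    # "Training" means nudging the weights so outputs match known answers;
    # reinforcement of correct outcomes strengthens the relevant connections.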

Combined Approaches

Today there is a growing emphasis on combining the two approaches. Neural networks provide an excellent way to develop a powerful system. Moreover, the system can learn and become more powerful. Unfortunately, if someone asks how the system is making a decision the developer can only fall back on an algorithm and statistical data, which isn’t very satisfactory. Rule- or knowledge-based systems are hard to develop and maintain, but they offer explicit statements of the knowledge used and the logical path followed. Increasingly, the trick is to do the visual, auditory, and other nonlogical processing with neural networks, and to supplement those systems with small knowledge-based systems that can provide explicit explanations for just those tasks or subtasks where humans are likely to want an explanation.

Let’s consider developing AI systems to manage a self-driving auto. You can use explicit algorithms to compare the auto’s GPS location and the coordinates of the destination. Then you use a mapping system to plot a course. You use neural network–based visual systems to actually “look” at the environment as the car proceeds along its route. You will also need a system that combines rules and neural networks if you are going to include a natural language system to talk with the rider. And you will probably use a rule-based system to apply legal rules to assure that the car stops at stop signs and follows other rules of the road. Increasingly, AI development involves knowing when to combine various AI techniques.
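
As a sketch of this kind of combination, the fragment below imagines a neural network perception module proposing actions and a small rule layer vetoing anything that violates an explicit traffic rule. Both perception_net and the rule predicates are hypothetical placeholders; a real autonomous-driving stack is vastly more complex, and the scene is assumed here to be a simple dictionary of perception outputs.

    def choose_action(scene, perception_net, traffic_rules):
        """Let a neural net propose scored actions, then apply explicit rules as a filter.

        `perception_net(scene)` is assumed to return [(action, score), ...];
        `traffic_rules` is a list of predicates that return False for illegal actions."""
        candidates = perception_net(scene)                 # e.g., [("proceed", 0.9), ("stop", 0.6)]
        legal = [(action, score) for action, score in candidates
                 if all(rule(scene, action) for rule in traffic_rules)]
        if not legal:
            return "stop"                                  # safe default if every proposal is vetoed
        return max(legal, key=lambda c: c[1])[0]

    # Example of an explicit, inspectable rule: never drive through a detected stop sign.
    def stop_sign_rule(scene, action):
        return not (scene.get("stop_sign_detected") and action == "proceed")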

Developing and Deploying AI-Based Processes

A more technical book on AI at this point might turn to a discussion of how one creates a neural network, or something that incorporates multiple neural network systems, like a natural language–translating system. This book, however, is not for software developers but for business process analysts. Our focus is on how to improve business processes and not on the details of how to implement software systems to support process improvement. Figure 17.4 provides an overview of our generic process methodology with spaces for strategic and architectural changes, the set of activities involved in business process redesign, and an overview of an IT methodology for neural network development. Our focus is on how AI considerations will change activities at the process level. We assume that, once a process team has designed a new process that calls for an AI application, requirements will be provided to an IT group that will then undertake the actual creation, training, and subsequent support of a neural network as it is used in the redesigned business process.

Figure 17.4
Figure 17.4 Redesign to incorporate AI into an existing process.

So what sort of things will a process analyst have to do to take AI into consideration as he or she undertakes a new business process analysis? At the understand phase of a process project, practitioners will want to consider AI options as they undertake a scope analysis. They will want to consider if AI techniques can be used, if their use will likely solve the problems they face, and whether an AI solution can be used as part of a cost-effective solution.

During the next few years, as organizations continue to learn about the practical uses of AI and to develop realistic estimates of the problems and costs of employing AI techniques, such estimates will necessarily be less accurate than those for more mature technologies, but this should change over the course of the decade as experimentation is undertaken and more is learned. In the meantime large organizations will probably want to develop teams of IT experts who follow the AI market and who can serve as members of process redesign teams to provide advice and estimates as needed.

When a process team first undertakes a scope analysis they focus on problems with the existing process. Having identified the problems associated with a given process and come to some agreement as to the urgency associated with specific problems, the analyst next considers options. Appendix 1 provides a checklist of some of the problems that are common to business processes. The challenge, of course, is to imagine a solution for a problem that will be effective and can be implemented at a reasonable cost. The challenge facing process analysts as they seek to integrate AI techniques into their processes is to imagine where AI techniques can be effectively used in solving problems. Figure 17.5 provides a high-level overview of one way of thinking about the AI techniques currently available. Obviously, the techniques can be combined in various ways. Keeping up to speed by reading magazines such as BusinessWeek or Fortune over the next few years will provide lots of concrete examples of how AI techniques are being used. At the same time lots of new companies will be formed to promote specific types of solutions for either generic problems (e.g., phone answering) or for specific industries.

Figure 17.5
Figure 17.5 Some of the leading AI techniques in use today. (Most involve the use of one or more neural nets.)

It’s always tempting, of course, when you hear about a specific technological solution to think about how you might field a similar solution. In the long run, however, it’s better not to be led by specific technological solutions as such, but by the business problems you face, and to treat the technological examples you learn about as a stepping stone to conceiving new solutions to the specific problems you face.

It’s worth taking a moment to consider whether you should aim to replace people or support them. A typical manager does a lot of different jobs, switching over the course of a day from analysis to reporting to disciplining to promoting. One could look at what the manager does and think that an AI application could do everything the manager does. A more detailed analysis would probably suggest that it would cost a great deal and take a long time to create a set of applications that would do everything that the manager does. It’s usually better to dig a little deeper and identify the specific tasks that the manager does and then target one or a few specific tasks. For example, a manager may spend part of each day reading reports to stay current on new developments. An AI system could probably scan more reports in a fraction of the time and provide the manager with a summary. Or, a manager might make decisions after reviewing lots of data. An AI application could probably review the data and indicate the optimal solution, leaving the manager to actually implement the solution. We are not suggesting that organizations always avoid trying to replace managers, but rather that they do it incrementally. In most cases organizations will want to use AI applications to supplement the work being done by decision makers and then gradually expand the applications.

Considering jobs more generically, it’s useful to think of all human jobs as composed of three types of skills: (1) physical or motor skills, (2) cognitive or knowledge skills, and (3) affective or interpersonal skills. To date, computer systems have proven best at duplicating physical or motor skills. AI systems will extend the reach of computers to many cognitive or knowledge skills. They may or may not ever be very good at affective or interpersonal skills.

Figure 17.6 pictures a scope diagram showing the various types of interactions that processes are typically engaged in and suggesting some areas where one might look for opportunities to use AI to support or supplement existing approaches that are generating problems.

Figure 17.6
Figure 17.6 Scope diagram with some notes on where AI techniques might be useful if there are problems.

To provide an example, let’s consider a case in which customers complain about the problems they have making phone reservations. Let’s assume Cars-R-Us currently relies on human agents located in another country. Sometimes their accents get in the way of clear, easy communication. Customers have trouble understanding the phone personnel and, as a result, don’t always get the precise reservations they wish. As an additional problem, the company sometimes has to increase the number of people answering phones to deal with high volume and at other times has too many people waiting for calls.

The process team decides this is an opportunity to try a natural language system. It can provide a higher quality voice that will use very clear English. It will also cross-check reservations as they are made to assure that all details are covered. In addition, more “agents” can be brought online as needed. This approach has already been tried by other companies and seems to work well. In addition, Cars-R-Us has thousands of calls recorded and so has the data needed to train a “car reservation agent” application.

The issues surfaced in a scope diagram that AI might address include data problems, analysis failures, detail failures, bottlenecks where failure is a result of not having enough people, or a need to rapidly increase or decrease the number of people available for the task.

The Analysis and Redesign Phases of a Project

As process analysts move from the understanding phase of a redesign effort and begin to carry out in-depth analysis, AI will generally function like any other technology that you use to automate a process. Specific business activities will usually remain, but will switch from being done by humans to being done by AI systems. In the example pictured in Figure 17.7, which we assume was prepared during the redesign phase, we assume that problems with the reserve car activity will be solved by replacing the existing phone-answering operation with a neural network application that will answer phones, talk to customers as they seek to establish reservations, agree to reservations, and record this information. Normally, we don’t show IT applications supporting core processes when we prepare diagrams like this, but we do in this case to focus attention on the fact that one of the subprocesses is an AI application.

Figure 17.7
Figure 17.7 Redesign process that now incorporates an AI application.

If we continued to play out this example, the process redesign team, probably made up of business managers and employees, business analysts, and perhaps an IT and AI specialist, would develop software requirements for the AI reservation system to be developed and then pass those requirements on to an appropriate IT group during the implementation phase of the overall effort. The IT group would use a neural network development methodology to develop the actual software application required and then work with the process team to test and ultimately implement the new neural network application. The process group and the IT group would also need to establish plans with the day-to-day process management team to monitor and oversee changes in the neural network application. One would assume that, no matter how much advanced training the application received, once it started talking with customers online, began booking reservations, and so forth, the system would modify its behavior and might even develop new approaches to some specific types of situations. The managers of the process and IT would want to be prepared to support the system to assure that everything ran smoothly and that useful innovations were captured and used elsewhere if appropriate.

A Quick Review

Earlier we suggested that commercial attention focused on rule-based AI technologies in the 1980s was stimulated in part by the success of two expert systems, Dendral and Mycin. It’s probably fair to say that the current round of commercial interest in AI is being driven by the popular successes of two cognitive game-playing applications: IBM’s Watson, which won Jeopardy!, and Google’s AlphaGo, which defeated the world’s leading professional Go player.

These victories in themselves aren’t of too much value, but the capabilities demonstrated in the course of these two victories are hugely impressive. In the case of Watson it’s now clear that applications can be provided with natural language interfaces that can query and respond to users in more or less open-ended conversations. At the same time Watson is capable of examining huge databases and organizing the knowledge there to answer complex, open-ended questions. In the case of AlphaGo it’s equally clear that an application capable of expert performance can continue to learn by examining huge online databases of journals and news stories, or by working against itself to perform a task faster, better, or cheaper, and can improve very rapidly.

We’ve looked at these recent advances in several ways. We’ve contrasted them with the rule-based approaches used in the 1980s. We’ve briefly considered the role of large databases and machine learning, and how pattern-matching algorithms have advanced the state of the art. We’ve also considered the basics of neural networks, and the recent advances in deep neural networks and reinforcement learning that have made today’s neural networks much more powerful than earlier versions. Finally, we considered two demonstration applications: IBM’s Watson, which won at Jeopardy!, and Google’s AlphaGo, the cognitive system that defeated the world champion Go player. Each of these topics points to some basic themes.

There have been no major technological breakthroughs: all the basic technologies being used have been around for at least two decades. There have, however, been a series of incremental advances, and these have led researchers to revisit older techniques and reevaluate their power. Thus deep neural networks, various types of feedback techniques, and reinforcement learning have been combined with techniques for searching massive databases, and with the steady growth of computing power, to produce a powerful new generation of AI applications.
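
For readers who want a concrete picture of what "deep" means, the sketch below builds a small feedforward network in Python with random, untrained weights. The earlier reservation sketch used a single hidden layer; a deep network simply stacks many such layers, and it is this stacked structure, trained on large datasets with ample computing power, that underlies the new generation of applications. The layer sizes here are arbitrary choices for illustration.

# Illustrative only: a minimal deep feedforward network (untrained, random
# weights) showing the layered structure behind "deep" neural networks.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class DeepNet:
    def __init__(self, layer_sizes, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix and one bias vector per layer.
        self.weights = [rng.normal(scale=0.1, size=(a, b))
                        for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
        self.biases = [np.zeros(b) for b in layer_sizes[1:]]

    def forward(self, x):
        # Pass the input through every layer; ReLU everywhere except the last.
        for i, (W, b) in enumerate(zip(self.weights, self.biases)):
            x = x @ W + b
            if i < len(self.weights) - 1:
                x = relu(x)
        return x

net = DeepNet([16, 64, 64, 64, 3])   # 16 inputs, three hidden layers, 3 outputs
scores = net.forward(np.random.default_rng(1).normal(size=16))
print(scores)                        # raw scores; training is what gives them meaning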

New applications are being designed around architectures that combine many different techniques (sometimes the same technique used in several different ways) running on multiple machines. The result is a variety of problem-solving approaches that, taken together, lead to exciting new solutions.

AI does not describe a specific technology, or even a well-defined approach to computing. The term is not being used in the relatively focused way that the term expert systems was used in the 1980s. Instead, it describes a broad approach to application development that combines a wide variety of techniques. The applications being developed combine AI and non-AI techniques in complex architectures that include not only knowledge capture and knowledge analysis capabilities, but also natural language interfaces, visual front ends, and large-scale database search.

One very significant feature of AI applications is the proven ability of some of them, in at least some circumstances, to learn and improve rapidly on their own. It was the heavy development cost, the rigidity, and the rapid obsolescence of completed expert systems that doomed the second round of AI commercialization. AlphaGo suggests that the third round of AI may enjoy considerably more success. One can imagine future AI applications linked to the Internet, constantly reading journals, newsfeeds, and conference proceedings, updating their knowledge, and simultaneously improving their problem-solving capabilities.

Although we haven’t gone into much technical detail, it’s clear that most developers working at commercial organizations today will have to work very hard to climb the steep learning curve that the latest cognitive computing applications require. The development of cognitive applications relies on integrating a variety of complex algorithms embedded in different neural networks, and on using rather esoteric techniques to train and improve the resulting applications. No one who has begun to explore the technologies and the knowledge bases required to create a powerful natural language program will imagine that most companies could simply hire people to develop a proprietary natural language application. Similarly, building a powerful learning application like AlphaGo requires a thorough knowledge of a large number of new and complex learning algorithms that only a few corporate software developers currently possess. This means that the growth and use of new cognitive computing techniques and tools will depend on commercial organizations acquiring packaged modules that provide these capabilities and then tailoring them to specific needs.
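
The resulting division of labor, in which a vendor supplies the pretrained capability and the company supplies its own examples and a thin tailoring layer, can be sketched as follows. The PackagedQA class below is a local stand-in for such a vendor module; its name, methods, and sample data are entirely hypothetical and imply no particular product.

# Illustrative only: the "buy a packaged module, then tailor it" pattern.
class PackagedQA:
    # Stand-in for a vendor module shipped with general pretrained knowledge.
    def __init__(self):
        self.knowledge = {"what are your hours": "Our standard hours are 9 to 5."}

    @staticmethod
    def _normalize(question):
        return question.lower().strip(" ?!.")

    def add_examples(self, examples):
        # The tailoring hook: a company loads its own question/answer pairs.
        self.knowledge.update({self._normalize(q): a for q, a in examples.items()})

    def answer(self, question):
        return self.knowledge.get(self._normalize(question),
                                  "I'm not sure; let me connect you to an agent.")

qa = PackagedQA()            # the packaged capability, as delivered
qa.add_examples({            # the company's tailoring: its own content, not new algorithms
    "Can I add a second driver?": "Yes, a second driver can be added for a daily fee.",
    "Do you rent vans?": "Vans are available at airport locations only.",
})

print(qa.answer("Do you rent vans?"))
print(qa.answer("What is the meaning of life?"))

A real packaged module would, of course, hide a trained language model rather than a lookup table; the pattern of acquiring the capability and tailoring it with company-specific content is what carries over.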

What is important for readers of this book to know is that AI techniques will increasingly dominate software development, and that computer systems will increasingly prove capable of performing tasks previously thought to require humans. This, in turn, will require business process developers to reconsider what processes can be automated and to become more skilled at analyzing the tasks that humans currently perform. The techniques described in this book should enable process analysts to conceptualize the overall challenges implicit in AI-based business processes. At the same time, new process analysis techniques will predictably be developed and will need to be mastered by developers, who will increasingly have to analyze cognitive and decision management tasks of all kinds.

Notes and References

Some commentators seem to think that the emphasis on “artificial” suggests that AI is not real intelligence but fake intelligence, much as artificial flowers are only plastic replicas of real flowers. In fact, the committee originally intended the term AI to apply to intelligence shown by artifacts, that is, intelligence shown by things made by humans. Hence, AI is best understood as a synonym for machine intelligence.

For a good overview of the interest in AI in the 1980s: Harmon, Paul, and David King, Expert Systems: Artificial Intelligence in Business, Wiley, 1985.

A nice discussion of the breakthroughs of Hinton, Bengio, and others is available at http://www.iro.umontreal.ca/~pift6266/H10/notes/deepintro.html.

See the deep learning entry in Wikipedia for a good review of deep learning. It is available at https://en.wikipedia.org/wiki/Deep_learning.

Hsu, Feng-hsiung, Behind Deep Blue, Princeton University Press, 2002.

Hall, Curt, “How Smart Is Watson, and What Is Its Significance to BI and DSS?” Advisor, Cutter Consortium, March 1, 2011.

Ferrucci, David, et al., “Building Watson: An Overview of the DeepQA Project,” AI Magazine, Fall, 2010.

A perfect information game is a game with well-defined rules in which all the elements and moves are known to all players at all times. In theory it is always possible to analyze such a game to determine whether there is a set of moves that guarantees a win. In practice this can’t be done if the number of possible move sequences is so great that it exceeds the capacity of even our fastest computers. Both chess and Go fall into this latter category: their game trees are so vast that complete enumeration is computationally intractable.
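
The scale involved is easy to make concrete. Using the rough figures cited in the AlphaGo literature (about 250 legal moves per Go position and games about 150 moves long), and the comparable rough figures commonly cited for chess, the short calculation below shows why exhaustive enumeration is out of reach.

# Rough illustration of why Go cannot be analyzed by brute force.
from math import floor, log10

branching_factor = 250   # approximate legal moves per Go position
game_length = 150        # approximate moves in a professional game

print(f"Go: more than 10^{floor(game_length * log10(branching_factor))} move sequences")
print(f"Chess: more than 10^{floor(80 * log10(35))} move sequences")
# Both figures dwarf the number of atoms in the observable universe (about 10^80).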

Silver, David et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, Vol. 529, Issue 7587, pp. 484–489, January 27, 2016.

Mnih, Volodymyr, et al., “Playing Atari with Deep Reinforcement Learning,” NIPS Deep Learning Workshop, 2013.

A detailed description of the Go innovations that AlphaGo has introduced is available at http://deepmind.com/blog/innovations-alphago/.
