Chapter 2


Artificial intelligence, machine learning and big data

On 11 May 1997, an IBM computer named Deep Blue made history by defeating Garry Kasparov, the reigning world chess champion, in a match in New York City. Deep Blue won using raw computing muscle, evaluating up to 200 million positions per second as it referred to a list of rules it had been programmed to follow. Its programmers even adjusted its programming between games. But Deep Blue was a one-trick pony, soon dismantled. Computers were still far from outperforming humans at many elementary tasks or in more complicated games, such as the Chinese game of Go, where there are more possible game states than atoms in the universe (see Figure 2.1).

Figure 2.1 A Go gameboard.

Fast forward 19 years to a match in Seoul, Korea, when a program named AlphaGo defeated reigning world Go champion Lee Sedol. Artificial intelligence had not simply improved in the 19 years since Deep Blue; it had become fundamentally different. Whereas Deep Blue had improved through additional, explicit instructions and faster processors, AlphaGo learned on its own. It first studied expert moves and then it practised against itself. Even the developers of AlphaGo couldn’t explain the logic behind certain moves it made. It had taught itself to make them.

What are artificial intelligence and machine learning?

Artificial intelligence (AI) is a broad term for when a machine can respond intelligently to its environment. We interact with AI in Apple’s Siri, Amazon’s Echo, self-driving cars, online chatbots and gaming opponents. AI also helps in less obvious ways. It is filtering spam from our inboxes, correcting our spelling mistakes, and deciding which posts appear at the top of our social media feeds. AI has a broad range of applications, including image recognition, natural language processing, medical diagnosis, robotic movements, fraud detection and much more.

Machine learning (ML) is when a machine keeps improving its performance, even after you’ve stopped programming it. ML is what makes most AI work so well, especially when there is abundant training data. Deep Blue was rule-based. It was AI without machine learning. AlphaGo used machine learning and gained its proficiency by first training on a large dataset of expert moves and then playing additional games against itself to learn what did or didn’t work. Since machine learning techniques improve with more data, big data amplifies machine learning. Most AI headlines today and almost all the AI I’ll discuss in this book will be applications of machine learning.
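
To make the distinction concrete, here is a minimal Python sketch, entirely invented rather than drawn from any real system, contrasting a rule-based filter with one that derives its own rules from labelled examples:

  # Rule-based 'AI' (Deep Blue style): behaviour fixed by hand-written rules.
  def rule_based_filter(message):
      banned = {"prize", "winner", "free"}          # rules a programmer chose
      return any(word in message.lower().split() for word in banned)

  # Machine learning (AlphaGo style): behaviour derived from labelled examples.
  def train_filter(examples):
      spam_words, ham_words = set(), set()
      for message, is_spam in examples:
          (spam_words if is_spam else ham_words).update(message.lower().split())
      return spam_words - ham_words                 # the 'learned' rules

  examples = [("win a free prize now", True), ("lunch at noon", False)]
  learned = train_filter(examples)
  print(rule_based_filter("free tickets"))          # True, by a fixed rule
  print("prize" in learned)                         # True, learned from data

The second filter improves simply by being shown more labelled messages, with no change to its code, which is the essential difference.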

The origins of AI

Researchers have been developing AI methods since the 1950s. Many techniques used today are several decades old, originating in the self-improving algorithms developed in the research labs of MIT’s Marvin Minsky and Stanford’s John McCarthy.

AI and ML hit several false starts. Researchers had high expectations, but computers were limited and initial results were disappointing. By the early 1970s, what was termed ‘the first AI winter’ had set in, lasting through the end of the decade.

Enthusiasm for AI resurfaced in the 1980s, particularly following industry success with expert systems. The US, UK and Japanese governments invested hundreds of millions of dollars in university and government research labs, while corporations spent similar amounts on in-house AI departments. An industry of hardware and software companies grew to support AI.

The AI bubble soon burst again. The supporting hardware market collapsed, expert systems became too expensive to maintain and extensive investments proved disappointing. In 1987, the US government drastically cut AI funding, and the second AI winter began.

Why the recent resurgence of AI?

AI picked up momentum again in the mid-1990s, partly due to the increasing power of supercomputers. Deep Blue’s 1997 chess victory was actually a rematch. It had lost 15 months earlier, after which IBM gave it a major hardware upgrade.12 With twice the processing power, it won the rematch using brute computational force. Although it had used specialized hardware, and although its application was very narrow, Deep Blue had demonstrated the increasing power of AI.

Big data gave an even greater boost to AI with two key developments:

  1. We started amassing huge amounts of data that could be used for machine learning.
  2. We created software that allows networks of ordinary computers to work together with the power of a supercomputer.

Powerful machine learning methods could now run on affordable hardware and could feast on massive amounts of training data. As an indication of scale, ML applications today may run on networks of several hundred thousand machines.

One especially well-publicized machine learning technique that is increasingly used today is the artificial neural network, recently extended to larger (deeper) networks and branded as deep learning. This technique contributed to AlphaGo’s victory in 2016.

Artificial neural networks and deep learning

Artificial neural networks (ANNs) have been around since the late 1950s. They are collections of very simple building blocks pieced together to form larger networks. Each block performs only a few basic calculations, but the whole network can be ‘trained’ to assist with complicated tasks: label photos, interpret documents, drive a car, play a game, etc. Figure 2.2 gives examples of ANN architectures.
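
To show just how simple the building blocks are, here is a minimal Python sketch of a single artificial neuron, with made-up weights standing in for values that training would normally set:

  import math

  def neuron(inputs, weights, bias):
      # Weighted sum of the inputs, squashed by a sigmoid activation function.
      weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
      return 1 / (1 + math.exp(-weighted_sum))

  # Three inputs; in a real network, training would set the weights and bias.
  print(neuron([0.5, 0.2, 0.9], [0.4, -0.6, 0.3], bias=0.1))

A network is nothing more than thousands or millions of these blocks, with the outputs of some neurons feeding the inputs of others.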

Artificial neural networks are so named because of their similarity to the connected neurons within the animal brain. They function as pattern recognition tools, similar to the early layers of our brain’s visual cortex but not comparable with the parts of our brain that handle cognitive reasoning.

Figure 2.2 Examples of artificial neural network architectures.13

The challenge of building an ANN is in choosing an appropriate network model (architecture) for the basic building blocks and then training the network for the desired task.

The trained model is deployed to a computer, a smartphone or even a chip embedded within production equipment. An increasing number of tools are available to facilitate this process of building, training and deploying ANNs. These include Caffe, developed by the Berkeley Vision and Learning Center; TensorFlow, developed within Google and open-sourced under the Apache licence in November 2015; Theano; and others.
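
As an illustration of what these tools look like in practice, here is a minimal sketch using TensorFlow’s Keras API. The data, layer sizes and file name are placeholders for this example, not a recipe for a real project:

  import numpy as np
  import tensorflow as tf

  x_train = np.random.rand(1000, 4)              # 1,000 examples, 4 features
  y_train = np.random.randint(0, 3, size=1000)   # labels: classes 0, 1 or 2

  # Build: a small stack of simple building blocks.
  model = tf.keras.Sequential([
      tf.keras.Input(shape=(4,)),
      tf.keras.layers.Dense(16, activation="relu"),
      tf.keras.layers.Dense(3, activation="softmax"),   # one output per class
  ])

  # Train: adjust connection strengths against the labelled examples.
  model.compile(optimizer="adam",
                loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
  model.fit(x_train, y_train, epochs=5)

  # Deploy: save the trained model for use elsewhere.
  model.save("model.keras")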

‘Training’ an ANN involves feeding it millions of labelled examples. To train an ANN to recognize animals, for example, I need to show it millions of pictures and label the pictures with the names of the animals they contain. If all goes well, the trained ANN will then be able to tell me which animals appear in new, unlabelled photos. During training, the structure of the network itself does not change, but the strengths of the various connections between the ‘neurons’ are adjusted to make the model more accurate.
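
Here is a toy Python illustration of that adjustment, stripped down to a single connection strength nudged repeatedly towards better predictions:

  # One connection strength ('weight'), learning the relationship y = 2x.
  weight = 0.0
  examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # (input, label) pairs
  learning_rate = 0.05

  for _ in range(100):                        # repeated passes over the data
      for x, label in examples:
          error = weight * x - label          # how wrong is the prediction?
          weight -= learning_rate * error * x # nudge the weight to reduce it

  print(round(weight, 2))                     # close to 2.0, the learned strength

Training a deep network is this same idea repeated across millions of weights at once.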

Larger, more complex ANNs can produce models that perform better, but they can take much longer to train. The layered networks are now generally much deeper, hence the rebranding of ANNs as ‘deep learning’. Using them requires big data technologies.

ANNs can be applied to a broad range of problems. Before the era of big data, researchers would say that neural networks were ‘the second-best way to solve any problem’. This has changed. ANNs now provide some of the best solutions. In addition to improving image recognition, language translation and spam filtering, Google has incorporated ANNs into core search functionality with the implementation of RankBrain in 2015. RankBrain, a neural network for search, has proven to be the biggest improvement to ranking quality Google has seen in several years. It has, according to Google, become the third most important of the hundreds of factors that determine search ranking.14

Case study – The world’s premier image recognition challenge

The premier challenge in image recognition is the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC), in which global research teams compete to build machine learning programs to label over 14 million images. An ANN won the challenge for the first time in 2012, and in a very impressive way. Whereas the best classification error rate for earlier ML algorithms had been 26 per cent, the ANN had a classification error rate of only 15 per cent.

ANNs won every subsequent competition. In 2014, the GoogLeNet program won with an error rate of only 6.7 per cent using an ANN with 22 layers and several million artificial neurons. This network was three times deeper than that of the 2012 winner, and, in comparison with the number of neurons in the animal brain, their network was slightly ahead of honey bees but still behind frogs.

By 2016, the winning ML algorithm (CUImage) had reduced the classification error to under 3 per cent using an ensemble of AI methods, including an ANN with 269 layers (10× deeper than the 2014 winner).

How AI helps analyse big data

Most big data is unstructured data, including images, text documents and web logs. We store these in raw form and extract detailed information when needed.

Many traditional analytic methods rely on data that is structured into fields such as age, gender, address, etc. To better fit a model, we often create additional data fields, such as average spend per visit or time since last purchase, a process known as feature engineering.
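
As a small illustration, here is how the two features just mentioned might be derived with the pandas library in Python; the table and column names are invented:

  import pandas as pd

  purchases = pd.DataFrame({
      "customer_id": [1, 1, 2],
      "visit_date": pd.to_datetime(["2024-01-05", "2024-03-10", "2024-02-20"]),
      "spend": [40.0, 60.0, 25.0],
  })

  # Engineer two new fields per customer from the raw purchase records.
  features = purchases.groupby("customer_id").agg(
      average_spend_per_visit=("spend", "mean"),
      last_purchase=("visit_date", "max"),
  )
  today = pd.Timestamp("2024-04-01")
  features["days_since_last_purchase"] = (today - features["last_purchase"]).dt.days
  print(features)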

Certain AI methods do not require this feature engineering and are especially useful for data without clearly defined features. For example, an AI method can learn to identify a cat in a photo just by studying photos of cats, without being taught concepts such as cat faces, ears or whiskers.

Some words of caution

Despite early enthusiasm, the AI we have today is still ‘narrow AI’: each program is useful only for the specific application for which it was designed and trained. Deep learning has brought marginal improvements to narrow AI, but what is needed for full AI is a substantially different tool set.

Gary Marcus, a research psychologist at New York University and co-founder of Geometric Intelligence (later acquired by Uber), describes three fundamental problems with deep learning.15

Figure 2.3 Example of AI failure.16

Figure 2.4 Dog or ostrich?

  1. There will always be bizarre results, particularly when there is insufficient training data. For example, even as AI achieves progressively more astounding accuracy in recognizing images, we continue to see photos tagged with labels bearing no resemblance to the photo, as illustrated in Figure 2.3.
    Or consider Figure 2.4 above. By modifying the image of a dog in the left-hand image in ways that the human eye cannot detect, researchers fooled the best AI program of 2012 into thinking the image on the right was an ostrich.17 (A sketch of this kind of ‘adversarial’ attack follows this list.)
  2. It is very difficult to engineer deep learning processes. They are difficult to debug, revise incrementally and verify.
  3. There is no real progress in language understanding or causal reasoning. A program can identify a man and a car in a photo, but it won’t wonder, ‘Hey, how is that man holding the car above his head?’
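
For the curious, here is a rough Python sketch of the simplest modern form of the dog-to-ostrich trick, the so-called fast gradient sign method. It is a later, simpler technique than the one used in the original research, and `model`, `image` and `true_label` stand in for a trained image classifier and a labelled photo:

  import tensorflow as tf

  def adversarial_image(model, image, true_label, epsilon=0.01):
      # Nudge every pixel slightly in the direction that increases the
      # model's error; the change is invisible to the human eye.
      loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
      image = tf.convert_to_tensor(image)
      with tf.GradientTape() as tape:
          tape.watch(image)
          loss = loss_fn(true_label, model(image))
      gradient = tape.gradient(loss, image)
      # Pixels move by at most +/- epsilon, yet the predicted label can
      # flip (e.g. from 'dog' to 'ostrich').
      return tf.clip_by_value(image + epsilon * tf.sign(gradient), 0.0, 1.0)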

Keep in mind

Artificial intelligence is still limited to addressing specific tasks with clear goals. Each application needs a specially designed and trained AI program.

Remember also that AI is very dependent on large quantities of diverse, labelled data and that AI trained with insufficient data will make more mistakes. We’ve already seen self-driving cars make critical errors when navigating unusual (e.g. untrained) conditions. Our tolerance for inaccuracy in such applications is extremely low.

AI often requires a value system. A self-driving car must know that running over people is worse than running off the road. Commercial systems must balance revenue and risk reduction with customer satisfaction.
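
A toy Python sketch of such a value system, with invented cost weights, might look like this:

  # Invented cost weights encoding 'running over people is worse than
  # running off the road'.
  HARM_TO_PEOPLE = 1_000_000
  VEHICLE_DAMAGE = 100

  def action_cost(hits_person, leaves_road):
      return (HARM_TO_PEOPLE if hits_person else 0) + \
             (VEHICLE_DAMAGE if leaves_road else 0)

  # The car should prefer the action with the lower cost.
  print(action_cost(hits_person=False, leaves_road=True))    # 100
  print(action_cost(hits_person=True, leaves_road=False))    # 1,000,000

In real systems the values are encoded less explicitly, inside the objective the model is trained to optimize, but the principle is the same: someone must decide the weights.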

Applications of AI in medicine bring their own promises and pitfalls. A team at Imperial College London has recently developed AI that diagnoses pulmonary hypertension with 80 per cent accuracy, significantly higher than the 60 per cent accuracy typical among cardiologists. The application of such technology, though, raises some complicated issues, as we’ll discuss later.18

AI applications have captured headlines over the past few years, and will doubtless continue to do so. I’ll talk more about AI in Chapter 8, when I discuss choosing analytic models that fit your business challenges. But AI is just one of many analytic tools in our tool chest, and its scope is still limited. Let’s step back now and consider the bigger picture of how big data can bring value through a wider set of tools and in a broad range of applications.

Takeaways

  • AI has been actively studied for 60 years, but has twice gone through winters of disillusionment.
  • Much of AI involves machine learning, where the program self-learns from examples rather than simply following explicit instructions.
  • Big data is a natural catalyst for machine learning.
  • Deep learning, a modern enhancement of an older method known as neural networks, is used in much of today’s AI technology.
  • AI programs are limited in scope and will always make non-intuitive errors.

Ask yourself

  • Where in your organization do you have large amounts of labelled data that could be used for training a machine learning program, such as recognizing patterns in pictures or text, or predicting next customer action based on previous actions?
  • If you’ve already started an AI project in your organization, how much greater is its estimated return on investment (ROI) than its cost? If you multiply the estimated chance of success by the estimated ROI, you should get a number exceeding the estimated cost.
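
As a worked example of that rule of thumb, with invented numbers:

  chance_of_success = 0.4        # an honest estimate, not a hope
  estimated_return = 500_000     # value delivered if the project succeeds
  estimated_cost = 150_000

  expected_value = chance_of_success * estimated_return   # 200,000
  print(expected_value > estimated_cost)                  # True: worth pursuing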