6 Trends Framing the State of AI and ML

O’Reilly online learning is a trove of information about the trends, topics, and issues tech leaders need to know about to do their jobs. We use it as a data source for our annual platform analysis, and we’re using it as the basis for this report, where we take a close look at the most-used and most-searched topics in machine learning (ML) and artificial intelligence (AI) on O’Reilly.1

Our analysis of ML- and AI-related data from the O’Reilly online learning platform indicates:

  • Unsupervised learning surged in 2019, with usage up by 172%.
  • Deep learning cooled slightly in 2019, slipping 10% relative to 2018, but deep learning still accounted for 22% of all AI/ML usage.
  • Although TensorFlow grew by just 3%, it too garnered a 22% share of AI/ML usage in 2019.
  • PyTorch looks like a contender: it posted triple-digit growth in usage share in both 2018 and 2019.
  • Reinforcement learning fell by 5% in 2019, but it is still up more than 1,500% since 2017.
  • Sustained strength in unsupervised learning, neural networks, reinforcement learning, etc., demonstrates that organizations are experimenting with advanced ML tools and methods.
Figure 1. AI/ML topics on the O’Reilly online learning platform with the most usage in 2019 (left) and the rate of change for each topic (right)

Growth in ML and AI Is Unabated

Engagement with the artificial intelligence topic continues to grow, up 88% in 2018 and 58% in 2019 (see Figure 1), outpacing share growth in the much-larger machine learning topic (+14% in 2018 and +5% in 2019). Aggregating artificial intelligence and machine learning topics accounts for nearly 5% of all usage activity on the platform—a touch less than, and growing 50% faster than, the well-established data science topic (see Figure 2). Data engineering remains the largest topic in the data category with just over 8% usage share on the platform (Figure 2). But the data engineering share is down about 8% in 2019, mostly from declines in engagement with data management topics.

Figure 2. High-level data topics on the O’Reilly online learning platform with the most usage (left) and the rate of change for each topic (right)

Unsupervised Learning Is Growing

Interest in the unsupervised learning topic increased significantly, with usage up by 53% in 2018 and by 172% in 2019 (see Figure 1).2 What’s driving this growth?

For most people and most use cases, supervised learning serves as the default, assumed strategy for machine learning. That makes unsupervised learning worth noting as a separate topic, given the growth in engagement driven by more sophisticated users, improved tools, and use cases not easily addressed with supervised methods. Users are more apt to engage with specific supervised learning methods (e.g., linear and logistic regressions, support vector machines) than with the canonical topic of supervised learning itself. Unsupervised learning, by contrast, isn’t as well understood, even if the names of its methods (e.g., clustering and association) and its applications (e.g., neural networks) are familiar to many users.
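To make the distinction concrete, here is a minimal sketch of clustering, one of the unsupervised methods named above. It is an illustration only: the function `kmeans_1d` and the `session_minutes` data are hypothetical, and the loop is a bare-bones k-means written with the standard library rather than any particular tool's implementation. The point is that no labels are supplied; the grouping comes from the data alone.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group unlabeled numbers into k clusters with a bare-bones k-means loop."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each value joins the cluster with the nearest center.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical, unlabeled data: minutes per session for eight users.
session_minutes = [2, 3, 4, 3, 41, 45, 39, 44]
centers, clusters = kmeans_1d(session_minutes, k=2)
print(centers)   # [3.0, 42.25]
print(clusters)  # [[2, 3, 4, 3], [41, 45, 39, 44]]
```

A supervised method would need each session tagged in advance (say, "casual" or "engaged") before it could learn anything; here the two groups surface without any such tags, and interpreting them is left to the analyst.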

In all likelihood, the surge in unsupervised learning activity on O’Reilly is being driven by a lack of familiarity with the term itself, as well as with its uses, benefits, requirements, etc. It’s likely, too, that the visible success of unsupervised learning in neural networks and deep learning has helped spur interest, as has the diversity of open source tools, libraries, tutorials, etc., that support unsupervised learning.3 That some of these tools (scikit-learn, PyTorch, and TensorFlow) are also Python-based doesn’t hurt either.

Usage in Advanced Techniques Is Up—Mostly

It’s said that the success of neural networks and, especially, deep learning (neither of which is new) helped spur the resurrection of a number of other disused or neglected ideas. One example is reinforcement learning, which experienced a sharp spike in usage on the O’Reilly platform in 2018, growing by 1,612%, before regressing slightly (–5%) in 2019 (see Figure 1). Looking at AI/ML topic detail, we see usage in neural networks continuing its upward trend (up 52% in 2018 and 17% in 2019), but the related topic of deep learning dropped 10% in 2019. The drop in deep learning is likely a function of inter-year noise and not evidence of an emerging trend, given the significant usage growth in 2018 (+52%). These closely related topics are popular: aggregating neural networks, deep learning, and TensorFlow usage nets nearly half (47%) of all AI/ML category usage, showing a slight decline (–3%) in 2019 after growing 24% in 2018.
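The two-year figures follow from compounding the yearly changes rather than adding them. The sketch below shows the arithmetic under the assumption that each year's percentage is measured against the previous year's level; the helper name `compound_change` is ours and is not part of any platform tooling.

```python
def compound_change(*yearly_pct_changes):
    """Return the cumulative percent change after applying each yearly change in turn."""
    factor = 1.0
    for pct in yearly_pct_changes:
        factor *= 1 + pct / 100
    return (factor - 1) * 100


# Yearly changes quoted in this report: 2018 first, then 2019.
reinforcement_learning = compound_change(1612, -5)
neural_networks = compound_change(52, 17)

print(f"Reinforcement learning since 2017: {reinforcement_learning:+.0f}%")  # +1526%
print(f"Neural networks since 2017: {neural_networks:+.0f}%")                # +78%
```

This is why a 5% decline in 2019 still leaves reinforcement learning up more than 1,500% on its 2017 level.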

In our “AI adoption in the enterprise 2020” survey, we found that deep learning was the most popular ML method among companies that are evaluating AI. Among companies using AI to support production use cases, deep learning was second.4 It might be that—at 1% of platform usage and 22% of all AI/ML usage—deep learning has approached its asymptote. Growth could be slow from here on out.

The Rising AI/ML Tide Lifts (Almost) All Boats

Another topic showing consistent growth is natural language processing (NLP) (see Figure 1). Its growth rate isn’t spectacular (+15% in 2018; +9% in 2019), but NLP now accounts for about 12% of all AI/ML usage on O’Reilly. That’s about six times the share of unsupervised learning and five times the share of reinforcement learning usage. Interest in some methods or applications of ML seems to be waning, however. For example, the chatbots topic continues to decline, first by 17% in 2018 and then by 34% in 2019. This is probably a reflection of the comparative maturity of the space. The chatbot was one of the first applications of AI in experimental and production usage. This likely doesn’t portend the end of interactions with occasionally helpful (and still sometimes horrifying) customer service chatbots. Computer vision usage shows a slow decline, falling by 3% in 2018 and 2% in 2019. This is probably more noise than trend; moreover, computer vision accounts for about twice as much usage activity as the fast-growing unsupervised learning topic.

Python-Based Tools Are Ascendant in AI/ML

Reports of Torch’s death are somewhat misleading. In fact, PyTorch—a wrapper that permits users to call Torch’s ML libraries from Python—posted triple-digit growth in usage in just the last few years, surging by almost 400% in 2018 and by 111% in 2019 (see Figure 1). PyTorch’s popularity is probably a function of the success of Python itself, particularly for ML and AI; vanilla Torch uses Lua as a wrapper to expose its core C libraries, while PyTorch eschews Lua (in favor of Python) for the same purpose. Once you factor in the preeminence of Python, the rising popularity of PyTorch makes a lot of sense.

This may have something to do with TensorFlow’s outsized presence in ML, too. In 2019, it accounted for 1% of all usage, about a third as much usage as machine learning, and 22% of all AI/ML usage. TensorFlow isn’t a Python-exclusive technology—it exposes stable C and Python APIs—but its users tend to be Python-savvy and its related projects, patterns, tutorials, etc., disproportionately involve Python.5
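The two TensorFlow shares quoted above can be reconciled with a quick calculation. This is a rough sketch, assuming the AI/ML category corresponds to the "nearly 5%" of platform usage cited earlier for the aggregated artificial intelligence and machine learning topics (rounded to 5% here); the constant and function names are illustrative.

```python
# Approximate figures quoted in this report; "nearly 5%" is rounded to 5% here.
AI_ML_SHARE_OF_PLATFORM = 0.05    # AI/ML usage as a share of all platform usage
TENSORFLOW_SHARE_OF_AI_ML = 0.22  # TensorFlow usage as a share of AI/ML usage


def platform_share(share_of_category, category_share_of_platform):
    """Convert a within-category share into a share of all platform usage."""
    return share_of_category * category_share_of_platform


tensorflow_platform_share = platform_share(TENSORFLOW_SHARE_OF_AI_ML, AI_ML_SHARE_OF_PLATFORM)
print(f"TensorFlow share of all usage: {tensorflow_platform_share:.1%}")  # 1.1%
```

Under that assumption, 22% of the AI/ML category works out to roughly 1% of all platform usage, in line with the figure reported for TensorFlow.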

The results of our recent AI adoption survey underscore this trend. TensorFlow was also the number one ML technology in the survey, while PyTorch came in at number four. Two additional Python-based tools (scikit-learn and Keras) also cracked the top five.6 We know from our annual analysis of usage and search on the O’Reilly online learning platform that ML- and AI-related development is one of Python’s fastest-growing areas. The prominence of these and other Python-related tools attests to this fact.

What’s in a Name? The Shift to “Artificial Intelligence”

Does the growing engagement in neural networks, reinforcement learning, and unsupervised learning, together with the increased focus on putting models into production, augur a shift in how practitioners in the space frame what they do? We think yes, with practitioners increasingly calling their work “artificial intelligence,” a notion supported by the growth in AI usage on O’Reilly, the increasing embrace of sophisticated tools, and the empirical trend of putting those tools into production, which we see in our AI surveys. AI has always been the general term for building intelligent systems, with machine learning covering the more specific case of building software that learns and modifies its outputs without the need for additional coding. Here are some observations that, viewed in aggregate, help explain why those in the space think machine learning doesn’t quite cover all they do:

  • Machine learning produces models that are widely used in the automation of tasks such as credit scoring, fraud detection, recommendation engines, etc., but ML models are increasingly deployed in libraries or services and exposed via APIs—such that a model or ensemble of models can be invoked by any valid user, program, or service.
  • To some extent, models can be built with an aim toward reuse, such that, for example, a data profiling model can be invoked and used to support different business use cases.
  • Tools and techniques like reinforcement learning and unsupervised learning open up new use cases, including decision support, interactive games, real-time retail recommendation engines, and data discovery.
  • The focus of usage—and, with it, design and development—is shifting from the specific to the generalized. ML libraries and services have the potential to transform the software products we deliver, the processes that consume them, and—concomitant with this—the experiences of users, customers, and partners alike.
  • This isn’t just ML; it’s a kind of AI: a new way of thinking about and applying machine intelligence. It has implications for software architecture, infrastructure, and operations—for virtually all domains.

So this isn’t artificial general intelligence, but AI as the application of machine learning to solve problems, increase productivity, accelerate processes, and in many cases deliver wholly new products and services.

Concluding Thoughts

As organizations adopt analytic technologies, they’re discovering more about themselves and their worlds. Adoption of ML especially prompts people at all levels of an organization to start asking questions that challenge in different ways what the organization thinks it knows about itself. An organization’s use of ML tools and techniques, and the contexts in which it uses them, will tend to change, too. For example, the techniques of supervised learning are useful for classifying known-knowns and for elucidating certain kinds of known-unknowns; they’re unsuitable for surfacing unknown-unknowns, however. Unsupervised techniques are better for this. Not for classifying, synthesizing, or understanding unknown-unknowns—that’s the responsibility of human intelligence—but for surfacing them in the first place.

The upshot is that adopters are integrating both kinds of learning into their ML practices. They’re also apt to experiment with advanced ML methods, such as deep learning, that have applications for both supervised and unsupervised learning. In fact, we found in our AI adoption survey that those new to ML are almost as likely to experiment with deep learning as mature adopters. Right now, companies are successfully using ML to ferret out known-unknowns and unknown-unknowns in their business worlds. They’re instantiating what they discover, analyze, and understand about their worlds in models. Some are also starting to incorporate these models into automated, quasi-intelligent products, services, and software. All of this partakes of the propulsive logic of self-discovery. It’s at the root of a question Plato first formulated almost 2,500 years ago: “But how will you look for something when you don’t in the least know what it is?” he has Meno ask Socrates. “How on earth are you going to set up something you don’t know as the object of your search?” Philosophical tradition treats this question as a paradox, but it’s also possible to see it as an inquiry into how an object of knowledge augments and transforms itself.

With ML and AI, we’re training machines to surface new objects of knowledge that help us as we learn to ask new, different, and sometimes difficult questions about ourselves. By all indications, we seem to be having some success with this.

1 This article is based on non-personally-identifiable information about the top search terms and most-used topics on O’Reilly online learning. We compared aggregated data for the last three years: a full year of data for 2017 and 2018, and through the end of October for 2019.

2 Overall usage share for the unsupervised learning topic on O’Reilly more than tripled year-over-year. For the platform as a whole, unsupervised learning accounts for an extremely small percentage of usage activity—less than 0.1%. Share for specific applications (e.g., deep learning) is much higher.

3 Neural networks and deep learning are not exclusive to unsupervised learning. However, research into either topic is likely to surface the connection to unsupervised learning, as well as reinforcement learning, transfer learning, etc.

4 In our survey, we also found that two-thirds of mature AI adopters and 55% of organizations that are evaluating AI are using deep learning.

5 TensorFlow supports stable C and Python APIs. The project supports API-level access in other languages—including C++, Go, Java, and JavaScript—but does not guarantee compatibility with them.

6 Neither scikit-learn nor Keras generated significant activity on the O’Reilly platform itself, however. By contrast, in our 2020 AI adoption survey, the scikit-learn library (ranked second) was used by about 48% of respondents in 2019 and 2020. Keras, a Python library used in developing neural networks, climbed to number five in the 2020 edition of the survey; it was used by more than one-third of all respondents.
