© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
N. Sabharwal and G. Bhardwaj, Hands-on AIOps, https://doi.org/10.1007/978-1-4842-8267-0_5

5. Fundamentals of Machine Learning and AI

Navin Sabharwal and Gaurav Bhardwaj
New Delhi, India

You learned about AIOps, its architecture, and its components in previous chapters, and you will be venturing deeper into AIOps in the next few chapters. Before you start implementing algorithms for AIOps, though, you need to learn a few basic things about machine learning and artificial intelligence. This chapter will cover the fundamentals of artificial intelligence, machine learning, and deep learning. It will then go on to discuss specific techniques that are used in AIOps.

What Are Artificial Intelligence and Machine Learning?

Artificial intelligence, machine learning, and deep learning are terms that are often used interchangeably. This chapter will introduce each of these terms and detail the differences and overlaps between these technologies.

Most of us have come across AI in some form or another. Today AI is being used in most of the applications that we use every day. From recommendations on shopping sites to finding the most relevant article through a search engine like Google, everything involves AI. We all use speech recognition technologies in the form of voice assistants such as Alexa, Siri, etc. All of these are powered by artificial intelligence. What started as science fiction has become part of our everyday life; artificial intelligence has come a long way in a short span of time.

Let’s start with understanding artificial intelligence. The term artificial intelligence is used to define intelligence that is created by humans and programmed by humans. The AI systems mimic human intelligence and reasoning to make decisions and solve problems. Just like human intelligence resides in the human brain, AI resides in the form of algorithms, data, and models in a machine.

AI has been used to beat humans at games like chess and Go. AI is also solving problems and creating new use cases for applications that were not possible before it became available. AI is now being used in diverse fields including military and surveillance applications, healthcare, and diagnostics. Even the legal and governance fields have not been left untouched by artificial intelligence. Thus, AI as a technology is finding applications in almost everything.

Artificial intelligence as a term has gained traction since the 1950s. It encompasses every type of technology that has an element of artificially created intelligence. Machine learning blossomed in the 1980s, and with more computing power becoming available, deep learning gained traction starting in 2010. We will explain each of these in this chapter.

AI has exploded over the last decade primarily because of newer hardware technologies like GPUs that have made processing faster and cheaper. More powerful processing systems are being made at lower costs, and this has resulted in making the AI algorithms and techniques feasible. Theoretically, the techniques have existed for a long period; however, the technology infrastructure to execute these models at scale did not exist.

Why Machine Learning Is Important

AI is broadly classified into general AI and narrow AI. General AI is a concept that is still science fiction. A general AI machine would possess all the senses that we possess, such as sight, hearing, touch, smell, and taste, and perhaps even senses beyond human capabilities. These general AI machines would act just like humans do and possess powers beyond human capabilities. The term general in general AI means that, just as humans are general purpose and can perform a variety of tasks and learn a variety of things, the machines should be able to do the same.

In contrast, narrow AI consists of technologies that perform specific tasks as well as or better than humans. The current state of the technology and implementations are centered around narrow AI. Think of narrow AI as a specialist that can perform one type of a task quickly and with accuracy but is not capable of learning other tasks that are different in nature. Examples of narrow AI are systems that can understand natural language, systems that can translate language, systems that can recognize objects, etc. These systems are created and fine-tuned to do a particular task.

There is a third term called “AI superintelligence,” which is used extensively in science fiction to depict an artificial intelligence that reaches a critical threshold of intelligence with which it can optimize itself in progressively smaller timeframes and reach a level of superintelligence. Thus, superintelligence has the ability to transform and optimize itself rapidly by coding itself recursively and improving with each iteration. As the superintelligence improves itself with each iteration, it becomes faster and more powerful, and thus each subsequent cycle takes less time. It is postulated that if such a system becomes reality, it may be able to recode itself millions of times and consume all the available compute power on the planet to create a superintelligence in hours as each successive step takes less time.

Broadly, AI systems have these three key qualities:
  • Intentionality: The AI system has the intention of making a decision based on real-time analysis of data or based on historical analysis of data. For example, an algorithm for correlation can classify website hits as normal user hits or a DoS attack.

  • Intelligence: AI systems have intelligence by using machine learning and deep learning technologies to understand the environment through data and arrive at a probabilistic conclusion. An example is performing the root-cause analysis (RCA) of an issue. These systems try to mimic human intelligence and decision-making, but technologically we are still very far from reaching levels of human intelligence in AI systems. However, in some specific tasks such as natural language understanding, some AI systems are providing accuracy levels that are on par with average human accuracy.

  • Adaptability: AI systems learn and adapt as they compile information and make decisions. Rather than relying on static rules, an AI system is highly adaptable and learns from changing data. For example, predicting an outage or a service impact on the business requires the ability to adapt to changing environments and new data.

Types of Machine Learning

AI is quite an extensive and complex domain that consists of multiple specialized subdomains. Let’s understand machine learning and deep learning in greater detail.

Machine Learning

Machine learning is a technique that includes algorithms that parse data, learn from the data supplied to them, and then apply that learning to make decisions. Unlike traditional programming, where a person must manually create programs and rules based on a point-in-time snapshot of the environment and logic derived by the programmer, ML algorithms derive that logic automatically from data and create the required rules.

Machine learning algorithms are at work in most of the applications that we use today. A simple example is a recommender system that recommends a movie, a song, or an item for us to buy based on analysis of our past preferences as well as the preferences of other users who are statistically similar to us.

As the name suggests, machine learning algorithms are not static; they continue to learn with new data that is being fed into the system, and they get progressively better at the job with more data.

The term machine learning was coined by Arthur Samuel in 1959; its canonical and more practical definition was given by Tom Mitchell in 1997: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

Machine learning has three different types, namely, supervised learning, unsupervised learning, and reinforcement learning, as shown in Figure 5-1.

Figure 5-1

Types of machine learning: supervised learning (task driven), unsupervised learning (data driven), and reinforcement learning (learning from trial and error)

Supervised (Inductive) Learning

In supervised learning, users provide training data for a specific task to ML algorithms to analyze and learn from. Training data includes both the input, termed features, and the correct output, called the label. A simple example is an event with its description, timestamp, number of occurrences, the device on which it occurred, and whether it is a critical alert or noise. Once the system gets enough data examples, it learns what is an alert and what is noise. The more training data, the higher the accuracy of the system.

The trained model is then applied to unseen data as input for the algorithm to predict a response. Training is essentially the algorithm finding the connections between the parameters provided and establishing the relationships between the features and the label in the dataset. At the end of training, the algorithm has developed a draft model of how the data works and of the relationship between the input and the output.

For machine learning algorithms, the labeled dataset is divided into training data and test data: the training dataset trains the algorithm, and the testing dataset is later fed into the model to measure the accuracy of its predictions. Models are then progressively fine-tuned to achieve better accuracy.

Both the quantity and the quality of training data are important in determining the accuracy of the output. As a best practice, the dataset should be split in a 70:30 ratio between training and testing datasets. If there is uneven class distribution in the dataset, it is called an imbalanced training dataset, meaning many records for one class/label and very few for another. An imbalanced dataset causes imperfect decision boundaries, leading to inaccurate predictions. From an AIOps perspective, the training dataset should represent the actual production or operational characteristics with correctly labeled parameters.

Another important factor in improving the accuracy of prediction is to remove irrelevant features as they negatively impact the learning process. The features are various inputs that are part of the data; it is important that only relevant features are provided in the training. Incorrect, insufficient, or irrelevant data in training will lead to errors in the model.
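
As a minimal sketch of the split described above, the following snippet (assuming scikit-learn is installed; the labeled events are synthetic stand-ins) divides a small labeled dataset 70:30 while preserving the class distribution:

```python
# Hedged sketch: a 70:30 train/test split with stratification, using
# synthetic event features; stratify=y keeps the alert/noise ratio the
# same in both sets, which mitigates the imbalance problem noted above.
from sklearn.model_selection import train_test_split

# Synthetic features: [severity (0-5), occurrences per hour]; label: 1 = noise
X = [[5, 40], [1, 2], [4, 25], [0, 1], [3, 15],
     [1, 3], [5, 30], [0, 2], [4, 22], [1, 4]]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.30,    # the 70:30 split described above
    stratify=y,        # preserve class ratios to mitigate imbalance
    random_state=42)   # reproducibility
print(len(X_train), "training rows,", len(X_test), "test rows")  # 7 and 3
```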

From an AIOps perspective, the most commonly used types of supervised machine learning are as follows:
  • Regression: Regression is used to predict a continuous numerical value: for instance, the percentage CPU utilization at a specific time interval, the count of website hits, or the capacity (GB) of a database.

  • Classification: Classification is used when the expectation is to get a discrete value from a finite set of categories. We have already seen the example of classifying an event as an alert or noise. Classification use cases are termed binary or multiclass classification problems depending on whether the predicted output can belong to one of two classes or one of multiple classes.

The following are common supervised machine learning algorithms (a short classification sketch follows the list):
  • Regression

  • Logistic regression

  • Classification

  • Naive Bayes classifiers

  • K-NN (k-nearest neighbors)

  • Decision trees

  • Support vector machine
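
As a hedged illustration of one algorithm from this list, the following sketch (assuming scikit-learn; the event features and labels are synthetic stand-ins) trains a decision tree on the alert-versus-noise example discussed earlier:

```python
# Hedged sketch: a decision tree classifying events as alert or noise.
from sklearn.tree import DecisionTreeClassifier

# Synthetic features: [severity (0-5), occurrences/hour, business hours (0/1)]
X_train = [[5, 40, 1], [1, 2, 0], [4, 25, 1], [0, 1, 0], [3, 15, 1], [1, 3, 0]]
y_train = ["alert", "noise", "alert", "noise", "alert", "noise"]

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

X_test = [[4, 30, 1], [0, 2, 0]]       # two unseen events
print(model.predict(X_test))           # expected: ['alert' 'noise']
```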

With an understanding of supervised ML techniques, let’s explore another crucial ML technique, that of unsupervised learning.

Unsupervised Learning

One of the big challenges in any AIOps implementation is getting clean, labeled data. In such scenarios, unsupervised machine learning provides the starting point for the AIOps journey. Unlike supervised learning, it doesn't need any "correct answer" as a label. Rather, these algorithms explore millions of unlabeled data points to discover hidden correlations and patterns in the data, and they can adapt as the data changes, revising those correlations dynamically. This allows the model to work on its own to discover patterns and information that were previously undetected. Since there is no training data and it is left to the algorithm to decipher and make sense of the data, unsupervised learning is computationally more intensive than supervised learning. Since there is no training provided and no expert input to the algorithm, it is also less accurate. At times unsupervised learning is used to arrive at an initial analysis, and the data is then fed to supervised learning algorithms after more insight has been gained. Though unsupervised learning does not use labels, humans still need to analyze the generated output to make sense of the data and fine-tune the models so that the expected result is produced.

Two unsupervised machine learning techniques are most commonly used in AIOps: clustering and association.

Clustering identifies groups and anomalies in a dataset by splitting it based on the similarities and differences between the features and patterns available in the data. The resulting clusters help the operations team diagnose issues and anomalies. Clustering divides the entire dataset into subsets whose members are more similar to each other than to the rest, creating insight into the data without any training. Clustering brings similar events together, which can then be used to further analyze the data and select the appropriate mechanisms and algorithms for AIOps.

Association discovers the relationship between entities based on the correlation deduced from the dataset. For example, a long-running database job on a server can be associated with high CPU utilization and may lead to high response time on website hits or transactions.
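
To make the association idea concrete, here is a deliberately simple co-occurrence sketch in plain Python (the event names and time windows are hypothetical; production systems use dedicated algorithms such as Apriori over far larger datasets):

```python
# Hedged sketch: count how often event pairs appear in the same time window
# to surface candidate associations like "db_job_long" with "high_cpu".
from itertools import combinations
from collections import Counter

windows = [                                   # events seen per 5-minute window
    {"db_job_long", "high_cpu", "slow_web"},
    {"db_job_long", "high_cpu"},
    {"disk_full"},
    {"db_job_long", "high_cpu", "slow_web"},
]
pair_counts = Counter()
for window in windows:
    pair_counts.update(combinations(sorted(window), 2))

for pair, n in pair_counts.most_common(3):
    print(pair, "co-occurred in", n, "windows")
```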

From an AIOps perspective, datapoints should be exposed to unsupervised learning to leverage clustering and association algorithms for root-cause analysis purposes. As a best practice, all events and incidents should be fed into the unsupervised learning algorithm to determine noise.

Dimensionality reduction is a learning technique used when the number of features in a given dataset is too high. It reduces the number of data inputs to a manageable size while also preserving the data integrity. Often, this technique is used in the preprocessing data stage. This ensures that noisy features or features that are not relevant to the task at hand are reduced.
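
The following is a minimal dimensionality-reduction sketch using PCA from scikit-learn (the 12 correlated metric columns are synthetic, generated from 3 underlying signals, so the reduction it finds is known in advance):

```python
# Hedged sketch: PCA compresses 12 correlated features down to the few
# components that explain most of the variance.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
signals = rng.normal(size=(200, 3))        # 3 underlying signals
mixing = rng.normal(size=(3, 12))          # spread them across 12 features
X = signals @ mixing + rng.normal(scale=0.05, size=(200, 12))

pca = PCA(n_components=0.95)               # keep 95% of the variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)      # (200, 12) -> (200, 3)
```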

The following are types of unsupervised clustering:
  • Exclusive (partitioning)

  • Agglomerative

  • Overlapping

  • Probabilistic

The following are the most commonly used clustering algorithms:
  • Hierarchical clustering: As the name suggests, this clustering algorithm builds a "hierarchy" of clusters, decomposing the data into clusters that form a tree structure called a dendrogram. From an AIOps perspective, it is useful for automatically detecting service models without a CMDB and performing service impact analysis on them. You can follow either a top-down approach (divisive) or a bottom-up approach (agglomerative) to group data points into clusters. In the divisive approach, all data points are initially assumed to be part of one large cluster (like an application service), and then, based on the termination logic, they are divided into smaller clusters (like technical services). In the agglomerative approach, each data point starts as its own cluster, and clusters are iteratively merged to create larger ones.

  • Centroid-based clustering: These algorithms are among the simplest and most effective ways of creating clusters and assigning data points to them. In this algorithm, we first find "K" centroids and then group the dataset based on distance (or proximity) to those centroids. K-means is a popular centroid-based clustering algorithm, and we will explore it in detail in Chapter 8 for anomaly detection, noise detection, and probable-cause analysis as part of the AIOps implementation in IT operations.

  • Density-based clustering: Both centroid-based and hierarchical clustering techniques are based on the distance (or proximity) between data points without considering the density of data points at a specific position (or timestamp). Density-based clustering algorithms handle this limitation, and these techniques perform much better than K-means for detecting outliers and noise in a dataset, as sketched after this list.
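
The following hedged sketch uses DBSCAN from scikit-learn (the response-time data points are synthetic) to show how a density-based algorithm flags outliers directly, labeling them -1:

```python
# Hedged sketch: DBSCAN on synthetic [response-time ms, error-rate] points;
# real metrics should be scaled first so no single feature dominates distance.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
normal = rng.normal(loc=[100, 0.2], scale=[5, 0.02], size=(200, 2))  # baseline
outliers = np.array([[400, 0.9], [350, 0.8], [20, 0.01]])            # anomalies
X = np.vstack([normal, outliers])

labels = DBSCAN(eps=10, min_samples=5).fit_predict(X)
print("outlier rows:", np.where(labels == -1)[0])   # the three injected points
```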

Next, we will discuss reinforcement learning, which is another technique that is used in specific situations.

Reinforcement Learning

Both supervised and unsupervised algorithms depend on a dataset for learning. But it is also possible to learn from feedback on one's actions, whether in the form of a reward or a penalty. Reinforcement learning leverages this methodology and is driven by feedback from a sequence of actions; it continuously improves and learns by trial and error. Reinforcement learning algorithms are trained through rewards and penalties: for every successful action, a reward is given to the agent, and for every unsuccessful step, a penalty is levied. The reinforcement learning policy determines what is expected from the agent. It also involves balancing short-term and long-term rewards and penalties so that the agent does not chase short-term rewards while losing out on a bigger reward in the longer run. Apart from the agent, the rewards and penalties, and the policy, the other important element is the environment in which the agent operates; at every step, the state of the agent and its environment is fed back into the system to calculate what to do next. Reinforcement learning powers some of the AI-based systems that have outplayed human players in traditional and video games.
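
Here is a toy Python sketch of the reward-and-penalty loop (a one-step simplification of Q-learning with no successor states; the alert states, remediation actions, and reward function are all hypothetical stand-ins for an environment's feedback):

```python
# Hedged sketch: an agent learns by trial and error which remediation action
# earns a reward (+1) or a penalty (-1) for each alert state.
import random

states = ["high_cpu", "disk_full", "service_down"]        # hypothetical states
actions = ["restart_service", "clear_logs", "scale_out"]  # hypothetical actions
alpha, epsilon = 0.1, 0.2            # learning rate, exploration probability
q_table = {(s, a): 0.0 for s in states for a in actions}

def reward(state, action):
    # Stand-in for the environment: +1 for the (assumed) correct fix.
    correct = {"high_cpu": "scale_out", "disk_full": "clear_logs",
               "service_down": "restart_service"}
    return 1.0 if correct[state] == action else -1.0

random.seed(0)
for _ in range(5000):
    state = random.choice(states)
    # Epsilon-greedy policy: mostly exploit the best-known action, sometimes explore.
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(actions, key=lambda a: q_table[(state, a)])
    r = reward(state, action)
    q_table[(state, action)] += alpha * (r - q_table[(state, action)])

for s in states:
    print(s, "->", max(actions, key=lambda a: q_table[(s, a)]))
```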

Figure 5-2 summarizes the algorithms that we have discussed in machine learning.

Figure 5-2

Types of ML algorithms: supervised (regression, classification), unsupervised (clustering, association), and reinforcement learning (classification, control)

Differences Between Supervised and Unsupervised Learning

In supervised learning, the goal is to predict outcomes for new data by leveraging what was learned from the training data. In this model, the categories of the result are known, so you know the end result will come from a set of possible outcomes. With an unsupervised learning algorithm, the goal is to get insights from the available data, and the algorithm decides on the features from the data provided. The outputs of unsupervised learning, and their categories, are not known in advance.

Supervised learning models are ideal for data where we know the categories and we have sufficient training data available to train the machine learning algorithm. Supervised learning algorithms are used extensively in price prediction, forecasting, sentiment analysis, and other classification and regression tasks. Unsupervised machine learning is used extensively in recommendation engines, anomaly detection, etc.

Supervised learning algorithms are more accurate since the training data is curated, validated, and provided by subject-matter experts, while unsupervised algorithms are less accurate and require human intervention to interpret the results. Training supervised learning algorithms requires time and effort to create the training data, while unsupervised learning has no such requirement.

With an understanding of all three learning techniques, let’s understand how to choose from various options that are available to us for implementing these technologies.

Choosing the Machine Learning Approach

Choosing the right type of machine learning approach depends on various factors.
  • Objective: Is the objective or goal to understand the data, its correlations, and its features in more detail? Is the goal to cluster similar data together for analysis? Or is the goal a probabilistic prediction of a discrete or continuous variable? Depending on the end goal of the application, you need to choose between supervised and unsupervised machine learning.

  • Data: Is training and labeled data available, or can it be made available? If yes, then you can go with supervised learning; otherwise, in the absence of training data, you need to opt for unsupervised machine learning techniques.

The ML techniques discussed so far for AIOps may need to deal with a lot of textual data, not just metrics. Natural language processing (NLP) makes it possible to analyze text data using ML techniques and is discussed next.

Natural Language Processing

NLP is one of the key research fields within the AI domain; it deals with processing the information contained (or rather hidden) in natural language text. NLP uses language semantics and syntax to determine the context as well as the real meaning and emotions being conveyed via text.

NLP enables an AIOps system to mine knowledge from the various runbooks and knowledge articles available within the organization, as well as to search the latest information available in vendor repositories or online communities. It is practically impossible for an individual to scan and ingest knowledge from the thousands of documents available and then perform specific tasks or actions in a limited time window. Knowledge obtained through NLP helps tremendously in improving reinforcement learning algorithms, executing automated resolutions efficiently, and providing recommendations and guidance to SRE and DevOps teams while dealing with an outage.

Chatbots, once considered optional, have become a must-have for business, especially in the service industry. Thanks to this emerging technology, end users are given self-service capabilities, which makes them feel empowered. With NLP, chatbots can hold meaningful, human-like conversations instead of just giving predefined, limited responses to queries. This becomes a huge differentiator in an AIOps system. Let's understand natural language processing in more detail.

What Is Natural Language Processing?

Natural human language is complex; it has many variations that may mean the same thing. It is ambiguous at times, it depends on the context of the object or situation, and it is extremely diverse. Thus, it comes with its own set of challenges when we try to make machines interpret or decipher language. We haven't used the word understand here, because machines may not have an understanding of language the way we do; deciphering and interpreting what we are trying to communicate is more appropriate.

NLP is the domain in artificial intelligence that makes it possible for computers to understand human language. NLP analyzes the grammatical structure of sentences and the individual meaning of words; it uses machine learning and deep learning algorithms and other specific NLP techniques to extract meaning from the provided input. Thus, NLP is the technology that makes machines decipher human language, and machines can take input using natural human language rather than software code.

NLP technology is used extensively in various use cases; however, it is most visible in the form of virtual assistants like Apple Siri, Google Assistant, Microsoft Cortana, and Amazon Alexa. You will also find cognitive virtual assistants in many applications and websites where you can type in your query in human language and the system will be able to decipher it and return an answer. All these systems use NLP technologies behind the scenes.

Beyond the chatbots and virtual assistants, there are many other applications that use NLP. Text recommendations or next word suggestions when you are trying to key in terms in a search engine, language translation when you use Google Translation Services, spam filtering in email, sentiment analysis of posts or Twitter feeds—all these use cases use NLP technologies.

In a nutshell, the goal of NLP is to make human language—which is complex, ambiguous, and extremely diverse—easy for machines to understand.

NLP and machine learning are both subsets of artificial intelligence. To create NLP algorithms, machine learning technologies are used. Since it is a domain in itself, NLP is called out separately as a domain in artificial intelligence technologies. There are techniques, algorithms, and systems entirely devoted to NLP.

NLP applies two techniques to help computers understand text: syntactic analysis and semantic analysis.

Syntactic Analysis

Syntactic analysis, or parsing, analyzes text using basic grammar rules to identify sentence structure, how words are organized, and how words relate to each other. Some of its main subtasks include the following (a minimal sketch of several of these follows the list):
  • Tokenization consists of breaking up text into smaller parts called tokens to make text easier to handle. This is the first step in NLP to break down the text into tokens so that it can be processed further by the NLP engine.

  • Part-of-speech tagging (PoS tagging) labels the tokens generated by tokenization as noun, adjective, verb, adverb, etc. This helps the NLP engine infer the meaning of a word in a particular context. For example, the word saw can be the past tense of the verb see, or it can be a noun referring to the cutting tool.

  • Stop-word removal removes frequently occurring words that don’t add any semantic value, such as I, they, have, like, yours, etc.

  • Stemming converts a word into its root form by removing suffixes. For example, Studying is converted to Study. It is a crude heuristic process that trims the ends of words; it finds the correct root most of the time and is fast and less resource intensive.

  • Lemmatization also converts a word into its root, but by using vocabulary and morphological analysis of the word, with the aim of removing inflectional endings and returning the base or dictionary form. For example, it replaces an inflected word with its base form, so Saw is converted into See. Lemmatization is more accurate but more resource intensive and complex.

  • Anaphora resolution deals with the problem of resolving what a pronoun such as he or a noun phrase such as the CEO refers to.
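
The following hedged sketch runs several of these subtasks with the NLTK library (assuming NLTK is installed; resource names such as punkt and averaged_perceptron_tagger can vary slightly across NLTK versions):

```python
# Hedged sketch: tokenization, PoS tagging, stop-word removal, stemming,
# and lemmatization with NLTK on a one-line operations message.
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

for resource in ("punkt", "stopwords", "wordnet", "averaged_perceptron_tagger"):
    nltk.download(resource, quiet=True)

text = "The servers were failing repeatedly during nightly backups"
tokens = nltk.word_tokenize(text)                      # tokenization
print(nltk.pos_tag(tokens))                            # PoS tagging
content = [t for t in tokens
           if t.lower() not in stopwords.words("english")]  # stop-word removal

stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
print([stemmer.stem(t) for t in content])              # crude: "repeatedli"
print([lemmatizer.lemmatize(t, pos="v") for t in content])  # "failing" -> "fail"
```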

Semantic Analysis

Semantic analysis focuses on capturing the meaning of text. It follows a two-step process. First it tries to extract the meaning of each individual word, and then it looks at the combination of words to decipher what they mean in a particular context.
  • Word-sense disambiguation: This deals with the problem of determining the true meaning or sense of a word that is used in the sentence. This impacts the output of anaphora resolution as well.

  • Relationship extraction: Relationship extraction analyzes the textual data and tries to find relationships between various entities. Entities can be various nouns such as people, places, geographies, objects, etc.

NLP tools and techniques are highly applicable in the AIOps domain. Chatbots or cognitive virtual agents are one of the modules in AIOps powered by NLP technologies. Within the cognitive virtual assistant there are multiple NLP services and techniques that are combined to deliver the cognitive virtual assistant. With an understanding of the NLP technique, let’s discuss its AIOps use cases.

NLP AIOps Use Cases

NLP plays a crucial role in AIOps systems in understanding the textual data in alert messages (SNMP traps, emails, event IDs, etc.), incidents, user chat messages, log contents, and many other sources that convey issues or feedback. NLP enables AIOps systems to interpret such text and take the appropriate actions. Let's explore some of the NLP use cases in AIOps.

Sentiment Analysis

Sentiment analysis identifies emotions in text and classifies the data as positive, negative, or neutral.

While using ticket data or chat data, sentiment analysis is a key use case. For every ticket feedback and for every conversation initiated and closed with the agents, you can leverage sentiment analysis to find out how the users feel about the services rendered. This comes in handy during a cognitive virtual assistant conversation as well because the cognitive virtual agent can tailor its responses based on the sentiment of the user. As an example, if a user is angry, the cognitive virtual agent can start the conversation with a pacifying statement like “sorry for the inconvenience caused” and make the conversation more natural and human-like rather than a robotic conversation where the sentiment may get ignored.
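
As a hedged example, NLTK ships a lexicon-based sentiment scorer (VADER) that returns a compound score per text; the thresholds below for mapping scores to labels are a common convention, not a fixed rule:

```python
# Hedged sketch: scoring ticket feedback with NLTK's VADER lexicon.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

for feedback in ["The issue was fixed quickly, great support!",
                 "Still broken after three days, this is unacceptable."]:
    score = sia.polarity_scores(feedback)["compound"]
    label = ("positive" if score > 0.05
             else "negative" if score < -0.05 else "neutral")
    print(label, round(score, 2), "-", feedback)
```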

Language Translation

Many IT operations teams support multiple geographies and languages, and it is difficult for IT service administrators and service desk agents to provide 24/7 coverage for every language. Thus, language translation services come in handy: service desk agents can translate ticket or conversation data and use it to pinpoint and resolve the user's issue. Language translation is used both in cognitive virtual assistants and in user-to-agent communication. Translation services are also useful for analyzing ticket data that may have been provided in different languages.

Text Extraction

Text extraction enables you to pull out predefined information from text. IT teams typically deal with large volumes of data in various knowledge management repositories, in known error databases (KEDBs) in IT service management systems, and in voluminous technical documentation that provides information for root-cause analysis, troubleshooting, and resolution. All this information becomes overwhelming for operations teams. AIOps systems extract the relevant information from these repositories and present the right information to administrators and service desk agents so they can troubleshoot and resolve issues or complete service requests and changes.

Topic Classification

Topic classification helps the AIOps engine organize the unstructured text available in various repositories into categories. Most of the time, information is spread across multiple systems, and any directory or categorization structure that was initially created becomes unmanageable. Content for related categories and topics gets spread across various repositories, and manually maintaining the topics and relevant documents becomes impossible. The AIOps engine helps you organize this unstructured data into categories. This feature can also be used to tag tickets into categories in IT service management.
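
A minimal topic-classification sketch follows, pairing TF-IDF features with a Naive Bayes classifier from scikit-learn (the tickets and topic labels are synthetic stand-ins; real deployments train on thousands of historical tickets):

```python
# Hedged sketch: tagging tickets into topics with TF-IDF + Naive Bayes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

tickets = ["cannot connect to the database server",
           "disk space running low on backup volume",
           "database query timing out under load",
           "storage array reporting degraded disk"]
topics = ["database", "storage", "database", "storage"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(tickets, topics)
print(model.predict(["database connection timing out"]))  # expected: ['database']
```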

That covers the overview of NLP techniques and their use cases; now we will move on to the last and most complex technique used in AIOps: deep learning.

Deep Learning

We covered machine learning and its branches in the previous sections. As introduced earlier, deep learning is a subset of machine learning or a specialized way of making machines learn from data. Deep learning functions in a similar fashion as machine learning; however, the technologies used to achieve the end results and the capabilities that deep learning provides are a bit different.

Though the terms machine learning and deep learning are often used interchangeably, the two differ technically in their approach to learning.

Deep learning is a subfield of machine learning that structures algorithms in layers to create an “artificial neural network” that can learn and make intelligent decisions on its own.

A deep learning model is designed to analyze data in a structure similar to how humans draw conclusions. Deep learning algorithms use a layered structure called an artificial neural network to process the data and learn various features in the data.

Artificial neural networks have existed in theory for a long time; however, their application became possible only once computing technology progressed enough to provide the compute resources necessary to run these models. Thus, neural networks and deep learning became practical and produced remarkable solutions in the last decade. Google's DeepMind created AlphaGo, which defeated the world's top human players at Go, demonstrating the arrival of computing intelligence that could be applied to games requiring billions of computations to solve.

Artificial neural networks were inspired by biological neural networks in animal and human brains. The biological neural networks learn progressively by seeing and learning from data without any task-specific program guiding them. A human is able to recognize a cat from a dog after just seeing a few samples of cats and dogs; however, a machine would need much more data to learn, but it surely learns the difference and is able to classify them under the right species after learning.

Artificial neural networks are created using layers of neurons. Different layers called hidden layers perform a different transformation on the input as the signal moves from the input layer through the various hidden layers to the output layer, which finally arrives at the result as shown in Figure 5-3.

Figure 5-3

Deep learning neural network: an input layer, multiple hidden layers, and an output layer, each a column of interconnected neurons

Neural networks can have a huge number of neurons and connections spanning millions of connections for complex use cases.

Neural networks and deep learning are being used in various use cases such as NLP, machine translation, image recognition, video gameplays, spam filtering, search engines, etc.

Neural networks at a fundamental level comprise inputs, weights, a bias (threshold), and an output. The neural network adjusts the weights as training data is fed into the system, using backpropagation to find out which neurons contribute to errors in mapping to the correct output. After the neural network has been trained on all the data over multiple iterations, the weights of each neuron are configured such that the collective response as an output is accurate.

Threshold is an important parameter in neural networks; it leads to the activation of the node if the value generated is above the threshold, and it then sends the data to the next layer. Thus, the output of one neuron in a layer impacts the input and thus the output of the next layer of neurons to which it is connected. The hidden layers comprising multiple neurons have their own activation functions and keep passing the information based on the threshold to the next layer.

The idea is that each layer discovers features in the input, with successive layers combining them into progressively higher-level features, much as a human brain first detects simple patterns and then works out what an object is and what its features are.

The general equation for a neuron in the network is as follows, where $w_i$ are the weights and $x_i$ are the features:

$$\sum_{i=1}^{m} w_i x_i + \text{bias} = w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_m x_m + \text{bias}$$

Neural networks use backpropagation, which allows the weights of neurons to be adjusted in response to the error they contribute to the output. This lets the weights, and the model, progressively reach higher levels of accuracy and converge, so that the model can then be deployed for the use case for which it is being trained.
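
The following toy NumPy sketch ties the equation and backpropagation together for a single sigmoid neuron (a real network repeats this across many layers and neurons; the feature values, target, and learning rate are arbitrary):

```python
# Hedged sketch: forward pass (weighted sum + bias through a sigmoid) and
# repeated gradient updates that shrink the output error.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.8, 0.2])        # features x1..x3
w = np.array([0.1, -0.3, 0.7])       # weights w1..w3
bias, target, lr = 0.05, 1.0, 0.5

for _ in range(100):
    z = np.dot(w, x) + bias          # sum of w_i * x_i, plus bias
    out = sigmoid(z)                 # activation
    grad = (out - target) * out * (1 - out)  # dLoss/dz for squared error
    w -= lr * grad * x               # adjust each weight by its contribution
    bias -= lr * grad

print(round(float(out), 3))          # approaches the target of 1.0
```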

Though a deep learning model needs much higher computing resources and data, the flexibility of the model and its ability to learn complex features makes it a compelling proposition for various use cases. Deep learning is generally deployed in use cases where there is higher complexity, the amount of data is much higher, and there is availability of data for training the deep learning model.

Summary

This chapter covered the basics of machine learning, deep learning, natural language processing, and how some of the capabilities can be used in the AIOps domain. We covered various techniques including supervised and unsupervised machine learning as well as NLP. These techniques provide the foundation on which AIOps platforms are built. In the next chapter, we will look at specific AIOps use cases and how we can use these techniques in those cases. We will start with the most common use case of deduplication in the next chapter.
