Publisher Summary

This chapter discusses advanced techniques for data classification, starting with Bayesian belief networks, which do not assume class conditional independence. Bayesian belief networks allow class conditional independencies to be defined between subsets of variables. They provide a graphical model of causal relationships, on which learning can be performed. Trained Bayesian belief networks can be used for classification. Backpropagation is a neural network algorithm for classification that employs a method of gradient descent. It searches for a set of weights that can model the data so as to minimize the mean-squared distance between the network's class prediction and the actual class label of data tuples. Rules may be extracted from trained neural networks to help improve the interpretability of the learned network. In general terms, a neural network is a set of connected input/output units in which each connection has a weight associated with it. The weights are adjusted during the learning phase to help the network predict the correct class label of the input tuples. A more recent approach to classification, known as support vector machines, transforms training data into a higher dimension, where it finds a hyperplane that separates the data by class using essential training tuples called support vectors. Classification using frequent patterns, which explores relationships between attribute–value pairs that occur frequently in data, is also described. This methodology builds on research on frequent pattern mining. Lazy learners, or instance-based methods of classification, such as nearest-neighbor classifiers and case-based reasoning classifiers, store all of the training tuples in pattern space and wait until presented with a test tuple before performing generalization; these methods are also presented. Other approaches to classification, such as genetic algorithms, rough sets, and fuzzy logic techniques, are introduced. Multiclass classification, semi-supervised classification, active learning, and transfer learning are explored.

In this chapter, you will learn advanced techniques for data classification. We start with Bayesian belief networks (Section 9.1), which, unlike naïve Bayesian classifiers, do not assume class conditional independence. Backpropagation, a neural network algorithm, is discussed in Section 9.2. In general terms, a neural network is a set of connected input/output units in which each connection has a weight associated with it. The weights are adjusted during the learning phase to help the network predict the correct class label of the input tuples. A more recent approach to classification known as support vector machines is presented in Section 9.3. A support vector machine transforms training data into a higher dimension, where it finds a hyperplane that separates the data by class using essential training tuples called support vectors. Section 9.4 describes classification using frequent patterns, exploring relationships between attribute–value pairs that occur frequently in data. This methodology builds on research on frequent pattern mining (Chapters 6 and 7).
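To make the idea of weights adjusted by gradient descent concrete, the following is a minimal sketch, not the full backpropagation algorithm of Section 9.2: a single sigmoid unit trained by gradient descent on squared error. The toy AND-function data, zero initialization, learning rate, and epoch count are all illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Training tuples: two binary inputs, class label = logical AND (toy data).
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]  # connection weights, adjusted during the learning phase
b = 0.0         # bias term
lr = 1.0        # learning rate (illustrative choice)

for _ in range(20000):
    for x, y in data:
        out = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        # Gradient of the squared error (out - y)^2 through the sigmoid.
        delta = (out - y) * out * (1 - out)
        w[0] -= lr * delta * x[0]
        w[1] -= lr * delta * x[1]
        b -= lr * delta

# After training, thresholding the unit's output recovers the class labels.
predictions = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + b)) for x, _ in data]
print(predictions)
```

A multilayer network trained by backpropagation repeats this same update, propagating the error gradient backward through each layer of weights.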

Section 9.5 presents lazy learners or instance-based methods of classification, such as nearest-neighbor classifiers and case-based reasoning classifiers, which store all of the training tuples in pattern space and wait until presented with a test tuple before performing generalization. Other approaches to classification, such as genetic algorithms, rough sets, and fuzzy logic techniques, are introduced in Section 9.6. Section 9.7 introduces additional topics in classification, including multiclass classification, semi-supervised classification, active learning, and transfer learning.
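The lazy, instance-based strategy above can be sketched as a tiny k-nearest-neighbor classifier: all training tuples stay in memory, and computation is deferred until a test tuple arrives. The 2-D points, class labels, and choice of k below are illustrative assumptions, not examples from the text.

```python
import math
from collections import Counter

def knn_classify(training, test_point, k=3):
    """Majority vote among the k Euclidean-nearest stored training tuples."""
    by_distance = sorted(
        training,
        key=lambda item: math.dist(item[0], test_point),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Stored training tuples in pattern space: (point, class label).
training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

print(knn_classify(training, (2.0, 2.0)))  # neighbors are all class "A"
```

Note that "training" here is just storage; the distance computations and vote happen only at classification time, which is what makes the learner lazy.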
