An overview of classification methods

The classification task is one of the basic tasks of applied statistics and machine learning, and of artificial intelligence (AI) as a whole. One reason is that classification is among the most understandable and easy-to-interpret data analysis techniques, and classification rules can be formulated in natural language. In machine learning, a classification task is solved with supervised algorithms because the classes are defined in advance and the objects in the training set have class labels. Analytical models that solve a classification task are called classifiers.

Classification is the process of assigning an object to one of a set of predetermined classes based on its formalized features. Each object in this problem is usually represented as a vector in N-dimensional space, where each dimension describes one feature of the object.
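As a minimal sketch of this representation, the following Python snippet turns an object into a feature vector. The `Customer` type, its three features, and the `to_feature_vector` helper are hypothetical names chosen only for illustration; they do not come from any particular library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    # Hypothetical object described by three formalized features.
    age_years: float
    monthly_spend: float
    visits_per_month: float


def to_feature_vector(customer: Customer) -> List[float]:
    # Each position in the list is one dimension of the feature space.
    return [customer.age_years, customer.monthly_spend, customer.visits_per_month]


customer = Customer(age_years=34.0, monthly_spend=120.5, visits_per_month=4.0)
x = to_feature_vector(customer)
n_dimensions = len(x)  # N, the dimensionality of the feature space
```

Here N equals the number of features chosen to describe the object, so adding a feature adds a dimension.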

We can formulate the classification task with mathematical notation. Let $X$ denote the set of descriptions of objects, and let $Y$ be a finite set of names or class labels. There is an unknown target function, namely the mapping $y^*: X \to Y$, whose values are known only on the objects of the finite training sample $X^m = \{(x_1, y_1), \ldots, (x_m, y_m)\}$. We have to construct an algorithm $a: X \to Y$ capable of classifying an arbitrary object $x \in X$. In mathematical statistics, classification problems are also called discriminant analysis problems.
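The formulation above can be sketched in a few lines of Python. The example builds a classifier, that is, a function from object descriptions in X to labels in Y, out of a finite labeled training sample. It uses a 1-nearest-neighbor rule purely for illustration, since it is one of the simplest ways to obtain such a function. The names `fit_nearest_neighbor` and `training_sample`, and the toy data, are assumptions made for this sketch.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]  # an object description from X
Label = str               # a class label from the finite set Y


def fit_nearest_neighbor(
    sample: List[Tuple[Vector, Label]]
) -> Callable[[Vector], Label]:
    """Build a classifier from a finite labeled training sample."""
    if not sample:
        raise ValueError("The training sample must not be empty")

    def classify(x: Vector) -> Label:
        # Return the label of the training object closest to x
        # (Euclidean distance); this is the 1-nearest-neighbor rule.
        _, label = min(sample, key=lambda pair: math.dist(pair[0], x))
        return label

    return classify


# Toy training sample: the target function is known only on these objects.
training_sample = [
    ([1.0, 1.0], "low_risk"),
    ([1.5, 2.0], "low_risk"),
    ([8.0, 9.0], "high_risk"),
    ([9.0, 8.5], "high_risk"),
]

a = fit_nearest_neighbor(training_sample)
prediction_near_low = a([2.0, 1.0])
prediction_near_high = a([8.5, 9.5])
```

The returned function accepts any vector of matching dimensionality, not just the objects in the training sample, which is the point of the task: generalizing from a finite sample to arbitrary objects.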

The classification task is applicable to many areas, including the following:

Trade: The classification of customers and products allows a business to optimize marketing strategies, stimulate sales, and reduce costs.
Telecommunications: The classification of subscribers allows a business to appraise customer loyalty, and therefore develop loyalty programs.
Medicine and health care: The classification of the population into risk groups assists in the diagnosis of disease.
Banking: The classification of customers is used for credit-scoring procedures.

The classification task can be solved by using the following methods:

Logistic regression
The kNN method
Support vector machine (SVM)
Discriminant analysis
Decision trees
Neural networks

We looked at discriminant analysis in Chapter 6, Dimensionality Reduction, as an algorithm for dimensionality reduction, but most libraries also provide an application programming interface (API) for using the discriminant analysis algorithm as a classifier. We will discuss decision trees in Chapter 9, Ensemble Learning, which focuses on algorithm ensembles. We will discuss neural networks in Chapter 10, Neural Networks for Image Classification.

Now that we've discussed what the classification task is, let's look at various classification methods.
