Chapter 6. Learning to Recognize Traffic Signs

The goal of this chapter is to train a multiclass classifier to recognize traffic signs. In this chapter, we will cover the following topics:

  • Supervised learning concepts
  • Feature extraction from the German Traffic Sign Recognition Benchmark (GTSRB) dataset
  • Support vector machines (SVMs)

We have previously studied how to describe objects by means of keypoints and features, and how to find corresponding points in two different images of the same physical object. However, those approaches were rather limited when it comes to recognizing objects in real-world settings and assigning them to conceptual categories. For example, in Chapter 2, Hand Gesture Recognition Using a Kinect Depth Sensor, the object of interest was a hand, and it had to be placed neatly in the center of the screen. Wouldn't it be nice if we could remove these restrictions?

In this chapter, we will instead train a Support Vector Machine (SVM) to recognize all sorts of traffic signs. Although SVMs are binary classifiers (that is, they can be used to learn, at most, two categories: positives and negatives, animals and non-animals, and so on), they can be extended to be used in multiclass classification. In order to achieve good classification performance, we will explore a number of color spaces as well as the Histogram of Oriented Gradients (HOG) feature. Then, classification performance will be judged based on accuracy, precision, and recall. The following sections will explain all of these terms in detail.

To arrive at such a multiclass classifier, we need to perform the following steps:

  1. Preprocess the dataset: We need a way to load our dataset, extract the regions of interest, and split the data into appropriate training and test sets.
  2. Extract features: Chances are that raw pixel values are not the most informative representation of the data. We need a way to extract meaningful features from the data, such as features based on different color spaces and HOG.
  3. Train the classifier: We will train the multiclass classifier on the training data in two different ways: the one-vs-all strategy (where we train a single SVM per class, with the samples of that class as positive samples and all other samples as negatives), and the one-vs-one strategy (where we train a single SVM for every pair of classes, with the samples of the first class as positive samples and the samples of the second class as negative samples).
  4. Score the classifier: We will evaluate the quality of the trained ensemble classifier by calculating different performance metrics, such as accuracy, precision, and recall.
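To make the two training strategies in step 3 concrete, the following minimal sketch shows how a multiclass label vector could be broken into binary problems. It is plain Python and does not train any SVMs; the helper names one_vs_all_tasks and one_vs_one_tasks are hypothetical and are not part of the chapter's code.

```python
from itertools import combinations


def one_vs_all_tasks(labels):
    """Build one binary problem per class.

    Returns a list of (class_id, binary_labels), where binary_labels
    marks samples of class_id as 1 (positive) and all others as 0.
    """
    classes = sorted(set(labels))
    return [(c, [1 if y == c else 0 for y in labels]) for c in classes]


def one_vs_one_tasks(labels):
    """Build one binary problem per pair of classes.

    Returns a list of (class_a, class_b, indices, binary_labels).
    Only samples of the two classes are kept (their positions are in
    indices); class_a is positive (1) and class_b is negative (0).
    """
    classes = sorted(set(labels))
    tasks = []
    for a, b in combinations(classes, 2):
        idx = [i for i, y in enumerate(labels) if y in (a, b)]
        tasks.append((a, b, idx, [1 if labels[i] == a else 0 for i in idx]))
    return tasks


# Toy label vector with three classes (hypothetical data).
labels = [0, 1, 2, 1, 0, 2, 2]
ova = one_vs_all_tasks(labels)
ovo = one_vs_one_tasks(labels)
print(len(ova), "one-vs-all problems,", len(ovo), "one-vs-one problems")
```

With n classes, one-vs-all needs n binary classifiers, whereas one-vs-one needs n(n-1)/2. For the 10 sign categories used here, that is 10 versus 45 SVMs, although each one-vs-one SVM sees only the samples of its two classes.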

The end result will be an ensemble classifier that achieves a nearly perfect score in classifying 10 different street sign categories:

[Figure: Learning to Recognize Traffic Signs]

Planning the app

The final app will parse a dataset, train the ensemble classifier, assess its classification performance, and visualize the result. This will require the following components:

  • main: The main routine (in chapter6.py) that starts the application.
  • datasets.gtsrb: A script for parsing the German Traffic Sign Recognition Benchmark (GTSRB) dataset. This script contains the following functions:
    • load_data: A function used to load the GTSRB dataset, extract a feature of choice, and split the data into training and test sets.
    • _extract_features: A function that is called by load_data to extract a feature of choice from the dataset.
  • classifiers.Classifier: An abstract base class that defines the common interface for all classifiers.
  • classifiers.MultiClassSVM: A class that implements an ensemble of SVMs for multiclass classification using the following public methods:
    • MultiClassSVM.fit: A method used to fit the ensemble of SVMs to training data. It takes a matrix of training data as input, where each row is a training sample and the columns contain feature values, and a vector of labels.
    • MultiClassSVM.evaluate: A method used to evaluate the ensemble of SVMs by applying it to some test data after training. It takes a matrix of test data as input, where each row is a test sample and the columns contain feature values, and a vector of labels. The method returns three different performance metrics: accuracy, precision, and recall.
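The fit/evaluate interface described above might look roughly like the following sketch. It is an illustration under stated assumptions, not the chapter's implementation: MajorityClassifier is a hypothetical stand-in that always predicts the most frequent training label (it is not an SVM), the score helper is also hypothetical, and reporting precision and recall per class is an assumption about how the three metrics are organized.

```python
from abc import ABC, abstractmethod
from collections import Counter


class Classifier(ABC):
    """Common interface: every classifier can be fitted and evaluated."""

    @abstractmethod
    def fit(self, X_train, y_train):
        """Fit the model to training samples (rows) and their labels."""

    @abstractmethod
    def evaluate(self, X_test, y_test):
        """Return (accuracy, precision, recall) on the test data."""


def score(y_true, y_pred, classes):
    """Compute accuracy plus per-class precision and recall (assumed layout)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision, recall = {}, {}
    for c in classes:
        true_pos = sum(t == c and p == c for t, p in pairs)
        predicted = sum(p == c for _, p in pairs)  # samples labeled c by the model
        actual = sum(t == c for t, _ in pairs)     # samples that really are c
        precision[c] = true_pos / predicted if predicted else 0.0
        recall[c] = true_pos / actual if actual else 0.0
    return accuracy, precision, recall


class MajorityClassifier(Classifier):
    """Hypothetical placeholder (not an SVM): predicts the most common label."""

    def fit(self, X_train, y_train):
        self.classes_ = sorted(set(y_train))
        self.majority_ = Counter(y_train).most_common(1)[0][0]

    def evaluate(self, X_test, y_test):
        y_pred = [self.majority_ for _ in X_test]
        return score(y_test, y_pred, self.classes_)


# Toy data: each row is a sample, each column a feature value.
clf = MajorityClassifier()
clf.fit([[0.1], [0.2], [0.9]], [0, 0, 1])
accuracy, precision, recall = clf.evaluate([[0.3], [0.8]], [0, 1])
print(accuracy, precision, recall)
```

Keeping the interface in an abstract base class means that a different classifier can be swapped in later without changing the code that calls fit and evaluate.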

In the following sections, we will discuss these components in detail.
