Introducing image recognition

A typical goal of image recognition is to detect and identify an object in a digital image. Image recognition is applied in factory automation to monitor product quality; in surveillance systems to identify potentially risky activities, such as moving persons or vehicles; in security applications to provide biometric identification through fingerprints, irises, or facial features; in autonomous vehicles to reconstruct road and environmental conditions; and so on.

Digital images are not presented in a structured way with attribute-based descriptions; instead, each pixel is encoded as the amount of color in one or more channels, for instance, a single grayscale channel or separate red, green, and blue (RGB) channels. The learning goal is to identify patterns that are associated with a particular object. The traditional approach to image recognition consists of transforming an image into different forms, for instance, to identify object corners, edges, same-color blobs, and basic shapes. Such patterns are then used to train a learner to distinguish between objects. Some notable examples of traditional algorithms are listed here:

Edge detection finds the boundaries of objects within an image
Corner detection identifies intersections of two edges or other interesting points, such as line endings, curvature maxima or minima, and so on
Blob detection identifies regions that differ in a property, such as brightness or color, from their surrounding regions
Ridge detection identifies additional interesting points in the image using smooth functions
Scale-invariant feature transform (SIFT) is a robust algorithm that can match objects, even if their scale or orientation differs from the representative samples in the database
Hough transform identifies particular shapes in the image, such as lines and circles
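
To make the channel encoding and the first item in the list concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the 6x4 image is made up, the grayscale conversion uses the common ITU-R BT.601 luma weights, and the Sobel operator is just one widely used edge detector (the text does not name a specific one).

```python
import math

# Hypothetical 6x4 RGB image: dark left half, bright right half.
# Each pixel is a (red, green, blue) tuple with values in 0..255.
DARK = (10, 10, 10)
BRIGHT = (200, 200, 200)
image = [[DARK, DARK, DARK, BRIGHT, BRIGHT, BRIGHT] for _ in range(4)]


def to_grayscale(rgb_image):
    """Collapse the three color channels into one brightness value per pixel.

    Assumption: ITU-R BT.601 luma weights (0.299, 0.587, 0.114).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


# Sobel kernels approximate the horizontal and vertical brightness gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(gray):
    """Return the gradient magnitude per pixel; border pixels are left at 0."""
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for ky in range(3):
                for kx in range(3):
                    pixel = gray[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * pixel
                    gy += SOBEL_Y[ky][kx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


gray = to_grayscale(image)
edges = sobel_magnitude(gray)
for row in edges:
    print([round(value) for value in row])
```

In this toy image, the response in the interior rows is large (roughly 760) in the two columns that straddle the dark-to-bright boundary and zero in the flat regions. Values like these, rather than raw pixels, are the kind of pattern a traditional learner would be trained on.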

A more recent approach is based on deep learning. Deep learning builds on neural networks, which are loosely inspired by how the brain processes information. The main advantage of deep learning is that it's possible to design neural networks that automatically extract relevant patterns, which in turn can be used to train a learner. With recent advances in neural networks, image recognition accuracy has improved significantly. For instance, in the ImageNet challenge, where competitors are provided with more than 1.2 million images from 1,000 different object categories, the error rate of the best algorithm fell from 28% in 2010, using support vector machines (SVM), to only 7% in 2014, using a deep neural network.

In this chapter, we'll take a quick look at neural networks, starting from the basic building block, the perceptron, and gradually introducing more complex structures.
