Template matching for object detection

Before we start with the shape-analysis and feature-analysis algorithms, we are going to learn about an easy-to-use, extremely powerful method of object detection called template matching. Strictly speaking, this algorithm does not use any knowledge about the shape of an object; instead, it uses a previously acquired template image of the object to produce a template-matching result, from which objects of known appearance, size, and orientation can be extracted. You can use the matchTemplate function in OpenCV to perform a template-matching operation. Here's an example that demonstrates the complete usage of the matchTemplate function:

#include <opencv2/opencv.hpp> 

using namespace cv; 

// Load the template (object) and scene images and convert them to grayscale 
Mat object = imread("Object.png"); 
Mat objectGr; 
cvtColor(object, objectGr, COLOR_BGR2GRAY); 
Mat scene = imread("Scene.png"); 
Mat sceneGr; 
cvtColor(scene, sceneGr, COLOR_BGR2GRAY); 
 
// Choose the template-matching method 
TemplateMatchModes method = TM_CCOEFF_NORMED; 
 
// result will hold a comparison score for every candidate location in the scene 
Mat result; 
matchTemplate(sceneGr, objectGr, result, method); 

method must be an entry from the TemplateMatchModes enum, which can be any of the following values:

  • TM_SQDIFF
  • TM_SQDIFF_NORMED
  • TM_CCORR
  • TM_CCORR_NORMED
  • TM_CCOEFF
  • TM_CCOEFF_NORMED

For detailed information about each template-matching method, you can refer to the OpenCV documentation. For our practical examples, and to learn how the matchTemplate function is used in practice, it is important to note that each method produces a different type of result, which consequently requires a different interpretation, as we'll learn in this section. In the preceding example, we are trying to detect an object in a scene by using an object image and a scene image. Let's assume the following images are the object (left-hand side) and the scene (right-hand side) that we'll be using:

The very simple idea behind template matching is that we search the scene image on the right-hand side for the point that is most likely to contain the image on the left-hand side, in other words, the template image. Depending on the method that is used, the matchTemplate function produces a map of comparison scores, one for each candidate location. Let's visualize the result of the matchTemplate function to better understand this concept. Another important thing to note is that we can only properly visualize the result of the matchTemplate function if we use one of the methods ending with _NORMED, since those produce a normalized result; otherwise, we have to call the normalize function to bring the values into the displayable range of the OpenCV imshow function. Here is how it can be done:

normalize(result, result, 0.0, 1.0, NORM_MINMAX, -1); 

This function call will scale all the values in result to the range 0.0 to 1.0, which can then be properly displayed. Here is how the resulting image will look if it is displayed using the imshow function:
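In case you want to reproduce this visualization yourself, the following is a minimal sketch of displaying the normalized result; the window title is an arbitrary choice:

// Display the normalized template-matching result 
imshow("Template Matching Result", result); 
waitKey(0); 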

As mentioned previously, the result of the matchTemplate function, and how it should be interpreted, depends entirely on the template-matching method that is used. If we use the TM_SQDIFF or TM_SQDIFF_NORMED methods for template matching, we need to look for the global minimum point in the result (shown with an arrow in the preceding image), since that point has the highest likelihood of containing the template image. Here's how we can find the global minimum point (along with the global maximum, and so on) in the template-matching result:

double minVal, maxVal; 
Point minLoc, maxLoc; 
// Find the global minimum and maximum values in result, and their locations 
minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc); 

Since the template-matching algorithm works only with objects of a fixed size and orientation, we can assume that a rectangle whose upper-left point equals the minLoc point and whose size equals the size of the template image is the best possible bounding rectangle for our object. We can draw this rectangle on the scene image, for better comparison, using the following sample code:

Rect rect(minLoc.x, 
          minLoc.y, 
          object.cols, 
          object.rows); 
 
Scalar color(0, 0, 255); // red, in BGR order 
int thickness = 2; 
rectangle(scene, 
          rect, 
          color, 
          thickness);
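
To inspect the drawn bounding rectangle, you can display the modified scene image or save it to disk; the window title and file name below are arbitrary choices:

// Show the scene image with the detected bounding rectangle 
imshow("Detected Object", scene); 
waitKey(0); 
 
// Optionally, save the result to disk 
imwrite("Result.png", scene); 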

The following image depicts the result of the object detection operation that was performed using the matchTemplate function:

If we use TM_CCORR, TM_CCOEFF, or their normalized versions, we must instead use the global maximum point as the point with the highest likelihood of containing our template image. The following image depicts the result of the TM_CCOEFF_NORMED method used with the matchTemplate function:

As you can see, the brightest point in the resultant image corresponds to the upper-left point of the template image in the scene image.
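
To make this interpretation explicit, here's a short sketch of how the best match location can be selected based on the method in use; it assumes the method, minLoc, and maxLoc variables from the previous snippets, and the matchLoc variable is introduced here purely for illustration:

// Squared-difference methods: the best match is the global minimum; 
// correlation-based methods: the best match is the global maximum 
Point matchLoc; 
if(method == TM_SQDIFF || method == TM_SQDIFF_NORMED) 
    matchLoc = minLoc; 
else 
    matchLoc = maxLoc; 
 
// matchLoc can then be used as the upper-left corner of the bounding rectangle, 
// exactly as minLoc was used before 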

Before ending our template-matching lesson, let's also note that the width and height of the template-matching result image are smaller than those of the scene image. This is because the result image only needs to hold one value for each possible upper-left position of the template image inside the scene, so its width and height equal the scene image's width and height minus the template image's width and height, plus one pixel in each direction.
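
As a quick sanity check, assuming the sceneGr, objectGr, and result variables from the earlier snippets, the following assertions confirm this relationship:

// One score per possible upper-left position of the template inside the scene 
CV_Assert(result.cols == sceneGr.cols - objectGr.cols + 1); 
CV_Assert(result.rows == sceneGr.rows - objectGr.rows + 1); 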
