Using the Continuously Adaptive Mean (CAM) Shift

To overcome the limitations of the Mean Shift algorithm, we can use an improved version of it called the Continuously Adaptive Mean Shift, or simply the CAM Shift algorithm. OpenCV implements the CAM Shift algorithm in a function named CamShift, which is used in an almost identical manner to the meanShift function. The input parameters of CamShift are the same as those of meanShift, since it also uses a back-projection image to update a search window under a given set of termination criteria. In addition, CamShift returns a RotatedRect object that describes the fitted bounding box of the tracked object, including its rotation angle.

Even without using the returned RotatedRect object, you can simply replace any call to the meanShift function with CamShift; the only difference is that the results will be scale-invariant, meaning the search window will grow as the object gets nearer (or bigger) and shrink as it gets farther away (or smaller). For instance, we can replace the call to the meanShift function in the preceding example code for the Mean Shift algorithm with the following:

CamShift(backProject, 
         srchWnd, 
         criteria); 

The following images depict the result of replacing the meanShift function with CamShift in the example from the previous section:

Notice that the results are now scale-invariant, even though we changed nothing except the function call. As the object moves farther from the camera, or becomes smaller, the same Mean Shift algorithm computes its position, but this time the search window is resized to fit the object, and its rotation is also calculated, although we have not used it yet. To use the rotation of the object, we first need to store the result of the CamShift function in a RotatedRect object, as in the following example:

RotatedRect rotRect = CamShift(backProject, 
                               srchWnd, 
                               criteria); 

To draw a RotatedRect object, or in other words a rotated rectangle, first use the points method of RotatedRect to extract its four corner points, and then connect them using the line function, as in the following example:

Point2f rps[4]; 
rotRect.points(rps); 
for(int i = 0; i < 4; i++) 
    line(frame, 
         rps[i], 
         rps[(i + 1) % 4], 
         Scalar(255, 0, 0), // blue (BGR) 
         2); 

You can also use a RotatedRect object to draw the rotated ellipse inscribed in the rotated rectangle. Here's how:

ellipse(frame, 
        rotRect, 
        Scalar(255,0,0), 
        2); 

The following image displays the result of using the RotatedRect object to draw a rotated rectangle and ellipse at the same time, over the tracked object:

In the preceding image, the red rectangle is the search window, the blue rectangle is the resulting rotated rectangle, and the green ellipse is drawn by using the resulting rotated rectangle.

To summarize, CamShift is far better suited than meanShift to dealing with objects of varying size and rotation. However, there are still a couple of possible enhancements when using the CamShift algorithm. First, the initial window size still needs to be set, but since CamShift takes care of size changes, we can simply set the initial search window to cover the whole image. This spares us from dealing with the initial position and size of the search window. If we also create the histogram of the object of interest from a previously saved file on disk, or by any similar method, we end up with an object detector and tracker that works out of the box, at least in cases where the object of interest has a visibly different color from its environment.

Another big improvement to such a color-based detection and tracking algorithm is to use the inRange function to enforce thresholds on the S and V channels of the HSV image from which we calculate the histogram. The reason is that in our example we simply used the hue (the H, or first) channel, without accounting for the high possibility of extremely dark or bright pixels sharing the same hue as our object of interest. This can be done with the following code when calculating the histogram of the object to be tracked:

int lbHue = 0,  hbHue = 180; 
int lbSat = 30, hbSat = 256; 
int lbVal = 30, hbVal = 230; 
 
Mat mask; 
inRange(objImgHsv, 
        Scalar(lbHue, lbSat, lbVal), 
        Scalar(hbHue, hbSat, hbVal), 
        mask); 
 
calcHist(&objImgHue, 
         nimages, 
         channels, 
         mask, 
         histogram, 
         dims, 
         histSize, 
         ranges, 
         uniform); 

In the preceding example code, the variables starting with lb and hb refer to the lower and higher bounds of the values allowed to pass through the inRange function. objImgHsv is a Mat object containing our object of interest, or an ROI that contains it, and objImgHue is the first channel of objImgHsv, extracted by a previous call to the split function. The rest of the parameters are nothing new; you have already used them in previous calls to the functions in this example.

Combining all of the algorithms and techniques described in this section lets you create an object detector, or even a face detector and tracker, that works in real time and at impressive speed. However, you might still need to account for noise interfering with the tracking, which is almost inevitable given the nature of color-based and histogram-based trackers. One of the most widely used solutions to these issues is the subject of the next section of this chapter.
