Chapter 8. Tracking Objects

In this chapter, we will explore the vast topic of object tracking, which is the process of locating a moving object in a movie or video feed from a camera. Real-time object tracking is a critical task in many computer vision applications such as surveillance, perceptual user interfaces, augmented reality, object-based video compression, and driver assistance.

Tracking objects can be accomplished in several ways, with the optimal technique being largely dependent on the task at hand. We will learn how to identify moving objects and track them across frames.

Detecting moving objects

The first task we need to accomplish before we can track anything in a video is to identify the regions of a video frame that correspond to moving objects.

There are many ways to track objects in a video, each fulfilling a slightly different purpose. For example, you may want to track anything that moves, in which case differences between frames are going to be of help; you may want to track a hand moving in a video, in which case MeanShift based on the color of the skin is the most appropriate solution; or you may want to track a particular object whose appearance you already know, in which case techniques such as template matching will be of help.
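To make that last idea concrete, here is a minimal sketch of template matching with OpenCV. The file names frame.png and template.png are placeholders for your own images, and TM_CCOEFF_NORMED is only one of several matching modes you could choose:

import cv2

# A sketch of template matching: locate a known object's appearance
# (the template) inside a larger image. File names are placeholders.
frame = cv2.imread("frame.png")
template = cv2.imread("template.png")

# TM_CCOEFF_NORMED produces a score map whose maximum marks the best match
result = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

h, w = template.shape[:2]
cv2.rectangle(frame, max_loc, (max_loc[0] + w, max_loc[1] + h), (0, 255, 0), 2)
cv2.imshow("match", frame)
cv2.waitKey(0)
cv2.destroyAllWindows()

Frame differencing, which we cover next, needs no such prior knowledge of the object's appearance.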

Object tracking techniques can get quite complex, so let's explore them in ascending order of difficulty, starting with the simplest technique.

Basic motion detection

The first and most intuitive solution is to calculate the differences between frames, or between a frame considered "background" and all the other frames.

Let's look at an example of this approach:

import cv2

camera = cv2.VideoCapture(0)

# Elliptical structuring element used to dilate the thresholded difference map
es = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 4))
background = None

while True:
  ret, frame = camera.read()
  if not ret:
    break

  # Use the first frame as the background model: grayscale, then blur
  if background is None:
    background = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    background = cv2.GaussianBlur(background, (21, 21), 0)
    continue

  # Prepare the current frame the same way as the background
  gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
  gray_frame = cv2.GaussianBlur(gray_frame, (21, 21), 0)

  # Absolute difference to the background, thresholded and dilated
  diff = cv2.absdiff(background, gray_frame)
  diff = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)[1]
  diff = cv2.dilate(diff, es, iterations = 2)

  # OpenCV 3 returns (image, contours, hierarchy)
  image, cnts, hierarchy = cv2.findContours(diff.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

  for c in cnts:
    # Ignore small contours so that tiny movements and noise are not drawn
    if cv2.contourArea(c) < 1500:
      continue
    (x, y, w, h) = cv2.boundingRect(c)
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

  cv2.imshow("contours", frame)
  cv2.imshow("dif", diff)
  # ~12 frames per second; waitKey expects an integer number of milliseconds
  if cv2.waitKey(int(1000 / 12)) & 0xff == ord("q"):
    break

cv2.destroyAllWindows()
camera.release()

After importing cv2, we open the video feed obtained from the default system camera, and we set the first frame as the background of the entire feed. Each frame read from that point onward is processed by calculating the difference between the background and the frame itself. This is a trivial operation:

diff = cv2.absdiff(background, gray_frame)

Before we get to do that, though, we need to prepare our frame for processing. The first thing we do is convert the frame to grayscale and blur it a bit:

gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray_frame = cv2.GaussianBlur(gray_frame, (21, 21), 0)

Note

You may wonder about the blurring: the reason we blur the image is that every video feed contains inherent noise, coming from small vibrations, changes in lighting, and the camera sensor itself. We want to smooth this noise out so that it doesn't get detected as motion and consequently get tracked.

Now that our frame is grayscaled and smoothed, we can calculate the difference compared to the background (which has also been grayscaled and smoothed), and obtain a map of differences. This is not the only processing step, though. We're also going to apply a threshold, so as to obtain a black and white image, and dilate the image so holes and imperfections get normalized, like so:

diff = cv2.absdiff(background, gray_frame)  
diff = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)[1]
diff = cv2.dilate(diff, es, iterations = 2)

All that is left to do at this point is to find the contours of all the white blobs in the calculated difference map and display them. We only draw rectangles for contours whose area is greater than an arbitrary threshold, so that tiny movements are not displayed. Naturally, this is up to you and your application's needs; with constant lighting and a very low-noise camera, you may wish to have no threshold on the minimum size of the contours at all. Also note that dilation (usually paired with erosion) can act as a noise filter, much like the blurring we applied, and that the combined operation can be obtained in a single call to cv2.morphologyEx; we show the steps explicitly here for transparency.
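For reference, here is a minimal sketch of that single-call alternative, reusing the diff map and the es structuring element from the listing above; MORPH_CLOSE (dilation followed by erosion) fills small holes in the blobs, while MORPH_OPEN (erosion followed by dilation) removes small specks of noise:

# Closing fills small holes in the white blobs; opening removes small specks.
# Both reuse the elliptical structuring element defined earlier as es.
diff_closed = cv2.morphologyEx(diff, cv2.MORPH_CLOSE, es, iterations = 2)
diff_opened = cv2.morphologyEx(diff, cv2.MORPH_OPEN, es, iterations = 2)

With the difference map cleaned up, this is how we find and display the rectangles: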

image, cnts, hierarchy = cv2.findContours(diff.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for c in cnts:
  if cv2.contourArea(c) < 1500:
    continue
  (x, y, w, h) = cv2.boundingRect(c)
  cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow("contours", frame)
cv2.imshow("dif", diff)

OpenCV offers two very handy functions here:

  • cv2.findContours: This function computes the contours of the subjects in an image
  • cv2.boundingRect: This function calculates their bounding boxes (see the sketch after this list)
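To see these two calls in isolation, here is a small self-contained sketch that runs them on a synthetic binary mask containing a single white square, rather than on a camera frame; the [-2] indexing is only there so the same line works with both the OpenCV 3 and OpenCV 4 return signatures of cv2.findContours:

import cv2
import numpy as np

# Synthetic binary mask with one filled white square standing in for a blob
mask = np.zeros((200, 200), dtype=np.uint8)
cv2.rectangle(mask, (60, 60), (140, 140), 255, -1)

# OpenCV 3 returns (image, contours, hierarchy); OpenCV 4 returns (contours, hierarchy)
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

for c in cnts:
  x, y, w, h = cv2.boundingRect(c)
  print("bounding box:", x, y, w, h)  # the square's enclosing rectangle

cv2.boundingRect returns the top-left corner plus the width and height of a contour, which is exactly what cv2.rectangle needs to draw the box.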

So there you have it, a basic motion detector with rectangles around subjects. The final result is something like this:

[Figure: Basic motion detection result, with rectangles drawn around moving subjects]

For such a simple technique, this is quite accurate. However, there are a few drawbacks that make this approach unsuitable for some applications, most notably the fact that you need a first "default" frame to set as the background. In situations such as outdoor scenes, where lighting changes constantly, this makes the approach quite inflexible, so we need to build a bit more intelligence into our system. That's where background subtractors come into play.
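As a small preview of that idea, here is a minimal sketch using OpenCV's MOG2 background subtractor, which maintains and updates its own background model frame by frame instead of relying on a single fixed background image:

import cv2

camera = cv2.VideoCapture(0)
# MOG2 keeps an adaptive background model rather than a single fixed frame
subtractor = cv2.createBackgroundSubtractorMOG2()

while True:
  ret, frame = camera.read()
  if not ret:
    break
  # Per-frame foreground mask; the background model updates over time
  mask = subtractor.apply(frame)
  cv2.imshow("foreground mask", mask)
  if cv2.waitKey(int(1000 / 12)) & 0xff == ord("q"):
    break

camera.release()
cv2.destroyAllWindows()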
