Setting up the app

In order to run our app, we will need to execute a main function routine that reads a frame of the video stream, generates a saliency map, extracts the locations of the proto-objects, and tracks these locations from one frame to the next.

The main function routine

The main process flow is handled by the main function in chapter5.py, which instantiates the two classes (Saliency and MultipleObjectsTracker) and opens a video file showing a number of soccer players on the field:

import cv2
import numpy as np
from os import path

from saliency import Saliency
from tracking import MultipleObjectsTracker


def main(video_file='soccer.avi', roi=((140, 100), (500, 600))):
    if path.isfile(video_file):
        video = cv2.VideoCapture(video_file)
    else:
        print('File "' + video_file + '" does not exist.')
        raise SystemExit

    # initialize tracker
    mot = MultipleObjectsTracker()

The function will then read the video frame by frame, extract some meaningful region of interest (for illustration purposes), and feed it to the Saliency module:

    while True:
        success, img = video.read()
        if success:
            if roi:
                # grab some meaningful ROI;
                # roi is ((row_min, col_min), (row_max, col_max))
                img = img[roi[0][0]:roi[1][0], roi[0][1]:roi[1][1]]
            # generate saliency map
            sal = Saliency(img, use_numpy_fft=False, gauss_kernel=(3, 3))

The Saliency object will generate a map of all the interesting proto-objects, which is then fed to the tracker module. The output of the tracker module is the input frame annotated with bounding boxes, as shown in the preceding figure:

            cv2.imshow("tracker", mot.advance_frame(img,
                sal.get_proto_objects_map(use_otsu=False)))

The app will run through all the frames of the video until the end of the file is reached or the user presses the q key:

            if cv2.waitKey(100) & 0xFF == ord('q'):
                break
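The snippets above do not show the end-of-file case. One way to complete main is sketched below: break out of the loop once video.read() stops returning frames, then release the resources. The else clause pairs with the if success: check, and the __main__ guard is a common convention rather than part of the chapter's listing:

        else:
            # video.read() returned False: end of file reached
            break

    video.release()
    cv2.destroyAllWindows()


if __name__ == '__main__':
    main()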

The Saliency class

The constructor of the Saliency class accepts a video frame, which can be either grayscale or RGB, as well as some options such as whether to use NumPy's or OpenCV's Fourier package:

def __init__(self, img, use_numpy_fft=True, gauss_kernel=(5, 5)):
    self.use_numpy_fft = use_numpy_fft
    self.gauss_kernel = gauss_kernel
    self.frame_orig = img

A saliency map will be generated from a downsampled version of the image. Because this computation is relatively time-intensive, we will maintain a flag, need_saliency_map, which makes sure we do the computation only once:

    self.small_shape = (64, 64)
    # cv2.resize expects (width, height), so reverse the shape tuple
    self.frame_small = cv2.resize(img, self.small_shape[1::-1])

    # whether we need to do the math (True) or it has already
    # been done (False)
    self.need_saliency_map = True

From then on, the user may call any of the class's public methods, which will all operate on this same image.
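For illustration, here is a minimal sketch of how such a public method could use the need_saliency_map flag to cache its result; the helper _compute_saliency is a hypothetical placeholder for the frequency-domain math covered later in this chapter:

def get_saliency_map(self):
    if self.need_saliency_map:
        # do the expensive math only once, then cache the result
        # (_compute_saliency is a hypothetical placeholder)
        self.saliency_map = self._compute_saliency(self.frame_small)
        self.need_saliency_map = False
    return self.saliency_map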

The MultipleObjectsTracker class

The constructor of the tracker class is straightforward. All it does is set up the termination criteria for mean-shift tracking and store the thresholds for the minimum contour area (min_area) and the minimum frame-by-frame drift (min_shift2) that are used in the subsequent computation steps:

def __init__(self, min_area=400, min_shift2=5):
    self.object_roi = []  # ROIs of the tracked objects
    self.object_box = []  # bounding boxes of the tracked objects

    self.min_cnt_area = min_area
    self.min_shift2 = min_shift2

    # terminate mean-shift either after 100 iterations or when the
    # search window moves by less than 1 pt
    self.term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1)

From then on, the user may call the advance_frame method to feed a new frame to the tracker.
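To preview how these termination criteria are consumed, they will eventually be handed to OpenCV's mean-shift implementation. The following self-contained sketch runs cv2.meanShift on a synthetic probability map; both the map and the initial window are illustrative stand-ins, not values from the chapter:

import cv2
import numpy as np

# synthetic back-projected probability map with one bright blob
prob_map = np.zeros((100, 100), dtype=np.uint8)
prob_map[40:60, 40:60] = 255

box = (30, 30, 20, 20)  # initial search window (x, y, w, h)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1)

# iterate until the window moves by less than 1 pt or 100 iterations
num_iter, new_box = cv2.meanShift(prob_map, box, term_crit)
print(new_box)  # the window has shifted toward the bright region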

However, before we make use of all this functionality, we need to learn about image statistics and how to generate a saliency map.
