Automatically tracking all players on a soccer field

Our goal is to combine the saliency detector with mean-shift tracking to automatically track all the players on a soccer field. The proto-objects identified by the saliency detector will serve as input to the mean-shift tracker. Specifically, we will focus on a video sequence from the Alfheim dataset, which can be freely obtained from http://home.ifi.uio.no/paalh/dataset/alfheim/.

The reason for combining the two algorithms (the saliency map and mean-shift tracking) is to maintain correspondence between objects across frames, as well as to remove some false positives and improve the accuracy of the detected objects.

The hard work is done by the previously introduced MultiObjectTracker class and its advance_frame method. The advance_frame method is called whenever a new frame arrives, and accepts the proto-objects map and the saliency map as input:

    def advance_frame(self,
                      frame: np.ndarray,
                      proto_objects_map: np.ndarray,
                      saliency: np.ndarray) -> np.ndarray:

The following steps are covered in this method:

  1. Create contours from proto_objects_map and find bounding rectangles for all contours whose area is greater than min_object_area. The latter are the candidate bounding boxes for tracking with the mean-shift algorithm:

        object_contours, _ = cv2.findContours(proto_objects_map, 1, 2)
        object_boxes = [cv2.boundingRect(contour)
                        for contour in object_contours
                        if cv2.contourArea(contour) > self.min_object_area]
  2. The candidate boxes might not be the best ones to track throughout the frames. For example, if two players are close to each other, they may result in a single object box. We need some approach to select the best boxes: we could devise an algorithm that analyzes the boxes tracked from previous frames in combination with the boxes obtained from saliency, and deduces the most probable boxes.

But we will do it in a simple manner here: if the number of boxes obtained from the saliency map does not increase, we track the boxes from the previous frame into the current frame using the current frame's saliency map, and save the results as object_boxes:

        if len(self.object_boxes) >= len(object_boxes):
            # Continue tracking with mean-shift if the number
            # of salient objects didn't increase
            object_boxes = [cv2.meanShift(saliency, box, self.term_crit)[1]
                            for box in self.object_boxes]
            self.num_frame_tracked += 1
  3. If it did increase, we reset the tracking information, that is, the number of frames over which the objects have been tracked, and we recalculate the initial centers of the objects:
        else:
            # Otherwise, restart tracking
            self.num_frame_tracked = 0
            self.object_initial_centers = [
                (x + w / 2, y + h / 2) for (x, y, w, h) in object_boxes]
  4. Finally, save the boxes and illustrate the tracking information on the frame:

        self.object_boxes = object_boxes
        return self.draw_good_boxes(copy.deepcopy(frame))

We are interested in boxes that move. For that purpose, we calculate the displacement of each box from its initial location at the start of tracking. We suppose that objects that appear larger in a frame should also move faster across it, hence we normalize the displacements by the box width:

    def draw_good_boxes(self, frame: np.ndarray) -> np.ndarray:
        # Find the total displacement length for each object
        # and normalize it by the object size
        displacements = [((x + w / 2 - cx)**2 + (y + h / 2 - cy)**2)**0.5 / w
                         for (x, y, w, h), (cx, cy)
                         in zip(self.object_boxes, self.object_initial_centers)]
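A small standalone check of this displacement measure: the Euclidean distance from the box's current center to its initial center, divided by the box width (the numbers here are made up):

```python
# One tracked box and the center it had when tracking began
object_boxes = [(30, 40, 10, 20)]        # (x, y, w, h)
object_initial_centers = [(32.0, 44.0)]  # (cx, cy)

# Current center is (35, 50), so the offset is (3, 6);
# the distance sqrt(45) is then normalized by the width 10
displacements = [((x + w / 2 - cx)**2 + (y + h / 2 - cy)**2)**0.5 / w
                 for (x, y, w, h), (cx, cy)
                 in zip(object_boxes, object_initial_centers)]
print(displacements)  # [0.6708...]
```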

Next, we draw the boxes, together with their index numbers, for those objects whose average displacement per frame (that is, speed) is greater than the value specified when the tracker was initialized. A small number is added to the denominator so that we do not divide by zero on the first frame of tracking:

        for (x, y, w, h), displacement, i in zip(
                self.object_boxes, displacements, itertools.count()):
            # Draw only those that have some average speed
            if displacement / (self.num_frame_tracked + 0.01) > \
                    self.min_speed_per_pix:
                cv2.rectangle(frame, (x, y), (x + w, y + h),
                              (0, 255, 0), 2)
                cv2.putText(frame, str(i), (x, y),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255))
        return frame

Now you understand how it is possible to implement tracking using the mean-shift algorithm. This is only one of many available approaches to tracking. Mean-shift tracking is particularly likely to fail when objects rapidly change in size, as would be the case if an object of interest came straight at the camera.

For such cases, OpenCV has a different algorithm, cv2.CamShift, where CamShift stands for Continuously Adaptive Mean-Shift; it also takes rotations and changes in size into account. Moreover, OpenCV has a range of trackers that can be used out of the box, referred to as the OpenCV Tracking API. Let's learn about them in the next section.
