Labeling and defining data for localization

When labeling data, we follow simple binary logic. If an image contains an object that belongs to a class we are interested in, we label it 1; if it does not, we label it 0. The following image depicts a bounding box around a car:

When it comes to object localization, labeling is more difficult because there is much more information to record for each image.

Here, P_c denotes whether the image contains any of the classes we want to predict: 1 means that one of those classes is present, and 0 means that none of them is. b_x and b_y mark the center of the bounding box, and b_h and b_w give its height and width. The remaining entries, C1, C2, C3, and so on, indicate which class the object belongs to.

Before we proceed with the process of detection, let's define the coordinates of the image:

The top left corner is (0,0) and the bottom right corner is marked as (1,1).
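Annotation tools often store boxes as pixel corners rather than in this normalized frame. As a minimal sketch, assuming a box given as (x_min, y_min, x_max, y_max) in pixels and a known image size, the conversion to a center point plus height and width could look like the following. The function name normalize_box and the pixel values are illustrative assumptions, not something taken from the text.

```python
def normalize_box(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to (b_x, b_y, b_h, b_w).

    The result uses the frame where the top-left corner of the image
    is (0, 0) and the bottom-right corner is (1, 1).
    """
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    b_x = (x_min + x_max) / 2 / img_w   # horizontal center, as a fraction of width
    b_y = (y_min + y_max) / 2 / img_h   # vertical center, as a fraction of height
    b_h = (y_max - y_min) / img_h       # box height, as a fraction of image height
    b_w = (x_max - x_min) / img_w       # box width, as a fraction of image width
    return b_x, b_y, b_h, b_w


# Hypothetical 640x640 image with a box spanning pixels (64, 160) to (576, 480).
print(normalize_box(64, 160, 576, 480, img_w=640, img_h=640))
```

Because every value is a fraction of the image size, the same label stays valid if the image is later resized.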

The P_c value is 1, which signals that one of the classes is present in the image. The coordinates (0.5, 0.5) mark the center of the bounding box, which here is also the center of the image. The height is 0.5 and the width is 0.8, both expressed as fractions of the image size. Lastly, the digit 1 indicates that the object in the image is a car.
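The label just described can be written out as a flat vector. The sketch below assumes the ordering P_c, b_x, b_y, b_h, b_w followed by one slot per class, and it assumes a three-class list with car first; the class names other than car, the list length, and the helper name make_label are hypothetical choices made only for illustration.

```python
# Hypothetical class list; only "car" comes from the example in the text.
CLASSES = ["car", "pedestrian", "motorcycle"]


def make_label(b_x, b_y, b_h, b_w, class_name):
    """Build [P_c, b_x, b_y, b_h, b_w, C1, C2, ...] for an image with one object."""
    if class_name not in CLASSES:
        raise ValueError(f"unknown class: {class_name!r}")
    # One slot per class: 1.0 for the object's class, 0.0 for the others.
    class_slots = [1.0 if name == class_name else 0.0 for name in CLASSES]
    return [1.0, b_x, b_y, b_h, b_w] + class_slots


# Center (0.5, 0.5), height 0.5, width 0.8, class "car".
car_label = make_label(0.5, 0.5, 0.5, 0.8, "car")
print(car_label)  # [1.0, 0.5, 0.5, 0.5, 0.8, 1.0, 0.0, 0.0]
```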

To understand this better, let's consider another example:

The value of P_c here is 0, which means that the object of interest is not present in the image. When P_c is 0, we don't need to care about the other values; we simply ignore them.
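One way to make "ignore them" concrete is in how a predicted vector is compared with the label. The sketch below is an assumption rather than a method stated in the text: it uses a plain squared error, and when the label's P_c is 0 it scores only the P_c slot and never reads the box or class entries. The function name label_error and the sample prediction are hypothetical.

```python
def label_error(y_true, y_pred):
    """Squared error between a label and a prediction.

    If the label's P_c (index 0) is 0, only P_c is compared; the
    remaining slots of the label are never read.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction must have the same length")
    if y_true[0] == 0:
        return (y_true[0] - y_pred[0]) ** 2
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


# A "no object" label: the slots after P_c are placeholders and can hold anything.
background_label = [0, None, None, None, None, None, None, None]
prediction = [0.1, 0.4, 0.6, 0.3, 0.2, 0.7, 0.2, 0.1]

print(label_error(background_label, prediction))
```

The placeholders are set to None here only to show that they are never touched; in practice they are often filled with zeros so that the label remains a numeric array.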
