Generating negatives

The real challenge, however, is to come up with the perfect example of a non-pedestrian. After all, it's easy to think of example images of pedestrians. But what is the opposite of a pedestrian?

This is actually a common problem when trying to solve new machine learning problems. Both research labs and companies spend a lot of time creating and annotating new datasets that fit their specific purpose.

If you're stumped, let me give you a hint on how to approach this. A good first approximation to finding the opposite of a pedestrian is to assemble a dataset of images that look like the images of the positive class but do not contain pedestrians. These images could contain anything like cars, bicycles, streets, houses, and maybe even forests, lakes, or mountains. Also, keep in mind where pedestrians tend to appear: you will often find them near roads (especially in cities), but rarely in the middle of an open landscape. Such considerations are worth factoring in when generating negatives.

A good place to start is the Urban and Natural Scene dataset by the Computational Visual Cognition Lab at MIT. The complete dataset can be obtained from http://cvcl.mit.edu/database.htm, but don't bother. I have already assembled a good number of images from categories such as open country, inner cities, mountains, and forests. You can find them in the data/pedestrians_neg directory:

In [10]: negdir = "%s/pedestrians_neg" % datadir

All the images are in color, in .jpeg format, and are 256 x 256 pixels. However, in order to use them as samples from a negative class that go together with our earlier images of pedestrians, we need to make sure that all images have the same pixel size. Moreover, the things depicted in the images should roughly be at the same scale. Thus, we want to loop through all the images in the directory (via os.listdir) and cut out a 64 x 128 region of interest (ROI):

In [11]: import os
...      import random
...      hroi = 128  # ROI height in pixels
...      wroi = 64   # ROI width in pixels
...      X_neg = []
...      for negfile in os.listdir(negdir):
...          filename = '%s/%s' % (negdir, negfile)

To bring the images to roughly the same scale as the pedestrian images, we resize them as follows:

...          img = cv2.imread(filename)
...          img = cv2.resize(img, (512, 512))

Then we cut out a 64 x 128 pixel ROI by randomly choosing its top-left corner coordinate (rand_x, rand_y). We do this five times per image (an arbitrary choice) to bolster our database of negative samples:

...          for j in range(5):
...              rand_y = random.randint(0, img.shape[0] - hroi)
...              rand_x = random.randint(0, img.shape[1] - wroi)
...              roi = img[rand_y:rand_y + hroi, rand_x:rand_x + wroi, :]
...              X_neg.append(hog.compute(roi, (64, 64)))

Some examples from this procedure are shown in the following diagram:

What did we almost forget? Exactly, we forgot to make sure that all feature values are 32-bit floating point numbers. Also, the target label of these images should be -1, corresponding to the negative class:

In [12]: X_neg = np.array(X_neg, dtype=np.float32)
...      y_neg = -np.ones(X_neg.shape[0], dtype=np.int32)
...      X_neg.shape, y_neg.shape
Out[12]: ((250, 1980, 1), (250,))
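Note the trailing singleton axis in the shape (250, 1980, 1): OpenCV returns each HOG descriptor as a column vector. Some estimators insist on a 2-D (n_samples, n_features) matrix, in which case the extra axis can simply be squeezed away. A small NumPy-only sketch (the array here is a dummy stand-in for the real features):

```python
import numpy as np

# Dummy stand-in for the stacked HOG features: 250 descriptors,
# each stored as a 1980 x 1 column vector.
X_neg = np.zeros((250, 1980, 1), dtype=np.float32)

# Drop the trailing singleton axis to get a flat 2-D feature matrix;
# squeeze returns a view, so no data is copied.
X_flat = X_neg.squeeze(axis=-1)
print(X_flat.shape)  # (250, 1980)
```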

Then we can concatenate all positive (X_pos) and negative samples (X_neg) into a single dataset, X, which we split using the all too familiar train_test_split function from scikit-learn:

In [13]: X = np.concatenate((X_pos, X_neg))
...      y = np.concatenate((y_pos, y_neg))
In [14]: from sklearn import model_selection as ms
...      X_train, X_test, y_train, y_test = ms.train_test_split(
...          X, y, test_size=42, random_state=42
...      )
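It is worth sanity-checking that the concatenation and split produce the shapes you expect. The following sketch uses toy arrays (the positive-class count of 100 is an illustrative assumption; the 250 negatives match Out[12]):

```python
import numpy as np
from sklearn import model_selection as ms

# Toy stand-ins for the real feature matrices; sizes are illustrative.
X_pos = np.ones((100, 1980), dtype=np.float32)   # assumed count
X_neg = np.zeros((250, 1980), dtype=np.float32)
y_pos = np.ones(100, dtype=np.int32)
y_neg = -np.ones(250, dtype=np.int32)

X = np.concatenate((X_pos, X_neg))
y = np.concatenate((y_pos, y_neg))

# An 80/20 split: 350 samples become 280 for training, 70 for testing.
X_train, X_test, y_train, y_test = ms.train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (280, 1980) (70, 1980)
```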
A common and painful mistake is to accidentally include a negative sample that does indeed contain a pedestrian somewhere. Make sure this does not happen to you!