Algorithmic pareidolia in computer vision

One of the major tasks in computer vision is object detection, and face detection in particular. Many electronic devices ship with face detection features and run such algorithms behind the scenes. So, what happens when we put pareidolia-inducing objects in front of this software? Sometimes it interprets a face exactly the way we do, sometimes it disagrees with us, and sometimes it brings entirely new faces to our attention.

In the case of an object recognition system built using an artificial neural network, higher-level features/layers correspond to more recognizable features, like faces or objects. Enhancing these features brings out what the computer sees, and what it sees reflects the training set of images that the network was exposed to previously. Let's take the Inception network and ask it to predict the objects it sees in some pareidolia-inducing images. Consider the pansy flowers in the following photo. To me, these flowers sometimes look like a butterfly, and sometimes like the face of an angry man with a thick mustache:

Let's see what the Inception model sees in this. We will use the Inception network model pretrained on ImageNet data. To load the model, use the following code:

from keras.applications import inception_v3
from keras import backend as K
from keras.applications.imagenet_utils import decode_predictions
from keras.preprocessing import image
import numpy as np

# Disable the training phase; we only run inference
K.set_learning_phase(0)

# Load InceptionV3 with its classification head, pretrained on ImageNet
model = inception_v3.InceptionV3(weights='imagenet', include_top=True)

To read the image file and convert it to a data batch with one image, which is the expected input for the predict function of the Inception network model, we use the following function:

def preprocess_image(image_path):
    # Load the image file and convert it to a float array
    img = image.load_img(image_path)
    img = image.img_to_array(img)
    # Convert the single image to a batch with one image
    img = np.expand_dims(img, axis=0)
    # Scale pixel values to the range expected by InceptionV3
    img = inception_v3.preprocess_input(img)
    return img

Now, let's use the preceding method to preprocess the input image and predict the objects the model sees. We will use the model.predict method to get the predicted class probabilities for all 1,000 classes in ImageNet. To convert this array of probabilities to real class labels, ordered by decreasing probability score, we use the decode_predictions method from keras. The list of all 1,000 ImageNet classes or synsets can be found here: http://image-net.org/challenges/LSVRC/2014/browse-synsets. Note that the pansy flower is not in the set of classes on which the model was trained:

# base_image_path points to the pansy flower photo shown earlier
img = preprocess_image(base_image_path)
preds = model.predict(img)
# Print the top predicted ImageNet classes with their probabilities
for n, label, prob in decode_predictions(preds)[0]:
    print(label, prob)

The predictions are shown next. None of the top-predicted classes has a strong probability, which is expected, as the model has never seen this particular flower before:

bee 0.022255851
earthstar 0.018780833
sulphur_butterfly 0.015787734
daisy 0.013633176
cabbage_butterfly 0.012270376

In the preceding photo, the model finds a bee. Well, that's not a bad guess; as you can see, in the yellow flowers, the lower half of the central black/brown shade does look like a bee. It also sees some yellow and white butterflies, called sulphur and cabbage butterflies, much as we humans might perceive at a quick glance. The following photo shows actual images of these identified objects/classes. Clearly, some of the feature-detector hidden layers in this network were activated by this input. Perhaps the filters that detect wings in insects and birds were activated, along with some color-related filters, to arrive at the preceding predictions:

The Inception architecture is large, and the number of feature maps in it is huge. Let's assume for a moment that we know which feature-map layer detects these wings. Given an input image, we can extract the features from this layer. Can we then change the input image so that the activation of this layer increases? That means modifying the input image so that we see more wing-like objects in it, even if they're not there. The resulting image will look like a dream, with butterflies everywhere. This is exactly what is done in DeepDream.
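To make this idea concrete, here is a minimal gradient-ascent sketch in the same Keras backend style used above. It assumes the model and preprocess_image defined earlier; the layer name 'mixed4', the step size, and the number of iterations are arbitrary illustrative choices, not the exact DeepDream recipe:

# Pick one feature-map layer whose activation we want to amplify
# ('mixed4' is just an illustrative choice among InceptionV3's mixed layers)
dream = model.input
layer_output = model.get_layer('mixed4').output

# The loss is the mean activation of the chosen layer; increasing it makes
# the input look "more like" whatever this layer responds to
loss = K.mean(layer_output)

# Gradient of the loss with respect to the input image, normalized
grads = K.gradients(loss, dream)[0]
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-7)
fetch_loss_and_grads = K.function([dream], [loss, grads])

# Simple gradient ascent on the input image
img = preprocess_image(base_image_path)
for i in range(20):                          # illustrative iteration count
    loss_value, grads_value = fetch_loss_and_grads([img])
    img += 0.01 * grads_value                # illustrative step size

After enough iterations, the structures that this layer detects start to appear superimposed on the original photo, which is the dream-like effect described previously.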

Now let's take a look at some of the feature maps in the Inception network. To understand what a convolutional model has learned, we can try to visualize its convolution filters.
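As a rough illustration of how such a visualization can be produced, the following sketch runs gradient ascent on a randomly initialized image to maximize the activation of a single filter. The layer name 'mixed3', the filter index, the starting image, and the step counts are arbitrary assumptions for illustration, and a channels-last data format is assumed:

layer_name = 'mixed3'       # illustrative layer choice
filter_index = 0            # illustrative filter choice

# Loss: mean activation of one filter in that layer (channels-last format)
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])

# Gradient of that loss with respect to the model input, normalized
grads = K.gradients(loss, model.input)[0]
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-7)
iterate = K.function([model.input], [loss, grads])

# Start from a small random image within the network's [-1, 1] input range
input_img = np.random.uniform(-0.125, 0.125, (1, 299, 299, 3))
for i in range(30):         # illustrative iteration count
    loss_value, grads_value = iterate([input_img])
    input_img += grads_value

The resulting image shows the pattern that this particular filter responds to most strongly, which is one simple way to peek inside what the convolution layers have learned.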
