Image recognition

Now, to start doing some vision processing, let's connect the camera to the Raspberry Pi. Once you have done that, write the following code:

import cv2
import numpy as np

cap = cv2.VideoCapture(0)

while True:
    _, image = cap.read()
    cv2.imshow("Frame", image)
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

    lowerGreen = np.array([40, 50, 50])
    upperGreen = np.array([80, 255, 255])

    mask = cv2.inRange(hsv, lowerGreen, upperGreen)
    res = cv2.bitwise_and(image, image, mask=mask)
    cv2.imshow('mask', mask)
    cv2.imshow('result', res)

    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break

cv2.destroyAllWindows()
cap.release()

Before you actually run this code, let me tell you what exactly we are doing:

import numpy as np

In the preceding line, we are importing the library numpy as np; in other words, we have imported the library, and every time we need to call it we simply write np:

cap = cv2.VideoCapture(0)

In the preceding line, we are telling the Raspberry Pi to capture the video from a specific port. Now you must be wondering: how do we know which port the camera is connected to?

The thing with ports is that, unlike GPIOs, they are not hardware dependent; rather, they are software allocated. So, if your camera is the first device connected to a USB port, it is very likely to be at port 0.

In this example, a USB camera is the only piece of hardware that we are adding; hence, we can be quite sure that it will be at port 0.

Now, this command is not only for port selection; its primary job is to capture the video from the camera. From here on, every time we need it, we call it by the name cap rather than writing out the entire function:

_, image = cap.read()

In the preceding line, we are using the capture object cv2.VideoCapture(0) through the name cap, and calling its function read(). What this does is return two things. First, it returns a Boolean value, or in other words, a true or a false, telling us whether the image has been captured successfully or not. The second value, which we are more concerned with, is the entire frame read from the camera. This data is stored in the form of an array:
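To see this two-value return pattern without a camera attached, here is a minimal sketch using a hypothetical stand-in function (fake_read is not part of OpenCV; it just mimics the shape of cap.read()'s return value):

```python
def fake_read():
    # Stand-in for cap.read(): a success flag plus the frame data.
    # A real frame would be a numpy array of pixel values.
    frame = [[0, 0], [0, 0]]  # tiny dummy "image"
    return True, frame

ok, image = fake_read()
if not ok:
    print("Frame grab failed")  # worth checking in real code
print(ok)
```

The underscore in `_, image = cap.read()` simply discards that first Boolean, though checking it is good practice.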

cv2.imshow("Frame", image)

In this line, we are using the cv2 library and one of its functions named imshow(). What this does is show the image that has been captured by the camera. Two arguments are being passed: "Frame" and image. Frame is the name of the window that will show us the captured image. The second argument, image, is the variable in which we stored the frame in the previous line, so the window will directly show what we have already captured:

hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

In this line, we are doing something amazing. The cvtColor() function is a converter. What does it convert? As we know, the image is a two-dimensional array of pixels. What this function does is convert the pixel values that our camera gives us into the color space that the user desires.

Let me explain this in a bit more detail. The camera that we are using gives us a color image made of red, green, and blue channels (which OpenCV stores in BGR order). So whatever we can see is a mix of varied brightness of these three colors. Now, for our detection, we would like to convert it into hue, saturation, and value. Why are we doing that, you may ask? First, it makes the task pretty easy for us. To understand this, let's see what hue, saturation, and value are.

Hue basically represents the color we are talking about. Every hue value represents a specific color, and the saturation is the color's intensity. So the higher the saturation, the deeper the color, and the lower the saturation, the fainter the color. Finally, value; this term can be confusing. It basically tells how dark the color is: the lower the value, the closer the color gets to black. So let me give you a rough example:

Now the first image shows you hue: 100, saturation: 100%, and value: 100%. Hence, there is no black, the color is green, and saturation is 100%. In the subsequent picture, you can see the color has faded because the saturation is kept at a lower percentage. Finally, when the value is reduced in the next image, the color gets really dark.

So, coming back to the point: why hue, saturation, value? To detect any color, we simply need one value (the hue) instead of three different values forming that color, hence making the job simpler. There are various other reasons to do so as well, but at this time, they are not a concern for us.

Now moving forward, we have passed two arguments: image, which is where the converting algorithm will take the raw data from, and cv2.COLOR_BGR2HSV, which tells the function which conversion algorithm to use. As we have already discussed, we have to convert the BGR values to hue, saturation, value (HSV). Finally, these values are returned to a variable named hsv:
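To get a feel for the conversion, Python's standard-library colorsys module can compute HSV for a single pixel. Note that for 8-bit images OpenCV scales hue to the range 0-179 (degrees divided by 2) and saturation/value to 0-255; this little sketch (bgr_to_opencv_hsv is a helper written here for illustration, not an OpenCV function) mirrors that scaling:

```python
import colorsys

def bgr_to_opencv_hsv(b, g, r):
    # colorsys works on RGB floats in [0, 1] and returns H, S, V in [0, 1].
    # OpenCV stores hue as degrees / 2 (0-179) and S, V in 0-255.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)), int(round(s * 255)), int(round(v * 255))

print(bgr_to_opencv_hsv(0, 255, 0))    # pure green -> hue 60
print(bgr_to_opencv_hsv(255, 0, 0))    # pure blue  -> hue 120
```

This is why the green range in our code is centered around hue 60.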

lowerGreen = np.array([40,50,50])
upperGreen = np.array([80,255,255])

In these lines, we are giving the lower and upper range values that need to be detected. As you can see, we are detecting the color green; hence, we provide the bounds for both ends of its range. If you want to change the color to be detected, you simply need to change these values and the job is done:

mask = cv2.inRange(hsv, lowerGreen, upperGreen)

Now, we are segregating the pixels that fall in this color range and giving the result to an array. This is done by a selection function named inRange(). There are three arguments that we need to pass: the image it needs to work on, the lower range value it needs to detect, and the upper range value; we have provided these as hsv, lowerGreen, and upperGreen. The result is an array in which every pixel outside the range is blacked out, and only the pixels that lie in the specified color range are shown, in plain white:
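What inRange() does can be sketched in pure Python on a tiny "image" of HSV triples. This is only an illustrative re-implementation of the idea (the real OpenCV function is vectorized C++ and works on numpy arrays):

```python
def in_range(hsv_image, lower, upper):
    # Produce a mask: 255 where every channel falls inside [lower, upper],
    # 0 everywhere else -- the black/white image described above.
    mask = []
    for row in hsv_image:
        mask_row = []
        for pixel in row:
            inside = all(lo <= ch <= up
                         for ch, lo, up in zip(pixel, lower, upper))
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask

# A 1x2 image: one green-ish pixel, one red-ish pixel (OpenCV-style HSV).
tiny = [[(60, 200, 200), (0, 200, 200)]]
print(in_range(tiny, (40, 50, 50), (80, 255, 255)))  # [[255, 0]]
```

The green pixel (hue 60) lands inside the 40-80 range and becomes white (255); the red one (hue 0) is blacked out.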

res = cv2.bitwise_and(image, image, mask=mask)

bitwise_and is a function of the cv2 library; what it does is simply a logical AND of the two arrays. The arguments that we are passing are image, image again, and the mask; in other words, we are ANDing the raw image with itself, restricted by the mask. The result of this is an image that has a black background all around, and only the object that lies in the specified color range is shown in its proper colors:
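The masking step can likewise be sketched in plain Python: wherever the mask is zero the output pixel goes black, and wherever it is 255 the original pixel passes through. Again, this is an illustration of the idea, not OpenCV's actual implementation:

```python
def apply_mask(image, mask):
    # Keep the original BGR pixel where mask == 255; black it out otherwise.
    result = []
    for img_row, mask_row in zip(image, mask):
        result.append([pix if m == 255 else (0, 0, 0)
                       for pix, m in zip(img_row, mask_row)])
    return result

image = [[(10, 200, 30), (20, 20, 220)]]  # one green, one red pixel (BGR)
mask = [[255, 0]]
print(apply_mask(image, mask))  # [[(10, 200, 30), (0, 0, 0)]]
```

Only the green pixel survives; the rest of the image is black, exactly as in the res window.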

cv2.imshow('mask',mask)

We previously used a function named cv2.inRange(), and what it did was filter out the specific color range that we had defined. That function gave us a new array by the name of mask. It holds all values as zero, except for the pixels that fall into the specified color range. This results in an image that is black all around, except for the points where the color is in the specified range, which come out white. In our program, we are using cv2.inRange() and storing its result in a variable named mask.

The cv2.imshow() function is something that we have used before as well. It simply shows the image stored in an array. Hence, as we have given the command cv2.imshow('mask', mask), it will open a window by the name of 'mask', and in that window, it will show the image stored in the mask variable:

cv2.imshow('result',res)

We are doing a similar thing here. In the previous lines, we used the function named cv2.bitwise_and(). This was used to do the logical AND of two image arrays, image and mask, the result of which was res. Now that we have done the logical AND of image and mask, the output shown in this window will be an image that is black all around, with the portion falling into our chosen range shown in its original color:

key = cv2.waitKey(1) & 0xFF

Now this is interesting. The cv2.waitKey() function gives us the value of the key pressed. But the problem is that it returns a 32-bit integer value, whereas an ASCII code fits in only 8 bits. Hence, we have to look at only those 8 bits out of the 32-bit integer returned by the waitKey() function. To do that, we do a logical AND of the value received from waitKey() and the hexadecimal number 0xFF, which translates to 11111111 in binary. When we AND the 32-bit integer with this number, we are left with only the last 8 bits, which is also the only relevant part for us. Hence, the value of key will be an 8-bit ASCII value:
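The effect of the & 0xFF mask is easy to verify in plain Python: ANDing any integer with 0xFF keeps only its lowest 8 bits, so any junk in the higher bits is stripped away (the value below is a made-up example, not an actual waitKey() return value):

```python
# ord('q') is 113; simulate a return value with junk set in the high bits.
raw = ord('q') | (1 << 20)   # hypothetical 32-bit value, low byte = 'q'
key = raw & 0xFF             # keep only the low 8 bits (the ASCII code)
print(key == ord('q'))       # True
```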

if key == ord('q'):
    break

Now we are taking a simple if statement and comparing the value of key to ord('q'). What the ord() function does is take a character and return its ASCII value. So, with this condition, if the q key is pressed, the loop breaks and the program comes out of it:

cv2.destroyAllWindows()

This is a very simple command. It will close all the windows that we have opened using cv2.imshow().

cap.release()

Using this function, we are releasing the camera, hence making this resource free and ready to be used by any other program.
