Using OpenCV to perform face detection

Contrary to what you might expect at the outset, performing face detection on a still image and on a video feed are extremely similar operations. The latter is just the sequential version of the former: face detection on video is simply face detection applied to each frame read into the program from the camera. Naturally, a whole host of concepts apply to video face detection, such as tracking, that do not apply to still images, but it's good to know that the underlying theory is the same.

So let's go ahead and detect some faces.

Performing face detection on a still image

The first and most basic way to perform face detection is to load an image and detect faces in it. To make the result visually meaningful, we will draw rectangles around faces on the original image.

Now that you have haarcascades included in your project, let's go ahead and create a basic script to perform face detection.

import cv2

filename = '/path/to/my/pic.jpg'

def detect(filename):
  face_cascade = cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')
  
  img = cv2.imread(filename)
  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
  faces = face_cascade.detectMultiScale(gray, 1.3, 5)
  for (x,y,w,h) in faces:
    img = cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
  cv2.namedWindow('Vikings Detected!!')
  cv2.imshow('Vikings Detected!!', img)
  cv2.imwrite('./vikings.jpg', img)
  cv2.waitKey(0)

detect(filename)

Let's go through the code. First, we have the obligatory cv2 import (you'll find that every script in this book starts like this, or very nearly so). Second, we declare the detect function.

def detect(filename):

Within this function, we declare a face_cascade variable, a CascadeClassifier object loaded with the frontal face cascade, which is responsible for face detection.

  face_cascade = cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')

We then load our file with cv2.imread, and convert it to grayscale, because that's the color space in which the face detection happens.

The next step (face_cascade.detectMultiScale) is where we perform the actual face detection.

  img = cv2.imread(filename)
  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
  faces = face_cascade.detectMultiScale(gray, 1.3, 5)

The parameters passed are scaleFactor and minNeighbors. The scaleFactor argument determines by how much the image is scaled down at each step of the detection process (a value of 1.3 means each scale is 30% smaller than the previous one), while minNeighbors specifies how many overlapping neighbor detections a candidate rectangle must gather to be retained as a face. This may all seem a little complex at first, but you can check all the options out in the official documentation.
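If you prefer to be explicit, the same call can be made with keyword arguments; the minSize constraint shown here is an optional addition of mine, not part of the original example, which you can tune to ignore detections smaller than a given size:

# Equivalent call with keyword arguments; minSize is an added, optional
# constraint (an assumption to tune), not part of the original call
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.3,   # shrink the image by 30% at each scale
    minNeighbors=5,    # candidate rectangles need 5 neighbors to survive
    minSize=(30, 30))  # ignore faces smaller than 30x30 pixels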

The value returned from the detection operation is an array of rectangles, each one a tuple of the form (x, y, w, h). The utility method, cv2.rectangle, allows us to draw rectangles at the specified coordinates: x and y represent the left and top coordinates, while w and h represent the width and height of the face rectangle.

We will draw blue rectangles around all the faces we find by looping through the faces variable, making sure we use the original image for drawing, not the gray version.

for (x,y,w,h) in faces:
    img = cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)

Lastly, we create a namedWindow instance, display the resulting processed image in it, and save a copy to disk with cv2.imwrite. To prevent the image window from closing immediately, we insert a call to waitKey(0), which blocks until any key is pressed; the script then ends and the window closes.

  cv2.namedWindow('Vikings Detected!!')
  cv2.imshow('Vikings Detected!!', img)
  cv2.imwrite('./vikings.jpg', img)
  cv2.waitKey(0)

And there we go: a whole group of Vikings has been detected in our image, as shown in the following screenshot:

Performing face detection on a still image

Performing face detection on a video

We now have a good foundation for understanding how to perform face detection on a still image. As mentioned previously, we can repeat the process on the individual frames of a video (be it a live camera feed or a video file) to perform face detection on video.

The script performs the following tasks: it opens a camera feed, reads a frame, examines that frame for faces, scans for eyes within the detected faces, and then draws blue rectangles around the faces and green rectangles around the eyes.

  1. Let's create a file called face_detection.py and start by importing the necessary module:
    import cv2
  2. After this, we declare a method, detect(), which will perform face detection.
    def detect():
      face_cascade = cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')
      eye_cascade = cv2.CascadeClassifier('./cascades/haarcascade_eye.xml')
      camera = cv2.VideoCapture(0)
  3. The first thing we need to do inside the detect() method is to load the Haar cascade files so that OpenCV can perform face detection. As we copied the cascade files into the local cascades/ folder, we can use a relative path. Then, we open a VideoCapture object (the camera feed). The VideoCapture constructor takes a parameter indicating which camera is to be used; zero indicates the first camera available.
      while (True):
        ret, frame = camera.read()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
  4. Next up, we capture a frame. The read() method returns two values: a Boolean indicating the success of the frame read operation, and the frame itself. We capture the frame, and then we convert it to grayscale. This is a necessary operation, because face detection in OpenCV happens in the grayscale color space:
        faces = face_cascade.detectMultiScale(gray, 1.3, 5)
  5. Much like the single still image example, we call detectMultiScale on the grayscale version of the frame.
        for (x,y,w,h) in faces:
            # draw a blue rectangle around each detected face
            img = cv2.rectangle(frame,(x,y),(x+w,y+h),(255,0,0),2)

            # limit the eye search to the face region of the gray frame
            roi_gray = gray[y:y+h, x:x+w]

            eyes = eye_cascade.detectMultiScale(roi_gray, 1.03, 5, 0, (40,40))

    Note

    There are a few additional parameters in the eye detection. Why? The method signature of detectMultiScale takes a number of optional parameters: in the case of detecting a face, the default options were good enough. However, eyes are a smaller feature of the face, and shadows cast by my beard or my nose, together with random shadows in the frame, were triggering false positives.

    By limiting the search for eyes to a minimum size of 40x40 pixels, I was able to discard all the false positives. Go ahead and test these parameters until your application performs as you expect it to (for example, you can try to specify a maximum size for the feature too, or increase the scale factor and the number of neighbors); a keyword-argument version of this call is sketched right after these steps.

  6. Here we have a further step compared to the still image example: we create a region of interest corresponding to the face rectangle, and within this rectangle, we operate "eye detection". This makes sense as you wouldn't want to go looking for eyes outside a face (well, for human beings at least!).
            for (ex,ey,ew,eh) in eyes:
                # eye coordinates are relative to the face ROI, so offset them
                cv2.rectangle(frame,(x+ex,y+ey),(x+ex+ew,y+ey+eh),(0,255,0),2)
  7. Again, we loop through the resulting eye tuples and draw green rectangles around them.
        cv2.imshow("camera", frame)
        if cv2.waitKey(1000 / 12) & 0xff == ord("q"):
          break
    
      camera.release()
      cv2.destroyAllWindows()
    
    if __name__ == "__main__":
      detect()
  8. Finally, we show the resulting frame in the window. All being well, if any face is within the field of view of the camera, you will have a blue rectangle around their face and a green rectangle around each eye, as shown in this screenshot:
    Performing face detection on a video
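
As promised in the earlier note, here is the eye-detection call again with keyword arguments, which makes the tuned parameters easier to read; the maxSize value is an extra assumption of mine, not part of the original call, that you may want to experiment with:

# The same eye-detection call, spelled out with keyword arguments.
# maxSize is an additional, assumed constraint, not in the original code.
eyes = eye_cascade.detectMultiScale(
    roi_gray,
    scaleFactor=1.03,    # scan scales only 3% apart: eyes are small features
    minNeighbors=5,      # require 5 neighboring detections to accept an eye
    flags=0,
    minSize=(40, 40),    # discard candidates smaller than 40x40 pixels
    maxSize=(100, 100))  # assumed upper bound; tune or drop as needed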

Performing face recognition

Detecting faces is a fantastic feature of OpenCV and one that constitutes the basis for a more advanced operation: face recognition. What is face recognition? It's the ability of a program, given an image or a video feed, to identify a person. One of the ways to achieve this (and the approach adopted by OpenCV) is to "train" the program by feeding it a set of classified pictures (a facial database), and operate the recognition against those pictures.

This is the process that OpenCV and its face recognition module follow to recognize faces.

Another important feature of the face recognition module is that each recognition has a confidence score, which allows us to set thresholds in real-life applications to limit the amount of false reads.

Let's start from the very beginning; to operate face recognition, we need faces to recognize. You can obtain these in two ways: supply the images yourself or use one of the many face databases freely available on the Internet.

To operate face recognition on one of these databases, you would then run the recognition against an image that contains the face of one of the sampled people. That may be an educational process, but I found it not as satisfying as providing images of my own. In fact, I probably had the same thought that many people have had: I wonder if I could write a program that recognizes my face with a certain degree of confidence.

Generating the data for face recognition

So let's go ahead and write a script that will generate those images for us. A few images containing different expressions are all that we need, but we have to make sure the sample images adhere to certain criteria:

  • Images must be grayscale, in the .pgm format
  • They must be square
  • They must all be the same size (I used 200 x 200; most freely available sets are smaller than that)

Here's the script itself:

import cv2

def generate():
  face_cascade = cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')
  eye_cascade = cv2.CascadeClassifier('./cascades/haarcascade_eye.xml')
  camera = cv2.VideoCapture(0)
  count = 0
  while (True):
    ret, frame = camera.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    
    for (x,y,w,h) in faces:
        img = cv2.rectangle(frame,(x,y),(x+w,y+h),(255,0,0),2)
        
        # crop the detected face from the gray frame; resize to 200x200 pixels
        f = cv2.resize(gray[y:y+h, x:x+w], (200, 200))

        # the destination folder (./data/at/jm/ here) must already exist
        cv2.imwrite('./data/at/jm/%s.pgm' % str(count), f)
        count += 1

    cv2.imshow("camera", frame)
    if cv2.waitKey(1000 / 12) & 0xff == ord("q"):
      break

  camera.release()
  cv2.destroyAllWindows()

if __name__ == "__main__":
  generate()

What is quite interesting about this exercise is that we are going to generate sample images building on our newfound knowledge of how to detect a face in a video feed. Effectively, we are detecting a face, cropping that region of the grayscale frame, resizing it to 200x200 pixels, and saving it in the .pgm format with a progressive name in a particular folder (in my case, jm; you can use your initials).

I inserted a variable, count, because we needed progressive names for the images. Run the script for a few seconds, change expressions a few times, and check the destination folder you specified in the script. You will find a number of images of your face, grayed, resized, and named with the format, <count>.pgm.

Let's now move on to try and recognize our face in a video feed. This should be fun!

Recognizing faces

OpenCV 3 comes with three main methods for recognizing faces, based on three different algorithms: Eigenfaces, Fisherfaces, and Local Binary Pattern Histograms (LBPH). It is beyond the scope of this book to get into the nitty-gritty of the theoretical differences between these methods, but we can give a high-level overview of the concepts.

For a detailed description of each algorithm, refer to the OpenCV documentation, which also points to the original papers.

First and foremost, all methods follow a similar process; they all take a set of classified observations (our face database, containing numerous samples per individual), get "trained" on it, perform an analysis of faces detected in an image or video, and determine two elements: whether the subject is identified, and a measure of the confidence of the subject really being identified, which is commonly known as the confidence score.
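
Sketched in code (with images, labels, and face_sample standing in for data you prepare yourself), the shared workflow looks like this, whichever of the three recognizers you pick:

import cv2
import numpy as np

# images: a list of equally sized grayscale face samples (NumPy arrays);
# labels: one integer ID per sample, identifying the person
model = cv2.face.createEigenFaceRecognizer()   # or createFisherFaceRecognizer()
                                               # or createLBPHFaceRecognizer()
model.train(np.asarray(images), np.asarray(labels, dtype=np.int32))

# later, for each detected and cropped face:
label, confidence = model.predict(face_sample)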

Eigenfaces performs a so-called Principal Component Analysis (PCA), which, of all the mathematical concepts you will hear mentioned in relation to computer vision, is possibly the most descriptive. It basically identifies the principal components of a certain set of observations (again, your face database), calculates the divergence of the current observation (the face being detected in an image or frame) compared to the dataset, and produces a value. The smaller the value, the smaller the difference between the face database and the detected face; hence, a value of 0 is an exact match.
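
To give a feel for the mechanics, here is a toy numpy sketch of the PCA idea. It is a simplification for illustration only: it measures how well a new face is explained by the principal subspace, whereas OpenCV's Eigenfaces implementation compares the projection against each training sample's projection to pick the closest identity.

import numpy as np

def pca_distance(train_faces, test_face, k=10):
    # flatten each face image into a vector and stack them as rows
    X = np.array([f.ravel() for f in train_faces], dtype=np.float64)
    mean = X.mean(axis=0)
    # the right-singular vectors of the centered data are the eigenfaces
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:k]  # keep the top-k principal components
    proj = W.dot(test_face.ravel() - mean)  # project onto the eigenface space
    recon = W.T.dot(proj) + mean            # reconstruct from the projection
    # a small distance means the face is well explained by the training set
    return np.linalg.norm(test_face.ravel() - recon)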

Fisherfaces derives from PCA and evolves the concept, applying more complex logic. While computationally more intensive, it tends to yield more accurate results than Eigenfaces.

LBPH instead (again, at a very high level) divides a detected face into small cells and compares each cell to the corresponding cell in the model, producing a histogram of matching values for each area. Because of this flexible approach, LBPH is the only face recognition algorithm that allows the model's sample faces and the detected faces to be of different shape and size. I personally found it to be the most accurate algorithm, generally speaking, but each algorithm has its strengths and weaknesses.
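
To make this concrete, here is a toy numpy sketch of the basic Local Binary Pattern operator. It is a simplification of the idea, not OpenCV's implementation, which uses a circular neighborhood and compares per-cell histograms of these codes rather than the raw codes:

import numpy as np

def lbp_codes(gray):
    # Compare each interior pixel's 8 neighbors to the pixel itself and pack
    # the comparison results into an 8-bit code per pixel; LBPH then histograms
    # these codes cell by cell and compares histograms between model and probe.
    g = gray.astype(np.int32)
    center = g[1:-1, 1:-1]
    offsets = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
    codes = np.zeros_like(center)
    h, w = center.shape
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1+dy:1+dy+h, 1+dx:1+dx+w]
        codes |= (neighbor >= center).astype(np.int32) << bit
    return codes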

Preparing the training data

Now that we have our data, we need to load these sample pictures into our face recognition algorithms. All face recognition algorithms take two parameters in their train() method: an array of images and an array of labels. What do these labels represent? They are the IDs of a certain individual/face so that when face recognition is performed, we not only know the person was recognized but also who—among the many people available in our database—the person is.

To do that, we need to create a comma-separated values (CSV) file, which will contain the path to a sample picture followed by the ID of that person (note that we will actually use a semicolon as the separator). In my case, I have 20 pictures generated with the previous script in the subfolder, jm/, of the folder, data/at/, which contains all the pictures of all the individuals.

My CSV file therefore looks like this:

jm/1.pgm;0
jm/2.pgm;0
jm/3.pgm;0
...
jm/20.pgm;0

Note

The dots stand for all the files in between (4.pgm through 19.pgm). The jm/ prefix indicates the subfolder, and the 0 at the end of each line is the ID for my face.
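
If you'd rather not write this file by hand, a small helper along these lines can generate it from the data/at/ folder layout (the faces.csv filename is my own choice, not from the chapter):

import os

# Walk data/at/<person>/ and emit one "relative/path.pgm;label" line per
# sample; each subfolder (person) gets the next integer ID.
with open('faces.csv', 'w') as f:
    for label, person in enumerate(sorted(os.listdir('data/at'))):
        folder = os.path.join('data/at', person)
        if not os.path.isdir(folder):
            continue
        for image in sorted(os.listdir(folder)):
            f.write('%s/%s;%d\n' % (person, image, label))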

OK, at this stage, we have everything we need to instruct OpenCV to recognize our face.

Loading the data and recognizing faces

Next up, we need to load these sample pictures and their labels into the face recognition algorithm, so it can be trained to recognize our face. To do this, we build a function that walks the data folder (mirroring the structure described by the CSV file) and, for each image found, loads it into the images array and its ID into the labels array.

import os
import sys

import cv2
import numpy as np

def read_images(path, sz=None):
    c = 0  # the current label: one numeric ID per subfolder (person)
    X,y = [], []  # X holds the image samples, y the corresponding labels
    for dirname, dirnames, filenames in os.walk(path):
        for subdirname in dirnames:
            subject_path = os.path.join(dirname, subdirname)
            for filename in os.listdir(subject_path):
                try:
                    if (filename == ".directory"):
                        continue
                    filepath = os.path.join(subject_path, filename)
                    im = cv2.imread(os.path.join(subject_path, filename), cv2.IMREAD_GRAYSCALE)

                    # resize to the given size (if given); sz is a
                    # (width, height) tuple
                    if (sz is not None):
                        im = cv2.resize(im, sz)

                    X.append(np.asarray(im, dtype=np.uint8))
                    y.append(c)
                except IOError, (errno, strerror):
                    print "I/O error({0}): {1}".format(errno, strerror)
                except:
                    print "Unexpected error:", sys.exc_info()[0]
                    raise
            c = c+1
            

    return [X,y]

Performing an Eigenfaces recognition

We're ready to test the face recognition algorithm. Here's the script to perform it:

def face_rec():
    names = ['Joe', 'Jane', 'Jack']
    if len(sys.argv) < 2:
        print "USAGE: facerec_demo.py </path/to/images> [</path/to/store/images/at>]"
        sys.exit()

    [X,y] = read_images(sys.argv[1])
    y = np.asarray(y, dtype=np.int32)
    
    if len(sys.argv) == 3:
        out_dir = sys.argv[2]
    
    model = cv2.face.createEigenFaceRecognizer()
    model.train(np.asarray(X), np.asarray(y))
    camera = cv2.VideoCapture(0)
    face_cascade = cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')
    while (True):
      read, img = camera.read()
      # convert to grayscale once per frame, before any rectangles are drawn
      gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
      faces = face_cascade.detectMultiScale(gray, 1.3, 5)
      for (x, y, w, h) in faces:
        img = cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
        # NumPy indexing is [row, column], that is, [y, x]
        roi = gray[y:y+h, x:x+w]
        try:
            roi = cv2.resize(roi, (200, 200), interpolation=cv2.INTER_LINEAR)
            params = model.predict(roi)
            print "Label: %s, Confidence: %.2f" % (params[0], params[1])
            cv2.putText(img, names[params[0]], (x, y - 20), cv2.FONT_HERSHEY_SIMPLEX, 1, 255, 2)
        except:
            continue
      cv2.imshow("camera", img)
      if cv2.waitKey(1000 / 12) & 0xff == ord("q"):
        break
    camera.release()
    cv2.destroyAllWindows()

There are a few lines that may look a bit mysterious, so let's analyze the script. First of all, there's an array of names declared; those are the actual names of the individual people I stored in my database of faces. It's great to identify a person as ID 0, but printing 'Joe' on top of a face that's been correctly detected and recognized is much more dramatic.

So whenever the script recognizes an ID, we will print the corresponding name in the names array instead of an ID.

After this, we load the images as described in the previous function, create the face recognition model with cv2.face.createEigenFaceRecognizer(), and train it by passing the two arrays of images and labels (IDs). Note that the Eigenfaces recognizer takes two important optional parameters: the first one is the number of principal components you want to keep and the second is a float value specifying a confidence threshold.

Next up, we repeat a similar process to the face detection operation. This time, though, we extend the processing of the frames by also operating face recognition on any face that's been detected.

This happens in two steps: firstly, we resize the detected face to the expected size (in my case, samples were 200x200 pixels), and then we call the predict() function on the resized region.

Note

This is a bit of a simplified process, and it serves the purpose of enabling you to have a basic application running and understand the process of face recognition in OpenCV 3. In reality, you will apply a few more optimizations, such as correctly aligning and rotating detected faces, so the accuracy of the recognition is maximized.
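
As a taste of what such an optimization might look like, here is a rough sketch (my own illustration, not part of the chapter's code) that rotates a face crop so the eyes lie on a horizontal line, given eye centers such as those found by the eye cascade earlier:

import cv2
import numpy as np

def align_face(gray_face, left_eye, right_eye):
    # left_eye and right_eye are assumed (x, y) eye centers inside gray_face
    dy = right_eye[1] - left_eye[1]
    dx = right_eye[0] - left_eye[0]
    angle = np.degrees(np.arctan2(dy, dx))  # tilt of the line joining the eyes
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)  # rotate about the midpoint
    h, w = gray_face.shape
    return cv2.warpAffine(gray_face, M, (w, h))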

Lastly, we obtain the results of the recognition and, just for effect, we draw them on the frame:

Performing an Eigenfaces recognition

Performing face recognition with Fisherfaces

What about Fisherfaces? The process doesn't change much; we simply need to instantiate a different algorithm. So, the declaration of our model variable would look like so:

model = cv2.face.createFisherFaceRecognizer()

Fisherfaces takes the same two optional arguments as Eigenfaces: the number of Fisherfaces to keep and the confidence threshold. Faces with a confidence score above this threshold will be discarded.
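
For instance, to keep all components and discard weak matches, a call along these lines should work (the 3500.0 threshold is purely an assumed starting point to tune for your own data):

# 0 keeps all Fisherfaces; 3500.0 is an assumed confidence threshold to tune
model = cv2.face.createFisherFaceRecognizer(0, 3500.0)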

Performing face recognition with LBPH

Finally, let's take a quick look at the LBPH algorithm. Again, the process is very similar. However, the parameters taken by the algorithm factory are a bit more complex as they indicate in order: radius, neighbors, grid_x, grid_y, and the confidence threshold. If you don't specify these values, they will automatically be set to 1, 8, 8, 8, and 123.0. The model declaration will look like so:

  model = cv2.face.createLBPHFaceRecognizer()
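
Spelling out all five positional parameters explicitly, with the default values quoted above, would look like this:

  # radius, neighbors, grid_x, grid_y, threshold (the defaults quoted above)
  model = cv2.face.createLBPHFaceRecognizer(1, 8, 8, 8, 123.0)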

Note

Note that with LBPH, you won't need to resize images, as the division into grids allows the comparison of patterns identified in each cell.

Discarding results with confidence score

The predict() method returns a two-element tuple: the first element is the label of the recognized individual and the second is the confidence score, which measures the distance of the detected face from the trained model; a score of 0 therefore signifies an exact match. All algorithms come with the option of setting a threshold on this score.

There may be cases in which you would rather retain all recognitions, and then apply further processing, so you can come up with your own algorithms to estimate the confidence score of a recognition; for example, if you are trying to identify people in a video, you may want to analyze the confidence score in subsequent frames to establish whether the recognition was successful or not. In this case, you can inspect the confidence score obtained by the algorithm and draw your own conclusions.

Note

The confidence score behaves completely differently in Eigenfaces/Fisherfaces and in LBPH. Eigenfaces and Fisherfaces will produce values (roughly) in the range 0 to 20,000, with any score below 4,000-5,000 being a quite confident recognition.

LBPH works similarly; however, the reference value for a good recognition is below 50, and any value above 80 is considered a low confidence score.

A normal custom approach would be to hold off drawing a rectangle around a recognized face until we have a number of frames with a satisfactory confidence score (a rough sketch of this idea follows), but you have total freedom to use OpenCV's face recognition module to tailor your application to your needs.
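
Here is a rough sketch of that idea, assuming an LBPH model, a threshold of 80, and a requirement of five consistent frames (all three values are assumptions to tune, not prescriptions):

from collections import deque

recent = deque(maxlen=5)  # labels from the last few confident predictions

def confident_label(model, roi):
    label, confidence = model.predict(roi)
    if confidence < 80:  # LBPH: lower scores mean better matches (assumed cutoff)
        recent.append(label)
    if len(recent) == recent.maxlen and len(set(recent)) == 1:
        return recent[0]  # the same person in five consecutive good reads
    return None  # not confident enough yet; hold off drawing the rectangle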
