Hand gesture recognition

What remains to be done is to classify the hand gesture based on the number of extended fingers. For example, if we find five extended fingers, we assume the hand to be open, whereas no extended fingers implies a fist. All that we are trying to do is count from zero to five and make the app recognize the corresponding number of fingers.

This is actually trickier than it might seem at first. For example, people in Europe might count to three by extending their thumb, index finger, and middle finger. If you do that in the US, people there might get horrendously confused, because they do not tend to use their thumbs when signaling the number two. This might lead to frustration, especially in restaurants (trust me). If we could find a way to generalize these two scenarios—maybe by appropriately counting the number of extended fingers—we would have an algorithm that could teach simple hand gesture recognition to not only a machine but also (maybe) to an average waitress.

As you might have guessed, the answer is related to convexity defects. As mentioned earlier, extended fingers cause defects in the convex hull. However, the inverse is not true; that is, not all convexity defects are caused by fingers! There might be additional defects caused by the wrist, as well as the overall orientation of the hand or the arm. How can we distinguish between these different causes of defects?

Distinguishing between different causes of convexity defects

The trick is to look at the angle between the farthest point from the convex hull point within the defect (farthest_pt_index) and the start and end points of the defect (start_index and end_index, respectively), as illustrated in the following screenshot:

Distinguishing between different causes of convexity defects

In this screenshot, the orange markers serve as a visual aid to center the hand in the middle of the screen, and the convex hull is outlined in green. Each red dot corresponds to the point farthest from the convex hull (farthest_pt_index) for every convexity defect detected. If we compare a typical angle that belongs to two extended fingers (such as θj) to an angle that is caused by general hand geometry (such as θi), we notice that the former is much smaller than the latter. This is obviously because humans can spread their fingers only a little, thus creating a narrow angle made by the farthest defect point and the neighboring fingertips.

Therefore, we can iterate over all convexity defects and compute the angle between the said points. For this, we will need a utility function that calculates the angle (in radians) between two arbitrary, list-like vectors, v1 and v2:

def angle_rad(v1, v2):
    return np.arctan2(np.linalg.norm(np.cross(v1, v2)), np.dot(v1, v2))

This method uses the cross product to compute the angle, rather than doing it in the standard way. The standard way of calculating the angle between two vectors v1 and v2 is by calculating their dot product and dividing it by the norm of v1 and the norm of v2. However, this method has two imperfections:

  • You have to manually avoid division by zero if either the norm of v1 or the norm of v2 is zero
  • The method returns relatively inaccurate results for small angles

Similarly, we provide a simple function to convert an angle from degrees to radians:

def deg2rad(angle_deg):
    return angle_deg/180.0*np.pi

Classifying hand gestures based on the number of extended fingers

What remains to be done is actually to classify the hand gesture based on the number of extended fingers. The _detect_num_fingers method will take as input the detected contour (contours), the convexity defects (defects), and a canvas to draw on (img_draw):

def _detect_num_fingers(self, contours, defects, img_draw):

Based on these parameters, it will then determine the number of extended fingers.

However, we first need to define a cut-off angle that can be used as a threshold to classify convexity defects as being caused by extended fingers or not. Except for the angle between the thumb and the index finger, it is rather hard to get anything close to 90 degrees, so anything close to that number should work. We do not want the cut-off angle to be too high, because that might lead to misclassifications:

self.thresh_deg = 80.0

For simplicity, let's focus on the special cases first. If we do not find any convexity defects, it means that we possibly made a mistake during the convex hull calculation, or there are simply no extended fingers in the frame, so we return 0 as the number of detected fingers:

if defects is None:
    return [0, img_draw]

However, we can take this idea even further. Due to the fact that arms are usually slimmer than hands or fists, we can assume that the hand geometry will always generate at least two convexity defects (which usually belong to the wrists). So, if there are no additional defects, it implies that there are no extended fingers:

if len(defects) <= 2:
    return [0, img_draw]

Now that we have ruled out all special cases, we can begin counting real fingers. If there is a sufficient number of defects, we will find a defect between every pair of fingers. Thus, in order to get the number right (num_fingers), we should start counting at 1:

num_fingers = 1

Then, we can start iterating over all convexity defects. For each defect, we will extract the four elements and draw its hull for visualization purposes:

for i in range(defects.shape[0]):
    # each defect point is a 4-tuplestart_idx, end_idx, farthest_idx, _ == defects[i, 0]
    start = tuple(contours[start_idx][0])
    end = tuple(contours[end_idx][0])
    far = tuple(contours[farthest_idx][0])

    # draw the hull
    cv2.line(img_draw, start, end [0, 255, 0], 2)

Then, we will compute the angle between the two edges from far to start and from far to end. If the angle is smaller than self.thresh_deg degrees, it means that we are dealing with a defect that is most likely caused by two extended fingers. In such cases, we want to increment the number of detected fingers (num_fingers), and draw the point with green. Otherwise, we draw the point with red:

# if angle is below a threshold, defect point belongs
# to two extended fingers
if angle_rad(np.subtract(start, far), np.subtract(end, far))
        < deg2rad(self.thresh_deg):
    # increment number of fingers
    num_fingers = num_fingers + 1

    # draw point as green
    cv2.circle(img_draw, far, 5, [0, 255, 0], -1)
else:
    # draw point as red
    cv2.circle(img_draw, far, 5, [255, 0, 0], -1)

After iterating over all convexity defects, we pass the number of detected fingers and the assembled output image to the recognize method:

return (min(5, num_fingers), img_draw)

This will make sure that we do not exceed the common number of fingers per hand.

The result can be seen in the following screenshots:

Classifying hand gestures based on the number of extended fingers

Interestingly, our app is able to detect the correct number of extended fingers in a variety of hand configurations. Defect points between extended fingers are easily classified as such by the algorithm, and others are successfully ignored.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.221.197.95