Chapter 4. Skeletal Motion and Face Tracking

Capturing and tracking the skeletons of one or two people is one of the most exciting features of Kinect development. It can turn many ideas into reality, including gesture recognition, multi-touch emulation, data-driven character animation, and even advanced techniques such as motion capture and model reconstruction. The skeletal mapping work of every Kinect device is done by a microprocessor in the sensor (or directly by the Xbox core), and the results can be retrieved through the corresponding APIs for use in our own applications.
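As a rough illustration of how these results are retrieved in C++, the following sketch enables skeleton tracking through the native NUI API and polls for skeleton frames. It is only an outline under simplifying assumptions: error handling is omitted, and the endless loop stands in for the application's own main loop.

#include <Windows.h>
#include <NuiApi.h>

// Minimal sketch of a polling loop for skeleton frames.
void runSkeletonLoop()
{
    // Initialize the runtime with skeleton tracking support.
    if (FAILED(NuiInitialize(NUI_INITIALIZE_FLAG_USES_SKELETON)))
        return;

    // Enable skeletal tracking; a Windows event handle can be passed instead of
    // NULL to be signalled whenever a new frame is ready.
    NuiSkeletonTrackingEnable(NULL, 0);

    while (true)  // replace with the application's own exit condition
    {
        NUI_SKELETON_FRAME frame = {0};
        if (SUCCEEDED(NuiSkeletonGetNextFrame(100, &frame)))
        {
            // Optional smoothing pass to reduce joint jitter.
            NuiTransformSmooth(&frame, NULL);
            // ... hand the frame over to the application here ...
        }
    }

    NuiShutdown();
}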

The Microsoft Kinect SDK 1.5 also includes a new face tracking module that can track the position and rotation of our heads, as well as the shapes of our eyes and mouth. It even provides APIs to compute a virtual face mesh that can be rendered directly in the 3D world. We will introduce these excellent functionalities in this chapter as well, although they are not closely related to our planned Fruit Ninja game.
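To give a first taste of what these interfaces look like, here is a minimal, heavily trimmed sketch that creates a face tracker and reads back the 3D head pose. It assumes the color and depth images are already being filled by the application elsewhere, uses the default face model, and leaves out all error handling.

#include <Windows.h>
#include <NuiApi.h>
#include <FaceTrackLib.h>

// Sketch only: colorImage and depthImage are assumed to be IFTImage objects
// that the application keeps filled with the current color and depth frames.
void trackFaceOnce(IFTImage* colorImage, IFTImage* depthImage)
{
    // Camera configurations for the 640x480 color and 320x240 depth streams.
    FT_CAMERA_CONFIG colorConfig = {640, 480, NUI_CAMERA_COLOR_NOMINAL_FOCAL_LENGTH_IN_PIXELS};
    FT_CAMERA_CONFIG depthConfig = {320, 240, NUI_CAMERA_DEPTH_NOMINAL_FOCAL_LENGTH_IN_PIXELS};

    IFTFaceTracker* tracker = FTCreateFaceTracker();
    tracker->Initialize(&colorConfig, &depthConfig, NULL, NULL);

    IFTResult* result = NULL;
    tracker->CreateFTResult(&result);

    // Package the current frames for the tracker.
    FT_SENSOR_DATA sensorData;
    sensorData.pVideoFrame  = colorImage;
    sensorData.pDepthFrame  = depthImage;
    sensorData.ZoomFactor   = 1.0f;
    sensorData.ViewOffset.x = 0;
    sensorData.ViewOffset.y = 0;

    tracker->StartTracking(&sensorData, NULL, NULL, result);

    if (SUCCEEDED(result->GetStatus()))
    {
        FLOAT scale, rotationXYZ[3], translationXYZ[3];
        result->Get3DPose(&scale, rotationXYZ, translationXYZ);
        // rotationXYZ now holds the head's pitch, yaw, and roll in degrees.
    }

    result->Release();
    tracker->Release();
}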

The face tracker API may not be located in the same directory as the Kinect SDK. If you have already installed the Developer Toolkit as discussed in Chapter 1, Getting Started with Kinect, you should be able to find it at ${FTSDK_DIR}, where the environment variable indicates the location of the Kinect Developer Toolkit.

Understanding skeletal mapping

At present, Microsoft Kinect can identify up to six people within its field of view, but it can track at most two of them in detail at the same time.

The players must stand (or sit) in front of the Kinect device, facing the sensor. If a player shows only part of his or her body to the sensor, or expects it to recognize sideways poses, the result may not be accurate: some parts of the skeleton may be placed incorrectly, or may jitter back and forth.
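One way to cope with this in code is to check the per-joint tracking state that the SDK reports and to treat joints that are only inferred with caution. A small sketch, assuming the skeleton data has already been retrieved (the helper name is ours):

#include <Windows.h>
#include <NuiApi.h>

// Returns true only for joints the sensor has actually seen; inferred joints
// are the ones most likely to be misplaced or to jitter when the body is
// partially occluded or turned sideways.
bool isJointReliable(const NUI_SKELETON_DATA& skeleton,
                     NUI_SKELETON_POSITION_INDEX joint)
{
    return skeleton.eSkeletonPositionTrackingState[joint] ==
           NUI_SKELETON_POSITION_TRACKED;
}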

Usually, the player is advised to stand between 0.8 m and 4.0 m away from the device. Kinect for Windows may perform better at close distances because it provides a near depth range mode that works from 0.4 m.
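If you want to try the near range on Kinect for Windows hardware, it has to be switched on explicitly. A hedged sketch, assuming hDepthStream is the handle returned earlier by NuiImageStreamOpen():

#include <Windows.h>
#include <NuiApi.h>

// Sketch: enabling near range for the depth stream and for skeleton tracking.
// Only Kinect for Windows hardware supports this mode.
void enableNearRange(HANDLE hDepthStream)
{
    // Let the depth stream report values down to roughly 0.4 m.
    NuiImageStreamSetImageFrameFlags(hDepthStream,
        NUI_IMAGE_STREAM_FLAG_ENABLE_NEAR_MODE);

    // Keep tracking skeletons of players standing in the near range.
    NuiSkeletonTrackingEnable(NULL,
        NUI_SKELETON_TRACKING_FLAG_ENABLE_IN_NEAR_RANGE);
}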

In every frame, Kinect calculates a skeleton for each tracked person, which consists of 20 joints representing a complete human body. The positions and meanings of these joints are shown in the following figure:

The skeleton mapping
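In code, each NUI_SKELETON_FRAME carries up to six NUI_SKELETON_DATA entries, and every fully tracked entry exposes these 20 joints through an array indexed by the joint enumeration. A small sketch that reads one joint (the right hand) of each tracked player:

#include <cstdio>
#include <Windows.h>
#include <NuiApi.h>

// Sketch: iterate over the skeletons of a frame and print the right-hand
// position of every fully tracked player (in meters, skeleton space).
void printRightHands(const NUI_SKELETON_FRAME& frame)
{
    for (int i = 0; i < NUI_SKELETON_COUNT; ++i)   // up to six people
    {
        const NUI_SKELETON_DATA& skeleton = frame.SkeletonData[i];
        if (skeleton.eTrackingState != NUI_SKELETON_TRACKED)
            continue;   // at most two skeletons reach this state

        const Vector4& hand =
            skeleton.SkeletonPositions[NUI_SKELETON_POSITION_HAND_RIGHT];
        printf("Player %d right hand: %.2f %.2f %.2f\n",
               i, hand.x, hand.y, hand.z);
    }
}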

Note

Kinect uses infrared light to calculate the depth of the people in front of it and reconstructs the skeleton from that depth data. So if you are using multiple Kinect devices for more precise skeleton mapping or other purposes, any other infrared light source (including another Kinect) within the field of view will interfere with the current device and thus reduce the precision of the computation. The interference may be small, but we still have to avoid this problem in practical setups.
