Chapter 3. Skeletal Tracking

Skeletal tracking allows applications to recognize people and follow their actions. Combined with gesture-based programming, skeletal tracking enables applications to provide a natural user interface and improves their overall usability and ease of use.

In this chapter we will learn how to enable and handle the skeleton data stream. In particular, we will address the following:

  • Tracking users by analyzing the skeleton data streamed by the Kinect and mapping it to the color stream
  • Understanding what joints are and which joints are tracked in near mode and seated mode
  • Observing the movements of the tracked users to detect simple actions

Mastering the skeleton data stream enables us to implement applications that track the user's actions and recognize the user's gestures.

The Kinect sensor, thanks to its IR camera, can recognize up to six users in its field of view. Of these, only up to two users can be fully tracked, while the others are tracked from a single point only, as demonstrated in the following image:

Tracking up to six users in the field of view

Tracking users

The application flow for tracking users is very similar to the process we described for color frame and depth frame management:

  1. Firstly, we need to ensure that at least one Kinect sensor is connected.
  2. Secondly, we have to enable the stream (in this case the skeleton one).
  3. And finally, we need to handle the frames that the sensor is streaming through the relevant SDK APIs.

In this chapter we will mention only the code that is relevant to skeletal tracking. The source code attached to the book includes all the detailed code, and we can refer to the previous chapter to review how to address step 1.

To enable the skeleton stream, we simply invoke the KinectSensor.SkeletonStream.Enable() method.

In the skeleton stream, the Kinect sensor streams out skeleton tracking data. This data is structured in the Skeleton class as a collection of joints. A joint is a point at which two skeleton bones are joined. This point is defined by the SkeletonPoint structure, which defines a 3D position in the skeleton space, expressed in meters by the three values (x, y, z). We have up to twenty joints per single skeleton. A detailed list of the joint types is defined by the JointType enumeration at http://msdn.microsoft.com/en-us/library/microsoft.kinect.jointtype.aspx.
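As a quick orientation, the following is a minimal sketch of how we can read the 3D position of a single joint; the skeleton variable is assumed to hold a tracked Skeleton instance:

// read the head joint from the skeleton's joint collection
Joint head = skeleton.Joints[JointType.Head];
SkeletonPoint headPosition = head.Position;

// X, Y, and Z are expressed in meters in the skeleton space
Console.WriteLine("Head at ({0:F2}, {1:F2}, {2:F2}) meters",
    headPosition.X, headPosition.Y, headPosition.Z);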

We are going to store the skeleton data in the private Skeleton[] skeletonData array, which we size as per the sensor.SkeletonStream.FrameSkeletonArrayLength property. This property provides the total length of the skeleton data buffer for the SkeletonFrame class, which is large enough for skeleton tracking to fully track the active skeletons and/or track the location of the remaining ones.

We enable our application to listen to and manage the skeleton stream by defining the void sensor_AllFramesReady(object sender, AllFramesReadyEventArgs e) event handler and attaching it to the this.sensor.AllFramesReady event.

The following code snippet summarizes the necessary steps to enable the skeleton stream:

//handle the status changed event for the current sensor.
//All the available status values are defined in the Microsoft.Kinect.KinectStatus enum
void KinectSensors_StatusChanged(object sender, StatusChangedEventArgs e)
{
    //select the first (if any available) connected Kinect sensor from the KinectSensor.KinectSensors collection
    this.sensor = KinectSensor.KinectSensors.FirstOrDefault(s => s.Status == KinectStatus.Connected);

    if (null != this.sensor)
    {
        //enable the skeleton stream
        sensor.SkeletonStream.Enable();

        // Allocate skeleton data
        skeletonData = new Skeleton[sensor.SkeletonStream.FrameSkeletonArrayLength];

        // subscribe to the event raised when all frames are ready
        this.sensor.AllFramesReady += sensor_AllFramesReady;

        // Start the sensor
        try
        {
            this.sensor.Start();
        }
        catch (IOException)
        {
            this.sensor = null;
        }
    }
}

Note

As you may have noticed, we subscribed to the AllFramesReady event, which is raised when all the frames (color, depth, and skeleton) are ready. We could instead subscribe to the SkeletonFrameReady event, which is raised when only the skeleton frame is ready. As we will see soon, we opted for the AllFramesReady event because, in our example, we need to handle both the skeleton and the color frames.

In this example we manage the skeleton stream by reacting to the frame ready event. We could apply the same considerations discussed for the color frame and approach skeleton tracking using the polling technique instead. To do so, we would leverage the SkeletonStream.OpenNextFrame() method rather than subscribing to the AllFramesReady or SkeletonFrameReady events, as sketched in the following snippet.
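The following is a minimal polling sketch; it assumes the sensor has been started and the skeletonData array has been allocated as shown earlier. OpenNextFrame blocks for up to the given timeout (in milliseconds) and returns null if no new frame becomes available:

// poll the next skeleton frame, waiting up to 100 milliseconds
using (SkeletonFrame skeletonFrame = this.sensor.SkeletonStream.OpenNextFrame(100))
{
    if (skeletonFrame != null)
    {
        // copy and process the skeleton data as in the event-based approach
        skeletonFrame.CopySkeletonDataTo(this.skeletonData);
    }
}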

At this stage the code written in the sensor_AllFramesReady event handler should:

  • Handle the color stream data
  • Handle the skeleton stream data
  • Visualize the skeleton drawing overlapping the color frame

The following code snippet embeds all the activities aforementioned:

/// <summary>
/// manage the entire stream data received from the sensor
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
void sensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    using (ColorImageFrame colorFrame = e.OpenColorImageFrame())
    {
        if (colorFrame != null)
        {
            //copy the color frame's pixel data to the array
            colorFrame.CopyPixelDataTo(this.colorPixels);

            //draw the WriteableBitmap
            this.colorBitmap.WritePixels(
                    new Int32Rect(0, 0, this.colorBitmap.PixelWidth, this.colorBitmap.PixelHeight),
                    this.colorPixels, this.colorBitmap.PixelWidth * colorFrame.BytesPerPixel, 0);
        }
    }

    //handle the skeleton stream data
    // Open the skeleton frame
    using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
    {
        // check that a frame is available
        if (skeletonFrame != null && this.skeletonData != null)
        {
            // get the skeletal information in this frame
            skeletonFrame.CopySkeletonDataTo(this.skeletonData);
        }
    }

    //draw the output
    using (DrawingContext dc = this.drawingGroup.Open())
    {
        // draw the color stream output
        dc.DrawImage(this.colorBitmap, new Rect(0.0, 0.0, RenderWidth, RenderHeight));

        //draw the skeleton stream data
        DrawSkeletons(dc);

        // define the limited area for rendering the visual outcome
        this.drawingGroup.ClipGeometry = new RectangleGeometry(new Rect(0.0, 0.0, RenderWidth, RenderHeight));
    }
}

For all the explanations related to the color stream data and frame, we can refer to the previous chapter. Let's now focus on the skeleton data stream and on how we visualize it overlapping the color frame.

Copying the skeleton data

Thanks to the SkeletonFrame.CopySkeletonDataTo method, we can copy the skeleton data to our skeletonData array, where we store each skeleton as a collection of joints.

We can draw the skeleton data overlapping the color frame on the screen thanks to an instance of the System.Windows.Media.DrawingContext class. We obtain this instance by calling the Open() method of the System.Windows.Media.DrawingGroup class.

There are certainly other ways we could obtain the same graphical result. Having said that, the DrawingGroup class provides a handy solution to our problem, where we need to handle a collection of bones and joints that can be operated upon as a single image.

RenderWidth and RenderHeight are two double constants set to 640.0 and 480.0 respectively. We use them to handle the width and height dimensions of the image we display.
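For reference, the declarations might look as follows (a sketch matching the 640x480 resolution assumed throughout this example):

// render dimensions matching the 640x480 color stream resolution
private const double RenderWidth = 640.0;
private const double RenderHeight = 480.0;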

The following code snippet initializes the DrawingImage imageSource and DrawingGroup drawingGroup variables we use for displaying the graphical outcome of this chapter's example:

this.drawingGroup = new DrawingGroup();

// Create an image source that we can use in our image control
this.imageSource = new DrawingImage(this.drawingGroup);

// Display the drawing using our image control
imgMain.Source = this.imageSource;

For drawing the skeletons, we loop through the entire skeleton data and render it skeleton by skeleton. For the skeletons that are fully tracked, we draw a complete skeleton composed of bones and joints. For the skeletons that cannot be fully tracked, we draw a single ellipse only to highlight their position. We also highlight when a user moves to the edge of the field of view, which provides visual feedback indicating that the user's skeleton has been clipped:

/// <summary>
/// Draw the skeletons defined in the skeleton data
/// </summary>
/// <param name="drawingContext">dc used to design lines and ellipses representing bones and joints</param>
private void DrawSkeletons(DrawingContext drawingContext)
{
    foreach (Skeleton skeleton in this.skeletonData)
    {
        if (skeleton != null)
        {
            // Fully tracked skeleton
            if (skeleton.TrackingState == SkeletonTrackingState.Tracked)
            {
                DrawTrackedSkeletonJoints(skeleton.Joints, drawingContext);
            }
            // Recognized position of the skeleton
            else if (skeleton.TrackingState == SkeletonTrackingState.PositionOnly)
            {
                DrawSkeletonPosition(skeleton.Position, drawingContext);
            }

            //handle clipped edges
            RenderClippedEdges(skeleton, drawingContext);
        }
    }
}

We render the fully tracked skeletons using lines to represent bones and ellipses to represent joints. A section of the body is defined as a set of bones and their related joints. The following code snippet highlights the mechanism used to render the head and shoulders. We could apply the same mechanism to render the left arm, the right arm, the body, the left leg, and the right leg:

/// <summary>
/// Draw the skeleton joints successfully fully tracked
/// </summary>
/// <param name="jointCollection">joint collection to draw</param>
/// <param name="drawingContext">design the graphical output</param>
private void DrawTrackedSkeletonJoints(JointCollection jointCollection, DrawingContext drawingContext)
{
    // Render head and shoulders
    DrawBone(jointCollection[JointType.Head], jointCollection[JointType.ShoulderCenter], drawingContext);
    DrawBone(jointCollection[JointType.ShoulderCenter], jointCollection[JointType.ShoulderLeft], drawingContext);
    DrawBone(jointCollection[JointType.ShoulderCenter], jointCollection[JointType.ShoulderRight], drawingContext);

    // Render other bones...

    // Render all the joints
    foreach (Joint singleJoint in jointCollection)
    {
        DrawJoint(singleJoint, drawingContext);
    }
}

We render a skeleton identified with its position only using a single azure-colored ellipse, as defined in the following code snippet:

/// <summary>
/// Draw the skeleton position only
/// </summary>
/// <param name="skeletonPoint">skeleton single point</param>
/// <param name="drawingContext">dc used to design the graphical output</param>
private void DrawSkeletonPosition(SkeletonPoint skeletonPoint, DrawingContext drawingContext)
{
    drawingContext.DrawEllipse(Brushes.Azure, null, this.SkeletonPointToScreen(skeletonPoint), 2, 2);
}

The following code demonstrates how we can provide visual feedback when the user moves to the edge of the field of view. Thanks to the Skeleton.ClippedEdges property, tested with the HasFlag method, the skeletal tracking system provides feedback whenever the user's skeleton has been clipped on a given edge:

/// <summary>
/// Highlights the edge where the skeleton data has been clipped
/// </summary>
/// <param name="skeleton">single skeleton</param>
/// <param name="drawingContext">dc used to design the graphical output</param>
private void RenderClippedEdges(Skeleton skeleton, DrawingContext drawingContext)
{
    //tests whether the user's skeleton has been clipped or not
    if (skeleton.ClippedEdges.HasFlag(FrameEdges.Bottom))
    {
        // colors the bottom border when the user is reaching it
        drawingContext.DrawRectangle(
                Brushes.Red,
                null,
                new Rect(0, RenderHeight - 10, RenderWidth, 10));
    }

    //manage the other edges
}
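The remaining edges follow the same pattern as the bottom one. A sketch of how the //manage the other edges comment might be filled in, reusing the same 10-pixel red border:

// color the top, left, and right borders when the skeleton is clipped there
if (skeleton.ClippedEdges.HasFlag(FrameEdges.Top))
{
    drawingContext.DrawRectangle(Brushes.Red, null, new Rect(0, 0, RenderWidth, 10));
}
if (skeleton.ClippedEdges.HasFlag(FrameEdges.Left))
{
    drawingContext.DrawRectangle(Brushes.Red, null, new Rect(0, 0, 10, RenderHeight));
}
if (skeleton.ClippedEdges.HasFlag(FrameEdges.Right))
{
    drawingContext.DrawRectangle(Brushes.Red, null, new Rect(RenderWidth - 10, 0, 10, RenderHeight));
}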

As stated previously, we consider a bone to be a line connecting two adjacent joints. A single joint can assume a TrackingState value defined by the JointTrackingState enum: NotTracked, Inferred, or Tracked. We define a bone as tracked if and only if both of its joints have TrackingState equal to JointTrackingState.Tracked. We define a bone as non-tracked if at least one of its joints has TrackingState equal to JointTrackingState.Inferred. We cannot render the bone at all if any of its joints has TrackingState equal to JointTrackingState.NotTracked:

/// <summary>
/// draw a bone as a line between two given joints
/// </summary>
/// <param name="jointFrom">starting joint of the bone</param>
/// <param name="jointTo">ending joint of the bone</param>
/// <param name="drawingContext">dc used to design the graphical output</param>
private void DrawBone(Joint jointFrom, Joint jointTo, DrawingContext drawingContext)
{
    if (jointFrom.TrackingState == JointTrackingState.NotTracked ||
        jointTo.TrackingState == JointTrackingState.NotTracked)
    {
        return; // nothing to draw, one of the joints is not tracked
    }

    if (jointFrom.TrackingState == JointTrackingState.Inferred ||
        jointTo.TrackingState == JointTrackingState.Inferred)
    {
        // Draw thin lines if either one of the joints is inferred
        DrawNonTrackedBoneLine(jointFrom.Position, jointTo.Position, drawingContext);
    }

    if (jointFrom.TrackingState == JointTrackingState.Tracked &&
        jointTo.TrackingState == JointTrackingState.Tracked)
    {
        // Draw bold lines if the joints are both tracked
        DrawTrackedBoneLine(jointFrom.Position, jointTo.Position, drawingContext);
    }
}

We draw the bone simply by calling the DrawingContext.DrawLine method. We can use two different colors for differentiating between tracked bones and non-tracked bones. For example, we can define Pen trackedBonePen = new Pen(Brushes.Gold, 6) for tracked bones. The following method defines the way we render tracked bones:

/// <summary>
/// draw a line representing a tracked bone
/// </summary>
/// <param name="skeletonPointFrom">starting point of the bone</param>
/// <param name="skeletonPointTo">ending point of the bone</param>
/// <param name="drawingContext">dc used to design the graphical output</param>
private void DrawTrackedBoneLine(SkeletonPoint skeletonPointFrom, SkeletonPoint skeletonPointTo, DrawingContext drawingContext)
{
    drawingContext.DrawLine(this.trackedBonePen, this.SkeletonPointToScreen(skeletonPointFrom), this.SkeletonPointToScreen(skeletonPointTo));
}
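DrawNonTrackedBoneLine, invoked from DrawBone when a joint is inferred, can be implemented symmetrically. The following is a minimal sketch; the thin gray pen is our illustrative choice, not mandated by the SDK:

// assumed pen for inferred (non-tracked) bones; color and thickness are illustrative
private readonly Pen nonTrackedBonePen = new Pen(Brushes.Gray, 1);

/// <summary>
/// draw a thin line representing a non-tracked (inferred) bone
/// </summary>
private void DrawNonTrackedBoneLine(SkeletonPoint skeletonPointFrom, SkeletonPoint skeletonPointTo, DrawingContext drawingContext)
{
    drawingContext.DrawLine(this.nonTrackedBonePen, this.SkeletonPointToScreen(skeletonPointFrom), this.SkeletonPointToScreen(skeletonPointTo));
}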

Similarly, we can draw the joints as ellipses and differentiate those with TrackingState equal to JointTrackingState.Tracked from those with TrackingState equal to JointTrackingState.Inferred or JointTrackingState.NotTracked. The following code snippet, which we wrap in the DrawJoint method invoked from DrawTrackedSkeletonJoints, indicates how we can render the joint and adjust it according to the joint's TrackingState:

/// <summary>
/// draw a single joint, adjusting the rendering to its tracking state
/// </summary>
/// <param name="singleJoint">the joint to draw</param>
/// <param name="drawingContext">dc used to design the graphical output</param>
private void DrawJoint(Joint singleJoint, DrawingContext drawingContext)
{
    if (singleJoint.TrackingState == JointTrackingState.NotTracked)
    {
        return; // nothing to draw
    }

    if (singleJoint.TrackingState == JointTrackingState.Inferred)
    {
        // Draw thin ellipse if the joint is inferred
        DrawNonTrackedJoint(singleJoint, drawingContext);
    }

    if (singleJoint.TrackingState == JointTrackingState.Tracked)
    {
        // Draw bold ellipse if the joint is tracked
        DrawTrackedJoint(singleJoint, drawingContext);
    }
}

private void DrawTrackedJoint(Joint singleJoint, DrawingContext drawingContext)
{
    drawingContext.DrawEllipse(
                    this.trackedJointBrush,
                    null,
                    this.SkeletonPointToScreen(singleJoint.Position),
                    10, 10);
}
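DrawNonTrackedJoint, called for inferred joints, can mirror DrawTrackedJoint with a thinner ellipse. A minimal sketch, where the Brushes.Yellow brush and the 2-pixel radius are our illustrative assumptions:

private void DrawNonTrackedJoint(Joint singleJoint, DrawingContext drawingContext)
{
    // a smaller, differently colored ellipse visually distinguishes inferred joints
    drawingContext.DrawEllipse(
                    Brushes.Yellow,
                    null,
                    this.SkeletonPointToScreen(singleJoint.Position),
                    2, 2);
}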

To visualize the single skeletons overlapping the color image in the right position, we utilize the CoordinateMapper.MapSkeletonPointToColorPoint method, which maps a point from skeleton space to color space:

/// <summary>
/// Maps a SkeletonPoint to lie within our render space and converts to Point
/// </summary>
/// <param name="skelpoint">point to map</param>
/// <returns>mapped point</returns>
private Point SkeletonPointToScreen(SkeletonPoint skelpoint)
{
    // Convert point to color space.
    // We are assuming our output resolution to be 640x480.
    ColorImagePoint colorPoint = this.sensor.CoordinateMapper.MapSkeletonPointToColorPoint(skelpoint, ColorImageFormat.RgbResolution640x480Fps30);
    return new Point(colorPoint.X, colorPoint.Y);
}

We are now ready: our skeletons overlap the color data stream and we can take a funny x-ray of ourselves. The full list of joints is detailed in the JointType enumeration available online at http://msdn.microsoft.com/en-us/library/microsoft.kinect.jointtype.aspx. The joint states are detailed in the JointTrackingState enumeration available at http://msdn.microsoft.com/en-us/library/microsoft.kinect.jointtrackingstate.aspx.

Note

By default, the Kinect sensor's skeletal tracking selects the first two recognized users in the field of view. We can use the AppChoosesSkeletons and ChooseSkeletons members of the SkeletonStream class to actively choose in the application which skeletons to track among the six users recognized in the field of view.

We may decide to track the closest skeleton, or the skeleton that falls within a predefined distance interval. The source code attached to this chapter defines a simple routine for tracking the closest skeleton; a sketch of such a routine follows this note.

The remaining four skeletons are tracked by highlighting the HipCenter joint (center, between the hips) only.
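For reference, here is a minimal sketch of how such a closest-skeleton routine might look. This is our illustrative version, not necessarily identical to the one in the attached source code:

// a sketch: choose and fully track only the skeleton closest to the sensor
private void TrackClosestSkeleton()
{
    if (this.sensor == null || this.skeletonData == null)
    {
        return;
    }

    // take over the skeleton selection from the default tracking logic
    if (!this.sensor.SkeletonStream.AppChoosesSkeletons)
    {
        this.sensor.SkeletonStream.AppChoosesSkeletons = true;
    }

    float closestDistance = float.MaxValue;
    int closestId = 0;

    foreach (Skeleton skeleton in this.skeletonData)
    {
        // consider only the recognized skeletons and compare their depth (Z)
        if (skeleton != null &&
            skeleton.TrackingState != SkeletonTrackingState.NotTracked &&
            skeleton.Position.Z < closestDistance)
        {
            closestDistance = skeleton.Position.Z;
            closestId = skeleton.TrackingId;
        }
    }

    if (closestId > 0)
    {
        // instruct the skeletal tracking engine to fully track this skeleton only
        this.sensor.SkeletonStream.ChooseSkeletons(closestId);
    }
}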
