Chapter 9. You are the mouse!

Kinect is clearly a new way to communicate with the computer. Applications quickly took advantage of the versatility of the mouse when it was introduced in the early days of computing, and applications must now evolve to take into account a new way of obtaining input data: through the Kinect sensor.

Before the mouse was invented, user interfaces were all based on the keyboard. Users had to type directions and data, and they had to use the Tab key to move from one field to another. Screens were based on text boxes and text blocks.

Then the mouse appeared, and user interfaces quickly started to change and adapt to a new way of providing input to the computer. Graphical user interfaces (GUIs) appeared, and users started to manipulate them and interact with them using the mouse.

This same process will repeat with Kinect. For now, interfaces are designed to be used with a mouse, a keyboard, a touchpad, or a touch screen. These interfaces are not ready for Kinect yet, and Kinect is not precise enough to move a cursor on a screen effectively—yet.

Computer interfaces must evolve, but before that happens, you can consider using Kinect to control the mouse—with some limitations.

First, it’s probably not a good idea to use Kinect to control the mouse in your office for everyday work applications. The human body is not designed to keep both hands in the air for long periods, especially when sitting, so there are limits to how well you can control an application you use sitting at a desk through the Kinect sensor.

However, you can think about using Kinect to control the mouse in situations where you can stand up, because in that position, it’s easier to move your hands around.

In this chapter, you will discover how you can control the mouse pointer with Kinect, as well as how you can reduce the jitter (the undesired deviation of the values provided by the sensor) to get a smoother movement.

Controlling the mouse pointer

Before looking at the Kinect-specific code you need to control the mouse, you must import a Win32 function called SendInput (http://pinvoke.net/default.aspx/user32/SendInput.html). This function synthesizes input events (mouse, keyboard, hardware) and injects them into the system input stream, as if they came from a physical device. It uses some basic Win32 structures, defined as follows in a file called enum.cs:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;

namespace Kinect.Toolbox
{
    [Flags]
    internal enum MouseEventDataXButtons : uint
    {
        Nothing = 0x00000000,
        XBUTTON1 = 0x00000001,
        XBUTTON2 = 0x00000002
    }

    [Flags]
    internal enum MOUSEEVENTF : uint
    {
        ABSOLUTE = 0x8000,
        HWHEEL = 0x1000,
        MOVE = 0x0001,
        MOVE_NOCOALESCE = 0x2000,
        LEFTDOWN = 0x0002,
        LEFTUP = 0x0004,
        RIGHTDOWN = 0x0008,
        RIGHTUP = 0x0010,
        MIDDLEDOWN = 0x0020,
        MIDDLEUP = 0x0040,
        VIRTUALDESK = 0x4000,
        WHEEL = 0x0800,
        XDOWN = 0x0080,
        XUP = 0x0100
    }

    [StructLayout(LayoutKind.Sequential)]
    internal struct MOUSEINPUT
    {
        internal int dx;
        internal int dy;
        internal MouseEventDataXButtons mouseData;
        internal MOUSEEVENTF dwFlags;
        internal uint time;
        internal UIntPtr dwExtraInfo;
    }
}

The MOUSEINPUT structure provides two members (dx and dy) to define the mouse offset relative to its current position. It is included in a more global structure, INPUT (which you also have to include in your enum.cs file):

[StructLayout(LayoutKind.Explicit)]
internal struct INPUT
{
    [FieldOffset(0)]
    internal int type;
    [FieldOffset(4)]
    internal MOUSEINPUT mi;

    public static int Size
    {
        get { return Marshal.SizeOf(typeof(INPUT)); }
    }
}

The complete structure includes members for keyboard and hardware inputs, but they are not required for our application.

You can now create the following class to handle the control of the mouse:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;

namespace Kinect.Toolbox
{
    public static class MouseInterop
    {
        [DllImport("user32.dll")]
        private static extern uint SendInput(int nInputs, INPUT[] inputs, int size);

        static bool leftClickDown;

        public static void ControlMouse(int dx, int dy, bool leftClick)
        {
            INPUT[] inputs = new INPUT[2];
            int inputCount = 1;

            inputs[0] = new INPUT();
            inputs[0].type = 0; // INPUT_MOUSE
            inputs[0].mi.dx = dx;
            inputs[0].mi.dy = dy;
            inputs[0].mi.dwFlags = MOUSEEVENTF.MOVE;

            if (!leftClickDown && leftClick)
            {
                inputs[1] = new INPUT();
                inputs[1].type = 0; // INPUT_MOUSE
                inputs[1].mi.dwFlags = MOUSEEVENTF.LEFTDOWN;
                leftClickDown = true;
                inputCount = 2;
            }
            else if (leftClickDown && !leftClick)
            {
                inputs[1] = new INPUT();
                inputs[1].type = 0; // INPUT_MOUSE
                inputs[1].mi.dwFlags = MOUSEEVENTF.LEFTUP;
                leftClickDown = false;
                inputCount = 2;
            }

            // Only send the INPUT entries that were actually filled in
            SendInput(inputCount, inputs, INPUT.Size);
        }
    }
}

The code for the ControlMouse method is fairly straightforward: you create a first INPUT value to set the mouse position.

Note

Use MOUSEEVENTF.MOVE to specify position offsets and use MOUSEEVENTF.ABSOLUTE to specify absolute positions.
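When you use MOUSEEVENTF.ABSOLUTE (combined with MOUSEEVENTF.MOVE), dx and dy are not expressed in pixels but in normalized absolute coordinates from 0 to 65535, where (0, 0) is the top-left corner and (65535, 65535) is the bottom-right corner of the primary display. A small helper like the following (a hypothetical addition, not part of the toolbox code) performs that conversion:

```csharp
// Converts a normalized [0, 1] coordinate into the 0..65535 range
// expected by SendInput when MOUSEEVENTF.ABSOLUTE is specified.
static int ToAbsoluteCoordinate(float normalized)
{
    // Clamp to the valid range first
    if (normalized < 0f) normalized = 0f;
    if (normalized > 1f) normalized = 1f;

    return (int)(normalized * 65535f);
}
```

For example, the center of the screen (0.5, 0.5) maps to (32767, 32767).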

The second INPUT value is used to control the left mouse click. You use a leftClickDown boolean to handle the Up/Down events pair.

Using skeleton analysis to move the mouse pointer

Using the skeleton stream, you can track a hand and use its position to move the mouse. This is fairly simple to accomplish, because from the point of view of the screen, the position of the hand can be directly mapped to the screen space.

The basic approach

This first approach is fairly simple; you simply use the Tools.Convert method to get a two-dimensional (2D) projected value of the hand position:

public static Vector2 Convert(KinectSensor sensor, SkeletonPoint position)
{
    float width = 0;
    float height = 0;
    float x = 0;
    float y = 0;

    if (sensor.ColorStream.IsEnabled)
    {
        var colorPoint = sensor.MapSkeletonPointToColor(position, sensor.ColorStream.Format);
        x = colorPoint.X;
        y = colorPoint.Y;

        switch (sensor.ColorStream.Format)
        {
            case ColorImageFormat.RawYuvResolution640x480Fps15:
            case ColorImageFormat.RgbResolution640x480Fps30:
            case ColorImageFormat.YuvResolution640x480Fps15:
                width = 640;
                height = 480;
                break;
            case ColorImageFormat.RgbResolution1280x960Fps12:
                width = 1280;
                height = 960;
                break;
        }
    }
    else if (sensor.DepthStream.IsEnabled)
    {
        var depthPoint = sensor.MapSkeletonPointToDepth(position, sensor.DepthStream.Format);
        x = depthPoint.X;
        y = depthPoint.Y;

        switch (sensor.DepthStream.Format)
        {
            case DepthImageFormat.Resolution80x60Fps30:
                width = 80;
                height = 60;
                break;
            case DepthImageFormat.Resolution320x240Fps30:
                width = 320;
                height = 240;
                break;
            case DepthImageFormat.Resolution640x480Fps30:
                width = 640;
                height = 480;
                break;
        }
    }
    else
    {
        width = 1;
        height = 1;
    }

    return new Vector2(x / width, y / height);
}

Using this method, you can create the following class:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Windows.Forms; // You must add a reference to this assembly in your project
using System.Windows.Media.Media3D;
using Microsoft.Kinect;

namespace Kinect.Toolbox
{
    public class MouseController
    {
        Vector2? lastKnownPosition;

        public void SetHandPosition(KinectSensor sensor, Joint joint, Skeleton skeleton)
        {
            Vector2 vector2 = Tools.Convert(sensor, joint.Position);

            if (!lastKnownPosition.HasValue)
            {
                lastKnownPosition = vector2;
                return;
            }
            // GlobalSmooth will be added in the next part
            MouseInterop.ControlMouse(
                (int)((vector2.X - lastKnownPosition.Value.X) * Screen.PrimaryScreen.Bounds.Width * GlobalSmooth),
                (int)((vector2.Y - lastKnownPosition.Value.Y) * Screen.PrimaryScreen.Bounds.Height * GlobalSmooth),
                false);

            lastKnownPosition = vector2;
        }
    }
}
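The pixel offset passed to ControlMouse is simply the change in the normalized hand position scaled by the screen dimensions (and damped by GlobalSmooth). Extracted as a standalone helper, the arithmetic looks like this (a hypothetical refactoring shown for illustration only):

```csharp
// Converts a change in normalized position (0..1) into a pixel offset
// for the given screen dimension, damped by a smoothing factor.
static int ToPixelDelta(float current, float previous, int screenSize, float damping)
{
    return (int)((current - previous) * screenSize * damping);
}
```

For example, a hand movement of a quarter of the field of view on a 1920-pixel-wide screen with a damping factor of 0.5 yields a 240-pixel offset.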

The main concern with this solution is the jitter produced by the Kinect sensor. It can lead to a really annoying experience because a user may have to work hard to maintain the proper position to control the mouse effectively.

Adding a smoothing filter

To improve the overall stability of the motion tracked by the Kinect sensor, you may decide you want to update your user interface. (You will learn more about doing this in Chapter 10.)

But you can also try to improve the smoothness of your data input by using a custom filter on the raw data. For instance, you can use the Holt Double Exponential Smoothing filter—there is a good description of the algorithm for this filter on Wikipedia at http://en.wikipedia.org/wiki/Exponential_smoothing.

The algorithm first smooths the data to reduce jitter, and then looks for a trend in the past data in order to predict new values.
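To make the idea concrete, here is a minimal one-dimensional sketch of double exponential smoothing (illustrative only; the MouseController filter in this chapter adds a jitter radius and frame-count handling on top of this core):

```csharp
// Minimal 1D Holt double exponential smoothing sketch.
// alpha: data smoothing factor, beta: trend smoothing factor.
sealed class HoltFilter1D
{
    readonly float alpha;
    readonly float beta;
    float level;   // current smoothed estimate
    float trend;   // current estimated trend
    int count;

    public HoltFilter1D(float alpha, float beta)
    {
        this.alpha = alpha;
        this.beta = beta;
    }

    public float Filter(float value)
    {
        if (count++ == 0)
        {
            // First sample: no history yet, pass it through
            level = value;
            trend = 0f;
            return value;
        }

        float previousLevel = level;

        // Blend the new sample with the previous estimate plus its trend
        level = alpha * value + (1f - alpha) * (previousLevel + trend);

        // Update the trend from the change in the smoothed level
        trend = beta * (level - previousLevel) + (1f - beta) * trend;

        return level;
    }
}
```

A steady input passes through unchanged, while a noisy input is pulled toward the recent trend, which is exactly the behavior you want for a cursor.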

Following is the code you need to add to the MouseController class in order to apply the filter:

// Filters
        Vector2 savedFilteredJointPosition;
        Vector2 savedTrend;
        Vector2 savedBasePosition;
        int frameCount;

        public float TrendSmoothingFactor
        {
            get;
            set;
        }

        public float JitterRadius
        {
            get;
            set;
        }

        public float DataSmoothingFactor
        {
            get;
            set;
        }

        public float PredictionFactor
        {
            get;
            set;
        }

        public float GlobalSmooth
        {
            get;
            set;
        }

        public MouseController()
        {
            TrendSmoothingFactor = 0.25f;
            JitterRadius = 0.05f;
            DataSmoothingFactor = 0.5f;
            PredictionFactor = 0.5f;

            GlobalSmooth = 0.9f;
        }

        Vector2 FilterJointPosition(KinectSensor sensor, Joint joint)
        {
            Vector2 filteredJointPosition;
            Vector2 differenceVector;
            Vector2 currentTrend;
            float distance;

            Vector2 baseJointPosition = Tools.Convert(sensor, joint.Position);
            Vector2 prevFilteredJointPosition = savedFilteredJointPosition;
            Vector2 previousTrend = savedTrend;
            Vector2 previousBaseJointPosition = savedBasePosition;

            // Checking frames count
            switch (frameCount)
            {
                case 0:
                    filteredJointPosition = baseJointPosition;
                    currentTrend = Vector2.Zero;
                    break;
                case 1:
                    filteredJointPosition = (baseJointPosition + previousBaseJointPosition) * 0.5f;
                    differenceVector = filteredJointPosition - prevFilteredJointPosition;
                    currentTrend = differenceVector * TrendSmoothingFactor +
                        previousTrend * (1.0f - TrendSmoothingFactor);
                    break;
                default:
                    // Jitter filter
                    differenceVector = baseJointPosition - prevFilteredJointPosition;
                    distance = Math.Abs(differenceVector.Length());

                    if (distance <= JitterRadius)
                    {
                        filteredJointPosition = baseJointPosition * (distance / JitterRadius) +
                            prevFilteredJointPosition * (1.0f - (distance / JitterRadius));
                    }
                    else
                    {
                        filteredJointPosition = baseJointPosition;
                    }

                    // Double exponential smoothing filter
                    filteredJointPosition = filteredJointPosition * (1.0f - DataSmoothingFactor) +
                        (prevFilteredJointPosition + previousTrend) * DataSmoothingFactor;

                    differenceVector = filteredJointPosition - prevFilteredJointPosition;
                    currentTrend = differenceVector * TrendSmoothingFactor +
                        previousTrend * (1.0f - TrendSmoothingFactor);
                    break;
            }

            // Compute potential new position
            Vector2 potentialNewPosition = filteredJointPosition + currentTrend * PredictionFactor;

            // Cache current value
            savedBasePosition = baseJointPosition;
            savedFilteredJointPosition = filteredJointPosition;
            savedTrend = currentTrend;
            frameCount++;

            return potentialNewPosition;
        }

This function filters the joint position, so the SetHandPosition method evolves into the following:

public void SetHandPosition(KinectSensor sensor, Joint joint, Skeleton skeleton)
{
    Vector2 vector2 = FilterJointPosition(sensor, joint);

    if (!lastKnownPosition.HasValue)
    {
        lastKnownPosition = vector2;
        return;
    }

    MouseInterop.ControlMouse(
        (int)((vector2.X - lastKnownPosition.Value.X) * Screen.PrimaryScreen.Bounds.Width * GlobalSmooth),
        (int)((vector2.Y - lastKnownPosition.Value.Y) * Screen.PrimaryScreen.Bounds.Height * GlobalSmooth),
        false);

    lastKnownPosition = vector2;
}

Thanks to your brand-new filter code, the control of your mouse from the Kinect sensor should be less jittery, with fewer jerky movements. You will never achieve the precision of a handheld mouse, but it can work well with a large user interface.

You can use this filter in conjunction with the built-in TransformSmoothParameters of the skeleton stream, which helps smooth the hand position in three-dimensional (3D) space; your filter can then finish the work in the 2D space.
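For reference, enabling the SDK's built-in 3D smoothing when you enable the skeleton stream looks like the following configuration fragment (the values shown here are illustrative, not recommendations; tune them for your own application):

```csharp
// Illustrative smoothing values only; tune them for your own application.
sensor.SkeletonStream.Enable(new TransformSmoothParameters
{
    Smoothing = 0.5f,          // amount of smoothing applied to raw data
    Correction = 0.5f,         // how quickly the filter corrects toward raw data
    Prediction = 0.5f,         // number of frames to predict into the future
    JitterRadius = 0.05f,      // radius (in meters) within which jitter is clamped
    MaxDeviationRadius = 0.04f // max distance filtered values may deviate from raw data
});
```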

Handling the left mouse click

Creating a way to accomplish a standard left mouse click with input from the Kinect sensor is complex. If you decide to detect the click when the user pushes a hand forward (meaning the position of the hand on the z axis is changing), you can be almost certain that the user will also move that hand on the other axes at the same time, so the mouse position will not remain stable.

On the other hand, the implementation of this solution is simple:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Windows.Media.Media3D;
using Microsoft.Kinect;

namespace Kinect.Toolbox
{
    public class MouseController
    {
        Vector2? lastKnownPosition;
        float previousDepth;

        public void SetHandPosition(KinectSensor sensor, Joint joint, Skeleton skeleton)
        {
            Vector2 vector2 = FilterJointPosition(sensor, joint);

            if (!lastKnownPosition.HasValue)
            {
                lastKnownPosition = vector2;
                previousDepth = joint.Position.Z;
                return;
            }

            var isClicked = Math.Abs(joint.Position.Z - previousDepth) > 0.05f;

            MouseInterop.ControlMouse(
                (int)((vector2.X - lastKnownPosition.Value.X) * Screen.PrimaryScreen.Bounds.Width * GlobalSmooth),
                (int)((vector2.Y - lastKnownPosition.Value.Y) * Screen.PrimaryScreen.Bounds.Height * GlobalSmooth),
                isClicked);

            lastKnownPosition = vector2;
            previousDepth = joint.Position.Z;
        }
    }
}

You save the previous depth and compare it to the current value to determine if the difference is greater than a given threshold. But as mentioned previously, this is not the best solution. Chapter 10 provides you with a better way to handle mouse clicks.
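The depth test can be isolated into a tiny helper to make the threshold explicit (a hypothetical extraction for illustration; the 0.05 value used above means 5 centimeters of forward travel between frames):

```csharp
// Returns true when the hand has moved along the z axis by more than
// the given threshold (in meters) since the previous frame.
static bool IsPushClick(float currentDepth, float previousDepth, float threshold)
{
    return Math.Abs(currentDepth - previousDepth) > threshold;
}
```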

If the interface has not been modified for Kinect, you can also choose to use a gesture performed by another joint as the click trigger. In that case, you would update your class to this version:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Windows.Media.Media3D;
using Microsoft.Kinect;

namespace Kinect.Toolbox
{
    public class MouseController
    {
        Vector2? lastKnownPosition;
        float previousDepth;

        // Filters
        Vector2 savedFilteredJointPosition;
        Vector2 savedTrend;
        Vector2 savedBasePosition;
        int frameCount;

        // Gesture detector for click
        GestureDetector clickGestureDetector;
        bool clickGestureDetected;
        public GestureDetector ClickGestureDetector
        {
            get
            {
                return clickGestureDetector;
            }
            set
            {
                if (value != null)
                {
                    value.OnGestureDetected += (obj) =>
                        {
                            clickGestureDetected = true;
                        };
                }

                clickGestureDetector = value;
            }
        }

        // Filter parameters
        public float TrendSmoothingFactor
        {
            get;
            set;
        }

        public float JitterRadius
        {
            get;
            set;
        }

        public float DataSmoothingFactor
        {
            get;
            set;
        }

        public float PredictionFactor
        {
            get;
            set;
        }

        public float GlobalSmooth
        {
            get;
            set;
        }

        public MouseController()
        {
            TrendSmoothingFactor = 0.25f;
            JitterRadius = 0.05f;
            DataSmoothingFactor = 0.5f;
            PredictionFactor = 0.5f;

            GlobalSmooth = 0.9f;
        }

        Vector2 FilterJointPosition(KinectSensor sensor, Joint joint)
        {
            Vector2 filteredJointPosition;
            Vector2 differenceVector;
            Vector2 currentTrend;
            float distance;

            Vector2 baseJointPosition = Tools.Convert(sensor, joint.Position);
            Vector2 prevFilteredJointPosition = savedFilteredJointPosition;
            Vector2 previousTrend = savedTrend;
            Vector2 previousBaseJointPosition = savedBasePosition;

            // Checking frames count
            switch (frameCount)
            {
                case 0:
                    filteredJointPosition = baseJointPosition;
                    currentTrend = Vector2.Zero;
                    break;
                case 1:
                    filteredJointPosition = (baseJointPosition + previousBaseJointPosition) * 0.5f;
                    differenceVector = filteredJointPosition - prevFilteredJointPosition;
                    currentTrend = differenceVector * TrendSmoothingFactor +
                        previousTrend * (1.0f - TrendSmoothingFactor);
                    break;
                default:
                    // Jitter filter
                    differenceVector = baseJointPosition - prevFilteredJointPosition;
                    distance = Math.Abs(differenceVector.Length());

                    if (distance <= JitterRadius)
                    {
                        filteredJointPosition = baseJointPosition * (distance / JitterRadius) +
                            prevFilteredJointPosition * (1.0f - (distance / JitterRadius));
                    }
                    else
                    {
                        filteredJointPosition = baseJointPosition;
                    }

                    // Double exponential smoothing filter
                    filteredJointPosition = filteredJointPosition * (1.0f - DataSmoothingFactor) +
                        (prevFilteredJointPosition + previousTrend) * DataSmoothingFactor;

                    differenceVector = filteredJointPosition - prevFilteredJointPosition;
                    currentTrend = differenceVector * TrendSmoothingFactor +
                        previousTrend * (1.0f - TrendSmoothingFactor);
                    break;
            }

            // Compute potential new position
            Vector2 potentialNewPosition = filteredJointPosition + currentTrend * PredictionFactor;

            // Cache current value
            savedBasePosition = baseJointPosition;
            savedFilteredJointPosition = filteredJointPosition;
            savedTrend = currentTrend;
            frameCount++;

            return potentialNewPosition;
        }

        public void SetHandPosition(KinectSensor sensor, Joint joint, Skeleton skeleton)
        {
            Vector2 vector2 = FilterJointPosition(sensor, joint);

            if (!lastKnownPosition.HasValue)
            {
                lastKnownPosition = vector2;
                previousDepth = joint.Position.Z;
                return;
            }

            bool isClicked;

            if (ClickGestureDetector == null)
                isClicked = Math.Abs(joint.Position.Z - previousDepth) > 0.05f;
            else
                isClicked = clickGestureDetected;

            MouseInterop.ControlMouse(
                (int)((vector2.X - lastKnownPosition.Value.X) * Screen.PrimaryScreen.Bounds.Width * GlobalSmooth),
                (int)((vector2.Y - lastKnownPosition.Value.Y) * Screen.PrimaryScreen.Bounds.Height * GlobalSmooth),
                isClicked);

            lastKnownPosition = vector2;
            previousDepth = joint.Position.Z;

            clickGestureDetected = false;
        }
    }
}

This update checks whether a ClickGestureDetector has been provided and, if so, uses its OnGestureDetected event to detect the click.

Ultimately, developers need to concentrate on building new user experiences that take advantage of all that Kinect has to offer, but if you want to add Kinect to an existing application, you now have the tools required to do so. Using these tools with the Kinect sensor, you are the mouse!
