Capturing frames from a depth camera

Back in Chapter 2, Handling Files, Cameras, and GUIs, we discussed the concept that a computer can have multiple video capture devices, and each device can have multiple channels. Suppose a given device is a stereo camera: each channel might correspond to a different lens and sensor, or each channel might correspond to a different kind of data, such as a normal color image versus a depth map. The C++ version of OpenCV defines some constants for the identifiers of certain devices and channels. However, these constants are not defined in the Python version.

To remedy this situation, let's add the following definitions to depth.py:

# Devices.
CAP_OPENNI = 900  # OpenNI (for Microsoft Kinect)
CAP_OPENNI_ASUS = 910  # OpenNI (for Asus Xtion)

# Channels of an OpenNI-compatible depth generator.
CAP_OPENNI_DEPTH_MAP = 0  # Depth values in mm (16UC1)
CAP_OPENNI_POINT_CLOUD_MAP = 1  # XYZ in meters (32FC3)
CAP_OPENNI_DISPARITY_MAP = 2  # Disparity in pixels (8UC1)
CAP_OPENNI_DISPARITY_MAP_32F = 3  # Disparity in pixels (32FC1)
CAP_OPENNI_VALID_DEPTH_MASK = 4  # 8UC1

# Channels of an OpenNI-compatible RGB image generator.
CAP_OPENNI_BGR_IMAGE = 5
CAP_OPENNI_GRAY_IMAGE = 6
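
As a quick illustration (a minimal sketch, assuming the constants above are saved in depth.py and an OpenNI-compatible camera is attached; it is not part of our application code), the device and channel constants can be used with cv2.VideoCapture like this:

import cv2

from depth import (CAP_OPENNI, CAP_OPENNI_DEPTH_MAP,
                   CAP_OPENNI_BGR_IMAGE)

# Open device 0 via the OpenNI backend. Passing CAP_OPENNI as the
# argument tells OpenCV to use the OpenNI capture API.
capture = cv2.VideoCapture(CAP_OPENNI)

if capture.isOpened() and capture.grab():
    # One grab() fetches a synchronized frame set; each retrieve()
    # call then decodes one channel and returns (success, image).
    _, depth_map = capture.retrieve(
        flag=CAP_OPENNI_DEPTH_MAP)    # depth in mm (16UC1)
    _, bgr_image = capture.retrieve(
        flag=CAP_OPENNI_BGR_IMAGE)    # an ordinary color image

capture.release()

Note that grab() must be called before retrieve(): a single grab() fetches all channels of one synchronized frame set, and each retrieve() call then decodes one channel without advancing to the next frame.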

The depth-related channels require some explanation, as given in the following list (a short sketch that uses them follows the list):

  • A depth map is a grayscale image in which each pixel value is the estimated distance from the camera to a surface. Specifically, an image from the CAP_OPENNI_DEPTH_MAP channel gives each distance as a 16-bit unsigned integer number of millimeters.
  • A point cloud map is a color image in which each color channel corresponds to a spatial dimension (x, y, or z). Specifically, the CAP_OPENNI_POINT_CLOUD_MAP channel yields a BGR image, where B is x (blue is right), G is y (green is up), and R is z (red is deep), from the camera's perspective. The values are in meters.
  • A disparity map is a grayscale image in which each pixel value is the stereo disparity of a surface. To conceptualize stereo disparity, let's suppose we overlay two images of a scene, shot from different viewpoints. The result would be similar to seeing double images. For points on any pair of twin objects in the scene, we can measure the distance in pixels. This measurement is the stereo disparity. Nearby objects exhibit greater stereo disparity than far-off objects. Thus, nearby objects appear brighter in a disparity map.
  • A valid depth mask shows whether the depth information at a given pixel is believed to be valid (shown by a nonzero value) or invalid (shown by a value of zero). For example, if the depth camera depends on an infrared illuminator (an infrared flash), depth information is invalid in regions that are occluded (shadowed) from this light.
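
To make these channels more concrete, here is another minimal sketch (again not part of our application; the function name and display choices are just illustrative) that rescales a depth map for display and uses the valid depth mask to black out unreliable pixels in the corresponding BGR image:

import cv2
import numpy

from depth import (CAP_OPENNI_DEPTH_MAP, CAP_OPENNI_VALID_DEPTH_MASK,
                   CAP_OPENNI_BGR_IMAGE)

def show_depth_channels(capture):
    """Show the depth map, valid depth mask, and masked BGR image
    for one frame set from an opened OpenNI-compatible capture."""
    if not capture.grab():
        return
    _, depth_map = capture.retrieve(flag=CAP_OPENNI_DEPTH_MAP)
    _, valid_mask = capture.retrieve(flag=CAP_OPENNI_VALID_DEPTH_MASK)
    _, bgr_image = capture.retrieve(flag=CAP_OPENNI_BGR_IMAGE)

    # The depth map holds 16-bit millimeter values; rescale it to the
    # 8-bit range so that it can be displayed as a grayscale image.
    depth_for_display = cv2.normalize(
        depth_map, None, 0, 255, cv2.NORM_MINMAX).astype(numpy.uint8)

    # Black out pixels whose depth reading is invalid (mask value 0).
    masked_bgr = bgr_image.copy()
    masked_bgr[valid_mask == 0] = 0

    cv2.imshow('depth (rescaled)', depth_for_display)
    cv2.imshow('valid depth mask', valid_mask)
    cv2.imshow('bgr, invalid depth blacked out', masked_bgr)
    cv2.waitKey(1)

Called repeatedly in a live capture loop, this would show how the invalid (black) regions shift as objects occlude the infrared illuminator.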

The following screenshot shows a point cloud map of a man sitting behind a sculpture of a cat:

The following screenshot shows a disparity map of a man sitting behind a sculpture of a cat:

A valid depth mask of a man sitting behind a sculpture of a cat is shown in the following screenshot:
