Microsoft’s Kinect takes advantage of 3D sensing technology from PrimeSense. It uses an IR emitter to project a pattern of dots onto the scene, plus an IR receiver and a system-on-a-chip (SoC) that analyze the returned information to generate a 3D image.
One of the most important sensors created in recent years comes from PrimeSense and is embodied in Microsoft’s Kinect game controller for the Xbox 360 (see the figure). The Kinect provides a 3D image that allows users to interact with games by recognizing their movements (see “How Microsoft’s PrimeSense-based Kinect Really Works” at electronicdesign.com).
Microsoft provides an application programming interface (API) that lets game designers take advantage of the Kinect input for applications running on the Xbox 360. The hardware interface is open, and the Kinect and PrimeSense’s own version of the device have been utilized by a number of developers on other platforms.
One of the most popular applications is robotics, such as the Bilibot and Willow Garage’s Turtlebot (see “Cooperation Leads To Smarter Robots” at electronicdesign.com). The Kinect isn’t the only source of 3D information for games and robots, but it is one of the least expensive platforms.
Making 3D Easier
One such tool, the Point Cloud Library (PCL), carries a Berkeley Software Distribution (BSD) license, making it free for commercial or research use. This cross-platform tool has been ported to all major operating systems including Linux, Mac OS X, and Windows. Android is part of the Linux mix.
PCL is a collection of algorithms and APIs. The algorithms cover operations such as filtering, feature estimation, and model fitting. The library also can combine multiple 3D point clouds and create surface definitions from a point cloud.
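To make that concrete, here is a hedged C++ sketch of the filtering and model-fitting steps: it downsamples a previously captured cloud with a voxel grid and then fits a plane (a wall, floor, or tabletop) with RANSAC. The file name and the 1-cm tolerances are placeholder values, not recommendations.

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/io/pcd_io.h>
#include <pcl/filters/voxel_grid.h>
#include <pcl/ModelCoefficients.h>
#include <pcl/PointIndices.h>
#include <pcl/sample_consensus/method_types.h>
#include <pcl/sample_consensus/model_types.h>
#include <pcl/segmentation/sac_segmentation.h>

int main() {
  // Load a previously captured cloud ("scene.pcd" is a placeholder file name).
  pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
  if (pcl::io::loadPCDFile<pcl::PointXYZ>("scene.pcd", *cloud) < 0)
    return -1;

  // Filtering: downsample with a 1-cm voxel grid to cut the point count.
  pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
  pcl::VoxelGrid<pcl::PointXYZ> grid;
  grid.setInputCloud(cloud);
  grid.setLeafSize(0.01f, 0.01f, 0.01f);
  grid.filter(*filtered);

  // Model fitting: find the dominant plane in the scene with RANSAC.
  pcl::SACSegmentation<pcl::PointXYZ> seg;
  seg.setModelType(pcl::SACMODEL_PLANE);
  seg.setMethodType(pcl::SAC_RANSAC);
  seg.setDistanceThreshold(0.01);
  seg.setInputCloud(filtered);

  pcl::ModelCoefficients::Ptr coefficients(new pcl::ModelCoefficients);
  pcl::PointIndices::Ptr inliers(new pcl::PointIndices);
  seg.segment(*inliers, *coefficients);
  // coefficients now holds the plane equation ax + by + cz + d = 0.
  return 0;
}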
PCL’s n-D point cloud datasets are collections of points that carry 3D position information and can have other attributes, such as color, associated with them. Point clouds provide a natural way to describe a 3D environment, and devices like the Kinect generate point cloud information directly. PCL can be used to merge and manipulate this data.
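The sketch below builds a small synthetic cloud by hand to show the basic dataset layout, using PCL’s standard point type and PCD file format; the sizes and coordinates are arbitrary.

#include <cstdio>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/io/pcd_io.h>

int main() {
  // An unorganized cloud: width is the point count and height is 1.
  pcl::PointCloud<pcl::PointXYZ> cloud;
  cloud.width = 100;
  cloud.height = 1;
  cloud.is_dense = false;              // may contain invalid (NaN) points
  cloud.points.resize(cloud.width * cloud.height);

  for (size_t i = 0; i < cloud.points.size(); ++i) {
    cloud.points[i].x = 0.01f * i;     // arbitrary synthetic positions
    cloud.points[i].y = 0.0f;
    cloud.points[i].z = 1.0f;
  }

  // Persist the cloud in PCL's PCD format for later processing.
  pcl::io::savePCDFileASCII("synthetic.pcd", cloud);
  printf("Saved %lu points\n", (unsigned long)cloud.points.size());
  return 0;
}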
PCL is implemented in C++ as a template library. The flexible architecture works efficiently with CPU SIMD/SSE support as well as with GPUs via CUDA. PCL also supports parallelization via OpenMP and Intel’s Threading Building Blocks (TBB) (see “Dev Tools Target Parallel Processing” at electronicdesign.com).
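As one example of that parallel support, this hedged sketch uses PCL’s OpenMP-backed normal estimator (assuming a build with OpenMP and the pcl::search API) to spread feature estimation across several threads; the thread count and search radius are illustrative.

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/features/normal_3d_omp.h>
#include <pcl/search/kdtree.h>

// Estimate surface normals with the OpenMP-parallelized variant of
// PCL's normal-estimation algorithm.
pcl::PointCloud<pcl::Normal>::Ptr
estimateNormals(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr &cloud)
{
  pcl::NormalEstimationOMP<pcl::PointXYZ, pcl::Normal> ne;
  ne.setNumberOfThreads(4);            // illustrative thread count
  ne.setInputCloud(cloud);

  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  ne.setSearchMethod(tree);
  ne.setRadiusSearch(0.03);            // 3-cm neighborhood

  pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
  ne.compute(*normals);
  return normals;
}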
One trick to improve performance is to store different kinds of point data in the formats that suit them best. For example, the 8-bit RGB color channels can be packed together into a single value while position information is typically stored as floating-point numbers.
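That is how PCL’s pcl::PointXYZRGB type is laid out. The sketch below packs a color into a point alongside its floating-point coordinates; the cast-through-float step follows PCL’s documented usage, and newer releases also expose the channels as separate fields.

#include <stdint.h>
#include <pcl/point_types.h>

int main() {
  pcl::PointXYZRGB p;

  // Position is stored as three 32-bit floats.
  p.x = 0.5f;
  p.y = -0.25f;
  p.z = 1.2f;

  // The three 8-bit color channels are packed into a single field.
  // (Newer PCL releases also expose them directly as p.r, p.g, p.b.)
  uint32_t packed = (static_cast<uint32_t>(200) << 16) |   // red
                    (static_cast<uint32_t>(120) << 8)  |   // green
                     static_cast<uint32_t>(40);            // blue
  p.rgb = *reinterpret_cast<float *>(&packed);
  return 0;
}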
PCL’s Perception Processing Graphs (PPG) enable the creation of more generic, parameterized algorithms. For example, the same algorithm could detect a wall, a door, or a table. PPG ties in well with PCL’s concept of nodelets, which are dynamically loadable plugins that operate on local data, providing a mechanism for distributing computation.
PCL works with the Robot Operating System (ROS). Willow Garage and a host of other contributors support ROS and PCL. Willow Garage’s PR2 robot takes advantage of both.
3D Interaction
PCL is useful for the environment-processing and recognition aspects of 3D work, but turning that into interaction is yet another task. OpenNI, short for open natural interaction, is an organization created to make this task easier. It was started by PrimeSense, the company whose technology is inside the Kinect. OpenNI uses the Lesser General Public License (LGPL).
The object-oriented OpenNI system is designed to address operations such as full-body tracking and hand gestures as well as speech and command recognition. The vision and audio perception middleware is designed to support a range of input sensors including devices like the Kinect. Cameras and microphones are some of the other devices that might be part of the mix.
The idea is to offer applications a generic command and interaction interface. OpenNI and the middleware components provide isolation from the input devices. A middleware module can be a gesture alert generator, a hand pointing generator, or a scene analyzer. Modules also could provide pose detection plus skeleton and joint detection. Different vendors may provide similar modules that take advantage of particular underlying computational hardware or sensors.
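A hedged sketch of what that device isolation looks like with the OpenNI 1.x C++ wrapper: the application asks the context for a depth generator and lets whatever sensor module is installed, such as a Kinect driver, satisfy the request. Error handling is kept to a minimum.

#include <XnCppWrapper.h>
#include <cstdio>

int main() {
  xn::Context context;
  if (context.Init() != XN_STATUS_OK)
    return 1;

  // Request a depth stream; the installed sensor module supplies it,
  // so the application never names the physical device.
  xn::DepthGenerator depth;
  if (depth.Create(context) != XN_STATUS_OK)
    return 1;

  context.StartGeneratingAll();
  for (int frame = 0; frame < 30; ++frame) {
    context.WaitOneUpdateAll(depth);

    xn::DepthMetaData md;
    depth.GetMetaData(md);
    printf("frame %u: %ux%u, center depth %u mm\n",
           (unsigned)md.FrameID(), (unsigned)md.XRes(), (unsigned)md.YRes(),
           (unsigned)md(md.XRes() / 2, md.YRes() / 2));
  }

  context.Release();
  return 0;
}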
OpenNI might be based on open source, but it lends itself to licensing control as well. Modules and applications can use the licensing mechanism to make sure users and other applications or modules are allowed to take advantage of certain features.
OpenNI is initially available on Windows and Ubuntu Linux. On the PCL side, the visualization library provides a way to turn point clouds into viewable presentations. So 3D is more than gameplay. Tools like PCL and OpenNI provide the basis for other possibilities.
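As a closing illustration, here is a minimal sketch of that visualization path, assuming PCL’s CloudViewer convenience class and a placeholder PCD file name.

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/io/pcd_io.h>
#include <pcl/visualization/cloud_viewer.h>

int main() {
  pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGB>);
  if (pcl::io::loadPCDFile<pcl::PointXYZRGB>("scene.pcd", *cloud) < 0)
    return -1;

  // Spin up an interactive 3D window and hand it the cloud.
  pcl::visualization::CloudViewer viewer("PCL Cloud Viewer");
  viewer.showCloud(cloud);
  while (!viewer.wasStopped()) {
    // Block until the user closes the window.
  }
  return 0;
}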
Microsoft
www.microsoft.com
OpenNI
www.openni.org
PointClouds.org
www.pointclouds.org
PrimeSense
www.primesense.com
Robot Operating System
www.ros.org
Willow Garage
www.willowgarage.com