Kinect (see picture) is an attempt to create a controller-free gaming experience for the Xbox 360 - effectively advertised in this clip. It houses an infra-red depth sensor with a resolution of 320x240. Kinect was designed to work with the Xbox 360 console. However, within days of its launch in November 2010, Adafruit Industries offered a bounty for an open-source driver - won by Héctor Martín Cantero. Subsequently, Matt Cutts offered a couple of $1000 prizes:
"The first $1000 prize goes to the person or team that writes the coolest open-source app, demo, or program using the Kinect. The second prize goes to the person or team that does the most to make it easy to write programs that use the Kinect on Linux." This got the ball rolling, and people went bonkers developing everything from the uber-cool UI from Minority Report to a kitchen sink ... well, almost.

[Image: Kinect sensor description] [Image: Kinect specifications]

Kinnext is an attempt at developing a motion capture system using OpenKinect. Essentially, we are trying to replicate with OpenKinect what the Kinect already does. The fun part is that our baby will be open-source! Developers will be able to plug in to the Kinnext API and do their thing.

Mocap Using Kinect


Mocap describes the process of recording movement and translating it onto a digital model. This report discusses the implementation of mocap with the help of Kinect, a depth-sensing device. Mocap is not a novel concept; it has been around in fields like gaming, entertainment, the military, and medicine. This project aims at implementing mocap at a cost relatively lower than the commercial level.


Problem Description:

Motion capture can handle even complex and rapid movements, but specific hardware and special programs are required to obtain and process the data. The overall cost can be prohibitive for small productions or low-budget education and research projects.

Motivation and Need:

The use of motion capture began in the late 1970s, and only now is it becoming widespread. Motion capture is simply capturing information about the movement of the human body. Software tools for working with motion-captured data allow great control over the style and quality of the final output, for anything ranging from realistic to 'cartoony' motion. Motion capture falls mostly under computer vision, an interesting and fast-developing field.

Our Objective:

In selecting this project, our main objective was to provide a 3D description of real motion, while also addressing the above-mentioned problem, i.e. devising a system with minimal expense.

Project Flow:

Fig 1: Project Flow


How is Motion Capture Done?
There are several methods for motion capture, including:

An optical system consists of a lightweight suit, a number of reflective dots (markers) and several motion-capture cameras that feed information into computers running 3-D motion-capture software.


An electromagnetic system involves a suit of magnetic sensors that receive signals from a magnetic transmitter.


A mechanical motion-capture setup involves a heavy suit that is basically one big mechanical sensor. It sends data to the motion-capture computer when a part of the suit detects movement.
Of all the above, optical mocap is the most widely used. It is also the approach we used in our project. First and foremost, such systems require sensors to “watch” a moving human or machine.

Elements of Optical Approach


Sensors:
A sensor can be a single camera or a series of cameras, depending on the accuracy required for the output. Like quantity, the placement and scope of the sensor(s) can have a large impact on results. Similarly, the sensor can be static, or controlled for enhanced results, e.g. a robot-mounted camera. The sensor we used in our project is the Kinect Xbox 360.

Marker design and placement:
Markers are placed on the object under observation to detect motion. They must be “visible” to the sensors, and be uniquely identifiable in one way or another. The placement of markers is often standardized (e.g. using markers on joints only) in order to simplify identification.

Fig 2: Joints for placing markers

Reverse Kinematics:

The subject's motion is represented by a mathematical model. It can be the set of “joint angles” of the model, stored and used to evaluate the subject's motion. Given a set of predefined joint angles, the motion at a given moment can easily be regenerated - this concept is known here as reverse kinematics. (In the standard terminology, regenerating positions from joint angles is forward kinematics, while recovering joint angles from positions is inverse kinematics.)

Our Sensor - Kinect XBOX 360

How does Kinect work?

Kinect is Microsoft’s motion sensor add-on for the Xbox 360 gaming console. It provides a natural user interface (NUI) that allows users to interact without a controller. It works by gathering 3D scene information from continuously-projected infrared structured light. A depth sensor facilitates this task, alongside an RGB camera and a multi-array microphone for color and voice.

Kinect Specifications:

The RGB video stream uses 8-bit VGA resolution (640 × 480 pixels). The Kinect sensor has a range of 1.2–3.5 m (3.9–11 ft) distance when used with the Xbox software.
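The depth stream itself is not in metres; each pixel is an 11-bit raw value. A commonly used conversion is the empirical fit circulated by the OpenKinect community (an approximation we adopt here for illustration, not an official Microsoft calibration):

```c
/* Convert an 11-bit raw Kinect depth value to metres using a
 * community-derived approximation (coefficients are an empirical
 * fit from the OpenKinect community, not an official spec). */
double raw_depth_to_meters(int raw)
{
    if (raw < 2047)
        return 1.0 / (raw * -0.0030711016 + 3.3309495161);
    return 0.0; /* 2047 marks an invalid / no-return pixel */
}
```

A raw value of 0 maps to roughly 0.30 m, and larger raw values map to larger distances, consistent with the 1.2-3.5 m working range above.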

Interfacing with Kinect:

A few hours after Kinect's launch, the OpenKinect project was born. It is a community of people working on libraries for the Kinect and its interface with PCs and other devices. The open-source community focuses mainly on libfreenect.

What Is libfreenect?

libfreenect is the open-source driver that enables users on almost any platform to gain access to the Kinect’s RGB video and depth streams. It also allows controlling the motor's position to some extent. This simply requires plugging the Kinect's USB adaptor into the computer. The libfreenect driver initializes the device and starts providing both streams (RGB and depth). Here is a high-level overview diagram that shows how libfreenect works:

Fig 3: libfreenect acts as a bridge between the USB device and an application.

Using Kinect with Ubuntu:

To install Kinect support on Ubuntu, open a terminal and run the following commands:
sudo apt-get install cmake libglut3-dev pkg-config build-essential libxmu-dev libxi-dev libusb-1.0-0-dev
cd ~/
mkdir Kinect
cd Kinect
mkdir repos
cd repos
git clone git:// .
cd libfreenect
mkdir build
cd build
cmake ..
sudo make install
The following command opens a file
sudo nano /etc/udev/rules.d/51-kinect.rules
Add the following lines to this file, then press Ctrl+O, Enter, and Ctrl+X to save and exit:
# ATTR{product}=="Xbox NUI Motor"
SUBSYSTEM=="usb", ATTR{idVendor}=="045e", ATTR{idProduct}=="02b0", MODE="0666"
# ATTR{product}=="Xbox NUI Audio"
SUBSYSTEM=="usb", ATTR{idVendor}=="045e", ATTR{idProduct}=="02ad", MODE="0666"
# ATTR{product}=="Xbox NUI Camera"
SUBSYSTEM=="usb", ATTR{idVendor}=="045e", ATTR{idProduct}=="02ae", MODE="0666"

Kinect demo:

Commands for running the demo:


GLview Code:

Interfacing Kinect with OpenCV:

We use OpenCV to work on the Kinect's RGB output. cvdemo.c, an OpenCV example that runs with the Kinect, is readily available. To compile and execute the source code on Linux, we used the cmake and make commands.

Using Cmake:

CMake is open-source software that configures your build parameters before compilation. In simple words, it checks your system and generates the Makefile used later by make to build the example.
Open a terminal and access the build directory present in your source code.
Run cmake ..
The terminal will show some processing. If the configuration is successful, a message will be displayed indicating it.
Now the *.c examples for Kinect are configured.

Using 'make':

Run 'make' to start building the binaries. Once built successfully, run 'make install'. You have to be root to carry out this operation, so use sudo with it.
After getting through these steps smoothly, we are ready to run cvdemo.

Running cvdemo:

Don't forget to plug in the Kinect!
cvdemo shows both the RGB and depth videos.
To clean up later, run 'make clean' in the build directory.

Fig 4: Output of cvdemo

Tracking Markers in OpenCV

Our next step is to track the markers placed on the moving body. We have used built-in functions of OpenCV to track them. Before moving to the exact implementation in OpenCV, we discuss some related concepts:

Using HSV color space:

The HSV color space is best suited to the task of tracking objects by color. The RGB space is not suitable here because of factors that cannot be controlled: for example, if we choose to track a 'high-intensity' blue color, the measured RGB values would be affected by texture, lighting, etc.
This brings us to HSV. Like RGB, HSV also uses 3 channels:
H: Hue represents the color itself, say 'red'; dark red and light red share the same hue
S: Saturation is the 'amount' of color; this differentiates between a pale and a pure color
V: Value (or intensity) is the brightness of the color
The lightness or darkness of the color does not affect the hue channel. In OpenCV, hue values range from 0-179. To define the range of color for which an object should be tracked, we used the cvInRangeS function. cvInRangeS takes a source array and does a range check for every element of the array - in our case, the image.

To work in HSV space, we need to convert the Kinect's RGB image to HSV. For this, we use the cvCvtColor function. It takes a source image, converts it to the specified color space and stores the result in the destination image.


After identifying an object of a particular color, we need to filter it out from the remaining image. For this purpose, we used thresholding, one of the basic techniques of image segmentation. For example, using this technique, all the red in an image can be segmented out. A thresholded image is easy for a computer to analyze.
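To make the hue-based filtering concrete, here is a plain-C sketch of the computation OpenCV performs internally when it converts a pixel to hue and range-checks it. This is illustrative only; the actual pipeline uses cvCvtColor and cvInRangeS, and the function names below are our own.

```c
/* Map an RGB pixel to OpenCV's 0-179 hue scale (OpenCV stores
 * hue/2 so it fits in 8 bits). Illustrative re-implementation. */
int rgb_to_opencv_hue(int r, int g, int b)
{
    int max = r > g ? (r > b ? r : b) : (g > b ? g : b);
    int min = r < g ? (r < b ? r : b) : (g < b ? g : b);
    int delta = max - min;
    double h;
    if (delta == 0)
        return 0;               /* grey: hue undefined, report 0 */
    if (max == r)
        h = 60.0 * (g - b) / delta;
    else if (max == g)
        h = 120.0 + 60.0 * (b - r) / delta;
    else
        h = 240.0 + 60.0 * (r - g) / delta;
    if (h < 0)
        h += 360.0;
    return (int)(h / 2.0);
}

/* Threshold check: is this pixel's hue inside the tracked band? */
int in_hue_band(int r, int g, int b, int lo, int hi)
{
    int h = rgb_to_opencv_hue(r, g, b);
    return h >= lo && h <= hi;
}
```

Note that bright yellow (255, 255, 0) and dim yellow (100, 100, 0) both map to hue 30, which is exactly why hue-based filtering survives lighting changes that would confuse an RGB-based filter.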

Object Tracking in OpenCV:

#include <opencv/cv.h>
#include <opencv/highgui.h>
#include <stdio.h>
#include "libfreenect_cv.h"

IplImage* GetThresholdedImage(IplImage* img)
{
    /* Convert the image to HSV, then keep only pixels in the yellow hue range */
    IplImage* imgHSV = cvCreateImage(cvGetSize(img), 8, 3);
    cvCvtColor(img, imgHSV, CV_BGR2HSV);
    IplImage* imgThreshed = cvCreateImage(cvGetSize(img), 8, 1);
    cvInRangeS(imgHSV, cvScalar(20, 100, 100, 0), cvScalar(30, 255, 255, 0), imgThreshed);
    cvReleaseImage(&imgHSV);
    return imgThreshed;
}

int main(int argc, char *argv[])
{
    while (cvWaitKey(10) < 0) {
        IplImage *image = freenect_sync_get_rgb_cv(0);
        if (!image) {
            printf("Error: Kinect not connected?\n");
            return -1;
        }
        cvCvtColor(image, image, CV_RGB2BGR);
        cvShowImage("RGB", image);
        IplImage* imgYellowThresh = GetThresholdedImage(image);
        cvShowImage("Threshold", imgYellowThresh);
        cvReleaseImage(&imgYellowThresh);
    }
    return 0;
}


Fig 5: Color tracking for red.

Optical Flow Algorithm:

Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer (an eye or a camera) and the scene[1] . Optical flow techniques such as motion detection, object segmentation, time-to-collision and focus of expansion calculations, motion compensated encoding, and stereo disparity measurement utilize this motion of the objects' surfaces and edges [2].


Figure 6: The optic flow experienced by a rotating observer (in this case a fly). The direction and magnitude of optic flow at each location is represented by the direction and length of each arrow. Image taken from [2].

In our project, we use OpenCV's optical flow algorithm, which comes as a sample, to track the markers' motion. Its input is the output of the color filter, i.e. the thresholded video of the yellow markers. It compares consecutive frames and draws a green flow map on the input: wherever it finds motion, it draws a vector line, as shown in Figure 5, to track that motion.
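Once the dense flow field is available, it can be summarized into an overall motion estimate. The sketch below is a plain-C stand-in for the CV_32FC2 flow matrix (the FlowVec type and mean_flow function are our own, for illustration): it reduces the per-pixel (dx, dy) vectors to a single average motion vector.

```c
/* One (dx, dy) displacement per pixel, as produced per element
 * by Farneback's algorithm in the CV_32FC2 flow matrix. */
typedef struct { float dx, dy; } FlowVec;

/* Average the field to get a single dominant motion vector -
 * a simple way to read "which way did the marker move?". */
FlowVec mean_flow(const FlowVec *field, int n)
{
    FlowVec m = {0.0f, 0.0f};
    int i;
    for (i = 0; i < n; i++) {
        m.dx += field[i].dx;
        m.dy += field[i].dy;
    }
    if (n > 0) {
        m.dx /= n;
        m.dy /= n;
    }
    return m;
}
```

A field where every pixel moved one pixel to the right averages to (1, 0), i.e. a rightward marker motion.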

Functions used in Optical Flow:

1. void drawOptFlowMap(const CvMat* flow, CvMat* cflowmap, int step, double scale, CvScalar color)
Description: Draws the optical flow map on the input (the thresholded video).

2. void cvCalcOpticalFlowFarneback(const CvArr* prev, const CvArr* next, CvArr* flow, double pyrScale, int levels, int winsize, int iterations, int polyN, double polySigma, int flags)
Description: Calculates dense optical flow using Gunnar Farneback’s algorithm[3]; this is the main function used to track the motion of the markers.

3. CvMat* cvCreateMat(int rows, int cols, int type)[4]
Description: Allocates a matrix, which is an essential argument for cvCalcOpticalFlowFarneback.

4. void cvShowImage(const char* name, const CvArr* image)[5]
Description: Shows the output window with the motion of the markers.

Code For Motion Tracking of Markers After Color Filter:

#include <opencv/cv.h>
#include <opencv/highgui.h>
#include <stdio.h>
#include "libfreenect_cv.h"

void drawOptFlowMap(const CvMat* flow, CvMat* cflowmap, int step,
                    double scale, CvScalar color)
{
    int x, y;
    for (y = 0; y < cflowmap->rows; y += step) {
        for (x = 0; x < cflowmap->cols; x += step) {
            CvPoint2D32f fxy = CV_MAT_ELEM(*flow, CvPoint2D32f, y, x);
            cvLine(cflowmap, cvPoint(x, y),
                   cvPoint(cvRound(x + fxy.x), cvRound(y + fxy.y)),
                   color, 1, 8, 0);
            cvCircle(cflowmap, cvPoint(x, y), 2, color, -1, 8, 0);
        }
    }
}

IplImage* GetThresholdedImage(IplImage* img)
{
    /* Convert the image into an HSV image */
    IplImage* imgHSV = cvCreateImage(cvGetSize(img), 8, 3);
    cvCvtColor(img, imgHSV, CV_BGR2HSV);
    IplImage* imgThreshed = cvCreateImage(cvGetSize(img), 8, 1);
    /* Passing the hue range of the yellow color */
    cvInRangeS(imgHSV, cvScalar(20, 100, 100, 0), cvScalar(30, 255, 255, 0), imgThreshed);
    cvReleaseImage(&imgHSV);
    return imgThreshed;
}

int main(int argc, char *argv[])
{
    CvMat *prevgray = 0, *gray = 0, *flow = 0, *cflow = 0;
    while (cvWaitKey(10) < 0) {
        int firstFrame = gray == 0;
        IplImage *image = freenect_sync_get_rgb_cv(0);
        if (!image) {
            printf("Error: Kinect not connected?\n");
            return -1;
        }
        cvCvtColor(image, image, CV_RGB2BGR);
        IplImage *depth = freenect_sync_get_depth_cv(0);
        if (!depth) {
            printf("Error: Kinect not connected?\n");
            return -1;
        }
        IplImage* imgYellowThresh = GetThresholdedImage(image);
        cvShowImage("Color Tracking", imgYellowThresh);
        if (firstFrame) {
            /* Allocate the matrices once, on the first frame */
            gray = cvCreateMat(imgYellowThresh->height, imgYellowThresh->width, CV_8UC1);
            prevgray = cvCreateMat(gray->rows, gray->cols, gray->type);
            flow = cvCreateMat(gray->rows, gray->cols, CV_32FC2);
            cflow = cvCreateMat(gray->rows, gray->cols, CV_8UC3);
        }
        cvCopy(imgYellowThresh, gray, 0);
        if (!firstFrame) {
            cvCalcOpticalFlowFarneback(prevgray, gray, flow, 0.5, 3, 15, 3, 5, 1.2, 0);
            cvCvtColor(prevgray, cflow, CV_GRAY2BGR);
            drawOptFlowMap(flow, cflow, 16, 1.5, CV_RGB(0, 255, 0));
            cvShowImage("Flow", cflow);
        }
        CvMat* temp;
        CV_SWAP(prevgray, gray, temp);
        cvReleaseImage(&imgYellowThresh);
    }
    return 0;
}

Joint kinematics

The coupling between two body segments is known as a joint. Joint Kinematics is the

difference in orientation and position between two body segments as a function of time. Body

segments are linked to each other at the joints.

Body Segments
A human body has 17 segments, mentioned below:

1. Pelvis
2. Knee
3. Ankle
4. Right_thigh
5. Right_shank
6. Right_foot
7. Left_thigh
8. Left_shank
9. Left_foot
10. Trunk
11. Head
12. Right_scapula
13. Right_upper_arm
14. Right_hand
15. Left_scapula
16. Left_upper_arm
17. Left_hand

Computing Joint Angles
The 3D coordinates of the markers are used as input to compute the joint angles of the body. The 3D positions of all markers are recorded during the experiment. A tool called BodyMech is employed to extract the joint angles.


BodyMech is a software tool which runs in MATLAB and offers functions for 3D human movement analysis based on cluster marker registrations. It is an open-source package for 3D kinematic analysis. It does not focus on the sensor aspects of motion capture, but on the subsequent marker-processing steps.
BodyMech offers a sequence in which a file containing marker positions is read to extract the X, Y and Z coordinates, and suitable methods are then applied to compute the joint angles. First, body segments are defined, and these are fed into the BodyMech functions that produce the joint angles.

BodyMech Functions:

Following is a list of the functions employed for the computation of joint angles.

1. Creates global parameters and initializes BodyMech; opens the BodyMech graphical user interface.
2. Generates a substructure of the global variable BODY that houses contextual information for a movement study.
3. Generates a variable that represents a rigid body.
4. Generates a variable that represents a joint of the human body.
5. Generates a variable that represents a rigid body.
6. Calls a marker file with specific calculations.
7. Called at the end of each BodyMech function.
8. Makes BodyMech global variables available in BodyMech functions.
9. Loads and calls an m-file with specific anatomical calculations.
10. Applies the rigid-body transformation for each time instance; only valid markers are used.
11. Calculates the rotation matrix between two segments, and decomposes the matrix into sequential rotation angles.

Color Filter Functions:

IplImage* GetThresholdedImage(IplImage* img)

Filters a particular color from the image passed to it as an argument.

void cvCvtColor(const CvArr* src, CvArr* dst, int code)

Converts the input image from one color space to another.

