A motion parallax screen using Kinect [w/ code]

I’ve seen some examples of people who build motion parallax capable screens using Kinect, but as usual – they don’t share the code. Too bad.
Well this is your chance to see how it’s done, and it’s fairly simple as well.

Let’s start by getting the user’s head position. This is done with the OpenNI library, which provides a skeleton model and hence the head joint. I used the NiUserTracker sample code as a basis and stripped out everything that wasn’t needed.
The only things I was interested in were the head and hand positions, so I created a struct to hold these, plus a few things OpenNI needs in order to get the positions. I did this so the tracking could run in a separate thread, with this struct acting as the shared memory:

struct openni_stuff {
	xn::DepthGenerator* dg;
	xn::UserGenerator* ug;
	xn::Context* ctx;
	XnSkeletonJointPosition* Head;
	XnSkeletonJointPosition* rh;
	XnSkeletonJointPosition* lh;
};

All of these must be populated in main() before starting the thread:

xn::Context g_Context;
xn::DepthGenerator g_DepthGenerator;
xn::UserGenerator g_UserGenerator;
XnSkeletonJointPosition Head;
XnSkeletonJointPosition lHand;
XnSkeletonJointPosition rHand;
int main(..) {
..
g_Context.Init();
g_Context.FindExistingNode(XN_NODE_TYPE_DEPTH, g_DepthGenerator);
g_Context.FindExistingNode(XN_NODE_TYPE_USER, g_UserGenerator);
..
g_UserGenerator.Create(g_Context);
..
g_Context.StartGeneratingAll();
..
DWORD threadid;
struct openni_stuff s;
s.ctx = &g_Context;
s.dg = &g_DepthGenerator;
s.ug = &g_UserGenerator;
s.Head = &Head;
s.rh = &rHand;
s.lh = &lHand;
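// s lives on main's stack; that is fine here because classic GLUT's
// glutMainLoop() does not return, so s outlives the thread.
CreateThread(
	NULL,              // default security attributes
	0,                 // use default stack size
	MyThreadFunction,  // thread function name
	(LPVOID)(&s),      // argument to thread function
	0,                 // use default creation flags
	&threadid);        // returns the thread identifier
glutMainLoop();
}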

This code is heavily abridged. There is more to do before it actually works, and you can see the rest in the code repo.
Basically, the new thread is the one that pulls information from the OpenNI framework and keeps the head and hand position vectors updated.

void getOpenNIData(struct openni_stuff s); // defined below

DWORD WINAPI MyThreadFunction( LPVOID lpParam ) {
	// Unpack the struct; a shallow copy is fine since it only holds pointers
	struct openni_stuff s = *((struct openni_stuff*)lpParam);
	for(;;) {
		getOpenNIData(s);
		Sleep(30); // poll roughly every 30 ms
	}
	return 0;
}
void getOpenNIData (struct openni_stuff s)
{
	xn::SceneMetaData sceneMD;
	xn::DepthMetaData depthMD;
	if (!g_bPause)
	{
		// Read next available data
		s.ctx->WaitAndUpdateAll();
	}
	// Process the data (fetch the depth metadata once, after the update)
	s.dg->GetMetaData(depthMD);
	s.rh->position.X = 0.0f; // reset the right hand; X is a float, so use 0.0f rather than NULL
	s.ug->GetUserPixels(0, sceneMD);
	// The joints are passed in so they can be updated with the tracked positions
	DrawDepthMap(depthMD, sceneMD, *s.Head, *s.rh, *s.lh);
}

I thought this would give a performance boost, as the WaitAndUpdateAll() call usually takes a little while, but it didn’t matter much…
The OpenGL (GLUT) code runs on the main thread and just reads these updated vectors for the current position.
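
Note that the tracker thread writes the joint positions while the GLUT thread reads them, with nothing guarding the hand-off. Here is a minimal sketch of one way to protect it, assuming standard C++11 std::mutex rather than the Win32 primitives used above. Vec3 and SharedPose are hypothetical names for illustration; they are not part of OpenNI or of my code:

```cpp
#include <mutex>

// Hypothetical stand-in for the position part of XnSkeletonJointPosition.
struct Vec3 {
    float x, y, z;
};

// Hypothetical wrapper: the tracker thread calls set(), the render thread calls get().
class SharedPose {
public:
    // Publish a new head position (called from the tracker thread).
    void set(const Vec3& head) {
        std::lock_guard<std::mutex> lock(mutex_);
        head_ = head;
        valid_ = true;
    }

    // Copy the latest head position into `out`. Returns false, leaving `out`
    // untouched, until the tracker has published at least one position.
    bool get(Vec3& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!valid_) return false;
        out = head_;
        return true;
    }

private:
    mutable std::mutex mutex_;
    Vec3 head_{0.0f, 0.0f, 0.0f};
    bool valid_ = false;
};
```

With something like this, display() could keep its default eye position whenever get() returns false, instead of testing the head position against zero.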

Off-Axis projection

The concept of off-axis projection is very important for this project. This very good article explains everything about generalized perspective projections, and it also includes C code! I recommend reading it. Basically, off-axis projection is what you need when the eye’s line of sight is not perpendicular to the projection surface, and the eye need not be centered relative to it either. It’s what goes on in human binocular vision: both eyes look at the same point, but neither is perpendicular to the virtual projection surface (they are angled to it), and both are offset from the center. Just read that little paper…
Anyway, cutting to the chase: we need to project the rendered objects in the scene onto the projection table, assuming the user is not looking at it perpendicularly (as they would with a normal screen). Thanks to the code in the aforementioned article, this is a breeze.

void subtract(float u[3], float v[3], float n[3]) {
	u[0] = v[0] - n[0];
	u[1] = v[1] - n[1];
	u[2] = v[2] - n[2];
}
void projection( float *pa,
				float *pb,
				float *pc,
				float *pe, float n, float f)
{
	float va[3], vb[3], vc[3];
	float vr[3], vu[3], vn[3];
	float l, r, b, t, d, M[16];
	// Compute an orthonormal basis for the screen.
	subtract(vr, pb, pa);
	subtract(vu, pc, pa);
	glmNormalize(vr);
	glmNormalize(vu);
	glmCross(vr, vu, vn);
	glmNormalize(vn);
	// Compute the screen corner vectors.
	subtract(va, pa, pe);
	subtract(vb, pb, pe);
	subtract(vc, pc, pe);
	// Find the distance from the eye to screen plane.
	d = -glmDot(va, vn);
	// Find the extent of the perpendicular projection.
	l = glmDot(vr, va) * n / d;
	r = glmDot(vr, vb) * n / d;
	b = glmDot(vu, va) * n / d;
	t = glmDot(vu, vc) * n / d;
	// Load the perpendicular projection.
	glMatrixMode(GL_PROJECTION);
	glPushMatrix();
	glLoadIdentity();
	glFrustum(l, r, b, t, n, f);
	// Rotate the projection to be non-perpendicular.
	memset(M, 0, 16 * sizeof (float));
	M[0] = vr[0]; M[4] = vr[1]; M[ 8] = vr[2];
	M[1] = vu[0]; M[5] = vu[1]; M[ 9] = vu[2];
	M[2] = vn[0]; M[6] = vn[1]; M[10] = vn[2];
	M[15] = 1.0f;
	glMultMatrixf(M);
	// Move the apex of the frustum to the origin.
	glTranslatef(-pe[0], -pe[1], -pe[2]);
	glMatrixMode(GL_MODELVIEW);
	glPushMatrix();
}
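
If you want to sanity-check the frustum extents without an OpenGL context, the arithmetic from projection() can be pulled out on its own. The following is only a sketch: FrustumExtents, frustumExtents and the small vector helpers are hypothetical names written for this check, standing in for the glm.c functions, and the code computes just l, r, b, t and d without touching any GL state:

```cpp
#include <cmath>

// Hypothetical plain-data result mirroring l, r, b, t, d in projection().
struct FrustumExtents {
    float l, r, b, t, d;
};

static float dot3(const float a[3], const float b[3]) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

static void sub3(float out[3], const float a[3], const float b[3]) {
    for (int i = 0; i < 3; ++i) out[i] = a[i] - b[i];
}

static void cross3(float out[3], const float a[3], const float b[3]) {
    out[0] = a[1] * b[2] - a[2] * b[1];
    out[1] = a[2] * b[0] - a[0] * b[2];
    out[2] = a[0] * b[1] - a[1] * b[0];
}

static void normalize3(float v[3]) {
    const float len = std::sqrt(dot3(v, v));
    if (len > 0.0f) {
        for (int i = 0; i < 3; ++i) v[i] /= len;
    }
}

// pa = lower-left, pb = lower-right, pc = upper-left screen corners,
// pe = eye position, n = near-plane distance.
FrustumExtents frustumExtents(const float pa[3], const float pb[3],
                              const float pc[3], const float pe[3], float n) {
    float vr[3], vu[3], vn[3], va[3], vb[3], vc[3];

    // Orthonormal basis for the screen: right, up, normal.
    sub3(vr, pb, pa);
    sub3(vu, pc, pa);
    normalize3(vr);
    normalize3(vu);
    cross3(vn, vr, vu);
    normalize3(vn);

    // Vectors from the eye to the screen corners.
    sub3(va, pa, pe);
    sub3(vb, pb, pe);
    sub3(vc, pc, pe);

    FrustumExtents e;
    e.d = -dot3(va, vn);            // distance from the eye to the screen plane
    e.l = dot3(vr, va) * n / e.d;   // extents scaled back to the near plane
    e.r = dot3(vr, vb) * n / e.d;
    e.b = dot3(vu, va) * n / e.d;
    e.t = dot3(vu, vc) * n / e.d;
    return e;
}
```

For a 2x2 screen in the XY plane with the eye centered 2 units in front of it and n = 1, this gives a symmetric frustum (l = -0.5, r = 0.5). Moving the eye 0.5 to the right skews it to l = -0.75, r = 0.25, which is exactly the asymmetry that produces the parallax effect.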

I am using glm.h & glm.c from Nate Robins to do some basic linear algebra. I just didn’t feel like rewriting the code, and I’m already using it to load Wavefront OBJ models. The only function it lacks is subtract, which is included above.
Loading the OBJ models is super easy with glm.h:

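objmodel_ptr = glmReadOBJ("../bunny1.obj");
if (!objmodel_ptr)
	exit(1); // the model failed to load, so exit with a non-zero status
glmUnitize(objmodel_ptr);              // scale and center the model into a unit cube
glmFacetNormals(objmodel_ptr);         // per-face normals
glmVertexNormals(objmodel_ptr, 90.0);  // smooth normals, 90 degree crease angle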

Now that we can create off-axis views (this can be reused for other projects, such as projects with VR glasses!), I draw the scene after applying this projection:

GLfloat eye[4] = {0,200,1050,0}; //position of eye
double kinectHeight = 300;  //the Kinect is by the table, at a certain height (measured)
GLdouble tlv[3] = {-530, -kinectHeight, 90},   //top-left point of table in Kinect coordinates (millimeters)
		trv[3] = {530, -kinectHeight, 90}, //top-right
		brv[3] = {530, -kinectHeight, 955}, //bottom-right
		blv[3] = {-530, -kinectHeight, 955}; //bottom-left
GLdouble obj[3] = {-200, tlv[1], 522.5}; //the virtual object's real-world position (mm)
static void display(GLenum mode)
{
       //set the eye position
	if(Head.position.X != 0.0f || Head.position.Y != 0.0f || Head.position.Z != 0.0f)
	{
		eye[0] = Head.position.X;
		eye[1] = Head.position.Y;
		eye[2] = Head.position.Z;
	}
	glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);
	offAxisView();
}
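void offAxisView() {
	// blvf, brvf, tlvf are presumably float copies of blv, brv, tlv
	// (projection() takes float*); see the repo for the actual definitions
	projection(blvf, brvf, tlvf, eye, 1.0f, 10000.0f);
	glLightfv(GL_LIGHT0, GL_POSITION, lightp);
	drawScene();
	// Undo the two glPushMatrix() calls made inside projection()
	glPopMatrix();
	glMatrixMode(GL_PROJECTION);
	glPopMatrix();
	glMatrixMode(GL_MODELVIEW);
}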
void drawScene() {
	//Just draw an object..
	glPushMatrix();
	glTranslated(obj[0]-10,obj[1]+80,obj[2]); //translating to accommodate the obj size
	glColor4f(1.0, 0.0, 0.0, 1.0);
	glScaled(80,80,80);
	glmDraw(objmodel_ptr,GLM_SMOOTH);
	glPopMatrix();
}

You can see that I measured the position of the table with respect to the center of the Kinect sensor, which we assume to be the origin. These measurements are used for the off-axis projection w.r.t. the eye.
That’s pretty much it… the program runs, you have to stand in the silly “Psi” position for the OpenNI framework to calibrate, and then the graphics will be rendered according to your head position.
To create your own setup, just put in the right position of the table with respect to the Kinect sensor in real-world coordinates (mm).
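
As a rough sketch of that last step, the four corners can be derived from a handful of measurements instead of being typed in by hand. This assumes, as in my setup, that the table is level, centered on the sensor and axis-aligned with it. TableCorners, makeTableCorners and tableCenterZ are hypothetical helpers, not functions from the repo:

```cpp
// Hypothetical container for the four table corners, in Kinect coordinates (mm).
struct TableCorners {
    float tl[3], tr[3], br[3], bl[3];
};

// halfWidth:    half the table width (mm), table centered on the sensor's X axis
// kinectHeight: height of the sensor above the table surface (mm)
// topZ:         distance from the sensor to the table's "top" edge (mm)
// bottomZ:      distance from the sensor to the table's "bottom" edge (mm)
TableCorners makeTableCorners(float halfWidth, float kinectHeight,
                              float topZ, float bottomZ) {
    TableCorners c = {
        {-halfWidth, -kinectHeight, topZ},     // top-left
        { halfWidth, -kinectHeight, topZ},     // top-right
        { halfWidth, -kinectHeight, bottomZ},  // bottom-right
        {-halfWidth, -kinectHeight, bottomZ}   // bottom-left
    };
    return c;
}

// Z coordinate of the middle of the table, handy for placing the virtual object.
float tableCenterZ(const TableCorners& c) {
    return 0.5f * (c.tl[2] + c.bl[2]);
}
```

Calling makeTableCorners(530, 300, 90, 955) reproduces the tlv, trv, brv and blv values above, and tableCenterZ() then returns 522.5, the same Z used for obj. Storing the corners as float also means they can be passed straight to projection().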

Code

Can be downloaded from SVN as usual:
svn co https://morethantechnical.googlecode.com/svn/trunk/kinect_motion_parallax/main.cpp

Enjoy
Roy.

2 replies on “A motion parallax screen using Kinect [w/ code]”

I want C++ code for the Intel OASIS project. If you can, please help me.

Does this produce the effect that Johnny Chung Lee was able to portray? I mean, if this is not rendered on a table but simply displayed on the screen, does the effect equal what Johnny Lee did?
