Head Pose Estimation with OpenCV & OpenGL Revisited [w/ code]

Someone contacted me recently asking about the Head Pose Estimation work I put up a while back. It reminded me that I needed to go back to that work and fix some things, so it was a great opportunity.
I ended up making it a bit nicer, and it's also a good chance for us to review some OpenCV-OpenGL interoperation stuff. Things like getting a projection matrix in OpenCV and translating it to an OpenGL ModelView matrix are very handy.
Let's get down to the code.

Using PnP

Basically nothing has changed from last time: I use PnP to get the 6DOF pose of the head from point correspondences. I pick out the correspondences manually beforehand by marking the 2D positions of seven features: Left Eye, Right Eye, Left Ear, Right Ear, Left Mouth, Right Mouth and Nose. Then I used a 3D model of a female head from TurboSquid (here) to get the 3D points of the same features, simply using MeshLab's "Get Info" selector.
Solving a PnP (Perspective-n-Point) problem is the way to go when you have 2D-3D correspondences and want to recover the 3D object's pose relative to the camera, i.e. its rotation and translation (6DOF).
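
To make the 2D-3D relationship concrete, here is a minimal sketch of the forward model that PnP inverts: a model point X lands in the image at x ~ K (R X + t). This is not code from the project; projectPoint and Pixel are hypothetical names, it uses only the standard library, and the intrinsics are reduced to a single focal length plus a principal point.

```cpp
#include <array>
#include <cassert>

struct Pixel { double u, v; };

// Hypothetical helper: projects model point X into the image as x ~ K (R X + t).
// R is a row-major 3x3 rotation, t a translation, and K is reduced to a focal
// length f (in pixels) plus a principal point (cx, cy).
Pixel projectPoint(const std::array<double, 9>& R,
                   const std::array<double, 3>& t,
                   double f, double cx, double cy,
                   const std::array<double, 3>& X) {
    // Move the point from model coordinates into camera coordinates.
    const double xc = R[0] * X[0] + R[1] * X[1] + R[2] * X[2] + t[0];
    const double yc = R[3] * X[0] + R[4] * X[1] + R[5] * X[2] + t[1];
    const double zc = R[6] * X[0] + R[7] * X[1] + R[8] * X[2] + t[2];
    assert(zc > 0.0 && "point must be in front of the camera");
    // Perspective divide, then apply the intrinsics.
    return Pixel{ f * xc / zc + cx, f * yc / zc + cy };
}
```

PnP goes the other way: given several X and their observed pixels, it searches for the R and t that make this projection agree with the observations.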

I ended up with a set of 3D points:

	modelPoints.push_back(Point3f(2.37427,110.322,21.7776));	// l eye (v 314)
	modelPoints.push_back(Point3f(70.0602,109.898,20.8234));	// r eye (v 0)
	modelPoints.push_back(Point3f(36.8301,78.3185,52.0345));	//nose (v 1879)
	modelPoints.push_back(Point3f(14.8498,51.0115,30.2378));	// l mouth (v 1502)
	modelPoints.push_back(Point3f(58.1825,51.0115,29.6224));	// r mouth (v 695)
	modelPoints.push_back(Point3f(-61.8886,127.797,-89.4523));	// l ear (v 2011)
	modelPoints.push_back(Point3f(127.603,126.9,-83.9129));		// r ear (v 1138)

And 2D points

102 108
144 114
116 136
104 152
132 153
96 100
198 106

for every image of Angelina that I had.
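
If the 2D points live in a plain text file with one "x y" pair per line, as in the listing above, loading them can be sketched like this. ImagePoint and readImagePoints are hypothetical names, not functions from the project, and the file layout is an assumption based on that listing.

```cpp
#include <istream>
#include <sstream>
#include <vector>

struct ImagePoint { float x, y; };

// Hypothetical helper: reads whitespace-separated "x y" pairs until the
// stream ends or a token fails to parse as a number.
std::vector<ImagePoint> readImagePoints(std::istream& in) {
    std::vector<ImagePoint> pts;
    ImagePoint p;
    while (in >> p.x >> p.y) {
        pts.push_back(p);
    }
    return pts;
}
```

The i-th 2D point has to describe the same facial feature as the i-th 3D model point, so it is worth checking that both lists have the same length before solving.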

The next step is pretty obvious - we solve the PnP:

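void loadWithPoints(Mat& ip, Mat& img) {
	// approximate intrinsics: focal length ~ larger image dimension, principal point at the image center
	int max_d = MAX(img.rows,img.cols);
	camMatrix = (Mat_<double>(3,3) << max_d, 0, img.cols/2.0,
										0,	max_d, img.rows/2.0,
										0,	0,	1.0);
	cout << "using cam matrix " << endl << camMatrix << endl;
	double _dc[] = {0,0,0,0};	// assume no lens distortion

	// solve for the pose. op (3D model points), rvec, tvec, rot and tv are globals
	// defined elsewhere; tvec is assumed to be a Mat header over tv
	solvePnP(op,ip,camMatrix,Mat(1,4,CV_64FC1,_dc),rvec,tvec);

	Mat rotM(3,3,CV_64FC1,rot);	// header over the global rot array, no copy
	Rodrigues(rvec,rotM);		// rotation vector -> 3x3 rotation matrix
	double* _r = rotM.ptr<double>();
	printf("rotation mat: \n %.3f %.3f %.3f\n%.3f %.3f %.3f\n%.3f %.3f %.3f\n",
		   _r[0],_r[1],_r[2],_r[3],_r[4],_r[5],_r[6],_r[7],_r[8]);

	printf("trans vec: \n %.3f %.3f %.3f\n",tv[0],tv[1],tv[2]);

	// 3x4 extrinsic matrix [R|t]
	double _pm[12] = {_r[0],_r[1],_r[2],tv[0],
					  _r[3],_r[4],_r[5],tv[1],
					  _r[6],_r[7],_r[8],tv[2]};

	Matx34d P(_pm);
	Mat KP = camMatrix * Mat(P);
//	cout << "KP " << endl << KP << endl;

	//reproject object points - check validity of found projection matrix
	for (int i=0; i<op.rows; i++) {
		Mat_<double> X = (Mat_<double>(4,1) << op.at<float>(i,0),op.at<float>(i,1),op.at<float>(i,2),1.0);
//		cout << "object point " << X << endl;
		Mat_<double> opt_p = KP * X;
		Point2f opt_p_img(opt_p(0)/opt_p(2),opt_p(1)/opt_p(2));
//		cout << "object point reproj " << opt_p_img << endl; 
		circle(img, opt_p_img, 4, Scalar(0,0,255), 1);
	}
	rotM = rotM.t();// transpose to conform with majorness of opengl matrix
}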

solvePnP gives us a rotation (as a rotation vector, which Rodrigues turns into a 3x3 matrix) and a translation vector. Luckily we can use them directly in OpenGL to render, like we do in Augmented Reality, but note that I'm transposing the rotation matrix because OpenGL expects column-major matrices, while OpenCV stores them row-major (see here).
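
The row-major to column-major step can be sketched without any OpenCV or OpenGL types. This is an illustration rather than the project's code: toColumnMajorModelView is a hypothetical name, and it packs the rotation and translation into one 4x4 array instead of transposing the rotation in place and translating separately as I do. It also does not touch the difference in axis conventions between the two libraries.

```cpp
#include <array>

// Hypothetical helper: packs a row-major 3x3 rotation and a translation into
// the column-major 4x4 layout that glLoadMatrixd / glMultMatrixd expect.
std::array<double, 16> toColumnMajorModelView(const std::array<double, 9>& R,
                                              const std::array<double, 3>& t) {
    std::array<double, 16> m{};  // value-initialized, so every entry starts at 0
    for (int row = 0; row < 3; ++row) {
        for (int col = 0; col < 3; ++col) {
            // Element (row, col) sits at col * 4 + row in column-major order.
            m[col * 4 + row] = R[row * 3 + col];
        }
        m[12 + row] = t[row];  // the translation is the fourth column
    }
    m[15] = 1.0;
    return m;
}
```

Passing the result's data() pointer to glMultMatrixd should have the same effect as a glTranslated with t followed by a glMultMatrixd with the transposed rotation.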
I also added a small check that reprojects the 3D points back onto the image, just to visualize that the fit is almost never perfect.
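
If you want a number instead of circles on the image, one option is the RMS distance in pixels between the clicked points and their reprojections. This is a sketch I'm adding for illustration, not something the project computes; Pt and rmsReprojectionError are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Hypothetical helper: root-mean-square pixel distance between the observed
// 2D points and the reprojected 3D model points (same order, same count).
double rmsReprojectionError(const std::vector<Pt>& observed,
                            const std::vector<Pt>& reprojected) {
    assert(!observed.empty() && observed.size() == reprojected.size());
    double sumSq = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double dx = observed[i].x - reprojected[i].x;
        const double dy = observed[i].y - reprojected[i].y;
        sumSq += dx * dx + dy * dy;
    }
    return std::sqrt(sumSq / static_cast<double>(observed.size()));
}
```

A value of a few pixels usually means the pose is plausible; a large value hints at a swapped correspondence or a poor guess for the camera matrix.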

A word about OpenCV and OpenGL

I created a tiny reusable piece of code that goes with me whenever I mix OpenCV and OpenGL. Basically all I have there are functions to load up textures from OpenCV Mats into OpenGL textures and draw them to the raster.

void copyImgToTex(const Mat& _tex_img, GLuint* texID, double* _twr, double* _thr);

typedef struct my_texture {
	GLuint tex_id;
	double twr,thr,aspect_w2h;
	Mat image;
	my_texture():tex_id(-1),twr(1.0),thr(1.0),aspect_w2h(1.0),initialized(false) {}
	bool initialized;
	void set(const Mat& ocvimg) { 
		ocvimg.copyTo(image);	// keep our own copy of the pixels before uploading
		copyImgToTex(image, &tex_id, &twr, &thr); 
		aspect_w2h = (double)ocvimg.cols/(double)ocvimg.rows;
	}
} OpenCVGLTexture;

void glEnable2D();	// setup 2D drawing
void glDisable2D(); // end 2D drawing
OpenCVGLTexture MakeOpenCVGLTexture(const Mat& _tex_img); // create an OpenCV-OpenGL image
void drawOpenCVImageInGL(const OpenCVGLTexture& tex); // draw an OpenCV-OpenGL image

Very basic stuff, just binding and uploading textures and drawing them in 2D to the screen.

One more sorta useful thing is getting the pixels back from OpenGL after rendering:

void saveOpenGLBuffer() {
	static unsigned int opengl_buffer_num = 0;
	int vPort[4]; glGetIntegerv(GL_VIEWPORT, vPort);
	Mat_<Vec3b> opengl_image(vPort[3],vPort[2]);
	{
		Mat_<Vec4b> opengl_image_4b(vPort[3],vPort[2]);
		glReadPixels(0, 0, vPort[2], vPort[3], GL_BGRA, GL_UNSIGNED_BYTE, opengl_image_4b.data);
		flip(opengl_image_4b, opengl_image_4b, 0);	// OpenGL rows start at the bottom, Mat rows at the top
		const int from_to[] = {0,0, 1,1, 2,2};		// keep B, G, R and drop alpha
		mixChannels(&opengl_image_4b, 1, &opengl_image, 1, from_to, 3);
	}
	stringstream ss; ss << "opengl_buffer_" << opengl_buffer_num++ << ".jpg";
	imwrite(ss.str(), opengl_image);
}

You can also use this just to grab the rendered image as a Mat, without saving it to a file.
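
One detail worth spelling out: glReadPixels returns rows starting from the bottom of the window, while an image in memory is normally stored top row first, and the readback above carries an alpha channel we don't need. Here is a minimal standard-library sketch of that conversion; bgraBottomUpToBgrTopDown is a hypothetical name, and the buffer is assumed to be tightly packed with 4 bytes per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts a bottom-up BGRA buffer (as read back from
// OpenGL) into a top-down BGR buffer (as an image writer expects).
std::vector<unsigned char> bgraBottomUpToBgrTopDown(
        const std::vector<unsigned char>& bgra, int width, int height) {
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    assert(bgra.size() == w * h * 4);

    std::vector<unsigned char> bgr(w * h * 3);
    for (std::size_t y = 0; y < h; ++y) {
        const std::size_t srcRow = h - 1 - y;  // flip vertically
        for (std::size_t x = 0; x < w; ++x) {
            const std::size_t src = (srcRow * w + x) * 4;
            const std::size_t dst = (y * w + x) * 3;
            bgr[dst + 0] = bgra[src + 0];  // B
            bgr[dst + 1] = bgra[src + 1];  // G
            bgr[dst + 2] = bgra[src + 2];  // R (alpha at src + 3 is dropped)
        }
    }
    return bgr;
}
```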

Visualizing the results

My display function is pretty straightforward, but I'll show it here anyway:

void display(void) {
	// draw the image in the back
	int vPort[4]; glGetIntegerv(GL_VIEWPORT, vPort);
	glTranslated(vPort[2]/2.0, 0, 0);

	glClear(GL_DEPTH_BUFFER_BIT); // we want to draw stuff over the image
	// draw only on left part
	glViewport(0, 0, vPort[2]/2, vPort[3]);

	// put the object in the right position in space
	Vec3d tvv(tv[0],tv[1],tv[2]);
	glTranslated(tvv[0], tvv[1], tvv[2]);

	// rotate it: rot was transposed after solvePnP, so reading it in order
	// gives the column-major layout OpenGL expects
	double _d[16] = {	rot[0],rot[1],rot[2],0,
						rot[3],rot[4],rot[5],0,
						rot[6],rot[7],rot[8],0,
						0,	   0,	  0		,1};
	glMultMatrixd(_d);

	// draw the 3D head model
	glColor4f(1, 1, 1,0.75);
	glmDraw(head_obj, GLM_SMOOTH);

	glScaled(50, 50, 50);
	//----------End axes

	// restore to looking at complete viewport
	glViewport(0, 0, vPort[2], vPort[3]); 
}


I first draw the OpenCV images on the raster, then add in the 3D head model.
Note that I'm using the exact results I got from solvePnP - the variables rot and tvec.
For some strange reason, I'm getting some wonky faces on the 3D model... I tried using MeshLab to fix all the faces, vertices, normals, etc. but to no avail. Can you tell what is the problem?


Here's a montage of all the results:

Code and Salutations

Code is up on GitHub: https://github.com/royshil/HeadPosePnP

Thanks for tuning in!