So I was contacted earlier by someone asking about the Head Pose Estimation work I put up a while back. And I remembered that I needed to go back to that work and fix some things, so it was a great opportunity.
I ended up making it a bit nicer, and it’s also a good chance for us to review some OpenCV-OpenGL interoperation stuff. Things like getting a projection matrix in OpenCV and translating it into an OpenGL ModelView matrix are very handy.
Let’s get down to the code.
Using PnP
Basically nothing has changed from last time: I use PnP to get the 6DOF pose of the head from point correspondences. I pick out the correspondences manually beforehand, marking the 2D position of the Left Eye, Right Eye, Left Ear, Right Ear, Left Mouth corner, Right Mouth corner and Nose. Then I used a 3D model of a female head from TurboSquid (here) to get the 3D points of the same features, simply using MeshLab’s “Get Info” selector.
Solving a PnP (Perspective-N-Point) problem is a good fit when you have 2D-3D correspondences and want to recover the 3D object’s pose (6DOF: rotation and translation).
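For reference, what the solver is fitting is the standard pinhole projection: each 3D model point maps to its 2D image point through the intrinsic matrix K and the unknown pose [R|t],

$$ s \begin{bmatrix} u & v & 1 \end{bmatrix}^T = K \, [R \mid t] \, \begin{bmatrix} X & Y & Z & 1 \end{bmatrix}^T $$

and solvePnP looks for the R and t that best explain all the correspondences at once.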
I ended up with a set of 3D points
modelPoints.push_back(Point3f(2.37427,110.322,21.7776));   // l eye (v 314)
modelPoints.push_back(Point3f(70.0602,109.898,20.8234));   // r eye (v 0)
modelPoints.push_back(Point3f(36.8301,78.3185,52.0345));   // nose (v 1879)
modelPoints.push_back(Point3f(14.8498,51.0115,30.2378));   // l mouth (v 1502)
modelPoints.push_back(Point3f(58.1825,51.0115,29.6224));   // r mouth (v 695)
modelPoints.push_back(Point3f(-61.8886,127.797,-89.4523)); // l ear (v 2011)
modelPoints.push_back(Point3f(127.603,126.9,-83.9129));    // r ear (v 1138)
And 2D points
102 108
144 114
116 136
104 152
132 153
96 100
198 106
for every image of Angelina that I had.
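The loading itself lives in the repository, but as a rough sketch (assuming the pairs above are listed in the same left-eye/right-eye/nose/mouth/ears order as the model points, which is what solvePnP needs), the points for one image just end up in a matrix like this:

// Hypothetical sketch - the real loading code is in the repo.
// Row i here must correspond to row i of the 3D model points above.
vector<Point2f> imagePoints;
imagePoints.push_back(Point2f(102,108)); // l eye
imagePoints.push_back(Point2f(144,114)); // r eye
imagePoints.push_back(Point2f(116,136)); // nose
imagePoints.push_back(Point2f(104,152)); // l mouth
imagePoints.push_back(Point2f(132,153)); // r mouth
imagePoints.push_back(Point2f(96,100));  // l ear
imagePoints.push_back(Point2f(198,106)); // r ear
Mat ip(imagePoints); // Nx1, 2-channel float - solvePnP accepts this as the image points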
The next step is pretty obvious – we solve the PnP:
void loadWithPoints(Mat& ip, Mat& img) {
    int max_d = MAX(img.rows,img.cols);
    camMatrix = (Mat_<double>(3,3) << max_d, 0,     img.cols/2.0,
                                      0,     max_d, img.rows/2.0,
                                      0,     0,     1.0);
    cout << "using cam matrix " << endl << camMatrix << endl;

    double _dc[] = {0,0,0,0};
    solvePnP(op,ip,camMatrix,Mat(1,4,CV_64FC1,_dc),rvec,tvec,false,CV_EPNP);

    Mat rotM(3,3,CV_64FC1,rot);
    Rodrigues(rvec,rotM);
    double* _r = rotM.ptr<double>();
    printf("rotation mat: \n %.3f %.3f %.3f\n%.3f %.3f %.3f\n%.3f %.3f %.3f\n",
           _r[0],_r[1],_r[2],_r[3],_r[4],_r[5],_r[6],_r[7],_r[8]);

    printf("trans vec: \n %.3f %.3f %.3f\n",tv[0],tv[1],tv[2]);

    double _pm[12] = {_r[0],_r[1],_r[2],tv[0],
                      _r[3],_r[4],_r[5],tv[1],
                      _r[6],_r[7],_r[8],tv[2]};

    Matx34d P(_pm);
    Mat KP = camMatrix * Mat(P);
    // cout << "KP " << endl << KP << endl;

    //reproject object points - check validity of found projection matrix
    for (int i=0; i<op.rows; i++) {
        Mat_<double> X = (Mat_<double>(4,1) << op.at<float>(i,0),op.at<float>(i,1),op.at<float>(i,2),1.0);
        // cout << "object point " << X << endl;
        Mat_<double> opt_p = KP * X;
        Point2f opt_p_img(opt_p(0)/opt_p(2),opt_p(1)/opt_p(2));
        // cout << "object point reproj " << opt_p_img << endl;

        circle(img, opt_p_img, 4, Scalar(0,0,255), 1);
    }
    rotM = rotM.t(); // transpose to conform with majorness of opengl matrix
}
solvePnP gives us the rotation as a Rodrigues vector (rvec), which Rodrigues() turns into a 3x3 matrix, plus a translation vector. Luckily we can simply use them in OpenGL to render, like we do in Augmented Reality, but note that I’m transposing the rotation matrix because OpenGL expects Column-Major matrices while OpenCV’s are Row-Major (see here).
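If you’d rather not keep a pre-transposed copy of the rotation around, the packing can also be done in one go. Here is a hypothetical helper (not in the repository) that writes R column by column, which is exactly the same transpose trick; the camera-axis flip is still handled separately by the gluLookAt call in the display function below:

// Hypothetical helper: pack an OpenCV (row-major) rotation matrix and translation
// vector into the 16-double column-major array that glMultMatrixd expects.
void cvPoseToGLModelView(const Mat& R /*3x3 CV_64F*/, const Mat& t /*3x1 CV_64F*/, double m[16]) {
    for (int col = 0; col < 3; col++)
        for (int row = 0; row < 3; row++)
            m[col*4 + row] = R.at<double>(row, col); // swapped indices == transpose
    m[3] = m[7] = m[11] = 0.0;
    m[12] = t.at<double>(0); m[13] = t.at<double>(1); m[14] = t.at<double>(2); // translation column
    m[15] = 1.0;
}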
I also added a small check that reprojects the 3D points back onto the image, just to visualize that the fit is almost never 100% exact.
A word about OpenCV and OpenGL
I created a tiny reusable piece of code that goes with me whenever I mix OpenCV and OpenGL. Basically all it has are functions to load textures from OpenCV Mats into OpenGL textures and draw them to the raster.
void copyImgToTex(const Mat& _tex_img, GLuint* texID, double* _twr, double* _thr);

typedef struct my_texture {
    GLuint tex_id;
    double twr,thr,aspect_w2h;
    Mat image;
    my_texture():tex_id(-1),twr(1.0),thr(1.0) {}
    bool initialized;
    void set(const Mat& ocvimg) {
        ocvimg.copyTo(image);
        copyImgToTex(image, &tex_id, &twr, &thr);
        aspect_w2h = (double)ocvimg.cols/(double)ocvimg.rows;
    }
} OpenCVGLTexture;

void glEnable2D();   // setup 2D drawing
void glDisable2D();  // end 2D drawing
OpenCVGLTexture MakeOpenCVGLTexture(const Mat& _tex_img); // create an OpenCV-OpenGL image
void drawOpenCVImageInGL(const OpenCVGLTexture& tex);     // draw an OpenCV-OpenGL image
Very basic stuff, just binding and uploading textures and drawing them in 2D to the screen.
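If you’re curious what the upload boils down to, here is a stripped-down sketch; it’s an approximation, not the exact code from the repo, and it assumes a continuous 3-channel BGR Mat and a GL context that accepts non-power-of-two textures:

// Rough sketch of uploading a cv::Mat as an OpenGL texture (not the repo's exact code).
// Assumes img is a continuous, 3-channel BGR image.
void uploadMatAsTexture(const Mat& img, GLuint* texID) {
    glGenTextures(1, texID);
    glBindTexture(GL_TEXTURE_2D, *texID);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1); // cv::Mat rows are not 4-byte aligned in general
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, img.cols, img.rows, 0,
                 GL_BGR, GL_UNSIGNED_BYTE, img.data); // OpenCV stores pixels as BGR
}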
One more sorta useful thing is getting the pixels back from OpenGL after rendering:
void saveOpenGLBuffer() {
    static unsigned int opengl_buffer_num = 0;

    int vPort[4];
    glGetIntegerv(GL_VIEWPORT, vPort);

    Mat_<Vec3b> opengl_image(vPort[3],vPort[2]);
    {
        Mat_<Vec4b> opengl_image_4b(vPort[3],vPort[2]);
        glReadPixels(0, 0, vPort[2], vPort[3], GL_BGRA, GL_UNSIGNED_BYTE, opengl_image_4b.data);
        flip(opengl_image_4b,opengl_image_4b,0);
        mixChannels(&opengl_image_4b, 1, &opengl_image, 1, &(Vec6i(0,0,1,1,2,2)[0]), 3);
    }

    stringstream ss;
    ss << "opengl_buffer_" << opengl_buffer_num++ << ".jpg";
    imwrite(ss.str(), opengl_image);
}
You can also use this just to grab the rendered image as a Mat, without saving it to a file.
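For example, a hypothetical variant of the function above that hands back a cv::Mat instead of writing a JPEG:

// Hypothetical variant of saveOpenGLBuffer(): read the framebuffer into a BGR Mat and return it.
Mat readOpenGLBuffer() {
    int vPort[4];
    glGetIntegerv(GL_VIEWPORT, vPort);

    Mat_<Vec4b> pixels4b(vPort[3], vPort[2]);
    glReadPixels(0, 0, vPort[2], vPort[3], GL_BGRA, GL_UNSIGNED_BYTE, pixels4b.data);
    flip(pixels4b, pixels4b, 0); // OpenGL's origin is bottom-left, OpenCV's is top-left

    Mat bgr;
    cvtColor(pixels4b, bgr, CV_BGRA2BGR); // drop the alpha channel
    return bgr;
}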
Visualizing the results
My display function is pretty straightforward, but I’ll show it here anyway:
void display(void) {
    // draw the image in the back
    int vPort[4];
    glGetIntegerv(GL_VIEWPORT, vPort);
    glEnable2D();
    drawOpenCVImageInGL(imgTex);
    glTranslated(vPort[2]/2.0, 0, 0);
    drawOpenCVImageInGL(imgWithDrawing);
    glDisable2D();

    glClear(GL_DEPTH_BUFFER_BIT); // we want to draw stuff over the image

    // draw only on left part
    glViewport(0, 0, vPort[2]/2, vPort[3]);

    glPushMatrix();

    gluLookAt(0,0,0,0,0,1,0,-1,0);

    // put the object in the right position in space
    Vec3d tvv(tv[0],tv[1],tv[2]);
    glTranslated(tvv[0], tvv[1], tvv[2]);

    // rotate it
    double _d[16] = { rot[0],rot[1],rot[2],0,
                      rot[3],rot[4],rot[5],0,
                      rot[6],rot[7],rot[8],0,
                      0,     0,     0,     1};
    glMultMatrixd(_d);

    // draw the 3D head model
    glColor4f(1, 1, 1, 0.75);
    glmDraw(head_obj, GLM_SMOOTH);

    //----------Axes
    glScaled(50, 50, 50);
    drawAxes();
    //----------End axes

    glPopMatrix();

    // restore to looking at complete viewport
    glViewport(0, 0, vPort[2], vPort[3]);

    glutSwapBuffers();
}
I first draw the OpenCV images on the raster, then add in the 3D head model.
Note that I’m using the exact results I got from solvePnP – the variables rot and tvec.
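One thing the display function doesn’t show is the projection setup. For the rendering to line up with the image, the OpenGL projection should match the intrinsic matrix fed to solvePnP. Here is a hypothetical sketch of doing that with gluPerspective (the function name and the near/far planes are mine, not from the repo):

// Hypothetical sketch: make gluPerspective() match the intrinsics used for solvePnP,
// i.e. focal length f = MAX(rows,cols) and the principal point at the image center.
void setGLProjectionFromCamMatrix(int img_rows, int img_cols, double aspect) {
    double f = MAX(img_rows, img_cols);                             // same f as camMatrix
    double fovy = 2.0 * atan((img_rows / 2.0) / f) * 180.0 / CV_PI; // vertical field of view, degrees
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(fovy, aspect, 1.0, 1000.0);                      // near/far chosen arbitrarily
    glMatrixMode(GL_MODELVIEW);
}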
For some strange reason, I’m getting some wonky faces on the 3D model… I tried using MeshLab to fix all the faces, vertices, normals, etc., but to no avail. Can you tell what the problem is?
Results
Here’s a montage of all the results:
Code and Salutations
Code is up on GitHub: https://github.com/royshil/HeadPosePnP
Thanks for tuning in!
Roy.