Simple triangulation with OpenCV from Harley & Zisserman [w/ code]

Hi
I sense that a lot of people are looking for a simple triangulation method with OpenCV, when they have two images and matching features.
While OpenCV contains the function cvTriangulatePoints in the triangulation.cpp file, it is not documented, and uses the arcane C API.
Luckily, Hartley and Zisserman describe in their excellent book "Multiple View Geometry" (in many cases considered to be "The Bible" of 3D reconstruction), a simple method for linear triangulation. This method is actually discussed earlier in Hartley's article "Triangulation".
I implemented it using the new OpenCV 2.3+ C++ API, which makes it super easy, and here it is before you.

Edit (4/25/2015): In a new post I am using OpenCV's cv::triangulatePoints() function. The code is available online in a gist.

Edit (6/5/2014): See some of my more recent work on structure from motion in this post on SfM and that post on the recent Qt GUI and SfM library.

Update 2017: See the new Mastering OpenCV3 book with a deeper discussion, and a more recent post on the implications of using OpenCV3 for SfM.

The thing about triangulation is that you need to know the extrinsic parameters of your cameras - the difference in location and rotation between them.
To get the camera matrices... that's a different story that I'm going to cover shortly in a tutorial (already in writing) about Structure from Motion.

But let's assume that we already have the extrinsic matrices. In most cases, where you know what the motion is (you took the pictures :), you can just write the matrices explicitly.

Linear Triangulation

/**
 From "Triangulation", Hartley, R.I. and Sturm, P., Computer vision and image understanding, 1997
 */
Mat_ LinearLSTriangulation(Point3d u,		//homogenous image point (u,v,1)
				   Matx34d P,		//camera 1 matrix
				   Point3d u1,		//homogenous image point in 2nd camera
				   Matx34d P1		//camera 2 matrix
								   )
{
	//build matrix A for homogenous equation system Ax = 0
	//assume X = (x,y,z,1), for Linear-LS method
	//which turns it into a AX = B system, where A is 4x3, X is 3x1 and B is 4x1
	Matx43d A(u.x*P(2,0)-P(0,0),	u.x*P(2,1)-P(0,1),		u.x*P(2,2)-P(0,2),
		  u.y*P(2,0)-P(1,0),	u.y*P(2,1)-P(1,1),		u.y*P(2,2)-P(1,2),
		  u1.x*P1(2,0)-P1(0,0), u1.x*P1(2,1)-P1(0,1),	u1.x*P1(2,2)-P1(0,2),
		  u1.y*P1(2,0)-P1(1,0), u1.y*P1(2,1)-P1(1,1),	u1.y*P1(2,2)-P1(1,2)
			  );
	Mat_ B = (Mat_(4,1) <<	-(u.x*P(2,3)	-P(0,3)),
					  -(u.y*P(2,3)	-P(1,3)),
					  -(u1.x*P1(2,3)	-P1(0,3)),
					  -(u1.y*P1(2,3)	-P1(1,3)));

	Mat_ X;
	solve(A,B,X,DECOMP_SVD);

	return X;
}

This method relies very simply on the principle that every 2D point in image plane coordinates is a projection of the real 3D point. So if you have two views, you can set up an overdetermined linear equation system to solve for the 3D position.

See how simple defining a Matx43d struct from scratch and using it in solve(..) is?
I tried doing some more fancy stuff with Mat.row(i) and Mat.col(i), trying to stick to Hartley's description of the A matrix, but it just didn't work.

Using it

Using this method is easy:

//Triagulate points
void TriangulatePoints(const vector& pt_set1,
					   const vector& pt_set2,
					   const Mat& Kinv,
					   const Matx34d& P,
					   const Matx34d& P1,
					   vector& pointcloud,
					   vector& correspImg1Pt)
{
#ifdef __SFM__DEBUG__
	vector depths;
#endif

	pointcloud.clear();
	correspImg1Pt.clear();

	cout << "Triangulating...";
	double t = getTickCount();
	unsigned int pts_size = pt_set1.size();
#pragma omp parallel for
	for (unsigned int i=0; i		Point2f kp = pt_set1[i];
		Point3d u(kp.x,kp.y,1.0);
		Mat_ um = Kinv * Mat_(u);
		u = um.at(0);
		Point2f kp1 = pt_set2[i];
		Point3d u1(kp1.x,kp1.y,1.0);
		Mat_ um1 = Kinv * Mat_(u1);
		u1 = um1.at(0);

		Mat_ X = IterativeLinearLSTriangulation(u,P,u1,P1);

//		if(X(2) > 6 || X(2) < 0) continue;

#pragma omp critical
		{
			pointcloud.push_back(Point3d(X(0),X(1),X(2)));
			correspImg1Pt.push_back(pt_set1[i]);
#ifdef __SFM__DEBUG__
			depths.push_back(X(2));
#endif
		}
	}
	t = ((double)getTickCount() - t)/getTickFrequency();
	cout << "Done. ("<
	//show "range image"
#ifdef __SFM__DEBUG__
	{
		double minVal,maxVal;
		minMaxLoc(depths, &minVal, &maxVal);
		Mat tmp(240,320,CV_8UC3); //cvtColor(img_1_orig, tmp, CV_BGR2HSV);
		for (unsigned int i=0; i			double _d = MAX(MIN((pointcloud[i].z-minVal)/(maxVal-minVal),1.0),0.0);
			circle(tmp, correspImg1Pt[i], 1, Scalar(255 * (1.0-(_d)),255,255), CV_FILLED);
		}
		cvtColor(tmp, tmp, CV_HSV2BGR);
		imshow("out", tmp);
		waitKey(0);
		destroyWindow("out");
	}
#endif
}

Note that you must have the camera matrix K (a 3x3 matrix of the intrinsic parameters), or rather it's inverse, noted here as Kinv.

Results and some discussion

3D view of the triangulated point cloud

Notice how stuff is distorted in the 3D view... but this is not due projective ambiguity! as I am using the Essential Matrix to obtain the camera P matrices (cameras are calibrated). Hartley and Zisserman explain this in their book on page 258, and the reasons for projective ambiguity (and how to resolve it) on page 265. The distortion must be due to inaccurate point correspondence...

The cool visualization is done using the excellent PCL library.

Iterative Linear Triangulation

Hartley, in his article "Triangulation" describes another triangulation algorithm, an iterative one, which he reports to "perform substantially better than the [...] non-iterative linear methods". It is, again, very easy to implement, and here it is:

/**
 From "Triangulation", Hartley, R.I. and Sturm, P., Computer vision and image understanding, 1997
 */
Mat_<double> IterativeLinearLSTriangulation(Point3d u,	//homogenous image point (u,v,1)
											Matx34d P,			//camera 1 matrix
											Point3d u1,			//homogenous image point in 2nd camera
											Matx34d P1			//camera 2 matrix
											) {
	double wi = 1, wi1 = 1;
	Mat_<double> X(4,1); 
	for (int i=0; i<10; i++) { //Hartley suggests 10 iterations at most
		Mat_<double> X_ = LinearLSTriangulation(u,P,u1,P1);
		X(0) = X_(0); X(1) = X_(1); X(2) = X_(2); X_(3) = 1.0;
		
		//recalculate weights
		double p2x = Mat_<double>(Mat_<double>(P).row(2)*X)(0);
		double p2x1 = Mat_<double>(Mat_<double>(P1).row(2)*X)(0);
		
		//breaking point
		if(fabsf(wi - p2x) <= EPSILON && fabsf(wi1 - p2x1) <= EPSILON) break;
		
		wi = p2x;
		wi1 = p2x1;
		
		//reweight equations and solve
		Matx43d A((u.x*P(2,0)-P(0,0))/wi,		(u.x*P(2,1)-P(0,1))/wi,			(u.x*P(2,2)-P(0,2))/wi,		
				  (u.y*P(2,0)-P(1,0))/wi,		(u.y*P(2,1)-P(1,1))/wi,			(u.y*P(2,2)-P(1,2))/wi,		
				  (u1.x*P1(2,0)-P1(0,0))/wi1,	(u1.x*P1(2,1)-P1(0,1))/wi1,		(u1.x*P1(2,2)-P1(0,2))/wi1,	
				  (u1.y*P1(2,0)-P1(1,0))/wi1,	(u1.y*P1(2,1)-P1(1,1))/wi1,		(u1.y*P1(2,2)-P1(1,2))/wi1
				  );
		Mat_<double> B = (Mat_<double>(4,1) <<	-(u.x*P(2,3)	-P(0,3))/wi,
						  -(u.y*P(2,3)	-P(1,3))/wi,
						  -(u1.x*P1(2,3)	-P1(0,3))/wi1,
						  -(u1.y*P1(2,3)	-P1(1,3))/wi1
						  );
		
		solve(A,B,X_,DECOMP_SVD);
		X(0) = X_(0); X(1) = X_(1); X(2) = X_(2); X_(3) = 1.0;
	}
	return X;
}

(remember to define your EPSILON)
This time he works iteratively in order to minimize the reprojection error of the reconstructed point to the original image coordinate, by weighting the linear equation system.

Recap

So we've seen how easy it is to implement these triangulation methods using OpenCV's nice Matx### and Mat_ structs.
Also solve(...,DECOMP_SVD) is very handy for overdetermined non-homogeneous linear equation systems.
Watch out for my Structure from Motion tutorial coming up, which will be all about using OpenCV to get point correspondence from pairs of images, obtaining camera matrices and recovering dense depth.

If you are looking for more robust solutions for SfM and 3D reconstructions, see:
http://phototour.cs.washington.edu/bundler/
http://code.google.com/p/libmv/
http://www.cs.washington.edu/homes/ccwu/vsfm/
Enjoy,
Roy.

Share