<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>More Than Technical &#187; video</title>
	<atom:link href="http://www.morethantechnical.com/category/video/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.morethantechnical.com</link>
	<description>On software, code, the internet and more.</description>
	<lastBuildDate>Mon, 23 Aug 2010 10:51:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=abc</generator>
<atom:link rel="hub" href="http://pubsubhubbub.appspot.com"/><atom:link rel="hub" href="http://superfeedr.com/hubbub"/>		<item>
		<title>Iterative Closest Point (ICP) for 2D curves with OpenCV [w/ code]</title>
		<link>http://www.morethantechnical.com/2010/06/06/iterative-closest-point-icp-with-opencv-w-code/</link>
		<comments>http://www.morethantechnical.com/2010/06/06/iterative-closest-point-icp-with-opencv-w-code/#comments</comments>
		<pubDate>Sun, 06 Jun 2010 07:59:37 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[Solutions]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opencv]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[curve fitting]]></category>
		<category><![CDATA[icp]]></category>
		<category><![CDATA[knn]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=666</guid>
		<description><![CDATA[ICP &#8211; Iterative closest point, is a very trivial algorithm for matching object templates to noisy data. It&#8217;s also super easy to program, so it&#8217;s good material for a tutorial. The goal is to take a known set of points (usually defining a curve or object exterior) and register it, as good as possible, to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2010/06/icp.png" rel="lightbox[666]"><img src="http://www.morethantechnical.com/wp-content/uploads/2010/06/icp.png" alt="" title="ICP" width="283" height="223" class="alignleft size-full wp-image-669" /></a>ICP (Iterative Closest Point) is a very simple algorithm for matching object templates to noisy data. It&#8217;s also super easy to program, so it&#8217;s good material for a tutorial. The goal is to take a known set of points (usually defining a curve or object exterior) and register it, as well as possible, to a set of other points, usually a larger and noisy set in which we would like to find the object. The basic algorithm is described very briefly in <a href="http://en.wikipedia.org/wiki/Iterative_Closest_Point">Wikipedia</a>, but there are a ton of papers on the subject.</p>
<p>I&#8217;ll take you through the steps of programming it with OpenCV.<br />
<span id="more-666"></span><br />
So the algorithm is basically:</p>
<ol>
<li>Find the points which are closest to our curve</li>
<li>Compute the best rotation and translation between our points and the closest points</li>
<li>Move our points according to the found transformation, and check the error</li>
<li>Run another iteration until convergence.</li>
</ol>
<p>How easy is that?? OpenCV gives us the tools to do everything in there!</p>
<h2>Finding closest points</h2>
<p>In v2.x of OpenCV they introduced the FLANN framework for fast approximate nearest-neighbor search. It&#8217;s fairly easy to use, and produces good results, although I did run into some problems with it before. Let&#8217;s see how it&#8217;s used:</p>
<pre class="brush: plain;">
float flann_knn(Mat&amp; m_destinations, Mat&amp; m_object, vector&lt;int&gt;&amp; ptpairs, vector&lt;float&gt;&amp; dists = vector&lt;float&gt;()) {
	// find nearest neighbors using FLANN
	cv::Mat m_indices(m_object.rows, 1, CV_32S);
	cv::Mat m_dists(m_object.rows, 1, CV_32F);

	Mat dest_32f; m_destinations.convertTo(dest_32f,CV_32FC2);
	Mat obj_32f; m_object.convertTo(obj_32f,CV_32FC2);

	assert(dest_32f.type() == CV_32F);

	cv::flann::Index flann_index(dest_32f, cv::flann::KDTreeIndexParams(2));  // using 2 randomized kdtrees
	flann_index.knnSearch(obj_32f, m_indices, m_dists, 1, cv::flann::SearchParams(64) );

	int* indices_ptr = m_indices.ptr&lt;int&gt;(0);
	//float* dists_ptr = m_dists.ptr&lt;float&gt;(0);
	for (int i=0;i&lt;m_indices.rows;++i) {
		ptpairs.push_back(indices_ptr[i]);
	}

	dists.resize(m_dists.rows);
	m_dists.copyTo(Mat(dists));

	return cv::sum(m_dists)[0];
}
</pre>
<p>This code was practically ripped off OpenCV&#8217;s sample code, and worked straight away. You can see I have 2 input matrices: the &#8220;destination&#8221; points (these are the unknown) and the &#8220;object&#8221; points (these are our points), plus an output vector that marks the correspondence between the two sets. I also have a vector for the distances between the points of each pair.<br />
I define a FLANN index object using a KD-tree and perform a KNN (K nearest neighbors) search on it with K=1, for all the object points. After invoking the search function, it&#8217;s all garnish.</p>
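<p>For small point sets, an exhaustive search makes a handy reference to check the FLANN pairing against. The following is only a sketch using the standard library: <code>Pt</code> and <code>bruteForceNN</code> are made-up names for illustration, not part of OpenCV, and the function returns the sum of squared distances.</p>

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 2D point type, used only for this illustration.
struct Pt { float x, y; };

// For every point in `object`, find the index of the closest point in
// `destinations` by exhaustive search. Fills `pairs` with those indices and
// `dists` with the squared distances, and returns the sum of `dists`.
float bruteForceNN(const std::vector<Pt>& destinations,
                   const std::vector<Pt>& object,
                   std::vector<int>& pairs,
                   std::vector<float>& dists) {
    pairs.assign(object.size(), -1);
    dists.assign(object.size(), 0.0f);
    if (destinations.empty()) return 0.0f;  // nothing to match against

    float total = 0.0f;
    for (std::size_t i = 0; i < object.size(); ++i) {
        float best = std::numeric_limits<float>::max();
        for (std::size_t j = 0; j < destinations.size(); ++j) {
            const float dx = destinations[j].x - object[i].x;
            const float dy = destinations[j].y - object[i].y;
            const float d2 = dx * dx + dy * dy;
            if (d2 < best) {
                best = d2;
                pairs[i] = static_cast<int>(j);
            }
        }
        dists[i] = best;
        total += best;
    }
    return total;
}
```

<p>This is quadratic in the number of points, which is exactly the cost a KD-tree index is meant to avoid, so it is only useful as a correctness check.</p>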
<h2>Compute transform</h2>
<p>With the point pairs given, computing the transform between the two sets should be easy. And it is. I started by computing it using least-squares to find the parameters of the transformation: rotation angle, translation in X and Y directions. But my solution was not &#8220;resistant&#8221; to scale, so I looked up another easy solution to the problem, and sure enough a simple solution was found in this <a href="http://www.cs.duke.edu/researchers/artificial_intelligence/temp/eggert_rigid_body_transformations.pdf">paper</a>.</p>
<pre class="brush: plain;">
void findBestReansformSVD(Mat&amp; _m, Mat&amp; _d) {
	Mat m; _m.convertTo(m,CV_32F);
	Mat d; _d.convertTo(d,CV_32F);

	Scalar d_bar = mean(d);
	Scalar m_bar = mean(m);
	Mat mc = m - m_bar;
	Mat dc = d - d_bar;

	mc = mc.reshape(1); dc = dc.reshape(1);

	Mat H = Mat::zeros(2,2,CV_32FC1);	// must start from zero, H is accumulated below
	for(int i=0;i&lt;mc.rows;i++) {
		Mat mci = mc(Range(i,i+1),Range(0,2));
		Mat dci = dc(Range(i,i+1),Range(0,2));
		H = H + mci.t() * dci;
	}

	cv::SVD svd(H);

	Mat R = svd.vt.t() * svd.u.t();
	double det_R = cv::determinant(R);
	if(abs(det_R + 1.0) &lt; 0.0001) {
		float _tmp[4] = {1,0,0,cv::determinant(svd.vt*svd.u)};
		R = svd.u * Mat(2,2,CV_32FC1,_tmp) * svd.vt;
	}
#ifdef BTM_DEBUG
	//for some strange reason the debug version of OpenCV is flipping the matrix
	R = -R;
#endif
	float* _R = R.ptr&lt;float&gt;(0);
	Scalar T(d_bar[0] - (m_bar[0]*_R[0] + m_bar[1]*_R[1]),d_bar[1] - (m_bar[0]*_R[2] + m_bar[1]*_R[3]));

	m = m.reshape(1);
	m = m * R;
	m = m.reshape(2);
	m = m + T;// + m_bar;
	m.convertTo(_m,CV_32S);
}
</pre>
<p>I really just followed what they said in the paper: take the mean point off the two sets, build a correlation matrix from the centered points, do <a href="http://en.wikipedia.org/wiki/Singular_value_decomposition">SVD</a> and use U and Vt for computing the rotation and translation. It actually works!</p>
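<p>In the 2D case the same least-squares problem also has a closed form that needs no SVD: the optimal angle comes from the summed dot and cross products of the centered pairs. The sketch below shows that variant rather than the SVD route used above; it uses only the standard library, and <code>Pt2</code>, <code>Rigid2D</code>, <code>fitRigid2D</code> and <code>applyRigid2D</code> are hypothetical names chosen for illustration.</p>

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types, used only for this illustration.
struct Pt2 { double x, y; };
struct Rigid2D { double angle, tx, ty; };  // rotation in radians, then translation

// Least-squares rotation + translation that maps m[i] onto d[i].
Rigid2D fitRigid2D(const std::vector<Pt2>& m, const std::vector<Pt2>& d) {
    Rigid2D out = {0.0, 0.0, 0.0};
    const std::size_t n = m.size();
    if (n == 0 || d.size() != n) return out;

    // Centroids of both sets.
    double mx = 0, my = 0, dx = 0, dy = 0;
    for (std::size_t i = 0; i < n; ++i) {
        mx += m[i].x; my += m[i].y;
        dx += d[i].x; dy += d[i].y;
    }
    mx /= static_cast<double>(n); my /= static_cast<double>(n);
    dx /= static_cast<double>(n); dy /= static_cast<double>(n);

    // Summed dot and cross products of the centered pairs.
    double dot = 0, cross = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const double ax = m[i].x - mx, ay = m[i].y - my;
        const double bx = d[i].x - dx, by = d[i].y - dy;
        dot   += ax * bx + ay * by;
        cross += ax * by - ay * bx;
    }
    out.angle = std::atan2(cross, dot);

    // Translation that carries the rotated centroid of m onto the centroid of d.
    const double c = std::cos(out.angle), s = std::sin(out.angle);
    out.tx = dx - (c * mx - s * my);
    out.ty = dy - (s * mx + c * my);
    return out;
}

// Apply the fitted transform to a single point.
Pt2 applyRigid2D(const Rigid2D& t, const Pt2& p) {
    const double c = std::cos(t.angle), s = std::sin(t.angle);
    Pt2 q = { c * p.x - s * p.y + t.tx, s * p.x + c * p.y + t.ty };
    return q;
}
```

<p>Because the result is a plain angle, this form cannot return a reflection, so the determinant check needed in the SVD version does not arise here.</p>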
<h2>Putting it together</h2>
<p>This is the easy part, just using these two functions and a small while loop:</p>
<pre class="brush: plain;">
	while(true) {
		pair.clear(); dists.clear();
		double dist = flann_knn(destination, X, pair, dists);

		if(lastDist &lt;= dist) {
			X = lastGood;
			break;	//converged?
		}
		lastDist = dist;
		X.copyTo(lastGood);

		cout &lt;&lt; &quot;distance: &quot; &lt;&lt; dist &lt;&lt; endl;

		Mat X_bar(X.size(),X.type());
		for(int i=0;i&lt;X.rows;i++) {
			Point p = destination.at&lt;Point&gt;(pair[i],0);
			X_bar.at&lt;Point&gt;(i,0) = p;
		}

		ShowQuery(destination,X,X_bar);

		X = X.reshape(2);
		X_bar = X_bar.reshape(2);
		findBestReansformSVD(X,X_bar);
		X = X.reshape(1); // back to 1-channel
	}
</pre>
<p>This will iteratively search for the closest points and move the curve until the total distance stops decreasing, i.e. the curve doesn&#8217;t move anymore.</p>
<h2>Some proof</h2>
<p><object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/0eBpBxCaYpE&#038;hl=en&#038;fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/0eBpBxCaYpE&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<h2>Code</h2>
<p>Available in the SVN repo:</p>
<pre class="brush: plain;">
svn checkout https://morethantechnical.googlecode.com/svn/trunk/ICP ICP
</pre>
<p>Thanks for joining!<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2010/06/06/iterative-closest-point-icp-with-opencv-w-code/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Implementing PTAM: stereo, tracking and pose estimation for AR with OpenCV [w/ code]</title>
		<link>http://www.morethantechnical.com/2010/03/06/implementing-ptam-stereo-tracking-and-pose-estimation-for-ar-with-opencv-w-code/</link>
		<comments>http://www.morethantechnical.com/2010/03/06/implementing-ptam-stereo-tracking-and-pose-estimation-for-ar-with-opencv-w-code/#comments</comments>
		<pubDate>Sat, 06 Mar 2010 16:53:11 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[Recommended]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opencv]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[school]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[3d]]></category>
		<category><![CDATA[augmented reality]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=606</guid>
		<description><![CDATA[Hi Been working hard at a project for school the past month, implementing one of the more interesting works I&#8217;ve seen in the AR arena: Parallel Tracking and Mapping (PTAM) [PDF]. This is a work by Georg Klein [homepage] and David Murray from Oxford University, presented in ISMAR 2007. When I first saw it on [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2010/03/ptam.png" rel="lightbox[606]"><img class="alignleft size-full wp-image-617" title="ptam" src="http://www.morethantechnical.com/wp-content/uploads/2010/03/ptam.png" alt="" width="350" height="286" /></a>Hi</p>
<p>Been working hard at a project for school the past month, implementing one of the more interesting works I&#8217;ve seen in the AR arena: Parallel Tracking and Mapping (PTAM) [<a href="http://www.robots.ox.ac.uk/~gk/publications/KleinMurray2007ISMAR.pdf" target="_blank">PDF</a>]. This is a work by Georg Klein [<a href="http://www.robots.ox.ac.uk/~gk/" target="_blank">homepage</a>] and David Murray from Oxford University, presented in ISMAR 2007.</p>
<p>When I first saw it on YouTube [<a href="http://www.youtube.com/watch?v=pBI5HwitBX4" target="_blank">link</a>] I immediately saw the immense potential &#8211; mobile markerless augmented reality. I thought I should get to know this work a bit more closely, so I chose to implement it as part of the advanced computer vision course, given by Dr. Lior Wolf [<a href="http://www.cs.tau.ac.il/~wolf/" target="_blank">link</a>] at TAU.</p>
<p>The work is very extensive, and clearly is a result of deep research in the field, so I set out to achieve a few selected features: stereo initialization, tracking, and upkeep of a small map. I chose not to implement relocalization and full map handling.</p>
<p>This post is kind of a tutorial for 3D reconstruction with OpenCV 2.0. I will show practical use of the functions in cvtriangulation.cpp, which are not documented and in fact incomplete. Furthermore I&#8217;ll show how to easily combine OpenCV and OpenGL for 3D augmentations, a thing which is only briefly described in the docs or online.</p>
<p>Here are the steps I took and the things I learned in the process of implementing the work.</p>
<p>Update: A nice patch by yazor fixes the video mismatching &#8211; thanks! Also, a nice application by Zentium called &#8220;iKat&#8221; is doing some kick-ass <a href="http://gizmodo.com/5489946/ikat-augmented-reality-app-works-without-real+world-prompt">mobile markerless augmented reality</a>.<br />
<span id="more-606"></span></p>
<h2>Preparations&#8230;</h2>
<p>Before going straight to coding, I had to prepare a few things.</p>
<ul>
<li>A working compilation of OpenCV &#8211; not trivial with the new version 2.0.</li>
<li>A calibrated camera.</li>
<li>Test data</li>
</ul>
<p>Compiling OpenCV 2.0 proved to be a bit tricky. Even though the SourceForge project offers a binary release for Win32, I compiled the whole thing from source. It turned out the binary release doesn&#8217;t contain .lib files, and anyway has compatibility issues between MS VS 2005 and 2008 &#8211; something about the embedded manifest [<a href="http://www.google.com/search?q=opencv+2.0+VS+2008+manifest+erro" target="_blank">google</a>]. I downloaded the freshest source from SVN, and compiled it, but it didn&#8217;t solve the debug-release problem, so I was left with using the release DLLs even for the debug environment.</p>
<p>Initially I thought I&#8217;d try an uncalibrated camera approach, but soon abandoned it. I had to calibrate my cameras, which I did very easily using OpenCV&#8217;s &#8220;calibration.cpp&#8221;, which strangely is <strong>not built</strong> when building all examples &#8211; it has to be built manually. But everything went smoothly, and I soon got a calibration matrix (focal length, center of projection) and radial distortion coefficients.</p>
<h3>Getting Test Data</h3>
<p><object style="width: 480px; height: 295px;" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="295" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://www.youtube.com/v/WXsufPbEUmM&amp;hl=en_US&amp;fs=1&amp;" /><param name="align" value="left" /><embed style="width: 480px; height: 295px;" type="application/x-shockwave-flash" width="480" height="295" src="http://www.youtube.com/v/WXsufPbEUmM&amp;hl=en_US&amp;fs=1&amp;" align="left"></embed></object>For the test data I wanted to get a few views of a planar scene, where the first two views are separated only by a translation of ~5cm, as K&amp;M do in the PTAM article. This known translation is helpful when trying to triangulate the initial features in the scene. When you have prior knowledge of where the cameras are, you can simply intersect the rays back-projected from the two views and recover the 3D position of the points &#8211; up to scale. Keep in mind you must also have feature correspondence: a point in image A must be matched to a point in image B.</p>
<p>To achieve this I set up a small program that uses Optical Flow to track some 2D features in the scene, and grab a few screens + feature vectors. See &#8216;capture_data.cpp&#8217;.</p>
<h2>Stereo Initialization</h2>
<p>Now that I have 2 views with feature correspondence:</p>
<p><a rel="lightbox" href="http://www.morethantechnical.com/wp-content/uploads/2010/03/frames_correl.png"><img class="alignnone size-full wp-image-607" title="frames_correl" src="http://www.morethantechnical.com/wp-content/uploads/2010/03/frames_correl.png" alt="" width="634" height="259" /></a></p>
<p>I would like to triangulate the features. This is possible, as I discussed earlier, since I know the rotation (none), translation (5cm on -x axis) and camera calibration parameters (focal length, center of projection).</p>
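<p>As a rough illustration of why the known baseline helps: with no rotation and a translation purely along x, depth follows directly from the horizontal disparity between the two views. The sketch below assumes normalized (undistorted, calibration removed) image coordinates and the camera model P1 = [I | 0], P2 = [I | t] with t = (-baseline, 0, 0); this sign convention, <code>Point3</code> and <code>triangulateTranslationX</code> are assumptions made for the example and are not the conventions or functions of the actual project code.</p>

```cpp
#include <cmath>

// Hypothetical 3D point type, used only for this illustration.
struct Point3 { double x, y, z; };

// Depth from disparity for two views related by a pure x translation.
// (x1, y1) is the point in the first view, x2 its x coordinate in the second
// view, both in normalized image coordinates. Under the assumed model
//   x1 = X / Z   and   x2 = (X - baseline) / Z,
// so x1 - x2 = baseline / Z.
// Returns false when the disparity is too small to give a stable depth.
bool triangulateTranslationX(double x1, double y1, double x2,
                             double baseline, Point3& out) {
    const double disparity = x1 - x2;
    if (std::fabs(disparity) < 1e-9) return false;
    const double z = baseline / disparity;
    out.x = x1 * z;
    out.y = y1 * z;
    out.z = z;
    return true;
}
```

<p>The result comes out in the units of the baseline, so a 5cm baseline gives depths in centimeters. A general triangulation routine is still needed once the second camera also rotates.</p>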
<h3>Triangulation</h3>
<p>For triangulation, OpenCV has only recently added a couple of functions that implement triangulation [<a href="http://n2.nabble.com/An-implementation-of-the-Optimal-Triangulation-Method-td2295331.html" target="_blank">link</a>] as shown by Hartley &amp; Zisserman [<a href="http://users.cecs.anu.edu.au/~hartley/Papers/CVPR99-tutorial/tut_4up.pdf" target="_blank">PDF</a>, page 12]. However, these functions are not formally documented, and in fact they are missing some important parts. This is how I used cvTriangulatePoints(), which is the key function:</p>
<pre class="brush: plain;">
//this function will initialize the 3D features from two views
void stereoInit() {

//first load camera intrinsic parameters
FileStorage fs(&quot;cam_work.out&quot;,CV_STORAGE_READ);
FileNode fn = fs[&quot;camera_matrix&quot;];
camera_matrix = Mat((CvMat*)fn.readObj(),true);

fn = fs[&quot;distortion_coefficients&quot;];
distortion_coefficients = Mat((CvMat*)fn.readObj(),true);

//vector&lt;Point2d&gt; points[2]; //these Point2d vectors hold the 2D features, double precision, from the 2 views

//get copy of points
_points[0] = points[0];
_points[1] = points[1];
Mat pts1M(_points[0]), pts2M(_points[1]); //very easy in OpenCV 2.0 to convert vector&lt;&gt; to Mat.

//Undistort points
Mat tmp,tmpOut;
pts1M.convertTo(tmp,CV_32FC2);  //undistort takes only floats not doubles, so convert to Point2f
undistortPoints(tmp,tmpOut,camera_matrix,distortion_coefficients);
tmpOut.convertTo(pts1M,CV_64FC2);  //go back to double precision

pts2M.convertTo(tmp,CV_32FC2);
undistortPoints(tmp,tmpOut,camera_matrix,distortion_coefficients);
tmpOut.convertTo(pts2M,CV_64FC2);

vector&lt;uchar&gt; tri_status; //this will hold the status for each point, a good point will have 1, bad - 0

//now triangulate
triangulate(_points[0],_points[1],tri_status);

}

void triangulate(vector&lt;Point2d&gt;&amp; points1, vector&lt;Point2d&gt;&amp; points2, vector&lt;uchar&gt;&amp; status) {

	//Convert points to 1-channel, 2-rows, double precision - This is important - see the code
...

	Mat ___tmp(2,pts1Mt.cols,CV_64FC1,__d);
...
	Mat ___tmp1(2,pts2Mt.cols,CV_64FC1,__d1);
...

	CvMat __points1 = ___tmp, __points2 = ___tmp1;

	//projection matrices
	double P1d[12] = {	-1,0,0,0,
						0,1,0,0,
						0,0,1,0 };	//Identity, but looking into -z axis
	Mat P1m(3,4,CV_64FC1,P1d);
	CvMat* P1 = &amp;(CvMat)P1m;
	double P2d[12] = {	-1,0,0,-5,
						0,1,0,0,
						0,0,1,0 };  //Identity rotation, 5cm -x translation, looking into -z axis
	Mat P2m(3,4,CV_64FC1,P2d);
	CvMat* P2 = &amp;(CvMat)P2m;

	float _d[1000] = {0.0f};
	Mat outTM(4,points1.size(),CV_32FC1,_d);
	CvMat* out = &amp;(CvMat)outTM;

//using cvTriangulatePoints with the created structures
	cvTriangulatePoints(P1,P2,&amp;__points1,&amp;__points2,out);

//we should check the triangulation result by reprojecting 3D-&gt;2D and checking distance
	vector&lt;Point2d&gt; projPoints[2] = {points1,points2};

	double point2D_dat[3] = {0};
	double point3D_dat[4] = {0};
	Mat twoD(3,1,CV_64FC1,point2D_dat);
	Mat threeD(4,1,CV_64FC1,point3D_dat);

	Mat P[2] = {Mat(P1),Mat(P2)};

	int oc = out-&gt;cols, oc2 = out-&gt;cols*2, oc3 = out-&gt;cols*3;

	status = vector&lt;uchar&gt;(oc);

	//scan all points, reproject 3D-&gt;2D, and keep only good ones
	for(int i=0;i&lt;oc;i++) {
		double W = out-&gt;data.fl[i+oc3];
        point3D_dat[0] = out-&gt;data.fl[i] / W;
        point3D_dat[1] = out-&gt;data.fl[i+oc] / W;
        point3D_dat[2] = out-&gt;data.fl[i+oc2] / W;
        point3D_dat[3] = 1;

        bool push = true;
        /* !!! Project this point for each camera */
        for( int currCamera = 0; currCamera &lt; 2; currCamera++ )
        {
            //reproject! using the P matrix of the current camera
			twoD = P[currCamera] * threeD;

            float x,y;
            float xr,yr,wr;
            x = (float)projPoints[currCamera][i].x;
            y = (float)projPoints[currCamera][i].y;

            wr = (float)point2D_dat[2];
            xr = (float)(point2D_dat[0]/wr);
            yr = (float)(point2D_dat[1]/wr);

            float deltaX,deltaY;
            deltaX = (float)fabs(x-xr);
            deltaY = (float)fabs(y-yr);

			//printf(&quot;error from cam %d (%.2f,%.2f): %.6f %.6f\n&quot;,currCamera,x,y,deltaX,deltaY);

			if(deltaX &gt; 0.01 || deltaY &gt; 0.01) {
				push = false;
			}
        }
		if(push) {
			// A good 3D reconstructed point, add to known world points

			double s = 7;
			Point3d p3d(point3D_dat[0]/s,point3D_dat[1]/s,point3D_dat[2]/s);
			//printf(&quot;%.3f %.3f %.3f\n&quot;,p3d.x,p3d.y,p3d.z);
			points1Proj.push_back(p3d);
			status[i] = 1;
		} else {
			status[i] = 0;
		}

	}
}
</pre>
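<p>The reprojection test inside the loop above can be isolated into a small helper, which makes the acceptance rule easier to see: project the homogeneous 3D point with a camera matrix, divide by the third coordinate, and compare with the observed 2D point. This is a standard-library-only sketch; <code>reprojectsWithin</code> is a hypothetical name, and the row-major 3x4 layout of <code>P</code> is an assumption of the example.</p>

```cpp
#include <cmath>

// Project the homogeneous 3D point X (4 values) with the row-major 3x4
// projection matrix P, then report whether the projection lies within `tol`
// of the observed image point (x, y) on both axes.
bool reprojectsWithin(const double P[12], const double X[4],
                      double x, double y, double tol) {
    double p[3];
    for (int r = 0; r < 3; ++r) {
        p[r] = 0.0;
        for (int c = 0; c < 4; ++c) {
            p[r] += P[4 * r + c] * X[c];
        }
    }
    // A zero third coordinate means the point projects to infinity.
    if (std::fabs(p[2]) < 1e-12) return false;

    const double xr = p[0] / p[2];
    const double yr = p[1] / p[2];
    return std::fabs(x - xr) <= tol && std::fabs(y - yr) <= tol;
}
```

<p>A point would be kept only if this holds for both cameras, which mirrors the <code>push</code> flag in the listing.</p>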
<p>OK, now that I have (hopefully) triangulated 3D features from the initial state: 2 views of a planar scene with 5cm translation on the X axis &#8211; I can move on to the pose estimation.</p>
<h2>Pose Estimation</h2>
<p>Theoretically, if I know the 3D position of features in the world and their respective 2D position in the image, it should be easy to recover the position of the camera, because a rotation matrix and a translation vector define this transformation. Practically in OpenCV, finding the position of an object using 3D-2D correspondence is done by using the solvePnP() [<a href="http://opencv.willowgarage.com/documentation/cpp/camera_calibration_and_3d_reconstruction.html#solvepnp" target="_blank">link</a>] function.</p>
<p>Since I have an initial guess of the rotation and translation &#8211; from the first 2 frames &#8211; I can &#8220;help&#8221; the function estimate the new ones.</p>
<pre class="brush: plain;">
void findExtrinsics(vector&lt;Point2d&gt;&amp; points, vector&lt;double&gt;&amp; rv, vector&lt;double&gt;&amp; tv) {
	//estimate extrinsics for these points

	Mat rvec(rv),tvec(tv);

//initial &quot;guess&quot;, in case it wasn't already supplied
	if(rv.size()!=3) {
		rv = vector&lt;double&gt;(3);
		rvec = Mat(rv);
		double _d[9] = {1,0,0,
						0,-1,0,
						0,0,-1};
		Rodrigues(Mat(3,3,CV_64FC1,_d),rvec);
	}
	if(tv.size()!=3) {
		tv = vector&lt;double&gt;(3);
		tv[0]=0;tv[1]=0;tv[2]=0;
		tvec = Mat(tv);
	}

	//create a float rep  of points
	vector&lt;Point2f&gt; v2(points.size());
	Mat tmpOut(v2);
	Mat _tmpOut(points);
	_tmpOut.convertTo(tmpOut,CV_32FC2);

	solvePnP(points1projMF,tmpOut,camera_matrix,distortion_coefficients,rvec,tvec,true);

	printf(&quot;frame extrinsic:\nrvec: %.3f %.3f %.3f\ntvec: %.3f %.3f %.3f\n&quot;,rv[0],rv[1],rv[2],tv[0],tv[1],tv[2]);

//the output of the function is a Rodrigues form of rotation, so convert to regular rot-matrix
	Mat rotM(3,3,CV_64FC1); ///,_r);
	Rodrigues(rvec,rotM);
	double* _r = rotM.ptr&lt;double&gt;();
	printf(&quot;rotation mat: \n %.3f %.3f %.3f\n%.3f %.3f %.3f\n%.3f %.3f %.3f\n&quot;,
		_r[0],_r[1],_r[2],_r[3],_r[4],_r[5],_r[6],_r[7],_r[8]);
}
</pre>
<p>After getting the extrinsic parameters of the camera, the next step is plugging in the visualization!</p>
<h2>Integrating OpenGL</h2>
<p>Generally, it should be possible to create a 3D scene that matches exactly the true world scene, where the triangulated features appear in the scene aligned exactly with the world. I was not able to achieve that, but I got pretty close:<br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/Q1HVjAWls_E&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="385" src="http://www.youtube.com/v/Q1HVjAWls_E&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>It&#8217;s basically what you do in augmented reality: you align the virtual camera&#8217;s position and rotation with the results you get from the vision part of the system. In the pose estimation we ended up with a 3D rotation vector (Rodrigues form) and a 3D translation vector. The translation is used as-is, so only the rotation vector has to be converted to a 3&#215;3 matrix using the Rodrigues() function.</p>
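<p>For reference, the conversion that Rodrigues() performs in this direction can be written out by hand. The rotation vector points along the rotation axis and its length is the angle in radians; the matrix follows from the Rodrigues rotation formula R = cos(a) I + sin(a) K + (1 - cos(a)) k k^T. This is a standard-library sketch with a hypothetical name, <code>rodriguesToMatrix</code>, and a row-major output layout chosen for the example; it is not OpenCV&#8217;s implementation.</p>

```cpp
#include <cmath>

// Convert an axis-angle rotation vector r (direction = axis, length = angle
// in radians) into a row-major 3x3 rotation matrix R.
void rodriguesToMatrix(const double r[3], double R[9]) {
    const double theta = std::sqrt(r[0] * r[0] + r[1] * r[1] + r[2] * r[2]);
    if (theta < 1e-12) {
        // No rotation: return the identity.
        R[0] = 1; R[1] = 0; R[2] = 0;
        R[3] = 0; R[4] = 1; R[5] = 0;
        R[6] = 0; R[7] = 0; R[8] = 1;
        return;
    }
    const double kx = r[0] / theta, ky = r[1] / theta, kz = r[2] / theta;
    const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;

    R[0] = c + kx * kx * v;       R[1] = kx * ky * v - kz * s;  R[2] = kx * kz * v + ky * s;
    R[3] = ky * kx * v + kz * s;  R[4] = c + ky * ky * v;       R[5] = ky * kz * v - kx * s;
    R[6] = kz * kx * v - ky * s;  R[7] = kz * ky * v + kx * s;  R[8] = c + kz * kz * v;
}
```

<p>A quarter turn about the z axis, r = (0, 0, pi/2), gives the familiar matrix that sends the x axis onto the y axis.</p>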
<p>This is the OpenGL glut display() function that draws the scene:</p>
<pre class="brush: plain;">
void display(void)
{
	glClearColor(1.0f, 1.0f, 1.0f, 0.5f);
	glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);	// Clear Screen And Depth Buffer

	//draw the background - the frame from the camera
	glMatrixMode(GL_PROJECTION);
	glPushMatrix();
	gluOrtho2D(0.0,352.0,288.0,0.0);
	glMatrixMode(GL_MODELVIEW);
	glPushMatrix();
	glDisable(GL_DEPTH_TEST);
	glDrawPixels(352,288,GL_RGB,GL_UNSIGNED_BYTE,backPxls.data);
	glEnable(GL_DEPTH_TEST);
	glPopMatrix();
	glMatrixMode(GL_PROJECTION);
	glPopMatrix();

    const double t = glutGet(GLUT_ELAPSED_TIME) / 1000.0;
	a = t*20.0;

	glMatrixMode(GL_MODELVIEW);
	glLoadIdentity();

//use the camera position 3D vector
	curCam[0] = cam[0]; curCam[1] = cam[1]; curCam[2] = cam[2];
//there seems to be some kind of offset...
	glTranslated(-curCam[0]+0.5,-curCam[1]+0.7,-curCam[2]);

//and the 3x3 rotation matrix
	double _d[16] = {	rot[0],rot[1],rot[2],0,
						rot[3],rot[4],rot[5],0,
						rot[6],rot[7],rot[8],0,
						0,	   0,	  0		,1};
	glMultMatrixd(_d);

//flip the rotation on the x-axis
	glRotated(180,1,0,0);

	//draw the 3D feature points
	glPushMatrix();
	glColor4d(1.0,0.0,0.0,1.0);
	for(unsigned int i=0;i&lt;points1Proj.size();i++) {
		glPushMatrix();
		glTranslated(points1Proj[i].x,points1Proj[i].y,points1Proj[i].z);
		glutSolidSphere(0.03,15,15);
		glPopMatrix();
	}
	glPopMatrix();

	glutSwapBuffers();

	if(!running) {
		glutLeaveMainLoop();
	}

	Sleep(25);
}
</pre>
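<p>One detail worth keeping in mind when handing a pose to OpenGL: glMultMatrixd and glLoadMatrixd read their 16 values in column-major order, while OpenCV matrices are stored row-major. The sketch below packs a row-major 3x3 rotation and a translation into that column-major layout. It uses no OpenGL calls, and <code>packColumnMajor</code> is a hypothetical helper, not part of the project code; if a rendered rotation looks inverted, the storage order is one thing to check.</p>

```cpp
// Pack a row-major 3x3 rotation R and a translation t into the 16-value
// column-major 4x4 layout that glMultMatrixd / glLoadMatrixd expect.
void packColumnMajor(const double R[9], const double t[3], double out[16]) {
    for (int col = 0; col < 3; ++col) {
        for (int row = 0; row < 3; ++row) {
            out[col * 4 + row] = R[row * 3 + col];
        }
        out[col * 4 + 3] = 0.0;  // bottom row of the 4x4 matrix
    }
    // The last column holds the translation and the homogeneous 1.
    out[12] = t[0];
    out[13] = t[1];
    out[14] = t[2];
    out[15] = 1.0;
}
```

<p>Filling the array in row-major order instead hands OpenGL the transpose, which for a pure rotation is the inverse rotation.</p>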
<p>This pretty much covers my work, in a very concise way. The complete source code will reveal all I have done, and will provide a better copy-and-paste ground for your own projects.</p>
<h2>Things not covered in this work</h2>
<p>Initially I tried to implement a very crucial part of the PTAM work &#8211; pairing the 3D map with 2D features in the image. This allows them to re-align the map in every frame (when the tracking is bad) so the pose estimation does not &#8220;lose grip&#8221;. In essence, they keep a visual identity for each map feature, very similar to a descriptor like SURF or SIFT, so at any point they can find where the features are in the new image and recover the camera pose from the 2D-3D correspondence. I ran into a problem utilizing OpenCV&#8217;s SURF functionality: it seems to have a bug when trying to compute the descriptor for user-given feature points.</p>
<p>Another thing I chose not to implement is creating a full map of the surroundings. I wanted to achieve a simple working solution for a small map (essentially a single frame), and see how it works. In the original work by K&amp;M they constantly add more and more features to the map until it has covered the whole surrounding room.</p>
<h2>Code and Working the Program</h2>
<p>As usual my code is available for checkout from the blog&#8217;s SVN repo:</p>
<pre class="brush: plain;">
svn checkout http://morethantechnical.googlecode.com/svn/trunk/ptam ptam
</pre>
<p>To get the stereo initialization you must press [spacebar] twice: Once when the camera has stabilized and the features are stable, and another time when the camera has translated and again stabilized.<br />
This marks the 2 keyframes that will be used for stereo init and triangulation.<br />
From that point on, the 3D scene will start and the track-and-estimate stage begins. Try not to move the camera violently as the optical flow may suffer.</p>
<p>Thanks, Lior, for your help getting the hang of these subjects, and for the opportunity to meddle with a subject I have long wanted to explore.</p>
<p>I hope everyone will enjoy and learn from my enjoyment and learning.</p>
<p>Bye!</p>
<p>Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2010/03/06/implementing-ptam-stereo-tracking-and-pose-estimation-for-ar-with-opencv-w-code/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>iPhone OS 3.x Raw data of camera frames</title>
		<link>http://www.morethantechnical.com/2010/02/27/iphone-os-3-x-raw-data-of-camera-frames/</link>
		<comments>http://www.morethantechnical.com/2010/02/27/iphone-os-3-x-raw-data-of-camera-frames/#comments</comments>
		<pubDate>Sat, 27 Feb 2010 20:44:32 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[Mobile phones]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[frame grabbing]]></category>
		<category><![CDATA[iphone]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=599</guid>
		<description><![CDATA[Hi All It looks like it&#8217;s finally here &#8211; a way to grab the raw data of the camera frames on the iPhone OS 3.x. Update: Apple officially supports this in iOS 4.x using AVFoundation, here&#8217;s sample code from Apple developer. A gifted hacker named John DeWeese was nice enough to comment on a post [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2010/02/iphone-os-3-STEWIE.png" rel="lightbox[599]"><img src="http://www.morethantechnical.com/wp-content/uploads/2010/02/iphone-os-3-STEWIE-300x212.png" alt="" title="&quot;where is my data?!&quot;" width="300" height="212" class="alignleft size-medium wp-image-601" /></a>Hi All</p>
<p>It looks like it&#8217;s finally here &#8211; a way to grab the raw data of the camera frames on the iPhone OS 3.x. </p>
<p><strong>Update</strong>: Apple officially supports this in iOS 4.x using AVFoundation, <a href="http://developer.apple.com/iphone/library/qa/qa2010/qa1702.html#TNTAG1">here&#8217;s</a> sample code from Apple developer.</p>
<p>A gifted hacker named <a href="http://deweeeese.blogspot.com/">John DeWeese</a> was nice enough to comment on <a href="http://www.morethantechnical.com/2009/05/06/iphone-camera-frame-grabbing-and-a-real-time-meanshift-tracker/">a post from May &#8217;09</a> with <a href="http://deweeeese.blogspot.com/2010/02/processing-iphone-camera-video-on.html">his method of hacking the APIs to get the frames</a>. Though cumbersome, it looks like it should work, but I haven&#8217;t tried it yet. I promise to try it soon and share my results.</p>
<p>Way to go John!<br />
Some code would be awesome&#8230;</p>
<p>Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2010/02/27/iphone-os-3-x-raw-data-of-camera-frames/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SmartHome &#8211; Embedded computing course project</title>
		<link>http://www.morethantechnical.com/2010/02/21/smarthome-embedded-computing-course-project/</link>
		<comments>http://www.morethantechnical.com/2010/02/21/smarthome-embedded-computing-course-project/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 16:42:09 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[school]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[arm]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[swt]]></category>
		<category><![CDATA[zigbee]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=580</guid>
		<description><![CDATA[Hi In the past few weeks I have been working hard at a few projects for end-of-term at Uni. One of the projects is what I called &#8220;SmartHome&#8221;, for Embedded computing [link] course, is a home monitoring [link] application. In the course the students were given an LPC2148 arm7-MCU (NXP) based education board, implemented by [...]]]></description>
			<content:encoded><![CDATA[<p>Hi<br />
In the past few weeks I have been working hard on a few end-of-term projects at Uni. One of them, which I called &#8220;SmartHome&#8221;, is a home monitoring [<a href="http://en.wikipedia.org/wiki/Home_automation" target="_blank">link</a>] application built for the Embedded Computing course [<a href="http://www.tau.ac.il/~stoledo/Courses/Embedded/announcement.html" target="_blank">link</a>]. In the course, the students were given an education board based on the LPC2148 ARM7 MCU (NXP), made by Embedded Artists [<a href="http://www.embeddedartists.com/products/education/edu_2148.php" target="_blank">link</a>]. My partner Gil and I decided to work with ZigBee extension modules [<a href="http://en.wikipedia.org/wiki/ZigBee" target="_blank">link</a>] to enable remote communication.</p>
<p>Here are the steps we took to bring this project to life.<br />
<span id="more-580"></span></p>
<h2>The Idea</h2>
<p>Our vision is a home monitoring and control system that tracks different sensors around the house and also controls switches. The system should be centralized around a master controller, and wireless so it does not need cumbersome wiring throughout the house. We would also like the system to be easily controlled from a PC, so that visual information can be displayed to the user and the electronic switches can be operated manually.</p>
<p>Such a system will be able to automatically:</p>
<ul>
<li>Turn off the garden lighting when the light outside is bright enough,</li>
<li>Control the lawn watering system,</li>
<li>Control air conditioning in the house according to the temperature,</li>
<li>etc.</li>
</ul>
<h2>The Hardware</h2>
<p style="text-align: left;">As I mentioned, we were given an LPC2148 education board [<a href="http://www.embeddedartists.com/products/education/edu_2148.php" target="_blank">link</a>], implemented by Embedded Artists, that boasts a 12MHz ARM CPU and many peripheral subsystems. Among them are an LCD screen, numerous LEDs, an LED matrix, a fan, analog dials, USB and UART I/O and more. The board also has a port for connecting a ZigBee module to enable RF communication, but it doesn&#8217;t include the actual module. Since we needed remote communication between our stations, we bought 3 XBee modules from MaxStream [<a href="http://www.digi.com/products/wireless/point-multipoint/xbee-series1-module.jsp#overview" target="_blank">link</a>].</p>
<p style="text-align: center;"><a href="http://www.morethantechnical.com/wp-content/uploads/2010/02/Image096.jpg" rel="lightbox[580]"><img class="size-medium wp-image-586 alignnone" title="XBee module by MaxStream" src="http://www.morethantechnical.com/wp-content/uploads/2010/02/Image096-300x225.jpg" alt="" width="210" height="158" /></a><a href="http://www.morethantechnical.com/wp-content/uploads/2010/02/Image097.jpg" rel="lightbox[580]"> <img class="size-medium wp-image-588 alignnone" title="LPC2148 board by Embedded Artists" src="http://www.morethantechnical.com/wp-content/uploads/2010/02/Image097-300x225.jpg" alt="" width="210" height="158" /></a><a href="http://www.morethantechnical.com/wp-content/uploads/2010/02/Image098.jpg" rel="lightbox[580]"> <img class="alignnone size-medium wp-image-589" title="XBee module on LPC2148 board" src="http://www.morethantechnical.com/wp-content/uploads/2010/02/Image098-300x225.jpg" alt="" width="210" height="158" /></a></p>
<h2>The Software</h2>
<p>The LPC boards can be programmed very easily using tools provided by Embedded Artists, such as GCC with Newlib for compiling (on both Windows and Linux), and <a href="http://www.flashmagictool.com/" target="_blank">Flashmagic</a> (for Windows) or lpc21isp for loading the compiled program. We used these tools for programming the &#8220;embedded&#8221; part of the application. For the PC client we used Java with <a href="http://www.eclipse.org/swt/" target="_blank">SWT</a> and serial port connectivity (<a href="http://rxtx.qbang.org/wiki/index.php/Download" target="_blank">RxTx</a> for Windows and Linux).</p>
<h3>Embedded program</h3>
<p>The embedded program on the LPC board has two parts: a master and a client. The system has only one master, and up to 3 clients. The master gathers information from the clients, and controls their behavior according to the user&#8217;s requests. For communication between the master and clients we created a communication protocol that has only a few simple messages:</p>
<ul>
<li>INIT &#8211; The master sends this request to initialize the connection between itself and a client.</li>
<li>ACK &#8211; This is the response for every request.</li>
<li>POLL &#8211; The client answers this request with the data from all its sensors.</li>
<li>OPERATE &#8211; The master commands the client to switch something on or off.</li>
<li>ERROR &#8211; A general error response.</li>
</ul>
<p>All commands / responses have a unified structure described here:</p>
<pre class="brush: plain;">
General struct:
 ___________________________________________ _____
|    OP   | To | From | &lt;----- DATA ------&gt; | EOM
|_________|____|______|_____________________|_____

Data:

POLL (response)
_ ____________ ______ ______ ________
 | Temperature| ADC0 | ADC1 | Button |
_|____________|______|______|________|

OPERATE
_ _________
 | BITMASK |
_|_________|
</pre>
<p>This way we had a very easy implementation of the communication module, since we always had to look for only 14 bytes on the wire.</p>
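<p>To make the fixed-length idea concrete, here is a minimal sketch of packing and checking such a frame in C. Only the 14-byte total and the field order come from the description above. The individual field widths, the opcode values and the end-of-message byte are assumptions made for illustration, and the function names are hypothetical (they are not taken from the project code).</p>
<pre class="brush: plain;">
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

#define FRAME_LEN 14   /* total bytes on the wire, as stated in the post */
#define DATA_LEN  10   /* ASSUMPTION: 14 minus one byte each for OP, To, From, EOM */
#define EOM_BYTE  0x0A /* ASSUMPTION: the real end-of-message value is not given */

/* ASSUMPTION: numeric opcode values are made up for this sketch */
enum opcode { OP_INIT = 1, OP_ACK, OP_POLL, OP_OPERATE, OP_ERROR };

/* Pack one message into a fixed-size buffer: OP | To | From | DATA | EOM.
   Unused data bytes are zeroed so every frame has the same length. */
void frame_pack(uint8_t out[FRAME_LEN], uint8_t op, uint8_t to, uint8_t from,
                const uint8_t *data, size_t data_len)
{
    memset(out, 0, FRAME_LEN);
    out[0] = op;
    out[1] = to;
    out[2] = from;
    if (data_len &gt; DATA_LEN)
        data_len = DATA_LEN;              /* never overrun the data field */
    if (data != NULL)
        memcpy(&amp;out[3], data, data_len);
    out[FRAME_LEN - 1] = EOM_BYTE;
}

/* Returns 1 when the buffer ends with the EOM byte and is addressed to `me`. */
int frame_is_for(const uint8_t in[FRAME_LEN], uint8_t me)
{
    return in[FRAME_LEN - 1] == EOM_BYTE &amp;&amp; in[1] == me;
}
</pre>
<p>With a layout like this, the receiver can simply read 14 bytes, check the last one, and dispatch on the first, which is probably why the fixed size made the communication module so easy to write.</p>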
<h3>Communication with ZigBee module</h3>
<p>To jumpstart our implementation we used the examples provided by Embedded Artists for operating the ZigBee module [<a href="http://www.embeddedartists.com/support/LPC2148_EDU/XBee_example.zip" target="_blank">link</a>]. The communication with the ZigBee module is on the UART1 port of the MCU. The example code takes care of opening the correct GPIO pins, setting the IRQ masks to enable interrupts and provides a very simple API for transmitting and receiving characters with the XBee over UART (described in uart.h and uart.c files).</p>
<p>We took the XBee example code and expanded it to be able to receive and send data between two stations. That means setting up each station&#8217;s module with its PAN ID, channel, own address and target address using AT commands [<a href="http://ftp1.digi.com/support/documentation/90000982_B.pdf" target="_blank">spec</a>, see page 28]: ATID, ATCH, ATMY and ATDL respectively. The module must be in command mode (rather than transmit mode) to accept these settings: sending &#8216;+++&#8217; enters command mode and ATCN exits it. Once we had a decent framework to communicate between the stations, we started to build the logic.</p>
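<p>As a rough illustration of that setup step, the sketch below builds the string of AT commands that would be sent once &#8216;+++&#8217; has put the module into command mode. The function name and buffer handling are hypothetical and not taken from the Embedded Artists example. It assumes parameters are written in hexadecimal and that each command ends with a carriage return, which is how I understand the XBee AT syntax.</p>
<pre class="brush: plain;">
#include &lt;stdio.h&gt;

/* Build the configuration commands for one station into buf.
   Returns the number of characters written, or -1 if buf is too small.
   NOTE: hypothetical helper for illustration, not part of the project code. */
int xbee_build_config(char *buf, size_t size, unsigned pan_id, unsigned channel,
                      unsigned my_addr, unsigned dest_addr)
{
    int n = snprintf(buf, size,
                     &quot;ATID%X\r&quot;   /* PAN ID shared by all stations */
                     &quot;ATCH%X\r&quot;   /* radio channel */
                     &quot;ATMY%X\r&quot;   /* this station&#39;s own address */
                     &quot;ATDL%X\r&quot;   /* address of the station we talk to */
                     &quot;ATCN\r&quot;,    /* leave command mode */
                     pan_id, channel, my_addr, dest_addr);
    return (n &lt; 0 || (size_t)n &gt;= size) ? -1 : n;
}
</pre>
<p>A real driver would normally send the commands one at a time and wait for the module&#8217;s reply after each, and it has to keep the line quiet for a guard time around &#8216;+++&#8217;. Both are left out here to keep the sketch short.</p>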
<p>The master station logic is like so:</p>
<ol>
<li>Initialize XBee module.</li>
<li>INIT all client stations, and see which stations answer &#8211; these will be our &#8220;up&#8221; stations.</li>
<li>Loop:
<ol>
<li>POLL each &#8220;up&#8221; station in a loop.</li>
<li>Take care of any OPERATE requests from the PC client.</li>
</ol>
</li>
</ol>
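<p>The steps above can be sketched in C roughly like this. The radio is hidden behind a function pointer so the logic can be tried on a PC. All names and opcode values here are hypothetical, and dropping a station that stops answering a POLL is my own assumption about a sensible behavior, not necessarily what the project code does.</p>
<pre class="brush: plain;">
#include &lt;stdint.h&gt;

#define MAX_CLIENTS 3

/* ASSUMPTION: opcode values are made up for this sketch */
enum { REQ_INIT = 1, REQ_POLL = 3 };

/* Hypothetical transport hook: sends `op` to `station` and
   returns 1 if a response arrived, 0 otherwise. */
typedef int (*request_fn)(uint8_t station, uint8_t op);

/* Step 2: INIT every client and return a bitmask of the &quot;up&quot; stations. */
uint8_t master_discover(request_fn request)
{
    uint8_t up = 0;
    for (uint8_t i = 0; i &lt; MAX_CLIENTS; i++)
        if (request(i, REQ_INIT))
            up |= (uint8_t)(1u &lt;&lt; i);
    return up;
}

/* Step 3.1: POLL each &quot;up&quot; station once; silent stations are dropped. */
uint8_t master_poll_once(request_fn request, uint8_t up)
{
    for (uint8_t i = 0; i &lt; MAX_CLIENTS; i++)
        if ((up &amp; (1u &lt;&lt; i)) &amp;&amp; !request(i, REQ_POLL))
            up &amp;= (uint8_t)~(1u &lt;&lt; i);
    return up;
}

/* Stand-in for the radio, for testing only: station 1 never answers,
   station 2 answers INIT but not POLL. */
int fake_request(uint8_t station, uint8_t op)
{
    if (station == 1)
        return 0;
    if (station == 2 &amp;&amp; op == REQ_POLL)
        return 0;
    return 1;
}
</pre>
<p>On the board, the main loop would call master_poll_once repeatedly and handle any pending OPERATE request from the PC between passes.</p>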
<p>This way, the client&#8217;s logic boils down to just looping, testing for any request and taking care of it. Both client and master share most of the code; only the main process code is essentially different. The example code uses a framework called &#8220;Preemptive OS&#8221; to allow multitasking / processes (code was bundled in the examples).</p>
<h3>Communication with PC</h3>
<p>Communication between the master and PC client also required some lightweight &#8220;protocol&#8221;. The communication is again over UART (UART0 this time, since UART1 is used by the XBee), only now the PC talks to the board via UART-over-USB. The protocol we ended up with supports these features:</p>
<ul>
<li>Commands the PC wants the master to perform look like &#8220;m=&lt;command&gt;=&lt;parameters&gt;&#8221;, such commands can be:
<ul>
<li>&#8220;poll=&lt;i&gt;&#8221;, send a POLL to the station indexed i</li>
<li>&#8220;test=&lt;i&gt;&#8221;, send an INIT to the station indexed i</li>
<li>&#8220;toggle=&lt;i&gt;_&lt;sw&gt;_&lt;onoff&gt;&#8221;, set the switch <em>sw</em> on the station indexed <em>i</em> to <em>onoff</em> status</li>
</ul>
</li>
<li>Notifications the master would like to share with the user (PC) look like &#8220;pc=&lt;notification&gt;&#8221;, such notifications can be:
<ul>
<li> &#8220;up=&lt;i&gt;&#8221;, the station indexed <em>i</em> is up</li>
<li>&#8220;master_online&#8221;,</li>
<li>&#8220;temp=&lt;i&gt; &lt;temp&gt;&#8221;, the station indexed <em>i</em> has a temperature reading of <em>temp</em></li>
<li>&#8220;e=&lt;error message&gt;&#8221;, error message from the master</li>
</ul>
</li>
</ul>
<p>This is an incomplete set of features, but it&#8217;s representative of the idea we want to present.</p>
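<p>To show how simple the parsing of these text commands can be on the master side, here is a minimal sketch using sscanf. It assumes the full command lines look like &#8220;m=poll=&lt;i&gt;&#8221;, i.e. the &#8220;m=&#8221; prefix followed by the sub-command, and that all parameters are decimal integers. The type and function names are hypothetical.</p>
<pre class="brush: plain;">
#include &lt;stdio.h&gt;

enum pc_cmd { CMD_NONE = 0, CMD_POLL, CMD_TEST, CMD_TOGGLE };

struct pc_request {
    enum pc_cmd cmd;
    int station;   /* station index i */
    int sw;        /* switch number (toggle only) */
    int on;        /* requested on/off status (toggle only) */
};

/* Parse one command line from the PC. Returns 1 if it was recognized. */
int pc_parse(const char *line, struct pc_request *out)
{
    out-&gt;cmd = CMD_NONE;
    out-&gt;station = out-&gt;sw = out-&gt;on = 0;

    /* sscanf returns how many fields were converted; literal text in the
       format must match exactly, so each pattern only accepts its own command */
    if (sscanf(line, &quot;m=toggle=%d_%d_%d&quot;, &amp;out-&gt;station, &amp;out-&gt;sw, &amp;out-&gt;on) == 3)
        out-&gt;cmd = CMD_TOGGLE;
    else if (sscanf(line, &quot;m=poll=%d&quot;, &amp;out-&gt;station) == 1)
        out-&gt;cmd = CMD_POLL;
    else if (sscanf(line, &quot;m=test=%d&quot;, &amp;out-&gt;station) == 1)
        out-&gt;cmd = CMD_TEST;

    return out-&gt;cmd != CMD_NONE;
}
</pre>
<p>Lines that start with &#8220;pc=&#8221; are notifications going the other way, so a parser like this one simply rejects them.</p>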
<h3>PC client program</h3>
<p>We built the PC client in Java, so it would (and should) be portable between OSs. The GUI was built with SWT, using the Eclipse Visual Editor plug-in, which simplified the process. Communication with the LPC board is done over the board&#8217;s serial port: as I mentioned, the PC creates a virtual COM port using a USB-to-UART driver (FTDI, bundled with Windows) [<a href="http://en.wikipedia.org/wiki/FTDI" target="_blank">link</a>].</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/1ZC2SuYb7P0&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/1ZC2SuYb7P0&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>To test the GUI we created a &#8220;simulator&#8221; that mimics the operation of a master station. It is shown in the video above.</p>
<p>The high-level design of the program is described in this inheritance UML diagram:</p>
<p style="text-align: center;"><a href="http://www.morethantechnical.com/wp-content/uploads/2010/02/smart_home_uml.png" rel="lightbox[580]"><img class="size-medium wp-image-593 aligncenter" title="smart_home_uml" src="http://www.morethantechnical.com/wp-content/uploads/2010/02/smart_home_uml-251x300.png" alt="" width="251" height="300" /></a></p>
<p>Our PC client uses serial-port connectivity based on the old javax.comm APIs (the Java Communications API). Though these APIs have been abandoned by Sun, and there is no official Win32 implementation bundled with the JDK (only for *NIXs/Solaris), a project named RxTx maintains an implementation of this API for Windows [<a href="http://rxtx.qbang.org/wiki/index.php/Download" target="_blank">link</a>].</p>
<h2>The Code</h2>
<p>We are releasing the code under the BSD license, for everyone to use, enjoy, learn and expand.</p>
<p>It is available via the blog&#8217;s SVN repo:</p>
<p>PC Client (requires RxTx serial port impl and SWT)</p>
<pre class="brush: plain;">
svn checkout http://morethantechnical.googlecode.com/svn/trunk/SmartHomePCClient
</pre>
<p>Embedded program (includes all dependencies)</p>
<pre class="brush: plain;">
svn checkout http://morethantechnical.googlecode.com/svn/trunk/smarthome_embedded/final project
</pre>
<p>To compile the master program, run &#8220;make&#8221; in the master directory and you&#8217;ll get an &#8220;xbee_master.hex&#8221; file. The same goes for the client in the client directory (these directories don&#8217;t contain code, only a makefile). Then you have to upload the hex file to the board, either with &#8220;make deploy&#8221; (if you have lpc21isp) or with FlashMagic on Windows.</p>
<p>Java is compiled as usual&#8230; just remember the dependencies.</p>
<p>We would like to thank Sivan Toledo for the guidance, loaned hardware and inspiration.</p>
<p>And thanks all for listening!<br />
Roy S. &amp; Gil Ramon</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2010/02/21/smarthome-embedded-computing-course-project/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GeekCon 2009: RunVas &#8211; Our project [w/ video, img]</title>
		<link>http://www.morethantechnical.com/2009/10/13/geekcon-2009-runvas-our-project-w-video-img/</link>
		<comments>http://www.morethantechnical.com/2009/10/13/geekcon-2009-runvas-our-project-w-video-img/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 09:14:49 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[3d]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[computer vision]]></category>
		<category><![CDATA[geekcon]]></category>
		<category><![CDATA[geekcon 2009]]></category>
		<category><![CDATA[geekcon09]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[jogl]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=464</guid>
		<description><![CDATA[Hi everyone In the last weekend I attended GeekCon 2009, a tech-conference, with a friend and colleague Arnon (not Arnon from the blog, who recently had a birthday &#8211; Happy B-Day Arnon!). Each team that attended had to create a project they can complete in 2-days of the conference. Our project is called &#8220;RunVas&#8221;, and [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/10/runvas.PNG" rel="lightbox[464]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/10/runvas-300x251.PNG" alt="runvas" title="runvas" width="300" height="251" class="alignleft size-medium wp-image-466" /></a>Hi everyone</p>
<p>Last weekend I attended <a href="http://www.geekcon.org/home">GeekCon 2009</a>, a tech-conference, with a friend and colleague Arnon (not Arnon from the blog, who recently had a birthday &#8211; Happy B-Day Arnon!). Each team that attended had to create a project it could complete within the 2 days of the conference. Our project is called &#8220;RunVas&#8221;, and the basic idea was to let people run around and paint by doing so. We wanted to combine computer vision with a little artistic angle.</p>
<p>Here are some more details<br />
<span id="more-464"></span></p>
<h2>GeekCon you say?</h2>
<p>First of all a few words about GeekCon itself. The conference is a &#8220;non-conference&#8221; or &#8220;un-conference&#8221;, which is a conference not focused on the business side of innovation and technology, but on the fun and creative side. The motto is something like: &#8220;geek out as hard as you possibly can in 2 days, and get it out of your system for the rest of the year&#8221;.</p>
<p>So teams from all corners of technology (Elect. Eng., Comp. Sci., metal and wood works, etc.) register and state their project of choice. The management decides whether the project can actually be delivered in 2 days, and is actually a &#8220;GeekCon project&#8221;. By &#8220;GeekCon project&#8221; they mean something that demonstrates a nice concept/idea in a cool way, and is <strong>utterly useless </strong>in real life. This is the official stance.</p>
<h2>Our project</h2>
<p>We were accepted with our project, RunVas. It&#8217;s a simple idea, based around the latest fashion of getting people out of the house, away from the computer and onto the lawns running. We also wanted to combine technical and artistic points of view. So we created a system that tracks objects in a video scene and sends the results to a drawing engine. The drawing is presented on a virtual &#8220;canvas&#8221; that the runners can view as they run, hence the name &#8220;RunVas&#8221;. We weren&#8217;t able to achieve all of that, but we had a good go at it, and delivered something nice.</p>
<h3>Implementation</h3>
<p>The CV part, object tracking, was programmed by Arnon using the archaic <a href="http://en.wikipedia.org/wiki/Macromedia_Director">Macromedia Director</a> (I don&#8217;t know which version, but an old one anyway). The drawing part was created by myself using the groundwork I had done for the <a href="http://www.morethantechnical.com/2009/07/27/advanced-issues-in-3d-game-building-with-jogl-openglswt-w-code-video/">3D graphics game</a> I programmed for school using SWT/JOGL. Personally I was amazed by how quickly I was able to pick up the framework from that project and re-use it for another, completely different, project. I guess that if you write stuff in a good solid structure you can build anything on top of it.</p>
<h2>Media</h2>
<p>So without further ado, here&#8217;s a short video:<br />
<object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/zD-kUlarcyY&#038;hl=en&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/zD-kUlarcyY&#038;hl=en&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<p>And my flickr stream with photos I uploaded in real time from the conference:<br />
<object width="400" height="300"><param name="flashvars" value="offsite=true&#038;lang=en-us&#038;page_show_url=%2Fphotos%2F30599876%40N02%2Ftags%2Fgeekcon09%2Fshow%2F&#038;page_show_back_url=%2Fphotos%2F30599876%40N02%2Ftags%2Fgeekcon09%2F&#038;user_id=30599876@N02&#038;tags=geekcon09&#038;jump_to=&#038;start_index="></param><param name="movie" value="http://www.flickr.com/apps/slideshow/show.swf?v=71649"></param><param name="allowFullScreen" value="true"></param><embed type="application/x-shockwave-flash" src="http://www.flickr.com/apps/slideshow/show.swf?v=71649" allowFullScreen="true" flashvars="offsite=true&#038;lang=en-us&#038;page_show_url=%2Fphotos%2F30599876%40N02%2Ftags%2Fgeekcon09%2Fshow%2F&#038;page_show_back_url=%2Fphotos%2F30599876%40N02%2Ftags%2Fgeekcon09%2F&#038;user_id=30599876@N02&#038;tags=geekcon09&#038;jump_to=&#038;start_index=" width="400" height="300"></embed></object></p>
<h2>Code</h2>
<p>The code for the canvas drawing proggy is available in the <a href="http://code.google.com/p/morethantechnical/source/checkout">SVN repo</a>.</p>
<p>Thanks!<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2009/10/13/geekcon-2009-runvas-our-project-w-video-img/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Awesome pictures fusing with a GIMP plugin [w/ code]</title>
		<link>http://www.morethantechnical.com/2009/09/21/awesome-pictures-fusing-with-a-gimp-plugin-w-code/</link>
		<comments>http://www.morethantechnical.com/2009/09/21/awesome-pictures-fusing-with-a-gimp-plugin-w-code/#comments</comments>
		<pubDate>Mon, 21 Sep 2009 12:58:58 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[graphics]]></category>
		<category><![CDATA[gui]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[gimp]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=381</guid>
		<description><![CDATA[Switching, merging or swapping, call it what you like &#8211; it&#8217;s a pain to pull off. You need to spend a lot of time tuning the colors, blending the edges and smudging to get a decent result. So I wrote a plugin for the wonderful GIMP program that helps this process. The merge is done [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_arrow.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_arrow.png" alt="desert_bear_arrow" title="desert_bear_arrow" width="320" height="240" class="alignleft size-full wp-image-462" /></a>Switching, merging or swapping, call it what you like &#8211; it&#8217;s a pain to pull off. You need to spend a lot of time tuning the colors, blending the edges and smudging to get a decent result. So I wrote a plugin for the wonderful GIMP program that helps this process. The merge is done using a blending algorithm that blends the colors from the original image into the pasted image.</p>
<p>I&#8217;ll write a little bit about coding GIMP plugins, which is very simple, and some about the algorithm and its sources.</p>
<p>Let&#8217;s see how it&#8217;s done<br />
<span id="more-381"></span></p>
<h2>Writing a GIMP plugin</h2>
<p>As I mentioned, writing a plugin for the GIMP is not very hard. I used the excellent (though outdated) <a href="http://developer.gimp.org/writing-a-plug-in/1/index.html">guide for writing a plugin</a> from GIMP.org.<br />
There are 2 kinds of plugins: scripts and &#8220;native&#8221; plugins. Script plugins are written in script-fu, the scripting language for GIMP, and are not compiled. &#8220;Native&#8221; plugins, on the other hand, are written either in C and compiled to an executable, or in Python or Perl. I wrote my plugin in C based roughly on the <a href="http://developer.gimp.org/plug-in-template.html">plugin template</a>, and compiled it on both Linux and Windows.</p>
<h3>From the ground up</h3>
<p>The first step I did was getting my hands on the pixel data of the selected layer. This is covered pretty nicely in the <a href="http://developer.gimp.org/writing-a-plug-in/2/index.html">tutorial</a>.<br />
Once you get a GimpDrawable you just use gimp_pixel_rgn_init, and then gimp_pixel_rgn_get_row to get a whole row of pixel data (interleaved RGB/RGBA).</p>
<p>To access elements in the row of bytes you need to know how many channels are in the interleave.</p>
<pre class="brush: plain;">
gint         channels, x1, y1, x2, y2, x_off, y_off, m, n;
GimpPixelRgn rgn_in;
guchar       *row;

gimp_drawable_mask_bounds (drawable-&gt;drawable_id,
	                                   &amp;x1, &amp;y1,
	                                   &amp;x2, &amp;y2);
gimp_drawable_offsets(drawable-&gt;drawable_id,&amp;x_off,&amp;y_off);
m = y2 - y1; //height of relevant region
n = x2 - x1; //width of relevant region

channels = gimp_drawable_bpp (drawable-&gt;drawable_id);
gimp_pixel_rgn_init (&amp;rgn_in,
			 drawable,
			 x1, y1,
			 n, m,
			 FALSE, FALSE);
row = g_new (guchar, channels * n);
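for (int y = 0; y &lt; m; y++) {
	//row coordinates are relative to the drawable, so start at the selection origin (x1, y1)
	gimp_pixel_rgn_get_row (&amp;rgn_in, row, x1, y1 + y, n);

	//manipulate pixels...
}
g_free (row); //release the row buffer when done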
</pre>
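<p>For the &#8220;manipulate pixels&#8221; part, the indexing into an interleaved row works as in the sketch below. It is plain C with no GIMP calls, so it can be tried on its own, and the function name is just an example. It assumes that 2 or 4 bytes per pixel means the last channel is alpha, which should be left alone. GLib&#8217;s guchar is simply unsigned char, so the same loop applies to the row buffer above.</p>
<pre class="brush: plain;">
/* Invert the color channels of one interleaved row, leaving alpha untouched.
   Pixel x, channel c lives at row[x * channels + c]. */
void invert_row(unsigned char *row, int width, int channels)
{
    /* ASSUMPTION: 2 = gray + alpha, 4 = RGB + alpha, so skip the last channel */
    int color_channels = (channels == 2 || channels == 4) ? channels - 1 : channels;

    for (int x = 0; x &lt; width; x++)
        for (int c = 0; c &lt; color_channels; c++)
            row[x * channels + c] = (unsigned char)(255 - row[x * channels + c]);
}
</pre>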
<p>Working with layer masks is also simple. Masks are GimpDrawables as well.</p>
<pre class="brush: plain;">
gint32 mask_id = gimp_layer_get_mask(drawable-&gt;drawable_id);
GimpDrawable* mask_drawable = gimp_drawable_get(mask_id);
</pre>
<h3>Progress reporting</h3>
<p>During the run of a lengthy operation it&#8217;s recommended to notify the user of the progress. The easiest way to do it in a GIMP plugin is to use the built-in progress bar, which is very easy to use:</p>
<pre class="brush: plain;">
gimp_progress_init (&quot;Clone...&quot;);
....
gimp_progress_set_text_printf(&quot;Create matrix %dx%d&quot;,mn,mn);
gimp_progress_update(0.33);
....
gimp_progress_update(1.0);
</pre>
<h3>User inputs</h3>
<p>User input for plugins can be easily done using GTK dialogs. The <a href="http://developer.gimp.org/writing-a-plug-in/3/index.html">gimp plugin development tutorial</a> covers this briefly, but it&#8217;s fairly simple for someone who is already familiar with GTK. I copied off the tutorial, and changed the spinbutton to a checkbutton for my needs.</p>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/dialog.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/dialog.png" alt="dialog" title="dialog" width="299" height="140" class="alignleft size-full wp-image-443" /></a></p>
<p>One tip: in the tutorial code the gimp_dialog_new function will not accept just an int for the 4th parameter (needs GtkDialogFlags), so I passed in GtkDialogFlags::GTK_DIALOG_MODAL.</p>
<p>Now that we got the bases covered, let&#8217;s move on to the algorithm.</p>
<h2>The blending algorithm</h2>
<p>The algorithm is based on a <a href="http://www.irisa.fr/vista/Papers/2003_siggraph_perez.pdf">paper published in &#8217;03 by Microsoft Research</a>. The article presents a way to merge two images by inspecting their gradients, and using this information to &#8220;bleed&#8221; in colors from the background image into the pasted image. More generally, the colors from the outer image (the background) will seep into the pasted image, and they will do it faster if the pasted image is smoother. This creates a very nice blending effect.</p>
<p>There&#8217;s some mathematical mumbo-jumbo about estimating and discretising the calculation of the gradient, and in the end it boils down to solving a simple sparse linear system. I used the <a href="http://home.gna.org/getfem/gmm_intro.html">GMM++</a> library to do this. The library is header-only (only .h files), so you only need to #include it and set the project paths correctly.</p>
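<p>To give a feel for what that system does, here is a tiny one-dimensional sketch in C. It is not the plugin&#8217;s code and it does not use GMM++. Instead of a sparse solver it uses plain Gauss-Seidel iteration, which I swapped in because it fits in a few lines. The two end values of dst play the role of the background around the pasted region, and the interior is solved so that its second differences match those of the pasted signal src.</p>
<pre class="brush: plain;">
/* 1D gradient-domain blend (illustrative sketch only).
   dst[0] and dst[n-1] are the fixed boundary taken from the background.
   For every interior i we want:
     dst[i-1] - 2*dst[i] + dst[i+1] == src[i-1] - 2*src[i] + src[i+1]
   and we get there by repeatedly solving that equation for dst[i]. */
void poisson_blend_1d(const double *src, double *dst, int n, int iterations)
{
    for (int it = 0; it &lt; iterations; it++) {
        for (int i = 1; i &lt; n - 1; i++) {
            double lap = src[i - 1] - 2.0 * src[i] + src[i + 1];
            dst[i] = 0.5 * (dst[i - 1] + dst[i + 1] - lap);
        }
    }
}
</pre>
<p>If src is completely flat, its second differences are zero and the interior of dst becomes a straight ramp between the two boundary values. That is the &#8220;seeping&#8221; of background colors into a smooth pasted image described above. In 2D the same idea gives one equation per pixel with four neighbors, which is where the large sparse matrix comes from.</p>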
<h2>How to use the plugin</h2>
<p>OK, so all this BS about GIMP, C and math gave me a headache, and we&#8217;re here to have some fun after all. So without further ado, let&#8217;s see how to use this plugin to do some nice tricks. Originally I used this plugin to merge faces together, but it actually can be used to merge anything.</p>
<p>For example, a polar bear in the desert:<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear-300x225.png" alt="desert_bear" title="desert_bear" width="300" height="225" class="alignnone size-medium wp-image-444" /></a></p>
<p>This is how to do it:</p>
<ol>
<li>get a background picture, preferably one with a uniform color part on which you&#8217;ll want to blend in the external object</li>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert.png" alt="desert" title="desert" width="320" height="240" class="alignnone size-full wp-image-445" /></a></p>
<li>get an image of an object, preferably surrounded by a uniform color &#8220;buffer&#8221; so it will blend nicely into the background</li>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/bear.jpeg" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/bear.jpeg" alt="bear" title="bear" width="124" height="93" class="alignnone size-full wp-image-446" /></a></p>
<li>if needed, cut out a bounding rectangle of the object with a good margin on all sides, and place it where you want on the background</li>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_no_merge.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_no_merge-300x225.png" alt="desert_bear_no_merge" title="desert_bear_no_merge" width="300" height="225" class="alignnone size-medium wp-image-447" /></a></p>
<li>create a mask, and use lasso to create a tighter mask around the object</li>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_cut.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_cut.png" alt="desert_bear_cut" title="desert_bear_cut" width="570" height="268" class="alignnone size-full wp-image-448" /></a></p>
<li>select the object layer, and fire the plugin</li>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_filer_menu.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/desert_bear_filer_menu.png" alt="desert_bear_filer_menu" title="desert_bear_filer_menu" width="640" height="480" class="alignnone size-full wp-image-449" /></a></p>
</ol>
<p>Some other stuff I did:<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/moon_mouse.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/moon_mouse.png" alt="moon_mouse" title="moon_mouse" width="320" height="240" class="alignnone size-full wp-image-455" /></a><br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2009/09/sea_shark.png" rel="lightbox[381]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/09/sea_shark.png" alt="sea_shark" title="sea_shark" width="320" height="240" class="alignnone size-full wp-image-456" /></a></p>
<h3>Video time&#8230;</h3>
<p><object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/Wo_WghEqpPA&#038;hl=en&#038;fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Wo_WghEqpPA&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<h2>Source of course</h2>
<p>Source is as usual in our <a href="http://code.google.com/p/morethantechnical/source/browse/#svn/trunk/ImageCloningGIMPPlugin">Google Code SVN repo</a>.</p>
<p>Executable (win32) is available <a href="http://morethantechnical.googlecode.com/files/ImageCloningGIMPPlugin.exe">here</a>.</p>
<p>Thank you for tuning in!<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2009/09/21/awesome-pictures-fusing-with-a-gimp-plugin-w-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>iPhoneOS 3.1 will not allow marker-based AR</title>
		<link>http://www.morethantechnical.com/2009/08/26/iphoneos-3-1-will-not-allow-marker-based-ar/</link>
		<comments>http://www.morethantechnical.com/2009/08/26/iphoneos-3-1-will-not-allow-marker-based-ar/#comments</comments>
		<pubDate>Wed, 26 Aug 2009 14:01:39 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[Mobile phones]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[augmented reality]]></category>
		<category><![CDATA[iphone]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=417</guid>
		<description><![CDATA[Hi I had very high hopes for iPhoneOS 3.1 in the AR arena. With all the hype about it, I naturally thought that with 3.1 developers will be able to bring marker-detection AR to the app-store &#8211; meaning, using legal and published APIs. A look around 3.1&#8242;s APIs I wasn&#8217;t able to find anything that [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/08/no-ar.png" rel="lightbox[417]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/08/no-ar-161x300.png" alt="no-ar" title="no-ar" width="161" height="300" class="alignleft size-medium wp-image-419" /></a>Hi</p>
<p>I had very high hopes for iPhoneOS 3.1 in the AR arena. With all the <a href="http://gizmodo.com/5322448/apple-will-let-iphone-apps-augment-our-sad-little-realities-in-september-with-os-31">hype</a> <a href="http://www.engadget.com/2009/07/24/iphones-augmented-reality-apps-coming-with-september-os-3-1-lau/">about</a> it, I naturally thought that with 3.1 developers would be able to bring marker-detection AR to the App Store &#8211; meaning, using legal and published APIs. Looking through 3.1&#8217;s APIs, though, I wasn&#8217;t able to find anything that allows this.</p>
<p>Not all AR is banned. In fact, AR apps like <a href="http://www.layar.eu/">Layar</a> will be very much possible, as they rely on GPS, compass &#038; accelerometer to create the AR effect. These don&#8217;t require processing the live video feed from the camera, only overlaying data on top of it. This can be done easily with the new cameraOverlayView property of UIImagePickerController. All you need to do is create a transparent view with the required data, and it will be overlaid on the camera preview.</p>
<p>Sadly, to get marker-detection abilities developers must still hack the system (camera callback rerouting), or use very slow methods (UIGetScreenImage). I can only hope Apple will see the potential of letting developers manipulate the live video feed.</p>
<p>Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2009/08/26/iphoneos-3-1-will-not-allow-marker-based-ar/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Near realtime face detection on the iPhone w/ OpenCV port [w/code,video]</title>
		<link>http://www.morethantechnical.com/2009/08/09/near-realtime-face-detection-on-the-iphone-w-opencv-port-wcodevideo/</link>
		<comments>http://www.morethantechnical.com/2009/08/09/near-realtime-face-detection-on-the-iphone-w-opencv-port-wcodevideo/#comments</comments>
		<pubDate>Sun, 09 Aug 2009 11:45:05 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[Mobile phones]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[face]]></category>
		<category><![CDATA[face detection]]></category>
		<category><![CDATA[iphone]]></category>
		<category><![CDATA[opencv]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=400</guid>
		<description><![CDATA[Hi OpenCV is by far my favorite CV/Image processing library. When I found an OpenCV port to the iPhone, and even someone tried to get it to do face detection, I just had to try it for myself. In this post I&#8217;ll try to run through the steps I took in order to get OpenCV [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/08/vlcsnap-6148087-processes.png" rel="lightbox[400]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/08/vlcsnap-6148087-processes-193x300.png" alt="iphone + opencv = win" title="iphone + opencv = win" width="193" height="300" class="alignleft size-medium wp-image-405" /></a>Hi<br />
OpenCV is by far my favorite CV/Image processing library. When I found an OpenCV port to the iPhone, and saw that someone had even tried to get it to do face detection, I just had to try it for myself.</p>
<p>In this post I&#8217;ll try to run through the steps I took in order to get OpenCV running on the iPhone, and then how to get OpenCV&#8217;s face detection to play nice with iPhoneOS&#8217;s image buffers and video feed (not yet OS 3.0!). Then I&#8217;ll talk a little about optimization.</p>
<p><strong>Update</strong>: Apple officially supports camera video pixel buffers in iOS 4.x using AVFoundation, <a href="http://developer.apple.com/iphone/library/qa/qa2010/qa1702.html#TNTAG1">here&#8217;s</a> sample code from Apple developer.</p>
<p>Let&#8217;s begin<br />
<span id="more-400"></span></p>
<h2>Cross compiling OpenCV on iPhoneOS</h2>
<p>The good people @ computer-vision-software.com have posted a <a href="http://www.computer-vision-software.com/blog/2009/04/opencv-vs-apple-iphone/">guideline on how to compile OpenCV on iPhone</a> and <a href="http://ildan.blogspot.com/2008_07_01_archive.html">link it as static libraries</a>, and I followed it. I did have to recompile it with one change &#8211; OpenCV needed zlib linkage, and the OpenCV configure script wasn&#8217;t able to set up the makefiles to compile zlib as well. So I <a href="http://www.zlib.net/zlib-1.2.3.tar.gz">downloaded zlib</a> from the net, and just added all the files to the Xcode project to compile and link. If you&#8217;re trying to recreate this, remember to configure/build zlib before adding the files to Xcode so you get a zconf.h file. Now OpenCV linked perfectly.<br />
All in all it was really not a big deal to compile OpenCV for iPhoneOS. I imagined it would be much harder&#8230;</p>
<p>OK moving on to</p>
<h2>Plain vanilla face detection</h2>
<p>So the first step is to just get OpenCV to detect a single face in a single image. But let&#8217;s make it harder and use UIImage.<br />
So first, I took OCV&#8217;s facedetect.c example, and added it to the project as is. Then I added two helper functions to set up and tear down the structs and the statically allocated memory (things that the example does in its main function).</p>
<pre class="brush: plain;">
void init_detection(char* cascade_location) {
	cascade = (CvHaarClassifierCascade*)cvLoad( cascade_location, 0, 0, 0 );
	storage = cvCreateMemStorage(0);
}

static IplImage *gray = 0, *small_img = 0;

void release_detection() {
	if (storage)
    {
        cvReleaseMemStorage(&amp;storage);
    }
    if (cascade)
    {
        cvReleaseHaarClassifierCascade(&amp;cascade);
    }
	cvReleaseImage(&amp;gray);
	cvReleaseImage(&amp;small_img);
}
</pre>
<p>The detect_and_draw function remains exactly the same at this point. I just take the XML files of the haarcascades, and add them to the project&#8217;s resources.<br />
Now I initialize the detection structs from my UIView or UIViewController that will do the detection. The main NSBundle will find the path to the XML file:</p>
<pre class="brush: plain;">
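NSString* cascadePath = [[NSBundle mainBundle] pathForResource:@&quot;haarcascade_frontalface_alt&quot; ofType:@&quot;xml&quot;];
char chars[512]; // stack buffer, so there is nothing to free afterwards
// getCString returns NO if the path is missing or does not fit the buffer
if ([cascadePath getCString:chars maxLength:sizeof(chars) encoding:NSUTF8StringEncoding]) {
	init_detection(chars);
}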
</pre>
<p>Awesome, now let&#8217;s face-detect already! For that all we need is to add a picture of someone to the project&#8217;s resources, load it, convert it to IplImage* and hand it over to detect_and_draw &#8211; simple.<br />
I used a couple of helper functions from <a href="http://www.computer-vision-software.com/blog/2009/04/opencv-vs-apple-iphone/">the informative post</a> I mentioned earlier:</p>
<pre class="brush: plain;">
- (void)manipulateOpenCVImagePixelDataWithCGImage:(CGImageRef)inImage openCVimage:(IplImage *)openCVimage;
- (CGContextRef)createARGBBitmapContext:(CGImageRef)inImage;
- (IplImage *)getCVImageFromCGImage:(CGImageRef)cgImage;
-(CGImageRef)getCGImageFromCVImage:(IplImage*)cvImage;
</pre>
<p>Now it&#8217;s only putting it together:</p>
<pre class="brush: plain;">
IplImage* im = [self getCVImageFromCGImage:[UIImage imageNamed:@&quot;a_picture.jpg&quot;].CGImage];
detect_and_draw(im);
UIImage* result = [UIImage imageWithCGImage:[self getCGImageFromCVImage:im]];

UIImageView* imv = [[UIImageView alloc] initWithImage:result];
[self addSubview:imv];
[imv release];
</pre>
<p>Just remember those externs, if you don&#8217;t use a header file:</p>
<pre class="brush: plain;">
extern &quot;C&quot; void detect_and_draw( IplImage* img, CvRect* found_face );
extern &quot;C&quot; void init_detection(char* cascade_location);
extern &quot;C&quot; void release_detection();
</pre>
<p>Sweet. But detecting a face in a single photo is not so difficult &#8211; we want video and real-time face detection! So let&#8217;s do that.</p>
<h2>Tying it up with video feed from the iPhone camera (no OS 3.0 yet)</h2>
<p>This step was so amazingly simple, it was borderline funny. I used the well-known camera frame grabbing code from <a href="http://github.com/norio-nomura/iphonetest/tree/9713242dda6c6bc897da4bd639a1fdadc29b6fd7/CameraTest">Norio Nomura</a>. Of course, to align it with OS 3.0 you must plug it into the API Apple provides, and not this wily hack, but it&#8217;s really a plug-and-play situation. I use it in <a href="http://www.morethantechnical.com/2009/07/01/augmented-reality-on-the-iphone-using-nyartoolkit-w-code/">many</a> of my <a href="http://www.morethantechnical.com/2009/05/06/iphone-camera-frame-grabbing-and-a-real-time-meanshift-tracker/">projects</a> that use the iPhone camera, until video support in OS 3.0 is finalized.<br />
So all I needed was to set everything up, make a timer fire every so many milliseconds, and send the frame to detection:</p>
<pre class="brush: plain;">
- (id)initWithNibName:(NSString *)nibNameOrNil bundle:(NSBundle *)nibBundleOrNil {
    if (self = [super initWithNibName:nibNameOrNil bundle:nibBundleOrNil]) {
        // Set up the camera frame grabber
		ctad = [[CameraTestAppDelegate alloc] init];
		[ctad doInit];

		// Load the Haar cascade from the app bundle (stack buffer, nothing to free)
		NSString* cascadePath = [[NSBundle mainBundle] pathForResource:@&quot;haarcascade_frontalface_alt&quot; ofType:@&quot;xml&quot;];
		char chars[512];
		if ([cascadePath getCString:chars maxLength:sizeof(chars) encoding:NSUTF8StringEncoding]) {
			init_detection(chars);
		}

		[self.view addSubview:[ctad getPreviewView]];
		[self.view sendSubviewToBack:[ctad getPreviewView]];

		// Run detection roughly 11 times per second
		repeatingTimer = [NSTimer scheduledTimerWithTimeInterval:0.0909 target:self selector:@selector(doDetection:) userInfo:nil repeats:YES];
    }
    return self;
}

-(void)doDetection:(NSTimer*) timer {
	if([ctad getPixelData]) {
		if(!im) {
			im = cvCreateImageHeader(cvSize([ctad getVideoSize].width,[ctad getVideoSize].height), 8, 4);
		}
		cvSetData(im, [ctad getPixelData],[ctad getBytesPerRow]);
		CvRect r;
		detect_and_draw(im,&amp;r);
		if(r.width &gt; 0 &amp;&amp; r.height &gt; 0) {
			NSLog(@&quot;Face: %d,%d,%d,%d&quot;,r.x,r.y,r.width,r.height);
		}
	}
}
</pre>
<p>Note that, for optimization&#8217;s sake, I only create the IplImage header once (the if block runs only the first time), and on every frame after that I just point the IplImage at the buffer I got from the camera. This way the IplImage shares the camera&#8217;s buffer, so there is also a small memory optimization there.<br />
From that point on you can take it anywhere you like. Add stuff to faces, mark the face in the image, etc.</p>
<p>But&#8230; there&#8217;s the issue of performance. This method will get you very bad timings, in the area of 5-15 seconds (!!) for a single frame &#8211; which is horrendous. And I promised near real time performance. So without further ado,</p>
<h2>Optimizing the hell out of the detection algorithm</h2>
<p>Well, the guys at computer-vision-software.com have done some <a href="http://www.computer-vision-software.com/blog/2009/04/fixing-opencv/">work in the field of optimizing OpenCV&#8217;s haar-based detection</a>, but never released code. Their method was based on the fact that the iPhone&#8217;s CPU handles integers far better than floating-point numbers, so they set out to change the algorithm to use integers. I also did that, and found that it only shaves a few milliseconds off the total time. The far more influential factors are the window size of the feature scan, the scaling factor of the window size, and the resulting number of passes.</p>
<p>Let me explain a little bit how the detection works in OpenCV. First you set the minimal size of the window. Then you specify a scale factor. OpenCV uses this scale factor to do multiple passes over the image to scan for feature-hits. It takes the window size, say 30&#215;30, and the factor, say 1.1, and keeps multiplying the window size by the factor until it reaches the size of the image. So for a 256&#215;256 image you get: 30&#215;30 scan, 33&#215;33, 36&#215;36, 39&#215;39, 43&#215;43&#8230; 244&#215;244 &#8211; a total of 23 passes, for one frame! This is way too much&#8230; It is done to get better and finer results, and it may be good for systems with resources to spare, but this is not our case.</p>
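<p>To make that arithmetic concrete, here is a minimal sketch in plain C that counts the passes under the simplified model described above: grow the window by the scale factor until it no longer fits in the image. The name count_scan_passes is made up for illustration and is not an OpenCV function, and OpenCV&#8217;s real loop has a few extra conditions, so treat the results as estimates:</p>
<pre class="brush: plain;">
/* Hypothetical helper, not part of OpenCV: counts how many window sizes
   fit in the image when the window grows by scale_factor on every pass. */
int count_scan_passes(int min_window, int image_size, double scale_factor)
{
	int passes = 0;
	double window = (double)min_window;

	/* A factor of 1.0 or less would never reach the image size. */
	if (scale_factor &lt;= 1.0 || min_window &lt;= 0)
		return 0;

	while (window &lt;= (double)image_size) {
		passes++;
		window *= scale_factor;
	}
	return passes;
}
</pre>
<p>Under these assumptions, a 30-pixel window on a 256-pixel image gives 23 passes with a factor of 1.1, 12 passes with 1.2 and only 6 with 1.5, which is why the scale factor has such a strong effect on the timing.</p>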
<p>So the first thing I did was cut down on those scans. There is, as expected, a very strong impact on the quality of the results, but the times are getting close to acceptable. After all my optimizations I got the timing down to as low as ~120ms.<br />
I optimized a few things:</p>
<ul>
<li>The size of the input image, originally ~300&#215;400, was scaled down by a factor of 1.5</li>
<li>The scale factor for cvHaarDetectObjects: I played with values ranging from 1.2 to 1.5, with pleasing timings</li>
<li>The ROI (region of interest) in the IplImage to scan was set every frame to have the previous frame&#8217;s detection, the location of the face, plus some buffer on the sides to allow movement of the face frame-to-frame. This decreases the scanned area from the whole image to just a small portion that contains the known face. Of course if a face was not found the ROI is reset.</li>
<li>I changed the internals of the cvHaarDetectObjects algorithm to do far fewer floating-point multiplications, turning them into integer multiplications.</li>
<li>It dawned on me just the other day that I can also optimize the size of the search window, and not keep it constant from frame to frame (30&#215;30). If the last frame had found a 36&#215;36 face, the next detection should also try for a 36&#215;36 object. I haven&#8217;t tried it yet.</li>
<li>Memory optimization: don&#8217;t alloc buffers every frame, share buffers, etc.</li>
</ul>
<p>First, the most influential change is in the detection phase:</p>
<pre class="brush: plain;">
void detect_and_draw( IplImage* img, CvRect* found_face )
{
	static CvRect prev;

	if(!gray) {
		gray = cvCreateImage( cvSize(img-&gt;width,img-&gt;height), 8, 1 );
		small_img = cvCreateImage( cvSize( cvRound (img-&gt;width/scale),
							 cvRound (img-&gt;height/scale)), 8, 1 );
	}

	if(prev.width &gt; 0 &amp;&amp; prev.height &gt; 0) {
		cvSetImageROI(small_img, prev);

		CvRect tPrev = cvRect(prev.x * scale, prev.y * scale, prev.width * scale, prev.height * scale);
		cvSetImageROI(img, tPrev);
		cvSetImageROI(gray, tPrev);
	} else {
		cvResetImageROI(img);
		cvResetImageROI(small_img);
		cvResetImageROI(gray);
	}

    cvCvtColor( img, gray, CV_BGR2GRAY );
    cvResize( gray, small_img, CV_INTER_LINEAR );
    cvEqualizeHist( small_img, small_img );
    cvClearMemStorage( storage );

		CvSeq* faces = mycvHaarDetectObjects( small_img, cascade, storage,
										   1.2, 0, 0
										   |CV_HAAR_FIND_BIGGEST_OBJECT
										   |CV_HAAR_DO_ROUGH_SEARCH
										   //|CV_HAAR_DO_CANNY_PRUNING
										   //|CV_HAAR_SCALE_IMAGE
										   ,
										   cvSize(30, 30) );

	if(faces-&gt;total&gt;0) {
		CvRect* r = (CvRect*)cvGetSeqElem( faces, 0 );
		int startX,startY;
		if(prev.width &gt; 0 &amp;&amp; prev.height &gt; 0) {
			r-&gt;x += prev.x;
			r-&gt;y += prev.y;
		}
		startX = MAX(r-&gt;x - PAD_FACE,0);
		startY = MAX(r-&gt;y - PAD_FACE,0);
		int w = small_img-&gt;width - startX - r-&gt;width - PAD_FACE_2;
		int h = small_img-&gt;height - startY - r-&gt;height - PAD_FACE_2;
		int sw = r-&gt;x - PAD_FACE, sh = r-&gt;y - PAD_FACE;
		prev = cvRect(startX, startY,
					  r-&gt;width + PAD_FACE_2 + ((w &lt; 0) ? w : 0) + ((sw &lt; 0) ? sw : 0),
					  r-&gt;height + PAD_FACE_2 + ((h &lt; 0) ? h : 0) + ((sh &lt; 0) ? sh : 0));
		printf(&quot;found face (%d,%d,%d,%d) setting ROI to (%d,%d,%d,%d)\n&quot;,r-&gt;x,r-&gt;y,r-&gt;width,r-&gt;height,prev.x,prev.y,prev.width,prev.height);
		found_face-&gt;x = (int)((double)r-&gt;x * scale);
		found_face-&gt;y = (int)((double)r-&gt;y * scale);
		found_face-&gt;width = (int)((double)r-&gt;width * scale);
		found_face-&gt;height = (int)((double)r-&gt;height * scale);
	} else {
		prev.width = prev.height = found_face-&gt;width = found_face-&gt;height = 0;
	}
}
</pre>
<p>As you can see I keep the previous face in prev, and use it to set the ROI of the images for the next frame. Note that the small_img is a scaled-down version of the input image, so the detection results must be scaled-up to match the real size of the input.</p>
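<p>The width and height arithmetic for prev is a bit dense, so here is a small stand-alone sketch of the same idea: grow the detected face by a padding on every side, then clip it to the image. It assumes PAD_FACE_2 is simply twice PAD_FACE, and FaceRect and pad_and_clamp are illustrative names standing in for CvRect and the inline code, not something taken from the project:</p>
<pre class="brush: plain;">
/* Stand-in for CvRect so the sketch compiles without OpenCV. */
typedef struct { int x, y, width, height; } FaceRect;

/* Hypothetical helper: pad a face rectangle by pad pixels on each side
   and clip the result to an img_w x img_h image. */
FaceRect pad_and_clamp(FaceRect face, int pad, int img_w, int img_h)
{
	FaceRect roi;
	int x0 = face.x - pad;
	int y0 = face.y - pad;
	int x1 = face.x + face.width + pad;
	int y1 = face.y + face.height + pad;

	/* Clip every edge to the image borders. */
	if (x0 &lt; 0) x0 = 0;
	if (y0 &lt; 0) y0 = 0;
	if (x1 &gt; img_w) x1 = img_w;
	if (y1 &gt; img_h) y1 = img_h;

	roi.x = x0;
	roi.y = y0;
	roi.width = x1 - x0;
	roi.height = y1 - y0;
	return roi;
}
</pre>
<p>For example, a 36&#215;36 face at (10,10) with a padding of 20 in a 200&#215;266 image ends up as a 66&#215;66 ROI at (0,0), because the left and top padding get clipped at the image border.</p>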
<p>Now, I could bore you with the details of how I changed cvHaarDetectObjects to use more integers, but I won&#8217;t. Anyway, it&#8217;s all in the code, which is freely available, so you can diff it against cvhaar.cpp of OpenCV and see the changes. In short, what I did was:</p>
<ul>
<li>Commented out image scaling and canny pruning.</li>
<li>In cvSetImagesForHaarClassifierCascade, which fires many times for each frame and is responsible for scaling/shifting/rotating the Haar classifiers to get better detection, I changed the weights and sizes to be integers rather than floats.</li>
<li>In cvRunHaarClassifierCascade, which calculates the score for a single Haar feature-hit, I changed the result calculation to use integers instead of floats.</li>
<li>I played around with integer-oriented implementations of the sqrt function that cvRunHaarClassifierCascade uses (it fires many, many times each frame), but that actually caused a slow-down on the device. It turns out the standard library (math.h) implementation is the best.</li>
</ul>
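<p>To give a feel for what &#8220;integers rather than floats&#8221; means for the weights, here is a generic fixed-point sketch. It is only an illustration of the technique and not the code from the repo: FIXED_SHIFT and all three function names are assumptions picked for this example.</p>
<pre class="brush: plain;">
#include &lt;stdint.h&gt;

#define FIXED_SHIFT 10	/* assumed precision: 10 fractional bits, steps of 1/1024 */

/* Convert a floating-point weight to fixed point once, outside the hot loop. */
int32_t weight_to_fixed(float weight)
{
	float scaled = weight * (float)(1 &lt;&lt; FIXED_SHIFT);
	/* Round to the nearest integer. */
	return (int32_t)(scaled &gt;= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Integer-only multiply; the result still carries FIXED_SHIFT fractional bits. */
int64_t weighted_sum_fixed(int32_t rect_sum, int32_t fixed_weight)
{
	return (int64_t)rect_sum * (int64_t)fixed_weight;
}

/* Drop the fractional bits again (truncates toward zero). */
int64_t fixed_to_int(int64_t value)
{
	return value / (1 &lt;&lt; FIXED_SHIFT);
}
</pre>
<p>The float-to-integer conversion happens once per weight, and the per-window work is then a plain integer multiply. A weight of 2.0 becomes 2048, so a rectangle sum of 100 gives 204800, which converts back to 200.</p>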
<p>Well guys, that&#8217;s pretty much all my discovery in the field. Please keep working on it. I&#8217;m anxious to see a true real-time face detection on the iPhone.</p>
<p>Time for video proof? You bet.</p>
<h2>Here&#8217;s proof that all I wrote here is not total BS</h2>
<p><object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/DKn8UTW9Qfo&#038;hl=en&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/DKn8UTW9Qfo&#038;hl=en&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<h2>Code</h2>
<p>Code is as usual available in Google Code SVN repo:<br />
<a href="http://code.google.com/p/morethantechnical/source/browse/#svn/trunk/FaceDetector-iPhone">http://code.google.com/p/morethantechnical/source/browse/#svn/trunk/FaceDetector-iPhone</a></p>
<p>OK, &#8216;Till next time, enjoy<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2009/08/09/near-realtime-face-detection-on-the-iphone-w-opencv-port-wcodevideo/feed/</wfw:commentRss>
		<slash:comments>36</slash:comments>
		</item>
		<item>
		<title>Advanced topics in 3D game building [w/ code, video]</title>
		<link>http://www.morethantechnical.com/2009/07/27/advanced-issues-in-3d-game-building-with-jogl-openglswt-w-code-video/</link>
		<comments>http://www.morethantechnical.com/2009/07/27/advanced-issues-in-3d-game-building-with-jogl-openglswt-w-code-video/#comments</comments>
		<pubDate>Mon, 27 Jul 2009 12:17:27 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[3d]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[gui]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[school]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[game]]></category>
		<category><![CDATA[jogl]]></category>
		<category><![CDATA[swt]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=325</guid>
		<description><![CDATA[Hi The graphics course I took at TAU really expanded my knowledge of 3D rendering, and specifically using OpenGL to do so. The final task of the course, aside from the exam, was to write a 3D game. We were given 3 choices for types of games: worms-like, xonix-like and lightcycle-like. We chose to write [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/07/snails_3d.png" rel="lightbox[325]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/snails_3d-300x223.png" alt="snails_3d" title="snails_3d" width="300" height="223" class="alignleft size-medium wp-image-364" /></a>Hi</p>
<p>The graphics course I took at TAU really expanded my knowledge of 3D rendering, and specifically using OpenGL to do so. The final task of the course, aside from the exam, was to write a 3D game. We were given 3 choices for types of games: worms-like, xonix-like and lightcycle-like. We chose to write our version of Worms in 3D.</p>
<p>I&#8217;ll try to take you through some of the problems we encountered, the decisions we made, and show as much code as possible. I&#8217;m not, however, gonna take you through the simple (yet grueling) work of actually showing meshes to the screen or moving them around, these subjects are covered extensively online.</p>
<p>The whole game is implemented in Java using JOGL and SWT for 3D rendering. The code is of course available entirely <a href="http://code.google.com/p/taucomputergraphics09/source/browse/#svn/trunk/CG_EX3">online</a>.</p>
<p><span id="more-325"></span></p>
<h2>In the beginning there was mesh</h2>
<p>Our game is a turn-based shooter, Worms-like. The players are supposed to walk on the face of a world mesh (a 3d structure), and shoot missiles at each other.<br />
When it comes to positioning and aligning 3D objects on top of other 3D objects, I must say I was overwhelmed by the complexity. It&#8217;s not that easy at first, but once you get the idea it&#8217;s straightforward.<br />
So you have an arbitrary 3D object, and you want another object to &#8220;walk&#8221; over it. There are a few things you need to know:</p>
<ol>
<li>the normal of the mesh at any given point, so your player can have an &#8220;up&#8221; direction (away from the mesh)</li>
<li>the global orientation of your player object, so when you align it up it will actually face up</li>
</ol>
<p>Now, to align the object you have to orient it upwards from the mesh and point it in the forward direction. So before drawing the player object, make the model-view matrix face the right direction. For this we <a href="http://www.opengl.org/resources/faq/technical/lookat.cpp">ported some C++ code</a> that does exactly that:</p>
<pre class="brush: plain;">
public static void multLookAt (/*float eyex, float eyey, float eyez,
            float atx, float aty, float atz,
            float upx, float upy, float upz,*/
			Vector3D origin,
			Vector3D forward,
			Vector3D up,
            GL gl)
	{
		float m[] = new float[16];

		// Compute our new look at vector, which will be
		//   the new negative Z axis of our transformed object.
		forward = forward.getNormalized();

		// Cross product of the new look at vector and the current
		//   up vector will produce a vector which is the new
		//   positive X axis of our transformed object.
		Vector3D xaxis = forward.crossProduct(up).getNormalized();
		m[0] = xaxis.x;
		m[1] = xaxis.y;
		m[2] = xaxis.z;

		// Calculate the new up vector, which will be the
		//   positive Y axis of our transformed object. Note
		//   that it will lie in the same plane as the new
		//   look at vector and the old up vector.
		up = xaxis.crossProduct(forward);
		m[4] = up.x;
		m[5] = up.y;
		m[6] = up.z;

		// Account for the fact that the geometry will be defined to
		//   point along the negative Z axis.
		forward = forward.multiply(-1f);
		m[8] = forward.x;
		m[9] = forward.y;
		m[10] = forward.z;

		// Fill out the rest of the 4x4 matrix
		m[3] = 0.f;     // xaxis is m[0..2]
		m[7] = 0.f;     // up is m[4..6]
		m[11] = 0.f;    // -at is m[8..10]
		m[12] = origin.x;
		m[13] = origin.y;
		m[14] = origin.z;
		m[15] = 1.f;

		// Multiply onto current matrix stack.
		gl.glMultMatrixf(FloatBuffer.wrap(m));
	}
</pre>
<h2>Walk the mesh</h2>
<p>Another key point in our game is making the player character walk smoothly on the mesh. The instructions for this exercise were that we have to make the player either walk from vertex to vertex, from edge to edge or the most complicated option &#8211; an arbitrary point on the mesh to another. We chose to make the player walk from edge to edge.<br />
Say that the player object is on an edge. To make it walk to another edge we need to know which edges it could possibly reach. For that we created a <a href="http://www.google.com/search?q=half+edge">half-edge structure</a> of the world mesh. Then, by querying the structure, we can get in constant time the edges neighboring the player (typically the <a href="http://bluntobject.wordpress.com/2007/03/13/mesh-data-structures-vol-2-vertex-one-rings/">one-ring of both vertices</a> of the edge he&#8217;s currently standing on). We only have to choose from the list of edges which edge the player should go to, and where on that edge he will &#8220;land&#8221;. This is done by choosing the edge with the closest point to the approximation of the player&#8217;s next location.<br />
Let me explain: </p>
<ul>
<li>The player has a certain speed, meaning it can cover a certain distance in a given time. </li>
<li>The approximate next location of the player is its current location moved along the forward direction by the distance it covers in one small time step. We took a constant value for that distance.</li>
<li>Once you have an approximate location, find the edge (and a point on that edge) that best conforms with this location.</li>
</ul>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/07/snail_approx_next_location.PNG" rel="lightbox[325]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/snail_approx_next_location-300x222.PNG" alt="approx next location" title="approx next location" width="300" height="222" class="alignleft size-medium wp-image-355" /></a><br />
 In the image you see the player object (snail), the ring of edges around the player that we check (green), the approximate next location (yellow), and the calculated next location (purple).</p>
<p>To find the best location for the object we measured how far each candidate edge is from the approximate location. We ported another piece of <a href="http://softsurfer.com/Archive/algorithm_0106/algorithm_0106.htm">C code to measure distance between segments</a> to do that:</p>
<pre class="brush: plain;">
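/**
	 * Find where segment 1 comes closest to the line through segment 2
	 * (adapted from a segment-to-segment distance routine).
	 * @param S1P1 1st point of segment 1
	 * @param S1P0 2nd point of segment 1
	 * @param S2P1 1st point of segment 2
	 * @param S2P0 2nd point of segment 2
	 * @return the point on segment 1 nearest to that line, or null if it falls outside segment 1
	 */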
public static Vector3D dist3D_Segment_to_Segment( Vector3D S1P1, Vector3D S1P0, Vector3D S2P1, Vector3D S2P0)
	{
	    Vector3D   u = S1P1.minus(S1P0);
	    Vector3D   v = S2P1.minus(S2P0);
	    Vector3D   w = S1P0.minus(S2P0);
	    float    a = u.innerProduct(u);        // always &gt;= 0
	    float    b = u.innerProduct(v);
	    float    c = v.innerProduct(v);        // always &gt;= 0
	    float    d = u.innerProduct(w);
	    float    e = v.innerProduct(w);
	    float    D = a*c - b*b;       // always &gt;= 0
	    float    sc, sN, sD = D;      // sc = sN / sD, default sD = D &gt;= 0

	    // compute the line parameters of the two closest points
	        sN = (b*e - c*d);
	        if (sN &lt; 0f) {       // sc &lt; 0 =&gt; the s=0 edge is visible
	            return null;
	        }
	        else if (sN &gt; sD) {  // sc &gt; 1 =&gt; the s=1 edge is visible
	            return null;
	        }

	    // finally do the division to get sc
	    sc = (Math.abs(sN) &lt; Vector3D.EPSILON ? 0f : sN / sD);

	    return S1P0.add(u.multiply(sc)); // return the intersection point
	}
</pre>
<p>What we actually compute is the point where a candidate edge meets the line that starts at the player and runs in the forward direction. This is where the player would end up if it walked from its current position to that edge.</p>
<p>Orchestrating:</p>
<pre class="brush: plain;">
	/**
	 * find the location where the player should be in the next frame
	 * @param distanceToMoveInThisInterval the distance the player object should (try to) move
	 * @return
	 */
	public EdgeAndIntersectionPointAndDistance findNextLocation(float distanceToMoveInThisInterval) {
		//find all adjacent edges using half-edge struct
		Edge currentEdge = model.getCurrentEdge();
		Set&lt;Edge&gt; allEdges = currentEdge.b.get2RingOfEdges();
		allEdges.addAll(currentEdge.a.get2RingOfEdges());

//A point that lies a little bit along the line from the player and in the forward direction
		Vector3D locationAndSome = location.add(dirForward.multiply(0.05f));

		//find the edge that the player direction is intersecting
		allEdges.remove(currentEdge);
		ArrayList&lt;EdgeAndIntersectionPointAndDistance&gt; intersectingEdges = new ArrayList&lt;EdgeAndIntersectionPointAndDistance&gt;();
		for (Edge e : allEdges) {
			//skip the opposite edge too.
			if((e.a == currentEdge.a &amp;&amp; e.b == currentEdge.b) ||
					(e.b == currentEdge.a &amp;&amp; e.a == currentEdge.b))
				continue;

			Vector3D intPt = Utils.dist3D_Segment_to_Segment(e.a.getVector3D(), e.b.getVector3D(), model.getLocation(), locationAndSome);
			if(intPt != null) {
				float dist = intPt.minus(locationAndSome).getNorm();
				intersectingEdges.add(
					new EdgeAndIntersectionPointAndDistance(e,dist,intPt)
					);
			}
		}

		EdgeAndIntersectionPointAndDistance bestEdge = null;
		float bestDistance = Float.MAX_VALUE;
		for (EdgeAndIntersectionPointAndDistance ead : intersectingEdges) {
			float absD = ead.distance;
			if(absD &lt; bestDistance) {
				bestDistance = absD;
				bestEdge = ead;
			}
		}

		//find the location on that edge where the player should be
		return bestEdge;
	}
</pre>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/07/object_mesh_direction.png" rel="lightbox[325]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/object_mesh_direction-300x139.png" alt="object_mesh_direction" title="object_mesh_direction" width="300" height="139" class="alignleft size-medium wp-image-361" /></a><br />
Of course, when the player gets to his next location he needs the correct up and forward directions. These can be derived from the normals stored in the next edge&#8217;s vertices.</p>
<p>There are some additional issues to traversing the mesh:</p>
<ul>
<li>Avoiding collisions with other players. This is done by not allowing two players to occupy the same edge.</li>
<li>Collecting bonuses that are on the ground, by checking whether the player has walked over (or close to) an edge that has a bonus on it.</li>
<li>Maintaining the half-edge structure by updating the edges when players or bonuses are on them.</li>
</ul>
<h2>Make a dent in the world</h2>
<p>One more interesting issue is the missiles&#8217; impact on the world. Naturally, we&#8217;d like missiles that hit the ground to make a dent in it. Our solution was to take the 1-ring of the hit vertex and lower every vertex on the ring in the direction opposite to its normal.<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2009/07/dent_in_mesh.PNG" rel="lightbox[325]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/dent_in_mesh-150x150.PNG" alt="dent_in_mesh" title="dent_in_mesh" width="150" height="150" class="alignnone size-thumbnail wp-image-363" /></a><a href="http://www.morethantechnical.com/wp-content/uploads/2009/07/dent_diagram.png" rel="lightbox[325]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/dent_diagram-300x120.png" alt="dent_diagram" title="dent_diagram" width="300" height="120" class="alignleft size-medium wp-image-372" /></a></p>
<pre class="brush: plain;">
	public static void makeADentInTheMesh(Vertex vtx, float amount, IWorldMeshHandler worldMeshHandler) {
		//Make a dent in the mesh...
		Set&lt;Edge&gt; es = vtx.get2RingOfEdges();
		for (Edge ed : es) {
			Vector3D v = new Vector3D(ed.a.x,ed.a.y,ed.a.z);
			v = v.minus(ed.a.getNormal().multiply(amount));
			ed.a.x = v.x; ed.a.y = v.y; ed.a.z = v.z;
			v = new Vector3D(ed.b.x,ed.b.y,ed.b.z);
			v = v.minus(ed.b.getNormal().multiply(amount));
			ed.b.x = v.x; ed.b.y = v.y; ed.b.z = v.z;

			//If there's stuff on the edge, move it accordingly
			if(ed.playerOnEdge != null) {
				ed.playerOnEdge.getModel().fixLocation();
			}
			if(ed.packageOnEdge != null) {
				Utils.fixObjLocationToEdge(ed.packageOnEdge.getModel(), ed);
			}
			if(ed.treeOnEdge != null) {
				Utils.fixObjLocationToEdge(ed.treeOnEdge.getModel(), ed);
			}
		}

		//Recalculate normals - the positions have changed, creating new &quot;up&quot; directions
		for (Edge ed : es) {
			ed.a.calcNormal();
			ed.b.calcNormal();
		}

		//reset the display list of the world mesh since the vertices and faces have changed
		worldMeshHandler.getWorld().getRenderer().setResetList();
	}
</pre>
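<p>One thing to watch for in the loop above: a vertex that is shared by several edges in the set is visited once per edge, so as written it seems to be pushed down more than once. Below is a minimal sketch of one way to move each vertex exactly once. It is not the game&#8217;s code: the class name DentSketch and the plain-array mesh layout (positions, normals, and edges given as pairs of vertex indices) are assumptions made only to keep the example self-contained.</p>
<pre class="brush: plain;">
final class DentSketch {
	// Lowers every vertex referenced by the given edges exactly once, against its normal.
	// positions[v] and normals[v] are {x, y, z}; each edge is a pair of vertex indices.
	static void dent(double[][] positions, double[][] normals, int[][] edges, double amount) {
		boolean[] moved = new boolean[positions.length];
		for (int[] edge : edges) {
			for (int v : edge) {
				if (moved[v]) {
					continue; // shared with an earlier edge, already lowered
				}
				moved[v] = true;
				for (int axis = 0; axis != 3; axis++) {
					positions[v][axis] -= normals[v][axis] * amount;
				}
			}
		}
	}
}
</pre>
<p>With three vertices joined in a triangle and an amount of 0.5, every vertex ends up 0.5 units below where it started, no matter how many edges it belongs to.</p>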
<h2>If pigs (missiles) could fly&#8230;</h2>
<p>So far I&#8217;ve covered mesh-oriented movement. Missiles, however, are not mesh-bound &#8211; they fly around &#8220;freely&#8221; above the mesh. To imitate gravity and a &#8220;steep course of flight&#8221; for the missiles, we use <a href="http://en.wikipedia.org/wiki/B%C3%A9zier_curve">Bezier curves</a> with either 3 or 4 control points.<br />
To calculate a point on the curve, all you need is the time since launch, divided by the expected flight time to get a curve parameter that runs from 0 to 1.</p>
<pre class="brush: plain;">
	protected Vector3D[] mBezierMultiplyV0_V3 = null;

	protected Vector3D V0;
	protected Vector3D V1;
	protected Vector3D V2;
	protected Vector3D V3;

	protected void calcExpectedFlyTime() {
		float distToMove = V3.minus(V0).getNorm(); //total distance to cover
		expectedFlyTime = distToMove / flySpeed;  //approximate time of flight
	}

	protected Vector3D getCurrentCurveLocation(float u) {
		Vector3D out = null;
		out = mBezierMultiplyV0_V3[0].multiply(u * u * u);
		out = out.add(mBezierMultiplyV0_V3[1].multiply(u * u));
		out = out.add(mBezierMultiplyV0_V3[2].multiply(u));
		out = out.add(mBezierMultiplyV0_V3[3]);

		return out;
	}

	private void advanceMissile() {
		float u = timeFromLaunch / expectedFlyTime;
		Vector3D newLocation = this.getCurrentCurveLocation(u);
		this.location = newLocation;

		//calc the change of angle
		this.dirForward = newLocation.minus(this.prevLocation).getNormalized();
		this.dirUp = ((this.dirForward).crossProduct(this.dirLeft)).getNormalized(); //the cross product of forward and left is the new up direction

		this.prevLocation = newLocation;
	}
</pre>
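<p>The snippet above reads its polynomial coefficients from mBezierMultiplyV0_V3 but does not show how that array is filled. For the 4-point (cubic) case, expanding the standard Bernstein form of the curve gives the four coefficients below. This is only a sketch under assumptions: it uses plain double arrays instead of Vector3D, the class BezierCoefficients and its method names are made up for the example, and the game may well compute the array differently.</p>
<pre class="brush: plain;">
final class BezierCoefficients {
	// Returns {a, b, c, d} such that B(u) = a*u^3 + b*u^2 + c*u + d
	// for the cubic Bezier curve with control points v0..v3 (each {x, y, z}).
	static double[][] coefficients(double[] v0, double[] v1, double[] v2, double[] v3) {
		double[][] k = new double[4][3];
		for (int i = 0; i != 3; i++) {
			k[0][i] = -v0[i] + 3 * v1[i] - 3 * v2[i] + v3[i];
			k[1][i] = 3 * v0[i] - 6 * v1[i] + 3 * v2[i];
			k[2][i] = -3 * v0[i] + 3 * v1[i];
			k[3][i] = v0[i];
		}
		return k;
	}

	// Evaluates the polynomial at u in [0, 1], in the same order as getCurrentCurveLocation.
	static double[] pointAt(double[][] k, double u) {
		double[] p = new double[3];
		for (int i = 0; i != 3; i++) {
			p[i] = k[0][i] * u * u * u + k[1][i] * u * u + k[2][i] * u + k[3][i];
		}
		return p;
	}
}
</pre>
<p>A quick sanity check for coefficients like these: at u = 0 the curve must sit exactly on V0, and at u = 1 exactly on V3.</p>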
<h2>Get ready for impact!</h2>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/07/missile_hit.png" rel="lightbox[325]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/missile_hit-300x155.png" alt="missile_hit" title="missile_hit" width="300" height="155" class="alignleft size-medium wp-image-367" /></a>I&#8217;ve talked about what happens when a missile hits, and about the missile&#8217;s course, but how do we know when the missile hits the ground?<br />
We implemented this with a KD-tree over all the vertices of the world mesh, which we query for the vertex closest to the missile&#8217;s tip. Once the missile is close enough, we take the dot product of the vertex normal with the vector from that vertex to the tip &#8211; when it turns negative, the tip is below the surface and the missile has hit the ground.</p>
<pre class="brush: plain;">
	protected void checkForHitWithMesh() {
		Vector3D tipLocation = location.add(dirForward.multiply(distToTip));
		double[] key = {tipLocation.x, tipLocation.y, tipLocation.z};

		//check with KD Tree to get the nearest vertex
		Vertex closestVertex = null;
		try {
			closestVertex = VertexKdTree.getVertexKdTree().nearest(key);
		} catch (KeySizeException e) {
			e.printStackTrace();
			return;
		}

		//check if missile is lost in space (more than 1 unit away from the nearest vertex)...
		Vector3D closestVertLocationWorld = aimer.getTransformToMeshLocation().transform(closestVertex.getVector3D());
		if (tipLocation.minus(closestVertLocationWorld).getNorm() &gt; 1) {
			projectileHandler.setCurrentProjectile(null);
			flightFinished = true;
			objNModeHandler.nextMode();
			return;
		}

		//Check if missile is inside ground: the tip is below the surface when the
		//vector from the nearest vertex to the tip points against the vertex normal
		if ((closestVertex.getNormal()).innerProduct(tipLocation.minus(closestVertLocationWorld)) &lt; 0) {
			flightFinished = true;
			hitMeshAtVertex(closestVertex);
			projectileHandler.setCurrentProjectile(null);
		}
	}
</pre>
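<p>The sign test at the end of the function can be looked at in isolation. The sketch below computes the same dot product with plain double arrays. The class SurfaceSide and the method signedHeight are hypothetical names, not part of the game, and the result is only a true distance if the normal has unit length.</p>
<pre class="brush: plain;">
final class SurfaceSide {
	// Dot product of the normal with the vector from the vertex to the tip.
	// Positive: the tip is on the side the normal points to (above ground).
	// Negative: the tip is on the other side (inside the ground).
	static double signedHeight(double[] tip, double[] vertex, double[] normal) {
		double h = 0;
		for (int i = 0; i != 3; i++) {
			h += normal[i] * (tip[i] - vertex[i]);
		}
		return h;
	}
}
</pre>
<p>For a vertex at the origin with an upward normal (0, 1, 0), a tip at height 0.2 gives a positive value and a tip at height -0.1 gives a negative one, regardless of how far the tip is off to the side.</p>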
<h2>Particle Shmarticle</h2>
<p><a href="http://www.morethantechnical.com/wp-content/uploads/2009/07/particles_explosion.PNG" rel="lightbox[325]"><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/particles_explosion-300x221.PNG" alt="particles_explosion" title="particles_explosion" width="300" height="221" class="aligncenter size-medium wp-image-369" /></a><br />
<a href="http://en.wikipedia.org/wiki/Particle_system">Particle systems</a> are a decent way to simulate smoke, fire, water splashes and anything &#8220;particley&#8221;. Once again we took some C code from the net (I can&#8217;t remember where, but it was GPLed) and ported it to Java.<br />
The only problem is that OpenGL can&#8217;t really display particles as such; you need to draw something that has some kind of surface, or a line. Other options are drawing GL_POINTS or GLU spheres, but these either don&#8217;t look pleasing or are very costly in terms of performance. So we used GL_TRIANGLE_STRIP to draw small rectangles of random sizes as the particles.</p>
<pre class="brush: plain;">
	public void drawParticles(GL gl) {
		gl.glEnable(GL.GL_BLEND);
		gl.glBlendFunc(GL.GL_SRC_ALPHA, GL.GL_ONE_MINUS_SRC_ALPHA);
		gl.glDisable(GL.GL_LIGHTING);

		for (int i = 0; i &lt; MAX_PARTICLES; i++) {
			// Each particle is handled differently depending on whether it's
			// alive or not.
			Particle particle = m_aParticles[i];
			if (particle.isAlive()) {
				// This particular particle is alive.
				handleLiveParticle(gl, particle);
			} else {
				// This particular particle is dead.
				handleDeadParticle(gl, particle);
			}
		}
		gl.glDisable(GL.GL_BLEND);
		gl.glEnable(GL.GL_LIGHTING);
	}

	private void handleLiveParticle(GL gl, Particle particle) {
		// The current location of the particle; need to account for the
		// zoom distance so the user can zoom in and out of the particles.
		float x = particle.getXLocation();
		float y = particle.getYLocation();
		float z = particle.getZLocation();

		// Set the color to draw this particle. The particle's life value
		// will act as the alpha.
		gl.glColor4f(particle.getRed(), particle.getGreen(),
				particle.getBlue(), particle.getLife());
		// Draw the particle using triangle strips.
		gl.glBegin(GL.GL_TRIANGLE_STRIP);
		// Create the four vertices of the particle quad; its size is random
		// and shrinks as the particle's life runs out.
		float pSize = (float) Math.random() * particle.getLife();
		float r = (float) Math.random() - pSize;
		gl.glVertex3f(x + pSize, y + pSize, z + r);
		gl.glVertex3f(x - pSize, y + pSize, z + r);
		gl.glVertex3f(x + pSize, y - pSize, z - r);
		gl.glVertex3f(x - pSize, y - pSize, z - r);
		gl.glEnd();

		// Update the particles' properties.
		updateParticle(particle);
	}
</pre>
<p>Note that we enable blending while drawing the &#8220;particles&#8221;, since each particle&#8217;s life value is used as its alpha and the color fades as the particle nears death.</p>
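<p>The update step itself is not shown in this post. As a rough illustration of how a life value can double as alpha, here is a hypothetical sketch. The class ParticleLife, the constant per-frame fade and the clamping at zero are all assumptions, and the ported code may handle this differently.</p>
<pre class="brush: plain;">
final class ParticleLife {
	float life = 1.0f;  // 1 = just born, 0 = dead; also usable as the alpha value
	final float fade;   // life lost per frame (assumed constant in this sketch)

	ParticleLife(float fade) {
		this.fade = fade;
	}

	// Called once per drawn frame; clamps at zero so the alpha never goes negative.
	void update() {
		life = Math.max(0f, life - fade);
	}

	boolean isAlive() {
		return life != 0f;
	}
}
</pre>
<p>With a fade of 0.25 such a particle is drawn at full opacity when born, gets more transparent on each of the next three frames, and is dead after the fourth update.</p>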
<h2>Sum up</h2>
<p>OK, I&#8217;ve tried to share most of the interesting points in the making of the game. The code is downloadable from the <a href="http://code.google.com/p/taucomputergraphics09/">Google code SVN repo</a>. And here&#8217;s a short video explaining some aspects of playing the game:<br />
<object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/XbQ0Qd3gHZM&#038;hl=en&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/XbQ0Qd3gHZM&#038;hl=en&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<p>Thanks for tuning in!<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2009/07/27/advanced-issues-in-3d-game-building-with-jogl-openglswt-w-code-video/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Augmented reality on the iPhone using NyARToolkit [w/ code]</title>
		<link>http://www.morethantechnical.com/2009/07/01/augmented-reality-on-the-iphone-using-nyartoolkit-w-code/</link>
		<comments>http://www.morethantechnical.com/2009/07/01/augmented-reality-on-the-iphone-using-nyartoolkit-w-code/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 11:37:26 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[3d]]></category>
		<category><![CDATA[Mobile phones]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[augmented reality]]></category>
		<category><![CDATA[iphone]]></category>
		<category><![CDATA[nyartoolkit]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=312</guid>
		<description><![CDATA[Hi I saw the stats for the blog a while ago and it seems that the augmented reality topic is hot! 400 clicks/day, that&#8217;s awesome! So I wanted to share with you my latest development in this field &#8211; cross compiling the AR app to the iPhone. A job that proved easier than I originally [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/nyarrr-218x300.png" alt="nyarrr" title="nyarrr" width="218" height="300" class="alignleft size-medium wp-image-319" />Hi</p>
<p>I saw the stats for the blog a while ago and it seems that the augmented reality topic is hot! 400 clicks/day, that&#8217;s awesome!</p>
<p>So I wanted to share with you my latest development in this field &#8211; cross-compiling the AR app to the iPhone. The job proved easier than I originally thought, although it took a while to get it working smoothly.</p>
<p>Basically all I did was take NyARToolkit, compile it for the armv6 architecture, combine it with Norio Nomura&#8217;s iPhone camera video feed code, slap on some simple OpenGL ES rendering, and bam &#8211; Augmented Reality on the iPhone.</p>
<p><strong>Update</strong>: Apple officially supports camera video pixel buffers in iOS 4.x using AVFoundation, <a href="http://developer.apple.com/iphone/library/qa/qa2010/qa1702.html#TNTAG1">here&#8217;s</a> sample code from Apple developer.</p>
<p>This is how I did it&#8230;<br />
<span id="more-312"></span><br />
I recommend you read my <a href="http://www.morethantechnical.com/2009/06/28/augmented-reality-with-nyartoolkit-opencv-opengl/">last post</a> on this matter. It has some insights, however superficial, into working with the C++ implementation of NyARToolkit, which I also use here.</p>
<h2>Getting NyARToolkit C++ to compile on iPhone</h2>
<p>First of all, I needed to cross-compile NyARToolkit for the iPhone&#8217;s CPU architecture (ARM), but this was a very simple task &#8211; it just compiled right off the bat! No tweaking done whatsoever.<br />
But that&#8217;s only the beginning, as iPhone apps are built using Objective-C and not C++ (maybe they can use C++, but all the documentation is in Obj-C). So I needed to write an Obj-C wrapper around NyARTk to allow my iPhone app to interact with it.</p>
<p>I only needed a very small set of functions out of NyARTk to get augmented reality &#8211; those that have to do with marker detection. I ended up with a lean API:</p>
<pre class="brush: plain;">
@interface NyARToolkitWrapper : NSObject {
	bool wasInit;
}

-(void)initNyARTwithWidth:(int)width andHeight:(int)height;
-(bool)detectMarker:(float[])resultMat;
-(void)setNyARTBuffer:(Byte*)buf;
-(void)getProjectionMatrix:(float[])m;

@end
</pre>
<p>I also have some functions I used for debugging and for non-optimized stages. The inner workings of the wrapper are not very interesting (you can see them in the code yourself); they mainly invoke NyARSingleDetectMarker functions.</p>
<h2>In the beginning &#8211; there was only marker detection</h2>
<p>OK, to get AR basically what I need to do is:</p>
<ol>
<li>initialize NyARTk inner structs</li>
<li>set NyARTk&#8217;s RGBA buffer with each frame&#8217;s pixels</li>
<li>get the extrinsic parameters of the camera, and draw the OpenGL scene accordingly</li>
</ol>
<p>This is for full-fledged AR, but let me start with a simpler case &#8211; detecting the marker in a single image read from a file. No OpenGL, no camera. Just reading the file&#8217;s pixel data and feeding it to NyARTk.</p>
<p>Now this is far simpler:</p>
<pre class="brush: plain;">
CGImageRef img = [[UIImage imageNamed:@&quot;test_marker.png&quot;] CGImage];
int width = CGImageGetWidth(img);
int height = CGImageGetHeight(img);
Byte* brushData = (Byte *) malloc(width * height * 4);
CGContextRef cgctx = CGBitmapContextCreate(brushData, width, height, 8, width * 4, CGImageGetColorSpace(img), kCGImageAlphaPremultipliedLast);
CGContextDrawImage(cgctx, CGRectMake(0, 0, (CGFloat)width, (CGFloat)height), img);
CGContextRelease(cgctx);

[nyartwrapper initNyARTwithWidth:width andHeight:height];
[nyartwrapper setNyARTBuffer:brushData];
[nyartwrapper detectMarker:ogl_camera_matrix];
</pre>
<p>First I read the image into a UIImage, then get its respective CGImage. But what I need are bytes, so I create a temporary CGBitmapContext, draw the image into it and use the context&#8217;s pixel data (allocated by me).</p>
<h2>Adding the 3D rendering</h2>
<p>This is nice, but nothing is shown to the screen, which sux. So the next step will be to create an OpenGL scene, and draw some 3D using the calibration we now have. To do this I used <a href="http://developer.apple.com/iphone/library/samplecode/GLSprite/listing2.html">EAGLView from Apple&#8217;s OpenGL ES docs</a>.<br />
This view will set up an environment for drawing a 3D scene, giving you a delegate to do the actual drawing while hiding all the peripheral code (frame buffers&#8230; and other creatures you wouldn&#8217;t want to meet in a dark 3D alley scene).</p>
<p>All I needed to implement in my code were two functions defined in the protocol:</p>
<pre class="brush: plain;">
@protocol _DGraphicsViewDelegate&lt;NSObject&gt;

@required

// Draw with OpenGL ES
-(void)drawView:(_DGraphicsView*)view;

@optional
-(void)setupView:(_DGraphicsView*)view;

@end
</pre>
<p>&#8216;setupView&#8217; will initialize the scene, and &#8216;drawView&#8217; will draw each frame. In setupView we&#8217;ll have the viewport setting, lighting, generating texture buffers etc. You can see all that in the code, it&#8217;s not very interesting&#8230;</p>
<p>In drawView we&#8217;ll draw the background and the 3D scene. Now this took some trickery. First I thought I&#8217;d take the easy route: make the 3D scene transparent, draw the background using a simple UIView of some kind, and overlay the 3D over it. I didn&#8217;t manage to get it to work, so I took a different path (harder? don&#8217;t know) and decided to paint the background over a 3D plane, in the 3D scene itself, using textures. This is how I did it in all my AR apps on other devices.<br />
Now, the camera video feed is 304&#215;400 pixels, and OpenGL textures are best optimized at power-of-2 sizes, so I created a 512&#215;512 texture. But for now we&#8217;re talking about a single frame.</p>
<pre class="brush: plain;">
const GLfloat spriteTexcoords[] = {0,0.625f,   0.46f,0.625f,   0,0,   0.46f,0,};
const GLfloat spriteVertices[] =  {0,0,0,   1,0,0,   0,1,0   ,1,1,0};

glMatrixMode(GL_PROJECTION);
glPushMatrix();
glLoadIdentity();
glOrthof(0, 1, 0, 1, -1000, 1);
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
glLoadIdentity();

// Sets up pointers and enables states needed for using vertex arrays and textures
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, spriteVertices);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, spriteTexcoords);	

glBindTexture(GL_TEXTURE_2D, spriteTexture);
glEnable(GL_TEXTURE_2D);

glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);

glMatrixMode(GL_PROJECTION);
glPopMatrix();
glMatrixMode(GL_MODELVIEW);
glPopMatrix();
</pre>
<p>Basically, I go into orthographic mode and draw a rectangle with the texture on it, nothing fancy.</p>
<p>Next up &#8211; drawing the perspective part of the scene, the part that aligns with the actual camera&#8230;</p>
<pre class="brush: plain;">
//Load the projection matrix (intrinsic parameters)
glMatrixMode(GL_PROJECTION);
glLoadMatrixf(ogl_projection_matrix);

//Load the &quot;camera&quot; matrix (extrinsic parameters)
glMatrixMode(GL_MODELVIEW);
glLoadMatrixf(ogl_camera_matrix);

glLightfv(GL_LIGHT0, GL_POSITION, lightPosition);
glEnable(GL_LIGHTING);
glEnable(GL_LIGHT0);

glDisable(GL_TEXTURE_2D);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);

glPushMatrix();	

glScalef(kTeapotScale, kTeapotScale, kTeapotScale);

{
        static GLfloat spinZ = 0.0;
        glRotatef(spinZ, 0.0, 0.0, 1.0);
        glRotatef(90.0, 1.0, 0.0, 0.0);
        spinZ += 1.0;
}

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);
glVertexPointer(3 ,GL_FLOAT, 0, teapot_vertices);
glNormalPointer(GL_FLOAT, 0, teapot_normals);
glEnable(GL_NORMALIZE);

for(int i = 0; i &lt; num_teapot_indices; i += new_teapot_indicies[i] + 1)
{
        glDrawElements(GL_TRIANGLE_STRIP, new_teapot_indicies[i], GL_UNSIGNED_SHORT, &amp;new_teapot_indicies[i+1]);
}

glPopMatrix();
</pre>
<p>For this part, too, I learned from Apple&#8217;s OpenGL ES docs (find it <a href="https://developer.apple.com/iphone/library/samplecode/GLGravity/listing2.html">here</a>). I ended up with this:<br />
<img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/Picture-5-161x300.png" alt="Picture 5" title="Picture 5" width="161" height="300" class="alignnone size-medium wp-image-315" /></p>
<h2>Tying it together with the camera</h2>
<p>This runs on the simulator, since the camera is not involved just yet. I used it to fix the lighting and such before moving to the device. But we&#8217;re here to get it to work on the device, so next I plugged in the <a href="http://github.com/norio-nomura/iphonetest/tree/9713242dda6c6bc897da4bd639a1fdadc29b6fd7/CameraTest">code from Norio Nomura</a>.<br />
Some people have asked me to post up a working version of Nomura&#8217;s code, so you can get it with the code for this app (scroll down). Nomura was kind enough to make it public under MIT license.</p>
<p>First, I set up a timer to fire at ~11 fps, and initialize the camera hook to grab the frames from the internal buffers:</p>
<pre class="brush: plain;">
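// 0.0909 s between ticks is roughly 11 ticks per second; the selector must match the method that handles each tick
repeatingTimer = [NSTimer scheduledTimerWithTimeInterval:0.0909 target:self selector:@selector(load2DTexWithBytes:) userInfo:nil repeats:YES];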

ctad = [[CameraTestAppDelegate alloc] init];
[ctad doInit];
</pre>
<p>And then I take the pixel data and use it for the background texture and the marker detection:</p>
<pre class="brush: plain;">
-(void)load2DTexWithBytes:(NSTimer*) timer {
	if([ctad getPixelData] != NULL) {
		CGSize s = [ctad getVideoSize];
		glBindTexture(GL_TEXTURE_2D, spriteTexture);
		glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, s.width, s.height, GL_BGRA, GL_UNSIGNED_BYTE, [ctad getPixelData]);

		if(![nyartwrapper wasInit]) {
			[nyartwrapper initNyARTwithWidth:s.width andHeight:s.height];
			[nyartwrapper getProjectionMatrix:ogl_projection_matrix];

			[nyartwrapper setNyARTBuffer:[ctad getPixelData]];
		}

		[nyartwrapper detectMarker:ogl_camera_matrix];
	}
}
</pre>
<p>All this happens 11 times per second, so it has to stay lean.</p>
<h2>Video proof time&#8230;</h2>
<p>Well, looks like we are pretty much done! Time for a video&#8230;<br />
<object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/0DzJVtj-klY&#038;hl=en&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/0DzJVtj-klY&#038;hl=en&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<h2>How did you get the phone to stand still so nicely?</h2>
<p>An important issue&#8230; when it comes to shooting the phone w/o holding it.<br />
Well, I used one of those little pieces of metal that block the PCI slots in a PC case. In Hebrew we call these metal scraps &#8220;Flakch&#8221;s (don&#8217;t try to pronounce this at home). I bent it in the middle to create a kind of &#8220;leg&#8221;, and the ledge to hold the phone already exists.<br />
<img src="http://www.morethantechnical.com/wp-content/uploads/2009/07/IMG_0023-225x300.png" alt="metal iPhone stand" title="metal iPhone stand" width="225" height="300" class="alignnone size-medium wp-image-316" /></p>
<h2>The code</h2>
<p>As promised, <a href="http://code.google.com/p/morethantechnical/source/browse/#svn/trunk/NyARToolkit-iPhone">here&#8217;s the code</a> (I omitted some files whose license is questionable).</p>
<p>That&#8217;s all folks!<br />
See you when I get this to work on the Android&#8230;<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2009/07/01/augmented-reality-on-the-iphone-using-nyartoolkit-w-code/feed/</wfw:commentRss>
		<slash:comments>90</slash:comments>
		</item>
	</channel>
</rss>
