<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>More Than Technical &#187; video</title>
	<atom:link href="http://www.morethantechnical.com/category/video/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.morethantechnical.com</link>
	<description>On software, code, the internet and more.</description>
	<lastBuildDate>Sun, 05 Feb 2012 07:04:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
<atom:link rel="hub" href="http://pubsubhubbub.appspot.com"/><atom:link rel="hub" href="http://superfeedr.com/hubbub"/>		<item>
		<title>Spherical harmonics face relighting using OpenCV, OpenGL [w/ code]</title>
		<link>http://www.morethantechnical.com/2011/12/20/spherical-harmonics-face-relighting-using-opencv-opengl-w-code/</link>
		<comments>http://www.morethantechnical.com/2011/12/20/spherical-harmonics-face-relighting-using-opencv-opengl-w-code/#comments</comments>
		<pubDate>Tue, 20 Dec 2011 00:59:34 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[3d]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[gui]]></category>
		<category><![CDATA[opencv]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[school]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[glsl]]></category>
		<category><![CDATA[harmonics]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[recoloring]]></category>
		<category><![CDATA[relighting]]></category>
		<category><![CDATA[shaders]]></category>
		<category><![CDATA[spherical]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=948</guid>
		<description><![CDATA[Implementing a face image relighting algorithm using spherical harmonics, based on a paper written by Wang et al (2007).]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2011/12/Screen-shot-2011-12-19-at-8.13.27-PM.png" rel="lightbox[948]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/12/Screen-shot-2011-12-19-at-8.13.27-PM-300x130.png" alt="" title="Spherical harmonics face relighting" width="300" height="130" class="alignleft size-medium wp-image-1015" /></a>Hi!<br />
I&#8217;ve been working on implementing a face image relighting algorithm using spherical harmonics, one of the most elegant methods I&#8217;ve seen lately.<br />
I start by aligning a face model with OpenGL to automatically get the canonical face normals, which brushed up my knowledge of GLSL. Then I move on to estimating a real face&#8217;s &#8220;spharmonics&#8221; and relighting it.</p>
<p>Let&#8217;s start!<br />
<span id="more-948"></span></p>
<h2>Some mathematical background</h2>
<p>Don&#8217;t worry, it won&#8217;t hurt. Much.</p>
<p>Spherical harmonics were invented to numerically express a whole bunch of things in physics, like gravity and magnetic fields. But they also became very useful for computer graphics, as they are perfect for modelling light falling on a spherical body.</p>
<h3>But what ARE those mysterious spherical harmonics? </h3>
<p>The way I see it, they are a series of &#8220;modes&#8221; or &#8220;eigenvectors&#8221; or &#8220;orthogonal components&#8221; of a basis that spans the functions defined on the surface of a sphere.<br />
To put it simply, they describe a function on the sphere in increasingly finer-grained portions. Much like a Fourier decomposition of a function, there is a basis and there are coefficients that, when multiplied with the basis functions and summed, recover the function.</p>
<h3>How is that good for graphics? </h3>
<p>People have used spherical harmonics mostly to model lighting of spherical objects. When you know the coefficients that describe the lighting, you can change them to <i>Re-light</i> an object, or <i>De-light</i>, or transfer the lighting conditions of one scene to another. Very useful!</p>
<p>Some good researchers, Basri and Jacobs, formulated the first 9 harmonics as a function of the surface normal back in 2001. Basri references all his work on the subject on this page: <a href="http://www.wisdom.weizmann.ac.il/~ronen/index_files/harmonic.html" target="_blank">http://www.wisdom.weizmann.ac.il/~ronen/index_files/harmonic.html</a> </p>
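<p>To make &#8220;a function of the surface normal&#8221; concrete, here is a minimal sketch of the first nine real spherical harmonics evaluated at a unit normal. The function name, the ordering of the terms and the plain (unscaled) constants are assumptions made for illustration; the papers and the code in the repository may order or scale the terms differently.</p>

```cpp
#include <array>
#include <cassert>
#include <cmath>

// First nine real spherical harmonics evaluated at a unit normal (x, y, z).
// Assumed order: Y00, Y1-1, Y10, Y11, Y2-2, Y2-1, Y20, Y21, Y22.
// Hypothetical helper, not the function used in the post's repository.
std::array<float, 9> sphericalHarmonics9(float x, float y, float z) {
    // The formulas below are only valid for unit-length normals.
    assert(std::fabs(x * x + y * y + z * z - 1.0f) < 1e-3f);
    return {{
        0.282095f,                          // l = 0: constant (ambient) term
        0.488603f * y,                      // l = 1: linear in the normal
        0.488603f * z,
        0.488603f * x,
        1.092548f * x * y,                  // l = 2: quadratic in the normal
        1.092548f * y * z,
        0.315392f * (3.0f * z * z - 1.0f),
        1.092548f * x * z,
        0.546274f * (x * x - y * y)
    }};
}
```

<p>For a normal pointing straight at the camera, (0, 0, 1), only the constant term and the two terms that depend on z alone are non-zero.</p>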
<p>But I prefer to reference a work that&#8217;s easier to digest than Basri&#8217;s: the work of Wang et al. from 2007. These guys made the steps for using spherical harmonics easier to follow: <a href="http://research.microsoft.com/en-us/um/people/zliu/cvpr2007.pdf" title="http://research.microsoft.com/en-us/um/people/zliu/cvpr2007.pdf" target="_blank">http://research.microsoft.com/en-us/um/people/zliu/cvpr2007.pdf</a>.<br />
Their algorithm is quite advanced, as it solves not only for the harmonics&#8217; coefficients but also for the normals of the object in the image. They use some fancy optimization of an energy function over a graph, which I&#8217;m not going to discuss.<br />
But they did make the process of finding the spherical harmonics&#8217; coefficients very clear.</p>
<h4>The bottom line</h4>
<p>We should solve for a vector of 9 coefficients that describes the &#8220;lighting of the object&#8221; (a face in our case).<br />
Each coefficient tells us how strong or weak that specific harmonic is, or in other words how strongly lit the corresponding region of the object is.</p>
<p>Wang and Basri show a very simple method of using simultaneous linear equations to solve for the lighting coefficients; it depends only on knowing the normal of the object&#8217;s surface at each pixel in the image.</p>
<h2>Getting the normals of a canonical face</h2>
<p>So to get the normals, I thought the best way is to use a canonical model of a face (some kind of an average face), instead of trying to recover the normals from the image pixels.<br />
To that end, I used Rhino3D to model (very roughly) a shape that resembles a human face, starting from an elongated sphere.<br />
Now all that&#8217;s left is to align the model with the face to relight, and that will supply the normals.<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2011/12/snapshot00.png" rel="lightbox[948]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/12/snapshot00-300x224.png" alt="" title="rough model of a human face" width="300" height="224" class="alignleft size-medium wp-image-1011" /></a><br />
Cool. Then I built a small app that allows the user to move the model around until it&#8217;s aligned with the face image. I used <a href="http://www.fltk.org/" target="_blank">FLTK 3.0</a> to do it since it has a simple interface with OpenGL and is cross-platform and lightweight.<br />
So I set up a scene where I have the image as the background, and the model is floating above it, half transparent so the user can find the right spot. I added functions for rotating the model, and extra stuff like turning the model opaque.</p>
<p style="text-align: center">
<iframe width="480" height="360" src="http://www.youtube.com/embed/wIwAX2UM64E" frameborder="0" allowfullscreen></iframe>
</p>
<p>To get the normal map I used a very simple GLSL shader that colors each pixel with the value of its normal: nX,nY,nZ -&gt; R,G,B.<br />
This way the image OpenGL renders is simply the normal map of the face model. I just grab it using glReadPixels.</p>
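<p>Reading the normals back from an 8-bit image means they were squeezed into the [0, 255] range at some point. A minimal sketch of one common packing and its inverse is shown here; the scale-and-offset convention (n * 0.5 + 0.5) and both function names are assumptions, and the shader in the repository may pack the values differently.</p>

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Pack a normal with components in [-1, 1] into 8-bit R, G, B.
// Assumed convention: c = (n * 0.5 + 0.5) * 255, rounded to nearest.
std::array<std::uint8_t, 3> encodeNormal(float nx, float ny, float nz) {
    const float n[3] = {nx, ny, nz};
    std::array<std::uint8_t, 3> rgb{};
    for (int i = 0; i < 3; ++i) {
        assert(n[i] >= -1.0f && n[i] <= 1.0f);
        rgb[i] = static_cast<std::uint8_t>(
            std::lround((n[i] * 0.5f + 0.5f) * 255.0f));
    }
    return rgb;
}

// Recover an approximate normal from the 8-bit colour and renormalise it,
// since quantisation leaves it slightly off unit length.
std::array<float, 3> decodeNormal(const std::array<std::uint8_t, 3>& rgb) {
    std::array<float, 3> n{};
    float len2 = 0.0f;
    for (int i = 0; i < 3; ++i) {
        n[i] = rgb[i] / 255.0f * 2.0f - 1.0f;
        len2 += n[i] * n[i];
    }
    const float len = std::sqrt(len2);
    if (len > 0.0f) {
        for (float& c : n) c /= len;
    }
    return n;
}
```

<p>The round trip is lossy: a zero component lands on 128 rather than exactly mid-range, which is why the decoder renormalises.</p>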
<h2>Estimating spherical harmonics</h2>
<p>So, after the model is aligned, we can assume we have the normals ready for us for each pixel in the image, and the intensity in each pixel is also known.<br />
The first step that Wang suggests, without knowledge of the real face albedo (the real color of every pixel without any lighting effect), is to get an approximation of the 9-vector of lighting coefficients by setting a constant albedo. Easy enough, we can set the albedo to the average color in the face.<br />
Then we can simply build a huge set of linear equations (one equation per face pixel in the image), and solve an overdetermined system to get the 9 coefficients.</p>
<pre class="brush: plain; title: ; notranslate">
		Scalar albedo_constant = mean(face_img_hsv, smallFaceMask);

		//setup linear equation system, lighting coefficients (l) is unknown
		//I = p00 * Ht * l
		float p00 = (float)albedo_constant[2] / 255.0f;

		cout &lt;&lt; &quot;Build Ht(&quot;&lt;&lt;n&lt;&lt;&quot;,9)...&quot;;
		cout &lt;&lt; &quot;Build I(&quot;&lt;&lt;n&lt;&lt;&quot;,1)...&quot;;
		//build Ht and I
		Mat_&lt;float&gt; Ht(n,9);
		Mat_&lt;float&gt; I(n,1);
		int pos = 0;
		vector&lt;Mat_&lt;uchar&gt; &gt; face_img_chnls; split(face_img_hsv, face_img_chnls);
		for (int i=0; i&lt;normalMapFlat.rows; i++) {
			if (smallFaceMask(i) == 0) { //is this pixel on the face?
				continue;
			}
			Ht.row(pos) = p00 * calculateSphericalHarmonicsForNormal(normalMapFlat(i));
			I(pos,0) = face_img_chnls[2](i) / 255.0f; //get V from HSV of pixel [0,1]
			pos ++;
		}
		cout &lt;&lt; &quot;DONE&quot;  &lt;&lt; endl;

		cout &lt;&lt; &quot;Solve&quot; &lt;&lt;endl;
		solve(Ht, I, l, DECOMP_SVD);

		cout &lt;&lt; &quot;initial lighting coeffs: &quot;;
		for (int i=0; i&lt;l.rows; i++) {
			cout&lt;&lt;l.at&lt;float&gt;(i)&lt;&lt;&quot;,&quot;;
		}
</pre>
<p>Booyah! lighting coefficients.</p>
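<p>The code above hands the overdetermined system to OpenCV with DECOMP_SVD. To show what &#8220;solving an overdetermined system&#8221; boils down to, here is a minimal standalone sketch that uses the normal equations instead. This is a swapped-in technique for illustration only (SVD is the more numerically robust choice), and the function name is hypothetical.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Least-squares solution of A x = b for a tall matrix A (rows >= cols),
// via the normal equations (A^T A) x = A^T b and Gaussian elimination.
// A is row-major: A[r][c]. Not OpenCV's SVD-based solver.
std::vector<double> solveLeastSquares(const std::vector<std::vector<double>>& A,
                                      const std::vector<double>& b) {
    assert(!A.empty() && A.size() == b.size());
    const std::size_t n = A[0].size();
    // Augmented matrix [A^T A | A^T b], size n x (n + 1).
    std::vector<std::vector<double>> M(n, std::vector<double>(n + 1, 0.0));
    for (std::size_t r = 0; r < A.size(); ++r) {
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < n; ++j) M[i][j] += A[r][i] * A[r][j];
            M[i][n] += A[r][i] * b[r];
        }
    }
    // Forward elimination with partial pivoting.
    for (std::size_t col = 0; col < n; ++col) {
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(M[r][col]) > std::fabs(M[pivot][col])) pivot = r;
        std::swap(M[col], M[pivot]);
        assert(std::fabs(M[col][col]) > 1e-12);  // rank-deficient input is not handled
        for (std::size_t r = col + 1; r < n; ++r) {
            const double f = M[r][col] / M[col][col];
            for (std::size_t c = col; c <= n; ++c) M[r][c] -= f * M[col][c];
        }
    }
    // Back substitution.
    std::vector<double> x(n, 0.0);
    for (std::size_t i = n; i-- > 0;) {
        double s = M[i][n];
        for (std::size_t c = i + 1; c < n; ++c) s -= M[i][c] * x[c];
        x[i] = s / M[i][i];
    }
    return x;
}
```

<p>In the relighting setting each row of A would hold the 9 harmonics of one face pixel (scaled by the constant albedo) and b the pixel intensities, so x comes out as the 9 lighting coefficients.</p>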
<p>But this is only the first step. Now we can get an approximation of the albedo as well, using the coefficients:</p>
<pre class="brush: plain; title: ; notranslate">
		Mat_&lt;Vec3b&gt; face_img_v3b = face_img;

		#pragma omp parallel for schedule(dynamic)
		for (int y=0; y&lt;face_img.rows; y++) {
			for (int x=0; x&lt;face_img.cols; x++) {
				if (face_mask(y,x) == 0) {
					albedo(y,x) = 0;
					continue;
				}
				Mat sph = calculateSphericalHarmonicsForNormal(normalMap(y,x));
				Mat_&lt;float&gt; sph_l = sph * l;
				float fsph_l = sph_l(0);

				for (int cn = 0; cn&lt;3; cn++) {
					float fimg = face_img_v3b(y,x)[cn] / 255.0f;
					albedo(y,x)[cn] = (fimg / fsph_l);
				}
			}
		}
</pre>
<p>Done.<br />
Now that we have an initial albedo, Wang suggests we compute the coefficients again to get a better approximation, and then the albedo again.<br />
I however ran into some problems trying to do the second iteration, and the results always came out too dark&#8230; But even with the first iteration you can see a very nice change.<br />
Look at the video from before: you can see that the right side of the face, which is over-lit, was darkened and the left side was lit up.</p>
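<p>One detail of the albedo step is worth spelling out: the albedo is the pixel intensity divided by the predicted shading, and that shading can get close to zero or even negative. A minimal sketch of a guarded per-pixel division follows. The function name, the minShading floor and the clamp to [0, 1] are all assumptions for illustration; this is not a diagnosis of the dark second iteration mentioned above.</p>

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Albedo for one pixel: observed intensity divided by the predicted shading
// (the dot product of the 9 harmonics with the 9 lighting coefficients).
// minShading is a hypothetical floor that keeps the division stable.
float estimateAlbedo(float intensity,
                     const std::array<float, 9>& harmonics,
                     const std::array<float, 9>& lighting,
                     float minShading = 0.05f) {
    assert(minShading > 0.0f);
    float shading = 0.0f;
    for (int i = 0; i < 9; ++i) shading += harmonics[i] * lighting[i];
    // Near-zero or negative shading would blow up or flip the sign.
    shading = std::max(shading, minShading);
    // Assumed clamp: keep the result in [0, 1] so it can be stored as a colour.
    return std::min(std::max(intensity / shading, 0.0f), 1.0f);
}
```

<p>With a shading of 0.5, an intensity of 0.25 gives an albedo of 0.5, while an intensity of 0.9 saturates at 1.</p>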
<h2>Code</h2>
<p>The code for spherical harmonics analysis of images is part of a bigger project I have been working on for some time. I also spoke of it in a <a href="http://www.morethantechnical.com/2011/12/01/identity-transfer-in-photographs/" target="_blank">previous post</a>.<br />
Anyway it&#8217;s up in GitHub: <a href="https://github.com/royshil/HeadReplacement/tree/master/HeadReplacement" target="_blank">https://github.com/royshil/HeadReplacement/tree/master/HeadReplacement</a><br />
You&#8217;re looking for 4 files:</p>
<ul>
<li>SpharmonicsUI.cpp
<li>SpharmonicsUI.h
<li>spherical_harmonics_analysis.cpp
<li>spherical_harmonics_analysis.h
</ul>
<p>You can use the project&#8217;s CMakeLists.txt to compile, but here&#8217;s one that should take you there in one piece (fingers crossed):</p>
<pre class="brush: plain; title: ; notranslate">
find_package(OpenCV REQUIRED)
find_package(OpenGL REQUIRED)
find_package(OpenMP REQUIRED)

######## Find and add GLEE ########
file(GLOB_RECURSE GLEE_PATH &quot;${CMAKE_SOURCE_DIR}/GLee.c&quot;)
if(NOT GLEE_PATH) # file(GLOB_RECURSE) leaves the variable empty when nothing matches
	message(STATUS &quot;GLEE was not found&quot;)
else()
	list(LENGTH GLEE_PATH GLEE_PATH_LEN)
	if(GLEE_PATH_LEN GREATER 1)
		list(GET GLEE_PATH 1 GLEE_PATH)
	endif()
	file(RELATIVE_PATH GLEE_PATH ${CMAKE_SOURCE_DIR} ${GLEE_PATH})
	get_filename_component(GLEE_PATH ${GLEE_PATH} REALPATH)
	get_filename_component(GLEE_PATH ${GLEE_PATH} PATH)
	message(STATUS &quot;Found GLEE at ${GLEE_PATH}&quot;)
	add_library(GLEE ${GLEE_PATH}/GLee.c)
endif()

############ Find FLTK ############
if(NOT DEFINED FLTK_PATH)
	file(GLOB_RECURSE FLTK_PATH &quot;${CMAKE_SOURCE_DIR}/Widget.h&quot;)
	if(FLTK_PATH STREQUAL FLTK_PATH-NOTFOUND   OR   FLTK_PATH STREQUAL &quot;&quot;)
		message(STATUS &quot;FLTK was not found !!!!!&quot;)
	else()
		list(LENGTH FLTK_PATH FLTK_PATH_LEN)
		if(FLTK_PATH_LEN GREATER 1)
			list(GET FLTK_PATH 1 FLTK_PATH)
		endif()
		file(RELATIVE_PATH FLTK_PATH ${CMAKE_SOURCE_DIR} ${FLTK_PATH})
		get_filename_component(FLTK_PATH ${FLTK_PATH} REALPATH)
		get_filename_component(FLTK_PATH ${FLTK_PATH} PATH)
		message(STATUS &quot;Found FLTK at ${FLTK_PATH}&quot;)
	endif()
else()
	get_filename_component(FLTK_PATH ${FLTK_PATH} REALPATH)
	message(STATUS &quot;FLTK path set to ${FLTK_PATH}&quot;)
endif()
set(FLTK_INCLUDE_DIR ${FLTK_PATH}/include)
set(FLTK_LIB_DIR ${FLTK_PATH}/lib)

######## Relighting #######
include_directories(${FLTK_INCLUDE_DIR})
include_directories(${OPENGL_INCLUDE_DIR})
include_directories(${GLEE_PATH})
add_library(VirtualSurgeon_Relighting
	../HeadReplacement/glm.cpp
	../HeadReplacement/spherical_harmonics_analysis.cpp
	../HeadReplacement/LaplacianBlending.cpp
	../HeadReplacement/SpharmonicsUI.cpp
	../HeadReplacement/OGL_OCV_common.cpp
	)
</pre>
<p>Note that I had to resort to some very dark magic to recover the location of FLTK and GLEE&#8230; But it&#8217;s a jungle out there.</p>
<p>The source of the photograph is: <a href="http://www.flickr.com/photos/roel1943/309048020/" target="_blank">http://www.flickr.com/photos/roel1943/309048020/</a><br />
It is released under Creative Commons 2.0 ShareAlike-Attribution. So all the results here are also CC-2.0-SA-A&#8230; <img src='http://www.morethantechnical.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Enjoy,<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/12/20/spherical-harmonics-face-relighting-using-opencv-opengl-w-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Kinect browser plugin with FireBreath [w/ code]</title>
		<link>http://www.morethantechnical.com/2011/12/02/a-kinect-browser-plugin-with-firebreath-w-code/</link>
		<comments>http://www.morethantechnical.com/2011/12/02/a-kinect-browser-plugin-with-firebreath-w-code/#comments</comments>
		<pubDate>Fri, 02 Dec 2011 14:17:58 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[browser]]></category>
		<category><![CDATA[kinect]]></category>
		<category><![CDATA[plugin]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=996</guid>
		<description><![CDATA[Hi, Just reporting on a small achievement, part of a big project: Creating a browser plugin to display the Kinect depth map on screen. The integration was fairly easy, which leads me to think that both FireBreath and OpenNI/Nite are pretty neat framework that are robust.. So let&#8217;s see how it&#8217;s done From a template [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2011/12/Screen-shot-2011-12-02-at-9.12.03-AM.png" rel="lightbox[996]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/12/Screen-shot-2011-12-02-at-9.12.03-AM-150x150.png" alt="" title="Screen shot 2011-12-02 at 9.12.03 AM" width="150" height="150" class="alignleft size-thumbnail wp-image-1006" /></a>Hi,<br />
Just reporting on a small achievement, part of a big project: Creating a browser plugin to display the Kinect depth map on screen.<br />
The integration was fairly easy, which leads me to think that both FireBreath and OpenNI/NITE are pretty neat, robust frameworks.<br />
So let&#8217;s see how it&#8217;s done<br />
<span id="more-996"></span></p>
<h2>From a template FireBreath plugin to an OpenGL plugin</h2>
<p>FireBreath is kind of an amazing project. They aim to let you write a single source that builds plugins for all browsers and all operating systems. A daunting feat in my book. But building a MacOS Safari/Firefox plugin using their framework proved very simple&#8230;<br />
So I started here: <a href="http://www.firebreath.org/display/documentation/Mac+Video+Tutorial">http://www.firebreath.org/display/documentation/Mac+Video+Tutorial</a><br />
It&#8217;s a video tutorial of how to create a plugin from template, build it, install it and run it. Follow their instructions and you&#8217;ll have your plugin ready in 10 minutes.<br />
The next step will be to make our plugin display an OpenGL scene, which is what OpenNI/NITE use to display their depth map. This was also easy, borrowing code from the <a href="http://www.firebreath.org/display/documentation/OpenGL+Plugin">FireBreath OpenGL example</a>.<br />
However I ended up with a smaller source since I threw away most of the stuff&#8230;</p>
<pre class="brush: plain; title: ; notranslate">
class tutorialpluginMac : public tutorialplugin {
public:
    tutorialpluginMac();
	~tutorialpluginMac();

    BEGIN_PLUGIN_EVENT_MAP()
	EVENTTYPE_CASE(FB::AttachedEvent, onWindowAttached, FB::PluginWindowMac)
	EVENTTYPE_CASE(FB::DetachedEvent, onWindowDetached, FB::PluginWindowMac)
	PLUGIN_EVENT_MAP_CASCADE(tutorialplugin)
    END_PLUGIN_EVENT_MAP()

    virtual bool onWindowAttached(FB::AttachedEvent *evt, FB::PluginWindowMac*);
    virtual bool onWindowDetached(FB::DetachedEvent *evt, FB::PluginWindowMac*);
protected:

private:
    void* m_layer;

};

void glutDisplay (void); //this is implemented in the NITE code

@interface MyCAOpenGLLayer : CAOpenGLLayer {
    GLfloat m_angle;
}
@end

@implementation MyCAOpenGLLayer

- (id) init {
    if ((self = [super init])) {
        m_angle = 0;
    }
    return self;
}

- (void)drawInCGLContext:(CGLContextObj)ctx pixelFormat:(CGLPixelFormatObj)pf forLayerTime:(CFTimeInterval)t displayTime:(const CVTimeStamp *)ts {
    //m_angle += 1;
    GLsizei width = CGRectGetWidth([self bounds]), height = CGRectGetHeight([self bounds]);
    GLfloat halfWidth = width / 2, halfHeight = height / 2;

    glViewport(0, 0, width, height);

	glutDisplay(); //let NITE draw its stuff

    [super drawInCGLContext:ctx pixelFormat:pf forLayerTime:t displayTime:ts];
}

@end

tutorialpluginMac::tutorialpluginMac() : m_layer(NULL) {}

tutorialpluginMac::~tutorialpluginMac()
{
    if (m_layer) {
        [(CALayer*)m_layer removeFromSuperlayer];
        [(CALayer*)m_layer release];
        m_layer = NULL;
    }
}

bool tutorialpluginMac::onWindowAttached(FB::AttachedEvent* evt, FB::PluginWindowMac* wnd)
{
	cout &lt;&lt; &quot;tutorialpluginMac::onWindowAttached&quot; &lt;&lt; endl;
    if (FB::PluginWindowMac::DrawingModelCoreAnimation == wnd-&gt;getDrawingModel() ||
		FB::PluginWindowMac::DrawingModelInvalidatingCoreAnimation == wnd-&gt;getDrawingModel())
	{
        cout &lt;&lt; &quot; Setup CAOpenGL drawing. &quot;&lt;&lt;endl;
        MyCAOpenGLLayer* layer = [MyCAOpenGLLayer new];
        layer.asynchronous = (FB::PluginWindowMac::DrawingModelInvalidatingCoreAnimation == wnd-&gt;getDrawingModel()) ? NO : YES;
        layer.autoresizingMask = kCALayerWidthSizable | kCALayerHeightSizable;
        layer.needsDisplayOnBoundsChange = YES;
        m_layer = layer;
        if (FB::PluginWindowMac::DrawingModelInvalidatingCoreAnimation == wnd-&gt;getDrawingModel())
            wnd-&gt;StartAutoInvalidate(1.0/30.0);
        [(CALayer*) wnd-&gt;getDrawingPrimitive() addSublayer:layer];
    }
    return tutorialplugin::onWindowAttached(evt,wnd);
}

bool tutorialpluginMac::onWindowDetached(FB::DetachedEvent* evt, FB::PluginWindowMac* wnd)
{
    return tutorialplugin::onWindowDetached(evt,wnd);
}
</pre>
<p>(You guys will have to fill in the gaps&#8230; includes, etc.)</p>
<p>This goes in a new file, a new subclass of the generic plugin, only for Mac. For Windows, you should subclass again and create the OpenGL context using the WIN32 API or equivalent.</p>
<p>CMakeLists.txt files are also affected. Check out the repo.</p>
<h2>NITE OpenGL rendering</h2>
<p>Now that the plugin will just draw whatever NITE is drawing, half the battle is done. So for the drawing code I took the simple NiPointViewer example from the NITE library (get it <a href="http://www.openni.org/">here</a>).<br />
But since we need no window management from OpenNI, again we can make everything simpler. I took the code exactly as it is, and changed just a small bit.<br />
I added<br />
<code><br />
#undef USE_GLUT<br />
#undef USE_GLES<br />
</code>, which pretty much makes it compile to very lean code (without window management etc.).<br />
And I rescued the glOrtho call in glutDisplay()<br />
<code><br />
//#ifdef USE_GLUT<br />
	glOrtho(0, mode.nXRes, mode.nYRes, 0, -1.0, 1.0);<br />
#if defined(USE_GLES)<br />
	glOrthof(0, mode.nXRes, mode.nYRes, 0, -1.0, 1.0);<br />
#endif<br />
</code></p>
<p>But the rest is pretty much identical.</p>
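<p>For anyone wondering why that glOrtho call matters: with bottom = mode.nYRes and top = 0 it sets up a projection in which pixel coordinates can be used directly, with y growing downwards like image rows. A minimal sketch of the mapping it produces is below. The helper name is hypothetical, and the 640x480 resolution in the note after the code is just an assumed example.</p>

```cpp
#include <array>
#include <cassert>

// Map a pixel (x, y) to normalized device coordinates the way
// glOrtho(0, width, height, 0, -1, 1) does: left = 0, right = width,
// bottom = height, top = 0, so y grows downwards like image rows.
std::array<float, 2> pixelToNdc(float x, float y, float width, float height) {
    assert(width > 0.0f && height > 0.0f);
    const float ndcX = 2.0f * x / width - 1.0f;   // 0 -> -1, width -> +1
    const float ndcY = 1.0f - 2.0f * y / height;  // 0 -> +1 (top), height -> -1
    return {{ndcX, ndcY}};
}
```

<p>For a 640x480 depth map, pixel (0, 0) ends up in the top-left corner of the viewport and pixel (320, 240) in its centre.</p>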
<p>One more thing: we should start the Kinect driver and OpenNI stack from somewhere in the plugin loading steps. In the main.cpp file from NITE I renamed the main() function to kinect_main().<br />
Then I call it here, in the generic plugin (not the Mac subclass, because it should be called on all OSs):</p>
<pre class="brush: plain; title: ; notranslate">
bool tutorialplugin::onWindowAttached(FB::AttachedEvent *evt, FB::PluginWindow *)
{
    // The window is attached; act appropriately
	kinect_main(0, 0);
	cout &lt;&lt; &quot;tutorialplugin::onWindowAttached&quot; &lt;&lt; endl;
    return true;
}
</pre>
<p>It will now fire when a window is attached to the plugin. The OpenGL calls start running once the OGL context is up and rendering in a loop.</p>
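<p>In case the attach event can fire more than once during a plugin&#8217;s lifetime (an assumption, not something established above), the start-up call could be wrapped in a run-once guard. The sketch below is standalone: RunOnce, startCount and onAttachSketch are hypothetical names, and the counter merely stands in for the real kinect_main(0, 0) call.</p>

```cpp
#include <atomic>

// Hypothetical guard: shouldRun() returns true only for the first caller,
// so the expensive start-up can be skipped on later attach events.
class RunOnce {
public:
    bool shouldRun() { return !started_.exchange(true); }
private:
    std::atomic<bool> started_{false};
};

// Stand-in for the real start-up routine; it only counts its invocations.
inline int& startCount() {
    static int count = 0;
    return count;
}

// Would be called from the window-attached handler.
inline void onAttachSketch(RunOnce& guard) {
    if (guard.shouldRun()) {
        ++startCount();  // the real code would call kinect_main(0, 0) here
    }
}
```

<p>The guard would live as a member of the plugin object, so each plugin instance starts the sensor at most once.</p>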
<h2>Source and stuff</h2>
<p>Get the source for the Kinect-FireBreath plugin at GitHub: <a href="https://github.com/royshil/KinectPlugin">https://github.com/royshil/KinectPlugin</a></p>
<p>This is how it looks:<br />
<a style="display:block;" href="http://www.morethantechnical.com/wp-content/uploads/2011/12/Screen-shot-2011-12-02-at-9.12.03-AM.png" rel="lightbox[996]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/12/Screen-shot-2011-12-02-at-9.12.03-AM.png" alt="" title="Screen shot 2011-12-02 at 9.12.03 AM" width="341" height="462" class="alignleft size-full wp-image-1006" /></a></p>
<p>Cool.<br />
Roy</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/12/02/a-kinect-browser-plugin-with-firebreath-w-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Identity Transfer in Photographs</title>
		<link>http://www.morethantechnical.com/2011/12/01/identity-transfer-in-photographs/</link>
		<comments>http://www.morethantechnical.com/2011/12/01/identity-transfer-in-photographs/#comments</comments>
		<pubDate>Thu, 01 Dec 2011 05:29:58 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opencv]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[school]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[head]]></category>
		<category><![CDATA[identity]]></category>
		<category><![CDATA[images]]></category>
		<category><![CDATA[photographs]]></category>
		<category><![CDATA[replacement]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[transfer]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=1000</guid>
		<description><![CDATA[Hi! I would like to present something I have been working on recently, a work that immensely affect what I wrote in the blog in the past two years&#8230; To use it: Go on this page, Watch the short instruction video, download the application (MacOSX-Intel-x64 Win32) and make yourself a model! It takes just a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2011/12/male_model.jpg" rel="lightbox[1000]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/12/male_model-150x150.jpg" alt="" title="male_model" width="150" height="150" class="alignleft size-thumbnail wp-image-1001" /></a>Hi!</p>
<p>I would like to present something I have been working on recently, a work that has immensely affected what I wrote in the blog over the past two years&#8230;</p>
<p>To use it:<br />
Go on this <a href="http://palimpost.xvm.mit.edu/HeadReplacement/default.html">page</a>,<br />
Watch the short <a href="http://youtu.be/YhHb3FAqaUk">instruction video</a>,<br />
download the application (<a href="http://palimpost.xvm.mit.edu/HeadReplacement/bin/HeadReplacement.dmg">MacOSX-Intel-x64</a> <a href="http://palimpost.xvm.mit.edu/HeadReplacement/bin/HeadReplacement_win32.zip">Win32</a>)<br />
and make yourself a model!<br />
It takes just a couple of minutes and it&#8217;s very simple&#8230;</p>
<p>This work is an academic research project. Please, please take the time to fill out the <a href="https://docs.google.com/spreadsheet/viewform?formkey=dGNBX0ljZXRVXzdtbjBQZ0dULTQwelE6MQ">survey</a>! It is very short.<br />
The results of the <a href="https://docs.google.com/spreadsheet/viewform?formkey=dGNBX0ljZXRVXzdtbjBQZ0dULTQwelE6MQ">survey</a> (the survey alone, no photos of your work) will possibly be published in an academic paper.</p>
<p>Note: No information is sent anywhere in any way outside of your machine (you may even unplug the network). All results are saved locally on your computer, and no inputs are recorded or transmitted. The application contains no malware. The source is available here.</p>
<p>Note II: All stock photos of models used in the application are released under Creative Commons By-NC-SA 2.0 license. Creator: http://www.flickr.com/photos/kk/. If you wish to distribute your results, they should also be released under a CC-By-NC-SA 2.0 license.</p>
<p>Thank you!<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/12/01/identity-transfer-in-photographs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A simple object classifier with Bag-of-Words using OpenCV 2.3 [w/ code]</title>
		<link>http://www.morethantechnical.com/2011/08/25/a-simple-object-classifier-with-bag-of-words-using-opencv-2-3-w-code/</link>
		<comments>http://www.morethantechnical.com/2011/08/25/a-simple-object-classifier-with-bag-of-words-using-opencv-2-3-w-code/#comments</comments>
		<pubDate>Thu, 25 Aug 2011 03:34:27 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[opencv]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Recommended]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[classification]]></category>
		<category><![CDATA[object]]></category>
		<category><![CDATA[svm]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=917</guid>
		<description><![CDATA[ A simple object classifier with Bag-of-Words using OpenCV 2.3]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2011/08/20101201191626.jpg" rel="lightbox[917]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/08/20101201191626-300x178.jpg" alt="" title="20101201191626" width="300" height="178" class="alignleft size-medium wp-image-928" /></a><br />
Just wanted to share some code I&#8217;ve been writing.<br />
So I wanted to create a food classifier, for a cool project down in the Media Lab called FoodCam. It&#8217;s basically a camera that people put free food under, and they can send an email alert to the entire building to come eat (by pushing a huge button marked &#8220;Dinner Bell&#8221;). Really a cool thing.</p>
<p>OK let&#8217;s get down to business.<br />
<span id="more-917"></span><br />
I followed a very simple technique described in <a href="http://scholar.google.com/scholar?cluster=2469382617192238945&amp;hl=en&amp;as_sdt=0,22" target="_blank">this paper</a>. I know, you say, &#8220;A Paper? Really? I&#8217;m not gonna read that technical boring stuff, give the bottom line! man.. geez.&#8221; Well, you are right, except that this paper IS the bottom line, it&#8217;s dead simple. It&#8217;s almost a tutorial. It is also referenced by the OpenCV documentation.</p>
<p>The method is simple:<br />
- Extract features of choice from a training set that contains all classes.<br />
- Create a vocabulary of features by clustering the features (k-means, etc). Let&#8217;s say 1000 words long.<br />
- Train your classifiers (SVMs, Naive-Bayes, boosting, etc) on a training set again (preferably a different one). This time, match the features in each image to their closest clusters in the vocabulary and create a histogram of responses to the words in the vocabulary; it will be a 1000-entry vector per image. Create a sample-label dataset for the training.<br />
- When you get an image you haven&#8217;t seen &#8211; run the classifier and it should, god willing, give you the right class.</p>
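<p>The &#8220;histogram of responses&#8221; step can be sketched without any library: every descriptor votes for its nearest vocabulary word and the votes are normalised. The names below are hypothetical, the descriptors are plain float vectors, and dividing by the number of descriptors is an assumed normalisation; OpenCV has its own class for this step.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Bag-of-words response: each descriptor votes for its nearest vocabulary
// word; the votes are normalised so the histogram sums to 1.
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& vocabulary) {
    assert(!vocabulary.empty());
    std::vector<float> hist(vocabulary.size(), 0.0f);
    for (const Descriptor& d : descriptors) {
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t w = 0; w < vocabulary.size(); ++w) {
            const float dist = squaredDistance(d, vocabulary[w]);
            if (dist < bestDist) { bestDist = dist; best = w; }
        }
        hist[best] += 1.0f;
    }
    if (!descriptors.empty()) {
        for (float& h : hist) h /= static_cast<float>(descriptors.size());
    }
    return hist;
}
```

<p>With a 1000-word vocabulary the returned vector has 1000 entries, which is the fixed-length sample a classifier such as an SVM expects, no matter how many keypoints the image had.</p>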
<p>Turns out, those crafty guys at WillowGarage have done pretty much all the heavy lifting, so it&#8217;s up to us to pick the fruit of their hard work. OpenCV 2.3 comes packed with a <a href="http://opencv.itseez.com/modules/features2d/doc/object_categorization.html" target="_blank">set of classes</a>, whose names start with BOW for Bag Of Words, that help a lot with implementing this method.</p>
<p>Starting with the first step:</p>
<pre class="brush: plain; title: ; notranslate">
SurfFeatureDetector detector(400);
vector&lt;KeyPoint&gt; keypoints;

// computing descriptors
Ptr&lt;DescriptorExtractor&gt; extractor(
   new OpponentColorDescriptorExtractor(
      Ptr&lt;DescriptorExtractor&gt;(new SurfDescriptorExtractor())
   )
);

// start with zero rows; every image appends its descriptors below
Mat training_descriptors(0,extractor-&gt;descriptorSize(),extractor-&gt;descriptorType());
Mat descriptors;

while(..loop a directory? a file?..) {
   Mat img = imread(filepath);
   detector.detect(img, keypoints);
   extractor-&gt;compute(img, keypoints, descriptors);
   training_descriptors.push_back(descriptors);
}
</pre>
<p>Simple!<br />
Let&#8217;s go create a vocabulary then. Luckily, OpenCV has taken care of that, and provides a simple API:</p>
<pre class="brush: plain; title: ; notranslate">
BOWKMeansTrainer bowtrainer(1000); //num clusters
bowtrainer.add(training_descriptors);
Mat vocabulary = bowtrainer.cluster();
</pre>
<p>Boom. Vocabulary.<br />
Now, let&#8217;s train us some SVM classifiers!<br />
We&#8217;re gonna train a 2-class SVM, in a 1-vs-all kind of way. Meaning we train an SVM that can say &#8220;yes&#8221; or &#8220;no&#8221; when choosing between one class and the rest of the classes, hence 1-vs-all.<br />
But first, we need to scour the training set for our histograms (the responses to the vocabulary, remember?):</p>
<pre class="brush: plain; title: ; notranslate">
map&lt;string,Mat&gt; classes_training_data; //one matrix of histograms per class
vector&lt;string&gt; classes_names;
int total_samples = 0;

Ptr&lt;FeatureDetector &gt; detector(new SurfFeatureDetector());
Ptr&lt;DescriptorMatcher &gt; matcher(new BruteForceMatcher&lt;L2&lt;float&gt; &gt;());
Ptr&lt;DescriptorExtractor &gt; extractor(new OpponentColorDescriptorExtractor(Ptr&lt;DescriptorExtractor&gt;(new SurfDescriptorExtractor())));
Ptr&lt;BOWImgDescriptorExtractor&gt; bowide(new BOWImgDescriptorExtractor(extractor,matcher));
bowide-&gt;setVocabulary(vocabulary);

#pragma omp parallel for schedule(dynamic,3)
for(..loop a directory?..) {
   //declared inside the loop so each thread works on its own copies;
   //filepath and class_ come from the current file
   Mat img = imread(filepath);
   vector&lt;KeyPoint&gt; keypoints;
   Mat response_hist;
   detector-&gt;detect(img,keypoints);
   bowide-&gt;compute(img, keypoints, response_hist);

   #pragma omp critical
   {
      if(classes_training_data.count(class_) == 0) { //not yet created...
         classes_training_data[class_].create(0,response_hist.cols,response_hist.type());
         classes_names.push_back(class_);
      }
      classes_training_data[class_].push_back(response_hist);
      total_samples++; //shared counter, so it is updated inside the critical section
   }
}
</pre>
<p>Now, two things:<br />
First, notice I&#8217;m keeping the training data for each class separately, because we will need it later for creating the 1-vs-all samples-labels matrices.<br />
Second, I use OpenMP multi-threading to run the calculation in parallel, and hence faster, on multi-core machines (like the one I used). It cuts the running time by a whole lot. OpenMP is a gem, use it more. Just a couple of #pragma directives and you&#8217;re multi-threading.</p>
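<p>If the OpenMP pattern is new to you, here it is stripped down to the bare minimum: a parallel loop with per-iteration locals, and a critical section around the one shared container. <code>countPerClass</code> is a made-up function for this sketch. Compile with your compiler&#8217;s OpenMP switch (<code>-fopenmp</code> on GCC); without it the pragmas are ignored and the loop runs serially with the same result:</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Count how many items carry each label, in parallel.
std::map<std::string, int> countPerClass(const std::vector<std::string>& labels) {
    std::map<std::string, int> counts;               // shared between threads
    const int n = static_cast<int>(labels.size());   // OpenMP wants a signed loop index
    #pragma omp parallel for schedule(dynamic, 3)
    for (int i = 0; i < n; ++i) {
        // Declared inside the loop, so every thread has its own copy.
        const std::string label = labels[i];
        // Only one thread at a time may touch the shared map.
        #pragma omp critical
        {
            counts[label]++;
        }
    }
    return counts;
}
```

<p>The rule of thumb: anything declared outside the loop is shared, so either keep it read-only or guard every write with <code>critical</code>.</p>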
<p>Alright, data gotten, let&#8217;s get training:</p>
<pre class="brush: plain; title: ; notranslate">
#pragma omp parallel for schedule(dynamic)
for (int i=0;i&lt;classes_names.size();i++) {
   string class_ = classes_names[i];
   cout &lt;&lt; omp_get_thread_num() &lt;&lt; &quot; training class: &quot; &lt;&lt; class_ &lt;&lt; &quot;..&quot; &lt;&lt; endl;

   Mat samples(0,classes_training_data[class_].cols,classes_training_data[class_].type());
   Mat labels(0,1,CV_32FC1);

   //copy class samples and label
   cout &lt;&lt; &quot;adding &quot; &lt;&lt; classes_training_data[class_].rows &lt;&lt; &quot; positive&quot; &lt;&lt; endl;
   samples.push_back(classes_training_data[class_]);
   Mat class_label = Mat::ones(classes_training_data[class_].rows, 1, CV_32FC1);
   labels.push_back(class_label);

   //copy rest samples and label
   for (map&lt;string,Mat&gt;::iterator it1 = classes_training_data.begin(); it1 != classes_training_data.end(); ++it1) {
      string not_class_ = (*it1).first;
      if(not_class_.compare(class_)==0) continue; //skip class itself
      samples.push_back(classes_training_data[not_class_]);
      class_label = Mat::zeros(classes_training_data[not_class_].rows, 1, CV_32FC1);
      labels.push_back(class_label);
   }

   cout &lt;&lt; &quot;Train..&quot; &lt;&lt; endl;
   Mat samples_32f; samples.convertTo(samples_32f, CV_32F);
   if(samples.rows == 0) continue; //phantom class?!
   CvSVM classifier;
   classifier.train(samples_32f,labels);

   //do something with the classifier, like saving it to file
}
</pre>
<p>Again, I parallelize, although the process is not too slow.<br />
Note how I build the samples and the labels, where each time I put in the positive samples and mark the labels &#8216;1&#8217;, and then I put the rest of the samples and label them &#8216;0&#8217;.</p>
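<p>The same 1-vs-all bookkeeping, without any OpenCV types, could look roughly like this. It is only a sketch: <code>OneVsAllSet</code> and <code>buildOneVsAll</code> are hypothetical names, and plain vectors stand in for the <code>Mat</code> rows:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::vector<float> Histogram;
typedef std::map<std::string, std::vector<Histogram> > TrainingData;

struct OneVsAllSet {
    std::vector<Histogram> samples;
    std::vector<float> labels;   // 1 = the positive class, 0 = everything else
};

// Build the samples/labels pair for one binary "positive vs. the rest" classifier.
OneVsAllSet buildOneVsAll(const TrainingData& data, const std::string& positive) {
    OneVsAllSet out;
    TrainingData::const_iterator pos = data.find(positive);
    assert(pos != data.end());
    // Positive samples go in first, labeled 1.
    out.samples.insert(out.samples.end(), pos->second.begin(), pos->second.end());
    out.labels.insert(out.labels.end(), pos->second.size(), 1.0f);
    // Every other class follows, labeled 0.
    for (TrainingData::const_iterator it = data.begin(); it != data.end(); ++it) {
        if (it->first == positive) continue;   // skip the class itself
        out.samples.insert(out.samples.end(), it->second.begin(), it->second.end());
        out.labels.insert(out.labels.end(), it->second.size(), 0.0f);
    }
    return out;
}
```

<p>Call it once per class and you get one training set per classifier, each containing every sample exactly once.</p>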
<p>Moving on to &#8230;. testing the classifiers!<br />
Nothing seems to me like more fun than creating a confusion matrix! Not really, but let&#8217;s see how it&#8217;s done:</p>
<pre class="brush: plain; title: ; notranslate">
map&lt;string,map&lt;string,int&gt; &gt; confusion_matrix; // confusionMatrix[classA][classB] = number_of_times_A_voted_for_B;
map&lt;string,CvSVM&gt; classes_classifiers; //This we created earlier

vector&lt;string&gt; files; //load up with images
vector&lt;string&gt; classes; //load up with the respective classes

for(size_t i = 0; i &lt; files.size(); i++) {
   Mat img = imread(files[i]),response_hist;

   vector&lt;KeyPoint&gt; keypoints;
   detector-&gt;detect(img,keypoints);
   bowide-&gt;compute(img, keypoints, response_hist);

   float minf = FLT_MAX; string minclass;
   for (map&lt;string,CvSVM&gt;::iterator it = classes_classifiers.begin(); it != classes_classifiers.end(); ++it) {
      float res = (*it).second.predict(response_hist,true);
      if (res &lt; minf) {
         minf = res;
         minclass = (*it).first;
      }
   }
   confusion_matrix[minclass][classes[i]]++;
}
</pre>
<p>When you take a look in my files, you will find a much more complicated way of doing this. But this is the core idea &#8211; look in the image for the response histogram to the vocabulary of features (rather, feature-cluster-centers), run it by all the classifiers and take the one with the best score. Simple.<br />
Consider making this parallel as well. No reason for it to be serial.</p>
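<p>Once the confusion matrix is filled, turning it into a single number is easy. Here is a small sketch, assuming the same layout as above (first key is the predicted class, second key is the true class). <code>lowestScoreClass</code> and <code>accuracy</code> are names invented for this example:</p>

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, std::map<std::string, int> > ConfusionMatrix;

// Name of the class with the lowest score, mirroring the min-search in the test loop.
std::string lowestScoreClass(const std::map<std::string, float>& scores) {
    assert(!scores.empty());
    std::map<std::string, float>::const_iterator best = scores.begin();
    for (std::map<std::string, float>::const_iterator it = scores.begin();
         it != scores.end(); ++it) {
        if (it->second < best->second) best = it;
    }
    return best->first;
}

// Fraction of test images whose predicted class equals their true class.
double accuracy(const ConfusionMatrix& cm) {
    long total = 0, correct = 0;
    for (ConfusionMatrix::const_iterator row = cm.begin(); row != cm.end(); ++row) {
        for (std::map<std::string, int>::const_iterator col = row->second.begin();
             col != row->second.end(); ++col) {
            total += col->second;
            if (row->first == col->first) correct += col->second;   // diagonal entry
        }
    }
    return total == 0 ? 0.0 : static_cast<double>(correct) / static_cast<double>(total);
}
```

<p>The off-diagonal entries are the interesting part when things go wrong: they tell you which classes get mixed up with each other.</p>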
<p>That about covers it.</p>
<h2>Code</h2>
<p>Lately I&#8217;m pushing stuff in Github.com using git rather than SVN on googlecode. Donno why, it&#8217;s just like that.<br />
Get the whole thing at:<br />
<code><a href="https://github.com/royshil/FoodcamClassifier" target="_blank">https://github.com/royshil/FoodcamClassifier</a></code></p>
<p>Follow the build instructions, they&#8217;re a breeze, and then follow the running instructions. It&#8217;s basically a series of command-line programs you run to get through each step, and in the end you have like a &#8220;predictor&#8221; service that takes an image and produces a prediction.</p>
<p>OK guys, have fun classifying stuff!<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/08/25/a-simple-object-classifier-with-bag-of-words-using-opencv-2-3-w-code/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>A motion parallax screen using Kinect [w/ code]</title>
		<link>http://www.morethantechnical.com/2011/06/05/a-motion-parallax-screen-using-kinect-w-code/</link>
		<comments>http://www.morethantechnical.com/2011/06/05/a-motion-parallax-screen-using-kinect-w-code/#comments</comments>
		<pubDate>Sun, 05 Jun 2011 04:54:38 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[3d]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[head tracking]]></category>
		<category><![CDATA[kinect]]></category>
		<category><![CDATA[motion parallax]]></category>
		<category><![CDATA[projection]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=863</guid>
		<description><![CDATA[How to create a motion-parallax screen using Kinect head tracking. Code in C++, using OpenGL and OpenNI's skeleton model.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve seen some examples of people who build <a href="http://en.wikipedia.org/wiki/Motion_parallax">motion parallax</a> capable screens using Kinect, but as usual &#8211; they don&#8217;t share the code. Too bad.<br />
Well this is your chance to see how it&#8217;s done, and it&#8217;s fairly simple as well.<br />
<span id="more-863"></span><br />
Let&#8217;s start by getting the user&#8217;s head position. This is done using <a href="http://www.openni.org/">OpenNI</a>&#8217;s library, which provides a skeleton model and hence the head. I used the NiUserTracker sample code as a basis, and stripped out everything that is not needed.</p>
<p>The only things I was interested in were the head position and hands positions, so I created a struct to hold these, plus some things OpenNI needs to get the positions. I did this so it could be run in a different thread, and this struct can be the shared memory:</p>
<pre class="brush: plain; title: ; notranslate">
struct openni_stuff {
	xn::DepthGenerator* dg;
	xn::UserGenerator* ug;
	xn::Context* ctx;
	XnSkeletonJointPosition* Head;
	XnSkeletonJointPosition* rh;
	XnSkeletonJointPosition* lh;
};
</pre>
<p>All these must be populated in the main before starting the thread:</p>
<pre class="brush: plain; title: ; notranslate">
xn::Context g_Context;
xn::DepthGenerator g_DepthGenerator;
xn::UserGenerator g_UserGenerator;
XnSkeletonJointPosition Head;
XnSkeletonJointPosition lHand;
XnSkeletonJointPosition rHand;

int main(..) {
..
g_Context.Init();
g_Context.FindExistingNode(XN_NODE_TYPE_DEPTH, g_DepthGenerator);
g_Context.FindExistingNode(XN_NODE_TYPE_USER, g_UserGenerator);
..
g_UserGenerator.Create(g_Context);
..
g_Context.StartGeneratingAll();
..
DWORD threadid;
struct openni_stuff s;
s.ctx = &amp;g_Context;
s.dg = &amp;g_DepthGenerator;
s.ug = &amp;g_UserGenerator;
s.Head = &amp;Head;
s.rh = &amp;rHand;
s.lh = &amp;lHand;
CreateThread(
            NULL,                   // default security attributes
            0,                      // use default stack size
			MyThreadFunction,       // thread function name
            (LPVOID)(&amp;s),          // argument to thread function
            0,                      // use default creation flags
            &amp;threadid);   // returns the thread identifier

glutMainLoop();
</pre>
<p>This code is very abstracted, there are more things to do in order for it to work, you can see them in the code repo.<br />
But basically the new thread is the one getting the information off the OpenNI framework and keeps the head position and hands positions vectors updated. </p>
<pre class="brush: plain; title: ; notranslate">
//Defined before the thread function that calls it
void getOpenNIData (struct openni_stuff s)
{
	xn::SceneMetaData sceneMD;
	xn::DepthMetaData depthMD;

	if (!g_bPause)
	{
		// Read next available data
		s.ctx-&gt;WaitAndUpdateAll();
	}

	// Process the data
	s.dg-&gt;GetMetaData(depthMD);
	rHand.position.X = 0.0f;
	s.ug-&gt;GetUserPixels(0, sceneMD);
	DrawDepthMap(depthMD, sceneMD, *s.Head, *s.rh, *s.lh);
}

DWORD WINAPI MyThreadFunction( LPVOID lpParam ) {
	//Unpack the struct, don't care for shallow copy since it's all pointers anyway
	struct openni_stuff s = *((struct openni_stuff*)lpParam);
	for(;;) {
		getOpenNIData(s);
		Sleep(30);
	}
	return 0;
}
</pre>
<p>I thought this would give a performance boost as the WaitAndUpdateAll() call usually takes a little while, but it didn&#8217;t matter much&#8230;</p>
<p>The OpenGL (GLUT) runs on the main thread, and just looks at these updated vectors for the current position.</p>
<h2>Off-Axis projection</h2>
<p>The concept of off-axis projection is very important for this project. This <a href="http://csc.lsu.edu/~kooima/pdfs/gen-perspective.pdf">very good article explains everything about generalized perspective projections</a>, and it also includes C code! I recommend reading it. But basically, off-axis projection is when the viewing eye is not perpendicular to the projection surface, nor does it need to be centered in relation to it. It&#8217;s what goes on in our human binocular vision: each eye looks at the same point, but they are not perpendicular to the virtual projection surface (they are angled to it), and they both have an offset from the center. Just read that little paper&#8230;.</p>
<p>Anyway, cutting to the chase, we need to project the rendered objects in the scene onto the projection table, assuming the user is not looking at it perpendicularly (like they would with a normal screen). Thanks to the code in the aforementioned article &#8211; this is a breeze.</p>
<pre class="brush: plain; title: ; notranslate">
void subtract(float u[3], float v[3], float n[3]) {
	u[0] = v[0] - n[0];
	u[1] = v[1] - n[1];
	u[2] = v[2] - n[2];
}

void projection( float *pa,
				float *pb,
				float *pc,
				float *pe, float n, float f)
{
	float va[3], vb[3], vc[3];
	float vr[3], vu[3], vn[3];
	float l, r, b, t, d, M[16];

	// Compute an orthonormal basis for the screen.
	subtract(vr, pb, pa);
	subtract(vu, pc, pa);

	glmNormalize(vr);
	glmNormalize(vu);
	glmCross(vr, vu, vn);
	glmNormalize(vn);

	// Compute the screen corner vectors.
	subtract(va, pa, pe);
	subtract(vb, pb, pe);
	subtract(vc, pc, pe);

	// Find the distance from the eye to screen plane.
	d = -glmDot(va, vn);

	// Find the extent of the perpendicular projection.
	l = glmDot(vr, va) * n / d;
	r = glmDot(vr, vb) * n / d;
	b = glmDot(vu, va) * n / d;
	t = glmDot(vu, vc) * n / d;
	// Load the perpendicular projection.
	glMatrixMode(GL_PROJECTION);
	glPushMatrix();
	glLoadIdentity();
	glFrustum(l, r, b, t, n, f);
	// Rotate the projection to be non-perpendicular.
	memset(M, 0, 16 * sizeof (float));
	M[0] = vr[0]; M[4] = vr[1]; M[ 8] = vr[2];
	M[1] = vu[0]; M[5] = vu[1]; M[ 9] = vu[2];
	M[2] = vn[0]; M[6] = vn[1]; M[10] = vn[2];
	M[15] = 1.0f;
	glMultMatrixf(M);
	// Move the apex of the frustum to the origin.
	glTranslatef(-pe[0], -pe[1], -pe[2]);
	glMatrixMode(GL_MODELVIEW);
	glPushMatrix();
}
</pre>
<p>I am using <a href="http://www.xmission.com/~nate/tutors.html">glm.h &#038; glm.c from Nate Robins</a> to do some basic lin-algebra. I just didn&#8217;t feel like re-writing the code, and I&#8217;m already using it to load Wavefront OBJ models. The only missing function is <code>subtract</code> which is included.</p>
<p>Loading the OBJ models is super easy with glm.h:</p>
<pre class="brush: plain; title: ; notranslate">
	   objmodel_ptr = glmReadOBJ(&quot;../bunny1.obj&quot;);
	   if (!objmodel_ptr)
		   exit(0);

	   glmUnitize(objmodel_ptr);
	   glmFacetNormals(objmodel_ptr);
	   glmVertexNormals(objmodel_ptr, 90.0);
</pre>
<p>Now that we can create off-axis views (this can be reused for other projects, such as projects with VR glasses!), I draw the scene after applying this projection:</p>
<pre class="brush: plain; title: ; notranslate">
GLfloat eye[4] = {0,200,1050,0}; //position of eye
double kinectHeight = 300;  //the Kinect is by the table, at a certain height (measured)
GLdouble tlv[3] = {-530, -kinectHeight, 90},   //top-left point of table in Kinect coordinates (millimeters)
		trv[3] = {530, -kinectHeight, 90}, //top-right
		brv[3] = {530, -kinectHeight, 955}, //bottom-right
		blv[3] = {-530, -kinectHeight, 955}; //bottom-left
GLdouble obj[3] = {-200, tlv[1], 522.5}; //the virtual object's real-world position (mm)

static void display(GLenum mode)
{
       //set the eye position
	if(Head.position.X != 0.0f || Head.position.Y != 0.0f || Head.position.Z != 0.0f)
	{
		eye[0] = Head.position.X;
		eye[1] = Head.position.Y;
		eye[2] = Head.position.Z;
	}

	glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);
	offAxisView();
}

void offAxisView() {
	projection(blvf, brvf, tlvf, eye, 1.0f, 10000.0f);

	glLightfv(GL_LIGHT0, GL_POSITION, lightp);
	drawScene();

	glPopMatrix();
	glMatrixMode(GL_PROJECTION);
	glPopMatrix();
	glMatrixMode(GL_MODELVIEW);
}

void drawScene() {
	//Just draw an object..
	glPushMatrix();
	glTranslated(obj[0]-10,obj[1]+80,obj[2]); //translating to accommodate for obj size
	glColor4f(1.0, 0.0, 0.0, 1.0);
	glScaled(80,80,80);
	glmDraw(objmodel_ptr,GLM_SMOOTH);
	glPopMatrix();
}
</pre>
<p>You can see that I measured the position of the table with respect to the Kinect sensor&#8217;s center, which we assume is the origin, and these measurements are used for the off-axis projection w.r.t. the eye.</p>
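<p>To get a feel for the numbers, here is the extent computation from <code>projection()</code> on its own, with no OpenGL or glm.h needed. This is just a sketch: <code>Vec3</code>, <code>Frustum</code> and <code>offAxisExtents</code> are names I made up for it, and it only reproduces the <code>l, r, b, t</code> part, not the rotation and translation that follow:</p>

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Frustum { double l, r, b, t; };

Vec3 sub(const Vec3& a, const Vec3& b) {
    Vec3 out = { a.x - b.x, a.y - b.y, a.z - b.z };
    return out;
}

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    Vec3 out = { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
    return out;
}

Vec3 normalized(const Vec3& v) {
    const double len = std::sqrt(dot(v, v));
    assert(len > 0.0);
    Vec3 out = { v.x / len, v.y / len, v.z / len };
    return out;
}

// pa, pb, pc: lower-left, lower-right and upper-left screen corners.
// pe: eye position. n: near plane distance.
Frustum offAxisExtents(const Vec3& pa, const Vec3& pb, const Vec3& pc,
                       const Vec3& pe, double n) {
    // Orthonormal basis of the screen: right, up, normal.
    const Vec3 vr = normalized(sub(pb, pa));
    const Vec3 vu = normalized(sub(pc, pa));
    const Vec3 vn = normalized(cross(vr, vu));
    // Vectors from the eye to the screen corners.
    const Vec3 va = sub(pa, pe);
    const Vec3 vb = sub(pb, pe);
    const Vec3 vc = sub(pc, pe);
    // Distance from the eye to the screen plane.
    const double d = -dot(va, vn);
    assert(d > 0.0);   // the eye must be in front of the screen
    Frustum f;
    f.l = dot(vr, va) * n / d;
    f.r = dot(vr, vb) * n / d;
    f.b = dot(vu, va) * n / d;
    f.t = dot(vu, vc) * n / d;
    return f;
}
```

<p>Plugging in the table corners above (bottom-left, bottom-right, top-left) with the default eye at (0, 200, 1050) and n = 1 gives l = -1.06, r = 1.06, b = 0.19 and t = 1.92. Left and right are symmetric because the eye is centered in X, while bottom and top are lopsided because the eye sits near the table&#8217;s bottom edge.</p>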
<p>That&#8217;s pretty much it&#8230; the program runs, you have to stand in the silly &#8220;Psi&#8221; position for the OpenNI framework to calibrate, and then the graphics will be rendered according to your head position.</p>
<p>To create your own setup, just put in the right position of the table with respect to the Kinect sensor in real-world coordinates (mm).</p>
<h2>Code</h2>
<p>Can be downloaded from SVN as usual:<br />
<code>svn co https://morethantechnical.googlecode.com/svn/trunk/kinect_motion_parallax/main.cpp</code></p>
<h2>Video</h2>
<p><iframe width="560" height="349" src="http://www.youtube.com/embed/qK4VNo9bI2U" frameborder="0" allowfullscreen></iframe></p>
<p>Enjoy<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/06/05/a-motion-parallax-screen-using-kinect-w-code/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Download all your Last.fm loved tracks in two simple steps</title>
		<link>http://www.morethantechnical.com/2011/03/14/download-all-you-last-fm-loved-tracks-in-a-single-command/</link>
		<comments>http://www.morethantechnical.com/2011/03/14/download-all-you-last-fm-loved-tracks-in-a-single-command/#comments</comments>
		<pubDate>Mon, 14 Mar 2011 04:27:48 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[Recommended]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Solutions]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[download]]></category>
		<category><![CDATA[encoding]]></category>
		<category><![CDATA[lame]]></category>
		<category><![CDATA[mp3]]></category>
		<category><![CDATA[mp4]]></category>
		<category><![CDATA[mplayer]]></category>
		<category><![CDATA[shell]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=844</guid>
		<description><![CDATA[I&#8217;m a fan of Last.fm online radio, and I have a habit of marking every good song that I hear as a &#8220;loved track&#8221;. Over the years I got quite a list, and so I decided to turn it into my jogging playlist. But for that, I need all the songs downloaded to my computer [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m a fan of <a href="http://www.last.fm/home">Last.fm</a> online radio, and I have a habit of marking every good song that I hear as a &#8220;loved track&#8221;. Over the years I got quite a list, and so I decided to turn it into my jogging playlist. But for that, I need all the songs downloaded to my computer so I can put them on my mobile. While Last.fm does link to Amazon for downloading all the loved songs for pay, I&#8217;m going to walk the fine moral line here and suggest how you can download every song from existing free YouTube videos.<br />
If it really bothers you, think of it as if I created a YouTube playlist and now I&#8217;m using my data plan to stream the songs off YT itself..<br />
Moral issues resolved, we can move on to the scripting.<br />
<span id="more-844"></span><br />
What you need to have:<br />
Linux-like system, <a href="http://www.mplayerhq.hu/design7/news.html">MPlayer</a>, <a href="http://lame.sourceforge.net/">Lame MP3 encoder</a>, some command-line experience or at least adventure-ness.</p>
<p>So first you&#8217;ll need to export your loved tracks from Last.fm in tab separated format &#8211; a mere button press.<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2011/03/Screen-shot-2011-03-14-at-12.03.26-AM.png" rel="lightbox[844]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/03/Screen-shot-2011-03-14-at-12.03.26-AM-300x111.png" alt="" title="Screen shot 2011-03-14 at 12.03.26 AM" width="300" height="111" class="aligncenter size-medium wp-image-849" /></a></p>
<p>The &#8220;tsv&#8221; (tab separated values) file has a simple format: <code>&lt;song name&gt; &lt;artist&gt; &lt;Last.fm url&gt;</code></p>
<p>And now for the script. First, the loved tracks file is tab separated, so we use AWK to get the first two fields, which are song-name and song-artist.<br />
Then we use a neat command-line tool to download YT movies: <a href="http://rg3.github.com/youtube-dl/documentation.html">http://rg3.github.com/youtube-dl/documentation.html</a>.</p>
<pre class="brush: plain; title: ; notranslate">
mkdir mylovedtracks
cd mylovedtracks
awk -F'\t' '{print &quot;../youtube-dl.py -f 18 -t \&quot;ytsearch:&quot; $1 &quot; &quot; $2 &quot;\&quot;&quot;}' ../my_lovedtracks.tsv | csh
</pre>
<p>The single-liner will download all the loved tracks from the tsv file into the current directory, given that <code>youtube-dl.py &#038; my_lovedtracks.tsv</code> exist in the parent directory. <code>-f 18</code> says it will download only MP4s and <code>ytsearch</code> says it will try to search YT for the term &#8220;song-name song-artist&#8221; and download the 1st result. The <code>| csh</code> part pipes the commands that AWK prints into a new shell process, which runs them.</p>
<p>The saved MP4 will be named after the name of the video, with addition of the YT hash string.</p>
<p>All the mp4s have been downloaded, so let&#8217;s batch convert them to mp3s:</p>
<pre class="brush: plain; title: ; notranslate">
mkdir sound
for f in *.mp4 ; do n=&quot;${f%.mp4}&quot;; if [ ! -e &quot;sound/$n.mp3&quot; ]; then mplayer &quot;$f&quot; -vc dummy -vo null -ao pcm:file=sound/temp.wav; lame -V2 sound/temp.wav &quot;sound/$n.mp3&quot;; rm sound/temp.wav; fi ; done
</pre>
<p>This single-liner will extract audio from the mp4 into a PCM temp.wav file using MPlayer, and then convert to VBR MP3 using Lame.<br />
You can run this command many times, as it checks if the file has not been converted yet. So if you&#8217;re impatient (like me) and want to convert some of the MP4s before everything has downloaded &#8211; just run it, and later run it again.</p>
<p>Congrats, all your loved tracks were downloaded.</p>
<p>A few limitations to this method:<br />
* Sometimes downloaded songs are not exactly what you wanted, especially specific versions. The search is arbitrary, and can&#8217;t be controlled too much.<br />
* ID3 tags are non existent, although something can probably be done about that in the Lame encoding phase.<br />
* Very high potential for parallelization that is unexploited. Mostly in the YT download phase, where YT pushes the first ~15% of the video very fast (I saw 1200Kb/s even), and then maintains a steady d/l rate to get the video downloaded by ~1:00 minute (may be as low as 50Kb/s). Downloading many videos at once could help.<br />
* Still not a true single-liner, it is a two-step thing. But that can be done by modifying the 2nd step a bit and putting into the AWK print of the 1st step.<br />
* MP3 volume normalization is not done &#8211; and it&#8217;s very important! Otherwise every song sounds different and you must do vol-up vol-down all the time&#8230;</p>
<p>Still, did a nice quick job for me&#8230;</p>
<p>Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/03/14/download-all-you-last-fm-loved-tracks-in-a-single-command/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Neat OpenCV smoothing trick when Kineacking (Kinect Hacking) [w/ code]</title>
		<link>http://www.morethantechnical.com/2011/03/05/neat-opencv-smoothing-trick-when-kineacking-kinect-hacking-w-code/</link>
		<comments>http://www.morethantechnical.com/2011/03/05/neat-opencv-smoothing-trick-when-kineacking-kinect-hacking-w-code/#comments</comments>
		<pubDate>Sat, 05 Mar 2011 20:57:26 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[3d]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opencv]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[depth]]></category>
		<category><![CDATA[inpainting]]></category>
		<category><![CDATA[kinect]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=824</guid>
		<description><![CDATA[I found a nice little trick to ease the work with the very noisy depth image the Kinect is giving out. The image is filled with these &#8220;blank&#8221; values that basically note where the data is unreadable. The secret is to use inpainting to cover these areas and get a cleaner image. And as always, [...]]]></description>
			<content:encoded><![CDATA[<p>I found a nice little trick to ease the work with the very noisy depth image the Kinect is giving out. The image is filled with these &#8220;blank&#8221; values that basically note where the data is unreadable. The secret is to use inpainting to cover these areas and get a cleaner image. And as always, no need to dig deep &#8211; OpenCV has it all included.<br />
<span id="more-824"></span></p>
<p>Start from a simple Kinect frames feed from <a href="http://openkinect.org/wiki/C%2B%2BOpenCvExample">here</a>:</p>
<pre class="brush: plain; title: ; notranslate">

int main(int argc, char **argv) {
	bool die(false);

	Mat depthMat(Size(640,480),CV_16UC1);
	Mat depthf  (Size(640,480),CV_8UC1);
	Mat rgbMat(Size(640,480),CV_8UC3,Scalar(0));
	Mat ownMat(Size(640,480),CV_8UC3,Scalar(0));

        Freenect::Freenect freenect;
        MyFreenectDevice&amp; device = freenect.createDevice&lt;MyFreenectDevice&gt;(0);

	device.startVideo();
	device.startDepth();

    while (!die) {
    	device.getVideo(rgbMat);
    	device.getDepth(depthMat);
    	depthMat.convertTo(depthf, CV_8UC1, 255.0/2048.0);
        cv::imshow(&quot;depth&quot;,depthf);
		char k = cvWaitKey(5);
		if( k == 27 ){
			break;
		}
    }

   	device.stopVideo();
	device.stopDepth();
	return 0;
}
</pre>
<p>Now let&#8217;s stretch the signal a little bit and add the inpainting:</p>
<pre class="brush: plain; title: ; notranslate">
		//interpolation &amp; inpainting
		{
			Mat _tmp,_tmp1; //minimum observed value is ~440. so shift a bit
			Mat(depthMat - 400.0).convertTo(_tmp1,CV_64FC1);

			Point minLoc; double minval,maxval;
			minMaxLoc(_tmp1, &amp;minval, &amp;maxval, NULL, NULL);
			_tmp1.convertTo(depthf, CV_8UC1, 255.0/maxval);  //linear interpolation

                       //use a smaller version of the image
			Mat small_depthf; resize(depthf,small_depthf,Size(),0.2,0.2);
                        //inpaint only the &quot;unknown&quot; pixels
			cv::inpaint(small_depthf,(small_depthf == 255),_tmp1,5.0,INPAINT_TELEA);

			resize(_tmp1, _tmp, depthf.size());
			_tmp.copyTo(depthf, (depthf == 255));  //fill only the unknown pixels with the inpainted values, the original signal stays everywhere else
		}
</pre>
<p>Note that I&#8217;m using a small copy of the image, because inpainting is a heavy computation, and it works best on low frequencies. I copy back the original signal over the up-sized inpainted one to retain high frequencies.</p>
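<p>Two pieces of that snippet are plain arithmetic and are easy to try without a Kinect: the shift-and-stretch to 8 bits, and the masked copy that only touches the &#8220;unknown&#8221; pixels. Here is a sketch with flat vectors standing in for the images. <code>stretchTo8Bit</code> and <code>fillUnknown</code> are hypothetical names, and the rounding is simple round-half-up, which may differ from OpenCV&#8217;s in half-way cases:</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Shift raw depth by `offset` (clamping at 0), then scale so the
// largest shifted value maps to 255.
std::vector<unsigned char> stretchTo8Bit(const std::vector<int>& raw, int offset) {
    std::vector<int> shifted(raw.size());
    int maxval = 0;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        shifted[i] = std::max(raw[i] - offset, 0);
        maxval = std::max(maxval, shifted[i]);
    }
    std::vector<unsigned char> out(raw.size(), 0);
    if (maxval == 0) return out;   // nothing to stretch
    const double scale = 255.0 / maxval;
    for (std::size_t i = 0; i < raw.size(); ++i)
        out[i] = static_cast<unsigned char>(static_cast<int>(shifted[i] * scale + 0.5));
    return out;
}

// Copy `filled` into `image` only where `image` holds the unknown marker.
// Known pixels keep their original value.
void fillUnknown(std::vector<unsigned char>& image,
                 const std::vector<unsigned char>& filled,
                 unsigned char unknownValue) {
    assert(image.size() == filled.size());
    for (std::size_t i = 0; i < image.size(); ++i)
        if (image[i] == unknownValue) image[i] = filled[i];
}
```

<p>After the stretch the largest raw value always lands on 255, which is why 255 can double as the &#8220;unknown&#8221; marker for the inpainting mask.</p>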
<p>It works pretty well!<br />
<object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/Jm8yflH5BDs?hl=en&#038;fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Jm8yflH5BDs?hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<p>Enjoy<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/03/05/neat-opencv-smoothing-trick-when-kineacking-kinect-hacking-w-code/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The woes of Frame Animation on Android [w/ code]</title>
		<link>http://www.morethantechnical.com/2011/03/01/the-woes-of-frame-animation-on-android-w-code/</link>
		<comments>http://www.morethantechnical.com/2011/03/01/the-woes-of-frame-animation-on-android-w-code/#comments</comments>
		<pubDate>Tue, 01 Mar 2011 05:48:24 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[graphics]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[android]]></category>
		<category><![CDATA[animation]]></category>
		<category><![CDATA[gui]]></category>
		<category><![CDATA[imagemagick]]></category>
		<category><![CDATA[java]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=822</guid>
		<description><![CDATA[My adventures of getting frame animation on the Android 2.1 continue, and take a turn for the worse. Will I come up victorious in the end? Not sure&#8230; Using Android&#8217;s Frame Animation API The first attempt I took at frame animation was using Android&#8217;s own AnimationDrawable. I thought it would give me the best solution, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2011/03/android_frame.png" rel="lightbox[822]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/03/android_frame-269x300.png" alt="" title="android_frame" width="269" height="300" class="alignleft size-medium wp-image-831" /></a>My adventures of getting frame animation on the Android 2.1 continue, and take a turn for the worse. Will I come up victorious in the end? Not sure&#8230;</p>
<p><span id="more-822"></span></p>
<h2>Using Android&#8217;s Frame Animation API</h2>
<p>The first attempt I took at frame animation was using Android&#8217;s own <a href="http://developer.android.com/guide/topics/graphics/2d-graphics.html#frame-animation">AnimationDrawable</a>. I thought it would give me the best solution, as it&#8217;s the closest to the native OS and probably optimized. I was wrong.<br />
This API is highly susceptible to OutOfMemory exceptions, either when loading an animation of more than ~50 frames, or when loading more than one animation. On top of that, it is not doing a great job at displaying the frames, and produces a lot of jitter.</p>
<p>So using it is very very simple, and I wrote about it in a <a href="http://www.morethantechnical.com/2011/02/07/some-things-i-learned-about-androids-frame-animation/">previous post</a>, when I was still trying to make it work.</p>
<p>Basically, all you need is an ImageView to display your animation. Prepare some <a href="http://developer.android.com/reference/android/graphics/drawable/AnimationDrawable.html">AnimationDrawable</a> in XML. Then you can either <a href="http://www.morethantechnical.com/2011/02/07/some-things-i-learned-about-androids-frame-animation/">pre-load an animation</a>, or just fire an animation regularly, which is only <a href="http://developer.android.com/reference/android/graphics/drawable/AnimationDrawable.html">setting the Drawable</a> for the ImageView.</p>
<p>If you fire the animation more than once, remember to <a href="http://developer.android.com/reference/android/graphics/drawable/AnimationDrawable.html#setVisible(boolean, boolean)">reset</a> it.</p>
<p>This is by far the simplest way to go, and best if you have a simple animation. But it dies very quickly of memory issues if you push it too hard.</p>
<h2>Using HTML and animated GIFs</h2>
<p>This seemed like a classic solution for frame animation. The browser should have absolutely no problems playing it &#8211; so I thought. Turns out Android 2.1&#8217;s web browser doesn&#8217;t play animated GIFs! So when I was working on my development phone, a 2.2er, there was no problem, but when I switched to my deployment phone, a 2.1er, the screen just went black.<br />
If you&#8217;re on 2.2 &#8211; this is a very nice way to frame animate. It&#8217;s clean and works at high frame rates.</p>
<p>First you would need to obtain animated GIFs that are compatible with Android&#8217;s web browser. I did that using <a href="http://www.imagemagick.org/Usage/anim_basics/">ImageMagick&#8217;s (IM) animation toolbox</a>. </p>
<pre class="brush: plain; title: ; notranslate">
convert myanim_split_*.png myanim.gif
</pre>
<p>But&#8230; there are some details to attend to. First, you probably would like to have control over looping the animation. That can be done using the <code>-loop</code> parameter of <code>convert</code>: setting it to 1 will play the animation once, 0 will loop forever.<br />
How about a transparent background? This is very important for character animations, which usually live in a &#8220;world&#8221; and should not occlude the background. Well, if your animation&#8217;s PNGs have a transparent background, that will be reflected in the animated GIF. But just creating a GIF out of your PNGs will give you something like this:<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2011/02/anim.gif" rel="lightbox[822]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/02/anim.gif" alt="" title="anim" width="320" height="240" class="aligncenter size-full wp-image-825" /></a></p>
<p>The background is not clearing frame-to-frame. So add in the <code>-set dispose background</code> (or &#8220;dispose previous&#8221;) to the IM line:</p>
<pre class="brush: plain; title: ; notranslate">
convert myanim_split_*.png -set dispose background myanim.gif
</pre>
<p>Now it looks better, and also has a transparent background:<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2011/02/anim1.gif" rel="lightbox[822]"><img src="http://www.morethantechnical.com/wp-content/uploads/2011/02/anim1.gif" alt="" title="anim" width="320" height="240" class="aligncenter size-full wp-image-826" /></a></p>
<p>There&#8217;s still the issue of showing it up as part of the layout, and that will be done with WebView. The tricky part is controlling when the animation will fire. There are several ways to go about it, such as loading a single HTML file and using <code>webview.loadUrl</code> with a &#8220;javascript:&#8230;&#8221; URL to replace the image, or using a simpler method of <code>webview.loadData</code> with HTML code that just displays the image with <code>&lt;img src=".."&gt;</code>. There is no &#8220;right way&#8221; as this is the &#8220;hack way&#8221; anyway.</p>
<pre class="brush: plain; title: ; notranslate">
WebView wb = (WebView) findViewById(R.id.webview);
wb.loadDataWithBaseURL(
		&quot;fake://lala&quot;,
		&quot;&lt;div style=\&quot;text-align: center;\&quot;&gt;&lt;IMG id=\&quot;myanim\&quot; SRC=\&quot;file:///android_asset/myanim.gif\&quot; style=\&quot;height: 100%\&quot; /&gt;&lt;/div&gt;&quot;,
		&quot;text/html&quot;,
		&quot;UTF-8&quot;,
		&quot;fake://lala&quot;);
</pre>
<p>Notice that the GIF file must reside in the &#8220;assets&#8221; directory of the Android project, and the &#8220;file:///android_asset/&#8221; URL points right there. You may as well skip the whole HTML thing and just have the WebView <code>loadUrl</code> the GIF file directly.</p>
<p>Now this will work on Android 2.2 and up, but will fail for Android 2.1 as the web browser doesn&#8217;t implement animated GIFs yet.</p>
<h2>Using MediaPlayer and animated GIFs</h2>
<p>I saw somewhere that even though the WebView in 2.1 cannot play GIFs, the MediaPlayer sure can, so it&#8217;s worth mentioning that.</p>
<p>So I set up a surface in the layout</p>
<pre class="brush: plain; title: ; notranslate">
&lt;SurfaceView android:id=&quot;@+id/mysurfaceview&quot;
	android:layout_width=&quot;fill_parent&quot;
	android:layout_height=&quot;0dip&quot;
	android:layout_weight=&quot;1&quot;
	/&gt;
</pre>
<p>Got the holder in code, and instantiated a MediaPlayer</p>
<pre class="brush: plain; title: ; notranslate">
mCharPreview = (SurfaceView) findViewById(R.id.mysurfaceview);
holder = mCharPreview.getHolder();
holder.addCallback(this);
extras = getIntent().getExtras();

mMediaPlayer = new MediaPlayer();
mMediaPlayer.setDisplay(holder);
mMediaPlayer.setOnCompletionListener(this);
mMediaPlayer.setOnPreparedListener(this);
mMediaPlayer.setOnBufferingUpdateListener(this);
mMediaPlayer.setOnVideoSizeChangedListener(this);
mMediaPlayer.setAudioStreamType(AudioManager.);
</pre>
<p>And then listen on surfaceCreated to prepare the player</p>
<pre class="brush: plain; title: ; notranslate">
public void surfaceCreated(SurfaceHolder holder) {
	try {
		AssetFileDescriptor openFd = getAssets().openFd(&quot;myanim.gif&quot;);
		mMediaPlayer.setDataSource(openFd.getFileDescriptor());
		mMediaPlayer.prepare();
	} catch (Exception e) {
		e.printStackTrace();
		(new AlertDialog.Builder(this)).setTitle(&quot;Exception&quot;).setMessage(e.getClass().getName() + &quot;:&quot; + e.getLocalizedMessage()).create().show();
	}
}
</pre>
<p>Then listen on prepared to start the animation</p>
<pre class="brush: plain; title: ; notranslate">
public void onPrepared(MediaPlayer mediaplayer) {
        mIsVideoReadyToBePlayed = true;
        if (mIsVideoReadyToBePlayed) {
             holder.setFixedSize(200, 300);
             mMediaPlayer.start();
        }
}
</pre>
<p>But &#8211; this code failed for me in the <code>mMediaPlayer.prepare()</code> with an exception like <code>"Prepare failed.: status=0xFFFFFFF"</code>. I&#8217;m guessing that the player fails because the format is not recognized.</p>
<h2>Using HTML and Javascript</h2>
<p>So, after being exhausted with having some engine play the animation, I decided to go back to using WebView with Javascript code to flip the images one after the other.</p>
<p>Well, this was pretty straightforward. I created a small HTML file:</p>
<pre class="brush: plain; title: ; notranslate">
&lt;html&gt;
 &lt;script&gt;
 //---------------------------------------------------------------
//just a function to pad numbers with 0s, good for animation frames in sequential files...
 function FormatNumberLength(num, length) {
    var r = &quot;&quot; + num;
    while (r.length &lt; length) {
        r = &quot;0&quot; + r;
    }
    return r;
}
//---------------------------------------------------------------

  var state=0;
  var myinterval = -1; //JS interval handler

//---------------------------------------------------------------
//parse querystring - that's where I get the image to show, and other parameters
  var qs = (function(a) {
    if (a == &quot;&quot;) return {};
    var b = {};
    for (var i = 0; i &lt; a.length; ++i)
    {
        var p=a[i].split('=');
        b[p[0]] = decodeURIComponent(p[1].replace(/\+/g, &quot; &quot;));
    }
    return b;
  })(window.location.search.substr(1).split('&amp;'));
//---------------------------------------------------------------

  var firstFrame = parseInt(qs[&quot;first&quot;]);   //the index of the first frame
  var lastFrame = parseInt(qs[&quot;last&quot;]);    //the index of the last frame

//---------------------------------------------------------------
  // initializes animation timer
  function ini() {
  	if(firstFrame &gt;= 0) { //animation
    	setTimeout(&quot;myinterval = setInterval(\&quot;periodic()\&quot;,100);&quot;,100);
    } else {				//static
    	document.getElementById(&quot;myimg&quot;).src=&quot;file:///android_asset/&quot;+ qs[&quot;anim_file&quot;] +&quot;.png&quot;;
    }
  }
//---------------------------------------------------------------

  // called regularly to perform animation
  function periodic() {
    state+=2;  //jump by 2 frames? can change this to 1 if you like...
    var animState = firstFrame + state; //declared locally instead of leaking a global

    if (animState &lt;= lastFrame) {
		document.getElementById(&quot;myimg&quot;).src=&quot;file:///android_asset/&quot;+ qs[&quot;anim_file&quot;] + FormatNumberLength(animState,4) +&quot;.png&quot;;
    }
    else {
    	if(qs[&quot;loop&quot;] == &quot;false&quot;)  //'loop' will say if we keep playing the animation.. duh
    		clearInterval(myinterval);
    	else
    		state = 0;
    }
  }

 &lt;/script&gt;
 &lt;style&gt;
 #wrapper {
     text-align: center;
     background-color: black; /* make sure to use this, since sometimes there are ghosts */
 }
 #im {
     background-color: black;
 }
 &lt;/style&gt;
 &lt;body onLoad=&quot;ini()&quot;&gt;
  &lt;div id=&quot;wrapper&quot;&gt;
     &lt;img id=&quot;myimg&quot; src=&quot;&quot;/&gt; &lt;!--   this is where the image goes --&gt;
  &lt;/div&gt;
 &lt;/body&gt;
&lt;/html&gt;
</pre>
<p>Put this file, as usual, in the &#8220;assets&#8221; directory where it can be found easily with the <code>"file:///android_asset/..."</code> URL.</p>
<p>So, this HTML is loaded into the WebView with some parameters on the URL that will tell it what to show, like so:</p>
<pre class="brush: plain; title: ; notranslate">
       //This class holds all we need for an animation: filename, number of frames, etc.
	private class MyAnim {
		String filename;
		int start;
		int end;
		boolean loop;
		public MyAnim(String filename, int start, int end) {
			super();
			this.filename = filename;
			this.start = start;
			this.end = end;
			this.loop = false;
		}
		public MyAnim(String filename, int start, int end, boolean loop) {
			super();
			this.filename = filename;
			this.start = start;
			this.end = end;
			this.loop = loop;
		}
	}

        //this function will &quot;fire&quot; an animation, essentially load the HTML code with the proper parameters
	private void fireAnimation(final MyAnim myAnim, final boolean shouldTurn) {
		findViewById(R.id.webview).post(new Runnable() {
			@Override
			public void run() {
				long now = (new Date()).getTime();
				if((now - anim_start_ts) &lt; 2000) return; //let other animations finish man! geez...

				WebView wb = (WebView) findViewById(R.id.webview);

				//supply the &quot;base&quot; filename, the first frame number, last frame and should the animation repeat
				wb.loadUrl(&quot;file:///android_asset/animate.html?anim_file=&quot;+myAnim.filename+&quot;&amp;first=&quot;+myAnim.start+&quot;&amp;last=&quot;+myAnim.end+&quot;&amp;loop=&quot;+myAnim.loop);
				wb.invalidate();

				anim_start_ts = now;
			}
		});
	}
</pre>
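<p>The round trip between the Java side and the page is easy to get wrong, so here is a minimal sketch of it in plain JavaScript. It assumes the same parameter names as above (<code>anim_file</code>, <code>first</code>, <code>last</code>, <code>loop</code>); the helper names <code>buildQuery</code> and <code>parseQuery</code> are made up for this illustration. Unlike the parser in the page above, this one tolerates an empty query string or a parameter with no value.</p>

```javascript
// Hypothetical helpers for illustration, not part of the original page.

// Build "anim_file=...&first=...&last=...&loop=..." from a plain object.
function buildQuery(params) {
  return Object.keys(params)
    .map(function (k) {
      return encodeURIComponent(k) + "=" + encodeURIComponent(String(params[k]));
    })
    .join("&");
}

// Parse a query string (without the leading "?") back into an object.
function parseQuery(query) {
  var out = {};
  if (!query) return out;                  // empty query string
  var pairs = query.split("&");
  for (var i = 0; i < pairs.length; i++) {
    if (pairs[i] === "") continue;         // skip a stray "&"
    var eq = pairs[i].indexOf("=");
    var key = eq < 0 ? pairs[i] : pairs[i].slice(0, eq);
    var val = eq < 0 ? "" : pairs[i].slice(eq + 1);
    out[decodeURIComponent(key)] = decodeURIComponent(val.replace(/\+/g, " "));
  }
  return out;
}

var query = buildQuery({ anim_file: "myanim", first: 23, last: 46, loop: false });
var qs = parseQuery(query);
var firstFrame = parseInt(qs["first"], 10); // pass the radix explicitly
var lastFrame = parseInt(qs["last"], 10);
console.log(query, firstFrame, lastFrame, qs["loop"]);
```

<p>Note that <code>loop</code> arrives as the string <code>"false"</code>, not a boolean, which is why the page compares it against a string.</p>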
<p>The filenames should be sequential, like: &#8220;myanim0001.png&#8221;, &#8220;myanim0002.png&#8221;, &#8230;<br />
An animation may then run from the file &#8220;myanim0023.png&#8221; to &#8220;myanim0046.png&#8221;. The parameters for the HTML code will then be: <code>anim_file=myanim&#038;first=23&#038;last=46</code>.</p>
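<p>To double check the naming before pushing the files to the phone, a small sketch like the following lists the frame files a given parameter set will request. The function names are hypothetical; the 4-digit padding and the <code>.png</code> extension are taken from what the page above appends.</p>

```javascript
// Hypothetical helper names; padding width and extension follow animate.html above.
function padFrame(num, length) {
  var r = "" + num;
  while (r.length < length) r = "0" + r;
  return r;
}

// List every frame file from first to last (inclusive), taking every step-th frame.
function frameFiles(base, first, last, step) {
  var files = [];
  for (var i = first; i <= last; i += step) {
    files.push(base + padFrame(i, 4) + ".png");
  }
  return files;
}

var frames = frameFiles("myanim", 23, 46, 1);
console.log(frames.length + " frames, from " + frames[0] + " to " + frames[frames.length - 1]);
```

<p>Passing a step of 2 instead of 1 gives the frame-skipping behaviour of <code>periodic()</code>, at half the number of image loads.</p>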
<p>But! We are not done.<br />
Because stupid Android 2.1 does not clear the first image that loads into the <code>&lt;img ... &gt;</code>!<br />
So you get these weird &#8220;ghosting&#8221; effects, where the new images of the animation are shown superimposed on the first image&#8230;<br />
I couldn&#8217;t get past this, so I found another way of doing it&#8230;.</p>
<h2>Using HTML, JS and innerHTML</h2>
<p>This is what I ended up using.<br />
Basically everything stays the same, except for the fact that now we are not leaving the <code>&lt;img ... &gt;</code> in there to ghost stuff up, we&#8217;re completely replacing it with new HTML code using innerHTML.</p>
<p>The change is slight, in the JS I used:</p>
<pre class="brush: plain; title: ; notranslate">
    	document.getElementById(&quot;wrapper&quot;).innerHTML=
	    	&quot;&lt;div style=\&quot;-webkit-transform: scaleX(&quot;  +  ((qs[&quot;flip&quot;]==&quot;0&quot;)?&quot;-1&quot;:&quot;1&quot;)  +  &quot;);\&quot;&gt;&quot;+
	    	&quot;&lt;img id=\&quot;im\&quot; style=\&quot;height:100%\&quot; src=\&quot;file:///android_asset/&quot;+ qs[&quot;anim_file&quot;]+&quot;\&quot; /&gt;&quot;+
	    	&quot;&lt;/div&gt;&quot;;
</pre>
<p>Note that now I also introduced a new parameter: &#8220;flip&#8221;, that uses <code>-webkit-transform</code> to flip the displayed image.</p>
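<p>As a sketch of how this can be kept tidy, the string building can be pulled into a small function that is easy to test outside the WebView. The name <code>frameMarkup</code> is an assumption made for this example, as is passing the full frame filename in; the markup itself mirrors the snippet above.</p>

```javascript
// Hypothetical helper: builds the markup that replaces the wrapper's contents
// for one frame. flip === "0" mirrors the image, as in the snippet above.
function frameMarkup(file, flip) {
  var scale = (flip === "0") ? "-1" : "1";
  return '<div style="-webkit-transform: scaleX(' + scale + ');">' +
         '<img id="im" style="height:100%" src="file:///android_asset/' + file + '" />' +
         '</div>';
}

// In the page, each tick would then do something like:
//   document.getElementById("wrapper").innerHTML = frameMarkup(file, qs["flip"]);
console.log(frameMarkup("myanim0023.png", "0"));
```

<p>Since the whole <code>&lt;img&gt;</code> element is thrown away and recreated on every frame, there is no old image left behind to ghost.</p>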
<p>Phew, that was a battle. I won. And now we learned a few things about Android and options for frame animation.<br />
Some other options to explore: flipping OpenGLES textures, encoding animations to mp4s?</p>
<p>Please share your experience with Android frame animation!</p>
<p>Roy.</p>
<p>Portions of this page are modifications based on work created and <a href="http://code.google.com/policies.html">shared by Google</a> and used according to terms described in the <a href="http://creativecommons.org/licenses/by/3.0/">Creative Commons 3.0 Attribution License</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/03/01/the-woes-of-frame-animation-on-android-w-code/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How to rotate a video using MEncoder and FFmpeg and live to tell the tale</title>
		<link>http://www.morethantechnical.com/2011/02/08/how-to-rotate-a-video-using-mencoder-and-ffmpeg-and-live-to-tell-the-tale/</link>
		<comments>http://www.morethantechnical.com/2011/02/08/how-to-rotate-a-video-using-mencoder-and-ffmpeg-and-live-to-tell-the-tale/#comments</comments>
		<pubDate>Tue, 08 Feb 2011 16:43:07 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[ffmpeg]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[Recommended]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[command line]]></category>
		<category><![CDATA[encoding]]></category>
		<category><![CDATA[mencoder]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=811</guid>
		<description><![CDATA[Hi I&#8217;d like to share a quick tip on rotating video files. I&#8217;m always frustrated with taking videos with my phone. Single handedly it&#8217;s easiest to do it when the phone is upright and not in landscape mode. But the files are always saved in landscape mode, which makes them rotated when you watch. Although [...]]]></description>
			<content:encoded><![CDATA[<p>Hi</p>
<p>I&#8217;d like to share a quick tip on rotating video files.</p>
<p>I&#8217;m always frustrated with taking videos with my phone. Shooting single-handed is easiest when the phone is upright rather than in landscape mode. But the files are always saved in landscape mode, which makes them rotated when you watch.<br />
Although there is plenty of GUI software to do it, using the command line is faster and can also be batched!</p>
<p><span id="more-811"></span></p>
<h2>Using FFmpeg</h2>
<p>This is the basic syntax<br />
<code>./ffmpeg -vf transpose=0 -i input.mp4 output.avi</code></p>
<p>Just use <code>-vf transpose=0</code> to rotate 90 deg clockwise. If you get the &#8220;Unrecognized option &#8216;vf&#8217;&#8221; error, you need to configure &#038; build ffmpeg with <code>--enable-filters</code> (or at least without <code>--disable-filters</code>), and check with <code>ffmpeg -filters</code> that they show up.</p>
<p>Also, I always get very lousy video quality when using the plain vanilla settings. It turns out the problem is with the frame rate. If you leave it as-is the (default) mpeg compression makes a lot of I and P frames, and too few B frames. So set it up explicitly.</p>
<p>I ended up with<br />
<code>ffmpeg -vf transpose=0 -i input.mp4 -r 30 output.avi</code></p>
<p>Easy.</p>
<p>Update [3/1/11]: Actually just using <code>-vf transpose=0</code> will flip the video horizontally as well, which is undesirable in some cases. To counter that I use: <code>-vf "transpose=0,hflip=0"</code>, and it resolves the problem.</p>
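<p>Since batching was the whole point, here is a minimal sketch that only generates the command lines for a list of files, so they can be reviewed before running anything. It uses only the options discussed above. The function name, the input file names and the <code>-rotated.avi</code> output naming are assumptions made for the example. I put <code>-vf</code> after <code>-i</code> here, which is where recent ffmpeg builds expect per-output options; treat that as an assumption about your build.</p>

```javascript
// Hypothetical helper: argument list for one ffmpeg run, using only the
// options discussed above (-i, -vf transpose=0, -r 30).
function rotateArgs(input) {
  // Output naming is an assumption: strip the extension, append "-rotated.avi".
  var output = input.replace(/\.[^.\/]+$/, "") + "-rotated.avi";
  return ["-i", input, "-vf", "transpose=0", "-r", "30", output];
}

var inputs = ["clip1.mp4", "clip2.mp4"]; // placeholder file names
var commands = inputs.map(function (f) {
  return "ffmpeg " + rotateArgs(f).join(" ");
});
console.log(commands.join("\n"));
```

<p>File names containing spaces would need quoting before these lines are pasted into a shell.</p>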
<h2>Using MEncoder</h2>
<p>Again, pretty simple thing to do:<br />
<code>mencoder input.mp4 -nosound -o characters-resize-turn.avi -vf rotate=0 -ovc lavc -lavcopts vcodec=mpeg4 -ofps 30</code></p>
<p>Note that I kill the audio with <code>-nosound</code>, and again set the frame rate with <code>-ofps 30</code>, or else I get the &#8220;duplicate frames&#8221; problem.</p>
<p>Enjoy<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2011/02/08/how-to-rotate-a-video-using-mencoder-and-ffmpeg-and-live-to-tell-the-tale/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Hand gesture recognition via model fitting in energy minimization w/OpenCV</title>
		<link>http://www.morethantechnical.com/2010/12/28/hand-gesture-recognition-via-model-fitting-in-energy-minimization-wopencv/</link>
		<comments>http://www.morethantechnical.com/2010/12/28/hand-gesture-recognition-via-model-fitting-in-energy-minimization-wopencv/#comments</comments>
		<pubDate>Mon, 27 Dec 2010 22:11:12 +0000</pubDate>
		<dc:creator>Roy</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[opencv]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Recommended]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[vision]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[work]]></category>
		<category><![CDATA[computer vision]]></category>

		<guid isPermaLink="false">http://www.morethantechnical.com/?p=762</guid>
		<description><![CDATA[Hi Just wanted to share a thing I made &#8211; a simple 2D hand pose estimator, using a skeleton model fitting. Basically there has been a crap load of work on hand pose estimation, but I was inspired by this ancient work. The problem is setting out to find a good solution, and everything is [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.morethantechnical.com/wp-content/uploads/2010/12/hands.png" rel="lightbox[762]"><img src="http://www.morethantechnical.com/wp-content/uploads/2010/12/hands-300x248.png" alt="hands with model fitted" title="hands with model fitted" width="300" height="248" class="aligncenter size-medium wp-image-796" /></a>Hi</p>
<p>Just wanted to share a thing I made &#8211; a simple 2D hand pose estimator, using skeleton model fitting. Basically there has been a crap load of work on hand pose estimation, but I was inspired by <a href="http://scholar.google.com/scholar?cluster=136383770354228708&#038;hl=en&#038;as_sdt=40000000">this ancient work</a>. The problem is hard to solve well, and the published methods are very hard to understand and implement. In such cases I like to be inspired by a method, and just set out with my own implementation. This way, I understand what&#8217;s going on, simplify it, and share it with you!</p>
<p>Anyway, let&#8217;s get down to business.<br />
<span id="more-762"></span></p>
<h1>A bit about energy minimization problems</h1>
<p>A dear friend revealed to me the wonders of energy minimization problems a while back, and ever since I have been trying to find uses for that method. Basically, it is trying to find a minimum (ideally the global one) of a complicated energy function (usually with many parameters), by following the function&#8217;s gradient. Such methods are often called <a href="http://en.wikipedia.org/wiki/Gradient_descent">Gradient Descent</a>, and are used mostly for non-linear systems that can&#8217;t be solved easily using a least-squares variant. </p>
<p>A lot of work in computer vision was done using energy functions (I believe the most seminal was <a href="http://scholar.google.com/scholar?cluster=10809837120977085662&#038;hl=en&#038;as_sdt=40000000">Snakes</a>, over 10,000 citations), usually having two terms: Internal energy and External energy. The equilibrium between the two terms should result in a low-energy system &#8211; our optimal result. So we would like to formulate the terms in our system such that when they are 0 &#8211; they describe the system as we want it.</p>
<p>Following the works with active contours, I believe the external energy function should have to do with how the hand model fits to the hand blob, and the internal energy will have to do with how &#8220;comfortable&#8221; the hand is with this configuration.</p>
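<p>To make the two-term idea concrete, here is a toy sketch with a single variable, in JavaScript for brevity (the project itself is C++). Everything in it is made up for illustration: an external term pulls <code>x</code> toward an observed target, an internal term pulls it back toward a rest value, and plain fixed-step gradient descent finds the equilibrium. With these numbers the derivative vanishes at (target + lambda*rest) / (1 + lambda) = 3.</p>

```javascript
// Toy 1-D energy, purely illustrative (not the hand model):
//   the external term pulls x toward the observed target,
//   the internal term pulls x back toward its rest value.
var target = 4.0, rest = 1.0, lambda = 0.5;

function energy(x) {
  var external = (x - target) * (x - target);
  var internal = lambda * (x - rest) * (x - rest);
  return external + internal;
}

// Analytic derivative of the energy above.
function gradient(x) {
  return 2 * (x - target) + 2 * lambda * (x - rest);
}

// Plain gradient descent with a fixed step size.
function descend(x0, stepSize, iterations) {
  var x = x0;
  for (var i = 0; i < iterations; i++) {
    x -= stepSize * gradient(x);
  }
  return x;
}

var xMin = descend(0.0, 0.1, 200);
console.log(xMin, energy(xMin));
```

<p>Raising <code>lambda</code> makes the model lazier: the equilibrium moves away from the target and toward the rest value.</p>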
<h1>The hand model</h1>
<p>Let&#8217;s see what a 2D model of a hand might look like<br />
<a href="http://www.morethantechnical.com/wp-content/uploads/2010/12/Screen-shot-2010-12-25-at-10.50.41-AM.png" rel="lightbox[762]"><img src="http://www.morethantechnical.com/wp-content/uploads/2010/12/Screen-shot-2010-12-25-at-10.50.41-AM.png" alt="" title="Screen shot 2010-12-25 at 10.50.41 AM" width="232" height="231" class="aligncenter size-full wp-image-790" /></a><br />
Kinda looks like a rake&#8230; huh?</p>
<p>There are some parts that practically can&#8217;t change much, i.e. the palm (orange), and some that might change drastically, i.e. the fingers (red). Each finger has joints (blue circle), and a tip (bigger blue circle).</p>
<pre class="brush: plain; title: ; notranslate">
typedef struct finger_data {
	Point2d origin_offset;		//base of finger relative to center of hand
	double a;					//angle
	vector&lt;double&gt; joints_a;	//angles of joints
	vector&lt;double&gt; joints_d;	//bone length
} FINGER_DATA;

typedef struct hand_data {
	FINGER_DATA fingers[5];		//fingers
	double a;					//angle of whole hand
	Point2d origin;				//center of palm
	Point2d origin_offset;		//offset from center for optimization
	double size;				//relative size of hand = length of a finger
} HAND_DATA;
</pre>
<p>At first I thought, since I&#8217;m only interested in the tips of the fingers, to use Inverse Kinematics to guide the tips to a certain point and let the joints find their own minimal energy position, following <a href="http://freespace.virgin.net/hugo.elias/models/m_ik2.htm">this</a> article. But I abandoned this method because of complications. </p>
<p>I also had to simplify this model, for real-time estimation and also better results. So in the end I ended up with a very rigid model, that allows only one joint per finger and no angular movement.</p>
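<p>For reference, turning such a model into joint positions is a few lines of forward kinematics. The sketch below is a JavaScript mirror of the structs above, written under assumptions of mine: angles are in radians measured from the x axis, each joint angle adds to the direction accumulated so far, and the <code>size</code> field is ignored. The function name is hypothetical.</p>

```javascript
// Hypothetical JavaScript mirror of FINGER_DATA / HAND_DATA above.
// Assumption: joint angles accumulate along the finger, in radians.
function fingerJoints(finger, hand) {
  var angle = hand.a + finger.a;          // base direction of the finger
  var x = hand.origin.x + finger.origin_offset.x;
  var y = hand.origin.y + finger.origin_offset.y;
  var joints = [];
  for (var i = 0; i < finger.joints_d.length; i++) {
    angle += finger.joints_a[i];          // bend at this joint
    x += finger.joints_d[i] * Math.cos(angle);
    y += finger.joints_d[i] * Math.sin(angle);
    joints.push({ x: x, y: y });
  }
  return joints;                          // the last entry is the fingertip
}

var hand = { a: 0, origin: { x: 10, y: 20 } };
var finger = {
  origin_offset: { x: 0, y: 0 },
  a: 0,
  joints_a: [0, Math.PI / 2],             // second bone bends by 90 degrees
  joints_d: [3, 2]                        // bone lengths
};
var joints = fingerJoints(finger, hand);
var tip = joints[joints.length - 1];
console.log(tip.x.toFixed(2), tip.y.toFixed(2));
```

<p>With only one joint per finger and no angular movement, as in the simplified model, the loop runs once and only the bone length is left to optimize.</p>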
<h1>Using tnc.c</h1>
<p>tnc.c is a &#8220;library&#8221;, essentially one C file, that implements a line search algorithm that is able to find the minimum point of a multi-variate function. I&#8217;m not certain of the algorithm details, and it&#8217;s not so important as it can be replaced with any other similar library. But, tnc.c has a great advantage &#8211; it is dead simple. One function will start the gradient descent, calling back a function to calculate the gradients.</p>
<p>So basically I had to write just one very short function:</p>
<pre class="brush: plain; title: ; notranslate">
static int my_f(double x[], double *f, double g[], void *state) {
	DATA_FOR_TNC* d_ptr = (DATA_FOR_TNC*)state;
	DATA_FOR_TNC new_data = *d_ptr;

	mapVecToData(x,new_data.hand);

	*f = calc_Energy(new_data,*d_ptr);

	//calc gradients
	{
		double _x[SIZE_OF_HAND_DATA];

		for(int i=0;i&lt;SIZE_OF_HAND_DATA;i++) {
			memcpy(_x, x, sizeof(double)*SIZE_OF_HAND_DATA); //reset variables
			_x[i] = _x[i] + EPSILON; //change only one variable
			mapVecToData(_x, new_data.hand);
			double E_epsilon = calc_Energy(new_data,*d_ptr);
			g[i] = ((E_epsilon - *f) / EPSILON); //calc the gradient for this variable change
		}
	}

	return 0;
}
</pre>
<p>This function is called by tnc.c on every iteration of the search: <code>double x[]</code> is the state of variables the search is now examining, <code>double* f</code> is the energy for this state, <code>double g[]</code> are the gradients (same size as x[]), and <code>void* state</code> is a user-defined variable that can be carried along the process.</p>
<p>So what I did is simply change the value of each parameter in turn, to test how it affects the energy in the system. I get a measure of the energy, subtract from it the energy of the &#8220;natural&#8221; setup (without any changes to parameters), divide by the size of the change, and I get the gradient for this parameter.</p>
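<p>Stripped of the hand-specific parts, this is a forward-difference gradient. A minimal standalone sketch, in JavaScript and with a toy energy whose true gradient is known, shows the scheme on its own; the function names and the toy energy are assumptions made for the example.</p>

```javascript
// Forward-difference gradient, the same scheme my_f uses: nudge one
// variable by eps, re-evaluate the energy, divide the change by eps.
function numericGradient(f, x, eps) {
  var f0 = f(x);
  var g = [];
  for (var i = 0; i < x.length; i++) {
    var xi = x.slice();   // fresh copy so only one variable changes
    xi[i] += eps;
    g.push((f(xi) - f0) / eps);
  }
  return g;
}

// Toy energy with a known gradient: (2*x0, 3).
function toyEnergy(x) {
  return x[0] * x[0] + 3 * x[1];
}

var g = numericGradient(toyEnergy, [1, 2], 1e-6);
console.log(g);
```

<p>The cost is one extra energy evaluation per parameter on every iteration, which is one reason a smaller model runs so much faster.</p>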
<p>The energy function came out a bit different in the end:</p>
<pre class="brush: plain; title: ; notranslate">

static double calc_Energy(DATA_FOR_TNC&amp; d, DATA_FOR_TNC&amp; orig_d) {
	double _sum = 0.0;

	//external energy: how close are the joints to the hand blob? (how well do they fit to it)
	vector&lt;Point2d&gt; joints;
	Mat tips(5,1,CV_64FC2);

	for (int j=0; j&lt;5; j++) {
		joints.clear();
		FINGER_DATA f = d.hand.fingers[j];
		Point2d _newTip = newTip(f,d.hand,joints); //get joints for this finger

		for (int i=0; i&lt;joints.size(); i++) { //for each joint find how far it is from the blob
			double ds = pointPolygonTest(d.contour, joints[i]+getHandOrigin(d.hand), true);
			ds += 5;
			ds = 1 * ((ds &lt; 0) ? -1 : 1) * (ds*ds) ;
			_sum -= (ds &gt; 0) ? 0 : 100*ds;
		}

		tips.at&lt;Point2d&gt;(j,0) = _newTip;
	}

	//laziness of fingers - joints should strive to be as they were in the natural pose
	vector&lt;double&gt; _angles;
//	for (int j=0; j&lt;5; j++) {
//		FINGER_DATA f = d.hand.fingers[j];
//		FINGER_DATA of = orig_d.hand.fingers[j];
////		_angles.push_back(f.a - of.a);
//		for (int i=0; i&lt;f.joints_d.size(); i++) {
////			_angles.push_back(f.joints_a[i] - of.joints_a[i]);
//			_angles.push_back(f.joints_d[i] - of.joints_d[i]);
//		}
//	}
	_angles.push_back(d.hand.a-orig_d.hand.a); //the angle of the hand should be as it was before
	_sum  += 10000*norm(Mat(_angles));

	if(_sum &lt; 0) return 0;
	return _sum;
}
</pre>
<p>You&#8217;ll notice the commented out section. The &#8220;laziness of fingers&#8221; turned out not to give good results&#8230; A different metric is needed! I have not found it yet, maybe you have a good idea?</p>
<p>Starting tnc.c is very simple: Allocating the vectors for X and gradients, initializing the model from the blob, and calling the <code>simple_tnc</code> convenience method. <code>simple_tnc</code> starts <code>tnc</code> with some default parameters that don&#8217;t affect the outcome (at least in my tries).</p>
<pre class="brush: plain; title: ; notranslate">
void estimateHand(Mat&amp; mymask) {
	double _x[SIZE_OF_HAND_DATA] = {0};
	Mat X(1,SIZE_OF_HAND_DATA,CV_64FC1,_x);
	double f;
	Mat gradients(Size(SIZE_OF_HAND_DATA,1),CV_64FC1,Scalar(0));

	namedWindow(&quot;state&quot;);

	initialize_hand_data(d, mymask);

	mapDataToVec((double*)X.data, d.hand);

	simple_tnc(SIZE_OF_HAND_DATA, (double*)X.data, &amp;f, (double*)gradients.data, my_f, (void*)&amp;d, 1, 0);

	mapVecToData((double*)X.data, d.hand);
	showstate(d,1);

	d.hand.origin = getHandOrigin(d.hand); //move to new position
}
</pre>
<h1>Results and Discussion</h1>
<p>Here are my results so far:<br />
<object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/uETHJQhK144?fs=1&amp;hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/uETHJQhK144?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p>It&#8217;s not perfect, but it&#8217;s a start. Tracking and estimating an open hand is pretty good, with some orientation change as well. But when the fingers are closed&#8230; that&#8217;s where problems start. </p>
<p>Sometimes the joints &#8220;hover&#8221; over the black area to &#8220;land&#8221; in a white area so they &#8220;fit&#8221;, but they should not do that. One easy thing to do to counter this is to measure the distance of the whole bone, and not just the joint.</p>
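<p>A sketch of that whole-bone idea, again in JavaScript and with a made-up stand-in for the blob: the per-point penalty follows the external energy in <code>calc_Energy</code> (signed distance, positive inside, 5 pixel margin, quadratic penalty outside), and the bone version simply samples points along the segment. The function names and the two-disc &#8220;blob&#8221; are assumptions for illustration.</p>

```javascript
// Penalty for one sample point, following calc_Energy above: ds is a signed
// distance (positive inside the blob); points more than `margin` pixels
// outside are penalised quadratically.
function pointPenalty(ds, margin) {
  var shifted = ds + margin;
  return shifted >= 0 ? 0 : 100 * shifted * shifted;
}

// Sum the penalty over samples + 1 points spread evenly from a to b, so a
// bone that crosses the background is penalised even when both of its end
// joints sit inside the blob.
function bonePenalty(a, b, signedDistance, samples, margin) {
  var sum = 0;
  for (var i = 0; i <= samples; i++) {
    var t = i / samples;
    var p = { x: a.x + t * (b.x - a.x), y: a.y + t * (b.y - a.y) };
    sum += pointPenalty(signedDistance(p), margin);
  }
  return sum;
}

// Stand-in blob: two discs of radius 10 centred at (0,0) and (60,0).
function twoDiscs(p) {
  var d1 = 10 - Math.sqrt(p.x * p.x + p.y * p.y);
  var d2 = 10 - Math.sqrt((p.x - 60) * (p.x - 60) + p.y * p.y);
  return Math.max(d1, d2);
}

var a = { x: 0, y: 0 }, b = { x: 60, y: 0 };
var jointsOnly = pointPenalty(twoDiscs(a), 5) + pointPenalty(twoDiscs(b), 5);
var wholeBone = bonePenalty(a, b, twoDiscs, 10, 5);
console.log(jointsOnly, wholeBone);
```

<p>Here the two end points land inside the discs and cost nothing on their own, while the sampled bone picks up a large penalty for the gap it hovers over.</p>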
<p>The model right now doesn&#8217;t use all the joints possible, because it is too heavy computationally. Plus the energy does not depend on (or change) the angle of the fingers. So this is a very very simple model of a hand&#8230;</p>
<p>But, it is a good start! All the <a href="http://www.youtube.com/watch?v=mLT4CFLIi8A&#038;feature=related">other</a> <a href="http://www.youtube.com/watch?v=6Uw_8Y1RuQQ&#038;feature=related">stuff</a> I <a href="http://www.youtube.com/watch?v=B_UYmQJT-F0&#038;feature=related">have</a> <a href="http://www.youtube.com/watch?v=F8GVeV0dYLM&#038;feature=related">seen</a> <a href="http://www.youtube.com/watch?v=Rmh-mZFxWns&#038;feature=related">online</a> is just basic high-curvature points counting and color-based or feature-based segmentation and tracking&#8230; My model actually tries to fit an articulated and precise model of a hand to the image.</p>
<h1>How did you get such nice blobs?!</h1>
<p>You ask. They are beautiful aren&#8217;t they&#8230; nice and clean, easy for tracking and model fitting. It&#8217;s no magic though&#8230;<br />
Well, I took part in a <a href="http://depthjs.media.mit.edu/">project in the Media Lab, called DepthJS</a>, that uses the MS Kinect to control web pages. I wrote the computer-vision part. So all the <a href="https://github.com/doug/depthjs">code is there</a>, you can grab it, I just plugged it into this little project. I based it on <a href="http://openkinect.org/wiki/C%2B%2BOpenCvExample">this very simple example of using OpenCV2.X and libfreenect</a>.</p>
<p>Wow, this was a longie.. I hope you learned something and got inspired. I got to do a second overview of the project, and I&#8217;m inspired. Inspiration all around!</p>
<p>Code is obviously yours for the taking:<br />
<a href="https://github.com/royshil/OpenHPE">https://github.com/royshil/OpenHPE</a></p>
<p>Please contribute your own views, thoughts, code, rants in the comments and github page.</p>
<p>Enjoy<br />
Roy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.morethantechnical.com/2010/12/28/hand-gesture-recognition-via-model-fitting-in-energy-minimization-wopencv/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>

