code graphics opencv programming Recommended vision Website

Bust out your own graphcut based image segmentation with OpenCV [w/ code]

This is a tutorial on using Graph-Cuts and Gaussian-Mixture-Models for image segmentation with OpenCV in C++ environment.
Update 10/30/2017: See a new implementation of this method using OpenCV-Python, PyMaxflow, SLIC superpixels, Delaunay and other tricks.
Been wokring on my masters thesis for a while now, and the path of my work came across image segmentation. Naturally I became interested in Max-Flow Graph Cuts algorithms, being the “hottest fish in the fish-market” right now if the fish market was the image segmentation scene.
So I went looking for a CPP implementation of graphcut, only to find out that OpenCV already implemented it in v2.0 as part of their GrabCut impl. But I wanted to explore a bit, so I found this implementation by Olga Vexler, which is build upon Kolmogorov’s framework for max-flow algorithms. I was also inspired by Shai Bagon’s usage example of this implementation for Matlab.
Let’s jump in…

3d code graphics opencv opengl programming Recommended vision Website

Quick and Easy Head Pose Estimation with OpenCV [w/ code]

Update: check out my new post about this
Just wanted to share a small thing I did with OpenCV – Head Pose Estimation (sometimes known as Gaze Direction Estimation). Many people try to achieve this and there are a ton of papers covering it, including a recent overview of almost all known methods.
I implemented a very quick & dirty solution based on OpenCV’s internal methods that produced surprising results (I expected it to fail), so I decided to share. It is based on 3D-2D point correspondence and then fitting of the points to the 3D model. OpenCV provides a magical method – solvePnP – that does this, given some calibration parameters that I completely disregarded.
Here’s how it’s done

code graphics opencv opengl programming Recommended school video vision Website

Implementing PTAM: stereo, tracking and pose estimation for AR with OpenCV [w/ code]

Been working hard at a project for school the past month, implementing one of the more interesting works I’ve seen in the AR arena: Parallel Tracking and Mapping (PTAM) [PDF]. This is a work by George Klein [homepage] and David Murray from Oxford university, presented in ISMAR 2007.
When I first saw it on youtube [link] I immediately saw the immense potential – mobile markerless augmented reality. I thought I should get to know this work a bit more closely, so I chose to implement it as a part of advanced computer vision course, given by Dr. Lior Wolf [link] at TAU.
The work is very extensive, and clearly is a result of deep research in the field, so I set to achieve a few selected features: Stereo initialization, Tracking, and small map upkeeping. I chose not to implement relocalization and full map handling.
This post is kind of a tutorial for 3D reconstruction with OpenCV 2.0. I will show practical use of the functions in cvtriangulation.cpp, which are not documented and in fact incomplete. Furthermore I’ll show how to easily combine OpenCV and OpenGL for 3D augmentations, a thing which is only briefly described in the docs or online.
Here are the step I took and things I learned in the process of implementing the work.
Update: A nice patch by yazor fixes the video mismatching – thanks! and also a nice application by Zentium called “iKat” is doing some kick-ass mobile markerless augmented reality.

graphics programming Website work

Extending Justin Talbot's GrabCut Impl [w/ code]

Justin Talbot has done a tremendous job implementing the GrabCut algorithm in C [link to paper, link to code]. I was missing though, the option to load ANY kind of file, not just PPMs and PGMs.
So I tweaked the code a bit to receive a filename and determine how to load it: use the internal P[P|G]M loaders, or offload the work to the OpenCV image loaders that take in many more type. If the OpenCV method is used, the IplImage is converted to the internal GrabCut code representation.

Image<Color>* load( std::string file_name )
 if( file_name.find( ".pgm" ) != std::string::npos )
 return loadFromPGM( file_name );
 else if( file_name.find( ".ppm" ) != std::string::npos )
 return loadFromPPM( file_name );
 return loadOpenCV(file_name);
void fromImageMaskToIplImage(const Image<Real>* image, IplImage* ipli) {
 for(int x=0;x<image->width();x++) {
 for(int y=0;y<image->height();y++) {
 //Color c = (*image)(x,y);
 Real r = (*image)(x,y);
 CvScalar s = cvScalarAll(0);
 if(r == 0.0) {
 s.val[0] = 255.0;
 cvSet2D(ipli,ipli->height - y - 1,x,s);
Image<Color>* loadIplImage(IplImage* im) {
 Image<Color>* image = new Image<Color>(im->width, im->height);
 for(int x=0;x<im->width;x++) {
 for(int y=0;y<im->height;y++) {
 CvScalar v = cvGet2D(im,im->height-y-1,x);
 Real R, G, B;
 R = (Real)((unsigned char)v.val[2])/255.0f;
 G = (Real)((unsigned char)v.val[1])/255.0f;
 B = (Real)((unsigned char)v.val[0])/255.0f;
 (*image)(x,y) = Color(R,G,B);
 return image;
Image<Color>* loadOpenCV(std::string file_name) {
 IplImage* im = cvLoadImage(file_name.c_str(),1);
 Image<Color>* i = loadIplImage(im);
 return i;

Well, there’s nothing fancy here, but it does give you a fully working GrabCut implementation on top of OpenCV… so there’s the contribution.

GrabCutNS::Image<GrabCutNS::Color>* imageGC = GrabCutNS::loadIplImage(orig);
 GrabCutNS::Image<GrabCutNS::Color>* maskGC = GrabCutNS::loadIplImage(mask);
 GrabCutNS::GrabCut *grabCut = new GrabCutNS::GrabCut( imageGC );
 IplImage* __GCtmp = cvCreateImage(cvSize(orig->width,orig->height),8,1);

I also added the GrabCutNS namespace, to differentiate the Image class from the rest of the code (that probably has an Image already).
Code is as usual available online in the SVN repo.

graphics Mobile phones programming video vision

Near realtime face detection on the iPhone w/ OpenCV port [w/code,video]

iphone + opencv = winHi
OpenCV is by far my favorite CV/Image processing library. When I found an OpenCV port to the iPhone, and even someone tried to get it to do face detection, I just had to try it for myself.
In this post I’ll try to run through the steps I took in order to get OpenCV running on the iPhone, and then how to get OpenCV’s face detection play nice with iPhoneOS’s image buffers and video feed (not yet OS 3.0!). Then i’ll talk a little about optimization
Update: Apple officially supports camera video pixel buffers in iOS 4.x using AVFoundation, here’s sample code from Apple developer.
Update: I do not have the xcodeproj file for this project, please don’t ask for it. Please see here for compiling OpenCV for the iPhone SDK 4.3.
Let’s begin

3d graphics opengl programming video

Augmented Reality with NyARToolkit, OpenCV & OpenGL

I have been playing around with NyARToolkit’s CPP implementation in the last week, and I got some nice results. I tried to keep it as “casual” as I could and not get into the crevices of every library, instead, I wanted to get results and fast.
First, NyARToolkit is a derivative of the wonderful ARToolkit by the talented people @ HIT Lab NZ & HIT Lab Uni of Washington. NyARToolkit however was ported to many other different platforms, like Java, C# and even Flash (Papervision3D?), and in the process making it object oriented, instead of ARToolkit procedural approach. NyARToolkit have made a great job, so I decided to build from there.
NyART don’t provide any video capturing, and no 3D rendering in their CPP implementation (they do in the other ports), so I set out to build it on my own. OpenCV is like a second language to me, so I decided to take its video grabbing mechanism wrapper for Win32. For 3D rendering I used the straightforward GLUT library which does an excellent job ridding the programmer from all the Win#@$#@ API mumbo-jumbo-CreateWindowEx crap.
So let’s dive in….

graphics programming vision

Porting Rob Hess's SIFT impl. to Java

beavers_siftThis is a Java port of Rob Hess’ implementation of SIFT that I did for a project @ work.
However, I couldn’t port the actual extraction of SIFT descriptors from images as it relies very heavily on OpenCV. So actually all that I ported to native Java is the KD-Tree features matching part, and the rest is in JNI calls to Rob’s code.
I wrote this more as a tutorial to Rob’s work, with an easy JNI interface to Java.
You can find the sources here:
Here’s how to use it:

graphics gui programming vision work

Combining Java's BufferedImage and OpenCV's IplImage

I recently did a small project combining a Java web service with a OpenCV processing. I tried to transfer the picture from Java environment (as BufferedImage) to OpenCV (IplImage) as seamlessly as possible. This proved a but tricky, especially the Java part where you need to create your own buffer for the image, but it worked out nicely.
Let me show you how I did it

gui linux programming qt Uncategorized video

Qt & OpenCV combined for face detecting QWidgets

As my search for the best platform to roll-out my new face detection concept continues, I decided to give ol’ Qt framework a go.
I like Qt. It’s cross-platform, a clear a nice API, straightforward, and remindes me somewhat of Apple’s Cocoa.
My intention is to get some serious face detection going on mobile devices. So that means either the iPhone, which so far did a crummy job performance-wise, or some other mobile device, preferably linux-based.
This led me to the decision to go with Qt. I believe you can get it to work on any linux-ish platform (limo, moblin, android), and since Nokia baught Trolltech – it’s gonna work on Nokia phones soon, awesome!
Lets get to the details, shall we?