«

»

May 05

Bust out your own graphcut based image segmentation with OpenCV [w/ code]

This is a tutorial on using Graph-Cuts and Gaussian-Mixture-Models for image segmentation with OpenCV in C++ environment.

Been wokring on my masters thesis for a while now, and the path of my work came across image segmentation. Naturally I became interested in Max-Flow Graph Cuts algorithms, being the "hottest fish in the fish-market" right now if the fish market was the image segmentation scene.

So I went looking for a CPP implementation of graphcut, only to find out that OpenCV already implemented it in v2.0 as part of their GrabCut impl. But I wanted to explore a bit, so I found this implementation by Olga Vexler, which is build upon Kolmogorov's framework for max-flow algorithms. I was also inspired by Shai Bagon's usage example of this implementation for Matlab.

Let's jump in...

Bit of Theory

Before we move on, let's dig in a little in the theory. We look at the picture as a set of nodes, where each pixel is node and is connected to its neighbors by edges and has a label - this can be called a Markov Random Field. MRFs can be solved, i.e. give an optimal labeling for each node and thus an optimal labeling, in a number of ways, one of which being graph cuts based on maximal flow. After we label the graph, we expect to get a meaningful segmentation of the image. This paper, by some of the big names in the field (Vexler, Kolmogorov, Agarwala), explains it pretty throughly. There a number of well known segmentation methods that use graph cuts, such as: Lazy Snapping [04], GrabCut [04] and more.

The math in the articles is, as usual, pretty horrific. I like to keep things simple, so I'll try to explain the method of GC-segmentation in a simple way. We all remember min cut - max flow algorithms from 2nd year CS, right? well segmentation using GC is not very different. The magic happens when we weight the nodes and edges in a meaningful way, thus creating meaningful cuts. The weights are usually spit to two terms: Data term (or cost) and Smoothness term. The data term says in simple words: "How happy this pixels is with that label", and the smoothness term pretty much says "How easy can a label expand from this pixel to that neighbor". So when you think about it, the easiest thing would be to put as the data term the likelyhood of a pixel to belong to some label, and for the smoothness term - just use the edges in the picture!

So anyway, back to the code, only thing left is to create a graph, give weights, and max the flow. Here's a bit of code:

Mat im_small = imread("a_pic.jpg");
int num_lables = 3;

// create a grid type graph
GCoptimizationGridGraph gc(im_small.cols,im_small.rows,num_lables);

This piece of code created a directed grid graph where every pixel will be a vertex, and each pixel can have one of 3 lables (3 parts in the image to segment).

GMM to the rescue!

Now for the weighting. One very "standard" way to give a data term to the pixels is by using Gaussian-Mixture-Models (GMM): A method that fits a few gaussian distributions over an unknown probability function to estimate how it looks. In the spirit of keeping things simple, I won't go into details. I'll just say that it's a tool to get the probablility of a pixel to belong to a cluster of other pixels, and it has built-in implementation in OpenCV, which is reason enough for me to use it. In OpenCV GMM models are named EM, which is kind of erroneous, since EM (Expectation-Maximization) is one of the best methods to estimate a GMM and not a GMM itself.

Using EM in OpenCV is really very easy:

CvEM model;
CvEMParams ps(3);

Mat lables;
Mat samples;
im.reshape(1,im.rows*im.cols).convertTo(samples,CV_32FC1,1.0/255.0);

model.train(samples,Mat(),ps,&lables);

Mat probs = model.get_probs();

Here's how it looks when training over the whole image as input data (you can see original image, labeling, minus log probability):

Clustering using EM in OpenCV

But, this is not exactly what we wanted... Since we are dealing with segmentation here, we would like to segment certain area. The purpose of the GMM is to learn how that area looks, based on a small set of samples, and then predict the label for all the pixels in the image.

In GrabCut they pretty much create a GMM for every logical "cluster" they want to segment: Positively Background, Probably Background, Probably Foreground and Positively Foreground. This is a good idea and I will follow it, but again, I'm aiming not for Object Extraction rather for k-way segmentation. In other words I'm looking for a way to divide the image to a few areas that are significantly similar, and also not similar to the other areas.

To do that I will create a K-gaussians GMM for N areas (see the code, it's a long one). I tried 2 versions, where I create a 1-gaussian GMM for each channel (RGB, etc.) of each area it's called doEM1D(), and another one with K-gaussian and N-clusters GMM for each area. The results have varied:

Three 1 channel 1 Gaussian GMMs


Three 3-channels 3-Gaussians per channel GMMs

This will provide us the data-term for our segmentation - each pixel can now say how comfortable it is with the label it got (we simply use the probability from the GMM).

Play it smooth

Right, moving on to the smoothness term. I mentioned before it would be easiest to just use the edges in the image. I use the Sobel filter, which gives a nice strong edge. Again we must look at each pixel's value as the likelyhood to have an edge in it, so we should use -log to get it in nice big integers where the probability drops.

GaussianBlur(gray32f,gray32f,Size(11,11),0.75);

Sobel(gray32f,_tmp,-1,1,0,3);	//sobel for dx
_tmp1 = abs(_tmp);  //use abs value to get also the opposite direction edges
_tmp1.copyTo(res,(_tmp1 > 0.2));  //threshold the small edges...

double maxVal,minVal;
minMaxLoc(_tmp,&minVal,&maxVal);
cv::log(res,_tmp);
_tmp = -_tmp * 0.17;
_tmp.convertTo(grayInt1,CV_32SC1);

Sobel(gray32f,_tmp,-1,0,1,3);	//sobel for dy
_tmp1 = abs(_tmp);
res.setTo(Scalar(0));
_tmp1.copyTo(res,(_tmp1 > 0.2));

imshow("tmp1",res); waitKey();

minMaxLoc(_tmp,&minVal,&maxVal);
cv::log(res,_tmp);
_tmp = -_tmp * 0.17;
_tmp.convertTo(grayInt,CV_32SC1);

Now put everything into a bowl and mix!

And the last part of the process will be to put the labels probabilities per pixel and edges into the grid graph we created earlier:

GCoptimizationGridGraph gc(im.cols,im.rows,num_lables);

//Set the pixel-label probability
int N = im.cols*im.rows;
double log_1_3 = log(1.3);
for(int i=0;i<N;i++) {
   double* ppt = probs.ptr<double>(i);
   for(int l=0;l<num_lables;l++) {
      int icost = MAX(0,(int)floor(-log(ppt[l])/log2));
      gc.setDataCost(i,l,icost);
   }
}

//Set the smoothness cost
Mat Sc = 5 * (Mat::ones(num_lables,num_lables,CV_32SC1) - Mat::eye(num_lables,num_lables,CV_32SC1));
gc.setSmoothCostVH((int*)(Sc.data),(int*)dx.data,(int*)dy.data);

lables.create(N,1,CV_8UC1);

printf("\nBefore optimization energy is %d\n",gc.compute_energy());
gc.expansion(1);
printf("\nAfter optimization energy is %d\n",gc.compute_energy());

//Get the labeling back from the graph
for ( int  i = 0; i < N; i++ )
   ((uchar*)(lables.data + lables.step * i))[0] = gc.whatLabel(i);

Easy. Now the labeling should give us a nice segmentation:

3 x 3-ch-3-gs GMMs labeling


3 x 1-ch-1-gs GMMs labeling

But, there's a lot of noise in the labeling... A good heuristic to apply will be to take only the largest connected-component of each label, and also try to the the component that is closest to the original marking.

Lables extraction without larget component keeping


Lables extraction with largest component keeping

vector<vector<Point>> contours;
for(int itr=0;itr<2;itr++) {

Mat mask = (_tmpLabels == itr); //Get a mask of this label

contours.clear();
//find the contours in that mask
cv::findContours(mask,contours,CV_RETR_EXTERNAL,CV_CHAIN_APPROX_NONE);

//compute areas
vector<double> areas(contours.size());
for(unsigned int ai=0;ai<contours.size();ai++) {
	Mat _pts(contours[ai]);
	Scalar mp = mean(_pts);

	areas[ai] = contourArea(Mat(contours[ai]))/* add some bias here to get components that are closer to initial marking*/;
}

//find largest connected component
double max; Point maxLoc;
minMaxLoc(Mat(areas),0,&max,0,&maxLoc);

//draw back on mask
_tmpLabels.setTo(Scalar(3),mask);	//all unassigned pixels will have value of 3, later we'll turn them to "background" pixels

mask.setTo(Scalar(0)); //clear...
drawContours(mask,contours,maxLoc.y,Scalar(255),CV_FILLED);

//now that the mask has only the wanted component...
_tmpLabels.setTo(Scalar(itr),mask);

}

Code and salutations

As usual the code is available from the blog's SVN:

svn checkout http://morethantechnical.googlecode.com/svn/trunk/GMMGraphCutSegmentation GMMGraphCutSegmentation 

Hey! We're pretty much done! Glad you (and I) made it to the end, it wasn't easy after all... I hope you learned something about GMMs and Graph-Cuts in OpenCV and in general.

BTW: The pictures are from Flickr, under creative commons license.
http://farm1.static.flickr.com/33/40406598_fd4e74d51c_d.jpg
http://www.flickr.com/photos/willemvelthoven/56589010/sizes/m/in/pool-99557785@N00/

See ya!
Roy.

Share
  • http://yeonchool.wordpress.com/ yeonchool

    Hi Roy,

    I have a question. I know that there are two terminals in graph cut, source and sink. by the way, how it is possible to segment image into multiple labels with just the two nodes ?
    I have thought so far that only binary segmentation is possible using the graph cut.

    Thanks in advance.

  • Samo

    Check out Move-Making Algorithms for multi-label segmentation (One vs. all the rest [binary segmentation]: each step is a single graph cut optimization step, the process is iterative until it finishes in some local maximum (this is a NP-hard problem)).

    Multi-label optimization: multi-label energies via the α-expansion and α-β-swap algorithms [http://vision.csd.uwo.ca/code/]

  • sabri

    i tried your code :
    **Erreur2 error C2065: 'CvEM' : identificateur non déclaré
    **Erreur3 error C2923: 'std::vector' : 'CvEM' n'est pas un argument de type modèle valide pour le paramètre '_Ty'
    **Erreur4 error C2514: 'std::vector' : la classe n'a aucun constructeur
    **Erreur5 error C2678: '[' binaire : aucun opérateur trouvé qui accepte un opérande de partie gauche de type 'std::vector' (ou il n'existe pas de conversion acceptable)
    Erreur 6 error C2228: la partie gauche de '.clear' doit avoir un class/struct/union
    Erreur 7 error C2678: '[' binaire : aucun opérateur trouvé qui accepte un opérande de partie gauche de type 'std::vector' (ou il n'existe pas de conversion acceptable)
    Erreur 8 error C2228: la partie gauche de '.train' doit avoir un class/struct/union
    Erreur 9error C2678: '[' binaire : aucun opérateur trouvé qui accepte un opérande de partie gauche de type 'std::vector' (ou il n'existe pas de conversion acceptable)
    Erreur 10 error C2228: la partie gauche de '.get_means' doit avoir un class/struct/union
    Erreur 11 error C2678: '[' binaire : aucun opérateur trouvé qui accepte un opérande de partie gauche de type 'std::vector' (ou il n'existe pas de conversion acceptable) c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 401 1 ConsoleApplication3
    Erreur 12 error C2228: la partie gauche de '.predict' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 401 1 ConsoleApplication3
    Erreur 13 error C2678: '[' binaire : aucun opérateur trouvé qui accepte un opérande de partie gauche de type 'std::vector' (ou il n'existe pas de conversion acceptable) c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 402 1 ConsoleApplication3
    Erreur 14 error C2228: la partie gauche de '.get_weights' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 402 1 ConsoleApplication3
    Erreur 15 error C2227: la partie gauche de '->data' doit pointer vers un type class/struct/union/générique c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 402 1 ConsoleApplication3
    Erreur 16 error C2228: la partie gauche de '.db' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 402 1 ConsoleApplication3
    Avertissement 17 warning C4244: '=' : conversion de 'double' en 'float', perte possible de données c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 407 1 ConsoleApplication3
    Erreur 18 error C2065: 'CvEM' : identificateur non déclaré c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 439 1 ConsoleApplication3
    Erreur 19 error C2923: 'std::vector' : 'CvEM' n'est pas un argument de type modèle valide pour le paramètre '_Ty' c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 439 1 ConsoleApplication3
    Erreur 20 error C3203: 'vector' : la classe modèle non spécialisée ne peut pas être utilisée comme argument modèle pour le paramètre modèle '_Ty' ; type réel attendu c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 439 1 ConsoleApplication3
    Erreur 21 error C2065: 'CvEM' : identificateur non déclaré c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 441 1 ConsoleApplication3
    Erreur 22 error C2923: 'std::vector' : 'CvEM' n'est pas un argument de type modèle valide pour le paramètre '_Ty' c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 441 1 ConsoleApplication3
    Erreur 23 error C2955: 'std::vector' : l'utilisation de classe modèle requiert une liste d'arguments modèle c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 441 1 ConsoleApplication3
    Erreur 24 error C2109: un indice requiert un type tableau ou pointeur c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 472 1 ConsoleApplication3
    Erreur 25 error C2228: la partie gauche de '.clear' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 472 1 ConsoleApplication3
    Erreur 26 error C2109: un indice requiert un type tableau ou pointeur c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 473 1 ConsoleApplication3
    Erreur 27 error C2228: la partie gauche de '.train' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 473 1 ConsoleApplication3
    Erreur 28 error C2109: un indice requiert un type tableau ou pointeur c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 478 1 ConsoleApplication3
    Erreur 29 error C2228: la partie gauche de '.get_means' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 478 1 ConsoleApplication3
    Erreur 30 error C2227: la partie gauche de '->data' doit pointer vers un type class/struct/union/générique c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 478 1 ConsoleApplication3
    Erreur 31 error C2228: la partie gauche de '.db' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 478 1 ConsoleApplication3
    Erreur 32 error C2109: un indice requiert un type tableau ou pointeur c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 501 1 ConsoleApplication3
    Erreur 33 error C2228: la partie gauche de '.get_means' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 501 1 ConsoleApplication3
    Erreur 34 error C2227: la partie gauche de '->data' doit pointer vers un type class/struct/union/générique c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 501 1 ConsoleApplication3
    Erreur 35 error C2228: la partie gauche de '.db' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 501 1 ConsoleApplication3
    Erreur 36 error C2109: un indice requiert un type tableau ou pointeur c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 503 1 ConsoleApplication3
    Erreur 37 error C2228: la partie gauche de '.get_covs' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 503 1 ConsoleApplication3
    Erreur 38 error C2227: la partie gauche de '->data' doit pointer vers un type class/struct/union/générique c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 503 1 ConsoleApplication3
    Erreur 39 error C2228: la partie gauche de '.db' doit avoir un class/struct/union c:\users\s\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\graphcut.cpp 503 1 ConsoleApplication3
    40 IntelliSense : identificateur "CvEM" non défini c:\Users\s\Documents\Visual Studio 2013\Projects\ConsoleApplication3\ConsoleApplication3\graphcut.cpp 342 9 ConsoleApplication3
    41 IntelliSense : identificateur "CvEM" non défini
    42 IntelliSense : identificateur "CvEM" non défini
    Erreur 1 error C4996: 'fopen': This function or variable may be unsafe. Consider using fopen_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.