Bust out your own graphcut based image segmentation with OpenCV [w/ code]
This is a tutorial on using Graph-Cuts and Gaussian-Mixture-Models for image segmentation with OpenCV in C++ environment.
Been wokring on my masters thesis for a while now, and the path of my work came across image segmentation. Naturally I became interested in Max-Flow Graph Cuts algorithms, being the “hottest fish in the fish-market” right now if the fish market was the image segmentation scene.
So I went looking for a CPP implementation of graphcut, only to find out that OpenCV already implemented it in v2.0 as part of their GrabCut impl. But I wanted to explore a bit, so I found this implementation by Olga Vexler, which is build upon Kolmogorov’s framework for max-flow algorithms. I was also inspired by Shai Bagon’s usage example of this implementation for Matlab.
Let’s jump in…
Bit of Theory
Before we move on, let’s dig in a little in the theory. We look at the picture as a set of nodes, where each pixel is node and is connected to its neighbors by edges and has a label – this can be called a Markov Random Field. MRFs can be solved, i.e. give an optimal labeling for each node and thus an optimal labeling, in a number of ways, one of which being graph cuts based on maximal flow. After we label the graph, we expect to get a meaningful segmentation of the image. This paper, by some of the big names in the field (Vexler, Kolmogorov, Agarwala), explains it pretty throughly. There a number of well known segmentation methods that use graph cuts, such as: Lazy Snapping [04], GrabCut [04] and more.
The math in the articles is, as usual, pretty horrific. I like to keep things simple, so I’ll try to explain the method of GC-segmentation in a simple way. We all remember min cut – max flow algorithms from 2nd year CS, right? well segmentation using GC is not very different. The magic happens when we weight the nodes and edges in a meaningful way, thus creating meaningful cuts. The weights are usually spit to two terms: Data term (or cost) and Smoothness term. The data term says in simple words: “How happy this pixels is with that label”, and the smoothness term pretty much says “How easy can a label expand from this pixel to that neighbor”. So when you think about it, the easiest thing would be to put as the data term the likelyhood of a pixel to belong to some label, and for the smoothness term – just use the edges in the picture!
So anyway, back to the code, only thing left is to create a graph, give weights, and max the flow. Here’s a bit of code:
Mat im_small = imread("a_pic.jpg");
int num_lables = 3;
// create a grid type graph
GCoptimizationGridGraph gc(im_small.cols,im_small.rows,num_lables);
This piece of code created a directed grid graph where every pixel will be a vertex, and each pixel can have one of 3 lables (3 parts in the image to segment).
GMM to the rescue!
Now for the weighting. One very “standard” way to give a data term to the pixels is by using Gaussian-Mixture-Models (GMM): A method that fits a few gaussian distributions over an unknown probability function to estimate how it looks. In the spirit of keeping things simple, I won’t go into details. I’ll just say that it’s a tool to get the probablility of a pixel to belong to a cluster of other pixels, and it has built-in implementation in OpenCV, which is reason enough for me to use it. In OpenCV GMM models are named EM, which is kind of erroneous, since EM (Expectation-Maximization) is one of the best methods to estimate a GMM and not a GMM itself.
Using EM in OpenCV is really very easy:
CvEM model; CvEMParams ps(3); Mat lables; Mat samples; im.reshape(1,im.rows*im.cols).convertTo(samples,CV_32FC1,1.0/255.0); model.train(samples,Mat(),ps,&lables); Mat probs = model.get_probs();
Here’s how it looks when training over the whole image as input data (you can see original image, labeling, minus log probability):
But, this is not exactly what we wanted… Since we are dealing with segmentation here, we would like to segment certain area. The purpose of the GMM is to learn how that area looks, based on a small set of samples, and then predict the label for all the pixels in the image.
In GrabCut they pretty much create a GMM for every logical “cluster” they want to segment: Positively Background, Probably Background, Probably Foreground and Positively Foreground. This is a good idea and I will follow it, but again, I’m aiming not for Object Extraction rather for k-way segmentation. In other words I’m looking for a way to divide the image to a few areas that are significantly similar, and also not similar to the other areas.
To do that I will create a K-gaussians GMM for N areas (see the code, it’s a long one). I tried 2 versions, where I create a 1-gaussian GMM for each channel (RGB, etc.) of each area it’s called doEM1D(), and another one with K-gaussian and N-clusters GMM for each area. The results have varied:
This will provide us the data-term for our segmentation – each pixel can now say how comfortable it is with the label it got (we simply use the probability from the GMM).
Play it smooth
Right, moving on to the smoothness term. I mentioned before it would be easiest to just use the edges in the image. I use the Sobel filter, which gives a nice strong edge. Again we must look at each pixel’s value as the likelyhood to have an edge in it, so we should use -log to get it in nice big integers where the probability drops.
GaussianBlur(gray32f,gray32f,Size(11,11),0.75);
Sobel(gray32f,_tmp,-1,1,0,3); //sobel for dx
_tmp1 = abs(_tmp); //use abs value to get also the opposite direction edges
_tmp1.copyTo(res,(_tmp1 > 0.2)); //threshold the small edges...
double maxVal,minVal;
minMaxLoc(_tmp,&minVal,&maxVal);
cv::log(res,_tmp);
_tmp = -_tmp * 0.17;
_tmp.convertTo(grayInt1,CV_32SC1);
Sobel(gray32f,_tmp,-1,0,1,3); //sobel for dy
_tmp1 = abs(_tmp);
res.setTo(Scalar(0));
_tmp1.copyTo(res,(_tmp1 > 0.2));
imshow("tmp1",res); waitKey();
minMaxLoc(_tmp,&minVal,&maxVal);
cv::log(res,_tmp);
_tmp = -_tmp * 0.17;
_tmp.convertTo(grayInt,CV_32SC1);
Now put everything into a bowl and mix!
And the last part of the process will be to put the labels probabilities per pixel and edges into the grid graph we created earlier:
GCoptimizationGridGraph gc(im.cols,im.rows,num_lables);
//Set the pixel-label probability
int N = im.cols*im.rows;
double log_1_3 = log(1.3);
for(int i=0;i<N;i++) {
double* ppt = probs.ptr<double>(i);
for(int l=0;l<num_lables;l++) {
int icost = MAX(0,(int)floor(-log(ppt[l])/log2));
gc.setDataCost(i,l,icost);
}
}
//Set the smoothness cost
Mat Sc = 5 * (Mat::ones(num_lables,num_lables,CV_32SC1) - Mat::eye(num_lables,num_lables,CV_32SC1));
gc.setSmoothCostVH((int*)(Sc.data),(int*)dx.data,(int*)dy.data);
lables.create(N,1,CV_8UC1);
printf("\nBefore optimization energy is %d\n",gc.compute_energy());
gc.expansion(1);
printf("\nAfter optimization energy is %d\n",gc.compute_energy());
//Get the labeling back from the graph
for ( int i = 0; i < N; i++ )
((uchar*)(lables.data + lables.step * i))[0] = gc.whatLabel(i);
Easy. Now the labeling should give us a nice segmentation:
But, there’s a lot of noise in the labeling… A good heuristic to apply will be to take only the largest connected-component of each label, and also try to the the component that is closest to the original marking.
vector<vector<Point>> contours;
for(int itr=0;itr<2;itr++) {
Mat mask = (_tmpLabels == itr); //Get a mask of this label
contours.clear();
//find the contours in that mask
cv::findContours(mask,contours,CV_RETR_EXTERNAL,CV_CHAIN_APPROX_NONE);
//compute areas
vector<double> areas(contours.size());
for(unsigned int ai=0;ai<contours.size();ai++) {
Mat _pts(contours[ai]);
Scalar mp = mean(_pts);
areas[ai] = contourArea(Mat(contours[ai]))/* add some bias here to get components that are closer to initial marking*/;
}
//find largest connected component
double max; Point maxLoc;
minMaxLoc(Mat(areas),0,&max,0,&maxLoc);
//draw back on mask
_tmpLabels.setTo(Scalar(3),mask); //all unassigned pixels will have value of 3, later we'll turn them to "background" pixels
mask.setTo(Scalar(0)); //clear...
drawContours(mask,contours,maxLoc.y,Scalar(255),CV_FILLED);
//now that the mask has only the wanted component...
_tmpLabels.setTo(Scalar(itr),mask);
}
Code and salutations
As usual the code is available from the blog’s SVN:
svn checkout http://morethantechnical.googlecode.com/svn/trunk/GMMGraphCutSegmentation GMMGraphCutSegmentation
Hey! We’re pretty much done! Glad you (and I) made it to the end, it wasn’t easy after all… I hope you learned something about GMMs and Graph-Cuts in OpenCV and in general.
BTW: The pictures are from Flickr, under creative commons license.
http://farm1.static.flickr.com/33/40406598_fd4e74d51c_d.jpg
http://www.flickr.com/photos/willemvelthoven/56589010/sizes/m/in/pool-99557785@N00/
See ya!
Roy.








Hi,
I’m learning OpenCV and I just discovered this blog. I had been battling through the wikipedia page on Markov Random Fields ( among many others ) and in one sentance you described what one is clearly. I’m sure there is many ideas that seem enormously complicated when looking at mathematical definitions and equations but can actually be explained quite simply. So, thanks for that, and the code. I’ll keep checking here for more enlightenment.
Saul
Thanks for your share of the code,but why the code is not completed?Could you give me the full code? I really need it very much,thank you again~~
O(∩_∩)O~
My Email is 316855257@qq.com ,Could you send the code to me?
I really need that for my next study,Thank you~~
Hi Wangjin
The whole code is in the SVN repo.. you can even see the complete file online at code.google.com/p/morethantechnical
Roy
Hi Roy:
Glad to receive your reply,you are so kind,
Hope more exchange for the future time~
Good Luck to you.
Wang Jin
Roy,
Sorry,but I also want to ask,How can I get the .h file “GCoptimization.h”?
Hi Roy,
I am developing some algorithms using OpenCV for fun during my free time (right now I am working on Active Appearance Models), and I stumbled upon your blog while looking for some segmentation code. I know too much how ungrateful publishing open source code can be, as too many people rip it off without caring for the developers. So I just wanted to take the time to thank you for your work.
Also, since I am trying to maintain a little blog myself, I wanted to tell you to keep up on posting your good articles, as I find them very interesting and inspiring. Thank you for that, too!
Finally regarding your experiments with Graph-Cut algorithms, have your tried to work in different color spaces? I have seen in the source code that you have commented a line converting the image from RGB to HSV. How did this turn out? Skin textures are always very sensitive to illumination, and sometimes results can be improved if the image is transformed from RGB to YCbCr, and then only the Cb and Cr terms are kept to detect skin. Have you tried anything in the YCbCr space?
Cheers,
Manu
Hi Manu
Thank you for the encouragement.. up keeping a blog is need a diffcult task, but it’s worth every comment and approach from people that the blog did help. That was the goal in the first place.
Regarding the HSV and YCbCr, Yes, I have experimented with different colorspaces, but the best results were achieved in the RGB colorspace. The main problem with conversion to other colorspaces is the difference in quantization, for (a simple) example the HSV is not [0..255]^3, rather the H channel is [0..180]. I recommend using 32F images if possible.
However this should be researched more deeply, with more examples, so please share your exprience!
Roy.
Hi Roy,
I must to say,this article is perfect,I am intersted in this aspact,and now I am trying to realize it via MATLAB,however,I have one problem to ask you.
Now I have realized the GMM code to get the probablility of a pixel to belong to a certain cluster,I use it as the Data Term simplely,but the result is not so satisfactory.I don’t very understand how did you use the
probablility.
Did you divide the whole image into K areas,then use the GMM model to every area? And then how to get the Data Term,I want to sovle this problem clearly,so should you explain it for me?
Thank you for your time,wishes to you.
libn
Hey Roy,
I made some tests, and what worked better for me when performing a segmentation of facial images based on a simple color comparison was to stay in RGB space (I just compute the mean color and standard deviation in the facial area and consider that each pixel within [mean - std, mean + std] is a facial pixel). YCbCr gave similar results compared to RGB on most of my tests, and sometimes worse results. As for HSV, I haven’t tried it, but I agree that as the HSV space is just a linear combination of the RGB space, and since the value domain of HSV is more limited than RGB, then results are very unlikely to be better than with RGB.
Also, I have tried something different. I have implemented a color model based on Lambertain surface reflectance. I obtained good results when training the model with multiple images for background segmentation, but I did not get anything conclusive when training the model on an area of a single image. I have actually blogged about this color model a few weeks ago:
http://codecapsule.com/2010/09/01/background-segmentation-lambertain-color-model-opencv/
Finally, regarding segmentation in a single image, I have obtained quite good results with the GrabCut implementation provided in OpenCV 2.1. However, I have to admit that it is sometimes a bit slow.
Manu
Hi Roy,
I do not know how to use the codes since it combines c++ and matlab. Could you explain how to compile the code? Is it run in c++ or matlab?
This is pure C++ just using OpenCV.. No MATLAB needed.
To compile, generally you should:
- setup a project in your environment (MS Visual Studio, XCode, CodeBlocks or just a Makefile)
- Add in all the (cpp & h) files from the repo
- Link in OpenCV, as in add the include directories (all the .h and .hpp) so the compiler will find them, add all libraries (.la or .lib files) directories so the linker will find them
- Compile and run!
This is very general, but should get you there
Roy.
Hello Roy
let me first thank you for sharing this knowledge and the valuable information
can you please explain how can i checkout the code from the SVN. I used TortoiseSVN with the given URL:
http://morethantechnical.googlecode.com/svn/trunk/GMMGraphCutSegmentation
but it gives me an error
thanks again
My Email is mjkim87@swu.ac.kr ,Could you send the whole code to me?
I really need that for my study,please^^ Thank you~~
All the code you need is @ the SVN repo:
svn checkout https://morethantechnical.googlecode.com/svn/trunk/GMMGraphCutSegmentationHi, can you explain in more detail this code:
//Set the smoothness cost
Mat Sc = 5 * (Mat::ones(num_lables,num_lables,CV_32SC1) – Mat::eye(num_lables,num_lables,CV_32SC1));
gc.setSmoothCostVH((int*)(Sc.data),(int*)dx.data,(int*)dy.data);
1) what is the purpose of Sc?
2) i check the documetaon fro GC, and there is no explanation for setSmoothCostVH(they even don’t use it at all), There many other functions to set up smoothies term(fro example setSmoothness), so i don’t understand why did you choose to use setSmoothCostVH?
@maxim
This is the “smoothness term” matrix.
It marks the penalty for moving from one label to another in neighboring pixels.
I’m using it because it’s the only way to set it up without what they call a “Functor”, but rather using just a bulk of data.
Their Functor is basically a callback function with (x,y) position so you can specify the smoothness term your self for each pixel.
Go ahead and use whatever method suits your needs… I believe it’s all the same.
Hi,
what is the motivation to use -log (and why do you divide it by log2)
int icost = MAX(0,(int)floor(-log(ppt[l])/log2));
instead of directly estimated probability something like this:
icost = ppt[l];
?
hi Roy,
first all i can tell you is thanks for the sharing knowledge i learn a lot here.. Could you send the full code to me? i really need it for my segmentation study..
my email is : wilsonsbrown@gmail.com
thx before..
@maxim:
your probability will look something like this:
p = exp(-f(u))
So in will be close to 1 for f(u) close to 0 and go towards 0 when f(u) is large. High probabilities are good. The cost function from the “energy world” works the other way around: A high cost is bad. Thats where the minus comes from. The logarithm is used to get rid of the exp(…), which makes it unnecessarily complex to minimize the cost. Because log is a monotonically increasing function, minimizing
-log(exp(-f(u)))= f(u)
yields the same u as maximizing
p=exp(-f(u))
Its just easier.
@Roy: Awesome article, I am trying to do something similar. How fast is this solution that you are proposing? I just checked out the code – its only one file, is that correct? I will look into it in more detail soon.
Hello,
Could you please provide a link to the header file:
#include “GCoptimization.h”
Thanks
hello , ,
thank you so mucht for you explanation
can you pleaze say me how can i , use your code with 4 class of color , in your program you use only 3 colors , but i have an image with 4 objets to semgment
( i am so sorry for my bad language , i’m a french ^^” )
sorry , but i didnt find this file “GCoptimization.h”
can you plz share it , or send it to me i really this code to comprare this method with my method of my study project
thank you so much
@kadi
Find the Graph-Cut Optimization library here: http://vision.csd.uwo.ca/code/
For the 4-way segmentation, you just need to add an extra CvEM object, and label the pixels accordingly.
thank you very much for your reply,
I spent a night trying to execute your code, i corrected a lot of problems, but I can not correct this problem (I created a new project, I added the library files gruphcut, and link with the opencv library)
can you please help me to correct these errors, I will be thankful to you, I really need to run it today and see what it will give me
http://data.imagup.com/10/1130477755.JPG
i use opencv2.2 and code::bloks IDE
why it does not find this function “calCovarMatrix” , and “countours” , and the arguments !
there is an other library i have to link it with my project ?
Could you please say me whey i have this kind of errors ?
i didnt find answers
hi Roy ,
i think that , there is not a fucntion in opencv2.2 named “” calCovarMatrix “” , so i use now “” cvCalCovarMatrix “” but it give me this bug : “” we cannot convert cv::Mat to cvArr** ? i goggled but now answer :”(
can you say me your IDE and version of opencv used in the compilation of this code ?
thank you so much .
Hi Roy,
I’m having some problems with compiling the code? I am using Visual Studio 2008 and OpenCV 2.1.
Here is the error:
error C2664: ‘GCoptimizationGridGraph::setSmoothCostVH’ : cannot convert parameter 1 from ‘int *’ to ‘GCoptimization::EnergyTermType *’
1> Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast
Hope you could help me fix this.
Thanks!
@carl
You must define GCoptimization::EnergyTermType to be ‘int’ if you plan to use integers.
Do that in GCoptimization.h.
Hi, Roy.
I have a problem with using cv::findContours function. There’s a access violation runtime error. It breaks at calling the fincContours function line. Someone have a same problem with me. I just thought that this function maybe crash on win32.
So I just wanna know your compile environment(system, compilers, and openCV version).
Thanks in advance.
hey this tutorial was very helpful… .. you have explained very elegantly… i ll be lookin forward for your other tutorials..
hi,roy,i don.t konw why ,in the graph.cpp
1>graphcut.obj : error LNK2019: 无法解析的外部符号 “public: __int64 __thiscall GCoptimization::expansion(int)” (?expansion@GCoptimization@@QAE_JH@Z),该符号在函数 “void __cdecl graphcut(class cv::Mat &,class cv::Mat &,int)” (?graphcut@@YAXAAVMat@cv@@0H@Z) 中被引用
1>graphcut.obj : error LNK2019: 无法解析的外部符号 “public: __int64 __thiscall GCoptimization::compute_energy(void)” (?compute_energy@GCoptimization@@QAE_JXZ),该符号在函数 “void __cdecl graphcut(class cv::Mat &,class cv::Mat &,int)” (?graphcut@@YAXAAVMat@cv@@0H@Z) 中被引用
1>graphcut.obj : error LNK2019: 无法解析的外部符号 “public: void __thiscall GCoptimization::setSmoothCost(int *)” (?setSmoothCost@GCoptimization@@QAEXPAH@Z),该符号在函数 “void __cdecl graphcut(class cv::Mat &,class cv::Mat &,int)” (?graphcut@@YAXAAVMat@cv@@0H@Z) 中被引用
1>graphcut.obj : error LNK2019: 无法解析的外部符号 “public: void __thiscall GCoptimizationGridGraph::setSmoothCostVH(int *,int *,int *)” (?setSmoothCostVH@GCoptimizationGridGraph@@QAEXPAH00@Z),该符号在函数 “void __cdecl graphcut(class cv::Mat &,class cv::Mat &,int)” (?graphcut@@YAXAAVMat@cv@@0H@Z) 中被引用
1>graphcut.obj : error LNK2019: 无法解析的外部符号 “public: void __thiscall GCoptimization::setDataCost(int,int,int)” (?setDataCost@GCoptimization@@QAEXHHH@Z),该符号在函数 “void __cdecl graphcut(class cv::Mat &,class cv::Mat &,int)” (?graphcut@@YAXAAVMat@cv@@0H@Z) 中被引用
1>graphcut.obj : error LNK2019: 无法解析的外部符号 “public: __thiscall GCoptimizationGridGraph::GCoptimizationGridGraph(int,int,int)” (??0GCoptimizationGridGraph@@QAE@HHH@Z),该符号在函数 “void __cdecl graphcut(class cv::Mat &,class cv::Mat &,int)” (?graphcut@@YAXAAVMat@cv@@0H@Z) 中被引用
1>graphcut.obj : error LNK2019: 无法解析的外部符号 “public: virtual __thiscall GCoptimizationGridGraph::~GCoptimizationGridGraph(void)” (??1GCoptimizationGridGraph@@UAE@XZ),该符号在函数 “void __cdecl graphcut1(class cv::Mat &,class cv::Mat &,class cv::Mat &,class cv::Mat &,int,class cv::Mat &)” (?graphcut1@@YAXAAVMat@cv@@000H0@Z) 中被引用
1>C:\Users\waterlife\Documents\Visual Studio 2008\Projects\Graphiccut\Debug\Graphiccut.exe : fatal error LNK1120: 7 个无法解析的外部命令
can u give me some ideas
@reachgoal
Looks like a linker problem…
Make sure the GCOptimization library is compiled on the bundle, or is linked to your executable properly
@ Roy,
Error 2 error LNK2019: unresolved external symbol _cvSVD referenced in function “public: void __thiscall GaussianFitter::finalize(struct Gaussian &,unsigned int,bool)const ” (?finalize@GaussianFitter@@QBEXAAUGaussian@@I_N@Z) GMM.obj GrabCut
what is needed here.
@CV
Obviously, the linker is missing the cvSVD function.
Make sure all OpenCV .lib/.so/.dylib files are properly added to the linking step.