Mar 05

Skin Detection with Probability Maps and Elliptical Boundaries [OpenCV, w/code]

Sharing a bit of code I created for skin detection.

At first I tried Skin Probability Maps (SPMs), which I read about in a number of papers (Kakumanu et al. 2007, Gomez & Morales 2002).
Basically, SPMs are very easy to implement: calculate a color histogram for the skin region and another for the non-skin region, and then make a decision for each pixel based on those histograms.
There are a few assumptions and decisions to make, to name a few:

  1. What is the skin region in a given image, if we don't yet have a model for it? I call that the bootstrap problem.
  2. Which colorspace to use for the histograms?
  3. The per-pixel decision parameter (theta in the literature).


For bootstrapping (namely, getting an initial guess for skin pixels) I used the method from Gomez & Morales 2002:

//get HSV
Mat hsv; cvtColor(rgb, hsv, CV_BGR2HSV);
//get Normalized RGB (aka rgb)
Mat nrgb = getNormalizedRGB(rgb);
//take the pixels that are inside the ranges in both colorspaces
Mat mask_hsv, mask_nrgb;
//H=[0,50], S= [0.20,0.68] and V= [0.35,1.0]
inRange(hsv, Scalar(0,0.2*255.0,0.35*255.0), Scalar(50.0/2.0,0.68*255.0,1.0*255.0), mask_hsv);
//r = [0.36,0.465], g = [0.28,0.363]
inRange(nrgb, Scalar(0,0.28,0.36), Scalar(1.0,0.363,0.465), mask_nrgb);

//combine the masks
outputmask = mask_hsv & mask_nrgb;

[Figure: Input RGB]

[Figure: Bootstrap mask]


After we get a bootstrap mask, which as you can see is pretty bad, we can start training the model to get a better segmentation. Essentially this means calculating two histograms, one for skin pixels and one for non-skin pixels.

Mat nrgb = getNormalizedRGB(img_rgb);

MatND skin_hist,non_skin_hist;
skin_hist = calc_rg_hist(nrgb,mask);
non_skin_hist = calc_rg_hist(nrgb,~mask);
//create a probability density function
float skin_pixels = countNonZero(mask), non_skin_pixels = countNonZero(~mask);
for (int ubin=0; ubin < 250; ubin++) {
	for (int vbin = 0; vbin < 250; vbin++) {
		if (skin_hist.at<float>(ubin,vbin) > 0) {
			skin_hist.at<float>(ubin,vbin) /= skin_pixels;
		}
		if (non_skin_hist.at<float>(ubin,vbin) > 0) {
			non_skin_hist.at<float>(ubin,vbin) /= non_skin_pixels;
		}
	}
}

We normalize each histogram by the number of skin (or non-skin) pixels to create a Probability Density Function (one whose bins sum to 1).

[Figure: Skin pixels histogram]

[Figure: Non-skin pixels histogram]


The SPM algorithm simply states you make a binary prediction (yes or no) for each pixel based on a \theta value (a variable given to the algorithm):

\dfrac{p(c | \text{skin})}{p(c | \text{non-skin})} > \theta

The implementation is trivial if you already have the normalized histograms (and thus PDFs):

float skin_hist_val = skin_hist.at<float>(gbin,rbin);
if (skin_hist_val > 0) {
	float non_skin_hist_val = non_skin_hist.at<float>(gbin,rbin);
	if (non_skin_hist_val > 0) {
		if((skin_hist_val / non_skin_hist_val) > theta_thresh)
			result_mask(i) = 255;
		else
			result_mask(i) = 0;
	} else {
		result_mask(i) = 0;
	}
} else {
	result_mask(i) = 0;
}
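
The same decision rule can be written as a small standalone function, which makes the corner cases explicit. This is a sketch in plain C++, not the code from the zip: isSkin and classify are hypothetical names, and the PDFs are assumed to be flattened into vectors with each pixel already mapped to a bin index.

```cpp
#include <cstddef>
#include <vector>

// Likelihood-ratio test: skin if p(c|skin) / p(c|non-skin) > theta.
// As in the snippet above, a colour that was never seen in the skin
// histogram or never seen in the non-skin histogram is labelled non-skin.
bool isSkin(float pSkin, float pNonSkin, float theta) {
    if (pSkin <= 0.0f) return false;
    if (pNonSkin <= 0.0f) return false;
    return pSkin / pNonSkin > theta;
}

// Classify a list of pixels, each given as an index into the flattened PDFs.
// Returns a mask with 255 for skin and 0 for non-skin.
std::vector<unsigned char> classify(const std::vector<std::size_t>& pixelBins,
                                    const std::vector<float>& skinPdf,
                                    const std::vector<float>& nonSkinPdf,
                                    float theta) {
    std::vector<unsigned char> mask;
    mask.reserve(pixelBins.size());
    for (std::size_t bin : pixelBins) {
        mask.push_back(isSkin(skinPdf[bin], nonSkinPdf[bin], theta) ? 255 : 0);
    }
    return mask;
}
```

Raising theta makes the classifier stricter (fewer false positives, more missed skin), and lowering it does the opposite.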


I find it helps to re-train the model a number of times. What I mean is that I run several train-predict iterations, where in every iteration I use the result of the prediction to train the model again:

for (int i=0; i<10; i++) { //predict-train N times for convergence
	medianBlur(mask, mask, 3);
	spm.train(cameraframe, mask);
	spm.predict(cameraframe, mask);
	imshow("mask", mask);
	waitKey(50);
}

The results get kind of better with each retraining. Of course there's a risk of converging around something which is not skin, and then it all goes to hell.

[Figure: skin region mask after 10 re-trains]

[Figure: input RGB]

[Figure: skin hist after 10 re-trains]

[Figure: non-skin hist after 10 re-trains]

Elliptical Boundary Model

I learned of the elliptical boundary model (EBM) through the paper by Lee and Yoo from 2002, but I found its results to be worse than those of the SPM. In fact I anticipated that, since the EBM tries to model the distribution (or rather, gives a criterion for a pixel-based decision based on the distribution), whereas the SPMs do no modeling of the distribution and use it as is. The SPMs are more memory-intensive (keeping the entire histograms in memory), but are more accurate since all the information is retained.

The EBM models the skin color distribution by fitting an ellipse (or ellipsoid) to the distribution of skin color values in the histogram.

It makes a decision using the following inequality, classifying a pixel as skin when it falls inside the ellipse:
[X-\psi]^t \Lambda^{-1} [X-\psi] < \theta
where X is the tested pixel's color value in r-g space, \psi is the plain (unweighted) mean of the color values that occur in the histogram, and \Lambda is calculated like so:
\Lambda = \frac{1}{N} \sum_{i=1}^n {f_i (X_i - \mu) (X_i - \mu)^t}
Here n is the number of distinct color values (non-empty bins), f_i is the frequency of the color value X_i, N = \sum_{i=1}^n f_i is the total number of samples, and \mu = \frac{1}{N} \sum_{i=1}^n f_i X_i is the frequency-weighted mean of the colors. All of these come from the histogram.
So you can see how simple it is to calculate (given that f_hist is the histogram on the r-g space):

void train() {
	T ustep = range_dist[0]/hist_bins[0], vstep = range_dist[1]/hist_bins[1];

	//calc n, X_i and mu
	Mat_<T> mu(1,2); mu.setTo(0);
	vector<T> f;
	int n = countNonZero(f_hist);
	int count = 0;
	int N = 0;
	Mat_<T> X(n,2);
	for (int ubin=0; ubin < hist_bins[0]; ubin++) {
		for (int vbin = 0; vbin < hist_bins[1]; vbin++) {
			T histval = f_hist.at<T>(ubin,vbin);
			if (histval > 0) {
				//X_i is the center of the bin in r-g coordinates
				Mat_<T> sampleX = (Mat_<T>(1,2) << low_range[0] + ustep * (ubin+.5), low_range[1] + vstep * (vbin+.5));
				sampleX.copyTo(X.row(count++));
				f.push_back(histval);
				mu += histval * sampleX;
				N += histval;
			}
		}
	}
	mu /= (T)N;

	//calc psi - mean of DB
	reduce(X, psi, 0, CV_REDUCE_AVG);

	//calc Lambda (create() does not zero the matrix, so clear it before accumulating)
	Lambda.create(2,2); Lambda.setTo(0);
	for (int i=0; i < n; i++) {
		Mat_<T> X_m_mu = (X.row(i) - mu);
		Mat_<T> prod = f[i] * X_m_mu.t() * X_m_mu;
		Lambda += prod;
	}
	Lambda /= N;
	Mat_<T> linv = Lambda.inv();
	Lambda_inv.val[0] = linv(0,0);
	Lambda_inv.val[1] = linv(0,1);
	Lambda_inv.val[2] = linv(1,0);
	Lambda_inv.val[3] = linv(1,1);

	cout << "n = " << n << " N = " << N << " mu " << mu << "\npsi " << psi << "\nlambda_inv "<< Lambda_inv<<"\n";
}

But like I said, the results are not so good (\theta = 2 in this case).

There's again the option of "retraining" the EBM, although this time it's basically just accumulating the r-g histogram with more pixels.


Check the code out, with a snippet on how to use it at: http://web.media.mit.edu/~roys/src/SPMandEBM.zip


  • kevin

    hi, roy , what's fps can this reach?

  • Yury

    I can't see your source code.. would u please sent me it with e-mail. I very interested in this problem.

  • Rachitha


    When I try to compile your code, I get the following error:

    skinprobablilitymaps.h(123): error C2660: 'cv::Mat_::reshape' : function does not take 2 arguments

    I am using opencv 2.3.1, and i checked the documentation, but reshape does need two arguments, any idea why this is happening?

  • monkdc

    have solved the problem you listed here,dose it work?

  • ali

    TQ very much for your great tutorial and provided code! appreciate it alot.

  • Royi

    How good this method is compared to the State of The Art in literature?