
# Skin Detection with Probability Maps and Elliptical Boundaries [OpenCV, w/code]

Skin detection with Skin Probability Maps and Elliptical Boundary Models, implemented in C++ with OpenCV

Sharing a bit of code I created for skin detection.

At first I tried Skin Probability Maps (SPMs), which I read about in a number of papers (Kakumanu et al. 2007, Gomez & Morales 2002).
Basically, SPMs are very easy to implement: calculate the histogram for the skin region and for the non-skin region, and then make a decision for each pixel based on those histograms.
There are a few assumptions and decisions to make, to name a few:

1. What is the skin region in a given image, if we don't yet have a model for it? I call that the Bootstrap problem.
2. What colorspace to use for the histogram?
3. The per-pixel decision parameter (theta in the literature).

### Bootstrap

For bootstrapping (namely, getting an initial guess for skin pixels) I used the method from Gomez & Morales 2002:

//get HSV
Mat hsv; cvtColor(rgb, hsv, CV_BGR2HSV);
//get Normalized RGB (aka rgb)
Mat nrgb = getNormalizedRGB(rgb);
//take the pixels that are inside the ranges in both colorspaces
//H=[0,50], S= [0.20,0.68] and V= [0.35,1.0]
//r = [0.36,0.465], g = [0.28,0.363]
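
The range test described in the comments above could look roughly like the sketch below. It uses only the C++ standard library rather than OpenCV so that it stands alone; `isBootstrapSkin` and `inRange` are hypothetical names, not the code from the download, and `getNormalizedRGB` is not reproduced here. It assumes H is measured in degrees and S, V, r, g are scaled to [0,1], which is how the ranges read. Note that OpenCV's 8-bit HSV stores H as degrees/2 and S, V in 0-255, so the thresholds would need rescaling when working on the `hsv` Mat directly.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: true when lo <= v <= hi.
static bool inRange(double v, double lo, double hi) {
    return v >= lo && v <= hi;
}

// Sketch of the bootstrap test: a pixel is an initial skin candidate when it
// falls inside the HSV ranges AND the normalized-rgb ranges quoted above.
// Assumes H in degrees [0,360) and S, V, r, g in [0,1].
bool isBootstrapSkin(std::uint8_t R, std::uint8_t G, std::uint8_t B) {
    const double sum = double(R) + G + B;
    if (sum == 0.0) return false;            // black pixel: normalized rgb undefined
    const double r = R / sum, g = G / sum;   // normalized rgb (b = 1 - r - g)

    const double maxc = std::max({R, G, B}) / 255.0;
    const double minc = std::min({R, G, B}) / 255.0;
    const double delta = maxc - minc;
    const double V = maxc;
    const double S = maxc > 0.0 ? delta / maxc : 0.0;

    double H = 0.0;                          // hue in degrees
    if (delta > 0.0) {
        const double rf = R / 255.0, gf = G / 255.0, bf = B / 255.0;
        if (maxc == rf)      H = 60.0 * ((gf - bf) / delta);
        else if (maxc == gf) H = 60.0 * ((bf - rf) / delta + 2.0);
        else                 H = 60.0 * ((rf - gf) / delta + 4.0);
        if (H < 0.0) H += 360.0;
    }

    return inRange(H, 0.0, 50.0) && inRange(S, 0.20, 0.68) && inRange(V, 0.35, 1.0)
        && inRange(r, 0.36, 0.465) && inRange(g, 0.28, 0.363);
}
```

Calling this for every pixel and writing 255 or 0 into a mask would give the initial guess that the training step starts from.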


### Training

After we get a bootstrap, which you can see is pretty bad, we can start training the model to get a better segmentation. Essentially this means calculating 2 histograms, one for skin pixels and one for non-skin pixels.

Mat nrgb = getNormalizedRGB(img_rgb);
MatND skin_hist,non_skin_hist;
//create a probability density function
for (int ubin=0; ubin < 250; ubin++) {
for (int vbin = 0; vbin < 250; vbin++) {
if (skin_hist.at<float>(ubin,vbin) > 0) {
skin_hist.at<float>(ubin,vbin) /= skin_pixels;
}
if (non_skin_hist.at<float>(ubin,vbin) > 0) {
non_skin_hist.at<float>(ubin,vbin) /= non_skin_pixels;
}
}
}


We normalize each histogram by the number of skin (or non-skin) pixels, so that it sums to 1 and can be used as a Probability Density Function.

### Predicting

The SPM algorithm simply states you make a binary prediction (yes or no) for each pixel based on a $$\theta$$ value (a variable given to the algorithm):

$$\dfrac{p(c | \text{skin})}{p(c | \text{non-skin})} > \theta$$

The implementation is trivial if you already have the normalized histograms (and thus PDFs):

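float skin_hist_val = skin_hist.at<float>(gbin,rbin);
if (skin_hist_val > 0) {
float non_skin_hist_val = non_skin_hist.at<float>(gbin,rbin);
if (non_skin_hist_val > 0) {
if ((skin_hist_val / non_skin_hist_val) > theta_thresh) {
//likelihood ratio above theta: mark the pixel as skin
} else {
//ratio at or below theta: mark the pixel as non-skin
}
} else {
//color seen only in skin pixels (assumed here: mark as skin)
}
} else {
//color never seen in skin pixels: mark as non-skin
}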
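
The same decision can be condensed into a small function, shown here as a sketch in plain C++. `isSkinSPM` is a hypothetical name, and the handling of the two zero cases is an assumption: a color never seen in skin pixels is rejected, and a color seen only in skin pixels is accepted.

```cpp
// Sketch of the SPM decision for one pixel, given the two normalized
// histogram values for its color bin. Returns true for "skin".
bool isSkinSPM(float pSkin, float pNonSkin, float theta) {
    if (pSkin <= 0.0f) return false;      // color never seen in skin pixels
    if (pNonSkin <= 0.0f) return true;    // assumption: seen only in skin pixels
    return (pSkin / pNonSkin) > theta;    // likelihood-ratio test
}
```

A larger theta makes the detector stricter (fewer false positives, more missed skin); a smaller one makes it more permissive.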


### Retraining

I find it good to re-train the model a number of times. What I mean is that I iterate over train-predict operations, where in every iteration I use the result of the prediction to train the model again:

for (int i=0; i<10; i++) { //predict-train N times for convergence
medianBlur(mask, mask, 3);
spm.train(cameraframe, mask);
spm.predict(cameraframe, mask);
imshow("mask", mask);
waitKey(50);
}

The results get kind of better with each retraining. Of course there's a risk of converging around something which is not skin, and then it all goes to hell.

[Figure: skin region mask after 10 re-trains]

## Elliptical Boundary Model

I learned of the elliptical boundary model (EBM) through the paper by Lee and Yoo from 2002, but I found its results to be worse than those of the SPM. In fact I anticipated that, since the EBM tries to model the distribution (or rather, gives a criterion for a pixel-based decision based on the distribution), whereas the SPMs do not model the distribution and use it as is. The SPMs are more memory-intensive (keeping the entire histograms in memory), but are more accurate since all the information is retained.
The EBM models the skin color distribution by fitting an ellipse (or ellipsoid) to the distribution of skin color values in the histogram.

It makes a decision with the following inequality, classifying a pixel as skin when it falls inside the ellipse: $$[X-\psi]^t \Lambda^{-1} [X-\psi] < \theta$$
Where $$X$$ is the tested pixel color-value in r-g space, $$\psi$$ is the (unweighted) mean of the distinct color values in the distribution and $$\Lambda$$ is calculated like so:
$$\Lambda = \frac{1}{N} \sum_{i=1}^n {f_i (X_i - \mu) (X_i - \mu)^t}$$
Where $$\mu$$ is a weighted mean of the colors in the histogram, $$f_i$$ is the frequency of the color-value $$X_i$$, $$n$$ is the number of distinct colors (non-empty bins) and $$N = \sum_{i=1}^n f_i$$ is the total number of samples; all of these come from the histogram.
So you can see how simple it is to calculate (given that f_hist is the histogram on the r-g space):

void train() {
T ustep = range_dist[0]/hist_bins[0], vstep = range_dist[1]/hist_bins[1];
//calc n, X_i and mu
Mat_<T> mu(1,2); mu.setTo(0);
vector<T> f;
int n = countNonZero(f_hist);
int count = 0;
int N = 0;
Mat_<T> X(n,2);
for (int ubin=0; ubin < hist_bins[0]; ubin++) {
for (int vbin = 0; vbin < hist_bins[1]; vbin++) {
T histval = f_hist.at<T>(ubin,vbin);
if (histval > 0) {
//X_i is the center of the bin in r-g space
Mat_<T> sampleX = (Mat_<T>(1,2) << low_range[0] + ustep * (ubin+.5), low_range[1] + vstep * (vbin+.5));
sampleX.copyTo(X.row(count++));
f.push_back(histval);
mu += histval * sampleX;
N += histval;
}
}
}
mu /= (T)N;
//calc psi - mean of DB
reduce(X, psi, 0, CV_REDUCE_AVG);
//calc Lambda
Lambda.create(2,2); Lambda.setTo(0); //create() does not zero the matrix
for (int i=0; i < n; i++) {
Mat_<T> X_m_mu = (X.row(i) - mu);
Mat_<T> prod = f[i] * X_m_mu.t() * X_m_mu;
Lambda += prod;
}
Lambda /= N;
Mat_<T> linv = Lambda.inv();
Lambda_inv.val[0] = linv(0,0);
Lambda_inv.val[1] = linv(0,1);
Lambda_inv.val[2] = linv(1,0);
Lambda_inv.val[3] = linv(1,1);
cout << "n = " << n << " N = " << N << " mu " << mu << "\npsi " << psi << "\nlambda_inv "<< Lambda_inv<<"\n";
}

But like I said, the results are not so good ($$\theta = 2$$ in this case).
There’s again the option of “retraining” the EBM, although this time it’s basically just accumulating the r-g histogram with more pixels.

### Code

Check the code out, with a snippet on how to use it at: http://web.media.mit.edu/~roys/src/SPMandEBM.zip
Thanks!
Roy.