Image Recoloring using Gaussian Mixture Model and Expectation Maximization [OpenCV, w/Code]

Hi,
I'll present a quick and simple implementation of image recoloring, in fact more like color transfer between images, using OpenCV in C++ environment. The basis of the algorithm is learning the source color distribution with a GMM using EM, and then applying changes to the target color distribution. It's fairly easy to implement with OpenCV, as all the "tools" are built in.

I was inspired by Lior Shapira's work that was presented in Eurographics 09 about image appearance manipulation, and a work about recoloring for the colorblind by Huang et al presented at ICASSP 09. Both works deal with color manipulation using Gaussian Mixture Models.

Update 5/28/2015: Adrien contributed code that works with OpenCV v3! Thanks! https://gist.github.com/adriweb/815c1ac34a0929292db7

Let's see how it's done!

A little theory

I won't bore you with the math, but just to get a hang of the idea, what we would like to do is learn how the colors in the source and target images are distributed. Naturally we can assume, like many other things in nature, the colors in a picture have a normal (Gaussian) distribution, but we can go further by saying the distribution might have a few Gaussians describing it. This is called a mixture distribution, and it's a very handy statistical tool. Mixtures can be estimated using a powerful and ubiquitous tool called Expectation Maximization, which I have previously covered. EM essentially tries to recover the mean (mu) and variance (sigma) of the Gaussians in the mixture, by iteratively checking the current hypothesis against the data until finally converging at an extremum.

Learning the color model

For the learning process we must set up the sample data. So we create a binary mask saying where in the image the model can find the colors to learn.
--images--
Then we scan the mask and for each foreground pixel we add it's value (here, RGB, but basically can be anything) to the sample data. Finally we train the CvEM object, which contains the GMM parameters.

void TrainGMM(CvEM& source_model, Mat& source, Mat& source_mask) {
		int src_samples_size = countNonZero(source_mask);
		Mat source_samples(src_samples_size,3,CV_32FC1);

		Mat source_32f; 
		source_32f = source;

		int sample_count = 0;
		for(int y=0;y<source.rows;y++) {
			Vec3f* row = source_32f.ptr<Vec3f>(y);  //pointer to pixel data in the row
			uchar* mask_row = source_mask.ptr<uchar>(y); //pointer to binary mask
			for(int x=0;x<source.cols;x++) {
				if(mask_row[x] > 0) {
					source_samples.at<Vec3f>(sample_count++,0) = row[x];
				}
			}
		}

		source_model.clear();
		CvEMParams ps(3/* = number of gaussians*/);
		source_model.train(source_samples,Mat(),ps,NULL);
}

What we have are three 3-dimensional (R,G,B) Gaussians, describing the colors in the selected area.

Matching Gaussians

Now we have a couple of GMMs - one for the target and one for the source. The idea is to take a pixel in the target, see how the 3 target Gaussians describe it, and shift it to use the 3 source Gaussians. This will, hopefully, cause its color to change from target to source. But we need to know which Gaussian in the target corresponds to which Gaussian in the source. I made a quick selection algorithm that greedily chooses a Gaussian for each Gaussian. I permutate the order of selection for the greedy algorithm, and pick the best permutation to get closer to the optimal selection.

vector<int> Recoloring::MatchGaussians(CvEM& source_model, CvEM& target_model) {
		int num_g = source_model.get_nclusters();
		Mat sMu(source_model.get_means());
		Mat tMu(target_model.get_means());
		const CvMat** target_covs = target_model.get_covs();
		const CvMat** source_covs = source_model.get_covs();

		double best_dist = std::numeric_limits<double>::max();
		vector<int> best_res(num_g);
		vector<int> prmt(num_g); 

		for(int itr = 0; itr < 10; itr++) {
			for(int i=0;i<num_g;i++) prmt[i] = i;	//make a permutation
			randShuffle(Mat(prmt));

			//Greedy selection
			vector<int> res(num_g);
			vector<bool> taken(num_g);
			for(int sg = 0; sg < num_g; sg++) {
				double min_dist = std::numeric_limits<double>::max(); 
				int minv = -1;
				for(int tg = 0; tg < num_g; tg++) {
					if(taken[tg]) continue;

					//TODO: can save on re-calculation of pairs - calculate affinity matrix ahead
					//double d = norm(sMu(Range(prmt[sg],prmt[sg]+1),Range(0,3)),	tMu(Range(tg,tg+1),Range(0,3)));
					
					//symmetric kullback-leibler
					Mat diff = Mat(sMu(Range(prmt[sg],prmt[sg]+1),Range(0,3)) - tMu(Range(tg,tg+1),Range(0,3)));
					Mat d = diff * Mat(Mat(source_covs[prmt[sg]]).inv() + Mat(target_covs[tg]).inv()) * diff.t();
					Scalar tr = trace(Mat(
						Mat(Mat(source_covs[prmt[sg]])*Mat(target_covs[tg])) + 
						Mat(Mat(target_covs[tg])*Mat(source_covs[prmt[sg]]).inv()) + 
						Mat(Mat::eye(3,3,CV_64FC1)*2)
						));
					double kl_dist = ((double*)d.data)[0] + tr[0];
					if(kl_dist<min_dist) {
						min_dist = kl_dist;
						minv = tg;
					}
				}
				res[prmt[sg]] = minv;
				taken[minv] = true;
			}

                       //total distance for the permutation
			double dist = 0;
			for(int i=0;i<num_g;i++) {
				dist += norm(sMu(Range(prmt[i],prmt[i]+1),Range(0,3)),
							tMu(Range(res[prmt[i]],res[prmt[i]]+1),Range(0,3)));
			}
			if(dist < best_dist) {
				best_dist = dist;
				best_res = res;
			}
		}

		return best_res;
	}

I used Symmetric Kullback-Leibler for the distance between Gaussians, as suggested by Huang et al.

Applying the color

Now all we have to do is use the method Shapira suggested in his work to transform a pixel's color, from the Gaussians describing it.
I'm only putting the essence of the code, the rest is in the file.

		Mat pr; Mat samp(1,3,CV_32FC1);
		for(int y=0;y<target.rows;y++) {
			Vec3f* row = target_32f.ptr<Vec3f>(y);
			uchar* mask_row = target_mask.ptr<uchar>(y);
			for(int x=0;x<target.cols;x++) {
				if(mask_row[x] > 0) {
                                        //take pixel data
					memcpy(samp.data,&(row[x][0]),3*sizeof(float)); 

                                        //Use the GMM to predict how close this pixel is to each gaussian 
					float res = target_model.predict(samp,&pr);

					Mat samp_64f; samp.convertTo(samp_64f,CV_64F);

                                        //Move the pixel to the new Gaussians
					//From Shapira09: Xnew = Sum_i { pr(i) * Sigma_source_i * (Sigma_target_i)^-1 * (x - mu_target) + mu_source }
					Mat Xnew(1,3,CV_64FC1,Scalar(0));
					for(int i=0;i<num_g;i++) {
						if(((float*)pr.data)[i] <= 0) continue;

                                               //For each Gaussian, subtract the original mean and add the target mean,
                                               //use probabilities to get a weighted average.
						Xnew += Mat((
							//Mat(source_covs[match[i]]) *
							//Mat(target_covs[i]).inv() * 
							Mat(samp_64f - tMu_64f(Range(i,i+1),Range(0,3))).t() +
							sMu_64f(Range(match[i],match[i]+1),Range(0,3)).t()
							) * (double)(((float*)pr.data)[i])).t();
					}

                                        //Put pixel back into place
					Mat _tmp; Xnew.convertTo(_tmp,CV_32F);
					memcpy(&(row[x][0]),_tmp.data,sizeof(float)*3);
				}
			}
		}

You might notice I skip the part of multiplying by the covariances matrices, as Shapira did. I found it produces better results, but it's probably caused by a bug.

Results


Code and stuff

Source code is available in SVN repo:

svn checkout http://morethantechnical.googlecode.com/svn/trunk/GMM_Recoloring recoloring

Images from Flickr (Creative Commons):

  • http://www.flickr.com/photos/wwworks/2956622857/sizes/s/
  • http://www.flickr.com/photos/violentz/3199292482/sizes/s/
  • http://www.flickr.com/photos/davidw/164670455/sizes/m/
  • http://www.flickr.com/photos/djania/252225693/sizes/m/

Now go recolor the world!

Thanks for tuning in..
Roy.

Share