Packing Better Montages than ImageMagick with Python Rect Packer

ImageMagick has a built-in montage tool. It's good enough for casual montaging, but it's definitely suboptimal for packing images of varying sizes.

All photos from:

Simply using ImageMagick's montage, it looks something like the following. First, the script that I run:

TEMP_DIRECTORY=$(mktemp -d /tmp/montageXXXXXX)
/usr/local/bin/mogrify -path ${TEMP_DIRECTORY}/ -geometry 480x480\> "$@"
/usr/local/bin/montage ${TEMP_DIRECTORY}/* -geometry +2+2 "$( dirname "$1" )"/montage.jpg

First I rescale all the images to "up-to 480x480" keeping aspect ratio, and then run the montage with a 2x2 pixel border.

Original images (just scaled down)

This looks pretty bad, mostly because montage does not pack the rectangles densely.

We could first resize all the images so that their height is e.g. 480px:

for f in "$@"; do
	/usr/local/bin/convert "$f" -geometry x480 "${f%.*}_480h.jpg"
done

And then running montage, to get this:

Images resized to height=480px

This already looks much better, but we have little control over the resulting size of the montage; ImageMagick just does its best at packing everything. With similar heights, that's an easy job. However, we can still see a lot of annoying whitespace on the right. What if there's a better way to pack the images?

Enter, rectpack:

This is a Python package implementing a few algorithms for rectangle packing, a concrete spatial instance of the classic knapsack problem (NP-complete!) from computer science.
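To get a feel for what a rectangle packer does, here is a minimal sketch of a naive "shelf" heuristic. This is not one of rectpack's algorithms, just an illustration of the problem; the function name `shelf_pack` and the sample sizes are made up for the example.

```python
def shelf_pack(sizes, bin_width):
    """Place (w, h) rectangles left to right on horizontal shelves.

    Returns a list of (x, y, w, h, index) placements and the total height used.
    """
    # Sort tallest first so each shelf wastes less vertical space.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][1], reverse=True)
    placements = []
    x = y = shelf_h = 0
    for i in order:
        w, h = sizes[i]
        if w > bin_width:
            raise ValueError('rectangle %d is wider than the bin' % i)
        if x + w > bin_width:
            # The current shelf is full: start a new one below it.
            y += shelf_h
            x = shelf_h = 0
        placements.append((x, y, w, h, i))
        x += w
        shelf_h = max(shelf_h, h)
    return placements, y + shelf_h


if __name__ == '__main__':
    placements, height = shelf_pack([(4, 2), (3, 3), (5, 1), (2, 2)], bin_width=7)
    print(placements, height)
```

A heuristic like this is fast but leaves gaps above the shorter rectangles on each shelf, which is roughly the whitespace problem seen in the montage above. Smarter algorithms try to fill those gaps.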

Here's my script:

import cv2
import os
import glob
from rectpack import newPacker
import numpy as np
import argparse

parser = argparse.ArgumentParser(description='Montage creator with rectpack')
parser.add_argument('--width', help='Output image width', default=5200, type=int)
parser.add_argument('--aspect', help='Output image aspect ratio, \
    e.g. height = <width> * <aspect>', default=1.0, type=float)
parser.add_argument('--output', help='Output image name', default='output.png')
parser.add_argument('--input_dir', help='Input directory with images', default='./')
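# NOTE: type=bool would treat any non-empty string (even "False") as True
parser.add_argument('--debug', help='Draw "debug" info', default=False,
    type=lambda s: s.lower() in ('true', '1', 'yes'))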
parser.add_argument('--border', help='Border around images in px', default=2, type=int)
args = parser.parse_args()

files = sum([glob.glob(os.path.join(args.input_dir, '*.' + e)) for e in ['jpg', 'jpeg', 'png']], [])
print('found %d files in %s' % (len(files), args.input_dir))

print('getting images sizes...')
sizes = [(im_file, cv2.imread(im_file).shape) for im_file in files]

# NOTE: you could pick a different packing algo by setting pack_algo=..., e.g. pack_algo=rectpack.SkylineBlWm
packer = newPacker(rotation=False)
for i, r in enumerate(sizes):
    packer.add_rect(r[1][1] + args.border * 2, r[1][0] + args.border * 2, rid=i)

out_w = args.width
aspect_ratio_wh = args.aspect
out_h = int(out_w * aspect_ratio_wh)

packer.add_bin(out_w, out_h)

# run the packing; rect_list() stays empty until pack() is called
packer.pack()

output_im = np.full((out_h, out_w, 3), 255, np.uint8)

used = []

for rect in packer.rect_list():
    b, x, y, w, h, rid = rect

    used += [rid]

    orig_file_name = sizes[rid][0]
    im = cv2.imread(orig_file_name, cv2.IMREAD_COLOR)
    output_im[out_h - y - h + args.border : out_h - y - args.border, x + args.border:x+w - args.border] = im
    if args.debug:
        cv2.rectangle(output_im, (x,out_h - y - h), (x+w,out_h - y), (255,0,0), 3)
        cv2.putText(output_im, "%d"%rid, (x, out_h - y), cv2.FONT_HERSHEY_PLAIN, 3.0, (0,0,255), 2)

print('used %d of %d images' % (len(used), len(files)))

print('writing image output %s:...' % args.output)
cv2.imwrite(args.output, output_im)


Running it like so:

$ python3 --input_dir ~/Downloads/montage/resize480/ --width 2200 --border 10 --debug True

Resulted in this:

Montage with rectpack

That isn't the best-looking result, but it's definitely nice that it tries to tile things together.

There are some options to consider:

$ python3 --help
usage: [-h] [--width WIDTH] [--aspect ASPECT] [--output OUTPUT]
               [--input_dir INPUT_DIR] [--debug DEBUG] [--border BORDER]

Montage creator with rectpack

optional arguments:
  -h, --help            show this help message and exit
  --width WIDTH         Output image width
  --aspect ASPECT       Output image aspect ratio, e.g. height = <width> *
                        <aspect>
  --output OUTPUT       Output image name
  --input_dir INPUT_DIR
                        Input directory with images
  --debug DEBUG         Draw "debug" info
  --border BORDER       Border around images in px
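One caveat about a flag shaped like `--debug DEBUG`: if argparse is given `type=bool`, it calls `bool()` on the raw string, and any non-empty string (including "False") is truthy. A small converter avoids that. The sketch below compares the two; `str2bool` and `build_parser` are hypothetical helpers for this example, not part of the script above.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: parse common true/false spellings into a bool."""
    v = value.lower()
    if v in ('true', '1', 'yes'):
        return True
    if v in ('false', '0', 'no'):
        return False
    raise argparse.ArgumentTypeError('expected a boolean, got %r' % value)


def build_parser(debug_type):
    """Build a parser with only the --debug flag, using the given converter."""
    parser = argparse.ArgumentParser(description='Montage creator with rectpack')
    parser.add_argument('--debug', help='Draw "debug" info', default=False,
                        type=debug_type)
    return parser


if __name__ == '__main__':
    # bool('False') is True, so the naive parser cannot switch debug off.
    print(build_parser(bool).parse_args(['--debug', 'False']).debug)      # True
    print(build_parser(str2bool).parse_args(['--debug', 'False']).debug)  # False
```

The help output stays the same either way, since the option still takes a single `DEBUG` value.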

Running over the fixed height images:

$ python3 --input_dir ~/Downloads/montage/h480/ --width 4800 --aspect 0.5 --border 5 --debug True


$ python3 --input_dir ~/Downloads/montage/h480/ --width 2500 --aspect 1.2 --border 5

This gives us more control of the montage.
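To compare settings with a number rather than by eye, one option is to measure how much of the output canvas the packed rectangles cover. This is a minimal sketch, assuming tuples shaped like the `(b, x, y, w, h, rid)` entries unpacked from `rect_list()` in the script; `fill_ratio` is a hypothetical helper and the sample values are made up.

```python
def fill_ratio(rects, bin_w, bin_h):
    """Fraction of the bin area covered by packed rectangles.

    rects: iterable of (b, x, y, w, h, rid) tuples, as in the script above.
    """
    covered = sum(w * h for (_b, _x, _y, w, h, _rid) in rects)
    return covered / float(bin_w * bin_h)


if __name__ == '__main__':
    sample = [(0, 0, 0, 100, 50, 0), (0, 100, 0, 50, 50, 1)]
    print(fill_ratio(sample, 200, 100))  # 0.375
```

Note that the rectangles in the script include the border, so the ratio counts border pixels as covered.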



Cylindrical image warping for panorama stitching

Just sharing a code snippet to warp images to cylindrical coordinates, in case you're stitching panoramas in Python OpenCV...

This is an improved version from what I had in class some time ago...
It runs VERY fast. No loops involved, all matrix operations. In C++ this code would look gnarly.. Thanks Numpy!
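As a rough idea of what such a loop-free warp can look like, here is a minimal NumPy sketch that builds the remapping coordinates. It assumes a 3x3 pinhole intrinsics matrix `K`; the function name `cylindrical_warp_maps` is made up for this example, and this is a sketch of the general technique rather than the exact snippet referred to above.

```python
import numpy as np


def cylindrical_warp_maps(h, w, K):
    """Return (map_x, map_y) float32 arrays of shape (h, w).

    For every output pixel, the maps give the source pixel to sample from.
    """
    # Pixel grid in homogeneous coordinates, one row per pixel: (x, y, 1).
    y_i, x_i = np.indices((h, w))
    pix = np.stack([x_i, y_i, np.ones_like(x_i)], axis=-1)
    pix = pix.reshape(-1, 3).astype(np.float64)
    # Back-project to normalized coordinates, read as (theta, height, 1).
    norm = pix @ np.linalg.inv(K).T
    theta, height = norm[:, 0], norm[:, 1]
    # Points on the unit cylinder: (sin(theta), height, cos(theta)).
    cyl = np.stack([np.sin(theta), height, np.cos(theta)], axis=-1)
    # Project the cylinder points back to the image plane and dehomogenize.
    proj = cyl @ K.T
    xy = proj[:, :2] / proj[:, 2:3]
    # Mark coordinates that fall outside the image as invalid (-1).
    outside = (xy[:, 0] < 0) | (xy[:, 0] >= w) | (xy[:, 1] < 0) | (xy[:, 1] >= h)
    xy[outside] = -1
    xy = xy.reshape(h, w, 2).astype(np.float32)
    return xy[..., 0], xy[..., 1]
```

The two maps can then be handed to `cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)` to produce the warped image. The sketch assumes the field of view is narrow enough that `cos(theta)` stays positive.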



Projector-Camera Calibration - the "easy" way

First, let me open by saying projector-camera calibration is NOT EASY. But it's technically not complicated either.

It is however, an amalgamation of optimizations that accrue and accumulate error with each step, so that the end product is not far from a random guess.
So 3D reconstructions I was able to get from my calibrated pro-cam were just a distorted mess of points.

Nevertheless, here come the deets.


Revisiting graph-cut segmentation with SLIC and color histograms [w/Python]

As part of the computer vision class I'm teaching at SBU I asked students to implement a segmentation method based on SLIC superpixels. Here is my boilerplate implementation.

This follows the work I did a very long time ago (2010) on the same subject.

For graph-cut I've used PyMaxflow, which is very easily installed with just pip install PyMaxflow.

The method is simple:

  • Calculate SLIC superpixels (the SKImage implementation)
  • Use markings to determine the foreground and background color histograms (from the superpixels under the markings)
  • Set up a graph with a straightforward energy model: Smoothness term = K-L-Div between a superpix histogram and its neighbor superpix histogram, and Match term = inf if marked as BG or FG, otherwise K-L-Div between the superpix histogram and the FG and BG histograms.
  • To find neighbors I've used Delaunay tessellation (from scipy.spatial), for simplicity. But a full neighbor finding could be implemented by looking at all the neighbors on the superpix's boundary.
  • Color histograms are 2D over H-S (from the HSV)
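The K-L divergence used in both energy terms can be sketched in a few lines. This is a minimal pure-Python version, assuming each 2D H-S histogram has been flattened to a list of non-negative bin counts with at least one non-zero bin; `kl_divergence` and the `eps` smoothing value are choices made for this example, not taken from the boilerplate implementation.

```python
import math


def kl_divergence(p_hist, q_hist, eps=1e-10):
    """K-L divergence D(P || Q) between two histograms of equal length.

    Both histograms are normalized to sum to 1; eps avoids log(0) on empty bins.
    """
    p_total = float(sum(p_hist))
    q_total = float(sum(q_hist))
    kl = 0.0
    for p_count, q_count in zip(p_hist, q_hist):
        p = p_count / p_total + eps
        q = q_count / q_total + eps
        kl += p * math.log(p / q)
    return kl


if __name__ == '__main__':
    superpix = [8, 1, 1, 0]    # made-up flattened H-S histogram of one superpixel
    fg_model = [20, 3, 2, 0]   # made-up foreground histogram
    bg_model = [1, 2, 10, 12]  # made-up background histogram
    # The smaller divergence suggests which model the superpixel matches better.
    print(kl_divergence(superpix, fg_model) < kl_divergence(superpix, bg_model))
```

Keep in mind that K-L divergence is not symmetric, so D(P || Q) and D(Q || P) generally differ; which direction to use for the smoothness term is a design choice.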



WTH OpenGL 4? Rendering elements arrays with VAOs and VBOs in a QGLWidget

I spent an entire day getting OpenGL 4 to display data from a VAO with VBOs so I thought I'd share the results with you guys, save you some pain.

I'm using the excellent GL wrappers from Qt, and in particular QGLShaderProgram.
This is pretty straightforward, but the thing to remember is that OpenGL is looking for the vertices and other elements (color? tex coords?) to come from some bound GL buffer or from the host. So if your app is not working and nothing appears on screen, just make sure GL has a bound buffer and that the shader locations match up and are consistent (see the const int I have on the class here).


Quickly: How to render a PDF to an image in C++?

Using Poppler, of course!
Poppler is a very useful tool for handling PDF, so I've discovered lately. Having tried both muPDF and ImageMagick's Magick++ and failed, Poppler stepped up to the challenge and paid off.

So here's a small example of how to work the API (with OpenCV, naturally):

#include <iostream>
#include <fstream>
#include <sstream>
#include <opencv2/opencv.hpp>
#include <poppler-document.h>
#include <poppler-page.h>
#include <poppler-page-renderer.h>
#include <poppler-image.h>

using namespace cv;
using namespace std;
using namespace poppler;

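Mat readPDFtoCV(const string& filename,int DPI) {
    document* mypdf = document::load_from_file(filename);
    if(mypdf == NULL) {
        cerr << "couldn't read pdf\n";
        return Mat();
    }
    cout << "pdf has " << mypdf->pages() << " pages\n";
    page* mypage = mypdf->create_page(0); // first page only

    page_renderer renderer;
    image myimage = renderer.render_page(mypage,DPI,DPI);
    cout << "created image of  " << myimage.width() << "x"<< myimage.height() << "\n";

    // assumed conversion: wrap the rendered pixels in a Mat header,
    // then copy so that cvimg owns its own data
    Mat cvimg;
    if(myimage.format() == image::format_rgb24) {
        Mat(myimage.height(),myimage.width(),CV_8UC3,myimage.data(),myimage.bytes_per_row()).copyTo(cvimg);
    } else if(myimage.format() == image::format_argb32) {
        Mat(myimage.height(),myimage.width(),CV_8UC4,myimage.data(),myimage.bytes_per_row()).copyTo(cvimg);
    } else {
        cerr << "PDF format no good\n";
        return Mat();
    }
    return cvimg;
}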

All you have to do is give it the DPI (say you want to render in 100 DPI) and a filename.
Keep in mind it only renders the first page, but getting the other pages is just as easy.

That's it, enjoy!