How to rotate a video using MEncoder and FFmpeg and live to tell the tale


I'd like to share a quick tip on rotating video files.

I'm always frustrated when taking videos with my phone. With one hand, it's easiest to hold the phone upright rather than in landscape mode. But the files are always saved in landscape orientation, so they come out rotated when you watch them.
Although there is plenty of GUI software that can fix this, using the command line is faster and can also be batched!
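
To give a feel for what "batched" could look like, here is a minimal Python sketch that only builds FFmpeg command lines. It relies on FFmpeg's `transpose` video filter (`transpose=1` is 90 degrees clockwise, `transpose=2` is 90 degrees counterclockwise). The function names, the `*.mp4` pattern and the `_rotated` suffix are my own assumptions for illustration, not the commands from the full post.

```python
from pathlib import Path


def rotate_command(src, dst, transpose=1):
    """Build an FFmpeg command line that rotates src by 90 degrees into dst.

    transpose=1 rotates clockwise, transpose=2 rotates counterclockwise.
    """
    return [
        "ffmpeg",
        "-i", str(src),
        "-vf", f"transpose={transpose}",  # rotate the video stream
        "-c:a", "copy",                   # copy the audio stream unchanged
        str(dst),
    ]


def batch_commands(folder, pattern="*.mp4", suffix="_rotated"):
    """Yield one rotate command per matching file (hypothetical naming scheme)."""
    for src in sorted(Path(folder).glob(pattern)):
        dst = src.with_name(src.stem + suffix + src.suffix)
        yield rotate_command(src, dst)
```

Each command can then be handed to `subprocess.run(cmd, check=True)`. MEncoder has a comparable `rotate` video filter if you prefer it.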

Continue reading "How to rotate a video using MEncoder and FFmpeg and live to tell the tale"


Hand gesture recognition via model fitting in energy minimization w/OpenCV

Hi

Just wanted to share a thing I made - a simple 2D hand pose estimator that uses skeleton model fitting. There has been a crap load of work on hand pose estimation, but I was inspired by this ancient work. The problem with setting out to find a good solution is that everything is very hard to understand and implement. In such cases I like to take inspiration from a method and just set out with my own implementation. This way I understand what's going on, simplify it, and share it with you!

Anyway, let's get down to business.

Edit (6/5/2014): Also see some of my other work on hand gesture recognition using smart contours and particle filters

Continue reading "Hand gesture recognition via model fitting in energy minimization w/OpenCV"


iPhone OS 3.x Raw data of camera frames

Hi All

It looks like it's finally here - a way to grab the raw data of the camera frames on the iPhone OS 3.x.

Update: Apple officially supports this in iOS 4.x using AVFoundation, here's sample code from Apple developer.

A gifted hacker named John DeWeese was nice enough to comment on a post from May '09 with his method of hacking the APIs to get the frames. It's cumbersome, but it looks like it should work, though I haven't tried it yet. I promise to try it soon and share my results.

Way to go John!
Some code would be awesome...



Near realtime face detection on the iPhone w/ OpenCV port [w/code,video]

Hi
OpenCV is by far my favorite CV/Image processing library. When I found an OpenCV port to the iPhone, and saw that someone had even tried to get it to do face detection, I just had to try it for myself.

In this post I'll try to run through the steps I took in order to get OpenCV running on the iPhone, and then how to get OpenCV's face detection to play nice with iPhoneOS's image buffers and video feed (not yet OS 3.0!). Then I'll talk a little about optimization.

Update: Apple officially supports camera video pixel buffers in iOS 4.x using AVFoundation, here's sample code from Apple developer.
Update: I do not have the xcodeproj file for this project, please don't ask for it. Please see here for compiling OpenCV for the iPhone SDK 4.3.

Let's begin
Continue reading "Near realtime face detection on the iPhone w/ OpenCV port [w/code,video]"


Augmented Reality with NyARToolkit, OpenCV & OpenGL


I have been playing around with NyARToolkit's CPP implementation in the last week, and I got some nice results. I tried to keep it as "casual" as I could and not get into the crevices of every library; instead, I wanted to get results, and fast.

First, NyARToolkit is a derivative of the wonderful ARToolkit by the talented people @ HIT Lab NZ & HIT Lab Uni of Washington. NyARToolkit, however, has been ported to many other platforms, like Java, C# and even Flash (Papervision3D?), and in the process was made object oriented, in contrast to ARToolkit's procedural approach. The NyARToolkit people have done a great job, so I decided to build from there.

NyART doesn't provide any video capturing or 3D rendering in its CPP implementation (the other ports do), so I set out to build those on my own. OpenCV is like a second language to me, so I decided to use its video grabbing wrapper for Win32. For 3D rendering I used the straightforward GLUT library, which does an excellent job of ridding the programmer of all the Win#@$#@ API mumbo-jumbo-CreateWindowEx crap.

So let's dive in....
Continue reading "Augmented Reality with NyARToolkit, OpenCV & OpenGL"


Qt & OpenCV combined for face detecting QWidgets

As my search for the best platform to roll-out my new face detection concept continues, I decided to give ol' Qt framework a go.

I like Qt. It's cross-platform, has a clear and nice API, is straightforward, and reminds me somewhat of Apple's Cocoa.

My intention is to get some serious face detection going on mobile devices. So that means either the iPhone, which so far did a crummy job performance-wise, or some other mobile device, preferably linux-based.
This led me to the decision to go with Qt. I believe you can get it to work on any linux-ish platform (limo, moblin, android), and since Nokia bought Trolltech - it's gonna work on Nokia phones soon, awesome!

Let's get to the details, shall we?
Continue reading "Qt & OpenCV combined for face detecting QWidgets"


OpenGL for AviSynth [Update: now w/code]


I had a little project at work recently that involved creating movie clips using AviSynth.
I was appalled by the shabbiness of the freely available transition plugins for AviSynth. They always reminded me of 80s-like video editing...
So I set out to integrate AviSynth with OpenGL to create a nice 3D transition effect for our movie clips.

I had 2 major bases to cover:

  • AviSynth plugin API
  • OpenGL rendering

The AviSynth API is not so well documented, but they have very good ground-up examples on how to DIY a plugin. Here is the one I used, which basically does nothing but copy the input frame to the output frame.
OpenGL, on the other hand, is very well documented and "tutorialed". I based my code on this example from NeHe.

So basically what I wanted to achieve is:

  1. Read input frame (AviSynth)
  2. Paint frame as texture over 3D model (OpenGL)
  3. Draw rendered 3D image to output frame (OpenGL+AviSynth)

Reading the frame is pretty straightforward. Frames come encoded as 24-bit RGB, with a little twist: the row size in bytes is not width*3 as you'd expect. Instead, AviSynth uses a parameter called "Pitch" to determine the row size in bytes, so each row may carry some padding after its pixels.
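
To make the pitch business concrete, here is a small Python sketch (not the plugin's actual code) that walks a raw buffer row by row and keeps only the pixel bytes. The function name and the tiny 2x2 frame with a pitch of 8 are made up for illustration; a real plugin would get the buffer, width, height and pitch from the AviSynth frame.

```python
def unpadded_rows(buffer, width, height, pitch, bytes_per_pixel=3):
    """Return the pixel bytes of each row, skipping the padding that follows it."""
    row_size = width * bytes_per_pixel
    if pitch < row_size:
        raise ValueError("pitch must be at least width * bytes_per_pixel")
    rows = []
    for y in range(height):
        start = y * pitch  # rows begin every `pitch` bytes, not every row_size bytes
        rows.append(bytes(buffer[start:start + row_size]))
    return rows


# A made-up 2x2 frame with 3 bytes per pixel; each 6-byte row is padded to 8 bytes.
frame = bytes([1, 2, 3, 4, 5, 6, 0, 0,
               7, 8, 9, 10, 11, 12, 0, 0])
rows = unpadded_rows(frame, width=2, height=2, pitch=8)
```

Indexing with `y * width * 3` instead of `y * pitch` would start the second row two bytes early here and smear the padding into the image, which is the classic symptom of ignoring the pitch.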

Update (14/9/09): source is now available in the repo: browse download
Continue reading "OpenGL for AviSynth [Update: now w/code]"


Showing video with Qt toolbox and ffmpeg libraries

I recently had to build a demo client that shows short video messages in an Ubuntu environment.
After checking out GTK+ I decided to go with the more natively OOP Qt toolbox (GTKmm didn't look right to me), and I think I made the right choice.

So anyway, I have my video files encoded in some unknown format and I need my program to show them in some widget. I went around looking for an existing example, but I couldn't find anything concrete, except for a good tip here that led me here for an example of using ffmpeg's libavformat and libavcodec, but no end-to-end example including the Qt code.

The ffmpeg example was simple enough to just copy-paste into my project, but the whole painting over the widget's canvas was not covered. Turns out painting video is not as simple as overriding paintEvent()...

Firstly, you need a separate thread for grabbing frames from the video file, because you don't want the GUI event thread doing that.
That makes sense, but once the frame-grabbing thread (which I called VideoThread) had actually grabbed a frame and stored it somewhere in memory, I needed to tell the GUI thread to take those buffered pixels and paint them over the widget's canvas.

This is the moment where I praise Qt's excellent Signals/Slots mechanism. So I'll have my VideoThread emit a signal notifying some external entity that a new frame is in the buffer.
Here's a little code:
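
The actual Qt code is in the full post. As a rough, Qt-free stand-in for the same hand-off, here is a Python sketch that uses only the standard library: a background thread grabs frames and notifies a callback, and the calling thread does the painting. The queue plays the part that a queued signal/slot connection plays in Qt; the `play` function and the byte-string "frames" are assumptions made up for illustration.

```python
import queue
import threading


class VideoThread(threading.Thread):
    """Grabs frames in the background and notifies a callback for each one."""

    def __init__(self, frames, on_frame):
        super().__init__()
        self._frames = frames      # stand-in for a decoder that yields frames
        self._on_frame = on_frame  # plays the role of the emitted signal

    def run(self):
        for frame in self._frames:
            self._on_frame(frame)
        self._on_frame(None)       # sentinel: no more frames


def play(frames, paint):
    """Run the grabber thread and paint every frame on the calling thread."""
    pending = queue.Queue()
    worker = VideoThread(frames, pending.put)
    worker.start()
    while True:
        frame = pending.get()      # blocks until the worker hands over a frame
        if frame is None:
            break
        paint(frame)               # painting stays on the "GUI" thread
    worker.join()


painted = []
play([b"frame-0", b"frame-1", b"frame-2"], painted.append)
```

The point of the design is that the grabbing thread never touches the widget; it only announces that a frame is ready, and the thread that owns the canvas decides when to paint it.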

Continue reading "Showing video with Qt toolbox and ffmpeg libraries"