Hi
I saw the stats for the blog a while ago and it seems that the augmented reality topic is hot! 400 clicks/day, that’s awesome!
So I wanted to share with you my latest development in this field – cross-compiling the AR app to the iPhone. It proved easier than I originally thought, although it took a while to get it working smoothly.
Basically all I did was take NyARToolkit, compile it for the armv6 arch, combine it with Norio Nomura’s iPhone camera video feed code, slap on some simple OpenGL ES rendering, and bam – Augmented Reality on the iPhone.
Update: Apple officially supports camera video pixel buffers in iOS 4.x using AVFoundation, here’s sample code from Apple developer.
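For anyone on iOS 4.x, here is a rough sketch of what that AVFoundation route looks like. This is not what the rest of this post uses, the class name CameraFrameGrabber is mine, and Apple’s sample linked above is the real reference – treat this as an outline only:

// Rough sketch: grabbing camera pixel buffers with AVFoundation on iOS 4.x.
// Names here are illustrative; see Apple's sample code for the real thing.
#import <AVFoundation/AVFoundation.h>
#import <CoreMedia/CoreMedia.h>
#import <CoreVideo/CoreVideo.h>

@interface CameraFrameGrabber : NSObject <AVCaptureVideoDataOutputSampleBufferDelegate>
@property (nonatomic, retain) AVCaptureSession *session;
@end

@implementation CameraFrameGrabber
@synthesize session;

- (void)start {
    self.session = [[[AVCaptureSession alloc] init] autorelease];

    // Hook up the default camera as input.
    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
    [self.session addInput:input];

    // Ask for BGRA frames, delivered to a serial queue.
    AVCaptureVideoDataOutput *output = [[[AVCaptureVideoDataOutput alloc] init] autorelease];
    output.videoSettings = [NSDictionary dictionaryWithObject:
        [NSNumber numberWithInt:kCVPixelFormatType_32BGRA]
        forKey:(id)kCVPixelBufferPixelFormatTypeKey];
    dispatch_queue_t queue = dispatch_queue_create("camera.frames", NULL);
    [output setSampleBufferDelegate:self queue:queue];
    dispatch_release(queue);
    [self.session addOutput:output];

    [self.session startRunning];
}

// Called for every captured frame; the bytes could be fed to the texture
// upload and marker detection exactly like the hacked feed used below.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);
    unsigned char *frameBytes = (unsigned char *)CVPixelBufferGetBaseAddress(pixelBuffer); // BGRA pixels
    // ... hand frameBytes to the GL texture / NyARToolkit code ...
    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
}
@end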
This is how I did it…
I recommend you read my last post on this matter. It has some insights, however superficial, into working with the C++ implementation of NyARToolkit, which I also use here.
Getting NyARToolkit C++ to compile on iPhone
First of all, I needed to cross-compile NyARToolkit for the iPhone’s CPU architecture (ARM), but this turned out to be a very simple task – it compiled right off the bat! No tweaking whatsoever.
But that’s only the beginning, as iPhone apps are built using Objective-C and not C++ (maybe they can be written in C++, but all the documentation is in Obj-C). So I needed to write an Obj-C wrapper around NyARTk to allow my iPhone app to interact with it.
I only needed a very small set of functions out of NyARTk to get augmented reality – those that have to do with marker detection. I ended up with a lean API:
@interface NyARToolkitWrapper : NSObject {
    bool wasInit;
}
-(void)initNyARTwithWidth:(int)width andHeight:(int)height;
-(bool)detectMarker:(float[])resultMat;
-(void)setNyARTBuffer:(Byte*)buf;
-(void)getProjectionMatrix:(float[])m;
@end
I also have some functions I used for debugging and non-optimized stages. The inner workings of the wrapper are not very interesting (you can see them in the code yourself); they mainly invoke NyARSingleDetectMarker functions.
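Just to give an idea of its shape, here’s a skeleton of the wrapper’s .mm file (Objective-C++ lets the Obj-C and C++ worlds meet in one file). The bodies below are placeholders describing roughly what the real ones do; the actual NyARToolkit calls are in the downloadable code:

// NyARToolkitWrapper.mm - Objective-C++ skeleton of the wrapper.
// The real implementations mostly call into NyARSingleDetectMarker
// and the NyARToolkit camera-parameter classes.
#import "NyARToolkitWrapper.h"

@implementation NyARToolkitWrapper

-(void)initNyARTwithWidth:(int)width andHeight:(int)height {
    // Load the camera parameters and marker pattern, scale them to
    // width x height, and create the NyARSingleDetectMarker instance.
    wasInit = true;
}

-(bool)wasInit {
    return wasInit;
}

-(void)setNyARTBuffer:(Byte*)buf {
    // Point NyARToolkit's RGBA raster at buf - no copy; it keeps reading
    // from this same buffer on every detectMarker: call.
}

-(bool)detectMarker:(float[])resultMat {
    // Run marker detection on the current buffer; on a hit, convert the
    // resulting transformation into a 4x4 OpenGL modelview matrix in resultMat.
    return false; // placeholder
}

-(void)getProjectionMatrix:(float[])m {
    // Convert the intrinsic camera parameters into an OpenGL projection matrix in m.
}

@end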
In the beginning – there was only marker detection
OK, to get AR basically what I need to do is:
- initialize NyARTk’s inner structs
- set NyARTk’s RGBA buffer with each frame’s pixels
- get the extrinsic parameters of the camera, and draw the OpenGL scene accordingly
This is for full-fledged AR, but let me start with a simpler case – detecting the marker in a single image read from a file. No OpenGL, no camera. Just reading the file’s pixel data and feeding it to NyARTk.
Now this is far simpler:
CGImageRef img = [[UIImage imageNamed:@"test_marker.png"] CGImage];
int width = CGImageGetWidth(img);
int height = CGImageGetHeight(img);

Byte* brushData = (Byte *) malloc(width * height * 4);
CGContextRef cgctx = CGBitmapContextCreate(brushData, width, height, 8, width * 4,
                                           CGImageGetColorSpace(img), kCGImageAlphaPremultipliedLast);
CGContextDrawImage(cgctx, CGRectMake(0, 0, (CGFloat)width, (CGFloat)height), img);
CGContextRelease(cgctx);

[nyartwrapper initNyARTwithWidth:width andHeight:height];
[nyartwrapper setNyARTBuffer:brushData];
[nyartwrapper detectMarker:ogl_camera_matrix];
First I read the image into a UIImage and then get its CGImage. But what I need are raw bytes, so I create a temporary CGBitmapContext, draw the image into it, and use the context’s pixel data (allocated by me).
Adding the 3D rendering
This is nice, but nothing is shown on the screen, which sucks. So the next step is to create an OpenGL scene and draw some 3D using the calibration we now have. To do this I used EAGLView from Apple’s OpenGL ES docs.
This view sets up an environment for drawing a 3D scene, giving you a delegate to do the actual drawing while hiding all the peripheral code (frame buffers… and other creatures you wouldn’t want to meet in a dark 3D alley).
All I needed to implement in my code were two functions defined in the protocol:
@protocol _DGraphicsViewDelegate<NSObject>
@required
// Draw with OpenGL ES
-(void)drawView:(_DGraphicsView*)view;
@optional
-(void)setupView:(_DGraphicsView*)view;
@end
‘setupView’ initializes the scene, and ‘drawView’ draws each frame. In setupView we have the viewport setting, lighting, generating texture buffers and so on. You can see all that in the code, it’s not very interesting…
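Still, just to give a feel for it, a bare-bones setupView might look something like this (the values here are placeholders, not the ones from my code):

// A minimal setupView: sketch - viewport, depth test and one light.
// Concrete values are placeholders.
-(void)setupView:(_DGraphicsView*)view {
    glViewport(0, 0, (GLsizei)view.bounds.size.width, (GLsizei)view.bounds.size.height);

    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    glEnable(GL_DEPTH_TEST);

    // One light so the 3D model gets shaded.
    const GLfloat ambient[] = {0.2f, 0.2f, 0.2f, 1.0f};
    const GLfloat diffuse[] = {1.0f, 1.0f, 1.0f, 1.0f};
    glLightfv(GL_LIGHT0, GL_AMBIENT, ambient);
    glLightfv(GL_LIGHT0, GL_DIFFUSE, diffuse);

    // The background texture is also generated here
    // (see the allocation sketch just below).
}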
In drawView we draw the background and the 3D scene. Now this took some trickery. First I thought I’d take the easy route: have the 3D scene be transparent, draw the camera view using a simple UIView of some kind, and overlay the 3D on top of it. I didn’t manage to get that to work, so I took a different path (harder? I don’t know) and decided to paint the background onto a 3D plane, in the 3D scene itself, using textures. This is how I did it in all my AR apps on other devices.
Now, the camera video feed is 304×400 pixels, and OpenGL textures work best at power-of-two sizes, so I created a 512×512 texture. But for now we’re still talking about a single frame.
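The texture itself only needs to be allocated once, something along these lines (spriteTexture being the handle that the drawing code below binds; the actual pixels get pushed into it per frame with glTexSubImage2D later on):

// One-time allocation of the power-of-two background texture.
// Only a 512x512 shell is allocated here; each video frame is copied
// into a corner of it with glTexSubImage2D.
glGenTextures(1, &spriteTexture);
glBindTexture(GL_TEXTURE_2D, spriteTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 512, 512, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL); // no pixel data yet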
const GLfloat spriteTexcoords[] = {0,0.625f, 0.46f,0.625f, 0,0, 0.46f,0,};
const GLfloat spriteVertices[] =  {0,0,0, 1,0,0, 0,1,0, 1,1,0};

glMatrixMode(GL_PROJECTION);
glPushMatrix();
glLoadIdentity();
glOrthof(0, 1, 0, 1, -1000, 1);
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
glLoadIdentity();

// Sets up pointers and enables states needed for using vertex arrays and textures
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, spriteVertices);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, spriteTexcoords);

glBindTexture(GL_TEXTURE_2D, spriteTexture);
glEnable(GL_TEXTURE_2D);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);

glMatrixMode(GL_PROJECTION);
glPopMatrix();
glMatrixMode(GL_MODELVIEW);
glPopMatrix();
Basically, I go into orthographic mode and draw a rectangle with the texture on it, nothing fancy.
Next up – drawing the perspective part of the scene, the part that aligns with the actual camera…
//Load the projection matrix (intrinsic parameters)
glMatrixMode(GL_PROJECTION);
glLoadMatrixf(ogl_projection_matrix);

//Load the "camera" matrix (extrinsic parameters)
glMatrixMode(GL_MODELVIEW);
glLoadMatrixf(ogl_camera_matrix);

glLightfv(GL_LIGHT0, GL_POSITION, lightPosition);
glEnable(GL_LIGHTING);
glEnable(GL_LIGHT0);
glDisable(GL_TEXTURE_2D);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);

glPushMatrix();
glScalef(kTeapotScale, kTeapotScale, kTeapotScale);
{
    static GLfloat spinZ = 0.0;
    glRotatef(spinZ, 0.0, 0.0, 1.0);
    glRotatef(90.0, 1.0, 0.0, 0.0);
    spinZ += 1.0;
}
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, teapot_vertices);
glNormalPointer(GL_FLOAT, 0, teapot_normals);
glEnable(GL_NORMALIZE);

for(int i = 0; i < num_teapot_indices; i += new_teapot_indicies[i] + 1) {
    glDrawElements(GL_TRIANGLE_STRIP, new_teapot_indicies[i],
                   GL_UNSIGNED_SHORT, &new_teapot_indicies[i+1]);
}
glPopMatrix();
This part I also learned from Apple’s OpenGL ES docs (find it here). I ended up with this:
Tying it together with the camera
This runs on the simulator, since the camera is not involved just yet. I used it to fix the lighting and such before moving to the device. But we’re here to get it working on the device, so next I plugged in the code from Norio Nomura.
Some people have asked me to post a working version of Nomura’s code, so you can get it with the code for this app (scroll down). Nomura was kind enough to make it public under the MIT license.
First, I set up a timer to fire at ~11fps, and initialize the camera hook that grabs the frames from the internal buffers:
repeatingTimer = [NSTimer scheduledTimerWithTimeInterval:0.0909
                                                  target:self
                                                selector:@selector(load2DTexWithBytes:)
                                                userInfo:nil
                                                 repeats:YES];
ctad = [[CameraTestAppDelegate alloc] init];
[ctad doInit];
And then I take the pixel data and use it for the background texture and the marker detection:
-(void)load2DTexWithBytes:(NSTimer*)timer {
    if([ctad getPixelData] != NULL) {
        CGSize s = [ctad getVideoSize];

        glBindTexture(GL_TEXTURE_2D, spriteTexture);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, s.width, s.height,
                        GL_BGRA, GL_UNSIGNED_BYTE, [ctad getPixelData]);

        if(![nyartwrapper wasInit]) {
            [nyartwrapper initNyARTwithWidth:s.width andHeight:s.height];
            [nyartwrapper getProjectionMatrix:ogl_projection_matrix];
            [nyartwrapper setNyARTBuffer:[ctad getPixelData]];
        }
        [nyartwrapper detectMarker:ogl_camera_matrix];
    }
}
All this happens 11 times per second, so it has to be quick.
Video proof time…
Well, looks like we’re pretty much done! Time for a video…
How did you get the phone to stand still so nicely?
An important issue… when it comes to filming the phone without holding it.
Well, I used a little piece of metal that’s used to block unused PCI slots in a PC. In Hebrew we call these pieces of scrap metal “Flakch”s (don’t try to pronounce this at home). I bent it in the middle to create a kind of “leg”, and the ledge to hold the phone already exists.
The code
As promised, here’s the code (I omitted some files whose license is questionable).
That’s all folks!
See you when I get this to work on the Android…
Roy.