03 August 2011

Introduction To Augmented Reality Implementation

This tutorial/project uses OpenCV and OpenGL (with FreeGLUT) to build an augmented reality application from the ground up.

Lately I've been experimenting with computer vision, specifically building Augmented Reality from the ground up. I've made a tutorial program that might help others get a few of the basic concepts down pat. This sort of work will no doubt seem like gibberish to those who haven't studied computer vision, so I suggest you read up on it if you aren't familiar. Hopefully, even those who don't understand computer vision will be able to build the code and start experimenting without knowing exactly what's going on under the hood. If you're interested in working with augmented reality from a higher-level perspective, I suggest you take a look at one of the higher-level augmented reality libraries like ARToolkit.

Building The Tutorial Program

You can grab the code from here: http://www.yarrago.com/resources/downloads/software/Yarrago_Augmented_Reality_1.0.2.zip

I'm currently distributing the project as source only. It should compile under both Windows and Linux: I've tested it with Visual Studio Express 2010 and GCC, and it builds and seems to run fine under both.

This is learning code and has no real application. Please don't be too critical of the programming; it is intended as a computer vision example, not a programming example.

Note: The demo requires OpenCV 2.2 or later (I've tested with both OpenCV 2.2 and 2.3 to date).


Linux

Install the following packages (these instructions are for Fedora; I assume other distros will be similar but slightly different):
opencv, opencv-devel, freeglut, freeglut-devel, cmake

# yum install opencv
# yum install opencv-devel
# yum install freeglut
# yum install freeglut-devel
# yum install cmake

Extract the package, create a new directory in the base folder (the folder that contains Source and Resources) and move into it:
# cd AugmentedReality
# mkdir Build
# cd Build

To build, run cmake to generate the makefiles and then run make to build the program. Afterward, just run the program.
# cmake ../Source
# make
# ./AugmentedReality

Windows

Download and install:
  • OpenCV 2.2 or later (I currently recommend OpenCV 2.3). See my notes elsewhere on this blog if you run into problems with OpenCV 2.2.
  • FreeGLUT
  • CMake (Optional)

If you are familiar with CMake, you can use it to build the project solution files (following a method similar to the Linux configuration above). Otherwise, create a Visual Studio solution from scratch by following these instructions.

I will assume that you have installed OpenCV to C:\OpenCV2.3\; if not, just substitute the path you installed OpenCV to.

Create a Visual Studio Solution from scratch:
[New Project]
[Win32 Console Application]
Name: "AugmentedReality"
Console Application / Empty Project
Add the two source files to the project.

Set up the project paths and libraries:
Project->AugmentedReality Properties->Configuration Properties (All Configurations):
VC++ Directories->Include Directories: Add C:\OpenCV2.3\build\include
VC++ Directories->Library Directories: Add C:\OpenCV2.3\build\x86\vc10\lib

Linker->Input (Debug): Add

Linker->Input (Release): Add

Windows And Linux
By now you should have a working build of the program.

Copy the images from the Standard Images directory to the application's working directory.

Linux:
# cd [Build]
# cp "../Resources/Standard Images/"* .

Windows:
# cd "[Build Windows]\[SolutionName]\[ProjectName]"
# copy "..\..\..\Resources\Standard Images\*" .

Run Down Of Program Operation

Now here’s how you go about playing with the program.

To get started you will need to print out the full chessboard image. I suggest you print two copies, say on A4 sheets of paper. Attach one of them to something stiff like a piece of cardboard (you will get better calibration results with this). Leave a reasonably sized white border around the chessboard image so that the detection algorithm works correctly. For markers, you can either print the photos in "Resources\Photos" at a photo shop, or create your own (explained later in the tutorial) using magazines or photos you already have.

The program has a number of modes that step you through the process. The mode can be changed by using the number keys.
0) No augmentation, just display the live image.
1) Find and display the corners of the chessboard.
2) Augment a planar image onto the chessboard (planar).
3) Augment a planar image onto the chessboard surface (follows a bent chessboard).
4) Show the calibration images in sequence.
5) Draw a 3D figure on the chessboard.
6) Allow the user to select a marker image.
7) Show identified features in the marker and live image.
8) Show the mapped feature matches.
9) Show an augmentation on a natural image.

q or Esc) Quits the program.
c) Reloads the calibration images from file.
h) Holds the next image in an internal image buffer.
a) Adds the currently held image to the list of calibration images.
s) Saves the list of calibration images.
b) Saves the held image as a marker image.
v) Toggles the view so that you can see the currently held image.
i) Toggles on/off subpixel accuracy.
m) Cycles through the list of available natural markers.
d) Changes the displayed 3D model.
-, +) Changes the natural marker scale by a factor of 2.

The program will start in mode 0. This mode allows you to check that your webcam is connected, working and is selected by the software.

First we can make sure that the webcam can detect the chessboard by pressing '1'. The chessboard will be drawn with markers showing the corners of the squares. If you hold the chessboard and camera completely still, you will probably notice that the corners jiggle around a little. You can improve the accuracy of the detected corners by turning on subpixel accuracy with 'i'.

We can start with a simple augmentation by pressing '2' and then showing your webcam the full chessboard. The augmentation in this mode will only work if the entire chessboard is within the frame. If you get no augmentation or the program crashes, check that you have put the image files in the directory with the executable program. In this mode the augmentation is performed quite simply by finding the chessboard corners as in mode '1' and then painting the overlay image into the picture between the outside corners of the chessboard.
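Painting an image between four corners amounts to a perspective (homography) mapping from the overlay's corners to the chessboard's outer corners. Here is a minimal sketch of that maths, assuming nothing about how the tutorial code does it: the names `Point2`, `Homography`, `homographyFromFourPoints` and `applyHomography` are made up for illustration. OpenCV offers cv::getPerspectiveTransform for the same job; the version below solves the 8x8 linear system directly so it needs only the standard library.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

struct Point2 { double x, y; };
using Homography = std::array<double, 9>;  // row-major 3x3 with h[8] fixed to 1

// Find H such that dst[i] ~ H * src[i] for four point pairs.
Homography homographyFromFourPoints(const std::array<Point2, 4>& src,
                                    const std::array<Point2, 4>& dst) {
    // Each pair gives two linear equations in the unknowns h0..h7.
    double a[8][9] = {};
    for (int i = 0; i < 4; ++i) {
        const double x = src[i].x, y = src[i].y;
        const double u = dst[i].x, v = dst[i].y;
        const double r0[9] = {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
        const double r1[9] = {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        for (int j = 0; j < 9; ++j) {
            a[2 * i][j] = r0[j];
            a[2 * i + 1][j] = r1[j];
        }
    }
    // Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 8; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        assert(std::fabs(a[pivot][col]) > 1e-12 && "degenerate corner layout");
        for (int j = 0; j < 9; ++j) std::swap(a[col][j], a[pivot][j]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            const double f = a[r][col] / a[col][col];
            for (int j = col; j < 9; ++j) a[r][j] -= f * a[col][j];
        }
    }
    Homography h{};
    for (int i = 0; i < 8; ++i) h[i] = a[i][8] / a[i][i];
    h[8] = 1.0;
    return h;
}

// Map one overlay point into the camera image (divide by the third coordinate).
Point2 applyHomography(const Homography& h, Point2 p) {
    const double w = h[6] * p.x + h[7] * p.y + h[8];
    return {(h[0] * p.x + h[1] * p.y + h[2]) / w,
            (h[3] * p.x + h[4] * p.y + h[5]) / w};
}
```

To use it, pass the overlay image's four corners as `src` and the four outer chessboard corners found in the frame as `dst`, then map overlay pixels through `applyHomography`. A real renderer would usually invert the mapping and sample the overlay for each output pixel to avoid holes.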

We can try a slightly more complex augmentation by going to mode '3' and using our deformable chessboard. Carefully bend the chessboard and you should find that the augmentation bends with it to a limited extent. If you bend the chessboard too much, the corners can no longer be detected and the augmentation will drop out. This augmentation operates in much the same way as the previous one, but rather than relying only on the outside four corners of the chessboard, all of the detected corners are used.
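One way to picture "using all of the detected corners" is to treat them as a grid and interpolate inside each grid cell. The sketch below does this with plain bilinear interpolation. That is my assumption for illustration only: the tutorial code may well warp each cell differently, and the name `warpThroughCornerGrid` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// corners: detected chessboard corners stored row by row (rows x cols).
// (s, t): normalised overlay coordinates in [0, 1], s across and t down.
// Returns the image position the overlay point lands on.
Point2 warpThroughCornerGrid(const std::vector<Point2>& corners,
                             int rows, int cols, double s, double t) {
    assert(rows >= 2 && cols >= 2);
    assert(corners.size() == static_cast<std::size_t>(rows) * cols);
    s = std::min(std::max(s, 0.0), 1.0);
    t = std::min(std::max(t, 0.0), 1.0);

    // Locate the grid cell that contains (s, t).
    const double gx = s * (cols - 1);
    const double gy = t * (rows - 1);
    const int cx = std::min(static_cast<int>(gx), cols - 2);
    const int cy = std::min(static_cast<int>(gy), rows - 2);
    const double fx = gx - cx;  // fractional position inside the cell
    const double fy = gy - cy;

    auto at = [&](int r, int c) {
        return corners[static_cast<std::size_t>(r) * cols + c];
    };
    const Point2 p00 = at(cy, cx), p01 = at(cy, cx + 1);
    const Point2 p10 = at(cy + 1, cx), p11 = at(cy + 1, cx + 1);

    // Blend the four surrounding corners.
    const double w00 = (1 - fx) * (1 - fy), w01 = fx * (1 - fy);
    const double w10 = (1 - fx) * fy, w11 = fx * fy;
    return {w00 * p00.x + w01 * p01.x + w10 * p10.x + w11 * p11.x,
            w00 * p00.y + w01 * p01.y + w10 * p10.y + w11 * p11.y};
}
```

Because every cell follows its own four corners, a bend in the board moves only the nearby part of the overlay, which is the behaviour you see in mode '3'. The flip side is that every corner must be detected, which matches the augmentation dropping out when the board is bent too far.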

To progress to more complex augmentations we need to calibrate the camera. This involves capturing a number of images of the chessboard at different locations and orientations. By calibrating the camera we can find its intrinsic parameters, such as its focal length. For this step it is important to use a planar chessboard: the chessboard needs to be rigid, because there is an implicit assumption that all the corners of the chessboard lie on a plane, and any deformation will give you poor calibration images.

From your previous experimentation you will now have an idea of when the chessboard can be detected and when it can't. Try capturing an image by pressing 'h' to grab the next image from the camera and then 'v' to view the image that you just captured. What we want is a number of images of the chessboard at different orientations and scales. So grab a few images using the above steps and add each one to the calibration list by pressing 'a'. If you capture an image that you don't think will be any good, simply don't add it to the list. Once an image is in the list there is no direct way of removing it short of restarting the program (or reloading all calibration images from file by pressing 'c' before you have saved the current list).

To save the list for future use press 's'. These images will then be loaded automatically when you restart the program, so you don't need to do this every time. If you are unsure whether the chessboard will be detected in an image, view the held image as before by pressing 'v' and then change the mode to '1'; if the corners are found, the image will be OK. You will need to recalibrate the camera if you change camera parameters such as the zoom.

You can view all images in the calibration list by going into mode '4', which simply cycles through the calibration images one at a time.
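To make the intrinsic parameters from the calibration step a bit more concrete, here is a small sketch of the standard pinhole camera model they plug into. It assumes a camera without lens distortion, and the names `Intrinsics`, `Point3`, `Pixel` and `projectPinhole` are made up for this example rather than taken from the tutorial code.

```cpp
#include <cassert>

// Focal lengths (fx, fy) and principal point (cx, cy), all in pixels.
struct Intrinsics { double fx, fy, cx, cy; };
struct Point3 { double x, y, z; };  // point in the camera's coordinate frame
struct Pixel { double u, v; };

// Project a 3D point onto the image plane of an ideal pinhole camera.
Pixel projectPinhole(const Intrinsics& k, const Point3& p) {
    assert(p.z > 0.0 && "point must be in front of the camera");
    return {k.fx * p.x / p.z + k.cx,
            k.fy * p.y / p.z + k.cy};
}
```

For example, with fx = fy = 800 and a principal point of (320, 240), a point 2 units in front of the camera and 0.1 units to the right lands at u = 800 * 0.1 / 2 + 320 = 360. If the focal length is wrong, every projected point is scaled wrongly about the principal point, so a 3D model will not sit correctly on the chessboard. The same reasoning explains why changing the zoom means recalibrating.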

Now go into mode '5' with your chessboard and you should get an augmentation with the teapot appearing on the chessboard. You can change the augmentation model by pressing 'd'. Have fun :)

Augmenting with the chessboard is all well and good, but it is somewhat limiting. A better method is to use natural markers. To capture a marker, enter mode '6'; you can then click and drag to select an axis-aligned rectangular marker image. I suggest you first position the marker, hold the image by pressing 'h' and view it by pressing 'v'; from here the selection is much easier because the marker no longer moves. When you have a marker image, press 'b' to save it. With this method the marker is recognised as whatever occupies the full screen, and it should be a planar image that is not deformable at all.

We can now examine the augmentation process by changing to mode '7'. This shows the locations of the features that we will use to match the marker image. Ideally you want a good number of points to be found within your marker image. Cycle through the list of available markers by pressing 'm' a couple of times until you find your marker.

Mode '8' shows the matching correspondences between your captured marker and the live image. The system is performing well when the solid rectangle appears directly around the marker image. In this mode the system is using RANSAC to find the best homography from your marker to the live image.
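If RANSAC is new to you, the loop is simple: pick a minimal random sample of matches, fit a model to it, count how many other matches agree, and keep the best model. The sketch below shows that loop, but to stay short it swaps the homography for a much simpler model, a pure 2D translation, which needs only one match per sample. This is a deliberate simplification and not what the tutorial program does; the names `Match`, `Translation` and `ransacTranslation` are hypothetical. In OpenCV the homography version is available through cv::findHomography with its RANSAC method flag.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point2 { double x, y; };
struct Match { Point2 marker, live; };  // one feature match: marker -> live image
struct Translation { double dx, dy; std::size_t inliers; };

// RANSAC for the model live = marker + (dx, dy).
Translation ransacTranslation(const std::vector<Match>& matches,
                              double threshold, int iterations, unsigned seed) {
    assert(!matches.empty());
    std::mt19937 rng(seed);
    Translation best{0.0, 0.0, 0};

    auto isInlier = [&](const Match& m, double dx, double dy) {
        return std::hypot(m.marker.x + dx - m.live.x,
                          m.marker.y + dy - m.live.y) <= threshold;
    };

    for (int it = 0; it < iterations; ++it) {
        // 1. Minimal sample: one match fully defines a translation.
        const Match& s = matches[rng() % matches.size()];
        const double dx = s.live.x - s.marker.x;
        const double dy = s.live.y - s.marker.y;
        // 2. Count how many matches agree with this candidate.
        std::size_t count = 0;
        for (const Match& m : matches)
            if (isInlier(m, dx, dy)) ++count;
        // 3. Keep the candidate with the most support.
        if (count > best.inliers) best = {dx, dy, count};
    }

    // 4. Refit on the inliers of the best candidate (here: average offset).
    double sx = 0.0, sy = 0.0;
    std::size_t n = 0;
    for (const Match& m : matches) {
        if (isInlier(m, best.dx, best.dy)) {
            sx += m.live.x - m.marker.x;
            sy += m.live.y - m.marker.y;
            ++n;
        }
    }
    if (n > 0) { best.dx = sx / n; best.dy = sy / n; }
    return best;  // inliers holds the support of the best sampled candidate
}
```

The point to take away is that wrong feature matches (outliers) never get to vote on the final model, because they only agree with themselves. For a homography the minimal sample is four matches instead of one, and the fit in step 1 is a four-point homography solve.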

Finally you can augment with your natural marker by going to mode '9'. The teapot should appear over the marker and stay in place on it as you move the marker around. If you get poor results, try different markers and try environments with more light (I have found that cheap webcams tend to blur the image in low light, and this significantly reduces the accuracy of the system). You can change the scale of the model by pressing '+' or '-', and you can change the model by pressing 'd'. Again, pressing 'm' will cycle through the list of available markers. Have fun :).

The following video steps quickly through the entire process above to give you an idea of the results you should be getting.

Videos Demonstrating Augmented Reality

Here are a bunch of Augmented Reality videos that I have come across that are interesting and will hopefully get you interested in Augmented Reality. They also show you some of the potential of what is achievable.

Turn any webcam into a 3D Scanner
(Use a webcam and a chess board to reconstruct structure from motion)

ProFORMA: Probabilistic Feature-based On-line Rapid Model Acquisition
(Structure from video)

OpenCV Augmented Reality Pong
(Using 2 white pens as bats to make an augmented reality pong game)

simplAR: Augmented reality for OpenCV beginners
(Augmented reality using a planar image drawn on a chess board using OpenCV)

OpenCV: Augmented Reality – Testing
(Augmented reality using OpenCV and black and white block markers)

Augmented Reality using DirectShow, WPF, OpenCV
(Augmented reality 3D box on chess board)

Marker Recognition using SURF Descriptors and OpenCv
(Identifying black and white markers in real time)

Augmented Reality - NyARToolkit C++, OpenCV & OpenGL
(Using an augmented reality toolkit to identify simple markers and drawing a 3D cube)

openFrameworks & OpenCV SURF
(Using surf markers to track a natural drawing within a video frame)

Markless AR using SURF(SIFT) method
(Tracking natural features of a book)

BRIEF Based Planar Object Detector
(Demonstration of the capabilities of the BRIEF detector)

Augmented Reality for Board Games
(Augmented reality enhanced Monopoly board game)

HandyAR (Markerless Augmented Reality from UCSB)
(Using a hand as an augmented reality marker)

The making of the O'Neill augmented reality surf game
(Real time augmented reality using natural markers for corporate promotion)

Augmented Reality Magic 1.0
(Using augmented reality at the application level to enhance a magic performance)

OASIS: Playing Lego with Kinect style camera and interactive projector system
(Augmenting Lego with virtual projections to form new children's games)

Pattie Maes and Pranav Mistry demo SixthSense
(Demo of how we can use augmented reality in everyday life with a projector instead of a screen)

An Augmented Reality X-Ray System Based on Visual Saliency (ISMAR 2010)
(Using augmented reality combined with visual saliency to augment virtual features with real world in a more natural way)

Sensor Fusion Video
(Demonstrating augmented reality natural feature extraction and fusion with gyros/accelerometer to stabilise outside the marker region)

Real-Time Markerless 3D Tracking
(Real time edge based tracking of simple rigid 3D objects (box) and augmentation with 3 teapots)

Scene Modelling, Recognition and Tracking with Invariant Image Features
(Tracking 3D rigid objects and creating augmented images with natural 3D markers)

Mobile Phone Augmented Reality at 30Hz
(Tracking natural planar markers in real time)

Parallel Tracking and Mapping for Small AR Workspaces (PTAM)
(Constructing augmented reality environments without using predefined markers)

System Overview

The following diagram shows an overview of how the system operates.

A detailed description of what is under the hood and how things work is still to come.


  1. What FPS do you get for modes 1 and 2, where we are projecting the chess corners and the image onto the chessboard?

    Also, the program doesn't run in release mode in VC++ 2010 Express on Win 7 64-bit.

  2. How is the scale determined for calibration? It is said to be the scale of the virtual world relative to the size of the real world.

  3. When I press 5, I can't get a teapot in the view. What can be the reason? Also, modes 7, 8 and 9 crash the program. Win 7 64-bit, MVS 2008.

  4. Hello, I wonder why I cannot debug the program. Is there anything that I missed? :(