The next step is to determine mathematically the physical position of the two cameras relative to one another. This allows the software to combine the two data streams into one 3D scene.

Pick the right environment to calibrate

Calibration requires some ambient infrared light in the room. The sun is a good source but can be too bright if it's direct. The best setting is a living room or studio with large windows where you can get filtered sunlight without it being direct. Bear in mind that windows in newer buildings are often treated with IR-blocking coatings. If neither of those is an option, 'hot lights' that emit heat, such as halogen or tungsten, will work. We've also had good luck with IR lights.

Attach the cameras together

Using the mounting solution you chose above, affix the HD camera to the depth sensor. Shooting requires that the two cameras be securely bound together and not subject to any movement in relation to each other. Make sure everything is locked down tight, including the safety catch on your quick release!

You'll need something to diffuse the depth sensor's IR laser projector for one step during calibration. We often use a square of tissue paper, cigarette rolling paper or a handkerchief.

Lock your zoom at the widest setting and put a piece of tape over both the zoom ring and the lens body to prevent accidental zooming later. Zooming in or out after you've calibrated will nullify the calibration.

Capture Application: Calibrate Lenses

Plug the sensor into your computer and open the Capture application that matches your sensor: either CaptureKinect or CaptureXtionPro.

Set your working directory to a convenient place on your hard drive. All your footage will be stored here. Change it by clicking the text at the top of the Capture window. The default directory is depthframes, which is inside your application folder. You'll definitely want to change this. Observe that the software creates a '_calibration' folder for you inside the project directory. Files are autosaved as you go - so relax, your work will be saved.

Select the Calibrate Lenses tab, the first of the four views in the capture application. It is split into left and right halves; your depth camera stream, if connected via USB, should display on the left, and the right pane should be empty to begin with. If you don't see the depth camera view, see the troubleshooting page.

Note about Kinect model number:
There are two versions of the CaptureKinect application on OS X, one for model #1414 and one for model #1473. Check the bottom of your Kinect to find the model number and open the corresponding capture application.


Capture Lens Properties

In order to accurately calibrate the two cameras, RGBD Toolkit needs to understand the subtleties of the camera lenses - imperfect manufacturing processes mean that every lens will be slightly different. These values are called lens intrinsic parameters and describe image size, field of view, optical center of the lens, and any distortion found in the lens. To determine these values we capture and analyze images from both cameras.

Note the fields of view are symmetrical, and the principal point is at the center of the depth camera's fixed 640x480 frame.
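
To make these values a little more concrete, here is a minimal Python sketch of a pinhole lens model for the depth camera's 640x480 frame, with the principal point at the center. It reads 'symmetrical' as the same focal length on both axes, which is our assumption, and the 570-pixel focal length is only a placeholder; use whatever the application reports for your sensor. The function names are made up for illustration and are not part of RGBD Toolkit.

```python
import math


def pinhole_intrinsics(width, height, focal_px):
    """Build a 3x3 camera matrix with equal focal lengths and a centered principal point."""
    cx, cy = width / 2.0, height / 2.0
    return [
        [focal_px, 0.0, cx],
        [0.0, focal_px, cy],
        [0.0, 0.0, 1.0],
    ]


def fov_degrees(size_px, focal_px):
    """Angular field of view along one image axis of a pinhole camera."""
    return math.degrees(2.0 * math.atan(size_px / (2.0 * focal_px)))


FOCAL_PX = 570.0  # placeholder assumption, not a value taken from the toolkit

K = pinhole_intrinsics(640, 480, FOCAL_PX)
h_fov = fov_degrees(640, FOCAL_PX)  # horizontal field of view in degrees
v_fov = fov_degrees(480, FOCAL_PX)  # vertical field of view in degrees
```

The principal point ends up at (320, 240), the middle of the frame, and the wider axis gets the larger viewing angle.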

Calibrate Depth Camera

Aim your rig at an area of space which has plenty of 'visible information' - featuring different colors, contrasts and depths. Hit the Self Calibrate Depth Camera button at the bottom of the left-hand pane. This will automatically analyze the incoming video stream (great!), and once complete should display the resulting lens properties.

Capturing the HD camera's lens properties takes a bit more effort and patience, since we don't have a direct software connection to the camera. First, set your camera rig up on a tripod and place your checkerboard on a stand in front of it, at a distance where the board occupies approximately 1/4 of the frame. Place the board in the top left quadrant and focus. Don't worry if the checkerboard is not exactly horizontal or vertical, but do ensure that the entire checkerboard is in the frame, including the white border around the outside black squares. Make sure the board is exposed well, evenly lit, and that the lens is focused so the corners are crisp. Record a 1-3 second video from this perspective, mindful of keeping the camera very still.

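Repeat this with the board in each of the other three quadrants, for four clips at this distance. Then repeat the process at a distance where the checkerboard occupies around 1/9th of the frame, recording 9 more clips, making 13 in total.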


Download the clips onto your computer into your project's working directory, wherever you set it in the first step. It is helpful to add them to a new folder inside '_calibration', called 'slrIntrinsics' or something similarly explanatory.

Set the Square Size (cm) of the checkerboard inside the application. For reference, use 3.38 if you used the A3 sized checkerboard and 2.54 if you used the A4 sized board. If yours is a different size, measure one square precisely and use that width.
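
The square size matters because it gives the calibration its real-world scale. As a rough sketch (not the toolkit's own code), the ideal corner positions of a flat checkerboard can be generated from the square size alone. The 10x7 grid of inner corners is a hypothetical layout; count the corners on your own board.

```python
def checkerboard_object_points(cols, rows, square_size_cm):
    """Return the 3D positions (in cm) of the inner corners of a flat board at z = 0."""
    return [
        (col * square_size_cm, row * square_size_cm, 0.0)
        for row in range(rows)
        for col in range(cols)
    ]


# 10 x 7 inner corners is an assumed layout; 2.54 is the A4 square size given above
points = checkerboard_object_points(cols=10, rows=7, square_size_cm=2.54)
```

A wrong square size scales every one of these points, which is why it is worth measuring a square precisely if your board is a custom print.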

Drag all of the video clips into the right-hand window pane of the 'Calibrate Lenses' tab. This should automatically start the calibration process. You may need to wait for a few seconds while this takes place; the application selects the middle frame from each video and converts it into a black and white .png, which is stored in your working folder's _calibration directory. It uses the OpenCV library to detect the checkerboard corners and create a model of the lens.


Once the analysis is complete, the software will display a 'Total Error' figure below the checkerboard images. This is the average error across all the calibration images. Alongside this, you can view the individual error margins for each image by scrubbing the mouse from left to right across the calibration image window. A 'Total Error' of < 0.200 is desirable. If your calibration has resulted in a larger average error than this, scrub through your image set and look for any outlier images which have an error of > 0.300. Note the filename of any outliers. You can re-perform the analysis at any time, simply by dragging the videos onto the window pane again - this time excluding the erroneous clips. This should improve your Total Error.
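
If you jot down the per-image errors while scrubbing, the review described above amounts to something like the following sketch. The clip names and error values are invented for illustration, and a plain mean is assumed for the average.

```python
TOTAL_ERROR_TARGET = 0.200  # a Total Error below this is desirable
OUTLIER_THRESHOLD = 0.300   # individual images above this are worth excluding


def review_errors(errors_by_clip):
    """Return the mean error and the sorted names of clips above the outlier threshold."""
    if not errors_by_clip:
        raise ValueError("no calibration errors to review")
    total = sum(errors_by_clip.values()) / len(errors_by_clip)
    outliers = sorted(
        name for name, err in errors_by_clip.items() if err > OUTLIER_THRESHOLD
    )
    return total, outliers


# Hypothetical readings noted down from the calibration image window
errors = {
    "clip_01.mov": 0.12,
    "clip_02.mov": 0.41,
    "clip_03.mov": 0.15,
    "clip_04.mov": 0.18,
}

total, outliers = review_errors(errors)
needs_attention = total >= TOTAL_ERROR_TARGET

# Recompute the average without the outliers, as if the clips were dragged in again
kept = {name: err for name, err in errors.items() if name not in outliers}
total_after, _ = review_errors(kept)
```

Here clip_02.mov would be the clip to leave out when you drag the videos back in, and the average drops below the target once it is gone.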

If nearly all of your images have high error, you will need to reshoot them. Before you do this, look for elements in your environment which could have caused the error. Is there light streaking across your checkerboard? Check the troubleshooting section for more reasons why you may be getting high error.

Congratulations, you've now sensed the actual structure of your camera lenses to create a model. With this we can now determine the relationship between the two lenses.

Calibrate Correspondence


Navigate to the second tab, labeled Calibrate Correspondence.

Now that we have the lens models from the first tab, we can determine the spatial relationship between the cameras.
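
That spatial relationship is commonly expressed as a rotation plus a translation that carries points from one camera's coordinate frame into the other's. The sketch below shows the idea in plain Python: a point seen by the depth sensor is moved into the color camera's frame and then projected onto its image. Every number here (the 5 cm offset, the focal length, the 1920x1080 image center) is a placeholder assumption, and the function names are hypothetical rather than anything from RGBD Toolkit.

```python
def transform_point(rotation, translation, point):
    """Apply a rigid transform: multiply by a 3x3 rotation, then add a translation."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def project_to_pixel(point, focal_px, cx, cy):
    """Project a 3D point in camera coordinates onto the image with a pinhole model."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


# Placeholder relationship: no rotation, color camera offset by 5 cm on the y axis
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
offset_cm = (0.0, -5.0, 0.0)

depth_point_cm = (0.0, 5.0, 100.0)  # a point one metre in front of the depth sensor
in_color_frame = transform_point(IDENTITY, offset_cm, depth_point_cm)
pixel = project_to_pixel(in_color_frame, focal_px=1000.0, cx=960.0, cy=540.0)
```

With these made-up numbers the point lands on the optical axis of the color camera, so it projects to the center of the image. A real rig will also have some rotation between the lenses, which is why the mount must not shift after calibration.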

If you are using the laser cut mount, you can pivot the sensor up and down in order to match the field of view (FoV) to the video camera's lens. Ideally the video camera will be able to see everything the depth sensor can see, with a little bit of margin on the top and bottom.

Set the checkerboard a few feet away from the camera.

Using a live preview mode on your video camera, position the top of the board flush with the top of the camera's FoV. Note that the viewfinder and the live preview may differ on some cameras if you are shooting wide format.

While looking at the capture application, adjust the sensor's angle on the mount until the view matches. Err on the low side to allow the color camera to see a bit more than what the sensor sees. Depending on your lens you may find that your color information appears inside your depth camera's field of view. There may be some compromises to be made here! The laser cut mounting solution allows for minute adjustment of the depth sensor's angles by loosening the locking (upper) screws.

Tighten the upper screws to lock the mount angle - know that from this point onwards it is important to maintain the camera positions relative to each other (hence the fancy mounting system!)


Now that we've matched the views, we need to take corresponding images of the checkerboard from the two cameras to determine how they sit. Looking back at the capture page, for each checkerboard placement you need to capture three images: one short video clip from the video camera, one depth impression from the sensor, and one infrared view of the checkerboard from the sensor. This is where the IR light diffuser is important, so make sure that is handy before beginning. A second pair of hands is helpful at this step.

First, set the checkerboard centered, in front of the lens. Focus the video camera on the board and take a short clip.

Being very careful not to move the rig at all, go back to the RGBCapture app and click the thumbnail of the depth image on the left. This will capture a snapshot of the image for correspondence to the video clip you just took.


Diffuse the IR projector with the paper or cloth. It's the leftmost lens on the face of the sensor, the one with the red sparkles coming out. Observe that the graininess disappears from the camera preview, and red dots appear in the corners of the squares on the checkerboard in the preview. Click the second tile to capture an image whilst the red dots are showing.

If the checkerboard is too dark or no red dots appear, it means you need more ambient IR light. Get closer to the window, or shine a hot light on it from far away. It's important that the board is illuminated evenly and sufficiently.

Repeat this process with the checkerboard at four different depths away from the cameras, making sure to refocus at every plane. The idea is to fill up an imaginary box of checkerboard points in the 3D space in front of the camera. This helps to best interpret the relationship between the two cameras that will work at all distances from the lens. Once you've captured all four sets, download the video clips from the camera and drop them into a new folder in the working directory you set before. One at a time, drag the video files into their corresponding rectangular tiles in the application next to the corresponding depth and IR thumbnails taken from the device.

With four sets of three images complete, click 'Generate RGB/Depth Correspondence'. If you get an error, it means the algorithm was unable to find an exact fit for the selected checkerboard pairs. Just like before, excluding images may help: 'bad apples' may be throwing off the model calculation. Try clicking 'Ignore' on a few of the image sets. If that doesn't work, click 'Ignore' on all but one of the images, and attempt to Generate RGB/Depth Correspondence again. When you find an image that allows the process to complete successfully, try combining it with other images. There is some randomness in the algorithm, so it helps to try the same combinations a few times just to see if it 'guesses' a better starting place.
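
The trial-and-error routine above can be summarized as a small sketch. 'try_generate' is a hypothetical stand-in for clicking 'Generate RGB/Depth Correspondence' with only the given sets enabled; in this example it simply pretends that set 3 is the bad apple. It doesn't model the randomness in the algorithm, so in practice you may want to try a combination more than once.

```python
def grow_working_sets(set_ids, try_generate):
    """Find one set that works on its own, then add each other set that still works."""
    for seed in set_ids:
        if not try_generate([seed]):
            continue
        kept = [seed]
        for other in set_ids:
            if other != seed and try_generate(kept + [other]):
                kept.append(other)
        return kept
    return []  # no single set produced a calibration


def fake_generate(enabled_sets):
    """Pretend calibration: succeeds unless the (made-up) bad set 3 is enabled."""
    return 3 not in enabled_sets


usable = grow_working_sets([1, 2, 3, 4], fake_generate)
```

In this pretend run, sets 1, 2 and 4 survive and set 3 stays ignored.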


By pressing the left and right arrows you can cycle through previews of the four checkerboard calibration sets. If it's correct, you'll see the checkerboard image data pixels (in black and white) mapped cleanly onto the depth model of the same image. You'll also see colored dots floating in space, corresponding to the checkerboard depth planes. Some dots are missing from the grid pattern, as they were removed as outliers while generating the calibration. An ideal calibration will contain dots from at least three different sets of colors. As you cycle through all of the checkerboard sets, the checkerboard image should be visibly well aligned to the depth data.

The camera is set up as a video game style WASD first-person camera, using the following controls:

Function                    Shortcut
Move Forward                w
Move Backward               s
Move Left                   a
Move Right                  d
Move Up                     e
Move Down                   c
Rotate Counterclockwise     q
Rotate Clockwise            r

Once you have a calibration where the checkerboards' depth and image data match up at all the levels, you can move on to recording! As long as your camera and depth sensor lenses stay in position, you won't have to go through the painstaking process again.