Computing Disparity


Let's talk now about how we compute stereo disparity. We have a pair of images taken by a stereo camera. We have a left image and a right image, and our goal is to compute what we call a disparity image. In the disparity image, the brightness of a pixel corresponds to the stereo disparity at that particular point.

A point that's bright has a large stereo disparity, and that occurs when an object is close to us; a darker pixel has a smaller disparity and corresponds to an object that is further away. We want to understand how we can take that original left and right image pair, process it, and turn it into this disparity image. The fundamental principle of computational stereo can be described quite simply.
Here are two pictures that I took of the Eiffel Tower from slightly different viewpoints. Let's consider a single point on the top of the tower. I'm going to select a window of pixels around that, and I'm going to make a copy of that window of pixels and there it is at the top.

Now I'm going to transfer the horizontal coordinate U across to the right-hand image and do a little bit of construction. I'm going to say that the disparity I'm looking for has a maximum value that I'll call D-max. Now, the stereo process involves looking for that little template of the top of the Eiffel Tower at every possible position along that search line, and the best fit is clearly at this particular location, which is a distance D away from the vertical reference line that I drew.

D is the value of the disparity at this point. Clearly this is quite a computationally intensive process. I have to do a template match at a number of different possible disparity values from zero up to D-max. And this is just to determine the disparity at the coordinate U, V.

I actually need to do this for every single pixel in the input image. It is certainly a lot of computation, but with today's computers it can be done very, very quickly even for very high resolution stereo image pairs.
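This per-pixel search can be sketched in a few lines of Python with NumPy — an illustrative sum-of-absolute-differences matcher, not the toolbox's implementation; the function name and arguments here are my own:

```python
import numpy as np

def disparity_at(left, right, u, v, h, d_max):
    """Estimate the disparity at pixel (u, v) of the left image.

    A (2h+1) x (2h+1) template around (u, v) is compared against the
    same row of the right image at every leftward shift d = 0..d_max,
    and the shift with the smallest sum of absolute differences wins.
    """
    template = left[v - h:v + h + 1, u - h:u + h + 1]
    best_d, best_score = 0, np.inf
    for d in range(d_max + 1):
        u_r = u - d                   # candidate column in the right image
        if u_r - h < 0:               # template would run off the image edge
            break
        window = right[v - h:v + h + 1, u_r - h:u_r + h + 1]
        score = np.abs(template - window).sum()
        if score < best_score:
            best_score, best_d = score, d
    return best_d
```

On a synthetic pair where the right image is the left image shifted five pixels leftward, this search recovers a disparity of 5.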
Let's load a stereo pair into our workspace. I'm going to load the left image from the file rocks2-L and the right image from rocks2-R. I've used the reduce option of iread to reduce the resolution in the horizontal and vertical directions, because these are very, very high-resolution images and the stereo computation is a little slow under MATLAB.

The first thing I'm going to do is display the images as an anaglyph. I pass the left and right images to the anaglyph function, and here we see an anaglyph representation of this stereo scene, a pile of rocks. If I had my red and blue glasses on, this would look very powerfully three-dimensional.
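For a grey-scale pair the construction is simple: put the left image in the red channel and the right image in the green and blue channels, the usual red-cyan convention, so the glasses send each eye its own view. A minimal sketch (the function name is mine, not the toolbox's anaglyph):

```python
import numpy as np

def make_anaglyph(left, right):
    """Combine two grey-scale images into a red-cyan anaglyph:
    the left image drives the red channel, the right image the
    green and blue channels."""
    return np.stack([left, right, right], axis=-1)
```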
Let's have a look at the left and right images in a bit more detail. I'm going to use the toolbox function stdisp (stereo display), which gives a window that looks something like this, and I'm just going to make that a little bit wider.

Now, here we see the left image and the right image. Let me just click on a particular point in the image. I'm going to click on this rock in the foreground, on that particular blotch there. If I go to the other image, the same blotch is just there.

I'm going to click on it, and up the top we see the disparity, that is, the leftward shift in the right image: it's 79.29 pixels. Now let's have a look at a point at the back, this very distinctive valley between these two rocks, and click on that point.

I'm going to find the same point in the other image, it's just there, and now we see that the disparity, the leftward shift, has been reduced. It's only 43.45 pixels. Let's find a point that's somewhere in between. Let's have a look at this white dot here, and we find it in the right image, just there. In this case, the disparity is 52 pixels.
So we can see very clearly the relationship between depth and disparity. Points in the foreground have got a large disparity. Points in the background have got a much smaller disparity.
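That inverse relationship comes from the geometry of the stereo rig: depth Z = f·b/d, where f is the focal length in pixels, b is the baseline between the two cameras, and d is the disparity. A quick illustration with made-up camera parameters — the lesson doesn't give the calibration of the rig that took these images:

```python
# Hypothetical calibration, for illustration only -- not the
# parameters of the camera that took the rocks images.
f = 800.0   # focal length, in pixels (assumed)
b = 0.16    # baseline between the two cameras, in metres (assumed)

def depth(d):
    """Depth is inversely proportional to disparity: Z = f * b / d."""
    return f * b / d

# The foreground point (disparity 79.29 px) comes out nearer than
# the background point (disparity 43.45 px).
near, far = depth(79.29), depth(43.45)
```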

Now let's compute disparity for every single pixel in this pair of images. I'm going to put the disparity into the workspace variable D. I'm going to use the toolbox function istereo and pass in the left image and the right image.

I also specify the range of disparities to search over: for this particular pair, I've worked out that the smallest disparity is 40 pixels and the largest is 90 pixels. And I pass in the half-width of the matching window; a half-width of 3 means the window is two times three plus one, that is, a seven-by-seven window.
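The brute-force search that such a dense matcher performs can be sketched as follows — a simplified Python/NumPy stand-in for what istereo does, using sum-of-absolute-differences scoring with no sub-pixel refinement or similarity checks:

```python
import numpy as np

def dense_disparity(left, right, d_min, d_max, h):
    """Dense stereo by brute-force block matching.

    For every pixel of the left image, slide a (2h+1) x (2h+1)
    template along the same row of the right image over leftward
    shifts d_min..d_max, scoring each shift by the sum of absolute
    differences, and keep the best. Border pixels whose window or
    search range falls outside the image are left as NaN.
    """
    rows, cols = left.shape
    disparity = np.full((rows, cols), np.nan)
    for v in range(h, rows - h):
        for u in range(h + d_min, cols - h):
            template = left[v - h:v + h + 1, u - h:u + h + 1]
            best_d, best_score = d_min, np.inf
            for d in range(d_min, min(d_max, u - h) + 1):
                window = right[v - h:v + h + 1, u - d - h:u - d + h + 1]
                score = np.abs(template - window).sum()
                if score < best_score:
                    best_score, best_d = score, d
            disparity[v, u] = best_d
    return disparity
```

With the parameters from this lesson the equivalent call would be dense_disparity(L, R, 40, 90, 3), giving a seven-by-seven matching window.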

Computation takes a moment or two, and now I can display the disparity image, and here we see it. If I click on some pixels, we see that in the foreground this rock has got disparities starting off around 83 and falling away, to 70-something, as the corresponding points in the world get further away from the camera. These rocks up the back here have got a much smaller disparity, of the order of 50. This rock's something like 40, and these really dark ones up the back have a disparity of 45 or 46.
We can see that this image is far from perfect. There are some points that just don’t look right. There are some very anomalous bright points around the edges of the rocks and that’s because those parts of the rock are visible from one camera view but not visible from the other camera view and that means that the stereo matching can’t be completed properly.

You'll also note that there's a black region down the side of the image, and that's where the fields of view of the two cameras don't fully overlap. This is a very simplistic stereo vision algorithm. There are much more sophisticated techniques available, but it demonstrates the principle.



Given two images of a scene taken from slightly different viewpoints, a stereo image pair, it’s possible to determine the disparity for every pixel using template matching. The disparity image is one where the value of each pixel is inversely related to the distance between that point in the scene and the camera.

Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

Skill level

This content assumes an understanding of high-school-level mathematics, for example trigonometry, algebra, calculus and physics (optics), and experience with the MATLAB command line and programming, for example the workspace, variables, arrays, types, functions and classes.
