Summary of image geometry


Let's recap some of the topics we've covered in this lecture. We started by talking about homogeneous coordinates, a different way to represent points in a 2 dimensional plane. We talked about ways to convert Cartesian coordinates to homogeneous coordinates and homogeneous coordinates back to Cartesian coordinates.
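Both conversions can be sketched in a few lines of Python/NumPy (the course itself uses MATLAB, so this is purely illustrative): to convert to homogeneous coordinates we append a 1, and to convert back we divide through by the last element and drop it.

```python
import numpy as np

def to_homogeneous(p):
    """Cartesian -> homogeneous: append a 1, e.g. (x, y) -> (x, y, 1)."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def to_cartesian(ph):
    """Homogeneous -> Cartesian: divide by the last element, then drop it."""
    ph = np.asarray(ph, dtype=float)
    return ph[:-1] / ph[-1]

print(to_homogeneous([2, 3]))   # [2. 3. 1.]
print(to_cartesian([4, 6, 2]))  # [2. 3.] -- the overall scale cancels out
```

Note that (4, 6, 2) and (2, 3, 1) are the same homogeneous point; only the ratios matter.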

Mathematically, lines and points are duals. This leads to some very nice ways to compute the intersection point of 2 lines or the line that is formed by 2 points. We've discussed a different perspective projection model called the Central Projection Model; the key difference is that the image plane is between the object and the origin, and it forms an image that is not inverted.
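In homogeneous coordinates this duality becomes a one-liner: the intersection of 2 lines, and the line joining 2 points, are both given by a cross product. A small NumPy sketch with illustrative values:

```python
import numpy as np

# A line (a, b, c) represents ax + by + c = 0; a point is homogeneous (x, y, w)
l1 = np.array([1.0, 0.0, -2.0])   # the vertical line x = 2
l2 = np.array([0.0, 1.0, -3.0])   # the horizontal line y = 3

p = np.cross(l1, l2)              # their intersection, as a homogeneous point
print(p / p[2])                   # the Cartesian point (2, 3)

# Dually, the line joining two points is their cross product
q1 = np.array([0.0, 0.0, 1.0])    # the origin
q2 = np.array([1.0, 1.0, 1.0])    # the point (1, 1)
print(np.cross(q1, q2))           # (-1, 1, 0) up to scale: the line y = x
```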

We can represent this as a matrix multiplication using homogeneous coordinates of the world point and of the point on the image plane. In a real digital camera, the image plane is a large array of light-sensitive elements, which form the pixels of the resulting image.

We need to deal with the discrete nature of this image plane, the fact that the coordinates are measured in units of pixels rather than in units of metres. There's also an origin shift involved: instead of the centre of the image plane, the origin in pixel coordinates is in the top-left corner, and we can introduce that with a simple linear transformation. Ultimately, we combine a number of matrix terms together. The first 2 we refer to as the intrinsic parameters of a camera.
They describe the camera entirely in terms of the dimensions of its pixels, the coordinate of the principal point and the focal length of the lens. These are parameters of the camera itself. It doesn't matter where the camera is or where it's pointing; the intrinsic parameters are invariant to the location of the camera.
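As a sketch, the intrinsic parameters can be collected into a 3 by 3 matrix, often written K, with the focal length converted into pixel units and the principal point supplying the origin shift. The numbers below are illustrative assumptions, not values from the lecture:

```python
import numpy as np

# Illustrative values (assumptions, not from the lecture):
f = 8e-3            # focal length: 8 mm
rho = 10e-6         # pixel side length: 10 micrometres (square pixels)
u0, v0 = 640, 512   # principal point, in pixel coordinates

# Focal length in pixel units on the diagonal, principal point on the right
K = np.array([[f / rho, 0,       u0],
              [0,       f / rho, v0],
              [0,       0,        1]])
print(K[0, 0])   # f/rho: the focal length expressed in pixels (~800 here)
```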

The 3rd matrix in this chain we refer to as the extrinsic parameters, and they describe the pose of the camera, that is, its position and its orientation with respect to the world coordinate frame. We can combine all these matrices together into a single 3 by 4 matrix which encodes all that information. So, the projection from the world point to the image-plane point is done simply by a matrix multiplication which gives us the homogeneous coordinates of the point on the image plane, and there is a simple mapping between those homogeneous coordinates and the Cartesian image-plane coordinates, which we typically denote by u and v.
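Putting it all together, here is a NumPy sketch of the whole chain, assuming the simplest possible pose (camera at the world origin looking along the z-axis, so R = I and t = 0) and made-up intrinsic values:

```python
import numpy as np

K = np.array([[800.0, 0, 640],    # intrinsics (illustrative values)
              [0, 800.0, 512],
              [0, 0, 1]])
Rt = np.hstack([np.eye(3), np.zeros((3, 1))])  # extrinsics [R | t]
C = K @ Rt                                     # the 3x4 camera matrix

P = np.array([0.1, 0.2, 2.0, 1.0])  # a world point, homogeneous
p = C @ P                           # homogeneous image-plane point
u, v = p[0] / p[2], p[1] / p[2]     # Cartesian pixel coordinates
print(u, v)                         # roughly (680, 592) for this point
```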

This representation is scale invariant. We can multiply the camera matrix by an arbitrary non-zero constant and the projection remains unchanged.
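This is easy to verify numerically: the Cartesian coordinates come from dividing the first 2 homogeneous coordinates by the 3rd, so any overall scale factor cancels. A quick check with arbitrary matrix values:

```python
import numpy as np

C = np.array([[800.0, 0, 640, 0],
              [0, 800.0, 512, 0],
              [0, 0, 1, 0]])          # an arbitrary camera matrix
P = np.array([0.1, 0.2, 2.0, 1.0])    # a world point, homogeneous

def project(C, P):
    p = C @ P
    return p[:2] / p[2]   # dividing by the 3rd element removes the scale

# Multiplying C by any non-zero constant leaves the projection unchanged
print(project(C, P))
print(project(7.3 * C, P))   # identical
```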

This matrix is often written in a normalized form where element C34 is equal to 1. If the points lie on a plane in the world, then we can write a different relationship between the coordinate of the point on the plane and the coordinate of the point on the image plane and we use a 3 by 3 homography matrix to do this.

The homography matrix maps points from one plane to another plane. We can compute the homography matrix if we know 4 pairs of corresponding points between the 2 planes.

Corresponding means that the point P and the point Q refer to the same point, the same feature in the world.
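One standard way to compute H from 4 (or more) such correspondences is the direct linear transform: each pair contributes 2 linear equations in the 9 entries of H, and the solution is the null-space direction of the stacked system. A bare sketch, assuming no 3 of the points are collinear:

```python
import numpy as np

def homography_from_points(P, Q):
    """Estimate H (3x3, with q ~ H p) from N >= 4 correspondences.

    P and Q are (N, 2) arrays of Cartesian points. This is a minimal
    DLT sketch: no coordinate normalization, and it assumes the points
    are in general position (no 3 collinear).
    """
    A = []
    for (x, y), (u, v) in zip(P, Q):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The entries of H form the null-space direction of A, i.e. the
    # right singular vector with the smallest singular value
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # normalize so the bottom-right element is 1
```

With exactly 4 correspondences the system has an exact solution; with more, the SVD gives a least-squares fit.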

One application of homographies is perspective rectification. We can compute an homography H which maps point P in 1 image to point Q in another image, and if we choose H correctly, we can undo the effect of perspective distortion. Another application of homographies is that we can take a rectangular graphic like this and distort it in such a way that it appears to lie perfectly in a different plane. In this case, the plane is the surface of the swimming pool.

The final example of how we might use an homography is related to the project part of this course: we can use it to map a coordinate in the image plane of the camera to an XY coordinate of a point on the robot's 2-dimensional worksheet.
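Applying such a homography is just one more matrix-vector product. The matrix below is a hypothetical calibration result (its values are made up for illustration), mapping a pixel coordinate to a worksheet coordinate in metres:

```python
import numpy as np

# Hypothetical calibration result (made-up values): maps a pixel
# coordinate (u, v) to a worksheet coordinate (x, y) in metres
H = np.array([[0.001,  0.0,   -0.32],
              [0.0,   -0.001,  0.24],
              [0.0,    0.0,    1.0]])

uv = np.array([400.0, 300.0, 1.0])  # a pixel coordinate, homogeneous
xy = H @ uv
x, y = xy[0] / xy[2], xy[1] / xy[2]
print(x, y)   # the corresponding worksheet coordinate, in metres
```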

Let’s recap the important points from the topics we have covered about homogeneous coordinates, image formation, camera modeling and planar homographies.

Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

Skill level

This content assumes an understanding of high-school-level mathematics, for example trigonometry, algebra, calculus and physics (optics), and experience with the MATLAB command line and programming, for example workspace, variables, arrays, types, functions and classes.
