The geometry of image formation



We’ll start by recapping some fundamentals of two-dimensional geometry. To start with, we’re going to talk about a Euclidean plane. A Euclidean plane is a non-curved surface where the rules of Euclidean geometry apply. We normally take this for granted. We’re just going to be explicit about the fact that we’re working with flat surfaces.

The second concept is Cartesian coordinates. Again, this is a concept you should be very familiar with. We have a set of orthogonal axes, typically labeled X and Y. We have a point P and we describe its position with respect to the origin in terms of distances along the X and Y axes. A concept that is perhaps less familiar to you is homogeneous coordinates, and we’re going to be using them a lot in this lecture. Those of you who’ve done the introduction to robotics course will have already encountered homogeneous coordinates.
To represent a point in a two-dimensional Cartesian space, we describe it by a pair of numbers written as the tuple (X, Y), and we say that this point lives in the space of two-dimensional real numbers, which we denote by the symbol R2 (R squared).

In homogeneous coordinates, we represent the same point with three numbers. In this case, we’ve represented it as X, Y and the number one. In fact, the number one is somewhat arbitrary; it could be any non-zero constant, provided X and Y are multiplied by that same constant. We say now that this point belongs to the two-dimensional projective space, which we denote by the symbol P2 (P squared). So it’s the same point, just two different ways to represent it. The homogeneous representation, the representation in projective space, is going to be very, very handy to us in this lecture.
Now to convert from a homogeneous coordinate to a Cartesian coordinate, I’m going to start with a very general homogeneous coordinate, denoted here as X tilde, Y tilde and Z tilde.

To convert it to a Cartesian coordinate, what I do is take the first two numbers and divide by the third. So the X and Y Cartesian coordinates are given by X tilde divided by Z tilde and Y tilde divided by Z tilde respectively.
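
Written out as an equation, the conversion just described looks like this. The symbols follow the naming used in the lecture (a tilde marks a homogeneous component); the numeric example on the last line is an illustration added here, not one taken from the lecture.

```latex
\tilde{p} = (\tilde{x},\, \tilde{y},\, \tilde{z}) \in \mathbb{P}^2
\quad\longmapsto\quad
p = (x,\, y) = \left( \frac{\tilde{x}}{\tilde{z}},\; \frac{\tilde{y}}{\tilde{z}} \right) \in \mathbb{R}^2,
\qquad \tilde{z} \neq 0

% Illustrative example: two homogeneous triples, one Cartesian point
(100,\, 200,\, 1) \;\mapsto\; (100,\, 200),
\qquad
(200,\, 400,\, 2) \;\mapsto\; (100,\, 200)
```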

Now a mathematician would say that in projective space, lines and points have duals. They have equivalent representations. Some very interesting and very useful things flow from this duality. We can write a line in homogeneous form, a little bit differently to what we’re used to, as three numbers denoted here as L1, L2 and L3. Imagine that I’ve got a point which moves along that line. So the line is really the set of all possible points that lie along it. I can express the fact that the point lies on the line by the dot product of the line and the point being equal to zero. We refer to this as the point equation of a line. A line is defined in terms of all the points that could possibly lie along it.
Now I can expand that out, and we can see that it looks a little bit different to the conventional Cartesian representation of a line. But if we do the transformation from homogeneous coordinates to Cartesian coordinates, we can show that these two representations of a line are equivalent.
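
The expansion mentioned here can be sketched as follows. The notation is an assumption: the line's three numbers are written as \ell_1, \ell_2, \ell_3, and the slope and intercept are called m and c, names the lecture does not use.

```latex
\begin{align}
\tilde{\ell} \cdot \tilde{p}
  &= \ell_1 \tilde{x} + \ell_2 \tilde{y} + \ell_3 \tilde{z} = 0
  && \text{point equation of a line} \\
\ell_1 \frac{\tilde{x}}{\tilde{z}} + \ell_2 \frac{\tilde{y}}{\tilde{z}} + \ell_3
  &= \ell_1 x + \ell_2 y + \ell_3 = 0
  && \text{divide by } \tilde{z} \neq 0 \\
y &= -\frac{\ell_1}{\ell_2}\, x - \frac{\ell_3}{\ell_2}
  && \text{if } \ell_2 \neq 0, \text{ so } m = -\frac{\ell_1}{\ell_2},\; c = -\frac{\ell_3}{\ell_2}
\end{align}
```

When \ell_2 = 0 the last step is not available, which is exactly the vertical-line case: the Cartesian slope is undefined, but the homogeneous triple is still perfectly ordinary.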

An advantage of using the homogeneous form is that it’s very easy to represent a line that is vertical, whereas in Cartesian coordinates the gradient of such a line is infinite. In homogeneous form, we treat that situation very simply and conveniently. We don’t need to introduce any infinities. We also might be interested in how to describe a line that joins two points. So here we have two points and here’s a line that passes through those two points. Then the homogeneous representation of the line, which remember is a three-vector, is given simply by the cross product of the two points that lie on the line, each written in homogeneous form. So this is a very, very simple way of finding the equation of a line that joins two points, much more convenient than it is in Cartesian coordinates.
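
The line-joining rule can be tried out with a few lines of code. This is a minimal sketch in plain Python rather than the MATLAB used later in the lecture; the helper names `cross`, `dot` and `line_through` are hypothetical, and the example points are chosen here for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def line_through(p, q):
    """Homogeneous line (l1, l2, l3) joining two Cartesian points.

    Hypothetical helper: each point gets a 1 appended, then the
    line is the cross product of the two homogeneous points.
    """
    return cross((p[0], p[1], 1), (q[0], q[1], 1))


# A vertical line through (3, 0) and (3, 5): no infinite gradient needed.
vertical = line_through((3, 0), (3, 5))
print(vertical)                      # (-5, 0, 15), i.e. -5x + 15 = 0, so x = 3

# Any other point with x = 3 satisfies the point equation of the line.
print(dot(vertical, (3, 7, 1)))      # 0
```

The middle element of the result is zero, which is how the homogeneous form encodes a vertical line without any special case.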
We might also be interested in the coordinate of the point at the intersection of two lines. So here we have two lines and this is the intersection point, and the intersection point is given by the cross product of the two lines. We call this the line equation of a point. The point is defined in terms of two lines.
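
The intersection rule can be sketched the same way. Again this is plain Python with hypothetical helper names (`cross`, `intersection`), and the two example lines are chosen here for illustration. Returning `None` for parallel lines is a design choice of this sketch: their cross product has a zero third element, so there is no finite Cartesian point to divide out.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersection(l, m):
    """Cartesian intersection of two homogeneous lines, or None if parallel."""
    x, y, z = cross(l, m)
    if z == 0:
        return None          # parallel lines: no finite intersection point
    return (x / z, y / z)    # homogeneous -> Cartesian: divide by the third element


line_a = (1, -1, 0)   # x - y = 0, i.e. y = x
line_b = (1, 0, -3)   # x - 3 = 0, a vertical line
point = intersection(line_a, line_b)
print(point)          # (3.0, 3.0)
```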

So let’s create a point: a horizontal coordinate of 100 and a vertical coordinate of 200. Points are represented as column vectors. What I’ve written just there is a row vector, so I will transpose it, and it will display as a column vector. So this is a Euclidean or Cartesian coordinate for a point.

Now we can convert the Euclidean coordinate into a homogeneous coordinate using the function e2h, and we can see that all it has done is appended a 1 to it. So now instead of a vector with two elements, it’s now a column vector with three elements. The last element is a one.

Now we can convert the homogeneous coordinate back to a Euclidean coordinate by using the inverse function h2e, for homogeneous to Euclidean, and I apply that to the answer from the last operation and we see that it is back now to a vector with two elements. The result is the vector that we started with.
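
For readers without the MATLAB session to hand, the round trip can be imitated in plain Python. This is only a sketch: the functions below borrow the names `e2h` and `h2e` from the lecture but are stand-ins written here, they work on tuples rather than column vectors, and they are not the toolbox implementations.

```python
def e2h(p):
    """Euclidean -> homogeneous: append a 1 (stand-in for the lecture's e2h)."""
    return tuple(p) + (1,)


def h2e(ph):
    """Homogeneous -> Euclidean: divide by the last element and drop it."""
    *head, w = ph
    if w == 0:
        raise ValueError("last element is zero: no finite Euclidean point")
    return tuple(c / w for c in head)


p = (100, 200)     # horizontal 100, vertical 200, as in the lecture
ph = e2h(p)
print(ph)          # (100, 200, 1)

back = h2e(ph)
print(back)        # (100.0, 200.0)
```

Because `h2e` divides by the last element, a scaled triple such as `(200, 400, 2)` converts back to the same point.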



Let’s recap the basics of homogeneous coordinates to represent points on a plane.

Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

Skill level

This content assumes an understanding of high school level mathematics; for example, trigonometry, algebra, calculus, physics (optics) and experience with MATLAB command line and programming, for example workspace, variables, arrays, types, functions and classes.
