Summary of image formation


Let's recap some of the important topics we've covered in this particular lecture.  We've talked about the fundamental image formation process, and the simplest case is what we call the "pinhole camera."  Light comes from a light source, in this case the sun, and reflects off points on the three-dimensional object; those light rays pass through the pinhole and cause a very faint inverted image to be formed on the image plane. 

The geometry of this situation can be very simply explained.  It can be described using similar triangles, and that leads us to the very simple relationship here between the three-dimensional world coordinates, capital X, capital Y and capital Z, and the two-dimensional image plane coordinates, little x and little y. 
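The slide with the relationship is not reproduced in this transcript. As a sketch, the standard similar-triangles form is given below, where the symbol f is an assumption introduced here for the distance between the pinhole and the image plane. Similar triangles give x/f = X/Z and y/f = Y/Z, so:

```latex
x = \frac{f X}{Z}, \qquad y = \frac{f Y}{Z}
```

This is written for an image plane placed in front of the pinhole, which gives a non-inverted image; for the physical plane behind the pinhole both signs flip, which is the inversion mentioned above.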

So this is a mapping or a projection from three dimensions down to two dimensions and this is called "Perspective Projection." 
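A minimal sketch of this mapping in Python may help make it concrete. The function name `project` and the focal length parameter `f` are illustrative assumptions, not something defined in the lecture, and the non-inverted sign convention (image plane in front of the pinhole) is assumed.

```python
def project(X, Y, Z, f=1.0):
    """Perspective projection of a world point (X, Y, Z) to image coordinates (x, y).

    f is an assumed focal length: the distance from the pinhole to the image plane.
    """
    if Z <= 0:
        raise ValueError("the point must be in front of the camera (Z > 0)")
    # Similar triangles: x / f = X / Z and y / f = Y / Z
    return f * X / Z, f * Y / Z


x, y = project(2.0, 1.0, 4.0, f=0.5)
print(x, y)  # 0.25 0.125
```

Three numbers go in and two come out, which is the loss of a dimension that the rest of this summary is concerned with.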

One of the disadvantages of the pinhole imaging model is that a lot of the light rays are lost; they simply hit the opaque plane that holds the pinhole.  If we introduce a lens instead, we're able to gather many more of the rays that leave each particular point on the object, and that allows a brighter image to be formed. 

A consequence of the perspective projection, the mapping from three dimensions down into two, is that some of the geometry is not preserved.  Here we see that parallel lines in the world are no longer parallel lines in the image.  We see that a large circle in the world is not a circle in the image.  Perspective projection guarantees only a few things: it will map a line to a line, but it doesn't necessarily retain parallelism, and angles are not necessarily preserved.  Conic sections are mapped to conic sections, so that means that a circle could appear as an ellipse and an ellipse could appear as a circle.  
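The two claims about lines can be checked numerically. The sketch below assumes a focal length of 1 and uses two made-up parallel world lines that run away from the camera, like railway tracks; all names and coordinates are illustrative.

```python
def project(X, Y, Z, f=1.0):
    """Perspective projection with an assumed focal length f."""
    return f * X / Z, f * Y / Z


def image_direction(p, q):
    """Direction in the image between the projections of world points p and q."""
    (x0, y0), (x1, y1) = project(*p), project(*q)
    return x1 - x0, y1 - y0


def cross2(u, v):
    """2D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


# Two parallel world lines, both running along the Z axis (hypothetical values)
left = [(-1.0, -1.0, 2.0), (-1.0, -1.0, 4.0), (-1.0, -1.0, 8.0)]
right = [(1.0, -1.0, 2.0), (1.0, -1.0, 4.0), (1.0, -1.0, 8.0)]

# Parallelism: compare the image directions of the two lines
d_left = image_direction(left[0], left[2])
d_right = image_direction(right[0], right[2])
parallel_in_image = abs(cross2(d_left, d_right)) < 1e-12

# Line to line: three points on one world line stay collinear in the image
a = image_direction(left[0], left[1])
b = image_direction(left[0], left[2])
collinear_in_image = abs(cross2(a, b)) < 1e-12

print(parallel_in_image, collinear_in_image)  # False True
```

The projected tracks converge towards the centre of the image, yet each one is still a straight line.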

A really important consequence of the perspective projection is that there is no unique inverse.  We've lost a dimension, so for a particular image there are an infinite number of possible objects that could have caused that image.  It could be a small object close to me or it could be a large object further away.  There is no way, just from the image geometry, that we can tell which. 
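This ambiguity follows directly from the division by Z: scaling a world point by any positive factor leaves its image unchanged. A small sketch, assuming a focal length of 1 and using made-up coordinates:

```python
def project(X, Y, Z, f=1.0):
    """Perspective projection with an assumed focal length f."""
    return f * X / Z, f * Y / Z


# A point on a small, nearby object (hypothetical values)
small_near = (0.5, 0.25, 2.0)

# The same point on an object four times larger and four times further away
k = 4.0
large_far = tuple(k * c for c in small_near)  # (2.0, 1.0, 8.0)

same_image = project(*small_near) == project(*large_far)
print(same_image)  # True
```

Every point along the ray from the pinhole through `small_near` lands on the same image point, so the image alone cannot say which one was seen.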

So in practice we use a number of tricks that come from our experience of growing up in a three-dimensional world in order to disambiguate this.  But this interpretation can be tricked.  If we create a two-dimensional image, shown here as a piece of street art, which causes on the retina of our eye exactly the same image as we would get from looking at a three-dimensional world, then all of the three-dimensional interpretation part of our brain kicks in and causes us to interpret the image as being three-dimensional, even though the rational part of our brain knows that this is a piece of flat street art.  The impression is still very, very vivid.  

Another way we can look at the consequence of there being no unique inverse is in an illusion like this.  Chris is a full-size person, but because he's standing further away he appears to be smaller.  Although initially we're surprised when we look at that, again the rational part of our brain that understands the three-dimensional world kicks in and says, "Okay, actually we will assume that he is a full-size person; he therefore must be further away."  
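The reasoning "he looks smaller, so he must be further away" can be sketched with the same projection. The height, distances and focal length below are made-up values for illustration only.

```python
def image_height(H, Z, f=1.0):
    """Height in the image of an object of world height H at distance Z.

    f is an assumed focal length.
    """
    return f * H / Z


def distance_from_image(H, h, f=1.0):
    """Distance implied by image height h, if the true height H is assumed known."""
    return f * H / h


person_height = 1.8  # metres, an assumed "full-size" person

h_near = image_height(person_height, 2.0)
h_far = image_height(person_height, 8.0)
print(h_near / h_far)  # close to 4: four times further away, four times smaller

# Assuming the person is full size, the small image implies the larger distance
print(distance_from_image(person_height, h_far))  # close to 8.0
```

The second function only works because a height was assumed; without that prior knowledge the distance could not be recovered.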

Let’s recap the important points from the topics we have covered about image formation and perspective projection.

Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

Skill level

This content assumes an understanding of high school-level mathematics, e.g. trigonometry, algebra, calculus, physics (optics) and some knowledge/experience of programming (any language).
