In a much earlier lecture, we talked about the data mismatch problem. That's the fact that a sensor like a camera emits a large number of pixels, perhaps 1 million pixels in every image, and it's going to be outputting anything from 10 to 50 images per second. That is an enormous amount of data.
A machine like a robot requires relatively little information in order to tell it where to go. An arm-type manipulator like the one shown here requires just 6 numbers, which describe the angles of the joints of the robot.
A mobile robot like the vacuum cleaner shown here really only requires 2 numbers to control its motion: its forward velocity V and its rotational velocity omega.
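To make those 2 numbers concrete, here is a minimal sketch of how a forward velocity v and rotational velocity omega move such a robot; the function name and the simple Euler integration are illustrative, not a specific controller from the lecture.

```python
import math

def unicycle_step(x, y, theta, v, omega, dt):
    """Advance a unicycle-model robot by one time step.

    (x, y) is position in metres, theta is heading in radians,
    v is forward velocity (m/s), omega is rotational velocity (rad/s).
    Simple Euler integration, which is fine for small dt.
    """
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta

# Drive forward at 0.5 m/s while turning at 0.1 rad/s for 1 second
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = unicycle_step(*pose, v=0.5, omega=0.1, dt=0.1)
```

Just these two commands, updated over time, are enough to take the robot anywhere in the plane.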
So, we have the problem of a huge amount of data coming from a camera sensor and relatively little data required by the robot. The way we get around that is to extract what we call features from the image, and we've talked a bit in previous lectures about region features. We've talked about how we can take an input image, find regions, which are sets of pixels with either similar intensity or similar color, and then describe those regions in terms of their centroid, their bounding box, moments, perimeter, circularity and so on.
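As a small illustration of how a many-pixel region collapses into a few numbers, here is a sketch that computes a region's area, centroid (from the first-order moments) and bounding box; the function name and dictionary layout are my own, not from the lecture.

```python
def region_features(pixels):
    """Compute simple region features from a set of (u, v) pixel coordinates."""
    pixels = list(pixels)
    m00 = len(pixels)                    # zeroth moment = area in pixels
    m10 = sum(u for u, v in pixels)      # first moment in u
    m01 = sum(v for u, v in pixels)      # first moment in v
    centroid = (m10 / m00, m01 / m00)    # centre of mass of the region
    us = [u for u, v in pixels]
    vs = [v for u, v in pixels]
    bbox = (min(us), min(vs), max(us), max(vs))
    return {"area": m00, "centroid": centroid, "bbox": bbox}

# A small 3x3 square region of pixels
square = [(u, v) for u in range(3) for v in range(3)]
feats = region_features(square)
```

Nine pixels in, a handful of numbers out: that is the data reduction at work.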
We've talked a lot about regions, but there are a number of other types of features that are really useful and interesting, and that I'd like to talk a little bit about.
Most real world scenes contain a lot of lines. In this particular picture of a church, I've highlighted a number of lines. Some of these lines come from the contrast between the roof and the sky, between the roof and the wall, the edges of windows, the edges of doors and so on. Certainly, in any man-made environment there is an enormous number of lines, but even in many natural environments there are lines as well, perhaps the vertical edges of tree trunks and so on.
So, lines are very prevalent in the environment, and lines are very simple and compact to describe. A line might contain a large number of pixels, but it is really described by just 2 parameters: its intercept and its slope.
Now, an interesting question then is: How do we extract the important, dominant lines within an image?
Another class of features is what we call point features. Overlaid on this picture of a building are points in the image that are interesting, that is, points that are quite distinctive, and if I took another picture of this building, I'd have a pretty good chance of finding those same points in a different view of the same building.
So, if I moved my camera, or if the sun came out or went behind a cloud, I would still be able to locate these points. The overlaid graphical information tells us a lot about these feature points. The centre of the circle indicates the centre of the distinctive pattern, that is, a pattern that we are very likely to be able to find in another view of the same scene.
The size of the circle indicates something about the scale of the pattern. Is it a very small distinct pattern or a large distinct pattern? The radial lines say something about the orientation of that pattern.
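A common way to find such distinctive points is a corner detector, which looks for pixels where intensity changes strongly in two directions at once. Here is a minimal sketch in the spirit of the Harris detector, operating on a list-of-lists grayscale image; the 3x3 window, the constant k, and the lack of smoothing and non-maximum suppression are simplifications of what a real detector would do.

```python
def harris_response(img, k=0.04):
    """Harris-style corner response for a 2D list-of-lists grayscale image.

    Positive response: corner (gradient strong in two directions).
    Negative response: edge (gradient strong in one direction only).
    Near zero: flat region.
    """
    h, w = len(img), len(img[0])
    # Central-difference image gradients (borders left as zero)
    Ix = [[0.0] * w for _ in range(h)]
    Iy = [[0.0] * w for _ in range(h)]
    for v in range(1, h - 1):
        for u in range(1, w - 1):
            Ix[v][u] = (img[v][u + 1] - img[v][u - 1]) / 2.0
            Iy[v][u] = (img[v + 1][u] - img[v - 1][u]) / 2.0
    R = [[0.0] * w for _ in range(h)]
    for v in range(1, h - 1):
        for u in range(1, w - 1):
            # Structure tensor summed over a 3x3 window
            A = B = C = 0.0
            for dv in (-1, 0, 1):
                for du in (-1, 0, 1):
                    gx, gy = Ix[v + dv][u + du], Iy[v + dv][u + du]
                    A += gx * gx
                    B += gy * gy
                    C += gx * gy
            R[v][u] = A * B - C * C - k * (A + B) ** 2
    return R

# An 8x8 image with a bright 4x4 square: its corner should respond strongly
img = [[1.0 if v < 4 and u < 4 else 0.0 for u in range(8)] for v in range(8)]
R = harris_response(img)
```

The corner of the bright square gives a positive response, a point on its edge gives a negative one, and the flat background gives zero, which is exactly the distinction a point-feature detector exploits.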