These image-based visual control strategies have two important requirements.
The first is to reliably find the points on the target object, and by reliably I mean in every single frame: as the object moves or the camera moves, we need to be able to identify the same points on the target object.
The second problem is to determine which point is which. This is what we call the correspondence problem. So, in the case of our triangular target from before, we need to determine which point in the current view corresponds to which point in the desired view.
Consider now the first problem: how do we reliably find points on the target object?
Once upon a time this was done by creating artificial scenes with very high contrast. Here are two examples from my PhD research back in the 1990s, and what I had was very much a black background and a white object. So we could use very simple thresholding and binary vision processing techniques, which were very, very fast, in order to identify the object. But this is, clearly, very unrealistic.
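The thresholding and centroid computation just described can be sketched in a few lines of NumPy. This is a minimal illustration with a synthetic image, not the original PhD-era implementation: a fixed threshold turns the grey-scale image into a binary one, and the object's centroid is the mean of the white-pixel coordinates.

```python
import numpy as np

# Synthetic high-contrast scene: black background, white object.
img = np.zeros((100, 100), dtype=np.uint8)
img[30:60, 40:80] = 255                   # white rectangular "object"

binary = img > 128                        # simple fixed threshold
rows, cols = np.nonzero(binary)           # coordinates of object pixels
centroid = (rows.mean(), cols.mean())     # (row, column) centre of mass

print(centroid)                           # → (44.5, 59.5)
```

On a real image the threshold would be chosen from the image histogram rather than hard-coded, but the binary processing itself is this simple, which is why it was so fast.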
Another approach is to introduce something into the scene that has a distinctive color. We have talked before about how this mobile robot could navigate with respect to an orange traffic cone. The image on the right shows our landing target for an indoor flying robot. It could see this configuration of four yellow targets, identify their centroid, and land on them.
Another way we can make the problem easy is to make the target something that's very bright; something that actually emits light. So perhaps we could put a colored LED on the object in the scene, and the robot's vision system can identify that particular patch of very bright color. Using a bright, distinctive color makes the object much easier to segment.
Another approach is to put some kind of pattern on the object, and here we see a number of patterns that are used for localisation. In the middle we have what we call a self-similar landmark, and you can see that the computer vision algorithm is able to very, very robustly determine the positions of the centres of those three landmarks, almost irrespective of the pose of the sheet, even if the targets are somewhat obscured. On the left we see some related landmarks, this time using vertical lines rather than circles, which are used in that particular robot to identify the location of the crucible so the robot can pick it up.
We can also use a QR code, as shown here on the right. We can compute the centroid of the QR code; it is a very distinctive pattern, but the QR code also contains a number of bytes of information. Oftentimes these are used to encode URLs in advertising. So this QR code has a position, but it also has some information associated with it.
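A marker detector (OpenCV's `cv2.QRCodeDetector` is one example) returns the four image-plane corner points of the code; the centroid used for visual servoing then follows directly from those corners. The corner coordinates below are hypothetical stand-ins for a detector's output:

```python
import numpy as np

# Hypothetical corner points of a detected QR code, in (u, v)
# image coordinates, as a detector would return them.
corners = np.array([[120.0,  80.0],
                    [220.0,  85.0],
                    [215.0, 180.0],
                    [118.0, 175.0]])

# The point feature we servo on: the mean of the four corners.
centroid = corners.mean(axis=0)
print(centroid)                     # → [168.25 130.  ]
```

The four corners themselves also solve the correspondence problem for us, since the detector reports them in a known, consistent order.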
Related to QR codes is another type of code called an AR code, or augmented reality code, and these can be used to label objects in a three-dimensional world.
Visual servoing is concerned with the motion of points in the world. How can we reliably detect such points using computer vision techniques?
This content requires an understanding of undergraduate-level mathematics; for example, linear algebra (matrices, vectors, complex numbers), vector calculus, and MATLAB programming.