Vision and motion
We have already discussed how the pixel velocity, or image plane velocity, is related to the camera velocity by the image Jacobian matrix. These quantities are all interrelated, so if we know some of them we can use that information to estimate the others. The Jacobian matrix depends on a constant parameter, f-hat, which is related to the focal length; on the image plane coordinates of the point, u and v; and on the depth of the point, which is the Z coordinate of the point in 3D space. It is interesting to note that the depth, capital Z, occurs only in the first three columns of the image Jacobian matrix, the columns that multiply the translational velocity.
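For reference, one common way to write this relationship for a single point is shown below. It is a sketch under stated assumptions rather than a copy of the slide: u and v are taken to be measured relative to the principal point, f-hat is written as \hat{f} (the focal length expressed in pixels), and the camera velocity is ordered as three translational components followed by three rotational components.

```latex
\begin{pmatrix} \dot{u} \\ \dot{v} \end{pmatrix}
=
\begin{pmatrix}
-\dfrac{\hat{f}}{Z} & 0 & \dfrac{u}{Z} & \dfrac{u v}{\hat{f}} & -\dfrac{\hat{f}^2 + u^2}{\hat{f}} & v \\[2ex]
0 & -\dfrac{\hat{f}}{Z} & \dfrac{v}{Z} & \dfrac{\hat{f}^2 + v^2}{\hat{f}} & -\dfrac{u v}{\hat{f}} & -u
\end{pmatrix}
\begin{pmatrix} v_X \\ v_Y \\ v_Z \\ \omega_X \\ \omega_Y \\ \omega_Z \end{pmatrix}
```

Written this way, Z appears only in the first three columns, and the last three columns depend only on u, v and \hat{f}.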
Let's say we want to use this interrelationship to estimate the distance to a point.
If we look at a point across a sequence of images we can compute its velocity in the image plane. We know the coordinate of the point on the image plane, and we can measure the camera velocity using something like an inertial measurement unit, which would tell us the rotational velocities and also give us the translational velocity. So if we know these three things, the pixel velocity, the pixel coordinate and the camera velocity, then it is reasonably straightforward to solve for the distance to the point.
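A minimal sketch of that depth calculation follows, assuming the usual point-feature image Jacobian with u and v measured relative to the principal point and f the focal length in pixels. The function name `estimate_depth` and its argument layout are illustrative, not something from the lecture. The idea is to subtract the rotational part of the flow, which does not depend on Z, and then fit the single unknown 1/Z to what remains.

```python
def estimate_depth(u, v, flow, cam_vel, f):
    """Estimate the depth Z of one point from its optical flow.

    u, v     : pixel coordinates relative to the principal point (assumed)
    flow     : (u_dot, v_dot), measured pixel velocity
    cam_vel  : (vx, vy, vz, wx, wy, wz), known camera velocity
    f        : focal length in pixels (the lecture's f-hat)
    """
    vx, vy, vz, wx, wy, wz = cam_vel
    omega = (wx, wy, wz)

    # Rotational columns of the image Jacobian: these do not involve Z.
    ju = (u * v / f, -(f * f + u * u) / f, v)
    jv = ((f * f + v * v) / f, -u * v / f, -u)

    # Remove the flow caused by rotation.
    bu = flow[0] - sum(j * w for j, w in zip(ju, omega))
    bv = flow[1] - sum(j * w for j, w in zip(jv, omega))

    # The remaining flow equals (1/Z) * (au, av).
    au = -f * vx + u * vz
    av = -f * vy + v * vz

    denom = au * au + av * av
    if denom == 0.0:
        raise ValueError("no translational flow: depth is unobservable")

    # Least-squares fit of the scalar 1/Z to the two equations.
    inv_z = (au * bu + av * bv) / denom
    if inv_z == 0.0:
        raise ValueError("flow is consistent with a point at infinity")
    return 1.0 / inv_z
```

Note that the depth cannot be recovered when the camera is only rotating, because rotation produces the same flow for near and far points.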
Consider now the problem where we want to estimate the speed of the camera.
We can again measure the pixel velocity by looking at some points in a scene across a sequence of images. We know the u and v coordinates of each point in the image. In this case we need to determine the distance to the point, and we can use something like stereo vision to tell us how far away an object is from the camera. Once we have these three pieces of information we can solve for the camera velocity. A technique like this is referred to as visual odometry. An odometer is a device on a mobile robot that counts the rotations of the wheels to determine how far the robot has moved. Visual odometry provides the same information, but doesn't actually require a wheel and an odometer. It estimates the motion by observing how the world is flowing past the camera.
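The lecture does not spell out how the solution is computed at this point. One common approach, shown here as an assumption rather than as the method used in the lecture, is to stack the 2x6 image Jacobians of N points with known depths and solve in the least-squares sense. Each point contributes two equations and the camera velocity has six unknowns, so at least three points in a non-degenerate configuration are needed. The symbols \nu for the camera velocity and J_i for the Jacobian of point i are notation introduced for this sketch.

```latex
\begin{pmatrix} \dot{u}_1 \\ \dot{v}_1 \\ \vdots \\ \dot{u}_N \\ \dot{v}_N \end{pmatrix}
=
\underbrace{\begin{pmatrix} J_1(u_1, v_1, Z_1) \\ \vdots \\ J_N(u_N, v_N, Z_N) \end{pmatrix}}_{J \,\in\, \mathbb{R}^{2N \times 6}}
\nu,
\qquad
\nu = \left( J^\top J \right)^{-1} J^\top
\begin{pmatrix} \dot{u}_1 \\ \dot{v}_1 \\ \vdots \\ \dot{u}_N \\ \dot{v}_N \end{pmatrix},
\qquad N \ge 3
```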
Let's consider a simple example of visual odometry.
This is an expression that we have seen before, but I have highlighted two sub-matrices; they are both 1×3 sub-matrices. I am going to denote one of them as J sub u and the other as J sub v. We assume that the robot is moving in the X-Y plane and that the world is a constant distance away from the robot.
We've talked about optical flow. At a particular pixel coordinate we can determine its velocity in the image plane, and that gives us u-dot and v-dot. We can use an inertial measurement unit to measure the instantaneous angular velocity, and substitute that into the spatial velocity vector for the elements omega-X, omega-Y and omega-Z. Because the robot is moving at a constant distance with respect to the world, Z is known and constant, so we can say that vZ is approximately equal to zero. There are also some elements in the image Jacobian that are not particularly significant. Their value is close to zero, so as a first-order approximation we can just leave them out.
The result is a simplified expression for u-dot and v-dot in terms of vX and vY, the camera velocity components that we are actually trying to determine, together with the distance to the world, the sub-matrices of the image Jacobian, and the angular velocity of the camera.
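One way to write that simplified expression is sketched below. It assumes that J sub u and J sub v are the 1×3 blocks of the image Jacobian that multiply the angular velocity, that u and v are measured relative to the principal point, and that \hat{f} denotes f-hat. It sets vZ to zero but keeps every term of the two sub-matrices, so it may retain small terms that the lecture drops.

```latex
J_u = \begin{pmatrix} \dfrac{u v}{\hat{f}} & -\dfrac{\hat{f}^2 + u^2}{\hat{f}} & v \end{pmatrix},
\qquad
J_v = \begin{pmatrix} \dfrac{\hat{f}^2 + v^2}{\hat{f}} & -\dfrac{u v}{\hat{f}} & -u \end{pmatrix}

\dot{u} \approx -\frac{\hat{f}}{Z}\, v_X + J_u\, \omega,
\qquad
\dot{v} \approx -\frac{\hat{f}}{Z}\, v_Y + J_v\, \omega,
\qquad
\omega = \begin{pmatrix} \omega_X \\ \omega_Y \\ \omega_Z \end{pmatrix}
```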
We can rearrange this to get expressions for the velocity of the camera in terms of various quantities that we are able to measure.
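The rearrangement can be sketched in a few lines of Python. This is an illustration under assumptions, not the lecture's own code: it takes the planar-motion model u_dot = -(f/Z) vx + J_u·omega and v_dot = -(f/Z) vy + J_v·omega with vz = 0, uses the full rotational sub-matrices rather than dropping small terms, and assumes u and v are measured relative to the principal point. The function name `planar_velocity` is hypothetical.

```python
def planar_velocity(u, v, flow, omega, Z, f):
    """Estimate camera velocity (vx, vy) from optical flow at one pixel.

    u, v   : pixel coordinates relative to the principal point (assumed)
    flow   : (u_dot, v_dot), optical flow at that pixel
    omega  : (wx, wy, wz), angular velocity, e.g. from gyroscopes
    Z      : known, constant distance to the world
    f      : focal length in pixels (the lecture's f-hat)
    """
    # Rotational sub-matrices J_u and J_v of the image Jacobian.
    ju = (u * v / f, -(f * f + u * u) / f, v)
    jv = ((f * f + v * v) / f, -u * v / f, -u)

    # Flow that is explained by camera rotation alone.
    rot_u = sum(j * w for j, w in zip(ju, omega))
    rot_v = sum(j * w for j, w in zip(jv, omega))

    # Solve u_dot = -(f/Z) vx + rot_u and v_dot = -(f/Z) vy + rot_v.
    vx = -(Z / f) * (flow[0] - rot_u)
    vy = -(Z / f) * (flow[1] - rot_v)
    return vx, vy
```

In practice the flow at a single pixel is noisy, so an estimate like this would typically be averaged over many pixels.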
So without having a wheel and without having an actual encoder or an odometer we can estimate the velocity of our aircraft, as it's moving through the air.
Visual odometry is really well suited to robots that don't have conventional odometers. The conventional odometer requires that you have got a wheel and some kind of counter that tells you how many times the wheel has turned. If it is a flying robot or if it is an underwater robot then it doesn't have wheels and then visual odometry is a really useful sensing modality.
So here we can see the underwater robot. It is using its stereo cameras to maintain a constant altitude above the seabed. It is also computing optical flow, and it is combining that with the height information that comes from the stereo vision and with information from onboard gyroscopes, which give it its angular rate, in order to determine its velocity with respect to the seabed. If, for example, there is an ocean current pushing the robot sideways then the visual odometry will pick this up. There will be a Y component of velocity, and the control system onboard the robot can apply appropriate propeller thrust to counter it. So visual odometry is telling the robot its true velocity with respect to the seabed.
The image Jacobian depends not only on the image plane coordinates but also the distance from the camera to the points of interest. If this distance is not known, what can we do? Let’s look at how we can determine this distance, and how the optical flow equation can be rearranged to convert from observed image plane optical flow to actual camera velocity which is also known as visual odometry.
This content assumes high school level mathematics and requires an understanding of undergraduate-level mathematics, for example linear algebra (matrices, vectors), complex numbers, vector calculus, and MATLAB programming.