Just to recap, we have a camera which is observing an object in three-dimensional space. We can project the position of the object in 3D space onto the image plane, where it has a coordinate U, V. Assume that the camera is able to move and that it has a translational velocity and a rotational velocity. The velocity of the point on the image plane is related to the camera's spatial velocity by the image Jacobian.
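Written out in symbols, the recap might look like the following. The notation here is an assumption made for illustration (the recap itself does not show it): the image plane velocity is on the left, the 2x6 image Jacobian J_p depends on the image coordinate and the depth Z of the point, and the camera spatial velocity stacks three translational and three rotational components.

$$
\begin{pmatrix} \dot{u} \\ \dot{v} \end{pmatrix}
= J_p(u, v, Z)
\begin{pmatrix} v_x & v_y & v_z & \omega_x & \omega_y & \omega_z \end{pmatrix}^{\top}
$$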
We can demonstrate this principle very simply. If I move my head to the right then the robot and everything else in the world appears to move to the left. If instead I rotate my head towards the right then everything in the world also tends to move to the left.
So there's an ambiguity between pure translation of my head to the right and rotation of my head to the right. The visual appearance is somewhat similar.
There are two important concepts that we need to be aware of. The first is perceptibility. In general, as the camera moves, points on the image plane move as well.
But there are some cases where, with the camera moving, some points in the image plane will not appear to move. In that case we say the motion is imperceptible. Looking at the image, we are not able to tell whether the camera is moving or not.
Another important issue is what we call ambiguity, and that is that sometimes quite different camera velocities cause the same change in the image. We will give some concrete examples of both of these effects shortly.
Consider the case where the camera is moving in the positive Z direction. We have this radiating pattern of optical flow that we introduced earlier.
Let's consider an image plane point right in the middle of the image plane. We can see that the optical flow vectors get larger as we move away from the centre. In fact, at the very centre there is no apparent velocity. If I am heading directly towards an object, it doesn't appear to be moving to the side or upwards or downwards; it stays exactly where it is.
This is the thing that I am going to hit if I keep moving in the positive Z direction. So if I just look at a point in the middle of the image plane, it's not possible for me to determine whether I am moving forwards or not.
Another example of imperceptible motion is the case where the camera is rotating around its optical axis. So again if I consider a point in the image plane there is no apparent velocity at this particular point with the camera rotating.
Clearly if I consider a point toward the edge of the image there is a significant velocity component due to the camera rotation. But, in the centre of the image there is none and I cannot tell by looking at a point there whether my camera is rotating around the optical axis or not.
Let's compute the visual Jacobian, but this time we are going to do it about the point which is in the centre of the image. That is at coordinate 512, 512, which is the principal point.
That's where the optical axis pierces the image plane and we can see that up here in the parameters of the camera object.
We will compute the visual Jacobian at this particular point. Now let's take a look at what happens when we move the camera, and start again with motion in the "X" direction. We transpose that and we see again that this point moves at 160 pixels per second in the negative "U" direction.
Now let's see what happens if we move in the "Z" direction and we see now that this particular point is going to have a zero velocity.
So my camera is moving forward at a metre per second, and as far as that particular point is concerned, its projection on the image plane is not moving at all. The same thing happens in the case of rotation around the optical axis. We see again that this particular point has zero motion on the image plane.
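The same experiment can be sketched in plain Python. This is only an illustration, not the toolbox code used in the demonstration: the function names are made up here, and the focal length of 800 pixels and the depth of 5 metres are assumed values, chosen because they give 160 pixels per second for unit velocity in the "X" direction.

```python
def image_jacobian(u, v, Z, f=800.0, u0=512.0, v0=512.0):
    """2x6 image Jacobian for a point feature.

    Assumed values: f is the focal length in pixels, (u0, v0) is the
    principal point and Z is the depth of the point in metres.
    """
    ub, vb = u - u0, v - v0  # coordinates relative to the principal point
    return [
        [-f / Z, 0.0, ub / Z, ub * vb / f, -(f * f + ub * ub) / f, vb],
        [0.0, -f / Z, vb / Z, (f * f + vb * vb) / f, -ub * vb / f, -ub],
    ]


def image_velocity(J, nu):
    """Multiply the Jacobian by a camera velocity (vx, vy, vz, wx, wy, wz)."""
    return [sum(j * n for j, n in zip(row, nu)) for row in J]


# Jacobian at the principal point, for a point assumed to be 5 m away
J = image_jacobian(512, 512, Z=5.0)

flow_x = image_velocity(J, [1, 0, 0, 0, 0, 0])   # translation along X
flow_z = image_velocity(J, [0, 0, 1, 0, 0, 0])   # translation along Z
flow_wz = image_velocity(J, [0, 0, 0, 0, 0, 1])  # rotation about the optical axis

print("X translation:", flow_x)
print("Z translation:", flow_z)
print("Z rotation:   ", flow_wz)
```

With these assumed numbers, the point moves in the negative "U" direction for "X" translation, and both components are zero for motion along, or rotation about, the optical axis.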
Motion ambiguity is an interesting thing. If we look at the optical flow field due to positive velocity in the "X" direction, we see that all the optical flow vectors point towards the left.
If I were to rotate the camera in a positive direction around the "Y" axis we get the flow field shown here on the right. And we can see that they look quite similar. They are certainly very similar in the middle rows of the image, but we can see at the top and the bottom of the image that for the camera rotation case the optical flow vectors appear to be slightly curved, they are not all parallel as they are in the case of pure velocity in the "X" direction.
There is some sort of ambiguity between translation and rotation. If I translate the camera or I rotate the camera we get somewhat similar optical flow fields. Let's explore this in a little bit more detail.
On the left we see the optical flow due to pure camera translation. On the right hand side we have the case for the camera rotating around the "Y" axis. The top graph shows the case when the camera has a lens with a large focal length. This corresponds to a telephoto or a zoom lens. And we see in this case that the optical flow patterns are very, very similar indeed. However, if we use a lens with a small focal length that corresponds to a wide field of view we see that the optical flow pattern now is very, very different to the pattern due to pure "X" translation. This ambiguity in fact depends on the focal length of the lens. We can explain this by looking in some detail at the structure of the image Jacobian matrix.
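A small numerical sketch can make the focal length dependence concrete. It compares the direction of the flow vector for "X" translation with that for "Y" rotation at one corner of the image. Everything specific here is an assumption for illustration: the helper names, the 1024 x 1024 image with principal point at 512, 512, the depth of 5 metres, and the two focal lengths of 8000 and 400 pixels.

```python
import math


def flow_vx(u, v, Z, f, u0=512.0, v0=512.0):
    """Image velocity for unit camera translation along X (Jacobian column 1)."""
    return (-f / Z, 0.0)


def flow_wy(u, v, Z, f, u0=512.0, v0=512.0):
    """Image velocity for unit camera rotation about Y (Jacobian column 5)."""
    ub, vb = u - u0, v - v0
    return (-(f * f + ub * ub) / f, -ub * vb / f)


def angle_between_deg(a, b):
    """Unsigned angle in degrees between two 2D flow vectors."""
    cross = a[0] * b[1] - a[1] * b[0]
    dot = a[0] * b[0] + a[1] * b[1]
    return math.degrees(math.atan2(abs(cross), dot))


corner = (1024.0, 1024.0)      # a point in the corner of the image
middle_row = (1024.0, 512.0)   # a point at the edge of the middle row
Z = 5.0                        # assumed depth in metres

angles = {}
for name, f in (("long", 8000.0), ("short", 400.0)):
    t = flow_vx(*corner, Z, f)
    r = flow_wy(*corner, Z, f)
    angles[name] = angle_between_deg(t, r)
    print(f"{name} focal length ({f:.0f} px): flows differ by {angles[name]:.2f} degrees")

# On the middle row the two flow directions coincide whatever the focal length
middle_angle = angle_between_deg(flow_vx(*middle_row, Z, 400.0),
                                 flow_wy(*middle_row, Z, 400.0))
print("middle row, short focal length:", middle_angle, "degrees")
```

The flow directions at the corner are nearly parallel for the long focal length and clearly different for the short one, while on the middle row they agree in both cases.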
Consider the case where the camera has translational velocity in the "X" direction. That velocity multiplies this column of the image Jacobian. The ambiguous case occurs with rotation around the "Y" axis, and omega "Y" multiplies this column of the Jacobian matrix.
For the case where "F" is very large, a long focal length, these two terms become quite similar. What happens in this case is that, because "F" is very large, it dominates "U" squared. So the result is effectively a constant; it's no longer a function of "U". The same is true for the first column: the corresponding term is not a function of "U". So these two terms become similar functions.
If we look at the second row, we can see that "F" is in the denominator, so as "F" becomes very large this term will approach zero and again become very similar to the corresponding element in the first column of the image Jacobian matrix.
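For reference, this argument can be written out against one common form of the point-feature image Jacobian. The notation is an assumption for illustration: f is the focal length in pixels, Z is the depth of the point, and the barred coordinates are measured relative to the principal point.

$$
J_p =
\begin{pmatrix}
-\dfrac{f}{Z} & 0 & \dfrac{\bar{u}}{Z} & \dfrac{\bar{u}\bar{v}}{f} & -\dfrac{f^2 + \bar{u}^2}{f} & \bar{v} \\[2ex]
0 & -\dfrac{f}{Z} & \dfrac{\bar{v}}{Z} & \dfrac{f^2 + \bar{v}^2}{f} & -\dfrac{\bar{u}\bar{v}}{f} & -\bar{u}
\end{pmatrix}
$$

The first column multiplies the "X" velocity and the fifth column multiplies omega "Y". Factoring the fifth column gives

$$
\begin{pmatrix} -\dfrac{f^2 + \bar{u}^2}{f} \\[2ex] -\dfrac{\bar{u}\bar{v}}{f} \end{pmatrix}
= -f \begin{pmatrix} 1 + \dfrac{\bar{u}^2}{f^2} \\[2ex] \dfrac{\bar{u}\bar{v}}{f^2} \end{pmatrix}
\approx -f \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= Z \begin{pmatrix} -\dfrac{f}{Z} \\ 0 \end{pmatrix}
\quad \text{when } f \gg |\bar{u}|, |\bar{v}|
$$

so under this approximation a rotation rate omega "Y" produces about the same image motion as an "X" velocity of Z times omega "Y".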
We can take an intuitive approach to this problem. Here we have a very wide field of view image of a cathedral, which corresponds to a very small focal length lens. And here we see a narrow field of view image of the same cathedral, and that corresponds to a large focal length lens: a big zoom lens, a telephoto lens if you like.
Now that large focal length lens corresponds to the central part of this wide field of view image. So in this central part of the wide field of view image, where the "U" and "V" coordinates are small, we get this motion ambiguity. We can't tell whether the camera is rotating or translating. But in the periphery of the image, where "U" and "V" are large, the motion is not ambiguous at all. Camera rotation and camera translation cause very, very different optical flow phenomena. This ambiguity is very definitely related to the focal length.
Clearly rotational motion and translational motion are very, very different. But what we've just been saying is that, using an image sensor just like our eye, it is not possible for us to tell in general the difference between some rotational motions and some translational motions.
Surely this would be a showstopper for using vision to control robots. And it is a bit of a problem and one way to get around that is to use a separate sensor to measure rotation.
So this is a picture of a rotational measurement unit, which has gyroscope sensors in it that can independently measure the rotation of a robot. We can use this extra information to disambiguate the information that we get from our visual sensor.
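One way such a measurement could be used is sketched below. This is an illustrative approach, not something described in the lecture: if the gyroscopes tell us the angular velocity, we can predict the image motion that the rotation alone would cause and subtract it, leaving only the part due to translation. The function names, the focal length of 800 pixels, the chosen image point and the simulated velocities are all assumptions.

```python
def image_jacobian(u, v, Z, f=800.0, u0=512.0, v0=512.0):
    """2x6 image Jacobian for a point feature (f in pixels, Z in metres)."""
    ub, vb = u - u0, v - v0
    return [
        [-f / Z, 0.0, ub / Z, ub * vb / f, -(f * f + ub * ub) / f, vb],
        [0.0, -f / Z, vb / Z, (f * f + vb * vb) / f, -ub * vb / f, -ub],
    ]


def derotate(flow, u, v, gyro, f=800.0, u0=512.0, v0=512.0):
    """Subtract the flow predicted from the gyro reading (wx, wy, wz)."""
    # The three rotational columns do not contain Z, so a dummy depth is fine.
    J = image_jacobian(u, v, 1.0, f, u0, v0)
    return [fl - sum(j * w for j, w in zip(row[3:], gyro))
            for fl, row in zip(flow, J)]


# Simulate a camera that translates along X and rotates about Y at once
u, v, Z = 700.0, 300.0, 5.0
nu = [0.5, 0.0, 0.0, 0.0, 0.1, 0.0]  # (vx, vy, vz, wx, wy, wz)
J = image_jacobian(u, v, Z)
measured = [sum(j * n for j, n in zip(row, nu)) for row in J]

# Use the (here, perfect) gyro reading to remove the rotational part
translational = derotate(measured, u, v, gyro=nu[3:])

# For comparison: the flow that the translation alone would produce
expected = [sum(j * n for j, n in zip(row[:3], nu[:3])) for row in J]

print("measured flow:     ", measured)
print("after derotation:  ", translational)
print("translation alone: ", expected)
```

The sketch relies on the rotational columns of the Jacobian being independent of depth, so the correction needs only the gyro reading and the camera parameters. The remaining translational flow still depends on the unknown depth Z.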
But, what about us? We have our eyes and clearly for us we can easily determine if we are rotating or whether we are translating. Yes we can, but to do that we also use inertial sensors. We actually have rotational sensors in our head. In our inner ear there are these three semi-circular canals and they measure the angular velocity of our head and we have three sensors in our left ear and three sensors in our right ear. And our brain fuses the rotational information that comes from our inner ear with our optical flow information that comes from our eyes.
When it is fused in our brain, it gives us very unambiguous information about our motion: whether we are rotating or whether we are translating.
This is a very, very powerful illusion. I am inside a rotating drum, and my eye can detect the motion of this drum. My eye and my brain are computing what we call optical flow, and this optical flow that’s caused by that rotating drum is the same sort of optical flow that I get if I rotate my head this way, or I rotate my head that way. And this particular illusion causes me to feel a little bit uneasy, because the information that I get with my eyes, the optical flow, tells me that my head is moving from side to side, but the gyroscopic sensors in my ears tell me that’s not happening. So there’s a disconnect between what my ears are telling me about my attitude and what my eyes are telling me about my motion, and that leads to the sensation of seasickness or a slight nausea, which is why I’m holding on very tightly to these rails.
The relationship between world coordinates, image coordinates and camera spatial velocity has some interesting ramifications. Some very different camera motions cause identical motion of points in the image, and some camera motions lead to no change at all in some parts of the image. Let’s explore these phenomena and how we can overcome them.
This content assumes high school level mathematics and requires an understanding of undergraduate-level mathematics; for example, linear algebra - matrices, vectors, complex numbers, vector calculus and MATLAB programming.