Summary of 3D geometry and pose


Let’s recap some of the important concepts from this lecture.

Firstly, there are some very important conventions. The first is the so-called right-handed coordinate frame. When we construct a three-dimensional coordinate frame, we need to construct it properly, and we use our right hand to guide us. The x-axis is aligned with our thumb, the y-axis with our index finger and the z-axis with our middle finger. When it comes to describing the direction of rotation around an axis, we imagine that we’re grasping that axis with our right hand, with our thumb pointing in the direction of the arrow; the direction in which our fingers curl around the axis indicates the positive rotational direction.

A rotation in three dimensions can be described by a 3 x 3 matrix, and the columns of that matrix are unit vectors aligned with the axes of coordinate frame B, expressed in coordinate frame A. The first column is the x-axis of frame B, the second column is the y-axis of frame B and the third column is the z-axis of frame B. We use this rotation matrix to rotate a vector from frame B to frame A.
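As a small illustration of reading the columns, here is a minimal sketch in plain Python. The helper names (`rotz`, `column`, `rotate`) and the 90 degree example are assumptions made for illustration; they are not taken from the lecture.

```python
import math

def rotz(theta):
    """Elementary rotation about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def column(R, j):
    """Return column j of a 3 x 3 matrix stored as a list of rows."""
    return [R[i][j] for i in range(3)]

def rotate(R, v):
    """Multiply the 3 x 3 matrix R by the 3-vector v."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

# Frame B is frame A rotated by 90 degrees about the z-axis.
R_ab = rotz(math.pi / 2)

# The first column is the x-axis of frame B expressed in frame A.
x_b_in_a = column(R_ab, 0)

# A vector along the x-axis of B, given in B coordinates...
p_b = [1.0, 0.0, 0.0]
# ...and the same vector expressed in A coordinates.
p_a = rotate(R_ab, p_b)
```

Up to floating-point rounding, `x_b_in_a` and `p_a` both point along the y-axis of frame A, which is where the x-axis of B ends up after the rotation.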

There are three elementary rotation matrices, which correspond to rotation around the x-axis, the y-axis and the z-axis. We can create an arbitrary rotation between any two coordinate frames by using a sequence of these elementary rotations. There are in fact 12 possible rotation sequences. An important caveat, part of Euler’s rotation theorem, is that no two consecutive rotations can be about the same axis. Of these 12 sequences, six are referred to as Euler angles. They contain two rotations about the same axis, but not sequentially. For example, rotate about x, then y, then x again. Although all six of these are technically Euler angles, when the term Euler angles is used, people are generally referring to the specific sequence ZYZ. But this convention varies across different disciplines of engineering. So, when you’re talking to somebody about Euler angles, it’s important to be sure that you’re talking about exactly the same sequence.

The other six are referred to as Cardan angles. The Cardan angles have rotations about three different axes, and two of these sequences specifically are often referred to as roll, pitch, yaw angles.
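The idea of building a rotation from a sequence of elementary rotations can be sketched as follows. This is a plain-Python sketch, not the lecture’s own code: the function names are hypothetical, and the roll, pitch, yaw ordering shown is only one of the conventions in use.

```python
import math

def rotx(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def roty(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rotz(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 3 x 3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(phi, theta, psi):
    """ZYZ Euler angles: the first and last rotations share the z-axis."""
    return matmul(matmul(rotz(phi), roty(theta)), rotz(psi))

def roll_pitch_yaw(roll, pitch, yaw):
    """Assumed Cardan convention: R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
    return matmul(matmul(rotz(yaw), roty(pitch)), rotx(roll))

R_euler = euler_zyz(0.1, 0.2, 0.3)
R_rpy = roll_pitch_yaw(0.1, 0.2, 0.3)
```

The same three numbers give two different rotation matrices here, which is why it matters to agree on the sequence before exchanging angles.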

We can describe the rotation from any one frame to another in terms of a single rotation, a rotation by an angle Theta about some axis V. We’ve discussed techniques to determine Theta and V given a rotation matrix. And we introduced the Rodrigues equation, which allows us to go from an axis and an angle back to a rotation matrix.
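A minimal sketch of both directions is given below. The forward direction uses the Rodrigues formula, R = I + sin(Theta) K + (1 - cos(Theta)) K^2, where K is the skew-symmetric matrix of the unit axis. The reverse direction uses the trace and the off-diagonal differences of R, which is one common technique and not necessarily the one used in the lecture. All function names are hypothetical.

```python
import math

def skew(v):
    """Skew-symmetric matrix K such that K * u equals the cross product v x u."""
    x, y, z = v
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(theta, axis):
    """Rotation matrix for a rotation of theta radians about the given axis."""
    norm = math.sqrt(sum(a * a for a in axis))
    K = skew([a / norm for a in axis])   # the axis is normalised first
    K2 = matmul(K, K)
    s, one_minus_c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + one_minus_c * K2[i][j]
             for j in range(3)] for i in range(3)]

def angle_axis(R):
    """Recover (theta, unit axis) from R; fails when sin(theta) is near zero."""
    cos_theta = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    d = 2.0 * math.sin(theta)
    if abs(d) < 1e-9:
        raise ValueError("axis is not recoverable this way for theta near 0 or pi")
    axis = [(R[2][1] - R[1][2]) / d,
            (R[0][2] - R[2][0]) / d,
            (R[1][0] - R[0][1]) / d]
    return theta, axis

R = rodrigues(math.pi / 3, [1.0, 1.0, 1.0])
theta, axis = angle_axis(R)
```

The round trip should return the original angle and the normalised axis, up to rounding.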

We introduced quaternions. These are hypercomplex numbers. They actually comprise a scalar and a three-element vector. A unit magnitude quaternion which is referred to as a unit quaternion can be used to encode rotation. And, there are some simple rules to determine the inverse of a quaternion and to compound two quaternions, and we can work out the effect of two consecutive rotations.
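The rules for compounding and inverting can be sketched in a few lines. This sketch assumes quaternions stored as tuples `(s, x, y, z)` with the scalar part first; the function names are hypothetical, and the inverse shown is valid only for unit quaternions.

```python
import math

def qmul(q1, q2):
    """Quaternion product: (s1*s2 - v1.v2, s1*v2 + s2*v1 + v1 x v2)."""
    s1, x1, y1, z1 = q1
    s2, x2, y2, z2 = q2
    return (s1 * s2 - x1 * x2 - y1 * y2 - z1 * z2,
            s1 * x2 + x1 * s2 + y1 * z2 - z1 * y2,
            s1 * y2 - x1 * z2 + y1 * s2 + z1 * x2,
            s1 * z2 + x1 * y2 - y1 * x2 + z1 * s2)

def qinv(q):
    """Inverse of a unit quaternion: negate the vector part."""
    s, x, y, z = q
    return (s, -x, -y, -z)

def from_angle_axis(theta, axis):
    """Unit quaternion for a rotation of theta radians about a unit axis."""
    h = theta / 2.0
    return (math.cos(h),
            math.sin(h) * axis[0],
            math.sin(h) * axis[1],
            math.sin(h) * axis[2])

def qrotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q, using q * (0, v) * q^-1."""
    r = qmul(qmul(q, (0.0, v[0], v[1], v[2])), qinv(q))
    return [r[1], r[2], r[3]]

q45 = from_angle_axis(math.pi / 4, [0.0, 0.0, 1.0])
q90 = qmul(q45, q45)                    # two consecutive 45 degree rotations about z
v_rotated = qrotate(q90, [1.0, 0.0, 0.0])
```

Compounding the 45 degree rotation with itself should behave like a single 90 degree rotation about z, so `v_rotated` lies along the y-axis up to rounding.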

An important question that comes up a lot is why do we use something like a rotation matrix. A rotation matrix is a 3 by 3 matrix and it contains nine numbers, but we know from Euler’s theorem that we only need three numbers - three angles - to describe a rotation. What’s the advantage of using nine numbers when I could just use three?

Well, importantly, the rotation matrix is not just any old 3 by 3 matrix. It is a specific matrix: an orthonormal matrix. So, although there are nine elements, there are a number of constraints. In fact there are six constraints, which leaves us with only 3 degrees of freedom. First, each column in this matrix is a unit vector, and that introduces three constraints. Then, the columns are all orthogonal to one another, and that introduces another three constraints.
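The six constraints can be written out explicitly. The sketch below, with hypothetical names, returns one residual per constraint; all six should be zero, up to rounding, for a rotation matrix. These checks alone do not distinguish a rotation from a reflection; that would need the determinant, which this sketch does not compute.

```python
import math

def roty(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormality_residuals(R):
    """Return six numbers, one per constraint, that are zero for an orthonormal R."""
    cols = [[R[i][j] for i in range(3)] for j in range(3)]
    # Three constraints: each column has unit length.
    unit_length = [dot(c, c) - 1.0 for c in cols]
    # Three constraints: each pair of columns is orthogonal.
    orthogonal = [dot(cols[0], cols[1]),
                  dot(cols[0], cols[2]),
                  dot(cols[1], cols[2])]
    return unit_length + orthogonal

residuals = orthonormality_residuals(roty(0.3))
satisfies_constraints = all(abs(r) < 1e-12 for r in residuals)
```

Nine elements minus six constraints leaves the three degrees of freedom discussed above.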

The rotation matrix has some very convenient properties. Firstly, we can compound poses or rotations simply by multiplication. That’s not the case for Euler angles or roll, pitch, yaw angles.
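A minimal sketch of why multiplication works where adding angles does not: matrix multiplication preserves the order of the two rotations, whereas adding angle triples gives the same result in either order. The example rotations and helper names are assumptions chosen for illustration.

```python
import math

def rotx(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def roty(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

R1 = rotx(math.pi / 2)   # 90 degrees about x
R2 = roty(math.pi / 2)   # 90 degrees about y

R1_then_R2 = matmul(R1, R2)   # compound: R1 followed by R2
R2_then_R1 = matmul(R2, R1)   # compound in the opposite order

# The two orders give different orientations, so no rule that simply adds
# the angles (which is order-independent) can reproduce both results.
order_matters = max_abs_diff(R1_then_R2, R2_then_R1) > 0.5
```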

A second really important benefit is that you can actually read a rotation matrix. Its columns describe, in terms of unit vectors, the x-, y- and z-axes of the new coordinate frame with respect to the old one.

Let’s have a look at a few of the different rotational representations that we’ve discussed. We can have a rotation matrix with its nine numbers in it. One of the advantages of the rotation matrix is that there is no possibility of having a singularity. We talked about singularities, the gimbal lock problem, which occurs with the Euler angle and roll, pitch, yaw representations of orientation.
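The singularity can be seen numerically with ZYZ Euler angles. When the middle angle is zero, the first and third rotations are about the same axis, so only their sum matters and the individual angles cannot be recovered from the matrix. The sketch below uses hypothetical helper names and arbitrarily chosen angles.

```python
import math

def roty(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rotz(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(phi, theta, psi):
    return matmul(matmul(rotz(phi), roty(theta)), rotz(psi))

# Two different angle triples, both with a zero middle angle and phi + psi = 0.7.
R_first = euler_zyz(0.1, 0.0, 0.6)
R_second = euler_zyz(0.5, 0.0, 0.2)

# Largest element-wise difference between the two matrices.
max_diff = max(abs(R_first[i][j] - R_second[i][j])
               for i in range(3) for j in range(3))
same_rotation = max_diff < 1e-12
```

Both triples produce the same rotation matrix, so one degree of freedom has been lost in the angle representation; the matrix itself has no such problem.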

Compounding two rotations is easy. For a rotation matrix, we simply multiply the two matrices together, whereas for the three-angle representations, it’s nontrivial.

We can also represent a coordinate frame’s orientation by two vectors: an approach vector and an orientation vector. That’s just six numbers. It’s singularity free, but it’s not very easy to compound two orientations expressed in this form.

We can also consider the angle-axis representation. We have one number to describe the amount of rotation, the angle, and we need two more numbers to represent the rotational axis. It’s a unit vector in three dimensions, so there are only two independent numbers in it. This representation is singularity free, but it’s a bit problematic to represent a zero rotation, because the axis is then undefined. And, once again, compounding two rotations is nontrivial.

The final representation is the unit quaternion. It’s got four numbers, though only three of them are independent because it has unit magnitude. It’s singularity free. And we introduced the Hamilton product as a way of multiplying two quaternions. So, in some respects, the quaternion has a lot of advantages: it is singularity free, it is represented by fewer numbers, and two quaternions can be multiplied together using fewer arithmetic operations than is the case for matrix multiplication. In robotics, it’s really the first and last of these representations that are in common use: the rotation matrix and the unit quaternion.

Finally, we can represent pose, which has both a translational component and a rotational component, by a single homogeneous transformation matrix - a 4 x 4 matrix. Composition of relative poses is done by multiplying matrices. Negation, or taking the inverse of a relative pose, is done by inverting the matrix. Remember that the inverse of a homogeneous transformation matrix is not computed by taking its transpose. That’s a property only of a rotation matrix. And transforming a point represented by a vector from one frame to another is done by multiplying the homogeneous representation of the point by a homogeneous transformation matrix.
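These operations can be sketched in plain Python as follows. The function names and the example pose are assumptions. The inverse uses the closed form built from the transpose of the rotation part and the rotated, negated translation, which is a standard identity for homogeneous transformations and is different from transposing the whole 4 x 4 matrix.

```python
import math

def make_transform(R, t):
    """Build a 4 x 4 homogeneous transformation from a 3 x 3 R and a 3-vector t."""
    return [R[0] + [t[0]],
            R[1] + [t[1]],
            R[2] + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]

def matmul4(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_transform(T):
    """Inverse pose: rotation R^T and translation -R^T * t."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return make_transform(Rt, t_inv)

def transform_point(T, p):
    """Apply T to a 3D point via its homogeneous representation [x, y, z, 1]."""
    ph = list(p) + [1.0]
    out = [sum(T[i][k] * ph[k] for k in range(4)) for i in range(4)]
    return out[:3]

# Pose of frame B with respect to frame A: 90 degrees about z, then offset (1, 2, 3).
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R_ab = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
T_ab = make_transform(R_ab, [1.0, 2.0, 3.0])

p_a = transform_point(T_ab, [1.0, 0.0, 0.0])   # point given in B, expressed in A
T_ba = invert_transform(T_ab)                  # inverse relative pose
should_be_identity = matmul4(T_ab, T_ba)       # composition of a pose and its inverse
```

Composing a pose with its inverse should give the 4 x 4 identity, and `p_a` should come out close to (1, 3, 3).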

One final thing to remember: coordinate frames are your friend. Use lots of them when you’re trying to describe a complex problem. Attach them to every object of interest and write down the relationships between the coordinate frames. Some of those relationships might be constant. Some of those relationships might be time varying. And, remember, NASA uses them. They must be good.


Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

Skill level

This content assumes high school level mathematics and requires an understanding of undergraduate-level mathematics; for example, linear algebra - matrices, vectors, complex numbers, vector calculus and MATLAB programming.
