Vision as a process


The way we see the world is actually quite complex. Our eyes are not like a camera that takes an image that we then go and process. As we look at a scene our eye moves continuously to different points in the scene in order to improve its understanding of what is in front of it. So seeing is actually a very active process.

So researchers have developed technology that allows us to study where a human being is looking. It involves a camera which looks at the subject’s eyes, and perhaps an illumination system. With this we can work out, continuously and in real time, where a human being is looking.

Now this painting was used in a very famous study, conducted back in 1967, into how a human subject’s gaze depends on the particular question that they were asked. What the researchers could do is take a subject, ask them a question and then track where their eye looked within the painting. So here is an example: here is one subject’s trace. You can see that their gaze is moving all around the room. So of these three questions, which question do you think was asked that led to this particular gaze pattern by that human subject? I will let you think about that for a moment.

And here is the answer. The question that was asked is: ‘What are the material circumstances of the family?’

So in order to answer this question, the human subject has directed their gaze towards the possessions of the family. They are checking out the furniture, they are checking out the paintings on the wall and so on. So when a subject is given a specific question, their eye moves over the scene in a particular way so as to best answer that question. It’s not just a matter of taking a picture and processing it: the eye actively moves around the scene to gather the information that the question needs.

Here is another experiment. See if you can work out which question has been asked.

We can see that the gaze is concentrating particularly on the facial regions of the people here. So the question that was asked is: ‘What are the ages of the figures in the painting?’ Clearly it is useful to look at the faces of the people in order to work out what their ages are.

Third one: in this particular case, this is what the subject’s gaze did. We can see it checking out the faces, and it is also checking out the whole body of each of the people in this painting. And the question that was asked is: ‘What type of clothes are the family wearing?’

So given the subject was answering that particular question, then their gaze is focused just on the faces and the clothes of each of the people. It is not looking at the furniture; it is not looking at the paintings; it’s not looking out the window.

It is important to understand that seeing is a very active process. Now it is driven by perhaps some of the fastest moving muscles in the human body. The human eye is able to rotate at up to six hundred degrees per second, and has a phenomenal acceleration, something like 35,000 degrees per second squared. So these amazing muscles are able to point the eye very quickly anywhere within its field of view.
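To get a feel for those two numbers, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the eye starts at rest and accelerates at a constant rate until it reaches its top speed. That is a simplification, not a model of real eye movement, and the function and variable names are made up for this example.

```python
# Figures quoted in the talk.
PEAK_VELOCITY_DEG_PER_S = 600.0      # maximum rotation speed of the eye
ACCELERATION_DEG_PER_S2 = 35000.0    # approximate angular acceleration


def time_to_peak_velocity(peak_velocity, acceleration):
    """Time to reach peak_velocity from rest (assumes constant acceleration)."""
    return peak_velocity / acceleration


def angle_during_ramp(peak_velocity, acceleration):
    """Angle swept while speeding up from rest to peak_velocity.

    Uses v^2 = 2 * a * theta, which holds only under the
    constant-acceleration assumption stated above.
    """
    return peak_velocity ** 2 / (2.0 * acceleration)


ramp_time_s = time_to_peak_velocity(PEAK_VELOCITY_DEG_PER_S, ACCELERATION_DEG_PER_S2)
ramp_angle_deg = angle_during_ramp(PEAK_VELOCITY_DEG_PER_S, ACCELERATION_DEG_PER_S2)

print(f"Time to reach top speed: {ramp_time_s * 1000:.1f} ms")
print(f"Angle covered while speeding up: {ramp_angle_deg:.1f} degrees")
```

Under that assumption the eye would reach its top speed in roughly 17 milliseconds, having turned only about 5 degrees, which gives some sense of how quickly the gaze can be redirected.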

So this is a lovely quote that I like very much, by a very famous early vision researcher, David Marr. He says vision is ‘the process of discovering from images what is present in the world and where it is’. So vision is a process. I think that is a very important message to take away when we start to think about how robots might see.

Seeing is an active process and our brain uses our eyes to find answers to questions: how we look at a scene depends on what we want to know. Eye tracking experiments measure eye movement, showing where we look and what we pay attention to.

Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

