Advanced 3D Computer Vision


I'd like to finish off by sharing with you two quite amazing research videos - they both show the power of combining visual information taken from multiple viewpoints.

The first is a system that draws on large repositories of photographs, for instance, Flickr. In this case they retrieve a whole lot of images taken of the coliseum in Rome.

All of these images, taken by different people at different times, clearly have a lot of overlap. What they do is find the same points on the coliseum in multiple images, and from that they can reconstruct the relative positions of the cameras that took those pictures.

Then they can combine this information across hundreds or even thousands of cameras taking images of the coliseum and use that to reconstruct a full three-dimensional model of the coliseum.
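The lesson itself comes with no code, and the videos do not show how the reconstruction is implemented. As a rough illustration of the underlying idea only, here is a minimal sketch of recovering one 3D point from two views, assuming the camera positions and the viewing directions towards the matched point are already known. The function name `triangulate_midpoint` and the example numbers are hypothetical. Large-scale systems estimate the camera poses and many points jointly, which this sketch does not attempt.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate a 3D point seen along two rays (illustrative sketch).

    c1, c2: camera centres as (x, y, z) tuples (assumed known).
    d1, d2: direction of the ray from each camera towards the same
            scene point; they do not need to be unit length.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    # Vector between the two camera centres.
    w = tuple(p - q for p, q in zip(c1, c2))

    a = dot(d1, d1)
    b = dot(d1, d2)
    c = dot(d2, d2)
    d = dot(d1, w)
    e = dot(d2, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays carry no depth information.
        raise ValueError("rays are (nearly) parallel; cannot triangulate")

    # Ray parameters that minimise the distance between the two rays.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))

    # With noisy measurements the rays rarely intersect exactly,
    # so take the point halfway between the closest points.
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two hypothetical cameras 4 units apart, both looking at one point.
point = triangulate_midpoint(
    (0.0, 0.0, 0.0), (1.0, 2.0, 5.0),
    (4.0, 0.0, 0.0), (-3.0, 2.0, 5.0),
)
print(point)
```

The midpoint method is chosen here because it needs nothing beyond basic vector algebra. The further apart the two cameras are relative to the distance to the point, the better conditioned the estimate becomes, which is one reason photographs taken from many different viewpoints are so useful.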

They didn't have to go and visit the coliseum; they just mined the web for photographs that were already there and, with some very clever algorithms, were able to extract this three-dimensional model from all those pictures.

This next one extends the technique to combine information from millions and millions of images taken by different people with different cameras at different times, but all of the same subject matter.


There is no code in this lesson.

Let’s look at some recent research results that vividly show how information from many 2D images taken from many different locations can be combined to form a detailed 3D model of the world.

Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

Skill level

This content assumes an understanding of high school-level mathematics, e.g. trigonometry, algebra, calculus, physics (optics) and some knowledge/experience of programming (any language).
