A rebuilding exercise is underway in Rome, but not one that uses bricks and mortar. Instead it uses digital images, maybe even ones you provided unwittingly. A team from the University of Washington (UW) has developed a new computer algorithm that uses hundreds of thousands of tourist photos to automatically reconstruct an entire city in about a day. One possible use for the technology is to give visitors an online virtual-reality 3-D tour of the cities they visit.
UW has been studying ways of utilizing the increasingly large digital photo collections available on photo-sharing websites like Flickr. The digital Rome was built from 150,000 tourist photos tagged on that site with the word ‘Rome’ or ‘Roma’ (there are around 2 million ‘Rome/Roma’ images on the site).
And while Rome itself wasn’t built in a day, the UW computers analyzed each image and, in just 21 hours, combined them to create a 3-D digital model. Using this model, a viewer can fly around Rome's landmarks, from the Trevi Fountain to the Pantheon to inside the Sistine Chapel.
"How to match these massive collections of images to each other was a challenge," said Sameer Agarwal, a UW acting assistant professor of computer science and engineering, and lead author of a paper being presented in October at the International Conference on Computer Vision in Kyoto, Japan. Until now, he said, "even if we had all the hardware we could get our hands on and then some, a reconstruction using this many photos would take forever."
Those readers using Microsoft’s free tool Photosynth might recognize some of the team’s earlier photo-stitching technology, known as Photo Tourism, which was much slower than the new algorithm.
"With Photosynth and Photo Tourism, we basically reconstruct individual landmarks. Here we're trying to reconstruct entire cities," said co-author Noah Snavely, who developed Photo Tourism as his UW doctoral work and is now an assistant professor at Cornell University.
The newly developed code works more than 100 times faster than the previous version. It first identifies pairs of photos that are likely to match and then concentrates its matching effort on those pairs. The code also uses parallel processing techniques, allowing it to run simultaneously on many computers, or even on remote servers connected through the Internet.
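One way to picture the "likely matches first" idea is a shortlist step that proposes a few candidate partners per photo before any detailed comparison is done. The following is a minimal sketch, not the UW team's code. It assumes each photo has already been reduced to a set of integer "visual word" IDs, which is a hypothetical representation: the article does not say how likely matches are found. The function name `shortlist_pairs`, the parameter `k`, and the photo names are all illustrative.

```python
from collections import Counter, defaultdict


def shortlist_pairs(image_words: dict[str, set[int]], k: int = 2) -> set[tuple[str, str]]:
    """Propose up to k candidate partners per photo, ranked by shared visual words.

    image_words maps a photo name to a set of visual-word IDs (an assumed,
    simplified stand-in for whatever image description a real system uses).
    """
    # Inverted index: visual word -> names of photos that contain it.
    index = defaultdict(set)
    for name, words in image_words.items():
        for word in words:
            index[word].add(name)

    pairs = set()
    for name, words in image_words.items():
        # Count how many visual words each other photo shares with this one.
        votes = Counter()
        for word in words:
            for other in index[word]:
                if other != name:
                    votes[other] += 1
        # Keep only the k strongest candidates; store each pair in sorted order
        # so (a, b) and (b, a) collapse into one entry.
        for other, _ in votes.most_common(k):
            pairs.add(tuple(sorted((name, other))))
    return pairs


# Toy data: two photos of each landmark, with made-up visual-word IDs.
photos = {
    "trevi_1": {1, 2, 3, 4},
    "trevi_2": {2, 3, 4, 5},
    "pantheon_1": {10, 11, 12},
    "pantheon_2": {11, 12, 13},
}
candidates = shortlist_pairs(photos, k=1)
print(candidates)
```

In this toy set, only two candidate pairs go on to detailed matching rather than all six possible pairs. Photos that share no visual words are never compared at all, which is where the saving comes from.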
Parallel processing is vital to the project, given the hundreds of thousands of photos that need to be matched for entire cities. Previous versions of the Photo Tourism software matched each photo to every other photo in the set. But as the number of images increases, the number of matches explodes, growing with the square of the number of photos. With that exhaustive approach, a set of 250,000 images would take at least a year for 500 computers to process, Agarwal said. A million photos would take more than a decade.
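The quadratic growth is easy to check with a little arithmetic. The short sketch below counts the unordered pairs that exhaustive matching would have to consider, using the standard formula n(n-1)/2. The function name `pair_count` is illustrative, and the script says nothing about how long any single comparison takes.

```python
def pair_count(n: int) -> int:
    """Number of unordered photo pairs if every photo is compared with every other."""
    return n * (n - 1) // 2


# Collection sizes mentioned in the article.
for n in (150_000, 250_000, 1_000_000):
    print(f"{n:>9,} photos -> {pair_count(n):,} pairs")

# Quadrupling the collection multiplies the number of pairs by roughly sixteen.
ratio = pair_count(1_000_000) / pair_count(250_000)
print(f"1,000,000 vs 250,000 photos: about {ratio:.0f}x as many pairs")
```

Going from 250,000 to a million photos is four times the images but about sixteen times the pairs. That is in line with the jump from "at least a year" to "more than a decade" on the same hardware.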
"If a city reconstruction took several months, it would be just about building Rome," UW computer science professor Steve Seitz said. "But on a timeline of one day you can methodically start going through all the cities and start building models of them."
The software could build cities for video games automatically, instead of doing so by hand. It also might be used in architecture for digital preservation of cities, or integrated with online 3-D maps, Seitz said.
In addition to Rome, the team recreated the Croatian coastal city of Dubrovnik, processing 60,000 images in less than 23 hours using a cluster of 350 computers, and Venice, Italy, processing 250,000 images in 65 hours using a cluster of 500 computers.
The research was supported by the National Science Foundation, the Office of Naval Research and its Spawar lab, Microsoft Research, and Google.
The Rome data set consists of 150,000 images from Flickr.com associated with the tags ‘Rome’ or ‘Roma’. Matching and reconstruction took a total of 21 hours on a cluster with 496 compute cores. The images then fell into a number of groups corresponding to the major landmarks in the city, among them the Colosseum, St Peter's Basilica, the Trevi Fountain and the Pantheon. UW says an advantage of using community photo collections is the rich variety of viewpoints from which the photographs are taken. In the 3-D view of the reconstruction of the interior of St Peter's Basilica, for instance, the triangles mark the positions from which the photos were captured.
Other cities are captured here online.