
Disney Research algorithm automatically edits footage captured by multiple cameras


August 11, 2014


An algorithm developed at Disney Research is able to take footage of an event captured from multiple cameras and automatically edit it into a cohesive video


These days, with most people toting camera-packing smartphones, friends and families act as a veritable film crew, ready to capture important moments from a multitude of angles. But editing the footage into a cohesive whole can be a time-consuming chore. Now a team at Disney Research has developed an algorithm that automatically edits hours of raw footage into something less tedious to sit through.

Unlike software such as Magisto, Highlight Hunter and LiveLight, which help editors sort the video wheat from the chaff captured by a single camera, the algorithm developed at Disney Research combines footage of a single event captured from different points of view by different cameras. It does this by deducing what event is the most significant based on what the various cameras are focused on.

"Though each individual has a different view of the event, everyone is typically looking at, and therefore recording, the same activity – the most interesting activity," says Yaser Sheikh, an associate research professor of robotics at Carnegie Mellon University and part of the team at Disney Research Pittsburgh. "By determining the orientation of each camera, we can calculate the gaze concurrence, or 3D joint attention, of the group."
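To make the idea of gaze concurrence concrete, here is a minimal sketch that treats each camera as a viewing ray and finds the 3D point closest to all of the rays in a least-squares sense. This is an illustration only and not the Disney Research method, which the article does not detail. The function names (`joint_attention_point`, `solve3`, `normalize`) and the input format are assumptions, and each ray is treated as an infinite line for simplicity.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def solve3(a, b):
    """Solve the 3x3 system a * x = b by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("viewing rays are parallel; no unique point")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def joint_attention_point(cameras):
    """Least-squares point nearest to every camera's viewing line.

    cameras: list of (position, view_direction) pairs, each a 3-element
    sequence. This input format is an assumption made for the sketch.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for pos, direction in cameras:
        d = normalize(direction)
        for i in range(3):
            for j in range(3):
                # (I - d d^T) projects onto the plane perpendicular to the ray.
                proj = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += proj
                b[i] += proj * pos[j]
    return solve3(a, b)


# Three hypothetical cameras, all pointed at the spot (2, 2, 1).
cameras = [
    ((0.0, 0.0, 1.0), (1.0, 1.0, 0.0)),
    ((4.0, 0.0, 1.0), (-1.0, 1.0, 0.0)),
    ((2.0, 2.0, 5.0), (0.0, 0.0, -1.0)),
]
point = joint_attention_point(cameras)
```

With real footage the rays would not meet exactly, so the returned point is only the best compromise between them. A fuller treatment would also need to estimate each camera's position and orientation from the video itself, which this sketch takes as given.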

Other approaches for automatically or semi-automatically combining footage from multiple cameras generally rely on selecting the most stable or best lit footage and periodically switching between the camera angles available. But because the algorithm developed at Disney Research calculates the spatial relationship between the subject(s) and the various cameras, it is also able to adhere to established cinematographic guidelines.

These include the 180-degree rule, which says the camera needs to stay on one side of the axis that connects the subjects in a scene. For example, if there are two people in a scene, the axis is an imaginary line connecting them. Cutting from a camera angle on one side of this line to one on the other side is known as jumping or crossing the line, and it can confuse the viewer.
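Once the positions of the subjects and cameras are known, the 180-degree rule reduces to a simple geometric test. The sketch below works in a top-down 2D view and uses the sign of a cross product to tell which side of the axis a camera is on. The function names and the 2D simplification are assumptions for illustration, not the team's implementation.

```python
def side_of_axis(subject_a, subject_b, camera):
    """Return +1 or -1 for the side of the A-B axis the camera is on, 0 if on it.

    All arguments are (x, y) positions in a top-down view of the scene.
    """
    ax, ay = subject_a
    bx, by = subject_b
    cx, cy = camera
    cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (cross > 0) - (cross < 0)


def crosses_the_line(subject_a, subject_b, cam_from, cam_to):
    """True if cutting from cam_from to cam_to would break the 180-degree rule."""
    s1 = side_of_axis(subject_a, subject_b, cam_from)
    s2 = side_of_axis(subject_a, subject_b, cam_to)
    return s1 != 0 and s2 != 0 and s1 != s2


# Two subjects on the x-axis; hypothetical cameras on either side of them.
person_a, person_b = (0.0, 0.0), (2.0, 0.0)
bad_cut = crosses_the_line(person_a, person_b, (1.0, 1.0), (1.0, -1.0))
ok_cut = crosses_the_line(person_a, person_b, (1.0, 1.0), (0.5, 2.0))
```

Here `bad_cut` is `True` because the two cameras sit on opposite sides of the line between the subjects, while `ok_cut` is `False` because both cameras are on the same side.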


The system also avoids very short shots, which can be jarring to the viewer, and jump cuts. These are cuts between two shots whose perspectives differ only slightly, which give the viewer the impression either of a jump forward in time or of jumpy camerawork.
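Both constraints can be expressed as a check applied before each candidate cut. The following sketch rejects a cut if the current shot has been on screen for too short a time, or if the two cameras' viewing directions are too similar. The thresholds are assumptions chosen for illustration: the two-second minimum is arbitrary, and the 30-degree minimum echoes a common filmmaking rule of thumb rather than any value reported by the Disney Research team.

```python
import math


def angle_between_deg(u, v):
    """Angle in degrees between two direction vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside acos's domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))


def cut_allowed(shot_length_s, view_from, view_to,
                min_shot_s=2.0, min_angle_deg=30.0):
    """Return True if a cut avoids both a too-short shot and a jump cut.

    shot_length_s: how long the current shot has been on screen, in seconds.
    view_from, view_to: viewing directions of the current and next camera.
    min_shot_s, min_angle_deg: hypothetical thresholds, not published values.
    """
    if shot_length_s < min_shot_s:
        return False  # the current shot would be jarringly short
    return angle_between_deg(view_from, view_to) >= min_angle_deg


# Hypothetical cases: a clean cut, a too-short shot, and a jump cut.
clean = cut_allowed(5.0, (1.0, 0.0), (0.0, 1.0))
too_short = cut_allowed(0.5, (1.0, 0.0), (0.0, 1.0))
jump_cut = cut_allowed(5.0, (1.0, 0.0), (1.0, 0.1))
```

Only the first case passes. The second fails on shot length, and the third fails because the two views differ by only a few degrees.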

Although the system takes several hours to carry out the computations necessary to put together a cohesive video lasting a few minutes, the Disney Research team says professional editors using the same raw footage took on average more than 20 hours to achieve similar results.

"The resulting videos might not have the same narrative or technical complexity that a human editor could achieve, but they capture the essential action and, in our experiments, were often similar in spirit to those produced by professionals," says Ariel Shamir, an associate professor of computer science at the Interdisciplinary Center, Herzliya, Israel, and a member of the Disney Research Pittsburgh team.

While the algorithm may not replace professional editors, its creators say it could assist them in editing large amounts of footage.

The Disney Research Pittsburgh team will present a paper (PDF) on their algorithm at ACM SIGGRAPH 2014, which is currently underway in Vancouver, Canada.


Source: Disney Research

About the Author
Darren Quick
Darren's love of technology started in primary school with a Nintendo Game & Watch Donkey Kong (still functioning) and a Commodore VIC 20 computer (not still functioning). In high school he upgraded to a 286 PC, and he's been following Moore's law ever since. This love of technology continued through a number of university courses and crappy jobs until 2008, when his interests found a home at Gizmag.

1 Comment

Apps for: court cases; movie and TV editing; FX work; re-enactments, legal and for education.



Stephen N Russell
12th August, 2014 @ 08:34 am PDT