Facial recognition is in (the reflection of) the eye of the beholder
By Brian Dodson
January 16, 2014
The worst has happened. You receive an emailed kidnap demand with a picture of your loved one in dire straits. You contact the authorities, and in a flash (relatively speaking), they have identified the kidnapper and possibly some accomplices, and are well on their way toward recovering the victim. How did this happen? By identifying the faces of the kidnappers caught in the reflection of your loved one's eyes.
The scenario above isn't yet standard practice, but the basic technology for accomplishing the task now exists. Familiar faces can be recognized from a very small number of pixels, as few as 7 x 10 pixels in one study. A very familiar example appears below. The image on the left has 16 x 20 pixel resolution, while on the right the same image is blurred to make recognition easier.
It is now commonplace for digital cameras to have 10-50 megapixel CMOS sensors. There is even a smartphone, the Nokia Lumia 1020, that has a 41-MP sensor. (Although this camera automatically generates an oversampled 5-MP image from the raw data, the raw data is still available for use.)
A 50 mm equivalent lens covers a horizontal angle of about 40 degrees. With a 40-MP sensor (and good optics), each pixel is about one-third of a minute of arc in size, enabling resolution about five times more acute than that of the human eye. In addition, a good picture captures everything within the bit depth of the pixels, whereas our eyes have a very small area of high resolution on the retina, and our brains fill in the details, often incorrectly. A camera captures a lot of information which we cannot "see at a glance," or even by careful examination.
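The per-pixel figure above can be checked with a few lines of arithmetic. The sketch below is only a rough estimate under stated assumptions: a 3:2 sensor aspect ratio (not given in the article) and the field of view spread evenly across the pixels, which a real lens only approximates. The function name and parameters are illustrative.

```python
import math

def arcmin_per_pixel(horizontal_fov_deg, megapixels, aspect_w=3, aspect_h=2):
    """Approximate angular width of one pixel, in minutes of arc.

    Assumes a 3:2 sensor by default and an even spread of the field
    of view across the pixel row (both are simplifying assumptions).
    """
    total_pixels = megapixels * 1_000_000
    # Pixel count along the long (horizontal) side for this aspect ratio.
    width_px = math.sqrt(total_pixels * aspect_w / aspect_h)
    # Convert degrees to arcminutes, then divide among the pixels.
    return horizontal_fov_deg * 60 / width_px

# 40-degree horizontal field of view on a 40-MP sensor, as in the text.
per_pixel = arcmin_per_pixel(40, 40)
print(f"{per_pixel:.2f} arcmin per pixel")
```

Under these assumptions the result comes out close to one-third of a minute of arc, in line with the figure quoted above.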
A study just carried out by Dr. Rob Jenkins of the University of York and Christie Kerr of the University of Glasgow, both in the UK, has found that a photograph taken with a high-end camera can capture images reflected from the corneas of the subject being photographed. The images, which can be of high enough quality to identify people by their faces, cover most of the area in front of the subject, owing to the curvature of the cornea. In essence, a fisheye view of the entire region in front of the subject can be found in the image of the subject's eyes.
The lead photograph provides an excellent example of just how much information can be contained in corneal reflections. One has the definite impression that, if these people were known to an observer, their images would be recognizable.
Here's how the photographs were taken. A Hasselblad H2D 39-megapixel digital camera with a 120 mm macro lens (equivalent of a 75 mm lens on a 35 mm camera) was placed about 1 meter (39 in) away from the subject. A pair of Bowens DX1000 flash lamps with dish reflectors and baffles illuminated both the subject and his visual field. Another pair in soft boxes (essentially volume diffusers) was directed at a group of people standing around the photographer. An exposure of 1/250 sec. at f/8 was used in the well-lit environment.
It is interesting to follow the complexity of the image and its subimages. The photographs were 5,412 pixels wide and 7,216 pixels high, for a total of 39,052,992 pixels. The subject's face (without the hair) took up about 12 million pixels. The iris of the subject covers an area of about 54,000 pixels, about 0.005 the size of the face. Now, the images of the bystanders in the room, reflected from the eye's cornea, averaged about 30 x 50 pixels in size (1,500 pixels in total).
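The pixel budget described above is easy to tally. The short sketch below uses only the figures quoted in the article (the variable names are illustrative) to show how small a share of the frame each reflected face occupies.

```python
# Figures quoted in the article; variable names are illustrative.
width_px, height_px = 5412, 7216
total_px = width_px * height_px      # pixels in the full frame
face_px = 12_000_000                 # subject's face, approximate
iris_px = 54_000                     # subject's iris, approximate
bystander_px = 30 * 50               # one reflected bystander face

print(f"full frame: {total_px:,} pixels")
print(f"iris / face: {iris_px / face_px:.4f}")
print(f"one bystander face is about 1/{total_px // bystander_px:,} of the frame")
```

The product of the stated dimensions is 39,052,992 pixels, and a single reflected face accounts for roughly one part in 26,000 of the frame.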
The ease with which faces can be recognized in the lead photo is significantly improved by a small amount of image processing. At the left in the photo above is the raw image from the photograph, showing heavy pixelation. The photo in the middle was smoothed by subjecting the first photo to bicubic interpolation, a procedure which automatically removes most of the high spatial frequency noise associated with pixelation. Finally, on the right, the center image was adjusted in brightness and contrast by Photoshop's Auto Contrast function.
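The two processing steps can be sketched in plain Python on a grayscale grid of numbers. This is an illustration under stated assumptions, not the procedure the researchers used: the interpolation uses a common cubic convolution kernel (a = -0.5) with edge pixels repeated, and a simple min/max stretch stands in for Photoshop's contrast adjustment, whose exact behavior is not reproduced here. All function names are hypothetical.

```python
import math

def cubic_kernel(x, a=-0.5):
    """Cubic convolution weight for a sample at distance x (assumed a = -0.5)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

def resample_row(row, factor):
    """Upscale one row of pixel values by an integer factor."""
    n = len(row)
    out = []
    for i in range(n * factor):
        # Map the centre of the output pixel back to source coordinates.
        src = (i + 0.5) / factor - 0.5
        base = math.floor(src)
        acc = 0.0
        # Blend the four nearest source pixels with cubic weights.
        for k in range(base - 1, base + 3):
            clamped = min(max(k, 0), n - 1)  # repeat edge pixels
            acc += row[clamped] * cubic_kernel(src - k)
        out.append(acc)
    return out

def bicubic_upscale(img, factor):
    """Apply the 1-D resampler along rows, then along columns."""
    rows = [resample_row(r, factor) for r in img]
    cols = [resample_row(list(c), factor) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def stretch_contrast(img, lo=0, hi=255):
    """Linearly rescale so the darkest pixel maps to lo and the brightest to hi."""
    flat = [v for row in img for v in row]
    vmin, vmax = min(flat), max(flat)
    if vmax == vmin:
        return [[lo for _ in row] for row in img]
    scale = (hi - lo) / (vmax - vmin)
    return [[round(lo + (v - vmin) * scale) for v in row] for row in img]

# A made-up 2 x 2 low-contrast patch, enlarged 4x and then stretched.
tiny = [[90, 100], [110, 120]]
smooth = bicubic_upscale(tiny, 4)
enhanced = stretch_contrast(smooth)
print(len(enhanced), "x", len(enhanced[0]))
```

The interpolation replaces hard pixel edges with smooth gradients, and the stretch spreads the narrow range of brightness values over the full scale, which is the general effect described for the middle and right-hand images.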
That as observers we have the strong impression of recognizable faces is perhaps not surprising. Those of us old enough to remember when having a QVGA video camera on one's computer was fun didn't have any real problem recognizing people in a small group shot. QVGA, though, is a 320 x 240 pixel image. If you take the enlarged image of the subject's eye from the lead photograph, and draw two horizontal bars touching the top and bottom of the eye, the area between the two bars is about equivalent to a QVGA image.
Now, however, comes the issue of the correctness of that impression. Jenkins and Kerr address this through a pair of psychological experiments. In the first, an image taken by corneal reflection was paired with a high quality image. Half the combinations were of the same person, while the other half were of a different person from the student body who had the same gender, ethnicity, hair color, build, and approximate age as the person in the first image. Participants in the study broke down into two groups, one in which neither face was familiar, and a second in which at least one of the faces was familiar.
The object was to decide if the two faces presented to them were of the same person, or of different people. This would seem a difficult task, considering the low resolution of the corneal reflection photos, especially for a participant to whom neither of the images was of a familiar face. Despite this, the results were well above chance, with 71 percent successful identification in the first group, and 84 percent success when one face was familiar.
In the second experiment, participants were chosen who had known Dr. Jenkins for an average of 18 years, but had never seen any of the people in the other corneal reflection photos. Ninety percent of the time, the participants identified Dr. Jenkins' corneal reflection, and fewer than ten percent of the participants made an incorrect identification.
Note that the subjects were not aware that Dr. Jenkins' picture would be among those presented to them, nor that he had anything to do with the study. On a scale of 0 to 10, the confidence level for correct spontaneous recognition as rated by the participants was about 8, while the confidence level they gave for incorrect identifications was about 5.
The Jenkins/Kerr study has shown quite clearly that there is enough information in a corneal reflection taken under rather favorable conditions to identify people near the camera with a good deal of certainty.
There are two remaining pieces of the puzzle to make a routine forensic (or surveillance) tool out of this concept. First, cameras and image processing need to become a bit more capable, as normally the lighting will not be as favorable for capturing corneal reflections as in this study. Second, methods of automated facial recognition that compare faces against criminal databases and social media, which have recently achieved remarkable fidelity, must be adapted to the image distortions that appear naturally in corneal reflections.
The tech described above is neat and more than a bit surprising. While there are applications which would benefit society, there are also many potential uses with obvious "Big Brother" privacy issues. It is up to us to ensure that such developments are used wisely. May we choose better than is our usual wont, or we may all have to walk around with Rorschach masks.
The short but cool video below zooms continuously from the full lead photo to the face in a green border within the corneal reflection.
Source: PLOS One