Light-Field Imaging Approaches Commercial Viability
Recent advances in light-field imaging suggest that light-field photography can increasingly displace traditional 2D photography.
by Kurt Akeley
LIGHT-FIELD technology adds an exciting new dimension to traditional still or moving-picture photography. Whereas traditional 2D cameras can capture the scene from only a single observation point, light-field cameras can capture an array of observation points along with the associated features of the scene that users would expect to see in real life, including variable depth of field, occlusion, and depth perception. The ability to capture the light field of a scene and then render that scene from multiple points of view is truly exciting and innovative. With today’s state of the art in electronics, image detection, and optics, we have finally reached a point where electronic light-field photography is a practical pursuit of science and engineering. This article summarizes the development of light-field photography over the past century, briefly addresses issues such as depth of field and final-image resolution, and identifies future photographic and imaging opportunities based on experimental and research work to date.
Both intuitively and in actual practice, the light passing across a plane may be characterized as a function of four geometric dimensions: the spectral flow at each angle (two geometric dimensions) through each point on the plane (two additional geometric dimensions). This geometric function may be referred to as the 4D light field or as just the light field.
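The four-dimensional structure described above can be made concrete with a small sketch. The array layout, dimension names, and sample counts below are illustrative assumptions, not a standard representation:

```python
import numpy as np

# Illustrative only: represent a sampled 4D light field as an array
# L[s, t, u, v] -> radiance through plane point (s, t) at angle (u, v).
# Dimension names and sample counts are hypothetical, chosen for clarity.
S, T, U, V = 8, 8, 16, 16          # spatial and angular sample counts
light_field = np.zeros((S, T, U, V))

def radiance(lf, s, t, u, v):
    """Look up the sampled radiance at one spatial/angular coordinate."""
    return lf[s, t, u, v]

# A conventional 2D photo collapses the two angular dimensions:
# integrating over (u, v) at each (s, t) yields one pixel per point.
photo = light_field.sum(axis=(2, 3))
print(photo.shape)  # (8, 8)
```

The key point of the sketch is the last line: a conventional photograph is a 2D marginal of the 4D function, which is why the angular information cannot be recovered from it.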
Early in the 20th century, physicist and inventor Gabriel Lippmann of Luxembourg developed integral imaging: the capture and re-display of the 4D light field at
a bounded planar region. As depicted on the left side of Fig. 1, an array of micro-cameras, each comprising a lens and a sensor, captures the 4D light field, each camera capturing two angular dimensions at its 2D position in the array. The captured light field is recreated by an array of micro-projectors, as depicted on the right side of the same figure. The method used to store and transfer the 4D light field depends on the camera and projection technology: Lippmann used photographic film, which was developed in place and then re-projected through the same micro-lens array, while contemporary systems utilize digital sensors and displays, whose data are stored and transferred using computing technology.
Fig. 1: In this example of integral imaging, an array of micro-cameras captures the 4D light-field imagery on the left, which is recreated by an array of micro-projectors on the right.
In principle, the experience of viewing the recreated light field is equivalent to viewing the original scene. In practice, constrained sampling and spectral resolution limit the fidelity of the recreation, and hence of the viewing experience. At very low resolution, head-motion parallax may be experienced as discrete
views, and binocular parallax may be experienced intermittently or not at all (because both eyes see the same discrete view position). As resolution is increased, motion parallax becomes continuous, and correct binocular parallax (i.e., stereo vision) is achieved. With still further increase in resolution, parallax is accurately developed across the human pupil, stimulating correct accommodation (focus response) in a human viewer and depicting correct blur cues on the viewer’s retinas. Achieving such resolution is challenging, however, in part because diffraction confounds both capture and reconstruction, a consequence of the small device geometries required for high spatial and angular sampling rates.
A simple thought experiment illustrates the leap from integral imaging to light-field photography: store the captured 4D light field as the light-field picture, then create 2D photographs from it by imaging the recreated light field using a conventional camera. The conventional camera may be positioned arbitrarily within the light field to adjust center of perspective and tilt, and the camera’s focus distance, focal length, and aperture may be adjusted to control the properties of blur in the resulting 2D image. Thus, critical photographic aspects that are unchangeable in traditional photographs may be varied freely, within the constraints of the captured light field, after the capture of the 4D light-field picture. While direct realization of the thought experiment is impractical, modern computing technology replaces physical recreation of the light field, as well as the physical camera used to reimage it, with software simulation, leaving only the light-field capture unit (light-field camera) to be implemented in hardware (Fig. 2).
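The software simulation of the reimaging step is often realized as shift-and-add synthetic refocusing over the individual views. The sketch below uses a simple integer-shift model and hypothetical names; it illustrates the idea rather than any particular product’s algorithm:

```python
import numpy as np

def refocus(views, shift):
    """Shift-and-add synthetic refocus (illustrative sketch).

    views : dict mapping camera grid position (i, j) -> 2D image array
    shift : per-unit-baseline integer pixel shift; varying it moves the
            virtual focal plane nearer or farther in the reconstruction.
    """
    acc = None
    for (i, j), img in views.items():
        dy, dx = shift * i, shift * j      # shift proportional to view offset
        shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
        acc = shifted if acc is None else acc + shifted
    return acc / len(views)

# Two trivial 4x4 views; with zero shift the refocused image is their mean.
views = {(0, 0): np.ones((4, 4)), (0, 1): 3 * np.ones((4, 4))}
out = refocus(views, shift=0)
print(out[0, 0])  # 2.0
```

Scene points at the depth matched by the chosen shift align across views and stay sharp; points at other depths mis-align and blur, which is exactly the adjustable-focus behavior the thought experiment describes.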
The light-field camera depicted in Fig. 2 differs from that in Fig. 1 by including an objective lens through which the micro-lens array images the scene.2 Such lenslet light-field cameras adopt the form of conventional cameras, allowing adjustments to focal length and focus of the single lens to adapt the field of view and depth of field of the captured light-field picture. Because 4D capture is very demanding of sensor resolution, such adaptation may be critical to successful light-field photography.
Fig. 2: An illustration of light-field photography includes a camera (at left) with a micro-lens array that images the scene.
Multi-camera arrays, whose micro-cameras image the scene directly rather than through an additional objective lens, may also be used to capture light-field pictures. While multi-camera arrays with adjustable lenses on each camera have been implemented as research projects, the expense and complexity of assembling and calibrating tens or hundreds of lenses make such systems impractical as consumer products. Thus, multi-camera arrays may be practical only for applications that do not require adjustments to the focus distance or field of view, such as in mobile devices, where the absence of an objective lens may also reduce the thickness of the camera.
Multi-camera arrays also serve a pedagogical purpose – their construction clearly illustrates the dimensions of spatial light-field resolution (the grid of cameras) and angular light-field resolution (the pixel resolution of each camera) of the captured light-field picture. The optics of the lenslet light-field camera are indirect; they are best understood as they relate to an equivalent multi-camera array. The equivalence is a simple duality, as illustrated in Fig. 3. Light-field spatial resolution – the grid of cameras of a multi-camera array – corresponds to the pixel resolution behind each micro-lens in the lenslet light-field camera. And light-field angular resolution – the grid of pixels in each camera of the multi-camera array – corresponds to the number of micro-lenses in the lenslet light-field camera.
Fig. 3: Light-field-camera duality is illustrated above. Light-field spatial resolution – the grid of cameras in a multi-camera array (upper left) – corresponds to the pixel resolution behind each micro-lens in the lenslet light-field camera (upper right). And light-field angular resolution – the grid of pixels in each camera of the multi-camera array (lower left) – corresponds to the number of micro-lenses in the lenslet light-field camera (lower right).
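The duality amounts to a re-indexing of the same 4D samples, which can be shown in a few lines. The shapes and axis names below are hypothetical:

```python
import numpy as np

# Illustrative sketch of the duality: a lenslet sensor image, laid out as
# (micro-lens row, micro-lens col, pixel row, pixel col), is re-indexed so
# that each fixed pixel position (u, v) under the micro-lenses becomes one
# sub-aperture view -- the image one camera of the equivalent multi-camera
# array would capture. Shapes here are hypothetical.
ML, N = 6, 4                         # 6x6 micro-lenses, 4x4 pixels each
lenslet = np.arange(ML * ML * N * N).reshape(ML, ML, N, N)

# Sub-aperture views: the angular axes (u, v) move to the front.
views = lenslet.transpose(2, 3, 0, 1)
print(views.shape)  # (4, 4, 6, 6): a 4x4 grid of 6x6-pixel views
```

No data changes hands in the transpose; the swap of which pair of axes is called “spatial” and which “angular” is precisely the duality of Fig. 3.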
Light fields may also be captured using a traditional 2D camera by taking multiple exposures at different times and positions. This has the advantage of utilizing readily available equipment, but the resulting data are inherently corrupted if the scene changes during the capture period. Light fields may also be inferred from 2D images, rather than sampled directly as in lenslet light-field cameras and multi-camera arrays. While this approach has been demonstrated in research systems, the author knows of no products that utilize it.
Depth of Field
While the angular and spatial dimensions of the captured light field relate more directly to the mechanism of the multi-camera array, other properties of the
captured light field, in particular its depth of field, are more readily grasped by considering the optics of the lenslet light-field camera. And because depth of field is inherently non-linear, it is convenient to do as photographers do and use f/# as its proxy.
The f/# of the image cast on the light-field sensor of the camera depicted in Fig. 4 is the ratio f/a of the focal length f of the objective lens to its aperture diameter a. But the light that reaches a single pixel on the camera’s light-field sensor surface passes through only one Nth of the objective lens, so its effective f/# is N times greater: f/(a/N). Here, we may ignore light spread due to the micro-lenses because
1. The micro-lenses are understood to be “in focus” (meaning that they are separated from the sensor surface by a distance equal to their focal lengths), and
2. The micro-lenses are tiny relative to the objective lens.
Thus, the bundle of all rays that pass through a micro-lens and reach a single pixel is only slightly larger at the objective lens than the depicted bundles, whose rays pass through only the centers of the micro-lenses.
Fig. 4: The above schematic depicts the depth of field of a focused lenslet light-field camera.
The depth of field of the light-field camera is equivalent to the depth of field of the ray bundles it captures and is therefore also N times that of a conventional camera with the same optics. While this is a substantial improvement, it does not obviate the need to adjust the focus of the light-field camera. For example, only scene objects that are within the depth of field of the light-field camera can be sharply refocused in images that are reconstructed from the captured light-field picture. And the captured depth of field may be shallow relative to the range of depths in the scene, especially when the objective lens has a long focal length.
Light-field-camera depth of field also collapses quickly as the micro-lens array is defocused by positioning it either nearer or farther from the sensor surface
than its focal length. The reason is that the bundle of rays traced from a single sensor pixel quickly spreads to cover a significant fraction of the objective-lens aperture, reducing the f/# advantage from N back toward unity.
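The f/# argument above reduces to simple arithmetic. The numeric values below are hypothetical, chosen only to make the N-fold relationship visible:

```python
# Illustrative arithmetic for the effective f/# of a lenslet camera.
# f: objective focal length (mm); a: aperture diameter (mm);
# N: sensor pixels spanned by each micro-lens, per dimension.
# All values are hypothetical.
f, a, N = 50.0, 25.0, 10

fnum_conventional = f / a            # f/2 for the full aperture
fnum_effective = f / (a / N)         # each pixel sees only 1/N of the aperture
print(fnum_conventional, fnum_effective)  # 2.0 20.0
assert fnum_effective == N * fnum_conventional
```

Defocusing the micro-lens array spreads each pixel’s ray bundle back across the full aperture, driving the effective value back from f/(a/N) toward f/a, which is the collapse described above.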
Final Image Resolution
The “natural” pixel dimensions of the 2D images reconstructed from a captured light-field picture are equal to the pixel dimensions of each camera in the capturing multi-camera array, which from duality (Fig. 3) correspond to the dimensions of the micro-lens array in a lenslet light-field camera. Unless care is taken, reconstruction of an image with greater pixel dimensions results in more pixel data, but no increase in the true resolution of the image. While the sensor pixel dimensions of cameras in a multi-camera array may be readily increased to the current limits of individual sensors, the micro-lens array dimensions of a lenslet light-field camera are not so easily increased because the micro-lenses typically share a single image sensor. But multi-camera arrays are often impractical, due to their need for multiple complex objective lenses, and N values in the range of 10–15 mean that the final image dimensions of a lenslet light-field camera may be a factor of several hundred (N²) less than the pixel count of the camera’s sensor. So final-image resolution is a significant challenge for light-field photography.
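The N² penalty is easy to verify with concrete numbers. The sensor dimensions and N value below are hypothetical, merely in the range the text describes:

```python
# Illustrative final-image-resolution arithmetic for a lenslet camera.
# Hypothetical numbers: a ~40-megapixel sensor with N = 12 pixels behind
# each micro-lens in each dimension.
sensor_w, sensor_h = 7680, 5120      # sensor pixel dimensions (hypothetical)
N = 12                               # pixels per micro-lens, per dimension

microlens_w = sensor_w // N          # 640 micro-lenses across
microlens_h = sensor_h // N          # 426 micro-lenses down

final_pixels = microlens_w * microlens_h     # "natural" final-image pixels
sensor_pixels = sensor_w * sensor_h
print(sensor_pixels // final_pixels)  # 144, i.e., N**2
```

A roughly 40-megapixel sensor thus yields a natural final image of only about a quarter megapixel, which is why the methods listed next matter.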
There are many ways to increase final image resolution. Here, we list methods appropriate for a lenslet light-field camera, following each with a short discussion of its limitations:
• Improve the reconstruction algorithm. Micro-lens-array resolution is achievable with simple back projection. Using approaches outlined in Ramamoorthi and Liang,3 so-called super-resolution, as much as a factor of two in each dimension, may be achieved with more complex back-projection algorithms, which typically compute a depth map of the scene. Even more complex algorithms, such as algebraic reconstruction, increase computation load dramatically, with final image improvement limited to an asymptote that remains far below sensor resolution.
• Reduce micro-lens diameter. As micro-lens diameter is decreased, the number of micro-lenses increases, raising final image resolution linearly. But each micro-lens “covers” fewer sensor pixels, reducing N and consequently the depth of field of the light-field camera.
• Reduce micro-lens diameter and sensor pixel pitch. As both micro-lens diameter and pixel pitch are reduced in lock step, final
image resolution increases with no corresponding decrease in depth of field. But the range of wavelengths in the visible spectrum remains constant, and sensor pixels are already near diffraction limits, so this approach does not define an obvious path to significant improvement in final image resolution.
• Increase sensor size. Increasing the dimensions of the light-field sensor, while holding the micro-lens diameter and sensor pixel pitch constant, increases final image resolution linearly and without physical limit. But large sensors are expensive, and linear increases in sensor dimensions correspond to cubic increases in camera (especially objective lens) volume, which both increases cost and decreases utility (due to increased size and weight).
• Defocus the micro-lens array. As discussed in Ng,1 reducing the gap between the micro-lens array and the sensor surface below the focal length of the micro-lenses yields a substantial increase in final image
resolution, asymptotically approaching sensor pixel resolution as the gap approaches zero. (Increasing the gap has a similar effect, though with additional complexity.) But defocusing the micro-lens array also reduces the depth-of-field advantage of the lenslet light-field camera. And it interacts badly with other opportunities of light-field photography that are described next.
While the most familiar advantage of light-field photography is its ability to reconstruct images with arbitrary parallax effects (focus distance, depth of field, center of perspective, etc.) from a single exposure, there are other opportunities that may, over time, prove more important. These include:
1. Improve the tradeoff between depth of field and amount of light captured. In traditional photography, depth of field is increased by reducing the aperture
of the objective lens, which causes less light to reach the sensor. Subject to limitations due to diffraction, a lenslet light-field camera with the same optics and sensor size as a conventional camera can achieve N times its depth of field (f/#) with no reduction in captured light.
2. Create a depth map from a single exposure. Traditional photography supports depth-map creation only with multiple exposures, using either a single camera (which is sequential and inherently risks data corruption in dynamic scenes) or multiple cameras in parallel (which avoids data corruption, but quickly becomes impractical as the number of cameras is increased). Lenslet light-field cameras capture the necessary information in a single exposure (avoiding data corruption) and do so without emitting any radiation of their own (avoiding detection and operating without distance constraints due to limits in emission intensity).
3. Correct for aberrations in the objective lens. The objective lens of a lenslet light-field camera need not form an image – it must only facilitate a tight sampling of the scene’s light field. As discussed in Ng,1 eliminating the need for image formation greatly reduces time-honored constraints on lens design, opening the possibility of lenses with dramatically improved performance and/or reductions in cost. Indeed, replacing optics hardware with computation may become the most consequential aspect of light-field photography.b
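Opportunity 2 rests on the fact that adjacent sub-aperture views exhibit depth-dependent disparity: nearer objects shift more between views. The following is a minimal block-matching sketch of that idea in one dimension; the function name, signals, and brute-force search are illustrative, not a production depth-mapping method:

```python
import numpy as np

def disparity_1d(left, right, max_shift=3):
    """Estimate the shift between two 1D views by brute-force matching.

    A toy stand-in for depth-map estimation from two sub-aperture views:
    the shift minimizing squared error between the views is related to
    scene depth (nearer objects shift more between views).
    """
    best_shift, best_err = 0, np.inf
    for s in range(max_shift + 1):
        err = np.sum((right[s:] - left[: len(left) - s]) ** 2)
        if err < best_err:
            best_shift, best_err = s, err
    return best_shift

# The same pattern, displaced by 2 samples between the two views.
left = np.array([0, 0, 1, 5, 1, 0, 0, 0], dtype=float)
right = np.array([0, 0, 0, 0, 1, 5, 1, 0], dtype=float)
print(disparity_1d(left, right))  # 2
```

A real system repeats this matching per pixel across the full grid of sub-aperture views, but the single-exposure advantage is already visible: both “views” come from one capture, so scene motion cannot corrupt the estimate.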
As these opportunities are explored, and others are discovered, light-field photography may increasingly displace traditional 2D photography.
1R. Ng, Digital Light Field Photography, Stanford Ph.D. Dissertation (2006).
2R. Ng, M. Levoy, M. Bredif, G. Duval, M. Horowitz, and P. Hanrahan, Light Field Photography with a Hand-held Plenoptic Camera, Stanford Tech Report CTSR 2005-02 (2005).
3R. Ramamoorthi and C. K. Liang, “A Light Transport Framework for Lenslet Light Field Cameras,” ACM Transactions on Graphics (to be published).