Toward the Ultimate 3-D Display
Holographic stereography and a new photorefractive material may help create a 3-D display that reproduces all human visual cues.
by Pierre-Alexandre Blanche
LET US START WITH a statement about displays: "It is all about the viewer." Understanding human physiology and how we perceive depth should lead display technology toward a flawless and thus successful 3-D system. What is true for the consumer market is even more relevant for special applications such as medical imaging, remote training, or intelligence analysis. In these fields, the 3-D image can incorporate elements or data from different instruments or sensors, making the fused information complex to analyze. But a complex image does not, and should not, be difficult to interpret. In areas where information interpretation is critical, the display ought to be perfectly adapted to the decision maker.
Several cues are used by the brain to determine absolute and relative distances.1 Occlusion, relative size of objects, atmospheric scattering, texture, and shading are already exploited in the case of 2-D displays. Indeed, we did not wait for the introduction of stereoscopic theater or television to understand relative positions of elements in a scene. When an explosion happens in the background in a 2-D movie, we understand why the hero remains unharmed. A film director can exploit these cues (and the absence of others) to make objects appear closer or farther away than they are in reality. The use of a telephoto lens is one such example because it normalizes the size and distance difference between near and far objects. However, these artistic touches can be misleading in cases where the display is used for critical analysis. You do not want a surgeon to think the artery is farther away than it is in reality; you do not want the intelligence analyst to think the structure or building is smaller than it is in the field. Discrimination in these cases can be improved when more cues are provided, hence the focus on integrating stereopsis, vergence, and accommodation into the display environment.
Stereopsis, Vergence, and Accommodation
Stereopsis and vergence are phenomena resulting from binocular vision. Different images are perceived by the left and right eye due to their lateral separation, and these images are interpreted by the brain to deduce distance. Stereopsis is a consequence of the binocular disparity (mostly the parallax shift), while vergence is the rotation of the eyes in opposite directions around their vertical axis to fixate on the same point of an image [see Figs. 1(a) and 1(b)].
Stereoscopic displays use those cues to reproduce 3-D by employing a technique that was introduced in 1838 by Sir Charles Wheatstone. In a variety of implementations – such as color filters (anaglyph), polarizers, or shutters – stereoscopic displays present a different image to the left and the right eye and prevent crosstalk by using some sort of eyewear.2
Stereoscopy has many advantages: it is robust, quite easy to implement, and effective to a certain extent. It only requires a pair of images to be recorded and does not dramatically change the image-capture techniques (side-by-side cameras). In the case of polarization and active-shutter technologies, the display refresh rate only needs to be doubled to 60 or 120 Hz – reasonable frequencies for the current technology in both cinema and television. This convenience is the main reason for stereoscopy's recent commercial success, in theaters at least. Nonetheless, there are a few issues with this approach, one of which is that almost every stereoscopic technique today requires some sort of eyewear. The public has been fairly accepting of this eyewear tradeoff when it comes to experiencing immersive 3-D, as theater attendance has shown over the last couple of years. Likewise, millions of people wear prescription glasses without complaint – because the benefits far outweigh the discomfort.
Another limitation with stereoscopy is that it does not reproduce motion parallax: When the viewer moves in front of the display, the viewpoint does not change. You cannot look around an object as in real life. For a stationary audience such as one in a theater, this is not an issue at all. Motion parallax can also be addressed by a head-tracking system that determines the position of the viewer and calculates the viewpoint in real time.3 However, head tracking can be implemented for one and only one viewer at any given moment. For a multi-viewer audience, there would be one master directing the display, resulting in awkward sensations for the other viewers.
While the mandatory eyewear and the lack of motion parallax are mild concerns with stereoscopic approaches, the most serious problem is the accommodation-vergence conflict. Indeed, even if the vergence cue is correctly reproduced by a stereoscopic system, the accommodation is not. This information conflict inside the brain leads to asthenopia, i.e., visual fatigue, including headache, nausea, and motion sickness.4,5
Vergence is the rotation of both eyes around their vertical axis to fixate on a common point in a scene. Figures 1(a) and 1(b) show how the eyes rotate so that images are formed near the fovea, which is responsible for sharp central vision. The brain interprets the tension on the eye muscles (medial and lateral rectus) to aid in determining the object's distance. If that object is in the forefront of the scene, the eyes rotate inward, causing the lines of sight to cross near the viewer. If the object is farther away, the eyes rotate outward to make the lines of sight more parallel. One can see why this cue is correctly reproduced by stereoscopy, since the basis of this technique is laterally shearing the left and right image elements according to depth.
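The link between fixation distance and eye rotation can be made concrete with a little trigonometry. The sketch below assumes a point straight ahead of the viewer and an interpupillary distance of 63 mm; that value and the function name are illustrative assumptions, not figures from this article.

```python
import math

def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two lines of sight when both eyes fixate a point
    straight ahead at distance_m.

    ipd_m (interpupillary distance) is an assumed typical value.
    """
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))

if __name__ == "__main__":
    # Near objects need strongly converged eyes; far objects leave the
    # lines of sight almost parallel.
    for d in (0.3, 0.6, 2.0, 10.0):
        print(f"{d:5.1f} m -> {vergence_angle_deg(d):5.2f} deg")
```

The angle falls off quickly with distance, which is why vergence is mainly informative for objects within a few meters.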
But the eye is also a single-lens optical system in which the image plane is fixed: the image must form on the retina. For such a system, a sharp image is obtained for only one position of the object. When the object position changes, the image position changes accordingly unless the focal length of the lens is adjusted, which is exactly what happens in the eye. The eye lens is not rigid and can be deformed by the ciliary muscles to change its optical power. This is the basis of the accommodation cue: the adjustment of the eye lens to focus on objects at various distances [see Figs. 1(c) and 1(d)]. Accommodation is not related to stereoscopic vision, since you still need to accommodate when viewing with a single eye. It should be clear by now why no stereoscopic technique correctly reproduces the accommodation required by a true 3-D scene: when looking at the display, your eyes accommodate to the position of the emitter plane (the screen), and this position, or distance from observer to image, remains constant.
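The fixed-image-plane argument can be sketched with the thin-lens equation, 1/f = 1/d_object + 1/d_image. The code below treats the eye as a single thin lens in air with the retina 17 mm behind it; both the model and the 17-mm figure are simplifying assumptions made for illustration, and the function names are hypothetical.

```python
def required_power_diopters(object_distance_m, image_distance_m=0.017):
    """Optical power (1/f, in diopters) a thin lens needs to image an object
    at object_distance_m onto a plane fixed at image_distance_m.

    image_distance_m = 17 mm is an assumed lens-to-retina distance.
    """
    return 1.0 / object_distance_m + 1.0 / image_distance_m

def accommodation_demand_diopters(object_distance_m, image_distance_m=0.017):
    """Extra power needed relative to a relaxed lens focused at infinity."""
    relaxed_power = 1.0 / image_distance_m
    return required_power_diopters(object_distance_m, image_distance_m) - relaxed_power

if __name__ == "__main__":
    for d in (0.25, 0.5, 2.0, 20.0):
        print(f"{d:5.2f} m -> {accommodation_demand_diopters(d):.2f} D of accommodation")
```

In this simplified model the accommodation demand is just the reciprocal of the object distance, so it becomes negligible for far objects.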
Fig. 1: Accommodation and vergence cues differ for a far or near object.
As shown in Fig. 2, with stereoscopy the brain receives conflicting information. The vergence changes and is correctly reproduced, but the accommodation is fixed and thus incorrect. This conflict can lead to eye fatigue and related physiological effects such as headache, dizziness, and nausea. However, there is a certain degree of mismatch that the brain can tolerate, especially when the screen is far away, as in a theater. At large distances, accommodation is not used, and the eye lens stays at rest. This is why the vast majority of people can appreciate a 3-hour movie in a stereoscopic theater. But for displays that are located much closer to the viewer, such as a television or a workstation, the conflict becomes more pronounced and the effects can be intolerable.5 This is especially the case in professional environments where workers spend most of their time looking at the display, or for young children, whose visual systems are still in development.6 This is the fundamental reason why better techniques are needed to reproduce 3-D. That brings us to holography.
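One common way to quantify this conflict is as a difference in diopters (reciprocal meters) between where the eyes focus (the screen) and where they converge (the apparent object). The sketch below uses that measure; the viewing distances are made-up examples, and no tolerance threshold is implied.

```python
def focus_conflict_diopters(screen_distance_m, apparent_object_distance_m):
    """Mismatch between accommodation (locked to the screen) and vergence
    (driven by the apparent object), expressed in diopters."""
    return abs(1.0 / screen_distance_m - 1.0 / apparent_object_distance_m)

if __name__ == "__main__":
    # Hypothetical scenarios: in both, the object appears at two-thirds
    # of the screen distance.
    theater = focus_conflict_diopters(15.0, 10.0)
    workstation = focus_conflict_diopters(0.6, 0.4)
    print(f"theater     : {theater:.3f} D")
    print(f"workstation : {workstation:.3f} D")
```

With the same relative depth, the mismatch is far larger at desk distance than in a theater, which is consistent with the tolerance described above for distant screens.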
Fig. 2: The accommodation-vergence conflict occurs in stereoscopic displays because the eyes accommodate to the screen but rotate to fix the apparent image.
Holography is the reproduction of both the amplitude and the phase of a scene by a diffractive pattern. We are familiar with systems that display amplitude, or intensity. From a photograph to an LCD TV, every display reproduces the light intensity. The phase, or wavefront of a light field, is less familiar and describes how the light wave is curved at each point of a scene. For a 2-D image, the wavefront is flat because each emission point is at the same distance from the viewer (the Huygens–Fresnel principle), but for a real scene, objects at the forefront have a more convex wave pattern than elements in the background. This is precisely why the eye needs to accommodate. It is now obvious why holography is the ultimate technique to display 3-D: it reconstructs the correct light field, and in doing so all of the optical cues are reproduced.
There is a catch. The reason why we still do not have holographic television or theater, even though holography was discovered in 1947, lies in the last two words of the above-mentioned holographic definition: the "diffractive pattern." To diffract light, the pattern (you can read pixel) needs to be of the scale of the wavelength. For visible light, this is around 500 nm. Now, if you want a reasonable screen size – let's say 0.5 x 0.5 m with a reasonable field of view (50°) – you need to control 1 x 10^12 pixels. At video rate (60 Hz), three colors and 8 bits per color, this multiplies to 5 x 10^16 bits/sec, a bandwidth that is not easily accessible. To put these numbers into perspective, you would have to tile a 1080p/1080i HDTV with about 2 x 10^6 pixels, 500,000 times, then shrink it to the size of a 15-in. monitor. Despite the sheer size of the problem, or maybe because of it, researchers around the world are working on solutions, and excellent works have been published by various groups.7
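The pixel count can be checked with a short calculation. The sketch below first takes the pixel pitch equal to the wavelength, as in the text, and then, as an added assumption, derives the pitch from the standard grating equation for a 50° full field of view (finest grating period equal to two pixels). Function and variable names are illustrative.

```python
import math

def pixel_pitch_for_fov(wavelength_m, fov_deg):
    """Pixel pitch such that the finest displayable grating (period of two
    pixels) diffracts light to +/- fov/2. Grating-equation assumption."""
    return wavelength_m / (2.0 * math.sin(math.radians(fov_deg) / 2.0))

def pixel_count(side_m, pitch_m):
    """Total number of pixels on a square screen of the given side."""
    per_side = side_m / pitch_m
    return per_side * per_side

WAVELENGTH_M = 500e-9   # visible light, from the article
SIDE_M = 0.5            # screen side, from the article
HDTV_PIXELS = 2e6       # approximate HDTV pixel count, from the article

# Estimate 1: pitch equal to the wavelength, as in the text.
article_estimate = pixel_count(SIDE_M, WAVELENGTH_M)
# Estimate 2: pitch from the grating equation at a 50-degree field of view.
grating_estimate = pixel_count(SIDE_M, pixel_pitch_for_fov(WAVELENGTH_M, 50.0))
# Number of HDTV panels that would have to be tiled to reach estimate 1.
hdtv_tiles = article_estimate / HDTV_PIXELS

if __name__ == "__main__":
    print(f"pitch = wavelength : {article_estimate:.1e} pixels")
    print(f"grating equation   : {grating_estimate:.1e} pixels")
    print(f"HDTV panels needed : {hdtv_tiles:.1e}")
```

Both estimates land on the order of 10^12 pixels, and the first reproduces the 500,000 HDTV panels quoted above.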
If holography is the ultimate solution to reproduce the light field, it might also be overkill for human vision. Indeed, holography achieves nanometer resolution that cannot be resolved by the eye. Since the resolution of stereoscopy is not enough with 3 x 10^6 pixels, and holography is way too much with 1 x 10^12 pixels, could we achieve a trade-off and find a technique in between these two extremes that will reproduce all the visual cues, but with limited bandwidth and a reasonable number of pixels? The answer to that question is stereography.
Based on the work done by Gabriel Lippmann on integral photography in the early 1900s, a stereographic display projects different rays of light in different directions from each of its pixels. This is why this technique is also called multiple-viewpoint rendering. To explain it, think of the display as a frame through which the viewer looks to see a scene in the background. This principle is sketched in Fig. 3(a). If you trace all the rays of light coming from one point of the scene and passing through the frame, you will find they all have different angles. Those are the rays the display has to reproduce, for each and every point of the scene. Though this analogy with a frame and a background scene is useful for explanation, stereography can also reproduce objects in front of the screen (the frame in the example), where the ray-tracing method is identical [Fig. 3(b)].
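The frame analogy translates directly into geometry. The sketch below works in a 2-D slice with the display along the x axis at z = 0 and the viewer on the positive z side; a scene point behind the display has negative z and one in front has positive z. The coordinate convention and function name are assumptions made for this illustration.

```python
import math

def emission_angle_deg(pixel_x_m, point_x_m, point_z_m):
    """Angle from the display normal (toward the viewer) of the ray that a
    pixel at pixel_x_m must emit so it appears to come from, or pass
    through, the scene point at (point_x_m, point_z_m).

    point_z_m < 0: point behind the display; point_z_m > 0: in front.
    """
    if point_z_m == 0:
        raise ValueError("the scene point must not lie on the display plane")
    return math.degrees(math.atan((point_x_m - pixel_x_m) / point_z_m))

if __name__ == "__main__":
    pixels = [-0.2, -0.1, 0.0, 0.1, 0.2]   # pixel positions along the display, in meters
    for label, z in (("behind", -0.5), ("in front", 0.5)):
        angles = [round(emission_angle_deg(x, 0.0, z), 1) for x in pixels]
        print(f"point {label:8s}: {angles}")
```

Every pixel gets a different angle for the same scene point, and the same expression covers points behind and in front of the display, matching the statement that the ray-tracing method is identical in both cases.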
Fig. 3: According to the principles of stereography, the pixels emit directional rays: (a) object behind the emitter plane, (b) object in front of the emitter plane.
How many rays a stereographic display should generate to reproduce all of the visual cues depends on the depth of field, as well as geometrical factors such as the viewer pupil size, its distance from the screen, and the field of view of the display.8 But considering a high-definition display that has 2 x 10^6 pixels (HDTV), and a depth of field about the screen diagonal, this would translate to about 2 x 10^9 rays.
The technique of angular reorientation of the light rays from the display is used in autostereoscopic (glasses-free) 3-D television.9 In such a device, a lenslet array is laid over a 2-D screen. As shown in Fig. 4, each lenslet covers several pixels and redirects their beams in different directions, creating different view zones. When the viewer is correctly positioned, his left and right eyes intercept two different zones and a stereoscopic effect is created (note that this is stereoscopic and not stereographic). However, the accommodation cue is not reproduced with those televisions. This is because at least two view zones need to enter each eye to approximate the wave curvature. Compared to glasses-free 3-D television, many more view zones need to be projected in a stereographic system.
Fig. 4: The principle of autostereoscopic 3-D television is shown above: A lens array overlays a regular 2-D screen. Each lens redirects the light emitted from individual pixels, creating different view zones. When the viewer's eyes intercept two different view zones, the stereoscopic effect is achieved.
To obtain more view zones, one can imagine packing more pixels under each lenslet, but this means drastically reducing the pixel size (down to 0.001 mm), which is not technically possible yet. On the other hand, if you increase the size of the lenslet itself to cover more pixels, the lateral resolution of the display suffers.10
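This trade-off can be sketched for a single row of pixels. The numbers below (a 0.5-mm lenslet, a 1920-pixel-wide panel, the view counts) are hypothetical values chosen only to illustrate the two options; they are not taken from this article.

```python
def pixel_pitch_for_views(lenslet_pitch_mm, views_per_lenslet):
    """Option 1: keep the lenslet size and shrink the pixels.
    Returns the pixel pitch needed to fit the views under one lenslet."""
    return lenslet_pitch_mm / views_per_lenslet

def lateral_resolution(panel_pixels_across, views_per_lenslet):
    """Option 2: keep the pixel size and widen the lenslets.
    Each lenslet acts as one 3-D pixel, so resolution drops accordingly."""
    return panel_pixels_across // views_per_lenslet

if __name__ == "__main__":
    for views in (8, 64, 500):
        pitch = pixel_pitch_for_views(0.5, views)
        resolution = lateral_resolution(1920, views)
        print(f"{views:4d} views: pixel pitch {pitch:.4f} mm "
              f"or {resolution} lenslets across")
```

Either the pixel pitch shrinks toward the micrometer scale or the number of lenslets across the screen collapses, which is the dilemma described above.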
Different approaches have been proposed to realize stereographic 3-D displays by reproducing accommodation: optical demagnification of the pixels by a telescope,11 and the use of acousto-optic modulators to reorient laser beams and sweep the view zone.12 So far, however, no display appears capable of providing a large autostereoscopic 3-D image without some sort of artifact. This can be understood from the sheer numbers presented above. It is not easy to move from a system driving 2 x 10^6 pixels to one that will manage 2 x 10^9, a factor of 1000 larger.
Hogels and Stereographic Still Imagery
While we wait for a dynamic stereographic display to emerge from research labs, stereographic still pictures can currently be made, thanks to holography. These images are composed of pixels, but each of those pixels is a hologram. They are named "hogels," a contraction of the two words. These hogels are recorded when two laser beams interfere in a holographic recording material. The first beam is called a reference beam and does not contain any information. The second beam is modulated by an LCD and focused by a lens into one spot. That means all the pixels from the LCD screen are compressed together by the lens into one single hogel. The holographic recording technique actually achieves what is shown in Fig. 3(a): The back image is shrunk into one pixel.
When this kind of hologram is replayed, the hogels reproduce a structured cone of light, so that each angle emits a different ray. That is the actual principle of stereography just discussed. What is important to notice here is the compelling beauty of these images, which realistically reproduce vibrant colors and depth of field. The images are attractive from an artistic point of view, but they are also useful in technical areas such as medical imaging, architecture, and industrial design.13 Recently, a study was conducted on the performance of U.S. Army military personnel when using regular 2-D topographic maps (on which the soldiers had been trained and to which they were accustomed) and stereographic 3-D holographic maps.14 The results revealed improvements in planning and execution on every single task when the soldiers were using the holographic maps. It is exciting to imagine what effect such a technology, if a display is made possible, could have in the civilian world.
One of the reasons holographic stereography has not yet been implemented into a 3-D dynamic display is the recording material. Indeed, holographic recording materials such as photopolymers, silver-halide emulsions, or dichromated gelatin are permanent. They are exposed to the laser beams once, then chemically processed to reveal the hologram, but cannot be refreshed. With these materials, holographic cinema can be and has been demonstrated,15 but a display with dynamic imaging is not possible.
A new type of material, called photorefractive polymers, was discovered in the early nineties. These polymers can be recorded, erased, and refreshed without any fatigue. They do not need any post-processing for the hologram to be revealed; it appears by itself after the exposure due to the relocalization of electronic charges.16 The properties of this class of material have been extensively studied and improved in recent years to achieve very high figures of merit such as diffraction efficiency, sensitivity, speed, and reliability. Since the materials are in a polymeric form, they can be cast into a large screen and could be an ideal material to develop a dynamic stereographic 3-D display. That is the pathway the author's group at the University of Arizona, College of Optical Sciences, has followed to demonstrate a 3-D display with a screen size up to 17 in. (see Fig. 5) and a refresh time of a few seconds.17
Fig. 5: The author holds a 17-in.-diagonal photorefractive polymer sheet.
Using the principle of holographic stereography, and with the photorefractive material at the heart of the system, a telepresence experiment has also been demonstrated that records images of a person in one location and prints the hologram in another, using the Internet to send the data. Figure 6 shows photographs of one such holographic stereogram taken at different angles to demonstrate the parallax. The image can be erased and refreshed at will without any material fatigue. The photorefractive polymer might be to 3-D display what phosphor has been to 2-D CRT screens.
Fig. 6: Photographs of the holographic stereogram taken at different angles show the parallax.
In its present state, this display is still an experimental setup that needs further development to achieve video rate and a compact system. However, this is a new direction toward a 3-D display that respects human vision by providing all the cues: accommodation, vergence, and parallax.
So, while the challenges of stereographic 3-D displays are both numerous and large, we can expect that rapid progress will continue to be made on different fronts. Humans are meant to see in 3-D, and our quest for the ultimate display will continue until a system reproduces each and every visual cue, flawlessly.
1Perceiving in Depth, Vol. 1: Basic Mechanisms, Ian P. Howard, ed. (Oxford University Press, 2012).
2A. Woods, "3-D Displays in the Home," Information Display 25, No. 7 (July 2009).
3See, for example, http://www.cs.unc.edu/~maimone/KinectPaper/kinect.html.
4M. Lambooij, W. Ijsselsteijn, M. Fortuin, and I. Heynderickx, "Visual Discomfort and Visual Fatigue of Stereoscopic Displays: A Review," J. Imag. Sci. Tech. 53(3), 030201–030201-14 (2009).
5M. S. Banks, K. Akeley, D. M. Hoffman, and A. R. Girshick, "Consequences of Incorrect Focus Cues in Stereo Displays," Information Display 24, No. 7 (July 2008).
6S. K. Rushton and P. M. Riddell, "Developing Visual Systems and Exposure to Virtual Reality and Stereo Displays: Some Concerns and Speculations about the Demands on Accommodation and Vergence," Applied Ergonomics 30, 69–78 (1999).
7H. Stolle and R. Häussler, "A New Approach to Electro-Holography: Can This Move Holography into the Mainstream?," Information Display 24, No. 7 (July 2008); C. Slinger, C. Cameron, and M. Stanley, "Computer-Generated Holography as a Generic Display Technology," Computer, IEEE, 0018-9162/05 (2005); P. St.-Hilaire, "Scalable Optical Architectures for Electronic Holography," Ph.D. Thesis, Program in Media Arts and Sciences, Massachusetts Institute of Technology (September 1994); Y. Ichihashi, N. Masuda, M. Tsuge, H. Nakayama, A. Shiraki, T. Shimobaba, and T. Ito, "One-unit system to reconstruct a 3-D movie at a video-rate via electroholography," Optics Express 17, Issue 22, 19691-19697 (2009).
8Y. Takaki, K. Tanaka, and J. Nakamura, "Super multi-view display with a lower resolution flat-panel display," Optics Express 19, No. 5, 412928 (February 2011).
9M. Salmimaa and T. Järvenpää, "Characterizing Autostereoscopic 3-D Displays," Information Display 25, No. 1 (January 2009).
10P. J. Bos and A. K. Bhowmik, "Liquid-Crystal Technology Advances toward Future 'True' 3-D Flat-Panel Displays," Information Display 27, No. 9 (September 2011).
11Y. Takaki, "High-Density Directional Display for Generating Natural Three-Dimensional Images," Proc. IEEE 94, No. 3 (March 2006).
12T. Balogh et al., "The Holovizio System: New Opportunity Offered by 3D Displays," Proc. TMCE (2008).
13Rabbit Holes: http://www.rabbitholes.com/; Geola: http://www.geola.lt/; Zebra Imaging: http://www.zebraimaging.com/
14J. Martin and M. Holzbach, "Evaluation of Holographic Technology in Close Air Support Mission Planning and Execution," Air Force Research Laboratory Human Effectiveness Directorate, Warfighter Readiness Research Division, Technical Report approved for public release via www.dtic.mil website, id: AFRL-RH-AZ-TR-2008-0025, accession number ADA486177 (2008).
15E. N. Leith, D. B. Brumm, and S. S. H. Hsiao, "Holographic Cinematography," Applied Optics 11, No. 9 (September 1972).
16O. Ostroverkhova and W. E. Moerner, "Organic Photorefractives: Mechanisms, Materials and Applications," Chemical Reviews 104(7), 3267-3314 (2004).
17S. Tay and N. Peyghambarian, "Refreshable Holographic 3-D Displays," Information Display 24, No. 7 (July 2008); P.-A. Blanche et al., "Holographic three-dimensional tele-presence using large-area photorefractive polymer," Nature 468, No. 4 (November 2010).