Seeing the World in Three Dimensions

In the real world, our sense of our environment as a three-dimensional space is provided by a variety of depth cues, all of which usually combine to provide us with a coherent model of spatial layout. To simulate a three-dimensional space effectively, as many of these visual cues as possible must be provided, each as accurately as possible. In the simplest case, such cues as occlusion, shading and perspective can be simulated by a static two-dimensional image such as a painting or a photograph.

Occlusion provides information about the depth order of objects: if one object occludes another then we assume that it is closer to us.
Shading provides information about the location and orientation of surfaces relative to the light sources that illuminate the scene, as well as about objects that occlude these light sources. If an object exhibits a gradual change of shading across its surface we assume that a gradual change in the orientation of the surface underlies this change.
Perspective provides us with the appropriate scaling of objects whose size we already implicitly know, as well as the looming and receding effects of objects that are moving towards or away from us. Because a perspective image is constructed for a single viewpoint, there is only one viewing position from which a static image produces a perspective-correct image on the retina: if we change our viewing position, the static image does not change to reflect our new viewpoint.
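
The scaling described under perspective can be illustrated with a minimal sketch. It assumes an ideal pinhole projection and an object that faces the viewer squarely; the function name projected_size and its parameters are hypothetical, chosen only for this illustration. Under those assumptions, similar triangles give an image size that is inversely proportional to the object's depth, which is what makes approaching objects loom and receding objects shrink.

```python
def projected_size(object_size, depth, image_plane_distance=1.0):
    """Extent of an object's image under an ideal pinhole projection.

    Assumption: the object is fronto-parallel (facing the viewer) at
    distance `depth` along the line of sight. By similar triangles,
    image extent = image_plane_distance * object_size / depth.
    """
    if depth <= 0:
        raise ValueError("depth must be positive (object in front of the viewer)")
    return image_plane_distance * object_size / depth


if __name__ == "__main__":
    # The same 2-unit-tall object seen from increasing distances:
    # each doubling of depth halves the projected size.
    for depth in (1.0, 2.0, 4.0, 8.0):
        size = projected_size(2.0, depth)
        print(f"depth {depth:4.1f} -> projected size {size:.3f}")
```

The sketch covers only the scaling relation. It does not model the viewpoint dependence noted above, which would require recomputing the projection whenever the viewer moves.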

Although these cues can achieve compelling effects (consider the reaction of audiences to early motion pictures), the illusion they create is incomplete and largely dependent upon a suspension of disbelief on the part of the viewer. To supplement this incomplete illusion, more of the visual cues upon which we base our everyday visual experiences can be simulated:

Dynamic perspective: When we change our viewing position, our perspective view of the world changes. Even small head movements can provide information about the spatial layout of a scene. To maintain a perspective-correct view, the position and orientation of the viewer's eyes must be measured continuously, and the projected image must be updated accordingly. Note that although motion pictures provide a type of dynamic perspective (in the form of varying camera lens angles for close-up or panoramic shots), this is dictated by the film director and is used to achieve an intended visual effect rather than to be perspective correct.
Stereo viewing: Because our eyes are spatially separated, the images projected onto our left and right retinae are not the same. When the two retinal projections are compared, elements within the visual scene are offset by an amount inversely proportional to their depth in the scene, i.e. near objects have large offsets, while more distant objects have very small offsets. When we view a two-dimensional image, these offsets merely reflect the distance from the eyes to the image plane, rather than the distances implied by the visual cues discussed above. To simulate the effects of stereo, each eye must be presented with its own perspective-correct image.
Field of view: It is important to maintain the illusion across the whole field of view, since the incursion of the image boundary into the visual field tends to destroy the illusion. Even without allowing for head movements, the visual field subtends about 180° laterally and 120° vertically. In comparison, a 19" monitor viewed from 2 feet subtends only about 35° laterally and 27° vertically.
Extra-retinal cues: Our visual systems also make use of non-visual information. Proprioceptive information is provided by our vestibular balance mechanisms and by kinesthetic data from muscles controlling neck movements, eye movements (convergence) and focus (accommodation).
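
Two of the quantitative claims in the list above can be checked with a short sketch. All function names here are hypothetical, and several assumptions are made that the text does not state: the monitor is taken to have a 4:3 aspect ratio with the full 19" diagonal visible, the eye is centred in front of the screen, and stereo offsets are modelled with a simple parallel-axis pinhole arrangement using an assumed eye separation of 0.065 m and a normalised image-plane distance.

```python
import math


def screen_dimensions(diagonal, aspect_w=4, aspect_h=3):
    """Width and height of a screen from its diagonal (assumed 4:3 by default)."""
    scale = diagonal / math.hypot(aspect_w, aspect_h)
    return aspect_w * scale, aspect_h * scale


def visual_angle_deg(extent, viewing_distance):
    """Angle in degrees subtended by a flat extent centred on the line of sight."""
    return math.degrees(2.0 * math.atan((extent / 2.0) / viewing_distance))


def stereo_offset(depth, eye_separation=0.065, image_plane_distance=1.0):
    """Offset between left and right images of a point at `depth`.

    Assumption: two parallel pinhole eyes separated by `eye_separation`
    (0.065 m is an assumed typical value), for which the offset is
    eye_separation * image_plane_distance / depth.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    return eye_separation * image_plane_distance / depth


if __name__ == "__main__":
    # Field of view: a 19-inch monitor viewed from 2 feet (24 inches).
    width, height = screen_dimensions(19.0)
    lateral = visual_angle_deg(width, 24.0)
    vertical = visual_angle_deg(height, 24.0)
    print(round(lateral), round(vertical))  # prints: 35 27

    # Stereo: the offset shrinks in inverse proportion to depth.
    for depth in (0.5, 1.0, 2.0, 10.0):
        print(f"depth {depth:5.1f} m -> offset {stereo_offset(depth):.4f}")
```

Under these assumptions the monitor figures agree with the 35° and 27° quoted in the text, and doubling the depth of a point halves its stereo offset.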

Whilst each of these cues on its own can enhance a 3D visual simulation, their combination is mutually reinforcing. Complementary cues will enhance the illusion of 3D, while conflicting cues will diminish the effect or even cause visual discomfort.