In the real world, our sense of our environment as a three-dimensional
space is provided by a variety of depth cues, all of which usually
combine to provide us with a coherent model of spatial layout. To
simulate a three-dimensional space effectively, it is necessary to
provide as many of these visual cues as possible, as accurately as
possible.
In the simplest case, such cues as occlusion, shading and perspective
can be simulated by a static two-dimensional image such as a painting
or a photograph.
-
Occlusion provides information about the depth order of
objects: if one object occludes another then we assume that it is
closer to us.
-
Shading provides information about the location and
orientation of surfaces relative to the light sources that illuminate
the scene, as well as about objects that occlude these light sources.
If an object exhibits a gradual change of shading across its surface we
assume that a gradual change in the orientation of the surface
underlies this change.
-
Perspective provides us with the appropriate scaling of
objects whose size we already implicitly know, as well as the looming
and receding effects of objects that are moving towards or away from
us. One consequence is that, for a static image, there is only one
viewing position that will produce a perspective-correct image on the
retina: if we change our viewing position, the image will not change to
reflect our new viewpoint.
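
The scaling that perspective imposes can be sketched with a simple
pinhole projection, in which the projected size of an object falls off
in inverse proportion to its distance. This is a minimal illustration
rather than a full rendering model: the name `projected_size`, the unit
focal length, and the assumption that the object squarely faces the
viewer are all choices made here for the example.

```python
def projected_size(object_size, distance, focal_length=1.0):
    """Image size of an object under an idealised pinhole projection.

    Assumes the object is perpendicular to the line of sight and that
    `distance` is measured from the centre of projection.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return focal_length * object_size / distance


# Doubling the distance halves the projected size, which is the cue the
# visual system can use when the true size of the object is known.
for d in (1.0, 2.0, 4.0):
    print(f"distance {d:>4}: projected size {projected_size(1.8, d):.3f}")
```

An object approaching the viewer (looming) corresponds to `distance`
shrinking over time, so its projected size grows at an increasing rate.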
Although these cues can achieve compelling effects (consider the
reaction of audiences to early motion pictures), the illusion they
create is incomplete and largely dependent upon a suspension of
disbelief on the part of the viewer. To supplement this incomplete
illusion, more of the visual cues upon which we base our everyday
visual experiences can be simulated:
-
Dynamic perspective: When we change our viewing position,
our perspective view of the world changes. Even small head movements
can provide information about the spatial layout of a scene. In order
to maintain a perspective-correct view, the position and orientation of
the viewer's eyes must be continuously measured, and the projected
image must be updated accordingly. Note that although motion pictures
provide a type of dynamic perspective (in the form of varying camera
lens angles for close-up or panoramic shots), this is dictated by the
film director, and is used to achieve an intended visual effect rather
than to be perspective correct.
-
Stereo viewing: Because our eyes are spatially separated,
the images projected onto our left and right retinae are not the same.
When comparing the two retinal projections, elements within the visual
scene are offset by an amount inversely proportional to their depth in
the scene, i.e. near objects have large offsets, while more distant
objects have very small offsets. When viewing a two-dimensional image
these offsets merely reflect the distance from the eyes to the image
plane, rather than the distances made implicit by the visual cues
discussed above. To simulate the effects of stereo, each eye must be
presented with its own perspective-correct image.
-
Field of view: It is important to maintain the illusion
across the field of view, since the incursion of the image boundary
into the visual field tends to destroy the illusion. Even without
allowing for head movements, the visual field subtends about 180°
laterally and 120° vertically. In comparison, a 19-inch monitor
viewed from 2 feet subtends only about 35° laterally and 27°
vertically.
-
Extra-retinal cues: Our visual systems also make use of
non-visual information. Proprioceptive information is provided by our
vestibular balance mechanisms and by kinesthetic data from muscles
controlling neck movements, eye movements (convergence) and focus
(accommodation).
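
Dynamic perspective and stereo viewing both come down to the same
operation: recomputing where a scene point falls on the display for a
given eye position. The sketch below is one way to illustrate this
under stated assumptions, not a description of any particular system.
The screen is taken to be the plane z = 0, the viewer sits at z > 0, and
the name `project_to_screen`, the 0.065 m eye separation and the 0.6 m
viewing distance are all illustrative values chosen here.

```python
def project_to_screen(point, eye):
    """Intersect the ray from `eye` through `point` with the plane z = 0.

    Both arguments are (x, y, z) tuples in metres. The eye must be in
    front of the screen (z > 0) and the point must lie beyond the eye.
    """
    px, py, pz = point
    ex, ey, ez = eye
    if ez <= 0 or pz >= ez:
        raise ValueError("eye must be in front of the screen and of the point")
    t = ez / (ez - pz)  # fraction of the eye-to-point ray at which z = 0
    return (ex + t * (px - ex), ey + t * (py - ey))


HALF_SEPARATION = 0.065 / 2   # assumed half eye separation, metres
VIEW_DISTANCE = 0.6           # assumed eye-to-screen distance, metres

left_eye = (-HALF_SEPARATION, 0.0, VIEW_DISTANCE)
right_eye = (HALF_SEPARATION, 0.0, VIEW_DISTANCE)

behind = (0.0, 0.0, -1.0)     # a point 1 m behind the screen
on_screen = (0.0, 0.0, 0.0)   # a point lying in the screen plane

# Stereo: each eye needs its own image of the same point.
xl, _ = project_to_screen(behind, left_eye)
xr, _ = project_to_screen(behind, right_eye)
print(f"on-screen offset for the distant point: {xr - xl:.4f} m")

# A point in the screen plane projects to the same place for both eyes,
# which is all that a flat picture can ever offer.
print(project_to_screen(on_screen, left_eye),
      project_to_screen(on_screen, right_eye))

# Dynamic perspective: moving the head sideways moves the projection.
moved_eye = (0.1, 0.0, VIEW_DISTANCE)
print(project_to_screen(behind, moved_eye))
```

In a tracked display this projection would be re-evaluated every frame
from the measured eye positions, once per eye.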
-
Whilst each of these cues on its own can enhance a 3D visual
simulation, their combination is mutually reinforcing. Complementary
cues will enhance the illusion of 3D, while conflicting cues will
diminish the effect or even cause visual discomfort.
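
The monitor figures quoted under field of view can be checked with the
standard visual-angle formula, 2 * atan(extent / (2 * distance)). The
sketch below assumes a 4:3 screen whose full 19-inch diagonal is
viewable and a viewer centred on the screen, none of which the text
states; `screen_size` and `visual_angle_deg` are illustrative names.

```python
import math


def screen_size(diagonal, aspect_w=4, aspect_h=3):
    """Width and height of a screen from its diagonal and aspect ratio."""
    scale = diagonal / math.hypot(aspect_w, aspect_h)
    return aspect_w * scale, aspect_h * scale


def visual_angle_deg(extent, distance):
    """Angle subtended by `extent` when viewed centrally from `distance`."""
    return math.degrees(2.0 * math.atan(extent / (2.0 * distance)))


width, height = screen_size(19.0)           # inches, assumed 4:3
lateral = visual_angle_deg(width, 24.0)     # 2 feet = 24 inches
vertical = visual_angle_deg(height, 24.0)

print(f"lateral:  {lateral:.1f} degrees")
print(f"vertical: {vertical:.1f} degrees")
```

Under these assumptions the result rounds to the 35° by 27° given
above, a small fraction of the roughly 180° by 120° visual field.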
-