Object Constancy

Consider the objects in Figs. 1A and 1B. Call them SOGI and DUVA, respectively. Although you have never seen these objects before, you should have little difficulty perceiving that the images in Figs. 1C-1I are of the SOGI, not the DUVA. The ability to treat different two-dimensional (2D) images as the same three-dimensional (3D) shape is known as object constancy, and the achievement of object constancy is the fundamental goal of object perception. As is the case with most fundamental cognitive processes, we generally take this ability for granted, but also as with many other cognitive processes, the mechanisms that allow us to perceive objects are far from simple.

Encyclopedia of the Human Brain Volume 3

Copyright 2002, Elsevier Science (USA).

All rights reserved.

Figure 1 (A) and (B) show two novel objects under similar viewing conditions. (C)-(I) show how the former object would look under different viewing conditions. More specifically, the images illustrate (C) a translation (caused by the observer shifting his or her gaze or the object moving in the picture plane), (D) a size change (caused by the observer moving backward or the object moving away from the observer in depth), (E) a shift in lighting direction, (F) a shift in picture plane orientation (caused by the object turning upside down or the observer standing on his or her head), (G) a mirror reflection, and (H-I) viewpoint shifts (caused by the object rotating in depth or the observer moving around the object). Each of these changes to viewing conditions introduces significant alterations to the image delivered to the retina, and the human object perception system must somehow compensate for these alterations in order to successfully identify the object.

One way to appreciate the difficulties inherent in achieving object constancy is to consider what we would have to do to program a computer to recognize objects. To a computer, the images in Fig. 1 are nothing more than collections of numbers that specify the intensity of each small dot, or pixel, in each image. To teach the computer to associate the collection of pixels in Fig. 1A with the label SOGI and the pixels in Fig. 1B with the label DUVA is a simple matter of storing the pixel values and labels of both images in memory. When one or the other image is encountered again, the new pixel values can be compared to those of the previously seen images. If the new values match those of Fig. 1A, the SOGI label is retrieved, whereas if the values match those of Fig. 1B, the DUVA label is retrieved. But what will happen when the images in Figs. 1C-1I are presented to the computer? The collections of pixels in these images are identical to neither the learned SOGI nor the learned DUVA image, so our simple pixel comparison process will fail to produce a perfect match with either remembered image.
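The store-and-compare scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 5 x 5 grids are made-up stand-ins for Figs. 1A and 1B, and the names `memory`, `recognize`, and `translate_right` are hypothetical helpers, not part of any real recognition system.

```python
# Toy "images": grids of pixel intensities (0 = dark, 1 = bright).
# These shapes are invented stand-ins for Figs. 1A and 1B.
SOGI = (
    (0, 0, 0, 0, 0),
    (0, 1, 1, 0, 0),
    (0, 1, 0, 0, 0),
    (0, 1, 0, 0, 0),
    (0, 0, 0, 0, 0),
)
DUVA = (
    (0, 0, 0, 0, 0),
    (0, 1, 0, 1, 0),
    (0, 1, 1, 1, 0),
    (0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0),
)

# "Learning" is nothing more than storing pixel values alongside a label.
memory = {SOGI: "SOGI", DUVA: "DUVA"}


def recognize(image):
    """Return the stored label only if every pixel matches exactly, else None."""
    return memory.get(image)


def translate_right(image, shift=1):
    """Shift each row right by `shift` pixels, padding the left edge with dark pixels."""
    return tuple((0,) * shift + row[: len(row) - shift] for row in image)


# The same object, moved one pixel to the right (a crude analogue of Fig. 1C).
shifted_sogi = translate_right(SOGI)

print(recognize(SOGI))          # the stored image is matched
print(recognize(shifted_sogi))  # the translated image matches nothing in memory
```

Exact matching succeeds on the stored images but returns nothing for the shifted copy, even though the shape itself is unchanged. That is the failure the text describes for Figs. 1C-1I.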

The human visual system faces the same conundrum: the raw input for vision is a set of firing rates of receptor cells (rods and cones) in the retina, roughly analogous to the set of pixel values that serve as input to the computer. If an object were always seen under the same viewing conditions, its image would always produce the same retinal firing rates and object perception would be straightforward. But Figs. 1C-1I illustrate several different ways in which alterations in viewing conditions produce different retinal input from the same object. Somehow, the visual system must find a way to compensate for these changes to the retinal image so that the perception of the object remains constant. Let us consider each of these forms of object constancy in turn.

In Fig. 1C, the image of the SOGI has been moved, or translated, relative to the bounding rectangle. If one's gaze is fixed at the center of the rectangle, the pattern of retinal activity will be almost completely different for this image than for Fig. 1A. The visual system usually elicits eye movements that bring the image of a to-be-recognized object onto the fovea, a process that effectively compensates for image translations. However, common experience and psychophysical experiments indicate that objects can be recognized before being foveated: even when pictures of objects are shown experimentally for less than 0.3 sec, precluding eye movements, translations produce little, if any, decrease in recognition accuracy or increase in recognition time. Therefore, some compensation mechanism other than eye movements must also be at work.

In Fig. 1D, the image of the SOGI has been shrunk to one half its size in Fig. 1A. Although actual, physical size is an important characteristic of every object, the size of an image on the retina is determined jointly by the size of the object and the viewing distance (when viewed from certain distances, a tennis ball, a basketball, and the moon may all produce the same retinal size despite wide variations in physical size). To use object size as a cue for identity, then, depth cues would have to be consulted to determine object distance. Whereas this strategy may sometimes be utilized, the images in Figs. 1A and 1D contain no depth cues, yet we have a strong predilection to perceive them as the same object (and thus the same size), demonstrating that the visual system often somehow ignores retinal size when computing object identity. As with translations, most psychophysical studies that have investigated the issue have found little if any effect of object size on recognition fluency.
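The joint dependence of retinal size on physical size and viewing distance can be illustrated with the standard visual-angle relation, angle = 2 * arctan(size / (2 * distance)). The formula and the approximate diameters and distance used below are not taken from the article; they are assumptions added for illustration, and the function names are hypothetical.

```python
import math


def visual_angle_deg(size, distance):
    """Angle (in degrees) subtended at the eye by an object of a given size and distance."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


def distance_for_angle(size, angle_deg):
    """Distance at which an object of a given size subtends `angle_deg` degrees."""
    return size / (2 * math.tan(math.radians(angle_deg) / 2))


# Approximate values in meters (assumed for illustration).
MOON_DIAMETER = 3.474e6
MOON_DISTANCE = 3.844e8
TENNIS_BALL_DIAMETER = 0.067
BASKETBALL_DIAMETER = 0.24

# The moon subtends roughly half a degree.
moon_angle = visual_angle_deg(MOON_DIAMETER, MOON_DISTANCE)

# Distances at which the two balls would produce the same retinal size as the moon.
tennis_distance = distance_for_angle(TENNIS_BALL_DIAMETER, moon_angle)
basketball_distance = distance_for_angle(BASKETBALL_DIAMETER, moon_angle)

print(f"moon: {moon_angle:.2f} deg")
print(f"tennis ball matches at {tennis_distance:.1f} m")
print(f"basketball matches at {basketball_distance:.1f} m")
```

Because three very different physical sizes can yield the same angle, retinal size alone cannot identify an object unless distance is known from depth cues.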

Fig. 1E shows what would happen if the light source were moved from above and to the left of the SOGI, as it is in Fig. 1A, to a position above and to the right of the object. Although one might not even notice this change if it were not explicitly pointed out, almost all of the pixels in the interior of the object are different in the two images. For example, in Fig. 1A, the side of the SOGI's prism-shaped "head" is brighter (i.e., the pixel values here are more intense, and retinal receptor cells will fire at higher response rates to this surface) than the front of the star-shaped part, whereas in Fig. 1E this relationship is reversed. Because humans and other diurnal animals evolved under conditions in which the angle of their only light source (the sun) was constantly changing throughout every day, compensation for lighting direction may have been an important factor in the evolution of object perception.

Fig. 1F shows what the SOGI would look like if the observer were to stand on his or her head or if the SOGI were to be flipped upside down: the image's orientation has been shifted 180° in the picture plane relative to Fig. 1A. Because humans usually stand on their feet and many objects have a normal upright position (the trunks of trees and wheels of cars almost always touch the ground), orientation often provides a diagnostic cue for recognition. Psychophysics studies discussed later show that, although we can identify upturned trees and cars with high accuracy, the time taken to recognize such misoriented objects varies with the amount of misorientation, indicating that the visual system may take advantage of this orientation diagnosticity by representing objects in their canonical positions.

Fig. 1G is a mirror image of Fig. 1A. Although the amount of pixel shifting is about the same between mirror reflections as between upright and upside-down images, reflections cause very small (if any) decreases in performance in psychophysical experiments, whereas, as noted earlier, turning an image upside down typically leads to a substantial increase in recognition time. Explanations for this combination of results feature prominently in many theories of object recognition.

Fig. 1H shows what the SOGI would look like if the observer moved slightly to the left relative to the vantage point from which the object was seen in Fig. 1A, or if the object rotated to its right. Fig. 1I shows the result of a larger change in viewpoint. Note that, unlike all of the other image transformations discussed here, changes in viewpoint may cause different parts and surfaces of the object to become visible or invisible and, at the least, lead to changes in the projection of the 3D surfaces onto the 2D retinal image. For example, the front surface of the star-shaped part of the SOGI covers a larger amount of the image (relative to other surfaces in the object) in Fig. 1H compared to Fig. 1A and then becomes completely invisible in Fig. 1I. In this way, the image changes introduced by viewpoint shifts are more radical than those caused by the other transformations discussed earlier. Thus, compensation for viewpoint shifts seems to be a computationally more demanding problem, and the most crucial measure of any object perception theory is usually taken to be how well it can account for humans' ability to achieve viewpoint constancy.
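One reason viewpoint shifts are more radical than the other transformations is geometric: as an object rotates in depth, a flat surface is foreshortened and eventually turns away from the observer altogether. The sketch below illustrates this for a single flat surface under orthographic projection. It is a simplified assumption-laden example, not a model of the SOGI: it ignores occlusion by other parts, and the function names are hypothetical.

```python
import math


def rotate_about_vertical(vector, angle_deg):
    """Rotate a 3D vector (x, y, z) about the vertical (y) axis."""
    a = math.radians(angle_deg)
    x, y, z = vector
    return (
        x * math.cos(a) + z * math.sin(a),
        y,
        -x * math.sin(a) + z * math.cos(a),
    )


def projected_fraction(normal, toward_viewer=(0.0, 0.0, 1.0)):
    """Fraction of a flat surface's area that remains in the 2D image.

    `normal` is the surface's unit normal; `toward_viewer` is the unit
    direction from the object to the observer. A result of 0.0 means the
    surface is edge-on or faces away, so it is invisible.
    """
    dot = sum(n * v for n, v in zip(normal, toward_viewer))
    return max(0.0, dot)


# A surface that initially faces the observer head-on.
front_normal = (0.0, 0.0, 1.0)

for angle in (0, 30, 60, 120):
    fraction = projected_fraction(rotate_about_vertical(front_normal, angle))
    print(f"rotation {angle:3d} deg -> visible fraction {fraction:.2f}")
```

Translation, scaling, and picture-plane rotation never drive this fraction to zero, which is one way to see why rotation in depth poses a harder problem: it can remove a surface from the image entirely.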
